There are several optimizations:
1) Use predefined SRGB, don't calculate. This is the most common case.
2) Precompute HW X points at boot since they're fixed in ColModule
3) Precompute PQ - it never changes and is very CPU intensive in fixed pt.
4) Reduce number of points in ColModule to 512 (32x16) from 1024. This also
requires reducing some regions for legacy DCEs to 16 pts at most.
Performance
1) is super-fast, build_output_tf is 1-2us, down from 25000-30000.
Programming also fast since only one reg write.
2)+3) gives build_output_tf for PQ in ~100us range, down from ~80000-110000
2) + 4) results in slightly over 50% improvement. It gives an idea of the
savings when we can't use SRGB or PQ table (e.g. sdr white level > 80).
There's also a bit of refactoring: renaming some stuff that was misleading
and removing a lot of magic numbers that novices might not be able to
understand where they come from and what they mean.
Signed-off-by: Krunoslav Kovac <Krunoslav.Kovac@amd.com>
Reviewed-by: Tony Cheng <Tony.Cheng@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
System soft hang when hotplug specific 4K DP panel
due to link caps read error and incorrect link setting
parmas to enable dp.
Add status check for DPCD read and add return value
for detect dp, in case of false, return from caller,
avoid further false operation.
Signed-off-by: Yongqiang Sun <yongqiang.sun@amd.com>
Reviewed-by: Tony Cheng <Tony.Cheng@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fixing null-deref on Vega10 due to regression after
'fix cursor related Pstate hang' change.
Added null checks in setting cursor position.
Signed-off-by: Roman Li <Roman.Li@amd.com>
Reviewed-by: Eric Yang <eric.yang2@amd.com>
Reviewed-by: Harry Wentland <Harry.Wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
* Previous code did some calculations with a mix of normal integers and
integers aligned as U2.24 fixed-point values.
* There were bugs in the conversion of the final result into the
S4.19 values required for the registers.
Signed-off-by: Ken Chalmers <ken.chalmers@amd.com>
Reviewed-by: Tony Cheng <Tony.Cheng@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Move cursor programming to inside the OTG_MASTER_UPDATE_LOCK
If graphics plane go from 1 pipe to hsplit, the cursor updates
after mpc programming and unlock. Which means there is a window
of time where cursor is enabled on the wrong pipe if it's on
the right side of the screen (i.e. case where cursor need to
move from pipe 0 to pipe 3 post split). This will cause pstate hang.
Solution is to program the cursor while still locked.
Signed-off-by: Eric Yang <Eric.Yang2@amd.com>
Reviewed-by: Tony Cheng <Tony.Cheng@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Building the amd display driver with link-time optimizations revealed a bug
that caused dal_cmd_tbl_helper_dce80_get_table() and
dal_cmd_tbl_helper_dce110_get_table() get called with an incompatible
return type between the two callers in command_table_helper.c and
command_table_helper2.c:
drivers/gpu/drm/amd/amdgpu/../display/dc/bios/dce80/command_table_helper_dce80.h:31: error: type of 'dal_cmd_tbl_helper_dce80_get_table' does not match original declaration [-Werror=lto-type-mismatch]
const struct command_table_helper *dal_cmd_tbl_helper_dce80_get_table(void);
drivers/gpu/drm/amd/amdgpu/../display/dc/bios/dce80/command_table_helper_dce80.c:351: note: 'dal_cmd_tbl_helper_dce80_get_table' was previously declared here
const struct command_table_helper *dal_cmd_tbl_helper_dce80_get_table(void)
drivers/gpu/drm/amd/amdgpu/../display/dc/bios/dce110/command_table_helper_dce110.h:32: error: type of 'dal_cmd_tbl_helper_dce110_get_table' does not match original declaration [-Werror=lto-type-mismatch]
const struct command_table_helper *dal_cmd_tbl_helper_dce110_get_table(void);
drivers/gpu/drm/amd/amdgpu/../display/dc/bios/dce110/command_table_helper_dce110.c:361: note: 'dal_cmd_tbl_helper_dce110_get_table' was previously declared here
const struct command_table_helper *dal_cmd_tbl_helper_dce110_get_table(void)
The two versions of the structure are obviously derived from the same
one, but have diverged over time, before they got added to the kernel.
This moves the structure to a new shared header file and uses the superset
of the members, to ensure the interfaces are all compatible.
Fixes: ae79c310b1 ("drm/amd/display: Add DCE12 bios parser support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The trailing semicolon is an empty statement that does no operation.
Removing the two instances of them since they don't do anything.
Signed-off-by: Luis de Bethencourt <luisbg@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Directly editing pipe config outside of formula is error prone
and results in higher clocks being used when splitting.
For this reason we reverted to using bounding box hacking
to split. Since sometimes this erroneusly results in higher dpm
being required we unhack the bounding box and recalculate to allow
dpm0 is possible.
Side effect is we will lose some stutter efficiency
in non dpm0 cases. This is not a big concern since increased stutter
efficiency saves an order of magnitude less power than lower dpm.
Signed-off-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Reviewed-by: Tony Cheng <Tony.Cheng@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
There's no good place in DC to cover all place where stream signal should
be updated. update_stream_signal depends on timing which comes from DM.
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Tony Cheng <Tony.Cheng@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>