Smatch warns of
drivers/gpu/drm/i915/intel_pm.c:1161 g4x_compute_wm() warn: signedness bug returning '(-33554430)'
which is a result of it believing that wm may be INT_MAX following
g4x_tlb_miss_wa(). Just declaring g4x_tlb_miss_wa() as returning an
unsigned integer is not sufficient, we need to tell smatch that wm itself
is unsigned for it to not worry. So mark up the locals we expect to be
non-negative, and so silence smatch.
v2: Mark up vlv_compute_wm_level() as unsigned similarly.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> #v1
Link: https://patchwork.freedesktop.org/patch/msgid/20171107140338.13748-1-chris@chris-wilson.co.uk
On CNL, individual wake rate limit was added to each engine.
GT can only go to RC6 if both Render and Media engines are
individually qualified. So we need to set their individual
wake rate limit.
+-----------------+---------------+--------------+--------------+
| | GT RC6 | Render C6 | Media C6 |
+-----------------+---------------+--------------+--------------+
| Wake rate limit | 0xA09C[31:16] | 0xA09C[15:0] | 0xA0A0[15:0] |
+-----------------+---------------+--------------+--------------+
v2: - Tune Render and Media wake rate values according to some extra
info I got from HW engineers. Value can be tuned, but for now
these are the recommended values.
- Fix typos pointed by James.
Cc: Nathan Ciobanu <nathan.d.ciobanu@intel.com>
Cc: Wayne Boyer <wayne.boyer@intel.com>
Cc: Joe Konno <joe.konno@linux.intel.com>
Cc: David Weinehall <david.weinehall@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: James Ausmus <james.ausmus@intel.com>
Reviewed-by: David Weinehall <david.weinehall@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171023224612.27208-1-rodrigo.vivi@intel.com
According to Spec for SKL+: "Isochronous Priority Control.
If enabled, Display sends demoted requests once the transition
watermark is reached. If transition watermark is not enabled,
Display sends demoted requests when the display buffer is full."
The commit 'e57f1c02155f ("drm/i915/gen9+: Add has_ipc flag in
device info structure")' introduced that as gen9+ but missing many
SKL Skus.
I believe the reason for that is Spec also mentions workarounds for
SKL-ALL: "IPC (Isoch Priority Control) may cause underflows
WA: Do not enable IPC in register ARB_CTL2"
It seems lame to add the feature and forever disable it,
but it will avoid a mistake of enabling it when we are reorganizing
the feature definitions on i915_pci.c later.
It will also allow us to probably extend that workaround for
other platforms.
Cc: Mahesh Kumar <mahesh1.kumar@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Mahesh Kumar <mahesh1.kumar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171003063652.17248-1-rodrigo.vivi@intel.com
This patch adds IPC support. This patch also enables IPC in all supported
platforms based on has_ipc flag.
IPC (Isochronous Priority Control) is the hardware feature, which
dynamically controls the memory read priority of Display.
When IPC is enabled, plane read requests are sent at high priority until
filling above the transition watermark, then the requests are sent at
lower priority until dropping below the level 0 watermark.
The lower priority requests allow other memory clients to have better
memory access. When IPC is disabled, all plane read requests are sent at
high priority.
Changes since V1:
- Remove commandline parameter to disable ipc
- Address Paulo's comments
Changes since V2:
- Address review comments
- Set ipc_enabled flag
Changes since V3:
- move ipc_enabled flag assignment inside intel_ipc_enable function
Changes since V4:
- Re-enable IPC after suspend/resume
Changes since V5:
- Enable IPC for all gen >=9 except SKL
Changes since V6:
- fix commit msg
- after resume program IPC based on SW state.
Changes since V7:
- Modify IPC support check based on HAS_IPC macro (suggested by Chris)
Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170817134529.2839-8-mahesh1.kumar@intel.com
GEN > 9 require transition WM to be programmed if IPC is enabled.
This patch calculates & enable transition WM for supported platforms.
If transition WM is enabled, Plane read requests are sent at high
priority until filling above the transition watermark, then the
requests are sent at lower priority until dropping below the level-0 WM.
The lower priority requests allow other memory clients to have better
memory access.
transition minimum is the minimum amount needed for trans_wm to work to
ensure the demote does not happen before enough data has been read to
meet the level 0 watermark requirements.
transition amount is configurable value. Higher values will
tend to cause longer periods of high priority reads followed by longer
periods of lower priority reads. Tuning to lower values will tend to
cause shorter periods of high and lower priority reads.
Keeping transition amount to 10 in this patch, as suggested by HW team.
Changes since V1:
- Address review comments from Maarten
Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170817134529.2839-4-mahesh1.kumar@intel.com
Skip compressing 1 segment at the end of the frame,
avoid a pixel count mismatch nuke event when last active
pixel and dummy pixel has same color for Odd Plane
Width / Height.
For both platforms Gemini Lake and Cannon Lake.
v2: Use function-like macro and also use mask to clean
to make sure bit 11 is 0. (Suggested by Paulo).
v3: Add Display WA notation and also apply for GLK.
Both Forgotten on v2.
Using "GLK_" prefix since GLK came before CNL.
v4: Forgot to "|=" when moving directly macro to masked
val. (Noticed by Paulo.)
v5: Rebased on top of 0a46ddd57c ("drm/i915/cnp: Wa 1181:
Fix Backlight issue")
Cc: Imre Deak <imre.deak@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170905193013.31710-1-rodrigo.vivi@intel.com
If we miss the current vblank because the gpu was busy, that may cause a
jitter as the frame rate temporarily drops. We try to limit the impact
of this by then boosting the GPU clock to deliver the frame as quickly
as possible. Originally done in commit 6ad790c0f5 ("drm/i915: Boost GPU
frequency if we detect outstanding pageflips") but was never forward
ported to atomic and finally dropped in commit fd3a40242e ("drm/i915:
Rip out legacy page_flip completion/irq handling").
One of the most typical use-cases for this is a mostly idle desktop.
Rendering one frame of the desktop's frontbuffer can easily be
accomplished by the GPU running at low frequency, but often exceeds
the time budget of the desktop compositor. The result is that animations
such as opening the menu, doing a fullscreen switch, or even just trying
to move a window around are slow and jerky. We need to respond within a
frame to give the best impression of a smooth UX, as a compromise we
instead respond if that first frame misses its goal. The result should
be a near-imperceivable initial delay and a smooth animation even
starting from idle. The cost, as ever, is that we spend more power than
is strictly necessary as we overestimate the required GPU frequency and
then try to ramp down.
This of course is reactionary, too little, too late; nevertheless it is
surprisingly effective.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102199
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170817123706.6777-1-chris@chris-wilson.co.uk
Tested-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Radoslaw Szwichtenberg <radoslaw.szwichtenberg@intel.com>
This bit enables hardware that will change the approximation used for distances
calculations for AA wide lines so that they are rendered more accurately.
The default value for this bit leaves the legacy behavior. There is no good
reason to not enable the new approximation except if comparing to previous GEN
rendered images.
v2: Rebase
v3: Fix author.
Rebased by Rodrigo who also added a comment as suggested by Oscar.
Since it is surrounded by Workarounds let's just add a comment to
make clear it is not an Wa.
Cc: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Ben Widawsky <benjamin.widawsky@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Oscar Mateo <oscar.mateo@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170815231651.975-4-rodrigo.vivi@intel.com
Let's inherit workarounds from previous platforms that
according to wa_database and BSpec are still valid for
Cannonlake.
v2: Add missed workarounds.
v3: Rebase
v4: Remove bad chunk that was added to rc6 disable. (Ander)
Also remove A0 W/a that are not needed anymore.
v5: Rebase on top of CFL.
v6: Remove empty gen9_init_perctx_bb and gen9_init_indirectctx_bb
since they don't carry any gen10 related W/a. (by Oscar).
Also Remove A0 exclusive workaround.
v7: Remove more A0 exclusive workarounds. As pointed out by Oscar
many workarounds were changed to be A0 only so let's remove
them.
Cc: Oscar Mateo <oscar.mateo@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Oscar Mateo <oscar.mateo@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170815231651.975-1-rodrigo.vivi@intel.com
SKL+ display engine can scan out certain kinds of compressed surfaces
produced by the render engine. This involved telling the display engine
the location of the color control surfae (CCS) which describes
which parts of the main surface are compressed and which are not. The
location of CCS is provided by userspace as just another plane with its
own offset.
Add the required stuff to validate the user provided AUX plane metadata
and convert the user provided linear offset into something the hardware
can consume.
Due to hardware limitations we require that the main surface and
the AUX surface (CCS) be part of the same bo. The hardware also
makes life hard by not allowing you to provide separate x/y offsets
for the main and AUX surfaces (excpet with NV12), so finding suitable
offsets for both requires a bit of work. Assuming we still want keep
playing tricks with the offsets. I've just gone with a dumb "search
backward for suitable offsets" approach, which is far from optimal,
but it works.
Also not all planes will be capable of scanning out compressed surfaces,
and eg. 90/270 degree rotation is not supported in combination with
decompression either.
This patch may contain work from at least the following people:
* Vandana Kannan <vandana.kannan@intel.com>
* Daniel Vetter <daniel@ffwll.ch>
* Ben Widawsky <ben@bwidawsk.net>
v2: Deal with display workarounds 0390, 0531, 1125 (Paulo)
v3: Pretend CCS tiles are regular 128 byte wide Y tiles (Jason)
Put the AUX register defines to the correct place
Fix up the slightly bogus rotation check
v4: Use I915_WRITE_FW() due to plane update locking changes
s/return -EINVAL/goto err/ in intel_framebuffer_init()
Eliminate a bunch hardcoded numbers in CCS code
v5: (By Ben)
conflict resolution +
- res_blocks += fixed_16_16_to_u32_round_up(y_tile_minimum);
+ res_blocks += fixed16_to_u32_round_up(y_tile_minimum);
v6: (daniels) Fix botched commit message.
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Ville Syrjä <ville.syrjala@linux.intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net> (v1)
Reviewed-by: Daniel Stone <daniels@collabora.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Stone <daniels@collabora.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170801165817.7063-1-ben@bwidawsk.net
This patch make naming of fixed-point wrappers consistent
operation_<any_post_operation>_<1st operand>_<2nd operand>
also shorten the name for fixed_16_16 to fixed16
s/u32_to_fixed_16_16/u32_to_fixed16
s/fixed_16_16_to_u32/fixed16_to_u32
s/fixed_16_16_to_u32_round_up/fixed16_to_u32_round_up
s/min_fixed_16_16/min_fixed16
s/max_fixed_16_16/max_fixed16
s/mul_u32_fixed_16_16/mul_u32_fixed16
s/fixed_16_16_div/div_fixed16
Changes Since V1:
- Split the patch in more logical patches (Maarten)
Changes Since V2:
- Rebase
Signed-off-by: Mahesh Kumar <mahesh1.kumar@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170705143154.32132-4-mahesh1.kumar@intel.com