linux

Author	SHA1	Message	Date
Michal Wajdeczko	a0de908d44	drm/i915: Reorder early initialization In upcoming patch, we want to perform more actions in early initialization of the uC. This reordering will help resolve new dependencies that will be introduced by future patch. v2: s/i915_gem_load_init/i915_gem_init_early (Chris) v3: s/i915_gem_load_cleanup/i915_gem_cleanup_early (Michal) Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180323123451.59244-1-michal.wajdeczko@intel.com	2018-03-23 17:03:24 +00:00
Chris Wilson	ac697ae801	drm/i915: Stop engines when declaring the machine wedged If we fail to reset the GPU, we declare the machine wedged. However, the GPU may well still be running in the background with an in-flight request. So despite our efforts in cleaning up the request queue and faking the breadcrumb in the HWSP, the GPU may eventually write the in-flght seqno there breaking all of our assumptions and throwing the driver into a deep turmoil, wedging beyond wedged. To avoid this we ideally want to reset the GPU. Since that has already failed, make sure the rings have the stop bit set instead. This is part of the normal GPU reset sequence, but that is actually disabled by igt/gem_eio to force the wedged state. If we assume the worst, we must poke at the bit again before we give up. v2: Move the intel_gpu_reset() from set-wedged in the reset error path into i915_gem_set_wedged() itself. Even if the reset fails (e.g. if it is disabled by gem_eio), it still tries to make sure the engines are stopped. For i915_gem_set_wedged() callers from outside of i915_reset(), this should make sure the GPU is disabled while the driver is marked as being wedged. Testcase: igt/gem_eio Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180315151015.22741-1-chris@chris-wilson.co.uk	2018-03-16 10:16:08 +00:00
Chris Wilson	d9b13c4dde	drm/i915: Trace GEM steps between submit and wedging We still have an odd race with wedging/unwedging as shown by igt/gem_eio that defies expectations. Add some more trace_printks to try and visualize the flow over the precipice. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180315131451.4060-1-chris@chris-wilson.co.uk	2018-03-16 10:16:07 +00:00
Jackie Li	f08e2035cc	drm/i915/guc: Check the locking status of GuC WOPCM registers GuC WOPCM registers are write-once registers. Current driver code accesses these registers without checking the accessibility to these registers which will lead to unpredictable driver behaviors if these registers were touch by other components (such as faulty BIOS code). This patch moves the GuC WOPCM registers updating code into intel_wopcm.c and adds check before and after the update to GuC WOPCM registers so that we can make sure the driver is in a known state after writing to these write-once registers. v6: - Made sure module reloading won't bug the kernel while doing locking status checking v7: - Fixed patch format issues v8: - Fixed coding style issue on register lock bit macro definition (Sagar) v9: - Avoided to use redundant !! to cast uint to bool (Chris) - Return error code instead of GEM_BUG_ON for locked with invalid register values case (Sagar) - Updated guc_wopcm_hw_init to use guc_wopcm as first parameter (Michal) - Added code to set and validate the HuC_LOADING_AGENT_GUC bit in GuC WOPCM offset register based on the presence of HuC firmware (Michal) - Use bit fields instead of macros for GuC WOPCM flags (Michal) v10: - Refined variable names, removed redundant comments (Joonas) - Introduced lockable_reg to handle the write once register write and propagate the write error to caller (Joonas) - Used lockable_reg abstraction to avoid locking bit check on generic i915_reg_t (Michal) - Added log message for error paths (Michal) - Removed hw_updated flag and only relies on real hardware status v11: - Replaced lockable_reg with simplified function (Michal) - Used new macros for locking bits of WOPCM size/offset registers instead of using BIT(0) directly (Michal) - use intel_wopcm_init_hw() called from intel_gem_init_hw() to do GuC WOPCM register setup instead of calling from intel_uc_init_hw() (Michal) v12: - Updated function kernel-doc to align with code changes (Michal) - Updated code to use wopcm pointer directly (Michal) v13: - Updated the ordering of s-o-b/cc/r-b tags (Sagar) BSpec: 10875, 10833 Signed-off-by: Jackie Li <yaodong.li@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> (v11) Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (v12) Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1520987574-19351-5-git-send-email-yaodong.li@intel.com	2018-03-14 15:35:37 +02:00
Jackie Li	6b0478fb72	drm/i915: Implement dynamic GuC WOPCM offset and size calculation Hardware may have specific restrictions on GuC WOPCM offset and size. On Gen9, the value of the GuC WOPCM size register needs to be larger than the value of GuC WOPCM offset register + a Gen9 specific offset (144KB) for reserved GuC WOPCM. Fail to enforce such a restriction on GuC WOPCM size will lead to GuC firmware execution failures. On the other hand, with current static GuC WOPCM offset and size values (512KB for both offset and size), the GuC WOPCM size verification will fail on Gen9 even if it can be fixed by lowering the GuC WOPCM offset by calculating its value based on HuC firmware size (which is likely less than 200KB on Gen9), so that we can have a GuC WOPCM size value which is large enough to pass the GuC WOPCM size check. This patch updates the reserved GuC WOPCM size for RC6 context on Gen9 to 24KB to strictly align with the Gen9 GuC WOPCM layout. It also adds support to verify the GuC WOPCM size aganist the Gen9 hardware restrictions. To meet all above requirements, let's provide dynamic partitioning of the WOPCM that will be based on platform specific HuC/GuC firmware sizes. v2: - Removed intel_wopcm_init (Ville/Sagar/Joonas) - Renamed and Moved the intel_wopcm_partition into intel_guc (Sagar) - Removed unnecessary function calls (Joonas) - Init GuC WOPCM partition as soon as firmware fetching is completed v3: - Fixed indentation issues (Chris) - Removed layering violation code (Chris/Michal) - Created separat files for GuC wopcm code (Michal) - Used inline function to avoid code duplication (Michal) v4: - Preset the GuC WOPCM top during early GuC init (Chris) - Fail intel_uc_init_hw() as soon as GuC WOPCM partitioning failed v5: - Moved GuC DMA WOPCM register updating code into intel_wopcm.c - Took care of the locking status before writing to GuC DMA Write-Once registers. (Joonas) v6: - Made sure the GuC WOPCM size to be multiple of 4K (4K aligned) v8: - Updated comments and fixed naming issues (Sagar/Joonas) - Updated commit message to include more description about the hardware restriction on GuC WOPCM size (Sagar) v9: - Minor changes variable names and code comments (Sagar) - Added detailed GuC WOPCM layout drawing (Sagar/Michal) - Refined macro definitions to be reader friendly (Michal) - Removed redundent check to valid flag (Michal) - Unified first parameter for exported GuC WOPCM functions (Michal) - Refined the name and parameter list of hardware restriction checking functions (Michal) v10: - Used shorter function name for internal functions (Joonas) - Moved init-ealry function into c file (Joonas) - Consolidated and removed redundant size checks (Joonas/Michal) - Removed unnecessary unlikely() from code which is only called once during boot (Joonas) - More fixes to kernel-doc format and content (Michal) - Avoided the use of PAGE_MASK for 4K pages (Michal) - Added error log messages to error paths (Michal) v11: - Replaced intel_guc_wopcm with more generic intel_wopcm and attached intel_wopcm to drm_i915_private instead intel_guc (Michal) - dynamic calculation of GuC non-wopcm memory start (a.k.a WOPCM Top offset from GuC WOPCM base) (Michal) - Moved WOPCM marco definitions into .c source file (Michal) - Exported WOPCM layout diagram as kernel-doc (Michal) v12: - Updated naming, function kernel-doc to align with new changes (Michal) v13: - Updated the ordering of s-o-b/cc/r-b tags (Sagar) - Corrected one tense error in comment (Sagar) - Corrected typos and removed spurious comments (Joonas) Bspec: 12690 Signed-off-by: Jackie Li <yaodong.li@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com> Cc: Sujaritha Sundaresan <sujaritha.sundaresan@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Spotswood <john.a.spotswood@intel.com> Cc: Oscar Mateo <oscar.mateo@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com> (v8) Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (v9) Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> (v11) Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> (v12) Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1520987574-19351-2-git-send-email-yaodong.li@intel.com	2018-03-14 15:35:33 +02:00
Chris Wilson	629820fcd0	drm/i915: Show GEM_TRACE when detecting a failed GPU idle If we timeout waiting for the GPU to idle, something went seriously wrong. We currently dump the engine state, but we can also dump the ftrace buffer showing our last operations (when available). In passing, note that since commit `559e040f1f` ("drm/i915: Show the GPU state when declaring wedged", we now show the engine state twice, once in detecting the failed idle and then again on declaring wedged. v2: ftrace_dump() takes a parameter specifying whether to dump all cpu buffers or the local cpu's. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180309101114.1138-1-chris@chris-wilson.co.uk	2018-03-13 21:41:09 +00:00
Dhinakaran Pandiyan	07bcd99b80	drm/i915/frontbuffer: Pull frontbuffer_flush out of gem_obj_pin_to_display i915_gem_obj_pin_to_display() calls frontbuffer_flush with origin set to DIRTYFB. The callers however are at a vantage point to decide if hardware frontbuffer tracking can do the flush for us. For example, legacy cursor updates, like flips, write to MMIO registers, which then triggers PSR flush by the hardware. Moving frontbuffer_flush out will enable us to skip a software initiated flush by setting origin to FLIP. Thanks to Chris for the idea. v2: Rebased due to Ville adding intel_plane_pin_fb(). Minor code reordering as fb_obj_flush doesn't need struct_mutex (Chris) Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180307033420.3086-1-dhinakaran.pandiyan@intel.com	2018-03-13 13:49:39 -07:00
Michal Wajdeczko	c37d572820	drm/i915/uc: Sanitize uC together with GEM Instead of dancing around uC on reset/suspend/resume scenarios, explicitly sanitize uC when we sanitize GEM to force uC reload and start from known beginning. v2: don't forget about reset path (Daniele) sanitize uc before gem initiated full reset (Daniele) v3: drop redundant disable_communication in init_hw (Daniele) Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180312130308.22952-3-michal.wajdeczko@intel.com	2018-03-12 22:06:19 +00:00
Chris Wilson	68ad361285	drm/i915: Only call tasklet_kill() on the first prepare_reset tasklet_kill() will spin waiting for the current tasklet to be executed. However, if tasklet_disable() has been called, then the tasklet is never executed but permanently put back onto the runlist until tasklet_enable() is called. Ergo, we cannot use tasklet_kill() inside a disable/enable pair. This is the case when we call set-wedge from inside i915_reset(), and another request was submitted to us concurrent to the reset. Fixes: `963ddd63c3` ("drm/i915: Suspend submission tasklets around wedging") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180307134226.25492-6-chris@chris-wilson.co.uk	2018-03-09 14:13:35 +00:00
Chris Wilson	47650db02d	drm/i915: Wrap engine->schedule in RCU locks for set-wedge protection Similar to the staging around handling of engine->submit_request, we need to stop adding to the execlists->queue prior to calling engine->cancel_requests. cancel_requests will move requests from the queue onto the timeline, so if we add a request onto the queue after that point, it will be lost. Fixes: `af7a8ffad9` ("drm/i915: Use rcu instead of stop_machine in set_wedged") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180307134226.25492-5-chris@chris-wilson.co.uk	2018-03-09 14:13:34 +00:00
Chris Wilson	2d4ecace3a	drm/i915: Finish the wait-for-wedge by retiring all the inflight requests Before we reset the GPU after marking the device as wedged, we wait for all the remaining requests to be completed (and marked as EIO). Afterwards, we should flush the request lists so the next batch start with the driver in an idle state. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180307134226.25492-1-chris@chris-wilson.co.uk	2018-03-09 14:13:25 +00:00
Chris Wilson	fa73055b84	drm/i915: Only prune fences after wait-for-all Currently, we only allow ourselves to prune the fences so long as all the waits completed (i.e. all the fences we checked were signaled), and that the reservation snapshot did not change across the wait. However, if we only waited for a subset of the reservation object, i.e. just waiting for the last writer to complete as opposed to all readers as well, then we would erroneously conclude we could prune the fences as indeed although all of our waits were successful, they did not represent the totality of the reservation object. v2: We only need to check the shared fences due to construction (i.e. all of the shared fences will be later than the exclusive fence, if any). Fixes: `e54ca97747` ("drm/i915: Remove completed fences after a wait") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180307171303.29466-1-chris@chris-wilson.co.uk	2018-03-08 18:03:36 +00:00
Michal Wajdeczko	7cfca4afd6	drm/i915/uc: Introduce intel_uc_suspend\|resume We want to use higher level 'uc' functions as the main entry points to the GuC/HuC code to hide some details and keep code layered. While here, move call to disable_guc_interrupts after sending suspend action to the GuC to allow it work also with CTB as comm mechanism. v2: update commit msg (Sagar) Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Sagar Arun Kamble <sagar.a.kamble@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180302111550.21328-1-michal.wajdeczko@intel.com	2018-03-02 23:11:12 +00:00
Chris Wilson	963ddd63c3	drm/i915: Suspend submission tasklets around wedging After staring hard at sequences like [ 28.199013] systemd-1 2..s. 26062228us : execlists_submission_tasklet: rcs0 cs-irq head=0 [0?], tail=1 [1?] [ 28.199095] systemd-1 2..s. 26062229us : execlists_submission_tasklet: rcs0 csb[1]: status=0x00000018:0x00000000, active=0x1 [ 28.199177] systemd-1 2..s. 26062230us : execlists_submission_tasklet: rcs0 out[0]: ctx=0.1, seqno=3, prio=-1024 [ 28.199258] systemd-1 2..s. 26062231us : execlists_submission_tasklet: rcs0 completed ctx=0 [ 28.199340] gem_eio-829 1..s1 26066853us : execlists_submission_tasklet: rcs0 in[0]: ctx=1.1, seqno=1, prio=0 [ 28.199421] <idle>-0 2..s. 26066863us : execlists_submission_tasklet: rcs0 cs-irq head=1 [1?], tail=2 [2?] [ 28.199503] <idle>-0 2..s. 26066865us : execlists_submission_tasklet: rcs0 csb[2]: status=0x00000001:0x00000000, active=0x1 [ 28.199585] gem_eio-829 1..s1 26067077us : execlists_submission_tasklet: rcs0 in[1]: ctx=3.1, seqno=2, prio=0 [ 28.199667] gem_eio-829 1..s1 26067078us : execlists_submission_tasklet: rcs0 in[0]: ctx=1.2, seqno=1, prio=0 [ 28.199749] <idle>-0 2..s. 26067084us : execlists_submission_tasklet: rcs0 cs-irq head=2 [2?], tail=3 [3?] [ 28.199830] <idle>-0 2..s. 26067085us : execlists_submission_tasklet: rcs0 csb[3]: status=0x00008002:0x00000001, active=0x1 [ 28.199912] <idle>-0 2..s. 26067086us : execlists_submission_tasklet: rcs0 out[0]: ctx=1.2, seqno=1, prio=0 [ 28.199994] gem_eio-829 2..s. 28246084us : execlists_submission_tasklet: rcs0 cs-irq head=3 [3?], tail=4 [4?] [ 28.200096] gem_eio-829 2..s. 28246088us : execlists_submission_tasklet: rcs0 csb[4]: status=0x00000014:0x00000001, active=0x5 [ 28.200178] gem_eio-829 2..s. 28246089us : execlists_submission_tasklet: rcs0 out[0]: ctx=0.0, seqno=0, prio=0 [ 28.200260] gem_eio-829 2..s. 28246127us : execlists_submission_tasklet: execlists_submission_tasklet:886 GEM_BUG_ON(buf[2 * head + 1] != port->context_id) the conclusion is that the only place where the ports are reset to zero, is from engine->cancel_requests called during i915_gem_set_wedged(). The race is horrible as it results from calling set-wedged on active HW (the GPU reset failed) and as such we need to be careful as the HW state changes beneath us. Fortunately, it's the same scary conditions as affect normal reset, so we can reuse the same machinery to disable state tracking as we clobber it. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104945 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Michel Thierry <michel.thierry@intel.com> Fixes: `af7a8ffad9` ("drm/i915: Use rcu instead of stop_machine in set_wedged") Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180302113324.23189-2-chris@chris-wilson.co.uk	2018-03-02 23:11:11 +00:00
Chris Wilson	ffed7bd236	drm/i915: Replace open-coded wait-for loop Now that we can pass arbitrary commands into the base __wait_for() macro, we can reimplement the open-coded wait-for inside i915_gem_idle_work_handler() using the new macro. This means that instead of using ktime, we now use jiffies, and benefit from the exponential sleep backoff that allows a fast response if the HW settles quickly. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180301103338.5380-1-chris@chris-wilson.co.uk	2018-03-01 17:42:58 +00:00
Chris Wilson	e61e0f51ba	drm/i915: Rename drm_i915_gem_request to i915_request We want to de-emphasize the link between the request (dependency, execution and fence tracking) from GEM and so rename the struct from drm_i915_gem_request to i915_request. That is we may implement the GEM user interface on top of requests, but they are an abstraction for tracking execution rather than an implementation detail of GEM. (Since they are not tied to HW, we keep the i915 prefix as opposed to intel.) In short, the spatch: @@ @@ - struct drm_i915_gem_request + struct i915_request A corollary to contracting the type name, we also harmonise on using 'rq' shorthand for local variables where space if of the essence and repetition makes 'request' unwieldy. For globals and struct members, 'request' is still much preferred for its clarity. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180221095636.6649-1-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2018-02-21 20:57:22 +00:00
Chris Wilson	5935485f8e	drm/i915: Move the policy for placement of the GGTT vma into the caller Currently we make the unilateral decision inside i915_gem_object_pin_to_display() where the VMA should resided (inside the fence and mappable region or above?). This is not our decision to make as it impacts on how the display engine can use the resulting scanout object, and it would rather instruct us where to place the VMA so that it can enable the features it wants. As such, make the pin flags an argument to i915_gem_object_pin_to_display() and control them from intel_pin_and_fence_fb_obj() Whilst taking control of the mapping for ourselves, start tracking how we use it to avoid trying to free a fence we never claimed: <3>[ 227.151869] GEM_BUG_ON(vma->fence->pin_count <= 0) <4>[ 227.152064] ------------[ cut here ]------------ <2>[ 227.152068] kernel BUG at drivers/gpu/drm/i915/i915_vma.h:391! <4>[ 227.152084] invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI <0>[ 227.152092] Dumping ftrace buffer: <0>[ 227.152099] (ftrace buffer empty) <4>[ 227.152102] Modules linked in: i915 snd_hda_codec_analog snd_hda_codec_generic coretemp snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm lpc_ich e1000e mei_me mei prime_numbers <4>[ 227.152131] CPU: 1 PID: 1587 Comm: kworker/u16:49 Tainted: G U 4.16.0-rc1-gbab67b2f6177-kasan_7+ #1 <4>[ 227.152134] Hardware name: Dell Inc. OptiPlex 755 /0PU052, BIOS A08 02/19/2008 <4>[ 227.152236] Workqueue: events_unbound intel_atomic_commit_work [i915] <4>[ 227.152292] RIP: 0010:intel_unpin_fb_vma+0x23a/0x2a0 [i915] <4>[ 227.152295] RSP: 0018:ffff88005aad7b68 EFLAGS: 00010286 <4>[ 227.152300] RAX: 0000000000000026 RBX: ffff88005c359580 RCX: 0000000000000000 <4>[ 227.152304] RDX: 0000000000000026 RSI: ffffffff8707d840 RDI: ffffed000b55af63 <4>[ 227.152307] RBP: ffff880056817e58 R08: 0000000000000001 R09: 0000000000000000 <4>[ 227.152311] R10: ffff88005aad7b88 R11: 0000000000000000 R12: ffff8800568184d0 <4>[ 227.152314] R13: ffff880065b5ab08 R14: 0000000000000000 R15: dffffc0000000000 <4>[ 227.152318] FS: 0000000000000000(0000) GS:ffff88006ac40000(0000) knlGS:0000000000000000 <4>[ 227.152322] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 <4>[ 227.152325] CR2: 00007f5fb25550a8 CR3: 0000000068c78000 CR4: 00000000000006e0 <4>[ 227.152328] Call Trace: <4>[ 227.152385] intel_cleanup_plane_fb+0x6b/0xd0 [i915] <4>[ 227.152395] drm_atomic_helper_cleanup_planes+0x166/0x280 <4>[ 227.152452] intel_atomic_commit_tail+0x159d/0x3380 [i915] <4>[ 227.152463] ? process_one_work+0x66e/0x1460 <4>[ 227.152516] ? skl_update_crtcs+0x9c0/0x9c0 [i915] <4>[ 227.152523] ? lock_acquire+0x13d/0x390 <4>[ 227.152527] ? lock_acquire+0x13d/0x390 <4>[ 227.152534] process_one_work+0x71a/0x1460 <4>[ 227.152540] ? __schedule+0x815/0x1e20 <4>[ 227.152547] ? pwq_dec_nr_in_flight+0x2b0/0x2b0 <4>[ 227.152553] ? _raw_spin_lock_irq+0xa/0x40 <4>[ 227.152559] worker_thread+0xdf/0xf60 <4>[ 227.152569] ? process_one_work+0x1460/0x1460 <4>[ 227.152573] kthread+0x2cf/0x3c0 <4>[ 227.152578] ? _kthread_create_on_node+0xa0/0xa0 <4>[ 227.152583] ret_from_fork+0x3a/0x50 <4>[ 227.152591] Code: c6 00 11 86 c0 48 c7 c7 e0 bd 85 c0 e8 60 e7 a9 c4 0f ff e9 1f fe ff ff 48 c7 c6 40 10 86 c0 48 c7 c7 e0 ca 85 c0 e8 2b 95 bd c4 <0f> 0b 48 89 ef e8 4c 44 e8 c4 e9 ef fd ff ff e8 42 44 e8 c4 e9 <1>[ 227.152720] RIP: intel_unpin_fb_vma+0x23a/0x2a0 [i915] RSP: ffff88005aad7b68 v2: i915_vma_pin_fence() is a no-op if a fence isn't required, so check vma->fence as well. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180220134208.24988-2-chris@chris-wilson.co.uk	2018-02-20 19:03:59 +00:00
Chris Wilson	ac87a6fd36	drm/i915: Also check view->type for a normal GGTT view We cannot simply use !view as shorthand for all normal GGTT views as a few callers will always populate a i915_ggtt_view struct and set the type to NORMAL instead. So check for (!view \|\| view->type == NORMAL) inside i915_gem_object_ggtt_pin(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180220134208.24988-1-chris@chris-wilson.co.uk	2018-02-20 19:03:59 +00:00
Chris Wilson	c9c7047154	drm/i915: Track number of pending freed objects During igt, we frequently call into the driver to reset both HW and driver state (idling the device, waiting for it to become idle and freeing off old objects) to ensure that we start each test/subtest/pass from known state. This process incurs an RCU barrier or two to ensure that any such pending frees are indeed flushed before we return. However, unconditionally waiting on the RCU barrier adds needless delay to many callers, which adds up to several seconds when repeated thousands of times. We can skip the rcu_barrier() if by tracking how many outstanding frees we have, we know there are none. The same path is used along suspend, where we may be able to save the unconditional RCU barrier. To put it into perspective with a completely meaningless microbenchmark, igt/gem_sync/idle is improved from 50ms to 30us on bdw. v2: Remove the extra synchronize_rcu() inside i915_drop_caches_set() Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180219220631.25001-1-chris@chris-wilson.co.uk	2018-02-20 09:10:41 +00:00
Christian König	c0a51fd07b	drm: move read_domains and write_domain into i915 i915 is the only driver using those fields in the drm_gem_object structure, so they only waste memory for all other drivers. Move the fields into drm_i915_gem_object instead and patch the i915 code with the following sed commands: sed -i "s/obj->base.read_domains/obj->read_domains/g" drivers/gpu/drm/i915/.c drivers/gpu/drm/i915//.c sed -i "s/obj->base.write_domain/obj->write_domain/g" drivers/gpu/drm/i915/.c drivers/gpu/drm/i915//.c Change is only compile tested. v2: move fields around as suggested by Chris. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180216124338.9087-1-christian.koenig@amd.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2018-02-16 14:12:48 +00:00
Tvrtko Ursulin	c56b89f16d	drm/i915: Use INTEL_GEN everywhere Coccinelle patch: @@ identifier p; @@ -INTEL_INFO(p)->gen +INTEL_GEN(p) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180208130606.15556-12-tvrtko.ursulin@linux.intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180209215847.6660-1-chris@chris-wilson.co.uk	2018-02-09 22:29:02 +00:00
Chris Wilson	0d73e7a095	drm/i915: Mark the device as wedged from the beginning of set-wedged Reduce the window of opportunity for set-wedged being called concurrently with reset (after i915_reset() has performed the i915_gem_unset_wedged()) by moving the set_bit(I915_WEDGED) to before we complete the inflight requests. When i915_reset() is being blocked on a request, such completion may allow it to start and beginning resetting the GPU before i915_gem_set_wedged() has finished (and so before set-wedge will have marked the device as wedged). As such, i915_gem_init_hw() may see a wedged device even from inside i915_reset(). References: `36703e79a9` ("drm/i915: Break modeset deadlocks on reset") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180207151350.20883-1-chris@chris-wilson.co.uk	2018-02-08 11:44:27 +00:00
Daniele Ceraolo Spurio	ce1599a40d	drm/i915: do not stop engines on sanitize if i915.reset=0 Since commit `5896a5c8c9` (drm/i915: Always stop the rings before a missing GPU reset) we attempt to stop the engines during gem_sanitize even if reset=0 and nothing bad happened on the gpu. The specs says that the STOP_RINGS bit needs to be cleared to resume normal operation, but for some reason the value of the bit seems to be changing without us writing to it (maybe rc6 entry/exit?), so normal operation resumes correctly. However, it still feels incorrect to stop the engines if there hasn't been any issue so skip the whole reset call in gem_sanitize if i915.reset=0 Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180207212440.13438-1-daniele.ceraolospurio@intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2018-02-08 07:34:32 +00:00
Chris Wilson	3fed180812	drm/i915: Move the scheduler feature bits into the purview of the engines Rather than having the high level ioctl interface guess the underlying implementation details, having the implementation declare what capabilities it exports. We define an intel_driver_caps, similar to the intel_device_info, which instead of trying to describe the HW gives details on what the driver itself supports. This is then populated by the engine backend for the new scheduler capability field for use elsewhere. v2: Use caps.scheduler for validating CONTEXT_PARAM_SET_PRIORITY (Mika) One less assumption of engine[RCS] \o/ Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tomasz Lis <tomasz.lis@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Tomasz Lis <tomasz.lis@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180207210544.26351-2-chris@chris-wilson.co.uk Reviewed-by: Michel Thierry <michel.thierry@intel.com>	2018-02-08 07:30:11 +00:00
Chris Wilson	8177e11252	drm/i915: Tidy up some error messages around reset failure On blb and pnv, we are seeing sporadic i915 0000:00:02.0: Resetting chip after gpu hang [drm:intel_gpu_reset [i915]] rcs0: timed out on STOP_RING [drm:i915_reset [i915]] ERROR Failed hw init on reset -5 which notably lack the actual root cause of the error. Ostensibly it should be the init_ring_common() that failed, but it's error paths are covered by DRM_ERROR. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180207111545.17078-1-chris@chris-wilson.co.uk	2018-02-07 13:12:32 +00:00
Chris Wilson	01b8fdc522	drm/i915: Skip post-reset request emission if the engine is not idle Since commit `7b6da818d8` ("drm/i915: Restore the kernel context after a GPU reset on an idle engine") we submit a request following the engine reset. The intent is that we don't submit a request if the engine is busy (as it will restart active by itself) but we only checked to see if there were remaining requests in flight on the hardware and skipped checking to see if there were any ready requests that would be immediately submitted on restart (the same time as our new request would be). Having convinced the engine to appear idle in the previous patch, we can use intel_engine_is_idle() as a better test to only submit a new request if there are no pending requests. As it happens, this is tripping up igt/drv_selftest/live_hangcheck in CI as we overfill the kernel_context ringbuffer trigger an infinite recursion from within the reset. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104786 References: `7b6da818d8` ("drm/i915: Restore the kernel context after a GPU reset on an idle engine") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180205152431.12163-4-chris@chris-wilson.co.uk	2018-02-05 15:27:26 +00:00
Chris Wilson	24eae08d44	drm/i915: Remove unbannable context spam from reset During testing, we trigger a lot of resets on an unbannable context leading to massive amounts of irrelevant debug spam. Remove the ban_score accounting and message for the unbannable context so that we improve the signal:noise in the log messages for when the unexpected occurs. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180205092201.19476-7-chris@chris-wilson.co.uk	2018-02-05 13:24:45 +00:00
Chris Wilson	559e040f1f	drm/i915: Show the GPU state when declaring wedged Dump each engine state when i915_gem_set_wedged() is called to give us some more clues as to why we had to terminate the GPU. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180205092201.19476-5-chris@chris-wilson.co.uk	2018-02-05 13:23:40 +00:00
Chris Wilson	9e519bc8b9	drm/i915: Add some newlines to intel_engine_dump() headers The headers should be on a separate line for consistency, so add the missing trailing newline in a few intel_engine_dump() callers. Reported-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180205100618.11001-1-chris@chris-wilson.co.uk	2018-02-05 10:59:59 +00:00
Chris Wilson	889230489b	drm/i915: Always run hangcheck while the GPU is busy Previously, we relied on only running the hangcheck while somebody was waiting on the GPU, in order to minimise the amount of time hangcheck had to run. (If nobody was watching the GPU, nobody would notice if the GPU wasn't responding -- eventually somebody would care and so kick hangcheck into action.) However, this falls apart from around commit `4680816be3` ("drm/i915: Wait first for submission, before waiting for request completion"), as not all waiters declare themselves to hangcheck and so we could switch off hangcheck and miss GPU hangs even when waiting under the struct_mutex. If we enable hangcheck from the first request submission, and let it run until the GPU is idle again, we forgo all the complexity involved with only enabling around waiters. We just have to remember to be careful that we do not declare a GPU hang when idly waiting for the next request to be come ready, as we will run hangcheck continuously even when the engines are stalled waiting for external events. This should be true already as we should only be tracking requests submitted to hardware for execution as an indicator that the engine is busy. Fixes: `4680816be3` ("drm/i915: Wait first for submission, before waiting for request completion" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104840 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180129144104.3921-1-chris@chris-wilson.co.uk Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>	2018-01-31 10:10:43 +00:00
Sagar Arun Kamble	70deeaddc6	drm/i915/guc: Fix lockdep due to log relay channel handling under struct_mutex This patch fixes lockdep issue due to circular locking dependency of struct_mutex, i_mutex_key, mmap_sem, relay_channels_mutex. For GuC log relay channel we create debugfs file that requires i_mutex_key lock and we are doing that under struct_mutex. So we introduced newer dependency as: &dev->struct_mutex --> &sb->s_type->i_mutex_key#3 --> &mm->mmap_sem However, there is dependency from mmap_sem to struct_mutex. Hence we separate the relay create/destroy operation from under struct_mutex. Also added runtime check of relay buffer status. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> ====================================================== WARNING: possible circular locking dependency detected 4.15.0-rc6-CI-Patchwork_7614+ #1 Not tainted ------------------------------------------------------ debugfs_test/1388 is trying to acquire lock: (&dev->struct_mutex){+.+.}, at: [<00000000d5e1d915>] i915_mutex_lock_interruptible+0x47/0x130 [i915] but task is already holding lock: (&mm->mmap_sem){++++}, at: [<0000000029a9c131>] __do_page_fault+0x106/0x560 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #3 (&mm->mmap_sem){++++}: _copy_to_user+0x1e/0x70 filldir+0x8c/0xf0 dcache_readdir+0xeb/0x160 iterate_dir+0xdc/0x140 SyS_getdents+0xa0/0x130 entry_SYSCALL_64_fastpath+0x1c/0x89 -> #2 (&sb->s_type->i_mutex_key#3){++++}: start_creating+0x59/0x110 __debugfs_create_file+0x2e/0xe0 relay_create_buf_file+0x62/0x80 relay_late_setup_files+0x84/0x250 guc_log_late_setup+0x4f/0x110 [i915] i915_guc_log_register+0x32/0x40 [i915] i915_driver_load+0x7b6/0x1720 [i915] i915_pci_probe+0x2e/0x90 [i915] pci_device_probe+0x9c/0x120 driver_probe_device+0x2a3/0x480 __driver_attach+0xd9/0xe0 bus_for_each_dev+0x57/0x90 bus_add_driver+0x168/0x260 driver_register+0x52/0xc0 do_one_initcall+0x39/0x150 do_init_module+0x56/0x1ef load_module+0x231c/0x2d70 SyS_finit_module+0xa5/0xe0 entry_SYSCALL_64_fastpath+0x1c/0x89 -> #1 (relay_channels_mutex){+.+.}: relay_open+0x12c/0x2b0 intel_guc_log_runtime_create+0xab/0x230 [i915] intel_guc_init+0x81/0x120 [i915] intel_uc_init+0x29/0xa0 [i915] i915_gem_init+0x182/0x530 [i915] i915_driver_load+0xaa9/0x1720 [i915] i915_pci_probe+0x2e/0x90 [i915] pci_device_probe+0x9c/0x120 driver_probe_device+0x2a3/0x480 __driver_attach+0xd9/0xe0 bus_for_each_dev+0x57/0x90 bus_add_driver+0x168/0x260 driver_register+0x52/0xc0 do_one_initcall+0x39/0x150 do_init_module+0x56/0x1ef load_module+0x231c/0x2d70 SyS_finit_module+0xa5/0xe0 entry_SYSCALL_64_fastpath+0x1c/0x89 -> #0 (&dev->struct_mutex){+.+.}: __mutex_lock+0x81/0x9b0 i915_mutex_lock_interruptible+0x47/0x130 [i915] i915_gem_fault+0x201/0x790 [i915] __do_fault+0x15/0x70 __handle_mm_fault+0x677/0xdc0 handle_mm_fault+0x14f/0x2f0 __do_page_fault+0x2d1/0x560 page_fault+0x4c/0x60 other info that might help us debug this: Chain exists of: &dev->struct_mutex --> &sb->s_type->i_mutex_key#3 --> &mm->mmap_sem Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&mm->mmap_sem); lock(&sb->s_type->i_mutex_key#3); lock(&mm->mmap_sem); lock(&dev->struct_mutex); * DEADLOCK * 1 lock held by debugfs_test/1388: #0: (&mm->mmap_sem){++++}, at: [<0000000029a9c131>] __do_page_fault+0x106/0x560 stack backtrace: CPU: 2 PID: 1388 Comm: debugfs_test Not tainted 4.15.0-rc6-CI-Patchwork_7614+ #1 Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./J4205-ITX, BIOS P1.10 09/29/2016 Call Trace: dump_stack+0x5f/0x86 print_circular_bug.isra.18+0x1d0/0x2c0 __lock_acquire+0x14ae/0x1b60 ? lock_acquire+0xaf/0x200 lock_acquire+0xaf/0x200 ? i915_mutex_lock_interruptible+0x47/0x130 [i915] __mutex_lock+0x81/0x9b0 ? i915_mutex_lock_interruptible+0x47/0x130 [i915] ? i915_mutex_lock_interruptible+0x47/0x130 [i915] ? i915_mutex_lock_interruptible+0x47/0x130 [i915] i915_mutex_lock_interruptible+0x47/0x130 [i915] ? __pm_runtime_resume+0x4f/0x80 i915_gem_fault+0x201/0x790 [i915] __do_fault+0x15/0x70 ? _raw_spin_unlock+0x29/0x40 __handle_mm_fault+0x677/0xdc0 handle_mm_fault+0x14f/0x2f0 __do_page_fault+0x2d1/0x560 ? page_fault+0x36/0x60 page_fault+0x4c/0x60 v2: Added lock protection to guc->log.runtime.relay_chan (Chris) Fixed locking inside guc_flush_logs uncovered by new lockdep. v3: Locking guc_read_update_log_buffer entirely with relay_lock. (Chris) Prepared intel_guc_init_early. Moved relay_lock inside relay_create relay_destroy, relay_file_create, guc_read_update_log_buffer. (Michal) Removed struct_mutex lock around guc_log_flush and removed usage of guc_log_has_relay() from runtime_create path as it needs struct_mutex lock. v4: Handle NULL relay sub buffer pointer earlier in read_update_log_buffer (Chris). Fixed comment suffix **/. (Michal) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104693 Testcase: igt/debugfs_test/read_all_entries # with enable_guc=1 and guc_log_level=1 Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Marta Lofstedt <marta.lofstedt@intel.com> Cc: Michal Winiarski <michal.winiarski@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/1516808821-3638-3-git-send-email-sagar.a.kamble@intel.com	2018-01-24 19:44:04 +00:00
Chris Wilson	84a1074920	drm/i915: Shrink the GEM kmem_caches upon idling When we finally decide the gpu is idle, that is a good time to shrink our kmem_caches. v3: Defer until an rcu grace period after we idle. v4: Think about epoch wraparound and how likely that is. v5: Use I915_EPOCH_INVALID magic. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180124113608.14909-2-chris@chris-wilson.co.uk	2018-01-24 15:28:37 +00:00
Chris Wilson	e9af4ea2b9	drm/i915: Avoid waitboosting on the active request Watching a light workload on Baytrail (running glxgears and a 1080p decode), instead of the system remaining at low frequency, the glxgears would regularly trigger waitboosting after which it would have to spend a few seconds throttling back down. In this case, the waitboosting is counter productive as the minimal wait for glxgears doesn't prevent it from functioning correctly and delivering frames on time. In this case, glxgears happens to almost always be waiting on the current request, which we already expect to complete quickly (see i915_spin_request) and so avoiding the waitboost on the active request and spinning instead provides the best latency without overcommitting to upclocking. However, if the system falls behind we still force the waitboost. Similarly, we will also trigger upclocking if we detect the system is not delivering frames on time - again using a mechanism that tries to detect a miss and not preemptively upclock. v2: Also skip boosting for after missed vblank if the desired request is already active. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Radoslaw Szwichtenberg <radoslaw.szwichtenberg@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180118131609.16574-1-chris@chris-wilson.co.uk	2018-01-18 17:14:30 +00:00
Chris Wilson	2ef1e729c7	drm/i915: Rewrite some comments around RCU-deferred object free Tvrtko noticed that the comments describing the interaction of RCU and the deferred worker for freeing drm_i915_gem_object were a little confusing, so attempt to bring some sense to them. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180115205759.13884-1-chris@chris-wilson.co.uk	2018-01-16 10:47:39 +00:00
Chris Wilson	beacbd1615	drm/i915: Use our singlethreaded wq for freeing objects As freeing the objects require serialisation on struct_mutex, we should prefer to use our singlethreaded driver wq that is dedicated to work requiring struct_mutex (hence serialised).The benefit should be less clutter on the system wq, allowing it to make progress even when the driver/struct_mutex is heavily contended. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180115122846.15193-1-chris@chris-wilson.co.uk Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>	2018-01-15 20:33:01 +00:00
Sagar Arun Kamble	da943b5ab0	drm/i915/guc: Add uc_fini_wq in gem_init unwind path While moving code around for solving lockdep issue for GuC log relay, spotted that uc_fini_wq is not being called in failure path in gem_init. Missed in the below commit. Add it. v2: Removed GEM_BUG_ON(!HAS_GUC()) from intel_uc_fini_wq as init happens only based on enable_guc module parameter and does not consider has_guc capability. (Michal) Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com> Fixes: `3176ff49bc` ("drm/i915/guc: Move GuC workqueue allocations outside of the mutex") Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1515588857-10283-1-git-send-email-sagar.a.kamble@intel.com Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2018-01-10 14:03:10 +00:00
Chris Wilson	c218ee03b9	drm/i915: Don't adjust priority on an already signaled fence When we retire a signaled fence, we free the dependency tree. However, we skip clearing the list so that if we then try to adjust the priority of the signaled fence, we may walk the list of freed dependencies. [ 3083.156757] ================================================================== [ 3083.156806] BUG: KASAN: use-after-free in execlists_schedule+0x199/0x660 [i915] [ 3083.156810] Read of size 8 at addr ffff8806bf20f400 by task Xorg/831 [ 3083.156815] CPU: 0 PID: 831 Comm: Xorg Not tainted 4.15.0-rc6-no-psn+ #1 [ 3083.156817] Hardware name: Notebook N24_25BU/N24_25BU, BIOS 5.12 02/17/2017 [ 3083.156818] Call Trace: [ 3083.156823] dump_stack+0x5c/0x7a [ 3083.156827] print_address_description+0x6b/0x290 [ 3083.156830] kasan_report+0x28f/0x380 [ 3083.156872] ? execlists_schedule+0x199/0x660 [i915] [ 3083.156914] execlists_schedule+0x199/0x660 [i915] [ 3083.156956] ? intel_crtc_atomic_check+0x146/0x4e0 [i915] [ 3083.156997] ? execlists_submit_request+0xe0/0xe0 [i915] [ 3083.157038] ? i915_vma_misplaced.part.4+0x25/0xb0 [i915] [ 3083.157079] ? __i915_vma_do_pin+0x7c8/0xc80 [i915] [ 3083.157121] ? intel_atomic_state_alloc+0x44/0x60 [i915] [ 3083.157130] ? drm_atomic_helper_page_flip+0x3e/0xb0 [drm_kms_helper] [ 3083.157145] ? drm_mode_page_flip_ioctl+0x7d2/0x850 [drm] [ 3083.157159] ? drm_ioctl_kernel+0xa7/0xf0 [drm] [ 3083.157172] ? drm_ioctl+0x45b/0x560 [drm] [ 3083.157211] i915_gem_object_wait_priority+0x14c/0x2c0 [i915] [ 3083.157251] ? i915_gem_get_aperture_ioctl+0x150/0x150 [i915] [ 3083.157290] ? i915_vma_pin_fence+0x1d8/0x320 [i915] [ 3083.157331] ? intel_pin_and_fence_fb_obj+0x175/0x250 [i915] [ 3083.157372] ? intel_rotation_info_size+0x60/0x60 [i915] [ 3083.157413] ? intel_link_compute_m_n+0x80/0x80 [i915] [ 3083.157428] ? drm_dev_printk+0x1b0/0x1b0 [drm] [ 3083.157443] ? drm_dev_printk+0x1b0/0x1b0 [drm] [ 3083.157485] intel_prepare_plane_fb+0x2f8/0x5a0 [i915] [ 3083.157527] ? intel_crtc_get_vblank_counter+0x80/0x80 [i915] [ 3083.157536] drm_atomic_helper_prepare_planes+0xa0/0x1c0 [drm_kms_helper] [ 3083.157587] intel_atomic_commit+0x12e/0x4e0 [i915] [ 3083.157605] drm_atomic_helper_page_flip+0xa2/0xb0 [drm_kms_helper] [ 3083.157621] drm_mode_page_flip_ioctl+0x7d2/0x850 [drm] [ 3083.157638] ? drm_mode_cursor2_ioctl+0x10/0x10 [drm] [ 3083.157652] ? drm_lease_owner+0x1a/0x30 [drm] [ 3083.157668] ? drm_mode_cursor2_ioctl+0x10/0x10 [drm] [ 3083.157681] drm_ioctl_kernel+0xa7/0xf0 [drm] [ 3083.157696] drm_ioctl+0x45b/0x560 [drm] [ 3083.157711] ? drm_mode_cursor2_ioctl+0x10/0x10 [drm] [ 3083.157725] ? drm_getstats+0x20/0x20 [drm] [ 3083.157729] ? timerqueue_del+0x49/0x80 [ 3083.157732] ? __remove_hrtimer+0x62/0xb0 [ 3083.157735] ? hrtimer_try_to_cancel+0x173/0x210 [ 3083.157738] do_vfs_ioctl+0x13b/0x880 [ 3083.157741] ? ioctl_preallocate+0x140/0x140 [ 3083.157744] ? _raw_spin_unlock_irq+0xe/0x30 [ 3083.157746] ? do_setitimer+0x234/0x370 [ 3083.157750] ? SyS_setitimer+0x19e/0x1b0 [ 3083.157752] ? SyS_alarm+0x140/0x140 [ 3083.157755] ? __rcu_read_unlock+0x66/0x80 [ 3083.157757] ? __fget+0xc4/0x100 [ 3083.157760] SyS_ioctl+0x74/0x80 [ 3083.157763] entry_SYSCALL_64_fastpath+0x1a/0x7d [ 3083.157765] RIP: 0033:0x7f6135d0c6a7 [ 3083.157767] RSP: 002b:00007fff01451888 EFLAGS: 00003246 ORIG_RAX: 0000000000000010 [ 3083.157769] RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007f6135d0c6a7 [ 3083.157771] RDX: 00007fff01451950 RSI: 00000000c01864b0 RDI: 000000000000000c [ 3083.157772] RBP: 00007f613076f600 R08: 0000000000000001 R09: 0000000000000000 [ 3083.157773] R10: 0000000000000060 R11: 0000000000003246 R12: 0000000000000000 [ 3083.157774] R13: 0000000000000060 R14: 000000000000001b R15: 0000000000000060 [ 3083.157779] Allocated by task 831: [ 3083.157783] kmem_cache_alloc+0xc0/0x200 [ 3083.157822] i915_gem_request_await_dma_fence+0x2c4/0x5d0 [i915] [ 3083.157861] i915_gem_request_await_object+0x321/0x370 [i915] [ 3083.157900] i915_gem_do_execbuffer+0x1165/0x19c0 [i915] [ 3083.157937] i915_gem_execbuffer2+0x1ad/0x550 [i915] [ 3083.157950] drm_ioctl_kernel+0xa7/0xf0 [drm] [ 3083.157962] drm_ioctl+0x45b/0x560 [drm] [ 3083.157964] do_vfs_ioctl+0x13b/0x880 [ 3083.157966] SyS_ioctl+0x74/0x80 [ 3083.157968] entry_SYSCALL_64_fastpath+0x1a/0x7d [ 3083.157971] Freed by task 831: [ 3083.157973] kmem_cache_free+0x77/0x220 [ 3083.158012] i915_gem_request_retire+0x72c/0xa70 [i915] [ 3083.158051] i915_gem_request_alloc+0x1e9/0x8b0 [i915] [ 3083.158089] i915_gem_do_execbuffer+0xa96/0x19c0 [i915] [ 3083.158127] i915_gem_execbuffer2+0x1ad/0x550 [i915] [ 3083.158140] drm_ioctl_kernel+0xa7/0xf0 [drm] [ 3083.158153] drm_ioctl+0x45b/0x560 [drm] [ 3083.158155] do_vfs_ioctl+0x13b/0x880 [ 3083.158156] SyS_ioctl+0x74/0x80 [ 3083.158158] entry_SYSCALL_64_fastpath+0x1a/0x7d [ 3083.158162] The buggy address belongs to the object at ffff8806bf20f400 which belongs to the cache i915_dependency of size 64 [ 3083.158166] The buggy address is located 0 bytes inside of 64-byte region [ffff8806bf20f400, ffff8806bf20f440) [ 3083.158168] The buggy address belongs to the page: [ 3083.158171] page:00000000d43decc4 count:1 mapcount:0 mapping: (null) index:0x0 [ 3083.158174] flags: 0x17ffe0000000100(slab) [ 3083.158179] raw: 017ffe0000000100 0000000000000000 0000000000000000 0000000180200020 [ 3083.158182] raw: ffffea001afc16c0 0000000500000005 ffff880731b881c0 0000000000000000 [ 3083.158184] page dumped because: kasan: bad access detected [ 3083.158187] Memory state around the buggy address: [ 3083.158190] ffff8806bf20f300: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc [ 3083.158192] ffff8806bf20f380: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc [ 3083.158195] >ffff8806bf20f400: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc [ 3083.158196] ^ [ 3083.158199] ffff8806bf20f480: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc [ 3083.158201] ffff8806bf20f500: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc [ 3083.158203] ================================================================== Reported-by: Alexandru Chirvasitu <achirvasub@gmail.com> Reported-by: Mike Keehan <mike@keehan.net> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104436 Fixes: `1f181225f8` ("drm/i915/execlists: Keep request->priority for its lifetime") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Alexandru Chirvasitu <achirvasub@gmail.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Tested-by: Alexandru Chirvasitu <achirvasub@gmail.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180106105618.13532-1-chris@chris-wilson.co.uk	2018-01-08 09:16:02 +00:00
Chris Wilson	fcb1de54e2	drm/i915: Add a strong mb to resetting the has-CS-interrupt bit After a reset, the state of the CSB registers are scrubbed and not valid until a powercontext is reloaded. We only know when a powercontext has been reloaded once we see a CS-interrupt, before then we must ignore the CSB registers within the execlists_submission_tasklet. However, glk is sporadically dying with an illegal CSB pointer value (both in the HSWP and mmio) suggesting that it is running with the CS-interrupt bit set before the powercontext has been reloaded. Make sure the clearing of that bit is serialised on reset with the re-enabling of the tasklet. References: https://bugs.freedesktop.org/show_bug.cgi?id=104262 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171219090110.11153-1-chris@chris-wilson.co.uk Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>	2017-12-19 12:43:14 +00:00
Matthew Auld	b65a9b9821	drm/i915: prefer i915_gem_object_has_pages() We have an existing helper for testing obj->mm.pages, so use it. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171218103855.25274-1-matthew.auld@intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2017-12-18 11:53:29 +00:00
Chris Wilson	7b6da818d8	drm/i915: Restore the kernel context after a GPU reset on an idle engine As part of the system requirement for powersaving is that we always have a context loaded. Upon boot and resume, we load the kernel_context to ensure that some valid state is set before powersaving kicks in, we should do so after a full GPU reset as well. We only need to do so for an idle engine, as any active engines will restart by executing the stuck request, loading its context. For the idle engine, we create a new request to load the kernel_context instead. For whatever reason, perfoming a dummy execute on the idle engine after reset papers over a subsequent GPU hang in rare circumstances, even on machines not using contexts (e.g. Pineview). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104259 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104261 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Michel Thierry <michel.thierry@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171216000334.8197-1-chris@chris-wilson.co.uk	2017-12-16 09:37:35 +00:00
Michał Winiarski	61b5c1587d	drm/i915/guc: Extract guc_init from guc_init_hw After GPU reset, GuC HW needs to be reinitialized (with FW reload). Unfortunately, we're doing some extra work there (mostly allocating stuff), work that can be moved to guc_init and called once at driver load time. As a side effect we're no longer hitting an assert in i915_ggtt_enable_guc on suspend/resume. v2: Do not duplicate disable_communication / reset_guc_interrupts v3: Add proper teardown after rebase References: `04f7b24ecc` ("drm/i915/guc: Assert that we switch between known ggtt->invalidate functions") Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171213221352.7173-3-michal.winiarski@intel.com	2017-12-14 08:06:56 +00:00
Michał Winiarski	3176ff49bc	drm/i915/guc: Move GuC workqueue allocations outside of the mutex This gets rid of the following lockdep splat: ====================================================== WARNING: possible circular locking dependency detected 4.15.0-rc2-CI-Patchwork_7428+ #1 Not tainted ------------------------------------------------------ debugfs_test/1351 is trying to acquire lock: (&dev->struct_mutex){+.+.}, at: [<000000009d90d1a3>] i915_mutex_lock_interruptible+0x47/0x130 [i915] but task is already holding lock: (&mm->mmap_sem){++++}, at: [<000000005df01c1e>] __do_page_fault+0x106/0x560 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #6 (&mm->mmap_sem){++++}: __might_fault+0x63/0x90 _copy_to_user+0x1e/0x70 filldir+0x8c/0xf0 dcache_readdir+0xeb/0x160 iterate_dir+0xe6/0x150 SyS_getdents+0xa0/0x130 entry_SYSCALL_64_fastpath+0x1c/0x89 -> #5 (&sb->s_type->i_mutex_key#5){++++}: lockref_get+0x9/0x20 -> #4 ((completion)&req.done){+.+.}: wait_for_common+0x54/0x210 devtmpfs_create_node+0x130/0x150 device_add+0x5ad/0x5e0 device_create_groups_vargs+0xd4/0xe0 device_create+0x35/0x40 msr_device_create+0x22/0x40 cpuhp_invoke_callback+0xc5/0xbf0 cpuhp_thread_fun+0x167/0x210 smpboot_thread_fn+0x17f/0x270 kthread+0x173/0x1b0 ret_from_fork+0x24/0x30 -> #3 (cpuhp_state-up){+.+.}: cpuhp_issue_call+0x132/0x1c0 __cpuhp_setup_state_cpuslocked+0x12f/0x2a0 __cpuhp_setup_state+0x3a/0x50 page_writeback_init+0x3a/0x5c start_kernel+0x393/0x3e2 secondary_startup_64+0xa5/0xb0 -> #2 (cpuhp_state_mutex){+.+.}: __mutex_lock+0x81/0x9b0 __cpuhp_setup_state_cpuslocked+0x4b/0x2a0 __cpuhp_setup_state+0x3a/0x50 page_alloc_init+0x1f/0x26 start_kernel+0x139/0x3e2 secondary_startup_64+0xa5/0xb0 -> #1 (cpu_hotplug_lock.rw_sem){++++}: cpus_read_lock+0x34/0xa0 apply_workqueue_attrs+0xd/0x40 __alloc_workqueue_key+0x2c7/0x4e1 intel_guc_submission_init+0x10c/0x650 [i915] intel_uc_init_hw+0x29e/0x460 [i915] i915_gem_init_hw+0xca/0x290 [i915] i915_gem_init+0x115/0x3a0 [i915] i915_driver_load+0x9a8/0x16c0 [i915] i915_pci_probe+0x2e/0x90 [i915] pci_device_probe+0x9c/0x120 driver_probe_device+0x2a3/0x480 __driver_attach+0xd9/0xe0 bus_for_each_dev+0x57/0x90 bus_add_driver+0x168/0x260 driver_register+0x52/0xc0 do_one_initcall+0x39/0x150 do_init_module+0x56/0x1ef load_module+0x231c/0x2d70 SyS_finit_module+0xa5/0xe0 entry_SYSCALL_64_fastpath+0x1c/0x89 -> #0 (&dev->struct_mutex){+.+.}: lock_acquire+0xaf/0x200 __mutex_lock+0x81/0x9b0 i915_mutex_lock_interruptible+0x47/0x130 [i915] i915_gem_fault+0x201/0x760 [i915] __do_fault+0x15/0x70 __handle_mm_fault+0x85b/0xe40 handle_mm_fault+0x14f/0x2f0 __do_page_fault+0x2d1/0x560 page_fault+0x22/0x30 other info that might help us debug this: Chain exists of: &dev->struct_mutex --> &sb->s_type->i_mutex_key#5 --> &mm->mmap_sem Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&mm->mmap_sem); lock(&sb->s_type->i_mutex_key#5); lock(&mm->mmap_sem); lock(&dev->struct_mutex); * DEADLOCK * 1 lock held by debugfs_test/1351: #0: (&mm->mmap_sem){++++}, at: [<000000005df01c1e>] __do_page_fault+0x106/0x560 stack backtrace: CPU: 2 PID: 1351 Comm: debugfs_test Not tainted 4.15.0-rc2-CI-Patchwork_7428+ #1 Hardware name: /NUC6i5SYB, BIOS SYSKLi35.86A.0057.2017.0119.1758 01/19/2017 Call Trace: dump_stack+0x5f/0x86 print_circular_bug+0x230/0x3b0 check_prev_add+0x439/0x7b0 ? lockdep_init_map_crosslock+0x20/0x20 ? unwind_get_return_address+0x16/0x30 ? __lock_acquire+0x1385/0x15a0 __lock_acquire+0x1385/0x15a0 lock_acquire+0xaf/0x200 ? i915_mutex_lock_interruptible+0x47/0x130 [i915] __mutex_lock+0x81/0x9b0 ? i915_mutex_lock_interruptible+0x47/0x130 [i915] ? i915_mutex_lock_interruptible+0x47/0x130 [i915] ? i915_mutex_lock_interruptible+0x47/0x130 [i915] i915_mutex_lock_interruptible+0x47/0x130 [i915] ? __pm_runtime_resume+0x4f/0x80 i915_gem_fault+0x201/0x760 [i915] __do_fault+0x15/0x70 __handle_mm_fault+0x85b/0xe40 handle_mm_fault+0x14f/0x2f0 __do_page_fault+0x2d1/0x560 page_fault+0x22/0x30 RIP: 0033:0x7f98d6f49116 RSP: 002b:00007ffd6ffc3278 EFLAGS: 00010283 RAX: 00007f98d39a2bc0 RBX: 0000000000000000 RCX: 0000000000001680 RDX: 0000000000001680 RSI: 00007ffd6ffc3400 RDI: 00007f98d39a2bc0 RBP: 00007ffd6ffc33a0 R08: 0000000000000000 R09: 00000000000005a0 R10: 000055e847c2a830 R11: 0000000000000002 R12: 0000000000000001 R13: 000055e847c1d040 R14: 00007ffd6ffc3400 R15: 00007f98d6752ba0 v2: Init preempt_work unconditionally (Chris) v3: Mention that we need the enable_guc=1 for lockdep splat (Chris) Testcase: igt/debugfs_test/read_all_entries # with i915.enable_guc=1 Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20171213221352.7173-2-michal.winiarski@intel.com	2017-12-14 08:06:54 +00:00
Chris Wilson	6ca9a2beb5	drm/i915: Unwind i915_gem_init() failure Since Michal introduced new user controllable errors other than -EIO during i915_gem_init(), we need to actually unwind on the error path as we have to abort the module load (and we expect to do so cleanly!). As we now teardown key state and then mark the driver as wedged (on EIO), we have to be careful to not allow ourselves to resume and unwedge, thus attempting to use the uninitialised driver. v2: Try not to free driver state for the suppressed EIO v3: Use load-fault-injection to test both error/recovery paths. References: `8620eb1dbb` ("drm/i915/uc: Don't use -EIO to report missing firmware") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171213134347.4608-1-chris@chris-wilson.co.uk	2017-12-13 18:53:35 +00:00
Chris Wilson	d7dc4131eb	drm/i915: Don't check #active_requests from i915_gem_wait_for_idle() i915_gem_wait_for_idle() is called from inside the shrinker, to ensure that we drain the last resources from the GPU in dire circumstances (OOM). As we may allocate whilst building a request, it is then possible to hit the shrinker with a request under construction, and so we must account for the incomplete request whilst waiting. In particular, we preincrement (in reserve_engine) the i915->gt.active_requests counter and mark the GPU as busy, therefore we can not use that counter for shortcircuiting the wait-for-idle. [ 950.859024] GEM_BUG_ON(i915->gt.active_requests) [ 950.859041] WARNING: CPU: 2 PID: 2178 at drivers/gpu/drm/i915/i915_gem.c:3615 i915_gem_wait_for_idle.part.56+0x166/0x4e0 [ 950.859041] Modules linked in: ccm tun fuse nf_conntrack_netbios_ns nf_conntrack_broadcast ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_mangle iptable_security iptable_raw arc4 iwldvm mac80211 snd_hda_codec_hdmi snd_hda_codec_idt snd_hda_codec_generic snd_hda_intel snd_hda_codec btusb snd_hda_core btrtl btbcm iwlwifi snd_hwdep btintel bluetooth snd_seq snd_seq_device snd_pcm ecdh_generic x86_pkg_temp_thermal tpm_infineon coretemp tpm_tis crc32_pclmul wmi_bmof crc32c_intel iTCO_wdt hp_wmi snd_timer iTCO_vendor_support sparse_keymap tpm_tis_core mei_me cfg80211 [ 950.859082] snd joydev tpm mei rfkill pcspkr wmi soundcore lpc_ich hp_accel lis3lv02d input_polldev binfmt_misc e1000e ptp serio_raw pps_core [ 950.859094] CPU: 2 PID: 2178 Comm: gem_exec_nop Tainted: G U 4.15.0-rc2+ #900 [ 950.859102] Hardware name: Hewlett-Packard HP ProBook 6360b/1620, BIOS 68SCF Ver. B.42 12/29/2010 [ 950.859107] task: c5119cb4 task.stack: f3ccb8d8 [ 950.859112] EIP: i915_gem_wait_for_idle.part.56+0x166/0x4e0 [ 950.859113] EFLAGS: 00010296 CPU: 2 [ 950.859114] EAX: 00000024 EBX: f36c1888 ECX: f777a044 EDX: 00000007 [ 950.859115] ESI: f36c1888 EDI: edd53958 EBP: edd53970 ESP: edd53938 [ 950.859116] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 [ 950.859117] CR0: 80050033 CR2: b7f39000 CR3: 2f2b3000 CR4: 000406d0 [ 950.859118] Call Trace: [ 950.859125] ? drm_printk+0x70/0x70 [ 950.859129] i915_gem_wait_for_idle+0x18/0x30 [ 950.859133] i915_gem_shrink+0x360/0x410 [ 950.859138] ? vmpressure+0xa8/0xf0 [ 950.859142] ? ktime_get+0x4a/0x100 [ 950.859147] i915_gem_shrink_all+0x21/0x40 [ 950.859151] i915_gem_shrinker_oom+0x23/0x130 [ 950.859156] notifier_call_chain+0x4e/0x70 [ 950.859160] __blocking_notifier_call_chain+0x2f/0x60 [ 950.859164] blocking_notifier_call_chain+0x11/0x20 [ 950.859169] out_of_memory+0x207/0x280 [ 950.859174] __alloc_pages_nodemask+0xd47/0xe60 [ 950.859179] new_slab+0x32d/0x450 [ 950.859183] ___slab_alloc.constprop.81+0x358/0x4e0 [ 950.859189] ? i915_sw_fence_await_dma_fence+0x53/0x160 [ 950.859193] ? __slab_free+0x1fe/0x310 [ 950.859197] ? native_sched_clock+0x1e/0xc0 [ 950.859201] ? i915_gem_request_alloc+0xcf/0x510 [ 950.859205] ? sched_clock+0x9/0x10 [ 950.859209] __slab_alloc.constprop.80+0x29/0x40 [ 950.859212] ? __slab_alloc.constprop.80+0x29/0x40 [ 950.859216] kmem_cache_alloc_trace+0x160/0x1a0 [ 950.859220] ? i915_sw_fence_await_dma_fence+0x53/0x160 [ 950.859224] i915_sw_fence_await_dma_fence+0x53/0x160 [ 950.859229] i915_gem_request_await_dma_fence+0x1eb/0x390 [ 950.859233] i915_gem_request_await_object+0xee/0x230 [ 950.859239] i915_gem_do_execbuffer+0xc16/0x1200 [ 950.859246] ? irqtime_account_irq+0x3e/0xc0 [ 950.859251] ? irq_exit+0x4f/0xb0 [ 950.859257] ? smp_apic_timer_interrupt+0x5f/0x110 [ 950.859261] ? apic_timer_interrupt+0x35/0x3c [ 950.859266] i915_gem_execbuffer2_ioctl+0x212/0x440 [ 950.859270] ? apic_timer_interrupt+0x35/0x3c [ 950.859274] ? i915_gem_do_execbuffer+0x1200/0x1200 [ 950.859279] ? insn_get_seg_base+0x1b/0x50 [ 950.859283] ? i915_gem_do_execbuffer+0x1200/0x1200 [ 950.859287] drm_ioctl_kernel+0x51/0xa0 [ 950.859291] drm_ioctl+0x2a3/0x350 [ 950.859294] ? i915_gem_do_execbuffer+0x1200/0x1200 [ 950.859300] ? sched_clock+0x9/0x10 [ 950.859303] ? drm_getunique+0x70/0x70 [ 950.859308] do_vfs_ioctl+0x7d/0x640 [ 950.859311] ? native_sched_clock+0x1e/0xc0 [ 950.859315] ? sched_clock+0x9/0x10 [ 950.859319] ? sched_clock_cpu+0x13/0x120 [ 950.859323] SyS_ioctl+0x4e/0x80 [ 950.859326] do_fast_syscall_32+0x75/0x250 [ 950.859331] ? irq_exit+0x4f/0xb0 [ 950.859334] entry_SYSENTER_32+0x47/0x71 [ 950.859338] EIP: 0xb7f81d11 [ 950.859339] EFLAGS: 00000296 CPU: 2 [ 950.859340] EAX: ffffffda EBX: 00000003 ECX: 40406469 EDX: bfde4c20 [ 950.859340] ESI: 00000003 EDI: 40406469 EBP: 00000003 ESP: bfde4b38 [ 950.859341] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b [ 950.859343] Code: e8 30 60 01 00 83 c4 10 83 c3 04 39 f3 75 e0 8b 45 d8 8b 80 14 37 00 00 85 c0 74 13 68 dd 33 e4 c0 68 49 6f e3 c0 e8 4a 55 be ff <0f> ff 5e 5f b8 fe ff ff 3f bb 0a 00 00 00 e8 b7 14 c4 ff 8b 15 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171212132148.8124-1-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-12-13 11:15:38 +00:00
Chris Wilson	59e4b19d62	drm/i915: Dump the engine state before declaring wedged from wait_for_engines() If wait_for_engines() fails and we resort to declaring the HW wedged, dump the engine state for debugging. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171211194135.27095-2-chris@chris-wilson.co.uk	2017-12-12 21:07:41 +00:00
Chris Wilson	ee42c00e1c	drm/i915: Bump timeout for wait_for_engines() Extract the timeout we use in i915_gem_idle_work_handler() and reuse it for wait_for_engines() in i915_gem_wait_for_idle(). It too has the same problem in sometimes having to wait for an extended period before the HW settles, so make use of the same timeout. References: `5427f20785` ("drm/i915: Bump wait-times for the final CS interrupt before parking") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171211194135.27095-1-chris@chris-wilson.co.uk	2017-12-12 21:07:41 +00:00
Matthew Auld	73ebd50303	drm/i915: make mappable struct resource centric Now that we are using struct resource to track the stolen region, it is more convenient if we track the mappable region in a resource as well. v2: prefer iomap and gmadr naming scheme prefer DEFINE_RES_MEM Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171211151822.20953-8-matthew.auld@intel.com	2017-12-12 12:30:21 +02:00
Tvrtko Ursulin	b68763741a	drm/i915: Restore GT performance in headless mode with DMC loaded It seems that the DMC likes to transition between the DC states a lot when there are no connected displays (no active power domains) during command submission. This activity on DC states has a negative impact on the performance of the chip with huge latencies observed in the interrupt handlers and elsewhere. Simple tests like igt/gem_latency -n 0 are slowed down by a factor of eight. Work around it by introducing a new power domain named, POWER_DOMAIN_GT_IRQ, associtated with the "DC off" power well, which is held for the duration of command submission activity. CNL has the same problem which will be addressed as a follow-up. Doing that requires a fix for a DC6 context corruption problem in the CNL DMC firmware which is yet to be released. v2: * Add commit text as comment in i915_gem_mark_busy. (Chris Wilson) * Protect macro body with braces. (Jani Nikula) v3: * Add dedicated power domain for clarity. (Chris, Imre) * Commit message and comment text updates. * Apply to all big-core GEN9 parts apart for Skylake which is pending DMC firmware release. v4: * Power domain should be inner to device runtime pm. (Chris) * Simplify NEEDS_CSR_GT_PERF_WA macro. (Chris) * Handle async DMC loading by moving the GT_IRQ power domain logic into intel_runtime_pm. (Daniel, Chris) * Include small core GEN9 as well. (Imre) v5 * Special handling for async DMC load is not needed since on failure the power domain reference is kept permanently taken. (Imre) v6: * Drop the NEEDS_CSR_GT_PERF_WA macro since all firmwares have now been deployed. (Imre, Chris) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100572 Testcase: igt/gem_exec_nop/headless Cc: Imre Deak <imre.deak@intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> (v2) Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> (v5) Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> [Imre: Add note about applying the WA on CNL as a follow-up] Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171205132854.26380-1-tvrtko.ursulin@linux.intel.com	2017-12-08 12:23:07 +02:00
Chris Wilson	e2189dd078	drm/i915: Refactor common list iteration over GGTT vma In quite a few places, we have a list iteration over the vma on an object that only want to inspect GGTT vma. By construction, these are placed at the start of the list, so we have copied that knowledge into many callsites. Pull that knowledge back to i915_vma.h and provide a for_each_ggtt_vma() to tidy up the code. v2: Add a backreference from vma_create() to remind ourselves why we put ggtt vma at the head of the obj->vma_list (and ppgtt vma at the tail). v3: Fixup s/vma/V/ Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20171207211407.31549-1-chris@chris-wilson.co.uk	2017-12-07 23:26:55 +00:00
Chris Wilson	7125397b82	drm/i915: Track GGTT writes on the vma As writes through the GTT and GGTT PTE updates do not share the same path, they are not strictly ordered and so we must explicitly flush the indirect writes prior to modifying the PTE. We do track outstanding GGTT writes on the object itself, but since the object may have multiple GGTT vma, that is overly coarse as we can track and flush individual vma as required. Whilst here, update the GGTT flushing behaviour for Cannonlake. v2: Hard-code ring offset to allow use during unload (after RCS may have been freed, or never existed!) References: https://bugs.freedesktop.org/show_bug.cgi?id=104002 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20171206124914.19960-2-chris@chris-wilson.co.uk	2017-12-07 14:01:59 +00:00

... 7 8 9 10 11 ...

2340 Commits