We need to have the address space when reserving space for the objects.
Since the address space and context are tied together, and reserve
occurs before context switch (for good reason), we must lookup our
context earlier in the process.
This leaves some room for optimizations where we no longer need to use
ctx_id in certain places. This will be addressed in a subsequent patch.
Important tricky bit:
Because slow relocations during execbuffer drop struct_mutex
Perhaps it would be best to acquire the reference when we get the
context, but I'll save that for another day (note I have written the
patch before, and I found the changes required to be uglier than this).
Note that since we currently access everything via context id, and not
the data structure this is fine, though not desirable. The next change
attempts to get the context only once via the context ID idr lookup, and
as such, the following can happen:
CTX-A is created, refcount = 1
CTX-A execbuf, mutex dropped
close IOCTL called on CTX-A, refcount = 0
CTX-A resumes in execbuf.
v2: Rebased on top of
commit b6359918b8
Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Date: Wed Oct 30 15:44:16 2013 +0200
drm/i915: add i915_get_reset_stats_ioctl
v3: Rebased on top of
commit 25b3dfc87b
Author: Mika Westerberg <mika.westerberg@linux.intel.com>
Date: Tue Nov 12 11:57:30 2013 +0200
Author: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Date: Tue Nov 26 16:14:33 2013 +0200
drm/i915: check context reset stats before relocations
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
To simplify the codepaths somewhat, we can simply always create a
context. Contexts already keep hangstat information. This prevents us
from having to differentiate at other parts in the code.
There is allocation overhead, but it should not be measurable.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Every file will get it's own context, and we use this context instead of
the default context. The default context still exists for future
shrinker usage as well as reset handling.
v2: Updated to address Mika's recent context guilty changes
Some more changes around this come up in later patches as well.
v3: Use a fake context to avoid allocation for the !HAS_HW_CONTEXT case.
I've tried the alternatives. This looks the best to me.
Removed hangstat stuff from v2 - for a separate patch
Demote failed PPGTT set to DRM_DEBUG_DRIVER since it can now be invoked
easily from userspace.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We have a default context which suits the aliasing PPGTT well. Tie them
together so it looks like any other context/PPGTT pair. This makes the
code cleaner as it won't have to special case aliasing as often.
The patch has one slightly tricky part in the default context creation
function. In the future (and on aliased setup) we create a new VM for a
context (potentially). However, if we have aliasing PPGTT, which occurs
at this point in time for all platforms GEN6+, we can simply manage the
refcounting to allow things to behave as normal. Now is a good time to
recall that the aliasing_ppgtt doesn't have a real VM, it uses the GGTT
drm_mm.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Pretty straightforward so far except for the bit about the refcounting.
The PPGTT will potentially be shared amongst multiple contexts. Because
contexts themselves have a refcounted lifecycle, the easiest way to
manage this will be to refcount the PPGTT. To acheive this, we piggy
back off of the existing context refcount, and will increment and
decrement the PPGTT refcount with context creation, and destruction.
To put it more clearly, if context A, and context B both use PPGTT 0, we
can't free the PPGTT until both A, and B are destroyed.
Note that because the PPGTT is permanently pinned (for now), it really
just matters for the PPGTT destruction, as opposed to making space under
memory pressure.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This patch consolidates the way in which we handle the various supported
PPGTT by module parameter in addition to what the hardware supports. It
strives to make doing the right thing in the code as simple as possible,
with the USES_ macros.
I've opted to add the full PPGTT argument simply so one can see how I
intend to use this function. It will not/cannot be used until later.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
In order to do the full context switch with address space, it's
convenient to have a way to switch the address space. We already have
this in our code - just pull it out to be called by the context switch
code later.
v2: Rebased on BDW support. Required adding BDW.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
When PPGTT support was originally enabled, it was only designed to
support 1 PPGTT. It therefore made sense to simply hide the GGTT space
required to enable this from the drm_mm allocator.
Since we intend to support full PPGTT, which means more than 1, and they
can be created and destroyed ad hoc it will be required to use the
proper allocation techniques we already have.
The first step here is to make the existing single PPGTT use the
allocator.
The astute observer will notice that we are reserving space in the GGTT
for the PDEs for the lifetime of the address space, and would be right
to question whether or not this is a good idea. It does not make a
difference with this current patch only the aliasing PPGTT (indeed the
PDEs should still be hidden from the shrinker). For the future, we are
allocating from top to bottom to avoid using the precious "gtt
space" The GGTT space at that point should only be used for scanout, HW
contexts, ringbuffers, HWSP, PDEs, and a couple of other small buffers
(potentially) used by the kernel. Everything else should be mapped into
a PPGTT. To put the consumption in more tangible terms, it takes
approximately 4 sets of PDEs to equal one 19x10 framebuffer (with no
fancy stride or alignment constraints). 3/4 of the total [average] GGTT
can be used for PDEs, and hopefully never touch the 1/4 that the
framebuffer needs.
The astute, and persistent observer might ask about the page tables
which are also pinned for the address space. This waste is unfortunate.
We use 2MB of memory per address space. We leave wrapping the PDEs as a
real GEM object as a TODO.
v2: Align PDEs to 64b in GTT
Allocate the node dynamically so we can use drm_mm_put_block
Now tested on IGT
Allocate node at the top to avoid fragmentation (Chris)
v3: Use Chris' top down allocator
v4: Embed drm_mm_node into ppgtt struct (Jesse)
Remove hunks which didn't belong (Jesse)
v5: Don't subtract guard page since we now killed the guard page prior
to this patch. (Ben)
v6: Rebased and removed guard page stuff.
Added a chunk to the commit message
Allow adding a context to mappable region
v7: Undo v3, so we can make the drm patch last in the series
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> (v4)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
squash: drm/i915: allow PPGTT to use mappable
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We **need** to do this for exactly 1 reason, because we want to embed a
PPGTT into the context, but we don't want to special case the default
context.
To achieve that, we must be able to initialize contexts after the GTT is
setup (so we can allocate and pin the default context's BO), but before
the PPGTT and rings are initialized. This is because, currently, context
initialization requires ring usage. We don't have rings until after the
GTT is setup. If we split the enabling part of context initialization,
the part requiring the ringbuffer, we can untangle this, and then later
embed the PPGTT
Incidentally this allows us to also adhere to the original design of
context init/fini in future patches: they were only ever meant to be
called at driver load and unload.
v2: Move hw_contexts_disabled test in i915_gem_context_enable() (Chris)
v3: BUG_ON after checking for disabled contexts. Or else it blows up pre
gen6 (Ben)
v4: Forward port
Modified enable for each ring, since that patch is earlier in the series
Dropped ring arg from create_default_context so it can be used by others
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This patch adds to changes for contexts on reset:
Sets last context to default - this will prevent the context switch
happening after a reset. That switch is not possible because the
rings are hung during reset and context switch requires reset. This
behavior will need to be reworked in the future, but this is what we
want for now.
In the future, we'll also want to reset the guilty context to
uninitialized. We should wait for ARB_Robustness related code to land
for that.
This is somewhat for paranoia. Because we really don't know what the
GPU was doing when it hung, or the state it was in (mid context write,
for example), later restoring the context is a bad idea. By setting the
flag to not initialized, the next load of that context will not restore
the state, and thus on the subsequent switch away from the context will
overwrite the old data.
NOTE: This code needs a fixup when we actually have multiple VMs. The
issue that can occur is inactive objects in a VM will need to be
destroyed before the last context unref. This can now happen via the
fake switch introduced in this patch (and it other ways in the future)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Previously we dropped the association of a context to a ring. It is
however very important to know which ring a context ran on (we could
have reused the other member, but I was nitpicky).
This is very important when we switch address spaces, which unlike
context objects, do change per ring.
As an example, if we have:
RCS BCS
ctx A
ctx A
ctx B
ctx B
Without tracking the last ring B ran on, we wouldn't know to switch the
address space on BCS in the last row.
As a result, we no longer need to track which ring a context "belongs"
to, as it never really made much sense anyway.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We'll be doing a bit more stuff with each file, so having our own open
function should make things clean.
This also allows us to easily add conditionals for stuff we don't want
to do when we don't have HW contexts.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
To sum up what goes on here, we abstract the vma binding, similarly to
the previous object binding. This helps for distinguishing legacy
binding, versus modern binding. To keep the code churn as minimal as
possible, I am leaving in insert_entries(). It serves as the per
platform pte writing basically. bind_vma and insert_entries do share a
lot of similarities, and I did have designs to combine the two, but as
mentioned already... too much churn in an already massive patchset.
What follows are the 3 commits which existed discretely in the original
submissions. Upon rebasing on Broadwell support, it became clear that
separation was not good, and only made for more error prone code. Below
are the 3 commit messages with all their history.
drm/i915: Add bind/unbind object functions to VMA
drm/i915: Use the new vm [un]bind functions
drm/i915: reduce vm->insert_entries() usage
drm/i915: Add bind/unbind object functions to VMA
As we plumb the code with more VM information, it has become more
obvious that the easiest way to deal with bind and unbind is to simply
put the function pointers in the vm, and let those choose the correct
way to handle the page table updates. This change allows many places in
the code to simply be vm->bind, and not have to worry about
distinguishing PPGTT vs GGTT.
Notice that this patch has no impact on functionality. I've decided to
save the actual change until the next patch because I think it's easier
to review that way. I'm happy to squash the two, or let Daniel do it on
merge.
v2:
Make ggtt handle the quirky aliasing ppgtt
Add flags to bind object to support above
Don't ever call bind/unbind directly for PPGTT until we have real, full
PPGTT (use NULLs to assert this)
Make sure we rebind the ggtt if there already is a ggtt binding. This
happens on set cache levels.
Use VMA for bind/unbind (Daniel, Ben)
v3: Reorganize ggtt_vma_bind to be more concise and easier to read
(Ville). Change logic in unbind to only unbind ggtt when there is a
global mapping, and to remove a redundant check if the aliasing ppgtt
exists.
v4: Make the bind function a bit smarter about the cache levels to avoid
unnecessary multiple remaps. "I accept it is a wart, I think unifying
the pin_vma / bind_vma could be unified later" (Chris)
Removed the git notes, and put version info here. (Daniel)
v5: Update the comment to not suck (Chris)
v6:
Move bind/unbind to the VMA. It makes more sense in the VMA structure
(always has, but I was previously lazy). With this change, it will allow
us to keep a distinct insert_entries.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
drm/i915: Use the new vm [un]bind functions
Building on the last patch which created the new function pointers in
the VM for bind/unbind, here we actually put those new function pointers
to use.
Split out as a separate patch to aid in review. I'm fine with squashing
into the previous patch if people request it.
v2: Updated to address the smart ggtt which can do aliasing as needed
Make sure we bind to global gtt when mappable and fenceable. I thought
we could get away without this initialy, but we cannot.
v3: Make the global GTT binding explicitly use the ggtt VM for
bind_vma(). While at it, use the new ggtt_vma helper (Chris)
At this point the original mailing list thread diverges. ie.
v4^:
use target_obj instead of obj for gen6 relocate_entry
vma->bind_vma() can be called safely during pin. So simply do that
instead of the complicated conditionals.
Don't restore PPGTT bound objects on resume path
Bug fix in resume path for globally bound Bos
Properly handle secure dispatch
Rebased on vma bind/unbind conversion
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
drm/i915: reduce vm->insert_entries() usage
FKA: drm/i915: eliminate vm->insert_entries()
With bind/unbind function pointers in place, we no longer need
insert_entries. We could, and want, to remove clear_range, however it's
not totally easy at this point. Since it's used in a couple of place
still that don't only deal in objects: setup, ppgtt init, and restore
gtt mappings.
v2: Don't actually remove insert_entries, just limit its usage. It will
be useful when we introduce gen8. It will always be called from the vma
bind/unbind.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v1)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.15 (GNU/Linux)
iQEcBAABAgAGBQJSogqUAAoJEHm+PkMAQRiGM2MIAJrr5KEXEWuuAR4+JkkWBK7A
+dVT4n1MM4wP/aCIyriSlq7kgT03Wxk4Q4wKsj2wZvDQkNgEQjrctgIihc75jqi5
126nmT3YXJLwgDpFA3RHZUWve3j3vfUG53rRuk7K9Xx1sGWU3Ls7BuInvQZ//+QS
6UB4UuEAalmose5U8ToXQfMqZhjwreZKeb64TEZwFvu2klv4cnka1L/zHbmQGgRg
2Pfv+aUrjsYE8s9lkEKX8MIQsDn28Q5Lsv7XIEQwo2at4rYbJaxX6usuC1OI0MQ5
BLUn1GgtvOidq6FzSg6kXiA/MJYH3J0S+p4uULWAprxA+KeJRbWNRroM94W1qAk=
=1Wcq
-----END PGP SIGNATURE-----
Merge tag 'v3.13-rc3' into drm-intel-next-queued
Linux 3.13-rc3
I need a backmerge for two reasons:
- For merging the ppgtt patches from Ben I need to pull in the bdw
support.
- We now have duplicated calls to intel_uncore_forcewake_reset in the
setup code to due 2 different patches merged into -next and 3.13.
The conflict is silen so I need the merge to be able to apply
Deepak's fixup patch.
Conflicts:
drivers/gpu/drm/i915/intel_display.c
Trivial conflict, it doesn't even show up in the merge diff.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Atm we call intel_display_power_enabled() from
i915_capture_error_state() in IRQ context and then take a mutex. To fix
this add a new intel_display_power_enabled_sw() which returns the domain
state based on software tracking as opposed to reading the actual HW
state.
Since we use domain_use_count for this without locking on the reader
side make sure we increase the counter only after enabling all required
power wells and decrease it before disabling any of these power wells.
Regression introduced in
commit 1b02383464b4a915627ef3b8fd0ad7f07168c54c
Author: Imre Deak <imre.deak@intel.com>
Date: Tue Sep 24 16:17:09 2013 +0300
drm/i915: support for multiple power wells
Note that atm we depend on the value returned by
intel_display_power_enabled_sw() in i915_capture_error_state() to avoid
unclaimed register access reports. This was never guaranteed though,
since another thread can disable the power concurrently. If this is a
problem we need another explicit way to disable the reporting during
error captures.
v2:
- remove barriers as the caller can't depend on the value
returned from i915_capture_error_state_sw() anyway (Ville)
- dump the state of pipe/transcoder power domain state (Daniel)
Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Split vlv force wake routines to help individually control Media/Render
well based on the register access.
We've seen power savings in the lower sub-1W range on workloads that
only need on of the power wells, e.g. glbenchmark, media playback
Note: The same split isn't there for the forcewake queue, only the
forcwake domains are split.
Signed-off-by: Deepak S <deepak.s@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
[danvet: Rebase on top of the removed forcewake hack in the ring irq
get/put code and add a note to add Deepak's answer to Chris question.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Added power well arguments to all the force wake routines
to help us individually control power well based on the
scenario.
Signed-off-by: Deepak S <deepak.s@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
[danvet: Resolve conflict with the removed forcewake hack and drop one
spurious hunk Jesse noticed.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Add a debugfs entry showing the use-count for all power domains of each
power well.
v3: address comments from Paulo:
- simplify power_domain_str() by using a switch table
- move power_well::domain_count to power_domains
- WARN_ON decrementing a 0 refcount
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Paulo Zanoni <paulo.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
So far we distinguished platforms without a dynamic power well with
the HAS_POWER_WELL macro and for such platforms we didn't call any power
domain functions. Instead of doing this check we can add an always-on
power well for these platforms and call the power domain functions
unconditionally. For always-on power wells we only increase/decrease
their refcounts, otherwise they are nop.
This makes high level driver code more readable and as a bonus provides
some idea of the current power domains state for all platforms (once
the relevant debugfs entry is added).
v3: rename intel_power_wells to i9xx_always_on_power_well (Paulo)
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Paulo Zanoni <paulo.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Instead of using a separate function to check whether a power domain is
is always on, add an always-on power well covering all these power
domains and do the usual get/put on these unconditionally. Since we
don't assign a .set handler for these the get/put won't have any effect
besides the adjusted refcount.
This makes the code more readable and provides debug info also on the
use of always-on power wells (once the relevant debugfs entry is added.)
v3: make is_always_on to be bool instead of a bit field (Paulo)
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Paulo Zanoni <paulo.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
HW generations so far had only one always-on power well and optionally
one dynamic power well. Upcoming HW gens may have multiple dynamic power
wells, so add some infrastructure to support them.
The idea is to keep the existing power domain API used by the rest of
the driver and create a mapping between these power domains and the
underlying power wells. This mapping can differ from one HW to another
but high level driver code doesn't need to know about this. Through the
existing get/put API it would just ask for a given power domain and the
power domain framework would make sure the relevant power wells get
enabled in the right order.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Paulo Zanoni <paulo.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This way the code is simpler and can also be used for other platforms
where the audio power domain->power well mapping is different.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Paulo Zanoni <paulo.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
If the hardware does not support package C8, then do not even schedule
work to enable it. Thereby we can eliminate a bunch of dangerous work.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This was forgotten in
commit 9d1cb9147d
Author: Paulo Zanoni <paulo.r.zanoni@intel.com>
Date: Fri Nov 1 13:32:08 2013 -0200
drm/i915: avoid unclaimed registers when capturing the error state
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Pull in Jani's backlight rework branch. This was merged through a
separate branch to be able to sort out the Broadwell conflicts
properly before pulling it into the main development branch.
Conflicts:
drivers/gpu/drm/i915/intel_display.c
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Merge the bdw changes into the backlight rework branch so that we can
adapt the new code for bdw, too. This is a bit a mess, but doing this
another way would have delayed the merging of the backlight
refactoring. Mea culpa.
As discussed with Jani on irc only do bdw-specific callbacks for the
set/get methods and bake in the only other special-case into the pch
enable function.
Conflicts:
drivers/gpu/drm/i915/intel_panel.c
v2: Don't enable the PWM too early for bdw (Jani).
v3: Create new bdw_ functions for setup and enable - the rules change
sufficiently imo with the switch from controlling the pwm from the cpu
to controlling it completel from the pch to warrant this.
v4: Rip out unused pipe variable in bdw_enable_backlight (0-day
builder).
Tested-by: Ben Widawsky <ben@bwidawsk.net> (on bdw)
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The backlight enable code now has the smarts to do the right thing. Only
do backlight register save/restore in UMS.
Some VLV specific code gets dropped as UMS is not supported on VLV.
v2: Move save/restore to UMS instead of removing completely (Daniel).
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
No longer needed. We now have fully cached max backlight values.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The quirk was added as what I'd say was a stopgap measure in
commit e85843bec6
Author: Kamal Mostafa <kamal@canonical.com>
Date: Fri Jul 19 15:02:01 2013 -0700
drm/i915: quirk no PCH_PWM_ENABLE for Dell XPS13 backlight
without really digging into what was going on.
Also, as mentioned in the related bug [1], having the quirk regressed
some of the machines it was supposed to fix to begin with, and there
were patches posted to disable the quirk on such machines [2]!
The fact is, we do need the BLM_PCH_PWM_ENABLE bit set to have
backlight. With the quirk, we've relied on BIOS to have set it, and our
save/restore code to retain it. With the full backlight setup at enable,
we have no place for things that rely on previous state.
With the per platform hooks, we've also made a change in the PCH
platform enable order: setting the backlight duty cycle between CPU and
PCH PWM enable. Some experimenting and
commit 770c12312a
Author: Takashi Iwai <tiwai@suse.de>
Date: Sat Aug 11 08:56:42 2012 +0200
drm/i915: Fix blank panel at reopening lid
indicate that we can't set the backlight before enabling CPU PWM; the
value just won't stick. But AFAICT we should do it before enabling the
PCH PWM.
Finally, any fallout we should fix properly, preferrably without quirks,
and absolutely without quirks that rely on existing state. With the per
platform hooks have much more flexibility to adjust the sequence as
required by platforms.
[1] https://bugzilla.kernel.org/show_bug.cgi?id=47941
[2] http://lkml.kernel.org/r/1378229848-29113-1-git-send-email-kamal@canonical.com
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
For what we care about ULT and ULX are interchangeable. We know of 3
types of pciids for these cases. I am not sure if at some point we will
need to distinguish ULT and ULX.
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The backlight code has grown rather hairy, not least because the
hardware registers and bits have repeatedly been shuffled around. And
this isn't expected to get any easier with new hardware. Make things
easier for our (read: my) poor brains, and split the code up into chip
specific functions.
There should be no functional changes.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Move from dev_priv to connector->panel. We still don't allow multiple
sysfs interfaces, though.
There should be no functional changes, except for a slight reordering of
connector backlight and sysfs destroy calls. (This change happens now
that the backlight device is actually per-connector, even though the
destroy calls became per-connector earlier.)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This ioctl returns reset stats for specified context.
The struct returned contains context loss counters.
reset_count: all resets across all contexts
batch_active: active batches lost on resets
batch_pending: pending batches lost on resets
v2: get rid of state tracking completely and deliver only counts. Idea
from Chris Wilson.
v3: fix commit message
v4: default context handled inside i915_gem_context_get_hang_stats
v5: reset_count only for priviledged process
v6: ctx=0 needs CAP_SYS_ADMIN for batch_* counters (Chris Wilson)
v7: context hang stats never returns NULL
v8: rebased on top of reworked context hang stats
DRM_RENDER_ALLOW for ioctl
v9: use DEFAULT_CONTEXT_ID. Improve comments for ioctl struct members
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Ian Romanick <idr@freedesktop.org>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
reset_counter will be incremented twice per successful
reset. Odd values mean reset is in progress and even values
mean that reset has completed.
Reset status ioctl introduced in following commit
needs to deliver global reset count to userspace so
use reset_counter to derive the actual reset count
for the gpu
Note that reset in progress is enough to increment
the counter.
v2: wedged equals reset in progress (Daniel Vetter)
v3: Fixed stale comments (Damien Lespiau)
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
vlv_dpio_read/write should be describe more in PHY centric instead of
display controller centric.
Create a enum dpio_channel for channel index and enum dpio_phy for PHY
index. This should better to gather for upcoming platform.
v2: Rebase the code based on
drm/i915/vlv: Fix typo in the DPIO register define.
v3: Rename vlv_phy to dpio_phy_iosf_port and define additional macro
DPIO_PHY, and remove unrelated change. (Ville)
Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Chon Ming Lee <chon.ming.lee@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
So here's the Broadwell pull request. From a kernel driver pov there's
two areas with big changes in Broadwell:
- Completely new enumerated interrupt bits. On the plus side it now looks
fairly unform and sane.
- Completely new pagetable layout.
To ensure minimal impact on existing platforms we've refactored both the
irq and low-level gtt handling code a lot in anticipation of the bdw push.
So now bdw enabling in these areas just plugs in a bunch of vfuncs.
Otherwise it's all fairly harmless adjusting of switch cases and
if-ladders to shovel bdw into the right blocks. So minimized impact on
existing platforms. I've also merged the bdw-stage1 branch into our
-nightly integration branch for the past week to make sure we don't break
anything.
Note that there's still quite a flurry or patches floating around, but
I've figured I'll push this out. I plan to keep the bdw fixes separate
from my usual -fixes stream so that you can reject them easily in case it
still looks like too much churn. Also, bdw is for now hidden behind the
preliminary hw enabling module option. So there's no real pressure to get
follow-up patches all into 3.13.
* tag 'bdw-stage1-2013-11-08-v2' of git://people.freedesktop.org/~danvet/drm-intel: (75 commits)
drm/i915: Mask the vblank interrupt on bdw by default
drm/i915: Wire up cpu fifo underrun reporting support for bdw
drm/i915: Optimize gen8_enable|disable_vblank functions
drm/i915: Wire up pipe CRC support for bdw
drm/i915: Wire up PCH interrupts for bdw
drm/i915: Wire up port A aux channel
drm/i915: Fix up the bdw pipe interrupt enable lists
drm/i915: Optimize pipe irq handling on bdw
drm/i915/bdw: Take render error interrupt out of the mask
drm/i915/bdw: Add BDW PCH check first
drm/i915: Use hsw_crt_get_config on BDW
drm/i915/bdw: Change dp aux timeout to 600us on DDIA
drm/i915/bdw: Enable trickle feed on Broadwell
drm/i915/bdw: WaSingleSubspanDispatchOnAALinesAndPoints
drm/i915/bdw: conservative SBE VUE cache mode
drm/i915/bdw: Limit SDE poly depth FIFO to 2
drm/i915/bdw: Sampler power bypass disable
ddrm/i915/bdw: Disable centroid pixel perf optimization
drm/i915/bdw: BWGTLB clock gate disable
drm/i915/bdw: Implement edp PSR workarounds
...
Broadwell PSR support is a superset of Haswell. With this simple
register base calculation, everything that worked on HSW for eDP PSR
should work on BDW.
Note that Broadwell provides additional PSR support. This is not
addressed at this time.
v2: Make the HAS_PSR include BDW
v3: Use the correct offset (I had incorrectly used one from my faulty
brain) (Art!)
v4: It helps if you git add
v5: Be explicit about not setting min link entry time for BDW. This
should be no functional change over v4 (Jani)
Reviewed-by: Art Runyan <arthur.j.runyan@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
v2: Squash in fixup from Ben to synchronize the GT mailbox commands.
CC: Art Runyan <arthur.j.runyan@intel.com>
Reviewed-by: Art Runyan <arthur.j.runyan@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Just like Haswell, but with the small twist that the panel fitter for pipe A is
now also in the always-on power well.
v2: Use the new HAS_POWER_WELL macro.
v3: Rebase on top of intel_using_power_well patches.
v4: This time actually update the PFIT check correctly so that the
pipe A pfit is in the always-on domain.
v5: Rebase on top of the VGA power domain addition.
v6: Rebase on top of the new power domain infrastructure. Also pimp the commit
message a bit while at it.
v7: Use IS_BROADWELL instead of IS_GEN8 (Ville).
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> (v1)
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
For now it's just equivalent to IS_GEN8, but in the future we might
want to change that (e.g., on Gen 7 we have IS_VALLEYVIEW,
IS_IVYBRIDGE and IS_HASWELL).
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Aside from the potential size increase of the PPGTT, the primary
difference from previous hardware is the Page Directories are no longer
carved out of the Global GTT.
Note that the PDE allocation is done as a 8MB contiguous allocation,
this needs to be eventually fixed (since driver reloading will be a
pain otherwise). Also, this will be a no-go for real PPGTT support.
v2: Move vtable initialization
v3: Resolve conflicts due to patch series reordering.
v4: Rebase on top of the address space refactoring of the PPGTT
support. Drop Imre's r-b tag for v2, too outdated by now.
v5: Free the correct amount of memory, "get_order takes size not a page
count." (Imre)
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The interrupt handling implementation remains the same as previous
generations with the 4 types of registers, status, identity, mask, and
enable. However the layout of where the bits go have changed entirely.
To address these changes, all of the interrupt vfuncs needed special
gen8 code.
The way it works is there is a top level status register now which
informs the interrupt service routine which unit caused the interrupt,
and therefore which interrupt registers to read to process the
interrupt. For display the division is quite logical, a set of interrupt
registers for each pipe, and in addition to those, a set each for "misc"
and port.
For GT the things get a bit hairy, as seen by the code. Each of the GT
units has it's own bits defined. They all look *very similar* and
resides in 16 bits of a GT register. As an example, RCS and BCS share
register 0. To compact the code a bit, at a slight expense to
complexity, this is exactly how the code works as well. 2 structures are
added to the ring buffer so that our ring buffer interrupt handling code
knows which ring shares the interrupt registers, and a shift value (ie.
the top or bottom 16 bits of the register).
The above allows us to kept the interrupt register caching scheme, the
per interrupt enables, and the code to mask and unmask interrupts
relatively clean (again at the cost of some more complexity).
Most of the GT units mentioned above are command streamers, and so the
symmetry should work quite well for even the yet to be implemented rings
which Broadwell adds.
v2: Fixes up a couple of bugs, and is more verbose about errors in the
Broadwell interrupt handler.
v3: fix DE_MISC IER offset
v4: Simplify interrupts:
I totally misread the docs the first time I implemented interrupts, and
so this should greatly simplify the mess. Unlike GEN6, we never touch
the regular mask registers in irq_get/put.
v5: Rebased on to of recent pch hotplug setup changes.
v6: Fixup on top of moving num_pipes to intel_info.
v7: Rebased on top of Egbert Eich's hpd irq handling rework. Also
wired up ibx_hpd_irq_setup for gen8.
v8: Rebase on top of Jani's asle handling rework.
v9: Rebase on top of Ben's VECS enabling for Haswell, where he
unfortunately went OCD on the gt irq #defines. Not that they're still
not yet fully consistent:
- Used the GT_RENDER_ #defines + bdw shifts.
- Dropped the shift from the L3_PARITY stuff, seemed clearer.
- s/irq_refcount/irq_refcount.gt/
v10: Squash in VECS enabling patches and the gen8_gt_irq_handler
refactoring from Zhao Yakui <yakui.zhao@intel.com>
v11: Rebase on top of the interrupt cleanups in upstream.
v12: Rebase on top of Ben's DPF changes in upstream.
v13: Drop bdw from the HAS_L3_DPF feature flag for now, it's unclear what
exactly needs to be done. Requested by Ben.
v14: Fix the patch.
- Drop the mask of reserved bits and assorted logic, it doesn't match
the spec.
- Do the posting read inconditionally instead of commenting it out.
- Add a GEN8_MASTER_IRQ_CONTROL definition and use it.
- Fix up the GEN8_PIPE interrupt defines and give the GEN8_ prefixes -
we actually will need to use them.
- Enclose macros in do {} while (0) (checkpatch).
- Clear DE_MISC interrupt bits only after having processed them.
- Fix whitespace fail (checkpatch).
- Fix overtly long lines where appropriate (checkpatch).
- Don't use typedef'ed private_t (maintainer-scripts).
- Align the function parameter list correctly.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net> (v4)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
bikeshed
v2: Fixed the botched locking on init_hw failure in i915_reset (Ville)
Call cleanup_ringbuffer on failed context create in init_hw (Ville)
v3: Add dev argument ti clean_ringbuffer
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
With the connector and pipe passed around, we can now set the backlight
on the right pipe on VLV/BYT.
v2: drop combination mode check for VLV (Jani)
add save/restore code for VLV backlight regs (Jani)
check for existing modulation freq when initializing backlight regs (Jani)
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67245
Tested-by: Joe Konno <joe.konno@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We'll be looking at more than just mem_freq from dev_priv, so
just pass the whole thing.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
On VLV/BYT, backlight controls a per-pipe, so when adjusting the
backlight we need to pass the correct info. So make the externally
visible backlight functions take a connector argument, which can be used
internally to figure out the pipe backlight to adjust.
v2: make connector pipe lookup check for NULL crtc (Jani)
fixup connector check in ASLE code (Jani)
v3: make sure we take the mode config lock around lookups (Daniel)
v4: fix double unlock in panel_get_brightness (Daniel)
v5: push ASLE work into a work queue (Daniel)
v6: separate ASLE work to a prep patch, rebase (Jani)
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Doing this has been long overdue anyway, but now we really need it in
preparation for per connector backlight handling.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>