Up until now, ppgtt->pdp has always been the root of our page tables.
Legacy 32b addresses acted like it had 1 PDP with 4 PDPEs.
In preparation for 4 level page tables, we need to stop using ppgtt->pdp
directly unless we know it's what we want. The future structure will use
ppgtt->pml4 for the top level, and the pdp is just one of the entries
being pointed to by a pml4e. The temporal pdp local variable will be
removed once the rest of the 4-level code lands.
Also, start passing the vm pointer to the alloc functions, instead of
ppgtt.
v2: Updated after dynamic page allocation changes.
v3: Rebase after s/page_tables/page_table/.
v4: Rebase after changes in "Dynamic page table allocations" patch.
v5: Rebase after Mika's ppgtt cleanup / scratch merge patch series.
v6: Rebase after final merged version of Mika's ppgtt/scratch patches.
v7: Keep pagetable map in-line (and avoid unnecessary for_each_pde
loops), remove redundant ppgtt pointer in _alloc_pagetabs (Akash)
v8: Fix text indentation in _alloc_pagetabs/page_directories (Chris)
v9: Defer gen8_alloc_va_range_4lvl definition until 4lvl is implemented,
clean-up gen8_ppgtt_cleanup [pun intended] (Akash).
v10: Clean-up commit message (Akash).
Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: "Akash Goel" <akash.goel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This transitional patch doesn't do much for the existing code. However,
it should make upcoming patches to use the full 48b address space a bit
easier.
32-bit ppgtt uses just 4 PDPs, while 48-bit ppgtt will have up-to 512;
this patch prepares the existing functions to query the right number of pdps
at run-time. This also means that used_pdpes should also be allocated during
ppgtt_init, as the bitmap size will depend on the ppgtt address range
selected.
v2: Renamed pdp_free to be similar to pd/pt (unmap_and_free_pdp).
v3: To facilitate testing, 48b mode will be available on Broadwell and
GEN9+, when i915.enable_ppgtt = 3.
v4: Rebase after s/page_tables/page_table/, added extra information
about 4-level page table formats and use IS_ENABLED macro.
v5: Check CONFIG_X86_64 instead of CONFIG_64BIT.
v6: Rebase after Mika's ppgtt cleanup / scratch merge patch series, and
follow
his nomenclature in pdp functions (there is no alloc_pdp yet).
v7: Rebase after merged version of Mika's ppgtt cleanup patch series.
v8: Rebase after final merged version of Mika's ppgtt/scratch patches.
v9: Introduce PML4 (and 48-bit checks) until next patch (Akash).
v10: Also use test_bit to detect when pd/pt are already allocated (Akash)
Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+)
Reviewed-by: Akash Goel <akash.goel@intel.com>
[danvet: Amend commit message as suggested by Michel.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
gen8_clamp_pd clamps to the next page directory boundary, but the macro
gen8_for_each_pde already has a check to stop at the page directory
boundary.
Furthermore, i915_pte_count also restricts to the next page table
boundary.
v2: Rebase after Mika's ppgtt cleanup / scratch merge patch series.
Suggested-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: "Akash Goel" <akash.goel@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Backmerge drm-intel-fixes because a bunch of atomic patch backporting
we had to do lead to horrible conflicts.
Conflicts:
drivers/gpu/drm/drm_crtc.c
Just a bit of context conflict between -next and -fixes.
drivers/gpu/drm/i915/intel_atomic.c
drivers/gpu/drm/i915/intel_display.c
Atomic conflicts, always pick the code from -next.
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Backmerge fixes since it's getting out of hand again with the massive
split due to atomic between -next and 4.2-rc. All the bugfixes in
4.2-rc are addressed already (by converting more towards atomic
instead of minimal duct-tape) so just always pick the version in next
for the conflicts in modeset code.
All the other conflicts are just adjacent lines changed.
Conflicts:
drivers/gpu/drm/i915/i915_drv.h
drivers/gpu/drm/i915/i915_gem_gtt.c
drivers/gpu/drm/i915/intel_display.c
drivers/gpu/drm/i915/intel_drv.h
drivers/gpu/drm/i915/intel_ringbuffer.h
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
After the previous patch this flag will check always clear, as it's
never set for shmem backed and userptr objects, so we can remove it.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Yeah this isn't really fixes but it's a nice cleanup to
clarify the code but not really worth the hassle of backmerging. So
just add to -fixes, we're still early in -rc.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Previously we have pointed the page where the individual ppgtt
scratch structures refer to, to be the instance which GGTT setup have
allocated. So it has been shared.
To achieve full isolation between ppgtts also in this regard,
allocate per ppgtt scratch page.
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Every other alloc_* function return the pointer to the page
they alloc. Follow the convention with scratch page also.
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
After Mika's ppgtt cleanup series, all the other free functions have
drm_device as the first parameter, except this one.
No functional changes.
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Scratch page is part of struct i915_address_space. Move other
scratch entities into the same struct. This is a preparatory patch
for having only one instance of each scratch_pt/pd.
v2: make commit msg more readable
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com> (v1)
[danvet: Bikeshed summary to avoid confusion with vmas.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Write page directory entry without using superfluous
indirect function. Also remove unused device parameter
from the encode function.
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Dynamic page table allocation might wake the shrinker
when memory is requested for page table structures.
As this happens when we try to allocate the virtual address
during binding, our vma might be among the targets for eviction.
We should do i915_vma_pin() and do pin early in there like Chris
suggests but this is interim solution.
Shield our vma from shrinker by incrementing pin count before
the virtual address is allocated.
The proper place to fix this would be in gem, inside of
i915_vma_pin(). But we don't have that yet so take the short
cut as a intermediate solution.
Testcase: igt/gem_ctx_thrash
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Lay out scratch page structure in similar manner than other
paging structures. This allows us to use the same tools for
setup and teardown.
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Make paging structure type agnostic *_px macros to access
page dma struct, the backing page and the dma address.
This makes the code less cluttered on internals of
i915_page_dma.
v2: Superfluous const -> nonconst removed
v3: Rebased
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com> (v2)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
As there is flushing involved when we have done the cpu
write, make functions for mapping for cpu space. Make macros
to map any type of paging structure.
v2: Make it clear tha flushing kunmap is only for ppgtt (Ville)
v3: Flushing fixed (Ville, Michel). Removed superfluous semicolon
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
When we setup page directories and tables, we point the entries
to a to the next level scratch structure. Make this generic
by introducing a fill_page_dma which maps and flushes. We also
need 32 bit variant for legacy gens.
v2: Fix flushes and handle valleyview (Ville)
v3: Now really fix flushes (Michel, Ville)
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
All the paging structures are now similar and mapped for
dma. The unmapping is taken care of by common accessors, so
don't overload the reader with such details.
v2: Be consistent with goto labels (Michel)
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
All our paging structures have struct page and dma address
for that page.
Add struct for page/dma address pairs and use it to make
the setup and teardown for different paging structures
identical.
Include the page directory offset also in the struct for legacy
gens. Rename it to clearly point out that it is offset into the
ggtt.
v2: Add comment about ggtt_offset (Michel)
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The legacy mode mm switch and the execlist context assignment
needs dma address for the page directories.
Introduce a function that encapsulates the scratch_pd dma
fallback if no pd is found.
v2: Rebase, s/ring/req
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We can have exactly 4GB sized ppgtt with 32bit system.
size_t is inadequate for this.
v2: Convert a lot more places (Daniel)
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Check the allocation area against the known end
of address space instead of against fixed value.
v2: Return ENODEV on internal bugs (Chris)
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Currently object size is returned for the rotated VMA size which can be
bigger than the rotated view itself. Since the binding code pads all
excess size with scratch pages the only minor issue with this is wasting
some GGTT space, but still feels nicer to fix and report the real size.
v2: Rebase for tracking size in bytes instead of pages.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
It is only used in logging and it doesn't need to exist on its own.
Also it was misleading to log view size as object size.
v2: Improve commit message. (Joonas Lahtinen)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
[danvet: s/%lu/%zu/ where needed, reported by 0-day.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Now that everything above has been converted to use requests, intel_ring_begin()
can be updated to take a request instead of a ring. This also means that it no
longer needs to lazily allocate a request if no-one happens to have done it
earlier.
For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Updated the various ring->flush() functions to take a request instead of a ring.
Also updated the tracer to include the request id.
For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
[danvet: Rebase since I didn't merge the addition of req->uniq.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Updated the switch_mm() code paths to take a request instead of a ring. This
includes the myriad *_mm_switch functions themselves and a bunch of PDP related
helper functions.
v2: Rebased to newer tree.
For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The final step in removing the OLR from i915_gem_init_hw() is to pass the newly
allocated request structure in to each step rather than passing a ring
structure. This patch updates both i915_ppgtt_init_ring() and
i915_gem_context_enable() to take request pointers.
For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The i915_gem_init_hw() function calls a bunch of smaller initialisation
functions. Multiple of which have generic sections and per ring sections. This
means multiple passes are done over the rings. Each pass writes data to the ring
which floats around in that ring's OLR until some random point in the future
when an add_request() is done by some random other piece of code.
This patch breaks i915_ppgtt_init_hw() in two with the per ring initialisation
now being done in i915_ppgtt_init_ring(). The ring looping is now done at the
top level in i915_gem_init_hw().
v2: Fix dumb loop variable re-use.
For: VIZ-5115
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tomas Elf <tomas.elf@intel.com> (v1)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We already set this limit for the GGTT.
This is a temporary patch until a full replacement of size_t variables
(inadequate in 32-bit kernel) is in place.
Regression from:
commit a4e0bedca6
Author: Michel Thierry <michel.thierry@intel.com>
Date: Wed Apr 8 12:13:35 2015 +0100
drm/i915: Use complete address space in true PPGTT
v2: Prettify code and explain why this is needed. (Chris)
v3: Don't hide the compilation warning in 32-bit. (Chris)
Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Partial view type allows manipulating parts of huge BOs through the GGTT,
which was not previously possible due to constraint that whole object had
to be mapped for any access to it through GGTT.
v2:
- Retain error value from sg_alloc_table (Tvrtko Ursulin)
- Do not zero already zeroed variable (Tvrtko Ursulin)
- Use more common variable types for page size/offset (Tvrtko Ursulin)
v3:
- Only compare additional view parameters when need to (Tvrtko Ursulin)
v4:
- Do zero out the variable that needs to be (bug introduced in v2).
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
GGTT VMA sizes might be smaller than the whole object size due to
different GGTT views.
v2:
- Separate GGTT view constraint calculations from normal view
constraint calculations (Chris Wilson)
v3:
- Do not bother with debug wording. (Tvrtko Ursulin)
v4:
- Clearer logic for calculating map_and_fenceable (Tvrtko Ursulin)
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
[danvet: Drop BUG_ON, it's redudant.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The patch 69876bed7e: "drm/i915/gen8:
page directories rework allocation" added an overflow warning, but the
mask had an extra 0. Use less typo-prone option suggested by Dave
instead, to check for (start + length) >= 0x100000000ULL.
This check will be unnecessary after gen8_alloc_va_range handles more
than 4 PDPs (48b addressing).
v2: Really check for 32b overflow (Ville)
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Dave Gordon <david.s.gordon@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This is the wrong layer to apply an arbitrary restriction and the wrong
error code (object too large!). If we do want to prevent large offsets
being return to the user on 32bit systems (to hide bugs in userspace),
you want to restrict the drm_mm range manager instead. This first tells
userspace about the correct size of the GTT they can use (so they don't
try and overallocate object or batches), and fixes the eviction logic to
avoid the eventual and *guaranteed* error.
Fixes regression in
commit d7b2633dba
Author: Michel Thierry <michel.thierry@intel.com>
Date: Wed Apr 8 12:13:34 2015 +0100
drm/i915/gen8: Dynamic page table allocations
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Do not to clear mappings outside the allocated VMA under any
circumstances. Only clear the smaller of VMA or object page count.
This is required to allow creating partial object VMAs which in
turn are needed for partial GGTT views.
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
We have this neat abstraction between ppgtt and ggtt for (un)bind_vma
and didn't end up using it really. What a shame, so fix this and make
the ->bind_vma hook a bit more useful.
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
ggtt_bind/unbind_vma already has checks for aliasing ppgtt or not,
there's nothing else magic they do. Resurrect i915_ggtt_insert_entries
to make the reuse possibel.
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>