When doing p2p with a NIC device, the NIC needs to make sure all the
writes to the HBM (through the PCI bar of the Gaudi device) were
flushed.
It can be done by either the NIC or the host reading through the PCI
bar.
To support the host side, we supply a simple uapi to perform this flush
through the driver, because the user can't create such a transaction
by itself (the PCI bar isn't exposed to normal users).
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Now that we have a subsystem for compute accelerators, move the
habanalabs driver to it.
This patch only moves the files and fixes the Makefiles. Future
patches will change the existing code to register to the accel
subsystem and expose the accel device char files instead of the
habanalabs device char files.
Update the MAINTAINERS file to reflect this change.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Move the habanalabs.h uapi file from include/uapi/misc to
include/uapi/drm, and rename it to habanalabs_accel.h.
This is required before moving the actual driver to the accel
subsystem.
Update MAINTAINERS file accordingly.
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
The dma-buf private object is freed if a call to dma_buf_fd() fails,
and because a file was already associated with the dma-buf in
dma_buf_export(), the release op will be called and will use this
object.
Mark the 'priv' field as NULL in this case, and avoid accessing it from
the release op.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In order to have the no-cause error print be more informative,
we add the event description in addition to the event id.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Add a uAPI, as part of the INFO IOCTL, to allow users to send
requests directly to f/w, according to a pre-defined set of opcodes
that the f/w exposes.
The f/w will put the result in a kernel-allocated buffer, which the
driver will then copy to the user-supplied buffer.
This will allow f/w tools to communicate directly with the f/w
without the need to add a new uAPI to the driver for each new type
of request.
Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
An Ascii message that is sent from preboot towards the driver
will indicate the specific error that occurred on the f/w.
This commit supports that message and parse the ascii string
in order to print it into the kernel log
The commit also changes the way the descriptor struct is declared.
While its size increased (it now above 1024 bytes), it will be
allocated by using kmalloc instead of stack declaration.
Signed-off-by: Tal Cohen <talcohen@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Instead of waiting for BTM indication we should wait for preboot ready.
Consider the below scenario:
1. FW update is being triggered
- setting the dirty bit
2. hard reset will be triggered due to the dirty bit
3. FW initiates the reset:
- dirty bit cleared
- BTM indication cleared
- preboot ready indication cleared
4. during hard reset:
- BTM indication will be set
- BIST test performed and another reset triggered
5. only after this reset the preboot will set the preboot ready
When polling on BTM indication alone we can lose sync with FW while
trying to communicate with FW that is during reset.
To overcome this we will always wait to preboot ready indication.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Need to put fences even if an unexpected status value is received while
waiting for a fence.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
The -ERESTARTSYS return value is not handled correctly when a signal is
received while waiting for CS completion.
This can lead to bad output values to user when waiting for a single CS
completion, and more severe, it can cause a non-stopping loop when
waiting to multi-CS completion and until a CS timeout.
Fix the handling and exit the waiting if this return value is received.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
This patch fixes a bug that was found in the dmabuf flow.
Bug description as found on Gaudi2 device:
1. User allocates 4MB of device memory
- Note that although the allocation size was 4MB the HMMU allocated
a full page of 768MB to back the request.
- The user gets a memory handle that points to a single page (768MB)
- Mapping the handle, the user gets virtual address to the start of
the page.
2. User exports the buffer
3. User registers the exported buffer in the importer. This flow has
a callback to the exporter which in turn converts the phys_page_pack
to an SG list for the importer. This SG list is of single entry of
size 768MB. However, the size that was passed to the importer was
only 4MB.
The solution for this is to make sure the importer gets exposure only
to the exported size.
This will be done by fixing the SG created by the exporter to be of
the total size of the actual exported memory requested by the user.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
A previous commit deprecated the option to export from handle, leaving
the code with no support for devices with virtual memory.
This commit modifies the export API in a way that unifies the uAPI to
user address for both cases (i.e. with and without MMU support) and add
the actual support for devices with virtual memory.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Validate export parameters in a dedicated function instead of in the
main export flow.
This will be useful later when support to export dmabuf for devices
with virtual memory will be added.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
The API to the user which allows exporting DMA buffer from handle is
deprecated here. It was never used as it is relevant only for Gaudi2,
and the user stack has yet to add support for dmabuf in Gaudi2.
Looking forward, a modified API to export DMA buffer for ASICs that
supports virtual memory will be added.
Until the new API will be ready- exporting DMA buffer will not be
supported for ASICs with virtual memory.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
This warning doesn't have real consequences, and therefore can be
printed in debug level.
Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Call COMMS tracepoints from within the dynamic CPU FW load.
This can help debug failures or delays in the dynamic FW load flow.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
As the COMMS protocol is being used more widely in our driver,
an available debug tool for the handshake will be handy.
This commit defines tracepoints to various key points of the COMMS
protocol.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In certain scenarios, firmware might encounter a fatal event for
which a device reset is required. Hence, a proper notification
is needed for driver to be aware and initiate a reset sequence.
In secured environments the reset will be performed by firmware
without an explicit request from the driver.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
When user context is released and hpriv_release() is called, there is a
device idle status check, to understand if user has left the device not
idle and then a reset is required.
However, if the user process is killed because of device hard reset,
the device at this point would always be not idle, because the device
engines were already forcefully halted.
Modify hpriv_release() to skip the idle check if reset is in progress.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
timestamp events that expire on the same interrupt will get the same
timestamp value
Signed-off-by: Tamir Gilad-Raz <tgiladraz@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
In order to reduce error log, we try to minimize the dumped rows
while keeping all relevant error info. In addition we completely
remove clock throttling debug logs.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
During event handling we extract interrupt cause and count it.
In case we could not find any cause we should add proper error.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
If the f/w reports the binning masks at the preboot stage, the driver
must align its DRAM properties according to the new information.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Removing double assignment of the hop2_pte_addr
variable in dram_default_mapping_fini().
Dead store reported by clang-analyzer.
Signed-off-by: Marco Pagani <marpagan@redhat.com>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
As ASICs are evolving, we will need to update the DRAM properties at
various points because we may get different information from the f/w
at different points of the initialization.
This ASIC function is a foundation for this capability.
Signed-off-by: Ohad Sharabi <osharabi@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
As hl_mmap_mem_buf_get() is called also from IOCTLs which can have a
bad handle from user, modify the print for "no match to handle" to use
dev_dbg().
Calls to this function which are not dependent on user, already have an
error print when the function fails.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
The refcount of a CB buffer is initialized when user allocates a CB,
and is decreased when he destroys the CB handle.
If this refcount is increased also from kernel and user sends more than
one destroy requests for the handle, the buffer will be released/freed
and later be accessed when the refcount is put from kernel side.
To avoid it, prevent user from destroying the handle more than once.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
As clock throttling due to high power consumption can happen very
frequently and there is no real reason to notify the user about it,
we skip this notification in all asics.
Signed-off-by: Ofir Bitton <obitton@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
User should close the FD when being notified about an error, after
which a device reset takes place.
However, if the user has pending threads that wait for completions,
the device release won't be called and eventually the watchdog timeout
will expire, leading to hard reset and killing the user process.
To avoid it, abort such waiting threads right after the error
notification, and block following waiting operations.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
The device file is not in use when hl_device_release() is called,
and there aren't any user threads that use IOCTLs to wait for
interrupts. Therefore there is no need to release them at this point.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Sometimes we need the binning info at a very early state of the
driver initialization. Therefore, support was added in preboot to
provide the binning info as part of the f/w descriptor and the driver
can now use that.
Signed-off-by: farah kassabri <fkassabri@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Fix programming incorrect value of address range
Signed-off-by: tal albo <talbo@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
Driver Changes:
Fixes/improvements/new stuff:
- Fix workarounds on Gen2-3 (Tvrtko Ursulin)
- Fix HuC delayed load memory leaks (Daniele Ceraolo Spurio)
- Fix a BUG caused by impendance mismatch in dma_fence_wait_timeout and GuC (Janusz Krzysztofik)
- Add DG2 workarounds Wa_18018764978 and Wa_18019271663 (Matt Atwood)
- Apply recommended L3 hashing mask tuning parameters (Gen12+) (Matt Roper)
- Improve suspend / resume times with VT-d scanout workaround active (Andi Shyti, Chris Wilson)
- Silence misleading "mailbox access failed" warning in snb_pcode_read (Ashutosh Dixit)
- Fix null pointer dereference on HSW perf/OA (Umesh Nerlige Ramappa)
- Avoid trampling the ring during buffer migration (and selftests) (Chris Wilson, Matthew Auld)
- Fix DG2 visual corruption on small BAR systems by not forgetting to copy CCS aux state (Matthew Auld)
- More fixing of DG2 visual corruption by not forgetting to copy CCS aux state of backup objects (Matthew Auld)
- Fix TLB invalidation for Gen12.50 video and compute engines (Andrzej Hajda)
- Limit Wa_22012654132 to just specific steppings (Matt Roper)
- Fix userspace crashes due eviction not working under lock contention after the object locking conversion (Matthew Auld)
- Avoid double free is user deploys a corrupt GuC firmware (John Harrison)
- Fix 32-bit builds by using "%zu" to format size_t (Nirmoy Das)
- Fix a possible BUG in TTM async unbind due not reserving enough fence slots (Nirmoy Das)
- Fix potential use after free by not exposing the GEM context id to userspace too early (Rob Clark)
- Show clamped PL1 limit to the user (hwmon) (Ashutosh Dixit)
- Workaround unreliable reset on Jasperlake (Chris Wilson)
- Cover rest of SVG unit MCR registers (Gustavo Sousa)
- Avoid PXP log spam on platforms which do not support the feature (Alan Previn)
- Re-disable RC6p on Sandy Bridge to avoid GPU hangs and visual glitches (Sasa Dragic)
Future platform enablement:
- Manage uncore->lock while waiting on MCR register (Matt Roper)
- Enable Idle Messaging for GSC CS (Vinay Belgaumkar)
- Only initialize GSC in tile 0 (José Roberto de Souza)
- Media GT and Render GT share common GGTT (Aravind Iddamsetty)
- Add dedicated MCR lock (Matt Roper)
- Implement recommended caching policy (PVC) (Wayne Boyer)
- Add hardware-level lock for steering (Matt Roper)
- Check full IP version when applying hw steering semaphore (Matt Roper)
- Enable GuC GGTT invalidation from the start (Daniele Ceraolo Spurio)
- MTL GSC firmware support (Daniele Ceraolo Spurio, Jonathan Cavitt)
- MTL OA support (Umesh Nerlige Ramappa)
- MTL initial gt workarounds (Matt Roper)
Driver refactors:
- Hold forcewake and MCR lock over PPAT setup (Matt Roper)
- Acquire fw before loop in intel_uncore_read64_2x32 (Umesh Nerlige Ramappa)
- GuC filename cleanups and use submission API version number (John Harrison)
- Promote pxp subsystem to top-level of i915 (Alan Previn)
- Finish proofing the code agains object size overflows (Chris Wilson, Gwan-gyeong Mun)
- Start adding module oriented dmesg output (John Harrison)
Miscellaneous:
- Correct kerneldoc for intel_gt_mcr_wait_for_reg() (Matt Roper)
- Bump up sample period for busy stats selftest (Umesh Nerlige Ramappa)
- Make GuC default_lists const data (Jani Nikula)
- Fix table order verification to check all FW types (John Harrison)
- Remove some limited use register access wrappers (Jani Nikula)
- Remove struct_member macro (Andrzej Hajda)
- Remove hardcoded value with a macro (Nirmoy Das)
- Use helper func to find out map type (Nirmoy Das)
- Fix a static analysis warning (John Harrison)
- Consolidate VMA active tracking helpers (Andrzej Hajda)
- Do not cover all future platforms in TLB invalidation (Tvrtko Ursulin)
- Replace zero-length arrays with flexible-array members (Gustavo A. R. Silva)
- Unwind hugepages to drop wakeref on error (Chris Wilson)
- Remove a couple of superfluous i915_drm.h includes (Jani Nikula)
Merges:
- Merge drm/drm-next into drm-intel-gt-next (Rodrigo Vivi)
danvet: Fix up merge conflict in intel_uc_fw.c, we ended up with 2
copies of try_firmware_load() somehow.
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Y8fW2Ny1B1hZ5ZmF@tursulin-desk
The sparse tool complains with the following warning:
$ make M=drivers/gpu/drm/solomon/ C=2
CC [M] drivers/gpu/drm/solomon/ssd130x.o
CHECK drivers/gpu/drm/solomon/ssd130x.c
drivers/gpu/drm/solomon/ssd130x.c:363:21: warning: dubious: x & !y
This seems to be a false positive in my opinion but still we can silence
the tool while making the code easier to read. Let's also add a comment,
to explain why the "com_seq" logical not is used rather than its value.
Reported-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20230121190930.2804224-1-javierm@redhat.com
This optional callback was added in the commit 1f45f9dbb3 ("fb_defio:
add first_io callback") but it was never used by a driver. Let's remove
it since it's unlikely that will be used after a decade that was added.
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20230121192418.2814955-2-javierm@redhat.com
Add XB24 and AB24 to the list of supported formats. The format helpers
support conversion to these formats and they are documented in the
simple-framebuffer device tree bindings.
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230120173103.4002342-8-thierry.reding@gmail.com
Simple framebuffers can be set up in system memory, which cannot be
requested and/or I/O remapped using the I/O resource helpers. Add a
separate code path that obtains system memory framebuffers from the
reserved memory region referenced in the memory-region property.
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230120173103.4002342-6-thierry.reding@gmail.com
The majority of the driver already uses struct iosys_map to encapsulate
accesses to I/O remapped vs. system memory. Accesses via the screen base
pointer still use __iomem annotations, which can lead to inconsistencies
and conflicts with subsequent patches.
Convert the screen base to a struct iosys_map as well for consistency
and to avoid these issues.
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230120173103.4002342-5-thierry.reding@gmail.com
The original goal with drm_edid_connector_update() was to have a single
call for updating the connector and adding probed modes, in this order,
but that turned out to be problematic. Drivers that need to update the
connector in the .detect() callback would end up updating the probed
modes as well. Turns out the callback may be called so many times that
the probed mode list fills up without bounds, and this is amplified by
add_alternate_cea_modes() duplicating the CEA modes on every call,
actually running out of memory on some machines.
Kudos to Imre Deak <imre.deak@intel.com> for explaining this to me.
Go back to having separate drm_edid_connector_update() and
drm_edid_connector_add_modes() calls. The former may be called from
.detect(), .force(), or .get_modes(), but the latter only from
.get_modes().
Unlike drm_add_edid_modes(), have drm_edid_connector_add_modes() update
the probed modes from the EDID property instead of the passed in
EDID. This is mainly to enforce two things:
1) drm_edid_connector_update() must be called before
drm_edid_connector_add_modes().
Display info and quirks are needed for parsing the modes, and we
don't want to call update_display_info() again to ensure the info is
available, like drm_add_edid_modes() does.
2) The same EDID is used for both updating the connector and adding the
probed modes.
Fortunately, the change is easy, because no driver has actually adopted
drm_edid_connector_update(). Not even i915, and that's mainly because of
the problem described above.
Cc: Imre Deak <imre.deak@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/e86fff1579f14ebf6334692526c8f6831cd02cac.1674144945.git.jani.nikula@intel.com