Commit Graph

3624 Commits

Author SHA1 Message Date
Tomer Tayar
9f832fda79 habanalabs: Update CPU DMA memory label name
The CPU accessible DMA memory is general and not used only for PQ.
Accordingly, this patch renames the "free_cpu_pq_dma_mem" label with
"free_cpu_dma_mem".

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-05-02 15:37:19 +03:00
Tomer Tayar
ba209e1587 habanalabs: Update CPU DMA pool label name
The CPU accessible DMA pool is general and not used only for PQ.
Accordingly, this patch rename the "free_cpu_pq_pool" label with
"free_cpu_accessible_dma_pool".

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-05-02 11:33:12 +03:00
Dalit Ben Zoor
b1b537713e habanalabs: increase timeout if working with simulator
Where there is a spike in the CPU consumption, it may cause
random failures in the C/I since the KMD timeout for CPU
and/or QMAN0 jobs expires and it stops communicating to the simulator.
This commit fixes it by increasing timeout on polling functions
if working with simulator.

Signed-off-by: Dalit Ben Zoor <dbenzoor@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-04-30 17:18:51 +03:00
Dalit Ben Zoor
f0539fb0fb habanalabs: remove condition that is always true
After removing the parsing of the command submission
when doing memset of the device memory, goya_validate_dma_pkt_host
is never called by the kernel, so there is no need to check
context id.

Signed-off-by: Dalit Ben Zoor <dbenzoor@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-05-01 13:24:58 +03:00
Dalit Ben Zoor
5809e18e02 habanalabs: remove redundant member from parser struct
use_virt_addr member was used for telling whether to treat the
addresses in the CB as virtual during parsing. We disabled it only
when calling the parser from the driver memset device function,
and since this call had been removed, it should always be enabled.

Signed-off-by: Dalit Ben Zoor <dbenzoor@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-05-01 13:16:18 +03:00
Tomer Tayar
94cb669ceb habanalabs: Manipulate DMA addresses in ASIC functions
Routing device accesses to the host memory requires the usage of a base
offset, which is canceled by the iATU just before leaving the device.
The value of the base offset might be distinctive between different ASIC
types.
The manipulation of the addresses is currently used throughout the
driver code, and one should be aware to it whenever providing a host
memory address to the device.
This patch removes this manipulation from the driver common code, and
moves it to the ASIC specific functions that are responsible for
host memory allocation/mapping.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-05-01 11:28:15 +03:00
Oded Gabbay
d9c3aa8038 habanalabs: rename functions to improve code readability
This patch renames four functions in the ASIC-specific functions section,
so it will be easier to differentiate them from the generic kernel
functions with the same name.

This will help in future code reviews, to make sure we don't use the
kernel functions directly.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-05-01 11:47:04 +03:00
Dalit Ben Zoor
3706b47006 habanalabs: remove call to cs_parser()
There is no need to parse the command submission when doing memset
of the device memory using the DMA engine because only the driver calls
the memset function and therefore, the CS is trusted and doesn't require
validation and patching.

Signed-off-by: Dalit Ben Zoor <dbenzoor@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-04-30 15:22:14 +03:00
Tomer Tayar
03d5f641dc habanalabs: Use single pool for CPU accessible host memory
The device's CPU accessible memory on host is managed in a dedicated
pool, except for 2 regions - Primary Queue (PQ) and Event Queue (EQ) -
which are allocated from generic DMA pools.
Due to address length limitations of the CPU, the addresses of all these
memory regions must have the same MSBs starting at bit 40.
This patch modifies the allocation of the PQ and EQ to be also from the
dedicated pool, to ensure compliance with the limitation.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-04-28 19:17:38 +03:00
Oded Gabbay
a38693d775 habanalabs: return old dram bar address upon change
This patch changes the ASIC interface function that changes the DRAM bar
window. The change is to return the old address that the DRAM bar pointed
to instead of an error code.

This simplifies the code that use this function (mainly in debugfs) to
restore the bar to the old setting.

This is also needed for easier support in future ASICs.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-04-28 10:18:35 +03:00
Oded Gabbay
027d35d0b6 habanalabs: rename restore to ctx_switch when appropriate
This patch only does renaming of certain variables and structure members,
and their accompanied comments.

This is done to better reflect the actions these variables and members
represent.

There is no functional change in this patch.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-04-25 20:15:42 +03:00
Oded Gabbay
b2377e032f habanalabs: use ASIC functions interface for rreg/wreg
This patch slightly changes the macros of RREG32 and WREG32, which are
used when reading or writing from registers.

Instead of directly calling a function in the common code from these
macros, the new code calls a function from the ASIC functions interface.

This change allows us to share much more code between real ASICs and
simulators, which in turn reduces the maintenance burden and
the chances for forgetting to port code between the ASIC files.

The patch also implements the hl_poll_timeout macro, instead of calling
the generic readl_poll_timeout macro. This is required to allow use of
this macro in the simulator files.

As a result from this change, more functions in goya.c are shared with the
simulator and therefore, should not be defined as static.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-04-22 11:49:06 +03:00
Oded Gabbay
d691171d61 uapi/habanalabs: add missing fields in bmon params
This patch adds missing fields of start address 0 and 1 in the bmon
parameter structure that is received from the user in the debug IOCTL.

Without these fields, the functionality of the bmon trace is broken,
because there is no configuration of the base address of the filter of the
bus monitor.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-04-21 16:20:46 +03:00
Oded Gabbay
883c2459a5 habanalabs: re-factor goya_parse_cb_no_ext_queue()
This patch re-factors goya_parse_cb_no_ext_queue() to make it more
readable by inverting the check inside the first if statement so the bulk
of the function won't be inside an if statement.

The patch also fixes a spelling error in the name of the function.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-04-21 10:48:41 +03:00
Tomer Tayar
e00dac3daa habanalabs: Cancel pr_fmt() definition dependency on includes order
pr_fmt() should be defined before including linux/printk.h, either
directly or indirectly, in order to avoid redefinition of the macro.
Currently the macro definition is in habanalabs.h, which is included in
many files, and that makes the addition/reorder of includes to be prone
to compilation errors.
This patch cancels this dependency by defining the macro only in the few
source files that use it.

Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-04-10 15:18:46 +03:00
Patrick Venture
94001602d6 misc: aspeed-p2a-ctrl: fix mixed declarations
Fix up mixed declarations and code in aspeed_p2a_mmap.

Tested: Verified the build had the error and that this patch resolved it
and there were no other warnings or build errors associated with
compilation of this driver.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Patrick Venture <venture@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-27 14:54:13 +02:00
Patrick Venture
2d1c31cb64 drivers/misc: Add Aspeed P2A control driver
Fixup compiler warnings:
 - 108 warning: ISO C90 forbids mixed declarations and code
 - 264 warning: unused variable 'value'
 - 335 warning: unused variable 'res'

Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Patrick Venture <venture@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-25 22:37:52 +02:00
Young Xiao
b281218ad4 Drivers: misc: fix out-of-bounds access in function param_set_kgdbts_var
There is an out-of-bounds access to "config[len - 1]" array when the
variable "len" is zero.

See commit dada6a43b0 ("kgdboc: fix KASAN global-out-of-bounds bug
in param_set_kgdboc_var()") for details.

Signed-off-by: Young Xiao <YangX92@hotmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-25 22:23:06 +02:00
Fuqian Huang
d2f4a83fe3 misc: genwqe: Fix misuse of %x
The pointer should be printed with %p or %px rather than
cast to long long type and printed with %016llx.
Change %x to %p to print the pointer.

Signed-off-by: Fuqian Huang <huangfq.daxian@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-25 19:53:17 +02:00
Alexander Usyskin
43b8a7ed47 mei: expose device state in sysfs
Expose mei device state to user-space through sysfs.
This gives indication to applications that driver is in transition,
usefully mostly to detect link reset state.

Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-25 19:33:34 +02:00
Tomas Winkler
d65bf04200 mei: hdcp: use own Kconfig file
The mei/hdcp module have its own Makefile
so naturally it should have associated Kconfig
in the same directory.

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-25 19:33:34 +02:00
Patrick Venture
01c60dcea9 drivers/misc: Add Aspeed P2A control driver
The ASPEED AST2400, and AST2500 in some configurations include a
PCI-to-AHB MMIO bridge.  This bridge allows a server to read and write
in the BMC's physical address space.  This feature is especially useful
when using this bridge to send large files to the BMC.

The host may use this to send down a firmware image by staging data at a
specific memory address, and in a coordinated effort with the BMC's
software stack and kernel, transmit the bytes.

This driver enables the BMC to unlock the PCI bridge on demand, and
configure it via ioctl to allow the host to write bytes to an agreed
upon location.  In the primary use-case, the region to use is known
apriori on the BMC, and the host requests this information.  Once this
request is received, the BMC's software stack will enable the bridge and
the region and then using some software flow control (possibly via IPMI
packets), copy the bytes down.  Once the process is complete, the BMC
will disable the bridge and unset any region involved.

The default behavior of this bridge when present is: enabled and all
regions marked read-write.  This driver will fix the regions to be
read-only and then disable the bridge entirely.

The memory regions protected are:
 * BMC flash MMIO window
 * System flash MMIO windows
 * SOC IO (peripheral MMIO)
 * DRAM

The DRAM region itself is all of DRAM and cannot be further specified.
Once the PCI bridge is enabled, the host can read all of DRAM, and if
the DRAM section is write-enabled, then it can write to all of it.

Signed-off-by: Patrick Venture <venture@google.com>
Reviewed-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-25 19:33:34 +02:00
Greg Kroah-Hartman
3a26172437 Merge 5.1-rc6 into char-misc-next
We want the fixes, and this resolves a merge error in the fastrpc
driver.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-21 23:14:47 +02:00
Oded Gabbay
9f201aba56 habanalabs: prevent device PTE read/write during hard-reset
During hard-reset, contexts are closed as part of the tear-down process.
After a context is closed, the driver cleans up the page tables of that
context in the device's DRAM. This action is both dangerous and
unnecessary.

It is unnecessary, because the device is going through a hard-reset, which
means the device's DRAM contents are no longer valid and the device's MMU
is being reset.

It is dangerous, because if the hard-reset came as a result of a PCI
freeze, this action may cause the entire host machine to hang.

Therefore, prevent all device PTE updates when a hard-reset operation is
pending.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-04-06 15:33:38 +03:00
Oded Gabbay
3f5398cfbf habanalabs: improve IOCTLs behavior when disabled or reset
This patch makes some improvement in how IOCTLs behave when the device is
disabled or under reset.

The new code checks, at the start of every IOCTL, if the device is
disabled or in reset. If so, it prints an appropriate kernel message and
returns -EBUSY to user-space.

In addition, the code modifies the location of where the
hard_reset_pending flag is being set or cleared:

1. It is now cleared immediately after the reset *tear-down* flow is
   finished but before the re-initialization flow begins.

2. It is being set in the remove function of the device, to make the
   behavior the same with the hard-reset flow

There are two exceptions to the disable or in reset check:

1. The HL_INFO_DEVICE_STATUS opcode in the INFO IOCTL. This opcode allows
   the user to inquire about the status of the device, whether it is
   operational, in reset or malfunction (disabled). If the driver will
   block this IOCTL, the user won't be able to retrieve the status in
   case of malfunction or in reset.

2. The WAIT_FOR_CS IOCTL. This IOCTL allows the user to inquire about the
   status of a CS. We want to allow the user to continue to do so, even if
   we started a soft-reset process because it will allow the user to get
   the correct error code for each CS he submitted.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-04-06 15:41:35 +03:00
Oded Gabbay
caa3c8e525 habanalabs: all FD must be closed before removing device
This patch fixes a bug in the implementation of the function that removes
the device.

The bug can happen when the device is removed but not the driver itself
(e.g. remove by the OS due to PCI freeze in Power architecture).

In that case, there maybe open users that are calling IOCTLs while the
device is removed. This is a possible race condition that the driver must
handle. Otherwise, a kernel panic may occur.

This race is prevented in the hard-reset flow, because the driver makes
sure the users are closed before continuing with the hard-reset. This
race can not occur when the driver itself is removed because the OS makes
sure all the file descriptors are closed.

The fix is to make sure the open users close their file descriptors and if
they don't (after a certain amount of time), the driver sends them a
SIGKILL, because the remove of the device can't be stopped.

The patch re-uses the same code that is called from the hard-reset flow.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-04-06 13:23:54 +03:00
Oded Gabbay
54303a1aef habanalabs: split mmu/no-mmu code paths in memory ioctl
To make the memory ioctl code more readable, this patch moves the
legacy/debug code path of mmu-disabled to a separate function, which is
called (if necessary) from the main memory ioctl function.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-04-04 14:42:26 +03:00
Oded Gabbay
295938406c habanalabs: ASIC_AUTO_DETECT enum value is redundant
This patch removes the enum value of ASIC_AUTO_DETECT because we can use
the validity of the pdev variable to know whether we have a real device or
a simulator. For a real device, we detect the asic type from the device ID
while for a simulator, the simulator code calls create_hdev() with the
specified ASIC type.

Set ASIC_INVALID as the first option in the enum to make sure that no
other enum value will receive the value 0 (which indicates a non-existing
entry in the simulator array).

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-04-04 14:33:34 +03:00
Bo YU
01b76c32e3 misc: fastrpc: add checked value for dma_set_mask
There be should check return value from dma_set_mask to throw some info
if fail to set dma mask.

Detected by CoverityScan, CID# 1443983:  Error handling issues (CHECKED_RETURN)

Fixes: f6f9279f2b ("misc: fastrpc: Add Qualcomm fastrpc basic driver model")
Signed-off-by: Bo YU <tsu.yubo@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-04-02 17:56:54 +02:00
Oded Gabbay
bedd14425d habanalabs: refactoring in goya.c
This patch does some refactoring in goya.c to make code more reusable
between goya code and the goya simulator code (which is not upstreamed).

In addition, the patch removes some dead functions from goya.c which are
not used by the current upstream code

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-04-02 15:56:16 +03:00
Omer Shpigelman
8ba2876ddf habanalabs: add goya implementation for debug configuration
This patch adds the ASIC-specific function for GOYA to configure the
coresight components.

Most of the components have an enabled/disabled flag, depending on whether
the user wants to enable the component or disable it.

For some of the components, such as ETR and SPMU, the user can also
request to read values from them. Those values are needed for the user to
parse the trace data.

The ETR configuration is also checked for security purposes, to make sure
the trace data is written to the device's DRAM.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-04-01 22:23:02 +03:00
Omer Shpigelman
315bc055ed habanalabs: add new IOCTL for debug, tracing and profiling
Habanalabs ASICs use the ARM coresight infrastructure to support debug,
tracing and profiling of neural networks topologies.

Because the coresight is configured using register writes and reads, and
some of the registers hold sensitive information (e.g. the address in
the device's DRAM where the trace data is written to), the user must go
through the kernel driver to configure this mechanism.

This patch implements the common code of the IOCTL and calls the
ASIC-specific function for the actual H/W configuration.

The IOCTL supports configuration of seven coresight components:
ETR, ETF, STM, FUNNEL, BMON, SPMU and TIMESTAMP

The user specifies which component he wishes to configure and provides a
pointer to a structure (located in its process space) that contains the
relevant configuration.

The common code copies the relevant data from the user-space to kernel
space and then calls the ASIC-specific function to do the H/W
configuration.

After the configuration is done, which is usually composed
of several IOCTL calls depending on what the user wanted to trace, the
user can start executing the topology. The trace data will be written to
the user's area in the device's DRAM.

After the tracing operation is complete, and user will call the IOCTL
again to disable the tracing operation. The user also need to read
values from registers for some of the components (e.g. the size of the
trace data in the device's DRAM). In that case, the user will provide a
pointer to an "output" structure in user-space, which the IOCTL code will
fill according the to selected component.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-04-01 22:31:22 +03:00
Oded Gabbay
a1c92d1c2a habanalabs: remove extra semicolon
This patch removes an extra ; after the closing brackets of a while loop.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>
2019-04-02 15:46:02 +03:00
Oded Gabbay
e850b89f50 habanalabs: prevent CPU soft lockup on Palladium
Unmapping ptes in the device MMU on Palladium can take a long time, which
can cause a kernel BUG of CPU soft lockup.

This patch minimize the chances for this bug by sleeping a little between
unmapping ptes.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-03-31 21:37:42 +03:00
Oded Gabbay
9336c02167 habanalabs: remove trailing blank line from EOF
GIT does not like extra blank lines at the end of the file, so this patch
removes those lines.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>
2019-03-31 11:29:53 +03:00
Oded Gabbay
cab8e3e20d habanalabs: improve error messages
This patch improves two error messages to help the user to
better understand what error occurred.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-03-27 09:44:28 +02:00
Oded Gabbay
bfb57a91c2 habanalabs: remove low credit limit of DMA #0
Because DMA #0 is now used by the user, remove the limitation of credits
from this channel. Without this patch, this channel is pretty much
unusable due to its very low bandwidth configuration.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-03-24 18:07:02 +02:00
Dalit Ben Zoor
aa957088b4 habanalabs: add device status option to INFO IOCTL
This patch adds a new opcode to INFO IOCTL that returns the device status.

This will allow users to query the device status in order to avoid sending
command submissions while device is in reset.

Signed-off-by: Dalit Ben Zoor <dbenzoor@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-03-24 10:15:44 +02:00
Dalit Ben Zoor
9354c29ed5 habanalabs: allow user to modify TPC clock relaxation value
This patch allows the user to modify the TPC PLL clock relaxation value
on-the-fly in order to reduce power consumption.

To enable this, the patch removes the protection from the specific
register that controls this behavior.

Signed-off-by: Dalit Ben Zoor <dbenzoor@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-03-21 09:54:49 +02:00
Dalit Ben Zoor
a691a1ebb5 habanalabs: set new golden value to tpc clock relaxation
On init or context switch, set TPC clock relaxation counter
register to a golden value.

Signed-off-by: Dalit Ben Zoor <dbenzoor@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-03-20 16:13:23 +02:00
Oded Gabbay
0878a42086 habanalabs: never fail hard reset of device
Hard-reset of our device should never fail, due to dangers of permanent
damage to the H/W.

This patch removes the last place in the reset path where the driver might
exit before doing the actual reset.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-03-17 09:12:29 +02:00
Oded Gabbay
d9973871da habanalabs: keep track of the device's dma mask
This patch refactors the code that is responsible to set the DMA mask for
the device.

Upon each change of the dma mask, the driver will save the new value that
was set. This is needed in order to make sure we don't try to increase the
mask a second time, in case we failed in the first time. This is
especially relevant for Power machines, as that may cause a change in
configuration of the TVT which will break the device.

Goya will first try to set the device's dma mask to 39 bits, so that the
memory that is allocated on the host machine for communication with the
device's cpu will be in a bus address which is lower then 39 bits. Later,
Goya will try to increase that mask to 48 bits, but only if setting the
mask to 39 bits was successful.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-03-07 18:03:23 +02:00
Oded Gabbay
7c22278edd habanalabs: cast to expected type
This patch fix the following sparse warning:

drivers/misc/habanalabs/goya/goya.c:3646:14: warning: incorrect type in
assignment (different address spaces)
drivers/misc/habanalabs/goya/goya.c:3646:14:    expected void *base
drivers/misc/habanalabs/goya/goya.c:3646:14:    got void [noderef] <asn:2> *

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-03-03 10:23:29 +02:00
Oded Gabbay
7cb5101ee0 habanalabs: prevent host crash during suspend/resume
This patch fixes the implementation of suspend/resume of the device so that
upon resume of the device, the host won't crash due to PCI completion
timeout.

Upon suspend, the device is being reset due to PERST. Therefore, upon
resume, the driver must initialize the PCI controller as if the driver was
loaded. If the controller is not initialized and the device tries to
access the device through the PCI bars, the host will crash with PCI
completion timeout error.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-03-03 22:29:20 +02:00
Oded Gabbay
cbaa99ed1b habanalabs: perform accounting for active CS
This patch adds accounting for active CS. Active means that the CS was
submitted to the H/W queues and was not completed yet.

This is necessary to support suspend operation. Because the device will be
reset upon suspend, we can only suspend after all active CS have been
completed. Hence, we need to perform accounting on their number.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-03-03 15:13:15 +02:00
Omer Shpigelman
d12a5e2458 habanalabs: fix mapping with page size bigger than 4KB
This patch fixes the mapping of virtual address to physical addresses on
architectures where PAGE_SIZE is bigger than 4KB.
The break down to the device page size was done only for the virtual
address while it should have been done for the physical address as well.
As a result virtual addresses were mapped to wrong physical address.
The fix is to apply the break down for the physical addresses as well in
order to get correct mappings.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-03-14 16:54:45 +02:00
Omer Shpigelman
f650a95b71 habanalabs: complete user context cleanup before hard reset
This patch fixes a bug which led to a crash during hard reset flow.
Before a hard reset is executed, we wait a few seconds for the user
context cleanup to complete.
If it wasn't completed, we kill the user process and move on to the reset
flow.
Upon killing the user process, the context cleanup flow begins and may
take a while due to MMU unmaps.
Meanwhile, in the driver reset flow, we change the PCI DRAM bar location
which can interfere with the MMU that uses the bar.
If the context cleanup flow didn't finish quickly, a crash may occur due
to PCI DRAM bar mislocation during the MMU unmap.
Hence adding a wait between killing the user process and the start of the
reset flow.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-03-13 13:36:28 +02:00
Omer Shpigelman
4eb1d1253d habanalabs: fix bug when mapping very large memory area
This patch fixes a bug of allocating a too big memory size with kmalloc,
which causes a failure.
In case of mapping a large memory block, an array of the relevant physical
page addresses is allocated. If there are many pages the array might be
too big to allocate with kmalloc, hence changing to kvmalloc.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-03-07 15:47:19 +02:00
Omer Shpigelman
bfb1ce1259 habanalabs: fix MMU number of pages calculation
The requested allocation size is 64bit, hence the number of requested
pages and the total requested size should 64bit as well.
This patch fixes all places where these are treated as 32bit.

Signed-off-by: Omer Shpigelman <oshpigelman@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2019-03-05 10:59:16 +02:00
Linus Torvalds
a50243b1dd 5.1 Merge Window Pull Request
This has been a slightly more active cycle than normal with ongoing core
 changes and quite a lot of collected driver updates.
 
 - Various driver fixes for bnxt_re, cxgb4, hns, mlx5, pvrdma, rxe
 
 - A new data transfer mode for HFI1 giving higher performance
 
 - Significant functional and bug fix update to the mlx5 On-Demand-Paging MR
   feature
 
 - A chip hang reset recovery system for hns
 
 - Change mm->pinned_vm to an atomic64
 
 - Update bnxt_re to support a new 57500 chip
 
 - A sane netlink 'rdma link add' method for creating rxe devices and fixing
   the various unregistration race conditions in rxe's unregister flow
 
 - Allow lookup up objects by an ID over netlink
 
 - Various reworking of the core to driver interface:
   * Drivers should not assume umem SGLs are in PAGE_SIZE chunks
   * ucontext is accessed via udata not other means
   * Start to make the core code responsible for object memory
     allocation
   * Drivers should convert struct device to struct ib_device
     via a helper
   * Drivers have more tools to avoid use after unregister problems
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAlyAJYYACgkQOG33FX4g
 mxrWwQ/+OyAx4Moru7Aix0C6GWxTJp/wKgw21CS3reZxgLai6x81xNYG/s2wCNjo
 IccObVd7mvzyqPdxOeyHBsJBbQDqWvoD6O2duH8cqGMgBRgh3CSdUep2zLvPpSAx
 2W1SvWYCLDnCuarboFrCA8c4AN3eCZiqD7z9lHyFQGjy3nTUWzk1uBaOP46uaiMv
 w89N8EMdXJ/iY6ONzihvE05NEYbMA8fuvosKLLNdghRiHIjbMQU8SneY23pvyPDd
 ZziPu9NcO3Hw9OVbkwtJp47U3KCBgvKHmnixyZKkikjiD+HVoABw2IMwcYwyBZwP
 Bic/ddONJUvAxMHpKRnQaW7znAiHARk21nDG28UAI7FWXH/wMXgicMp6LRcNKqKF
 vqXdxHTKJb0QUR4xrYI+eA8ihstss7UUpgSgByuANJ0X729xHiJtlEvPb1DPo1Dz
 9CB4OHOVRl5O8sA5Jc6PSusZiKEpvWoyWbdmw0IiwDF5pe922VLl5Nv88ta+sJ38
 v2Ll5AgYcluk7F3599Uh9D7gwp5hxW2Ph3bNYyg2j3HP4/dKsL9XvIJPXqEthgCr
 3KQS9rOZfI/7URieT+H+Mlf+OWZhXsZilJG7No0fYgIVjgJ00h3SF1/299YIq6Qp
 9W7ZXBfVSwLYA2AEVSvGFeZPUxgBwHrSZ62wya4uFeB1jyoodPk=
 =p12E
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
 "This has been a slightly more active cycle than normal with ongoing
  core changes and quite a lot of collected driver updates.

   - Various driver fixes for bnxt_re, cxgb4, hns, mlx5, pvrdma, rxe

   - A new data transfer mode for HFI1 giving higher performance

   - Significant functional and bug fix update to the mlx5
     On-Demand-Paging MR feature

   - A chip hang reset recovery system for hns

   - Change mm->pinned_vm to an atomic64

   - Update bnxt_re to support a new 57500 chip

   - A sane netlink 'rdma link add' method for creating rxe devices and
     fixing the various unregistration race conditions in rxe's
     unregister flow

   - Allow lookup up objects by an ID over netlink

   - Various reworking of the core to driver interface:
       - drivers should not assume umem SGLs are in PAGE_SIZE chunks
       - ucontext is accessed via udata not other means
       - start to make the core code responsible for object memory
         allocation
       - drivers should convert struct device to struct ib_device via a
         helper
       - drivers have more tools to avoid use after unregister problems"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (280 commits)
  net/mlx5: ODP support for XRC transport is not enabled by default in FW
  IB/hfi1: Close race condition on user context disable and close
  RDMA/umem: Revert broken 'off by one' fix
  RDMA/umem: minor bug fix in error handling path
  RDMA/hns: Use GFP_ATOMIC in hns_roce_v2_modify_qp
  cxgb4: kfree mhp after the debug print
  IB/rdmavt: Fix concurrency panics in QP post_send and modify to error
  IB/rdmavt: Fix loopback send with invalidate ordering
  IB/iser: Fix dma_nents type definition
  IB/mlx5: Set correct write permissions for implicit ODP MR
  bnxt_re: Clean cq for kernel consumers only
  RDMA/uverbs: Don't do double free of allocated PD
  RDMA: Handle ucontext allocations by IB/core
  RDMA/core: Fix a WARN() message
  bnxt_re: fix the regression due to changes in alloc_pbl
  IB/mlx4: Increase the timeout for CM cache
  IB/core: Abort page fault handler silently during owning process exit
  IB/mlx5: Validate correct PD before prefetch MR
  IB/mlx5: Protect against prefetch of invalid MR
  RDMA/uverbs: Store PR pointer before it is overwritten
  ...
2019-03-09 15:53:03 -08:00