Current logic only support main processor to stop/start the remote
processor after crash. However to SoC, such as i.MX8QM/QXP, the
remote processor could do attach recovery after crash and trigger watchdog
to reboot itself. It does not need main processor to load image, or
stop/start remote processor.
Introduce two functions: rproc_attach_recovery, rproc_boot_recovery
for the two cases. Boot recovery is as before, let main processor to
help recovery, while attach recovery is to recover itself without help.
To attach recovery, we only do detach and attach.
Acked-by: Arnaud Pouliquen <arnaud.pouliquen@foss.st.com>
Signed-off-by: Peng Fan <peng.fan@nxp.com>
Link: https://lore.kernel.org/r/20220928064756.4059662-3-peng.fan@oss.nxp.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Define a platform driver to manage the remoteproc virtio device as
a platform devices.
The platform device allows to pass rproc_vdev_data platform data to
specify properties that are stored in the rproc_vdev structure.
Such approach will allow to preserve legacy remoteproc virtio device
creation but also to probe the device using device tree mechanism.
remoteproc_virtio.c update:
- Add rproc_virtio_driver platform driver. The probe ops replaces
the rproc_rvdev_add_device function.
- All reference to the rvdev->dev has been updated to rvdev-pdev->dev.
- rproc_rvdev_release is removed as associated to the rvdev device.
- The use of rvdev->kref counter is replaced by get/put_device on the
remoteproc virtio platform device.
- The vdev device no longer increments rproc device counter.
increment/decrement is done in rproc_virtio_probe/rproc_virtio_remove
function in charge of the vrings allocation/free.
remoteproc_core.c update:
Migrate from the rvdev device to the rvdev platform device.
From this patch, when a vdev resource is found in the resource table
the remoteproc core register a platform device.
Signed-off-by: Arnaud Pouliquen <arnaud.pouliquen@foss.st.com>
Link: https://lore.kernel.org/r/20220921135044.917140-5-arnaud.pouliquen@foss.st.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Move functions related to the management of the rproc_vdev
structure in the remoteproc_virtio.c.
The aim is to decorrelate as possible the virtio management from
the core part.
Due to the strong correlation between the vrings and the resource table
the rproc_alloc/parse/free_vring functions are kept in the remoteproc core.
Signed-off-by: Arnaud Pouliquen <arnaud.pouliquen@foss.st.com>
Link: https://lore.kernel.org/r/20220921135044.917140-4-arnaud.pouliquen@foss.st.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
In preparation of the migration of the management of rvdev in
remoteproc_virtio.c, this patch spins off a new function to manage the
remoteproc virtio device creation.
The rproc_rvdev_add_device will be moved to remoteproc_virtio.c.
The rproc_vdev_data structure is introduced to provide information for
the rvdev creation. This structure allows to manage the rvdev and vrings
allocation in the rproc_rvdev_add_device function.
Signed-off-by: Arnaud Pouliquen <arnaud.pouliquen@foss.st.com>
Link: https://lore.kernel.org/r/20220921135044.917140-2-arnaud.pouliquen@foss.st.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
This patch switches the driver away from legacy gpio/of_gpio API to
gpiod API, and removes use of of_get_named_gpio_flags() which I want to
make private to gpiolib.
Note that there is a behavior change in the driver: previously the
driver did not actually request GPIO, it simply parsed GPIO number out
of device tree and poked at it.
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/Yxe20ehiOnitDGus@google.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
This reverts commit a10fba0377: the
proposed API isn't supported on all transports but no
effort was made to address this.
It might not be hard to fix if we want to: maybe just
rename size to size_hint and make sure legacy
transports ignore the hint.
But it's not sure what the benefit is in any case, so
let's drop it.
Fixes: a10fba0377 ("virtio: find_vqs() add arg sizes")
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20220816053602.173815-8-mst@redhat.com>
Pull virtio updates from Michael Tsirkin:
- A huge patchset supporting vq resize using the new vq reset
capability
- Features, fixes, and cleanups all over the place
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (88 commits)
vdpa/mlx5: Fix possible uninitialized return value
vdpa_sim_blk: add support for discard and write-zeroes
vdpa_sim_blk: add support for VIRTIO_BLK_T_FLUSH
vdpa_sim_blk: make vdpasim_blk_check_range usable by other requests
vdpa_sim_blk: check if sector is 0 for commands other than read or write
vdpa_sim: Implement suspend vdpa op
vhost-vdpa: uAPI to suspend the device
vhost-vdpa: introduce SUSPEND backend feature bit
vdpa: Add suspend operation
virtio-blk: Avoid use-after-free on suspend/resume
virtio_vdpa: support the arg sizes of find_vqs()
vhost-vdpa: Call ida_simple_remove() when failed
vDPA: fix 'cast to restricted le16' warnings in vdpa.c
vDPA: !FEATURES_OK should not block querying device config space
vDPA/ifcvf: support userspace to query features and MQ of a management device
vDPA/ifcvf: get_config_size should return a value no greater than dev implementation
vhost scsi: Allow user to control num virtqueues
vhost-scsi: Fix max number of virtqueues
vdpa/mlx5: Support different address spaces for control and data
vdpa/mlx5: Implement susupend virtqueue callback
...
find_vqs() adds a new parameter sizes to specify the size of each vq
vring.
NULL as sizes means that all queues in find_vqs() use the maximum size.
A value in the array is 0, which means that the corresponding queue uses
the maximum size.
In the split scenario, the meaning of size is the largest size, because
it may be limited by memory, the virtio core will try a smaller size.
And the size is power of 2.
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20220801063902.129329-34-xuanzhuo@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Rename the member len in the structure rpoc_vring to num. And remove 'in
bytes' from the comment of it. This is misleading. Because this actually
refers to the size of the virtio vring to be created. The unit is not
bytes.
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Message-Id: <20220624025621.128843-2-xuanzhuo@linux.alibaba.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
devm_regulator_get_optional() API will return -ENODEV if the regulator was
not found. For the optional supplies CX, PX we should not fail in that case
but rather continue. So let's catch that error and continue silently if
those regulators are not found.
The commit 3f52d118f9 ("remoteproc: qcom_q6v5_pas: Deal silently with
optional px and cx regulators") was supposed to do the same but it missed
the fact that devm_regulator_get_optional() API returns -ENODEV when the
regulator was not found.
Cc: Abel Vesa <abel.vesa@linaro.org>
Fixes: 3f52d118f9 ("remoteproc: qcom_q6v5_pas: Deal silently with optional px and cx regulators")
Reported-by: Steev Klimaszewski <steev@kali.org>
Reviewed-by: Abel Vesa <abel.vesa@linaro.org>
Reviewed-by: Johan Hovold <johan+linaro@kernel.org>
Tested-by: Steev Klimaszewski <steev@kali.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220801053939.12556-1-manivannan.sadhasivam@linaro.org
There could be a scenario when there is too much load on a core
(n number of tasks which is affined) or in a case when multiple
rproc subsystem is going for recovery, they queue their recovery
work to one core so even though subsystem are independent their
recovery will be delayed if one of the subsystem recovery work
is taking more time in completing.
If we make this queue unbounded, the recovery work could be picked
on any cpu. This patch is trying to address this.
Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/1650367554-15510-1-git-send-email-quic_mojha@quicinc.com
When a new remoteproc boots up, send the sysmon state notification
of only running remoteprocs. Sending state of remoteprocs booting
up in parallel can cause a race between SSR clients of the remoteproc
that is booting up and the sysmon notification for the same remoteproc,
resulting in an inconsistency between which state the remoteproc that
is booting up in parallel.
For example - if remoteproc A and B crash one after the other, after
remoteproc A boots up, if the remoteproc A tries to get the state of
remoteproc B before the sysmon subdevice for B is invoked but after
the ssr subdevice of B has been invoked, clients on remoteproc A
might get confused when the sysmon notification indicates a different
state.
Signed-off-by: Siddharth Gupta <sidgup@codeaurora.org>
Signed-off-by: Sibi Sankar <quic_sibis@quicinc.com>
Reviewed-by: Konrad Dybcio <konrad.dybcio@somainline.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/1657022900-2049-8-git-send-email-quic_sibis@quicinc.com
The SSCTL service comes up after a finite time when the remote Q6 comes
out of reset. Any graceful shutdowns requested during this period will
be a NOP and abrupt tearing down of the glink channel might lead to pending
transactions on the remote Q6 side and will ultimately lead to a fatal
error. Fix this by waiting for the SSCTL service when a graceful shutdown
is requested.
Fixes: 1fb82ee806 ("remoteproc: qcom: Introduce sysmon")
Reviewed-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Sibi Sankar <quic_sibis@quicinc.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/1657022900-2049-7-git-send-email-quic_sibis@quicinc.com
Due to firmware bugs on the Q6 the hardware watchdog irq can be triggered
multiple times. As the remoteproc framework schedules work items for the
recovery process, if the other threads do not get a chance to run before
recovery is completed the proceeding threads will see the state of the
remoteproc as running and kill the remoteproc while it is running. This
can result in various SMMU and NOC errors. This change sets the state of
the remoteproc to offline whenever a watchdog irq is received.
Signed-off-by: Siddharth Gupta <sidgup@codeaurora.org>
Signed-off-by: Sibi Sankar <quic_sibis@quicinc.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/1657022900-2049-6-git-send-email-quic_sibis@quicinc.com
The initial shutdown request to modem on SM8450 SoCs would start the
decryption process and will keep returning errors until the modem shutdown
is complete. Fix this by retrying shutdowns in fixed intervals.
Err Logs on modem shutdown:
qcom_q6v5_pas 4080000.remoteproc: failed to shutdown: -22
remoteproc remoteproc3: can't stop rproc: -22
Fixes: 5cef9b4845 ("remoteproc: qcom: pas: Add SM8450 remoteproc support")
Signed-off-by: Sibi Sankar <quic_sibis@quicinc.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/1657022900-2049-2-git-send-email-quic_sibis@quicinc.com
The wcnss_get_irq function is expected to return a value > 0 in the
event that an IRQ is succssfully obtained, but it instead returns 0.
This causes the stop and ready IRQs to never actually be used despite
being defined in the device-tree. This patch fixes that.
Fixes: aed361adca ("remoteproc: qcom: Introduce WCNSS peripheral image loader")
Signed-off-by: Sireesh Kodali <sireeshkodali1@gmail.com>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20220526141740.15834-2-sireeshkodali1@gmail.com
The K3 AM62x family of SoC has one PRUSS-M instance and it has two
Programmable Real-Time Units (PRU0 and PRU1). This does not support
Industrial Communications Subsystem features like Ethernet.
Enhance the existing PRU remoteproc driver to support the PRU cores
by using specific compatibles. The initial names for the firmware
images for each PRU core are retrieved from DT nodes, and can be adjusted
through sysfs if required.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Link: https://lore.kernel.org/r/20220602101920.12504-4-kishon@ti.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Every iteration of for_each_available_child_of_node() decrements
the reference count of the previous node.
When breaking early from a for_each_available_child_of_node() loop,
we need to explicitly call of_node_put() on the child node.
Add missing of_node_put() to avoid refcount leak.
Fixes: 6dedbd1d54 ("remoteproc: k3-r5: Add a remoteproc driver for R5F subsystem")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Acked-by: Suman Anna <s-anna@ti.com>
Link: https://lore.kernel.org/r/20220605083334.23942-1-linmq006@gmail.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
There is no mutex protection of these state checking for 'stop'
and 'detach' which can't guarantee there is no another instance
is trying to do same operation.
Consider two instances case:
Instance1: echo stop > /sys/class/remoteproc/remoteproc0/state
Instance2: echo stop > /sys/class/remoteproc/remoteproc0/state
The issue is that the instance2 case may success, Or it
may fail with -EINVAL, which is uncertain.
So move this state checking in rproc_cdev_write() and
state_store() for 'stop', 'detach' operation to
'rproc_shutdown' , 'rproc_detach' function under the mutex
protection.
Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com>
Link: https://lore.kernel.org/r/1648434012-16655-3-git-send-email-shengjiu.wang@nxp.com
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>