Here is the big set of driver core changes for 6.11-rc1.
Lots of stuff in here, with not a huge diffstat, but apis are evolving
which required lots of files to be touched. Highlights of the changes
in here are:
- platform remove callback api final fixups (Uwe took many releases to
get here, finally!)
- Rust bindings for basic firmware apis and initial driver-core
interactions. It's not all that useful for a "write a whole driver
in rust" type of thing, but the firmware bindings do help out the
phy rust drivers, and the driver core bindings give a solid base on
which others can start their work. There is still a long way to go
here before we have a multitude of rust drivers being added, but
it's a great first step.
- driver core const api changes. This reached across all bus types,
and there are some fix-ups for some not-common bus types that
linux-next and 0-day testing shook out. This work is being done to
help make the rust bindings more safe, as well as the C code, moving
toward the end-goal of allowing us to put driver structures into
read-only memory. We aren't there yet, but are getting closer.
- minor devres cleanups and fixes found by code inspection
- arch_topology minor changes
- other minor driver core cleanups
All of these have been in linux-next for a very long time with no
reported problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZqH+aQ8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ymoOQCfVBdLcBjEDAGh3L8qHRGMPy4rV2EAoL/r+zKm
cJEYtJpGtWX6aAtugm9E
=ZyJV
-----END PGP SIGNATURE-----
Merge tag 'driver-core-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is the big set of driver core changes for 6.11-rc1.
Lots of stuff in here, with not a huge diffstat, but apis are evolving
which required lots of files to be touched. Highlights of the changes
in here are:
- platform remove callback api final fixups (Uwe took many releases
to get here, finally!)
- Rust bindings for basic firmware apis and initial driver-core
interactions.
It's not all that useful for a "write a whole driver in rust" type
of thing, but the firmware bindings do help out the phy rust
drivers, and the driver core bindings give a solid base on which
others can start their work.
There is still a long way to go here before we have a multitude of
rust drivers being added, but it's a great first step.
- driver core const api changes.
This reached across all bus types, and there are some fix-ups for
some not-common bus types that linux-next and 0-day testing shook
out.
This work is being done to help make the rust bindings more safe,
as well as the C code, moving toward the end-goal of allowing us to
put driver structures into read-only memory. We aren't there yet,
but are getting closer.
- minor devres cleanups and fixes found by code inspection
- arch_topology minor changes
- other minor driver core cleanups
All of these have been in linux-next for a very long time with no
reported problems"
* tag 'driver-core-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (55 commits)
ARM: sa1100: make match function take a const pointer
sysfs/cpu: Make crash_hotplug attribute world-readable
dio: Have dio_bus_match() callback take a const *
zorro: make match function take a const pointer
driver core: module: make module_[add|remove]_driver take a const *
driver core: make driver_find_device() take a const *
driver core: make driver_[create|remove]_file take a const *
firmware_loader: fix soundness issue in `request_internal`
firmware_loader: annotate doctests as `no_run`
devres: Correct code style for functions that return a pointer type
devres: Initialize an uninitialized struct member
devres: Fix memory leakage caused by driver API devm_free_percpu()
devres: Fix devm_krealloc() wasting memory
driver core: platform: Switch to use kmemdup_array()
driver core: have match() callback in struct bus_type take a const *
MAINTAINERS: add Rust device abstractions to DRIVER CORE
device: rust: improve safety comments
MAINTAINERS: add Danilo as FIRMWARE LOADER maintainer
MAINTAINERS: add Rust FW abstractions to FIRMWARE LOADER
firmware: rust: improve safety comments
...
In the match() callback, the struct device_driver * should not be
changed, so change the function callback to be a const *. This is one
step of many towards making the driver core safe to have struct
device_driver in read-only memory.
Because the match() callback is in all busses, all busses are modified
to handle this properly. This does entail switching some container_of()
calls to container_of_const() to properly handle the constant *.
For some busses, like PCI and USB and HV, the const * is cast away in
the match callback as those busses do want to modify those structures at
this point in time (they have a local lock in the driver structure.)
That will have to be changed in the future if they wish to have their
struct device * in read-only-memory.
Cc: Rafael J. Wysocki <rafael@kernel.org>
Reviewed-by: Alex Elder <elder@kernel.org>
Acked-by: Sumit Garg <sumit.garg@linaro.org>
Link: https://lore.kernel.org/r/2024070136-wrongdoer-busily-01e8@gregkh
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Patch series "Remove usage of the deprecated ida_simple_xx() API".
The series removes the *last* usages of the deprecated ida_simple_xx() API.
This patch (of 3):
ida_alloc() and ida_free() should be preferred to the deprecated
ida_simple_get() and ida_simple_remove().
Note that the upper limit of ida_simple_get() is exclusive, but the one of
ida_alloc_range() is inclusive. So, this upper limit, INT_MAX, should
have been changed to INT_MAX-1.
But, it is likely that the INT_MAX 'idx' is valid that the max value passed
to ida_simple_get() should have been 0.
So, allow this INT_MAX 'idx' value now.
Link: https://lkml.kernel.org/r/cover.1717855701.git.christophe.jaillet@wanadoo.fr
Link: https://lkml.kernel.org/r/8e28b0c45fe8f28ca4475fe0027f8099c41259f0.1717855701.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Eddie James <eajames@linux.ibm.com>
Cc: Alistar Popple <alistair@popple.id.au>
Cc: Christian Gromm <christian.gromm@microchip.com>
Cc: Jeremy Kerr <jk@ozlabs.org>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Parthiban Veerasooran <Parthiban.Veerasooran@microchip.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.
To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().
Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.
Link: https://lists.ozlabs.org/pipermail/linux-fsi/2024-March/000613.html
Acked-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.
To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().
Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.
Acked-by: Jeremy Kerr <jk@ozlabs.org>
Link: https://lists.ozlabs.org/pipermail/linux-fsi/2024-March/000612.html
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.
To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().
Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.
Acked-by: Jeremy Kerr <jk@ozlabs.org>
Link: https://lists.ozlabs.org/pipermail/linux-fsi/2024-March/000614.html
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.
To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().
Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.
Acked-by: Jeremy Kerr <jk@ozlabs.org>
Link: https://lore.kernel.org/r/de0f2d4cb529a433d4620ca0e8fda0dfb1e950db.1709673414.git.u.kleine-koenig@pengutronix.de
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
struct i2c_driver::probe_new is about to go away. Switch the driver to
use the probe callback with the same prototype.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: Eddie James <eajames@linux.ibm.com>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Link: https://lore.kernel.org/r/20230816171944.123705-1-u.kleine-koenig@pengutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The I2CR has the capability to directly perform SCOM operations,
circumventing the need to drive the FSI2PIB engine. Add a new
driver to perform SCOM operations through the I2CR.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Link: https://lore.kernel.org/r/20230612195657.245125-15-eajames@linux.ibm.com
Signed-off-by: Joel Stanley <joel@jms.id.au>
The I2C Responder (I2CR) is an I2C device that translates I2C commands
to CFAM or SCOM operations, effectively implementing an FSI master and
bus.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Link: https://lore.kernel.org/r/20230612195657.245125-14-eajames@linux.ibm.com
Signed-off-by: Joel Stanley <joel@jms.id.au>
Because master device registration may cause hub master scans, or
user scans may begin before device registration has ended, so the
master scan lock must be held while registering the device.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Link: https://lore.kernel.org/r/20230809180814.151984-3-eajames@linux.ibm.com
Signed-off-by: Joel Stanley <joel@jms.id.au>
Master indexing is problematic if a hub is rescanned while the
root master is being rescanned. Always allocate an index for the
FSI master, and set the device name if it hasn't already been set.
Move the call to ida_free to the bottom of master unregistration
and set the number of links to 0 in case another call to scan
comes in before the device is removed.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Link: https://lore.kernel.org/r/20230809180814.151984-2-eajames@linux.ibm.com
Signed-off-by: Joel Stanley <joel@jms.id.au>
The legacy minor numbering shifts the chip id too much,
resulting in ids that overlap with regular ids. Since there
are only 2 bits for 4 types, only shift the chip id by 2
to fit the legacy ids in their reserved space.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Link: https://lore.kernel.org/r/20230612195657.245125-10-eajames@linux.ibm.com
Signed-off-by: Joel Stanley <joel@jms.id.au>
Add more trace events for the scanning and unregistration
functions for debug purposes.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Link: https://lore.kernel.org/r/20230612195657.245125-9-eajames@linux.ibm.com
Signed-off-by: Joel Stanley <joel@jms.id.au>
It has been observed that sometimes the FSI master will return all 0xffs
after a CFAM has been taken out of reset, without presenting any error.
Resetting the FSI master errors resolves the issue.
Fixes: 4a851d714e ("fsi: aspeed: Support CFAM reset GPIO")
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Link: https://lore.kernel.org/r/20230612195657.245125-8-eajames@linux.ibm.com
Signed-off-by: Joel Stanley <joel@jms.id.au>
There's no reason to limit the user here. The way the driver is
designed, extremely large transfers require extremely long timeouts.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Link: https://lore.kernel.org/r/20230612195657.245125-7-eajames@linux.ibm.com
Signed-off-by: Joel Stanley <joel@jms.id.au>
A new use case for the SBEFIFO requires a long in-command timeout
as the SBE processes each part of the command before clearing the
upstream FIFO for the next part of the command.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Link: https://lore.kernel.org/r/20230612195657.245125-6-eajames@linux.ibm.com
Signed-off-by: Joel Stanley <joel@jms.id.au>
The status check during probe doesn't serve any purpose. Any attempt
to use the SBEFIFO will result in the same check and cleanup.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Link: https://lore.kernel.org/r/20230612195657.245125-5-eajames@linux.ibm.com
Signed-off-by: Joel Stanley <joel@jms.id.au>
Since we have two scom drivers, use the standard of matching if
the driver specifies a table so that the right devices go to the
right driver.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Link: https://lore.kernel.org/r/20230612195657.245125-4-eajames@linux.ibm.com
Signed-off-by: Joel Stanley <joel@jms.id.au>
The I2C and SPI subsystems can use an aliased name to number the device.
Add similar support to the FSI subsystem for any device type.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Link: https://lore.kernel.org/r/20230612195657.245125-3-eajames@linux.ibm.com
Signed-off-by: Joel Stanley <joel@jms.id.au>
Some FSI drivers may have need of the slave definition, so
move it to a header file. Also use one macro for obtaining a
pointer to the fsi_master structure.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Link: https://lore.kernel.org/r/20230612195657.245125-2-eajames@linux.ibm.com
Signed-off-by: Joel Stanley <joel@jms.id.au>
Use the recently added of_property_read_reg() helper to get the
untranslated "reg" address value.
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Eddie James <eajames@linux.ibm.com>
Link: https://lore.kernel.org/r/20230609183056.1765183-1-robh@kernel.org
Signed-off-by: Joel Stanley <joel@jms.id.au>
The DT of_device.h and of_platform.h date back to the separate
of_platform_bus_type before it as merged into the regular platform bus.
As part of that merge prepping Arm DT support 13 years ago, they
"temporarily" include each other. They also include platform_device.h
and of.h. As a result, there's a pretty much random mix of those include
files used throughout the tree. In order to detangle these headers and
replace the implicit includes with struct declarations, users need to
explicitly include the correct includes.
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Eddie James <eajames@linux.ibm.com>
Link: https://lore.kernel.org/r/20230718205508.1790932-1-robh@kernel.org
Signed-off-by: Joel Stanley <joel@jms.id.au>
The devnode() callback in struct device_type should not be modifying the
device that is passed into it, so mark it as a const * and propagate the
function signature changes out into all relevant subsystems that use
this callback.
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Alison Schofield <alison.schofield@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Ben Widawsky <bwidawsk@kernel.org>
Cc: Jeremy Kerr <jk@ozlabs.org>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Alistar Popple <alistair@popple.id.au>
Cc: Eddie James <eajames@linux.ibm.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Jilin Yuan <yuanjilin@cdjrlc.com>
Cc: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: Won Chung <wonchung@google.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20230111113018.459199-7-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
READ/WRITE proved to be actively confusing - the meanings are
"data destination, as used with read(2)" and "data source, as
used with write(2)", but people keep interpreting those as
"we read data from it" and "we write data to it", i.e. exactly
the wrong way.
Call them ITER_DEST and ITER_SOURCE - at least that is harder
to misinterpret...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
If allocation fails, the ida_simple_get() will return error number.
So master->idx could be error number and be used in dev_set_name().
Therefore, it should be better to check it and return error if fails,
like the ida_simple_get() in __fsi_get_new_minor().
Fixes: 09aecfab93 ("drivers/fsi: Add fsi master definition")
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Reviewed-by: Eddie James <eajames@linux.ibm.com>
Link: https://lore.kernel.org/r/20220111073411.614138-1-jiasheng@iscas.ac.cn
Signed-off-by: Joel Stanley <joel@jms.id.au>
There is now a need for reading devicetree properties in the OCC
hwmon driver, which isn't current supported as the FSI driver just
instantiates a basic platform device. Add support for this use case
by checking for an "occ-hwmon" node and if present, creating an
OF device from it.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Link: https://lore.kernel.org/r/20220809200701.218059-3-eajames@linux.ibm.com
Signed-off-by: Joel Stanley <joel@jms.id.au>
of_parse_phandle returns node pointer with refcount incremented, use
of_node_put() on it when done.
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Lv Ruyi <lv.ruyi@zte.com.cn>
Link: https://lore.kernel.org/r/20220407085911.2491719-1-lv.ruyi@zte.com.cn
Signed-off-by: Joel Stanley <joel@jms.id.au>
Provide more output on the timeout status, and make some vdbg calls into
dbg calls so they can be enabled at runtime.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Link: https://lore.kernel.org/r/20220415050757.281158-1-joel@jms.id.au
Signed-off-by: Joel Stanley <joel@jms.id.au>
Smatch reports these issues
fsi-core.c:395:12: warning: function 'fsi_slave_claim_range'
with external linkage has definition
fsi-core.c:409:13: warning: function 'fsi_slave_release_range'
with external linkage has definition
The storage-class-specifier extern is not needed in a
definition, so remove it.
Signed-off-by: Tom Rix <trix@redhat.com>
Link: https://lore.kernel.org/r/20220403140937.3833578-1-trix@redhat.com
Signed-off-by: Joel Stanley <joel@jms.id.au>
Use get_device and put_device in the open and close functions to
make sure the device doesn't get freed while a file descriptor is
open.
Also, lock around the freeing of the device buffer and check the
buffer before using it in the submit function.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20220513194424.53468-1-eajames@linux.ibm.com
Signed-off-by: Joel Stanley <joel@jms.id.au>
Change the checksum errno to something different than the errno
used for a bad SBE message. In addition, don't set the user's
response length to the data length in this case, since it's not
SBE FFDC.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Link: https://lore.kernel.org/r/20220426154956.27205-2-eajames@linux.ibm.com
Signed-off-by: Joel Stanley <joel@jms.id.au>
* Improvements in SCOM and OCC drivers for error handling and retries
* Addition of tracepoints for initialisation path
* API for setting long running SBE FIFO operations
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE+nHMAt9PCBDH63wBa3ZZB4FHcJ4FAmITXCIACgkQa3ZZB4FH
cJ6tRQ//eBrtmh0pkfccNMw1tqGqPJAqsANxBwTBFlG02MD7YhUH6VEyULdp2V8O
icwweLcUwM5Q6d6mDSCTEWxRDyLDsQ3lIBCq/j6DLCOZ3gaFAm7bwdl24p59lK51
DNkNB4quAhsc1eXQ3hmALXEUZ1QRLSrbw4+utYPhaiyCU/8LORqaelmip52HBVK5
8WDsOs/Vj7o+Yiu6dNtYfRTkjH3AHyMCgPjSI6tZ1sUZGGeyBTWywDNcbX7wIVAn
wuf+4G+v5XtY7uqmnvMRmhoUU01FvdJ573/UGLyQ5qlZIN6C6i2q5shyvUP839FC
8GNeRJJFABq0qB9nMK2S9hhGvaFG8AhcWMNvT+/IRPa9p4wyvwClwzqeeScATDWg
762bQSl9NOrkGtggh3GDDH5hywQeFuboPhiNfCp9Y0Hw6Ng+OBx1Z9BzxuqayFQi
N1hsd+AC/K8qkcYiayV4euCbA2rinSzUZOYWCHlaPrqG78wACgcI2YWrwmz1prBH
y4aSibeehcXIvHseQQnLHUzLWfZiRr1UsEGA94lva2NnjUADiO/Wty9F2gI/i7zv
X+VP77/9s14EGrx8w4SunPn24i67xEoHMFzwuzr69HuT32w/CKLX52UqUdPCjecF
TbdEJDAwQH/Pml/t+rBfN4ntHhkoMRDhvbM7yovUJ6Y3gcDTMDY=
=kOpN
-----END PGP SIGNATURE-----
Merge tag 'fsi-for-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/joel/fsi into char-misc-next
Joel writes:
FSI changes for v5.18
* Improvements in SCOM and OCC drivers for error handling and retries
* Addition of tracepoints for initialisation path
* API for setting long running SBE FIFO operations
* tag 'fsi-for-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/joel/fsi:
fsi: Add trace events in initialization path
fsi: sbefifo: Implement FSI_SBEFIFO_READ_TIMEOUT_SECONDS ioctl
fsi: sbefifo: Use specified value of start of response timeout
fsi: occ: Improve response status checking
fsi: scom: Remove retries in indirect scoms
fsi: scom: Fix error handling
Add definitions for trace events to show the scanning flow.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20220207161640.35605-1-eajames@linux.ibm.com
Signed-off-by: Joel Stanley <joel@jms.id.au>
FSI_SBEFIFO_READ_TIMEOUT_SECONDS ioctl sets the read timeout (in
seconds) for the response received by sbefifo device from sbe. The
timeout affects only the read operation on current sbefifo device fd.
Certain SBE operations can take long time to complete and the default
timeout of 10 seconds might not be sufficient to start receiving
response from SBE. In such cases, allow the timeout to be set to the
maximum of 120 seconds.
The kernel does not contain the definition of the various SBE
operations, so we must expose an interface to userspace to set the
timeout for the given operation.
Signed-off-by: Amitay Isaacs <amitay@ozlabs.org>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Eddie James <eajames@linux.ibm.com>
Link: https://lore.kernel.org/r/20220121053816.82253-3-joel@jms.id.au
Signed-off-by: Joel Stanley <joel@jms.id.au>
For some of the chip-ops where sbe needs to collect trace information,
sbe can take a long time (>30s) to respond. Currently these chip-ops
will timeout as the start of response timeout defaults to 10s.
Instead of default value, use specified value. The require timeout
value will be set using ioctl.
Signed-off-by: Amitay Isaacs <amitay@ozlabs.org>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Eddie James <eajames@linux.ibm.com>
Link: https://lore.kernel.org/r/20220121053816.82253-2-joel@jms.id.au
Signed-off-by: Joel Stanley <joel@jms.id.au>
If the driver sequence number coincidentally equals the previous
command response sequence number, the driver may proceed with
fetching the entire buffer before the OCC has processed the current
command. To be sure the correct response is obtained, check the
command type and also retry if any of the response parameters have
changed when the rest of the buffer is fetched. Also initialize the
driver with a random sequence number in order to reduce the chances
of this happening.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Link: https://lore.kernel.org/r/20220208152235.19686-1-eajames@linux.ibm.com
Signed-off-by: Joel Stanley <joel@jms.id.au>
A struct device can never be devm_alloc()'ed.
Here, it is embedded in "struct fsi_master", and "struct fsi_master" is
embedded in "struct fsi_master_aspeed".
Since "struct device" is embedded, the data structure embedding it must be
released with the release function, as is already done here.
So use kzalloc() instead of devm_kzalloc() when allocating "aspeed" and
update all error handling branches accordingly.
This prevent a potential double free().
This also fix another issue if opb_readl() fails. Instead of a direct
return, it now jumps in the error handling path.
Fixes: 606397d67f ("fsi: Add ast2600 master driver")
Suggested-by: Greg KH <gregkh@linuxfoundation.org>
Suggested-by: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/2c123f8b0a40dc1a061fae982169fe030b4f47e6.1641765339.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In commit f72ddbe1d7 ("fsi: scom: Remove retries") the retries were
removed from get and put scoms. That patch missed the retires in get and
put indirect scom.
For the same reason, remove them from the scom driver to allow the
caller to decide to retry.
This removes the following special case which would have caused the
retry code to return early:
- if ((ind_data & XSCOM_DATA_IND_COMPLETE) || (err != SCOM_PIB_BLOCKED))
- return 0;
I believe this case is handled.
Fixes: f72ddbe1d7 ("fsi: scom: Remove retries")
Signed-off-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Eddie James <eajames@linux.ibm.com>
Link: https://lore.kernel.org/r/20211207033811.518981-3-joel@jms.id.au
Signed-off-by: Joel Stanley <joel@jms.id.au>
SCOM error handling is made complex by trying to pass around two bits of
information: the function return code, and a status parameter that
represents the CFAM error status register.
The commit f72ddbe1d7 ("fsi: scom: Remove retries") removed the
"hidden" retries in the SCOM driver, in preference of allowing the
calling code (userspace or driver) to decide how to handle a failed
SCOM. However it introduced a bug by attempting to be smart about the
return codes that were "errors" and which were ok to fall through to the
status register parsing.
We get the following errors:
- EINVAL or ENXIO, for indirect scoms where the value is invalid
- EINVAL, where the size or address is incorrect
- EIO or ETIMEOUT, where FSI write failed (aspeed master)
- EAGAIN, where the master detected a crc error (GPIO master only)
- EBUSY, where the bus is disabled (GPIO master in external mode)
In all of these cases we should fail the SCOM read/write and return the
error.
Thanks to Dan Carpenter for the detailed bug report.
Fixes: f72ddbe1d7 ("fsi: scom: Remove retries")
Link: https://lists.ozlabs.org/pipermail/linux-fsi/2021-November/000235.html
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Eddie James <eajames@linux.ibm.com>
Link: https://lore.kernel.org/r/20211207033811.518981-2-joel@jms.id.au
Signed-off-by: Joel Stanley <joel@jms.id.au>
Some SBE operations have extremely large responses and can require
several minutes to process the response. During this time, the device
lock must be held. If another process attempts an operation, it will
wait for the mutex for longer than the kernel hung task watchdog
allows. Therefore, use the interruptible function to lock the mutex.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Link: https://lore.kernel.org/r/20210803213016.44739-1-eajames@linux.ibm.com
Signed-off-by: Joel Stanley <joel@jms.id.au>
The SBEFIFO timeout error requires special handling in userspace
to do recovery operations. Add a sysfs file to indicate a timeout
error, and notify pollers when a timeout occurs.
This will be used by the openpower-occ-control application.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Link: https://lore.kernel.org/r/20211019211749.38059-3-eajames@linux.ibm.com
Signed-off-by: Joel Stanley <joel@jms.id.au>
If the SBEFIFO response indicates an error, store the response in the
user buffer and return an error. Previously, the user had no way of
obtaining the SBEFIFO FFDC.
The user's buffer now contains data in the event of a failure. No change
in the event of a successful transfer.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Link: https://lore.kernel.org/r/20211019205307.36946-3-eajames@linux.ibm.com
Signed-off-by: Joel Stanley <joel@jms.id.au>
Allocate a large buffer for each OCC to handle response data. This
removes memory allocation during an operation, and also allows for
the maximum amount of SBE FFDC.
Previously for the putsram and attn commands, only 32 words would have
been available, and for getsram, only up to the size of the transfer.
SBE FFDC might be up to 8Kb.
The SBE interface expects data to be specified in units of words (4
bytes), defined as OCC_MAX_RESP_WORDS.
This change allows the full FFDC capture to be implemented, where before
it was not available.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Link: https://lore.kernel.org/r/20211019205307.36946-2-eajames@linux.ibm.com
Signed-off-by: Joel Stanley <joel@jms.id.au>
Set and increment the sequence number during the submit operation.
This prevents sequence number conflicts between different users of
the interface. A sequence number conflict may result in a user
getting an OCC response meant for a different command. Since the
sequence number is now modified, the checksum must be calculated and
set before submitting the command.
Signed-off-by: Eddie James <eajames@linux.ibm.com>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Link: https://lore.kernel.org/r/20210721190231.117185-2-eajames@linux.ibm.com
Signed-off-by: Joel Stanley <joel@jms.id.au>