Multiple interrupt sets for affinity spreading are now handled in the core
code and the number of sets and their size is recalculated via a driver
supplied callback.
That avoids the requirement to invoke pci_alloc_irq_vectors_affinity() with
the arguments minvecs and maxvecs set to the same value and the callsite
handling the ENOSPC situation.
Remove the now obsolete sanity checks and the related comments.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Bjorn Helgaas <helgaas@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: linux-nvme@lists.infradead.org
Cc: linux-pci@vger.kernel.org
Cc: Keith Busch <keith.busch@intel.com>
Cc: Sumit Saxena <sumit.saxena@broadcom.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Shivasharan Srikanteshwara <shivasharan.srikanteshwara@broadcom.com>
Link: https://lkml.kernel.org/r/20190216172228.778630549@linutronix.de
The interrupt affinity spreading mechanism supports to spread out
affinities for one or more interrupt sets. A interrupt set contains one or
more interrupts. Each set is mapped to a specific functionality of a
device, e.g. general I/O queues and read I/O queus of multiqueue block
devices.
The number of interrupts per set is defined by the driver. It depends on
the total number of available interrupts for the device, which is
determined by the PCI capabilites and the availability of underlying CPU
resources, and the number of queues which the device provides and the
driver wants to instantiate.
The driver passes initial configuration for the interrupt allocation via a
pointer to struct irq_affinity.
Right now the allocation mechanism is complex as it requires to have a loop
in the driver to determine the maximum number of interrupts which are
provided by the PCI capabilities and the underlying CPU resources. This
loop would have to be replicated in every driver which wants to utilize
this mechanism. That's unwanted code duplication and error prone.
In order to move this into generic facilities it is required to have a
mechanism, which allows the recalculation of the interrupt sets and their
size, in the core code. As the core code does not have any knowledge about the
underlying device, a driver specific callback is required in struct
irq_affinity, which can be invoked by the core code. The callback gets the
number of available interupts as an argument, so the driver can calculate the
corresponding number and size of interrupt sets.
At the moment the struct irq_affinity pointer which is handed in from the
driver and passed through to several core functions is marked 'const', but for
the callback to be able to modify the data in the struct it's required to
remove the 'const' qualifier.
Add the optional callback to struct irq_affinity, which allows drivers to
recalculate the number and size of interrupt sets and remove the 'const'
qualifier.
For simple invocations, which do not supply a callback, a default callback
is installed, which just sets nr_sets to 1 and transfers the number of
spreadable vectors to the set_size array at index 0.
This is for now guarded by a check for nr_sets != 0 to keep the NVME driver
working until it is converted to the callback mechanism.
To make sure that the driver configuration is correct under all circumstances
the callback is invoked even when there are no interrupts for queues left,
i.e. the pre/post requirements already exhaust the numner of available
interrupts.
At the PCI layer irq_create_affinity_masks() has to be invoked even for the
case where the legacy interrupt is used. That ensures that the callback is
invoked and the device driver can adjust to that situation.
[ tglx: Fixed the simple case (no sets required). Moved the sanity check
for nr_sets after the invocation of the callback so it catches
broken drivers. Fixed the kernel doc comments for struct
irq_affinity and de-'This patch'-ed the changelog ]
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Bjorn Helgaas <helgaas@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: linux-nvme@lists.infradead.org
Cc: linux-pci@vger.kernel.org
Cc: Keith Busch <keith.busch@intel.com>
Cc: Sumit Saxena <sumit.saxena@broadcom.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Shivasharan Srikanteshwara <shivasharan.srikanteshwara@broadcom.com>
Link: https://lkml.kernel.org/r/20190216172228.512444498@linutronix.de
Commit 0e157e5286 ("PCI/PME: Implement runtime PM callbacks") tried to
solve an issue where the hierarchy immediately wakes up when it is
transitioned into D3cold. However, it turns out to prevent PME
propagation on some systems that do not support D3cold.
I looked more closely at what might cause the immediate wakeup. It happens
when the ACPI power resource of the root port is turned off. The AML code
associated with the _OFF() method of the ACPI power resource starts a PCIe
L2/L3 Ready transition and waits for it to complete. Right after the L2/L3
Ready transition is started the root port receives a PME from the
downstream port.
The simplest hierarchy where this happens looks like this:
00:1d.0 PCIe Root Port
^
|
v
05:00.0 PCIe switch #1 upstream port
06:01.0 PCIe switch #1 downstream hotplug port
^
|
v
08:00.0 PCIe switch #2 upstream port
It seems that the PCIe link between the two switches, before
PME_Turn_Off/PME_TO_Ack is complete for the whole hierarchy, goes
inactive and triggers PME towards the root port bringing it back to D0.
The L2/L3 Ready sequence is described in PCIe r4.0 spec sections 5.2 and
5.3.3 but unfortunately they do not state what happens if DLLSCE is
enabled during the sequence.
Disabling Data Link Layer State Changed event (DLLSCE) seems to prevent
the issue and still allows the downstream hotplug port to notice when a
device is plugged/unplugged.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=202593
Fixes: 0e157e5286 ("PCI/PME: Implement runtime PM callbacks")
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
CC: stable@vger.kernel.org # v4.20+
The Class Code for subtractive decode PCI-to-PCI bridge is 060401h; add an
entry to make portdrv support this type of bridge. This allows use of PCIe
services on subtractive decode ports.
Signed-off-by: Honghui Zhang <honghui.zhang@mediatek.com>
[bhelgaas: add braces surrounding entry]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The pci_device_id table was technically correct, but unusually formatted,
which made adding entries error-prone. Change the format so it's obvious
how to add entries.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Now that pci-epf-test uses get_features callback and
dw_plat_pcie_epc_features in Designware plat EP driver already indicates
it doesn't support linkup notification and is MSIX capable, remove setting
epc->features which is not used anymore by the endpoint function driver.
Tested-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
pci_epf_linkup() is intended to be invoked if the EPC supports linkup
notification. Now that pci-epf-test uses get_features callback, which
indicates Rockchip EP driver doesn't support linkup notification, remove
pci_epf_linkup() from Rockchip EP driver.
Tested-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Shawn Lin <shawn.lin@rock-chips.com>
pci_epf_linkup() is intended to be invoked if the EPC supports linkup
notification. Now that pci-epf-test uses the get_features() callback,
which indicates Cadence EP driver doesn't support the linkup notification,
remove pci_epf_linkup() from Cadence EP driver.
Tested-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Use pci_epc_get_features() to get EPC features such as linkup
notifier support, MSI/MSIX capable, BAR configuration etc and use it
for configuring pci-epf-test. Since these features are now obtained
directly from EPC driver, remove pci_epf_test_data which was initially
added to have EPC features in endpoint function driver.
Tested-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
It's useless to allocate memory for next BAR if the current BAR is a
64Bit BAR. Stop allocating memory for the next BAR, if the current
BARs flag indicates this is a 64Bit BAR.
Tested-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Now that pci_epf_alloc_space() sets BAR MEM TYPE flags as 64Bit or
32Bit based on size, remove setting it in function driver.
Tested-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
pci_epf_alloc_space() sets the MEM TYPE flags to indicate a 32-bit
Base Address Register irrespective of the size. Fix it here to indicate
64-bit BAR if the size is > 2GB.
Tested-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Add a helper function pci_epc_get_first_free_bar() to get the first
unreserved BAR that can be used for endpoint function.
Tested-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
This reverts commit 0e157e5286.
Heiner reported that the commit in question prevents his network adapter
from triggering PME and waking up when network cable is plugged.
The commit tried to prevent root port waking up from D3cold immediately but
looks like disabing root port PME interrupt is not the right way to fix
that issue so revert it now. The patch following proposes an alternative
solution to that issue.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=202103
Fixes: 0e157e5286 ("PCI/PME: Implement runtime PM callbacks")
Reported-by: Heiner Kallweit <hkallweit1@gmail.com>
Tested-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
CC: stable@vger.kernel.org # v4.20+
Populate ->get_features() dw_pcie_ep_ops to return the EPC features
supported by Cadence PCIe endpoint controller.
Tested-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Populate ->get_features() dw_pcie_ep_ops to return the EPC features
supported by Rockchip PCIe endpoint controller.
Tested-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Populate ->get_features() dw_pcie_ep_ops to return the EPC features
supported by DRA7xx PCIe endpoint controller.
Tested-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Populate ->get_features() dw_pcie_ep_ops to return the EPC features
supported by Designware PCIe endpoint controller.
Tested-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Each platform using Designware PCIe core can support different set of
endpoint features. Add a new callback function ->get_features() in
dw_pcie_ep_ops so that each platform using Designware PCIe core can
advertise its supported features to the endpoint function driver.
Tested-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Add a new pci_epc_ops ->get_features() to get the features
supported by the EPC. Since EPC can provide different features to
different functions, the ->get_features() ops takes _func_no_ as
an argument.
Tested-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Functions copying from/to IO addresses should use the
memcpy_fromio()/memcpy_toio() API rather than plain memcpy().
Fix the issue detected through the sparse tool.
Fixes: 349e7a85b2 ("PCI: endpoint: functions: Add an EP function to test PCI")
Suggested-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Wen Yang <wen.yang99@zte.com.cn>
[lorenzo.pieralisi@arm.com: updated log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
CC: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
CC: Bjorn Helgaas <bhelgaas@google.com>
CC: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
CC: Niklas Cassel <niklas.cassel@axis.com>
CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
CC: Cyrille Pitchen <cyrille.pitchen@free-electrons.com>
CC: linux-pci@vger.kernel.org
CC: linux-kernel@vger.kernel.org
This implements the workound described in the NXP IMX7d erratum e10728.
Initial VCO oscillation may fail under corner conditions such as cold
temperature. It causes PCIe PLL to fail to lock in the initialization
phase, which results in the PCIe link failing to come up.
The workaround is to disable Duty-Cycle Corrector (DCC) calibration
after G_RST.
To do this it is necessary to gain access to the undocumented and
currently unused PCIe PHY register bank. A new device tree node of type
"fsl,imx7d-pcie-phy" is created for the PHY block and the existing PCIe
device uses a phandle named "fsl,imx7d-pcie-phy" to point to it.
Signed-off-by: Trent Piepho <tpiepho@impinj.com>
[lorenzo.pieralisi@arm.com: updated log string, commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Add debug error message when MSI-X entry control mask bit is set, to help
debug the reason why a MSI-X interrupt is not being triggered.
Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Jingoo Han <jingoohan1@gmail.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Joao Pinto <joao.pinto@synopsys.com>
Latency Tolerance Reporting (LTR) allows Endpoints and Switch Upstream
Ports to report their latency requirements to upstream components. If ASPM
L1 PM substates are enabled, the LTR information helps determine when a
Link enters L1.2 [1].
Software must set the maximum latency values in the LTR Capability based on
characteristics of the platform, then set LTR Mechanism Enable in the
Device Control 2 register in the PCIe Capability. The device can then use
LTR to report its latency tolerance.
If the device reports a maximum latency value of zero, that means the
device requires the highest possible performance and the ASPM L1.2 substate
is effectively disabled.
We put devices in D3 for suspend, and we assume their internal state is
lost. On resume, previously we did not restore the LTR Capability, but we
did restore the LTR Mechanism Enable bit, so devices would request the
highest possible performance and ASPM L1.2 wouldn't be used.
[1] PCIe r4.0, sec 5.5.1
Link: https://bugzilla.kernel.org/show_bug.cgi?id=201469
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Gigabyte X299 DESIGNARE EX motherboard has one PCIe root port that is
connected to an Alpine Ridge Thunderbolt controller. This port has slot
implemented bit set in the config space but other than that it is not
hotplug capable in the sense we are expecting in Linux (it has
dev->is_hotplug_bridge set to 0):
00:1c.4 PCI bridge: Intel Corporation 200 Series PCH PCI Express Root Port #5
Bus: primary=00, secondary=05, subordinate=46, sec-latency=0
Memory behind bridge: 78000000-8fffffff [size=384M]
Prefetchable memory behind bridge: 00003800f8000000-00003800ffffffff [size=128M]
...
Capabilities: [40] Express (v2) Root Port (Slot+), MSI 00
...
SltCap: AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug- Surprise-
Slot #8, PowerLimit 25.000W; Interlock- NoCompl+
SltCtl: Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
Control: AttnInd Unknown, PwrInd Unknown, Power- Interlock-
SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet- Interlock-
Changed: MRL- PresDet+ LinkState+
This system is using ACPI based hotplug to notify the OS that it needs to
rescan the PCI bus (ACPI hotplug).
If there is nothing connected in any of the Thunderbolt ports the root port
will not have any runtime PM active children and is thus automatically
runtime suspended pretty soon after boot by PCI PM core. Now, when a
device is connected the BIOS SMI handler responsible for enumerating newly
added devices is not able to find anything because the port is in D3.
Prevent this from happening by blacklisting PCI power management of this
particular Gigabyte system.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=202031
Reported-by: Kedar A Dongre <kedar.a.dongre@intel.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
RussianNeuroMancer reported that the Intel 7265 wifi on a Dell Venue 11 Pro
7140 table stopped working after wakeup from suspend and bisected the
problem to 9ab105deb6 ("PCI/ASPM: Disable ASPM L1.2 Substate if we don't
have LTR"). David Ward reported the same problem on a Dell Latitude 7350.
After af8bb9f898 ("PCI/ACPI: Request LTR control from platform before
using it"), we don't enable LTR unless the platform has granted LTR control
to us. In addition, we don't notice if the platform had already enabled
LTR itself.
After 9ab105deb6 ("PCI/ASPM: Disable ASPM L1.2 Substate if we don't have
LTR"), we avoid using LTR if we don't think the path to the device has LTR
enabled.
The combination means that if the platform itself enables LTR but declines
to give the OS control over LTR, we unnecessarily avoided using ASPM L1.2.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=201469
Fixes: 9ab105deb6 ("PCI/ASPM: Disable ASPM L1.2 Substate if we don't have LTR")
Fixes: af8bb9f898 ("PCI/ACPI: Request LTR control from platform before using it")
Reported-by: RussianNeuroMancer <russianneuromancer@ya.ru>
Reported-by: David Ward <david.ward@ll.mit.edu>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: stable@vger.kernel.org # v4.18+
The double underscore types are meant for compatibility in userspace
headers which does not apply here. Therefore, change to use the standard
no-underscore types.
The origin of the double underscore types dates back to before the git era
so I was not able to find a commit to see the original justification.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
There are at least four different parts with the same Vendor and Device
ID ([16c3:abcd]):
1) Synopsys HAPS USB3 controller
2) Synopsys PCIe Root Port in Freescale/NXP i.MX6Q (reported by Lucas)
3) Synopsys PCIe Root Port in Freescale/NXP i.MX6QP (reported by Lukas)
4) Synopsys PCIe Root Port in Freescale/NXP i.MX7D (reported by Trent)
The HAPS USB3 controller has a Class Code of PCI_CLASS_SERIAL_USB_XHCI,
which means the XHCI driver would normally claim it. Previously,
quirk_synopsys_haps() changed the Class Code of all [16c3:abcd] devices,
including the Root Ports, to PCI_CLASS_SERIAL_USB_DEVICE to prevent the
XHCI driver from claiming them so dwc3-haps can claim them instead.
Changing the Class Code of the Root Ports prevents the PCI core from
handling them as bridges, so devices below them don't work.
Restrict the quirk so it only changes the Class Code for devices that start
with the PCI_CLASS_SERIAL_USB_XHCI Class Code, leaving the Root Ports
alone.
Fixes: 03e6742584 ("PCI: Override Synopsys USB 3.x HAPS device class")
Reported-by: Lukas F. Hartmann <lukas@mntmn.com>
Reported-by: Trent Piepho <tpiepho@impinj.com>
Reported-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Thinh Nguyen <thinhn@synopsys.com>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Both i.MX7D and i.MX8MQ have the same behaviour when it comes to
clearing DIRECT_SPEED_CHANGE bit when no speed change occurs, so to
handle variants correctly add a flag instead of checking the IP block
variant.
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
[lorenzo.pieralisi@arm.com: updated log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Fabio Estevam <fabio.estevam@nxp.com>
Cc: Chris Healy <cphealy@gmail.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Leonard Crestez <leonard.crestez@nxp.com>
Cc: "A.s. Dong" <aisheng.dong@nxp.com>
Cc: Richard Zhu <hongxing.zhu@nxp.com>
PCIe PHY IP block on i.MX7D differs from the one used on i.MX6 family,
so none of the code in the current implementation of
imx6_setup_phy_mpll() or imx6_pcie_reset_phy() is applicable.
Introduce IMX6_PCIE_FLAG_IMX6_PHY and check for it in the aforementioned
functions to make sure they are only executed on appropriate PCIe IP
variants.
Tested-by: Trent Piepho <tpiepho@impinj.com>
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
[lorenzo.pieralisi@arm.com: updated log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Fabio Estevam <fabio.estevam@nxp.com>
Cc: Chris Healy <cphealy@gmail.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Leonard Crestez <leonard.crestez@nxp.com>
Cc: "A.s. Dong" <aisheng.dong@nxp.com>
Cc: Richard Zhu <hongxing.zhu@nxp.com>
Introduce driver data struct. This will simplify handling of device
specific differences.
Signed-off-by: Stefan Agner <stefan@agner.ch>
[andrew.smirnov@gmail.com reformatted drvdata, to simplify future diffs]
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Fabio Estevam <fabio.estevam@nxp.com>
Cc: Chris Healy <cphealy@gmail.com>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Leonard Crestez <leonard.crestez@nxp.com>
Cc: "A.s. Dong" <aisheng.dong@nxp.com>
Cc: Richard Zhu <hongxing.zhu@nxp.com>
As per Figure 6-3 in PCIe r4.0, sec 6.2.6, ERR_ messages will be forwarded
from the secondary interface to the primary interface, if the SERR# Enable
bit in the Bridge Control register is set.
It seems clear that an ACPI hotplug parameter method (_HPP or _HPX) that
tells us to "enable SERR in the command register" (ACPI v6.2, sec 6.2.8,
6.2.9.1) refers to PCI_COMMAND_SERR, which enables reporting of errors by
the function itself.
For bridges, we also interpreted that to mean we should enable
PCI_BRIDGE_CTL_SERR, which enables *forwarding* of errors by the bridge.
But we didn't enable PCI_BRIDGE_CTL_SERR anywhere else, which means we
never enabled it for non-ACPI systems or ACPI systems that didn't supply
hotplug parameters.
That means errors reported below bridges were often never forwarded up to a
Root Port where they could be signaled via AER.
Enable PCI_BRIDGE_CTL_SERR for all bridges so we can get better error
reporting for downstream devices.
Signed-off-by: Bharat Kumar Gogada <bharat.kumar.gogada@xilinx.com>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The HXT SD4800 PCI controller does not set the Command Completed bit unless
writes to the Slot Command register change "Control" bits.
Add SD4800 to the quirk.
Signed-off-by: Shunyong Yang <shunyong.yang@hxt-semitech.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Joey Zheng <yu.zheng@hxt-semitech.com>
The design of HXT SD4800 ACS feature is the same as QCOM QDF2xxx. Add an
ACS quirk for the SD4800.
Signed-off-by: Shunyong Yang <shunyong.yang@hxt-semitech.com>
[bhelgaas: split to separate patch]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Sinan Kaya <okaya@kernel.org>
CC: Joey Zheng <yu.zheng@hxt-semitech.com>
Replace bit rotation operation (1 << bit) with BIT(bit), which
simplifies code reading.
No functional change is intended.
Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Jingoo Han <jingoohan1@gmail.com>
Improve code readability and simplifies mask/unmask operations by
inverting the applied logic (no functional change is intended).
Replace variable name from irq_status to irq_mask, since its goal is to
keep track of which interrupts are masked or not.
Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Jingoo Han <jingoohan1@gmail.com>
Rename variable from data to d to maintain consistency between driver
functions.
No functional change is intended.
Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Jingoo Han <jingoohan1@gmail.com>
Rename variable from data to d to maintain consistency between driver
functions.
No functional change is intended.
Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Jingoo Han <jingoohan1@gmail.com>
Rename variable from data to d to maintain consistency between driver
functions, such as dw_pci_setup_msi_msg().
No functional change is intended.
Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Jingoo Han <jingoohan1@gmail.com>
Rename variable from data to d to maintain consistency between driver
functions, such as dw_msi_mask_irq() and dw_msi_unmask_irq().
No functional change is intended.
Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Jingoo Han <jingoohan1@gmail.com>
Remove unnecessary header include (signal.h) since it doesn't provide
any needed symbols.
Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Jingoo Han <jingoohan1@gmail.com>
Remove unnecessary header include (of_gpio.h) since it doesn't provide
any needed symbols.
Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Jingoo Han <jingoohan1@gmail.com>
Revert commit 3d71746c42 ("PCI: armada8k: Add support for gpio controlled
reset signal").
That commit breaks boot on Macchiatobin board when a Mellanox NIC is
present in the PCIe slot.
It turns out that full reset cycle requires first comphy serdes
initialization. Reset signal toggle without comphy initialization makes
access to PCI configuration registers stall indefinitely. U-Boot toggles
the Macchiatobin PCIe reset line already at boot, after initializing the
comphy serdes.
So while commit 3d71746c42 ("PCI: armada8k: Add support for gpio controlled
reset signal") enables PCIe on platforms that U-Boot does not touch the
reset line (like Clearfog GT-8K), it breaks PCIe (and boot) on the
Macchiatobin board.
Revert commit 3d71746c42 ("PCI: armada8k: Add support for gpio controlled
reset signal") entirely to fix the Macchiatobin regression.
Reported-by: Sven Auhagen <sven.auhagen@voleatech.de>
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
The check on the device_link_add() return value is wrong;
this leads to erroneous code execution, so fix it.
Fixes: 3f7cceeab8 ("PCI: imx: Add multi-pd support")
Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
[lorenzo.pieralisi@arm.com: updated commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
On chips without a separate power domain for PCI (such as 6q/6qp) the
imx6_pcie_attach_pd() function incorrectly returns an error.
Fix by returning 0 if dev_pm_domain_attach_by_name() does not find
anything.
Fixes: 3f7cceeab8 ("PCI: imx: Add multi-pd support")
Reported-by: Lukas F.Hartmann <lukas@mntmn.com>
Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
[lorenzo.pieralisi@arm.com: updated commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Certain PHYs used with PCIe controller can also be used with other
controllers such as USB or SATA. In order to configure the PHY
to work with PCIe controller, invoke phy_set_mode() API with mode
set to PHY_MODE_PCIE.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
[lorenzo.pieralisi@arm.com: updated commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
dra74x/dra76x and dra72x have separate compatible strings. Add support
for these compatible strings in pci-dra7xx driver to perform syscon
configurations required to get x2 mode working.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
The PCI configuration space header type tells us whether the device is a
bridge, a CardBus bridge, or a normal device, and defines the layout of the
rest of the header (PCI r3.0 sec 6.1, PCIe r4.0 sec 7.5.1.1.9).
When we rely on the header format, e.g., when we're dealing with bridge
windows, we should check the header type, not the class code. The class
code is loosely related to the header type, but is often incorrect and the
spec doesn't actually require it to be related to the header format.
Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Honghui Zhang <honghui.zhang@mediatek.com>
[bhelgaas: changelog, keep the PCI_CLASS_BRIDGE_HOST check]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Currently, the pci_size() function actually returns 'size-1'. Make it
return real size to avoid confusion.
Signed-off-by: Du Changbin <changbin.du@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
In order to provide the most performance and/or compatible settings,
ensure VMD root buses observe the pcie bus tuning settings by
configuring those settings prior to adding the devices to the pcie tree.
This patch open-codes pci_rescan_bus() and configures the buses prior to
adding devices and attaching drivers.
Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
The sem_exit variable is conceptually a completion, so it should be called
that.
Similarly, the semOperations semaphore is a simple mutex, and can be
changed into that, respectively.
With both converted, the ibmphp_hpc_initvars() function is no longer used
and can be removed.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
match_string() returns the array index of a matching string. Use it
instead of the open-coded implementation.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Convert string compares of DT node names to use of_node_name_eq() helper
instead. This removes direct access to the node name pointer.
Signed-off-by: Rob Herring <robh@kernel.org>
[bhelgaas: drop similar rpaphp_core.c change to avoid merge conflict]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
pci_bridge_check_ranges() determines whether a bridge supports the optional
I/O and prefetchable memory windows and sets the flag bits in the bridge
resources. This *could* be done once during enumeration except that the
resource allocation code completely clears the flag bits, e.g., in the
pci_assign_unassigned_bridge_resources() path.
The problem with pci_bridge_check_ranges() in the resource allocation path
is that we may allocate resources after devices have been claimed by
drivers, and pci_bridge_check_ranges() *changes* the window registers to
determine whether they're writable. This may break concurrent accesses to
devices behind the bridge.
Add a new pci_read_bridge_windows() to determine whether a bridge supports
the optional windows, call it once during enumeration, remember the
results, and change pci_bridge_check_ranges() so it doesn't touch the
bridge windows but sets the flag bits based on those remembered results.
Link: https://lore.kernel.org/linux-pci/1506151482-113560-1-git-send-email-wangzhou1@hisilicon.com
Link: https://lists.gnu.org/archive/html/qemu-devel/2018-12/msg02082.html
Reported-by: Yandong Xu <xuyandong2@huawei.com>
Tested-by: Yandong Xu <xuyandong2@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Ofer Hayut <ofer@lightbitslabs.com>
Cc: Roy Shterman <roys@lightbitslabs.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Zhou Wang <wangzhou1@hisilicon.com>
We are trying to get rid of BUS_ATTR() and the usage of that in
pci-sysfs.c can be trivially converted to use BUS_ATTR_WO(), so use that
instead.
Cc: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We are trying to get rid of BUS_ATTR() and the usage of that in pci.c
can be trivially converted to use BUS_ATTR_RW(), so use that instead.
Cc: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The disable_acs_redir parameter stores a pointer to the string passed to
pci_setup(). However, the string passed to PCI setup is actually a
temporary copy allocated in static __initdata memory. After init, once the
memory is freed, it is no longer valid to reference this pointer.
This bug was noticed in v5.0-rc1 after a change in commit c5eb119007
("PCI / PM: Allow runtime PM without callback functions") caused
pci_disable_acs_redir() to be called during shutdown which manifested
as an unable to handle kernel paging request at:
RIP: 0010:pci_enable_acs+0x3f/0x1e0
Call Trace:
pci_restore_state.part.44+0x159/0x3c0
pci_restore_standard_config+0x33/0x40
pci_pm_runtime_resume+0x2b/0xd0
? pci_restore_standard_config+0x40/0x40
__rpm_callback+0xbc/0x1b0
rpm_callback+0x1f/0x70
? pci_restore_standard_config+0x40/0x40
rpm_resume+0x4f9/0x710
? pci_conf1_read+0xb6/0xf0
? pci_conf1_write+0xb2/0xe0
__pm_runtime_resume+0x47/0x70
pci_device_shutdown+0x1e/0x60
device_shutdown+0x14a/0x1f0
kernel_restart+0xe/0x50
__do_sys_reboot+0x1ee/0x210
? __fput+0x144/0x1d0
do_writev+0x5e/0xf0
? do_writev+0x5e/0xf0
do_syscall_64+0x48/0xf0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
It was also likely possible to trigger this bug when hotplugging PCI
devices.
To fix this, instead of storing a pointer, we use kstrdup() to copy the
disable_acs_redir_param to its own buffer which will never be freed.
Fixes: aaca43fda7 ("PCI: Add "pci=disable_acs_redir=" parameter for peer-to-peer support")
Tested-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
The API of pci_alloc_irq_vectors_affinity() says it returns -ENOSPC if
fewer than @min_vecs interrupt vectors are available for @dev.
However, if a device supports MSI-X but not MSI and a caller requests
@min_vecs that can't be satisfied by MSI-X, we previously returned -EINVAL
(from the failed attempt to enable MSI), not -ENOSPC.
When -ENOSPC is returned, callers may reduce the number IRQs they request
and try again. Most callers can use the @min_vecs and @max_vecs
parameters to avoid this retry loop, but that doesn't work when using IRQ
affinity "nr_sets" because rebalancing the sets is driver-specific.
This return value bug has been present since pci_alloc_irq_vectors() was
added in v4.10 by aff171641d ("PCI: Provide sensible IRQ vector
alloc/free routines"), but it wasn't an issue because @min_vecs/@max_vecs
removed the need for callers to iteratively reduce the number of IRQs
requested and retry the allocation, so they didn't need to distinguish
-ENOSPC from -EINVAL.
In v5.0, 6da4b3ab9a ("genirq/affinity: Add support for allocating
interrupt sets") added IRQ sets to the interface, which reintroduced the
need to check for -ENOSPC and possibly reduce the number of IRQs requested
and retry the allocation.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Jens Axboe <axboe@fb.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Shameerali reported that running v4.20-rc1 as QEMU guest, the PCIe hotplug
port times out during boot:
pciehp 0000:00:01.0:pcie004: Timeout on hotplug command 0x03f1 (issued 1016 msec ago)
pciehp 0000:00:01.0:pcie004: Timeout on hotplug command 0x03f1 (issued 1024 msec ago)
pciehp 0000:00:01.0:pcie004: Failed to check link status
pciehp 0000:00:01.0:pcie004: Timeout on hotplug command 0x02f1 (issued 2520 msec ago)
The issue was bisected down to commit 720d6a671a ("PCI: pciehp: Do not
handle events if interrupts are masked") and was further analyzed by the
reporter to be caused by the fact that pciehp first updates the hardware
and only then cache the ctrl->slot_ctrl in pcie_do_write_cmd(). If the
interrupt happens before we cache the value, pciehp_isr() reads value 0 and
decides that the interrupt was not meant for it causing the above timeout
to trigger.
Fix by moving ctrl->slot_ctrl assignment to happen before it is written to
the hardware.
Fixes: 720d6a671a ("PCI: pciehp: Do not handle events if interrupts are masked")
Link: https://lore.kernel.org/linux-pci/5FC3163CFD30C246ABAA99954A238FA8387DD344@FRAEML521-MBX.china.huawei.com
Reported-by: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
After commit eb01d42a77 ("PCI: consolidate PCI config entry in
drivers/pci"), all the PCI kconfig options appear below "PCI support"
rather than within a sub-menu. This is because menuconfig expects all
kconfig entries to be enclosed in an if/endif section. Add the missing
if/endif.
With this, "depends on PCI" is redundant in the sub-menu entries and can
be removed.
Fixes: eb01d42a77 ("PCI: consolidate PCI config entry in drivers/pci")
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
There is a plan to build the kernel with -Wimplicit-fallthrough and
these places in the code produced warnings (W=1). Fix them up.
Signed-off-by: Mathieu Malaterre <malat@debian.org>
[bhelgaas: squash into one patch, drop extra changelog detail]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
We've always had a weird situation around dma_zalloc_coherent. To
safely support mapping the allocations to userspace major architectures
like x86 and arm have always zeroed allocations from dma_alloc_coherent,
but a couple other architectures were missing that zeroing either always
or in corner cases. Then later we grew anothe dma_zalloc_coherent
interface to explicitly request zeroing, but that just added __GFP_ZERO
to the allocation flags, which for some allocators that didn't end
up using the page allocator ended up being a no-op and still not
zeroing the allocations.
So for this merge window I fixed up all remaining architectures to zero
the memory in dma_alloc_coherent, and made dma_zalloc_coherent a no-op
wrapper around dma_alloc_coherent, which fixes all of the above issues.
dma_zalloc_coherent is now pointless and can go away, and Luis helped
me writing a cocchinelle script and patch series to kill it, which I
think we should apply now just after -rc1 to finally settle these
issue.
-----BEGIN PGP SIGNATURE-----
iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAlw6LV0LHGhjaEBsc3Qu
ZGUACgkQD55TZVIEUYPd1hAAshbVLVIUg750CQoKD5sk44/IW7klkQUnzcp9ueOY
/GIYS/ils8q9DSITAyMJxHKpjt1EEVlavWLvYLlfpkDfLaVGMUJu+zKGaolhU5F6
OuldJKZV6tWrC7zGVl+09y5CAyelVxLyuD09I+QYnHUIO9ljgZHB2+W3ezOFxBRD
FjrQRuFY6Xpr1F42zWc4aJrgACffH761pLx3fbJlIs8aEInWKqDbuyL6Lg71BRXh
kHKt0DQxFxklyQmqaYyDesujjXUysweAFLNxgN9GSrlWBR8GE3qJpsSrIzjX5k8w
WKzbypYqVQepI3zYCN5EoCAoiHBFZXPSNHCoXAH6tHjYwgQ3uoDpzxEKJOEykO4i
1+kcJh3ArQZA/BsMBf3I/CNMsxvBuC3/QKFMcs/7pKx1ABoumSBSIpqB4pG4NU+o
fxRBHKjqbILufWKReb2PuRXiPpddwuo0vg70U0FK2aWZrClRYEpBdExPKrBUAG34
WtQCGA0YFXV/kAgPPmOvnPlwpYM2ZrVLVl5Ct2diR5QaLee3o1GiStQm0LuspRzk
HSzVyCYdKRxH4zkEBzKUn/PuyYLoMRyPP4PQ3R/xlQrFqvv6FeiGYnow89+1JpUp
2qWg5vU1aLM7/WXnyVGDED3T42eZREi/uMPQIADXqRIVC7e43/eKcLF06n0lIWh9
usg=
=VIBB
-----END PGP SIGNATURE-----
Merge tag 'remove-dma_zalloc_coherent-5.0' of git://git.infradead.org/users/hch/dma-mapping
Pull dma_zalloc_coherent() removal from Christoph Hellwig:
"We've always had a weird situation around dma_zalloc_coherent. To
safely support mapping the allocations to userspace major
architectures like x86 and arm have always zeroed allocations from
dma_alloc_coherent, but a couple other architectures were missing that
zeroing either always or in corner cases.
Then later we grew anothe dma_zalloc_coherent interface to explicitly
request zeroing, but that just added __GFP_ZERO to the allocation
flags, which for some allocators that didn't end up using the page
allocator ended up being a no-op and still not zeroing the
allocations.
So for this merge window I fixed up all remaining architectures to
zero the memory in dma_alloc_coherent, and made dma_zalloc_coherent a
no-op wrapper around dma_alloc_coherent, which fixes all of the above
issues.
dma_zalloc_coherent is now pointless and can go away, and Luis helped
me writing a cocchinelle script and patch series to kill it, which I
think we should apply now just after -rc1 to finally settle these
issue"
* tag 'remove-dma_zalloc_coherent-5.0' of git://git.infradead.org/users/hch/dma-mapping:
dma-mapping: remove dma_zalloc_coherent()
cross-tree: phase out dma_zalloc_coherent() on headers
cross-tree: phase out dma_zalloc_coherent()
Building the driver when GPIOLIB=n is not selected is causing the following
compilation failure:
drivers/pci/controller/dwc/pci-meson.c: In function 'meson_pcie_assert_reset':
drivers/pci/controller/dwc/pci-meson.c:290:2: error: implicit declaration of function 'gpiod_set_value_cansleep'; did you mean 'gpio_set_value_cansleep'? [-Werror=implicit-function-declaration]
gpiod_set_value_cansleep(mp->reset_gpio, 0);
^~~~~~~~~~~~~~~~~~~~~~~~
gpio_set_value_cansleep
drivers/pci/controller/dwc/pci-meson.c: In function 'meson_pcie_probe':
drivers/pci/controller/dwc/pci-meson.c:540:19: error: implicit declaration of function 'devm_gpiod_get'; did you mean 'devm_gpio_free'? [-Werror=implicit-function-declaration]
mp->reset_gpio = devm_gpiod_get(dev, "reset", GPIOD_OUT_LOW);
^~~~~~~~~~~~~~
devm_gpio_free
drivers/pci/controller/dwc/pci-meson.c:540:48: error: 'GPIOD_OUT_LOW' undeclared (first use in this function); did you mean 'GPIOF_INIT_LOW'?
mp->reset_gpio = devm_gpiod_get(dev, "reset", GPIOD_OUT_LOW);
^~~~~~~~~~~~~
GPIOF_INIT_LOW
Add the missing linux/gpio/consumer.h header to fix it.
Fixes: 9c0ef6d34f ("PCI: amlogic: Add the Amlogic Meson PCIe controller driver")
Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
[lorenzo.pieralisi@arm.com: commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
We already need to zero out memory for dma_alloc_coherent(), as such
using dma_zalloc_coherent() is superflous. Phase it out.
This change was generated with the following Coccinelle SmPL patch:
@ replace_dma_zalloc_coherent @
expression dev, size, data, handle, flags;
@@
-dma_zalloc_coherent(dev, size, handle, flags)
+dma_alloc_coherent(dev, size, handle, flags)
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
[hch: re-ran the script on the latest tree]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Nobody has actually used the type (VERIFY_READ vs VERIFY_WRITE) argument
of the user address range verification function since we got rid of the
old racy i386-only code to walk page tables by hand.
It existed because the original 80386 would not honor the write protect
bit when in kernel mode, so you had to do COW by hand before doing any
user access. But we haven't supported that in a long time, and these
days the 'type' argument is a purely historical artifact.
A discussion about extending 'user_access_begin()' to do the range
checking resulted this patch, because there is no way we're going to
move the old VERIFY_xyz interface to that model. And it's best done at
the end of the merge window when I've done most of my merges, so let's
just get this done once and for all.
This patch was mostly done with a sed-script, with manual fix-ups for
the cases that weren't of the trivial 'access_ok(VERIFY_xyz' form.
There were a couple of notable cases:
- csky still had the old "verify_area()" name as an alias.
- the iter_iov code had magical hardcoded knowledge of the actual
values of VERIFY_{READ,WRITE} (not that they mattered, since nothing
really used it)
- microblaze used the type argument for a debug printout
but other than those oddities this should be a total no-op patch.
I tried to fix up all architectures, did fairly extensive grepping for
access_ok() uses, and the changes are trivial, but I may have missed
something. Any missed conversion should be trivially fixable, though.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- Mask DesignWare interrupts instead of disabling them to avoid lost
interrupts (Marc Zyngier)
- Add locking when acking DesignWare interrupts (Marc Zyngier)
- Ack DesignWare interrupts in the proper callbacks (Marc Zyngier)
* remotes/lorenzo/pci/dwc-msi:
PCI: dwc: Move interrupt acking into the proper callback
PCI: dwc: Take lock when ACKing an interrupt
PCI: dwc: Use interrupt masking instead of disabling
- Skip VF scanning on powerpc, which does this in firmware (Sebastian
Ott)
* pci/virtualization:
s390/pci: skip VF scanning
PCI/IOV: Add flag so platforms can skip VF scanning
PCI/IOV: Factor out sriov_add_vfs()
- Expand Kconfig "PF" acronyms (Randy Dunlap)
- Update MAINTAINERS for arch/x86/kernel/early-quirks.c (Bjorn Helgaas)
- Add missing include to drivers/pci.h (Alexandru Gagniuc)
- Override Synopsys USB 3.x HAPS device class so dwc3-haps can claim it
instead of xhci (Thinh Nguyen)
* pci/misc:
PCI: Override Synopsys USB 3.x HAPS device class
PCI: Move Synopsys HAPS platform device IDs
PCI: Add missing include to drivers/pci.h
PCI: Remove unnecessary space before function pointer arguments
MAINTAINERS: Add x86 early-quirks.c file pattern to PCI subsystem
PCI: Expand the "PF" acronym in Kconfig help text
The MSI Enable bit in the MSI Capability (PCIe r4.0, sec 7.7.1.2) controls
whether a Function can request service using MSI.
i.MX6 Root Ports implement the MSI Capability and may use MSI to request
service for events like PME, hotplug, AER, etc. In addition, on i.MX6, the
MSI Enable bit controls delivery of MSI interrupts from components below
the Root Port.
Prior to f3fdfc4ac3 ("PCI: Remove host driver Kconfig selection of
CONFIG_PCIEPORTBUS"), enabling CONFIG_PCI_IMX6 automatically also enabled
CONFIG_PCIEPORTBUS, and when portdrv claimed the Root Ports, it set the MSI
Enable bit so it could use PME, hotplug, AER, etc. As a side effect, that
also enabled delivery of MSI interrupts from downstream components.
The imx6q-pcie driver itself does not depend on portdrv, so set MSI Enable
in imx6q-pcie so MSI from downstream components works even if nobody uses
MSI for the Root Port events.
Fixes: f3fdfc4ac3 ("PCI: Remove host driver Kconfig selection of CONFIG_PCIEPORTBUS")
Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Sven Van Asbroeck <TheSven73@googlemail.com>
Tested-by: Trent Piepho <tpiepho@impinj.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Provide a flag to skip scanning for new VFs after SR-IOV enablement. This
can be set by implementations for which the VFs are already reported by
other means.
Signed-off-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Provide sriov_add_vfs() as a wrapper to scan for VFs that cleans up after
itself. This is just a code simplification. No functional change.
Signed-off-by: Sebastian Ott <sebott@linux.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Consolidation of bus (PCI, PCMCIA, EISA, RapidIO) config entries
by Christoph Hellwig.
Currently, every architecture that wants to provide common peripheral
busses needs to add some boilerplate code and include the right Kconfig
files. This series instead just selects the presence (when needed) and
then handles everything in the bus-specific Kconfig file under drivers/.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJcJilwAAoJED2LAQed4NsGt1YP/RMTEUqbCSwS/CnTLrE+aVTC
O2aWwB80ZlVwpeBbHLW5/M88OvOev0UaCr+gyzgpFRl5ITzS7Jevb8VbpGzblbH7
bFxIEyZFGQiy9oEWw3Lfu9JRSsLm3jNo7hkmdBSn2Rw3KkEd/YF7K3q9GuA7BpCS
ZxAirebvEpr4KYEzkuc57NqCYx2Tc8G+JWr5D7pZCFaq9vxYt3TddGqw/c7iQVSQ
1Og1809IdhGyCSlA/ExfaqaBMaJHMRAOHX5GgkqZw1EbFcizUFhAAsKCrGL5nBtX
NiWF9jhgHR1M+L69jfctOstrmGQD2KicNgWQf1aS5RQkPfjuqIKGT/i9g6J1pVyX
TaW1J36Hcl8PpsKoPBnnrixd1T41O3/PuqtEJRm7LCBYOQiwS9sEmLO09RDRjER8
SPAAyvkhE8oq+0RHiTYN4tm8dyJc1djZ5wzgLnwFPAnU6SR+mbN02RzBMsYZXD+x
RNbBSGBRJFQDBw6Rn+ktcIQvcKYmUqe1k1YNHMy6kG3QqvhBaDy+8PA/YjIKPQYQ
B/NNUAMEJMys1OQrRL2UDXb2ysaCpzwMmlrBW2IwYsQrX5OwbPkNuQ5Mbe1Lr+mc
4NXR+HubvojsHaAby+OhFbrUX2Jcz3wqYj7aannb9sMRmw0VJXV5dPYUqje3ZhPS
P2AovKT8O9nWsEttqER5
=WxId
-----END PGP SIGNATURE-----
Merge tag 'kconfig-v4.21-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kconfig file consolidation from Masahiro Yamada:
"Consolidation of bus (PCI, PCMCIA, EISA, RapidIO) config entries by
Christoph Hellwig.
Currently, every architecture that wants to provide common peripheral
busses needs to add some boilerplate code and include the right
Kconfig files. This series instead just selects the presence (when
needed) and then handles everything in the bus-specific Kconfig file
under drivers/"
* tag 'kconfig-v4.21-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
pcmcia: remove per-arch PCMCIA config entry
eisa: consolidate EISA Kconfig entry in drivers/eisa
rapidio: consolidate RAPIDIO config entry in drivers/rapidio
pcmcia: allow PCMCIA support independent of the architecture
PCI: consolidate the PCI_SYSCALL symbol
PCI: consolidate the PCI_DOMAINS and PCI_DOMAINS_GENERIC config options
PCI: consolidate PCI config entry in drivers/pci
MIPS: remove the HT_PCI config option
Here is the big set of char and misc driver patches for 4.21-rc1.
Lots of different types of driver things in here, as this tree seems to
be the "collection of various driver subsystems not big enough to have
their own git tree" lately.
Anyway, some highlights of the changes in here:
- binderfs: is it a rule that all driver subsystems will eventually
grow to have their own filesystem? Binder now has one to handle the
use of it in containerized systems. This was discussed at the
Plumbers conference a few months ago and knocked into mergable shape
very fast by Christian Brauner. Who also has signed up to be
another binder maintainer, showing a distinct lack of good judgement :)
- binder updates and fixes
- mei driver updates
- fpga driver updates and additions
- thunderbolt driver updates
- soundwire driver updates
- extcon driver updates
- nvmem driver updates
- hyper-v driver updates
- coresight driver updates
- pvpanic driver additions and reworking for more device support
- lp driver updates. Yes really, it's _finally_ moved to the proper
parallal port driver model, something I never thought I would see
happen. Good stuff.
- other tiny driver updates and fixes.
All of these have been in linux-next for a while with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXCZCUA8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ymF9QCgx/Z8Fj1qzGVGrIE4flXOi7pxOrgAoMqJEWtU
ywwL8M9suKDz7cZT9fWQ
=xxr6
-----END PGP SIGNATURE-----
Merge tag 'char-misc-4.21-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver updates from Greg KH:
"Here is the big set of char and misc driver patches for 4.21-rc1.
Lots of different types of driver things in here, as this tree seems
to be the "collection of various driver subsystems not big enough to
have their own git tree" lately.
Anyway, some highlights of the changes in here:
- binderfs: is it a rule that all driver subsystems will eventually
grow to have their own filesystem? Binder now has one to handle the
use of it in containerized systems.
This was discussed at the Plumbers conference a few months ago and
knocked into mergable shape very fast by Christian Brauner. Who
also has signed up to be another binder maintainer, showing a
distinct lack of good judgement :)
- binder updates and fixes
- mei driver updates
- fpga driver updates and additions
- thunderbolt driver updates
- soundwire driver updates
- extcon driver updates
- nvmem driver updates
- hyper-v driver updates
- coresight driver updates
- pvpanic driver additions and reworking for more device support
- lp driver updates. Yes really, it's _finally_ moved to the proper
parallal port driver model, something I never thought I would see
happen. Good stuff.
- other tiny driver updates and fixes.
All of these have been in linux-next for a while with no reported
issues"
* tag 'char-misc-4.21-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (116 commits)
MAINTAINERS: add another Android binder maintainer
intel_th: msu: Fix an off-by-one in attribute store
stm class: Add a reference to the SyS-T document
stm class: Fix a module refcount leak in policy creation error path
char: lp: use new parport device model
char: lp: properly count the lp devices
char: lp: use first unused lp number while registering
char: lp: detach the device when parallel port is removed
char: lp: introduce list to save port number
bus: qcom: remove duplicated include from qcom-ebi2.c
VMCI: Use memdup_user() rather than duplicating its implementation
char/rtc: Use of_node_name_eq for node name comparisons
misc: mic: fix a DMA pool free failure
ptp: fix an IS_ERR() vs NULL check
genwqe: Fix size check
binder: implement binderfs
binder: fix use-after-free due to ksys_close() during fdget()
bus: fsl-mc: remove duplicated include files
bus: fsl-mc: explicitly define the fsl_mc_command endianness
misc: ti-st: make array read_ver_cmd static, shrinks object size
...
Merge misc updates from Andrew Morton:
- large KASAN update to use arm's "software tag-based mode"
- a few misc things
- sh updates
- ocfs2 updates
- just about all of MM
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (167 commits)
kernel/fork.c: mark 'stack_vm_area' with __maybe_unused
memcg, oom: notify on oom killer invocation from the charge path
mm, swap: fix swapoff with KSM pages
include/linux/gfp.h: fix typo
mm/hmm: fix memremap.h, move dev_page_fault_t callback to hmm
hugetlbfs: Use i_mmap_rwsem to fix page fault/truncate race
hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization
memory_hotplug: add missing newlines to debugging output
mm: remove __hugepage_set_anon_rmap()
include/linux/vmstat.h: remove unused page state adjustment macro
mm/page_alloc.c: allow error injection
mm: migrate: drop unused argument of migrate_page_move_mapping()
blkdev: avoid migration stalls for blkdev pages
mm: migrate: provide buffer_migrate_page_norefs()
mm: migrate: move migrate_page_lock_buffers()
mm: migrate: lock buffers before migrate_page_move_mapping()
mm: migration: factor out code to compute expected number of page references
mm, page_alloc: enable pcpu_drain with zone capability
kmemleak: add config to select auto scan
mm/page_alloc.c: don't call kasan_free_pages() at deferred mem init
...
A huge update this time, but a lot of that is just consolidating or
removing code:
- provide a common DMA_MAPPING_ERROR definition and avoid indirect
calls for dma_map_* error checking
- use direct calls for the DMA direct mapping case, avoiding huge
retpoline overhead for high performance workloads
- merge the swiotlb dma_map_ops into dma-direct
- provide a generic remapping DMA consistent allocator for architectures
that have devices that perform DMA that is not cache coherent. Based
on the existing arm64 implementation and also used for csky now.
- improve the dma-debug infrastructure, including dynamic allocation
of entries (Robin Murphy)
- default to providing chaining scatterlist everywhere, with opt-outs
for the few architectures (alpha, parisc, most arm32 variants) that
can't cope with it
- misc sparc32 dma-related cleanups
- remove the dma_mark_clean arch hook used by swiotlb on ia64 and
replace it with the generic noncoherent infrastructure
- fix the return type of dma_set_max_seg_size (Niklas Söderlund)
- move the dummy dma ops for not DMA capable devices from arm64 to
common code (Robin Murphy)
- ensure dma_alloc_coherent returns zeroed memory to avoid kernel data
leaks through userspace. We already did this for most common
architectures, but this ensures we do it everywhere.
dma_zalloc_coherent has been deprecated and can hopefully be
removed after -rc1 with a coccinelle script.
-----BEGIN PGP SIGNATURE-----
iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAlwctQgLHGhjaEBsc3Qu
ZGUACgkQD55TZVIEUYMxgQ//dBpAfS4/J76CdAbYry2zqgcOUU9hIrD6NHiEMWov
ltJxyvEl3LsUmIdEj3aCrYL9jZN0qsnCzn5BVj2c3jDIVgD64fAr7HDf/PbEEfKb
j6/GgEnVLPZV+sQMvhNA5jOzHrkseaqPa4/pNLFZ/l8jnuZ2d+btusDWJpMoVDer
TXVwtIfgeIu0gTygYOShLYXd5qptWKWsZEpbTZOO2sE6+x+ZJX7yQYUxYDTlcOIj
JWVO2l5QNHPc5T9o2at+6L5aNUvnZOxT79sWgyZLn0Kc+FagKAVwfLqUEl0v7foG
8k/xca5/8p3afB1DfrIrtplJqis7cVgdyGxriwuuoO8X4F0nPyWwpGmxsBhrWwwl
xTqC4UorEJ7QwoP6Azopk/vYI2QXIUBLjuCJCuFXZj9+2BGf4IfvBY1S2cLM9qLs
HMcxQonuXJii044KEFS96ePEuiT+igVINweIFBKWcgNCEG0UQtyL6RQ1U5297ipF
JiWZAqD+p9X52UdKS+oKfAiZEekMXn6Xyo97+YCiNpfOo0GP5eEcwhL+JpY4AiRq
apPXtsRy2o1s8yfjdraUIM2Mc2n62vFKb35oUbGCd/QO9piPrFQHl6T0HHcHk4YR
XrUXcHieFZBCYqh7ZVa4RL8Msq1wvGuTL4Dxl43mXdsMoUFRR6eSNWLoAV4IpOLZ
WgA=
=in72
-----END PGP SIGNATURE-----
Merge tag 'dma-mapping-4.21' of git://git.infradead.org/users/hch/dma-mapping
Pull DMA mapping updates from Christoph Hellwig:
"A huge update this time, but a lot of that is just consolidating or
removing code:
- provide a common DMA_MAPPING_ERROR definition and avoid indirect
calls for dma_map_* error checking
- use direct calls for the DMA direct mapping case, avoiding huge
retpoline overhead for high performance workloads
- merge the swiotlb dma_map_ops into dma-direct
- provide a generic remapping DMA consistent allocator for
architectures that have devices that perform DMA that is not cache
coherent. Based on the existing arm64 implementation and also used
for csky now.
- improve the dma-debug infrastructure, including dynamic allocation
of entries (Robin Murphy)
- default to providing chaining scatterlist everywhere, with opt-outs
for the few architectures (alpha, parisc, most arm32 variants) that
can't cope with it
- misc sparc32 dma-related cleanups
- remove the dma_mark_clean arch hook used by swiotlb on ia64 and
replace it with the generic noncoherent infrastructure
- fix the return type of dma_set_max_seg_size (Niklas Söderlund)
- move the dummy dma ops for not DMA capable devices from arm64 to
common code (Robin Murphy)
- ensure dma_alloc_coherent returns zeroed memory to avoid kernel
data leaks through userspace. We already did this for most common
architectures, but this ensures we do it everywhere.
dma_zalloc_coherent has been deprecated and can hopefully be
removed after -rc1 with a coccinelle script"
* tag 'dma-mapping-4.21' of git://git.infradead.org/users/hch/dma-mapping: (73 commits)
dma-mapping: fix inverted logic in dma_supported
dma-mapping: deprecate dma_zalloc_coherent
dma-mapping: zero memory returned from dma_alloc_*
sparc/iommu: fix ->map_sg return value
sparc/io-unit: fix ->map_sg return value
arm64: default to the direct mapping in get_arch_dma_ops
PCI: Remove unused attr variable in pci_dma_configure
ia64: only select ARCH_HAS_DMA_COHERENT_TO_PFN if swiotlb is enabled
dma-mapping: bypass indirect calls for dma-direct
vmd: use the proper dma_* APIs instead of direct methods calls
dma-direct: merge swiotlb_dma_ops into the dma_direct code
dma-direct: use dma_direct_map_page to implement dma_direct_map_sg
dma-direct: improve addressability error reporting
swiotlb: remove dma_mark_clean
swiotlb: remove SWIOTLB_MAP_ERROR
ACPI / scan: Refactor _CCA enforcement
dma-mapping: factor out dummy DMA ops
dma-mapping: always build the direct mapping code
dma-mapping: move dma_cache_sync out of line
dma-mapping: move various slow path functions out of line
...
At Maintainer Summit, Greg brought up a topic I proposed around
EXPORT_SYMBOL_GPL usage. The motivation was considerations for when
EXPORT_SYMBOL_GPL is warranted and the criteria for taking the exceptional
step of reclassifying an existing export. Specifically, I wanted to make
the case that although the line is fuzzy and hard to specify in abstract
terms, it is nonetheless clear that devm_memremap_pages() and HMM
(Heterogeneous Memory Management) have crossed it. The
devm_memremap_pages() facility should have been EXPORT_SYMBOL_GPL from the
beginning, and HMM as a derivative of that functionality should have
naturally picked up that designation as well.
Contrary to typical rules, the HMM infrastructure was merged upstream with
zero in-tree consumers. There was a promise at the time that those users
would be merged "soon", but it has been over a year with no drivers
arriving. While the Nouveau driver is about to belatedly make good on
that promise it is clear that HMM was targeted first and foremost at an
out-of-tree consumer.
HMM is derived from devm_memremap_pages(), a facility Christoph and I
spearheaded to support persistent memory. It combines a device lifetime
model with a dynamically created 'struct page' / memmap array for any
physical address range. It enables coordination and control of the many
code paths in the kernel built to interact with memory via 'struct page'
objects. With HMM the integration goes even deeper by allowing device
drivers to hook and manipulate page fault and page free events.
One interpretation of when EXPORT_SYMBOL is suitable is when it is
exporting stable and generic leaf functionality. The
devm_memremap_pages() facility continues to see expanding use cases,
peer-to-peer DMA being the most recent, with no clear end date when it
will stop attracting reworks and semantic changes. It is not suitable to
export devm_memremap_pages() as a stable 3rd party driver API due to the
fact that it is still changing and manipulates core behavior. Moreover,
it is not in the best interest of the long term development of the core
memory management subsystem to permit any external driver to effectively
define its own system-wide memory management policies with no
encouragement to engage with upstream.
I am also concerned that HMM was designed in a way to minimize further
engagement with the core-MM. That, with these hooks in place,
device-drivers are free to implement their own policies without much
consideration for whether and how the core-MM could grow to meet that
need. Going forward not only should HMM be EXPORT_SYMBOL_GPL, but the
core-MM should be allowed the opportunity and stimulus to change and
address these new use cases as first class functionality.
Original changelog:
hmm_devmem_add(), and hmm_devmem_add_resource() duplicated
devm_memremap_pages() and are now simple now wrappers around the core
facility to inject a dev_pagemap instance into the global pgmap_radix and
hook page-idle events. The devm_memremap_pages() interface is base
infrastructure for HMM. HMM has more and deeper ties into the kernel
memory management implementation than base ZONE_DEVICE which is itself a
EXPORT_SYMBOL_GPL facility.
Originally, the HMM page structure creation routines copied the
devm_memremap_pages() code and reused ZONE_DEVICE. A cleanup to unify the
implementations was discussed during the initial review:
http://lkml.iu.edu/hypermail/linux/kernel/1701.2/00812.html Recent work to
extend devm_memremap_pages() for the peer-to-peer-DMA facility enabled
this cleanup to move forward.
In addition to the integration with devm_memremap_pages() HMM depends on
other GPL-only symbols:
mmu_notifier_unregister_no_release
percpu_ref
region_intersects
__class_create
It goes further to consume / indirectly expose functionality that is not
exported to any other driver:
alloc_pages_vma
walk_page_range
HMM is derived from devm_memremap_pages(), and extends deep core-kernel
fundamentals. Similar to devm_memremap_pages(), mark its entry points
EXPORT_SYMBOL_GPL().
[logang@deltatee.com: PCI/P2PDMA: match interface changes to devm_memremap_pages()]
Link: http://lkml.kernel.org/r/20181130225911.2900-1-logang@deltatee.com
Link: http://lkml.kernel.org/r/154275560565.76910.15919297436557795278.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: Balbir Singh <bsingharora@gmail.com>,
Cc: Michal Hocko <mhocko@suse.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull irq updates from Thomas Gleixner:
"The interrupt department provides:
Core updates:
- Better spreading to NUMA nodes in the affinity management
- Support for more than one set of interrupts to spread out to allow
separate queues for separate functionality of a single device.
- Decouple the non queue interrupts from being managed. Those are
usually general interrupts for error handling etc. and those should
never be shut down. This also a preparation to utilize the
spreading mechanism for initial spreading of non-managed interrupts
later.
- Make the single CPU target selection in the matrix allocator more
balanced so interrupts won't accumulate on single CPUs in certain
situations.
- A large spell checking patch so we don't end up fixing single typos
over and over.
Driver updates:
- A bunch of new irqchip drivers (RDA8810PL, Madera, imx-irqsteer)
- Updates for the 8MQ, F1C100s platform drivers
- A number of SPDX cleanups
- A workaround for a very broken GICv3 implementation on msm8996
which sports a botched register set.
- A platform-msi fix to prevent memory leakage
- Various cleanups"
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (37 commits)
genirq/affinity: Add is_managed to struct irq_affinity_desc
genirq/core: Introduce struct irq_affinity_desc
genirq/affinity: Remove excess indentation
irqchip/stm32: protect configuration registers with hwspinlock
dt-bindings: interrupt-controller: stm32: Document hwlock properties
irqchip: Add driver for imx-irqsteer controller
dt-bindings/irq: Add binding for Freescale IRQSTEER multiplexer
irqchip: Add driver for Cirrus Logic Madera codecs
genirq: Fix various typos in comments
irqchip/irq-imx-gpcv2: Add IRQCHIP_DECLARE for i.MX8MQ compatible
irqchip/irq-rda-intc: Fix return value check in rda8810_intc_init()
irqchip/irq-imx-gpcv2: Silence "fall through" warning
irqchip/gic-v3: Add quirk for msm8996 broken registers
irqchip/gic: Add support to device tree based quirks
dt-bindings/gic-v3: Add msm8996 compatible string
irqchip/sun4i: Add support for Allwinner ARMv5 F1C100s
irqchip/sun4i: Move IC specific register offsets to struct
irqchip/sun4i: Add a struct to hold global variables
dt-bindings: interrupt-controller: Add suniv interrupt-controller
irqchip: Add RDA8810PL interrupt driver
...
- Update the ACPICA code in the kernel to the 20181213 upstream
revision including:
* New Windows _OSI strings (Bob Moore, Jung-uk Kim).
* Buffers-to-string conversions update (Bob Moore).
* Removal of support for expressions in package elements (Bob
Moore).
* New option to display method/object evaluation in debug output
(Bob Moore).
* Compiler improvements (Bob Moore, Erik Schmauss).
* Minor debugger fix (Erik Schmauss).
* Disassembler improvement (Erik Schmauss).
* Assorted cleanups (Bob Moore, Colin Ian King, Erik Schmauss).
- Add support for a new OEM _OSI string to indicate special handling
of secondary graphics adapters on some systems (Alex Hung).
- Make it possible to build the ACPI subystem without PCI support
(Sinan Kaya).
- Make the SPCR table handling regard baud rate 0 in accordance with
the specification of it and make the DSDT override code support
DSDT code names generated by recent ACPICA (Andy Shevchenko, Wang
Dongsheng, Nathan Chancellor).
- Add clock frequency for Hisilicon Hip08 SPI controller to the ACPI
driver for AMD SoCs (APD) (Jay Fang).
- Fix the PM handling during device init in the ACPI driver for
Intel SoCs (LPSS) (Hans de Goede).
- Avoid double panic()s by clearing the APEI GHES block_status
before panic() (Lenny Szubowicz).
- Clean up a function invocation in the ACPI core and get rid of
some code duplication by using the DEFINE_SHOW_ATTRIBUTE macro
in the APEI support code (Alexey Dobriyan, Yangtao Li).
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJcHMSBAAoJEILEb/54YlRxZmEQAIbRXKOwvvt3my9HLBC/6V1u
+Wed0yNBQ9HkVWQzFuppDq97/kk5DRODnPNu9RaeS7QXxVOBfwElinm8NhzVI7Fm
FP5iPwnNq8EAkDTBOoG139Fs82EkaVSa2x9FHy84Jge3BXmauQM13bWP/kF5TjCn
Frjuh0TfhQ+ub853GisAr/SW7ixCWp81FZaW/xFcDuJU2E6AvjNQusdiAocgAqQ8
rnl8D0gjSW6m6HcauaTizRMXOIyePkfT86xQKwU7259ByRW20iQtsl/6+Rnyy3wG
cCrlsaHd0bP6qwVAQyh6cURq8hdLAUYI9tzBW0EL+UEpJ289j51s+RSh2nZNyIKO
wfbr2DdK3aaWcUygSxoP4FFHqINch/IRwaP2huT9szO1yLCikAN8Xmrb1BPZvOIK
m6Lywb1B+SOfGgJl4Z1GjzIc6dimrXVbgxjN1+Bpe1NeKqe/M6vMdbcvPIsMs7b8
iE/1gJPeJ5pvAgsQiWncZvyaOKaSmrLWbaw/ITQnNXVLDlTI3hIQExiPPl5hJ00v
Z4egVMdCCxYqZxxkZKEYnEe/lb9BRAMIvbkkocPBdmtNAWPuVnCqdR26BppaEt7i
r2tnEd84aISCDcBc2sIpo/pVUwncw5GtK20z8Ke+3rlg8lDZ0hAdHQWgBtj4xnnw
grImzXnKvSdajfZnvjRg
=yxXc
-----END PGP SIGNATURE-----
Merge tag 'acpi-4.21-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI updates from Rafael Wysocki:
"These update the ACPICA code in the kernel to the 20181213 upstream
revision, make it possible to build the ACPI subsystem without PCI
support, and a new OEM _OSI string, add a new device support to the
ACPI driver for AMD SoCs and fix PM handling in the ACPI driver for
Intel SoCs, fix the SPCR table handling and do some assorted fixes and
cleanups.
Specifics:
- Update the ACPICA code in the kernel to the 20181213 upstream
revision including:
* New Windows _OSI strings (Bob Moore, Jung-uk Kim).
* Buffers-to-string conversions update (Bob Moore).
* Removal of support for expressions in package elements (Bob
Moore).
* New option to display method/object evaluation in debug output
(Bob Moore).
* Compiler improvements (Bob Moore, Erik Schmauss).
* Minor debugger fix (Erik Schmauss).
* Disassembler improvement (Erik Schmauss).
* Assorted cleanups (Bob Moore, Colin Ian King, Erik Schmauss).
- Add support for a new OEM _OSI string to indicate special handling
of secondary graphics adapters on some systems (Alex Hung).
- Make it possible to build the ACPI subystem without PCI support
(Sinan Kaya).
- Make the SPCR table handling regard baud rate 0 in accordance with
the specification of it and make the DSDT override code support
DSDT code names generated by recent ACPICA (Andy Shevchenko, Wang
Dongsheng, Nathan Chancellor).
- Add clock frequency for Hisilicon Hip08 SPI controller to the ACPI
driver for AMD SoCs (APD) (Jay Fang).
- Fix the PM handling during device init in the ACPI driver for Intel
SoCs (LPSS) (Hans de Goede).
- Avoid double panic()s by clearing the APEI GHES block_status before
panic() (Lenny Szubowicz).
- Clean up a function invocation in the ACPI core and get rid of some
code duplication by using the DEFINE_SHOW_ATTRIBUTE macro in the
APEI support code (Alexey Dobriyan, Yangtao Li)"
* tag 'acpi-4.21-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (31 commits)
ACPI / tables: Add an ifdef around amlcode and dsdt_amlcode
ACPI/APEI: Clear GHES block_status before panic()
ACPI: Make PCI slot detection driver depend on PCI
ACPI/IORT: Stub out ACS functions when CONFIG_PCI is not set
arm64: select ACPI PCI code only when both features are enabled
PCI/ACPI: Allow ACPI to be built without CONFIG_PCI set
ACPICA: Remove PCI bits from ACPICA when CONFIG_PCI is unset
ACPI: Allow CONFIG_PCI to be unset for reboot
ACPI: Move PCI reset to a separate function
ACPI / OSI: Add OEM _OSI string to enable dGPU direct output
ACPI / tables: add DSDT AmlCode new declaration name support
ACPICA: Update version to 20181213
ACPICA: change coding style to match ACPICA, no functional change
ACPICA: Debug output: Add option to display method/object evaluation
ACPICA: disassembler: disassemble OEMx tables as AML
ACPICA: Add "Windows 2018.2" string in the _OSI support
ACPICA: Expressions in package elements are not supported
ACPICA: Update buffer-to-string conversions
ACPICA: add comments, no functional change
ACPICA: Remove defines that use deprecated flag
...
We are compiling PCI code today for systems with ACPI and no PCI
device present. Remove the useless code and reduce the tight
dependency.
Signed-off-by: Sinan Kaya <okaya@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com> # PCI parts
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The interrupt affinity management uses straight cpumask pointers to convey
the automatically assigned affinity masks for managed interrupts. The core
interrupt descriptor allocation also decides based on the pointer being non
NULL whether an interrupt is managed or not.
Devices which use managed interrupts usually have two classes of
interrupts:
- Interrupts for multiple device queues
- Interrupts for general device management
Currently both classes are treated the same way, i.e. as managed
interrupts. The general interrupts get the default affinity mask assigned
while the device queue interrupts are spread out over the possible CPUs.
Treating the general interrupts as managed is both a limitation and under
certain circumstances a bug. Assume the following situation:
default_irq_affinity = 4..7
So if CPUs 4-7 are offlined, then the core code will shut down the device
management interrupts because the last CPU in their affinity mask went
offline.
It's also a limitation because it's desired to allow manual placement of
the general device interrupts for various reasons. If they are marked
managed then the interrupt affinity setting from both user and kernel space
is disabled.
To remedy that situation it's required to convey more information than the
cpumasks through various interfaces related to interrupt descriptor
allocation.
Instead of adding yet another argument, create a new data structure
'irq_affinity_desc' which for now just contains the cpumask. This struct
can be expanded to convey auxilliary information in the next step.
No functional change, just preparatory work.
[ tglx: Simplified logic and clarified changelog ]
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Dou Liyang <douliyangs@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-pci@vger.kernel.org
Cc: kashyap.desai@broadcom.com
Cc: shivasharan.srikanteshwara@broadcom.com
Cc: sumit.saxena@broadcom.com
Cc: ming.lei@redhat.com
Cc: hch@lst.de
Cc: douliyang1@huawei.com
Link: https://lkml.kernel.org/r/20181204155122.6327-2-douliyangs@gmail.com
This introduces specific glue layer for UniPhier platform to support
PCIe host controller that is based on the DesignWare PCIe core, and
this driver supports Root Complex (host) mode.
Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
The Amlogic Meson PCIe host controller is based on the Synopsys DesignWare
PCI core. This patch adds the driver support for Meson PCIe controller.
Link: https://lore.kernel.org/linux-pci/20181218224708.GB22610@google.com/
Signed-off-by: Yue Wang <yue.wang@amlogic.com>
Signed-off-by: Hanjie Lin <hanjie.lin@amlogic.com>
[lorenzo.pieralisi@arm.com: updated coding/comment style]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
The "lane" variant in struct mtk_pcie_port is not used, remove it.
Signed-off-by: Honghui Zhang <honghui.zhang@mediatek.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
The DWC PCIe core contains various separate register spaces: DBI, DBI2,
ATU, DMA, etc. The relationship between the addresses of these register
spaces is entirely determined by the implementation of the IP block, not
by the IP block design itself. Hence, the DWC driver must not make
assumptions that one register space can be accessed at a fixed offset from
any other register space. To avoid such assumptions, introduce an
explicit/separate register pointer for the ATU register space. In
particular, the current assumption is not valid for NVIDIA's T194 SoC.
The ATU register space is only used on systems that require unrolled ATU
access. This property is detected at run-time for host controllers, and
when this is detected, this patch provides a default value for atu_base
that matches the previous assumption re: register layout. An alternative
would be to update all drivers for HW that requires unrolled access to
explicitly set atu_base. However, it's hard to tell which drivers would
require atu_base to be set. The unrolled property is not detected for
endpoint systems, and so any endpoint driver that requires unrolled access
must explicitly set the iatu_unroll_enabled flag (none do at present), and
so a check is added to require the driver to also set atu_base while at
it.
Signed-off-by: Stephen Warren <swarren@nvidia.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Acked-by: Vidya Sagar <vidyas@nvidia.com>
Enable PCI suspend/resume support on imx6sx SOCs. This is similar to
imx7d with a few differences:
* The PM_Turn_Off bit is exposed through an IOMUX GPR, like all other
pcie control bits on 6sx.
* The pcie_inbound_axi clk needs to be turned off in suspend. On resume
it is restored via resume -> deassert_core_reset -> enable_ref_clk.
Most of the resume logic is shared with the initial reset after probe.
Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Acked-by: Lucas Stach <l.stach@pengutronix.de>
Add support for the gpio reset signal binding as described in the
designware-pcie.txt DT binding document. Both the documented
'reset-gpio' property name and the more standard 'reset-gpios' name are
supported.
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
The IMX6 PCI-e host driver also supports the IMX7d. However, the
Kconfig dependencies of the driver prevented it from being enabled
unless the kernel was built with both IMX6 and IMX7 support.
It works fine to build with only IMX7 support enabled therefore
adjust the Kconfig entry to allow this configuration.
Signed-off-by: Trent Piepho <tpiepho@impinj.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Cc: Andrey Smirnov <andrew.smirnov@gmail.com>
Constify driver data since they do not get changed at runtime.
Signed-off-by: Stefan Agner <stefan@agner.ch>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
On some chips the PCIe and PCIE_PHY blocks are in separate power domains
which can be power-gated independently. The PCI driver needs to handle
this by keeping both domain active.
This is intended for imx6sx where PCIe is in DISPLAY and PCIE_PHY in
its own domain. Defining the DISPLAY domain requires a way for PCIe to
keep it active or it will break when displays are off.
The power-domains on imx6sx are meant to look like this:
power-domains = <&pd_disp>, <&pd_pci>;
power-domain-names = "pcie", "pcie_phy";
Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
[lorenzo.pieralisi@arm.com: updated commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Synopsys USB 3.x host HAPS platform has a class code of
PCI_CLASS_SERIAL_USB_XHCI, and xhci driver can claim it. However, these
devices should use dwc3-haps driver. Change these devices' class code to
PCI_CLASS_SERIAL_USB_DEVICE to prevent the xhci-pci driver from claiming
them.
Signed-off-by: Thinh Nguyen <thinhn@synopsys.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
ASPM does not make use of the children or link LIST_HEADs declared in
struct pcie_link_state and defined in alloc_pcie_link_state(). Therefore,
remove these lists.
No functional change intended.
Signed-off-by: Frederick Lawler <fred@fredlawl.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Clang warns:
drivers/pci/pci-driver.c:1603:21: error: unused variable 'attr'
[-Werror,-Wunused-variable]
Commit e5361ca29f ("ACPI / scan: Refactor _CCA enforcement") removed
attr's use and replaced it with its assigned value so it is no longer
needed.
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
ecae65e133 ("PCI/AER: Use kfifo_in_spinlocked() to insert locked
elements") replaced kfifo_put() with kfifo_in_spinlocked(), but passed the
*size* of the queue entry, where kfifo_in_spinlocked() expects the *number*
of entries to be copied.
We want to insert only one element into kfifo, not "sizeof(entry) = 16".
Without this patch, we would get 15 uninitialized elements.
Fixes: ecae65e133 ("PCI/AER: Use kfifo_in_spinlocked() to insert locked elements")
Signed-off-by: Yanjiang Jin <yanjiang.jin@hxt-semitech.com>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
With the bypass support for the direct mapping we might not always have
methods to call, so use the proper APIs instead. The only downside is
that we will create two dma-debug entries for each mapping if
CONFIG_DMA_DEBUG is enabled.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Tested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Tested-by: Tony Luck <tony.luck@intel.com>
Rather than checking the DMA attribute at each callsite, just pass it
through for acpi_dma_configure() to handle directly. That can then deal
with the relatively exceptional DEV_DMA_NOT_SUPPORTED case by explicitly
installing dummy DMA ops instead of just skipping setup entirely. This
will then free up the dev->dma_ops == NULL case for some valuable
fastpath optimisations.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Tested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Tony Luck <tony.luck@intel.com>
MRPC normal mode requires the host to read the MRPC command status and
output data from BAR. This results in high latency responses from the
Memory Read TLP and potential Completion Timeout (CTO).
Add support for MRPC DMA mode, including related macro definitions and data
structures and code to:
* Retrieve MRPC DMA mode version from adapter firmware
* Allocate DMA buffer, register ISR, and enable DMA during init
* Check MRPC execution status and get execution results from DMA buffer
* Release DMA buffer and disable DMA function when unloading module
MRPC DMA mode is a new feature of firmware, and the driver will fall back
to MRPC normal mode if there is no support in the legacy firmware.
Add a module parameter, "use_dma_mrpc", to select between MRPC DMA mode and
MRPC normal mode. Since the driver automatically detects DMA support in
the firmware, this parameter is just for debugging and testing.
Include <linux/io-64-nonatomic-lo-hi.h> so that readq/writeq is replaced by
two readl/writel on systems that do not support it.
Signed-off-by: Wesley Sheng <wesley.sheng@microchip.com>
[bhelgaas: changelog, simplify dma_ver check]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
The MRPC Input buffer is mostly memory without any side effects, so we
can improve the access time by enabling write combining on this region
of the BAR.
In a few places, we still need to flush the WC buffer. To do this, we
simply read from the Outbound Doorbell register because reads to this
register are processed by low latency hardware.
Signed-off-by: Kelvin Cao <kelvin.cao@microchip.com>
Signed-off-by: Wesley Sheng <wesley.sheng@microchip.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
In the ioctl_event_ctl() SWITCHTEC_IOCTL_EVENT_IDX_ALL case, we call
event_ctl() several times with the same "ctl" struct. Each call clobbers
ctl.flags, which leads to the problem that we may not actually enable or
disable all events as the user requested.
Preserve the event flag value with a temporary variable.
Fixes: 52eabba5bc ("switchtec: Add IOCTLs to the Switchtec driver")
Signed-off-by: Joey Zhang <joey.zhang@microchip.com>
Signed-off-by: Wesley Sheng <wesley.sheng@microchip.com>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Switchtec hardware supports 64-bit DMA, so set the correct DMA mask. This
allows the CMA to allocate larger buffers for memory windows.
Signed-off-by: Boris Glimcher <Boris.Glimcher@emc.com>
Signed-off-by: Wesley Sheng <wesley.sheng@microchip.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
After submitting a Firmware Download MRPC command, Switchtec firmware will
delay Management EP BAR MemRd TLP responses by more than 10ms. This is a
firmware limitation. Delayed MemRd completions are a problem for systems
with a low Completion Timeout (CTO).
The current driver checks the MRPC status immediately after submitting an
MRPC command, which results in a delayed MemRd completion that may cause a
Completion Timeout.
Remove the immediate status check and rely on the check after receiving an
interrupt or timing out.
This is only a software workaround to the READ issue and a proper fix of
this should be done in firmware.
Fixes: 080b47def5 ("MicroSemi Switchtec management interface driver")
Signed-off-by: Kelvin Cao <kelvin.cao@microchip.com>
Signed-off-by: Wesley Sheng <wesley.sheng@microchip.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
a9c8088c79 ("i2c: i801: Don't restore config registers on runtime PM")
nullified the runtime PM suspend/resume callback pointers while keeping the
runtime PM enabled.
This caused the SMBus PCI device to stay in D0 with
/sys/devices/.../power/runtime_status showing "error" when the runtime PM
framework attempted to autosuspend the device. This is due to PCI bus
runtime PM, which checks for driver runtime PM callbacks and returns
-ENOSYS if they are not set.
Since i2c-i801.c doesn't need to do anything device-specific for runtime
PM, Jean Delvare proposed this be fixed in the PCI core rather than adding
dummy runtime PM callback functions in the PCI drivers.
Change pci_pm_runtime_suspend()/pci_pm_runtime_resume() so they allow
changing the PCI device power state during runtime PM transitions even if
the driver supplies no runtime PM callbacks.
This fixes the runtime PM regression on i2c-i801.c.
It is not obvious why the code previously required the runtime PM
callbacks. The test has been there since the code was introduced by
6cbf82148f ("PCI PM: Run-time callbacks for PCI bus type").
On the other hand, a similar change was done to generic runtime PM
callbacks in 05aa55dddb ("PM / Runtime: Lenient generic runtime pm
callbacks").
Fixes: a9c8088c79 ("i2c: i801: Don't restore config registers on runtime PM")
Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: stable@vger.kernel.org # v4.18+
Fix typos, spellos, and grammar in p2pdma.rst and p2pdma.c.
Fix return value(s) in function pci_p2pmem_alloc_sgl().
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Logan Gunthorpe <logang@deltatee.com>
Cc: Jonathan Corbet <corbet@lwn.net>
The write to the status register is really an ACK for the HW,
and should be treated as such by the driver. Let's move it to the
irq_ack() callback, which will prevent people from moving it around
in order to paper over other bugs.
Fixes: 8c934095fa ("PCI: dwc: Clear MSI interrupt status after it is handled,
not before")
Fixes: 7c5925afbc ("PCI: dwc: Move MSI IRQs allocation to IRQ domains
hierarchical API")
Link: https://lore.kernel.org/linux-pci/20181113225734.8026-1-marc.zyngier@arm.com/
Reported-by: Trent Piepho <tpiepho@impinj.com>
Tested-by: Niklas Cassel <niklas.cassel@linaro.org>
Tested-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Tested-by: Stanimir Varbanov <svarbanov@mm-sol.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
[lorenzo.pieralisi@arm.com: updated commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: stable@vger.kernel.org
Bizarrely, there is no lock taken in the irq_ack() helper. This
puts the ACK callback provided by a specific platform in a awkward
situation where there is no synchronization that would be expected
on other callback.
Introduce the required lock, giving some level of uniformity among
callbacks.
Fixes: 7c5925afbc ("PCI: dwc: Move MSI IRQs allocation to IRQ domains
hierarchical API")
Link: https://lore.kernel.org/linux-pci/20181113225734.8026-1-marc.zyngier@arm.com/
Tested-by: Niklas Cassel <niklas.cassel@linaro.org>
Tested-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Tested-by: Stanimir Varbanov <svarbanov@mm-sol.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
[lorenzo.pieralisi@arm.com: updated commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: stable@vger.kernel.org
The dwc driver is showing an interesting level of brokeness, as it
insists on using the enable/disable set of registers to mask/unmask
MSIs, meaning that an MSIs being generated while the interrupt is in
that "disabled" state will simply be lost.
Let's move to the mask/unmask set of registers, which offers the
expected semantics.
Fixes: 7c5925afbc ("PCI: dwc: Move MSI IRQs allocation to IRQ domains
hierarchical API")
Link: https://lore.kernel.org/linux-pci/20181113225734.8026-1-marc.zyngier@arm.com/
Tested-by: Niklas Cassel <niklas.cassel@linaro.org>
Tested-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
Tested-by: Stanimir Varbanov <svarbanov@mm-sol.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
[lorenzo.pieralisi@arm.com: updated commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: stable@vger.kernel.org
-----BEGIN PGP SIGNATURE-----
iQJUBAABCgA+FiEEVTdhRGBbNzLrSUBaAP2fSd+ZWKAFAlwOLrYgHG1pa2Eud2Vz
dGVyYmVyZ0BsaW51eC5pbnRlbC5jb20ACgkQAP2fSd+ZWKCQyA//Zs4mAtgwa6ON
9gaYjVDkrAIs8jD1JHHBKpqCVvRdCZfnAtYT2Rk33SgRYFV0UmXoOL3HzMOHMTjO
2/S0SmO260UAQL7b4yPSb6tzCLSrW55rDS52EpQmJ8ncGD8l65tduwdA9gZEt+Kr
AhM5TK6nEQagRIcAQSmBlJMkDWNy2wvxTOebQv3C9woGSK7TFMvhCfZaLV9hpi89
ThdQtGLsGnYyzSw9tvEAwrsX96mWr2sdMV392SIgXEs+P3NtphTPvNM33Jo48l36
aFbhQwHEy6vtV6sC1va8NC/XQgLCK3DSx9R2/s+dZnZTXF4w14X+7KvNhQM8YpPB
OXPQXIsjpz/APBWULoPy6BX2TtzJUy0upGm/4B0kYBCFF1qmbFIeOi6beaXTFGzz
80qBpv6XUk/P/kGs3FTt3FfARjmHYnYuQhP90wFgoelMOHmBatz0YQYUYVeNe5ew
5itFeXgm3PkSnibxu0KBJCQe32SmNgMiGctgEKirAbK6Ibdp0bvj8NlZuCdoF5S/
1z6L4GL9zoPrxTRKMZlxlvexe9Lr+qRqiG8fG+Xt/WRZFauD0k7l7RvTqzxDC+Di
PnlrryDDNDEnSSlNjHdEqNMg8B1QrQ/5e0GAUoKNFVVTBLfc1oChaxAfzDjhRRgU
4l2irxVwQSP76ZCvPuF5NOfEfI58Buo=
=tqRs
-----END PGP SIGNATURE-----
Merge tag 'thunderbolt-for-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt into char-misc-next
Mika writes:
thunderbolt: Changes for v4.21 merge window
* tag 'thunderbolt-for-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt:
thunderbolt: Export IOMMU based DMA protection support to userspace
iommu/vt-d: Do not enable ATS for untrusted devices
iommu/vt-d: Force IOMMU on for platform opt in hint
PCI / ACPI: Identify untrusted PCI devices
This file makes use of definitions provided in <linux/pci.h>. This only
compiles when <linux/pci.h> is included beforehand, and creates a nasty
include dependency. Instead, just include the correct file.
Signed-off-by: Alexandru Gagniuc <mr.nuke.me@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Make spacing more consistent in the code for function pointer declarations
based on checkpatch.pl.
Signed-off-by: Benjamin Young <youngcdev@gmail.com>
[bhelgaas: make similar changes in include/linux/pci.h]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
No users left except for vmd which just forwards it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
A malicious PCI device may use DMA to attack the system. An external
Thunderbolt port is a convenient point to attach such a device. The OS
may use IOMMU to defend against DMA attacks.
Some BIOSes mark these externally facing root ports with this
ACPI _DSD [1]:
Name (_DSD, Package () {
ToUUID ("efcc06cc-73ac-4bc3-bff0-76143807c389"),
Package () {
Package () {"ExternalFacingPort", 1},
Package () {"UID", 0 }
}
})
If we find such a root port, mark it and all its children as untrusted.
The rest of the OS may use this information to enable DMA protection
against malicious devices. For instance the device may be put behind an
IOMMU to keep it from accessing memory outside of what the driver has
allocated for it.
While at it, add a comment on top of prp_guids array explaining the
possible caveat resulting when these GUIDs are treated equivalent.
[1] https://docs.microsoft.com/en-us/windows-hardware/drivers/pci/dsd-for-pcie-root-ports#identifying-externally-exposed-pcie-root-ports
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
This reverts commit 17c9148736.
Rafael found that this commit broke the SD card reader in his
Acer Aspire S5. Details of the problem are in the bugzilla below.
Fixes: 17c9148736 ("PCI/ASPM: Do not initialize link state when aspm_disabled is set")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=201801
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The macros PCI_EXP_LNKCAP_SLS_*GB are values, not bit masks. We must mask
the register and compare it against them.
This fixes errors like this:
amdgpu: [powerplay] failed to send message 261 ret is 0
when a PCIe-v3 card is plugged into a PCIe-v1 slot, because the slot is
being incorrectly reported as PCIe-v3 capable.
6cf57be0f7, which appeared in v4.17, added pcie_get_speed_cap() with the
incorrect test of PCI_EXP_LNKCAP_SLS as a bitmask. 5d9a633040, which
appeared in v4.19, changed amdgpu to use pcie_get_speed_cap(), so the
amdgpu bug reports below are regressions in v4.19.
Fixes: 6cf57be0f7 ("PCI: Add pcie_get_speed_cap() to find max supported link speed")
Fixes: 5d9a633040 ("drm/amdgpu: use pcie functions for link width and speed")
Link: https://bugs.freedesktop.org/show_bug.cgi?id=108704
Link: https://bugs.freedesktop.org/show_bug.cgi?id=108778
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
[bhelgaas: update comment, remove use of PCI_EXP_LNKCAP_SLS_8_0GB and
PCI_EXP_LNKCAP_SLS_16_0GB since those should be covered by PCI_EXP_LNKCAP2,
remove test of PCI_EXP_LNKCAP for zero, since that register is required]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # v4.17+
Use the devm_of_pci_get_host_bridge_resources() API in place of the PCI OF
DT parser.
Signed-off-by: Honghui Zhang <honghui.zhang@mediatek.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Ryder Lee <ryder.lee@mediatek.com>
Fix an error caused by 3-bit right rotation on offset address
calculation of MSI-X table in dw_pcie_ep_raise_msix_irq().
The initial testing code was setting by default the offset address of
MSI-X table to zero, so that even with a 3-bit right rotation the
computed result would still be zero and valid, therefore this bug went
unnoticed.
Fixes: beb4641a78 ("PCI: dwc: Add MSI-X callbacks handler")
Signed-off-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
[lorenzo.pieralisi@arm.com: updated commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: stable@vger.kernel.org
Let architectures select the syscall support instead of duplicating the
kconfig entry.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Move the definitions to drivers/pci and let the architectures select
them. Two small differences to before: PCI_DOMAINS_GENERIC now selects
PCI_DOMAINS, cutting down the churn for modern architectures. As the
only architectured arm did previously also offer PCI_DOMAINS as a user
visible choice in addition to selecting it from the relevant configs,
this is gone now.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Paul Burton <paul.burton@mips.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
There is no good reason to duplicate the PCI menu in every architecture.
Instead provide a selectable HAVE_PCI symbol that indicates availability
of PCI support, and a FORCE_PCI symbol to for PCI on and the handle the
rest in drivers/pci.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Palmer Dabbelt <palmer@sifive.com>
Acked-by: Max Filippov <jcmvbkbc@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Paul Burton <paul.burton@mips.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
The order of parameters is not correct when invoking the outbound
window disable routine. Fix it.
Fixes: 4a2745d760 ("PCI: layerscape: Disable outbound windows configured by bootloader")
Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com>
[lorenzo.pieralisi@arm.com: commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: stable@vger.kernel.org
This bug was introduced in the interaction for two commits on either
branch of the merge commit 562df5c852 ("Merge branch
'pci/host-designware' into next").
Commit 4d107d3b5a ("PCI: imx6: Move link up check into
imx6_pcie_wait_for_link()"), changed imx6_pcie_wait_for_link() to poll
the link status register directly, checking for link up and not
training, and made imx6_pcie_link_up() only check the link up bit (once,
not a polling loop).
While commit 886bc5ceb5 ("PCI: designware: Add generic
dw_pcie_wait_for_link()"), replaced the loop in
imx6_pcie_wait_for_link() with a call to a new dwc core function, which
polled imx6_pcie_link_up(), which still checked both link up and not
training in a loop.
When these two commits were merged, the version of
imx6_pcie_wait_for_link() from 886bc5ceb5 was kept, which eliminated
the link training check placed there by 4d107d3b5a. However, the
version of imx6_pcie_link_up() from 4d107d3b5a was kept, which
eliminated the link training check that had been there and was moved to
imx6_pcie_wait_for_link().
The result was the link training check got lost for the imx6 driver.
Eliminate imx6_pcie_link_up() so that the default handler,
dw_pcie_link_up(), is used instead. The default handler has the correct
code, which checks for link up and also that it still is not training,
fixing the regression.
Fixes: 562df5c852 ("Merge branch 'pci/host-designware' into next")
Signed-off-by: Trent Piepho <tpiepho@impinj.com>
[lorenzo.pieralisi@arm.com: rewrote the commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Joao Pinto <Joao.Pinto@synopsys.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Richard Zhu <hongxing.zhu@nxp.com>
The dw_pcie_host_ops structure is only stored in the ops field
of a pcie_port structure, and this field is const, so make the
dw_pcie_host_ops structure const as well.
Done with the help of Coccinelle.
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Tell users what a PCI PF is in the PCI_PF_STUB config help text.
Fixes: a8ccf8a666 ("PCI/IOV: Add pci-pf-stub driver for PFs that only enable VFs")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
A driver may have a need to allocate multiple sets of MSI/MSI-X interrupts,
and have them appropriately affinitized.
Add support for defining a number of sets in the irq_affinity structure, of
varying sizes, and get each set affinitized correctly across the machine.
[ tglx: Minor changelog tweaks ]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Cc: linux-block@vger.kernel.org
Link: https://lkml.kernel.org/r/20181102145951.31979-5-ming.lei@redhat.com
Pull XArray conversion from Matthew Wilcox:
"The XArray provides an improved interface to the radix tree data
structure, providing locking as part of the API, specifying GFP flags
at allocation time, eliminating preloading, less re-walking the tree,
more efficient iterations and not exposing RCU-protected pointers to
its users.
This patch set
1. Introduces the XArray implementation
2. Converts the pagecache to use it
3. Converts memremap to use it
The page cache is the most complex and important user of the radix
tree, so converting it was most important. Converting the memremap
code removes the only other user of the multiorder code, which allows
us to remove the radix tree code that supported it.
I have 40+ followup patches to convert many other users of the radix
tree over to the XArray, but I'd like to get this part in first. The
other conversions haven't been in linux-next and aren't suitable for
applying yet, but you can see them in the xarray-conv branch if you're
interested"
* 'xarray' of git://git.infradead.org/users/willy/linux-dax: (90 commits)
radix tree: Remove multiorder support
radix tree test: Convert multiorder tests to XArray
radix tree tests: Convert item_delete_rcu to XArray
radix tree tests: Convert item_kill_tree to XArray
radix tree tests: Move item_insert_order
radix tree test suite: Remove multiorder benchmarking
radix tree test suite: Remove __item_insert
memremap: Convert to XArray
xarray: Add range store functionality
xarray: Move multiorder_check to in-kernel tests
xarray: Move multiorder_shrink to kernel tests
xarray: Move multiorder account test in-kernel
radix tree test suite: Convert iteration test to XArray
radix tree test suite: Convert tag_tagged_items to XArray
radix tree: Remove radix_tree_clear_tags
radix tree: Remove radix_tree_maybe_preload_order
radix tree: Remove split/join code
radix tree: Remove radix_tree_update_node_t
page cache: Finish XArray conversion
dax: Convert page fault handlers to XArray
...
Notable changes:
- A large series to rewrite our SLB miss handling, replacing a lot of fairly
complicated asm with much fewer lines of C.
- Following on from that, we now maintain a cache of SLB entries for each
process and preload them on context switch. Leading to a 27% speedup for our
context switch benchmark on Power9.
- Improvements to our handling of SLB multi-hit errors. We now print more debug
information when they occur, and try to continue running by flushing the SLB
and reloading, rather than treating them as fatal.
- Enable THP migration on 64-bit Book3S machines (eg. Power7/8/9).
- Add support for physical memory up to 2PB in the linear mapping on 64-bit
Book3S. We only support up to 512TB as regular system memory, otherwise the
percpu allocator runs out of vmalloc space.
- Add stack protector support for 32 and 64-bit, with a per-task canary.
- Add support for PTRACE_SYSEMU and PTRACE_SYSEMU_SINGLESTEP.
- Support recognising "big cores" on Power9, where two SMT4 cores are presented
to us as a single SMT8 core.
- A large series to cleanup some of our ioremap handling and PTE flags.
- Add a driver for the PAPR SCM (storage class memory) interface, allowing
guests to operate on SCM devices (acked by Dan).
- Changes to our ftrace code to handle very large kernels, where we need to use
a trampoline to get to ftrace_caller().
Many other smaller enhancements and cleanups.
Thanks to:
Alan Modra, Alistair Popple, Aneesh Kumar K.V, Anton Blanchard, Aravinda
Prasad, Bartlomiej Zolnierkiewicz, Benjamin Herrenschmidt, Breno Leitao,
Cédric Le Goater, Christophe Leroy, Christophe Lombard, Dan Carpenter, Daniel
Axtens, Finn Thain, Gautham R. Shenoy, Gustavo Romero, Haren Myneni, Hari
Bathini, Jia Hongtao, Joel Stanley, John Allen, Laurent Dufour, Madhavan
Srinivasan, Mahesh Salgaonkar, Mark Hairgrove, Masahiro Yamada, Michael
Bringmann, Michael Neuling, Michal Suchanek, Murilo Opsfelder Araujo, Nathan
Fontenot, Naveen N. Rao, Nicholas Piggin, Nick Desaulniers, Oliver O'Halloran,
Paul Mackerras, Petr Vorel, Rashmica Gupta, Reza Arbab, Rob Herring, Sam
Bobroff, Samuel Mendoza-Jonas, Scott Wood, Stan Johnson, Stephen Rothwell,
Stewart Smith, Suraj Jitindar Singh, Tyrel Datwyler, Vaibhav Jain, Vasant
Hegde, YueHaibing, zhong jiang,
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJb01vTAAoJEFHr6jzI4aWADsEP/jqL3+2qxs098ra80tpXCpXJ
tgXCosEs4b35sGtyHeUWZZZfWXeisaPAIlP8zTx1n50HACZduDYRAl0Ew9XB7Xdw
enDHRVccD21FsmHBOx/Ii1rVJlovWlj6EQCWHKeZmNjeRoFuClVZ7CYmf+mBifKR
sw2Db2fKA/59wMTq2zIMy5pqYgqlAs4jTWS6uN5hKPoBmO/82ARnNG+qgLuloD3Z
O8zSDM9QQ7PpuyDgTjO9SAo2YjmEfXlEG6cOCCejsU3DMctaEAK5PUZ+blsHYHBH
BYZYKs/x4pcw0SO41GtTh0M2YqDYBVuBIpRw8lLZap97Xo9ucSkAm5WD3rGxk4CY
YeZKEPUql6MHN3+DKl8mx2F0V+Et/tio2HNqc9KReR1tfoolZAbe+SFZHfgmc/Rq
RD9nnG8KRd4K2K1BTqpkTmI1EtE7jPtPJPSV8gMGhgL/N5vPmH3mql/qyOtYx48E
6/hPzWESgs16VRZ/opLh8VvjlY1HBDODQhehhhl+o23/Vb8qEgRf8Uqhq50rQW1H
EeOqyyYQ90txSU31Sgy1kQkvOgIFAsBObWT1ZCJ3RbfGbB4/tdEAvZqTZRlXo2OY
7P0Sqcw/9Le5eJkHIlLtBv0TF7y1OYemCbLgRQzFlcRP+UKtYyg8eFnFjqbPEEmP
ulwhn/BfFVSgaYKQ503u
=I0pj
-----END PGP SIGNATURE-----
Merge tag 'powerpc-4.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
"Notable changes:
- A large series to rewrite our SLB miss handling, replacing a lot of
fairly complicated asm with much fewer lines of C.
- Following on from that, we now maintain a cache of SLB entries for
each process and preload them on context switch. Leading to a 27%
speedup for our context switch benchmark on Power9.
- Improvements to our handling of SLB multi-hit errors. We now print
more debug information when they occur, and try to continue running
by flushing the SLB and reloading, rather than treating them as
fatal.
- Enable THP migration on 64-bit Book3S machines (eg. Power7/8/9).
- Add support for physical memory up to 2PB in the linear mapping on
64-bit Book3S. We only support up to 512TB as regular system
memory, otherwise the percpu allocator runs out of vmalloc space.
- Add stack protector support for 32 and 64-bit, with a per-task
canary.
- Add support for PTRACE_SYSEMU and PTRACE_SYSEMU_SINGLESTEP.
- Support recognising "big cores" on Power9, where two SMT4 cores are
presented to us as a single SMT8 core.
- A large series to cleanup some of our ioremap handling and PTE
flags.
- Add a driver for the PAPR SCM (storage class memory) interface,
allowing guests to operate on SCM devices (acked by Dan).
- Changes to our ftrace code to handle very large kernels, where we
need to use a trampoline to get to ftrace_caller().
And many other smaller enhancements and cleanups.
Thanks to: Alan Modra, Alistair Popple, Aneesh Kumar K.V, Anton
Blanchard, Aravinda Prasad, Bartlomiej Zolnierkiewicz, Benjamin
Herrenschmidt, Breno Leitao, Cédric Le Goater, Christophe Leroy,
Christophe Lombard, Dan Carpenter, Daniel Axtens, Finn Thain, Gautham
R. Shenoy, Gustavo Romero, Haren Myneni, Hari Bathini, Jia Hongtao,
Joel Stanley, John Allen, Laurent Dufour, Madhavan Srinivasan, Mahesh
Salgaonkar, Mark Hairgrove, Masahiro Yamada, Michael Bringmann,
Michael Neuling, Michal Suchanek, Murilo Opsfelder Araujo, Nathan
Fontenot, Naveen N. Rao, Nicholas Piggin, Nick Desaulniers, Oliver
O'Halloran, Paul Mackerras, Petr Vorel, Rashmica Gupta, Reza Arbab,
Rob Herring, Sam Bobroff, Samuel Mendoza-Jonas, Scott Wood, Stan
Johnson, Stephen Rothwell, Stewart Smith, Suraj Jitindar Singh, Tyrel
Datwyler, Vaibhav Jain, Vasant Hegde, YueHaibing, zhong jiang"
* tag 'powerpc-4.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (221 commits)
Revert "selftests/powerpc: Fix out-of-tree build errors"
powerpc/msi: Fix compile error on mpc83xx
powerpc: Fix stack protector crashes on CPU hotplug
powerpc/traps: restore recoverability of machine_check interrupts
powerpc/64/module: REL32 relocation range check
powerpc/64s/radix: Fix radix__flush_tlb_collapsed_pmd double flushing pmd
selftests/powerpc: Add a test of wild bctr
powerpc/mm: Fix page table dump to work on Radix
powerpc/mm/radix: Display if mappings are exec or not
powerpc/mm/radix: Simplify split mapping logic
powerpc/mm/radix: Remove the retry in the split mapping logic
powerpc/mm/radix: Fix small page at boundary when splitting
powerpc/mm/radix: Fix overuse of small pages in splitting logic
powerpc/mm/radix: Fix off-by-one in split mapping logic
powerpc/ftrace: Handle large kernel configs
powerpc/mm: Fix WARN_ON with THP NUMA migration
selftests/powerpc: Fix out-of-tree build errors
powerpc/time: no steal_time when CONFIG_PPC_SPLPAR is not selected
powerpc/time: Only set CONFIG_ARCH_HAS_SCALED_CPUTIME on PPC64
powerpc/time: isolate scaled cputime accounting in dedicated functions.
...
These updates bring:
- Debugfs support for the Intel VT-d driver. When enabled, it
now also exposes some of its internal data structures to
user-space for debugging purposes.
- ARM-SMMU driver now uses the generic deferred flushing
and fast-path iova allocation code. This is expected to be a
major performance improvement, as this allocation path scales
a lot better.
- Support for r8a7744 in the Renesas iommu driver
- Couple of minor fixes and improvements all over the place
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABAgAGBQJb0vixAAoJECvwRC2XARrj0lkQALur432cGae8225gLNG+Ab1B
lDGz/8uJeV4V552r58msq/yFpVascoMYOCgS+5N5J/jn5UiPnWxk//Uz2lvvCsFn
3Z4HswSbmNLSuEHmN3/1CK28An44LjYxtnH/zAEaHRJgWNmC05lO4glPXaSIBwVS
ve6ULymHJittCHFNNAstNBvMYirYV2y+FYxoq6EteTuCruNNXR78KQV7TqPYI+uZ
0DwaXUyxO+HZbVeLpOnj/WHZ6+EUY0cHwHuk8U6ZCHnINZ+k9knt+WUvYu7wPCtj
jGIyJXW5BG0rjJZnVUQs9BFXFSJLV2Ap8M3zKVIyFAUAyStEtGHct0YMRC29GX/J
e45GPbElAZqx1NWRGGTV0xTsH5Gn85S2nP3p7iiPhj5zUhX/6SreZBDQdC+brtsB
8HG85xohsUkVmRq/ez4hu0yqXtB66ppV7TcOjyixybG+ixRPtUwTbiaYUxbvkZTr
hcYUVLGcpJX463VjUKGoRPFL/jZ6BXUWdLVllZPYgDT+IBXtQx1TB20DDtj5V2mR
3m7B0xLQJDWdarhdA9Oj0FQj7ivmwmitcJ9EoNvHSRdEoE1iIy1vHv/7v/GokRVS
J1YT5ZYAsGHBgZIsL7FpVA37i9t3JPVvgakUV/ZfLDyG3v+P0+eS3gNhECYt5luS
D8G7Jy+2vsitO/ZCyu/r
=q1HJ
-----END PGP SIGNATURE-----
Merge tag 'iommu-updates-v4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU updates from Joerg Roedel:
- Debugfs support for the Intel VT-d driver.
When enabled, it now also exposes some of its internal data
structures to user-space for debugging purposes.
- ARM-SMMU driver now uses the generic deferred flushing and fast-path
iova allocation code.
This is expected to be a major performance improvement, as this
allocation path scales a lot better.
- Support for r8a7744 in the Renesas iommu driver
- Couple of minor fixes and improvements all over the place
* tag 'iommu-updates-v4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (39 commits)
iommu/arm-smmu-v3: Remove unnecessary wrapper function
iommu/arm-smmu-v3: Add SPDX header
iommu/amd: Add default branch in amd_iommu_capable()
dt-bindings: iommu: ipmmu-vmsa: Add r8a7744 support
iommu/amd: Move iommu_init_pci() to .init section
iommu/arm-smmu: Support non-strict mode
iommu/io-pgtable-arm-v7s: Add support for non-strict mode
iommu/arm-smmu-v3: Add support for non-strict mode
iommu/io-pgtable-arm: Add support for non-strict mode
iommu: Add "iommu.strict" command line option
iommu/dma: Add support for non-strict mode
iommu/arm-smmu: Ensure that page-table updates are visible before TLBI
iommu/arm-smmu-v3: Implement flush_iotlb_all hook
iommu/arm-smmu-v3: Avoid back-to-back CMD_SYNC operations
iommu/arm-smmu-v3: Fix unexpected CMD_SYNC timeout
iommu/io-pgtable-arm: Fix race handling in split_blk_unmap()
iommu/arm-smmu-v3: Fix a couple of minor comment typos
iommu: Fix a typo
iommu: Remove .domain_{get,set}_windows
iommu: Tidy up window attributes
...
-----BEGIN PGP SIGNATURE-----
iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAlvPV7IUHGJoZWxnYWFz
QGdvb2dsZS5jb20ACgkQWYigwDrT+vyaUg//WnCaRIu2oKOp8c/bplZJDW5eT10d
oYAN9qeyptU9RYrg4KBNbZL9UKGFTk3AoN5AUjrk8njxc/dY2ra/79esOvZyyYQy
qLXBvrXKg3yZnlNlnyBneGSnUVwv/kl2hZS+kmYby2YOa8AH/mhU0FIFvsnfRK2I
XvwABFm2ZYvXCqh3e5HXaHhOsR88NQ9In0AXVC7zHGqv1r/bMVn2YzPZHL/zzMrF
mS79tdBTH+shSvchH9zvfgIs+UEKvvjEJsG2liwMkcQaV41i5dZjSKTdJ3EaD/Y2
BreLxXRnRYGUkBqfcon16Yx+P6VCefDRLa+RhwYO3dxFF2N4ZpblbkIdBATwKLjL
npiGc6R8yFjTmZU0/7olMyMCm7igIBmDvWPcsKEE8R4PezwoQv6YKHBMwEaflIbl
Rv4IUqjJzmQPaA0KkRoAVgAKHxldaNqno/6G1FR2gwz+fr68p5WSYFlQ3axhvTjc
bBMJpB/fbp9WmpGJieTt6iMOI6V1pnCVjibM5ZON59WCFfytHGGpbYW05gtZEod4
d/3yRuU53JRSj3jQAQuF1B6qYhyxvv5YEtAQqIFeHaPZ67nL6agw09hE+TlXjWbE
rTQRShflQ+ydnzIfKicFgy6/53D5hq7iH2l7HwJVXbXRQ104T5DB/XHUUTr+UWQn
/Nkhov32/n6GjxQ=
=58I4
-----END PGP SIGNATURE-----
Merge tag 'pci-v4.20-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI updates from Bjorn Helgaas:
- Fix ASPM link_state teardown on removal (Lukas Wunner)
- Fix misleading _OSC ASPM message (Sinan Kaya)
- Make _OSC optional for PCI (Sinan Kaya)
- Don't initialize ASPM link state when ACPI_FADT_NO_ASPM is set
(Patrick Talbert)
- Remove x86 and arm64 node-local allocation for host bridge structures
(Punit Agrawal)
- Pay attention to device-specific _PXM node values (Jonathan Cameron)
- Support new Immediate Readiness bit (Felipe Balbi)
- Differentiate between pciehp surprise and safe removal (Lukas Wunner)
- Remove unnecessary pciehp includes (Lukas Wunner)
- Drop pciehp hotplug_slot_ops wrappers (Lukas Wunner)
- Tolerate PCIe Slot Presence Detect being hardwired to zero to
workaround broken hardware, e.g., the Wilocity switch/wireless device
(Lukas Wunner)
- Unify pciehp controller & slot structs (Lukas Wunner)
- Constify hotplug_slot_ops (Lukas Wunner)
- Drop hotplug_slot_info (Lukas Wunner)
- Embed hotplug_slot struct into users instead of allocating it
separately (Lukas Wunner)
- Initialize PCIe port service drivers directly instead of relying on
initcall ordering (Keith Busch)
- Restore PCI config state after a slot reset (Keith Busch)
- Save/restore DPC config state along with other PCI config state
(Keith Busch)
- Reference count devices during AER handling to avoid race issue with
concurrent hot removal (Keith Busch)
- If an Upstream Port reports ERR_FATAL, don't try to read the Port's
config space because it is probably unreachable (Keith Busch)
- During error handling, use slot-specific reset instead of secondary
bus reset to avoid link up/down issues on hotplug ports (Keith Busch)
- Restore previous AER/DPC handling that does not remove and
re-enumerate devices on ERR_FATAL (Keith Busch)
- Notify all drivers that may be affected by error recovery resets
(Keith Busch)
- Always generate error recovery uevents, even if a driver doesn't have
error callbacks (Keith Busch)
- Make PCIe link active reporting detection generic (Keith Busch)
- Support D3cold in PCIe hierarchies during system sleep and runtime,
including hotplug and Thunderbolt ports (Mika Westerberg)
- Handle hpmemsize/hpiosize kernel parameters uniformly, whether slots
are empty or occupied (Jon Derrick)
- Remove duplicated include from pci/pcie/err.c and unused variable
from cpqphp (YueHaibing)
- Remove driver pci_cleanup_aer_uncorrect_error_status() calls (Oza
Pawandeep)
- Uninline PCI bus accessors for better ftracing (Keith Busch)
- Remove unused AER Root Port .error_resume method (Keith Busch)
- Use kfifo in AER instead of a local version (Keith Busch)
- Use threaded IRQ in AER bottom half (Keith Busch)
- Use managed resources in AER core (Keith Busch)
- Reuse pcie_port_find_device() for AER injection (Keith Busch)
- Abstract AER interrupt handling to disconnect error injection (Keith
Busch)
- Refactor AER injection callbacks to simplify future improvments
(Keith Busch)
- Remove unused Netronome NFP32xx Device IDs (Jakub Kicinski)
- Use bitmap_zalloc() for dma_alias_mask (Andy Shevchenko)
- Add switch fall-through annotations (Gustavo A. R. Silva)
- Remove unused Switchtec quirk variable (Joshua Abraham)
- Fix pci.c kernel-doc warning (Randy Dunlap)
- Remove trivial PCI wrappers for DMA APIs (Christoph Hellwig)
- Add Intel GPU device IDs to spurious interrupt quirk (Bin Meng)
- Run Switchtec DMA aliasing quirk only on NTB endpoints to avoid
useless dmesg errors (Logan Gunthorpe)
- Update Switchtec NTB documentation (Wesley Yung)
- Remove redundant "default n" from Kconfig (Bartlomiej Zolnierkiewicz)
- Avoid panic when drivers enable MSI/MSI-X twice (Tonghao Zhang)
- Add PCI support for peer-to-peer DMA (Logan Gunthorpe)
- Add sysfs group for PCI peer-to-peer memory statistics (Logan
Gunthorpe)
- Add PCI peer-to-peer DMA scatterlist mapping interface (Logan
Gunthorpe)
- Add PCI configfs/sysfs helpers for use by peer-to-peer users (Logan
Gunthorpe)
- Add PCI peer-to-peer DMA driver writer's documentation (Logan
Gunthorpe)
- Add block layer flag to indicate driver support for PCI peer-to-peer
DMA (Logan Gunthorpe)
- Map Infiniband scatterlists for peer-to-peer DMA if they contain P2P
memory (Logan Gunthorpe)
- Register nvme-pci CMB buffer as PCI peer-to-peer memory (Logan
Gunthorpe)
- Add nvme-pci support for PCI peer-to-peer memory in requests (Logan
Gunthorpe)
- Use PCI peer-to-peer memory in nvme (Stephen Bates, Steve Wise,
Christoph Hellwig, Logan Gunthorpe)
- Cache VF config space size to optimize enumeration of many VFs
(KarimAllah Ahmed)
- Remove unnecessary <linux/pci-ats.h> include (Bjorn Helgaas)
- Fix VMD AERSID quirk Device ID matching (Jon Derrick)
- Fix Cadence PHY handling during probe (Alan Douglas)
- Signal Cadence Endpoint interrupts via AXI region 0 instead of last
region (Alan Douglas)
- Write Cadence Endpoint MSI interrupts with 32 bits of data (Alan
Douglas)
- Remove redundant controller tests for "device_type == pci" (Rob
Herring)
- Document R-Car E3 (R8A77990) bindings (Tho Vu)
- Add device tree support for R-Car r8a7744 (Biju Das)
- Drop unused mvebu PCIe capability code (Thomas Petazzoni)
- Add shared PCI bridge emulation code (Thomas Petazzoni)
- Convert mvebu to use shared PCI bridge emulation (Thomas Petazzoni)
- Add aardvark Root Port emulation (Thomas Petazzoni)
- Support 100MHz/200MHz refclocks for i.MX6 (Lucas Stach)
- Add initial power management for i.MX7 (Leonard Crestez)
- Add PME_Turn_Off support for i.MX7 (Leonard Crestez)
- Fix qcom runtime power management error handling (Bjorn Andersson)
- Update TI dra7xx unaligned access errata workaround for host mode as
well as endpoint mode (Vignesh R)
- Fix kirin section mismatch warning (Nathan Chancellor)
- Remove iproc PAXC slot check to allow VF support (Jitendra Bhivare)
- Quirk Keystone K2G to limit MRRS to 256 (Kishon Vijay Abraham I)
- Update Keystone to use MRRS quirk for host bridge instead of open
coding (Kishon Vijay Abraham I)
- Refactor Keystone link establishment (Kishon Vijay Abraham I)
- Simplify and speed up Keystone link training (Kishon Vijay Abraham I)
- Remove unused Keystone host_init argument (Kishon Vijay Abraham I)
- Merge Keystone driver files into one (Kishon Vijay Abraham I)
- Remove redundant Keystone platform_set_drvdata() (Kishon Vijay
Abraham I)
- Rename Keystone functions for uniformity (Kishon Vijay Abraham I)
- Add Keystone device control module DT binding (Kishon Vijay Abraham
I)
- Use SYSCON API to get Keystone control module device IDs (Kishon
Vijay Abraham I)
- Clean up Keystone PHY handling (Kishon Vijay Abraham I)
- Use runtime PM APIs to enable Keystone clock (Kishon Vijay Abraham I)
- Clean up Keystone config space access checks (Kishon Vijay Abraham I)
- Get Keystone outbound window count from DT (Kishon Vijay Abraham I)
- Clean up Keystone outbound window configuration (Kishon Vijay Abraham
I)
- Clean up Keystone DBI setup (Kishon Vijay Abraham I)
- Clean up Keystone ks_pcie_link_up() (Kishon Vijay Abraham I)
- Fix Keystone IRQ status checking (Kishon Vijay Abraham I)
- Add debug messages for all Keystone errors (Kishon Vijay Abraham I)
- Clean up Keystone includes and macros (Kishon Vijay Abraham I)
- Fix Mediatek unchecked return value from devm_pci_remap_iospace()
(Gustavo A. R. Silva)
- Fix Mediatek endpoint/port matching logic (Honghui Zhang)
- Change Mediatek Root Port Class Code to PCI_CLASS_BRIDGE_PCI (Honghui
Zhang)
- Remove redundant Mediatek PM domain check (Honghui Zhang)
- Convert Mediatek to pci_host_probe() (Honghui Zhang)
- Fix Mediatek MSI enablement (Honghui Zhang)
- Add Mediatek system PM support for MT2712 and MT7622 (Honghui Zhang)
- Add Mediatek loadable module support (Honghui Zhang)
- Detach VMD resources after stopping root bus to prevent orphan
resources (Jon Derrick)
- Convert pcitest build process to that used by other tools (iio, perf,
etc) (Gustavo Pimentel)
* tag 'pci-v4.20-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (140 commits)
PCI/AER: Refactor error injection fallbacks
PCI/AER: Abstract AER interrupt handling
PCI/AER: Reuse existing pcie_port_find_device() interface
PCI/AER: Use managed resource allocations
PCI: pcie: Remove redundant 'default n' from Kconfig
PCI: aardvark: Implement emulated root PCI bridge config space
PCI: mvebu: Convert to PCI emulated bridge config space
PCI: mvebu: Drop unused PCI express capability code
PCI: Introduce PCI bridge emulated config space common logic
PCI: vmd: Detach resources after stopping root bus
nvmet: Optionally use PCI P2P memory
nvmet: Introduce helper functions to allocate and free request SGLs
nvme-pci: Add support for P2P memory in requests
nvme-pci: Use PCI p2pmem subsystem to manage the CMB
IB/core: Ensure we map P2P memory correctly in rdma_rw_ctx_[init|destroy]()
block: Add PCI P2P flag for request queue
PCI/P2PDMA: Add P2P DMA driver writer's documentation
docs-rst: Add a new directory for PCI documentation
PCI/P2PDMA: Introduce configfs/sysfs enable attribute helpers
PCI/P2PDMA: Add PCI p2pmem DMA mappings to adjust the bus offset
...
Pull perf updates from Ingo Molnar:
"The main updates in this cycle were:
- Lots of perf tooling changes too voluminous to list (big perf trace
and perf stat improvements, lots of libtraceevent reorganization,
etc.), so I'll list the authors and refer to the changelog for
details:
Benjamin Peterson, Jérémie Galarneau, Kim Phillips, Peter
Zijlstra, Ravi Bangoria, Sangwon Hong, Sean V Kelley, Steven
Rostedt, Thomas Gleixner, Ding Xiang, Eduardo Habkost, Thomas
Richter, Andi Kleen, Sanskriti Sharma, Adrian Hunter, Tzvetomir
Stoyanov, Arnaldo Carvalho de Melo, Jiri Olsa.
... with the bulk of the changes written by Jiri Olsa, Tzvetomir
Stoyanov and Arnaldo Carvalho de Melo.
- Continued intel_rdt work with a focus on playing well with perf
events. This also imported some non-perf RDT work due to
dependencies. (Reinette Chatre)
- Implement counter freezing for Arch Perfmon v4 (Skylake and newer).
This allows to speed up the PMI handler by avoiding unnecessary MSR
writes and make it more accurate. (Andi Kleen)
- kprobes cleanups and simplification (Masami Hiramatsu)
- Intel Goldmont PMU updates (Kan Liang)
- ... plus misc other fixes and updates"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (155 commits)
kprobes/x86: Use preempt_enable() in optimized_callback()
x86/intel_rdt: Prevent pseudo-locking from using stale pointers
kprobes, x86/ptrace.h: Make regs_get_kernel_stack_nth() not fault on bad stack
perf/x86/intel: Export mem events only if there's PEBS support
x86/cpu: Drop pointless static qualifier in punit_dev_state_show()
x86/intel_rdt: Fix initial allocation to consider CDP
x86/intel_rdt: CBM overlap should also check for overlap with CDP peer
x86/intel_rdt: Introduce utility to obtain CDP peer
tools lib traceevent, perf tools: Move struct tep_handler definition in a local header file
tools lib traceevent: Separate out tep_strerror() for strerror_r() issues
perf python: More portable way to make CFLAGS work with clang
perf python: Make clang_has_option() work on Python 3
perf tools: Free temporary 'sys' string in read_event_files()
perf tools: Avoid double free in read_event_file()
perf tools: Free 'printk' string in parse_ftrace_printk()
perf tools: Cleanup trace-event-info 'tdata' leak
perf strbuf: Match va_{add,copy} with va_end
perf test: S390 does not support watchpoints in test 22
perf auxtrace: Include missing asm/bitsperlong.h to get BITS_PER_LONG
tools include: Adopt linux/bits.h
...
- mostly more consolidation of the direct mapping code, including
converting over hexagon, and merging the coherent and non-coherent
code into a single dma_map_ops instance (me)
- cleanups for the dma_configure/dma_unconfigure callchains (me)
- better handling of dma_masks in odd setups (me, Alexander Duyck)
- better debugging of passing vmalloc address to the DMA API
(Stephen Boyd)
- CMA command line parsing fix (He Zhe)
-----BEGIN PGP SIGNATURE-----
iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAlvNg6YLHGhjaEBsc3Qu
ZGUACgkQD55TZVIEUYMm/Q/9FFVOH73Nc3rT40N2HdaPbzV2hXmI1//hEJcImDP5
mLGq8XqieGuo8Pmu9+xp1tC2UnfUkhK4FjhQbWM+qKER/RNYES2BD50xVFmt6ICS
9d8IaRcs+ceggljfdwszkkucJspBsYNxpiKjjao0OsHn6UDatu6elZs/yvb2nXci
HCJUvs9vYm9MkAtVXEtOQtij3YRaJ/9xYY4h5Dy5vBtHPp+kjUMF0mWAwA2+Ec1V
8iqKjUY3c8nr8Kf6WE9tzJ0wrMFijc4HJlE3W1ud8YsKdfCkCf8XiIuS6PgTzOeK
0cn9h8dVrV1ZXJ/D/9JZDivmYvIsoKWAYVQHNzAiq7PI3uOJY1ggCxyZpWtTHZhM
ATHF0sJGpIenkSWybYpKee8e8RsS7L9dUgu6bYpK5pVkirNYnR9IOGVJNmS63L7Q
B0uUtqjBKDG2yNGZGY9zqBQFgxiPO0wxFLeKyHbIsC0b7FBti3rXGAimch5WiBuL
zlDV0zEfMH0BW6gNPrjfFur84duKtGZ/0DBSxQ0E1Mvk8B1LBr78MgZt8OfJEuoe
dx1FYU70u8PYi+hjmn386YnNNMTjd1GT5XW7AWedM2wCjRYmNy0yMGmm9cACMneN
5eBv/SYr7X1zKNL7w7H6KQVZilTJcBoj3f/lmjL7i22m9FXYQpcUP61L8wHNM8H2
iJo=
=AVSD
-----END PGP SIGNATURE-----
Merge tag 'dma-mapping-4.20' of git://git.infradead.org/users/hch/dma-mapping
Pull dma mapping updates from Christoph Hellwig:
"First batch of dma-mapping changes for 4.20.
There will be a second PR as some big changes were only applied just
before the end of the merge window, and I want to give them a few more
days in linux-next.
Summary:
- mostly more consolidation of the direct mapping code, including
converting over hexagon, and merging the coherent and non-coherent
code into a single dma_map_ops instance (me)
- cleanups for the dma_configure/dma_unconfigure callchains (me)
- better handling of dma_masks in odd setups (me, Alexander Duyck)
- better debugging of passing vmalloc address to the DMA API (Stephen
Boyd)
- CMA command line parsing fix (He Zhe)"
* tag 'dma-mapping-4.20' of git://git.infradead.org/users/hch/dma-mapping: (27 commits)
dma-direct: respect DMA_ATTR_NO_WARN
dma-mapping: translate __GFP_NOFAIL to DMA_ATTR_NO_WARN
dma-direct: document the zone selection logic
dma-debug: Check for drivers mapping invalid addresses in dma_map_single()
dma-direct: fix return value of dma_direct_supported
dma-mapping: move dma_default_get_required_mask under ifdef
dma-direct: always allow dma mask <= physiscal memory size
dma-direct: implement complete bus_dma_mask handling
dma-direct: refine dma_direct_alloc zone selection
dma-direct: add an explicit dma_direct_get_required_mask
dma-mapping: make the get_required_mask method available unconditionally
unicore32: remove swiotlb support
Revert "dma-mapping: clear dev->dma_ops in arch_teardown_dma_ops"
dma-mapping: support non-coherent devices in dma_common_get_sgtable
dma-mapping: consolidate the dma mmap implementations
dma-mapping: merge direct and noncoherent ops
dma-mapping: move the dma_coherent flag to struct device
MIPS: don't select DMA_MAYBE_COHERENT from DMA_PERDEV_COHERENT
dma-mapping: add the missing ARCH_HAS_SYNC_DMA_FOR_CPU_ALL declaration
dma-mapping: fix panic caused by passing empty cma command line argument
...
- Detach VMD resources after stopping root bus to prevent orphan
resources (Jon Derrick)
* remotes/lorenzo/pci/vmd:
PCI: vmd: Detach resources after stopping root bus
- Fix Mediatek unchecked return value from devm_pci_remap_iospace()
(Gustavo A. R. Silva)
- Fix Mediatek endpoint/port matching logic (Honghui Zhang)
- Change Mediatek Root Port Class Code to PCI_CLASS_BRIDGE_PCI (Honghui
Zhang)
- Remove redundant Mediatek PM domain check (Honghui Zhang)
- Convert Mediatek to pci_host_probe() (Honghui Zhang)
- Fix Mediatek MSI enablement (Honghui Zhang)
- Add Mediatek system PM support for MT2712 and MT7622 (Honghui Zhang)
- Add Mediatek loadable module support (Honghui Zhang)
* remotes/lorenzo/pci/mediatek:
PCI: mediatek: Add loadable kernel module support
PCI: mediatek: Add system PM support for MT2712 and MT7622
PCI: mediatek: Fixup MSI enablement logic by enabling MSI before clocks
PCI: mediatek: Convert to use pci_host_probe()
PCI: mediatek: Remove the redundant dev->pm_domain check
PCI: mediatek: Fix class type for MT7622 to PCI_CLASS_BRIDGE_PCI
PCI: mediatek: Fix mtk_pcie_find_port() endpoint/port matching logic
PCI: mediatek: Fix unchecked return value
- Quirk Keystone K2G to limit MRRS to 256 (Kishon Vijay Abraham I)
- Update Keystone to use MRRS quirk for host bridge instead of open
coding (Kishon Vijay Abraham I)
- Refactor Keystone link establishment (Kishon Vijay Abraham I)
- Simplify and speed up Keystone link training (Kishon Vijay Abraham I)
- Remove unused Keystone host_init argument (Kishon Vijay Abraham I)
- Merge Keystone driver files into one (Kishon Vijay Abraham I)
- Remove redundant Keystone platform_set_drvdata() (Kishon Vijay Abraham
I)
- Rename Keystone functions for uniformity (Kishon Vijay Abraham I)
- Add Keystone device control module DT binding (Kishon Vijay Abraham I)
- Use SYSCON API to get Keystone control module device IDs (Kishon Vijay
Abraham I)
- Clean up Keystone PHY handling (Kishon Vijay Abraham I)
- Use runtime PM APIs to enable Keystone clock (Kishon Vijay Abraham I)
- Clean up Keystone config space access checks (Kishon Vijay Abraham I)
- Get Keystone outbound window count from DT (Kishon Vijay Abraham I)
- Clean up Keystone outbound window configuration (Kishon Vijay Abraham
I)
- Clean up Keystone DBI setup (Kishon Vijay Abraham I)
- Clean up Keystone ks_pcie_link_up() (Kishon Vijay Abraham I)
- Fix Keystone IRQ status checking (Kishon Vijay Abraham I)
- Add debug messages for all Keystone errors (Kishon Vijay Abraham I)
- Clean up Keystone includes and macros (Kishon Vijay Abraham I)
* remotes/lorenzo/pci/keystone:
PCI: keystone: Cleanup macros defined in pci-keystone.c
PCI: keystone: Reorder header file in alphabetical order
PCI: keystone: Add debug error message for all errors
PCI: keystone: Use ERR_IRQ_STATUS instead of ERR_IRQ_STATUS_RAW to get interrupt status
PCI: keystone: Cleanup ks_pcie_link_up()
PCI: keystone: Cleanup set_dbi_mode() and get_dbi_mode()
PCI: keystone: Cleanup outbound window configuration
PCI: keystone: Get number of outbound windows from DT
PCI: keystone: Cleanup configuration space access
PCI: keystone: Invoke runtime PM APIs to enable clock
PCI: keystone: Cleanup PHY handling
PCI: keystone: Use SYSCON APIs to get device ID from control module
dt-bindings: PCI: keystone: Add bindings to get device control module
PCI: keystone: Use uniform function naming convention
PCI: keystone: Remove redundant platform_set_drvdata() invocation
PCI: keystone: Merge pci-keystone-dw.c and pci-keystone.c
PCI: keystone: Remove unused argument from ks_dw_pcie_host_init()
PCI: keystone: Do not initiate link training multiple times
PCI: keystone: Move dw_pcie_setup_rc() out of ks_pcie_establish_link()
PCI: keystone: Use quirk to set MRRS for PCI host bridge
PCI: keystone: Use quirk to limit MRRS for K2G
- Fix Cadence PHY handling during probe (Alan Douglas)
- Signal Cadence Endpoint interrupts via AXI region 0 instead of last
region (Alan Douglas)
- Write Cadence Endpoint MSI interrupts with 32 bits of data (Alan
Douglas)
* remotes/lorenzo/pci/cadence:
PCI: cadence: Write MSI data with 32bits
PCI: cadence: Use AXI region 0 to signal interrupts from EP
PCI: cadence: Correct probe behaviour when failing to get PHY
- Cache VF config space size to optimize enumeration of many VFs
(KarimAllah Ahmed)
- Remove unnecessary <linux/pci-ats.h> include (Bjorn Helgaas)
* pci/virtualization:
PCI/IOV: Remove unnecessary include of <linux/pci-ats.h>
PCI/IOV: Use VF0 cached config space size for other VFs
- Add PCI support for peer-to-peer DMA (Logan Gunthorpe)
- Add sysfs group for PCI peer-to-peer memory statistics (Logan
Gunthorpe)
- Add PCI peer-to-peer DMA scatterlist mapping interface (Logan
Gunthorpe)
- Add PCI configfs/sysfs helpers for use by peer-to-peer users (Logan
Gunthorpe)
- Add PCI peer-to-peer DMA driver writer's documentation (Logan
Gunthorpe)
- Add block layer flag to indicate driver support for PCI peer-to-peer
DMA (Logan Gunthorpe)
- Map Infiniband scatterlists for peer-to-peer DMA if they contain P2P
memory (Logan Gunthorpe)
- Register nvme-pci CMB buffer as PCI peer-to-peer memory (Logan
Gunthorpe)
- Add nvme-pci support for PCI peer-to-peer memory in requests (Logan
Gunthorpe)
- Use PCI peer-to-peer memory in nvme (Stephen Bates, Steve Wise,
Christoph Hellwig, Logan Gunthorpe)
* pci/peer-to-peer:
nvmet: Optionally use PCI P2P memory
nvmet: Introduce helper functions to allocate and free request SGLs
nvme-pci: Add support for P2P memory in requests
nvme-pci: Use PCI p2pmem subsystem to manage the CMB
IB/core: Ensure we map P2P memory correctly in rdma_rw_ctx_[init|destroy]()
block: Add PCI P2P flag for request queue
PCI/P2PDMA: Add P2P DMA driver writer's documentation
docs-rst: Add a new directory for PCI documentation
PCI/P2PDMA: Introduce configfs/sysfs enable attribute helpers
PCI/P2PDMA: Add PCI p2pmem DMA mappings to adjust the bus offset
PCI/P2PDMA: Add sysfs group to display p2pmem stats
PCI/P2PDMA: Support peer-to-peer memory
- Differentiate between pciehp surprise and safe removal (Lukas Wunner)
- Remove unnecessary pciehp includes (Lukas Wunner)
- Drop pciehp hotplug_slot_ops wrappers (Lukas Wunner)
- Tolerate PCIe Slot Presence Detect being hardwired to zero to
workaround broken hardware, e.g., the Wilocity switch/wireless device
(Lukas Wunner)
- Unify pciehp controller & slot structs (Lukas Wunner)
- Constify hotplug_slot_ops (Lukas Wunner)
- Drop hotplug_slot_info (Lukas Wunner)
- Embed hotplug_slot struct into users instead of allocating it
separately (Lukas Wunner)
- Initialize PCIe port service drivers directly instead of relying on
initcall ordering (Keith Busch)
- Restore PCI config state after a slot reset (Keith Busch)
- Save/restore DPC config state along with other PCI config state (Keith
Busch)
- Reference count devices during AER handling to avoid race issue with
concurrent hot removal (Keith Busch)
- If an Upstream Port reports ERR_FATAL, don't try to read the Port's
config space because it is probably unreachable (Keith Busch)
- During error handling, use slot-specific reset instead of secondary
bus reset to avoid link up/down issues on hotplug ports (Keith Busch)
- Restore previous AER/DPC handling that does not remove and re-enumerate
devices on ERR_FATAL (Keith Busch)
- Notify all drivers that may be affected by error recovery resets (Keith
Busch)
- Always generate error recovery uevents, even if a driver doesn't have
error callbacks (Keith Busch)
- Make PCIe link active reporting detection generic (Keith Busch)
- Support D3cold in PCIe hierarchies during system sleep and runtime,
including hotplug and Thunderbolt ports (Mika Westerberg)
- Handle hpmemsize/hpiosize kernel parameters uniformly, whether slots
are empty or occupied (Jon Derrick)
- Remove duplicated include from pci/pcie/err.c and unused variable from
cpqphp (YueHaibing)
- Remove driver pci_cleanup_aer_uncorrect_error_status() calls (Oza
Pawandeep)
- Uninline PCI bus accessors for better ftracing (Keith Busch)
- Remove unused AER Root Port .error_resume method (Keith Busch)
- Use kfifo in AER instead of a local version (Keith Busch)
- Use threaded IRQ in AER bottom half (Keith Busch)
- Use managed resources in AER core (Keith Busch)
- Reuse pcie_port_find_device() for AER injection (Keith Busch)
- Abstract AER interrupt handling to disconnect error injection (Keith
Busch)
- Refactor AER injection callbacks to simplify future improvments (Keith
Busch)
* pci/hotplug:
PCI/AER: Refactor error injection fallbacks
PCI/AER: Abstract AER interrupt handling
PCI/AER: Reuse existing pcie_port_find_device() interface
PCI/AER: Use managed resource allocations
PCI/AER: Use threaded IRQ for bottom half
PCI/AER: Use kfifo_in_spinlocked() to insert locked elements
PCI/AER: Use kfifo for tracking events instead of reimplementing it
PCI/AER: Remove error source from AER struct aer_rpc
PCI/AER: Remove unused aer_error_resume()
PCI: Uninline PCI bus accessors for better ftracing
PCI/AER: Remove pci_cleanup_aer_uncorrect_error_status() calls
PCI: pnv_php: Use kmemdup()
PCI: cpqphp: Remove set but not used variable 'physical_slot'
PCI/ERR: Remove duplicated include from err.c
PCI: Equalize hotplug memory and io for occupied and empty slots
PCI / ACPI: Whitelist D3 for more PCIe hotplug ports
ACPI / property: Allow multiple property compatible _DSD entries
PCI/PME: Implement runtime PM callbacks
PCI: pciehp: Implement runtime PM callbacks
PCI/portdrv: Add runtime PM hooks for port service drivers
PCI/portdrv: Resume upon exit from system suspend if left runtime suspended
PCI: pciehp: Do not handle events if interrupts are masked
PCI: pciehp: Disable hotplug interrupt during suspend
PCI / ACPI: Enable wake automatically for power managed bridges
PCI: Do not skip power-managed bridges in pci_enable_wake()
PCI: Make link active reporting detection generic
PCI: Unify device inaccessible
PCI/ERR: Always report current recovery status for udev
PCI/ERR: Simplify broadcast callouts
PCI/ERR: Run error recovery callbacks for all affected devices
PCI/ERR: Handle fatal error recovery
PCI/ERR: Use slot reset if available
PCI/AER: Don't read upstream ports below fatal errors
PCI/AER: Take reference on error devices
PCI/DPC: Save and restore config state
PCI: portdrv: Restore PCI config state on slot reset
PCI: portdrv: Initialize service drivers directly
PCI: hotplug: Document TODOs
PCI: hotplug: Embed hotplug_slot
PCI: hotplug: Drop hotplug_slot_info
PCI: hotplug: Constify hotplug_slot_ops
PCI: pciehp: Reshuffle controller struct for clarity
PCI: pciehp: Rename controller struct members for clarity
PCI: pciehp: Unify controller and slot structs
PCI: pciehp: Tolerate Presence Detect hardwired to zero
PCI: pciehp: Drop hotplug_slot_ops wrappers
PCI: pciehp: Drop unnecessary includes
PCI: pciehp: Differentiate between surprise and safe removal
PCI: Simplify disconnected marking
Move the bus ops fallback into separate functions. No functional change
here.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The aer_inject module was directly calling aer_irq(). This required the
AER driver export its private IRQ handler for no other reason than to
support error injection. A driver should not have to expose its private
interfaces, so use the IRQ subsystem to route injection to the AER driver,
and make aer_irq() a private interface.
This provides additional benefits:
First, directly calling the IRQ handler bypassed the IRQ subsytem so the
injection wasn't really synthesizing what happens if a shared AER interrupt
occurs.
The error injection had to provide the callback data directly, which may be
racing with a removal that is freeing that structure. The IRQ subsystem
can handle that race.
Finally, using the IRQ subsystem automatically reacts to threaded IRQs,
keeping the error injection abstracted from that implementation detail.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The port services driver already provides a method to find the pcie_device
for a service. Export that function, use it from the aer_inject module,
and remove the duplicate functionality.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Use the managed device resource allocations for the service data so the AER
driver doesn't need to manage it, further simplifying this driver.
Link: https://lore.kernel.org/linux-pci/20180918235848.26694-12-keith.busch@intel.com
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
'default n' is the default value for any bool or tristate Kconfig setting
so there is no need to write it explicitly.
Also since commit f467c5640c ("kconfig: only write '# CONFIG_FOO is not
set' for visible symbols") the Kconfig behavior is the same regardless of
'default n' being present or not:
...
One side effect of (and the main motivation for) this change is making
the following two definitions behave exactly the same:
config FOO
bool
config FOO
bool
default n
With this change, neither of these will generate a
'# CONFIG_FOO is not set' line (assuming FOO isn't selected/implied).
That might make it clearer to people that a bare 'default n' is
redundant.
...
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The PCI controller in the Marvell Armada 3720 does not implement a
software-accessible root port PCI bridge configuration space. This
causes a number of problems when using PCIe switches or when the Max
Payload size needs to be aligned between the root complex and the
endpoint.
Implementing an emulated root PCI bridge, like is already done in the
pci-mvebu driver for older Marvell platforms allows to solve those
issues, and also to support features such as ASR, PME, VC, HP.
Signed-off-by: Zachary Zhang <zhangzg@marvell.com>
[Thomas: convert to the common emulated PCI bridge logic.]
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Convert the pci-mvebu driver to use the pci-bridge-emul logic, that
helps emulating a root port PCI bridge configuration space.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
[lorenzo.pieralisi@arm.com: updated commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Commit dc0352ab0b ("PCI: mvebu: Add PCI Express root complex
capability block") added support for emulating the PCI Express
capability block. As part of this, the pcie_sltcap, pcie_devctl and
pcie_rtctl fields were added to the mvebu_sw_pci_bridge structure, and
used when reading the corresponding PCI Express capability block
registers.
However, those structure members are never set to any value other than
zero. This makes them unneeded because:
- pcie_devctl is used to OR *value, so with pcie_devctl always zero,
it has no effect.
- for pcie_sltcap and pcie_rtstl, the mvebu_sw_pci_bridge_read()
function always returns 0 for registers that are not explicitly
handled.
In preparation for reworking the PCI bridge emulation logic in
pci-mvebu, let's simplify the code by dropping those structure
members.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Some PCI host controllers do not expose a configuration space for the
root port PCI bridge. Due to this, the Marvell Armada 370/38x/XP PCI
controller driver (pci-mvebu) emulates a root port PCI bridge
configuration space, and uses that to (among other things) dynamically
create the memory windows that correspond to the PCI MEM and I/O
regions.
Since we now need to add a very similar logic for the Marvell Armada
37xx PCI controller driver (pci-aardvark), instead of duplicating the
code, we create in this commit a common logic called pci-bridge-emul.
The idea of this logic is to emulate a root port PCI bridge
configuration space by providing configuration space read/write
operations, and faking behind the scenes the configuration space of a
PCI bridge. A PCI host controller driver simply has to call
pci_bridge_emul_conf_read() and pci_bridge_emul_conf_write() to
read/write the configuration space of the bridge.
By default, the PCI bridge configuration space is simply emulated by a
chunk of memory, but the PCI host controller can override the behavior
of the read and write operations on a per-register basis to do
additional actions if needed. We take care of complying with the
behavior of the PCI configuration space registers in terms of bits
that are read-write, read-only, reserved and write-1-to-clear.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk>
The VMD removal path calls pci_stop_root_busi(), which tears down the pcie
tree, including detaching all of the attached drivers. During driver
detachment, devices may use pci_release_region() to release resources.
This path relies on the resource being accessible in resource tree.
By detaching the child domain from the parent resource domain prior to
stopping the bus, we are preventing the list traversal from finding the
resource to be freed. If we instead detach the resource after stopping
the bus, we will have properly freed the resource and detaching is
simply accounting at that point.
Without this order, the resource is never freed and is orphaned on VMD
removal, leading to a warning:
[ 181.940162] Trying to free nonexistent resource <e5a10000-e5a13fff>
Fixes: 2c2c5c5cd2 ("x86/PCI: VMD: Attach VMD resources to parent domain's resource tree")
Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
[lorenzo.pieralisi@arm.com: updated commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Users of the P2PDMA infrastructure will typically need a way for the user
to tell the kernel to use P2P resources. Typically this will be a simple
on/off boolean operation but sometimes it may be desirable for the user to
specify the exact device to use for the P2P operation.
Add new helpers for attributes which take a boolean or a PCI device. Any
boolean as accepted by strtobool() turn P2P on or off (such as 'y', 'n',
'1', '0', etc). Specifying a full PCI device name/BDF will select the
specific device.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
The DMA address used when mapping PCI P2P memory must be the PCI bus
address. Thus, introduce pci_p2pmem_map_sg() to map the correct addresses
when using P2P memory.
Memory mapped in this way does not need to be unmapped and thus if we
provided pci_p2pmem_unmap_sg() it would be empty. This breaks the expected
balance between map/unmap but was left out as an empty function doesn't
really provide any benefit. In the future, if this call becomes necessary
it can be added without much difficulty.
For this, we assume that an SGL passed to these functions contain all P2P
memory or no P2P memory.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Add a sysfs group to display statistics about P2P memory that is registered
in each PCI device.
Attributes in the group display the total amount of P2P memory, the amount
available and whether it is published or not.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Implement remove() callback function for the Mediatek PCIe controller
driver to add loadable kernel module support.
Save the PCIe's GIC IRQ at probe so that it can be retrieved to
call dispose_irq() to tear down the IRQ upon module removal.
Signed-off-by: Honghui Zhang <honghui.zhang@mediatek.com>
[lorenzo.pieralisi@arm.com: updated commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Ryder Lee <ryder.lee@mediatek.com>
In order to reduce the PCIe power consumption in system suspend,
the PCI bus physical layer should be gated. On system resume, the PCIe
link should be re-established and the related control register values
should be restored.
Define suspend_noirq & resume_noirq callback functions to implement
PM system syspend hooks for the PCI host controller.
Signed-off-by: Honghui Zhang <honghui.zhang@mediatek.com>
[lorenzo.pieralisi@arm.com: updated commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Ryder Lee <ryder.lee@mediatek.com>
Commit 43e6409db6 ("PCI: mediatek: Add MSI support for MT2712 and
MT7622") added MSI support but enabled MSI in the wrong place, at a step
in the probe sequence where clocks were not still enabled.
Fix this issue by calling mtk_pcie_enable_msi() in mtk_pcie_startup_port_v2()
since clocks are enabled when mtk_pcie_startup_port_v2() is called.
To avoid forward declaration of mtk_pcie_enable_msi(), move the
mtk_pcie_startup_port_v2() function definition in the file.
Fixes: 43e6409db6 ("PCI: mediatek: Add MSI support for MT2712 and MT7622")
Signed-off-by: Honghui Zhang <honghui.zhang@mediatek.com>
[lorenzo.pieralisi@arm.com: squashed commit and adapted log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Ryder Lee <ryder.lee@mediatek.com>
Part of mtk_pcie_register_host() is an open-coded version of
pci_host_probe(). So instead of duplicating this code, use
pci_host_probe() directly and remove mtk_pcie_register_host().
Signed-off-by: Honghui Zhang <honghui.zhang@mediatek.com>
[lorenzo.pieralisi@arm.com: commit log changes]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Ryder Lee <ryder.lee@mediatek.com>
There is no need to check whether device have a PM domain attached before
calling the PM runtime methods. Remove it.
Signed-off-by: Honghui Zhang <honghui.zhang@mediatek.com>
[lorenzo.pieralisi@arm.com: commit log changes]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Ryder Lee <ryder.lee@mediatek.com>
No functional change. Cleanup macros defined in pci-keystone.c
by removing unused macros, grouping the macros and aligning
it properly.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
No functional change. Reorder header file in alphabetical order.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
commit 025dd3daed ("PCI: keystone: Add error IRQ
handler") added dev_err() message only for ERR_AXI and ERR_FATAL. Add
debug error message for ERR_SYS, ERR_NONFATAL, ERR_CORR and ERR_AER here.
While at that avoid using ERR_IRQ_STATUS_RAW and use ERR_IRQ_STATUS
instead.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Use ERR_IRQ_STATUS instead of ERR_IRQ_STATUS_RAW to get interrupt
status. ERR_IRQ_STATUS_RAW has the status of the interrupts
before masking whereas ERR_IRQ_STATUS has the status of the interrupts
after masking. Since all the interrupts are unmasked here, use
ERR_IRQ_STATUS.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
ks_pcie_link_up() uses registers from the designware core to get the
status of the link. Move the register defines to pcie-designware.h
and cleanup ks_pcie_link_up().
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
No functional change. Use BIT() macro for DBI_CS2 and cleanup
set_dbi_mode() and get_dbi_mode().
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Outbound translation window is configured in order to access the
PCIe card's MEM space. Cleanup outbound translation configuration
here by using BIT() macros, adding a macro for window size and
using lower_32_bits/upper_32_bits macros for configuring the 64 bit
offset in the outbound translation region.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Instead of having a fixed outbound window count, get the number of
outbound windows from the device tree.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cleanup configuration space access by removing ks_pcie_cfg_setup()
which has an unncessary check of "if (bus == 0)" which will never be the
case of *_other_conf() and adding macros for configuring the CFG_SETUP
register required for accessing the configuration space of the device.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Invoke runtime PM APIs to enable clocks and remove explicit
clock enabling using clk_prepare_enable().
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cleanup PHY handling by using devm_phy_optional_get() to get PHYs if
the PHYs are optional, creating a device link between the PHY device
and the controller device and disable PHY on error cases here.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Control module registers should be read using syscon APIs.
pci-keystone.c uses platform_get_resource() to get control module registers.
Fix it here by using syscon APIs to get device id from control module.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
No functional change. Some function names begin with ks_dw_pcie_*
and some function names begin with ks_pcie_*. Modify it so that
all function names begin with ks_pcie_*.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
No functional change. Remove redundant platform_set_drvdata() invocation
in ks_pcie_probe().
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
No functional change. Having two different files for keystone PCI driver
doesn't serve any purpose. Merge pci-keystone-dw.c and pci-keystone.c
into a single pci-keystone.c file and remove pci-keystone.h.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
No functional change. Remove unused "msi_intc_np" argument from
ks_dw_pcie_host_init().
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
commit 886bc5ceb5 ("PCI: designware: Add generic
dw_pcie_wait_for_link()") while adding a generic dw_pcie_wait_for_link()
performed a special handling (initiate link training multiple times) for
keystone which is not required. This also resulted in unncessarily waiting
for more time to establish the link even when no PCI device is connected.
Remove it and make it look similar to other dwc based PCIe drivers.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
No functional change. Move dw_pcie_setup_rc() out of
ks_pcie_establish_link() so that ks_pcie_establish_linki() can be used only
to start the link.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reuse the already existing quirk to set MRRS for PCI host bridge
instead of explicitly setting MRRS in ks_pcie_host_init().
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
PCI controller in K2G also has a limitation that memory read request
size (MRRS) must not exceed 256 bytes. Use the quirk to limit MRRS
(added for K2HK, K2L and K2E) for K2G as well.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
commit 101c92dc80 ("PCI: mediatek: Set up vendor ID and class
type for MT7622") erroneously set the class type for MT7622 to
PCI_CLASS_BRIDGE_HOST.
The PCIe controller of MT7622 integrates a Root Port that has type 1
configuration space header and related bridge windows.
The HW default value of this bridge's class type is invalid.
Fix its class type and set it to PCI_CLASS_BRIDGE_PCI to
match the hardware implementation.
Fixes: 101c92dc80 ("PCI: mediatek: Set up vendor ID and class type for MT7622")
Signed-off-by: Honghui Zhang <honghui.zhang@mediatek.com>
[lorenzo.pieralisi@arm.com: reworked the commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Ryder Lee <ryder.lee@mediatek.com>
The Mediatek's host controller has two slots, each with its own control
registers. The host driver needs to identify what slot is connected to
what port in order to access the device's configuration space.
Current code retrieving slot connected to a given endpoint device.
Assuming each slot is connected to one endpoint device as below:
host bridge
bus 0 --> __________|_______
| |
| |
slot 0 slot 1
bus 1 -->| bus 2 --> |
| |
EP 0 EP 1
During PCI enumeration, system software will scan all the PCI devices on
every bus starting from devfn 0. Using PCI_SLOT(devfn) for matching an
endpoint to its slot is erroneous in that the devfn does not contain the
hierarchical bus numbering in it. In order to match an endpoint with its
slot (and related port), the PCI tree must be walked up to the root bus
(where the root ports are situated) and then the PCI_SLOT(devfn)
matching logic can be correctly applied for matching.
This patch fixes the mtk_pcie_find_port() slot matching logic by adding
appropriate PCI tree walking code to retrieve the slot/port a given
endpoint is connected to.
Signed-off-by: Honghui Zhang <honghui.zhang@mediatek.com>
[lorenzo.pieralisi@arm.com: rewrote the commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Ryder Lee <ryder.lee@mediatek.com>
Currently, eeh_pe_state_mark() marks a PE (and it's children) with a
state and then performs additional processing if that state included
EEH_PE_ISOLATED.
The state parameter is always a constant at the call site, so
rearrange eeh_pe_state_mark() into two functions and just call the
appropriate one at each site.
Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
According to the PCIe specification, although the MSI data is only
16bits, the upper 16bits should be written as 0. Use writel
instead of writew when writing the MSI data to the host.
Fixes: 37dddf14f1 ("PCI: cadence: Add EndPoint Controller driver for Cadence PCIe controller")
Signed-off-by: Alan Douglas <adouglas@cadence.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
The IRQ physical address is allocated from region 0, rather than
the highest region. Update the driver to reserve this region in
the bitmap and to use region 0 for all types of interrupt.
This corrects a problem which prevents the interrupt being
signalled correctly if using the first address in the AXI region,
since an offset of zero will always be mapped to region 0.
Fixes: 37dddf14f1 ("PCI: cadence: Add EndPoint Controller driver for Cadence PCIe controller")
Signed-off-by: Alan Douglas <adouglas@cadence.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
iov.c uses nothing declared in <linux/pci-ats.h>, so remove the include of
it. No functional change intended.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cache the config space size from VF0 and use it for all other VFs instead
of reading it from the config space of each VF. We assume that it will be
the same across all associated VFs.
This is an optimization when enabling SR-IOV on a device with many VFs.
Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
[bhelgaas: use CONFIG_PCI_IOV (not CONFIG_PCI_ATS)]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Currently the Switchtec quirk runs on all endpoints in the switch,
including all the upstream and downstream ports. These other functions do
not contain BARs, so the quirk fails when trying to map the BAR and prints
the error "Cannot iomap Switchtec device". The user will see a few of
these useless and scary errors, one for each port in the switch.
At most, the quirk should only run on either a management endpoint
(PCI_CLASS_MEMORY_OTHER) or an NTB endpoint (PCI_CLASS_BRIDGE_OTHER).
However, the quirk is useless except in NTB applications, so we will
only run it when the class is PCI_CLASS_BRIDGE_OTHER.
Switch to using DECLARE_PCI_FIXUP_CLASS_FINAL and only match
PCI_CLASS_BRIDGE_OTHER.
Reported-by: Stephen Bates <sbates@raithlin.com>
Fixes: ad281ecf1c ("PCI: Add DMA alias quirk for Microsemi Switchtec NTB")
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
[bhelgaas: split SWITCHTEC_QUIRK() introduction to separate patch]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Doug Meyer <dmeyer@gigaio.com>
Cc: Kurt Schwemmer <kurt.schwemmer@microsemi.com>
Add SWITCHTEC_QUIRK() to reduce redundancy in declaring devices that use
quirk_switchtec_ntb_dma_alias().
By itself, this is no functional change, but a subsequent patch updates
SWITCHTEC_QUIRK() to fix ad281ecf1c ("PCI: Add DMA alias quirk for
Microsemi Switchtec NTB").
Fixes: ad281ecf1c ("PCI: Add DMA alias quirk for Microsemi Switchtec NTB")
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
[bhelgaas: split to separate patch]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Add Device IDs to the Intel GPU "spurious interrupt" quirk table.
For these devices, unplugging the VGA cable and plugging it in again causes
spurious interrupts from the IGD. Linux eventually disables the interrupt,
but of course that disables any other devices sharing the interrupt.
The theory is that this is a VGA BIOS defect: it should have disabled the
IGD interrupt but failed to do so.
See f67fd55fa9 ("PCI: Add quirk for still enabled interrupts on Intel
Sandy Bridge GPUs") and 7c82126a94 ("PCI: Add new ID for Intel GPU
"spurious interrupt" quirk") for some history.
[bhelgaas: See link below for discussion about how to fix this more
generically instead of adding device IDs for every new Intel GPU. I hope
this is the last patch to add device IDs.]
Link: https://lore.kernel.org/linux-pci/1537974841-29928-1-git-send-email-bmeng.cn@gmail.com
Signed-off-by: Bin Meng <bmeng.cn@gmail.com>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org # v3.4+
The few callers can just use dma_set_max_seg_size ()directly.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The two callers can just use dma_set_seg_boundary() directly.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Some PCI devices may have memory mapped in a BAR space that's intended for
use in peer-to-peer transactions. To enable such transactions the memory
must be registered with ZONE_DEVICE pages so it can be used by DMA
interfaces in existing drivers.
Add an interface for other subsystems to find and allocate chunks of P2P
memory as necessary to facilitate transfers between two PCI peers:
struct pci_dev *pci_p2pmem_find[_many]();
int pci_p2pdma_distance[_many]();
void *pci_alloc_p2pmem();
The new interface requires a driver to collect a list of client devices
involved in the transaction then call pci_p2pmem_find() to obtain any
suitable P2P memory. Alternatively, if the caller knows a device which
provides P2P memory, they can use pci_p2pdma_distance() to determine if it
is usable. With a suitable p2pmem device, memory can then be allocated
with pci_alloc_p2pmem() for use in DMA transactions.
Depending on hardware, using peer-to-peer memory may reduce the bandwidth
of the transfer but can significantly reduce pressure on system memory.
This may be desirable in many cases: for example a system could be designed
with a small CPU connected to a PCIe switch by a small number of lanes
which would maximize the number of lanes available to connect to NVMe
devices.
The code is designed to only utilize the p2pmem device if all the devices
involved in a transfer are behind the same PCI bridge. This is because we
have no way of knowing whether peer-to-peer routing between PCIe Root Ports
is supported (PCIe r4.0, sec 1.3.1). Additionally, the benefits of P2P
transfers that go through the RC is limited to only reducing DRAM usage
and, in some cases, coding convenience. The PCI-SIG may be exploring
adding a new capability bit to advertise whether this is possible for
future hardware.
This commit includes significant rework and feedback from Christoph
Hellwig.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
[bhelgaas: fold in fix from Keith Busch <keith.busch@intel.com>:
https://lore.kernel.org/linux-pci/20181012155920.15418-1-keith.busch@intel.com,
to address comment from Dan Carpenter <dan.carpenter@oracle.com>, fold in
https://lore.kernel.org/linux-pci/20181017160510.17926-1-logang@deltatee.com]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The threaded IRQ is naturally single threaded as desired, so use that to
simplify the AER bottom half handler. Since the root port structure has
much less to do now, remove the rpc construction helper routine.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Use the recommended kernel API for writing to a concurrently-accessed
kfifo. No functional change here.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The kernel provides a generic FIFO implementation, so no need to reinvent
that capability in a driver. Replace the AER-specific implementation with
the kernel-provided kfifo. Since the interrupt handler producer and work
queue consumer run single threaded, there is no need for additional
locking, so remove that lock, too.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The AER struct aer_rpc was carrying a copy of the error source simply as a
temperary variable. Remove that from the structure and use a stack
variable for the purpose.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The error recovery callbacks are only run on child devices. A Root Port is
never a child device, so this error resume callback was never invoked.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
As done treewide earlier, this catches several more open-coded
allocation size calculations that were added to the kernel during the
merge window. This performs the following mechanical transformations
using Coccinelle:
kvmalloc(a * b, ...) -> kvmalloc_array(a, b, ...)
kvzalloc(a * b, ...) -> kvcalloc(a, b, ...)
devm_kzalloc(..., a * b, ...) -> devm_kcalloc(..., a, b, ...)
Signed-off-by: Kees Cook <keescook@chromium.org>
When the root complex suspends it must send a PME_Turn_Off TLP.
Implement this by asserting the "turnoff" reset.
On imx7d this functionality is part of the System Reset Controller (SRC)
and is exposed through the linux reset-controller subsystem.
On imx6 equivalent bits are in the IOMUXC pinmux controller General
Purpose Register (GPR) area which the imx6-pcie driver accesses
directly.
This is only for imx7d right now but it's deliberately implemented as an
optional reset, ignoring the chip variant:
* Older dtbs won't have this reset so it will be ignored.
* Future chips might also expose this as a reset controller.
For example imx8m (not yet supported) has the exact same
PCIE_CTRL_APPS_TURNOFF bit in the same location.
Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
[lorenzo.pieralisi@arm.com: updated commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
The PCI bus config accessors could be inlined into other accessor
functions, which makes it so they can't be traced. Force them to never be
inlined so that ftrace can hook into these functions.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.
Addresses-Coverity-ID: 1472052 ("Missing break in switch")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Use kmemdup() rather than duplicating its implementation.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/pci/hotplug/cpqphp_core.c: In function 'init_SERR':
drivers/pci/hotplug/cpqphp_core.c:124:5: warning: variable 'physical_slot' set but not used [-Wunused-but-set-variable]
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Currently, a hotplug bridge will be given hpmemsize additional memory
and hpiosize additional io if available, in order to satisfy any future
hotplug allocation requirements.
These calculations don't consider the current memory/io size of the
hotplug bridge/slot, so hotplug bridges/slots which have downstream
devices will be allocated their current allocation in addition to the
hpmemsize value.
This makes for possibly undesirable results with a mix of unoccupied and
occupied slots (ex, with hpmemsize=2M):
02:03.0 PCI bridge: <-- Occupied
Memory behind bridge: d6200000-d64fffff [size=3M]
02:04.0 PCI bridge: <-- Unoccupied
Memory behind bridge: d6500000-d66fffff [size=2M]
This change considers the current allocation size when using the
hpmemsize/hpiosize parameters to make the reservations predictable for
the mix of unoccupied and occupied slots:
02:03.0 PCI bridge: <-- Occupied
Memory behind bridge: d6200000-d63fffff [size=2M]
02:04.0 PCI bridge: <-- Unoccupied
Memory behind bridge: d6400000-d65fffff [size=2M]
Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
In order to have better power management for Thunderbolt PCIe chains,
Windows enables power management for native PCIe hotplug ports if there is
the following ACPI _DSD attached to the root port:
Name (_DSD, Package () {
ToUUID ("6211e2c0-58a3-4af3-90e1-927a4e0c55a4"),
Package () {
Package () {"HotPlugSupportInD3", 1}
}
})
This is also documented in:
https://docs.microsoft.com/en-us/windows-hardware/drivers/pci/dsd-for-pcie-root-ports#identifying-pcie-root-ports-supporting-hot-plug-in-d3
Do the same in Linux by introducing new firmware PM callback
(->bridge_d3()) and then implement it for ACPI based systems so that the
above property is checked.
There is one catch, though. The initial pci_dev->bridge_d3 is set before
the root port has ACPI companion bound (the device is not added to the PCI
bus either) so we need to look up the ACPI companion manually in that case
in acpi_pci_bridge_d3().
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Basically we need to do the same steps than what we do when system sleep is
entered and disable PME interrupt when the root port is runtime suspended.
This prevents spurious wakeups immediately when the port is transitioned
into D3cold.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Basically we need to do the same thing when runtime suspending than with
system sleep so re-use those operations here. This makes sure hotplug
interrupt does not trigger immediately when the link goes down.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
When PCIe port is runtime suspended/resumed some extra steps might be
needed to be executed from the port service driver side. For instance we
may need to disable PCIe hotplug interrupt to prevent it from triggering
immediately when PCIe link to the downstream component goes down.
To make the above possible add optional ->runtime_suspend() and
->runtime_resume() callbacks to struct pcie_port_service_driver and call
them for each port service in runtime suspend/resume callbacks of portdrv.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
[bhelgaas: adjust "slot->state" for 5790a9c78e ("PCI: pciehp: Unify
controller and slot structs")]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Currently we try to keep PCIe ports runtime suspended over system suspend
if possible. This mostly happens when entering suspend-to-idle because
there is no need to re-configure wake settings.
This causes problems if the parent port goes into D3cold and it gets
resumed upon exit from system suspend. This may happen for example if the
port is part of PCIe switch and the same switch is connected to a PCIe
endpoint that needs to be resumed. The way exit from D3cold works according
PCIe 4.0 spec 5.3.1.4.2 is that power is restored and cold reset is
signaled. After this the device is in D0unitialized state keeping PME
context if it supports wake from D3cold.
The problem occurs when a PCIe hotplug port is left suspended and the
parent port goes into D3cold and back to D0: the port keeps its PME context
but since everything else is reset back to defaults (D0unitialized) it is
not set to detect hotplug events anymore.
For this reason change the PCIe portdrv power management logic so that it
is fine to keep the port runtime suspended over system suspend but it needs
to be resumed upon exit to make sure it gets properly re-initialized.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
PCIe native hotplug shares MSI vector with native PME so the interrupt
handler might get called even the hotplug interrupt is masked. In that case
we should not handle any events because the interrupt was not meant for us.
Modify the PCIe hotplug interrupt handler to check this accordingly and
bail out if it finds out that the interrupt was not about hotplug.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
When PCIe hotplug port is transitioned into D3hot, the link to the
downstream component will go down. If hotplug interrupt generation is
enabled when that happens, it will trigger immediately, waking up the
system and bringing the link back up.
To prevent this, disable hotplug interrupt generation when system suspend
is entered. This does not prevent wakeup from low power states according
to PCIe 4.0 spec section 6.7.3.4:
Software enables a hot-plug event to generate a wakeup event by
enabling software notification of the event as described in Section
6.7.3.1. Note that in order for software to disable interrupt generation
while keeping wakeup generation enabled, the Hot-Plug Interrupt Enable
bit must be cleared.
So as long as we have set the slot event mask accordingly, wakeup should
work even if slot interrupt is disabled. The port should trigger wake and
then send PME to the root port when the PCIe hierarchy is brought back up.
Limit this to systems using native PME mechanism to make sure older Apple
systems depending on commit e3354628c376 ("PCI: pciehp: Support interrupts
sent from D3hot") still continue working.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
We enable power management automatically for bridges where
pci_bridge_d3_possible() returns true. However, these bridges may have
ACPI methods such as _DSW that need to be called before D3 entry. For
example in Lenovo Thinkpad X1 Carbon 6th _DSW method is used to prepare
D3cold for the PCIe root port hosting Thunderbolt chain. Because wake is
not enabled _DSW method is never called and the port does not enter
D3cold properly consuming more power than necessary.
Users can work this around by writing "enabled" to "wakeup" sysfs file
under the device in question but that is not something an ordinary user
is expected to do.
Since we already automatically enable power management for PCIe ports
with ->bridge_d3 set extend that to enable wake for them as well,
assuming the port has any ACPI wakeup related objects implemented in the
namespace (adev->wakeup.flags.valid is true). This ensures the necessary
ACPI methods get called at appropriate times and allows the root port in
Thinkpad X1 Carbon 6th to go into D3cold.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit baecc470d5 ("PCI / PM: Skip bridges in pci_enable_wake()") changed
pci_enable_wake() so that all bridges are skipped when wakeup is enabled
(or disabled) with the reasoning that bridges can only signal wakeup on
behalf of their subordinate devices.
However, there are bridges that can signal wakeup themselves. For example
PCIe downstream and root ports supporting hotplug may signal wakeup upon
hotplug event.
For this reason change pci_enable_wake() so that it skips all bridges
except those that we power manage (->bridge_d3 is set). Those are the ones
that can go into low power states and may need to signal wakeup.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The spec has timing requirements when waiting for a link to become active
after a conventional reset. Implement those hard delays when waiting for
an active link so pciehp and dpc drivers don't need to duplicate this.
For devices that don't support data link layer active reporting, wait the
fixed time recommended by the PCIe spec.
Signed-off-by: Keith Busch <keith.busch@intel.com>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Sinan Kaya <okaya@kernel.org>
Bring surprise removals and permanent failures together so we no longer
need separate flags. The implementation enforces that error handling will
not be able to override a surprise removal's permanent channel failure.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Sinan Kaya <okaya@kernel.org>
A device still participates in error recovery even if it doesn't have
the error callbacks.
Always provide the status for user event watchers.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Sinan Kaya <okaya@kernel.org>
There is no point in having a generic broadcast function if it needs to
have special cases for each callback it broadcasts.
Abstract the error broadcast to only the necessary information and removes
the now unnecessary helper to walk the bus.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Sinan Kaya <okaya@kernel.org>
Going primarily by:
https://en.wikipedia.org/wiki/List_of_Intel_Atom_microprocessors
with additional information gleaned from other related pages; notably:
- Bonnell shrink was called Saltwell
- Moorefield is the Merriefield refresh which makes it Airmont
The general naming scheme is: FAM6_ATOM_UARCH_SOCTYPE
for i in `git grep -l FAM6_ATOM` ; do
sed -i -e 's/ATOM_PINEVIEW/ATOM_BONNELL/g' \
-e 's/ATOM_LINCROFT/ATOM_BONNELL_MID/' \
-e 's/ATOM_PENWELL/ATOM_SALTWELL_MID/g' \
-e 's/ATOM_CLOVERVIEW/ATOM_SALTWELL_TABLET/g' \
-e 's/ATOM_CEDARVIEW/ATOM_SALTWELL/g' \
-e 's/ATOM_SILVERMONT1/ATOM_SILVERMONT/g' \
-e 's/ATOM_SILVERMONT2/ATOM_SILVERMONT_X/g' \
-e 's/ATOM_MERRIFIELD/ATOM_SILVERMONT_MID/g' \
-e 's/ATOM_MOOREFIELD/ATOM_AIRMONT_MID/g' \
-e 's/ATOM_DENVERTON/ATOM_GOLDMONT_X/g' \
-e 's/ATOM_GEMINI_LAKE/ATOM_GOLDMONT_PLUS/g' ${i}
done
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: dave.hansen@linux.intel.com
Cc: len.brown@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Commit ee1604381a ("PCI: mvebu: Only remap I/O space if configured") had
the side effect that the PCI I/O mapping was created much earlier than
before, at a point where the probe() of the driver could still fail. This
is for example a problem if one gets an -EPROBE_DEFER at some point during
probe(), after pci_ioremap_io() has been called.
Indeed, there is currently no function to undo what pci_ioremap_io() did,
and switching to pci_remap_iospace() is not an option in pci-mvebu due to
the need for special memory attributes on Armada 38x.
Reverting ee1604381a ("PCI: mvebu: Only remap I/O space if configured")
would be a possibility, but it would require also reverting 42342073e3
("PCI: mvebu: Convert to use pci_host_bridge directly"). So instead, we use
an open-coded version of pci_host_probe() that creates the PCI I/O mapping
at a point where we are guaranteed not to fail anymore.
Fixes: ee1604381a ("PCI: mvebu: Only remap I/O space if configured")
Reported-by: Jan Kundrát <jan.kundrat@cesnet.cz>
Tested-by: Jan Kundrát <jan.kundrat@cesnet.cz>
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
The PCI kirin driver compilation produces the following section mismatch
warning:
WARNING: vmlinux.o(.text+0x4758cc): Section mismatch in reference from
the function kirin_pcie_probe() to the function
.init.text:kirin_add_pcie_port()
The function kirin_pcie_probe() references
the function __init kirin_add_pcie_port().
This is often because kirin_pcie_probe lacks a __init
annotation or the annotation of kirin_add_pcie_port is wrong.
Remove '__init' from kirin_add_pcie_port() to fix it.
Fixes: fc5165db24 ("PCI: kirin: Add HiSilicon Kirin SoC PCIe controller driver")
Reported-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
[lorenzo.pieralisi@arm.com: updated commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
This save some duplication for ia64, and makes the interface more
general. In the long run we want each dma_map_ops instance to fill this
out, but this will take a little more prep work.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
PCIe r4.0, sec 7.5.1.1.4 defines a new bit in the Status Register:
Immediate Readiness – This optional bit, when Set, indicates the Function
is guaranteed to be ready to successfully complete valid configuration
accesses at any time following any reset that the host is capable of
issuing Configuration Requests to this Function.
When this bit is Set, for accesses to this Function, software is exempt
from all requirements to delay configuration accesses following any type
of reset, including but not limited to the timing requirements defined in
Section 6.6.
This means that all delays after a Conventional or Function Reset can be
skipped.
This patch reads such bit and caches its value in a flag inside struct
pci_dev to be checked later if we should delay or can skip delays after a
reset. While at that, also move the explicit msleep(100) call from
pcie_flr() and pci_af_flr() to pci_dev_wait().
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
[bhelgaas: rename PCI_STATUS_IMMEDIATE to PCI_STATUS_IMM_READY]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Test the correct value to see whether the PHY get failed.
Use devm_phy_get() instead of devm_phy_optional_get(), since it is
only called if phy name is given in devicetree and so should exist.
If failure when getting or linking PHY, put any PHYs which were
already got and unlink them.
Fixes: dfb8053469 ("PCI: cadence: Add generic PHY support to host and EP drivers")
Reported-by: Colin King <colin.king@canonical.com>
Signed-off-by: Alan Douglas <adouglas@cadence.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
On 38+ Intel-based ASUS products, the NVIDIA GPU becomes unusable after S3
suspend/resume. The affected products include multiple generations of
NVIDIA GPUs and Intel SoCs. After resume, nouveau logs many errors such
as:
fifo: fault 00 [READ] at 0000005555555000 engine 00 [GR] client 04
[HUB/FE] reason 4a [] on channel -1 [007fa91000 unknown]
DRM: failed to idle channel 0 [DRM]
Similarly, the NVIDIA proprietary driver also fails after resume (black
screen, 100% CPU usage in Xorg process). We shipped a sample to NVIDIA for
diagnosis, and their response indicated that it's a problem with the parent
PCI bridge (on the Intel SoC), not the GPU.
Runtime suspend/resume works fine, only S3 suspend is affected.
We found a workaround: on resume, rewrite the Intel PCI bridge
'Prefetchable Base Upper 32 Bits' register (PCI_PREF_BASE_UPPER32). In the
cases that I checked, this register has value 0 and we just have to rewrite
that value.
Linux already saves and restores PCI config space during suspend/resume,
but this register was being skipped because upon resume, it already has
value 0 (the correct, pre-suspend value).
Intel appear to have previously acknowledged this behaviour and the
requirement to rewrite this register:
https://bugzilla.kernel.org/show_bug.cgi?id=116851#c23
Based on that, rewrite the prefetch register values even when that appears
unnecessary.
We have confirmed this solution on all the affected models we have in-hands
(X542UQ, UX533FD, X530UN, V272UN).
Additionally, this solves an issue where r8169 MSI-X interrupts were broken
after S3 suspend/resume on ASUS X441UAR. This issue was recently worked
around in commit 7bb05b85bc ("r8169: don't use MSI-X on RTL8106e"). It
also fixes the same issue on RTL6186evl/8111evl on an Aimfor-tech laptop
that we had not yet patched. I suspect it will also fix the issue that was
worked around in commit 7c53a72245 ("r8169: don't use MSI-X on
RTL8168g").
Thomas Martitz reports that this change also solves an issue where the AMD
Radeon Polaris 10 GPU on the HP Zbook 14u G5 is unresponsive after S3
suspend/resume.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=201069
Signed-off-by: Daniel Drake <drake@endlessm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-By: Peter Wu <peter@lekensteyn.nl>
CC: stable@vger.kernel.org
HP 6730b laptop has an ethernet NIC connected to one of the PCIe root
ports. The root ports themselves are native PCIe hotplug capable. Now,
during boot after PCI devices are scanned the BIOS triggers ACPI bus check
directly to the NIC:
ACPI: \_SB_.PCI0.RP06.NIC_: Bus check in hotplug_event()
It is not clear why it is sending bus check but regardless the ACPI hotplug
notify handler calls enable_slot() directly (instead of going through
acpiphp_check_bridge() as there is no bridge), which ends up handling
special case for non-hotplug bridges with native PCIe hotplug. This
results a crash of some kind but the reporter only sees black screen so it
is hard to figure out the exact spot and what actually happens. Based on
a few fix proposals it was tracked to crash somewhere inside
pci_assign_unassigned_bridge_resources().
In any case we should not really be in that special branch at all because
the ACPI notify happened to a slot that is not a PCI bridge (it is just a
regular PCI device).
Fix this so that we only go to that special branch if we are calling
enable_slot() for a bridge (e.g., the ACPI notification was for the
bridge).
Link: https://bugzilla.kernel.org/show_bug.cgi?id=201127
Fixes: 84c8b58ed3 ("ACPI / hotplug / PCI: Don't scan bridges managed by native hotplug")
Reported-by: Peter Anemone <peter.anemone@gmail.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
CC: stable@vger.kernel.org # v4.18+
If an Endpoint reported an error with ERR_FATAL, we previously ran driver
error recovery callbacks only for the Endpoint's driver. But if we reset a
Link to recover from the error, all downstream components are affected,
including the Endpoint, any multi-function peers, and children of those
peers.
Initiate the Link reset from the deepest Downstream Port that is
reliable, and call the error recovery callbacks for all its children.
If a Downstream Port (including a Root Port) reports an error, we assume
the Port itself is reliable and we need to reset its downstream Link. In
all other cases (Switch Upstream Ports, Endpoints, Bridges, etc), we assume
the Link leading to the component needs to be reset, so we initiate the
reset at the parent Downstream Port.
This allows two other clean-ups. First, we currently only use a Link
reset, which can only be initiated using a Downstream Port, so we can
remove checks for Endpoints. Second, the Downstream Port where we initiate
the Link reset is reliable (unlike components downstream from it), so the
special cases for error detect and resume are no longer necessary.
Signed-off-by: Keith Busch <keith.busch@intel.com>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Sinan Kaya <okaya@kernel.org>