Commit Graph

695 Commits

Author SHA1 Message Date
Bjorn Helgaas
7ecf13fd35 Merge branch 'pci/misc'
- Constify pcibus_class (Heiner Kallweit)

- Annotate pci_cache_line_size variables as __ro_after_init (Heiner
  Kallweit)

- Clean up formatting of PCI accessor macros (Ilpo Järvinen)

- Remove some OLPC dead code (Kunwu Chan)

- Make pcie_bandwidth_capable() static (Ilpo Järvinen)

* pci/misc:
  PCI: Make pcie_bandwidth_capable() static
  x86/pci: Remove OLPC dead code
  PCI: Clean up accessor macro formatting
  PCI/ERR: Cleanup misleading indentation inside if conditions
  PCI: Annotate pci_cache_line_size variables as __ro_after_init
  PCI: Constify pcibus_class
2024-05-16 18:14:14 -05:00
Bjorn Helgaas
ce4a9f1b1c Merge branch 'pci/enumeration'
- Clear bridge Secondary Status errors after enumeration since enumeration
  causes many errors (Vidya Sagar)

- Wait for Link Training==0 before starting Link retrain to avoid a race;
  this was done previously but broken by a faulty merge (Ilpo Järvinen)

- Rename PCI_IRQ_LEGACY to PCI_IRQ_INTX to be more specific about what
  "LEGACY" means (Damien Le Moal)

- Update return types of pci_find_capability() stubs to match the extern
  declarations for the actual implementations (Bjorn Helgaas)

- Drop unnecessary pci_enable_device_io() from pata_cs5520 (Heiner
  Kallweit)

- Drop unused pci_enable_device_io() (Heiner Kallweit)

- On 2016 and newer BIOSes, skip early E820 check for ECAM regions
  described in ACPI MCFG; there's no spec requirement for E820
  reservations, and some machines don't provide them (Bjorn Helgaas)

- If devices were disconnected while suspended, don't wait for them when
  resuming (Ilpo Järvinen)

* pci/enumeration:
  PCI: Do not wait for disconnected devices when resuming
  x86/pci: Skip early E820 check for ECAM region
  PCI: Remove unused pci_enable_device_io()
  ata: pata_cs5520: Remove unnecessary call to pci_enable_device_io()
  PCI: Update pci_find_capability() stub return types
  PCI: Remove PCI_IRQ_LEGACY
  scsi: vmw_pvscsi: Do not use PCI_IRQ_LEGACY instead of PCI_IRQ_LEGACY
  scsi: pmcraid: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  scsi: mpt3sas: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  scsi: megaraid_sas: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  scsi: ipr: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  scsi: hpsa: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  scsi: arcmsr: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  wifi: rtw89: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  wifi: rtw88: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  wifi: ath10k: Refer to INTX instead of LEGACY
  net: wangxun: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  r8169: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  net: alx: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  net: atlantic: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  net: amd-xgbe: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  VMCI: Use PCI_IRQ_ALL_TYPES to remove PCI_IRQ_LEGACY use
  RDMA/vmw_pvrdma: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  IB/qib: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  drm/amdgpu: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  mfd: intel-lpss: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  ntb: idt: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  platform/x86: intel_ips: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  tty: 8250_pci: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  usb: hcd-pci: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  ASoC: Intel: avs: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  Documentation: PCI: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  PCI/portdrv: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  PCI/MSI: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
  PCI: Clarify intent of LT wait
  PCI: Wait for Link Training==0 before starting Link retrain
  PCI: Clear Secondary Status errors after enumeration
2024-05-16 18:14:10 -05:00
Dave Jiang
7e89efc6e9 PCI: Lock upstream bridge for pci_reset_function()
Fix a long-standing locking gap for missing pci_cfg_access_lock() while
manipulating bridge reset registers and configuration during
pci_reset_bus_function().

If there is an upstream bridge, lock it before locking the device itself.
pci_dev_lock() calls pci_cfg_access_lock(), which blocks the writing of PCI
config space by user space.

Add lockdep assertion via pci_dev->cfg_access_lock to verify
pci_dev->block_cfg_access is set.

Co-developed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/20240502165851.1948523-3-dave.jiang@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
[bhelgaas: commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-05-08 13:19:20 -05:00
Vidya Sagar
7bf9d2af7e PCI: Clear Secondary Status errors after enumeration
We enumerate devices by attempting config reads to the Vendor ID of each
possible device.  On conventional PCI, if no device responds, the read
terminates with a Master Abort (PCI r3.0, sec 6.1).  On PCIe, the config
read is terminated as an Unsupported Request (PCIe r6.0, sec 2.3.2,
7.5.1.3.7).  In either case, if the read addressed a device below a bridge,
it is logged by setting "Received Master Abort" in the bridge Secondary
Status register.

Clear any errors logged in the Secondary Status register after enumeration.

Link: https://lore.kernel.org/r/20240116143258.483235-1-vidyas@nvidia.com
Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
[bhelgaas: simplify commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-04-23 16:08:17 -05:00
Heiner Kallweit
e30556bf68 PCI: Constify pcibus_class
Constify pcibus_class. All users take a const struct class * argument.

Link: https://lore.kernel.org/r/5e01f46f-266f-4fb3-be8a-8cb9e566cd75@gmail.com
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-04-17 10:44:17 -05:00
Bjorn Helgaas
b8de187056 Merge branch 'pci/sysfs'
- Compile pci-sysfs.c only if CONFIG_SYSFS=y, which reduces kernel size by
  ~120KB when it's disabled (Lukas Wunner)

- Remove obsolete pci_cleanup_rom() declaration (Lukas Wunner)

- Rework pci_dev_resource_resize_attr(n) macros to call a function instead
  of duplicating most of the body, which saves about 2.5KB of text (Ilpo
  Järvinen)

* pci/sysfs:
  PCI/sysfs: Demacrofy pci_dev_resource_resize_attr(n) functions
  PCI: Remove obsolete pci_cleanup_rom() declaration
  PCI/sysfs: Compile pci-sysfs.c only if CONFIG_SYSFS=y

# Conflicts:
#	drivers/pci/Makefile
2024-03-12 12:14:23 -05:00
David E. Box
17423360a2 PCI/ASPM: Save L1 PM Substates Capability for suspend/resume
4ff116d0d5 ("PCI/ASPM: Save L1 PM Substates Capability for
suspend/resume") restored the L1 PM Substates Capability after resume,
which reduced power consumption by making the ASPM L1.x states work after
resume.

a7152be79b ("Revert "PCI/ASPM: Save L1 PM Substates Capability for
suspend/resume"") reverted 4ff116d0d5 because resume failed on some
systems, so power consumption after resume increased again.

a7152be79b mentioned that we restore L1 PM substate configuration even
though ASPM L1 may already be enabled. This is due the fact that the
pci_restore_aspm_l1ss_state() was called before pci_restore_pcie_state().

Save and restore the L1 PM Substates Capability, following PCIe r6.1, sec
5.5.4 more closely by:

  1) Do not restore ASPM configuration in pci_restore_pcie_state() but
     do that after PCIe capability is restored in pci_restore_aspm_state()
     following PCIe r6.1, sec 5.5.4.

  2) If BIOS reenables L1SS, particularly L1.2, we need to clear the
     enables in the right order, downstream before upstream. Defer
     restoring the L1SS config until we are at the downstream component.
     Then update the config for both ends of the link in the prescribed
     order.

  3) Program ASPM L1 PM substate configuration before L1 enables.

  4) Program ASPM L1 PM substate enables last, after rest of the fields
     in the capability are programmed.

[bhelgaas: commit log, squash L1SS-related patches, do both LNKCTL restores
in pci_restore_pcie_state()]

Link: https://lore.kernel.org/r/20240128233212.1139663-3-david.e.box@linux.intel.com
Link: https://lore.kernel.org/r/20240128233212.1139663-4-david.e.box@linux.intel.com
Link: https://lore.kernel.org/r/20240223205851.114931-5-helgaas@kernel.org
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217321
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216782
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216877
Co-developed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Co-developed-by: David E. Box <david.e.box@linux.intel.com>
Reported-by: Koba Ko <koba.ko@canonical.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: David E. Box <david.e.box@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Tasev Nikola <tasev.stefanoska@skynet.be> # Asus UX305FA
Cc: Mark Enriquez <enriquezmark36@gmail.com>
Cc: Thomas Witt <kernel@witt.link>
Cc: Werner Sembach <wse@tuxedocomputers.com>
Cc: Vidya Sagar <vidyas@nvidia.com>
2024-03-12 11:53:45 -05:00
David E. Box
fa84f4435a PCI/ASPM: Move pci_configure_ltr() to aspm.c
The Latency Tolerance Reporting (LTR) mechanism supports the ASPM L1.2
state and is only configured when CONFIG_PCIEASPM is set.

Move pci_configure_ltr() and pci_bridge_reconfigure_ltr() into aspm.c since
they only build when CONFIG_PCIEASPM is set.  No functional change
intended.

Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20240128233212.1139663-2-david.e.box@linux.intel.com
[bhelgaas: commit log, split build change from function moves]
Link: https://lore.kernel.org/r/20240223205851.114931-2-helgaas@kernel.org
Signed-off-by: David E. Box <david.e.box@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-03-07 14:29:08 -06:00
Lukas Wunner
be9c3a4c8b PCI/sysfs: Compile pci-sysfs.c only if CONFIG_SYSFS=y
It is possible to enable CONFIG_PCI but disable CONFIG_SYSFS and for
space-constrained devices such as routers, such a configuration may
actually make sense.

However pci-sysfs.c is compiled even if CONFIG_SYSFS is disabled,
unnecessarily increasing the kernel's size.

To rectify that:

* Move pci_mmap_fits() to mmap.c.  It is not only needed by
  pci-sysfs.c, but also proc.c.

* Move pci_dev_type to probe.c and make it private.  It references
  pci_dev_attr_groups in pci-sysfs.c.  Make that public instead for
  consistency with pci_dev_groups, pcibus_groups and pci_bus_groups,
  which are likewise public and referenced by struct definitions in
  pci-driver.c and probe.c.

* Define pci_dev_groups, pci_dev_attr_groups, pcibus_groups and
  pci_bus_groups to NULL if CONFIG_SYSFS is disabled.  Provide empty
  static inlines for pci_{create,remove}_legacy_files() and
  pci_{create,remove}_sysfs_dev_files().

Result:

vmlinux size is reduced by 122996 bytes in my arm 32-bit test build.

Link: https://lore.kernel.org/r/85ca95ae8e4d57ccf082c5c069b8b21eb141846e.1698668982.git.lukas@wunner.de
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
2024-03-05 16:08:43 -06:00
Bjorn Helgaas
95140c2fbf PCI: Log bridge info when first enumerating bridge
Log bridge secondary/subordinate bus and window information at the same
time we log the bridge BARs, just after discovering the bridge and before
scanning the bridge's secondary bus.  This logs the bridge and downstream
devices in a more logical order:

  - pci 0000:00:01.0: [8086:1901] type 01 class 0x060400
  - pci 0000:01:00.0: [10de:13b6] type 00 class 0x030200
  - pci 0000:01:00.0: reg 0x10: [mem 0xec000000-0xecffffff]
  - pci 0000:00:01.0: PCI bridge to [bus 01]
  - pci 0000:00:01.0:   bridge window [io  0xe000-0xefff]

  + pci 0000:00:01.0: [8086:1901] type 01 class 0x060400
  + pci 0000:00:01.0: PCI bridge to [bus 01]
  + pci 0000:00:01.0:   bridge window [io  0xe000-0xefff]
  + pci 0000:01:00.0: [10de:13b6] type 00 class 0x030200
  + pci 0000:01:00.0: reg 0x10: [mem 0xec000000-0xecffffff]

Note that we read the windows into a temporary struct resource that is
thrown away, not into the resources in the struct pci_bus.

The windows may be adjusted after we know what downstream devices require,
and those adjustments are logged as they are made.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-12-15 17:28:43 -06:00
Bjorn Helgaas
63c6ebb294 PCI: Log bridge windows conditionally
Previously pci_read_bridge_io(), pci_read_bridge_mmio(), and
pci_read_bridge_mmio_pref() unconditionally logged the bridge window
resource.  A future change will call these functions earlier and more
often.  Add a "log" parameter so callers can control whether to generate
the log message.  No functional change intended.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-12-15 17:28:43 -06:00
Bjorn Helgaas
281e1f137a PCI: Supply bridge device, not secondary bus, to read window details
Previously we logged information about devices *below* the bridge before
logging information about the bridge itself, e.g.,

  pci 0000:00:01.0: [8086:1901] type 01 class 0x060400
  pci 0000:01:00.0: [10de:13b6] type 00 class 0x030200
  pci 0000:01:00.0: reg 0x10: [mem 0xec000000-0xecffffff]
  pci 0000:00:01.0: PCI bridge to [bus 01]
  pci 0000:00:01.0:   bridge window [io  0xe000-0xefff]

This is partly because the bridge windows are read in this path:

  pci_scan_child_bus_extend
    for (devfn = 0; devfn < 256; devfn += 8)
      pci_scan_slot(bus, devfn)       # scan below bridge
    pcibios_fixup_bus(bus)
      pci_read_bridge_bases(bus)      # read bridge windows
        pci_read_bridge_io(bus)

Remove the assumption that the secondary (child) pci_bus already exists by
passing in the bridge device (instead of the pci_bus) and a resource
pointer when reading bridge windows.  A future change can use this to log
the bridge details before we enumerate the devices below the bridge.

No functional change intended.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-12-15 17:28:43 -06:00
Bjorn Helgaas
6f32099a91 PCI: Move pci_read_bridge_windows() below individual window accessors
Move pci_read_bridge_windows() below the functions that read the I/O,
memory, and prefetchable memory windows, so pci_read_bridge_windows() can
use them in the future.  No functional change intended.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-12-15 17:28:43 -06:00
Puranjay Mohan
dc4e6f21c3 PCI: Use resource names in PCI log messages
Use the pci_resource_name() to get the name of the resource and use it
while printing log messages.

[bhelgaas: rename to match struct resource * names, also use names in other
BAR messages]
Link: https://lore.kernel.org/r/20211106112606.192563-3-puranjay12@gmail.com
Signed-off-by: Puranjay Mohan <puranjay12@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-12-15 17:28:42 -06:00
Bjorn Helgaas
35259ff188 PCI: Log device type during enumeration
Log the device type when enumeration a device.  Sample output changes:

  - pci 0000:00:00.0: [8086:1237] type 00 class 0x060000
  + pci 0000:00:00.0: [8086:1237] type 00 class 0x060000 conventional PCI endpoint

  - pci 0000:00:1c.0: [8086:a110] type 01 class 0x060400
  + pci 0000:00:1c.0: [8086:a110] type 01 class 0x060400 PCIe Root Port

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-12-15 17:26:21 -06:00
Bjorn Helgaas
5897c17402 Merge branch 'pci/field-get'
- Use FIELD_GET()/FIELD_PREP() when possible throughout drivers/pci/ (Ilpo
  Järvinen, Bjorn Helgaas)

- Rework DPC control programming for clarity (Ilpo Järvinen)

* pci/field-get:
  PCI/portdrv: Use FIELD_GET()
  PCI/VC: Use FIELD_GET()
  PCI/PTM: Use FIELD_GET()
  PCI/PME: Use FIELD_GET()
  PCI/ATS: Use FIELD_GET()
  PCI/ATS: Show PASID Capability register width in bitmasks
  PCI: Use FIELD_GET() in Sapphire RX 5600 XT Pulse quirk
  PCI: Use FIELD_GET()
  PCI/MSI: Use FIELD_GET/PREP()
  PCI/DPC: Use defines with DPC reason fields
  PCI/DPC: Use defined fields with DPC_CTL register
  PCI/DPC: Use FIELD_GET()
  PCI: hotplug: Use FIELD_GET/PREP()
  PCI: dwc: Use FIELD_GET/PREP()
  PCI: cadence: Use FIELD_GET()
  PCI: Use FIELD_GET() to extract Link Width
  PCI: mvebu: Use FIELD_PREP() with Link Width
  PCI: tegra194: Use FIELD_GET()/FIELD_PREP() with Link Width fields

# Conflicts:
#	drivers/pci/controller/dwc/pcie-tegra194.c
2023-10-28 13:31:05 -05:00
Bjorn Helgaas
e0f0a16f5f PCI: Use FIELD_GET()
Use FIELD_GET() and FIELD_PREP() to remove dependences on the field
position, i.e., the shift value.  No functional change intended.

Link: https://lore.kernel.org/r/20231010204436.1000644-2-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
2023-10-24 10:54:04 -05:00
Ilpo Järvinen
d15f18053e PCI: Do error check on own line to split long "if" conditions
Placing PCI error code check inside "if" condition usually results in need
to split lines. Combined with additional conditions the "if" condition
becomes messy.

Convert to the usual error handling pattern with an additional variable to
improve code readability. In addition, reverse the logic in
pci_find_vsec_capability() to get rid of &&.

No functional changes intended.

Link: https://lore.kernel.org/r/20230911125354.25501-5-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
[bhelgaas: PCI_POSSIBLE_ERROR()]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-10-10 16:40:18 -05:00
Ross Lagerwall
8ec9c1d5d0 PCI: Free released resource after coalescing
release_resource() doesn't actually free the resource or resource list
entry so free the resource list entry to avoid a leak.

Closes: https://lore.kernel.org/r/878r9sga1t.fsf@kernel.org/
Fixes: e54223275b ("PCI: Release resource invalidated by coalescing")
Link: https://lore.kernel.org/r/20230906110846.225369-1-ross.lagerwall@citrix.com
Reported-by: Kalle Valo <kvalo@kernel.org>
Tested-by: Kalle Valo <kvalo@kernel.org>
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org      # v5.16+
2023-09-06 12:19:29 -05:00
Bjorn Helgaas
43cc31da91 Merge branch 'pci/misc'
- Reorder struct pci_dev to avoid holes and reduce size (Christophe
  JAILLET)

- Change pdev->rom_attr_enabled to single bit since it's only a boolean
  value (Christophe JAILLET)

- Use struct_size() in pirq_convert_irt_table() instead of hand-writing it
  (Christophe JAILLET)

- Explicitly include correct DT includes to untangle headers (Rob Herring)

- Fix a DOE race between destroy_work_on_stack() and the stack-allocated
  task->work struct going out of scope in pci_doe() (Ira Weiny)

- Use pci_dev_id() when possible instead of manually composing ID from
  dev->bus->number and dev->devfn (Xiongfeng Wang, Zheng Zengkai)

- Move pci_create_resource_files() declarations to linux/pci.h for alpha
  build warnings (Arnd Bergmann)

- Remove unused hotplug function declarations (Yue Haibing)

- Remove unused mvebu struct mvebu_pcie.busn (Pali Rohár)

- Unexport pcie_port_bus_type (Bjorn Helgaas)

- Remove unnecessary sysfs ID local variable initialization (Bjorn Helgaas)

- Fix BAR value printk formatting to accommodate 32-bit values (Bjorn
  Helgaas)

- Use consistent pointer types for config access syscall get_user() and
  put_user() uses (Bjorn Helgaas)

- Simplify AER_RECOVER_RING_SIZE definition (Bjorn Helgaas)

- Simplify pci_pio_to_address() (Bjorn Helgaas)

- Simplify pci_dev_driver() (Bjorn Helgaas)

- Fix pci_bus_resetable(), pci_slot_resetable() name typos (Bjorn Helgaas)

- Fix code and doc typos and code formatting (Bjorn Helgaas)

- Tidy config space save/restore messages (Bjorn Helgaas)

* pci/misc:
  PCI: Tidy config space save/restore messages
  PCI: Fix code formatting inconsistencies
  PCI: Fix typos in docs and comments
  PCI: Fix pci_bus_resetable(), pci_slot_resetable() name typos
  PCI: Simplify pci_dev_driver()
  PCI: Simplify pci_pio_to_address()
  PCI/AER: Simplify AER_RECOVER_RING_SIZE definition
  PCI: Use consistent put_user() pointer types
  PCI: Fix printk field formatting
  PCI: Remove unnecessary initializations
  PCI: Unexport pcie_port_bus_type
  PCI: mvebu: Remove unused busn member
  PCI: Remove unused function declarations
  PCI/sysfs: Move declarations to linux/pci.h
  PCI/P2PDMA: Use pci_dev_id() to simplify the code
  PCI/IOV: Use pci_dev_id() to simplify the code
  PCI/AER: Use pci_dev_id() to simplify the code
  PCI: apple: Use pci_dev_id() to simplify the code
  PCI/DOE: Fix destroy_work_on_stack() race
  PCI: Explicitly include correct DT includes
  x86/PCI: Use struct_size() in pirq_convert_irt_table()
  PCI: Change pdev->rom_attr_enabled to single bit
  PCI: Reorder pci_dev fields to reduce holes
2023-08-29 11:03:57 -05:00
Bjorn Helgaas
86b4ad7d67 PCI: Fix typos in docs and comments
Fix typos in docs and comments.

Link: https://lore.kernel.org/r/20230824193712.542167-11-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2023-08-25 08:15:38 -05:00
Ilpo Järvinen
5e70d0acf0 PCI: Add locking to RMW PCI Express Capability Register accessors
Many places in the kernel write the Link Control and Root Control PCI
Express Capability Registers without proper concurrency control and this
could result in losing the changes one of the writers intended to make.

Add pcie_cap_lock spinlock into the struct pci_dev and use it to protect
bit changes made in the RMW capability accessors. Protect only a selected
set of registers by differentiating the RMW accessor internally to
locked/unlocked variants using a wrapper which has the same signature as
pcie_capability_clear_and_set_word(). As the Capability Register (pos)
given to the wrapper is always a constant, the compiler should be able to
simplify all the dead-code away.

So far only the Link Control Register (ASPM, hotplug, link retraining,
various drivers) and the Root Control Register (AER & PME) seem to
require RMW locking.

Suggested-by: Lukas Wunner <lukas@wunner.de>
Fixes: c7f486567c ("PCI PM: PCIe PME root port service driver")
Fixes: f12eb72a26 ("PCI/ASPM: Use PCI Express Capability accessors")
Fixes: 7d715a6c1a ("PCI: add PCI Express ASPM support")
Fixes: affa48de84 ("staging/rdma/hfi1: Add support for enabling/disabling PCIe ASPM")
Fixes: 849a9366cb ("misc: rtsx: Add support new chip rts5228 mmc: rtsx: Add support MMC_CAP2_NO_MMC")
Fixes: 3d1e7aa80d ("misc: rtsx: Use pcie_capability_clear_and_set_word() for PCI_EXP_LNKCTL")
Fixes: c0e5f4e73a ("misc: rtsx: Add support for RTS5261")
Fixes: 3df4fce739 ("misc: rtsx: separate aspm mode into MODE_REG and MODE_CFG")
Fixes: 121e9c6b5c ("misc: rtsx: modify and fix init_hw function")
Fixes: 19f3bd548f ("mfd: rtsx: Remove LCTLR defination")
Fixes: 773ccdfd9c ("mfd: rtsx: Read vendor setting from config space")
Fixes: 8275b77a15 ("mfd: rts5249: Add support for RTS5250S power saving")
Fixes: 5da4e04ae4 ("misc: rtsx: Add support for RTS5260")
Fixes: 0f49bfbd0f ("tg3: Use PCI Express Capability accessors")
Fixes: 5e7dfd0fb9 ("tg3: Prevent corruption at 10 / 100Mbps w CLKREQ")
Fixes: b726e493e8 ("r8169: sync existing 8168 device hardware start sequences with vendor driver")
Fixes: e6de30d63e ("r8169: more 8168dp support.")
Fixes: 8a06127602 ("Bluetooth: hci_bcm4377: Add new driver for BCM4377 PCIe boards")
Fixes: 6f461f6c7c ("e1000e: enable/disable ASPM L0s and L1 and ERT according to hardware errata")
Fixes: 1eae4eb2a1 ("e1000e: Disable L1 ASPM power savings for 82573 mobile variants")
Fixes: 8060e169e0 ("ath9k: Enable extended synch for AR9485 to fix L0s recovery issue")
Fixes: 69ce674bfa ("ath9k: do btcoex ASPM disabling at initialization time")
Fixes: f37f055035 ("mt76: mt76x2e: disable pcie_aspm by default")
Link: https://lore.kernel.org/r/20230717120503.15276-2-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: "Rafael J. Wysocki" <rafael@kernel.org>
2023-08-10 11:12:09 -05:00
Rob Herring
c925cfaf09 PCI: Explicitly include correct DT includes
The DT of_device.h and of_platform.h date back to the separate
of_platform_bus_type before it as merged into the regular platform bus.  As
part of that merge prepping Arm DT support 13 years ago, they "temporarily"
include each other. They also include platform_device.h and of.h. As a
result, there's a pretty much random mix of those include files used
throughout the tree. In order to detangle these headers and replace the
implicit includes with struct declarations, users need to explicitly
include the correct includes.

Link: https://lore.kernel.org/r/20230714174827.4061572-1-robh@kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-07-18 14:33:17 -05:00
Bjorn Helgaas
d0b7b3a422 Merge branch 'pci/resource'
- When we coalesce host bridge windows, remove invalidated resources from
  the resource tree so future allocations work correctly (Ross Lagerwall)

* pci/resource:
  PCI: Release resource invalidated by coalescing
2023-06-26 12:59:57 -05:00
Maciej W. Rozycki
a89c82249c PCI: Work around PCIe link training failures
Attempt to handle cases such as with a downstream port of the ASMedia
ASM2824 PCIe switch where link training never completes and the link
continues switching between speeds indefinitely with the data link layer
never reaching the active state.

It has been observed with a downstream port of the ASMedia ASM2824 Gen 3
switch wired to the upstream port of the Pericom PI7C9X2G304 Gen 2 switch,
using a Delock Riser Card PCI Express x1 > 2 x PCIe x1 device, P/N 41433,
wired to a SiFive HiFive Unmatched board.  In this setup the switches
should negotiate a link speed of 5.0GT/s, falling back to 2.5GT/s if
necessary.

Instead the link continues oscillating between the two speeds, at the rate
of 34-35 times per second, with link training reported repeatedly active
~84% of the time.  Limiting the target link speed to 2.5GT/s with the
upstream ASM2824 device makes the two switches communicate correctly.
Removing the speed restriction afterwards makes the two devices switch to
5.0GT/s then.

Make use of these observations and detect the inability to train the link
by checking for the Data Link Layer Link Active status bit being off while
the Link Bandwidth Management Status indicating that hardware has changed
the link speed or width in an attempt to correct unreliable link operation.

Restrict the speed to 2.5GT/s then with the Target Link Speed field,
request a retrain and wait 200ms for the data link to go up.  If this is
successful, lift the restriction, letting the devices negotiate a higher
speed.

Also check for a 2.5GT/s speed restriction the firmware may have already
arranged and lift it too with ports of devices known to continue working
afterwards (currently only ASM2824), that already report their data link
being up.

[bhelgaas: reorder and squash stubs from
https://lore.kernel.org/r/alpine.DEB.2.21.2306111619570.64925@angie.orcam.me.uk
to avoid adding stubs that do nothing]
Link: https://lore.kernel.org/r/alpine.DEB.2.21.2203022037020.56670@angie.orcam.me.uk/
Link: https://source.denx.de/u-boot/u-boot/-/commit/a398a51ccc68
Link: https://lore.kernel.org/r/alpine.DEB.2.21.2305310038540.59226@angie.orcam.me.uk
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-06-20 10:58:53 -05:00
Maciej W. Rozycki
42adbdc74c PCI: Initialize dev->link_active_reporting earlier
Determine whether Data Link Layer Link Active Reporting is available before
calling any fixups so that the cached value can be used there and later on.

[bhelgaas: move to set_pcie_port_type() where other PCIe init is done]
Link: https://lore.kernel.org/r/alpine.DEB.2.21.2305310122210.59226@angie.orcam.me.uk
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-06-14 17:58:12 -05:00
Ross Lagerwall
e54223275b PCI: Release resource invalidated by coalescing
When contiguous windows are coalesced by pci_register_host_bridge(), the
second resource is expanded to include the first, and the first is
invalidated and consequently not added to the bus. However, it remains in
the resource hierarchy.  For example, these windows:

  fec00000-fec7ffff : PCI Bus 0000:00
  fec80000-fecbffff : PCI Bus 0000:00

are coalesced into this, where the first resource remains in the tree with
start/end zeroed out:

  00000000-00000000 : PCI Bus 0000:00
  fec00000-fecbffff : PCI Bus 0000:00

In some cases (e.g. the Xen scratch region), this causes future calls to
allocate_resource() to choose an inappropriate location which the caller
cannot handle.

Fix by releasing the zeroed-out resource and removing it from the resource
hierarchy.

[bhelgaas: commit log]
Fixes: 7c3855c423 ("PCI: Coalesce host bridge contiguous apertures")
Link: https://lore.kernel.org/r/20230525153248.712779-1-ross.lagerwall@citrix.com
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org	# v5.16+
2023-06-09 15:06:16 -05:00
Linus Torvalds
7acc137211 cxl for v6.4
- Refactor the DOE infrastructure (Data Object Exchange PCI-config-cycle
   mailbox) to be a facility of the PCI core rather than the CXL core.
   This is foundational for upcoming support for PCI device-attestation and
   PCIe / CXL link encryption.
 
 - Add support for retrieving and injecting poison for CXL memory
   expanders. This enabling uses trace-events to convey CXL media error
   records to user tooling. It includes translation of device-local
   addresses (DPA) to system physical addresses (SPA) and their
   corresponding CXL region.
 
 - Fixes for decoder enumeration that missed v6.3-final
 
 - Miscellaneous fixups
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQSbo+XnGs+rwLz9XGXfioYZHlFsZwUCZE2nNwAKCRDfioYZHlFs
 Z5c2AQCTWebok6CD+HN01xnIx+CBWAUQe0QIGR40dT2P6/WGEgEA8wMae0w/FDlc
 lQDvSoIyPvy1hGO7Ppb0K2AT6jrQAgU=
 =blcC
 -----END PGP SIGNATURE-----

Merge tag 'cxl-for-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull compute express link updates from Dan Williams:
 "DOE support is promoted from drivers/cxl/ to drivers/pci/ with Bjorn's
  blessing, and the CXL core continues to mature its media management
  capabilities with support for listing and injecting media errors. Some
  late fixes that missed v6.3-final are also included:

   - Refactor the DOE infrastructure (Data Object Exchange
     PCI-config-cycle mailbox) to be a facility of the PCI core rather
     than the CXL core.

     This is foundational for upcoming support for PCI
     device-attestation and PCIe / CXL link encryption.

   - Add support for retrieving and injecting poison for CXL memory
     expanders.

     This enabling uses trace-events to convey CXL media error records
     to user tooling. It includes translation of device-local addresses
     (DPA) to system physical addresses (SPA) and their corresponding
     CXL region.

   - Fixes for decoder enumeration that missed v6.3-final

   - Miscellaneous fixups"

* tag 'cxl-for-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (38 commits)
  cxl/test: Add mock test for set_timestamp
  cxl/mbox: Update CMD_RC_TABLE
  tools/testing/cxl: Require CONFIG_DEBUG_FS
  tools/testing/cxl: Add a sysfs attr to test poison inject limits
  tools/testing/cxl: Use injected poison for get poison list
  tools/testing/cxl: Mock the Clear Poison mailbox command
  tools/testing/cxl: Mock the Inject Poison mailbox command
  cxl/mem: Add debugfs attributes for poison inject and clear
  cxl/memdev: Trace inject and clear poison as cxl_poison events
  cxl/memdev: Warn of poison inject or clear to a mapped region
  cxl/memdev: Add support for the Clear Poison mailbox command
  cxl/memdev: Add support for the Inject Poison mailbox command
  tools/testing/cxl: Mock support for Get Poison List
  cxl/trace: Add an HPA to cxl_poison trace events
  cxl/region: Provide region info to the cxl_poison trace event
  cxl/memdev: Add trigger_poison_list sysfs attribute
  cxl/trace: Add TRACE support for CXL media-error records
  cxl/mbox: Add GET_POISON_LIST mailbox command
  cxl/mbox: Initialize the poison state
  cxl/mbox: Restrict poison cmds to debugfs cxl_raw_allow_all
  ...
2023-04-30 11:51:51 -07:00
Linus Torvalds
34b62f186d pci-v6.4-changes
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmRIKooUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vxq7A/9G0sInrqvqH2I9/Set/FnmMfCtGDH
 YcEjHYYxL+pztSiXTavDV+ib9iaut83oYtcV9p1bUMhJoZdKNZhrNdIGzRFSemI4
 0/ShtklPzNEu6nPPL24CnEzgbrODBU56ZvzrIE/tShEoOjkKa1triBnOA/JMxYTL
 cUwqDQlDkdpYniCgxy05QfcFZ0mmSOkbl7runGfTMTiUKKC3xSRiaW5YN9KZe3i7
 G5YHu1VVCjeQdQSICHYwyFmkyiqosCoajQNp1IHBkWqSwilzyZMg0NWJobVSA7M/
 mXXnzLtFcC60oT58/9MaggQwDTaSGDE8mG+sWv05bB2u5TQVyZEZqZ4c2FzmZIZT
 WLZYLB6PFRW0zePEuMnVkSLS2npkX+aGaBv28bf88sjorpaYNG01uYijnLEceolQ
 yBPFRN3bsRuOyHvYY/tiZX/BP7z/DS++XXwA8zQWZnYsXSlncJdwCNquV0xIwUt+
 hij4/Yu7o9SgV1LbuwtkMFAn3C9Szc65Eer+IvRRdnMZYphjVHbA5F2msRFyiCeR
 HxECtMQ1jBnVrpQAcBX1Sz+Vu5MrwCqzc2n6tvTQHDvVNjXfkG3NaFhxYPc1IL9Z
 NJMeCKfK1qzw7TtbvWXCluTTIM9N/bNJXrJhQbjNY7V6IaBZY1QNYW0ZFfGgj6Gb
 UUPgndidRy4/hzw=
 =HPXl
 -----END PGP SIGNATURE-----

Merge tag 'pci-v6.4-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci

Pull pci updates from Bjorn Helgaas:
 "Resource management:

   - Add pci_dev_for_each_resource() and pci_bus_for_each_resource()
     iterators

  PCIe native device hotplug:

   - Fix AB-BA deadlock between reset_lock and device_lock

  Power management:

   - Wait longer for devices to become ready after resume (as we do for
     reset) to accommodate Intel Titan Ridge xHCI devices

   - Extend D3hot delay for NVIDIA HDA controllers to avoid
     unrecoverable devices after a bus reset

  Error handling:

   - Clear PCIe Device Status after EDR since generic error recovery now
     only clears it when AER is native

  ASPM:

   - Work around Chromebook firmware defect that clobbers Capability
     list (including ASPM L1 PM Substates Cap) when returning from
     D3cold to D0

  Freescale i.MX6 PCIe controller driver:

   - Install imprecise external abort handler only when DT indicates
     PCIe support

  Freescale Layerscape PCIe controller driver:

   - Add ls1028a endpoint mode support

  Qualcomm PCIe controller driver:

   - Add SM8550 DT binding and driver support

   - Add SDX55 DT binding and driver support

   - Use bulk APIs for clocks of IP 1.0.0, 2.3.2, 2.3.3

   - Use bulk APIs for reset of IP 2.1.0, 2.3.3, 2.4.0

   - Add DT "mhi" register region for supported SoCs

   - Expose link transition counts via debugfs to help debug low power
     issues

   - Support system suspend and resume; reduce interconnect bandwidth
     and turn off clock and PHY if there are no active devices

   - Enable async probe by default to reduce boot time

  Miscellaneous:

   - Sort controller Kconfig entries by vendor"

* tag 'pci-v6.4-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (56 commits)
  PCI: xilinx: Drop obsolete dependency on COMPILE_TEST
  PCI: mobiveil: Sort Kconfig entries by vendor
  PCI: dwc: Sort Kconfig entries by vendor
  PCI: Sort controller Kconfig entries by vendor
  PCI: Use consistent controller Kconfig menu entry language
  PCI: xilinx-nwl: Add 'Xilinx' to Kconfig prompt
  PCI: hv: Add 'Microsoft' to Kconfig prompt
  PCI: meson: Add 'Amlogic' to Kconfig prompt
  PCI: Use of_property_present() for testing DT property presence
  PCI/PM: Extend D3hot delay for NVIDIA HDA controllers
  dt-bindings: PCI: qcom: Document msi-map and msi-map-mask properties
  PCI: qcom: Add SM8550 PCIe support
  dt-bindings: PCI: qcom: Add SM8550 compatible
  PCI: qcom: Add support for SDX55 SoC
  dt-bindings: PCI: qcom-ep: Fix the unit address used in example
  dt-bindings: PCI: qcom: Add SDX55 SoC
  dt-bindings: PCI: qcom: Update maintainers entry
  PCI: qcom: Enable async probe by default
  PCI: qcom: Add support for system suspend and resume
  PCI/PM: Drop pci_bridge_wait_for_secondary_bus() timeout parameter
  ...
2023-04-27 10:45:30 -07:00
Rob Herring
0d21e71a91 PCI: Restrict device disabled status check to DT
Commit 6fffbc7ae1 ("PCI: Honor firmware's device disabled status")
checked the firmware device status for both DT and ACPI devices. That
caused a regression in some ACPI systems. The exact reason isn't clear.
It's possibly a firmware bug. For now, at least, refactor the check to
be for DT based systems only.

Note that the original implementation leaked a refcount which is now
correctly handled.

[bhelgaas: Per ACPI r6.5, sec 6.3.7, for devices on an enumerable bus, _STA
must return with bit[0] ("device is present") set]

Link: https://lore.kernel.org/all/m2fs9lgndw.fsf@gmail.com/
Fixes: 6fffbc7ae1 ("PCI: Honor firmware's device disabled status")
Link: https://lore.kernel.org/r/20230419193513.708818-1-robh@kernel.org
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217317
Reported-by: Donald Hunter <donald.hunter@gmail.com>
Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Tested-by: Donald Hunter <donald.hunter@gmail.com>
Tested-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Binbin Zhou <zhoubinbin@loongson.cn>
Cc: Liu Peibao <liupeibao@loongson.cn>
Cc: Huacai Chen <chenhuacai@loongson.cn>
2023-04-20 13:30:14 -05:00
Lukas Wunner
ac04840350 PCI/DOE: Create mailboxes on device enumeration
Currently a DOE instance cannot be shared by multiple drivers because
each driver creates its own pci_doe_mb struct for a given DOE instance.
For the same reason a DOE instance cannot be shared between the PCI core
and a driver.

Moreover, finding out which protocols a DOE instance supports requires
creating a pci_doe_mb for it.  If a device has multiple DOE instances,
a driver looking for a specific protocol may need to create a pci_doe_mb
for each of the device's DOE instances and then destroy those which
do not support the desired protocol.  That's obviously an inefficient
way to do things.

Overcome these issues by creating mailboxes in the PCI core on device
enumeration.

Provide a pci_find_doe_mailbox() API call to allow drivers to get a
pci_doe_mb for a given (pci_dev, vendor, protocol) triple.  This API is
modeled after pci_find_capability() and can later be amended with a
pci_find_next_doe_mailbox() call to iterate over all mailboxes of a
given pci_dev which support a specific protocol.

On removal, destroy the mailboxes in pci_destroy_dev(), after the driver
is unbound.  This allows drivers to use DOE in their ->remove() hook.

On surprise removal, cancel ongoing DOE exchanges and prevent new ones
from being scheduled.  Thereby ensure that a hot-removed device doesn't
needlessly wait for a running exchange to time out.

Tested-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Ming Li <ming4.li@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/40a6f973f72ef283d79dd55e7e6fddc7481199af.1678543498.git.lukas@wunner.de
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-04-18 10:36:58 -07:00
Andy Shevchenko
02992064bd PCI: Make pci_bus_for_each_resource() index optional
Refactor pci_bus_for_each_resource() in the same way as
pci_dev_for_each_resource(). This allows the index to be hidden inside the
implementation so the caller can omit it when it's not used otherwise.

No functional changes intended.

Link: https://lore.kernel.org/r/20230330162434.35055-6-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Krzysztof Wilczyński <kw@linux.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2023-04-05 15:10:09 -05:00
Linus Torvalds
7c3dc440b1 cxl for v6.3
- CXL RAM region enumeration: instantiate 'struct cxl_region' objects
   for platform firmware created memory regions
 
 - CXL RAM region provisioning: complement the existing PMEM region
   creation support with RAM region support
 
 - "Soft Reservation" policy change: Online (memory hot-add)
   soft-reserved memory (EFI_MEMORY_SP) by default, but still allow for
   setting aside such memory for dedicated access via device-dax.
 
 - CXL Events and Interrupts: Takeover CXL event handling from
   platform-firmware (ACPI calls this CXL Memory Error Reporting) and
   export CXL Events via Linux Trace Events.
 
 - Convey CXL _OSC results to drivers: Similar to PCI, let the CXL
   subsystem interrogate the result of CXL _OSC negotiation.
 
 - Emulate CXL DVSEC Range Registers as "decoders": Allow for
   first-generation devices that pre-date the definition of the CXL HDM
   Decoder Capability to translate the CXL DVSEC Range Registers into
   'struct cxl_decoder' objects.
 
 - Set timestamp: Per spec, set the device timestamp in case of hotplug,
   or if platform-firwmare failed to set it.
 
 - General fixups: linux-next build issues, non-urgent fixes for
   pre-production hardware, unit test fixes, spelling and debug message
   improvements.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQSbo+XnGs+rwLz9XGXfioYZHlFsZwUCY/WYcgAKCRDfioYZHlFs
 Z6m3APkBUtiEEm1o8ikdu5llUS1OTLBwqjJDwGMTyf8X/WDXhgD+J2mLsCgARS7X
 5IS0RAtefutrW5sQpUucPM7QiLuraAY=
 =kOXC
 -----END PGP SIGNATURE-----

Merge tag 'cxl-for-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull Compute Express Link (CXL) updates from Dan Williams:
 "To date Linux has been dependent on platform-firmware to map CXL RAM
  regions and handle events / errors from devices. With this update we
  can now parse / update the CXL memory layout, and report events /
  errors from devices. This is a precursor for the CXL subsystem to
  handle the end-to-end "RAS" flow for CXL memory. i.e. the flow that
  for DDR-attached-DRAM is handled by the EDAC driver where it maps
  system physical address events to a field-replaceable-unit (FRU /
  endpoint device). In general, CXL has the potential to standardize
  what has historically been a pile of memory-controller-specific error
  handling logic.

  Another change of note is the default policy for handling RAM-backed
  device-dax instances. Previously the default access mode was "device",
  mmap(2) a device special file to access memory. The new default is
  "kmem" where the address range is assigned to the core-mm via
  add_memory_driver_managed(). This saves typical users from wondering
  why their platform memory is not visible via free(1) and stuck behind
  a device-file. At the same time it allows expert users to deploy
  policy to, for example, get dedicated access to high performance
  memory, or hide low performance memory from general purpose kernel
  allocations. This affects not only CXL, but also systems with
  high-bandwidth-memory that platform-firmware tags with the
  EFI_MEMORY_SP (special purpose) designation.

  Summary:

   - CXL RAM region enumeration: instantiate 'struct cxl_region' objects
     for platform firmware created memory regions

   - CXL RAM region provisioning: complement the existing PMEM region
     creation support with RAM region support

   - "Soft Reservation" policy change: Online (memory hot-add)
     soft-reserved memory (EFI_MEMORY_SP) by default, but still allow
     for setting aside such memory for dedicated access via device-dax.

   - CXL Events and Interrupts: Takeover CXL event handling from
     platform-firmware (ACPI calls this CXL Memory Error Reporting) and
     export CXL Events via Linux Trace Events.

   - Convey CXL _OSC results to drivers: Similar to PCI, let the CXL
     subsystem interrogate the result of CXL _OSC negotiation.

   - Emulate CXL DVSEC Range Registers as "decoders": Allow for
     first-generation devices that pre-date the definition of the CXL
     HDM Decoder Capability to translate the CXL DVSEC Range Registers
     into 'struct cxl_decoder' objects.

   - Set timestamp: Per spec, set the device timestamp in case of
     hotplug, or if platform-firwmare failed to set it.

   - General fixups: linux-next build issues, non-urgent fixes for
     pre-production hardware, unit test fixes, spelling and debug
     message improvements"

* tag 'cxl-for-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (66 commits)
  dax/kmem: Fix leak of memory-hotplug resources
  cxl/mem: Add kdoc param for event log driver state
  cxl/trace: Add serial number to trace points
  cxl/trace: Add host output to trace points
  cxl/trace: Standardize device information output
  cxl/pci: Remove locked check for dvsec_range_allowed()
  cxl/hdm: Add emulation when HDM decoders are not committed
  cxl/hdm: Create emulated cxl_hdm for devices that do not have HDM decoders
  cxl/hdm: Emulate HDM decoder from DVSEC range registers
  cxl/pci: Refactor cxl_hdm_decode_init()
  cxl/port: Export cxl_dvsec_rr_decode() to cxl_port
  cxl/pci: Break out range register decoding from cxl_hdm_decode_init()
  cxl: add RAS status unmasking for CXL
  cxl: remove unnecessary calling of pci_enable_pcie_error_reporting()
  dax/hmem: build hmem device support as module if possible
  dax: cxl: add CXL_REGION dependency
  cxl: avoid returning uninitialized error code
  cxl/pmem: Fix nvdimm registration races
  cxl/mem: Fix UAPI command comment
  cxl/uapi: Tag commands from cxl_query_cmd()
  ...
2023-02-25 09:19:23 -08:00
Bjorn Helgaas
ebdce9e3d0 Merge branch 'pci/resource'
- Realign space as required by bridge windows after dividing it up (Mika
  Westerberg)

- Account for space required by other devices on the bus before
  distributing it all to bridges (Mika Westerberg)

- Distribute spare resources to root bus devices as well as to other
  hotplug bridges (Mika Westerberg)

- Fix bug that dropped root bus resources that end at zero, e.g., a host
  bridge that leads only to bus 00 (Geert Uytterhoeven)

* pci/resource:
  PCI: Fix dropping valid root bus resources with .end = zero
  PCI: Distribute available resources for root buses, too
  PCI: Take other bus devices into account when distributing resources
  PCI: Align extra resources for hotplug bridges properly
2023-02-22 13:47:27 -06:00
Geert Uytterhoeven
9d8ba74a18 PCI: Fix dropping valid root bus resources with .end = zero
On r8a7791/koelsch:

  kmemleak: 1 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
  # cat /sys/kernel/debug/kmemleak
  unreferenced object 0xc3a34e00 (size 64):
    comm "swapper/0", pid 1, jiffies 4294937460 (age 199.080s)
    hex dump (first 32 bytes):
      b4 5d 81 f0 b4 5d 81 f0 c0 b0 a2 c3 00 00 00 00  .]...]..........
      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    backtrace:
      [<fe3aa979>] __kmalloc+0xf0/0x140
      [<34bd6bc0>] resource_list_create_entry+0x18/0x38
      [<767046bc>] pci_add_resource_offset+0x20/0x68
      [<b3f3edf2>] devm_of_pci_get_host_bridge_resources.constprop.0+0xb0/0x390

When coalescing two resources for a contiguous aperture, the second
resource is enlarged to cover the full contiguous range, while the first
resource is marked invalid.  This invalidation is done by clearing the
flags, start, and end members.

When adding the initial resources to the bus later, invalid resources are
skipped.  Unfortunately, the check for an invalid resource considers only
the end member, causing false positives.

E.g. on r8a7791/koelsch, root bus resource 0 ("bus 00") is skipped, and no
longer registered with pci_bus_insert_busn_res() (causing the memory leak),
nor printed:

   pci-rcar-gen2 ee090000.pci: host bridge /soc/pci@ee090000 ranges:
   pci-rcar-gen2 ee090000.pci:      MEM 0x00ee080000..0x00ee08ffff -> 0x00ee080000
   pci-rcar-gen2 ee090000.pci: PCI: revision 11
   pci-rcar-gen2 ee090000.pci: PCI host bridge to bus 0000:00
  -pci_bus 0000:00: root bus resource [bus 00]
   pci_bus 0000:00: root bus resource [mem 0xee080000-0xee08ffff]

Fix this by only skipping resources where all of the flags, start, and end
members are zero.

Fixes: 7c3855c423 ("PCI: Coalesce host bridge contiguous apertures")
Link: https://lore.kernel.org/r/da0fcd5e86c74239be79c7cb03651c0fce31b515.1676036673.git.geert+renesas@glider.be
Tested-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
2023-02-13 16:40:45 -06:00
Rob Herring
6fffbc7ae1 PCI: Honor firmware's device disabled status
If a device has a firmware node (DT/ACPI), and the device is marked
disabled, that is currently ignored. Add a check for this condition and
bail out creating the pci_dev.

This assumes the config space for the device can still be accessed because
they already have by this point in order to identify the device.

Link: https://lore.kernel.org/r/20230210164351.2687475-1-robh@kernel.org
Tested-by: Binbin Zhou <zhoubinbin@loongson.cn>
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Liu Peibao <liupeibao@loongson.cn>
Cc: Huacai Chen <chenhuacai@loongson.cn>
2023-02-13 15:29:56 -06:00
Ira Weiny
589c335737 PCI/CXL: Export native CXL error reporting control
CXL _OSC Error Reporting Control is used by the OS to determine if
Firmware has control of various CXL error reporting capabilities
including the event logs.

Expose the result of negotiating CXL Error Reporting Control in struct
pci_host_bridge for consumption by the CXL drivers.

Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: linux-pci@vger.kernel.org
Cc: linux-acpi@vger.kernel.org
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/20221212070627.1372402-2-ira.weiny@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-01-05 13:31:27 -08:00
Linus Torvalds
c7020e1b34 pci-v6.2-changes
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmOYpTIUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vxuZhAAhGjE8voLZeOYwxbvfL69hGTAZ+Me
 x2hqRWVhh/IGWXTTaoSLwSjMMokcmAKN5S/wv8qdCG5sB8EN8FyTBIZDy8PuRRdl
 8UlqlBMSL+d4oSRDCnYLxFNcynLRNnmx2dfcdw9tJ4zjTLN8Y4o8PHFogR6pJ3MT
 sDC8S0myTQKXr4wAGzTZycKsiGManviYtByp6dCcKD3Oy5Q2uZ9OKO2DP2yQpn+F
 c3IJSV9oDz3KR8JVJ5Q1iz9cdMXbGwjkM3JLlHpxhedwjN4ErLumPutKcebtzO5C
 aTqabN7Nnzc4yJusAIfojFCWH7fgaYUyJ3pxcFyJ4tu4m9Last+2I5UB/kV2sYAD
 jWiCYx3sA/mRopNXOnrBGae+Lgy+sQnt8or0grySr0bK+b+ArAGis4uT4A0uASGO
 RUQdIQwz7zhHeQrwAladHWxnx4BEDNCatgfn38p4fklIYKydCY5nfZURMDvHezSR
 G6Nu08hoE9ZXlmkWTFw+5F23wPWKcCpzZj0hf7OroIouXUp8vqSFSqatH5vGkbCl
 bDswck9GdRJ2hl5SvFOeelaXkM42du45TMLU2JmIn6dYYFNrO93JgdvKSU7E2CpG
 AmDIpg1Idxo8fEPPGH1I7RVU5+ilzmmPQQY7poQW+va4/dEd/QVp1+ZZTDnMC1qk
 qi3ck22VdvPU2VU=
 =KULr
 -----END PGP SIGNATURE-----

Merge tag 'pci-v6.2-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:
 "Enumeration:

   - Squash portdrv_{core,pci}.c into portdrv.c to ease maintenance and
     make more things static.

   - Make portdrv bind to Switch Ports that have AER. Previously, if
     these Ports lacked MSI/MSI-X, portdrv failed to bind, which meant
     the Ports couldn't be suspended to low-power states. AER on these
     Ports doesn't use interrupts, and the AER driver doesn't need to
     claim them.

   - Assign PCI domain IDs using ida_alloc(), which makes host bridge
     add/remove work better.

  Resource management:

   - To work better with recent BIOSes that use EfiMemoryMappedIO for
     PCI host bridge apertures, remove those regions from the E820 map
     (E820 entries normally prevent us from allocating BARs). In v5.19,
     we added some quirks to disable E820 checking, but that's not very
     maintainable. EfiMemoryMappedIO means the OS needs to map the
     region for use by EFI runtime services; it shouldn't prevent OS
     from using it.

  PCIe native device hotplug:

   - Build pciehp by default if USB4 is enabled, since Thunderbolt/USB4
     PCIe tunneling depends on native PCIe hotplug.

   - Enable Command Completed Interrupt only if supported to avoid user
     confusion from lspci output that says this is enabled but not
     supported.

   - Prevent pciehp from binding to Switch Upstream Ports; this happened
     because of interaction with acpiphp and caused devices below the
     Upstream Port to disappear.

  Power management:

   - Convert AGP drivers to generic power management. We hope to remove
     legacy power management from the PCI core eventually.

  Virtualization:

   - Fix pci_device_is_present(), which previously always returned
     "false" for VFs, causing virtio hangs when unbinding the driver.

  Miscellaneous:

   - Convert drivers to gpiod API to prepare for dropping some legacy
     code.

   - Fix DOE fencepost error for the maximum data object length.

  Baikal-T1 PCIe controller driver:

   - Add driver and DT bindings.

  Broadcom STB PCIe controller driver:

   - Enable Multi-MSI.

   - Delay 100ms after PERST# deassert to allow power and clocks to
     stabilize.

   - Configure Read Completion Boundary to 64 bytes.

  Freescale i.MX6 PCIe controller driver:

   - Initialize PHY before deasserting core reset to fix a regression in
     v6.0 on boards where the PHY provides the reference.

   - Fix imx6sx and imx8mq clock names in DT schema.

  Intel VMD host bridge driver:

   - Fix Secondary Bus Reset on VMD bridges, which allows reset of NVMe
     SSDs in VT-d pass-through scenarios.

   - Disable MSI remapping, which gets re-enabled by firmware during
     suspend/resume.

  MediaTek PCIe Gen3 controller driver:

   - Add MT7986 and MT8195 support.

  Qualcomm PCIe controller driver:

   - Add SC8280XP/SA8540P basic interconnect support.

  Rockchip DesignWare PCIe controller driver:

   - Base DT schema on common Synopsys schema.

  Synopsys DesignWare PCIe core:

   - Collect DT items shared between Root Port and Endpoint (PERST GPIO,
     PHY info, clocks, resets, link speed, number of lanes, number of
     iATU windows, interrupt info, etc) to snps,dw-pcie-common.yaml.

   - Add dma-ranges support for Root Ports and Endpoints.

   - Consolidate DT resource retrieval for "dbi", "dbi2", "atu", etc. to
     reduce code duplication.

   - Add generic names for clocks and resets to encourage more
     consistent naming across drivers using DesignWare IP.

   - Stop advertising PTM Responder role for Endpoints, which aren't
     allowed to be responders.

  TI J721E PCIe driver:

   - Add j721s2 host mode ID to DT schema.

   - Add interrupt properties to DT schema.

  Toshiba Visconti PCIe controller driver:

   - Fix interrupts array max constraints in DT schema"

* tag 'pci-v6.2-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (95 commits)
  x86/PCI: Use pr_info() when possible
  x86/PCI: Fix log message typo
  x86/PCI: Tidy E820 removal messages
  PCI: Skip allocate_resource() if too little space available
  efi/x86: Remove EfiMemoryMappedIO from E820 map
  PCI/portdrv: Allow AER service only for Root Ports & RCECs
  PCI: xilinx-nwl: Fix coding style violations
  PCI: mvebu: Switch to using gpiod API
  PCI: pciehp: Enable Command Completed Interrupt only if supported
  PCI: aardvark: Switch to using devm_gpiod_get_optional()
  dt-bindings: PCI: mediatek-gen3: add support for mt7986
  dt-bindings: PCI: mediatek-gen3: add SoC based clock config
  dt-bindings: PCI: qcom: Allow 'dma-coherent' property
  PCI: mt7621: Add sentinel to quirks table
  PCI: vmd: Fix secondary bus reset for Intel bridges
  PCI: endpoint: pci-epf-vntb: Fix sparse ntb->reg build warning
  PCI: endpoint: pci-epf-vntb: Fix sparse build warning for epf_db
  PCI: endpoint: pci-epf-vntb: Replace hardcoded 4 with sizeof(u32)
  PCI: endpoint: pci-epf-vntb: Remove unused epf_db_phy struct member
  PCI: endpoint: pci-epf-vntb: Fix call pci_epc_mem_free_addr() in error path
  ...
2022-12-14 09:54:10 -08:00
Linus Torvalds
c1f0fcd85d cxl for 6.2
- Add the cpu_cache_invalidate_memregion() API for cache flushing in
   response to physical memory reconfiguration, or memory-side data
   invalidation from operations like secure erase or memory-device unlock.
 
 - Add a facility for the kernel to warn about collisions between kernel
   and userspace access to PCI configuration registers
 
 - Add support for Restricted CXL Host (RCH) topologies (formerly CXL 1.1)
 
 - Add handling and reporting of CXL errors reported via the PCIe AER
   mechanism
 
 - Add support for CXL Persistent Memory Security commands
 
 - Add support for the "XOR" algorithm for CXL host bridge interleave
 
 - Rework / simplify CXL to NVDIMM interactions
 
 - Miscellaneous cleanups and fixes
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQSbo+XnGs+rwLz9XGXfioYZHlFsZwUCY5UpyAAKCRDfioYZHlFs
 Z0ttAP4uxCjIibKsFVyexpSgI4vaZqQ9yt9NesmPwonc0XookwD+PlwP6Xc0d0Ox
 t0gJ6+pwdh11NRzhcNE1pAaPcJZU4gs=
 =HAQk
 -----END PGP SIGNATURE-----

Merge tag 'cxl-for-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull cxl updates from Dan Williams:
 "Compute Express Link (CXL) updates for 6.2.

  While it may seem backwards, the CXL update this time around includes
  some focus on CXL 1.x enabling where the work to date had been with
  CXL 2.0 (VH topologies) in mind.

  First generation CXL can mostly be supported via BIOS, similar to DDR,
  however it became clear there are use cases for OS native CXL error
  handling and some CXL 3.0 endpoint features can be deployed on CXL 1.x
  hosts (Restricted CXL Host (RCH) topologies). So, this update brings
  RCH topologies into the Linux CXL device model.

  In support of the ongoing CXL 2.0+ enabling two new core kernel
  facilities are added.

  One is the ability for the kernel to flag collisions between userspace
  access to PCI configuration registers and kernel accesses. This is
  brought on by the PCIe Data-Object-Exchange (DOE) facility, a hardware
  mailbox over config-cycles.

  The other is a cpu_cache_invalidate_memregion() API that maps to
  wbinvd_on_all_cpus() on x86. To prevent abuse it is disabled in guest
  VMs and architectures that do not support it yet. The CXL paths that
  need it, dynamic memory region creation and security commands (erase /
  unlock), are disabled when it is not present.

  As for the CXL 2.0+ this cycle the subsystem gains support Persistent
  Memory Security commands, error handling in response to PCIe AER
  notifications, and support for the "XOR" host bridge interleave
  algorithm.

  Summary:

   - Add the cpu_cache_invalidate_memregion() API for cache flushing in
     response to physical memory reconfiguration, or memory-side data
     invalidation from operations like secure erase or memory-device
     unlock.

   - Add a facility for the kernel to warn about collisions between
     kernel and userspace access to PCI configuration registers

   - Add support for Restricted CXL Host (RCH) topologies (formerly CXL
     1.1)

   - Add handling and reporting of CXL errors reported via the PCIe AER
     mechanism

   - Add support for CXL Persistent Memory Security commands

   - Add support for the "XOR" algorithm for CXL host bridge interleave

   - Rework / simplify CXL to NVDIMM interactions

   - Miscellaneous cleanups and fixes"

* tag 'cxl-for-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (71 commits)
  cxl/region: Fix memdev reuse check
  cxl/pci: Remove endian confusion
  cxl/pci: Add some type-safety to the AER trace points
  cxl/security: Drop security command ioctl uapi
  cxl/mbox: Add variable output size validation for internal commands
  cxl/mbox: Enable cxl_mbox_send_cmd() users to validate output size
  cxl/security: Fix Get Security State output payload endian handling
  cxl: update names for interleave ways conversion macros
  cxl: update names for interleave granularity conversion macros
  cxl/acpi: Warn about an invalid CHBCR in an existing CHBS entry
  tools/testing/cxl: Require cache invalidation bypass
  cxl/acpi: Fail decoder add if CXIMS for HBIG is missing
  cxl/region: Fix spelling mistake "memergion" -> "memregion"
  cxl/regs: Fix sparse warning
  cxl/acpi: Set ACPI's CXL _OSC to indicate RCD mode support
  tools/testing/cxl: Add an RCH topology
  cxl/port: Add RCD endpoint port enumeration
  cxl/mem: Move devm_cxl_add_endpoint() from cxl_core to cxl_mem
  tools/testing/cxl: Add XOR Math support to cxl_test
  cxl/acpi: Support CXL XOR Interleave Math (CXIMS)
  ...
2022-12-12 13:55:31 -08:00
Thomas Gleixner
a474d3fbe2 PCI/MSI: Get rid of PCI_MSI_IRQ_DOMAIN
What a zoo:

     PCI_MSI
	select GENERIC_MSI_IRQ

     PCI_MSI_IRQ_DOMAIN
     	def_bool y
	depends on PCI_MSI
	select GENERIC_MSI_IRQ_DOMAIN

Ergo PCI_MSI enables PCI_MSI_IRQ_DOMAIN which in turn selects
GENERIC_MSI_IRQ_DOMAIN. So all the dependencies on PCI_MSI_IRQ_DOMAIN are
just an indirection to PCI_MSI.

Match the reality and just admit that PCI_MSI requires
GENERIC_MSI_IRQ_DOMAIN.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20221111122014.467556921@linutronix.de
2022-11-17 15:15:19 +01:00
Ira Weiny
278294798a PCI: Allow drivers to request exclusive config regions
PCI config space access from user space has traditionally been
unrestricted with writes being an understood risk for device operation.

Unfortunately, device breakage or odd behavior from config writes lacks
indicators that can leave driver writers confused when evaluating
failures.  This is especially true with the new PCIe Data Object
Exchange (DOE) mailbox protocol where backdoor shenanigans from user
space through things such as vendor defined protocols may affect device
operation without complete breakage.

A prior proposal restricted read and writes completely.[1]  Greg and
Bjorn pointed out that proposal is flawed for a couple of reasons.
First, lspci should always be allowed and should not interfere with any
device operation.  Second, setpci is a valuable tool that is sometimes
necessary and it should not be completely restricted.[2]  Finally
methods exist for full lock of device access if required.

Even though access should not be restricted it would be nice for driver
writers to be able to flag critical parts of the config space such that
interference from user space can be detected.

Introduce pci_request_config_region_exclusive() to mark exclusive config
regions.  Such regions trigger a warning and kernel taint if accessed
via user space.

Create pci_warn_once() to restrict the user from spamming the log.

[1] https://lore.kernel.org/all/161663543465.1867664.5674061943008380442.stgit@dwillia2-desk3.amr.corp.intel.com/
[2] https://lore.kernel.org/all/YF8NGeGv9vYcMfTV@kroah.com/

Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20220926215711.2893286-2-ira.weiny@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-11-14 10:07:22 -08:00
Pali Rohár
c14f7ccc9f PCI: Assign PCI domain IDs by ida_alloc()
Replace assignment of PCI domain IDs from atomic_inc_return() to
ida_alloc().

Use two IDAs, one for static domain allocations (those which are defined in
device tree) and second for dynamic allocations (all other).

During removal of root bus / host bridge, also release the domain ID.  The
released ID can be reused again, for example when dynamically loading and
unloading native PCI host bridge drivers.

This change also allows to mix static device tree assignment and dynamic by
kernel as all static allocations are reserved in dynamic pool.

[bhelgaas: set "err" if "bus->domain_nr < 0"]
Link: https://lore.kernel.org/r/20220714184130.5436-1-pali@kernel.org
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2022-11-08 11:14:18 -06:00
Bjorn Helgaas
44e985938e Revert "PCI: Clear PCI_STATUS when setting up device"
This reverts commit 6cd514e58f.

Christophe Fergeau reported that 6cd514e58f ("PCI: Clear PCI_STATUS when
setting up device") causes boot failures when trying to start linux guests
with Apple's virtualization framework (for example using
https://developer.apple.com/documentation/virtualization/running_linux_in_a_virtual_machine?language=objc)

6cd514e58f only solved a cosmetic problem, so revert it to fix the boot
failures.

Link: https://bugzilla.redhat.com/show_bug.cgi?id=2137803
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2022-11-07 15:31:08 -06:00
Mika Westerberg
58e011609c PCI: Fix typo in pci_scan_child_bus_extend()
Should be 'if' not 'of'. Fix this.

Link: https://lore.kernel.org/r/20220905080232.36087-7-mika.westerberg@linux.intel.com
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2022-09-21 14:50:35 -05:00
Mika Westerberg
17d2d67d76 PCI: Fix whitespace and indentation
Drop two empty lines from pci_scan_child_bus_extend() and correct
indentation in pci_bridge_distribute_available_resources() to better
follow the kernel coding style.

No functional impact.

Link: https://lore.kernel.org/r/20220905080232.36087-6-mika.westerberg@linux.intel.com
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2022-09-21 14:50:35 -05:00
Mika Westerberg
49ad31e9d7 PCI: Pass available buses even if the bridge is already configured
If some part of the PCI topology is already configured (by the boot
firmware) but not all, and it includes hotplug bridges, we may need to
extend the bus resources of those bridges to accommodate any future
hotplugs, in the same way we already do with the normal hotplug case.

Pass the available buses to pci_scan_child_bus_extend() even when the
bridge in question is already configured so the bus allocation code can use
these available buses to extend the possible hotplug bridges below.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=216000
Link: https://lore.kernel.org/r/20220905080232.36087-3-mika.westerberg@linux.intel.com
Reported-by: Chris Chiu <chris.chiu@canonical.com>
Tested-by: Chris Chiu <chris.chiu@canonical.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2022-09-21 14:46:48 -05:00
Mika Westerberg
8066cc86b7 PCI: Fix used_buses calculation in pci_scan_child_bus_extend()
pci_scan_bridge_extend() returns the subordinate bus number needed to cover
all the buses below a bridge.  pci_scan_child_bus_extend() computes the
number of buses to reserve by comparing that with the current max bus
number.  Previously it did the subtraction in the wrong order, so
'used_buses' was nonsense.

Subtract 'max' from 'cmax' as is done for the similar
pci_scan_bridge_extend() call in the following block.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=216000
Fixes: 3374c545c2 ("PCI: Account for all bridges on bus when distributing bus numbers")
Link: https://lore.kernel.org/r/20220905080232.36087-2-mika.westerberg@linux.intel.com
Reported-by: Chris Chiu <chris.chiu@canonical.com>
Tested-by: Chris Chiu <chris.chiu@canonical.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2022-09-20 18:19:11 -05:00
Linus Torvalds
c235698355 cxl for 6.0
- Introduce a 'struct cxl_region' object with support for provisioning
   and assembling persistent memory regions.
 
 - Introduce alloc_free_mem_region() to accompany the existing
   request_free_mem_region() as a method to allocate physical memory
   capacity out of an existing resource.
 
 - Export insert_resource_expand_to_fit() for the CXL subsystem to
   late-publish CXL platform windows in iomem_resource.
 
 - Add a polled mode PCI DOE (Data Object Exchange) driver service and
   use it in cxl_pci to retrieve the CDAT (Coherent Device Attribute
   Table).
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQSbo+XnGs+rwLz9XGXfioYZHlFsZwUCYvLYmAAKCRDfioYZHlFs
 Z0pbAQC/3j+WriWpU7CdhrnZI1Wqn+x5IIklF0Lc4/f6LwGZtAEAsSbLpItzvwqx
 M/rcLaeLpwYlgvS1JjdsuQ2VQ7KOtAs=
 =ehNT
 -----END PGP SIGNATURE-----

Merge tag 'cxl-for-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull cxl updates from Dan Williams:
 "Compute Express Link (CXL) updates for 6.0:

   - Introduce a 'struct cxl_region' object with support for
     provisioning and assembling persistent memory regions.

   - Introduce alloc_free_mem_region() to accompany the existing
     request_free_mem_region() as a method to allocate physical memory
     capacity out of an existing resource.

   - Export insert_resource_expand_to_fit() for the CXL subsystem to
     late-publish CXL platform windows in iomem_resource.

   - Add a polled mode PCI DOE (Data Object Exchange) driver service and
     use it in cxl_pci to retrieve the CDAT (Coherent Device Attribute
     Table)"

* tag 'cxl-for-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (74 commits)
  cxl/hdm: Fix skip allocations vs multiple pmem allocations
  cxl/region: Disallow region granularity != window granularity
  cxl/region: Fix x1 interleave to greater than x1 interleave routing
  cxl/region: Move HPA setup to cxl_region_attach()
  cxl/region: Fix decoder interleave programming
  Documentation: cxl: remove dangling kernel-doc reference
  cxl/region: describe targets and nr_targets members of cxl_region_params
  cxl/regions: add padding for cxl_rr_ep_add nested lists
  cxl/region: Fix IS_ERR() vs NULL check
  cxl/region: Fix region reference target accounting
  cxl/region: Fix region commit uninitialized variable warning
  cxl/region: Fix port setup uninitialized variable warnings
  cxl/region: Stop initializing interleave granularity
  cxl/hdm: Fix DPA reservation vs cxl_endpoint_decoder lifetime
  cxl/acpi: Minimize granularity for x1 interleaves
  cxl/region: Delete 'region' attribute from root decoders
  cxl/acpi: Autoload driver for 'cxl_acpi' test devices
  cxl/region: decrement ->nr_targets on error in cxl_region_attach()
  cxl/region: prevent underflow in ways_to_cxl()
  cxl/region: uninitialized variable in alloc_hpa()
  ...
2022-08-10 11:07:26 -07:00
Bjorn Helgaas
5a20930f27 Merge branch 'pci/err'
- Recognize disconnected devices so we don't bother trying to set them to
  "frozen" or "normal" state (Christoph Hellwig)

- Clear PCI Status register during enumeration in case firmware left errors
  logged (Kai-Heng Feng)

- Configure ECRC for every device, including hot-added ones (Stefan Roese)

- Keep AER error reporting enabled for switches (Stefan Roese)

- Enable error reporting for all devices that support AER (Stefan Roese)

- Iterate over error counters instead of error strings to avoid printing
  junk in AER sysfs counters (Mohamed Khalfella)

* pci/err:
  PCI/AER: Iterate over error counters instead of error strings
  PCI/AER: Enable error reporting when AER is native
  PCI/portdrv: Don't disable AER reporting in get_port_device_capability()
  PCI/AER: Configure ECRC for every device
  PCI: Clear PCI_STATUS when setting up device
  PCI/ERR: Recognize disconnected devices in report_error_detected()
2022-08-04 11:41:52 -05:00
Niklas Schnelle
189c6c33ff PCI: Extend isolated function probing to s390
Like the jailhouse hypervisor, s390's PCI architecture allows passing
isolated PCI functions to a guest OS instance. As of now this is was not
utilized even with multi-function support as the s390 PCI code makes sure
that only virtual PCI busses including a function with devfn 0 are
presented to the PCI subsystem. A subsequent change will remove this
restriction.

Allow probing such functions by replacing the existing check for
jailhouse_paravirt() with a new hypervisor_isolated_pci_functions() helper.

Link: https://lore.kernel.org/r/20220628143100.3228092-5-schnelle@linux.ibm.com
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Cc: Jan Kiszka <jan.kiszka@siemens.com>
2022-07-22 16:06:03 -05:00