Use kmalloc_node() instead of kmalloc() when possible to optimize
for performance on NUMA platforms.
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Grant Likely <grant.likely@linaro.org>
Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/1402302011-23642-6-git-send-email-jiang.liu@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The vast majority of platforms are not supplying ACPI _PXM (proximity)
information corresponding to host bridge (PNP0A03/PNP0A08) devices
resulting in sysfs "numa_node" values of -1 (NUMA_NO_NODE):
# for i in /sys/devices/pci0000\:00/*/numa_node; do cat $i; done | uniq
-1
# find /sys/ -name "numa_node" | while read fname; do cat $fname; \
done | uniq
-1
AMD based platforms provide a fall-back for this situation via amd_bus.c.
These platforms snoop out the information by directly reading specific
registers from the Northbridge and caching them via alloc_pci_root_info().
Later during boot processing when host bridges are discovered -
pci_acpi_scan_root() - the kernel looks for their corresponding ACPI _PXM
method - drivers/acpi/numa.c::acpi_get_node(). If the BIOS supplied a _PXM
method then that node (proximity) value is associated. If the BIOS did not
supply a _PXM method *and* the platform is AMD-based, the fall-back cached
values obtained directly from the Northbridge are used; otherwise,
"NUMA_NO_NODE" is associated.
There are a number of issues with this fall-back mechanism the most notable
being that amd_bus.c extracts a 3-bit number from a CPU register and uses
it as the node number. The node numbers used by Linux are logical and
there's no reason they need to be identical to settings in the CPU
registers. So if we have some node information obtained in the normal way
(from _PXM, SLIT, SRAT, etc.) and some from amd_bus.c, there's no reason to
believe they will be compatible.
This patch warns when this situation occurs:
pci_root PNP0A08:00: [Firmware Bug]: no _PXM; falling back to node 0 from hardware (may be inconsistent with ACPI node numbers)
Link: https://bugzilla.kernel.org/show_bug.cgi?id=72051
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
* pci/misc:
PCI: Enable INTx if BIOS left them disabled
ia64/PCI: Set IORESOURCE_ROM_SHADOW only for the default VGA device
x86/PCI: Set IORESOURCE_ROM_SHADOW only for the default VGA device
PCI: Update outdated comment for pcibios_bus_report_status()
PCI: Cleanup per-arch list of object files
PCI: cpqphp: Fix hex vs decimal typo in cpqhpc_probe()
x86/PCI: Fix function definition whitespace
x86/PCI: Reword comments
x86/PCI: Remove unnecessary local variable initialization
PCI: Remove unnecessary list_empty(&pci_pme_list) check
The PCI host bridge code doesn't care about _PXM values directly; it only
needs to know what NUMA node the hardware is on.
This uses acpi_get_node() directly and removes the _PXM stuff.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
NUMA_NO_NODE is the usual value for "we don't know what node this is on,"
e.g., it is the error return from acpi_get_node(). This changes uses of -1
to NUMA_NO_NODE. NUMA_NO_NODE is #defined to be -1 already, so this is not
a functional change.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
There are no callers of get_mp_bus_to_node(), so we no longer need
mp_bus_to_node[], get_mp_bus_to_node(), or set_mp_bus_to_node().
This removes them.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This replaces all uses of get_mp_bus_to_node() with x86_pci_root_bus_node().
I think these uses are all on root buses, except possibly for blind
probing, where NUMA node information is unimportant.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Modify struct acpi_dev_node to contain a pointer to struct acpi_device
associated with the given device object (that is, its ACPI companion
device) instead of an ACPI handle corresponding to it. Introduce two
new macros for manipulating that pointer in a CONFIG_ACPI-safe way,
ACPI_COMPANION() and ACPI_COMPANION_SET(), and rework the
ACPI_HANDLE() macro to take the above changes into account.
Drop the ACPI_HANDLE_SET() macro entirely and rework its users to
use ACPI_COMPANION_SET() instead. For some of them who used to
pass the result of acpi_get_child() directly to ACPI_HANDLE_SET()
introduce a helper routine acpi_preset_companion() doing an
equivalent thing.
The main motivation for doing this is that there are things
represented by struct acpi_device objects that don't have valid
ACPI handles (so called fixed ACPI hardware features, such as
power and sleep buttons) and we would like to create platform
device objects for them and "glue" them to their ACPI companions
in the usual way (which currently is impossible due to the
lack of valid ACPI handles). However, there are more reasons
why it may be useful.
First, struct acpi_device pointers allow of much better type checking
than void pointers which are ACPI handles, so it should be more
difficult to write buggy code using modified struct acpi_dev_node
and the new macros. Second, the change should help to reduce (over
time) the number of places in which the result of ACPI_HANDLE() is
passed to acpi_bus_get_device() in order to obtain a pointer to the
struct acpi_device associated with the given "physical" device,
because now that pointer is returned by ACPI_COMPANION() directly.
Finally, the change should make it easier to write generic code that
will build both for CONFIG_ACPI set and unset without adding explicit
compiler directives to it.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com> # on Haswell
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Aaron Lu <aaron.lu@intel.com> # for ATA and SDIO part
Previously we coalesced windows by expanding the first overlapping one and
making the second invalid. But we never look at the expanded first window
again, so we fail to notice other windows that overlap it. For example, we
coalesced these:
[io 0x0000-0x03af] // #0
[io 0x03e0-0x0cf7] // #1
[io 0x0000-0xdfff] // #2
into these, which still overlap:
[io 0x0000-0xdfff] // #0
[io 0x03e0-0x0cf7] // #1
The fix is to expand the *second* overlapping resource and ignore the
first, so we get this instead with no overlaps:
[io 0x0000-0xdfff] // #2
[bhelgaas: changelog]
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=62511
Signed-off-by: Alexey Neyman <stilor@att.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Based on a patch by Jon Mason (see URL below).
All users of pcie_bus_configure_settings() pass arguments of the form
"bus, bus->self->pcie_mpss". The "mpss" argument is redundant since we
can easily look it up internally. In addition, all callers check
"bus->self" for NULL, which we can also do internally.
This patch simplifies the interface and the callers. No functional change.
Reference: http://lkml.kernel.org/r/1317048850-30728-2-git-send-email-mason@myri.com
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
We should increase info->res_num before we checking pci_use_crs return
when pci=nocrs set.
No functional change, since we don't use res_num and res_offset[]
in the "!pci_use_crs" case anyway, but this makes the code read better.
Signed-off-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Jiang Liu <liuj97@gmail.com>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: Feng Tang <feng.tang@intel.com>
Host bridge hotplug
- Major overhaul of ACPI host bridge add/start (Rafael Wysocki, Yinghai Lu)
- Major overhaul of PCI/ACPI binding (Rafael Wysocki, Yinghai Lu)
- Split out ACPI host bridge and ACPI PCI device hotplug (Yinghai Lu)
- Stop caching _PRT and make independent of bus numbers (Yinghai Lu)
PCI device hotplug
- Clean up cpqphp dead code (Sasha Levin)
- Disable ARI unless device and upstream bridge support it (Yijing Wang)
- Initialize all hot-added devices (not functions 0-7) (Yijing Wang)
Power management
- Don't touch ASPM if disabled (Joe Lawrence)
- Fix ASPM link state management (Myron Stowe)
Miscellaneous
- Fix PCI_EXP_FLAGS accessor (Alex Williamson)
- Disable Bus Master in pci_device_shutdown (Konstantin Khlebnikov)
- Document hotplug resource and MPS parameters (Yijing Wang)
- Add accessor for PCIe capabilities (Myron Stowe)
- Drop pciehp suspend/resume messages (Paul Bolle)
- Make pci_slot built-in only (not a module) (Jiang Liu)
- Remove unused PCI/ACPI bind ops (Jiang Liu)
- Removed used pci_root_bus (Bjorn Helgaas)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJRKS3hAAoJEFmIoMA60/r8xxoP/j1CS4oCZAnBIVT9fKBkis+/
CENcfHIUKj6J9iMfJEVvqBELvqaLqtpeNwAGMcGPxV7VuT3K1QumChfaTpRDP0HC
VDRmrjcmfenEK+YPOG7acsrTyqk2wjpLOyu9MKRxtC5u7tF6376KQpkEFpO4haL4
eUHTxfE76OkrPBSvx3+PUSf6jqrvrNbjX8K6HdDVVlm3sVAQKmYJU/Wphv2NPOqa
CAMyCzEGybFjr8hDRwvWgr+06c718GMwQUbnrPdHXAe7lMNMrN/XVBmU9ABN3Aas
icd3lrDs+yPObgcO/gT8+sAZErCtdJ9zuHGYHdYpRbIQj/5JT4TMk7tw/Bj7vKY9
Mqmho9GR5YmYTRN9f1r+2n5AQ/KYWXJDrRNOnt5/ys5BOM3vwJ7WJ902zpSwtFQp
nLX+oD/hLfzpnoIQGDuBAoAXp2Kam3XWRgVvG78buRNrPj+kUzimk14a8qQeY+CB
El6UKuwi5Uv/qgs1gAqqjmZmsAkon2DnsRZa6Fl8NTkDlis7LY4gp9OU38ySFpB+
PhCmRyCZmDDqTVtwj6XzR3nPQ5LBSbvsTfgMxYMIUSXHa06tyb2q5p4mEIas0OmU
RKaP5xQqZuTgD8fbdYrx0xgSrn7JHt/j/X//Qs6unlLCWhlpm3LjJZKxyw2FwBGr
o4Lci+PiBh3MowCrju9D
=ER3b
-----END PGP SIGNATURE-----
Merge tag 'pci-v3.9-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI changes from Bjorn Helgaas:
"Host bridge hotplug
- Major overhaul of ACPI host bridge add/start (Rafael Wysocki, Yinghai Lu)
- Major overhaul of PCI/ACPI binding (Rafael Wysocki, Yinghai Lu)
- Split out ACPI host bridge and ACPI PCI device hotplug (Yinghai Lu)
- Stop caching _PRT and make independent of bus numbers (Yinghai Lu)
PCI device hotplug
- Clean up cpqphp dead code (Sasha Levin)
- Disable ARI unless device and upstream bridge support it (Yijing Wang)
- Initialize all hot-added devices (not functions 0-7) (Yijing Wang)
Power management
- Don't touch ASPM if disabled (Joe Lawrence)
- Fix ASPM link state management (Myron Stowe)
Miscellaneous
- Fix PCI_EXP_FLAGS accessor (Alex Williamson)
- Disable Bus Master in pci_device_shutdown (Konstantin Khlebnikov)
- Document hotplug resource and MPS parameters (Yijing Wang)
- Add accessor for PCIe capabilities (Myron Stowe)
- Drop pciehp suspend/resume messages (Paul Bolle)
- Make pci_slot built-in only (not a module) (Jiang Liu)
- Remove unused PCI/ACPI bind ops (Jiang Liu)
- Removed used pci_root_bus (Bjorn Helgaas)"
* tag 'pci-v3.9-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (51 commits)
PCI/ACPI: Don't cache _PRT, and don't associate them with bus numbers
PCI: Fix PCI Express Capability accessors for PCI_EXP_FLAGS
ACPI / PCI: Make pci_slot built-in only, not a module
PCI/PM: Clear state_saved during suspend
PCI: Use atomic_inc_return() rather than atomic_add_return()
PCI: Catch attempts to disable already-disabled devices
PCI: Disable Bus Master unconditionally in pci_device_shutdown()
PCI: acpiphp: Remove dead code for PCI host bridge hotplug
PCI: acpiphp: Create companion ACPI devices before creating PCI devices
PCI: Remove unused "rc" in virtfn_add_bus()
PCI: pciehp: Drop suspend/resume ENTRY messages
PCI/ASPM: Don't touch ASPM if forcibly disabled
PCI/ASPM: Deallocate upstream link state even if device is not PCIe
PCI: Document MPS parameters pci=pcie_bus_safe, pci=pcie_bus_perf, etc
PCI: Document hpiosize= and hpmemsize= resource reservation parameters
PCI: Use PCI Express Capability accessor
PCI: Introduce accessor to retrieve PCIe Capabilities Register
PCI: Put pci_dev in device tree as early as possible
PCI: Skip attaching driver in device_add()
PCI: acpiphp: Keep driver loaded even if no slots found
...
The ACPI handles of PCI root bridges need to be known to
acpi_bind_one(), so that it can create the appropriate
"firmware_node" and "physical_node" files for them, but currently
the way it gets to know those handles is not exactly straightforward
(to put it lightly).
This is how it works, roughly:
1. acpi_bus_scan() finds the handle of a PCI root bridge,
creates a struct acpi_device object for it and passes that
object to acpi_pci_root_add().
2. acpi_pci_root_add() creates a struct acpi_pci_root object,
populates its "device" field with its argument's address
(device->handle is the ACPI handle found in step 1).
3. The struct acpi_pci_root object created in step 2 is passed
to pci_acpi_scan_root() and used to get resources that are
passed to pci_create_root_bus().
4. pci_create_root_bus() creates a struct pci_host_bridge object
and passes its "dev" member to device_register().
5. platform_notify(), which for systems with ACPI is set to
acpi_platform_notify(), is called.
So far, so good. Now it starts to be "interesting".
6. acpi_find_bridge_device() is used to find the ACPI handle of
the given device (which is the PCI root bridge) and executes
acpi_pci_find_root_bridge(), among other things, for the
given device object.
7. acpi_pci_find_root_bridge() uses the name (sic!) of the given
device object to extract the segment and bus numbers of the PCI
root bridge and passes them to acpi_get_pci_rootbridge_handle().
8. acpi_get_pci_rootbridge_handle() browses the list of ACPI PCI
root bridges and finds the one that matches the given segment
and bus numbers. Its handle is then used to initialize the
ACPI handle of the PCI root bridge's device object by
acpi_bind_one(). However, this is *exactly* the ACPI handle we
started with in step 1.
Needless to say, this is quite embarassing, but it may be avoided
thanks to commit f3fd0c8 (ACPI: Allow ACPI handles of devices to be
initialized in advance), which makes it possible to initialize the
ACPI handle of a device before passing it to device_register().
Accordingly, add a new __weak routine, pcibios_root_bridge_prepare(),
defaulting to an empty implementation that can be replaced by the
interested architecutres (x86 and ia64 at the moment) with functions
that will set the root bridge's ACPI handle before its dev member is
passed to device_register(). Make both x86 and ia64 provide such
implementations of pcibios_root_bridge_prepare() and remove
acpi_pci_find_root_bridge() and acpi_get_pci_rootbridge_handle() that
aren't necessary any more.
Included is a fix for breakage on systems with non-ACPI PCI host
bridges from Bjorn Helgaas.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CONFIG_HOTPLUG is going away as an option. As a result, the __dev*
markings need to be removed.
This change removes the use of __devinit, __devexit_p, __devinitconst,
and __devexit from these drivers.
Based on patches originally written by Bill Pemberton, but redone by me
in order to handle some of the coding style issues better, by hand.
Cc: Bill Pemberton <wfp5p@virginia.edu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Daniel Drake <dsd@laptop.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The memory range descriptors in the _CRS control method contain an address
translation offset for host bridges. This value is used to translate
addresses across the bridge. The support to use _TRA values is present for
other architectures but not for X86 platforms.
For existing X86 platforms the _TRA value is zero. Non-zero _TRA values
are expected on future X86 platforms. This change will register that value
with the resource.
Signed-off-by: Mike Yoknis <mike.yoknis@hp.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
The xw9300 BIOS supplies _SEG methods that are incorrect, which results
in some LSI SCSI devices not being discovered. This adds a quirk to
ignore _SEG on this machine and default to zero.
The xw9300 has three host bridges:
ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-3f])
ACPI: PCI Root Bridge [PCI1] (domain 0001 [bus 40-7f])
ACPI: PCI Root Bridge [PCI2] (domain 0002 [bus 80-ff])
When the BIOS "ACPI Bus Segmentation" option is enabled (as it is by
default), the _SEG methods of the PCI1 and PCI2 bridges return 1 and 2,
respectively. However, the BIOS implementation appears to be incomplete,
and we can't enumerate devices in those domains.
But if we assume PCI1 and PCI2 really lead to buses in domain 0,
everything works fine. Windows XP and Vista also seem to ignore
these _SEG methods.
Reference: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=543308
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=15362
Reported-and-Tested-by: Sean M. Pappalardo <pegasus@renegadetech.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Use kzalloc() so the struct resource doesn't contain garbage in
fields we don't initialize.
[bhelgaas: changelog]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: x86@kernel.org
For each resource of a PCI host bridge, the arch code and PCI
code log following messages. We don't need both, so drop the
arch-specific printing.
pci_root PNP0A08:00: host bridge window [io 0x0000-0x03af]
pci_bus 0000:00: root bus resource [io 0x0000-0x03af]
Reviewed-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jiang Liu <liuj97@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
This patch enhances x86 arch-specific code to update MMCONFIG information
when PCI host bridge hotplug event happens.
Reviewed-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jiang Liu <liuj97@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Add the host bridge bus number aperture from _CRS to the resource list.
Like the MMIO and I/O port apertures, this will be used when assigning
resources to hot-added devices or in the case of conflicts.
Note that we always use the _CRS bus number aperture, even if we're
ignoring _CRS otherwise.
[bhelgaas: changelog]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Replace the struct pci_bus secondary/subordinate members with the
struct resource busn_res. Later we'll build a resource tree of these
bus numbers.
[bhelgaas: changelog]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Add resource_overlaps(), which returns true if two resources overlap at all.
Use this to replace the complicated check in coalesce_windows().
Signed-Off-By: Wei Yang <weiyang@linux.vnet.ibm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Embed the x86 struct pci_sysdata in the struct pci_root_info so it
will be automatically freed in the remove path.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
We now keep the pci_root_info struct for the entire lifetime of the
host bridge, so just embed the name in the struct rather than
allocating it separately.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
1. Allocate pci_root_info instead of using stack. We need to pass around
info for release function.
2. Add release_pci_root_info
3. Set x86 host bridge release function to make sure root bridge
related resources get freed during root bus removal.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Rename get_current_resources() to probe_pci_root_info.
1. Remove resource list head from pci_root_info
2. Make get_current_resources() not pass resources
3. Rename get_current_resources() to probe_pci_root_info()
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
In pci_scan_acpi_root(), when pci_use_crs is set, get_current_resources()
is used to get pci_root_info, and it will allocate name and resource array.
Later if pci_create_root_bus() can not create bus (could be already
there...) it will only free bus res list, but the name and res array is not
freed.
Let get_current_resource() take info pointer instead of using local info.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Pull PCI changes (including maintainer change) from Jesse Barnes:
"This pull has some good cleanups from Bjorn and Yinghai, as well as
some more code from Yinghai to better handle resource re-allocation
when enabled.
There's also a new initcall_debug feature from Arjan which will print
out quirk timing information to help identify slow quirks for fixing
or refinement (Yinghai sent in a few patches to do just that once the
new debug code landed).
Beyond that, I'm handing off PCI maintainership to Bjorn Helgaas.
He's been a core PCI and Linux contributor for some time now, and has
kindly volunteered to take over. I just don't feel I have the time
for PCI review and work that it deserves lately (I've taken on some
other projects), and haven't been as responsive lately as I'd like, so
I approached Bjorn asking if he'd like to manage things. He's going
to give it a try, and I'm confident he'll do at least as well as I
have in keeping the tree managed, patches flowing, and keeping things
stable."
Fix up some fairly trivial conflicts due to other cleanups (mips device
resource fixup cleanups clashing with list handling cleanup, ppc iseries
removal clashing with pci_probe_only cleanup etc)
* 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci: (112 commits)
PCI: Bjorn gets PCI hotplug too
PCI: hand PCI maintenance over to Bjorn Helgaas
unicore32/PCI: move <asm-generic/pci-bridge.h> include to asm/pci.h
sparc/PCI: convert devtree and arch-probed bus addresses to resource
powerpc/PCI: allow reallocation on PA Semi
powerpc/PCI: convert devtree bus addresses to resource
powerpc/PCI: compute I/O space bus-to-resource offset consistently
arm/PCI: don't export pci_flags
PCI: fix bridge I/O window bus-to-resource conversion
x86/PCI: add spinlock held check to 'pcibios_fwaddrmap_lookup()'
PCI / PCIe: Introduce command line option to disable ARI
PCI: make acpihp use __pci_remove_bus_device instead
PCI: export __pci_remove_bus_device
PCI: Rename pci_remove_behind_bridge to pci_stop_and_remove_behind_bridge
PCI: Rename pci_remove_bus_device to pci_stop_and_remove_bus_device
PCI: print out PCI device info along with duration
PCI: Move "pci reassigndev resource alignment" out of quirks.c
PCI: Use class for quirk for usb host controller fixup
PCI: Use class for quirk for ti816x class fixup
PCI: Use class for quirk for intel e100 interrupt fixup
...
Carlos was getting
WARNING: at drivers/pci/pci.c:118 pci_ioremap_bar+0x24/0x52()
when probing his sound card, and sound did not work. After adding
pci=use_crs to the kernel command line, no more trouble.
Ok, we can add a quirk. dmidecode output reveals that this is an MSI
MS-7253, for which we already have a quirk, but the short-sighted
author tied the quirk to a single BIOS version, making it not kick in
on Carlos's machine with BIOS V1.2. If a later BIOS update makes it
no longer necessary to look at the _CRS info it will still be
harmless, so let's stop trying to guess which versions have and don't
have accurate _CRS tables.
Addresses https://bugtrack.alsa-project.org/alsa-bug/view.php?id=5533
Also see <https://bugzilla.kernel.org/show_bug.cgi?id=42619>.
Reported-by: Carlos Luna <caralu74@gmail.com>
Reviewed-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
In the spirit of commit 29cf7a30f8 ("x86/PCI: use host bridge _CRS
info on ASUS M2V-MX SE"), this DMI quirk turns on "pci_use_crs" by
default on a board that needs it.
This fixes boot failures and oopses introduced in 3e3da00c01
("x86/pci: AMD one chain system to use pci read out res"). The quirk
is quite targetted (to a specific board and BIOS version) for two
reasons:
(1) to emphasize that this method of tackling the problem one quirk
at a time is a little insane
(2) to give BIOS vendors an opportunity to use simpler tables and
allow us to return to generic behavior (whatever that happens to
be) with a later BIOS update
In other words, I am not at all happy with having quirks like this.
But it is even worse for the kernel not to work out of the box on
these machines, so...
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=42619
Reported-by: Svante Signell <svante.signell@telia.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Host bridges that lead to things like the Uncore need not have any
I/O port or MMIO apertures. For example, in this case:
ACPI: PCI Root Bridge [UNC1] (domain 0000 [bus ff])
PCI: root bus ff: using default resources
PCI host bridge to bus 0000:ff
pci_bus 0000:ff: root bus resource [io 0x0000-0xffff]
pci_bus 0000:ff: root bus resource [mem 0x00000000-0x3fffffffffff]
we should not pretend those default resources are available on bus ff.
CC: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
x86 has two kinds of PCI root bus scanning:
(1) ACPI-based, using _CRS resources. This used pci_create_bus(), not
pci_scan_bus(), because ACPI hotplug needed to split the
pci_bus_add_devices() into a separate host bridge .start() method.
This patch parses the _CRS resources earlier, so we can build a list of
resources and pass it to pci_create_root_bus().
Note that as before, we parse the _CRS even if we aren't going to use
it so we can print it for debugging purposes.
(2) All other, which used either default resources (ioport_resource and
iomem_resource) or information read from the hardware via amd_bus.c or
similar. This used pci_scan_bus().
This patch converts x86_pci_root_bus_res_quirks() (previously called
from pcibios_fixup_bus()) to x86_pci_root_bus_resources(), which builds
a list of resources before we call pci_scan_root_bus().
We also use x86_pci_root_bus_resources() if we have ACPI but are
ignoring _CRS.
CC: Yinghai Lu <yinghai.lu@oracle.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This assures that a _CRS reserved host bridge window or window region is
not used if it is not addressable by the CPU. The new code either trims
the window to exclude the non-addressable portion or totally ignores the
window if the entire window is non-addressable.
The current code has been shown to be problematic with 32-bit non-PAE
kernels on systems where _CRS reserves resources above 4GB.
Signed-off-by: Gary Hade <garyhade@us.ibm.com>
Reviewed-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Thomas Renninger <trenn@novell.com>
Cc: linux-kernel@vger.kernel.org
Cc: stable@kernel.org
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Enabling CRS by default breaks suspend on the Thinkpad SL510.
Details in https://bugzilla.redhat.com/show_bug.cgi?id=769657
Reported-by: Stefan Kirrmann <stefan.kirrmann@gmail.com>
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The Dell Studio 1557 also doesn't suspend correctly when CRS is enabled.
Details at https://bugzilla.redhat.com/show_bug.cgi?id=769657
Reported-by: Gregory S. Hoerner <ghoerner@transcendingthought.com>
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Some machines don't boot unless passed pci=nocrs.
(See https://bugzilla.redhat.com/show_bug.cgi?id=770308 for details of
one report. Waiting on dmidecode output for others).
Currently there is a DMI whitelist, even though the default is on.
v2: drop the 1536 blacklist entry, superceded by the PNP/MMCONFIG changes from
Bjorn
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
In summary, this DMI quirk uses the _CRS info by default for the ASUS
M2V-MX SE by turning on `pci=use_crs` and is similar to the quirk
added by commit 2491762cfb ("x86/PCI: use host bridge _CRS info on
ASRock ALiveSATA2-GLAN") whose commit message should be read for further
information.
Since commit 3e3da00c01 ("x86/pci: AMD one chain system to use pci
read out res") Linux gives the following oops:
parport0: PC-style at 0x378, irq 7 [PCSPP,TRISTATE]
HDA Intel 0000:20:01.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
HDA Intel 0000:20:01.0: setting latency timer to 64
BUG: unable to handle kernel paging request at ffffc90011c08000
IP: [<ffffffffa0578402>] azx_probe+0x3ad/0x86b [snd_hda_intel]
PGD 13781a067 PUD 13781b067 PMD 1300ba067 PTE 800000fd00000173
Oops: 0009 [#1] SMP
last sysfs file: /sys/module/snd_pcm/initstate
CPU 0
Modules linked in: snd_hda_intel(+) snd_hda_codec snd_hwdep snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_midi snd_rawmidi snd_seq_midi_event tpm_tis tpm snd_seq tpm_bios psmouse parport_pc snd_timer snd_seq_device parport processor evdev snd i2c_viapro thermal_sys amd64_edac_mod k8temp i2c_core soundcore shpchp pcspkr serio_raw asus_atk0110 pci_hotplug edac_core button snd_page_alloc edac_mce_amd ext3 jbd mbcache sha256_generic cryptd aes_x86_64 aes_generic cbc dm_crypt dm_mod raid1 md_mod usbhid hid sg sd_mod crc_t10dif sr_mod cdrom ata_generic uhci_hcd sata_via pata_via libata ehci_hcd usbcore scsi_mod via_rhine mii nls_base [last unloaded: scsi_wait_scan]
Pid: 1153, comm: work_for_cpu Not tainted 2.6.37-1-amd64 #1 M2V-MX SE/System Product Name
RIP: 0010:[<ffffffffa0578402>] [<ffffffffa0578402>] azx_probe+0x3ad/0x86b [snd_hda_intel]
RSP: 0018:ffff88013153fe50 EFLAGS: 00010286
RAX: ffffc90011c08000 RBX: ffff88013029ec00 RCX: 0000000000000006
RDX: 0000000000000000 RSI: 0000000000000246 RDI: 0000000000000246
RBP: ffff88013341d000 R08: 0000000000000000 R09: 0000000000000040
R10: 0000000000000286 R11: 0000000000003731 R12: ffff88013029c400
R13: 0000000000000000 R14: 0000000000000000 R15: ffff88013341d090
FS: 0000000000000000(0000) GS:ffff8800bfc00000(0000) knlGS:00000000f7610ab0
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: ffffc90011c08000 CR3: 0000000132f57000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process work_for_cpu (pid: 1153, threadinfo ffff88013153e000, task ffff8801303c86c0)
Stack:
0000000000000005 ffffffff8123ad65 00000000000136c0 ffff88013029c400
ffff8801303c8998 ffff88013341d000 ffff88013341d090 ffff8801322d9dc8
ffff88013341d208 0000000000000000 0000000000000000 ffffffff811ad232
Call Trace:
[<ffffffff8123ad65>] ? __pm_runtime_set_status+0x162/0x186
[<ffffffff811ad232>] ? local_pci_probe+0x49/0x92
[<ffffffff8105afc5>] ? do_work_for_cpu+0x0/0x1b
[<ffffffff8105afc5>] ? do_work_for_cpu+0x0/0x1b
[<ffffffff8105afd0>] ? do_work_for_cpu+0xb/0x1b
[<ffffffff8105fd3f>] ? kthread+0x7a/0x82
[<ffffffff8100a824>] ? kernel_thread_helper+0x4/0x10
[<ffffffff8105fcc5>] ? kthread+0x0/0x82
[<ffffffff8100a820>] ? kernel_thread_helper+0x0/0x10
Code: f4 01 00 00 ef 31 f6 48 89 df e8 29 dd ff ff 85 c0 0f 88 2b 03 00 00 48 89 ef e8 b4 39 c3 e0 8b 7b 40 e8 fc 9d b1 e0 48 8b 43 38 <66> 8b 10 66 89 14 24 8b 43 14 83 e8 03 83 f8 01 77 32 31 d2 be
RIP [<ffffffffa0578402>] azx_probe+0x3ad/0x86b [snd_hda_intel]
RSP <ffff88013153fe50>
CR2: ffffc90011c08000
---[ end trace 8d1f3ebc136437fd ]---
Trusting the ACPI _CRS information (`pci=use_crs`) fixes this problem.
$ dmesg | grep -i crs # with the quirk
PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug
The match has to be against the DMI board entries though since the vendor entries are not populated.
DMI: System manufacturer System Product Name/M2V-MX SE, BIOS 0304 10/30/2007
This quirk should be removed when `pci=use_crs` is enabled for machines
from 2006 or earlier or some other solution is implemented.
Using coreboot [1] with this board the problem does not exist but this
quirk also does not affect it either. To be safe though the check is
tightened to only take effect when the BIOS from American Megatrends is
used.
15:13 < ruik> but coreboot does not need that
15:13 < ruik> because i have there only one root bus
15:13 < ruik> the audio is behind a bridge
$ sudo dmidecode
BIOS Information
Vendor: American Megatrends Inc.
Version: 0304
Release Date: 10/30/2007
[1] http://www.coreboot.org/
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=30552
Cc: stable@kernel.org (2.6.34)
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: x86@kernel.org
Signed-off-by: Paul Menzel <paulepanter@users.sourceforge.net>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit b03e7495a8 ("PCI: Set PCI-E Max Payload Size on fabric")
introduced a potential NULL pointer dereference in calls to
pcie_bus_configure_settings due to attempts to access pci_bus self
variables when the self pointer is NULL.
To correct this, verify that the self pointer in pci_bus is non-NULL
before dereferencing it.
Reported-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Shyam Iyer <shyam_iyer@dell.com>
Signed-off-by: Jon Mason <mason@myri.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
On a given PCI-E fabric, each device, bridge, and root port can have a
different PCI-E maximum payload size. There is a sizable performance
boost for having the largest possible maximum payload size on each PCI-E
device. However, if improperly configured, fatal bus errors can occur.
Thus, it is important to ensure that PCI-E payloads sends by a device
are never larger than the MPS setting of all devices on the way to the
destination.
This can be achieved two ways:
- A conservative approach is to use the smallest common denominator of
the entire tree below a root complex for every device on that fabric.
This means for example that having a 128 bytes MPS USB controller on one
leg of a switch will dramatically reduce performances of a video card or
10GE adapter on another leg of that same switch.
It also means that any hierarchy supporting hotplug slots (including
expresscard or thunderbolt I suppose, dbl check that) will have to be
entirely clamped to 128 bytes since we cannot predict what will be
plugged into those slots, and we cannot change the MPS on a "live"
system.
- A more optimal way is possible, if it falls within a couple of
constraints:
* The top-level host bridge will never generate packets larger than the
smallest TLP (or if it can be controlled independently from its MPS at
least)
* The device will never generate packets larger than MPS (which can be
configured via MRRS)
* No support of direct PCI-E <-> PCI-E transfers between devices without
some additional code to specifically deal with that case
Then we can use an approach that basically ignores downstream requests
and focuses exclusively on upstream requests. In that case, all we need
to care about is that a device MPS is no larger than its parent MPS,
which allows us to keep all switches/bridges to the max MPS supported by
their parent and eventually the PHB.
In this case, your USB controller would no longer "starve" your 10GE
Ethernet and your hotplug slots won't affect your global MPS.
Additionally, the hotplugged devices themselves can be configured to a
larger MPS up to the value configured in the hotplug bridge.
To choose between the two available options, two PCI kernel boot args
have been added to the PCI calls. "pcie_bus_safe" will provide the
former behavior, while "pcie_bus_perf" will perform the latter behavior.
By default, the latter behavior is used.
NOTE: due to the location of the enablement, each arch will need to add
calls to this function. This patch only enables x86.
This patch includes a number of changes recommended by Benjamin
Herrenschmidt.
Tested-by: Jordan_Hargrave@dell.com
Signed-off-by: Jon Mason <mason@myri.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Host bridge windows are top-level resources, so if we find a host bridge
window conflict, it's probably with a hard-coded legacy reservation.
Moving host bridge windows is theoretically possible, but we don't support
it; we just ignore windows with conflicts, and it's not worth making this
a user-visible error.
Reported-and-tested-by: Jools Wills <jools@oxfordinspire.co.uk>
References: https://bugzilla.kernel.org/show_bug.cgi?id=38522
Reported-by: Das <dasfox@gmail.com>
References: https://bugzilla.kernel.org/show_bug.cgi?id=16497
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
The flags field of struct resource from linux/ioport.h is "unsigned
long". Change the "type" parameter of coalesce_windows() function to
match that field. This fixes the following warning messages when
compiling with "make C=1 W=1 bzImage modules":
arch/x86/pci/acpi.c: In function ‘coalesce_windows’:
arch/x86/pci/acpi.c:198: warning: conversion to ‘long unsigned int’ from ‘int’ may change the sign of the result
arch/x86/pci/acpi.c:203: warning: conversion to ‘long unsigned int’ from ‘int’ may change the sign of the result
Signed-off-by: Márton Németh <nm127@freemail.hu>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Some BIOSes provide PCI host bridge windows that overlap, e.g.,
pci_root PNP0A03:00: host bridge window [mem 0xb0000000-0xffffffff]
pci_root PNP0A03:00: host bridge window [mem 0xafffffff-0xdfffffff]
pci_root PNP0A03:00: host bridge window [mem 0xf0000000-0xffffffff]
If we simply insert these as children of iomem_resource, the second window
fails because it conflicts with the first, and the third is inserted as a
child of the first, i.e.,
b0000000-ffffffff PCI Bus 0000:00
f0000000-ffffffff PCI Bus 0000:00
When we claim PCI device resources, this can cause collisions like this
if we put them in the first window:
pci 0000:00:01.0: address space collision: [mem 0xff300000-0xff4fffff] conflicts with PCI Bus 0000:00 [mem 0xf0000000-0xffffffff]
Host bridge windows are top-level resources by definition, so it doesn't
make sense to make the third window a child of the first. This patch
coalesces any host bridge windows that overlap. For the example above,
the result is this single window:
pci_root PNP0A03:00: host bridge window [mem 0xafffffff-0xffffffff]
This fixes a 2.6.34 regression.
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=17011
Reported-and-tested-by: Anisse Astier <anisse@astier.eu>
Reported-and-tested-by: Pramod Dematagoda <pmd.lotr.gandalf@gmail.com>
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
This DMI quirk turns on "pci=use_crs" for the ALiveSATA2-GLAN because
amd_bus.c doesn't handle this system correctly.
The system has a single HyperTransport I/O chain, but has two PCI host
bridges to buses 00 and 80. amd_bus.c learns the MMIO range associated
with buses 00-ff and that this range is routed to the HT chain hosted at
node 0, link 0:
bus: [00, ff] on node 0 link 0
bus: 00 index 1 [mem 0x80000000-0xfcffffffff]
This includes the address space for both bus 00 and bus 80, and amd_bus.c
assumes it's all routed to bus 00.
We find device 80:01.0, which BIOS left in the middle of that space, but
we don't find a bridge from bus 00 to bus 80, so we conclude that 80:01.0
is unreachable from bus 00, and we move it from the original, working,
address to something outside the bus 00 aperture, which does not work:
pci 0000:80:01.0: reg 10: [mem 0xfebfc000-0xfebfffff 64bit]
pci 0000:80:01.0: BAR 0: assigned [mem 0xfd00000000-0xfd00003fff 64bit]
The BIOS told us everything we need to know to handle this correctly,
so we're better off if we just pay attention, which lets us leave the
80:01.0 device at the original, working, address:
ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-7f])
pci_root PNP0A03:00: host bridge window [mem 0x80000000-0xff37ffff]
ACPI: PCI Root Bridge [PCI1] (domain 0000 [bus 80-ff])
pci_root PNP0A08:00: host bridge window [mem 0xfebfc000-0xfebfffff]
This was a regression between 2.6.33 and 2.6.34. In 2.6.33, amd_bus.c
was used only when we found multiple HT chains. 3e3da00c01, which
enabled amd_bus.c even on systems with a single HT chain, caused this
failure.
This quirk was written by Graham. If we ever enable "pci=use_crs" for
machines from 2006 or earlir, this quirk should be removed.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=16007
Cc: stable@kernel.org
Reported-by: Graham Ramsey <ramsey.graham@ntlworld.com>
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, cpufeature: Unbreak compile with gcc 3.x
x86, pat: Fix memory leak in free_memtype
x86, k8: Fix section mismatch for powernowk8_exit()
lib/atomic64_test: fix missing include of linux/kernel.h
x86: remove last traces of quicklist usage
x86, setup: Phoenix BIOS fixup is needed on Dell Inspiron Mini 1012
x86: "nosmp" command line option should force the system into UP mode
arch/x86/pci: use kasprintf
x86, apic: ack all pending irqs when crashed/on kexec