Commit Graph

682 Commits

Author SHA1 Message Date
Linus Torvalds
193c0d6825 PCI changes for the v3.8 merge window:
Host bridge hotplug:
     - Untangle _PRT from struct pci_bus (Bjorn Helgaas)
     - Request _OSC control before scanning root bus (Taku Izumi)
     - Assign resources when adding host bridge (Yinghai Lu)
     - Remove root bus when removing host bridge (Yinghai Lu)
     - Remove _PRT during hot remove (Yinghai Lu)
 
   SRIOV
     - Add sysfs knobs to control numVFs (Don Dutile)
 
   Power management
     - Notify devices when power resource turned on (Huang Ying)
 
   Bug fixes
     - Work around broken _SEG on HP xw9300 (Bjorn Helgaas)
     - Keep runtime PM enabled for unbound PCI devices (Huang Ying)
     - Fix Optimus dual-GPU runtime D3 suspend issue (Dave Airlie)
     - Fix xen frontend shutdown issue (David Vrabel)
     - Work around PLX PCI 9050 BAR alignment erratum (Ian Abbott)
 
   Miscellaneous
     - Add GPL license for drivers/pci/ioapic (Andrew Cooks)
     - Add standard PCI-X, PCIe ASPM register #defines (Bjorn Helgaas)
     - NumaChip remote PCI support (Daniel Blueman)
     - Fix PCIe Link Capabilities Supported Link Speed definition (Jingoo Han)
     - Convert dev_printk() to dev_info(), etc (Joe Perches)
     - Add support for non PCI BAR ROM data (Matthew Garrett)
     - Add x86 support for host bridge translation offset (Mike Yoknis)
     - Report success only when every driver supports AER (Vijay Pandarathil)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.10 (GNU/Linux)
 
 iQIcBAABAgAGBQJQyKwSAAoJEPGMOI97Hn6zScgQAJZK2VDfCv74mKrgSDNokIzH
 5nVDrc9AHKJm7CUODs6keJK5d4TD/za3Zao68zrYHsJJKes2ni2Z3W34HP2RXKK2
 eOmePXOHYPPZMlimP9r9cVxNu1ZJCyp/yWSBcsPF4zUgWhBWLRaSj85I049gQ0sz
 +05nZYfLjVd3HNiaXsG4CQyMrNF46XEsLhF9vs+Nr2GHPwrpzhfScgYv63oDS86C
 3ICKsjmiRUZcNelxIFYmyxa5u89QdW5XHjzc9eHGQuus24Vxw+TZzsdfc17sUJEE
 HTyXY+RjDpOVhdtwwUjrCEOiyZYvy3g9+3sKxoxgt/76ghdUaR7fxITwB97qVMFD
 T0ESlKjSV/Qv5QYdyy5uP4zwNs/PXCWXkTg/L1m71F30BxKWDa7tgiA6uK7Z7fl5
 1aokKBdk3mtJJJIDJG1YkxPXx/JItTGCNYrx7CcFj49rSjrUWLQdmrYahersRIsB
 3wiD2xTi9e4dXeP/+VGzGOWB/sHk+73jvrvZe/REa1FCnMINDz4+9V9WaGROMqyq
 MQ8kX0KfYcNVNxy1GOXjU5wLpMN/t/QbvI7gwzRP1DAUCJPoOgFy7AjvSTVG3zuy
 8CtdOFttVkUn5dqsbQR0gVbyQVTS3PGSKz5XC/s8kVDWhja0xZTBYwrskM/4zdSD
 Xf48OyYV5EjpC3FYUSiU
 =OE3Q
 -----END PGP SIGNATURE-----

Merge tag 'for-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI update from Bjorn Helgaas:
 "Host bridge hotplug:
   - Untangle _PRT from struct pci_bus (Bjorn Helgaas)
   - Request _OSC control before scanning root bus (Taku Izumi)
   - Assign resources when adding host bridge (Yinghai Lu)
   - Remove root bus when removing host bridge (Yinghai Lu)
   - Remove _PRT during hot remove (Yinghai Lu)

  SRIOV
    - Add sysfs knobs to control numVFs (Don Dutile)

  Power management
   - Notify devices when power resource turned on (Huang Ying)

  Bug fixes
   - Work around broken _SEG on HP xw9300 (Bjorn Helgaas)
   - Keep runtime PM enabled for unbound PCI devices (Huang Ying)
   - Fix Optimus dual-GPU runtime D3 suspend issue (Dave Airlie)
   - Fix xen frontend shutdown issue (David Vrabel)
   - Work around PLX PCI 9050 BAR alignment erratum (Ian Abbott)

  Miscellaneous
   - Add GPL license for drivers/pci/ioapic (Andrew Cooks)
   - Add standard PCI-X, PCIe ASPM register #defines (Bjorn Helgaas)
   - NumaChip remote PCI support (Daniel Blueman)
   - Fix PCIe Link Capabilities Supported Link Speed definition (Jingoo
     Han)
   - Convert dev_printk() to dev_info(), etc (Joe Perches)
   - Add support for non PCI BAR ROM data (Matthew Garrett)
   - Add x86 support for host bridge translation offset (Mike Yoknis)
   - Report success only when every driver supports AER (Vijay
     Pandarathil)"

Fix up trivial conflicts.

* tag 'for-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (48 commits)
  PCI: Use phys_addr_t for physical ROM address
  x86/PCI: Add NumaChip remote PCI support
  ath9k: Use standard #defines for PCIe Capability ASPM fields
  iwlwifi: Use standard #defines for PCIe Capability ASPM fields
  iwlwifi: collapse wrapper for pcie_capability_read_word()
  iwlegacy: Use standard #defines for PCIe Capability ASPM fields
  iwlegacy: collapse wrapper for pcie_capability_read_word()
  cxgb3: Use standard #defines for PCIe Capability ASPM fields
  PCI: Add standard PCIe Capability Link ASPM field names
  PCI/portdrv: Use PCI Express Capability accessors
  PCI: Use standard PCIe Capability Link register field names
  x86: Use PCI setup data
  PCI: Add support for non-BAR ROMs
  PCI: Add pcibios_add_device
  EFI: Stash ROMs if they're not in the PCI BAR
  PCI: Add and use standard PCI-X Capability register names
  PCI/PM: Keep runtime PM enabled for unbound PCI devices
  xen-pcifront: Handle backend CLOSED without CLOSING
  PCI: SRIOV control and status via sysfs (documentation)
  PCI/AER: Report success only when every device has AER-aware driver
  ...
2012-12-13 12:14:47 -08:00
Linus Torvalds
1ebaf4f4e6 Merge branch 'x86-timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 timer update from Ingo Molnar:
 "This tree includes HPET fixes and also implements a calibration-free,
  TSC match driven APIC timer interrupt mode: 'TSC deadline mode'
  supported in SandyBridge and later CPUs."

* 'x86-timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: hpet: Fix inverted return value check in arch_setup_hpet_msi()
  x86: hpet: Fix masking of MSI interrupts
  x86: apic: Use tsc deadline for oneshot when available
2012-12-11 20:01:33 -08:00
Linus Torvalds
e9a5a91971 Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cleanups from Ingo Molnar:
 "Small cleanups."

* 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: Fix the error of using "const" in gen-insn-attr-x86.awk
  x86, apic: Cleanup cfg->domain setup for legacy interrupts
  x86: Remove dead hlt_use_halt code
2012-12-11 19:57:56 -08:00
Daniel J Blueman
f9726bfd4b x86/PCI: Add NumaChip remote PCI support
Add NumaChip-specific PCI access mechanism via MMCONFIG cycles, but
preventing access to AMD Northbridges which shouldn't respond.

Signed-off-by: Daniel J Blueman <daniel@numascale-asia.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2012-12-07 14:24:32 -07:00
Suresh Siddha
29c574c0ab x86, apic: Cleanup cfg->domain setup for legacy interrupts
Issues that need to be handled:
* Handle PIC interrupts on any CPU irrespective of the apic mode
* In the apic lowest priority logical flat delivery mode, be prepared to
  handle the interrupt on any CPU irrespective of what the IO-APIC RTE says.
* Because of above, when the IO-APIC starts handling the legacy PIC interrupt,
  use the same vector that is being used by the PIC while programming the
  corresponding IO-APIC RTE.

Start with all the cpu's in the legacy PIC interrupts cfg->domain.

By the time IO-APIC starts taking over the PIC interrupts, apic driver
model is finalized. So depend on the assign_irq_vector() to update the
cfg->domain and retain the same vector that was used by PIC before.

For the logical apic flat mode, cfg->domain is updated (during the first
call to assign_irq_vector()) to contain all the possible online cpu's (0xff).
Vector used for the legacy PIC interrupt doesn't change when the IO-APIC
starts handling the interrupt. Any interrupt migration after that
doesn't change the cfg->domain or the vector used.

For other apic modes like physical mode, cfg->domain is updated
(during the first call to assign_irq_vector()) to the boot cpu (cpu-0),
with the same vector that is being used by the PIC. When that interrupt is
migrated to a different cpu, cfg->domin and the vector assigned will change
accordingly.

Tested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Link: http://lkml.kernel.org/r/1353970176.21070.51.camel@sbsiddha-desk.sc.intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-11-26 15:43:25 -08:00
Fenghua Yu
8d966a0410 x86, hotplug: Handle retrigger irq by the first available CPU
The first cpu in irq cfg->domain is likely to be CPU 0 and may not be available
when CPU 0 is offline. Instead of using CPU 0 to handle retriggered irq, we use
first available CPU which is online and in this irq's domain.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: http://lkml.kernel.org/r/1352835171-3958-13-git-send-email-fenghua.yu@intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2012-11-14 15:28:11 -08:00
Jan Beulich
5074b85bdd x86: hpet: Fix inverted return value check in arch_setup_hpet_msi()
setup_hpet_msi_remapped() returns a negative error indicator on error
- check for this rather than for a boolean false indication, and pass
on that error code rather than a meaningless "-1".

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Link: http://lkml.kernel.org/r/5093E00D02000078000A60E2@nat28.tlf.novell.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-11-02 22:53:27 +01:00
Suresh Siddha
279f146143 x86: apic: Use tsc deadline for oneshot when available
If the TSC deadline mode is supported, LAPIC timer one-shot mode can be
implemented using IA32_TSC_DEADLINE MSR. An interrupt will be generated
when the TSC value equals or exceeds the value in the IA32_TSC_DEADLINE
MSR.

This enables us to skip the APIC calibration during boot. Also, in
xapic mode, this enables us to skip the uncached apic access to re-arm
the APIC timer.

As this timer ticks at the high frequency TSC rate, we use the
TSC_DIVISOR (32) to work with the 32-bit restrictions in the
clockevent API's to avoid 64-bit divides etc (frequency is u32 and
"unsigned long" in the set_next_event(), max_delta limits the next
event to 32-bit for 32-bit kernel).

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: venki@google.com
Cc: len.brown@intel.com
Link: http://lkml.kernel.org/r/1350941878.6017.31.camel@sbsiddha-desk.sc.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-11-02 11:23:37 +01:00
Dimitri Sivanich
94777fc51b x86/irq/ioapic: Check for valid irq_cfg pointer in smp_irq_move_cleanup_interrupt
Posting this patch to fix an issue concerning sparse irq's that
I raised a while back.  There was discussion about adding
refcounting to sparse irqs (to fix other potential race
conditions), but that does not appear to have been addressed
yet.  This covers the only issue of this type that I've
encountered in this area.

A NULL pointer dereference can occur in
smp_irq_move_cleanup_interrupt() if we haven't yet setup the
irq_cfg pointer in the irq_desc.irq_data.chip_data.

In create_irq_nr() there is a window where we have set
vector_irq in __assign_irq_vector(), but not yet called
irq_set_chip_data() to set the irq_cfg pointer.

Should an IRQ_MOVE_CLEANUP_VECTOR hit the cpu in question during
this time, smp_irq_move_cleanup_interrupt() will attempt to
process the aforementioned irq, but panic when accessing
irq_cfg.

Only continue processing the irq if irq_cfg is non-NULL.

Signed-off-by: Dimitri Sivanich <sivanich@sgi.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Joerg Roedel <joerg.roedel@amd.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Alexander Gordeev <agordeev@redhat.com>
Link: http://lkml.kernel.org/r/20121016125021.GA22935@sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-10-24 12:53:51 +02:00
Andi Kleen
75fdd155ea sections: fix section conflicts in arch/x86
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-10-06 03:04:40 +09:00
Peter Senna Tschudin
4b8073e467 arch/x86: Remove unecessary semicolons
Found by http://coccinelle.lip6.fr/

Signed-off-by: Peter Senna Tschudin <peter.senna@gmail.com>
Cc: avi@redhat.com
Cc: mtosatti@redhat.com
Cc: a.p.zijlstra@chello.nl
Cc: rusty@rustcorp.com.au
Cc: masami.hiramatsu.pt@hitachi.com
Cc: suresh.b.siddha@intel.com
Cc: joerg.roedel@amd.com
Cc: agordeev@redhat.com
Cc: yinghai@kernel.org
Cc: bhelgaas@google.com
Cc: liuj97@gmail.com
Link: http://lkml.kernel.org/r/1347986174-30287-7-git-send-email-peter.senna@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-09-19 17:32:48 +02:00
Suresh Siddha
f1c6300183 x86, apic: fix broken legacy interrupts in the logical apic mode
Recent commit 332afa656e cleaned up
a workaround that updates irq_cfg domain for legacy irq's that
are handled by the IO-APIC. This was assuming that the recent
changes in assign_irq_vector() were sufficient to remove the workaround.

But this broke couple of AMD platforms. One of them seems to be
sending interrupts to the offline cpu's, resulting in spurious
"No irq handler for vector xx (irq -1)" messages when those cpu's come online.
And the other platform seems to always send the interrupt to the last logical
CPU (cpu-7). Recent changes had an unintended side effect of using only logical
cpu-0 in the IO-APIC RTE (during boot for the legacy interrupts) and this
broke the legacy interrupts not getting routed to the cpu-7 on the AMD
platform, resulting in a boot hang.

For now, reintroduce the removed workaround, (essentially not allowing the
vector to change for legacy irq's when io-apic starts to handle the irq. Which
also addressed the uninteded sife effect of just specifying cpu-0 in the
IO-APIC RTE for those irq's during boot).

Reported-and-tested-by: Robert Richter <robert.richter@amd.com>
Reported-and-tested-by: Borislav Petkov <bp@amd64.org>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Link: http://lkml.kernel.org/r/1344453412.29170.5.camel@sbsiddha-desk.sc.intel.com
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2012-08-14 09:52:20 -07:00
Linus Torvalds
1ca0049f2c Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "Various fixes"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86-64, kcmp: The kcmp system call can be common
  arch/x86/kernel/kdebugfs.c: Ensure a consistent return value in error case
  x86/mce: Add quirk for instruction recovery on Sandy Bridge processors
  x86/mce: Move MCACOD defines from mce-severity.c to <asm/mce.h>
  x86/ioapic: Fix NULL pointer dereference on CPU hotplug after disabling irqs
  x86, nops: Missing break resulting in incorrect selection on Intel
  x86: CONFIG_CC_STACKPROTECTOR=y is no longer experimental
2012-08-03 10:59:36 -07:00
Linus Torvalds
4cb38750d4 Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/mm changes from Peter Anvin:
 "The big change here is the patchset by Alex Shi to use INVLPG to flush
  only the affected pages when we only need to flush a small page range.

  It also removes the special INVALIDATE_TLB_VECTOR interrupts (32
  vectors!) and replace it with an ordinary IPI function call."

Fix up trivial conflicts in arch/x86/include/asm/apic.h (added code next
to changed line)

* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/tlb: Fix build warning and crash when building for !SMP
  x86/tlb: do flush_tlb_kernel_range by 'invlpg'
  x86/tlb: replace INVALIDATE_TLB_VECTOR by CALL_FUNCTION_VECTOR
  x86/tlb: enable tlb flush range support for x86
  mm/mmu_gather: enable tlb flush range in generic mmu_gather
  x86/tlb: add tlb_flushall_shift knob into debugfs
  x86/tlb: add tlb_flushall_shift for specific CPU
  x86/tlb: fall back to flush all when meet a THP large page
  x86/flush_tlb: try flush_tlb_single one by one in flush_tlb_range
  x86/tlb_info: get last level TLB entry number of CPU
  x86: Add read_mostly declaration/definition to variables from smp.h
  x86: Define early read-mostly per-cpu macros
2012-07-26 13:17:17 -07:00
Tomoki Sekiyama
1d44b30f35 x86/ioapic: Fix NULL pointer dereference on CPU hotplug after disabling irqs
In the current kernel, percpu variable `vector_irq' is not always
cleared when a CPU is offlined. If the CPU that has the disabled
irqs in vector_irq is hotplugged again, __setup_vector_irq()
hits invalid irq vector and may crash.

This bug can be reproduced as following;

 # echo 0 > /sys/devices/system/cpu/cpu7/online
 # modprobe -r some_driver_using_interrupts     # vector_irq@cpu7 uncleared
 # echo 1 > /sys/devices/system/cpu/cpu7/online # kernel may crash

To fix this problem, this patch clears vector_irq in
__fixup_irqs() when the CPU is offlined.

This also reverts commit f6175f5bfb, which partially fixes
this bug by clearing vector in __clear_irq_vector(). But in
environments with IOMMU IRQ remapper, it could fail because
cfg->domain doesn't contain offlined CPUs. With this patch, the
fix in __clear_irq_vector() can be reverted because every
vector_irq is already cleared in __fixup_irqs() on offlined CPUs.

Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama.qu@hitachi.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: yrl.pp-manager.tt@hitachi.com
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Alexander Gordeev <agordeev@redhat.com>
Link: http://lkml.kernel.org/r/20120726104732.2889.19144.stgit@kvmdev
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-07-26 15:01:17 +02:00
Linus Torvalds
5fecc9d8f5 KVM updates for the 3.6 merge window
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQIcBAABAgAGBQJQDRDNAAoJEI7yEDeUysxlkl8P/3C2AHx2webOU8sVzhfU6ONZ
 ZoGevwBjyZIeJEmiWVpFTTEew1l0PXtpyOocXGNUXIddVnhXTQOKr/Scj4uFbmx8
 ROqgK8NSX9+xOGrBPCoN7SlJkmp+m6uYtwYkl2SGnsEVLWMKkc7J7oqmszCcTQvN
 UXMf7G47/Ul2NUSBdv4Yvizhl4kpvWxluiweDw3E/hIQKN0uyP7CY58qcAztw8nG
 csZBAnnuPFwIAWxHXW3eBBv4UP138HbNDqJ/dujjocM6GnOxmXJmcZ6b57gh+Y64
 3+w9IR4qrRWnsErb/I8inKLJ1Jdcf7yV2FmxYqR4pIXay2Yzo1BsvFd6EB+JavUv
 pJpixrFiDDFoQyXlh4tGpsjpqdXNMLqyG4YpqzSZ46C8naVv9gKE7SXqlXnjyDlb
 Llx3hb9Fop8O5ykYEGHi+gIISAK5eETiQl4yw9RUBDpxydH4qJtqGIbLiDy8y9wi
 Xyi8PBlNl+biJFsK805lxURqTp/SJTC3+Zb7A7CzYEQm5xZw3W/CKZx1ZYBfpaa/
 pWaP6tB7JwgLIVXi4HQayLWqMVwH0soZIn9yazpOEFv6qO8d5QH5RAxAW2VXE3n5
 JDlrajar/lGIdiBVWfwTJLb86gv3QDZtIWoR9mZuLKeKWE/6PRLe7HQpG1pJovsm
 2AsN5bS0BWq+aqPpZHa5
 =pECD
 -----END PGP SIGNATURE-----

Merge tag 'kvm-3.6-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Avi Kivity:
 "Highlights include
  - full big real mode emulation on pre-Westmere Intel hosts (can be
    disabled with emulate_invalid_guest_state=0)
  - relatively small ppc and s390 updates
  - PCID/INVPCID support in guests
  - EOI avoidance; 3.6 guests should perform better on 3.6 hosts on
    interrupt intensive workloads)
  - Lockless write faults during live migration
  - EPT accessed/dirty bits support for new Intel processors"

Fix up conflicts in:
 - Documentation/virtual/kvm/api.txt:

   Stupid subchapter numbering, added next to each other.

 - arch/powerpc/kvm/booke_interrupts.S:

   PPC asm changes clashing with the KVM fixes

 - arch/s390/include/asm/sigp.h, arch/s390/kvm/sigp.c:

   Duplicated commits through the kvm tree and the s390 tree, with
   subsequent edits in the KVM tree.

* tag 'kvm-3.6-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (93 commits)
  KVM: fix race with level interrupts
  x86, hyper: fix build with !CONFIG_KVM_GUEST
  Revert "apic: fix kvm build on UP without IOAPIC"
  KVM guest: switch to apic_set_eoi_write, apic_write
  apic: add apic_set_eoi_write for PV use
  KVM: VMX: Implement PCID/INVPCID for guests with EPT
  KVM: Add x86_hyper_kvm to complete detect_hypervisor_platform check
  KVM: PPC: Critical interrupt emulation support
  KVM: PPC: e500mc: Fix tlbilx emulation for 64-bit guests
  KVM: PPC64: booke: Set interrupt computation mode for 64-bit host
  KVM: PPC: bookehv: Add ESR flag to Data Storage Interrupt
  KVM: PPC: bookehv64: Add support for std/ld emulation.
  booke: Added crit/mc exception handler for e500v2
  booke/bookehv: Add host crit-watchdog exception support
  KVM: MMU: document mmu-lock and fast page fault
  KVM: MMU: fix kvm_mmu_pagetable_walk tracepoint
  KVM: MMU: trace fast page fault
  KVM: MMU: fast path of handling guest page fault
  KVM: MMU: introduce SPTE_MMU_WRITEABLE bit
  KVM: MMU: fold tlb flush judgement into mmu_spte_update
  ...
2012-07-24 12:01:20 -07:00
Linus Torvalds
bd3e57f913 Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 platform changes from Ingo Molnar:
 "This tree mostly involves various APIC driver cleanups/robustization,
  and vSMP motivated platform callback improvements/cleanups"

Fix up trivial conflict due to printk cleanup right next to return value
change.

* 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (29 commits)
  Revert "x86/early_printk: Replace obsolete simple_strtoul() usage with kstrtoint()"
  x86/apic/x2apic: Use multiple cluster members for the irq destination only with the explicit affinity
  x86/apic/x2apic: Limit the vector reservation to the user specified mask
  x86/apic: Optimize cpu traversal in __assign_irq_vector() using domain membership
  x86/vsmp: Fix vector_allocation_domain's return value
  irq/apic: Use config_enabled(CONFIG_SMP) checks to clean up irq_set_affinity() for UP
  x86/vsmp: Fix linker error when CONFIG_PROC_FS is not set
  x86/apic/es7000: Make apicid of a cluster (not CPU) from a cpumask
  x86/apic/es7000+summit: Always make valid apicid from a cpumask
  x86/apic/es7000+summit: Fix compile warning in cpu_mask_to_apicid()
  x86/apic: Fix ugly casting and branching in cpu_mask_to_apicid_and()
  x86/apic: Eliminate cpu_mask_to_apicid() operation
  x86/x2apic/cluster: Vector_allocation_domain() should return a value
  x86/apic/irq_remap: Silence a bogus pr_err()
  x86/vsmp: Ignore IOAPIC IRQ affinity if possible
  x86/apic: Make cpu_mask_to_apicid() operations check cpu_online_mask
  x86/apic: Make cpu_mask_to_apicid() operations return error code
  x86/apic: Avoid useless scanning thru a cpumask in assign_irq_vector()
  x86/apic: Try to spread IRQ vectors to different priority levels
  x86/apic: Factor out default vector_allocation_domain() operation
  ...
2012-07-22 12:19:36 -07:00
Michael S. Tsirkin
1551df646d apic: add apic_set_eoi_write for PV use
KVM PV EOI optimization overrides eoi_write apic op with its own
version. Add an API for this to avoid meddling with core x86 apic driver
data structures directly.

For KVM use, we don't need any guarantees about when the switch to the
new op will take place, so it could in theory use this API after SMP init,
but it currently doesn't, and restricting callers to early init makes it
clear that it's safe as it won't race with actual APIC driver use.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-07-16 12:51:23 +03:00
Suresh Siddha
d872818dbb x86/apic/x2apic: Use multiple cluster members for the irq destination only with the explicit affinity
During boot or driver load etc, interrupt destination is setup
using default target cpu's. Later the user (irqbalance etc) or
the driver (irq_set_affinity/ irq_set_affinity_hint) can request
the interrupt to be migrated to some specific set of cpu's.

In the x2apic cluster routing, for the default scenario use
single cpu as the interrupt destination and when there is an
explicit interrupt affinity request, route the interrupt to
multiple members of a x2apic cluster specified in the cpumask of
the migration request.

This will minmize the vector pressure when there are lot of
interrupt sources and relatively few x2apic clusters (for
example a single socket server). This will allow the performance
critical interrupts to be routed to multiple cpu's in the x2apic
cluster (irqbalance for example uses the cache siblings etc
while specifying the interrupt destination) and allow
non-critical interrupts to be serviced by a single logical cpu.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Alexander Gordeev <agordeev@redhat.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Link: http://lkml.kernel.org/r/1340656709-11423-4-git-send-email-suresh.b.siddha@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-07-06 11:00:23 +02:00
Suresh Siddha
1ac322d0b1 x86/apic/x2apic: Limit the vector reservation to the user specified mask
For the x2apic cluster mode, vector for an interrupt is
currently reserved on all the cpu's that are part of the x2apic
cluster. But the interrupts will be routed only to the cluster
(derived from the first cpu in the mask) members specified in
the mask. So there is no need to reserve the vector in the
unused cluster members.

Modify __assign_irq_vector() to reserve the vectors based on the
user specified irq destination mask. If the new mask is a proper
subset of the currently used mask, cleanup the vector allocation
on the unused cpu members.

Also, allow the apic driver to tune the vector domain based on
the affinity mask (which in most cases is the user-specified
mask).

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Alexander Gordeev <agordeev@redhat.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Link: http://lkml.kernel.org/r/1340656709-11423-3-git-send-email-suresh.b.siddha@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-07-06 11:00:22 +02:00
Suresh Siddha
b39f25a849 x86/apic: Optimize cpu traversal in __assign_irq_vector() using domain membership
Currently __assign_irq_vector() goes through each cpu in the
specified mask until it finds a free vector in all the cpu's
that are part of the same interrupt domain. We visit all the
interrupt domain sibling cpus to reserve the free vector. So,
when we fail to find a free vector in an interrupt domain, it is
safe to continue our search with a cpu belonging to a new
interrupt domain. No need to go through each cpu, if the domain
containing that cpu is already visited.

Use the irq_cfg's old_domain to track the visited domains and
optimize the cpu traversal while finding a free vector in the
given cpumask.

NOTE: We can also optimize the search by using for_each_cpu() and
skip the current cpu, if it is not the first cpu in the mask
returned by the vector_allocation_domain(). But re-using the
cfg->old_domain to track the visited domains will be slightly
faster.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Alexander Gordeev <agordeev@redhat.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Link: http://lkml.kernel.org/r/1340656709-11423-2-git-send-email-suresh.b.siddha@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-07-06 11:00:21 +02:00
Ingo Molnar
6a991accee Merge commit 'v3.5-rc3' into x86/debug
Merge it in to pick up a fix that we are going to clean up in this
branch.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-20 14:22:34 +02:00
Ingo Molnar
8461689c67 Merge branch 'x86/apic' into x86/platform
Merge in x86/apic to solve a vector_allocation_domain() API change semantic merge conflict.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-18 11:09:49 +02:00
Suresh Siddha
7eb9ae0799 irq/apic: Use config_enabled(CONFIG_SMP) checks to clean up irq_set_affinity() for UP
Move the ->irq_set_affinity() routines out of the #ifdef CONFIG_SMP
sections and use config_enabled(CONFIG_SMP) checks inside those
routines. Thus making those routines simple null stubs for
!CONFIG_SMP and retaining those routines with no additional
runtime overhead for CONFIG_SMP kernels.

Cleans up the ifdef CONFIG_SMP in and around routines related to
irq_set_affinity in io_apic and irq_remapping subsystems.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: torvalds@linux-foundation.org
Cc: joerg.roedel@amd.com
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Link: http://lkml.kernel.org/r/1339723729.3475.63.camel@sbsiddha-desk.sc.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-15 14:17:29 +02:00
Ingo Molnar
879060d574 Merge branch 'x86/cleanups' into x86/apic
Merge in the cleanups because a followup x86/apic change relies on them.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-15 14:17:01 +02:00
Alexander Gordeev
5a0a2a3081 x86/apic/es7000: Make apicid of a cluster (not CPU) from a cpumask
cpu_mask_to_apicid_and() always returns apicid of a single CPU,
even in case multiple CPUs were requested. This update fixes a
typo and forces apicid of a cluster to be returned.

Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/20120614075043.GI3383@dhcp-26-207.brq.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-14 12:53:16 +02:00
Alexander Gordeev
214e270b5f x86/apic/es7000+summit: Always make valid apicid from a cpumask
In case of invalid parameters cpu_mask_to_apicid_and() might
return apicid value of 0 (on Summit) or a uninitialized value
(on ES7000), although it is supposed to return apicid of cpu-0
at least. Fix the operation to always return a valid apicid.

Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/20120614075026.GH3383@dhcp-26-207.brq.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-14 12:53:15 +02:00
Alexander Gordeev
49ad3fd483 x86/apic/es7000+summit: Fix compile warning in cpu_mask_to_apicid()
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/20120614075010.GG3383@dhcp-26-207.brq.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-14 12:53:15 +02:00
Alexander Gordeev
ea3807ea52 x86/apic: Fix ugly casting and branching in cpu_mask_to_apicid_and()
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/20120614074954.GF3383@dhcp-26-207.brq.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-14 12:53:14 +02:00
Alexander Gordeev
a5a391561b x86/apic: Eliminate cpu_mask_to_apicid() operation
Since there are only two locations where cpu_mask_to_apicid() is
called from, remove the operation and use only
cpu_mask_to_apicid_and() instead.

Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Suggested-and-acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Acked-by: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/20120614074935.GE3383@dhcp-26-207.brq.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-14 12:53:13 +02:00
Alexander Gordeev
cac4afbc3d x86/x2apic/cluster: Vector_allocation_domain() should return a value
Since commit 8637e38 ("x86/apic: Avoid useless scanning thru a
cpumask in assign_irq_vector()") vector_allocation_domain()
operation indicates if a cpumask is dynamic or static. This
update fixes the oversight and makes the operation to return a
value.

Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/20120614103933.GJ3383@dhcp-26-207.brq.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-14 12:53:12 +02:00
Vlad Zolotarov
0816b0f036 x86: Add read_mostly declaration/definition to variables from smp.h
Add "read-mostly" qualifier to the following variables in
smp.h:

 - cpu_sibling_map
 - cpu_core_map
 - cpu_llc_shared_map
 - cpu_llc_id
 - cpu_number
 - x86_cpu_to_apicid
 - x86_bios_cpu_apicid
 - x86_cpu_to_logical_apicid

As long as all the variables above are only written during the
initialization, this change is meant to prevent the false
sharing. More specifically, on vSMP Foundation platform
x86_cpu_to_apicid shared the same internode_cache_line with
frequently written lapic_events.

From the analysis of the first 33 per_cpu variables out of 219
(memories they describe, to be more specific) the 8 have read_mostly
nature (tlb_vector_offset, cpu_loops_per_jiffy, xen_debug_irq, etc.)
and 25 are frequently written (irq_stack_union, gdt_page,
exception_stacks, idt_desc, etc.).

Assuming that the spread of the rest of the per_cpu variables is
similar, identifying the read mostly memories will make more sense
in terms of long-term code maintenance comparing to identifying
frequently written memories.

Signed-off-by: Vlad Zolotarov <vlad@scalemp.com>
Acked-by: Shai Fultheim <shai@scalemp.com>
Cc: Shai Fultheim (Shai@ScaleMP.com) <Shai@scalemp.com>
Cc: ido@wizery.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1719258.EYKzE4Zbq5@vlad
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-14 12:42:11 +02:00
Alexander Gordeev
4988a40c39 x86/apic: Make cpu_mask_to_apicid() operations check cpu_online_mask
Currently cpu_mask_to_apicid() should not get a offline CPU with
the cpumask. Otherwise some apic drivers might try to access
non-existent per-cpu variables (i.e. x2apic). In that regard
cpu_mask_to_apicid() and cpu_mask_to_apicid_and() operations are
inconsistent.

This fix makes the two operations do not rely on calling
functions and always return the apicid for only online CPUs. As
result, the meaning and implementations of cpu_mask_to_apicid()
and cpu_mask_to_apicid_and() operations become straight.

Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/20120607131624.GG4759@dhcp-26-207.brq.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-08 11:44:30 +02:00
Alexander Gordeev
ff16432412 x86/apic: Make cpu_mask_to_apicid() operations return error code
Current cpu_mask_to_apicid() and cpu_mask_to_apicid_and()
implementations have few shortcomings:

1. A value returned by cpu_mask_to_apicid() is written to
hardware registers unconditionally. Should BAD_APICID get ever
returned it will be written to a hardware too. But the value of
BAD_APICID is not universal across all hardware in all modes and
might cause unexpected results, i.e. interrupts might get routed
to CPUs that are not configured to receive it.

2. Because the value of BAD_APICID is not universal it is
counter- intuitive to return it for a hardware where it does not
make sense (i.e. x2apic).

3. cpu_mask_to_apicid_and() operation is thought as an
complement to cpu_mask_to_apicid() that only applies a AND mask
on top of a cpumask being passed. Yet, as consequence of 18374d8
commit the two operations are inconsistent in that of:
  cpu_mask_to_apicid() should not get a offline CPU with the cpumask
  cpu_mask_to_apicid_and() should not fail and return BAD_APICID
These limitations are impossible to realize just from looking at
the operations prototypes.

Most of these shortcomings are resolved by returning a error
code instead of BAD_APICID. As the result, faults are reported
back early rather than possibilities to cause a unexpected
behaviour exist (in case of [1]).

The only exception is setup_timer_IRQ0_pin() routine. Although
obviously controversial to this fix, its existing behaviour is
preserved to not break the fragile check_timer() and would
better addressed in a separate fix.

Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/20120607131559.GF4759@dhcp-26-207.brq.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-08 11:44:29 +02:00
Alexander Gordeev
8637e38aff x86/apic: Avoid useless scanning thru a cpumask in assign_irq_vector()
In case of static vector allocation domains (i.e. flat) if all
vector numbers are exhausted, an attempt to assign a new vector
will lead to useless scans through all CPUs in the cpumask, even
though it is known that each new pass would fail. Make this
corner case less painful by letting report whether the vector
allocation domain depends on passed arguments or not and stop
scanning early.

The same could have been achived by introducing a static flag to
the apic operations. But let's allow vector_allocation_domain()
have more intelligence here and decide dynamically, in case we
would need it in the future.

Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/20120607131542.GE4759@dhcp-26-207.brq.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-08 11:44:29 +02:00
Alexander Gordeev
1bccd58bff x86/apic: Try to spread IRQ vectors to different priority levels
When assigning a new vector it is primarially done by adding 8
to the previously given out vector number. Hence, two
consequently allocated vector numbers would likely fall into the
same priority level. Try to spread vector numbers to different
priority levels better by changing the step from 8 to 16.

Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/20120607131514.GD4759@dhcp-26-207.brq.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-08 11:44:28 +02:00
Alexander Gordeev
9d8e106676 x86/apic: Factor out default vector_allocation_domain() operation
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Acked-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/20120607131449.GC4759@dhcp-26-207.brq.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-08 11:44:27 +02:00
Tomoki Sekiyama
f6175f5bfb x86/ioapic: Fix NULL pointer dereference on CPU hotplug after disabling irqs
In current Linux, percpu variable `vector_irq' is not cleared on
offlined cpus while disabling devices' irqs. If the cpu that has
the disabled irqs in vector_irq is hotplugged,
__setup_vector_irq() hits invalid irq vector and may crash.

This bug can be reproduced as following;

  # echo 0 > /sys/devices/system/cpu/cpu7/online
  # modprobe -r some_driver_using_interrupts      # vector_irq@cpu7 uncleared
  # echo 1 > /sys/devices/system/cpu/cpu7/online  # kernel may crash

This patch fixes this bug by clearing vector_irq in
__clear_irq_vector() even if the cpu is offlined.

Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama.qu@hitachi.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: yrl.pp-manager.tt@hitachi.com
Cc: ltc-kernel@ml.yrl.intra.hitachi.co.jp
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Alexander Gordeev <agordeev@redhat.com>
Link: http://lkml.kernel.org/r/4FC340BE.7080101@hitachi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 12:03:25 +02:00
Alexander Gordeev
6398268d2b x86/apic: Factor out default cpu_mask_to_apicid() operations
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/20120605112340.GA11454@dhcp-26-207.brq.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 10:22:18 +02:00
Alexander Gordeev
bf721d3a3b x86/apic: Factor out default target_cpus() operation
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/20120605112324.GA11449@dhcp-26-207.brq.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 10:22:17 +02:00
Alexander Gordeev
49d0c7a0a4 x86/apic: Trivial whitespace fixes
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/20120605112310.GA11443@dhcp-26-207.brq.redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 10:22:16 +02:00
Suresh Siddha
0b8255e660 x86/x2apic/cluster: Use all the members of one cluster specified in the smp_affinity mask for the interrupt destination
If the HW implements round-robin interrupt delivery, this
enables multiple cpu's (which are part of the user specified
interrupt smp_affinity mask and belong to the same x2apic
cluster) to service the interrupt.

Also if the platform supports Power Aware Interrupt Routing,
then this enables the interrupt to be routed to an idle cpu or a
busy cpu depending on the perf/power bias tunable.

We are now grouping all the cpu's in a cluster to one vector
domain. So that will limit the total number of interrupt sources
handled by Linux. Previously we support "cpu-count *
available-vectors-per-cpu" interrupt sources but this will now
reduce to "cpu-count/16 * available-vectors-per-cpu".

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: yinghai@kernel.org
Cc: gorcunov@openvz.org
Cc: agordeev@redhat.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1337644682-19854-2-git-send-email-suresh.b.siddha@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 09:51:22 +02:00
Suresh Siddha
332afa656e x86/irq: Update irq_cfg domain unless the new affinity is a subset of the current domain
Until now, irq_cfg domain is mostly static. Either all CPU's
(used by flat mode) or one CPU (first CPU in the irq afffinity
mask) to which irq is being migrated (this is used by the rest
of apic modes).

Upcoming x2apic cluster mode optimization patch allows the irq
to be sent to any CPU in the x2apic cluster (if supported by the
HW). So irq_cfg domain changes on the fly (depending on which
CPU in the x2apic cluster is online).

Instead of checking for any intersection between the new irq
affinity mask and the current irq_cfg domain, check if the new
irq affinity mask is a subset of the current irq_cfg domain.
Otherwise proceed with updating the irq_cfg domain aswell as
assigning vector's on all the CPUs specified in the new mask.

This also cleans up a workaround in updating irq_cfg domain for
legacy irq's that are handled by the IO-APIC.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: yinghai@kernel.org
Cc: gorcunov@openvz.org
Cc: agordeev@redhat.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1337644682-19854-1-git-send-email-suresh.b.siddha@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 09:51:22 +02:00
Joe Perches
c767a54ba0 x86/debug: Add KERN_<LEVEL> to bare printks, convert printks to pr_<level>
Use a more current logging style:

 - Bare printks should have a KERN_<LEVEL> for consistency's sake
 - Add pr_fmt where appropriate
 - Neaten some macro definitions
 - Convert some Ok output to OK
 - Use "%s: ", __func__ in pr_fmt for summit
 - Convert some printks to pr_<level>

Message output is not identical in all cases.

Signed-off-by: Joe Perches <joe@perches.com>
Cc: levinsasha928@gmail.com
Link: http://lkml.kernel.org/r/1337655007.24226.10.camel@joe2Laptop
[ merged two similar patches, tidied up the changelog ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 09:17:22 +02:00
Ido Yariv
7db971b235 x86/platform: Introduce APIC post-initialization callback
Some subarchitectures (such as vSMP) need to slightly adjust the
underlying APIC structure. Add an APIC post-initialization callback
to 'struct x86_platform_ops' for this purpose and use it for
adjusting the APIC structure on vSMP systems.

Signed-off-by: Ido Yariv <ido@wizery.com>
Acked-by: Shai Fultheim <shai@scalemp.com>
Link: http://lkml.kernel.org/r/1338675095-27260-1-git-send-email-ido@wizery.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 09:06:19 +02:00
Jiang Liu
f841d792e3 x86: Return IRQ_SET_MASK_OK_NOCOPY from irq affinity functions
The interrupt chip irq_set_affinity() functions copy the affinity mask
to irq_data->affinity but return 0, i.e. IRQ_SET_MASK_OK.
IRQ_SET_MASK_OK causes the core code to do another redundant copy.

Return IRQ_SET_MASK_OK_NOCOPY to avoid this.

Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Naga Chumbalkar <nagananda.chumbalkar@hp.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Cliff Wickman <cpw@sgi.com>
Cc: Jiang Liu <liuj97@gmail.com>
Cc: Keping Chen <chenkeping@huawei.com>
Link: http://lkml.kernel.org/r/1333120296-13563-4-git-send-email-jiang.liu@huawei.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-05-24 23:16:33 +02:00
Linus Torvalds
d5b4bb4d10 Merge branch 'delete-mca' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux
Pull the MCA deletion branch from Paul Gortmaker:
 "It was good that we could support MCA machines back in the day, but
  realistically, nobody is using them anymore.  They were mostly limited
  to 386-sx 16MHz CPU and some 486 class machines and never more than
  64MB of RAM.  Even the enthusiast hobbyist community seems to have
  dried up close to ten years ago, based on what you can find searching
  various websites dedicated to the relatively short lived hardware.

  So lets remove the support relating to CONFIG_MCA.  There is no point
  carrying this forward, wasting cycles doing routine maintenance on it;
  wasting allyesconfig build time on validating it, wasting I/O on git
  grep'ping over it, and so on."

Let's see if anybody screams.  It generally has compiled, and James
Bottomley pointed out that there was a MCA extension from NCR that
allowed for up to 4GB of memory and PPro-class machines.  So in *theory*
there may be users out there.

But even James (technically listed as a maintainer) doesn't actually
have a system, and while Alan Cox claims to have a machine in his cellar
that he offered to anybody who wants to take it off his hands, he didn't
argue for keeping MCA support either.

So we could bring it back.  But somebody had better speak up and talk
about how they have actually been using said MCA hardware with modern
kernels for us to do that.  And David already took the patch to delete
all the networking driver code (commit a5e371f61a: "drivers/net:
delete all code/drivers depending on CONFIG_MCA").

* 'delete-mca' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux:
  MCA: delete all remaining traces of microchannel bus support.
  scsi: delete the MCA specific drivers and driver code
  serial: delete the MCA specific 8250 support.
  arm: remove ability to select CONFIG_MCA
2012-05-23 17:12:06 -07:00
Linus Torvalds
f08b9c2f8a Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/apic changes from Ingo Molnar:
 "Most of the changes are about helping virtualized guest kernels
  achieve better performance."

Fix up trivial conflicts with the iommu updates to arch/x86/kernel/apic/io_apic.c

* 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/apic: Implement EIO micro-optimization
  x86/apic: Add apic->eoi_write() callback
  x86/apic: Use symbolic APIC_EOI_ACK
  x86/apic: Fix typo EIO_ACK -> EOI_ACK and document it
  x86/xen/apic: Add missing #include <xen/xen.h>
  x86/apic: Only compile local function if used with !CONFIG_GENERIC_PENDING_IRQ
  x86/apic: Fix UP boot crash
  x86: Conditionally update time when ack-ing pending irqs
  xen/apic: implement io apic read with hypercall
  Revert "xen/x86: Workaround 'x86/ioapic: Add register level checks to detect bogus io-apic entries'"
  xen/x86: Implement x86_apic_ops
  x86/apic: Replace io_apic_ops with x86_io_apic_ops.
2012-05-22 18:38:11 -07:00
Michael S. Tsirkin
0ab711ae6a x86/apic: Implement EIO micro-optimization
We know both register and value for eoi beforehand,
so there's no need to check it and no need to do math
to calculate the msr. Saves instructions/branches
on each EOI when using x2apic.

I looked at the objdump output to verify that the
generated code looks right and actually is shorter.

The real improvemements will be on the KVM guest side
though, those come in a later patch.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: gleb@redhat.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/e019d1a125316f10d3e3a4b2f6bda41473f4fb72.1337184153.git.mst@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-18 09:46:09 +02:00
Michael S. Tsirkin
2a43195d83 x86/apic: Add apic->eoi_write() callback
Add eoi_write callback so that kvm can override
eoi accesses without touching the rest of the apic.
As a side-effect, this will enable a micro-optimization
for apics using msr.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: gleb@redhat.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/0df425d746c49ac2ecc405174df87752869629d2.1337184153.git.mst@redhat.com
[ tidied it up a bit ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-18 09:46:08 +02:00