linux/arch/x86/kernel
Aaron Lu 82ba4faca1 x86/irq: Do not substract irq_tlb_count from irq_call_count
Since commit:

  52aec3308d ("x86/tlb: replace INVALIDATE_TLB_VECTOR by CALL_FUNCTION_VECTOR")

the TLB remote shootdown is done through call function vector. That
commit didn't take care of irq_tlb_count, which a later commit:

  fd0f586972 ("x86: Distinguish TLB shootdown interrupts from other functions call interrupts")

... tried to fix.

The fix assumes every increase of irq_tlb_count has a corresponding
increase of irq_call_count. So the irq_call_count is always bigger than
irq_tlb_count and we could substract irq_tlb_count from irq_call_count.

Unfortunately this is not true for the smp_call_function_single() case.
The IPI is only sent if the target CPU's call_single_queue is empty when
adding a csd into it in generic_exec_single. That means if two threads
are both adding flush tlb csds to the same CPU's call_single_queue, only
one IPI is sent. In other words, the irq_call_count is incremented by 1
but irq_tlb_count is incremented by 2. Over time, irq_tlb_count will be
bigger than irq_call_count and the substract will produce a very large
irq_call_count value due to overflow.

Considering that:

  1) it's not worth to send more IPIs for the sake of accurate counting of
     irq_call_count in generic_exec_single();

  2) it's not easy to tell if the call function interrupt is for TLB
     shootdown in __smp_call_function_single_interrupt().

Not to exclude TLB shootdown from call function count seems to be the
simplest fix and this patch just does that.

This bug was found by LKP's cyclic performance regression tracking recently
with the vm-scalability test suite. I have bisected to commit:

  3dec0ba0be ("mm/rmap: share the i_mmap_rwsem")

This commit didn't do anything wrong but revealed the irq_call_count
problem. IIUC, the commit makes rwc->remap_one in rmap_walk_file
concurrent with multiple threads.  When remap_one is try_to_unmap_one(),
then multiple threads could queue flush TLB to the same CPU but only
one IPI will be sent.

Since the commit was added in Linux v3.19, the counting problem only
shows up from v3.19 onwards.

Signed-off-by: Aaron Lu <aaron.lu@intel.com>
Cc: Alex Shi <alex.shi@linaro.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tomoki Sekiyama <tomoki.sekiyama.qu@hitachi.com>
Link: http://lkml.kernel.org/r/20160811074430.GA18163@aaronlu.sh.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-08-11 11:14:59 +02:00
..
acpi Merge branch 'x86-headers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-08-01 14:23:42 -04:00
apic x86/platform/UV: Fix kernel panic running RHEL kdump kernel on UV systems 2016-08-10 15:55:39 +02:00
cpu Merge branch 'x86-headers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-08-01 14:23:42 -04:00
fpu x86/mm/pkeys: Fix compact mode by removing protection keys' XSAVE buffer manipulation 2016-08-10 16:12:26 +02:00
kprobes kprobes/x86: Clear TF bit in fault on single-stepping 2016-06-14 12:00:54 +02:00
.gitignore
alternative.c x86/asm: Stop depending on ptrace.h in alternative.h 2016-04-29 11:56:40 +02:00
amd_gart_64.c dma-mapping: use unsigned long for dma_attrs 2016-08-04 08:50:07 -04:00
amd_nb.c Merge branch 'x86-headers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-08-01 14:23:42 -04:00
apb_timer.c x86/apb_timer: Convert to hotplug state machine 2016-07-15 10:40:22 +02:00
aperture_64.c param: convert some "on"/"off" users to strtobool 2016-03-17 15:09:34 -07:00
apm_32.c x86/apm32: Remove paravirt_enabled() use 2016-04-22 10:29:03 +02:00
asm-offsets_32.c x86/entry/32: Simplify and fix up the SYSENTER stack #DB/NMI fixup 2016-03-10 09:48:14 +01:00
asm-offsets_64.c
asm-offsets.c x86/uaccess: Move thread_info::addr_limit to thread_struct 2016-07-15 10:26:30 +02:00
audit_64.c
bootflag.c
check.c
cpuid.c
crash_dump_32.c
crash_dump_64.c
crash.c x86/kernel: Audit and remove any unnecessary uses of module.h 2016-07-14 15:06:41 +02:00
devicetree.c x86/cpufeature: Replace cpu_has_apic with boot_cpu_has() usage 2016-04-13 11:37:41 +02:00
doublefault.c
dumpstack_32.c Merge branch 'x86-headers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-08-01 14:23:42 -04:00
dumpstack_64.c Merge branch 'x86-headers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-08-01 14:23:42 -04:00
dumpstack.c Merge branch 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-07-25 18:18:04 -07:00
e820.c Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-03-15 09:32:27 -07:00
early_printk.c x86: Fix misspellings in comments 2016-02-24 08:44:58 +01:00
early-quirks.c Linux 4.7 2016-07-26 17:26:29 +10:00
ebda.c x86/boot: Simplify EBDA-vs-BIOS reservation logic 2016-07-22 11:46:01 +02:00
espfix_64.c x86: get rid of superfluous __GFP_REPEAT 2016-06-24 17:23:52 -07:00
ftrace.c Nothing major this round. Mostly small clean ups and fixes. 2016-03-24 10:52:25 -07:00
head32.c x86/boot: Reorganize and clean up the BIOS area reservation code 2016-07-21 10:11:57 +02:00
head64.c x86/boot: Reorganize and clean up the BIOS area reservation code 2016-07-21 10:11:57 +02:00
head_32.S Merge branch 'x86/urgent' into x86/asm, to refresh the tree 2016-04-29 11:55:04 +02:00
head_64.S x86/mm: Enable KASLR for physical mapping memory regions 2016-07-08 17:35:15 +02:00
hpet.c RTC for 4.8 2016-08-05 09:48:22 -04:00
hw_breakpoint.c x86/kernel: Audit and remove any unnecessary uses of module.h 2016-07-14 15:06:41 +02:00
i386_ksyms_32.c Merge branch 'x86-headers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-08-01 14:23:42 -04:00
i8237.c
i8253.c x86/kernel: Audit and remove any unnecessary uses of module.h 2016-07-14 15:06:41 +02:00
i8259.c
io_delay.c x86/kernel: Audit and remove any unnecessary uses of module.h 2016-07-14 15:06:41 +02:00
ioport.c x86/iopl: Fix iopl capability check on Xen PV 2016-03-17 09:49:27 +01:00
irq_32.c x86/kernel: Audit and remove any unnecessary uses of module.h 2016-07-14 15:06:41 +02:00
irq_64.c x86/kernel: Audit and remove any unnecessary uses of module.h 2016-07-14 15:06:41 +02:00
irq_work.c
irq.c x86/irq: Do not substract irq_tlb_count from irq_call_count 2016-08-11 11:14:59 +02:00
irqinit.c
jump_label.c x86/asm: Stop depending on ptrace.h in alternative.h 2016-04-29 11:56:40 +02:00
kdebugfs.c x86/kernel: Audit and remove any unnecessary uses of module.h 2016-07-14 15:06:41 +02:00
kexec-bzimage64.c KEYS: Generalise system_verify_data() to provide access to internal content 2016-04-06 16:14:24 +01:00
kgdb.c x86/asm: Stop depending on ptrace.h in alternative.h 2016-04-29 11:56:40 +02:00
ksysfs.c
kvm.c Merge branch 'x86-headers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-08-01 14:23:42 -04:00
kvmclock.c x86: Fix misspellings in comments 2016-02-24 08:44:58 +01:00
ldt.c x86/mm: Factor out LDT init from context init 2016-02-18 19:46:31 +01:00
machine_kexec_32.c
machine_kexec_64.c kexec: provide arch_kexec_protect(unprotect)_crashkres() 2016-05-23 17:04:14 -07:00
Makefile Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/livepatching 2016-05-17 17:11:27 -07:00
mcount_64.S ftrace/x86: Set ftrace_stub to weak to prevent gcc from using short jumps to it 2016-05-20 13:28:40 -04:00
mmconf-fam10h_64.c
module.c x86/asm: Stop depending on ptrace.h in alternative.h 2016-04-29 11:56:40 +02:00
mpparse.c x86/kernel: Audit and remove any unnecessary uses of module.h 2016-07-14 15:06:41 +02:00
msr.c x86/cpufeature: Carve out X86_FEATURE_* 2016-01-30 11:22:17 +01:00
nmi_selftest.c
nmi.c x86: include linux/ratelimit.h in nmi.c 2016-06-06 17:10:15 +02:00
paravirt_patch_32.c
paravirt_patch_64.c
paravirt-spinlocks.c x86/kernel: Audit and remove any unnecessary uses of module.h 2016-07-14 15:06:41 +02:00
paravirt.c x86/kernel: Audit and remove any unnecessary uses of module.h 2016-07-14 15:06:41 +02:00
pci-calgary_64.c dma-mapping: use unsigned long for dma_attrs 2016-08-04 08:50:07 -04:00
pci-dma.c dma-mapping: use unsigned long for dma_attrs 2016-08-04 08:50:07 -04:00
pci-iommu_table.c x86: Fix non-static inlines 2016-04-16 13:21:40 +02:00
pci-nommu.c dma-mapping: use unsigned long for dma_attrs 2016-08-04 08:50:07 -04:00
pci-swiotlb.c dma-mapping: use unsigned long for dma_attrs 2016-08-04 08:50:07 -04:00
pcspeaker.c
perf_regs.c
platform-quirks.c x86/boot: Reorganize and clean up the BIOS area reservation code 2016-07-21 10:11:57 +02:00
pmem.c x86/kernel: Audit and remove any unnecessary uses of module.h 2016-07-14 15:06:41 +02:00
probe_roms.c
process_32.c x86/kernel: Audit and remove any unnecessary uses of module.h 2016-07-14 15:06:41 +02:00
process_64.c x86/kernel: Audit and remove any unnecessary uses of module.h 2016-07-14 15:06:41 +02:00
process.c Merge branch 'x86-headers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-08-01 14:23:42 -04:00
ptrace.c x86/ptrace: Stop setting TS_COMPAT in ptrace code 2016-07-27 11:09:43 +02:00
pvclock.c pvclock: introduce seqcount-like API 2016-08-04 13:52:21 +02:00
quirks.c
reboot_fixups_32.c
reboot.c Merge branch 'x86-headers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-08-01 14:23:42 -04:00
relocate_kernel_32.S
relocate_kernel_64.S
resource.c
rtc.c char/genrtc: x86: remove remnants of asm/rtc.h 2016-06-04 00:20:07 +02:00
setup_percpu.c x86/acpi: store ACPI ids from MADT for future usage 2016-07-25 13:30:53 +01:00
setup.c x86/mm/KASLR: Fix physical memory calculation on KASLR memory randomization 2016-08-10 14:45:19 +02:00
signal_compat.c x86/signals: Add build-time checks to the siginfo compat code 2016-06-14 12:19:24 +02:00
signal.c Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-08-06 09:04:35 -04:00
smp.c
smpboot.c Merge branch 'x86-headers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-08-01 14:23:42 -04:00
stacktrace.c x86/kernel: Audit and remove any unnecessary uses of module.h 2016-07-14 15:06:41 +02:00
step.c
sys_x86_64.c
sysfb_efi.c Merge branch 'linus' into efi/core, to pick up fixes 2016-05-07 07:00:07 +02:00
sysfb_simplefb.c
sysfb.c
tboot.c x86/tboot: Convert to hotplug state machine 2016-07-15 10:40:30 +02:00
tce_64.c x86/cpufeature: Remove cpu_has_clflush 2016-03-31 13:35:09 +02:00
test_nx.c x86/mm: Always enable CONFIG_DEBUG_RODATA and remove the Kconfig option 2016-02-22 08:51:38 +01:00
test_rodata.c x86: Don't use module.h just for AUTHOR / LICENSE tags 2016-07-14 13:04:20 +02:00
time.c
tls.c x86/tls: Synchronize segment registers in set_thread_area() 2016-04-29 11:56:42 +02:00
tls.h
topology.c
trace_clock.c
tracepoint.c
traps.c x86/kernel: Audit and remove any unnecessary uses of module.h 2016-07-14 15:06:41 +02:00
tsc_msr.c x86/tsc_msr: Remove irqoff around MSR-based TSC enumeration 2016-07-11 21:30:12 +02:00
tsc_sync.c
tsc.c Merge branch 'x86-headers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-08-01 14:23:42 -04:00
uprobes.c Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-05-16 15:15:17 -07:00
verify_cpu.S x86/cpufeature: Carve out X86_FEATURE_* 2016-01-30 11:22:17 +01:00
vm86_32.c x86, bitops: remove use of "sbb" to return CF 2016-06-08 12:41:20 -07:00
vmlinux.lds.S x86/boot: Move compressed kernel to the end of the decompression buffer 2016-04-29 11:03:29 +02:00
vsmp_64.c
x86_init.c Merge branch 'x86-headers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-08-01 14:23:42 -04:00
x8664_ksyms_64.c Merge branch 'x86-headers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-08-01 14:23:42 -04:00