linux/arch/x86/kernel/cpu
Reinette Chatre 36b6f9fcb8 x86/intel_rdt: Fix potential deadlock during resctrl unmount
Lockdep warns about a potential deadlock:

[   66.782842] ======================================================
[   66.782888] WARNING: possible circular locking dependency detected
[   66.782937] 4.14.0-rc2-test-test+ #48 Not tainted
[   66.782983] ------------------------------------------------------
[   66.783052] umount/336 is trying to acquire lock:
[   66.783117]  (cpu_hotplug_lock.rw_sem){++++}, at: [<ffffffff81032395>] rdt_kill_sb+0x215/0x390
[   66.783193]
               but task is already holding lock:
[   66.783244]  (rdtgroup_mutex){+.+.}, at: [<ffffffff810321b6>] rdt_kill_sb+0x36/0x390
[   66.783305]
               which lock already depends on the new lock.

[   66.783364]
               the existing dependency chain (in reverse order) is:
[   66.783419]
               -> #3 (rdtgroup_mutex){+.+.}:
[   66.783467]        __lock_acquire+0x1293/0x13f0
[   66.783509]        lock_acquire+0xaf/0x220
[   66.783543]        __mutex_lock+0x71/0x9b0
[   66.783575]        mutex_lock_nested+0x1b/0x20
[   66.783610]        intel_rdt_online_cpu+0x3b/0x430
[   66.783649]        cpuhp_invoke_callback+0xab/0x8e0
[   66.783687]        cpuhp_thread_fun+0x7a/0x150
[   66.783722]        smpboot_thread_fn+0x1cc/0x270
[   66.783764]        kthread+0x16e/0x190
[   66.783794]        ret_from_fork+0x27/0x40
[   66.783825]
               -> #2 (cpuhp_state){+.+.}:
[   66.783870]        __lock_acquire+0x1293/0x13f0
[   66.783906]        lock_acquire+0xaf/0x220
[   66.783938]        cpuhp_issue_call+0x102/0x170
[   66.783974]        __cpuhp_setup_state_cpuslocked+0x154/0x2a0
[   66.784023]        __cpuhp_setup_state+0xc7/0x170
[   66.784061]        page_writeback_init+0x43/0x67
[   66.784097]        pagecache_init+0x43/0x4a
[   66.784131]        start_kernel+0x3ad/0x3f7
[   66.784165]        x86_64_start_reservations+0x2a/0x2c
[   66.784204]        x86_64_start_kernel+0x72/0x75
[   66.784241]        verify_cpu+0x0/0xfb
[   66.784270]
               -> #1 (cpuhp_state_mutex){+.+.}:
[   66.784319]        __lock_acquire+0x1293/0x13f0
[   66.784355]        lock_acquire+0xaf/0x220
[   66.784387]        __mutex_lock+0x71/0x9b0
[   66.784419]        mutex_lock_nested+0x1b/0x20
[   66.784454]        __cpuhp_setup_state_cpuslocked+0x52/0x2a0
[   66.784497]        __cpuhp_setup_state+0xc7/0x170
[   66.784535]        page_alloc_init+0x28/0x30
[   66.784569]        start_kernel+0x148/0x3f7
[   66.784602]        x86_64_start_reservations+0x2a/0x2c
[   66.784642]        x86_64_start_kernel+0x72/0x75
[   66.784678]        verify_cpu+0x0/0xfb
[   66.784707]
               -> #0 (cpu_hotplug_lock.rw_sem){++++}:
[   66.784759]        check_prev_add+0x32f/0x6e0
[   66.784794]        __lock_acquire+0x1293/0x13f0
[   66.784830]        lock_acquire+0xaf/0x220
[   66.784863]        cpus_read_lock+0x3d/0xb0
[   66.784896]        rdt_kill_sb+0x215/0x390
[   66.784930]        deactivate_locked_super+0x3e/0x70
[   66.784968]        deactivate_super+0x40/0x60
[   66.785003]        cleanup_mnt+0x3f/0x80
[   66.785034]        __cleanup_mnt+0x12/0x20
[   66.785070]        task_work_run+0x8b/0xc0
[   66.785103]        exit_to_usermode_loop+0x94/0xa0
[   66.786804]        syscall_return_slowpath+0xe8/0x150
[   66.788502]        entry_SYSCALL_64_fastpath+0xab/0xad
[   66.790194]
               other info that might help us debug this:

[   66.795139] Chain exists of:
                 cpu_hotplug_lock.rw_sem --> cpuhp_state --> rdtgroup_mutex

[   66.800035]  Possible unsafe locking scenario:

[   66.803267]        CPU0                    CPU1
[   66.804867]        ----                    ----
[   66.806443]   lock(rdtgroup_mutex);
[   66.808002]                                lock(cpuhp_state);
[   66.809565]                                lock(rdtgroup_mutex);
[   66.811110]   lock(cpu_hotplug_lock.rw_sem);
[   66.812608]
                *** DEADLOCK ***

[   66.816983] 2 locks held by umount/336:
[   66.818418]  #0:  (&type->s_umount_key#35){+.+.}, at: [<ffffffff81229738>] deactivate_super+0x38/0x60
[   66.819922]  #1:  (rdtgroup_mutex){+.+.}, at: [<ffffffff810321b6>] rdt_kill_sb+0x36/0x390

When the resctrl filesystem is unmounted the locks should be obtain in the
locks in the same order as was done when the cpus came online:

      cpu_hotplug_lock before rdtgroup_mutex.

This also requires to switch the static_branch_disable() calls to the
_cpulocked variant because now cpu hotplug lock is held already.

[ tglx: Switched to cpus_read_[un]lock ]

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Acked-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
Acked-by: Fenghua Yu <fenghua.yu@intel.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/cc292e76be073f7260604651711c47b09fd0dc81.1508490116.git.reinette.chatre@intel.com
2017-10-21 17:12:21 +02:00
..
mcheck Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-09-04 17:43:56 -07:00
microcode Merge branch 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-09-04 11:11:57 -07:00
mtrr x86/mtrr: Prevent CPU hotplug lock recursion 2017-08-15 13:03:47 +02:00
.gitignore
amd.c x86/cpu/AMD: Fix erratum 1076 (CPB bit) 2017-09-15 11:30:53 +02:00
aperfmperf.c cpufreq: x86: Disable interrupts during MSRs reading 2017-08-11 01:27:41 +02:00
bugs.c x86/mm/32: Move setup_clear_cpu_cap(X86_FEATURE_PCID) earlier 2017-09-17 18:59:08 +02:00
centaur.c Merge branch 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-05-01 20:51:12 -07:00
common.c x86/mm/32: Move setup_clear_cpu_cap(X86_FEATURE_PCID) earlier 2017-09-17 18:59:08 +02:00
cpu.h x86/cpu: Restore MSR_IA32_ENERGY_PERF_BIAS after resume 2015-07-21 07:51:38 +02:00
cyrix.c x86/cpu/cyrix: Add alternative Device ID of Geode GX1 SoC 2017-06-05 08:34:20 +02:00
hypervisor.c x86/cpu: remove hypervisor specific set_cpu_features 2017-05-02 11:14:30 +02:00
intel_cacheinfo.c x86/cpu/amd: Derive L3 shared_cpu_map from cpu_llc_shared_mask 2017-08-10 17:37:43 +02:00
intel_rdt_ctrlmondata.c x86/intel_rdt: Add diagnostics when writing the tasks file 2017-09-27 12:10:10 +02:00
intel_rdt_monitor.c x86/intel_rdt/cqm: Make integer rmid_limbo_count static 2017-10-05 13:20:32 +02:00
intel_rdt_rdtgroup.c x86/intel_rdt: Fix potential deadlock during resctrl unmount 2017-10-21 17:12:21 +02:00
intel_rdt.c x86/intel_rdt: Initialize bitmask of shareable resource if CDP enabled 2017-10-21 17:12:21 +02:00
intel_rdt.h x86/intel_rdt: Add framework for better RDT UI diagnostics 2017-09-27 12:10:10 +02:00
intel.c x86/arch_prctl: Add ARCH_[GET|SET]_CPUID 2017-03-20 16:10:34 +01:00
Makefile x86/intel_rdt: Prepare for RDT monitor data support 2017-08-01 22:41:25 +02:00
match.c x86/kernel: Audit and remove any unnecessary uses of module.h 2016-07-14 15:06:41 +02:00
mkcapflags.sh x86/cpufeature: Carve out X86_FEATURE_* 2016-01-30 11:22:17 +01:00
mshyperv.c x86/hyper-V: Allocate the IDT entry early in boot 2017-09-13 11:02:26 +02:00
perfctr-watchdog.c x86/kernel: Audit and remove any unnecessary uses of module.h 2016-07-14 15:06:41 +02:00
powerflags.c x86/cpu: Add advanced power management bits 2016-03-29 11:12:11 +02:00
proc.c x86: do not use cpufreq_quick_get() for /proc/cpuinfo "cpu MHz" 2017-06-24 01:45:47 +02:00
rdrand.c x86, asm: Use CC_SET()/CC_OUT() and static_cpu_has() in archrandom.h 2016-06-08 12:41:20 -07:00
scattered.c x86/cpu/AMD: Add the Secure Memory Encryption CPU feature 2017-07-18 11:37:59 +02:00
topology.c x86/cpu: Convert printk(KERN_<LEVEL> ...) to pr_<level>(...) 2016-02-03 10:30:03 +01:00
transmeta.c Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-03-07 14:42:34 -08:00
umc.c
vmware.c vmware: set cpu capabilities during platform initialization 2017-05-02 11:14:24 +02:00