linux/drivers/base
Pierre Gondois 5944ce092b arch_topology: Build cacheinfo from primary CPU
commit 3fcbf1c77d ("arch_topology: Fix cache attributes detection
in the CPU hotplug path")
adds a call to detect_cache_attributes() to populate the cacheinfo
before updating the siblings mask. detect_cache_attributes() allocates
memory and can take the PPTT mutex (on ACPI platforms). On PREEMPT_RT
kernels, on secondary CPUs, this triggers a:
  'BUG: sleeping function called from invalid context' [1]
as the code is executed with preemption and interrupts disabled.

The primary CPU was previously storing the cache information using
the now removed (struct cpu_topology).llc_id:
commit 5b8dc787ce ("arch_topology: Drop LLC identifier stash from
the CPU topology")

allocate_cache_info() tries to build the cacheinfo from the primary
CPU before the secondary CPUs boot, if the DT/ACPI description
contains cache information.
If allocate_cache_info() fails, then fall back to the current state
for the cacheinfo allocation. [1] will be triggered in such a case.
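
The early-allocation-with-fallback idea above can be illustrated with a
small user-space model. This is only a sketch, not the kernel code: the
names struct cache_slot, early_alloc() and bringup_needs_alloc() are
hypothetical, and calloc() stands in for the kernel allocator.

```c
#include <stdbool.h>
#include <stdlib.h>

#define NR_CPUS 4

/* Hypothetical per-CPU slot; not the kernel's struct cpu_cacheinfo. */
struct cache_slot {
	unsigned int num_leaves;
	int *leaves;		/* NULL until storage has been allocated */
};

static struct cache_slot slots[NR_CPUS];

/*
 * Early path: modelled as running on the primary CPU, where a sleeping
 * allocation is allowed. leaves_from_fw stands for the number of cache
 * leaves described by DT/ACPI; 0 means no description is available.
 */
static int early_alloc(unsigned int cpu, unsigned int leaves_from_fw)
{
	if (cpu >= NR_CPUS || leaves_from_fw == 0)
		return -1;	/* caller falls back to the bring-up path */

	slots[cpu].leaves = calloc(leaves_from_fw, sizeof(int));
	if (!slots[cpu].leaves)
		return -1;

	slots[cpu].num_leaves = leaves_from_fw;
	return 0;
}

/*
 * Bring-up path: returns true when the secondary CPU would still have
 * to allocate by itself, i.e. the situation that leads to [1].
 */
static bool bringup_needs_alloc(unsigned int cpu)
{
	return slots[cpu].leaves == NULL;
}
```

In this model, a CPU whose slot was filled by early_alloc() never
allocates during bring-up, while a CPU for which early_alloc() failed
still does.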

When unplugging a CPU, the cacheinfo memory cannot be freed. If it
were, the memory would have to be allocated again early in the
bring-up of the re-plugged CPU, which would trigger [1].

Note that populate_cache_leaves() might be called multiple times
due to populate_leaves being moved up. This is required since
detect_cache_attributes() might be called with per_cpu_cacheinfo(cpu)
being allocated but not populated.
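
The "allocated but not populated" case can be modelled as follows. This
is a hedged user-space sketch, not the kernel implementation: struct
leaf_set, populate() and detect() are hypothetical names, and the
allocation step is left out on purpose.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one CPU's cacheinfo storage. */
struct leaf_set {
	int *leaves;		/* storage, possibly allocated earlier */
	size_t n;		/* number of leaves in the storage */
	bool populated;		/* set once the leaves hold valid data */
};

static int populate_calls;	/* counts how often populate() ran */

/* Idempotent: writing the same values again is harmless. */
static void populate(struct leaf_set *s)
{
	for (size_t i = 0; i < s->n; i++)
		s->leaves[i] = (int)(i + 1);
	s->populated = true;
	populate_calls++;
}

/*
 * Model of the detection step: existing storage is reused, and the
 * leaves are populated on every call because the storage may have been
 * allocated earlier without being filled in.
 */
static int detect(struct leaf_set *s)
{
	if (!s->leaves)
		return -1;	/* allocation path omitted in this sketch */

	populate(s);
	return 0;
}
```

Calling detect() twice on the same storage runs populate() twice, which
is acceptable here only because populate() is idempotent.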

[1]:
 | BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46
 | in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 0, name: swapper/111
 | preempt_count: 1, expected: 0
 | RCU nest depth: 1, expected: 1
 | 3 locks held by swapper/111/0:
 |  #0:  (&pcp->lock){+.+.}-{3:3}, at: get_page_from_freelist+0x218/0x12c8
 |  #1:  (rcu_read_lock){....}-{1:3}, at: rt_spin_trylock+0x48/0xf0
 |  #2:  (&zone->lock){+.+.}-{3:3}, at: rmqueue_bulk+0x64/0xa80
 | irq event stamp: 0
 | hardirqs last  enabled at (0):  0x0
 | hardirqs last disabled at (0):  copy_process+0x5dc/0x1ab8
 | softirqs last  enabled at (0):  copy_process+0x5dc/0x1ab8
 | softirqs last disabled at (0):  0x0
 | Preemption disabled at:
 |  migrate_enable+0x30/0x130
 | CPU: 111 PID: 0 Comm: swapper/111 Tainted: G        W          6.0.0-rc4-rt6-[...]
 | Call trace:
 |  __kmalloc+0xbc/0x1e8
 |  detect_cache_attributes+0x2d4/0x5f0
 |  update_siblings_masks+0x30/0x368
 |  store_cpu_topology+0x78/0xb8
 |  secondary_start_kernel+0xd0/0x198
 |  __secondary_switched+0xb0/0xb4

Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Link: https://lore.kernel.org/r/20230104183033.755668-7-pierre.gondois@arm.com
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
2023-01-18 09:58:40 +00:00
..
firmware_loader Kbuild updates for v6.2 2022-12-19 12:33:32 -06:00
power Power management updates for 6.2-rc1 2022-12-12 13:19:07 -08:00
regmap regmap: Merge fix for where we get the number of registers from 2022-12-12 11:50:58 +00:00
test device property: Add a blank line in Kconfig of tests 2022-11-23 19:35:31 +01:00
arch_numa.c mm: percpu: add generic pcpu_populate_pte() function 2022-01-20 08:52:52 +02:00
arch_topology.c arch_topology: Build cacheinfo from primary CPU 2023-01-18 09:58:40 +00:00
attribute_container.c
auxiliary.c Documentation/auxiliary_bus: Move the text into the code 2021-12-03 16:41:50 +01:00
base.h driver core: mark driver_allows_async_probing static 2022-11-10 18:31:04 +01:00
bus.c kobject: kset_uevent_ops: make filter() callback take a const * 2022-11-22 17:34:46 +01:00
cacheinfo.c arch_topology: Build cacheinfo from primary CPU 2023-01-18 09:58:40 +00:00
class.c kobject: make kobject_get_ownership() take a constant kobject * 2022-11-22 17:34:29 +01:00
component.c component: Add common helper for compare/release functions 2022-02-25 12:16:12 +01:00
container.c
core.c kobject: kset_uevent_ops: make name() callback take a const * 2022-11-22 17:34:52 +01:00
cpu.c x86/bugs: Report AMD retbleed vulnerability 2022-06-27 10:33:59 +02:00
dd.c driver core: Fix bus_type.match() error handling in __driver_attach() 2022-11-10 18:36:04 +01:00
devcoredump.c devcoredump : Serialize devcd_del work 2022-09-24 14:01:40 +02:00
devres.c devres: Use kmalloc_size_roundup() to match ksize() usage 2022-11-09 15:11:46 +01:00
devtmpfs.c devtmpfs: fix the dangling pointer of global devtmpfsd thread 2022-06-27 16:41:13 +02:00
driver.c driver core: fix driver_set_override() issue with empty strings 2022-09-05 13:01:34 +02:00
firmware.c
hypervisor.c
init.c init: Initialize noop_backing_dev_info early 2022-06-16 10:55:57 +02:00
isa.c
Kconfig devtmpfs: mount with noexec and nosuid 2021-12-30 13:54:42 +01:00
Makefile genirq: Get rid of GENERIC_MSI_IRQ_DOMAIN 2022-11-17 15:15:20 +01:00
map.c
memory.c mm/hwpoison: introduce per-memory_block hwpoison counter 2022-11-08 17:37:22 -08:00
module.c
node.c - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in 2022-10-10 17:53:04 -07:00
physical_location.c driver core: location: Add "back" as a possible output for panel 2022-05-19 19:28:32 +02:00
physical_location.h driver core: Add sysfs support for physical location of a device 2022-04-27 09:51:57 +02:00
pinctrl.c
platform-msi.c platform-msi: Switch to the domain id aware MSI interfaces 2022-12-05 19:21:00 +01:00
platform.c platform: use fwnode_irq_get_byname instead of of_irq_get_byname to get irq 2022-11-10 18:56:47 +01:00
property.c device property: Fix documentation for fwnode_get_next_parent() 2022-12-07 17:22:44 +01:00
soc.c base: soc: Make soc_device_match() simpler and easier to read 2022-03-18 14:28:07 +01:00
swnode.c software node: fix wrong node passed to find nargs_prop 2021-12-22 18:26:18 +01:00
syscore.c
topology.c drivers/base: fix userspace break from using bin_attributes for cpumap and cpulist 2022-07-15 17:36:33 +02:00
trace.c
trace.h
transport_class.c