linux/drivers
Seiji Aguchi e0d59733f6 efivars, efi-pstore: Hold off deletion of sysfs entry until the scan is completed
Currently, when mounting pstore file system, a read callback of
efi_pstore driver runs mutiple times as below.

- In the first read callback, scan efivar_sysfs_list from head and pass
  a kmsg buffer of a entry to an upper pstore layer.
- In the second read callback, rescan efivar_sysfs_list from the entry
  and pass another kmsg buffer to it.
- Repeat the scan and pass until the end of efivar_sysfs_list.

In this process, an entry is read across the multiple read function
calls. To avoid race between the read and erasion, the whole process
above is protected by a spinlock, holding in open() and releasing in
close().

At the same time, kmemdup() is called to pass the buffer to pstore
filesystem during it. And then, it causes a following lockdep warning.

To make the dynamic memory allocation runnable without taking spinlock,
holding off a deletion of sysfs entry if it happens while scanning it
via efi_pstore, and deleting it after the scan is completed.

To implement it, this patch introduces two flags, scanning and deleting,
to efivar_entry.

On the code basis, it seems that all the scanning and deleting logic is
not needed because __efivars->lock are not dropped when reading from the
EFI variable store.

But, the scanning and deleting logic is still needed because an
efi-pstore and a pstore filesystem works as follows.

In case an entry(A) is found, the pointer is saved to psi->data.  And
efi_pstore_read() passes the entry(A) to a pstore filesystem by
releasing  __efivars->lock.

And then, the pstore filesystem calls efi_pstore_read() again and the
same entry(A), which is saved to psi->data, is used for resuming to scan
a sysfs-list.

So, to protect the entry(A), the logic is needed.

[    1.143710] ------------[ cut here ]------------
[    1.144058] WARNING: CPU: 1 PID: 1 at kernel/lockdep.c:2740 lockdep_trace_alloc+0x104/0x110()
[    1.144058] DEBUG_LOCKS_WARN_ON(irqs_disabled_flags(flags))
[    1.144058] Modules linked in:
[    1.144058] CPU: 1 PID: 1 Comm: systemd Not tainted 3.11.0-rc5 #2
[    1.144058]  0000000000000009 ffff8800797e9ae0 ffffffff816614a5 ffff8800797e9b28
[    1.144058]  ffff8800797e9b18 ffffffff8105510d 0000000000000080 0000000000000046
[    1.144058]  00000000000000d0 00000000000003af ffffffff81ccd0c0 ffff8800797e9b78
[    1.144058] Call Trace:
[    1.144058]  [<ffffffff816614a5>] dump_stack+0x54/0x74
[    1.144058]  [<ffffffff8105510d>] warn_slowpath_common+0x7d/0xa0
[    1.144058]  [<ffffffff8105517c>] warn_slowpath_fmt+0x4c/0x50
[    1.144058]  [<ffffffff8131290f>] ? vsscanf+0x57f/0x7b0
[    1.144058]  [<ffffffff810bbd74>] lockdep_trace_alloc+0x104/0x110
[    1.144058]  [<ffffffff81192da0>] __kmalloc_track_caller+0x50/0x280
[    1.144058]  [<ffffffff815147bb>] ? efi_pstore_read_func.part.1+0x12b/0x170
[    1.144058]  [<ffffffff8115b260>] kmemdup+0x20/0x50
[    1.144058]  [<ffffffff815147bb>] efi_pstore_read_func.part.1+0x12b/0x170
[    1.144058]  [<ffffffff81514800>] ? efi_pstore_read_func.part.1+0x170/0x170
[    1.144058]  [<ffffffff815148b4>] efi_pstore_read_func+0xb4/0xe0
[    1.144058]  [<ffffffff81512b7b>] __efivar_entry_iter+0xfb/0x120
[    1.144058]  [<ffffffff8151428f>] efi_pstore_read+0x3f/0x50
[    1.144058]  [<ffffffff8128d7ba>] pstore_get_records+0x9a/0x150
[    1.158207]  [<ffffffff812af25c>] ? selinux_d_instantiate+0x1c/0x20
[    1.158207]  [<ffffffff8128ce30>] ? parse_options+0x80/0x80
[    1.158207]  [<ffffffff8128ced5>] pstore_fill_super+0xa5/0xc0
[    1.158207]  [<ffffffff811ae7d2>] mount_single+0xa2/0xd0
[    1.158207]  [<ffffffff8128ccf8>] pstore_mount+0x18/0x20
[    1.158207]  [<ffffffff811ae8b9>] mount_fs+0x39/0x1b0
[    1.158207]  [<ffffffff81160550>] ? __alloc_percpu+0x10/0x20
[    1.158207]  [<ffffffff811c9493>] vfs_kern_mount+0x63/0xf0
[    1.158207]  [<ffffffff811cbb0e>] do_mount+0x23e/0xa20
[    1.158207]  [<ffffffff8115b51b>] ? strndup_user+0x4b/0xf0
[    1.158207]  [<ffffffff811cc373>] SyS_mount+0x83/0xc0
[    1.158207]  [<ffffffff81673cc2>] system_call_fastpath+0x16/0x1b
[    1.158207] ---[ end trace 61981bc62de9f6f4 ]---

Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com>
Tested-by: Madper Xie <cxie@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
2013-11-28 20:16:55 +00:00
..
accessibility
acpi More ACPI and power management updates for 3.13-rc1 2013-11-20 13:25:04 -08:00
amba
ata More ACPI and power management updates for 3.13-rc1 2013-11-20 13:25:04 -08:00
atm atm: idt77252: fix dev refcnt leak 2013-11-19 15:53:02 -05:00
auxdisplay
base More ACPI and power management updates for 3.13-rc1 2013-11-20 13:25:04 -08:00
bcma Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2013-11-13 17:40:34 +09:00
block kernel: remove CONFIG_USE_GENERIC_SMP_HELPERS cleanly 2013-11-21 16:42:27 -08:00
bluetooth
bus Merge branch 'for-linus' of git://git.linaro.org/people/rmk/linux-arm 2013-11-14 08:51:29 +09:00
cdrom
char Merge branch 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security 2013-11-21 19:46:00 -08:00
clk Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2013-11-15 16:47:22 -08:00
clocksource
connector connector: improved unaligned access error fix 2013-11-14 17:19:20 -05:00
cpufreq More ACPI and power management updates for 3.13-rc1 2013-11-20 13:25:04 -08:00
cpuidle ACPI and power management updates for 3.13-rc1 2013-11-14 13:41:48 +09:00
crypto tree-wide: use reinit_completion instead of INIT_COMPLETION 2013-11-15 09:32:21 +09:00
dca
devfreq
dio
dma Merge branch 'next' of git://git.infradead.org/users/vkoul/slave-dma 2013-11-20 13:20:24 -08:00
edac Merge branch 'linux_next' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac 2013-11-18 14:51:52 -08:00
eisa
extcon
firewire tree-wide: use reinit_completion instead of INIT_COMPLETION 2013-11-15 09:32:21 +09:00
firmware efivars, efi-pstore: Hold off deletion of sysfs entry until the scan is completed 2013-11-28 20:16:55 +00:00
fmc
gpio ACPI / driver core: Store an ACPI device pointer in struct acpi_dev_node 2013-11-14 23:14:43 +01:00
gpu drm/sysfs: fix hotplug regression since lifetime changes 2013-11-21 21:10:00 +10:00
hid More ACPI and power management updates for 3.13-rc1 2013-11-20 13:25:04 -08:00
hsi
hv
hwmon hwmon: (acpi_power_meter) Fix acpi_bus_get_device() return value check 2013-11-20 08:31:01 -08:00
hwspinlock
i2c More ACPI and power management updates for 3.13-rc1 2013-11-20 13:25:04 -08:00
ide More ACPI and power management updates for 3.13-rc1 2013-11-20 13:25:04 -08:00
idle Merge branch 'pm-cpuidle' 2013-11-19 01:06:28 +01:00
iio kfifo API type safety 2013-11-15 09:32:23 +09:00
infiniband Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending 2013-11-22 10:52:03 -08:00
input Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid 2013-11-15 16:48:22 -08:00
iommu Don't try to compile shmobile-iommu outside of ARM 2013-11-15 18:57:42 -08:00
ipack
irqchip Merge branch 'for-linus' of git://git.linaro.org/people/rmk/linux-arm 2013-11-14 08:51:29 +09:00
isdn net: rework recvmsg handler msg_name and msg_namelen logic 2013-11-20 21:52:30 -05:00
leds
lguest
macintosh DeviceTree updates for 3.13. This is a bit larger pull request than 2013-11-12 16:52:17 +09:00
mailbox
md md update for 3.13. 2013-11-20 13:05:25 -08:00
media Merge branch 'next' of git://git.infradead.org/users/vkoul/slave-dma 2013-11-20 13:20:24 -08:00
memory
memstick tree-wide: use reinit_completion instead of INIT_COMPLETION 2013-11-15 09:32:21 +09:00
message drivers/message/i2o/driver.c: add missing destroy_workqueue() on error in i2o_driver_register() 2013-11-13 12:09:26 +09:00
mfd Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2013-11-15 16:47:22 -08:00
misc Merge branch 'next' of git://git.infradead.org/users/vkoul/slave-dma 2013-11-20 13:20:24 -08:00
mmc More ACPI and power management updates for 3.13-rc1 2013-11-20 13:25:04 -08:00
mtd Merge branch 'next' of git://git.infradead.org/users/vkoul/slave-dma 2013-11-20 13:20:24 -08:00
net Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2013-11-22 09:57:35 -08:00
nfc
ntb dmaengine: remove DMA unmap flags 2013-11-14 11:04:38 -08:00
nubus
of Merge branch 'for-linus-dma-masks' of git://git.linaro.org/people/rmk/linux-arm 2013-11-14 07:55:21 +09:00
oprofile
parisc
parport Kconfig cleanups for v3.13 2013-11-15 14:05:15 -08:00
pci PCI updates for v3.13: 2013-11-22 10:53:47 -08:00
pcmcia DeviceTree updates for 3.13. This is a bit larger pull request than 2013-11-12 16:52:17 +09:00
phy
pinctrl pinctrl: single: call pcs_soc->rearm() whenever IRQ mask is changed 2013-11-14 10:43:17 -08:00
platform More ACPI and power management updates for 3.13-rc1 2013-11-20 13:25:04 -08:00
pnp ACPI: Eliminate the DEVICE_ACPI_HANDLE() macro 2013-11-14 23:17:21 +01:00
power Highlights: 2013-11-18 15:35:09 -08:00
powercap
pps drivers/pps/clients/pps-gpio.c: remove redundant of_match_ptr 2013-11-13 12:09:35 +09:00
ps3
ptp
pwm
rapidio
regulator Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2013-11-15 16:47:22 -08:00
remoteproc
reset
rpmsg
rtc ARM: drivers/rtc/rtc-at91rm9200.c: disable interrupts at shutdown 2013-11-21 16:42:27 -08:00
s390 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux 2013-11-19 11:43:21 -08:00
sbus
scsi Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending 2013-11-22 10:52:03 -08:00
sfi
sh
sn
spi More ACPI and power management updates for 3.13-rc1 2013-11-20 13:25:04 -08:00
ssb
staging Merge branch 'topic/kbuild-fixes-for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media 2013-11-18 15:10:05 -08:00
target Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending 2013-11-22 10:52:03 -08:00
tc
thermal Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2013-11-19 15:50:47 -08:00
tty Merge branch 'next' of git://git.infradead.org/users/vkoul/slave-dma 2013-11-20 13:20:24 -08:00
uio drivers/uio/uio_pruss.c: use gen_pool_dma_alloc() to allocate sram memory 2013-11-13 12:09:23 +09:00
usb Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending 2013-11-22 10:52:03 -08:00
uwb
vfio
vhost Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending 2013-11-22 10:52:03 -08:00
video ARM: SoC fixes for 3.13 merge window 2013-11-16 12:45:55 -08:00
virt
virtio Nothing really exciting: some groundwork for changing virtio endian, and 2013-11-15 13:28:47 +09:00
vlynq
vme
w1 drivers/w1/masters/w1-gpio.c: use dev_get_platdata() 2013-11-15 09:32:21 +09:00
watchdog watchdog: w83627hf: Use helper functions to access superio registers 2013-11-18 21:34:19 +01:00
xen More ACPI and power management updates for 3.13-rc1 2013-11-20 13:25:04 -08:00
zorro
Kconfig ACPI and power management updates for 3.13-rc1 2013-11-14 13:41:48 +09:00
Makefile ACPI and power management updates for 3.13-rc1 2013-11-14 13:41:48 +09:00