linux/lib
Jens Axboe 60b0ea120c percpu_counter: fix bad counter state during suspend
I got a bug report yesterday from Laszlo Ersek <lersek@xxxxxxxxxx>, in
which he states that his kvm instance fails to suspend. He Laszlo
bisected it down to this commit:

commit 1cf7e9c68f
Author: Jens Axboe <axboe@xxxxxxxxx>
Date: Fri Nov 1 10:52:52 2013 -0600

virtio_blk: blk-mq support

where virtio-blk is converted to use the blk-mq infrastructure. After
digging a bit, it became clear that the issue was with the queue drain.
blk-mq tracks queue usage in a percpu counter, which is incremented on
request alloc and decremented when the request is freed. The initial
hunt was for an inconsistency in blk-mq, but everything seemed fine. In
fact, the counter only returned crazy values when suspend was in
progress. When a CPU is unplugged, the percpu counters merges that CPU
state with the general state. blk-mq takes care to register a hotcpu
notifier with the appropriate priority, so we know it runs after the
percpu counter notifier. However, the percpu counter notifier only
merges the state when the CPU is fully gone. This leaves a state
transition where the CPU going away is no longer in the online mask, yet
it still holds private values. This means that in this state,
percpu_counter_sum() returns invalid results, and the suspend then hangs
waiting for abs(dead-cpu-value) requests to complete which of course
will never happen.

Fix this by clearing the state earlier, so we never have a case where
the CPU isn't in online mask but still holds private state. This bug has
been there since forever, I guess we don't have a lot of users where
percpu counters needs to be reliable during the suspend cycle.

Reported-by: <lersek@redhat.com>
Tested-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-04-07 08:17:10 -06:00
..
fonts partly revert commit 8a10bc9: parisc/sti_console: prefer Linux fonts over built-in ROM fonts 2014-03-23 16:44:42 +01:00
lz4 lz4: fix compression/decompression signedness mismatch 2013-09-11 15:59:45 -07:00
lzo lib/lzo: Update LZO compression to current upstream version 2013-02-20 19:36:01 +01:00
mpi MPILIB: add module description and license 2013-09-25 17:17:01 +01:00
raid6 md update for v3.12 2013-09-10 13:03:41 -07:00
reed_solomon
xz decompressors: fix typo "POWERPC" 2013-03-13 15:21:48 -07:00
zlib_deflate zlib: slim down zlib_deflate() workspace when possible 2011-03-22 17:44:17 -07:00
zlib_inflate
.gitignore X.509: Implement simple static OID registry 2012-10-08 13:50:18 +10:30
argv_split.c argv_split(): teach it to handle mutable strings 2013-04-29 18:28:19 -07:00
asn1_decoder.c Nothing all that exciting; a new module-from-fd syscall for those who want 2012-12-19 07:55:08 -08:00
assoc_array.c assoc_array: remove global variable 2014-01-23 09:25:10 -08:00
atomic64_test.c atomic64_test: simplify the #ifdef for atomic64_dec_if_positive() test 2012-07-30 17:25:16 -07:00
atomic64.c lib: atomic64: Initialize locks statically to fix early users 2012-12-20 13:50:16 -08:00
audit.c audit: support the "standard" <asm-generic/unistd.h> 2011-05-04 14:41:28 -04:00
average.c lib: Ensure EWMA does not store wrong intermediate values 2014-01-16 23:46:06 -08:00
bcd.c usb/core: use bin2bcd() for bcdDevice in RH 2012-09-10 11:13:16 -07:00
bch.c lib: add shared BCH ECC library 2011-03-11 14:25:50 +00:00
bitmap.c propagate name change to comments in kernel source 2012-12-06 10:39:54 +01:00
bitrev.c
bsearch.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
btree.c btree: catch NULL value before it does harm 2012-06-07 14:43:55 -07:00
bug.c taint: add explicit flag to show whether lock dep is still OK. 2013-01-21 17:17:57 +10:30
build_OID_registry X.509: do not emit any informational output 2013-06-19 17:54:06 +02:00
bust_spinlocks.c printk: Provide a wake_up_klogd() off-case 2013-03-22 16:41:20 -07:00
check_signature.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
checksum.c asm-generic headers: Allow yet more arch overrides in checksum.h 2013-02-11 20:00:33 +05:30
clz_ctz.c lib/clz_ctz.c: add prototype declarations in lib/clz_ctz.c 2014-04-03 16:21:12 -07:00
clz_tab.c lib: Fix multiple definitions of clz_tab 2012-02-02 10:34:23 +11:00
cmdline.c lib/cmdline.c: declare exported symbols immediately 2014-01-23 16:36:57 -08:00
cordic.c Docs: wording: functions -> algorithm 2011-10-29 21:20:22 +02:00
cpu_rmap.c Remove GENERIC_HARDIRQ config option 2013-09-13 15:09:52 +02:00
cpu-notifier-error-inject.c cpu: rewrite cpu-notifier-error-inject module 2012-07-30 17:25:22 -07:00
cpumask.c lib/cpumask.c: use memblock apis for early memory allocations 2014-01-21 16:19:47 -08:00
crc7.c
crc8.c lib: crc8: add new library module providing crc8 algorithm 2011-06-03 15:01:06 -04:00
crc16.c
crc32.c lib: crc32: reduce number of cases for crc32{, c}_combine 2013-11-04 15:27:08 -05:00
crc32defs.h crc32: select an algorithm via Kconfig 2012-03-23 16:58:38 -07:00
crc-ccitt.c
crc-itu-t.c
crc-t10dif.c crypto: crct10dif - Add fallback for broken initrds 2013-09-12 15:31:34 +10:00
ctype.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
debug_locks.c mutex: Add support for wound/wait style locks 2013-06-26 12:10:56 +02:00
debugobjects.c lib/debugobjects.c: remove unnecessary work pending test 2013-11-13 12:09:22 +09:00
dec_and_lock.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
decompress_bunzip2.c decompress_bunzip2: remove invalid vi modeline 2011-12-06 10:00:05 +01:00
decompress_inflate.c lib/decompress_inflate.c: include appropriate header file 2014-04-03 16:21:12 -07:00
decompress_unlz4.c lib/decompress_unlz4.c: always set an error return code on failures 2014-01-23 16:37:04 -08:00
decompress_unlzma.c treewide: Fix comment and string typo 'bufer' 2011-12-06 09:53:40 +01:00
decompress_unlzo.c lib/lzo: Rename lzo1x_decompress.c to lzo1x_decompress_safe.c 2013-02-20 19:36:00 +01:00
decompress_unxz.c Fix common misspellings 2011-03-31 11:26:23 -03:00
decompress.c lib: add support for LZ4-compressed kernel 2013-07-09 10:33:30 -07:00
devres.c lib/devres.c: fix some sparse warnings 2014-04-03 16:21:11 -07:00
digsig.c lib/digsig.c: use ERR_CAST inlined function instead of ERR_PTR(PTR_ERR(...)) 2013-11-13 12:09:22 +09:00
div64.c math64: New separate div64_u64_rem helper 2013-08-23 09:02:14 -04:00
dma-debug.c dma debug: account for cachelines and read-only mappings in overlap tracking 2014-03-04 07:55:47 -08:00
dump_stack.c x86, asmlinkage: Make dump_stack visible 2013-08-06 14:21:01 -07:00
dynamic_debug.c dynamic_debug: replace obselete simple_strtoul() with kstrtouint() 2014-01-27 21:02:39 -08:00
dynamic_queue_limits.c bql: Avoid possible inconsistent calculation. 2012-05-31 18:18:17 -04:00
earlycpio.c earlycpio.c: Fix the confusing comment of find_cpio_data(). 2013-08-14 23:24:01 +02:00
extable.c
fault-inject.c debugfs: add get/set for atomic types 2013-06-03 13:55:01 -07:00
fdt_ro.c of/lib: Allow scripts/dtc/libfdt to be used from kernel code 2012-07-23 13:54:52 +01:00
fdt_rw.c of/lib: Allow scripts/dtc/libfdt to be used from kernel code 2012-07-23 13:54:52 +01:00
fdt_strerror.c of/lib: Allow scripts/dtc/libfdt to be used from kernel code 2012-07-23 13:54:52 +01:00
fdt_sw.c of/lib: Allow scripts/dtc/libfdt to be used from kernel code 2012-07-23 13:54:52 +01:00
fdt_wip.c of/lib: Allow scripts/dtc/libfdt to be used from kernel code 2012-07-23 13:54:52 +01:00
fdt.c of/lib: Allow scripts/dtc/libfdt to be used from kernel code 2012-07-23 13:54:52 +01:00
find_last_bit.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
find_next_bit.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
flex_array.c reciprocal_divide: update/correction of the algorithm 2014-01-21 23:17:20 -08:00
flex_proportions.c lib/flex_proportions.c: fix corruption of denominator in flexible proportions 2012-09-25 08:59:21 -07:00
gcd.c lib/gcd.c: prevent possible div by 0 2012-10-06 03:04:57 +09:00
gen_crc32table.c sections: fix const sections for crc32 table 2012-10-06 03:04:46 +09:00
genalloc.c lib/genalloc.c: add check gen_pool_dma_alloc() if dma pointer is not NULL 2014-01-29 16:22:39 -08:00
halfmd4.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
hash.c lib: hash: follow-up fixups for arch hash 2013-12-19 00:14:53 -05:00
hexdump.c lib: introduce upper case hex ascii helpers 2013-09-20 15:38:26 -04:00
hweight.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
idr.c idr: Add new function idr_is_empty() 2014-02-17 16:27:47 +01:00
inflate.c MN10300: Don't try and #include <linux/slab.h> in lib/inflate.c from bootloader 2010-08-12 09:51:35 -07:00
int_sqrt.c lib/int_sqrt.c: optimize square root algorithm 2013-04-29 18:28:19 -07:00
interval_tree_test_main.c random32: rename random32 to prandom 2012-12-17 17:15:26 -08:00
interval_tree.c mm: interval tree updates 2012-10-09 16:22:40 +09:00
iomap_copy.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
iomap.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
iommu-helper.c The following text was taken from the original review request: 2012-03-24 10:24:31 -07:00
ioremap.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
iovec.c Hoist memcpy_fromiovec/memcpy_toiovec into lib/ 2013-05-20 10:24:22 +09:30
irq_regs.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
is_single_threaded.c
jedec_ddr_data.c ddr: add LPDDR2 data from JESD209-2 2012-05-02 00:04:06 -07:00
kasprintf.c lib/kasprintf.c: use kmalloc_track_caller() to get accurate traces for kvasprintf 2012-10-11 08:50:15 +09:00
Kconfig Merge branch 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security 2013-11-21 19:46:00 -08:00
Kconfig.debug locktorture: Add a lock-torture kernel module 2014-02-23 09:04:29 -08:00
Kconfig.kgdb treewide: Fix typo in printk 2013-06-18 13:48:45 +02:00
Kconfig.kmemcheck
kfifo.c kfifo: kfifo_copy_{to,from}_user: fix copied bytes calculation 2013-11-15 09:32:23 +09:00
klist.c klist: del waiter from klist_remove_waiters before wakeup waitting process 2013-05-21 10:16:39 -07:00
kobject_uevent.c kobject: don't block for each kobject_uevent 2014-04-03 16:21:04 -07:00
kobject.c sysfs, kobject: add sysfs wrapper for kernfs_enable_ns() 2014-02-07 16:08:57 -08:00
kstrtox.c lib/kstrtox.c: remove redundant cleanup 2014-01-23 16:36:57 -08:00
kstrtox.h lib/kstrtox: common code between kstrto*() and simple_strto*() functions 2011-10-31 17:30:56 -07:00
lcm.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
libcrc32c.c
list_debug.c rcu: Fix broken strings in RCU's source code. 2012-07-06 06:01:49 -07:00
list_sort.c lib/: rename random32() to prandom_u32() 2013-04-29 18:28:42 -07:00
llist.c llists-move-llist_reverse_order-from-raid5-to-llistc-fix 2013-11-15 09:32:22 +09:00
locking-selftest-hardirq.h
locking-selftest-mutex.h
locking-selftest-rlock-hardirq.h
locking-selftest-rlock-softirq.h
locking-selftest-rlock.h
locking-selftest-rsem.h
locking-selftest-softirq.h
locking-selftest-spin-hardirq.h
locking-selftest-spin-softirq.h
locking-selftest-spin.h
locking-selftest-wlock-hardirq.h
locking-selftest-wlock-softirq.h
locking-selftest-wlock.h
locking-selftest-wsem.h
locking-selftest.c sched: Introduce preempt_count accessor functions 2013-09-25 14:07:32 +02:00
lockref.c lockref: include mutex.h rather than reinvent arch_mutex_cpu_relax 2013-11-27 20:37:33 -08:00
lru_cache.c lru_cache: introduce lc_get_cumulative() 2013-03-22 22:17:36 -06:00
Makefile * Avoid WARN_ON() when mapping BGRT on Baytrail (EFI 32-bit). 2014-02-07 11:27:30 -08:00
md5.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
memory-notifier-error-inject.c memory: memory notifier error injection module 2012-07-30 17:25:22 -07:00
memweight.c string: introduce memweight() 2012-07-30 17:25:16 -07:00
net_utils.c net: core: move mac_pton() to lib/net_utils.c 2013-06-05 12:00:27 -07:00
nlattr.c netlink: don't compare the nul-termination in nla_strcmp 2014-04-01 15:25:02 -04:00
notifier-error-inject.c mode_t, whack-a-mole at 11... 2013-04-09 14:13:05 -04:00
notifier-error-inject.h fault-injection: notifier error injection 2012-07-30 17:25:22 -07:00
of-reconfig-notifier-error-inject.c powerpc+of: Rename and fix OF reconfig notifier error inject module 2012-12-14 10:32:52 +11:00
oid_registry.c Give the OID registry file module info to avoid kernel tainting 2013-05-05 14:38:00 -07:00
parser.c lib/parser.c: put EXPORT_SYMBOLs in the conventional place 2014-01-23 16:36:55 -08:00
pci_iomap.c lib: add NO_GENERIC_PCI_IOPORT_MAP 2012-01-31 23:19:47 +02:00
percpu_counter.c percpu_counter: fix bad counter state during suspend 2014-04-07 08:17:10 -06:00
percpu_ida.c Merge branch 'for-linus' of git://git.kernel.dk/linux-block 2014-02-14 10:45:18 -08:00
percpu_test.c percpu: add test module for various percpu operations 2013-11-13 12:09:11 +09:00
percpu-refcount.c percpu-refcount: Add a WARN() for ref going negative 2014-01-21 04:40:56 -05:00
plist.c lib/plist.c: make plist test announcements KERN_DEBUG 2012-10-06 03:04:58 +09:00
pm-notifier-error-inject.c PM: PM notifier error injection module 2012-07-30 17:25:22 -07:00
prio_heap.c
proportions.c locking, lib/proportions: Annotate prop_local_percpu::lock as raw 2011-09-13 11:11:50 +02:00
radix-tree.c mm: keep page cache radix tree nodes in check 2014-04-03 16:21:01 -07:00
random32.c lib/random32.c: minor cleanups and kdoc fix 2014-04-03 16:21:11 -07:00
ratelimit.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
rational.c lib: Change mail address of Oskar Schirmer 2012-05-17 15:18:37 +02:00
rbtree_test.c rbtree/test: test rbtree_postorder_for_each_entry_safe() 2014-01-23 16:37:03 -08:00
rbtree.c rbtree: add postorder iteration functions 2013-09-11 15:59:19 -07:00
reciprocal_div.c reciprocal_divide: update/correction of the algorithm 2014-01-21 23:17:20 -08:00
scatterlist.c lib/scatterlist: export sg_miter_skip() 2013-12-08 17:56:37 -08:00
sha1.c lib: reduce the use of module.h wherever possible 2012-03-07 15:04:04 -05:00
show_mem.c lib/show_mem.c: show num_poisoned_pages when oom 2014-01-21 16:19:48 -08:00
smp_processor_id.c sched: Introduce preempt_count accessor functions 2013-09-25 14:07:32 +02:00
sort.c
stmp_device.c lib: add support for stmp-style devices 2012-04-20 23:27:08 +02:00
string_helpers.c lib/string_helpers: introduce generic string_unescape 2013-04-30 17:04:03 -07:00
string.c asmlinkage Make __stack_chk_failed and memcmp visible 2014-02-13 18:13:43 -08:00
strncpy_from_user.c word-at-a-time: make the interfaces truly generic 2012-05-26 11:33:40 -07:00
strnlen_user.c lib: Fix generic strnlen_user for 32-bit big-endian machines 2012-05-27 20:59:46 -07:00
swiotlb.c memblock, nobootmem: add memblock_virt_alloc_low() 2014-01-27 21:02:38 -08:00
syscall.c lib/syscall.c: unexport task_current_syscall() 2014-04-03 16:21:06 -07:00
test_module.c test: add minimal module for verification testing 2014-01-23 16:36:57 -08:00
test_user_copy.c test: check copy_to/from_user boundary validation 2014-01-23 16:36:57 -08:00
test-kstrtox.c lib/test-kstrtox.c: mark const init data with __initconst instead of __initdata 2012-05-29 16:22:32 -07:00
test-string_helpers.c lib/string_helpers: introduce generic string_unescape 2013-04-30 17:04:03 -07:00
textsearch.c textsearch: doc - fix spelling in lib/textsearch.c. 2011-01-24 23:33:30 -08:00
timerqueue.c The following text was taken from the original review request: 2012-03-24 10:24:31 -07:00
ts_bm.c
ts_fsm.c
ts_kmp.c
ucs2_string.c Move utf16 functions to kernel core and rename 2013-04-15 21:23:03 +01:00
usercopy.c Kconfig: consolidate CONFIG_DEBUG_STRICT_USER_COPY_CHECKS 2013-04-30 17:04:09 -07:00
uuid.c uuid: use prandom_bytes() 2013-04-29 18:28:42 -07:00
vsprintf.c vsprintf: remove %n handling 2014-04-03 16:21:07 -07:00