* 'fix/asoc' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
ASoC: Select wm_hubs automatically for WM8994
ASoC: Remove duplicate AUX definition from WM8776
ASoC:: remove a redundant snd_soc_unregister_codec call in wm8988_register
ASoC: wm8727: add a missing return in wm8727_platform_probe
ASoC: fsi: fixup wrong value setting order of TDM
ASoC: fsi: fixup clock inversion operation
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6:
math-emu: correct test for downshifting fraction in _FP_FROM_INT()
perf: Add DWARF register lookup for sparc
MAINTAINERS: Add SBUS driver path to sparc entry.
drivers/sbus: Remove unnecessary casts of private_data
sparc: remove homegrown L1_CACHE_ALIGN macro
sparc64: fix the build error due to smp_kgdb_capture_client()
sparc64: Fix maybe_change_configuration() PCR setting.
arch/sparc/kernel: Eliminate what looks like a NULL pointer dereference
sparc64: Update defconfig.
sunsu: Fix use after free in su_remove().
sunserial: Don't call add_preferred_console() when console= is specified.
sparc32: Kill none_mask, it's bogus.
Pointed out by Lucas who found the new one in a comment in
setup_percpu.c. And then I fixed the others that I grepped
for.
Reported-by: Lucas <canolucas@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
CONFIG_ACPI_PROCFS=n:
drivers/acpi/processor_idle.c:83: warning: 'us_to_pm_timer_ticks' defined but not used.
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Len Brown <len.brown@intel.com>
make rpm was broken by commit 0915512:
make clean
set -e; cd ..; ln -sf /usr/src/iwlwifi-2.6 kernel-2.6.35rc4wl
/bin/sh /usr/src/iwlwifi-2.6/scripts/setlocalversion --scm-only >
/usr/src/iwlwifi-2.6/.scmversion
cat: .scmversion: input file is output file
make[1]: *** [rpm] Error 1
Reported-and-tested-by: "Zheng, Jiajia" <jiajia.zheng@intel.com>
Signed-off-by: Michal Marek <mmarek@suse.cz>
Up to 2.6.34 pcmcia_release_irq() reset p_dev->_irq to 0 after releasing
the irq. The IRQ is now released in pcmcia_disable_device(), however
p_dev->_irq is not reset, triggering a warning in pcmcia_device_remove().
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
Add the shrinkers missed in the first conversion of the API in
commit 7f8275d0d6 ("mm: add context argument to
shrinker callback").
Signed-off-by: Dave Chinner <dchinner@redhat.com>
The Nokia RX51 board code (arch/arm/mach-omap2/board-rx51-peripherals.c)
defines a key map for the matrix keypad keyboard. The hardware seems to
use all of the 8 rows and 8 columns of the keypad, although not all
possible locations are used.
The TWL4030 supports keypads with at most 8 rows and 8 columns. Most keys
are defined with a row and column number between 0 and 7, except
KEY(0xff, 2, KEY_F9),
KEY(0xff, 4, KEY_F10),
KEY(0xff, 5, KEY_F11),
which represent keycodes that should be emitted when entire row is
connected to the ground. since the driver handles this case as if we
had an extra column in the key matrix. Unfortunately we do not allocate
enough space and end up owerwriting some random memory.
Reported-and-tested-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: stable@kernel.org
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
We moved input devices from 'struct gc' to individial pads (struct
gc-pad), but gc_nes_process_packet() was still trying to use old
ones and crashing.
Cc: stable@kernel.org
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
The kernel's math-emu code contains a macro _FP_FROM_INT() which is
used to convert an integer to a raw normalized floating-point value.
It does this basically in three steps:
1. Compute the exponent from the number of leading zero bits.
2. Downshift large fractions to put the MSB in the right position
for normalized fractions.
3. Upshift small fractions to put the MSB in the right position.
There is an boundary error in step 2, causing a fraction with its
MSB exactly one bit above the normalized MSB position to not be
downshifted. This results in a non-normalized raw float, which when
packed becomes a massively inaccurate representation for that input.
The impact of this depends on a number of arch-specific factors,
but it is known to have broken emulation of FXTOD instructions
on UltraSPARC III, which was originally reported as GCC bug 44631
<http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44631>.
Any arch which uses math-emu to emulate conversions from integers to
same-size floats may be affected.
The fix is simple: the exponent comparison used to determine if the
fraction should be downshifted must be "<=" not "<".
I'm sending a kernel module to test this as a reply to this message.
There are also SPARC user-space test cases in the GCC bug entry.
Signed-off-by: Mikael Pettersson <mikpe@it.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm/r600: fix possible NULL pointer derefernce
drm/radeon/kms: add quirk for ASUS HD 3600 board
include/linux/vgaarb.h: add missing part of include guard
drm/nouveau: Fix crashes during fbcon init on single head cards.
drm/nouveau: fix pcirom vbios shadow breakage from acpi rom patch
drm/radeon/kms: fix shared ddc harder
drm/i915: enable low power render writes on GEN3 hardware.
drm/i915: Define MI_ARB_STATE bits
vmwgfx: return -EFAULT if copy_to_user fails
fb: handle allocation failure in alloc_apertures()
drm: radeon: check kzalloc() result
drm/ttm: Fix build on architectures without AGP
drm/radeon/kms: fix gtt MC base alignment on rs4xx/rs690/rs740 asics
drm/radeon/kms: fix possible mis-detection of sideport on rs690/rs740
drm/radeon/kms: fix legacy tv-out pal mode
Reported-by: Alexander Y. Fomichev <git.user@gmail.com>
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Connector is actually DVI rather than HDMI.
Reported-by: trapDoor <trapdoor6@gmail.com>
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
vgaarb.h was missing the #define of the #ifndef at the top for the guard
to prevent multiple #include's from causing re-define errors
Signed-off-by: Doug Goldstein <cardoe@gentoo.org>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
ceph: do not include cap/dentry releases in replayed messages
ceph: reuse request message when replaying against recovering mds
ceph: fix creation of ipv6 sockets
ceph: fix parsing of ipv6 addresses
ceph: fix printing of ipv6 addrs
ceph: add kfree() to error path
ceph: fix leak of mon authorizer
ceph: fix message revocation
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (24 commits)
bridge: Partially disable netpoll support
tcp: fix crash in tcp_xmit_retransmit_queue
IPv6: fix CoA check in RH2 input handler (mip6_rthdr_input())
ibmveth: lost IRQ while closing/opening device leads to service loss
rt2x00: Fix lockdep warning in rt2x00lib_probe_dev()
vhost: avoid pr_err on condition guest can trigger
ipmr: Don't leak memory if fib lookup fails.
vhost-net: avoid flush under lock
net: fix problem in reading sock TX queue
net/core: neighbour update Oops
net: skb_tx_hash() fix relative to skb_orphan_try()
rfs: call sock_rps_record_flow() in tcp_splice_read()
xfrm: do not assume that template resolving always returns xfrms
hostap_pci: set dev->base_addr during probe
axnet_cs: use spin_lock_irqsave in ax_interrupt
dsa: Fix Kconfig dependencies.
act_nat: not all of the ICMP packets need an IP header payload
r8169: incorrect identifier for a 8168dp
Phonet: fix skb leak in pipe endpoint accept()
Bluetooth: Update sec_level/auth_type for already existing connections
...
If a single-threaded process does a file-descriptor operation, and some
other process accesses that same file descriptor via /proc, the current
rcu_dereference_check_fdtable() can give a false-positive RCU-lockdep
splat due to the reference count being increased by the /proc access after
the reference-count check in fget_light() but before the check in
rcu_dereference_check_fdtable().
This commit prevents this false positive by checking for a single-threaded
process. To avoid #include hell, this commit uses the wrapper for
thread_group_empty(current) defined by rcu_my_thread_group_empty()
provided in a separate commit.
Located-by: Miles Lane <miles.lane@gmail.com>
Located-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
System will crash sooner or later once the memory with the code of the
s3c-sdhci.ko module is reused for something else. I really have no idea
how the lack of remove function went unnoticed into the mainline code.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
My Collabora address is no longer enabled - update the MODULE_AUTHOR
fields of drivers to my current email address.
Signed-off-by: Andres Salomon <dilinger@queued.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit e534c7c5f8 ("numa: x86_64: use generic percpu var
numa_node_id() implementation") broke numa systems that don't have ram
on node0 when MEMORY_HOTPLUG is enabled, because cpu_up() will call
cpu_to_node() before per_cpu(numa_node) is setup for APs.
When Node0 doesn't have RAM, on x86, cpus already round it to nearest
node with RAM in x86_cpu_to_node_map. and per_cpu(numa_node) is not set
up until in c_init for APs.
When later cpu_up() calling cpu_to_node() will get 0 again, and make it
online even there is no RAM on node0. so later all APs can not booted up,
and later will have panic.
[ 1.611101] On node 0 totalpages: 0
.........
[ 2.608558] On node 0 totalpages: 0
[ 2.612065] Brought up 1 CPUs
[ 2.615199] Total of 1 processors activated (3990.31 BogoMIPS).
...
93.225341] calling loop_init+0x0/0x1a4 @ 1
[ 93.229314] PERCPU: allocation failed, size=80 align=8, failed to populate
[ 93.246539] Pid: 1, comm: swapper Tainted: G W 2.6.35-rc4-tip-yh-04371-gd64e6c4-dirty #354
[ 93.264621] Call Trace:
[ 93.266533] [<ffffffff81125e43>] pcpu_alloc+0x83a/0x8e7
[ 93.270710] [<ffffffff81125f15>] __alloc_percpu+0x10/0x12
[ 93.285849] [<ffffffff8140786c>] alloc_disk_node+0x94/0x16d
[ 93.291811] [<ffffffff81407956>] alloc_disk+0x11/0x13
[ 93.306157] [<ffffffff81503e51>] loop_alloc+0xa7/0x180
[ 93.310538] [<ffffffff8277ef48>] loop_init+0x9b/0x1a4
[ 93.324909] [<ffffffff8277eead>] ? loop_init+0x0/0x1a4
[ 93.329650] [<ffffffff810001f2>] do_one_initcall+0x57/0x136
[ 93.345197] [<ffffffff827486d0>] kernel_init+0x184/0x20e
[ 93.348146] [<ffffffff81034954>] kernel_thread_helper+0x4/0x10
[ 93.365194] [<ffffffff81c7cc3c>] ? restore_args+0x0/0x30
[ 93.369305] [<ffffffff8274854c>] ? kernel_init+0x0/0x20e
[ 93.386011] [<ffffffff81034950>] ? kernel_thread_helper+0x0/0x10
[ 93.392047] loop: out of memory
...
Try to assign per_cpu(numa_node) early
[akpm@linux-foundation.org: tidy up code comment]
Signed-off-by: Yinghai <yinghai@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Acked-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Simply add a proper ID into the device table.
Signed-off-by: Anton Vorontsov <avorontsov@mvista.com>
Cc: Doug Thompson <dougthompson@xmission.com>
Cc: Peter Tyser <ptyser@xes-inc.com>
Cc: Dave Jiang <djiang@mvista.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Since commit 5753c082f6 ("powerpc/85xx:
Kconfig cleanup"), there is no MPC85xx Kconfig symbol anymore, so the
driver became non-selectable.
This patch fixes the issue by switching to PPC_85xx symbol.
Signed-off-by: Anton Vorontsov <avorontsov@mvista.com>
Cc: Doug Thompson <dougthompson@xmission.com>
Cc: Peter Tyser <ptyser@xes-inc.com>
Cc: Dave Jiang <djiang@mvista.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We need lock_page_nosync() here because we have no reference to the
mapping when taking the page lock.
Signed-off-by: Nick Piggin <npiggin@suse.de>
Reviewed-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The last change to improve the scalability moved the actual wake-up out of
the section that is protected by spin_lock(sma->sem_perm.lock).
This means that IN_WAKEUP can be in queue.status even when the spinlock is
acquired by the current task. Thus the same loop that is performed when
queue.status is read without the spinlock acquired must be performed when
the spinlock is acquired.
Thanks to kamezawa.hiroyu@jp.fujitsu.com for noticing lack of the memory
barrier.
Addresses https://bugzilla.kernel.org/show_bug.cgi?id=16255
[akpm@linux-foundation.org: clean up kerneldoc, checkpatch warning and whitespace]
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Reported-by: Luca Tettamanti <kronos.it@gmail.com>
Tested-by: Luca Tettamanti <kronos.it@gmail.com>
Reported-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Maciej Rutecki <maciej.rutecki@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We define a number of symbols in the linker scipt like this:
__start_syscalls_metadata = .;
*(__syscalls_metadata)
But we do not know the alignment of "." when we assign
the __start_syscalls_metadata symbol.
gcc started to uses bigger alignment for structs (32 bytes),
so we saw situations where the linker due to alignment
constraints increased the value of "." after the symbol assignment.
This resulted in boot fails.
Fix this by forcing a 32 byte alignment of "." before the
assignment.
This patch introduces the forced alignment for
ftrace_events and syscalls_metadata.
It may be required in more places.
Reported-by: Zeev Tarantov <zeev.tarantov@gmail.com>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
LKML-Reference: <20100710063459.GA14596@merkur.ravnborg.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
this fixes a regression since the fbcon rework.
Signed-off-by: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
On nv50 it became impossible to attempt a PCI ROM shadow of the VBIOS,
which will break some setups.
This patch also removes the different ordering of shadow methods for
pre-nv50 chipsets. The reason for the different ordering was paranoia,
but it should hopefully be OK to try shadowing PRAMIN first.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This fixes a regression caused by b2ea4aa67b
due to the way shared ddc with multiple digital connectors was handled.
You generally have two cases where DDC lines are shared:
- HDMI + VGA
- HDMI + DVI-D
HDMI + VGA is easy to deal with because you can check the EDID for the
to see if the attached monitor is digital. A shared DDC line with two
digital connectors is more complex. You can't use the hdmi bits in the
EDID since they may not be there with DVI<->HDMI adapters. In this case
all we can do is check the HPD pins to see which is connected as we have
no way of knowing using the EDID.
Reported-by: trapdoor6@gmail.com
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
My platform makes use of the null_legacy_pic choice and oopses when doing
a shutdown as the shutdown code goes through all the registered sysdevs
and calls their shutdown method which in my case poke on a non-existing
i8259. Imho the i8259 specific sysdev should only be registered if the
i8259 is actually there.
Do not register the sysdev function when the null_legacy_pic is used so
that the i8259 resume, suspend and shutdown functions are not called.
Signed-off-by: Adam Lackorzynski <adam@os.inf.tu-dresden.de>
LKML-Reference: <201007202218.o6KMIJ3m020955@imap1.linux-foundation.org>
Cc: Jacob Pan <jacob.jun.pan@intel.com>
Cc: <stable@kernel.org> 2.6.34
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Having both IO-mapping.txt and io-mapping.txt in Documentation/
was confusing and/or bothersome to some people, so rename
IO-mapping.txt to bus-virt-phys-mapping.txt. Also update
Documentation/00-INDEX for both of these files.
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Kees Bakker <kees.bakker@xs4all.nl>
Cc: Keith Packard <keithp@keithp.com>
The 'source' builtin is a bash alias to the '.' (dot) builtin. While the
former is supported only by bash, the latter is specified in POSIX and
works fine with all POSIX-compliant shells I am aware of.
The '$_' special parameter is specific to bash. It is partially
supported in dash too but it always evaluates to the current script path
(which causes the script to enter a loop recursively re-executing
itself). This is why I have replaced the two occurences of '$_' with the
explicit parameter.
The 'local' builtin is another example of bash-specific code. Although
it is supported by all POSIX-compliant shells I am aware of, it is not
part of POSIX specification and thus the code should not rely on it
assigning a specific value to the local variable. Moreover, the 'posh'
shell has a limited version of 'local' builtin not supporting direct
variable assignments. Thus, I have broken one of the 'local'
declarations down into a (non-POSIX) 'local' declaration and a plain
(POSIX-compliant) variable assignment.
Signed-off-by: Michał Górny <gentoo@mgorny.alt.pl>
Signed-off-by: Michal Marek <mmarek@suse.cz>
Otherwise all machine drivers need to do so.
Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Signed-off-by: Joonyoung Shim <jy0922.shim@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Acked-by: Liam Girdwood <lrg@slimlogic.co.uk>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
The new netpoll code in bridging contains use-after-free bugs
that are non-trivial to fix.
This patch fixes this by removing the code that uses skbs after
they're freed.
As a consequence, this means that we can no longer call bridge
from the netpoll path, so this patch also removes the controller
function in order to disable netpoll.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Thanks,
Signed-off-by: David S. Miller <davem@davemloft.net>
A lot of 945GMs have had stability issues for a long time, this manifested as X hangs, blitter engine hangs, and lots of crashes.
one such report is at:
https://bugs.freedesktop.org/show_bug.cgi?id=20560
along with numerous distro bugzillas.
This only took a week of digging and hair ripping to figure out.
Tracked down and tested on a 945GM Lenovo T60,
previously running
x11perf -copypixwin500
or
x11perf -copywinpix500
repeatedly would cause the GPU to wedge within 4 or 5 tries, with random busy bits set.
After this patch no hangs were observed.
cc: stable@kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
The i915 memory arbiter has a register full of configuration
bits which are currently not defined in the driver header file.
Signed-off-by: Keith Packard <keithp@keithp.com>
cc: stable@kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
copy_to_user() returns the number of bytes remaining to be copied, but
we want to return a negative error code. This gets copied to user
space.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
If the kzalloc() fails we should return NULL. All the places that call
alloc_apertures() check for this already.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Acked-by: James Simmons <jsimmons@infradead.org>
Acked-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Make inclusion of <asm/agp.h> conditional on TTM_HAS_AGP. The use
of the functions declared in it is already conditional.
Reported-by: Geert Stappers <stappers@stappers.nl>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Tested-by: Geert Stappers <stappers@stappers.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The name of platfrom device was changed and we need to make driver's
name match in order for it to bind to the device.
Signed-off-by: Wan ZongShun <mcuos.com@gmail.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Gigabyte "Spring Peak" notebook indicates wrong chassis-type, tripping up
i8042 and breaking the touchpad. Add this model to i8042_dmi_noloop_table[]
to resolve.
BugLink: https://bugs.launchpad.net/bugs/580664
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Cc: stable@kernel.org
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
* 'shrinker' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/xfsdev:
xfs: track AGs with reclaimable inodes in per-ag radix tree
xfs: convert inode shrinker to per-filesystem contexts
mm: add context argument to shrinker callback
https://bugzilla.kernel.org/show_bug.cgi?id=16348
When the filesystem grows to a large number of allocation groups,
the summing of recalimable inodes gets expensive. In many cases,
most AGs won't have any reclaimable inodes and so we are wasting CPU
time aggregating over these AGs. This is particularly important for
the inode shrinker that gets called frequently under memory
pressure.
To avoid the overhead, track AGs with reclaimable inodes in the
per-ag radix tree so that we can find all the AGs with reclaimable
inodes via a simple gang tag lookup. This involves setting the tag
when the first reclaimable inode is tracked in the AG, and removing
the tag when the last reclaimable inode is removed from the tree.
Then the summation process becomes a loop walking the radix tree
summing AGs with the reclaim tag set.
This significantly reduces the overhead of scanning - a 6400 AG
filesystea now only uses about 25% of a cpu in kswapd while slab
reclaim progresses instead of being permanently stuck at 100% CPU
and making little progress. Clean filesystems filesystems will see
no overhead and the overhead only increases linearly with the number
of dirty AGs.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>