The previous fixes for the use after free in xfs_iunpin left a nasty log
deadlock when xfslogd unpinned the inode and dropped the last reference to
the inode. the ->clear_inode() method can issue transactions, and if the
log was full, the transaction could push on the log and get stuck trying
to push the inode it was currently unpinning.
To fix this, we provide xfs_iunpin a guarantee that it will always have a
valid xfs_inode <-> linux inode link or a particular flag will be set on
the inode. We then use log forces during lookup to ensure transactions are
completed before we recycle the inode. This ensures that xfs_iunpin will
never use the linux inode after it is being freed, and any lookup on an
inode on the reclaim list will wait until it is safe to attach a new linux
inode to the xfs inode.
SGI-PV: 956832
SGI-Modid: xfs-linux-melb:xfs-kern:27359a
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Shailendra Tripathi <stripathi@agami.com>
Signed-off-by: Takenori Nagano <t-nagano@ah.jp.nec.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
CONFIG_XFS_TRACE is on
SGI-PV: 956618
SGI-Modid: xfs-linux-melb:xfs-kern:27196a
Signed-off-by: Vlad Apostolov <vapo@sgi.com>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Commit 6264d69d7d modified the nfsd_create()
error handling in such a way that nfsd_create will usually return
nfserr_perm even when succesful, if the export has the async export option.
This introduced a regression that could cause mkdir() to always return a
permissions error, even though the directory in question was actually
succesfully created.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Acked-by: NeilBrown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Eric's changes to the htirq infrastructure require corresponding
modifications to the ipath HT driver code so that interrupts are still
delivered properly.
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Roland Dreier <rdreier@cisco.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch adds a variant of ht_create_irq __ht_create_irq that takes an
aditional parameter update that is a function that is called whenever we want
to write to a drivers htirq configuration registers.
This is needed to support the ipath_iba6110 because it's registers in the
proper location are not actually conected to the hardware that controlls
interrupt delivery.
[bos@serpentine.com: fixes]
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Andi Kleen <ak@suse.de>
Cc: <olson@pathscale.com>
Cc: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Bryan O'Sullivan <bryan.osullivan@qlogic.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This refactoring actually optimizes the code a little by caching the value
that we think the device is programmed with instead of reading it back from
the hardware. Which simplifies the code a little and should speed things up a
bit.
This patch introduces the concept of a ht_irq_msg and modifies the
architecture read/write routines to update this code.
There is a minor consistency fix here as well as x86_64 forgot to initialize
the htirq as masked.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Andi Kleen <ak@suse.de>
Acked-by: Bryan O'Sullivan <bos@pathscale.com>
Cc: <olson@pathscale.com>
Cc: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Some more errors from the IPMI send message command are retryable, but are not
being retried by the IPMI code. Make sure they get retried.
Signed-off-by: Corey Minyard <minyard@acm.org>
Cc: Frederic Lelievre <Frederic.Lelievre@ca.kontron.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
A wrong function was being used to free a list; this fixes the problem.
Otherwise, an oops at unload time was possible. But not likely, since you
can't have any users when you unload the modules and it is very hard to get
messages into this queue without users.
Signed-off-by: Corey Minyard <minyard@acm.org>
Cc: Patrick Schoeller <Patrick.Schoeller@hp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The basic issue is that despite have been deprecated and warned about as a
very bad thing in the man pages since its inception there are a few real
users of sys_sysctl. It was my assumption that because sysctl had been
deprecated for all of 2.6 there would be no user space users by this point,
so I initially gave sys_sysctl a very short deprecation period.
Now that I know there are a few real users the only sane way to proceed
with deprecation is to push the time limit out to a year or two work and
work with distributions that have big testing pools like fedora core to
find these last remaining users.
Which means that the sys_sysctl interface needs to be maintained in the
meantime.
Since I have provided a technical measure that allows us to add new sysctl
entries without reserving more binary numbers I believe that is enough to
fix the sys_sysctl binary interface maintenance problems, because there is
no longer a need to change the binary interface at all.
Since the sys_sysctl implementation needs to stay around for a while and
the worst of the maintenance issues that caused us to occasionally break
the ABI have been addressed I don't see any advantage in continuing with
the removal of sys_sysctl.
So instead of merely increasing the deprecation period this patch removes
the deprecation of sys_sysctl and modifies the kernel to compile the code
in by default.
With committing to maintain sys_sysctl we get all of the advantages of a
fast interface for anything that needs it. Currently sys_sysctl is about
5x faster than /proc/sys, for the same string data.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When ACPI && NUMA, pxm_to_node is used and it exists in drivers/acpi/numa.c
Tony said:
The patch makes sense ... if you pick both of "ACPI" and "NUMA", then you
need (and should automatically be given) ACPI_NUMA too.
The only open question is whether there is a better way of getting there.
Perhaps with less configuration options in the first place? We are heading
towards a future where so many systems will be NUMA that there would seem to
be little benefit in keeping ACPI_NUMA separate from ACPI ... but perhaps
we aren't quite there yet.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujtisu.com>
Cc: Len Brown <lenb@kernel.org>
Acked-by: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
There are two bugs in the kretprobe-booster.
1) It doesn't make room for gs registers.
2) It doesn't change status of the current kprobe. This status will
effect the fault handling.
This patch fixes these bugs and, additionally, saves skipped registers for
compatibility with the original kretprobe.
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
If there's a swap file on a software RAID, it should be possible to use this
file for saving the swsusp's suspend image. Also, this file should be
available to the memory management subsystem when memory is being freed before
the suspend image is created.
For the above reasons it seems that md_threads should not be frozen during the
suspend and the appended patch makes this happen, but then there is the
question if they don't cause any data to be written to disks after the suspend
image has been created, provided that all filesystems are frozen at that time.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
I forgot to has the size-in-blocks to (loff_t) before shifting up to a
size-in-bytes.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
It turns out that CHANGE is preferred to ONLINE/OFFLINE for various reasons
(not least of which being that udev understands it already).
So remove the recently added KOBJ_OFFLINE (no-one is likely to care anyway)
and change the ONLINE to a CHANGE event
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Neil Brown <neilb@suse.de>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The Coverity checker noted that in
drivers/telephony/ixj.c:ixj_build_filter_cadence(), filter_en[4] or
filter_en[5] could be written to.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
All device-mapper targets must complete outstanding I/O before suspending.
The mirror target generates I/O in its recovery phase and fails to wait for
it. It needs to be tracked so we can ensure that it has completed before we
suspend.
[akpm@osdl.org: cleanup]
Signed-off-by: Jonathan E Brassow <jbrassow@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Cc: <dm-devel@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
When adding paths to the round-robin path selector, their order gets inverted,
which is not desirable.
Fix by replacing list_add() with list_add_tail().
Signed-off-by: Jonathan E Brassow <jbrassow@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Cc: <dm-devel@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
If the device is already suspended, just return the error and skip the code
that would incorrectly wipe md->suspended_bdev.
(This isn't currently a problem because existing code avoids calling this
function if the device is already suspended.)
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Cc: <dm-devel@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
There is a race between dev_create() and find_device().
If the mdptr has not yet been stored against a device, find_device() needs to
behave as though no device was found. It already returns NULL, but there is a
dm_put() missing: it must drop the reference dm_get_md() took.
The bug was introduced by dm-fix-mapped-device-ref-counting.patch.
It manifests itself if another dm ioctl attempts to reference a newly-created
device while the device creation ioctl is still running. The consequence is
that the device cannot be removed until the machine is rebooted. Certain udev
configurations can lead to this happening.
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Cc: <dm-devel@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
o Currently there is no specific alignment restriction in linker script
and in some cases it can be placed non 4K aligned addresses. This fails
kexec which checks that segment to be loaded is page aligned.
o I guess, it does not harm data segment to be 4K aligned.
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
In the case where an open creates the file, we shouldn't be rechecking
permissions to open the file; the open succeeds regardless of what the new
file's mode bits say.
This patch fixes the problem, but only by introducing yet another parameter
to nfsd_create_v3. This is ugly. This will be fixed by later patches.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Acked-by: Neil Brown <neilb@suse.de>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Minor rearrangement, cleanup of do_open_lookup(). No change in behavior.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Acked-by: Neil Brown <neilb@suse.de>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
set_mb() is used by set_current_state() which needs mb(), not wmb(). I
think it would be right to assume that set_mb() implies mb(), all arches
seem to do just this.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
If the microcode driver is built in (rather than module) there are some,
ehm, interesting effects happening due to the new "call out to userspace"
behavior that is introduced.. and which runs too early. The result is a
boot hang; which is really nasty.
The patch below is a minimally safe patch to fix this regression for 2.6.19
by just not requesting actual microcode updates during early boot. (That
is a good idea in general anyway)
The "real" fix is a lot more complex given the entire cpu hotplug scenario
(during cpu hotplug you normally need to load the microcode as well); but
the interactions for that are just really messy at this point; this fix at
least makes it work and avoids a full detangle of hotplug.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This is the x86-64 version of f9dadfa71b
that did the same thing on i386.
Since the "mask" bit is in the low word, when we write a new entry, we
need to write the high word first, before we potentially unmask it.
The exception is when we actually want to mask the interrupt, in which
case we want to write the low word first to make sure that the high word
doesn't change while the interrupt routing is still active.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This is just commit 130fe05dbc ported to
x86-64, for all the same reasons. It cleans up the IO-APIC accesses in
order to then fix the ordering issues.
We move the accessor functions (that were only used by io_apic.c) out of
a header file, and use proper memory-mapped accesses rather than making
up our own "volatile" pointers.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This reverts commit de09bddb9d. It tried
to reserve the MMCONFIG mmio memory ranges, but since the MMCONFIG
information is broken and often bogus (which is why we don't dare use it
most of the time _anyway_), it does more harm than good.
Cc: Jeff Chua <jeff.chua.linux@gmail.com>
Cc: Adrian Bunk <bunk@stusta.de>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Here are some fixes to endianess problems spotted by Al Viro.
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use proper upper limits for the loops and check for all error
conditions.
The problem was noticed by Adrian Bunk.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since pskb_copy tacks on the non-linear bits from the original
skb, it needs to count them in the truesize field of the new skb.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Otherwise we can hit paths that (legally) do multiple deletes on the
same node and OOPS with the HLIST poison values there instead of
NULL.
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes consideration of high memory when determining TCP
hash table sizes. Taking into account high memory results in tcp_mem
values that are too large.
Signed-off-by: John Heffner <jheffner@psc.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
As Patrick McHardy <kaber@trash.net> suggested, Traffic Shaper is now
obsolete and alternative to it is no longer CBQ, since its problems with
virtual devices, alter Kconfig text to reflect this -- put a link to the
traffic schedulers as a whole.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch contains a fix for a bug introduced more than a year ago
(not setting *eof) and updates whitespace a bit.
Signed-off-by: Jan-Benedict Glaw <jbglaw@lug-owl.de>
show_mem() was not correctly handling holes in the memory
map. It was treating the freed sections of the map as
though they contained valid struct page entries. This
could cause incorrect debugging output or even a kernel
panic.
This patch keeps the struct meminfo around after system
initialization so that show_mem() can use it when
scanning memory. show_mem() now walks over each bank
of each online node, rather than assuming that each node
contains a single contiguous bank.
Signed-off-by: Ray Lehtiniemi <rayl@mail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>