Till VFS can correctly support read-only remount without racing,
use WARN_ON instead of BUG_ON on detecting transaction in flight
after quiescing filesystem.
Signed-off-by: Felix Blyakher <felixb@sgi.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Before trying to obtain, read or write a buffer,
check that the buffer length is actually valid. If
it is not valid, then something read in the recovery
process has been corrupted and we should abort
recovery.
Reported-by: Eric Sesterhenn <snakebyte@gmx.de>
Tested-by: Eric Sesterhenn <snakebyte@gmx.de>
Reviewed-by: Christoph Hellwig <hch@infradead.org>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
libata: implement HORKAGE_1_5_GBPS and apply it to WD My Book
libata: add no penalty retry request for EH device handling routines
libata: improve probe failure handling
libata: add @spd_limit to sata_down_spd_limit()
libata: clear dev->ering in smarter way
libata: check onlineness before using SPD in sata_down_spd_limit()
libata: move ata_dev_disable() to libata-eh.c
libata: fix EH device failure handling
sata_nv: ck804 has borked hardreset too
ide/libata: fix ata_id_is_cfa() (take 4)
libata: fix kernel-doc warnings
ahci: add a module parameter to ignore the SSS flags for async scanning
sata_mv: Fix chip type for Hightpoint RocketRaid 1740/1742
[libata] sata_sil: Fix compilation error with libata debugging enabled
Impact: new feature
With this and a blkrawverify modified not to verify the sequence numbers
we can start using the userspace tools to verify that the data produced
with the ftrace plugin works as expected.
Example:
[root@f10-1 ~]# echo 1 > /sys/block/sda/sda1/trace/enable
[root@f10-1 ~]# echo bin > /d/tracing/trace_options
[root@f10-1 ~]# echo blk > /d/tracing/current_tracer
[root@f10-1 ~]# cat /d/tracing/trace_pipe > sda1.blktrace.0
^C
[root@f10-1 ~]# ./blkrawverify --noseq sda1
Verifying sda1
CPU 0
Wrote output to sda1.verify.out
[root@f10-1 ~]# cat sda1.verify.out
---------------
Verifying sda1
---------------------
Summary for cpu 0:
1349 valid + 0 invalid (100.0%) processed
[root@f10-1 ~]#
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: API change
The trace_seq and trace_entry are in trace_iterator, where there are
more fields that may be needed by tracers, so just pass the
tracer_iterator as is already the case for struct tracer->print_line.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: make trace_event more convenient for tracers
All tracers (for the moment) that use the struct trace_event want to
have the context info printed before their own output: the pid/cmdline,
cpu, and timestamp.
But some other tracers that want to implement their trace_event
callbacks will not necessary need these information or they may want to
format them as they want.
This patch adds a new default-enabled trace option:
TRACE_ITER_CONTEXT_INFO When disabled through:
echo nocontext-info > /debugfs/tracing/trace_options
The pid, cpu and timestamps headers will not be printed.
IE with the sched_switch tracer with context-info (default):
bash-2935 [001] 100.356561: 2935:120:S ==> [001] 0:140:R <idle>
<idle>-0 [000] 100.412804: 0:140:R + [000] 11:115:S events/0
<idle>-0 [000] 100.412816: 0:140:R ==> [000] 11:115:R events/0
events/0-11 [000] 100.412829: 11:115:S ==> [000] 0:140:R <idle>
Without context-info:
2935:120:S ==> [001] 0:140:R <idle>
0:140:R + [000] 11:115:S events/0
0:140:R ==> [000] 11:115:R events/0
11:115:S ==> [000] 0:140:R <idle>
A tracer can disable it at runtime by clearing the bit
TRACE_ITER_CONTEXT_INFO in trace_flags.
The print routines were renamed to trace_print_context and
trace_print_lat_context, so that they can be used by tracers if they
want to use them for one of the trace_event callbacks.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Change spin_locks to irqsave to prevent dead-locks.
Protect adding and deleting to/from dca_providers list.
Drop the lock during dca_sysfs_add_req() and dca_sysfs_remove_req() calls
as they might sleep (use GFP_KERNEL allocation).
Signed-off-by: Maciej Sosnowski <maciej.sosnowski@intel.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
while (limit--)
if (test())
break;
if (limit <= 0)
goto test_failed;
In the last iteration, limit is decremented after the test to 0.
If just thereafter test() succeeds and a break occurs, the goto
still occurs because limit is 0.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In order to handle all of the cases of address calculation overflow
properly, we run sparc 32-bit processes in "address masking" mode
when running on a 64-bit kernel.
Address masking mode zeros out the top 32-bits of the address
calculated for every load and store instruction.
However, when we're in privileged mode we have to run with that
address masking mode disabled even when accessing userspace from
the kernel.
To "simulate" the address masking mode we clear the top-bits by
hand for 32-bit processes in the fault handler.
It is the responsibility of code in the compat layer to properly
zero extend addresses used to access userspace. If this isn't
followed properly we can get into a fault loop.
Say that the user address is 0xf0000000 but for whatever reason
the kernel code sign extends this to 64-bit, and then the kernel
tries to access the result.
In such a case we'll fault on address 0xfffffffff0000000 but the fault
handler will process that fault as if it were to address 0xf0000000.
We'll loop faulting forever because the fault never gets satisfied.
So add a check specifically for this case, when the kernel is faulting
on a user address access and the addresses don't match up.
This code path is sufficiently slow path, and this bug is sufficiently
painful to diagnose, that this kind of bug check is warranted.
Signed-off-by: David S. Miller <davem@davemloft.net>
This just updates my email address on some drivers I'd forgotten in a
previous patch.
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Impact: Documentation only
There is an email alias as well to reach the x86 maintainers: x86@kernel.org.
Document it.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
When we're idling in NOHZ mode, timer interrupts are not running.
Evidence of processing timer interrupts is what the NMI watchdog
uses to determine if the CPU is stuck.
On Niagara, we'll yield the cpu. This will make the cpu, at
worst, hang out in the hypervisor until an interrupt arrives.
This will prevent the NMI watchdog timer from firing.
However on non-Niagara we just loop executing instructions
which will cause the NMI watchdog to keep firing. It won't
see timer interrupts happening so it will think the cpu is
stuck.
Fix this by touching the NMI watchdog in the cpu idle loop
on non-Niagara machines.
Signed-off-by: David S. Miller <davem@davemloft.net>
while (timeout--) { ... }
timeout becomes -1 if the loop isn't ended otherwise, not 0.
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add kernel-doc notation for @lock:
include/linux/sched.h:457: No description found for parameter 'lock'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Now that we have a working ftrace=<tracer> function, make the boot
tracer get activated by it. This way we can turn it on or off without
recompiling the kernel, as well as keeping the selftests on. The
selftests are disabled whenever a default tracer starts running.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Peter Zijlstra started the functionality to start up a default
tracing at bootup. This patch finishes the work.
Now if you add 'ftrace=<tracer>' to the command line, when that tracer
is registered on bootup, that tracer is selected and starts tracing.
Note, all selftests for tracers that are registered after this tracer
is disabled. This prevents the selftests from disturbing the running
tracer, or the running tracer from disturbing the selftest.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: Fixes dumpstack and KDB on 64 bits
This re-adds the old stack pointer to the top of the irqstack to help
with unwinding. It was removed in commit d99015b1ab
as part of the save_args out-of-line work.
Both dumpstack and KDB require this information.
Signed-off-by: Martin Hicks <mort@sgi.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
3Gbps is often much more prone to transmission failures. It's usually
okay to let EH handle speed down after transmission failures but some
WD My Book drives completely shutdown after certain transmission
failures and after it only power cycling can revive them. Combined
with the fact that external drives often end up with cable assembly
which is longer than usual and more likely to have intervening gender,
this makes these drives very likely to shutdown under certain
configurations virtually rendering them unusable.
This patch implements HOARKGE_1_5_GBPS and applies it to WD My Book
such that 1.5Gbps is forced once the device is identified.
Please take a look at the following bz for related reports.
http://bugzilla.kernel.org/show_bug.cgi?id=9913
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Let -EAGAIN from EH device handling routines trigger EH retry without
consuming its tries count. This will be used to implement link SPD
horkage which requires hardreset to adjust SPD without affecting other
EH decisions. As it bypasses the forward progress guarantee provided
by the tries count, the requester is responsible for ensuring forward
progress.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
When link is flaky at high speed, it isn't uncommon for a device to
repeatedly fail probing sequence early after successfully negotiating
high link speed. This often leads to consecutive hotplug events
without successful probing.
This patch improves libata EH such that it remembers probing trials
and if there have been more than two unsuccessful trials in the past
60 seconds, slows down link speed to 1.5Gbps.
As link speed negotiation is the duty of the PHY layer proper, the
goal of this fallback mechanism is to provide the last resort when
everything else fails, which unfortunately happens not too
infrequently, so no fancy 6->3->1.5 speeding down or highest
successful transmission speed seen kind of logics (yet).
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Add @spd_limit to sata_down_spd_limit() so that the caller can specify
the SPD limit it wants. This parameter doesn't get in the way even
when it's too low. The closest possible limit is applied.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
dev->ering used to be cleared together with the rest of ata_device in
ata_dev_init() which is called whenever a probing event occurs.
dev->ering is about to be used to track probing failures so it needs
to remain persistent over multiple porbing events. This patch
achieves this by doing the following.
* Instead of CLEAR_OFFSET, define CLEAR_BEGIN and CLEAR_END and only
clear between BEGIN and END. ering is moved after END. The split
of persistent area is to allow hotter items remain at the head.
* ering is explicitly cleared on ata_dev_disable() and when device
attach succeeds. So, ering is persistent throug a device's life
time (unless explicitly cleared of course) and also through periods
inbetween disablement of an attached device and successful detection
of the next one.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
sata_down_spd_limit() should check whether the link is online before
using the SPD value to determine how to limit the link speed. Factor
out onlineness test and test it from sata_down_spd_limit().
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
ata_dev_disable() is about to be more tightly integrated into EH
logic. Move it to libata-eh.c.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
The dev->pio_mode > XFER_PIO_0 test is there to avoid unnecessary
speed down warning messages but it accidentally disabled SATA link spd
down during configuration phase after reset where PIO mode is always
zero.
This patch fixes the problem by moving the test where it belongs.
This makes libata probing sequence behave better when the connection
is flaky at higher link speeds which isn't too uncommon for eSATA
devices.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
While playing with nvraid, I found out that rmmoding and insmoding
often trigger hardreset failure on the first port (the second one was
always okay). Seriously, how diverse can you get with hardreset
behaviors? Anyways, make ck804 use noclassify variant too.
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
When checking for the CFA feature set support, ata_id_is_cfa() tests bit 2 in
word 82 of the identify data instead the word 83; it also checks the ATA/PI
version support in the word 80 (which the CompactFlash specifications have as
reserved), this having no slightest chance to work on the modern CF cards that
don't have 0x848A in the word 0...
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Fix libata kernel-doc warnings:
Warning(linux-next-20090120//drivers/ata/libata-core.c:4720): Excess function parameter 'dev' description in 'ata_qc_new'
Warning(linux-next-20090120//drivers/ata/libata-scsi.c:428): No description found for parameter 'ap'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
The SSS flag, which directs the OS to spin up one disk at a time
to not have the PSU blow out, sometimes gets set even when not needed.
The effect of this is a longer-than-needed boot time.
This patch adds a module parameter that makes the driver ignore SSS
at least as far as the parallel scan during boot is concerned...
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Fix chip type for the Highpoint RocketRAID 1740 and 1742 PCI cards.
These really do have Marvell 6042 chips on them, rather than the 5081 chip.
Confirmed by multiple (two) users (for the 1740), and by examining
the product photographs from Highpoint's web site.
Signed-off-by: Mark Lord <mlord@pobox.com>
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
I tried compiling 2.6.29-rc1 and 2.6.29-rc3 with libata debugging enabled
and got the following error:
CC [M] drivers/ata/sata_sil.o
drivers/ata/sata_sil.c: In function 'sil_fill_sg':
drivers/ata/sata_sil.c:327: error: 'pi' undeclared (first use in this function)
drivers/ata/sata_sil.c:327: error: (Each undeclared identifier is reported only once
drivers/ata/sata_sil.c:327: error: for each function it appears in.)
make[2]: *** [drivers/ata/sata_sil.o] Error 1
make[1]: *** [drivers/ata] Error 2
make: *** [drivers] Error 2
include/linux/libata.h has the following enabled:
#define ATA_DEBUG
#define ATA_VERBOSE_DEBUG
#define ATA_IRQ_TRAP
This fixes the compilation.
Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
According to the Spec the first two elements in the _BCL package won't be
regarded as the available brightness level. The first is the brightness when
full power is connected to the box(It means that the AC adapter is plugged).
The second is the brightness level when the box is on battery.
If the first two elements are still used while finding the next brightness
level, it will fall back to the lowest level when keeping on pressing
hotkey. (On some boxes the brightness will be changed twice when hotkey is
pressed once. One is in the ACPI video driver. The other is changed by sys I/F.
In the ACPI video driver the first two elements will be used while changing
the brightness. But the first two elements is skipped while using sys I/F.
In such case there exists the inconsistency).
So he first two elements had better be skipped while showing the available
brightness or finding the next brightness level.
http://bugzilla.kernel.org/show_bug.cgi?id=12450
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI hotplug: Change link order of pciehp & acpiphp
PCI hotplug: fakephp: Allocate PCI resources before adding the device
PCI MSI: Fix undefined shift by 32
PCI PM: Do not wait for buses in B2 or B3 during resume
PCI PM: Power up devices before restoring their state
PCI PM: Fix hibernation breakage on EeePC 701
PCI: irq and pci_ids patch for Intel Tigerpoint DeviceIDs
PCI PM: Fix suspend error paths and testing facility breakage
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6:
slub: fix per cpu kmem_cache_cpu array memory leak
kmalloc: return NULL instead of link failure
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
fbdev/atyfb: Fix DSP config on some PowerMacs & PowerBooks
powerpc: Fix oops on some machines due to incorrect pr_debug()
powerpc/ps3: Printing fixups for l64 to ll64 convserion drivers/net
powerpc/5200: update device tree binding documentation
powerpc/5200: Bugfix for PCI mapping of memory and IMMR
powerpc/5200: update defconfigs
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
sched_rt: don't use first_cpu on cpumask created with cpumask_and
sched: fix buddie group latency
sched: clear buddies more aggressively
sched: symmetric sync vs avg_overlap
sched: fix sync wakeups
cpuset: fix possible deadlock in async_rebuild_sched_domains
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/drzeus/mmc:
pxamci: enable DMA for write ops after CMD/RESP
pxamci: replace #ifdef CONFIG_PXA27x with if (cpu_is_pxa27x())
ricoh_mmc: Use suspend_late/resume_early
mmci: Add support for ST Micro derivate
mmc: Add a MX2/MX3 specific SDHC driver
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6:
icside: fix PCB version 6 support (v2)
tx4939ide: typo fix and minor cleanup
ide: add CS5536 host driver (v3)
ide: Force VIA IDE legacy interrupts for AmigaOne boards
IDE: Unregister and disable devices if initialization fails.
ide: fix ide_register_port() failure handling
ide: struct device - replace bus_id with dev_name(), dev_set_name()
ide-cd: fix DMA for non bio-backed requests
* 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/dvrabel/uwb:
uwb: lock rc->rsvs_lock with spin_lock_bh()
wusb: timeout when waiting for ASL/PZL updates in whci-hcd
uwb: remove unused #include <version.h>'s
wusb: return -ENOTCONN when resetting a port with no connected device
uwb: safely remove all reservations
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
block: add text file detailing queue/ sysfs files
bio.h: If they MUST be inlined, then use __always_inline
Fix misleading comment in bio.h
block: fix inconsistent parenthesisation of QUEUE_FLAG_DEFAULT
block: fix oops in blk_queue_io_stat()
The host really shouldn't be notifying us of config changes
before the device status is VIRTIO_CONFIG_S_DRIVER or
VIRTIO_CONFIG_S_DRIVER_OK.
However, if we do happen to be interrupted while we're not
attached to a driver, we really shouldn't oops. Prevent
this simply by checking that device->driver is non-NULL
before trying to notify the driver of config changes.
Problem observed by doing a "set_link virtio.0 down" with
QEMU before the net driver had been loaded.
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Current refcounting for modules (done if CONFIG_MODULE_UNLOAD=y) is
using a lot of memory.
Each 'struct module' contains an [NR_CPUS] array of full cache lines.
This patch uses existing infrastructure (percpu_modalloc() &
percpu_modfree()) to allocate percpu space for the refcount storage.
Instead of wasting NR_CPUS*128 bytes (on i386), we now use
nr_cpu_ids*sizeof(local_t) bytes.
On a typical distro, where NR_CPUS=8, shiping 2000 modules, we reduce
size of module files by about 2 Mbytes. (1Kb per module)
Instead of having all refcounters in the same memory node - with TLB misses
because of vmalloc() - this new implementation permits to have better
NUMA properties, since each CPU will use storage on its preferred node,
thanks to percpu storage.
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We weren't reclaiming the clusters which get free'd from this function,
so any user punching holes in a file would still have those bytes accounted
against him/her. Add the call to vfs_dq_free_space_nodirty() to fix this.
Interestingly enough, the journal credits calculation already took this into
account.
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Acked-by: Jan Kara <jack@suse.cz>