Commit Graph

26325 Commits

Author SHA1 Message Date
Tejun Heo
6d97dbd72d [PATCH] libata-eh: implement BMDMA EH
Implement stock BMDMA error handling methods.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:58:24 +09:00
Tejun Heo
022bdb075b [PATCH] libata-eh: implement new EH
Implement new EH.  The exported interface is ata_do_eh() which is to
be called from ->error_handler and performs the following steps to
recover the failed port.

ata_eh_autopsy() : analyze SError/TF, determine the cause of failure
		   and required recovery actions and record it in
		   ap->eh_context
ata_eh_report()	 : report the failure to user
ata_eh_recover() : perform recovery actions described in ap->eh_context
ata_eh_finish()	 : finish failed qcs

LLDDs can customize error handling by modifying eh_context before
calling ata_do_eh() or, if necessary, doing so inbetween each major
steps by calling each step explicitly.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:58:22 +09:00
Tejun Heo
f3e81b19aa [PATCH] libata-eh: implement ata_eh_info and ata_eh_context
struct ata_eh_info serves as the communication channel between
execution path and EH.  Execution path describes detected error
condition in ap->eh_info and EH recovers the port using it.  To avoid
missing error conditions detected during EH, EH makes its own copy of
eh_info and clears it on entry allowing error info to accumulate
during EH.

Most EH states including EH's copy of eh_info are stored in
ap->eh_context (struct ata_eh_context) which is owned by EH and thus
doesn't require any synchronization to access and alter.  This
standardized context makes it easy to integrate various parts of EH
and extend EH to handle multiple links (for PM).

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:58:21 +09:00
Tejun Heo
0c247c559c [PATCH] libata-eh: implement dev->ering
This patch implements ata_ering and uses it to define dev->ering.

ata_ering is a ring buffer which records libata errors - whether a
command was for normar IO request, err_mask and timestamp.  Errors are
recorded per-device in dev->ering.  This will be used by EH to
determine recovery actions.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:58:19 +09:00
Tejun Heo
9be1e979f2 [PATCH] libata-eh: add ATA and libata flags for new EH
Add ATA and libata flags to be used by new EH.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:58:17 +09:00
Tejun Heo
246619da30 [PATCH] libata-eh-fw: update SCSI command completion path for new EH
SCSI command completion path used to do some part of EH including
printing messages and obtaining sense data.  With new EH, all these
are responsibilities of the EH, update SCSI command completion path to
reflect this.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:58:16 +09:00
Tejun Heo
d95a717f57 [PATCH] libata-eh-fw: update ata_exec_internal() for new EH
Update ata_exec_internal() such that it uses new EH framework.
->post_internal_cmd() is always invoked regardless of completion
status.  Also, when ata_exec_internal() detects a timeout condition
and new EH is in place, it freezes the port as timeout for normal
commands would do.

Note that ata_port_flush_task() is called regardless of
wait_for_completion status.  This is necessary as exceptions unrelated
to the qc can abort the qc, in which case PIO task could still be
running after the wait for completion returns.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:58:14 +09:00
Tejun Heo
ad9e276244 [PATCH] libata-eh-fw: update ata_scsi_error() for new EH
Update ata_scsi_error() for new EH.  ata_scsi_error() is responsible
for claiming timed out qcs and invoking ->error_handler in safe and
synchronized manner.  As the state of the controller is unknown if a
qc has timed out, the port is frozen in such cases.

Note that ata_scsi_timed_out() isn't used for new EH.  This is because
a timed out qc cannot be claimed by EH without freezing the port and
freezing the port in ata_scsi_timed_out() results in unnecessary
abortion of other active qcs.  ata_scsi_timed_out() can be removed
once all drivers are converted to new EH.

While at it, add 'TODO: kill' comments to old EH functions.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:58:12 +09:00
Tejun Heo
dafadcde8d [PATCH] libata-eh-fw: implement new EH scheduling from PIO
PIO executes without holding host_set lock, so it cannot be
synchronized using the same mechanism as interrupt driven execution.
port_task framework makes sure that EH is not entered until PIO task
is flushed, so PIO task can be sure the qc in progress won't go away
underneath it.  One thing it cannot be sure of is whether the qc has
already been scheduled for EH by another exception condition while
host_set lock was released.

This patch makes ata_poll_qc-complete() handle such conditions
properly and make it freeze the port if HSM violation is detected
during PIO execution.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:58:11 +09:00
Tejun Heo
e318049949 [PATCH] libata-eh-fw: implement freeze/thaw
Freezing is performed atomic w.r.t. host_set->lock and once frozen
LLDD is not allowed to access the port or any qc on it.  Also, libata
makes sure that no new qc gets issued to a frozen port.

A frozen port is thawed after a reset operation completes
successfully, so reset methods must do its job while the port is
frozen.  During initialization all ports get frozen before requesting
IRQ, so reset methods are always invoked on a frozen port.

Optional ->freeze and ->thaw operations notify LLDD that the port is
being frozen and thawed, respectively.  LLDD can disable/enable
hardware interrupt in these callbacks if the controller's IRQ mask can
be changed dynamically.  If the controller doesn't allow such
operation, LLDD can check for frozen state in the interrupt handler
and ack/clear interrupts unconditionally while frozen.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:58:09 +09:00
Tejun Heo
7b70fc0398 [PATCH] libata-eh-fw: implement ata_port_schedule_eh() and ata_port_abort()
ata_port_schedule_eh() directly schedules EH for @ap without
associated qc.  Once EH scheduled, no further qc is allowed and EH
kicks in as soon as all currently active qc's are drained.

ata_port_abort() schedules all currently active commands for EH by
qc_completing them with ATA_QCFLAG_FAILED set.  If ata_port_abort()
doesn't find any qc to abort, it directly schedule EH using
ata_port_schedule_eh().

These two functions provide ways to invoke EH for conditions which
aren't directly related to any specfic qc.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:58:07 +09:00
Tejun Heo
f686bcb807 [PATCH] libata-eh-fw: implement new EH scheduling via error completion
There are several ways a qc can get schedule for EH in new EH.  This
patch implements one of them - completing a qc with ATA_QCFLAG_FAILED
set or with non-zero qc->err_mask.  ALL such qc's are examined by EH.

New EH schedules a qc for EH from completion iff ->error_handler is
implemented, qc is marked as failed or qc->err_mask is non-zero and
the command is not an internal command (internal cmd is handled via
->post_internal_cmd).  The EH scheduling itself is performed by asking
SCSI midlayer to schedule EH for the specified scmd.

For drivers implementing old-EH, nothing changes.  As this change
makes ata_qc_complete() rather large, it's not inlined anymore and
__ata_qc_complete() is exported to other parts of libata for later
use.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:58:05 +09:00
Tejun Heo
f69499f42c [PATCH] libata-eh-fw: update ata_qc_from_tag() to enforce normal/EH qc ownership
New EH framework has clear distinction about who owns a qc.  Every qc
starts owned by normal execution path - PIO, interrupt or whatever.
When an exception condition occurs which affects the qc, the qc gets
scheduled for EH.  Note that some events (say, link lost and regained,
command timeout) may schedule qc's which are not directly related but
could have been affected for EH too.  Scheduling for EH is atomic
w.r.t. ap->host_set->lock and once schedule for EH, normal execution
path is not allowed to access the qc in whatever way.  (PIO
synchronization acts a bit different and will be dealt with later)

This patch make ata_qc_from_tag() check whether a qc is active and
owned by normal path before returning it.  If conditions don't match,
NULL is returned and thus access to the qc is denied.
__ata_qc_from_tag() is the original ata_qc_from_tag() and is used by
libata core/EH layers to access inactive/failed qc's.

This change is applied only if the associated LLDD implements new EH
as indicated by non-NULL ->error_handler

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:58:03 +09:00
Tejun Heo
2ab7db1ff1 [PATCH] libata-eh-fw: use special reserved tag and qc for internal commands
New EH may issue internal commands to recover from error while failed
qc's are still hanging around.  To allow such usage, reserve tag
ATA_MAX_QUEUE-1 for internal command.  This also makes it easy to tell
whether a qc is for internal command or not.  ata_tag_internal() test
implements this test.

To avoid breaking existing drivers, ata_exec_internal() uses
ATA_TAG_INTERNAL only for drivers which implement ->error_handler.
For drivers using old EH, tag 0 is used.  Note that this makes
ata_tag_internal() test valid only when ->error_handler is
implemented.  This is okay as drivers on old EH should not and does
not have any reason to use ata_tag_internal().

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:58:02 +09:00
Tejun Heo
dc2b351586 [PATCH] libata-eh-fw: clear SError in ata_std_postreset()
Clear SError in ata_std_postreset().  This is to clear SError bits
which get set during reset.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:58:00 +09:00
Tejun Heo
9ec957f200 [PATCH] libata-eh-fw: add flags and operations for new EH
Add ATA_FLAG_EH_{PENDING|FROZEN}, ATA_ATA_QCFLAG_{FAILED|SENSE_VALID}
and ops->freeze, thaw, error_handler, post_internal_cmd() for new EH.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:58 +09:00
Tejun Heo
f15a1dafed [PATCH] libata: use ATA printk helpers
Use ATA printk helpers.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:56 +09:00
Tejun Heo
61440db61f [PATCH] libata: implement ATA printk helpers
Implement ata_{port|dev}_printk() which prefixes the message with
proper identification string.  This change is necessary for later PM
support because devices and links should be identified differently
depending on how they are attached.

This also helps unifying device id strings.  Currently, there are two
forms in use (P is the port number D device number) - 'ataP(D):', and
'ataP: dev D '.  These macros also make it harder to forget proper ID
string (e.g. printing only port number when a device is in question).

Debug message handling can be integrated into these printk macros by
passing debug type and level via @lv.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:55 +09:00
Tejun Heo
3373efd89d [PATCH] libata: use dev->ap
Use dev->ap where possible and eliminate superflous @ap from functions
and structures.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:53 +09:00
Tejun Heo
38d87234d6 [PATCH] libata: add dev->ap
Add dev->ap which points back to the port the device belongs to.  This
makes it unnecessary to pass @ap for silly reasons (e.g. printks).
Also, this change is necessary to accomodate later PM support which
will introduce ATA link inbetween port and device.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:51 +09:00
Tejun Heo
a0ab51cefc [PATCH] libata: kill old SCR functions and sata_dev_present()
Kill now unused scr_{read|write|write_flush}() and sata_dev_present().

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:49 +09:00
Tejun Heo
81952c5497 [PATCH] libata: use new SCR and on/offline functions
Use new SCR and on/offline functions.  Note that for LLDD which know
it implements SCR callbacks, SCR functions are guaranteed to succeed
and ata_port_online() == !ata_port_offline().

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:47 +09:00
Tejun Heo
34bf21704c [PATCH] libata: implement new SCR handling and port on/offline functions
Implement ata_scr_{valid|read|write|write_flush}() and
ata_port_{online|offline}().  These functions replace
scr_{read|write}() and sata_dev_present().

Major difference between between the new SCR functions and the old
ones is that the new ones have a way to signal error to the caller.
This makes handling SCR-available and SCR-unavailable cases in the
same path easier.  Also, it eases later PM implementation where SCR
access can fail due to various reasons.

ata_port_{online|offline}() functions return 1 only when they are
affirmitive of the condition.  e.g.  if SCR is unaccessible or
presence cannot be determined for other reasons, these functions
return 0.  So, ata_port_online() != !ata_port_offline().  This
distinction is useful in many exception handling cases.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:46 +09:00
Tejun Heo
838df6284c [PATCH] libata: init ap->cbl to ATA_CBL_SATA early
Init ap->cbl to ATA_CBL_SATA in ata_host_init().  This is necessary
for soon-to-follow SCR handling function changes.  LLDDs are free to
change ap->cbl during probing.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:44 +09:00
Tejun Heo
ce5f7f3d0c [PATCH] sata_sil24: update TF image only when necessary
Update TF image (pp->tf) only when necessary.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:42 +09:00
Tejun Heo
e61e067227 [PATCH] libata: implement qc->result_tf
Add qc->result_tf and ATA_QCFLAG_RESULT_TF.  This moves the
responsibility of loading result TF from post-compltion path to qc
execution path.  qc->result_tf is loaded if explicitly requested or
the qc failsa.  This allows more efficient completion implementation
and correct handling of result TF for controllers which don't have
global TF representation such as sil3124/32.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:40 +09:00
Tejun Heo
96bd39ec29 [PATCH] libata: remove postreset handling from ata_do_reset()
Make ata_do_reset() deal only with reset.  postreset is now the
responsibility of the caller.  This is simpler and eases later
prereset addition.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:38 +09:00
Tejun Heo
3adcebb2b5 [PATCH] libata: move ->set_mode() handling into ata_set_mode()
Move ->set_mode() handlng into ata_set_mode().

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:37 +09:00
Tejun Heo
fe635c7e91 [PATCH] libata: use preallocated buffers
It's not a very good idea to allocate memory during EH.  Use
statically allocated buffer for dev->id[] and add 512byte buffer
ap->sector_buf.  This buffer is owned by EH (or probing) and to be
used as temporary buffer for various purposes (IDENTIFY, NCQ log page
10h, PM GSCR block).

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:35 +09:00
Tejun Heo
158693031d [PATCH] libata: hold host_set lock while finishing internal qc
Hold host_set lock while finishing internal qc.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:33 +09:00
Tejun Heo
7401abf2f4 [PATCH] libata: clear ap->active_tag atomically w.r.t. command completion
ap->active_tag was cleared in ata_qc_free().  This left ap->active_tag
dangling after ata_qc_complete().  Spurious interrupts inbetween could
incorrectly access the qc.  Clear active_tag in ata_qc_complete().
This change is necessary for later EH changes.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:32 +09:00
Tejun Heo
f8c2c4202d [PATCH] libata: fix ->phy_reset class code handling in ata_bus_probe()
ata_bus_probe() doesn't clear dev->class after ->phy_reset().  This
can result in falsely enabled devices if probing fails.  Clear
dev->class to ATA_DEV_UNKNOWN after fetching it.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:30 +09:00
Tejun Heo
6cd727b14f [PATCH] libata: kill duplicate prototypes
Kill duplicate prototypes for ata_eh_qc_complete/retry() in libata.h.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:28 +09:00
Tejun Heo
e23befe901 [PATCH] libata: unexport ata_scsi_error()
While moving ata_scsi_error() from LLDD sht to libata transportt,
EXPORT_SYMBOL_GPL() entry was left out.  Kill it.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:27 +09:00
Tejun Heo
e4fac92ae7 [PATCH] ahci: hardreset classification fix
AHCI calls ata_dev_classify() even when no device is attached which
results in false class code.  Fix it.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:25 +09:00
Tejun Heo
3c567b7d11 [PATCH] libata: rename ata_down_sata_spd_limit() and friends
Rename ata_down_sata_spd_limit() and friends to sata_down_spd_limit()
and likewise for simplicity & consistency.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:23 +09:00
Tejun Heo
c44078c03f [PATCH] libata: silly fix in ata_scsi_start_stop_xlat()
Don't directly access &qc->tf when tf == &qc->tf.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:21 +09:00
Tejun Heo
ee7863bc68 [PATCH] SCSI: implement shost->host_eh_scheduled
libata needs to invoke EH without scmd.  This patch adds
shost->host_eh_scheduled to implement such behavior.

Currently the only user of this feature is libata and no general
interface is defined.  This patch simply adds handling for
host_eh_scheduled where needed and exports scsi_eh_wakeup() to
modules.  The rest is upto libata.  This is the result of the
following discussion.

http://thread.gmane.org/gmane.linux.scsi/23853/focus=9760

In short, SCSI host is not supposed to know about exceptions unrelated
to specific device or command.  Such exceptions should be handled by
transport layer proper.  However, the distinction is not essential to
ATA and libata is planning to depart from SCSI, so, for the time
being, libata will be using SCSI EH to handle such exceptions.

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:20 +09:00
Luben Tuikov
89f48c4d67 [PATCH] SCSI: Introduce scsi_req_abort_cmd (REPOST)
Introduce scsi_req_abort_cmd(struct scsi_cmnd *).
This function requests that SCSI Core start recovery for the
command by deleting the timer and adding the command to the eh
queue.  It can be called by either LLDDs or SCSI Core.  LLDDs who
implement their own error recovery MAY ignore the timeout event if
they generated scsi_req_abort_cmd.

First post:
http://marc.theaimsgroup.com/?l=linux-scsi&m=113833937421677&w=2

Signed-off-by: Luben Tuikov <ltuikov@yahoo.com>

Signed-off-by: Tejun Heo <htejun@gmail.com>
2006-05-15 20:57:18 +09:00
Jeff Garzik
acc696d93d Merge branch 'master' into upstream 2006-04-27 04:53:34 -04:00
Linus Torvalds
2be4d50295 Linux v2.6.17-rc3 2006-04-26 19:19:25 -07:00
Linus Torvalds
a82642fa19 Merge master.kernel.org:/home/rmk/linux-2.6-mmc
* master.kernel.org:/home/rmk/linux-2.6-mmc:
  [MMC] pxamci: fix data timeout calculation
2006-04-26 15:45:27 -07:00
Linus Torvalds
58f8236bed Merge master.kernel.org:/home/rmk/linux-2.6-arm
* master.kernel.org:/home/rmk/linux-2.6-arm:
  [ARM] nommu: trivial fixups for head-nommu.S and the Makefile
  [ARM] vfp: fix leak of VFP_NAN_FLAG into FPSCR
  [ARM] 3484/1: Correct AEABI CFLAGS for correct enum handling
2006-04-26 15:45:02 -07:00
Russell King
f13d241bc3 Merge nommu tree 2006-04-26 21:18:45 +01:00
Chandra Seetharaman
83d722f7e1 [PATCH] Remove __devinit and __cpuinit from notifier_call definitions
Few of the notifier_chain_register() callers use __init in the definition
of notifier_call.  It is incorrect as the function definition should be
available after the initializations (they do not unregister them during
initializations).

This patch fixes all such usages to _not_ have the notifier_call __init
section.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-26 08:30:03 -07:00
Chandra Seetharaman
649bbaa484 [PATCH] Remove __devinitdata from notifier block definitions
Few of the notifier_chain_register() callers use __devinitdata in the
definition of notifier_block data structure.  It is incorrect as the
data structure should be available after the initializations (they do
not unregister them during initializations).

This was leading to an oops when notifier_chain_register() call is
invoked for those callback chains after initialization.

This patch fixes all such usages to _not_ have the notifier_block data
structure in the init data section.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-26 08:27:50 -07:00
James Morris
e7edf9cded [PATCH] LSM: add missing hook to do_compat_readv_writev()
This patch addresses a flaw in LSM, where there is no mediation of readv()
and writev() in for 32-bit compatible apps using a 64-bit kernel.

This bug was discovered and fixed initially in the native readv/writev
code [1], but was not fixed in the compat code.  Thanks to Al for spotting
this one.

  [1] http://lwn.net/Articles/154282/

Signed-off-by: James Morris <jmorris@namei.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-26 07:52:21 -07:00
Al Viro
a090d9132c [PATCH] protect ext3 ioctl modifying append_only, immutable, etc. with i_mutex
All modifications of ->i_flags in inodes that might be visible to
somebody else must be under ->i_mutex.  That patch fixes ext3 ioctl()
setting S_APPEND and friends.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-26 07:52:21 -07:00
Al Viro
6ad0013b31 [PATCH] fix mips sys32_p{read,write}
Switched to use of sys_pread64()/sys_pwrite64() rather than keep duplicating
their guts; among the little things that had been missing there were such as
	ret = security_file_permission (file, MAY_READ);
Gotta love the LSM robustness, right?

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-26 07:52:21 -07:00
Al Viro
de0bb97aff [PATCH] forgotten ->b_data in memcpy() call in ext3/resize.c (oopsable)
sbi->s_group_desc is an array of pointers to buffer_head.  memcpy() of
buffer size from address of buffer_head is a bad idea - it will generate
junk in any case, may oops if buffer_head is close to the end of slab
page and next page is not mapped and isn't what was intended there.
IOW, ->b_data is missing in that call.  Fortunately, result doesn't go
into the primary on-disk data structures, so only backup ones get crap
written to them; that had allowed this bug to remain unnoticed until
now.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-26 07:52:21 -07:00