Commit Graph

1688 Commits

Author SHA1 Message Date
Stefan Weinhuber
22825ab769 [S390] dasd: support DIAG access for read-only devices
When a DASD device is used with the DIAG discipline, the DIAG
initialization will indicate success or error with a respective
return code. So far we have interpreted a return code of 4 as error,
but it actually means that the initialization was successful, but
the device is read-only. To allow read-only devices to be used with
DIAG we need to accept a return code of 4 as success.

Re-initialization of the DIAG access is also part of the DIAG error
recovery. If we find that the access mode of a device has been
changed from writable to read-only while the device was in use,
we print an error message.

Signed-off-by: Stefan Weinhuber <wein@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-12-07 12:51:34 +01:00
Sebastian Ott
d40f7b75a2 [S390] cio: dont unregister a busy device in ccw_device_set_offline
If we detect a busy subchannel after the driver's set_offline
callback returned in ccw_device_set_offline, the current behavior
is to unregister the device, which may lead to undesired
consequences. Change this to just quiesce the subchannel and go on
with the offline processing.

Note: This is no excuse for not fixing these drivers -
after the set_offline callback they should have no running IO!

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-12-07 12:51:33 +01:00
Peter Oberparleiter
de1b04388f [S390] cio: improve error recovery for internal I/Os
Improve error recovery for internal I/Os by repeating each I/O
256 times per path to cope with long-running non-permanent error
conditions. Also retry each path twice to cope with link flapping,
i.e. single paths becoming unavailable in the order in which they
are tried.

Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-12-07 12:51:33 +01:00
Sebastian Ott
7a8ad1001c [S390] cio: change locking in io_subchannel_remove
IO subchannels are always unregistered in process context, so use
spin_lock_irq in the corresponding remove callback.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-12-07 12:51:33 +01:00
Sebastian Ott
6e9a0f67de [S390] cio: quiesce subchannel in io_subchannel_remove
Ensure that there will be no more interrupts for an
unregistered device by using the same quiesce and disable loop
as in io_subchannel_shutdown.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-12-07 12:51:32 +01:00
Sebastian Ott
0c609fca24 [S390] cio: handle busy subchannel in ccw_device_move_to_sch
Try to disable the old subchannel before we ask the driver core
to move the attached device to a new parent. This way we can use
the QUIESCE state during shutdown which prevents a possible use
after free situation in some error cases.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-12-07 12:51:32 +01:00
Sebastian Ott
ec64333c3a [S390] cio: handle failed disable_subchannel after device recognition
Handle a failing cio_disable_subchannel at the end of our device
recognition as if the recognition itself failed. This way
subsequent registration steps do not need to handle enabled
subchannels.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-12-07 12:51:32 +01:00
Sebastian Ott
56e6b796fe [S390] cio: fix quiesce state
DEV_STATE_QUIESCE is used to stop all IO on a busy subchannel.
This patch fixes the following problems related to the QUIESCE
state:

* Fix a potential race condition which could occur when the
resulting state was DEV_STATE_OFFLINE.

* Add missing locking around cio_disable_subchannel,
ccw_device_cancel_halt_clear and the cdev's handler.

* Loop until we know for sure that the subchannel is disabled.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-12-07 12:51:32 +01:00
Sebastian Ott
24a1872d64 [S390] cio: add per device initialization status flag
The function ccw_device_unregister has to ensure to remove
all references obtained by device_add and device_initialize.
Unfortunately it gets called for devices which are
1) uninitialized, 2) initialized but unregistered, and
3) registered devices. To distinguish 1) and 2) this patch
introduces a new flag "initialized", which is 1 as long as we
hold the initial device reference.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-12-07 12:51:32 +01:00
Sebastian Ott
7d253b9a1a [S390] cio: remove registered flag from ccw_device_private
We used to maintain a "registered" flag in our ccw_device_private
structure. This patch removes the "registered" flag and converts
all users of it to device_is_registered which has the exact same
meaning.

Note: The usage the atomic operation test_and_clear_bit is replaced
by the non-atomic if (device_is_registered()) device_del(). This
will not do harm, since we serialize calls to ccw_device_unregister
with a single-threaded workqueue.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-12-07 12:51:32 +01:00
Peter Oberparleiter
d7d12ef2be [S390] cio: make steal lock procedure more robust
An Unconditional Reserve + Release operation (steal lock) for a
boxed device may fail when encountering special error cases
(e.g. unit checks or path errors). Fix this by using the more
robust ccw_request infrastructure for performing the steal lock
CCW program.

Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-12-07 12:51:32 +01:00
Peter Oberparleiter
52ef0608e3 [S390] cio: use sense-pgid operation for path verification
Set-pgid operations fail for some device types under z/VM for which
the hypervisor has already set the pgid. Also reserved devices or
changed pgids are not correctly recognized. Fix these problems by
using a combination of sense-pgid and set-pgid and by also accepting
pre-defined pgid settings.

Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-12-07 12:51:31 +01:00
Peter Oberparleiter
454e1fa1eb [S390] cio: split PGID settings and status
Split setting (driver wants feature enabled) and status (feature
setup was successful) for PGID related ccw device features so that
setup errors can be detected. Previously, incorrectly handled setup
errors could in rare cases lead to erratic I/O behavior and
permanently unusuable devices.

Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-12-07 12:51:31 +01:00
Peter Oberparleiter
4257aaecff [S390] cio: remove intretry flag
After changing all internal I/O functions to use the newly introduced
ccw request infrastructure, retries are handled automatically after a
clear operation. Therefore remove the internal retry flag and
associated code.

Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-12-07 12:51:31 +01:00
Peter Oberparleiter
350e91207b [S390] cio: allow setting not-operational devices offline
Accept a request for setting a not-operational device offline.
This way, users can remove devices from Linux which would otherwise
remain unusable until reboot.

Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-12-07 12:51:31 +01:00
Peter Oberparleiter
9679baaf85 [S390] cio: use ccw request infrastructure for pgid
Use the newly introduced ccw request infrastructure to implement
pgid related operations: sense pgid, set pgid and disband pg.

Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-12-07 12:51:31 +01:00
Peter Oberparleiter
39f5360b3d [S390] cio: use ccw request infrastructure for sense id
Use the newly introduced ccw request infrastructure to implement
the sense id operation.

Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-12-07 12:51:31 +01:00
Peter Oberparleiter
e1f0fbd655 [S390] cio: consistent infrastructure for internal I/O requests
Reduce code duplication by introducing a central infrastructure to
perform an internal I/O operation on a CCW device.

Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-12-07 12:51:31 +01:00
Peter Oberparleiter
16b9a0571d [S390] cio: dont panic in non-fatal conditions
Remove the call to BUG() for situations which are unexpected
but do not cause actual problems.

Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-12-07 12:51:30 +01:00
Peter Oberparleiter
1f5bd3848b [S390] cio: ensure proper locking during device recognition
Device recognition needs to be started with the ccw device lock
held to prevent race conditions between I/O starting and interrupt
reception.

Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-12-07 12:51:30 +01:00
Peter Oberparleiter
7c4d964fa4 [S390] cio: handle error during path verification consistently
Handle verification errors consistently through the existing
callback ccw_device_done to reduce cleanup code duplication.

Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-12-07 12:51:30 +01:00
Peter Oberparleiter
736b5db895 [S390] cio: handle error during device recognition consistently
Remove the return code from ccw_device_recognition and handle
recognition errors through the existing callback
ccw_device_recog_done to reduce cleanup code duplication.

Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-12-07 12:51:30 +01:00
Peter Oberparleiter
a7ae2c02f5 [S390] cio: inform user when online/offline processing fails
Print a warning message in case a ccw device enters boxed or
not operational state during online/offline processing.

Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-12-07 12:51:30 +01:00
Peter Oberparleiter
37de53bb52 [S390] cio: introduce ccw device todos
Introduce a central mechanism for performing delayed ccw device work
to ensure that different types of work do not overwrite each other.
Prioritization ensures that the most important work is always
performed while less important tasks are either obsoleted or repeated
later.

Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-12-07 12:51:30 +01:00
Peter Oberparleiter
390935acac [S390] cio: introduce subchannel todos
Ensure that current and future users of sch->work do not overwrite
each other by introducing a single mechanism for delayed subchannel
work.

Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-12-07 12:51:30 +01:00
Peter Oberparleiter
5d6e6b6f6f [S390] cio: introduce parent-initiated device move
Change the initiative to update subchannel-ccw device associations
to the subchannel: when there is an indication that the internal
association no longer reflects the current hardware state, mark
each affected subchannel as requiring attention. Once processing
reaches a subchannel, determine the correct association for that
subchannel at that time and perform the necessary device_move
operations.

This change fixes problems with the previous approach which would
leave devices in an inconsistent state when a new hardware change
occurred while a device_move was already scheduled.

Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-12-07 12:51:29 +01:00
Peter Oberparleiter
60e4dac1ab [S390] cio: fix repeat setting of cdev parent association
sch_create_and_recog_new_device() associates a parent subchannel
with its ccw device child even though this is already done by
the subsequently called io_subchannel_recog(). Also make sure
io_subchannel_recog() sets the association under lock.

Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-12-07 12:51:29 +01:00
Peter Oberparleiter
48e4c385c5 [S390] cio: fix double free in case of probe failure
io_subchannel_probe() frees memory for sch->private which is later
freed again when io_subchannel_remove() is called. Fix this problem
by removing the cleanup in io_subchannel_probe().

Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-12-07 12:51:29 +01:00
Martin Schwidefsky
8b94c1ed4d [S390] sclp: undo quiesce handler override on resume
In a system where the ctrl-alt-del init action initiated by signal
quiesce suspends the machine the quiesce handler override for
_machine_restart, _machine_halt and _machine_power_off needs to be
undone, otherwise the override is still present in the resumed
system. The next shutdown would then load the quiesce state psw
instead of performing the correct shutdown action.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-11-13 15:45:03 +01:00
Gerald Schaefer
ccaf655396 [S390] monreader: fix use after free bug with suspend/resume
The monreader device driver doesn't set dev->driver_data to NULL after
freeing the corresponding data structure. This leads to a use after
free bug in the freeze/thaw suspend/resume functions after the device
has been opened and closed once. Fix this by clearing dev->driver_data
in the close() function.

Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-11-13 15:45:03 +01:00
Linus Torvalds
7d531a7e51 Merge branch 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6
* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6:
  [S390] smp: fix sigp sense handling
  [S390] smp: fix sigp stop handling
  [S390] cputime: fix overflow on 31 bit systems
  [S390] call home: fix string length handling
  [S390] call home: fix error handling in init function
  [S390] smp: fix prefix handling of offlined cpus
  [S390] s/r: cmm resume fix
  [S390] call home: fix local buffer usage in proc handler
2009-10-31 12:14:56 -07:00
Linus Torvalds
61aa1620be Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6:
  [SCSI] zfcp: Flush SCSI registration work when adding unit
  [SCSI] zfcp: Fix timer initialization for ct and els requests
  [SCSI] zfcp: Warn about storage devices with broken PLOGI data
  [SCSI] zfcp: Handle WWPN mismatch in PLOGI payload
  [SCSI] zfcp: fix kfree handling in zfcp_init_device_setup
  [SCSI] fix memory leak in initialization
2009-10-29 09:16:01 -07:00
Heiko Carstens
e8a79c9ec7 [S390] call home: fix string length handling
After copying uts->nodename to the static nodename array the static
version isn't necessarily zero termininated, since the size of the
array is one byte too short.
Afterwards doing strncat(data, nodename, strlen(nodename)); may copy
an arbitrary large amount of bytes.
Fix this by getting rid of the static array and using strncat with
proper length limit.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-10-29 15:05:12 +01:00
Heiko Carstens
4a0fb4c445 [S390] call home: fix error handling in init function
Fix missing unregister_sysctl_table in case the SCLP doesn't provide
the requested feature. Also simplify the whole error handling while
at it.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-10-29 15:05:12 +01:00
Martin Schwidefsky
8ca45667f9 [S390] s/r: cmm resume fix
If a suspended z/VM guest has been logged off before the resume the
'SET SMSG IUCV' CP command need to be repeated to reenable sending
message via SMSG. This fixes the following error:

HCPMFS057I H4214002 not receiving; SMSG off
Error: non-zero CP response for command 'SMSG H4214002 CMM SHRINK 5010': #57

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-10-29 15:05:12 +01:00
Sebastian Ott
3f0b3c33ee [S390] call home: fix local buffer usage in proc handler
Fix the size of the local buffer and use snprintf to prevent
further miscalculations. Also fix the usage of bitwise vs logic
operations.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-10-29 15:05:12 +01:00
Christof Schmitt
9e820afd0c [SCSI] zfcp: Flush SCSI registration work when adding unit
When configuring a LUN for use in zfcp, flush the SCSI work to ensure
the SCSI device has been created before returning. This means that a
configuration procedure can run these commands in a script and the
SCSI device is available immediately after the unit_add:

echo 1 > /sys/bus/ccw/drivers/zfcp/0.0.181d/online
echo 0x401040C300000000 > \
        /sys/bus/ccw/drivers/zfcp/0.0.181d/0x500507630313c562/unit_add
lsscsi

Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-10-22 09:38:51 +09:00
Christof Schmitt
9d38500de1 [SCSI] zfcp: Fix timer initialization for ct and els requests
Add HZ since the start_timer function expects jiffies, not seconds.

Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-10-22 09:38:49 +09:00
Christof Schmitt
10d00f78e6 [SCSI] zfcp: Warn about storage devices with broken PLOGI data
After opening a remote port zfcp checks if the WWPN returned in the
PLOGI maches the WWPN of the port that should have been opened. On a
mismatch zfcp assumes that the DID just changed, queries the FC
nameserver and tries again. If the situation persists the erp will
give up.

With this strategy, if the remote port always returns the wrong PLOGI
data, the remote port will not be opened. Introduce a warning, so that
the system administrator knows why the remote port is not being opened
and to have a pointer to investigate the problem on the storage
system.

Reviewed-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-10-22 09:38:47 +09:00
Christof Schmitt
934aeb587b [SCSI] zfcp: Handle WWPN mismatch in PLOGI payload
For ports, zfcp gets the DID from the FC nameserver and tries to open
the port. If the open succeeds, zfcp compares the WWPN from the
nameserver with the WWPN in the PLOGI payload. In case of a mismatch,
zfcp assumes that the DID of the port just changed and we opened the
wrong port. This means that zfcp has to forget the DID, lookup the DID
again and retry.

This error case had a problem that zfcp forgets the DID, but never
looks up a new one, stalling the ERP in this case. Fix this by
triggering the DID lookup and properly exit from the ERP. The DID
lookup will trigger a new ERP action.

Also ensure when trying to open the port again with the new DID, first
close the open port, even in the NOESC case.

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-10-22 09:38:45 +09:00
Heiko Carstens
d10c0858f6 [SCSI] zfcp: fix kfree handling in zfcp_init_device_setup
The pointer that is allocated with kmalloc() is passed to strsep()
which modifies it. Later on the modified pointer value will be passed
to kfree. Save the original pointer and pass that one to kfree
instead.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
2009-10-22 09:38:42 +09:00
Michael Holzheu
ac522b638d [S390] sclp_vt220 build fix
Fix this build error:

	next-20091013 randconfig build on s390x build breaks with

drivers/s390/built-in.o:(.data+0x3354): undefined reference to `sclp_vt220_pm_event_fn'

Reported-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Signed-off-by: Michael Holzheu <michael.holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-10-14 12:43:54 +02:00
Peter Oberparleiter
6d7c5afc89 [S390] cio: change misleading console logic
Use cio_is_console() in io_subchannel_probe to indicate that the
special handling is console specific. As long as there is no other
subchannel for which this might be true, it is misleading to speak
of "early devices". Should more of these devices be introduced,
a cleanup of all console special handling is in order anyway.

Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-10-14 12:43:53 +02:00
Heiko Carstens
d3acf71fb8 [S390] call home support: fix proc handler
8d65af78 "sysctl: remove "struct file *" argument of ->proc_handler"
removed the struct file argument from all proc_handlers but didn't
change the call home proc handler (or call home was merged later).

So fix this now.

Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Hans-Joachim Picht <hans@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-10-14 12:43:53 +02:00
Stefan Haberland
d9fa9441ed [S390] dasd: use idal for device characteristics
If the rdc_buffer is above 2G we need indirect addresssing so we have
to use an idaw to give the rdc_buffer to the ccw.
If the rdc_buffer is under 2G nothing changes.

Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-10-14 12:43:53 +02:00
Stefan Haberland
a7602f6c16 [S390] dasd: fix locking bug
Replace spin_lock with spin_lock_irqsave in dasd_eckd_restore_device.

Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-10-14 12:43:52 +02:00
Michael Holzheu
03cadd36d5 [S390] tape390: Fix request queue handling in block driver
When setting a channel attached tape online under Linux 2.6.31, the

"vol_id" process from udev hangs in sync_page():
 2 sync_page+144 [0x1dfaac]
 3 __wait_on_bit_lock+194 [0x58c23e]
 4 __lock_page+116 [0x1df9dc]
 5 truncate_inode_pages_range+728 [0x1ed7cc]
 6 __blkdev_put+244 [0x25f738]
 7 __fput+300 [0x229c4c]
 8 filp_close+122 [0x225a3a]

The reason for that is an error in the request queue handling. It can
happen that we fetch a request, but do not process it further because
the number of queued requests exceeds TAPEBLOCK_MIN_REQUEUE.
To fix this, we should call blk_peek_request() instead of
blk_fetch_request() in the while condition and fetch the request in
the loop body afterwards.

This bug was introduced with the patch "block: implement and enforce
request peek/start/fetch" (9934c8c045)

Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-10-14 12:43:52 +02:00
Linus Torvalds
f144c78e52 Merge branch 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6
* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6: (21 commits)
  [S390] dasd: fix race condition in resume code
  [S390] Add EX_TABLE for addressing exception in usercopy functions.
  [S390] 64-bit register support for 31-bit processes
  [S390] hibernate: Use correct place for CPU address in lowcore
  [S390] pm: ignore time spend in suspended state
  [S390] zcrypt: Improve some comments
  [S390] zcrypt: Fix sparse warning.
  [S390] perf_counter: fix vdso detection
  [S390] ftrace: drop nmi protection
  [S390] compat: fix truncate system call wrapper
  [S390] Provide arch specific mdelay implementation.
  [S390] Fix enabled udelay for short delays.
  [S390] cio: allow setting boxed devices offline
  [S390] cio: make not operational handling consistent
  [S390] cio: make disconnected handling consistent
  [S390] Fix memory leak in /proc/cio_ignore
  [S390] cio: channel path memory leak
  [S390] module: fix memory leak in s390 module loader
  [S390] Enable kmemleak on s390.
  [S390] 3270 console build fix
  ...
2009-10-11 11:34:50 -07:00
Linus Torvalds
69585dd69e Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6: (34 commits)
  [SCSI] qla2xxx: Fix NULL ptr deref bug in fail path during queue create
  [SCSI] st: fix possible memory use after free after MTSETBLK ioctl
  [SCSI] be2iscsi: Moving to pci_pools v3
  [SCSI] libiscsi: iscsi_session_setup to allow for private space
  [SCSI] be2iscsi: add 10Gbps iSCSI - BladeEngine 2 driver
  [SCSI] zfcp: Fix hang when offlining device with offline chpid
  [SCSI] zfcp: Fix lockdep warning when offlining device with offline chpid
  [SCSI] zfcp: Fix oops during shutdown of offline device
  [SCSI] zfcp: Fix initial device and cfdc for delayed adapter allocation
  [SCSI] zfcp: correctly initialize unchained requests
  [SCSI] mpt2sas: Bump version 02.100.03.00
  [SCSI] mpt2sas: Support dev remove when phy status is MPI2_EVENT_SAS_TOPO_PHYSTATUS_VACANT
  [SCSI] mpt2sas: Timeout occurred within the HANDSHAKE logic while waiting on firmware to ACK.
  [SCSI] mpt2sas: Call init_completion on a per request basis.
  [SCSI] mpt2sas: Target Reset will be issued from Interrupt context.
  [SCSI] mpt2sas: Added SCSIIO, Internal and high priority memory pools to support multiple TM
  [SCSI] mpt2sas: Copyright change to 2009.
  [SCSI] mpt2sas: Added mpi2_history.txt for MPI2 headers.
  [SCSI] mpt2sas: Update driver to MPI2 REV K headers.
  [SCSI] bfa: Brocade BFA FC SCSI driver
  ...
2009-10-11 11:12:33 -07:00
Stefan Haberland
6fca97a958 [S390] dasd: fix race condition in resume code
There is a race while re-reading the device characteristics. After
cleaning the memory area a cqr is build which reads the device
characteristics. This may take a rather long time and the device
characteristics structure is zero during this. Now it could be
possible that the block tasklet starts working and a new cqr will be
build. The build_cp command refers to the device characteristics
structure and this may lead into a divide by zero exception.
Fix this by re-reading the device characteristics into a temporary
structur and copy the data to the original structure. Also take the
ccwdev_lock.

Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-10-06 10:35:11 +02:00