linux

Author	SHA1	Message	Date
Lars Ellenberg	8ccee20e3e	drbd: don't cond_resched_lock with IRQs disabled The last commit, drbd: add missing spinlock to bitmap receive, introduced a cond_resched_lock(), where the lock in question is taken with irqs disabled. As we must not schedule with IRQs disabled, and cond_resched_lock_irq() does not exist, yet, we re-aquire the spin_lock_irq() for each bitmap page processed in turn. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-06-30 09:23:42 +02:00
Lars Ellenberg	829c608786	drbd: add missing spinlock to bitmap receive During bitmap exchange, when using the RLE bitmap compression scheme, we have a code path that can set the whole bitmap at once. To avoid holding spin_lock_irq() for too long, we used to lock out other bitmap modifications during bitmap exchange by other means, and then, knowing we have exclusive access to the bitmap, modify it without the spinlock, and with IRQs enabled. Since we now allow local IO to continue, potentially setting additional bits during the bitmap receive phase, this is no longer true, and we get uncoordinated updates of bitmap members, causing bm_set to no longer accurately reflect the total number of set bits. To actually see this, you'd need to have a large bitmap, use RLE bitmap compression, and have busy IO during sync handshake and bitmap exchange. Fix this by taking the spin_lock_irq() in this code path as well, but calling cond_resched_lock() after each page worth of bits processed. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-06-30 09:23:41 +02:00
Philipp Reisner	0cfdd247d1	drbd: Use the correct max_bio_size when creating resync requests Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-06-30 09:23:40 +02:00
Namhyung Kim	a1c15c59fe	loop: handle on-demand devices correctly When finding or allocating a loop device, loop_probe() did not take partition numbers into account so that it can result to a different device. Consider following example: $ sudo modprobe loop max_part=15 $ ls -l /dev/loop* brw-rw---- 1 root disk 7, 0 2011-05-24 22:16 /dev/loop0 brw-rw---- 1 root disk 7, 16 2011-05-24 22:16 /dev/loop1 brw-rw---- 1 root disk 7, 32 2011-05-24 22:16 /dev/loop2 brw-rw---- 1 root disk 7, 48 2011-05-24 22:16 /dev/loop3 brw-rw---- 1 root disk 7, 64 2011-05-24 22:16 /dev/loop4 brw-rw---- 1 root disk 7, 80 2011-05-24 22:16 /dev/loop5 brw-rw---- 1 root disk 7, 96 2011-05-24 22:16 /dev/loop6 brw-rw---- 1 root disk 7, 112 2011-05-24 22:16 /dev/loop7 $ sudo mknod /dev/loop8 b 7 128 $ sudo losetup /dev/loop8 ~/temp/disk-with-3-parts.img $ sudo losetup -a /dev/loop128: [0805]:278201 (/home/namhyung/temp/disk-with-3-parts.img) $ ls -l /dev/loop* brw-rw---- 1 root disk 7, 0 2011-05-24 22:16 /dev/loop0 brw-rw---- 1 root disk 7, 16 2011-05-24 22:16 /dev/loop1 brw-rw---- 1 root disk 7, 2048 2011-05-24 22:18 /dev/loop128 brw-rw---- 1 root disk 7, 2049 2011-05-24 22:18 /dev/loop128p1 brw-rw---- 1 root disk 7, 2050 2011-05-24 22:18 /dev/loop128p2 brw-rw---- 1 root disk 7, 2051 2011-05-24 22:18 /dev/loop128p3 brw-rw---- 1 root disk 7, 32 2011-05-24 22:16 /dev/loop2 brw-rw---- 1 root disk 7, 48 2011-05-24 22:16 /dev/loop3 brw-rw---- 1 root disk 7, 64 2011-05-24 22:16 /dev/loop4 brw-rw---- 1 root disk 7, 80 2011-05-24 22:16 /dev/loop5 brw-rw---- 1 root disk 7, 96 2011-05-24 22:16 /dev/loop6 brw-rw---- 1 root disk 7, 112 2011-05-24 22:16 /dev/loop7 brw-r--r-- 1 root root 7, 128 2011-05-24 22:17 /dev/loop8 After this patch, /dev/loop8 - instead of /dev/loop128 - was accessed correctly. In addition, 'range' passed to blk_register_region() should include all range of dev_t that LOOP_MAJOR can address. It does not need to be limited by partition numbers unless 'max_loop' param was specified. Signed-off-by: Namhyung Kim <namhyung@gmail.com> Cc: Laurent Vivier <Laurent.Vivier@bull.net> Cc: stable@kernel.org Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2011-05-24 16:48:55 +02:00
Namhyung Kim	78f4bb367f	loop: limit 'max_part' module param to DISK_MAX_PARTS The 'max_part' parameter controls the number of maximum partition a loop block device can have. However if a user specifies very large value it would exceed the limitation of device minor number and can cause a kernel panic (or, at least, produce invalid device nodes in some cases). On my desktop system, following command kills the kernel. On qemu, it triggers similar oops but the kernel was alive: $ sudo modprobe loop max_part0000 ------------[ cut here ]------------ kernel BUG at /media/Linux_Data/project/linux/fs/sysfs/group.c:65! invalid opcode: 0000 [#1] SMP last sysfs file: CPU 0 Modules linked in: loop(+) Pid: 43, comm: insmod Tainted: G W 2.6.39-qemu+ #155 Bochs Bochs RIP: 0010:[<ffffffff8113ce61>] [<ffffffff8113ce61>] internal_create_group= +0x2a/0x170 RSP: 0018:ffff880007b3fde8 EFLAGS: 00000246 RAX: 00000000ffffffef RBX: ffff880007b3d878 RCX: 00000000000007b4 RDX: ffffffff8152da50 RSI: 0000000000000000 RDI: ffff880007b3d878 RBP: ffff880007b3fe38 R08: ffff880007b3fde8 R09: 0000000000000000 R10: ffff88000783b4a8 R11: ffff880007b3d878 R12: ffffffff8152da50 R13: ffff880007b3d868 R14: 0000000000000000 R15: ffff880007b3d800 FS: 0000000002137880(0063) GS:ffff880007c00000(0000) knlGS:00000000000000= 00 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000422680 CR3: 0000000007b50000 CR4: 00000000000006b0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 0000000000000000 DR7: 0000000000000000 Process insmod (pid: 43, threadinfo ffff880007b3e000, task ffff880007afb9c= 0) Stack: ffff880007b3fe58 ffffffff811e66dd ffff880007b3fe58 ffffffff811e570b 0000000000000010 ffff880007b3d800 ffff880007a7b390 ffff880007b3d868 0000000000400920 ffff880007b3d800 ffff880007b3fe48 ffffffff8113cfc8 Call Trace: [<ffffffff811e66dd>] ? device_add+0x4bc/0x5af [<ffffffff811e570b>] ? dev_set_name+0x3c/0x3e [<ffffffff8113cfc8>] sysfs_create_group+0xe/0x12 [<ffffffff810b420e>] blk_trace_init_sysfs+0x14/0x16 [<ffffffff8116a090>] blk_register_queue+0x47/0xf7 [<ffffffff8116f527>] add_disk+0xdf/0x290 [<ffffffffa00060eb>] loop_init+0xeb/0x1b8 [loop] [<ffffffffa0006000>] ? 0xffffffffa0005fff [<ffffffff8100020a>] do_one_initcall+0x7a/0x12e [<ffffffff81096804>] sys_init_module+0x9c/0x1e0 [<ffffffff813329bb>] system_call_fastpath+0x16/0x1b Code: c3 55 48 89 e5 41 57 41 56 41 89 f6 41 55 41 54 49 89 d4 53 48 89 fb= 48 83 ec 28 48 85 ff 74 0b 85 f6 75 0b 48 83 7f 30 00 75 14 <0f> 0b eb fe = 48 83 7f 30 00 b9 ea ff ff ff 0f 84 18 01 00 00 49 RIP [<ffffffff8113ce61>] internal_create_group+0x2a/0x170 RSP <ffff880007b3fde8> ---[ end trace a123eb592043acad ]--- Signed-off-by: Namhyung Kim <namhyung@gmail.com> Cc: Laurent Vivier <Laurent.Vivier@bull.net> Cc: stable@kernel.org Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2011-05-24 16:48:54 +02:00
Andrew Morton	0ddf72be4e	drbd: fix warning In file included from drivers/block/drbd/drbd_main.c:54: drivers/block/drbd/drbd_int.h:1190: warning: parameter has incomplete type Forward declarations of enums do not work. Fix it unpleasantly by moving the prototype. Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Lars Ellenberg <drbd-dev@lists.linbit.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2011-05-24 10:38:33 +02:00
Philipp Reisner	9b2f61aec7	drbd: fix warning Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>	2011-05-24 10:38:32 +02:00
Bart Van Assche	24c4830c8e	drbd: Fix spelling Found these with the help of ispell -l. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>	2011-05-24 10:21:29 +02:00
Lars Ellenberg	9a0d9d0389	drbd: fix schedule in atomic An administrative detach used to request a state change directly to D_DISKLESS, first suspending IO to avoid the last put_ldev() occuring from an endio handler, potentially in irq context. This is not enough on the receiving side (typically secondary), we may miss some peer_req on the way to local disk, which then may do the last put_ldev() from their drbd_peer_request_endio(). This patch makes the detach always go through the intermediate D_FAILED state. We may consider to rename it D_DETACHING. Alternative approach would be to create yet an other work item to be scheduled on the worker, do the destructor work from there, and get the timing right. manually picked commit 564040f from the drbd 8.4 branch. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-05-24 10:14:32 +02:00
Philipp Reisner	99432fcc52	drbd: Take a more conservative approach when deciding max_bio_size The old (optimistic) implementation could shrink the bio size on an primary device. Shrinking the bio size on a primary device is bad. Since there we might get BIOs with the old (bigger) size shortly after we published the new size. The new implementation is more conservative, and eventually increases the max_bio_size on a primary device (which is valid). It does so, when it knows the local limit AND the remote limit. We cache the last seen max_bio_size of the peer in the meta data, and rely on that, to make the operation of single nodes more efficient. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-05-24 10:08:58 +02:00
Philipp Reisner	21423fa791	drbd: Fixed state transitions after async outdate-peer-handler returned Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-05-24 10:08:11 +02:00
Philipp Reisner	fa7d939663	drbd: Disallow the peer_disk_state to be D_OUTDATED while connected Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-05-24 10:07:50 +02:00
Philipp Reisner	a8e407925d	drbd: Fix for the connection problems on high latency links It seems that the real cause of all the issues where that we did not noticed in drbd_try_connect() when the other guy closes one socket if the round trip time gets higher than 100ms. There were that 100ms hard coded! Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-05-24 10:07:22 +02:00
Lars Ellenberg	76727f684a	drbd: fix potential activity log refcount imbalance in error path It is no longer sufficient to trigger on local WRITE, we need to check on (rq_state & RQ_IN_ACT_LOG) before calling drbd_al_complete_io also in the error path. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-05-24 10:06:44 +02:00
Philipp Reisner	d2e17807e3	drbd: Only downgrade the disk state in case of disk failures Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-05-24 10:05:48 +02:00
Lars Ellenberg	f36af18c7b	drbd: fix disconnect/reconnect loop, if ping-timeout == ping-int If there is no replication traffic within the idle timeout (ping-int seconds), DRBD will send a P_PING, and adjust the timeout to ping-timeout. If there is no P_PING_ACK received within this ping-timeout, DRBD finally drops the connection, and tries to re-establish it. To decide which timeout was active, we compared the current timeout with the ping-timeout, and dropped the connection, if that was the case. By default, ping-int is 10 seconds, ping-timeout is 500 ms. Unfortunately, if you configure ping-timeout to be the same as ping-int, expiry of the idle-timeout had been mistaken for a missing ping ack, and caused an immediate reconnection attempt. Fix: Allow both timeouts to be equal, use a local variable to store which timeout is active. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-05-24 10:03:30 +02:00
Lars Ellenberg	53ea433145	drbd: fix potential distributed deadlock We limit ourselves to a configurable maximum number of pages used as temporary bio pages. If the configured "max_buffers" is not big enough to match the bandwidth of the respective deployment, a distributed deadlock could be triggered by e.g. fast online verify and heavy application IO. TCP connections would block on congestion, because both receivers would wait on pages to become available. Fortunately the respective senders in this case would be able to give back some pages already. So do that. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-05-24 10:02:41 +02:00
Philipp Reisner	738a84b25c	drbd: Fix for application IO with the on-io-error=pass-on policy In case a write failes on the local disk, go into D_INCONSISTENT disk state. That causes future reads of that block to be shipped to the peer. Read retry remote was already in place. Actually the documentation needs to get fixed now. Since the application is still shielded from the error. (as long as we have only a single disk failing) The difference to detach is that we keep the disk. And therefore might keep all the other, still working sectors up to date. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>	2011-05-24 09:59:49 +02:00
Jens Axboe	779d530632	Merge branches 'for-jens/xen-backend-fixes' and 'for-jens/xen-blkback-v3.3' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into for-2.6.40/drivers	2011-05-19 09:46:00 +02:00
Jan Beulich	8ab521506c	xen/blkback: don't fail empty barrier requests The sector number on empty barrier requests may (will?) be -1, which, given that it's being treated as unsigned 64-bit quantity, will almost always exceed the actual (virtual) disk's size. Inspired by Konrad's "When writting barriers set the sector number to zero...". While at it also add overflow checking to the math in vbd_translate(). Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-05-18 11:28:16 -04:00
Laszlo Ersek	496b318eb6	xen/blkback: fix xenbus_transaction_start() hang caused by double xenbus_transaction_end() vbd_resize() up_read()'s xs_state.suspend_mutex twice in a row via double xenbus_transaction_end() calls. The next down_read() in xenbus_transaction_start() (at eg. the next resize attempt) hangs. Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=618317 Acked-by: Jan Beulich <jbeulich@novell.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-05-13 09:45:40 -04:00
Konrad Rzeszutek Wilk	5185432277	xen/blkback: Align the tabs on the structure. The recent changes caused this field of the structure to be offset a bit. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-05-12 18:02:28 -04:00
Konrad Rzeszutek Wilk	cca537af7d	xen/blkback: if log_stats is enabled print out the data. And not depend on the driver being built with -DDEBUG flag. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-05-12 17:55:54 -04:00
Konrad Rzeszutek Wilk	5a577e3872	xen/blkback: Add the prefix XEN in the common.h. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-05-12 17:55:53 -04:00
Konrad Rzeszutek Wilk	3d814731ba	xen/blkback: Prefix 'vbd' with 'xen' in structs and functions. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-05-12 17:55:52 -04:00
Konrad Rzeszutek Wilk	30fd150202	xen/blkback: Change structure name blkif_st to xen_blkif. No need for that '_st' and xen_blkif is more apt. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-05-12 17:55:51 -04:00
Konrad Rzeszutek Wilk	325a648604	xen/blkback: Remove the unused typedefs. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-05-12 17:55:50 -04:00
Konrad Rzeszutek Wilk	452a6b2bb6	xen/blkback: Move include/xen/blkif.h into drivers/block/xen-blkback/common.h Not point of the blkif.h file. It is not used by the frontend. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-05-12 17:55:49 -04:00
Konrad Rzeszutek Wilk	b0f801273f	xen/blkback: Fixing some more of the cleanpatch.pl warnings. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-05-12 17:55:48 -04:00
Konrad Rzeszutek Wilk	03e0edf946	xen/blkback: Checkpatch.pl recommend against multiple assigments. CHECK: multiple assignments should be avoided Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-05-12 17:55:47 -04:00
Konrad Rzeszutek Wilk	a4c348580e	xen/blkback: Flesh out the description in the Kconfig. with more details. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-05-12 17:55:40 -04:00
Konrad Rzeszutek Wilk	b9fc02968c	xen/blkback: Fix spelling mistakes. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-05-12 16:43:21 -04:00
Konrad Rzeszutek Wilk	68c88dd7d3	xen/blkback: Move blkif_get_x86_[32\|64]_req to common.h in block/xen-blkback dir. From the blkif.h header, which was exposed to the frontend. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-05-12 16:43:20 -04:00
Konrad Rzeszutek Wilk	72468bfcb8	xen/blkback: Removing the debug_lvl option. It is not really used for anything. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-05-12 16:43:20 -04:00
Konrad Rzeszutek Wilk	22b20f2dff	xen/blkback: Use the DRV_PFX in the pr_.. macros. To make it easier to read. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-05-12 16:43:12 -04:00
Konrad Rzeszutek Wilk	1afbd730a3	xen/blkback: Make the DPRINTK uniform. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-05-12 16:42:51 -04:00
Konrad Rzeszutek Wilk	ebe8190659	xen/blkback: Change printk/DPRINTK to pr_.. type variant. And also make them uniform and prefix the message with 'xen-blkback'. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-05-12 16:42:31 -04:00
Konrad Rzeszutek Wilk	edf6ef59ec	xen-blkfront: Introduce BLKIF_OP_FLUSH_DISKCACHE support. If the backend supports the 'feature-flush-cache' mode, use that instead of the 'feature-barrier' support. Currently there are three backends that support the 'feature-flush-cache' mode: NetBSD, Solaris and Linux kernel. The 'flush' option is much light-weight version than the 'barrier' support so lets try to use as there are no filesystems in the kernel that use full barriers anymore. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-05-12 08:56:03 -04:00
Marek Marczykowski	4352b47ab7	xen-blkfront: fix data size for xenbus_gather in blkfront_connect barrier variable is int, not long. This overflow caused another variable override: "err" (in PV code) and "binfo" (in xenlinux code - drivers/xen/blkfront/blkfront.c). The later caused incorrect device flags (RO/removable etc). Signed-off-by: Marek Marczykowski <marmarek@mimuw.edu.pl> Acked-by: Ian Campbell <Ian.Campbell@citrix.com> [v1: Changed title] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-05-12 08:55:51 -04:00
Konrad Rzeszutek Wilk	01f37f2d53	xen/blkback: Fixed up comments and converted spaces to tabs. Suggested-by: Ian Campbell <Ian.Campbell@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2011-05-11 15:57:09 -04:00
Jens Axboe	edc83d47a9	cciss: fix compile issue drivers/block/cciss.c: In function ‘cciss_send_reset’: drivers/block/cciss.c:2515:2: error: implicit declaration of function ‘fill_cmd’ drivers/block/cciss.c: At top level: drivers/block/cciss.c:2531:12: error: conflicting types for ‘fill_cmd’ drivers/block/cciss.c:2534:1: note: an argument type that has a default promotion can’t match an empty parameter name list declaration drivers/block/cciss.c:2515:18: note: previous implicit declaration of ‘fill_cmd’ was here make[1]: * [drivers/block/cciss.o] Error 1 make: * [drivers/block/cciss.o] Error 2 Move fill_cmd() to above where it is first used. Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2011-05-06 08:27:00 -06:00
Stephen M. Cameron	8a4ec67bd5	cciss: add cciss_tape_cmds module paramter This is to allow number of commands reserved for use by SCSI tape drives and medium changers to be adjusted at driver load time via the kernel parameter cciss_tape_cmds, with a default value of 6, and a range of 2 - 16 inclusive. Previously, the driver limited the number of commands which could be queued to the SCSI half of the the driver to only 2. This is to fix the problem that if you had more than two tape drives, you couldn't, for example, erase or rewind them all at the same time. Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2011-05-06 08:23:59 -06:00
Stephen M. Cameron	063d2cf72a	cciss: do not use bit 2 doorbell reset It causes NMIs which are undesirable at best, unsurvivable at worst. Prefer the soft reset instead. Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2011-05-06 08:23:58 -06:00
Stephen M. Cameron	ec52d5f1cb	cciss: do not attempt PCI power management reset method if we know it won't work. Just go straight to the soft-reset method instead. Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2011-05-06 08:23:57 -06:00
Stephen M. Cameron	93c46c2fa7	cciss: remove superfluous sleeps around reset code Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2011-05-06 08:23:56 -06:00
Stephen M. Cameron	5afe278114	cciss: do soft reset if hard reset is broken on driver load, if reset_devices is set, and the hard reset attempts fail, try to bring up the controller to the point that a command can be sent, and send it a soft reset command, then after the reset undo whatever driver initialization was done to get it to the point to take a command, and re-do it after the reset. This is to get kdump to work on all the "non-resettable" controllers (except 64xx controllers which can't be reset due to the potentially shared cache module.) Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2011-05-06 08:23:56 -06:00
Stephen M. Cameron	bf2e2e6b87	cciss: use new doorbell-bit-5 reset method The bit-2-doorbell reset method seemed to cause (survivable) NMIs on some systems and (unsurvivable) IOCK NMIs on some G7 servers. Firmware guys implemented a new doorbell method to alleviate these problems triggered by bit 5 of the doorbell register. We want to use it if it's available. Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2011-05-06 08:23:55 -06:00
Stephen M. Cameron	3e28601fdf	cciss: increase timeouts for post-reset no-ops Just to reduce the messages about timeouts that appear. Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2011-05-06 08:23:54 -06:00
Stephen M. Cameron	59ec86bb98	cciss: clarify messages around reset behavior When waiting for the board to become "not ready" don't print a message saying "waiting for board to become ready" (possibly followed by a message saying "failed waiting for board to become not ready". Instead, it should be "waiting for board to reset" and "failed waiting for board to reset." Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> " Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2011-05-06 08:23:53 -06:00
Stephen M. Cameron	19adbb9254	cciss: increase time to wait for board reset to start Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2011-05-06 08:23:51 -06:00

1 2 3 4 5 ...

1972 Commits