linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-17 17:41:44 +00:00

Author	SHA1	Message	Date
Yossi Etigin	93a3ab939b	IPoIB: Fix hang in ipoib_flush_paths() ipoib_flush_paths() can hang during an SM up/down loop: if path_rec_start() fails (for instance, because there is no sm_ah), the path is still added to the path list by neigh_add_path(). Then, ipoib_flush_paths() will wait for path->done, but it will never complete because the request was not issued at all. Fix this by completing path->done if issuing the query fails. This fixes <https://bugs.openfabrics.org/show_bug.cgi?id=1329>. Signed-off-by: Yossi Etigin <yosefe@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-11-12 10:24:38 -08:00
Yossi Etigin	fe25c56190	IPoIB: Don't enable NAPI when it's already enabled If a P_Key is not present when an interface is created, ipoib_open() will return after doing napi_enable(). ipoib_open() will be called again from ipoib_pkey_poll() when the P_Key appears, after NAPI has already been enabled, and try to enable it again. This triggers a BUG_ON() in napi_enable(). Fix this by moving the call to napi_enable() to after the test for P_Key presence. Signed-off-by: Yossi Etigin <yosefe@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-11-12 10:24:36 -08:00
Linus Torvalds	724bdd097e	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: IB/ehca: Reject dynamic memory add/remove when ehca adapter is present IB/ehca: Fix reported max number of QPs and CQs in systems with >1 adapter IPoIB: Set netdev offload features properly for child (VLAN) interfaces IPoIB: Clean up ethtool support mlx4_core: Add Ethernet PCI device IDs mlx4_en: Add driver for Mellanox ConnectX 10GbE NIC mlx4_core: Multiple port type support mlx4_core: Ethernet MAC/VLAN management mlx4_core: Get ethernet MTU and default address from firmware mlx4_core: Support multiple pre-reserved QP regions Update NetEffect maintainer emails to Intel emails RDMA/cxgb3: Remove cmid reference on tid allocation failures IB/mad: Use krealloc() to resize snoop table IPoIB: Always initialize poll_timer to avoid crash on unload IB/ehca: Don't allow creating UC QP with SRQ mlx4_core: Add QP range reservation support RDMA/ucma: Test ucma_alloc_multicast() return against NULL, not with IS_ERR()	2008-10-23 08:16:03 -07:00
Or Gerlitz	83bb63f62b	IPoIB: Set netdev offload features properly for child (VLAN) interfaces Child devices were created without any offload features set, fix this by moving the code that computes the features into generic function which is now called through non-child and child device creation. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> -- v1 has a bug where the 'result' flag in ipoib_vlan_add may be used uninitialized Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-10-22 15:49:49 -07:00
Or Gerlitz	70c9c0db54	IPoIB: Clean up ethtool support Add a get_rx_csum method. Remove the driver's own get_tso method, as the ethtool kernel code uses the default one if nothing is provided. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-10-22 15:49:29 -07:00
Linus Torvalds	ed09441dac	Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (39 commits) [SCSI] sd: fix compile failure with CONFIG_BLK_DEV_INTEGRITY=n libiscsi: fix locking in iscsi_eh_device_reset libiscsi: check reason why we are stopping iscsi session to determine error value [SCSI] iscsi_tcp: return a descriptive error value during connection errors [SCSI] libiscsi: rename host reset to target reset [SCSI] iscsi class: fix endpoint id handling [SCSI] libiscsi: Support drivers initiating session removal [SCSI] libiscsi: fix data corruption when target has to resend data-in packets [SCSI] sd: Switch kernel printing level for DIF messages [SCSI] sd: Correctly handle all combinations of DIF and DIX [SCSI] sd: Always print actual protection_type [SCSI] sd: Issue correct protection operation [SCSI] scsi_error: fix target reset handling [SCSI] lpfc 8.2.8 v2 : Add statistical reporting control and additional fc vendor events [SCSI] lpfc 8.2.8 v2 : Add sysfs control of target queue depth handling [SCSI] lpfc 8.2.8 v2 : Revert target busy in favor of transport disrupted [SCSI] scsi_dh_alua: remove REQ_NOMERGE [SCSI] lpfc 8.2.8 : update driver version to 8.2.8 [SCSI] lpfc 8.2.8 : Add MSI-X support [SCSI] lpfc 8.2.8 : Update driver to use new Host byte error code DID_TRANSPORT_DISRUPTED ...	2008-10-17 09:00:23 -07:00
Steven Whitehouse	a447c09324	vfs: Use const for kernel parser table This is a much better version of a previous patch to make the parser tables constant. Rather than changing the typedef, we put the "const" in all the various places where its required, allowing the __initconst exception for nfsroot which was the cause of the previous trouble. This was posted for review some time ago and I believe its been in -mm since then. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Cc: Alexander Viro <aviro@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-10-13 10:10:37 -07:00
Mike Christie	8e12452549	[SCSI] libiscsi: rename host reset to target reset I had this in my patchset to add target reset support, but it got dropped due to patching conflicts. This initial patch just renames the function and users. We are actually just dropping the session, and so this does not have anything to do with the host exactly. It does for software iscsi because we allocate a host per session, but for cxgb3i this makes no sense. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-10-13 09:28:59 -04:00
Mike Christie	e5bd7b54e9	[SCSI] libiscsi: Support drivers initiating session removal If the driver knows when hardware is removed like with cxgb3i, bnx2i, qla4xxx and iser then we will want to remove the sessions/devices that are bound to that device before removing the host. cxgb3i and in the future bnx2i will remove the host and that will remove all the sessions on the hba. iser can call iscsi_kill_session when it gets an event that indicates that a hca is removed. And when qla4xxx is hooked in to the lib (it is only hooked into the class right now) it can call iscsi remove host like the partial offload card drivers. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-10-13 09:28:59 -04:00
Roland Dreier	2767840a5c	IPoIB: Always initialize poll_timer to avoid crash on unload ipoib_ib_dev_stop() does del_timer_sync(&priv->poll_timer), but if a P_key for an interface is not found, poll_timer is not initialized, so this leads to a crash or hang. Fix this by moving where poll_timer is initialized to ipoib_ib_dev_init(), which is always called. This fixes <https://bugs.openfabrics.org/show_bug.cgi?id=1172>. Debugged-by: Yosef Etigin <yosefe@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-10-10 15:58:52 -07:00
Roland Dreier	943c246e9b	IPoIB: Use netif_tx_lock() and get rid of private tx_lock, LLTX Currently, IPoIB is an LLTX driver that uses its own IRQ-disabling tx_lock. Not only do we want to get rid of LLTX, this actually causes problems because of the skb_orphan() done with this tx_lock held: some skb destructors expect to be run with interrupts enabled. The simplest fix for this is to get rid of the driver-private tx_lock and stop using LLTX. We kill off priv->tx_lock and use netif_tx_lock[_bh]() instead; the patch to do this is a tiny bit tricky because we need to update places that take priv->lock inside the tx_lock to disable IRQs, rather than relying on tx_lock having already disabled IRQs. Also, there are a couple of places where we need to disable BHs to make sure we have a consistent context to call netif_tx_lock() (since we no longer can use _irqsave() variants), and we also have to change ipoib_send_comp_handler() to call drain_tx_cq() through a timer rather than directly, because ipoib_send_comp_handler() runs in interrupt context and drain_tx_cq() must run in BH context so it can call netif_tx_lock(). Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-09-30 10:36:21 -07:00
Roland Dreier	c9da4bad5b	IPoIB: Fix crash when path record fails after path flush Commit `ee1e2c82` ("IPoIB: Refresh paths instead of flushing them on SM change events") changed how paths are flushed on an SM event. This change introduces a problem if the path record query triggered by fails, causing path->ah to become NULL. A later successful path query will then trigger WARN_ON() in path_rec_completion(), and crash because path->ah has already been freed, so the ipoib_put_ah() inside the lock in path_rec_completion() may actually drop the last reference (contrary to the comment that claims this is safe). Fix this by updating path->ah and freeing old_ah only when the path record query is successful. This prevents the neighbour AH and that path AH from getting out of sync. This fixes <https://bugs.openfabrics.org/show_bug.cgi?id=1194> Reported-by: Rabah Salem <ravah@mellanox.com> Debugged-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-09-25 15:26:15 -07:00
Yossi Etigin	e8224e4b80	IPoIB: Fix deadlock on RTNL between bcast join comp and ipoib_stop() Taking rtnl_lock in ipoib_mcast_join_complete() causes a deadlock with ipoib_stop(). We avoid it by scheduling the piece of code that takes the lock on ipoib_workqueue instead of executing it directly. This works because we only flush the ipoib_workqueue with the RTNL not held. The deadlock happens because ipoib_stop() calls ipoib_ib_dev_down() which calls ipoib_mcast_dev_flush(), which calls ipoib_mcast_free(), which calls ipoib_mcast_leave(). The latter calls ib_sa_free_multicast(), and this waits until the multicast completion handler finishes. This handler is ipoib_mcast_join_complete(), which waits for the rtnl_lock(), which was already taken by ipoib_stop(). This bug was introduced in commit `a77a57a1` ("IPoIB: Fix deadlock on RTNL in ipoib_stop()"). Signed-off-by: Yossi Etigin <yosefe@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-09-16 11:57:45 -07:00
Adrian Bunk	7a8fc9b248	removed unused #include <linux/version.h>'s This patch lets the files using linux/version.h match the files that #include it. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-08-23 12:14:12 -07:00
Roland Dreier	a77a57a1a2	IPoIB: Fix deadlock on RTNL in ipoib_stop() Commit `c8c2afe3` ("IPoIB: Use rtnl lock/unlock when changing device flags") added a call to rtnl_lock() in ipoib_mcast_join_task(), which is run from the ipoib_workqueue. However, ipoib_stop() (which is run inside rtnl_lock()) flushes this workqueue, which leads to a deadlock if the join task is pending. Fix this by simply not flushing the workqueue from ipoib_stop(). It turns out that we really don't care about workqueue tasks running during or after ipoib_stop(), as long as we make sure to flush the workqueue before unregistering a netdev. This fixes <https://bugs.openfabrics.org/show_bug.cgi?id=1114>. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-08-19 15:01:32 -07:00
David J. Wilder	b1404069f6	IPoIB/cm: Use vmalloc() to allocate rx_rings There are users that are running UDP applications that require a large receive queue size in order to get good performance. To prevent allocation failures for rx_rings when using non-SRQ mode and large recv_queue_size (1K or larger), use vmalloc() instead of kcalloc() to alocate rx_rings. Signed-off-by: David Wilder <dwilder@us.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-08-08 15:51:29 -07:00
Roland Dreier	e08198169e	IPoIB/cm: Set correct SG list in ipoib_cm_init_rx_wr() wr->sg_list should be set to the sge pointer passed in, not priv->cm.rx_sge. Reported-by: Hoang-Nam Nguyen <HNGUYEN@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-30 07:21:46 -07:00
Roland Dreier	9905922446	IPoIB: Correct help text for INFINIBAND_IPOIB_DEBUG The help text for INFINIBAND_IPOIB_DEBUG refers to "ipoib_debugfs," which no longer exists. Correct this to talk about the files under debugfs that are really created. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-24 20:37:25 -07:00
Roland Dreier	99c3a5a9e3	IPoIB/cm: Connected mode is no longer EXPERIMENTAL Connected mode is now tested and used by lots of people. No need to hide it under CONFIG_EXPERIMENTAL. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-24 20:37:25 -07:00
Roland Dreier	2cc177364e	Merge branches 'bkl-removal', 'cma', 'ehca', 'for-2.6.27', 'mlx4', 'mthca' and 'nes' into for-linus	2008-07-24 08:38:47 -07:00
Or Gerlitz	01b3fc8b15	IPoIB: Include err code in trace message for ib_sa_path_rec_get() failures Print the return code of ib_sa_path_rec_get() if it fails to help debug errors. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-22 14:18:34 -07:00
Or Gerlitz	2f5de15128	IB/iser: Add support for RDMA_CM_EVENT_ADDR_CHANGE event Enhance iser to act upon notification on network stack changes that make its RDMA connection unaligned with the link used by the stack for the <src,dst> IPs used to establish the connection. When RDMA_CM_EVENT_ADDR_CHANGE arrives, just disconnect the connection, assuming that the user space iscsid daemon will reconnect, and the new connection will be aligned with the IP stack. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-22 14:16:21 -07:00
David S. Miller	49997d7515	Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6 Conflicts: Documentation/powerpc/booting-without-of.txt drivers/atm/Makefile drivers/net/fs_enet/fs_enet-main.c drivers/pci/pci-acpi.c net/8021q/vlan.c net/iucv/iucv.c	2008-07-18 02:39:39 -07:00
Linus Torvalds	89a93f2f48	Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (102 commits) [SCSI] scsi_dh: fix kconfig related build errors [SCSI] sym53c8xx: Fix bogus sym_que_entry re-implementation of container_of [SCSI] scsi_cmnd.h: remove double inclusion of linux/blkdev.h [SCSI] make struct scsi_{host,target}_type static [SCSI] fix locking in host use of blk_plug_device() [SCSI] zfcp: Cleanup external header file [SCSI] zfcp: Cleanup code in zfcp_erp.c [SCSI] zfcp: zfcp_fsf cleanup. [SCSI] zfcp: consolidate sysfs things into one file. [SCSI] zfcp: Cleanup of code in zfcp_aux.c [SCSI] zfcp: Cleanup of code in zfcp_scsi.c [SCSI] zfcp: Move status accessors from zfcp to SCSI include file. [SCSI] zfcp: Small QDIO cleanups [SCSI] zfcp: Adapter reopen for large number of unsolicited status [SCSI] zfcp: Fix error checking for ELS ADISC requests [SCSI] zfcp: wait until adapter is finished with ERP during auto-port [SCSI] ibmvfc: IBM Power Virtual Fibre Channel Adapter Client Driver [SCSI] sg: Add target reset support [SCSI] lib: Add support for the T10 (SCSI) Data Integrity Field CRC [SCSI] sd: Move scsi_disk() accessor function to sd.h ...	2008-07-15 18:58:04 -07:00
David S. Miller	b9e4085768	netdev: Do not use TX lock to protect address lists. Now that we have a specific lock to protect the network device unicast and multicast lists, remove extraneous grabs of the TX lock in cases where the code only needs address list protection. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-15 00:15:08 -07:00
David S. Miller	e308a5d806	netdev: Add netdev->addr_list_lock protection. Add netif_addr_{lock,unlock}{,_bh}() helpers. Use them to protect operations that operate on or read the network device unicast and multicast address lists. Also use them in cases where the code simply wants to block calls into the driver's ->set_rx_mode() and ->set_multicast_list() methods. Signed-off-by: David S. Miller <davem@davemloft.net>	2008-07-15 00:13:44 -07:00
Eli Cohen	bc3a290b51	IPoIB: Double default RX/TX ring sizes Increase IPoIB ring sizes to twice their original sizes (RX: 128->256, TX: 64->128) to act as a shock absorber for high traffic peaks. With the current settings, we have seen cases that there are many calls to netif_stop_queue(), which causes degradation in throughput. Also, larger receive buffer sizes help IPoIB in CM mode to avoid experiencing RNR NAK conditions due to insufficient receive buffers at the SRQ. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-14 23:48:52 -07:00
Eli Cohen	e112373fd6	IPoIB/cm: Reduce connected mode TX object size Since IPoIB connected mode does not NETIF_F_SG, we only have one DMA mapping per send, so we don't need a mapping[] array. Define a new struct with a single u64 mapping member and use it for the CM tx_ring. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-14 23:48:52 -07:00
Eli Cohen	bd3606715e	IPoIB: Use dev_set_mtu() to change mtu When the driver sets the MTU of the net device outside of its change_mtu method, it should make use of dev_set_mtu() instead of directly setting the mtu field of struct netdevice. Otherwise functions registered to be called upon MTU change will not get called (this is done through call_netdevice_notifiers() in dev_set_mtu()). Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-14 23:48:51 -07:00
Eli Cohen	c8c2afe360	IPoIB: Use rtnl lock/unlock when changing device flags Use of this lock is required to synchronize changes to the netdvice's data structs. Also move the call to ipoib_flush_paths() after the modification of the netdevice flags in set_mode(). Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-14 23:48:51 -07:00
Roland Dreier	9eae554c17	IPoIB: Get rid of ipoib_mcast_detach() wrapper ipoib_mcast_detach() does nothing except call ib_detach_mcast(), so just use the core API in the one place that does a multicast group detach. add/remove: 0/1 grow/shrink: 0/1 up/down: 0/-105 (-105) function old new delta ipoib_mcast_leave 357 319 -38 ipoib_mcast_detach 67 - -67 Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-14 23:48:50 -07:00
Eli Cohen	d0de13622d	IPoIB: Only set Q_Key once: after joining broadcast group The current code will set the Q_Key for any join of a non-sendonly multicast group. The operation involves a modify QP operation, which is fairly heavyweight, and is only really required after the join of the broadcast group. Fix this by adding a parameter to ipoib_mcast_attach() to control when the Q_Key is set. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-14 23:48:50 -07:00
Eli Cohen	5892eff91a	IPoIB: Remove priv->mcast_mutex No need for a mutex around calls to ib_attach_mcast/ib_detach_mcast since these operations are synchronized at the HW driver layer. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-14 23:48:50 -07:00
Eli Cohen	c03d4731b5	IPoIB: Remove unused IPOIB_MCAST_STARTED code The IPOIB_MCAST_STARTED flag is not used at all since commit `b3e2749b` ("IPoIB: Don't drop multicast sends when they can be queued"), so remove it. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-14 23:48:50 -07:00
Moni Shoua	ee1e2c82c2	IPoIB: Refresh paths instead of flushing them on SM change events The patch tries to solve the problem of device going down and paths being flushed on an SM change event. The method is to mark the paths as candidates for refresh (by setting the new valid flag to 0), and wait for an ARP probe a new path record query. The solution requires a different and less intrusive handling of SM change event. For that, the second argument of the flush function changes its meaning from a boolean flag to a level. In most cases, SM failover doesn't cause LID change so traffic won't stop. In the rare cases of LID change, the remote host (the one that hadn't changed its LID) will lose connectivity until paths are refreshed. This is no worse than the current state. In fact, preventing the device from going down saves packets that otherwise would be lost. Signed-off-by: Moni Levy <monil@voltaire.com> Signed-off-by: Moni Shoua <monis@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-14 23:48:49 -07:00
Vladimir Sokolovsky	af40da894e	IPoIB: add LRO support Add "ipoib_use_lro" module parameter to enable LRO and an "ipoib_lro_max_aggr" module parameter to set the max number of packets to be aggregated. Make LRO controllable and LRO statistics accessible through ethtool. Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.co.il> Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-14 23:48:48 -07:00
Ron Livne	1240673405	IPoIB: Use multicast loopback blocking if available Set IB_QP_CREATE_BLOCK_MULTICAST_LOOPBACK for IPoIB's UD QPs if supported by the underlying device. This creates an improvement of up to 39% in bandwidth when sending multicast packets with IPoIB, and an improvment of 12% in cpu usage. Signed-off-by: Ron Livne <ronli@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-14 23:48:48 -07:00
Roland Dreier	a7d834c4bc	IPoIB/cm: Fix racy use of receive WR/SGL in ipoib_cm_post_receive_nonsrq() For devices that don't support SRQs, ipoib_cm_post_receive_nonsrq() is called from both ipoib_cm_handle_rx_wc() and ipoib_cm_nonsrq_init_rx(), and these two callers are not synchronized against each other. However, ipoib_cm_post_receive_nonsrq() always reuses the same receive work request and scatter list structures, so multiple callers can end up stepping on each other, which leads to posting garbled work requests. Fix this by having the caller pass in the ib_recv_wr and ib_sge structures to use, and allocating new local structures in ipoib_cm_nonsrq_init_rx(). Based on a patch by Pradeep Satyanarayana <pradeep@us.ibm.com> and David Wilder <dwilder@us.ibm.com>, with debugging help from Hoang-Nam Nguyen <hnguyen@de.ibm.com>. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-14 23:48:47 -07:00
Eli Cohen	f89271da32	IPoIB: Copy small received SKBs in connected mode The connected mode implementation in the IPoIB driver has a large overhead in the way SKBs are handled in the receive flow. It usually allocates an SKB with as big as was used in the currently received SKB and moves unused fragments from the old SKB to the new one. This involves a loop on all the remaining fragments and incurs overhead on the CPU. This patch, for small SKBs, allocates an SKB just large enough to contain the received data and copies to it the data from the received SKB. The newly allocated SKB is passed to the stack and the old SKB is reposted. When running netperf, UDP small messages, without this pach I get: UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 14.4.3.178 (14.4.3.178) port 0 AF_INET Socket Message Elapsed Messages Size Size Time Okay Errors Throughput bytes bytes secs # # 10^6bits/sec 114688 128 10.00 5142034 0 526.31 114688 10.00 1130489 115.71 With this patch I get both send and receive at ~315 mbps. The reason that send performance actually slows down is as follows: When using this patch, the overhead of the CPU for handling RX packets is dramatically reduced. As a result, we do not experience RNR NAK messages from the receiver which cause the connection to be closed and reopened again; when the patch is not used, the receiver cannot handle the packets fast enough so there is less time to post new buffers and hence the mentioned RNR NACKs. So what happens is that the application thinks it posted a certain number of packets for transmission but these packets are flushed and do not really get transmitted. Since the connection gets opened and closed many times, each time netperf gets the CPU time that otherwise would have been given to IPoIB to actually transmit the packets. This can be verified when looking at the port counters -- the output of ifconfig and the oputput of netperf (this is for the case without the patch): tx packets ========== port counter: 1,543,996 ifconfig: 1,581,426 netperf: 5,142,034 rx packets ========== netperf 1,1304,089 Signed-off-by: Eli Cohen <eli@mellanox.co.il>	2008-07-14 23:48:44 -07:00
Roland Dreier	f3781d2e89	RDMA: Remove subversion $Id tags They don't get updated by git and so they're worse than useless. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-14 23:48:44 -07:00
Roland Dreier	969a60f9db	IB/srp: Remove use of cached P_Key/GID queries The SRP initiator is currently using ib_find_cached_pkey() and ib_get_cached_gid() in situations where the uncached ib_find_pkey() and ib_query_gid() functions serve just as well: sleeping is allowed and performance is not an issue. Since we want to eliminate the cached operations in the long term, convert SRP to use the uncached variants. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2008-07-14 23:48:43 -07:00
Mike Christie	8e9a20cee4	[SCSI] libiscsi, iscsi_tcp, ib_iser: fix setting of can_queue with old tools. This patch fixes two bugs that are related. 1. Old tools did not set can_queue/cmds_max. This patch modifies libiscsi so that when we add the host we catch this and set it to the default. 2. iscsi_tcp thought that the scsi command that was passed to the eh functions needed a iscsi_cmd_task allocated for it. It only needed a mgmt task, and now it does not matter since it all comes from the same pool and libiscsi handles this for the drivers. ib_iser had copied iscsi_tcp's code and set can_queue to its max - 1 to handle this. So this patch removes the max -1, and just sets it to the max. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-07-12 08:22:29 -05:00
Mike Christie	913e5bf435	[SCSI] libiscsi, iser, tcp: remove recv_lock The recv lock was defined so the iscsi layer could block the recv path from processing IO during recovery. It turns out iser just set a lock to that pointer which was pointless. We now disconnect the transport connection before doing recovery so we do not need the recv lock. For iscsi_tcp we still stop the recv path incase older tools are being used. This patch also has iscsi_itt_to_ctask user grab the session lock and has the caller access the task with the lock or get a ref to it in case the target is broken and sends a tmf success response then sends data or a response for the command that was supposed to be affected bty the tmf. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-07-12 08:22:22 -05:00
Mike Christie	88dfd340b9	[SCSI] iscsi class: Add session initiatorname and ifacename sysfs attrs. This adds two new attrs used for creating initiator ports and binding sessions to hardware. The session level initiatorname: Since bnx2i does a scsi_host per host device, we need to add the iface initiator port settings on the session, so we can create multiple initiator ports (each with different inames) per device/scsi_host. The current iname reflects that qla4xxx can have one iname per hba, and we are allocating a host per session for software. The iname on the host will remain so we can export and set the hba level qla4xxx setting. The ifacename attr: To bind a session to a some peice of hardware in userspace we maintain some mappings, but during boot or iscsid restart (iscsid contains the user space part of the driver) we need to be able to figure out which of those host mappings abstractions maps to certain sessions. This patch adds a ifacename attr, which userspace can set to id the host side of the endpoint across pivot_roots and iscsid restarts. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-07-12 08:22:21 -05:00
Mike Christie	412eeafa0a	[SCSI] iser: Modify iser to take a iscsi_endpoint struct in ep callouts and session setup This hooks iser into the iscsi endpoint code. Previously it handled the lookup and allocation. This has been made generic so bnx2i and iser can share it. It also allows us to pass iser the leading conn's ep, so we know the ib_deivce being used and can set it as the scsi_host's parent. And that allows scsi-ml to set the dma_mask based on those values. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-07-12 08:22:21 -05:00
Mike Christie	7970634b81	[SCSI] iscsi class: user device_for_each_child instead of duplicating session list Currently we duplicate the list of sessions, because we were using the test for if a session was on the host list to indicate if the session was bound or unbound. We can instead use the target_id and fix up the class so that drivers like bnx2i do not have to manage the target id space. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-07-12 08:22:20 -05:00
Mike Christie	2261ec3d68	[SCSI] iser: handle iscsi_cmd_task rename This handles the iscsi_cmd_task rename and renames the iser cmd task to iser task. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-07-12 08:22:20 -05:00
Mike Christie	2747fdb257	[SCSI] iser: convert ib_iser to support merged tasks Convert ib_iser to support merged tasks. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-07-12 08:22:19 -05:00
Mike Christie	0af967f5d4	[SCSI] libiscsi, iscsi_tcp, iser: add session cmds array accessor Currently to get a ctask from the session cmd array, you have to know to use the itt modifier. To make this easier on LLDs and so in the future we can easilly kill the session array and use the host shared map instead, this patch adds a nice wrapper to strip the itt into a session->cmds index and return a ctask. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-07-12 08:22:18 -05:00
Mike Christie	b40977d95f	[SCSI] iser: fix handling of scsi cmnds during recovery. After the stop_conn callback has returned the LLD should not touch the scsi cmds. iscsi_tcp and libiscsi use the conn->recv_lock and suspend_rx field to halt recv path processing, but iser does not have any protection. This patch modifies iser so that userspace can just call the ep_disconnect callback, which will halt all recv IO, before calling the stop_conn callback so we do not have to worry about the conn->recv_lock and suspend rx field. iser just needs to stop the send side from accessing the ib conn. Fixup to handle when the ep poll fails and ep disconnect is called from Erez. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>	2008-07-12 08:22:17 -05:00

1 2 3 4 5 ...

383 Commits