Commit ce6e74f2 ("RDMA/nes: Make nesadapter->phy_lock usage
consistent") introduced a problem where phy_lock was only unlocked
within an if statement and so nes_process_mac_intr() could return with
phy_lock still held. Fix this.
This was discovered because of the sparse warning:
drivers/infiniband/hw/nes/nes_hw.c:2643:9: warning: context imbalance in 'nes_process_mac_intr' - different lock contexts for basic block
Reported-by: Roland Dreier <rdreier@cisco.com>
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Under abnormal termination, modify_qp() closes the QP, and async event
(AE) handling also attempts to close the same QP, causing a crash.
Fix this by checking the state of the QP before processing the AE.
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Enhance ethtool to read hardware registers for rcv/tx error stats.
Also add support for free pbl resources. Remove cq depth stats, which
are not used.
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
- wrap cq->cqidx_inc based on cq size.
- optimize t4_arm_cq logic.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
1) save the timestamp flit in the cq when we consume a CQE.
2) always compare the saved flit with the previous entry flit when
reading the next CQE entry. If the flits don't compare, then we
have overflowed.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
We need 1 extra entry for the status page and 1 to always have 1 free
entry to detect when the queue is full.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The LLD now supports proper UP state change events, so move the RDMA
provider registration to UP path.
This fixes a crash when loading iw_cxgb4 _after_ the NFS/RDMA
transport is up and running.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
In the RDMA core unregister path, kernel users will be calling down
into the T4 provider to release resources. So we cannot detach from
the LLD until this process completes.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The ib_qib driver is taking over support for QLogic PCIe QLE devices,
so remove support for them from ib_ipath. The ib_ipath driver now
supports only the obsolete QLogic Hyper-Transport IB host channel
adapter (model QHT7140).
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Add a low-level IB driver for QLogic PCIe adapters.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Add a new parameter to ib_register_device() so that low-level device
drivers can pass in a pointer to a callback function that will be
called for each port that is registered in sysfs. This allows
low-level device drivers to create files in
/sys/class/infiniband/<hca>/ports/<N>/
without having to poke through the internals of the RDMA sysfs handling.
There is no need for an unregister function since the kobject
reference will go to zero when ib_unregister_device() is called.
Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (44 commits)
vlynq: make whole Kconfig-menu dependant on architecture
add descriptive comment for TIF_MEMDIE task flag declaration.
EEPROM: max6875: Header file cleanup
EEPROM: 93cx6: Header file cleanup
EEPROM: Header file cleanup
agp: use NULL instead of 0 when pointer is needed
rtc-v3020: make bitfield unsigned
PCI: make bitfield unsigned
jbd2: use NULL instead of 0 when pointer is needed
cciss: fix shadows sparse warning
doc: inode uses a mutex instead of a semaphore.
uml: i386: Avoid redefinition of NR_syscalls
fix "seperate" typos in comments
cocbalt_lcdfb: correct sections
doc: Change urls for sparse
Powerpc: wii: Fix typo in comment
i2o: cleanup some exit paths
Documentation/: it's -> its where appropriate
UML: Fix compiler warning due to missing task_struct declaration
UML: add kernel.h include to signal.c
...
Use kmemdup when some other buffer is immediately copied into the
allocated region.
A simplified version of the semantic patch that makes this change is as
follows: (http://coccinelle.lip6.fr/)
// <smpl>
@@
expression from,to,size,flag;
statement S;
@@
- to = \(kmalloc\|kzalloc\)(size,flag);
+ to = kmemdup(from,size,flag);
if (to==NULL || ...) S
- memcpy(to, from, size);
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
We shouldn't free things here because we free them later.
The call tree looks like this:
iser_connect() ==> initiating the connection establishment
and later
iser_cma_handler() => iser_route_handler() => iser_create_ib_conn_res()
if we fail here, eventually iser_conn_release() is called, resulting
in a double free.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The iser connection teardown flow isn't over until the underlying
Connection Manager (e.g the IB CM) delivers a disconnected or timeout
event through the RDMA-CM. When the remote (target) side isn't
reachable, e.g when some HW e.g port/hca/switch isn't functioning or
taken down administratively, the CM timeout flow is used and the event
may be generated only after relatively long time -- on the order of
tens of seconds.
The current iser code exposes this possibly long delay to higher
layers, specifically to the iscsid daemon and iscsi kernel stack. As a
result, the iscsi stack doesn't respond well: this low-level CM delay
is added to the fail-over time under HA schemes such as the one
provided by DM multipath through the multipathd(8) service.
This patch enhances the reference counting scheme on iser's IB
connections so that the disconnect flow initiated by iscsid from user
space (ep_disconnect) doesn't wait for the CM to deliver the
disconnect/timeout event. (The connection teardown isn't done from
iser's view point until the event is delivered)
The iser ib (rdma) connection object is destroyed when its reference
count reaches zero. When this happens on the RDMA-CM callback
context, extra care is taken so that the RDMA-CM does the actual
destroying of the associated ID, since doing it in the callback is
prohibited.
The reference count of iser ib connection normally reaches three,
where the <ref, deref> relations are
1. conn <init, terminate>
2. conn <bind, stop/destroy>
3. cma id <create, disconnect/error/timeout callbacks>
With this patch, multipath fail-over time is about 30 seconds, while
without this patch, multipath fail-over time is about 130 seconds.
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The iscsi connection object life cycle includes binding and unbinding
(conn_stop) to/from the iscsi transport connection object. Since
iscsi connection objects are recycled, at the time the transport
connection (e.g iser's IB connection) is released, it is not valid to
touch the iscsi connection tied to the transport back-pointer since it
may already point to a different transport connection.
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Add handler to handle events such as port up and down. This is useful
when testing high-availability schemes such as multi-pathing.
Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Add support for masked atomic operations (masked compare and swap,
masked fetch and add).
Signed-off-by: Vladimir Sokolovsky <vlad@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Randomize local port allocation in the way sctp_get_port_local() does.
Update rover at the end of loop since we're likely to pick a valid port
on the first try.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
This allows the compiler to do a bit better; on my x86-64 build:
add/remove: 0/2 grow/shrink: 1/0 up/down: 2288/-2365 (-77)
function old new delta
nes_init_phy 273 2561 +2288
nes_init_1g_phy 469 - -469
nes_init_2025_phy 1896 - -1896
Signed-off-by: Roland Dreier <rolandd@cisco.com>
nes_{read,write}_1G_phy_reg() are using phy_lock while
nes_{read,write}_10G_phy_reg() leave that to the caller.
Remove phy_lock from 1G routines and leave the locking to the caller.
Add additional phy_lock calls around 1G read/write.
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Add an RDMA/iWARP driver for Chelsio T4 Ethernet adapters.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The DMA API is preferred; no functional change.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The DMA API is preferred; no functional change.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The low level cxgb3 driver can return NET_XMIT_CN and friends.
The iw_cxgb3 driver should _not_ treat these as errors.
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
The DMA API is preferred; no functional change.
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Several RDMA user-access drivers have file_operations structures with
no .llseek method set. None of the drivers actually do anything with
f_pos, so this means llseek is essentially a NOP, instead of returning
an error as leaving other file_operations methods unimplemented would
do. This is mostly harmless, except that a NULL .llseek means that
default_llseek() is used, and this function grabs the BKL, which we
would like to avoid.
Since llseek does nothing useful on these files, we would like it to
return an error to userspace instead of silently grabbing the BKL and
succeeding. For nearly all of the file types, we take the
belt-and-suspenders approach of setting the .llseek method to
no_llseek and also calling nonseekable_open(); the exception is the
uverbs_event files, which are created with anon_inode_getfile(), which
already sets f_mode the same way as nonseekable_open() would.
This work is motivated by Arnd Bergmann's bkl-removal tree.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
IB/mlx4: Check correct variable for allocation failure
RDMA/nes: Correct cap.max_inline_data assignment in nes_query_qp()
RDMA/cm: Set num_paths when manually assigning path records
IB/cm: Fix device_create() return value check
The intent here is to check the "mfrpl->mapped_page_list" allocation.
We checked "mfrpl->ibfrpl.page_list" earlier.
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
cap.max_inline_data is incorrectly set in init_attr instead of attr.
Set it in attr so subsequent init_attr.cap assignment will get the
correct value.
Signed-off-by: Chien Tung <chien.tin.tung@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
When manually assigning the path records to use for a connection, save
the number of paths that were set. Otherwise, checks against num_path
will show 0, even though path record data is available.
This was discovered by manually setting the path records from user
space, then querying the kernel to see if the correct path records
were assigned, only to discover that the kernel returned 0 path
records to the query.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the followings.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6:
driver core: numa: fix BUILD_BUG_ON for node_read_distance
driver-core: document ERR_PTR() return values
kobject: documentation: Update to refer to kset-example.c.
sysdev: the cpu probe/release attributes should be sysdev_class_attributes
kobject: documentation: Fix erroneous example in kobject doc.
driver-core: fix missing kernel-doc in firmware_class
Driver core: Early platform kernel-doc update
sysfs: fix sysfs lockdep warning in mlx4 code
sysfs: fix sysfs lockdep warning in infiniband code
sysfs: fix sysfs lockdep warning in ipmi code
sysfs: Initialised pci bus legacy_mem field before use
sysfs: use sysfs_bin_attr_init in firmware class driver
This fixes a sysfs lockdep warning in the infiniband code.
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (69 commits)
[SCSI] scsi_transport_fc: Fix synchronization issue while deleting vport
[SCSI] bfa: Update the driver version to 2.1.2.1.
[SCSI] bfa: Remove unused header files and did some cleanup.
[SCSI] bfa: Handle SCSI IO underrun case.
[SCSI] bfa: FCS and include file changes.
[SCSI] bfa: Modified the portstats get/clear logic
[SCSI] bfa: Replace bfa_get_attr() with specific APIs
[SCSI] bfa: New portlog entries for events (FIP/FLOGI/FDISC/LOGO).
[SCSI] bfa: Rename pport to fcport in BFA FCS.
[SCSI] bfa: IOC fixes, check for IOC down condition.
[SCSI] bfa: In MSIX mode, ignore spurious RME interrupts when FCoE ports are in FW mismatch state.
[SCSI] bfa: Fix Command Queue (CPE) full condition check and ack CPE interrupt.
[SCSI] bfa: IOC recovery fix in fcmode.
[SCSI] bfa: AEN and byte alignment fixes.
[SCSI] bfa: Introduce a link notification state machine.
[SCSI] bfa: Added firmware save clear feature for BFA driver.
[SCSI] bfa: FCS authentication related changes.
[SCSI] bfa: PCI VPD, FIP and include file changes.
[SCSI] bfa: Fix to copy fpma MAC when requested by user space application.
[SCSI] bfa: RPORT state machine: direct attach mode fix.
...