On error path kfree() should get pointer to memory allocated by
kmalloc() not the address of variable holding it (which is on stack).
Signed-off-by: Mariusz Kozlowski <mk@lab.zgora.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
We should reduce the number of reserved completion queues from the total
number of entries. Since the queue size is power of two, not reducing the
reserved entries, caused a double queue size, which may lead to allocation
failures in some cases.
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
In case of allocation failure, tried to use the promiscuous QP
entry that was previously freed.
Now freeing this entry only in case we will not put it back to the list
of promiscuous entries.
Reported-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (30 commits)
xfrm: Restrict extended sequence numbers to esp
xfrm: Check for esn buffer len in xfrm_new_ae
xfrm: Assign esn pointers when cloning a state
xfrm: Move the test on replay window size into the replay check functions
netdev: bfin_mac: document TE setting in RMII modes
drivers net: Fix declaration ordering in inline functions.
cxgb3: Apply interrupt coalescing settings to all queues
net: Always allocate at least 16 skb frags regardless of page size
ipv4: Don't ip_rt_put() an error pointer in RAW sockets.
net: fix ethtool->set_flags not intended -EINVAL return value
mlx4_en: Fix loss of promiscuity
tg3: Fix inline keyword usage
tg3: use <linux/io.h> and <linux/uaccess.h> instead <asm/io.h> and <asm/uaccess.h>
net: use CHECKSUM_NONE instead of magic number
Net / jme: Do not use legacy PCI power management
myri10ge: small rx_done refactoring
bridge: notify applications if address of bridge device changes
ipv4: Fix IP timestamp option (IPOPT_TS_PRESPEC) handling in ip_options_echo()
can: c_can: Fix tx_bytes accounting
can: c_can_platform: fix irq check in probe
...
The mlx4_en driver uses the combination stop_port/start_port
in a number of places. Unfortunately that causes any promiscuous
mode settings on the hardware to be lost.
This patch fixes that problem.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (56 commits)
route: Take the right src and dst addresses in ip_route_newports
ipv4: Fix nexthop caching wrt. scoping.
ipv4: Invalidate nexthop cache nh_saddr more correctly.
net: fix pch_gbe section mismatch warning
ipv4: fix fib metrics
mlx4_en: Removing HW info from ethtool -i report.
net_sched: fix THROTTLED/RUNNING race
drivers/net/a2065.c: Convert release_resource to release_region/release_mem_region
drivers/net/ariadne.c: Convert release_resource to release_region/release_mem_region
bonding: fix rx_handler locking
myri10ge: fix rmmod crash
mlx4_en: updated driver version to 1.5.4.1
mlx4_en: Using blue flame support
mlx4_core: reserve UARs for userspace consumers
mlx4_core: maintain available field in bitmap allocator
mlx4: Add blue flame support for kernel consumers
mlx4_en: Enabling new steering
mlx4: Add support for promiscuous mode in the new steering model.
mlx4: generalization of multicast steering.
mlx4_en: Reporting HW revision in ethtool -i
...
Avoiding abuse of ethtool_drvinfo.driver field.
HW specific info can be retrieved using lspci.
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Doorbell is used according to usage of BlueFlame.
For Blue Flame to work in Ethernet mode QP number should have 0
at bits 6,7.
Allocating range of QPs accordingly.
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Do not allow a kernel consumer to allocate a UAR to serve for blue flame if the
number of available UARs gets below MLX4_NUM_RESERVED_UARS (currently 8). This
will allow userspace apps to open a device file and run things like
ibv_devinfo.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add mlx4_bitmap_avail() to give the number of available resources. We want to
use this as a hint to whether to allocate a resources or not. This patch is
introduced to be used with allocation blue flame registers.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Using blue flame can improve latency by allowing the HW to more efficiently
access the WQE. This patch presents two functions that are used to allocate or
release HW resources for using blue flame; the caller need to supply a struct
mlx4_bf object when allocating resources. Consumers that make use of this API
should post doorbells to the UAR object pointed by the initialized struct
mlx4_bf;
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
The mlx4_en module now uses the new steering mechanism.
The RX packets are now steered through the MCG table instead
of Mac table for unicast, and default entry for multicast.
The feature is enabled through INIT_HCA
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
For Ethernet mode only,
When we want to register QP as promiscuous, it must be added to all the
existing steering entries and also to the default one.
The promiscuous QP might also be on of "real" QPs,
which means we need to monitor every entry to avoid duplicates and ensure
we close an entry when all it has is promiscuous QPs.
Same mechanism both for unicast and multicast.
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
The same packet steering mechanism would be used both for IB and Ethernet,
Both multicasts and unicasts.
This commit prepares the general infrastructure for this.
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
HW revision is derived from device ID and rev id.
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il>
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver queries the FW for WOL support.
Ethtool get/set_wol is implemented accordingly.
Only magic packets are supported at the time.
Signed-off-by: Igor Yarovinsky <igory@mellanox.co.il>
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Each RX ring will have its own interrupt vector, and TX rings will share one
(we mostly use polling for TX completions).
The vectors are assigned first time device is opened, and its name includes
the interface name and ring number.
Signed-off-by: Markuze Alex <markuze@mellanox.co.il>
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adding a pool of MSI-X vectors and EQs that can be used explicitly by mlx4_core
customers (mlx4_ib, mlx4_en). The consumers will assign their own names to the
interrupt vectors. Those vectors are not opened at mlx4 device initialization,
opened by demand.
Changed the max number of possible EQs according to the new scheme, no longer relies on
on number of cores.
The new functionality is exposed through mlx4_assign_eq() and mlx4_release_eq().
Customers that do not use the new API will get completion vectors as before.
Signed-off-by: Markuze Alex <markuze@mellanox.co.il>
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of reseting the module parameters each ifup or mtu change,
they are being set once at device initialization
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
By default, each device is assumed to be able only handle 64 KB chunks
during DMA. By giving the segment size a larger value, the block layer
will coalesce more S/G entries together for SRP, allowing larger
requests with the same sg_tablesize setting. The block layer is the
only direct user of it, though a few IOMMU drivers reference it as
well for their *_map_sg coalescing code. pci-gart_64 on x86, and a
smattering on on sparc, powerpc, and ia64.
Since other IB protocols could potentially see larger segments with
this, let's check those:
- iSER is fine, because you limit your maximum request size to 512
KB, so we'll never overrun the page vector in struct iser_page_vec
(128 entries currently). It is independent of the DMA segment size,
and handles multi-page segments already.
- IPoIB is fine, as it maps each page individually, and doesn't use
ib_dma_map_sg().
- RDS appears to do the right thing and has no dependencies on DMA
segment size, but I don't claim to have done a complete audit.
- NFSoRDMA and 9p are OK -- they do not use ib_dma_map_sg(), so they
doesn't care about the coalescing.
- Lustre's ko2iblnd does not care about coalescing -- it properly
walks the returned sg list.
This patch ups the value on Mellanox hardware to 1 GB, which matches
reported firmware limits on mlx4.
Signed-off-by: David Dillow <dillowda@ornl.gov>
Signed-off-by: Roland Dreier <roland@purestorage.com>
The newest device firmware stores IB vs. Ethernet protocol in two bits
in members_count field of multicast group table (0: Infiniband, 1:
Ethernet). When changing the QP members count for a multicast group,
it important not to reset this information. When calling multicast
attach first time, the protocol type should be specified. In this
patch we always set it IB, but in the future we will handle Ethernet
too. When looking for a QP, the protocol type shoud be checked too.
Signed-off-by: Aleksey Senin <alekseys@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Some systems have PCI addresses that don't fit in unsigned long (eg some
32-bit PowerPC 440 systems have 36-bit bus addresses). Fix up mlx4 drivers
by using phys_addr_t where appropriate, so we don't truncate any PCI
resource addresses before ioremapping them.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (67 commits)
cxgb4vf: recover from failure in cxgb4vf_open()
netfilter: ebtables: make broute table work again
netfilter: fix race in conntrack between dump_table and destroy
ah: reload pointers to skb data after calling skb_cow_data()
ah: update maximum truncated ICV length
xfrm: check trunc_len in XFRMA_ALG_AUTH_TRUNC
ehea: Increase the skb array usage
net/fec: remove config FEC2 as it's used nowhere
pcnet_cs: add new_id
tcp: disallow bind() to reuse addr/port
net/r8169: Update the function of parsing firmware
net: ppp: use {get,put}_unaligned_be{16,32}
CAIF: Fix IPv6 support in receive path for GPRS/3G
arp: allow to invalidate specific ARP entries
net_sched: factorize qdisc stats handling
mlx4: Call alloc_etherdev to allocate RX and TX queues
net: Add alloc_netdev_mqs function
caif: don't set connection request param size before copying data
cxgb4vf: fix mailbox data/control coherency domain race
qlcnic: change module parameter permissions
...
The kernel warning message added in commit 58d74bb1d9 ("mlx4_core:
Workaround firmware bug in query dev cap") about mlx4 reporting the
wrong number of "blue flame registers" doesn't really help anyone, since
the firmware bug is known and fixed and the bug is pretty much harmless
to users. So just get rid of the warning.
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Changed driver to call alloc_etherdev_mqs so that the number of TX
and RX queues can be set to correct values in the netdev device.
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
IB: Fix information leak in marshalling code
IB/pack: Remove some unused code added by the IBoE patches
IB/mlx4: Fix IBoE link state
IB/mlx4: Fix IBoE reported link rate
mlx4_core: Workaround firmware bug in query dev cap
IB/mlx4: Fix memory ordering of VLAN insertion control bits
MAINTAINERS: Update NetEffect entry
ConnectX firmware is supposed to report the number blue flame
registers per page as log2 of the value. However, due to a firmware
bug, it reports actual number. This patch works around this by
checking if the number of registers calculated fits within a page. If
it does not, we use 8 registers per page.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (63 commits)
IB/qib: clean up properly if pci_set_consistent_dma_mask() fails
IB/qib: Allow driver to load if PCIe AER fails
IB/qib: Fix uninitialized pointer if CONFIG_PCI_MSI not set
IB/qib: Fix extra log level in qib_early_err()
RDMA/cxgb4: Remove unnecessary KERN_<level> use
RDMA/cxgb3: Remove unnecessary KERN_<level> use
IB/core: Add link layer type information to sysfs
IB/mlx4: Add VLAN support for IBoE
IB/core: Add VLAN support for IBoE
IB/mlx4: Add support for IBoE
mlx4_en: Change multicast promiscuous mode to support IBoE
mlx4_core: Update data structures and constants for IBoE
mlx4_core: Allow protocol drivers to find corresponding interfaces
IB/uverbs: Return link layer type to userspace for query port operation
IB/srp: Sync buffer before posting send
IB/srp: Use list_first_entry()
IB/srp: Reduce number of BUSY conditions
IB/srp: Eliminate two forward declarations
IB/mlx4: Signal node desc changes to SM by using FW to generate trap 144
IB: Replace EXTRA_CFLAGS with ccflags-y
...
When searching for a free entry in either mlx4_register_vlan() or
mlx4_register_mac(), and there is no free entry, the loop terminates without
updating the local variable free thus causing out of array bounds access. Fix
this by adding a proper check outside the loop.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch allows IBoE traffic to be encapsulated in 802.1Q tagged
VLAN frames. The VLAN tag is encoded in the GID and derived from it
by a simple computation.
The netdev notifier callback is modified to catch VLAN device
addition/removal and the port's GID table is updated to reflect the
change, so that for each netdevice there is an entry in the GID table.
When the port's GID table is exhausted, GID entries will not be added.
Only children of the main interfaces can add to the GID table; if a
VLAN interface is added on another VLAN interface (e.g. "vconfig add
eth2.6 8"), then that interfaces will not add an entry to the GID
table.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Change multicast promiscuous mode to pass packets through the multicast group distribution table
before sending packets that miss to the default multicast QP.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Add fields to hardware data structures and add new constants required for IBoE
support.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Add a mechanism for mlx4 protocol drivers to get a pointer to other
drivers's device objects. For this, an exported function,
mlx4_get_protocol_dev() is added, which allows a driver to get some
other driver's device based on the protocol that the driver
implements. Two protocols are added: MLX4_PROTOCOL_IB and
MLX4_PROTOCOL_EN.
This will be used in mlx4 IBoE support so that mlx4_ib can find the
corresponding mlx4_en netdev.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
[ Clean up and rename a few things. - Roland ]
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Make the following function static: mlx4_UNMAP_ICM
Remove the following unused function: mlx4_MAP_ICM_page
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Many (but not all) drivers check to see whether there is a vlan
group configured before using a tag stored in the skb. There's
not much point in this check since it just throws away data that
should only be present in the expected circumstances. However,
it will soon be legal and expected to get a vlan tag when no
vlan group is configured, so remove this check from all drivers
to avoid dropping the tags.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As there are now machines with a lot more physical memory, we need to
be able to register more memory. This patch lifts the upper limit of
log_mtts_per_seg from 5 to 7, increasing the amount of memory that can
be registered.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Change "return (EXPR);" to "return EXPR;"
return is not a function, parentheses are not required.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The values didn't match the title after removing the LRO
statistics in commit fa37a9586f
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
If failed to get skb frags using napi_get_frags(),
the packet is dropped.
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace loops calling set_bit() and clear_bit() with bitmap_set() and
bitmap_clear().
Unlike loops calling set_bit() and clear_bit(), bitmap_set() and
bitmap_clear() are not atomic. But this is ok.
Because the bitmap operations are protected by bitmap->lock
except for initialization of the bitmap in mlx4_bitmap_init().
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Roland Dreier <rolandd@cisco.com>
Cc: netdev@vger.kernel.org
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
drivers/net/mlx4/en_rx.c: In function ‘mlx4_en_create_rx_ring’:
drivers/net/mlx4/en_rx.c:305: warning: label ‘err_map’ defined but not used
Signed-off-by: David S. Miller <davem@davemloft.net>