Since commit 7361c36c52 (af_unix: Allow credentials to work across
user and pid namespaces) af_unix performance dropped a lot.
This is because we now take a reference on pid and cred in each write(),
and release them in read(), usually done from another process,
eventually from another cpu. This triggers false sharing.
# Events: 154K cycles
#
# Overhead Command Shared Object Symbol
# ........ ....... .................. .........................
#
10.40% hackbench [kernel.kallsyms] [k] put_pid
8.60% hackbench [kernel.kallsyms] [k] unix_stream_recvmsg
7.87% hackbench [kernel.kallsyms] [k] unix_stream_sendmsg
6.11% hackbench [kernel.kallsyms] [k] do_raw_spin_lock
4.95% hackbench [kernel.kallsyms] [k] unix_scm_to_skb
4.87% hackbench [kernel.kallsyms] [k] pid_nr_ns
4.34% hackbench [kernel.kallsyms] [k] cred_to_ucred
2.39% hackbench [kernel.kallsyms] [k] unix_destruct_scm
2.24% hackbench [kernel.kallsyms] [k] sub_preempt_count
1.75% hackbench [kernel.kallsyms] [k] fget_light
1.51% hackbench [kernel.kallsyms] [k]
__mutex_lock_interruptible_slowpath
1.42% hackbench [kernel.kallsyms] [k] sock_alloc_send_pskb
This patch includes SCM_CREDENTIALS information in a af_unix message/skb
only if requested by the sender, [man 7 unix for details how to include
ancillary data using sendmsg() system call]
Note: This might break buggy applications that expected SCM_CREDENTIAL
from an unaware write() system call, and receiver not using SO_PASSCRED
socket option.
If SOCK_PASSCRED is set on source or destination socket, we still
include credentials for mere write() syscalls.
Performance boost in hackbench : more than 50% gain on a 16 thread
machine (2 quad-core cpus, 2 threads per core)
hackbench 20 thread 2000
4.228 sec instead of 9.102 sec
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Most boards with SysKonnect/Marvell Ethernet have only a single port.
For the single port case, use the standard Ethernet driver convention
of allocating IRQ when device is brought up rather than at probe time.
This patch also adds some additional read after writes to avoid any
PCI posting problems when setting the IRQ mask.
The error handling of dual port cards is also changed. If second port
can not be brought up, then just fail. No point in continuing, since
the failure is most certainly because of out of memory.
It is worth noting that the dual port skge device has a single irq but two
seperate status rings and therefore has two NAPI objects, one for
each port.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This fix provides a newly flashed FW version (appended, in braces)
along with the currently running FW version via ethtool. The newly
flashed version runs only after a system reset.
Signed-off-by: Suresh Reddy <Suresh.Reddy@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Re-posting with subject fixed!
Multicast programming has been broken since commit 5b8821b7. Setting the
MULTICAST flag while sending the cmd to the FW was missing. Fixed this.
Also fixed-up some indentation in the adjacent lines.
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rename struct tcp_skb_cb "flags" to "tcp_flags" to ease code review and
maintenance.
Its content is a combination of FIN/SYN/RST/PSH/ACK/URG/ECE/CWR flags
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
struct tcp_skb_cb contains a "flags" field containing either tcp flags
or IP dsfield depending on context (input or output path)
Introduce ip_dsfield to make the difference clear and ease maintenance.
If later we want to save space, we can union flags/ip_dsfield
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch touchs most of the enic port profile handling code.
Tried to break it into sub patches without success.
The patch mainly does the following:
- Port profile operations for a SRIOV VF are modified to work
only via its PF
- Changes the port profile static struct in struct enic to a pointer.
This is because a SRIOV PF has to now hold the port profile information
for all its VF's
- Moved address registration for VF's during port profile ASSOCIATE time
- Most changes in port profile handling code are changes related to indexing
into the port profile struct array of a PF for the VF port profile
information
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: Sujith Sankar <ssujith@cisco.com>
Signed-off-by: Christian Benvenuti <benve@cisco.com>
Signed-off-by: David Wang <dwang2@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds helper functions to use PF as proxy for SRIOV VF firmware
commands.
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: Sujith Sankar <ssujith@cisco.com>
Signed-off-by: Christian Benvenuti <benve@cisco.com>
Signed-off-by: David Wang <dwang2@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support to enable SRIOV on enic devices. Enic SRIOV VF's are dynamic vnics and will use the same driver code as dynamic vnics.
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
Signed-off-by: Sujith Sankar <ssujith@cisco.com>
Signed-off-by: Christian Benvenuti <benve@cisco.com>
Signed-off-by: David Wang <dwang2@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While playing with a new ADSL box at home, I discovered that ECN
blackhole can trigger suboptimal quickack mode on linux : We send one
ACK for each incoming data frame, without any delay and eventual
piggyback.
This is because TCP_ECN_check_ce() considers that if no ECT is seen on a
segment, this is because this segment was a retransmit.
Refine this heuristic and apply it only if we seen ECT in a previous
segment, to detect ECN blackhole at IP level.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Jamal Hadi Salim <jhs@mojatatu.com>
CC: Jerry Chu <hkchu@google.com>
CC: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
CC: Jim Gettys <jg@freedesktop.org>
CC: Dave Taht <dave.taht@gmail.com>
Acked-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Most sky2 hardware only has a single port, although some variations of the
chip support two interfaces. For the single port case, use the standard
Ethernet driver convention of allocating IRQ when device is brought up
rather than at probe time.
Also, change the error handling of dual port cards so that if second
port can not be brought up, then just fail. No point in continuing, since
the failure is most certainly because of out of memory.
The dual port sky2 device has a single irq and a single status ring,
therefore it has a single NAPI object shared by both ports.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
netdev is unused in pch_gbe_setup_rctl. Remove this declaration to
avoid a compiler warning.
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Device drivers that create and destroy SR-IOV virtual functions via
calls to pci_enable_sriov() and pci_disable_sriov can cause catastrophic
failures if they attempt to destroy VFs while they are assigned to
guest virtual machines. By adding a flag for use by the Xen PCI back
to indicate that a device is assigned a device driver can check that
flag and avoid destroying VFs while they are assigned and avoid system
failures.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently ehea ndo_get_stats can sleep in two places, in a hcall
and in a GFP_KERNEL alloc, which is not correct.
This patch creates a delayed workqueue that grabs the information each 1
sec from the hardware, and place it into the device structure, so that,
.ndo_get_stats quickly returns the device structure statistics block.
Signed-off-by: Breno Leitao <brenohl@br.ibm.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit adds one step support to the phyter. When enabled, the
hardware does not provide time stamps for transmitted sync messages but
instead inserts the stamp into the outgoing packet.
Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
Signed-off-by: David S. Miller <davem@davemloft.net>
The IEEE 1588 standard (PTP) has a provision for a "one step" mode, where
time stamps on outgoing event packets are inserted into the packet by the
hardware on the fly. This patch adds a new flag for the SIOCSHWTSTAMP
ioctl that lets user space programs request this mode.
Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch enables six external event channels and one periodic output.
One GPIO is reserved for synchronizing multiple PHYs. The assignment
of GPIO functions can be changed via a module parameter.
The code supports multiple simultaneous events by inducing a PTP clock
event for every channel marked in the PHY's extended status word.
Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
Signed-off-by: David S. Miller <davem@davemloft.net>
Argument list to CDRP function has become unmanageably long. Fix it by properly
declaring a struct that encompasses all the input and output parameters.
Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ameen Rahman <ameen.rahman@qlogic.com>
Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The imx6q enet is a derivative of imx28 enet controller. It fixed
the frame endian issue found on imx28, and added 1 Gbps support.
It also fixes a typo on vendor name in Kconfig.
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
In function fec_enet_mii_init(), it uses non-zero pdev->id as part
of the condition to check the second fec instance (fec1). This works
before the driver supports device tree probe. But in case of device
tree probe, pdev->id is -1 which is also non-zero, so the logic becomes
broken when device tree probe gets supported.
The patch change the logic to check "pdev->id > 0" as the part of the
condition for identifying fec1.
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
FEC can work without a phy reset on some platforms, which means not
very platform necessarily have a phy-reset gpio encoded in device tree.
Even on the platforms that have the gpio, FEC can work without
resetting phy for some cases, e.g. boot loader has done that.
So it makes more sense to have the phy-reset-gpio request failure as
a debug message rather than a warning, and get fec_reset_phy() return
void since the caller does not check the return anyway.
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Finish conversion to unified ethtool ops: convert get_flags.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The SEEQ drivers should depend on HAS_IOMEM to prevent compile breakage
on !HAS_IOMEM architectures:
drivers/net/ethernet/seeq/seeq8005.c: In function 'seeq8005_probe1':
drivers/net/ethernet/seeq/seeq8005.c:179:2: error:
implicit declaration of function 'inw' [-Werror=implicit-function-declaration]
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reloading FW during resets can cause issues. Remove the full reset
as it is not needed.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add support for WOL as determined by the EEPROM.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This change is meant to avoid a hardware lockup when Tx work is still
pending and we request a reset.
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds support for configuring the priority to
traffic class mapping.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
We don't need SFP+ plugable support for X540 hardware (copper only) so
don't enable the SFP+ interrupts.
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The DCB CEE command set_state() will complete successfully
but is misleading because it enables IEEE mode. After
this patch the command is failed.
And IEEE PFC/ETS is managed from ieee paths now instead
of using CEE primitives.
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Use the PCI device flag indicating if a VF is assigned to a guest VM
to guard against destroying VFs upon driver removal. Implement
additional feature to detect if VFs already exist when the driver
is loaded and if so configure them and set the driver state to
SR-IOV enabled.
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Device drivers that create and destroy SR-IOV virtual functions via
calls to pci_enable_sriov() and pci_disable_sriov can cause catastrophic
failures if they attempt to destroy VFs while they are assigned to
guest virtual machines. By adding a flag for use by the KVM module
to indicate that a device is assigned a device driver can check that
flag and avoid destroying VFs while they are assigned and avoid system
failures.
CC: Ian Campbell <ijc@hellion.org.uk>
CC: Konrad Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Shreyas Bhatewara <sbhatewara@vmware.com>
Cc: "VMware, Inc." <pv-drivers@vmware.com>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: virtualization@lists.linux-foundation.org
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Francois Romieu <romieu@fr.zoreil.com>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: David Dillow <dave@thedillows.org>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Alexander Indenbaum <baum@tehutinetworks.net>
Cc: Andy Gospodarek <andy@greyhouse.net>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Stephen Hemminger <shemminger@linux-foundation.org>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Stephen Hemminger <shemminger@linux-foundation.org>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Solarflare linux maintainers <linux-net-drivers@solarflare.com>
Cc: Steve Hodgson <shodgson@solarflare.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Jon Mason <jdmason@kudzu.us>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
- fix features : jumbo frames and checksumming can not be used at the
same time.
- introduce hw_jumbo_{enable / disable} helpers. Their content has been
creatively extracted from Realtek's own drivers. As an illustration,
it would be nice to know how/if the MaxTxPacketSize register operates
when the device can work with a 9k jumbo frame as its documentation
(8168c) can not be applied beyond ~7k.
- rtl_tx_performance_tweak is moved forward. No change.
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>