Including fixes from bpf and netfilter.

Current release - regressions:
 
   - gro: initialize network_offset in network layer
 
   - tcp: reduce accepted window in NEW_SYN_RECV state
 
 Current release - new code bugs:
 
   - eth: mlx5e: do not use ptp structure for tx ts stats when not initialized
 
   - eth: ice: check for unregistering correct number of devlink params
 
 Previous releases - regressions:
 
   - bpf: Allow delete from sockmap/sockhash only if update is allowed
 
   - sched: taprio: extend minimum interval restriction to entire cycle too
 
   - netfilter: ipset: add list flush to cancel_gc
 
   - ipv4: fix address dump when IPv4 is disabled on an interface
 
   - sock_map: avoid race between sock_map_close and sk_psock_put
 
   - eth: mlx5: use mlx5_ipsec_rx_status_destroy to correctly delete status rules
 
 Previous releases - always broken:
 
   - core: fix __dst_negative_advice() race
 
   - bpf:
     - fix multi-uprobe PID filtering logic
     - fix pkt_type override upon netkit pass verdict
 
   - netfilter: tproxy: bail out if IP has been disabled on the device
 
   - af_unix: annotate data-race around unix_sk(sk)->addr
 
   - eth: mlx5e: fix UDP GSO for encapsulated packets
 
   - eth: idpf: don't enable NAPI and interrupts prior to allocating Rx buffers
 
   - eth: i40e: fully suspend and resume IO operations in EEH case
 
   - eth: octeontx2-pf: free send queue buffers incase of leaf to inner
 
   - eth: ipvlan: dont Use skb->sk in ipvlan_process_v{4,6}_outbound
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmZYaP0SHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOk5+QP/3wc2ktY/whZvLyJyM6NsVl1DYohnjua
 H05bveXgUMd4NNxEfQ31IMGCct6d2fe+fAIJrefxdjxbjyY38SY5xd1zpXLQDxqB
 ks6T9vZ4ITgwpqWT5Z1XafIgV/bYlf42+GHUIPuFFlBisoUqkAm7Wzw/T+Ap3rVX
 7Y2p7ulvdh85GyMGsAi5Bz9EkyiSQUsMvbtGOA9a9WopIyqoxTgV5Unk1L/FXlEU
 ZO8L7hrwZKWL1UDlaqnfESD9DBEbNc85WRoagFM4EdHl8vTwxwvTQ6+SDMtLO8jW
 8DSeb9CCin/VagqPhrylj5u72QGz+i7gDUMZIZVU6mHJc8WB13tIflOq0qKLnfNE
 n63/4zu9kWCznb7IKqg99mo1+bDcg1fyZusih+aguCGNYEQ/yrAf5ll2OMfjmZWa
 FFOuaVoLmN0f6XMb4L38Wwd9obvC3EbpnNveco3lmTp+4kRk1H/Ox2UI2jaFbUnG
 Nim4LZD4iGXJh1qnnQ0xkTjrltFAvnY9zUwo2Yv7TUQOi0JAXxsZwXwY6UjsiNrC
 QWdKL5VcdI0N1Y1MrmpQQKpRE9Lu1dTvbIRvFtQHmWgV7gqwTmShoSARBL1IM+lp
 tm+jfZOmznjYTaVnc1xnBCaIqs925gvnkniZpzru53xb5UegenadNXvQtYlaAokJ
 j13QKA6NrZVI
 =xkIZ
 -----END PGP SIGNATURE-----

Merge tag 'net-6.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from bpf and netfilter.

  Current release - regressions:

   - gro: initialize network_offset in network layer

   - tcp: reduce accepted window in NEW_SYN_RECV state

  Current release - new code bugs:

   - eth: mlx5e: do not use ptp structure for tx ts stats when not
     initialized

   - eth: ice: check for unregistering correct number of devlink params

  Previous releases - regressions:

   - bpf: Allow delete from sockmap/sockhash only if update is allowed

   - sched: taprio: extend minimum interval restriction to entire cycle
     too

   - netfilter: ipset: add list flush to cancel_gc

   - ipv4: fix address dump when IPv4 is disabled on an interface

   - sock_map: avoid race between sock_map_close and sk_psock_put

   - eth: mlx5: use mlx5_ipsec_rx_status_destroy to correctly delete
     status rules

  Previous releases - always broken:

   - core: fix __dst_negative_advice() race

   - bpf:
       - fix multi-uprobe PID filtering logic
       - fix pkt_type override upon netkit pass verdict

   - netfilter: tproxy: bail out if IP has been disabled on the device

   - af_unix: annotate data-race around unix_sk(sk)->addr

   - eth: mlx5e: fix UDP GSO for encapsulated packets

   - eth: idpf: don't enable NAPI and interrupts prior to allocating Rx
     buffers

   - eth: i40e: fully suspend and resume IO operations in EEH case

   - eth: octeontx2-pf: free send queue buffers incase of leaf to inner

   - eth: ipvlan: dont Use skb->sk in ipvlan_process_v{4,6}_outbound"

* tag 'net-6.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (69 commits)
  netdev: add qstat for csum complete
  ipvlan: Dont Use skb->sk in ipvlan_process_v{4,6}_outbound
  net: ena: Fix redundant device NUMA node override
  ice: check for unregistering correct number of devlink params
  ice: fix 200G PHY types to link speed mapping
  i40e: Fully suspend and resume IO operations in EEH case
  i40e: factoring out i40e_suspend/i40e_resume
  e1000e: move force SMBUS near the end of enable_ulp function
  net: dsa: microchip: fix RGMII error in KSZ DSA driver
  ipv4: correctly iterate over the target netns in inet_dump_ifaddr()
  net: fix __dst_negative_advice() race
  nfc/nci: Add the inconsistency check between the input data length and count
  MAINTAINERS: dwmac: starfive: update Maintainer
  net/sched: taprio: extend minimum interval restriction to entire cycle too
  net/sched: taprio: make q->picos_per_byte available to fill_sched_entry()
  netfilter: nft_fib: allow from forward/input without iif selector
  netfilter: tproxy: bail out if IP has been disabled on the device
  netfilter: nft_payload: skbuff vlan metadata mangle support
  net: ti: icssg-prueth: Fix start counter for ft1 filter
  sock_map: avoid race between sock_map_close and sk_psock_put
  ...
This commit is contained in:
Linus Torvalds 2024-05-30 08:33:04 -07:00
commit d8ec19857b
78 changed files with 1179 additions and 381 deletions

View File

@ -24,6 +24,7 @@ properties:
managers:
type: object
additionalProperties: false
description:
List of the PD69208T4/PD69204T4/PD69208M PSE managers. Each manager
have 4 or 8 physical ports according to the chip version. No need to
@ -47,8 +48,9 @@ properties:
- "#size-cells"
patternProperties:
"^manager@0[0-9a-b]$":
"^manager@[0-9a-b]$":
type: object
additionalProperties: false
description:
PD69208T4/PD69204T4/PD69208M PSE manager exposing 4 or 8 physical
ports.
@ -69,9 +71,14 @@ properties:
patternProperties:
'^port@[0-7]$':
type: object
additionalProperties: false
properties:
reg:
maxItems: 1
required:
- reg
additionalProperties: false
required:
- reg

View File

@ -29,13 +29,31 @@ properties:
of the ports conversion matrix that establishes relationship between
the logical ports and the physical channels.
type: object
additionalProperties: false
properties:
"#address-cells":
const: 1
"#size-cells":
const: 0
patternProperties:
'^channel@[0-7]$':
type: object
additionalProperties: false
properties:
reg:
maxItems: 1
required:
- reg
required:
- "#address-cells"
- "#size-cells"
unevaluatedProperties: false
required:

View File

@ -349,6 +349,10 @@ attribute-sets:
Number of packets dropped due to transient lack of resources, such as
buffer space, host descriptors etc.
type: uint
-
name: rx-csum-complete
doc: Number of packets that were marked as CHECKSUM_COMPLETE.
type: uint
-
name: rx-csum-unnecessary
doc: Number of packets that were marked as CHECKSUM_UNNECESSARY.

View File

@ -227,7 +227,7 @@ preferably including links to previous postings, for example::
The amount of mooing will depend on packet rate so should match
the diurnal cycle quite well.
Signed-of-by: Joe Defarmer <joe@barn.org>
Signed-off-by: Joe Defarmer <joe@barn.org>
---
v3:
- add a note about time-of-day mooing fluctuation to the commit message

View File

@ -3854,6 +3854,7 @@ BPF JIT for ARM64
M: Daniel Borkmann <daniel@iogearbox.net>
M: Alexei Starovoitov <ast@kernel.org>
M: Puranjay Mohan <puranjay@kernel.org>
R: Xu Kuohai <xukuohai@huaweicloud.com>
L: bpf@vger.kernel.org
S: Supported
F: arch/arm64/net/
@ -21316,7 +21317,7 @@ F: arch/riscv/boot/dts/starfive/
STARFIVE DWMAC GLUE LAYER
M: Emil Renner Berthing <kernel@esmil.dk>
M: Samin Guo <samin.guo@starfivetech.com>
M: Minda Chen <minda.chen@starfivetech.com>
S: Maintained
F: Documentation/devicetree/bindings/net/starfive,jh7110-dwmac.yaml
F: drivers/net/ethernet/stmicro/stmmac/dwmac-starfive.c

View File

@ -39,7 +39,7 @@
/************** Functions that the back-end must provide **************/
/* Extension for 32-bit operations. */
inline u8 zext(u8 *buf, u8 rd);
u8 zext(u8 *buf, u8 rd);
/***** Moves *****/
u8 mov_r32(u8 *buf, u8 rd, u8 rs, u8 sign_ext);
u8 mov_r32_i32(u8 *buf, u8 reg, s32 imm);

View File

@ -62,7 +62,7 @@ enum {
* If/when we decide to add ARCv2 instructions that do use register pairs,
* the mapping, hopefully, doesn't need to be revisited.
*/
const u8 bpf2arc[][2] = {
static const u8 bpf2arc[][2] = {
/* Return value from in-kernel function, and exit value from eBPF */
[BPF_REG_0] = {ARC_R_8, ARC_R_9},
/* Arguments from eBPF program to in-kernel function */
@ -1302,7 +1302,7 @@ static u8 arc_b(u8 *buf, s32 offset)
/************* Packers (Deal with BPF_REGs) **************/
inline u8 zext(u8 *buf, u8 rd)
u8 zext(u8 *buf, u8 rd)
{
if (rd != BPF_REG_FP)
return arc_movi_r(buf, REG_HI(rd), 0);
@ -2235,6 +2235,7 @@ u8 gen_swap(u8 *buf, u8 rd, u8 size, u8 endian, bool force, bool do_zext)
break;
default:
/* The caller must have handled this. */
break;
}
} else {
/*
@ -2253,6 +2254,7 @@ u8 gen_swap(u8 *buf, u8 rd, u8 size, u8 endian, bool force, bool do_zext)
break;
default:
/* The caller must have handled this. */
break;
}
}
@ -2517,7 +2519,7 @@ u8 arc_epilogue(u8 *buf, u32 usage, u16 frame_size)
#define JCC64_NR_OF_JMPS 3 /* Number of jumps in jcc64 template. */
#define JCC64_INSNS_TO_END 3 /* Number of insn. inclusive the 2nd jmp to end. */
#define JCC64_SKIP_JMP 1 /* Index of the "skip" jump to "end". */
const struct {
static const struct {
/*
* "jit_off" is common between all "jmp[]" and is coupled with
* "cond" of each "jmp[]" instance. e.g.:
@ -2883,7 +2885,7 @@ u8 gen_jmp_64(u8 *buf, u8 rd, u8 rs, u8 cond, u32 curr_off, u32 targ_off)
* The "ARC_CC_SET" becomes "CC_unequal" because of the "tst"
* instruction that precedes the conditional branch.
*/
const u8 arcv2_32_jmps[ARC_CC_LAST] = {
static const u8 arcv2_32_jmps[ARC_CC_LAST] = {
[ARC_CC_UGT] = CC_great_u,
[ARC_CC_UGE] = CC_great_eq_u,
[ARC_CC_ULT] = CC_less_u,

View File

@ -159,7 +159,7 @@ static void jit_dump(const struct jit_context *ctx)
/* Initialise the context so there's no garbage. */
static int jit_ctx_init(struct jit_context *ctx, struct bpf_prog *prog)
{
memset(ctx, 0, sizeof(ctx));
memset(ctx, 0, sizeof(*ctx));
ctx->orig_prog = prog;
@ -167,7 +167,7 @@ static int jit_ctx_init(struct jit_context *ctx, struct bpf_prog *prog)
ctx->prog = bpf_jit_blind_constants(prog);
if (IS_ERR(ctx->prog))
return PTR_ERR(ctx->prog);
ctx->blinded = (ctx->prog == ctx->orig_prog ? false : true);
ctx->blinded = (ctx->prog != ctx->orig_prog);
/* If the verifier doesn't zero-extend, then we have to do it. */
ctx->do_zext = !ctx->prog->aux->verifier_zext;
@ -1182,12 +1182,12 @@ static int jit_prepare(struct jit_context *ctx)
}
/*
* All the "handle_*()" functions have been called before by the
* "jit_prepare()". If there was an error, we would know by now.
* Therefore, no extra error checking at this point, other than
* a sanity check at the end that expects the calculated length
* (jit.len) to be equal to the length of generated instructions
* (jit.index).
* jit_compile() is the real compilation phase. jit_prepare() is
* invoked before jit_compile() as a dry-run to make sure everything
* will go OK and allocate the necessary memory.
*
* In the end, jit_compile() checks if it has produced the same number
* of instructions as jit_prepare() would.
*/
static int jit_compile(struct jit_context *ctx)
{
@ -1407,9 +1407,9 @@ static struct bpf_prog *do_extra_pass(struct bpf_prog *prog)
/*
* This function may be invoked twice for the same stream of BPF
* instructions. The "extra pass" happens, when there are "call"s
* involved that their addresses are not known during the first
* invocation.
* instructions. The "extra pass" happens, when there are
* (re)locations involved that their addresses are not known
* during the first run.
*/
struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
{

View File

@ -3142,7 +3142,7 @@ phy_interface_t ksz_get_xmii(struct ksz_device *dev, int port, bool gbit)
else
interface = PHY_INTERFACE_MODE_MII;
} else if (val == bitval[P_RMII_SEL]) {
interface = PHY_INTERFACE_MODE_RGMII;
interface = PHY_INTERFACE_MODE_RMII;
} else {
interface = PHY_INTERFACE_MODE_RGMII;
if (data8 & P_RGMII_ID_EG_ENABLE)

View File

@ -312,7 +312,6 @@ static int ena_com_init_io_sq(struct ena_com_dev *ena_dev,
struct ena_com_io_sq *io_sq)
{
size_t size;
int dev_node = 0;
memset(&io_sq->desc_addr, 0x0, sizeof(io_sq->desc_addr));
@ -325,12 +324,9 @@ static int ena_com_init_io_sq(struct ena_com_dev *ena_dev,
size = io_sq->desc_entry_size * io_sq->q_depth;
if (io_sq->mem_queue_type == ENA_ADMIN_PLACEMENT_POLICY_HOST) {
dev_node = dev_to_node(ena_dev->dmadev);
set_dev_node(ena_dev->dmadev, ctx->numa_node);
io_sq->desc_addr.virt_addr =
dma_alloc_coherent(ena_dev->dmadev, size, &io_sq->desc_addr.phys_addr,
GFP_KERNEL);
set_dev_node(ena_dev->dmadev, dev_node);
if (!io_sq->desc_addr.virt_addr) {
io_sq->desc_addr.virt_addr =
dma_alloc_coherent(ena_dev->dmadev, size,
@ -354,10 +350,7 @@ static int ena_com_init_io_sq(struct ena_com_dev *ena_dev,
size = (size_t)io_sq->bounce_buf_ctrl.buffer_size *
io_sq->bounce_buf_ctrl.buffers_num;
dev_node = dev_to_node(ena_dev->dmadev);
set_dev_node(ena_dev->dmadev, ctx->numa_node);
io_sq->bounce_buf_ctrl.base_buffer = devm_kzalloc(ena_dev->dmadev, size, GFP_KERNEL);
set_dev_node(ena_dev->dmadev, dev_node);
if (!io_sq->bounce_buf_ctrl.base_buffer)
io_sq->bounce_buf_ctrl.base_buffer =
devm_kzalloc(ena_dev->dmadev, size, GFP_KERNEL);
@ -397,7 +390,6 @@ static int ena_com_init_io_cq(struct ena_com_dev *ena_dev,
struct ena_com_io_cq *io_cq)
{
size_t size;
int prev_node = 0;
memset(&io_cq->cdesc_addr, 0x0, sizeof(io_cq->cdesc_addr));
@ -409,11 +401,8 @@ static int ena_com_init_io_cq(struct ena_com_dev *ena_dev,
size = io_cq->cdesc_entry_size_in_bytes * io_cq->q_depth;
prev_node = dev_to_node(ena_dev->dmadev);
set_dev_node(ena_dev->dmadev, ctx->numa_node);
io_cq->cdesc_addr.virt_addr =
dma_alloc_coherent(ena_dev->dmadev, size, &io_cq->cdesc_addr.phys_addr, GFP_KERNEL);
set_dev_node(ena_dev->dmadev, prev_node);
if (!io_cq->cdesc_addr.virt_addr) {
io_cq->cdesc_addr.virt_addr =
dma_alloc_coherent(ena_dev->dmadev, size, &io_cq->cdesc_addr.phys_addr,

View File

@ -1117,18 +1117,30 @@ static int enic_set_vf_port(struct net_device *netdev, int vf,
pp->request = nla_get_u8(port[IFLA_PORT_REQUEST]);
if (port[IFLA_PORT_PROFILE]) {
if (nla_len(port[IFLA_PORT_PROFILE]) != PORT_PROFILE_MAX) {
memcpy(pp, &prev_pp, sizeof(*pp));
return -EINVAL;
}
pp->set |= ENIC_SET_NAME;
memcpy(pp->name, nla_data(port[IFLA_PORT_PROFILE]),
PORT_PROFILE_MAX);
}
if (port[IFLA_PORT_INSTANCE_UUID]) {
if (nla_len(port[IFLA_PORT_INSTANCE_UUID]) != PORT_UUID_MAX) {
memcpy(pp, &prev_pp, sizeof(*pp));
return -EINVAL;
}
pp->set |= ENIC_SET_INSTANCE;
memcpy(pp->instance_uuid,
nla_data(port[IFLA_PORT_INSTANCE_UUID]), PORT_UUID_MAX);
}
if (port[IFLA_PORT_HOST_UUID]) {
if (nla_len(port[IFLA_PORT_HOST_UUID]) != PORT_UUID_MAX) {
memcpy(pp, &prev_pp, sizeof(*pp));
return -EINVAL;
}
pp->set |= ENIC_SET_HOST;
memcpy(pp->host_uuid,
nla_data(port[IFLA_PORT_HOST_UUID]), PORT_UUID_MAX);

View File

@ -4130,6 +4130,14 @@ free_queue_mem:
return ret;
}
static void fec_enet_deinit(struct net_device *ndev)
{
struct fec_enet_private *fep = netdev_priv(ndev);
netif_napi_del(&fep->napi);
fec_enet_free_queue(ndev);
}
#ifdef CONFIG_OF
static int fec_reset_phy(struct platform_device *pdev)
{
@ -4524,6 +4532,7 @@ failed_register:
fec_enet_mii_remove(fep);
failed_mii_init:
failed_irq:
fec_enet_deinit(ndev);
failed_init:
fec_ptp_stop(pdev);
failed_reset:
@ -4587,6 +4596,7 @@ fec_drv_remove(struct platform_device *pdev)
pm_runtime_put_noidle(&pdev->dev);
pm_runtime_disable(&pdev->dev);
fec_enet_deinit(ndev);
free_netdev(ndev);
}

View File

@ -1225,6 +1225,28 @@ s32 e1000_enable_ulp_lpt_lp(struct e1000_hw *hw, bool to_sx)
}
release:
/* Switching PHY interface always returns MDI error
* so disable retry mechanism to avoid wasting time
*/
e1000e_disable_phy_retry(hw);
/* Force SMBus mode in PHY */
ret_val = e1000_read_phy_reg_hv_locked(hw, CV_SMB_CTRL, &phy_reg);
if (ret_val) {
e1000e_enable_phy_retry(hw);
hw->phy.ops.release(hw);
goto out;
}
phy_reg |= CV_SMB_CTRL_FORCE_SMBUS;
e1000_write_phy_reg_hv_locked(hw, CV_SMB_CTRL, phy_reg);
e1000e_enable_phy_retry(hw);
/* Force SMBus mode in MAC */
mac_reg = er32(CTRL_EXT);
mac_reg |= E1000_CTRL_EXT_FORCE_SMBUS;
ew32(CTRL_EXT, mac_reg);
hw->phy.ops.release(hw);
out:
if (ret_val)

View File

@ -6623,7 +6623,6 @@ static int __e1000_shutdown(struct pci_dev *pdev, bool runtime)
struct e1000_hw *hw = &adapter->hw;
u32 ctrl, ctrl_ext, rctl, status, wufc;
int retval = 0;
u16 smb_ctrl;
/* Runtime suspend should only enable wakeup for link changes */
if (runtime)
@ -6697,23 +6696,6 @@ static int __e1000_shutdown(struct pci_dev *pdev, bool runtime)
if (retval)
return retval;
}
/* Force SMBUS to allow WOL */
/* Switching PHY interface always returns MDI error
* so disable retry mechanism to avoid wasting time
*/
e1000e_disable_phy_retry(hw);
e1e_rphy(hw, CV_SMB_CTRL, &smb_ctrl);
smb_ctrl |= CV_SMB_CTRL_FORCE_SMBUS;
e1e_wphy(hw, CV_SMB_CTRL, smb_ctrl);
e1000e_enable_phy_retry(hw);
/* Force SMBus mode in MAC */
ctrl_ext = er32(CTRL_EXT);
ctrl_ext |= E1000_CTRL_EXT_FORCE_SMBUS;
ew32(CTRL_EXT, ctrl_ext);
}
/* Ensure that the appropriate bits are set in LPI_CTRL

View File

@ -11171,6 +11171,8 @@ static void i40e_reset_and_rebuild(struct i40e_pf *pf, bool reinit,
ret = i40e_reset(pf);
if (!ret)
i40e_rebuild(pf, reinit, lock_acquired);
else
dev_err(&pf->pdev->dev, "%s: i40e_reset() FAILED", __func__);
}
/**
@ -16334,6 +16336,139 @@ unmap:
pci_disable_device(pdev);
}
/**
* i40e_enable_mc_magic_wake - enable multicast magic packet wake up
* using the mac_address_write admin q function
* @pf: pointer to i40e_pf struct
**/
static void i40e_enable_mc_magic_wake(struct i40e_pf *pf)
{
struct i40e_vsi *main_vsi = i40e_pf_get_main_vsi(pf);
struct i40e_hw *hw = &pf->hw;
u8 mac_addr[6];
u16 flags = 0;
int ret;
/* Get current MAC address in case it's an LAA */
if (main_vsi && main_vsi->netdev) {
ether_addr_copy(mac_addr, main_vsi->netdev->dev_addr);
} else {
dev_err(&pf->pdev->dev,
"Failed to retrieve MAC address; using default\n");
ether_addr_copy(mac_addr, hw->mac.addr);
}
/* The FW expects the mac address write cmd to first be called with
* one of these flags before calling it again with the multicast
* enable flags.
*/
flags = I40E_AQC_WRITE_TYPE_LAA_WOL;
if (hw->func_caps.flex10_enable && hw->partition_id != 1)
flags = I40E_AQC_WRITE_TYPE_LAA_ONLY;
ret = i40e_aq_mac_address_write(hw, flags, mac_addr, NULL);
if (ret) {
dev_err(&pf->pdev->dev,
"Failed to update MAC address registers; cannot enable Multicast Magic packet wake up");
return;
}
flags = I40E_AQC_MC_MAG_EN
| I40E_AQC_WOL_PRESERVE_ON_PFR
| I40E_AQC_WRITE_TYPE_UPDATE_MC_MAG;
ret = i40e_aq_mac_address_write(hw, flags, mac_addr, NULL);
if (ret)
dev_err(&pf->pdev->dev,
"Failed to enable Multicast Magic Packet wake up\n");
}
/**
* i40e_io_suspend - suspend all IO operations
* @pf: pointer to i40e_pf struct
*
**/
static int i40e_io_suspend(struct i40e_pf *pf)
{
struct i40e_hw *hw = &pf->hw;
set_bit(__I40E_DOWN, pf->state);
/* Ensure service task will not be running */
del_timer_sync(&pf->service_timer);
cancel_work_sync(&pf->service_task);
/* Client close must be called explicitly here because the timer
* has been stopped.
*/
i40e_notify_client_of_netdev_close(pf, false);
if (test_bit(I40E_HW_CAP_WOL_MC_MAGIC_PKT_WAKE, pf->hw.caps) &&
pf->wol_en)
i40e_enable_mc_magic_wake(pf);
/* Since we're going to destroy queues during the
* i40e_clear_interrupt_scheme() we should hold the RTNL lock for this
* whole section
*/
rtnl_lock();
i40e_prep_for_reset(pf);
wr32(hw, I40E_PFPM_APM, (pf->wol_en ? I40E_PFPM_APM_APME_MASK : 0));
wr32(hw, I40E_PFPM_WUFC, (pf->wol_en ? I40E_PFPM_WUFC_MAG_MASK : 0));
/* Clear the interrupt scheme and release our IRQs so that the system
* can safely hibernate even when there are a large number of CPUs.
* Otherwise hibernation might fail when mapping all the vectors back
* to CPU0.
*/
i40e_clear_interrupt_scheme(pf);
rtnl_unlock();
return 0;
}
/**
* i40e_io_resume - resume IO operations
* @pf: pointer to i40e_pf struct
*
**/
static int i40e_io_resume(struct i40e_pf *pf)
{
struct device *dev = &pf->pdev->dev;
int err;
/* We need to hold the RTNL lock prior to restoring interrupt schemes,
* since we're going to be restoring queues
*/
rtnl_lock();
/* We cleared the interrupt scheme when we suspended, so we need to
* restore it now to resume device functionality.
*/
err = i40e_restore_interrupt_scheme(pf);
if (err) {
dev_err(dev, "Cannot restore interrupt scheme: %d\n",
err);
}
clear_bit(__I40E_DOWN, pf->state);
i40e_reset_and_rebuild(pf, false, true);
rtnl_unlock();
/* Clear suspended state last after everything is recovered */
clear_bit(__I40E_SUSPENDED, pf->state);
/* Restart the service task */
mod_timer(&pf->service_timer,
round_jiffies(jiffies + pf->service_timer_period));
return 0;
}
/**
* i40e_pci_error_detected - warning that something funky happened in PCI land
* @pdev: PCI device information struct
@ -16358,7 +16493,7 @@ static pci_ers_result_t i40e_pci_error_detected(struct pci_dev *pdev,
/* shutdown all operations */
if (!test_bit(__I40E_SUSPENDED, pf->state))
i40e_prep_for_reset(pf);
i40e_io_suspend(pf);
/* Request a slot reset */
return PCI_ERS_RESULT_NEED_RESET;
@ -16380,7 +16515,8 @@ static pci_ers_result_t i40e_pci_error_slot_reset(struct pci_dev *pdev)
u32 reg;
dev_dbg(&pdev->dev, "%s\n", __func__);
if (pci_enable_device_mem(pdev)) {
/* enable I/O and memory of the device */
if (pci_enable_device(pdev)) {
dev_info(&pdev->dev,
"Cannot re-enable PCI device after reset.\n");
result = PCI_ERS_RESULT_DISCONNECT;
@ -16443,54 +16579,7 @@ static void i40e_pci_error_resume(struct pci_dev *pdev)
if (test_bit(__I40E_SUSPENDED, pf->state))
return;
i40e_handle_reset_warning(pf, false);
}
/**
* i40e_enable_mc_magic_wake - enable multicast magic packet wake up
* using the mac_address_write admin q function
* @pf: pointer to i40e_pf struct
**/
static void i40e_enable_mc_magic_wake(struct i40e_pf *pf)
{
struct i40e_vsi *main_vsi = i40e_pf_get_main_vsi(pf);
struct i40e_hw *hw = &pf->hw;
u8 mac_addr[6];
u16 flags = 0;
int ret;
/* Get current MAC address in case it's an LAA */
if (main_vsi && main_vsi->netdev) {
ether_addr_copy(mac_addr, main_vsi->netdev->dev_addr);
} else {
dev_err(&pf->pdev->dev,
"Failed to retrieve MAC address; using default\n");
ether_addr_copy(mac_addr, hw->mac.addr);
}
/* The FW expects the mac address write cmd to first be called with
* one of these flags before calling it again with the multicast
* enable flags.
*/
flags = I40E_AQC_WRITE_TYPE_LAA_WOL;
if (hw->func_caps.flex10_enable && hw->partition_id != 1)
flags = I40E_AQC_WRITE_TYPE_LAA_ONLY;
ret = i40e_aq_mac_address_write(hw, flags, mac_addr, NULL);
if (ret) {
dev_err(&pf->pdev->dev,
"Failed to update MAC address registers; cannot enable Multicast Magic packet wake up");
return;
}
flags = I40E_AQC_MC_MAG_EN
| I40E_AQC_WOL_PRESERVE_ON_PFR
| I40E_AQC_WRITE_TYPE_UPDATE_MC_MAG;
ret = i40e_aq_mac_address_write(hw, flags, mac_addr, NULL);
if (ret)
dev_err(&pf->pdev->dev,
"Failed to enable Multicast Magic Packet wake up\n");
i40e_io_resume(pf);
}
/**
@ -16552,48 +16641,11 @@ static void i40e_shutdown(struct pci_dev *pdev)
static int i40e_suspend(struct device *dev)
{
struct i40e_pf *pf = dev_get_drvdata(dev);
struct i40e_hw *hw = &pf->hw;
/* If we're already suspended, then there is nothing to do */
if (test_and_set_bit(__I40E_SUSPENDED, pf->state))
return 0;
set_bit(__I40E_DOWN, pf->state);
/* Ensure service task will not be running */
del_timer_sync(&pf->service_timer);
cancel_work_sync(&pf->service_task);
/* Client close must be called explicitly here because the timer
* has been stopped.
*/
i40e_notify_client_of_netdev_close(pf, false);
if (test_bit(I40E_HW_CAP_WOL_MC_MAGIC_PKT_WAKE, pf->hw.caps) &&
pf->wol_en)
i40e_enable_mc_magic_wake(pf);
/* Since we're going to destroy queues during the
* i40e_clear_interrupt_scheme() we should hold the RTNL lock for this
* whole section
*/
rtnl_lock();
i40e_prep_for_reset(pf);
wr32(hw, I40E_PFPM_APM, (pf->wol_en ? I40E_PFPM_APM_APME_MASK : 0));
wr32(hw, I40E_PFPM_WUFC, (pf->wol_en ? I40E_PFPM_WUFC_MAG_MASK : 0));
/* Clear the interrupt scheme and release our IRQs so that the system
* can safely hibernate even when there are a large number of CPUs.
* Otherwise hibernation might fail when mapping all the vectors back
* to CPU0.
*/
i40e_clear_interrupt_scheme(pf);
rtnl_unlock();
return 0;
return i40e_io_suspend(pf);
}
/**
@ -16603,39 +16655,11 @@ static int i40e_suspend(struct device *dev)
static int i40e_resume(struct device *dev)
{
struct i40e_pf *pf = dev_get_drvdata(dev);
int err;
/* If we're not suspended, then there is nothing to do */
if (!test_bit(__I40E_SUSPENDED, pf->state))
return 0;
/* We need to hold the RTNL lock prior to restoring interrupt schemes,
* since we're going to be restoring queues
*/
rtnl_lock();
/* We cleared the interrupt scheme when we suspended, so we need to
* restore it now to resume device functionality.
*/
err = i40e_restore_interrupt_scheme(pf);
if (err) {
dev_err(dev, "Cannot restore interrupt scheme: %d\n",
err);
}
clear_bit(__I40E_DOWN, pf->state);
i40e_reset_and_rebuild(pf, false, true);
rtnl_unlock();
/* Clear suspended state last after everything is recovered */
clear_bit(__I40E_SUSPENDED, pf->state);
/* Restart the service task */
mod_timer(&pf->service_timer,
round_jiffies(jiffies + pf->service_timer_period));
return 0;
return i40e_io_resume(pf);
}
static const struct pci_error_handlers i40e_err_handler = {

View File

@ -1388,7 +1388,7 @@ enum ice_param_id {
ICE_DEVLINK_PARAM_ID_TX_SCHED_LAYERS,
};
static const struct devlink_param ice_devlink_params[] = {
static const struct devlink_param ice_dvl_rdma_params[] = {
DEVLINK_PARAM_GENERIC(ENABLE_ROCE, BIT(DEVLINK_PARAM_CMODE_RUNTIME),
ice_devlink_enable_roce_get,
ice_devlink_enable_roce_set,
@ -1397,6 +1397,9 @@ static const struct devlink_param ice_devlink_params[] = {
ice_devlink_enable_iw_get,
ice_devlink_enable_iw_set,
ice_devlink_enable_iw_validate),
};
static const struct devlink_param ice_dvl_sched_params[] = {
DEVLINK_PARAM_DRIVER(ICE_DEVLINK_PARAM_ID_TX_SCHED_LAYERS,
"tx_scheduling_layers",
DEVLINK_PARAM_TYPE_U8,
@ -1464,21 +1467,31 @@ int ice_devlink_register_params(struct ice_pf *pf)
{
struct devlink *devlink = priv_to_devlink(pf);
struct ice_hw *hw = &pf->hw;
size_t params_size;
int status;
params_size = ARRAY_SIZE(ice_devlink_params);
status = devl_params_register(devlink, ice_dvl_rdma_params,
ARRAY_SIZE(ice_dvl_rdma_params));
if (status)
return status;
if (!hw->func_caps.common_cap.tx_sched_topo_comp_mode_en)
params_size--;
if (hw->func_caps.common_cap.tx_sched_topo_comp_mode_en)
status = devl_params_register(devlink, ice_dvl_sched_params,
ARRAY_SIZE(ice_dvl_sched_params));
return devl_params_register(devlink, ice_devlink_params,
params_size);
return status;
}
void ice_devlink_unregister_params(struct ice_pf *pf)
{
devl_params_unregister(priv_to_devlink(pf), ice_devlink_params,
ARRAY_SIZE(ice_devlink_params));
struct devlink *devlink = priv_to_devlink(pf);
struct ice_hw *hw = &pf->hw;
devl_params_unregister(devlink, ice_dvl_rdma_params,
ARRAY_SIZE(ice_dvl_rdma_params));
if (hw->func_caps.common_cap.tx_sched_topo_comp_mode_en)
devl_params_unregister(devlink, ice_dvl_sched_params,
ARRAY_SIZE(ice_dvl_sched_params));
}
#define ICE_DEVLINK_READ_BLK_SIZE (1024 * 1024)

View File

@ -3148,6 +3148,16 @@ ice_get_link_speed_based_on_phy_type(u64 phy_type_low, u64 phy_type_high)
case ICE_PHY_TYPE_HIGH_100G_AUI2:
speed_phy_type_high = ICE_AQ_LINK_SPEED_100GB;
break;
case ICE_PHY_TYPE_HIGH_200G_CR4_PAM4:
case ICE_PHY_TYPE_HIGH_200G_SR4:
case ICE_PHY_TYPE_HIGH_200G_FR4:
case ICE_PHY_TYPE_HIGH_200G_LR4:
case ICE_PHY_TYPE_HIGH_200G_DR4:
case ICE_PHY_TYPE_HIGH_200G_KR4_PAM4:
case ICE_PHY_TYPE_HIGH_200G_AUI4_AOC_ACC:
case ICE_PHY_TYPE_HIGH_200G_AUI4:
speed_phy_type_high = ICE_AQ_LINK_SPEED_200GB;
break;
default:
speed_phy_type_high = ICE_AQ_LINK_SPEED_UNKNOWN;
break;

View File

@ -45,14 +45,15 @@ int ice_vsi_add_vlan(struct ice_vsi *vsi, struct ice_vlan *vlan)
return -EINVAL;
err = ice_fltr_add_vlan(vsi, vlan);
if (err && err != -EEXIST) {
if (!err)
vsi->num_vlan++;
else if (err == -EEXIST)
err = 0;
else
dev_err(ice_pf_to_dev(vsi->back), "Failure Adding VLAN %d on VSI %i, status %d\n",
vlan->vid, vsi->vsi_num, err);
return err;
}
vsi->num_vlan++;
return 0;
return err;
}
/**

View File

@ -1394,6 +1394,7 @@ static int idpf_vport_open(struct idpf_vport *vport, bool alloc_res)
}
idpf_rx_init_buf_tail(vport);
idpf_vport_intr_ena(vport);
err = idpf_send_config_queues_msg(vport);
if (err) {

View File

@ -3746,9 +3746,9 @@ static void idpf_vport_intr_ena_irq_all(struct idpf_vport *vport)
*/
void idpf_vport_intr_deinit(struct idpf_vport *vport)
{
idpf_vport_intr_dis_irq_all(vport);
idpf_vport_intr_napi_dis_all(vport);
idpf_vport_intr_napi_del_all(vport);
idpf_vport_intr_dis_irq_all(vport);
idpf_vport_intr_rel_irq(vport);
}
@ -4179,7 +4179,6 @@ int idpf_vport_intr_init(struct idpf_vport *vport)
idpf_vport_intr_map_vector_to_qs(vport);
idpf_vport_intr_napi_add_all(vport);
idpf_vport_intr_napi_ena_all(vport);
err = vport->adapter->dev_ops.reg_ops.intr_reg_init(vport);
if (err)
@ -4193,17 +4192,20 @@ int idpf_vport_intr_init(struct idpf_vport *vport)
if (err)
goto unroll_vectors_alloc;
idpf_vport_intr_ena_irq_all(vport);
return 0;
unroll_vectors_alloc:
idpf_vport_intr_napi_dis_all(vport);
idpf_vport_intr_napi_del_all(vport);
return err;
}
void idpf_vport_intr_ena(struct idpf_vport *vport)
{
idpf_vport_intr_napi_ena_all(vport);
idpf_vport_intr_ena_irq_all(vport);
}
/**
* idpf_config_rss - Send virtchnl messages to configure RSS
* @vport: virtual port

View File

@ -990,6 +990,7 @@ int idpf_vport_intr_alloc(struct idpf_vport *vport);
void idpf_vport_intr_update_itr_ena_irq(struct idpf_q_vector *q_vector);
void idpf_vport_intr_deinit(struct idpf_vport *vport);
int idpf_vport_intr_init(struct idpf_vport *vport);
void idpf_vport_intr_ena(struct idpf_vport *vport);
enum pkt_hash_types idpf_ptype_to_htype(const struct idpf_rx_ptype_decoded *decoded);
int idpf_config_rss(struct idpf_vport *vport);
int idpf_init_rss(struct idpf_vport *vport);

View File

@ -1422,7 +1422,10 @@ static int otx2_qos_leaf_to_inner(struct otx2_nic *pfvf, u16 classid,
otx2_qos_read_txschq_cfg(pfvf, node, old_cfg);
/* delete the txschq nodes allocated for this node */
otx2_qos_disable_sq(pfvf, qid);
otx2_qos_free_hw_node_schq(pfvf, node);
otx2_qos_free_sw_node_schq(pfvf, node);
pfvf->qos.qid_to_sqmap[qid] = OTX2_QOS_INVALID_SQ;
/* mark this node as htb inner node */
WRITE_ONCE(node->qid, OTX2_QOS_QID_INNER);
@ -1632,6 +1635,7 @@ static int otx2_qos_leaf_del_last(struct otx2_nic *pfvf, u16 classid, bool force
dwrr_del_node = true;
/* destroy the leaf node */
otx2_qos_disable_sq(pfvf, qid);
otx2_qos_destroy_node(pfvf, node);
pfvf->qos.qid_to_sqmap[qid] = OTX2_QOS_INVALID_SQ;

View File

@ -102,8 +102,14 @@ static inline void
mlx5e_udp_gso_handle_tx_skb(struct sk_buff *skb)
{
int payload_len = skb_shinfo(skb)->gso_size + sizeof(struct udphdr);
struct udphdr *udphdr;
udp_hdr(skb)->len = htons(payload_len);
if (skb->encapsulation)
udphdr = (struct udphdr *)skb_inner_transport_header(skb);
else
udphdr = udp_hdr(skb);
udphdr->len = htons(payload_len);
}
struct mlx5e_accel_tx_state {

View File

@ -750,8 +750,7 @@ err_fs:
err_fs_ft:
if (rx->allow_tunnel_mode)
mlx5_eswitch_unblock_encap(mdev);
mlx5_del_flow_rules(rx->status.rule);
mlx5_modify_header_dealloc(mdev, rx->status.modify_hdr);
mlx5_ipsec_rx_status_destroy(ipsec, rx);
err_add:
mlx5_destroy_flow_table(rx->ft.status);
err_fs_ft_status:

View File

@ -97,18 +97,11 @@ mlx5e_ipsec_feature_check(struct sk_buff *skb, netdev_features_t features)
if (!x || !x->xso.offload_handle)
goto out_disable;
if (xo->inner_ipproto) {
/* Cannot support tunnel packet over IPsec tunnel mode
* because we cannot offload three IP header csum
*/
if (x->props.mode == XFRM_MODE_TUNNEL)
goto out_disable;
/* Only support UDP or TCP L4 checksum */
if (xo->inner_ipproto != IPPROTO_UDP &&
xo->inner_ipproto != IPPROTO_TCP)
goto out_disable;
}
/* Only support UDP or TCP L4 checksum */
if (xo->inner_ipproto &&
xo->inner_ipproto != IPPROTO_UDP &&
xo->inner_ipproto != IPPROTO_TCP)
goto out_disable;
return features;

View File

@ -3886,7 +3886,7 @@ mlx5e_get_stats(struct net_device *dev, struct rtnl_link_stats64 *stats)
mlx5e_fold_sw_stats64(priv, stats);
}
stats->rx_dropped = priv->stats.qcnt.rx_out_of_buffer;
stats->rx_missed_errors = priv->stats.qcnt.rx_out_of_buffer;
stats->rx_length_errors =
PPORT_802_3_GET(pstats, a_in_range_length_errors) +

View File

@ -1186,6 +1186,9 @@ void mlx5e_stats_ts_get(struct mlx5e_priv *priv,
ts_stats->err = 0;
ts_stats->lost = 0;
if (!ptp)
goto out;
/* Aggregate stats across all TCs */
for (i = 0; i < ptp->num_tc; i++) {
struct mlx5e_ptp_cq_stats *stats =
@ -1214,6 +1217,7 @@ void mlx5e_stats_ts_get(struct mlx5e_priv *priv,
}
}
out:
mutex_unlock(&priv->state_lock);
}

View File

@ -153,7 +153,11 @@ mlx5e_tx_get_gso_ihs(struct mlx5e_txqsq *sq, struct sk_buff *skb, int *hopbyhop)
*hopbyhop = 0;
if (skb->encapsulation) {
ihs = skb_inner_tcp_all_headers(skb);
if (skb_shinfo(skb)->gso_type & SKB_GSO_UDP_L4)
ihs = skb_inner_transport_offset(skb) +
sizeof(struct udphdr);
else
ihs = skb_inner_tcp_all_headers(skb);
stats->tso_inner_packets++;
stats->tso_inner_bytes += skb->len - ihs;
} else {

View File

@ -719,6 +719,7 @@ bool mlx5_lag_check_prereq(struct mlx5_lag *ldev)
struct mlx5_core_dev *dev;
u8 mode;
#endif
bool roce_support;
int i;
for (i = 0; i < ldev->ports; i++)
@ -743,6 +744,11 @@ bool mlx5_lag_check_prereq(struct mlx5_lag *ldev)
if (mlx5_sriov_is_enabled(ldev->pf[i].dev))
return false;
#endif
roce_support = mlx5_get_roce_state(ldev->pf[MLX5_LAG_P1].dev);
for (i = 1; i < ldev->ports; i++)
if (mlx5_get_roce_state(ldev->pf[i].dev) != roce_support)
return false;
return true;
}
@ -910,8 +916,10 @@ static void mlx5_do_bond(struct mlx5_lag *ldev)
} else if (roce_lag) {
dev0->priv.flags &= ~MLX5_PRIV_FLAGS_DISABLE_IB_ADEV;
mlx5_rescan_drivers_locked(dev0);
for (i = 1; i < ldev->ports; i++)
mlx5_nic_vport_enable_roce(ldev->pf[i].dev);
for (i = 1; i < ldev->ports; i++) {
if (mlx5_get_roce_state(ldev->pf[i].dev))
mlx5_nic_vport_enable_roce(ldev->pf[i].dev);
}
} else if (shared_fdb) {
int i;

View File

@ -100,10 +100,6 @@ static bool ft_create_alias_supported(struct mlx5_core_dev *dev)
static bool mlx5_sd_is_supported(struct mlx5_core_dev *dev, u8 host_buses)
{
/* Feature is currently implemented for PFs only */
if (!mlx5_core_is_pf(dev))
return false;
/* Honor the SW implementation limit */
if (host_buses > MLX5_SD_MAX_GROUP_SZ)
return false;
@ -162,6 +158,14 @@ static int sd_init(struct mlx5_core_dev *dev)
bool sdm;
int err;
/* Feature is currently implemented for PFs only */
if (!mlx5_core_is_pf(dev))
return 0;
/* Block on embedded CPU PFs */
if (mlx5_core_is_ecpf(dev))
return 0;
if (!MLX5_CAP_MCAM_REG(dev, mpir))
return 0;

View File

@ -455,7 +455,7 @@ void icssg_ft1_set_mac_addr(struct regmap *miig_rt, int slice, u8 *mac_addr)
{
const u8 mask_addr[] = { 0, 0, 0, 0, 0, 0, };
rx_class_ft1_set_start_len(miig_rt, slice, 0, 6);
rx_class_ft1_set_start_len(miig_rt, slice, ETH_ALEN, ETH_ALEN);
rx_class_ft1_set_da(miig_rt, slice, 0, mac_addr);
rx_class_ft1_set_da_mask(miig_rt, slice, 0, mask_addr);
rx_class_ft1_cfg_set_type(miig_rt, slice, 0, FT1_CFG_TYPE_EQ);

View File

@ -439,7 +439,7 @@ static noinline_for_stack int ipvlan_process_v4_outbound(struct sk_buff *skb)
memset(IPCB(skb), 0, sizeof(*IPCB(skb)));
err = ip_local_out(net, skb->sk, skb);
err = ip_local_out(net, NULL, skb);
if (unlikely(net_xmit_eval(err)))
DEV_STATS_INC(dev, tx_errors);
else
@ -494,7 +494,7 @@ static int ipvlan_process_v6_outbound(struct sk_buff *skb)
memset(IP6CB(skb), 0, sizeof(*IP6CB(skb)));
err = ip6_local_out(dev_net(dev), skb->sk, skb);
err = ip6_local_out(dev_net(dev), NULL, skb);
if (unlikely(net_xmit_eval(err)))
DEV_STATS_INC(dev, tx_errors);
else

View File

@ -55,6 +55,7 @@ static void netkit_prep_forward(struct sk_buff *skb, bool xnet)
skb_scrub_packet(skb, xnet);
skb->priority = 0;
nf_skip_egress(skb, true);
skb_reset_mac_header(skb);
}
static struct netkit *netkit_priv(const struct net_device *dev)
@ -78,6 +79,7 @@ static netdev_tx_t netkit_xmit(struct sk_buff *skb, struct net_device *dev)
skb_orphan_frags(skb, GFP_ATOMIC)))
goto drop;
netkit_prep_forward(skb, !net_eq(dev_net(dev), dev_net(peer)));
eth_skb_pkt_type(skb, peer);
skb->dev = peer;
entry = rcu_dereference(nk->active);
if (entry)
@ -85,7 +87,7 @@ static netdev_tx_t netkit_xmit(struct sk_buff *skb, struct net_device *dev)
switch (ret) {
case NETKIT_NEXT:
case NETKIT_PASS:
skb->protocol = eth_type_trans(skb, skb->dev);
eth_skb_pull_mac(skb);
skb_postpull_rcsum(skb, eth_hdr(skb), ETH_HLEN);
if (likely(__netif_rx(skb) == NET_RX_SUCCESS)) {
dev_sw_netstats_tx_add(dev, 1, len);
@ -155,6 +157,16 @@ static void netkit_set_multicast(struct net_device *dev)
/* Nothing to do, we receive whatever gets pushed to us! */
}
static int netkit_set_macaddr(struct net_device *dev, void *sa)
{
struct netkit *nk = netkit_priv(dev);
if (nk->mode != NETKIT_L2)
return -EOPNOTSUPP;
return eth_mac_addr(dev, sa);
}
static void netkit_set_headroom(struct net_device *dev, int headroom)
{
struct netkit *nk = netkit_priv(dev), *nk2;
@ -198,6 +210,7 @@ static const struct net_device_ops netkit_netdev_ops = {
.ndo_start_xmit = netkit_xmit,
.ndo_set_rx_mode = netkit_set_multicast,
.ndo_set_rx_headroom = netkit_set_headroom,
.ndo_set_mac_address = netkit_set_macaddr,
.ndo_get_iflink = netkit_get_iflink,
.ndo_get_peer_dev = netkit_peer_dev,
.ndo_get_stats64 = netkit_get_stats,
@ -300,9 +313,11 @@ static int netkit_validate(struct nlattr *tb[], struct nlattr *data[],
if (!attr)
return 0;
NL_SET_ERR_MSG_ATTR(extack, attr,
"Setting Ethernet address is not supported");
return -EOPNOTSUPP;
if (nla_len(attr) != ETH_ALEN)
return -EINVAL;
if (!is_valid_ether_addr(nla_data(attr)))
return -EADDRNOTAVAIL;
return 0;
}
static struct rtnl_link_ops netkit_link_ops;
@ -365,6 +380,9 @@ static int netkit_new_link(struct net *src_net, struct net_device *dev,
strscpy(ifname, "nk%d", IFNAMSIZ);
ifname_assign_type = NET_NAME_ENUM;
}
if (mode != NETKIT_L2 &&
(tb[IFLA_ADDRESS] || tbp[IFLA_ADDRESS]))
return -EOPNOTSUPP;
net = rtnl_link_get_net(src_net, tbp);
if (IS_ERR(net))
@ -379,7 +397,7 @@ static int netkit_new_link(struct net *src_net, struct net_device *dev,
netif_inherit_tso_max(peer, dev);
if (mode == NETKIT_L2)
if (mode == NETKIT_L2 && !(ifmp && tbp[IFLA_ADDRESS]))
eth_hw_addr_random(peer);
if (ifmp && dev->ifindex)
peer->ifindex = ifmp->ifi_index;
@ -402,7 +420,7 @@ static int netkit_new_link(struct net *src_net, struct net_device *dev,
if (err < 0)
goto err_configure_peer;
if (mode == NETKIT_L2)
if (mode == NETKIT_L2 && !tb[IFLA_ADDRESS])
eth_hw_addr_random(dev);
if (tb[IFLA_IFNAME])
nla_strscpy(dev->name, tb[IFLA_IFNAME], IFNAMSIZ);

View File

@ -4029,7 +4029,7 @@ static int lan8841_config_intr(struct phy_device *phydev)
if (phydev->interrupts == PHY_INTERRUPT_ENABLED) {
err = phy_read(phydev, LAN8814_INTS);
if (err)
if (err < 0)
return err;
/* Enable / disable interrupts. It is OK to enable PTP interrupt
@ -4045,6 +4045,14 @@ static int lan8841_config_intr(struct phy_device *phydev)
return err;
err = phy_read(phydev, LAN8814_INTS);
if (err < 0)
return err;
/* Getting a positive value doesn't mean that is an error, it
* just indicates what was the status. Therefore make sure to
* clear the value and say that there is no error.
*/
err = 0;
}
return err;
@ -5327,6 +5335,7 @@ static struct phy_driver ksphy_driver[] = {
/* PHY_BASIC_FEATURES */
.probe = kszphy_probe,
.config_init = ksz8061_config_init,
.soft_reset = genphy_soft_reset,
.config_intr = kszphy_config_intr,
.handle_interrupt = kszphy_handle_interrupt,
.suspend = kszphy_suspend,

View File

@ -879,7 +879,7 @@ static int smsc95xx_start_rx_path(struct usbnet *dev)
static int smsc95xx_reset(struct usbnet *dev)
{
struct smsc95xx_priv *pdata = dev->driver_priv;
u32 read_buf, write_buf, burst_cap;
u32 read_buf, burst_cap;
int ret = 0, timeout;
netif_dbg(dev, ifup, dev->net, "entering smsc95xx_reset\n");
@ -1003,10 +1003,13 @@ static int smsc95xx_reset(struct usbnet *dev)
return ret;
netif_dbg(dev, ifup, dev->net, "ID_REV = 0x%08x\n", read_buf);
ret = smsc95xx_read_reg(dev, LED_GPIO_CFG, &read_buf);
if (ret < 0)
return ret;
/* Configure GPIO pins as LED outputs */
write_buf = LED_GPIO_CFG_SPD_LED | LED_GPIO_CFG_LNK_LED |
LED_GPIO_CFG_FDX_LED;
ret = smsc95xx_write_reg(dev, LED_GPIO_CFG, write_buf);
read_buf |= LED_GPIO_CFG_SPD_LED | LED_GPIO_CFG_LNK_LED |
LED_GPIO_CFG_FDX_LED;
ret = smsc95xx_write_reg(dev, LED_GPIO_CFG, read_buf);
if (ret < 0)
return ret;

View File

@ -125,6 +125,10 @@ static ssize_t virtual_ncidev_write(struct file *file,
kfree_skb(skb);
return -EFAULT;
}
if (strnlen(skb->data, count) != count) {
kfree_skb(skb);
return -EINVAL;
}
nci_recv_frame(vdev->ndev, skb);
return count;

View File

@ -636,6 +636,14 @@ static inline void eth_skb_pkt_type(struct sk_buff *skb,
}
}
static inline struct ethhdr *eth_skb_pull_mac(struct sk_buff *skb)
{
struct ethhdr *eth = (struct ethhdr *)skb->data;
skb_pull_inline(skb, ETH_HLEN);
return eth;
}
/**
* eth_skb_pad - Pad buffer to mininum number of octets for Ethernet frame
* @skb: Buffer to pad

View File

@ -10308,9 +10308,9 @@ struct mlx5_ifc_mcam_access_reg_bits {
u8 mfrl[0x1];
u8 regs_39_to_32[0x8];
u8 regs_31_to_10[0x16];
u8 regs_31_to_11[0x15];
u8 mtmp[0x1];
u8 regs_8_to_0[0x9];
u8 regs_9_to_0[0xa];
};
struct mlx5_ifc_mcam_access_reg_bits1 {

View File

@ -24,7 +24,7 @@ struct dst_ops {
void (*destroy)(struct dst_entry *);
void (*ifdown)(struct dst_entry *,
struct net_device *dev);
struct dst_entry * (*negative_advice)(struct dst_entry *);
void (*negative_advice)(struct sock *sk, struct dst_entry *);
void (*link_failure)(struct sk_buff *);
void (*update_pmtu)(struct dst_entry *dst, struct sock *sk,
struct sk_buff *skb, u32 mtu,

View File

@ -45,16 +45,17 @@ struct pp_alloc_cache {
/**
* struct page_pool_params - page pool parameters
* @fast: params accessed frequently on hotpath
* @order: 2^order pages on allocation
* @pool_size: size of the ptr_ring
* @nid: NUMA node id to allocate from pages from
* @dev: device, for DMA pre-mapping purposes
* @netdev: netdev this pool will serve (leave as NULL if none or multiple)
* @napi: NAPI which is the sole consumer of pages, otherwise NULL
* @dma_dir: DMA mapping direction
* @max_len: max DMA sync memory size for PP_FLAG_DMA_SYNC_DEV
* @offset: DMA sync address offset for PP_FLAG_DMA_SYNC_DEV
* @netdev: corresponding &net_device for Netlink introspection
* @slow: params with slowpath access only (initialization and Netlink)
* @netdev: netdev this pool will serve (leave as NULL if none or multiple)
* @flags: PP_FLAG_DMA_MAP, PP_FLAG_DMA_SYNC_DEV, PP_FLAG_SYSTEM_POOL
*/
struct page_pool_params {

View File

@ -285,4 +285,16 @@ static inline int reqsk_queue_len_young(const struct request_sock_queue *queue)
return atomic_read(&queue->young);
}
/* RFC 7323 2.3 Using the Window Scale Option
* The window field (SEG.WND) of every outgoing segment, with the
* exception of <SYN> segments, MUST be right-shifted by
* Rcv.Wind.Shift bits.
*
* This means the SEG.WND carried in SYNACK can not exceed 65535.
* We use this property to harden TCP stack while in NEW_SYN_RECV state.
*/
static inline u32 tcp_synack_window(const struct request_sock *req)
{
return min(req->rsk_rcv_wnd, 65535U);
}
#endif /* _REQUEST_SOCK_H */

View File

@ -2063,17 +2063,10 @@ sk_dst_get(const struct sock *sk)
static inline void __dst_negative_advice(struct sock *sk)
{
struct dst_entry *ndst, *dst = __sk_dst_get(sk);
struct dst_entry *dst = __sk_dst_get(sk);
if (dst && dst->ops->negative_advice) {
ndst = dst->ops->negative_advice(dst);
if (ndst != dst) {
rcu_assign_pointer(sk->sk_dst_cache, ndst);
sk_tx_queue_clear(sk);
WRITE_ONCE(sk->sk_dst_pending_confirm, 0);
}
}
if (dst && dst->ops->negative_advice)
dst->ops->negative_advice(sk, dst);
}
static inline void dst_negative_advice(struct sock *sk)

View File

@ -69,8 +69,7 @@ struct proc_input {
static inline enum proc_cn_event valid_event(enum proc_cn_event ev_type)
{
ev_type &= PROC_EVENT_ALL;
return ev_type;
return (enum proc_cn_event)(ev_type & PROC_EVENT_ALL);
}
/*

View File

@ -148,6 +148,7 @@ enum {
NETDEV_A_QSTATS_RX_ALLOC_FAIL,
NETDEV_A_QSTATS_RX_HW_DROPS,
NETDEV_A_QSTATS_RX_HW_DROP_OVERRUNS,
NETDEV_A_QSTATS_RX_CSUM_COMPLETE,
NETDEV_A_QSTATS_RX_CSUM_UNNECESSARY,
NETDEV_A_QSTATS_RX_CSUM_NONE,
NETDEV_A_QSTATS_RX_CSUM_BAD,

View File

@ -8882,7 +8882,8 @@ static bool may_update_sockmap(struct bpf_verifier_env *env, int func_id)
enum bpf_attach_type eatype = env->prog->expected_attach_type;
enum bpf_prog_type type = resolve_prog_type(env->prog);
if (func_id != BPF_FUNC_map_update_elem)
if (func_id != BPF_FUNC_map_update_elem &&
func_id != BPF_FUNC_map_delete_elem)
return false;
/* It's not possible to get access to a locked struct sock in these
@ -8893,6 +8894,11 @@ static bool may_update_sockmap(struct bpf_verifier_env *env, int func_id)
if (eatype == BPF_TRACE_ITER)
return true;
break;
case BPF_PROG_TYPE_SOCK_OPS:
/* map_update allowed only via dedicated helpers with event type checks */
if (func_id == BPF_FUNC_map_delete_elem)
return true;
break;
case BPF_PROG_TYPE_SOCKET_FILTER:
case BPF_PROG_TYPE_SCHED_CLS:
case BPF_PROG_TYPE_SCHED_ACT:
@ -8988,7 +8994,6 @@ static int check_map_func_compatibility(struct bpf_verifier_env *env,
case BPF_MAP_TYPE_SOCKMAP:
if (func_id != BPF_FUNC_sk_redirect_map &&
func_id != BPF_FUNC_sock_map_update &&
func_id != BPF_FUNC_map_delete_elem &&
func_id != BPF_FUNC_msg_redirect_map &&
func_id != BPF_FUNC_sk_select_reuseport &&
func_id != BPF_FUNC_map_lookup_elem &&
@ -8998,7 +9003,6 @@ static int check_map_func_compatibility(struct bpf_verifier_env *env,
case BPF_MAP_TYPE_SOCKHASH:
if (func_id != BPF_FUNC_sk_redirect_hash &&
func_id != BPF_FUNC_sock_hash_update &&
func_id != BPF_FUNC_map_delete_elem &&
func_id != BPF_FUNC_msg_redirect_hash &&
func_id != BPF_FUNC_sk_select_reuseport &&
func_id != BPF_FUNC_map_lookup_elem &&

View File

@ -3295,7 +3295,7 @@ static int uprobe_prog_run(struct bpf_uprobe *uprobe,
struct bpf_run_ctx *old_run_ctx;
int err = 0;
if (link->task && current != link->task)
if (link->task && current->mm != link->task->mm)
return 0;
if (sleepable)
@ -3396,8 +3396,9 @@ int bpf_uprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *pr
upath = u64_to_user_ptr(attr->link_create.uprobe_multi.path);
uoffsets = u64_to_user_ptr(attr->link_create.uprobe_multi.offsets);
cnt = attr->link_create.uprobe_multi.cnt;
pid = attr->link_create.uprobe_multi.pid;
if (!upath || !uoffsets || !cnt)
if (!upath || !uoffsets || !cnt || pid < 0)
return -EINVAL;
if (cnt > MAX_UPROBE_MULTI_CNT)
return -E2BIG;
@ -3421,11 +3422,8 @@ int bpf_uprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *pr
goto error_path_put;
}
pid = attr->link_create.uprobe_multi.pid;
if (pid) {
rcu_read_lock();
task = get_pid_task(find_vpid(pid), PIDTYPE_PID);
rcu_read_unlock();
task = get_pid_task(find_vpid(pid), PIDTYPE_TGID);
if (!task) {
err = -ESRCH;
goto error_path_put;

View File

@ -423,9 +423,6 @@ static int __sock_map_delete(struct bpf_stab *stab, struct sock *sk_test,
struct sock *sk;
int err = 0;
if (irqs_disabled())
return -EOPNOTSUPP; /* locks here are hardirq-unsafe */
spin_lock_bh(&stab->lock);
sk = *psk;
if (!sk_test || sk_test == sk)
@ -948,9 +945,6 @@ static long sock_hash_delete_elem(struct bpf_map *map, void *key)
struct bpf_shtab_elem *elem;
int ret = -ENOENT;
if (irqs_disabled())
return -EOPNOTSUPP; /* locks here are hardirq-unsafe */
hash = sock_hash_bucket_hash(key, key_size);
bucket = sock_hash_select_bucket(htab, hash);
@ -1680,19 +1674,23 @@ void sock_map_close(struct sock *sk, long timeout)
lock_sock(sk);
rcu_read_lock();
psock = sk_psock_get(sk);
if (unlikely(!psock)) {
rcu_read_unlock();
release_sock(sk);
saved_close = READ_ONCE(sk->sk_prot)->close;
} else {
psock = sk_psock(sk);
if (likely(psock)) {
saved_close = psock->saved_close;
sock_map_remove_links(sk, psock);
psock = sk_psock_get(sk);
if (unlikely(!psock))
goto no_psock;
rcu_read_unlock();
sk_psock_stop(psock);
release_sock(sk);
cancel_delayed_work_sync(&psock->work);
sk_psock_put(sk, psock);
} else {
saved_close = READ_ONCE(sk->sk_prot)->close;
no_psock:
rcu_read_unlock();
release_sock(sk);
}
/* Make sure we do not recurse. This is a bug.

View File

@ -161,9 +161,7 @@ __be16 eth_type_trans(struct sk_buff *skb, struct net_device *dev)
skb->dev = dev;
skb_reset_mac_header(skb);
eth = (struct ethhdr *)skb->data;
skb_pull_inline(skb, ETH_HLEN);
eth = eth_skb_pull_mac(skb);
eth_skb_pkt_type(skb, dev);
/*

View File

@ -1532,7 +1532,7 @@ struct sk_buff *inet_gro_receive(struct list_head *head, struct sk_buff *skb)
}
NAPI_GRO_CB(skb)->flush |= flush;
NAPI_GRO_CB(skb)->inner_network_offset = off;
NAPI_GRO_CB(skb)->network_offsets[NAPI_GRO_CB(skb)->encap_mark] = off;
/* Note : No need to call skb_gro_postpull_rcsum() here,
* as we already checked checksum over ipv4 header was 0

View File

@ -1887,10 +1887,11 @@ static int inet_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb)
goto done;
if (fillargs.ifindex) {
err = -ENODEV;
dev = dev_get_by_index_rcu(tgt_net, fillargs.ifindex);
if (!dev)
if (!dev) {
err = -ENODEV;
goto done;
}
in_dev = __in_dev_get_rcu(dev);
if (!in_dev)
goto done;
@ -1902,7 +1903,7 @@ static int inet_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb)
cb->seq = inet_base_seq(tgt_net);
for_each_netdev_dump(net, dev, ctx->ifindex) {
for_each_netdev_dump(tgt_net, dev, ctx->ifindex) {
in_dev = __in_dev_get_rcu(dev);
if (!in_dev)
continue;

View File

@ -58,6 +58,8 @@ __be32 nf_tproxy_laddr4(struct sk_buff *skb, __be32 user_laddr, __be32 daddr)
laddr = 0;
indev = __in_dev_get_rcu(skb->dev);
if (!indev)
return daddr;
in_dev_for_each_ifa_rcu(ifa, indev) {
if (ifa->ifa_flags & IFA_F_SECONDARY)

View File

@ -129,7 +129,8 @@ struct dst_entry *ipv4_dst_check(struct dst_entry *dst, u32 cookie);
static unsigned int ipv4_default_advmss(const struct dst_entry *dst);
INDIRECT_CALLABLE_SCOPE
unsigned int ipv4_mtu(const struct dst_entry *dst);
static struct dst_entry *ipv4_negative_advice(struct dst_entry *dst);
static void ipv4_negative_advice(struct sock *sk,
struct dst_entry *dst);
static void ipv4_link_failure(struct sk_buff *skb);
static void ip_rt_update_pmtu(struct dst_entry *dst, struct sock *sk,
struct sk_buff *skb, u32 mtu,
@ -825,22 +826,15 @@ static void ip_do_redirect(struct dst_entry *dst, struct sock *sk, struct sk_buf
__ip_do_redirect(rt, skb, &fl4, true);
}
static struct dst_entry *ipv4_negative_advice(struct dst_entry *dst)
static void ipv4_negative_advice(struct sock *sk,
struct dst_entry *dst)
{
struct rtable *rt = dst_rtable(dst);
struct dst_entry *ret = dst;
if (rt) {
if (dst->obsolete > 0) {
ip_rt_put(rt);
ret = NULL;
} else if ((rt->rt_flags & RTCF_REDIRECTED) ||
rt->dst.expires) {
ip_rt_put(rt);
ret = NULL;
}
}
return ret;
if ((dst->obsolete > 0) ||
(rt->rt_flags & RTCF_REDIRECTED) ||
rt->dst.expires)
sk_dst_reset(sk);
}
/*

View File

@ -1144,14 +1144,9 @@ static void tcp_v4_reqsk_send_ack(const struct sock *sk, struct sk_buff *skb,
#endif
}
/* RFC 7323 2.3
* The window field (SEG.WND) of every outgoing segment, with the
* exception of <SYN> segments, MUST be right-shifted by
* Rcv.Wind.Shift bits:
*/
tcp_v4_send_ack(sk, skb, seq,
tcp_rsk(req)->rcv_nxt,
req->rsk_rcv_wnd >> inet_rsk(req)->rcv_wscale,
tcp_synack_window(req) >> inet_rsk(req)->rcv_wscale,
tcp_rsk_tsval(tcp_rsk(req)),
READ_ONCE(req->ts_recent),
0, &key,

View File

@ -783,8 +783,11 @@ struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb,
/* RFC793: "first check sequence number". */
if (paws_reject || !tcp_in_window(TCP_SKB_CB(skb)->seq, TCP_SKB_CB(skb)->end_seq,
tcp_rsk(req)->rcv_nxt, tcp_rsk(req)->rcv_nxt + req->rsk_rcv_wnd)) {
if (paws_reject || !tcp_in_window(TCP_SKB_CB(skb)->seq,
TCP_SKB_CB(skb)->end_seq,
tcp_rsk(req)->rcv_nxt,
tcp_rsk(req)->rcv_nxt +
tcp_synack_window(req))) {
/* Out of window: send ACK and drop. */
if (!(flg & TCP_FLAG_RST) &&
!tcp_oow_rate_limited(sock_net(sk), skb,

View File

@ -236,7 +236,7 @@ INDIRECT_CALLABLE_SCOPE struct sk_buff *ipv6_gro_receive(struct list_head *head,
if (unlikely(!iph))
goto out;
NAPI_GRO_CB(skb)->inner_network_offset = off;
NAPI_GRO_CB(skb)->network_offsets[NAPI_GRO_CB(skb)->encap_mark] = off;
flush += ntohs(iph->payload_len) != skb->len - hlen;

View File

@ -87,7 +87,8 @@ struct dst_entry *ip6_dst_check(struct dst_entry *dst, u32 cookie);
static unsigned int ip6_default_advmss(const struct dst_entry *dst);
INDIRECT_CALLABLE_SCOPE
unsigned int ip6_mtu(const struct dst_entry *dst);
static struct dst_entry *ip6_negative_advice(struct dst_entry *);
static void ip6_negative_advice(struct sock *sk,
struct dst_entry *dst);
static void ip6_dst_destroy(struct dst_entry *);
static void ip6_dst_ifdown(struct dst_entry *,
struct net_device *dev);
@ -2770,24 +2771,24 @@ INDIRECT_CALLABLE_SCOPE struct dst_entry *ip6_dst_check(struct dst_entry *dst,
}
EXPORT_INDIRECT_CALLABLE(ip6_dst_check);
static struct dst_entry *ip6_negative_advice(struct dst_entry *dst)
static void ip6_negative_advice(struct sock *sk,
struct dst_entry *dst)
{
struct rt6_info *rt = dst_rt6_info(dst);
if (rt) {
if (rt->rt6i_flags & RTF_CACHE) {
rcu_read_lock();
if (rt6_check_expired(rt)) {
rt6_remove_exception_rt(rt);
dst = NULL;
}
rcu_read_unlock();
} else {
dst_release(dst);
dst = NULL;
if (rt->rt6i_flags & RTF_CACHE) {
rcu_read_lock();
if (rt6_check_expired(rt)) {
/* counteract the dst_release() in sk_dst_reset() */
dst_hold(dst);
sk_dst_reset(sk);
rt6_remove_exception_rt(rt);
}
rcu_read_unlock();
return;
}
return dst;
sk_dst_reset(sk);
}
static void ip6_link_failure(struct sk_buff *skb)

View File

@ -1272,15 +1272,10 @@ static void tcp_v6_reqsk_send_ack(const struct sock *sk, struct sk_buff *skb,
/* sk->sk_state == TCP_LISTEN -> for regular TCP_SYN_RECV
* sk->sk_state == TCP_SYN_RECV -> for Fast Open.
*/
/* RFC 7323 2.3
* The window field (SEG.WND) of every outgoing segment, with the
* exception of <SYN> segments, MUST be right-shifted by
* Rcv.Wind.Shift bits:
*/
tcp_v6_send_ack(sk, skb, (sk->sk_state == TCP_LISTEN) ?
tcp_rsk(req)->snt_isn + 1 : tcp_sk(sk)->snd_nxt,
tcp_rsk(req)->rcv_nxt,
req->rsk_rcv_wnd >> inet_rsk(req)->rcv_wscale,
tcp_synack_window(req) >> inet_rsk(req)->rcv_wscale,
tcp_rsk_tsval(tcp_rsk(req)),
READ_ONCE(req->ts_recent), sk->sk_bound_dev_if,
&key, ipv6_get_dsfield(ipv6_hdr(skb)), 0,

View File

@ -549,6 +549,9 @@ list_set_cancel_gc(struct ip_set *set)
if (SET_WITH_TIMEOUT(set))
timer_shutdown_sync(&map->gc);
/* Flush list to drop references to other ipsets */
list_set_flush(set);
}
static const struct ip_set_type_variant set_variant = {

View File

@ -169,7 +169,9 @@ instance_destroy_rcu(struct rcu_head *head)
struct nfqnl_instance *inst = container_of(head, struct nfqnl_instance,
rcu);
rcu_read_lock();
nfqnl_flush(inst, NULL, 0);
rcu_read_unlock();
kfree(inst);
module_put(THIS_MODULE);
}

View File

@ -35,11 +35,9 @@ int nft_fib_validate(const struct nft_ctx *ctx, const struct nft_expr *expr,
switch (priv->result) {
case NFT_FIB_RESULT_OIF:
case NFT_FIB_RESULT_OIFNAME:
hooks = (1 << NF_INET_PRE_ROUTING);
if (priv->flags & NFTA_FIB_F_IIF) {
hooks |= (1 << NF_INET_LOCAL_IN) |
(1 << NF_INET_FORWARD);
}
hooks = (1 << NF_INET_PRE_ROUTING) |
(1 << NF_INET_LOCAL_IN) |
(1 << NF_INET_FORWARD);
break;
case NFT_FIB_RESULT_ADDRTYPE:
if (priv->flags & NFTA_FIB_F_IIF)

View File

@ -45,36 +45,27 @@ nft_payload_copy_vlan(u32 *d, const struct sk_buff *skb, u8 offset, u8 len)
int mac_off = skb_mac_header(skb) - skb->data;
u8 *vlanh, *dst_u8 = (u8 *) d;
struct vlan_ethhdr veth;
u8 vlan_hlen = 0;
if ((skb->protocol == htons(ETH_P_8021AD) ||
skb->protocol == htons(ETH_P_8021Q)) &&
offset >= VLAN_ETH_HLEN && offset < VLAN_ETH_HLEN + VLAN_HLEN)
vlan_hlen += VLAN_HLEN;
vlanh = (u8 *) &veth;
if (offset < VLAN_ETH_HLEN + vlan_hlen) {
if (offset < VLAN_ETH_HLEN) {
u8 ethlen = len;
if (vlan_hlen &&
skb_copy_bits(skb, mac_off, &veth, VLAN_ETH_HLEN) < 0)
return false;
else if (!nft_payload_rebuild_vlan_hdr(skb, mac_off, &veth))
if (!nft_payload_rebuild_vlan_hdr(skb, mac_off, &veth))
return false;
if (offset + len > VLAN_ETH_HLEN + vlan_hlen)
ethlen -= offset + len - VLAN_ETH_HLEN - vlan_hlen;
if (offset + len > VLAN_ETH_HLEN)
ethlen -= offset + len - VLAN_ETH_HLEN;
memcpy(dst_u8, vlanh + offset - vlan_hlen, ethlen);
memcpy(dst_u8, vlanh + offset, ethlen);
len -= ethlen;
if (len == 0)
return true;
dst_u8 += ethlen;
offset = ETH_HLEN + vlan_hlen;
offset = ETH_HLEN;
} else {
offset -= VLAN_HLEN + vlan_hlen;
offset -= VLAN_HLEN;
}
return skb_copy_bits(skb, offset + mac_off, dst_u8, len) == 0;
@ -154,12 +145,12 @@ int nft_payload_inner_offset(const struct nft_pktinfo *pkt)
return pkt->inneroff;
}
static bool nft_payload_need_vlan_copy(const struct nft_payload *priv)
static bool nft_payload_need_vlan_adjust(u32 offset, u32 len)
{
unsigned int len = priv->offset + priv->len;
unsigned int boundary = offset + len;
/* data past ether src/dst requested, copy needed */
if (len > offsetof(struct ethhdr, h_proto))
if (boundary > offsetof(struct ethhdr, h_proto))
return true;
return false;
@ -183,7 +174,7 @@ void nft_payload_eval(const struct nft_expr *expr,
goto err;
if (skb_vlan_tag_present(skb) &&
nft_payload_need_vlan_copy(priv)) {
nft_payload_need_vlan_adjust(priv->offset, priv->len)) {
if (!nft_payload_copy_vlan(dest, skb,
priv->offset, priv->len))
goto err;
@ -810,21 +801,79 @@ struct nft_payload_set {
u8 csum_flags;
};
/* This is not struct vlan_hdr. */
struct nft_payload_vlan_hdr {
__be16 h_vlan_proto;
__be16 h_vlan_TCI;
};
static bool
nft_payload_set_vlan(const u32 *src, struct sk_buff *skb, u8 offset, u8 len,
int *vlan_hlen)
{
struct nft_payload_vlan_hdr *vlanh;
__be16 vlan_proto;
u16 vlan_tci;
if (offset >= offsetof(struct vlan_ethhdr, h_vlan_encapsulated_proto)) {
*vlan_hlen = VLAN_HLEN;
return true;
}
switch (offset) {
case offsetof(struct vlan_ethhdr, h_vlan_proto):
if (len == 2) {
vlan_proto = nft_reg_load_be16(src);
skb->vlan_proto = vlan_proto;
} else if (len == 4) {
vlanh = (struct nft_payload_vlan_hdr *)src;
__vlan_hwaccel_put_tag(skb, vlanh->h_vlan_proto,
ntohs(vlanh->h_vlan_TCI));
} else {
return false;
}
break;
case offsetof(struct vlan_ethhdr, h_vlan_TCI):
if (len != 2)
return false;
vlan_tci = ntohs(nft_reg_load_be16(src));
skb->vlan_tci = vlan_tci;
break;
default:
return false;
}
return true;
}
static void nft_payload_set_eval(const struct nft_expr *expr,
struct nft_regs *regs,
const struct nft_pktinfo *pkt)
{
const struct nft_payload_set *priv = nft_expr_priv(expr);
struct sk_buff *skb = pkt->skb;
const u32 *src = &regs->data[priv->sreg];
int offset, csum_offset;
int offset, csum_offset, vlan_hlen = 0;
struct sk_buff *skb = pkt->skb;
__wsum fsum, tsum;
switch (priv->base) {
case NFT_PAYLOAD_LL_HEADER:
if (!skb_mac_header_was_set(skb))
goto err;
offset = skb_mac_header(skb) - skb->data;
if (skb_vlan_tag_present(skb) &&
nft_payload_need_vlan_adjust(priv->offset, priv->len)) {
if (!nft_payload_set_vlan(src, skb,
priv->offset, priv->len,
&vlan_hlen))
goto err;
if (!vlan_hlen)
return;
}
offset = skb_mac_header(skb) - skb->data - vlan_hlen;
break;
case NFT_PAYLOAD_NETWORK_HEADER:
offset = skb_network_offset(skb);

View File

@ -1151,11 +1151,6 @@ static int parse_taprio_schedule(struct taprio_sched *q, struct nlattr **tb,
list_for_each_entry(entry, &new->entries, list)
cycle = ktime_add_ns(cycle, entry->interval);
if (!cycle) {
NL_SET_ERR_MSG(extack, "'cycle_time' can never be 0");
return -EINVAL;
}
if (cycle < 0 || cycle > INT_MAX) {
NL_SET_ERR_MSG(extack, "'cycle_time' is too big");
return -EINVAL;
@ -1164,6 +1159,11 @@ static int parse_taprio_schedule(struct taprio_sched *q, struct nlattr **tb,
new->cycle_time = cycle;
}
if (new->cycle_time < new->num_entries * length_to_duration(q, ETH_ZLEN)) {
NL_SET_ERR_MSG(extack, "'cycle_time' is too small");
return -EINVAL;
}
taprio_calculate_gate_durations(q, new);
return 0;
@ -1848,6 +1848,9 @@ static int taprio_change(struct Qdisc *sch, struct nlattr *opt,
}
q->flags = taprio_flags;
/* Needed for length_to_duration() during netlink attribute parsing */
taprio_set_picos_per_byte(dev, q);
err = taprio_parse_mqprio_opt(dev, mqprio, extack, q->flags);
if (err < 0)
return err;
@ -1907,7 +1910,6 @@ static int taprio_change(struct Qdisc *sch, struct nlattr *opt,
if (err < 0)
goto free_sched;
taprio_set_picos_per_byte(dev, q);
taprio_update_queue_max_sdu(q, new_admin, stab);
if (FULL_OFFLOAD_IS_ENABLED(q->flags))

View File

@ -731,7 +731,7 @@ static int unix_listen(struct socket *sock, int backlog)
if (sock->type != SOCK_STREAM && sock->type != SOCK_SEQPACKET)
goto out; /* Only stream/seqpacket sockets accept */
err = -EINVAL;
if (!u->addr)
if (!READ_ONCE(u->addr))
goto out; /* No listens on an unbound socket */
unix_state_lock(sk);
if (sk->sk_state != TCP_CLOSE && sk->sk_state != TCP_LISTEN)
@ -1131,8 +1131,8 @@ static struct sock *unix_find_other(struct net *net,
static int unix_autobind(struct sock *sk)
{
unsigned int new_hash, old_hash = sk->sk_hash;
struct unix_sock *u = unix_sk(sk);
unsigned int new_hash, old_hash;
struct net *net = sock_net(sk);
struct unix_address *addr;
u32 lastnum, ordernum;
@ -1155,6 +1155,7 @@ static int unix_autobind(struct sock *sk)
addr->name->sun_family = AF_UNIX;
refcount_set(&addr->refcnt, 1);
old_hash = sk->sk_hash;
ordernum = get_random_u32();
lastnum = ordernum & 0xFFFFF;
retry:
@ -1195,8 +1196,8 @@ static int unix_bind_bsd(struct sock *sk, struct sockaddr_un *sunaddr,
{
umode_t mode = S_IFSOCK |
(SOCK_INODE(sk->sk_socket)->i_mode & ~current_umask());
unsigned int new_hash, old_hash = sk->sk_hash;
struct unix_sock *u = unix_sk(sk);
unsigned int new_hash, old_hash;
struct net *net = sock_net(sk);
struct mnt_idmap *idmap;
struct unix_address *addr;
@ -1234,6 +1235,7 @@ static int unix_bind_bsd(struct sock *sk, struct sockaddr_un *sunaddr,
if (u->addr)
goto out_unlock;
old_hash = sk->sk_hash;
new_hash = unix_bsd_hash(d_backing_inode(dentry));
unix_table_double_lock(net, old_hash, new_hash);
u->path.mnt = mntget(parent.mnt);
@ -1261,8 +1263,8 @@ out:
static int unix_bind_abstract(struct sock *sk, struct sockaddr_un *sunaddr,
int addr_len)
{
unsigned int new_hash, old_hash = sk->sk_hash;
struct unix_sock *u = unix_sk(sk);
unsigned int new_hash, old_hash;
struct net *net = sock_net(sk);
struct unix_address *addr;
int err;
@ -1280,6 +1282,7 @@ static int unix_bind_abstract(struct sock *sk, struct sockaddr_un *sunaddr,
goto out_mutex;
}
old_hash = sk->sk_hash;
new_hash = unix_abstract_hash(addr->name, addr->len, sk->sk_type);
unix_table_double_lock(net, old_hash, new_hash);
@ -1369,7 +1372,7 @@ static int unix_dgram_connect(struct socket *sock, struct sockaddr *addr,
if ((test_bit(SOCK_PASSCRED, &sock->flags) ||
test_bit(SOCK_PASSPIDFD, &sock->flags)) &&
!unix_sk(sk)->addr) {
!READ_ONCE(unix_sk(sk)->addr)) {
err = unix_autobind(sk);
if (err)
goto out;
@ -1481,7 +1484,8 @@ static int unix_stream_connect(struct socket *sock, struct sockaddr *uaddr,
goto out;
if ((test_bit(SOCK_PASSCRED, &sock->flags) ||
test_bit(SOCK_PASSPIDFD, &sock->flags)) && !u->addr) {
test_bit(SOCK_PASSPIDFD, &sock->flags)) &&
!READ_ONCE(u->addr)) {
err = unix_autobind(sk);
if (err)
goto out;
@ -1950,7 +1954,8 @@ static int unix_dgram_sendmsg(struct socket *sock, struct msghdr *msg,
}
if ((test_bit(SOCK_PASSCRED, &sock->flags) ||
test_bit(SOCK_PASSPIDFD, &sock->flags)) && !u->addr) {
test_bit(SOCK_PASSPIDFD, &sock->flags)) &&
!READ_ONCE(u->addr)) {
err = unix_autobind(sk);
if (err)
goto out;

View File

@ -3910,15 +3910,10 @@ static void xfrm_link_failure(struct sk_buff *skb)
/* Impossible. Such dst must be popped before reaches point of failure. */
}
static struct dst_entry *xfrm_negative_advice(struct dst_entry *dst)
static void xfrm_negative_advice(struct sock *sk, struct dst_entry *dst)
{
if (dst) {
if (dst->obsolete) {
dst_release(dst);
dst = NULL;
}
}
return dst;
if (dst->obsolete)
sk_dst_reset(sk);
}
static void xfrm_init_pmtu(struct xfrm_dst **bundle, int nr)

View File

@ -728,7 +728,7 @@ static int sets_patch(struct object *obj)
static int symbols_patch(struct object *obj)
{
int err;
off_t err;
if (__symbols_patch(obj, &obj->structs) ||
__symbols_patch(obj, &obj->unions) ||

View File

@ -148,6 +148,7 @@ enum {
NETDEV_A_QSTATS_RX_ALLOC_FAIL,
NETDEV_A_QSTATS_RX_HW_DROPS,
NETDEV_A_QSTATS_RX_HW_DROP_OVERRUNS,
NETDEV_A_QSTATS_RX_CSUM_COMPLETE,
NETDEV_A_QSTATS_RX_CSUM_UNNECESSARY,
NETDEV_A_QSTATS_RX_CSUM_NONE,
NETDEV_A_QSTATS_RX_CSUM_BAD,

View File

@ -392,11 +392,40 @@ static int probe_uprobe_multi_link(int token_fd)
link_fd = bpf_link_create(prog_fd, -1, BPF_TRACE_UPROBE_MULTI, &link_opts);
err = -errno; /* close() can clobber errno */
if (link_fd >= 0 || err != -EBADF) {
close(link_fd);
close(prog_fd);
return 0;
}
/* Initial multi-uprobe support in kernel didn't handle PID filtering
* correctly (it was doing thread filtering, not process filtering).
* So now we'll detect if PID filtering logic was fixed, and, if not,
* we'll pretend multi-uprobes are not supported, if not.
* Multi-uprobes are used in USDT attachment logic, and we need to be
* conservative here, because multi-uprobe selection happens early at
* load time, while the use of PID filtering is known late at
* attachment time, at which point it's too late to undo multi-uprobe
* selection.
*
* Creating uprobe with pid == -1 for (invalid) '/' binary will fail
* early with -EINVAL on kernels with fixed PID filtering logic;
* otherwise -ESRCH would be returned if passed correct binary path
* (but we'll just get -BADF, of course).
*/
link_opts.uprobe_multi.pid = -1; /* invalid PID */
link_opts.uprobe_multi.path = "/"; /* invalid path */
link_opts.uprobe_multi.offsets = &offset;
link_opts.uprobe_multi.cnt = 1;
link_fd = bpf_link_create(prog_fd, -1, BPF_TRACE_UPROBE_MULTI, &link_opts);
err = -errno; /* close() can clobber errno */
if (link_fd >= 0)
close(link_fd);
close(prog_fd);
return link_fd < 0 && err == -EBADF;
return link_fd < 0 && err == -EINVAL;
}
static int probe_kern_bpf_cookie(int token_fd)

View File

@ -73,6 +73,16 @@ static int create_netkit(int mode, int policy, int peer_policy, int *ifindex,
"up primary");
ASSERT_OK(system("ip addr add dev " netkit_name " 10.0.0.1/24"),
"addr primary");
if (mode == NETKIT_L3) {
ASSERT_EQ(system("ip link set dev " netkit_name
" addr ee:ff:bb:cc:aa:dd 2> /dev/null"), 512,
"set hwaddress");
} else {
ASSERT_OK(system("ip link set dev " netkit_name
" addr ee:ff:bb:cc:aa:dd"),
"set hwaddress");
}
if (same_netns) {
ASSERT_OK(system("ip link set dev " netkit_peer " up"),
"up peer");
@ -89,6 +99,16 @@ static int create_netkit(int mode, int policy, int peer_policy, int *ifindex,
return err;
}
static void move_netkit(void)
{
ASSERT_OK(system("ip link set " netkit_peer " netns foo"),
"move peer");
ASSERT_OK(system("ip netns exec foo ip link set dev "
netkit_peer " up"), "up peer");
ASSERT_OK(system("ip netns exec foo ip addr add dev "
netkit_peer " 10.0.0.2/24"), "addr peer");
}
static void destroy_netkit(void)
{
ASSERT_OK(system("ip link del dev " netkit_name), "del primary");
@ -685,3 +705,77 @@ void serial_test_tc_netkit_neigh_links(void)
serial_test_tc_netkit_neigh_links_target(NETKIT_L2, BPF_NETKIT_PRIMARY);
serial_test_tc_netkit_neigh_links_target(NETKIT_L3, BPF_NETKIT_PRIMARY);
}
static void serial_test_tc_netkit_pkt_type_mode(int mode)
{
LIBBPF_OPTS(bpf_netkit_opts, optl_nk);
LIBBPF_OPTS(bpf_tcx_opts, optl_tcx);
int err, ifindex, ifindex2;
struct test_tc_link *skel;
struct bpf_link *link;
err = create_netkit(mode, NETKIT_PASS, NETKIT_PASS,
&ifindex, true);
if (err)
return;
ifindex2 = if_nametoindex(netkit_peer);
ASSERT_NEQ(ifindex, ifindex2, "ifindex_1_2");
skel = test_tc_link__open();
if (!ASSERT_OK_PTR(skel, "skel_open"))
goto cleanup;
ASSERT_EQ(bpf_program__set_expected_attach_type(skel->progs.tc1,
BPF_NETKIT_PRIMARY), 0, "tc1_attach_type");
ASSERT_EQ(bpf_program__set_expected_attach_type(skel->progs.tc7,
BPF_TCX_INGRESS), 0, "tc7_attach_type");
err = test_tc_link__load(skel);
if (!ASSERT_OK(err, "skel_load"))
goto cleanup;
assert_mprog_count_ifindex(ifindex, BPF_NETKIT_PRIMARY, 0);
assert_mprog_count_ifindex(ifindex2, BPF_TCX_INGRESS, 0);
link = bpf_program__attach_netkit(skel->progs.tc1, ifindex, &optl_nk);
if (!ASSERT_OK_PTR(link, "link_attach"))
goto cleanup;
skel->links.tc1 = link;
assert_mprog_count_ifindex(ifindex, BPF_NETKIT_PRIMARY, 1);
assert_mprog_count_ifindex(ifindex2, BPF_TCX_INGRESS, 0);
link = bpf_program__attach_tcx(skel->progs.tc7, ifindex2, &optl_tcx);
if (!ASSERT_OK_PTR(link, "link_attach"))
goto cleanup;
skel->links.tc7 = link;
assert_mprog_count_ifindex(ifindex, BPF_NETKIT_PRIMARY, 1);
assert_mprog_count_ifindex(ifindex2, BPF_TCX_INGRESS, 1);
move_netkit();
tc_skel_reset_all_seen(skel);
skel->bss->set_type = true;
ASSERT_EQ(send_icmp(), 0, "icmp_pkt");
ASSERT_EQ(skel->bss->seen_tc1, true, "seen_tc1");
ASSERT_EQ(skel->bss->seen_tc7, true, "seen_tc7");
ASSERT_EQ(skel->bss->seen_host, true, "seen_host");
ASSERT_EQ(skel->bss->seen_mcast, true, "seen_mcast");
cleanup:
test_tc_link__destroy(skel);
assert_mprog_count_ifindex(ifindex, BPF_NETKIT_PRIMARY, 0);
destroy_netkit();
}
void serial_test_tc_netkit_pkt_type(void)
{
serial_test_tc_netkit_pkt_type_mode(NETKIT_L2);
serial_test_tc_netkit_pkt_type_mode(NETKIT_L3);
}

View File

@ -1,12 +1,14 @@
// SPDX-License-Identifier: GPL-2.0
#include <unistd.h>
#include <pthread.h>
#include <test_progs.h>
#include "uprobe_multi.skel.h"
#include "uprobe_multi_bench.skel.h"
#include "uprobe_multi_usdt.skel.h"
#include "bpf/libbpf_internal.h"
#include "testing_helpers.h"
#include "../sdt.h"
static char test_data[] = "test_data";
@ -25,9 +27,17 @@ noinline void uprobe_multi_func_3(void)
asm volatile ("");
}
noinline void usdt_trigger(void)
{
STAP_PROBE(test, pid_filter_usdt);
}
struct child {
int go[2];
int c2p[2]; /* child -> parent channel */
int pid;
int tid;
pthread_t thread;
};
static void release_child(struct child *child)
@ -38,6 +48,10 @@ static void release_child(struct child *child)
return;
close(child->go[1]);
close(child->go[0]);
if (child->thread)
pthread_join(child->thread, NULL);
close(child->c2p[0]);
close(child->c2p[1]);
if (child->pid > 0)
waitpid(child->pid, &child_status, 0);
}
@ -63,7 +77,7 @@ static struct child *spawn_child(void)
if (pipe(child.go))
return NULL;
child.pid = fork();
child.pid = child.tid = fork();
if (child.pid < 0) {
release_child(&child);
errno = EINVAL;
@ -82,6 +96,7 @@ static struct child *spawn_child(void)
uprobe_multi_func_1();
uprobe_multi_func_2();
uprobe_multi_func_3();
usdt_trigger();
exit(errno);
}
@ -89,6 +104,67 @@ static struct child *spawn_child(void)
return &child;
}
static void *child_thread(void *ctx)
{
struct child *child = ctx;
int c = 0, err;
child->tid = syscall(SYS_gettid);
/* let parent know we are ready */
err = write(child->c2p[1], &c, 1);
if (err != 1)
pthread_exit(&err);
/* wait for parent's kick */
err = read(child->go[0], &c, 1);
if (err != 1)
pthread_exit(&err);
uprobe_multi_func_1();
uprobe_multi_func_2();
uprobe_multi_func_3();
usdt_trigger();
err = 0;
pthread_exit(&err);
}
static struct child *spawn_thread(void)
{
static struct child child;
int c, err;
/* pipe to notify child to execute the trigger functions */
if (pipe(child.go))
return NULL;
/* pipe to notify parent that child thread is ready */
if (pipe(child.c2p)) {
close(child.go[0]);
close(child.go[1]);
return NULL;
}
child.pid = getpid();
err = pthread_create(&child.thread, NULL, child_thread, &child);
if (err) {
err = -errno;
close(child.go[0]);
close(child.go[1]);
close(child.c2p[0]);
close(child.c2p[1]);
errno = -err;
return NULL;
}
err = read(child.c2p[0], &c, 1);
if (!ASSERT_EQ(err, 1, "child_thread_ready"))
return NULL;
return &child;
}
static void uprobe_multi_test_run(struct uprobe_multi *skel, struct child *child)
{
skel->bss->uprobe_multi_func_1_addr = (__u64) uprobe_multi_func_1;
@ -103,15 +179,23 @@ static void uprobe_multi_test_run(struct uprobe_multi *skel, struct child *child
* passed at the probe attach.
*/
skel->bss->pid = child ? 0 : getpid();
skel->bss->expect_pid = child ? child->pid : 0;
/* trigger all probes, if we are testing child *process*, just to make
* sure that PID filtering doesn't let through activations from wrong
* PIDs; when we test child *thread*, we don't want to do this to
* avoid double counting number of triggering events
*/
if (!child || !child->thread) {
uprobe_multi_func_1();
uprobe_multi_func_2();
uprobe_multi_func_3();
usdt_trigger();
}
if (child)
kick_child(child);
/* trigger all probes */
uprobe_multi_func_1();
uprobe_multi_func_2();
uprobe_multi_func_3();
/*
* There are 2 entry and 2 exit probe called for each uprobe_multi_func_[123]
* function and each slepable probe (6) increments uprobe_multi_sleep_result.
@ -126,8 +210,12 @@ static void uprobe_multi_test_run(struct uprobe_multi *skel, struct child *child
ASSERT_EQ(skel->bss->uprobe_multi_sleep_result, 6, "uprobe_multi_sleep_result");
if (child)
ASSERT_FALSE(skel->bss->bad_pid_seen, "bad_pid_seen");
if (child) {
ASSERT_EQ(skel->bss->child_pid, child->pid, "uprobe_multi_child_pid");
ASSERT_EQ(skel->bss->child_tid, child->tid, "uprobe_multi_child_tid");
}
}
static void test_skel_api(void)
@ -190,8 +278,24 @@ __test_attach_api(const char *binary, const char *pattern, struct bpf_uprobe_mul
if (!ASSERT_OK_PTR(skel->links.uprobe_extra, "bpf_program__attach_uprobe_multi"))
goto cleanup;
/* Attach (uprobe-backed) USDTs */
skel->links.usdt_pid = bpf_program__attach_usdt(skel->progs.usdt_pid, pid, binary,
"test", "pid_filter_usdt", NULL);
if (!ASSERT_OK_PTR(skel->links.usdt_pid, "attach_usdt_pid"))
goto cleanup;
skel->links.usdt_extra = bpf_program__attach_usdt(skel->progs.usdt_extra, -1, binary,
"test", "pid_filter_usdt", NULL);
if (!ASSERT_OK_PTR(skel->links.usdt_extra, "attach_usdt_extra"))
goto cleanup;
uprobe_multi_test_run(skel, child);
ASSERT_FALSE(skel->bss->bad_pid_seen_usdt, "bad_pid_seen_usdt");
if (child) {
ASSERT_EQ(skel->bss->child_pid_usdt, child->pid, "usdt_multi_child_pid");
ASSERT_EQ(skel->bss->child_tid_usdt, child->tid, "usdt_multi_child_tid");
}
cleanup:
uprobe_multi__destroy(skel);
}
@ -210,6 +314,13 @@ test_attach_api(const char *binary, const char *pattern, struct bpf_uprobe_multi
return;
__test_attach_api(binary, pattern, opts, child);
/* pid filter (thread) */
child = spawn_thread();
if (!ASSERT_OK_PTR(child, "spawn_thread"))
return;
__test_attach_api(binary, pattern, opts, child);
}
static void test_attach_api_pattern(void)
@ -397,7 +508,7 @@ static void test_attach_api_fails(void)
link_fd = bpf_link_create(prog_fd, 0, BPF_TRACE_UPROBE_MULTI, &opts);
if (!ASSERT_ERR(link_fd, "link_fd"))
goto cleanup;
ASSERT_EQ(link_fd, -ESRCH, "pid_is_wrong");
ASSERT_EQ(link_fd, -EINVAL, "pid_is_wrong");
cleanup:
if (link_fd >= 0)
@ -495,6 +606,13 @@ static void test_link_api(void)
return;
__test_link_api(child);
/* pid filter (thread) */
child = spawn_thread();
if (!ASSERT_OK_PTR(child, "spawn_thread"))
return;
__test_link_api(child);
}
static void test_bench_attach_uprobe(void)

View File

@ -67,6 +67,7 @@
#include "verifier_search_pruning.skel.h"
#include "verifier_sock.skel.h"
#include "verifier_sock_addr.skel.h"
#include "verifier_sockmap_mutate.skel.h"
#include "verifier_spill_fill.skel.h"
#include "verifier_spin_lock.skel.h"
#include "verifier_stack_ptr.skel.h"
@ -183,6 +184,7 @@ void test_verifier_sdiv(void) { RUN(verifier_sdiv); }
void test_verifier_search_pruning(void) { RUN(verifier_search_pruning); }
void test_verifier_sock(void) { RUN(verifier_sock); }
void test_verifier_sock_addr(void) { RUN(verifier_sock_addr); }
void test_verifier_sockmap_mutate(void) { RUN(verifier_sockmap_mutate); }
void test_verifier_spill_fill(void) { RUN(verifier_spill_fill); }
void test_verifier_spin_lock(void) { RUN(verifier_spin_lock); }
void test_verifier_stack_ptr(void) { RUN(verifier_stack_ptr); }

View File

@ -4,7 +4,8 @@
#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <linux/stddef.h>
#include <linux/if_packet.h>
#include <bpf/bpf_endian.h>
#include <bpf/bpf_helpers.h>
@ -16,7 +17,13 @@ bool seen_tc3;
bool seen_tc4;
bool seen_tc5;
bool seen_tc6;
bool seen_tc7;
bool set_type;
bool seen_eth;
bool seen_host;
bool seen_mcast;
SEC("tc/ingress")
int tc1(struct __sk_buff *skb)
@ -28,8 +35,16 @@ int tc1(struct __sk_buff *skb)
if (bpf_skb_load_bytes(skb, 0, &eth, sizeof(eth)))
goto out;
seen_eth = eth.h_proto == bpf_htons(ETH_P_IP);
seen_host = skb->pkt_type == PACKET_HOST;
if (seen_host && set_type) {
eth.h_dest[0] = 4;
if (bpf_skb_store_bytes(skb, 0, &eth, sizeof(eth), 0))
goto fail;
bpf_skb_change_type(skb, PACKET_MULTICAST);
}
out:
seen_tc1 = true;
fail:
return TCX_NEXT;
}
@ -67,3 +82,21 @@ int tc6(struct __sk_buff *skb)
seen_tc6 = true;
return TCX_PASS;
}
SEC("tc/ingress")
int tc7(struct __sk_buff *skb)
{
struct ethhdr eth = {};
if (skb->protocol != __bpf_constant_htons(ETH_P_IP))
goto out;
if (bpf_skb_load_bytes(skb, 0, &eth, sizeof(eth)))
goto out;
if (eth.h_dest[0] == 4 && set_type) {
seen_mcast = skb->pkt_type == PACKET_MULTICAST;
bpf_skb_change_type(skb, PACKET_HOST);
}
out:
seen_tc7 = true;
return TCX_PASS;
}

View File

@ -1,8 +1,8 @@
// SPDX-License-Identifier: GPL-2.0
#include <linux/bpf.h>
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>
#include <stdbool.h>
#include <bpf/usdt.bpf.h>
char _license[] SEC("license") = "GPL";
@ -22,6 +22,13 @@ __u64 uprobe_multi_sleep_result = 0;
int pid = 0;
int child_pid = 0;
int child_tid = 0;
int child_pid_usdt = 0;
int child_tid_usdt = 0;
int expect_pid = 0;
bool bad_pid_seen = false;
bool bad_pid_seen_usdt = false;
bool test_cookie = false;
void *user_ptr = 0;
@ -36,11 +43,19 @@ static __always_inline bool verify_sleepable_user_copy(void)
static void uprobe_multi_check(void *ctx, bool is_return, bool is_sleep)
{
child_pid = bpf_get_current_pid_tgid() >> 32;
__u64 cur_pid_tgid = bpf_get_current_pid_tgid();
__u32 cur_pid;
if (pid && child_pid != pid)
cur_pid = cur_pid_tgid >> 32;
if (pid && cur_pid != pid)
return;
if (expect_pid && cur_pid != expect_pid)
bad_pid_seen = true;
child_pid = cur_pid_tgid >> 32;
child_tid = (__u32)cur_pid_tgid;
__u64 cookie = test_cookie ? bpf_get_attach_cookie(ctx) : 0;
__u64 addr = bpf_get_func_ip(ctx);
@ -97,5 +112,32 @@ int uretprobe_sleep(struct pt_regs *ctx)
SEC("uprobe.multi//proc/self/exe:uprobe_multi_func_*")
int uprobe_extra(struct pt_regs *ctx)
{
/* we need this one just to mix PID-filtered and global uprobes */
return 0;
}
SEC("usdt")
int usdt_pid(struct pt_regs *ctx)
{
__u64 cur_pid_tgid = bpf_get_current_pid_tgid();
__u32 cur_pid;
cur_pid = cur_pid_tgid >> 32;
if (pid && cur_pid != pid)
return 0;
if (expect_pid && cur_pid != expect_pid)
bad_pid_seen_usdt = true;
child_pid_usdt = cur_pid_tgid >> 32;
child_tid_usdt = (__u32)cur_pid_tgid;
return 0;
}
SEC("usdt")
int usdt_extra(struct pt_regs *ctx)
{
/* we need this one just to mix PID-filtered and global USDT probes */
return 0;
}

View File

@ -0,0 +1,187 @@
// SPDX-License-Identifier: GPL-2.0
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>
#include "bpf_misc.h"
#define __always_unused __attribute__((unused))
char _license[] SEC("license") = "GPL";
struct sock {
} __attribute__((preserve_access_index));
struct bpf_iter__sockmap {
union {
struct sock *sk;
};
} __attribute__((preserve_access_index));
struct {
__uint(type, BPF_MAP_TYPE_SOCKHASH);
__uint(max_entries, 1);
__type(key, int);
__type(value, int);
} sockhash SEC(".maps");
struct {
__uint(type, BPF_MAP_TYPE_SOCKMAP);
__uint(max_entries, 1);
__type(key, int);
__type(value, int);
} sockmap SEC(".maps");
enum { CG_OK = 1 };
int zero = 0;
static __always_inline void test_sockmap_delete(void)
{
bpf_map_delete_elem(&sockmap, &zero);
bpf_map_delete_elem(&sockhash, &zero);
}
static __always_inline void test_sockmap_update(void *sk)
{
if (sk) {
bpf_map_update_elem(&sockmap, &zero, sk, BPF_ANY);
bpf_map_update_elem(&sockhash, &zero, sk, BPF_ANY);
}
}
static __always_inline void test_sockmap_lookup_and_update(void)
{
struct bpf_sock *sk = bpf_map_lookup_elem(&sockmap, &zero);
if (sk) {
test_sockmap_update(sk);
bpf_sk_release(sk);
}
}
static __always_inline void test_sockmap_mutate(void *sk)
{
test_sockmap_delete();
test_sockmap_update(sk);
}
static __always_inline void test_sockmap_lookup_and_mutate(void)
{
test_sockmap_delete();
test_sockmap_lookup_and_update();
}
SEC("action")
__success
int test_sched_act(struct __sk_buff *skb)
{
test_sockmap_mutate(skb->sk);
return 0;
}
SEC("classifier")
__success
int test_sched_cls(struct __sk_buff *skb)
{
test_sockmap_mutate(skb->sk);
return 0;
}
SEC("flow_dissector")
__success
int test_flow_dissector_delete(struct __sk_buff *skb __always_unused)
{
test_sockmap_delete();
return 0;
}
SEC("flow_dissector")
__failure __msg("program of this type cannot use helper bpf_sk_release")
int test_flow_dissector_update(struct __sk_buff *skb __always_unused)
{
test_sockmap_lookup_and_update(); /* no access to skb->sk */
return 0;
}
SEC("iter/sockmap")
__success
int test_trace_iter(struct bpf_iter__sockmap *ctx)
{
test_sockmap_mutate(ctx->sk);
return 0;
}
SEC("raw_tp/kfree")
__failure __msg("cannot update sockmap in this context")
int test_raw_tp_delete(const void *ctx __always_unused)
{
test_sockmap_delete();
return 0;
}
SEC("raw_tp/kfree")
__failure __msg("cannot update sockmap in this context")
int test_raw_tp_update(const void *ctx __always_unused)
{
test_sockmap_lookup_and_update();
return 0;
}
SEC("sk_lookup")
__success
int test_sk_lookup(struct bpf_sk_lookup *ctx)
{
test_sockmap_mutate(ctx->sk);
return 0;
}
SEC("sk_reuseport")
__success
int test_sk_reuseport(struct sk_reuseport_md *ctx)
{
test_sockmap_mutate(ctx->sk);
return 0;
}
SEC("socket")
__success
int test_socket_filter(struct __sk_buff *skb)
{
test_sockmap_mutate(skb->sk);
return 0;
}
SEC("sockops")
__success
int test_sockops_delete(struct bpf_sock_ops *ctx __always_unused)
{
test_sockmap_delete();
return CG_OK;
}
SEC("sockops")
__failure __msg("cannot update sockmap in this context")
int test_sockops_update(struct bpf_sock_ops *ctx)
{
test_sockmap_update(ctx->sk);
return CG_OK;
}
SEC("sockops")
__success
int test_sockops_update_dedicated(struct bpf_sock_ops *ctx)
{
bpf_sock_map_update(ctx, &sockmap, &zero, BPF_ANY);
bpf_sock_hash_update(ctx, &sockhash, &zero, BPF_ANY);
return CG_OK;
}
SEC("xdp")
__success
int test_xdp(struct xdp_md *ctx __always_unused)
{
test_sockmap_lookup_and_mutate();
return XDP_PASS;
}

View File

@ -174,6 +174,8 @@ trap cleanup_all_ns EXIT
setup_hsr_interfaces 0
do_complete_ping_test
setup_ns ns1 ns2 ns3
setup_hsr_interfaces 1
do_complete_ping_test

View File

@ -261,6 +261,8 @@ reset()
TEST_NAME="${1}"
MPTCP_LIB_SUBTEST_FLAKY=0 # reset if modified
if skip_test; then
MPTCP_LIB_TEST_COUNTER=$((MPTCP_LIB_TEST_COUNTER+1))
last_test_ignored=1
@ -448,7 +450,9 @@ reset_with_tcp_filter()
# $1: err msg
fail_test()
{
ret=${KSFT_FAIL}
if ! mptcp_lib_subtest_is_flaky; then
ret=${KSFT_FAIL}
fi
if [ ${#} -gt 0 ]; then
print_fail "${@}"
@ -3069,6 +3073,7 @@ fullmesh_tests()
fastclose_tests()
{
if reset_check_counter "fastclose test" "MPTcpExtMPFastcloseTx"; then
MPTCP_LIB_SUBTEST_FLAKY=1
test_linkfail=1024 fastclose=client \
run_tests $ns1 $ns2 10.0.1.1
chk_join_nr 0 0 0
@ -3077,6 +3082,7 @@ fastclose_tests()
fi
if reset_check_counter "fastclose server test" "MPTcpExtMPFastcloseRx"; then
MPTCP_LIB_SUBTEST_FLAKY=1
test_linkfail=1024 fastclose=server \
run_tests $ns1 $ns2 10.0.1.1
chk_join_nr 0 0 0 0 0 0 1
@ -3095,6 +3101,7 @@ fail_tests()
{
# single subflow
if reset_with_fail "Infinite map" 1; then
MPTCP_LIB_SUBTEST_FLAKY=1
test_linkfail=128 \
run_tests $ns1 $ns2 10.0.1.1
chk_join_nr 0 0 0 +1 +0 1 0 1 "$(pedit_action_pkts)"
@ -3103,6 +3110,7 @@ fail_tests()
# multiple subflows
if reset_with_fail "MP_FAIL MP_RST" 2; then
MPTCP_LIB_SUBTEST_FLAKY=1
tc -n $ns2 qdisc add dev ns2eth1 root netem rate 1mbit delay 5ms
pm_nl_set_limits $ns1 0 1
pm_nl_set_limits $ns2 0 1

View File

@ -21,6 +21,7 @@ declare -rx MPTCP_LIB_AF_INET6=10
MPTCP_LIB_SUBTESTS=()
MPTCP_LIB_SUBTESTS_DUPLICATED=0
MPTCP_LIB_SUBTEST_FLAKY=0
MPTCP_LIB_TEST_COUNTER=0
MPTCP_LIB_TEST_FORMAT="%02u %-50s"
MPTCP_LIB_IP_MPTCP=0
@ -41,6 +42,16 @@ else
readonly MPTCP_LIB_COLOR_RESET=
fi
# SELFTESTS_MPTCP_LIB_OVERRIDE_FLAKY env var can be set not to ignore errors
# from subtests marked as flaky
mptcp_lib_override_flaky() {
[ "${SELFTESTS_MPTCP_LIB_OVERRIDE_FLAKY:-}" = 1 ]
}
mptcp_lib_subtest_is_flaky() {
[ "${MPTCP_LIB_SUBTEST_FLAKY}" = 1 ] && ! mptcp_lib_override_flaky
}
# $1: color, $2: text
mptcp_lib_print_color() {
echo -e "${MPTCP_LIB_START_PRINT:-}${*}${MPTCP_LIB_COLOR_RESET}"
@ -72,7 +83,16 @@ mptcp_lib_pr_skip() {
}
mptcp_lib_pr_fail() {
mptcp_lib_print_err "[FAIL]${1:+ ${*}}"
local title cmt
if mptcp_lib_subtest_is_flaky; then
title="IGNO"
cmt=" (flaky)"
else
title="FAIL"
fi
mptcp_lib_print_err "[${title}]${cmt}${1:+ ${*}}"
}
mptcp_lib_pr_info() {
@ -208,7 +228,13 @@ mptcp_lib_result_pass() {
# $1: test name
mptcp_lib_result_fail() {
__mptcp_lib_result_add "not ok" "${1}"
if mptcp_lib_subtest_is_flaky; then
# It might sound better to use 'not ok # TODO' or 'ok # SKIP',
# but some CIs don't understand 'TODO' and treat SKIP as errors.
__mptcp_lib_result_add "ok" "${1} # IGNORE Flaky"
else
__mptcp_lib_result_add "not ok" "${1}"
fi
}
# $1: test name

View File

@ -244,7 +244,7 @@ run_test()
do_transfer $small $large $time
lret=$?
mptcp_lib_result_code "${lret}" "${msg}"
if [ $lret -ne 0 ]; then
if [ $lret -ne 0 ] && ! mptcp_lib_subtest_is_flaky; then
ret=$lret
[ $bail -eq 0 ] || exit $ret
fi
@ -254,7 +254,7 @@ run_test()
do_transfer $large $small $time
lret=$?
mptcp_lib_result_code "${lret}" "${msg}"
if [ $lret -ne 0 ]; then
if [ $lret -ne 0 ] && ! mptcp_lib_subtest_is_flaky; then
ret=$lret
[ $bail -eq 0 ] || exit $ret
fi
@ -290,7 +290,7 @@ run_test 10 10 0 0 "balanced bwidth"
run_test 10 10 1 25 "balanced bwidth with unbalanced delay"
# we still need some additional infrastructure to pass the following test-cases
run_test 10 3 0 0 "unbalanced bwidth"
MPTCP_LIB_SUBTEST_FLAKY=1 run_test 10 3 0 0 "unbalanced bwidth"
run_test 10 3 1 25 "unbalanced bwidth with unbalanced delay"
run_test 10 3 25 1 "unbalanced bwidth with opposed, unbalanced delay"

View File

@ -132,6 +132,50 @@
"echo \"1\" > /sys/bus/netdevsim/del_device"
]
},
{
"id": "6f62",
"name": "Add taprio Qdisc with too short interval",
"category": [
"qdisc",
"taprio"
],
"plugins": {
"requires": "nsPlugin"
},
"setup": [
"echo \"1 1 8\" > /sys/bus/netdevsim/new_device"
],
"cmdUnderTest": "$TC qdisc add dev $ETH root handle 1: taprio num_tc 2 queues 1@0 1@1 sched-entry S 01 300 sched-entry S 02 1700 clockid CLOCK_TAI",
"expExitCode": "2",
"verifyCmd": "$TC qdisc show dev $ETH",
"matchPattern": "qdisc taprio 1: root refcnt",
"matchCount": "0",
"teardown": [
"echo \"1\" > /sys/bus/netdevsim/del_device"
]
},
{
"id": "831f",
"name": "Add taprio Qdisc with too short cycle-time",
"category": [
"qdisc",
"taprio"
],
"plugins": {
"requires": "nsPlugin"
},
"setup": [
"echo \"1 1 8\" > /sys/bus/netdevsim/new_device"
],
"cmdUnderTest": "$TC qdisc add dev $ETH root handle 1: taprio num_tc 2 queues 1@0 1@1 sched-entry S 01 200000 sched-entry S 02 200000 cycle-time 100 clockid CLOCK_TAI",
"expExitCode": "2",
"verifyCmd": "$TC qdisc show dev $ETH",
"matchPattern": "qdisc taprio 1: root refcnt",
"matchCount": "0",
"teardown": [
"echo \"1\" > /sys/bus/netdevsim/del_device"
]
},
{
"id": "3e1e",
"name": "Add taprio Qdisc with an invalid cycle-time",