linux/drivers/net/dsa/sja1105/sja1105_main.c

3592 lines
99 KiB
C
Raw Normal View History

// SPDX-License-Identifier: GPL-2.0
/* Copyright (c) 2018, Sensor-Technik Wiedemann GmbH
* Copyright (c) 2018-2019, Vladimir Oltean <olteanv@gmail.com>
*/
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
#include <linux/delay.h>
#include <linux/module.h>
#include <linux/printk.h>
#include <linux/spi/spi.h>
#include <linux/errno.h>
#include <linux/gpio/consumer.h>
#include <linux/phylink.h>
#include <linux/of.h>
#include <linux/of_net.h>
#include <linux/of_mdio.h>
#include <linux/of_device.h>
#include <linux/netdev_features.h>
#include <linux/netdevice.h>
#include <linux/if_bridge.h>
#include <linux/if_ether.h>
#include <linux/dsa/8021q.h>
#include "sja1105.h"
net: dsa: sja1105: Add support for the SGMII port SJA1105 switches R and S have one SerDes port with an 802.3z quasi-compatible PCS, hardwired on port 4. The other ports are still MII/RMII/RGMII. The PCS performs rate adaptation to lower link speeds; the MAC on this port is hardwired at gigabit. Only full duplex is supported. The SGMII port can be configured as part of the static config tables, as well as through a dedicated SPI address region for its pseudo-clause-22 registers. However it looks like the static configuration is not able to change some out-of-reset values (like the value of MII_BMCR), so at the end of the day, having code for it is utterly pointless. We are just going to use the pseudo-C22 interface. Because the PCS gets reset when the switch resets, we have to add even more restoration logic to sja1105_static_config_reload, otherwise the SGMII port breaks after operations such as enabling PTP timestamping which require a switch reset. >From PHYLINK perspective, the switch supports *only* SGMII (it doesn't support 1000Base-X). It also doesn't expose access to the raw config word for in-band AN in registers MII_ADV/MII_LPA. It is able to work in the following modes: - Forced speed - SGMII in-band AN slave (speed received from PHY) - SGMII in-band AN master (acting as a PHY) The latter mode is not supported by this patch. It is even unclear to me how that would be described. There is some code for it left in the patch, but 'an_master' is always passed as false. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-20 11:29:37 +00:00
#include "sja1105_sgmii.h"
net: dsa: sja1105: Configure the Time-Aware Scheduler via tc-taprio offload This qdisc offload is the closest thing to what the SJA1105 supports in hardware for time-based egress shaping. The switch core really is built around SAE AS6802/TTEthernet (a TTTech standard) but can be made to operate similarly to IEEE 802.1Qbv with some constraints: - The gate control list is a global list for all ports. There are 8 execution threads that iterate through this global list in parallel. I don't know why 8, there are only 4 front-panel ports. - Care must be taken by the user to make sure that two execution threads never get to execute a GCL entry simultaneously. I created a O(n^4) checker for this hardware limitation, prior to accepting a taprio offload configuration as valid. - The spec says that if a GCL entry's interval is shorter than the frame length, you shouldn't send it (and end up in head-of-line blocking). Well, this switch does anyway. - The switch has no concept of ADMIN and OPER configurations. Because it's so simple, the TAS settings are loaded through the static config tables interface, so there isn't even place for any discussion about 'graceful switchover between ADMIN and OPER'. You just reset the switch and upload a new OPER config. - The switch accepts multiple time sources for the gate events. Right now I am using the standalone clock source as opposed to PTP. So the base time parameter doesn't really do much. Support for the PTP clock source will be added in a future series. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-15 02:00:02 +00:00
#include "sja1105_tas.h"
net: dsa: sja1105: implement cross-chip bridging operations sja1105 uses dsa_8021q for DSA tagging, a format which is VLAN at heart and which is compatible with cascading. A complete description of this tagging format is in net/dsa/tag_8021q.c, but a quick summary is that each external-facing port tags incoming frames with a unique pvid, and this special VLAN is transmitted as tagged towards the inside of the system, and as untagged towards the exterior. The tag encodes the switch id and the source port index. This means that cross-chip bridging for dsa_8021q only entails adding the dsa_8021q pvids of one switch to the RX filter of the other switches. Everything else falls naturally into place, as long as the bottom-end of ports (the leaves in the tree) is comprised exclusively of dsa_8021q-compatible (i.e. sja1105 switches). Otherwise, there would be a chance that a front-panel switch transmits a packet tagged with a dsa_8021q header, header which it wouldn't be able to remove, and which would hence "leak" out. The only use case I tested (due to lack of board availability) was when the sja1105 switches are part of disjoint trees (however, this doesn't change the fact that multiple sja1105 switches still need unique switch identifiers in such a system). But in principle, even "true" single-tree setups (with DSA links) should work just as fine, except for a small change which I can't test: dsa_towards_port should be used instead of dsa_upstream_port (I made the assumption that the routing port that any sja1105 should use towards its neighbours is the CPU port. That might not hold true in other setups). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-10 16:37:43 +00:00
static const struct dsa_switch_ops sja1105_switch_ops;
static void sja1105_hw_reset(struct gpio_desc *gpio, unsigned int pulse_len,
unsigned int startup_delay)
{
gpiod_set_value_cansleep(gpio, 1);
/* Wait for minimum reset pulse length */
msleep(pulse_len);
gpiod_set_value_cansleep(gpio, 0);
/* Wait until chip is ready after reset */
msleep(startup_delay);
}
static void
sja1105_port_allow_traffic(struct sja1105_l2_forwarding_entry *l2_fwd,
int from, int to, bool allow)
{
if (allow) {
l2_fwd[from].bc_domain |= BIT(to);
l2_fwd[from].reach_port |= BIT(to);
l2_fwd[from].fl_domain |= BIT(to);
} else {
l2_fwd[from].bc_domain &= ~BIT(to);
l2_fwd[from].reach_port &= ~BIT(to);
l2_fwd[from].fl_domain &= ~BIT(to);
}
}
/* Structure used to temporarily transport device tree
* settings into sja1105_setup
*/
struct sja1105_dt_port {
phy_interface_t phy_mode;
sja1105_mii_role_t role;
};
static int sja1105_init_mac_settings(struct sja1105_private *priv)
{
struct sja1105_mac_config_entry default_mac = {
/* Enable all 8 priority queues on egress.
* Every queue i holds top[i] - base[i] frames.
* Sum of top[i] - base[i] is 511 (max hardware limit).
*/
.top = {0x3F, 0x7F, 0xBF, 0xFF, 0x13F, 0x17F, 0x1BF, 0x1FF},
.base = {0x0, 0x40, 0x80, 0xC0, 0x100, 0x140, 0x180, 0x1C0},
.enabled = {true, true, true, true, true, true, true, true},
/* Keep standard IFG of 12 bytes on egress. */
.ifg = 0,
/* Always put the MAC speed in automatic mode, where it can be
* adjusted at runtime by PHYLINK.
*/
.speed = SJA1105_SPEED_AUTO,
/* No static correction for 1-step 1588 events */
.tp_delin = 0,
.tp_delout = 0,
/* Disable aging for critical TTEthernet traffic */
.maxage = 0xFF,
/* Internal VLAN (pvid) to apply to untagged ingress */
.vlanprio = 0,
.vlanid = 1,
.ing_mirr = false,
.egr_mirr = false,
/* Don't drop traffic with other EtherType than ETH_P_IP */
.drpnona664 = false,
/* Don't drop double-tagged traffic */
.drpdtag = false,
/* Don't drop untagged traffic */
.drpuntag = false,
/* Don't retag 802.1p (VID 0) traffic with the pvid */
.retag = false,
net: dsa: sja1105: Add support for Spanning Tree Protocol While not explicitly documented as supported in UM10944, compliance with the STP states can be obtained by manipulating 3 settings at the (per-port) MAC config level: dynamic learning, inhibiting reception of regular traffic, and inhibiting transmission of regular traffic. In all these modes, transmission and reception of special BPDU frames from the stack is still enabled (not inhibited by the MAC-level settings). On ingress, BPDUs are classified by the MAC filter as link-local (01-80-C2-00-00-00) and forwarded to the CPU port. This mechanism works under all conditions (even without the custom 802.1Q tagging) because the switch hardware inserts the source port and switch ID into bytes 4 and 5 of the MAC-filtered frames. Then the DSA .rcv handler needs to put back zeroes into the MAC address after decoding the source port information. On egress, BPDUs are transmitted using management routes from the xmit worker thread. Again this does not require switch tagging, as the switch port is programmed through SPI to hold a temporary (single-fire) route for a frame with the programmed destination MAC (01-80-C2-00-00-00). STP is activated using the following commands and was tested by connecting two front-panel ports together and noticing that switching loops were prevented (one port remains in the blocking state): $ ip link add name br0 type bridge stp_state 1 && ip link set br0 up $ for eth in $(ls /sys/devices/platform/soc/2100000.spi/spi_master/spi0/spi0.1/net/); do ip link set ${eth} master br0 && ip link set ${eth} up; done Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-05 10:19:28 +00:00
/* Disable learning and I/O on user ports by default -
* STP will enable it.
*/
.dyn_learn = false,
.egress = false,
.ingress = false,
};
struct sja1105_mac_config_entry *mac;
struct sja1105_table *table;
int i;
table = &priv->static_config.tables[BLK_IDX_MAC_CONFIG];
/* Discard previous MAC Configuration Table */
if (table->entry_count) {
kfree(table->entries);
table->entry_count = 0;
}
table->entries = kcalloc(SJA1105_NUM_PORTS,
table->ops->unpacked_entry_size, GFP_KERNEL);
if (!table->entries)
return -ENOMEM;
table->entry_count = SJA1105_NUM_PORTS;
mac = table->entries;
net: dsa: sja1105: Add support for Spanning Tree Protocol While not explicitly documented as supported in UM10944, compliance with the STP states can be obtained by manipulating 3 settings at the (per-port) MAC config level: dynamic learning, inhibiting reception of regular traffic, and inhibiting transmission of regular traffic. In all these modes, transmission and reception of special BPDU frames from the stack is still enabled (not inhibited by the MAC-level settings). On ingress, BPDUs are classified by the MAC filter as link-local (01-80-C2-00-00-00) and forwarded to the CPU port. This mechanism works under all conditions (even without the custom 802.1Q tagging) because the switch hardware inserts the source port and switch ID into bytes 4 and 5 of the MAC-filtered frames. Then the DSA .rcv handler needs to put back zeroes into the MAC address after decoding the source port information. On egress, BPDUs are transmitted using management routes from the xmit worker thread. Again this does not require switch tagging, as the switch port is programmed through SPI to hold a temporary (single-fire) route for a frame with the programmed destination MAC (01-80-C2-00-00-00). STP is activated using the following commands and was tested by connecting two front-panel ports together and noticing that switching loops were prevented (one port remains in the blocking state): $ ip link add name br0 type bridge stp_state 1 && ip link set br0 up $ for eth in $(ls /sys/devices/platform/soc/2100000.spi/spi_master/spi0/spi0.1/net/); do ip link set ${eth} master br0 && ip link set ${eth} up; done Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-05 10:19:28 +00:00
for (i = 0; i < SJA1105_NUM_PORTS; i++) {
mac[i] = default_mac;
net: dsa: sja1105: Add support for Spanning Tree Protocol While not explicitly documented as supported in UM10944, compliance with the STP states can be obtained by manipulating 3 settings at the (per-port) MAC config level: dynamic learning, inhibiting reception of regular traffic, and inhibiting transmission of regular traffic. In all these modes, transmission and reception of special BPDU frames from the stack is still enabled (not inhibited by the MAC-level settings). On ingress, BPDUs are classified by the MAC filter as link-local (01-80-C2-00-00-00) and forwarded to the CPU port. This mechanism works under all conditions (even without the custom 802.1Q tagging) because the switch hardware inserts the source port and switch ID into bytes 4 and 5 of the MAC-filtered frames. Then the DSA .rcv handler needs to put back zeroes into the MAC address after decoding the source port information. On egress, BPDUs are transmitted using management routes from the xmit worker thread. Again this does not require switch tagging, as the switch port is programmed through SPI to hold a temporary (single-fire) route for a frame with the programmed destination MAC (01-80-C2-00-00-00). STP is activated using the following commands and was tested by connecting two front-panel ports together and noticing that switching loops were prevented (one port remains in the blocking state): $ ip link add name br0 type bridge stp_state 1 && ip link set br0 up $ for eth in $(ls /sys/devices/platform/soc/2100000.spi/spi_master/spi0/spi0.1/net/); do ip link set ${eth} master br0 && ip link set ${eth} up; done Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-05 10:19:28 +00:00
if (i == dsa_upstream_port(priv->ds, i)) {
/* STP doesn't get called for CPU port, so we need to
* set the I/O parameters statically.
*/
mac[i].dyn_learn = true;
mac[i].ingress = true;
mac[i].egress = true;
}
}
return 0;
}
net: dsa: sja1105: Add support for the SGMII port SJA1105 switches R and S have one SerDes port with an 802.3z quasi-compatible PCS, hardwired on port 4. The other ports are still MII/RMII/RGMII. The PCS performs rate adaptation to lower link speeds; the MAC on this port is hardwired at gigabit. Only full duplex is supported. The SGMII port can be configured as part of the static config tables, as well as through a dedicated SPI address region for its pseudo-clause-22 registers. However it looks like the static configuration is not able to change some out-of-reset values (like the value of MII_BMCR), so at the end of the day, having code for it is utterly pointless. We are just going to use the pseudo-C22 interface. Because the PCS gets reset when the switch resets, we have to add even more restoration logic to sja1105_static_config_reload, otherwise the SGMII port breaks after operations such as enabling PTP timestamping which require a switch reset. >From PHYLINK perspective, the switch supports *only* SGMII (it doesn't support 1000Base-X). It also doesn't expose access to the raw config word for in-band AN in registers MII_ADV/MII_LPA. It is able to work in the following modes: - Forced speed - SGMII in-band AN slave (speed received from PHY) - SGMII in-band AN master (acting as a PHY) The latter mode is not supported by this patch. It is even unclear to me how that would be described. There is some code for it left in the patch, but 'an_master' is always passed as false. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-20 11:29:37 +00:00
static bool sja1105_supports_sgmii(struct sja1105_private *priv, int port)
{
if (priv->info->part_no != SJA1105R_PART_NO &&
priv->info->part_no != SJA1105S_PART_NO)
return false;
if (port != SJA1105_SGMII_PORT)
return false;
if (dsa_is_unused_port(priv->ds, port))
return false;
return true;
}
static int sja1105_init_mii_settings(struct sja1105_private *priv,
struct sja1105_dt_port *ports)
{
struct device *dev = &priv->spidev->dev;
struct sja1105_xmii_params_entry *mii;
struct sja1105_table *table;
int i;
table = &priv->static_config.tables[BLK_IDX_XMII_PARAMS];
/* Discard previous xMII Mode Parameters Table */
if (table->entry_count) {
kfree(table->entries);
table->entry_count = 0;
}
table->entries = kcalloc(SJA1105_MAX_XMII_PARAMS_COUNT,
table->ops->unpacked_entry_size, GFP_KERNEL);
if (!table->entries)
return -ENOMEM;
/* Override table based on PHYLINK DT bindings */
table->entry_count = SJA1105_MAX_XMII_PARAMS_COUNT;
mii = table->entries;
for (i = 0; i < SJA1105_NUM_PORTS; i++) {
if (dsa_is_unused_port(priv->ds, i))
continue;
switch (ports[i].phy_mode) {
case PHY_INTERFACE_MODE_MII:
mii->xmii_mode[i] = XMII_MODE_MII;
break;
case PHY_INTERFACE_MODE_RMII:
mii->xmii_mode[i] = XMII_MODE_RMII;
break;
case PHY_INTERFACE_MODE_RGMII:
case PHY_INTERFACE_MODE_RGMII_ID:
case PHY_INTERFACE_MODE_RGMII_RXID:
case PHY_INTERFACE_MODE_RGMII_TXID:
mii->xmii_mode[i] = XMII_MODE_RGMII;
break;
net: dsa: sja1105: Add support for the SGMII port SJA1105 switches R and S have one SerDes port with an 802.3z quasi-compatible PCS, hardwired on port 4. The other ports are still MII/RMII/RGMII. The PCS performs rate adaptation to lower link speeds; the MAC on this port is hardwired at gigabit. Only full duplex is supported. The SGMII port can be configured as part of the static config tables, as well as through a dedicated SPI address region for its pseudo-clause-22 registers. However it looks like the static configuration is not able to change some out-of-reset values (like the value of MII_BMCR), so at the end of the day, having code for it is utterly pointless. We are just going to use the pseudo-C22 interface. Because the PCS gets reset when the switch resets, we have to add even more restoration logic to sja1105_static_config_reload, otherwise the SGMII port breaks after operations such as enabling PTP timestamping which require a switch reset. >From PHYLINK perspective, the switch supports *only* SGMII (it doesn't support 1000Base-X). It also doesn't expose access to the raw config word for in-band AN in registers MII_ADV/MII_LPA. It is able to work in the following modes: - Forced speed - SGMII in-band AN slave (speed received from PHY) - SGMII in-band AN master (acting as a PHY) The latter mode is not supported by this patch. It is even unclear to me how that would be described. There is some code for it left in the patch, but 'an_master' is always passed as false. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-20 11:29:37 +00:00
case PHY_INTERFACE_MODE_SGMII:
if (!sja1105_supports_sgmii(priv, i))
return -EINVAL;
mii->xmii_mode[i] = XMII_MODE_SGMII;
break;
default:
dev_err(dev, "Unsupported PHY mode %s!\n",
phy_modes(ports[i].phy_mode));
}
net: dsa: sja1105: Add support for the SGMII port SJA1105 switches R and S have one SerDes port with an 802.3z quasi-compatible PCS, hardwired on port 4. The other ports are still MII/RMII/RGMII. The PCS performs rate adaptation to lower link speeds; the MAC on this port is hardwired at gigabit. Only full duplex is supported. The SGMII port can be configured as part of the static config tables, as well as through a dedicated SPI address region for its pseudo-clause-22 registers. However it looks like the static configuration is not able to change some out-of-reset values (like the value of MII_BMCR), so at the end of the day, having code for it is utterly pointless. We are just going to use the pseudo-C22 interface. Because the PCS gets reset when the switch resets, we have to add even more restoration logic to sja1105_static_config_reload, otherwise the SGMII port breaks after operations such as enabling PTP timestamping which require a switch reset. >From PHYLINK perspective, the switch supports *only* SGMII (it doesn't support 1000Base-X). It also doesn't expose access to the raw config word for in-band AN in registers MII_ADV/MII_LPA. It is able to work in the following modes: - Forced speed - SGMII in-band AN slave (speed received from PHY) - SGMII in-band AN master (acting as a PHY) The latter mode is not supported by this patch. It is even unclear to me how that would be described. There is some code for it left in the patch, but 'an_master' is always passed as false. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-20 11:29:37 +00:00
/* Even though the SerDes port is able to drive SGMII autoneg
* like a PHY would, from the perspective of the XMII tables,
* the SGMII port should always be put in MAC mode.
*/
if (ports[i].phy_mode == PHY_INTERFACE_MODE_SGMII)
mii->phy_mac[i] = XMII_MAC;
else
mii->phy_mac[i] = ports[i].role;
}
return 0;
}
static int sja1105_init_static_fdb(struct sja1105_private *priv)
{
struct sja1105_table *table;
table = &priv->static_config.tables[BLK_IDX_L2_LOOKUP];
/* We only populate the FDB table through dynamic
* L2 Address Lookup entries
*/
if (table->entry_count) {
kfree(table->entries);
table->entry_count = 0;
}
return 0;
}
static int sja1105_init_l2_lookup_params(struct sja1105_private *priv)
{
struct sja1105_table *table;
u64 max_fdb_entries = SJA1105_MAX_L2_LOOKUP_COUNT / SJA1105_NUM_PORTS;
struct sja1105_l2_lookup_params_entry default_l2_lookup_params = {
/* Learned FDB entries are forgotten after 300 seconds */
.maxage = SJA1105_AGEING_TIME_MS(300000),
/* All entries within a FDB bin are available for learning */
.dyn_tbsz = SJA1105ET_FDB_BIN_SIZE,
/* And the P/Q/R/S equivalent setting: */
.start_dynspc = 0,
.maxaddrp = {max_fdb_entries, max_fdb_entries, max_fdb_entries,
max_fdb_entries, max_fdb_entries, },
/* 2^8 + 2^5 + 2^3 + 2^2 + 2^1 + 1 in Koopman notation */
.poly = 0x97,
/* This selects between Independent VLAN Learning (IVL) and
* Shared VLAN Learning (SVL)
*/
net: dsa: sja1105: Fix broken learning with vlan_filtering disabled When put under a bridge with vlan_filtering 0, the SJA1105 ports will flood all traffic as if learning was broken. This is because learning interferes with the rx_vid's configured by dsa_8021q as unique pvid's. So learning technically still *does* work, it's just that the learnt entries never get matched due to their unique VLAN ID. The setting that saves the day is Shared VLAN Learning, which on this switch family works exactly as desired: VLAN tagging still works (untagged traffic gets the correct pvid) and FDB entries are still populated with the correct contents including VID. Also, a frame cannot violate the forwarding domain restrictions enforced by its classified VLAN. It is just that the VID is ignored when looking up the FDB for taking a forwarding decision (selecting the egress port). This patch activates SVL, and the result is that frames with a learnt DMAC are no longer flooded in the scenario described above. Now exactly *because* SVL works as desired, we have to revisit some earlier patches: - It is no longer necessary to manipulate the VID of the 'bridge fdb {add,del}' command when vlan_filtering is off. This is because now, SVL is enabled for that case, so the actual VID does not matter*. - It is still desirable to hide dsa_8021q VID's in the FDB dump callback. But right now the dump callback should no longer hide duplicates (one per each front panel port's pvid, plus one for the VLAN that the CPU port is going to tag a TX frame with), because there shouldn't be any (the switch will match a single FDB entry no matter its VID anyway). * Not really... It's no longer necessary to transform a 'bridge fdb add' into 5 fdb add operations, but the user might still add a fdb entry with any vid, and all of them would appear as duplicates in 'bridge fdb show'. So force a 'bridge fdb add' to insert the VID of 0**, so that we can prune the duplicates at insertion time. ** The VID of 0 is better than 1 because it is always guaranteed to be in the ports' hardware filter. DSA also avoids putting the VID inside the netlink response message towards the bridge driver when we return this particular VID, which makes it suitable for FDB entries learnt with vlan_filtering off. Fixes: 227d07a07ef1 ("net: dsa: sja1105: Add support for traffic through standalone ports") Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Georg Waibel <georg.waibel@sensor-technik.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-04 22:38:44 +00:00
.shared_learn = true,
/* Don't discard management traffic based on ENFPORT -
* we don't perform SMAC port enforcement anyway, so
* what we are setting here doesn't matter.
*/
.no_enf_hostprt = false,
/* Don't learn SMAC for mac_fltres1 and mac_fltres0.
* Maybe correlate with no_linklocal_learn from bridge driver?
*/
.no_mgmt_learn = true,
/* P/Q/R/S only */
.use_static = true,
/* Dynamically learned FDB entries can overwrite other (older)
* dynamic FDB entries
*/
.owr_dyn = true,
.drpnolearn = true,
};
table = &priv->static_config.tables[BLK_IDX_L2_LOOKUP_PARAMS];
if (table->entry_count) {
kfree(table->entries);
table->entry_count = 0;
}
table->entries = kcalloc(SJA1105_MAX_L2_LOOKUP_PARAMS_COUNT,
table->ops->unpacked_entry_size, GFP_KERNEL);
if (!table->entries)
return -ENOMEM;
table->entry_count = SJA1105_MAX_L2_LOOKUP_PARAMS_COUNT;
/* This table only has a single entry */
((struct sja1105_l2_lookup_params_entry *)table->entries)[0] =
default_l2_lookup_params;
return 0;
}
static int sja1105_init_static_vlan(struct sja1105_private *priv)
{
struct sja1105_table *table;
struct sja1105_vlan_lookup_entry pvid = {
.ving_mirr = 0,
.vegr_mirr = 0,
.vmemb_port = 0,
.vlan_bc = 0,
.tag_port = 0,
.vlanid = 1,
};
net: dsa: sja1105: save/restore VLANs using a delta commit method Managing the VLAN table that is present in hardware will become very difficult once we add a third operating state (best_effort_vlan_filtering). That is because correct cleanup (not too little, not too much) becomes virtually impossible, when VLANs can be added from the bridge layer, from dsa_8021q for basic tagging, for cross-chip bridging, as well as retagging rules for sub-VLANs and cross-chip sub-VLANs. So we need to rethink VLAN interaction with the switch in a more scalable way. In preparation for that, use the priv->expect_dsa_8021q boolean to classify any VLAN request received through .port_vlan_add or .port_vlan_del towards either one of 2 internal lists: bridge VLANs and dsa_8021q VLANs. Then, implement a central sja1105_build_vlan_table method that creates a VLAN configuration from scratch based on the 2 lists of VLANs kept by the driver, and based on the VLAN awareness state. Currently, if we are VLAN-unaware, install the dsa_8021q VLANs, otherwise the bridge VLANs. Then, implement a delta commit procedure that identifies which VLANs from this new configuration are actually different from the config previously committed to hardware. We apply the delta through the dynamic configuration interface (we don't reset the switch). The result is that the hardware should see the exact sequence of operations as before this patch. This also helps remove the "br" argument passed to dsa_8021q_crosschip_bridge_join, which it was only using to figure out whether it should commit the configuration back to us or not, based on the VLAN awareness state of the bridge. We can simplify that, by always allowing those VLANs inside of our dsa_8021q_vlans list, and committing those to hardware when necessary. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 17:20:29 +00:00
struct dsa_switch *ds = priv->ds;
int port;
table = &priv->static_config.tables[BLK_IDX_VLAN_LOOKUP];
/* The static VLAN table will only contain the initial pvid of 1.
* All other VLANs are to be configured through dynamic entries,
* and kept in the static configuration table as backing memory.
*/
if (table->entry_count) {
kfree(table->entries);
table->entry_count = 0;
}
table->entries = kcalloc(1, table->ops->unpacked_entry_size,
GFP_KERNEL);
if (!table->entries)
return -ENOMEM;
table->entry_count = 1;
/* VLAN 1: all DT-defined ports are members; no restrictions on
net: dsa: sja1105: save/restore VLANs using a delta commit method Managing the VLAN table that is present in hardware will become very difficult once we add a third operating state (best_effort_vlan_filtering). That is because correct cleanup (not too little, not too much) becomes virtually impossible, when VLANs can be added from the bridge layer, from dsa_8021q for basic tagging, for cross-chip bridging, as well as retagging rules for sub-VLANs and cross-chip sub-VLANs. So we need to rethink VLAN interaction with the switch in a more scalable way. In preparation for that, use the priv->expect_dsa_8021q boolean to classify any VLAN request received through .port_vlan_add or .port_vlan_del towards either one of 2 internal lists: bridge VLANs and dsa_8021q VLANs. Then, implement a central sja1105_build_vlan_table method that creates a VLAN configuration from scratch based on the 2 lists of VLANs kept by the driver, and based on the VLAN awareness state. Currently, if we are VLAN-unaware, install the dsa_8021q VLANs, otherwise the bridge VLANs. Then, implement a delta commit procedure that identifies which VLANs from this new configuration are actually different from the config previously committed to hardware. We apply the delta through the dynamic configuration interface (we don't reset the switch). The result is that the hardware should see the exact sequence of operations as before this patch. This also helps remove the "br" argument passed to dsa_8021q_crosschip_bridge_join, which it was only using to figure out whether it should commit the configuration back to us or not, based on the VLAN awareness state of the bridge. We can simplify that, by always allowing those VLANs inside of our dsa_8021q_vlans list, and committing those to hardware when necessary. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 17:20:29 +00:00
* forwarding; always transmit as untagged.
*/
net: dsa: sja1105: save/restore VLANs using a delta commit method Managing the VLAN table that is present in hardware will become very difficult once we add a third operating state (best_effort_vlan_filtering). That is because correct cleanup (not too little, not too much) becomes virtually impossible, when VLANs can be added from the bridge layer, from dsa_8021q for basic tagging, for cross-chip bridging, as well as retagging rules for sub-VLANs and cross-chip sub-VLANs. So we need to rethink VLAN interaction with the switch in a more scalable way. In preparation for that, use the priv->expect_dsa_8021q boolean to classify any VLAN request received through .port_vlan_add or .port_vlan_del towards either one of 2 internal lists: bridge VLANs and dsa_8021q VLANs. Then, implement a central sja1105_build_vlan_table method that creates a VLAN configuration from scratch based on the 2 lists of VLANs kept by the driver, and based on the VLAN awareness state. Currently, if we are VLAN-unaware, install the dsa_8021q VLANs, otherwise the bridge VLANs. Then, implement a delta commit procedure that identifies which VLANs from this new configuration are actually different from the config previously committed to hardware. We apply the delta through the dynamic configuration interface (we don't reset the switch). The result is that the hardware should see the exact sequence of operations as before this patch. This also helps remove the "br" argument passed to dsa_8021q_crosschip_bridge_join, which it was only using to figure out whether it should commit the configuration back to us or not, based on the VLAN awareness state of the bridge. We can simplify that, by always allowing those VLANs inside of our dsa_8021q_vlans list, and committing those to hardware when necessary. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 17:20:29 +00:00
for (port = 0; port < ds->num_ports; port++) {
struct sja1105_bridge_vlan *v;
if (dsa_is_unused_port(ds, port))
continue;
pvid.vmemb_port |= BIT(port);
pvid.vlan_bc |= BIT(port);
pvid.tag_port &= ~BIT(port);
/* Let traffic that don't need dsa_8021q (e.g. STP, PTP) be
* transmitted as untagged.
*/
v = kzalloc(sizeof(*v), GFP_KERNEL);
if (!v)
return -ENOMEM;
v->port = port;
v->vid = 1;
v->untagged = true;
if (dsa_is_cpu_port(ds, port))
v->pvid = true;
list_add(&v->list, &priv->dsa_8021q_vlans);
}
((struct sja1105_vlan_lookup_entry *)table->entries)[0] = pvid;
return 0;
}
static int sja1105_init_l2_forwarding(struct sja1105_private *priv)
{
struct sja1105_l2_forwarding_entry *l2fwd;
struct sja1105_table *table;
int i, j;
table = &priv->static_config.tables[BLK_IDX_L2_FORWARDING];
if (table->entry_count) {
kfree(table->entries);
table->entry_count = 0;
}
table->entries = kcalloc(SJA1105_MAX_L2_FORWARDING_COUNT,
table->ops->unpacked_entry_size, GFP_KERNEL);
if (!table->entries)
return -ENOMEM;
table->entry_count = SJA1105_MAX_L2_FORWARDING_COUNT;
l2fwd = table->entries;
/* First 5 entries define the forwarding rules */
for (i = 0; i < SJA1105_NUM_PORTS; i++) {
unsigned int upstream = dsa_upstream_port(priv->ds, i);
for (j = 0; j < SJA1105_NUM_TC; j++)
l2fwd[i].vlan_pmap[j] = j;
if (i == upstream)
continue;
sja1105_port_allow_traffic(l2fwd, i, upstream, true);
sja1105_port_allow_traffic(l2fwd, upstream, i, true);
}
/* Next 8 entries define VLAN PCP mapping from ingress to egress.
* Create a one-to-one mapping.
*/
for (i = 0; i < SJA1105_NUM_TC; i++)
for (j = 0; j < SJA1105_NUM_PORTS; j++)
l2fwd[SJA1105_NUM_PORTS + i].vlan_pmap[j] = i;
return 0;
}
static int sja1105_init_l2_forwarding_params(struct sja1105_private *priv)
{
struct sja1105_l2_forwarding_params_entry default_l2fwd_params = {
/* Disallow dynamic reconfiguration of vlan_pmap */
.max_dynp = 0,
/* Use a single memory partition for all ingress queues */
.part_spc = { SJA1105_MAX_FRAME_MEMORY, 0, 0, 0, 0, 0, 0, 0 },
};
struct sja1105_table *table;
table = &priv->static_config.tables[BLK_IDX_L2_FORWARDING_PARAMS];
if (table->entry_count) {
kfree(table->entries);
table->entry_count = 0;
}
table->entries = kcalloc(SJA1105_MAX_L2_FORWARDING_PARAMS_COUNT,
table->ops->unpacked_entry_size, GFP_KERNEL);
if (!table->entries)
return -ENOMEM;
table->entry_count = SJA1105_MAX_L2_FORWARDING_PARAMS_COUNT;
/* This table only has a single entry */
((struct sja1105_l2_forwarding_params_entry *)table->entries)[0] =
default_l2fwd_params;
return 0;
}
void sja1105_frame_memory_partitioning(struct sja1105_private *priv)
{
struct sja1105_l2_forwarding_params_entry *l2_fwd_params;
struct sja1105_vl_forwarding_params_entry *vl_fwd_params;
struct sja1105_table *table;
int max_mem;
/* VLAN retagging is implemented using a loopback port that consumes
* frame buffers. That leaves less for us.
*/
if (priv->vlan_state == SJA1105_VLAN_BEST_EFFORT)
max_mem = SJA1105_MAX_FRAME_MEMORY_RETAGGING;
else
max_mem = SJA1105_MAX_FRAME_MEMORY;
table = &priv->static_config.tables[BLK_IDX_L2_FORWARDING_PARAMS];
l2_fwd_params = table->entries;
l2_fwd_params->part_spc[0] = max_mem;
/* If we have any critical-traffic virtual links, we need to reserve
* some frame buffer memory for them. At the moment, hardcode the value
* at 100 blocks of 128 bytes of memory each. This leaves 829 blocks
* remaining for best-effort traffic. TODO: figure out a more flexible
* way to perform the frame buffer partitioning.
*/
if (!priv->static_config.tables[BLK_IDX_VL_FORWARDING].entry_count)
return;
table = &priv->static_config.tables[BLK_IDX_VL_FORWARDING_PARAMS];
vl_fwd_params = table->entries;
l2_fwd_params->part_spc[0] -= SJA1105_VL_FRAME_MEMORY;
vl_fwd_params->partspc[0] = SJA1105_VL_FRAME_MEMORY;
}
static int sja1105_init_general_params(struct sja1105_private *priv)
{
struct sja1105_general_params_entry default_general_params = {
/* Allow dynamic changing of the mirror port */
.mirr_ptacu = true,
.switchid = priv->ds->index,
/* Priority queue for link-local management frames
* (both ingress to and egress from CPU - PTP, STP etc)
*/
.hostprio = 7,
.mac_fltres1 = SJA1105_LINKLOCAL_FILTER_A,
.mac_flt1 = SJA1105_LINKLOCAL_FILTER_A_MASK,
net: dsa: sja1105: Limit use of incl_srcpt to bridge+vlan mode The incl_srcpt setting makes the switch mangle the destination MACs of multicast frames trapped to the CPU - a primitive tagging mechanism that works even when we cannot use the 802.1Q software features. The downside is that the two multicast MAC addresses that the switch traps for L2 PTP (01-80-C2-00-00-0E and 01-1B-19-00-00-00) quickly turn into a lot more, as the switch encodes the source port and switch id into bytes 3 and 4 of the MAC. The resulting range of MAC addresses would need to be installed manually into the DSA master port's multicast MAC filter, and even then, most devices might not have a large enough MAC filtering table. As a result, only limit use of incl_srcpt to when it's strictly necessary: when under a VLAN filtering bridge. This fixes PTP in non-bridged mode (standalone ports). Otherwise, PTP frames, as well as metadata follow-up frames holding RX timestamps won't be received because they will be blocked by the master port's MAC filter. Linuxptp doesn't help, because it only requests the addition of the unmodified PTP MACs to the multicast filter. This issue is not seen in bridged mode because the master port is put in promiscuous mode when the slave ports are enslaved to a bridge. Therefore, there is no downside to having the incl_srcpt mechanism active there. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-08 12:04:32 +00:00
.incl_srcpt1 = false,
.send_meta1 = false,
.mac_fltres0 = SJA1105_LINKLOCAL_FILTER_B,
.mac_flt0 = SJA1105_LINKLOCAL_FILTER_B_MASK,
net: dsa: sja1105: Limit use of incl_srcpt to bridge+vlan mode The incl_srcpt setting makes the switch mangle the destination MACs of multicast frames trapped to the CPU - a primitive tagging mechanism that works even when we cannot use the 802.1Q software features. The downside is that the two multicast MAC addresses that the switch traps for L2 PTP (01-80-C2-00-00-0E and 01-1B-19-00-00-00) quickly turn into a lot more, as the switch encodes the source port and switch id into bytes 3 and 4 of the MAC. The resulting range of MAC addresses would need to be installed manually into the DSA master port's multicast MAC filter, and even then, most devices might not have a large enough MAC filtering table. As a result, only limit use of incl_srcpt to when it's strictly necessary: when under a VLAN filtering bridge. This fixes PTP in non-bridged mode (standalone ports). Otherwise, PTP frames, as well as metadata follow-up frames holding RX timestamps won't be received because they will be blocked by the master port's MAC filter. Linuxptp doesn't help, because it only requests the addition of the unmodified PTP MACs to the multicast filter. This issue is not seen in bridged mode because the master port is put in promiscuous mode when the slave ports are enslaved to a bridge. Therefore, there is no downside to having the incl_srcpt mechanism active there. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-08 12:04:32 +00:00
.incl_srcpt0 = false,
.send_meta0 = false,
/* The destination for traffic matching mac_fltres1 and
* mac_fltres0 on all ports except host_port. Such traffic
* receieved on host_port itself would be dropped, except
* by installing a temporary 'management route'
*/
.host_port = dsa_upstream_port(priv->ds, 0),
/* Default to an invalid value */
.mirr_port = SJA1105_NUM_PORTS,
/* Link-local traffic received on casc_port will be forwarded
* to host_port without embedding the source port and device ID
* info in the destination MAC address (presumably because it
* is a cascaded port and a downstream SJA switch already did
* that). Default to an invalid port (to disable the feature)
* and overwrite this if we find any DSA (cascaded) ports.
*/
.casc_port = SJA1105_NUM_PORTS,
/* No TTEthernet */
.vllupformat = SJA1105_VL_FORMAT_PSFP,
.vlmarker = 0,
.vlmask = 0,
/* Only update correctionField for 1-step PTP (L2 transport) */
.ignore2stf = 0,
/* Forcefully disable VLAN filtering by telling
* the switch that VLAN has a different EtherType.
*/
.tpid = ETH_P_SJA1105,
.tpid2 = ETH_P_SJA1105,
};
struct sja1105_table *table;
table = &priv->static_config.tables[BLK_IDX_GENERAL_PARAMS];
if (table->entry_count) {
kfree(table->entries);
table->entry_count = 0;
}
table->entries = kcalloc(SJA1105_MAX_GENERAL_PARAMS_COUNT,
table->ops->unpacked_entry_size, GFP_KERNEL);
if (!table->entries)
return -ENOMEM;
table->entry_count = SJA1105_MAX_GENERAL_PARAMS_COUNT;
/* This table only has a single entry */
((struct sja1105_general_params_entry *)table->entries)[0] =
default_general_params;
return 0;
}
static int sja1105_init_avb_params(struct sja1105_private *priv)
{
struct sja1105_avb_params_entry *avb;
struct sja1105_table *table;
table = &priv->static_config.tables[BLK_IDX_AVB_PARAMS];
/* Discard previous AVB Parameters Table */
if (table->entry_count) {
kfree(table->entries);
table->entry_count = 0;
}
table->entries = kcalloc(SJA1105_MAX_AVB_PARAMS_COUNT,
table->ops->unpacked_entry_size, GFP_KERNEL);
if (!table->entries)
return -ENOMEM;
table->entry_count = SJA1105_MAX_AVB_PARAMS_COUNT;
avb = table->entries;
/* Configure the MAC addresses for meta frames */
avb->destmeta = SJA1105_META_DMAC;
avb->srcmeta = SJA1105_META_SMAC;
net: dsa: sja1105: configure the PTP_CLK pin as EXT_TS or PER_OUT The SJA1105 switch family has a PTP_CLK pin which emits a signal with fixed 50% duty cycle, but variable frequency and programmable start time. On the second generation (P/Q/R/S) switches, this pin supports even more functionality. The use case described by the hardware documents talks about synchronization via oneshot pulses: given 2 sja1105 switches, arbitrarily designated as a master and a slave, the master emits a single pulse on PTP_CLK, while the slave is configured to timestamp this pulse received on its PTP_CLK pin (which must obviously be configured as input). The difference between the timestamps then exactly becomes the slave offset to the master. The only trouble with the above is that the hardware is very much tied into this use case only, and not very generic beyond that: - When emitting a oneshot pulse, instead of being told when to emit it, the switch just does it "now" and tells you later what time it was, via the PTPSYNCTS register. [ Incidentally, this is the same register that the slave uses to collect the ext_ts timestamp from, too. ] - On the sync slave, there is no interrupt mechanism on reception of a new extts, and no FIFO to buffer them, because in the foreseen use case, software is in control of both the master and the slave pins, so it "knows" when there's something to collect. These 2 problems mean that: - We don't support (at least yet) the quirky oneshot mode exposed by the hardware, just normal periodic output. - We abuse the hardware a little bit when we expose generic extts. Because there's no interrupt mechanism, we need to poll at double the frequency we expect to receive a pulse. Currently that means a non-configurable "twice a second". Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-23 22:59:24 +00:00
/* On P/Q/R/S, configure the direction of the PTP_CLK pin as input by
* default. This is because there might be boards with a hardware
* layout where enabling the pin as output might cause an electrical
* clash. On E/T the pin is always an output, which the board designers
* probably already knew, so even if there are going to be electrical
* issues, there's nothing we can do.
*/
avb->cas_master = false;
return 0;
}
/* The L2 policing table is 2-stage. The table is looked up for each frame
* according to the ingress port, whether it was broadcast or not, and the
* classified traffic class (given by VLAN PCP). This portion of the lookup is
* fixed, and gives access to the SHARINDX, an indirection register pointing
* within the policing table itself, which is used to resolve the policer that
* will be used for this frame.
*
* Stage 1 Stage 2
* +------------+--------+ +---------------------------------+
* |Port 0 TC 0 |SHARINDX| | Policer 0: Rate, Burst, MTU |
* +------------+--------+ +---------------------------------+
* |Port 0 TC 1 |SHARINDX| | Policer 1: Rate, Burst, MTU |
* +------------+--------+ +---------------------------------+
* ... | Policer 2: Rate, Burst, MTU |
* +------------+--------+ +---------------------------------+
* |Port 0 TC 7 |SHARINDX| | Policer 3: Rate, Burst, MTU |
* +------------+--------+ +---------------------------------+
* |Port 1 TC 0 |SHARINDX| | Policer 4: Rate, Burst, MTU |
* +------------+--------+ +---------------------------------+
* ... | Policer 5: Rate, Burst, MTU |
* +------------+--------+ +---------------------------------+
* |Port 1 TC 7 |SHARINDX| | Policer 6: Rate, Burst, MTU |
* +------------+--------+ +---------------------------------+
* ... | Policer 7: Rate, Burst, MTU |
* +------------+--------+ +---------------------------------+
* |Port 4 TC 7 |SHARINDX| ...
* +------------+--------+
* |Port 0 BCAST|SHARINDX| ...
* +------------+--------+
* |Port 1 BCAST|SHARINDX| ...
* +------------+--------+
* ... ...
* +------------+--------+ +---------------------------------+
* |Port 4 BCAST|SHARINDX| | Policer 44: Rate, Burst, MTU |
* +------------+--------+ +---------------------------------+
*
* In this driver, we shall use policers 0-4 as statically alocated port
* (matchall) policers. So we need to make the SHARINDX for all lookups
* corresponding to this ingress port (8 VLAN PCP lookups and 1 broadcast
* lookup) equal.
* The remaining policers (40) shall be dynamically allocated for flower
* policers, where the key is either vlan_prio or dst_mac ff:ff:ff:ff:ff:ff.
*/
#define SJA1105_RATE_MBPS(speed) (((speed) * 64000) / 1000)
static int sja1105_init_l2_policing(struct sja1105_private *priv)
{
struct sja1105_l2_policing_entry *policing;
struct sja1105_table *table;
int port, tc;
table = &priv->static_config.tables[BLK_IDX_L2_POLICING];
/* Discard previous L2 Policing Table */
if (table->entry_count) {
kfree(table->entries);
table->entry_count = 0;
}
table->entries = kcalloc(SJA1105_MAX_L2_POLICING_COUNT,
table->ops->unpacked_entry_size, GFP_KERNEL);
if (!table->entries)
return -ENOMEM;
table->entry_count = SJA1105_MAX_L2_POLICING_COUNT;
policing = table->entries;
/* Setup shared indices for the matchall policers */
for (port = 0; port < SJA1105_NUM_PORTS; port++) {
int bcast = (SJA1105_NUM_PORTS * SJA1105_NUM_TC) + port;
for (tc = 0; tc < SJA1105_NUM_TC; tc++)
policing[port * SJA1105_NUM_TC + tc].sharindx = port;
policing[bcast].sharindx = port;
}
/* Setup the matchall policer parameters */
for (port = 0; port < SJA1105_NUM_PORTS; port++) {
int mtu = VLAN_ETH_FRAME_LEN + ETH_FCS_LEN;
if (dsa_is_cpu_port(priv->ds, port))
mtu += VLAN_HLEN;
policing[port].smax = 65535; /* Burst size in bytes */
policing[port].rate = SJA1105_RATE_MBPS(1000);
policing[port].maxlen = mtu;
policing[port].partition = 0;
}
return 0;
}
static int sja1105_static_config_load(struct sja1105_private *priv,
struct sja1105_dt_port *ports)
{
int rc;
sja1105_static_config_free(&priv->static_config);
rc = sja1105_static_config_init(&priv->static_config,
priv->info->static_ops,
priv->info->device_id);
if (rc)
return rc;
/* Build static configuration */
rc = sja1105_init_mac_settings(priv);
if (rc < 0)
return rc;
rc = sja1105_init_mii_settings(priv, ports);
if (rc < 0)
return rc;
rc = sja1105_init_static_fdb(priv);
if (rc < 0)
return rc;
rc = sja1105_init_static_vlan(priv);
if (rc < 0)
return rc;
rc = sja1105_init_l2_lookup_params(priv);
if (rc < 0)
return rc;
rc = sja1105_init_l2_forwarding(priv);
if (rc < 0)
return rc;
rc = sja1105_init_l2_forwarding_params(priv);
if (rc < 0)
return rc;
rc = sja1105_init_l2_policing(priv);
if (rc < 0)
return rc;
rc = sja1105_init_general_params(priv);
if (rc < 0)
return rc;
rc = sja1105_init_avb_params(priv);
if (rc < 0)
return rc;
/* Send initial configuration to hardware via SPI */
return sja1105_static_config_upload(priv);
}
net: dsa: sja1105: Error out if RGMII delays are requested in DT Documentation/devicetree/bindings/net/ethernet.txt is confusing because it says what the MAC should not do, but not what it *should* do: * "rgmii-rxid" (RGMII with internal RX delay provided by the PHY, the MAC should not add an RX delay in this case) The gap in semantics is threefold: 1. Is it illegal for the MAC to apply the Rx internal delay by itself, and simplify the phy_mode (mask off "rgmii-rxid" into "rgmii") before passing it to of_phy_connect? The documentation would suggest yes. 1. For "rgmii-rxid", while the situation with the Rx clock skew is more or less clear (needs to be added by the PHY), what should the MAC driver do about the Tx delays? Is it an implicit wild card for the MAC to apply delays in the Tx direction if it can? What if those were already added as serpentine PCB traces, how could that be made more obvious through DT bindings so that the MAC doesn't attempt to add them twice and again potentially break the link? 3. If the interface is a fixed-link and therefore the PHY object is fixed (a purely software entity that obviously cannot add clock skew), what is the meaning of the above property? So an interpretation of the RGMII bindings was chosen that hopefully does not contradict their intention but also makes them more applied. The SJA1105 driver understands to act upon "rgmii-*id" phy-mode bindings if the port is in the PHY role (either explicitly, or if it is a fixed-link). Otherwise it always passes the duty of setting up delays to the PHY driver. The error behavior that this patch adds is required on SJA1105E/T where the MAC really cannot apply internal delays. If the other end of the fixed-link cannot apply RGMII delays either (this would be specified through its own DT bindings), then the situation requires PCB delays. For SJA1105P/Q/R/S, this is however hardware supported and the error is thus only temporary. I created a stub function pointer for configuring delays per-port on RXC and TXC, and will implement it when I have access to a board with this hardware setup. Meanwhile do not allow the user to select an invalid configuration. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-02 20:23:32 +00:00
static int sja1105_parse_rgmii_delays(struct sja1105_private *priv,
const struct sja1105_dt_port *ports)
{
int i;
for (i = 0; i < SJA1105_NUM_PORTS; i++) {
if (ports[i].role == XMII_MAC)
net: dsa: sja1105: Error out if RGMII delays are requested in DT Documentation/devicetree/bindings/net/ethernet.txt is confusing because it says what the MAC should not do, but not what it *should* do: * "rgmii-rxid" (RGMII with internal RX delay provided by the PHY, the MAC should not add an RX delay in this case) The gap in semantics is threefold: 1. Is it illegal for the MAC to apply the Rx internal delay by itself, and simplify the phy_mode (mask off "rgmii-rxid" into "rgmii") before passing it to of_phy_connect? The documentation would suggest yes. 1. For "rgmii-rxid", while the situation with the Rx clock skew is more or less clear (needs to be added by the PHY), what should the MAC driver do about the Tx delays? Is it an implicit wild card for the MAC to apply delays in the Tx direction if it can? What if those were already added as serpentine PCB traces, how could that be made more obvious through DT bindings so that the MAC doesn't attempt to add them twice and again potentially break the link? 3. If the interface is a fixed-link and therefore the PHY object is fixed (a purely software entity that obviously cannot add clock skew), what is the meaning of the above property? So an interpretation of the RGMII bindings was chosen that hopefully does not contradict their intention but also makes them more applied. The SJA1105 driver understands to act upon "rgmii-*id" phy-mode bindings if the port is in the PHY role (either explicitly, or if it is a fixed-link). Otherwise it always passes the duty of setting up delays to the PHY driver. The error behavior that this patch adds is required on SJA1105E/T where the MAC really cannot apply internal delays. If the other end of the fixed-link cannot apply RGMII delays either (this would be specified through its own DT bindings), then the situation requires PCB delays. For SJA1105P/Q/R/S, this is however hardware supported and the error is thus only temporary. I created a stub function pointer for configuring delays per-port on RXC and TXC, and will implement it when I have access to a board with this hardware setup. Meanwhile do not allow the user to select an invalid configuration. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-02 20:23:32 +00:00
continue;
if (ports[i].phy_mode == PHY_INTERFACE_MODE_RGMII_RXID ||
ports[i].phy_mode == PHY_INTERFACE_MODE_RGMII_ID)
net: dsa: sja1105: Error out if RGMII delays are requested in DT Documentation/devicetree/bindings/net/ethernet.txt is confusing because it says what the MAC should not do, but not what it *should* do: * "rgmii-rxid" (RGMII with internal RX delay provided by the PHY, the MAC should not add an RX delay in this case) The gap in semantics is threefold: 1. Is it illegal for the MAC to apply the Rx internal delay by itself, and simplify the phy_mode (mask off "rgmii-rxid" into "rgmii") before passing it to of_phy_connect? The documentation would suggest yes. 1. For "rgmii-rxid", while the situation with the Rx clock skew is more or less clear (needs to be added by the PHY), what should the MAC driver do about the Tx delays? Is it an implicit wild card for the MAC to apply delays in the Tx direction if it can? What if those were already added as serpentine PCB traces, how could that be made more obvious through DT bindings so that the MAC doesn't attempt to add them twice and again potentially break the link? 3. If the interface is a fixed-link and therefore the PHY object is fixed (a purely software entity that obviously cannot add clock skew), what is the meaning of the above property? So an interpretation of the RGMII bindings was chosen that hopefully does not contradict their intention but also makes them more applied. The SJA1105 driver understands to act upon "rgmii-*id" phy-mode bindings if the port is in the PHY role (either explicitly, or if it is a fixed-link). Otherwise it always passes the duty of setting up delays to the PHY driver. The error behavior that this patch adds is required on SJA1105E/T where the MAC really cannot apply internal delays. If the other end of the fixed-link cannot apply RGMII delays either (this would be specified through its own DT bindings), then the situation requires PCB delays. For SJA1105P/Q/R/S, this is however hardware supported and the error is thus only temporary. I created a stub function pointer for configuring delays per-port on RXC and TXC, and will implement it when I have access to a board with this hardware setup. Meanwhile do not allow the user to select an invalid configuration. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-02 20:23:32 +00:00
priv->rgmii_rx_delay[i] = true;
if (ports[i].phy_mode == PHY_INTERFACE_MODE_RGMII_TXID ||
ports[i].phy_mode == PHY_INTERFACE_MODE_RGMII_ID)
net: dsa: sja1105: Error out if RGMII delays are requested in DT Documentation/devicetree/bindings/net/ethernet.txt is confusing because it says what the MAC should not do, but not what it *should* do: * "rgmii-rxid" (RGMII with internal RX delay provided by the PHY, the MAC should not add an RX delay in this case) The gap in semantics is threefold: 1. Is it illegal for the MAC to apply the Rx internal delay by itself, and simplify the phy_mode (mask off "rgmii-rxid" into "rgmii") before passing it to of_phy_connect? The documentation would suggest yes. 1. For "rgmii-rxid", while the situation with the Rx clock skew is more or less clear (needs to be added by the PHY), what should the MAC driver do about the Tx delays? Is it an implicit wild card for the MAC to apply delays in the Tx direction if it can? What if those were already added as serpentine PCB traces, how could that be made more obvious through DT bindings so that the MAC doesn't attempt to add them twice and again potentially break the link? 3. If the interface is a fixed-link and therefore the PHY object is fixed (a purely software entity that obviously cannot add clock skew), what is the meaning of the above property? So an interpretation of the RGMII bindings was chosen that hopefully does not contradict their intention but also makes them more applied. The SJA1105 driver understands to act upon "rgmii-*id" phy-mode bindings if the port is in the PHY role (either explicitly, or if it is a fixed-link). Otherwise it always passes the duty of setting up delays to the PHY driver. The error behavior that this patch adds is required on SJA1105E/T where the MAC really cannot apply internal delays. If the other end of the fixed-link cannot apply RGMII delays either (this would be specified through its own DT bindings), then the situation requires PCB delays. For SJA1105P/Q/R/S, this is however hardware supported and the error is thus only temporary. I created a stub function pointer for configuring delays per-port on RXC and TXC, and will implement it when I have access to a board with this hardware setup. Meanwhile do not allow the user to select an invalid configuration. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-02 20:23:32 +00:00
priv->rgmii_tx_delay[i] = true;
if ((priv->rgmii_rx_delay[i] || priv->rgmii_tx_delay[i]) &&
!priv->info->setup_rgmii_delay)
return -EINVAL;
}
return 0;
}
static int sja1105_parse_ports_node(struct sja1105_private *priv,
struct sja1105_dt_port *ports,
struct device_node *ports_node)
{
struct device *dev = &priv->spidev->dev;
struct device_node *child;
for_each_available_child_of_node(ports_node, child) {
struct device_node *phy_node;
phy_interface_t phy_mode;
u32 index;
int err;
/* Get switch port number from DT */
if (of_property_read_u32(child, "reg", &index) < 0) {
dev_err(dev, "Port number not defined in device tree "
"(property \"reg\")\n");
of_node_put(child);
return -ENODEV;
}
/* Get PHY mode from DT */
err = of_get_phy_mode(child, &phy_mode);
if (err) {
dev_err(dev, "Failed to read phy-mode or "
"phy-interface-type property for port %d\n",
index);
of_node_put(child);
return -ENODEV;
}
ports[index].phy_mode = phy_mode;
phy_node = of_parse_phandle(child, "phy-handle", 0);
if (!phy_node) {
if (!of_phy_is_fixed_link(child)) {
dev_err(dev, "phy-handle or fixed-link "
"properties missing!\n");
of_node_put(child);
return -ENODEV;
}
/* phy-handle is missing, but fixed-link isn't.
* So it's a fixed link. Default to PHY role.
*/
ports[index].role = XMII_PHY;
} else {
/* phy-handle present => put port in MAC role */
ports[index].role = XMII_MAC;
of_node_put(phy_node);
}
/* The MAC/PHY role can be overridden with explicit bindings */
if (of_property_read_bool(child, "sja1105,role-mac"))
ports[index].role = XMII_MAC;
else if (of_property_read_bool(child, "sja1105,role-phy"))
ports[index].role = XMII_PHY;
}
return 0;
}
static int sja1105_parse_dt(struct sja1105_private *priv,
struct sja1105_dt_port *ports)
{
struct device *dev = &priv->spidev->dev;
struct device_node *switch_node = dev->of_node;
struct device_node *ports_node;
int rc;
ports_node = of_get_child_by_name(switch_node, "ports");
if (!ports_node) {
dev_err(dev, "Incorrect bindings: absent \"ports\" node\n");
return -ENODEV;
}
rc = sja1105_parse_ports_node(priv, ports, ports_node);
of_node_put(ports_node);
return rc;
}
net: dsa: sja1105: Add support for the SGMII port SJA1105 switches R and S have one SerDes port with an 802.3z quasi-compatible PCS, hardwired on port 4. The other ports are still MII/RMII/RGMII. The PCS performs rate adaptation to lower link speeds; the MAC on this port is hardwired at gigabit. Only full duplex is supported. The SGMII port can be configured as part of the static config tables, as well as through a dedicated SPI address region for its pseudo-clause-22 registers. However it looks like the static configuration is not able to change some out-of-reset values (like the value of MII_BMCR), so at the end of the day, having code for it is utterly pointless. We are just going to use the pseudo-C22 interface. Because the PCS gets reset when the switch resets, we have to add even more restoration logic to sja1105_static_config_reload, otherwise the SGMII port breaks after operations such as enabling PTP timestamping which require a switch reset. >From PHYLINK perspective, the switch supports *only* SGMII (it doesn't support 1000Base-X). It also doesn't expose access to the raw config word for in-band AN in registers MII_ADV/MII_LPA. It is able to work in the following modes: - Forced speed - SGMII in-band AN slave (speed received from PHY) - SGMII in-band AN master (acting as a PHY) The latter mode is not supported by this patch. It is even unclear to me how that would be described. There is some code for it left in the patch, but 'an_master' is always passed as false. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-20 11:29:37 +00:00
static int sja1105_sgmii_read(struct sja1105_private *priv, int pcs_reg)
{
const struct sja1105_regs *regs = priv->info->regs;
u32 val;
int rc;
rc = sja1105_xfer_u32(priv, SPI_READ, regs->sgmii + pcs_reg, &val,
NULL);
if (rc < 0)
return rc;
return val;
}
static int sja1105_sgmii_write(struct sja1105_private *priv, int pcs_reg,
u16 pcs_val)
{
const struct sja1105_regs *regs = priv->info->regs;
u32 val = pcs_val;
int rc;
rc = sja1105_xfer_u32(priv, SPI_WRITE, regs->sgmii + pcs_reg, &val,
NULL);
if (rc < 0)
return rc;
return val;
}
static void sja1105_sgmii_pcs_config(struct sja1105_private *priv,
bool an_enabled, bool an_master)
{
u16 ac = SJA1105_AC_AUTONEG_MODE_SGMII;
/* DIGITAL_CONTROL_1: Enable vendor-specific MMD1, allow the PHY to
* stop the clock during LPI mode, make the MAC reconfigure
* autonomously after PCS autoneg is done, flush the internal FIFOs.
*/
sja1105_sgmii_write(priv, SJA1105_DC1, SJA1105_DC1_EN_VSMMD1 |
SJA1105_DC1_CLOCK_STOP_EN |
SJA1105_DC1_MAC_AUTO_SW |
SJA1105_DC1_INIT);
/* DIGITAL_CONTROL_2: No polarity inversion for TX and RX lanes */
sja1105_sgmii_write(priv, SJA1105_DC2, SJA1105_DC2_TX_POL_INV_DISABLE);
/* AUTONEG_CONTROL: Use SGMII autoneg */
if (an_master)
ac |= SJA1105_AC_PHY_MODE | SJA1105_AC_SGMII_LINK;
sja1105_sgmii_write(priv, SJA1105_AC, ac);
/* BASIC_CONTROL: enable in-band AN now, if requested. Otherwise,
* sja1105_sgmii_pcs_force_speed must be called later for the link
* to become operational.
*/
if (an_enabled)
sja1105_sgmii_write(priv, MII_BMCR,
BMCR_ANENABLE | BMCR_ANRESTART);
}
static void sja1105_sgmii_pcs_force_speed(struct sja1105_private *priv,
int speed)
{
int pcs_speed;
switch (speed) {
case SPEED_1000:
pcs_speed = BMCR_SPEED1000;
break;
case SPEED_100:
pcs_speed = BMCR_SPEED100;
break;
case SPEED_10:
pcs_speed = BMCR_SPEED10;
break;
default:
dev_err(priv->ds->dev, "Invalid speed %d\n", speed);
return;
}
sja1105_sgmii_write(priv, MII_BMCR, pcs_speed | BMCR_FULLDPLX);
}
/* Convert link speed from SJA1105 to ethtool encoding */
static int sja1105_speed[] = {
[SJA1105_SPEED_AUTO] = SPEED_UNKNOWN,
[SJA1105_SPEED_10MBPS] = SPEED_10,
[SJA1105_SPEED_100MBPS] = SPEED_100,
[SJA1105_SPEED_1000MBPS] = SPEED_1000,
};
/* Set link speed in the MAC configuration for a specific port. */
static int sja1105_adjust_port_config(struct sja1105_private *priv, int port,
int speed_mbps)
{
struct sja1105_xmii_params_entry *mii;
struct sja1105_mac_config_entry *mac;
struct device *dev = priv->ds->dev;
sja1105_phy_interface_t phy_mode;
sja1105_speed_t speed;
int rc;
/* On P/Q/R/S, one can read from the device via the MAC reconfiguration
* tables. On E/T, MAC reconfig tables are not readable, only writable.
* We have to *know* what the MAC looks like. For the sake of keeping
* the code common, we'll use the static configuration tables as a
* reasonable approximation for both E/T and P/Q/R/S.
*/
mac = priv->static_config.tables[BLK_IDX_MAC_CONFIG].entries;
mii = priv->static_config.tables[BLK_IDX_XMII_PARAMS].entries;
switch (speed_mbps) {
case SPEED_UNKNOWN:
/* PHYLINK called sja1105_mac_config() to inform us about
* the state->interface, but AN has not completed and the
* speed is not yet valid. UM10944.pdf says that setting
* SJA1105_SPEED_AUTO at runtime disables the port, so that is
* ok for power consumption in case AN will never complete -
* otherwise PHYLINK should come back with a new update.
*/
speed = SJA1105_SPEED_AUTO;
break;
case SPEED_10:
speed = SJA1105_SPEED_10MBPS;
break;
case SPEED_100:
speed = SJA1105_SPEED_100MBPS;
break;
case SPEED_1000:
speed = SJA1105_SPEED_1000MBPS;
break;
default:
dev_err(dev, "Invalid speed %iMbps\n", speed_mbps);
return -EINVAL;
}
/* Overwrite SJA1105_SPEED_AUTO from the static MAC configuration
* table, since this will be used for the clocking setup, and we no
* longer need to store it in the static config (already told hardware
* we want auto during upload phase).
net: dsa: sja1105: Add support for the SGMII port SJA1105 switches R and S have one SerDes port with an 802.3z quasi-compatible PCS, hardwired on port 4. The other ports are still MII/RMII/RGMII. The PCS performs rate adaptation to lower link speeds; the MAC on this port is hardwired at gigabit. Only full duplex is supported. The SGMII port can be configured as part of the static config tables, as well as through a dedicated SPI address region for its pseudo-clause-22 registers. However it looks like the static configuration is not able to change some out-of-reset values (like the value of MII_BMCR), so at the end of the day, having code for it is utterly pointless. We are just going to use the pseudo-C22 interface. Because the PCS gets reset when the switch resets, we have to add even more restoration logic to sja1105_static_config_reload, otherwise the SGMII port breaks after operations such as enabling PTP timestamping which require a switch reset. >From PHYLINK perspective, the switch supports *only* SGMII (it doesn't support 1000Base-X). It also doesn't expose access to the raw config word for in-band AN in registers MII_ADV/MII_LPA. It is able to work in the following modes: - Forced speed - SGMII in-band AN slave (speed received from PHY) - SGMII in-band AN master (acting as a PHY) The latter mode is not supported by this patch. It is even unclear to me how that would be described. There is some code for it left in the patch, but 'an_master' is always passed as false. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-20 11:29:37 +00:00
* Actually for the SGMII port, the MAC is fixed at 1 Gbps and
* we need to configure the PCS only (if even that).
*/
net: dsa: sja1105: Add support for the SGMII port SJA1105 switches R and S have one SerDes port with an 802.3z quasi-compatible PCS, hardwired on port 4. The other ports are still MII/RMII/RGMII. The PCS performs rate adaptation to lower link speeds; the MAC on this port is hardwired at gigabit. Only full duplex is supported. The SGMII port can be configured as part of the static config tables, as well as through a dedicated SPI address region for its pseudo-clause-22 registers. However it looks like the static configuration is not able to change some out-of-reset values (like the value of MII_BMCR), so at the end of the day, having code for it is utterly pointless. We are just going to use the pseudo-C22 interface. Because the PCS gets reset when the switch resets, we have to add even more restoration logic to sja1105_static_config_reload, otherwise the SGMII port breaks after operations such as enabling PTP timestamping which require a switch reset. >From PHYLINK perspective, the switch supports *only* SGMII (it doesn't support 1000Base-X). It also doesn't expose access to the raw config word for in-band AN in registers MII_ADV/MII_LPA. It is able to work in the following modes: - Forced speed - SGMII in-band AN slave (speed received from PHY) - SGMII in-band AN master (acting as a PHY) The latter mode is not supported by this patch. It is even unclear to me how that would be described. There is some code for it left in the patch, but 'an_master' is always passed as false. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-20 11:29:37 +00:00
if (sja1105_supports_sgmii(priv, port))
mac[port].speed = SJA1105_SPEED_1000MBPS;
else
mac[port].speed = speed;
/* Write to the dynamic reconfiguration tables */
rc = sja1105_dynamic_config_write(priv, BLK_IDX_MAC_CONFIG, port,
&mac[port], true);
if (rc < 0) {
dev_err(dev, "Failed to write MAC config: %d\n", rc);
return rc;
}
/* Reconfigure the PLLs for the RGMII interfaces (required 125 MHz at
* gigabit, 25 MHz at 100 Mbps and 2.5 MHz at 10 Mbps). For MII and
* RMII no change of the clock setup is required. Actually, changing
* the clock setup does interrupt the clock signal for a certain time
* which causes trouble for all PHYs relying on this signal.
*/
phy_mode = mii->xmii_mode[port];
if (phy_mode != XMII_MODE_RGMII)
return 0;
return sja1105_clocking_setup_port(priv, port);
}
/* The SJA1105 MAC programming model is through the static config (the xMII
* Mode table cannot be dynamically reconfigured), and we have to program
* that early (earlier than PHYLINK calls us, anyway).
* So just error out in case the connected PHY attempts to change the initial
* system interface MII protocol from what is defined in the DT, at least for
* now.
*/
static bool sja1105_phy_mode_mismatch(struct sja1105_private *priv, int port,
phy_interface_t interface)
{
struct sja1105_xmii_params_entry *mii;
sja1105_phy_interface_t phy_mode;
mii = priv->static_config.tables[BLK_IDX_XMII_PARAMS].entries;
phy_mode = mii->xmii_mode[port];
switch (interface) {
case PHY_INTERFACE_MODE_MII:
return (phy_mode != XMII_MODE_MII);
case PHY_INTERFACE_MODE_RMII:
return (phy_mode != XMII_MODE_RMII);
case PHY_INTERFACE_MODE_RGMII:
case PHY_INTERFACE_MODE_RGMII_ID:
case PHY_INTERFACE_MODE_RGMII_RXID:
case PHY_INTERFACE_MODE_RGMII_TXID:
return (phy_mode != XMII_MODE_RGMII);
net: dsa: sja1105: Add support for the SGMII port SJA1105 switches R and S have one SerDes port with an 802.3z quasi-compatible PCS, hardwired on port 4. The other ports are still MII/RMII/RGMII. The PCS performs rate adaptation to lower link speeds; the MAC on this port is hardwired at gigabit. Only full duplex is supported. The SGMII port can be configured as part of the static config tables, as well as through a dedicated SPI address region for its pseudo-clause-22 registers. However it looks like the static configuration is not able to change some out-of-reset values (like the value of MII_BMCR), so at the end of the day, having code for it is utterly pointless. We are just going to use the pseudo-C22 interface. Because the PCS gets reset when the switch resets, we have to add even more restoration logic to sja1105_static_config_reload, otherwise the SGMII port breaks after operations such as enabling PTP timestamping which require a switch reset. >From PHYLINK perspective, the switch supports *only* SGMII (it doesn't support 1000Base-X). It also doesn't expose access to the raw config word for in-band AN in registers MII_ADV/MII_LPA. It is able to work in the following modes: - Forced speed - SGMII in-band AN slave (speed received from PHY) - SGMII in-band AN master (acting as a PHY) The latter mode is not supported by this patch. It is even unclear to me how that would be described. There is some code for it left in the patch, but 'an_master' is always passed as false. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-20 11:29:37 +00:00
case PHY_INTERFACE_MODE_SGMII:
return (phy_mode != XMII_MODE_SGMII);
default:
return true;
}
}
net: dsa: sja1105: Fix broken fixed-link interfaces on user ports PHYLIB and PHYLINK handle fixed-link interfaces differently. PHYLIB wraps them in a software PHY ("pseudo fixed link") phydev construct such that .adjust_link driver callbacks see an unified API. Whereas PHYLINK simply creates a phylink_link_state structure and passes it to .mac_config. At the time the driver was introduced, DSA was using PHYLIB for the CPU/cascade ports (the ones with no net devices) and PHYLINK for everything else. As explained below: commit aab9c4067d2389d0adfc9c53806437df7b0fe3d5 Author: Florian Fainelli <f.fainelli@gmail.com> Date: Thu May 10 13:17:36 2018 -0700 net: dsa: Plug in PHYLINK support Drivers that utilize fixed links for user-facing ports (e.g: bcm_sf2) will need to implement phylink_mac_ops from now on to preserve functionality, since PHYLINK *does not* create a phy_device instance for fixed links. In the above patch, DSA guards the .phylink_mac_config callback against a NULL phydev pointer. Therefore, .adjust_link is not called in case of a fixed-link user port. This patch fixes the situation by converting the driver from using .adjust_link to .phylink_mac_config. This can be done now in a unified fashion for both slave and CPU/cascade ports because DSA now uses PHYLINK for all ports. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-28 17:38:17 +00:00
static void sja1105_mac_config(struct dsa_switch *ds, int port,
net: dsa: sja1105: Add support for the SGMII port SJA1105 switches R and S have one SerDes port with an 802.3z quasi-compatible PCS, hardwired on port 4. The other ports are still MII/RMII/RGMII. The PCS performs rate adaptation to lower link speeds; the MAC on this port is hardwired at gigabit. Only full duplex is supported. The SGMII port can be configured as part of the static config tables, as well as through a dedicated SPI address region for its pseudo-clause-22 registers. However it looks like the static configuration is not able to change some out-of-reset values (like the value of MII_BMCR), so at the end of the day, having code for it is utterly pointless. We are just going to use the pseudo-C22 interface. Because the PCS gets reset when the switch resets, we have to add even more restoration logic to sja1105_static_config_reload, otherwise the SGMII port breaks after operations such as enabling PTP timestamping which require a switch reset. >From PHYLINK perspective, the switch supports *only* SGMII (it doesn't support 1000Base-X). It also doesn't expose access to the raw config word for in-band AN in registers MII_ADV/MII_LPA. It is able to work in the following modes: - Forced speed - SGMII in-band AN slave (speed received from PHY) - SGMII in-band AN master (acting as a PHY) The latter mode is not supported by this patch. It is even unclear to me how that would be described. There is some code for it left in the patch, but 'an_master' is always passed as false. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-20 11:29:37 +00:00
unsigned int mode,
net: dsa: sja1105: Fix broken fixed-link interfaces on user ports PHYLIB and PHYLINK handle fixed-link interfaces differently. PHYLIB wraps them in a software PHY ("pseudo fixed link") phydev construct such that .adjust_link driver callbacks see an unified API. Whereas PHYLINK simply creates a phylink_link_state structure and passes it to .mac_config. At the time the driver was introduced, DSA was using PHYLIB for the CPU/cascade ports (the ones with no net devices) and PHYLINK for everything else. As explained below: commit aab9c4067d2389d0adfc9c53806437df7b0fe3d5 Author: Florian Fainelli <f.fainelli@gmail.com> Date: Thu May 10 13:17:36 2018 -0700 net: dsa: Plug in PHYLINK support Drivers that utilize fixed links for user-facing ports (e.g: bcm_sf2) will need to implement phylink_mac_ops from now on to preserve functionality, since PHYLINK *does not* create a phy_device instance for fixed links. In the above patch, DSA guards the .phylink_mac_config callback against a NULL phydev pointer. Therefore, .adjust_link is not called in case of a fixed-link user port. This patch fixes the situation by converting the driver from using .adjust_link to .phylink_mac_config. This can be done now in a unified fashion for both slave and CPU/cascade ports because DSA now uses PHYLINK for all ports. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-28 17:38:17 +00:00
const struct phylink_link_state *state)
{
struct sja1105_private *priv = ds->priv;
net: dsa: sja1105: Add support for the SGMII port SJA1105 switches R and S have one SerDes port with an 802.3z quasi-compatible PCS, hardwired on port 4. The other ports are still MII/RMII/RGMII. The PCS performs rate adaptation to lower link speeds; the MAC on this port is hardwired at gigabit. Only full duplex is supported. The SGMII port can be configured as part of the static config tables, as well as through a dedicated SPI address region for its pseudo-clause-22 registers. However it looks like the static configuration is not able to change some out-of-reset values (like the value of MII_BMCR), so at the end of the day, having code for it is utterly pointless. We are just going to use the pseudo-C22 interface. Because the PCS gets reset when the switch resets, we have to add even more restoration logic to sja1105_static_config_reload, otherwise the SGMII port breaks after operations such as enabling PTP timestamping which require a switch reset. >From PHYLINK perspective, the switch supports *only* SGMII (it doesn't support 1000Base-X). It also doesn't expose access to the raw config word for in-band AN in registers MII_ADV/MII_LPA. It is able to work in the following modes: - Forced speed - SGMII in-band AN slave (speed received from PHY) - SGMII in-band AN master (acting as a PHY) The latter mode is not supported by this patch. It is even unclear to me how that would be described. There is some code for it left in the patch, but 'an_master' is always passed as false. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-20 11:29:37 +00:00
bool is_sgmii = sja1105_supports_sgmii(priv, port);
if (sja1105_phy_mode_mismatch(priv, port, state->interface)) {
dev_err(ds->dev, "Changing PHY mode to %s not supported!\n",
phy_modes(state->interface));
return;
}
net: dsa: sja1105: Add support for the SGMII port SJA1105 switches R and S have one SerDes port with an 802.3z quasi-compatible PCS, hardwired on port 4. The other ports are still MII/RMII/RGMII. The PCS performs rate adaptation to lower link speeds; the MAC on this port is hardwired at gigabit. Only full duplex is supported. The SGMII port can be configured as part of the static config tables, as well as through a dedicated SPI address region for its pseudo-clause-22 registers. However it looks like the static configuration is not able to change some out-of-reset values (like the value of MII_BMCR), so at the end of the day, having code for it is utterly pointless. We are just going to use the pseudo-C22 interface. Because the PCS gets reset when the switch resets, we have to add even more restoration logic to sja1105_static_config_reload, otherwise the SGMII port breaks after operations such as enabling PTP timestamping which require a switch reset. >From PHYLINK perspective, the switch supports *only* SGMII (it doesn't support 1000Base-X). It also doesn't expose access to the raw config word for in-band AN in registers MII_ADV/MII_LPA. It is able to work in the following modes: - Forced speed - SGMII in-band AN slave (speed received from PHY) - SGMII in-band AN master (acting as a PHY) The latter mode is not supported by this patch. It is even unclear to me how that would be described. There is some code for it left in the patch, but 'an_master' is always passed as false. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-20 11:29:37 +00:00
if (phylink_autoneg_inband(mode) && !is_sgmii) {
dev_err(ds->dev, "In-band AN not supported!\n");
return;
}
net: dsa: sja1105: Add support for the SGMII port SJA1105 switches R and S have one SerDes port with an 802.3z quasi-compatible PCS, hardwired on port 4. The other ports are still MII/RMII/RGMII. The PCS performs rate adaptation to lower link speeds; the MAC on this port is hardwired at gigabit. Only full duplex is supported. The SGMII port can be configured as part of the static config tables, as well as through a dedicated SPI address region for its pseudo-clause-22 registers. However it looks like the static configuration is not able to change some out-of-reset values (like the value of MII_BMCR), so at the end of the day, having code for it is utterly pointless. We are just going to use the pseudo-C22 interface. Because the PCS gets reset when the switch resets, we have to add even more restoration logic to sja1105_static_config_reload, otherwise the SGMII port breaks after operations such as enabling PTP timestamping which require a switch reset. >From PHYLINK perspective, the switch supports *only* SGMII (it doesn't support 1000Base-X). It also doesn't expose access to the raw config word for in-band AN in registers MII_ADV/MII_LPA. It is able to work in the following modes: - Forced speed - SGMII in-band AN slave (speed received from PHY) - SGMII in-band AN master (acting as a PHY) The latter mode is not supported by this patch. It is even unclear to me how that would be described. There is some code for it left in the patch, but 'an_master' is always passed as false. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-20 11:29:37 +00:00
if (is_sgmii)
sja1105_sgmii_pcs_config(priv, phylink_autoneg_inband(mode),
false);
}
static void sja1105_mac_link_down(struct dsa_switch *ds, int port,
unsigned int mode,
phy_interface_t interface)
{
sja1105_inhibit_tx(ds->priv, BIT(port), true);
}
static void sja1105_mac_link_up(struct dsa_switch *ds, int port,
unsigned int mode,
phy_interface_t interface,
struct phy_device *phydev,
int speed, int duplex,
bool tx_pause, bool rx_pause)
{
struct sja1105_private *priv = ds->priv;
sja1105_adjust_port_config(priv, port, speed);
net: dsa: sja1105: Add support for the SGMII port SJA1105 switches R and S have one SerDes port with an 802.3z quasi-compatible PCS, hardwired on port 4. The other ports are still MII/RMII/RGMII. The PCS performs rate adaptation to lower link speeds; the MAC on this port is hardwired at gigabit. Only full duplex is supported. The SGMII port can be configured as part of the static config tables, as well as through a dedicated SPI address region for its pseudo-clause-22 registers. However it looks like the static configuration is not able to change some out-of-reset values (like the value of MII_BMCR), so at the end of the day, having code for it is utterly pointless. We are just going to use the pseudo-C22 interface. Because the PCS gets reset when the switch resets, we have to add even more restoration logic to sja1105_static_config_reload, otherwise the SGMII port breaks after operations such as enabling PTP timestamping which require a switch reset. >From PHYLINK perspective, the switch supports *only* SGMII (it doesn't support 1000Base-X). It also doesn't expose access to the raw config word for in-band AN in registers MII_ADV/MII_LPA. It is able to work in the following modes: - Forced speed - SGMII in-band AN slave (speed received from PHY) - SGMII in-band AN master (acting as a PHY) The latter mode is not supported by this patch. It is even unclear to me how that would be described. There is some code for it left in the patch, but 'an_master' is always passed as false. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-20 11:29:37 +00:00
if (sja1105_supports_sgmii(priv, port) && !phylink_autoneg_inband(mode))
sja1105_sgmii_pcs_force_speed(priv, speed);
sja1105_inhibit_tx(priv, BIT(port), false);
}
static void sja1105_phylink_validate(struct dsa_switch *ds, int port,
unsigned long *supported,
struct phylink_link_state *state)
{
/* Construct a new mask which exhaustively contains all link features
* supported by the MAC, and then apply that (logical AND) to what will
* be sent to the PHY for "marketing".
*/
__ETHTOOL_DECLARE_LINK_MODE_MASK(mask) = { 0, };
struct sja1105_private *priv = ds->priv;
struct sja1105_xmii_params_entry *mii;
mii = priv->static_config.tables[BLK_IDX_XMII_PARAMS].entries;
/* include/linux/phylink.h says:
* When @state->interface is %PHY_INTERFACE_MODE_NA, phylink
* expects the MAC driver to return all supported link modes.
*/
if (state->interface != PHY_INTERFACE_MODE_NA &&
sja1105_phy_mode_mismatch(priv, port, state->interface)) {
bitmap_zero(supported, __ETHTOOL_LINK_MODE_MASK_NBITS);
return;
}
/* The MAC does not support pause frames, and also doesn't
* support half-duplex traffic modes.
*/
phylink_set(mask, Autoneg);
phylink_set(mask, MII);
phylink_set(mask, 10baseT_Full);
phylink_set(mask, 100baseT_Full);
phylink_set(mask, 100baseT1_Full);
net: dsa: sja1105: Add support for the SGMII port SJA1105 switches R and S have one SerDes port with an 802.3z quasi-compatible PCS, hardwired on port 4. The other ports are still MII/RMII/RGMII. The PCS performs rate adaptation to lower link speeds; the MAC on this port is hardwired at gigabit. Only full duplex is supported. The SGMII port can be configured as part of the static config tables, as well as through a dedicated SPI address region for its pseudo-clause-22 registers. However it looks like the static configuration is not able to change some out-of-reset values (like the value of MII_BMCR), so at the end of the day, having code for it is utterly pointless. We are just going to use the pseudo-C22 interface. Because the PCS gets reset when the switch resets, we have to add even more restoration logic to sja1105_static_config_reload, otherwise the SGMII port breaks after operations such as enabling PTP timestamping which require a switch reset. >From PHYLINK perspective, the switch supports *only* SGMII (it doesn't support 1000Base-X). It also doesn't expose access to the raw config word for in-band AN in registers MII_ADV/MII_LPA. It is able to work in the following modes: - Forced speed - SGMII in-band AN slave (speed received from PHY) - SGMII in-band AN master (acting as a PHY) The latter mode is not supported by this patch. It is even unclear to me how that would be described. There is some code for it left in the patch, but 'an_master' is always passed as false. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-20 11:29:37 +00:00
if (mii->xmii_mode[port] == XMII_MODE_RGMII ||
mii->xmii_mode[port] == XMII_MODE_SGMII)
phylink_set(mask, 1000baseT_Full);
bitmap_and(supported, supported, mask, __ETHTOOL_LINK_MODE_MASK_NBITS);
bitmap_and(state->advertising, state->advertising, mask,
__ETHTOOL_LINK_MODE_MASK_NBITS);
}
net: dsa: sja1105: Add support for the SGMII port SJA1105 switches R and S have one SerDes port with an 802.3z quasi-compatible PCS, hardwired on port 4. The other ports are still MII/RMII/RGMII. The PCS performs rate adaptation to lower link speeds; the MAC on this port is hardwired at gigabit. Only full duplex is supported. The SGMII port can be configured as part of the static config tables, as well as through a dedicated SPI address region for its pseudo-clause-22 registers. However it looks like the static configuration is not able to change some out-of-reset values (like the value of MII_BMCR), so at the end of the day, having code for it is utterly pointless. We are just going to use the pseudo-C22 interface. Because the PCS gets reset when the switch resets, we have to add even more restoration logic to sja1105_static_config_reload, otherwise the SGMII port breaks after operations such as enabling PTP timestamping which require a switch reset. >From PHYLINK perspective, the switch supports *only* SGMII (it doesn't support 1000Base-X). It also doesn't expose access to the raw config word for in-band AN in registers MII_ADV/MII_LPA. It is able to work in the following modes: - Forced speed - SGMII in-band AN slave (speed received from PHY) - SGMII in-band AN master (acting as a PHY) The latter mode is not supported by this patch. It is even unclear to me how that would be described. There is some code for it left in the patch, but 'an_master' is always passed as false. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-20 11:29:37 +00:00
static int sja1105_mac_pcs_get_state(struct dsa_switch *ds, int port,
struct phylink_link_state *state)
{
struct sja1105_private *priv = ds->priv;
int ais;
/* Read the vendor-specific AUTONEG_INTR_STATUS register */
ais = sja1105_sgmii_read(priv, SJA1105_AIS);
if (ais < 0)
return ais;
switch (SJA1105_AIS_SPEED(ais)) {
case 0:
state->speed = SPEED_10;
break;
case 1:
state->speed = SPEED_100;
break;
case 2:
state->speed = SPEED_1000;
break;
default:
dev_err(ds->dev, "Invalid SGMII PCS speed %lu\n",
SJA1105_AIS_SPEED(ais));
}
state->duplex = SJA1105_AIS_DUPLEX_MODE(ais);
state->an_complete = SJA1105_AIS_COMPLETE(ais);
state->link = SJA1105_AIS_LINK_STATUS(ais);
return 0;
}
static int
sja1105_find_static_fdb_entry(struct sja1105_private *priv, int port,
const struct sja1105_l2_lookup_entry *requested)
{
struct sja1105_l2_lookup_entry *l2_lookup;
struct sja1105_table *table;
int i;
table = &priv->static_config.tables[BLK_IDX_L2_LOOKUP];
l2_lookup = table->entries;
for (i = 0; i < table->entry_count; i++)
if (l2_lookup[i].macaddr == requested->macaddr &&
l2_lookup[i].vlanid == requested->vlanid &&
l2_lookup[i].destports & BIT(port))
return i;
return -1;
}
/* We want FDB entries added statically through the bridge command to persist
* across switch resets, which are a common thing during normal SJA1105
* operation. So we have to back them up in the static configuration tables
* and hence apply them on next static config upload... yay!
*/
static int
sja1105_static_fdb_change(struct sja1105_private *priv, int port,
const struct sja1105_l2_lookup_entry *requested,
bool keep)
{
struct sja1105_l2_lookup_entry *l2_lookup;
struct sja1105_table *table;
int rc, match;
table = &priv->static_config.tables[BLK_IDX_L2_LOOKUP];
match = sja1105_find_static_fdb_entry(priv, port, requested);
if (match < 0) {
/* Can't delete a missing entry. */
if (!keep)
return 0;
/* No match => new entry */
rc = sja1105_table_resize(table, table->entry_count + 1);
if (rc)
return rc;
match = table->entry_count - 1;
}
/* Assign pointer after the resize (it may be new memory) */
l2_lookup = table->entries;
/* We have a match.
* If the job was to add this FDB entry, it's already done (mostly
* anyway, since the port forwarding mask may have changed, case in
* which we update it).
* Otherwise we have to delete it.
*/
if (keep) {
l2_lookup[match] = *requested;
return 0;
}
/* To remove, the strategy is to overwrite the element with
* the last one, and then reduce the array size by 1
*/
l2_lookup[match] = l2_lookup[table->entry_count - 1];
return sja1105_table_resize(table, table->entry_count - 1);
}
/* First-generation switches have a 4-way set associative TCAM that
* holds the FDB entries. An FDB index spans from 0 to 1023 and is comprised of
* a "bin" (grouping of 4 entries) and a "way" (an entry within a bin).
* For the placement of a newly learnt FDB entry, the switch selects the bin
* based on a hash function, and the way within that bin incrementally.
*/
static int sja1105et_fdb_index(int bin, int way)
{
return bin * SJA1105ET_FDB_BIN_SIZE + way;
}
static int sja1105et_is_fdb_entry_in_bin(struct sja1105_private *priv, int bin,
const u8 *addr, u16 vid,
struct sja1105_l2_lookup_entry *match,
int *last_unused)
{
int way;
for (way = 0; way < SJA1105ET_FDB_BIN_SIZE; way++) {
struct sja1105_l2_lookup_entry l2_lookup = {0};
int index = sja1105et_fdb_index(bin, way);
/* Skip unused entries, optionally marking them
* into the return value
*/
if (sja1105_dynamic_config_read(priv, BLK_IDX_L2_LOOKUP,
index, &l2_lookup)) {
if (last_unused)
*last_unused = way;
continue;
}
if (l2_lookup.macaddr == ether_addr_to_u64(addr) &&
l2_lookup.vlanid == vid) {
if (match)
*match = l2_lookup;
return way;
}
}
/* Return an invalid entry index if not found */
return -1;
}
int sja1105et_fdb_add(struct dsa_switch *ds, int port,
const unsigned char *addr, u16 vid)
{
struct sja1105_l2_lookup_entry l2_lookup = {0};
struct sja1105_private *priv = ds->priv;
struct device *dev = ds->dev;
int last_unused = -1;
int bin, way, rc;
bin = sja1105et_fdb_hash(priv, addr, vid);
way = sja1105et_is_fdb_entry_in_bin(priv, bin, addr, vid,
&l2_lookup, &last_unused);
if (way >= 0) {
/* We have an FDB entry. Is our port in the destination
* mask? If yes, we need to do nothing. If not, we need
* to rewrite the entry by adding this port to it.
*/
if (l2_lookup.destports & BIT(port))
return 0;
l2_lookup.destports |= BIT(port);
} else {
int index = sja1105et_fdb_index(bin, way);
/* We don't have an FDB entry. We construct a new one and
* try to find a place for it within the FDB table.
*/
l2_lookup.macaddr = ether_addr_to_u64(addr);
l2_lookup.destports = BIT(port);
l2_lookup.vlanid = vid;
if (last_unused >= 0) {
way = last_unused;
} else {
/* Bin is full, need to evict somebody.
* Choose victim at random. If you get these messages
* often, you may need to consider changing the
* distribution function:
* static_config[BLK_IDX_L2_LOOKUP_PARAMS].entries->poly
*/
get_random_bytes(&way, sizeof(u8));
way %= SJA1105ET_FDB_BIN_SIZE;
dev_warn(dev, "Warning, FDB bin %d full while adding entry for %pM. Evicting entry %u.\n",
bin, addr, way);
/* Evict entry */
sja1105_dynamic_config_write(priv, BLK_IDX_L2_LOOKUP,
index, NULL, false);
}
}
l2_lookup.index = sja1105et_fdb_index(bin, way);
rc = sja1105_dynamic_config_write(priv, BLK_IDX_L2_LOOKUP,
l2_lookup.index, &l2_lookup,
true);
if (rc < 0)
return rc;
return sja1105_static_fdb_change(priv, port, &l2_lookup, true);
}
int sja1105et_fdb_del(struct dsa_switch *ds, int port,
const unsigned char *addr, u16 vid)
{
struct sja1105_l2_lookup_entry l2_lookup = {0};
struct sja1105_private *priv = ds->priv;
int index, bin, way, rc;
bool keep;
bin = sja1105et_fdb_hash(priv, addr, vid);
way = sja1105et_is_fdb_entry_in_bin(priv, bin, addr, vid,
&l2_lookup, NULL);
if (way < 0)
return 0;
index = sja1105et_fdb_index(bin, way);
/* We have an FDB entry. Is our port in the destination mask? If yes,
* we need to remove it. If the resulting port mask becomes empty, we
* need to completely evict the FDB entry.
* Otherwise we just write it back.
*/
l2_lookup.destports &= ~BIT(port);
if (l2_lookup.destports)
keep = true;
else
keep = false;
rc = sja1105_dynamic_config_write(priv, BLK_IDX_L2_LOOKUP,
index, &l2_lookup, keep);
if (rc < 0)
return rc;
return sja1105_static_fdb_change(priv, port, &l2_lookup, keep);
}
int sja1105pqrs_fdb_add(struct dsa_switch *ds, int port,
const unsigned char *addr, u16 vid)
{
struct sja1105_l2_lookup_entry l2_lookup = {0};
struct sja1105_private *priv = ds->priv;
int rc, i;
/* Search for an existing entry in the FDB table */
l2_lookup.macaddr = ether_addr_to_u64(addr);
l2_lookup.vlanid = vid;
l2_lookup.iotag = SJA1105_S_TAG;
l2_lookup.mask_macaddr = GENMASK_ULL(ETH_ALEN * 8 - 1, 0);
if (priv->vlan_state != SJA1105_VLAN_UNAWARE) {
net: dsa: sja1105: Fix broken learning with vlan_filtering disabled When put under a bridge with vlan_filtering 0, the SJA1105 ports will flood all traffic as if learning was broken. This is because learning interferes with the rx_vid's configured by dsa_8021q as unique pvid's. So learning technically still *does* work, it's just that the learnt entries never get matched due to their unique VLAN ID. The setting that saves the day is Shared VLAN Learning, which on this switch family works exactly as desired: VLAN tagging still works (untagged traffic gets the correct pvid) and FDB entries are still populated with the correct contents including VID. Also, a frame cannot violate the forwarding domain restrictions enforced by its classified VLAN. It is just that the VID is ignored when looking up the FDB for taking a forwarding decision (selecting the egress port). This patch activates SVL, and the result is that frames with a learnt DMAC are no longer flooded in the scenario described above. Now exactly *because* SVL works as desired, we have to revisit some earlier patches: - It is no longer necessary to manipulate the VID of the 'bridge fdb {add,del}' command when vlan_filtering is off. This is because now, SVL is enabled for that case, so the actual VID does not matter*. - It is still desirable to hide dsa_8021q VID's in the FDB dump callback. But right now the dump callback should no longer hide duplicates (one per each front panel port's pvid, plus one for the VLAN that the CPU port is going to tag a TX frame with), because there shouldn't be any (the switch will match a single FDB entry no matter its VID anyway). * Not really... It's no longer necessary to transform a 'bridge fdb add' into 5 fdb add operations, but the user might still add a fdb entry with any vid, and all of them would appear as duplicates in 'bridge fdb show'. So force a 'bridge fdb add' to insert the VID of 0**, so that we can prune the duplicates at insertion time. ** The VID of 0 is better than 1 because it is always guaranteed to be in the ports' hardware filter. DSA also avoids putting the VID inside the netlink response message towards the bridge driver when we return this particular VID, which makes it suitable for FDB entries learnt with vlan_filtering off. Fixes: 227d07a07ef1 ("net: dsa: sja1105: Add support for traffic through standalone ports") Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Georg Waibel <georg.waibel@sensor-technik.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-04 22:38:44 +00:00
l2_lookup.mask_vlanid = VLAN_VID_MASK;
l2_lookup.mask_iotag = BIT(0);
} else {
l2_lookup.mask_vlanid = 0;
l2_lookup.mask_iotag = 0;
}
l2_lookup.destports = BIT(port);
rc = sja1105_dynamic_config_read(priv, BLK_IDX_L2_LOOKUP,
SJA1105_SEARCH, &l2_lookup);
if (rc == 0) {
/* Found and this port is already in the entry's
* port mask => job done
*/
if (l2_lookup.destports & BIT(port))
return 0;
/* l2_lookup.index is populated by the switch in case it
* found something.
*/
l2_lookup.destports |= BIT(port);
goto skip_finding_an_index;
}
/* Not found, so try to find an unused spot in the FDB.
* This is slightly inefficient because the strategy is knock-knock at
* every possible position from 0 to 1023.
*/
for (i = 0; i < SJA1105_MAX_L2_LOOKUP_COUNT; i++) {
rc = sja1105_dynamic_config_read(priv, BLK_IDX_L2_LOOKUP,
i, NULL);
if (rc < 0)
break;
}
if (i == SJA1105_MAX_L2_LOOKUP_COUNT) {
dev_err(ds->dev, "FDB is full, cannot add entry.\n");
return -EINVAL;
}
l2_lookup.lockeds = true;
l2_lookup.index = i;
skip_finding_an_index:
rc = sja1105_dynamic_config_write(priv, BLK_IDX_L2_LOOKUP,
l2_lookup.index, &l2_lookup,
true);
if (rc < 0)
return rc;
return sja1105_static_fdb_change(priv, port, &l2_lookup, true);
}
int sja1105pqrs_fdb_del(struct dsa_switch *ds, int port,
const unsigned char *addr, u16 vid)
{
struct sja1105_l2_lookup_entry l2_lookup = {0};
struct sja1105_private *priv = ds->priv;
bool keep;
int rc;
l2_lookup.macaddr = ether_addr_to_u64(addr);
l2_lookup.vlanid = vid;
l2_lookup.iotag = SJA1105_S_TAG;
l2_lookup.mask_macaddr = GENMASK_ULL(ETH_ALEN * 8 - 1, 0);
if (priv->vlan_state != SJA1105_VLAN_UNAWARE) {
net: dsa: sja1105: Fix broken learning with vlan_filtering disabled When put under a bridge with vlan_filtering 0, the SJA1105 ports will flood all traffic as if learning was broken. This is because learning interferes with the rx_vid's configured by dsa_8021q as unique pvid's. So learning technically still *does* work, it's just that the learnt entries never get matched due to their unique VLAN ID. The setting that saves the day is Shared VLAN Learning, which on this switch family works exactly as desired: VLAN tagging still works (untagged traffic gets the correct pvid) and FDB entries are still populated with the correct contents including VID. Also, a frame cannot violate the forwarding domain restrictions enforced by its classified VLAN. It is just that the VID is ignored when looking up the FDB for taking a forwarding decision (selecting the egress port). This patch activates SVL, and the result is that frames with a learnt DMAC are no longer flooded in the scenario described above. Now exactly *because* SVL works as desired, we have to revisit some earlier patches: - It is no longer necessary to manipulate the VID of the 'bridge fdb {add,del}' command when vlan_filtering is off. This is because now, SVL is enabled for that case, so the actual VID does not matter*. - It is still desirable to hide dsa_8021q VID's in the FDB dump callback. But right now the dump callback should no longer hide duplicates (one per each front panel port's pvid, plus one for the VLAN that the CPU port is going to tag a TX frame with), because there shouldn't be any (the switch will match a single FDB entry no matter its VID anyway). * Not really... It's no longer necessary to transform a 'bridge fdb add' into 5 fdb add operations, but the user might still add a fdb entry with any vid, and all of them would appear as duplicates in 'bridge fdb show'. So force a 'bridge fdb add' to insert the VID of 0**, so that we can prune the duplicates at insertion time. ** The VID of 0 is better than 1 because it is always guaranteed to be in the ports' hardware filter. DSA also avoids putting the VID inside the netlink response message towards the bridge driver when we return this particular VID, which makes it suitable for FDB entries learnt with vlan_filtering off. Fixes: 227d07a07ef1 ("net: dsa: sja1105: Add support for traffic through standalone ports") Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Georg Waibel <georg.waibel@sensor-technik.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-04 22:38:44 +00:00
l2_lookup.mask_vlanid = VLAN_VID_MASK;
l2_lookup.mask_iotag = BIT(0);
} else {
l2_lookup.mask_vlanid = 0;
l2_lookup.mask_iotag = 0;
}
l2_lookup.destports = BIT(port);
rc = sja1105_dynamic_config_read(priv, BLK_IDX_L2_LOOKUP,
SJA1105_SEARCH, &l2_lookup);
if (rc < 0)
return 0;
l2_lookup.destports &= ~BIT(port);
/* Decide whether we remove just this port from the FDB entry,
* or if we remove it completely.
*/
if (l2_lookup.destports)
keep = true;
else
keep = false;
rc = sja1105_dynamic_config_write(priv, BLK_IDX_L2_LOOKUP,
l2_lookup.index, &l2_lookup, keep);
if (rc < 0)
return rc;
return sja1105_static_fdb_change(priv, port, &l2_lookup, keep);
}
static int sja1105_fdb_add(struct dsa_switch *ds, int port,
const unsigned char *addr, u16 vid)
{
struct sja1105_private *priv = ds->priv;
net: dsa: sja1105: Fix broken learning with vlan_filtering disabled When put under a bridge with vlan_filtering 0, the SJA1105 ports will flood all traffic as if learning was broken. This is because learning interferes with the rx_vid's configured by dsa_8021q as unique pvid's. So learning technically still *does* work, it's just that the learnt entries never get matched due to their unique VLAN ID. The setting that saves the day is Shared VLAN Learning, which on this switch family works exactly as desired: VLAN tagging still works (untagged traffic gets the correct pvid) and FDB entries are still populated with the correct contents including VID. Also, a frame cannot violate the forwarding domain restrictions enforced by its classified VLAN. It is just that the VID is ignored when looking up the FDB for taking a forwarding decision (selecting the egress port). This patch activates SVL, and the result is that frames with a learnt DMAC are no longer flooded in the scenario described above. Now exactly *because* SVL works as desired, we have to revisit some earlier patches: - It is no longer necessary to manipulate the VID of the 'bridge fdb {add,del}' command when vlan_filtering is off. This is because now, SVL is enabled for that case, so the actual VID does not matter*. - It is still desirable to hide dsa_8021q VID's in the FDB dump callback. But right now the dump callback should no longer hide duplicates (one per each front panel port's pvid, plus one for the VLAN that the CPU port is going to tag a TX frame with), because there shouldn't be any (the switch will match a single FDB entry no matter its VID anyway). * Not really... It's no longer necessary to transform a 'bridge fdb add' into 5 fdb add operations, but the user might still add a fdb entry with any vid, and all of them would appear as duplicates in 'bridge fdb show'. So force a 'bridge fdb add' to insert the VID of 0**, so that we can prune the duplicates at insertion time. ** The VID of 0 is better than 1 because it is always guaranteed to be in the ports' hardware filter. DSA also avoids putting the VID inside the netlink response message towards the bridge driver when we return this particular VID, which makes it suitable for FDB entries learnt with vlan_filtering off. Fixes: 227d07a07ef1 ("net: dsa: sja1105: Add support for traffic through standalone ports") Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Georg Waibel <georg.waibel@sensor-technik.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-04 22:38:44 +00:00
/* dsa_8021q is in effect when the bridge's vlan_filtering isn't,
* so the switch still does some VLAN processing internally.
* But Shared VLAN Learning (SVL) is also active, and it will take
* care of autonomous forwarding between the unique pvid's of each
* port. Here we just make sure that users can't add duplicate FDB
* entries when in this mode - the actual VID doesn't matter except
* for what gets printed in 'bridge fdb show'. In the case of zero,
* no VID gets printed at all.
*/
if (priv->vlan_state != SJA1105_VLAN_FILTERING_FULL)
net: dsa: sja1105: Fix broken learning with vlan_filtering disabled When put under a bridge with vlan_filtering 0, the SJA1105 ports will flood all traffic as if learning was broken. This is because learning interferes with the rx_vid's configured by dsa_8021q as unique pvid's. So learning technically still *does* work, it's just that the learnt entries never get matched due to their unique VLAN ID. The setting that saves the day is Shared VLAN Learning, which on this switch family works exactly as desired: VLAN tagging still works (untagged traffic gets the correct pvid) and FDB entries are still populated with the correct contents including VID. Also, a frame cannot violate the forwarding domain restrictions enforced by its classified VLAN. It is just that the VID is ignored when looking up the FDB for taking a forwarding decision (selecting the egress port). This patch activates SVL, and the result is that frames with a learnt DMAC are no longer flooded in the scenario described above. Now exactly *because* SVL works as desired, we have to revisit some earlier patches: - It is no longer necessary to manipulate the VID of the 'bridge fdb {add,del}' command when vlan_filtering is off. This is because now, SVL is enabled for that case, so the actual VID does not matter*. - It is still desirable to hide dsa_8021q VID's in the FDB dump callback. But right now the dump callback should no longer hide duplicates (one per each front panel port's pvid, plus one for the VLAN that the CPU port is going to tag a TX frame with), because there shouldn't be any (the switch will match a single FDB entry no matter its VID anyway). * Not really... It's no longer necessary to transform a 'bridge fdb add' into 5 fdb add operations, but the user might still add a fdb entry with any vid, and all of them would appear as duplicates in 'bridge fdb show'. So force a 'bridge fdb add' to insert the VID of 0**, so that we can prune the duplicates at insertion time. ** The VID of 0 is better than 1 because it is always guaranteed to be in the ports' hardware filter. DSA also avoids putting the VID inside the netlink response message towards the bridge driver when we return this particular VID, which makes it suitable for FDB entries learnt with vlan_filtering off. Fixes: 227d07a07ef1 ("net: dsa: sja1105: Add support for traffic through standalone ports") Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Georg Waibel <georg.waibel@sensor-technik.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-04 22:38:44 +00:00
vid = 0;
net: dsa: sja1105: Fix broken learning with vlan_filtering disabled When put under a bridge with vlan_filtering 0, the SJA1105 ports will flood all traffic as if learning was broken. This is because learning interferes with the rx_vid's configured by dsa_8021q as unique pvid's. So learning technically still *does* work, it's just that the learnt entries never get matched due to their unique VLAN ID. The setting that saves the day is Shared VLAN Learning, which on this switch family works exactly as desired: VLAN tagging still works (untagged traffic gets the correct pvid) and FDB entries are still populated with the correct contents including VID. Also, a frame cannot violate the forwarding domain restrictions enforced by its classified VLAN. It is just that the VID is ignored when looking up the FDB for taking a forwarding decision (selecting the egress port). This patch activates SVL, and the result is that frames with a learnt DMAC are no longer flooded in the scenario described above. Now exactly *because* SVL works as desired, we have to revisit some earlier patches: - It is no longer necessary to manipulate the VID of the 'bridge fdb {add,del}' command when vlan_filtering is off. This is because now, SVL is enabled for that case, so the actual VID does not matter*. - It is still desirable to hide dsa_8021q VID's in the FDB dump callback. But right now the dump callback should no longer hide duplicates (one per each front panel port's pvid, plus one for the VLAN that the CPU port is going to tag a TX frame with), because there shouldn't be any (the switch will match a single FDB entry no matter its VID anyway). * Not really... It's no longer necessary to transform a 'bridge fdb add' into 5 fdb add operations, but the user might still add a fdb entry with any vid, and all of them would appear as duplicates in 'bridge fdb show'. So force a 'bridge fdb add' to insert the VID of 0**, so that we can prune the duplicates at insertion time. ** The VID of 0 is better than 1 because it is always guaranteed to be in the ports' hardware filter. DSA also avoids putting the VID inside the netlink response message towards the bridge driver when we return this particular VID, which makes it suitable for FDB entries learnt with vlan_filtering off. Fixes: 227d07a07ef1 ("net: dsa: sja1105: Add support for traffic through standalone ports") Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Georg Waibel <georg.waibel@sensor-technik.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-04 22:38:44 +00:00
return priv->info->fdb_add_cmd(ds, port, addr, vid);
}
static int sja1105_fdb_del(struct dsa_switch *ds, int port,
const unsigned char *addr, u16 vid)
{
struct sja1105_private *priv = ds->priv;
if (priv->vlan_state != SJA1105_VLAN_FILTERING_FULL)
net: dsa: sja1105: Fix broken learning with vlan_filtering disabled When put under a bridge with vlan_filtering 0, the SJA1105 ports will flood all traffic as if learning was broken. This is because learning interferes with the rx_vid's configured by dsa_8021q as unique pvid's. So learning technically still *does* work, it's just that the learnt entries never get matched due to their unique VLAN ID. The setting that saves the day is Shared VLAN Learning, which on this switch family works exactly as desired: VLAN tagging still works (untagged traffic gets the correct pvid) and FDB entries are still populated with the correct contents including VID. Also, a frame cannot violate the forwarding domain restrictions enforced by its classified VLAN. It is just that the VID is ignored when looking up the FDB for taking a forwarding decision (selecting the egress port). This patch activates SVL, and the result is that frames with a learnt DMAC are no longer flooded in the scenario described above. Now exactly *because* SVL works as desired, we have to revisit some earlier patches: - It is no longer necessary to manipulate the VID of the 'bridge fdb {add,del}' command when vlan_filtering is off. This is because now, SVL is enabled for that case, so the actual VID does not matter*. - It is still desirable to hide dsa_8021q VID's in the FDB dump callback. But right now the dump callback should no longer hide duplicates (one per each front panel port's pvid, plus one for the VLAN that the CPU port is going to tag a TX frame with), because there shouldn't be any (the switch will match a single FDB entry no matter its VID anyway). * Not really... It's no longer necessary to transform a 'bridge fdb add' into 5 fdb add operations, but the user might still add a fdb entry with any vid, and all of them would appear as duplicates in 'bridge fdb show'. So force a 'bridge fdb add' to insert the VID of 0**, so that we can prune the duplicates at insertion time. ** The VID of 0 is better than 1 because it is always guaranteed to be in the ports' hardware filter. DSA also avoids putting the VID inside the netlink response message towards the bridge driver when we return this particular VID, which makes it suitable for FDB entries learnt with vlan_filtering off. Fixes: 227d07a07ef1 ("net: dsa: sja1105: Add support for traffic through standalone ports") Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Georg Waibel <georg.waibel@sensor-technik.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-04 22:38:44 +00:00
vid = 0;
net: dsa: sja1105: Fix broken learning with vlan_filtering disabled When put under a bridge with vlan_filtering 0, the SJA1105 ports will flood all traffic as if learning was broken. This is because learning interferes with the rx_vid's configured by dsa_8021q as unique pvid's. So learning technically still *does* work, it's just that the learnt entries never get matched due to their unique VLAN ID. The setting that saves the day is Shared VLAN Learning, which on this switch family works exactly as desired: VLAN tagging still works (untagged traffic gets the correct pvid) and FDB entries are still populated with the correct contents including VID. Also, a frame cannot violate the forwarding domain restrictions enforced by its classified VLAN. It is just that the VID is ignored when looking up the FDB for taking a forwarding decision (selecting the egress port). This patch activates SVL, and the result is that frames with a learnt DMAC are no longer flooded in the scenario described above. Now exactly *because* SVL works as desired, we have to revisit some earlier patches: - It is no longer necessary to manipulate the VID of the 'bridge fdb {add,del}' command when vlan_filtering is off. This is because now, SVL is enabled for that case, so the actual VID does not matter*. - It is still desirable to hide dsa_8021q VID's in the FDB dump callback. But right now the dump callback should no longer hide duplicates (one per each front panel port's pvid, plus one for the VLAN that the CPU port is going to tag a TX frame with), because there shouldn't be any (the switch will match a single FDB entry no matter its VID anyway). * Not really... It's no longer necessary to transform a 'bridge fdb add' into 5 fdb add operations, but the user might still add a fdb entry with any vid, and all of them would appear as duplicates in 'bridge fdb show'. So force a 'bridge fdb add' to insert the VID of 0**, so that we can prune the duplicates at insertion time. ** The VID of 0 is better than 1 because it is always guaranteed to be in the ports' hardware filter. DSA also avoids putting the VID inside the netlink response message towards the bridge driver when we return this particular VID, which makes it suitable for FDB entries learnt with vlan_filtering off. Fixes: 227d07a07ef1 ("net: dsa: sja1105: Add support for traffic through standalone ports") Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Georg Waibel <georg.waibel@sensor-technik.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-04 22:38:44 +00:00
return priv->info->fdb_del_cmd(ds, port, addr, vid);
}
static int sja1105_fdb_dump(struct dsa_switch *ds, int port,
dsa_fdb_dump_cb_t *cb, void *data)
{
struct sja1105_private *priv = ds->priv;
struct device *dev = ds->dev;
int i;
for (i = 0; i < SJA1105_MAX_L2_LOOKUP_COUNT; i++) {
struct sja1105_l2_lookup_entry l2_lookup = {0};
u8 macaddr[ETH_ALEN];
int rc;
rc = sja1105_dynamic_config_read(priv, BLK_IDX_L2_LOOKUP,
i, &l2_lookup);
/* No fdb entry at i, not an issue */
if (rc == -ENOENT)
continue;
if (rc) {
dev_err(dev, "Failed to dump FDB: %d\n", rc);
return rc;
}
/* FDB dump callback is per port. This means we have to
* disregard a valid entry if it's not for this port, even if
* only to revisit it later. This is inefficient because the
* 1024-sized FDB table needs to be traversed 4 times through
* SPI during a 'bridge fdb show' command.
*/
if (!(l2_lookup.destports & BIT(port)))
continue;
u64_to_ether_addr(l2_lookup.macaddr, macaddr);
net: dsa: sja1105: Fix broken learning with vlan_filtering disabled When put under a bridge with vlan_filtering 0, the SJA1105 ports will flood all traffic as if learning was broken. This is because learning interferes with the rx_vid's configured by dsa_8021q as unique pvid's. So learning technically still *does* work, it's just that the learnt entries never get matched due to their unique VLAN ID. The setting that saves the day is Shared VLAN Learning, which on this switch family works exactly as desired: VLAN tagging still works (untagged traffic gets the correct pvid) and FDB entries are still populated with the correct contents including VID. Also, a frame cannot violate the forwarding domain restrictions enforced by its classified VLAN. It is just that the VID is ignored when looking up the FDB for taking a forwarding decision (selecting the egress port). This patch activates SVL, and the result is that frames with a learnt DMAC are no longer flooded in the scenario described above. Now exactly *because* SVL works as desired, we have to revisit some earlier patches: - It is no longer necessary to manipulate the VID of the 'bridge fdb {add,del}' command when vlan_filtering is off. This is because now, SVL is enabled for that case, so the actual VID does not matter*. - It is still desirable to hide dsa_8021q VID's in the FDB dump callback. But right now the dump callback should no longer hide duplicates (one per each front panel port's pvid, plus one for the VLAN that the CPU port is going to tag a TX frame with), because there shouldn't be any (the switch will match a single FDB entry no matter its VID anyway). * Not really... It's no longer necessary to transform a 'bridge fdb add' into 5 fdb add operations, but the user might still add a fdb entry with any vid, and all of them would appear as duplicates in 'bridge fdb show'. So force a 'bridge fdb add' to insert the VID of 0**, so that we can prune the duplicates at insertion time. ** The VID of 0 is better than 1 because it is always guaranteed to be in the ports' hardware filter. DSA also avoids putting the VID inside the netlink response message towards the bridge driver when we return this particular VID, which makes it suitable for FDB entries learnt with vlan_filtering off. Fixes: 227d07a07ef1 ("net: dsa: sja1105: Add support for traffic through standalone ports") Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Georg Waibel <georg.waibel@sensor-technik.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-04 22:38:44 +00:00
/* We need to hide the dsa_8021q VLANs from the user. */
if (priv->vlan_state == SJA1105_VLAN_UNAWARE)
net: dsa: sja1105: Fix broken learning with vlan_filtering disabled When put under a bridge with vlan_filtering 0, the SJA1105 ports will flood all traffic as if learning was broken. This is because learning interferes with the rx_vid's configured by dsa_8021q as unique pvid's. So learning technically still *does* work, it's just that the learnt entries never get matched due to their unique VLAN ID. The setting that saves the day is Shared VLAN Learning, which on this switch family works exactly as desired: VLAN tagging still works (untagged traffic gets the correct pvid) and FDB entries are still populated with the correct contents including VID. Also, a frame cannot violate the forwarding domain restrictions enforced by its classified VLAN. It is just that the VID is ignored when looking up the FDB for taking a forwarding decision (selecting the egress port). This patch activates SVL, and the result is that frames with a learnt DMAC are no longer flooded in the scenario described above. Now exactly *because* SVL works as desired, we have to revisit some earlier patches: - It is no longer necessary to manipulate the VID of the 'bridge fdb {add,del}' command when vlan_filtering is off. This is because now, SVL is enabled for that case, so the actual VID does not matter*. - It is still desirable to hide dsa_8021q VID's in the FDB dump callback. But right now the dump callback should no longer hide duplicates (one per each front panel port's pvid, plus one for the VLAN that the CPU port is going to tag a TX frame with), because there shouldn't be any (the switch will match a single FDB entry no matter its VID anyway). * Not really... It's no longer necessary to transform a 'bridge fdb add' into 5 fdb add operations, but the user might still add a fdb entry with any vid, and all of them would appear as duplicates in 'bridge fdb show'. So force a 'bridge fdb add' to insert the VID of 0**, so that we can prune the duplicates at insertion time. ** The VID of 0 is better than 1 because it is always guaranteed to be in the ports' hardware filter. DSA also avoids putting the VID inside the netlink response message towards the bridge driver when we return this particular VID, which makes it suitable for FDB entries learnt with vlan_filtering off. Fixes: 227d07a07ef1 ("net: dsa: sja1105: Add support for traffic through standalone ports") Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Georg Waibel <georg.waibel@sensor-technik.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-04 22:38:44 +00:00
l2_lookup.vlanid = 0;
cb(macaddr, l2_lookup.vlanid, l2_lookup.lockeds, data);
}
return 0;
}
/* This callback needs to be present */
static int sja1105_mdb_prepare(struct dsa_switch *ds, int port,
const struct switchdev_obj_port_mdb *mdb)
{
return 0;
}
static void sja1105_mdb_add(struct dsa_switch *ds, int port,
const struct switchdev_obj_port_mdb *mdb)
{
sja1105_fdb_add(ds, port, mdb->addr, mdb->vid);
}
static int sja1105_mdb_del(struct dsa_switch *ds, int port,
const struct switchdev_obj_port_mdb *mdb)
{
return sja1105_fdb_del(ds, port, mdb->addr, mdb->vid);
}
static int sja1105_bridge_member(struct dsa_switch *ds, int port,
struct net_device *br, bool member)
{
struct sja1105_l2_forwarding_entry *l2_fwd;
struct sja1105_private *priv = ds->priv;
int i, rc;
l2_fwd = priv->static_config.tables[BLK_IDX_L2_FORWARDING].entries;
for (i = 0; i < SJA1105_NUM_PORTS; i++) {
/* Add this port to the forwarding matrix of the
* other ports in the same bridge, and viceversa.
*/
if (!dsa_is_user_port(ds, i))
continue;
/* For the ports already under the bridge, only one thing needs
* to be done, and that is to add this port to their
* reachability domain. So we can perform the SPI write for
* them immediately. However, for this port itself (the one
* that is new to the bridge), we need to add all other ports
* to its reachability domain. So we do that incrementally in
* this loop, and perform the SPI write only at the end, once
* the domain contains all other bridge ports.
*/
if (i == port)
continue;
if (dsa_to_port(ds, i)->bridge_dev != br)
continue;
sja1105_port_allow_traffic(l2_fwd, i, port, member);
sja1105_port_allow_traffic(l2_fwd, port, i, member);
rc = sja1105_dynamic_config_write(priv, BLK_IDX_L2_FORWARDING,
i, &l2_fwd[i], true);
if (rc < 0)
return rc;
}
return sja1105_dynamic_config_write(priv, BLK_IDX_L2_FORWARDING,
port, &l2_fwd[port], true);
}
net: dsa: sja1105: Add support for Spanning Tree Protocol While not explicitly documented as supported in UM10944, compliance with the STP states can be obtained by manipulating 3 settings at the (per-port) MAC config level: dynamic learning, inhibiting reception of regular traffic, and inhibiting transmission of regular traffic. In all these modes, transmission and reception of special BPDU frames from the stack is still enabled (not inhibited by the MAC-level settings). On ingress, BPDUs are classified by the MAC filter as link-local (01-80-C2-00-00-00) and forwarded to the CPU port. This mechanism works under all conditions (even without the custom 802.1Q tagging) because the switch hardware inserts the source port and switch ID into bytes 4 and 5 of the MAC-filtered frames. Then the DSA .rcv handler needs to put back zeroes into the MAC address after decoding the source port information. On egress, BPDUs are transmitted using management routes from the xmit worker thread. Again this does not require switch tagging, as the switch port is programmed through SPI to hold a temporary (single-fire) route for a frame with the programmed destination MAC (01-80-C2-00-00-00). STP is activated using the following commands and was tested by connecting two front-panel ports together and noticing that switching loops were prevented (one port remains in the blocking state): $ ip link add name br0 type bridge stp_state 1 && ip link set br0 up $ for eth in $(ls /sys/devices/platform/soc/2100000.spi/spi_master/spi0/spi0.1/net/); do ip link set ${eth} master br0 && ip link set ${eth} up; done Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-05 10:19:28 +00:00
static void sja1105_bridge_stp_state_set(struct dsa_switch *ds, int port,
u8 state)
{
struct sja1105_private *priv = ds->priv;
struct sja1105_mac_config_entry *mac;
mac = priv->static_config.tables[BLK_IDX_MAC_CONFIG].entries;
switch (state) {
case BR_STATE_DISABLED:
case BR_STATE_BLOCKING:
/* From UM10944 description of DRPDTAG (why put this there?):
* "Management traffic flows to the port regardless of the state
* of the INGRESS flag". So BPDUs are still be allowed to pass.
* At the moment no difference between DISABLED and BLOCKING.
*/
mac[port].ingress = false;
mac[port].egress = false;
mac[port].dyn_learn = false;
break;
case BR_STATE_LISTENING:
mac[port].ingress = true;
mac[port].egress = false;
mac[port].dyn_learn = false;
break;
case BR_STATE_LEARNING:
mac[port].ingress = true;
mac[port].egress = false;
mac[port].dyn_learn = true;
break;
case BR_STATE_FORWARDING:
mac[port].ingress = true;
mac[port].egress = true;
mac[port].dyn_learn = true;
break;
default:
dev_err(ds->dev, "invalid STP state: %d\n", state);
return;
}
sja1105_dynamic_config_write(priv, BLK_IDX_MAC_CONFIG, port,
&mac[port], true);
}
static int sja1105_bridge_join(struct dsa_switch *ds, int port,
struct net_device *br)
{
return sja1105_bridge_member(ds, port, br, true);
}
static void sja1105_bridge_leave(struct dsa_switch *ds, int port,
struct net_device *br)
{
sja1105_bridge_member(ds, port, br, false);
}
#define BYTES_PER_KBIT (1000LL / 8)
static int sja1105_find_unused_cbs_shaper(struct sja1105_private *priv)
{
int i;
for (i = 0; i < priv->info->num_cbs_shapers; i++)
if (!priv->cbs[i].idle_slope && !priv->cbs[i].send_slope)
return i;
return -1;
}
static int sja1105_delete_cbs_shaper(struct sja1105_private *priv, int port,
int prio)
{
int i;
for (i = 0; i < priv->info->num_cbs_shapers; i++) {
struct sja1105_cbs_entry *cbs = &priv->cbs[i];
if (cbs->port == port && cbs->prio == prio) {
memset(cbs, 0, sizeof(*cbs));
return sja1105_dynamic_config_write(priv, BLK_IDX_CBS,
i, cbs, true);
}
}
return 0;
}
static int sja1105_setup_tc_cbs(struct dsa_switch *ds, int port,
struct tc_cbs_qopt_offload *offload)
{
struct sja1105_private *priv = ds->priv;
struct sja1105_cbs_entry *cbs;
int index;
if (!offload->enable)
return sja1105_delete_cbs_shaper(priv, port, offload->queue);
index = sja1105_find_unused_cbs_shaper(priv);
if (index < 0)
return -ENOSPC;
cbs = &priv->cbs[index];
cbs->port = port;
cbs->prio = offload->queue;
/* locredit and sendslope are negative by definition. In hardware,
* positive values must be provided, and the negative sign is implicit.
*/
cbs->credit_hi = offload->hicredit;
cbs->credit_lo = abs(offload->locredit);
/* User space is in kbits/sec, hardware in bytes/sec */
cbs->idle_slope = offload->idleslope * BYTES_PER_KBIT;
cbs->send_slope = abs(offload->sendslope * BYTES_PER_KBIT);
/* Convert the negative values from 64-bit 2's complement
* to 32-bit 2's complement (for the case of 0x80000000 whose
* negative is still negative).
*/
cbs->credit_lo &= GENMASK_ULL(31, 0);
cbs->send_slope &= GENMASK_ULL(31, 0);
return sja1105_dynamic_config_write(priv, BLK_IDX_CBS, index, cbs,
true);
}
static int sja1105_reload_cbs(struct sja1105_private *priv)
{
int rc = 0, i;
for (i = 0; i < priv->info->num_cbs_shapers; i++) {
struct sja1105_cbs_entry *cbs = &priv->cbs[i];
if (!cbs->idle_slope && !cbs->send_slope)
continue;
rc = sja1105_dynamic_config_write(priv, BLK_IDX_CBS, i, cbs,
true);
if (rc)
break;
}
return rc;
}
static const char * const sja1105_reset_reasons[] = {
[SJA1105_VLAN_FILTERING] = "VLAN filtering",
[SJA1105_RX_HWTSTAMPING] = "RX timestamping",
[SJA1105_AGEING_TIME] = "Ageing time",
[SJA1105_SCHEDULING] = "Time-aware scheduling",
[SJA1105_BEST_EFFORT_POLICING] = "Best-effort policing",
[SJA1105_VIRTUAL_LINKS] = "Virtual links",
};
/* For situations where we need to change a setting at runtime that is only
* available through the static configuration, resetting the switch in order
* to upload the new static config is unavoidable. Back up the settings we
* modify at runtime (currently only MAC) and restore them after uploading,
* such that this operation is relatively seamless.
*/
int sja1105_static_config_reload(struct sja1105_private *priv,
enum sja1105_reset_reason reason)
{
struct ptp_system_timestamp ptp_sts_before;
struct ptp_system_timestamp ptp_sts_after;
struct sja1105_mac_config_entry *mac;
int speed_mbps[SJA1105_NUM_PORTS];
struct dsa_switch *ds = priv->ds;
s64 t1, t2, t3, t4;
s64 t12, t34;
net: dsa: sja1105: Add support for the SGMII port SJA1105 switches R and S have one SerDes port with an 802.3z quasi-compatible PCS, hardwired on port 4. The other ports are still MII/RMII/RGMII. The PCS performs rate adaptation to lower link speeds; the MAC on this port is hardwired at gigabit. Only full duplex is supported. The SGMII port can be configured as part of the static config tables, as well as through a dedicated SPI address region for its pseudo-clause-22 registers. However it looks like the static configuration is not able to change some out-of-reset values (like the value of MII_BMCR), so at the end of the day, having code for it is utterly pointless. We are just going to use the pseudo-C22 interface. Because the PCS gets reset when the switch resets, we have to add even more restoration logic to sja1105_static_config_reload, otherwise the SGMII port breaks after operations such as enabling PTP timestamping which require a switch reset. >From PHYLINK perspective, the switch supports *only* SGMII (it doesn't support 1000Base-X). It also doesn't expose access to the raw config word for in-band AN in registers MII_ADV/MII_LPA. It is able to work in the following modes: - Forced speed - SGMII in-band AN slave (speed received from PHY) - SGMII in-band AN master (acting as a PHY) The latter mode is not supported by this patch. It is even unclear to me how that would be described. There is some code for it left in the patch, but 'an_master' is always passed as false. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-20 11:29:37 +00:00
u16 bmcr = 0;
int rc, i;
s64 now;
mutex_lock(&priv->mgmt_lock);
mac = priv->static_config.tables[BLK_IDX_MAC_CONFIG].entries;
/* Back up the dynamic link speed changed by sja1105_adjust_port_config
* in order to temporarily restore it to SJA1105_SPEED_AUTO - which the
* switch wants to see in the static config in order to allow us to
* change it through the dynamic interface later.
*/
for (i = 0; i < SJA1105_NUM_PORTS; i++) {
speed_mbps[i] = sja1105_speed[mac[i].speed];
mac[i].speed = SJA1105_SPEED_AUTO;
}
net: dsa: sja1105: Add support for the SGMII port SJA1105 switches R and S have one SerDes port with an 802.3z quasi-compatible PCS, hardwired on port 4. The other ports are still MII/RMII/RGMII. The PCS performs rate adaptation to lower link speeds; the MAC on this port is hardwired at gigabit. Only full duplex is supported. The SGMII port can be configured as part of the static config tables, as well as through a dedicated SPI address region for its pseudo-clause-22 registers. However it looks like the static configuration is not able to change some out-of-reset values (like the value of MII_BMCR), so at the end of the day, having code for it is utterly pointless. We are just going to use the pseudo-C22 interface. Because the PCS gets reset when the switch resets, we have to add even more restoration logic to sja1105_static_config_reload, otherwise the SGMII port breaks after operations such as enabling PTP timestamping which require a switch reset. >From PHYLINK perspective, the switch supports *only* SGMII (it doesn't support 1000Base-X). It also doesn't expose access to the raw config word for in-band AN in registers MII_ADV/MII_LPA. It is able to work in the following modes: - Forced speed - SGMII in-band AN slave (speed received from PHY) - SGMII in-band AN master (acting as a PHY) The latter mode is not supported by this patch. It is even unclear to me how that would be described. There is some code for it left in the patch, but 'an_master' is always passed as false. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-20 11:29:37 +00:00
if (sja1105_supports_sgmii(priv, SJA1105_SGMII_PORT))
bmcr = sja1105_sgmii_read(priv, MII_BMCR);
/* No PTP operations can run right now */
mutex_lock(&priv->ptp_data.lock);
rc = __sja1105_ptp_gettimex(ds, &now, &ptp_sts_before);
if (rc < 0)
goto out_unlock_ptp;
/* Reset switch and send updated static configuration */
rc = sja1105_static_config_upload(priv);
if (rc < 0)
goto out_unlock_ptp;
rc = __sja1105_ptp_settime(ds, 0, &ptp_sts_after);
if (rc < 0)
goto out_unlock_ptp;
t1 = timespec64_to_ns(&ptp_sts_before.pre_ts);
t2 = timespec64_to_ns(&ptp_sts_before.post_ts);
t3 = timespec64_to_ns(&ptp_sts_after.pre_ts);
t4 = timespec64_to_ns(&ptp_sts_after.post_ts);
/* Mid point, corresponds to pre-reset PTPCLKVAL */
t12 = t1 + (t2 - t1) / 2;
/* Mid point, corresponds to post-reset PTPCLKVAL, aka 0 */
t34 = t3 + (t4 - t3) / 2;
/* Advance PTPCLKVAL by the time it took since its readout */
now += (t34 - t12);
__sja1105_ptp_adjtime(ds, now);
out_unlock_ptp:
mutex_unlock(&priv->ptp_data.lock);
dev_info(priv->ds->dev,
"Reset switch and programmed static config. Reason: %s\n",
sja1105_reset_reasons[reason]);
/* Configure the CGU (PLLs) for MII and RMII PHYs.
* For these interfaces there is no dynamic configuration
* needed, since PLLs have same settings at all speeds.
*/
rc = sja1105_clocking_setup(priv);
if (rc < 0)
goto out;
for (i = 0; i < SJA1105_NUM_PORTS; i++) {
rc = sja1105_adjust_port_config(priv, i, speed_mbps[i]);
if (rc < 0)
goto out;
}
net: dsa: sja1105: Add support for the SGMII port SJA1105 switches R and S have one SerDes port with an 802.3z quasi-compatible PCS, hardwired on port 4. The other ports are still MII/RMII/RGMII. The PCS performs rate adaptation to lower link speeds; the MAC on this port is hardwired at gigabit. Only full duplex is supported. The SGMII port can be configured as part of the static config tables, as well as through a dedicated SPI address region for its pseudo-clause-22 registers. However it looks like the static configuration is not able to change some out-of-reset values (like the value of MII_BMCR), so at the end of the day, having code for it is utterly pointless. We are just going to use the pseudo-C22 interface. Because the PCS gets reset when the switch resets, we have to add even more restoration logic to sja1105_static_config_reload, otherwise the SGMII port breaks after operations such as enabling PTP timestamping which require a switch reset. >From PHYLINK perspective, the switch supports *only* SGMII (it doesn't support 1000Base-X). It also doesn't expose access to the raw config word for in-band AN in registers MII_ADV/MII_LPA. It is able to work in the following modes: - Forced speed - SGMII in-band AN slave (speed received from PHY) - SGMII in-band AN master (acting as a PHY) The latter mode is not supported by this patch. It is even unclear to me how that would be described. There is some code for it left in the patch, but 'an_master' is always passed as false. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-20 11:29:37 +00:00
if (sja1105_supports_sgmii(priv, SJA1105_SGMII_PORT)) {
bool an_enabled = !!(bmcr & BMCR_ANENABLE);
sja1105_sgmii_pcs_config(priv, an_enabled, false);
if (!an_enabled) {
int speed = SPEED_UNKNOWN;
if (bmcr & BMCR_SPEED1000)
speed = SPEED_1000;
else if (bmcr & BMCR_SPEED100)
speed = SPEED_100;
else if (bmcr & BMCR_SPEED10)
speed = SPEED_10;
sja1105_sgmii_pcs_force_speed(priv, speed);
}
}
rc = sja1105_reload_cbs(priv);
if (rc < 0)
goto out;
out:
mutex_unlock(&priv->mgmt_lock);
return rc;
}
static int sja1105_pvid_apply(struct sja1105_private *priv, int port, u16 pvid)
{
struct sja1105_mac_config_entry *mac;
mac = priv->static_config.tables[BLK_IDX_MAC_CONFIG].entries;
mac[port].vlanid = pvid;
return sja1105_dynamic_config_write(priv, BLK_IDX_MAC_CONFIG, port,
&mac[port], true);
}
net: dsa: sja1105: implement cross-chip bridging operations sja1105 uses dsa_8021q for DSA tagging, a format which is VLAN at heart and which is compatible with cascading. A complete description of this tagging format is in net/dsa/tag_8021q.c, but a quick summary is that each external-facing port tags incoming frames with a unique pvid, and this special VLAN is transmitted as tagged towards the inside of the system, and as untagged towards the exterior. The tag encodes the switch id and the source port index. This means that cross-chip bridging for dsa_8021q only entails adding the dsa_8021q pvids of one switch to the RX filter of the other switches. Everything else falls naturally into place, as long as the bottom-end of ports (the leaves in the tree) is comprised exclusively of dsa_8021q-compatible (i.e. sja1105 switches). Otherwise, there would be a chance that a front-panel switch transmits a packet tagged with a dsa_8021q header, header which it wouldn't be able to remove, and which would hence "leak" out. The only use case I tested (due to lack of board availability) was when the sja1105 switches are part of disjoint trees (however, this doesn't change the fact that multiple sja1105 switches still need unique switch identifiers in such a system). But in principle, even "true" single-tree setups (with DSA links) should work just as fine, except for a small change which I can't test: dsa_towards_port should be used instead of dsa_upstream_port (I made the assumption that the routing port that any sja1105 should use towards its neighbours is the CPU port. That might not hold true in other setups). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-10 16:37:43 +00:00
static int sja1105_crosschip_bridge_join(struct dsa_switch *ds,
int tree_index, int sw_index,
int other_port, struct net_device *br)
{
struct dsa_switch *other_ds = dsa_switch_find(tree_index, sw_index);
struct sja1105_private *other_priv = other_ds->priv;
struct sja1105_private *priv = ds->priv;
int port, rc;
if (other_ds->ops != &sja1105_switch_ops)
return 0;
for (port = 0; port < ds->num_ports; port++) {
if (!dsa_is_user_port(ds, port))
continue;
if (dsa_to_port(ds, port)->bridge_dev != br)
continue;
other_priv->expect_dsa_8021q = true;
net: dsa: sja1105: implement cross-chip bridging operations sja1105 uses dsa_8021q for DSA tagging, a format which is VLAN at heart and which is compatible with cascading. A complete description of this tagging format is in net/dsa/tag_8021q.c, but a quick summary is that each external-facing port tags incoming frames with a unique pvid, and this special VLAN is transmitted as tagged towards the inside of the system, and as untagged towards the exterior. The tag encodes the switch id and the source port index. This means that cross-chip bridging for dsa_8021q only entails adding the dsa_8021q pvids of one switch to the RX filter of the other switches. Everything else falls naturally into place, as long as the bottom-end of ports (the leaves in the tree) is comprised exclusively of dsa_8021q-compatible (i.e. sja1105 switches). Otherwise, there would be a chance that a front-panel switch transmits a packet tagged with a dsa_8021q header, header which it wouldn't be able to remove, and which would hence "leak" out. The only use case I tested (due to lack of board availability) was when the sja1105 switches are part of disjoint trees (however, this doesn't change the fact that multiple sja1105 switches still need unique switch identifiers in such a system). But in principle, even "true" single-tree setups (with DSA links) should work just as fine, except for a small change which I can't test: dsa_towards_port should be used instead of dsa_upstream_port (I made the assumption that the routing port that any sja1105 should use towards its neighbours is the CPU port. That might not hold true in other setups). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-10 16:37:43 +00:00
rc = dsa_8021q_crosschip_bridge_join(ds, port, other_ds,
net: dsa: sja1105: save/restore VLANs using a delta commit method Managing the VLAN table that is present in hardware will become very difficult once we add a third operating state (best_effort_vlan_filtering). That is because correct cleanup (not too little, not too much) becomes virtually impossible, when VLANs can be added from the bridge layer, from dsa_8021q for basic tagging, for cross-chip bridging, as well as retagging rules for sub-VLANs and cross-chip sub-VLANs. So we need to rethink VLAN interaction with the switch in a more scalable way. In preparation for that, use the priv->expect_dsa_8021q boolean to classify any VLAN request received through .port_vlan_add or .port_vlan_del towards either one of 2 internal lists: bridge VLANs and dsa_8021q VLANs. Then, implement a central sja1105_build_vlan_table method that creates a VLAN configuration from scratch based on the 2 lists of VLANs kept by the driver, and based on the VLAN awareness state. Currently, if we are VLAN-unaware, install the dsa_8021q VLANs, otherwise the bridge VLANs. Then, implement a delta commit procedure that identifies which VLANs from this new configuration are actually different from the config previously committed to hardware. We apply the delta through the dynamic configuration interface (we don't reset the switch). The result is that the hardware should see the exact sequence of operations as before this patch. This also helps remove the "br" argument passed to dsa_8021q_crosschip_bridge_join, which it was only using to figure out whether it should commit the configuration back to us or not, based on the VLAN awareness state of the bridge. We can simplify that, by always allowing those VLANs inside of our dsa_8021q_vlans list, and committing those to hardware when necessary. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 17:20:29 +00:00
other_port,
net: dsa: sja1105: implement cross-chip bridging operations sja1105 uses dsa_8021q for DSA tagging, a format which is VLAN at heart and which is compatible with cascading. A complete description of this tagging format is in net/dsa/tag_8021q.c, but a quick summary is that each external-facing port tags incoming frames with a unique pvid, and this special VLAN is transmitted as tagged towards the inside of the system, and as untagged towards the exterior. The tag encodes the switch id and the source port index. This means that cross-chip bridging for dsa_8021q only entails adding the dsa_8021q pvids of one switch to the RX filter of the other switches. Everything else falls naturally into place, as long as the bottom-end of ports (the leaves in the tree) is comprised exclusively of dsa_8021q-compatible (i.e. sja1105 switches). Otherwise, there would be a chance that a front-panel switch transmits a packet tagged with a dsa_8021q header, header which it wouldn't be able to remove, and which would hence "leak" out. The only use case I tested (due to lack of board availability) was when the sja1105 switches are part of disjoint trees (however, this doesn't change the fact that multiple sja1105 switches still need unique switch identifiers in such a system). But in principle, even "true" single-tree setups (with DSA links) should work just as fine, except for a small change which I can't test: dsa_towards_port should be used instead of dsa_upstream_port (I made the assumption that the routing port that any sja1105 should use towards its neighbours is the CPU port. That might not hold true in other setups). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-10 16:37:43 +00:00
&priv->crosschip_links);
other_priv->expect_dsa_8021q = false;
net: dsa: sja1105: implement cross-chip bridging operations sja1105 uses dsa_8021q for DSA tagging, a format which is VLAN at heart and which is compatible with cascading. A complete description of this tagging format is in net/dsa/tag_8021q.c, but a quick summary is that each external-facing port tags incoming frames with a unique pvid, and this special VLAN is transmitted as tagged towards the inside of the system, and as untagged towards the exterior. The tag encodes the switch id and the source port index. This means that cross-chip bridging for dsa_8021q only entails adding the dsa_8021q pvids of one switch to the RX filter of the other switches. Everything else falls naturally into place, as long as the bottom-end of ports (the leaves in the tree) is comprised exclusively of dsa_8021q-compatible (i.e. sja1105 switches). Otherwise, there would be a chance that a front-panel switch transmits a packet tagged with a dsa_8021q header, header which it wouldn't be able to remove, and which would hence "leak" out. The only use case I tested (due to lack of board availability) was when the sja1105 switches are part of disjoint trees (however, this doesn't change the fact that multiple sja1105 switches still need unique switch identifiers in such a system). But in principle, even "true" single-tree setups (with DSA links) should work just as fine, except for a small change which I can't test: dsa_towards_port should be used instead of dsa_upstream_port (I made the assumption that the routing port that any sja1105 should use towards its neighbours is the CPU port. That might not hold true in other setups). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-10 16:37:43 +00:00
if (rc)
return rc;
priv->expect_dsa_8021q = true;
net: dsa: sja1105: implement cross-chip bridging operations sja1105 uses dsa_8021q for DSA tagging, a format which is VLAN at heart and which is compatible with cascading. A complete description of this tagging format is in net/dsa/tag_8021q.c, but a quick summary is that each external-facing port tags incoming frames with a unique pvid, and this special VLAN is transmitted as tagged towards the inside of the system, and as untagged towards the exterior. The tag encodes the switch id and the source port index. This means that cross-chip bridging for dsa_8021q only entails adding the dsa_8021q pvids of one switch to the RX filter of the other switches. Everything else falls naturally into place, as long as the bottom-end of ports (the leaves in the tree) is comprised exclusively of dsa_8021q-compatible (i.e. sja1105 switches). Otherwise, there would be a chance that a front-panel switch transmits a packet tagged with a dsa_8021q header, header which it wouldn't be able to remove, and which would hence "leak" out. The only use case I tested (due to lack of board availability) was when the sja1105 switches are part of disjoint trees (however, this doesn't change the fact that multiple sja1105 switches still need unique switch identifiers in such a system). But in principle, even "true" single-tree setups (with DSA links) should work just as fine, except for a small change which I can't test: dsa_towards_port should be used instead of dsa_upstream_port (I made the assumption that the routing port that any sja1105 should use towards its neighbours is the CPU port. That might not hold true in other setups). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-10 16:37:43 +00:00
rc = dsa_8021q_crosschip_bridge_join(other_ds, other_port, ds,
net: dsa: sja1105: save/restore VLANs using a delta commit method Managing the VLAN table that is present in hardware will become very difficult once we add a third operating state (best_effort_vlan_filtering). That is because correct cleanup (not too little, not too much) becomes virtually impossible, when VLANs can be added from the bridge layer, from dsa_8021q for basic tagging, for cross-chip bridging, as well as retagging rules for sub-VLANs and cross-chip sub-VLANs. So we need to rethink VLAN interaction with the switch in a more scalable way. In preparation for that, use the priv->expect_dsa_8021q boolean to classify any VLAN request received through .port_vlan_add or .port_vlan_del towards either one of 2 internal lists: bridge VLANs and dsa_8021q VLANs. Then, implement a central sja1105_build_vlan_table method that creates a VLAN configuration from scratch based on the 2 lists of VLANs kept by the driver, and based on the VLAN awareness state. Currently, if we are VLAN-unaware, install the dsa_8021q VLANs, otherwise the bridge VLANs. Then, implement a delta commit procedure that identifies which VLANs from this new configuration are actually different from the config previously committed to hardware. We apply the delta through the dynamic configuration interface (we don't reset the switch). The result is that the hardware should see the exact sequence of operations as before this patch. This also helps remove the "br" argument passed to dsa_8021q_crosschip_bridge_join, which it was only using to figure out whether it should commit the configuration back to us or not, based on the VLAN awareness state of the bridge. We can simplify that, by always allowing those VLANs inside of our dsa_8021q_vlans list, and committing those to hardware when necessary. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 17:20:29 +00:00
port,
net: dsa: sja1105: implement cross-chip bridging operations sja1105 uses dsa_8021q for DSA tagging, a format which is VLAN at heart and which is compatible with cascading. A complete description of this tagging format is in net/dsa/tag_8021q.c, but a quick summary is that each external-facing port tags incoming frames with a unique pvid, and this special VLAN is transmitted as tagged towards the inside of the system, and as untagged towards the exterior. The tag encodes the switch id and the source port index. This means that cross-chip bridging for dsa_8021q only entails adding the dsa_8021q pvids of one switch to the RX filter of the other switches. Everything else falls naturally into place, as long as the bottom-end of ports (the leaves in the tree) is comprised exclusively of dsa_8021q-compatible (i.e. sja1105 switches). Otherwise, there would be a chance that a front-panel switch transmits a packet tagged with a dsa_8021q header, header which it wouldn't be able to remove, and which would hence "leak" out. The only use case I tested (due to lack of board availability) was when the sja1105 switches are part of disjoint trees (however, this doesn't change the fact that multiple sja1105 switches still need unique switch identifiers in such a system). But in principle, even "true" single-tree setups (with DSA links) should work just as fine, except for a small change which I can't test: dsa_towards_port should be used instead of dsa_upstream_port (I made the assumption that the routing port that any sja1105 should use towards its neighbours is the CPU port. That might not hold true in other setups). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-10 16:37:43 +00:00
&other_priv->crosschip_links);
priv->expect_dsa_8021q = false;
net: dsa: sja1105: implement cross-chip bridging operations sja1105 uses dsa_8021q for DSA tagging, a format which is VLAN at heart and which is compatible with cascading. A complete description of this tagging format is in net/dsa/tag_8021q.c, but a quick summary is that each external-facing port tags incoming frames with a unique pvid, and this special VLAN is transmitted as tagged towards the inside of the system, and as untagged towards the exterior. The tag encodes the switch id and the source port index. This means that cross-chip bridging for dsa_8021q only entails adding the dsa_8021q pvids of one switch to the RX filter of the other switches. Everything else falls naturally into place, as long as the bottom-end of ports (the leaves in the tree) is comprised exclusively of dsa_8021q-compatible (i.e. sja1105 switches). Otherwise, there would be a chance that a front-panel switch transmits a packet tagged with a dsa_8021q header, header which it wouldn't be able to remove, and which would hence "leak" out. The only use case I tested (due to lack of board availability) was when the sja1105 switches are part of disjoint trees (however, this doesn't change the fact that multiple sja1105 switches still need unique switch identifiers in such a system). But in principle, even "true" single-tree setups (with DSA links) should work just as fine, except for a small change which I can't test: dsa_towards_port should be used instead of dsa_upstream_port (I made the assumption that the routing port that any sja1105 should use towards its neighbours is the CPU port. That might not hold true in other setups). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-10 16:37:43 +00:00
if (rc)
return rc;
}
return 0;
}
static void sja1105_crosschip_bridge_leave(struct dsa_switch *ds,
int tree_index, int sw_index,
int other_port,
struct net_device *br)
{
struct dsa_switch *other_ds = dsa_switch_find(tree_index, sw_index);
struct sja1105_private *other_priv = other_ds->priv;
struct sja1105_private *priv = ds->priv;
int port;
if (other_ds->ops != &sja1105_switch_ops)
return;
for (port = 0; port < ds->num_ports; port++) {
if (!dsa_is_user_port(ds, port))
continue;
if (dsa_to_port(ds, port)->bridge_dev != br)
continue;
other_priv->expect_dsa_8021q = true;
net: dsa: sja1105: implement cross-chip bridging operations sja1105 uses dsa_8021q for DSA tagging, a format which is VLAN at heart and which is compatible with cascading. A complete description of this tagging format is in net/dsa/tag_8021q.c, but a quick summary is that each external-facing port tags incoming frames with a unique pvid, and this special VLAN is transmitted as tagged towards the inside of the system, and as untagged towards the exterior. The tag encodes the switch id and the source port index. This means that cross-chip bridging for dsa_8021q only entails adding the dsa_8021q pvids of one switch to the RX filter of the other switches. Everything else falls naturally into place, as long as the bottom-end of ports (the leaves in the tree) is comprised exclusively of dsa_8021q-compatible (i.e. sja1105 switches). Otherwise, there would be a chance that a front-panel switch transmits a packet tagged with a dsa_8021q header, header which it wouldn't be able to remove, and which would hence "leak" out. The only use case I tested (due to lack of board availability) was when the sja1105 switches are part of disjoint trees (however, this doesn't change the fact that multiple sja1105 switches still need unique switch identifiers in such a system). But in principle, even "true" single-tree setups (with DSA links) should work just as fine, except for a small change which I can't test: dsa_towards_port should be used instead of dsa_upstream_port (I made the assumption that the routing port that any sja1105 should use towards its neighbours is the CPU port. That might not hold true in other setups). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-10 16:37:43 +00:00
dsa_8021q_crosschip_bridge_leave(ds, port, other_ds, other_port,
net: dsa: sja1105: save/restore VLANs using a delta commit method Managing the VLAN table that is present in hardware will become very difficult once we add a third operating state (best_effort_vlan_filtering). That is because correct cleanup (not too little, not too much) becomes virtually impossible, when VLANs can be added from the bridge layer, from dsa_8021q for basic tagging, for cross-chip bridging, as well as retagging rules for sub-VLANs and cross-chip sub-VLANs. So we need to rethink VLAN interaction with the switch in a more scalable way. In preparation for that, use the priv->expect_dsa_8021q boolean to classify any VLAN request received through .port_vlan_add or .port_vlan_del towards either one of 2 internal lists: bridge VLANs and dsa_8021q VLANs. Then, implement a central sja1105_build_vlan_table method that creates a VLAN configuration from scratch based on the 2 lists of VLANs kept by the driver, and based on the VLAN awareness state. Currently, if we are VLAN-unaware, install the dsa_8021q VLANs, otherwise the bridge VLANs. Then, implement a delta commit procedure that identifies which VLANs from this new configuration are actually different from the config previously committed to hardware. We apply the delta through the dynamic configuration interface (we don't reset the switch). The result is that the hardware should see the exact sequence of operations as before this patch. This also helps remove the "br" argument passed to dsa_8021q_crosschip_bridge_join, which it was only using to figure out whether it should commit the configuration back to us or not, based on the VLAN awareness state of the bridge. We can simplify that, by always allowing those VLANs inside of our dsa_8021q_vlans list, and committing those to hardware when necessary. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 17:20:29 +00:00
&priv->crosschip_links);
other_priv->expect_dsa_8021q = false;
net: dsa: sja1105: implement cross-chip bridging operations sja1105 uses dsa_8021q for DSA tagging, a format which is VLAN at heart and which is compatible with cascading. A complete description of this tagging format is in net/dsa/tag_8021q.c, but a quick summary is that each external-facing port tags incoming frames with a unique pvid, and this special VLAN is transmitted as tagged towards the inside of the system, and as untagged towards the exterior. The tag encodes the switch id and the source port index. This means that cross-chip bridging for dsa_8021q only entails adding the dsa_8021q pvids of one switch to the RX filter of the other switches. Everything else falls naturally into place, as long as the bottom-end of ports (the leaves in the tree) is comprised exclusively of dsa_8021q-compatible (i.e. sja1105 switches). Otherwise, there would be a chance that a front-panel switch transmits a packet tagged with a dsa_8021q header, header which it wouldn't be able to remove, and which would hence "leak" out. The only use case I tested (due to lack of board availability) was when the sja1105 switches are part of disjoint trees (however, this doesn't change the fact that multiple sja1105 switches still need unique switch identifiers in such a system). But in principle, even "true" single-tree setups (with DSA links) should work just as fine, except for a small change which I can't test: dsa_towards_port should be used instead of dsa_upstream_port (I made the assumption that the routing port that any sja1105 should use towards its neighbours is the CPU port. That might not hold true in other setups). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-10 16:37:43 +00:00
priv->expect_dsa_8021q = true;
net: dsa: sja1105: save/restore VLANs using a delta commit method Managing the VLAN table that is present in hardware will become very difficult once we add a third operating state (best_effort_vlan_filtering). That is because correct cleanup (not too little, not too much) becomes virtually impossible, when VLANs can be added from the bridge layer, from dsa_8021q for basic tagging, for cross-chip bridging, as well as retagging rules for sub-VLANs and cross-chip sub-VLANs. So we need to rethink VLAN interaction with the switch in a more scalable way. In preparation for that, use the priv->expect_dsa_8021q boolean to classify any VLAN request received through .port_vlan_add or .port_vlan_del towards either one of 2 internal lists: bridge VLANs and dsa_8021q VLANs. Then, implement a central sja1105_build_vlan_table method that creates a VLAN configuration from scratch based on the 2 lists of VLANs kept by the driver, and based on the VLAN awareness state. Currently, if we are VLAN-unaware, install the dsa_8021q VLANs, otherwise the bridge VLANs. Then, implement a delta commit procedure that identifies which VLANs from this new configuration are actually different from the config previously committed to hardware. We apply the delta through the dynamic configuration interface (we don't reset the switch). The result is that the hardware should see the exact sequence of operations as before this patch. This also helps remove the "br" argument passed to dsa_8021q_crosschip_bridge_join, which it was only using to figure out whether it should commit the configuration back to us or not, based on the VLAN awareness state of the bridge. We can simplify that, by always allowing those VLANs inside of our dsa_8021q_vlans list, and committing those to hardware when necessary. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 17:20:29 +00:00
dsa_8021q_crosschip_bridge_leave(other_ds, other_port, ds, port,
net: dsa: sja1105: implement cross-chip bridging operations sja1105 uses dsa_8021q for DSA tagging, a format which is VLAN at heart and which is compatible with cascading. A complete description of this tagging format is in net/dsa/tag_8021q.c, but a quick summary is that each external-facing port tags incoming frames with a unique pvid, and this special VLAN is transmitted as tagged towards the inside of the system, and as untagged towards the exterior. The tag encodes the switch id and the source port index. This means that cross-chip bridging for dsa_8021q only entails adding the dsa_8021q pvids of one switch to the RX filter of the other switches. Everything else falls naturally into place, as long as the bottom-end of ports (the leaves in the tree) is comprised exclusively of dsa_8021q-compatible (i.e. sja1105 switches). Otherwise, there would be a chance that a front-panel switch transmits a packet tagged with a dsa_8021q header, header which it wouldn't be able to remove, and which would hence "leak" out. The only use case I tested (due to lack of board availability) was when the sja1105 switches are part of disjoint trees (however, this doesn't change the fact that multiple sja1105 switches still need unique switch identifiers in such a system). But in principle, even "true" single-tree setups (with DSA links) should work just as fine, except for a small change which I can't test: dsa_towards_port should be used instead of dsa_upstream_port (I made the assumption that the routing port that any sja1105 should use towards its neighbours is the CPU port. That might not hold true in other setups). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-10 16:37:43 +00:00
&other_priv->crosschip_links);
priv->expect_dsa_8021q = false;
net: dsa: sja1105: implement cross-chip bridging operations sja1105 uses dsa_8021q for DSA tagging, a format which is VLAN at heart and which is compatible with cascading. A complete description of this tagging format is in net/dsa/tag_8021q.c, but a quick summary is that each external-facing port tags incoming frames with a unique pvid, and this special VLAN is transmitted as tagged towards the inside of the system, and as untagged towards the exterior. The tag encodes the switch id and the source port index. This means that cross-chip bridging for dsa_8021q only entails adding the dsa_8021q pvids of one switch to the RX filter of the other switches. Everything else falls naturally into place, as long as the bottom-end of ports (the leaves in the tree) is comprised exclusively of dsa_8021q-compatible (i.e. sja1105 switches). Otherwise, there would be a chance that a front-panel switch transmits a packet tagged with a dsa_8021q header, header which it wouldn't be able to remove, and which would hence "leak" out. The only use case I tested (due to lack of board availability) was when the sja1105 switches are part of disjoint trees (however, this doesn't change the fact that multiple sja1105 switches still need unique switch identifiers in such a system). But in principle, even "true" single-tree setups (with DSA links) should work just as fine, except for a small change which I can't test: dsa_towards_port should be used instead of dsa_upstream_port (I made the assumption that the routing port that any sja1105 should use towards its neighbours is the CPU port. That might not hold true in other setups). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-10 16:37:43 +00:00
}
}
static int sja1105_setup_8021q_tagging(struct dsa_switch *ds, bool enabled)
{
struct sja1105_private *priv = ds->priv;
int rc, i;
for (i = 0; i < SJA1105_NUM_PORTS; i++) {
priv->expect_dsa_8021q = true;
rc = dsa_port_setup_8021q_tagging(ds, i, enabled);
priv->expect_dsa_8021q = false;
if (rc < 0) {
dev_err(ds->dev, "Failed to setup VLAN tagging for port %d: %d\n",
i, rc);
return rc;
}
}
net: dsa: sja1105: implement cross-chip bridging operations sja1105 uses dsa_8021q for DSA tagging, a format which is VLAN at heart and which is compatible with cascading. A complete description of this tagging format is in net/dsa/tag_8021q.c, but a quick summary is that each external-facing port tags incoming frames with a unique pvid, and this special VLAN is transmitted as tagged towards the inside of the system, and as untagged towards the exterior. The tag encodes the switch id and the source port index. This means that cross-chip bridging for dsa_8021q only entails adding the dsa_8021q pvids of one switch to the RX filter of the other switches. Everything else falls naturally into place, as long as the bottom-end of ports (the leaves in the tree) is comprised exclusively of dsa_8021q-compatible (i.e. sja1105 switches). Otherwise, there would be a chance that a front-panel switch transmits a packet tagged with a dsa_8021q header, header which it wouldn't be able to remove, and which would hence "leak" out. The only use case I tested (due to lack of board availability) was when the sja1105 switches are part of disjoint trees (however, this doesn't change the fact that multiple sja1105 switches still need unique switch identifiers in such a system). But in principle, even "true" single-tree setups (with DSA links) should work just as fine, except for a small change which I can't test: dsa_towards_port should be used instead of dsa_upstream_port (I made the assumption that the routing port that any sja1105 should use towards its neighbours is the CPU port. That might not hold true in other setups). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-10 16:37:43 +00:00
dev_info(ds->dev, "%s switch tagging\n",
enabled ? "Enabled" : "Disabled");
return 0;
}
static enum dsa_tag_protocol
sja1105_get_tag_protocol(struct dsa_switch *ds, int port,
enum dsa_tag_protocol mp)
{
return DSA_TAG_PROTO_SJA1105;
}
static int sja1105_find_free_subvlan(u16 *subvlan_map, bool pvid)
{
int subvlan;
if (pvid)
return 0;
for (subvlan = 1; subvlan < DSA_8021Q_N_SUBVLAN; subvlan++)
if (subvlan_map[subvlan] == VLAN_N_VID)
return subvlan;
return -1;
}
static int sja1105_find_subvlan(u16 *subvlan_map, u16 vid)
{
int subvlan;
for (subvlan = 0; subvlan < DSA_8021Q_N_SUBVLAN; subvlan++)
if (subvlan_map[subvlan] == vid)
return subvlan;
return -1;
}
static int sja1105_find_committed_subvlan(struct sja1105_private *priv,
int port, u16 vid)
{
struct sja1105_port *sp = &priv->ports[port];
return sja1105_find_subvlan(sp->subvlan_map, vid);
}
static void sja1105_init_subvlan_map(u16 *subvlan_map)
{
int subvlan;
for (subvlan = 0; subvlan < DSA_8021Q_N_SUBVLAN; subvlan++)
subvlan_map[subvlan] = VLAN_N_VID;
}
static void sja1105_commit_subvlan_map(struct sja1105_private *priv, int port,
u16 *subvlan_map)
{
struct sja1105_port *sp = &priv->ports[port];
int subvlan;
for (subvlan = 0; subvlan < DSA_8021Q_N_SUBVLAN; subvlan++)
sp->subvlan_map[subvlan] = subvlan_map[subvlan];
}
net: dsa: sja1105: save/restore VLANs using a delta commit method Managing the VLAN table that is present in hardware will become very difficult once we add a third operating state (best_effort_vlan_filtering). That is because correct cleanup (not too little, not too much) becomes virtually impossible, when VLANs can be added from the bridge layer, from dsa_8021q for basic tagging, for cross-chip bridging, as well as retagging rules for sub-VLANs and cross-chip sub-VLANs. So we need to rethink VLAN interaction with the switch in a more scalable way. In preparation for that, use the priv->expect_dsa_8021q boolean to classify any VLAN request received through .port_vlan_add or .port_vlan_del towards either one of 2 internal lists: bridge VLANs and dsa_8021q VLANs. Then, implement a central sja1105_build_vlan_table method that creates a VLAN configuration from scratch based on the 2 lists of VLANs kept by the driver, and based on the VLAN awareness state. Currently, if we are VLAN-unaware, install the dsa_8021q VLANs, otherwise the bridge VLANs. Then, implement a delta commit procedure that identifies which VLANs from this new configuration are actually different from the config previously committed to hardware. We apply the delta through the dynamic configuration interface (we don't reset the switch). The result is that the hardware should see the exact sequence of operations as before this patch. This also helps remove the "br" argument passed to dsa_8021q_crosschip_bridge_join, which it was only using to figure out whether it should commit the configuration back to us or not, based on the VLAN awareness state of the bridge. We can simplify that, by always allowing those VLANs inside of our dsa_8021q_vlans list, and committing those to hardware when necessary. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 17:20:29 +00:00
static int sja1105_is_vlan_configured(struct sja1105_private *priv, u16 vid)
{
struct sja1105_vlan_lookup_entry *vlan;
int count, i;
vlan = priv->static_config.tables[BLK_IDX_VLAN_LOOKUP].entries;
count = priv->static_config.tables[BLK_IDX_VLAN_LOOKUP].entry_count;
for (i = 0; i < count; i++)
if (vlan[i].vlanid == vid)
return i;
/* Return an invalid entry index if not found */
return -1;
}
static int
sja1105_find_retagging_entry(struct sja1105_retagging_entry *retagging,
int count, int from_port, u16 from_vid,
u16 to_vid)
{
int i;
for (i = 0; i < count; i++)
if (retagging[i].ing_port == BIT(from_port) &&
retagging[i].vlan_ing == from_vid &&
retagging[i].vlan_egr == to_vid)
return i;
/* Return an invalid entry index if not found */
return -1;
}
net: dsa: sja1105: save/restore VLANs using a delta commit method Managing the VLAN table that is present in hardware will become very difficult once we add a third operating state (best_effort_vlan_filtering). That is because correct cleanup (not too little, not too much) becomes virtually impossible, when VLANs can be added from the bridge layer, from dsa_8021q for basic tagging, for cross-chip bridging, as well as retagging rules for sub-VLANs and cross-chip sub-VLANs. So we need to rethink VLAN interaction with the switch in a more scalable way. In preparation for that, use the priv->expect_dsa_8021q boolean to classify any VLAN request received through .port_vlan_add or .port_vlan_del towards either one of 2 internal lists: bridge VLANs and dsa_8021q VLANs. Then, implement a central sja1105_build_vlan_table method that creates a VLAN configuration from scratch based on the 2 lists of VLANs kept by the driver, and based on the VLAN awareness state. Currently, if we are VLAN-unaware, install the dsa_8021q VLANs, otherwise the bridge VLANs. Then, implement a delta commit procedure that identifies which VLANs from this new configuration are actually different from the config previously committed to hardware. We apply the delta through the dynamic configuration interface (we don't reset the switch). The result is that the hardware should see the exact sequence of operations as before this patch. This also helps remove the "br" argument passed to dsa_8021q_crosschip_bridge_join, which it was only using to figure out whether it should commit the configuration back to us or not, based on the VLAN awareness state of the bridge. We can simplify that, by always allowing those VLANs inside of our dsa_8021q_vlans list, and committing those to hardware when necessary. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 17:20:29 +00:00
static int sja1105_commit_vlans(struct sja1105_private *priv,
struct sja1105_vlan_lookup_entry *new_vlan,
struct sja1105_retagging_entry *new_retagging,
int num_retagging)
net: dsa: sja1105: save/restore VLANs using a delta commit method Managing the VLAN table that is present in hardware will become very difficult once we add a third operating state (best_effort_vlan_filtering). That is because correct cleanup (not too little, not too much) becomes virtually impossible, when VLANs can be added from the bridge layer, from dsa_8021q for basic tagging, for cross-chip bridging, as well as retagging rules for sub-VLANs and cross-chip sub-VLANs. So we need to rethink VLAN interaction with the switch in a more scalable way. In preparation for that, use the priv->expect_dsa_8021q boolean to classify any VLAN request received through .port_vlan_add or .port_vlan_del towards either one of 2 internal lists: bridge VLANs and dsa_8021q VLANs. Then, implement a central sja1105_build_vlan_table method that creates a VLAN configuration from scratch based on the 2 lists of VLANs kept by the driver, and based on the VLAN awareness state. Currently, if we are VLAN-unaware, install the dsa_8021q VLANs, otherwise the bridge VLANs. Then, implement a delta commit procedure that identifies which VLANs from this new configuration are actually different from the config previously committed to hardware. We apply the delta through the dynamic configuration interface (we don't reset the switch). The result is that the hardware should see the exact sequence of operations as before this patch. This also helps remove the "br" argument passed to dsa_8021q_crosschip_bridge_join, which it was only using to figure out whether it should commit the configuration back to us or not, based on the VLAN awareness state of the bridge. We can simplify that, by always allowing those VLANs inside of our dsa_8021q_vlans list, and committing those to hardware when necessary. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 17:20:29 +00:00
{
struct sja1105_retagging_entry *retagging;
net: dsa: sja1105: save/restore VLANs using a delta commit method Managing the VLAN table that is present in hardware will become very difficult once we add a third operating state (best_effort_vlan_filtering). That is because correct cleanup (not too little, not too much) becomes virtually impossible, when VLANs can be added from the bridge layer, from dsa_8021q for basic tagging, for cross-chip bridging, as well as retagging rules for sub-VLANs and cross-chip sub-VLANs. So we need to rethink VLAN interaction with the switch in a more scalable way. In preparation for that, use the priv->expect_dsa_8021q boolean to classify any VLAN request received through .port_vlan_add or .port_vlan_del towards either one of 2 internal lists: bridge VLANs and dsa_8021q VLANs. Then, implement a central sja1105_build_vlan_table method that creates a VLAN configuration from scratch based on the 2 lists of VLANs kept by the driver, and based on the VLAN awareness state. Currently, if we are VLAN-unaware, install the dsa_8021q VLANs, otherwise the bridge VLANs. Then, implement a delta commit procedure that identifies which VLANs from this new configuration are actually different from the config previously committed to hardware. We apply the delta through the dynamic configuration interface (we don't reset the switch). The result is that the hardware should see the exact sequence of operations as before this patch. This also helps remove the "br" argument passed to dsa_8021q_crosschip_bridge_join, which it was only using to figure out whether it should commit the configuration back to us or not, based on the VLAN awareness state of the bridge. We can simplify that, by always allowing those VLANs inside of our dsa_8021q_vlans list, and committing those to hardware when necessary. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 17:20:29 +00:00
struct sja1105_vlan_lookup_entry *vlan;
struct sja1105_table *table;
int num_vlans = 0;
int rc, i, k = 0;
/* VLAN table */
table = &priv->static_config.tables[BLK_IDX_VLAN_LOOKUP];
vlan = table->entries;
for (i = 0; i < VLAN_N_VID; i++) {
int match = sja1105_is_vlan_configured(priv, i);
if (new_vlan[i].vlanid != VLAN_N_VID)
num_vlans++;
if (new_vlan[i].vlanid == VLAN_N_VID && match >= 0) {
/* Was there before, no longer is. Delete */
dev_dbg(priv->ds->dev, "Deleting VLAN %d\n", i);
rc = sja1105_dynamic_config_write(priv,
BLK_IDX_VLAN_LOOKUP,
i, &vlan[match], false);
if (rc < 0)
return rc;
} else if (new_vlan[i].vlanid != VLAN_N_VID) {
/* Nothing changed, don't do anything */
if (match >= 0 &&
vlan[match].vlanid == new_vlan[i].vlanid &&
vlan[match].tag_port == new_vlan[i].tag_port &&
vlan[match].vlan_bc == new_vlan[i].vlan_bc &&
vlan[match].vmemb_port == new_vlan[i].vmemb_port)
continue;
/* Update entry */
dev_dbg(priv->ds->dev, "Updating VLAN %d\n", i);
rc = sja1105_dynamic_config_write(priv,
BLK_IDX_VLAN_LOOKUP,
i, &new_vlan[i],
true);
if (rc < 0)
return rc;
}
}
if (table->entry_count)
kfree(table->entries);
table->entries = kcalloc(num_vlans, table->ops->unpacked_entry_size,
GFP_KERNEL);
if (!table->entries)
return -ENOMEM;
table->entry_count = num_vlans;
vlan = table->entries;
for (i = 0; i < VLAN_N_VID; i++) {
if (new_vlan[i].vlanid == VLAN_N_VID)
continue;
vlan[k++] = new_vlan[i];
}
/* VLAN Retagging Table */
table = &priv->static_config.tables[BLK_IDX_RETAGGING];
retagging = table->entries;
for (i = 0; i < table->entry_count; i++) {
rc = sja1105_dynamic_config_write(priv, BLK_IDX_RETAGGING,
i, &retagging[i], false);
if (rc)
return rc;
}
if (table->entry_count)
kfree(table->entries);
table->entries = kcalloc(num_retagging, table->ops->unpacked_entry_size,
GFP_KERNEL);
if (!table->entries)
return -ENOMEM;
table->entry_count = num_retagging;
retagging = table->entries;
for (i = 0; i < num_retagging; i++) {
retagging[i] = new_retagging[i];
/* Update entry */
rc = sja1105_dynamic_config_write(priv, BLK_IDX_RETAGGING,
i, &retagging[i], true);
if (rc < 0)
return rc;
}
net: dsa: sja1105: save/restore VLANs using a delta commit method Managing the VLAN table that is present in hardware will become very difficult once we add a third operating state (best_effort_vlan_filtering). That is because correct cleanup (not too little, not too much) becomes virtually impossible, when VLANs can be added from the bridge layer, from dsa_8021q for basic tagging, for cross-chip bridging, as well as retagging rules for sub-VLANs and cross-chip sub-VLANs. So we need to rethink VLAN interaction with the switch in a more scalable way. In preparation for that, use the priv->expect_dsa_8021q boolean to classify any VLAN request received through .port_vlan_add or .port_vlan_del towards either one of 2 internal lists: bridge VLANs and dsa_8021q VLANs. Then, implement a central sja1105_build_vlan_table method that creates a VLAN configuration from scratch based on the 2 lists of VLANs kept by the driver, and based on the VLAN awareness state. Currently, if we are VLAN-unaware, install the dsa_8021q VLANs, otherwise the bridge VLANs. Then, implement a delta commit procedure that identifies which VLANs from this new configuration are actually different from the config previously committed to hardware. We apply the delta through the dynamic configuration interface (we don't reset the switch). The result is that the hardware should see the exact sequence of operations as before this patch. This also helps remove the "br" argument passed to dsa_8021q_crosschip_bridge_join, which it was only using to figure out whether it should commit the configuration back to us or not, based on the VLAN awareness state of the bridge. We can simplify that, by always allowing those VLANs inside of our dsa_8021q_vlans list, and committing those to hardware when necessary. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 17:20:29 +00:00
return 0;
}
struct sja1105_crosschip_vlan {
struct list_head list;
u16 vid;
bool untagged;
int port;
int other_port;
struct dsa_switch *other_ds;
};
net: dsa: sja1105: save/restore VLANs using a delta commit method Managing the VLAN table that is present in hardware will become very difficult once we add a third operating state (best_effort_vlan_filtering). That is because correct cleanup (not too little, not too much) becomes virtually impossible, when VLANs can be added from the bridge layer, from dsa_8021q for basic tagging, for cross-chip bridging, as well as retagging rules for sub-VLANs and cross-chip sub-VLANs. So we need to rethink VLAN interaction with the switch in a more scalable way. In preparation for that, use the priv->expect_dsa_8021q boolean to classify any VLAN request received through .port_vlan_add or .port_vlan_del towards either one of 2 internal lists: bridge VLANs and dsa_8021q VLANs. Then, implement a central sja1105_build_vlan_table method that creates a VLAN configuration from scratch based on the 2 lists of VLANs kept by the driver, and based on the VLAN awareness state. Currently, if we are VLAN-unaware, install the dsa_8021q VLANs, otherwise the bridge VLANs. Then, implement a delta commit procedure that identifies which VLANs from this new configuration are actually different from the config previously committed to hardware. We apply the delta through the dynamic configuration interface (we don't reset the switch). The result is that the hardware should see the exact sequence of operations as before this patch. This also helps remove the "br" argument passed to dsa_8021q_crosschip_bridge_join, which it was only using to figure out whether it should commit the configuration back to us or not, based on the VLAN awareness state of the bridge. We can simplify that, by always allowing those VLANs inside of our dsa_8021q_vlans list, and committing those to hardware when necessary. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 17:20:29 +00:00
struct sja1105_crosschip_switch {
struct list_head list;
struct dsa_switch *other_ds;
};
static int sja1105_commit_pvid(struct sja1105_private *priv)
{
struct sja1105_bridge_vlan *v;
struct list_head *vlan_list;
int rc = 0;
if (priv->vlan_state == SJA1105_VLAN_FILTERING_FULL)
vlan_list = &priv->bridge_vlans;
else
vlan_list = &priv->dsa_8021q_vlans;
list_for_each_entry(v, vlan_list, list) {
if (v->pvid) {
rc = sja1105_pvid_apply(priv, v->port, v->vid);
if (rc)
break;
}
}
return rc;
}
static int
sja1105_build_bridge_vlans(struct sja1105_private *priv,
struct sja1105_vlan_lookup_entry *new_vlan)
{
struct sja1105_bridge_vlan *v;
if (priv->vlan_state == SJA1105_VLAN_UNAWARE)
return 0;
list_for_each_entry(v, &priv->bridge_vlans, list) {
int match = v->vid;
new_vlan[match].vlanid = v->vid;
new_vlan[match].vmemb_port |= BIT(v->port);
new_vlan[match].vlan_bc |= BIT(v->port);
if (!v->untagged)
new_vlan[match].tag_port |= BIT(v->port);
}
return 0;
}
static int
sja1105_build_dsa_8021q_vlans(struct sja1105_private *priv,
struct sja1105_vlan_lookup_entry *new_vlan)
{
struct sja1105_bridge_vlan *v;
if (priv->vlan_state == SJA1105_VLAN_FILTERING_FULL)
return 0;
list_for_each_entry(v, &priv->dsa_8021q_vlans, list) {
int match = v->vid;
new_vlan[match].vlanid = v->vid;
new_vlan[match].vmemb_port |= BIT(v->port);
new_vlan[match].vlan_bc |= BIT(v->port);
if (!v->untagged)
new_vlan[match].tag_port |= BIT(v->port);
}
return 0;
}
static int sja1105_build_subvlans(struct sja1105_private *priv,
u16 subvlan_map[][DSA_8021Q_N_SUBVLAN],
struct sja1105_vlan_lookup_entry *new_vlan,
struct sja1105_retagging_entry *new_retagging,
int *num_retagging)
{
struct sja1105_bridge_vlan *v;
int k = *num_retagging;
if (priv->vlan_state != SJA1105_VLAN_BEST_EFFORT)
return 0;
list_for_each_entry(v, &priv->bridge_vlans, list) {
int upstream = dsa_upstream_port(priv->ds, v->port);
int match, subvlan;
u16 rx_vid;
/* Only sub-VLANs on user ports need to be applied.
* Bridge VLANs also include VLANs added automatically
* by DSA on the CPU port.
*/
if (!dsa_is_user_port(priv->ds, v->port))
continue;
subvlan = sja1105_find_subvlan(subvlan_map[v->port],
v->vid);
if (subvlan < 0) {
subvlan = sja1105_find_free_subvlan(subvlan_map[v->port],
v->pvid);
if (subvlan < 0) {
dev_err(priv->ds->dev, "No more free subvlans\n");
return -ENOSPC;
}
}
rx_vid = dsa_8021q_rx_vid_subvlan(priv->ds, v->port, subvlan);
/* @v->vid on @v->port needs to be retagged to @rx_vid
* on @upstream. Assume @v->vid on @v->port and on
* @upstream was already configured by the previous
* iteration over bridge_vlans.
*/
match = rx_vid;
new_vlan[match].vlanid = rx_vid;
new_vlan[match].vmemb_port |= BIT(v->port);
new_vlan[match].vmemb_port |= BIT(upstream);
new_vlan[match].vlan_bc |= BIT(v->port);
new_vlan[match].vlan_bc |= BIT(upstream);
/* The "untagged" flag is set the same as for the
* original VLAN
*/
if (!v->untagged)
new_vlan[match].tag_port |= BIT(v->port);
/* But it's always tagged towards the CPU */
new_vlan[match].tag_port |= BIT(upstream);
/* The Retagging Table generates packet *clones* with
* the new VLAN. This is a very odd hardware quirk
* which we need to suppress by dropping the original
* packet.
* Deny egress of the original VLAN towards the CPU
* port. This will force the switch to drop it, and
* we'll see only the retagged packets.
*/
match = v->vid;
new_vlan[match].vlan_bc &= ~BIT(upstream);
/* And the retagging itself */
new_retagging[k].vlan_ing = v->vid;
new_retagging[k].vlan_egr = rx_vid;
new_retagging[k].ing_port = BIT(v->port);
new_retagging[k].egr_port = BIT(upstream);
if (k++ == SJA1105_MAX_RETAGGING_COUNT) {
dev_err(priv->ds->dev, "No more retagging rules\n");
return -ENOSPC;
}
subvlan_map[v->port][subvlan] = v->vid;
}
*num_retagging = k;
return 0;
}
/* Sadly, in crosschip scenarios where the CPU port is also the link to another
* switch, we should retag backwards (the dsa_8021q vid to the original vid) on
* the CPU port of neighbour switches.
*/
static int
sja1105_build_crosschip_subvlans(struct sja1105_private *priv,
struct sja1105_vlan_lookup_entry *new_vlan,
struct sja1105_retagging_entry *new_retagging,
int *num_retagging)
{
struct sja1105_crosschip_vlan *tmp, *pos;
struct dsa_8021q_crosschip_link *c;
struct sja1105_bridge_vlan *v, *w;
struct list_head crosschip_vlans;
int k = *num_retagging;
int rc = 0;
if (priv->vlan_state != SJA1105_VLAN_BEST_EFFORT)
return 0;
INIT_LIST_HEAD(&crosschip_vlans);
list_for_each_entry(c, &priv->crosschip_links, list) {
struct sja1105_private *other_priv = c->other_ds->priv;
if (other_priv->vlan_state == SJA1105_VLAN_FILTERING_FULL)
continue;
/* Crosschip links are also added to the CPU ports.
* Ignore those.
*/
if (!dsa_is_user_port(priv->ds, c->port))
continue;
if (!dsa_is_user_port(c->other_ds, c->other_port))
continue;
/* Search for VLANs on the remote port */
list_for_each_entry(v, &other_priv->bridge_vlans, list) {
bool already_added = false;
bool we_have_it = false;
if (v->port != c->other_port)
continue;
/* If @v is a pvid on @other_ds, it does not need
* re-retagging, because its SVL field is 0 and we
* already allow that, via the dsa_8021q crosschip
* links.
*/
if (v->pvid)
continue;
/* Search for the VLAN on our local port */
list_for_each_entry(w, &priv->bridge_vlans, list) {
if (w->port == c->port && w->vid == v->vid) {
we_have_it = true;
break;
}
}
if (!we_have_it)
continue;
list_for_each_entry(tmp, &crosschip_vlans, list) {
if (tmp->vid == v->vid &&
tmp->untagged == v->untagged &&
tmp->port == c->port &&
tmp->other_port == v->port &&
tmp->other_ds == c->other_ds) {
already_added = true;
break;
}
}
if (already_added)
continue;
tmp = kzalloc(sizeof(*tmp), GFP_KERNEL);
if (!tmp) {
dev_err(priv->ds->dev, "Failed to allocate memory\n");
rc = -ENOMEM;
goto out;
}
tmp->vid = v->vid;
tmp->port = c->port;
tmp->other_port = v->port;
tmp->other_ds = c->other_ds;
tmp->untagged = v->untagged;
list_add(&tmp->list, &crosschip_vlans);
}
}
list_for_each_entry(tmp, &crosschip_vlans, list) {
struct sja1105_private *other_priv = tmp->other_ds->priv;
int upstream = dsa_upstream_port(priv->ds, tmp->port);
int match, subvlan;
u16 rx_vid;
subvlan = sja1105_find_committed_subvlan(other_priv,
tmp->other_port,
tmp->vid);
/* If this happens, it's a bug. The neighbour switch does not
* have a subvlan for tmp->vid on tmp->other_port, but it
* should, since we already checked for its vlan_state.
*/
if (WARN_ON(subvlan < 0)) {
rc = -EINVAL;
goto out;
}
rx_vid = dsa_8021q_rx_vid_subvlan(tmp->other_ds,
tmp->other_port,
subvlan);
/* The @rx_vid retagged from @tmp->vid on
* {@tmp->other_ds, @tmp->other_port} needs to be
* re-retagged to @tmp->vid on the way back to us.
*
* Assume the original @tmp->vid is already configured
* on this local switch, otherwise we wouldn't be
* retagging its subvlan on the other switch in the
* first place. We just need to add a reverse retagging
* rule for @rx_vid and install @rx_vid on our ports.
*/
match = rx_vid;
new_vlan[match].vlanid = rx_vid;
new_vlan[match].vmemb_port |= BIT(tmp->port);
new_vlan[match].vmemb_port |= BIT(upstream);
/* The "untagged" flag is set the same as for the
* original VLAN. And towards the CPU, it doesn't
* really matter, because @rx_vid will only receive
* traffic on that port. For consistency with other dsa_8021q
* VLANs, we'll keep the CPU port tagged.
*/
if (!tmp->untagged)
new_vlan[match].tag_port |= BIT(tmp->port);
new_vlan[match].tag_port |= BIT(upstream);
/* Deny egress of @rx_vid towards our front-panel port.
* This will force the switch to drop it, and we'll see
* only the re-retagged packets (having the original,
* pre-initial-retagging, VLAN @tmp->vid).
*/
new_vlan[match].vlan_bc &= ~BIT(tmp->port);
/* On reverse retagging, the same ingress VLAN goes to multiple
* ports. So we have an opportunity to create composite rules
* to not waste the limited space in the retagging table.
*/
k = sja1105_find_retagging_entry(new_retagging, *num_retagging,
upstream, rx_vid, tmp->vid);
if (k < 0) {
if (*num_retagging == SJA1105_MAX_RETAGGING_COUNT) {
dev_err(priv->ds->dev, "No more retagging rules\n");
rc = -ENOSPC;
goto out;
}
k = (*num_retagging)++;
}
/* And the retagging itself */
new_retagging[k].vlan_ing = rx_vid;
new_retagging[k].vlan_egr = tmp->vid;
new_retagging[k].ing_port = BIT(upstream);
new_retagging[k].egr_port |= BIT(tmp->port);
}
out:
list_for_each_entry_safe(tmp, pos, &crosschip_vlans, list) {
list_del(&tmp->list);
kfree(tmp);
}
return rc;
}
net: dsa: sja1105: save/restore VLANs using a delta commit method Managing the VLAN table that is present in hardware will become very difficult once we add a third operating state (best_effort_vlan_filtering). That is because correct cleanup (not too little, not too much) becomes virtually impossible, when VLANs can be added from the bridge layer, from dsa_8021q for basic tagging, for cross-chip bridging, as well as retagging rules for sub-VLANs and cross-chip sub-VLANs. So we need to rethink VLAN interaction with the switch in a more scalable way. In preparation for that, use the priv->expect_dsa_8021q boolean to classify any VLAN request received through .port_vlan_add or .port_vlan_del towards either one of 2 internal lists: bridge VLANs and dsa_8021q VLANs. Then, implement a central sja1105_build_vlan_table method that creates a VLAN configuration from scratch based on the 2 lists of VLANs kept by the driver, and based on the VLAN awareness state. Currently, if we are VLAN-unaware, install the dsa_8021q VLANs, otherwise the bridge VLANs. Then, implement a delta commit procedure that identifies which VLANs from this new configuration are actually different from the config previously committed to hardware. We apply the delta through the dynamic configuration interface (we don't reset the switch). The result is that the hardware should see the exact sequence of operations as before this patch. This also helps remove the "br" argument passed to dsa_8021q_crosschip_bridge_join, which it was only using to figure out whether it should commit the configuration back to us or not, based on the VLAN awareness state of the bridge. We can simplify that, by always allowing those VLANs inside of our dsa_8021q_vlans list, and committing those to hardware when necessary. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 17:20:29 +00:00
static int sja1105_build_vlan_table(struct sja1105_private *priv, bool notify);
static int sja1105_notify_crosschip_switches(struct sja1105_private *priv)
{
struct sja1105_crosschip_switch *s, *pos;
struct list_head crosschip_switches;
struct dsa_8021q_crosschip_link *c;
int rc = 0;
INIT_LIST_HEAD(&crosschip_switches);
list_for_each_entry(c, &priv->crosschip_links, list) {
bool already_added = false;
list_for_each_entry(s, &crosschip_switches, list) {
if (s->other_ds == c->other_ds) {
already_added = true;
break;
}
}
if (already_added)
continue;
s = kzalloc(sizeof(*s), GFP_KERNEL);
if (!s) {
dev_err(priv->ds->dev, "Failed to allocate memory\n");
rc = -ENOMEM;
goto out;
}
s->other_ds = c->other_ds;
list_add(&s->list, &crosschip_switches);
}
list_for_each_entry(s, &crosschip_switches, list) {
struct sja1105_private *other_priv = s->other_ds->priv;
rc = sja1105_build_vlan_table(other_priv, false);
if (rc)
goto out;
}
out:
list_for_each_entry_safe(s, pos, &crosschip_switches, list) {
list_del(&s->list);
kfree(s);
}
return rc;
}
static int sja1105_build_vlan_table(struct sja1105_private *priv, bool notify)
{
u16 subvlan_map[SJA1105_NUM_PORTS][DSA_8021Q_N_SUBVLAN];
struct sja1105_retagging_entry *new_retagging;
net: dsa: sja1105: save/restore VLANs using a delta commit method Managing the VLAN table that is present in hardware will become very difficult once we add a third operating state (best_effort_vlan_filtering). That is because correct cleanup (not too little, not too much) becomes virtually impossible, when VLANs can be added from the bridge layer, from dsa_8021q for basic tagging, for cross-chip bridging, as well as retagging rules for sub-VLANs and cross-chip sub-VLANs. So we need to rethink VLAN interaction with the switch in a more scalable way. In preparation for that, use the priv->expect_dsa_8021q boolean to classify any VLAN request received through .port_vlan_add or .port_vlan_del towards either one of 2 internal lists: bridge VLANs and dsa_8021q VLANs. Then, implement a central sja1105_build_vlan_table method that creates a VLAN configuration from scratch based on the 2 lists of VLANs kept by the driver, and based on the VLAN awareness state. Currently, if we are VLAN-unaware, install the dsa_8021q VLANs, otherwise the bridge VLANs. Then, implement a delta commit procedure that identifies which VLANs from this new configuration are actually different from the config previously committed to hardware. We apply the delta through the dynamic configuration interface (we don't reset the switch). The result is that the hardware should see the exact sequence of operations as before this patch. This also helps remove the "br" argument passed to dsa_8021q_crosschip_bridge_join, which it was only using to figure out whether it should commit the configuration back to us or not, based on the VLAN awareness state of the bridge. We can simplify that, by always allowing those VLANs inside of our dsa_8021q_vlans list, and committing those to hardware when necessary. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 17:20:29 +00:00
struct sja1105_vlan_lookup_entry *new_vlan;
struct sja1105_table *table;
int i, num_retagging = 0;
net: dsa: sja1105: save/restore VLANs using a delta commit method Managing the VLAN table that is present in hardware will become very difficult once we add a third operating state (best_effort_vlan_filtering). That is because correct cleanup (not too little, not too much) becomes virtually impossible, when VLANs can be added from the bridge layer, from dsa_8021q for basic tagging, for cross-chip bridging, as well as retagging rules for sub-VLANs and cross-chip sub-VLANs. So we need to rethink VLAN interaction with the switch in a more scalable way. In preparation for that, use the priv->expect_dsa_8021q boolean to classify any VLAN request received through .port_vlan_add or .port_vlan_del towards either one of 2 internal lists: bridge VLANs and dsa_8021q VLANs. Then, implement a central sja1105_build_vlan_table method that creates a VLAN configuration from scratch based on the 2 lists of VLANs kept by the driver, and based on the VLAN awareness state. Currently, if we are VLAN-unaware, install the dsa_8021q VLANs, otherwise the bridge VLANs. Then, implement a delta commit procedure that identifies which VLANs from this new configuration are actually different from the config previously committed to hardware. We apply the delta through the dynamic configuration interface (we don't reset the switch). The result is that the hardware should see the exact sequence of operations as before this patch. This also helps remove the "br" argument passed to dsa_8021q_crosschip_bridge_join, which it was only using to figure out whether it should commit the configuration back to us or not, based on the VLAN awareness state of the bridge. We can simplify that, by always allowing those VLANs inside of our dsa_8021q_vlans list, and committing those to hardware when necessary. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 17:20:29 +00:00
int rc;
table = &priv->static_config.tables[BLK_IDX_VLAN_LOOKUP];
new_vlan = kcalloc(VLAN_N_VID,
table->ops->unpacked_entry_size, GFP_KERNEL);
if (!new_vlan)
return -ENOMEM;
table = &priv->static_config.tables[BLK_IDX_VLAN_LOOKUP];
new_retagging = kcalloc(SJA1105_MAX_RETAGGING_COUNT,
table->ops->unpacked_entry_size, GFP_KERNEL);
if (!new_retagging) {
kfree(new_vlan);
return -ENOMEM;
}
net: dsa: sja1105: save/restore VLANs using a delta commit method Managing the VLAN table that is present in hardware will become very difficult once we add a third operating state (best_effort_vlan_filtering). That is because correct cleanup (not too little, not too much) becomes virtually impossible, when VLANs can be added from the bridge layer, from dsa_8021q for basic tagging, for cross-chip bridging, as well as retagging rules for sub-VLANs and cross-chip sub-VLANs. So we need to rethink VLAN interaction with the switch in a more scalable way. In preparation for that, use the priv->expect_dsa_8021q boolean to classify any VLAN request received through .port_vlan_add or .port_vlan_del towards either one of 2 internal lists: bridge VLANs and dsa_8021q VLANs. Then, implement a central sja1105_build_vlan_table method that creates a VLAN configuration from scratch based on the 2 lists of VLANs kept by the driver, and based on the VLAN awareness state. Currently, if we are VLAN-unaware, install the dsa_8021q VLANs, otherwise the bridge VLANs. Then, implement a delta commit procedure that identifies which VLANs from this new configuration are actually different from the config previously committed to hardware. We apply the delta through the dynamic configuration interface (we don't reset the switch). The result is that the hardware should see the exact sequence of operations as before this patch. This also helps remove the "br" argument passed to dsa_8021q_crosschip_bridge_join, which it was only using to figure out whether it should commit the configuration back to us or not, based on the VLAN awareness state of the bridge. We can simplify that, by always allowing those VLANs inside of our dsa_8021q_vlans list, and committing those to hardware when necessary. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 17:20:29 +00:00
for (i = 0; i < VLAN_N_VID; i++)
new_vlan[i].vlanid = VLAN_N_VID;
for (i = 0; i < SJA1105_MAX_RETAGGING_COUNT; i++)
new_retagging[i].vlan_ing = VLAN_N_VID;
for (i = 0; i < priv->ds->num_ports; i++)
sja1105_init_subvlan_map(subvlan_map[i]);
net: dsa: sja1105: save/restore VLANs using a delta commit method Managing the VLAN table that is present in hardware will become very difficult once we add a third operating state (best_effort_vlan_filtering). That is because correct cleanup (not too little, not too much) becomes virtually impossible, when VLANs can be added from the bridge layer, from dsa_8021q for basic tagging, for cross-chip bridging, as well as retagging rules for sub-VLANs and cross-chip sub-VLANs. So we need to rethink VLAN interaction with the switch in a more scalable way. In preparation for that, use the priv->expect_dsa_8021q boolean to classify any VLAN request received through .port_vlan_add or .port_vlan_del towards either one of 2 internal lists: bridge VLANs and dsa_8021q VLANs. Then, implement a central sja1105_build_vlan_table method that creates a VLAN configuration from scratch based on the 2 lists of VLANs kept by the driver, and based on the VLAN awareness state. Currently, if we are VLAN-unaware, install the dsa_8021q VLANs, otherwise the bridge VLANs. Then, implement a delta commit procedure that identifies which VLANs from this new configuration are actually different from the config previously committed to hardware. We apply the delta through the dynamic configuration interface (we don't reset the switch). The result is that the hardware should see the exact sequence of operations as before this patch. This also helps remove the "br" argument passed to dsa_8021q_crosschip_bridge_join, which it was only using to figure out whether it should commit the configuration back to us or not, based on the VLAN awareness state of the bridge. We can simplify that, by always allowing those VLANs inside of our dsa_8021q_vlans list, and committing those to hardware when necessary. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 17:20:29 +00:00
/* Bridge VLANs */
rc = sja1105_build_bridge_vlans(priv, new_vlan);
if (rc)
goto out;
/* VLANs necessary for dsa_8021q operation, given to us by tag_8021q.c:
* - RX VLANs
* - TX VLANs
* - Crosschip links
*/
rc = sja1105_build_dsa_8021q_vlans(priv, new_vlan);
if (rc)
goto out;
/* Private VLANs necessary for dsa_8021q operation, which we need to
* determine on our own:
* - Sub-VLANs
* - Sub-VLANs of crosschip switches
*/
rc = sja1105_build_subvlans(priv, subvlan_map, new_vlan, new_retagging,
&num_retagging);
if (rc)
goto out;
rc = sja1105_build_crosschip_subvlans(priv, new_vlan, new_retagging,
&num_retagging);
if (rc)
goto out;
rc = sja1105_commit_vlans(priv, new_vlan, new_retagging, num_retagging);
net: dsa: sja1105: save/restore VLANs using a delta commit method Managing the VLAN table that is present in hardware will become very difficult once we add a third operating state (best_effort_vlan_filtering). That is because correct cleanup (not too little, not too much) becomes virtually impossible, when VLANs can be added from the bridge layer, from dsa_8021q for basic tagging, for cross-chip bridging, as well as retagging rules for sub-VLANs and cross-chip sub-VLANs. So we need to rethink VLAN interaction with the switch in a more scalable way. In preparation for that, use the priv->expect_dsa_8021q boolean to classify any VLAN request received through .port_vlan_add or .port_vlan_del towards either one of 2 internal lists: bridge VLANs and dsa_8021q VLANs. Then, implement a central sja1105_build_vlan_table method that creates a VLAN configuration from scratch based on the 2 lists of VLANs kept by the driver, and based on the VLAN awareness state. Currently, if we are VLAN-unaware, install the dsa_8021q VLANs, otherwise the bridge VLANs. Then, implement a delta commit procedure that identifies which VLANs from this new configuration are actually different from the config previously committed to hardware. We apply the delta through the dynamic configuration interface (we don't reset the switch). The result is that the hardware should see the exact sequence of operations as before this patch. This also helps remove the "br" argument passed to dsa_8021q_crosschip_bridge_join, which it was only using to figure out whether it should commit the configuration back to us or not, based on the VLAN awareness state of the bridge. We can simplify that, by always allowing those VLANs inside of our dsa_8021q_vlans list, and committing those to hardware when necessary. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 17:20:29 +00:00
if (rc)
goto out;
rc = sja1105_commit_pvid(priv);
if (rc)
goto out;
for (i = 0; i < priv->ds->num_ports; i++)
sja1105_commit_subvlan_map(priv, i, subvlan_map[i]);
net: dsa: sja1105: save/restore VLANs using a delta commit method Managing the VLAN table that is present in hardware will become very difficult once we add a third operating state (best_effort_vlan_filtering). That is because correct cleanup (not too little, not too much) becomes virtually impossible, when VLANs can be added from the bridge layer, from dsa_8021q for basic tagging, for cross-chip bridging, as well as retagging rules for sub-VLANs and cross-chip sub-VLANs. So we need to rethink VLAN interaction with the switch in a more scalable way. In preparation for that, use the priv->expect_dsa_8021q boolean to classify any VLAN request received through .port_vlan_add or .port_vlan_del towards either one of 2 internal lists: bridge VLANs and dsa_8021q VLANs. Then, implement a central sja1105_build_vlan_table method that creates a VLAN configuration from scratch based on the 2 lists of VLANs kept by the driver, and based on the VLAN awareness state. Currently, if we are VLAN-unaware, install the dsa_8021q VLANs, otherwise the bridge VLANs. Then, implement a delta commit procedure that identifies which VLANs from this new configuration are actually different from the config previously committed to hardware. We apply the delta through the dynamic configuration interface (we don't reset the switch). The result is that the hardware should see the exact sequence of operations as before this patch. This also helps remove the "br" argument passed to dsa_8021q_crosschip_bridge_join, which it was only using to figure out whether it should commit the configuration back to us or not, based on the VLAN awareness state of the bridge. We can simplify that, by always allowing those VLANs inside of our dsa_8021q_vlans list, and committing those to hardware when necessary. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 17:20:29 +00:00
if (notify) {
rc = sja1105_notify_crosschip_switches(priv);
if (rc)
goto out;
}
out:
kfree(new_vlan);
kfree(new_retagging);
net: dsa: sja1105: save/restore VLANs using a delta commit method Managing the VLAN table that is present in hardware will become very difficult once we add a third operating state (best_effort_vlan_filtering). That is because correct cleanup (not too little, not too much) becomes virtually impossible, when VLANs can be added from the bridge layer, from dsa_8021q for basic tagging, for cross-chip bridging, as well as retagging rules for sub-VLANs and cross-chip sub-VLANs. So we need to rethink VLAN interaction with the switch in a more scalable way. In preparation for that, use the priv->expect_dsa_8021q boolean to classify any VLAN request received through .port_vlan_add or .port_vlan_del towards either one of 2 internal lists: bridge VLANs and dsa_8021q VLANs. Then, implement a central sja1105_build_vlan_table method that creates a VLAN configuration from scratch based on the 2 lists of VLANs kept by the driver, and based on the VLAN awareness state. Currently, if we are VLAN-unaware, install the dsa_8021q VLANs, otherwise the bridge VLANs. Then, implement a delta commit procedure that identifies which VLANs from this new configuration are actually different from the config previously committed to hardware. We apply the delta through the dynamic configuration interface (we don't reset the switch). The result is that the hardware should see the exact sequence of operations as before this patch. This also helps remove the "br" argument passed to dsa_8021q_crosschip_bridge_join, which it was only using to figure out whether it should commit the configuration back to us or not, based on the VLAN awareness state of the bridge. We can simplify that, by always allowing those VLANs inside of our dsa_8021q_vlans list, and committing those to hardware when necessary. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 17:20:29 +00:00
return rc;
}
/* Select the list to which we should add this VLAN. */
static struct list_head *sja1105_classify_vlan(struct sja1105_private *priv,
u16 vid)
{
if (priv->expect_dsa_8021q)
return &priv->dsa_8021q_vlans;
return &priv->bridge_vlans;
}
static int sja1105_vlan_prepare(struct dsa_switch *ds, int port,
const struct switchdev_obj_port_vlan *vlan)
{
struct sja1105_private *priv = ds->priv;
u16 vid;
if (priv->vlan_state == SJA1105_VLAN_FILTERING_FULL)
return 0;
/* If the user wants best-effort VLAN filtering (aka vlan_filtering
* bridge plus tagging), be sure to at least deny alterations to the
* configuration done by dsa_8021q.
*/
for (vid = vlan->vid_begin; vid <= vlan->vid_end; vid++) {
if (!priv->expect_dsa_8021q && vid_is_dsa_8021q(vid)) {
dev_err(ds->dev, "Range 1024-3071 reserved for dsa_8021q operation\n");
return -EBUSY;
}
}
return 0;
}
/* The TPID setting belongs to the General Parameters table,
* which can only be partially reconfigured at runtime (and not the TPID).
* So a switch reset is required.
*/
static int sja1105_vlan_filtering(struct dsa_switch *ds, int port, bool enabled)
{
net: dsa: sja1105: Fix broken learning with vlan_filtering disabled When put under a bridge with vlan_filtering 0, the SJA1105 ports will flood all traffic as if learning was broken. This is because learning interferes with the rx_vid's configured by dsa_8021q as unique pvid's. So learning technically still *does* work, it's just that the learnt entries never get matched due to their unique VLAN ID. The setting that saves the day is Shared VLAN Learning, which on this switch family works exactly as desired: VLAN tagging still works (untagged traffic gets the correct pvid) and FDB entries are still populated with the correct contents including VID. Also, a frame cannot violate the forwarding domain restrictions enforced by its classified VLAN. It is just that the VID is ignored when looking up the FDB for taking a forwarding decision (selecting the egress port). This patch activates SVL, and the result is that frames with a learnt DMAC are no longer flooded in the scenario described above. Now exactly *because* SVL works as desired, we have to revisit some earlier patches: - It is no longer necessary to manipulate the VID of the 'bridge fdb {add,del}' command when vlan_filtering is off. This is because now, SVL is enabled for that case, so the actual VID does not matter*. - It is still desirable to hide dsa_8021q VID's in the FDB dump callback. But right now the dump callback should no longer hide duplicates (one per each front panel port's pvid, plus one for the VLAN that the CPU port is going to tag a TX frame with), because there shouldn't be any (the switch will match a single FDB entry no matter its VID anyway). * Not really... It's no longer necessary to transform a 'bridge fdb add' into 5 fdb add operations, but the user might still add a fdb entry with any vid, and all of them would appear as duplicates in 'bridge fdb show'. So force a 'bridge fdb add' to insert the VID of 0**, so that we can prune the duplicates at insertion time. ** The VID of 0 is better than 1 because it is always guaranteed to be in the ports' hardware filter. DSA also avoids putting the VID inside the netlink response message towards the bridge driver when we return this particular VID, which makes it suitable for FDB entries learnt with vlan_filtering off. Fixes: 227d07a07ef1 ("net: dsa: sja1105: Add support for traffic through standalone ports") Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Georg Waibel <georg.waibel@sensor-technik.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-04 22:38:44 +00:00
struct sja1105_l2_lookup_params_entry *l2_lookup_params;
struct sja1105_general_params_entry *general_params;
struct sja1105_private *priv = ds->priv;
enum sja1105_vlan_state state;
struct sja1105_table *table;
struct sja1105_rule *rule;
bool want_tagging;
u16 tpid, tpid2;
int rc;
list_for_each_entry(rule, &priv->flow_block.rules, list) {
if (rule->type == SJA1105_RULE_VL) {
dev_err(ds->dev,
"Cannot change VLAN filtering state while VL rules are active\n");
return -EBUSY;
}
}
if (enabled) {
/* Enable VLAN filtering. */
tpid = ETH_P_8021Q;
tpid2 = ETH_P_8021AD;
} else {
/* Disable VLAN filtering. */
tpid = ETH_P_SJA1105;
tpid2 = ETH_P_SJA1105;
}
net: dsa: sja1105: prepare tagger for handling DSA tags and VLAN simultaneously In VLAN-unaware mode, sja1105 uses VLAN tags with a custom TPID of 0xdadb. While in the yet-to-be introduced best_effort_vlan_filtering mode, it needs to work with normal VLAN TPID values. A complication arises when we must transmit a VLAN-tagged packet to the switch when it's in VLAN-aware mode. We need to construct a packet with 2 VLAN tags, and the switch will use the outer header for routing and pop it on egress. But sadly, here the 2 hardware generations don't behave the same: - E/T switches won't pop an ETH_P_8021AD tag on egress, it seems (packets will remain double-tagged). - P/Q/R/S switches will drop a packet with 2 ETH_P_8021Q tags (it looks like it tries to prevent VLAN hopping). But looks like the reverse is also true: - E/T switches have no problem popping the outer tag from packets with 2 ETH_P_8021Q tags. - P/Q/R/S will have no problem popping a single tag even if that is ETH_P_8021AD. So it is clear that if we want the hardware to work with dsa_8021q tagging in VLAN-aware mode, we need to send different TPIDs depending on revision. Keep that information in priv->info->qinq_tpid. The per-port tagger structure will hold an xmit_tpid value that depends not only upon the qinq_tpid, but also upon the VLAN awareness state itself (in case we must transmit using 0xdadb). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 17:20:32 +00:00
for (port = 0; port < ds->num_ports; port++) {
struct sja1105_port *sp = &priv->ports[port];
if (enabled)
sp->xmit_tpid = priv->info->qinq_tpid;
else
sp->xmit_tpid = ETH_P_SJA1105;
}
if (!enabled)
state = SJA1105_VLAN_UNAWARE;
else if (priv->best_effort_vlan_filtering)
state = SJA1105_VLAN_BEST_EFFORT;
else
state = SJA1105_VLAN_FILTERING_FULL;
if (priv->vlan_state == state)
return 0;
priv->vlan_state = state;
want_tagging = (state == SJA1105_VLAN_UNAWARE ||
state == SJA1105_VLAN_BEST_EFFORT);
table = &priv->static_config.tables[BLK_IDX_GENERAL_PARAMS];
general_params = table->entries;
/* EtherType used to identify inner tagged (C-tag) VLAN traffic */
general_params->tpid = tpid;
/* EtherType used to identify outer tagged (S-tag) VLAN traffic */
general_params->tpid2 = tpid2;
net: dsa: sja1105: Limit use of incl_srcpt to bridge+vlan mode The incl_srcpt setting makes the switch mangle the destination MACs of multicast frames trapped to the CPU - a primitive tagging mechanism that works even when we cannot use the 802.1Q software features. The downside is that the two multicast MAC addresses that the switch traps for L2 PTP (01-80-C2-00-00-0E and 01-1B-19-00-00-00) quickly turn into a lot more, as the switch encodes the source port and switch id into bytes 3 and 4 of the MAC. The resulting range of MAC addresses would need to be installed manually into the DSA master port's multicast MAC filter, and even then, most devices might not have a large enough MAC filtering table. As a result, only limit use of incl_srcpt to when it's strictly necessary: when under a VLAN filtering bridge. This fixes PTP in non-bridged mode (standalone ports). Otherwise, PTP frames, as well as metadata follow-up frames holding RX timestamps won't be received because they will be blocked by the master port's MAC filter. Linuxptp doesn't help, because it only requests the addition of the unmodified PTP MACs to the multicast filter. This issue is not seen in bridged mode because the master port is put in promiscuous mode when the slave ports are enslaved to a bridge. Therefore, there is no downside to having the incl_srcpt mechanism active there. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-08 12:04:32 +00:00
/* When VLAN filtering is on, we need to at least be able to
* decode management traffic through the "backup plan".
*/
general_params->incl_srcpt1 = enabled;
general_params->incl_srcpt0 = enabled;
want_tagging = priv->best_effort_vlan_filtering || !enabled;
net: dsa: sja1105: Fix broken learning with vlan_filtering disabled When put under a bridge with vlan_filtering 0, the SJA1105 ports will flood all traffic as if learning was broken. This is because learning interferes with the rx_vid's configured by dsa_8021q as unique pvid's. So learning technically still *does* work, it's just that the learnt entries never get matched due to their unique VLAN ID. The setting that saves the day is Shared VLAN Learning, which on this switch family works exactly as desired: VLAN tagging still works (untagged traffic gets the correct pvid) and FDB entries are still populated with the correct contents including VID. Also, a frame cannot violate the forwarding domain restrictions enforced by its classified VLAN. It is just that the VID is ignored when looking up the FDB for taking a forwarding decision (selecting the egress port). This patch activates SVL, and the result is that frames with a learnt DMAC are no longer flooded in the scenario described above. Now exactly *because* SVL works as desired, we have to revisit some earlier patches: - It is no longer necessary to manipulate the VID of the 'bridge fdb {add,del}' command when vlan_filtering is off. This is because now, SVL is enabled for that case, so the actual VID does not matter*. - It is still desirable to hide dsa_8021q VID's in the FDB dump callback. But right now the dump callback should no longer hide duplicates (one per each front panel port's pvid, plus one for the VLAN that the CPU port is going to tag a TX frame with), because there shouldn't be any (the switch will match a single FDB entry no matter its VID anyway). * Not really... It's no longer necessary to transform a 'bridge fdb add' into 5 fdb add operations, but the user might still add a fdb entry with any vid, and all of them would appear as duplicates in 'bridge fdb show'. So force a 'bridge fdb add' to insert the VID of 0**, so that we can prune the duplicates at insertion time. ** The VID of 0 is better than 1 because it is always guaranteed to be in the ports' hardware filter. DSA also avoids putting the VID inside the netlink response message towards the bridge driver when we return this particular VID, which makes it suitable for FDB entries learnt with vlan_filtering off. Fixes: 227d07a07ef1 ("net: dsa: sja1105: Add support for traffic through standalone ports") Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Georg Waibel <georg.waibel@sensor-technik.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-04 22:38:44 +00:00
/* VLAN filtering => independent VLAN learning.
* No VLAN filtering (or best effort) => shared VLAN learning.
net: dsa: sja1105: Fix broken learning with vlan_filtering disabled When put under a bridge with vlan_filtering 0, the SJA1105 ports will flood all traffic as if learning was broken. This is because learning interferes with the rx_vid's configured by dsa_8021q as unique pvid's. So learning technically still *does* work, it's just that the learnt entries never get matched due to their unique VLAN ID. The setting that saves the day is Shared VLAN Learning, which on this switch family works exactly as desired: VLAN tagging still works (untagged traffic gets the correct pvid) and FDB entries are still populated with the correct contents including VID. Also, a frame cannot violate the forwarding domain restrictions enforced by its classified VLAN. It is just that the VID is ignored when looking up the FDB for taking a forwarding decision (selecting the egress port). This patch activates SVL, and the result is that frames with a learnt DMAC are no longer flooded in the scenario described above. Now exactly *because* SVL works as desired, we have to revisit some earlier patches: - It is no longer necessary to manipulate the VID of the 'bridge fdb {add,del}' command when vlan_filtering is off. This is because now, SVL is enabled for that case, so the actual VID does not matter*. - It is still desirable to hide dsa_8021q VID's in the FDB dump callback. But right now the dump callback should no longer hide duplicates (one per each front panel port's pvid, plus one for the VLAN that the CPU port is going to tag a TX frame with), because there shouldn't be any (the switch will match a single FDB entry no matter its VID anyway). * Not really... It's no longer necessary to transform a 'bridge fdb add' into 5 fdb add operations, but the user might still add a fdb entry with any vid, and all of them would appear as duplicates in 'bridge fdb show'. So force a 'bridge fdb add' to insert the VID of 0**, so that we can prune the duplicates at insertion time. ** The VID of 0 is better than 1 because it is always guaranteed to be in the ports' hardware filter. DSA also avoids putting the VID inside the netlink response message towards the bridge driver when we return this particular VID, which makes it suitable for FDB entries learnt with vlan_filtering off. Fixes: 227d07a07ef1 ("net: dsa: sja1105: Add support for traffic through standalone ports") Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Georg Waibel <georg.waibel@sensor-technik.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-04 22:38:44 +00:00
*
* In shared VLAN learning mode, untagged traffic still gets
* pvid-tagged, and the FDB table gets populated with entries
* containing the "real" (pvid or from VLAN tag) VLAN ID.
* However the switch performs a masked L2 lookup in the FDB,
* effectively only looking up a frame's DMAC (and not VID) for the
* forwarding decision.
*
* This is extremely convenient for us, because in modes with
* vlan_filtering=0, dsa_8021q actually installs unique pvid's into
* each front panel port. This is good for identification but breaks
* learning badly - the VID of the learnt FDB entry is unique, aka
* no frames coming from any other port are going to have it. So
* for forwarding purposes, this is as though learning was broken
* (all frames get flooded).
*/
table = &priv->static_config.tables[BLK_IDX_L2_LOOKUP_PARAMS];
l2_lookup_params = table->entries;
l2_lookup_params->shared_learn = want_tagging;
net: dsa: sja1105: Fix broken learning with vlan_filtering disabled When put under a bridge with vlan_filtering 0, the SJA1105 ports will flood all traffic as if learning was broken. This is because learning interferes with the rx_vid's configured by dsa_8021q as unique pvid's. So learning technically still *does* work, it's just that the learnt entries never get matched due to their unique VLAN ID. The setting that saves the day is Shared VLAN Learning, which on this switch family works exactly as desired: VLAN tagging still works (untagged traffic gets the correct pvid) and FDB entries are still populated with the correct contents including VID. Also, a frame cannot violate the forwarding domain restrictions enforced by its classified VLAN. It is just that the VID is ignored when looking up the FDB for taking a forwarding decision (selecting the egress port). This patch activates SVL, and the result is that frames with a learnt DMAC are no longer flooded in the scenario described above. Now exactly *because* SVL works as desired, we have to revisit some earlier patches: - It is no longer necessary to manipulate the VID of the 'bridge fdb {add,del}' command when vlan_filtering is off. This is because now, SVL is enabled for that case, so the actual VID does not matter*. - It is still desirable to hide dsa_8021q VID's in the FDB dump callback. But right now the dump callback should no longer hide duplicates (one per each front panel port's pvid, plus one for the VLAN that the CPU port is going to tag a TX frame with), because there shouldn't be any (the switch will match a single FDB entry no matter its VID anyway). * Not really... It's no longer necessary to transform a 'bridge fdb add' into 5 fdb add operations, but the user might still add a fdb entry with any vid, and all of them would appear as duplicates in 'bridge fdb show'. So force a 'bridge fdb add' to insert the VID of 0**, so that we can prune the duplicates at insertion time. ** The VID of 0 is better than 1 because it is always guaranteed to be in the ports' hardware filter. DSA also avoids putting the VID inside the netlink response message towards the bridge driver when we return this particular VID, which makes it suitable for FDB entries learnt with vlan_filtering off. Fixes: 227d07a07ef1 ("net: dsa: sja1105: Add support for traffic through standalone ports") Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Georg Waibel <georg.waibel@sensor-technik.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-04 22:38:44 +00:00
sja1105_frame_memory_partitioning(priv);
net: dsa: sja1105: avoid invalid state in sja1105_vlan_filtering Be there 2 switches spi/spi2.0 and spi/spi2.1 in a cross-chip setup, both under the same VLAN-filtering bridge, both in the SJA1105_VLAN_BEST_EFFORT state. If we try to change the VLAN state of one of the switches (to SJA1105_VLAN_FILTERING_FULL) we get the following error: devlink dev param set spi/spi2.1 name best_effort_vlan_filtering value false cmode runtime [ 38.325683] sja1105 spi2.1: Not allowed to overcommit frame memory. L2 memory partitions and VL memory partitions share the same space. The sum of all 16 memory partitions is not allowed to be larger than 929 128-byte blocks (or 910 with retagging). Please adjust l2-forwarding-parameters-table.part_spc and/or vl-forwarding-parameters-table.partspc. [ 38.356803] sja1105 spi2.1: Invalid config, cannot upload This is because the spi/spi2.1 switch doesn't support tagging anymore in the SJA1105_VLAN_FILTERING_FULL state, so it doesn't need to have any retagging rules defined. Great, so it can use more frame memory (retagging consumes extra memory). But the built-in low-level static config checker from the sja1105 driver says "not so fast, you've increased the frame memory to non-retagging values, but you still kept the retagging rules in the static config". So we need to rebuild the VLAN table immediately before re-uploading the static config, operation which will take care, based on the new VLAN state, of removing the retagging rules. Fixes: 3f01c91aab92 ("net: dsa: sja1105: implement VLAN retagging for dsa_8021q sub-VLANs") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-27 17:20:38 +00:00
rc = sja1105_build_vlan_table(priv, false);
if (rc)
return rc;
rc = sja1105_static_config_reload(priv, SJA1105_VLAN_FILTERING);
if (rc)
dev_err(ds->dev, "Failed to change VLAN Ethertype\n");
/* Switch port identification based on 802.1Q is only passable
* if we are not under a vlan_filtering bridge. So make sure
* the two configurations are mutually exclusive (of course, the
* user may know better, i.e. best_effort_vlan_filtering).
*/
return sja1105_setup_8021q_tagging(ds, want_tagging);
}
static void sja1105_vlan_add(struct dsa_switch *ds, int port,
const struct switchdev_obj_port_vlan *vlan)
{
struct sja1105_private *priv = ds->priv;
net: dsa: sja1105: save/restore VLANs using a delta commit method Managing the VLAN table that is present in hardware will become very difficult once we add a third operating state (best_effort_vlan_filtering). That is because correct cleanup (not too little, not too much) becomes virtually impossible, when VLANs can be added from the bridge layer, from dsa_8021q for basic tagging, for cross-chip bridging, as well as retagging rules for sub-VLANs and cross-chip sub-VLANs. So we need to rethink VLAN interaction with the switch in a more scalable way. In preparation for that, use the priv->expect_dsa_8021q boolean to classify any VLAN request received through .port_vlan_add or .port_vlan_del towards either one of 2 internal lists: bridge VLANs and dsa_8021q VLANs. Then, implement a central sja1105_build_vlan_table method that creates a VLAN configuration from scratch based on the 2 lists of VLANs kept by the driver, and based on the VLAN awareness state. Currently, if we are VLAN-unaware, install the dsa_8021q VLANs, otherwise the bridge VLANs. Then, implement a delta commit procedure that identifies which VLANs from this new configuration are actually different from the config previously committed to hardware. We apply the delta through the dynamic configuration interface (we don't reset the switch). The result is that the hardware should see the exact sequence of operations as before this patch. This also helps remove the "br" argument passed to dsa_8021q_crosschip_bridge_join, which it was only using to figure out whether it should commit the configuration back to us or not, based on the VLAN awareness state of the bridge. We can simplify that, by always allowing those VLANs inside of our dsa_8021q_vlans list, and committing those to hardware when necessary. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 17:20:29 +00:00
bool vlan_table_changed = false;
u16 vid;
int rc;
for (vid = vlan->vid_begin; vid <= vlan->vid_end; vid++) {
net: dsa: sja1105: save/restore VLANs using a delta commit method Managing the VLAN table that is present in hardware will become very difficult once we add a third operating state (best_effort_vlan_filtering). That is because correct cleanup (not too little, not too much) becomes virtually impossible, when VLANs can be added from the bridge layer, from dsa_8021q for basic tagging, for cross-chip bridging, as well as retagging rules for sub-VLANs and cross-chip sub-VLANs. So we need to rethink VLAN interaction with the switch in a more scalable way. In preparation for that, use the priv->expect_dsa_8021q boolean to classify any VLAN request received through .port_vlan_add or .port_vlan_del towards either one of 2 internal lists: bridge VLANs and dsa_8021q VLANs. Then, implement a central sja1105_build_vlan_table method that creates a VLAN configuration from scratch based on the 2 lists of VLANs kept by the driver, and based on the VLAN awareness state. Currently, if we are VLAN-unaware, install the dsa_8021q VLANs, otherwise the bridge VLANs. Then, implement a delta commit procedure that identifies which VLANs from this new configuration are actually different from the config previously committed to hardware. We apply the delta through the dynamic configuration interface (we don't reset the switch). The result is that the hardware should see the exact sequence of operations as before this patch. This also helps remove the "br" argument passed to dsa_8021q_crosschip_bridge_join, which it was only using to figure out whether it should commit the configuration back to us or not, based on the VLAN awareness state of the bridge. We can simplify that, by always allowing those VLANs inside of our dsa_8021q_vlans list, and committing those to hardware when necessary. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 17:20:29 +00:00
bool untagged = vlan->flags & BRIDGE_VLAN_INFO_UNTAGGED;
bool pvid = vlan->flags & BRIDGE_VLAN_INFO_PVID;
struct sja1105_bridge_vlan *v;
struct list_head *vlan_list;
bool already_added = false;
vlan_list = sja1105_classify_vlan(priv, vid);
list_for_each_entry(v, vlan_list, list) {
if (v->port == port && v->vid == vid &&
v->untagged == untagged && v->pvid == pvid) {
already_added = true;
break;
}
}
net: dsa: sja1105: save/restore VLANs using a delta commit method Managing the VLAN table that is present in hardware will become very difficult once we add a third operating state (best_effort_vlan_filtering). That is because correct cleanup (not too little, not too much) becomes virtually impossible, when VLANs can be added from the bridge layer, from dsa_8021q for basic tagging, for cross-chip bridging, as well as retagging rules for sub-VLANs and cross-chip sub-VLANs. So we need to rethink VLAN interaction with the switch in a more scalable way. In preparation for that, use the priv->expect_dsa_8021q boolean to classify any VLAN request received through .port_vlan_add or .port_vlan_del towards either one of 2 internal lists: bridge VLANs and dsa_8021q VLANs. Then, implement a central sja1105_build_vlan_table method that creates a VLAN configuration from scratch based on the 2 lists of VLANs kept by the driver, and based on the VLAN awareness state. Currently, if we are VLAN-unaware, install the dsa_8021q VLANs, otherwise the bridge VLANs. Then, implement a delta commit procedure that identifies which VLANs from this new configuration are actually different from the config previously committed to hardware. We apply the delta through the dynamic configuration interface (we don't reset the switch). The result is that the hardware should see the exact sequence of operations as before this patch. This also helps remove the "br" argument passed to dsa_8021q_crosschip_bridge_join, which it was only using to figure out whether it should commit the configuration back to us or not, based on the VLAN awareness state of the bridge. We can simplify that, by always allowing those VLANs inside of our dsa_8021q_vlans list, and committing those to hardware when necessary. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 17:20:29 +00:00
if (already_added)
continue;
v = kzalloc(sizeof(*v), GFP_KERNEL);
if (!v) {
dev_err(ds->dev, "Out of memory while storing VLAN\n");
return;
}
v->port = port;
v->vid = vid;
v->untagged = untagged;
v->pvid = pvid;
list_add(&v->list, vlan_list);
vlan_table_changed = true;
}
net: dsa: sja1105: save/restore VLANs using a delta commit method Managing the VLAN table that is present in hardware will become very difficult once we add a third operating state (best_effort_vlan_filtering). That is because correct cleanup (not too little, not too much) becomes virtually impossible, when VLANs can be added from the bridge layer, from dsa_8021q for basic tagging, for cross-chip bridging, as well as retagging rules for sub-VLANs and cross-chip sub-VLANs. So we need to rethink VLAN interaction with the switch in a more scalable way. In preparation for that, use the priv->expect_dsa_8021q boolean to classify any VLAN request received through .port_vlan_add or .port_vlan_del towards either one of 2 internal lists: bridge VLANs and dsa_8021q VLANs. Then, implement a central sja1105_build_vlan_table method that creates a VLAN configuration from scratch based on the 2 lists of VLANs kept by the driver, and based on the VLAN awareness state. Currently, if we are VLAN-unaware, install the dsa_8021q VLANs, otherwise the bridge VLANs. Then, implement a delta commit procedure that identifies which VLANs from this new configuration are actually different from the config previously committed to hardware. We apply the delta through the dynamic configuration interface (we don't reset the switch). The result is that the hardware should see the exact sequence of operations as before this patch. This also helps remove the "br" argument passed to dsa_8021q_crosschip_bridge_join, which it was only using to figure out whether it should commit the configuration back to us or not, based on the VLAN awareness state of the bridge. We can simplify that, by always allowing those VLANs inside of our dsa_8021q_vlans list, and committing those to hardware when necessary. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 17:20:29 +00:00
if (!vlan_table_changed)
return;
rc = sja1105_build_vlan_table(priv, true);
if (rc)
dev_err(ds->dev, "Failed to build VLAN table: %d\n", rc);
}
static int sja1105_vlan_del(struct dsa_switch *ds, int port,
const struct switchdev_obj_port_vlan *vlan)
{
struct sja1105_private *priv = ds->priv;
net: dsa: sja1105: save/restore VLANs using a delta commit method Managing the VLAN table that is present in hardware will become very difficult once we add a third operating state (best_effort_vlan_filtering). That is because correct cleanup (not too little, not too much) becomes virtually impossible, when VLANs can be added from the bridge layer, from dsa_8021q for basic tagging, for cross-chip bridging, as well as retagging rules for sub-VLANs and cross-chip sub-VLANs. So we need to rethink VLAN interaction with the switch in a more scalable way. In preparation for that, use the priv->expect_dsa_8021q boolean to classify any VLAN request received through .port_vlan_add or .port_vlan_del towards either one of 2 internal lists: bridge VLANs and dsa_8021q VLANs. Then, implement a central sja1105_build_vlan_table method that creates a VLAN configuration from scratch based on the 2 lists of VLANs kept by the driver, and based on the VLAN awareness state. Currently, if we are VLAN-unaware, install the dsa_8021q VLANs, otherwise the bridge VLANs. Then, implement a delta commit procedure that identifies which VLANs from this new configuration are actually different from the config previously committed to hardware. We apply the delta through the dynamic configuration interface (we don't reset the switch). The result is that the hardware should see the exact sequence of operations as before this patch. This also helps remove the "br" argument passed to dsa_8021q_crosschip_bridge_join, which it was only using to figure out whether it should commit the configuration back to us or not, based on the VLAN awareness state of the bridge. We can simplify that, by always allowing those VLANs inside of our dsa_8021q_vlans list, and committing those to hardware when necessary. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 17:20:29 +00:00
bool vlan_table_changed = false;
u16 vid;
for (vid = vlan->vid_begin; vid <= vlan->vid_end; vid++) {
net: dsa: sja1105: save/restore VLANs using a delta commit method Managing the VLAN table that is present in hardware will become very difficult once we add a third operating state (best_effort_vlan_filtering). That is because correct cleanup (not too little, not too much) becomes virtually impossible, when VLANs can be added from the bridge layer, from dsa_8021q for basic tagging, for cross-chip bridging, as well as retagging rules for sub-VLANs and cross-chip sub-VLANs. So we need to rethink VLAN interaction with the switch in a more scalable way. In preparation for that, use the priv->expect_dsa_8021q boolean to classify any VLAN request received through .port_vlan_add or .port_vlan_del towards either one of 2 internal lists: bridge VLANs and dsa_8021q VLANs. Then, implement a central sja1105_build_vlan_table method that creates a VLAN configuration from scratch based on the 2 lists of VLANs kept by the driver, and based on the VLAN awareness state. Currently, if we are VLAN-unaware, install the dsa_8021q VLANs, otherwise the bridge VLANs. Then, implement a delta commit procedure that identifies which VLANs from this new configuration are actually different from the config previously committed to hardware. We apply the delta through the dynamic configuration interface (we don't reset the switch). The result is that the hardware should see the exact sequence of operations as before this patch. This also helps remove the "br" argument passed to dsa_8021q_crosschip_bridge_join, which it was only using to figure out whether it should commit the configuration back to us or not, based on the VLAN awareness state of the bridge. We can simplify that, by always allowing those VLANs inside of our dsa_8021q_vlans list, and committing those to hardware when necessary. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 17:20:29 +00:00
struct sja1105_bridge_vlan *v, *n;
struct list_head *vlan_list;
vlan_list = sja1105_classify_vlan(priv, vid);
list_for_each_entry_safe(v, n, vlan_list, list) {
if (v->port == port && v->vid == vid) {
list_del(&v->list);
kfree(v);
vlan_table_changed = true;
break;
}
}
}
net: dsa: sja1105: save/restore VLANs using a delta commit method Managing the VLAN table that is present in hardware will become very difficult once we add a third operating state (best_effort_vlan_filtering). That is because correct cleanup (not too little, not too much) becomes virtually impossible, when VLANs can be added from the bridge layer, from dsa_8021q for basic tagging, for cross-chip bridging, as well as retagging rules for sub-VLANs and cross-chip sub-VLANs. So we need to rethink VLAN interaction with the switch in a more scalable way. In preparation for that, use the priv->expect_dsa_8021q boolean to classify any VLAN request received through .port_vlan_add or .port_vlan_del towards either one of 2 internal lists: bridge VLANs and dsa_8021q VLANs. Then, implement a central sja1105_build_vlan_table method that creates a VLAN configuration from scratch based on the 2 lists of VLANs kept by the driver, and based on the VLAN awareness state. Currently, if we are VLAN-unaware, install the dsa_8021q VLANs, otherwise the bridge VLANs. Then, implement a delta commit procedure that identifies which VLANs from this new configuration are actually different from the config previously committed to hardware. We apply the delta through the dynamic configuration interface (we don't reset the switch). The result is that the hardware should see the exact sequence of operations as before this patch. This also helps remove the "br" argument passed to dsa_8021q_crosschip_bridge_join, which it was only using to figure out whether it should commit the configuration back to us or not, based on the VLAN awareness state of the bridge. We can simplify that, by always allowing those VLANs inside of our dsa_8021q_vlans list, and committing those to hardware when necessary. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 17:20:29 +00:00
if (!vlan_table_changed)
return 0;
return sja1105_build_vlan_table(priv, true);
}
static int sja1105_best_effort_vlan_filtering_get(struct sja1105_private *priv,
bool *be_vlan)
{
*be_vlan = priv->best_effort_vlan_filtering;
return 0;
}
static int sja1105_best_effort_vlan_filtering_set(struct sja1105_private *priv,
bool be_vlan)
{
struct dsa_switch *ds = priv->ds;
bool vlan_filtering;
int port;
int rc;
priv->best_effort_vlan_filtering = be_vlan;
rtnl_lock();
for (port = 0; port < ds->num_ports; port++) {
struct dsa_port *dp;
if (!dsa_is_user_port(ds, port))
continue;
dp = dsa_to_port(ds, port);
vlan_filtering = dsa_port_is_vlan_filtering(dp);
rc = sja1105_vlan_filtering(ds, port, vlan_filtering);
if (rc)
break;
}
rtnl_unlock();
return rc;
}
enum sja1105_devlink_param_id {
SJA1105_DEVLINK_PARAM_ID_BASE = DEVLINK_PARAM_GENERIC_ID_MAX,
SJA1105_DEVLINK_PARAM_ID_BEST_EFFORT_VLAN_FILTERING,
};
static int sja1105_devlink_param_get(struct dsa_switch *ds, u32 id,
struct devlink_param_gset_ctx *ctx)
{
struct sja1105_private *priv = ds->priv;
int err;
switch (id) {
case SJA1105_DEVLINK_PARAM_ID_BEST_EFFORT_VLAN_FILTERING:
err = sja1105_best_effort_vlan_filtering_get(priv,
&ctx->val.vbool);
break;
default:
err = -EOPNOTSUPP;
break;
}
return err;
}
static int sja1105_devlink_param_set(struct dsa_switch *ds, u32 id,
struct devlink_param_gset_ctx *ctx)
{
struct sja1105_private *priv = ds->priv;
int err;
switch (id) {
case SJA1105_DEVLINK_PARAM_ID_BEST_EFFORT_VLAN_FILTERING:
err = sja1105_best_effort_vlan_filtering_set(priv,
ctx->val.vbool);
break;
default:
err = -EOPNOTSUPP;
break;
}
return err;
}
static const struct devlink_param sja1105_devlink_params[] = {
DSA_DEVLINK_PARAM_DRIVER(SJA1105_DEVLINK_PARAM_ID_BEST_EFFORT_VLAN_FILTERING,
"best_effort_vlan_filtering",
DEVLINK_PARAM_TYPE_BOOL,
BIT(DEVLINK_PARAM_CMODE_RUNTIME)),
};
static int sja1105_setup_devlink_params(struct dsa_switch *ds)
{
return dsa_devlink_params_register(ds, sja1105_devlink_params,
ARRAY_SIZE(sja1105_devlink_params));
}
static void sja1105_teardown_devlink_params(struct dsa_switch *ds)
{
dsa_devlink_params_unregister(ds, sja1105_devlink_params,
ARRAY_SIZE(sja1105_devlink_params));
}
/* The programming model for the SJA1105 switch is "all-at-once" via static
* configuration tables. Some of these can be dynamically modified at runtime,
* but not the xMII mode parameters table.
* Furthermode, some PHYs may not have crystals for generating their clocks
* (e.g. RMII). Instead, their 50MHz clock is supplied via the SJA1105 port's
* ref_clk pin. So port clocking needs to be initialized early, before
* connecting to PHYs is attempted, otherwise they won't respond through MDIO.
* Setting correct PHY link speed does not matter now.
* But dsa_slave_phy_setup is called later than sja1105_setup, so the PHY
* bindings are not yet parsed by DSA core. We need to parse early so that we
* can populate the xMII mode parameters table.
*/
static int sja1105_setup(struct dsa_switch *ds)
{
struct sja1105_dt_port ports[SJA1105_NUM_PORTS];
struct sja1105_private *priv = ds->priv;
int rc;
rc = sja1105_parse_dt(priv, ports);
if (rc < 0) {
dev_err(ds->dev, "Failed to parse DT: %d\n", rc);
return rc;
}
net: dsa: sja1105: Error out if RGMII delays are requested in DT Documentation/devicetree/bindings/net/ethernet.txt is confusing because it says what the MAC should not do, but not what it *should* do: * "rgmii-rxid" (RGMII with internal RX delay provided by the PHY, the MAC should not add an RX delay in this case) The gap in semantics is threefold: 1. Is it illegal for the MAC to apply the Rx internal delay by itself, and simplify the phy_mode (mask off "rgmii-rxid" into "rgmii") before passing it to of_phy_connect? The documentation would suggest yes. 1. For "rgmii-rxid", while the situation with the Rx clock skew is more or less clear (needs to be added by the PHY), what should the MAC driver do about the Tx delays? Is it an implicit wild card for the MAC to apply delays in the Tx direction if it can? What if those were already added as serpentine PCB traces, how could that be made more obvious through DT bindings so that the MAC doesn't attempt to add them twice and again potentially break the link? 3. If the interface is a fixed-link and therefore the PHY object is fixed (a purely software entity that obviously cannot add clock skew), what is the meaning of the above property? So an interpretation of the RGMII bindings was chosen that hopefully does not contradict their intention but also makes them more applied. The SJA1105 driver understands to act upon "rgmii-*id" phy-mode bindings if the port is in the PHY role (either explicitly, or if it is a fixed-link). Otherwise it always passes the duty of setting up delays to the PHY driver. The error behavior that this patch adds is required on SJA1105E/T where the MAC really cannot apply internal delays. If the other end of the fixed-link cannot apply RGMII delays either (this would be specified through its own DT bindings), then the situation requires PCB delays. For SJA1105P/Q/R/S, this is however hardware supported and the error is thus only temporary. I created a stub function pointer for configuring delays per-port on RXC and TXC, and will implement it when I have access to a board with this hardware setup. Meanwhile do not allow the user to select an invalid configuration. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-02 20:23:32 +00:00
/* Error out early if internal delays are required through DT
* and we can't apply them.
*/
rc = sja1105_parse_rgmii_delays(priv, ports);
if (rc < 0) {
dev_err(ds->dev, "RGMII delay not supported\n");
return rc;
}
rc = sja1105_ptp_clock_register(ds);
net: dsa: sja1105: Add support for the PTP clock The design of this PHC driver is influenced by the switch's behavior w.r.t. timestamping. It exposes two PTP counters, one free-running (PTPTSCLK) and the other offset- and frequency-corrected in hardware through PTPCLKVAL, PTPCLKADD and PTPCLKRATE. The MACs can sample either of these for frame timestamps. However, the user manual warns that taking timestamps based on the corrected clock is less than useful, as the switch can deliver corrupted timestamps in a variety of circumstances. Therefore, this PHC uses the free-running PTPTSCLK together with a timecounter/cyclecounter structure that translates it into a software time domain. Thus, the settime/adjtime and adjfine callbacks are hardware no-ops. The timestamps (introduced in a further patch) will also be translated to the correct time domain before being handed over to the userspace PTP stack. The introduction of a second set of PHC operations that operate on the hardware PTPCLKVAL/PTPCLKADD/PTPCLKRATE in the future is somewhat unavoidable, as the TTEthernet core uses the corrected PTP time domain. However, the free-running counter + timecounter structure combination will suffice for now, as the resulting timestamps yield a sub-50 ns synchronization offset in steady state using linuxptp. For this patch, in absence of frame timestamping, the operations of the switch PHC were tested by syncing it to the system time as a local slave clock with: phc2sys -s CLOCK_REALTIME -c swp2 -O 0 -m -S 0.01 Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-08 12:04:34 +00:00
if (rc < 0) {
dev_err(ds->dev, "Failed to register PTP clock: %d\n", rc);
return rc;
}
/* Create and send configuration down to device */
rc = sja1105_static_config_load(priv, ports);
if (rc < 0) {
dev_err(ds->dev, "Failed to load static config: %d\n", rc);
return rc;
}
/* Configure the CGU (PHY link modes and speeds) */
rc = sja1105_clocking_setup(priv);
if (rc < 0) {
dev_err(ds->dev, "Failed to configure MII clocking: %d\n", rc);
return rc;
}
/* On SJA1105, VLAN filtering per se is always enabled in hardware.
* The only thing we can do to disable it is lie about what the 802.1Q
* EtherType is.
* So it will still try to apply VLAN filtering, but all ingress
* traffic (except frames received with EtherType of ETH_P_SJA1105)
* will be internally tagged with a distorted VLAN header where the
* TPID is ETH_P_SJA1105, and the VLAN ID is the port pvid.
*/
ds->vlan_filtering_is_global = true;
/* Advertise the 8 egress queues */
ds->num_tx_queues = SJA1105_NUM_TC;
ds->mtu_enforcement_ingress = true;
ds->configure_vlan_while_not_filtering = true;
rc = sja1105_setup_devlink_params(ds);
if (rc < 0)
return rc;
/* The DSA/switchdev model brings up switch ports in standalone mode by
* default, and that means vlan_filtering is 0 since they're not under
* a bridge, so it's safe to set up switch tagging at this time.
*/
return sja1105_setup_8021q_tagging(ds, true);
}
static void sja1105_teardown(struct dsa_switch *ds)
{
struct sja1105_private *priv = ds->priv;
net: dsa: sja1105: save/restore VLANs using a delta commit method Managing the VLAN table that is present in hardware will become very difficult once we add a third operating state (best_effort_vlan_filtering). That is because correct cleanup (not too little, not too much) becomes virtually impossible, when VLANs can be added from the bridge layer, from dsa_8021q for basic tagging, for cross-chip bridging, as well as retagging rules for sub-VLANs and cross-chip sub-VLANs. So we need to rethink VLAN interaction with the switch in a more scalable way. In preparation for that, use the priv->expect_dsa_8021q boolean to classify any VLAN request received through .port_vlan_add or .port_vlan_del towards either one of 2 internal lists: bridge VLANs and dsa_8021q VLANs. Then, implement a central sja1105_build_vlan_table method that creates a VLAN configuration from scratch based on the 2 lists of VLANs kept by the driver, and based on the VLAN awareness state. Currently, if we are VLAN-unaware, install the dsa_8021q VLANs, otherwise the bridge VLANs. Then, implement a delta commit procedure that identifies which VLANs from this new configuration are actually different from the config previously committed to hardware. We apply the delta through the dynamic configuration interface (we don't reset the switch). The result is that the hardware should see the exact sequence of operations as before this patch. This also helps remove the "br" argument passed to dsa_8021q_crosschip_bridge_join, which it was only using to figure out whether it should commit the configuration back to us or not, based on the VLAN awareness state of the bridge. We can simplify that, by always allowing those VLANs inside of our dsa_8021q_vlans list, and committing those to hardware when necessary. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 17:20:29 +00:00
struct sja1105_bridge_vlan *v, *n;
net: dsa: Make deferred_xmit private to sja1105 There are 3 things that are wrong with the DSA deferred xmit mechanism: 1. Its introduction has made the DSA hotpath ever so slightly more inefficient for everybody, since DSA_SKB_CB(skb)->deferred_xmit needs to be initialized to false for every transmitted frame, in order to figure out whether the driver requested deferral or not (a very rare occasion, rare even for the only driver that does use this mechanism: sja1105). That was necessary to avoid kfree_skb from freeing the skb. 2. Because L2 PTP is a link-local protocol like STP, it requires management routes and deferred xmit with this switch. But as opposed to STP, the deferred work mechanism needs to schedule the packet rather quickly for the TX timstamp to be collected in time and sent to user space. But there is no provision for controlling the scheduling priority of this deferred xmit workqueue. Too bad this is a rather specific requirement for a feature that nobody else uses (more below). 3. Perhaps most importantly, it makes the DSA core adhere a bit too much to the NXP company-wide policy "Innovate Where It Doesn't Matter". The sja1105 is probably the only DSA switch that requires some frames sent from the CPU to be routed to the slave port via an out-of-band configuration (register write) rather than in-band (DSA tag). And there are indeed very good reasons to not want to do that: if that out-of-band register is at the other end of a slow bus such as SPI, then you limit that Ethernet flow's throughput to effectively the throughput of the SPI bus. So hardware vendors should definitely not be encouraged to design this way. We do _not_ want more widespread use of this mechanism. Luckily we have a solution for each of the 3 issues: For 1, we can just remove that variable in the skb->cb and counteract the effect of kfree_skb with skb_get, much to the same effect. The advantage, of course, being that anybody who doesn't use deferred xmit doesn't need to do any extra operation in the hotpath. For 2, we can create a kernel thread for each port's deferred xmit work. If the user switch ports are named swp0, swp1, swp2, the kernel threads will be named swp0_xmit, swp1_xmit, swp2_xmit (there appears to be a 15 character length limit on kernel thread names). With this, the user can change the scheduling priority with chrt $(pidof swp2_xmit). For 3, we can actually move the entire implementation to the sja1105 driver. So this patch deletes the generic implementation from the DSA core and adds a new one, more adequate to the requirements of PTP TX timestamping, in sja1105_main.c. Suggested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-04 00:37:10 +00:00
int port;
for (port = 0; port < SJA1105_NUM_PORTS; port++) {
struct sja1105_port *sp = &priv->ports[port];
if (!dsa_is_user_port(ds, port))
continue;
net: dsa: sja1105: Don't destroy not-yet-created xmit_worker Fixes the following NULL pointer dereference on PHY connect error path teardown: [ 2.291010] sja1105 spi0.1: Probed switch chip: SJA1105T [ 2.310044] sja1105 spi0.1: Enabled switch tagging [ 2.314970] fsl-gianfar soc:ethernet@2d90000 eth2: error -19 setting up slave phy [ 2.322463] 8<--- cut here --- [ 2.325497] Unable to handle kernel NULL pointer dereference at virtual address 00000018 [ 2.333555] pgd = (ptrval) [ 2.336241] [00000018] *pgd=00000000 [ 2.339797] Internal error: Oops: 5 [#1] SMP ARM [ 2.344384] Modules linked in: [ 2.347420] CPU: 1 PID: 64 Comm: kworker/1:1 Not tainted 5.5.0-rc5 #1 [ 2.353820] Hardware name: Freescale LS1021A [ 2.358070] Workqueue: events deferred_probe_work_func [ 2.363182] PC is at kthread_destroy_worker+0x4/0x74 [ 2.368117] LR is at sja1105_teardown+0x70/0xb4 [ 2.372617] pc : [<c036cdd4>] lr : [<c0b89238>] psr: 60000013 [ 2.378845] sp : eeac3d30 ip : eeab1900 fp : eef45480 [ 2.384036] r10: eef4549c r9 : 00000001 r8 : 00000000 [ 2.389227] r7 : eef527c0 r6 : 00000034 r5 : ed8ddd0c r4 : ed8ddc40 [ 2.395714] r3 : 00000000 r2 : 00000000 r1 : eef4549c r0 : 00000000 [ 2.402204] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none [ 2.409297] Control: 10c5387d Table: 8020406a DAC: 00000051 [ 2.415008] Process kworker/1:1 (pid: 64, stack limit = 0x(ptrval)) [ 2.421237] Stack: (0xeeac3d30 to 0xeeac4000) [ 2.612635] [<c036cdd4>] (kthread_destroy_worker) from [<c0b89238>] (sja1105_teardown+0x70/0xb4) [ 2.621379] [<c0b89238>] (sja1105_teardown) from [<c10717fc>] (dsa_switch_teardown.part.1+0x48/0x74) [ 2.630467] [<c10717fc>] (dsa_switch_teardown.part.1) from [<c1072438>] (dsa_register_switch+0x8b0/0xbf4) [ 2.639984] [<c1072438>] (dsa_register_switch) from [<c0b89c30>] (sja1105_probe+0x2ac/0x464) [ 2.648378] [<c0b89c30>] (sja1105_probe) from [<c0b11a5c>] (spi_drv_probe+0x7c/0xa0) [ 2.656081] [<c0b11a5c>] (spi_drv_probe) from [<c0a26ab8>] (really_probe+0x208/0x480) [ 2.663871] [<c0a26ab8>] (really_probe) from [<c0a26f0c>] (driver_probe_device+0x78/0x1c4) [ 2.672093] [<c0a26f0c>] (driver_probe_device) from [<c0a24c48>] (bus_for_each_drv+0x80/0xc4) [ 2.680574] [<c0a24c48>] (bus_for_each_drv) from [<c0a26810>] (__device_attach+0xd0/0x168) [ 2.688794] [<c0a26810>] (__device_attach) from [<c0a259d8>] (bus_probe_device+0x84/0x8c) [ 2.696927] [<c0a259d8>] (bus_probe_device) from [<c0a25f24>] (deferred_probe_work_func+0x84/0xc4) [ 2.705842] [<c0a25f24>] (deferred_probe_work_func) from [<c03667b0>] (process_one_work+0x22c/0x560) [ 2.714926] [<c03667b0>] (process_one_work) from [<c0366d8c>] (worker_thread+0x2a8/0x5d4) [ 2.723059] [<c0366d8c>] (worker_thread) from [<c036cf94>] (kthread+0x150/0x154) [ 2.730416] [<c036cf94>] (kthread) from [<c03010e8>] (ret_from_fork+0x14/0x2c) Checking for NULL pointer is correct because the per-port xmit kernel threads are created in sja1105_probe immediately after calling dsa_register_switch. Fixes: a68578c20a96 ("net: dsa: Make deferred_xmit private to sja1105") Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-29 20:30:07 +00:00
if (sp->xmit_worker)
kthread_destroy_worker(sp->xmit_worker);
net: dsa: Make deferred_xmit private to sja1105 There are 3 things that are wrong with the DSA deferred xmit mechanism: 1. Its introduction has made the DSA hotpath ever so slightly more inefficient for everybody, since DSA_SKB_CB(skb)->deferred_xmit needs to be initialized to false for every transmitted frame, in order to figure out whether the driver requested deferral or not (a very rare occasion, rare even for the only driver that does use this mechanism: sja1105). That was necessary to avoid kfree_skb from freeing the skb. 2. Because L2 PTP is a link-local protocol like STP, it requires management routes and deferred xmit with this switch. But as opposed to STP, the deferred work mechanism needs to schedule the packet rather quickly for the TX timstamp to be collected in time and sent to user space. But there is no provision for controlling the scheduling priority of this deferred xmit workqueue. Too bad this is a rather specific requirement for a feature that nobody else uses (more below). 3. Perhaps most importantly, it makes the DSA core adhere a bit too much to the NXP company-wide policy "Innovate Where It Doesn't Matter". The sja1105 is probably the only DSA switch that requires some frames sent from the CPU to be routed to the slave port via an out-of-band configuration (register write) rather than in-band (DSA tag). And there are indeed very good reasons to not want to do that: if that out-of-band register is at the other end of a slow bus such as SPI, then you limit that Ethernet flow's throughput to effectively the throughput of the SPI bus. So hardware vendors should definitely not be encouraged to design this way. We do _not_ want more widespread use of this mechanism. Luckily we have a solution for each of the 3 issues: For 1, we can just remove that variable in the skb->cb and counteract the effect of kfree_skb with skb_get, much to the same effect. The advantage, of course, being that anybody who doesn't use deferred xmit doesn't need to do any extra operation in the hotpath. For 2, we can create a kernel thread for each port's deferred xmit work. If the user switch ports are named swp0, swp1, swp2, the kernel threads will be named swp0_xmit, swp1_xmit, swp2_xmit (there appears to be a 15 character length limit on kernel thread names). With this, the user can change the scheduling priority with chrt $(pidof swp2_xmit). For 3, we can actually move the entire implementation to the sja1105 driver. So this patch deletes the generic implementation from the DSA core and adds a new one, more adequate to the requirements of PTP TX timestamping, in sja1105_main.c. Suggested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-04 00:37:10 +00:00
}
sja1105_teardown_devlink_params(ds);
sja1105_flower_teardown(ds);
net: dsa: sja1105: Configure the Time-Aware Scheduler via tc-taprio offload This qdisc offload is the closest thing to what the SJA1105 supports in hardware for time-based egress shaping. The switch core really is built around SAE AS6802/TTEthernet (a TTTech standard) but can be made to operate similarly to IEEE 802.1Qbv with some constraints: - The gate control list is a global list for all ports. There are 8 execution threads that iterate through this global list in parallel. I don't know why 8, there are only 4 front-panel ports. - Care must be taken by the user to make sure that two execution threads never get to execute a GCL entry simultaneously. I created a O(n^4) checker for this hardware limitation, prior to accepting a taprio offload configuration as valid. - The spec says that if a GCL entry's interval is shorter than the frame length, you shouldn't send it (and end up in head-of-line blocking). Well, this switch does anyway. - The switch has no concept of ADMIN and OPER configurations. Because it's so simple, the TAS settings are loaded through the static config tables interface, so there isn't even place for any discussion about 'graceful switchover between ADMIN and OPER'. You just reset the switch and upload a new OPER config. - The switch accepts multiple time sources for the gate events. Right now I am using the standalone clock source as opposed to PTP. So the base time parameter doesn't really do much. Support for the PTP clock source will be added in a future series. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-15 02:00:02 +00:00
sja1105_tas_teardown(ds);
sja1105_ptp_clock_unregister(ds);
sja1105_static_config_free(&priv->static_config);
net: dsa: sja1105: save/restore VLANs using a delta commit method Managing the VLAN table that is present in hardware will become very difficult once we add a third operating state (best_effort_vlan_filtering). That is because correct cleanup (not too little, not too much) becomes virtually impossible, when VLANs can be added from the bridge layer, from dsa_8021q for basic tagging, for cross-chip bridging, as well as retagging rules for sub-VLANs and cross-chip sub-VLANs. So we need to rethink VLAN interaction with the switch in a more scalable way. In preparation for that, use the priv->expect_dsa_8021q boolean to classify any VLAN request received through .port_vlan_add or .port_vlan_del towards either one of 2 internal lists: bridge VLANs and dsa_8021q VLANs. Then, implement a central sja1105_build_vlan_table method that creates a VLAN configuration from scratch based on the 2 lists of VLANs kept by the driver, and based on the VLAN awareness state. Currently, if we are VLAN-unaware, install the dsa_8021q VLANs, otherwise the bridge VLANs. Then, implement a delta commit procedure that identifies which VLANs from this new configuration are actually different from the config previously committed to hardware. We apply the delta through the dynamic configuration interface (we don't reset the switch). The result is that the hardware should see the exact sequence of operations as before this patch. This also helps remove the "br" argument passed to dsa_8021q_crosschip_bridge_join, which it was only using to figure out whether it should commit the configuration back to us or not, based on the VLAN awareness state of the bridge. We can simplify that, by always allowing those VLANs inside of our dsa_8021q_vlans list, and committing those to hardware when necessary. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 17:20:29 +00:00
list_for_each_entry_safe(v, n, &priv->dsa_8021q_vlans, list) {
list_del(&v->list);
kfree(v);
}
list_for_each_entry_safe(v, n, &priv->bridge_vlans, list) {
list_del(&v->list);
kfree(v);
}
}
static int sja1105_port_enable(struct dsa_switch *ds, int port,
struct phy_device *phy)
{
struct net_device *slave;
if (!dsa_is_user_port(ds, port))
return 0;
slave = dsa_to_port(ds, port)->slave;
slave->features &= ~NETIF_F_HW_VLAN_CTAG_FILTER;
return 0;
}
net: dsa: Make deferred_xmit private to sja1105 There are 3 things that are wrong with the DSA deferred xmit mechanism: 1. Its introduction has made the DSA hotpath ever so slightly more inefficient for everybody, since DSA_SKB_CB(skb)->deferred_xmit needs to be initialized to false for every transmitted frame, in order to figure out whether the driver requested deferral or not (a very rare occasion, rare even for the only driver that does use this mechanism: sja1105). That was necessary to avoid kfree_skb from freeing the skb. 2. Because L2 PTP is a link-local protocol like STP, it requires management routes and deferred xmit with this switch. But as opposed to STP, the deferred work mechanism needs to schedule the packet rather quickly for the TX timstamp to be collected in time and sent to user space. But there is no provision for controlling the scheduling priority of this deferred xmit workqueue. Too bad this is a rather specific requirement for a feature that nobody else uses (more below). 3. Perhaps most importantly, it makes the DSA core adhere a bit too much to the NXP company-wide policy "Innovate Where It Doesn't Matter". The sja1105 is probably the only DSA switch that requires some frames sent from the CPU to be routed to the slave port via an out-of-band configuration (register write) rather than in-band (DSA tag). And there are indeed very good reasons to not want to do that: if that out-of-band register is at the other end of a slow bus such as SPI, then you limit that Ethernet flow's throughput to effectively the throughput of the SPI bus. So hardware vendors should definitely not be encouraged to design this way. We do _not_ want more widespread use of this mechanism. Luckily we have a solution for each of the 3 issues: For 1, we can just remove that variable in the skb->cb and counteract the effect of kfree_skb with skb_get, much to the same effect. The advantage, of course, being that anybody who doesn't use deferred xmit doesn't need to do any extra operation in the hotpath. For 2, we can create a kernel thread for each port's deferred xmit work. If the user switch ports are named swp0, swp1, swp2, the kernel threads will be named swp0_xmit, swp1_xmit, swp2_xmit (there appears to be a 15 character length limit on kernel thread names). With this, the user can change the scheduling priority with chrt $(pidof swp2_xmit). For 3, we can actually move the entire implementation to the sja1105 driver. So this patch deletes the generic implementation from the DSA core and adds a new one, more adequate to the requirements of PTP TX timestamping, in sja1105_main.c. Suggested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-04 00:37:10 +00:00
static void sja1105_port_disable(struct dsa_switch *ds, int port)
{
struct sja1105_private *priv = ds->priv;
struct sja1105_port *sp = &priv->ports[port];
if (!dsa_is_user_port(ds, port))
return;
kthread_cancel_work_sync(&sp->xmit_work);
skb_queue_purge(&sp->xmit_queue);
}
static int sja1105_mgmt_xmit(struct dsa_switch *ds, int port, int slot,
struct sk_buff *skb, bool takets)
{
struct sja1105_mgmt_entry mgmt_route = {0};
struct sja1105_private *priv = ds->priv;
struct ethhdr *hdr;
int timeout = 10;
int rc;
hdr = eth_hdr(skb);
mgmt_route.macaddr = ether_addr_to_u64(hdr->h_dest);
mgmt_route.destports = BIT(port);
mgmt_route.enfport = 1;
mgmt_route.tsreg = 0;
mgmt_route.takets = takets;
rc = sja1105_dynamic_config_write(priv, BLK_IDX_MGMT_ROUTE,
slot, &mgmt_route, true);
if (rc < 0) {
kfree_skb(skb);
return rc;
}
/* Transfer skb to the host port. */
dsa_enqueue_skb(skb, dsa_to_port(ds, port)->slave);
/* Wait until the switch has processed the frame */
do {
rc = sja1105_dynamic_config_read(priv, BLK_IDX_MGMT_ROUTE,
slot, &mgmt_route);
if (rc < 0) {
dev_err_ratelimited(priv->ds->dev,
"failed to poll for mgmt route\n");
continue;
}
/* UM10944: The ENFPORT flag of the respective entry is
* cleared when a match is found. The host can use this
* flag as an acknowledgment.
*/
cpu_relax();
} while (mgmt_route.enfport && --timeout);
if (!timeout) {
/* Clean up the management route so that a follow-up
* frame may not match on it by mistake.
* This is only hardware supported on P/Q/R/S - on E/T it is
* a no-op and we are silently discarding the -EOPNOTSUPP.
*/
sja1105_dynamic_config_write(priv, BLK_IDX_MGMT_ROUTE,
slot, &mgmt_route, false);
dev_err_ratelimited(priv->ds->dev, "xmit timed out\n");
}
return NETDEV_TX_OK;
}
net: dsa: Make deferred_xmit private to sja1105 There are 3 things that are wrong with the DSA deferred xmit mechanism: 1. Its introduction has made the DSA hotpath ever so slightly more inefficient for everybody, since DSA_SKB_CB(skb)->deferred_xmit needs to be initialized to false for every transmitted frame, in order to figure out whether the driver requested deferral or not (a very rare occasion, rare even for the only driver that does use this mechanism: sja1105). That was necessary to avoid kfree_skb from freeing the skb. 2. Because L2 PTP is a link-local protocol like STP, it requires management routes and deferred xmit with this switch. But as opposed to STP, the deferred work mechanism needs to schedule the packet rather quickly for the TX timstamp to be collected in time and sent to user space. But there is no provision for controlling the scheduling priority of this deferred xmit workqueue. Too bad this is a rather specific requirement for a feature that nobody else uses (more below). 3. Perhaps most importantly, it makes the DSA core adhere a bit too much to the NXP company-wide policy "Innovate Where It Doesn't Matter". The sja1105 is probably the only DSA switch that requires some frames sent from the CPU to be routed to the slave port via an out-of-band configuration (register write) rather than in-band (DSA tag). And there are indeed very good reasons to not want to do that: if that out-of-band register is at the other end of a slow bus such as SPI, then you limit that Ethernet flow's throughput to effectively the throughput of the SPI bus. So hardware vendors should definitely not be encouraged to design this way. We do _not_ want more widespread use of this mechanism. Luckily we have a solution for each of the 3 issues: For 1, we can just remove that variable in the skb->cb and counteract the effect of kfree_skb with skb_get, much to the same effect. The advantage, of course, being that anybody who doesn't use deferred xmit doesn't need to do any extra operation in the hotpath. For 2, we can create a kernel thread for each port's deferred xmit work. If the user switch ports are named swp0, swp1, swp2, the kernel threads will be named swp0_xmit, swp1_xmit, swp2_xmit (there appears to be a 15 character length limit on kernel thread names). With this, the user can change the scheduling priority with chrt $(pidof swp2_xmit). For 3, we can actually move the entire implementation to the sja1105 driver. So this patch deletes the generic implementation from the DSA core and adds a new one, more adequate to the requirements of PTP TX timestamping, in sja1105_main.c. Suggested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-04 00:37:10 +00:00
#define work_to_port(work) \
container_of((work), struct sja1105_port, xmit_work)
#define tagger_to_sja1105(t) \
container_of((t), struct sja1105_private, tagger_data)
/* Deferred work is unfortunately necessary because setting up the management
* route cannot be done from atomit context (SPI transfer takes a sleepable
* lock on the bus)
*/
net: dsa: Make deferred_xmit private to sja1105 There are 3 things that are wrong with the DSA deferred xmit mechanism: 1. Its introduction has made the DSA hotpath ever so slightly more inefficient for everybody, since DSA_SKB_CB(skb)->deferred_xmit needs to be initialized to false for every transmitted frame, in order to figure out whether the driver requested deferral or not (a very rare occasion, rare even for the only driver that does use this mechanism: sja1105). That was necessary to avoid kfree_skb from freeing the skb. 2. Because L2 PTP is a link-local protocol like STP, it requires management routes and deferred xmit with this switch. But as opposed to STP, the deferred work mechanism needs to schedule the packet rather quickly for the TX timstamp to be collected in time and sent to user space. But there is no provision for controlling the scheduling priority of this deferred xmit workqueue. Too bad this is a rather specific requirement for a feature that nobody else uses (more below). 3. Perhaps most importantly, it makes the DSA core adhere a bit too much to the NXP company-wide policy "Innovate Where It Doesn't Matter". The sja1105 is probably the only DSA switch that requires some frames sent from the CPU to be routed to the slave port via an out-of-band configuration (register write) rather than in-band (DSA tag). And there are indeed very good reasons to not want to do that: if that out-of-band register is at the other end of a slow bus such as SPI, then you limit that Ethernet flow's throughput to effectively the throughput of the SPI bus. So hardware vendors should definitely not be encouraged to design this way. We do _not_ want more widespread use of this mechanism. Luckily we have a solution for each of the 3 issues: For 1, we can just remove that variable in the skb->cb and counteract the effect of kfree_skb with skb_get, much to the same effect. The advantage, of course, being that anybody who doesn't use deferred xmit doesn't need to do any extra operation in the hotpath. For 2, we can create a kernel thread for each port's deferred xmit work. If the user switch ports are named swp0, swp1, swp2, the kernel threads will be named swp0_xmit, swp1_xmit, swp2_xmit (there appears to be a 15 character length limit on kernel thread names). With this, the user can change the scheduling priority with chrt $(pidof swp2_xmit). For 3, we can actually move the entire implementation to the sja1105 driver. So this patch deletes the generic implementation from the DSA core and adds a new one, more adequate to the requirements of PTP TX timestamping, in sja1105_main.c. Suggested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-04 00:37:10 +00:00
static void sja1105_port_deferred_xmit(struct kthread_work *work)
{
net: dsa: Make deferred_xmit private to sja1105 There are 3 things that are wrong with the DSA deferred xmit mechanism: 1. Its introduction has made the DSA hotpath ever so slightly more inefficient for everybody, since DSA_SKB_CB(skb)->deferred_xmit needs to be initialized to false for every transmitted frame, in order to figure out whether the driver requested deferral or not (a very rare occasion, rare even for the only driver that does use this mechanism: sja1105). That was necessary to avoid kfree_skb from freeing the skb. 2. Because L2 PTP is a link-local protocol like STP, it requires management routes and deferred xmit with this switch. But as opposed to STP, the deferred work mechanism needs to schedule the packet rather quickly for the TX timstamp to be collected in time and sent to user space. But there is no provision for controlling the scheduling priority of this deferred xmit workqueue. Too bad this is a rather specific requirement for a feature that nobody else uses (more below). 3. Perhaps most importantly, it makes the DSA core adhere a bit too much to the NXP company-wide policy "Innovate Where It Doesn't Matter". The sja1105 is probably the only DSA switch that requires some frames sent from the CPU to be routed to the slave port via an out-of-band configuration (register write) rather than in-band (DSA tag). And there are indeed very good reasons to not want to do that: if that out-of-band register is at the other end of a slow bus such as SPI, then you limit that Ethernet flow's throughput to effectively the throughput of the SPI bus. So hardware vendors should definitely not be encouraged to design this way. We do _not_ want more widespread use of this mechanism. Luckily we have a solution for each of the 3 issues: For 1, we can just remove that variable in the skb->cb and counteract the effect of kfree_skb with skb_get, much to the same effect. The advantage, of course, being that anybody who doesn't use deferred xmit doesn't need to do any extra operation in the hotpath. For 2, we can create a kernel thread for each port's deferred xmit work. If the user switch ports are named swp0, swp1, swp2, the kernel threads will be named swp0_xmit, swp1_xmit, swp2_xmit (there appears to be a 15 character length limit on kernel thread names). With this, the user can change the scheduling priority with chrt $(pidof swp2_xmit). For 3, we can actually move the entire implementation to the sja1105 driver. So this patch deletes the generic implementation from the DSA core and adds a new one, more adequate to the requirements of PTP TX timestamping, in sja1105_main.c. Suggested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-04 00:37:10 +00:00
struct sja1105_port *sp = work_to_port(work);
struct sja1105_tagger_data *tagger_data = sp->data;
struct sja1105_private *priv = tagger_to_sja1105(tagger_data);
int port = sp - priv->ports;
struct sk_buff *skb;
net: dsa: Make deferred_xmit private to sja1105 There are 3 things that are wrong with the DSA deferred xmit mechanism: 1. Its introduction has made the DSA hotpath ever so slightly more inefficient for everybody, since DSA_SKB_CB(skb)->deferred_xmit needs to be initialized to false for every transmitted frame, in order to figure out whether the driver requested deferral or not (a very rare occasion, rare even for the only driver that does use this mechanism: sja1105). That was necessary to avoid kfree_skb from freeing the skb. 2. Because L2 PTP is a link-local protocol like STP, it requires management routes and deferred xmit with this switch. But as opposed to STP, the deferred work mechanism needs to schedule the packet rather quickly for the TX timstamp to be collected in time and sent to user space. But there is no provision for controlling the scheduling priority of this deferred xmit workqueue. Too bad this is a rather specific requirement for a feature that nobody else uses (more below). 3. Perhaps most importantly, it makes the DSA core adhere a bit too much to the NXP company-wide policy "Innovate Where It Doesn't Matter". The sja1105 is probably the only DSA switch that requires some frames sent from the CPU to be routed to the slave port via an out-of-band configuration (register write) rather than in-band (DSA tag). And there are indeed very good reasons to not want to do that: if that out-of-band register is at the other end of a slow bus such as SPI, then you limit that Ethernet flow's throughput to effectively the throughput of the SPI bus. So hardware vendors should definitely not be encouraged to design this way. We do _not_ want more widespread use of this mechanism. Luckily we have a solution for each of the 3 issues: For 1, we can just remove that variable in the skb->cb and counteract the effect of kfree_skb with skb_get, much to the same effect. The advantage, of course, being that anybody who doesn't use deferred xmit doesn't need to do any extra operation in the hotpath. For 2, we can create a kernel thread for each port's deferred xmit work. If the user switch ports are named swp0, swp1, swp2, the kernel threads will be named swp0_xmit, swp1_xmit, swp2_xmit (there appears to be a 15 character length limit on kernel thread names). With this, the user can change the scheduling priority with chrt $(pidof swp2_xmit). For 3, we can actually move the entire implementation to the sja1105 driver. So this patch deletes the generic implementation from the DSA core and adds a new one, more adequate to the requirements of PTP TX timestamping, in sja1105_main.c. Suggested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-04 00:37:10 +00:00
while ((skb = skb_dequeue(&sp->xmit_queue)) != NULL) {
struct sk_buff *clone = DSA_SKB_CB(skb)->clone;
net: dsa: Make deferred_xmit private to sja1105 There are 3 things that are wrong with the DSA deferred xmit mechanism: 1. Its introduction has made the DSA hotpath ever so slightly more inefficient for everybody, since DSA_SKB_CB(skb)->deferred_xmit needs to be initialized to false for every transmitted frame, in order to figure out whether the driver requested deferral or not (a very rare occasion, rare even for the only driver that does use this mechanism: sja1105). That was necessary to avoid kfree_skb from freeing the skb. 2. Because L2 PTP is a link-local protocol like STP, it requires management routes and deferred xmit with this switch. But as opposed to STP, the deferred work mechanism needs to schedule the packet rather quickly for the TX timstamp to be collected in time and sent to user space. But there is no provision for controlling the scheduling priority of this deferred xmit workqueue. Too bad this is a rather specific requirement for a feature that nobody else uses (more below). 3. Perhaps most importantly, it makes the DSA core adhere a bit too much to the NXP company-wide policy "Innovate Where It Doesn't Matter". The sja1105 is probably the only DSA switch that requires some frames sent from the CPU to be routed to the slave port via an out-of-band configuration (register write) rather than in-band (DSA tag). And there are indeed very good reasons to not want to do that: if that out-of-band register is at the other end of a slow bus such as SPI, then you limit that Ethernet flow's throughput to effectively the throughput of the SPI bus. So hardware vendors should definitely not be encouraged to design this way. We do _not_ want more widespread use of this mechanism. Luckily we have a solution for each of the 3 issues: For 1, we can just remove that variable in the skb->cb and counteract the effect of kfree_skb with skb_get, much to the same effect. The advantage, of course, being that anybody who doesn't use deferred xmit doesn't need to do any extra operation in the hotpath. For 2, we can create a kernel thread for each port's deferred xmit work. If the user switch ports are named swp0, swp1, swp2, the kernel threads will be named swp0_xmit, swp1_xmit, swp2_xmit (there appears to be a 15 character length limit on kernel thread names). With this, the user can change the scheduling priority with chrt $(pidof swp2_xmit). For 3, we can actually move the entire implementation to the sja1105 driver. So this patch deletes the generic implementation from the DSA core and adds a new one, more adequate to the requirements of PTP TX timestamping, in sja1105_main.c. Suggested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-04 00:37:10 +00:00
mutex_lock(&priv->mgmt_lock);
net: dsa: Make deferred_xmit private to sja1105 There are 3 things that are wrong with the DSA deferred xmit mechanism: 1. Its introduction has made the DSA hotpath ever so slightly more inefficient for everybody, since DSA_SKB_CB(skb)->deferred_xmit needs to be initialized to false for every transmitted frame, in order to figure out whether the driver requested deferral or not (a very rare occasion, rare even for the only driver that does use this mechanism: sja1105). That was necessary to avoid kfree_skb from freeing the skb. 2. Because L2 PTP is a link-local protocol like STP, it requires management routes and deferred xmit with this switch. But as opposed to STP, the deferred work mechanism needs to schedule the packet rather quickly for the TX timstamp to be collected in time and sent to user space. But there is no provision for controlling the scheduling priority of this deferred xmit workqueue. Too bad this is a rather specific requirement for a feature that nobody else uses (more below). 3. Perhaps most importantly, it makes the DSA core adhere a bit too much to the NXP company-wide policy "Innovate Where It Doesn't Matter". The sja1105 is probably the only DSA switch that requires some frames sent from the CPU to be routed to the slave port via an out-of-band configuration (register write) rather than in-band (DSA tag). And there are indeed very good reasons to not want to do that: if that out-of-band register is at the other end of a slow bus such as SPI, then you limit that Ethernet flow's throughput to effectively the throughput of the SPI bus. So hardware vendors should definitely not be encouraged to design this way. We do _not_ want more widespread use of this mechanism. Luckily we have a solution for each of the 3 issues: For 1, we can just remove that variable in the skb->cb and counteract the effect of kfree_skb with skb_get, much to the same effect. The advantage, of course, being that anybody who doesn't use deferred xmit doesn't need to do any extra operation in the hotpath. For 2, we can create a kernel thread for each port's deferred xmit work. If the user switch ports are named swp0, swp1, swp2, the kernel threads will be named swp0_xmit, swp1_xmit, swp2_xmit (there appears to be a 15 character length limit on kernel thread names). With this, the user can change the scheduling priority with chrt $(pidof swp2_xmit). For 3, we can actually move the entire implementation to the sja1105 driver. So this patch deletes the generic implementation from the DSA core and adds a new one, more adequate to the requirements of PTP TX timestamping, in sja1105_main.c. Suggested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-04 00:37:10 +00:00
sja1105_mgmt_xmit(priv->ds, port, 0, skb, !!clone);
net: dsa: Make deferred_xmit private to sja1105 There are 3 things that are wrong with the DSA deferred xmit mechanism: 1. Its introduction has made the DSA hotpath ever so slightly more inefficient for everybody, since DSA_SKB_CB(skb)->deferred_xmit needs to be initialized to false for every transmitted frame, in order to figure out whether the driver requested deferral or not (a very rare occasion, rare even for the only driver that does use this mechanism: sja1105). That was necessary to avoid kfree_skb from freeing the skb. 2. Because L2 PTP is a link-local protocol like STP, it requires management routes and deferred xmit with this switch. But as opposed to STP, the deferred work mechanism needs to schedule the packet rather quickly for the TX timstamp to be collected in time and sent to user space. But there is no provision for controlling the scheduling priority of this deferred xmit workqueue. Too bad this is a rather specific requirement for a feature that nobody else uses (more below). 3. Perhaps most importantly, it makes the DSA core adhere a bit too much to the NXP company-wide policy "Innovate Where It Doesn't Matter". The sja1105 is probably the only DSA switch that requires some frames sent from the CPU to be routed to the slave port via an out-of-band configuration (register write) rather than in-band (DSA tag). And there are indeed very good reasons to not want to do that: if that out-of-band register is at the other end of a slow bus such as SPI, then you limit that Ethernet flow's throughput to effectively the throughput of the SPI bus. So hardware vendors should definitely not be encouraged to design this way. We do _not_ want more widespread use of this mechanism. Luckily we have a solution for each of the 3 issues: For 1, we can just remove that variable in the skb->cb and counteract the effect of kfree_skb with skb_get, much to the same effect. The advantage, of course, being that anybody who doesn't use deferred xmit doesn't need to do any extra operation in the hotpath. For 2, we can create a kernel thread for each port's deferred xmit work. If the user switch ports are named swp0, swp1, swp2, the kernel threads will be named swp0_xmit, swp1_xmit, swp2_xmit (there appears to be a 15 character length limit on kernel thread names). With this, the user can change the scheduling priority with chrt $(pidof swp2_xmit). For 3, we can actually move the entire implementation to the sja1105 driver. So this patch deletes the generic implementation from the DSA core and adds a new one, more adequate to the requirements of PTP TX timestamping, in sja1105_main.c. Suggested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-04 00:37:10 +00:00
/* The clone, if there, was made by dsa_skb_tx_timestamp */
if (clone)
sja1105_ptp_txtstamp_skb(priv->ds, port, clone);
net: dsa: Make deferred_xmit private to sja1105 There are 3 things that are wrong with the DSA deferred xmit mechanism: 1. Its introduction has made the DSA hotpath ever so slightly more inefficient for everybody, since DSA_SKB_CB(skb)->deferred_xmit needs to be initialized to false for every transmitted frame, in order to figure out whether the driver requested deferral or not (a very rare occasion, rare even for the only driver that does use this mechanism: sja1105). That was necessary to avoid kfree_skb from freeing the skb. 2. Because L2 PTP is a link-local protocol like STP, it requires management routes and deferred xmit with this switch. But as opposed to STP, the deferred work mechanism needs to schedule the packet rather quickly for the TX timstamp to be collected in time and sent to user space. But there is no provision for controlling the scheduling priority of this deferred xmit workqueue. Too bad this is a rather specific requirement for a feature that nobody else uses (more below). 3. Perhaps most importantly, it makes the DSA core adhere a bit too much to the NXP company-wide policy "Innovate Where It Doesn't Matter". The sja1105 is probably the only DSA switch that requires some frames sent from the CPU to be routed to the slave port via an out-of-band configuration (register write) rather than in-band (DSA tag). And there are indeed very good reasons to not want to do that: if that out-of-band register is at the other end of a slow bus such as SPI, then you limit that Ethernet flow's throughput to effectively the throughput of the SPI bus. So hardware vendors should definitely not be encouraged to design this way. We do _not_ want more widespread use of this mechanism. Luckily we have a solution for each of the 3 issues: For 1, we can just remove that variable in the skb->cb and counteract the effect of kfree_skb with skb_get, much to the same effect. The advantage, of course, being that anybody who doesn't use deferred xmit doesn't need to do any extra operation in the hotpath. For 2, we can create a kernel thread for each port's deferred xmit work. If the user switch ports are named swp0, swp1, swp2, the kernel threads will be named swp0_xmit, swp1_xmit, swp2_xmit (there appears to be a 15 character length limit on kernel thread names). With this, the user can change the scheduling priority with chrt $(pidof swp2_xmit). For 3, we can actually move the entire implementation to the sja1105 driver. So this patch deletes the generic implementation from the DSA core and adds a new one, more adequate to the requirements of PTP TX timestamping, in sja1105_main.c. Suggested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-04 00:37:10 +00:00
mutex_unlock(&priv->mgmt_lock);
}
}
/* The MAXAGE setting belongs to the L2 Forwarding Parameters table,
* which cannot be reconfigured at runtime. So a switch reset is required.
*/
static int sja1105_set_ageing_time(struct dsa_switch *ds,
unsigned int ageing_time)
{
struct sja1105_l2_lookup_params_entry *l2_lookup_params;
struct sja1105_private *priv = ds->priv;
struct sja1105_table *table;
unsigned int maxage;
table = &priv->static_config.tables[BLK_IDX_L2_LOOKUP_PARAMS];
l2_lookup_params = table->entries;
maxage = SJA1105_AGEING_TIME_MS(ageing_time);
if (l2_lookup_params->maxage == maxage)
return 0;
l2_lookup_params->maxage = maxage;
return sja1105_static_config_reload(priv, SJA1105_AGEING_TIME);
}
static int sja1105_change_mtu(struct dsa_switch *ds, int port, int new_mtu)
{
struct sja1105_l2_policing_entry *policing;
struct sja1105_private *priv = ds->priv;
new_mtu += VLAN_ETH_HLEN + ETH_FCS_LEN;
if (dsa_is_cpu_port(ds, port))
new_mtu += VLAN_HLEN;
policing = priv->static_config.tables[BLK_IDX_L2_POLICING].entries;
if (policing[port].maxlen == new_mtu)
return 0;
policing[port].maxlen = new_mtu;
return sja1105_static_config_reload(priv, SJA1105_BEST_EFFORT_POLICING);
}
static int sja1105_get_max_mtu(struct dsa_switch *ds, int port)
{
return 2043 - VLAN_ETH_HLEN - ETH_FCS_LEN;
}
net: dsa: sja1105: Configure the Time-Aware Scheduler via tc-taprio offload This qdisc offload is the closest thing to what the SJA1105 supports in hardware for time-based egress shaping. The switch core really is built around SAE AS6802/TTEthernet (a TTTech standard) but can be made to operate similarly to IEEE 802.1Qbv with some constraints: - The gate control list is a global list for all ports. There are 8 execution threads that iterate through this global list in parallel. I don't know why 8, there are only 4 front-panel ports. - Care must be taken by the user to make sure that two execution threads never get to execute a GCL entry simultaneously. I created a O(n^4) checker for this hardware limitation, prior to accepting a taprio offload configuration as valid. - The spec says that if a GCL entry's interval is shorter than the frame length, you shouldn't send it (and end up in head-of-line blocking). Well, this switch does anyway. - The switch has no concept of ADMIN and OPER configurations. Because it's so simple, the TAS settings are loaded through the static config tables interface, so there isn't even place for any discussion about 'graceful switchover between ADMIN and OPER'. You just reset the switch and upload a new OPER config. - The switch accepts multiple time sources for the gate events. Right now I am using the standalone clock source as opposed to PTP. So the base time parameter doesn't really do much. Support for the PTP clock source will be added in a future series. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-15 02:00:02 +00:00
static int sja1105_port_setup_tc(struct dsa_switch *ds, int port,
enum tc_setup_type type,
void *type_data)
{
switch (type) {
case TC_SETUP_QDISC_TAPRIO:
return sja1105_setup_tc_taprio(ds, port, type_data);
case TC_SETUP_QDISC_CBS:
return sja1105_setup_tc_cbs(ds, port, type_data);
net: dsa: sja1105: Configure the Time-Aware Scheduler via tc-taprio offload This qdisc offload is the closest thing to what the SJA1105 supports in hardware for time-based egress shaping. The switch core really is built around SAE AS6802/TTEthernet (a TTTech standard) but can be made to operate similarly to IEEE 802.1Qbv with some constraints: - The gate control list is a global list for all ports. There are 8 execution threads that iterate through this global list in parallel. I don't know why 8, there are only 4 front-panel ports. - Care must be taken by the user to make sure that two execution threads never get to execute a GCL entry simultaneously. I created a O(n^4) checker for this hardware limitation, prior to accepting a taprio offload configuration as valid. - The spec says that if a GCL entry's interval is shorter than the frame length, you shouldn't send it (and end up in head-of-line blocking). Well, this switch does anyway. - The switch has no concept of ADMIN and OPER configurations. Because it's so simple, the TAS settings are loaded through the static config tables interface, so there isn't even place for any discussion about 'graceful switchover between ADMIN and OPER'. You just reset the switch and upload a new OPER config. - The switch accepts multiple time sources for the gate events. Right now I am using the standalone clock source as opposed to PTP. So the base time parameter doesn't really do much. Support for the PTP clock source will be added in a future series. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-15 02:00:02 +00:00
default:
return -EOPNOTSUPP;
}
}
/* We have a single mirror (@to) port, but can configure ingress and egress
* mirroring on all other (@from) ports.
* We need to allow mirroring rules only as long as the @to port is always the
* same, and we need to unset the @to port from mirr_port only when there is no
* mirroring rule that references it.
*/
static int sja1105_mirror_apply(struct sja1105_private *priv, int from, int to,
bool ingress, bool enabled)
{
struct sja1105_general_params_entry *general_params;
struct sja1105_mac_config_entry *mac;
struct sja1105_table *table;
bool already_enabled;
u64 new_mirr_port;
int rc;
table = &priv->static_config.tables[BLK_IDX_GENERAL_PARAMS];
general_params = table->entries;
mac = priv->static_config.tables[BLK_IDX_MAC_CONFIG].entries;
already_enabled = (general_params->mirr_port != SJA1105_NUM_PORTS);
if (already_enabled && enabled && general_params->mirr_port != to) {
dev_err(priv->ds->dev,
"Delete mirroring rules towards port %llu first\n",
general_params->mirr_port);
return -EBUSY;
}
new_mirr_port = to;
if (!enabled) {
bool keep = false;
int port;
/* Anybody still referencing mirr_port? */
for (port = 0; port < SJA1105_NUM_PORTS; port++) {
if (mac[port].ing_mirr || mac[port].egr_mirr) {
keep = true;
break;
}
}
/* Unset already_enabled for next time */
if (!keep)
new_mirr_port = SJA1105_NUM_PORTS;
}
if (new_mirr_port != general_params->mirr_port) {
general_params->mirr_port = new_mirr_port;
rc = sja1105_dynamic_config_write(priv, BLK_IDX_GENERAL_PARAMS,
0, general_params, true);
if (rc < 0)
return rc;
}
if (ingress)
mac[from].ing_mirr = enabled;
else
mac[from].egr_mirr = enabled;
return sja1105_dynamic_config_write(priv, BLK_IDX_MAC_CONFIG, from,
&mac[from], true);
}
static int sja1105_mirror_add(struct dsa_switch *ds, int port,
struct dsa_mall_mirror_tc_entry *mirror,
bool ingress)
{
return sja1105_mirror_apply(ds->priv, port, mirror->to_local_port,
ingress, true);
}
static void sja1105_mirror_del(struct dsa_switch *ds, int port,
struct dsa_mall_mirror_tc_entry *mirror)
{
sja1105_mirror_apply(ds->priv, port, mirror->to_local_port,
mirror->ingress, false);
}
static int sja1105_port_policer_add(struct dsa_switch *ds, int port,
struct dsa_mall_policer_tc_entry *policer)
{
struct sja1105_l2_policing_entry *policing;
struct sja1105_private *priv = ds->priv;
policing = priv->static_config.tables[BLK_IDX_L2_POLICING].entries;
/* In hardware, every 8 microseconds the credit level is incremented by
* the value of RATE bytes divided by 64, up to a maximum of SMAX
* bytes.
*/
policing[port].rate = div_u64(512 * policer->rate_bytes_per_sec,
1000000);
policing[port].smax = policer->burst;
return sja1105_static_config_reload(priv, SJA1105_BEST_EFFORT_POLICING);
}
static void sja1105_port_policer_del(struct dsa_switch *ds, int port)
{
struct sja1105_l2_policing_entry *policing;
struct sja1105_private *priv = ds->priv;
policing = priv->static_config.tables[BLK_IDX_L2_POLICING].entries;
policing[port].rate = SJA1105_RATE_MBPS(1000);
policing[port].smax = 65535;
sja1105_static_config_reload(priv, SJA1105_BEST_EFFORT_POLICING);
}
static const struct dsa_switch_ops sja1105_switch_ops = {
.get_tag_protocol = sja1105_get_tag_protocol,
.setup = sja1105_setup,
.teardown = sja1105_teardown,
.set_ageing_time = sja1105_set_ageing_time,
.port_change_mtu = sja1105_change_mtu,
.port_max_mtu = sja1105_get_max_mtu,
.phylink_validate = sja1105_phylink_validate,
net: dsa: sja1105: Add support for the SGMII port SJA1105 switches R and S have one SerDes port with an 802.3z quasi-compatible PCS, hardwired on port 4. The other ports are still MII/RMII/RGMII. The PCS performs rate adaptation to lower link speeds; the MAC on this port is hardwired at gigabit. Only full duplex is supported. The SGMII port can be configured as part of the static config tables, as well as through a dedicated SPI address region for its pseudo-clause-22 registers. However it looks like the static configuration is not able to change some out-of-reset values (like the value of MII_BMCR), so at the end of the day, having code for it is utterly pointless. We are just going to use the pseudo-C22 interface. Because the PCS gets reset when the switch resets, we have to add even more restoration logic to sja1105_static_config_reload, otherwise the SGMII port breaks after operations such as enabling PTP timestamping which require a switch reset. >From PHYLINK perspective, the switch supports *only* SGMII (it doesn't support 1000Base-X). It also doesn't expose access to the raw config word for in-band AN in registers MII_ADV/MII_LPA. It is able to work in the following modes: - Forced speed - SGMII in-band AN slave (speed received from PHY) - SGMII in-band AN master (acting as a PHY) The latter mode is not supported by this patch. It is even unclear to me how that would be described. There is some code for it left in the patch, but 'an_master' is always passed as false. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-20 11:29:37 +00:00
.phylink_mac_link_state = sja1105_mac_pcs_get_state,
net: dsa: sja1105: Fix broken fixed-link interfaces on user ports PHYLIB and PHYLINK handle fixed-link interfaces differently. PHYLIB wraps them in a software PHY ("pseudo fixed link") phydev construct such that .adjust_link driver callbacks see an unified API. Whereas PHYLINK simply creates a phylink_link_state structure and passes it to .mac_config. At the time the driver was introduced, DSA was using PHYLIB for the CPU/cascade ports (the ones with no net devices) and PHYLINK for everything else. As explained below: commit aab9c4067d2389d0adfc9c53806437df7b0fe3d5 Author: Florian Fainelli <f.fainelli@gmail.com> Date: Thu May 10 13:17:36 2018 -0700 net: dsa: Plug in PHYLINK support Drivers that utilize fixed links for user-facing ports (e.g: bcm_sf2) will need to implement phylink_mac_ops from now on to preserve functionality, since PHYLINK *does not* create a phy_device instance for fixed links. In the above patch, DSA guards the .phylink_mac_config callback against a NULL phydev pointer. Therefore, .adjust_link is not called in case of a fixed-link user port. This patch fixes the situation by converting the driver from using .adjust_link to .phylink_mac_config. This can be done now in a unified fashion for both slave and CPU/cascade ports because DSA now uses PHYLINK for all ports. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-28 17:38:17 +00:00
.phylink_mac_config = sja1105_mac_config,
.phylink_mac_link_up = sja1105_mac_link_up,
.phylink_mac_link_down = sja1105_mac_link_down,
.get_strings = sja1105_get_strings,
.get_ethtool_stats = sja1105_get_ethtool_stats,
.get_sset_count = sja1105_get_sset_count,
net: dsa: sja1105: Add support for the PTP clock The design of this PHC driver is influenced by the switch's behavior w.r.t. timestamping. It exposes two PTP counters, one free-running (PTPTSCLK) and the other offset- and frequency-corrected in hardware through PTPCLKVAL, PTPCLKADD and PTPCLKRATE. The MACs can sample either of these for frame timestamps. However, the user manual warns that taking timestamps based on the corrected clock is less than useful, as the switch can deliver corrupted timestamps in a variety of circumstances. Therefore, this PHC uses the free-running PTPTSCLK together with a timecounter/cyclecounter structure that translates it into a software time domain. Thus, the settime/adjtime and adjfine callbacks are hardware no-ops. The timestamps (introduced in a further patch) will also be translated to the correct time domain before being handed over to the userspace PTP stack. The introduction of a second set of PHC operations that operate on the hardware PTPCLKVAL/PTPCLKADD/PTPCLKRATE in the future is somewhat unavoidable, as the TTEthernet core uses the corrected PTP time domain. However, the free-running counter + timecounter structure combination will suffice for now, as the resulting timestamps yield a sub-50 ns synchronization offset in steady state using linuxptp. For this patch, in absence of frame timestamping, the operations of the switch PHC were tested by syncing it to the system time as a local slave clock with: phc2sys -s CLOCK_REALTIME -c swp2 -O 0 -m -S 0.01 Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-08 12:04:34 +00:00
.get_ts_info = sja1105_get_ts_info,
.port_enable = sja1105_port_enable,
net: dsa: Make deferred_xmit private to sja1105 There are 3 things that are wrong with the DSA deferred xmit mechanism: 1. Its introduction has made the DSA hotpath ever so slightly more inefficient for everybody, since DSA_SKB_CB(skb)->deferred_xmit needs to be initialized to false for every transmitted frame, in order to figure out whether the driver requested deferral or not (a very rare occasion, rare even for the only driver that does use this mechanism: sja1105). That was necessary to avoid kfree_skb from freeing the skb. 2. Because L2 PTP is a link-local protocol like STP, it requires management routes and deferred xmit with this switch. But as opposed to STP, the deferred work mechanism needs to schedule the packet rather quickly for the TX timstamp to be collected in time and sent to user space. But there is no provision for controlling the scheduling priority of this deferred xmit workqueue. Too bad this is a rather specific requirement for a feature that nobody else uses (more below). 3. Perhaps most importantly, it makes the DSA core adhere a bit too much to the NXP company-wide policy "Innovate Where It Doesn't Matter". The sja1105 is probably the only DSA switch that requires some frames sent from the CPU to be routed to the slave port via an out-of-band configuration (register write) rather than in-band (DSA tag). And there are indeed very good reasons to not want to do that: if that out-of-band register is at the other end of a slow bus such as SPI, then you limit that Ethernet flow's throughput to effectively the throughput of the SPI bus. So hardware vendors should definitely not be encouraged to design this way. We do _not_ want more widespread use of this mechanism. Luckily we have a solution for each of the 3 issues: For 1, we can just remove that variable in the skb->cb and counteract the effect of kfree_skb with skb_get, much to the same effect. The advantage, of course, being that anybody who doesn't use deferred xmit doesn't need to do any extra operation in the hotpath. For 2, we can create a kernel thread for each port's deferred xmit work. If the user switch ports are named swp0, swp1, swp2, the kernel threads will be named swp0_xmit, swp1_xmit, swp2_xmit (there appears to be a 15 character length limit on kernel thread names). With this, the user can change the scheduling priority with chrt $(pidof swp2_xmit). For 3, we can actually move the entire implementation to the sja1105 driver. So this patch deletes the generic implementation from the DSA core and adds a new one, more adequate to the requirements of PTP TX timestamping, in sja1105_main.c. Suggested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-04 00:37:10 +00:00
.port_disable = sja1105_port_disable,
.port_fdb_dump = sja1105_fdb_dump,
.port_fdb_add = sja1105_fdb_add,
.port_fdb_del = sja1105_fdb_del,
.port_bridge_join = sja1105_bridge_join,
.port_bridge_leave = sja1105_bridge_leave,
net: dsa: sja1105: Add support for Spanning Tree Protocol While not explicitly documented as supported in UM10944, compliance with the STP states can be obtained by manipulating 3 settings at the (per-port) MAC config level: dynamic learning, inhibiting reception of regular traffic, and inhibiting transmission of regular traffic. In all these modes, transmission and reception of special BPDU frames from the stack is still enabled (not inhibited by the MAC-level settings). On ingress, BPDUs are classified by the MAC filter as link-local (01-80-C2-00-00-00) and forwarded to the CPU port. This mechanism works under all conditions (even without the custom 802.1Q tagging) because the switch hardware inserts the source port and switch ID into bytes 4 and 5 of the MAC-filtered frames. Then the DSA .rcv handler needs to put back zeroes into the MAC address after decoding the source port information. On egress, BPDUs are transmitted using management routes from the xmit worker thread. Again this does not require switch tagging, as the switch port is programmed through SPI to hold a temporary (single-fire) route for a frame with the programmed destination MAC (01-80-C2-00-00-00). STP is activated using the following commands and was tested by connecting two front-panel ports together and noticing that switching loops were prevented (one port remains in the blocking state): $ ip link add name br0 type bridge stp_state 1 && ip link set br0 up $ for eth in $(ls /sys/devices/platform/soc/2100000.spi/spi_master/spi0/spi0.1/net/); do ip link set ${eth} master br0 && ip link set ${eth} up; done Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-05 10:19:28 +00:00
.port_stp_state_set = sja1105_bridge_stp_state_set,
.port_vlan_prepare = sja1105_vlan_prepare,
.port_vlan_filtering = sja1105_vlan_filtering,
.port_vlan_add = sja1105_vlan_add,
.port_vlan_del = sja1105_vlan_del,
.port_mdb_prepare = sja1105_mdb_prepare,
.port_mdb_add = sja1105_mdb_add,
.port_mdb_del = sja1105_mdb_del,
.port_hwtstamp_get = sja1105_hwtstamp_get,
.port_hwtstamp_set = sja1105_hwtstamp_set,
.port_rxtstamp = sja1105_port_rxtstamp,
.port_txtstamp = sja1105_port_txtstamp,
net: dsa: sja1105: Configure the Time-Aware Scheduler via tc-taprio offload This qdisc offload is the closest thing to what the SJA1105 supports in hardware for time-based egress shaping. The switch core really is built around SAE AS6802/TTEthernet (a TTTech standard) but can be made to operate similarly to IEEE 802.1Qbv with some constraints: - The gate control list is a global list for all ports. There are 8 execution threads that iterate through this global list in parallel. I don't know why 8, there are only 4 front-panel ports. - Care must be taken by the user to make sure that two execution threads never get to execute a GCL entry simultaneously. I created a O(n^4) checker for this hardware limitation, prior to accepting a taprio offload configuration as valid. - The spec says that if a GCL entry's interval is shorter than the frame length, you shouldn't send it (and end up in head-of-line blocking). Well, this switch does anyway. - The switch has no concept of ADMIN and OPER configurations. Because it's so simple, the TAS settings are loaded through the static config tables interface, so there isn't even place for any discussion about 'graceful switchover between ADMIN and OPER'. You just reset the switch and upload a new OPER config. - The switch accepts multiple time sources for the gate events. Right now I am using the standalone clock source as opposed to PTP. So the base time parameter doesn't really do much. Support for the PTP clock source will be added in a future series. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-15 02:00:02 +00:00
.port_setup_tc = sja1105_port_setup_tc,
.port_mirror_add = sja1105_mirror_add,
.port_mirror_del = sja1105_mirror_del,
.port_policer_add = sja1105_port_policer_add,
.port_policer_del = sja1105_port_policer_del,
.cls_flower_add = sja1105_cls_flower_add,
.cls_flower_del = sja1105_cls_flower_del,
net: dsa: sja1105: implement tc-gate using time-triggered virtual links Restrict the TTEthernet hardware support on this switch to operate as closely as possible to IEEE 802.1Qci as possible. This means that it can perform PTP-time-based ingress admission control on streams identified by {DMAC, VID, PCP}, which is useful when trying to ensure the determinism of traffic scheduled via IEEE 802.1Qbv. The oddity comes from the fact that in hardware (and in TTEthernet at large), virtual links always need a full-blown action, including not only the type of policing, but also the list of destination ports. So in practice, a single tc-gate action will result in all packets getting dropped. Additional actions (either "trap" or "redirect") need to be specified in the same filter rule such that the conforming packets are actually forwarded somewhere. Apart from the VL Lookup, Policing and Forwarding tables which need to be programmed for each flow (virtual link), the Schedule engine also needs to be told to open/close the admission gates for each individual virtual link. A fairly accurate (and detailed) description of how that works is already present in sja1105_tas.c, since it is already used to trigger the egress gates for the tc-taprio offload (IEEE 802.1Qbv). Key point here, we remember that the schedule engine supports 8 "subschedules" (execution threads that iterate through the global schedule in parallel, and that no 2 hardware threads must execute a schedule entry at the same time). For tc-taprio, each egress port used one of these 8 subschedules, leaving a total of 4 subschedules unused. In principle we could have allocated 1 subschedule for the tc-gate offload of each ingress port, but actually the schedules of all virtual links installed on each ingress port would have needed to be merged together, before they could have been programmed to hardware. So simplify our life and just merge the entire tc-gate configuration, for all virtual links on all ingress ports, into a single subschedule. Be sure to check that against the usual hardware scheduling conflicts, and program it to hardware alongside any tc-taprio subschedule that may be present. The following scenarios were tested: 1. Quantitative testing: tc qdisc add dev swp2 clsact tc filter add dev swp2 ingress flower skip_sw \ dst_mac 42:be:24:9b:76:20 \ action gate index 1 base-time 0 \ sched-entry OPEN 1200 -1 -1 \ sched-entry CLOSE 1200 -1 -1 \ action trap ping 192.168.1.2 -f PING 192.168.1.2 (192.168.1.2) 56(84) bytes of data. ............................. --- 192.168.1.2 ping statistics --- 948 packets transmitted, 467 received, 50.7384% packet loss, time 9671ms 2. Qualitative testing (with a phase-aligned schedule - the clocks are synchronized by ptp4l, not shown here): Receiver (sja1105): tc qdisc add dev swp2 clsact now=$(phc_ctl /dev/ptp1 get | awk '/clock time is/ {print $5}') && \ sec=$(echo $now | awk -F. '{print $1}') && \ base_time="$(((sec + 2) * 1000000000))" && \ echo "base time ${base_time}" tc filter add dev swp2 ingress flower skip_sw \ dst_mac 42:be:24:9b:76:20 \ action gate base-time ${base_time} \ sched-entry OPEN 60000 -1 -1 \ sched-entry CLOSE 40000 -1 -1 \ action trap Sender (enetc): now=$(phc_ctl /dev/ptp0 get | awk '/clock time is/ {print $5}') && \ sec=$(echo $now | awk -F. '{print $1}') && \ base_time="$(((sec + 2) * 1000000000))" && \ echo "base time ${base_time}" tc qdisc add dev eno0 parent root taprio \ num_tc 8 \ map 0 1 2 3 4 5 6 7 \ queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \ base-time ${base_time} \ sched-entry S 01 50000 \ sched-entry S 00 50000 \ flags 2 ping -A 192.168.1.1 PING 192.168.1.1 (192.168.1.1): 56 data bytes ... ^C --- 192.168.1.1 ping statistics --- 1425 packets transmitted, 1424 packets received, 0% packet loss round-trip min/avg/max = 0.322/0.361/0.990 ms And just for comparison, with the tc-taprio schedule deleted: ping -A 192.168.1.1 PING 192.168.1.1 (192.168.1.1): 56 data bytes ... ^C --- 192.168.1.1 ping statistics --- 33 packets transmitted, 19 packets received, 42% packet loss round-trip min/avg/max = 0.336/0.464/0.597 ms Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-05 19:20:56 +00:00
.cls_flower_stats = sja1105_cls_flower_stats,
net: dsa: sja1105: implement cross-chip bridging operations sja1105 uses dsa_8021q for DSA tagging, a format which is VLAN at heart and which is compatible with cascading. A complete description of this tagging format is in net/dsa/tag_8021q.c, but a quick summary is that each external-facing port tags incoming frames with a unique pvid, and this special VLAN is transmitted as tagged towards the inside of the system, and as untagged towards the exterior. The tag encodes the switch id and the source port index. This means that cross-chip bridging for dsa_8021q only entails adding the dsa_8021q pvids of one switch to the RX filter of the other switches. Everything else falls naturally into place, as long as the bottom-end of ports (the leaves in the tree) is comprised exclusively of dsa_8021q-compatible (i.e. sja1105 switches). Otherwise, there would be a chance that a front-panel switch transmits a packet tagged with a dsa_8021q header, header which it wouldn't be able to remove, and which would hence "leak" out. The only use case I tested (due to lack of board availability) was when the sja1105 switches are part of disjoint trees (however, this doesn't change the fact that multiple sja1105 switches still need unique switch identifiers in such a system). But in principle, even "true" single-tree setups (with DSA links) should work just as fine, except for a small change which I can't test: dsa_towards_port should be used instead of dsa_upstream_port (I made the assumption that the routing port that any sja1105 should use towards its neighbours is the CPU port. That might not hold true in other setups). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-10 16:37:43 +00:00
.crosschip_bridge_join = sja1105_crosschip_bridge_join,
.crosschip_bridge_leave = sja1105_crosschip_bridge_leave,
.devlink_param_get = sja1105_devlink_param_get,
.devlink_param_set = sja1105_devlink_param_set,
};
static int sja1105_check_device_id(struct sja1105_private *priv)
{
const struct sja1105_regs *regs = priv->info->regs;
u8 prod_id[SJA1105_SIZE_DEVICE_ID] = {0};
struct device *dev = &priv->spidev->dev;
u32 device_id;
u64 part_no;
int rc;
net: dsa: sja1105: Implement the .gettimex64 system call for PTP Through the PTP_SYS_OFFSET_EXTENDED ioctl, it is possible for userspace applications (i.e. phc2sys) to compensate for the delays incurred while reading the PHC's time. The task itself of taking the software timestamp is delegated to the SPI subsystem, through the newly introduced API in struct spi_transfer. The goal is to cross-timestamp I/O operations on the switch's PTP clock with values in the local system clock (CLOCK_REALTIME). For that we need to understand a bit of the hardware internals. The 'read PTP time' message is a 12 byte structure, first 4 bytes of which represent the SPI header, and the last 8 bytes represent the 64-bit PTP time. The switch itself starts processing the command immediately after receiving the last bit of the address, i.e. at the middle of byte 3 (last byte of header). The PTP time is shadowed to a buffer register in the switch, and retrieved atomically during the subsequent SPI frames. A similar thing goes on for the 'write PTP time' message, although in that case the switch waits until the 64-bit PTP time becomes fully available before taking any action. So the byte that needs to be software-timestamped is byte 11 (last) of the transfer. The patch creates a common (and local) sja1105_xfer implementation for the SPI I/O, and offers 3 front-ends: - sja1105_xfer_u32 and sja1105_xfer_u64: these are capable of optionally requesting a PTP timestamp - sja1105_xfer_buf: this is for large transfers (e.g. the static config buffer) and other misc data, and there is no point in giving timestamping capabilities to this. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-09 11:32:22 +00:00
rc = sja1105_xfer_u32(priv, SPI_READ, regs->device_id, &device_id,
NULL);
if (rc < 0)
return rc;
if (device_id != priv->info->device_id) {
dev_err(dev, "Expected device ID 0x%llx but read 0x%x\n",
priv->info->device_id, device_id);
return -ENODEV;
}
rc = sja1105_xfer_buf(priv, SPI_READ, regs->prod_id, prod_id,
SJA1105_SIZE_DEVICE_ID);
if (rc < 0)
return rc;
sja1105_unpack(prod_id, &part_no, 19, 4, SJA1105_SIZE_DEVICE_ID);
if (part_no != priv->info->part_no) {
dev_err(dev, "Expected part number 0x%llx but read 0x%llx\n",
priv->info->part_no, part_no);
return -ENODEV;
}
return 0;
}
static int sja1105_probe(struct spi_device *spi)
{
struct sja1105_tagger_data *tagger_data;
struct device *dev = &spi->dev;
struct sja1105_private *priv;
struct dsa_switch *ds;
net: dsa: Make deferred_xmit private to sja1105 There are 3 things that are wrong with the DSA deferred xmit mechanism: 1. Its introduction has made the DSA hotpath ever so slightly more inefficient for everybody, since DSA_SKB_CB(skb)->deferred_xmit needs to be initialized to false for every transmitted frame, in order to figure out whether the driver requested deferral or not (a very rare occasion, rare even for the only driver that does use this mechanism: sja1105). That was necessary to avoid kfree_skb from freeing the skb. 2. Because L2 PTP is a link-local protocol like STP, it requires management routes and deferred xmit with this switch. But as opposed to STP, the deferred work mechanism needs to schedule the packet rather quickly for the TX timstamp to be collected in time and sent to user space. But there is no provision for controlling the scheduling priority of this deferred xmit workqueue. Too bad this is a rather specific requirement for a feature that nobody else uses (more below). 3. Perhaps most importantly, it makes the DSA core adhere a bit too much to the NXP company-wide policy "Innovate Where It Doesn't Matter". The sja1105 is probably the only DSA switch that requires some frames sent from the CPU to be routed to the slave port via an out-of-band configuration (register write) rather than in-band (DSA tag). And there are indeed very good reasons to not want to do that: if that out-of-band register is at the other end of a slow bus such as SPI, then you limit that Ethernet flow's throughput to effectively the throughput of the SPI bus. So hardware vendors should definitely not be encouraged to design this way. We do _not_ want more widespread use of this mechanism. Luckily we have a solution for each of the 3 issues: For 1, we can just remove that variable in the skb->cb and counteract the effect of kfree_skb with skb_get, much to the same effect. The advantage, of course, being that anybody who doesn't use deferred xmit doesn't need to do any extra operation in the hotpath. For 2, we can create a kernel thread for each port's deferred xmit work. If the user switch ports are named swp0, swp1, swp2, the kernel threads will be named swp0_xmit, swp1_xmit, swp2_xmit (there appears to be a 15 character length limit on kernel thread names). With this, the user can change the scheduling priority with chrt $(pidof swp2_xmit). For 3, we can actually move the entire implementation to the sja1105 driver. So this patch deletes the generic implementation from the DSA core and adds a new one, more adequate to the requirements of PTP TX timestamping, in sja1105_main.c. Suggested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-04 00:37:10 +00:00
int rc, port;
if (!dev->of_node) {
dev_err(dev, "No DTS bindings for SJA1105 driver\n");
return -EINVAL;
}
priv = devm_kzalloc(dev, sizeof(struct sja1105_private), GFP_KERNEL);
if (!priv)
return -ENOMEM;
/* Configure the optional reset pin and bring up switch */
priv->reset_gpio = devm_gpiod_get(dev, "reset", GPIOD_OUT_HIGH);
if (IS_ERR(priv->reset_gpio))
dev_dbg(dev, "reset-gpios not defined, ignoring\n");
else
sja1105_hw_reset(priv->reset_gpio, 1, 1);
/* Populate our driver private structure (priv) based on
* the device tree node that was probed (spi)
*/
priv->spidev = spi;
spi_set_drvdata(spi, priv);
/* Configure the SPI bus */
spi->bits_per_word = 8;
rc = spi_setup(spi);
if (rc < 0) {
dev_err(dev, "Could not init SPI\n");
return rc;
}
priv->info = of_device_get_match_data(dev);
/* Detect hardware device */
rc = sja1105_check_device_id(priv);
if (rc < 0) {
dev_err(dev, "Device ID check failed: %d\n", rc);
return rc;
}
dev_info(dev, "Probed switch chip: %s\n", priv->info->name);
ds = devm_kzalloc(dev, sizeof(*ds), GFP_KERNEL);
if (!ds)
return -ENOMEM;
ds->dev = dev;
ds->num_ports = SJA1105_NUM_PORTS;
ds->ops = &sja1105_switch_ops;
ds->priv = priv;
priv->ds = ds;
tagger_data = &priv->tagger_data;
mutex_init(&priv->ptp_data.lock);
mutex_init(&priv->mgmt_lock);
net: dsa: sja1105: implement cross-chip bridging operations sja1105 uses dsa_8021q for DSA tagging, a format which is VLAN at heart and which is compatible with cascading. A complete description of this tagging format is in net/dsa/tag_8021q.c, but a quick summary is that each external-facing port tags incoming frames with a unique pvid, and this special VLAN is transmitted as tagged towards the inside of the system, and as untagged towards the exterior. The tag encodes the switch id and the source port index. This means that cross-chip bridging for dsa_8021q only entails adding the dsa_8021q pvids of one switch to the RX filter of the other switches. Everything else falls naturally into place, as long as the bottom-end of ports (the leaves in the tree) is comprised exclusively of dsa_8021q-compatible (i.e. sja1105 switches). Otherwise, there would be a chance that a front-panel switch transmits a packet tagged with a dsa_8021q header, header which it wouldn't be able to remove, and which would hence "leak" out. The only use case I tested (due to lack of board availability) was when the sja1105 switches are part of disjoint trees (however, this doesn't change the fact that multiple sja1105 switches still need unique switch identifiers in such a system). But in principle, even "true" single-tree setups (with DSA links) should work just as fine, except for a small change which I can't test: dsa_towards_port should be used instead of dsa_upstream_port (I made the assumption that the routing port that any sja1105 should use towards its neighbours is the CPU port. That might not hold true in other setups). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-10 16:37:43 +00:00
INIT_LIST_HEAD(&priv->crosschip_links);
net: dsa: sja1105: save/restore VLANs using a delta commit method Managing the VLAN table that is present in hardware will become very difficult once we add a third operating state (best_effort_vlan_filtering). That is because correct cleanup (not too little, not too much) becomes virtually impossible, when VLANs can be added from the bridge layer, from dsa_8021q for basic tagging, for cross-chip bridging, as well as retagging rules for sub-VLANs and cross-chip sub-VLANs. So we need to rethink VLAN interaction with the switch in a more scalable way. In preparation for that, use the priv->expect_dsa_8021q boolean to classify any VLAN request received through .port_vlan_add or .port_vlan_del towards either one of 2 internal lists: bridge VLANs and dsa_8021q VLANs. Then, implement a central sja1105_build_vlan_table method that creates a VLAN configuration from scratch based on the 2 lists of VLANs kept by the driver, and based on the VLAN awareness state. Currently, if we are VLAN-unaware, install the dsa_8021q VLANs, otherwise the bridge VLANs. Then, implement a delta commit procedure that identifies which VLANs from this new configuration are actually different from the config previously committed to hardware. We apply the delta through the dynamic configuration interface (we don't reset the switch). The result is that the hardware should see the exact sequence of operations as before this patch. This also helps remove the "br" argument passed to dsa_8021q_crosschip_bridge_join, which it was only using to figure out whether it should commit the configuration back to us or not, based on the VLAN awareness state of the bridge. We can simplify that, by always allowing those VLANs inside of our dsa_8021q_vlans list, and committing those to hardware when necessary. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 17:20:29 +00:00
INIT_LIST_HEAD(&priv->bridge_vlans);
INIT_LIST_HEAD(&priv->dsa_8021q_vlans);
net: dsa: sja1105: implement cross-chip bridging operations sja1105 uses dsa_8021q for DSA tagging, a format which is VLAN at heart and which is compatible with cascading. A complete description of this tagging format is in net/dsa/tag_8021q.c, but a quick summary is that each external-facing port tags incoming frames with a unique pvid, and this special VLAN is transmitted as tagged towards the inside of the system, and as untagged towards the exterior. The tag encodes the switch id and the source port index. This means that cross-chip bridging for dsa_8021q only entails adding the dsa_8021q pvids of one switch to the RX filter of the other switches. Everything else falls naturally into place, as long as the bottom-end of ports (the leaves in the tree) is comprised exclusively of dsa_8021q-compatible (i.e. sja1105 switches). Otherwise, there would be a chance that a front-panel switch transmits a packet tagged with a dsa_8021q header, header which it wouldn't be able to remove, and which would hence "leak" out. The only use case I tested (due to lack of board availability) was when the sja1105 switches are part of disjoint trees (however, this doesn't change the fact that multiple sja1105 switches still need unique switch identifiers in such a system). But in principle, even "true" single-tree setups (with DSA links) should work just as fine, except for a small change which I can't test: dsa_towards_port should be used instead of dsa_upstream_port (I made the assumption that the routing port that any sja1105 should use towards its neighbours is the CPU port. That might not hold true in other setups). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-05-10 16:37:43 +00:00
sja1105_tas_setup(ds);
sja1105_flower_setup(ds);
rc = dsa_register_switch(priv->ds);
if (rc)
return rc;
if (IS_ENABLED(CONFIG_NET_SCH_CBS)) {
priv->cbs = devm_kcalloc(dev, priv->info->num_cbs_shapers,
sizeof(struct sja1105_cbs_entry),
GFP_KERNEL);
if (!priv->cbs)
return -ENOMEM;
}
/* Connections between dsa_port and sja1105_port */
net: dsa: Make deferred_xmit private to sja1105 There are 3 things that are wrong with the DSA deferred xmit mechanism: 1. Its introduction has made the DSA hotpath ever so slightly more inefficient for everybody, since DSA_SKB_CB(skb)->deferred_xmit needs to be initialized to false for every transmitted frame, in order to figure out whether the driver requested deferral or not (a very rare occasion, rare even for the only driver that does use this mechanism: sja1105). That was necessary to avoid kfree_skb from freeing the skb. 2. Because L2 PTP is a link-local protocol like STP, it requires management routes and deferred xmit with this switch. But as opposed to STP, the deferred work mechanism needs to schedule the packet rather quickly for the TX timstamp to be collected in time and sent to user space. But there is no provision for controlling the scheduling priority of this deferred xmit workqueue. Too bad this is a rather specific requirement for a feature that nobody else uses (more below). 3. Perhaps most importantly, it makes the DSA core adhere a bit too much to the NXP company-wide policy "Innovate Where It Doesn't Matter". The sja1105 is probably the only DSA switch that requires some frames sent from the CPU to be routed to the slave port via an out-of-band configuration (register write) rather than in-band (DSA tag). And there are indeed very good reasons to not want to do that: if that out-of-band register is at the other end of a slow bus such as SPI, then you limit that Ethernet flow's throughput to effectively the throughput of the SPI bus. So hardware vendors should definitely not be encouraged to design this way. We do _not_ want more widespread use of this mechanism. Luckily we have a solution for each of the 3 issues: For 1, we can just remove that variable in the skb->cb and counteract the effect of kfree_skb with skb_get, much to the same effect. The advantage, of course, being that anybody who doesn't use deferred xmit doesn't need to do any extra operation in the hotpath. For 2, we can create a kernel thread for each port's deferred xmit work. If the user switch ports are named swp0, swp1, swp2, the kernel threads will be named swp0_xmit, swp1_xmit, swp2_xmit (there appears to be a 15 character length limit on kernel thread names). With this, the user can change the scheduling priority with chrt $(pidof swp2_xmit). For 3, we can actually move the entire implementation to the sja1105 driver. So this patch deletes the generic implementation from the DSA core and adds a new one, more adequate to the requirements of PTP TX timestamping, in sja1105_main.c. Suggested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-04 00:37:10 +00:00
for (port = 0; port < SJA1105_NUM_PORTS; port++) {
struct sja1105_port *sp = &priv->ports[port];
struct dsa_port *dp = dsa_to_port(ds, port);
struct net_device *slave;
int subvlan;
net: dsa: Make deferred_xmit private to sja1105 There are 3 things that are wrong with the DSA deferred xmit mechanism: 1. Its introduction has made the DSA hotpath ever so slightly more inefficient for everybody, since DSA_SKB_CB(skb)->deferred_xmit needs to be initialized to false for every transmitted frame, in order to figure out whether the driver requested deferral or not (a very rare occasion, rare even for the only driver that does use this mechanism: sja1105). That was necessary to avoid kfree_skb from freeing the skb. 2. Because L2 PTP is a link-local protocol like STP, it requires management routes and deferred xmit with this switch. But as opposed to STP, the deferred work mechanism needs to schedule the packet rather quickly for the TX timstamp to be collected in time and sent to user space. But there is no provision for controlling the scheduling priority of this deferred xmit workqueue. Too bad this is a rather specific requirement for a feature that nobody else uses (more below). 3. Perhaps most importantly, it makes the DSA core adhere a bit too much to the NXP company-wide policy "Innovate Where It Doesn't Matter". The sja1105 is probably the only DSA switch that requires some frames sent from the CPU to be routed to the slave port via an out-of-band configuration (register write) rather than in-band (DSA tag). And there are indeed very good reasons to not want to do that: if that out-of-band register is at the other end of a slow bus such as SPI, then you limit that Ethernet flow's throughput to effectively the throughput of the SPI bus. So hardware vendors should definitely not be encouraged to design this way. We do _not_ want more widespread use of this mechanism. Luckily we have a solution for each of the 3 issues: For 1, we can just remove that variable in the skb->cb and counteract the effect of kfree_skb with skb_get, much to the same effect. The advantage, of course, being that anybody who doesn't use deferred xmit doesn't need to do any extra operation in the hotpath. For 2, we can create a kernel thread for each port's deferred xmit work. If the user switch ports are named swp0, swp1, swp2, the kernel threads will be named swp0_xmit, swp1_xmit, swp2_xmit (there appears to be a 15 character length limit on kernel thread names). With this, the user can change the scheduling priority with chrt $(pidof swp2_xmit). For 3, we can actually move the entire implementation to the sja1105 driver. So this patch deletes the generic implementation from the DSA core and adds a new one, more adequate to the requirements of PTP TX timestamping, in sja1105_main.c. Suggested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-04 00:37:10 +00:00
if (!dsa_is_user_port(ds, port))
continue;
net: dsa: Make deferred_xmit private to sja1105 There are 3 things that are wrong with the DSA deferred xmit mechanism: 1. Its introduction has made the DSA hotpath ever so slightly more inefficient for everybody, since DSA_SKB_CB(skb)->deferred_xmit needs to be initialized to false for every transmitted frame, in order to figure out whether the driver requested deferral or not (a very rare occasion, rare even for the only driver that does use this mechanism: sja1105). That was necessary to avoid kfree_skb from freeing the skb. 2. Because L2 PTP is a link-local protocol like STP, it requires management routes and deferred xmit with this switch. But as opposed to STP, the deferred work mechanism needs to schedule the packet rather quickly for the TX timstamp to be collected in time and sent to user space. But there is no provision for controlling the scheduling priority of this deferred xmit workqueue. Too bad this is a rather specific requirement for a feature that nobody else uses (more below). 3. Perhaps most importantly, it makes the DSA core adhere a bit too much to the NXP company-wide policy "Innovate Where It Doesn't Matter". The sja1105 is probably the only DSA switch that requires some frames sent from the CPU to be routed to the slave port via an out-of-band configuration (register write) rather than in-band (DSA tag). And there are indeed very good reasons to not want to do that: if that out-of-band register is at the other end of a slow bus such as SPI, then you limit that Ethernet flow's throughput to effectively the throughput of the SPI bus. So hardware vendors should definitely not be encouraged to design this way. We do _not_ want more widespread use of this mechanism. Luckily we have a solution for each of the 3 issues: For 1, we can just remove that variable in the skb->cb and counteract the effect of kfree_skb with skb_get, much to the same effect. The advantage, of course, being that anybody who doesn't use deferred xmit doesn't need to do any extra operation in the hotpath. For 2, we can create a kernel thread for each port's deferred xmit work. If the user switch ports are named swp0, swp1, swp2, the kernel threads will be named swp0_xmit, swp1_xmit, swp2_xmit (there appears to be a 15 character length limit on kernel thread names). With this, the user can change the scheduling priority with chrt $(pidof swp2_xmit). For 3, we can actually move the entire implementation to the sja1105 driver. So this patch deletes the generic implementation from the DSA core and adds a new one, more adequate to the requirements of PTP TX timestamping, in sja1105_main.c. Suggested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-04 00:37:10 +00:00
dp->priv = sp;
sp->dp = dp;
sp->data = tagger_data;
net: dsa: Make deferred_xmit private to sja1105 There are 3 things that are wrong with the DSA deferred xmit mechanism: 1. Its introduction has made the DSA hotpath ever so slightly more inefficient for everybody, since DSA_SKB_CB(skb)->deferred_xmit needs to be initialized to false for every transmitted frame, in order to figure out whether the driver requested deferral or not (a very rare occasion, rare even for the only driver that does use this mechanism: sja1105). That was necessary to avoid kfree_skb from freeing the skb. 2. Because L2 PTP is a link-local protocol like STP, it requires management routes and deferred xmit with this switch. But as opposed to STP, the deferred work mechanism needs to schedule the packet rather quickly for the TX timstamp to be collected in time and sent to user space. But there is no provision for controlling the scheduling priority of this deferred xmit workqueue. Too bad this is a rather specific requirement for a feature that nobody else uses (more below). 3. Perhaps most importantly, it makes the DSA core adhere a bit too much to the NXP company-wide policy "Innovate Where It Doesn't Matter". The sja1105 is probably the only DSA switch that requires some frames sent from the CPU to be routed to the slave port via an out-of-band configuration (register write) rather than in-band (DSA tag). And there are indeed very good reasons to not want to do that: if that out-of-band register is at the other end of a slow bus such as SPI, then you limit that Ethernet flow's throughput to effectively the throughput of the SPI bus. So hardware vendors should definitely not be encouraged to design this way. We do _not_ want more widespread use of this mechanism. Luckily we have a solution for each of the 3 issues: For 1, we can just remove that variable in the skb->cb and counteract the effect of kfree_skb with skb_get, much to the same effect. The advantage, of course, being that anybody who doesn't use deferred xmit doesn't need to do any extra operation in the hotpath. For 2, we can create a kernel thread for each port's deferred xmit work. If the user switch ports are named swp0, swp1, swp2, the kernel threads will be named swp0_xmit, swp1_xmit, swp2_xmit (there appears to be a 15 character length limit on kernel thread names). With this, the user can change the scheduling priority with chrt $(pidof swp2_xmit). For 3, we can actually move the entire implementation to the sja1105 driver. So this patch deletes the generic implementation from the DSA core and adds a new one, more adequate to the requirements of PTP TX timestamping, in sja1105_main.c. Suggested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-04 00:37:10 +00:00
slave = dp->slave;
kthread_init_work(&sp->xmit_work, sja1105_port_deferred_xmit);
sp->xmit_worker = kthread_create_worker(0, "%s_xmit",
slave->name);
if (IS_ERR(sp->xmit_worker)) {
rc = PTR_ERR(sp->xmit_worker);
dev_err(ds->dev,
"failed to create deferred xmit thread: %d\n",
rc);
goto out;
}
skb_queue_head_init(&sp->xmit_queue);
net: dsa: sja1105: prepare tagger for handling DSA tags and VLAN simultaneously In VLAN-unaware mode, sja1105 uses VLAN tags with a custom TPID of 0xdadb. While in the yet-to-be introduced best_effort_vlan_filtering mode, it needs to work with normal VLAN TPID values. A complication arises when we must transmit a VLAN-tagged packet to the switch when it's in VLAN-aware mode. We need to construct a packet with 2 VLAN tags, and the switch will use the outer header for routing and pop it on egress. But sadly, here the 2 hardware generations don't behave the same: - E/T switches won't pop an ETH_P_8021AD tag on egress, it seems (packets will remain double-tagged). - P/Q/R/S switches will drop a packet with 2 ETH_P_8021Q tags (it looks like it tries to prevent VLAN hopping). But looks like the reverse is also true: - E/T switches have no problem popping the outer tag from packets with 2 ETH_P_8021Q tags. - P/Q/R/S will have no problem popping a single tag even if that is ETH_P_8021AD. So it is clear that if we want the hardware to work with dsa_8021q tagging in VLAN-aware mode, we need to send different TPIDs depending on revision. Keep that information in priv->info->qinq_tpid. The per-port tagger structure will hold an xmit_tpid value that depends not only upon the qinq_tpid, but also upon the VLAN awareness state itself (in case we must transmit using 0xdadb). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-12 17:20:32 +00:00
sp->xmit_tpid = ETH_P_SJA1105;
for (subvlan = 0; subvlan < DSA_8021Q_N_SUBVLAN; subvlan++)
sp->subvlan_map[subvlan] = VLAN_N_VID;
}
return 0;
net: dsa: Make deferred_xmit private to sja1105 There are 3 things that are wrong with the DSA deferred xmit mechanism: 1. Its introduction has made the DSA hotpath ever so slightly more inefficient for everybody, since DSA_SKB_CB(skb)->deferred_xmit needs to be initialized to false for every transmitted frame, in order to figure out whether the driver requested deferral or not (a very rare occasion, rare even for the only driver that does use this mechanism: sja1105). That was necessary to avoid kfree_skb from freeing the skb. 2. Because L2 PTP is a link-local protocol like STP, it requires management routes and deferred xmit with this switch. But as opposed to STP, the deferred work mechanism needs to schedule the packet rather quickly for the TX timstamp to be collected in time and sent to user space. But there is no provision for controlling the scheduling priority of this deferred xmit workqueue. Too bad this is a rather specific requirement for a feature that nobody else uses (more below). 3. Perhaps most importantly, it makes the DSA core adhere a bit too much to the NXP company-wide policy "Innovate Where It Doesn't Matter". The sja1105 is probably the only DSA switch that requires some frames sent from the CPU to be routed to the slave port via an out-of-band configuration (register write) rather than in-band (DSA tag). And there are indeed very good reasons to not want to do that: if that out-of-band register is at the other end of a slow bus such as SPI, then you limit that Ethernet flow's throughput to effectively the throughput of the SPI bus. So hardware vendors should definitely not be encouraged to design this way. We do _not_ want more widespread use of this mechanism. Luckily we have a solution for each of the 3 issues: For 1, we can just remove that variable in the skb->cb and counteract the effect of kfree_skb with skb_get, much to the same effect. The advantage, of course, being that anybody who doesn't use deferred xmit doesn't need to do any extra operation in the hotpath. For 2, we can create a kernel thread for each port's deferred xmit work. If the user switch ports are named swp0, swp1, swp2, the kernel threads will be named swp0_xmit, swp1_xmit, swp2_xmit (there appears to be a 15 character length limit on kernel thread names). With this, the user can change the scheduling priority with chrt $(pidof swp2_xmit). For 3, we can actually move the entire implementation to the sja1105 driver. So this patch deletes the generic implementation from the DSA core and adds a new one, more adequate to the requirements of PTP TX timestamping, in sja1105_main.c. Suggested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-04 00:37:10 +00:00
out:
while (port-- > 0) {
struct sja1105_port *sp = &priv->ports[port];
if (!dsa_is_user_port(ds, port))
continue;
kthread_destroy_worker(sp->xmit_worker);
}
return rc;
}
static int sja1105_remove(struct spi_device *spi)
{
struct sja1105_private *priv = spi_get_drvdata(spi);
dsa_unregister_switch(priv->ds);
return 0;
}
static const struct of_device_id sja1105_dt_ids[] = {
{ .compatible = "nxp,sja1105e", .data = &sja1105e_info },
{ .compatible = "nxp,sja1105t", .data = &sja1105t_info },
{ .compatible = "nxp,sja1105p", .data = &sja1105p_info },
{ .compatible = "nxp,sja1105q", .data = &sja1105q_info },
{ .compatible = "nxp,sja1105r", .data = &sja1105r_info },
{ .compatible = "nxp,sja1105s", .data = &sja1105s_info },
{ /* sentinel */ },
};
MODULE_DEVICE_TABLE(of, sja1105_dt_ids);
static struct spi_driver sja1105_driver = {
.driver = {
.name = "sja1105",
.owner = THIS_MODULE,
.of_match_table = of_match_ptr(sja1105_dt_ids),
},
.probe = sja1105_probe,
.remove = sja1105_remove,
};
module_spi_driver(sja1105_driver);
MODULE_AUTHOR("Vladimir Oltean <olteanv@gmail.com>");
MODULE_AUTHOR("Georg Waibel <georg.waibel@sensor-technik.de>");
MODULE_DESCRIPTION("SJA1105 Driver");
MODULE_LICENSE("GPL v2");