linux

Author	SHA1	Message	Date
Vladimir Oltean	40d3f295b5	net: mscc: ocelot: use common tag parsing code with DSA The Injection Frame Header and Extraction Frame Header that the switch prepends to frames over the NPI port is also prepended to frames delivered over the CPU port module's queues. Let's unify the handling of the frame headers by making the ocelot driver call some helpers exported by the DSA tagger. Among other things, this allows us to get rid of the strange cpu_to_be32 when transmitting the Injection Frame Header on ocelot, since the packing API uses network byte order natively (when "quirks" is 0). The comments above ocelot_gen_ifh talk about setting pop_cnt to 3, and the cpu extraction queue mask to something, but the code doesn't do it, so we don't do it either. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-02-14 17:31:44 -08:00
Vladimir Oltean	8fe6832e96	net: dsa: felix: propagate the LAG offload ops towards the ocelot lib The ocelot switch has been supporting LAG offload since its initial commit, however felix could not make use of that, due to lack of a LAG abstraction in DSA. Now that we have that, let's forward DSA's calls towards the ocelot library, who will deal with setting up the bonding. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-06 14:51:51 -08:00
Vladimir Oltean	23ca3b727e	net: mscc: ocelot: rebalance LAGs on link up/down events At present there is an issue when ocelot is offloading a bonding interface, but one of the links of the physical ports goes down. Traffic keeps being hashed towards that destination, and of course gets dropped on egress. Monitor the netdev notifier events emitted by the bonding driver for changes in the physical state of lower interfaces, to determine which ports are active and which ones are no longer. Then extend ocelot_get_bond_mask to return either the configured bonding interfaces, or the active ones, depending on a boolean argument. The code that does rebalancing only needs to do so among the active ports, whereas the bridge forwarding mask and the logical port IDs still need to look at the permanently bonded ports. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-06 14:51:51 -08:00
Vladimir Oltean	583cbbe3ee	net: mscc: ocelot: don't refuse bonding interfaces we can't offload Since switchdev/DSA exposes network interfaces that fulfill many of the same user space expectations that dedicated NICs do, it makes sense to not deny bonding interfaces with a bonding policy that we cannot offload, but instead allow the bonding driver to select the egress interface in software. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-06 14:51:50 -08:00
Dan Carpenter	e0c1623357	net: mscc: ocelot: fix error handling bugs in mscc_ocelot_init_ports() There are several error handling bugs in mscc_ocelot_init_ports(). I went through the code, and carefully audited it and made fixes and cleanups. 1) The ocelot_probe_port() function didn't have a mirror release function so it was hard to follow. I created the ocelot_release_port() function. 2) In the ocelot_probe_port() function, if the register_netdev() call failed, then it lead to a double free_netdev(dev) bug. Fix this by setting "ocelot->ports[port] = NULL" on the error path. 3) I was concerned that the "port" which comes from of_property_read_u32() might be out of bounds so I added a check for that. 4) In the original code if ocelot_regmap_init() failed then the driver tried to continue but I think that should be a fatal error. 5) If ocelot_probe_port() failed then the most recent devlink was leaked. The fix for mostly came Vladimir Oltean. Get rid of "registered_ports" and just set a bit in "devlink_ports_registered" to say when the devlink port has been registered (and needs to be unregistered on error). There are fewer than 32 ports so a u32 is large enough for this purpose. 6) The error handling if the final ocelot_port_devlink_init() failed had two problems. The "while (port-- >= 0)" loop should have been "--port" pre-op instead of a post-op to avoid a buffer underflow. The "if (!registered_ports[port])" condition was reversed leading to resource leaks and double frees. Fixes: `6c30384eb1` ("net: mscc: ocelot: register devlink ports") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/YBkXhqRxHtRGzSnJ@mwanda Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-02-03 16:18:10 -08:00
Vladimir Oltean	f59fd9cab7	net: mscc: ocelot: configure watermarks using devlink-sb Using devlink-sb, we can configure 12/16 (the important 75%) of the switch's controlling watermarks for congestion drops, and we can monitor 50% of the watermark occupancies (we can monitor the reservation watermarks, but not the sharing watermarks, which are exposed as pool sizes). The following definitions can be made: SB_BUF=0 # The devlink-sb for frame buffers SB_REF=1 # The devlink-sb for frame references POOL_ING=0 # The pool for ingress traffic. Both devlink-sb instances # have one of these. POOL_EGR=1 # The pool for egress traffic. Both devlink-sb instances # have one of these. Editing the hardware watermarks is done in the following way: BUF_xxxx_I is accessed when sb=$SB_BUF and pool=$POOL_ING REF_xxxx_I is accessed when sb=$SB_REF and pool=$POOL_ING BUF_xxxx_E is accessed when sb=$SB_BUF and pool=$POOL_EGR REF_xxxx_E is accessed when sb=$SB_REF and pool=$POOL_EGR Configuring the sharing watermarks for COL_SHR(dp=0) is done implicitly by modifying the corresponding pool size. By default, the pool size has maximum size, so this can be skipped. devlink sb pool set pci/0000:00:00.5 sb $SB_BUF pool $POOL_ING \ size 129840 thtype static Since by default there is no buffer reservation, the above command has maxed out BUF_COL_SHR_I(dp=0). Configuring the per-port reservation watermark (P_RSRV) is done in the following way: devlink sb port pool set pci/0000:00:00.5/0 sb $SB_BUF \ pool $POOL_ING th 1000 The above command sets BUF_P_RSRV_I(port 0) to 1000 bytes. After this command, the sharing watermarks are internally reconfigured with 1000 bytes less, i.e. from 129840 bytes to 128840 bytes. Configuring the per-port-tc reservation watermarks (Q_RSRV) is done in the following way: for tc in {0..7}; do devlink sb tc bind set pci/0000:00:00.5/0 sb 0 tc $tc \ type ingress pool $POOL_ING \ th 3000 done The above command sets BUF_Q_RSRV_I(port 0, tc 0..7) to 3000 bytes. The sharing watermarks are again reconfigured with 24000 bytes less. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-15 20:02:35 -08:00
Vladimir Oltean	a4ae997adc	net: mscc: ocelot: initialize watermarks to sane defaults This is meant to be a gentle introduction into the world of watermarks on ocelot. The code is placed in ocelot_devlink.c because it will be integrated with devlink, even if it isn't right now. My first step was intended to be to replicate the default configuration of the congestion watermarks programatically, since they are now going to be tuned by the user. But after studying and understanding through trial and error how they work, I now believe that the configuration used out of reset does not do justice to the word "reservation", since the sum of all reservations exceeds the total amount of resources (otherwise said, all reservations cannot be fulfilled at the same time, which means that, contrary to the reference manual, they don't guarantee anything). As an example, here's a dump of the reservation watermarks for frame buffers, for port 0 (for brevity, the ports 1-6 were omitted, but they have the same configuration): BUF_Q_RSRV_I(port 0, prio 0) = max 3000 bytes BUF_Q_RSRV_I(port 0, prio 1) = max 3000 bytes BUF_Q_RSRV_I(port 0, prio 2) = max 3000 bytes BUF_Q_RSRV_I(port 0, prio 3) = max 3000 bytes BUF_Q_RSRV_I(port 0, prio 4) = max 3000 bytes BUF_Q_RSRV_I(port 0, prio 5) = max 3000 bytes BUF_Q_RSRV_I(port 0, prio 6) = max 3000 bytes BUF_Q_RSRV_I(port 0, prio 7) = max 3000 bytes Otherwise said, every port-tc has an ingress reservation of 3000 bytes, and there are 7 ports in VSC9959 Felix (6 user ports and 1 CPU port). Concentrating only on the ingress reservations, there are, in total, 8 [traffic classes] x 7 [ports] x 3000 [bytes] = 168,000 bytes of memory reserved on ingress. But, surprise, Felix only has 128 KB of packet buffer in total... A similar thing happens with Seville, which has a larger packet buffer, but also more ports, and the default configuration is also overcommitted. This patch disables the (apparently) bogus reservations and moves all resources to the shared area. This way, real reservations can be set up by the user, using devlink-sb. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-15 20:02:34 -08:00
Vladimir Oltean	6c30384eb1	net: mscc: ocelot: register devlink ports Add devlink integration into the mscc_ocelot switchdev driver. All physical ports (i.e. the unused ones as well) except the CPU port module at ocelot->num_phys_ports are registered with devlink, and that requires keeping the devlink_port structure outside struct ocelot_port_private, since the latter has a 1:1 mapping with a struct net_device (which does not exist for unused ports). Since we use devlink_port_type_eth_set to link the devlink port to the net_device, we can as well remove the .ndo_get_phys_port_name and .ndo_get_port_parent_id implementations, since devlink takes care of retrieving the port name and number automatically, once .ndo_get_devlink_port is implemented. Note that the felix DSA driver is already integrated with devlink by default, since that is a thing that the DSA core takes care of. This is the reason why these devlink stubs were put in ocelot_net.c and not in the common library. It is also the reason why ocelot::devlink is a pointer and not a full structure embedded inside struct ocelot: because the mscc_ocelot driver allocates that by itself (as the container of struct ocelot, in fact), but in the case of felix, it is DSA who allocates the devlink, and felix just propagates the pointer towards struct ocelot. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-15 20:02:34 -08:00
Vladimir Oltean	c6c65d47dd	net: mscc: ocelot: delete unused ocelot_set_cpu_port prototype This is a leftover of commit `69df578c5f` ("net: mscc: ocelot: eliminate confusion between CPU and NPI port") which renamed that function. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2021-01-15 20:02:34 -08:00
Vladimir Oltean	e5d1f896fd	net: mscc: ocelot: support L2 multicast entries There is one main difference in mscc_ocelot between IP multicast and L2 multicast. With IP multicast, destination ports are encoded into the upper bytes of the multicast MAC address. Example: to deliver the address 01:00:5E:11:22:33 to ports 3, 8, and 9, one would need to program the address of 00:03:08:11:22:33 into hardware. Whereas for L2 multicast, the MAC table entry points to a Port Group ID (PGID), and that PGID contains the port mask that the packet will be forwarded to. As to why it is this way, no clue. My guess is that not all port combinations can be supported simultaneously with the limited number of PGIDs, and this was somehow an issue for IP multicast but not for L2 multicast. Anyway. Prior to this change, the raw L2 multicast code was bogus, due to the fact that there wasn't really any way to test it using the bridge code. There were 2 issues: - A multicast PGID was allocated for each MDB entry, but it wasn't in fact programmed to hardware. It was dummy. - In fact we don't want to reserve a multicast PGID for every single MDB entry. That would be odd because we can only have ~60 PGIDs, but thousands of MDB entries. So instead, we want to reserve a multicast PGID for every single port combination for multicast traffic. And since we can have 2 (or more) MDB entries delivered to the same port group (and therefore PGID), we need to reference-count the PGIDs. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-30 18:25:56 -07:00
Vladimir Oltean	bb8d53fd94	net: mscc: ocelot: make entry_type a member of struct ocelot_multicast This saves a re-classification of the MDB address on deletion. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-10-30 18:25:56 -07:00
Vladimir Oltean	319e4dd11a	net: mscc: ocelot: introduce conversion helpers between port and netdev Since the mscc_ocelot_switch_lib is common between a pure switchdev and a DSA driver, the procedure of retrieving a net_device for a certain port index differs, as those are registered by their individual front-ends. Up to now that has been dealt with by always passing the port index to the switch library, but now, we're going to need to work with net_device pointers from the tc-flower offload, for things like indev, or mirred. It is not desirable to refactor that, so let's make sure that the flower offload core has the ability to translate between a net_device and a port index properly. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-10-02 15:40:30 -07:00
Vladimir Oltean	886e1387c7	net: mscc: ocelot: convert QSYS_SWITCH_PORT_MODE and SYS_PORT_MODE to regfields Currently Felix and Ocelot share the same bit layout in these per-port registers, but Seville does not. So we need reg_fields for that. Actually since these are per-port registers, we need to also specify the number of ports, and register size per port, and use the regmap API for multiple ports. There's a more subtle point to be made about the other 2 register fields: - QSYS_SWITCH_PORT_MODE_SCH_NEXT_CFG - QSYS_SWITCH_PORT_MODE_INGRESS_DROP_MODE which we are not writing any longer, for 2 reasons: - Using the previous API (ocelot_write_rix), we were only writing 1 for Felix and Ocelot, which was their hardware-default value, and which there wasn't any intention in changing. - In the case of SCH_NEXT_CFG, in fact Seville does not have this register field at all, and therefore, if we want to have common code we would be required to not write to it. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-07-13 17:40:01 -07:00
Vladimir Oltean	91c724cfc0	net: mscc: ocelot: convert port registers to regmap At the moment, there are some minimal register differences between VSC7514 Ocelot and VSC9959 Felix. To be precise, the PCS1G registers are missing from Felix because it was integrated with an NXP PCS. But with VSC9953 Seville (not yet introduced), the register differences are more pronounced. The MAC registers are located at different offsets within the DEV_GMII target. So we need to refactor the driver to keep a regmap even for per-port registers. The callers of the ocelot_port_readl and ocelot_port_writel were kept unchanged, only the implementation is now more generic. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-07-13 17:40:01 -07:00
Vladimir Oltean	9403c158b8	net: mscc: ocelot: support IPv4, IPv6 and plain Ethernet mdb entries The current procedure for installing a multicast address is hardcoded for IPv4. But, in the ocelot hardware, there are 3 different procedures for IPv4, IPv6 and for regular L2 multicast. For IPv6 (33-33-xx-xx-xx-xx), it's the same as for IPv4 (01-00-5e-xx-xx-xx), except that the destination port mask is stuffed into first 2 bytes of the MAC address except into first 3 bytes. For plain Ethernet multicast, there's no port-in-address stuffing going on, instead the DEST_IDX (pointer to PGID) is used there, just as for unicast. So we have to use one of the nonreserved multicast PGIDs that the hardware has allocated for this purpose. This patch classifies the type of multicast address based on its first bytes, then redirects to one of the 3 different hardware procedures. Note that this gives us a really better way of redirecting PTP frames sent at 01-1b-19-00-00-00 to the CPU. Previously, Yangbo Lu tried to add a trapping rule for PTP EtherType but got a lot of pushback: https://patchwork.ozlabs.org/project/netdev/patch/20190813025214.18601-5-yangbo.lu@nxp.com/ But right now, that isn't needed at all. The application stack (ptp4l) does this for the PTP multicast addresses it's interested in (which are configurable, and include 01-1b-19-00-00-00): memset(&mreq, 0, sizeof(mreq)); mreq.mr_ifindex = index; mreq.mr_type = PACKET_MR_MULTICAST; mreq.mr_alen = MAC_LEN; memcpy(mreq.mr_address, addr1, MAC_LEN); err1 = setsockopt(fd, SOL_PACKET, PACKET_ADD_MEMBERSHIP, &mreq, sizeof(mreq)); Into the kernel, this translates into a dev_mc_add on the switch network interfaces, and our drivers know that it means they should translate it into a host MDB address (make the CPU port be the destination). Previously, this was broken because all mdb addresses were treated as IPv4 (which 01-1b-19-00-00-00 obviously is not). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-22 20:41:05 -07:00
Vladimir Oltean	209edf95da	net: dsa: felix: call port mdb operations from ocelot This adds the mdb hooks in felix and exports the mdb functions from ocelot. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-22 20:41:05 -07:00
Vladimir Oltean	9c90eea310	net: mscc: ocelot: move net_device related functions to ocelot_net.c The ocelot hardware library shouldn't contain too much net_device specific code, since it is shared with DSA which abstracts that structure away. So much as much of this code as possible into the mscc_ocelot driver and outside of the common library. We're making an exception for MDB and LAG code. That is not yet exported to DSA, but when it will, most of the code that's already in ocelot.c will remain there. So, there's no point in moving code to ocelot_net.c just to move it back later. We could have moved all net_device code to ocelot_vsc7514.c directly, but let's operate under the assumption that if a new switchdev ocelot driver gets added, it'll define its SoC-specific stuff in a new ocelot_vsc*.c file and it'll reuse the rest of the code. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-20 17:25:23 -07:00
Vladimir Oltean	d9feb90499	net: mscc: ocelot: move ocelot_regs.c into ocelot_vsc7514.c ocelot_regs.c actually shouldn't be part of the common library. It describes the register map of the VSC7514 switch. The way ocelot switches work, they'll have highly optimized register maps, so another SoC will likely have the same registers but laid out completely different in memory (so there's little room for reusing this structure). So move it to ocelot_vsc7514.c instead. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-20 17:25:23 -07:00
Yangbo Lu	2b49d128b3	net: mscc: ocelot: move ocelot ptp clock code out of ocelot.c The Ocelot PTP clock driver had been embedded into ocelot.c driver. It had supported basic gettime64/settime64/adjtime/adjfine functions by now which were used by both Ocelot switch and Felix switch. This patch is to move current ptp clock code out of ocelot.c driver maintaining as a single ocelot_ptp.c. For futher new features implementation, the common code could be put in ocelot_ptp.c and the switch specific code should be in specific switch driver. The interrupt implementation in SoC is different between Ocelot and Felix. Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-04-21 15:38:33 -07:00
Vladimir Oltean	87b0f983f6	net: mscc: ocelot: fix untagged packet drops when enslaving to vlan aware bridge To rehash a previous explanation given in commit `1c44ce560b` ("net: mscc: ocelot: fix vlan_filtering when enslaving to bridge before link is up"), the switch driver operates the in a mode where a single VLAN can be transmitted as untagged on a particular egress port. That is the "native VLAN on trunk port" use case. The configuration for this native VLAN is driven in 2 ways: - Set the egress port rewriter to strip the VLAN tag for the native VID (as it is egress-untagged, after all). - Configure the ingress port to drop untagged and priority-tagged traffic, if there is no native VLAN. The intention of this setting is that a trunk port with no native VLAN should not accept untagged traffic. Since both of the above configurations for the native VLAN should only be done if VLAN awareness is requested, they are actually done from the ocelot_port_vlan_filtering function, after the basic procedure of toggling the VLAN awareness flag of the port. But there's a problem with that simplistic approach: we are trying to juggle with 2 independent variables from a single function: - Native VLAN of the port - its value is held in port->vid. - VLAN awareness state of the port - currently there are some issues here, more on that later. The actual problem can be seen when enslaving the switch ports to a VLAN filtering bridge: 0. The driver configures a pvid of zero for each port, when in standalone mode. While the bridge configures a default_pvid of 1 for each port that gets added as a slave to it. 1. The bridge calls ocelot_port_vlan_filtering with vlan_aware=true. The VLAN-filtering-dependent portion of the native VLAN configuration is done, considering that the native VLAN is 0. 2. The bridge calls ocelot_vlan_add with vid=1, pvid=true, untagged=true. The native VLAN changes to 1 (change which gets propagated to hardware). 3. ??? - nobody calls ocelot_port_vlan_filtering again, to reapply the VLAN-filtering-dependent portion of the native VLAN configuration, for the new native VLAN of 1. One can notice that after toggling "ip link set dev br0 type bridge vlan_filtering 0 && ip link set dev br0 type bridge vlan_filtering 1", the new native VLAN finally makes it through and untagged traffic finally starts flowing again. But obviously that shouldn't be needed. So it is clear that 2 independent variables need to both re-trigger the native VLAN configuration. So we introduce the second variable as ocelot_port->vlan_aware. Actually both the DSA Felix driver and the Ocelot driver already had each its own variable: - Ocelot: ocelot_port_private->vlan_aware - Felix: dsa_port->vlan_filtering but the common Ocelot library needs to work with a single, common, variable, so there is some refactoring done to move the vlan_aware property from the private structure into the common ocelot_port structure. Fixes: `97bb69e1e3` ("net: mscc: ocelot: break apart ocelot_vlan_port_apply") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-04-15 12:27:35 -07:00
Vladimir Oltean	1cf3299b03	net: dsa: felix: Allow unknown unicast traffic towards the CPU port module Compared to other DSA switches, in the Ocelot cores, the RX filtering is a much more important concern. Firstly, the primary use case for Ocelot is non-DSA, so there isn't any secondary Ethernet MAC [the DSA master's one] to implicitly drop frames having a DMAC we are not interested in. So the switch driver itself needs to install FDB entries towards the CPU port module (PGID_CPU) for the MAC address of each switch port, in each VLAN installed on the port. Every address that is not whitelisted is implicitly dropped. This is in order to achieve a behavior similar to N standalone net devices. Secondly, even in the secondary use case of DSA, such as illustrated by Felix with the NPI port mode, that secondary Ethernet MAC is present, but its RX filter is bypassed. This is because the DSA tags themselves are placed before Ethernet, so the DMAC that the switch ports see is not seen by the DSA master too (since it's shifter to the right). So RX filtering is pretty important. A good RX filter won't bother the CPU in case the switch port receives a frame that it's not interested in, and there exists no other line of defense. Ocelot is pretty strict when it comes to RX filtering: non-IP multicast and broadcast traffic is allowed to go to the CPU port module, but unknown unicast isn't. This means that traffic reception for any other MAC addresses than the ones configured on each switch port net device won't work. This includes use cases such as macvlan or bridging with a non-Ocelot (so-called "foreign") interface. But this seems to be fine for the scenarios that the Linux system embedded inside an Ocelot switch is intended for - it is simply not interested in unknown unicast traffic, as explained in Allan Nielsen's presentation [0]. On the other hand, the Felix DSA switch is integrated in more general-purpose Linux systems, so it can't afford to drop that sort of traffic in hardware, even if it will end up doing so later, in software. Actually, unknown unicast means more for Felix than it does for Ocelot. Felix doesn't attempt to perform the whitelisting of switch port MAC addresses towards PGID_CPU at all, mainly because it is too complicated to be feasible: while the MAC addresses are unique in Ocelot, by default in DSA all ports are equal and inherited from the DSA master. This adds into account the question of reference counting MAC addresses (delayed ocelot_mact_forget), not to mention reference counting for the VLAN IDs that those MAC addresses are installed in. This reference counting should be done in the DSA core, and the fact that it wasn't needed so far is due to the fact that the other DSA switches don't have the DSA tag placed before Ethernet, so the DSA master is able to whitelist the MAC addresses in hardware. So this means that even regular traffic termination on a Felix switch port happens through flooding (because neither Felix nor Ocelot learn source MAC addresses from CPU-injected frames). So far we've explained that whitelisting towards PGID_CPU: - helps to reduce the likelihood of spamming the CPU with frames it won't process very far anyway - is implemented in the ocelot driver - is sufficient for the ocelot use cases - is not feasible in DSA - breaks use cases in DSA, in the current status (whitelisting enabled but no MAC address whitelisted) So the proposed patch allows unknown unicast frames to be sent to the CPU port module. This is done for the Felix DSA driver only, as Ocelot seems to be happy without it. [0]: https://www.youtube.com/watch?v=B1HhxEcU7Jg Suggested-by: Allan W. Nielsen <allan.nielsen@microchip.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Allan W. Nielsen <allan.nielsen@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-03-04 14:19:01 -08:00
Vladimir Oltean	964ee5c82b	net: mscc: ocelot: export ANA, DEV and QSYS registers to include/soc/mscc Since the Felix DSA driver is implementing its own PHYLINK instance due to SoC differences, it needs access to the few registers that are common, mainly for flow control. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-05 23:22:33 -08:00
Vladimir Oltean	ee50d07c9f	net: mscc: ocelot: make phy_mode a member of the common struct ocelot_port The Ocelot switchdev driver and the Felix DSA one need it for different reasons. Felix (or at least the VSC9959 instantiation in NXP LS1028A) is integrated with the traditional NXP Layerscape PCS design which does not support runtime configuration of SerDes protocol. So it needs to pre-validate the phy-mode from the device tree and prevent PHYLINK from attempting to change it. For this, it needs to cache it in a private variable. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-05 23:22:33 -08:00
Yangbo Lu	e23a7b3e8d	net: mscc: ocelot: convert to use ocelot_get_txtstamp() The method getting TX timestamp by reading timestamp FIFO and matching skbs list is common for DSA Felix driver too. So move code out of ocelot_board.c, convert to use ocelot_get_txtstamp() function and export it. Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-11-21 14:39:02 -08:00
Vladimir Oltean	a030dfe194	net: mscc: ocelot: publish ocelot_sys.h to include/soc/mscc The Felix DSA driver needs to write to SYS_RAM_INIT_RAM_INIT for its own chip initialization process. Also update the MAINTAINERS file such that the headers exported by the ocelot driver are under the same maintainers' umbrella as the driver itself. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-11-15 12:32:16 -08:00
Vladimir Oltean	5e25636502	net: mscc: ocelot: publish structure definitions to include/soc/mscc/ocelot.h We will be registering another switch driver based on ocelot, which lives under drivers/net/dsa. Make sure the Felix DSA front-end has the necessary abstractions to implement a new Ocelot driver instantiation. This includes the function prototypes for implementing DSA callbacks. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-11-15 12:32:16 -08:00
Vladimir Oltean	3a77b5933f	net: mscc: ocelot: separate the implementation of switch reset The Felix switch has a different reset procedure, so a function pointer needs to be created and added to the ocelot_ops structure. The reset procedure has been moved into ocelot_init. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-11-15 12:32:16 -08:00
Vladimir Oltean	ba551bc3bc	net: mscc: ocelot: adjust MTU on the CPU port in NPI mode When using the NPI port, the DSA tag is passed through Ethernet, so the switch's MAC needs to accept it as it comes from the DSA master. Increase the MTU on the external CPU port to account for the length of the injection header. Without this patch, MTU-sized frames are dropped by the switch's CPU port on xmit, which is especially obvious in TCP sessions. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-11-15 12:32:16 -08:00
Vladimir Oltean	f24711fddc	net: mscc: ocelot: export a constant for the tag length in bytes This constant will be used in a future patch to increase the MTU on NPI ports, and will also be used in the tagger driver for Felix. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-11-15 12:32:16 -08:00
Claudiu Manoil	dc3de2a294	net: mscc: ocelot: filter out ocelot SoC specific PCS config from common path The adjust_link routine should be generic enough to be (re)used by any SoC that integrates a switch core compatible with the Ocelot core switch driver. Currently all configurations are generic except for the PCS settings that are SoC specific. Move these out to the Ocelot SoC/board instance. Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-11-15 12:32:16 -08:00
Claudiu Manoil	259630e08c	net: mscc: ocelot: move resource ioremap and regmap init to common code Let's make this ioremap and regmap init code common. It should not be platform dependent as it should be usable by PCI devices too. Use better names where necessary to avoid clashes. Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-11-15 12:32:16 -08:00
Vladimir Oltean	2146819901	net: mscc: ocelot: split assignment of the cpu port into a separate function Now that the places that configure routing destinations for the CPU port have been marked as such, allow callers to specify their own CPU port that is different than ocelot->num_phys_ports. A user will be the Felix DSA driver, where the CPU port is one of the physical ports (NPI mode). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-11-11 12:59:10 -08:00
Vladimir Oltean	004d44f6e5	net: mscc: ocelot: separate net_device related items out of ocelot_port The ocelot and ocelot_port structures will be used by a new DSA driver, so the ocelot_board.c file will have to allocate and work with a private structure (ocelot_port_private), which embeds the generic struct ocelot_port. This is because in DSA, at least one interface does not have a net_device, and the DSA driver API does not interact with that anyway. The ocelot_port structure is equivalent to dsa_port, and ocelot to dsa_switch. The members of ocelot_port which have an equivalent in dsa_port (such as dp->vlan_filtering) have been moved to ocelot_port_private. We want to enforce the coding convention that "ocelot_port" refers to the structure, and "port" refers to the integer index. One can retrieve the structure at any time from ocelot->ports[port]. The patch is large but only contains variable renaming and mechanical movement of fields from one structure to another. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-11-11 12:59:10 -08:00
Vladimir Oltean	17fdd7638c	net: mscc: ocelot: fix __ocelot_rmw_ix prototype The "read-modify-write register index" function is declared with a confusing prototype: the "mask" and "reg" arguments are swapped. Fortunately, this does not affect callers so far. Both arguments are u32, and the wrapper macros (ocelot_rmw_ix etc) have the arguments in the correct order (the one from ocelot_io.c). Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-11-06 15:34:12 -08:00
Antoine Tenart	4e3b0468e6	net: mscc: PTP Hardware Clock (PHC) support This patch adds support for PTP Hardware Clock (PHC) to the Ocelot switch for both PTP 1-step and 2-step modes. Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-15 16:31:12 -07:00
Antoine Tenart	1f0239de58	net: mscc: remove the frame_info cpuq member In struct frame_info, the cpuq member is never used. This cosmetic patch removes it from the structure, and from the parsing of the frame header as it's only set but never used. Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-15 16:31:12 -07:00
Antoine Tenart	45bce1719c	net: mscc: describe the PTP register range This patch adds support for using the PTP register range, and adds a description of its registers. This bank is used when configuring PTP. Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-08-15 16:31:11 -07:00
Horatiu Vultur	b596229448	net: mscc: ocelot: Add support for tcam Add ACL support using the TCAM. Using ACL it is possible to create rules in hardware to filter/redirect frames. Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-06-02 13:49:49 -07:00
David S. Miller	b4b12b0d2f	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net The phylink conflict was between a bug fix by Russell King to make sure we have a consistent PHY interface mode, and a change in net-next to pull some code in phylink_resolve() into the helper functions phylink_mac_link_{up,down}() On the dp83867 side it's mostly overlapping changes, with the 'net' side removing a condition that was supposed to trigger for RGMII but because of how it was coded never actually could trigger. Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-31 10:49:43 -07:00
Joergen Andreasen	2c1d029a01	net: mscc: ocelot: Implement port policers via tc command Hardware offload of matchall classifier and police action are now supported via the tc command. Supported police parameters are: rate and burst. Example: Add: tc qdisc add dev eth3 handle ffff: ingress tc filter add dev eth3 parent ffff: prio 1 handle 2 \ matchall skip_sw \ action police rate 100Mbit burst 10000 Show: tc -s -d qdisc show dev eth3 tc -s -d filter show dev eth3 ingress Delete: tc filter del dev eth3 parent ffff: prio 1 tc qdisc del dev eth3 handle ffff: ingress Signed-off-by: Joergen Andreasen <joergen.andreasen@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-29 21:37:49 -07:00
Claudiu Manoil	40a1578d63	ocelot: Dont allocate another multicast list, use __dev_mc_sync Doing kmalloc in atomic context is always an issue, more so for a list that can grow significantly. Turns out that the driver only uses the duplicated list of multicast mac addresses to keep track of what addresses to delete from h/w before committing the new list from kernel to h/w back again via set_rx_mode, every time this list gets updated by the kernel. Given that the h/w knows how to add and delete mac addresses based on the mac address value alone, __dev_mc_sync should be the much better choice of kernel API for these operations avoiding the considerable overhead of maintaining a duplicated list in the driver. Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com> Tested-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-05-22 12:08:43 -07:00
Florian Fainelli	56da64bc00	net: mscc: ocelot: Handle SWITCHDEV_PORT_ATTR_SET Following patches will change the way we communicate setting a port's attribute and use notifiers to perform those tasks. Ocelot does not currently have an atomic notifier registered for switchdev events, so we need to register one in order to deal with atomic context SWITCHDEV_PORT_ATTR_SET events. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-02-27 12:39:56 -08:00
Petr Machata	0e332c854f	ocelot: Handle SWITCHDEV_PORT_OBJ_ADD/_DEL Following patches will change the way of distributing port object changes from a switchdev operation to a switchdev notifier. The switchdev code currently recursively descends through layers of lower devices, eventually calling the op on a front-panel port device. The notifier will instead be sent referencing the bridge port device, which may be a stacking device that's one of front-panel ports uppers, or a completely unrelated device. Dispatch the new events to ocelot_port_obj_add() resp. _del() to maintain the same behavior that the switchdev operation based code currently has. Pass through switchdev_handle_port_obj_add() / _del() to handle the recursive descend, because Ocelot supports LAG uppers. Register to the new switchdev blocking notifier chain to get the new events when they start getting distributed. Signed-off-by: Petr Machata <petrm@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-23 18:02:24 -08:00
Quentin Schulz	71e32a20cf	net: mscc: ocelot: make use of SerDes PHYs for handling their configuration Previously, the SerDes muxing was hardcoded to a given mode in the MAC controller driver. Now, the SerDes muxing is configured within the Device Tree and is enforced in the MAC controller driver so we can have a lot of different SerDes configurations. Make use of the SerDes PHYs in the MAC controller to set up the SerDes according to the SerDes<->switch port mapping and the communication mode with the Ethernet PHY. Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-05 14:36:44 -07:00
Quentin Schulz	66c2132333	net: mscc: ocelot: simplify register access for PLL5 configuration Since HSIO address space can be accessed by different drivers, let's simplify the register address definitions so that it can be easily used by all drivers and put the register address definition in the include/soc/mscc/ocelot_hsio.h header file. Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-05 14:36:44 -07:00
Quentin Schulz	8afc978925	net: mscc: ocelot: move the HSIO header to include/soc Since HSIO address space can be used by different drivers (PLL, SerDes muxing, temperature sensor), let's move it somewhere it can be included by all drivers. Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-10-05 14:36:44 -07:00
Alexandre Belloni	dc96ee3730	net: mscc: ocelot: add bonding support Add link aggregation hardware offload support for Ocelot. ocelot_get_link_ksettings() is not great but it does work until the driver is reworked to switch to phylink. Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-06-28 14:18:49 +09:00
Alexandre Belloni	a556c76adc	net: mscc: Add initial Ocelot switch support Add a driver for Microsemi Ocelot Ethernet switch support. This makes two modules: mscc_ocelot_common handles all the common features that doesn't depend on how the switch is integrated in the SoC. Currently, it handles offloading bridging to the hardware. ocelot_io.c handles register accesses. This is unfortunately needed because the register layout is packed and then depends on the number of ports available on the switch. The register definition files are automatically generated. ocelot_board handles the switch integration on the SoC and on the board. Frame injection and extraction to/from the CPU port is currently done using register accesses which is quite slow. DMA is possible but the port is not able to absorb the whole switch bandwidth. Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-15 16:41:15 -04:00

48 Commits