Add index to flow_action_entry structure and delete index from police and
gate child structure.
We make this change to offload tc action for driver to identify a tc
action.
Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove unneeded variable used to store return value.
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Changcheng Deng <deng.changcheng@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Debug print uses invalid check to detect if speed is unforced:
(speed != SPEED_UNFORCED) should be used instead of (!speed).
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Signed-off-by: Andrey Eremeev <Axtone4all@yandex.ru>
Fixes: 96a2b40c7b ("net: dsa: mv88e6xxx: add port's MAC speed setter")
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Recent net-next fails to initialize ports with:
realtek-smi switch: phy mode gmii is unsupported on port 0
realtek-smi switch lan5 (uninitialized): validation of gmii with
support 0000000,00000000,000062ef and advertisement
0000000,00000000,000062ef failed: -22
realtek-smi switch lan5 (uninitialized): failed to connect to PHY:
-EINVAL
realtek-smi switch lan5 (uninitialized): error -22 setting up PHY
for tree 1, switch 0, port 0
Current net branch(3dd7d40b43) is not
affected.
I also noticed the same issue before with older versions but using
a MDIO interface driver, not realtek-smi.
Tested-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Signed-off-by: Luiz Angelo Daros de Luca <luizluca@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The switch supports PTP for UDP transport too. Therefore, add the missing static
FDB entries to ensure correct forwarding of these packets.
Fixes: ddd56dfe52 ("net: dsa: hellcreek: Add PTP clock support")
Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Allow PTP peer delay measurements on blocked ports by STP. In case of topology
changes the PTP stack can directly start with the correct delays.
Fixes: ddd56dfe52 ("net: dsa: hellcreek: Add PTP clock support")
Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Treat STP as management traffic. STP traffic is designated for the CPU port
only. In addition, STP traffic has to pass blocked ports.
Fixes: e4b27ebc78 ("net: dsa: Add DSA driver for Hirschmann Hellcreek switches")
Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The insertion of static FDB entries ignores the pass_blocked bit. That bit is
evaluated with regards to STP. Add the missing functionality.
Fixes: e4b27ebc78 ("net: dsa: Add DSA driver for Hirschmann Hellcreek switches")
Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The driver was incorrectly converted assuming that "sja1105" is the only
tagger supported by this driver. This results in SJA1110 switches
failing to probe:
sja1105 spi1.0: Unable to connect to tag protocol "sja1110": -EPROTONOSUPPORT
sja1105: probe of spi1.2 failed with error -93
Add DSA_TAG_PROTO_SJA1110 to the list of supported taggers by the
sja1105 driver. The sja1105_tagger_data structure format is common for
the two tagging protocols.
Fixes: c79e84866d ("net: dsa: tag_sja1105: convert to tagger-owned data")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since commit 94dd016ae5 ("bond: pass get_ts_info and SIOC[SG]HWTSTAMP
ioctl to active device") the user could get bond active interface's
PHC index directly. But when there is a failover, the bond active
interface will change, thus the PHC index is also changed. This may
break the user's program if they did not update the PHC timely.
This patch adds a new hwtstamp_config flag HWTSTAMP_FLAG_BONDED_PHC_INDEX.
When the user wants to get the bond active interface's PHC, they need to
add this flag and be aware the PHC index may be changed.
With the new flag. All flag checks in current drivers are removed. Only
the checking in net_hwtstamp_validate() is kept.
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 64d47d50be ("net: dsa: mv88e6xxx: configure interface settings
in mac_config") removed forcing of speed and duplex from
mv88e6xxx_mac_config(), where the link is forced down, and left it only
in mv88e6xxx_mac_link_up(), by which time link is unforced.
It seems that (at least on 88E6190) when changing cmode to 2500base-x,
if the link is not forced down, but the speed or duplex are still
forced, the forcing of new settings for speed & duplex doesn't take in
mv88e6xxx_mac_link_up().
Fix this by unforcing speed & duplex in mv88e6xxx_mac_link_down().
Fixes: 64d47d50be ("net: dsa: mv88e6xxx: configure interface settings in mac_config")
Signed-off-by: Marek Behún <kabel@kernel.org>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
The sja1105 driver messes with the tagging protocol's state when PTP RX
timestamping is enabled/disabled. This is fundamentally necessary
because the tagger needs to know what to do when it receives a PTP
packet. If RX timestamping is enabled, then a metadata follow-up frame
is expected, and this holds the (partial) timestamp. So the tagger plays
hide-and-seek with the network stack until it also gets the metadata
frame, and then presents a single packet, the timestamped PTP packet.
But when RX timestamping isn't enabled, there is no metadata frame
expected, so the hide-and-seek game must be turned off and the packet
must be delivered right away to the network stack.
Considering this, we create a pseudo isolation by devising two tagger
methods callable by the switch: one to get the RX timestamping state,
and one to set it. Since we can't export symbols between the tagger and
the switch driver, these methods are exposed through function pointers.
After this change, the public portion of the sja1105_tagger_data
contains only function pointers.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit 6d709cadfd.
The above change was done to avoid calling symbols exported by the
switch driver from the tagging protocol driver.
With the tagger-owned storage model, we have a new option on our hands,
and that is for the switch driver to provide a data consumer handler in
the form of a function pointer inside the ->connect_tag_protocol()
method. Having a function pointer avoids the problems of the exported
symbols approach.
By creating a handler for metadata frames holding TX timestamps on
SJA1110, we are able to eliminate an skb queue from the tagger data, and
replace it with a simple, and stateless, function pointer. This skb
queue is now handled exclusively by sja1105_ptp.c, which makes the code
easier to follow, as it used to be before the reverted patch.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, struct sja1105_tagger_data is a part of struct
sja1105_private, and is used by the sja1105 driver to populate dp->priv.
With the movement towards tagger-owned storage, the sja1105 driver
should not be the owner of this memory.
This change implements the connection between the sja1105 switch driver
and its tagging protocol, which means that sja1105_tagger_data no longer
stays in dp->priv but in ds->tagger_data, and that the sja1105 driver
now only populates the sja1105_port_deferred_xmit callback pointer.
The kthread worker is now the responsibility of the tagger.
The sja1105 driver also alters the tagger's state some more, especially
with regard to the PTP RX timestamping state. This will be fixed up a
bit in further changes.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The TX timestamp ID is incremented by the SJA1110 PTP timestamping
callback (->port_tx_timestamp) for every packet, when cloning it.
It isn't used by the tagger at all, even though it sits inside the
struct sja1105_tagger_data.
Also, serialization to this structure is currently done through
tagger_data->meta_lock, which is a cheap hack because the meta_lock
isn't used for anything else on SJA1110 (sja1105_rcv_meta_state_machine
isn't called).
This change moves ts_id from sja1105_tagger_data to sja1105_private and
introduces a dedicated spinlock for it, also in sja1105_private.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The design of the sja1105 tagger dp->priv is that each port has a
separate struct sja1105_port, and the sp->data pointer points to a
common struct sja1105_tagger_data.
We have removed all per-port members accessible by the tagger, and now
only struct sja1105_tagger_data remains. Make dp->priv point directly to
this.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This tagger property is in fact not used at all by the tagger, only by
the switch driver. Therefore it makes sense to be moved to
sja1105_private.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the ocelot-8021q driver was converted to deferred xmit as part of
commit 8d5f7954b7 ("net: dsa: felix: break at first CPU port during
init and teardown"), the deferred implementation was deliberately made
subtly different from what sja1105 has.
The implementation differences lied on the following observations:
- There might be a race between these two lines in tag_sja1105.c:
skb_queue_tail(&sp->xmit_queue, skb_get(skb));
kthread_queue_work(sp->xmit_worker, &sp->xmit_work);
and the skb dequeue logic in sja1105_port_deferred_xmit(). For
example, the xmit_work might be already queued, however the work item
has just finished walking through the skb queue. Because we don't
check the return code from kthread_queue_work, we don't do anything if
the work item is already queued.
However, nobody will take that skb and send it, at least until the
next timestampable skb is sent. This creates additional (and
avoidable) TX timestamping latency.
To close that race, what the ocelot-8021q driver does is it doesn't
keep a single work item per port, and a skb timestamping queue, but
rather dynamically allocates a work item per packet.
- It is also unnecessary to have more than one kthread that does the
work. So delete the per-port kthread allocations and replace them with
a single kthread which is global to the switch.
This change brings the two implementations in line by applying those
observations to the sja1105 driver as well.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This code is not necessary and complicates the conversion of this driver
to tagger-owned memory. If there is a PTP packet that is sent
concurrently with the port getting disabled, the deferred xmit mechanism
is robust enough to time out when it sees that it hasn't been delivered,
and recovers.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The felix driver makes very light use of dp->priv, and the tagger is
effectively stateless. dp->priv is practically only needed to set up a
callback to perform deferred xmit of PTP and STP packets using the
ocelot-8021q tagging protocol (the main ocelot tagging protocol makes no
use of dp->priv, although this driver sets up dp->priv irrespective of
actual tagging protocol in use).
struct felix_port (what used to be pointed to by dp->priv) is removed
and replaced with a two-sided structure. The public side of this
structure, visible to the switch driver, is ocelot_8021q_tagger_data.
The private side is ocelot_8021q_tagger_private, and the latter
structure physically encapsulates the former. The public half of the
tagger data structure can be accessed through a helper of the same name
(ocelot_8021q_tagger_data) which also sanity-checks the protocol
currently in use by the switch. The public/private split was requested
by Andrew Lunn.
Suggested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In a typical mv88e6xxx switch tree like this:
CPU
| .----.
.--0--. | .--0--.
| sw0 | | | sw1 |
'-1-2-' | '-1-2-'
'---'
If sw1p{1,2} are added to a bridge that sw0p1 is not a part of, sw0
still needs to add a crosschip PVT entry for the virtual DSA device
assigned to represent the bridge.
Fixes: ce5df6894a ("net: dsa: mv88e6xxx: map virtual bridges with forwarding offload in the PVT")
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Martyn Welch reports that his CPU port is unable to link where it has
been necessary to use one of the switch ports with an internal PHY for
the CPU port. The reason behind this is the port control register is
left forcing the link down, preventing traffic flow.
This occurs because during initialisation, phylink expects the link to
be down, and DSA forces the link down by synthesising a call to the
DSA drivers phylink_mac_link_down() method, but we don't touch the
forced-link state when we later reconfigure the port.
Resolve this by also unforcing the link state when we are operating in
PHY mode and the PPU is set to poll the PHY to retrieve link status
information.
Reported-by: Martyn Welch <martyn.welch@collabora.com>
Tested-by: Martyn Welch <martyn.welch@collabora.com>
Fixes: 3be98b2d5f ("net: dsa: Down cpu/dsa ports phylink will control")
Cc: <stable@vger.kernel.org> # 5.7: 2b29cb9e3f: net: dsa: mv88e6xxx: fix "don't use PHY_DETECT on internal PHY's"
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/E1mvFhP-00F8Zb-Ul@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Avoid a memory leak if there is not a CPU port defined.
Fixes: 8d5f7954b7 ("net: dsa: felix: break at first CPU port during init and teardown")
Addresses-Coverity-ID: 1492897 ("Resource leak")
Addresses-Coverity-ID: 1492899 ("Resource leak")
Signed-off-by: José Expósito <jose.exposito89@gmail.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20211209110538.11585-1-jose.exposito89@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Added default case to handle undefined cmode scenario in
mv88e6393x_serdes_power() and mv88e6393x_serdes_power() methods.
Addresses-Coverity: 1494644 ("Uninitialized scalar variable")
Fixes: 21635d9203 (net: dsa: mv88e6xxx: Fix application of erratum 4.8 for 88E6393X)
Reviewed-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Ameer Hamza <amhamza.mgc@gmail.com>
Link: https://lore.kernel.org/r/20211209041552.9810-1-amhamza.mgc@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This commit fixes a misunderstanding in commit 4a3e0aeddf ("net: dsa:
mv88e6xxx: don't use PHY_DETECT on internal PHY's").
For Marvell DSA switches with the PHY_DETECT bit (for non-6250 family
devices), controls whether the PPU polls the PHY to retrieve the link,
speed, duplex and pause status to update the port configuration. This
applies for both internal and external PHYs.
For some switches such as 88E6352 and 88E6390X, PHY_DETECT has an
additional function of enabling auto-media mode between the internal
PHY and SERDES blocks depending on which first gains link.
The original intention of commit 5d5b231da7 (net: dsa: mv88e6xxx: use
PHY_DETECT in mac_link_up/mac_link_down) was to allow this bit to be
used to detect when this propagation is enabled, and allow software to
update the port configuration. This has found to be necessary for some
switches which do not automatically propagate status from the SERDES to
the port, which includes the 88E6390. However, commit 4a3e0aeddf
("net: dsa: mv88e6xxx: don't use PHY_DETECT on internal PHY's") breaks
this assumption.
Maarten Zanders has confirmed that the issue he was addressing was for
an 88E6250 switch, which does not have a PHY_DETECT bit in bit 12, but
instead a link status bit. Therefore, mv88e6xxx_port_ppu_updates() does
not report correctly.
This patch resolves the above issues by reverting Maarten's change and
instead making mv88e6xxx_port_ppu_updates() indicate whether the port
is internal for the 88E6250 family of switches.
Yes, you're right, I'm targeting the 6250 family. And yes, your
suggestion would solve my case and is a better implementation for
the other devices (as far as I can see).
Fixes: 4a3e0aeddf ("net: dsa: mv88e6xxx: don't use PHY_DETECT on internal PHY's")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Tested-by: Maarten Zanders <maarten.zanders@mind.be>
Link: https://lore.kernel.org/r/E1muXm7-00EwJB-7n@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
We don't really need new switch API for these, and with new switches
which intend to add support for this feature, it will become cumbersome
to maintain.
The change consists in restructuring the two drivers that implement this
offload (sja1105 and mv88e6xxx) such that the offload is enabled and
disabled from the ->port_bridge_{join,leave} methods instead of the old
->port_bridge_tx_fwd_{,un}offload.
The only non-trivial change is that mv88e6xxx_map_virtual_bridge_to_pvt()
has been moved to avoid a forward declaration, and the
mv88e6xxx_reg_lock() calls from inside it have been removed, since
locking is now done from mv88e6xxx_port_bridge_{join,leave}.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This is a preparation patch for the removal of the DSA switch methods
->port_bridge_tx_fwd_offload() and ->port_bridge_tx_fwd_unoffload().
The plan is for the switch to report whether it offloads TX forwarding
directly as a response to the ->port_bridge_join() method.
This change deals with the noisy portion of converting all existing
function prototypes to take this new boolean pointer argument.
The bool is placed in the cross-chip notifier structure for bridge join,
and a reference to it is provided to drivers. In the next change, DSA
will then actually look at this value instead of calling
->port_bridge_tx_fwd_offload().
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The main desire behind this is to provide coherent bridge information to
the fast path without locking.
For example, right now we set dp->bridge_dev and dp->bridge_num from
separate code paths, it is theoretically possible for a packet
transmission to read these two port properties consecutively and find a
bridge number which does not correspond with the bridge device.
Another desire is to start passing more complex bridge information to
dsa_switch_ops functions. For example, with FDB isolation, it is
expected that drivers will need to be passed the bridge which requested
an FDB/MDB entry to be offloaded, and along with that bridge_dev, the
associated bridge_num should be passed too, in case the driver might
want to implement an isolation scheme based on that number.
We already pass the {bridge_dev, bridge_num} pair to the TX forwarding
offload switch API, however we'd like to remove that and squash it into
the basic bridge join/leave API. So that means we need to pass this
pair to the bridge join/leave API.
During dsa_port_bridge_leave, first we unset dp->bridge_dev, then we
call the driver's .port_bridge_leave with what used to be our
dp->bridge_dev, but provided as an argument.
When bridge_dev and bridge_num get folded into a single structure, we
need to preserve this behavior in dsa_port_bridge_leave: we need a copy
of what used to be in dp->bridge.
Switch drivers check bridge membership by comparing dp->bridge_dev with
the provided bridge_dev, but now, if we provide the struct dsa_bridge as
a pointer, they cannot keep comparing dp->bridge to the provided
pointer, since this only points to an on-stack copy. To make this
obvious and prevent driver writers from forgetting and doing stupid
things, in this new API, the struct dsa_bridge is provided as a full
structure (not very large, contains an int and a pointer) instead of a
pointer. An explicit comparison function needs to be used to determine
bridge membership: dsa_port_offloads_bridge().
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The location of the bridge device pointer and number is going to change.
It is not going to be kept individually per port, but in a common
structure allocated dynamically and which will have lockdep validation.
Use the helpers to access these elements so that we have a migration
path to the new organization.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The goal of this change is to reduce mv88e6xxx_port_vlan() to a form
where dsa_port_bridge_same() can be used, since the dp->bridge_dev
pointer will be hidden in a future change.
To do that, we observe that the "br" pointer is deduced from a
dp->bridge_dev in both cases (of a physical switch port as well as a
virtual bridge). So instead of keeping the "br" pointer, we can just
keep the "dp" pointer from which "br" gets derived.
In the last iteration over switch ports, we must use another iterator
variable, "other_dp"since now we use the "dp" variable to keep an
indirect reference to the bridge. While at it, the old code used to
filter only the ports which were part of the same switch as "ds".
There exists a dedicated DSA port iterator for that:
dsa_switch_for_each_port (which skips the ports in the tree that belong
to non-local switches), so we can just use that.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Avoid a plethora of dsa_to_port() calls (some hidden behind
dsa_is_*_port and some in plain sight) by keeping two struct dsa_port
references: one to the port passed as argument, and another to the other
ports of the switch that we're iterating over.
This isn't called from the DSA initialization path, so there is no risk
that we have user ports without a dp->slave populated. So the combined
checks that a port isn't a DSA port, a CPU port, or doesn't have a slave
net device (therefore is unused), are strictly equivalent to the simple
check that the port is a user port. This is already handled by the DSA
iterator.
i gets replaced by other_dp->index, dsa_is_*_port calls get replaced by
dsa_port_is_*, and dsa_to_port gets replaced by the respective pointer
directly.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Avoid repeated calls to dsa_to_port() (some hidden behind dsa_is_user_port
and some in plain sight) by keeping two struct dsa_port references: one
to the port passed as argument, and another to the other ports of the
switch that we're iterating over.
dsa_to_port(ds, i) gets replaced by other_dp, i gets replaced by
other_port which is derived from other_dp->index, dsa_is_user_port is
handled by the DSA iterator.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The service where DSA assigns a unique bridge number for each forwarding
domain is useful even for drivers which do not implement the TX
forwarding offload feature.
For example, drivers might use the dp->bridge_num for FDB isolation.
So rename ds->num_fwd_offloading_bridges to ds->max_num_bridges, and
calculate a unique bridge_num for all drivers that set this value.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
I have seen too many bugs already due to the fact that we must encode an
invalid dp->bridge_num as a negative value, because the natural tendency
is to check that invalid value using (!dp->bridge_num). Latest example
can be seen in commit 1bec0f0506 ("net: dsa: fix bridge_num not
getting cleared after ports leaving the bridge").
Convert the existing users to assume that dp->bridge_num == 0 is the
encoding for invalid, and valid bridge numbers start from 1.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Fix following coccicheck warning:
/drivers/net/dsa/ocelot/felix_vsc9959.c:1627:13-20:
WARNING opportunity for kmemdup
/drivers/net/dsa/ocelot/felix_vsc9959.c:1506:16-23:
WARNING opportunity for kmemdup
Signed-off-by: Yihao Han <hanyihao@vivo.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20211207064419.38632-1-hanyihao@vivo.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add an interface so that non-mmio regmaps can be used
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Existing felix devices all have an initialized pcs array. Future devices
might not, so running a NULL check on the array before dereferencing it
will allow those future drivers to not crash at this point
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The pci_bar variables for the switch and imdio don't make sense for the
generic felix driver. Moving them to felix_vsc9959 to limit scope and
simplify the felix_info struct.
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
GPIO library does copy the of_node from the parent device of
the GPIO chip, there is no need to repeat this in the individual
drivers. Remove assignment here.
For the details one may look into the of_gpio_dev_init() implementation.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently autoloading for SPI devices does not use the DT ID table, it
uses SPI modalises. Supporting OF modalises is going to be difficult if
not impractical, an attempt was made but has been reverted, so ensure
that module autoloading works for this driver by adding an id_table
listing the SPI IDs for everything.
Fixes: 96c8395e21 ("spi: Revert modalias changes")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Populate the supported interfaces and MAC capabilities for the Lantiq
DSA switches and remove the old validate implementation to allow DSA to
use phylink_generic_validate() for this switch driver.
The exclusion of Gigabit linkmodes for MII, Reverse MII and Reduced MII
links is handled within phylink_generic_validate() in phylink, so there
is no need to make them conditional on the interface mode in the driver.
Reviewed-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Populate the supported interfaces and MAC capabilities for the
hellcreek DSA switch and remove the old validate implementation to
allow DSA to use phylink_generic_validate() for this switch driver.
The switch actually only supports MII and RGMII, but as phylib defaults
to GMII, we need to include this interface mode to keep existing DT
working.
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
Tested-by: Kurt Kanzenbach <kurt@linutronix.de>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Function mv88e6xxx_serdes_pcs_get_state() currently does not report link
up if AN is enabled, Link bit is set, but Speed and Duplex Resolved bit
is not set, which testing shows is the case for when auto-negotiation
was bypassed (we have AN enabled but link partner does not).
An example of such link partner is Marvell 88X3310 PHY, when put into
the mode where host interface changes between 10gbase-r, 5gbase-r,
2500base-x and sgmii according to copper speed. The 88X3310 does not
enable AN in 2500base-x, and so SerDes on mv88e6xxx currently does not
link with it.
Fix this.
Fixes: a5a6858b79 ("net: dsa: mv88e6xxx: extend phylink to Serdes PHYs")
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Inband AN is broken on Amethyst in 2500base-x mode when set by standard
mechanism (via cmode).
(There probably is some weird setting done by default in the switch for
this mode that make it cycle in some state or something, because when
the peer is the mvneta controller, it receives link change interrupts
every ~0.3ms, but the link is always down.)
Get around this by configuring the PCS mode to 1000base-x (where inband
AN works), and then changing the SerDes frequency while SerDes
transmitter and receiver are disabled, before enabling SerDes PHY. After
disabling SerDes PHY, change the PCS mode back to 2500base-x, to avoid
confusing the device (if we leave it at 1000base-x PCS mode but with
different frequency, and then change cmode to sgmii, the device won't
change the frequency because it thinks it already has the correct one).
The register which changes the frequency is undocumented. I discovered
it by going through all registers in the ranges 4.f000-4.f100 and
1e.8000-1e.8200 for all SerDes cmodes (sgmii, 1000base-x, 2500base-x,
5gbase-r, 10gbase-r, usxgmii) and filtering out registers that didn't
make sense (the value was the same for modes which have different
frequency). The result of this was:
reg sgmii 1000base-x 2500base-x 5gbase-r 10gbase-r usxgmii
04.f002 005b 0058 0059 005c 005d 005f
04.f076 3000 0000 1000 4000 5000 7000
04.f07c 0950 0950 1850 0550 0150 0150
1e.8000 0059 0059 0058 0055 0051 0051
1e.8140 0e20 0e20 0e28 0e21 0e42 0e42
Register 04.f002 is the documented Port Operational Confiuration
register, it's last 3 bits select PCS type, so changing this register
also changes the frequency to the appropriate value.
Registers 04.f076 and 04.f07c are not writable.
Undocumented register 1e.8000 was the one: changing bits 3:0 from 9 to 8
changed SerDes frequency to 3.125 GHz, while leaving the value of PCS
mode in register 04.f002.2:0 at 1000base-x. Inband autonegotiation
started working correctly.
(I didn't try anything with register 1e.8140 since 1e.8000 solved the
problem.)
Since I don't have documentation for this register 1e.8000.3:0, I am
using the constants without names, but my hypothesis is that this
register selects PHY frequency. If in the future I have access to an
oscilloscope able to handle these frequencies, I will try to test this
hypothesis.
Fixes: de776d0d31 ("net: dsa: mv88e6xxx: add support for mv88e6393x family")
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add fix for erratum 5.2 of the 88E6393X (Amethyst) family: for 10gbase-r
mode, some undocumented registers need to be written some special
values.
Fixes: de776d0d31 ("net: dsa: mv88e6xxx: add support for mv88e6393x family")
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Save power on 88E6393X by disabling SerDes receiver and transmitter
after SerDes is SerDes is disabled.
Signed-off-by: Marek Behún <kabel@kernel.org>
Cc: stable@vger.kernel.org # de776d0d31 ("net: dsa: mv88e6xxx: add support for mv88e6393x family")
Signed-off-by: David S. Miller <davem@davemloft.net>
The check for lane is unnecessary, since the function is called only
with allowed lane argument.
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
According to SERDES scripts for 88E6393X, erratum 4.8 has to be applied
every time before SerDes is powered on.
Split the code for erratum 4.8 into separate function and call it in
mv88e6393x_serdes_power().
Fixes: de776d0d31 ("net: dsa: mv88e6xxx: add support for mv88e6393x family")
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Zero-length and one-element arrays are deprecated, see
Documentation/process/deprecated.rst
Flexible-array members should be used instead.
Generated by: scripts/coccinelle/misc/flexible_array.cocci
Fixes: 23ae3a7877 ("net: dsa: felix: add stream gate settings for psfp")
CC: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: kernel test robot <lkp@intel.com>
Signed-off-by: Julia Lawall <julia.lawall@inria.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Switch to a shared MDIO access implementation by way of the mdio-mscc-miim
driver.
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Switch seville to use of_mdiobus_register(bus, NULL) instead of just
mdiobus_register. This code is about to be pulled into a separate module
that can optionally define ports by the device_node.
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A contact at Realtek has clarified what exactly the units of RGMII RX
delay are. The answer is that the unit of RX delay is "about 0.3 ns".
Take this into account when parsing rx-internal-delay-ps by
approximating the closest step value. Delays of more than 2.1 ns are
rejected.
This obviously contradicts the previous assumption in the driver that a
step value of 4 was "about 2 ns", but Realtek also points out that it is
easy to find more than one RX delay step value which makes RGMII work.
Fixes: 4af2950c50 ("net: dsa: realtek-smi: add rtl8365mb subdriver for RTL8365MB-VC")
Cc: Arınç ÜNAL <arinc.unal@arinc9.com>
Signed-off-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Acked-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes: 4af2950c50 ("net: dsa: realtek-smi: add rtl8365mb subdriver for RTL8365MB-VC")
Signed-off-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Probe deferral is not an error, so don't log this as an error:
[0.590156] realtek-smi ethernet-switch: unable to register switch ret = -517
Signed-off-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
This switch family can have up to 8 UTP ports {0..7}. However,
INDIRECT_ACCESS_ADDRESS_PHYNUM_MASK was using 2 bits instead of 3,
dropping the most significant bit during indirect register reads and
writes. Reading or writing ports 4, 5, 6, and 7 registers was actually
manipulating, respectively, ports 0, 1, 2, and 3 registers.
This is not sufficient but necessary to support any variant with more
than 4 UTP ports, like RTL8367S.
rtl8365mb_phy_{read,write} will now returns -EINVAL if phy is greater
than 7.
Fixes: 4af2950c50 ("net: dsa: realtek-smi: add rtl8365mb subdriver for RTL8365MB-VC")
Signed-off-by: Luiz Angelo Daros de Luca <luizluca@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Current driver version is able to handle only one bridge at time.
Configuring two bridges on two different ports would end up shorting this
bridges by HW. To reproduce it:
ip l a name br0 type bridge
ip l a name br1 type bridge
ip l s dev br0 up
ip l s dev br1 up
ip l s lan1 master br0
ip l s dev lan1 up
ip l s lan2 master br1
ip l s dev lan2 up
Ping on lan1 and get response on lan2, which should not happen.
This happened, because current driver version is storing one global "Port VLAN
Membership" and applying it to all ports which are members of any
bridge.
To solve this issue, we need to handle each port separately.
This patch is dropping the global port member storage and calculating
membership dynamically depending on STP state and bridge participation.
Note: STP support was broken before this patch and should be fixed
separately.
Fixes: c2e866911e ("net: dsa: microchip: break KSZ9477 DSA driver into two files")
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/20211126123926.2981028-1-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The VSC9959 switch embedded within NXP LS1028A (and that version of
Ocelot switches only) supports cut-through forwarding - meaning it can
start the process of looking up the destination ports for a packet, and
forward towards those ports, before the entire packet has been received
(as opposed to the store-and-forward mode).
The up side is having lower forwarding latency for large packets. The
down side is that frames with FCS errors are forwarded instead of being
dropped. However, erroneous frames do not result in incorrect updates of
the FDB or incorrect policer updates, since these processes are deferred
inside the switch to the end of frame. Since the switch starts the
cut-through forwarding process after all packet headers (including IP,
if any) have been processed, packets with large headers and small
payload do not see the benefit of lower forwarding latency.
There are two cases that need special attention.
The first is when a packet is multicast (or flooded) to multiple
destinations, one of which doesn't have cut-through forwarding enabled.
The switch deals with this automatically by disabling cut-through
forwarding for the frame towards all destination ports.
The second is when a packet is forwarded from a port of lower link speed
towards a port of higher link speed. This is not handled by the hardware
and needs software intervention.
Since we practically need to update the cut-through forwarding domain
from paths that aren't serialized by the rtnl_mutex (phylink
mac_link_down/mac_link_up ops), this means we need to serialize physical
link events with user space updates of bonding/bridging domains.
Enabling cut-through forwarding is done per {egress port, traffic class}.
I don't see any reason why this would be a configurable option as long
as it works without issues, and there doesn't appear to be any user
space configuration tool to toggle this on/off, so this patch enables
cut-through forwarding on all eligible ports and traffic classes.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20211125125808.2383984-2-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Fix warning reported by bot.
Make sure hash is init to 0 and fix wrong logic for hash_type in
qca8k_lag_can_offload.
Reported-by: kernel test robot <lkp@intel.com>
Fixes: def975307c ("net: dsa: qca8k: add LAG support")
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20211123154446.31019-1-ansuelsmth@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add LAG support to this switch. In Documentation this is described as
trunk mode. A max of 4 LAGs are supported and each can support up to 4
port. The current tx mode supported is Hash mode with both L2 and L2+3
mode.
When no port are present in the trunk, the trunk is disabled in the
switch.
When a port is disconnected, the traffic is redirected to the other
available port.
The hash mode is global and each LAG require to have the same hash mode
set. To change the hash mode when multiple LAG are configured, it's
required to remove each LAG and set the desired hash mode to the last.
An error is printed when it's asked to set a not supported hadh mode.
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The switch supports mirror mode. Only one port can set as mirror port and
every other port can set to both ingress and egress mode. The mirror
port is disabled and reverted to normal operation once every port is
removed from sending packet to it.
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for mdb add/del function. The ARL table is used to insert
the rule. The rule will be searched, deleted and reinserted with the
port mask updated. The function will check if the rule has to be updated
or insert directly with no deletion of the old rule.
If every port is removed from the port mask, the rule is removed.
The rule is set STATIC in the ARL table (aka it doesn't age) to not be
flushed by fast age function.
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
qca8k support setting ageing time in step of 7s. Add support for it and
set the max value accepted of 7645m.
Documentation talks about support for 10000m but that values doesn't
make sense as the value doesn't match the max value in the reg.
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The switch supports fast aging by flushing any rule in the ARL
table for a specific port.
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We are currently missing 2 additionals MIB counter present in QCA833x
switch.
QC832x switch have 39 MIB counter and QCA833X have 41 MIB counter.
Add the additional MIB counter and rework the MIB function to print the
correct supported counter from the match_data struct.
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert any qca8k set/clear/pool to regmap helper and add
missing config to regmap_config struct.
Read/write/rmw operation are reworked to use the regmap helper
internally to keep the delta of this patch low. These additional
function will then be dropped when the code split will be proposed.
Ipq40xx SoC have the internal switch based on the qca8k regmap but use
mmio for read/write/rmw operation instead of mdio.
In preparation for the support of this internal switch, convert the
driver to regmap API to later split the driver to common and specific
code. The overhead introduced by the use of regamp API is marginal as the
internal mdio will bypass it by using its direct access and regmap will be
used only by configuration functions or fdb access.
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In preparation for regmap conversion, move regmap init in the probe
function and make it mandatory as any read/write/rmw operation will be
converted to regmap API.
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mutex is already init in sw_probe. Remove the extra init in qca8k_setup.
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert and try to standardize bit fields using
GENMASK/FIELD_PREP/FIELD_GET macros. Rework some logic to support the
standard macro and tidy things up. No functional change intended.
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The very next check for port 0 and 6 already makes sure we don't go out
of bounds with the ports_config delay table.
Remove the redundant check.
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
qca8k has a global MTU, so its tracking the MTU per port to make sure
that the largest MTU gets applied.
Since it uses the frame size instead of MTU the driver MTU change function
will then add the size of Ethernet header and checksum on top of MTU.
The driver currently populates the per port MTU size as Ethernet frame
length + checksum which equals 1518.
The issue is that then MTU change function will go through all of the
ports, find the largest MTU and apply the Ethernet header + checksum on
top of it again, so for a desired MTU of 1500 you will end up with 1536.
This is obviously incorrect, so to correct it populate the per port struct
MTU with just the MTU and not include the Ethernet header + checksum size
as those will be added by the MTU change function.
Fixes: f58d2598cf ("net: dsa: qca8k: implement the port MTU callbacks")
Signed-off-by: Robert Marko <robert.marko@sartura.hr>
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With SGMII phy the internal delay is always applied to the PAD0 config.
This is caused by the falling edge configuration that hardcode the reg
to PAD0 (as the falling edge bits are present only in PAD0 reg)
Move the delay configuration before the reg overwrite to correctly apply
the delay.
Fixes: cef0811584 ("net: dsa: qca8k: set internal delay also for sgmii")
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
PSFP rules take effect on the streams from any port of VSC9959 switch.
This patch use ingress port to limit the rule only active on this port.
Each stream can only match two ingress source ports in VSC9959. Streams
from lowest port gets the configuration of SFID pointed by MAC Table
lookup and streams from highest port gets the configuration of (SFID+1)
pointed by MAC Table lookup. This patch defines the PSFP rule on highest
port as dummy rule, which means that it does not modify the MAC table.
Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch add police action to set flow meter table which is defined
in IEEE802.1Qci. Flow metering is two rates two buckets and three color
marker to policing the frames, we only enable one rate one bucket in
this patch.
Flow metering shares a same policer pool with VCAP policers, so the PSFP
policer calls ocelot_vcap_policer_add() and ocelot_vcap_policer_del() to
set flow meter police.
Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Policer was previously automatically assigned from the highest index to
the lowest index from policer pool. But police action of tc flower now
uses index to set an police entry. This patch uses the police index to
set vcap policers, so that one policer can be shared by multiple rules.
Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds stream gate settings for PSFP. Use SGI table to store
stream gate entries. Disable the gate entry when it is not used by any
stream.
Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
VSC9959 supports Per-Stream Filtering and Policing(PSFP) that complies
with the IEEE 802.1Qci standard. The stream is identified by Null stream
identification(DMAC and VLAN ID) defined in IEEE802.1CB.
For PSFP, four tables need to be set up: stream table, stream filter
table, stream gate table, and flow meter table. Identify the stream by
parsing the tc flower keys and add it to the stream table. The stream
filter table is automatically maintained, and its index is determined by
SGID(flow gate index) and FMID(flow meter index).
Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
vsc73xx_remove() returns zero unconditionally and no caller checks the
returned value. So convert the function to return no value.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Model 88E6191X only supports >1G speeds on port 10. Port 0 and 9 are
only 1G.
Fixes: de776d0d31 ("net: dsa: mv88e6xxx: add support for mv88e6393x family")
Signed-off-by: Marek Behún <kabel@kernel.org>
Cc: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20211104171747.10509-1-kabel@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Normally it is expected that the dsa_device_ops :: rcv() method finishes
parsing the DSA tag and consumes it, then never looks at it again.
But commit c0bcf53766 ("net: dsa: ocelot: add hardware timestamping
support for Felix") added support for RX timestamping in a very
unconventional way. On this switch, a partial timestamp is available in
the DSA header, but the driver got away with not parsing that timestamp
right away, but instead delayed that parsing for a little longer:
dsa_switch_rcv():
nskb = cpu_dp->rcv(skb, dev); <------------- not here
-> ocelot_rcv()
...
skb = nskb;
skb_push(skb, ETH_HLEN);
skb->pkt_type = PACKET_HOST;
skb->protocol = eth_type_trans(skb, skb->dev);
...
if (dsa_skb_defer_rx_timestamp(p, skb)) <--- but here
-> felix_rxtstamp()
return 0;
When in felix_rxtstamp(), this driver accounted for the fact that
eth_type_trans() happened in the meanwhile, so it got a hold of the
extraction header again by subtracting (ETH_HLEN + OCELOT_TAG_LEN) bytes
from the current skb->data.
This worked for quite some time but was quite fragile from the very
beginning. Not to mention that having DSA tag parsing split in two
different files, under different folders (net/dsa/tag_ocelot.c vs
drivers/net/dsa/ocelot/felix.c) made it quite non-obvious for patches to
come that they might break this.
Finally, the blamed commit does the following: at the end of
ocelot_rcv(), it checks whether the skb payload contains a VLAN header.
If it does, and this port is under a VLAN-aware bridge, that VLAN ID
might not be correct in the sense that the packet might have suffered
VLAN rewriting due to TCAM rules (VCAP IS1). So we consume the VLAN ID
from the skb payload using __skb_vlan_pop(), and take the classified
VLAN ID from the DSA tag, and construct a hwaccel VLAN tag with the
classified VLAN, and the skb payload is VLAN-untagged.
The big problem is that __skb_vlan_pop() does:
memmove(skb->data + VLAN_HLEN, skb->data, 2 * ETH_ALEN);
__skb_pull(skb, VLAN_HLEN);
aka it moves the Ethernet header 4 bytes to the right, and pulls 4 bytes
from the skb headroom (effectively also moving skb->data, by definition).
So for felix_rxtstamp()'s fragile logic, all bets are off now.
Instead of having the "extraction" pointer point to the DSA header,
it actually points to 4 bytes _inside_ the extraction header.
Corollary, the last 4 bytes of the "extraction" header are in fact 4
stale bytes of the destination MAC address from the Ethernet header,
from prior to the __skb_vlan_pop() movement.
So of course, RX timestamps are completely bogus when the system is
configured in this way.
The fix is actually very simple: just don't structure the code like that.
For better or worse, the DSA PTP timestamping API does not offer a
straightforward way for drivers to present their RX timestamps, but
other drivers (sja1105) have established a simple mechanism to carry
their RX timestamp from dsa_device_ops :: rcv() all the way to
dsa_switch_ops :: port_rxtstamp() and even later. That mechanism is to
simply save the partial timestamp to the skb->cb, and complete it later.
Question: why don't we simply populate the skb's struct
skb_shared_hwtstamps from ocelot_rcv(), and bother with this
complication of propagating the timestamp to felix_rxtstamp()?
Answer: dsa_switch_ops :: port_rxtstamp() answers the question whether
PTP packets need sleepable context to retrieve the full RX timestamp.
Currently felix_rxtstamp() answers "no, thanks" to that question, and
calls ocelot_ptp_gettime64() from softirq atomic context. This is
understandable, since Felix VSC9959 is a PCIe memory-mapped switch, so
hardware access does not require sleeping. But the felix driver is
preparing for the introduction of other switches where hardware access
is over a slow bus like SPI or MDIO:
https://lore.kernel.org/lkml/20210814025003.2449143-1-colin.foster@in-advantage.com/
So I would like to keep this code structure, so the rework needed when
that driver will need PTP support will be minimal (answer "yes, I need
deferred context for this skb's RX timestamp", then the partial
timestamp will still be found in the skb->cb.
Fixes: ea440cd2d9 ("net: dsa: tag_ocelot: use VLAN information from tagging header when available")
Reported-by: Po Liu <po.liu@nxp.com>
Cc: Yangbo Lu <yangbo.lu@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some device set MAC06 exchange in the bootloader. This cause some
problem as we don't support this strange mode and we just set the port6
as the primary CPU port. With MAC06 exchange, PAD0 reg configure port6
instead of port0. Add an extra check and explicitly disable MAC06 exchange
to correctly configure the port PAD config.
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Fixes: 3fcf734aa4 ("net: dsa: qca8k: add support for cpu port 6")
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The GSWIP switch accesses various bridging layer tables (VLANs, FDBs,
forwarding rules) indirectly through PCE registers. These hardware
accesses are non-atomic, being comprised of several register reads and
writes.
These accesses are currently serialized by the rtnl_lock, but DSA is
changing its driver API and that lock will no longer be held when
calling ->port_fdb_add() and ->port_fdb_del().
So this driver needs to serialize the access to the PCE registers using
its own locking scheme. This patch adds that.
Note that the driver also uses the gswip_pce_load_microcode() function
to load a static configuration for the packet classification engine into
a table using the same registers. It is currently not protected, but
since that configuration is only done from the dsa_switch_ops :: setup
method, there is no risk of it being concurrent with other operations.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The b53 driver performs non-atomic transactions to the ARL table when
adding, deleting and reading FDB and MDB entries.
Traditionally these were all serialized by the rtnl_lock(), but now it
is possible that DSA calls ->port_fdb_add and ->port_fdb_del without
holding that lock.
So the driver must have its own serialization logic. Add a mutex and
hold it from all entry points (->port_fdb_{add,del,dump},
->port_mdb_{add,del}).
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The sja1105 hardware seems as concurrent as can be, but when we create a
background script that adds/removes a rain of FDB entries without the
rtnl_mutex taken, then in parallel we do another operation like run
'bridge fdb show', we can notice these errors popping up:
sja1105 spi2.0: port 2 failed to read back entry for 00:01:02:03:00:40 vid 0: -ENOENT
sja1105 spi2.0: port 2 failed to add 00:01:02:03:00:40 vid 0 to fdb: -2
sja1105 spi2.0: port 2 failed to read back entry for 00:01:02:03:00:46 vid 0: -ENOENT
sja1105 spi2.0: port 2 failed to add 00:01:02:03:00:46 vid 0 to fdb: -2
Luckily what is going on does not require a major rework in the driver.
The sja1105_dynamic_config_read() function sends multiple SPI buffers to
the peripheral until the operation completes. We should not do anything
until the hardware clears the VALID bit.
But since there is no locking (i.e. right now we are implicitly
serialized by the rtnl_mutex, but if we remove that), it might be
possible that the process which performs the dynamic config read is
preempted and another one performs a dynamic config write.
What will happen in that case is that sja1105_dynamic_config_read(),
when it resumes, expects to see VALIDENT set for the entry it reads
back. But it won't.
This can be corrected by introducing a mutex for serializing SPI
accesses to the dynamic config interface which should be atomic with
respect to each other.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The hardware manual says that software should attempt a new dynamic
config access (be it a a write or a read-back) only while the VALID bit
is cleared. The VALID bit is set by software to 1, and it remains set as
long as the hardware is still processing the request.
Currently the driver only polls for the command completion only for
reads, because that's when we need the actual data read back. Writes
have been more or less "asynchronous", although this has never been an
observable issue.
This change makes sja1105_dynamic_config_write poll the VALID bit as
well, to absolutely ensure that a follow-up access to the static config
finds the VALID bit cleared.
So VALID means "work in progress", while VALIDENT means "entry being
read is valid". On reads we check the VALIDENT bit too, while on writes
that bit is not always defined. So we need to factor it out of the loop,
and make the loop provide back the unpacked command structure, so that
sja1105_dynamic_config_read can check the VALIDENT bit.
The change also attempts to convert the open-coded loop to use the
read_poll_timeout macro, since I know this will come up during review.
It's more code, but hey, it uses read_poll_timeout!
Tested on SJA1105T, SJA1105S, SJA1110A.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Looking at the code, the GSWIP switch appears to hold bridging service
structures (VLANs, FDBs, forwarding rules) in PCE table entries.
Hardware access to the PCE table is non-atomic, and is comprised of
several register reads and writes.
These accesses are currently serialized by the rtnl_lock, but DSA is
changing its driver API and that lock will no longer be held when
calling ->port_fdb_add() and ->port_fdb_del().
So this driver needs to serialize the access to the PCE table using its
own locking scheme. This patch adds that.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
The b53 driver performs non-atomic transactions to the ARL table when
adding, deleting and reading FDB and MDB entries.
Traditionally these were all serialized by the rtnl_lock(), but now it
is possible that DSA calls ->port_fdb_add and ->port_fdb_del without
holding that lock.
So the driver must have its own serialization logic. Add a mutex and
hold it from all entry points (->port_fdb_{add,del,dump},
->port_mdb_{add,del}).
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The sja1105 hardware seems as concurrent as can be, but when we create a
background script that adds/removes a rain of FDB entries without the
rtnl_mutex taken, then in parallel we do another operation like run
'bridge fdb show', we can notice these errors popping up:
sja1105 spi2.0: port 2 failed to read back entry for 00:01:02:03:00:40 vid 0: -ENOENT
sja1105 spi2.0: port 2 failed to add 00:01:02:03:00:40 vid 0 to fdb: -2
sja1105 spi2.0: port 2 failed to read back entry for 00:01:02:03:00:46 vid 0: -ENOENT
sja1105 spi2.0: port 2 failed to add 00:01:02:03:00:46 vid 0 to fdb: -2
Luckily what is going on does not require a major rework in the driver.
The sja1105_dynamic_config_read() function sends multiple SPI buffers to
the peripheral until the operation completes. We should not do anything
until the hardware clears the VALID bit.
But since there is no locking (i.e. right now we are implicitly
serialized by the rtnl_mutex, but if we remove that), it might be
possible that the process which performs the dynamic config read is
preempted and another one performs a dynamic config write.
What will happen in that case is that sja1105_dynamic_config_read(),
when it resumes, expects to see VALIDENT set for the entry it reads
back. But it won't.
This can be corrected by introducing a mutex for serializing SPI
accesses to the dynamic config interface which should be atomic with
respect to each other.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The hardware manual says that software should attempt a new dynamic
config access (be it a a write or a read-back) only while the VALID bit
is cleared. The VALID bit is set by software to 1, and it remains set as
long as the hardware is still processing the request.
Currently the driver only polls for the command completion only for
reads, because that's when we need the actual data read back. Writes
have been more or less "asynchronous", although this has never been an
observable issue.
This change makes sja1105_dynamic_config_write poll the VALID bit as
well, to absolutely ensure that a follow-up access to the static config
finds the VALID bit cleared.
So VALID means "work in progress", while VALIDENT means "entry being
read is valid". On reads we check the VALIDENT bit too, while on writes
that bit is not always defined. So we need to factor it out of the loop,
and make the loop provide back the unpacked command structure, so that
sja1105_dynamic_config_read can check the VALIDENT bit.
The change also attempts to convert the open-coded loop to use the
read_poll_timeout macro, since I know this will come up during review.
It's more code, but hey, it uses read_poll_timeout!
Tested on SJA1105T, SJA1105S, SJA1110A.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This converts users of mdiobus to mdiodev using the following semantic
patch:
@@
identifier mdiodev;
expression regnum;
@@
- mdiobus_read(mdiodev->bus, mdiodev->addr, regnum)
+ mdiodev_read(mdiodev, regnum)
@@
identifier mdiodev;
expression regnum, val;
@@
- mdiobus_write(mdiodev->bus, mdiodev->addr, regnum, val)
+ mdiodev_write(mdiodev, regnum, val)
Signed-off-by: Sean Anderson <sean.anderson@seco.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix following coccicheck warning:
./drivers/net/dsa/sja1105/sja1105_main.c:1193:1-33: WARNING: Function
for_each_available_child_of_node should have of_node_put() before return.
Early exits from for_each_available_child_of_node should decrement the
node reference counter.
Fixes: 9ca482a246 ("net: dsa: sja1105: parse {rx, tx}-internal-delay-ps properties for RGMII delays")
Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Link: https://lore.kernel.org/r/20211021094606.7118-1-wanjiabing@vivo.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Pass a single argument to dsa_8021q_rx_vid and dsa_8021q_tx_vid that
contains the necessary information from the two arguments that are
currently provided: the switch and the port number.
Also rename those functions so that they have a dsa_port_* prefix, since
they operate on a struct dsa_port *.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tidy and organize qca8k setup function from multiple for loop.
Change for loop in bridge leave/join to scan all port and skip cpu port.
No functional change intended.
Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>