This is a strictly cosmetic change that renames some macros in
sja1105_dynamic_config.c. They were copy-pasted in haste and this has
resulted in them having the driver prefix twice.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Return the driver name and ASIC ID so that generic user space
application are able to know they're looking at sja1105 devlink regions
when pretty-printing them.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As explained in Documentation/networking/dsa/sja1105.rst, this switch
has a static config held in the driver's memory and re-uploaded from
time to time into the device (after any major change).
The format of this static config is in fact described in UM10944.pdf and
it contains all the switch's settings (it also contains device ID, table
CRCs, etc, just like in the manual). So it is a useful and universal
devlink region to expose to user space, for debugging purposes.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We'll have more devlink code soon. Group it together in a separate
translation object.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The whole purpose of tag_8021q is to send VLAN-tagged traffic to the
CPU, from which the driver can decode the source port and switch id.
Currently this only works if the VLAN filtering on the master is
disabled. Change that by explicitly adding code to tag_8021q.c to add
the VLANs corresponding to the tags to the filter of the master
interface.
Because we now need to call vlan_vid_add, then we also need to hold the
RTNL mutex. Propagate that requirement to the callers of dsa_8021q_setup
and modify the existing call sites as appropriate. Note that one call
path, sja1105_best_effort_vlan_filtering_set -> sja1105_vlan_filtering
-> sja1105_setup_8021q_tagging, was already holding this lock.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While working on another tag_8021q driver implementation, some things
became apparent:
- It is not mandatory for a DSA driver to offload the tag_8021q VLANs by
using the VLAN table per se. For example, it can add custom TCAM rules
that simply encapsulate RX traffic, and redirect & decapsulate rules
for TX traffic. For such a driver, it makes no sense to receive the
tag_8021q configuration through the same callback as it receives the
VLAN configuration from the bridge and the 8021q modules.
- Currently, sja1105 (the only tag_8021q user) sets a
priv->expect_dsa_8021q variable to distinguish between the bridge
calling, and tag_8021q calling. That can be improved, to say the
least.
- The crosschip bridging operations are, in fact, stateful already. The
list of crosschip_links must be kept by the caller and passed to the
relevant tag_8021q functions.
So it would be nice if the tag_8021q configuration was more
self-contained. This patch attempts to do that.
Create a struct dsa_8021q_context which encapsulates a struct
dsa_switch, and has 2 function pointers for adding and deleting a VLAN.
These will replace the previous channel to the driver, which was through
the .port_vlan_add and .port_vlan_del callbacks of dsa_switch_ops.
Also put the list of crosschip_links into this dsa_8021q_context.
Drivers that don't support cross-chip bridging can simply omit to
initialize this list, as long as they dont call any cross-chip function.
The sja1105_vlan_add and sja1105_vlan_del functions are refactored into
a smaller sja1105_vlan_add_one, which now has 2 entry points:
- sja1105_vlan_add, from struct dsa_switch_ops
- sja1105_dsa_8021q_vlan_add, from the tag_8021q ops
But even this change is fairly trivial. It just reflects the fact that
for sja1105, the VLANs from these 2 channels end up in the same hardware
table. However that is not necessarily true in the general sense (and
that's the reason for making this change).
The rest of the patch is mostly plain refactoring of "ds" -> "ctx". The
dsa_8021q_context structure needs to be propagated because adding a VLAN
is now done through the ops function pointers inside of it.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no point in calling dsa_port_setup_8021q_tagging for each
individual port. Additionally, it will become more difficult to do that
when we'll have a context structure to tag_8021q (next patch). So
refactor this now.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Clang warns:
drivers/net/dsa/sja1105/sja1105_main.c:3418:38: warning: address of
array 'match->compatible' will always evaluate to 'true'
[-Wpointer-bool-conversion]
for (match = sja1105_dt_ids; match->compatible; match++) {
~~~ ~~~~~~~^~~~~~~~~~
1 warning generated.
We should check the value of the first character in compatible to see if
it is empty or not. This matches how the rest of the tree iterates over
IDs.
Fixes: 0b0e299720 ("net: dsa: sja1105: use detected device id instead of DT one on mismatch")
Link: https://github.com/ClangBuiltLinux/linux/issues/1139
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Although we can detect the chip revision 100% at runtime, it is useful
to specify it in the device tree compatible string too, because
otherwise there would be no way to assess the correctness of device tree
bindings statically, without booting a board (only some switch versions
have internal RGMII delays and/or an SGMII port).
But for testing the P/Q/R/S support, what I have is a reworked board
with the SJA1105T replaced by a pin-compatible SJA1105Q, and I don't
want to keep a separate device tree blob just for this one-off board.
Since just the chip has been replaced, its RGMII delay setup is
inherently the same (meaning: delays added by the PHY on the slave
ports, and by PCB traces on the fixed-link CPU port).
For this board, I'd rather have the driver shout at me, but go ahead and
use what it found even if it doesn't match what it's been told is there.
[ 2.970826] sja1105 spi0.1: Device tree specifies chip SJA1105T but found SJA1105Q, please fix it!
[ 2.980010] sja1105 spi0.1: Probed switch chip: SJA1105Q
[ 3.005082] sja1105 spi0.1: Enabled switch tagging
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current poll interval is enough to ensure that rising and falling
edge events are not lost for a 1 PPS signal with 50% duty cycle.
But when we deliver the events to user space, it will try to infer if
they were corresponding to a rising or to a falling edge (the kernel
driver doesn't know that either). User space will try to make that
inference based on the time at which the PPS master had emitted the
pulse (i.e. if it's a .0 time, it's rising edge, if it's .5 time, it's
falling edge).
But there is no in-kernel API for retrieving the precise timestamp
corresponding to a PPS master (aka perout) pulse. So user space has to
guess even that. It will read the PTP time on the PPS master right after
we've delivered the extts event, and declare that the PPS master time
was just the closest integer second, based on 2 thresholds (lower than
.25, or higher than .75, and ignore anything else).
Except that, if we poll for extts events (and our hardware doesn't
really help us, by not providing an interrupt), then there is a risk
that the poll period (and therefore the time at which the event is
delivered) might confuse user space.
Because we are always scheduling the next extts poll at
SJA1105_EXTTS_INTERVAL "from now" (that's the only thing that the
schedule_delayed_work() API gives us), it means that the start time of
the next delayed workqueue will always be shifted to the right a little
bit (shifted with the SPI access duration of this workqueue run).
In turn, because user space sees extts events that are non-periodic
compared to the PPS master's time, this means that it might start making
wrong guesses about rising/falling edge.
To understand the effect, here is the output of ts2phc currently. Notice
the 'src' timestamps of the 'SKIP extts' events, and how they have a
large wander. They keep increasing until the upper limit for the ignore
threshold (.75 seconds), after which the application starts ignoring the
_other_ edge.
ts2phc[26.624]: /dev/ptp3 SKIP extts index 0 at 21.449898912 src 21.657784518
ts2phc[27.133]: adding tstamp 21.949894240 to clock /dev/ptp3
ts2phc[27.133]: adding tstamp 22.000000000 to clock /dev/ptp1
ts2phc[27.133]: /dev/ptp3 offset 640 s2 freq +5112
ts2phc[27.636]: /dev/ptp3 SKIP extts index 0 at 22.449889360 src 22.669398022
ts2phc[28.140]: adding tstamp 22.949884376 to clock /dev/ptp3
ts2phc[28.140]: adding tstamp 23.000000000 to clock /dev/ptp1
ts2phc[28.140]: /dev/ptp3 offset 96 s2 freq +4760
ts2phc[28.644]: /dev/ptp3 SKIP extts index 0 at 23.449879504 src 23.677420422
ts2phc[29.153]: adding tstamp 23.949874704 to clock /dev/ptp3
ts2phc[29.153]: adding tstamp 24.000000000 to clock /dev/ptp1
ts2phc[29.153]: /dev/ptp3 offset -264 s2 freq +4429
ts2phc[29.656]: /dev/ptp3 SKIP extts index 0 at 24.449870008 src 24.689407238
ts2phc[30.160]: adding tstamp 24.949865376 to clock /dev/ptp3
ts2phc[30.160]: adding tstamp 25.000000000 to clock /dev/ptp1
ts2phc[30.160]: /dev/ptp3 offset -280 s2 freq +4334
ts2phc[30.664]: /dev/ptp3 SKIP extts index 0 at 25.449860760 src 25.697449926
ts2phc[31.168]: adding tstamp 25.949856176 to clock /dev/ptp3
ts2phc[31.168]: adding tstamp 26.000000000 to clock /dev/ptp1
ts2phc[31.168]: /dev/ptp3 offset -176 s2 freq +4354
ts2phc[31.672]: /dev/ptp3 SKIP extts index 0 at 26.449851584 src 26.705433606
ts2phc[32.180]: adding tstamp 26.949846992 to clock /dev/ptp3
ts2phc[32.180]: adding tstamp 27.000000000 to clock /dev/ptp1
ts2phc[32.180]: /dev/ptp3 offset -80 s2 freq +4397
ts2phc[32.684]: /dev/ptp3 SKIP extts index 0 at 27.449842384 src 27.717415110
ts2phc[33.192]: adding tstamp 27.949837768 to clock /dev/ptp3
ts2phc[33.192]: adding tstamp 28.000000000 to clock /dev/ptp1
ts2phc[33.192]: /dev/ptp3 offset 0 s2 freq +4453
ts2phc[33.696]: /dev/ptp3 SKIP extts index 0 at 28.449833128 src 28.729412902
ts2phc[34.200]: adding tstamp 28.949828472 to clock /dev/ptp3
ts2phc[34.200]: adding tstamp 29.000000000 to clock /dev/ptp1
ts2phc[34.200]: /dev/ptp3 offset 8 s2 freq +4461
ts2phc[34.704]: /dev/ptp3 SKIP extts index 0 at 29.449823816 src 29.737416038
ts2phc[35.208]: adding tstamp 29.949819152 to clock /dev/ptp3
ts2phc[35.208]: adding tstamp 30.000000000 to clock /dev/ptp1
ts2phc[35.208]: /dev/ptp3 offset -8 s2 freq +4447
ts2phc[35.712]: /dev/ptp3 SKIP extts index 0 at 30.449814496 src 30.745554982
ts2phc[36.216]: adding tstamp 30.949809840 to clock /dev/ptp3
ts2phc[36.216]: adding tstamp 31.000000000 to clock /dev/ptp1
ts2phc[36.216]: /dev/ptp3 offset -8 s2 freq +4445
ts2phc[36.468]: /dev/ptp3 SKIP extts index 0 at 31.449805184 src 31.501109446
ts2phc[36.972]: adding tstamp 31.949800536 to clock /dev/ptp3
ts2phc[36.972]: adding tstamp 32.000000000 to clock /dev/ptp1
ts2phc[36.972]: /dev/ptp3 offset -8 s2 freq +4442
ts2phc[37.480]: /dev/ptp3 SKIP extts index 0 at 32.449795896 src 32.513320070
ts2phc[37.984]: adding tstamp 32.949791248 to clock /dev/ptp3
ts2phc[37.984]: adding tstamp 33.000000000 to clock /dev/ptp1
ts2phc[37.984]: /dev/ptp3 offset 0 s2 freq +4448
Fix that by taking the following measures:
- Schedule the poll from a timer. Because we are really scheduling the
timer periodically, the extts events delivered to user space are
periodic too, and don't suffer from the "shift-to-the-right" effect.
- Increase the poll period to 6 times a second. This imposes a smaller
upper bound to the shift that can occur to the delivery time of extts
events, and makes user space (ts2phc) to always interpret correctly
which events should be skipped and which shouldn't.
- Move the SPI readout itself to the main PTP kernel thread, instead of
the generic workqueue. This is because the timer runs in atomic
context, but is also better than before, because if needed, we can
chrt & taskset this kernel thread, to ensure it gets enough priority
under load.
After this patch, one can notice that the wander is greatly reduced, and
that the latencies of one extts poll are not propagated to the next. The
'src' timestamp that is skipped is never larger than .65 seconds (which
means .15 seconds larger than the time at which the real event occurred
at, and .10 seconds smaller than the .75 upper threshold for ignoring
the falling edge):
ts2phc[40.076]: adding tstamp 34.949261296 to clock /dev/ptp3
ts2phc[40.076]: adding tstamp 35.000000000 to clock /dev/ptp1
ts2phc[40.076]: /dev/ptp3 offset 48 s2 freq +4631
ts2phc[40.568]: /dev/ptp3 SKIP extts index 0 at 35.449256496 src 35.595791078
ts2phc[41.064]: adding tstamp 35.949251744 to clock /dev/ptp3
ts2phc[41.064]: adding tstamp 36.000000000 to clock /dev/ptp1
ts2phc[41.064]: /dev/ptp3 offset -224 s2 freq +4374
ts2phc[41.552]: /dev/ptp3 SKIP extts index 0 at 36.449247088 src 36.579825574
ts2phc[42.044]: adding tstamp 36.949242456 to clock /dev/ptp3
ts2phc[42.044]: adding tstamp 37.000000000 to clock /dev/ptp1
ts2phc[42.044]: /dev/ptp3 offset -240 s2 freq +4290
ts2phc[42.536]: /dev/ptp3 SKIP extts index 0 at 37.449237848 src 37.563828774
ts2phc[43.028]: adding tstamp 37.949233264 to clock /dev/ptp3
ts2phc[43.028]: adding tstamp 38.000000000 to clock /dev/ptp1
ts2phc[43.028]: /dev/ptp3 offset -144 s2 freq +4314
ts2phc[43.520]: /dev/ptp3 SKIP extts index 0 at 38.449228656 src 38.547823238
ts2phc[44.012]: adding tstamp 38.949224048 to clock /dev/ptp3
ts2phc[44.012]: adding tstamp 39.000000000 to clock /dev/ptp1
ts2phc[44.012]: /dev/ptp3 offset -80 s2 freq +4335
ts2phc[44.508]: /dev/ptp3 SKIP extts index 0 at 39.449219432 src 39.535846118
ts2phc[44.996]: adding tstamp 39.949214816 to clock /dev/ptp3
ts2phc[44.996]: adding tstamp 40.000000000 to clock /dev/ptp1
ts2phc[44.996]: /dev/ptp3 offset -32 s2 freq +4359
ts2phc[45.488]: /dev/ptp3 SKIP extts index 0 at 40.449210192 src 40.515824678
ts2phc[45.980]: adding tstamp 40.949205568 to clock /dev/ptp3
ts2phc[45.980]: adding tstamp 41.000000000 to clock /dev/ptp1
ts2phc[45.980]: /dev/ptp3 offset 8 s2 freq +4390
ts2phc[46.636]: /dev/ptp3 SKIP extts index 0 at 41.449200928 src 41.664176902
ts2phc[47.132]: adding tstamp 41.949196288 to clock /dev/ptp3
ts2phc[47.132]: adding tstamp 42.000000000 to clock /dev/ptp1
ts2phc[47.132]: /dev/ptp3 offset 0 s2 freq +4384
ts2phc[47.620]: /dev/ptp3 SKIP extts index 0 at 42.449191656 src 42.648117190
ts2phc[48.112]: adding tstamp 42.949187016 to clock /dev/ptp3
ts2phc[48.112]: adding tstamp 43.000000000 to clock /dev/ptp1
ts2phc[48.112]: /dev/ptp3 offset 0 s2 freq +4384
ts2phc[48.604]: /dev/ptp3 SKIP extts index 0 at 43.449182384 src 43.632112582
ts2phc[49.100]: adding tstamp 43.949177736 to clock /dev/ptp3
ts2phc[49.100]: adding tstamp 44.000000000 to clock /dev/ptp1
ts2phc[49.100]: /dev/ptp3 offset -8 s2 freq +4376
ts2phc[49.588]: /dev/ptp3 SKIP extts index 0 at 44.449173096 src 44.616136774
ts2phc[50.080]: adding tstamp 44.949168464 to clock /dev/ptp3
ts2phc[50.080]: adding tstamp 45.000000000 to clock /dev/ptp1
ts2phc[50.080]: /dev/ptp3 offset 8 s2 freq +4390
ts2phc[50.572]: /dev/ptp3 SKIP extts index 0 at 45.449163816 src 45.600134662
ts2phc[51.064]: adding tstamp 45.949159160 to clock /dev/ptp3
ts2phc[51.064]: adding tstamp 46.000000000 to clock /dev/ptp1
ts2phc[51.064]: /dev/ptp3 offset -8 s2 freq +4376
ts2phc[51.556]: /dev/ptp3 SKIP extts index 0 at 46.449154528 src 46.584588550
ts2phc[52.048]: adding tstamp 46.949149896 to clock /dev/ptp3
ts2phc[52.048]: adding tstamp 47.000000000 to clock /dev/ptp1
ts2phc[52.048]: /dev/ptp3 offset 0 s2 freq +4382
ts2phc[52.540]: /dev/ptp3 SKIP extts index 0 at 47.449145256 src 47.568132198
ts2phc[53.032]: adding tstamp 47.949140616 to clock /dev/ptp3
ts2phc[53.032]: adding tstamp 48.000000000 to clock /dev/ptp1
ts2phc[53.032]: /dev/ptp3 offset 0 s2 freq +4382
ts2phc[53.524]: /dev/ptp3 SKIP extts index 0 at 48.449135968 src 48.552121446
ts2phc[54.016]: adding tstamp 48.949131320 to clock /dev/ptp3
ts2phc[54.016]: adding tstamp 49.000000000 to clock /dev/ptp1
ts2phc[54.016]: /dev/ptp3 offset 0 s2 freq +4382
ts2phc[54.512]: /dev/ptp3 SKIP extts index 0 at 49.449126680 src 49.540147014
ts2phc[55.000]: adding tstamp 49.949122040 to clock /dev/ptp3
ts2phc[55.000]: adding tstamp 50.000000000 to clock /dev/ptp1
ts2phc[55.000]: /dev/ptp3 offset 0 s2 freq +4382
ts2phc[55.492]: /dev/ptp3 SKIP extts index 0 at 50.449117400 src 50.520119078
ts2phc[55.988]: adding tstamp 50.949112768 to clock /dev/ptp3
ts2phc[55.988]: adding tstamp 51.000000000 to clock /dev/ptp1
ts2phc[55.988]: /dev/ptp3 offset 8 s2 freq +4390
ts2phc[56.476]: /dev/ptp3 SKIP extts index 0 at 51.449108120 src 51.504175910
ts2phc[57.132]: adding tstamp 51.949103480 to clock /dev/ptp3
ts2phc[57.132]: adding tstamp 52.000000000 to clock /dev/ptp1
ts2phc[57.132]: /dev/ptp3 offset 0 s2 freq +4384
ts2phc[57.624]: /dev/ptp3 SKIP extts index 0 at 52.449098840 src 52.651833574
ts2phc[58.116]: adding tstamp 52.949094200 to clock /dev/ptp3
ts2phc[58.116]: adding tstamp 53.000000000 to clock /dev/ptp1
ts2phc[58.116]: /dev/ptp3 offset 8 s2 freq +4392
ts2phc[58.612]: /dev/ptp3 SKIP extts index 0 at 53.449089560 src 53.639826918
ts2phc[59.100]: adding tstamp 53.949084920 to clock /dev/ptp3
ts2phc[59.100]: adding tstamp 54.000000000 to clock /dev/ptp1
ts2phc[59.100]: /dev/ptp3 offset 8 s2 freq +4394
ts2phc[59.592]: /dev/ptp3 SKIP extts index 0 at 54.449080272 src 54.619842278
ts2phc[60.084]: adding tstamp 54.949075624 to clock /dev/ptp3
ts2phc[60.084]: adding tstamp 55.000000000 to clock /dev/ptp1
ts2phc[60.084]: /dev/ptp3 offset 8 s2 freq +4397
ts2phc[60.576]: /dev/ptp3 SKIP extts index 0 at 55.449070968 src 55.603885542
ts2phc[61.068]: adding tstamp 55.949066312 to clock /dev/ptp3
ts2phc[61.068]: adding tstamp 56.000000000 to clock /dev/ptp1
ts2phc[61.068]: /dev/ptp3 offset 0 s2 freq +4391
ts2phc[61.560]: /dev/ptp3 SKIP extts index 0 at 56.449061680 src 56.587885798
ts2phc[62.052]: adding tstamp 56.949057032 to clock /dev/ptp3
ts2phc[62.052]: adding tstamp 57.000000000 to clock /dev/ptp1
ts2phc[62.052]: /dev/ptp3 offset -8 s2 freq +4383
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since 'tcfp_burst' with TICK factor, driver side always need to recover
it to the original value, this patch moves the generic calculation and
recover to the 'burst' original value before offloading to device driver.
Signed-off-by: Po Liu <po.liu@nxp.com>
Acked-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Minor overlapping changes in xfrm_device.c, between the double
ESP trailing bug fix setting the XFRM_INIT flag and the changes
in net-next preparing for bonding encryption support.
Signed-off-by: David S. Miller <davem@davemloft.net>
The sja1105_gating_cfg_time_to_interval function does this, as per the
comments:
/* The gate entries contain absolute times in their e->interval field. Convert
* that to proper intervals (i.e. "0, 5, 10, 15" to "5, 5, 5, 5").
*/
To perform that task, it iterates over gating_cfg->entries, at each step
updating the interval of the _previous_ entry. So one interval remains
to be updated at the end of the loop: the last one (since it isn't
"prev" for anyone else).
But there was an erroneous check, that the last element's interval
should not be updated if it's also the only element. I'm not quite sure
why that check was there, but it's clearly incorrect, as a tc-gate
schedule with a single element would get an e->interval of zero,
regardless of the duration requested by the user. The switch wouldn't
even consider this configuration as valid: it will just drop all traffic
that matches the rule.
Fixes: 834f8933d5 ("net: dsa: sja1105: implement tc-gate using time-triggered virtual links")
Reported-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, tas_data->enabled would remain true even after deleting all
tc-gate rules from the switch ports, which would cause the
sja1105_tas_state_machine to get unnecessarily scheduled.
Also, if there were any errors which would prevent the hardware from
enabling the gating schedule, the sja1105_tas_state_machine would
continuously detect and print that, spamming the kernel log, even if the
rules were subsequently deleted.
The rules themselves are _not_ active, because sja1105_init_scheduling
does enough of a job to not install the gating schedule in the static
config. But the virtual link rules themselves are still present.
So call the functions that remove the tc-gate configuration from
priv->tas_data.gating_cfg, so that tas_data->enabled can be set to
false, and sja1105_tas_state_machine will stop from being scheduled.
Fixes: 834f8933d5 ("net: dsa: sja1105: implement tc-gate using time-triggered virtual links")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently sja1105_compose_gating_subschedule is not prepared to be
called for the case where we want to recompute the global tc-gate
configuration after we've deleted those actions on a port.
After deleting the tc-gate actions on the last port, max_cycle_time
would become zero, and that would incorrectly prevent
sja1105_free_gating_config from getting called.
So move the freeing function above the check for the need to apply a new
configuration.
Fixes: 834f8933d5 ("net: dsa: sja1105: implement tc-gate using time-triggered virtual links")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It turns out that sja1105_compose_gating_subschedule must also be called
from sja1105_vl_delete, to recalculate the overall tc-gate
configuration. Currently this is not possible without introducing a
forward declaration. So move the function at the top of the file, along
with its dependencies.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since struct sja1105_private only holds a const pointer to one of these
structures based on device tree compatible string, the structures
themselves can be made const.
Also add an empty line between each structure definition, to appease
checkpatch.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The per-chip instantiations of struct sja1105_table_ops and struct
sja1105_dynamic_table_ops can be made constant, so do that.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sparse is complaining and giving the following warning message:
'Using plain integer as NULL pointer'.
This is not what's going on, instead {0} is used as a zero initializer
for the structure members, to indicate that the particular chip revision
does not support those particular config tables.
But since the config tables are declared globally, the unpopulated
elements are zero-initialized anyway. So, to make sparse shut up, let's
remove the zero initializers.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make use of the struct_size() helper instead of an open-coded version
in order to avoid any potential type mistakes.
This code was detected with the help of Coccinelle and, audited and
fixed manually.
Addresses-KSPP-ID: https://github.com/KSPP/linux/issues/83
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Acked-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds a drop frames counter to tc flower offloading.
Reporting h/w dropped frames is necessary for some actions.
Some actions like police action and the coming introduced stream gate
action would produce dropped frames which is necessary for user. Status
update shows how many filtered packets increasing and how many dropped
in those packets.
v2: Changes
- Update commit comments suggest by Jiri Pirko.
Signed-off-by: Po Liu <Po.Liu@nxp.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Reviewed-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This action requires the VLAN awareness state of the switch to be of the
same type as the key that's being added:
- If the switch is unaware of VLAN, then the tc filter key must only
contain the destination MAC address.
- If the switch is VLAN-aware, the key must also contain the VLAN ID and
PCP.
But this check doesn't work unless we verify the VLAN awareness state on
both the "if" and the "else" branches.
Fixes: 834f8933d5 ("net: dsa: sja1105: implement tc-gate using time-triggered virtual links")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This action requires the VLAN awareness state of the switch to be of the
same type as the key that's being added:
- If the switch is unaware of VLAN, then the tc filter key must only
contain the destination MAC address.
- If the switch is VLAN-aware, the key must also contain the VLAN ID and
PCP.
But this check doesn't work unless we verify the VLAN awareness state on
both the "if" and the "else" branches.
Fixes: dfacc5a23e ("net: dsa: sja1105: support flow-based redirection via virtual links")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This shouldn't be there.
Fixes: 834f8933d5 ("net: dsa: sja1105: implement tc-gate using time-triggered virtual links")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It isn't actually described clearly at all in UM10944.pdf, but on TX of
a management frame (such as PTP), this needs to happen:
- The destination MAC address (i.e. 01-80-c2-00-00-0e), along with the
desired destination port, need to be installed in one of the 4
management slots of the switch, over SPI.
- The host can poll over SPI for that management slot's ENFPORT field.
That gets unset when the switch has matched the slot to the frame.
And therein lies the problem. ENFPORT does not mean that the packet has
been transmitted. Just that it has been received over the CPU port, and
that the mgmt slot is yet again available.
This is relevant because of what we are doing in sja1105_ptp_txtstamp_skb,
which is called right after sja1105_mgmt_xmit. We are in a hard
real-time deadline, since the hardware only gives us 24 bits of TX
timestamp, so we need to read the full PTP clock to reconstruct it.
Because we're in a hurry (in an attempt to make sure that we have a full
64-bit PTP time which is as close as possible to the actual transmission
time of the frame, to avoid 24-bit wraparounds), first we read the PTP
clock, then we poll for the TX timestamp to become available.
But of course, we don't know for sure that the frame has been
transmitted when we read the full PTP clock. We had assumed that ENFPORT
means it has, but the assumption is incorrect. And while in most
real-life scenarios this has never been caught due to software delays,
nowhere is this fact more obvious than with a tc-taprio offload, where
PTP traffic gets a small timeslot very rarely (example: 1 packet per 10
ms). In that case, we will be reading the PTP clock for timestamp
reconstruction too early (before the packet has been transmitted), and
this renders the reconstruction procedure incorrect (see the assumptions
described in the comments found on function sja1105_tstamp_reconstruct).
So the PTP TX timestamps will be off by 1<<24 clock ticks, or 135 ms
(1 tick is 8 ns).
So fix this case of premature optimization by simply reordering the
sja1105_ptpegr_ts_poll and the sja1105_ptpclkval_read function calls. It
turns out that in practice, the 135 ms hard deadline for PTP timestamp
wraparound is not so hard, since even the most bandwidth-intensive PTP
profiles, such as 802.1AS-2011, have a sync frame interval of 125 ms.
So if we couldn't deliver a timestamp in 135 ms (which we can), we're
toast and have much bigger problems anyway.
Fixes: 47ed985e97 ("net: dsa: sja1105: Add logic for TX timestamping")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Newer C compilers are complaining about the fact that there are no
function prototypes in sja1105_vl.c for the non-static functions.
Give them what they want.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The dynamic configuration interface for the General Params and the L2
Lookup Params tables was copy-pasted between E/T devices and P/Q/R/S
devices. Nonetheless, these interfaces are bitwise different.
The driver is using dynamic reconfiguration of the General Parameters
table for the port mirroring feature, which was therefore broken on
P/Q/R/S.
Note that this patch can't be backported easily very far to stable trees
(since it conflicts with some other development done since the
introduction of the driver). So the Fixes: tag is purely informational.
Fixes: 8aa9ebccae ("net: dsa: Introduce driver for NXP SJA1105 5-port L2 switch")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Newer compilers complain with W=1 builds that there are non-static
functions defined in sja1105_static_config.c that don't have a
prototype, because their prototype is defined in sja1105.h which this
translation unit does not include.
I don't entirely understand what is the point of these warnings, since
in principle there's nothing wrong with that. But let's move the
prototypes to a header file that _is_ included by
sja1105_static_config.c, since that will make these warnings go away.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Be there 2 switches spi/spi2.0 and spi/spi2.1 in a cross-chip setup,
both under the same VLAN-filtering bridge, both in the
SJA1105_VLAN_BEST_EFFORT state.
If we try to change the VLAN state of one of the switches (to
SJA1105_VLAN_FILTERING_FULL) we get the following error:
devlink dev param set spi/spi2.1 name best_effort_vlan_filtering value
false cmode runtime
[ 38.325683] sja1105 spi2.1: Not allowed to overcommit frame memory.
L2 memory partitions and VL memory partitions share the
same space. The sum of all 16 memory partitions is not
allowed to be larger than 929 128-byte blocks (or 910
with retagging). Please adjust
l2-forwarding-parameters-table.part_spc and/or
vl-forwarding-parameters-table.partspc.
[ 38.356803] sja1105 spi2.1: Invalid config, cannot upload
This is because the spi/spi2.1 switch doesn't support tagging anymore in
the SJA1105_VLAN_FILTERING_FULL state, so it doesn't need to have any
retagging rules defined. Great, so it can use more frame memory
(retagging consumes extra memory).
But the built-in low-level static config checker from the sja1105 driver
says "not so fast, you've increased the frame memory to non-retagging
values, but you still kept the retagging rules in the static config".
So we need to rebuild the VLAN table immediately before re-uploading the
static config, operation which will take care, based on the new VLAN
state, of removing the retagging rules.
Fixes: 3f01c91aab ("net: dsa: sja1105: implement VLAN retagging for dsa_8021q sub-VLANs")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
SJA1105, being AVB/TSN switches, provide hardware assist for the
Credit-Based Shaper as described in the IEEE 8021Q-2018 document.
First generation has 10 shapers, freely assignable to any of the 4
external ports and 8 traffic classes, and second generation has 16
shapers.
The Credit-Based Shaper tables are accessed through the dynamic
reconfiguration interface, so we have to restore them manually after a
switch reset. The tables are backed up by the static config only on
P/Q/R/S, and we don't want to add custom code only for that family,
since the procedure that is in place now works for both.
Tested with the following commands:
data_rate_kbps=67000
port_transmit_rate_kbps=1000000
idleslope=$data_rate_kbps
sendslope=$(($idleslope - $port_transmit_rate_kbps))
locredit=$((-0x80000000))
hicredit=$((0x7fffffff))
tc qdisc add dev swp2 root handle 1: mqprio hw 0 num_tc 8 \
map 0 1 2 3 4 5 6 7 \
queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7
tc qdisc replace dev swp2 parent 1:1 cbs \
idleslope $idleslope \
sendslope $sendslope \
hicredit $hicredit \
locredit $locredit \
offload 1
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Expand the delta commit procedure for VLANs with additional logic for
treating bridge_vlans in the newly introduced operating mode,
SJA1105_VLAN_BEST_EFFORT.
For every bridge VLAN on every user port, a sub-VLAN index is calculated
and retagging rules are installed towards a dsa_8021q rx_vid that
encodes that sub-VLAN index. This way, the tagger can identify the
original VLANs.
Extra care is taken for VLANs to still work as intended in cross-chip
scenarios. Retagging may have unintended consequences for these because
a sub-VLAN encoding that works for the CPU does not make any sense for a
front-panel port.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are 2 different features that require some reserved frame memory
space: VLAN retagging and virtual links. Create a central function that
modifies the static config and ensures frame memory is never
overcommitted.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Retagging Table is an optional feature that allows the switch to
match frames against a {ingress port, egress port, vid} rule and change
their VLAN ID. The retagged frames are by default clones of the original
ones (since the hardware-foreseen use case was to mirror traffic for
debugging purposes and to tag it with a special VLAN for this purpose),
but we can force the original frames to be dropped by removing the
pre-retagging VLAN from the port membership list of the egress port.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This devlink parameter enables the handling of DSA tags when enslaved to
a bridge with vlan_filtering=1. There are very good reasons to want
this, but there are also very good reasons for not enabling it by
default. So a devlink param named best_effort_vlan_filtering, currently
driver-specific and exported only by sja1105, is used to configure this.
In practice, this is perhaps the way that most users are going to use
the switch in. It assumes that no more than 7 VLANs are needed per port.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Create a subvlan_map as part of each port's tagger private structure.
This keeps reverse mappings of bridge-to-dsa_8021q VLAN retagging rules.
Note that as of this patch, this piece of code is never engaged, due to
the fact that the driver hasn't installed any retagging rule, so we'll
always see packets with a subvlan code of 0 (untagged).
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In VLAN-unaware mode, sja1105 uses VLAN tags with a custom TPID of
0xdadb. While in the yet-to-be introduced best_effort_vlan_filtering
mode, it needs to work with normal VLAN TPID values.
A complication arises when we must transmit a VLAN-tagged packet to the
switch when it's in VLAN-aware mode. We need to construct a packet with
2 VLAN tags, and the switch will use the outer header for routing and
pop it on egress. But sadly, here the 2 hardware generations don't
behave the same:
- E/T switches won't pop an ETH_P_8021AD tag on egress, it seems
(packets will remain double-tagged).
- P/Q/R/S switches will drop a packet with 2 ETH_P_8021Q tags (it looks
like it tries to prevent VLAN hopping).
But looks like the reverse is also true:
- E/T switches have no problem popping the outer tag from packets with
2 ETH_P_8021Q tags.
- P/Q/R/S will have no problem popping a single tag even if that is
ETH_P_8021AD.
So it is clear that if we want the hardware to work with dsa_8021q
tagging in VLAN-aware mode, we need to send different TPIDs depending on
revision. Keep that information in priv->info->qinq_tpid.
The per-port tagger structure will hold an xmit_tpid value that depends
not only upon the qinq_tpid, but also upon the VLAN awareness state
itself (in case we must transmit using 0xdadb).
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
VLAN filtering is a global property for sja1105, and that means that we
rely on the DSA core to not call us more than once.
But we need to introduce some per-port state for the tagger, namely the
xmit_tpid, and the best place to do that is where the xmit_tpid changes,
namely in sja1105_vlan_filtering. So at the moment, exit early from the
function to avoid unnecessarily resetting the switch for each port call.
Then we'll change the xmit_tpid prior to the early exit in the next
patch.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Let the DSA core call our .port_vlan_add methods every time the bridge
layer requests so. We will deal internally with saving/restoring VLANs
depending on our VLAN awareness state.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Managing the VLAN table that is present in hardware will become very
difficult once we add a third operating state
(best_effort_vlan_filtering). That is because correct cleanup (not too
little, not too much) becomes virtually impossible, when VLANs can be
added from the bridge layer, from dsa_8021q for basic tagging, for
cross-chip bridging, as well as retagging rules for sub-VLANs and
cross-chip sub-VLANs. So we need to rethink VLAN interaction with the
switch in a more scalable way.
In preparation for that, use the priv->expect_dsa_8021q boolean to
classify any VLAN request received through .port_vlan_add or
.port_vlan_del towards either one of 2 internal lists: bridge VLANs and
dsa_8021q VLANs.
Then, implement a central sja1105_build_vlan_table method that creates a
VLAN configuration from scratch based on the 2 lists of VLANs kept by
the driver, and based on the VLAN awareness state. Currently, if we are
VLAN-unaware, install the dsa_8021q VLANs, otherwise the bridge VLANs.
Then, implement a delta commit procedure that identifies which VLANs
from this new configuration are actually different from the config
previously committed to hardware. We apply the delta through the dynamic
configuration interface (we don't reset the switch). The result is that
the hardware should see the exact sequence of operations as before this
patch.
This also helps remove the "br" argument passed to
dsa_8021q_crosschip_bridge_join, which it was only using to figure out
whether it should commit the configuration back to us or not, based on
the VLAN awareness state of the bridge. We can simplify that, by always
allowing those VLANs inside of our dsa_8021q_vlans list, and committing
those to hardware when necessary.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
At the moment, this can never happen. The 2 modes that we operate in do
not permit that:
- SJA1105_VLAN_UNAWARE: we are guarded from bridge VLANs added by the
user by the DSA core. We will later lift this restriction by setting
ds->vlan_bridge_vtu = true, and that is where we'll need it.
- SJA1105_VLAN_FILTERING_FULL: in this mode, dsa_8021q configuration is
disabled. So the user is free to add these VLANs in the 1024-3071
range.
The reason for the patch is that we'll introduce a third VLAN awareness
state, where both dsa_8021q as well as the bridge are going to call our
.port_vlan_add and .port_vlan_del methods.
For that, we need a good way to discriminate between the 2. The easiest
(and less intrusive way for upper layers) is to recognize the fact that
dsa_8021q configurations are always driven by our driver - we _know_
when a .port_vlan_add method will be called from dsa_8021q because _we_
initiated it.
So introduce an expect_dsa_8021q boolean which is only used, at the
moment, for blacklisting VLANs in range 1024-3071 in the modes when
dsa_8021q is active.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Soon we'll add a third operating mode to the driver. Introduce a
vlan_state to make things more easy to manage, and use it where
applicable.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
sja1105 uses dsa_8021q for DSA tagging, a format which is VLAN at heart
and which is compatible with cascading. A complete description of this
tagging format is in net/dsa/tag_8021q.c, but a quick summary is that
each external-facing port tags incoming frames with a unique pvid, and
this special VLAN is transmitted as tagged towards the inside of the
system, and as untagged towards the exterior. The tag encodes the switch
id and the source port index.
This means that cross-chip bridging for dsa_8021q only entails adding
the dsa_8021q pvids of one switch to the RX filter of the other
switches. Everything else falls naturally into place, as long as the
bottom-end of ports (the leaves in the tree) is comprised exclusively of
dsa_8021q-compatible (i.e. sja1105 switches). Otherwise, there would be
a chance that a front-panel switch transmits a packet tagged with a
dsa_8021q header, header which it wouldn't be able to remove, and which
would hence "leak" out.
The only use case I tested (due to lack of board availability) was when
the sja1105 switches are part of disjoint trees (however, this doesn't
change the fact that multiple sja1105 switches still need unique switch
identifiers in such a system). But in principle, even "true" single-tree
setups (with DSA links) should work just as fine, except for a small
change which I can't test: dsa_towards_port should be used instead of
dsa_upstream_port (I made the assumption that the routing port that any
sja1105 should use towards its neighbours is the CPU port. That might
not hold true in other setups).
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/net/dsa/sja1105/sja1105_vl.c:468:6: warning: variable ‘prev_time’ set but not used [-Wunused-but-set-variable]
u32 prev_time = 0;
^~~~~~~~~
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Samuel Zou <zou_wei@huawei.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Restrict the TTEthernet hardware support on this switch to operate as
closely as possible to IEEE 802.1Qci as possible. This means that it can
perform PTP-time-based ingress admission control on streams identified
by {DMAC, VID, PCP}, which is useful when trying to ensure the
determinism of traffic scheduled via IEEE 802.1Qbv.
The oddity comes from the fact that in hardware (and in TTEthernet at
large), virtual links always need a full-blown action, including not
only the type of policing, but also the list of destination ports. So in
practice, a single tc-gate action will result in all packets getting
dropped. Additional actions (either "trap" or "redirect") need to be
specified in the same filter rule such that the conforming packets are
actually forwarded somewhere.
Apart from the VL Lookup, Policing and Forwarding tables which need to
be programmed for each flow (virtual link), the Schedule engine also
needs to be told to open/close the admission gates for each individual
virtual link. A fairly accurate (and detailed) description of how that
works is already present in sja1105_tas.c, since it is already used to
trigger the egress gates for the tc-taprio offload (IEEE 802.1Qbv). Key
point here, we remember that the schedule engine supports 8
"subschedules" (execution threads that iterate through the global
schedule in parallel, and that no 2 hardware threads must execute a
schedule entry at the same time). For tc-taprio, each egress port used
one of these 8 subschedules, leaving a total of 4 subschedules unused.
In principle we could have allocated 1 subschedule for the tc-gate
offload of each ingress port, but actually the schedules of all virtual
links installed on each ingress port would have needed to be merged
together, before they could have been programmed to hardware. So
simplify our life and just merge the entire tc-gate configuration, for
all virtual links on all ingress ports, into a single subschedule. Be
sure to check that against the usual hardware scheduling conflicts, and
program it to hardware alongside any tc-taprio subschedule that may be
present.
The following scenarios were tested:
1. Quantitative testing:
tc qdisc add dev swp2 clsact
tc filter add dev swp2 ingress flower skip_sw \
dst_mac 42:be:24:9b:76:20 \
action gate index 1 base-time 0 \
sched-entry OPEN 1200 -1 -1 \
sched-entry CLOSE 1200 -1 -1 \
action trap
ping 192.168.1.2 -f
PING 192.168.1.2 (192.168.1.2) 56(84) bytes of data.
.............................
--- 192.168.1.2 ping statistics ---
948 packets transmitted, 467 received, 50.7384% packet loss, time 9671ms
2. Qualitative testing (with a phase-aligned schedule - the clocks are
synchronized by ptp4l, not shown here):
Receiver (sja1105):
tc qdisc add dev swp2 clsact
now=$(phc_ctl /dev/ptp1 get | awk '/clock time is/ {print $5}') && \
sec=$(echo $now | awk -F. '{print $1}') && \
base_time="$(((sec + 2) * 1000000000))" && \
echo "base time ${base_time}"
tc filter add dev swp2 ingress flower skip_sw \
dst_mac 42:be:24:9b:76:20 \
action gate base-time ${base_time} \
sched-entry OPEN 60000 -1 -1 \
sched-entry CLOSE 40000 -1 -1 \
action trap
Sender (enetc):
now=$(phc_ctl /dev/ptp0 get | awk '/clock time is/ {print $5}') && \
sec=$(echo $now | awk -F. '{print $1}') && \
base_time="$(((sec + 2) * 1000000000))" && \
echo "base time ${base_time}"
tc qdisc add dev eno0 parent root taprio \
num_tc 8 \
map 0 1 2 3 4 5 6 7 \
queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
base-time ${base_time} \
sched-entry S 01 50000 \
sched-entry S 00 50000 \
flags 2
ping -A 192.168.1.1
PING 192.168.1.1 (192.168.1.1): 56 data bytes
...
^C
--- 192.168.1.1 ping statistics ---
1425 packets transmitted, 1424 packets received, 0% packet loss
round-trip min/avg/max = 0.322/0.361/0.990 ms
And just for comparison, with the tc-taprio schedule deleted:
ping -A 192.168.1.1
PING 192.168.1.1 (192.168.1.1): 56 data bytes
...
^C
--- 192.168.1.1 ping statistics ---
33 packets transmitted, 19 packets received, 42% packet loss
round-trip min/avg/max = 0.336/0.464/0.597 ms
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement tc-flower offloads for redirect, trap and drop using
non-critical virtual links.
Commands which were tested to work are:
# Send frames received on swp2 with a DA of 42:be:24:9b:76:20 to the
# CPU and to swp3. This type of key (DA only) when the port's VLAN
# awareness state is off.
tc qdisc add dev swp2 clsact
tc filter add dev swp2 ingress flower skip_sw dst_mac 42:be:24:9b:76:20 \
action mirred egress redirect dev swp3 \
action trap
# Drop frames received on swp2 with a DA of 42:be:24:9b:76:20, a VID
# of 100 and a PCP of 0.
tc filter add dev swp2 ingress protocol 802.1Q flower skip_sw \
dst_mac 42:be:24:9b:76:20 vlan_id 100 vlan_prio 0 action drop
Under the hood, all rules match on DMAC, VID and PCP, but when VLAN
filtering is disabled, those are set internally by the driver to the
port-based defaults. Because we would be put in an awkward situation if
the user were to change the VLAN filtering state while there are active
rules (packets would no longer match on the specified keys), we simply
deny changing vlan_filtering unless the list of flows offloaded via
virtual links is empty. Then the user can re-add new rules.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Virtual links are a sja1105 hardware concept of executing various flow
actions based on a key extracted from the frame's DMAC, VID and PCP.
Currently the tc-flower offload code supports only parsing the DMAC if
that is the broadcast MAC address, and the VLAN PCP. Extract the key
parsing logic from the L2 policers functionality and move it into its
own function, after adding extra logic for matching on any DMAC and VID.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds the register definitions for the:
- VL Lookup Table
- VL Policing Table
- VL Forwarding Table
- VL Forwarding Parameters Table
These are needed in order to perform TTEthernet operations: QoS
classification, flow-based policing and/or frame redirecting with the
switch.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The addition of sja1105_port_status_ether structure into the
statistics causes the frame size to go over the warning limit:
drivers/net/dsa/sja1105/sja1105_ethtool.c:421:6: error: stack frame size of 1104 bytes in function 'sja1105_get_ethtool_stats' [-Werror,-Wframe-larger-than=]
Use dynamic allocation to avoid this.
Fixes: 336aa67bd0 ("net: dsa: sja1105: show more ethtool statistics counters for P/Q/R/S")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
It looks like the sja1105 external timestamping input is not as generic
as we thought. When fed a signal with 50% duty cycle, it will timestamp
both the rising and the falling edge. When fed a short pulse signal,
only the timestamp of the falling edge will be seen in the PTPSYNCTS
register, because that of the rising edge had been overwritten. So the
moral is: don't feed it short pulse inputs.
Luckily this is not a complete deal breaker, as we can still work with
1 Hz square waves. But the problem is that the extts polling period was
not dimensioned enough for this input signal. If we leave the period at
half a second, we risk losing timestamps due to jitter in the measuring
process. So we need to increase it to 4 times per second.
Also, the very least we can do to inform the user is to deny any other
flags combination than with PTP_RISING_EDGE and PTP_FALLING_EDGE both
set.
Fixes: 747e5eb31d ("net: dsa: sja1105: configure the PTP_CLK pin as EXT_TS or PER_OUT")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit d1cbfd771c ("ptp_clock: Allow for it to be optional") changed
all PTP-capable Ethernet drivers from `select PTP_1588_CLOCK` to `imply
PTP_1588_CLOCK`, "in order to break the hard dependency between the PTP
clock subsystem and ethernet drivers capable of being clock providers."
As a result it is possible to build PTP-capable Ethernet drivers without
the PTP subsystem by deselecting PTP_1588_CLOCK. Drivers are required to
handle the missing dependency gracefully.
Some PTP-capable Ethernet drivers (e.g., TI_CPSW) factor their PTP code
out into separate drivers (e.g., TI_CPTS_MOD). The above commit also
changed these PTP-specific drivers to `imply PTP_1588_CLOCK`, making it
possible to build them without the PTP subsystem. But as Grygorii
Strashko noted in [1]:
On Wed, Apr 22, 2020 at 02:16:11PM +0300, Grygorii Strashko wrote:
> Another question is that CPTS completely nonfunctional in this case and
> it was never expected that somebody will even try to use/run such
> configuration (except for random build purposes).
In my view, enabling a PTP-specific driver without the PTP subsystem is
a configuration error made possible by the above commit. Kconfig should
not allow users to create a configuration with missing dependencies that
results in "completely nonfunctional" drivers.
I audited all network drivers that call ptp_clock_register() but merely
`imply PTP_1588_CLOCK` and found five PTP-specific drivers that are
likely nonfunctional without PTP_1588_CLOCK:
NET_DSA_MV88E6XXX_PTP
NET_DSA_SJA1105_PTP
MACB_USE_HWSTAMP
CAVIUM_PTP
TI_CPTS_MOD
Note how these symbols all reference PTP or timestamping in their name;
this is a clue that they depend on PTP_1588_CLOCK.
Change them from `imply PTP_1588_CLOCK` [2] to `depends on PTP_1588_CLOCK`.
I'm not using `select PTP_1588_CLOCK` here because PTP_1588_CLOCK has
its own dependencies, which `select` would not transitively apply.
Additionally, remove the `select NET_PTP_CLASSIFY` from CPTS_TI_MOD;
PTP_1588_CLOCK already selects that.
[1]: https://lore.kernel.org/lkml/c04458ed-29ee-1797-3a11-7f3f560553e6@ti.com/
[2]: NET_DSA_SJA1105_PTP had never declared any type of dependency on
PTP_1588_CLOCK (`imply` or otherwise); adding a `depends on PTP_1588_CLOCK`
here seems appropriate.
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Nicolas Pitre <nico@fluxnic.net>
Cc: Grygorii Strashko <grygorii.strashko@ti.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Fixes: d1cbfd771c ("ptp_clock: Allow for it to be optional")
Signed-off-by: Clay McClure <clay@daemons.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some boards do not have the RX_ER MII signal connected. Normally in such
situation, those pins would be grounded, but then again, some boards
left it electrically floating.
When sending traffic to those switch ports, one can see that the
N_SOFERR statistics counter is incrementing once per each packet. The
user manual states for this counter that it may count the number of
frames "that have the MII error input being asserted prior to or
up to the SOF delimiter byte". So the switch MAC is sampling an
electrically floating signal, and preventing proper traffic reception
because of that.
As a workaround, enable the internal weak pull-downs on the input pads
for the MII control signals. This way, a floating signal would be
internally tied to ground.
The logic levels of signals which _are_ externally driven should not be
bothered by this 40-50 KOhm internal resistor. So it is not an issue to
enable the internal pull-down unconditionally, irrespective of PHY
interface type (MII, RMII, RGMII, SGMII) and of board layout.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds complete support for manipulating the L2 Policing Tables
from this switch. There are 45 table entries, one entry per each port
and traffic class, and one dedicated entry for broadcast traffic for
each ingress port.
Policing entries are shareable, and we use this functionality to support
shared block filters.
We are modeling broadcast policers as simple tc-flower matches on
dst_mac. As for the traffic class policers, the switch only deduces the
traffic class from the VLAN PCP field, so it makes sense to model this
as a tc-flower match on vlan_prio.
How to limit broadcast traffic coming from all front-panel ports to a
cumulated total of 10 Mbit/s:
tc qdisc add dev sw0p0 ingress_block 1 clsact
tc qdisc add dev sw0p1 ingress_block 1 clsact
tc qdisc add dev sw0p2 ingress_block 1 clsact
tc qdisc add dev sw0p3 ingress_block 1 clsact
tc filter add block 1 flower skip_sw dst_mac ff:ff:ff:ff:ff:ff \
action police rate 10mbit burst 64k
How to limit traffic with VLAN PCP 0 (also includes untagged traffic) to
100 Mbit/s on port 0 only:
tc filter add dev sw0p0 ingress protocol 802.1Q flower skip_sw \
vlan_prio 0 action police rate 100mbit burst 64k
The broadcast, VLAN PCP and port policers are compatible with one
another (can be installed at the same time on a port).
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds partial configuration support for the L2 Policing Table. Out
of the 45 policing entries, only 5 are used (one for each port), in a
shared manner. All 8 traffic classes, and the broadcast policer, are
redirected to a common instance which belongs to the ingress port.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It looks like the P/Q/R/S series supports some more counters,
generically named "Ethernet statistics counter", which we were not
printing. Add them.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On this switch, the frame length enforcements are performed by the
ingress policers. There are 2 types of those: regular L2 (also called
best-effort) and Virtual Link policers (an ARINC664/AFDX concept for
defining L2 streams with certain QoS abilities). To avoid future
confusion, I prefer to call the reset reason "Best-effort policers",
even though the VL policers are not yet supported.
We also need to change the setup of the initial static config, such that
DSA calls to .change_mtu (which are expensive) become no-ops and don't
reset the switch 5 times.
A driver-level decision is to unconditionally allow single VLAN-tagged
traffic on all ports. The CPU port must accept an additional VLAN header
for the DSA tag, which is again a driver-level decision.
The policers actually count bytes not only from the SDU, but also from
the Ethernet header and FCS, so those need to be accounted for as well.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The SJA1105 switch family has a PTP_CLK pin which emits a signal with
fixed 50% duty cycle, but variable frequency and programmable start time.
On the second generation (P/Q/R/S) switches, this pin supports even more
functionality. The use case described by the hardware documents talks
about synchronization via oneshot pulses: given 2 sja1105 switches,
arbitrarily designated as a master and a slave, the master emits a
single pulse on PTP_CLK, while the slave is configured to timestamp this
pulse received on its PTP_CLK pin (which must obviously be configured as
input). The difference between the timestamps then exactly becomes the
slave offset to the master.
The only trouble with the above is that the hardware is very much tied
into this use case only, and not very generic beyond that:
- When emitting a oneshot pulse, instead of being told when to emit it,
the switch just does it "now" and tells you later what time it was,
via the PTPSYNCTS register. [ Incidentally, this is the same register
that the slave uses to collect the ext_ts timestamp from, too. ]
- On the sync slave, there is no interrupt mechanism on reception of a
new extts, and no FIFO to buffer them, because in the foreseen use
case, software is in control of both the master and the slave pins,
so it "knows" when there's something to collect.
These 2 problems mean that:
- We don't support (at least yet) the quirky oneshot mode exposed by
the hardware, just normal periodic output.
- We abuse the hardware a little bit when we expose generic extts.
Because there's no interrupt mechanism, we need to poll at double the
frequency we expect to receive a pulse. Currently that means a
non-configurable "twice a second".
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The AVB table contains the CAS_MASTER field (to be added in the next
patch) which decides the direction of the PTP_CLK pin.
Reconfiguring this field dynamically is highly preferable to having to
reset the switch and upload a new static configuration, so we add
support for exactly that.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Because the PTP_CLK pin starts toggling only at a time higher than the
current PTP clock, this helper from the time-aware shaper code comes in
handy here as well. We'll use it to transform generic user input for the
perout request into valid input for the sja1105 hardware.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
These fields configure the destination and source MAC address that the
switch will put in the Ethernet frames sent towards the CPU port that
contain RX timestamps for PTP.
These fields do not enable the feature itself, that is configured via
SEND_META0 and SEND_META1 in the General Params table.
The implication of this patch is that the AVB Params table will always
be present in the static config. Which doesn't really hurt.
This is needed because in a future patch, we will add another field from
this table, CAS_MASTER, for configuring the PTP_CLK pin function. That
can be configured irrespective of whether RX timestamping is enabled or
not, so always having this table present is going to simplify things a
bit.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
SJA1105 switches R and S have one SerDes port with an 802.3z
quasi-compatible PCS, hardwired on port 4. The other ports are still
MII/RMII/RGMII. The PCS performs rate adaptation to lower link speeds;
the MAC on this port is hardwired at gigabit. Only full duplex is
supported.
The SGMII port can be configured as part of the static config tables, as
well as through a dedicated SPI address region for its pseudo-clause-22
registers. However it looks like the static configuration is not
able to change some out-of-reset values (like the value of MII_BMCR), so
at the end of the day, having code for it is utterly pointless. We are
just going to use the pseudo-C22 interface.
Because the PCS gets reset when the switch resets, we have to add even
more restoration logic to sja1105_static_config_reload, otherwise the
SGMII port breaks after operations such as enabling PTP timestamping
which require a switch reset.
>From PHYLINK perspective, the switch supports *only* SGMII (it doesn't
support 1000Base-X). It also doesn't expose access to the raw config
word for in-band AN in registers MII_ADV/MII_LPA.
It is able to work in the following modes:
- Forced speed
- SGMII in-band AN slave (speed received from PHY)
- SGMII in-band AN master (acting as a PHY)
The latter mode is not supported by this patch. It is even unclear to me
how that would be described. There is some code for it left in the
patch, but 'an_master' is always passed as false.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
When sja1105_init_mii_settings iterates over the port list, it prints
this message for disabled ports, because they don't have a valid
phy-mode:
[ 4.778702] sja1105 spi2.0: Unsupported PHY mode unknown!
Suggested-by: Andrew Lunn <andrew@lunn.ch>
Suggested-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The switches supported so far by the driver only have non-SerDes ports,
so they should be configured in the PHYLINK callback that provides the
resolved PHY link parameters.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Validate 100baseT1_Full to make this driver work with TJA1102 PHY.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Acked-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Propagate the resolved link configuration down via DSA's
phylink_mac_link_up() operation to allow split PCS/MAC to work.
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
The sja1105_parse_ports_node function was tested only on device trees
where all ports were enabled. Fix this check so that the driver
continues to probe only with the ports where status is not "disabled",
as expected.
Fixes: 8aa9ebccae ("net: dsa: Introduce driver for NXP SJA1105 5-port L2 switch")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It is possible to stack multiple DSA switches in a way that they are not
part of the tree (disjoint) but the DSA master of a switch is a DSA
slave of another. When that happens switch drivers may have to know this
is the case so as to determine whether their tagging protocol has a
remove chance of working.
This is useful for specific switch drivers such as b53 where devices
have been known to be stacked in the wild without the Broadcom tag
protocol supporting that feature. This allows b53 to continue supporting
those devices by forcing the disabling of Broadcom tags on the outermost
switches if necessary.
The get_tag_protocol() function is therefore updated to gain an
additional enum dsa_tag_protocol argument which denotes the current
tagging protocol used by the DSA master we are attached to, else
DSA_TAG_PROTO_NONE for the top of the dsa_switch_tree.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are 3 things that are wrong with the DSA deferred xmit mechanism:
1. Its introduction has made the DSA hotpath ever so slightly more
inefficient for everybody, since DSA_SKB_CB(skb)->deferred_xmit needs
to be initialized to false for every transmitted frame, in order to
figure out whether the driver requested deferral or not (a very rare
occasion, rare even for the only driver that does use this mechanism:
sja1105). That was necessary to avoid kfree_skb from freeing the skb.
2. Because L2 PTP is a link-local protocol like STP, it requires
management routes and deferred xmit with this switch. But as opposed
to STP, the deferred work mechanism needs to schedule the packet
rather quickly for the TX timstamp to be collected in time and sent
to user space. But there is no provision for controlling the
scheduling priority of this deferred xmit workqueue. Too bad this is
a rather specific requirement for a feature that nobody else uses
(more below).
3. Perhaps most importantly, it makes the DSA core adhere a bit too
much to the NXP company-wide policy "Innovate Where It Doesn't
Matter". The sja1105 is probably the only DSA switch that requires
some frames sent from the CPU to be routed to the slave port via an
out-of-band configuration (register write) rather than in-band (DSA
tag). And there are indeed very good reasons to not want to do that:
if that out-of-band register is at the other end of a slow bus such
as SPI, then you limit that Ethernet flow's throughput to effectively
the throughput of the SPI bus. So hardware vendors should definitely
not be encouraged to design this way. We do _not_ want more
widespread use of this mechanism.
Luckily we have a solution for each of the 3 issues:
For 1, we can just remove that variable in the skb->cb and counteract
the effect of kfree_skb with skb_get, much to the same effect. The
advantage, of course, being that anybody who doesn't use deferred xmit
doesn't need to do any extra operation in the hotpath.
For 2, we can create a kernel thread for each port's deferred xmit work.
If the user switch ports are named swp0, swp1, swp2, the kernel threads
will be named swp0_xmit, swp1_xmit, swp2_xmit (there appears to be a 15
character length limit on kernel thread names). With this, the user can
change the scheduling priority with chrt $(pidof swp2_xmit).
For 3, we can actually move the entire implementation to the sja1105
driver.
So this patch deletes the generic implementation from the DSA core and
adds a new one, more adequate to the requirements of PTP TX
timestamping, in sja1105_main.c.
Suggested-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
I finally found out how the 4 management route slots are supposed to
be used, but.. it's not worth it.
The description from the comment I've just deleted in this commit is
still true: when more than 1 management slot is active at the same time,
the switch will match frames incoming [from the CPU port] on the lowest
numbered management slot that matches the frame's DMAC.
My issue was that one was not supposed to statically assign each port a
slot. Yes, there are 4 slots and also 4 non-CPU ports, but that is a
mere coincidence.
Instead, the switch can be used like this: every management frame gets a
slot at the right of the most recently assigned slot:
Send mgmt frame 1 through S0: S0 x x x
Send mgmt frame 2 through S1: S0 S1 x x
Send mgmt frame 3 through S2: S0 S1 S2 x
Send mgmt frame 4 through S3: S0 S1 S2 S3
The difference compared to the old usage is that the transmission of
frames 1-4 doesn't need to wait until the completion of the management
route. It is safe to use a slot to the right of the most recently used
one, because by protocol nobody will program a slot to your left and
"steal" your route towards the correct egress port.
So there is a potential throughput benefit here.
But mgmt frame 5 has no more free slot to use, so it has to wait until
_all_ of S0, S1, S2, S3 are full, in order to use S0 again.
And that's actually exactly the problem: I was looking for something
that would bring more predictable transmission latency, but this is
exactly the opposite: 3 out of 4 frames would be transmitted quicker,
but the 4th would draw the short straw and have a worse worst-case
latency than before.
Useless.
Things are made even worse by PTP TX timestamping, which is something I
won't go deeply into here. Suffice to say that the fact there is a
driver-level lock on the SPI bus offsets any potential throughput gains
that parallelism might bring.
So there's no going back to the multi-slot scheme, remove the
"mgmt_slot" variable from sja1105_port and the dummy static assignment
made at probe time.
While passing by, also remove the assignment to casc_port altogether.
Don't pretend that we support cascaded setups.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When disabling PTP timestamping, don't reset the switch with the new
static config until all existing PTP frames have been timestamped on the
RX path or dropped. There's nothing we can do with these afterwards.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
And move the queue of skb's waiting for RX timestamps into the ptp_data
structure, since it isn't needed if PTP is not compiled.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For first-generation switches (SJA1105E and SJA1105T):
- TPID means C-Tag (typically 0x8100)
- TPID2 means S-Tag (typically 0x88A8)
While for the second generation switches (SJA1105P, SJA1105Q, SJA1105R,
SJA1105S) it is the other way around:
- TPID means S-Tag (typically 0x88A8)
- TPID2 means C-Tag (typically 0x8100)
In other words, E/T tags untagged traffic with TPID, and P/Q/R/S with
TPID2.
So the patch mentioned below fixed VLAN filtering for P/Q/R/S, but broke
it for E/T.
We strive for a common code path for all switches in the family, so just
lie in the static config packing functions that TPID and TPID2 are at
swapped bit offsets than they actually are, for P/Q/R/S. This will make
both switches understand TPID to be ETH_P_8021Q and TPID2 to be
ETH_P_8021AD. The meaning from the original E/T was chosen over P/Q/R/S
because E/T is actually the one with public documentation available
(UM10944.pdf).
Fixes: f9a1a7646c ("net: dsa: sja1105: Reverse TPID and TPID2")
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The check originates from the initial implementation which was not based
on PTP time but on a standalone clock source. In the meantime we can now
program the PTPSCHTM register at runtime with the dynamic base time
(actually with a value that is 200 ns smaller, to avoid writing DELTA=0
in the Schedule Entry Points Parameters Table). And we also have logic
for moving the actual base time in the future of the PHC's current time
base, so the check for zero serves no purpose, since even if the user
will specify zero, that's not what will end up in the static config
table where the limitation is.
Fixes: 86db36a347 ("net: dsa: sja1105: Implement state machine for TAS with PTP clock source")
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When activating tc-taprio offload on the switch ports, the TAS state
machine will try to check whether it is running or not, but will find
both the STARTED and STOPPED bits as false in the
sja1105_tas_check_running function. So the function will return -EINVAL
(an abnormal situation) and the kernel will keep printing this from the
TAS FSM workqueue:
[ 37.691971] sja1105 spi0.1: An operation returned -22
The reason is that the underlying function that gets called,
sja1105_ptp_commit, does not actually do a SPI_READ, but a SPI_WRITE. So
the command buffer remains initialized with zeroes instead of retrieving
the hardware state. Fix that.
Fixes: 41603d78b3 ("net: dsa: sja1105: Make the PTP command read-write")
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The PTP egress timestamp N must be captured from register PTPEGR_TS[n],
where n = 2 * PORT + TSREG. There are 10 PTPEGR_TS registers, 2 per
port. We are only using TSREG=0.
As opposed to the management slots, which are 4 in number
(SJA1105_NUM_PORTS, minus the CPU port). Any management frame (which
includes PTP frames) can be sent to any non-CPU port through any
management slot. When the CPU port is not the last port (#4), there will
be a mismatch between the slot and the port number.
Luckily, the only mainline occurrence with this switch
(arch/arm/boot/dts/ls1021a-tsn.dts) does have the CPU port as #4, so the
issue did not manifest itself thus far.
Fixes: 47ed985e97 ("net: dsa: sja1105: Add logic for TX timestamping")
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This function was using configuration of port 0 in devicetree for all ports.
In case CPU port was not 0, the delay settings was ignored. This resulted not
working communication between CPU and the switch.
Fixes: f5b8631c29 ("net: dsa: sja1105: Error out if RGMII delays are requested in DT")
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We don't really need 10k species of reset. Remove everything except cold
reset which is what is actually used. Too bad the hardware designers
couldn't agree to use the same bit field for rev 1 and rev 2, so the
(*reset_cmd) function pointer is there to stay.
However let's simplify the prototype and give it a struct dsa_switch (we
want to avoid forward-declarations of structures, in this case struct
sja1105_private, wherever we can).
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tested using the following bash script and the tc from iproute2-next:
#!/bin/bash
set -e -u -o pipefail
NSEC_PER_SEC="1000000000"
gatemask() {
local tc_list="$1"
local mask=0
for tc in ${tc_list}; do
mask=$((${mask} | (1 << ${tc})))
done
printf "%02x" ${mask}
}
if ! systemctl is-active --quiet ptp4l; then
echo "Please start the ptp4l service"
exit
fi
now=$(phc_ctl /dev/ptp1 get | gawk '/clock time is/ { print $5; }')
# Phase-align the base time to the start of the next second.
sec=$(echo "${now}" | gawk -F. '{ print $1; }')
base_time="$(((${sec} + 1) * ${NSEC_PER_SEC}))"
tc qdisc add dev swp5 parent root handle 100 taprio \
num_tc 8 \
map 0 1 2 3 5 6 7 \
queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
base-time ${base_time} \
sched-entry S $(gatemask 7) 100000 \
sched-entry S $(gatemask "0 1 2 3 4 5 6") 400000 \
clockid CLOCK_TAI flags 2
The "state machine" is a workqueue invoked after each manipulation
command on the PTP clock (reset, adjust time, set time, adjust
frequency) which checks over the state of the time-aware scheduler.
So it is not monitored periodically, only in reaction to a PTP command
typically triggered from a userspace daemon (linuxptp). Otherwise there
is no reason for things to go wrong.
Now that the timecounter/cyclecounter has been replaced with hardware
operations on the PTP clock, the TAS Kconfig now depends upon PTP and
the standalone clocksource operating mode has been removed.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The PTPSTRTSCH and PTPSTOPSCH bits are actually readable and indicate
whether the time-aware scheduler is running or not. We will be using
that for monitoring the scheduler in the next patch, so refactor the PTP
command API in order to allow that.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sometimes it can be quite opaque even for me why the driver decided to
reset the switch. So instead of adding dump_stack() calls each time for
debugging, just add a reset reason to sja1105_static_config_reload
calls which gets printed to the console.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The purpose here is to avoid ptp4l fail due to this condition:
timed out while polling for tx timestamp
increasing tx_timestamp_timeout may correct this issue, but it is likely caused by a driver bug
port 1: send peer delay request failed
So either reset the switch before the management frame was sent, or
after it was timestamped as well, but not in the middle.
The condition may arise either due to a true timeout (i.e. because
re-uploading the static config takes time), or due to the TX timestamp
actually getting lost due to reset. For the former we can increase
tx_timestamp_timeout in userspace, for the latter we need this patch.
Locking all traffic during switch reset does not make sense at all,
though. Forcing all CPU-originated traffic to potentially block waiting
for a sleepable context to send > 800 bytes over SPI is not a good idea.
Flows that are autonomously forwarded by the switch will get dropped
anyway during switch reset no matter what. So just let all other
CPU-originated traffic be dropped as well.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The PTP time of the switch is not preserved when uploading a new static
configuration. Work around this hardware oddity by reading its PTP time
before a static config upload, and restoring it afterwards.
Static config changes are expected to occur at runtime even in scenarios
directly related to PTP, i.e. the Time-Aware Scheduler of the switch is
programmed in this way.
Perhaps the larger implication of this patch is that the PTP .gettimex64
and .settime functions need to be exposed to sja1105_main.c, where the
PTP lock needs to be held during this entire process. So their core
implementation needs to move to some common functions which get exposed
in sja1105_ptp.h.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Through the PTP_SYS_OFFSET_EXTENDED ioctl, it is possible for userspace
applications (i.e. phc2sys) to compensate for the delays incurred while
reading the PHC's time.
The task itself of taking the software timestamp is delegated to the SPI
subsystem, through the newly introduced API in struct spi_transfer. The
goal is to cross-timestamp I/O operations on the switch's PTP clock with
values in the local system clock (CLOCK_REALTIME). For that we need to
understand a bit of the hardware internals.
The 'read PTP time' message is a 12 byte structure, first 4 bytes of
which represent the SPI header, and the last 8 bytes represent the
64-bit PTP time. The switch itself starts processing the command
immediately after receiving the last bit of the address, i.e. at the
middle of byte 3 (last byte of header). The PTP time is shadowed to a
buffer register in the switch, and retrieved atomically during the
subsequent SPI frames.
A similar thing goes on for the 'write PTP time' message, although in
that case the switch waits until the 64-bit PTP time becomes fully
available before taking any action. So the byte that needs to be
software-timestamped is byte 11 (last) of the transfer.
The patch creates a common (and local) sja1105_xfer implementation for
the SPI I/O, and offers 3 front-ends:
- sja1105_xfer_u32 and sja1105_xfer_u64: these are capable of optionally
requesting a PTP timestamp
- sja1105_xfer_buf: this is for large transfers (e.g. the static config
buffer) and other misc data, and there is no point in giving
timestamping capabilities to this.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Before this change of_get_phy_mode() returned an enum,
phy_interface_t. On error, -ENODEV etc, is returned. If the result of
the function is stored in a variable of type phy_interface_t, and the
compiler has decided to represent this as an unsigned int, comparision
with -ENODEV etc, is a signed vs unsigned comparision.
Fix this problem by changing the API. Make the function return an
error, or 0 on success, and pass a pointer, of type phy_interface_t,
where the phy mode should be stored.
v2:
Return with *interface set to PHY_INTERFACE_MODE_NA on error.
Add error checks to all users of of_get_phy_mode()
Fixup a few reverse christmas tree errors
Fixup a few slightly malformed reverse christmas trees
v3:
Fix 0-day reported errors.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
The only slightly tricky merge conflict was the netdevsim because the
mutex locking fix overlapped a lot of driver reload reorganization.
The rest were (relatively) trivial in nature.
Signed-off-by: David S. Miller <davem@davemloft.net>
An earlier bugfix introduced a dependency on CONFIG_NET_SCH_TAPRIO,
but this missed the case of NET_SCH_TAPRIO=m and NET_DSA_SJA1105=y,
which still causes a link error:
drivers/net/dsa/sja1105/sja1105_tas.o: In function `sja1105_setup_tc_taprio':
sja1105_tas.c:(.text+0x5c): undefined reference to `taprio_offload_free'
sja1105_tas.c:(.text+0x3b4): undefined reference to `taprio_offload_get'
drivers/net/dsa/sja1105/sja1105_tas.o: In function `sja1105_tas_teardown':
sja1105_tas.c:(.text+0x6ec): undefined reference to `taprio_offload_free'
Change the dependency to only allow selecting the TAS code when it
can link against the taprio code.
Fixes: a8d570de0c ("net: dsa: sja1105: Add dependency for NET_DSA_SJA1105_TAS")
Fixes: 317ab5b86c ("net: dsa: sja1105: Configure the Time-Aware Scheduler via tc-taprio offload")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that ports are dynamically listed in the fabric, there is no need
to provide a special helper to allocate the dsa_switch structure. This
will give more flexibility to drivers to embed this structure as they
wish in their private structure.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Like the dsa_switch_tree structures, the dsa_port structures will be
allocated on switch registration.
The SJA1105 driver is the only one accessing the dsa_port structure
after the switch allocation and before the switch registration.
For that reason, move switch registration prior to assigning the priv
member of the dsa_port structures.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Do not let the drivers access the ds->ports static array directly
while there is a dsa_to_port helper for this purpose.
At the same time, un-const this helper since the SJA1105 driver
assigns the priv member of the returned dsa_port structure.
Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Adjusting the hardware clock (PTPCLKVAL, PTPCLKADD, PTPCLKRATE) is a
requirement for the auxiliary PTP functionality of the switch
(TTEthernet, PPS input, PPS output).
Therefore we need to switch to using these registers to keep a
synchronized time in hardware, instead of the timecounter/cyclecounter
implementation, which is reliant on the free-running PTPTSCLK.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch corrects the SPDX License Identifier style
in header files related to Distributed Switch Architecture
drivers for NXP SJA1105 series Ethernet switch support.
It uses an expilict block comment for the SPDX License
Identifier.
Changes made by using a script provided by Joe Perches here:
https://lkml.org/lkml/2019/2/7/46.
Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Nishad Kamdar <nishadkamdar@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reworks the SPI transfer implementation to make use of more of the
SPI core features. The main benefit is to avoid the memcpy in
sja1105_xfer_buf().
The memcpy was only needed because the function was transferring a
single buffer at a time. So it needed to copy the caller-provided buffer
at buf + 4, to store the SPI message header in the "headroom" area.
But the SPI core supports scatter-gather messages, comprised of multiple
transfers. We can actually use those to break apart every SPI message
into 2 transfers: one for the header and one for the actual payload.
To keep the behavior the same regarding the chip select signal, it is
necessary to tell the SPI core to de-assert the chip select after each
chunk. This was not needed before, because each spi_message contained
only 1 single transfer.
The meaning of the per-transfer cs_change=1 is:
- If the transfer is the last one of the message, keep CS asserted
- Otherwise, deassert CS
We need to deassert CS in the "otherwise" case, which was implicit
before.
Avoiding the memcpy creates yet another opportunity. The device can't
process more than 256 bytes of SPI payload at a time, so the
sja1105_xfer_long_buf() function used to exist, to split the larger
caller buffer into chunks.
But these chunks couldn't be used as scatter/gather buffers for
spi_message until now, because of that memcpy (we would have needed more
memory for each chunk). So we can now remove the sja1105_xfer_long_buf()
function and have a single implementation for long and short buffers.
Another benefit is lower usage of stack memory. Previously we had to
store 2 SPI buffers for each chunk. Due to the elimination of the
memcpy, we can now send pointers to the actual chunks from the
caller-supplied buffer to the SPI core.
Since the patch merges two functions into a rewritten implementation,
the function prototype was also changed, mainly for cosmetic consistency
with the structures used within it.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is a cosmetic patch that reduces some boilerplate in the SPI
interaction of the driver.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The PTP command register contains enable bits for:
- Putting the 64-bit PTPCLKVAL register in add/subtract or write mode
- Taking timestamps off of the corrected vs free-running clock
- Starting/stopping the TTEthernet scheduling
- Starting/stopping PPS output
- Resetting the switch
When a command needs to be issued (e.g. "change the PTPCLKVAL from write
mode to add/subtract mode"), one cannot simply write to the command
register setting the PTPCLKADD bit to 1, because that would zeroize the
other settings. One also cannot do a read-modify-write (that would be
too easy for this hardware) because not all bits of the command register
are readable over SPI.
So this leaves us with the only option of keeping the value of the PTP
command register in the driver, and operating on that.
Actually there are 2 types of PTP operations now:
- Operations that modify the cached PTP command. These operate on
ptp_data->cmd as a pointer.
- Operations that apply all previously cached PTP settings, but don't
otherwise cache what they did themselves. The sja1105_ptp_reset
function is such an example. It copies the ptp_data->cmd on stack
before modifying and writing it to SPI.
This practically means that struct sja1105_ptp_cmd is no longer an
implementation detail, since it needs to be stored in full into struct
sja1105_ptp_data, and hence in struct sja1105_private. So the (*ptp_cmd)
function prototype can change and take struct sja1105_ptp_cmd as second
argument now.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is a non-functional change with 2 goals (both for the case when
CONFIG_NET_DSA_SJA1105_PTP is not enabled):
- Reduce the size of the sja1105_private structure.
- Make the PTP code more self-contained.
Leaving priv->ptp_data.lock to be initialized in sja1105_main.c is not a
leftover: it will be used in a future patch "net: dsa: sja1105: Restore
PTP time after switch reset".
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The new rule (as already started for sja1105_tas.h) is for functions of
optional driver components (ones which may be disabled via Kconfig - PTP
and TAS) to take struct dsa_switch *ds instead of struct sja1105_private
*priv as first argument.
This is so that forward-declarations of struct sja1105_private can be
avoided.
So make sja1105_ptp.h the second user of this rule.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We need priv->ptp_caps to hold a structure and not just a pointer,
because we use container_of in the various PTP callbacks.
Therefore, the sja1105_ptp_caps structure declared in the global memory
of the driver serves no further purpose after copying it into
priv->ptp_caps.
So just populate priv->ptp_caps with the needed operations and remove
sja1105_ptp_caps.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix sparse warnings:
drivers/net/dsa/sja1105/sja1105_spi.c:159:5: warning: symbol 'sja1105_xfer_long_buf' was not declared. Should it be static?
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Amazingly, of all features, this does not require a switch reset.
Tested with:
tc qdisc add dev swp2 clsact
tc filter add dev swp2 ingress matchall skip_sw \
action mirred egress mirror dev swp3
tc filter show dev swp2 ingress
tc filter del dev swp2 ingress pref 49152
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The most commonly called function in the driver is long due for a
rename. The "packed" word is redundant (it doesn't make sense to
transfer an unpacked structure, since that is in CPU endianness yadda
yadda), and the "spi" word is also redundant since argument 2 of the
function is SPI_READ or SPI_WRITE.
As for the sja1105_spi_send_long_packed_buf function, it is only being
used from sja1105_spi.c, so remove its global prototype.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Having a function that takes a variable number of unpacked bytes which
it generically calls an "int" is confusing and makes auditing patches
next to impossible.
We only use spi_send_int with the int sizes of 32 and 64 bits. So just
make the spi_send_int function less generic and replace it with the
appropriate two explicit functions, which can now type-check the int
pointer type.
Note that there is still a small weirdness in the u32 function, which
has to convert it to a u64 temporary. This is because of how the packing
API works at the moment, but the weirdness is at least hidden from
callers of sja1105_xfer_u32 now.
Suggested-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently this stack trace can be seen with CONFIG_DEBUG_ATOMIC_SLEEP=y:
[ 41.568348] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:909
[ 41.576757] in_atomic(): 1, irqs_disabled(): 0, pid: 208, name: ptp4l
[ 41.583212] INFO: lockdep is turned off.
[ 41.587123] CPU: 1 PID: 208 Comm: ptp4l Not tainted 5.3.0-rc6-01445-ge950f2d4bc7f-dirty #1827
[ 41.599873] [<c0313d7c>] (unwind_backtrace) from [<c030e13c>] (show_stack+0x10/0x14)
[ 41.607584] [<c030e13c>] (show_stack) from [<c1212d50>] (dump_stack+0xd4/0x100)
[ 41.614863] [<c1212d50>] (dump_stack) from [<c037dfc8>] (___might_sleep+0x1c8/0x2b4)
[ 41.622574] [<c037dfc8>] (___might_sleep) from [<c122ea90>] (__mutex_lock+0x48/0xab8)
[ 41.630368] [<c122ea90>] (__mutex_lock) from [<c122f51c>] (mutex_lock_nested+0x1c/0x24)
[ 41.638340] [<c122f51c>] (mutex_lock_nested) from [<c0c6fe08>] (sja1105_static_config_reload+0x30/0x27c)
[ 41.647779] [<c0c6fe08>] (sja1105_static_config_reload) from [<c0c7015c>] (sja1105_hwtstamp_set+0x108/0x1cc)
[ 41.657562] [<c0c7015c>] (sja1105_hwtstamp_set) from [<c0feb650>] (dev_ifsioc+0x18c/0x330)
[ 41.665788] [<c0feb650>] (dev_ifsioc) from [<c0febbd8>] (dev_ioctl+0x320/0x6e8)
[ 41.673064] [<c0febbd8>] (dev_ioctl) from [<c0f8b1f4>] (sock_ioctl+0x334/0x5e8)
[ 41.680340] [<c0f8b1f4>] (sock_ioctl) from [<c05404a8>] (do_vfs_ioctl+0xb0/0xa10)
[ 41.687789] [<c05404a8>] (do_vfs_ioctl) from [<c0540e3c>] (ksys_ioctl+0x34/0x58)
[ 41.695151] [<c0540e3c>] (ksys_ioctl) from [<c0301000>] (ret_fast_syscall+0x0/0x28)
[ 41.702768] Exception stack(0xe8495fa8 to 0xe8495ff0)
[ 41.707796] 5fa0: beff4a8c 00000001 00000011 000089b0 beff4a8c beff4a80
[ 41.715933] 5fc0: beff4a8c 00000001 0000000c 00000036 b6fa98c8 004e19c1 00000001 00000000
[ 41.724069] 5fe0: 004dcedc beff4a6c 004c0738 b6e7af4c
[ 41.729860] BUG: scheduling while atomic: ptp4l/208/0x00000002
[ 41.735682] INFO: lockdep is turned off.
Enabling RX timestamping will logically disturb the fastpath (processing
of meta frames). Replace bool hwts_rx_en with a bit that is checked
atomically from the fastpath and temporarily unset from the sleepable
context during a change of the RX timestamping process (a destructive
operation anyways, requires switch reset).
If found unset, the fastpath (net/dsa/tag_sja1105.c) will just drop any
received meta frame and not take the meta_lock at all.
Fixes: a602afd200 ("net: dsa: sja1105: Expose PTP timestamping ioctls to userspace")
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In sja1105_static_config_upload, in two cases memory is leaked: when
static_config_buf_prepare_for_upload fails and when sja1105_inhibit_tx
fails. In both cases config_buf should be released.
Fixes: 8aa9ebccae ("net: dsa: Introduce driver for NXP SJA1105 5-port L2 switch")
Fixes: 1a4c69406c ("net: dsa: sja1105: Prevent PHY jabbering during switch reset")
Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com>
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sometimes the PTP synchronization on the switch 'jumps':
ptp4l[11241.155]: rms 8 max 16 freq -21732 +/- 11 delay 742 +/- 0
ptp4l[11243.157]: rms 7 max 17 freq -21731 +/- 10 delay 744 +/- 0
ptp4l[11245.160]: rms 33592410 max 134217731 freq +192422 +/- 8530253 delay 743 +/- 0
ptp4l[11247.163]: rms 811631 max 964131 freq +10326 +/- 557785 delay 743 +/- 0
ptp4l[11249.166]: rms 261936 max 533876 freq -304323 +/- 126371 delay 744 +/- 0
ptp4l[11251.169]: rms 48700 max 57740 freq -20218 +/- 30532 delay 744 +/- 0
ptp4l[11253.171]: rms 14570 max 30163 freq -5568 +/- 7563 delay 742 +/- 0
ptp4l[11255.174]: rms 2914 max 3440 freq -22001 +/- 1667 delay 744 +/- 1
ptp4l[11257.177]: rms 811 max 1710 freq -22653 +/- 451 delay 744 +/- 1
ptp4l[11259.180]: rms 177 max 218 freq -21695 +/- 89 delay 741 +/- 0
ptp4l[11261.182]: rms 45 max 92 freq -21677 +/- 32 delay 742 +/- 0
ptp4l[11263.186]: rms 14 max 32 freq -21733 +/- 11 delay 742 +/- 0
ptp4l[11265.188]: rms 9 max 14 freq -21725 +/- 12 delay 742 +/- 0
ptp4l[11267.191]: rms 9 max 16 freq -21727 +/- 13 delay 742 +/- 0
ptp4l[11269.194]: rms 6 max 15 freq -21726 +/- 9 delay 743 +/- 0
ptp4l[11271.197]: rms 8 max 15 freq -21728 +/- 11 delay 743 +/- 0
ptp4l[11273.200]: rms 6 max 12 freq -21727 +/- 8 delay 743 +/- 0
ptp4l[11275.202]: rms 9 max 17 freq -21720 +/- 11 delay 742 +/- 0
ptp4l[11277.205]: rms 9 max 18 freq -21725 +/- 12 delay 742 +/- 0
Background: the switch only offers partial RX timestamps (24 bits) and
it is up to the driver to read the PTP clock to fill those timestamps up
to 64 bits. But the PTP clock readout needs to happen quickly enough (in
0.135 seconds, in fact), otherwise the PTP clock will wrap around 24
bits, condition which cannot be detected.
Looking at the 'max 134217731' value on output line 3, one can see that
in hex it is 0x8000003. Because the PTP clock resolution is 8 ns,
that means 0x1000000 in ticks, which is exactly 2^24. So indeed this is
a PTP clock wraparound, but the reason might be surprising.
What is going on is that sja1105_tstamp_reconstruct(priv, now, ts)
expects a "now" time that is later than the "ts" was snapshotted at.
This, of course, is obvious: we read the PTP time _after_ the partial RX
timestamp was received. However, the workqueue is processing frames from
a skb queue and reuses the same PTP time, read once at the beginning.
Normally the skb queue only contains one frame and all goes well. But
when the skb queue contains two frames, the second frame that gets
dequeued might have been partially timestamped by the RX MAC _after_ we
had read our PTP time initially.
The code was originally like that due to concerns that SPI access for
PTP time readout is a slow process, and we are time-constrained anyway
(aka: premature optimization). But some timing analysis reveals that the
time spent until the RX timestamp is completely reconstructed is 1 order
of magnitude lower than the 0.135 s deadline even under worst-case
conditions. So we can afford to read the PTP time for each frame in the
RX timestamping queue, which of course ensures that the full PTP time is
in the partial timestamp's future.
Fixes: f3097be21b ("net: dsa: sja1105: Add a state machine for RX timestamping")
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If CONFIG_NET_DSA_SJA1105_TAS=y and CONFIG_NET_SCH_TAPRIO=n,
below error can be found:
drivers/net/dsa/sja1105/sja1105_tas.o: In function `sja1105_setup_tc_taprio':
sja1105_tas.c:(.text+0x318): undefined reference to `taprio_offload_free'
sja1105_tas.c:(.text+0x590): undefined reference to `taprio_offload_get'
drivers/net/dsa/sja1105/sja1105_tas.o: In function `sja1105_tas_teardown':
sja1105_tas.c:(.text+0x610): undefined reference to `taprio_offload_free'
make: *** [vmlinux] Error 1
sja1105_tas needs tc-taprio, so this patch add the dependency for it.
Fixes: 317ab5b86c ("net: dsa: sja1105: Configure the Time-Aware Scheduler via tc-taprio offload")
Signed-off-by: Mao Wenan <maowenan@huawei.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
This qdisc offload is the closest thing to what the SJA1105 supports in
hardware for time-based egress shaping. The switch core really is built
around SAE AS6802/TTEthernet (a TTTech standard) but can be made to
operate similarly to IEEE 802.1Qbv with some constraints:
- The gate control list is a global list for all ports. There are 8
execution threads that iterate through this global list in parallel.
I don't know why 8, there are only 4 front-panel ports.
- Care must be taken by the user to make sure that two execution threads
never get to execute a GCL entry simultaneously. I created a O(n^4)
checker for this hardware limitation, prior to accepting a taprio
offload configuration as valid.
- The spec says that if a GCL entry's interval is shorter than the frame
length, you shouldn't send it (and end up in head-of-line blocking).
Well, this switch does anyway.
- The switch has no concept of ADMIN and OPER configurations. Because
it's so simple, the TAS settings are loaded through the static config
tables interface, so there isn't even place for any discussion about
'graceful switchover between ADMIN and OPER'. You just reset the
switch and upload a new OPER config.
- The switch accepts multiple time sources for the gate events. Right
now I am using the standalone clock source as opposed to PTP. So the
base time parameter doesn't really do much. Support for the PTP clock
source will be added in a future series.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is a preparation patch for the tc-taprio offload (and potentially
for other future offloads such as tc-mqprio).
Instead of looking directly at skb->priority during xmit, let's get the
netdev queue and the queue-to-traffic-class mapping, and put the
resulting traffic class into the dsa_8021q PCP field. The switch is
configured with a 1-to-1 PCP-to-ingress-queue-to-egress-queue mapping
(see vlan_pmap in sja1105_main.c), so the effect is that we can inject
into a front-panel's egress traffic class through VLAN tagging from
Linux, completely transparently.
Unfortunately the switch doesn't look at the VLAN PCP in the case of
management traffic to/from the CPU (link-local frames at
01-80-C2-xx-xx-xx or 01-1B-19-xx-xx-xx) so we can't alter the
transmission queue of this type of traffic on a frame-by-frame basis. It
is only selected through the "hostprio" setting which ATM is harcoded in
the driver to 7.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In order to support tc-taprio offload, the TTEthernet egress scheduling
core registers must be made visible through the static interface.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The switch barely supports traffic I/O, and it does that by repurposing
VLANs when there is no bridge that is taking control of them.
Letting DSA declare this netdev feature as supported (see
dsa_slave_create) would mean that VLAN sub-interfaces created on sja1105
switch ports will be hardware offloaded. That means that
net/8021q/vlan_core.c would install the VLAN into the filter tables of
the switch, potentially interfering with the tag_8021q VLANs.
We need to prevent that from happening and not let the 8021q core
offload VLANs to the switch hardware tables. In vlan_filtering=0 modes
of operation, the switch ports can pass through VLAN-tagged frames with
no problem.
Suggested-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes gcc '-Wunused-but-set-variable' warning:
drivers/net/dsa/sja1105/sja1105_main.c: In function sja1105_fdb_dump:
drivers/net/dsa/sja1105/sja1105_main.c:1226:14: warning:
variable tx_vid set but not used [-Wunused-but-set-variable]
drivers/net/dsa/sja1105/sja1105_main.c:1226:6: warning:
variable rx_vid set but not used [-Wunused-but-set-variable]
They are not used since commit 6d7c7d948a ("net: dsa:
sja1105: Fix broken learning with vlan_filtering disabled")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The IS_ERR_OR_NULL(priv->clock) check inside
sja1105_ptp_clock_unregister() is preventing cancel_delayed_work_sync
from actually being run.
Additionally, sja1105_ptp_clock_unregister() does not actually get run,
when placed in sja1105_remove(). The DSA switch gets torn down, but the
sja1105 module does not get unregistered. So sja1105_ptp_clock_unregister
needs to be moved to sja1105_teardown, to be symmetrical with
sja1105_ptp_clock_register which is called from the DSA sja1105_setup.
It is strange to fix a "fixes" patch, but the probe failure can only be
seen when the attached PHY does not respond to MDIO (issue which I can't
pinpoint the reason to) and it goes away after I power-cycle the board.
This time the patch was validated on a failing board, and the kernel
panic from the fixed commit's message can no longer be seen.
Fixes: 29dd908d35 ("net: dsa: sja1105: Cancel PTP delayed work on unregister")
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It looks like the FDB dump taken from first-generation switches also
contains information on whether entries are static or not. So use that
instead of searching through the driver's tables.
Fixes: d763778224 ("net: dsa: sja1105: Implement is_static for FDB entries on E/T")
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When put under a bridge with vlan_filtering 0, the SJA1105 ports will
flood all traffic as if learning was broken. This is because learning
interferes with the rx_vid's configured by dsa_8021q as unique pvid's.
So learning technically still *does* work, it's just that the learnt
entries never get matched due to their unique VLAN ID.
The setting that saves the day is Shared VLAN Learning, which on this
switch family works exactly as desired: VLAN tagging still works
(untagged traffic gets the correct pvid) and FDB entries are still
populated with the correct contents including VID. Also, a frame cannot
violate the forwarding domain restrictions enforced by its classified
VLAN. It is just that the VID is ignored when looking up the FDB for
taking a forwarding decision (selecting the egress port).
This patch activates SVL, and the result is that frames with a learnt
DMAC are no longer flooded in the scenario described above.
Now exactly *because* SVL works as desired, we have to revisit some
earlier patches:
- It is no longer necessary to manipulate the VID of the 'bridge fdb
{add,del}' command when vlan_filtering is off. This is because now,
SVL is enabled for that case, so the actual VID does not matter*.
- It is still desirable to hide dsa_8021q VID's in the FDB dump
callback. But right now the dump callback should no longer hide
duplicates (one per each front panel port's pvid, plus one for the
VLAN that the CPU port is going to tag a TX frame with), because there
shouldn't be any (the switch will match a single FDB entry no matter
its VID anyway).
* Not really... It's no longer necessary to transform a 'bridge fdb add'
into 5 fdb add operations, but the user might still add a fdb entry with
any vid, and all of them would appear as duplicates in 'bridge fdb
show'. So force a 'bridge fdb add' to insert the VID of 0**, so that we
can prune the duplicates at insertion time.
** The VID of 0 is better than 1 because it is always guaranteed to be
in the ports' hardware filter. DSA also avoids putting the VID inside
the netlink response message towards the bridge driver when we return
this particular VID, which makes it suitable for FDB entries learnt
with vlan_filtering off.
Fixes: 227d07a07e ("net: dsa: sja1105: Add support for traffic through standalone ports")
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Georg Waibel <georg.waibel@sensor-technik.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Each iteration of for_each_child_of_node puts the previous node, but in
the case of a return from the middle of the loop, there is no put, thus
causing a memory leak. Hence add an of_node_put before the return.
Issue found with Coccinelle.
Signed-off-by: Nishka Dasgupta <nishkadg.linux@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We need a better way to signal this, perhaps in phylink_validate, but
for now just print this error message as guidance for other people
looking at this driver's code while trying to rework PHYLINK.
Cc: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
PHYLINK being designed with PHYs in mind that can change MII protocol,
for correct operation it is necessary to ensure that the PHY interface
mode stays the same (otherwise clear the supported bit mask, as
required).
Because this is just a hypothetical situation for now, we don't bother
to check whether we could actually support the new PHY interface mode.
Actually we could modify the xMII table, reset the switch and send an
updated static configuration, but adding that would just be dead code.
Cc: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It has been pointed out that PHYLINK can call mac_config only to update
the phy_interface_type and without knowing what the AN results are.
Experimentally, when this was observed to happen, state->link was also
unset, and therefore was used as a proxy to ignore this call. However it
is also suggested that state->link is undefined for this callback and
should not be relied upon.
So let the previously-dead codepath for SPEED_UNKNOWN be called, and
update the comment to make sure the MAC's behavior is sane.
Cc: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The first generation switches don't tell us through the dynamic config
interface whether the dumped FDB entries are static or not (the LOCKEDS
bit from P/Q/R/S).
However, now that we're keeping a mirror of all 'bridge fdb' commands in
the static config, this is an opportunity to compare a dumped FDB entry
to the driver's private database. After all, what makes an entry static
is that *we* added it.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
A FDB entry means that "frames that match this VID and DMAC must be
forwarded to this port".
In the case of dsa_8021q however, the VID is not a single one (and
neither two, as my previous patch assumed). The VID can be set either by
the CPU port (1 tx_vid), or by any of the other front-panel port (n-1
rx_vid's).
Fixes: 93647594d8 ("net: dsa: sja1105: Hide the dsa_8021q VLANs from the bridge fdb command")
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The reason why this wasn't tackled earlier is that I had hoped I
understood the user manual wrong. But unfortunately hacks are required
in order to retrieve the static/dynamic nature of FDB entries on SJA1105
P/Q/R/S, since this info is stored in the writeback buffer of the
dynamic config command.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When trying to add support for LOCKEDS (static FDB entries) on SJA1105
P/Q/R/S, at first I didn't remember how the abstraction I created
worked, and actually thought it works by mistake.
To avoid other people staring at the code and not making much sense out
of it, add some comments at the top of the file.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After commit 8456721dd4 ("net: dsa: sja1105: Add support for
configuring address ageing time"), we started to reset the switch rather
often (each time the bridge core changes the ageing time on a switch
port).
The unfortunate reality is that SJA1105 doesn't have any {cold, warm,
whatever} reset mode in which it accepts a new configuration stream
without flushing the FDB. Instead, in its world, the FDB *is* an
optional part of the static configuration.
So we play its game, and do what we also do for VLANs: for each 'bridge
fdb' command, we add the FDB entry through the dynamic interface, and we
append the in-kernel static config memory with info that we're going to
use later, when the next reset command is going to be issued.
The result is that 'bridge fdb' commands are now persistent (dynamically
learned entries are lost, but that's ok).
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
At the end of the commit 1da7382134 ("net: dsa: sja1105: Add FDB
operations for P/Q/R/S series") message, I said that:
At the moment only FDB entries installed statically through 'bridge fdb'
are visible in the dump callback - the dynamically learned ones are
still under investigation.
It looks like the reason why they were not visible in 'bridge fdb' was
that they were never learned - always flooded.
SJA1105 P/Q/R/S manual says about the MAXADDRP[port] field:
Specify the maximum number of MAC address dynamically learned from
the respective port. It is used to limit the number of learned MAC
addresses per port.
It looks like not providing a value in the static config (aka providing
zeroes) is enough for it to not store the learned addresses in the FDB.
For now we divide the 1024 entry FDB "equally" amongst the 5 ports. This
may be revisited if the situation calls for that - for now I'm happy
that learning works.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In commit 1da7382134 ("net: dsa: sja1105: Add FDB operations for
P/Q/R/S series"), these bits were set in the static config, but
apparently they did not do anything. The reason is that the packing
accessors for them were part of a patch I forgot to send.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In SJA1105 there is no concept of 'default values' per se, everything
needs to be driver-supplied through the static configuration tables.
The issue is that the hardware manual says that 'at least the default
untagging VLAN' is mandatory to be provided through the static config.
But VLAN 0 isn't a very good initial pvid - its use is reserved for
priority-tagged frames, and the layers of the stack that care about
those already make sure that this VLAN is installed, as can be seen in
the message below:
8021q: adding VLAN 0 to HW filter on device swp2
So change the pvid provided through the static configuration to 1, which
matches the bridge core's defaults.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As Arnd Bergmann pointed out in commit 78fe8a28fb ("net: dsa: sja1105:
fix ptp link error"), there is no point in having PTP support as a
separate loadable kernel module.
So remove the exported symbols and make sja1105.ko contain PTP support
or not based on CONFIG_NET_DSA_SJA1105_PTP.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Due to a reversed dependency, it is possible to build
the lower ptp driver as a loadable module and the actual
driver using it as built-in, causing a link error:
drivers/net/dsa/sja1105/sja1105_spi.o: In function `sja1105_static_config_upload':
sja1105_spi.c:(.text+0x6f0): undefined reference to `sja1105_ptp_reset'
drivers/net/dsa/sja1105/sja1105_spi.o:(.data+0x2d4): undefined reference to `sja1105et_ptp_cmd'
drivers/net/dsa/sja1105/sja1105_spi.o:(.data+0x604): undefined reference to `sja1105pqrs_ptp_cmd'
drivers/net/dsa/sja1105/sja1105_main.o: In function `sja1105_remove':
sja1105_main.c:(.text+0x8d4): undefined reference to `sja1105_ptp_clock_unregister'
drivers/net/dsa/sja1105/sja1105_main.o: In function `sja1105_rxtstamp_work':
sja1105_main.c:(.text+0x964): undefined reference to `sja1105_tstamp_reconstruct'
drivers/net/dsa/sja1105/sja1105_main.o: In function `sja1105_setup':
sja1105_main.c:(.text+0xb7c): undefined reference to `sja1105_ptp_clock_register'
drivers/net/dsa/sja1105/sja1105_main.o: In function `sja1105_port_deferred_xmit':
sja1105_main.c:(.text+0x1fa0): undefined reference to `sja1105_ptpegr_ts_poll'
sja1105_main.c:(.text+0x1fc4): undefined reference to `sja1105_tstamp_reconstruct'
drivers/net/dsa/sja1105/sja1105_main.o:(.rodata+0x5b0): undefined reference to `sja1105_get_ts_info'
Change the Makefile logic to always build the ptp module
the same way as the rest. Another option would be to
just add it to the same module and remove the exports,
but I don't know if there was a good reason to keep them
separate.
Fixes: bb77f36ac2 ("net: dsa: sja1105: Add support for the PTP clock")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix sparse warnings:
drivers/net/dsa/sja1105/sja1105_main.c:1848:6:
warning: symbol 'sja1105_port_rxtstamp' was not declared. Should it be static?
drivers/net/dsa/sja1105/sja1105_main.c:1869:6:
warning: symbol 'sja1105_port_txtstamp' was not declared. Should it be static?
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Tested-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
As per the DT phy-mode specification, RGMII delays are applied by the
MAC when there is no PHY present on the link.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
The pad_mii_tx registers point to the same memory region but were
unused. So convert to using these for RGMII I/O cell configuration, as
they bear a shorter name.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The first fact that needs to be stated is that the per-MAC settings in
SJA1105 called EGRESS and INGRESS do *not* disable egress and ingress on
the MAC. They only prevent non-link-local traffic from being
sent/received on this port.
So instead of having .phylink_mac_config essentially mess with the STP
state and force it to DISABLED/BLOCKING (which also brings useless
complications in sja1105_static_config_reload), simply add the
.phylink_mac_link_down and .phylink_mac_link_up callbacks which inhibit
TX at the MAC level, while leaving RX essentially enabled.
Also stop from trying to put the link down in .phylink_mac_config, which
is incorrect.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This will be used to stop egress traffic in .phylink_mac_link_up.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since the driver is now using PHYLINK exclusively, it makes sense to
remove all references to it and replace them with PHYLINK.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is a cosmetic patch that replaces the link speed numbers used in
the driver with the corresponding ethtool macros.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Suggested-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This enables the PTP support towards userspace applications such as
linuxptp.
The switches can timestamp only trapped multicast MAC frames, and
therefore only the profiles of 1588 over L2 are supported.
TX timestamping can be enabled per port, but RX timestamping is enabled
globally. As long as RX timestamping is enabled, the switch will emit
metadata follow-up frames that will be processed by the tagger. It may
be a problem that linuxptp does not restore the RX timestamping settings
when exiting.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Meta frame reception relies on the hardware keeping its promise that it
will send no other traffic towards the CPU port between a link-local
frame and a meta frame. Otherwise there is no other way to associate
the meta frame with the link-local frame it's holding a timestamp of.
The receive function is made stateful, and buffers a timestampable frame
until its meta frame arrives, then merges the two, drops the meta and
releases the link-local frame up the stack.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Without noticing any particular issue, this patch ensures that
management traffic is treated with the maximum priority on RX by the
switch. This is generally desirable, as the driver keeps a state
machine that waits for metadata follow-up frames as soon as a management
frame is received. Increasing the priority helps expedite the reception
(and further reconstruction) of the RX timestamp to the driver after the
MAC has generated it.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This will be used to keep state for RX timestamping. It is global
because the switch serializes timestampable and meta frames when
trapping them towards the CPU port (lower port indices have higher
priority) and therefore having one state machine per port would create
unnecessary complications.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This table is used to program the switch to emit "meta" follow-up
Ethernet frames (which contain partial RX timestamps) after each
link-local frame that was trapped to the CPU port through MAC filtering.
This includes PTP frames.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On TX, timestamping is performed synchronously from the
port_deferred_xmit worker thread.
In management routes, the switch is requested to take egress timestamps
(again partial), which are reconstructed and appended to a clone of the
skb that was just sent. The cloning is done by DSA and we retrieve the
pointer from the structure that DSA keeps in skb->cb.
Then these clones are enqueued to the socket's error queue for
application-level processing.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The design of this PHC driver is influenced by the switch's behavior
w.r.t. timestamping. It exposes two PTP counters, one free-running
(PTPTSCLK) and the other offset- and frequency-corrected in hardware
through PTPCLKVAL, PTPCLKADD and PTPCLKRATE. The MACs can sample either
of these for frame timestamps.
However, the user manual warns that taking timestamps based on the
corrected clock is less than useful, as the switch can deliver corrupted
timestamps in a variety of circumstances.
Therefore, this PHC uses the free-running PTPTSCLK together with a
timecounter/cyclecounter structure that translates it into a software
time domain. Thus, the settime/adjtime and adjfine callbacks are
hardware no-ops.
The timestamps (introduced in a further patch) will also be translated
to the correct time domain before being handed over to the userspace PTP
stack.
The introduction of a second set of PHC operations that operate on the
hardware PTPCLKVAL/PTPCLKADD/PTPCLKRATE in the future is somewhat
unavoidable, as the TTEthernet core uses the corrected PTP time domain.
However, the free-running counter + timecounter structure combination
will suffice for now, as the resulting timestamps yield a sub-50 ns
synchronization offset in steady state using linuxptp.
For this patch, in absence of frame timestamping, the operations of the
switch PHC were tested by syncing it to the system time as a local slave
clock with:
phc2sys -s CLOCK_REALTIME -c swp2 -O 0 -m -S 0.01
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
These are needed for the situation where the switch driver and the PTP
driver are both built as modules.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The incl_srcpt setting makes the switch mangle the destination MACs of
multicast frames trapped to the CPU - a primitive tagging mechanism that
works even when we cannot use the 802.1Q software features.
The downside is that the two multicast MAC addresses that the switch
traps for L2 PTP (01-80-C2-00-00-0E and 01-1B-19-00-00-00) quickly turn
into a lot more, as the switch encodes the source port and switch id
into bytes 3 and 4 of the MAC. The resulting range of MAC addresses
would need to be installed manually into the DSA master port's multicast
MAC filter, and even then, most devices might not have a large enough
MAC filtering table.
As a result, only limit use of incl_srcpt to when it's strictly
necessary: when under a VLAN filtering bridge. This fixes PTP in
non-bridged mode (standalone ports). Otherwise, PTP frames, as well as
metadata follow-up frames holding RX timestamps won't be received
because they will be blocked by the master port's MAC filter.
Linuxptp doesn't help, because it only requests the addition of the
unmodified PTP MACs to the multicast filter.
This issue is not seen in bridged mode because the master port is put in
promiscuous mode when the slave ports are enslaved to a bridge.
Therefore, there is no downside to having the incl_srcpt mechanism
active there.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
>From reading the P/Q/R/S user manual, it appears that TPID is used by
the switch for detecting S-tags and TPID2 for C-tags. Their meaning is
not clear from the E/T manual.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is a cosmetic patch, pre-cursor to making another change to the
General Parameters Table (incl_srcpt) which does not logically pertain
to the sja1105_change_tpid function name, but not putting it there would
otherwise create a need of resetting the switch twice.
So simply move the existing code into the .port_vlan_filtering callback,
where the incl_srcpt change will be added as well.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some ISDN files that got removed in net-next had some changes
done in mainline, take the removals.
Signed-off-by: David S. Miller <davem@davemloft.net>
The hardware values for link speed are held in the sja1105_speed_t enum.
However they do not increase in the order that sja1105_get_speed_cfg was
iterating over them (basically from SJA1105_SPEED_AUTO - 0 - to
SJA1105_SPEED_1000MBPS - 1 - skipping the other two).
Another bug is that the code in sja1105_adjust_port_config relies on the
fact that an invalid link speed is detected by sja1105_get_speed_cfg and
returned as -EINVAL. However storing this into an enum that only has
positive members will cast it into an unsigned value, and it will miss
the negative check.
So take the simplest approach and remove the sja1105_get_speed_cfg
function and replace it with a simple switch-case statement.
Fixes: 8aa9ebccae ("net: dsa: Introduce driver for NXP SJA1105 5-port L2 switch")
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Suggested-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
TX VLANs and RX VLANs are an internal implementation detail of DSA for
frame tagging. They work by installing special VLANs on switch ports in
the operating modes where no behavior change w.r.t. VLANs can be
observed by the user.
Therefore it makes sense to hide these VLANs in the 'bridge fdb'
command, as well as translate the pvid into the RX VID and TX VID on
'bridge fdb add' and 'bridge fdb del' commands.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is a cosmetic patch that simplifies the code by removing a
redundant check. A logical AND-with-zero performed on a zero is still
zero.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds support for manipulating the L2 forwarding database (dump,
add, delete) for the second generation of NXP SJA1105 switches.
At the moment only FDB entries installed statically through 'bridge fdb'
are visible in the dump callback - the dynamically learned ones are
still under investigation.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Management routes are one-shot FDB rules installed on the CPU port for
sending link-local traffic. They are a prerequisite for STP, PTP etc to
work.
Also make a note that removing a management route was not supported on
the previous generation of switches.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Conceptually, if an entry is not found in the requested hardware table,
it is not an invalid request - so change the error returned
appropriately.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
These are needed in order to implement the switchdev FDB callbacks.
Compared to the E/T generation, not only the ABI (bit offsets) is
different, but also the introduction of the HOSTCMD field which permits
O(1) TCAM search for an FDB entry. Make use of the newly introduce
OP_SEARCH to permit that. It will be used while adding and deleting an
FDB entry (to see whether it exists or not).
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The DSA callbacks were written with the E/T (first generation) in mind,
which is quite different.
For P/Q/R/S completely new implementations need to be provided, which
are held as function pointers in the priv->info structure. We are
taking a slightly roundabout way for this (a function from
sja1105_main.c reads a structure defined in sja1105_spi.c that
points to a function defined in sja1105_main.c), but it is what it is.
The FDB dump callback works for both families, hence no function pointer
for that.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Only a single dynamic configuration table of the SJA1105 P/Q/R/S
supports this operation: the FDB.
To keep the existing structure in place (sja1105_dynamic_config_read and
sja1105_dynamic_config_write) and not introduce any new function, a
convention is made for sja1105_dynamic_config_read that a negative index
argument denotes a search for the entry provided as argument.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This appends to the L2 Forwarding and L2 Forwarding Parameters tables
(originally added for first-generation switches) the bits that are new
in the second generation.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This was inadvertently copied from the SJA1105 E/T structure and not
tested. Cross-checking with the P/Q/R/S documentation (UM11040) makes
it immediately obvious what the correct bit offsets for this field are.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This structure is merely an implementation detail and should be hidden
from the sja1105_dynamic_config.h header, which provides to the rest of
the driver an abstract access to the dynamic configuration interface of
the switch.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix sparse warning:
drivers/net/dsa/sja1105/sja1105_static_config.c:446:1: warning:
symbol 'static_config_check_memory_size' was not declared. Should it be static?
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
PHYLIB and PHYLINK handle fixed-link interfaces differently. PHYLIB
wraps them in a software PHY ("pseudo fixed link") phydev construct such
that .adjust_link driver callbacks see an unified API. Whereas PHYLINK
simply creates a phylink_link_state structure and passes it to
.mac_config.
At the time the driver was introduced, DSA was using PHYLIB for the
CPU/cascade ports (the ones with no net devices) and PHYLINK for
everything else.
As explained below:
commit aab9c4067d
Author: Florian Fainelli <f.fainelli@gmail.com>
Date: Thu May 10 13:17:36 2018 -0700
net: dsa: Plug in PHYLINK support
Drivers that utilize fixed links for user-facing ports (e.g: bcm_sf2)
will need to implement phylink_mac_ops from now on to preserve
functionality, since PHYLINK *does not* create a phy_device instance
for fixed links.
In the above patch, DSA guards the .phylink_mac_config callback against
a NULL phydev pointer. Therefore, .adjust_link is not called in case of
a fixed-link user port.
This patch fixes the situation by converting the driver from using
.adjust_link to .phylink_mac_config. This can be done now in a unified
fashion for both slave and CPU/cascade ports because DSA now uses
PHYLINK for all ports.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add SPDX license identifiers to all Make/Kconfig files which:
- Have no license information of any form
These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:
GPL-2.0-only
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dan Carpenter says:
The patch 640f763f98: "net: dsa: sja1105: Add support for Spanning
Tree Protocol" from May 5, 2019, leads to the following static
checker warning:
drivers/net/dsa/sja1105/sja1105_main.c:1073 sja1105_stp_state_get()
warn: signedness bug returning '(-22)'
The caller doesn't check for negative errors anyway.
Fixes: 640f763f98: ("net: dsa: sja1105: Add support for Spanning Tree Protocol")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The while-loop exit condition check is not correct; the
loop should continue if the returns from the function calls are
negative or the CRC status returns are invalid. Currently it
is ignoring the returns from the function calls. Fix this by
removing the status return checks and only break from the loop
at the very end when we know that all the success condtions have
been met.
Kudos to Dan Carpenter for describing the correct fix and
Vladimir Oltean for noting the change to the check on the number
of retries.
Addresses-Coverity: ("Uninitialized scalar variable")
Fixes: 8aa9ebccae ("net: dsa: Introduce driver for NXP SJA1105 5-port L2 switch")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Tested-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
drivers/net/dsa/sja1105/sja1105_spi.c:486:21: warning: symbol 'sja1105et_regs' was not declared. Should it be static?
drivers/net/dsa/sja1105/sja1105_spi.c:511:21: warning: symbol 'sja1105pqrs_regs' was not declared. Should it be static?
Fixes: 8aa9ebccae ("net: dsa: Introduce driver for NXP SJA1105 5-port L2 switch")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Hai <wanghai26@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While not explicitly documented as supported in UM10944, compliance with
the STP states can be obtained by manipulating 3 settings at the
(per-port) MAC config level: dynamic learning, inhibiting reception of
regular traffic, and inhibiting transmission of regular traffic.
In all these modes, transmission and reception of special BPDU frames
from the stack is still enabled (not inhibited by the MAC-level
settings).
On ingress, BPDUs are classified by the MAC filter as link-local
(01-80-C2-00-00-00) and forwarded to the CPU port. This mechanism works
under all conditions (even without the custom 802.1Q tagging) because
the switch hardware inserts the source port and switch ID into bytes 4
and 5 of the MAC-filtered frames. Then the DSA .rcv handler needs to put
back zeroes into the MAC address after decoding the source port
information.
On egress, BPDUs are transmitted using management routes from the xmit
worker thread. Again this does not require switch tagging, as the switch
port is programmed through SPI to hold a temporary (single-fire) route
for a frame with the programmed destination MAC (01-80-C2-00-00-00).
STP is activated using the following commands and was tested by
connecting two front-panel ports together and noticing that switching
loops were prevented (one port remains in the blocking state):
$ ip link add name br0 type bridge stp_state 1 && ip link set br0 up
$ for eth in $(ls /sys/devices/platform/soc/2100000.spi/spi_master/spi0/spi0.1/net/);
do ip link set ${eth} master br0 && ip link set ${eth} up; done
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In order to support this, we are creating a make-shift switch tag out of
a VLAN trunk configured on the CPU port. Termination of normal traffic
on switch ports only works when not under a vlan_filtering bridge.
Termination of management (PTP, BPDU) traffic works under all
circumstances because it uses a different tagging mechanism
(incl_srcpt). We are making use of the generic CONFIG_NET_DSA_TAG_8021Q
code and leveraging it from our own CONFIG_NET_DSA_TAG_SJA1105.
There are two types of traffic: regular and link-local.
The link-local traffic received on the CPU port is trapped from the
switch's regular forwarding decisions because it matched one of the two
DMAC filters for management traffic.
On transmission, the switch requires special massaging for these
link-local frames. Due to a weird implementation of the switching IP, by
default it drops link-local frames that originate on the CPU port.
It needs to be told where to forward them to, through an SPI command
("management route") that is valid for only a single frame.
So when we're sending link-local traffic, we are using the
dsa_defer_xmit mechanism.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ethernet flow control:
The switch MAC does not consume, nor does it emit pause frames. It
simply forwards them as any other Ethernet frame (and since the DMAC is,
per IEEE spec, 01-80-C2-00-00-01, it means they are filtered as
link-local traffic and forwarded to the CPU, which can't do anything
useful with them).
Duplex:
There is no duplex setting in the SJA1105 MAC. It is known to forward
traffic at line rate on the same port in both directions. Therefore it
must be that it only supports full duplex.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Resetting the switch at runtime is currently done while changing the
vlan_filtering setting (due to the required TPID change).
But reset is asynchronous with packet egress, and the switch core will
not wait for egress to finish before carrying on with the reset
operation.
As a result, a connected PHY such as the BCM5464 would see an
unterminated Ethernet frame and start to jabber (repeat the last seen
Ethernet symbols - jabber is by definition an oversized Ethernet frame
with bad FCS). This behavior is strange in itself, but it also causes
the MACs of some link partners (such as the FRDM-LS1012A) to completely
lock up.
So as a remedy for this situation, when switch reset is required, simply
inhibit Tx on all ports, and wait for the necessary time for the
eventual one frame left in the egress queue (not even the Tx inhibit
command is instantaneous) to be flushed.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
If STP is active, this setting is applied on bridged ports each time an
Ethernet link is established (topology changes).
Since the setting is global to the switch and a reset is required to
change it, resets are prevented if the new callback does not change the
value that the hardware already is programmed for.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
VLAN filtering cannot be properly disabled in SJA1105. So in order to
emulate the "no VLAN awareness" behavior (not dropping traffic that is
tagged with a VID that isn't configured on the port), we need to hack
another switch feature: programmable TPID (which is 0x8100 for 802.1Q).
We are reprogramming the TPID to a bogus value which leaves the switch
thinking that all traffic is untagged, and therefore accepts it.
Under a vlan_filtering bridge, the proper TPID of ETH_P_8021Q is
installed again, and the switch starts identifying 802.1Q-tagged
traffic.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Documentation/devicetree/bindings/net/ethernet.txt is confusing because
it says what the MAC should not do, but not what it *should* do:
* "rgmii-rxid" (RGMII with internal RX delay provided by the PHY, the MAC
should not add an RX delay in this case)
The gap in semantics is threefold:
1. Is it illegal for the MAC to apply the Rx internal delay by itself,
and simplify the phy_mode (mask off "rgmii-rxid" into "rgmii") before
passing it to of_phy_connect? The documentation would suggest yes.
1. For "rgmii-rxid", while the situation with the Rx clock skew is more
or less clear (needs to be added by the PHY), what should the MAC
driver do about the Tx delays? Is it an implicit wild card for the
MAC to apply delays in the Tx direction if it can? What if those were
already added as serpentine PCB traces, how could that be made more
obvious through DT bindings so that the MAC doesn't attempt to add
them twice and again potentially break the link?
3. If the interface is a fixed-link and therefore the PHY object is
fixed (a purely software entity that obviously cannot add clock
skew), what is the meaning of the above property?
So an interpretation of the RGMII bindings was chosen that hopefully
does not contradict their intention but also makes them more applied.
The SJA1105 driver understands to act upon "rgmii-*id" phy-mode bindings
if the port is in the PHY role (either explicitly, or if it is a
fixed-link). Otherwise it always passes the duty of setting up delays to
the PHY driver.
The error behavior that this patch adds is required on SJA1105E/T where
the MAC really cannot apply internal delays. If the other end of the
fixed-link cannot apply RGMII delays either (this would be specified
through its own DT bindings), then the situation requires PCB delays.
For SJA1105P/Q/R/S, this is however hardware supported and the error is
thus only temporary. I created a stub function pointer for configuring
delays per-port on RXC and TXC, and will implement it when I have access
to a board with this hardware setup.
Meanwhile do not allow the user to select an invalid configuration.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently only the (more difficult) first generation E/T series is
supported. Here the TCAM is only 4-way associative, and to know where
the hardware will search for a FDB entry, we need to perform the same
hash algorithm in order to install the entry in the correct bin.
On P/Q/R/S, the TCAM should be fully associative. However the SPI
command interface is different, and because I don't have access to a
new-generation device at the moment, support for it is TODO.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
At this moment the following is supported:
* Link state management through phylib
* Autonomous L2 forwarding managed through iproute2 bridge commands.
IP termination must be done currently through the master netdevice,
since the switch is unmanaged at this point and using
DSA_TAG_PROTO_NONE.
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Georg Waibel <georg.waibel@sensor-technik.de>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>