Commit Graph

62 Commits

Author SHA1 Message Date
Petr Machata
7de85b0431 mlxsw: spectrum_qdisc: Index future FIFOs by band number
mlxsw used to hold an array of qdiscs indexed by the TC number. In the
previous patch, it was changed to allocate child qdiscs dynamically, and
they are now indexed by band number. Follow suit with the array of future
FIFOs.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-20 16:43:13 -07:00
Petr Machata
5cbd960253 mlxsw: spectrum_qdisc: Allocate child qdiscs dynamically
Instead of keeping qdiscs in globally-preallocated arrays, introduce a
per-qdisc-kind value num_classes, and then allocate the necessary child
qdiscs (if any) based on that value. Since now dynamic allocation is
involved, mlxsw_sp_qdisc_replace() gets messy enough that it is worth it to
split it to two cases: a new qdisc allocation and a change of existing
qdisc. (Note that the change also includes what TC formally calls replace,
if the qdisc kind is the same.)

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-20 16:43:13 -07:00
Petr Machata
cff99e2045 mlxsw: spectrum_qdisc: Guard all qdisc accesses with a lock
The FIFO handler currently guards accesses to the future FIFO tracking by
asserting RTNL. In the future, the changes to the qdisc state will be more
thorough, so other qdiscs will need this guarding is as well. In order
to not further the RTNL infestation, instead convert to a custom lock that
will guard accesses to the qdisc state.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-20 16:43:13 -07:00
Petr Machata
51d52ed955 mlxsw: spectrum_qdisc: Track children per qdisc
mlxsw currently allows a two-level structure of qdiscs: the root and
possibly a number of children. In order to support offloading more general
qdisc trees, introduce to struct mlxsw_sp_qdisc a pointer to child qdiscs.
Refer to the child qdiscs through this pointer, instead of going through
the tclass_qdiscs in qdisc_state. Additionally introduce a field
num_classes, which holds number of given qdisc's children.

Also introduce a generic function for walking qdisc trees. Rewrite
mlxsw_sp_qdisc_find() and _find_by_handle() to use the generic walker.

For now, keep the qdisc_state.tclass_qdisc, and just point root_qdiscs's
children to this array. Following patches will make the allocation dynamic.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-20 16:43:13 -07:00
Petr Machata
b21832b568 mlxsw: spectrum_qdisc: Promote backlog reduction to mlxsw_sp_qdisc_destroy()
When a qdisc is removed, it is necessary to update the backlog value at its
parent--unless the qdisc is at root position. RED, TBF and FIFO all do
that, each separately. Since all of them need to do this, just promote the
operation directly to mlxsw_sp_qdisc_destroy(), instead of deferring it to
individual destructors. Since FIFO dtor thus becomes trivial, remove it.

Add struct mlxsw_sp_qdisc.parent to point at the parent qdisc. This will be
handy later as deeper structures are offloaded. Use the parent qdisc to
find the chain of parents whose backlog value needs to be updated.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-20 16:43:13 -07:00
Petr Machata
017a131cde mlxsw: spectrum_qdisc: Track tclass_num as int, not u8
tclass_num is just a number, a value that would be ordinarily passed around
as an int. (Which is unlike a u8 prio_bitmap.) In several places,
tclass_num already is an int. Convert the remaining instances.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-20 16:43:13 -07:00
Petr Machata
549f2aae84 mlxsw: spectrum_qdisc: Drop an always-true condition
The function mlxsw_sp_qdisc_compare() is invoked a couple lines above this
check, which will bounce any requests where this condition does not hold.
Therefore drop it.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-20 16:43:13 -07:00
Petr Machata
290fe2c595 mlxsw: spectrum_qdisc: Simplify mlxsw_sp_qdisc_compare()
The purpose of this function is to filter out events that are related to
qdiscs that are not offloaded, or are not offloaded anymore. But the
function is unnecessarily thorough:

- mlxsw_sp_qdisc pointer is never NULL in the context where it is called
- Two qdiscs with the same handle will never have different types. Even
  when replacing one qdisc with another in the same class, Linux will not
  permit handle reuse unless the qdisc type also matches.

Simplify the function by omitting these two unnecessary conditions.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-20 16:43:13 -07:00
Petr Machata
17c0e6d175 mlxsw: spectrum_qdisc: Drop one argument from check_params callback
The mlxsw_sp_qdisc argument is not used in any of the actual callbacks.
Drop it.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-20 16:43:13 -07:00
Ido Schimmel
2dcbd9207b mlxsw: spectrum_span: Add SPAN probability rate support
Currently, every packet that matches a mirroring trigger (e.g., received
packets, buffer dropped packets) is mirrored. Spectrum-2 and later ASICs
support mirroring with probability, where every 1 in N matched packets
is mirrored.

Extend the API that creates the binding between the trigger and the SPAN
agent with a probability rate parameter, which is an attribute of the
trigger. Set it to '1' to maintain existing behavior.

Subsequent patches will use it to perform more sophisticated sampling,
by mirroring packets to the CPU with probability.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-11 16:22:39 -08:00
Ido Schimmel
5c7659eba8 mlxsw: spectrum_span: Add SPAN session identifier support
When packets are mirrored to the CPU, the trap identifier with which the
packets are trapped is determined according to the session identifier of
the SPAN agent performing the mirroring. Packets that are trapped for
the same logical reason (e.g., buffer drops) should use the same session
identifier.

Currently, a single session is implicitly supported (identifier 0) and
is used for packets that are mirrored to the CPU due to buffer drops
(e.g., early drop).

Subsequent patches are going to mirror packets to the CPU due to
sampling, which will require a different session identifier.

Prepare for that by making the session identifier an attribute of the
SPAN agent.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-11 16:22:39 -08:00
Petr Machata
509f04ca62 mlxsw: spectrum_qdisc: Disable port buffer autoresize with qdiscs
There are two interfaces to configure ETS: qdiscs and DCB. Historically,
DCB ETS configuration was projected to ingress as well, and configured port
buffers. Qdisc was not.

Keep qdiscs behaving this way, and if an offloaded qdisc is configured on a
port, move this port's headroom to a manual mode, thus allowing
configuration of port buffers through dcbnl_setbuffer.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-17 17:01:38 -07:00
Petr Machata
54a9238589 mlxsw: spectrum_qdisc: Offload action trap for qevents
When offloading action trap on a qevent, pass to_dev of NULL to the SPAN
module to trigger the mirror to the CPU port. Query the buffer drops
policer and use it for policing of the trapped traffic.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-08-03 18:06:46 -07:00
Ido Schimmel
a120ecc3c5 mlxsw: spectrum_span: Allow passing parameters to SPAN agents
Currently, the only parameter of a SPAN agent is the netdev which
the SPAN agent should mirror to.

The next patch will add the ability to request a SPAN agent that mirrors
to a specific netdev and has a specific policer ID bound to it. This is
required when mirroring packets to the CPU port.

Therefore, encapsulate the sole parameter to mlxsw_sp_span_agent_get()
in a structure, so that it could later be extended with policer
information.

Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-14 14:50:49 -07:00
Petr Machata
f6668eac22 mlxsw: spectrum_qdisc: Offload mirroring on RED qevent early_drop
The RED qevents early_drop and mark can be offloaded under the following
fairly strict conditions:

- At most one filter is configured at the qevent block
- The protocol is "any"
- The classifier is matchall
- The action is trap, sample, or mirror with the same conditions as
  with other SPAN offloads
- The hw_counters type is none

In this patchset, implement offload of mirror for early_drop qevent.
The ECN trigger is currently not implemented in the FW and therefore
the mark qevent is not supported.

The qevent notifications look exactly like regular block binding
notifications with a binder type that identifies them as qevents.
Therefore the details of processing this binding are fairly similar
to the matchall offload.

struct flow_block_offload.sch points at the qdisc in question. Use it to
figure out if the qdisc is offloaded at all and what TC it configures.
Bounce bindings on not-offloaded qdiscs.

Individual bindings are kept in a list so that several qevents can share
the same block and all binding points get configured as the configured
filters change.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-13 17:22:22 -07:00
Petr Machata
8040c96b4f mlxsw: spectrum_qdisc: Offload RED ECN nodrop mode
RED ECN nodrop mode means that non-ECT traffic should not be early-dropped,
but enqueued normally instead. In Spectrum systems, this is achieved by
disabling CWTPM.ew (enable WRED) for a given traffic class.

So far CWTPM.ew was unconditionally enabled. Instead disable it when the
RED qdisc is in nodrop mode.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-14 21:03:46 -07:00
Petr Machata
7bec1a45d5 mlxsw: spectrum_qdisc: Support offloading of FIFO Qdisc
There are two peculiarities about offloading FIFO:

- sometimes the qdisc has an unspecified handle (it is "invisible")
- it may be created before the qdisc that it will be a child of

These features make the offload a bit more tricky. The approach chosen in
this patch is to make note of all the FIFOs that needed to be rejected
because their parents were not known. Later when the parent is created,
they are offloaded

FIFO is only offloaded for its counters, queue length is ignored.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-05 14:03:32 -08:00
Petr Machata
c4e372e2ac mlxsw: spectrum_qdisc: Add handle parameter to ..._ops.replace
PRIO and ETS will need to check the value of qdisc handle in their
handlers. Add it to the callback and propagate through.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-05 14:03:31 -08:00
Petr Machata
ee88450d25 mlxsw: spectrum_qdisc: Introduce struct mlxsw_sp_qdisc_state
In order to have a tidy structure where to put information related to Qdisc
offloads, introduce a new structure. Move there the two existing pieces of
data: root_qdisc and tclass_qdiscs. Embed them directly, because there's no
reason to go through pointer anymore. Convert users, update init/fini
functions.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-05 14:03:31 -08:00
Petr Machata
8a29581eb0 mlxsw: spectrum: Move the ECN-marked packet counter to ethtool
Spectrum-1 and Spectrum-2 do not have a per-TC counter of number of packets
marked by ECN. The value reported currently is the total number of marked
packets. Showing this value at individual TC Qdiscs is misleading.

Move the counter to ethtool instead.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-02-26 20:44:42 -08:00
Nathan Chancellor
91a7d4bf3e mlxsw: spectrum_qdisc: Fix 64-bit division error in mlxsw_sp_qdisc_tbf_rate_kbps
When building arm32 allmodconfig:

ERROR: "__aeabi_uldivmod"
[drivers/net/ethernet/mellanox/mlxsw/mlxsw_spectrum.ko] undefined!

rate_bytes_ps has type u64, we need to use a 64-bit division helper to
avoid a build error.

Fixes: a44f58c41b ("mlxsw: spectrum_qdisc: Support offloading of TBF Qdisc")
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Tested-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-01-31 08:52:41 -08:00
Petr Machata
a44f58c41b mlxsw: spectrum_qdisc: Support offloading of TBF Qdisc
React to the TC messages that were introduced in a preceding patch and
configure egress maximum shaper as appropriate. TBF can be used as a root
qdisc or under one of PRIO or strict ETS bands.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-25 10:56:31 +01:00
Petr Machata
be1d5a8a77 mlxsw: spectrum_qdisc: Extract a common leaf unoffload function
When the RED Qdisc is unoffloaded, it needs to reduce the reported backlog
by the amount that is in the HW, so that only the SW backlog is contained
in the counter. The same thing will need to be done by TBF, and likely any
other leaf Qdisc as well.

Extract a helper mlxsw_sp_qdisc_leaf_unoffload() and call it from
mlxsw_sp_qdisc_red_unoffload().

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-25 10:56:31 +01:00
Petr Machata
3d0d592193 mlxsw: spectrum_qdisc: Add mlxsw_sp_qdisc_get_class_stats()
Add a wrapper around mlxsw_sp_qdisc_collect_tc_stats() and
mlxsw_sp_qdisc_update_stats() for the simple case of doing both in one go:
mlxsw_sp_qdisc_get_class_stats(). Dispatch to that function from
mlxsw_sp_qdisc_get_red_stats(). This new function will be useful for other
leaf Qdiscs as well.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-25 10:56:31 +01:00
Petr Machata
cf9af379cd mlxsw: spectrum_qdisc: Extract a per-TC stat function
Extract from mlxsw_sp_qdisc_get_prio_stats() two new functions:
mlxsw_sp_qdisc_collect_tc_stats() to accumulate stats for that one TC only,
and mlxsw_sp_qdisc_update_stats() that makes the stats relative to base
values stored earlier. Use them from mlxsw_sp_qdisc_get_red_stats().

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-25 10:56:31 +01:00
David S. Miller
b3f7e3f23a Merge ra.kernel.org:/pub/scm/linux/kernel/git/netdev/net 2020-01-19 22:10:04 +01:00
Petr Machata
85005b82e5 mlxsw: spectrum_qdisc: Include MC TCs in Qdisc counters
mlxsw configures Spectrum in such a way that BUM traffic is passed not
through its nominal traffic class TC, but through its MC counterpart TC+8.
However, when collecting statistics, Qdiscs only look at the nominal TC and
ignore the MC TC.

Add two helpers to compute the value for logical TC from the constituents,
one for backlog, the other for tail drops. Use them throughout instead of
going through the xstats pointer directly.

Counters for TX bytes and packets are deduced from packet priority
counters, and therefore already include BUM traffic. wred_drop counter is
irrelevant on MC TCs, because RED is not enabled on them.

Fixes: 7b81953066 ("mlxsw: spectrum: Configure MC-aware mode on mlxsw ports")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-15 04:16:30 -08:00
David S. Miller
a2d6d7ae59 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
The ungrafting from PRIO bug fixes in net, when merged into net-next,
merge cleanly but create a build failure.  The resolution used here is
from Petr Machata.

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-09 12:13:43 -08:00
Petr Machata
3971a535b8 mlxsw: spectrum_qdisc: Ignore grafting of invisible FIFO
The following patch will change PRIO to replace a removed Qdisc with an
invisible FIFO, instead of NOOP. mlxsw will see this replacement due to the
graft message that is generated. But because FIFO does not issue its own
REPLACE message, when the graft operation takes place, the Qdisc that mlxsw
tracks under the indicated band is still the old one. The child
handle (0:0) therefore does not match, and mlxsw rejects the graft
operation, which leads to an extack message:

    Warning: Offloading graft operation failed.

Fix by ignoring the invisible children in the PRIO graft handler. The
DESTROY message of the removed Qdisc is going to follow shortly and handle
the removal.

Fixes: 32dc5efc6c ("mlxsw: spectrum: qdiscs: prio: Handle graft command")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-08 12:45:52 -08:00
Petr Machata
19f405b988 mlxsw: spectrum_qdisc: Support offloading of ETS Qdisc
Handle TC_SETUP_QDISC_ETS, add a new ops structure for the ETS Qdisc.
Invoke the extended prio handlers implemented in the previous patch. For
stats ops, invoke directly the prio callbacks, which are not sensitive to
differences between PRIO and ETS.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-18 13:32:30 -08:00
Petr Machata
7917f52ae1 mlxsw: spectrum_qdisc: Generalize PRIO offload to support ETS
Thanks to the similarity between PRIO and ETS it is possible to simply
reuse most of the code for offloading PRIO Qdisc. Extract the common
functionality into separate functions, making the current PRIO handlers
thin API adapters.

Extend the new functions to pass quanta for individual bands, which allows
configuring a subset of bands as WRR. Invoke mlxsw_sp_port_ets_set() as
appropriate to de/configure WRR-ness and weight of individual bands.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-18 13:32:30 -08:00
Petr Machata
5bc146c90e mlxsw: spectrum_qdisc: Clarify a comment
Expand the comment at mlxsw_sp_qdisc_prio_graft() to make the problem that
this function is trying to handle clearer.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-18 13:32:29 -08:00
Petr Machata
914c4fc1b7 mlxsw: spectrum: Use guaranteed buffer size as pool size limit
There are two resources associated with shared buffer size:
cap_total_buffer_size, and cap_guaranteed_shared_buffer. So far, mlxsw has
been using the former as a limit to determine how large a pool size is
allowed to be. However, the total size also includes headrooms and reserved
space, which really cannot be used for shared buffer pools.

Therefore convert mlxsw to use the latter resource as a limit. Adjust
hard-coded pool sizes to be the guaranteed size minus 256000 bytes for CPU
port pool. On Spectrum-1 that actually leads to an increase. A follow-up
patch will have this size calculated automatically.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-10-23 21:31:31 -07:00
Jiri Pirko
9948a0641a mlxsw: Replace license text with SPDX identifiers and adjust copyrights
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-09 10:36:10 -07:00
Kees Cook
6396bb2215 treewide: kzalloc() -> kcalloc()
The kzalloc() function has a 2-factor argument form, kcalloc(). This
patch replaces cases of:

        kzalloc(a * b, gfp)

with:
        kcalloc(a * b, gfp)

as well as handling cases of:

        kzalloc(a * b * c, gfp)

with:

        kzalloc(array3_size(a, b, c), gfp)

as it's slightly less ugly than:

        kzalloc_array(array_size(a, b), c, gfp)

This does, however, attempt to ignore constant size factors like:

        kzalloc(4 * 1024, gfp)

though any constants defined via macros get caught up in the conversion.

Any factors with a sizeof() of "unsigned char", "char", and "u8" were
dropped, since they're redundant.

The Coccinelle script used for this was:

// Fix redundant parens around sizeof().
@@
type TYPE;
expression THING, E;
@@

(
  kzalloc(
-	(sizeof(TYPE)) * E
+	sizeof(TYPE) * E
  , ...)
|
  kzalloc(
-	(sizeof(THING)) * E
+	sizeof(THING) * E
  , ...)
)

// Drop single-byte sizes and redundant parens.
@@
expression COUNT;
typedef u8;
typedef __u8;
@@

(
  kzalloc(
-	sizeof(u8) * (COUNT)
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(__u8) * (COUNT)
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(char) * (COUNT)
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(unsigned char) * (COUNT)
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(u8) * COUNT
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(__u8) * COUNT
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(char) * COUNT
+	COUNT
  , ...)
|
  kzalloc(
-	sizeof(unsigned char) * COUNT
+	COUNT
  , ...)
)

// 2-factor product with sizeof(type/expression) and identifier or constant.
@@
type TYPE;
expression THING;
identifier COUNT_ID;
constant COUNT_CONST;
@@

(
- kzalloc
+ kcalloc
  (
-	sizeof(TYPE) * (COUNT_ID)
+	COUNT_ID, sizeof(TYPE)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(TYPE) * COUNT_ID
+	COUNT_ID, sizeof(TYPE)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(TYPE) * (COUNT_CONST)
+	COUNT_CONST, sizeof(TYPE)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(TYPE) * COUNT_CONST
+	COUNT_CONST, sizeof(TYPE)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(THING) * (COUNT_ID)
+	COUNT_ID, sizeof(THING)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(THING) * COUNT_ID
+	COUNT_ID, sizeof(THING)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(THING) * (COUNT_CONST)
+	COUNT_CONST, sizeof(THING)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(THING) * COUNT_CONST
+	COUNT_CONST, sizeof(THING)
  , ...)
)

// 2-factor product, only identifiers.
@@
identifier SIZE, COUNT;
@@

- kzalloc
+ kcalloc
  (
-	SIZE * COUNT
+	COUNT, SIZE
  , ...)

// 3-factor product with 1 sizeof(type) or sizeof(expression), with
// redundant parens removed.
@@
expression THING;
identifier STRIDE, COUNT;
type TYPE;
@@

(
  kzalloc(
-	sizeof(TYPE) * (COUNT) * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  kzalloc(
-	sizeof(TYPE) * (COUNT) * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  kzalloc(
-	sizeof(TYPE) * COUNT * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  kzalloc(
-	sizeof(TYPE) * COUNT * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  kzalloc(
-	sizeof(THING) * (COUNT) * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
|
  kzalloc(
-	sizeof(THING) * (COUNT) * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
|
  kzalloc(
-	sizeof(THING) * COUNT * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
|
  kzalloc(
-	sizeof(THING) * COUNT * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
)

// 3-factor product with 2 sizeof(variable), with redundant parens removed.
@@
expression THING1, THING2;
identifier COUNT;
type TYPE1, TYPE2;
@@

(
  kzalloc(
-	sizeof(TYPE1) * sizeof(TYPE2) * COUNT
+	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
  , ...)
|
  kzalloc(
-	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
  , ...)
|
  kzalloc(
-	sizeof(THING1) * sizeof(THING2) * COUNT
+	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
  , ...)
|
  kzalloc(
-	sizeof(THING1) * sizeof(THING2) * (COUNT)
+	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
  , ...)
|
  kzalloc(
-	sizeof(TYPE1) * sizeof(THING2) * COUNT
+	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
  , ...)
|
  kzalloc(
-	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
  , ...)
)

// 3-factor product, only identifiers, with redundant parens removed.
@@
identifier STRIDE, SIZE, COUNT;
@@

(
  kzalloc(
-	(COUNT) * STRIDE * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	COUNT * (STRIDE) * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	COUNT * STRIDE * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	(COUNT) * (STRIDE) * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	COUNT * (STRIDE) * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	(COUNT) * STRIDE * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	(COUNT) * (STRIDE) * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  kzalloc(
-	COUNT * STRIDE * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
)

// Any remaining multi-factor products, first at least 3-factor products,
// when they're not all constants...
@@
expression E1, E2, E3;
constant C1, C2, C3;
@@

(
  kzalloc(C1 * C2 * C3, ...)
|
  kzalloc(
-	(E1) * E2 * E3
+	array3_size(E1, E2, E3)
  , ...)
|
  kzalloc(
-	(E1) * (E2) * E3
+	array3_size(E1, E2, E3)
  , ...)
|
  kzalloc(
-	(E1) * (E2) * (E3)
+	array3_size(E1, E2, E3)
  , ...)
|
  kzalloc(
-	E1 * E2 * E3
+	array3_size(E1, E2, E3)
  , ...)
)

// And then all remaining 2 factors products when they're not all constants,
// keeping sizeof() as the second factor argument.
@@
expression THING, E1, E2;
type TYPE;
constant C1, C2, C3;
@@

(
  kzalloc(sizeof(THING) * C2, ...)
|
  kzalloc(sizeof(TYPE) * C2, ...)
|
  kzalloc(C1 * C2 * C3, ...)
|
  kzalloc(C1 * C2, ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(TYPE) * (E2)
+	E2, sizeof(TYPE)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(TYPE) * E2
+	E2, sizeof(TYPE)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(THING) * (E2)
+	E2, sizeof(THING)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	sizeof(THING) * E2
+	E2, sizeof(THING)
  , ...)
|
- kzalloc
+ kcalloc
  (
-	(E1) * E2
+	E1, E2
  , ...)
|
- kzalloc
+ kcalloc
  (
-	(E1) * (E2)
+	E1, E2
  , ...)
|
- kzalloc
+ kcalloc
  (
-	E1 * E2
+	E1, E2
  , ...)
)

Signed-off-by: Kees Cook <keescook@chromium.org>
2018-06-12 16:19:22 -07:00
Nogah Frankel
32dc5efc6c mlxsw: spectrum: qdiscs: prio: Handle graft command
Handle graft command for an offloaded sch_prio.
Grafting a qdisc to any place other than under its original parent is not
supported by mlxsw and will cause the grafted qdisc to stop being
offloaded.

Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Reviewed-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-28 12:06:01 -05:00
Nogah Frankel
98ceb7b6d6 mlxsw: spectrum: qdiscs: prio: Delete child qdiscs when removing bands
When the number the bands of sch_prio is decreased, child qdiscs on the
deleted bands would get deleted as well.
This change and deletions are being done under sch_tree_lock of the
sch_prio qdisc. Part of the destruction of qdisc is unoffloading it, if
it is offloaded. Un-offloading can't be done inside this lock.
Move the offload command to be done before reducing the number of bands,
so unoffloading of the qdiscs that are about to be deleted could be done
outside of the lock.

Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Reviewed-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-28 12:06:01 -05:00
Nogah Frankel
23f2b4048c mlxsw: spectrum: Update sch_prio stats to include sch_red related drops
sch_prio as root qdisc should count all the drops its children have. Since
it is possible for it to have sch_red children, it needs to count RED early
drops.

Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Reviewed-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-28 12:06:01 -05:00
Nogah Frankel
cc6e5c13af mlxsw: spectrum: qdiscs: Update backlog handling of a child qdiscs
When removing a child qdisc its backlog will be decreased from the parent
backlog. The driver backlog count should do the same.
When the parent changes its configuration, the child might need to clean
its stats. However, the backlog can't be cleaned with the rest of the
stats, because it reflects a momentary value that needs to be synced with
the core, not the history of the qdisc.

Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Reviewed-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-28 12:06:00 -05:00
Nogah Frankel
04cc0bf5d6 mlxsw: spectrum: qdiscs: Collect stats for sch_red based on priomap
Priority counters count packets according to their packet priority.
Collect the stats for sch_red based on these counters, so the qdisc bstats
will be the sum of counters matching the priorities marked in the qdisc
priomap.
Changing the mapping of the priorities to bands while traffic is running
can result in losing the stats of the bands qdiscs from their last dump
call to this change, as if the qdisc was unoffloaded and re-offloaded. It
will not affect the traffic behaviour according to sch_red.

Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Reviewed-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-28 12:06:00 -05:00
Nogah Frankel
1631ab2e8d mlxsw: spectrum: qdiscs: Add priority map per qdisc
Add priority map per qdisc, to indicate which priorities are being
directed through this qdisc.

Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Reviewed-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-28 12:06:00 -05:00
Nogah Frankel
eed4baeb04 mlxsw: spectrum: qdiscs: Support qdisc per tclass
Add the option to set a qdisc per tclass.  Match the qdisc to the tclass by
parent ID. Supported currently for sch_red only.
It allows offloading sch_prio as root qdisc and sch_red as its child.
(However, doing so might corrupt the stats for both parent and child.)

Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Reviewed-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-02-28 12:06:00 -05:00
Jakub Kicinski
416ef9b15c net: sched: red: don't reset the backlog on every stat dump
Commit 0dfb33a0d7 ("sch_red: report backlog information") copied
child's backlog into RED's backlog.  Back then RED did not maintain
its own backlog counts.  This has changed after commit 2ccccf5fb4
("net_sched: update hierarchical backlog too") and commit d7f4f332f0
("sch_red: update backlog as well").  Copying is no longer necessary.

Tested:

$ tc -s qdisc show dev veth0
qdisc red 1: root refcnt 2 limit 400000b min 30000b max 30000b ecn
 Sent 20942 bytes 221 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 1260b 14p requeues 14
  marked 0 early 0 pdrop 0 other 0
qdisc tbf 2: parent 1: rate 1Kbit burst 15000b lat 3585.0s
 Sent 20942 bytes 221 pkt (dropped 0, overlimits 138 requeues 0)
 backlog 1260b 14p requeues 14

Recently RED offload was added.  We need to make sure drivers don't
depend on resetting the stats.  This means backlog should be treated
like any other statistic:

  total_stat = new_hw_stat - prev_hw_stat;

Adjust mlxsw.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Nogah Frankel <nogahf@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17 14:29:32 -05:00
Wei Yongjun
e02f08a070 mlxsw: spectrum: qdiscs: Make function mlxsw_sp_qdisc_prio_unoffload static
Fixes the following sparse warning:

drivers/net/ethernet/mellanox/mlxsw/spectrum_qdisc.c:464:1: warning:
 symbol 'mlxsw_sp_qdisc_prio_unoffload' was not declared. Should it be static?

Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-16 14:25:42 -05:00
Nogah Frankel
93d8a4c1b5 mlxsw: spectrum: qdiscs: Support stats for PRIO qdisc
Support basic stats for PRIO qdisc, which includes tx packets and bytes
count, drops count and backlog size. The rest of the stats are irrelevant
for this qdisc offload.
Since backlog is not only incremental but reflecting momentary value, in
case of a qdisc that stops being offloaded but is not destroyed, backlog
value needs to be updated about the un-offloading.
For that reason an unoffload function is being added to the ops struct.

Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Reviewed-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-14 12:21:12 -05:00
Nogah Frankel
46a3615be4 mlxsw: spectrum: qdiscs: Support PRIO qdisc offload
Add support for offloading PRIO qdisc as root qdisc.
The support is for up to 8 bands.
Routed packets priority is determined by the DSCP field with the default
translations. Bridged packets priority is determined by the PCP field, if
exist, otherwise it is set to 0.
Since both options have only priorities 0-7, higher priorities mapping are
being ignored.

Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Reviewed-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-14 12:21:12 -05:00
David S. Miller
19d28fbd30 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
BPF alignment tests got a conflict because the registers
are output as Rn_w instead of just Rn in net-next, and
in net a fixup for a testcase prohibits logical operations
on pointers before using them.

Also, we should attempt to patch BPF call args if JIT always on is
enabled.  Instead, if we fail to JIT the subprogs we should pass
an error back up and fail immediately.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-11 22:13:42 -05:00
Nogah Frankel
56202ca4ed mlxsw: spectrum: qdiscs: Remove qdisc before setting a new one
If a qdisc is being replaced by another qdisc of the same type, it can
simply override over its configuration.
However, if it replaces a qdisc of another type, it needs to be removed
before setting the new qdisc.

Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Reviewed-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-10 16:07:41 -05:00
Nogah Frankel
9cf6c9c758 mlxsw: spectrum: qdiscs: Create a generic replace function
Create a generic qdisc replace function.
For that goal, add three functions to the qdisc ops struct:
* check_params: Checks if the given parameters are offloadable.
* replace: Offload the given parameters.
* clean_stats: clean the qdisc stats for the offloaded qdisc.
integrate RED offloading into using the new internal replace API.

Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Reviewed-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-10 16:07:41 -05:00
Nogah Frankel
9a37a59f71 mlxsw: spectrum: qdiscs: Create a generic destroy function
Add a destroy function to the qdiscs ops struct.
Create a generic qdisc destroy function, that clears the qdisc metadata as
well as calling the specific qdisc destroy function.

Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Reviewed-by: Yuval Mintz <yuvalm@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-10 16:07:41 -05:00