Merge branch 'dsa-docs'
Florian Fainelli says:

====================
Documentation: dsa

This patch series adds some documentation about DSA as a subsystem as well as
the SF2 driver since it slightly diverges from your average DSA driver ;)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
commit
b635f0901a
114
Documentation/networking/dsa/bcm_sf2.txt
Normal file
@ -0,0 +1,114 @@
|
||||
Broadcom Starfighter 2 Ethernet switch driver
|
||||
=============================================
|
||||
|
||||
Broadcom's Starfighter 2 Ethernet switch hardware block is commonly found and
|
||||
deployed in the following products:
|
||||
|
||||
- xDSL gateways such as BCM63138
|
||||
- streaming/multimedia Set Top Box such as BCM7445
|
||||
- Cable Modem/residential gateways such as BCM7145/BCM3390
|
||||
|
||||
The switch is typically deployed in a configuration involving between 5 and 13
ports, offering a range of built-in and customizable interfaces:
|
||||
|
||||
- single integrated Gigabit PHY
|
||||
- quad integrated Gigabit PHY
|
||||
- quad external Gigabit PHY w/ MDIO multiplexer
|
||||
- integrated MoCA PHY
|
||||
- several external MII/RevMII/GMII/RGMII interfaces
|
||||
|
||||
The switch also supports specific congestion control features which allow MoCA
|
||||
fail-over not to lose packets during a MoCA role re-election, as well as out of
|
||||
band back-pressure to the host CPU network interface when downstream interfaces
|
||||
are connected at a lower speed.
|
||||
|
||||
The switch hardware block is typically interfaced using MMIO accesses and
contains several sub-blocks/registers:
|
||||
|
||||
* SWITCH_CORE: common switch registers
|
||||
* SWITCH_REG: external interfaces switch register
|
||||
* SWITCH_MDIO: external MDIO bus controller (there is another one in SWITCH_CORE,
|
||||
which is used for indirect PHY accesses)
|
||||
* SWITCH_INDIR_RW: 64-bit wide register helper block
|
||||
* SWITCH_INTRL2_0/1: Level-2 interrupt controllers
|
||||
* SWITCH_ACB: Admission control block
|
||||
* SWITCH_FCB: Fail-over control block
|
||||
|
||||
Implementation details
|
||||
======================
|
||||
|
||||
The driver is located in drivers/net/dsa/bcm_sf2.c and is implemented as a DSA
driver; see Documentation/networking/dsa/dsa.txt for details on the subsystem
and what it provides.
|
||||
|
||||
The SF2 switch is configured to enable a Broadcom-specific 4-byte switch tag
which gets inserted by the switch for every packet forwarded to the CPU
interface. Conversely, the CPU network interface should insert a similar tag for
packets entering the CPU port. The tag format is described in
net/dsa/tag_brcm.c.
|
||||
|
||||
Overall, the SF2 driver is a fairly regular DSA driver; there are a few
|
||||
specifics covered below.
|
||||
|
||||
Device Tree probing
|
||||
-------------------
|
||||
|
||||
The DSA platform device driver is probed using a specific compatible string
provided in net/dsa/dsa.c, because the DSA subsystem is currently registered
as a platform device driver. DSA will provide the needed
device_node pointers which are then accessible by the switch driver setup
function to set up resources such as register ranges and interrupts. This
currently works very well because none of the of_* functions utilized by the
driver require a struct device to be bound to a struct device_node, but things
may change in the future.
|
||||
|
||||
MDIO indirect accesses
|
||||
----------------------
|
||||
|
||||
Due to a limitation in how Broadcom switches have been designed, external
Broadcom switches connected to an SF2 require the use of the DSA slave MDIO bus
in order to properly configure them. By default, the SF2 pseudo-PHY and
an external switch pseudo-PHY will both be snooping for incoming MDIO
transactions, since they are at the same address (30), resulting in some kind of
"double" programming. Using DSA, and setting ds->phys_mii_mask accordingly, we
selectively divert reads and writes towards the external Broadcom switches'
pseudo-PHY addresses. Newer revisions of the SF2 hardware have introduced a
configurable pseudo-PHY address which circumvents the initial design limitation.
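
The diversion described above can be pictured as a per-address bitmask test.
The following is only a minimal sketch and not the driver's code: the
sketch_* names are hypothetical, and only the pseudo-PHY address (30) and the
idea of a phys_mii_mask bitmask come from the text.

```c
#include <stdint.h>

/* Pseudo-PHY address shared by the SF2 and an external switch (from the text). */
#define SKETCH_PSEUDO_PHY_ADDR 30

/* Hypothetical stand-in for the phys_mii_mask field of a DSA switch. */
struct sketch_mdio_switch {
	uint32_t phys_mii_mask; /* bit N set: address N goes through the slave MDIO bus */
};

enum sketch_mdio_path {
	SKETCH_PATH_DEFAULT,  /* access is left alone */
	SKETCH_PATH_DIVERTED, /* access is intercepted by the switch driver */
};

/* Decide which path an MDIO access to PHY address 'addr' takes. */
static enum sketch_mdio_path
sketch_mdio_route(const struct sketch_mdio_switch *ds, unsigned int addr)
{
	if (addr < 32 && (ds->phys_mii_mask & (UINT32_C(1) << addr)))
		return SKETCH_PATH_DIVERTED;
	return SKETCH_PATH_DEFAULT;
}
```

With only bit 30 set in the mask, accesses to the pseudo-PHY address are
diverted while every other address is left untouched.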
|
||||
|
||||
Multimedia over CoAxial (MoCA) interfaces
|
||||
-----------------------------------------
|
||||
|
||||
MoCA interfaces are fairly specific and require the use of a firmware blob which
|
||||
gets loaded onto the MoCA processor(s) for packet processing. The switch
|
||||
hardware contains logic which will assert/de-assert link states accordingly for
|
||||
the MoCA interface whenever the MoCA coaxial cable gets disconnected or the
|
||||
firmware gets reloaded. The SF2 driver relies on such events to properly set its
|
||||
MoCA interface carrier state and properly report this to the networking stack.
|
||||
|
||||
The MoCA interfaces are supported using the PHY library's fixed PHY/emulated PHY
|
||||
device and the switch driver registers a fixed_link_update callback for such
|
||||
PHYs which reflects the link state obtained from the interrupt handler.
|
||||
|
||||
|
||||
Power Management
|
||||
----------------
|
||||
|
||||
Whenever possible, the SF2 driver tries to minimize the overall switch power
|
||||
consumption by applying a combination of:
|
||||
|
||||
- turning off internal buffers/memories
|
||||
- disabling packet processing logic
|
||||
- putting integrated PHYs in IDDQ/low-power
|
||||
- reducing the switch core clock based on the active port count
|
||||
- enabling and advertising EEE
|
||||
- turning off RGMII data processing logic when the link goes down
|
||||
|
||||
Wake-on-LAN
|
||||
-----------
|
||||
|
||||
Wake-on-LAN is currently implemented by utilizing the host processor Ethernet
|
||||
MAC controller wake-on logic. Whenever Wake-on-LAN is requested, an intersection
|
||||
between the user request and the supported host Ethernet interface WoL
|
||||
capabilities is done and the intersection result gets configured. During
|
||||
system-wide suspend/resume, only ports not participating in Wake-on-LAN are
|
||||
disabled.
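
The intersection step can be illustrated with a plain bitwise AND. This is a
sketch under assumptions: the SKETCH_WAKE_* flags are local stand-ins defined
here for illustration, and the helper name is hypothetical.

```c
#include <stdint.h>

/* Local, illustrative wake-up flags (stand-ins, not the kernel's definitions). */
#define SKETCH_WAKE_PHY   (UINT32_C(1) << 0)
#define SKETCH_WAKE_UCAST (UINT32_C(1) << 1)
#define SKETCH_WAKE_MAGIC (UINT32_C(1) << 5)

/* Keep only the wake-up modes that the host Ethernet MAC can also honour. */
static uint32_t sketch_wol_effective(uint32_t requested, uint32_t host_supported)
{
	return requested & host_supported;
}
```

A request for PHY and magic-packet wake-up on a host that supports unicast and
magic-packet wake-up would therefore end up configured for magic packets only.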
|
615
Documentation/networking/dsa/dsa.txt
Normal file
@ -0,0 +1,615 @@
|
||||
Distributed Switch Architecture
|
||||
===============================
|
||||
|
||||
Introduction
|
||||
============
|
||||
|
||||
This document describes the Distributed Switch Architecture (DSA) subsystem
|
||||
design principles, limitations, interactions with other subsystems, and how to
|
||||
develop drivers for this subsystem as well as a TODO for developers interested
|
||||
in joining the effort.
|
||||
|
||||
Design principles
|
||||
=================
|
||||
|
||||
The Distributed Switch Architecture is a subsystem which was primarily designed
|
||||
to support Marvell Ethernet switches (MV88E6xxx, a.k.a Linkstreet product line)
|
||||
using Linux, but has since evolved to support other vendors as well.
|
||||
|
||||
The original philosophy behind this design was to allow unmodified
Linux tools such as bridge, iproute2 and ifconfig to work transparently whether
they configure/query a switch port network device or a regular network
device.
|
||||
|
||||
An Ethernet switch is typically comprised of multiple front-panel ports, and one
or more CPU or management ports. The DSA subsystem currently relies on the
presence of a management port connected to an Ethernet controller capable of
receiving Ethernet frames from the switch. This is a very common setup for all
kinds of Ethernet switches found in Small Home and Office products: routers,
gateways, or even top-of-rack switches. This host Ethernet controller will
be later referred to as "master" and "cpu" in DSA terminology and code.
|
||||
|
||||
The D in DSA stands for Distributed, because the subsystem has been designed
|
||||
with the ability to configure and manage cascaded switches on top of each other
|
||||
using upstream and downstream Ethernet links between switches. These specific
|
||||
ports are referred to as "dsa" ports in DSA terminology and code. A collection
|
||||
of multiple switches connected to each other is called a "switch tree".
|
||||
|
||||
For each front-panel port, DSA will create specialized network devices which are
|
||||
used as controlling and data-flowing endpoints for use by the Linux networking
|
||||
stack. These specialized network interfaces are referred to as "slave" network
|
||||
interfaces in DSA terminology and code.
|
||||
|
||||
The ideal case for using DSA is when an Ethernet switch supports a "switch tag",
a hardware feature which makes the switch insert a specific tag into each
Ethernet frame it receives from, or forwards to, specific ports, to help the
management interface figure out:
|
||||
|
||||
- what port is this frame coming from
|
||||
- what was the reason why this frame got forwarded
|
||||
- how to send CPU originated traffic to specific ports
|
||||
|
||||
The subsystem does support switches not capable of inserting/stripping tags, but
|
||||
the features might be slightly limited in that case (traffic separation relies
|
||||
on Port-based VLAN IDs).
|
||||
|
||||
Note that DSA does not currently create network interfaces for the "cpu" and
|
||||
"dsa" ports because:
|
||||
|
||||
- the "cpu" port is the Ethernet switch facing side of the management
|
||||
controller, and as such, would create a duplication of feature, since you
|
||||
would get two interfaces for the same conduit: master netdev, and "cpu" netdev
|
||||
|
||||
- the "dsa" port(s) are just conduits between two or more switches, and as such
|
||||
cannot really be used as proper network interfaces either, only the
|
||||
downstream, or the top-most upstream interface makes sense with that model
|
||||
|
||||
Switch tagging protocols
|
||||
------------------------
|
||||
|
||||
DSA currently supports 4 different tagging protocols, and a tag-less mode as
|
||||
well. The different protocols are implemented in:
|
||||
|
||||
net/dsa/tag_trailer.c: Marvell's 4-byte trailer tag mode (legacy)
net/dsa/tag_dsa.c: Marvell's original DSA tag
net/dsa/tag_edsa.c: Marvell's enhanced DSA tag
net/dsa/tag_brcm.c: Broadcom's 4-byte tag
|
||||
|
||||
The exact format of the tag protocol is vendor specific, but in general, they
|
||||
all contain something which:
|
||||
|
||||
- identifies which port the Ethernet frame came from/should be sent to
|
||||
- provides a reason why this frame was forwarded to the management interface
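
To make the two properties above concrete, here is a minimal sketch of a
made-up 4-byte tag placed right after the MAC addresses. The layout (reason
code in byte 1, port number in byte 3) is purely hypothetical and does not
match any of the vendor formats listed above; see the tag_*.c files for the
real ones.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAC_ADDRS_LEN 12 /* destination + source MAC addresses */
#define SKETCH_TAG_LEN        4 /* hypothetical tag size */

/*
 * Insert the made-up tag after the MAC addresses. 'frame' must have room
 * for len + SKETCH_TAG_LEN bytes. Returns the new frame length.
 */
static size_t sketch_tag_insert(uint8_t *frame, size_t len,
				uint8_t port, uint8_t reason)
{
	uint8_t *tag = frame + SKETCH_MAC_ADDRS_LEN;

	/* Make room by shifting the EtherType and payload forward. */
	memmove(tag + SKETCH_TAG_LEN, tag, len - SKETCH_MAC_ADDRS_LEN);
	tag[0] = 0;
	tag[1] = reason;
	tag[2] = 0;
	tag[3] = port;
	return len + SKETCH_TAG_LEN;
}

/* Read the port and reason back, then remove the tag from the frame. */
static size_t sketch_tag_strip(uint8_t *frame, size_t len,
			       uint8_t *port, uint8_t *reason)
{
	uint8_t *tag = frame + SKETCH_MAC_ADDRS_LEN;

	*reason = tag[1];
	*port = tag[3];
	memmove(tag, tag + SKETCH_TAG_LEN,
		len - SKETCH_MAC_ADDRS_LEN - SKETCH_TAG_LEN);
	return len - SKETCH_TAG_LEN;
}
```

Stripping a frame that was just tagged yields the original frame again, along
with the port and reason that were encoded.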
|
||||
|
||||
Master network devices
|
||||
----------------------
|
||||
|
||||
Master network devices are regular, unmodified Linux network device drivers for
|
||||
the CPU/management Ethernet interface. Such a driver might occasionally need to
|
||||
know whether DSA is enabled (e.g.: to enable/disable specific offload features),
|
||||
but the DSA subsystem has been proven to work with industry standard drivers:
|
||||
e1000e, mv643xx_eth etc. without having to introduce modifications to these
|
||||
drivers. Such network devices are also often referred to as conduit network
|
||||
devices since they act as a pipe between the host processor and the hardware
|
||||
Ethernet switch.
|
||||
|
||||
Networking stack hooks
|
||||
----------------------
|
||||
|
||||
When a master netdev is used with DSA, a small hook is placed in the
networking stack in order to have the DSA subsystem process the Ethernet
switch specific tagging protocol. DSA accomplishes this by registering a
specific (and fake) Ethernet type (later becoming skb->protocol) with the
networking stack; this is also known as a ptype or packet_type. A typical
Ethernet frame receive sequence looks like this:
|
||||
|
||||
Master network device (e.g.: e1000e):

Receive interrupt fires:
- receive function is invoked
- basic packet processing is done: getting length, status etc.
- packet is prepared to be processed by the Ethernet layer by calling
  eth_type_trans

net/ethernet/eth.c:

eth_type_trans(skb, dev)
        if (dev->dsa_ptr != NULL)
                -> skb->protocol = ETH_P_XDSA

drivers/net/ethernet/*:

netif_receive_skb(skb)
        -> iterate over registered packet_type
                -> invoke handler for ETH_P_XDSA, calls dsa_switch_rcv()

net/dsa/dsa.c:
        -> dsa_switch_rcv()
                -> invoke switch tag specific protocol handler in
                   net/dsa/tag_*.c

net/dsa/tag_*.c:
        -> inspect and strip switch tag protocol to determine originating port
        -> locate per-port network device
        -> invoke eth_type_trans() with the DSA slave network device
        -> invoke netif_receive_skb()
|
||||
|
||||
Past this point, the DSA slave network devices get delivered regular Ethernet
|
||||
frames that can be processed by the networking stack.
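
The two decision points in the sequence above can be modelled in user space as
follows. This is a sketch only: the sketch_* types and helpers are hypothetical
simplifications of struct net_device and the tag handlers, and
SKETCH_ETH_P_XDSA is a local constant assumed to mirror the kernel's
ETH_P_XDSA value.

```c
#include <stddef.h>
#include <stdint.h>

/* Local constant assumed to mirror ETH_P_XDSA; treat the value as an assumption. */
#define SKETCH_ETH_P_XDSA 0x00F8

struct sketch_netdev {
	const char *name;
	const void *dsa_ptr; /* non-NULL once the device is a DSA master */
};

/* Model of the eth_type_trans() check: DSA masters get the fake protocol. */
static uint16_t sketch_eth_type_trans(const uint8_t *frame,
				      const struct sketch_netdev *dev)
{
	if (dev->dsa_ptr != NULL)
		return SKETCH_ETH_P_XDSA;
	/* Otherwise use the EtherType found after the two MAC addresses. */
	return (uint16_t)((frame[12] << 8) | frame[13]);
}

/* Model of the tag handler lookup: map the source port to its slave device. */
static const struct sketch_netdev *
sketch_find_slave(const struct sketch_netdev *slaves, size_t nslaves,
		  unsigned int port)
{
	return port < nslaves ? &slaves[port] : NULL;
}
```

The point of the fake protocol is that no per-driver change is needed: any
frame arriving on a master device is routed to the DSA handler, which then
hands it to the right slave device.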
|
||||
|
||||
Slave network devices
|
||||
---------------------
|
||||
|
||||
Slave network devices created by DSA are stacked on top of their master network
device. Each of these network interfaces is responsible for being a
controlling and data-flowing endpoint for each front-panel port of the switch.
These interfaces are specialized in order to:
|
||||
|
||||
- insert/remove the switch tag protocol (if it exists) when sending traffic
|
||||
to/from specific switch ports
|
||||
- query the switch for ethtool operations: statistics, link state,
|
||||
Wake-on-LAN, register dumps...
|
||||
- external/internal PHY management: link, auto-negotiation etc.
|
||||
|
||||
These slave network devices have custom net_device_ops and ethtool_ops function
|
||||
pointers which allow DSA to introduce a level of layering between the networking
|
||||
stack/ethtool, and the switch driver implementation.
|
||||
|
||||
Upon frame transmission from these slave network devices, DSA will look up which
|
||||
switch tagging protocol is currently registered with these network devices, and
|
||||
invoke a specific transmit routine which takes care of adding the relevant
|
||||
switch tag in the Ethernet frames.
|
||||
|
||||
These frames are then queued for transmission using the master network device
ndo_start_xmit() function. Since they contain the appropriate switch tag, the
Ethernet switch will be able to process these incoming frames from the
management interface and deliver them to the physical switch port.
|
||||
|
||||
Graphical representation
|
||||
------------------------
|
||||
|
||||
Summarized, this is basically what DSA looks like from a network device
perspective:
|
||||
|
||||
|
||||
                |---------------------------
                | CPU network device (eth0)|
                ----------------------------
                | <tag added by switch     |
                |                          |
                |                          |
                |        tag added by CPU> |
        |--------------------------------------------|
        | Switch driver                              |
        |--------------------------------------------|
           ||         ||         ||
        |-------|  |-------|  |-------|
        | sw0p0 |  | sw0p1 |  | sw0p2 |
        |-------|  |-------|  |-------|
|
||||
|
||||
Slave MDIO bus
|
||||
--------------
|
||||
|
||||
In order to be able to read from and write to a PHY built into a switch, DSA
creates a slave MDIO bus which allows a specific switch driver to divert and
intercept MDIO reads/writes towards specific PHY addresses. In most
MDIO-connected switches, these functions would utilize direct or indirect PHY
addressing mode to return standard MII registers from the switch builtin PHYs,
allowing the PHY library to return link status, link partner pages,
auto-negotiation results etc.
|
||||
|
||||
For Ethernet switches which have both external and internal MDIO busses, the
|
||||
slave MII bus can be utilized to mux/demux MDIO reads and writes towards either
|
||||
internal or external MDIO devices this switch might be connected to: internal
|
||||
PHYs, external PHYs, or even external switches.
|
||||
|
||||
Data structures
|
||||
---------------
|
||||
|
||||
DSA data structures are defined in include/net/dsa.h as well as
|
||||
net/dsa/dsa_priv.h.
|
||||
|
||||
dsa_chip_data: platform data configuration for a given switch device, this
|
||||
structure describes a switch device's parent device, its address, as well as
|
||||
various properties of its ports: names/labels, and finally a routing table
|
||||
indication (when cascading switches)
|
||||
|
||||
dsa_platform_data: platform device configuration data which can reference a
collection of dsa_chip_data structures if multiple switches are cascaded; the
master network device this switch tree is attached to needs to be referenced
|
||||
|
||||
dsa_switch_tree: structure assigned to the master network device under
|
||||
"dsa_ptr", this structure references a dsa_platform_data structure as well as
|
||||
the tagging protocol supported by the switch tree, and which receive/transmit
|
||||
function hooks should be invoked, information about the directly attached switch
|
||||
is also provided: CPU port. Finally, a collection of dsa_switch are referenced
|
||||
to address individual switches in the tree.
|
||||
|
||||
dsa_switch: structure describing a switch device in the tree, referencing a
|
||||
dsa_switch_tree as a backpointer, slave network devices, master network device,
|
||||
and a reference to the backing dsa_switch_driver
|
||||
|
||||
dsa_switch_driver: structure referencing function pointers, see below for a full
|
||||
description.
|
||||
|
||||
Design limitations
|
||||
==================
|
||||
|
||||
DSA is a platform device driver
|
||||
-------------------------------
|
||||
|
||||
DSA is implemented as a platform device driver, which is convenient because
it will register the entire DSA switch tree attached to a master network device
in one shot, facilitating the device creation and simplifying the device driver
model a bit. This comes, however, with a number of limitations:
|
||||
|
||||
- building DSA and its switch drivers as modules is currently not working
|
||||
- the device driver parenting does not necessarily reflect the original
|
||||
bus/device the switch can be created from
|
||||
- supporting non-MDIO and non-MMIO (platform) switches is not possible
|
||||
|
||||
Limits on the number of devices and ports
|
||||
-----------------------------------------
|
||||
|
||||
DSA currently limits the maximum number of switches within a tree to 4
(DSA_MAX_SWITCHES), and the number of ports per switch to 12 (DSA_MAX_PORTS).
These limits could be extended to support larger configurations should this need
arise.
|
||||
|
||||
Lack of CPU/DSA network devices
|
||||
-------------------------------
|
||||
|
||||
DSA does not currently create slave network devices for the CPU or DSA ports, as
|
||||
described before. This might be an issue in the following cases:
|
||||
|
||||
- inability to fetch switch CPU port statistics counters using ethtool, which
can make it harder to debug MDIO switches connected using xMII interfaces
|
||||
|
||||
- inability to configure the CPU port link parameters based on the Ethernet
|
||||
controller capabilities attached to it: http://patchwork.ozlabs.org/patch/509806/
|
||||
|
||||
- inability to configure specific VLAN IDs / trunking VLANs between switches
|
||||
when using a cascaded setup
|
||||
|
||||
Common pitfalls using DSA setups
|
||||
--------------------------------
|
||||
|
||||
Once a master network device is configured to use DSA (dev->dsa_ptr becomes
non-NULL), and the switch behind it expects a tagging protocol, this network
interface can only be used as a conduit interface. Sending packets
directly through this interface (e.g.: opening a socket using this interface)
will not make us go through the switch tagging protocol transmit function, so
the Ethernet switch on the other end, expecting a tag, will typically drop this
frame.
|
||||
|
||||
Slave network devices check that the master network device is UP before allowing
|
||||
you to administratively bring UP these slave network devices. A common
|
||||
configuration mistake is forgetting to bring UP the master network device first.
|
||||
|
||||
Interactions with other subsystems
|
||||
==================================
|
||||
|
||||
DSA currently leverages the following subsystems:
|
||||
|
||||
- MDIO/PHY library: drivers/net/phy/phy.c, mdio_bus.c
|
||||
- Switchdev: net/switchdev/*
|
||||
- Device Tree for various of_* functions
|
||||
- HWMON: drivers/hwmon/*
|
||||
|
||||
MDIO/PHY library
|
||||
----------------
|
||||
|
||||
Slave network devices exposed by DSA may or may not be interfacing with PHY
|
||||
devices (struct phy_device as defined in include/linux/phy.h), but the DSA
|
||||
subsystem deals with all possible combinations:
|
||||
|
||||
- internal PHY devices, built into the Ethernet switch hardware
|
||||
- external PHY devices, connected via an internal or external MDIO bus
|
||||
- internal PHY devices, connected via an internal MDIO bus
|
||||
- special, non-autonegotiated or non MDIO-managed PHY devices: SFPs, MoCA; a.k.a
|
||||
fixed PHYs
|
||||
|
||||
The PHY configuration is done by the dsa_slave_phy_setup() function and the
|
||||
logic basically looks like this:
|
||||
|
||||
- if Device Tree is used, the PHY device is looked up using the standard
|
||||
"phy-handle" property, if found, this PHY device is created and registered
|
||||
using of_phy_connect()
|
||||
|
||||
- if Device Tree is used, and the PHY device is "fixed", that is, conforms to
|
||||
the definition of a non-MDIO managed PHY as defined in
|
||||
Documentation/devicetree/bindings/net/fixed-link.txt, the PHY is registered
|
||||
and connected transparently using the special fixed MDIO bus driver
|
||||
|
||||
- finally, if the PHY is built into the switch, as is very common with
|
||||
standalone switch packages, the PHY is probed using the slave MII bus created
|
||||
by DSA
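
The three cases above amount to a simple priority order, which the following
sketch spells out. The structure and enum are hypothetical and only model the
decision; the real logic lives in dsa_slave_phy_setup() and works on
device_node pointers rather than booleans.

```c
#include <stdbool.h>

/* Where the PHY for a port ends up coming from (hypothetical names). */
enum sketch_phy_source {
	SKETCH_PHY_FROM_PHY_HANDLE, /* Device Tree "phy-handle" property */
	SKETCH_PHY_FIXED_LINK,      /* fixed, non-MDIO managed PHY */
	SKETCH_PHY_SLAVE_MII_BUS,   /* probed on the DSA slave MII bus */
};

/* Simplified description of what is known about a port. */
struct sketch_port_desc {
	bool has_device_tree;
	bool has_phy_handle;
	bool is_fixed_link;
};

/* Apply the three cases in the order in which the text lists them. */
static enum sketch_phy_source
sketch_pick_phy_source(const struct sketch_port_desc *p)
{
	if (p->has_device_tree && p->has_phy_handle)
		return SKETCH_PHY_FROM_PHY_HANDLE;
	if (p->has_device_tree && p->is_fixed_link)
		return SKETCH_PHY_FIXED_LINK;
	return SKETCH_PHY_SLAVE_MII_BUS;
}
```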
|
||||
|
||||
|
||||
SWITCHDEV
|
||||
---------
|
||||
|
||||
DSA directly utilizes SWITCHDEV when interfacing with the bridge layer, and
more specifically with its VLAN filtering portion when configuring VLANs on top
of per-port slave network devices. Since DSA primarily deals with
MDIO-connected switches, although not exclusively, SWITCHDEV's
prepare/abort/commit phases are often simplified into a prepare phase which
checks whether the operation is supported by the DSA switch driver, and a commit
phase which applies the changes.
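
A minimal sketch of this simplified two-phase model is shown below. All names
are hypothetical; in particular the callback name and the use of -EOPNOTSUPP
for an unsupported operation are assumptions made for illustration.

```c
#include <errno.h>
#include <stddef.h>

enum sketch_phase { SKETCH_PREPARE, SKETCH_COMMIT };

/* Reduced, hypothetical set of switch driver callbacks. */
struct sketch_switch_ops {
	int (*port_vlan_add)(int port, unsigned int vid); /* may be NULL */
};

/* Prepare only checks for support; commit applies the change. */
static int sketch_vlan_add(const struct sketch_switch_ops *ops,
			   enum sketch_phase phase, int port, unsigned int vid)
{
	if (!ops->port_vlan_add)
		return -EOPNOTSUPP;
	if (phase == SKETCH_PREPARE)
		return 0;
	return ops->port_vlan_add(port, vid);
}

/* Demo callback that records the last VLAN ID it was asked to program. */
static unsigned int sketch_last_vid;

static int sketch_demo_port_vlan_add(int port, unsigned int vid)
{
	(void)port;
	sketch_last_vid = vid;
	return 0;
}
```

Nothing reaches the hardware during the prepare phase, so a failed prepare
needs no rollback.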
|
||||
|
||||
As of today, the only SWITCHDEV objects supported by DSA are the FDB and VLAN
|
||||
objects.
|
||||
|
||||
Device Tree
|
||||
-----------
|
||||
|
||||
DSA features a standardized binding which is documented in
|
||||
Documentation/devicetree/bindings/net/dsa/dsa.txt. PHY/MDIO library helper
|
||||
functions such as of_get_phy_mode(), of_phy_connect() are also used to query
|
||||
per-port PHY specific details: interface connection, MDIO bus location etc.
|
||||
|
||||
HWMON
|
||||
-----
|
||||
|
||||
Some switch drivers feature internal temperature sensors which are exposed as
|
||||
regular HWMON devices in /sys/class/hwmon/.
|
||||
|
||||
Driver development
|
||||
==================
|
||||
|
||||
DSA switch drivers need to implement a dsa_switch_driver structure which will
|
||||
contain the various members described below.
|
||||
|
||||
register_switch_driver() registers this dsa_switch_driver in its internal list
|
||||
of drivers to probe for. unregister_switch_driver() does the exact opposite.
|
||||
|
||||
Unless requested differently by setting the priv_size member accordingly, DSA
|
||||
does not allocate any driver private context space.
|
||||
|
||||
Switch configuration
|
||||
--------------------
|
||||
|
||||
- priv_size: additional size needed by the switch driver for its private context
|
||||
|
||||
- tag_protocol: this is to indicate what kind of tagging protocol is supported,
|
||||
should be a valid value from the dsa_tag_protocol enum
|
||||
|
||||
- probe: probe routine which will be invoked by the DSA platform device upon
|
||||
registration to test for the presence/absence of a switch device. For MDIO
|
||||
devices, it is recommended to issue a read towards internal registers using
|
||||
the switch pseudo-PHY and return whether this is a supported device. For other
|
||||
buses, return a non-NULL string
|
||||
|
||||
- setup: setup function for the switch, this function is responsible for setting
|
||||
up the dsa_switch_driver private structure with all it needs: register maps,
|
||||
interrupts, mutexes, locks etc. This function is also expected to properly
|
||||
configure the switch to separate all network interfaces from each other, that
|
||||
is, they should be isolated by the switch hardware itself, typically by creating
|
||||
a Port-based VLAN ID for each port and allowing only the CPU port and the
|
||||
specific port to be in the forwarding vector. Ports that are unused by the
|
||||
platform should be disabled. Past this function, the switch is expected to be
|
||||
fully configured and ready to serve any kind of request. It is recommended
|
||||
to issue a software reset of the switch during this setup function in order to
|
||||
avoid relying on what a previous software agent such as a bootloader/firmware
|
||||
may have previously configured.
|
||||
|
||||
- set_addr: Some switches require the programming of the management interface's
|
||||
Ethernet MAC address, switch drivers can also disable ageing of MAC addresses
|
||||
on the management interface and "hardcode"/"force" this MAC address for the
|
||||
CPU/management interface as an optimization
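
As an illustration of the probe and setup members described above, the sketch
below models a reduced driver structure in user space. It is not the kernel's
dsa_switch_driver: the structure layout, the chip ID and the forwarding-vector
representation are all hypothetical, and only the ideas (probe returns a
non-NULL string on success, setup isolates each port with the CPU port) come
from the text.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_PORTS 12      /* same figure as the DSA_MAX_PORTS limit */
#define SKETCH_CHIP_ID   0x1234u /* made-up identifier */

struct sketch_iso_switch {
	unsigned int cpu_port;
	uint32_t enabled_ports;              /* ports used by the platform */
	uint32_t fwd_mask[SKETCH_MAX_PORTS]; /* per-port forwarding vector */
};

/* Reduced, hypothetical view of a switch driver. */
struct sketch_switch_driver {
	size_t priv_size;
	const char *(*probe)(uint32_t chip_id);
	int (*setup)(struct sketch_iso_switch *sw);
};

/* Return a name for supported devices, NULL otherwise. */
static const char *sketch_probe(uint32_t chip_id)
{
	return chip_id == SKETCH_CHIP_ID ? "Sketch switch" : NULL;
}

/* Each user port may only talk to the CPU port; unused ports get nothing. */
static int sketch_setup(struct sketch_iso_switch *sw)
{
	uint32_t cpu_bit = UINT32_C(1) << sw->cpu_port;
	unsigned int port;

	for (port = 0; port < SKETCH_MAX_PORTS; port++) {
		uint32_t bit = UINT32_C(1) << port;

		if (port == sw->cpu_port)
			sw->fwd_mask[port] = sw->enabled_ports & ~cpu_bit;
		else if (sw->enabled_ports & bit)
			sw->fwd_mask[port] = bit | cpu_bit;
		else
			sw->fwd_mask[port] = 0;
	}
	return 0;
}

static const struct sketch_switch_driver sketch_driver = {
	.priv_size = sizeof(struct sketch_iso_switch),
	.probe = sketch_probe,
	.setup = sketch_setup,
};
```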
|
||||
|
||||
PHY devices and link management
|
||||
-------------------------------
|
||||
|
||||
- get_phy_flags: Some switches are interfaced to various kinds of Ethernet PHYs,
|
||||
if the PHY library PHY driver needs to know about information it cannot obtain
|
||||
on its own (e.g.: coming from switch memory mapped registers), this function
|
||||
should return a 32-bit bitmask of "flags", that is private between the switch
|
||||
driver and the Ethernet PHY driver in drivers/net/phy/*.
|
||||
|
||||
- phy_read: Function invoked by the DSA slave MDIO bus when attempting to read
|
||||
the switch port MDIO registers. If unavailable, return 0xffff for each read.
|
||||
For builtin switch Ethernet PHYs, this function should allow reading the link
|
||||
status, auto-negotiation results, link partner pages etc.
|
||||
|
||||
- phy_write: Function invoked by the DSA slave MDIO bus when attempting to write
|
||||
to the switch port MDIO registers. If unavailable return a negative error
|
||||
code.
|
||||
|
||||
- poll_link: Function invoked by DSA to query the link state of the switch
|
||||
builtin Ethernet PHYs, per port. This function is responsible for calling
|
||||
netif_carrier_{on,off} when appropriate, and can be used to poll all ports in a
|
||||
single call. Executes from workqueue context.
|
||||
|
||||
- adjust_link: Function invoked by the PHY library when a slave network device
|
||||
is attached to a PHY device. This function is responsible for appropriately
|
||||
configuring the switch port link parameters: speed, duplex, pause based on
|
||||
what the phy_device is providing.
|
||||
|
||||
- fixed_link_update: Function invoked by the PHY library, and specifically by
|
||||
the fixed PHY driver asking the switch driver for link parameters that could
|
||||
not be auto-negotiated, or obtained by reading the PHY registers through MDIO.
|
||||
This is particularly useful for specific kinds of hardware such as QSGMII,
|
||||
MoCA or other kinds of non-MDIO managed PHYs where out of band link
|
||||
information is obtained
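
The fallback behaviour expected from phy_read and phy_write can be sketched as
follows. The register file is a hypothetical in-memory model, and -ENODEV is
just one possible choice for the negative error code mentioned above.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NUM_PORTS 4
#define SKETCH_NUM_REGS  32

/* Hypothetical in-memory model of per-port PHY registers. */
struct sketch_phy_switch {
	bool has_phy[SKETCH_NUM_PORTS];
	uint16_t regs[SKETCH_NUM_PORTS][SKETCH_NUM_REGS];
};

static bool sketch_phy_valid(const struct sketch_phy_switch *sw, int port, int reg)
{
	return port >= 0 && port < SKETCH_NUM_PORTS &&
	       reg >= 0 && reg < SKETCH_NUM_REGS && sw->has_phy[port];
}

/* Reads from a missing PHY return all ones, like an idle MDIO bus. */
static int sketch_phy_read(const struct sketch_phy_switch *sw, int port, int reg)
{
	if (!sketch_phy_valid(sw, port, reg))
		return 0xffff;
	return sw->regs[port][reg];
}

/* Writes to a missing PHY fail with a negative error code. */
static int sketch_phy_write(struct sketch_phy_switch *sw, int port, int reg,
			    uint16_t val)
{
	if (!sketch_phy_valid(sw, port, reg))
		return -ENODEV;
	sw->regs[port][reg] = val;
	return 0;
}
```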
|
||||
|
||||
Ethtool operations
|
||||
------------------
|
||||
|
||||
- get_strings: ethtool function used to query the driver's strings, will
|
||||
typically return statistics strings, private flags strings etc.
|
||||
|
||||
- get_ethtool_stats: ethtool function used to query per-port statistics and
|
||||
return their values. DSA overlays slave network devices general statistics:
|
||||
RX/TX counters from the network device, with switch driver specific statistics
|
||||
per port
|
||||
|
||||
- get_sset_count: ethtool function used to query the number of statistics items
|
||||
|
||||
- get_wol: ethtool function used to obtain Wake-on-LAN settings per-port, this
|
||||
function may, for certain implementations also query the master network device
|
||||
Wake-on-LAN settings if this interface needs to participate in Wake-on-LAN
|
||||
|
||||
- set_wol: ethtool function used to configure Wake-on-LAN settings per-port,
direct counterpart to get_wol with similar restrictions
|
||||
|
||||
- set_eee: ethtool function which is used to configure a switch port EEE (Green
|
||||
Ethernet) settings, can optionally invoke the PHY library to enable EEE at the
|
||||
PHY level if relevant. This function should enable EEE at the switch port MAC
|
||||
controller and data-processing logic
|
||||
|
||||
- get_eee: ethtool function which is used to query a switch port EEE settings,
|
||||
this function should return the EEE state of the switch port MAC controller
|
||||
and data-processing logic as well as query the PHY for its currently configured
|
||||
EEE settings
|
||||
|
||||
- get_eeprom_len: ethtool function returning for a given switch the EEPROM
|
||||
length/size in bytes
|
||||
|
||||
- get_eeprom: ethtool function returning for a given switch the EEPROM contents
|
||||
|
||||
- set_eeprom: ethtool function writing specified data to a given switch EEPROM
|
||||
|
||||
- get_regs_len: ethtool function returning the register length for a given
|
||||
switch
|
||||
|
||||
- get_regs: ethtool function returning the Ethernet switch internal register
|
||||
contents. This function might require user-land code in ethtool to
|
||||
pretty-print register values and registers
|
||||
|
||||
Power management
|
||||
----------------
|
||||
|
||||
- suspend: function invoked by the DSA platform device when the system goes to
|
||||
suspend, should quiesce all Ethernet switch activities, but keep ports
|
||||
participating in Wake-on-LAN active as well as additional wake-up logic if
|
||||
supported
|
||||
|
||||
- resume: function invoked by the DSA platform device when the system resumes,
|
||||
should resume all Ethernet switch activities and re-configure the switch to be
|
||||
in a fully active state
|
||||
|
||||
- port_enable: function invoked by the DSA slave network device ndo_open
|
||||
function when a port is administratively brought up, this function should be
|
||||
fully enabling a given switch port. DSA takes care of marking the port with
|
||||
BR_STATE_BLOCKING if the port is a bridge member, or BR_STATE_FORWARDING if it
|
||||
was not, and propagating these changes down to the hardware
|
||||
|
||||
- port_disable: function invoked by the DSA slave network device ndo_close
|
||||
function when a port is administratively brought down, this function should be
|
||||
fully disabling a given switch port. DSA takes care of marking the port with
|
||||
BR_STATE_DISABLED and propagating changes to the hardware if this port is
|
||||
disabled while being a bridge member
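
The port state selection described for port_enable and port_disable can be
summarized by the sketch below. The SKETCH_BR_STATE_* enumerators are local
stand-ins for the bridge layer's BR_STATE_* constants, and the helper names are
hypothetical.

```c
#include <stdbool.h>

/* Local stand-ins for the bridge layer BR_STATE_* constants. */
enum sketch_br_state {
	SKETCH_BR_STATE_DISABLED,
	SKETCH_BR_STATE_LISTENING,
	SKETCH_BR_STATE_LEARNING,
	SKETCH_BR_STATE_FORWARDING,
	SKETCH_BR_STATE_BLOCKING,
};

/* State applied when a port is administratively brought up. */
static enum sketch_br_state sketch_state_on_enable(bool bridge_member)
{
	/* Bridge members start blocked; the bridge layer moves them later. */
	return bridge_member ? SKETCH_BR_STATE_BLOCKING
			     : SKETCH_BR_STATE_FORWARDING;
}

/* State applied when a port is administratively brought down. */
static enum sketch_br_state sketch_state_on_disable(void)
{
	return SKETCH_BR_STATE_DISABLED;
}
```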

Hardware monitoring
-------------------

These callbacks are only available if CONFIG_NET_DSA_HWMON is enabled:

- get_temp: this function queries the given switch for its temperature

- get_temp_limit: this function returns the current maximum temperature limit
of the switch

- set_temp_limit: this function configures the maximum temperature limit
allowed

- get_temp_alarm: this function reports whether the critical temperature
threshold has been crossed, which raises an alarm notification

See Documentation/hwmon/sysfs-interface for details.
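
How the four callbacks relate to each other can be sketched as below. This is
a hedged illustration only: struct demo_hwmon and the demo_* functions are
hypothetical, the alarm is assumed to be a simple comparison against the
configured limit (real hardware may latch it or apply hysteresis), and the
callbacks are assumed to work in degrees Celsius while the hwmon sysfs files
report millidegrees Celsius.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical sensor state, in degrees Celsius. */
struct demo_hwmon {
	int temp_c;
	int limit_c;
};

/* Stand-in for get_temp. */
static int demo_get_temp(const struct demo_hwmon *hw, int *temp)
{
	*temp = hw->temp_c;
	return 0;
}

/* Stand-in for get_temp_limit. */
static int demo_get_temp_limit(const struct demo_hwmon *hw, int *temp)
{
	*temp = hw->limit_c;
	return 0;
}

/* Stand-in for set_temp_limit. */
static int demo_set_temp_limit(struct demo_hwmon *hw, int temp)
{
	hw->limit_c = temp;
	return 0;
}

/* Stand-in for get_temp_alarm: assumed to compare against the limit. */
static int demo_get_temp_alarm(const struct demo_hwmon *hw, bool *alarm)
{
	*alarm = hw->temp_c >= hw->limit_c;
	return 0;
}

/* hwmon sysfs temperature files are expressed in millidegrees Celsius. */
static int demo_to_millidegrees(int temp_c)
{
	return temp_c * 1000;
}
```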

Bridge layer
------------

- port_join_bridge: bridge layer function invoked when a given switch port is
added to a bridge. This function should do what is necessary at the switch
level to allow the joining port to be added to the relevant logical domain, so
that it can ingress/egress traffic with the other members of the bridge. DSA
does nothing but calculate a bitmask of the switch ports that are currently
members of the bridge being joined

- port_leave_bridge: bridge layer function invoked when a given switch port is
removed from a bridge. This function should do what is necessary at the switch
level to prevent the leaving port from exchanging ingress/egress traffic with
the remaining bridge members. When the port leaves the bridge, it should be
aged out at the switch hardware so that the switch (re)learns the MAC addresses
behind this port. DSA calculates the bitmask of the ports that are still
members of the bridge being left

- port_stp_update: bridge layer function invoked when the STP state of a given
switch port is computed by the bridge layer and should be propagated to the
switch hardware to forward/block/learn traffic. The switch driver is
responsible for computing an STP state change based on the current and
requested parameters, and for performing the relevant ageing based on the
intersection results
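
The bitmask bookkeeping and the ageing decision described above can be
sketched as follows. Everything named demo_* or DEMO_* is hypothetical, and
the ageing rule is an assumption made for the example (flush learned addresses
when a port moves from a state in which it could learn to one in which it
cannot); a real driver may apply a different policy. The BR_STATE_* values are
local copies of the ones in <linux/if_bridge.h>.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_MAX_PORTS	8
#define DEMO_NO_BRIDGE	(-1)

/* Local copies of the STP state values from <linux/if_bridge.h>. */
#define BR_STATE_DISABLED	0
#define BR_STATE_LISTENING	1
#define BR_STATE_LEARNING	2
#define BR_STATE_FORWARDING	3
#define BR_STATE_BLOCKING	4

/* Hypothetical per-switch state: which bridge each port belongs to. */
struct demo_switch {
	int bridge_of[DEMO_MAX_PORTS];
};

/* Bitmask of the ports that are currently members of the given bridge. */
static uint32_t demo_bridge_port_mask(const struct demo_switch *sw, int bridge)
{
	uint32_t mask = 0;
	int p;

	for (p = 0; p < DEMO_MAX_PORTS; p++)
		if (sw->bridge_of[p] == bridge)
			mask |= 1u << p;
	return mask;
}

/* Join: record membership, return the mask of the bridge being joined. */
static uint32_t demo_port_join_bridge(struct demo_switch *sw, int port,
				      int bridge)
{
	assert(port >= 0 && port < DEMO_MAX_PORTS);
	sw->bridge_of[port] = bridge;
	return demo_bridge_port_mask(sw, bridge);
}

/* Leave: drop membership, return the mask of the remaining members. */
static uint32_t demo_port_leave_bridge(struct demo_switch *sw, int port)
{
	int bridge;

	assert(port >= 0 && port < DEMO_MAX_PORTS);
	bridge = sw->bridge_of[port];
	if (bridge == DEMO_NO_BRIDGE)
		return 0;
	sw->bridge_of[port] = DEMO_NO_BRIDGE;
	return demo_bridge_port_mask(sw, bridge);
}

static bool demo_state_can_learn(int state)
{
	return state == BR_STATE_LEARNING || state == BR_STATE_FORWARDING;
}

/* Assumed rule: age out entries when a port stops being able to learn. */
static bool demo_stp_needs_ageing(int old_state, int new_state)
{
	return demo_state_can_learn(old_state) &&
	       !demo_state_can_learn(new_state);
}
```

The returned masks are what a driver would typically program into a per-port
forwarding map, so that member ports can only reach each other.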

Bridge VLAN filtering
---------------------

- port_pvid_get: bridge layer function invoked when a Port-based VLAN ID is
queried for the given switch port

- port_pvid_set: bridge layer function invoked when a Port-based VLAN ID needs
to be configured on the given switch port

- port_vlan_add: bridge layer function invoked when a VLAN is configured
(tagged or untagged) for the given switch port

- port_vlan_del: bridge layer function invoked when a VLAN is removed from the
given switch port

- vlan_getnext: bridge layer function invoked to query the next configured VLAN
in the switch, i.e. returns the bitmaps of members and untagged ports

- port_fdb_add: bridge layer function invoked when the bridge wants to install a
Forwarding Database entry. The switch hardware should be programmed with the
specified address in the specified VLAN ID, in the forwarding database
associated with this VLAN ID

Note: VLAN ID 0 corresponds to the port private database, which, in the context
of DSA, would be its port-based VLAN, used by the associated bridge device.

- port_fdb_del: bridge layer function invoked when the bridge wants to remove a
Forwarding Database entry. The switch hardware should be programmed to delete
the specified MAC address from the specified VLAN ID if it was mapped into
the forwarding database of this port
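
The iteration style implied by vlan_getnext can be sketched as follows. This
is an illustration under stated assumptions, not the actual callback: the
table layout, the demo_* names, the convention of starting from VLAN ID 0, and
the use of -ENOENT to signal the end of the walk are all hypothetical choices
made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_VLAN_TABLE_SIZE 4

/* Hypothetical VLAN table entry: membership and untagged port bitmaps. */
struct demo_vlan {
	uint16_t vid;
	uint32_t members;
	uint32_t untagged;
	bool valid;
};

/*
 * Find the configured VLAN with the smallest ID strictly greater than
 * *vid. On success, update *vid and return both bitmaps; return -ENOENT
 * when no further VLAN is configured.
 */
static int demo_vlan_getnext(const struct demo_vlan *table, uint16_t *vid,
			     uint32_t *members, uint32_t *untagged)
{
	const struct demo_vlan *best = NULL;
	int i;

	for (i = 0; i < DEMO_VLAN_TABLE_SIZE; i++) {
		const struct demo_vlan *v = &table[i];

		if (!v->valid || v->vid <= *vid)
			continue;
		if (!best || v->vid < best->vid)
			best = v;
	}
	if (!best)
		return -ENOENT;

	*vid = best->vid;
	*members = best->members;
	*untagged = best->untagged;
	return 0;
}

/* Walk the whole table in ascending VLAN ID order and count entries. */
static int demo_count_vlans(const struct demo_vlan *table)
{
	uint16_t vid = 0;
	uint32_t members, untagged;
	int count = 0;

	while (demo_vlan_getnext(table, &vid, &members, &untagged) == 0)
		count++;
	return count;
}
```

Because each call resumes from the last VLAN ID returned, the caller needs no
knowledge of how the switch stores its VLAN table.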

TODO
====

The platform device problem
---------------------------

DSA is currently implemented as a platform device driver, which is far from
ideal, as was discussed in this thread:

http://permalink.gmane.org/gmane.linux.network/329848

This basically prevents the device driver model from being properly used and
applied, and prevents support for non-MDIO, non-MMIO Ethernet connected
switches.

Another problem with the platform device driver approach is that it prevents
switch drivers from being built as modules, due to a circular dependency
illustrated here:

http://comments.gmane.org/gmane.linux.network/345803

Attempts at reworking this have been made here:

https://lwn.net/Articles/643149/

Making SWITCHDEV and DSA converge towards a unified codebase
------------------------------------------------------------

SWITCHDEV properly takes care of abstracting the networking stack with offload
capable hardware, but does not enforce a strict switch device driver model. On
the other hand, DSA enforces a fairly strict device driver model, and deals
with most of the switch specifics. At some point we should envision a merger
between these two subsystems and get the best of both worlds.

Other low-hanging fruit
-----------------------

- making the number of ports fully dynamic and not dependent on DSA_MAX_PORTS
- allowing more than one CPU/management interface:
http://comments.gmane.org/gmane.linux.network/365657
- porting more drivers from other vendors:
http://comments.gmane.org/gmane.linux.network/365510