net: add document for several snmp counters
add document for below counters:
TcpEstabResets
TcpAttemptFails
TcpOutRsts
TcpExtTCPSACKDiscard
TcpExtTCPDSACKIgnoredOld
TcpExtTCPDSACKIgnoredNoUndo
TcpExtTCPSackShifted
TcpExtTCPSackMerged
TcpExtTCPSackShiftFallback
TcpExtTCPWantZeroWindowAdv
TcpExtTCPToZeroWindowAdv
TcpExtTCPFromZeroWindowAdv
TcpExtDelayedACKs
TcpExtDelayedACKLocked
TcpExtDelayedACKLost
TcpExtTCPLossProbes
TcpExtTCPLossProbeRecovery

Signed-off-by: yupeng <yupeng0921@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
parent 26fc181e6c
commit a6c7c7aac2
@ -336,7 +336,26 @@ time client replies ACK, this socket will get another chance to move
|
|||||||
to the accept queue.
|
to the accept queue.
|
||||||
|
|
||||||
|
|
||||||
TCP Fast Open
|
* TcpEstabResets
|
||||||
|
Defined in `RFC1213 tcpEstabResets`_.
|
||||||
|
|
||||||
|
.. _RFC1213 tcpEstabResets: https://tools.ietf.org/html/rfc1213#page-48
|
||||||
|
|
||||||
|
* TcpAttemptFails
|
||||||
|
Defined in `RFC1213 tcpAttemptFails`_.
|
||||||
|
|
||||||
|
.. _RFC1213 tcpAttemptFails: https://tools.ietf.org/html/rfc1213#page-48
|
||||||
|
|
||||||
|
* TcpOutRsts
|
||||||
|
Defined in `RFC1213 tcpOutRsts`_. The RFC says this counter indicates
|
||||||
|
the 'segments sent containing the RST flag', but in the Linux kernel, this
|
||||||
|
counter indicates the segments the kernel tried to send. The sending
|
||||||
|
process might fail due to some errors (e.g. memory allocation failure).
|
||||||
|
|
||||||
|
.. _RFC1213 tcpOutRsts: https://tools.ietf.org/html/rfc1213#page-52
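The Tcp counters above are exported through /proc/net/snmp as pairs of lines: a header line with the counter names and a line with the values, both starting with the same prefix. The following is a minimal Python sketch of reading that layout. The helper name ``parse_snmp_text`` and the sample values are made up for illustration only.

```python
def parse_snmp_text(text):
    """Parse /proc/net/snmp style text into {"TcpEstabResets": 3, ...}."""
    counters = {}
    lines = [line for line in text.splitlines() if line.strip()]
    # Lines come in pairs: "Prefix: name1 name2" then "Prefix: val1 val2".
    for header, values in zip(lines[0::2], lines[1::2]):
        prefix, names = header.split(":", 1)
        value_prefix, numbers = values.split(":", 1)
        if prefix != value_prefix:
            raise ValueError("header/value prefix mismatch")
        for name, number in zip(names.split(), numbers.split()):
            counters[prefix + name] = int(number)
    return counters


# Sample text with made-up values, not real measurements.
SAMPLE = """\
Tcp: AttemptFails EstabResets OutRsts
Tcp: 2 3 5
"""

if __name__ == "__main__":
    print(parse_snmp_text(SAMPLE))
```

On a live system the same function could be fed the content of /proc/net/snmp, e.g. ``parse_snmp_text(open("/proc/net/snmp").read())``.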
|
||||||
|
|
||||||
|
|
||||||
|
TCP Fast Path
|
||||||
============
|
============
|
||||||
When the kernel receives a TCP packet, it has two paths to handle the
|
When the kernel receives a TCP packet, it has two paths to handle the
|
||||||
packet, one is fast path, another is slow path. The comment in kernel
|
packet, one is fast path, another is slow path. The comment in kernel
|
||||||
@ -383,8 +402,6 @@ increase 1.
|
|||||||
|
|
||||||
TCP abort
|
TCP abort
|
||||||
========
|
========
|
||||||
|
|
||||||
|
|
||||||
* TcpExtTCPAbortOnData
|
* TcpExtTCPAbortOnData
|
||||||
It means TCP layer has data in flight, but need to close the
|
It means TCP layer has data in flight, but need to close the
|
||||||
connection. So TCP layer sends a RST to the other side, indicate the
|
connection. So TCP layer sends a RST to the other side, indicate the
|
||||||
@ -545,7 +562,6 @@ packet yet, the sender would know packet 4 is out of order. The TCP
|
|||||||
stack of kernel will increase TcpExtTCPSACKReorder for both of the
|
stack of kernel will increase TcpExtTCPSACKReorder for both of the
|
||||||
above scenarios.
|
above scenarios.
|
||||||
|
|
||||||
|
|
||||||
DSACK
|
DSACK
|
||||||
=====
|
=====
|
||||||
The DSACK is defined in `RFC2883`_. The receiver uses DSACK to report
|
The DSACK is defined in `RFC2883`_. The receiver uses DSACK to report
|
||||||
@ -566,13 +582,63 @@ The TCP stack receives an out of order duplicate packet, so it sends a
|
|||||||
DSACK to the sender.
|
DSACK to the sender.
|
||||||
|
|
||||||
* TcpExtTCPDSACKRecv
|
* TcpExtTCPDSACKRecv
|
||||||
The TCP stack receives a DSACK, which indicate an acknowledged
|
The TCP stack receives a DSACK, which indicates an acknowledged
|
||||||
duplicate packet is received.
|
duplicate packet is received.
|
||||||
|
|
||||||
* TcpExtTCPDSACKOfoRecv
|
* TcpExtTCPDSACKOfoRecv
|
||||||
The TCP stack receives a DSACK, which indicate an out of order
|
The TCP stack receives a DSACK, which indicate an out of order
|
||||||
duplicate packet is received.
|
duplicate packet is received.
|
||||||
|
|
||||||
|
invalid SACK and DSACK
|
||||||
|
======================
|
||||||
|
When a SACK (or DSACK) block is invalid, a corresponding counter would
|
||||||
|
be updated. The validation method is based on the start/end sequence
|
||||||
|
number of the SACK block. For more details, please refer to the comment
|
||||||
|
of the function tcp_is_sackblock_valid in the kernel source code. A
|
||||||
|
SACK option can have up to 4 blocks, which are checked
|
||||||
|
individually. E.g., if 3 blocks of a SACK option are invalid, the
|
||||||
|
corresponding counter would be updated 3 times. The comment of the
|
||||||
|
`Add counters for discarded SACK blocks`_ patch has additional
|
||||||
|
explanation:
|
||||||
|
|
||||||
|
.. _Add counters for discarded SACK blocks: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=18f02545a9a16c9a89778b91a162ad16d510bb32
|
||||||
|
|
||||||
|
* TcpExtTCPSACKDiscard
|
||||||
|
This counter indicates how many SACK blocks are invalid. If the invalid
|
||||||
|
SACK block is caused by ACK reordering, the TCP stack will only ignore
|
||||||
|
it and won't update this counter.
|
||||||
|
|
||||||
|
* TcpExtTCPDSACKIgnoredOld and TcpExtTCPDSACKIgnoredNoUndo
|
||||||
|
When a DSACK block is invalid, one of these two counters would be
|
||||||
|
updated. Which counter will be updated depends on the undo_marker flag
|
||||||
|
of the TCP socket. If the undo_marker is not set, the TCP stack isn't
|
||||||
|
likely to re-transmit any packets, so if we still receive an invalid
|
||||||
|
DSACK block, the reason might be that the packet is duplicated in the
|
||||||
|
middle of the network. In such a scenario, TcpExtTCPDSACKIgnoredNoUndo
|
||||||
|
will be updated. If the undo_marker is set, TcpExtTCPDSACKIgnoredOld
|
||||||
|
will be updated. As implied in its name, it might be an old packet.
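The counter selection described above can be sketched in Python as follows. This is an illustration of the rules in the text, not the kernel code: whether a block is valid is passed in as a flag instead of re-implementing tcp_is_sackblock_valid, and the names ``counter_for_invalid_block``, ``account_invalid_blocks`` and ``silently_ignored`` are hypothetical.

```python
from collections import Counter


def counter_for_invalid_block(is_dsack, undo_marker_set, silently_ignored=False):
    """Return the counter to update for one invalid block, or None."""
    if is_dsack:
        if not undo_marker_set:
            return "TcpExtTCPDSACKIgnoredNoUndo"
        return "TcpExtTCPDSACKIgnoredOld"
    if silently_ignored:
        # The block is only ignored, no counter is updated.
        return None
    return "TcpExtTCPSACKDiscard"


def account_invalid_blocks(blocks, undo_marker_set):
    """blocks: list of (is_dsack, is_valid, silently_ignored) tuples."""
    stats = Counter()
    # Every block of the option is checked individually.
    for is_dsack, is_valid, silently_ignored in blocks:
        if is_valid:
            continue
        name = counter_for_invalid_block(is_dsack, undo_marker_set,
                                         silently_ignored)
        if name is not None:
            stats[name] += 1
    return stats
```

With three invalid SACK blocks in one option, ``account_invalid_blocks`` counts TcpExtTCPSACKDiscard three times, which matches the per-block accounting described above.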
|
||||||
|
|
||||||
|
SACK shift
|
||||||
|
==========
|
||||||
|
The Linux networking stack stores data in the sk_buff struct (skb for
|
||||||
|
short). If a SACK block spans multiple skbs, the TCP stack will try
|
||||||
|
to re-arrange data in these skbs. E.g. if a SACK block acknowledges seq
|
||||||
|
10 to 15, skb1 has seq 10 to 13, skb2 has seq 14 to 20. The seq 14 and
|
||||||
|
15 in skb2 would be moved to skb1. This operation is 'shift'. If a
|
||||||
|
SACK block acknowledges seq 10 to 20, skb1 has seq 10 to 13, skb2 has
|
||||||
|
seq 14 to 20. All data in skb2 will be moved to skb1, and skb2 will be
|
||||||
|
discarded. This operation is 'merge'.
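The two examples above can be reproduced with a small Python sketch. It models an skb as an inclusive (first_seq, last_seq) tuple and only covers the layout used in the text: skb1 is fully SACKed and skb2 directly follows it. The function name ``shift_or_merge`` is hypothetical and the real kernel logic has many more conditions.

```python
def shift_or_merge(skb1, skb2, sack):
    """skb1, skb2, sack: inclusive (first_seq, last_seq) tuples.

    Assumes the SACK block starts at or before skb1 and that skb2
    directly follows skb1.
    """
    sack_first, sack_last = sack
    first1, last1 = skb1
    first2, last2 = skb2
    if not (sack_first <= first1 and last1 + 1 == first2):
        raise ValueError("layout not covered by this sketch")
    if sack_last >= last2:
        # Everything in skb2 is SACKed: move it all, skb2 goes away.
        return (first1, last2), None, "merge"
    if sack_last >= first2:
        # Only the head of skb2 is SACKed: move that part to skb1.
        return (first1, sack_last), (sack_last + 1, last2), "shift"
    # The SACK block does not reach into skb2, nothing to move.
    return skb1, skb2, "none"


if __name__ == "__main__":
    print(shift_or_merge((10, 13), (14, 20), (10, 15)))
    print(shift_or_merge((10, 13), (14, 20), (10, 20)))
```

In the first call seq 14 and 15 move to skb1 and skb2 keeps seq 16 to 20. In the second call skb1 ends up with seq 10 to 20 and skb2 is gone.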
|
||||||
|
|
||||||
|
* TcpExtTCPSackShifted
|
||||||
|
A skb is shifted
|
||||||
|
|
||||||
|
* TcpExtTCPSackMerged
|
||||||
|
A skb is merged
|
||||||
|
|
||||||
|
* TcpExtTCPSackShiftFallback
|
||||||
|
A skb should be shifted or merged, but the TCP stack doesn't do it for
|
||||||
|
some reason.
|
||||||
|
|
||||||
TCP out of order
|
TCP out of order
|
||||||
===============
|
===============
|
||||||
* TcpExtTCPOFOQueue
|
* TcpExtTCPOFOQueue
|
||||||
@ -662,6 +728,60 @@ unacknowledged number (more strict than `RFC 5961 section 5.2`_).
|
|||||||
.. _RFC 5961 section 4.2: https://tools.ietf.org/html/rfc5961#page-9
|
.. _RFC 5961 section 4.2: https://tools.ietf.org/html/rfc5961#page-9
|
||||||
.. _RFC 5961 section 5.2: https://tools.ietf.org/html/rfc5961#page-11
|
.. _RFC 5961 section 5.2: https://tools.ietf.org/html/rfc5961#page-11
|
||||||
|
|
||||||
|
TCP receive window
|
||||||
|
==================
|
||||||
|
* TcpExtTCPWantZeroWindowAdv
|
||||||
|
Depending on current memory usage, the TCP stack tries to set receive
|
||||||
|
window to zero. But the receive window might still be a non-zero
|
||||||
|
value. For example, if the previous window size is 10, and the TCP
|
||||||
|
stack receives 3 bytes, the current window size would be 7 even if the
|
||||||
|
window size calculated by the memory usage is zero.
|
||||||
|
|
||||||
|
* TcpExtTCPToZeroWindowAdv
|
||||||
|
The TCP receive window is set to zero from a non-zero value.
|
||||||
|
|
||||||
|
* TcpExtTCPFromZeroWindowAdv
|
||||||
|
The TCP receive window is set to a non-zero value from zero.
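The interaction of the three receive window counters can be sketched in Python as follows. This is a simplified model that ignores window scaling and rounding. The function name ``select_window`` and its parameters are hypothetical: ``prev_window`` is the previously advertised window, ``received`` is the number of bytes received since then, and ``wanted_window`` is the window calculated from the memory usage.

```python
from collections import Counter


def select_window(prev_window, received, wanted_window, stats):
    """Return the window to advertise and update the counters in stats."""
    # What is still open from the previously advertised window.
    current_window = prev_window - received
    new_window = wanted_window
    if new_window < current_window:
        # The window that was already offered is not taken back.
        if new_window == 0:
            stats["TcpExtTCPWantZeroWindowAdv"] += 1
        new_window = current_window
    if new_window == 0 and prev_window != 0:
        stats["TcpExtTCPToZeroWindowAdv"] += 1
    elif new_window != 0 and prev_window == 0:
        stats["TcpExtTCPFromZeroWindowAdv"] += 1
    return new_window
```

With the numbers from the text, ``select_window(10, 3, 0, stats)`` returns 7 and only TcpExtTCPWantZeroWindowAdv is increased.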
|
||||||
|
|
||||||
|
|
||||||
|
Delayed ACK
|
||||||
|
===========
|
||||||
|
The TCP Delayed ACK is a technique which is used for reducing the
|
||||||
|
packet count in the network. For more details, please refer to the
|
||||||
|
`Delayed ACK wiki`_.
|
||||||
|
|
||||||
|
.. _Delayed ACK wiki: https://en.wikipedia.org/wiki/TCP_delayed_acknowledgment
|
||||||
|
|
||||||
|
* TcpExtDelayedACKs
|
||||||
|
A delayed ACK timer expires. The TCP stack will send a pure ACK packet
|
||||||
|
and exit the delayed ACK mode.
|
||||||
|
|
||||||
|
* TcpExtDelayedACKLocked
|
||||||
|
A delayed ACK timer expires, but the TCP stack can't send an ACK
|
||||||
|
immediately because the socket is locked by a userspace program. The
|
||||||
|
TCP stack will send a pure ACK later (after the userspace program
|
||||||
|
unlocks the socket). When the TCP stack sends the pure ACK later, the
|
||||||
|
TCP stack will also update TcpExtDelayedACKs and exit the delayed ACK
|
||||||
|
mode.
|
||||||
|
|
||||||
|
* TcpExtDelayedACKLost
|
||||||
|
It will be updated when the TCP stack receives a packet which has been
|
||||||
|
ACKed. A Delayed ACK loss might cause this issue, but it would also be
|
||||||
|
triggered by other reasons, such as a packet being duplicated in the
|
||||||
|
network.
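How TcpExtDelayedACKs and TcpExtDelayedACKLocked relate to each other can be sketched in Python as follows. The class ``DelayedAckSketch`` and its attributes are hypothetical and only model the behavior described above, they are not kernel data structures.

```python
from collections import Counter


class DelayedAckSketch:
    def __init__(self):
        self.stats = Counter()
        self.locked_by_user = False
        self.ack_deferred = False
        self.in_delayed_ack_mode = True

    def _send_pure_ack(self):
        # Sending the pure ACK updates the counter and leaves the mode.
        self.stats["TcpExtDelayedACKs"] += 1
        self.in_delayed_ack_mode = False

    def timer_expired(self):
        if self.locked_by_user:
            # Can't send now, remember to send after the unlock.
            self.stats["TcpExtDelayedACKLocked"] += 1
            self.ack_deferred = True
        else:
            self._send_pure_ack()

    def user_unlock(self):
        self.locked_by_user = False
        if self.ack_deferred:
            self.ack_deferred = False
            self._send_pure_ack()
```

If the timer expires while the socket is locked, TcpExtDelayedACKLocked is increased first and TcpExtDelayedACKs follows once ``user_unlock`` runs.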
|
||||||
|
|
||||||
|
Tail Loss Probe (TLP)
|
||||||
|
=====================
|
||||||
|
TLP is an algorithm which is used to detect TCP packet loss. For more
|
||||||
|
details, please refer to the `TLP paper`_.
|
||||||
|
|
||||||
|
.. _TLP paper: https://tools.ietf.org/html/draft-dukkipati-tcpm-tcp-loss-probe-01
|
||||||
|
|
||||||
|
* TcpExtTCPLossProbes
|
||||||
|
A TLP probe packet is sent.
|
||||||
|
|
||||||
|
* TcpExtTCPLossProbeRecovery
|
||||||
|
A packet loss is detected and recovered by TLP.
|
||||||
|
|
||||||
examples
|
examples
|
||||||
=======
|
=======
|
||||||
|