4549 Commits

Author SHA1 Message Date
Thomas Graf
be577ddc2b [PKT_SCHED] qdisc: Use rtnl registration interface
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:27:09 -07:00
Thomas Graf
63f3444fb9 [IPv4]: Use rtnl registration interface
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:27:08 -07:00
Thomas Graf
9d9e6a5819 [NET] rules: Use rtnl registration interface
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:27:07 -07:00
Thomas Graf
c8822a4e00 [NEIGH]: Use rtnl registration interface
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:27:06 -07:00
Thomas Graf
340d17fc9d [NET] link: Use rtnl registration interface
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:27:05 -07:00
Thomas Graf
e284986385 [RTNL]: Message handler registration interface
This patch adds a new interface to register rtnetlink message
handlers, replacing the exported rtnl_links[] array, which
required many message handlers to be exported unnecessarily.

Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:27:04 -07:00
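The idea behind a registration interface can be sketched in userspace C. This is only a minimal model, not the kernel code: every name here (model_register, model_dispatch, model_handlers, the table sizes) is hypothetical. It shows how the handler table can stay private to one file while subsystems keep their handlers static and hand over only a function pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: names and table sizes are illustrative only. */
enum { MODEL_NPROTO = 4, MODEL_NMSG = 8 };

typedef int (*model_doit_fn)(int msgtype);

/* The table is file-local, so nothing has to be exported to reach it. */
static model_doit_fn model_handlers[MODEL_NPROTO][MODEL_NMSG];

static int model_index_ok(int proto, int msgtype)
{
	return proto >= 0 && proto < MODEL_NPROTO &&
	       msgtype >= 0 && msgtype < MODEL_NMSG;
}

/* Register a handler; returns 0 on success, -1 on a bad index. */
int model_register(int proto, int msgtype, model_doit_fn doit)
{
	if (!model_index_ok(proto, msgtype))
		return -1;
	assert(doit != NULL);
	model_handlers[proto][msgtype] = doit;
	return 0;
}

/* Dispatch a message; returns -1 when no handler is registered. */
int model_dispatch(int proto, int msgtype)
{
	if (!model_index_ok(proto, msgtype) ||
	    model_handlers[proto][msgtype] == NULL)
		return -1;
	return model_handlers[proto][msgtype](msgtype);
}

/* A subsystem's handler can remain static: only its address is shared. */
static int model_link_get(int msgtype)
{
	return msgtype + 100;
}

int model_link_init(void)
{
	return model_register(0, 2, model_link_get);
}
```

With an exported array, each subsystem's handler symbol has to be visible to whoever fills the array; with a register call, the subsystem fills its own slot at init time.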
Gerrit Renker
30833ffead [CCID3]: Use initial RTT sample from SYN exchange
The patch follows the following recommendation made in an erratum to RFC 4342:

  "Senders MAY additionally make use of other available RTT measurements,
   including those from the initial Request-Response packet exchange."

It implements larger initial windows with regard to this initial RTT measurement,
using the mechanism suggested in draft-ietf-dccp-rfc3448bis, section 4.2.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:27:04 -07:00
Gerrit Renker
89560b53b9 [DCCP]: Sample RTT from SYN exchange
Function:
2007-04-25 22:27:02 -07:00
Gerrit Renker
7dfee1a9c0 [CCID3]: Use function for RTT sampling
This replaces the existing occurrences of RTT sampling with
the use of the new function dccp_sample_rtt.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:27:01 -07:00
Gerrit Renker
4712a792ee [DCCP]: Provide function for RTT sampling
A recurring problem, in particular in the CCID code, is that RTT samples
from packets with timestamp echo and elapsed time options need to be taken.

This service is provided via a new function dccp_sample_rtt in this patch.
Furthermore, to protect against `insane' RTT samples, the sampled value
is bounded between 100 microseconds and 4 seconds - for which u32 is sufficient.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:27:00 -07:00
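The bounding step described above can be sketched as follows. This is an assumption-laden illustration rather than the code of dccp_sample_rtt: the function name sample_rtt_us, its two parameters, and the constant names are hypothetical. Only the bounds (100 microseconds and 4 seconds) come from the commit message.

```c
#include <assert.h>
#include <stdint.h>

#define RTT_SAMPLE_MIN_US 100       /* lower bound: 100 microseconds */
#define RTT_SAMPLE_MAX_US 4000000   /* upper bound: 4 seconds */

/*
 * Hypothetical helper. delta_us is the time since the echoed timestamp
 * was sent; elapsed_us is the hold time reported by the peer. The raw
 * difference is clamped so that the result always fits in 32 bits.
 */
uint32_t sample_rtt_us(int64_t delta_us, int64_t elapsed_us)
{
	int64_t rtt;

	assert(elapsed_us >= 0);
	rtt = delta_us - elapsed_us;
	if (rtt < RTT_SAMPLE_MIN_US)
		return RTT_SAMPLE_MIN_US;
	if (rtt > RTT_SAMPLE_MAX_US)
		return RTT_SAMPLE_MAX_US;
	return (uint32_t)rtt;
}
```

Because the upper bound is 4,000,000 microseconds, the clamped value is far below 2^32, which is why a u32 is sufficient.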
Gerrit Renker
0c150efb28 [CCID3]: Handle Idle and Application-Limited periods
This updates the code with regard to handling idle and application-limited
periods as specified in [RFC 4342, 5.1].

Background:
2007-04-25 22:26:59 -07:00
Gerrit Renker
a21f9f96cd [CCID3]: Wrap computation of RFC3390-initial rate into separate function
The CCID 3 and TFRC specs (RFC 4342, RFC 3448, draft-3448bis) make frequent
reference to the computation of the RFC-3390 initial sending rate:

  1. Initial sending rate when RTT is known (RFC 4342, p. 6)
  2. Response to Idle/Application-Limited periods (RFC 4342, 5.1)

This warrants putting the code into its own function, for later code reuse.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:26:58 -07:00
Gerrit Renker
1761f7d7fe [CCID3]: Remove build warnings for 64bit
This clears the following sparc64 build warnings:
 1) warning: format "%ld" expects type "long int", but argument 3 has type "suseconds_t"
 2) warning: format "%llu" expects type "long long unsigned int", but argument 3 has type "__u64"
Fixed by casting to unsigned. This is argued to be safe, since the quantities,
after de-scaling (factor 2^6), all fit in u32.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:26:57 -07:00
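A small sketch of the cast-after-descaling approach, under stated assumptions: format_rate and its parameters are hypothetical, and the only facts taken from the commit message are the scaling factor of 2^6 and the cast to unsigned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: x_scaled holds a rate multiplied by 2^6.
 * After shifting the scaling away, the value is assumed to fit in
 * 32 bits, so it is cast to unsigned and printed with plain %u,
 * which has the same meaning on 32-bit and 64-bit targets.
 */
int format_rate(char *buf, size_t len, uint64_t x_scaled)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "X=%u", (unsigned)(x_scaled >> 6));
}
```

The cast silently truncates if the de-scaled value ever exceeds 32 bits, so this is only safe where that bound is known to hold.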
Gerrit Renker
fddc2feb94 [CCID3]: More to see in dccp_probe
This adds a few more fields of interest to /proc/net/dccpprobe, the following output ensues:

1           2          3           4     5  6     7   8        9        10   11
sec.usec   src:sport   dst:dport   size  s  rtt   p   X_calc   X_recv   X    t_ipi

Also made the formatting consistent.

Scripts that go with this can be downloaded from http://139.133.210.30/users/gerrit/dccp/dccp_probe/

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:26:56 -07:00
Gerrit Renker
6626e3628f [DCCP]: More debug information for dccp_wait_for_ccid
This adds more detail in the wait_for_ccid packet scheduling loop.
In particular, it informs about (i) when delay is used and (ii) why
a packet is discarded.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:26:54 -07:00
Gerrit Renker
ac12b0c495 [DCCP]: Always use debug-toggle parameters
Currently debugging output (when configured) is automatically enabled when
DCCP modules are compiled into the kernel rather than built as loadable modules.
This is not necessary, since the module parameters in this case become kernel
commandline parameters, e.g. DCCP or CCID3 debug output can be enabled for a
static build by appending the following at the boot prompt:

	dccp.dccp_debug=1 	dccp_ccid3.ccid3_debug=1

This patch therefore does away with the more complicated way of always enabling
debug output for static builds.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:26:53 -07:00
Gerrit Renker
1266adee12 [CCID3]: Remove race condition and update t_ipi when `s' changes
This:

 1. removes a race condition in the access to the scheduled send time t_nom which
    results from allowing asynchronous r/w access to t_nom without locks;

 2. updates the inter-packet interval t_ipi = s/X when `s' changes, following a
    suggestion by Ian McDonald.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:26:52 -07:00
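The second point, recomputing t_ipi = s/X whenever `s' changes, can be sketched like this. The struct and function names are hypothetical, the units (bytes, bytes per second, microseconds) are an assumption, and the locking that the first point is about is deliberately not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sender state: s in bytes, x in bytes per second. */
struct tx_state {
	uint32_t s;      /* packet size */
	uint64_t x;      /* allowed sending rate */
	uint32_t t_ipi;  /* inter-packet interval in microseconds */
};

/* t_ipi = s / X, converted to microseconds. */
uint32_t t_ipi_us(uint32_t s, uint64_t x)
{
	assert(x > 0);
	return (uint32_t)(((uint64_t)s * 1000000u) / x);
}

/* Changing s must also refresh t_ipi, or the pacing goes stale. */
void tx_update_s(struct tx_state *tx, uint32_t new_s)
{
	tx->s = new_s;
	tx->t_ipi = t_ipi_us(tx->s, tx->x);
}
```

For example, at X = 100000 bytes per second a 1000-byte packet gives an interval of 10000 microseconds; halving s to 500 bytes halves the interval to 5000.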
Ian McDonald
8699be7d24 [CCID3]: More verbose debugging
This adds a few debugging statements to ccid3.c

Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:26:51 -07:00
Ian McDonald
551dc5f7a1 [CCID3]: Fix use of invalid loss intervals
This fixes a bug which uses an invalid comparison.
The bug resulted in the use of invalid loss intervals.

Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Acked-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:26:50 -07:00
Gerrit Renker
371fe7779c [CCID3]: Use MSS for larger initial windows
This improves the slow-start phase by using the MSS
(as suggested in RFC 4342, sec. 5) instead of the packet size s.
Also figured out that __u32 is sufficient.

After applying, I got the following in the logs:

  ccid3_hc_tx_packet_recv: client(f7421700), s=6, MSS=1424, w_init=4380, R_sample=176us, X=24886363

Had the previous variant been used, w_init would have been as low as 24.

Committer note: removed unneeded cast to unsigned long long that was
                causing a compiler warning on 64bit architectures.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:26:49 -07:00
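The numbers in the log line above can be reproduced with the RFC 3390 initial window formula, min(4*MSS, max(2*MSS, 4380)). The sketch below is not the kernel function; the names rfc3390_w_init and rfc3390_rate are hypothetical, and the rate is computed in plain bytes per second without any internal scaling.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t min_u32(uint32_t a, uint32_t b) { return a < b ? a : b; }
static uint32_t max_u32(uint32_t a, uint32_t b) { return a > b ? a : b; }

/* RFC 3390 initial window in bytes: min(4*MSS, max(2*MSS, 4380)). */
uint32_t rfc3390_w_init(uint32_t mss)
{
	return min_u32(4 * mss, max_u32(2 * mss, 4380));
}

/* Initial sending rate in bytes per second for an RTT in microseconds. */
uint64_t rfc3390_rate(uint32_t mss, uint32_t rtt_us)
{
	assert(rtt_us > 0);
	return (uint64_t)rfc3390_w_init(mss) * 1000000u / rtt_us;
}
```

With MSS=1424 the window is 4380 bytes, and dividing by R_sample=176us gives 24886363 bytes per second, matching the logged X. Feeding the packet size s=6 into the same formula yields only 24 bytes, which is the "previous variant" value mentioned in the message.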
Gerrit Renker
9bf17475eb [CCID3]: Re-order CCID 3 source file
No code change at all.
This splits ccid3.c into a RX and a TX section, so that the file has an
organisation similar to the other ones (e.g. packet_history.{h,c}).

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:26:48 -07:00
Gerrit Renker
353b13e10a [CCID3]: Remove redundant `len' test
Since CCID3 avoids sending 0-byte data packets (cf. ccid3_hc_tx_send_packet),
testing for zero-payload length, as performed by ccid3_hc_tx_update_s, is
redundant - hence removed by this patch.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:26:47 -07:00
Gerrit Renker
8d13bf9a0b [DCCP]: Remove ambiguity in the way before48 is used
This removes two ambiguities in employing the new definition of before48,
following the analysis on http://www.mail-archive.com/dccp@vger.kernel.org/msg01295.html

 (1) Updating GSR when P.seqno >= S.SWL
     With the old definition we did not update when P.seqno and S.SWL are 2^47 apart. To
     ensure the same behaviour as with the old definition, this is replaced with the
     equivalent condition dccp_delta_seqno(S.SWL, P.seqno) >= 0

 (2) Sending SYNC when P.seqno >= S.OSR
     Here it is debatable whether the new definition causes an ambiguity: the case is
     similar to (1); and to have consistency with the case (1), we use the equivalent
     condition dccp_delta_seqno(S.OSR, P.seqno) >= 0

Detailed Justification
2007-04-25 22:26:46 -07:00
Gerrit Renker
b16be51b5e [DCCP]: Fix for follows48
The follows48 relation identifies whether 48-bit sequence number
x is the direct successor of y. Currently, it does not handle cases
of the following type correctly:

	follows48(0x(prefix)10000LL, 0x(prefix)0FFFFLL)

where prefix is an arbitrary hex sequence of up to 7 digits.

This is fixed by reusing the new dccp_delta_seqno function.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:26:45 -07:00
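One way to express the successor test through a signed 48-bit distance is sketched below. It is an illustration under assumptions, not the patch itself: seq48_delta and seq48_follows are hypothetical stand-ins for dccp_delta_seqno and follows48, and the argument order of the delta function is chosen here for readability.

```c
#include <assert.h>
#include <stdint.h>

#define SEQ48_MASK 0xFFFFFFFFFFFFULL   /* 2^48 - 1 */
#define SEQ48_HALF 0x800000000000ULL   /* 2^47 */

/* Signed distance from a to b, computed modulo 2^48. */
int64_t seq48_delta(uint64_t a, uint64_t b)
{
	uint64_t d;

	assert(a <= SEQ48_MASK && b <= SEQ48_MASK);
	d = (b - a) & SEQ48_MASK;
	if (d < SEQ48_HALF)
		return (int64_t)d;
	return (int64_t)d - (int64_t)(SEQ48_MASK + 1);
}

/* x directly follows y when the distance from y to x is exactly 1. */
int seq48_follows(uint64_t x, uint64_t y)
{
	return seq48_delta(y, x) == 1;
}
```

Because the comparison works on the full 48-bit difference, a carry across the low 16 bits (0x0FFFF to 0x10000) and the wrap from 2^48 - 1 back to 0 are both treated as ordinary successors.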
Gerrit Renker
d52de17b8c [DCCP]: Make `before' relation unambiguous
Problem:
2007-04-25 22:26:44 -07:00
Gerrit Renker
0aec51c869 [DCCP]: Make dccp_delta_seqno return signed numbers
Problem:
2007-04-25 22:26:43 -07:00
Gerrit Renker
6b811d43f6 [DCCP]: 48-bit sequence number arithmetic
This patch
 * organizes the sequence arithmetic functions into one corner of dccp.h
 * performs a small modification of dccp_set_seqno to make it more widely reusable
   (now it is safe to use any number, since it performs modulo-2^48 assignment)
 * adds functions and generic macros for 48-bit sequence arithmetic:
	--48-bit complement
	--modulo-2^48 addition and modulo-2^48 subtraction
	--dccp_inc_seqno now a special case of add48
Constants renamed following a suggestion by Arnaldo.

Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:26:42 -07:00
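The arithmetic listed above can be sketched in a few lines. The function names are hypothetical (the patch itself mentions macros and add48, whose exact definitions are not shown here); the sketch only assumes that sequence numbers live in the low 48 bits of a 64-bit integer.

```c
#include <assert.h>
#include <stdint.h>

#define SEQ48_MOD  (1ULL << 48)
#define SEQ48_MASK (SEQ48_MOD - 1)

/* 48-bit complement: the value c with (x + c) mod 2^48 == 0. */
uint64_t seq48_complement(uint64_t x)
{
	return (SEQ48_MOD - (x & SEQ48_MASK)) & SEQ48_MASK;
}

/* Addition modulo 2^48; safe for any 64-bit inputs. */
uint64_t seq48_add(uint64_t a, uint64_t b)
{
	uint64_t sum = (a + b) & SEQ48_MASK;

	assert(sum <= SEQ48_MASK);
	return sum;
}

/* Subtraction modulo 2^48, expressed as adding the complement. */
uint64_t seq48_sub(uint64_t a, uint64_t b)
{
	return seq48_add(a, seq48_complement(b));
}

/* Incrementing is just the special case of adding 1. */
uint64_t seq48_inc(uint64_t x)
{
	return seq48_add(x, 1);
}
```

Masking after a 64-bit addition is enough because 2^48 divides 2^64, so any wrap of the wider type does not disturb the result modulo 2^48.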
Arnaldo Carvalho de Melo
6b88dd966b [SK_BUFF] ipv6: Use skb_network_offset in some more places
So that we reduce the number of direct accesses to skb->data.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2007-04-25 22:26:38 -07:00
Arnaldo Carvalho de Melo
dc5fc579b9 [NETLINK]: Use nlmsg_trim() where appropriate
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:26:37 -07:00
Arnaldo Carvalho de Melo
897933bcdf [SK_BUFF]: Remove skb_add_mtu() leftovers
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2007-04-25 22:26:35 -07:00
Arnaldo Carvalho de Melo
b529ccf279 [NETLINK]: Introduce nlmsg_hdr() helper
For the common "(struct nlmsghdr *)skb->data" sequence, so that we reduce the
number of direct accesses to skb->data and for consistency with all the other
cast skb member helpers.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:26:34 -07:00
Robert Olsson
965ffea43d [IPV4]: fib_trie root node settings
The threshold for the root node can be set more aggressively to get
better tree compression. The new setting makes the root grow
from 16 to 19 bits and substantially improves the average depth,
measured with the current table of 214393 prefixes.

But really the dynamic resize needs more investigation,
both in terms of convergence and performance, and maybe it should
be possible to change...

Maybe just for the brave to start with or we may have to back
this out.
2007-04-25 22:26:32 -07:00
Robert Olsson
05eee48c5a [IPV4]: fib_trie resize break
The patch below adds break condition for the resize operations. If
we don't achieve the desired fill factor a warning is printed. Trie
should still be operational but new thresholds should be considered.

Signed-off-by: Robert Olsson <robert.olsson@its.uu.se>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:26:32 -07:00
Arnaldo Carvalho de Melo
ca0605a7c8 [SK_BUFF]: Adjust the zeroing up to tail in __alloc_skb too
I did it just in alloc_skb_from_cache, forgot __alloc_skb, fixed now.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:26:31 -07:00
Arnaldo Carvalho de Melo
4305b54135 [SK_BUFF]: Convert skb->end to sk_buff_data_t
Now to convert the last one, skb->data, that will allow many simplifications
and removal of some of the offset helpers.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:26:29 -07:00
Arnaldo Carvalho de Melo
27a884dc3c [SK_BUFF]: Convert skb->tail to sk_buff_data_t
So that it is also an offset from skb->head, reduces its size from 8 to 4 bytes
on 64bit architectures, allowing us to combine the 4 bytes hole left by the
layer headers conversion, reducing struct sk_buff size to 256 bytes, i.e. 4
64byte cachelines, and since the sk_buff slab cache is SLAB_HWCACHE_ALIGN...
:-)

Many calculations that previously required that skb->{transport,network,
mac}_header be first converted to a pointer now can be done directly, being
meaningful as offsets or pointers.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:26:28 -07:00
David S. Miller
be8bd86321 [VLAN] vlan_dev: Use skb_reset_network_header().
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:26:26 -07:00
Samuel Ortiz
c7630a4b93 [IrDA]: irda lockdep annotation
Rmmoding irda triggers a lockdep false positive.

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Samuel Ortiz <samuel@sortiz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:26:23 -07:00
Arnaldo Carvalho de Melo
2e07fa9cd3 [SK_BUFF]: Use offsets for skb->{mac,network,transport}_header on 64bit architectures
With this we save 8 bytes per network packet, leaving a 4-byte hole to be used
in further shrinking work, likely with the offsetization of other pointers
such as ->{data,tail,end}. The cost is some extra adds, minimized by the
usual practice of setting skb->{mac,nh,h}.raw to a local variable that is then
accessed multiple times in each function. Most handling of such headers is no
more expensive than before: setting one of these headers to another (transport
to network, etc.), subtracting from or adding to it, comparing them, etc.

Now we have this layout for sk_buff on a x86_64 machine:

[acme@mica net-2.6.22]$ pahole vmlinux sk_buff
struct sk_buff {
	struct sk_buff *       next;             /*   0   8 */
	struct sk_buff *       prev;             /*   8   8 */
	struct rb_node         rb;               /*  16  24 */
	struct sock *          sk;               /*  40   8 */
	ktime_t                tstamp;           /*  48   8 */
	struct net_device *    dev;              /*  56   8 */
	/* --- cacheline 1 boundary (64 bytes) --- */
	struct net_device *    input_dev;        /*  64   8 */
	sk_buff_data_t         transport_header; /*  72   4 */
	sk_buff_data_t         network_header;   /*  76   4 */
	sk_buff_data_t         mac_header;       /*  80   4 */

	/* XXX 4 bytes hole, try to pack */

	struct dst_entry *     dst;              /*  88   8 */
	struct sec_path *      sp;               /*  96   8 */
	char                   cb[48];           /* 104  48 */
	/* cacheline 2 boundary (128 bytes) was 24 bytes ago*/
	unsigned int           len;              /* 152   4 */
	unsigned int           data_len;         /* 156   4 */
	unsigned int           mac_len;          /* 160   4 */
	union {
		__wsum         csum;             /*       4 */
		__u32          csum_offset;      /*       4 */
	};                                       /* 164   4 */
	__u32                  priority;         /* 168   4 */
	__u8                   local_df:1;       /* 172   1 */
	__u8                   cloned:1;         /* 172   1 */
	__u8                   ip_summed:2;      /* 172   1 */
	__u8                   nohdr:1;          /* 172   1 */
	__u8                   nfctinfo:3;       /* 172   1 */
	__u8                   pkt_type:3;       /* 173   1 */
	__u8                   fclone:2;         /* 173   1 */
	__u8                   ipvs_property:1;  /* 173   1 */

	/* XXX 2 bits hole, try to pack */

	__be16                 protocol;         /* 174   2 */
	void    (*destructor)(struct sk_buff *); /* 176   8 */
	struct nf_conntrack *  nfct;             /* 184   8 */
	/* --- cacheline 3 boundary (192 bytes) --- */
	struct sk_buff *       nfct_reasm;       /* 192   8 */
	struct nf_bridge_info *nf_bridge;        /* 200   8 */
	__u16                  tc_index;         /* 208   2 */
	__u16                  tc_verd;          /* 210   2 */
	dma_cookie_t           dma_cookie;       /* 212   4 */
	__u32                  secmark;          /* 216   4 */
	__u32                  mark;             /* 220   4 */
	unsigned int           truesize;         /* 224   4 */
	atomic_t               users;            /* 228   4 */
	unsigned char *        head;             /* 232   8 */
	unsigned char *        data;             /* 240   8 */
	unsigned char *        tail;             /* 248   8 */
	/* --- cacheline 4 boundary (256 bytes) --- */
	unsigned char *        end;              /* 256   8 */
}; /* size: 264, cachelines: 5 */
   /* sum members: 260, holes: 1, sum holes: 4 */
   /* bit holes: 1, sum bit holes: 2 bits */
   /* last cacheline: 8 bytes */

On 32 bits nothing changes, and pointers continue to be used with the compiler
turning all this abstraction layer into dust. But there are some sk_buff
validation tricks that are now possible, humm... :-)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:26:21 -07:00
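The trade described in this commit, storing 32-bit offsets from the buffer start instead of full pointers, can be modelled in a few lines of userspace C. Everything named here (buf_off_t, struct buf_ptrs, struct buf_offs and the helpers) is hypothetical and far simpler than sk_buff; it only illustrates the size effect and the extra add on access.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t buf_off_t;   /* hypothetical stand-in for an offset type */

/* Before: three header pointers, 8 bytes each on a 64-bit target. */
struct buf_ptrs {
	unsigned char *head, *mac, *network, *transport;
};

/* After: three 4-byte offsets relative to head. */
struct buf_offs {
	unsigned char *head;
	buf_off_t mac, network, transport;
};

/* Reading a header now costs one add. */
static inline unsigned char *buf_network_header(const struct buf_offs *b)
{
	return b->head + b->network;
}

/* Storing a header converts the pointer back into an offset. */
static inline void buf_set_network_header(struct buf_offs *b,
					  unsigned char *p)
{
	assert(p >= b->head);
	b->network = (buf_off_t)(p - b->head);
}

/* Differences between headers need no pointer conversion at all. */
static inline uint32_t buf_network_header_len(const struct buf_offs *b)
{
	return b->transport - b->network;
}
```

On a typical 64-bit ABI the pointer version of this model occupies 32 bytes and the offset version 24, of which 4 are padding: the same "8 bytes saved, 4-byte hole left" pattern that the pahole output shows. On 32-bit targets both structs have the same size.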
Arnaldo Carvalho de Melo
b0e380b1d8 [SK_BUFF]: unions of just one member don't get anything done, kill them
Renaming skb->h to skb->transport_header, skb->nh to skb->network_header and
skb->mac to skb->mac_header, to match the names of the associated helpers
(skb[_[re]set]_{transport,network,mac}_header).

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:26:20 -07:00
Arnaldo Carvalho de Melo
cfe1fc7759 [SK_BUFF]: Introduce skb_network_header_len
For the common sequence "skb->h.raw - skb->nh.raw", similar to skb->mac_len,
which is precalculated, though. I don't think we need to bloat skb with one more
member, so just use this new helper, reducing the number of non-skbuff.h
references to the layer headers even more.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:26:19 -07:00
Arnaldo Carvalho de Melo
bff9b61ce3 [SK_BUFF]: Use the helpers to get the layer header pointer
Some more cases...

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:26:18 -07:00
Patrick McHardy
514bca322c [NET_SCHED]: Fix warning
net/sched/sch_api.c: In function 'psched_show':
net/sched/sch_api.c:1219: warning: format '%08x' expects type 'unsigned int', but argument 6 has type 's64'

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:26:17 -07:00
Patrick McHardy
bb239acf56 [NET_SCHED]: sch_cbq: fix watchdog scheduled too late
q->now is increased during dequeue and doesn't contain the current time
afterwards, resulting in a too large timeout value for the qdisc watchdog.
Use "now" instead, which still contains the current time.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:26:16 -07:00
Patrick McHardy
4361cb17f0 [NET_SCHED]: Export real timer resolution in /proc/net/psched
The timer resolution exported in /proc/net/psched is used by userspace to
calculate HTB's burst values. Currently it is set to HZ. Since we're now
using hrtimers, use KTIME_MONOTONIC_RES, which makes HTB use smaller burst
values.

This patch also affects libnl, which incorrectly uses this value for
the SFQ perturbation parameter, which is always in seconds, and some
routing cache values, which are in USER_HZ, so both cases are broken
anyway.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:26:15 -07:00
Patrick McHardy
00c04af9df [NET_SCHED]: kill jiffie conversion macros
Now that all packet schedulers have been converted to hrtimers, most users
of PSCHED_JIFFIE2US and PSCHED_US2JIFFIE are gone. The remaining users use
them to convert external time units to packet scheduler clock ticks, so use
PSCHED_TICKS_PER_SEC instead.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:26:14 -07:00
Patrick McHardy
fb983d4578 [NET_SCHED]: sch_htb: use hrtimer based watchdog
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:26:13 -07:00
Patrick McHardy
1a13cb63d6 [NET_SCHED]: sch_cbq: use hrtimer for delay_timer
Switch delay_timer to hrtimer.

The class penalty parameter is changed to use psched ticks as units.
Since iproute never supported using this and the only existing user
(libnl) incorrectly assumes psched ticks as units anyway, this
shouldn't break anything.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:26:12 -07:00
Patrick McHardy
e9054a339e [NET_SCHED]: sch_cbq: fix cbq_undelay_prio for non-active priorities
cbq_undelay_prio is supposed to return a time delta, but returns the
current time for non-active priorities, causing cbq_undelay to mark
the priority as active and schedule a timer for twice the current
time.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:26:11 -07:00
Patrick McHardy
88a993540a [NET_SCHED]: sch_cbq: use hrtimer based watchdog
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-04-25 22:26:09 -07:00