Commit Graph

3919 Commits

Author SHA1 Message Date
Linus Torvalds
f61ea1b0c8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6 2006-01-04 16:30:12 -08:00
Linus Torvalds
d347da0def Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 2006-01-04 16:27:41 -08:00
Linus Torvalds
c6c88bbde4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6 2006-01-04 16:25:44 -08:00
Linus Torvalds
0356dbb7fe Merge master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq 2006-01-04 16:21:26 -08:00
Matthew Dharm
e80b0fade0 [PATCH] USB Storage: add alauda support
This patch adds another usb-storage subdriver, which supports two fairly
old dual-XD/SmartMedia reader-writers (USB1.1 devices).

This driver was written by Daniel Drake <dsd@gentoo.org> -- he notes
that he wrote this driver without specs, however a vendor-supplied GPL
driver for the previous generation of products ("sma03") did prove to be
quite useful, as did the sddr09 driver which also has to deal with
low-level physical block layout on SmartMedia.

The original patch has been reformed by me, as it clashed with the
libusual patches.

We really need to consolidate some of this common SmartMedia code, and
get together with the MTD guys to share it with them as well.

Signed-off-by: Daniel Drake <dsd@gentoo.org>
Signed-off-by: Matthew Dharm <mdharm-usb@one-eyed-alien.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-01-04 13:51:42 -08:00
Alan Stern
12c3da346e [PATCH] USB: Store port number in usb_device
This patch (as610) adds a field to struct usb_device to store the device's
port number.  This allows us to remove several loops in the hub driver
(searching for a particular device among all the entries in the parent's
array of children).

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-01-04 13:48:35 -08:00
Alan Stern
55c527187c [PATCH] USB: Consider power budget when choosing configuration
This patch (as609) changes the way we keep track of power budgeting for
USB hubs and devices, and it updates the choose_configuration routine to
take this information into account.  (This is something we should have
been doing all along.)  A new field in struct usb_device holds the amount
of bus current available from the upstream port, and the usb_hub structure
keeps track of the current available for each downstream port.

Two new rules for configuration selection are added:

	Don't select a self-powered configuration when only bus power
	is available.

	Don't select a configuration requiring more bus power than is
	available.

However the first rule is #if-ed out, because I found that the internal
hub in my HP USB keyboard claims that its only configuration is
self-powered.  The rule would prevent the configuration from being chosen,
leaving the hub & keyboard unconfigured.  Since similar descriptor errors
may turn out to be fairly common, it seemed wise not to include a rule
that would break automatic configuration unnecessarily for such devices.

The second rule may also trigger unnecessarily, although this should be
less common.  More likely it will annoy people by sometimes failing to
accept configurations that should never have been chosen in the first
place.

The patch also changes usbcore's reaction when no configuration is
suitable.  Instead of raising an error and rejecting the device, now
the core will simply leave the device unconfigured.  People can always
work around such problems by installing configurations manually through
sysfs.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-01-04 13:48:34 -08:00
Alan Stern
9ad3d6ccf5 [PATCH] USB: Remove USB private semaphore
This patch (as605) removes the private udev->serialize semaphore,
relying instead on the locking provided by the embedded struct device's
semaphore.  The changes are confined to the core, except that the
usb_trylock_device routine now uses the return convention of
down_trylock rather than down_read_trylock (they return opposite values
for no good reason).

A couple of other associated changes are included as well:

	Now that we aren't concerned about HCDs that avoid using the
	hcd glue layer, usb_disconnect no longer needs to acquire the
	usb_bus_lock -- that can be done by usb_remove_hcd where it
	belongs.

	Devices aren't locked over the same scope of code in
	usb_new_device and hub_port_connect_change as they used to be.
	This shouldn't cause any trouble.

Along with the preceding driver core patch, this needs a lot of testing.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-01-04 13:48:34 -08:00
Greg Kroah-Hartman
75318d2d7c [PATCH] USB: remove .owner field from struct usb_driver
It is no longer needed, so let's remove it, saving a bit of memory.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-01-04 13:48:34 -08:00
Greg Kroah-Hartman
2143acc6dc [PATCH] USB: make registering a usb driver automatically set the module owner
This fixes the driver that forgot to set the module owner up.  Now we
can remove the unneeded pointer from the usb driver structure.  The idea
for how to do this was from Al Viro, who did this for the PCI drivers.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-01-04 13:48:32 -08:00
Greg Kroah-Hartman
ba9dc657af [PATCH] USB: allow usb drivers to disable dynamic ids
This lets drivers, like the usb-serial ones, disable the ability to add
ids from sysfs.

The usb-serial drivers are "odd" in that they are really usb-serial bus
drivers, not usb bus drivers, so the dynamic id logic will have to go
into the usb-serial bus core for those drivers to get that ability.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-01-04 13:48:32 -08:00
Greg Kroah-Hartman
733260ff9c [PATCH] USB: add dynamic id functionality to USB core
Echo the usb vendor and product id to the "new_id" file in the driver's
sysfs directory, and then that driver will be able to bind to a device
with those ids if it is present.

Example:
	echo 0557 2008 > /sys/bus/usb/drivers/foo_driver/new_id
adds the hex values 0557 and 2008 to the device id table for the foo_driver.

Note, usb-serial drivers do not currently work with this capability yet.
usb-storage also might have some oddities.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-01-04 13:48:32 -08:00
Pete Zaitcev
a00828e9ac [PATCH] USB: drivers/usb/storage/libusual
This patch adds a shim driver libusual, which routes devices between
usb-storage and ub according to the common table, based on unusual_devs.h.
The help and example syntax is in Kconfig.

Signed-off-by: Pete Zaitcev <zaitcev@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-01-04 13:48:31 -08:00
Richard Purdie
81f280e22f [PATCH] USB: pxa27x OHCI - Separate platform code from main driver
To allow multiple platforms to use the PXA27x OHCI driver, the platform
code needs to be moved into the board specific files in
arch/arm/mach-pxa. This patch does this for mainstone and adds
preliminary hooks to allow other boards to use the driver.

This has been compile tested for mainstone and successfully run on Spitz
(Sharp Zaurus SL-C3000) with the addition of an appropriate board
support file.

Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Nicolas Pitre <nico@cam.org>
Acked-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-01-04 13:48:28 -08:00
Stephen Hemminger
40efc6fa17 [TCP]: less inline's
TCP inline usage cleanup:
 * get rid of inline in several places
 * replace __inline__ with inline where possible
 * move functions used in one file out of tcp.h
 * let compiler decide on used once cases

On x86_64: 
   text	   data	    bss	    dec	    hex	filename
3594701	 648348	 567400	4810449	 4966d1	vmlinux.orig
3593133	 648580	 567400	4809113	 496199	vmlinux

On sparc64:
   text	   data	    bss	    dec	    hex	filename
2538278	 406152	 530392	3474822	 350586	vmlinux.ORIG
2536382	 406384	 530392	3473158	 34ff06	vmlinux

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 16:03:49 -08:00
Stephen Hemminger
88df8ef59a [NET]: Don't exclude broadcast addresses from is_multicast_ether_addr()
The check for multicast shouldn't exclude broadcast type addresses.
This reverts the incorrect change done in 2.6.13.

The broadcast address is a multicast address and should be excluded
from being a valid_ether_address for use in bridging or device address.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 15:25:45 -08:00
Per Liden
b461d2f218 [NETLINK] genetlink: fix cmd type in genl_ops to be consistent to u8
Signed-off-by: Per Liden <per.liden@ericsson.com>
ACKed-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 14:13:29 -08:00
Benjamin LaHaise
fd19f329a3 [AF_UNIX]: Convert to use a spinlock instead of rwlock
From: Benjamin LaHaise <bcrl@kvack.org>

In af_unix, a rwlock is used to protect internal state.  At least on my 
P4 with HT it is faster to use a spinlock due to the simpler memory 
barrier used to unlock.  This patch raises bw_unix to ~690K/s.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 14:10:46 -08:00
Benjamin LaHaise
4947d3ef8d [NET]: Speed up __alloc_skb()
From: Benjamin LaHaise <bcrl@kvack.org>

In __alloc_skb(), the use of skb_shinfo() which casts a u8 * to the 
shared info structure results in gcc being forced to do a reload of the 
pointer since it has no information on possible aliasing.  Fix this by 
using a pointer to refer to skb_shared_info.

By initializing skb_shared_info sequentially, the write combining buffers 
can reduce the number of memory transactions to a single write.  Reorder 
the initialization in __alloc_skb() to match the structure definition.  
There is also an alignment issue on 64 bit systems with skb_shared_info 
by converting nr_frags to a short everything packs up nicely.

Also, pass the slab cache pointer according to the fclone flag instead 
of using two almost identical function calls.

This raises bw_unix performance up to a peak of 707KB/s when combined 
with the spinlock patch.  It should help other networking protocols, too.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 14:06:50 -08:00
David S. Miller
17ba15fb62 [PPPOX]: Fix assignment into const proto_ops.
And actually, with this, the whole pppox layer can basically
be removed and subsumed into pppoe.c, no other pppox sub-protocol
implementation exists and we've had this thing for at least 4
years.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:23 -08:00
Arnaldo Carvalho de Melo
8639a11e23 [TCP]: Don't use __constant_htonl for a non const arg
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:22 -08:00
Arnaldo Carvalho de Melo
14c850212e [INET_SOCK]: Move struct inet_sock & helper functions to net/inet_sock.h
To help in reducing the number of include dependencies, several files were
touched as they were getting needed headers indirectly for stuff they use.

Thanks also to Alan Menegotto for pointing out that net/dccp/proto.c had
linux/dccp.h include twice.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:21 -08:00
Arnaldo Carvalho de Melo
25995ff577 [SOCK]: Introduce sk_receive_skb
Its common enough to to justify that, TCP still can't use it as it has the
prequeueing stuff, still to be made generic in the not so distant future :-)

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:19 -08:00
Eric Dumazet
90ddc4f047 [NET]: move struct proto_ops to const
I noticed that some of 'struct proto_ops' used in the kernel may share
a cache line used by locks or other heavily modified data. (default
linker alignement is 32 bytes, and L1_CACHE_LINE is 64 or 128 at
least)

This patch makes sure a 'struct proto_ops' can be declared as const,
so that all cpus can share all parts of it without false sharing.

This is not mandatory : a driver can still use a read/write structure
if it needs to (and eventually a __read_mostly)

I made a global stubstitute to change all existing occurences to make
them const.

This should reduce the possibility of false sharing on SMP, and
speedup some socket system calls.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:15 -08:00
Andi Kleen
77d76ea310 [NET]: Small cleanup to socket initialization
sock_init can be done as a core_initcall instead of calling
it directly in init/main.c

Also I removed an out of date #ifdef.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:14 -08:00
Frank Filz
7708610b1b [SCTP]: Add support for SCTP_DELAYED_ACK_TIME socket option.
Signed-off-by: Frank Filz <ffilz@us.ibm.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:13 -08:00
Frank Filz
52ccb8e90c [SCTP]: Update SCTP_PEER_ADDR_PARAMS socket option to the latest api draft.
This patch adds support to set/get heartbeat interval, maximum number of
retransmissions, pathmtu, sackdelay time for a particular transport/
association/socket as per the latest SCTP sockets api draft11.

Signed-off-by: Frank Filz <ffilz@us.ibm.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:11 -08:00
Stephen Hemminger
90933fc8ba [FLS64]: x86_64 version
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:07 -08:00
Stephen Hemminger
3821af2fe1 [FLS64]: generic version
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:06 -08:00
Stephen Hemminger
c865e5d99e [PKT_SCHED] netem: packet corruption option
Here is a new feature for netem in 2.6.16. It adds the ability to
randomly corrupt packets with netem. A version was done by
Hagen Paul Pfeifer, but I redid it to handle the cases of backwards
compatibility with netlink interface and presence of hardware checksum
offload. It is useful for testing hardware offload in devices.

Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:11:05 -08:00
David S. Miller
fbe9cc4a87 [AF_UNIX]: Use spinlock for unix_table_lock
This lock is actually taken mostly as a writer,
so using a rwlock actually just makes performance
worse especially on chips like the Intel P4.

Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:59 -08:00
Arnaldo Carvalho de Melo
d83d8461f9 [IP_SOCKGLUE]: Remove most of the tcp specific calls
As DCCP needs to be called in the same spots.

Now we have a member in inet_sock (is_icsk), set at sock creation time from
struct inet_protosw->flags (if INET_PROTOSW_ICSK is set, like for TCP and
DCCP) to see if a struct sock instance is a inet_connection_sock for places
like the ones in ip_sockglue.c (v4 and v6) where we previously were looking if
sk_type was SOCK_STREAM, that is insufficient because we now use the same code
for DCCP, that has sk_type SOCK_DCCP.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:58 -08:00
Arnaldo Carvalho de Melo
2271281362 [TCP]: Move the TCPF_ enum to tcp_states.h
Upcoming patches will make, for instance, ip_sockglue.c need just this enum
and not all of tcp.h.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:57 -08:00
Arnaldo Carvalho de Melo
d8313f5ca2 [INET6]: Generalise tcp_v6_hash_connect
Renaming it to inet6_hash_connect, making it possible to ditch
dccp_v6_hash_connect and share the same code with TCP instead.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:56 -08:00
Arnaldo Carvalho de Melo
a7f5e7f164 [INET]: Generalise tcp_v4_hash_connect
Renaming it to inet_hash_connect, making it possible to ditch
dccp_v4_hash_connect and share the same code with TCP instead.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:55 -08:00
Arnaldo Carvalho de Melo
6d6ee43e0b [TWSK]: Introduce struct timewait_sock_ops
So that we can share several timewait sockets related functions and
make the timewait mini sockets infrastructure closer to the request
mini sockets one.

Next changesets will take advantage of this, moving more code out of
TCP and DCCP v4 and v6 to common infrastructure.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:54 -08:00
Arnaldo Carvalho de Melo
399c07def6 [IPV6]: Export ipv6_opt_accepted
It was already non-TCP specific, will be used by DCCPv6.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:51 -08:00
Arnaldo Carvalho de Melo
0fa1a53e1f [IPV6]: Introduce inet6_timewait_sock
Out of tcp6_timewait_sock, that now is just an aggregation of
inet_timewait_sock and inet6_timewait_sock, using tw_ipv6_offset in struct
inet_timewait_sock, that is common to the IPv6 transport protocols that use
timewait sockets, like DCCP and TCP.

tw_ipv6_offset plays the struct inet_sock pinfo6 role, i.e. for the generic
code to find the IPv6 area in a timewait sock.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:47 -08:00
Arnaldo Carvalho de Melo
b9750ce13c [IPV6]: Generalise some functions
Using sk->sk_protocol instead of IPPROTO_TCP.

Will be used by DCCPv6 in the next changesets.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:46 -08:00
Benjamin LaHaise
c1cbe4b7ad [NET]: Avoid atomic xchg() for non-error case
It also looks like there were 2 places where the test on sk_err was
missing from the event wait logic (in sk_stream_wait_connect and
sk_stream_wait_memory), while the rest of the sock_error() users look
to be doing the right thing.  This version of the patch fixes those,
and cleans up a few places that were testing ->sk_err directly.

Signed-off-by: Benjamin LaHaise <benjamin.c.lahaise@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:44 -08:00
Herbert Xu
3305b80c21 [IP]: Simplify and consolidate MSG_PEEK error handling
When a packet is obtained from skb_recv_datagram with MSG_PEEK enabled
it is left on the socket receive queue.  This means that when we detect
a checksum error we have to be careful when trying to free the packet
as someone could have dequeued it in the time being.

Currently this delicate logic is duplicated three times between UDPv4,
UDPv6 and RAWv6.  This patch moves them into a one place and simplifies
the code somewhat.

This is based on a suggestion by Eric Dumazet.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:41 -08:00
Arnaldo Carvalho de Melo
af05dc9394 [ICSK]: Move v4_addr2sockaddr from TCP to icsk
Renaming it to inet_csk_addr2sockaddr.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:39 -08:00
Arnaldo Carvalho de Melo
8292a17a39 [ICSK]: Rename struct tcp_func to struct inet_connection_sock_af_ops
And move it to struct inet_connection_sock. DCCP will use it in the
upcoming changesets.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:38 -08:00
Arnaldo Carvalho de Melo
ca304b6104 [IPV6]: Introduce inet6_rsk()
And inet6_rsk_offset in inet_request_sock, for the same reasons as
inet_sock's pinfo6 member.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:37 -08:00
Arnaldo Carvalho de Melo
8129765ac0 [IPV6]: Generalise tcp_v6_search_req & tcp_v6_synq_add
More work is needed tho to introduce inet6_request_sock from
tcp6_request_sock, in the same layout considerations as ipv6_pinfo in
inet_sock, next changeset will do that.

Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:36 -08:00
Arnaldo Carvalho de Melo
c2977c2213 [ICSK]: make inet_csk_reqsk_queue_hash_add timeout arg unsigned long
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:34 -08:00
Arnaldo Carvalho de Melo
90b19d3169 [IPV6]: Generalise __tcp_v6_hash, renaming it to __inet6_hash
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:33 -08:00
Arnaldo Carvalho de Melo
971af18bbf [IPV6]: Reuse inet_csk_get_port in tcp_v6_get_port
Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:33 -08:00
Herbert Xu
89cee8b1cb [IPV4]: Safer reassembly
Another spin of Herbert Xu's "safer ip reassembly" patch
for 2.6.16.

(The original patch is here:
http://marc.theaimsgroup.com/?l=linux-netdev&m=112281936522415&w=2
and my only contribution is to have tested it.)

This patch (optionally) does additional checks before accepting IP
fragments, which can greatly reduce the possibility of reassembling
fragments which originated from different IP datagrams.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Arthur Kepner <akepner@sgi.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:31 -08:00
Trent Jaeger
df71837d50 [LSM-IPSec]: Security association restriction.
This patch series implements per packet access control via the
extension of the Linux Security Modules (LSM) interface by hooks in
the XFRM and pfkey subsystems that leverage IPSec security
associations to label packets.  Extensions to the SELinux LSM are
included that leverage the patch for this purpose.

This patch implements the changes necessary to the XFRM subsystem,
pfkey interface, ipv4/ipv6, and xfrm_user interface to restrict a
socket to use only authorized security associations (or no security
association) to send/receive network packets.

Patch purpose:

The patch is designed to enable access control per packets based on
the strongly authenticated IPSec security association.  Such access
controls augment the existing ones based on network interface and IP
address.  The former are very coarse-grained, and the latter can be
spoofed.  By using IPSec, the system can control access to remote
hosts based on cryptographic keys generated using the IPSec mechanism.
This enables access control on a per-machine basis or per-application
if the remote machine is running the same mechanism and trusted to
enforce the access control policy.

Patch design approach:

The overall approach is that policy (xfrm_policy) entries set by
user-level programs (e.g., setkey for ipsec-tools) are extended with a
security context that is used at policy selection time in the XFRM
subsystem to restrict the sockets that can send/receive packets via
security associations (xfrm_states) that are built from those
policies.

A presentation available at
www.selinux-symposium.org/2005/presentations/session2/2-3-jaeger.pdf
from the SELinux symposium describes the overall approach.

Patch implementation details:

On output, the policy retrieved (via xfrm_policy_lookup or
xfrm_sk_policy_lookup) must be authorized for the security context of
the socket and the same security context is required for resultant
security association (retrieved or negotiated via racoon in
ipsec-tools).  This is enforced in xfrm_state_find.

On input, the policy retrieved must also be authorized for the socket
(at __xfrm_policy_check), and the security context of the policy must
also match the security association being used.

The patch has virtually no impact on packets that do not use IPSec.
The existing Netfilter (outgoing) and LSM rcv_skb hooks are used as
before.

Also, if IPSec is used without security contexts, the impact is
minimal.  The LSM must allow such policies to be selected for the
combination of socket and remote machine, but subsequent IPSec
processing proceeds as in the original case.

Testing:

The pfkey interface is tested using the ipsec-tools.  ipsec-tools have
been modified (a separate ipsec-tools patch is available for version
0.5) that supports assignment of xfrm_policy entries and security
associations with security contexts via setkey and the negotiation
using the security contexts via racoon.

The xfrm_user interface is tested via ad hoc programs that set
security contexts.  These programs are also available from me, and
contain programs for setting, getting, and deleting policy for testing
this interface.  Testing of sa functions was done by tracing kernel
behavior.

Signed-off-by: Trent Jaeger <tjaeger@cse.psu.edu>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-03 13:10:24 -08:00