linux/net
Michal Hocko 2f064f3485 mm: make page pfmemalloc check more robust
Commit c48a11c7ad ("netvm: propagate page->pfmemalloc to skb") added
checks for page->pfmemalloc to __skb_fill_page_desc():

        if (page->pfmemalloc && !page->mapping)
                skb->pfmemalloc = true;

It assumes page->mapping == NULL implies that page->pfmemalloc can be
trusted.  However, __delete_from_page_cache() can set set page->mapping
to NULL and leave page->index value alone.  Due to being in union, a
non-zero page->index will be interpreted as true page->pfmemalloc.

So the assumption is invalid if the networking code can see such a page.
And it seems it can.  We have encountered this with a NFS over loopback
setup when such a page is attached to a new skbuf.  There is no copying
going on in this case so the page confuses __skb_fill_page_desc which
interprets the index as pfmemalloc flag and the network stack drops
packets that have been allocated using the reserves unless they are to
be queued on sockets handling the swapping which is the case here and
that leads to hangs when the nfs client waits for a response from the
server which has been dropped and thus never arrive.

The struct page is already heavily packed so rather than finding another
hole to put it in, let's do a trick instead.  We can reuse the index
again but define it to an impossible value (-1UL).  This is the page
index so it should never see the value that large.  Replace all direct
users of page->pfmemalloc by page_is_pfmemalloc which will hide this
nastiness from unspoiled eyes.

The information will get lost if somebody wants to use page->index
obviously but that was the case before and the original code expected
that the information should be persisted somewhere else if that is
really needed (e.g.  what SLAB and SLUB do).

[akpm@linux-foundation.org: fix blooper in slub]
Fixes: c48a11c7ad ("netvm: propagate page->pfmemalloc to skb")
Signed-off-by: Michal Hocko <mhocko@suse.com>
Debugged-by: Vlastimil Babka <vbabka@suse.com>
Debugged-by: Jiri Bohac <jbohac@suse.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: <stable@vger.kernel.org>	[3.6+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-08-21 14:30:10 -07:00
..
6lowpan
9p virtio/vhost: fixes for 4.2 2015-07-23 13:07:04 -07:00
802
8021q vlan: Add GRO support for non hardware accelerated vlan 2015-06-01 16:50:52 -07:00
appletalk net: Pass kern from net_proto_family.create to sk_alloc 2015-05-11 10:50:17 -04:00
atm net: Pass kern from net_proto_family.create to sk_alloc 2015-05-11 10:50:17 -04:00
ax25 NET: AX.25: Stop heartbeat timer on disconnect. 2015-07-15 15:59:58 -07:00
batman-adv batman-adv: Fix memory leak on tt add with invalid vlan 2015-08-18 19:08:23 -07:00
bluetooth Bluetooth: fix MGMT_EV_NEW_LONG_TERM_KEY event 2015-08-06 16:36:03 +02:00
bridge net: fix wrong skb_get() usage / crash in IGMP/MLD parsing code 2015-08-13 17:08:39 -07:00
caif caif: fix leaks and race in caif_queue_rcv_skb() 2015-07-21 00:02:44 -07:00
can can: replace timestamp as unique skb attribute 2015-07-12 21:13:22 +02:00
ceph libceph: treat sockaddr_storage with uninitialized family as blank 2015-07-09 20:30:34 +03:00
core mm: make page pfmemalloc check more robust 2015-08-21 14:30:10 -07:00
dcb
dccp tcp: fix recv with flags MSG_WAITALL | MSG_PEEK 2015-07-27 01:06:53 -07:00
decnet net: Pass kern from net_proto_family.create to sk_alloc 2015-05-11 10:50:17 -04:00
dns_resolver
dsa net: dsa: Do not override PHY interface if already configured 2015-08-12 14:29:50 -07:00
ethernet net: Add full IPv6 addresses to flow_keys 2015-06-04 15:44:30 -07:00
hsr
ieee802154 inet: frag: change *_frag_mem_limit functions to take netns_frags as argument 2015-07-26 21:00:14 -07:00
ipv4 Revert "net: limit tcp/udp rmem/wmem to SOCK_{RCV,SND}BUF_MIN" 2015-08-17 12:10:30 -07:00
ipv6 ipv6: Fix a potential deadlock when creating pcpu rt 2015-08-17 14:28:03 -07:00
ipx net: Pass kern from net_proto_family.create to sk_alloc 2015-05-11 10:50:17 -04:00
irda irda: use msecs_to_jiffies for conversion to jiffies 2015-05-25 17:46:21 -04:00
iucv net: Pass kern from net_proto_family.create to sk_alloc 2015-05-11 10:50:17 -04:00
key Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2015-06-24 16:49:49 -07:00
l2tp net: Modify sk_alloc to not reference count the netns of kernel sockets. 2015-05-11 10:50:18 -04:00
lapb
llc tcp: fix recv with flags MSG_WAITALL | MSG_PEEK 2015-07-27 01:06:53 -07:00
mac80211 mac80211: fix invalid read in minstrel_sort_best_tp_rates() 2015-08-13 13:52:34 +02:00
mac802154 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2015-06-24 16:49:49 -07:00
mpls Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-06-13 23:56:52 -07:00
netfilter netfilter: conntrack: Use flags in nf_ct_tmpl_alloc() 2015-08-05 10:56:43 +02:00
netlabel
netlink netlink: make sure -EBUSY won't escape from netlink_insert 2015-08-10 10:59:10 -07:00
netrom netfilter: Remove spurios included of netfilter.h 2015-06-18 21:14:32 +02:00
nfc NFC: nci: fix mistake in uart generic driver 2015-06-15 18:10:37 +02:00
openvswitch openvswitch: Fix L4 checksum handling when dealing with IP fragments 2015-08-03 14:03:08 -07:00
packet packet: tpacket_snd(): fix signed/unsigned comparison 2015-07-29 00:09:58 -07:00
phonet net: Pass kern from net_proto_family.create to sk_alloc 2015-05-11 10:50:17 -04:00
rds rds: fix an integer overflow test in rds_info_getsockopt() 2015-08-03 15:20:16 -07:00
rfkill net: rfkill: gpio: make better use of gpiod API 2015-05-29 13:13:45 +02:00
rose Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-06-24 02:58:51 -07:00
rxrpc net: Pass kern from net_proto_family.create to sk_alloc 2015-05-11 10:50:17 -04:00
sched act_mirred: avoid calling tcf_hash_release() when binding 2015-08-03 14:13:28 -07:00
sctp net: sctp: stop spamming klog with rfc6458, 5.3.2. deprecation warnings 2015-07-26 16:32:41 -07:00
sunrpc NFS client bugfixes for Linux 4.2 2015-07-28 09:37:44 -07:00
switchdev net: switchdev: don't abort unsupported operations 2015-07-11 21:29:55 -07:00
tipc net/tipc: initialize security state for new connection socket 2015-07-08 16:08:23 -07:00
unix net/unix: support SCM_SECURITY for stream sockets 2015-06-10 22:49:20 -07:00
vmw_vsock net: Pass kern from net_proto_family.create to sk_alloc 2015-05-11 10:50:17 -04:00
wimax
wireless cfg80211: use RTNL locked reg_can_beacon for IR-relaxation 2015-07-17 15:02:02 +02:00
x25 net: Pass kern from net_proto_family.create to sk_alloc 2015-05-11 10:50:17 -04:00
xfrm Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2015-06-24 16:49:49 -07:00
compat.c
Kconfig net: add CONFIG_NET_INGRESS to enable ingress filtering 2015-05-14 01:10:05 -04:00
Makefile
socket.c net: Add a struct net parameter to sock_create_kern 2015-05-11 10:50:17 -04:00
sysctl_net.c