linux/include
Michal Hocko 2f064f3485 mm: make page pfmemalloc check more robust
Commit c48a11c7ad ("netvm: propagate page->pfmemalloc to skb") added
checks for page->pfmemalloc to __skb_fill_page_desc():

        if (page->pfmemalloc && !page->mapping)
                skb->pfmemalloc = true;

It assumes page->mapping == NULL implies that page->pfmemalloc can be
trusted.  However, __delete_from_page_cache() can set set page->mapping
to NULL and leave page->index value alone.  Due to being in union, a
non-zero page->index will be interpreted as true page->pfmemalloc.

So the assumption is invalid if the networking code can see such a page.
And it seems it can.  We have encountered this with a NFS over loopback
setup when such a page is attached to a new skbuf.  There is no copying
going on in this case so the page confuses __skb_fill_page_desc which
interprets the index as pfmemalloc flag and the network stack drops
packets that have been allocated using the reserves unless they are to
be queued on sockets handling the swapping which is the case here and
that leads to hangs when the nfs client waits for a response from the
server which has been dropped and thus never arrive.

The struct page is already heavily packed so rather than finding another
hole to put it in, let's do a trick instead.  We can reuse the index
again but define it to an impossible value (-1UL).  This is the page
index so it should never see the value that large.  Replace all direct
users of page->pfmemalloc by page_is_pfmemalloc which will hide this
nastiness from unspoiled eyes.

The information will get lost if somebody wants to use page->index
obviously but that was the case before and the original code expected
that the information should be persisted somewhere else if that is
really needed (e.g.  what SLAB and SLUB do).

[akpm@linux-foundation.org: fix blooper in slub]
Fixes: c48a11c7ad ("netvm: propagate page->pfmemalloc to skb")
Signed-off-by: Michal Hocko <mhocko@suse.com>
Debugged-by: Vlastimil Babka <vbabka@suse.com>
Debugged-by: Jiri Bohac <jbohac@suse.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>
Acked-by: Mel Gorman <mgorman@suse.de>
Cc: <stable@vger.kernel.org>	[3.6+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-08-21 14:30:10 -07:00
..
acpi Additional ACPICA material for v4.2-rc1 2015-07-02 17:11:28 -07:00
asm-generic mm: clean up per architecture MM hook header files 2015-07-17 16:39:53 -07:00
clocksource
crypto
drm drm/edid: add function to help find SADs 2015-08-20 09:46:08 +10:00
dt-bindings The changes to the common clock framework for 4.2 are dominated by new 2015-07-01 19:22:00 -07:00
keys
kvm
linux mm: make page pfmemalloc check more robust 2015-08-21 14:30:10 -07:00
math-emu
media media fixes for v4.2-rc8 2015-08-21 11:03:06 -07:00
memory
misc
net net: sched: fix refcount imbalance in actions 2015-07-30 14:20:39 -07:00
pcmcia
ras tracing: add trace event for memory-failure 2015-06-24 17:49:43 -07:00
rdma IB: Add rdma_cap_ib_switch helper and use where appropriate 2015-07-14 13:20:08 -04:00
rxrpc
scsi Merge branch 'for-4.2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata 2015-08-17 16:20:45 -07:00
soc Merge branch 'for-linus' of git://ftp.arm.linux.org.uk/~rmk/linux-arm 2015-06-26 12:20:00 -07:00
sound ASoC: topology: Add Kconfig option for topology 2015-08-17 22:45:47 -07:00
target iscsi-target: Fix iscsit_start_kthreads failure OOPs 2015-07-24 14:19:43 -07:00
trace Merge branch 'for-linus-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs 2015-06-30 20:07:45 -07:00
uapi sound fixes for 4.2-final 2015-08-20 12:08:38 -07:00
video Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux 2015-06-26 13:18:51 -07:00
xen
Kbuild