Commit Graph

618088 Commits

Author SHA1 Message Date
Eric Ren
c33f0785bf ocfs2: fix deadlock on mmapped page in ocfs2_write_begin_nolock()
The testcase "mmaptruncate" of ocfs2-test deadlocks occasionally.

In this testcase, we create a 2*CLUSTER_SIZE file and mmap() on it;
there are 2 process repeatedly performing the following operations
respectively: one is doing memset(mmaped_addr + 2*CLUSTER_SIZE - 1, 'a',
1), while the another is playing ftruncate(fd, 2*CLUSTER_SIZE) and then
ftruncate(fd, CLUSTER_SIZE) again and again.

This is the backtrace when the deadlock happens:

   __wait_on_bit_lock+0x50/0xa0
   __lock_page+0xb7/0xc0
   ocfs2_write_begin_nolock+0x163f/0x1790 [ocfs2]
   ocfs2_page_mkwrite+0x1c7/0x2a0 [ocfs2]
   do_page_mkwrite+0x66/0xc0
   handle_mm_fault+0x685/0x1350
   __do_page_fault+0x1d8/0x4d0
   trace_do_page_fault+0x37/0xf0
   do_async_page_fault+0x19/0x70
   async_page_fault+0x28/0x30

In ocfs2_write_begin_nolock(), we first grab the pages and then allocate
disk space for this write; ocfs2_try_to_free_truncate_log() will be
called if -ENOSPC is returned; if we're lucky to get enough clusters,
which is usually the case, we start over again.

But in ocfs2_free_write_ctxt() the target page isn't unlocked, so we
will deadlock when trying to grab the target page again.

Also, -ENOMEM might be returned in ocfs2_grab_pages_for_write().
Another deadlock will happen in __do_page_mkwrite() if
ocfs2_page_mkwrite() returns non-VM_FAULT_LOCKED, and along with a
locked target page.

These two errors fail on the same path, so fix them by unlocking the
target page manually before ocfs2_free_write_ctxt().

Jan Kara helps me clear out the JBD2 part, and suggest the hint for root
cause.

Changes since v1:
1. Also put ENOMEM error case into consideration.

Link: http://lkml.kernel.org/r/1474173902-32075-1-git-send-email-zren@suse.com
Signed-off-by: Eric Ren <zren@suse.com>
Reviewed-by: He Gang <ghe@suse.com>
Acked-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-30 15:26:52 -07:00
Johannes Weiner
22f2ac51b6 mm: workingset: fix crash in shadow node shrinker caused by replace_page_cache_page()
Antonio reports the following crash when using fuse under memory pressure:

  kernel BUG at /build/linux-a2WvEb/linux-4.4.0/mm/workingset.c:346!
  invalid opcode: 0000 [#1] SMP
  Modules linked in: all of them
  CPU: 2 PID: 63 Comm: kswapd0 Not tainted 4.4.0-36-generic #55-Ubuntu
  Hardware name: System manufacturer System Product Name/P8H67-M PRO, BIOS 3904 04/27/2013
  task: ffff88040cae6040 ti: ffff880407488000 task.ti: ffff880407488000
  RIP: shadow_lru_isolate+0x181/0x190
  Call Trace:
    __list_lru_walk_one.isra.3+0x8f/0x130
    list_lru_walk_one+0x23/0x30
    scan_shadow_nodes+0x34/0x50
    shrink_slab.part.40+0x1ed/0x3d0
    shrink_zone+0x2ca/0x2e0
    kswapd+0x51e/0x990
    kthread+0xd8/0xf0
    ret_from_fork+0x3f/0x70

which corresponds to the following sanity check in the shadow node
tracking:

  BUG_ON(node->count & RADIX_TREE_COUNT_MASK);

The workingset code tracks radix tree nodes that exclusively contain
shadow entries of evicted pages in them, and this (somewhat obscure)
line checks whether there are real pages left that would interfere with
reclaim of the radix tree node under memory pressure.

While discussing ways how fuse might sneak pages into the radix tree
past the workingset code, Miklos pointed to replace_page_cache_page(),
and indeed there is a problem there: it properly accounts for the old
page being removed - __delete_from_page_cache() does that - but then
does a raw raw radix_tree_insert(), not accounting for the replacement
page.  Eventually the page count bits in node->count underflow while
leaving the node incorrectly linked to the shadow node LRU.

To address this, make sure replace_page_cache_page() uses the tracked
page insertion code, page_cache_tree_insert().  This fixes the page
accounting and makes sure page-containing nodes are properly unlinked
from the shadow node LRU again.

Also, make the sanity checks a bit less obscure by using the helpers for
checking the number of pages and shadows in a radix tree node.

Fixes: 449dd6984d ("mm: keep page cache radix tree nodes in check")
Link: http://lkml.kernel.org/r/20160919155822.29498-1-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Antonio SJ Musumeci <trapexit@spawn.link>
Debugged-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: <stable@vger.kernel.org>	[3.15+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-30 15:26:52 -07:00
Linus Torvalds
e3b3656ca6 drm fixes for final 4.8, udl, amdgpu/radeon and nouveau
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJX7c+TAAoJEAx081l5xIa+wZ4P/10JkWL+QQCpmCfg0BJ+m1tD
 74kkzSbX8oTB5wk0XL+eGJxDVu1nqs2tt/ngC8UKz+d5mt3UIBRbe1Up4w80DsAW
 JNIdFzDGeiz4JMoHxE3sL3hYDtbreJ9tp/hWZapPUK6D8omed1Prt0vt6zAsl1cu
 JWYUdeWBlerXFRnSw0AuLL/JpnpRrdLrfb0DyWrRszh+dxnJ6D8qMJDbvXCzucqs
 AqDzCBc0yTd6sCco6wTqvANDRYQCwxWWqFeJpEiotbMh545t2INezioUNXaigIBH
 uwuitiUNc277RA5w/O7SwTz9VO7ewOd3KblxFEnKWF4OhgpC13Y6UnuxryapZzb0
 4y2PbdwzL5eLEQIt3LiF0ZfRrkCrRNqHn4wTUOuBGV6nx+ddexkT3jmGQrH/+mPN
 0K4f1QRpYyJFKbepPuZnydpnuA9fo2tslxf00AMo2Lm5Tpw3tAfYur1q+LmhR5JD
 5snGVGuLuiqxQFghuGxn5LUWgnqgI8hHMshpSfwHTpK7tfuAutZRLAgOyZAnKXTd
 xxTqxbTbjvBLEZhXGUrnxmT/xVwIdb0TDdg5v0QFxNNeq6WgVvyVe/mTJigHOYH1
 zODxE/wzLXIWzVk/wHa06JECWJ2KG1huNj4BxFIRdbUvPv6HrXPElbL0bfLM+a8e
 dNTNjCT1C8UzaA8JiGi8
 =F5WM
 -----END PGP SIGNATURE-----

Merge tag 'drm-fixes-for-v4.8-final' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
 "drm fixes for final 4.8.

  One big regression fix for udl, along with two amdgpu fixes and two
  nouveau fixes.

  All seems pretty safe and useful"

* tag 'drm-fixes-for-v4.8-final' of git://people.freedesktop.org/~airlied/linux:
  drm/udl: fix line iterator in damage handling
  drm/radeon/si/dpm: add workaround for for Jet parts
  drm/amdgpu: disable CRTCs before teardown
  drm/nouveau: Revert "bus: remove cpu_coherent flag"
  drm/nouveau/fifo/nv04: avoid ramht race against cookie insertion
2016-09-29 20:16:57 -07:00
Linus Torvalds
c6169de730 Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm fixes from Dan Williams:

 - Four fixes for "flush hint" support.

   Flush hints are addresses advertised by the ACPI 6+ NFIT (NVDIMM
   Firmware Interface Table) that when written and fenced guarantee that
   writes pending in platform write buffers (outside the cpu) have been
   flushed to media.  They might also be used by hypervisors as a
   trigger condition to flush guest-persistent memory ranges to storage.

    Fix a potential data corruption issue, a broken definition of the
    hint array, a wrong allocation size for the unit test implementation
    of the flush hint table, and missing NULL check in an error path.

    The unit test, while it did not prevent these bugs from being
    merged, at least triggered occasional crashes in advance of
    production usages.

 - Fix handling of ACPI DSM error status results.  The DSM mechanism
   allows communication with platform and memory device firmware.  We
   correctly parse known errors, but were silently ignoring others.

   Fix it to consistently fail any command with a non-zero status return
   that we otherwise do not interpret / handle.

* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  libnvdimm, region: fix flush hint table thinko
  nfit: fail DSMs that return non-zero status by default
  libnvdimm: fix devm_nvdimm_memremap() error path
  tools/testing/nvdimm: fix allocation range for mock flush hint tables
  nvdimm: fix PHYS_PFN/PFN_PHYS mixup
2016-09-29 14:59:11 -07:00
Linus Torvalds
53061afee4 Merge branch 'akpm' (patches from Andrew)
Merge fixes from Andrew Morton:
 "4 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mem-hotplug: use nodes that contain memory as mask in new_node_page()
  scripts/recordmcount.c: account for .softirqentry.text
  dma-mapping.h: preserve unmap info for CONFIG_DMA_API_DEBUG
  mm,ksm: fix endless looping in allocating memory when ksm enable
2016-09-28 16:20:24 -07:00
Li Zhong
231e97e2b8 mem-hotplug: use nodes that contain memory as mask in new_node_page()
9bb627be47 ("mem-hotplug: don't clear the only node in new_node_page()")
prevents allocating from an empty nodemask, but as David points out, it is
still wrong.  As node_online_map may include memoryless nodes, only
allocating from these nodes is meaningless.

This patch uses node_states[N_MEMORY] mask to prevent the above case.

Fixes: 9bb627be47 ("mem-hotplug: don't clear the only node in new_node_page()")
Fixes: 394e31d2ce ("mem-hotplug: alloc new page from a nearest neighbor node when mem-offline")
Link: http://lkml.kernel.org/r/1474447117.28370.6.camel@TP420
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Suggested-by: David Rientjes <rientjes@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: John Allen <jallen@linux.vnet.ibm.com>
Cc: Xishi Qiu <qiuxishi@huawei.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-28 16:19:02 -07:00
Dmitry Vyukov
e436fd61a8 scripts/recordmcount.c: account for .softirqentry.text
be7635e728 ("arch, ftrace: for KASAN put hard/soft IRQ entries into
separate sections") added .softirqentry.text section, but it was not added
to recordmcount.  So functions in the section are untracable.  Add the
section to scripts/recordmcount.c and scripts/recordmcount.pl.

Fixes: be7635e728 ("arch, ftrace: for KASAN put hard/soft IRQ entries into separate sections")
Link: http://lkml.kernel.org/r/1474902626-73468-1-git-send-email-dvyukov@google.com
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Steve Rostedt <rostedt@goodmis.org>
Cc: <stable@vger.kernel.org>	[4.6+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-28 16:19:02 -07:00
Andrey Smirnov
2481366afd dma-mapping.h: preserve unmap info for CONFIG_DMA_API_DEBUG
When CONFIG_DMA_API_DEBUG is enabled we need to preserve unmapping address
even if "unmap" is a no-op for our architecutre because we need
debug_dma_unmap_page() to correctly cleanup all of the debug bookkeeping.
Failing to do so results in a false positive warnings about previously
mapped areas never being unmapped.

Link: http://lkml.kernel.org/r/1474387125-3713-1-git-send-email-andrew.smirnov@gmail.com
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Cc: "Luis R. Rodriguez" <mcgrof@suse.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Geliang Tang <geliangtang@163.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-28 16:19:01 -07:00
zhong jiang
5b398e416e mm,ksm: fix endless looping in allocating memory when ksm enable
I hit the following hung task when runing a OOM LTP test case with 4.1
kernel.

Call trace:
[<ffffffc000086a88>] __switch_to+0x74/0x8c
[<ffffffc000a1bae0>] __schedule+0x23c/0x7bc
[<ffffffc000a1c09c>] schedule+0x3c/0x94
[<ffffffc000a1eb84>] rwsem_down_write_failed+0x214/0x350
[<ffffffc000a1e32c>] down_write+0x64/0x80
[<ffffffc00021f794>] __ksm_exit+0x90/0x19c
[<ffffffc0000be650>] mmput+0x118/0x11c
[<ffffffc0000c3ec4>] do_exit+0x2dc/0xa74
[<ffffffc0000c46f8>] do_group_exit+0x4c/0xe4
[<ffffffc0000d0f34>] get_signal+0x444/0x5e0
[<ffffffc000089fcc>] do_signal+0x1d8/0x450
[<ffffffc00008a35c>] do_notify_resume+0x70/0x78

The oom victim cannot terminate because it needs to take mmap_sem for
write while the lock is held by ksmd for read which loops in the page
allocator

ksm_do_scan
	scan_get_next_rmap_item
		down_read
		get_next_rmap_item
			alloc_rmap_item   #ksmd will loop permanently.

There is no way forward because the oom victim cannot release any memory
in 4.1 based kernel.  Since 4.6 we have the oom reaper which would solve
this problem because it would release the memory asynchronously.
Nevertheless we can relax alloc_rmap_item requirements and use
__GFP_NORETRY because the allocation failure is acceptable as ksm_do_scan
would just retry later after the lock got dropped.

Such a patch would be also easy to backport to older stable kernels which
do not have oom_reaper.

While we are at it add GFP_NOWARN so the admin doesn't have to be alarmed
by the allocation failure.

Link: http://lkml.kernel.org/r/1474165570-44398-1-git-send-email-zhongjiang@huawei.com
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Suggested-by: Hugh Dickins <hughd@google.com>
Suggested-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-28 16:19:01 -07:00
Linus Torvalds
ae6dd8d619 Another round of MTD fixes for v4.8
Davinci NAND: fix a long-standing bug in how we clear/prep 4-bit ECC
 
 OMAP NAND: an error-handling fix that made it into v4.8-rc1 caused
 error-handling cases in other configurations/code-paths; this fixes the fix
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJX7Bv2AAoJEFySrpd9RFgtgn4P/0kmLaMl9/mnU8yV7o09uifR
 pwDW1sd6TCG9htjsILkZ2s8gAHEeXqH/5aN0rRpIng4OFaVGyihCsBOAnV9YojXC
 SlUEKLVqLNQxwp9rfCv4fe6amoBWEzw5BZRkSn5SPYP30JTyxCT1vZIYu+Nw22c2
 H0YBRZKI8HnNapeScy9ccI+6Hvm9hUl33kFKuh/pNtHq16ocQvGeNNic8avW6ALr
 hmmRBR0lmzBgRmJiykEe/xmjRjsFhB2Tb6GoxanTUOgcPIje5J4ly7YP8dfeXXIW
 WDXG8wKaHrP5Igh0gLZlPTmjbdd6FAh/qQQkByPxhjMlu+OLZWwDzTw/F9ZvVmZa
 ekcnH4UzYIKez4DFvZ2c0y5z4S64Qy2ajFJ28/3KNTATyBVtvUriGO2hafDUoGOu
 6P0ZeAKr3iIHddFVFZaHb2lbNAhi+3JBv93LXOxMHyu4xroSIw1Sr2NdkBRs2t5J
 +SRmdAjW612oVO7h6L2jEjAoWZMX2rn/ovNSp7AoQA+72BHTkOpGAv9KVI5cZezE
 +08DhatIQdyP4Y0r5w6aMJLBiia33ofe+4hPiWlbZfyt0HkPuXYXDlxNXXyQKEcy
 Vy2Jq0kB+hMMzYis+1lhasrvMBe67ykCKud8wSSE4L+kM8SMvibh9Hw4aCQVKKkb
 fIpTwk2WzsRyMB+8+0d8
 =coFg
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-20160928' of git://git.infradead.org/linux-mtd

Pull late MTD fixes from Brian Norris:
 "Another round of MTD fixes for v4.8

  My apologies for sending this so late.  I've been fairly absent as a
  maintainer this cycle, but I did queue these up weeks ago.  In the
  meantime, Richard was able to handle some other fixes (thanks!) but
  didn't pick these up.

  On the bright side, these are very simple changes that should carry
  little risk.

  Summary:

   - Davinci NAND: fix a long-standing bug in how we clear/prep 4-bit ECC

   - OMAP NAND: an error-handling fix that made it into v4.8-rc1 caused
     error-handling cases in other configurations/code-paths; this fixes
     the fix"

* tag 'for-linus-20160928' of git://git.infradead.org/linux-mtd:
  mtd: nand: davinci: Reinitialize the HW ECC engine in 4bit hwctl
  mtd: nand: omap2: Don't call dma_release_channel() if dma_request_chan() failed
2016-09-28 12:53:08 -07:00
Mark Fasheh
0a966fa891 MAINTAINERS: Update my e-mail
I will be starting employment at Versity next week and would like to update
my MAINTAINERS e-mail to reflect that change. My versity e-mail is already
activated so I shouldn't get any bounces on the new one. My ability to help
with Ocfs2 kernel maintenance won't change as a result of the new job.

Signed-off-by: Mark Fasheh <mfasheh@versity.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-28 12:52:05 -07:00
David Herrmann
90fd68dcf9 drm/udl: fix line iterator in damage handling
The udl damage handler is supposed to render 'height' lines, but its
iterator has an obvious typo that makes it miss most lines if the
rectangle does not cover 0/0.

Fix the damage handler to correctly render all lines.

This is a fallout from:

    commit e375882406
    Author: Noralf Trønnes <noralf@tronnes.org>
    Date:   Thu Apr 28 17:18:37 2016 +0200

        drm/udl: Use drm_fb_helper deferred_io support

Tested-by: poma <poma@gmail.com>
Cc: stable@vger.kernel.org # 4.7+
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-09-28 13:29:18 +10:00
Dave Airlie
aaee1d1e2d Merge branch 'linux-4.8' of git://github.com/skeggsb/linux into drm-fixes
nouveau: couple of fixes.

* 'linux-4.8' of git://github.com/skeggsb/linux:
  drm/nouveau: Revert "bus: remove cpu_coherent flag"
  drm/nouveau/fifo/nv04: avoid ramht race against cookie insertion
2016-09-28 10:23:50 +10:00
Dave Airlie
b86f9faa34 Merge branch 'drm-fixes-4.8' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
two amd fixes.

* 'drm-fixes-4.8' of git://people.freedesktop.org/~agd5f/linux:
  drm/radeon/si/dpm: add workaround for for Jet parts
  drm/amdgpu: disable CRTCs before teardown
2016-09-28 10:19:35 +10:00
Linus Torvalds
8ab293e3a1 Merge branch 'for-4.8-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fixes from Tejun Heo:
 "Three late fixes for cgroup: Two cpuset ones, one trivial and the
  other pretty obscure, and a cgroup core fix for a bug which impacts
  cgroup v2 namespace users"

* 'for-4.8-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: fix invalid controller enable rejections with cgroup namespace
  cpuset: fix non static symbol warning
  cpuset: handle race between CPU hotplug and cpuset_hotplug_work
2016-09-27 16:43:11 -07:00
Alex Deucher
670bb4fd21 drm/radeon/si/dpm: add workaround for for Jet parts
Add clock quirks for Jet parts.

Reviewed-by: Sonny Jiang <sonny.jiang@amd.com>
Tested-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-09-27 12:22:29 -04:00
Grazvydas Ignotas
a951ed85ab drm/amdgpu: disable CRTCs before teardown
Some code called by drm_crtc_force_disable_all() wants to wait for all
fences, so only do fence teardown after CRTCs are disabled.

Fixes: 84b89bdced ("drm/amdgpu: Turn off CRTCs on driver unload")
Cc: stable@vger.kernel.org # v4.8+
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-09-27 12:22:21 -04:00
Linus Torvalds
08895a8b6b Linux 4.8-rc8 2016-09-25 18:47:13 -07:00
Linus Torvalds
4c04b4b534 Al Viro has been looking at the tracefs code, and has pointed out
some issues. This contains one fix by me and one by Al. I'm sure that
 he'll come up with more but for now I tested these patches and they
 don't appear to have any negative impact on tracing.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJX6FvrAAoJEKKk/i67LK/8EuIH/Arf6vJidYsmbe57WQp8PU3I
 bldem6ePj6zgZ2ZqPlSGCs1J2DcK4Bh3lPVxdx7rRKVWSd/Zoj+i83hvObusR8M7
 Qs1G92bJTvvVO3aPfiN0GvKGdKfGn45L+j0BcBauiTRKqnj3PkhOhIP2/ks0ewSk
 qeq7R3xxo/FDs26AHS69Hm0PIIw7btyhXNX4GB3Il7IIA5/nUknw3C+bjVj86tYX
 R4iElcHEhplgoSjKuLgNIRZGUnEFtsm/fnohYXpHacLTUKNXnTDY230x/OKc1yyB
 1vOfHS/y5s3XSJ1lcgSjYeNc51lK8NiDASaptZSUnOookKSAooUTFELNzpbc0sg=
 =+Fr3
 -----END PGP SIGNATURE-----

Merge tag 'trace-v4.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracefs fixes from Steven Rostedt:
 "Al Viro has been looking at the tracefs code, and has pointed out some
  issues.  This contains one fix by me and one by Al.  I'm sure that
  he'll come up with more but for now I tested these patches and they
  don't appear to have any negative impact on tracing"

* tag 'trace-v4.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  fix memory leaks in tracing_buffers_splice_read()
  tracing: Move mutex to protect against resetting of seq data
2016-09-25 18:40:13 -07:00
Dave Chinner
90b75db649 fault_in_multipages_readable() throws set-but-unused error
When building XFS with -Werror, it now fails with:

  include/linux/pagemap.h: In function 'fault_in_multipages_readable':
  include/linux/pagemap.h:602:16: error: variable 'c' set but not used [-Werror=unused-but-set-variable]
    volatile char c;
                  ^

This is a regression caused by commit e23d4159b1 ("fix
fault_in_multipages_...() on architectures with no-op access_ok()").
Fix it by re-adding the "(void)c" trick taht was previously used to make
the compiler think the variable is used.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-25 18:16:44 -07:00
Lorenzo Stoakes
38e0885465 mm: check VMA flags to avoid invalid PROT_NONE NUMA balancing
The NUMA balancing logic uses an arch-specific PROT_NONE page table flag
defined by pte_protnone() or pmd_protnone() to mark PTEs or huge page
PMDs respectively as requiring balancing upon a subsequent page fault.
User-defined PROT_NONE memory regions which also have this flag set will
not normally invoke the NUMA balancing code as do_page_fault() will send
a segfault to the process before handle_mm_fault() is even called.

However if access_remote_vm() is invoked to access a PROT_NONE region of
memory, handle_mm_fault() is called via faultin_page() and
__get_user_pages() without any access checks being performed, meaning
the NUMA balancing logic is incorrectly invoked on a non-NUMA memory
region.

A simple means of triggering this problem is to access PROT_NONE mmap'd
memory using /proc/self/mem which reliably results in the NUMA handling
functions being invoked when CONFIG_NUMA_BALANCING is set.

This issue was reported in bugzilla (issue 99101) which includes some
simple repro code.

There are BUG_ON() checks in do_numa_page() and do_huge_pmd_numa_page()
added at commit c0e7cad to avoid accidentally provoking strange
behaviour by attempting to apply NUMA balancing to pages that are in
fact PROT_NONE.  The BUG_ON()'s are consistently triggered by the repro.

This patch moves the PROT_NONE check into mm/memory.c rather than
invoking BUG_ON() as faulting in these pages via faultin_page() is a
valid reason for reaching the NUMA check with the PROT_NONE page table
flag set and is therefore not always a bug.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=99101
Reported-by: Trevor Saunders <tbsaunde@tbsaunde.org>
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-25 15:43:42 -07:00
Linus Torvalds
831e45d84a Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Pull MIPS fixes from Ralf Baechle:
 "A round of 4.8 fixes:

  MIPS generic code:
   - Add a missing ".set pop" in an early commit
   - Fix memory regions reaching top of physical
   - MAAR: Fix address alignment
   - vDSO: Fix Malta EVA mapping to vDSO page structs
   - uprobes: fix incorrect uprobe brk handling
   - uprobes: select HAVE_REGS_AND_STACK_ACCESS_API
   - Avoid a BUG warning during PR_SET_FP_MODE prctl
   - SMP: Fix possibility of deadlock when bringing CPUs online
   - R6: Remove compact branch policy Kconfig entries
   - Fix size calc when avoiding IPIs for small icache flushes
   - Fix pre-r6 emulation FPU initialisation
   - Fix delay slot emulation count in debugfs

  ATH79:
   - Fix test for error return of clk_register_fixed_factor.

  Octeon:
   - Fix kernel header to work for VDSO build.
   - Fix initialization of platform device probing.

  paravirt:
   - Fix undefined reference to smp_bootstrap"

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
  MIPS: Fix delay slot emulation count in debugfs
  MIPS: SMP: Fix possibility of deadlock when bringing CPUs online
  MIPS: Fix pre-r6 emulation FPU initialisation
  MIPS: vDSO: Fix Malta EVA mapping to vDSO page structs
  MIPS: Select HAVE_REGS_AND_STACK_ACCESS_API
  MIPS: Octeon: Fix platform bus probing
  MIPS: Octeon: mangle-port: fix build failure with VDSO code
  MIPS: Avoid a BUG warning during prctl(PR_SET_FP_MODE, ...)
  MIPS: c-r4k: Fix size calc when avoiding IPIs for small icache flushes
  MIPS: Add a missing ".set pop" in an early commit
  MIPS: paravirt: Fix undefined reference to smp_bootstrap
  MIPS: Remove compact branch policy Kconfig entries
  MIPS: MAAR: Fix address alignment
  MIPS: Fix memory regions reaching top of physical
  MIPS: uprobes: fix incorrect uprobe brk handling
  MIPS: ath79: Fix test for error return of clk_register_fixed_factor().
2016-09-25 13:59:52 -07:00
Linus Torvalds
751b9a5d16 powerpc fixes for 4.8 #7
- powernv/pci: Fix m64 checks for SR-IOV and window alignment from Russell Currey
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJX50CvAAoJEFHr6jzI4aWAqJAP/0/0D8YGwOuIoYD2GmfoasKR
 TFbbuhX3xnfdiRG6w/sFBI3oh7icCw7hC+Qj1lNu9D3L/UkxOTBny+W07KvWzX44
 Yu74nEHgq3mVrRAU4McztbKIUBK2zagGwwCcGZXZl/uQI1ylvmmpcH3xClQzF+oA
 xKk8eB1OW2Ay6+y+FkSuyBHHSfww6QCk7ERPqaStCW9Uy+dDBjIwStLQuOpAhN/o
 Z9K+JwpPJ8qgw1Pe9pvrD5MjcM0hR+tUZm6LklZCCk89feqlwcrz9cpOrmTdGuF+
 n1iacpDaFf6IOlhI+6ImrT15llTgSk/nu9GNIRFDwOjVCuGy5aDQBtWuRFiVNggp
 vkZWFSl594Jn5H9/s6MpMXygSl36NMKgM/ZKvUsEAe6mF0Kb9pZRB7b/aV+ajkCQ
 rkQCe0KKSF6+D3wu3SmMe0NTc3/GkgxZN0lTnqUaB5PSRqwvVwurXugnAKr7arhj
 JSu9/QSeOxNI5ytDF1Nf9/RN0DT+L1w0vun083DupyJkG1hrjzm9kI0lACQTr/QX
 TxAWXGjiTsUOeM4pfNzqaJE4fNUc0TIc41jgWMx9qXzbKjhijgKEPtmyDMz93GVY
 hFXyRAMsWUOsQGP5tiLFYG0PkNsmCDIwca+yg47EicBQGTpEsGLYUBRvIILYNBKI
 ULl0yMLZWekl1rzthDdB
 =35xQ
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.8-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull one more powerpc fix from Michael Ellerman:
 "powernv/pci: Fix m64 checks for SR-IOV and window alignment from
  Russell Currey"

* tag 'powerpc-4.8-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/powernv/pci: Fix m64 checks for SR-IOV and window alignment
2016-09-25 13:52:59 -07:00
Linus Torvalds
8d2c0d36d6 radix tree: fix sibling entry handling in radix_tree_descend()
The fixes to the radix tree test suite show that the multi-order case is
broken.  The basic reason is that the radix tree code uses tagged
pointers with the "internal" bit in the low bits, and calculating the
pointer indices was supposed to mask off those bits.  But gcc will
notice that we then use the index to re-create the pointer, and will
avoid doing the arithmetic and use the tagged pointer directly.

This cleans the code up, using the existing is_sibling_entry() helper to
validate the sibling pointer range (instead of open-coding it), and
using entry_to_node() to mask off the low tag bit from the pointer.  And
once you do that, you might as well just use the now cleaned-up pointer
directly.

[ Side note: the multi-order code isn't actually ever used in the kernel
  right now, and the only reason I didn't just delete all that code is
  that Kirill Shutemov piped up and said:

    "Well, my ext4-with-huge-pages patchset[1] uses multi-order entries.
     It also converts shmem-with-huge-pages and hugetlb to them.

     I'm okay with converting it to other mechanism, but I need
     something.  (I looked into Konstantin's RFC patchset[2].  It looks
     okay, but I don't feel myself qualified to review it as I don't
     know much about radix-tree internals.)"

  [1] http://lkml.kernel.org/r/20160915115523.29737-1-kirill.shutemov@linux.intel.com
  [2] http://lkml.kernel.org/r/147230727479.9957.1087787722571077339.stgit@zurg ]

Reported-by: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Cedric Blancher <cedric.blancher@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-25 13:32:46 -07:00
Matthew Wilcox
62fd5258eb radix tree test suite: Test radix_tree_replace_slot() for multiorder entries
When we replace a multiorder entry, check that all indices reflect the
new value.

Also, compile the test suite with -O2, which shows other problems with
the code due to some dodgy pointer operations in the radix tree code.

Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-25 11:49:16 -07:00
Al Viro
1ae2293dd6 fix memory leaks in tracing_buffers_splice_read()
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-09-25 13:30:13 -04:00
Steven Rostedt (Red Hat)
1245800c0f tracing: Move mutex to protect against resetting of seq data
The iter->seq can be reset outside the protection of the mutex. So can
reading of user data. Move the mutex up to the beginning of the function.

Fixes: d7350c3f45 ("tracing/core: make the read callbacks reentrants")
Cc: stable@vger.kernel.org # 2.6.30+
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-09-25 10:27:08 -04:00
Paul Burton
116e7111c8 MIPS: Fix delay slot emulation count in debugfs
Commit 432c6bacbd ("MIPS: Use per-mm page to execute branch delay slot
instructions") accidentally removed use of the MIPS_FPU_EMU_INC_STATS
macro from do_dsemulret, leading to the ds_emul file in debugfs always
returning zero even though we perform delay slot emulations.

Fix this by re-adding the use of the MIPS_FPU_EMU_INC_STATS macro.

Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Fixes: 432c6bacbd ("MIPS: Use per-mm page to execute branch delay slot instructions")
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14301/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-09-25 01:59:16 +02:00
Matt Redfearn
8f46cca1e6 MIPS: SMP: Fix possibility of deadlock when bringing CPUs online
This patch fixes the possibility of a deadlock when bringing up
secondary CPUs.
The deadlock occurs because the set_cpu_online() is called before
synchronise_count_slave(). This can cause a deadlock if the boot CPU,
having scheduled another thread, attempts to send an IPI to the
secondary CPU, which it sees has been marked online. The secondary is
blocked in synchronise_count_slave() waiting for the boot CPU to enter
synchronise_count_master(), but the boot cpu is blocked in
smp_call_function_many() waiting for the secondary to respond to it's
IPI request.

Fix this by marking the CPU online in cpu_callin_map and synchronising
counters before declaring the CPU online and calculating the maps for
IPIs.

Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Reported-by: Justin Chen <justinpopo6@gmail.com>
Tested-by: Justin Chen <justinpopo6@gmail.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: stable@vger.kernel.org # v4.1+
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/14302/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
2016-09-25 01:43:52 +02:00
Linus Torvalds
9c0e28a7be Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Thomas Gleixner:
 "Three fixlets for perf:

   - add a missing NULL pointer check in the intel BTS driver

   - make BTS an exclusive PMU because BTS can only handle one event at
     a time

   - ensure that exclusive events are limited to one PMU so that several
     exclusive events can be scheduled on different PMU instances"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/core: Limit matching exclusive events to one PMU
  perf/x86/intel/bts: Make it an exclusive PMU
  perf/x86/intel/bts: Make sure debug store is valid
2016-09-24 12:44:28 -07:00
Linus Torvalds
2507c85662 Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixes from Thomas Gleixner:
 "Two smallish fixes:

   - use the proper asm constraint in the Super-H atomic_fetch_ops

   - a trivial typo fix in the Kconfig help text"

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/hung_task: Fix typo in CONFIG_DETECT_HUNG_TASK help text
  locking/atomic, arch/sh: Fix ATOMIC_FETCH_OP()
2016-09-24 12:41:19 -07:00
Linus Torvalds
709b8f67d7 Merge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI fixes from Thomas Gleixner:
 "Two fixes for EFI/PAT:

   - a 32bit overflow bug in the PAT code which was unearthed by the
     large EFI mappings

   - prevent a boot hang on large systems when EFI mixed mode is enabled
     but not used"

* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/efi: Only map RAM into EFI page tables if in mixed-mode
  x86/mm/pat: Prevent hang during boot when mapping pages
2016-09-24 12:35:26 -07:00
Linus Torvalds
4b8b0ff60f Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Thomas Gleixner:
 "Three fixes for irq core and irq chip drivers:

   - Do not set the irq type if type is NONE.  Fixes a boot regression
     on various SoCs

   - Use the proper cpu for setting up the GIC target list.  Discovered
     by the cpumask debugging code.

   - A rather large fix for the MIPS-GIC so per cpu local interrupts
     work again.  This was discovered late because the code falls back
     to slower timers which use normal device interrupts"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/mips-gic: Fix local interrupts
  irqchip/gicv3: Silence noisy DEBUG_PER_CPU_MAPS warning
  genirq: Skip chained interrupt trigger setup if type is IRQ_TYPE_NONE
2016-09-24 12:30:12 -07:00
Dan Williams
595c73071e libnvdimm, region: fix flush hint table thinko
The definition of the flush hint table as:

	void __iomem *flush_wpq[0][0];

...passed the unit test, but is broken as flush_wpq[0][1] and
flush_wpq[1][0] refer to the same entry.  Fix this to use a helper that
calculates a slot in the table based on the geometry of flush hints in
the region.  This is important to get right since virtualization
solutions use this mechanism to trigger hypervisor flushes to platform
persistence.

Reported-by: Dave Jiang <dave.jiang@intel.com>
Tested-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-09-24 11:45:38 -07:00
Linus Torvalds
0f26574178 Merge branch 'hughd-fixes' (patches from Hugh Dickins)
Merge VM fixes from High Dickins:
 "I get the impression that Andrew is away or busy at the moment, so I'm
  going to send you three independent uncontroversial little mm fixes
  directly - though none is strictly a 4.8 regression fix.

   - shmem: fix tmpfs to handle the huge= option properly from Toshi
     Kani is a one-liner to fix a major embarrassment in 4.8's hugepages
     on tmpfs feature: although Hillf pointed it out in June, somehow
     both Kirill and I repeatedly dropped the ball on this one.  You
     might wonder if the feature got tested at all with that bug in:
     yes, it did, but for wider testing coverage, Kirill and I had each
     relied too much on an override which bypasses that condition.

   - huge tmpfs: fix Committed_AS leak just a run-of-the-mill accounting
     fix in the same feature.

   - mm: delete unnecessary and unsafe init_tlb_ubc() is an unrelated
     fix to 4.3's TLB flush batching in reclaim: the bug would be rare,
     and none of us will be shamed if this one misses 4.8; but it got
     such a quick ack from Mel today that I'm inclined to offer it along
     with the first two"

* emailed patches from Hugh Dickins <hughd@google.com>:
  mm: delete unnecessary and unsafe init_tlb_ubc()
  huge tmpfs: fix Committed_AS leak
  shmem: fix tmpfs to handle the huge= option properly
2016-09-24 11:31:45 -07:00
Hugh Dickins
b385d21f27 mm: delete unnecessary and unsafe init_tlb_ubc()
init_tlb_ubc() looked unnecessary to me: tlb_ubc is statically
initialized with zeroes in the init_task, and copied from parent to
child while it is quiescent in arch_dup_task_struct(); so I went to
delete it.

But inserted temporary debug WARN_ONs in place of init_tlb_ubc() to
check that it was always empty at that point, and found them firing:
because memcg reclaim can recurse into global reclaim (when allocating
biosets for swapout in my case), and arrive back at the init_tlb_ubc()
in shrink_node_memcg().

Resetting tlb_ubc.flush_required at that point is wrong: if the upper
level needs a deferred TLB flush, but the lower level turns out not to,
we miss a TLB flush.  But fortunately, that's the only part of the
protocol that does not nest: with the initialization removed, cpumask
collects bits from upper and lower levels, and flushes TLB when needed.

Fixes: 72b252aed5 ("mm: send one IPI per CPU to TLB flush all entries after unmapping pages")
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: stable@vger.kernel.org # 4.3+
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-24 11:20:01 -07:00
Hugh Dickins
71664665c3 huge tmpfs: fix Committed_AS leak
Under swapping load on huge tmpfs, /proc/meminfo's Committed_AS grows
bigger and bigger: just a cosmetic issue for most users, but disabling
for those who run without overcommit (/proc/sys/vm/overcommit_memory 2).

shmem_uncharge() was forgetting to unaccount __vm_enough_memory's
charge, and shmem_charge() was forgetting it on the filesystem-full
error path.

Fixes: 800d8c63b2 ("shmem: add huge pages support")
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-24 11:20:01 -07:00
Toshi Kani
3089bf614c shmem: fix tmpfs to handle the huge= option properly
shmem_get_unmapped_area() checks SHMEM_SB(sb)->huge incorrectly, which
leads to a reversed effect of "huge=" mount option.

Fix the check in shmem_get_unmapped_area().

Note, the default value of SHMEM_SB(sb)->huge remains as
SHMEM_HUGE_NEVER.  User will need to specify "huge=" option to enable
huge page mappings.

Reported-by: Hillf Danton <hillf.zj@alibaba-inc.com>
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-24 11:20:01 -07:00
Linus Torvalds
bd5dbcb4be Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
 "Three driver bugfixes: fixing uninitialized memory pointers (eg20t),
  pm/clock imbalance (qup), and a wrongly set cached variable (pc954x)"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: qup: skip qup_i2c_suspend if the device is already runtime suspended
  i2c: mux: pca954x: retry updating the mux selection on failure
  i2c-eg20t: fix race between i2c init and interrupt enable
2016-09-23 16:44:12 -07:00
Linus Torvalds
d0c1d15f5e Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input updates from Dmitry Torokhov:
 "Just a fix up for the firmware handling to the Silead driver (which is
  a new driver in this release)"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: silead_gsl1680 - use "silead/" prefix for firmware loading
  Input: silead_gsl1680 - document firmware-name, fix implementation
2016-09-23 16:34:24 -07:00
Linus Torvalds
4ee6986625 Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
 "Three fixes, two regressions and one that poses a problem in blk-mq
  with the new nvmef code"

* 'for-linus' of git://git.kernel.dk/linux-block:
  blk-mq: skip unmapped queues in blk_mq_alloc_request_hctx
  nvme-rdma: only clear queue flags after successful connect
  blk-throttle: Extend slice if throttle group is not empty
2016-09-23 16:24:36 -07:00
Tejun Heo
9157056da8 cgroup: fix invalid controller enable rejections with cgroup namespace
On the v2 hierarchy, "cgroup.subtree_control" rejects controller
enables if the cgroup has processes in it.  The enforcement of this
logic assumes that the cgroup wouldn't have any css_sets associated
with it if there are no tasks in the cgroup, which is no longer true
since a79a908fd2 ("cgroup: introduce cgroup namespaces").

When a cgroup namespace is created, it pins the css_set of the
creating task to use it as the root css_set of the namespace.  This
extra reference stays as long as the namespace is around and makes
"cgroup.subtree_control" think that the namespace root cgroup is not
empty even when it is and thus reject controller enables.

Fix it by making cgroup_subtree_control() walk and test emptiness of
each css_set instead of testing whether the list_head is empty.

While at it, update the comment of cgroup_task_count() to indicate
that the returned value may be higher than the number of tasks, which
has always been true due to temporary references and doesn't break
anything.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Evgeny Vereshchagin <evvers@ya.ru>
Cc: Serge E. Hallyn <serge.hallyn@ubuntu.com>
Cc: Aditya Kali <adityakali@google.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: stable@vger.kernel.org # v4.6+
Fixes: a79a908fd2 ("cgroup: introduce cgroup namespaces")
Link: https://github.com/systemd/systemd/pull/3589#issuecomment-249089541
2016-09-23 16:55:49 -04:00
Linus Torvalds
b22734a550 Merge branch 'for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason:
 "Josef fixed a problem when quotas are enabled with his latest ENOSPC
  rework, and Jeff added more checks into the subvol ioctls to avoid
  tripping up lookup_one_len"

* 'for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  btrfs: ensure that file descriptor used with subvol ioctls is a dir
  Btrfs: handle quota reserve failure properly
2016-09-23 13:39:37 -07:00
Linus Torvalds
78bbf153fa regmap: Fix for v4.8
A fix for an issue with double locking that was introduced earlier this
 release.  I'd missed in review that we were already in a locked region
 when trying to drop part of the cache.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJX5XcsAAoJECTWi3JdVIfQSfEH/iGoQHxcIyPeTBOk8XUIkmkJ
 XgaQmZfj2RNKPbCug8KDhDks9qDL/w0/2DG00Trv6pVSFHyBKfX6AQEh0GgmYBJF
 trsL1ai/F1dlUbrQaOGMkfUPSF3DvYRsM4/kJOxSruVYsK5PUk2zXEw9kPmquB57
 K7IRru5KNDyH5p08h6jbm6po+0QhyaFnCi4iX5kqrvqPFWeP15KHfd+vuRNlWYt0
 rf069yqHfZvPXDl6Alw+wdYskxDHT+hvmlWjIGVm1JwAmsx9V2QCDwrhBKG9aZW/
 z+mdSKBSJ+RBIp7ucI8sSBm3JsxeSz4+teTe2DRZeGLJmnrZ6756xSNR0pphoAM=
 =yYvs
 -----END PGP SIGNATURE-----

Merge tag 'regmap-fix-v4.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap

Pull regmap fix from Mark Brown:
 "A fix for an issue with double locking that was introduced earlier
  this release.  I'd missed in review that we were already in a locked
  region when trying to drop part of the cache"

* tag 'regmap-fix-v4.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
  regmap: fix deadlock on _regmap_raw_write() error path
2016-09-23 11:50:49 -07:00
Linus Torvalds
2ddfdd4289 Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
 "This fixes a regression in RSA that was only half-fixed earlier in the
  cycle.  It also fixes an older regression that breaks the keyring
  subsystem"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: rsa-pkcs1pad - Handle leading zero for decryption
  KEYS: Fix skcipher IV clobbering
2016-09-23 11:28:04 -07:00
Linus Torvalds
7d188bad66 - Fix secondary CPU to NUMA node assignment
- Fix kgdb breakpoint insertion in read-only text sections (when
   CONFIG_DEBUG_RODATA or CONFIG_DEBUG_SET_MODULE_RONX are enabled)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJX5SUUAAoJEGvWsS0AyF7xJhsQAIBvpLUNntihhcKTLbBkpQ96
 ni/cludUBgQ8fil3leiR9zmfDDI1meohgETct2Z7qMHaTCx5Ej72GG5cdQl71quy
 a2Ks/k7BlwsMLMzHMQLreZhF/1z0XFzOwjbXIMAvxTLUNHCZRvtuZ3mES5+JTtYc
 u/obI3JhuKJjkV74CFSfSd6NoRP+Xgmzqmj/afN2TKsZwczqJw33BeNABWxi41SR
 o3OcqgJppmQbCyH7KeKsYrG3zghpzraDj9e/qwWwVDiw00bWar+rE2sxmQM5pHuT
 RsCmRXIVy/nbVduKl9t4UamkvMsS+dL048bRD2WIOu8hUkZKgpT/c6KcBgspYGXm
 ZFBSJZITamU3r9txNMn/WNVp/S2i3KaojrxzrTPG/6wencT5ZfCOfBqWB44I6Xf4
 PG8mfmHzEJwpHUMwz5mSnkigNWe5y3CJhRNUO3GQp+4ULAVo3JbD1oyF57WC2PJh
 30MOPfl+jxv+uMT83T1zDsIZ0JdufvuaHJ7d9AP1wVcK5PVGW6icuO5Krd3fk7CQ
 VJ1yflwya9TVEqRUQDDHwnKlCHohILJomx9myV578/lmEOs8FgA5Dbx+/grgS6/P
 +9YKB7hv8uK26uGO/bn41YCqhgCoJX2YqEi4qECRLwDjNtNyYD3XBLT25jAbiopi
 Gfew0xtNcFagew1BHYzD
 =NCBp
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:
 "A couple of last-minute arm64 fixes for 4.8:

   - Fix secondary CPU to NUMA node assignment

   - Fix kgdb breakpoint insertion in read-only text sections (when
     CONFIG_DEBUG_RODATA or CONFIG_DEBUG_SET_MODULE_RONX are enabled)"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: kgdb: handle read-only text / modules
  arm64: Call numa_store_cpu_info() earlier.
2016-09-23 11:24:42 -07:00
Linus Torvalds
d9d1ffe00b NAND Fixes for 4.8-rc8:
- Fix a wrong OOB layout definition in the mxc driver
 - Fix incorrect ECC handling in the mtk driver
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJX5RRDAAoJEEtJtSqsAOnWmlUP/0koh9dokeTqQZ8viUlLjjQ2
 Xx+B5RiABc8yocGjCvD8Imi1nxLNpeNL5SoI0Z56q9bWgpLkqaDZHz7o3PCtMe7c
 VVAAjcgyR+4NimwZTwr1ci88rR2jUq55jHDy26YQsAcAVHZ87GnTY54ZkHufF591
 cDylCZzVsjcr5bPnJwVIpG366osYRmL8X1tMX5jwQQ19712sXr0b18VEF2+cnuMa
 dX50x5bexh5BfgFO6+eqNLtPO2gOt1YkLXvEYubv1xFJ3cvKwIxyRxXKnYEFtA4m
 hdj1hCtczF/oV5PD4Ym48/8M+kTqWSlHPR56XH4bIC30jVJJfuMpNHV9LIbLnVyR
 9+h3inoMCDh5zSsh3/Jb9Xao0IHLqWyCUK9gtsKPtuLGBgEMBPcFcoLOBdg1zQYf
 COQNJ7hWnqnymsQCcAkAFoxR3kciDeAM+Fre4iJF5+tGmJMln/V6VN3qNKYaS2Gf
 YCDEwekL93SJG9aSsu+w56qWoo6h9GmfwW351WPXNvOzT2dE/JaoWB/bpVlkYBfT
 nkGItQoZ2oObGf3jKGhoJR+IGJreokfPFtgCOGDoGs+U0dWPciaHE1ZuuVg+XX4q
 OyApXI3c1q9nDEFhePojm4fPtejcT4gvQnEWTy3SDPQdm+mBsgS/3oZvPjd83RYq
 0Vkschc+f0GgYFNUc5Wp
 =/jmi
 -----END PGP SIGNATURE-----

Merge tag 'tags/nand-fixes-for-4.8-rc8' of git://git.infradead.org/linux-ubifs

Pull MTD fixes from Richard Weinberger:
 "NAND Fixes for 4.8-rc8.

  This contains fixes for bugs which got introduced in -rc1.  Usually
  Brian takes NAND patches from Boris, but since Brian is very busy
  these days with other stuff and Boris is not yet member of the
  kernel.org web of trust I stepped in.

  Boris will be in Berlin at ELCE, I'll sign his key and hopefully other
  Kernel developers too such that he can issue his own pull requests
  soon.

  Summary:

   - Fix a wrong OOB layout definition in the mxc driver
   - Fix incorrect ECC handling in the mtk driver"

* tag 'tags/nand-fixes-for-4.8-rc8' of git://git.infradead.org/linux-ubifs:
  mtd: nand: mxc: fix obiwan error in mxc_nand_v[12]_ooblayout_free() functions
  mtd: nand: fix chances to create incomplete ECC data when writing
  mtd: nand: fix generating over-boundary ECC data when writing
2016-09-23 11:15:00 -07:00
Linus Torvalds
e7c5412f77 MMC host:
- dw_mmc: fix the spamming log message
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJX5QKbAAoJEP4mhCVzWIwpspsQAL6kvK/nIlIaT+2pFLwJW7wj
 1fkKsJMqQSkMU9tvH6niCrhCXKIXSdzMVQdP4OW5WpIHlXRIzAoQxdHTziMT90Mg
 Pt9kx0/VhWt7jumrSY/jnNFQk5f0iS8FXhyxR5XNKqvu+Cyc4TkVs5kxVq9BHsR8
 l8Z63/kjdSA23lhhrXEDkgpN5rOB4ScIGllh7aYNC8vdkjSWWUyqh2W/0pxkXugT
 Cru11cFPuXiKJBvJFJ2brlfXJfhZqHnHPhdfA79eTnnlOoBWRk+eUlPy+ziS5jZR
 TetLnJBeBYCa0TZsB+VLSNjhDld+nJBp1LYr25t0qtlUx9JrEYhg4oNUyCg/ga/F
 Oh/0xD0/zIqOFKYLerJn5kS3hsWX2keVnKIWrwHg85vOzbqG2JARx0qlcnWO9bud
 DCBg+cvW63Zy148vO9kfLapNfGbO/tYns5JnNCF2/Ige1f/HNyP3iDzTkwIB9zP/
 m7t7X/yAZfdyCM3Hioqnq2SyNAmwj2VTbcDPygCvKw582wecTp6n5ITAwJW/+68h
 RQaM/sCFf1/lFBSXSWu5+CjT4RVNrTcE2+gp3FkfmPgmuMqOnQ4+pg9iQExJIV0s
 px4RSAhHOhvob260yG7FwvDMFZHWeysTr0dV/BXIQZkP5Nzn1qn3afHilbX8xHyb
 CylB4QLlOUGJEfZUwYDP
 =Ppmz
 -----END PGP SIGNATURE-----

Merge tag 'mmc-v4.8-rc7' of git://git.linaro.org/people/ulf.hansson/mmc

Pull MMC fix from Ulf Hansson:
 "MMC host:

   - dw_mmc: fix the spamming log message"

* tag 'mmc-v4.8-rc7' of git://git.linaro.org/people/ulf.hansson/mmc:
  mmc: dw_mmc: fix the spamming log message
2016-09-23 11:10:53 -07:00
Linus Torvalds
e47f2e50ea One more trivial fix for the binary attribute code from Phil Turnbull.
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJX5KV7AAoJEA+eU2VSBFGD6hEQAINlrv/sIX2mQcxaETodsvPq
 kKt6ESgogl0ZTq3lpNhaOwhiozrvgCPJibQZarq4Qr2q2Sz+AkQzYSLCcVO+CmJB
 94w4jy2m+M+diEFKpjexJpD+LfEoJPjhfrjs9wI6CKUL2F0FS+LUUOU44gCzSKdh
 wupkVgPvC3csUZG/9QwTRxZH9Zh/DpsN2JC7MkM3YSc5ELw+YaFWWiEMNjyNMll2
 ex2l2+fhfbdHW8WGl5rCjaCfjagi1h2VMtOkbwr4LWX89IMVgAdKbtkquAcme41t
 o6oHAqN+8EZwxaWdKTR247u5dg5p7W2MeOQyJmlFzUa52fv8APrKONlUfmco/aYC
 fBvt4s0Hsg/i57dpl+ZdFIfEXzpDgQZpWCEoUvGzfNayghUBk7vF+CcTl+lzcnqA
 qEiKu9NLMpVmMb1XWCAJzWDTVhY/JJrfx/ndsHiyWlXuiI+yDvQvIIN3fVbkzzHR
 4Q52n8zVa2MaVcACb5vf0OKVaETNsemD3oMN5irGcA/RMylxnO7iKghemDYDXMfZ
 Cnm5pyIm6ZF2a9UapetKEfQawdo7UkS1wXkKMPwLhB6aoK4gbk5pxK0oUxmiQyyp
 T5o9nZ3Vmj4XoZwaaq2mlIOlj/USSIa8DChXMb43NH8agiMwFzIm8nbAHhr9TEtd
 JpaLYUe+BvqcZvTwBRxS
 =+uba
 -----END PGP SIGNATURE-----

Merge tag 'configfs-for-4.8-2' of git://git.infradead.org/users/hch/configfs

Pull configfs fix from Christoph Hellwig:
 "One more trivial fix for the binary attribute code from Phil Turnbull"

* tag 'configfs-for-4.8-2' of git://git.infradead.org/users/hch/configfs:
  configfs: Return -EFBIG from configfs_write_bin_file.
2016-09-23 09:45:15 -07:00
Christoph Hellwig
c8712c6a67 blk-mq: skip unmapped queues in blk_mq_alloc_request_hctx
This provides the caller a feedback that a given hctx is not mapped and thus
no command can be sent on it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-09-23 10:25:48 -06:00