Commit Graph

984243 Commits

Author SHA1 Message Date
Linus Torvalds
860b45dae9 io_uring-5.11-2021-02-05
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmAd0KoQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpkPvD/kBm/uxstfomiryxDUeALadUZTIxkIsP8Zx
 6IijgJXvynDJutz8gjA7aynK8j4YyrktiuS4C3ctxU+cyt3/M2ZFnnQpx88gvfK5
 5OANmB1iMwUyu4GkZHsmWPo4aqv6mvE+QKKYMu8m++6/ZA4/458jx0AsjP1XSKth
 VYeRLElPTH+JcoxSgn9DwEJiGGViN26rpiy3NG2fNt/dXNFgwD8BevjXAYdQNscs
 Xrox2p2TLoMnCVWoDXg3XmMwZphigibjyWWgEEZp3LGrHU49HNUIL7GXv5No1nAO
 okxmVL3zropEgCXqxeJ5eGG+Ve1JCuvMxgl34dVN3qoN6AhfU7BXbGFeoKYSQIIW
 pgF2Qv0+KGEnRD7HOSLdygnl2gLP9ID+Xx214rKlnRE3bFkg5lwZxg4Pfos1Sn5N
 PGLqfvhZ8/Qb5BObW4qMobz3yG5ozrHJ8+EeccgNJuOGQtw3yHxp5NAotzTp97mA
 5RCw6f9HVlTcgRnDOdYskeUfb4N1i4Ps1/0RCHGWlxOpFsVkClWeDp1DTA+/gW5l
 +7vREo3vpDfNW68PgWwp5y2RyfocOgRS6pRX0gDhtsLx6MJl1YKGbU0qbamdjofm
 bOygR+Ce4rYiG+kFHkkJcWG9rjcomy2BXCHXoylx65FimYmrFuQzdxpRO2MrWpzJ
 4zQcegXM1A==
 =sGoQ
 -----END PGP SIGNATURE-----

Merge tag 'io_uring-5.11-2021-02-05' of git://git.kernel.dk/linux-block

Pull io_uring fixes from Jens Axboe:
 "Two small fixes that should go into 5.11:

   - task_work resource drop fix (Pavel)

   - identity COW fix (Xiaoguang)"

* tag 'io_uring-5.11-2021-02-05' of git://git.kernel.dk/linux-block:
  io_uring: drop mm/files between task_work_submit
  io_uring: don't modify identity's files uncess identity is cowed
2021-02-06 14:37:24 -08:00
Sukadev Bhattiprolu
ef66a1eace ibmvnic: Clear failover_pending if unable to schedule
Normally we clear the failover_pending flag when processing the reset.
But if we are unable to schedule a failover reset we must clear the
flag ourselves. We could fail to schedule the reset if we are in PROBING
state (eg: when booting via kexec) or because we could not allocate memory.

Thanks to Cris Forno for helping isolate the problem and for testing.

Fixes: 1d85049374 ("powerpc/vnic: Extend "failover pending" window")
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
Tested-by: Cristobal Forno <cforno12@linux.ibm.com>
Link: https://lore.kernel.org/r/20210203050802.680772-1-sukadev@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-06 10:36:22 -08:00
Jakub Kicinski
2da4b24b1d wireless-drivers fixes for v5.11
Third, and most likely the last, set of fixes for v5.11. Two very
 small fixes.
 
 ath9k
 
 * fix build regression related to LEDS_CLASS
 
 mt76
 
 * fix a memory leak
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJgHXLXAAoJEG4XJFUm622bSswIAKKUL+5rtTO5REcOgQLfjnDf
 FacTFREGoQTmzAOyuXNpM+ULEqsQ4keilGmCWqteuIuVm4Tlpqkyo6z/cyHU6RBO
 FR1Laayu96Ir7Wcig7S0UL8vz01oZJxcOo1Ijm+w+TVfBCbDdH9bk9NlP7e7sH2j
 7wfCo9OMMcnL52QpN1+lI2xC+IF9DTyKM8FjTuymQBFD/45b7mxidIpoZtpMd+ES
 /qQJj92j6ysa44rZvuY5aN5XtmQd0rYZhMu9E7RMm2jo6go4o6FvtIwqcz3Fqsxl
 hjOzIyBZpQHH9dTaKGKcaoPfjXgePovuk4Gh2KOlCgYkxeWtdpyoOqcPOu1VlbU=
 =/Zue
 -----END PGP SIGNATURE-----

Merge tag 'wireless-drivers-2021-02-05' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers

Kalle Valo says:

====================
wireless-drivers fixes for v5.11

Third, and most likely the last, set of fixes for v5.11. Two very
small fixes.

ath9k
 * fix build regression related to LEDS_CLASS

mt76
 * fix a memory leak

* tag 'wireless-drivers-2021-02-05' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers:
  mt76: dma: fix a possible memory leak in mt76_add_fragment()
  ath9k: fix build error with LEDS_CLASS=m
====================

Link: https://lore.kernel.org/r/20210205163434.14D94C433ED@smtp.codeaurora.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-06 09:27:20 -08:00
Borislav Petkov
816ef8d7a2 x86/efi: Remove EFI PGD build time checks
With CONFIG_X86_5LEVEL, CONFIG_UBSAN and CONFIG_UBSAN_UNSIGNED_OVERFLOW
enabled, clang fails the build with

  x86_64-linux-ld: arch/x86/platform/efi/efi_64.o: in function `efi_sync_low_kernel_mappings':
  efi_64.c:(.text+0x22c): undefined reference to `__compiletime_assert_354'

which happens due to -fsanitize=unsigned-integer-overflow being enabled:

  -fsanitize=unsigned-integer-overflow: Unsigned integer overflow, where
  the result of an unsigned integer computation cannot be represented
  in its type. Unlike signed integer overflow, this is not undefined
  behavior, but it is often unintentional. This sanitizer does not check
  for lossy implicit conversions performed before such a computation
  (see -fsanitize=implicit-conversion).

and that fires when the (intentional) EFI_VA_START/END defines overflow
an unsigned long, leading to the assertion expressions not getting
optimized away (on GCC they do)...

However, those checks are superfluous: the runtime services mapping
code already makes sure the ranges don't overshoot EFI_VA_END as the
EFI mapping range is hardcoded. On each runtime services call, it is
switched to the EFI-specific PGD and even if mappings manage to escape
that last PGD, this won't remain unnoticed for long.

So rip them out.

See https://github.com/ClangBuiltLinux/linux/issues/256 for more info.

Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Link: http://lkml.kernel.org/r/20210107223424.4135538-1-arnd@kernel.org
2021-02-06 13:54:14 +01:00
Aneesh Kumar K.V
8c511eff18 powerpc/kuap: Allow kernel thread to access userspace after kthread_use_mm
This fix the bad fault reported by KUAP when io_wqe_worker access userspace.

 Bug: Read fault blocked by KUAP!
 WARNING: CPU: 1 PID: 101841 at arch/powerpc/mm/fault.c:229 __do_page_fault+0x6b4/0xcd0
 NIP [c00000000009e7e4] __do_page_fault+0x6b4/0xcd0
 LR [c00000000009e7e0] __do_page_fault+0x6b0/0xcd0
..........
 Call Trace:
 [c000000016367330] [c00000000009e7e0] __do_page_fault+0x6b0/0xcd0 (unreliable)
 [c0000000163673e0] [c00000000009ee3c] do_page_fault+0x3c/0x120
 [c000000016367430] [c00000000000c848] handle_page_fault+0x10/0x2c
 --- interrupt: 300 at iov_iter_fault_in_readable+0x148/0x6f0
..........
 NIP [c0000000008e8228] iov_iter_fault_in_readable+0x148/0x6f0
 LR [c0000000008e834c] iov_iter_fault_in_readable+0x26c/0x6f0
 interrupt: 300
 [c0000000163677e0] [c0000000007154a0] iomap_write_actor+0xc0/0x280
 [c000000016367880] [c00000000070fc94] iomap_apply+0x1c4/0x780
 [c000000016367990] [c000000000710330] iomap_file_buffered_write+0xa0/0x120
 [c0000000163679e0] [c00800000040791c] xfs_file_buffered_aio_write+0x314/0x5e0 [xfs]
 [c000000016367a90] [c0000000006d74bc] io_write+0x10c/0x460
 [c000000016367bb0] [c0000000006d80e4] io_issue_sqe+0x8d4/0x1200
 [c000000016367c70] [c0000000006d8ad0] io_wq_submit_work+0xc0/0x250
 [c000000016367cb0] [c0000000006e2578] io_worker_handle_work+0x498/0x800
 [c000000016367d40] [c0000000006e2cdc] io_wqe_worker+0x3fc/0x4f0
 [c000000016367da0] [c0000000001cb0a4] kthread+0x1c4/0x1d0
 [c000000016367e10] [c00000000000dbf0] ret_from_kernel_thread+0x5c/0x6c

The kernel consider thread AMR value for kernel thread to be
AMR_KUAP_BLOCKED. Hence access to userspace is denied. This
of course not correct and we should allow userspace access after
kthread_use_mm(). To be precise, kthread_use_mm() should inherit the
AMR value of the operating address space. But, the AMR value is
thread-specific and we inherit the address space and not thread
access restrictions. Because of this ignore AMR value when accessing
userspace via kernel thread.

current_thread_amr/iamr() are updated, because we use them in the
below stack.
....
[  530.710838] CPU: 13 PID: 5587 Comm: io_wqe_worker-0 Tainted: G      D           5.11.0-rc6+ #3
....

 NIP [c0000000000aa0c8] pkey_access_permitted+0x28/0x90
 LR [c0000000004b9278] gup_pte_range+0x188/0x420
 --- interrupt: 700
 [c00000001c4ef3f0] [0000000000000000] 0x0 (unreliable)
 [c00000001c4ef490] [c0000000004bd39c] gup_pgd_range+0x3ac/0xa20
 [c00000001c4ef5a0] [c0000000004bdd44] internal_get_user_pages_fast+0x334/0x410
 [c00000001c4ef620] [c000000000852028] iov_iter_get_pages+0xf8/0x5c0
 [c00000001c4ef6a0] [c0000000007da44c] bio_iov_iter_get_pages+0xec/0x700
 [c00000001c4ef770] [c0000000006a325c] iomap_dio_bio_actor+0x2ac/0x4f0
 [c00000001c4ef810] [c00000000069cd94] iomap_apply+0x2b4/0x740
 [c00000001c4ef920] [c0000000006a38b8] __iomap_dio_rw+0x238/0x5c0
 [c00000001c4ef9d0] [c0000000006a3c60] iomap_dio_rw+0x20/0x80
 [c00000001c4ef9f0] [c008000001927a30] xfs_file_dio_aio_write+0x1f8/0x650 [xfs]
 [c00000001c4efa60] [c0080000019284dc] xfs_file_write_iter+0xc4/0x130 [xfs]
 [c00000001c4efa90] [c000000000669984] io_write+0x104/0x4b0
 [c00000001c4efbb0] [c00000000066cea4] io_issue_sqe+0x3d4/0xf50
 [c00000001c4efc60] [c000000000670200] io_wq_submit_work+0xb0/0x2f0
 [c00000001c4efcb0] [c000000000674268] io_worker_handle_work+0x248/0x4a0
 [c00000001c4efd30] [c0000000006746e8] io_wqe_worker+0x228/0x2a0
 [c00000001c4efda0] [c00000000019d994] kthread+0x1b4/0x1c0

Fixes: 48a8ab4eeb ("powerpc/book3s64/pkeys: Don't update SPRN_AMR when in kernel mode.")
Reported-by: Zorro Lang <zlang@redhat.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210206025634.521979-1-aneesh.kumar@linux.ibm.com
2021-02-06 23:13:04 +11:00
Mohammad Athari Bin Ismail
f317e2ea8c net: stmmac: set TxQ mode back to DCB after disabling CBS
When disable CBS, mode_to_use parameter is not updated even the operation
mode of Tx Queue is changed to Data Centre Bridging (DCB). Therefore,
when tc_setup_cbs() function is called to re-enable CBS, the operation
mode of Tx Queue remains at DCB, which causing CBS fails to work.

This patch updates the value of mode_to_use parameter to MTL_QUEUE_DCB
after operation mode of Tx Queue is changed to DCB in stmmac_dma_qmode()
callback function.

Fixes: 1f705bc61a ("net: stmmac: Add support for CBS QDISC")
Suggested-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Mohammad Athari Bin Ismail <mohammad.athari.ismail@intel.com>
Signed-off-by: Song, Yoong Siang <yoong.siang.song@intel.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Link: https://lore.kernel.org/r/1612447396-20351-1-git-send-email-yoong.siang.song@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-05 20:00:19 -08:00
Jakub Kicinski
fb6221a201 Merge branch 'dpaa_eth-a050385-erratum-workaround-fixes-under-xdp'
Camelia Groza says:

====================
dpaa_eth: A050385 erratum workaround fixes under XDP

This series addresses issue with the current workaround for the A050385
erratum in XDP scenarios.

The first patch makes sure the xdp_frame structure stored at the start of
new buffers isn't overwritten.

The second patch decreases the required data alignment value, thus
preventing unnecessary realignments.

The third patch moves the data in place to align it, instead of allocating
a new buffer for each frame that breaks the alignment rules, thus bringing
an up to 40% performance increase. With this change, the impact of the
erratum workaround is reduced in many cases to a single digit decrease, and
to lower double digits in single flow scenarios.

Changes in v2:
- guarantee enough tailroom is available for the shared_info in 1/3
====================

Link: https://lore.kernel.org/r/cover.1612456902.git.camelia.groza@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-05 19:58:36 -08:00
Camelia Groza
0a9946cca1 dpaa_eth: try to move the data in place for the A050385 erratum
The XDP frame's headroom might be large enough to accommodate the
xdpf backpointer as well as shifting the data to an aligned address.

Try this first before resorting to allocating a new buffer and copying
the data.

Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Camelia Groza <camelia.groza@nxp.com>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-05 19:58:34 -08:00
Camelia Groza
c2b0e8455e dpaa_eth: reduce data alignment requirements for the A050385 erratum
The 256 byte data alignment is required for preventing DMA transaction
splits when crossing 4K page boundaries. Since XDP deals only with page
sized buffers or less, this restriction isn't needed. Instead, the data
only needs to be aligned to 64 bytes to prevent DMA transaction splits.

These lessened restrictions can increase performance by widening the pool
of permitted data alignments and preventing unnecessary realignments.

Fixes: ae680bcbd0 ("dpaa_eth: implement the A050385 erratum workaround for XDP")
Signed-off-by: Camelia Groza <camelia.groza@nxp.com>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-05 19:58:34 -08:00
Camelia Groza
275a9c72b4 dpaa_eth: reserve space for the xdp_frame under the A050385 erratum
When the erratum workaround is triggered, the newly created xdp_frame
structure is stored at the start of the newly allocated buffer. Avoid
the structure from being overwritten by explicitly reserving enough
space in the buffer for storing it.

Account for the fact that the structure's size might increase in time by
aligning the headroom to DPAA_FD_DATA_ALIGNMENT bytes, thus guaranteeing
the data's alignment.

Fixes: ae680bcbd0 ("dpaa_eth: implement the A050385 erratum workaround for XDP")
Signed-off-by: Camelia Groza <camelia.groza@nxp.com>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-05 19:58:34 -08:00
Eric Dumazet
8dc1c444df net: gro: do not keep too many GRO packets in napi->rx_list
Commit c80794323e ("net: Fix packet reordering caused by GRO and
listified RX cooperation") had the unfortunate effect of adding
latencies in common workloads.

Before the patch, GRO packets were immediately passed to
upper stacks.

After the patch, we can accumulate quite a lot of GRO
packets (depdending on NAPI budget).

My fix is counting in napi->rx_count number of segments
instead of number of logical packets.

Fixes: c80794323e ("net: Fix packet reordering caused by GRO and listified RX cooperation")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Bisected-by: John Sperbeck <jsperbeck@google.com>
Tested-by: Jian Yang <jianyang@google.com>
Cc: Maxim Mikityanskiy <maximmi@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Alexander Lobakin <alobakin@pm.me>
Link: https://lore.kernel.org/r/20210204213146.4192368-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-02-05 19:28:01 -08:00
Gabriel Krisman Bertazi
36a6c843fd entry: Use different define for selector variable in SUD
Michael Kerrisk suggested that, from an API perspective, it is a bad
idea to share the PR_SYS_DISPATCH_ defines between the prctl operation
and the selector variable.

Therefore, define two new constants to be used by SUD's selector variable
and update the corresponding documentation and test cases.

While this changes the API syscall user dispatch has never been part of a
Linux release, it will show up for the first time in 5.11.

Suggested-by: Michael Kerrisk (man-pages) <mtk.manpages@gmail.com>
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210205184321.2062251-1-krisman@collabora.com
2021-02-06 00:21:42 +01:00
Gabriel Krisman Bertazi
6342adcaa6 entry: Ensure trap after single-step on system call return
Commit 2991552447 ("entry: Drop usage of TIF flags in the generic syscall
code") introduced a bug on architectures using the generic syscall entry
code, in which processes stopped by PTRACE_SYSCALL do not trap on syscall
return after receiving a TIF_SINGLESTEP.

The reason is that the meaning of TIF_SINGLESTEP flag is overloaded to
cause the trap after a system call is executed, but since the above commit,
the syscall call handler only checks for the SYSCALL_WORK flags on the exit
work.

Split the meaning of TIF_SINGLESTEP such that it only means single-step
mode, and create a new type of SYSCALL_WORK to request a trap immediately
after a syscall in single-step mode.  In the current implementation, the
SYSCALL_WORK flag shadows the TIF_SINGLESTEP flag for simplicity.

Update x86 to flip this bit when a tracer enables single stepping.

Fixes: 2991552447 ("entry: Drop usage of TIF flags in the generic syscall code")
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Kyle Huey <me@kylehuey.com>
Link: https://lore.kernel.org/r/87h7mtc9pr.fsf_-_@collabora.com
2021-02-06 00:21:42 +01:00
Thomas Gleixner
2452483d95 Revert "lib: Restrict cpumask_local_spread to houskeeping CPUs"
This reverts commit 1abdfe706a.

This change is broken and not solving any problem it claims to solve.

Robin reported that cpumask_local_spread() now returns any cpu out of
cpu_possible_mask in case that NOHZ_FULL is disabled (runtime or compile
time). It can also return any offline or not-present CPU in the
housekeeping mask. Before that it was returning a CPU out of
online_cpu_mask.

While the function is racy against CPU hotplug if the caller does not
protect against it, the actual use cases are not caring much about it as
they use it mostly as hint for:

 - the user space affinity hint which is unused by the kernel
 - memory node selection which is just suboptimal
 - network queue affinity which might fail but is handled gracefully

But the occasional fail vs. hotplug is very different from returning
anything from possible_cpu_mask which can have a large amount of offline
CPUs obviously.

The changelog of the commit claims:

 "The current implementation of cpumask_local_spread() does not respect
  the isolated CPUs, i.e., even if a CPU has been isolated for Real-Time
  task, it will return it to the caller for pinning of its IRQ
  threads. Having these unwanted IRQ threads on an isolated CPU adds up
  to a latency overhead."

The only correct part of this changelog is:

 "The current implementation of cpumask_local_spread() does not respect
  the isolated CPUs."

Everything else is just disjunct from reality.

Reported-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Nitesh Narayan Lal <nitesh@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: abelits@marvell.com
Cc: davem@davemloft.net
Link: https://lore.kernel.org/r/87y2g26tnt.fsf@nanos.tec.linutronix.de
2021-02-05 23:28:29 +01:00
Linus Torvalds
1e0d27fce0 Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "18 patches.

  Subsystems affected by this patch series: mm (hugetlb, compaction,
  vmalloc, shmem, memblock, pagecache, kasan, and hugetlb), mailmap,
  gcov, ubsan, and MAINTAINERS"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  MAINTAINERS/.mailmap: use my @kernel.org address
  mm: hugetlb: fix missing put_page in gather_surplus_pages()
  ubsan: implement __ubsan_handle_alignment_assumption
  kasan: make addr_has_metadata() return true for valid addresses
  kasan: add explicit preconditions to kasan_report()
  mm/filemap: add missing mem_cgroup_uncharge() to __add_to_page_cache_locked()
  mailmap: add entries for Manivannan Sadhasivam
  mailmap: fix name/email for Viresh Kumar
  memblock: do not start bottom-up allocations with kernel_end
  mm: thp: fix MADV_REMOVE deadlock on shmem THP
  init/gcov: allow CONFIG_CONSTRUCTORS on UML to fix module gcov
  mm/vmalloc: separate put pages and flush VM flags
  mm, compaction: move high_pfn to the for loop scope
  mm: migrate: do not migrate HugeTLB page whose refcount is one
  mm: hugetlb: remove VM_BUG_ON_PAGE from page_huge_active
  mm: hugetlb: fix a race between isolating and freeing page
  mm: hugetlb: fix a race between freeing and dissolving the page
  mm: hugetlbfs: fix cannot migrate the fallocated HugeTLB page
2021-02-05 13:07:27 -08:00
Steven Rostedt (VMware)
256cfdd6fd tracing: Do not count ftrace events in top level enable output
The file /sys/kernel/tracing/events/enable is used to enable all events by
echoing in "1", or disabling all events when echoing in "0". To know if all
events are enabled, disabled, or some are enabled but not all of them,
cating the file should show either "1" (all enabled), "0" (all disabled), or
"X" (some enabled but not all of them). This works the same as the "enable"
files in the individule system directories (like tracing/events/sched/enable).

But when all events are enabled, the top level "enable" file shows "X". The
reason is that its checking the "ftrace" events, which are special events
that only exist for their format files. These include the format for the
function tracer events, that are enabled when the function tracer is
enabled, but not by the "enable" file. The check includes these events,
which will always be disabled, and even though all true events are enabled,
the top level "enable" file will show "X" instead of "1".

To fix this, have the check test the event's flags to see if it has the
"IGNORE_ENABLE" flag set, and if so, not test it.

Cc: stable@vger.kernel.org
Fixes: 553552ce17 ("tracing: Combine event filter_active and enable into single flags field")
Reported-by: "Yordan Karadzhov (VMware)" <y.karadz@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-02-05 15:40:04 -05:00
Hans de Goede
4c7bcb51ae genirq: Prevent [devm_]irq_alloc_desc from returning irq 0
Since commit a85a6c86c2 ("driver core: platform: Clarify that IRQ 0
is invalid"), having a linux-irq with number 0 will trigger a WARN()
when calling platform_get_irq*() to retrieve that linux-irq.

Since [devm_]irq_alloc_desc allocs a single irq and since irq 0 is not used
on some systems, it can return 0, triggering that WARN(). This happens
e.g. on Intel Bay Trail and Cherry Trail devices using the LPE audio engine
for HDMI audio:

 0 is an invalid IRQ number
 WARNING: CPU: 3 PID: 472 at drivers/base/platform.c:238 platform_get_irq_optional+0x108/0x180
 Modules linked in: snd_hdmi_lpe_audio(+) ...

 Call Trace:
  platform_get_irq+0x17/0x30
  hdmi_lpe_audio_probe+0x4a/0x6c0 [snd_hdmi_lpe_audio]

 ---[ end trace ceece38854223a0b ]---

Change the 'from' parameter passed to __[devm_]irq_alloc_descs() by the
[devm_]irq_alloc_desc macros from 0 to 1, so that these macros will no
longer return 0.

Fixes: a85a6c86c2 ("driver core: platform: Clarify that IRQ 0 is invalid")
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20201221185647.226146-1-hdegoede@redhat.com
2021-02-05 20:48:28 +01:00
Aurelien Aptel
21b200d091 cifs: report error instead of invalid when revalidating a dentry fails
Assuming
- //HOST/a is mounted on /mnt
- //HOST/b is mounted on /mnt/b

On a slow connection, running 'df' and killing it while it's
processing /mnt/b can make cifs_get_inode_info() returns -ERESTARTSYS.

This triggers the following chain of events:
=> the dentry revalidation fail
=> dentry is put and released
=> superblock associated with the dentry is put
=> /mnt/b is unmounted

This patch makes cifs_d_revalidate() return the error instead of 0
(invalid) when cifs_revalidate_dentry() fails, except for ENOENT (file
deleted) and ESTALE (file recreated).

Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Suggested-by: Shyam Prasad N <nspmangalore@gmail.com>
Reviewed-by: Shyam Prasad N <nspmangalore@gmail.com>
CC: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-02-05 13:17:48 -06:00
Lai Jiangshan
3943abf2db x86/debug: Prevent data breakpoints on cpu_dr7
local_db_save() is called at the start of exc_debug_kernel(), reads DR7 and
disables breakpoints to prevent recursion.

When running in a guest (X86_FEATURE_HYPERVISOR), local_db_save() reads the
per-cpu variable cpu_dr7 to check whether a breakpoint is active or not
before it accesses DR7.

A data breakpoint on cpu_dr7 therefore results in infinite #DB recursion.

Disallow data breakpoints on cpu_dr7 to prevent that.

Fixes: 84b6a3491567a("x86/entry: Optimize local_db_save() for virt")
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210204152708.21308-2-jiangshanlai@gmail.com
2021-02-05 20:13:12 +01:00
Lai Jiangshan
c4bed4b969 x86/debug: Prevent data breakpoints on __per_cpu_offset
When FSGSBASE is enabled, paranoid_entry() fetches the per-CPU GSBASE value
via __per_cpu_offset or pcpu_unit_offsets.

When a data breakpoint is set on __per_cpu_offset[cpu] (read-write
operation), the specific CPU will be stuck in an infinite #DB loop.

RCU will try to send an NMI to the specific CPU, but it is not working
either since NMI also relies on paranoid_entry(). Which means it's
undebuggable.

Fixes: eaad981291ee3("x86/entry/64: Introduce the FIND_PERCPU_BASE macro")
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210204152708.21308-1-jiangshanlai@gmail.com
2021-02-05 20:13:11 +01:00
Nathan Chancellor
654eb3f2a0 MAINTAINERS/.mailmap: use my @kernel.org address
Use my @kernel.org for all points of contact so that I am always
accessible.

Link: https://lkml.kernel.org/r/20210126212730.2097108-1-nathan@kernel.org
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Miguel Ojeda <ojeda@kernel.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-05 11:03:47 -08:00
Muchun Song
e558464be9 mm: hugetlb: fix missing put_page in gather_surplus_pages()
The VM_BUG_ON_PAGE avoids the generation of any code, even if that
expression has side-effects when !CONFIG_DEBUG_VM.

Link: https://lkml.kernel.org/r/20210126031009.96266-1-songmuchun@bytedance.com
Fixes: e5dfacebe4 ("mm/hugetlb.c: just use put_page_testzero() instead of page_count()")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-05 11:03:47 -08:00
Nathan Chancellor
28abcc9631 ubsan: implement __ubsan_handle_alignment_assumption
When building ARCH=mips 32r2el_defconfig with CONFIG_UBSAN_ALIGNMENT:

  ld.lld: error: undefined symbol: __ubsan_handle_alignment_assumption
     referenced by slab.h:557 (include/linux/slab.h:557)
                   main.o:(do_initcalls) in archive init/built-in.a
     referenced by slab.h:448 (include/linux/slab.h:448)
                   do_mounts_rd.o:(rd_load_image) in archive init/built-in.a
     referenced by slab.h:448 (include/linux/slab.h:448)
                   do_mounts_rd.o:(identify_ramdisk_image) in archive init/built-in.a
     referenced 1579 more times

Implement this for the kernel based on LLVM's
handleAlignmentAssumptionImpl because the kernel is not linked against
the compiler runtime.

Link: https://github.com/ClangBuiltLinux/linux/issues/1245
Link: https://github.com/llvm/llvm-project/blob/llvmorg-11.0.1/compiler-rt/lib/ubsan/ubsan_handlers.cpp#L151-L190
Link: https://lkml.kernel.org/r/20210127224451.2587372-1-nathan@kernel.org
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-05 11:03:47 -08:00
Vincenzo Frascino
b99acdcbfe kasan: make addr_has_metadata() return true for valid addresses
Currently, addr_has_metadata() returns true for every address.  An
invalid address (e.g.  NULL) passed to the function when, KASAN_HW_TAGS
is enabled, leads to a kernel panic.

Make addr_has_metadata() return true for valid addresses only.

Note: KASAN_HW_TAGS support for vmalloc will be added with a future
patch.

Link: https://lkml.kernel.org/r/20210126134409.47894-3-vincenzo.frascino@arm.com
Fixes: 2e903b9147 ("kasan, arm64: implement HW_TAGS runtime")
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Leon Romanovsky <leonro@mellanox.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Naresh Kamboju <naresh.kamboju@linaro.org>
Cc: "Paul E . McKenney" <paulmck@kernel.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-05 11:03:47 -08:00
Vincenzo Frascino
49c6631d3b kasan: add explicit preconditions to kasan_report()
Patch series "kasan: Fix metadata detection for KASAN_HW_TAGS", v5.

With the introduction of KASAN_HW_TAGS, kasan_report() currently assumes
that every location in memory has valid metadata associated.  This is
due to the fact that addr_has_metadata() returns always true.

As a consequence of this, an invalid address (e.g.  NULL pointer
address) passed to kasan_report() when KASAN_HW_TAGS is enabled, leads
to a kernel panic.

Example below, based on arm64:

   BUG: KASAN: invalid-access in 0x0
   Read at addr 0000000000000000 by task swapper/0/1
   Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
   Mem abort info:
     ESR = 0x96000004
     EC = 0x25: DABT (current EL), IL = 32 bits
     SET = 0, FnV = 0
     EA = 0, S1PTW = 0
   Data abort info:
     ISV = 0, ISS = 0x00000004
     CM = 0, WnR = 0

  ...

   Call trace:
    mte_get_mem_tag+0x24/0x40
    kasan_report+0x1a4/0x410
    alsa_sound_last_init+0x8c/0xa4
    do_one_initcall+0x50/0x1b0
    kernel_init_freeable+0x1d4/0x23c
    kernel_init+0x14/0x118
    ret_from_fork+0x10/0x34
   Code: d65f03c0 9000f021 f9428021 b6cfff61 (d9600000)
   ---[ end trace 377c8bb45bdd3a1a ]---
   hrtimer: interrupt took 48694256 ns
   note: swapper/0[1] exited with preempt_count 1
   Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
   SMP: stopping secondary CPUs
   Kernel Offset: 0x35abaf140000 from 0xffff800010000000
   PHYS_OFFSET: 0x40000000
   CPU features: 0x0a7e0152,61c0a030
   Memory Limit: none
   ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---

This series fixes the behavior of addr_has_metadata() that now returns
true only when the address is valid.

This patch (of 2):

With the introduction of KASAN_HW_TAGS, kasan_report() accesses the
metadata only when addr_has_metadata() succeeds.

Add a comment to make sure that the preconditions to the function are
explicitly clarified.

Link: https://lkml.kernel.org/r/20210126134409.47894-1-vincenzo.frascino@arm.com
Link: https://lkml.kernel.org/r/20210126134409.47894-2-vincenzo.frascino@arm.com
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Andrey Konovalov <andreyknvl@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Leon Romanovsky <leonro@mellanox.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: "Paul E . McKenney" <paulmck@kernel.org>
Cc: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-05 11:03:47 -08:00
Waiman Long
da74240eb3 mm/filemap: add missing mem_cgroup_uncharge() to __add_to_page_cache_locked()
Commit 3fea5a499d ("mm: memcontrol: convert page cache to a new
mem_cgroup_charge() API") introduced a bug in __add_to_page_cache_locked()
causing the following splat:

  page dumped because: VM_BUG_ON_PAGE(page_memcg(page))
  pages's memcg:ffff8889a4116000
  ------------[ cut here ]------------
  kernel BUG at mm/memcontrol.c:2924!
  invalid opcode: 0000 [#1] SMP KASAN PTI
  CPU: 35 PID: 12345 Comm: cat Tainted: G S      W I       5.11.0-rc4-debug+ #1
  Hardware name: HP HP Z8 G4 Workstation/81C7, BIOS P60 v01.25 12/06/2017
  RIP: commit_charge+0xf4/0x130
  Call Trace:
    mem_cgroup_charge+0x175/0x770
    __add_to_page_cache_locked+0x712/0xad0
    add_to_page_cache_lru+0xc5/0x1f0
    cachefiles_read_or_alloc_pages+0x895/0x2e10 [cachefiles]
    __fscache_read_or_alloc_pages+0x6c0/0xa00 [fscache]
    __nfs_readpages_from_fscache+0x16d/0x630 [nfs]
    nfs_readpages+0x24e/0x540 [nfs]
    read_pages+0x5b1/0xc40
    page_cache_ra_unbounded+0x460/0x750
    generic_file_buffered_read_get_pages+0x290/0x1710
    generic_file_buffered_read+0x2a9/0xc30
    nfs_file_read+0x13f/0x230 [nfs]
    new_sync_read+0x3af/0x610
    vfs_read+0x339/0x4b0
    ksys_read+0xf1/0x1c0
    do_syscall_64+0x33/0x40
    entry_SYSCALL_64_after_hwframe+0x44/0xa9

Before that commit, there was a try_charge() and commit_charge() in
__add_to_page_cache_locked().  These two separated charge functions were
replaced by a single mem_cgroup_charge().  However, it forgot to add a
matching mem_cgroup_uncharge() when the xarray insertion failed with the
page released back to the pool.

Fix this by adding a mem_cgroup_uncharge() call when insertion error
happens.

Link: https://lkml.kernel.org/r/20210125042441.20030-1-longman@redhat.com
Fixes: 3fea5a499d ("mm: memcontrol: convert page cache to a new mem_cgroup_charge() API")
Signed-off-by: Waiman Long <longman@redhat.com>
Reviewed-by: Alex Shi <alex.shi@linux.alibaba.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Muchun Song <smuchun@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-05 11:03:47 -08:00
Manivannan Sadhasivam
9c41e526a5 mailmap: add entries for Manivannan Sadhasivam
Map my personal and work addresses to korg mail address.

Link: https://lkml.kernel.org/r/20210201104640.108556-1-manivannan.sadhasivam@linaro.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-05 11:03:47 -08:00
Viresh Kumar
4c415b9a71 mailmap: fix name/email for Viresh Kumar
For some of the patches the email id was misspelled to linaro.com
instead of linaro.org and for others Viresh Kumar was written as "viresh
kumar" (all small).  Fix both with help of mailmap entries.

Link: https://lkml.kernel.org/r/d6b80b210d7fe0ddc1d4d0b22eff9708c72ef8b3.1612178938.git.viresh.kumar@linaro.org
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-05 11:03:47 -08:00
Roman Gushchin
2dcb396454 memblock: do not start bottom-up allocations with kernel_end
With kaslr the kernel image is placed at a random place, so starting the
bottom-up allocation with the kernel_end can result in an allocation
failure and a warning like this one:

  hugetlb_cma: reserve 2048 MiB, up to 2048 MiB per node
  ------------[ cut here ]------------
  memblock: bottom-up allocation failed, memory hotremove may be affected
  WARNING: CPU: 0 PID: 0 at mm/memblock.c:332 memblock_find_in_range_node+0x178/0x25a
  Modules linked in:
  CPU: 0 PID: 0 Comm: swapper Not tainted 5.10.0+ #1169
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1.fc33 04/01/2014
  RIP: 0010:memblock_find_in_range_node+0x178/0x25a
  Code: e9 6d ff ff ff 48 85 c0 0f 85 da 00 00 00 80 3d 9b 35 df 00 00 75 15 48 c7 c7 c0 75 59 88 c6 05 8b 35 df 00 01 e8 25 8a fa ff <0f> 0b 48 c7 44 24 20 ff ff ff ff 44 89 e6 44 89 ea 48 c7 c1 70 5c
  RSP: 0000:ffffffff88803d18 EFLAGS: 00010086 ORIG_RAX: 0000000000000000
  RAX: 0000000000000000 RBX: 0000000240000000 RCX: 00000000ffffdfff
  RDX: 00000000ffffdfff RSI: 00000000ffffffea RDI: 0000000000000046
  RBP: 0000000100000000 R08: ffffffff88922788 R09: 0000000000009ffb
  R10: 00000000ffffe000 R11: 3fffffffffffffff R12: 0000000000000000
  R13: 0000000000000000 R14: 0000000080000000 R15: 00000001fb42c000
  FS:  0000000000000000(0000) GS:ffffffff88f71000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: ffffa080fb401000 CR3: 00000001fa80a000 CR4: 00000000000406b0
  Call Trace:
    memblock_alloc_range_nid+0x8d/0x11e
    cma_declare_contiguous_nid+0x2c4/0x38c
    hugetlb_cma_reserve+0xdc/0x128
    flush_tlb_one_kernel+0xc/0x20
    native_set_fixmap+0x82/0xd0
    flat_get_apic_id+0x5/0x10
    register_lapic_address+0x8e/0x97
    setup_arch+0x8a5/0xc3f
    start_kernel+0x66/0x547
    load_ucode_bsp+0x4c/0xcd
    secondary_startup_64_no_verify+0xb0/0xbb
  random: get_random_bytes called from __warn+0xab/0x110 with crng_init=0
  ---[ end trace f151227d0b39be70 ]---

At the same time, the kernel image is protected with memblock_reserve(),
so we can just start searching at PAGE_SIZE.  In this case the bottom-up
allocation has the same chances to success as a top-down allocation, so
there is no reason to fallback in the case of a failure.  All together it
simplifies the logic.

Link: https://lkml.kernel.org/r/20201217201214.3414100-2-guro@fb.com
Fixes: 8fabc62323 ("powerpc: Ensure that swiotlb buffer is allocated from low memory")
Signed-off-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Wonhyuk Yang <vvghjk1234@gmail.com>
Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-05 11:03:47 -08:00
Hugh Dickins
1c2f67308a mm: thp: fix MADV_REMOVE deadlock on shmem THP
Sergey reported deadlock between kswapd correctly doing its usual
lock_page(page) followed by down_read(page->mapping->i_mmap_rwsem), and
madvise(MADV_REMOVE) on an madvise(MADV_HUGEPAGE) area doing
down_write(page->mapping->i_mmap_rwsem) followed by lock_page(page).

This happened when shmem_fallocate(punch hole)'s unmap_mapping_range()
reaches zap_pmd_range()'s call to __split_huge_pmd().  The same deadlock
could occur when partially truncating a mapped huge tmpfs file, or using
fallocate(FALLOC_FL_PUNCH_HOLE) on it.

__split_huge_pmd()'s page lock was added in 5.8, to make sure that any
concurrent use of reuse_swap_page() (holding page lock) could not catch
the anon THP's mapcounts and swapcounts while they were being split.

Fortunately, reuse_swap_page() is never applied to a shmem or file THP
(not even by khugepaged, which checks PageSwapCache before calling), and
anonymous THPs are never created in shmem or file areas: so that
__split_huge_pmd()'s page lock can only be necessary for anonymous THPs,
on which there is no risk of deadlock with i_mmap_rwsem.

Link: https://lkml.kernel.org/r/alpine.LSU.2.11.2101161409470.2022@eggly.anvils
Fixes: c444eb564f ("mm: thp: make the THP mapcount atomic against __split_huge_pmd_locked()")
Signed-off-by: Hugh Dickins <hughd@google.com>
Reported-by: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-05 11:03:47 -08:00
Johannes Berg
55b6f763d8 init/gcov: allow CONFIG_CONSTRUCTORS on UML to fix module gcov
On ARCH=um, loading a module doesn't result in its constructors getting
called, which breaks module gcov since the debugfs files are never
registered.  On the other hand, in-kernel constructors have already been
called by the dynamic linker, so we can't call them again.

Get out of this conundrum by allowing CONFIG_CONSTRUCTORS to be
selected, but avoiding the in-kernel constructor calls.

Also remove the "if !UML" from GCOV selecting CONSTRUCTORS now, since we
really do want CONSTRUCTORS, just not kernel binary ones.

Link: https://lkml.kernel.org/r/20210120172041.c246a2cac2fb.I1358f584b76f1898373adfed77f4462c8705b736@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Jessica Yu <jeyu@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-05 11:03:47 -08:00
Rick Edgecombe
4f6ec86023 mm/vmalloc: separate put pages and flush VM flags
When VM_MAP_PUT_PAGES was added, it was defined with the same value as
VM_FLUSH_RESET_PERMS.  This doesn't seem like it will cause any big
functional problems other than some excess flushing for VM_MAP_PUT_PAGES
allocations.

Redefine VM_MAP_PUT_PAGES to have its own value.  Also, rearrange things
so flags are less likely to be missed in the future.

Link: https://lkml.kernel.org/r/20210122233706.9304-1-rick.p.edgecombe@intel.com
Fixes: b944afc9d6 ("mm: add a VM_MAP_PUT_PAGES flag for vmap")
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Suggested-by: Matthew Wilcox <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Daniel Axtens <dja@axtens.net>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-05 11:03:47 -08:00
Rokudo Yan
74e21484e4 mm, compaction: move high_pfn to the for loop scope
In fast_isolate_freepages, high_pfn will be used if a prefered one (ie
PFN >= low_fn) not found.

But the high_pfn is not reset before searching an free area, so when it
was used as freepage, it may from another free area searched before.  As
a result move_freelist_head(freelist, freepage) will have unexpected
behavior (eg corrupt the MOVABLE freelist)

  Unable to handle kernel paging request at virtual address dead000000000200
  Mem abort info:
    ESR = 0x96000044
    Exception class = DABT (current EL), IL = 32 bits
    SET = 0, FnV = 0
    EA = 0, S1PTW = 0
  Data abort info:
    ISV = 0, ISS = 0x00000044
    CM = 0, WnR = 1
  [dead000000000200] address between user and kernel address ranges

  -000|list_cut_before(inline)
  -000|move_freelist_head(inline)
  -000|fast_isolate_freepages(inline)
  -000|isolate_freepages(inline)
  -000|compaction_alloc(?, ?)
  -001|unmap_and_move(inline)
  -001|migrate_pages([NSD:0xFFFFFF80088CBBD0] from = 0xFFFFFF80088CBD88, [NSD:0xFFFFFF80088CBBC8] get_new_p
  -002|__read_once_size(inline)
  -002|static_key_count(inline)
  -002|static_key_false(inline)
  -002|trace_mm_compaction_migratepages(inline)
  -002|compact_zone(?, [NSD:0xFFFFFF80088CBCB0] capc = 0x0)
  -003|kcompactd_do_work(inline)
  -003|kcompactd([X19] p = 0xFFFFFF93227FBC40)
  -004|kthread([X20] _create = 0xFFFFFFE1AFB26380)
  -005|ret_from_fork(asm)

The issue was reported on an smart phone product with 6GB ram and 3GB
zram as swap device.

This patch fixes the issue by reset high_pfn before searching each free
area, which ensure freepage and freelist match when call
move_freelist_head in fast_isolate_freepages().

Link: http://lkml.kernel.org/r/20190118175136.31341-12-mgorman@techsingularity.net
Link: https://lkml.kernel.org/r/20210112094720.1238444-1-wu-yan@tcl.com
Fixes: 5a811889de ("mm, compaction: use free lists to quickly locate a migration target")
Signed-off-by: Rokudo Yan <wu-yan@tcl.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-05 11:03:47 -08:00
Muchun Song
71a64f618b mm: migrate: do not migrate HugeTLB page whose refcount is one
All pages isolated for the migration have an elevated reference count and
therefore seeing a reference count equal to 1 means that the last user of
the page has dropped the reference and the page has became unused and
there doesn't make much sense to migrate it anymore.

This has been done for regular pages and this patch does the same for
hugetlb pages.  Although the likelihood of the race is rather small for
hugetlb pages it makes sense the two code paths in sync.

Link: https://lkml.kernel.org/r/20210115124942.46403-2-songmuchun@bytedance.com
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Yang Shi <shy828301@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-05 11:03:47 -08:00
Muchun Song
ecbf4724e6 mm: hugetlb: remove VM_BUG_ON_PAGE from page_huge_active
The page_huge_active() can be called from scan_movable_pages() which do
not hold a reference count to the HugeTLB page.  So when we call
page_huge_active() from scan_movable_pages(), the HugeTLB page can be
freed parallel.  Then we will trigger a BUG_ON which is in the
page_huge_active() when CONFIG_DEBUG_VM is enabled.  Just remove the
VM_BUG_ON_PAGE.

Link: https://lkml.kernel.org/r/20210115124942.46403-6-songmuchun@bytedance.com
Fixes: 7e1f049efb ("mm: hugetlb: cleanup using paeg_huge_active()")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-05 11:03:47 -08:00
Muchun Song
0eb2df2b56 mm: hugetlb: fix a race between isolating and freeing page
There is a race between isolate_huge_page() and __free_huge_page().

  CPU0:                                     CPU1:

  if (PageHuge(page))
                                            put_page(page)
                                              __free_huge_page(page)
                                                  spin_lock(&hugetlb_lock)
                                                  update_and_free_page(page)
                                                    set_compound_page_dtor(page,
                                                      NULL_COMPOUND_DTOR)
                                                  spin_unlock(&hugetlb_lock)
    isolate_huge_page(page)
      // trigger BUG_ON
      VM_BUG_ON_PAGE(!PageHead(page), page)
      spin_lock(&hugetlb_lock)
      page_huge_active(page)
        // trigger BUG_ON
        VM_BUG_ON_PAGE(!PageHuge(page), page)
      spin_unlock(&hugetlb_lock)

When we isolate a HugeTLB page on CPU0.  Meanwhile, we free it to the
buddy allocator on CPU1.  Then, we can trigger a BUG_ON on CPU0, because
it is already freed to the buddy allocator.

Link: https://lkml.kernel.org/r/20210115124942.46403-5-songmuchun@bytedance.com
Fixes: c8721bbbdd ("mm: memory-hotplug: enable memory hotplug to handle hugepage")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-05 11:03:47 -08:00
Muchun Song
7ffddd499b mm: hugetlb: fix a race between freeing and dissolving the page
There is a race condition between __free_huge_page()
and dissolve_free_huge_page().

  CPU0:                         CPU1:

  // page_count(page) == 1
  put_page(page)
    __free_huge_page(page)
                                dissolve_free_huge_page(page)
                                  spin_lock(&hugetlb_lock)
                                  // PageHuge(page) && !page_count(page)
                                  update_and_free_page(page)
                                  // page is freed to the buddy
                                  spin_unlock(&hugetlb_lock)
      spin_lock(&hugetlb_lock)
      clear_page_huge_active(page)
      enqueue_huge_page(page)
      // It is wrong, the page is already freed
      spin_unlock(&hugetlb_lock)

The race window is between put_page() and dissolve_free_huge_page().

We should make sure that the page is already on the free list when it is
dissolved.

As a result __free_huge_page would corrupt page(s) already in the buddy
allocator.

Link: https://lkml.kernel.org/r/20210115124942.46403-4-songmuchun@bytedance.com
Fixes: c8721bbbdd ("mm: memory-hotplug: enable memory hotplug to handle hugepage")
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-05 11:03:47 -08:00
Muchun Song
585fc0d287 mm: hugetlbfs: fix cannot migrate the fallocated HugeTLB page
If a new hugetlb page is allocated during fallocate it will not be
marked as active (set_page_huge_active) which will result in a later
isolate_huge_page failure when the page migration code would like to
move that page.  Such a failure would be unexpected and wrong.

Only export set_page_huge_active, just leave clear_page_huge_active as
static.  Because there are no external users.

Link: https://lkml.kernel.org/r/20210115124942.46403-3-songmuchun@bytedance.com
Fixes: 70c3547e36 (hugetlbfs: add hugetlbfs_fallocate())
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-05 11:03:47 -08:00
Linus Torvalds
17fbcdf9f1 Fixes:
- Fix non-page-aligned NFS READs
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEKLLlsBKG3yQ88j7+M2qzM29mf5cFAmAYIvkACgkQM2qzM29m
 f5f9Ow/9HkUHYceOK+mdzx+Yz/Gk+DyWnjgcbLLKRqS0PbwfqxJk0ti+PuqOLc0W
 ppzTk3ENslpQJ0LgA8xvcNvRmHIYqQwAPpqelhUWuVpwEaRGf4JKgFHpB2mLa/hp
 gBxaJYRXmPpWot0IgVo4xOj6Z2B9+K47tDm2MgDOQdWnLIPtSsYjvzLsy0Vnatv2
 C5EC9Dpt7HQIOmVIrQ4kuolwqOIiy9dixoI8FrqoQiCJ/KGzKiDQiy1/Hk+NQLTl
 74LQ51VIpAScB0x5xRzqC7krkdj89D0yeJPa/pDdWiizOGG5Dhg3sd89S8JugJeM
 TpG82ZgJfRuN2FkARmt4B5wYTJoWvpkmJEe3nqnVJrv6UC/CGA6v+iqLI51yJE/V
 YGvDb7BddQ3UfD7Rb83Dmhc6aApVgVvz/c7Kjyb2u+fsFnpqNMfmqu+sKhbnSBZz
 45bt8duKYTFhzqHxZH0JXbLgL6c+C6y/77I0VbVIlC3udlH3lJMCeNps/G+wHs6j
 QCgRkv9df74J7dPhZ4bQtZOKx+xPgsUpzCiwwWuLoCHyozFjVNfwzW07s1pH/gO+
 1Ysy0BqfShgrkuQPSZDsFI46g4hpGhtACBEVt9mw2cx+CbcgzhJMGI/YA28muAjL
 Lj1e799afjOHk1ONvm8X7I2UkFUAF7T9bTcAtKkPgnAf0eAtXYk=
 =W1cN
 -----END PGP SIGNATURE-----

Merge tag 'nfsd-5.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux

Pull nfsd fix from Chuck Lever:
 "Fix non-page-aligned NFS READs"

* tag 'nfsd-5.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  SUNRPC: Fix NFS READs that start at non-page-aligned offsets
2021-02-05 10:11:14 -08:00
Linus Torvalds
6157ce59bf x86 has lots of small bugfixes, mostly one liners. It's quite late in
5.11-rc but none of them are related to this merge window; it's just
 bugs coming in at the wrong time.  Of note among the others:
 - "KVM: x86: Allow guests to see MSR_IA32_TSX_CTRL even if tsx=off"
   (live migration failure seen on distros that hadn't switched to tsx=off
   right away)
 
 ARM:
 - Avoid clobbering extra registers on initialisation
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmAc+3QUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroMgkAf+MdqY5F+wIOZgNMS8XfKah56hGLw0
 l7lRMrdHdDtCIoe+H8iElyIvr5+NOn7KIW48Bxtl5w3VK68h/X1+h/s+Bo0kjf5B
 Pbm0Zh5+l2tO7ocz/G1TsqDkEfWFxQI+QHcKxg1f443ZTzV1k/qM6BCNH5Pk3LFE
 kYtyOIa+YjrP0u9Bl2jZ+DCrXXRFDtDidXeHyPszErVMH90/DiGClLu5/xzCVQRD
 a+6IKLzlGc+nBj5gMXTB8dxyrZ3XrgARF/4/CCFeMLYVtwvkUHaW/ukIXTTiu8wY
 I7IGzA7lX4TZOtGVrsbEjtSaYVxd14n4KuaxvSPIHDZo3b+z0AEtcFzHtA==
 =pmlS
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
 "x86 has lots of small bugfixes, mostly one liners. It's quite late in
  5.11-rc but none of them are related to this merge window; it's just
  bugs coming in at the wrong time.

  Of note among the others is "KVM: x86: Allow guests to see
  MSR_IA32_TSX_CTRL even if tsx=off" that fixes a live migration failure
  seen on distros that hadn't switched to tsx=off right away.

  ARM:
  - Avoid clobbering extra registers on initialisation"

[ Sean Christopherson notes that commit 943dea8af2 ("KVM: x86: Update
  emulator context mode if SYSENTER xfers to 64-bit mode") should have
  had authorship credited to Jonny Barker, not to him.  - Linus ]

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: Set so called 'reserved CR3 bits in LM mask' at vCPU reset
  KVM: x86/mmu: Fix TDP MMU zap collapsible SPTEs
  KVM: x86: cleanup CR3 reserved bits checks
  KVM: SVM: Treat SVM as unsupported when running as an SEV guest
  KVM: x86: Update emulator context mode if SYSENTER xfers to 64-bit mode
  KVM: x86: Supplement __cr4_reserved_bits() with X86_FEATURE_PCID check
  KVM/x86: assign hva with the right value to vm_munmap the pages
  KVM: x86: Allow guests to see MSR_IA32_TSX_CTRL even if tsx=off
  Fix unsynchronized access to sev members through svm_register_enc_region
  KVM: Documentation: Fix documentation for nested.
  KVM: x86: fix CPUID entries returned by KVM_GET_CPUID2 ioctl
  KVM: arm64: Don't clobber x4 in __do_hyp_init
2021-02-05 10:03:01 -08:00
Linus Torvalds
97ba0c7413 IOMMU Fix for Linux v5.11-rc6
- Fix a possible NULL-ptr dereference in dev_iommu_priv_get()
 	  which is too easy to accidentially trigger from IOMMU drivers.
 	  In the current case the AMD IOMMU driver triggered it on some
 	  machines in the IO-page-fault path, so fix it once and for
 	  all.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmAdawcACgkQK/BELZcB
 GuN6dg//XtkZJUl7Cf5wBAYv8PRIGTrbS6a4JQ4Eunv4x9Q29Cvqgb813DPO15sB
 xfR7hTGhWJz9LpZVxX9AxmklVRNFUsng6bIaU8drc0y5rDBm2fjmgRo3zX5DQp+K
 120ZwbAK0TIFlyL3nQhqrfnGJN3lMqdMQJ4EbzkQOtk8BkeBWPwYsXvI2PCgvjsY
 U4SGKh2URaJh59W+PApXUCf5owCPGJoCN2pnGJAmYUmsainTe2FZ79xXGSzvFczH
 ty4YWphoNimvMUcYzdn/ZTUwA68zlRP6/YgpWxpn3tr0624WUK4LusmzBWgNjPB5
 NzYaaOoXHIhqyuFbVd5QXeyoNhKX59ICnV8U8zLvwpT2RHMzVX6hu5Kx7zDy45V3
 F7IqHQfAz7ZnEZ8o+SzdN3ZquFfE2/zQCmFY3OQZOCl11behL1o6tY6g244R6oiq
 VYk2t5whm9K7hbDBclzzIUsax6oK8IX7GBwyy8Xw+k4Qnx/qLmG8txfjax7FBWix
 km/cAJwIv8mhLL496I6XNtHLzcgwvjLQ0HNT3E1MFj/d+uD0rUxAea9XW/QpO6gc
 6+tzL7S9PckX7TY3kuEbvz8b4TWcumSma4tZqiky7iFG+Rxzc+BeMDLMzPZhhpTL
 5AuqKLFFs/XeHGo4mqpLKPGlK30qp2GJh5VaTq2cPLA82hkFSSk=
 =Og2E
 -----END PGP SIGNATURE-----

Merge tag 'iommu-fixes-v5.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull IOMMU fix from Joerg Roedel:
 "Fix a possible NULL-ptr dereference in dev_iommu_priv_get() which is
  too easy to accidentially trigger from IOMMU drivers.

  In the current case the AMD IOMMU driver triggered it on some machines
  in the IO-page-fault path, so fix it once and for all"

* tag 'iommu-fixes-v5.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu: Check dev->iommu in dev_iommu_priv_get() before dereferencing it
2021-02-05 09:57:29 -08:00
Linus Torvalds
e07ce64d83 vdpa: last minute bugfix
A bugfix in the mlx driver I got at the last minute.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmAdZFoPHG1zdEByZWRo
 YXQuY29tAAoJECgfDbjSjVRpafIH/3532FyrBfZQpE3lDD4fLnmNxux9EcOtWVXU
 J3rZrYY72YiKRCkLt8BF1Wv2y5ip358ks7VqI8zDKkhLUuaqXPu1tTGrT8H1SLDB
 BALaYFm5A9IdczdFOZqxhEsuHCTg1bevyQD+kWGCZgzsKdfSg0Yy1h+35dv8nFN3
 jV2Zg8/b5QTS0m/aAdgWRpWOpVnk8aFtz5kon7yFamiI2BHyLIj1JD61F0PQFzKp
 laG/MMr3mQ4VVWUiygfT67PuLnOAMeHqOvBzEt7YA8slOKC+Or8HeM+ChuY31ZnA
 B2eLsEGEMJ8D/g1jqAeFkbbOWKeCcT6q1YYBrpWaZebUbLVwHa4=
 =D48v
 -----END PGP SIGNATURE-----

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull vdpa fix from Michael Tsirkin:
 "A bugfix in the mlx driver I got at the last minute"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  vdpa/mlx5: Restore the hardware used index after change map
2021-02-05 09:54:20 -08:00
Linus Torvalds
2d8bdf5906 MMC core:
- Limit retries when analyse of SDIO tuples fails
 
 MMC host:
  - sdhci: Fix linking err for sdhci-brcmstb
 -----BEGIN PGP SIGNATURE-----
 
 iQJLBAABCgA1FiEEugLDXPmKSktSkQsV/iaEJXNYjCkFAmAdSuMXHHVsZi5oYW5z
 c29uQGxpbmFyby5vcmcACgkQ/iaEJXNYjCkvWg/+Ks9PCp4taJxGcmFSHh2gu2kK
 xVIcLLQG41XTZqlKZhkM2Lud3bQt/vOVnI10Gwq8xrbyMwNqAKRseGMIEBKd2yH8
 WifyKOt2H9O3QRIBufL4OID/9YPdsz+w7yz5wj5MxmXjezCvOAmQCqgqDd7FIp3Q
 IoHjsGcTxypJtzbFtcMCcwYc4bYlTIY/QLaiiYsv7s3ehHfoMXYkcfNgQQvSQtN4
 g6nOIAuN4MqRm5WtfGc9GKvZOjMLuWW2ae2MLDPNwLLKLDH2KCQgpSd0kdo5QmUE
 4SMvuOeAns6BnggbJUIlJBN6cI10iimU/HrZvZ3qncuTsudX/2d9GsYPF03EcPfj
 8Y8/M7Nq05Co9Xm2QMN5cFlmWRjwaxLnKA+iytij7KcVki9HmH3Qm+HdyTzN9k6l
 pLxgwu96EdBnx8bSIWDkYuDnBd65CvCedN1JgYIEP/zb5nqbv9n/3eDPGjgHv7MJ
 enEgczc/MrEXdj6Yhzdgb/dBiElUcO/OjqgAN6/eS82FbdCK476JrtYacsaA0KYs
 JLjP2fQJ+h+J32afVBoDCvaFSUek6wylyMT41HTQL+ZkewZjsSTDBjesXw+JqU77
 lFD+yL0TDPxBqegRFNEFg+blpA/vOw/eLU5tCvT1AQ9aiFHUA2sd7OizKT8XvdzW
 7b0NPTplBwa5B0NVAek=
 =ftQ/
 -----END PGP SIGNATURE-----

Merge tag 'mmc-v5.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc

Pull MMC fixes from Ulf Hansson:
 "MMC core:
   - Limit retries when analyse of SDIO tuples fails

  MMC host:
   - sdhci: Fix linking err for sdhci-brcmstb"

* tag 'mmc-v5.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: sdhci-pltfm: Fix linking err for sdhci-brcmstb
  mmc: core: Limit retries when analyse of SDIO tuples fails
2021-02-05 09:53:11 -08:00
Linus Torvalds
8e91dd934b drm fixes for 5.11-rc7
ttm:
 - fix huge page warning regression
 
 i915:
 - Skip vswing programming for TBT
 - Power up combo PHY lanes for HDMI
 - Fix double YUV range correction on HDR planes
 - Fix the MST PBN divider calculation
 - Fix LTTPR vswing/pre-emp setting in non-transparent mode
 - Move the breadcrumb to the signaler if completed upon cancel
 - Close race between enable_breadcrumbs and cancel_breadcrumbs
 - Drop lru bumping on display unpinning
 
 amdgpu:
 - Fix retry in gem create
 - Vangogh fixes
 - Fix for display from shared buffers
 - Various display fixes
 
 amdkfd:
 - Fix regression in buffer free
 
 nouveau:
 - fix DMA API warning regression
 
 drm/bridge/lontium-lt9611uxc:
 - EDID fixes
 - Don't handle hotplug events in IRQ handler
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJgHKD+AAoJEAx081l5xIa+XXsQAKlhnEE72yA4qHCPnC8WM6jg
 CQGJGFpFY0woaWoRVuH/kSz+gH0ksLENQWmT0vrG1nA35fp5ypsB2sQLq7lQGOgm
 kcKtW8ykwz/m7JETeN5R5+GXHgZIQDcMjqVdXA8drFCX0RzgTD/rdtFN+XZTAS9z
 c8Sxr0+nUMIIqJCYeWe6MNvTUDxE9N34Kcm8dfELkYXBmqakbo9Mrx1ENEuM4f1r
 mIJgKTGttSsRIduoqOcTY5BEMzAzpuETWhQzaIDSUMHyz5PuzpLABbQ3Fn9ubjyu
 wEnEqNrUtrcOq2cy19L3TZ6dIju88pp3ovXbYtstvVrn6xFL4sMyeksWV/P2UfQg
 CVpO/HqRCeSSHqoNJP6vYhKcxAoUETaJ/ehSLa92jiJH6eLTxNCjA0DCF1BCFSKp
 2iBFaiaOb/jlEEouPbOX9BwBLVGXHd8ijcLyjV4NOVKsX9+/6l/Mx/HFrtzcoj3f
 u+v3/duMQvj+zcKpSJtJFUjKRkw//28p59fexjVRqM5AMhcKBSMyebzJAH2dAnWB
 YYuqkyBQOed6JNfQ1wz2hec0gkJBLz8QO20fMldpjZxEZE96+LNVvhuHlJiOUVDS
 AVZYNeZPkzUQvoJdh9b+XTqtJEgC1KFlg/ySX3IWEVRd1KOWkoyclVJvEluR+Uvv
 +Xsv9wxlwBmFkCXIUMq9
 =POII
 -----END PGP SIGNATURE-----

Merge tag 'drm-fixes-2021-02-05-1' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "Fixes for rc7, bit bigger than I'd like at this stage, but most of the
  i915 stuff and some amdgpu is destined for staging and I'd rather not
  hold it up, the i915 changes also pulled in a few precusor code
  movement patches to make things cleaner, but nothing seems that
  horrible, and I've checked over all of it.

  Otherwise there is a nouveau dma-api warning regression, and a ttm
  page allocation warning fix, and some fixes for a bridge chip,

  ttm:
   - fix huge page warning regression

  i915:
   - Skip vswing programming for TBT
   - Power up combo PHY lanes for HDMI
   - Fix double YUV range correction on HDR planes
   - Fix the MST PBN divider calculation
   - Fix LTTPR vswing/pre-emp setting in non-transparent mode
   - Move the breadcrumb to the signaler if completed upon cancel
   - Close race between enable_breadcrumbs and cancel_breadcrumbs
   - Drop lru bumping on display unpinning

  amdgpu:
   - Fix retry in gem create
   - Vangogh fixes
   - Fix for display from shared buffers
   - Various display fixes

  amdkfd:
   - Fix regression in buffer free

  nouveau:
   - fix DMA API warning regression

  drm/bridge/lontium-lt9611uxc:
   - EDID fixes
   - Don't handle hotplug events in IRQ handler"

* tag 'drm-fixes-2021-02-05-1' of git://anongit.freedesktop.org/drm/drm: (29 commits)
  drm/nouveau: fix dma syncing warning with debugging on.
  drm/amd/display: Decrement refcount of dc_sink before reassignment
  drm/amd/display: Free atomic state after drm_atomic_commit
  drm/amd/display: Fix dc_sink kref count in emulated_link_detect
  drm/amd/display: Release DSC before acquiring
  drm/amd/display: Revert "Fix EDID parsing after resume from suspend"
  drm/amd/display: Add more Clock Sources to DCN2.1
  drm/amd/display: reuse current context instead of recreating one
  drm/amd/display: Fix DPCD translation for LTTPR AUX_RD_INTERVAL
  drm/amdgpu: enable freesync for A+A configs
  drm/amd/pm: fill in the data member of v2 gpu metrics table for vangogh
  drm/amdgpu/gfx10: update CGTS_TCC_DISABLE and CGTS_USER_TCC_DISABLE register offsets for VGH
  drm/amdkfd: fix null pointer panic while free buffer in kfd
  drm/amdgpu: fix the issue that retry constantly once the buffer is oversize
  drm/i915/dp: Fix LTTPR vswing/pre-emp setting in non-transparent mode
  drm/i915/dp: Move intel_dp_set_signal_levels() to intel_dp_link_training.c
  drm/i915: Fix the MST PBN divider calculation
  drm/dp/mst: Export drm_dp_get_vc_payload_bw()
  drm/i915/gem: Drop lru bumping on display unpinning
  drm/i915/gt: Close race between enable_breadcrumbs and cancel_breadcrumbs
  ...
2021-02-05 09:50:21 -08:00
Geert Uytterhoeven
24c242ec7a ntp: Use freezable workqueue for RTC synchronization
The bug fixed by commit e3fab2f3de ("ntp: Fix RTC synchronization on
32-bit platforms") revealed an underlying issue: RTC synchronization may
happen anytime, even while the system is partially suspended.

On systems where the RTC is connected to an I2C bus, the I2C bus controller
may already or still be suspended, triggering a WARNING during suspend or
resume from s2ram:

    WARNING: CPU: 0 PID: 124 at drivers/i2c/i2c-core.h:54 __i2c_transfer+0x634/0x680
    i2c i2c-6: Transfer while suspended
    [...]
    Workqueue: events_power_efficient sync_hw_clock
    [...]
      (__i2c_transfer)
      (i2c_transfer)
      (regmap_i2c_read)
      ...
      (da9063_rtc_set_time)
      (rtc_set_time)
      (sync_hw_clock)
      (process_one_work)

Fix this race condition by using the freezable instead of the normal
power-efficient workqueue.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lore.kernel.org/r/20210125143039.1051912-1-geert+renesas@glider.be
2021-02-05 18:03:13 +01:00
Eli Cohen
b35ccebe3e vdpa/mlx5: Restore the hardware used index after change map
When a change of memory map occurs, the hardware resources are destroyed
and then re-created again with the new memory map. In such case, we need
to restore the hardware available and used indices. The driver failed to
restore the used index which is added here.

Also, since the driver also fails to reset the available and used
indices upon device reset, fix this here to avoid regression caused by
the fact that used index may not be zero upon device reset.

Fixes: 1a86b377aa ("vdpa/mlx5: Add VDPA driver for supported mlx5 devices")
Signed-off-by: Eli Cohen <elic@nvidia.com>
Link: https://lore.kernel.org/r/20210204073618.36336-1-elic@nvidia.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2021-02-05 10:28:04 -05:00
Pavel Shilovsky
91792bb808 smb3: fix crediting for compounding when only one request in flight
Currently we try to guess if a compound request is going to
succeed waiting for credits or not based on the number of
requests in flight. This approach doesn't work correctly
all the time because there may be only one request in
flight which is going to bring multiple credits satisfying
the compound request.

Change the behavior to fail a request only if there are no requests
in flight at all and proceed waiting for credits otherwise.

Cc: <stable@vger.kernel.org> # 5.1+
Signed-off-by: Pavel Shilovsky <pshilov@microsoft.com>
Reviewed-by: Tom Talpey <tom@talpey.com>
Reviewed-by: Shyam Prasad N <nspmangalore@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-02-05 08:12:00 -06:00
Barry Song
9f5f8ec501 dma-mapping: benchmark: use u8 for reserved field in uAPI structure
The original code put five u32 before a u64 expansion[10] array. Five is
odd, this will cause trouble in the extension of the structure by adding
new features. This patch moves to use u8 for reserved field to avoid
future alignment risk.
Meanwhile, it also clears the memory of struct map_benchmark in tools,
otherwise, if users use old version to run on newer kernel, the random
expansion value will cause side effect on newer kernel.

Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-02-05 12:48:46 +01:00
Russell King
4d62e81b60 ARM: kexec: fix oops after TLB are invalidated
Giancarlo Ferrari reports the following oops while trying to use kexec:

 Unable to handle kernel paging request at virtual address 80112f38
 pgd = fd7ef03e
 [80112f38] *pgd=0001141e(bad)
 Internal error: Oops: 80d [#1] PREEMPT SMP ARM
 ...

This is caused by machine_kexec() trying to set the kernel text to be
read/write, so it can poke values into the relocation code before
copying it - and an interrupt occuring which changes the page tables.
The subsequent writes then hit read-only sections that trigger a
data abort resulting in the above oops.

Fix this by copying the relocation code, and then writing the variables
into the destination, thereby avoiding the need to make the kernel text
read/write.

Reported-by: Giancarlo Ferrari <giancarlo.ferrari89@gmail.com>
Tested-by: Giancarlo Ferrari <giancarlo.ferrari89@gmail.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2021-02-05 10:23:29 +00:00
Russell King
9c698bff66 ARM: ensure the signal page contains defined contents
Ensure that the signal page contains our poison instruction to increase
the protection against ROP attacks and also contains well defined
contents.

Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2021-02-05 10:23:00 +00:00