Commit Graph

982107 Commits

Author SHA1 Message Date
David Hildenbrand
98ff9f9411 virtio-mem: generalize handling when memory is getting onlined deferred
We don't want to add too much memory when it's not getting onlined
immediately, to avoid running OOM. Generalize the handling, to avoid
making use of memory block states. Use a threshold of 1 GiB for now.

Properly adjust the offline size when adding/removing memory. As we are
not always protected by a lock when touching the offline size, use an
atomic64_t. We don't care about races (e.g., someone offlining memory
while we are adding more), only about consistent values.

(1 GiB needs a memmap of ~16MiB - which sounds reasonable even for
 setups with little boot memory and (possibly) one virtio-mem device per
 node)

We don't want to retrigger when onlining is caused immediately by our
action (e.g., adding memory which immediately gets onlined), so use a
flag to indicate if the workqueue is active and use that as an
indicator whether to trigger a retry. This will also be especially relevant
for Big Block Mode (BBM), whereby we might re-online memory in case
offlining of another memory block failed.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20201112133815.13332-17-david@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 16:14:26 -05:00
David Hildenbrand
1d33c2caa8 virtio-mem: don't always trigger the workqueue when offlining memory
Let's trigger from offlining code only when we're not allowed to unplug
online memory. Handle the other case (memmap possibly freeing up another
memory block) when actually removing memory. We now also properly handle
the case when removing already offline memory blocks via
virtio_mem_mb_remove(). When removing via virtio_mem_remove(), when
unloading the driver, virtio_mem_retry() is a NOP and safe to use.

While at it, move retry handling when offlining out of
virtio_mem_notify_offline(), to share it with Big Block Mode (BBM)
soon.

This is a preparation for Big Block Mode (BBM), whereby we can see some
temporary offlining of memory blocks without actually making progress.
Imagine you have a Big Block that spans to Linux memory blocks. Assume
the first Linux memory blocks has no unmovable data on it. When we would
call offline_and_remove_memory() on the big block, we would
	1. Try to offline the first block. Works, notifiers triggered.
	   virtio_mem_retry() called.
	2. Try to offline the second block. Does not work.
	3. Re-online first block.
	4. Exit to main loop, exit workqueue.
	5. Retry immediately (due to virtio_mem_retry()), go to 1.
The result are endless retries.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20201112133815.13332-16-david@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 16:14:26 -05:00
David Hildenbrand
420066829b virtio-mem: drop last_mb_id
No longer used, let's drop it.

Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20201112133815.13332-15-david@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 16:14:26 -05:00
David Hildenbrand
835491c554 virtio-mem: generalize virtio_mem_overlaps_range()
Avoid using memory block ids. While at it, use uint64_t for
address/size.

This is a preparation for Big Block Mode (BBM).

Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20201112133815.13332-14-david@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 16:14:26 -05:00
David Hildenbrand
8464e3bdf2 virtio-mem: generalize virtio_mem_owned_mb()
Avoid using memory block ids. Rename it to virtio_mem_contains_range().

This is a preparation for Big Block Mode (BBM).

Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20201112133815.13332-13-david@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 16:14:26 -05:00
David Hildenbrand
989ff82527 virtio-mem: generalize check for added memory
Let's check by traversing busy system RAM resources instead, to avoid
relying on memory block states.

Don't use walk_system_ram_range(), as that works on pages and we want to
use the bare addresses we have easily at hand.

This is a preparation for Big Block Mode (BBM), which won't have memory
block states.

Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20201112133815.13332-12-david@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 16:14:26 -05:00
David Hildenbrand
f2d799d591 virtio-mem: retry fake-offlining via alloc_contig_range() on ZONE_MOVABLE
ZONE_MOVABLE is supposed to give some guarantees, yet,
alloc_contig_range() isn't prepared to properly deal with some racy
cases properly (e.g., temporary page pinning when exiting processed, PCP).

Retry 5 times for now. There is certainly room for improvement in the
future.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20201112133815.13332-11-david@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 16:14:26 -05:00
David Hildenbrand
7a34c77dab virtio-mem: factor out handling of fake-offline pages in memory notifier
Let's factor out the core pieces and place the implementation next to
virtio_mem_fake_offline(). We'll reuse this functionality soon.

Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20201112133815.13332-10-david@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 16:14:26 -05:00
David Hildenbrand
89c486c47f virtio-mem: factor out fake-offlining into virtio_mem_fake_offline()
... which now matches virtio_mem_fake_online(). We'll reuse this
functionality soon.

Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20201112133815.13332-9-david@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 16:14:25 -05:00
David Hildenbrand
6beb3a9421 virtio-mem: print debug messages from virtio_mem_send_*_request()
Let's move the existing dev_dbg() into the functions, print if something
went wrong, and also print for virtio_mem_send_unplug_all_request().

Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20201112133815.13332-8-david@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 16:14:25 -05:00
David Hildenbrand
41e6215c6d virtio-mem: factor out calculation of the bit number within the subblock bitmap
The calculation is already complicated enough, let's limit it to one
location.

Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20201112133815.13332-7-david@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 16:14:25 -05:00
David Hildenbrand
2a6285114b virtio-mem: use "unsigned long" for nr_pages when fake onlining/offlining
No harm done, but let's be consistent.

Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20201112133815.13332-6-david@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 16:14:25 -05:00
David Hildenbrand
d76944f80d virtio-mem: drop rc2 in virtio_mem_mb_plug_and_add()
We can drop rc2, we don't actually need the value.

Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20201112133815.13332-5-david@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 16:14:25 -05:00
David Hildenbrand
20b9150225 virtio-mem: simplify MAX_ORDER - 1 / pageblock_order handling
Let's use pageblock_nr_pages and MAX_ORDER_NR_PAGES instead where
possible to simplify.

Add a comment why we have that restriction for now.

Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20201112133815.13332-4-david@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 16:14:25 -05:00
David Hildenbrand
347202dc04 virtio-mem: more precise calculation in virtio_mem_mb_state_prepare_next_mb()
We actually need one byte less (next_mb_id is exclusive, first_mb_id is
inclusive). While at it, compact the code.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20201112133815.13332-3-david@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 16:14:25 -05:00
David Hildenbrand
6725f21157 virtio-mem: determine nid only once using memory_add_physaddr_to_nid()
Let's determine the target nid only once in case we have none specified -
usually, we'll end up with node 0 either way.

Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20201112133815.13332-2-david@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2020-12-18 16:14:25 -05:00
Linus Torvalds
a0b9631487 Merge tag 'xfs-5.11-merge-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs updates from Darrick Wong:
 "In this release we add the ability to set a 'needsrepair' flag
  indicating that we /know/ the filesystem requires xfs_repair, but
  other than that, it's the usual strengthening of metadata validation
  and miscellaneous cleanups.

  Summary:

   - Introduce a "needsrepair" "feature" to flag a filesystem as needing
     a pass through xfs_repair. This is key to enabling filesystem
     upgrades (in xfs_db) that require xfs_repair to make minor
     adjustments to metadata.

   - Refactor parameter checking of recovered log intent items so that
     we actually use the same validation code as them that generate the
     intent items.

   - Various fixes to online scrub not reacting correctly to directory
     entries pointing to inodes that cannot be igetted.

   - Refactor validation helpers for data and rt volume extents.

   - Refactor XFS_TRANS_DQ_DIRTY out of existence.

   - Fix a longstanding bug where mounting with "uqnoenforce" would
     start user quotas in non-enforcing mode but /proc/mounts would
     display "usrquota", implying that they are being enforced.

   - Don't flag dax+reflink inodes as corruption since that is a valid
     (but not fully functional) combination right now.

   - Clean up raid stripe validation functions.

   - Refactor the inode allocation code to be more straightforward.

   - Small prep cleanup for idmapping support.

   - Get rid of the xfs_buf_t typedef"

* tag 'xfs-5.11-merge-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (40 commits)
  xfs: remove xfs_buf_t typedef
  fs/xfs: convert comma to semicolon
  xfs: open code updating i_mode in xfs_set_acl
  xfs: remove xfs_vn_setattr_nonsize
  xfs: kill ialloced in xfs_dialloc()
  xfs: spilt xfs_dialloc() into 2 functions
  xfs: move xfs_dialloc_roll() into xfs_dialloc()
  xfs: move on-disk inode allocation out of xfs_ialloc()
  xfs: introduce xfs_dialloc_roll()
  xfs: convert noroom, okalloc in xfs_dialloc() to bool
  xfs: don't catch dax+reflink inodes as corruption in verifier
  xfs: fix the forward progress assertion in xfs_iwalk_run_callbacks
  xfs: remove unneeded return value check for *init_cursor()
  xfs: introduce xfs_validate_stripe_geometry()
  xfs: show the proper user quota options
  xfs: remove the unused XFS_B_FSB_OFFSET macro
  xfs: remove unnecessary null check in xfs_generic_create
  xfs: directly return if the delta equal to zero
  xfs: check tp->t_dqinfo value instead of the XFS_TRANS_DQ_DIRTY flag
  xfs: delete duplicated tp->t_dqinfo null check and allocation
  ...
2020-12-18 12:50:18 -08:00
Linus Torvalds
4862c741bd Merge tag 'ktest-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest
Pull ktest updates from Steven Rostedt:
 "No new features. Just a couple of fixes that I had in my local
  repository that fixed issues with sending the result emails"

* tag 'ktest-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest:
  ktest.pl: Fix the logic for truncating the size of the log file for email
  ktest.pl: If size of log is too big to email, email error message
2020-12-18 12:46:43 -08:00
Linus Torvalds
c59c7588fc Merge tag 'drm-next-2020-12-18' of git://anongit.freedesktop.org/drm/drm
Pull more drm updates from Daniel Vetter:
 "UAPI Changes:

   - Only enable char/agp uapi when CONFIG_DRM_LEGACY is set

  Cross-subsystem Changes:

   - vma_set_file helper to make vma->vm_file changing less brittle,
     acked by Andrew

  Core Changes:

   - dma-buf heaps improvements

   - pass full atomic modeset state to driver callbacks

   - shmem helpers: cached bo by default

   - cleanups for fbdev, fb-helpers

   - better docs for drm modes and SCALING_FITLER uapi

   - ttm: fix dma32 page pool regression

  Driver Changes:

   - multi-hop regression fixes for amdgpu, radeon, nouveau

   - lots of small amdgpu hw enabling fixes (display, pm, ...)

   - fixes for imx, mcde, meson, some panels, virtio, qxl, i915, all
     fairly minor

   - some cleanups for legacy drm/fbdev drivers"

* tag 'drm-next-2020-12-18' of git://anongit.freedesktop.org/drm/drm: (117 commits)
  drm/qxl: don't allocate a dma_address array
  drm/nouveau: fix multihop when move doesn't work.
  drm/i915/tgl: Fix REVID macros for TGL to fetch correct stepping
  drm/i915: Fix mismatch between misplaced vma check and vma insert
  drm/i915/perf: also include Gen11 in OATAILPTR workaround
  Revert "drm/i915: re-order if/else ladder for hpd_irq_setup"
  drm/amdgpu/disply: fix documentation warnings in display manager
  drm/amdgpu: print mmhub client name for dimgrey_cavefish
  drm/amdgpu: set mode1 reset as default for dimgrey_cavefish
  drm/amd/display: Add get_dig_frontend implementation for DCEx
  drm/radeon: remove h from printk format specifier
  drm/amdgpu: remove h from printk format specifier
  drm/amdgpu: Fix spelling mistake "Heterogenous" -> "Heterogeneous"
  drm/amdgpu: fix regression in vbios reservation handling on headless
  drm/amdgpu/SRIOV: Extend VF reset request wait period
  drm/amdkfd: correct amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu log.
  drm/amd/display: Adding prototype for dccg21_update_dpp_dto()
  drm/amdgpu: print what method we are using for runtime pm
  drm/amdgpu: simplify logic in atpx resume handling
  drm/amdgpu: no need to call pci_ignore_hotplug for _PR3
  ...
2020-12-18 12:38:28 -08:00
Arnaldo Carvalho de Melo
b53d4872d2 tools headers UAPI: Update asm-generic/unistd.h
Just a comment change, trivial.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-12-18 17:32:28 -03:00
Arnaldo Carvalho de Melo
f93c789a3e tools headers cpufeatures: Sync with the kernel sources
To pick the changes in:

  e7b6385b01 ("x86/cpufeatures: Add Intel SGX hardware bits")

That causes only these 'perf bench' objects to rebuild:

  CC       /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
  CC       /tmp/build/perf/bench/mem-memset-x86-64-asm.o

And addresses these perf build warnings:

  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/disabled-features.h' differs from latest version at 'arch/x86/include/asm/disabled-features.h'
  diff -u tools/arch/x86/include/asm/disabled-features.h arch/x86/include/asm/disabled-features.h
  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h'
  diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Sean Christopherson <seanjc@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-12-18 17:32:28 -03:00
Arnaldo Carvalho de Melo
d6dbfceec5 tools headers UAPI: Sync linux/prctl.h with the kernel sources
To pick a new prctl introduced in:

  1446e1df9e ("kernel: Implement selective syscall userspace redirection")

That results in:

  $ tools/perf/trace/beauty/prctl_option.sh > before
  $ cp include/uapi/linux/prctl.h tools/include/uapi/linux/prctl.h
  $ tools/perf/trace/beauty/prctl_option.sh > after
  $ diff -u before after
  --- before	2020-12-17 15:00:42.012537367 -0300
  +++ after	2020-12-17 15:00:49.832699463 -0300
  @@ -53,6 +53,7 @@
   	[56] = "GET_TAGGED_ADDR_CTRL",
   	[57] = "SET_IO_FLUSHER",
   	[58] = "GET_IO_FLUSHER",
  +	[59] = "SET_SYSCALL_USER_DISPATCH",
   };
   static const char *prctl_set_mm_options[] = {
   	[1] = "START_CODE",
  $

Now users can do:

  # perf trace -e syscalls:sys_enter_prctl --filter "option==SET_SYSCALL_USER_DISPATCH"
^C#
  # trace -v -e syscalls:sys_enter_prctl --filter "option==SET_SYSCALL_USER_DISPATCH"
  New filter for syscalls:sys_enter_prctl: (option==0x3b) && (common_pid != 5519 && common_pid != 3404)
^C#

And also when prctl appears in a session, its options will be
translated to the string.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Gabriel Krisman Bertazi <krisman@collabora.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-12-18 17:32:28 -03:00
Arnaldo Carvalho de Melo
4a443a5177 tools headers UAPI: Sync linux/fscrypt.h with the kernel sources
To pick the changes from:

  3ceb6543e9 ("fscrypt: remove kernel-internal constants from UAPI header")

That don't result in any changes in tooling, just addressing this perf
build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/fscrypt.h' differs from latest version at 'include/uapi/linux/fscrypt.h'
  diff -u tools/include/uapi/linux/fscrypt.h include/uapi/linux/fscrypt.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Eric Biggers <ebiggers@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-12-18 17:32:28 -03:00
Arnaldo Carvalho de Melo
7ddcdea5b5 tools headers UAPI: Sync linux/const.h with the kernel headers
To pick up the changes in:

  a85cbe6159 ("uapi: move constants from <linux/kernel.h> to <linux/const.h>")

That causes no changes in tooling, just addresses this perf build
warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/const.h' differs from latest version at 'include/uapi/linux/const.h'
  diff -u tools/include/uapi/linux/const.h include/uapi/linux/const.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Petr Vorel <petr.vorel@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-12-18 17:32:28 -03:00
Arnaldo Carvalho de Melo
e9bde94f1e tools arch x86: Sync the msr-index.h copy with the kernel sources
To pick up the changes in:

  d205e0f142 ("x86/{cpufeatures,msr}: Add Intel SGX Launch Control hardware bits")
  e7b6385b01 ("x86/cpufeatures: Add Intel SGX hardware bits")
  43756a2989 ("powercap: Add AMD Fam17h RAPL support")
  298ed2b31f ("x86/msr-index: sort AMD RAPL MSRs by address")
  68299a42f8 ("x86/mce: Enable additional error logging on certain Intel CPUs")

That cause these changes in tooling:

  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before
  $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h
  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after
  $ diff -u before after
  --- before	2020-12-17 14:45:49.036994450 -0300
  +++ after	2020-12-17 14:46:01.654256639 -0300
  @@ -22,6 +22,10 @@
   	[0x00000060] = "LBR_CORE_TO",
   	[0x00000079] = "IA32_UCODE_WRITE",
   	[0x0000008b] = "IA32_UCODE_REV",
  +	[0x0000008C] = "IA32_SGXLEPUBKEYHASH0",
  +	[0x0000008D] = "IA32_SGXLEPUBKEYHASH1",
  +	[0x0000008E] = "IA32_SGXLEPUBKEYHASH2",
  +	[0x0000008F] = "IA32_SGXLEPUBKEYHASH3",
   	[0x0000009b] = "IA32_SMM_MONITOR_CTL",
   	[0x0000009e] = "IA32_SMBASE",
   	[0x000000c1] = "IA32_PERFCTR0",
  @@ -59,6 +63,7 @@
   	[0x00000179] = "IA32_MCG_CAP",
   	[0x0000017a] = "IA32_MCG_STATUS",
   	[0x0000017b] = "IA32_MCG_CTL",
  +	[0x0000017f] = "ERROR_CONTROL",
   	[0x00000180] = "IA32_MCG_EAX",
   	[0x00000181] = "IA32_MCG_EBX",
   	[0x00000182] = "IA32_MCG_ECX",
  @@ -294,6 +299,7 @@
   	[0xc0010241 - x86_AMD_V_KVM_MSRs_offset] = "F15H_NB_PERF_CTR",
   	[0xc0010280 - x86_AMD_V_KVM_MSRs_offset] = "F15H_PTSC",
   	[0xc0010299 - x86_AMD_V_KVM_MSRs_offset] = "AMD_RAPL_POWER_UNIT",
  +	[0xc001029a - x86_AMD_V_KVM_MSRs_offset] = "AMD_CORE_ENERGY_STATUS",
   	[0xc001029b - x86_AMD_V_KVM_MSRs_offset] = "AMD_PKG_ENERGY_STATUS",
   	[0xc00102f0 - x86_AMD_V_KVM_MSRs_offset] = "AMD_PPIN_CTL",
   	[0xc00102f1 - x86_AMD_V_KVM_MSRs_offset] = "AMD_PPIN",
  $

Which causes these parts of tools/perf/ to be rebuilt:

  CC       /tmp/build/perf/trace/beauty/tracepoints/x86_msr.o
  LD       /tmp/build/perf/trace/beauty/tracepoints/perf-in.o
  LD       /tmp/build/perf/trace/beauty/perf-in.o
  LD       /tmp/build/perf/perf-in.o
  LINK     /tmp/build/perf/perf

At some point these should just be tables read by perf on demand.

This allows 'perf trace' users to use those strings to translate from
the msr ids provided by the msr: tracepoints.

This addresses this perf tools build warning:

  diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h
  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h'

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Victor Ding <victording@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-12-18 17:32:28 -03:00
Arnaldo Carvalho de Melo
eb2842da77 perf trace beauty: Update copy of linux/socket.h with the kernel sources
This just triggers the rebuilding of the syscall beautifiers that
extract patterns from this file due to this cset:

  b713c195d5 ("net: provide __sys_shutdown_sock() that takes a socket")

After updating it:

    CC       /tmp/build/perf/trace/beauty/sockaddr.o

Addressing this perf build warning:

  Warning: Kernel ABI header at 'tools/perf/trace/beauty/include/linux/socket.h' differs from latest version at 'include/linux/socket.h'
  diff -u tools/perf/trace/beauty/include/linux/socket.h include/linux/socket.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-12-18 17:32:28 -03:00
Arnaldo Carvalho de Melo
23cd9543a5 tools headers: Update linux/ctype.h with the kernel sources
To pick up the changes in:

  caabdd0f59 ("ctype.h: remove duplicate isdigit() helper")

Addressing this perf build warning:

  Warning: Kernel ABI header at 'tools/include/linux/ctype.h' differs from latest version at 'include/linux/ctype.h'
  diff -u tools/include/linux/ctype.h include/linux/ctype.h

And we need to continue using the combination of:

  inline __isdigit()
  #define isdigit() __isdigit

When the __has_builtin() thing isn't available, as it is a builtin in
older systems with it as a builtin but with compilers not hacinv
__has_builtin(), rendering the __has_builtin() check useless otherwise.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-12-18 17:32:28 -03:00
Arnaldo Carvalho de Melo
ffb9beb13e tools headers: Add conditional __has_builtin()
As it'll be used by the ctype.h sync with its kernel source original.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-12-18 17:32:28 -03:00
Arnaldo Carvalho de Melo
4bba4c4bb0 tools headers: Get tools's linux/compiler.h closer to the kernel's
We're cherry picking stuff from the kernel to allow for the other
headers that we keep in sync via tools/perf/check-headers.sh to work,
so introduce linux/compiler_types.h and from there get the compiler
specific stuff.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2020-12-18 17:32:26 -03:00
Linus Torvalds
432c19a8d9 Merge tag 'thermal-v5.11-2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux
Pull thermal fixlet from Daniel Lezcano:
 "A trivial change which fell through the cracks:

  Add Alder Lake support ACPI ids (Srinivas Pandruvada)"

* tag 'thermal-v5.11-2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux:
  thermal: int340x: Support Alder Lake
2020-12-18 12:19:37 -08:00
Boris Protopopov
3970acf7dd SMB3: Add support for getting and setting SACLs
Add SYSTEM_SECURITY access flag and use with smb2 when opening
files for getting/setting SACLs. Add "system.cifs_ntsd_full"
extended attribute to allow user-space access to the functionality.
Avoid multiple server calls when setting owner, DACL, and SACL.

Signed-off-by: Boris Protopopov <pboris@amazon.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2020-12-18 13:25:57 -06:00
Linus Torvalds
a087241716 Merge tag 's390-5.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull more s390 updates from Heiko Carstens:
 "This is mainly to decouple udelay() and arch_cpu_idle() and simplify
  both of them.

  Summary:

   - Always initialize kernel stack backchain when entering the kernel,
     so that unwinding works properly.

   - Fix stack unwinder test case to avoid rare interrupt stack
     corruption.

   - Simplify udelay() and just let it busy loop instead of implementing
     a complex logic.

   - arch_cpu_idle() cleanup.

   - Some other minor improvements"

* tag 's390-5.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/zcrypt: convert comma to semicolon
  s390/idle: allow arch_cpu_idle() to be kprobed
  s390/idle: remove raw_local_irq_save()/restore() from arch_cpu_idle()
  s390/idle: merge enabled_wait() and arch_cpu_idle()
  s390/delay: remove udelay_simple()
  s390/irq: select HAVE_IRQ_EXIT_ON_IRQ_STACK
  s390/delay: simplify udelay
  s390/test_unwind: use timer instead of udelay
  s390/test_unwind: fix CALL_ON_STACK tests
  s390: make calls to TRACE_IRQS_OFF/TRACE_IRQS_ON balanced
  s390: always clear kernel stack backchain before calling functions
2020-12-18 11:08:06 -08:00
Linus Torvalds
5ba836eb9f Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull more arm64 updates from Catalin Marinas:
 "These are some some trivial updates that mostly fix/clean-up code
  pushed during the merging window:

   - Work around broken GCC 4.9 handling of "S" asm constraint

   - Suppress W=1 missing prototype warnings

   - Warn the user when a small VA_BITS value cannot map the available
     memory

   - Drop the useless update to per-cpu cycles"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: Work around broken GCC 4.9 handling of "S" constraint
  arm64: Warn the user when a small VA_BITS value wastes memory
  arm64: entry: suppress W=1 prototype warnings
  arm64: topology: Drop the useless update to per-cpu cycles
2020-12-18 10:57:27 -08:00
Linus Torvalds
e2ae634014 Merge tag 'riscv-for-linus-5.11-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Palmer Dabbelt:
 "We have a handful of new kernel features for 5.11:

   - Support for the contiguous memory allocator.

   - Support for IRQ Time Accounting

   - Support for stack tracing

   - Support for strict /dev/mem

   - Support for kernel section protection

  I'm being a bit conservative on the cutoff for this round due to the
  timing, so this is all the new development I'm going to take for this
  cycle (even if some of it probably normally would have been OK). There
  are, however, some fixes on the list that I will likely be sending
  along either later this week or early next week.

  There is one issue in here: one of my test configurations
  (PREEMPT{,_DEBUG}=y) fails to boot on QEMU 5.0.0 (from April) as of
  the .text.init alignment patch.

  With any luck we'll sort out the issue, but given how many bugs get
  fixed all over the place and how unrelated those features seem my
  guess is that we're just running into something that's been lurking
  for a while and has already been fixed in the newer QEMU (though I
  wouldn't be surprised if it's one of these implicit assumptions we
  have in the boot flow). If it was hardware I'd be strongly inclined to
  look more closely, but given that users can upgrade their simulators
  I'm less worried about it"

* tag 'riscv-for-linus-5.11-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  arm64: Use the generic devmem_is_allowed()
  arm: Use the generic devmem_is_allowed()
  RISC-V: Use the new generic devmem_is_allowed()
  lib: Add a generic version of devmem_is_allowed()
  riscv: Fixed kernel test robot warning
  riscv: kernel: Drop unused clean rule
  riscv: provide memmove implementation
  RISC-V: Move dynamic relocation section under __init
  RISC-V: Protect all kernel sections including init early
  RISC-V: Align the .init.text section
  RISC-V: Initialize SBI early
  riscv: Enable ARCH_STACKWALK
  riscv: Make stack walk callback consistent with generic code
  riscv: Cleanup stacktrace
  riscv: Add HAVE_IRQ_TIME_ACCOUNTING
  riscv: Enable CMA support
  riscv: Ignore Image.* and loader.bin
  riscv: Clean up boot dir
  riscv: Fix compressed Image formats build
  RISC-V: Add kernel image sections to the resource tree
2020-12-18 10:43:07 -08:00
Carsten Haitzler
be3e477eff drm/komeda: Fix bit check to import to value of proper type
KASAN found this problem. find_first_bit() expects to look at a
pointer pointing to a long, but we look at a u32 - this is going to be
an issue with endianness but, KSAN already flags this as out-of-bounds
stack reads. This fixes it by just importing inot a local long.

Signed-off-by: Carsten Haitzler <carsten.haitzler@arm.com>
Acked-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201218150812.68195-1-carsten.haitzler@foss.arm.com
2020-12-18 16:36:00 +00:00
Carsten Haitzler
a24cf238c7 drm/komeda: Handle NULL pointer access code path in error case
komeda_component_get_old_state() technically can return a NULL
pointer. komeda_compiz_set_input() even warns when this happens, but
then proceeeds to use that NULL pointer to compare memory content there
agains the new state to see if it changed. In this case, it's better to
assume that the input changed as there is no old state to compare
against and thus assume the changes happen anyway.

Signed-off-by: Carsten Haitzler <carsten.haitzler@arm.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Acked-by: Liviu Dudau <liviu.dudau@arm.com>
[Applied small spelling fixes and fix suggested by Steven Price]
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201127110054.133686-1-carsten.haitzler@foss.arm.com
2020-12-18 16:35:53 +00:00
Carsten Haitzler
8e8fbfc682 drm/komeda: Remove useless variable assignment
ret is not actually read after this (only written in one case then
returned), so this assign line is useless. This removes that assignment.

Signed-off-by: Carsten Haitzler <carsten.haitzler@arm.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Acked-by: Liviu Dudau <liviu.dudau@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201127110027.133569-1-carsten.haitzler@foss.arm.com
2020-12-18 16:35:48 +00:00
James Qian Wang
4b50126282 drm/komeda: Correct the sequence of hw_done() and flip_done()
Komeda HW has no special, program the update to HW is done first,
then flip happens. So correct the sequence to hw_done() first then
flip_done().

Reported-by: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: James Qian Wang <james.qian.wang@arm.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201119013948.2866343-1-james.qian.wang@arm.com
2020-12-18 16:35:41 +00:00
Takashi Iwai
11cb881bf0 ALSA: pcm: oss: Fix a few more UBSAN fixes
There are a few places that call round{up|down}_pow_of_two() with the
value zero, and this causes undefined behavior warnings.  Avoid
calling those macros if such a nonsense value is passed; it's a minor
optimization as well, as we handle it as either an error or a value to
be skipped, instead.

Reported-by: syzbot+33ef0b6639a8d2d42b4c@syzkaller.appspotmail.com
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20201218161730.26596-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2020-12-18 17:31:02 +01:00
Takashi Iwai
618de0f4ef ALSA: pcm: Clear the full allocated memory at hw_params
The PCM hw_params core function tries to clear up the PCM buffer
before actually using for avoiding the information leak from the
previous usages or the usage before a new allocation.  It performs the
memset() with runtime->dma_bytes, but this might still leave some
remaining bytes untouched; namely, the PCM buffer size is aligned in
page size for mmap, hence runtime->dma_bytes doesn't necessarily cover
all PCM buffer pages, and the remaining bytes are exposed via mmap.

This patch changes the memory clearance to cover the all buffer pages
if the stream is supposed to be mmap-ready (that guarantees that the
buffer size is aligned in page size).

Reviewed-by: Lars-Peter Clausen <lars@metafoo.de>
Link: https://lore.kernel.org/r/20201218145625.2045-3-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2020-12-18 17:09:28 +01:00
Takashi Iwai
5c1733e33c ALSA: memalloc: Align buffer allocations in page size
Currently the standard memory allocator (snd_dma_malloc_pages*())
passes the byte size to allocate as is.  Most of the backends
allocates real pages, hence the actual allocations are aligned in page
size.  However, the genalloc doesn't seem assuring the size alignment,
hence it may result in the access outside the buffer when the whole
memory pages are exposed via mmap.

For avoiding such inconsistencies, this patch makes the allocation
size always to be aligned in page size.

Note that, after this change, snd_dma_buffer.bytes field contains the
aligned size, not the originally requested size.  This value is also
used for releasing the pages in return.

Reviewed-by: Lars-Peter Clausen <lars@metafoo.de>
Link: https://lore.kernel.org/r/20201218145625.2045-2-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2020-12-18 17:09:10 +01:00
Takashi Iwai
9df28edce7 ALSA: usb-audio: Disable sample read check if firmware doesn't give back
Some buggy firmware don't give the current sample rate but leaves
zero.  Handle this case more gracefully without warning but just skip
the current rate verification from the next time.

Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20201218145858.2357-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2020-12-18 17:08:24 +01:00
Lars-Peter Clausen
f2283366c2 ALSA: pcm: Remove snd_pcm_lib_preallocate_dma_free()
Since commit d4cfb30fce ("ALSA: pcm: Set per-card upper limit of PCM
buffer allocations") snd_pcm_lib_preallocate_dma_free() is a single line
function that has one caller, which is another single line function.

Clean this up a bit and remove snd_pcm_lib_preallocate_dma_free() and
directly call do_free_pages() from snd_pcm_lib_preallocate_free(). This is
a bit less boilerplate.

Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Link: https://lore.kernel.org/r/20201218153400.18394-1-lars@metafoo.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2020-12-18 17:07:04 +01:00
Peter Zijlstra
91ea62d58b softirq: Avoid bad tracing / lockdep interaction
Similar to commit:

  1a63dcd876 ("softirq: Reorder trace_softirqs_on to prevent lockdep splat")

__local_bh_enable_ip() can also call into tracing with inconsistent
state. Unlike that commit we don't need to bother about the tracepoint
because 'cnt-1' never matches preempt_count() (by construction).

Reported-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Heiko Carstens <hca@linux.ibm.com>
Link: https://lkml.kernel.org/r/20201218154519.GW3092@hirez.programming.kicks-ass.net
2020-12-18 16:53:13 +01:00
Peter Zijlstra
441fa34097 jump_label/static_call: Add MAINTAINERS
These files don't appear to have a MAINTAINERS entry and as such
patches miss being seen by people who know this code.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Jason Baron <jbaron@akamai.com>
Link: https://lkml.kernel.org/r/20201216133014.GT3092@hirez.programming.kicks-ass.net
2020-12-18 16:53:12 +01:00
Peter Zijlstra
55d2eba8e7 jump_label: Fix usage in module __init
When the static_key is part of the module, and the module calls
static_key_inc/enable() from it's __init section *AND* has a
static_branch_*() user in that very same __init section, things go
wobbly.

If the static_key lives outside the module, jump_label_add_module()
would append this module's sites to the key and jump_label_update()
would take the static_key_linked() branch and all would be fine.

If all the sites are outside of __init, then everything will be fine
too.

However, when all is aligned just as described above,
jump_label_update() calls __jump_label_update(.init = false) and we'll
not update sites in __init text.

Fixes: 1948367768 ("jump_label: Annotate entries that operate on __init code earlier")
Reported-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Jessica Yu <jeyu@kernel.org>
Link: https://lkml.kernel.org/r/20201216135435.GV3092@hirez.programming.kicks-ass.net
2020-12-18 16:53:12 +01:00
Daniel Vetter
4efd7faba5 Merge tag 'drm-intel-next-fixes-2020-12-18' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
drm/i915 fixes for the merge window

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87zh2bp34m.fsf@intel.com
2020-12-18 16:22:10 +01:00
Pavel Begunkov
dfea9fce29 io_uring: close a small race gap for files cancel
The purpose of io_uring_cancel_files() is to wait for all requests
matching ->files to go/be cancelled. We should first drop files of a
request in io_req_drop_files() and only then make it undiscoverable for
io_uring_cancel_files.

First drop, then delete from list. It's ok to leave req->id->files
dangling, because it's not dereferenced by cancellation code, only
compared against. It would potentially go to sleep and be awaken by
following in io_req_drop_files() wake_up().

Fixes: 0f2122045b ("io_uring: don't rely on weak ->files references")
Cc: <stable@vger.kernel.org> # 5.5+
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-12-18 08:16:02 -07:00
Xiaoguang Wang
0020ef04e4 io_uring: fix io_wqe->work_list corruption
For the first time a req punted to io-wq, we'll initialize io_wq_work's
list to be NULL, then insert req to io_wqe->work_list. If this req is not
inserted into tail of io_wqe->work_list, this req's io_wq_work list will
point to another req's io_wq_work. For splitted bio case, this req maybe
inserted to io_wqe->work_list repeatedly, once we insert it to tail of
io_wqe->work_list for the second time, now io_wq_work->list->next will be
invalid pointer, which then result in many strang error, panic, kernel
soft-lockup, rcu stall, etc.

In my vm, kernel doest not have commit cc29e1bf0d ("block: disable
iopoll for split bio"), below fio job can reproduce this bug steadily:
[global]
name=iouring-sqpoll-iopoll-1
ioengine=io_uring
iodepth=128
numjobs=1
thread
rw=randread
direct=1
registerfiles=1
hipri=1
bs=4m
size=100M
runtime=120
time_based
group_reporting
randrepeat=0

[device]
directory=/home/feiman.wxg/mntpoint/  # an ext4 mount point

If we have commit cc29e1bf0d ("block: disable iopoll for split bio"),
there will no splitted bio case for polled io, but I think we still to need
to fix this list corruption, it also should maybe go to stable branchs.

To fix this corruption, if a req is inserted into tail of io_wqe->work_list,
initialize req->io_wq_work->list->next to bu NULL.

Cc: stable@vger.kernel.org
Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-12-18 08:15:10 -07:00
Christian König
f96f62597e drm/qxl: don't allocate a dma_address array
That seems to be unused.

Daniel: Mike reported a warning when booting with qxl, which this
patch fixes:

[    1.815561] WARNING: CPU: 7 PID: 355 at drivers/gpu/drm/ttm/ttm_pool.c:365 ttm_pool_alloc+0x41b/0x540 [ttm]

Signed-off-by: Christian König <christian.koenig@amd.com>
Reported-by: Mike Galbraith <efault@gmx.de>
Tested-by: Mike Galbraith <efault@gmx.de>
References: https://lore.kernel.org/lkml/7cb43d5b-4e6a-defc-1ab6-5f713ad5a963@amd.com/
Reviewed-by: David Airlie <airlied@redhat.com>
[davnet: bring commit message up to par.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20201218134243.110884-1-christian.koenig@amd.com
2020-12-18 15:14:17 +01:00