We don't want to add too much memory when it's not getting onlined
immediately, to avoid running OOM. Generalize the handling, to avoid
making use of memory block states. Use a threshold of 1 GiB for now.
Properly adjust the offline size when adding/removing memory. As we are
not always protected by a lock when touching the offline size, use an
atomic64_t. We don't care about races (e.g., someone offlining memory
while we are adding more), only about consistent values.
(1 GiB needs a memmap of ~16MiB - which sounds reasonable even for
setups with little boot memory and (possibly) one virtio-mem device per
node)
We don't want to retrigger when onlining is caused immediately by our
action (e.g., adding memory which immediately gets onlined), so use a
flag to indicate if the workqueue is active and use that as an
indicator whether to trigger a retry. This will also be especially relevant
for Big Block Mode (BBM), whereby we might re-online memory in case
offlining of another memory block failed.
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20201112133815.13332-17-david@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Let's trigger from offlining code only when we're not allowed to unplug
online memory. Handle the other case (memmap possibly freeing up another
memory block) when actually removing memory. We now also properly handle
the case when removing already offline memory blocks via
virtio_mem_mb_remove(). When removing via virtio_mem_remove(), when
unloading the driver, virtio_mem_retry() is a NOP and safe to use.
While at it, move retry handling when offlining out of
virtio_mem_notify_offline(), to share it with Big Block Mode (BBM)
soon.
This is a preparation for Big Block Mode (BBM), whereby we can see some
temporary offlining of memory blocks without actually making progress.
Imagine you have a Big Block that spans to Linux memory blocks. Assume
the first Linux memory blocks has no unmovable data on it. When we would
call offline_and_remove_memory() on the big block, we would
1. Try to offline the first block. Works, notifiers triggered.
virtio_mem_retry() called.
2. Try to offline the second block. Does not work.
3. Re-online first block.
4. Exit to main loop, exit workqueue.
5. Retry immediately (due to virtio_mem_retry()), go to 1.
The result are endless retries.
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20201112133815.13332-16-david@redhat.com
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Pull xfs updates from Darrick Wong:
"In this release we add the ability to set a 'needsrepair' flag
indicating that we /know/ the filesystem requires xfs_repair, but
other than that, it's the usual strengthening of metadata validation
and miscellaneous cleanups.
Summary:
- Introduce a "needsrepair" "feature" to flag a filesystem as needing
a pass through xfs_repair. This is key to enabling filesystem
upgrades (in xfs_db) that require xfs_repair to make minor
adjustments to metadata.
- Refactor parameter checking of recovered log intent items so that
we actually use the same validation code as them that generate the
intent items.
- Various fixes to online scrub not reacting correctly to directory
entries pointing to inodes that cannot be igetted.
- Refactor validation helpers for data and rt volume extents.
- Refactor XFS_TRANS_DQ_DIRTY out of existence.
- Fix a longstanding bug where mounting with "uqnoenforce" would
start user quotas in non-enforcing mode but /proc/mounts would
display "usrquota", implying that they are being enforced.
- Don't flag dax+reflink inodes as corruption since that is a valid
(but not fully functional) combination right now.
- Clean up raid stripe validation functions.
- Refactor the inode allocation code to be more straightforward.
- Small prep cleanup for idmapping support.
- Get rid of the xfs_buf_t typedef"
* tag 'xfs-5.11-merge-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (40 commits)
xfs: remove xfs_buf_t typedef
fs/xfs: convert comma to semicolon
xfs: open code updating i_mode in xfs_set_acl
xfs: remove xfs_vn_setattr_nonsize
xfs: kill ialloced in xfs_dialloc()
xfs: spilt xfs_dialloc() into 2 functions
xfs: move xfs_dialloc_roll() into xfs_dialloc()
xfs: move on-disk inode allocation out of xfs_ialloc()
xfs: introduce xfs_dialloc_roll()
xfs: convert noroom, okalloc in xfs_dialloc() to bool
xfs: don't catch dax+reflink inodes as corruption in verifier
xfs: fix the forward progress assertion in xfs_iwalk_run_callbacks
xfs: remove unneeded return value check for *init_cursor()
xfs: introduce xfs_validate_stripe_geometry()
xfs: show the proper user quota options
xfs: remove the unused XFS_B_FSB_OFFSET macro
xfs: remove unnecessary null check in xfs_generic_create
xfs: directly return if the delta equal to zero
xfs: check tp->t_dqinfo value instead of the XFS_TRANS_DQ_DIRTY flag
xfs: delete duplicated tp->t_dqinfo null check and allocation
...
Pull ktest updates from Steven Rostedt:
"No new features. Just a couple of fixes that I had in my local
repository that fixed issues with sending the result emails"
* tag 'ktest-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest:
ktest.pl: Fix the logic for truncating the size of the log file for email
ktest.pl: If size of log is too big to email, email error message
Pull more drm updates from Daniel Vetter:
"UAPI Changes:
- Only enable char/agp uapi when CONFIG_DRM_LEGACY is set
Cross-subsystem Changes:
- vma_set_file helper to make vma->vm_file changing less brittle,
acked by Andrew
Core Changes:
- dma-buf heaps improvements
- pass full atomic modeset state to driver callbacks
- shmem helpers: cached bo by default
- cleanups for fbdev, fb-helpers
- better docs for drm modes and SCALING_FITLER uapi
- ttm: fix dma32 page pool regression
Driver Changes:
- multi-hop regression fixes for amdgpu, radeon, nouveau
- lots of small amdgpu hw enabling fixes (display, pm, ...)
- fixes for imx, mcde, meson, some panels, virtio, qxl, i915, all
fairly minor
- some cleanups for legacy drm/fbdev drivers"
* tag 'drm-next-2020-12-18' of git://anongit.freedesktop.org/drm/drm: (117 commits)
drm/qxl: don't allocate a dma_address array
drm/nouveau: fix multihop when move doesn't work.
drm/i915/tgl: Fix REVID macros for TGL to fetch correct stepping
drm/i915: Fix mismatch between misplaced vma check and vma insert
drm/i915/perf: also include Gen11 in OATAILPTR workaround
Revert "drm/i915: re-order if/else ladder for hpd_irq_setup"
drm/amdgpu/disply: fix documentation warnings in display manager
drm/amdgpu: print mmhub client name for dimgrey_cavefish
drm/amdgpu: set mode1 reset as default for dimgrey_cavefish
drm/amd/display: Add get_dig_frontend implementation for DCEx
drm/radeon: remove h from printk format specifier
drm/amdgpu: remove h from printk format specifier
drm/amdgpu: Fix spelling mistake "Heterogenous" -> "Heterogeneous"
drm/amdgpu: fix regression in vbios reservation handling on headless
drm/amdgpu/SRIOV: Extend VF reset request wait period
drm/amdkfd: correct amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu log.
drm/amd/display: Adding prototype for dccg21_update_dpp_dto()
drm/amdgpu: print what method we are using for runtime pm
drm/amdgpu: simplify logic in atpx resume handling
drm/amdgpu: no need to call pci_ignore_hotplug for _PR3
...
To pick the changes in:
e7b6385b01 ("x86/cpufeatures: Add Intel SGX hardware bits")
That causes only these 'perf bench' objects to rebuild:
CC /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
CC /tmp/build/perf/bench/mem-memset-x86-64-asm.o
And addresses these perf build warnings:
Warning: Kernel ABI header at 'tools/arch/x86/include/asm/disabled-features.h' differs from latest version at 'arch/x86/include/asm/disabled-features.h'
diff -u tools/arch/x86/include/asm/disabled-features.h arch/x86/include/asm/disabled-features.h
Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h'
diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Sean Christopherson <seanjc@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick a new prctl introduced in:
1446e1df9e ("kernel: Implement selective syscall userspace redirection")
That results in:
$ tools/perf/trace/beauty/prctl_option.sh > before
$ cp include/uapi/linux/prctl.h tools/include/uapi/linux/prctl.h
$ tools/perf/trace/beauty/prctl_option.sh > after
$ diff -u before after
--- before 2020-12-17 15:00:42.012537367 -0300
+++ after 2020-12-17 15:00:49.832699463 -0300
@@ -53,6 +53,7 @@
[56] = "GET_TAGGED_ADDR_CTRL",
[57] = "SET_IO_FLUSHER",
[58] = "GET_IO_FLUSHER",
+ [59] = "SET_SYSCALL_USER_DISPATCH",
};
static const char *prctl_set_mm_options[] = {
[1] = "START_CODE",
$
Now users can do:
# perf trace -e syscalls:sys_enter_prctl --filter "option==SET_SYSCALL_USER_DISPATCH"
^C#
# trace -v -e syscalls:sys_enter_prctl --filter "option==SET_SYSCALL_USER_DISPATCH"
New filter for syscalls:sys_enter_prctl: (option==0x3b) && (common_pid != 5519 && common_pid != 3404)
^C#
And also when prctl appears in a session, its options will be
translated to the string.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Gabriel Krisman Bertazi <krisman@collabora.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick the changes from:
3ceb6543e9 ("fscrypt: remove kernel-internal constants from UAPI header")
That don't result in any changes in tooling, just addressing this perf
build warning:
Warning: Kernel ABI header at 'tools/include/uapi/linux/fscrypt.h' differs from latest version at 'include/uapi/linux/fscrypt.h'
diff -u tools/include/uapi/linux/fscrypt.h include/uapi/linux/fscrypt.h
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Eric Biggers <ebiggers@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick up the changes in:
a85cbe6159 ("uapi: move constants from <linux/kernel.h> to <linux/const.h>")
That causes no changes in tooling, just addresses this perf build
warning:
Warning: Kernel ABI header at 'tools/include/uapi/linux/const.h' differs from latest version at 'include/uapi/linux/const.h'
diff -u tools/include/uapi/linux/const.h include/uapi/linux/const.h
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Petr Vorel <petr.vorel@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This just triggers the rebuilding of the syscall beautifiers that
extract patterns from this file due to this cset:
b713c195d5 ("net: provide __sys_shutdown_sock() that takes a socket")
After updating it:
CC /tmp/build/perf/trace/beauty/sockaddr.o
Addressing this perf build warning:
Warning: Kernel ABI header at 'tools/perf/trace/beauty/include/linux/socket.h' differs from latest version at 'include/linux/socket.h'
diff -u tools/perf/trace/beauty/include/linux/socket.h include/linux/socket.h
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick up the changes in:
caabdd0f59 ("ctype.h: remove duplicate isdigit() helper")
Addressing this perf build warning:
Warning: Kernel ABI header at 'tools/include/linux/ctype.h' differs from latest version at 'include/linux/ctype.h'
diff -u tools/include/linux/ctype.h include/linux/ctype.h
And we need to continue using the combination of:
inline __isdigit()
#define isdigit() __isdigit
When the __has_builtin() thing isn't available, as it is a builtin in
older systems with it as a builtin but with compilers not hacinv
__has_builtin(), rendering the __has_builtin() check useless otherwise.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We're cherry picking stuff from the kernel to allow for the other
headers that we keep in sync via tools/perf/check-headers.sh to work,
so introduce linux/compiler_types.h and from there get the compiler
specific stuff.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull thermal fixlet from Daniel Lezcano:
"A trivial change which fell through the cracks:
Add Alder Lake support ACPI ids (Srinivas Pandruvada)"
* tag 'thermal-v5.11-2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux:
thermal: int340x: Support Alder Lake
Add SYSTEM_SECURITY access flag and use with smb2 when opening
files for getting/setting SACLs. Add "system.cifs_ntsd_full"
extended attribute to allow user-space access to the functionality.
Avoid multiple server calls when setting owner, DACL, and SACL.
Signed-off-by: Boris Protopopov <pboris@amazon.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Pull more s390 updates from Heiko Carstens:
"This is mainly to decouple udelay() and arch_cpu_idle() and simplify
both of them.
Summary:
- Always initialize kernel stack backchain when entering the kernel,
so that unwinding works properly.
- Fix stack unwinder test case to avoid rare interrupt stack
corruption.
- Simplify udelay() and just let it busy loop instead of implementing
a complex logic.
- arch_cpu_idle() cleanup.
- Some other minor improvements"
* tag 's390-5.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/zcrypt: convert comma to semicolon
s390/idle: allow arch_cpu_idle() to be kprobed
s390/idle: remove raw_local_irq_save()/restore() from arch_cpu_idle()
s390/idle: merge enabled_wait() and arch_cpu_idle()
s390/delay: remove udelay_simple()
s390/irq: select HAVE_IRQ_EXIT_ON_IRQ_STACK
s390/delay: simplify udelay
s390/test_unwind: use timer instead of udelay
s390/test_unwind: fix CALL_ON_STACK tests
s390: make calls to TRACE_IRQS_OFF/TRACE_IRQS_ON balanced
s390: always clear kernel stack backchain before calling functions
Pull more arm64 updates from Catalin Marinas:
"These are some some trivial updates that mostly fix/clean-up code
pushed during the merging window:
- Work around broken GCC 4.9 handling of "S" asm constraint
- Suppress W=1 missing prototype warnings
- Warn the user when a small VA_BITS value cannot map the available
memory
- Drop the useless update to per-cpu cycles"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: Work around broken GCC 4.9 handling of "S" constraint
arm64: Warn the user when a small VA_BITS value wastes memory
arm64: entry: suppress W=1 prototype warnings
arm64: topology: Drop the useless update to per-cpu cycles
Pull RISC-V updates from Palmer Dabbelt:
"We have a handful of new kernel features for 5.11:
- Support for the contiguous memory allocator.
- Support for IRQ Time Accounting
- Support for stack tracing
- Support for strict /dev/mem
- Support for kernel section protection
I'm being a bit conservative on the cutoff for this round due to the
timing, so this is all the new development I'm going to take for this
cycle (even if some of it probably normally would have been OK). There
are, however, some fixes on the list that I will likely be sending
along either later this week or early next week.
There is one issue in here: one of my test configurations
(PREEMPT{,_DEBUG}=y) fails to boot on QEMU 5.0.0 (from April) as of
the .text.init alignment patch.
With any luck we'll sort out the issue, but given how many bugs get
fixed all over the place and how unrelated those features seem my
guess is that we're just running into something that's been lurking
for a while and has already been fixed in the newer QEMU (though I
wouldn't be surprised if it's one of these implicit assumptions we
have in the boot flow). If it was hardware I'd be strongly inclined to
look more closely, but given that users can upgrade their simulators
I'm less worried about it"
* tag 'riscv-for-linus-5.11-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
arm64: Use the generic devmem_is_allowed()
arm: Use the generic devmem_is_allowed()
RISC-V: Use the new generic devmem_is_allowed()
lib: Add a generic version of devmem_is_allowed()
riscv: Fixed kernel test robot warning
riscv: kernel: Drop unused clean rule
riscv: provide memmove implementation
RISC-V: Move dynamic relocation section under __init
RISC-V: Protect all kernel sections including init early
RISC-V: Align the .init.text section
RISC-V: Initialize SBI early
riscv: Enable ARCH_STACKWALK
riscv: Make stack walk callback consistent with generic code
riscv: Cleanup stacktrace
riscv: Add HAVE_IRQ_TIME_ACCOUNTING
riscv: Enable CMA support
riscv: Ignore Image.* and loader.bin
riscv: Clean up boot dir
riscv: Fix compressed Image formats build
RISC-V: Add kernel image sections to the resource tree
komeda_component_get_old_state() technically can return a NULL
pointer. komeda_compiz_set_input() even warns when this happens, but
then proceeeds to use that NULL pointer to compare memory content there
agains the new state to see if it changed. In this case, it's better to
assume that the input changed as there is no old state to compare
against and thus assume the changes happen anyway.
Signed-off-by: Carsten Haitzler <carsten.haitzler@arm.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Acked-by: Liviu Dudau <liviu.dudau@arm.com>
[Applied small spelling fixes and fix suggested by Steven Price]
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201127110054.133686-1-carsten.haitzler@foss.arm.com
The PCM hw_params core function tries to clear up the PCM buffer
before actually using for avoiding the information leak from the
previous usages or the usage before a new allocation. It performs the
memset() with runtime->dma_bytes, but this might still leave some
remaining bytes untouched; namely, the PCM buffer size is aligned in
page size for mmap, hence runtime->dma_bytes doesn't necessarily cover
all PCM buffer pages, and the remaining bytes are exposed via mmap.
This patch changes the memory clearance to cover the all buffer pages
if the stream is supposed to be mmap-ready (that guarantees that the
buffer size is aligned in page size).
Reviewed-by: Lars-Peter Clausen <lars@metafoo.de>
Link: https://lore.kernel.org/r/20201218145625.2045-3-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Currently the standard memory allocator (snd_dma_malloc_pages*())
passes the byte size to allocate as is. Most of the backends
allocates real pages, hence the actual allocations are aligned in page
size. However, the genalloc doesn't seem assuring the size alignment,
hence it may result in the access outside the buffer when the whole
memory pages are exposed via mmap.
For avoiding such inconsistencies, this patch makes the allocation
size always to be aligned in page size.
Note that, after this change, snd_dma_buffer.bytes field contains the
aligned size, not the originally requested size. This value is also
used for releasing the pages in return.
Reviewed-by: Lars-Peter Clausen <lars@metafoo.de>
Link: https://lore.kernel.org/r/20201218145625.2045-2-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Since commit d4cfb30fce ("ALSA: pcm: Set per-card upper limit of PCM
buffer allocations") snd_pcm_lib_preallocate_dma_free() is a single line
function that has one caller, which is another single line function.
Clean this up a bit and remove snd_pcm_lib_preallocate_dma_free() and
directly call do_free_pages() from snd_pcm_lib_preallocate_free(). This is
a bit less boilerplate.
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Link: https://lore.kernel.org/r/20201218153400.18394-1-lars@metafoo.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
When the static_key is part of the module, and the module calls
static_key_inc/enable() from it's __init section *AND* has a
static_branch_*() user in that very same __init section, things go
wobbly.
If the static_key lives outside the module, jump_label_add_module()
would append this module's sites to the key and jump_label_update()
would take the static_key_linked() branch and all would be fine.
If all the sites are outside of __init, then everything will be fine
too.
However, when all is aligned just as described above,
jump_label_update() calls __jump_label_update(.init = false) and we'll
not update sites in __init text.
Fixes: 1948367768 ("jump_label: Annotate entries that operate on __init code earlier")
Reported-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Jessica Yu <jeyu@kernel.org>
Link: https://lkml.kernel.org/r/20201216135435.GV3092@hirez.programming.kicks-ass.net
The purpose of io_uring_cancel_files() is to wait for all requests
matching ->files to go/be cancelled. We should first drop files of a
request in io_req_drop_files() and only then make it undiscoverable for
io_uring_cancel_files.
First drop, then delete from list. It's ok to leave req->id->files
dangling, because it's not dereferenced by cancellation code, only
compared against. It would potentially go to sleep and be awaken by
following in io_req_drop_files() wake_up().
Fixes: 0f2122045b ("io_uring: don't rely on weak ->files references")
Cc: <stable@vger.kernel.org> # 5.5+
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
For the first time a req punted to io-wq, we'll initialize io_wq_work's
list to be NULL, then insert req to io_wqe->work_list. If this req is not
inserted into tail of io_wqe->work_list, this req's io_wq_work list will
point to another req's io_wq_work. For splitted bio case, this req maybe
inserted to io_wqe->work_list repeatedly, once we insert it to tail of
io_wqe->work_list for the second time, now io_wq_work->list->next will be
invalid pointer, which then result in many strang error, panic, kernel
soft-lockup, rcu stall, etc.
In my vm, kernel doest not have commit cc29e1bf0d ("block: disable
iopoll for split bio"), below fio job can reproduce this bug steadily:
[global]
name=iouring-sqpoll-iopoll-1
ioengine=io_uring
iodepth=128
numjobs=1
thread
rw=randread
direct=1
registerfiles=1
hipri=1
bs=4m
size=100M
runtime=120
time_based
group_reporting
randrepeat=0
[device]
directory=/home/feiman.wxg/mntpoint/ # an ext4 mount point
If we have commit cc29e1bf0d ("block: disable iopoll for split bio"),
there will no splitted bio case for polled io, but I think we still to need
to fix this list corruption, it also should maybe go to stable branchs.
To fix this corruption, if a req is inserted into tail of io_wqe->work_list,
initialize req->io_wq_work->list->next to bu NULL.
Cc: stable@vger.kernel.org
Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>