Include all necessary headers in header files to enable third party
applications, like LSP servers, to resolve all used symbols.
ibpkey.h: include "flask.h" for SECINITSID_UNLABELED
initial_sid_to_string.h: include <linux/stddef.h> for NULL
Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Commit 53f3517ae0 ("selinux: do not leave dangling pointer behind")
reset the `str` field of the `context` struct in an OOM error branch.
In this struct the fields `str` and `len` are coupled and should be kept
in sync. Set the length to zero according to the string be set to NULL.
Fixes: 53f3517ae0 ("selinux: do not leave dangling pointer behind")
Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Newly added subflows should inherit the LSM label from the associated
MPTCP socket regardless of the current context.
This patch implements the above copying sid and class from the MPTCP
socket context, deleting the existing subflow label, if any, and then
re-creating the correct one.
The new helper reuses the selinux_netlbl_sk_security_free() function,
and the latter can end-up being called multiple times with the same
argument; we additionally need to make it idempotent.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paul Moore <paul@paul-moore.com>
MPTCP can create subflows in kernel context, and later indirectly
expose them to user-space, via the owning MPTCP socket.
As discussed in the reported link, the above causes unexpected failures
for server, MPTCP-enabled applications.
Let's introduce a new LSM hook to allow the security module to relabel
the subflow according to the owning user-space process, via the MPTCP
socket owning the subflow.
Note that the new hook requires both the MPTCP socket and the new
subflow. This could allow future extensions, e.g. explicitly validating
the MPTCP <-> subflow linkage.
Link: https://lore.kernel.org/mptcp/CAHC9VhTNh-YwiyTds=P1e3rixEDqbRTFj22bpya=+qJqfcaMfg@mail.gmail.com/
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paul Moore <paul@paul-moore.com>
A few small tweaks to selinux_audit_rule_init():
- Adjust how we use the @rc variable so we are not doing any extra
work in the common/success case.
- Related to the above, rework the 'out' jump label so that the
success and error paths are different, simplifying both.
- Cleanup some of the vertical whitespace while we are making the
other changes.
Signed-off-by: Paul Moore <paul@paul-moore.com>
The array of mount tokens in only used in match_opt_prefix() and never
modified.
The array of symtab names is never modified and only used in the
DEBUG_HASHES configuration as output.
The array of files for the SElinux filesystem sub-directory `ss` is
similar to the other `struct tree_descr` usages only read from to
construct the containing entries.
Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
The second parameter `tag` of avtab_hash_eval() is only used for
printing. In policydb_index() it is called with a string literal:
avtab_hash_eval(&p->te_avtab, "rules");
Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
[PM: slight formatting tweak in description]
Signed-off-by: Paul Moore <paul@paul-moore.com>
Commit 539813e418 ("selinux: stop returning node from avc_insert()")
converted the return value of avc_insert() to void but left the now
unnecessary trailing return statement.
Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
Since commit f22f9aaf6c ("selinux: remove the runtime disable
functionality") the function avc_disable() is no longer used.
Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
In case mls_context_cpy() fails due to OOM set the free'd pointer in
context_cpy() to NULL to avoid it potentially being dereferenced or
free'd again in future. Freeing a NULL pointer is well-defined and a
hard NULL dereference crash is at least not exploitable and should give
a workable stack trace.
Fixes: 12b29f3455 ("selinux: support deferred mapping of contexts")
Signed-off-by: Christian Göttsche <cgzones@googlemail.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
A few small tweaks to improve the SELinux Makefile:
- Define a new variable, 'genhdrs', to represent both flask.h and
av_permissions.h; this should help ensure consistent processing for
both generated headers.
- Move the 'ccflags-y' variable closer to the top, just after the
main 'obj-$(CONFIG_SECURITY_SELINUX)' definition to make it more
visible and improve the grouping in the Makefile.
- Rework some of the vertical whitespace to improve some of the
grouping in the Makefile.
Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
We need to better polish building with BPF skels, so revert back to
making it an experimental feature that has to be explicitely enabled
using BUILD_BPF_SKEL=1.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCZFbCXwAKCRCyPKLppCJ+
J7cHAP97erKY4hBXArjpfzcvpFmboh/oqhbTLntyIpS6TEnOyQEAyervAPGIjQYC
DCo4foyXmOWn3dhNtK9M+YiRl3o2SgQ=
=7G78
-----END PGP SIGNATURE-----
Merge tag 'perf-tools-for-v6.4-3-2023-05-06' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tool updates from Arnaldo Carvalho de Melo:
"Third version of perf tool updates, with the build problems with with
using a 'vmlinux.h' generated from the main build fixed, and the bpf
skeleton build disabled by default.
Build:
- Require libtraceevent to build, one can disable it using
NO_LIBTRACEEVENT=1.
It is required for tools like 'perf sched', 'perf kvm', 'perf
trace', etc.
libtraceevent is available in most distros so installing
'libtraceevent-devel' should be a one-time event to continue
building perf as usual.
Using NO_LIBTRACEEVENT=1 produces tooling that is functional and
sufficient for lots of users not interested in those libtraceevent
dependent features.
- Allow Python support in 'perf script' when libtraceevent isn't
linked, as not all features requires it, for instance Intel PT does
not use tracepoints.
- Error if the python interpreter needed for jevents to work isn't
available and NO_JEVENTS=1 isn't set, preventing a build without
support for JSON vendor events, which is a rare but possible
condition. The two check error messages:
$(error ERROR: No python interpreter needed for jevents generation. Install python or build with NO_JEVENTS=1.)
$(error ERROR: Python interpreter needed for jevents generation too old (older than 3.6). Install a newer python or build with NO_JEVENTS=1.)
- Make libbpf 1.0 the minimum required when building with out of
tree, distro provided libbpf.
- Use libsdtc++'s and LLVM's libcxx's __cxa_demangle, a portable C++
demangler, add 'perf test' entry for it.
- Make binutils libraries opt in, as distros disable building with it
due to licensing, they were used for C++ demangling, for instance.
- Switch libpfm4 to opt-out rather than opt-in, if libpfm-devel (or
equivalent) isn't installed, we'll just have a build warning:
Makefile.config:1144: libpfm4 not found, disables libpfm4 support. Please install libpfm4-dev
- Add a feature test for scandirat(), that is not implemented so far
in musl and uclibc, disabling features that need it, such as
scanning for tracepoints in /sys/kernel/tracing/events.
perf BPF filters:
- New feature where BPF can be used to filter samples, for instance:
$ sudo ./perf record -e cycles --filter 'period > 1000' true
$ sudo ./perf script
perf-exec 2273949 546850.708501: 5029 cycles: ffffffff826f9e25 finish_wait+0x5 ([kernel.kallsyms])
perf-exec 2273949 546850.708508: 32409 cycles: ffffffff826f9e25 finish_wait+0x5 ([kernel.kallsyms])
perf-exec 2273949 546850.708526: 143369 cycles: ffffffff82b4cdbf xas_start+0x5f ([kernel.kallsyms])
perf-exec 2273949 546850.708600: 372650 cycles: ffffffff8286b8f7 __pagevec_lru_add+0x117 ([kernel.kallsyms])
perf-exec 2273949 546850.708791: 482953 cycles: ffffffff829190de __mod_memcg_lruvec_state+0x4e ([kernel.kallsyms])
true 2273949 546850.709036: 501985 cycles: ffffffff828add7c tlb_gather_mmu+0x4c ([kernel.kallsyms])
true 2273949 546850.709292: 503065 cycles: 7f2446d97c03 _dl_map_object_deps+0x973 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)
- In addition to 'period' (PERF_SAMPLE_PERIOD), the other
PERF_SAMPLE_ can be used for filtering, and also some other sample
accessible values, from tools/perf/Documentation/perf-record.txt:
Essentially the BPF filter expression is:
<term> <operator> <value> (("," | "||") <term> <operator> <value>)*
The <term> can be one of:
ip, id, tid, pid, cpu, time, addr, period, txn, weight, phys_addr,
code_pgsz, data_pgsz, weight1, weight2, weight3, ins_lat, retire_lat,
p_stage_cyc, mem_op, mem_lvl, mem_snoop, mem_remote, mem_lock,
mem_dtlb, mem_blk, mem_hops
The <operator> can be one of:
==, !=, >, >=, <, <=, &
The <value> can be one of:
<number> (for any term)
na, load, store, pfetch, exec (for mem_op)
l1, l2, l3, l4, cxl, io, any_cache, lfb, ram, pmem (for mem_lvl)
na, none, hit, miss, hitm, fwd, peer (for mem_snoop)
remote (for mem_remote)
na, locked (for mem_locked)
na, l1_hit, l1_miss, l2_hit, l2_miss, any_hit, any_miss, walk, fault (for mem_dtlb)
na, by_data, by_addr (for mem_blk)
hops0, hops1, hops2, hops3 (for mem_hops)
perf lock contention:
- Show lock type with address.
- Track and show mmap_lock, siglock and per-cpu rq_lock with address.
This is done for mmap_lock by following the current->mm pointer:
$ sudo ./perf lock con -abl -- sleep 10
contended total wait max wait avg wait address symbol
...
16344 312.30 ms 2.22 ms 19.11 us ffff8cc702595640
17686 310.08 ms 1.49 ms 17.53 us ffff8cc7025952c0
3 84.14 ms 45.79 ms 28.05 ms ffff8cc78114c478 mmap_lock
3557 76.80 ms 68.75 us 21.59 us ffff8cc77ca3af58
1 68.27 ms 68.27 ms 68.27 ms ffff8cda745dfd70
9 54.53 ms 7.96 ms 6.06 ms ffff8cc7642a48b8 mmap_lock
14629 44.01 ms 60.00 us 3.01 us ffff8cc7625f9ca0
3481 42.63 ms 140.71 us 12.24 us ffffffff937906ac vmap_area_lock
16194 38.73 ms 42.15 us 2.39 us ffff8cd397cbc560
11 38.44 ms 10.39 ms 3.49 ms ffff8ccd6d12fbb8 mmap_lock
1 5.43 ms 5.43 ms 5.43 ms ffff8cd70018f0d8
1674 5.38 ms 422.93 us 3.21 us ffffffff92e06080 tasklist_lock
581 4.51 ms 130.68 us 7.75 us ffff8cc9b1259058
5 3.52 ms 1.27 ms 703.23 us ffff8cc754510070
112 3.47 ms 56.47 us 31.02 us ffff8ccee38b3120
381 3.31 ms 73.44 us 8.69 us ffffffff93790690 purge_vmap_area_lock
255 3.19 ms 36.35 us 12.49 us ffff8d053ce30c80
- Update default map size to 16384.
- Allocate single letter option -M for --map-nr-entries, as it is
proving being frequently used.
- Fix struct rq lock access for older kernels with BPF's CO-RE
(Compile once, run everywhere).
- Fix problems found with MSAn.
perf report/top:
- Add inline information when using --call-graph=fp or lbr, as was
already done to the --call-graph=dwarf callchain mode.
- Improve the 'srcfile' sort key performance by really using an
optimization introduced in 6.2 for the 'srcline' sort key that
avoids calling addr2line for comparision with each sample.
perf sched:
- Make 'perf sched latency/map/replay' to use "sched:sched_waking"
instead of "sched:sched_waking", consistent with 'perf record'
since d566a9c2d4 ("perf sched: Prefer sched_waking event when it
exists").
perf ftrace:
- Make system wide the default target for latency subcommand, run the
following command then generate some network traffic and press
control+C:
# perf ftrace latency -T __kfree_skb
^C
DURATION | COUNT | GRAPH |
0 - 1 us | 27 | ############# |
1 - 2 us | 22 | ########### |
2 - 4 us | 8 | #### |
4 - 8 us | 5 | ## |
8 - 16 us | 24 | ############ |
16 - 32 us | 2 | # |
32 - 64 us | 1 | |
64 - 128 us | 0 | |
128 - 256 us | 0 | |
256 - 512 us | 0 | |
512 - 1024 us | 0 | |
1 - 2 ms | 0 | |
2 - 4 ms | 0 | |
4 - 8 ms | 0 | |
8 - 16 ms | 0 | |
16 - 32 ms | 0 | |
32 - 64 ms | 0 | |
64 - 128 ms | 0 | |
128 - 256 ms | 0 | |
256 - 512 ms | 0 | |
512 - 1024 ms | 0 | |
1 - ... s | 0 | |
#
perf top:
- Add --branch-history (LBR: Last Branch Record) option, just like
already available for 'perf record'.
- Fix segfault in thread__comm_len() where thread->comm was being
used outside thread->comm_lock.
perf annotate:
- Allow configuring objdump and addr2line in ~/.perfconfig., so that
you can use alternative binaries, such as llvm's.
perf kvm:
- Add TUI mode for 'perf kvm stat report'.
Reference counting:
- Add reference count checking infrastructure to check for use after
free, done to the 'cpumap', 'namespaces', 'maps' and 'map' structs,
more to come.
To build with it use -DREFCNT_CHECKING=1 in the make command line
to build tools/perf. Documented at:
https://perf.wiki.kernel.org/index.php/Reference_Count_Checking
- The above caught, for instance, fix, present in this series:
- Fix maps use after put in 'perf test "Share thread maps"':
'maps' is copied from leader, but the leader is put on line 79
and then 'maps' is used to read the reference count below - so
a use after put, with the put of maps happening within
thread__put.
Fixed by reversing the order of puts so that the leader is put
last.
- Also several fixes were made to places where reference counts were
not being held.
- Make this one of the tests in 'make -C tools/perf build-test' to
regularly build test it and to make sure no direct access to the
reference counted structs are made, doing that via accessors to
check the validity of the struct pointer.
ARM64:
- Fix 'perf report' segfault when filtering coresight traces by
sparse lists of CPUs.
- Add support for 'simd' as a sort field for 'perf report', to show
ARM's NEON SIMD's predicate flags: "partial" and "empty".
arm64 vendor events:
- Add N1 metrics.
Intel vendor events:
- Add graniterapids, grandridge and sierraforrest events.
- Refresh events for: alderlake, aldernaken, broadwell, broadwellde,
broadwellx, cascadelakx, haswell, haswellx, icelake, icelakex,
jaketown, meteorlake, knightslanding, sandybridge, sapphirerapids,
silvermont, skylake, tigerlake and westmereep-dp
- Refresh metrics for alderlake-n, broadwell, broadwellde,
broadwellx, haswell, haswellx, icelakex, ivybridge, ivytown and
skylakex.
perf stat:
- Implement --topdown using JSON metrics.
- Add TopdownL1 JSON metric as a default if present, but disable it
for now for some Intel hybrid architectures, a series of patches
addressing this is being reviewed and will be submitted for v6.5.
- Use metrics for --smi-cost.
- Update topdown documentation.
Vendor events (JSON) infrastructure:
- Add support for computing and printing metric threshold values. For
instance, here is one found in thesapphirerapids json file:
{
"BriefDescription": "Percentage of cycles spent in System Management Interrupts.",
"MetricExpr": "((msr@aperf@ - cycles) / msr@aperf@ if msr@smi@ > 0 else 0)",
"MetricGroup": "smi",
"MetricName": "smi_cycles",
"MetricThreshold": "smi_cycles > 0.1",
"ScaleUnit": "100%"
},
- Test parsing metric thresholds with the fake PMU in 'perf test
pmu-events'.
- Support for printing metric thresholds in 'perf list'.
- Add --metric-no-threshold option to 'perf stat'.
- Add rand (reverse and) and has_pmem (optane memory) support to
metrics.
- Sort list of input files to avoid depending on the order from
readdir() helping in obtaining reproducible builds.
S/390:
- Add common metrics: - CPI (cycles per instruction), prbstate (ratio
of instructions executed in problem state compared to total number
of instructions), l1mp (Level one instruction and data cache misses
per 100 instructions).
- Add cache metrics for z13, z14, z15 and z16.
- Add metric for TLB and cache.
ARM:
- Add raw decoding for SPE (Statistical Profiling Extension) v1.3 MTE
(Memory Tagging Extension) and MOPS (Memory Operations) load/store.
Intel PT hardware tracing:
- Add event type names UINTR (User interrupt delivered) and UIRET
(Exiting from user interrupt routine), documented in table 32-50
"CFE Packet Type and Vector Fields Details" in the Intel Processor
Trace chapter of The Intel SDM Volume 3 version 078.
- Add support for new branch instructions ERETS and ERETU.
- Fix CYC timestamps after standalone CBR
ARM CoreSight hardware tracing:
- Allow user to override timestamp and contextid settings.
- Fix segfault in dso lookup.
- Fix timeless decode mode detection.
- Add separate decode paths for timeless and per-thread modes.
auxtrace:
- Fix address filter entire kernel size.
Miscellaneous:
- Fix use-after-free and unaligned bugs in the PLT handling routines.
- Use zfree() to reduce chances of use after free.
- Add missing 0x prefix for addresses printed in hexadecimal in 'perf
probe'.
- Suppress massive unsupported target platform errors in the unwind
code.
- Fix return incorrect build_id size in elf_read_build_id().
- Fix 'perf scripts intel-pt-events.py' IPC output for Python 2 .
- Add missing new parameter in kfree_skb tracepoint to the python
scripts using it.
- Add 'perf bench syscall fork' benchmark.
- Add support for printing PERF_MEM_LVLNUM_UNC (Uncached access) in
'perf mem'.
- Fix wrong size expectation for perf test 'Setup struct
perf_event_attr' caused by the patch adding
perf_event_attr::config3.
- Fix some spelling mistakes"
* tag 'perf-tools-for-v6.4-3-2023-05-06' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (365 commits)
Revert "perf build: Make BUILD_BPF_SKEL default, rename to NO_BPF_SKEL"
Revert "perf build: Warn for BPF skeletons if endian mismatches"
perf metrics: Fix SEGV with --for-each-cgroup
perf bpf skels: Stop using vmlinux.h generated from BTF, use subset of used structs + CO-RE
perf stat: Separate bperf from bpf_profiler
perf test record+probe_libc_inet_pton: Fix call chain match on x86_64
perf test record+probe_libc_inet_pton: Fix call chain match on s390
perf tracepoint: Fix memory leak in is_valid_tracepoint()
perf cs-etm: Add fix for coresight trace for any range of CPUs
perf build: Fix unescaped # in perf build-test
perf unwind: Suppress massive unsupported target platform errors
perf script: Add new parameter in kfree_skb tracepoint to the python scripts using it
perf script: Print raw ip instead of binary offset for callchain
perf symbols: Fix return incorrect build_id size in elf_read_build_id()
perf list: Modify the warning message about scandirat(3)
perf list: Fix memory leaks in print_tracepoint_events()
perf lock contention: Rework offset calculation with BPF CO-RE
perf lock contention: Fix struct rq lock access
perf stat: Disable TopdownL1 on hybrid
perf stat: Avoid SEGV on counter->name
...
The recent fix to ensure atomicity of lookup and allocation inadvertently
broke the pool refill mechanism, so that debugobject OOMs now in certain
situations. The reason is that the functions which got updated no longer
invoke debug_objecs_init(), which is now the only place to care about
refilling the tracking object pool.
Restore the original behaviour by adding explicit refill opportunities to
those places.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmRWoFATHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYocNID/9e1fU2Nf32woHokzBGgARKb69Kl/hb
6yVdMpOnZtxmluheJLnqCWI4WbAB6NjulEMFv+KkwRZ+QndBKVEo8NMZ9RjbXDBb
HEehI6DvsqRDjaytOLEZj+/8afcZ7bUBKk7JuUK+y5B1gZViazfp1eF3hpiKsIV9
aowpH6c9lL/9sPgFe2qpp21MUmNTUQbHpz0vbYC0QjqSEU2zTlu8p//P6VLA3xpl
qoh8Gu5qo/L8lPspN2v8TRVXdiqH67J+KpbGO9IuUQWYPQqFdc6WchhHwomAk8nr
Nyn9Q1Lred96pTdW3B0Cumnxuf0VPt4X/uQxPSP0kCo/h0Q0Mh6fq59Z66H/Mhjk
TAvM52w3VzfTmQB6WgaCD1HyRRqIK5Nd+XqXnenCkHN4kjmGXNLg9MUGxua5CVgF
iQTSRYtN18rF9OevDOFGzsEig2RN1JFi9MnJg9Q/L8SoDUn5ZUfhPaSA/HcOBnSe
m+9aeRxlb0hAP7+upFKsJkDYzJTtbP6LSx6qqZMyQWqYdsUVHpdiPtJpXb7mLIqQ
wo83i/Ohq8+dF6ykd89ZcKJ8vLBrnE1rPFKKmvS5ov1eRt/hZbtR3tmMviCNna0M
2nrJE2fKClbs8Dmc6NNboJdz51ASgZEi32XmdFkATiuZqiD1id7ne0f85ju7DHD9
sOjfo4ZtIKD/Fw==
=/0Kc
-----END PGP SIGNATURE-----
Merge tag 'core-debugobjects-2023-05-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull debugobjects fix from Thomas Gleixner:
"A single fix for debugobjects:
The recent fix to ensure atomicity of lookup and allocation
inadvertently broke the pool refill mechanism, so that debugobject
OOMs now in certain situations. The reason is that the functions which
got updated no longer invoke debug_objecs_init(), which is now the
only place to care about refilling the tracking object pool.
Restore the original behaviour by adding explicit refill opportunities
to those places"
* tag 'core-debugobjects-2023-05-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
debugobject: Ensure pool refill (again)
- A long-standing bug in crypto_engine.
- A buggy but harmless check in the sun8i-ss driver.
- A regression in the CRYPTO_USER interface.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEn51F/lCuNhUwmDeSxycdCkmxi6cFAmRQ5RgACgkQxycdCkmx
i6dA1hAAsdP9Nx6d1ux0KYCI9FsnjU06JITzEA/LmzMXqxfElUzatZswaQuE8gSd
SoUNuWk5PiKEK5BIJRdKNyKc+kElxCewknEbpvbHi3QGs/BZyfPuDf17gPnNaxsA
KxA2WLU9XBaI0mt44mPuX5aLlCMnJ/RFhi7I5lJ3H31V6FU75BVE3DOCSxZNNipN
eWfG1c2f3K9Q0Qc3CSxUkzvNKTy+YW+i/fwdisToX2Zj/jduWmUi8PcbQQIX5+PX
Dch1LxnTu/l+Dn5PBICmhK8B86oTTCF83FOYX2f7ng4YuZ95gN/gn3/s5U4UkkKl
n74rupfhKkUHFeWS93ToC0qKniuN+QuPA8GumBFYjywn2gA7CS1BSV3+++ryJxRO
q9RXtHU7R7IXQGn5IKCqwzcBHrtnxUlIptMV2VH3Sbixtzn5QW0m10/lHHNKIy5E
laXr+dr+uBcL5+mSDYvsRbaQxfQ9G5wGvelnl7c1hXEG1VV6O5pVXLTlw81U7K3u
38m7dQl1/A3IetYpMrFcqhiGt0T84ttJerLaQRLNZCeNSAB70Gq/eSSSK4uyHl7I
pVb2eiLE84NCUZfemD8CER9Zv6z+kFDEO6xpLeBOd6BAzfYJXaTaquLnERh/bduo
WXnGlrDv6W+N+w4nP6AQ711mDqP5vMdLeSwbUfJChqeNHmM8nG8=
=9Us7
-----END PGP SIGNATURE-----
Merge tag 'v6.4-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
- A long-standing bug in crypto_engine
- A buggy but harmless check in the sun8i-ss driver
- A regression in the CRYPTO_USER interface
* tag 'v6.4-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: api - Fix CRYPTO_USER checks for report function
crypto: engine - fix crypto_queue backlog handling
crypto: sun8i-ss - Fix a test in sun8i_ss_setup_ivs()
- Revert an i.MX patch that's causing video failures because division
math goes sideways
- Fix a clang + W=1 build isue where FIELD_PREP() is taking a 32-bit
variable instead of the usual u64 type
- Fix a Kconfig bug in the StarFive JH7110 clk config that selects a
reset controller when it can't be selected
-----BEGIN PGP SIGNATURE-----
iQJFBAABCAAvFiEE9L57QeeUxqYDyoaDrQKIl8bklSUFAmRW7qARHHNib3lkQGtl
cm5lbC5vcmcACgkQrQKIl8bklSU8gA/8Dee5F/zZR+Y7TJGld2ZF5gTjo+5vWrPP
SQ4FTCSZwnslrYcESXL+tfKxGv4JaP+bbWfufbbBYAlapNQZchSnRurg0rvYCIc2
s9NeUbLvPhxJTaBJIOoUzOfzezvPpS9+eEnT6goyvrWrYBz+wolTrsOzI6t2ocyb
9KpBEiaqAUSOmno6mNmqNg48UvBytgtXBnlC3YsqLlKuBZcRBrK/nHO+dUIqb2kO
s0fbbnNI2Mal0e3Se25ehW803fl+OhnxshHO4EO+aRvCLhXINgyapFNR2vk1woN2
UM38va+BPyzaI9goXcxMN5JROxGZFTmHMYtrqgimvFwxVDnHdz2jdokaY0fLhAMV
mN5Y+J1XBdNui4Pz/iWzldkPWULtbLTCZZQFl1lbAdBcvPSQC+hIcQEFKxg0O4rk
d0+B3a2SPfK1X5Ebl+BtOogbTGr616zVHgiiHeU0vxxFbqXboGh+iKJJdvtHA4WF
t/y4gH3cx67NneUFMihNs+UqVSVnE6N1+EghTzAubnEIz3ONIQDyvWuTNagDZBWH
QRMJz51691OGuSAW6lw9Nnfe0qNFiSzQpJptBT1bJ2GvTzOUI8izNLZ0ah/effuR
avHhYGkf7DYggyldEf5eL/iVqUTJ1JYKpRdcuyfLAg4q0UPmRP43n4KdTtZ98wC3
UdusEJ8EqDc=
=IxvN
-----END PGP SIGNATURE-----
Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
"A couple more patches that would be good to get into -rc1:
- Revert an i.MX patch that's causing video failures because division
math goes sideways
- Fix a clang + W=1 build isue where FIELD_PREP() is taking a 32-bit
variable instead of the usual u64 type
- Fix a Kconfig bug in the StarFive JH7110 clk config that selects a
reset controller when it can't be selected"
* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: starfive: Fix RESET_STARFIVE_JH7110 can't be selected in a specified case
clk: sp7021: Adjust width of _m in HWM_FIELD_PREP()
Revert "clk: imx: composite-8m: Add support to determine_rate"
Convert omap and pcc to use mbox_bind_client
- omap and hi6220 : use of_property_read_bool
- test: fix double-free and use spinlock header
- rockchip and bcm-pdc: drop of_match_ptr
- mpfs: change config symbol
- mediatek gce: support MT6795
- qcom apcs: consolidate of_device_id
support IPQ9574
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE6EwehDt/SOnwFyTyf9lkf8eYP5UFAmRWXwYACgkQf9lkf8eY
P5V11Q/9G2V2V/QO6EJm8JQ6ElkO+cLeZezmyAhlt3l7n2B5WjEdnKX2nxOYmgZl
qb+tYFJO7cd+lpMMTWz4PfOZbedkhrGXwzV8OMzXHJHfOdy3oojw8NYXY7IXdwsy
jhik8jYSUx2vHYCTCbAT0SgNLWJHE+RMzFjg01Wb49zIqJf9kbw8RfmWItKGLBJt
0ePSYW2o7LckMf3KSVV3YxfTlvX6Wb0rE0HXv/YI17XtDWT/RlpHb9rVvPWcpKfI
wZjQ49wxjtpsilxAQijcT+gE2wmvx62S1HeSSwE7JZ/7k4Ihg269rUa3p0Dp3dcA
RLyEhrLt3KWAWYWDXvsGVhaJvrtHI0BMA1Pb3C+qnjdcySs8AZgQbGTtI3RPOiuq
3VCshYVZMEbVoXZOHFbov0e6pyBX5dLqU8W3FhfAVunovj/d7/+74h1uP6ecJjhP
+DazIWSc20ATWUaD+UrjPhaGsNwEGceiJiY6nHhnzWZlu2K+2UXRlSrL/iAydvaj
SRJeqHiY12fGPGVOScpCMylAGsa+VI+oA61JlMfjLvuyL4qSwNwZGpmCUlIod2AJ
8vUGw254XMr5CiZ4u/D/cyi+7gJ4la53xNkNZIfFs7cP+gbCygaCHDy11bzQa+i1
/sLR9voYIp0Woljds5a6L+o2bP5hkf/Uf+6WKEU8uaW98f5tA14=
=NHj4
-----END PGP SIGNATURE-----
Merge tag 'mailbox-v6.4' of git://git.linaro.org/landing-teams/working/fujitsu/integration
Pull mailbox updates from Jassi Brar:
- mailbox api: allow direct registration to a channel and convert omap
and pcc to use mbox_bind_client
- omap and hi6220 : use of_property_read_bool
- test: fix double-free and use spinlock header
- rockchip and bcm-pdc: drop of_match_ptr
- mpfs: change config symbol
- mediatek gce: support MT6795
- qcom apcs: consolidate of_device_id and support IPQ9574
* tag 'mailbox-v6.4' of git://git.linaro.org/landing-teams/working/fujitsu/integration:
dt-bindings: mailbox: qcom: add compatible for IPQ9574 SoC
mailbox: qcom-apcs-ipc: do not grow the of_device_id
dt-bindings: mailbox: qcom,apcs-kpss-global: use fallbacks for few variants
dt-bindings: mailbox: mediatek,gce-mailbox: Add support for MT6795
mailbox: mpfs: convert SOC_MICROCHIP_POLARFIRE to ARCH_MICROCHIP_POLARFIRE
mailbox: bcm-pdc: drop of_match_ptr for ID table
mailbox: rockchip: drop of_match_ptr for ID table
mailbox: mailbox-test: Fix potential double-free in mbox_test_message_write()
mailbox: mailbox-test: Explicitly include header for spinlock support
mailbox: Use of_property_read_bool() for boolean properties
mailbox: pcc: Use mbox_bind_client
mailbox: omap: Use mbox_bind_client
mailbox: Allow direct registration to a channel
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmRXkp8QHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpsH3EADZ4ex4vvc2/giYjRYTdyda00GIdYGiSZaZ
OBW59OBYJfDjsS99Z2U4638shyqM3rA6BwzvWD6hPUTShD7PUddIKDaanpCYBs+x
GhcDhzeJxARjsp/5qZPQ4r3Hdp+kWRnOFu1BSniCEdWt43J6AOm0l5duuNxl1db5
Jz+BkSgF1pICAVB5YfSPDShyFiDfc0qcv3rHZo/rVRjwmX3x4x+OVr97zbZInnH4
fm2vcv2e3Dc8uQEB0XFEyBWNeC8hdhTNe60t1y3MlILaPjHrQlzl5lkQFZ9dtLG6
/FDK+bD6Ex8t6SQnnQGrLa9q4vQYA6zlT5u/wdvzAkakmrSxt6phf23U1TbyZSQ8
clSg6MSuyJAW7YhiT3OqAsNMvBZ5g/nKHjLxfkAe86N9YIrMebZW9GDF+fWf0q0K
em7O4XPehjkCXKclCInMv2bkv+/pncrd84D7Jt3OmCPljBL6oXfWF3vJJQ6FaY+t
G3tNxl1hMHqwX8uaaSqyeuMYAEYizZa+ebMmHJwm/151d0EKylAOh3i6O1e4bpk3
Kn5irF6pfTV1g1TcmY3fOwNgMffcHc9Y4E8bXxMJAN9WiHDuopgn2C9+MrJ+KRrx
pLN7VD9UK/5yFzHSMBhyM8lm0/oTIhfTVgb+WyuHmJLcRhAnyNy9RrIBhSD0AoH+
UMyOexZR8w==
=l7Mo
-----END PGP SIGNATURE-----
Merge tag 'for-6.4/io_uring-2023-05-07' of git://git.kernel.dk/linux
Pull more io_uring updates from Jens Axboe:
"Nothing major in here, just two different parts:
- A small series from Breno that enables passing the full SQE down
for ->uring_cmd().
This is a prerequisite for enabling full network socket operations.
Queued up a bit late because of some stylistic concerns that got
resolved, would be nice to have this in 6.4-rc1 so the dependent
work will be easier to handle for 6.5.
- Fix for the huge page coalescing, which was a regression introduced
in the 6.3 kernel release (Tobias)"
* tag 'for-6.4/io_uring-2023-05-07' of git://git.kernel.dk/linux:
io_uring: Remove unnecessary BUILD_BUG_ON
io_uring: Pass whole sqe to commands
io_uring: Create a helper to return the SQE size
io_uring/rsrc: check for nonconsecutive pages
This reverts commit a980755beb.
We need to better polish building with BPF skels, so revert back to
making it an experimental feature that has to be explicitely enabled
using BUILD_BPF_SKEL=1.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This reverts commit 51924ae69e.
We need to better polish building with BPF skels, so revert back to
making it an experimental feature that has to be explicitely enabled
using BUILD_BPF_SKEL=1.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2d55c16c0c ("dmapool: create/destroy cleanup").
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZFaTPAAKCRDdBJ7gKXxA
ji1zAP4s/mWPQOJkoTR8cn9wRdGUB4FKgDiJwwYeOdcHRuylvwEAlgOD8qlizXYu
z4tGx6J8p3X+IdEApqH6Up56wjW+Wgk=
=X19M
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2023-05-06-10-49' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull dmapool updates - again - from Andrew Morton:
"Reinstate the dmapool changes which were accidentally removed by a
mishap on the last commit in the previous attempt at the series"
Fixes: 2d55c16c0c ("dmapool: create/destroy cleanup").
[ The whole old series: def8574308ed..2d55c16c0c54 results in an empty
diff because that last commit ended up being just a revert of all that
came everything before it. - Linus ]
* tag 'mm-stable-2023-05-06-10-49' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
dmapool: link blocks across pages
dmapool: don't memset on free twice
dmapool: simplify freeing
dmapool: consolidate page initialization
dmapool: rearrange page alloc failure handling
dmapool: move debug code to own functions
dmapool: speedup DMAPOOL_DEBUG with init_on_alloc
dmapool: cleanup integer types
dmapool: use sysfs_emit() instead of scnprintf()
dmapool: remove checks for dev == NULL
changes.
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZFaSnAAKCRDdBJ7gKXxA
jskTAPwIkL6xVMIEdtWfQNIy751o0onzfOQAIWgM3dSL83cMJQEA5OXxXt4LjrJw
4q0zf4GLYXjMjhnfYr8zXF39y05iyQA=
=nIhV
-----END PGP SIGNATURE-----
Merge tag 'mm-hotfixes-stable-2023-05-06-10-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull hotfixes from Andrew Morton:
"Five hotfixes.
Three are cc:stable, two pertain to merge window changes"
* tag 'mm-hotfixes-stable-2023-05-06-10-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
afs: fix the afs_dir_get_folio return value
nilfs2: do not write dirty data after degenerating to read-only
mm: do not reclaim private data from pinned page
nilfs2: fix infinite loop in nilfs_mdt_get_block()
mm/mmap/vma_merge: always check invariants
The allocated dmapool pages are never freed for the lifetime of the pool.
There is no need for the two level list+stack lookup for finding a free
block since nothing is ever removed from the list. Just use a simple
stack, reducing time complexity to constant.
The implementation inserts the stack linking elements and the dma handle
of the block within itself when freed. This means the smallest possible
dmapool block is increased to at most 16 bytes to accommodate these
fields, but there are no exisiting users requesting a dma pool smaller
than that anyway.
Removing the list has a significant change in performance. Using the
kernel's micro-benchmarking self test:
Before:
# modprobe dmapool_test
dmapool test: size:16 blocks:8192 time:57282
dmapool test: size:64 blocks:8192 time:172562
dmapool test: size:256 blocks:8192 time:789247
dmapool test: size:1024 blocks:2048 time:371823
dmapool test: size:4096 blocks:1024 time:362237
After:
# modprobe dmapool_test
dmapool test: size:16 blocks:8192 time:24997
dmapool test: size:64 blocks:8192 time:26584
dmapool test: size:256 blocks:8192 time:33542
dmapool test: size:1024 blocks:2048 time:9022
dmapool test: size:4096 blocks:1024 time:6045
The module test allocates quite a few blocks that may not accurately
represent how these pools are used in real life. For a more marco level
benchmark, running fio high-depth + high-batched on nvme, this patch shows
submission and completion latency reduced by ~100usec each, 1% IOPs
improvement, and perf record's time spent in dma_pool_alloc/free were
reduced by half.
[kbusch@kernel.org: push new blocks in ascending order]
Link: https://lkml.kernel.org/r/20230221165400.1595247-1-kbusch@meta.com
Link: https://lkml.kernel.org/r/20230126215125.4069751-12-kbusch@meta.com
Fixes: 2d55c16c0c ("dmapool: create/destroy cleanup")
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Tony Battersby <tonyb@cybernetics.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
If debug is enabled, dmapool will poison the range, so no need to clear it
to 0 immediately before writing over it.
Link: https://lkml.kernel.org/r/20230126215125.4069751-11-kbusch@meta.com
Fixes: 2d55c16c0c ("dmapool: create/destroy cleanup")
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Tony Battersby <tonyb@cybernetics.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The actions for busy and not busy are mostly the same, so combine these
and remove the unnecessary function. Also, the pool is about to be freed
so there's no need to poison the page data since we only check for poison
on alloc, which can't be done on a freed pool.
Link: https://lkml.kernel.org/r/20230126215125.4069751-10-kbusch@meta.com
Fixes: 2d55c16c0c ("dmapool: create/destroy cleanup")
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Tony Battersby <tonyb@cybernetics.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Various fields of the dma pool are set in different places. Move it all
to one function.
Link: https://lkml.kernel.org/r/20230126215125.4069751-9-kbusch@meta.com
Fixes: 2d55c16c0c ("dmapool: create/destroy cleanup")
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Tony Battersby <tonyb@cybernetics.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Handle the error in a condition so the good path can be in the normal
flow.
Link: https://lkml.kernel.org/r/20230126215125.4069751-8-kbusch@meta.com
Fixes: 2d55c16c0c ("dmapool: create/destroy cleanup")
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Tony Battersby <tonyb@cybernetics.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Clean up the normal path by moving the debug code outside it.
Link: https://lkml.kernel.org/r/20230126215125.4069751-7-kbusch@meta.com
Fixes: 2d55c16c0c ("dmapool: create/destroy cleanup")
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Tony Battersby <tonyb@cybernetics.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Avoid double-memset of the same allocated memory in dma_pool_alloc() when
both DMAPOOL_DEBUG is enabled and init_on_alloc=1.
Link: https://lkml.kernel.org/r/20230126215125.4069751-6-kbusch@meta.com
Fixes: 2d55c16c0c ("dmapool: create/destroy cleanup")
Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
To represent the size of a single allocation, dmapool currently uses
'unsigned int' in some places and 'size_t' in other places. Standardize
on 'unsigned int' to reduce overhead, but use 'size_t' when counting all
the blocks in the entire pool.
Link: https://lkml.kernel.org/r/20230126215125.4069751-5-kbusch@meta.com
Fixes: 2d55c16c0c ("dmapool: create/destroy cleanup")
Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Use sysfs_emit instead of scnprintf, snprintf or sprintf.
Link: https://lkml.kernel.org/r/20230126215125.4069751-4-kbusch@meta.com
Fixes: 2d55c16c0c ("dmapool: create/destroy cleanup")
Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
dmapool originally tried to support pools without a device because
dma_alloc_coherent() supports allocations without a device. But nobody
ended up using dma pools without a device, and trying to do so will result
in an oops. So remove the checks for pool->dev == NULL since they are
unneeded bloat.
[kbusch@kernel.org: add check for null dev on create]
Link: https://lkml.kernel.org/r/20230126215125.4069751-3-kbusch@meta.com
Fixes: 2d55c16c0c ("dmapool: create/destroy cleanup")
Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Fix another case of an incorrect check for the returned 'folio' value
from __filemap_get_folio().
The failure case used to return NULL, but was changed by commit
66dabbb65d ("mm: return an ERR_PTR from __filemap_get_folio").
But in the meantime, commit ec108d3cc7 ("NFS: Convert readdir page
array functions to use a folio") added a new user of that function.
And my merge of the two did not fix this up correctly.
The ext4 merge had the same issue, but that one had been caught in
linux-next and got properly fixed while merging.
Fixes: 0127f25b5d ("Merge tag 'nfs-for-6.4-1' of git://git.linux-nfs.org/projects/anna/linux-nfs")
Cc: Anna Schumaker <anna@kernel.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Keep returning NULL on failure instead of letting an ERR_PTR escape to
callers that don't expect it.
Link: https://lkml.kernel.org/r/20230503154526.1223095-2-hch@lst.de
Fixes: 66dabbb65d ("mm: return an ERR_PTR from __filemap_get_folio")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Jan Kara <jack@suse.cz>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: David Howells <dhowells@redhat.com>
Tested-by: David Howells <dhowells@redhat.com>
Cc: Marc Dionne <marc.dionne@auristor.com>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
According to syzbot's report, mark_buffer_dirty() called from
nilfs_segctor_do_construct() outputs a warning with some patterns after
nilfs2 detects metadata corruption and degrades to read-only mode.
After such read-only degeneration, page cache data may be cleared through
nilfs_clear_dirty_page() which may also clear the uptodate flag for their
buffer heads. However, even after the degeneration, log writes are still
performed by unmount processing etc., which causes mark_buffer_dirty() to
be called for buffer heads without the "uptodate" flag and causes the
warning.
Since any writes should not be done to a read-only file system in the
first place, this fixes the warning in mark_buffer_dirty() by letting
nilfs_segctor_do_construct() abort early if in read-only mode.
This also changes the retry check of nilfs_segctor_write_out() to avoid
unnecessary log write retries if it detects -EROFS that
nilfs_segctor_do_construct() returned.
Link: https://lkml.kernel.org/r/20230427011526.13457-1-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Reported-by: syzbot+2af3bc9585be7f23f290@syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?extid=2af3bc9585be7f23f290
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
If the page is pinned, there's no point in trying to reclaim it.
Furthermore if the page is from the page cache we don't want to reclaim
fs-private data from the page because the pinning process may be writing
to the page at any time and reclaiming fs private info on a dirty page can
upset the filesystem (see link below).
Link: https://lore.kernel.org/linux-mm/20180103100430.GE4911@quack2.suse.cz
Link: https://lkml.kernel.org/r/20230428124140.30166-1-jack@suse.cz
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
If the disk image that nilfs2 mounts is corrupted and a virtual block
address obtained by block lookup for a metadata file is invalid,
nilfs_bmap_lookup_at_level() may return the same internal return code as
-ENOENT, meaning the block does not exist in the metadata file.
This duplication of return codes confuses nilfs_mdt_get_block(), causing
it to read and create a metadata block indefinitely.
In particular, if this happens to the inode metadata file, ifile,
semaphore i_rwsem can be left held, causing task hangs in lock_mount.
Fix this issue by making nilfs_bmap_lookup_at_level() treat virtual block
address translation failures with -ENOENT as metadata corruption instead
of returning the error code.
Link: https://lkml.kernel.org/r/20230430193046.6769-1-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Reported-by: syzbot+221d75710bde87fa0e97@syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?extid=221d75710bde87fa0e97
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
We may still have inconsistent input parameters even if we choose not to
merge and the vma_merge() invariant checks are useful for checking this
with no production runtime cost (these are only relevant when
CONFIG_DEBUG_VM is specified).
Therefore, perform these checks regardless of whether we merge.
This is relevant, as a recent issue (addressed in commit "mm/mempolicy:
Correctly update prev when policy is equal on mbind") in the mbind logic
was only picked up in the 6.2.y stable branch where these assertions are
performed prior to determining mergeability.
Had this remained the same in mainline this issue may have been picked up
faster, so moving forward let's always check them.
Link: https://lkml.kernel.org/r/df548a6ae3fa135eec3b446eb3dae8eb4227da97.1682885809.git.lstoakes@gmail.com
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Smatch reports that filemap_fault() was missed in the conversion of
__filemap_get_folio() error returns from NULL to ERR_PTR.
Fixes: 66dabbb65d ("mm: return an ERR_PTR from __filemap_get_folio")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Reported-by: syzbot+48011b86c8ea329af1b9@syzkaller.appspotmail.com
Reported-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Six late arriving patches for the merge window. Five are minor
assorted fixes and updates. The IPR driver change removes SATA
support, which will now allow a major cleanup in the ATA subsystem
because it was the only driver still using the old attachment
mechanism. The driver is only used on power systems and SATA was used
to support a DVD device, which has long been moved to a different hba.
IBM chose this route instead of porting ipr to the newer SATA
interfaces.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCZFZGfiYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishbgIAP9V4dKN
e5xLwf4q9+bKKCSCmlMrPeAnfJXyWdW6VWvWhQD9H4bgemsWhcUOztvpB2pOUpYx
JaJCgj/MAA0FSXhy3Bk=
=FwUd
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull more SCSI updates from James Bottomley:
"Six late arriving patches for the merge window. Five are minor
assorted fixes and updates.
The IPR driver change removes SATA support, which will now allow a
major cleanup in the ATA subsystem because it was the only driver
still using the old attachment mechanism. The driver is only used on
power systems and SATA was used to support a DVD device, which has
long been moved to a different hba. IBM chose this route instead of
porting ipr to the newer SATA interfaces"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: qedi: Fix use after free bug in qedi_remove()
scsi: ufs: core: mcq: Fix &hwq->cq_lock deadlock issue
scsi: ipr: Remove several unused variables
scsi: pm80xx: Log device registration
scsi: ipr: Remove SATA support
scsi: scsi_debug: Abort commands from scsi_debug_device_reset()
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmRWLQYQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpnwhD/4xRYfAY4O0oGZCITiKEEFxiSfDHCPMEBji
zTEtideDRjcrpFmjmZ411C9prW32MxYQ3wQTf4O7w4t906xTYVr9FQy8g3Et4izI
zglPcsa2jPzYCadQ0Ye4n9dRuYOH9FDJzDDLC+smu8zQKKmAqEAN/1ftpnADdVrY
qX1sHyfz4RQRAgTHg2WqgKOi2O9VwGSMwOfBocmzAnruv9oLUlypcGnFPRCSZH/a
OKpUZlvQhCBZTKScvVxQeJMg2Tl5yokQ0TH+gkQsdav9XcPJktqXiuD+c4h6q5ux
oTlysEqcrwcaAafOcV0w9u80SFNlFACUYsNmEnFJaPXFTqAdNHvo1DJNsmxiHJDU
bGo5ktlo5b/VZ51niOoWvGxursavq16G4yIYlGGHc7f4wGs12oc5ZP/yM3GRUY+C
PdezwEvvQufxP7sFokfpgAS4SuH+tBlrhFXMsYaI4NukZQW4TK1zzbMrzOkdxFhW
BOx17VFUKWtUnRmxinFGIA8Vj+FXN+E+ND+FoDsbrMyJD4maKDdJapPchG0J0Vbs
pDcsB4c0pBC6H2xrobKiA1CuSq2t2qvyvwe1Zl2Xd+RVW9vBB5SI6HXYrC+UtxwY
7LfX8F13cFD1E6iJ9Nta6x8fOunGnOVBdW5O0k4hDWEuZduvHItEDn2c3Ehqp4Jw
P8dFBbk8SQ==
=gAYf
-----END PGP SIGNATURE-----
Merge tag 'for-6.4/block-2023-05-06' of git://git.kernel.dk/linux
Pull more block updates from Jens Axboe:
- MD pull request via Song:
- Improve raid5 sequential IO performance on spinning disks, which
fixes a regression since v6.0 (Jan Kara)
- Fix bitmap offset types, which fixes an issue introduced in this
merge window (Jonathan Derrick)
- Cleanup of hweight type used for cgroup writeback (Maxim)
- Fix a regression with the "has_submit_bio" changes across partitions
(Ming)
- Cleanup of QUEUE_FLAG_ADD_RANDOM clearing.
We used to set this flag on queues non blk-mq queues, and hence some
drivers clear it unconditionally. Since all of these have since been
converted to true blk-mq drivers, drop the useless clear as the bit
is not set (Chaitanya)
- Fix the flags being set in a bio for a flush for drbd (Christoph)
- Cleanup and deduplication of the code handling setting block device
capacity (Damien)
- Fix for ublk handling IO timeouts (Ming)
- Fix for a regression in blk-cgroup teardown (Tao)
- NBD documentation and code fixes (Eric)
- Convert blk-integrity to using device_attributes rather than a second
kobject to manage lifetimes (Thomas)
* tag 'for-6.4/block-2023-05-06' of git://git.kernel.dk/linux:
ublk: add timeout handler
drbd: correctly submit flush bio on barrier
mailmap: add mailmap entries for Jens Axboe
block: Skip destroyed blkg when restart in blkg_destroy_all()
writeback: fix call of incorrect macro
md: Fix bitmap offset type in sb writer
md/raid5: Improve performance for sequential IO
docs nbd: userspace NBD now favors github over sourceforge
block nbd: use req.cookie instead of req.handle
uapi nbd: add cookie alias to handle
uapi nbd: improve doc links to userspace spec
blk-integrity: register sysfs attributes on struct device
blk-integrity: convert to struct device_attribute
blk-integrity: use sysfs_emit
block/drivers: remove dead clear of random flag
block: sync part's ->bd_has_submit_bio with disk's
block: Cleanup set_capacity()/bdev_set_nr_sectors()
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmRWK50QHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpkP8D/9PKQy4FCTRKRYABomlkqc9UlrsEzkkpXy+
t2kCZ7OmgZ1go/3gRv8AVx0xC/S8zgs8vCwriUXVQH2Q4D2FnV9/Cy5xXzQJ2jE5
sWdj5+Sks2Zerq7hRfItxFxP70Jlvn4r2Ud6n9uoBKI+DirpGQB5xwqCF/Z9I9Ev
cAOtq3cvdKqiqAmrA0WAjjpLpvuC/OOEXyQf6n+viQ7iJZgCHDWplpa4jQCamrlJ
wDz5eNkptDc3KnZ1QGtfMlCdmzwjFEyi20Eq8VLo6OBkhoP6FymqWC6YkeigibI4
Da+fyx/wJvqnsuXPLCFgla5MXMRHLb9uoRRFbon6XqXABPZ8v4v8kZhswVLUxBzz
UjBI8cmKQ8rkHu+4o/522lCRQCMU5h7LxGgCguePi9+Hap2AuBSmpOzpgDqnNDmN
ARD6gW4uOkXRU6CgvppOio3hTn3BU9bUs8iN8ym3XW0eQo6nUElRT2mwC1XEJDYn
HAM1MqoJWU5rWp1yyAdiw8rLBOmMLMyNaowHlWEjEGiykU2FlfNDp/HJeq/gxjEK
NxBC5+qlhbDp3hTqDJVaRa+i8MchrTJLXfJuXzP3el3c1dp2vU5gLO49WYAA6G5d
oycxTm2SyX6dUwWNeH06jgocVt99fC7klTuEq59Jm9ykFSfbj2kfv24U+jDi/KvX
DW++jNDl+g==
=uryD
-----END PGP SIGNATURE-----
Merge tag 'pipe-nonblock-2023-05-06' of git://git.kernel.dk/linux
Pull nonblocking pipe io_uring support from Jens Axboe:
"Here's the revised edition of the FMODE_NOWAIT support for pipes, in
which we just flag it as such supporting FMODE_NOWAIT unconditionally,
but clear it if we ever end up using splice/vmsplice on the pipe.
The pipe read/write side is perfectly fine for nonblocking IO, however
splice and vmsplice can potentially wait for IO with the pipe lock
held"
* tag 'pipe-nonblock-2023-05-06' of git://git.kernel.dk/linux:
pipe: set FMODE_NOWAIT on pipes
splice: clear FMODE_NOWAIT on file if splice/vmsplice is used
Here are collections of small fixes for rc1.
The only (LOC-wise) dominant change was ASoC Qualcomm fix, but most
of it was merely a code shuffling.
Another significant change here is for ALSA PCM core; it received a
revert and a series of fixes for PCM auto-silencing where it caused
a regression in the previous PR for rc1.
Others are all small: ASoC Intel fixes, various quirks for ASoC AMD,
HD-audio and USB-audio, the continued legacy emu10k1 code cleanup,
and some documentation updates.
-----BEGIN PGP SIGNATURE-----
iQJCBAABCAAsFiEEIXTw5fNLNI7mMiVaLtJE4w1nLE8FAmRWACMOHHRpd2FpQHN1
c2UuZGUACgkQLtJE4w1nLE8vPA//Vh+gO0e1wcmCqeoIIJUwr5G0hQYY2Fp5Ke6A
3MuQAW3JyZ7hj5Xkor1PRalelinHUuvGMFDeazLnr92MV9zplnTisbKxp7EmDhJE
3iEdvj9evKTyMk7RpIhX1rhrHy0QLaCHDgDZDrv7IapyDpLUOl0lFtJQrlYV8DbS
b0uSBJ/kpMi5AH+W7RWpdnPIFd2j8tDiyKGBDpAo3l8TIjkwdsdKo9jBz7uNpf0R
tL8J6U4dA/Yo86XPAwg0g5tkug0vWOhZY8tdqNvJJFNgo9dW8ofqkCC4lg7JSmqM
ugTm/l0CZnEC09t2GSw7k7bsUGxZkWus7r8PTEw9CryGsR2pQYAvcjMMFpXqTu6K
Rc0PPkvFdmX2PIGtbkebwr/5y5Dbr2SmzqJjdkOSH91PZdle9RwN6k19dlXi1OFr
ucS9g5gAxGWMNwOnLFud3x+qIN1XwZuuHI1CZd7jhlQ3krV+dEl6m9sttFs4saPs
MNCMOXdVcKZ6NCBPRaCCwAEgP6Ttitdj25edTy1tKMr3td7ECInXQZiAPlAXLLwm
8aRscx07jj9cpfuk9fBTBvZdiAoK9KqYCLFSle4DzOJkyLuA9OiVFp3iixeoVIsb
fRQdaYdwq166SWtJSBbsRnfxI5zxKhxz/vx0SLNf32Flh2SyIU0kQ6Wk88V716L7
D4tPskk=
=jd8l
-----END PGP SIGNATURE-----
Merge tag 'sound-fix-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A collection of small fixes for rc1.
The only (LOC-wise) dominant change was ASoC Qualcomm fix, but most of
it was merely a code shuffling.
Another significant change here is for ALSA PCM core; it received a
revert and a series of fixes for PCM auto-silencing where it caused a
regression in the previous PR for rc1.
Others are all small: ASoC Intel fixes, various quirks for ASoC AMD,
HD-audio and USB-audio, the continued legacy emu10k1 code cleanup, and
some documentation updates"
* tag 'sound-fix-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (23 commits)
ALSA: pcm: use exit controlled loop in snd_pcm_playback_silence()
ALSA: pcm: simplify top-up mode init in snd_pcm_playback_silence()
ALSA: pcm: playback silence - move silence variable updates to separate function
ALSA: pcm: playback silence - remove extra code
ALSA: pcm: fix playback silence - correct incremental silencing
ALSA: pcm: fix playback silence - use the actual new_hw_ptr for the threshold mode
ALSA: pcm: Revert "ALSA: pcm: rewrite snd_pcm_playback_silence()"
ALSA: hda/realtek: Fix mute and micmute LEDs for an HP laptop
ALSA: caiaq: input: Add error handling for unsupported input methods in `snd_usb_caiaq_input_init`
ALSA: usb-audio: Add quirk for Pioneer DDJ-800
ALSA: hda/realtek: support HP Pavilion Aero 13-be0xxx Mute LED
ASoC: Intel: soc-acpi-cht: Add quirk for Nextbook Ares 8A tablet
ASoC: amd: yc: Add Asus VivoBook Pro 14 OLED M6400RC to the quirks list for acp6x
ASoC: codecs: wcd938x: fix accessing regmap on unattached devices
ALSA: docs: Fix code block indentation in ALSA driver example
ALSA: docs: Extend module parameters description
ALSA: hda/realtek: Add quirk for ASUS UM3402YAR using CS35L41
ALSA: emu10k1: use more existing defines instead of open-coded numbers
ASoC: amd: yc: Add ASUS M3402RA into DMI table
ALSA: hda/realtek: Add quirk for ThinkPad P1 Gen 6
...
A trivial typo fix that came in during the merge window.
-----BEGIN PGP SIGNATURE-----
iQFHBAABCgAxFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAmRVnlMTHGJyb29uaWVA
a2VybmVsLm9yZwAKCRAk1otyXVSH0HK3CACFKsLv9Hh7L5TORM7qVbmsNv5x7ovB
fC9juegYq7iszc/2+9350TkCSXvI8dVvdprufXrkaTV7IsEDtZDwesAR/eQtTJL3
ttmqqtsX1F6Jua0rP/AlWsM/H1KbwW/630Hb1YtJj3ohFTqit1cZgDSe/kd/DNFL
VSuTXmbnYqR4Is1UfWMU9lUgVwFX5GpHUnsRWVgxuj0LjdMUVgUqEU7JcfZcW0Uj
qt8X/bqALTKNZpGhBYK2w1ECL2qsQCfVhTA5ymBUwafCsgpNY/Z/ZUrzPTJCINTN
idGDBcaIZOsKky0YXTCv7N00gs7gj9vR77oFGvkmOTYK8ngLGuhF39v8
=Hq75
-----END PGP SIGNATURE-----
Merge tag 'regulator-fix-v6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator fix from Mark Brown:
"A trivial typo fix that came in during the merge window"
* tag 'regulator-fix-v6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: consumer.rst: fix 'regulator_enable' typo.
-----BEGIN PGP SIGNATURE-----
iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmRUx3IACgkQiiy9cAdy
T1G8MAv/TY6Oprfh58t4I4bWQKW49kOzFsPbNk9454CnjEnlV1jHirAKolTEN3ut
11ZvzAcUUPa23EBPOKaeroUI718AcAp91ByH58ERZtwHFz0ml0/ePL4+ihGiCUn3
0rTFSuZy1rW8Ld5sM9PMuDD3bzqAnmGZlOXAPg9acrbNyUwGguv4OF3MEjvYXpwF
vzcixvnG0HsIgCyM0gl4xK6vECx8PoDbUZDs44VWAPf5676PNwntJcbYq/GVCCGq
EiPDm1ke40nmSjxSDLQcpfWj3d6dYrmm6PVpa3wUSdI+MeuqSm0YR9DptCo6cawE
WqE6ZrUVEJyUlH6mFu6O4VZhwQ8jO4Jsasr3tay9crD8QSCLThLFkH4/Xg1u5JJw
miMaGk43hKHFStA00rObmLLFtk+HrPdKv5HIbByrm3m5x58hG+2PtKnsD8OI8Ny3
p7/lS4H/HHE9BZ44//rKtsRsAe7OAWHfYjiS/pPIhjepxr+2ra1RaI2ciKLNwIqd
tbPTVemh
=sgnv
-----END PGP SIGNATURE-----
Merge tag '6.4-rc-ksmbd-server-fixes-part2' of git://git.samba.org/ksmbd
Pull ksmbd server fixes from Steve French:
"Ten ksmbd server fixes, including some important security fixes:
- Two use after free fixes
- Fix RCU callback race
- Deadlock fix
- Three patches to prevent session setup attacks
- Prevent guest users from establishing multichannel sessions
- Fix null pointer dereference in query FS info
- Memleak fix"
* tag '6.4-rc-ksmbd-server-fixes-part2' of git://git.samba.org/ksmbd:
ksmbd: call rcu_barrier() in ksmbd_server_exit()
ksmbd: fix racy issue under cocurrent smb2 tree disconnect
ksmbd: fix racy issue from smb2 close and logoff with multichannel
ksmbd: not allow guest user on multichannel
ksmbd: fix deadlock in ksmbd_find_crypto_ctx()
ksmbd: block asynchronous requests when making a delay on session setup
ksmbd: destroy expired sessions
ksmbd: fix racy issue from session setup and logoff
ksmbd: fix NULL pointer dereference in smb2_get_info_filesystem()
ksmbd: fix memleak in session setup
Commit 0da6e5fd6c ("gcc: disable '-Warray-bounds' for gcc-13 too") makes
config GCC11_NO_ARRAY_BOUNDS to be for disabling -Warray-bounds in any gcc
version 11 and upwards, and with that, removes the GCC12_NO_ARRAY_BOUNDS
config as it is now covered by the semantics of GCC11_NO_ARRAY_BOUNDS.
As GCC11_NO_ARRAY_BOUNDS is yes by default, there is no need for the s390
architecture to explicitly select GCC11_NO_ARRAY_BOUNDS. Hence, the select
GCC12_NO_ARRAY_BOUNDS in arch/s390/Kconfig can simply be dropped.
Remove the unneeded "select GCC12_NO_ARRAY_BOUNDS".
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>