This kunit update for Linux 6.11-rc7 consist of one single fix to
a use-after-free bug resulting from kunit_driver_create() failing
to copy the driver name leaving it on the stack or freeing it.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAmbY0WMACgkQCwJExA0N
QxzCgBAA7Cb6tyvGcXsQTXC50S90CR+3bGmHzTL8jl/ElHvTz521UzPTn01QB51t
JcGNhKz3RByvRBuukhg7abpCnCYWZoa9pmxojVD5D1TO2AXvypWEv0ao/UwSAYyi
2b7BTkcc7ciRske51/yFfipjwI/NLLIlu4HVcZ0OisOt+tvHzoz50KiyYV+Qan8r
e8NkqVI587KLfDAZRC+cLXyJCIRwlCK+jNMrjoiOanv1Ybe65eAGNQmAIyuGX1Fo
Ku8ZgoCgpc+Vjc1bMWgwgHWCdFOvINdd7ibfCp59JBBAkqYFpHYS5Lk9kHWH6lYF
X9THLaCSh5cq+u0qksW8p4ml1fYnWZbm92qkdPj0wG36v9la769HSXijtVhL2lxD
b1ca/NpfNfbbr5mxoVRq4ulO1JvyC6jmRKSJWt1p1SFfHf+Oaowh2Sr2ZjFfOozj
+/Joh3n2dxlnH/in8BvXGwQIo7xbyTatm/4IVCccJAolR+hPv7izBeWfYn3xgtu5
5WZVcxPMxNwgNHWnxm2nbxTtBTvTsOSC8/nbxm8g3jM9cHCP7Mz3/zSV6p2vcRxm
HPx/Qj2LmNcPKGXs4jh7WLErgkunxlvsqCJChwGjZoYR0fgRmzCgrwbkDE6/26UW
Teo51bWwD/CxTy7OtXi8D2pPzVqt8u5cFPaNgHaRzxLDuVTouhU=
=JRC5
-----END PGP SIGNATURE-----
Merge tag 'linux_kselftest-kunit-fixes-6.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kunit fix fromShuah Khan:
"One single fix to a use-after-free bug resulting from
kunit_driver_create() failing to copy the driver name leaving it on
the stack or freeing it"
* tag 'linux_kselftest-kunit-fixes-6.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
kunit: Device wrappers should also manage driver name
The timerlat interface will get and put the task that is part of the
"kthread" field of the osn_var to keep it around until all references are
released. But here's a race in the "stop_kthread()" code that will call
put_task_struct() on the kthread if it is not a kernel thread. This can
race with the releasing of the references to that task struct and the
put_task_struct() can be called twice when it should have been called just
once.
Take the interface_lock() in stop_kthread() to synchronize this change.
But to do so, the function stop_per_cpu_kthreads() needs to change the
loop from for_each_online_cpu() to for_each_possible_cpu() and remove the
cpu_read_lock(), as the interface_lock can not be taken while the cpu
locks are held. The only side effect of this change is that it may do some
extra work, as the per_cpu variables of the offline CPUs would not be set
anyway, and would simply be skipped in the loop.
Remove unneeded "return;" in stop_kthread().
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Tomas Glozar <tglozar@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
Cc: "Luis Claudio R. Goncalves" <lgoncalv@redhat.com>
Link: https://lore.kernel.org/20240905113359.2b934242@gandalf.local.home
Fixes: e88ed227f6 ("tracing/timerlat: Add user-space interface")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
The timerlat tracer can use user space threads to check for osnoise and
timer latency. If the program using this is killed via a SIGTERM, the
threads are shutdown one at a time and another tracing instance can start
up resetting the threads before they are fully closed. That causes the
hrtimer assigned to the kthread to be shutdown and freed twice when the
dying thread finally closes the file descriptors, causing a use-after-free
bug.
Only cancel the hrtimer if the associated thread is still around. Also add
the interface_lock around the resetting of the tlat_var->kthread.
Note, this is just a quick fix that can be backported to stable. A real
fix is to have a better synchronization between the shutdown of old
threads and the starting of new ones.
Link: https://lore.kernel.org/all/20240820130001.124768-1-tglozar@redhat.com/
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: "Luis Claudio R. Goncalves" <lgoncalv@redhat.com>
Link: https://lore.kernel.org/20240905085330.45985730@gandalf.local.home
Fixes: e88ed227f6 ("tracing/timerlat: Add user-space interface")
Reported-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
The start_kthread() and stop_thread() code was not always called with the
interface_lock held. This means that the kthread variable could be
unexpectedly changed causing the kthread_stop() to be called on it when it
should not have been, leading to:
while true; do
rtla timerlat top -u -q & PID=$!;
sleep 5;
kill -INT $PID;
sleep 0.001;
kill -TERM $PID;
wait $PID;
done
Causing the following OOPS:
Oops: general protection fault, probably for non-canonical address 0xdffffc0000000002: 0000 [#1] PREEMPT SMP KASAN PTI
KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017]
CPU: 5 UID: 0 PID: 885 Comm: timerlatu/5 Not tainted 6.11.0-rc4-test-00002-gbc754cc76d1b-dirty #125 a533010b71dab205ad2f507188ce8c82203b0254
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
RIP: 0010:hrtimer_active+0x58/0x300
Code: 48 c1 ee 03 41 54 48 01 d1 48 01 d6 55 53 48 83 ec 20 80 39 00 0f 85 30 02 00 00 49 8b 6f 30 4c 8d 75 10 4c 89 f0 48 c1 e8 03 <0f> b6 3c 10 4c 89 f0 83 e0 07 83 c0 03 40 38 f8 7c 09 40 84 ff 0f
RSP: 0018:ffff88811d97f940 EFLAGS: 00010202
RAX: 0000000000000002 RBX: ffff88823c6b5b28 RCX: ffffed10478d6b6b
RDX: dffffc0000000000 RSI: ffffed10478d6b6c RDI: ffff88823c6b5b28
RBP: 0000000000000000 R08: ffff88823c6b5b58 R09: ffff88823c6b5b60
R10: ffff88811d97f957 R11: 0000000000000010 R12: 00000000000a801d
R13: ffff88810d8b35d8 R14: 0000000000000010 R15: ffff88823c6b5b28
FS: 0000000000000000(0000) GS:ffff88823c680000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000561858ad7258 CR3: 000000007729e001 CR4: 0000000000170ef0
Call Trace:
<TASK>
? die_addr+0x40/0xa0
? exc_general_protection+0x154/0x230
? asm_exc_general_protection+0x26/0x30
? hrtimer_active+0x58/0x300
? __pfx_mutex_lock+0x10/0x10
? __pfx_locks_remove_file+0x10/0x10
hrtimer_cancel+0x15/0x40
timerlat_fd_release+0x8e/0x1f0
? security_file_release+0x43/0x80
__fput+0x372/0xb10
task_work_run+0x11e/0x1f0
? _raw_spin_lock+0x85/0xe0
? __pfx_task_work_run+0x10/0x10
? poison_slab_object+0x109/0x170
? do_exit+0x7a0/0x24b0
do_exit+0x7bd/0x24b0
? __pfx_migrate_enable+0x10/0x10
? __pfx_do_exit+0x10/0x10
? __pfx_read_tsc+0x10/0x10
? ktime_get+0x64/0x140
? _raw_spin_lock_irq+0x86/0xe0
do_group_exit+0xb0/0x220
get_signal+0x17ba/0x1b50
? vfs_read+0x179/0xa40
? timerlat_fd_read+0x30b/0x9d0
? __pfx_get_signal+0x10/0x10
? __pfx_timerlat_fd_read+0x10/0x10
arch_do_signal_or_restart+0x8c/0x570
? __pfx_arch_do_signal_or_restart+0x10/0x10
? vfs_read+0x179/0xa40
? ksys_read+0xfe/0x1d0
? __pfx_ksys_read+0x10/0x10
syscall_exit_to_user_mode+0xbc/0x130
do_syscall_64+0x74/0x110
? __pfx___rseq_handle_notify_resume+0x10/0x10
? __pfx_ksys_read+0x10/0x10
? fpregs_restore_userregs+0xdb/0x1e0
? fpregs_restore_userregs+0xdb/0x1e0
? syscall_exit_to_user_mode+0x116/0x130
? do_syscall_64+0x74/0x110
? do_syscall_64+0x74/0x110
? do_syscall_64+0x74/0x110
entry_SYSCALL_64_after_hwframe+0x71/0x79
RIP: 0033:0x7ff0070eca9c
Code: Unable to access opcode bytes at 0x7ff0070eca72.
RSP: 002b:00007ff006dff8c0 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: 0000000000000000 RBX: 0000000000000005 RCX: 00007ff0070eca9c
RDX: 0000000000000400 RSI: 00007ff006dff9a0 RDI: 0000000000000003
RBP: 00007ff006dffde0 R08: 0000000000000000 R09: 00007ff000000ba0
R10: 00007ff007004b08 R11: 0000000000000246 R12: 0000000000000003
R13: 00007ff006dff9a0 R14: 0000000000000007 R15: 0000000000000008
</TASK>
Modules linked in: snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec snd_hwdep snd_hda_core
---[ end trace 0000000000000000 ]---
This is because it would mistakenly call kthread_stop() on a user space
thread making it "exit" before it actually exits.
Since kthreads are created based on global behavior, use a cpumask to know
when kthreads are running and that they need to be shutdown before
proceeding to do new work.
Link: https://lore.kernel.org/all/20240820130001.124768-1-tglozar@redhat.com/
This was debugged by using the persistent ring buffer:
Link: https://lore.kernel.org/all/20240823013902.135036960@goodmis.org/
Note, locking was originally used to fix this, but that proved to cause too
many deadlocks to work around:
https://lore.kernel.org/linux-trace-kernel/20240823102816.5e55753b@gandalf.local.home/
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: "Luis Claudio R. Goncalves" <lgoncalv@redhat.com>
Link: https://lore.kernel.org/20240904103428.08efdf4c@gandalf.local.home
Fixes: e88ed227f6 ("tracing/timerlat: Add user-space interface")
Reported-by: Tomas Glozar <tglozar@redhat.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
In __tracing_open(), when max latency tracers took place on the cpu,
the time start of its buffer would be updated, then event entries with
timestamps being earlier than start of the buffer would be skipped
(see tracing_iter_reset()).
Softlockup will occur if the kernel is non-preemptible and too many
entries were skipped in the loop that reset every cpu buffer, so add
cond_resched() to avoid it.
Cc: stable@vger.kernel.org
Fixes: 2f26ebd549 ("tracing: use timestamp to determine start of latency traces")
Link: https://lore.kernel.org/20240827124654.3817443-1-zhengyejian@huaweicloud.com
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Zheng Yejian <zhengyejian@huaweicloud.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
The commit 783bf5d09f ("spi: spi-fsl-lpspi: limit PRESCALE bit in
TCR register") doesn't implement the prescaler maximum as intended.
The maximum allowed value for i.MX93 should be 1 and for i.MX7ULP
it should be 7. So this needs also a adjustment of the comparison
in the scldiv calculation.
Fixes: 783bf5d09f ("spi: spi-fsl-lpspi: limit PRESCALE bit in TCR register")
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Link: https://patch.msgid.link/20240905111537.90389-1-wahrenst@gmx.net
Signed-off-by: Mark Brown <broonie@kernel.org>
In sch_cake, we keep track of the count of active bulk flows per host,
when running in dst/src host fairness mode, which is used as the
round-robin weight when iterating through flows. The count of active
bulk flows is updated whenever a flow changes state.
This has a peculiar interaction with the hash collision handling: when a
hash collision occurs (after the set-associative hashing), the state of
the hash bucket is simply updated to match the new packet that collided,
and if host fairness is enabled, that also means assigning new per-host
state to the flow. For this reason, the bulk flow counters of the
host(s) assigned to the flow are decremented, before new state is
assigned (and the counters, which may not belong to the same host
anymore, are incremented again).
Back when this code was introduced, the host fairness mode was always
enabled, so the decrement was unconditional. When the configuration
flags were introduced the *increment* was made conditional, but
the *decrement* was not. Which of course can lead to a spurious
decrement (and associated wrap-around to U16_MAX).
AFAICT, when host fairness is disabled, the decrement and wrap-around
happens as soon as a hash collision occurs (which is not that common in
itself, due to the set-associative hashing). However, in most cases this
is harmless, as the value is only used when host fairness mode is
enabled. So in order to trigger an array overflow, sch_cake has to first
be configured with host fairness disabled, and while running in this
mode, a hash collision has to occur to cause the overflow. Then, the
qdisc has to be reconfigured to enable host fairness, which leads to the
array out-of-bounds because the wrapped-around value is retained and
used as an array index. It seems that syzbot managed to trigger this,
which is quite impressive in its own right.
This patch fixes the issue by introducing the same conditional check on
decrement as is used on increment.
The original bug predates the upstreaming of cake, but the commit listed
in the Fixes tag touched that code, meaning that this patch won't apply
before that.
Fixes: 7126399299 ("sch_cake: Make the dual modes fairer")
Reported-by: syzbot+7fe7b81d602cc1e6b94d@syzkaller.appspotmail.com
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://patch.msgid.link/20240903160846.20909-1-toke@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Document what was discussed multiple times on list and various
virtual / in-person conversations. guard() being okay in functions
<= 20 LoC is a bit of my own invention. If the function is trivial
it should be fine, but feel free to disagree :)
We'll obviously revisit this guidance as time passes and we and other
subsystems get more experience.
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20240830171443.3532077-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Tony Nguyen says:
====================
ice: fix synchronization between .ndo_bpf() and reset
Larysa Zaremba says:
PF reset can be triggered asynchronously, by tx_timeout or by a user. With some
unfortunate timings both ice_vsi_rebuild() and .ndo_bpf will try to access and
modify XDP rings at the same time, causing system crash.
The first patch factors out rtnl-locked code from VSI rebuild code to avoid
deadlock. The following changes lock rebuild and .ndo_bpf() critical sections
with an internal mutex as well and provide complementary fixes.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
ice: do not bring the VSI up, if it was down before the XDP setup
ice: remove ICE_CFG_BUSY locking from AF_XDP code
ice: check ICE_VSI_DOWN under rtnl_lock when preparing for reset
ice: check for XDP rings instead of bpf program when unconfiguring
ice: protect XDP configuration with a mutex
ice: move netif_queue_set_napi to rtnl-protected sections
====================
Link: https://patch.msgid.link/20240903183034.3530411-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Hopefully final fixes for v6.11 and this time only fixes to ath11k
driver. We need to revert hibernation support due to reported
regressions and we have a fix for kernel crash introduced in
v6.11-rc1.
-----BEGIN PGP SIGNATURE-----
iQFFBAABCgAvFiEEiBjanGPFTz4PRfLobhckVSbrbZsFAmbYZsIRHGt2YWxvQGtl
cm5lbC5vcmcACgkQbhckVSbrbZvwbAf/XhgO7uKuZnGfq6NBfyLIaXQjHRe6OSc0
/EJ+c9J3DK5vQUHcw2ALrBYtKPJIrgLIzMG+u4N/4klxLwWCzZpsMLTqNvUV4VRe
eKdD9v/0dZnrKrpmQxgtEoImc028gC5c2K9jDK4e9MU/kBMln9PpH3xyasnV9aLj
2ajkO7xdLooTJ4cdZtdDL9fd5px7IdghNgHiY+24lfuPY2AZV5AUK/AUHZw3CG6K
nXk6+PEO1Xr2P3SAc5yRYg4vXJIy7g8ELAGUj8f3XznvYKpcO+afce8By+H6GecQ
ofsWSLO86ajgu3wMMxyWhXFs8TAsyeE8fLJgNBqIfC05i1IkGfYwBg==
=C23H
-----END PGP SIGNATURE-----
Merge tag 'wireless-2024-09-04' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless
Kalle Valo says:
====================
wireless fixes for v6.11
Hopefully final fixes for v6.11 and this time only fixes to ath11k
driver. We need to revert hibernation support due to reported
regressions and we have a fix for kernel crash introduced in
v6.11-rc1.
* tag 'wireless-2024-09-04' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
MAINTAINERS: wifi: cw1200: add net-cw1200.h
Revert "wifi: ath11k: support hibernation"
Revert "wifi: ath11k: restore country code during resume"
wifi: ath11k: fix NULL pointer dereference in ath11k_mac_get_eirp_power()
====================
Link: https://patch.msgid.link/20240904135906.5986EC4CECA@smtp.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
axienet_dma_err_handler can race with axienet_stop in the following
manner:
CPU 1 CPU 2
====================== ==================
axienet_stop()
napi_disable()
axienet_dma_stop()
axienet_dma_err_handler()
napi_disable()
axienet_dma_stop()
axienet_dma_start()
napi_enable()
cancel_work_sync()
free_irq()
Fix this by setting a flag in axienet_stop telling
axienet_dma_err_handler not to bother doing anything. I chose not to use
disable_work_sync to allow for easier backporting.
Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Fixes: 8a3b7a252d ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Link: https://patch.msgid.link/20240903175141.4132898-1-sean.anderson@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When userspace wants to take over a fdb entry by setting it as
EXTERN_LEARNED, we set both flags BR_FDB_ADDED_BY_EXT_LEARN and
BR_FDB_ADDED_BY_USER in br_fdb_external_learn_add().
If the bridge updates the entry later because its port changed, we clear
the BR_FDB_ADDED_BY_EXT_LEARN flag, but leave the BR_FDB_ADDED_BY_USER
flag set.
If userspace then wants to take over the entry again,
br_fdb_external_learn_add() sees that BR_FDB_ADDED_BY_USER and skips
setting the BR_FDB_ADDED_BY_EXT_LEARN flags, thus silently ignores the
update.
Fix this by always allowing to set BR_FDB_ADDED_BY_EXT_LEARN regardless
if this was a user fdb entry or not.
Fixes: 710ae72877 ("net: bridge: Mark FDB entries that were added by user as such")
Signed-off-by: Jonas Gorski <jonas.gorski@bisdn.de>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20240903081958.29951-1-jonas.gorski@bisdn.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
generic_ocp_write() asks the parameter "size" must be 4 bytes align.
Therefore, write the bp would fail, if the mac->bp_num is odd. Align the
size to 4 for fixing it. The way may write an extra bp, but the
rtl8152_is_fw_mac_ok() makes sure the value must be 0 for the bp whose
index is more than mac->bp_num. That is, there is no influence for the
firmware.
Besides, I check the return value of generic_ocp_write() to make sure
everything is correct.
Fixes: e5c266a611 ("r8152: set bp in bulk")
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Link: https://patch.msgid.link/20240903063333.4502-1-hayeswang@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Bareudp devices update their stats concurrently.
Therefore they need proper atomic increments.
Fixes: 571912c69f ("net: UDP tunnel encapsulation module for tunnelling different protocols like MPLS, IP, NSH etc.")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/04b7b9d0b480158eb3ab4366ec80aa2ab7e41fcb.1725031794.git.gnault@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
- Fix a typo in the rebalance accounting changes
- BCH_SB_MEMBER_INVALID: small on disk format feature which will be
needed for full erasure coding support; this is only the minimum so
that 6.11 can handle future versions without barfing.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEKnAFLkS8Qha+jvQrE6szbY3KbnYFAmbYsMQACgkQE6szbY3K
bna2GQ/9Hbj2VecuEmuWkzS3fMAJbQSJ3AFKLJ2BWmi7Zvez57jhFIXVL1kdehi1
K0tW9T8yLCtT8vUHTy42fb/MQzE1ARkLk9qubOnJj1M8+JGm0LL3WoDnNr2gM11i
cSsxIk++8WqCEWw3+0a57vHc97zugzOSE3Np/J8zKLUuXEGLOrNtgFj/OHXRlYSz
iSg0JwZp+MrpmdcUN9SpymNcTQp9VlpCKjcLvxV28aFR2PwJm1LnFrFf+RhsGl94
NXEwHRYj9vqEm+8UI4u9owyBbeU7c+gtt3cKrayU4cGQoKk/la8biZvgEKDkGJwy
9W+zO7GthRCD5tLVTxsnYYDTLyO5KOvDaHXm9iZrQzbe2wSayOx4HPVR55XkLDHj
P/qN60rQvMactTrqhZVRerybvvOGS94280qkR2BPkm6gvdEu8eTYq+0uQgqpoHLi
sIXRJuYDuTB+24Hx9wc42TjEYqkOHdZ7T3ZFuP4e9j3vjo+0znJOb/aY6SsqD/wR
Wonw5/NFxW53gkXytX5MNctnizy1HrL5Kq5qIZZgLXWGqfCBcie3yT7MtItuqVFa
sMENVGpZ0vxhx6GbL/5D2rgIAK9X6NQybpPRmGvUpg/BqahcG+/aNH+LXeJPBcUt
2kkd1nqKXaJn14gTh1bmkYwKlQdLmWQQT8cJ9D29wDI7q7hvRDw=
=ABhR
-----END PGP SIGNATURE-----
Merge tag 'bcachefs-2024-09-04' of git://evilpiepirate.org/bcachefs
Pull bcachefs fixes from Kent Overstreet:
- Fix a typo in the rebalance accounting changes
- BCH_SB_MEMBER_INVALID: small on disk format feature which will be
needed for full erasure coding support; this is only the minimum so
that 6.11 can handle future versions without barfing.
* tag 'bcachefs-2024-09-04' of git://evilpiepirate.org/bcachefs:
bcachefs: BCH_SB_MEMBER_INVALID
bcachefs: fix rebalance accounting
Jeongjun Park says:
====================
bpf: fix incorrect name check pass logic in btf_name_valid_section
This patch was written to fix an issue where btf_name_valid_section() would
not properly check names with certain conditions and would throw an OOB vuln.
And selftest was added to verify this patch.
====================
Link: https://lore.kernel.org/r/20240831054525.364353-1-aha310510@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add selftest for cases where btf_name_valid_section() does not properly
check for certain types of names.
Suggested-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Jeongjun Park <aha310510@gmail.com>
Link: https://lore.kernel.org/r/20240831054742.364585-1-aha310510@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Fix two inconsistencies in feature names as discussed in [1]:
1. Rename "dwarf-unwind-support" to "dwarf-unwind"
2. 'get_cpuid' feature and 'HAVE_AUXTRACE_SUPPORT' names don't
look related, change the feature name to 'auxtrace' to match the
macro name, as 'get_cpuid' string is not used anywhere to check the
feature presence
[1]: https://lore.kernel.org/linux-perf-users/ZoRw5we4HLSTZND6@x1/
Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Aditya Gupta <adityag@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240904190132.415212-7-adityag@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In probe_vfs_getname.sh, current we use "perf record --dry-run"
to check for libtraceevent and skip the test if perf is not
build with libtraceevent. Change the check to use "perf check feature"
option
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240904190132.415212-6-adityag@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently we use output of 'perf version --build-options', to check
whether perf was built with libtraceevent support.
Instead, use 'perf check feature libtraceevent' to check for
libtraceevent support.
Reviewed-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Aditya Gupta <adityag@linux.ibm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240904190132.415212-5-adityag@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now that the feature list has been duplicated in a global
'supported_features' array, use that array instead of manually checking
status of built-in features.
This helps in being consistent with commands such as 'perf check feature',
so commands can use the same array, and any new feature can be added at
one place, in the 'supported_features' array
Reviewed-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Aditya Gupta <adityag@linux.ibm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240904190132.415212-4-adityag@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
A number of small fixes for the late cycle:
* Two more build fixes on 32-bit archs
* Fixed a segfault during perf test
* Fixed spinlock/rwlock accounting bug in perf lock contention
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQSo2x5BnqMqsoHtzsmMstVUGiXMgwUCZtil/AAKCRCMstVUGiXM
g1p5AQCmfXr4T/CCVX82ExpHYYtsrbcQrmqMolqsGlN4J3I9EwD9EnSNIEkxy44o
bOrHRAzABpMTbhU8zVb2Mi+Gy+iccgc=
=SFTw
-----END PGP SIGNATURE-----
Merge tag 'perf-tools-fixes-for-v6.11-2024-09-04' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools fixes from Namhyung Kim:
"A number of small fixes for the late cycle:
- Two more build fixes on 32-bit archs
- Fixed a segfault during perf test
- Fixed spinlock/rwlock accounting bug in perf lock contention"
* tag 'perf-tools-fixes-for-v6.11-2024-09-04' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
perf daemon: Fix the build on more 32-bit architectures
perf python: include "util/sample.h"
perf lock contention: Fix spinlock and rwlock accounting
perf test pmu: Set uninitialized PMU alias to null
If the length of the name string is 1 and the value of name[0] is NULL
byte, an OOB vulnerability occurs in btf_name_valid_section() and the
return value is true, so the invalid name passes the check.
To solve this, you need to check if the first position is NULL byte and
if the first character is printable.
Suggested-by: Eduard Zingerman <eddyz87@gmail.com>
Fixes: bd70a8fb7c ("bpf: Allow all printable characters in BTF DATASEC names")
Signed-off-by: Jeongjun Park <aha310510@gmail.com>
Link: https://lore.kernel.org/r/20240831054702.364455-1-aha310510@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmbYn2YACgkQxWXV+ddt
WDum5Q//Topfw8yGOMpSUajZ7n4Iy81CknH6GnV2r0qj/0vK4XZ8a8PHpJPLn0gc
neTGo62vfaQ1HstKPvXWMJkoew5cL+khXW6zaEnieVLvlrVGD9i5NgtmgiC/kK00
Pwj8h2MFhdrXEJEXdk0g9IVaGRs78lruGuc0eI0sGESMbZdQ4OsLToU4zFCqgb6b
LZrHENyTIoYjiqMPYrZh4X4TxDV9lVw3XTbebB9vZPsC1Bj0H8uZ3rMU5hS7VboH
e/c7qmJWs/Gq0CNCGvQmguO2eK29NVE24XHoLgsTwpYFSXW1VOLNUlihgkP1aZsB
Zh7ETuMah7M/yjwXNASdM2mJcO3yVRryUZXApJFCdHTRz12aIcCYfIRCZZ+GQuQg
gZaRgEW4kpTOmdUY3weeJcmfgQiHem0+cOy4dC6ykvNpfCwj3HcOft3U5qaR3C6p
c+Gd4lurnWn3CtPmYZRQ/7g9vvKth7jXvBMTkPoS4KyaTe5Kk+ph9h7uUtyHZpQP
/zxaZlYNMX1C+4atVTpQhRTBqHEbiK9BLDErWkqG0Dv6x/NJv3iDSAX+S64WWJwK
+LkHW7m+5HnCQi++8uxE+V1dWispczbgIcMEmPoyQhhEVKHg9dx9EItr8MEvNpyd
YIV6qfGoQTWzTPGbApLxe94WOm4tpcaFUbyaWjTrXexsYK6lo2I=
=LHQV
-----END PGP SIGNATURE-----
Merge tag 'for-6.11-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
- followup fix for direct io and fsync under some conditions, reported
by QEMU users
- fix a potential leak when disabling quotas while some extent tracking
work can still happen
- in zoned mode handle unexpected change of zone write pointer in
RAID1-like block groups, turn the zones to read-only
* tag 'for-6.11-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: fix race between direct IO write and fsync when using same fd
btrfs: zoned: handle broken write pointer on zones
btrfs: qgroup: don't use extent changeset when not needed
-----BEGIN PGP SIGNATURE-----
iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmbX21YACgkQiiy9cAdy
T1Hp9gv/dX8tAaYOAE6h5FpzI7kYWsOD0AqEEboZm17rP1M0ihqWhj+tXTjqa5Tb
T31Kyl/yZ0lRLe6B9cuAWVJCo+1cFnM1sdnL99yE/WlxZzZ3C3exntNlOkcUanCM
FeyFnVaxWDhZ53mroOX1KBJ1r9LOkGL7czjBwgyhpDu4Q63H4ZsgXJDIu/TJVf4t
TZkreFoBvn/WocpPl1VXxapILqcW7v5hzfof4MEvAPsHJwP3ZlN0LJuHe6YaBfff
p8jMZeFfdQc02jjAgL+7KZxlppvRzrZsm+5DZ6C9HyLLJmMJpvGODFG9hVNA8wHT
xLdekOCgekVx0UlSOzkivSu5FW4XJHPuycr4ak+XI0n20LglGbyA8bT0X5kuslSt
ejjZbx+uSlT4jjTSJsateTd8B14UO0iIrAaPumOwvBGGtcDenH0/cQ8ktWY79x97
Pc19JEPSAK2usViFonD4WUEwlg1sFFpV1TCu/HM8VJv6XOb0QzCyZgF7k7o78ztz
Fp51C0LQ
=yxks
-----END PGP SIGNATURE-----
Merge tag 'v6.11-rc6-server-fixes' of git://git.samba.org/ksmbd
Pull smb server fixes from Steve French:
- Fix crash in session setup
- Fix locking bug
- Improve access bounds checking
* tag 'v6.11-rc6-server-fixes' of git://git.samba.org/ksmbd:
ksmbd: Unlock on in ksmbd_tcp_set_interfaces()
ksmbd: unset the binding mark of a reused connection
smb: Annotate struct xattr_smb_acl with __counted_by()
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZtQmqAAKCRCRxhvAZXjc
os+mAP47NBhOecERCJSmS0RFMuRvc0ijxz1642emEthZhtf8qQD/cy56WmGZqEFZ
bfj5v6tGmsxGt4xMDUDNG0pvqba8hwA=
=JBA5
-----END PGP SIGNATURE-----
Merge tag 'vfs-6.11-rc7.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs fixes from Christian Brauner:
"Two netfs fixes for this merge window:
- Ensure that fscache_cookie_lru_time is deleted when the fscache
module is removed to prevent UAF
- Fix filemap_invalidate_inode() to use invalidate_inode_pages2_range()
Before it used truncate_inode_pages_partial() which causes
copy_file_range() to fail on cifs"
* tag 'vfs-6.11-rc7.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
fscache: delete fscache_cookie_lru_timer when fscache exits to avoid UAF
mm: Fix filemap_invalidate_inode() to use invalidate_inode_pages2_range()
- Fix a build issue with older binutils with LD dead code elimination
disabled
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEuNNh8scc2k/wOAE+9OeQG+StrGQFAmbYZagACgkQ9OeQG+St
rGRtxBAAjj7hWL9sodPuTq9gowxYGJFiuIZQ3dHt6vNfuGlT13L4M4fDmX9Tq9FE
LWjp2eTQyzzGTO5abdWAKkrhR7ASKqRMeDmOJdjC/sgRnIxGhIoP9Iy5Vch+yXMC
TPZ1D+wMoeB/2QhtFc0ExS22BNbanmARiM+kikY+Fkm5OapceTD43gMgglVSEmRx
/mS8EKO50Dn4GxB0uoPgkhYM2Q9NSvUcE/uXbMkAsPuC8FUCT+z7/YcxyMZwuXrX
zy8wzriV67Fg/s2NK7B+Dt55mIClraIq0ATmn7qNUIBhrjUGw0qc8W6PA8pyLV91
aQP5MhXOaMEroZ9n41lCbuivefRRJLxxAa2YfDd7g+1dVQeqHYOpTpYWU0TtsKi0
5kOcur96U2SnYbVxihugzJgYIzNW54eH3rHTY9fJ88A+QQfLHxASZ7aTt23QRqPc
drjRmaTUInd+f6C9leL+roxWu39nXJeuey2VXivKA7K/WbhvEbFn2L9QR2htwFZd
hOhBaflVepJYlOWr8YOTvRzuXf/E78sYegfH9tD1M7Vnk3Ek/sPbhoYSK7OX9QTU
rei++QA+bE84ksKSEwyc3UHPTvL1rK3ZFYPFQ7PEYsZMEKkMMg/isNOxFVtJ7N9S
bURxl4GkSvqamtvqkCISqViPN8iNRHZcyT1EupUpTQQ6J8p4lqk=
=fJoT
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux
Pull ARM fix from Russell King:
- Fix a build issue with older binutils with LD dead code elimination
disabled
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux:
ARM: 9414/1: Fix build issue with LD_DEAD_CODE_DATA_ELIMINATION
- Fix boot issue where boot memory is marked read-only too early
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQS86RI+GtKfB8BJu973ErUQojoPXwUCZtgixgAKCRD3ErUQojoP
X6CGAQDPjigX2Cxo/BpWv5NNsRkbJ5QETvDQTro4V9TVl8pE5AEA084As/smH7TS
FB0JfKG1MkMwqlnZdVFLJ78Y8Aiwwgs=
=ru1k
-----END PGP SIGNATURE-----
Merge tag 'parisc-for-6.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc architecture fix from Helge Deller:
- Fix boot issue where boot memory is marked read-only too early
* tag 'parisc-for-6.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Delay write-protection until mark_rodata_ro() call
Mostly MM, no identifiable theme. And a few nilfs2 fixups.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZtfR/wAKCRDdBJ7gKXxA
jofjAP9rUlliIcn8zcy7vmBTuMaH4SkoULB64QWAUddaWV+SCAEA+q0sntLPnTIZ
My3sfihR6mbvhkgKbvIHm6YYQI56NAc=
=b4Lr
-----END PGP SIGNATURE-----
Merge tag 'mm-hotfixes-stable-2024-09-03-20-19' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"17 hotfixes, 15 of which are cc:stable.
Mostly MM, no identifiable theme. And a few nilfs2 fixups"
* tag 'mm-hotfixes-stable-2024-09-03-20-19' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
alloc_tag: fix allocation tag reporting when CONFIG_MODULES=n
mm: vmalloc: optimize vmap_lazy_nr arithmetic when purging each vmap_area
mailmap: update entry for Jan Kuliga
codetag: debug: mark codetags for poisoned page as empty
mm/memcontrol: respect zswap.writeback setting from parent cg too
scripts: fix gfp-translate after ___GFP_*_BITS conversion to an enum
Revert "mm: skip CMA pages when they are not available"
maple_tree: remove rcu_read_lock() from mt_validate()
kexec_file: fix elfcorehdr digest exclusion when CONFIG_CRASH_HOTPLUG=y
mm/slub: add check for s->flags in the alloc_tagging_slab_free_hook
nilfs2: fix state management in error path of log writing function
nilfs2: fix missing cleanup on rollforward recovery error
nilfs2: protect references to superblock parameters exposed in sysfs
userfaultfd: don't BUG_ON() if khugepaged yanks our page table
userfaultfd: fix checks for huge PMDs
mm: vmalloc: ensure vmap_block is initialised before adding to queue
selftests: mm: fix build errors on armhf
There is a build issue with LD segmentation fault, while
CONFIG_LD_DEAD_CODE_DATA_ELIMINATION is not enabled, as bellow.
scripts/link-vmlinux.sh: line 49: 3796 Segmentation fault
(core dumped) ${ld} ${ldflags} -o ${output} ${wl}--whole-archive
${objs} ${wl}--no-whole-archive ${wl}--start-group
${libs} ${wl}--end-group ${kallsymso} ${btf_vmlinux_bin_o} ${ldlibs}
The error occurs in older versions of the GNU ld with version earlier
than 2.36. It makes most sense to have a minimum LD version as
a dependency for HAVE_LD_DEAD_CODE_DATA_ELIMINATION and eliminate
the impact of ".reloc .text, R_ARM_NONE, ." when
CONFIG_LD_DEAD_CODE_DATA_ELIMINATION is not enabled.
Fixes: ed0f941022 ("ARM: 9404/1: arm32: enable HAVE_LD_DEAD_CODE_DATA_ELIMINATION")
Reported-by: Harith George <mail2hgg@gmail.com>
Tested-by: Harith George <mail2hgg@gmail.com>
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Yuntao Liu <liuyuntao12@huawei.com>
Link: https://lore.kernel.org/all/14e9aefb-88d1-4eee-8288-ef15d4a9b059@gmail.com/
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
When restricting jevents generated json lookup code with JEVENTS_MODEL
a list of models must be provided. Some builds don't know model names
but know cpuids. Add a command that can convert a cpuid to a model
using mapfile.csv files. This can be used with JEVENTS_MODEL like:
$ make JEVENTS_MODEL=`./pmu-events/models.py x86 'GenuineIntel-6-8D-1,AuthenticAMD-26-1' pmu-events/arch/`
Committer testing:
$ tools/perf/pmu-events/models.py x86 'GenuineIntel-6-8D-1,AuthenticAMD-26-1' tools/perf/pmu-events/arch/
tigerlake,amdzen5
$ perf stat -v sleep 1 |& head -1
Using CPUID GenuineIntel-6-B7-1
$ tools/perf/pmu-events/models.py x86 'GenuineIntel-6-B7-1' tools/perf/pmu-events/arch/
alderlake
$
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240904044351.712080-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently the presence of a feature is checked with a combination of
perf version --build-options and greps, such as:
perf version --build-options | grep " on .* HAVE_FEATURE"
Instead of this, introduce a subcommand "perf check feature", with which
scripts can test for presence of a feature, such as:
perf check feature HAVE_FEATURE
'perf check feature' command is expected to have exit status of 0 if
feature is built-in, and 1 if it's not built-in or if feature is not known.
Multiple features can also be passed as a comma-separated list, in which
case the exit status will be 1 only if all of the passed features are
built-in. For example, with below command, it will have exit status of 0
only if both libtraceevent and bpf are enabled, else 1 in all other cases
perf check feature libtraceevent,bpf
The arguments are case-insensitive.
An array 'supported_features' has also been introduced that can be used by
other commands like 'perf version --build-options', so that new features
can be added in one place, with the array
Committer testing:
$ perf check feature libtraceevent,bpf
libtraceevent: [ on ] # HAVE_LIBTRACEEVENT
bpf: [ on ] # HAVE_LIBBPF_SUPPORT
$ perf check feature libtraceevent
libtraceevent: [ on ] # HAVE_LIBTRACEEVENT
$ perf check feature bpf
bpf: [ on ] # HAVE_LIBBPF_SUPPORT
$ perf check -q feature bpf && echo "BPF support is present"
BPF support is present
$ perf check -q feature Bogus && echo "Bogus support is present"
$
Reviewed-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Aditya Gupta <adityag@linux.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240904061836.55873-3-adityag@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently, commands which depend on 'parse_options_subcommand()' don't
show the usage string, and instead show '(null)'
$ ./perf sched
Usage: (null)
-D, --dump-raw-trace dump raw trace in ASCII
-f, --force don't complain, do it
-i, --input <file> input file name
-v, --verbose be more verbose (show symbol address, etc)
'parse_options_subcommand()' is generally expected to initialise the usage
string, with information in the passed 'subcommands[]' array
This behaviour was changed in:
230a7a71f9 ("libsubcmd: Fix parse-options memory leak")
Where the generated usage string is deallocated, and usage[0] string is
reassigned as NULL.
As discussed in [1], free the allocated usage string in the main
function itself, and don't reset usage string to NULL in
parse_options_subcommand
With this change, the behaviour is restored.
$ ./perf sched
Usage: perf sched [<options>] {record|latency|map|replay|script|timehist}
-D, --dump-raw-trace dump raw trace in ASCII
-f, --force don't complain, do it
-i, --input <file> input file name
-v, --verbose be more verbose (show symbol address, etc)
[1]: https://lore.kernel.org/linux-perf-users/htq5vhx6piet4nuq2mmhk7fs2bhfykv52dbppwxmo3s7du2odf@styd27tioc6e/
Fixes: 230a7a71f9 ("libsubcmd: Fix parse-options memory leak")
Suggested-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Aditya Gupta <adityag@linux.ibm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240904061836.55873-2-adityag@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
On arm64 the breakpoint length should be 4-bytes but 8-bytes is
tolerated as perf passes that as sizeof(long). Just pass the correct
value.
On i386 the sizeof(long) check in the kernel needs to match the
kernel's long size. Check using an environment (uname checks) whether
4 or 8 bytes needs to be passed. Cache the value in a static.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Junhao He <hejunhao3@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yang Jihong <yangjihong@bytedance.com>
Link: https://lore.kernel.org/r/20240904050606.752788-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The default breakpoint length is "sizeof(long)" however this is
incorrect on platforms like Aarch64 where sizeof(long) is 8 but the
breakpoint length is 4. Add a helper function that can be used to
determine the correct breakpoint length, in this change it just
returns the existing default sizeof(long) value.
Use the helper in the bp_account test so that, when modifying the
event from a watchpoint to a breakpoint, the breakpoint length is
appropriate for the architecture and not just sizeof(long).
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Chaitanya S Prakash <chaitanyas.prakash@arm.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Junhao He <hejunhao3@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yang Jihong <yangjihong@bytedance.com>
Link: https://lore.kernel.org/r/20240904050606.752788-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently napi_disable() gets called during rxq and txq cleanup,
even before napi is enabled and hrtimer is initialized. It causes
kernel panic.
? page_fault_oops+0x136/0x2b0
? page_counter_cancel+0x2e/0x80
? do_user_addr_fault+0x2f2/0x640
? refill_obj_stock+0xc4/0x110
? exc_page_fault+0x71/0x160
? asm_exc_page_fault+0x27/0x30
? __mmdrop+0x10/0x180
? __mmdrop+0xec/0x180
? hrtimer_active+0xd/0x50
hrtimer_try_to_cancel+0x2c/0xf0
hrtimer_cancel+0x15/0x30
napi_disable+0x65/0x90
mana_destroy_rxq+0x4c/0x2f0
mana_create_rxq.isra.0+0x56c/0x6d0
? mana_uncfg_vport+0x50/0x50
mana_alloc_queues+0x21b/0x320
? skb_dequeue+0x5f/0x80
Cc: stable@vger.kernel.org
Fixes: e1b5683ff6 ("net: mana: Move NAPI from EQ to CQ")
Signed-off-by: Souradeep Chakrabarti <schakrabarti@linux.microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Create a sentinal value for "invalid device".
This is needed for removing devices that have stripes on them (force
removing, without evacuating); we need a sentinal value for the stripe
pointers to the device being removed.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Move away from corporate infrastructure for upstream work. Also update
mailmap.
Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
Link: https://lore.kernel.org/r/20240903200956.68231-1-a.hindborg@kernel.org
[ Reworded title slightly. - Miguel ]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Previously the cpu_list is a string and typically no cpu_list is
passed to __add_event().
Wanting to make events have their cpus distinct from the PMU means that
in more occassions we want to pass a cpu_list.
If we're reading this from sysfs it is easier to read a perf_cpu_map
than allocate and pass around strings that will later be parsed.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Gautham Shenoy <gautham.shenoy@amd.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Link: https://lore.kernel.org/r/20240718003025.1486232-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Merge perf_pmu__parse_per_pkg() and perf_pmu__parse_snapshot() that do the
same parsing except for the file suffix used.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Gautham Shenoy <gautham.shenoy@amd.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Link: https://lore.kernel.org/r/20240718003025.1486232-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQSQHSd0lITzzeNWNm3h3BK/laaZPAUCZtbV4AAKCRDh3BK/laaZ
PC33AP9XvLpQii0mLo12hTSP11TYpaatdhUvyFFKERle1yWkUgEAvtVutUJryTD2
sz7x5jj4GD9tCWyMlp8Xs5h1Dr4U6wc=
=XdIb
-----END PGP SIGNATURE-----
Merge tag 'fuse-fixes-6.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
Pull fuse fixes from Miklos Szeredi:
- Fix EIO if splice and page stealing are enabled on the fuse device
- Disable problematic combination of passthrough and writeback-cache
- Other bug fixes found by code review
* tag 'fuse-fixes-6.11-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: disable the combination of passthrough and writeback cache
fuse: update stats for pages in dropped aux writeback list
fuse: clear PG_uptodate when using a stolen page
fuse: fix memory leak in fuse_create_open
fuse: check aborted connection before adding requests to pending list for resending
fuse: use unsigned type for getxattr/listxattr size truncation
There's a potential race when `cgroup_bpf_enabled(CGROUP_GETSOCKOPT)` is
false during the execution of `BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN`, but
becomes true when `BPF_CGROUP_RUN_PROG_GETSOCKOPT` is called.
This inconsistency can lead to `BPF_CGROUP_RUN_PROG_GETSOCKOPT` receiving
an "-EFAULT" from `__cgroup_bpf_run_filter_getsockopt(max_optlen=0)`.
Scenario shown as below:
`process A` `process B`
----------- ------------
BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN
enable CGROUP_GETSOCKOPT
BPF_CGROUP_RUN_PROG_GETSOCKOPT (-EFAULT)
To resolve this, remove the `BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN` macro and
directly uses `copy_from_sockptr` to ensure that `max_optlen` is always
set before `BPF_CGROUP_RUN_PROG_GETSOCKOPT` is invoked.
Fixes: 0d01da6afc ("bpf: implement getsockopt and setsockopt hooks")
Co-developed-by: Yanghui Li <yanghui.li@mediatek.com>
Signed-off-by: Yanghui Li <yanghui.li@mediatek.com>
Co-developed-by: Cheng-Jui Wang <cheng-jui.wang@mediatek.com>
Signed-off-by: Cheng-Jui Wang <cheng-jui.wang@mediatek.com>
Signed-off-by: Tze-nan Wu <Tze-nan.Wu@mediatek.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Link: https://patch.msgid.link/20240830082518.23243-1-Tze-nan.Wu@mediatek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When CONFIG_DQL is not enabled, dql_group should be treated as a dead
declaration. However, its current extern declaration assumes the linker
will ignore it, which is generally true across most compiler and
architecture combinations.
But in certain cases, the linker still attempts to resolve the extern
struct, even when the associated code is dead, resulting in a linking
error. For instance the following error in loongarch64:
>> loongarch64-linux-ld: net-sysfs.c:(.text+0x589c): undefined reference to `dql_group'
Modify the declaration of the dead object to be an empty declaration
instead of an extern. This change will prevent the linker from
attempting to resolve an undefined reference.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202409012047.eCaOdfQJ-lkp@intel.com/
Fixes: 74293ea1c4 ("net: sysfs: Do not create sysfs for non BQL device")
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Simon Horman <horms@kernel.org> # build-tested
Link: https://patch.msgid.link/20240902101734.3260455-1-leitao@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The --show-prio option is used to display the priority of task.
It is disabled by default, which is consistent with original behavior.
The display format is xxx (priority does not change during task running)
or xxx->yyy (priority changes during task running)
Testcase:
# perf sched record nice -n 9 true
[ perf record: Woken up 0 times to write data ]
[ perf record: Captured and wrote 0.497 MB perf.data ]
# perf sched timehist -h
Usage: perf sched timehist [<options>]
-C, --cpu <cpu> list of cpus to profile
-D, --dump-raw-trace dump raw trace in ASCII
-f, --force don't complain, do it
-g, --call-graph Display call chains if present (default on)
-I, --idle-hist Show idle events only
-i, --input <file> input file name
-k, --vmlinux <file> vmlinux pathname
-M, --migrations Show migration events
-n, --next Show next task
-p, --pid <pid[,pid...]>
analyze events only for given process id(s)
-s, --summary Show only syscall summary with statistics
-S, --with-summary Show all syscalls and summary with statistics
-t, --tid <tid[,tid...]>
analyze events only for given thread id(s)
-V, --cpu-visual Add CPU visual
-v, --verbose be more verbose (show symbol address, etc)
-w, --wakeups Show wakeup events
--kallsyms <file>
kallsyms pathname
--max-stack <n> Maximum number of functions to display backtrace.
--show-prio Show task priority
--state Show task state when sched-out
--symfs <directory>
Look for files with symbols relative to this directory
--time <str> Time span for analysis (start,stop)
# perf sched timehist
Samples of sched_switch event do not have callchains.
time cpu task name wait time sch delay run time
[tid/pid] (msec) (msec) (msec)
--------------- ------ ------------------------------ --------- --------- ---------
23952.006537 [0000] perf[534] 0.000 0.000 0.000
23952.006593 [0000] migration/0[19] 0.000 0.014 0.056
23952.006899 [0001] perf[534] 0.000 0.000 0.000
23952.006947 [0001] migration/1[22] 0.000 0.015 0.047
23952.007138 [0002] perf[534] 0.000 0.000 0.000
<SNIP>
# perf sched timehist --show-prio
Samples of sched_switch event do not have callchains.
time cpu task name prio wait time sch delay run time
[tid/pid] (msec) (msec) (msec)
--------------- ------ ------------------------------ -------- --------- --------- ---------
23952.006537 [0000] perf[534] 120 0.000 0.000 0.000
23952.006593 [0000] migration/0[19] 0 0.000 0.014 0.056
23952.006899 [0001] perf[534] 120 0.000 0.000 0.000
<SNIP>
23952.034843 [0003] nice[535] 120->129 0.189 0.024 23.314
<SNIP>
23952.053838 [0005] rcu_preempt[16] 120 3.993 0.000 0.023
23952.053990 [0005] <idle> 120 0.023 0.023 0.152
23952.054137 [0006] <idle> 120 1.427 1.427 17.855
23952.054278 [0007] <idle> 120 0.506 0.506 1.650
Signed-off-by: Yang Jihong <yangjihong@bytedance.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240819033016.2427235-2-yangjihong@bytedance.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If netem_dequeue() enqueues packet to inner qdisc and that qdisc
returns __NET_XMIT_STOLEN. The packet is dropped but
qdisc_tree_reduce_backlog() is not called to update the parent's
q.qlen, leading to the similar use-after-free as Commit
e04991a48dbaf382 ("netem: fix return value if duplicate enqueue
fails")
Commands to trigger KASAN UaF:
ip link add type dummy
ip link set lo up
ip link set dummy0 up
tc qdisc add dev lo parent root handle 1: drr
tc filter add dev lo parent 1: basic classid 1:1
tc class add dev lo classid 1:1 drr
tc qdisc add dev lo parent 1:1 handle 2: netem
tc qdisc add dev lo parent 2: handle 3: drr
tc filter add dev lo parent 3: basic classid 3:1 action mirred egress
redirect dev dummy0
tc class add dev lo classid 3:1 drr
ping -c1 -W0.01 localhost # Trigger bug
tc class del dev lo classid 1:1
tc class add dev lo classid 1:1 drr
ping -c1 -W0.01 localhost # UaF
Fixes: 50612537e9 ("netem: fix classful handling")
Reported-by: Budimir Markovic <markovicbudimir@gmail.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Link: https://patch.msgid.link/20240901182438.4992-1-stephen@networkplumber.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>