This function wants to move a subset of a list from one element to the
tail into another list. It also needs to use the srcu synchronize
instead of the regular rcu version. Do this one element at a time
because that's the only to do it.
Fixes: be647e2c76 ("nvme: use srcu for iterating namespace list")
Reported-by: Venkat Rao Bagalkote <venkat88@linux.vnet.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.vnet.ibm.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Current release - regressions:
- Revert "igc: fix a log entry using uninitialized netdev",
it traded lack of netdev name in a printk() for a crash
Previous releases - regressions:
- Bluetooth: L2CAP: fix rejecting L2CAP_CONN_PARAM_UPDATE_REQ
- geneve: fix incorrectly setting lengths of inner headers in the skb,
confusing the drivers and causing mangled packets
- sched: initialize noop_qdisc owner to avoid false-positive recursion
detection (recursing on CPU 0), which bubbles up to user space as
a sendmsg() error, while noop_qdisc should silently drop
- netdevsim: fix backwards compatibility in nsim_get_iflink()
Previous releases - always broken:
- netfilter: ipset: fix race between namespace cleanup and gc
in the list:set type
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmZrFjoACgkQMUZtbf5S
Iru6Bw/+MfomIf6qvdCXKdka4eOeqZLg7gZU0UdC99VM1SH7QGazkAvj4ACbDMa7
04mgNZKquV5Fx6AJQwjAodzHx2KUl5WA5cWzAuLyA78lJXoipI7W+KRtcBzGl0gs
IQ+IQCofWjduLMc9y67TqTSnVhtDWaHWw6PwMW8Z4BotD9hXxoUeGXz373UA8xhW
2Wz1HkQbDqIFqc0Sp1c0IfAQtnzzvg4yC+KCV+2nHB/d8CAlCUJ6deVWbCtF8d5O
/ospqFykzkENbYh8ySMEs6bAH0mS2nMiLPRnoLW1b2vMQWgOwv8xYVaYHI5tP+7u
NxMZd4JQntBLhe8jV3sc6ciPnlPSDu6rNDwWJcvK26EHPXYg/opsihH18nMu1esO
fp//KvKz8BT4vrkAW+YpxaD86V1X0dKkPIr2qFQ3eMHF8A1p+lYcGiWd1BQNPj5A
HHX1ERTVHxyl1nH2wy0FHhPXt1k5SzUT9AS0PyBou14stwN1O8VHHmGrTbu+CHe5
/P1jJ9DNDGO6LdDr60W9r+ucyvGYGxoZe09NQOiBXYnJbb1Xq5Allh+d6O+oyT0y
kM1jsPt2360nF2TZ8lMpn+R+OfTdOaQMw5nHXd+XFX0VktQ/231vW9L/dRfcOt6C
ESuaDHz0Q1DE8PI/dfrxRQLDG7UckN27aTHdn+ZHkq4VjdUPUdk=
=cyRR
-----END PGP SIGNATURE-----
Merge tag 'net-6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from bluetooth and netfilter.
Slim pickings this time, probably a combination of summer, DevConf.cz,
and the end of first half of the year at corporations.
Current release - regressions:
- Revert "igc: fix a log entry using uninitialized netdev", it traded
lack of netdev name in a printk() for a crash
Previous releases - regressions:
- Bluetooth: L2CAP: fix rejecting L2CAP_CONN_PARAM_UPDATE_REQ
- geneve: fix incorrectly setting lengths of inner headers in the
skb, confusing the drivers and causing mangled packets
- sched: initialize noop_qdisc owner to avoid false-positive
recursion detection (recursing on CPU 0), which bubbles up to user
space as a sendmsg() error, while noop_qdisc should silently drop
- netdevsim: fix backwards compatibility in nsim_get_iflink()
Previous releases - always broken:
- netfilter: ipset: fix race between namespace cleanup and gc in the
list:set type"
* tag 'net-6.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (35 commits)
bnxt_en: Adjust logging of firmware messages in case of released token in __hwrm_send()
af_unix: Read with MSG_PEEK loops if the first unread byte is OOB
bnxt_en: Cap the size of HWRM_PORT_PHY_QCFG forwarded response
gve: Clear napi->skb before dev_kfree_skb_any()
ionic: fix use after netif_napi_del()
Revert "igc: fix a log entry using uninitialized netdev"
net: bridge: mst: fix suspicious rcu usage in br_mst_set_state
net: bridge: mst: pass vlan group directly to br_mst_vlan_set_state
net/ipv6: Fix the RT cache flush via sysctl using a previous delay
net: stmmac: replace priv->speed with the portTransmitRate from the tc-cbs parameters
gve: ignore nonrelevant GSO type bits when processing TSO headers
net: pse-pd: Use EOPNOTSUPP error code instead of ENOTSUPP
netfilter: Use flowlabel flow key when re-routing mangled packets
netfilter: ipset: Fix race between namespace cleanup and gc in the list:set type
netfilter: nft_inner: validate mandatory meta and payload
tcp: use signed arithmetic in tcp_rtx_probe0_timed_out()
mailmap: map Geliang's new email address
mptcp: pm: update add_addr counters after connect
mptcp: pm: inc RmAddr MIB counter once per RM_ADDR ID
mptcp: ensure snd_una is properly initialized on connect
...
Bugfixes:
- NFSv4.2: Fix a memory leak in nfs4_set_security_label
- NFSv2/v3: abort nfs_atomic_open_v23 if the name is too long.
- NFS: Add appropriate memory barriers to the sillyrename code
- Propagate readlink errors in nfs_symlink_filler
- NFS: don't invalidate dentries on transient errors
- NFS: fix unnecessary synchronous writes in random write workloads
- NFSv4.1: enforce rootpath check when deciding whether or not to trunk
Other:
- Change email address for Trond Myklebust due to email server concerns
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEESQctxSBg8JpV8KqEZwvnipYKAPIFAmZrD7kACgkQZwvnipYK
APJo/RAAlg5uU8kTzXCbUbe9LImF5nh+9XFtg6nnQ+rxQqCU17noT0bazghvBcDP
N6v4evWJVeZhnqspVZkdMQWeyNEqsew5uPRoC+gy0sh4RdwT+BHsMwLMWtNTzXoc
GW7DOJ7LePzxmh0bksIFn6vmsuhxyI7hKkDx8XuG0YjmHQDcl2TeyHLfii7TFIMP
hFEVw63ZFb+HKV0oInyP27iiM1HstvZ8MbxLcu1PoA3IaiNUYXUgBViWF2c5P6uY
KV7KynUMgkWQc289aaR7QE5Yz2f4vsYF4oD72+Z3v65W+5HunYut/HUnUGgjHPGq
dI5EwSgxW5YKoo/kIvto3yF+ppkppl2gUYFlN3+O/IEXwh+FTXBF2b/tUWFkKQPH
7X3YoosWV/WN1eWqa55znF5IzrG5gdR5z6Et1elTmn4xG4hwoC5U5lOP34DohS4Q
N/MMxzVcL348j2teN+dFNXM3WkEVaMJudavJ7A7KehZKSTZAuFNHDxsMjvyGBbGI
Za1DanWSCWBD8Bawt9hB8z+k4eN7dGfeWUgMHa2zxOowfeq0MYTjAlr1A8SuXADv
E+1tjL7CL7HeHReOg0cbP11BxLOlXYiObsbivFcbYGglbWwtPqR4q4+CIA6AuhCF
LSpxF/6uKCXUHCuuGdAgZ5nNApHvC+HoUN0gCBcpIALcazLq0TQ=
=DeAG
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-6.10-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client fixes from Trond Myklebust:
"Bugfixes:
- NFSv4.2: Fix a memory leak in nfs4_set_security_label
- NFSv2/v3: abort nfs_atomic_open_v23 if the name is too long.
- NFS: Add appropriate memory barriers to the sillyrename code
- Propagate readlink errors in nfs_symlink_filler
- NFS: don't invalidate dentries on transient errors
- NFS: fix unnecessary synchronous writes in random write workloads
- NFSv4.1: enforce rootpath check when deciding whether or not to trunk
Other:
- Change email address for Trond Myklebust due to email server concerns"
* tag 'nfs-for-6.10-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFS: add barriers when testing for NFS_FSDATA_BLOCKED
SUNRPC: return proper error from gss_wrap_req_priv
NFSv4.1 enforce rootpath check in fs_location query
NFS: abort nfs_atomic_open_v23 if name is too long.
nfs: don't invalidate dentries on transient errors
nfs: Avoid flushing many pages with NFS_FILE_SYNC
nfs: propagate readlink errors in nfs_symlink_filler
MAINTAINERS: Change email address for Trond Myklebust
NFSv4: Fix memory leak in nfs4_set_security_label
To check for unset node ID for a range memblock_validate_numa_coverage()
was checking for NUMA_NO_NODE, but x86 used MAX_NUMNODES when no node ID
was specified by buggy firmware.
Update memblock to substitute MAX_NUMNODES with NUMA_NO_NODE in
memblock_set_node() and use NUMA_NO_NODE in x86::numa_init().
-----BEGIN PGP SIGNATURE-----
iQFEBAABCgAuFiEEeOVYVaWZL5900a/pOQOGJssO/ZEFAmZq/CsQHHJwcHRAa2Vy
bmVsLm9yZwAKCRA5A4Ymyw79kcpQB/4kmPgJJ0ApdwLT1JiPgLabAPOa05GvCcfa
/1JsoAIX5NlBThy2mX0QJ3963MFkB1wc8KqJuG8OpsL9/AHpdgts+4Me/K2PORWH
cZbgU01S4eqlBIY08mODnSYIpQI+n88kzYob+jRGud/NSwk7wu/+//n6lACqsltE
K+E/9zSfmnnr8gxv6rsi7YTQrXWAsGIhLJDLamYM9Q3Pz0azvdzrfLRlVV4NaaUw
Dvj6wG60A9qAmXP46OTU3DvlVGA5qv4rahLA8JuHC3TIV12/JchENL2yOAj5SMiv
0k/q+89HAcvFm9ByV+auEd1IKjgvNPQYsWaYnB88HZ10oMNkuDD0
=Y/Dv
-----END PGP SIGNATURE-----
Merge tag 'fixes-2024-06-13' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock
Pull memblock fixes from Mike Rapoport:
"Fix validation of NUMA coverage.
memblock_validate_numa_coverage() was checking for a unset node ID
using NUMA_NO_NODE, but x86 used MAX_NUMNODES when no node ID was
specified by buggy firmware.
Update memblock to substitute MAX_NUMNODES with NUMA_NO_NODE in
memblock_set_node() and use NUMA_NO_NODE in x86::numa_init()"
* tag 'fixes-2024-06-13' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
x86/mm/numa: Use NUMA_NO_NODE when calling memblock_set_node()
memblock: make memblock_set_node() also warn about use of MAX_NUMNODES
In case of token is released due to token->state == BNXT_HWRM_DEFERRED,
released token (set to NULL) is used in log messages. This issue is
expected to be prevented by HWRM_ERR_CODE_PF_UNAVAILABLE error code. But
this error code is returned by recent firmware. So some firmware may not
return it. This may lead to NULL pointer dereference.
Adjust this issue by adding token pointer check.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 8fa4219dba ("bnxt_en: add dynamic debug support for HWRM messages")
Suggested-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/20240611082547.12178-1-amishin@t-argos.ru
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Firmware interface 1.10.2.118 has increased the size of
HWRM_PORT_PHY_QCFG response beyond the maximum size that can be
forwarded. When the VF's link state is not the default auto state,
the PF will need to forward the response back to the VF to indicate
the forced state. This regression may cause the VF to fail to
initialize.
Fix it by capping the HWRM_PORT_PHY_QCFG response to the maximum
96 bytes. The SPEEDS2_SUPPORTED flag needs to be cleared because the
new speeds2 fields are beyond the legacy structure. Also modify
bnxt_hwrm_fwd_resp() to print a warning if the message size exceeds 96
bytes to make this failure more obvious.
Fixes: 84a911db83 ("bnxt_en: Update firmware interface to 1.10.2.118")
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/20240612231736.57823-1-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Some editors (like the vim variants), when seeing "trim_whitespace"
decide to do just that for all of the whitespace in the file you are
saving, even if it is not on a line that you have modified. This plays
havoc with diffs and is NOT something that should be intended.
As the "only trim whitespace on modified lines" is not part of the
editorconfig standard yet, just delete these lines from the
.editorconfig file so that we don't end up with diffs that are
automatically rejected by maintainers for containing things they
shouldn't.
Cc: Danny Lin <danny@kdrag0n.dev>
Cc: Íñigo Huguet <ihuguet@redhat.com>
Cc: Mickaël Salaün <mic@digikod.net>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Fixes: 5a602de997 ("Add .editorconfig file for basic formatting")
Acked-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Link: https://lore.kernel.org/r/2024061137-jawless-dipped-e789@gregkh
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
gve_rx_free_skb incorrectly leaves napi->skb referencing an skb after it
is freed with dev_kfree_skb_any(). This can result in a subsequent call
to napi_get_frags returning a dangling pointer.
Fix this by clearing napi->skb before the skb is freed.
Fixes: 9b8dd5e5ea ("gve: DQO: Add RX path")
Cc: stable@vger.kernel.org
Reported-by: Shailend Chand <shailend@google.com>
Signed-off-by: Ziwei Xiao <ziweixiao@google.com>
Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
Reviewed-by: Shailend Chand <shailend@google.com>
Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com>
Link: https://lore.kernel.org/r/20240612001654.923887-1-ziweixiao@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
disable c6 called in guc_pc_fini_hw is unreachable.
GuC PC init returns earlier if skip_guc_pc is true and never
registers the finish call thus making disable_c6 unreachable.
move this call to gt idle.
v2: rebase
v3: add fixes tag (Himal)
Fixes: 975e4a3795 ("drm/xe: Manually setup C6 when skip_guc_pc is set")
Signed-off-by: Riana Tauro <riana.tauro@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240606100842.956072-3-riana.tauro@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit 6800e63cf9)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Tests show that user fence signalling requires kind of write barrier,
otherwise not all writes performed by the workload will be available
to userspace. It is already done for render and compute, we need it
also for the rest: video, gsc, copy.
Fixes: dd08ebf6c3 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240605-fix_user_fence_posted-v3-2-06e7932f784a@intel.com
(cherry picked from commit 3ad7d18c5d)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
The Local Memory (aka VRAM) is only available on DGFX platforms.
We shouldn't attempt to provision VFs with LMEM or attempt to
update the LMTT on non-DGFX platforms. Add missing asserts that
would enforce that and fix release code that could crash on iGFX
due to uninitialized LMTT.
Fixes: 0698ff57bf ("drm/xe/pf: Update the LMTT when freeing VF GT config")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Piotr Piórkowski <piotr.piorkowski@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240607153155.1592-1-michal.wajdeczko@intel.com
(cherry picked from commit b321cb83a3)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
After starting to install the EC address space handler at the ACPI
namespace root, if there is an "orphan" _REG method in the EC device's
scope, it will not be evaluated any more. This breaks EC operation
regions on some systems, like Asus gu605.
To address this, use a wrapper around an existing ACPICA function to
look for an "orphan" _REG method in the EC device scope and evaluate
it if present.
Fixes: 60fa6ae6e6 ("ACPI: EC: Install address space handler at the namespace root")
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218945
Reported-by: VitaliiT <vitaly.torshyn@gmail.com>
Tested-by: VitaliiT <vitaly.torshyn@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
This is a re-commit of
da05b143a3 ("x86/boot: Don't add the EFI stub to targets")
after the tagged patch incorrectly reverted it.
vmlinux-objs-y is added to targets, with an assumption that they are all
relative to $(obj); adding a $(objtree)/drivers/... path causes the
build to incorrectly create a useless
arch/x86/boot/compressed/drivers/... directory tree.
Fix this just by using a different make variable for the EFI stub.
Fixes: cb8bda8ad4 ("x86/boot/compressed: Rename efi_thunk_64.S to efi-mixed.S")
Signed-off-by: Ben Segall <bsegall@google.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Cc: stable@vger.kernel.org # v6.1+
Link: https://lore.kernel.org/r/xm267ceukksz.fsf@bsegall.svl.corp.google.com
Nikolay Aleksandrov says:
====================
net: bridge: mst: fix suspicious rcu usage warning
This set fixes a suspicious RCU usage warning triggered by syzbot[1] in
the bridge's MST code. After I converted br_mst_set_state to RCU, I
forgot to update the vlan group dereference helper. Fix it by using
the proper helper, in order to do that we need to pass the vlan group
which is already obtained correctly by the callers for their respective
context. Patch 01 is a requirement for the fix in patch 02.
Note I did consider rcu_dereference_rtnl() but the churn is much bigger
and in every part of the bridge. We can do that as a cleanup in
net-next.
[1] https://syzkaller.appspot.com/bug?extid=9bbe2de1bc9d470eb5fe
=============================
WARNING: suspicious RCU usage
6.10.0-rc2-syzkaller-00235-g8a92980606e3 #0 Not tainted
-----------------------------
net/bridge/br_private.h:1599 suspicious rcu_dereference_protected() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
4 locks held by syz-executor.1/5374:
#0: ffff888022d50b18 (&mm->mmap_lock){++++}-{3:3}, at: mmap_read_lock include/linux/mmap_lock.h:144 [inline]
#0: ffff888022d50b18 (&mm->mmap_lock){++++}-{3:3}, at: __mm_populate+0x1b0/0x460 mm/gup.c:2111
#1: ffffc90000a18c00 ((&p->forward_delay_timer)){+.-.}-{0:0}, at: call_timer_fn+0xc0/0x650 kernel/time/timer.c:1789
#2: ffff88805fb2ccb8 (&br->lock){+.-.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline]
#2: ffff88805fb2ccb8 (&br->lock){+.-.}-{2:2}, at: br_forward_delay_timer_expired+0x50/0x440 net/bridge/br_stp_timer.c:86
#3: ffffffff8e333fa0 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:329 [inline]
#3: ffffffff8e333fa0 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:781 [inline]
#3: ffffffff8e333fa0 (rcu_read_lock){....}-{1:2}, at: br_mst_set_state+0x171/0x7a0 net/bridge/br_mst.c:105
stack backtrace:
CPU: 1 PID: 5374 Comm: syz-executor.1 Not tainted 6.10.0-rc2-syzkaller-00235-g8a92980606e3 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
lockdep_rcu_suspicious+0x221/0x340 kernel/locking/lockdep.c:6712
nbp_vlan_group net/bridge/br_private.h:1599 [inline]
br_mst_set_state+0x29e/0x7a0 net/bridge/br_mst.c:106
br_set_state+0x28a/0x7b0 net/bridge/br_stp.c:47
br_forward_delay_timer_expired+0x176/0x440 net/bridge/br_stp_timer.c:88
call_timer_fn+0x18e/0x650 kernel/time/timer.c:1792
expire_timers kernel/time/timer.c:1843 [inline]
__run_timers kernel/time/timer.c:2417 [inline]
__run_timer_base+0x66a/0x8e0 kernel/time/timer.c:2428
run_timer_base kernel/time/timer.c:2437 [inline]
run_timer_softirq+0xb7/0x170 kernel/time/timer.c:2447
handle_softirqs+0x2c4/0x970 kernel/softirq.c:554
__do_softirq kernel/softirq.c:588 [inline]
invoke_softirq kernel/softirq.c:428 [inline]
__irq_exit_rcu+0xf4/0x1c0 kernel/softirq.c:637
irq_exit_rcu+0x9/0x30 kernel/softirq.c:649
instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1043 [inline]
sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1043
</IRQ>
<TASK>
====================
Link: https://lore.kernel.org/r/20240609103654.914987-1-razor@blackwall.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The net.ipv6.route.flush system parameter takes a value which specifies
a delay used during the flush operation for aging exception routes. The
written value is however not used in the currently requested flush and
instead utilized only in the next one.
A problem is that ipv6_sysctl_rtcache_flush() first reads the old value
of net->ipv6.sysctl.flush_delay into a local delay variable and then
calls proc_dointvec() which actually updates the sysctl based on the
provided input.
Fix the problem by switching the order of the two operations.
Fixes: 4990509f19 ("[NETNS][IPV6]: Make sysctls route per namespace.")
Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20240607112828.30285-1-petr.pavlu@suse.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
- Fix clkdev - erroring out on long strings causes boot failures, so
don't do this. Still warn about the over-sized strings (which will
never match and thus their registration with clkdev is useless.)
- Fix for ftrace with frame pointer unwinder with recent GCC changing
the way frames are stacked.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEuNNh8scc2k/wOAE+9OeQG+StrGQFAmZm34EACgkQ9OeQG+St
rGTFpRAAivJ0AigfOr4AX4t+50UkijqNHSkCWkNL00mXzLDJfiFa2nSeIqb0oW3u
lzpCzBbnFgtTFu9j6T96dRJFBsZL6+Bmw9SIxkT1rsjG3tsExajETINAJrYmuXJ/
VxgY9k+xl/3XDKAPE0zAlcadvC/lnl4yxVduRaTK++0RZN3M/zM241anZWJs7hFb
yzpaf1hfcCFd5objz29uBhJMtJWl6EVl6y2XOQPvAa5mXTRNxu5kXP0WwaetdCNM
GyxOmdskXrqQRC1LU/UyX2g1vtIJbmHAEXMgqzHAkV+uiLt98A/AKe7NzoV/2zI7
+2PZM3LHXlwkv7PUBR2B8/CYD0448btjK+3fdRK0Ck8QpvE29yHukrA2zS2f2vTh
L3cFJZYNThDycAZUMX7hGum4Xe/l6fCqMnmLlyFidTWkWx+tawtRQNQB6VVDEzrM
+iqRbSMTZ3uDH4IBnXigS1KcZG/444EzcSB3e4zeVP5yH+AJ5xJG/itG/mvT10zE
214eg3C+hSUgm40s+juykty+lyggJwNxYYLu0mSLMYl+gu224s/N4nsE4Gb+7Lvn
iw4TtK0XNOkZ3VQ+M+XOYBu3nRZON37P91BYs692TgbfN/xreccoWujY2dIwx1s/
e8izgnDqQOovXVP8zKyTvD0THXSMIwIcwxD49Hudxhsfpyi5dfw=
=2Kpc
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux
Pull ARM and clkdev fixes from Russell King:
- Fix clkdev - erroring out on long strings causes boot failures, so
don't do this. Still warn about the over-sized strings (which will
never match and thus their registration with clkdev is useless)
- Fix for ftrace with frame pointer unwinder with recent GCC changing
the way frames are stacked.
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux:
ARM: 9405/1: ftrace: Don't assume stack frames are contiguous in memory
clkdev: don't fail clkdev_alloc() if over-sized
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEN9lkrMBJgcdVAPub1V2XiooUIOQFAmZoxygACgkQ1V2XiooU
IOSznA/+IofQMauyDxkynJAVVQSPH0Ee8kbeGj0jJWVTZVtLejZ/nLenGbsVGL16
SPv6QbiHIRCCbJ/Py+LlkXJ8iXF0ka2bu8p5IYQ2ic949JSay32XAaK4EZdQTtTS
3gx3cM2PKIYY4yi9K0qDpMsC7ZyKjAlVVjOrBPiP8viYtXcgoVbMq1/E8kAUf/G1
YB+occVHk6NDY6MJE+ooYwhssv4qSaPNHNHmQDQFHhQ4cdVKzT93do55Hu4/K7DN
6LlXAW+YpVeSmopwARrc+FmchwxyIoSQsh26Yn2Q/LiE9kQGeLQ34+nZjtmp/4aD
MkThTSk+ImJO1kuLibC2m84bg94c0dfDk/p5Gbcr1l0DxqOseYOz1hfnNTvfcBE8
4m30+jdtFabDy2pJKIEC730+UW5TrH+0f8HWazGOYJxvJOpip/ZiVjzcp57EybJS
y3GxExVWGab3hl+w2wfFfM5uHWosfXfJzKjdKMJNswpcZ89QR4/GqAZAvPMh8ljW
dxhwcRQ6IEy6B6yVnQ9dq9W9aGfsojIHqcK3HXdp3xvSnW2ZcRD89EL+nw1xfRzw
gl/89/EnTjKywCAJ1XhKh/WwFT5r+b8RGqDTH4aeT4OBrR0v2/Z9RUmXW9wF+C0t
nll1UlwdZqE0E0VvhhvwGGgF0Fr8nfLC1taIE0YW1/u9DxWleHs=
=yxF0
-----END PGP SIGNATURE-----
Merge tag 'nf-24-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net:
Patch #1 fixes insufficient sanitization of netlink attributes for the
inner expression which can trigger nul-pointer dereference,
from Davide Ornaghi.
Patch #2 address a report that there is a race condition between
namespace cleanup and the garbage collection of the list:set
type. This patch resolves this issue with other minor issues
as well, from Jozsef Kadlecsik.
Patch #3 ip6_route_me_harder() ignores flowlabel/dsfield when ip dscp
has been mangled, this unbreaks ip6 dscp set $v,
from Florian Westphal.
All of these patches address issues that are present in several releases.
* tag 'nf-24-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: Use flowlabel flow key when re-routing mangled packets
netfilter: ipset: Fix race between namespace cleanup and gc in the list:set type
netfilter: nft_inner: validate mandatory meta and payload
====================
Link: https://lore.kernel.org/r/20240611220323.413713-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
- fix kworker explosion, due to calling submit_bio() (which can block)
from a multithreaded workqueue
- fix error handling in btree node scan
- forward compat fix: kill an old debug assert
- key cache shrinker fixes
this is a partial fix for stalls doing multithreaded creates - there
were various O(n^2) issues the key cache shrinker was hitting
https://lore.kernel.org/linux-bcachefs/fmfpgkt3dlzxhotrfmqg3j3wn5bpqiqvlc44xllncvdkimmx3i@n4okabtvhu7t/
there's more work coming here; I'm working on a patch to delete the
key cache lock, which initial testing shows to be a pretty drastic
performance improvement
- assorted syzbot fixes
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEKnAFLkS8Qha+jvQrE6szbY3KbnYFAmZqAsEACgkQE6szbY3K
bnYI5g//UvmEfIwy/JJz8ageQuAyVZmAcz56L35oRHhtN4AXK0oDYsvQ4K1myNLy
D5tCCvp5j/RD2Pa+yPpknAt+yZC4tSoPvltaE69dm6S+0MWqGKL0+XxnYedmMci6
ni6FLkGYoFqrMfcy3Z0I6L5RjsYXsqAN8gJpN1Ei2uw0C/WAlwMN7+Tp4SrWMKNB
moD+b8qu8L6fUvmakn9W7TytblPfZN67UAkKd7qXbeTPoJKIuZoyAmfF1w0Etpt3
uW7wUyLYHYJbeIdegDhk2IT3Z0dCnICmR6RkmPecuu9mKhMjL+Uy7afmYuNoeAeD
CpiDgERgxUGuAQvOf1M4Hjk8u/gIXfie3Mz9kpKEMUb+xfjb59xLuTJvhjah19Pk
5GGH0EbO599E56Q4tkss32HqcXeVXFKe+1WPWooDPiuhGiE4eikN36m6T48hj3i8
v/Up9O/ytXfXzm+gXCoJzp9lTGNi8v9Ugykw8SNnY6QjIL2Hztsp3KoqlAJcacWY
dp+oLjzPmv4jBJQxF9C58eW2Iew7zGM6pwO0inBjNwvFgMC45RZ7ILZrW7k1fLN5
TozGpxdBe8NPmAPJ8n86y9A6ljcn74tgtVqYRUE3jLGwCnMEfJBM4lcqU0lM3gjJ
w7sTXw0FVK7v73MwxcWhNRfminGucW3kMdCZyzqiw0G7K/Ezb1w=
=DhKg
-----END PGP SIGNATURE-----
Merge tag 'bcachefs-2024-06-12' of https://evilpiepirate.org/git/bcachefs
Pull bcachefs fixes from Kent Overstreet:
- fix kworker explosion, due to calling submit_bio() (which can block)
from a multithreaded workqueue
- fix error handling in btree node scan
- forward compat fix: kill an old debug assert
- key cache shrinker fixes
This is a partial fix for stalls doing multithreaded creates - there
were various O(n^2) issues the key cache shrinker was hitting [1].
There's more work coming here; I'm working on a patch to delete the
key cache lock, which initial testing shows to be a pretty drastic
performance improvement
- assorted syzbot fixes
Link: https://lore.kernel.org/linux-bcachefs/CAGudoHGenxzk0ZqPXXi1_QDbfqQhGHu+wUwzyS6WmfkUZ1HiXA@mail.gmail.com/ [1]
* tag 'bcachefs-2024-06-12' of https://evilpiepirate.org/git/bcachefs:
bcachefs: Fix rcu_read_lock() leak in drop_extra_replicas
bcachefs: Add missing bch_inode_info.ei_flags init
bcachefs: Add missing synchronize_srcu_expedited() call when shutting down
bcachefs: Check for invalid bucket from bucket_gen(), gc_bucket()
bcachefs: Replace bucket_valid() asserts in bucket lookup with proper checks
bcachefs: Fix snapshot_create_lock lock ordering
bcachefs: Fix refcount leak in check_fix_ptrs()
bcachefs: Leave a buffer in the btree key cache to avoid lock thrashing
bcachefs: Fix reporting of freed objects from key cache shrinker
bcachefs: set sb->s_shrinker->seeks = 0
bcachefs: increase key cache shrinker batch size
bcachefs: Enable automatic shrinking for rhashtables
bcachefs: fix the display format for show-super
bcachefs: fix stack frame size in fsck.c
bcachefs: Delete incorrect BTREE_ID_NR assertion
bcachefs: Fix incorrect error handling found_btree_node_is_readable()
bcachefs: Split out btree_write_submit_wq
In order to improve performance of typical scenarios we can try to insert
the entire vma on fault. This accelerates typical cases, such as when
the MMIO region is DMA mapped by QEMU. The vfio_iommu_type1 driver will
fault in the entire DMA mapped range through fixup_user_fault().
In synthetic testing, this improves the time required to walk a PCI BAR
mapping from userspace by roughly 1/3rd.
This is likely an interim solution until vmf_insert_pfn_{pmd,pud}() gain
support for pfnmaps.
Suggested-by: Yan Zhao <yan.y.zhao@intel.com>
Link: https://lore.kernel.org/all/Zl6XdUkt%2FzMMGOLF@yzhao56-desk.sh.intel.com/
Reviewed-by: Yan Zhao <yan.y.zhao@intel.com>
Link: https://lore.kernel.org/r/20240607035213.2054226-1-alex.williamson@redhat.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Make it again possible for sparse to verify that blk_status_t and Unix
error codes are used in the proper context by making nbd_send_cmd()
return a blk_status_t instead of an integer.
No functionality has been changed.
Signed-off-by: Christoph Hellwig <hch@lst.de>
[ bvanassche: added description and made two small formatting changes ]
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20240604221531.327131-1-bvanassche@acm.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
There is a report of io_rsrc_ref_quiesce() locking a mutex while not
TASK_RUNNING, which is due to forgetting restoring the state back after
io_run_task_work_sig() and attempts to break out of the waiting loop.
do not call blocking ops when !TASK_RUNNING; state=1 set at
[<ffffffff815d2494>] prepare_to_wait+0xa4/0x380
kernel/sched/wait.c:237
WARNING: CPU: 2 PID: 397056 at kernel/sched/core.c:10099
__might_sleep+0x114/0x160 kernel/sched/core.c:10099
RIP: 0010:__might_sleep+0x114/0x160 kernel/sched/core.c:10099
Call Trace:
<TASK>
__mutex_lock_common kernel/locking/mutex.c:585 [inline]
__mutex_lock+0xb4/0x940 kernel/locking/mutex.c:752
io_rsrc_ref_quiesce+0x590/0x940 io_uring/rsrc.c:253
io_sqe_buffers_unregister+0xa2/0x340 io_uring/rsrc.c:799
__io_uring_register io_uring/register.c:424 [inline]
__do_sys_io_uring_register+0x5b9/0x2400 io_uring/register.c:613
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xd8/0x270 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x6f/0x77
Reported-by: Li Shi <sl1589472800@gmail.com>
Fixes: 4ea15b56f0 ("io_uring/rsrc: use wq for quiescing")
Cc: stable@vger.kernel.org
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/77966bc104e25b0534995d5dbb152332bc8f31c0.1718196953.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The spec doesn't mandate that the first two double words (aka results)
for the command queue entry need to be set to 0 when they are not
used (not specified). Though, the target implemention returns 0 for TCP
and FC but not for RDMA.
Let's make RDMA behave the same and thus explicitly initializing the
result field. This prevents leaking any data from the stack.
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
The id override functions return a status which is not propagated to the
caller.
Fixes: c1fef73f79 ("nvmet: add passthru code to process commands")
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
If a discard request needs to be retried, and that retry may fail before
a new special payload is added, a double free will result. Clear the
RQF_SPECIAL_LOAD when the request is cleaned.
Signed-off-by: Chunguang Xu <chunguang.xu@shopee.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
The user mapped intergity is copied back and unpinned by
bio_integrity_free which is a low-level routine. Do it via the submitter
rather than doing it in the low-level block layer code, to split the
submitter side from the consumer side of the bio.
Signed-off-by: Anuj Gupta <anuj20.g@samsung.com>
Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20240610111144.14647-1-anuj20.g@samsung.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Friedrich Weber reported a kernel crash problem and bisected to commit
81ada09cc2 ("blk-flush: reuse rq queuelist in flush state machine").
The root cause is that we use "list_move_tail(&rq->queuelist, pending)"
in the PREFLUSH/POSTFLUSH sequences. But rq->queuelist.next == xxx since
it's popped out from plug->cached_rq in __blk_mq_alloc_requests_batch().
We don't initialize its queuelist just for this first request, although
the queuelist of all later popped requests will be initialized.
Fix it by changing to use "list_add_tail(&rq->queuelist, pending)" so
rq->queuelist doesn't need to be initialized. It should be ok since rq
can't be on any list when PREFLUSH or POSTFLUSH, has no move actually.
Please note the commit 81ada09cc2 ("blk-flush: reuse rq queuelist in
flush state machine") also has another requirement that no drivers would
touch rq->queuelist after blk_mq_end_request() since we will reuse it to
add rq to the post-flush pending list in POSTFLUSH. If this is not true,
we will have to revert that commit IMHO.
This updated version adds "list_del_init(&rq->queuelist)" in flush rq
callback since the dm layer may submit request of a weird invalid format
(REQ_FSEQ_PREFLUSH | REQ_FSEQ_POSTFLUSH), which causes double list_add
if without this "list_del_init(&rq->queuelist)". The weird invalid format
problem should be fixed in dm layer.
Reported-by: Friedrich Weber <f.weber@proxmox.com>
Closes: https://lore.kernel.org/lkml/14b89dfb-505c-49f7-aebb-01c54451db40@proxmox.com/
Closes: https://lore.kernel.org/lkml/c9d03ff7-27c5-4ebd-b3f6-5a90d96f35ba@proxmox.com/
Fixes: 81ada09cc2 ("blk-flush: reuse rq queuelist in flush state machine")
Cc: Christoph Hellwig <hch@lst.de>
Cc: ming.lei@redhat.com
Cc: bvanassche@acm.org
Tested-by: Friedrich Weber <f.weber@proxmox.com>
Signed-off-by: Chengming Zhou <chengming.zhou@linux.dev>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20240608143115.972486-1-chengming.zhou@linux.dev
Signed-off-by: Jens Axboe <axboe@kernel.dk>
For zoned block devices using zone write plugging, an rcu_barrier() call
is needed in disk_free_zone_resources() to synchronize freeing of zone
write plugs and the destrution of the mempool used to allocate the
plugs. The barrier call does slow down a little teardown of zoned block
devices but should not affect teardown of regular block devices or zoned
block devices that do not use zone write plugging (e.g. zoned DM devices
that do not require zone append emulation).
Modify disk_free_zone_resources() to return early if we do not have a
mempool to start with, that is, if the device does not use zone write
plugging. This avoids the costly rcu_barrier() and speeds up disk
teardown.
Reported-by: Mikulas Patocka <mpatocka@redhat.com>
Fixes: dd291d77cc ("block: Introduce zone write plugging")
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
Link: https://lore.kernel.org/r/20240607002126.104227-1-dlemoal@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Clang static checker (scan-build) warning:
block/sed-opal.c:line 317, column 3
Value stored to 'ret' is never read.
Fix this problem by returning the error code when keyring_search() failed.
Otherwise, 'key' will have a wrong value when 'kerf' stores the error code.
Fixes: 3bfeb61256 ("block: sed-opal: keyring support for SED keys")
Signed-off-by: Su Hui <suhui@nfschina.com>
Link: https://lore.kernel.org/r/20240611073659.429582-1-suhui@nfschina.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
There is a couple of outdated addresses that are still visible
in the Git history, add them to .mailmap.
While at it, replace one in the comment.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When an I2C adapter acts only as a slave, it should not claim to
support I2C master capabilities.
Fixes: 5b6d721b26 ("i2c: designware: enable SLAVE in platform module")
Signed-off-by: Jean Delvare <jdelvare@suse.de>
Cc: Luis Oliveira <lolivei@synopsys.com>
Cc: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Jan Dabros <jsd@semihalf.com>
Cc: Andi Shyti <andi.shyti@kernel.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Tested-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
After recent changes in intel_pstate, global.turbo_disabled is only set
at the initialization time and never changed. However, it turns out
that on some systems the "turbo disabled" bit in MSR_IA32_MISC_ENABLE,
the initial state of which is reflected by global.turbo_disabled, can be
flipped later and there should be a way to take that into account (other
than checking that MSR every time the driver runs which is costly and
useless overhead on the vast majority of systems).
For this purpose, notice that before the changes in question,
store_no_turbo() contained a turbo_is_disabled() check that was used
for updating global.turbo_disabled if the "turbo disabled" bit in
MSR_IA32_MISC_ENABLE had been flipped and that functionality can be
restored. Then, users will be able to reset global.turbo_disabled
by writing 0 to no_turbo which used to work before on systems with
flipping "turbo disabled" bit.
This guarantees the driver state to remain in sync, but READ_ONCE()
annotations need to be added in two places where global.turbo_disabled
is accessed locklessly, so modify the driver to make that happen.
Fixes: 0940f1a801 ("cpufreq: intel_pstate: Do not update global.turbo_disabled after initialization")
Closes: https://lore.kernel.org/linux-pm/bf3ebf1571a4788e97daf861eb493c12d42639a3.camel@xry111.site
Suggested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Reported-by: Xi Ruoyao <xry111@xry111.site>
Tested-by: Xi Ruoyao <xry111@xry111.site>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Based on grepping through the source code this driver appears to be
missing a call to drm_atomic_helper_shutdown() at system shutdown
time. Among other things, this means that if a panel is in use that it
won't be cleanly powered off at system shutdown time.
The fact that we should call drm_atomic_helper_shutdown() in the case
of OS shutdown/restart comes straight out of the kernel doc "driver
instance overview" in drm_drv.c.
This driver users the component model and shutdown happens in the base
driver. The "drvdata" for this driver will always be valid if
shutdown() is called and as of commit 2a07396828
("drm/atomic-helper: drm_atomic_helper_shutdown(NULL) should be a
noop") we don't need to confirm that "drm" is non-NULL.
Suggested-by: Maxime Ripard <mripard@kernel.org>
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Reviewed-by: Fei Shao <fshao@chromium.org>
Tested-by: Fei Shao <fshao@chromium.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Maxime Ripard <mripard@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20240611102744.v2.1.I2b014f90afc4729b6ecc7b5ddd1f6dedcea4625b@changeid
Based on grepping through the source code, this driver appears to be
missing a call to drm_atomic_helper_shutdown() at system shutdown time.
This is important because drm_atomic_helper_shutdown() will cause
panels to get disabled cleanly which may be important for their power
sequencing. Future changes will remove any custom powering off in
individual panel drivers so the DRM drivers need to start getting this
right.
The fact that we should call drm_atomic_helper_shutdown() in the case of
OS shutdown comes straight out of the kernel doc "driver instance
overview" in drm_drv.c.
[geert: shmob_drm_remove() already calls drm_atomic_helper_shutdown]
Suggested-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20230901164111.RFT.15.Iaf638a1d4c8b3c307a6192efabb4cbb06b195f15@changeid
[geert: s/drm_helper_force_disable_all/drm_atomic_helper_shutdown/]
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Reviewed-by: Sui Jingfeng <sui.jingfeng@linux.dev>
Signed-off-by: Maxime Ripard <mripard@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/17c6a5a668e5975f871b77fb1fca6711a0799d9e.1718176895.git.geert+renesas@glider.be
When multiple streams are in use, multiple TDs might be in flight when
an endpoint is stopped. We need to issue a Set TR Dequeue Pointer for
each, to ensure everything is reset properly and the caches cleared.
Change the logic so that any N>1 TDs found active for different streams
are deferred until after the first one is processed, calling
xhci_invalidate_cancelled_tds() again from xhci_handle_cmd_set_deq() to
queue another command until we are done with all of them. Also change
the error/"should never happen" paths to ensure we at least clear any
affected TDs, even if we can't issue a command to clear the hardware
cache, and complain loudly with an xhci_warn() if this ever happens.
This problem case dates back to commit e9df17eb14 ("USB: xhci: Correct
assumptions about number of rings per endpoint.") early on in the XHCI
driver's life, when stream support was first added.
It was then identified but not fixed nor made into a warning in commit
674f8438c1 ("xhci: split handling halted endpoints into two steps"),
which added a FIXME comment for the problem case (without materially
changing the behavior as far as I can tell, though the new logic made
the problem more obvious).
Then later, in commit 94f339147f ("xhci: Fix failure to give back some
cached cancelled URBs."), it was acknowledged again.
[Mathias: commit 94f339147f ("xhci: Fix failure to give back some cached
cancelled URBs.") was a targeted regression fix to the previously mentioned
patch. Users reported issues with usb stuck after unmounting/disconnecting
UAS devices. This rolled back the TD clearing of multiple streams to its
original state.]
Apparently the commit author was aware of the problem (yet still chose
to submit it): It was still mentioned as a FIXME, an xhci_dbg() was
added to log the problem condition, and the remaining issue was mentioned
in the commit description. The choice of making the log type xhci_dbg()
for what is, at this point, a completely unhandled and known broken
condition is puzzling and unfortunate, as it guarantees that no actual
users would see the log in production, thereby making it nigh
undebuggable (indeed, even if you turn on DEBUG, the message doesn't
really hint at there being a problem at all).
It took me *months* of random xHC crashes to finally find a reliable
repro and be able to do a deep dive debug session, which could all have
been avoided had this unhandled, broken condition been actually reported
with a warning, as it should have been as a bug intentionally left in
unfixed (never mind that it shouldn't have been left in at all).
> Another fix to solve clearing the caches of all stream rings with
> cancelled TDs is needed, but not as urgent.
3 years after that statement and 14 years after the original bug was
introduced, I think it's finally time to fix it. And maybe next time
let's not leave bugs unfixed (that are actually worse than the original
bug), and let's actually get people to review kernel commits please.
Fixes xHC crashes and IOMMU faults with UAS devices when handling
errors/faults. Easiest repro is to use `hdparm` to mark an early sector
(e.g. 1024) on a disk as bad, then `cat /dev/sdX > /dev/null` in a loop.
At least in the case of JMicron controllers, the read errors end up
having to cancel two TDs (for two queued requests to different streams)
and the one that didn't get cleared properly ends up faulting the xHC
entirely when it tries to access DMA pages that have since been unmapped,
referred to by the stale TDs. This normally happens quickly (after two
or three loops). After this fix, I left the `cat` in a loop running
overnight and experienced no xHC failures, with all read errors
recovered properly. Repro'd and tested on an Apple M1 Mac Mini
(dwc3 host).
On systems without an IOMMU, this bug would instead silently corrupt
freed memory, making this a security bug (even on systems with IOMMUs
this could silently corrupt memory belonging to other USB devices on the
same controller, so it's still a security bug). Given that the kernel
autoprobes partition tables, I'm pretty sure a malicious USB device
pretending to be a UAS device and reporting an error with the right
timing could deliberately trigger a UAF and write to freed memory, with
no user action.
[Mathias: Commit message and code comment edit, original at:]
https://lore.kernel.org/linux-usb/20240524-xhci-streams-v1-1-6b1f13819bea@marcan.st/
Fixes: e9df17eb14 ("USB: xhci: Correct assumptions about number of rings per endpoint.")
Fixes: 94f339147f ("xhci: Fix failure to give back some cached cancelled URBs.")
Fixes: 674f8438c1 ("xhci: split handling halted endpoints into two steps")
Cc: stable@vger.kernel.org
Cc: security@kernel.org
Reviewed-by: Neal Gompa <neal@gompa.dev>
Signed-off-by: Hector Martin <marcan@marcan.st>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20240611120610.3264502-5-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
As described in commit 8f873c1ff4 ("xhci: Blacklist using streams on the
Etron EJ168 controller"), EJ188 have the same issue as EJ168, where Streams
do not work reliable on EJ188. So apply XHCI_BROKEN_STREAMS quirk to EJ188
as well.
Cc: stable@vger.kernel.org
Signed-off-by: Kuangyi Chiang <ki.chiang65@gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20240611120610.3264502-4-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
As described in commit c877b3b2ad ("xhci: Add reset on resume quirk for
asrock p67 host"), EJ188 have the same issue as EJ168, where completely
dies on resume. So apply XHCI_RESET_ON_RESUME quirk to EJ188 as well.
Cc: stable@vger.kernel.org
Signed-off-by: Kuangyi Chiang <ki.chiang65@gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20240611120610.3264502-3-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The transferred length is set incorrectly for cancelled bulk
transfer TDs in case the bulk transfer ring stops on the last transfer
block with a 'Stop - Length Invalid' completion code.
length essentially ends up being set to the requested length:
urb->actual_length = urb->transfer_buffer_length
Length for 'Stop - Length Invalid' cases should be the sum of all
TRB transfer block lengths up to the one the ring stopped on,
_excluding_ the one stopped on.
Fix this by always summing up TRB lengths for 'Stop - Length Invalid'
bulk cases.
This issue was discovered by Alan Stern while debugging
https://bugzilla.kernel.org/show_bug.cgi?id=218890, but does not
solve that bug. Issue is older than 4.10 kernel but fix won't apply
to those due to major reworks in that area.
Tested-by: Pierre Tomon <pierretom+12@ik.me>
Cc: stable@vger.kernel.org # v4.10+
Cc: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20240611120610.3264502-2-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fix an issue where get_write is not used in smb2_set_ea().
Fixes: 6fc0a265e1 ("ksmbd: fix potential circular locking issue in smb2_set_ea()")
Cc: stable@vger.kernel.org
Reported-by: Wang Zhaolong <wangzhaolong1@huawei.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
If the directory name in the root of the share starts with
character like 镜(0x955c) or Ṝ(0x1e5c), it (and anything inside)
cannot be accessed. The leading slash check must be checked after
converting unicode to nls string.
Cc: stable@vger.kernel.org
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
The current cbs parameter depends on speed after uplinking,
which is not needed and will report a configuration error
if the port is not initially connected. The UAPI exposed by
tc-cbs requires userspace to recalculate the send slope anyway,
because the formula depends on port_transmit_rate (see man tc-cbs),
which is not an invariant from tc's perspective. Therefore, we
use offload->sendslope and offload->idleslope to derive the
original port_transmit_rate from the CBS formula.
Fixes: 1f705bc61a ("net: stmmac: Add support for CBS QDISC")
Signed-off-by: Xiaolei Wang <xiaolei.wang@windriver.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/20240608143524.2065736-1-xiaolei.wang@windriver.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
TSO currently fails when the skb's gso_type field has more than one bit
set.
TSO packets can be passed from userspace using PF_PACKET, TUNTAP and a
few others, using virtio_net_hdr (e.g., PACKET_VNET_HDR). This includes
virtualization, such as QEMU, a real use-case.
The gso_type and gso_size fields as passed from userspace in
virtio_net_hdr are not trusted blindly by the kernel. It adds gso_type
|= SKB_GSO_DODGY to force the packet to enter the software GSO stack
for verification.
This issue might similarly come up when the CWR bit is set in the TCP
header for congestion control, causing the SKB_GSO_TCP_ECN gso_type bit
to be set.
Fixes: a57e5de476 ("gve: DQO: Add TX path")
Signed-off-by: Joshua Washington <joshwash@google.com>
Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com>
Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Suggested-by: Eric Dumazet <edumazet@google.com>
Acked-by: Andrei Vagin <avagin@gmail.com>
v2 - Remove unnecessary comments, remove line break between fixes tag
and signoffs.
v3 - Add back unrelated empty line removal.
Link: https://lore.kernel.org/r/20240610225729.2985343-1-joshwash@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>