io_uring defaults to always doing inline submissions, if at all
possible. But for larger copies, even if the data is fully cached, that
can take a long time. Add an IOSQE_ASYNC flag that the application can
set on the SQE - if set, it'll ensure that we always go async for those
kinds of requests. Use the io-wq IO_WQ_WORK_CONCURRENT flag to ensure we
get the concurrency we desire for this case.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
io-wq assumes that work will complete fast (and not block), so it
doesn't create a new worker when work is enqueued, if we already have
at least one worker running. This is done on the assumption that if work
is running, then it will complete fast.
Add an option to force io-wq to fork a new worker for work queued. This
is signaled by setting IO_WQ_WORK_CONCURRENT on the work item. For that
case, io-wq will create a new worker, even though workers are already
running.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
To implement an async stat, we need to provide the flags mapping and
the statx user copy. Make them available internally, through
fs/internal.h.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We currently fully quiesce the ring before an unregister or update of
the fixed fileset. This is very expensive, and we can be a bit smarter
about this.
Add a percpu refcount for the file tables as a whole. Grab a percpu ref
when we use a registered file, and put it on completion. This is cheap
to do. Upon removal of a file from a set, switch the ref count to atomic
mode. When we hit zero ref on the completion side, then we know we can
drop the previously registered files. When the old files have been
dropped, switch the ref back to percpu mode for normal operation.
Since there's a period between doing the update and the kernel being
done with it, add a IORING_OP_FILES_UPDATE opcode that can perform the
same action. The application knows the update has completed when it gets
the CQE for it. Between doing the update and receiving this completion,
the application must continue to use the unregistered fd if submitting
IO on this particular file.
This takes the runtime of test/file-register from liburing from 14s to
about 0.7s.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This works just like close(2), unsurprisingly. We remove the file
descriptor and post the completion inline, then offload the actual
(potential) last file put to async context.
Mark the async part of this work as uncancellable, as we really must
guarantee that the latter part of the close is run.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Not all work can be cancelled, some of it we may need to guarantee
that it runs to completion. Allow the caller to set IO_WQ_WORK_NO_CANCEL
on work that must not be cancelled. Note that the caller work function
must also check for IO_WQ_WORK_NO_CANCEL on work that is marked
IO_WQ_WORK_CANCEL.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Just one caller of this, and just use filp_close() there manually.
This is important to allow async close/removal of the fd.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This works just like openat(2), except it can be performed async. For
the normal case of a non-blocking path lookup this will complete
inline. If we have to do IO to perform the open, it'll be done from
async context.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
fds field of struct io_uring_files_update is problematic with regards
to compat user space, as pointer size is different in 32-bit, 32-on-64-bit,
and 64-bit user space. In order to avoid custom handling of compat in
the syscall implementation, make fds __u64 and use u64_to_user_ptr in
order to retrieve it. Also, align the field naturally and check that
no garbage is passed there.
Fixes: c3a31e6056 ("io_uring: add support for IORING_REGISTER_FILES_UPDATE")
Signed-off-by: Eugene Syromiatnikov <esyr@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Three fixes for RISC-V:
- Don't free and reuse memory containing the code that CPUs parked at
boot reside in.
- Fix rv64 build problems for ubsan and some modules by adding logical
and arithmetic shift helpers for 128-bit values. These are from
libgcc and are similar to what's present for ARM64.
- Fix vDSO builds to clean up their own temporary files.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEElRDoIDdEz9/svf2Kx4+xDQu9KksFAl4klXMACgkQx4+xDQu9
KkshPhAAiExCAl2JpZxoeuyQmjS7X68au5CWWJa+uB5osAgxJSPk4XJt9QeagOFw
wpKmefQDwPKXQuoD0VmNSdJMioBvgzFLqftoc2D9GAv3A0MB4miSsZkUheVbUifN
jtc3tc3jOCiVNOb0lbX+B6NL+qentvV6CTmujrf79wDBZpGPzKSM2S2OZMvFijVY
B7ijF1bqXBZg6weE8xdefJhBfDmd7vDBKmMJtv1RUbiOgoBGqjM8QtaXVrUz4Px0
NLlTleZL4grZVCepJ4psambm5gcZ8UilAe/ywhhrSFOSNCTYKB3ST+ci9VC4t5Vx
TiR6MnDV1qAl2Uh6aoOhZECpggge9zQOyER7QGbleNplxavhi6jsRgeV9hbqeyBZ
FCanzqO33irRwrtl7lNsPiUv3XWyyGH5yQLxA9wPq/W9dJkO6Z8pl5Fq0kI/oJNj
WtlVIp2EnkXJUmXiMBTHerLoBJAVnu+S2HRIeRqOEUSKT4bFP28kq3T6ctP4QuoT
3F2k9A3DOIZPY8dhXIdFSdcM0IbfRliFy/9hHLt6JdSa35poxGUBpOafDq87n9/3
UtbkLbxJd56EWOSx5l2bLJ/b4FjJX6powfV9cwjrHjrzoiAqc8z0lfQRGlbGjeYb
3XTTXhLIspGMK8acm8/p2I6uwDVGFMveEdsLJ8UWAiYjJkLSFEk=
=SfqL
-----END PGP SIGNATURE-----
Merge tag 'riscv/for-v5.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Paul Walmsley:
"Three fixes for RISC-V:
- Don't free and reuse memory containing the code that CPUs parked at
boot reside in.
- Fix rv64 build problems for ubsan and some modules by adding
logical and arithmetic shift helpers for 128-bit values. These are
from libgcc and are similar to what's present for ARM64.
- Fix vDSO builds to clean up their own temporary files"
* tag 'riscv/for-v5.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: Less inefficient gcc tishift helpers (and export their symbols)
riscv: delete temporary files
riscv: make sure the cores stay looping in .Lsecondary_park
Pull networking fixes from David Miller:
1) Fix non-blocking connect() in x25, from Martin Schiller.
2) Fix spurious decryption errors in kTLS, from Jakub Kicinski.
3) Netfilter use-after-free in mtype_destroy(), from Cong Wang.
4) Limit size of TSO packets properly in lan78xx driver, from Eric
Dumazet.
5) r8152 probe needs an endpoint sanity check, from Johan Hovold.
6) Prevent looping in tcp_bpf_unhash() during sockmap/tls free, from
John Fastabend.
7) hns3 needs short frames padded on transmit, from Yunsheng Lin.
8) Fix netfilter ICMP header corruption, from Eyal Birger.
9) Fix soft lockup when low on memory in hns3, from Yonglong Liu.
10) Fix NTUPLE firmware command failures in bnxt_en, from Michael Chan.
11) Fix memory leak in act_ctinfo, from Eric Dumazet.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (91 commits)
cxgb4: reject overlapped queues in TC-MQPRIO offload
cxgb4: fix Tx multi channel port rate limit
net: sched: act_ctinfo: fix memory leak
bnxt_en: Do not treat DSN (Digital Serial Number) read failure as fatal.
bnxt_en: Fix ipv6 RFS filter matching logic.
bnxt_en: Fix NTUPLE firmware command failures.
net: systemport: Fixed queue mapping in internal ring map
net: dsa: bcm_sf2: Configure IMP port for 2Gb/sec
net: dsa: sja1105: Don't error out on disabled ports with no phy-mode
net: phy: dp83867: Set FORCE_LINK_GOOD to default after reset
net: hns: fix soft lockup when there is not enough memory
net: avoid updating qdisc_xmit_lock_key in netdev_update_lockdep_key()
net/sched: act_ife: initalize ife->metalist earlier
netfilter: nat: fix ICMP header corruption on ICMP errors
net: wan: lapbether.c: Use built-in RCU list checking
netfilter: nf_tables: fix flowtable list del corruption
netfilter: nf_tables: fix memory leak in nf_tables_parse_netdev_hooks()
netfilter: nf_tables: remove WARN and add NLA_STRING upper limits
netfilter: nft_tunnel: ERSPAN_VERSION must not be null
netfilter: nft_tunnel: fix null-attribute check
...
A queue can't belong to multiple traffic classes. So, reject
any such configuration that results in overlapped queues for a
traffic class.
Fixes: b1396c2bd6 ("cxgb4: parse and configure TC-MQPRIO offload")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
T6 can support 2 egress traffic management channels per port to
double the total number of traffic classes that can be configured.
In this configuration, if the class belongs to the other channel,
then all the queues must be bound again explicitly to the new class,
for the rate limit parameters on the other channel to take effect.
So, always explicitly bind all queues to the port rate limit traffic
class, regardless of the traffic management channel that it belongs
to. Also, only bind queues to port rate limit traffic class, if all
the queues don't already belong to an existing different traffic
class.
Fixes: 4ec4762d8e ("cxgb4: add TC-MATCHALL classifier egress offload")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The existing __lshrti3 was really inefficient, and the other two helpers
are also needed to compile some modules.
Add the missing versions, and export all of the symbols like arm64
already does.
This code is based on the assembly generated by libgcc builds.
This fixes a build break triggered by ubsan:
riscv64-unknown-linux-gnu-ld: lib/ubsan.o: in function `.L2':
ubsan.c:(.text.unlikely+0x38): undefined reference to `__ashlti3'
riscv64-unknown-linux-gnu-ld: ubsan.c:(.text.unlikely+0x42): undefined reference to `__ashrti3'
Signed-off-by: Olof Johansson <olof@lixom.net>
[paul.walmsley@sifive.com: use SYM_FUNC_{START,END} instead of
ENTRY/ENDPROC; note libgcc origin]
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
core mst:
- serialize down messages and clear timeslots are on unplug
amdgpu:
- Update golden settings for renoir
- eDP fix
i915:
- uAPI fix: Remove dash and colon from PMU names to comply with tools/perf
- Fix for include file that was indirectly included
- Two fixes to make sure VMA are marked active for error capture
virtio:
- maintain obj reservation lock when submitting cmds
rockchip:
- increase link rate var size to accommodate rates
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJeI3ZMAAoJEAx081l5xIa+iuUQAIIMTL86+QPr3CSu8fU7albJ
MQWOzycWn0xP0KNYIvKz2X6dePWAEDi2kMM+6M/htzi32XmyG0oiAat2wAGkK3ir
AtfOnbKdD6T0i5wwkAn8SvNYHOTHdg7HMKTzPw2rxlTlZ8IEdVxo1CZqHZMFg0PV
UiK2VWVBPmBwdRgLu5K8XJy5LKbpx9BMWW5KdS96L2lKNVV3qeH3Q7NvAh1PtJp4
W+83OsYxYNwS0//4v0i0MjCGVXq5qVw7tzplAPW2OpUCMZ4TtVUnW2/RcrqvU9/b
v18M7thf2uccMXs7E1gyHgvrZp2lw6KsyJEpPnIp4k+2OXM0BVe9l4Pl0uUQhrDH
EUvCkpmP1/Zu44CJhtmVAZlVyy3ShNY5NeF6OsCZDSL7z60YMZHFVKF1IgrBkOJF
mXXq4eb0rMh3At6ABNliRo/bzEYfGgVy1xxNzOoTKCp6pNK/OGy4xQ/DEAkZbBBx
ul6E+NjaHhul1daWqpuv37oxsjE0RI7asFBomC5HTNWVZHY8MN8+9p/TTZQJ8n0l
yxMlPy2OOWVfwclwRSLant3Ny1hXNysMpWXMtrEtcB3Q5GI7H4ZCNh9b+VUPLjXT
JPZU2ZSIjrYNBeKYjVGV75HKjyMWJv1aB3kiXq7lX1yFq7wQLd5nE5mjTSD9wr1I
ZnhFSbHLcR7VMWUldPPi
=Vwa3
-----END PGP SIGNATURE-----
Merge tag 'drm-fixes-2020-01-19' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Back from LCA2020, fixes wasn't too busy last week, seems to have
quieten down appropriately, some amdgpu, i915, then a core mst fix and
one fix for virtio-gpu and one for rockchip:
core mst:
- serialize down messages and clear timeslots are on unplug
amdgpu:
- Update golden settings for renoir
- eDP fix
i915:
- uAPI fix: Remove dash and colon from PMU names to comply with
tools/perf
- Fix for include file that was indirectly included
- Two fixes to make sure VMA are marked active for error capture
virtio:
- maintain obj reservation lock when submitting cmds
rockchip:
- increase link rate var size to accommodate rates"
* tag 'drm-fixes-2020-01-19' of git://anongit.freedesktop.org/drm/drm:
drm/amd/display: Reorder detect_edp_sink_caps before link settings read.
drm/amdgpu: update goldensetting for renoir
drm/dp_mst: Have DP_Tx send one msg at a time
drm/dp_mst: clear time slots for ports invalid
drm/i915/pmu: Do not use colons or dashes in PMU names
drm/rockchip: fix integer type used for storing dp data rate
drm/i915/gt: Mark ring->vma as active while pinned
drm/i915/gt: Mark context->state vma as active while pinned
drm/i915/gt: Skip trying to unbind in restore_ggtt_mappings
drm/i915: Add missing include file <linux/math64.h>
drm/virtio: add missing virtio_gpu_array_lock_resv call
Temporary files used in the VDSO build process linger on even after make
mrproper: vdso-dummy.o.tmp, vdso.so.dbg.tmp.
Delete them once they're no longer needed.
Signed-off-by: Ilie Halip <ilie.halip@gmail.com>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
Pull x86 fixes from Ingo Molnar:
"Misc fixes:
- a resctrl fix for uninitialized objects found by debugobjects
- a resctrl memory leak fix
- fix the unintended re-enabling of the of SME and SEV CPU flags if
memory encryption was disabled at bootup via the MSR space"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/CPU/AMD: Ensure clearing of SME/SEV features is maintained
x86/resctrl: Fix potential memory leak
x86/resctrl: Fix an imbalance in domain_remove_cpu()
Pull timer fixes from Ingo Molnar:
"Three fixes: fix link failure on Alpha, fix a Sparse warning and
annotate/robustify a lockless access in the NOHZ code"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tick/sched: Annotate lockless access to last_jiffies_update
lib/vdso: Make __cvdso_clock_getres() static
time/posix-stubs: Provide compat itimer supoprt for alpha
Pull cpu/SMT fix from Ingo Molnar:
"Fix a build bug on CONFIG_HOTPLUG_SMT=y && !CONFIG_SYSFS kernels"
* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
cpu/SMT: Fix x86 link error without CONFIG_SYSFS
Pull x86 RAS fix from Ingo Molnar:
"Fix a thermal throttling race that can result in easy to trigger boot
crashes on certain Ice Lake platforms"
* 'ras-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mce/therm_throt: Do not access uninitialized therm_work
Pull perf fixes from Ingo Molnar:
"Tooling fixes, three Intel uncore driver fixes, plus an AUX events fix
uncovered by the perf fuzzer"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel/uncore: Remove PCIe3 unit for SNR
perf/x86/intel/uncore: Fix missing marker for snr_uncore_imc_freerunning_events
perf/x86/intel/uncore: Add PCI ID of IMC for Xeon E3 V5 Family
perf: Correctly handle failed perf_get_aux_event()
perf hists: Fix variable name's inconsistency in hists__for_each() macro
perf map: Set kmap->kmaps backpointer for main kernel map chunks
perf report: Fix incorrectly added dimensions as switch perf data file
tools lib traceevent: Fix memory leakage in filter_event
Pull locking fixes from Ingo Molnar:
"Three fixes:
- Fix an rwsem spin-on-owner crash, introduced in v5.4
- Fix a lockdep bug when running out of stack_trace entries,
introduced in v5.4
- Docbook fix"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/rwsem: Fix kernel crash when spinning on RWSEM_OWNER_UNKNOWN
futex: Fix kernel-doc notation warning
locking/lockdep: Fix buffer overrun problem in stack_trace[]
Pull irq fix from Ingo Molnar:
"Fix a recent regression in the Ingenic SoCs irqchip driver that floods
the syslog"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/ingenic: Get rid of the legacy IRQ domain
Pull EFI fixes from Ingo Molnar:
"Three EFI fixes:
- Fix a slow-boot-scrolling regression but making sure we use WC for
EFI earlycon framebuffer mappings on x86
- Fix a mixed EFI mode boot crash
- Disable paging explicitly before entering startup_32() in mixed
mode bootup"
* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/efistub: Disable paging at mixed mode entry
efi/libstub/random: Initialize pointer variables to zero for mixed mode
efi/earlycon: Fix write-combine mapping on x86
Pull rseq fixes from Ingo Molnar:
"Two rseq bugfixes:
- CLONE_VM !CLONE_THREAD didn't work properly, the kernel would end
up corrupting the TLS of the parent. Technically a change in the
ABI but the previous behavior couldn't resonably have been relied
on by applications so this looks like a valid exception to the ABI
rule.
- Make the RSEQ_FLAG_UNREGISTER ABI behavior consistent with the
handling of other flags. This is not thought to impact any
applications either"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
rseq: Unregister rseq for clone CLONE_VM
rseq: Reject unknown flags on rseq unregister
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCXiL/qwAKCRCRxhvAZXjc
oln5AP9ITypHs2iNWl1Cbte++y2iflWevDyPUrmagegqpKwbJAD9EypY0RVDor8T
LXWK4WaNgB0K0MK/gSPRAlgx9ejNwA4=
=6xXo
-----END PGP SIGNATURE-----
Merge tag 'for-linus-2020-01-18' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux
Pull thread fixes from Christian Brauner:
"Here is an urgent fix for ptrace_may_access() permission checking.
Commit 69f594a389 ("ptrace: do not audit capability check when
outputing /proc/pid/stat") introduced the ability to opt out of audit
messages for accesses to various proc files since they are not
violations of policy.
While doing so it switched the check from ns_capable() to
has_ns_capability{_noaudit}(). That means it switched from checking
the subjective credentials (ktask->cred) of the task to using the
objective credentials (ktask->real_cred). This is appears to be wrong.
ptrace_has_cap() is currently only used in ptrace_may_access() And is
used to check whether the calling task (subject) has the
CAP_SYS_PTRACE capability in the provided user namespace to operate on
the target task (object). According to the cred.h comments this means
the subjective credentials of the calling task need to be used.
With this fix we switch ptrace_has_cap() to use security_capable() and
thus back to using the subjective credentials.
As one example where this might be particularly problematic, Jann
pointed out that in combination with the upcoming IORING_OP_OPENAT{2}
feature, this bug might allow unprivileged users to bypass the
capability checks while asynchronously opening files like /proc/*/mem,
because the capability checks for this would be performed against
kernel credentials.
To illustrate on the former point about this being exploitable: When
io_uring creates a new context it records the subjective credentials
of the caller. Later on, when it starts to do work it creates a kernel
thread and registers a callback. The callback runs with kernel creds
for ktask->real_cred and ktask->cred.
To prevent this from becoming a full-blown 0-day io_uring will call
override_cred() and override ktask->cred with the subjective
credentials of the creator of the io_uring instance. With
ptrace_has_cap() currently looking at ktask->real_cred this override
will be ineffective and the caller will be able to open arbitray proc
files as mentioned above.
Luckily, this is currently not exploitable but would be so once
IORING_OP_OPENAT{2} land in v5.6. Let's fix it now.
To minimize potential regressions I successfully ran the criu
testsuite. criu makes heavy use of ptrace() and extensively hits
ptrace_may_access() codepaths and has a good change of detecting any
regressions.
Additionally, I succesfully ran the ptrace and seccomp kernel tests"
* tag 'for-linus-2020-01-18' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
ptrace: reintroduce usage of subjective credentials in ptrace_has_cap()
- Fix printing misleading Secure-IPL enabled message when it is not.
- Fix a race condition between host ap bus and guest ap bus doing
device reset in crypto code.
- Fix sanity check in CCA cipher key function (CCA AES cipher key
support), which fails otherwise.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEE3QHqV+H2a8xAv27vjYWKoQLXFBgFAl4i+UgACgkQjYWKoQLX
FBg0Pgf/VYdOoZCo868B6H7jR3xbPmaMz5wA57QfbPhmUkdF7ELBSDj7j+dvfmEy
kb9Tkn2cZOj40H0jwKWawnz5HA+iKtkXBcB8THB55ThlD6hgpZo8nzul1EomcxDx
NY7vxlaIpgWZgpT91Z1JxKOLpxIo0mIUH8+XITgZBU4qV9yrycpePzV7kz5vEieG
9BFObo0VtNYBnNwm68YuXsg1FKkezWktFAmdLHvxQz6wXUL5kBb/1dZkKGkg/0Qc
jsJSglY2I2uCqDesIRHxl6waWgj79nrb3T5WWYh9IL6btXfAFDTNfx7WaCWU55kd
pCfViqDJYL3idW8AfPkgXNaSclWtaQ==
=q9ev
-----END PGP SIGNATURE-----
Merge tag 's390-5.5-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Vasily Gorbik:
- Fix printing misleading Secure-IPL enabled message when it is not.
- Fix a race condition between host ap bus and guest ap bus doing
device reset in crypto code.
- Fix sanity check in CCA cipher key function (CCA AES cipher key
support), which fails otherwise.
* tag 's390-5.5-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/setup: Fix secure ipl message
s390/zcrypt: move ap device reset from bus to driver code
s390/zcrypt: Fix CCA cipher key gen with clear key value function
Three fixes in drivers with no impact to core code. The mptfusion fix
is enormous because the driver API had to be rethreaded to pass down
the necessary iocp pointer, but once that's done a significant chunk
of code is deleted. The other two patches are small.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXiNL7iYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishd6OAP0dPFlB
Nd8g0PR7OV4L6DpTcR2v6gaYZRoLeq0+p3b7yQD9EL5LiESIciLV2nk8pMoQVEqg
i7Pz5L2RJnLDRoUgIgU=
=YvR2
-----END PGP SIGNATURE-----
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Three fixes in drivers with no impact to core code.
The mptfusion fix is enormous because the driver API had to be
rethreaded to pass down the necessary iocp pointer, but once that's
done a significant chunk of code is deleted.
The other two patches are small"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: mptfusion: Fix double fetch bug in ioctl
scsi: storvsc: Correctly set number of hardware queues for IDE disk
scsi: fnic: fix invalid stack access
Here are some small fixes for 5.5-rc7
Included here are:
- two lkdtm fixes
- coresight build fix
- Documentation update for the hw process document
All of these have been in linux-next with no reported issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXiMRTA8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+yl/wQCfZWeba9TGl++stdgDCjwCI7iqwL4AniHwhzWw
h5G+rq9ZvbDDT9BT4tSi
=MOkg
-----END PGP SIGNATURE-----
Merge tag 'char-misc-5.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc fixes from Greg KH:
"Here are some small fixes for 5.5-rc7
Included here are:
- two lkdtm fixes
- coresight build fix
- Documentation update for the hw process document
All of these have been in linux-next with no reported issues"
* tag 'char-misc-5.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
Documentation/process: Add Amazon contact for embargoed hardware issues
lkdtm/bugs: fix build error in lkdtm_UNSET_SMEP
lkdtm/bugs: Make double-fault test always available
coresight: etm4x: Fix unused function warning
Here are some small staging and iio driver fixes for 5.5-rc7
All of them are for some small reported issues. Nothing major, full
details in the shortlog.
All have been in linux-next with no reported issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXiMR/w8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ymXKQCgv6VlsId1paPQkzVcTvencfBSSHMAn3sSdTus
mTwGZKoNVhHs0AZqVx5H
=x59R
-----END PGP SIGNATURE-----
Merge tag 'staging-5.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging and IIO driver fixes from Greg KH:
"Here are some small staging and iio driver fixes for 5.5-rc7
All of them are for some small reported issues. Nothing major, full
details in the shortlog.
All have been in linux-next with no reported issues"
* tag 'staging-5.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: comedi: ni_routes: allow partial routing information
staging: comedi: ni_routes: fix null dereference in ni_find_route_source()
iio: light: vcnl4000: Fix scale for vcnl4040
iio: buffer: align the size of scan bytes to size of the largest element
iio: chemical: pms7003: fix unmet triggered buffer dependency
iio: imu: st_lsm6dsx: Fix selection of ST_LSM6DS3_ID
iio: adc: ad7124: Fix DT channel configuration
Here are some small USB driver and core fixes for 5.5-rc7
There's one fix for hub wakeup issues and a number of small usb-serial
driver fixes and device id updates.
The hub fix has been in linux-next for a while with no reported issues,
and the usb-serial ones have all passed 0-day with no problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXiMUAw8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ym2EwCfY5TnNW+mY4kC9IJsuh67SHGkL+EAoMw7ppw1
wLMKOMwTGnWCln7p3uCn
=v2hz
-----END PGP SIGNATURE-----
Merge tag 'usb-5.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB driver fixes from Greg KH:
"Here are some small USB driver and core fixes for 5.5-rc7
There's one fix for hub wakeup issues and a number of small usb-serial
driver fixes and device id updates.
The hub fix has been in linux-next for a while with no reported
issues, and the usb-serial ones have all passed 0-day with no
problems"
* tag 'usb-5.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
USB: serial: quatech2: handle unbound ports
USB: serial: keyspan: handle unbound ports
USB: serial: io_edgeport: add missing active-port sanity check
USB: serial: io_edgeport: handle unbound ports on URB completion
USB: serial: ch341: handle unbound port at reset_resume
USB: serial: suppress driver bind attributes
USB: serial: option: add support for Quectel RM500Q in QDL mode
usb: core: hub: Improved device recognition on remote wakeup
USB: serial: opticon: fix control-message timeouts
USB: serial: option: Add support for Quectel RM500Q
USB: serial: simple: Add Motorola Solutions TETRA MTP3xxx and MTP85xx
Now that we have new LOOKUP flags, we should document them in the
relevant path-walking documentation. And now that we've settled on a
common name for nd_jump_link() style symlinks ("magic links"), use that
term where magic-link semantics are described.
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Test all of the various openat2(2) flags. A small stress-test of a
symlink-rename attack is included to show that the protections against
".."-based attacks are sufficient.
The main things these self-tests are enforcing are:
* The struct+usize ABI for openat2(2) and copy_struct_from_user() to
ensure that upgrades will be handled gracefully (in addition,
ensuring that misaligned structures are also handled correctly).
* The -EINVAL checks for openat2(2) are all correctly handled to avoid
userspace passing unknown or conflicting flag sets (most
importantly, ensuring that invalid flag combinations are checked).
* All of the RESOLVE_* semantics (including errno values) are
correctly handled with various combinations of paths and flags.
* RESOLVE_IN_ROOT correctly protects against the symlink rename(2)
attack that has been responsible for several CVEs (and likely will
be responsible for several more).
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
/* Background. */
For a very long time, extending openat(2) with new features has been
incredibly frustrating. This stems from the fact that openat(2) is
possibly the most famous counter-example to the mantra "don't silently
accept garbage from userspace" -- it doesn't check whether unknown flags
are present[1].
This means that (generally) the addition of new flags to openat(2) has
been fraught with backwards-compatibility issues (O_TMPFILE has to be
defined as __O_TMPFILE|O_DIRECTORY|[O_RDWR or O_WRONLY] to ensure old
kernels gave errors, since it's insecure to silently ignore the
flag[2]). All new security-related flags therefore have a tough road to
being added to openat(2).
Userspace also has a hard time figuring out whether a particular flag is
supported on a particular kernel. While it is now possible with
contemporary kernels (thanks to [3]), older kernels will expose unknown
flag bits through fcntl(F_GETFL). Giving a clear -EINVAL during
openat(2) time matches modern syscall designs and is far more
fool-proof.
In addition, the newly-added path resolution restriction LOOKUP flags
(which we would like to expose to user-space) don't feel related to the
pre-existing O_* flag set -- they affect all components of path lookup.
We'd therefore like to add a new flag argument.
Adding a new syscall allows us to finally fix the flag-ignoring problem,
and we can make it extensible enough so that we will hopefully never
need an openat3(2).
/* Syscall Prototype. */
/*
* open_how is an extensible structure (similar in interface to
* clone3(2) or sched_setattr(2)). The size parameter must be set to
* sizeof(struct open_how), to allow for future extensions. All future
* extensions will be appended to open_how, with their zero value
* acting as a no-op default.
*/
struct open_how { /* ... */ };
int openat2(int dfd, const char *pathname,
struct open_how *how, size_t size);
/* Description. */
The initial version of 'struct open_how' contains the following fields:
flags
Used to specify openat(2)-style flags. However, any unknown flag
bits or otherwise incorrect flag combinations (like O_PATH|O_RDWR)
will result in -EINVAL. In addition, this field is 64-bits wide to
allow for more O_ flags than currently permitted with openat(2).
mode
The file mode for O_CREAT or O_TMPFILE.
Must be set to zero if flags does not contain O_CREAT or O_TMPFILE.
resolve
Restrict path resolution (in contrast to O_* flags they affect all
path components). The current set of flags are as follows (at the
moment, all of the RESOLVE_ flags are implemented as just passing
the corresponding LOOKUP_ flag).
RESOLVE_NO_XDEV => LOOKUP_NO_XDEV
RESOLVE_NO_SYMLINKS => LOOKUP_NO_SYMLINKS
RESOLVE_NO_MAGICLINKS => LOOKUP_NO_MAGICLINKS
RESOLVE_BENEATH => LOOKUP_BENEATH
RESOLVE_IN_ROOT => LOOKUP_IN_ROOT
open_how does not contain an embedded size field, because it is of
little benefit (userspace can figure out the kernel open_how size at
runtime fairly easily without it). It also only contains u64s (even
though ->mode arguably should be a u16) to avoid having padding fields
which are never used in the future.
Note that as a result of the new how->flags handling, O_PATH|O_TMPFILE
is no longer permitted for openat(2). As far as I can tell, this has
always been a bug and appears to not be used by userspace (and I've not
seen any problems on my machines by disallowing it). If it turns out
this breaks something, we can special-case it and only permit it for
openat(2) but not openat2(2).
After input from Florian Weimer, the new open_how and flag definitions
are inside a separate header from uapi/linux/fcntl.h, to avoid problems
that glibc has with importing that header.
/* Testing. */
In a follow-up patch there are over 200 selftests which ensure that this
syscall has the correct semantics and will correctly handle several
attack scenarios.
In addition, I've written a userspace library[4] which provides
convenient wrappers around openat2(RESOLVE_IN_ROOT) (this is necessary
because no other syscalls support RESOLVE_IN_ROOT, and thus lots of care
must be taken when using RESOLVE_IN_ROOT'd file descriptors with other
syscalls). During the development of this patch, I've run numerous
verification tests using libpathrs (showing that the API is reasonably
usable by userspace).
/* Future Work. */
Additional RESOLVE_ flags have been suggested during the review period.
These can be easily implemented separately (such as blocking auto-mount
during resolution).
Furthermore, there are some other proposed changes to the openat(2)
interface (the most obvious example is magic-link hardening[5]) which
would be a good opportunity to add a way for userspace to restrict how
O_PATH file descriptors can be re-opened.
Another possible avenue of future work would be some kind of
CHECK_FIELDS[6] flag which causes the kernel to indicate to userspace
which openat2(2) flags and fields are supported by the current kernel
(to avoid userspace having to go through several guesses to figure it
out).
[1]: https://lwn.net/Articles/588444/
[2]: https://lore.kernel.org/lkml/CA+55aFyyxJL1LyXZeBsf2ypriraj5ut1XkNDsunRBqgVjZU_6Q@mail.gmail.com
[3]: commit 629e014bb8 ("fs: completely ignore unknown open flags")
[4]: https://sourceware.org/bugzilla/show_bug.cgi?id=17523
[5]: https://lore.kernel.org/lkml/20190930183316.10190-2-cyphar@cyphar.com/
[6]: https://youtu.be/ggD-eb3yPVs
Suggested-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Michael Chan says:
====================
bnxt_en: Bug fixes.
3 small bug fix patches. The 1st two are aRFS fixes and the last one
fixes a fatal driver load failure on some kernels without PCIe
extended config space support enabled.
Please also queue these for -stable. Thanks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
DSN read can fail, for example on a kdump kernel without PCIe extended
config space support. If DSN read fails, don't set the
BNXT_FLAG_DSN_VALID flag and continue loading. Check the flag
to see if the stored DSN is valid before using it. Only VF reps
creation should fail without valid DSN.
Fixes: 03213a9965 ("bnxt: move bp->switch_id initialization to PF probe")
Reported-by: Marc Smith <msmith626@gmail.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix bnxt_fltr_match() to match ipv6 source and destination addresses.
The function currently only checks ipv4 addresses and will not work
corrently on ipv6 filters.
Fixes: c0c050c58d ("bnxt_en: New Broadcom ethernet driver.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The NTUPLE related firmware commands are sent to the wrong firmware
channel, causing all these commands to fail on new firmware that
supports the new firmware channel. Fix it by excluding the 3
NTUPLE firmware commands from the list for the new firmware channel.
Fixes: 760b6d3341 ("bnxt_en: Add support for 2nd firmware message channel.")
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 69f594a389 ("ptrace: do not audit capability check when outputing /proc/pid/stat")
introduced the ability to opt out of audit messages for accesses to various
proc files since they are not violations of policy. While doing so it
somehow switched the check from ns_capable() to
has_ns_capability{_noaudit}(). That means it switched from checking the
subjective credentials of the task to using the objective credentials. This
is wrong since. ptrace_has_cap() is currently only used in
ptrace_may_access() And is used to check whether the calling task (subject)
has the CAP_SYS_PTRACE capability in the provided user namespace to operate
on the target task (object). According to the cred.h comments this would
mean the subjective credentials of the calling task need to be used.
This switches ptrace_has_cap() to use security_capable(). Because we only
call ptrace_has_cap() in ptrace_may_access() and in there we already have a
stable reference to the calling task's creds under rcu_read_lock() there's
no need to go through another series of dereferences and rcu locking done
in ns_capable{_noaudit}().
As one example where this might be particularly problematic, Jann pointed
out that in combination with the upcoming IORING_OP_OPENAT feature, this
bug might allow unprivileged users to bypass the capability checks while
asynchronously opening files like /proc/*/mem, because the capability
checks for this would be performed against kernel credentials.
To illustrate on the former point about this being exploitable: When
io_uring creates a new context it records the subjective credentials of the
caller. Later on, when it starts to do work it creates a kernel thread and
registers a callback. The callback runs with kernel creds for
ktask->real_cred and ktask->cred. To prevent this from becoming a
full-blown 0-day io_uring will call override_cred() and override
ktask->cred with the subjective credentials of the creator of the io_uring
instance. With ptrace_has_cap() currently looking at ktask->real_cred this
override will be ineffective and the caller will be able to open arbitray
proc files as mentioned above.
Luckily, this is currently not exploitable but will turn into a 0-day once
IORING_OP_OPENAT{2} land in v5.6. Fix it now!
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Eric Paris <eparis@redhat.com>
Cc: stable@vger.kernel.org
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Serge Hallyn <serge@hallyn.com>
Reviewed-by: Jann Horn <jannh@google.com>
Fixes: 69f594a389 ("ptrace: do not audit capability check when outputing /proc/pid/stat")
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
rockchip: increase link rate var size to accommodate rates (Tobias)
mst: serialize down messages and clear timeslots are on unplug (Wayne)
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Tobias Schramm <t.schramm@manjaro.org>
Cc: Wayne Lin <Wayne.Lin@amd.com>
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEHF6rntfJ3enn8gh8cywAJXLcr3kFAl4gjvAACgkQcywAJXLc
r3knlgf8C0szUNBmHkmwuAivHLJz6XHKCtGSfPrZ3M67KDwU563BRXlzHEA3/q7p
L3+RUW9CrohyMWA51fbmNJOR54mpy0zAu2ejLFll0Ntcw5CBbiYQ2ToBMmhW56h9
lM2EDkD0stx/JYmumpU6Vc8o0CG3asiJsU7IfQZ4n2kXqboyHho+E74qri52aPtD
ZZfXLmI9Yju8j0x0AjkZWxC6gfNWXH/HEnsC9X86JF+QXcnw5cVeHvi43doPjvs/
sZQCeaRAhseJXpAPaiBhVemF8+jkwX2CZ0sggxZZP9BB26PEwonqWHj9ixQlYLyD
bVC4Wson59D/0W1QK4R/AADpIO80IQ==
=iVHE
-----END PGP SIGNATURE-----
Merge tag 'drm-misc-fixes-2020-01-16' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
virtio: maintain obj reservation lock when submitting cmds (Gerd)
rockchip: increase link rate var size to accommodate rates (Tobias)
mst: serialize down messages and clear timeslots are on unplug (Wayne)
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Tobias Schramm <t.schramm@manjaro.org>
Cc: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Sean Paul <sean@poorly.run>
Link: https://patchwork.freedesktop.org/patch/msgid/20200116162856.GA11524@art_vandelay