Make sure cached writes are not reordered around open(..., O_TRUNC), with
the obvious wrong results.
Fixes: 4d99ff8f12 ("fuse: Turn writeback cache on")
Cc: <stable@vger.kernel.org> # v3.15+
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
If writeback cache is enabled, then writes might get reordered with
chmod/chown/utimes. The problem with this is that performing the write in
the fuse daemon might itself change some of these attributes. In such case
the following sequence of operations will result in file ending up with the
wrong mode, for example:
int fd = open ("suid", O_WRONLY|O_CREAT|O_EXCL);
write (fd, "1", 1);
fchown (fd, 0, 0);
fchmod (fd, 04755);
close (fd);
This patch fixes this by flushing pending writes before performing
chown/chmod/utimes.
Reported-by: Giuseppe Scrivano <gscrivan@redhat.com>
Tested-by: Giuseppe Scrivano <gscrivan@redhat.com>
Fixes: 4d99ff8f12 ("fuse: Turn writeback cache on")
Cc: <stable@vger.kernel.org> # v3.15+
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
In commit 8020919a9b ("mac80211: Properly handle SKB with radiotap
only"), buffers whose length is too short cause a WARN_ON(1) to be
executed. This change exposed a fault in rtlwifi drivers, which is fixed
by regarding packets with skb->len <= FCS_LEN as though they are in error
and dropping them. The test is now annotated as likely.
Cc: Stable <stable@vger.kernel.org> # v5.0+
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
When converting the wrong qu configurations in an earlier commit, I
accidentally swapped 0x2720 and 0x30DC. Instead of converting 0x2720,
I converted 0x30DC. Undo 0x30DC and convert 0x2720.
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Add a workaround that forces power gating to be enabled on integrated
22000 devices. This improves power saving in certain situations.
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
iwl_mvm_tvqm_enable_txq() can return an error, notably if unable
to allocate memory for the queue. Handle this error throughout,
avoiding storing the invalid value into a u16 which later leads
to a disable of an invalid queue ("queue 65524 not used", where
65524 is just -ENOMEM in a u16).
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
A bunch of the entries for qnj were wrong. The 9460 device doesn't
exist, so update them to 9461 and 9462. There are still a bunch of
other occurrences of 9460, but that will be fixed separately.
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Some entries for PCI ID 0x2720 were using iwl9260_2ac_cfg, but the
correct is to use iwl9260_2ac_cfg_soc. Fix that.
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Nicolas Waisman noticed that even though noa_len is checked for
a compatible length it's still possible to overrun the buffers
of p2pinfo since there's no check on the upper bound of noa_num.
Bound noa_num against P2P_MAX_NOA_NUM.
Reported-by: Nicolas Waisman <nico@semmle.com>
Signed-off-by: Laura Abbott <labbott@redhat.com>
Acked-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Two patches were sent out of order: one removed some conditions from
an if and the other moved the code elsewhere. When sending the patch
that moved the code, an older version of the original code was moved,
causing the "make QnJ exclusive" code to be essentially undone.
Fix that by removing the inclusive conditions from the check again.
Fixes: 809805a820 ("iwlwifi: pcie: move some cfg mangling from trans_pcie_alloc to probe")
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
- fix for alignment faults under high memory pressure
- use u32 for ARM instructions in fault handler
- mark functions that must always be inlined with __always_inline
- fix for nommu XIP
- fix ARMv7M switch to handler mode in reboot path
- fix the recently introduced AMBA reset control error paths
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEuNNh8scc2k/wOAE+9OeQG+StrGQFAl2vMJkACgkQ9OeQG+St
rGTYCg//c999vIr6q+90NHuP3dHJfITyAXxiOrBtHJdqApgENiU4aBjBCjrmV88m
syYTPaNEXO2h4qkdO8kdZMVu4HB2WUSEnP1mq7RX/hW1W0uBOjLFdUlRznYF+T8x
2c2AKOeT+K5iwyMGYlmB6xK9N7OWTadz5wOtga4W1mml0GAkeeAa1Z6x151+q1PE
eir0W5dwdQ+TN+rE2J+nNrQdSngO12EOUWGM2gei2AEcr6ItKhww2eneusdz+012
ylrO6KHBNobace6LCC0XdJ8zF8e682sT/gnLOL+H73y7M0iGcDhBJ1fUMyeyaoDR
RQh58bG0ixl0Gz4nEgTeCgzyLYGyGoIc7NV9RoV6ONxPC1bJ+ETyj0GRlPcUsGli
l3ZrAsGuvSQ8EHqHZ3YVUSQOTiFjH2C7EKFAcCFSscgzcjedyqe4+rUz2nDZoc/U
zG3gkKbYMi9ToWHCzbzaHi06G+YnwFLEH2zRZhUomemjQcgyS5P4LjCY6UcnZh1R
6DzUEzN4Qt4N+nIU52x+klhZxS1m4Qhf/qwXWfsZASH5rkHAa2ivZ1vx88mnG9/f
o7X6HwEDrCE59JWyAeR8gnsyWoOoRjU/jZvXAugMirkv3MamI7ogOTwPQVmFjNbM
g65kncRITclh02L0ABD1zdUiQzKJTeUhYcXIErXNfgo+MdZAhag=
=EBH5
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm
:Pull ARM fixes from Russell King:
- fix for alignment faults under high memory pressure
- use u32 for ARM instructions in fault handler
- mark functions that must always be inlined with __always_inline
- fix for nommu XIP
- fix ARMv7M switch to handler mode in reboot path
- fix the recently introduced AMBA reset control error paths
* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 8926/1: v7m: remove register save to stack before svc
ARM: 8914/1: NOMMU: Fix exc_ret for XIP
ARM: 8908/1: add __always_inline to functions called from __get_user_check()
ARM: mm: alignment: use "u32" for 32-bit instructions
ARM: mm: fix alignment handler faults under memory pressure
drivers/amba: fix reset control error handling
Future pending rework of the ghes_edac instances registration will do
away with the single memory controller per system model and that ugly
hackery there.
This is a minimal fix for stable@, courtesy of James Morse.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAl2v/h4ACgkQEsHwGGHe
VUrBTw//UqWfFXV+82L6u98j4wAduGxJLGHu9q+bFS8G/sqw32/WGIzzsoImD4Ta
Xn3gRlQLc5zQTNNOaWh9jR3VtZsVvykLtTkIPzY/cGb78/kQbfycmIACUVKaBTdw
PUZ9dTev/PAnctotePfcUan78pDmo3LBhI5Y9ZvWdE4CelOP/+QaH8dFa/LNSe7t
jZMuJ1H6fA8wgD/i/eO4dDnvzhykf58oFyZMrdlO6GCwPCYu9CbLHgqWCqfG73IF
R6xIf7aoebnQOGqcwsJuY5n33wgmNyBQcOWVbBKAuv5ffVMobvxZ0dth2pS1Kulx
gQSI7Mt+FqL9xO9mY3GkkhcAZvANFPjgNRARUDwYFpy+Iihg2z24DdTDXTG+51Ps
JyeEZx1HgA7oCkBftphuVd7l/mmocI789j/k4rn1AHyPi8/GyYPV3KatikTCbRKc
BL7dvzYSLg1nESiiwXHuZssWT75IoIgNFvEpwJcao074YNFfaCg03Dev5v440c2Q
CdqHIZbcl7sDlmRAdbcd7gu2HSnr2wqbmDKfSpNY3ks9e3PjzQOfjDZ/vxSbPz5G
1Dqv2vsPGCfeBGxbzTRBTK7+PPowoWkqhPp2OvDi0WUkMWrlO/+ZeNw9JU4ri4S7
4eUXfLlBhD6COJDCT+4edxLcjtG2zvDCZIutNvWvs5IZAXtZpMQ=
=6R9z
-----END PGP SIGNATURE-----
Merge tag 'edac_urgent_for_5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras
Pull EDAC fix from Borislav Petkov:
"Fix ghes_edac UAF case triggered by KASAN and DEBUG_TEST_DRIVER_REMOVE.
Future pending rework of the ghes_edac instances registration will do
away with the single memory controller per system model and that ugly
hackery there.
This is a minimal fix for stable@, courtesy of James Morse"
* tag 'edac_urgent_for_5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
EDAC/ghes: Fix Use after free in ghes_edac remove path
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAl2vBPgACgkQxWXV+ddt
WDvaGQ/+JbV05ML7QjufNFKjzojNPg8dOwvLqF58askMEAqzyqOrzu1YsIOjchqv
eZPP+wxh6j6i8LFnbaMYRhSVMlgpK9XOR3tNStp73QlSox6/6VKu7XR4wFXAoip3
l4XI8UqO5XqDG55UTtjKyo2VNq8vgq1gRUWD6hPtfzDP6WYj4JZuXOd+dT6PbP32
VHd91RoB8Z8qmypLF9Sju6IBWNhKl91TjlRqVdfLywaK8azqaxE1UttPf5DVbE6+
MlTuXhuxkNc4ddzj//oJ1s3ZP/nXtFmIZ75+Sd6P5DfqRNeIfjkKqXvffsjqoYVI
1Wv9sDiezxRB1RZjfInhtqvdvsqcsrXrM7x6BRVqI7IZDUaH5em8IoozQT62ze/4
MJBrRIbVodx2I7EVDMNGkx2TaIDAfnW4Z3UC+YSHLoy6jht1+SA5KLQDR1G/4NeR
dWht5wOXwhp8P3DoaczTUpk0DLtAtygj04fH8CG277EILLCpuWwW8iRkPttkrIM2
HrtRKrKFJNyEpq9vHSFVvpQJkzgzqBFs1UnqnEuYh6qNgWCrS4PJDu9geQ8aPGr2
pA5aEqg5b+jjWzIYDP93PF0u7kF4mFVAozn6xMf95FKmM1OupYYQg5BRe7n/DfDu
yh3Ms0Mmd9+snpNJ9lDr40cobHC5CDvfvO2SRERyBeXxARainN0=
=Ft+G
-----END PGP SIGNATURE-----
Merge tag 'for-5.4-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
- fixes of error handling cleanup of metadata accounting with qgroups
enabled
- fix swapped values for qgroup tracepoints
- fix race when handling full sync flag
- don't start unused worker thread, functionality removed already
* tag 'for-5.4-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
Btrfs: check for the full sync flag while holding the inode lock during fsync
Btrfs: fix qgroup double free after failure to reserve metadata for delalloc
btrfs: tracepoints: Fix bad entry members of qgroup events
btrfs: tracepoints: Fix wrong parameter order for qgroup events
btrfs: qgroup: Always free PREALLOC META reserve in btrfs_delalloc_release_extents()
btrfs: don't needlessly create extent-refs kernel thread
btrfs: block-group: Fix a memory leak due to missing btrfs_put_block_group()
Btrfs: add missing extents release on file extent cluster relocation error
When doing an out of tree build with O=, the nsdeps script constructs
the absolute pathname of the module source file so that it can insert
MODULE_IMPORT_NS statements in the right place. However, ${srctree}
contains an unescaped path to the source tree, which, when used in a sed
substitution, makes sed complain:
++ sed 's/[^ ]* *//home/jeyu/jeyu-linux\/&/g'
sed: -e expression #1, char 12: unknown option to `s'
The sed substitution command 's' ends prematurely with the forward
slashes in the pathname, and sed errors out when it encounters the 'h',
which is an invalid sed substitution option. To avoid escaping forward
slashes ${srctree}, we can use '|' as an alternative delimiter for
sed instead to avoid this error.
Reviewed-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Matthias Maennich <maennich@google.com>
Tested-by: Matthias Maennich <maennich@google.com>
Signed-off-by: Jessica Yu <jeyu@kernel.org>
Pull operating performance points (OPP) framework fixes for v5.4
from Viresh Kumar:
"This contains:
- Patch to revert addition of regulator enable/disable in OPP core
(Marek).
- Remove incorrect lockdep assert (Viresh).
- Fix a kref counting issue (Viresh)."
* 'opp/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm:
opp: Reinitialize the list_kref before adding the static OPPs again
opp: core: Revert "add regulators enable and disable"
opp: of: drop incorrect lockdep_assert_held()
Fixes gcc '-Wunused-but-set-variable' warning:
fs/fuse/virtio_fs.c: In function virtio_fs_wake_pending_and_unlock:
fs/fuse/virtio_fs.c:983:20: warning: variable fc set but not used [-Wunused-but-set-variable]
It is not used since commit 7ee1e2e631 ("virtiofs: No need to check
fpq->connected state")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Users reported a v5.3 performance regression and inability to establish
huge page mappings. A revised version of the ndctl "dax.sh" huge page
unit test identifies commit 23c84eb783 "dax: Fix missed wakeup with
PMD faults" as the source.
Update get_unlocked_entry() to check for NULL entries before checking
the entry order, otherwise NULL is misinterpreted as a present pte
conflict. The 'order' check needs to happen before the locked check as
an unlocked entry at the wrong order must fallback to lookup the correct
order.
Reported-by: Jeff Smits <jeff.smits@intel.com>
Reported-by: Doug Nelson <doug.nelson@intel.com>
Cc: <stable@vger.kernel.org>
Fixes: 23c84eb783 ("dax: Fix missed wakeup with PMD faults")
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Link: https://lore.kernel.org/r/157167532455.3945484.11971474077040503994.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
The list_kref reaches a count of 0 when all the static OPPs are removed,
for example when dev_pm_opp_of_cpumask_remove_table() is called, though
the actual OPP table may not get freed as it may still be referenced by
other parts of the kernel, like from a call to
dev_pm_opp_set_supported_hw(). And if we call
dev_pm_opp_of_cpumask_add_table() again at this point, we must
reinitialize the list_kref otherwise the kernel will hit a WARN() in
kref infrastructure for incrementing a kref with value 0.
Fixes: 11e1a16482 ("opp: Don't decrement uninitialized list_kref")
Reported-by: Dmitry Osipenko <digetx@gmail.com>
Tested-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
There is one more problematic case I noticed while recently fixing BPF kallsyms
handling in cd7455f101 ("bpf: Fix use after free in subprog's jited symbol
removal") and that is bpf_get_prog_name().
If BTF has been attached to the prog, then we may be able to fetch the function
signature type id in kallsyms through prog->aux->func_info[prog->aux->func_idx].type_id.
However, while the BTF object itself is torn down via RCU callback, the prog's
aux->func_info is immediately freed via kvfree(prog->aux->func_info) once the
prog's refcount either hit zero or when subprograms were already exposed via
kallsyms and we hit the error path added in 5482e9a93c ("bpf: Fix memleak in
aux->func_info and aux->btf").
This violates RCU as well since kallsyms could be walked in parallel where we
could access aux->func_info. Hence, defer kvfree() to after RCU grace period.
Looking at ba64e7d852 ("bpf: btf: support proper non-jit func info") there
is no reason/dependency where we couldn't defer the kvfree(aux->func_info) into
the RCU callback.
Fixes: 5482e9a93c ("bpf: Fix memleak in aux->func_info and aux->btf")
Fixes: ba64e7d852 ("bpf: btf: support proper non-jit func info")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/875f2906a7c1a0691f2d567b4d8e4ea2739b1e88.1571779205.git.daniel@iogearbox.net
Include <net/addrconf.h> for the missing declarations of
various functions. Fixes the following sparse warnings:
net/ipv6/addrconf_core.c:94:5: warning: symbol 'register_inet6addr_notifier' was not declared. Should it be static?
net/ipv6/addrconf_core.c💯5: warning: symbol 'unregister_inet6addr_notifier' was not declared. Should it be static?
net/ipv6/addrconf_core.c:106:5: warning: symbol 'inet6addr_notifier_call_chain' was not declared. Should it be static?
net/ipv6/addrconf_core.c:112:5: warning: symbol 'register_inet6addr_validator_notifier' was not declared. Should it be static?
net/ipv6/addrconf_core.c:118:5: warning: symbol 'unregister_inet6addr_validator_notifier' was not declared. Should it be static?
net/ipv6/addrconf_core.c:125:5: warning: symbol 'inet6addr_validator_notifier_call_chain' was not declared. Should it be static?
net/ipv6/addrconf_core.c:237:6: warning: symbol 'in6_dev_finish_destroy' was not declared. Should it be static?
Signed-off-by: Ben Dooks (Codethink) <ben.dooks@codethink.co.uk>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Kernel test robot reported that the l2tp.sh test script failed:
# selftests: net: l2tp.sh
# Warning: file l2tp.sh is not executable, correct this.
Set executable bits.
Fixes: e858ef1cd4 ("selftests: Add l2tp tests")
Reported-by: kernel test robot <rong.a.chen@intel.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
We get one warnings when build kernel W=1:
net/sched/sch_taprio.c:1155:6: warning: no previous prototype for ‘taprio_offload_config_changed’ [-Wmissing-prototypes]
Make the function static to fix this.
Fixes: 9c66d15646 ("taprio: Add support for hardware offloading")
Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Michael Chan says:
====================
Devlink and error recovery bug fix patches.
Most of the work is by Vasundhara Volam.
====================
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
With the recently added error recovery logic, the device may already
be disabled if the firmware recovery is unsuccessful. In
bnxt_remove_one(), check that the device is still enabled first
before calling pci_disable_device().
Fixes: 3bc7d4a352 ("bnxt_en: Add BNXT_STATE_IN_FW_RESET state.")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
When firmware indicates that driver needs to invoke firmware reset
which is common for both error recovery and live firmware reset path,
driver needs a different time to wait before polling for firmware
readiness.
Modify the wait time to fw_reset_min_dsecs, which is initialised to
correct timeout for error recovery and firmware reset.
Fixes: 4037eb7156 ("bnxt_en: Add a new BNXT_FW_RESET_STATE_POLL_FW_DOWN state.")
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
The current code does not do endian swapping between the devlink
parameter and the internal NVRAM representation. Define a union to
represent the little endian NVRAM data and add 2 helper functions to
copy to and from the NVRAM data with the proper byte swapping.
Fixes: 782a624d00 ("bnxt_en: Add bnxt_en initial port params table and register it")
Cc: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
The current code that rounds up the NVRAM parameter bit size to the next
byte size for the devlink parameter is not always correct. The MSIX
devlink parameters are 4 bytes and we don't get the correct size
using this method.
Fix it by adding a new dl_num_bytes member to the bnxt_dl_nvm_param
structure which statically provides bytesize information according
to the devlink parameter type definition.
Fixes: 782a624d00 ("bnxt_en: Add bnxt_en initial port params table and register it")
Cc: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
When the address width of DMA is greater than 32, the packet header occupies
a BD descriptor. The starting address of the data should be added to the
header length.
Fixes: a993db88d1 ("net: stmmac: Enable support for > 32 Bits addressing in XGMAC")
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Cc: Jose Abreu <joabreu@synopsys.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Signed-off-by: yuqi jin <jinyuqi@huawei.com>
Signed-off-by: Shaokun Zhang <zhangshaokun@hisilicon.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
The ionic driver started using dymamic_hex_dump(), but
that is not always defined:
drivers/net/ethernet/pensando/ionic/ionic_main.c:229:2: error: implicit declaration of function 'dynamic_hex_dump' [-Werror,-Wimplicit-function-declaration]
Add a dummy implementation to use when CONFIG_DYNAMIC_DEBUG
is disabled, printing nothing.
Fixes: 938962d552 ("ionic: Add adminq action")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
syzkaller managed to trigger the following crash:
[...]
BUG: unable to handle page fault for address: ffffc90001923030
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD aa551067 P4D aa551067 PUD aa552067 PMD a572b067 PTE 80000000a1173163
Oops: 0000 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 7982 Comm: syz-executor912 Not tainted 5.4.0-rc3+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:bpf_jit_binary_hdr include/linux/filter.h:787 [inline]
RIP: 0010:bpf_get_prog_addr_region kernel/bpf/core.c:531 [inline]
RIP: 0010:bpf_tree_comp kernel/bpf/core.c:600 [inline]
RIP: 0010:__lt_find include/linux/rbtree_latch.h:115 [inline]
RIP: 0010:latch_tree_find include/linux/rbtree_latch.h:208 [inline]
RIP: 0010:bpf_prog_kallsyms_find kernel/bpf/core.c:674 [inline]
RIP: 0010:is_bpf_text_address+0x184/0x3b0 kernel/bpf/core.c:709
[...]
Call Trace:
kernel_text_address kernel/extable.c:147 [inline]
__kernel_text_address+0x9a/0x110 kernel/extable.c:102
unwind_get_return_address+0x4c/0x90 arch/x86/kernel/unwind_frame.c:19
arch_stack_walk+0x98/0xe0 arch/x86/kernel/stacktrace.c:26
stack_trace_save+0xb6/0x150 kernel/stacktrace.c:123
save_stack mm/kasan/common.c:69 [inline]
set_track mm/kasan/common.c:77 [inline]
__kasan_kmalloc+0x11c/0x1b0 mm/kasan/common.c:510
kasan_slab_alloc+0xf/0x20 mm/kasan/common.c:518
slab_post_alloc_hook mm/slab.h:584 [inline]
slab_alloc mm/slab.c:3319 [inline]
kmem_cache_alloc+0x1f5/0x2e0 mm/slab.c:3483
getname_flags+0xba/0x640 fs/namei.c:138
getname+0x19/0x20 fs/namei.c:209
do_sys_open+0x261/0x560 fs/open.c:1091
__do_sys_open fs/open.c:1115 [inline]
__se_sys_open fs/open.c:1110 [inline]
__x64_sys_open+0x87/0x90 fs/open.c:1110
do_syscall_64+0xf7/0x1c0 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
[...]
After further debugging it turns out that we walk kallsyms while in parallel
we tear down a BPF program which contains subprograms that have been JITed
though the program itself has not been fully exposed and is eventually bailing
out with error.
The bpf_prog_kallsyms_del_subprogs() in bpf_prog_load()'s error path removes
the symbols, however, bpf_prog_free() tears down the JIT memory too early via
scheduled work. Instead, it needs to properly respect RCU grace period as the
kallsyms walk for BPF is under RCU.
Fix it by refactoring __bpf_prog_put()'s tear down and reuse it in our error
path where we defer final destruction when we have subprogs in the program.
Fixes: 7d1982b4e3 ("bpf: fix panic in prog load calls cleanup")
Fixes: 1c2a088a66 ("bpf: x64: add JIT support for multi-function programs")
Reported-by: syzbot+710043c5d1d5b5013bc7@syzkaller.appspotmail.com
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: syzbot+710043c5d1d5b5013bc7@syzkaller.appspotmail.com
Link: https://lore.kernel.org/bpf/55f6367324c2d7e9583fa9ccf5385dcbba0d7a6e.1571752452.git.daniel@iogearbox.net
The issue is in drivers/infiniband/core/uverbs_std_types_cq.c in the
UVERBS_HANDLER(UVERBS_METHOD_CQ_CREATE) function. We check that:
if (attr.comp_vector >= attrs->ufile->device->num_comp_vectors) {
But we don't check if "attr.comp_vector" is negative. It could
potentially lead to an array underflow. My concern would be where
cq->vector is used in the create_cq() function from the cxgb4 driver.
And really "attr.comp_vector" is appears as a u32 to user space so that's
the right type to use.
Fixes: 9ee79fce36 ("IB/core: Add completion queue (cq) object actions")
Link: https://lore.kernel.org/r/20191011133419.GA22905@mwanda
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
If the "virtualize APIC accesses" VM-execution control is set in the
VMCS, the APIC virtualization hardware is triggered when a page walk
in VMX non-root mode terminates at a PTE wherein the address of the 4k
page frame matches the APIC-access address specified in the VMCS. On
hardware, the APIC-access address may be any valid 4k-aligned physical
address.
KVM's nVMX implementation enforces the additional constraint that the
APIC-access address specified in the vmcs12 must be backed by
a "struct page" in L1. If not, L0 will simply clear the "virtualize
APIC accesses" VM-execution control in the vmcs02.
The problem with this approach is that the L1 guest has arranged the
vmcs12 EPT tables--or shadow page tables, if the "enable EPT"
VM-execution control is clear in the vmcs12--so that the L2 guest
physical address(es)--or L2 guest linear address(es)--that reference
the L2 APIC map to the APIC-access address specified in the
vmcs12. Without the "virtualize APIC accesses" VM-execution control in
the vmcs02, the APIC accesses in the L2 guest will directly access the
APIC-access page in L1.
When there is no mapping whatsoever for the APIC-access address in L1,
the L2 VM just loses the intended APIC virtualization. However, when
the APIC-access address is mapped to an MMIO region in L1, the L2
guest gets direct access to the L1 MMIO device. For example, if the
APIC-access address specified in the vmcs12 is 0xfee00000, then L2
gets direct access to L1's APIC.
Since this vmcs12 configuration is something that KVM cannot
faithfully emulate, the appropriate response is to exit to userspace
with KVM_INTERNAL_ERROR_EMULATION.
Fixes: fe3ef05c75 ("KVM: nVMX: Prepare vmcs02 from vmcs01 and vmcs12")
Reported-by: Dan Cross <dcross@google.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
8-letter strings representing ARC perf events are stores in two
32-bit registers as ASCII characters like that: "IJMP", "IALL", "IJMPTAK" etc.
And the same order of bytes in the word is used regardless CPU endianness.
Which means in case of big-endian CPU core we need to swap bytes to get
the same order as if it was on little-endian CPU.
Otherwise we're seeing the following error message on boot:
------------------------->8----------------------
ARC perf : 8 counters (32 bits), 40 conditions, [overflow IRQ support]
sysfs: cannot create duplicate filename '/devices/arc_pct/events/pmji'
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.2.18 #3
Stack Trace:
arc_unwind_core+0xd4/0xfc
dump_stack+0x64/0x80
sysfs_warn_dup+0x46/0x58
sysfs_add_file_mode_ns+0xb2/0x168
create_files+0x70/0x2a0
------------[ cut here ]------------
WARNING: CPU: 0 PID: 1 at kernel/events/core.c:12144 perf_event_sysfs_init+0x70/0xa0
Failed to register pmu: arc_pct, reason -17
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.2.18 #3
Stack Trace:
arc_unwind_core+0xd4/0xfc
dump_stack+0x64/0x80
__warn+0x9c/0xd4
warn_slowpath_fmt+0x22/0x2c
perf_event_sysfs_init+0x70/0xa0
---[ end trace a75fb9a9837bd1ec ]---
------------------------->8----------------------
What happens here we're trying to register more than one raw perf event
with the same name "PMJI". Why? Because ARC perf events are 4 to 8 letters
and encoded into two 32-bit words. In this particular case we deal with 2
events:
* "IJMP____" which counts all jump & branch instructions
* "IJMPC___" which counts only conditional jumps & branches
Those strings are split in two 32-bit words this way "IJMP" + "____" &
"IJMP" + "C___" correspondingly. Now if we read them swapped due to CPU core
being big-endian then we read "PMJI" + "____" & "PMJI" + "___C".
And since we interpret read array of ASCII letters as a null-terminated string
on big-endian CPU we end up with 2 events of the same name "PMJI".
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: stable@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Guest physical APIC ID may not equal to vcpu->vcpu_id in some case.
We may set the wrong physical id in avic_handle_ldr_update as we
always use vcpu->vcpu_id. Get physical APIC ID from vAPIC page
instead.
Export and use kvm_xapic_id here and in avic_handle_apic_id_update
as suggested by Vitaly.
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Scheduled policy update work may end up racing with the freeing of the
policy and unregistering the driver.
One possible race is as below, where the cpufreq_driver is unregistered,
but the scheduled work gets executed at later stage when, cpufreq_driver
is NULL (i.e. after freeing the policy and driver).
Unable to handle kernel NULL pointer dereference at virtual address 0000001c
pgd = (ptrval)
[0000001c] *pgd=80000080204003, *pmd=00000000
Internal error: Oops: 206 [#1] SMP THUMB2
Modules linked in:
CPU: 0 PID: 34 Comm: kworker/0:1 Not tainted 5.4.0-rc3-00006-g67f5a8081a4b #86
Hardware name: ARM-Versatile Express
Workqueue: events handle_update
PC is at cpufreq_set_policy+0x58/0x228
LR is at dev_pm_qos_read_value+0x77/0xac
Control: 70c5387d Table: 80203000 DAC: fffffffd
Process kworker/0:1 (pid: 34, stack limit = 0x(ptrval))
(cpufreq_set_policy) from (refresh_frequency_limits.part.24+0x37/0x48)
(refresh_frequency_limits.part.24) from (handle_update+0x2f/0x38)
(handle_update) from (process_one_work+0x16d/0x3cc)
(process_one_work) from (worker_thread+0xff/0x414)
(worker_thread) from (kthread+0xff/0x100)
(kthread) from (ret_from_fork+0x11/0x28)
Fixes: 67d874c3b2 ("cpufreq: Register notifiers with the PM QoS framework")
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
[ rjw: Cancel the work before dropping the QoS requests ]
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit "bpf: Process in-kernel BTF" in linux-next introduced an undefined
__weak symbol, which results in an R_390_GLOB_DAT relocation type. That
is not yet handled by the KASLR relocation code, and the kernel stops with
the message "Unknown relocation type".
Add code to detect and handle R_390_GLOB_DAT relocation types and undefined
symbols.
Fixes: 805bc0bc23 ("s390/kernel: build a relocatable kernel")
Cc: <stable@vger.kernel.org> # v5.2+
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
If a process is interrupted while accessing the crypto device and the
global ap_perms_mutex is contented, release() could return early and
fail to free related resources.
Fixes: 00fab2350e ("s390/zcrypt: multiple zcrypt device nodes support")
Cc: <stable@vger.kernel.org> # 4.19
Cc: Harald Freudenberger <freude@linux.ibm.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
The recently introduced USB-audio descriptor validator had a stupid
copy&paste error that may lead to an unexpected overlook of too short
descriptors for processing and extension units. It's likely the cause
of the report triggered by syzkaller fuzzer. Let's fix it.
Fixes: 57f8770620 ("ALSA: usb-audio: More validations of descriptor units")
Reported-by: syzbot+0620f79a1978b1133fd7@syzkaller.appspotmail.com
Link: https://lore.kernel.org/r/s5hsgnkdbsl.wl-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Commit:
8a58ddae23 ("perf/core: Fix exclusive events' grouping")
allows CAP_EXCLUSIVE events to be grouped with other events. Since all
of those also happen to be AUX events (which is not the case the other
way around, because arch/s390), this changes the rules for stopping the
output: the AUX event may not be on its PMU's context any more, if it's
grouped with a HW event, in which case it will be on that HW event's
context instead. If that's the case, munmap() of the AUX buffer can't
find and stop the AUX event, potentially leaving the last reference with
the atomic context, which will then end up freeing the AUX buffer. This
will then trip warnings:
Fix this by using the context's PMU context when looking for events
to stop, instead of the event's PMU context.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20191022073940.61814-1-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
- Fix a bug in the XIVE code which can cause a host crash.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEv0VLfXa2m9eKuaRpnZrqdyxjcZ8FAl2pKMUACgkQnZrqdyxj
cZ8ACwgA2M28tdhcK6XA/TBTnGN77oGdQiMdUt4nFm50fXNCvjeRUSGnHYATuj4s
kS0DfAUVoBI/gvbMYeQEW1MJ6t2n8zYjT2DANx/2Vy3B7vXzEIOcvd9xlP0LEIxD
yOJBuZWWxb+bqj3l2MBEdAg6gXHpu71G1IymcsIG0UIdfAFEOAuOcQrrnVdpesuC
I1d0iY0tPzvED9FzyBY94ECm7VLp8khsoz5aFUpatAyIyzK4Met4gR0B89UoSoE/
BCMmiC/vGSKvjEmtgq0rkc2qL2RH1x7MgGvEwJMLwvnls7m58CgqXlMQ8YsfTZuN
Y88SY5Nqnl1pybmApZ6vgijjgszQRw==
=uUTG
-----END PGP SIGNATURE-----
Merge tag 'kvm-ppc-fixes-5.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into HEAD
PPC KVM fix for 5.4
- Fix a bug in the XIVE code which can cause a host crash.
Special PMU edition:
- Fix cycle counter truncation
- Fix cycle counter overflow limit on pure 64bit system
- Allow chained events to be actually functional
- Correct sample period after overflow
-----BEGIN PGP SIGNATURE-----
iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAl2sMDwPHG1hekBrZXJu
ZWwub3JnAAoJECPQ0LrRPXpDyWEP/iKeWKFPoFIV2o4buIBSLlNOwPDzEF8pEABx
Wq5dw3cPEQFx5/n5vABLvUC0SoU6rhEWXseNNlOoo1r0pQzS0GpZ5B6BCuuMtk9X
DSgBc3YqrRPFVdMSCUtTSiM2en9fuLPSalh819KWqWkeMQg+meRtvjkzoXMh3gYt
KBeeaJHuwHMNlqjKSKdq4XtdQQUBzN+MbtIGTQ83hYbkvep5Z3AVuvS4CapcpeJE
OVByj0qcyHY4MG+jcTWPYepRZhAQQj8Joj3Z6hEc0ZVpw11GwqG3PcIryxAlhJp3
ON5teMeV1PiumR1fA90A6Q3M3tSoyR+5oHjS2Y7Y/W5ao6BBrytBDNtPGLYFQkXh
DKhyIHxFTNPaziSn1jGuvmZUmK9iDD8qowNCHFspAwoqqajjmb5YyiS/FQvfq+Ga
Zm5JA+f7jheGJq3zmV8oVdLoLt1ldsJb5iWDFZ/oGxLBZbITKAk5diZx+Jvr7Sgm
CyC8uoEiaoiQdabUwWymrGfrU1JKjLyKejtp/q4lZGG3e5y3jUn1F7qh7Q+N9eSX
l2cPPcH2iAcMZdFwBedUNll3JZHm3aSVg03Ub6GoYppzxc+phmr7p+Lzyxtm9dYd
JUF49yDySaiWkWoMG0sMBVSDml8JyEEEAJ1ypwQdGxlizy5/WFy41a0sxjMnCHjP
ljAsx/3n
=ORrS
-----END PGP SIGNATURE-----
Merge tag 'kvmarm-fixes-5.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm fixes for 5.4, take #2
Special PMU edition:
- Fix cycle counter truncation
- Fix cycle counter overflow limit on pure 64bit system
- Allow chained events to be actually functional
- Correct sample period after overflow
After resetting the vCPU, the kvmclock MSR keeps the previous value but it is
not enabled. This can be confusing, so fix it.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>