With mixed per-thread and (system-wide) per-cpu maps, the "any cpu" value
-1 must be skipped when setting CPU mask bits.
Prior to commit cbd7bfc7fd ("tools/perf: Fix out of bound access
to cpu mask array") the invalid setting went unnoticed, but since then
it causes perf record to fail with an error.
Example:
Before:
$ perf record -e intel_pt// --per-thread uname
Failed to initialize parallel data streaming masks
After:
$ perf record -e intel_pt// --per-thread uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.068 MB perf.data ]
Fixes: ae4f8ae16a ("libperf evlist: Allow mixing per-thread and per-cpu mmaps")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220915122612.81738-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It uses PERF_EVENT_IOC_MODIFY_ATTRIBUTES ioctl. The kernel would return
ENOTTY if it's not supported. Update the skip reason in that case.
Committer notes:
On s/390 the args aren't used, so need to be marked __maybe_unused.
Reviewed-by: Ravi Bangoria <ravi.bangoria@amd.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20220914183338.546357-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
- Various fixes for build warnings
- Fix default kernel command line
-----BEGIN PGP SIGNATURE-----
iQJKBAABCAA0FiEEdgfidid8lnn52cLTZvlZhesYu8EFAmMrPKIWHHJpY2hhcmRA
c2lnbWEtc3Rhci5hdAAKCRBm+VmF6xi7wcRjEADAOOc0pJmVvydnuNpMMF0X2Zxf
UbIMLsZXx/yvabS5U9JsnmfTE2nWn/xTW/z/NEN9CyoZ4GGVbTts44y9EbqQekyv
d/otx6+2Cnp5ejocZgqzkWDpjZ7WRbFRPtrY3OQS50WdCH/fMw1sMo4U6OucHyGv
8hUwrn+UnZouQXpuuSMKZ1kf8LLEdhyIGBmTfZfA0IleIJVWWzWnv1XxnyVMRT2s
q6eryXJ5Ik8md3Rh71uNutAdJ7LEkpmxyOke6agUF5oXeFKf5io1pFGxBdIEGGQ6
9uRxZHG3rNxRH60s4DE0AAh3sp3SbkobHHpT0pX7WPURiBQMq4YD0qqLBkRzw68H
dDQS943C6RqFl55a+MdMcu0V9hdi6Sg8bQOSae3AMRUeIdXjBMHAMN3eG9zK0uSX
xCyweiA/uLm6+bnHEeh+o+jR3YoQ20ykBxhWRGf/9tq6NYA6L+FZKK9n5dfJzOlJ
oC4M5LXij8y+D0HEgWNn5X3IWlkleg7ngZqUWDzKP8D0J/Pfo0XQbAxsyqFxC+H+
tbSTq0MfWwnD+yTBDAzM6+XI4eby8nqMPHxK2Fxa0F7KlZ4P9J1l1mquOl2ly1rQ
2nTm2ly3makjGlc1Xg3jEkUpk2JrFMxNENXchhqRZKbfpuYsBdqupw/chc/Tx8d+
b1z7CgR+mal5n+hutA==
=iNUZ
-----END PGP SIGNATURE-----
Merge tag 'for-linus-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux
Pull UML fixes from Richard Weinberger:
- Various fixes for build warnings
- Fix default kernel command line
* tag 'for-linus-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux:
arch: um: Mark the stack non-executable to fix a binutils warning
um: Prevent KASAN splats in dump_stack()
um: fix default console kernel parameter
um: Cleanup compiler warning in arch/x86/um/tls_32.c
um: Cleanup syscall_handler_t cast in syscalls_32.h
Including:
- Two fixes for Intel VT-d:
- Check the right capability bit for 5-level page table support.
- Revert a previous fix which caused a regression with Thunderbolt
devices.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmMrAH4ACgkQK/BELZcB
GuODfQ//RlhWICLj1a1RhBoZ5aUpWsuCrPd0GAdi7scgK7qVeBqcN/ZWfrlrtekD
Ogxos+cFi/L5V2JhYRKP9/qfh6U/29ZAK85NNErcYko6WHmSsjOI++pyMrzng9cX
RMbePTgoElTRn/3kGC8AS0tTvLAMKo4sznYPZw9AygCaBc/YkwGeK0qDxTPKhzfd
GpVZxarKZvr5flQ0YgQE9DLtVdIfzSl5FXR2gIT1E8+6KRCX8XeTsj9C821qRxpo
rmf6Xmbzc+oLsJ57/qiOgorrBiu5PRWz+eDIhbPUQ7B9tl+HT96VP+45i3qTC0ig
au4M9NQQRo/hCq3gx236oO0Nqt9c5FLbpDG2+SoroYsCjNSdOpNUhtqylVSJOi8i
sCeiLZop0RckLyrst0kny7SxJNRtNEw4XDGhZCfHBnv/vxCFUOWgqI6rOlOH4j/f
KCs2jYPOghp+GZv7yVHLq/cjkhNgPW9EQeqpYrX/y29kxgmOQM4rpZlra70qQAlH
NbJBCtlOLTOniGAJgAAMORFIMWtttfhV0cucu8vQKXKnhzGn+4hpZkaNHyI3vl3S
upJ7i0t6TyghyeuUhWpF21KHc83d0Nd/fWVFstngJ4XhfNJFhqddOHaYsABK7Czk
QezpvKaNFNxAPe08CbMp8bjVEQdPhJ7oRvcLjSUSVtLxOrVebcw=
=sUtt
-----END PGP SIGNATURE-----
Merge tag 'iommu-fixes-v6.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu fixes from Joerg Roedel:
"Two fixes for Intel VT-d:
- Check the right capability bit for 5-level page table support.
- Revert a previous fix which caused a regression with Thunderbolt
devices"
* tag 'iommu-fixes-v6.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/vt-d: Check correct capability for sagaw determination
Revert "iommu/vt-d: Fix possible recursive locking in intel_iommu_init()"
A bit more changes than wished, but still manageable ammount.
Most of commits are HD-audio specific device fixes / quirks, while
there is a revert for the previous fix due to regressions and a
double-free fix in ALSA core code.
-----BEGIN PGP SIGNATURE-----
iQJCBAABCAAsFiEEIXTw5fNLNI7mMiVaLtJE4w1nLE8FAmMqv3MOHHRpd2FpQHN1
c2UuZGUACgkQLtJE4w1nLE/mQRAAnwAfR4D3RZsDO35j0ggpnF9XXE6S8uUKqmFp
1ge6qYrXgNxm26za6MTWrK5/kSsl9SbFXNjWO5Gzq2mjnNTJw/WEucyGCY/szHG9
M+X57gmrvj8ctpZQKmehpdoaUcT2Yc8oBZjYU9q2mQBMGtROcbUeF1WvXDKnh5QG
9DE9qJ0yn8LbE9Cj+rcuc8+rnurWjWGgStPAcEah65TaZ4ZqhaCcMCAm6qxio6Js
TerlILP2QAssaPNksR6R8mm2vWoOFF/2uZXIQTdyUi1RlaSdemKiO9j2cO0RmJNP
qKqHzGacwobJu2XJc/go/CesXA1K8tgMNmEWw0Ng7Rzn29jtBTLeZalcS79GLe+E
RlYmY1s8lS2vYBu7OVIOTxaW9NPC7mTtKMEuZM4mZ3YplSQLZst0oEJeKOcqkBIv
YLt81V3eH2mERy9HsPiLbCBU6wPBk3qx/QZNvTq8vc+qXYi9JI7b89g2aCHFFwkk
ONXwuW4O4OjlbqH3K4Mf1iWMdvxbnLhuICRLVCdREay/exXv1dt0sF78Rs13AAv4
J2DlLG1YtMOhsRy3YxnVr94oYi7tA3GDx1aEuwZ4v/yuA7UZTD1A40iS+/P5GRic
uZJYtLoULJkPBx9LG7ON49mF9Ap2nlj2PkNeNE6lEeWLU8oMeu1HV6XPO7FaXOPv
4VZi1jI=
=5FVb
-----END PGP SIGNATURE-----
Merge tag 'sound-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A bit more changes than wished, but still manageable amount.
Most of commits are HD-audio specific device fixes / quirks, while
there is a revert for the previous fix due to regressions and a
double-free fix in ALSA core code"
* tag 'sound-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
Revert "ALSA: usb-audio: Split endpoint setups for hw_params and prepare"
ALSA: core: Fix double-free at snd_card_new()
ALSA: hda/realtek: Add a quirk for HP OMEN 16 (8902) mute LED
ALSA: hda/hdmi: Fix the converter reuse for the silent stream
ALSA: hda/realtek: Add quirk for ASUS GA503R laptop
ALSA: hda/realtek: Add pincfg for ASUS G533Z HP jack
ALSA: hda/realtek: Add pincfg for ASUS G513 HP jack
ALSA: hda/realtek: Re-arrange quirk table entries
ALSA: hda/realtek: Enable 4-speaker output Dell Precision 5530 laptop
ALSA: hda/realtek: Enable 4-speaker output Dell Precision 5570 laptop
ALSA: hda: Fix Nvidia dp infoframe
ALSA: hda/realtek: Add quirk for Huawei WRT-WX9
ALSA: hda/tegra: set depop delay for tegra
ALSA: hda: add Intel 5 Series / 3400 PCI DID
ALSA: hda: Fix hang at HD-audio codec unbinding due to refcount saturation
The kvm registration hooks must be registered even if the facilities
necessary for zPCI interpretation are unavailable, as vfio-pci-zdev will
expect to use the hooks regardless.
This fixes an issue where vfio-pci-zdev will fail its open function
because of a missing kvm_register when running on hardware that does not
support zPCI interpretation.
Fixes: ca922fecda ("KVM: s390: pci: Hook to access KVM lowlevel from VFIO")
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Link: https://lore.kernel.org/r/20220920193025.135655-1-mjrosato@linux.ibm.com
Message-Id: <20220920193025.135655-1-mjrosato@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
The GAIT and all of its entries must be represented by physical
addresses as this structure is shared with underlying firmware.
We can keep a virtual address of the GAIT origin in order to
handle processing in the kernel, but when traversing the entries
we must again convert the physical AISB stored in that GAIT entry
into a virtual address in order to process it.
Note: this currently doesn't fix a real bug, since virtual addresses
are indentical to physical ones.
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Acked-by: Nico Boehr <nrb@linux.ibm.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Link: https://lore.kernel.org/r/20220907155952.87356-1-mjrosato@linux.ibm.com
Message-Id: <20220907155952.87356-1-mjrosato@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
This silences smatch warnings reported by kbuild bot:
arch/s390/kvm/gaccess.c:859 guest_range_to_gpas() error: uninitialized symbol 'prot'.
arch/s390/kvm/gaccess.c:1064 access_guest_with_key() error: uninitialized symbol 'prot'.
This is because it cannot tell that the value is not used in this case.
The trans_exc* only examine prot if code is PGM_PROTECTION.
Pass a dummy value for other codes.
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Link: https://lore.kernel.org/r/20220825192540.1560559-1-scgl@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Fix some sparse warnings that a plain integer 0 is being used instead of
NULL.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Link: https://lore.kernel.org/r/20220915175514.167899-1-mjrosato@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iQFHBAABCgAxFiEEBsvAIBsPu6mG7thcrX5LkNig010FAmMqy5UTHG1rbEBwZW5n
dXRyb25peC5kZQAKCRCtfkuQ2KDTXSEiB/93YAPIO2GlO1tGmvCvTiiNrzGR21au
BNCHLFrFTKWrOnceA8c6Jm0YjeAXDFM0OMN0LR3JF9aVmSuKSGeqjDVqdKvB1ykq
93V2vQEnIVaOI9zkyo9zK+xVrJx8sso3+RypNeDp75NnBM6IiXJvSvCYQFt8zSvn
qqV7RPo+R+e5g+o3iQnfb38Ej9BLDHCtOODMBTD2VmIiNSmWBI9COTscBGSD1vNO
nxJbBBSKc0UFoAYOc7D9DejG8Dd0rlBDHvwpXA2Y28zYnTAfttCQOvdlO2r2waF2
y3A37MpmCQg9iMAUh7LJQbHzL28UNCIlUNB9Zl9FxX7cZ+49oheJAKuV
=5hoh
-----END PGP SIGNATURE-----
Merge tag 'linux-can-fixes-for-6.0-20220921' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2022-09-21
The 1st patch is by me, targets the flexcan driver and fixes a
potential system hang on single core systems under high CAN packet
rate.
The next 2 patches are also by me and target the gs_usb driver. A
potential race condition during the ndo_open callback as well as the
return value if the ethtool identify feature is not supported are
fixed.
* tag 'linux-can-fixes-for-6.0-20220921' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
can: gs_usb: gs_usb_set_phys_id(): return with error if identify is not supported
can: gs_usb: gs_can_open(): fix race dev->can.state condition
can: flexcan: flexcan_mailbox_read() fix return value for drop = true
====================
Link: https://lore.kernel.org/r/20220921083609.419768-1-mkl@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The missing header makes it hard for programs like elfutils to open
these files.
Fixes: 2d86612aac ("perf symbol: Correct address for bss symbols")
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Lieven Hey <lieven.hey@kdab.com>
Tested-by: Leo Yan <leo.yan@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Link: https://lore.kernel.org/r/20220915092910.711036-1-lieven.hey@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
$ sudo ./perf test -v each-cgroup
96: perf stat --bpf-counters --for-each-cgroup test :
--- start ---
test child forked, pid 79600
test child finished with 0
---- end ----
perf stat --bpf-counters --for-each-cgroup test: Ok
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20220916184132.1161506-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If it mixes core and uncore events, each evsel would have different cpu map.
But it assumed they are same with evlist's all_cpus and accessed by the same
index. This resulted in a crash like below.
$ perf stat -a --bpf-counters --for-each_cgroup ^. -e cycles,imc/cas_count_read/ sleep 1
Segmentation fault
While it's not recommended to use uncore events for cgroup aggregation, it
should not crash.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20220916184132.1161506-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The previous cpu map introduced a bug in the bperf cgroup counter. This
results in a failure when user gives a partial cpu map starting from
non-zero.
$ sudo ./perf stat -C 1-2 --bpf-counters --for-each-cgroup ^. sleep 1
libbpf: prog 'on_cgrp_switch': failed to create BPF link for perf_event FD 0:
-9 (Bad file descriptor)
Failed to attach cgroup program
To get the FD of an evsel, it should use a map index not the CPU number.
Fixes: 0255571a16 ("perf cpumap: Switch to using perf_cpu_map API")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: bpf@vger.kernel.org
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/r/20220916184132.1161506-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It seems the recent libbpf got more strict about the section name.
I'm seeing a failure like this:
$ sudo ./perf stat -a --bpf-counters --for-each-cgroup ^. sleep 1
libbpf: prog 'on_cgrp_switch': missing BPF prog type, check ELF section name 'perf_events'
libbpf: prog 'on_cgrp_switch': failed to load: -22
libbpf: failed to load object 'bperf_cgroup_bpf'
libbpf: failed to load BPF skeleton 'bperf_cgroup_bpf': -22
Failed to load cgroup skeleton
The section name should be 'perf_event' (without the trailing 's').
Although it's related to the libbpf change, it'd be better fix the
section name in the first place.
Fixes: 944138f048 ("perf stat: Enable BPF counter with --for-each-cgroup")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: bpf@vger.kernel.org
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/r/20220916184132.1161506-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Intel m10 bmc secure update
- Russ's change fixes the memory leak for a sysfs node reading
All patches have been reviewed on the mailing list, and have been in the
last linux-next releases (as part of our for-6.0 branch).
Signed-off-by: Xu Yilun <yilun.xu@intel.com>
-----BEGIN PGP SIGNATURE-----
iIkEABYIADEWIQSgSJpClIeaArXyudb8twOBpKCM2gUCYylUThMceWlsdW4ueHVA
aW50ZWwuY29tAAoJEPy3A4GkoIzaiNUA/3KNKS+/to+4Mbm7d44CwDHKjyDLBXjv
K13FE4ofqgskAP0Tl452unBAzOf1iGMA6fjA8eVAZ3djt2p50Foj9vfCAA==
=TG/z
-----END PGP SIGNATURE-----
Merge tag 'fpga-for-6.0-final' of git://git.kernel.org/pub/scm/linux/kernel/git/fpga/linux-fpga into char-misc-linus
Xu writes:
FPGA Manager changes for 6.0-final
Intel m10 bmc secure update
- Russ's change fixes the memory leak for a sysfs node reading
All patches have been reviewed on the mailing list, and have been in the
last linux-next releases (as part of our for-6.0 branch).
Signed-off-by: Xu Yilun <yilun.xu@intel.com>
* tag 'fpga-for-6.0-final' of git://git.kernel.org/pub/scm/linux/kernel/git/fpga/linux-fpga:
fpga: m10bmc-sec: Fix possible memory leak of flash_buf
If aq_nic_stop() fails, aq_ndev_close() returns err without calling
aq_nic_deinit() to release the relevant memory and resource, which
will lead to a memory leak.
We can fix it by deleting the if condition judgment and goto statement to
call aq_nic_deinit() directly after aq_nic_stop() to fix the memory leak.
Signed-off-by: Jianglei Nie <niejianglei2021@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit 9cd4f14344.
Some issues were reported on the original commit. Some thunderbolt devices
don't work anymore due to the following DMA fault.
DMAR: DRHD: handling fault status reg 2
DMAR: [INTR-REMAP] Request device [09:00.0] fault index 0x8080
[fault reason 0x25]
Blocked a compatibility format interrupt request
Bring it back for now to avoid functional regression.
Fixes: 9cd4f14344 ("iommu/vt-d: Fix possible recursive locking in intel_iommu_init()")
Link: https://lore.kernel.org/linux-iommu/485A6EA5-6D58-42EA-B298-8571E97422DE@getmailspring.com/
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216497
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: <stable@vger.kernel.org> # 5.19.x
Reported-and-tested-by: George Hilliard <thirtythreeforty@gmail.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20220920081701.3453504-1-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Florian Westphal says:
====================
netfilter: bugfixes for net
The following set contains netfilter fixes for the *net* tree.
Regressions (rc only):
recent ebtables crash fix was incomplete, it added a memory leak.
The patch to fix possible buffer overrun for BIG TCP in ftp conntrack
tried to be too clever, we cannot re-use ct->lock: NAT engine might
grab it again -> deadlock. Revert back to a global spinlock.
Both from myself.
Remove the documentation for the recently removed
'nf_conntrack_helper' sysctl as well, from Pablo Neira.
The static_branch_inc() that guards the 'chain stats enabled' path
needs to be deferred further, until the entire transaction was created.
From Tetsuo Handa.
Older bugs:
Since 5.3:
nf_tables_addchain may leak pcpu memory in error path when
offloading fails. Also from Tetsuo Handa.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Until commit 409c188c57 ("can: tree-wide: advertise software
timestamping capabilities") the ethtool_ops was only assigned for
devices which support the GS_CAN_FEATURE_IDENTIFY feature. That commit
assigns ethtool_ops unconditionally.
This results on controllers without GS_CAN_FEATURE_IDENTIFY support
for the following ethtool error:
| $ ethtool -p can0 1
| Cannot identify NIC: Broken pipe
Restore the correct error value by checking for
GS_CAN_FEATURE_IDENTIFY in the gs_usb_set_phys_id() function.
| $ ethtool -p can0 1
| Cannot identify NIC: Operation not supported
While there use the variable "netdev" for the "struct net_device"
pointer and "dev" for the "struct gs_can" pointer as in the rest of
the driver.
Fixes: 409c188c57 ("can: tree-wide: advertise software timestamping capabilities")
Link: http://lore.kernel.org/all/20220818143853.2671854-1-mkl@pengutronix.de
Cc: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
The dev->can.state is set to CAN_STATE_ERROR_ACTIVE, after the device
has been started. On busy networks the CAN controller might receive
CAN frame between and go into an error state before the dev->can.state
is assigned.
Assign dev->can.state before starting the controller to close the race
window.
Fixes: d08e973a77 ("can: gs_usb: Added support for the GS_USB CAN devices")
Link: https://lore.kernel.org/all/20220920195216.232481-1-mkl@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
The following happened on an i.MX25 using flexcan with many packets on
the bus:
The rx-offload queue reached a length more than skb_queue_len_max. In
can_rx_offload_offload_one() the drop variable was set to true which
made the call to .mailbox_read() (here: flexcan_mailbox_read()) to
_always_ return ERR_PTR(-ENOBUFS) and drop the rx'ed CAN frame. So
can_rx_offload_offload_one() returned ERR_PTR(-ENOBUFS), too.
can_rx_offload_irq_offload_fifo() looks as follows:
| while (1) {
| skb = can_rx_offload_offload_one(offload, 0);
| if (IS_ERR(skb))
| continue;
| if (!skb)
| break;
| ...
| }
The flexcan driver wrongly always returns ERR_PTR(-ENOBUFS) if drop is
requested, even if there is no CAN frame pending. As the i.MX25 is a
single core CPU, while the rx-offload processing is active, there is
no thread to process packets from the offload queue. So the queue
doesn't get any shorter and this results is a tight loop.
Instead of always returning ERR_PTR(-ENOBUFS) if drop is requested,
return NULL if no CAN frame is pending.
Changes since v1: https://lore.kernel.org/all/20220810144536.389237-1-u.kleine-koenig@pengutronix.de
- don't break in can_rx_offload_irq_offload_fifo() in case of an error,
return NULL in flexcan_mailbox_read() in case of no pending CAN frame
instead
Fixes: 4e9c9484b0 ("can: rx-offload: Prepare for CAN FD support")
Link: https://lore.kernel.org/all/20220811094254.1864367-1-mkl@pengutronix.de
Cc: stable@vger.kernel.org # v5.5
Suggested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Tested-by: Thorsten Scherer <t.scherer@eckelmann.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
When running gpio test on nxp-ls1028 platform with below command
gpiomon --num-events=3 --rising-edge gpiochip1 25
There will be a warning trace as below:
Call trace:
free_irq+0x204/0x360
lineevent_free+0x64/0x70
gpio_ioctl+0x598/0x6a0
__arm64_sys_ioctl+0xb4/0x100
invoke_syscall+0x5c/0x130
......
el0t_64_sync+0x1a0/0x1a4
The reason of this issue is that calling request_threaded_irq()
function failed, and then lineevent_free() is invoked to release
the resource. Since the lineevent_state::irq was already set, so
the subsequent invocation of free_irq() would trigger the above
warning call trace. To fix this issue, set the lineevent_state::irq
after the IRQ register successfully.
Fixes: 4682427241 ("gpiolib: cdev: refactor lineevent cleanup into lineevent_free")
Cc: stable@vger.kernel.org
Signed-off-by: Meng Li <Meng.Li@windriver.com>
Reviewed-by: Kent Gibson <warthog618@gmail.com>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
The commit 924610607f ("gpio: tpmx86: Move PM device over to
irq domain") adds a dereference of girq that may be uninitialized.
Fix this by moving irq_domain_set_pm_device into if true branch
as suggested by Marc Zyngier.
Fixes: 924610607f ("gpio: tpmx86: Move PM device over to irq domain")
Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Dongliang Mu <mudongliangabcd@gmail.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
Since binutils 2.39, ld will print a warning if any stack section is
executable, which is the default for stack sections on files without a
.note.GNU-stack section.
This was fixed for x86 in commit ffcf9c5700 ("x86: link vdso and boot with -z noexecstack --no-warn-rwx-segments"),
but remained broken for UML, resulting in several warnings:
/usr/bin/ld: warning: arch/x86/um/vdso/vdso.o: missing .note.GNU-stack section implies executable stack
/usr/bin/ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker
/usr/bin/ld: warning: .tmp_vmlinux.kallsyms1 has a LOAD segment with RWX permissions
/usr/bin/ld: warning: .tmp_vmlinux.kallsyms1.o: missing .note.GNU-stack section implies executable stack
/usr/bin/ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker
/usr/bin/ld: warning: .tmp_vmlinux.kallsyms2 has a LOAD segment with RWX permissions
/usr/bin/ld: warning: .tmp_vmlinux.kallsyms2.o: missing .note.GNU-stack section implies executable stack
/usr/bin/ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker
/usr/bin/ld: warning: vmlinux has a LOAD segment with RWX permissions
Link both the VDSO and vmlinux with -z noexecstack, fixing the warnings
about .note.GNU-stack sections. In addition, pass --no-warn-rwx-segments
to dodge the remaining warnings about LOAD segments with RWX permissions
in the kallsyms objects. (Note that this flag is apparently not
available on lld, so hide it behind a test for BFD, which is what the
x86 patch does.)
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ffcf9c5700e49c0aee42dcba9a12ba21338e8136
Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=ba951afb99912da01a6e8434126b8fac7aa75107
Signed-off-by: David Gow <davidgow@google.com>
Reviewed-by: Lukas Straub <lukasstraub2@web.de>
Tested-by: Lukas Straub <lukasstraub2@web.de>
Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
Signed-off-by: Richard Weinberger <richard@nod.at>
Since commit 744d23c71a ("net: phy: Warn about incorrect
mdio_bus_phy_resume() state"), a warning splat is printed during system
resume with Wake-on-LAN disabled:
WARNING: CPU: 0 PID: 626 at drivers/net/phy/phy_device.c:323 mdio_bus_phy_resume+0xbc/0xe4
As the Renesas SuperH Ethernet driver already calls phy_{stop,start}()
in its suspend/resume callbacks, it is sufficient to just mark the MAC
responsible for managing the power state of the PHY.
Fixes: fba863b816 ("net: phy: make PHY PM ops a no-op if MAC driver manages PHY PM")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Link: https://lore.kernel.org/r/c6e1331b9bef61225fa4c09db3ba3e2e7214ba2d.1663598886.git.geert+renesas@glider.be
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Since commit 744d23c71a ("net: phy: Warn about incorrect
mdio_bus_phy_resume() state"), a warning splat is printed during system
resume with Wake-on-LAN disabled:
WARNING: CPU: 0 PID: 1197 at drivers/net/phy/phy_device.c:323 mdio_bus_phy_resume+0xbc/0xc8
As the Renesas Ethernet AVB driver already calls phy_{stop,start}() in
its suspend/resume callbacks, it is sufficient to just mark the MAC
responsible for managing the power state of the PHY.
Fixes: fba863b816 ("net: phy: make PHY PM ops a no-op if MAC driver manages PHY PM")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Link: https://lore.kernel.org/r/8ec796f47620980fdd0403e21bd8b7200b4fa1d4.1663598796.git.geert+renesas@glider.be
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
We can't use ct->lock, this is already used by the seqadj internals.
When using ftp helper + nat, seqadj will attempt to acquire ct->lock
again.
Revert back to a global lock for now.
Fixes: c783a29c7e ("netfilter: nf_ct_ftp: prefer skb_linearize")
Reported-by: Bruno de Paula Larini <bruno.larini@riosoft.com.br>
Signed-off-by: Florian Westphal <fw@strlen.de>
The bug fix was incomplete, it "replaced" crash with a memory leak.
The old code had an assignment to "ret" embedded into the conditional,
restore this.
Fixes: 7997eff828 ("netfilter: ebtables: reject blobs that don't provide all entry points")
Reported-and-tested-by: syzbot+a24c5252f3e3ab733464@syzkaller.appspotmail.com
Signed-off-by: Florian Westphal <fw@strlen.de>
It seems to me that percpu memory for chain stats started leaking since
commit 3bc158f8d0 ("netfilter: nf_tables: map basechain priority to
hardware priority") when nft_chain_offload_priority() returned an error.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Fixes: 3bc158f8d0 ("netfilter: nf_tables: map basechain priority to hardware priority")
Signed-off-by: Florian Westphal <fw@strlen.de>
syzbot is reporting underflow of nft_counters_enabled counter at
nf_tables_addchain() [1], for commit 43eb8949cf ("netfilter:
nf_tables: do not leave chain stats enabled on error") missed that
nf_tables_chain_destroy() after nft_basechain_init() in the error path of
nf_tables_addchain() decrements the counter because nft_basechain_init()
makes nft_is_base_chain() return true by setting NFT_CHAIN_BASE flag.
Increment the counter immediately after returning from
nft_basechain_init().
Link: https://syzkaller.appspot.com/bug?extid=b5d82a651b71cd8a75ab [1]
Reported-by: syzbot <syzbot+b5d82a651b71cd8a75ab@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Tested-by: syzbot <syzbot+b5d82a651b71cd8a75ab@syzkaller.appspotmail.com>
Fixes: 43eb8949cf ("netfilter: nf_tables: do not leave chain stats enabled on error")
Signed-off-by: Florian Westphal <fw@strlen.de>
This toggle has been already remove by b118509076 ("netfilter: remove
nf_conntrack_helper sysctl and modparam toggles").
Remove the documentation entry for this toggle too.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
As suggested by Vinod, adding myself as the reviewer
for the Qualcomm ETHQOS Ethernet driver.
Recently I have enabled this driver on a few Qualcomm
SoCs / boards and hence trying to keep a close eye on
it.
Signed-off-by: Bhupesh Sharma <bhupesh.sharma@linaro.org>
Acked-by: Vinod Koul <vkoul@kernel.org>
Link: https://lore.kernel.org/r/20220915112804.3950680-1-bhupesh.sharma@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When performing a reset on ice driver with link-down-on-close flag on
interface would always stay down. Fix this by moving a check of this
flag to ice_stop() that is called only when user wants to bring
interface down.
Fixes: ab4ab73fc1 ("ice: Add ethtool private flag to make forcing link down optional")
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Petr Oros <poros@redhat.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
After lowering number of tx queues the warning appears:
"Number of in use tx queues changed invalidating tc mappings. Priority
traffic classification disabled!"
Example command to reproduce:
ethtool -L enp24s0f0 tx 36 rx 36
Fix this by setting correct tc mapping before setting real number of
queues on netdev.
Fixes: 0754d65bd4 ("ice: Add infrastructure for mqprio support via ndo_setup_tc")
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Vladimir Oltean says:
====================
Fixes for tc-taprio software mode
While working on some new features for tc-taprio, I found some strange
behavior which looked like bugs. I was able to eventually trigger a NULL
pointer dereference. This patch set fixes 2 issues I saw. Detailed
explanation in patches.
====================
Link: https://lore.kernel.org/r/20220915100802.2308279-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
taprio can only operate as root qdisc, and to that end, there exists the
following check in taprio_init(), just as in mqprio:
if (sch->parent != TC_H_ROOT)
return -EOPNOTSUPP;
And indeed, when we try to attach taprio to an mqprio child, it fails as
expected:
$ tc qdisc add dev swp0 root handle 1: mqprio num_tc 8 \
map 0 1 2 3 4 5 6 7 \
queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 hw 0
$ tc qdisc replace dev swp0 parent 1:2 taprio num_tc 8 \
map 0 1 2 3 4 5 6 7 \
queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
base-time 0 sched-entry S 0x7f 990000 sched-entry S 0x80 100000 \
flags 0x0 clockid CLOCK_TAI
Error: sch_taprio: Can only be attached as root qdisc.
(extack message added by me)
But when we try to attach a taprio child to a taprio root qdisc,
surprisingly it doesn't fail:
$ tc qdisc replace dev swp0 root handle 1: taprio num_tc 8 \
map 0 1 2 3 4 5 6 7 queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
base-time 0 sched-entry S 0x7f 990000 sched-entry S 0x80 100000 \
flags 0x0 clockid CLOCK_TAI
$ tc qdisc replace dev swp0 parent 1:2 taprio num_tc 8 \
map 0 1 2 3 4 5 6 7 \
queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
base-time 0 sched-entry S 0x7f 990000 sched-entry S 0x80 100000 \
flags 0x0 clockid CLOCK_TAI
This is because tc_modify_qdisc() behaves differently when mqprio is
root, vs when taprio is root.
In the mqprio case, it finds the parent qdisc through
p = qdisc_lookup(dev, TC_H_MAJ(clid)), and then the child qdisc through
q = qdisc_leaf(p, clid). This leaf qdisc q has handle 0, so it is
ignored according to the comment right below ("It may be default qdisc,
ignore it"). As a result, tc_modify_qdisc() goes through the
qdisc_create() code path, and this gives taprio_init() a chance to check
for sch_parent != TC_H_ROOT and error out.
Whereas in the taprio case, the returned q = qdisc_leaf(p, clid) is
different. It is not the default qdisc created for each netdev queue
(both taprio and mqprio call qdisc_create_dflt() and keep them in
a private q->qdiscs[], or priv->qdiscs[], respectively). Instead, taprio
makes qdisc_leaf() return the _root_ qdisc, aka itself.
When taprio does that, tc_modify_qdisc() goes through the qdisc_change()
code path, because the qdisc layer never finds out about the child qdisc
of the root. And through the ->change() ops, taprio has no reason to
check whether its parent is root or not, just through ->init(), which is
not called.
The problem is the taprio_leaf() implementation. Even though code wise,
it does the exact same thing as mqprio_leaf() which it is copied from,
it works with different input data. This is because mqprio does not
attach itself (the root) to each device TX queue, but one of the default
qdiscs from its private array.
In fact, since commit 13511704f8 ("net: taprio offload: enforce qdisc
to netdev queue mapping"), taprio does this too, but just for the full
offload case. So if we tried to attach a taprio child to a fully
offloaded taprio root qdisc, it would properly fail too; just not to a
software root taprio.
To fix the problem, stop looking at the Qdisc that's attached to the TX
queue, and instead, always return the default qdiscs that we've
allocated (and to which we privately enqueue and dequeue, in software
scheduling mode).
Since Qdisc_class_ops :: leaf is only called from tc_modify_qdisc(),
the risk of unforeseen side effects introduced by this change is
minimal.
Fixes: 5a781ccbd1 ("tc: Add support for configuring the taprio scheduler")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In an incredibly strange API design decision, qdisc->destroy() gets
called even if qdisc->init() never succeeded, not exclusively since
commit 87b60cfacf ("net_sched: fix error recovery at qdisc creation"),
but apparently also earlier (in the case of qdisc_create_dflt()).
The taprio qdisc does not fully acknowledge this when it attempts full
offload, because it starts off with q->flags = TAPRIO_FLAGS_INVALID in
taprio_init(), then it replaces q->flags with TCA_TAPRIO_ATTR_FLAGS
parsed from netlink (in taprio_change(), tail called from taprio_init()).
But in taprio_destroy(), we call taprio_disable_offload(), and this
determines what to do based on FULL_OFFLOAD_IS_ENABLED(q->flags).
But looking at the implementation of FULL_OFFLOAD_IS_ENABLED()
(a bitwise check of bit 1 in q->flags), it is invalid to call this macro
on q->flags when it contains TAPRIO_FLAGS_INVALID, because that is set
to U32_MAX, and therefore FULL_OFFLOAD_IS_ENABLED() will return true on
an invalid set of flags.
As a result, it is possible to crash the kernel if user space forces an
error between setting q->flags = TAPRIO_FLAGS_INVALID, and the calling
of taprio_enable_offload(). This is because drivers do not expect the
offload to be disabled when it was never enabled.
The error that we force here is to attach taprio as a non-root qdisc,
but instead as child of an mqprio root qdisc:
$ tc qdisc add dev swp0 root handle 1: \
mqprio num_tc 8 map 0 1 2 3 4 5 6 7 \
queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 hw 0
$ tc qdisc replace dev swp0 parent 1:1 \
taprio num_tc 8 map 0 1 2 3 4 5 6 7 \
queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 base-time 0 \
sched-entry S 0x7f 990000 sched-entry S 0x80 100000 \
flags 0x0 clockid CLOCK_TAI
Unable to handle kernel paging request at virtual address fffffffffffffff8
[fffffffffffffff8] pgd=0000000000000000, p4d=0000000000000000
Internal error: Oops: 96000004 [#1] PREEMPT SMP
Call trace:
taprio_dump+0x27c/0x310
vsc9959_port_setup_tc+0x1f4/0x460
felix_port_setup_tc+0x24/0x3c
dsa_slave_setup_tc+0x54/0x27c
taprio_disable_offload.isra.0+0x58/0xe0
taprio_destroy+0x80/0x104
qdisc_create+0x240/0x470
tc_modify_qdisc+0x1fc/0x6b0
rtnetlink_rcv_msg+0x12c/0x390
netlink_rcv_skb+0x5c/0x130
rtnetlink_rcv+0x1c/0x2c
Fix this by keeping track of the operations we made, and undo the
offload only if we actually did it.
I've added "bool offloaded" inside a 4 byte hole between "int clockid"
and "atomic64_t picos_per_byte". Now the first cache line looks like
below:
$ pahole -C taprio_sched net/sched/sch_taprio.o
struct taprio_sched {
struct Qdisc * * qdiscs; /* 0 8 */
struct Qdisc * root; /* 8 8 */
u32 flags; /* 16 4 */
enum tk_offsets tk_offset; /* 20 4 */
int clockid; /* 24 4 */
bool offloaded; /* 28 1 */
/* XXX 3 bytes hole, try to pack */
atomic64_t picos_per_byte; /* 32 0 */
/* XXX 8 bytes hole, try to pack */
spinlock_t current_entry_lock; /* 40 0 */
/* XXX 8 bytes hole, try to pack */
struct sched_entry * current_entry; /* 48 8 */
struct sched_gate_list * oper_sched; /* 56 8 */
/* --- cacheline 1 boundary (64 bytes) --- */
Fixes: 9c66d15646 ("taprio: Add support for hardware offloading")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The global 'raw_v6_hashinfo' variable can be accessed even when IPv6 is
administratively disabled via the 'ipv6.disable=1' kernel command line
option, leading to a crash [1].
Fix by restoring the original behavior and always initializing the
variable, regardless of IPv6 support being administratively disabled or
not.
[1]
BUG: unable to handle page fault for address: ffffffffffffffc8
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 173e18067 P4D 173e18067 PUD 173e1a067 PMD 0
Oops: 0000 [#1] PREEMPT SMP KASAN
CPU: 3 PID: 271 Comm: ss Not tainted 6.0.0-rc4-custom-00136-g0727a9a5fbc1 #1396
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-1.fc36 04/01/2014
RIP: 0010:raw_diag_dump+0x310/0x7f0
[...]
Call Trace:
<TASK>
__inet_diag_dump+0x10f/0x2e0
netlink_dump+0x575/0xfd0
__netlink_dump_start+0x67b/0x940
inet_diag_handler_cmd+0x273/0x2d0
sock_diag_rcv_msg+0x317/0x440
netlink_rcv_skb+0x15e/0x430
sock_diag_rcv+0x2b/0x40
netlink_unicast+0x53b/0x800
netlink_sendmsg+0x945/0xe60
____sys_sendmsg+0x747/0x960
___sys_sendmsg+0x13a/0x1e0
__sys_sendmsg+0x118/0x1e0
do_syscall_64+0x34/0x80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Fixes: 0daf07e527 ("raw: convert raw sockets to RCU")
Reported-by: Roberto Ricci <rroberto2r@gmail.com>
Tested-by: Roberto Ricci <rroberto2r@gmail.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20220916084821.229287-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
TSN features on the ENETC (taprio, cbs, gate, police) are configured
through a mix of command BD ring messages and port registers:
enetc_port_rd(), enetc_port_wr().
Port registers are a region of the ENETC memory map which are only
accessible from the PCIe Physical Function. They are not accessible from
the Virtual Functions.
Moreover, attempting to access these registers crashes the kernel:
$ echo 1 > /sys/bus/pci/devices/0000\:00\:00.0/sriov_numvfs
pci 0000:00:01.0: [1957:ef00] type 00 class 0x020001
fsl_enetc_vf 0000:00:01.0: Adding to iommu group 15
fsl_enetc_vf 0000:00:01.0: enabling device (0000 -> 0002)
fsl_enetc_vf 0000:00:01.0 eno0vf0: renamed from eth0
$ tc qdisc replace dev eno0vf0 root taprio num_tc 8 map 0 1 2 3 4 5 6 7 \
queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 base-time 0 \
sched-entry S 0x7f 900000 sched-entry S 0x80 100000 flags 0x2
Unable to handle kernel paging request at virtual address ffff800009551a08
Internal error: Oops: 96000007 [#1] PREEMPT SMP
pc : enetc_setup_tc_taprio+0x170/0x47c
lr : enetc_setup_tc_taprio+0x16c/0x47c
Call trace:
enetc_setup_tc_taprio+0x170/0x47c
enetc_setup_tc+0x38/0x2dc
taprio_change+0x43c/0x970
taprio_init+0x188/0x1e0
qdisc_create+0x114/0x470
tc_modify_qdisc+0x1fc/0x6c0
rtnetlink_rcv_msg+0x12c/0x390
Split enetc_setup_tc() into separate functions for the PF and for the
VF drivers. Also remove enetc_qos.o from being included into
enetc-vf.ko, since it serves absolutely no purpose there.
Fixes: 34c6adf197 ("enetc: Configure the Time-Aware Scheduler via tc-taprio offload")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220916133209.3351399-2-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The VF netdev driver shouldn't respond to changes in the NETIF_F_HW_TC
flag; only PFs should. Moreover, TSN-specific code should go to
enetc_qos.c, which should not be included in the VF driver.
Fixes: 79e499829f ("net: enetc: add hw tc hw offload features for PSPF capability")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220916133209.3351399-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jason A. Donenfeld says:
====================
wireguard patches for 6.0-rc6
1) The ratelimiter timing test doesn't help outside of development, yet
it is currently preventing the module from being inserted on some
kernels when it flakes at insertion time. So we disable it.
2) A fix for a build error on UML, caused by a recent change in a
different tree.
3) A WARN_ON() is triggered by Kees' new fortified memcpy() patch, due
to memcpy()ing over a sockaddr pointer with the size of a
sockaddr_in[6]. The type safe fix is pretty simple. Given how classic
of a thing sockaddr punning is, I suspect this may be the first in a
few patches like this throughout the net tree, once Kees' fortify
series is more widely deployed (current it's just in next).
====================
Link: https://lore.kernel.org/r/20220916143740.831881-1-Jason@zx2c4.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Doing a variable-sized memcpy is slower, and the compiler isn't smart
enough to turn this into a constant-size assignment.
Further, Kees' latest fortified memcpy will actually bark, because the
destination pointer is type sockaddr, not explicitly sockaddr_in or
sockaddr_in6, so it thinks there's an overflow:
memcpy: detected field-spanning write (size 28) of single field
"&endpoint.addr" at drivers/net/wireguard/netlink.c:446 (size 16)
Fix this by just assigning by using explicit casts for each checked
case.
Fixes: e7096c131e ("net: WireGuard secure network tunnel")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reported-by: syzbot+a448cda4dba2dac50de5@syzkaller.appspotmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Since 1b620d539c ("kbuild: disable header exports for UML in a
straightforward way"), installing headers fails on UML, so just disable
installing them, since they're not needed anyway on the architecture.
Fixes: b438b3b8d6 ("wireguard: selftests: support UML")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
A previous commit tried to make the ratelimiter timings test more
reliable but in the process made it less reliable on other
configurations. This is an impossible problem to solve without
increasingly ridiculous heuristics. And it's not even a problem that
actually needs to be solved in any comprehensive way, since this is only
ever used during development. So just cordon this off with a DEBUG_
ifdef, just like we do for the trie's randomized tests, so it can be
enabled while hacking on the code, and otherwise disabled in CI. In the
process we also revert 151c8e499f.
Fixes: 151c8e499f ("wireguard: ratelimiter: use hrtimer in selftest")
Fixes: e7096c131e ("net: WireGuard secure network tunnel")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>