HiSilicon PCIe tune and trace device (PTT) could dynamically tune the
PCIe link's events, and trace the TLP headers).
This patch add support for PTT device in perf tool, so users could use
'perf record' to get TLP headers trace data.
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Acked-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bjorn Helgaas <helgaas@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Qi Liu <liuqi6124@gmail.com>
Cc: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Zeng Prime <prime.zeng@huawei.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-pci@vger.kernel.org
Cc: linuxarm@huawei.com
Link: https://lore.kernel.org/r/20220927081400.14364-3-yangyicong@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add find_pmu_for_event() and use to simplify logic in
auxtrace_record_init(). find_pmu_for_event() will be reused in
subsequent patches.
Reviewed-by: John Garry <john.garry@huawei.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bjorn Helgaas <helgaas@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Qi Liu <liuqi6124@gmail.com>
Cc: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Zeng Prime <prime.zeng@huawei.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-pci@vger.kernel.org
Cc: linuxarm@huawei.com
Link: https://lore.kernel.org/r/20220927081400.14364-2-yangyicong@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Testcase stat+json_output.sh fails in powerpc:
86: perf stat JSON output linter : FAILED!
The testcase "stat+json_output.sh" verifies perf stat JSON output. The
test covers aggregation modes like per-socket, per-core, per-die, -A
(no_aggr mode) along with few other tests. It counts expected fields for
various commands. For example say -A (i.e, AGGR_NONE mode), expects 7
fields in the output having "CPU" as first field. Same way, for
per-socket, it expects the first field in result to point to socket id.
The testcases compares the result with expected count.
The values for socket, die, core and cpu are fetched from topology
directory:
/sys/devices/system/cpu/cpu*/topology.
For example, socket value is fetched from "physical_package_id" file of
topology directory. (cpu__get_topology_int() in util/cpumap.c)
If a platform fails to fetch the topology information, values will be
set to -1. For example, incase of pSeries platform of powerpc, value for
"physical_package_id" is restricted and not exposed. So, -1 will be
assigned.
Perf code has a checks for valid cpu id in "aggr_printout"
(stat-display.c), which displays the fields. So, in cases where topology
values not exposed, first field of the output displaying will be empty.
This cause the testcase to fail, as it counts number of fields in the
output.
Incase of -A (AGGR_NONE mode,), testcase expects 7 fields in the output,
becos of -1 value obtained from topology files for some, only 6 fields
are printed. Hence a testcase failure reported due to mismatch in number
of fields in the output.
Patch here adds a sanity check in the testcase for topology. Check will
help to skip the test if -1 value found.
Fixes: 0c343af2a2 ("perf test: JSON format checking")
Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Suggested-by: Ian Rogers <irogers@google.com>
Suggested-by: James Clark <james.clark@arm.com>
Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Claire Jensen <cjense@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Link: https://lore.kernel.org/r/20221006155149.67205-2-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Testcase stat+csv_output.sh fails in powerpc:
84: perf stat CSV output linter: FAILED!
The testcase "stat+csv_output.sh" verifies perf stat CSV output. The
test covers aggregation modes like per-socket, per-core, per-die, -A
(no_aggr mode) along with few other tests. It counts expected fields for
various commands. For example say -A (i.e, AGGR_NONE mode), expects 7
fields in the output having "CPU" as first field. Same way, for
per-socket, it expects the first field in result to point to socket id.
The testcases compares the result with expected count.
The values for socket, die, core and cpu are fetched from topology
directory:
/sys/devices/system/cpu/cpu*/topology.
For example, socket value is fetched from "physical_package_id" file of
topology directory. (cpu__get_topology_int() in util/cpumap.c)
If a platform fails to fetch the topology information, values will be
set to -1. For example, incase of pSeries platform of powerpc, value for
"physical_package_id" is restricted and not exposed. So, -1 will be
assigned.
Perf code has a checks for valid cpu id in "aggr_printout"
(stat-display.c), which displays the fields. So, in cases where topology
values not exposed, first field of the output displaying will be empty.
This cause the testcase to fail, as it counts number of fields in the
output.
Incase of -A (AGGR_NONE mode,), testcase expects 7 fields in the output,
becos of -1 value obtained from topology files for some, only 6 fields
are printed. Hence a testcase failure reported due to mismatch in number
of fields in the output.
Patch here adds a sanity check in the testcase for topology. Check will
help to skip the test if -1 value found.
Fixes: 7473ee56db ("perf test: Add checking for perf stat CSV output.")
Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Suggested-by: Ian Rogers <irogers@google.com>
Suggested-by: James Clark <james.clark@arm.com>
Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Claire Jensen <cjense@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Link: https://lore.kernel.org/r/20221006155149.67205-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
User space tasks can migrate between CPUs, so when tracing selected CPUs,
system-wide sideband is still needed, however evlist->core.has_user_cpus
is not set in the hybrid case, so check the target cpu_list instead.
Fixes: 7d189cadbe ("perf intel-pt: Track sideband system-wide when needed")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20221012082259.22394-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
uClibc segfaulted because NULL was passed as the format to fprintf().
That happened because one of the format strings was missing and
intel_pt_print_info() didn't check that before calling fprintf().
Add the missing format string, and check format is not NULL before calling
fprintf().
Fixes: 11fa7cb86b ("perf tools: Pass Intel PT information for decoding MTC and CYC")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20221012082259.22394-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Since PERF_FORMAT_LOST was added, the default read format has that bit
set, so add it to the tests. Keep the old value as well so that the test
still passes on older kernels.
This fixes the following failure:
expected read_format=0|4, got 20
FAILED './tests/attr/test-record-C0' - match failure
Fixes: 85b425f31c8866e0 ("perf record: Set PERF_FORMAT_LOST by default")
Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221012094633.21669-2-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add tests:
Test with MTC and TSC disabled
Test with branches disabled
Test with/without CYC
Test recording with sample mode
Test with kernel trace
Test virtual LBR
Test power events
Test with TNT packets disabled
Test with event_trace
These tests mostly check that perf record works with the corresponding
Intel PT config terms, sometimes also checking that certain packets do or
do not appear in the resulting trace as appropriate.
The "Test virtual LBR" is slightly trickier, using a Python script to
check that branch stacks are actually synthesized.
Signed-off-by: Ammy Yi <ammy.yi@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20221014170905.64069-8-adrian.hunter@intel.com
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When a program header was added, it moved the text section but
GEN_ELF_TEXT_OFFSET was not updated.
Fix by adding the program header size and aligning.
Fixes: babd04386b ("perf jit: Include program header in ELF files")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Lieven Hey <lieven.hey@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20221014170905.64069-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a test for decoding self-modifying code using a jitdump file.
The test creates a workload that uses self-modifying code and generates its
own jitdump file. The result is processed with perf inject --jit and
checked for decoding errors.
Note the test will fail without patch "perf inject: Fix GEN_ELF_TEXT_OFFSET
for jit" applied.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20221014170905.64069-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tidy alignment of test function lines to make them more readable.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20221014170905.64069-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Messages display with the perf test -v option. Add a message to show when
skipping a test because the user cannot do kernel tracing.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20221014170905.64069-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When not decoding, the options "-B -N --no-bpf-event" speed up perf record.
Make a common function for them.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20221014170905.64069-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
count_result() does not always reset ret=0 which means the value can spill
into the next test result.
Fix by explicitly setting it to zero between tests.
Committer testing:
# perf test "Miscellaneous Intel PT testing"
110: Miscellaneous Intel PT testing : Ok
#
Tested as well with:
# perf test -v "Miscellaneous Intel PT testing"
Fixes: fd9b45e39c ("perf test: test_intel_pt.sh: Fix return checking")
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20221014170905.64069-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If the kernel exposes a new perf_event_attr field in a format attr, perf
will return an error stating the specified PMU can't be found. For
example, a format attr with 'config3:0-63' causes an error as config3 is
unknown to perf. This causes a compatibility issue between a newer
kernel with older perf tool.
Before this change with a kernel adding 'config3' I get:
$ perf record -e arm_spe// -- true
event syntax error: 'arm_spe//'
\___ Cannot find PMU `arm_spe'. Missing kernel support?
Run 'perf list' for a list of valid events
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
-e, --event <event> event selector. use 'perf list' to list
available events
After this change, I get:
$ perf record -e arm_spe// -- true
WARNING: 'arm_spe_0' format 'inv_event_filter' requires 'perf_event_attr::config3' which is not supported by this version of perf!
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.091 MB perf.data ]
To support unknown configN formats, rework the YACC implementation to
pass any config[0-9]+ format to perf_pmu__new_format() to handle with a
warning.
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Rob Herring <robh@kernel.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220914-arm-perf-tool-spe1-2-v2-v4-1-83c098e6212e@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
$ perf list metricgroups
gives
List of pre-defined events (to be used in -e):
Metric Groups:
Backend
Bad
BadSpec
But that's incorrect of course because metric groups or metrics can only
be specified with -M. So fix the message to say -e or -M
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20221004192634.998984-1-ak@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The -C/--cpu option was maily for report but it also affected record as
it ate the option. So users needed to use "--" after perf mem record to
pass the info to the perf record properly.
Check if this option is set for record, and pass it to the actual perf
record.
Before)
$ sudo perf --debug perf-event-open mem record -C 0 2>&1 | grep -a sys_perf_event_open
...
sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 4
sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 = 5
sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8 = 6
sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8 = 7
sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 8
sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 = 9
sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8 = 10
sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8 = 11
...
After)
$ sudo perf --debug perf-event-open mem record -C 0 2>&1 | grep -a sys_perf_event_open
...
sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 4
sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 5
sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 6
sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 7
Reported-by: Ravi Bangoria <ravi.bangoria@amd.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221004200211.1444521-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Its just for that __packed define, so use it expanded as __attribute__((packed)),
like the other files in /usr/include do.
This was problem was preventing building the libperf examples on ALT
Linux and Fedora 35, fix it.
Reported-by: Vitaly Chikunov <vt@altlinux.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Dmitry Levin <ldv@altlinux.org
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/Y0lnpl2Ix7VljVDc@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This test commonly fails on Arm Juno because the instruction interval
is large enough to miss generating any samples for Perf in system-wide
mode.
Fix this by lowering the interval until a comfortable number of Perf
instructions are generated. The test is still quick to run because only
a small amount of trace is gathered.
Before:
sudo ./perf test coresight -vvv
...
Recording trace with system wide mode
Looking at perf.data file for dumping branch samples:
Looking at perf.data file for reporting branch samples:
Looking at perf.data file for instruction samples:
CoreSight system wide testing: FAIL
...
After:
sudo ./perf test coresight -vvv
...
Recording trace with system wide mode
Looking at perf.data file for dumping branch samples:
Looking at perf.data file for reporting branch samples:
Looking at perf.data file for instruction samples:
CoreSight system wide testing: PASS
...
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: coresight@lists.linaro.org
Link: https://lore.kernel.org/r/20221005140508.1537277-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The recent change in the cgroup will break the backward compatiblity in
the BPF program. It should support both old and new kernels using BPF
CO-RE technique.
Like the task_struct->__state handling in the offcpu analysis, we can
check the field name in the cgroup struct.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: bpf@vger.kernel.org
Cc: cgroups@vger.kernel.org
Cc: zefan li <lizefan.x@bytedance.com>
Link: http://lore.kernel.org/lkml/20221011052808.282394-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmNIXQAQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpnplD/wP1Kcbn0wc6UktiOaGHy6/DjRmpi2a857g
ioxTioycD7/ZchIGsfXLca9xBuOufLK7//irG1CpxJT7QXXQTprHspXOZLpw2nUb
t77hGTgCHkI5QOPuVnZ4+agjvxefUQ0/TOEPLYlb1fh+Q81ZAR7NecV9XwXnGy4V
75gS1t3cP4WcmeFi6MIkvg8BZhusS2qnPvmviig6eDMDvlJ5Ct/x3zJGnxPH/jt/
fJtCvZUIUVSfuUirCh/Xv3gmXptPmwrQPKGW66dCZeZaZpUAcFkmvQu/4739q+x2
5XGhr02FlHxm6zX165rF4jvXNyn1K9nPj/gUQcPLKK+JesA56OPGsUCjkNE7azbm
ZWf710Tyzx1B2EAIk4keRNIe5YBcUQenwIbGd2czOf+oAuSmNX+tRLozIsVdwD7Z
NPvLFYfH+evPqMzIhESjt4q476eCl1Zu8pGfBvYmPJ9NNnjcG3wjFZLmmsfA4KLG
ffkJqBGLWc9LD4sUjsVhYxV79jcQDr7Wh+fkiQ1u9ML6qjNsp9dQijnvIb7dC6Vj
UKvMbB45oDPffI71dHx08xX+G7Kf5cLWm/Wf0DLRjJaL5uxUA/OGITMdjmvFzSLZ
FNdlAkwfIomLO3mFK1LrSUYMpwZ84ZBH7iZfO6omOk0MrAhMA3M1U1OnXUS+GgxK
R5A4tOD4Ow==
=kxra
-----END PGP SIGNATURE-----
Merge tag 'block-6.1-2022-10-13' of git://git.kernel.dk/linux
Pull more block updates from Jens Axboe:
"Fixes that ended up landing later than the initial block pull request.
Nothing really major in here:
- NVMe pull request via Christoph:
- add NVME_QUIRK_BOGUS_NID for Lexar NM760 (Abhijit)
- add NVME_QUIRK_NO_DEEPEST_PS to avoid the deepest sleep state
on ZHITAI TiPro5000 SSDs (Xi Ruoyao)
- fix possible hang caused during ctrl deletion (Sagi Grimberg)
- fix possible hang in live ns resize with ANA access (Sagi
Grimberg)
- Proactively avoid a sign extension issue with the queue flags
(Brian)
- Regression fix for hidden disks (Christoph)
- Update OPAL maintainers entry (Jonathan)
- blk-wbt regression initialization fix (Yu)"
* tag 'block-6.1-2022-10-13' of git://git.kernel.dk/linux:
nvme-multipath: fix possible hang in live ns resize with ANA access
nvme-pci: avoid the deepest sleep state on ZHITAI TiPro5000 SSDs
nvme-pci: add NVME_QUIRK_BOGUS_NID for Lexar NM760
nvme-tcp: fix possible hang caused during ctrl deletion
nvme-rdma: fix possible hang caused during ctrl deletion
block: fix leaking minors of hidden disks
block: avoid sign extend problem with default queue flags mask
blk-wbt: fix that 'rwb->wc' is always set to 1 in wbt_init()
block: Remove the repeat word 'can'
MAINTAINERS: Update SED-Opal Maintainers
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmNIXOYQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpr17D/9qA35JJ4lnZWo+ayqwRPAqpO9DRgmyt2W+
FQAdbaWpsi2cr+Iv4AU51rIWfydP1UZCfYBPkWvLCoCzLizWUz4c6bLBc4+HvJcR
wuiWG+js6PSi279mUA1s1XXYthYt7CiMpi3+2iatGiPBFtIIMeMqMos02ohKu/4L
gd9qTKUfwepBcM2D61wk/jl7UXtdIb+NCh685FvkQ32Nk4kQx3L5y3BRJ1NUtCTz
GI7HT2Lp6Im3pD6hpDPSS8H/DK7gTdyQYPiI9+rfdfIR8HW4k/3FkefYB5He554J
8b9MsdSN6g/oIQxPhMlbbE4e24ahEPK7h5V1HYCF8i/dPbxjuZQ0HpgC+LiLG9/J
cOVXjZVjcuzUmJM9eJ/jNZx0w3/RHGo8lstBvQ0SyuRpLYhe+pvVYlkNEkK4ptIZ
bw/scjmQxNp3xK8PRUMaIb7sHfha/+OoIj+XHXgeiujtDIFQOc+7G14o+gCkker5
ZIVrbXcn4cWAHKUMPlH3NWj89v6lkulfuOQ6NZPbz4CYj3tudqnG3ah810iGBTiR
1HOhtwKPYR8WGioj1ZjUiSJZ/8aT4GUuLM2EjKsIITo09mYwx24G+pfaWz+qBnzw
SQqg0M0XVCg+RSrfakl9LE2uX2V06YDMBg52HA014CQaRE0pJHX/0u1VKg4rdD0z
QvdbB5peXg==
=0dSL
-----END PGP SIGNATURE-----
Merge tag 'io_uring-6.1-2022-10-13' of git://git.kernel.dk/linux
Pull more io_uring updates from Jens Axboe:
"A collection of fixes that ended up either being later than the
initial pull, or dependent on multiple branches (6.0-late being one of
them) and hence deferred purposely. This contains:
- Cleanup fixes for the single submitter late 6.0 change, which we
pushed to 6.1 to keep the 6.0 changes small (Dylan, Pavel)
- Fix for IORING_OP_CONNECT not handling -EINPROGRESS correctly (me)
- Ensure that the zc sendmsg variant gets audited correctly (me)
- Regression fix from this merge window where kiocb_end_write()
doesn't always gets called, which can cause issues with fs freezing
(me)
- Registered files SCM handling fix (Pavel)
- Regression fix for big sqe dumping in fdinfo (Pavel)
- Registered buffers accounting fix (Pavel)
- Remove leftover notification structures, we killed them off late in
6.0 (Pavel)
- Minor optimizations (Pavel)
- Cosmetic variable shadowing fix (Stefan)"
* tag 'io_uring-6.1-2022-10-13' of git://git.kernel.dk/linux:
io_uring/rw: ensure kiocb_end_write() is always called
io_uring: fix fdinfo sqe offsets calculation
io_uring: local variable rw shadows outer variable in io_write
io_uring/opdef: remove 'audit_skip' from SENDMSG_ZC
io_uring: optimise locking for local tw with submit_wait
io_uring: remove redundant memory barrier in io_req_local_work_add
io_uring/net: handle -EINPROGRESS correct for IORING_OP_CONNECT
io_uring: remove notif leftovers
io_uring: correct pinned_vm accounting
io_uring/af_unix: defer registered files gc to io_uring release
io_uring: limit registration w/ SINGLE_ISSUER
io_uring: remove io_register_submitter
io_uring: simplify __io_uring_add_tctx_node
- Fixes for Mediatek MT6370 binding
- Merge the DT overlay maintainer entry to the main entry as Pantelis is
not active and Frank is taking a step back
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEktVUI4SxYhzZyEuo+vtdtY28YcMFAmNIOAAACgkQ+vtdtY28
YcO+Ow//bNddt8gh8Yz47XsTiMnLg9e/11Eb67MphHvHgvSOfw+12OhanasKQVMa
7Wioxyr2b7d2LLs+DdAJRfOw5WK+T9oQ9GdmGmlMimNLcBBwntU0wwg++PeTqiLy
j0Q+pIVq/n6ANWTvegTDh2m50znJC+wTd2wkOGZHrZ4B1W6B/hDLLGgVSGyDDO03
vK5lSkPJhDxfylem61br6L3G4FsDJY3qOkOW2E30ocSaTX/XglUcDIxXq+yNDhFH
IcbkIEYOEfKkFIKjcsGCKsJPfKOISdrIkJtbiG3ZWYrSwzDXGnHXGARVBoUdQ/rW
SAvVMCS7+iAVLKAtEX///yxltZiSxkJ7JQR4Z++5di7rveBS4F1LolrnArtMESOF
vptNXWVet1zg3iMU30+BMaYr6h3VcsiKgphrEJlsBhAzQoBZHxHoKvQSTlQyD2fE
cnlJrrk/eA9B3hyP1w6vST/wDkY10s+ButvUfgNlUiIX2JXut5xVClNFzOkaWxEv
RGC2OU8GFaoMvBlJmV/jwdjQ+YN2F+XV9meHqhZOKcdrgUt5kHmDzf6KbXRUAYc1
RXjuHk7zKHESRF7/fLw5zm/j1+Ty3p8A36C+SH2GwVfWST2OALjvpsEqHH4dwu1H
nBDbWSVvZfDQoihBBHx13+lxsoiewXAonqXHFTgThGcPIv1taYU=
=YnES
-----END PGP SIGNATURE-----
Merge tag 'devicetree-fixes-for-6.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull devicetree fixes from Rob Herring:
- Fixes for Mediatek MT6370 binding
- Merge the DT overlay maintainer entry to the main entry as Pantelis
is not active and Frank is taking a step back
* tag 'devicetree-fixes-for-6.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
MAINTAINERS: of: collapse overlay entry into main device tree entry
dt-bindings: mfd: mt6370: fix the interrupt order of the charger in the example
dt-bindings: leds: mt6370: Fix MT6370 LED indicator DT warning
catching the Chinese translation up with the front-page rework.
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEIw+MvkEiF49krdp9F0NaE2wMflgFAmNIPyAPHGNvcmJldEBs
d24ubmV0AAoJEBdDWhNsDH5Y6UMH/2ZLziHH0jQkoBAIhxyUzU3ZfXLlq5Xqo6vS
oBYfJlYLClY/dmfTc3HAI6UhhJLGTcTNDaC1H4YQdgeP6RVJruFThOYF9WgW0FFl
SgsBKhTbi3dpfdzjID8bJM7ytkbIvV4voNh52J9L1TA3z/CPxKSiXCScFAH/o12t
E+CMjtmgi2P8w3kqgX59FMavp3W8M8HsT6u/wVoKb+zXjjqXGFYEXTjjKUxufRf6
QWkaQGb0PHq9+2hAhgF4vdy4tWB9lr7r2ENZ8YKUkYUYfv5KGAqt39J7A4rC+g7w
4Rvzznd0BJv3nuZ4rdxom7cOJ77i3lmWSJ65FoDHNeQ/8VBNuZc=
=UpLC
-----END PGP SIGNATURE-----
Merge tag 'docs-6.1-2' of git://git.lwn.net/linux
Pull documentation fixes from Jonathan Corbet:
"A handful of relatively simple documentation fixes, plus a set of
patches catching the Chinese translation up with the front-page
rework"
* tag 'docs-6.1-2' of git://git.lwn.net/linux:
Documentation: rtla: Correct command line example
docs/zh_CN: add a man-pages link to zh_CN/index.rst
docs/zh_CN: Rewrite the Chinese translation front page
docs/zh_CN: add zh_CN/arch.rst
docs/zh_CN: promote the title of zh_CN/process/index.rst
docs/zh_CN: Update the translation of page_owner to 6.0-rc7
docs/zh_CN: Update the translation of ksm to 6.0-rc7
docs/howto: Replace abundoned URL of gmane.org
Documentation: ubifs: Fix compression idiom
Documentation/mm/page_owner.rst: delete frequently changing experimental data
docs/zh_CN: Fix build warning
docs: ftrace: Correct access mode
Current release - regressions:
- Revert "net/sched: taprio: make qdisc_leaf() see
the per-netdev-queue pfifo child qdiscs", it may cause crashes
when the qdisc is reconfigured
- inet: ping: fix splat due to packet allocation refactoring in inet
- tcp: clean up kernel listener's reqsk in inet_twsk_purge(),
fix UAF due to races when per-netns hash table is used
Current release - new code bugs:
- eth: adin1110: check in netdev_event that netdev belongs to driver
- fixes for PTR_ERR() vs NULL bugs in driver code, from Dan and co.
Previous releases - regressions:
- ipv4: handle attempt to delete multipath route when fib_info
contains an nh reference, avoid oob access
- wifi: fix handful of bugs in the new Multi-BSSID code
- wifi: mt76: fix rate reporting / throughput regression on mt7915
and newer, fix checksum offload
- wifi: iwlwifi: mvm: fix double list_add at
iwl_mvm_mac_wake_tx_queue (other cases)
- wifi: mac80211: do not drop packets smaller than the LLC-SNAP
header on fast-rx
Previous releases - always broken:
- ieee802154: don't warn zero-sized raw_sendmsg()
- ipv6: ping: fix wrong checksum for large frames
- mctp: prevent double key removal and unref
- tcp/udp: fix memory leaks and races around IPV6_ADDRFORM
- hv_netvsc: fix race between VF offering and VF association message
Misc:
- remove -Warray-bounds silencing in the drivers, compilers fixed
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmNISaMACgkQMUZtbf5S
IruEARAArjYZbOEGkUVqtcEbnV0vmxQ5GsVyvurDkmzUULJ1rVAITtG7BbxcPyZ7
tJf5BPmmpXxEXh/lZBIlgHLOGgf/cx4gkCH9Jz6LYlSoTpTZiTqxlOfAZNeei0FI
PD95Slvd3TnIOEysv5RH/pQzIoKdd6+YqOhVITbwCW36cCLaUm+r7JUhzDrnHMNE
KCcsOX9DDtW7MDJrJj/E0wlWeWcudpHY4DLG2A723X6Esu+8k6krK32XtkrFIKqa
PFxeU1NPgMkn4S2xRPKqy+W3dTMfMKB4WWBMMUzEU220MIxV4l/RZSrnI5nrnLh2
uXyUefpx+lD92D5BOiqUw8rK7B4Jq0uUrawuCf+70tbO1f13ThkkAlV6cEzrlnZY
tGQxs0ayFIDVypU1tpY9cemUiYXrnPpCkpz+V1G0us8L323eCHxjz/f5TUlb51Na
BVFvRqvxkjztprBv2LrH2SmnVtcH2kvQG8qMYmXRchBM+11rivz6BrPdE0V+muMg
Hjr6HefYMBpSgcD+ADVFr8a/OB/W7AuWpTBd3z/WyNQ5MxkFX9Kf2Lt2+j8SRfpE
ELO0AANFQZ1Gyp6LTbEkA3mFs1LhNNQyfjHcMHC16ZExHmV3i37BE9LJdnFM27N8
R8lIm4YDs6Jj6YIUDy2wExgUAgkUk7mfZNCMPNi2nSsdJksyAsc=
=AyqG
-----END PGP SIGNATURE-----
Merge tag 'net-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from netfilter, and wifi.
Current release - regressions:
- Revert "net/sched: taprio: make qdisc_leaf() see the
per-netdev-queue pfifo child qdiscs", it may cause crashes when the
qdisc is reconfigured
- inet: ping: fix splat due to packet allocation refactoring in inet
- tcp: clean up kernel listener's reqsk in inet_twsk_purge(), fix UAF
due to races when per-netns hash table is used
Current release - new code bugs:
- eth: adin1110: check in netdev_event that netdev belongs to driver
- fixes for PTR_ERR() vs NULL bugs in driver code, from Dan and co.
Previous releases - regressions:
- ipv4: handle attempt to delete multipath route when fib_info
contains an nh reference, avoid oob access
- wifi: fix handful of bugs in the new Multi-BSSID code
- wifi: mt76: fix rate reporting / throughput regression on mt7915
and newer, fix checksum offload
- wifi: iwlwifi: mvm: fix double list_add at
iwl_mvm_mac_wake_tx_queue (other cases)
- wifi: mac80211: do not drop packets smaller than the LLC-SNAP
header on fast-rx
Previous releases - always broken:
- ieee802154: don't warn zero-sized raw_sendmsg()
- ipv6: ping: fix wrong checksum for large frames
- mctp: prevent double key removal and unref
- tcp/udp: fix memory leaks and races around IPV6_ADDRFORM
- hv_netvsc: fix race between VF offering and VF association message
Misc:
- remove -Warray-bounds silencing in the drivers, compilers fixed"
* tag 'net-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (73 commits)
sunhme: fix an IS_ERR() vs NULL check in probe
net: marvell: prestera: fix a couple NULL vs IS_ERR() checks
kcm: avoid potential race in kcm_tx_work
tcp: Clean up kernel listener's reqsk in inet_twsk_purge()
net: phy: micrel: Fixes FIELD_GET assertion
openvswitch: add nf_ct_is_confirmed check before assigning the helper
tcp: Fix data races around icsk->icsk_af_ops.
ipv6: Fix data races around sk->sk_prot.
tcp/udp: Call inet6_destroy_sock() in IPv6 sk->sk_destruct().
udp: Call inet6_destroy_sock() in setsockopt(IPV6_ADDRFORM).
tcp/udp: Fix memory leak in ipv6_renew_options().
mctp: prevent double key removal and unref
selftests: netfilter: Fix nft_fib.sh for all.rp_filter=1
netfilter: rpfilter/fib: Populate flowic_l3mdev field
selftests: netfilter: Test reverse path filtering
net/mlx5: Make ASO poll CQ usable in atomic context
tcp: cdg: allow tcp_cdg_release() to be called multiple times
inet: ping: fix recent breakage
ipv6: ping: fix wrong checksum for large frames
net: ethernet: ti: am65-cpsw: set correct devlink flavour for unused ports
...
Fix a regression in virtio pci on power.
Add a reviewer for ifcvf.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmNIFSAPHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpwMsH/iSBZ8pswU+nnB/vD8MMypt1l3GCA9BmVfOF
8+wD5683ugx2YgJXowY0MCWhhKkwJBp6T22dS+PMHE0pvtW6Ent4IySceeBlnsTs
PypeTuGMlb/XmGiefBI4cEhGFQ/ug48UJ3NTkH08r+EAaZO8SZ1ltluln7RQJcgR
mwghMF97wWsM4UQjbcBz18tuvPcGY6+7X5xi+BIbhFDMkKyRYWU9VZlzxQnXRN42
qcxrMQ4me9IUxiuIsjwsqfTXgKqb+2esTiWJa3Au2BO1qRCfgqVjxUNEpmk0F+CF
4V7o0p5N47VC2LR3OlTWf05J3Aq+sf33itzQ66H8aId3fJP69Tk=
=kGLQ
-----END PGP SIGNATURE-----
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio fixes from Michael Tsirkin:
- Fix a regression in virtio pci on power
- Add a reviewer for ifcvf
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
vdpa/ifcvf: add reviewer
virtio_pci: use irq to detect interrupt support
- Found that the synthetic events were using strlen/strscpy() on values
that could have come from userspace, and that is bad.
Consolidate the string logic of kprobe and eprobe and extend it to
the synthetic events to safely process string addresses.
- Clean up content of text dump in ftrace_bug() where the output does not
make char reads into signed and sign extending the byte output.
- Fix some kernel docs in the ring buffer code.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCY0c6GBQccm9zdGVkdEBn
b29kbWlzLm9yZwAKCRAp5XQQmuv6qpDNAQCuw9YTeNMU4zxFqBg4/JCbfpnWQGj4
Qdl2u3WtEvTzrgEA85Q01swCYRKdrGPCrFemZ3lm6PGzpGruh+BfD4qRMwk=
=F5kK
-----END PGP SIGNATURE-----
Merge tag 'trace-v6.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:
- Found that the synthetic events were using strlen/strscpy() on values
that could have come from userspace, and that is bad.
Consolidate the string logic of kprobe and eprobe and extend it to
the synthetic events to safely process string addresses.
- Clean up content of text dump in ftrace_bug() where the output does
not make char reads into signed and sign extending the byte output.
- Fix some kernel docs in the ring buffer code.
* tag 'trace-v6.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing: Fix reading strings from synthetic events
tracing: Add "(fault)" name injection to kernel probes
tracing: Move duplicate code of trace_kprobe/eprobe.c into header
ring-buffer: Fix kernel-doc
ftrace: Fix char print issue in print_ip_ins()
noteworthy one being some additional wakeups in cap handling code, and
a messenger cleanup.
-----BEGIN PGP SIGNATURE-----
iQFHBAABCAAxFiEEydHwtzie9C7TfviiSn/eOAIR84sFAmNINMwTHGlkcnlvbW92
QGdtYWlsLmNvbQAKCRBKf944AhHzi5UeB/0ZzxdhZarepRYdOw4K+hHmCWYjlrEi
Aw91gfS9DmzXfLyV42/6kxhKGEmVH4Wpz/mAIfMcLaLJZxI9GspVZZuofK6XPJUY
eGqllxXgbgvCqnX9puCfw4RTrEJkt/y6e0/6EhAhjArDNyEylHcApONbEsvHLB+L
IjYEJRuDDNqBnacMjn0iqI2F3zpyu6DkuJNWLxfbGhnWWsj8LaxXVgLtBeePuoIN
udVZiNxiJAldDGc99r0xX5gicjyihBRiomjnz6FO6F459CtrPE/qdx6TNUUt63N3
Lt55JDCM8qJeA8ffblZrhNnT2iefcEuqRcSwSdLbQxW/l6y23O4drx+N
=l/PT
-----END PGP SIGNATURE-----
Merge tag 'ceph-for-6.1-rc1' of https://github.com/ceph/ceph-client
Pull ceph updates from Ilya Dryomov:
"A quiet round this time: several assorted filesystem fixes, the most
noteworthy one being some additional wakeups in cap handling code, and
a messenger cleanup"
* tag 'ceph-for-6.1-rc1' of https://github.com/ceph/ceph-client:
ceph: remove Sage's git tree from documentation
ceph: fix incorrectly showing the .snap size for stat
ceph: fail the open_by_handle_at() if the dentry is being unlinked
ceph: increment i_version when doing a setattr with caps
ceph: Use kcalloc for allocating multiple elements
ceph: no need to wait for transition RDCACHE|RD -> RD
ceph: fail the request if the peer MDS doesn't support getvxattr op
ceph: wake up the waiters if any new caps comes
libceph: drop last_piece flag from ceph_msg_data_cursor
- New Features:
- Add NFSv4.2 xattr tracepoints
- Replace xprtiod WQ in rpcrdma
- Flexfiles cancels I/O on layout recall or revoke
- Bugfixes and Cleanups:
- Directly use ida_alloc() / ida_free()
- Don't open-code max_t()
- Prefer using strscpy over strlcpy
- Remove unused forward declarations
- Always return layout states on flexfiles layout return
- Have LISTXATTR treat NFS4ERR_NOXATTR as an empty reply instead of error
- Allow more xprtrdma memory allocations to fail without triggering a reclaim
- Various other xprtrdma clean ups
- Fix rpc_killall_tasks() races
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEnZ5MQTpR7cLU7KEp18tUv7ClQOsFAmNHJToACgkQ18tUv7Cl
QOtTbA//QiresBzf7cnZOAwiZbe9LXiWfR2p5IkBLJPYJ8xtTliRLwnwYgQib9OI
+4DzBiEqujah9BDac5OeatYW1UDLQ9lMIoCyvPjSw8Yxa8JEHDb/1ODDUOMS+ZIo
dk1AKV2Wi2stxn85Sy+VGriE3JKiaeJxAlsWgiT/BLP0hAyZw1L3Tg017EgxVIVz
8cfPBciu/Bc2/pZp9f5+GBjAlcUX0u/JFKiLPDHDZkvFTr4RgREZOyStDWncgsxK
iHAIfSr6TxlynHabNAnFNVuYq7gkBe3jg1TkABdQ+SilAgdLpugAW8MFdig0AZQO
UIsVJHjRHLpz6cJurnDcu9tGB6jLVTZfyz8PZQl5H9CqnbSHUxdOCTuve7fGhVas
+wSXq1U98gStzoqtw5pMwsB2YSSOsUR8QEZpLEkvQgzHwoszNa7FrELqaZUJyJHR
qmRH2nKCzsSBbQn5AhnzHBxzeOv6r0r3YjvKd5utwsRtq3g9GX14KAOmqvDTKk2q
9KmrGlDVtVmOww2QnPTXH6mSthHLuqcKg1H2H7Xymmskq9n8PC6M+EiQd8XsKNJa
MfBkOVFdxrJq6Htpx4IMLJP6jvYVKEbef2eRFt8hNnla8pMPlsDqoIysJulaWpiB
HqdoPHR9Y26Qxuw7G91ba5Q5qqu9+ZOLB9jeSRjcXtsUDxq9f/A=
=p47k
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-6.1-1' of git://git.linux-nfs.org/projects/anna/linux-nfs
Pull NFS client updates from Anna Schumaker:
"New Features:
- Add NFSv4.2 xattr tracepoints
- Replace xprtiod WQ in rpcrdma
- Flexfiles cancels I/O on layout recall or revoke
Bugfixes and Cleanups:
- Directly use ida_alloc() / ida_free()
- Don't open-code max_t()
- Prefer using strscpy over strlcpy
- Remove unused forward declarations
- Always return layout states on flexfiles layout return
- Have LISTXATTR treat NFS4ERR_NOXATTR as an empty reply instead of
error
- Allow more xprtrdma memory allocations to fail without triggering a
reclaim
- Various other xprtrdma clean ups
- Fix rpc_killall_tasks() races"
* tag 'nfs-for-6.1-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (27 commits)
NFSv4/flexfiles: Cancel I/O if the layout is recalled or revoked
SUNRPC: Add API to force the client to disconnect
SUNRPC: Add a helper to allow pNFS drivers to selectively cancel RPC calls
SUNRPC: Fix races with rpc_killall_tasks()
xprtrdma: Fix uninitialized variable
xprtrdma: Prevent memory allocations from driving a reclaim
xprtrdma: Memory allocation should be allowed to fail during connect
xprtrdma: MR-related memory allocation should be allowed to fail
xprtrdma: Clean up synopsis of rpcrdma_regbuf_alloc()
xprtrdma: Clean up synopsis of rpcrdma_req_create()
svcrdma: Clean up RPCRDMA_DEF_GFP
SUNRPC: Replace the use of the xprtiod WQ in rpcrdma
NFSv4.2: Add a tracepoint for listxattr
NFSv4.2: Add tracepoints for getxattr, setxattr, and removexattr
NFSv4.2: Move TRACE_DEFINE_ENUM(NFS4_CONTENT_*) under CONFIG_NFS_V4_2
NFSv4.2: Add special handling for LISTXATTR receiving NFS4ERR_NOXATTR
nfs: remove nfs_wait_atomic_killable() and nfs_write_prepare() declaration
NFSv4: remove nfs4_renewd_prepare_shutdown() declaration
fs/nfs/pnfs_nfs.c: fix spelling typo and syntax error in comment
NFSv4/pNFS: Always return layout stats on layout return for flexfiles
...
The '-t/-T' parameters seem to have been swapped:
-t/--trace[=file]: save the stopped trace
to [file|timerlat_trace.txt]
-T/--thread us: stop trace if the thread latency
is higher than the argument in us
Swap them back.
Signed-off-by: Pierre Gondois <pierre.gondois@arm.com>
Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Link: https://lore.kernel.org/r/20221006084409.3882542-1-pierre.gondois@arm.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
The devm_request_region() function does not return error pointers, it
returns NULL on error.
Fixes: 914d9b2711 ("sunhme: switch to devres")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Sean Anderson <seanga2@gmail.com>
Reviewed-by: Rolf Eike Beer <eike-kernel@sf-tec.de>
Link: https://lore.kernel.org/r/Y0bWzJL8JknX8MUf@kili
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The __prestera_nexthop_group_create() function returns NULL on error
and the prestera_nexthop_group_get() returns error pointers. Fix these
two checks.
Fixes: 0a23ae2371 ("net: marvell: prestera: Add router nexthops ABI")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/Y0bWq+7DoKK465z8@kili
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet reported a use-after-free related to the per-netns ehash
series. [0]
When we create a TCP socket from userspace, the socket always holds a
refcnt of the netns. This guarantees that a reqsk timer is always fired
before netns dismantle. Each reqsk has a refcnt of its listener, so the
listener is not freed before the reqsk, and the net is not freed before
the listener as well.
OTOH, when in-kernel users create a TCP socket, it might not hold a refcnt
of its netns. Thus, a reqsk timer can be fired after the netns dismantle
and access freed per-netns ehash.
To avoid the use-after-free, we need to clean up TCP_NEW_SYN_RECV sockets
in inet_twsk_purge() if the netns uses a per-netns ehash.
[0]: https://lore.kernel.org/netdev/CANn89iLXMup0dRD_Ov79Xt8N9FM0XdhCHEN05sf3eLwxKweM6w@mail.gmail.com/
BUG: KASAN: use-after-free in tcp_or_dccp_get_hashinfo
include/net/inet_hashtables.h:181 [inline]
BUG: KASAN: use-after-free in reqsk_queue_unlink+0x320/0x350
net/ipv4/inet_connection_sock.c:913
Read of size 8 at addr ffff88807545bd80 by task syz-executor.2/8301
CPU: 1 PID: 8301 Comm: syz-executor.2 Not tainted
6.0.0-syzkaller-02757-gaf7d23f9d96a #0
Hardware name: Google Google Compute Engine/Google Compute Engine,
BIOS Google 09/22/2022
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
print_address_description mm/kasan/report.c:317 [inline]
print_report.cold+0x2ba/0x719 mm/kasan/report.c:433
kasan_report+0xb1/0x1e0 mm/kasan/report.c:495
tcp_or_dccp_get_hashinfo include/net/inet_hashtables.h:181 [inline]
reqsk_queue_unlink+0x320/0x350 net/ipv4/inet_connection_sock.c:913
inet_csk_reqsk_queue_drop net/ipv4/inet_connection_sock.c:927 [inline]
inet_csk_reqsk_queue_drop_and_put net/ipv4/inet_connection_sock.c:939 [inline]
reqsk_timer_handler+0x724/0x1160 net/ipv4/inet_connection_sock.c:1053
call_timer_fn+0x1a0/0x6b0 kernel/time/timer.c:1474
expire_timers kernel/time/timer.c:1519 [inline]
__run_timers.part.0+0x674/0xa80 kernel/time/timer.c:1790
__run_timers kernel/time/timer.c:1768 [inline]
run_timer_softirq+0xb3/0x1d0 kernel/time/timer.c:1803
__do_softirq+0x1d0/0x9c8 kernel/softirq.c:571
invoke_softirq kernel/softirq.c:445 [inline]
__irq_exit_rcu+0x123/0x180 kernel/softirq.c:650
irq_exit_rcu+0x5/0x20 kernel/softirq.c:662
sysvec_apic_timer_interrupt+0x93/0xc0 arch/x86/kernel/apic/apic.c:1107
</IRQ>
Fixes: d1e5e6408b ("tcp: Introduce optional per-netns ehash.")
Reported-by: syzbot <syzkaller@googlegroups.com>
Reported-by: Eric Dumazet <edumazet@google.com>
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20221012145036.74960-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Zhu Lingshan has been writing and reviewing ifcvf patches for
a while now, add as reviewer.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Zhu Lingshan <lingshan.zhu@intel.com>
Acked-by: Jason Wang <jasowang@redhat.com>
commit 71491c54ea ("virtio_pci: don't try to use intxif pin is zero")
breaks virtio_pci on powerpc, when running as a qemu guest.
vp_find_vqs() bails out because pci_dev->pin == 0.
But pci_dev->irq is populated correctly, so vp_find_vqs_intx() would
succeed if we called it - which is what the code used to do.
This seems to happen because pci_dev->pin is not populated in
pci_assign_irq(). A PCI core bug? Maybe.
However Linus said:
I really think that that is basically the only time you should use
that 'pci_dev->pin' thing: it basically exists not for "does this
device have an IRQ", but for "what is the routing of this irq on this
device".
and
The correct way to check for "no irq" doesn't use NO_IRQ at all, it just does
if (dev->irq) ...
so let's just check irq and be done with it.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Fixes: 71491c54ea ("virtio_pci: don't try to use intxif pin is zero")
Cc: "Angus Chen" <angus.chen@jaguarmicro.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20221012220312.308522-1-mst@redhat.com>
This has only the fixes for the scan parsing issues.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEH1e1rEeCd0AIMq6MB8qZga/fl8QFAmNH4nUACgkQB8qZga/f
l8RFHRAAlskV3quBEnFgrOyuIcWVKq4DOFFKXXQ89cwD3ZK6auYm5gyM2obE+HkD
JQZjZoJc57qh3/IoYrOlC1PO0oAHiBReQ+5xKD0d1Ye04euFuV3Boe25rH3+7wya
u8uVJQqlV0sqG0DhoOPhDV/A17owtQlN/0j3yVFukPAvBjZHRduWF/lk+ozwmAsE
xwByw1K/v9BqOSx3aZjFhb3snT/4IiV8yYLiHE0KMBC+p5Zk6iwOWbCHFstWFbku
U6rTmWSvTDLaR+koTq4WGo/fRI68Sh4eRhcZsMnSRR8rj1LQWrRehPc+IQ9TXsgS
Li0z+HeLUEUrJ8bouJV0wzizvLG473qWm5PyemomKfYA68VvplheBpL3ClLIpbW+
On3VKnyZESIKsRYoJzJ3HhG7lnay74yKC1F4GMj4GCdzkXchSBoF7z0QGM14a7+E
gby4WHDSxK5kdC01vQqDCk3PBgGw+Qte6ipIvcVXnRVWCn8O0Z1lIdOkEZkODp5s
F1Sg9YpFRqR1h8Uw664xm/RulhPx2o8iritPjQyv0trmFWsdRpll/EXqIUPMiT0n
yJI+jD+1hDXnRt2dhTTtpG46F9oru+DbBWK+iN0GHISAlIUMXlKocIyCIKxBUUAq
0kzJrqJGPbXoodEDxvLMnAObVavo4RHux+jQVBlEkjHDpj0N7WA=
=yEqx
-----END PGP SIGNATURE-----
Merge tag 'wireless-2022-10-13' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless
Johannes Berg says:
====================
More wireless fixes for 6.1
This has only the fixes for the scan parsing issues.
* tag 'wireless-2022-10-13' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
wifi: cfg80211: update hidden BSSes to avoid WARN_ON
wifi: mac80211: fix crash in beacon protection for P2P-device
wifi: mac80211_hwsim: avoid mac80211 warning on bad rate
wifi: cfg80211: avoid nontransmitted BSS list corruption
wifi: cfg80211: fix BSS refcounting bugs
wifi: cfg80211: ensure length byte is present before access
wifi: mac80211: fix MBSSID parsing use-after-free
wifi: cfg80211/mac80211: reject bad MBSSID elements
wifi: cfg80211: fix u8 overflow in cfg80211_update_notlisted_nontrans()
====================
Link: https://lore.kernel.org/r/20221013100522.46346-1-johannes@sipsolutions.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
FIELD_GET() must only be used with a mask that is a compile-time
constant. Mark the functions as __always_inline to avoid the problem.
Fixes: 21b688dabe ("net: phy: micrel: Cable Diag feature for lan8814 phy")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Divya Koppera <Divya.Koppera@microchip.com>
Link: https://lore.kernel.org/r/20221011095437.12580-1-Divya.Koppera@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
A WARN_ON call trace would be triggered when 'ct(commit, alg=helper)'
applies on a confirmed connection:
WARNING: CPU: 0 PID: 1251 at net/netfilter/nf_conntrack_extend.c:98
RIP: 0010:nf_ct_ext_add+0x12d/0x150 [nf_conntrack]
Call Trace:
<TASK>
nf_ct_helper_ext_add+0x12/0x60 [nf_conntrack]
__nf_ct_try_assign_helper+0xc4/0x160 [nf_conntrack]
__ovs_ct_lookup+0x72e/0x780 [openvswitch]
ovs_ct_execute+0x1d8/0x920 [openvswitch]
do_execute_actions+0x4e6/0xb60 [openvswitch]
ovs_execute_actions+0x60/0x140 [openvswitch]
ovs_packet_cmd_execute+0x2ad/0x310 [openvswitch]
genl_family_rcv_msg_doit.isra.15+0x113/0x150
genl_rcv_msg+0xef/0x1f0
which can be reproduced with these OVS flows:
table=0, in_port=veth1,tcp,tcp_dst=2121,ct_state=-trk
actions=ct(commit, table=1)
table=1, in_port=veth1,tcp,tcp_dst=2121,ct_state=+trk+new
actions=ct(commit, alg=ftp),normal
The issue was introduced by commit 248d45f1e1 ("openvswitch: Allow
attaching helper in later commit") where it somehow removed the check
of nf_ct_is_confirmed before asigning the helper. This patch is to fix
it by bringing it back.
Fixes: 248d45f1e1 ("openvswitch: Allow attaching helper in later commit")
Reported-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Tested-by: Aaron Conole <aconole@redhat.com>
Link: https://lore.kernel.org/r/c5c9092a22a2194650222bffaf786902613deb16.1665085502.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
setsockopt(IPV6_ADDRFORM) and tcp_v6_connect() change icsk->icsk_af_ops
under lock_sock(), but tcp_(get|set)sockopt() read it locklessly. To
avoid load/store tearing, we need to add READ_ONCE() and WRITE_ONCE()
for the reads and writes.
Thanks to Eric Dumazet for providing the syzbot report:
BUG: KCSAN: data-race in tcp_setsockopt / tcp_v6_connect
write to 0xffff88813c624518 of 8 bytes by task 23936 on cpu 0:
tcp_v6_connect+0x5b3/0xce0 net/ipv6/tcp_ipv6.c:240
__inet_stream_connect+0x159/0x6d0 net/ipv4/af_inet.c:660
inet_stream_connect+0x44/0x70 net/ipv4/af_inet.c:724
__sys_connect_file net/socket.c:1976 [inline]
__sys_connect+0x197/0x1b0 net/socket.c:1993
__do_sys_connect net/socket.c:2003 [inline]
__se_sys_connect net/socket.c:2000 [inline]
__x64_sys_connect+0x3d/0x50 net/socket.c:2000
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
read to 0xffff88813c624518 of 8 bytes by task 23937 on cpu 1:
tcp_setsockopt+0x147/0x1c80 net/ipv4/tcp.c:3789
sock_common_setsockopt+0x5d/0x70 net/core/sock.c:3585
__sys_setsockopt+0x212/0x2b0 net/socket.c:2252
__do_sys_setsockopt net/socket.c:2263 [inline]
__se_sys_setsockopt net/socket.c:2260 [inline]
__x64_sys_setsockopt+0x62/0x70 net/socket.c:2260
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
value changed: 0xffffffff8539af68 -> 0xffffffff8539aff8
Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 23937 Comm: syz-executor.5 Not tainted
6.0.0-rc4-syzkaller-00331-g4ed9c1e971b1-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine,
BIOS Google 08/26/2022
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: syzbot <syzkaller@googlegroups.com>
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>