Commit Graph

5321 Commits

Author SHA1 Message Date
Linus Torvalds
db42523b4f ftrace: Store the order of pages allocated in ftrace_page
Instead of saving the size of the records field of the ftrace_page, store
the order it uses to allocate the pages, as that is what is needed to know
in order to free the pages. This simplifies the code.

Link: https://lore.kernel.org/lkml/CAHk-=whyMxheOqXAORt9a7JK9gc9eHTgCJ55Pgs4p=X3RrQubQ@mail.gmail.com/

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[ change log written by Steven Rostedt ]
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-04-01 16:55:45 -04:00
Yordan Karadzhov (VMware)
f3ef7202ef tracing: Remove unused argument from "ring_buffer_time_stamp()
The "cpu" parameter is not being used by the function.

Link: https://lkml.kernel.org/r/20210329130331.199402-1-y.karadz@gmail.com

Signed-off-by: Yordan Karadzhov (VMware) <y.karadz@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-04-01 14:18:32 -04:00
Steven Rostedt (VMware)
22d5755a85 Merge branch 'trace/ftrace/urgent' into HEAD
Needed to merge trace/ftrace/urgent to get:

  Commit 59300b36f8 ("ftrace: Check if pages were allocated before calling free_pages()")

To clean up the code that is affected by it as well.
2021-04-01 14:16:37 -04:00
Steven Rostedt (VMware)
9deb193af6 tracing: Fix stack trace event size
Commit cbc3b92ce0 fixed an issue to modify the macros of the stack trace
event so that user space could parse it properly. Originally the stack
trace format to user space showed that the called stack was a dynamic
array. But it is not actually a dynamic array, in the way that other
dynamic event arrays worked, and this broke user space parsing for it. The
update was to make the array look to have 8 entries in it. Helper
functions were added to make it parse it correctly, as the stack was
dynamic, but was determined by the size of the event stored.

Although this fixed user space on how it read the event, it changed the
internal structure used for the stack trace event. It changed the array
size from [0] to [8] (added 8 entries). This increased the size of the
stack trace event by 8 words. The size reserved on the ring buffer was the
size of the stack trace event plus the number of stack entries found in
the stack trace. That commit caused the amount to be 8 more than what was
needed because it did not expect the caller field to have any size. This
produced 8 entries of garbage (and reading random data) from the stack
trace event:

          <idle>-0       [002] d... 1976396.837549: <stack trace>
 => trace_event_raw_event_sched_switch
 => __traceiter_sched_switch
 => __schedule
 => schedule_idle
 => do_idle
 => cpu_startup_entry
 => secondary_startup_64_no_verify
 => 0xc8c5e150ffff93de
 => 0xffff93de
 => 0
 => 0
 => 0xc8c5e17800000000
 => 0x1f30affff93de
 => 0x00000004
 => 0x200000000

Instead, subtract the size of the caller field from the size of the event
to make sure that only the amount needed to store the stack trace is
reserved.

Link: https://lore.kernel.org/lkml/your-ad-here.call-01617191565-ext-9692@work.hours/

Cc: stable@vger.kernel.org
Fixes: cbc3b92ce0 ("tracing: Set kernel_stack's caller size properly")
Reported-by: Vasily Gorbik <gor@linux.ibm.com>
Tested-by: Vasily Gorbik <gor@linux.ibm.com>
Acked-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-04-01 14:06:33 -04:00
Linus Torvalds
d19cc4bfbf Merge tag 'trace-v5.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull ftrace fix from Steven Rostedt:
 "Add check of order < 0 before calling free_pages()

  The function addresses that are traced by ftrace are stored in pages,
  and the size is held in a variable. If there's some error in creating
  them, the allocate ones will be freed. In this case, it is possible
  that the order of pages to be freed may end up being negative due to a
  size of zero passed to get_count_order(), and then that negative
  number will cause free_pages() to free a very large section.

  Make sure that does not happen"

* tag 'trace-v5.12-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  ftrace: Check if pages were allocated before calling free_pages()
2021-03-31 10:14:55 -07:00
Steven Rostedt (VMware)
59300b36f8 ftrace: Check if pages were allocated before calling free_pages()
It is possible that on error pg->size can be zero when getting its order,
which would return a -1 value. It is dangerous to pass in an order of -1
to free_pages(). Check if order is greater than or equal to zero before
calling free_pages().

Link: https://lore.kernel.org/lkml/20210330093916.432697c7@gandalf.local.home/

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-03-30 09:58:38 -04:00
David S. Miller
efd13b71a3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-25 15:31:22 -07:00
Qiujun Huang
70193038a6 tracing: Update create_system_filter() kernel-doc comment
commit f306cc82a9 ("tracing: Update event filters for multibuffer")
added the parameter @tr for create_system_filter().

commit bb9ef1cb7d ("tracing: Change apply_subsystem_event_filter()
paths to check file->system == dir") changed the parameter from @system to @dir.

Link: https://lkml.kernel.org/r/20210325161911.123452-1-hqjagain@gmail.com

Signed-off-by: Qiujun Huang <hqjagain@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-03-25 16:04:35 -04:00
Qiujun Huang
30c3d39f7f tracing: A minor cleanup for create_system_filter()
The first two parameters should be reduced to one, as @tr is simply
@dir->tr.

Link: https://lkml.kernel.org/r/20210324205642.65e03248@oasis.local.home
Link: https://lkml.kernel.org/r/20210325163752.128407-1-hqjagain@gmail.com

Suggested-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Qiujun Huang <hqjagain@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-03-25 15:26:25 -04:00
Bhaskar Chowdhury
4613bdcc12 kernel: trace: Mundane typo fixes in the file trace_events_filter.c
s/callin/calling/

Link: https://lkml.kernel.org/r/20210317095401.1854544-1-unixbhaskar@gmail.com

Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com>
[ Other fixes already done by Ingo Molnar ]
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-03-24 21:27:06 -04:00
Ingo Molnar
f2cc020d78 tracing: Fix various typos in comments
Fix ~59 single-word typos in the tracing code comments, and fix
the grammar in a handful of places.

Link: https://lore.kernel.org/r/20210322224546.GA1981273@gmail.com
Link: https://lkml.kernel.org/r/20210323174935.GA4176821@gmail.com

Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-03-23 14:08:18 -04:00
Steven Rostedt (VMware)
9a6944fee6 tracing: Add a verifier to check string pointers for trace events
It is a common mistake for someone writing a trace event to save a pointer
to a string in the TP_fast_assign() and then display that string pointer
in the TP_printk() with %s. The problem is that those two events may happen
a long time apart, where the source of the string may no longer exist.

The proper way to handle displaying any string that is not guaranteed to be
in the kernel core rodata section, is to copy it into the ring buffer via
the __string(), __assign_str() and __get_str() helper macros.

Add a check at run time while displaying the TP_printk() of events to make
sure that every %s referenced is safe to dereference, and if it is not,
trigger a warning and only show the address of the pointer, and the
dereferenced string if it can be safely retrieved with a
strncpy_from_kernel_nofault() call.

In order to not have to copy the parsing of vsnprintf() formats, or even
exporting its code, the verifier relies on vsnprintf() being able to
modify the va_list that is passed to it, and it remains modified after it
is called. This is the case for some architectures like x86_64, but other
architectures like x86_32 pass the va_list to vsnprintf() as a value not a
reference, and the verifier can not use it to parse the non string
arguments. Thus, at boot up, it is checked if vsnprintf() modifies the
passed in va_list or not, and a static branch will disable the verifier if
it's not compatible.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-03-18 12:58:27 -04:00
Steven Rostedt (VMware)
5013f454a3 tracing: Add check of trace event print fmts for dereferencing pointers
Trace events record data into the ring buffer at the time of the event. The
trace event has a printf logic to display the recorded data at a much later
time when the user reads the trace file. This makes using dereferencing
pointers unsafe if the dereferenced pointer points to the original source.
The safe way to handle this is to create an array within the trace event and
copy the source into the array. Then the dereference pointer may point to
that array.

As this is a easy mistake to make, a check is added to examine all trace
event print fmts to make sure that they are safe to use. This only checks
the various %p* dereferenced pointers like %pB, %pR, etc. It does not handle
dereferencing of strings, as there are some use cases that are OK to
dereference the source. That will be dealt with differently.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-03-18 12:58:26 -04:00
Steven Rostedt (VMware)
d8279bfc5e tracing: Add tracing_event_time_stamp() API
Add a tracing_event_time_stamp() API that checks if the event passed in is
not on the ring buffer but a pointer to the per CPU trace_buffered_event
which does not have its time stamp set yet.

If it is a pointer to the trace_buffered_event, then just return the
current time stamp that the ring buffer would produce.

Otherwise, return the time stamp from the event.

Link: https://lkml.kernel.org/r/20210316164114.131996180@goodmis.org

Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-03-18 12:58:26 -04:00
Steven Rostedt (VMware)
a948c69d6f ring-buffer: Add verifier for using ring_buffer_event_time_stamp()
The ring_buffer_event_time_stamp() must be only called by an event that has
not been committed yet, and is on the buffer that is passed in. This was
used to help debug converting the histogram logic over to using the new
time stamp code, and was proven to be very useful.

Add a verifier that can check that this is the case, and extra WARN_ONs to
catch unexpected use cases.

Link: https://lkml.kernel.org/r/20210316164113.987294354@goodmis.org

Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-03-18 12:58:26 -04:00
Steven Rostedt (VMware)
b94bc80df6 tracing: Use a no_filter_buffering_ref to stop using the filter buffer
Currently, the trace histograms relies on it using absolute time stamps to
trigger the tracing to not use the temp buffer if filters are set. That's
because the histograms need the full timestamp that is saved in the ring
buffer. That is no longer the case, as the ring_buffer_event_time_stamp()
can now return the time stamp for all events without all triggering a full
absolute time stamp.

Now that the absolute time stamp is an unrelated dependency to not using
the filters. There's nothing about having absolute timestamps to keep from
using the filter buffer. Instead, change the interface to explicitly state
to disable filter buffering that the histogram logic can use.

Link: https://lkml.kernel.org/r/20210316164113.847886563@goodmis.org

Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-03-18 12:58:26 -04:00
Steven Rostedt (VMware)
efe6196a6b ring-buffer: Allow ring_buffer_event_time_stamp() to return time stamp of all events
Currently, ring_buffer_event_time_stamp() only returns an accurate time
stamp of the event if it has an absolute extended time stamp attached to
it. To make it more robust, use the event_stamp() in case the event does
not have an absolute value attached to it.

This will allow ring_buffer_event_time_stamp() to be used in more cases
than just histograms, and it will also allow histograms to not require
including absolute values all the time.

Link: https://lkml.kernel.org/r/20210316164113.704830885@goodmis.org

Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-03-18 12:58:26 -04:00
Steven Rostedt (VMware)
b47e330231 tracing: Pass buffer of event to trigger operations
The ring_buffer_event_time_stamp() is going to be updated to extract the
time stamp for the event without needing it to be set to have absolute
values for all events. But to do so, it needs the buffer that the event is
on as the buffer saves information for the event before it is committed to
the buffer.

If the trace buffer is disabled, a temporary buffer is used, and there's
no access to this buffer from the current histogram triggers, even though
it is passed to the trace event code.

Pass the buffer that the event is on all the way down to the histogram
triggers.

Link: https://lkml.kernel.org/r/20210316164113.542448131@goodmis.org

Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-03-18 12:58:26 -04:00
Steven Rostedt (VMware)
8672e4948d ring-buffer: Add a event_stamp to cpu_buffer for each level of nesting
Add a place to save the current event time stamp for each level of nesting.
This will be used to retrieve the time stamp of the current event before it
is committed.

Link: https://lkml.kernel.org/r/20210316164113.399089673@goodmis.org

Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-03-18 12:58:25 -04:00
Steven Rostedt (VMware)
e20044f7e9 ring-buffer: Separate out internal use of ring_buffer_event_time_stamp()
The exported use of ring_buffer_event_time_stamp() is going to become
different than how it is used internally. Move the internal logic out into a
static function called rb_event_time_stamp(), and have the internal callers
call that instead.

Link: https://lkml.kernel.org/r/20210316164113.257790481@goodmis.org

Reviewed-by: Tom Zanussi <zanussi@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-03-18 12:58:25 -04:00
Alexei Starovoitov
8a141dd7f7 ftrace: Fix modify_ftrace_direct.
The following sequence of commands:
  register_ftrace_direct(ip, addr1);
  modify_ftrace_direct(ip, addr1, addr2);
  unregister_ftrace_direct(ip, addr2);
will cause the kernel to warn:
[   30.179191] WARNING: CPU: 2 PID: 1961 at kernel/trace/ftrace.c:5223 unregister_ftrace_direct+0x130/0x150
[   30.180556] CPU: 2 PID: 1961 Comm: test_progs    W  O      5.12.0-rc2-00378-g86bc10a0a711-dirty #3246
[   30.182453] RIP: 0010:unregister_ftrace_direct+0x130/0x150

When modify_ftrace_direct() changes the addr from old to new it should update
the addr stored in ftrace_direct_funcs. Otherwise the final
unregister_ftrace_direct() won't find the address and will cause the splat.

Fixes: 0567d68091 ("ftrace: Add modify_ftrace_direct()")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Link: https://lore.kernel.org/bpf/20210316195815.34714-1-alexei.starovoitov@gmail.com
2021-03-17 00:43:12 +01:00
David S. Miller
c1acda9807 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2021-03-09

The following pull-request contains BPF updates for your *net-next* tree.

We've added 90 non-merge commits during the last 17 day(s) which contain
a total of 114 files changed, 5158 insertions(+), 1288 deletions(-).

The main changes are:

1) Faster bpf_redirect_map(), from Björn.

2) skmsg cleanup, from Cong.

3) Support for floating point types in BTF, from Ilya.

4) Documentation for sys_bpf commands, from Joe.

5) Support for sk_lookup in bpf_prog_test_run, form Lorenz.

6) Enable task local storage for tracing programs, from Song.

7) bpf_for_each_map_elem() helper, from Yonghong.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-09 18:07:05 -08:00
Steven Rostedt (VMware)
ee666a1855 tracing: Skip selftests if tracing is disabled
If tracing is disabled for some reason (traceoff_on_warning, command line,
etc), the ftrace selftests are guaranteed to fail, as their results are
defined by trace data in the ring buffers. If the ring buffers are turned
off, the tests will fail, due to lack of data.

Because tracing being disabled is for a specific reason (warning, user
decided to, etc), it does not make sense to enable tracing to run the self
tests, as the test output may corrupt the reason for the tracing to be
disabled.

Instead, simply skip the self tests and report that they are being skipped
due to tracing being disabled.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-03-04 09:51:25 -05:00
Vamshi K Sthambamkadi
f40fc799af tracing: Fix memory leak in __create_synth_event()
kmemleak report:
unreferenced object 0xc5a6f708 (size 8):
  comm "ftracetest", pid 1209, jiffies 4294911500 (age 6.816s)
  hex dump (first 8 bytes):
    00 c1 3d 60 14 83 1f 8a                          ..=`....
  backtrace:
    [<f0aa4ac4>] __kmalloc_track_caller+0x2a6/0x460
    [<7d3d60a6>] kstrndup+0x37/0x70
    [<45a0e739>] argv_split+0x1c/0x120
    [<c17982f8>] __create_synth_event+0x192/0xb00
    [<0708b8a3>] create_synth_event+0xbb/0x150
    [<3d1941e1>] create_dyn_event+0x5c/0xb0
    [<5cf8b9e3>] trace_parse_run_command+0xa7/0x140
    [<04deb2ef>] dyn_event_write+0x10/0x20
    [<8779ac95>] vfs_write+0xa9/0x3c0
    [<ed93722a>] ksys_write+0x89/0xc0
    [<b9ca0507>] __ia32_sys_write+0x15/0x20
    [<7ce02d85>] __do_fast_syscall_32+0x45/0x80
    [<cb0ecb35>] do_fast_syscall_32+0x29/0x60
    [<2467454a>] do_SYSENTER_32+0x15/0x20
    [<9beaa61d>] entry_SYSENTER_32+0xa9/0xfc
unreferenced object 0xc5a6f078 (size 8):
  comm "ftracetest", pid 1209, jiffies 4294911500 (age 6.816s)
  hex dump (first 8 bytes):
    08 f7 a6 c5 00 00 00 00                          ........
  backtrace:
    [<bbac096a>] __kmalloc+0x2b6/0x470
    [<aa2624b4>] argv_split+0x82/0x120
    [<c17982f8>] __create_synth_event+0x192/0xb00
    [<0708b8a3>] create_synth_event+0xbb/0x150
    [<3d1941e1>] create_dyn_event+0x5c/0xb0
    [<5cf8b9e3>] trace_parse_run_command+0xa7/0x140
    [<04deb2ef>] dyn_event_write+0x10/0x20
    [<8779ac95>] vfs_write+0xa9/0x3c0
    [<ed93722a>] ksys_write+0x89/0xc0
    [<b9ca0507>] __ia32_sys_write+0x15/0x20
    [<7ce02d85>] __do_fast_syscall_32+0x45/0x80
    [<cb0ecb35>] do_fast_syscall_32+0x29/0x60
    [<2467454a>] do_SYSENTER_32+0x15/0x20
    [<9beaa61d>] entry_SYSENTER_32+0xa9/0xfc

In __create_synth_event(), while iterating field/type arguments, the
argv_split() will return array of atleast 2 elements even when zero
arguments(argc=0) are passed. for e.g. when there is double delimiter
or string ends with delimiter

To fix call argv_free() even when argc=0.

Link: https://lkml.kernel.org/r/20210304094521.GA1826@cosmos

Signed-off-by: Vamshi K Sthambamkadi <vamshi.k.sthambamkadi@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-03-04 09:45:57 -05:00
Steven Rostedt (VMware)
6549de1fe3 ring-buffer: Add a little more information and a WARN when time stamp going backwards is detected
When the CONFIG_RING_BUFFER_VALIDATE_TIME_DELTAS is enabled, and the time
stamps are detected as not being valid, it reports information about the
write stamp, but does not show the before_stamp which is still useful
information. Also, it should give a warning once, such that tests detect
this happening.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-03-04 09:45:17 -05:00
Steven Rostedt (VMware)
6f6be606e7 ring-buffer: Force before_stamp and write_stamp to be different on discard
Part of the logic of the new time stamp code depends on the before_stamp and
the write_stamp to be different if the write_stamp does not match the last
event on the buffer, as it will be used to calculate the delta of the next
event written on the buffer.

The discard logic depends on this, as the next event to come in needs to
inject a full timestamp as it can not rely on the last event timestamp in
the buffer because it is unknown due to events after it being discarded. But
by changing the write_stamp back to the time before it, it forces the next
event to use a full time stamp, instead of relying on it.

The issue came when a full time stamp was used for the event, and
rb_time_delta() returns zero in that case. The update to the write_stamp
(which subtracts delta) made it not change. Then when the event is removed
from the buffer, because the before_stamp and write_stamp still match, the
next event written would calculate its delta from the write_stamp, but that
would be wrong as the write_stamp is of the time of the event that was
discarded.

In the case that the delta change being made to write_stamp is zero, set the
before_stamp to zero as well, and this will force the next event to inject a
full timestamp and not use the current write_stamp.

Cc: stable@vger.kernel.org
Fixes: a389d86f7f ("ring-buffer: Have nested events still record running time stamp")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-03-04 09:45:17 -05:00
Rolf Eike Beer
69268094a1 tracing: Fix help text of TRACEPOINT_BENCHMARK in Kconfig
It's "cond_resched()" not "cond_sched()".

Link: https://lkml.kernel.org/r/1863065.aFVDpXsuPd@devpool47

Signed-off-by: Rolf Eike Beer <eb@emlix.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-03-04 09:45:17 -05:00
Yordan Karadzhov (VMware)
70d443d846 tracing: Remove duplicate declaration from trace.h
A declaration of function "int trace_empty(struct trace_iterator *iter)"
shows up twice in the header file kernel/trace/trace.h

Link: https://lkml.kernel.org/r/20210304092348.208033-1-y.karadz@gmail.com

Signed-off-by: Yordan Karadzhov (VMware) <y.karadz@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-03-04 09:44:47 -05:00
Linus Torvalds
3ab6608e66 Merge tag 'block-5.12-2021-02-27' of git://git.kernel.dk/linux-block
Pull more block updates from Jens Axboe:
 "A few stragglers (and one due to me missing it originally), and fixes
  for changes in this merge window mostly. In particular:

   - blktrace cleanups (Chaitanya, Greg)

   - Kill dead blk_pm_* functions (Bart)

   - Fixes for the bio alloc changes (Christoph)

   - Fix for the partition changes (Christoph, Ming)

   - Fix for turning off iopoll with polled IO inflight (Jeffle)

   - nbd disconnect fix (Josef)

   - loop fsync error fix (Mauricio)

   - kyber update depth fix (Yang)

   - max_sectors alignment fix (Mikulas)

   - Add bio_max_segs helper (Matthew)"

* tag 'block-5.12-2021-02-27' of git://git.kernel.dk/linux-block: (21 commits)
  block: Add bio_max_segs
  blktrace: fix documentation for blk_fill_rw()
  block: memory allocations in bounce_clone_bio must not fail
  block: remove the gfp_mask argument to bounce_clone_bio
  block: fix bounce_clone_bio for passthrough bios
  block-crypto-fallback: use a bio_set for splitting bios
  block: fix logging on capacity change
  blk-settings: align max_sectors on "logical_block_size" boundary
  block: reopen the device in blkdev_reread_part
  block: don't skip empty device in in disk_uevent
  blktrace: remove debugfs file dentries from struct blk_trace
  nbd: handle device refs for DESTROY_ON_DISCONNECT properly
  kyber: introduce kyber_depth_updated()
  loop: fix I/O error on fsync() in detached loop devices
  block: fix potential IO hang when turning off io_poll
  block: get rid of the trace rq insert wrapper
  blktrace: fix blk_rq_merge documentation
  blktrace: fix blk_rq_issue documentation
  blktrace: add blk_fill_rwbs documentation comment
  block: remove superfluous param in blk_fill_rwbs()
  ...
2021-02-28 11:23:38 -08:00
Yonghong Song
69c087ba62 bpf: Add bpf_for_each_map_elem() helper
The bpf_for_each_map_elem() helper is introduced which
iterates all map elements with a callback function. The
helper signature looks like
  long bpf_for_each_map_elem(map, callback_fn, callback_ctx, flags)
and for each map element, the callback_fn will be called. For example,
like hashmap, the callback signature may look like
  long callback_fn(map, key, val, callback_ctx)

There are two known use cases for this. One is from upstream ([1]) where
a for_each_map_elem helper may help implement a timeout mechanism
in a more generic way. Another is from our internal discussion
for a firewall use case where a map contains all the rules. The packet
data can be compared to all these rules to decide allow or deny
the packet.

For array maps, users can already use a bounded loop to traverse
elements. Using this helper can avoid using bounded loop. For other
type of maps (e.g., hash maps) where bounded loop is hard or
impossible to use, this helper provides a convenient way to
operate on all elements.

For callback_fn, besides map and map element, a callback_ctx,
allocated on caller stack, is also passed to the callback
function. This callback_ctx argument can provide additional
input and allow to write to caller stack for output.

If the callback_fn returns 0, the helper will iterate through next
element if available. If the callback_fn returns 1, the helper
will stop iterating and returns to the bpf program. Other return
values are not used for now.

Currently, this helper is only available with jit. It is possible
to make it work with interpreter with so effort but I leave it
as the future work.

[1]: https://lore.kernel.org/bpf/20210122205415.113822-1-xiyou.wangcong@gmail.com/

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210226204925.3884923-1-yhs@fb.com
2021-02-26 13:23:52 -08:00
Song Liu
a10787e6d5 bpf: Enable task local storage for tracing programs
To access per-task data, BPF programs usually creates a hash table with
pid as the key. This is not ideal because:
 1. The user need to estimate the proper size of the hash table, which may
    be inaccurate;
 2. Big hash tables are slow;
 3. To clean up the data properly during task terminations, the user need
    to write extra logic.

Task local storage overcomes these issues and offers a better option for
these per-task data. Task local storage is only available to BPF_LSM. Now
enable it for tracing programs.

Unlike LSM programs, tracing programs can be called in IRQ contexts.
Helpers that access task local storage are updated to use
raw_spin_lock_irqsave() instead of raw_spin_lock_bh().

Tracing programs can attach to functions on the task free path, e.g.
exit_creds(). To avoid allocating task local storage after
bpf_task_storage_free(). bpf_task_storage_get() is updated to not allocate
new storage when the task is not refcounted (task->usage == 0).

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: KP Singh <kpsingh@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20210225234319.336131-2-songliubraving@fb.com
2021-02-26 11:51:47 -08:00
Alexander Potapenko
9c0dee54eb tracing: add error_report_end trace point
Patch series "Add error_report_end tracepoint to KFENCE and KASAN", v3.

This patchset adds a tracepoint, error_repor_end, that is to be used by
KFENCE, KASAN, and potentially other bug detection tools, when they print
an error report.  One of the possible use cases is userspace collection of
kernel error reports: interested parties can subscribe to the tracing
event via tracefs, and get notified when an error report occurs.

This patch (of 3):

Introduce error_report_end tracepoint.  It can be used in debugging tools
like KASAN, KFENCE, etc.  to provide extensions to the error reporting
mechanisms (e.g.  allow tests hook into error reporting, ease error report
collection from production kernels).  Another benefit would be making use
of ftrace for debugging or benchmarking the tools themselves.

Should we need it, the tracepoint name leaves us with the possibility to
introduce a complementary error_report_start tracepoint in the future.

Link: https://lkml.kernel.org/r/20210121131915.1331302-1-glider@google.com
Link: https://lkml.kernel.org/r/20210121131915.1331302-2-glider@google.com
Signed-off-by: Alexander Potapenko <glider@google.com>
Suggested-by: Marco Elver <elver@google.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-26 09:41:02 -08:00
Chaitanya Kulkarni
94d4bffdda blktrace: fix documentation for blk_fill_rw()
Add missing ":" after rwbs function parameter documentation that fixes
following warning :-

./kernel/trace/blktrace.c:1877: warning: Function parameter or member 'rwbs' not described in 'blk_fill_rwbs'

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: 1f83bb4b49 ("blktrace: add blk_fill_rwbs documentation comment")
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-24 08:55:30 -07:00
Linus Torvalds
414eece95b Merge tag 'clang-lto-v5.12-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull more clang LTO updates from Kees Cook:
 "Clang LTO x86 enablement.

  Full disclosure: while this has _not_ been in linux-next (since it
  initially looked like the objtool dependencies weren't going to make
  v5.12), it has been under daily build and runtime testing by Sami for
  quite some time. These x86 portions have been discussed on lkml, with
  Peter, Josh, and others helping nail things down.

  The bulk of the changes are to get objtool working happily. The rest
  of the x86 enablement is very small.

  Summary:

   - Generate __mcount_loc in objtool (Peter Zijlstra)

   - Support running objtool against vmlinux.o (Sami Tolvanen)

   - Clang LTO enablement for x86 (Sami Tolvanen)"

Link: https://lore.kernel.org/lkml/20201013003203.4168817-26-samitolvanen@google.com/
Link: https://lore.kernel.org/lkml/cover.1611263461.git.jpoimboe@redhat.com/

* tag 'clang-lto-v5.12-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  kbuild: lto: force rebuilds when switching CONFIG_LTO
  x86, build: allow LTO to be selected
  x86, cpu: disable LTO for cpu.c
  x86, vdso: disable LTO only for vDSO
  kbuild: lto: postpone objtool
  objtool: Split noinstr validation from --vmlinux
  x86, build: use objtool mcount
  tracing: add support for objtool mcount
  objtool: Don't autodetect vmlinux.o
  objtool: Fix __mcount_loc generation with Clang's assembler
  objtool: Add a pass for generating __mcount_loc
2021-02-23 15:13:45 -08:00
Sami Tolvanen
22c8542d7b tracing: add support for objtool mcount
This change adds build support for using objtool to generate
__mcount_loc sections.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
2021-02-23 12:46:57 -08:00
Linus Torvalds
21a6ab2131 Merge tag 'modules-for-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux
Pull module updates from Jessica Yu:

 - Retire EXPORT_UNUSED_SYMBOL() and EXPORT_SYMBOL_GPL_FUTURE(). These
   export types were introduced between 2006 - 2008. All the of the
   unused symbols have been long removed and gpl future symbols were
   converted to gpl quite a long time ago, and I don't believe these
   export types have been used ever since. So, I think it should be safe
   to retire those export types now (Christoph Hellwig)

 - Refactor and clean up some aged code cruft in the module loader
   (Christoph Hellwig)

 - Build {,module_}kallsyms_on_each_symbol only when livepatching is
   enabled, as it is the only caller (Christoph Hellwig)

 - Unexport find_module() and module_mutex and fix the last module
   callers to not rely on these anymore. Make module_mutex internal to
   the module loader (Christoph Hellwig)

 - Harden ELF checks on module load and validate ELF structures before
   checking the module signature (Frank van der Linden)

 - Fix undefined symbol warning for clang (Fangrui Song)

 - Fix smatch warning (Dan Carpenter)

* tag 'modules-for-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux:
  module: potential uninitialized return in module_kallsyms_on_each_symbol()
  module: remove EXPORT_UNUSED_SYMBOL*
  module: remove EXPORT_SYMBOL_GPL_FUTURE
  module: move struct symsearch to module.c
  module: pass struct find_symbol_args to find_symbol
  module: merge each_symbol_section into find_symbol
  module: remove each_symbol_in_section
  module: mark module_mutex static
  kallsyms: only build {,module_}kallsyms_on_each_symbol when required
  kallsyms: refactor {,module_}kallsyms_on_each_symbol
  module: use RCU to synchronize find_module
  module: unexport find_module and module_mutex
  drm: remove drm_fb_helper_modinit
  powerpc/powernv: remove get_cxl_module
  module: harden ELF info handling
  module: Ignore _GLOBAL_OFFSET_TABLE_ when warning for undefined symbols
2021-02-23 10:15:33 -08:00
Linus Torvalds
79db4d2293 Merge tag 'clang-lto-v5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull clang LTO updates from Kees Cook:
 "Clang Link Time Optimization.

  This is built on the work done preparing for LTO by arm64 folks,
  tracing folks, etc. This includes the core changes as well as the
  remaining pieces for arm64 (LTO has been the default build method on
  Android for about 3 years now, as it is the prerequisite for the
  Control Flow Integrity protections).

  While x86 LTO enablement is done, it depends on some pending objtool
  clean-ups. It's possible that I'll send a "part 2" pull request for
  LTO that includes x86 support.

  For merge log posterity, and as detailed in commit dc5723b02e
  ("kbuild: add support for Clang LTO"), here is the lt;dr to do an LTO
  build:

        make LLVM=1 LLVM_IAS=1 defconfig
        scripts/config -e LTO_CLANG_THIN
        make LLVM=1 LLVM_IAS=1

  (To do a cross-compile of arm64, add "CROSS_COMPILE=aarch64-linux-gnu-"
  and "ARCH=arm64" to the "make" command lines.)

  Summary:

   - Clang LTO build infrastructure and arm64-specific enablement (Sami
     Tolvanen)

   - Recursive build CC_FLAGS_LTO fix (Alexander Lobakin)"

* tag 'clang-lto-v5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  kbuild: prevent CC_FLAGS_LTO self-bloating on recursive rebuilds
  arm64: allow LTO to be selected
  arm64: disable recordmcount with DYNAMIC_FTRACE_WITH_REGS
  arm64: vdso: disable LTO
  drivers/misc/lkdtm: disable LTO for rodata.o
  efi/libstub: disable LTO
  scripts/mod: disable LTO for empty.c
  modpost: lto: strip .lto from module names
  PCI: Fix PREL32 relocations for LTO
  init: lto: fix PREL32 relocations
  init: lto: ensure initcall ordering
  kbuild: lto: add a default list of used symbols
  kbuild: lto: merge module sections
  kbuild: lto: limit inlining
  kbuild: lto: fix module versioning
  kbuild: add support for Clang LTO
  tracing: move function tracer options to Kconfig
2021-02-23 09:28:51 -08:00
Greg Kroah-Hartman
c0ea57608b blktrace: remove debugfs file dentries from struct blk_trace
These debugfs dentries do not need to be saved for anything as the whole
directory and everything in it is properly cleaned up when the parent
directory is removed.  So remove them from struct blk_trace and don't
save them when created as it's not necessary.

Cc: Jens Axboe <axboe@kernel.dk>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-block@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-23 09:54:51 -07:00
Linus Torvalds
c958423470 Merge tag 'trace-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing updates from Steven Rostedt:

 - Update to the way irqs and preemption is tracked via the trace event
   PC field

 - Fix handling of unregistering event failing due to allocate memory.
   This is only triggered by failure injection, as it is pretty much
   guaranteed to have less than a page allocation succeed.

 - Do not show the useless "filter" or "enable" files for the "ftrace"
   trace system, as they have no effect on doing anything.

 - Add a warning if kprobes are registered more than once.

 - Synthetic events now have their fields parsed by semicolons. Old
   formats without semicolons will still work, but new features will
   require them.

 - New option to allow trace events to show %p without hashing in trace
   file. The trace file can only be read by root, and reading the raw
   event buffer did not have any pointers hashed, so this does not
   expose anything new.

 - New directory in tools called tools/tracing, where a new tool that
   reads sequential latency reports from the ftrace latency tracers.

 - Other minor fixes and cleanups.

* tag 'trace-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (33 commits)
  kprobes: Fix to delay the kprobes jump optimization
  tracing/tools: Add the latency-collector to tools directory
  tracing: Make hash-ptr option default
  tracing: Add ptr-hash option to show the hashed pointer value
  tracing: Update the stage 3 of trace event macro comment
  tracing: Show real address for trace event arguments
  selftests/ftrace: Add '!event' synthetic event syntax check
  selftests/ftrace: Update synthetic event syntax errors
  tracing: Add a backward-compatibility check for synthetic event creation
  tracing: Update synth command errors
  tracing: Rework synthetic event command parsing
  tracing/dynevent: Delegate parsing to create function
  kprobes: Warn if the kprobe is reregistered
  ftrace: Remove unused ftrace_force_update()
  tracepoints: Code clean up
  tracepoints: Do not punish non static call users
  tracepoints: Remove unnecessary "data_args" macro parameter
  tracing: Do not create "enable" or "filter" files for ftrace event subsystem
  kernel: trace: preemptirq_delay_test: add cpu affinity
  tracepoint: Do not fail unregistering a probe due to memory failure
  ...
2021-02-22 14:07:15 -08:00
Chaitanya Kulkarni
1f83bb4b49 blktrace: add blk_fill_rwbs documentation comment
blk_fill_rwbs() is an expoted function, add kernel style documentation
comment.

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-22 06:37:41 -07:00
Chaitanya Kulkarni
179d160072 block: remove superfluous param in blk_fill_rwbs()
The last parameter for the function blk_fill_rwbs() was added in
5782138e47 ("tracing/events: convert block trace points to
TRACE_EVENT()") in order to signal read request and use of that parameter
was replaced with using switch case REQ_OP_READ with
1b9a9ab78b ("blktrace: use op accessors"), but the parameter was never
removed.

Remove the unused parameter and adjust the respective call sites.

Fixes: 1b9a9ab78b ("blktrace: use op accessors")
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-22 06:37:41 -07:00
Linus Torvalds
582cd91f69 Merge tag 'for-5.12/block-2021-02-17' of git://git.kernel.dk/linux-block
Pull core block updates from Jens Axboe:
 "Another nice round of removing more code than what is added, mostly
  due to Christoph's relentless pursuit of tech debt removal/cleanups.
  This pull request contains:

   - Two series of BFQ improvements (Paolo, Jan, Jia)

   - Block iov_iter improvements (Pavel)

   - bsg error path fix (Pan)

   - blk-mq scheduler improvements (Jan)

   - -EBUSY discard fix (Jan)

   - bvec allocation improvements (Ming, Christoph)

   - bio allocation and init improvements (Christoph)

   - Store bdev pointer in bio instead of gendisk + partno (Christoph)

   - Block trace point cleanups (Christoph)

   - hard read-only vs read-only split (Christoph)

   - Block based swap cleanups (Christoph)

   - Zoned write granularity support (Damien)

   - Various fixes/tweaks (Chunguang, Guoqing, Lei, Lukas, Huhai)"

* tag 'for-5.12/block-2021-02-17' of git://git.kernel.dk/linux-block: (104 commits)
  mm: simplify swapdev_block
  sd_zbc: clear zone resources for non-zoned case
  block: introduce blk_queue_clear_zone_settings()
  zonefs: use zone write granularity as block size
  block: introduce zone_write_granularity limit
  block: use blk_queue_set_zoned in add_partition()
  nullb: use blk_queue_set_zoned() to setup zoned devices
  nvme: cleanup zone information initialization
  block: document zone_append_max_bytes attribute
  block: use bi_max_vecs to find the bvec pool
  md/raid10: remove dead code in reshape_request
  block: mark the bio as cloned in bio_iov_bvec_set
  block: set BIO_NO_PAGE_REF in bio_iov_bvec_set
  block: remove a layer of indentation in bio_iov_iter_get_pages
  block: turn the nr_iovecs argument to bio_alloc* into an unsigned short
  block: remove the 1 and 4 vec bvec_slabs entries
  block: streamline bvec_alloc
  block: factor out a bvec_alloc_gfp helper
  block: move struct biovec_slab to bio.c
  block: reuse BIO_INLINE_VECS for integrity bvecs
  ...
2021-02-21 11:02:48 -08:00
Linus Torvalds
51e6d17809 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from David Miller:
 "Here is what we have this merge window:

   1) Support SW steering for mlx5 Connect-X6Dx, from Yevgeny Kliteynik.

   2) Add RSS multi group support to octeontx2-pf driver, from Geetha
      Sowjanya.

   3) Add support for KS8851 PHY. From Marek Vasut.

   4) Add support for GarfieldPeak bluetooth controller from Kiran K.

   5) Add support for half-duplex tcan4x5x can controllers.

   6) Add batch skb rx processing to bcrm63xx_enet, from Sieng Piaw
      Liew.

   7) Rework RX port offload infrastructure, particularly wrt, UDP
      tunneling, from Jakub Kicinski.

   8) Add BCM72116 PHY support, from Florian Fainelli.

   9) Remove Dsa specific notifiers, they are unnecessary. From Vladimir
      Oltean.

  10) Add support for picosecond rx delay in dwmac-meson8b chips. From
      Martin Blumenstingl.

  11) Support TSO on xfrm interfaces from Eyal Birger.

  12) Add support for MP_PRIO to mptcp stack, from Geliang Tang.

  13) Support BCM4908 integrated switch, from Rafał Miłecki.

  14) Support for directly accessing kernel module variables via module
      BTF info, from Andrii Naryiko.

  15) Add DASH (esktop and mobile Architecture for System Hardware)
      support to r8169 driver, from Heiner Kallweit.

  16) Add rx vlan filtering to dpaa2-eth, from Ionut-robert Aron.

  17) Add support for 100 base0x SFP devices, from Bjarni Jonasson.

  18) Support link aggregation in DSA, from Tobias Waldekranz.

  19) Support for bitwidse atomics in bpf, from Brendan Jackman.

  20) SmartEEE support in at803x driver, from Russell King.

  21) Add support for flow based tunneling to GTP, from Pravin B Shelar.

  22) Allow arbitrary number of interconnrcts in ipa, from Alex Elder.

  23) TLS RX offload for bonding, from Tariq Toukan.

  24) RX decap offklload support in mac80211, from Felix Fietkou.

  25) devlink health saupport in octeontx2-af, from George Cherian.

  26) Add TTL attr to SCM_TIMESTAMP_OPT_STATS, from Yousuk Seung

  27) Delegated actionss support in mptcp, from Paolo Abeni.

  28) Support receive timestamping when doin zerocopy tcp receive. From
      Arjun Ray.

  29) HTB offload support for mlx5, from Maxim Mikityanskiy.

  30) UDP GRO forwarding, from Maxim Mikityanskiy.

  31) TAPRIO offloading in dsa hellcreek driver, from Kurt Kanzenbach.

  32) Weighted random twos choice algorithm for ipvs, from Darby Payne.

  33) Fix netdev registration deadlock, from Johannes Berg.

  34) Various conversions to new tasklet api, from EmilRenner Berthing.

  35) Bulk skb allocations in veth, from Lorenzo Bianconi.

  36) New ethtool interface for lane setting, from Danielle Ratson.

  37) Offload failiure notifications for routes, from Amit Cohen.

  38) BCM4908 support, from Rafał Miłecki.

  39) Support several new iwlwifi chips, from Ihab Zhaika.

  40) Flow drector support for ipv6 in i40e, from Przemyslaw Patynowski.

  41) Support for mhi prrotocols, from Loic Poulain.

  42) Optimize bpf program stats.

  43) Implement RFC6056, for better port randomization, from Eric
      Dumazet.

  44) hsr tag offloading support from George McCollister.

  45) Netpoll support in qede, from Bhaskar Upadhaya.

  46) 2005/400g speed support in bonding 3ad mode, from Nikolay
      Aleksandrov.

  47) Netlink event support in mptcp, from Florian Westphal.

  48) Better skbuff caching, from Alexander Lobakin.

  49) MRP (Media Redundancy Protocol) offloading in DSA and a few
      drivers, from Horatiu Vultur.

  50) mqprio saupport in mvneta, from Maxime Chevallier.

  51) Remove of_phy_attach, no longer needed, from Florian Fainelli"

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1766 commits)
  octeontx2-pf: Fix otx2_get_fecparam()
  cteontx2-pf: cn10k: Prevent harmless double shift bugs
  net: stmmac: Add PCI bus info to ethtool driver query output
  ptp: ptp_clockmatrix: clean-up - parenthesis around a == b are unnecessary
  ptp: ptp_clockmatrix: Simplify code - remove unnecessary `err` variable.
  ptp: ptp_clockmatrix: Coding style - tighten vertical spacing.
  ptp: ptp_clockmatrix: Clean-up dev_*() messages.
  ptp: ptp_clockmatrix: Remove unused header declarations.
  ptp: ptp_clockmatrix: Add alignment of 1 PPS to idtcm_perout_enable.
  ptp: ptp_clockmatrix: Add wait_for_sys_apll_dpll_lock.
  net: stmmac: dwmac-sun8i: Add a shutdown callback
  net: stmmac: dwmac-sun8i: Minor probe function cleanup
  net: stmmac: dwmac-sun8i: Use reset_control_reset
  net: stmmac: dwmac-sun8i: Remove unnecessary PHY power check
  net: stmmac: dwmac-sun8i: Return void from PHY unpower
  r8169: use macro pm_ptr
  net: mdio: Remove of_phy_attach()
  net: mscc: ocelot: select PACKING in the Kconfig
  net: re-solve some conflicts after net -> net-next merge
  net: dsa: tag_rtl4_a: Support also egress tags
  ...
2021-02-20 17:45:32 -08:00
David S. Miller
b8af417e4d Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2021-02-16

The following pull-request contains BPF updates for your *net-next* tree.

There's a small merge conflict between 7eeba1706e ("tcp: Add receive timestamp
support for receive zerocopy.") from net-next tree and 9cacf81f81 ("bpf: Remove
extra lock_sock for TCP_ZEROCOPY_RECEIVE") from bpf-next tree. Resolve as follows:

  [...]
                lock_sock(sk);
                err = tcp_zerocopy_receive(sk, &zc, &tss);
                err = BPF_CGROUP_RUN_PROG_GETSOCKOPT_KERN(sk, level, optname,
                                                          &zc, &len, err);
                release_sock(sk);
  [...]

We've added 116 non-merge commits during the last 27 day(s) which contain
a total of 156 files changed, 5662 insertions(+), 1489 deletions(-).

The main changes are:

1) Adds support of pointers to types with known size among global function
   args to overcome the limit on max # of allowed args, from Dmitrii Banshchikov.

2) Add bpf_iter for task_vma which can be used to generate information similar
   to /proc/pid/maps, from Song Liu.

3) Enable bpf_{g,s}etsockopt() from all sock_addr related program hooks. Allow
   rewriting bind user ports from BPF side below the ip_unprivileged_port_start
   range, both from Stanislav Fomichev.

4) Prevent recursion on fentry/fexit & sleepable programs and allow map-in-map
   as well as per-cpu maps for the latter, from Alexei Starovoitov.

5) Add selftest script to run BPF CI locally. Also enable BPF ringbuffer
   for sleepable programs, both from KP Singh.

6) Extend verifier to enable variable offset read/write access to the BPF
   program stack, from Andrei Matei.

7) Improve tc & XDP MTU handling and add a new bpf_check_mtu() helper to
   query device MTU from programs, from Jesper Dangaard Brouer.

8) Allow bpf_get_socket_cookie() helper also be called from [sleepable] BPF
   tracing programs, from Florent Revest.

9) Extend x86 JIT to pad JMPs with NOPs for helping image to converge when
   otherwise too many passes are required, from Gary Lin.

10) Verifier fixes on atomics with BPF_FETCH as well as function-by-function
    verification both related to zero-extension handling, from Ilya Leoshkevich.

11) Better kernel build integration of resolve_btfids tool, from Jiri Olsa.

12) Batch of AF_XDP selftest cleanups and small performance improvement
    for libbpf's xsk map redirect for newer kernels, from Björn Töpel.

13) Follow-up BPF doc and verifier improvements around atomics with
    BPF_FETCH, from Brendan Jackman.

14) Permit zero-sized data sections e.g. if ELF .rodata section contains
    read-only data from local variables, from Yonghong Song.

15) veth driver skb bulk-allocation for ndo_xdp_xmit, from Lorenzo Bianconi.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-02-16 13:14:06 -08:00
Song Liu
3d06f34aa8 bpf: Allow bpf_d_path in bpf_iter program
task_file and task_vma iter programs have access to file->f_path. Enable
bpf_d_path to print paths of these file.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210212183107.50963-3-songliubraving@fb.com
2021-02-12 12:56:53 -08:00
Linus Torvalds
e77a6817d4 Merge tag 'trace-v5.11-rc7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fix from Steven Rostedt:
 "Fix buffer overflow in trace event filter.

  It was reported that if an trace event was larger than a page and was
  filtered, that it caused memory corruption. The reason is that
  filtered events first go into a buffer to test the filter before being
  written into the ring buffer. Unfortunately, this write did not check
  the size"

* tag 'trace-v5.11-rc7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Check length before giving out the filter buffer
2021-02-12 11:16:17 -08:00
Steven Rostedt (VMware)
99e22ce73c tracing: Make hash-ptr option default
Since the original behavior of the trace events is to hash the %p pointers,
make that the default, and have developers have to enable the option in
order to have them unhashed.

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-02-12 11:52:59 -05:00
Florent Revest
c5dbb89fc2 bpf: Expose bpf_get_socket_cookie to tracing programs
This needs a new helper that:
- can work in a sleepable context (using sock_gen_cookie)
- takes a struct sock pointer and checks that it's not NULL

Signed-off-by: Florent Revest <revest@chromium.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: KP Singh <kpsingh@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210210111406.785541-2-revest@chromium.org
2021-02-11 17:44:41 -08:00
Masami Hiramatsu
a345a6718b tracing: Add ptr-hash option to show the hashed pointer value
Add tracefs/options/hash-ptr option to show hashed pointer
value by %p in event printk format string.

For the security reason, normal printk will show the hashed
pointer value (encrypted by random number) with %p to printk
buffer to hide the real address. But the tracefs/trace always
shows real address for debug. To bridge those outputs, add an
option to switch the output format. Ftrace users can use it
to find the hashed value corresponding to the real address
in trace log.

Link: https://lkml.kernel.org/r/160277372504.29307.14909828808982012211.stgit@devnote2

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-02-11 16:31:57 -05:00
Masami Hiramatsu
efbbdaa22b tracing: Show real address for trace event arguments
To help debugging kernel, show real address for trace event arguments
in tracefs/trace{,pipe} instead of hashed pointer value.

Since ftrace human-readable format uses vsprintf(), all %p are
translated to hash values instead of pointer address.

However, when debugging the kernel, raw address value gives a
hint when comparing with the memory mapping in the kernel.
(Those are sometimes used with crash log, which is not hashed too)
So converting %p with %px when calling trace_seq_printf().

Moreover, this is not improving the security because the tracefs
can be used only by root user and the raw address values are readable
from tracefs/percpu/cpu*/trace_pipe_raw file.

Link: https://lkml.kernel.org/r/160277370703.29307.5134475491761971203.stgit@devnote2

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-02-11 16:31:57 -05:00