2018-08-16 15:26:55 +00:00
|
|
|
// SPDX-License-Identifier: GPL-2.0
|
tracing: Add and use generic set_trigger_filter() implementation
Add a generic event_command.set_trigger_filter() op implementation and
have the current set of trigger commands use it - this essentially
gives them all support for filters.
Syntactically, filters are supported by adding 'if <filter>' just
after the command, in which case only events matching the filter will
invoke the trigger. For example, to add a filter to an
enable/disable_event command:
echo 'enable_event:system:event if common_pid == 999' > \
.../othersys/otherevent/trigger
The above command will only enable the system:event event if the
common_pid field in the othersys:otherevent event is 999.
As another example, to add a filter to a stacktrace command:
echo 'stacktrace if common_pid == 999' > \
.../somesys/someevent/trigger
The above command will only trigger a stacktrace if the common_pid
field in the event is 999.
The filter syntax is the same as that described in the 'Event
filtering' section of Documentation/trace/events.txt.
Because triggers can now use filters, the trigger-invoking logic needs
to be moved in those cases - e.g. for ftrace_raw_event_calls, if a
trigger has a filter associated with it, the trigger invocation now
needs to happen after the { assign; } part of the call, in order for
the trigger condition to be tested.
There's still a SOFT_DISABLED-only check at the top of e.g. the
ftrace_raw_events function, so when an event is soft disabled but not
because of the presence of a trigger, the original SOFT_DISABLED
behavior remains unchanged.
There's also a bit of trickiness in that some triggers need to avoid
being invoked while an event is currently in the process of being
logged, since the trigger may itself log data into the trace buffer.
Thus we make sure the current event is committed before invoking those
triggers. To do that, we split the trigger invocation in two - the
first part (event_triggers_call()) checks the filter using the current
trace record; if a command has the post_trigger flag set, it sets a
bit for itself in the return value, otherwise it directly invokes the
trigger. Once all commands have been either invoked or set their
return flag, event_triggers_call() returns. The current record is
then either committed or discarded; if any commands have deferred
their triggers, those commands are finally invoked following the close
of the current event by event_triggers_post_call().
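The two-phase split described above can be sketched in plain user-space C.
This is only an illustration under stated assumptions: every demo_* name,
the struct layout and the "commit marker" are hypothetical and are not the
kernel's event_triggers_call()/event_triggers_post_call() code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_LOG_LEN 4

enum { DEMO_COMMIT, DEMO_ENABLE_EVENT, DEMO_STACKTRACE };

/* Hypothetical stand-in for a trigger command; not the kernel's struct. */
struct demo_trigger {
	bool (*filter)(const void *rec);	/* NULL means "always match" */
	void (*action)(void);			/* the trigger body */
	bool post_trigger;			/* defer until the record is closed */
	unsigned int bit;			/* bit identifying this command */
};

/* Records the order in which things happen, for illustration only. */
static int demo_order[DEMO_LOG_LEN];
static int demo_count;

static void demo_record(int id)
{
	assert(demo_count < DEMO_LOG_LEN);
	demo_order[demo_count++] = id;
}

static void demo_enable_event(void) { demo_record(DEMO_ENABLE_EVENT); }
static void demo_stacktrace(void) { demo_record(DEMO_STACKTRACE); }

/* Filter equivalent to "if common_pid == 999" on a record that is one int. */
static bool demo_pid_is_999(const void *rec)
{
	return *(const int *)rec == 999;
}

/* Phase 1: test filters against the live record; run or defer each match. */
static unsigned long demo_triggers_call(const struct demo_trigger *t,
					size_t n, const void *rec)
{
	unsigned long deferred = 0;

	for (size_t i = 0; i < n; i++) {
		if (t[i].filter && !t[i].filter(rec))
			continue;
		if (t[i].post_trigger)
			deferred |= 1UL << t[i].bit;
		else
			t[i].action();
	}
	return deferred;
}

/* Phase 2: once the record is committed or discarded, run deferred ones. */
static void demo_triggers_post_call(const struct demo_trigger *t, size_t n,
				    unsigned long deferred)
{
	for (size_t i = 0; i < n; i++)
		if (deferred & (1UL << t[i].bit))
			t[i].action();
}

/* One event: phase 1, commit, phase 2. */
static void demo_log_event(int common_pid)
{
	static const struct demo_trigger triggers[] = {
		{ demo_pid_is_999, demo_stacktrace, true, 0 },
		{ demo_pid_is_999, demo_enable_event, false, 1 },
	};
	unsigned long deferred;

	deferred = demo_triggers_call(triggers, 2, &common_pid);
	demo_record(DEMO_COMMIT);	/* stands in for closing the record */
	demo_triggers_post_call(triggers, 2, deferred);
}
```

In this sketch the stacktrace-like trigger is the one marked post_trigger,
so it only runs after the commit marker, while the other trigger runs
immediately when its filter matches.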
To simplify the above and make it more efficient, the TRIGGER_COND bit
is introduced, which is set only if a soft-disabled trigger needs to
use the log record for filter testing or needs to wait until the
current log record is closed.
The syscall event invocation code is also changed in analogous ways.
Because event triggers need to be able to create and free filters,
this also adds a couple of external wrappers for the existing
create_filter and free_filter functions, which are too generic to be
made extern functions themselves.
Link: http://lkml.kernel.org/r/7164930759d8719ef460357f143d995406e4eead.1382622043.git.tom.zanussi@linux.intel.com
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-10-24 13:59:29 +00:00
|
|
|
|
2008-05-12 19:20:42 +00:00
|
|
|
#ifndef _LINUX_KERNEL_TRACE_H
|
|
|
|
#define _LINUX_KERNEL_TRACE_H
|
|
|
|
|
|
|
|
#include <linux/fs.h>
|
2011-07-26 23:09:06 +00:00
|
|
|
#include <linux/atomic.h>
|
2008-05-12 19:20:42 +00:00
|
|
|
#include <linux/sched.h>
|
|
|
|
#include <linux/clocksource.h>
|
2008-09-30 03:02:41 +00:00
|
|
|
#include <linux/ring_buffer.h>
|
ftrace: mmiotrace, updates
here is a patch that makes mmiotrace work almost well within the tracing
framework. The patch applies on top of my previous patch. I have my own
output formatting in place now.
Summary of changes:
- fix the NULL dereference that was due to not calling tracing_reset()
- add print_line() callback into struct tracer
- implement print_line() for mmiotrace, producing up-to-spec text
- add my output header, but that is not really called in the right place
- rewrote the main structs in mmiotrace
- added two new trace entry types: TRACE_MMIO_RW and TRACE_MMIO_MAP
- made some functions in trace.c non-static
- check current==NULL in tracing_generic_entry_update()
- fix(?) comparison in trace_seq_printf()
Things seem to work fine except a few issues. Markers (text lines injected
into the mmiotrace log) are missing; I did not feel like hacking them in before we
have variable length entries. My output header is printed only for 'trace'
file, but not 'trace_pipe'. For some reason, despite my quick fix,
iter->trace is NULL in print_trace_line() when called from 'trace_pipe'
file, which means I don't get proper output formatting.
I only tried by loading nouveau.ko, which just detects the card, and that
is traced fine. I didn't try further. Map, two reads and unmap. Works
perfectly.
I am missing the information about overflows, I'd prefer to have a
counter for lost events. I didn't try, but I guess currently there is no
way of knowing when it overflows?
So, not too far from being fully operational, it seems :-)
And looking at the diffstat, there also is some 700-900 lines of user space
code that just became obsolete.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2008-05-12 19:20:57 +00:00
|
|
|
#include <linux/mmiotrace.h>
|
2009-09-12 23:26:21 +00:00
|
|
|
#include <linux/tracepoint.h>
|
2008-09-23 10:32:08 +00:00
|
|
|
#include <linux/ftrace.h>
|
2019-08-14 17:55:23 +00:00
|
|
|
#include <linux/trace.h>
|
2009-09-09 17:22:48 +00:00
|
|
|
#include <linux/hw_breakpoint.h>
|
tracing: make trace_seq operations available for core kernel
In the process of making the TRACE_EVENT macro work for modules, the trace_seq
operations must be available for core kernel code.
These operations are quite useful and can be used for other implementations.
The main idea is that we create a trace_seq handle that acts very much
like the seq_file handle.
struct trace_seq *s = kmalloc(sizeof(*s), GFP_KERNEL);
trace_seq_init(s);
trace_seq_printf(s, "some data %d\n", variable);
printk("%s", s->buffer);
The main use is to allow a top level function to call several other functions
that may store printf like data into the buffer. Then at the end, the top
level function can process all the data with any method it would like to.
It could be passed to userspace, output via printk or even use seq_file:
trace_seq_to_user(s, ubuf, cnt);
seq_puts(m, s->buffer);
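The accumulate-then-flush idea can be tried outside the kernel with a small
stand-in. This is a sketch only: demo_seq, DEMO_SEQ_SIZE and the demo_seq_*
helpers are hypothetical names built on vsnprintf(), and they do not claim
to match the behavior of the real trace_seq functions.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

#define DEMO_SEQ_SIZE 256

/* Hypothetical user-space stand-in for a trace_seq-like handle. */
struct demo_seq {
	char buffer[DEMO_SEQ_SIZE];
	size_t len;			/* bytes used, excluding the NUL */
};

static void demo_seq_init(struct demo_seq *s)
{
	s->len = 0;
	s->buffer[0] = '\0';
}

/*
 * Append printf-like data. Returns 1 on success, or 0 if the text did not
 * fit, in which case the buffer is left as it was before the call.
 */
static int demo_seq_printf(struct demo_seq *s, const char *fmt, ...)
{
	va_list ap;
	size_t room;
	int n;

	assert(s->len < DEMO_SEQ_SIZE);
	room = DEMO_SEQ_SIZE - s->len;

	va_start(ap, fmt);
	n = vsnprintf(s->buffer + s->len, room, fmt, ap);
	va_end(ap);

	if (n < 0 || (size_t)n >= room) {
		s->buffer[s->len] = '\0';	/* drop the partial write */
		return 0;
	}
	s->len += (size_t)n;
	return 1;
}
```

Several helpers can each call demo_seq_printf() on the same handle, and the
top level function then decides what to do with s->buffer in one place.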
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2009-04-11 16:59:57 +00:00
|
|
|
#include <linux/trace_seq.h>
|
2015-04-29 18:36:05 +00:00
|
|
|
#include <linux/trace_events.h>
|
2014-04-07 22:39:20 +00:00
|
|
|
#include <linux/compiler.h>
|
2016-10-05 11:58:15 +00:00
|
|
|
#include <linux/glob.h>
|
2019-10-08 22:08:21 +00:00
|
|
|
#include <linux/irq_work.h>
|
|
|
|
#include <linux/workqueue.h>
|
2020-10-13 14:17:53 +00:00
|
|
|
#include <linux/ctype.h>
|
2021-06-28 13:50:06 +00:00
|
|
|
#include <linux/once_lite.h>
|
2009-04-11 16:59:57 +00:00
|
|
|
|
2021-09-24 01:03:49 +00:00
|
|
|
#include "pid_list.h"
|
|
|
|
|
2012-08-08 18:48:20 +00:00
|
|
|
#ifdef CONFIG_FTRACE_SYSCALLS
|
2022-03-29 02:57:38 +00:00
|
|
|
#include <asm/unistd.h> /* For NR_syscalls */
|
2012-08-08 18:48:20 +00:00
|
|
|
#include <asm/syscall.h> /* some archs define it here */
|
|
|
|
#endif
|
|
|
|
|
2021-08-18 15:24:51 +00:00
|
|
|
#define TRACE_MODE_WRITE 0640
|
|
|
|
#define TRACE_MODE_READ 0440
|
|
|
|
|
2008-05-23 19:37:28 +00:00
|
|
|
enum trace_type {
|
|
|
|
__TRACE_FIRST_TYPE = 0,
|
|
|
|
|
|
|
|
TRACE_FN,
|
|
|
|
TRACE_CTX,
|
|
|
|
TRACE_WAKE,
|
|
|
|
TRACE_STACK,
|
2008-08-01 16:26:41 +00:00
|
|
|
TRACE_PRINT,
|
2009-03-12 17:24:49 +00:00
|
|
|
TRACE_BPRINT,
|
2008-05-12 19:20:57 +00:00
|
|
|
TRACE_MMIO_RW,
|
|
|
|
TRACE_MMIO_MAP,
|
2008-11-12 20:24:24 +00:00
|
|
|
TRACE_BRANCH,
|
2008-11-25 23:57:25 +00:00
|
|
|
TRACE_GRAPH_RET,
|
|
|
|
TRACE_GRAPH_ENT,
|
2008-11-22 11:28:47 +00:00
|
|
|
TRACE_USER_STACK,
|
blktrace: add ftrace plugin
Impact: New way of using the blktrace infrastructure
This drops the requirement of userspace utilities to use the blktrace
facility.
Configuration is done through sysfs, adding a "trace" directory to the
partition directory where blktrace can be enabled for the associated
request_queue.
The same filters present in the IOCTL interface are present as sysfs
device attributes.
The /sys/block/sdX/sdXN/trace/enable file allows tracing without any
filters.
The other files in this directory: pid, act_mask, start_lba and end_lba
can be used with the same meaning as with the IOCTL interface.
Using the sysfs interface will only setup the request_queue->blk_trace
fields, tracing will only take place when the "blk" tracer is selected
via the ftrace interface, as in the following example:
To see the trace, one can use the /d/tracing/trace file or the
/d/tracing/trace_pipe file, with semantics defined in the ftrace
documentation in Documentation/ftrace.txt.
[root@f10-1 ~]# cat /t/trace
kjournald-305 [000] 3046.491224: 8,1 A WBS 6367 + 8 <- (8,1) 6304
kjournald-305 [000] 3046.491227: 8,1 Q R 6367 + 8 [kjournald]
kjournald-305 [000] 3046.491236: 8,1 G RB 6367 + 8 [kjournald]
kjournald-305 [000] 3046.491239: 8,1 P NS [kjournald]
kjournald-305 [000] 3046.491242: 8,1 I RBS 6367 + 8 [kjournald]
kjournald-305 [000] 3046.491251: 8,1 D WB 6367 + 8 [kjournald]
kjournald-305 [000] 3046.491610: 8,1 U WS [kjournald] 1
<idle>-0 [000] 3046.511914: 8,1 C RS 6367 + 8 [6367]
[root@f10-1 ~]#
The default line context (prefix) format is the one described in the ftrace
documentation, with the blktrace specific bits using its existing format,
described in blkparse(8).
If one wants to have the classic blktrace formatting, this is possible by
using:
[root@f10-1 ~]# echo blk_classic > /t/trace_options
[root@f10-1 ~]# cat /t/trace
8,1 0 3046.491224 305 A WBS 6367 + 8 <- (8,1) 6304
8,1 0 3046.491227 305 Q R 6367 + 8 [kjournald]
8,1 0 3046.491236 305 G RB 6367 + 8 [kjournald]
8,1 0 3046.491239 305 P NS [kjournald]
8,1 0 3046.491242 305 I RBS 6367 + 8 [kjournald]
8,1 0 3046.491251 305 D WB 6367 + 8 [kjournald]
8,1 0 3046.491610 305 U WS [kjournald] 1
8,1 0 3046.511914 0 C RS 6367 + 8 [6367]
[root@f10-1 ~]#
Using the ftrace standard format allows more flexibility, such
as the ability of asking for backtraces via trace_options:
[root@f10-1 ~]# echo noblk_classic > /t/trace_options
[root@f10-1 ~]# echo stacktrace > /t/trace_options
[root@f10-1 ~]# cat /t/trace
kjournald-305 [000] 3318.826779: 8,1 A WBS 6375 + 8 <- (8,1) 6312
kjournald-305 [000] 3318.826782:
<= submit_bio
<= submit_bh
<= sync_dirty_buffer
<= journal_commit_transaction
<= kjournald
<= kthread
<= child_rip
kjournald-305 [000] 3318.826836: 8,1 Q R 6375 + 8 [kjournald]
kjournald-305 [000] 3318.826837:
<= generic_make_request
<= submit_bio
<= submit_bh
<= sync_dirty_buffer
<= journal_commit_transaction
<= kjournald
<= kthread
Please read the ftrace documentation to use additional, standardized
tracing filters such as /d/tracing/trace_cpumask, etc.
See also /d/tracing/trace_mark to add comments in the trace stream,
that is equivalent to the /d/block/sdaN/msg interface.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-23 14:06:27 +00:00
|
|
|
TRACE_BLK,
|
2013-03-09 02:02:34 +00:00
|
|
|
TRACE_BPUTS,
|
2016-06-23 16:45:36 +00:00
|
|
|
TRACE_HWLAT,
|
trace: Add osnoise tracer
In the context of high-performance computing (HPC), the Operating System
Noise (*osnoise*) refers to the interference experienced by an application
due to activities inside the operating system. In the context of Linux,
NMIs, IRQs, SoftIRQs, and any other system thread can cause noise to the
system. Moreover, hardware-related jobs can also cause noise, for example,
via SMIs.
The osnoise tracer leverages the hwlat_detector by running a similar
loop with preemption, SoftIRQs and IRQs enabled, thus allowing all
the sources of *osnoise* during its execution. Using the same approach
of hwlat, osnoise takes note of the entry and exit point of any
source of interferences, increasing a per-cpu interference counter. The
osnoise tracer also saves an interference counter for each source of
interference. The interference counter for NMI, IRQs, SoftIRQs, and
threads is increased anytime the tool observes these interferences' entry
events. When a noise happens without any interference from the operating
system level, the hardware noise counter increases, pointing to a
hardware-related noise. In this way, osnoise can account for any
source of interference. At the end of the period, the osnoise tracer
prints the sum of all noise, the max single noise, the percentage of CPU
available for the thread, and the counters for the noise sources.
Usage
Write the ASCII text "osnoise" into the current_tracer file of the
tracing system (generally mounted at /sys/kernel/tracing).
For example::
[root@f32 ~]# cd /sys/kernel/tracing/
[root@f32 tracing]# echo osnoise > current_tracer
It is possible to follow the trace by reading the trace file::
[root@f32 tracing]# cat trace
# tracer: osnoise
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth MAX
# || / SINGLE Interference counters:
# |||| RUNTIME NOISE % OF CPU NOISE +-----------------------------+
# TASK-PID CPU# |||| TIMESTAMP IN US IN US AVAILABLE IN US HW NMI IRQ SIRQ THREAD
# | | | |||| | | | | | | | | | |
<...>-859 [000] .... 81.637220: 1000000 190 99.98100 9 18 0 1007 18 1
<...>-860 [001] .... 81.638154: 1000000 656 99.93440 74 23 0 1006 16 3
<...>-861 [002] .... 81.638193: 1000000 5675 99.43250 202 6 0 1013 25 21
<...>-862 [003] .... 81.638242: 1000000 125 99.98750 45 1 0 1011 23 0
<...>-863 [004] .... 81.638260: 1000000 1721 99.82790 168 7 0 1002 49 41
<...>-864 [005] .... 81.638286: 1000000 263 99.97370 57 6 0 1006 26 2
<...>-865 [006] .... 81.638302: 1000000 109 99.98910 21 3 0 1006 18 1
<...>-866 [007] .... 81.638326: 1000000 7816 99.21840 107 8 0 1016 39 19
In addition to the regular trace fields (from TASK-PID to TIMESTAMP), the
tracer prints a message at the end of each period for each CPU that is
running an osnoise/CPU thread. The osnoise specific fields report:
- The RUNTIME IN US reports the amount of time in microseconds that
the osnoise thread kept looping reading the time.
- The NOISE IN US reports the sum of noise in microseconds observed
by the osnoise tracer during the associated runtime.
- The % OF CPU AVAILABLE reports the percentage of CPU available for
the osnoise thread during the runtime window.
- The MAX SINGLE NOISE IN US reports the maximum single noise observed
during the runtime window.
- The Interference counters display how many times each of the respective
interferences happened during the runtime window.
Note that the example above shows a high number of HW noise samples.
The reason is that this sample was taken on a virtual machine,
and the host interference is detected as a hardware interference.
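The relation between the RUNTIME, NOISE and % OF CPU AVAILABLE columns can
be checked with a few lines of C. This is a sketch under an assumption: the
formula below is inferred from the sample rows (1000000 us of runtime with
190 us of noise gives 99.98100), and demo_cpu_available_pct() is a
hypothetical helper, not a function of the tracer.

```c
#include <assert.h>

/*
 * Percentage of CPU left for the osnoise thread within one runtime window,
 * computed as (runtime - noise) / runtime * 100.
 */
static double demo_cpu_available_pct(unsigned long runtime_us,
				     unsigned long noise_us)
{
	assert(runtime_us > 0 && noise_us <= runtime_us);
	return 100.0 * (double)(runtime_us - noise_us) / (double)runtime_us;
}
```

Feeding the second sample row (656 us of noise) into the helper gives a
value that rounds to the 99.93440 shown in the trace.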
Tracer options
The tracer has a set of options inside the osnoise directory, they are:
- osnoise/cpus: CPUs on which an osnoise thread will execute.
- osnoise/period_us: the period of the osnoise thread.
- osnoise/runtime_us: how long an osnoise thread will look for noise.
- osnoise/stop_tracing_us: stop the system tracing if a single noise
higher than the configured value happens. Writing 0 disables this
option.
- osnoise/stop_tracing_total_us: stop the system tracing if total noise
higher than the configured value happens. Writing 0 disables this
option.
- tracing_threshold: the minimum delta between two time() reads to be
considered as noise, in us. When set to 0, the default value will
be used, which is currently 5 us.
Additional Tracing
In addition to the tracer, a set of tracepoints were added to
facilitate the identification of the osnoise source.
- osnoise:sample_threshold: printed anytime a noise is higher than
the configurable tolerance_ns.
- osnoise:nmi_noise: noise from NMI, including the duration.
- osnoise:irq_noise: noise from an IRQ, including the duration.
- osnoise:softirq_noise: noise from a SoftIRQ, including the
duration.
- osnoise:thread_noise: noise from a thread, including the duration.
Note that all the values are *net values*. For example, if while osnoise
is running, another thread preempts the osnoise thread, it will start a
thread_noise duration at the start. Then, an IRQ takes place, preempting
the thread_noise, starting an irq_noise. When the IRQ ends its execution,
it will compute its duration, and this duration will be subtracted from
the thread_noise, in such a way as to avoid the double accounting of the
IRQ execution. This logic is valid for all sources of noise.
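The net-value accounting can be illustrated with a small model. It is a
sketch only: struct demo_source and the demo_enter()/demo_exit() helpers are
hypothetical, and the tracer itself may implement the subtraction
differently (this model keeps a per-source total of nested time).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-source accounting state; not the kernel's layout. */
struct demo_source {
	unsigned long long start_ns;	/* when this source started running */
	unsigned long long nested_ns;	/* time taken by sources preempting it */
};

static void demo_enter(struct demo_source *s, unsigned long long now_ns)
{
	s->start_ns = now_ns;
	s->nested_ns = 0;
}

/*
 * Close a source and return its net duration. The gross duration is
 * charged to the interrupted (outer) source, if any, so that the outer
 * source does not account for the same time again.
 */
static unsigned long long demo_exit(struct demo_source *s,
				    struct demo_source *outer,
				    unsigned long long now_ns)
{
	unsigned long long gross;

	assert(now_ns >= s->start_ns);
	gross = now_ns - s->start_ns;
	assert(gross >= s->nested_ns);

	if (outer != NULL)
		outer->nested_ns += gross;
	return gross - s->nested_ns;
}
```

With a thread starting at 1000 ns, an IRQ running from 2000 ns to 2500 ns
and the thread ending at 4000 ns, the IRQ reports 500 ns and the thread
reports 2500 ns, so the two net values add up to the 3000 ns of wall time.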
Here is one example of the usage of these tracepoints::
osnoise/8-961 [008] d.h. 5789.857532: irq_noise: local_timer:236 start 5789.857529929 duration 1845 ns
osnoise/8-961 [008] dNh. 5789.858408: irq_noise: local_timer:236 start 5789.858404871 duration 2848 ns
migration/8-54 [008] d... 5789.858413: thread_noise: migration/8:54 start 5789.858409300 duration 3068 ns
osnoise/8-961 [008] .... 5789.858413: sample_threshold: start 5789.858404555 duration 8723 ns interferences 2
In this example, a noise sample of 8 microseconds was reported in the last
line, pointing to two interferences. Looking backward in the trace, the
two previous entries were about the migration thread running after a
timer IRQ execution. The first event is not part of the noise because
it took place one millisecond before.
It is worth noticing that the sum of the duration reported in the
tracepoints is smaller than the eight us reported in the sample_threshold.
The reason lies in the overhead of the entry and exit code that happens
before and after any interference execution. This justifies the dual
approach: measuring thread and tracing.
Link: https://lkml.kernel.org/r/e649467042d60e7b62714c9c6751a56299d15119.1624372313.git.bristot@redhat.com
Cc: Phil Auld <pauld@redhat.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Kate Carcia <kcarcia@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexandre Chartre <alexandre.chartre@oracle.com>
Cc: Clark Willaims <williams@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
[
Made the following functions static:
trace_irqentry_callback()
trace_irqexit_callback()
trace_intel_irqentry_callback()
trace_intel_irqexit_callback()
Added to include/trace.h:
osnoise_arch_register()
osnoise_arch_unregister()
Fixed define logic for LATENCY_FS_NOTIFY
Reported-by: kernel test robot <lkp@intel.com>
]
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-06-22 14:42:27 +00:00
|
|
|
TRACE_OSNOISE,
|
2021-06-22 14:42:28 +00:00
|
|
|
TRACE_TIMERLAT,
|
2016-07-06 19:25:08 +00:00
|
|
|
TRACE_RAW_DATA,
|
2021-04-15 18:18:50 +00:00
|
|
|
TRACE_FUNC_REPEATS,
|
2008-05-23 19:37:28 +00:00
|
|
|
|
2008-12-24 04:24:12 +00:00
|
|
|
__TRACE_LAST_TYPE,
|
2008-05-23 19:37:28 +00:00
|
|
|
};
|
|
|
|
|
2008-05-12 19:20:42 +00:00
|
|
|
|
2009-09-12 23:17:15 +00:00
|
|
|
#undef __field
|
|
|
|
#define __field(type, item) type item;
|
2008-05-12 19:20:51 +00:00
|
|
|
|
2019-10-24 20:26:59 +00:00
|
|
|
#undef __field_fn
|
|
|
|
#define __field_fn(type, item) type item;
|
|
|
|
|
2009-09-12 23:22:23 +00:00
|
|
|
#undef __field_struct
|
|
|
|
#define __field_struct(type, item) __field(type, item)
|
2008-05-12 19:20:51 +00:00
|
|
|
|
2009-09-12 23:22:23 +00:00
|
|
|
#undef __field_desc
|
|
|
|
#define __field_desc(type, container, item)
|
2008-11-22 11:28:47 +00:00
|
|
|
|
2020-06-10 02:00:41 +00:00
|
|
|
#undef __field_packed
|
|
|
|
#define __field_packed(type, container, item)
|
|
|
|
|
2009-09-12 23:17:15 +00:00
|
|
|
#undef __array
|
|
|
|
#define __array(type, item, size) type item[size];
|
2009-03-06 16:21:47 +00:00
|
|
|
|
tracing: Add back FORTIFY_SOURCE logic to kernel_stack event structure
For backward compatibility, older tooling expects to see the kernel_stack
event with a "caller" field that is a fixed size array of 8 addresses. The
code now supports more than 8 with an added "size" field that states the
real number of entries. But the "caller" field still just looks like a
fixed size to user space.
Since the tracing macros that create the user space format files also
create the structures that those files represent, the kernel_stack event
structure had its "caller" field a fixed size of 8, but in reality, when
it is allocated on the ring buffer, it can hold more if the stack trace is
bigger than 8 functions. The copying of these entries was simply done with
a memcpy():
size = nr_entries * sizeof(unsigned long);
memcpy(entry->caller, fstack->calls, size);
The FORTIFY_SOURCE logic noticed at runtime that when the nr_entries was
larger than 8, that the memcpy() was writing more than what the structure
stated it can hold and it complained about it. This is because the
FORTIFY_SOURCE code is unaware that the amount allocated is actually
enough to hold the size. It does not expect that a fixed size field will
hold more than the fixed size.
This was originally solved by hiding the caller assignment with some
pointer arithmetic.
ptr = ring_buffer_data();
entry = ptr;
ptr += offsetof(typeof(*entry), caller);
memcpy(ptr, fstack->calls, size);
But it is considered bad form to hide from kernel hardening. Instead, make
it work nicely with FORTIFY_SOURCE by adding a new __stack_array() macro
that is specific for this one special use case. The macro will take 4
arguments: type, item, len, field (whereas the __array() macro takes just
the first three). This macro will act just like the __array() macro when
creating the code to deal with the format file that is exposed to user
space. But for the kernel, it will turn the caller field into:
type item[] __counted_by(field);
or for this instance:
unsigned long caller[] __counted_by(size);
Now the kernel code can expose the assignment of the caller to the
FORTIFY_SOURCE and everyone is happy!
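The resulting layout can be sketched in user space as follows. This is an
illustration under assumptions: DEMO_COUNTED_BY is a hypothetical fallback
macro that expands to the compiler's counted_by attribute only when
__has_attribute reports it, and struct demo_stack_entry and
demo_stack_alloc() are made-up names, not the kernel_stack event or the
ring buffer code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Use the counted_by attribute when the compiler has it, else nothing. */
#if defined(__has_attribute)
# if __has_attribute(__counted_by__)
#  define DEMO_COUNTED_BY(member) __attribute__((__counted_by__(member)))
# endif
#endif
#ifndef DEMO_COUNTED_BY
# define DEMO_COUNTED_BY(member)
#endif

/* Hypothetical analogue of an entry with a dynamically sized caller[]. */
struct demo_stack_entry {
	int size;				/* real number of entries */
	unsigned long caller[] DEMO_COUNTED_BY(size);
};

/* Allocate an entry large enough for nr_entries and copy the calls in. */
static struct demo_stack_entry *demo_stack_alloc(const unsigned long *calls,
						 int nr_entries)
{
	struct demo_stack_entry *entry;
	size_t bytes;

	assert(nr_entries >= 0);
	bytes = (size_t)nr_entries * sizeof(unsigned long);

	entry = malloc(sizeof(*entry) + bytes);
	if (!entry)
		return NULL;

	entry->size = nr_entries;	/* set the count before using caller[] */
	memcpy(entry->caller, calls, bytes);
	return entry;
}
```

Because the array is declared as a flexible array tied to the size member,
a copy of more than 8 entries no longer looks like an overflow of a fixed
size field to bounds-checking instrumentation.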
Link: https://lore.kernel.org/linux-trace-kernel/20230712105235.5fc441aa@gandalf.local.home/
Link: https://lore.kernel.org/linux-trace-kernel/20230713092605.2ddb9788@rorschach.local.home
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Suggested-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
2023-07-13 13:26:05 +00:00
|
|
|
/*
|
|
|
|
* For backward compatibility, older user space expects to see the
|
|
|
|
 * kernel_stack event with a fixed size caller field. But today the fixed
|
|
|
|
* size is ignored by the kernel, and the real structure is dynamic.
|
|
|
|
* Expose to user space: "unsigned long caller[8];" but the real structure
|
|
|
|
* will be "unsigned long caller[] __counted_by(size)"
|
|
|
|
*/
|
|
|
|
#undef __stack_array
|
|
|
|
#define __stack_array(type, item, size, field) type item[] __counted_by(field);
|
|
|
|
|
2009-09-12 23:22:23 +00:00
|
|
|
#undef __array_desc
|
|
|
|
#define __array_desc(type, container, item, size)
|
2008-09-30 03:02:42 +00:00
|
|
|
|
2009-09-12 23:17:15 +00:00
|
|
|
#undef __dynamic_array
|
|
|
|
#define __dynamic_array(type, item) type item[];
|
2008-09-30 03:02:42 +00:00
|
|
|
|
2021-11-22 09:30:21 +00:00
|
|
|
#undef __rel_dynamic_array
|
|
|
|
#define __rel_dynamic_array(type, item) type item[];
|
|
|
|
|
2009-09-12 23:17:15 +00:00
|
|
|
#undef F_STRUCT
|
|
|
|
#define F_STRUCT(args...) args
|
2008-11-11 22:24:42 +00:00
|
|
|
|
2009-09-12 23:17:15 +00:00
|
|
|
#undef FTRACE_ENTRY
|
2019-10-24 20:26:59 +00:00
|
|
|
#define FTRACE_ENTRY(name, struct_name, id, tstruct, print) \
|
2012-02-15 14:51:53 +00:00
|
|
|
struct struct_name { \
|
|
|
|
struct trace_entry ent; \
|
|
|
|
tstruct \
|
2009-09-12 23:17:15 +00:00
|
|
|
}
|
2008-09-30 03:02:42 +00:00
|
|
|
|
2009-09-12 23:17:15 +00:00
|
|
|
#undef FTRACE_ENTRY_DUP
|
2019-10-24 20:26:59 +00:00
|
|
|
#define FTRACE_ENTRY_DUP(name, name_struct, id, tstruct, printk)
|
2008-11-25 08:24:15 +00:00
|
|
|
|
2012-02-15 14:51:51 +00:00
|
|
|
#undef FTRACE_ENTRY_REG
|
2019-10-24 20:26:59 +00:00
|
|
|
#define FTRACE_ENTRY_REG(name, struct_name, id, tstruct, print, regfn) \
|
|
|
|
FTRACE_ENTRY(name, struct_name, id, PARAMS(tstruct), PARAMS(print))
|
2012-02-15 14:51:51 +00:00
|
|
|
|
2016-06-29 10:56:48 +00:00
|
|
|
#undef FTRACE_ENTRY_PACKED
|
2019-10-24 20:26:59 +00:00
|
|
|
#define FTRACE_ENTRY_PACKED(name, struct_name, id, tstruct, print) \
|
|
|
|
FTRACE_ENTRY(name, struct_name, id, PARAMS(tstruct), PARAMS(print)) __packed
|
2016-06-29 10:56:48 +00:00
|
|
|
|
2009-09-12 23:17:15 +00:00
|
|
|
#include "trace_entries.h"
|
2008-12-29 21:42:23 +00:00
|
|
|
|
2020-01-25 15:52:30 +00:00
|
|
|
/* Use this for memory failure errors */
|
2021-06-28 13:50:06 +00:00
|
|
|
#define MEM_FAIL(condition, fmt, ...) \
|
|
|
|
DO_ONCE_LITE_IF(condition, pr_err, "ERROR: " fmt, ##__VA_ARGS__)
|
2020-01-25 15:52:30 +00:00
|
|
|
|
2023-07-11 14:15:57 +00:00
|
|
|
#define FAULT_STRING "(fault)"
|
|
|
|
|
2023-01-17 15:21:28 +00:00
|
|
|
#define HIST_STACKTRACE_DEPTH 16
|
|
|
|
#define HIST_STACKTRACE_SIZE (HIST_STACKTRACE_DEPTH * sizeof(unsigned long))
|
|
|
|
#define HIST_STACKTRACE_SKIP 5
|
|
|
|
|
2009-09-12 23:17:15 +00:00
|
|
|
/*
|
|
|
|
* syscalls are special, and need special handling, this is why
|
|
|
|
* they are not included in trace_entries.h
|
|
|
|
*/
|
2009-03-13 14:42:11 +00:00
|
|
|
struct syscall_trace_enter {
|
|
|
|
struct trace_entry ent;
|
|
|
|
int nr;
|
|
|
|
unsigned long args[];
|
|
|
|
};
|
|
|
|
|
|
|
|
struct syscall_trace_exit {
|
|
|
|
struct trace_entry ent;
|
|
|
|
int nr;
|
2009-11-25 07:14:59 +00:00
|
|
|
long ret;
|
2009-03-13 14:42:11 +00:00
|
|
|
};
|
|
|
|
|
tracing/kprobes: Support basic types on dynamic events
Support basic types of integer (u8, u16, u32, u64, s8, s16, s32, s64) in
kprobe tracer. With this patch, users can specify above basic types on
each argument after ':'. If omitted, the argument type is set as
unsigned long (u32 or u64, arch-dependent).
e.g.
echo 'p account_system_time+0 hardirq_offset=%si:s32' > kprobe_events
adds a probe recording hardirq_offset in signed-32bits value on the
entry of account_system_time.
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20100412171708.3790.18599.stgit@localhost6.localdomain6>
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-12 17:17:08 +00:00
|
|
|
struct kprobe_trace_entry_head {
|
2009-08-13 20:35:11 +00:00
|
|
|
struct trace_entry ent;
|
|
|
|
unsigned long ip;
|
|
|
|
};
|
|
|
|
|
tracing: Add a probe that attaches to trace events
A new dynamic event is introduced: event probe. The event is attached
to an existing tracepoint and uses its fields as arguments. The user
can specify a custom format string for the new event, select which tracepoint
arguments will be printed and how to print them.
An event probe is created by writing a configuration string to the
'dynamic_events' ftrace file:
e[:[SNAME/]ENAME] SYSTEM/EVENT [FETCHARGS] - Set an event probe
-:SNAME/ENAME - Delete an event probe
Where:
SNAME - System name, if omitted 'eprobes' is used.
ENAME - Name of the new event in SNAME, if omitted the SYSTEM_EVENT is used.
SYSTEM - Name of the system, where the tracepoint is defined, mandatory.
EVENT - Name of the tracepoint event in SYSTEM, mandatory.
FETCHARGS - Arguments:
<name>=$<field>[:TYPE] - Fetch given field of the tracepoint and print
it as given TYPE with given name. Supported
types are:
(u8/u16/u32/u64/s8/s16/s32/s64), basic type
(x8/x16/x32/x64), hexadecimal types
"string", "ustring" and bitfield.
For example, attach an event probe to the openat system call and print the
name of the file that will be opened:
echo "e:esys/eopen syscalls/sys_enter_openat file=\$filename:string" >> dynamic_events
A new dynamic event is created in events/esys/eopen/ directory. It
can be deleted with:
echo "-:esys/eopen" >> dynamic_events
Filters, triggers and histograms can be attached to the new event, it can
be matched in synthetic events. There is one limitation - an event probe
can not be attached to kprobe, uprobe or another event probe.
Link: https://lkml.kernel.org/r/20210812145805.2292326-1-tz.stoyanov@gmail.com
Link: https://lkml.kernel.org/r/20210819152825.142428383@goodmis.org
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Co-developed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-08-19 15:26:06 +00:00
|
|
|
struct eprobe_trace_entry_head {
|
|
|
|
struct trace_entry ent;
|
|
|
|
};
|
|
|
|
|
tracing/kprobes: Support basic types on dynamic events
Support basic integer types (u8, u16, u32, u64, s8, s16, s32, s64) in the
kprobe tracer. With this patch, users can specify one of the above basic
types for each argument after ':'. If omitted, the argument type is set to
unsigned long (u32 or u64, arch-dependent).
e.g.
echo 'p account_system_time+0 hardirq_offset=%si:s32' > kprobe_events
adds a probe recording hardirq_offset as a signed 32-bit value at the
entry of account_system_time.
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20100412171708.3790.18599.stgit@localhost6.localdomain6>
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-04-12 17:17:08 +00:00
|
|
|
struct kretprobe_trace_entry_head {
|
2009-08-13 20:35:11 +00:00
|
|
|
struct trace_entry ent;
|
|
|
|
unsigned long func;
|
|
|
|
unsigned long ret_ip;
|
|
|
|
};
|
|
|
|
|
2023-06-06 12:39:55 +00:00
|
|
|
struct fentry_trace_entry_head {
|
|
|
|
struct trace_entry ent;
|
|
|
|
unsigned long ip;
|
|
|
|
};
|
|
|
|
|
|
|
|
struct fexit_trace_entry_head {
|
|
|
|
struct trace_entry ent;
|
|
|
|
unsigned long func;
|
|
|
|
unsigned long ret_ip;
|
|
|
|
};
|
|
|
|
|
2008-09-16 19:06:42 +00:00
|
|
|
#define TRACE_BUF_SIZE 1024
|
2008-05-12 19:20:42 +00:00
|
|
|
|
2012-05-11 17:29:49 +00:00
|
|
|
struct trace_array;
|
|
|
|
|
2008-05-12 19:20:42 +00:00
|
|
|
/*
|
|
|
|
* The CPU trace array - it consists of thousands of trace entries
|
|
|
|
* plus some other descriptor data: (for example which task started
|
|
|
|
* the trace, etc.)
|
|
|
|
*/
|
|
|
|
struct trace_array_cpu {
|
|
|
|
atomic_t disabled;
|
2008-12-02 03:20:19 +00:00
|
|
|
void *buffer_page; /* ring buffer spare */
|
2008-05-12 19:20:45 +00:00
|
|
|
|
2012-02-02 20:00:41 +00:00
|
|
|
unsigned long entries;
|
2008-05-12 19:20:42 +00:00
|
|
|
unsigned long saved_latency;
|
|
|
|
unsigned long critical_start;
|
|
|
|
unsigned long critical_end;
|
|
|
|
unsigned long critical_sequence;
|
|
|
|
unsigned long nice;
|
|
|
|
unsigned long policy;
|
|
|
|
unsigned long rt_priority;
|
2009-09-01 15:06:29 +00:00
|
|
|
unsigned long skipped_entries;
|
2016-12-21 19:32:01 +00:00
|
|
|
u64 preempt_timestamp;
|
2008-05-12 19:20:42 +00:00
|
|
|
pid_t pid;
|
2012-03-13 23:02:19 +00:00
|
|
|
kuid_t uid;
|
2008-05-12 19:20:42 +00:00
|
|
|
char comm[TASK_COMM_LEN];
|
2015-09-25 16:58:44 +00:00
|
|
|
|
2016-04-22 22:11:33 +00:00
|
|
|
#ifdef CONFIG_FUNCTION_TRACER
|
2020-03-20 03:40:40 +00:00
|
|
|
int ftrace_ignore_pid;
|
2016-04-22 22:11:33 +00:00
|
|
|
#endif
|
2020-03-20 03:40:40 +00:00
|
|
|
bool ignore_pid;
|
2008-05-12 19:20:42 +00:00
|
|
|
};
|
|
|
|
|
2012-05-11 17:29:49 +00:00
|
|
|
struct tracer;
|
2015-09-30 18:27:31 +00:00
|
|
|
struct trace_option_dentry;
|
2012-05-11 17:29:49 +00:00
|
|
|
|
2020-01-09 23:53:48 +00:00
|
|
|
struct array_buffer {
|
tracing: Consolidate max_tr into main trace_array structure
Currently, the way the latency tracers and snapshot feature works
is to have a separate trace_array called "max_tr" that holds the
snapshot buffer. For latency tracers, this snapshot buffer is used
to swap the running buffer with this buffer to save the current max
latency.
The only items needed for the max_tr is really just a copy of the buffer
itself, the per_cpu data pointers, the time_start timestamp that states
when the max latency was triggered, and the cpu that the max latency
was triggered on. All other fields in trace_array are unused by the
max_tr, making the max_tr mostly bloat.
This change removes the max_tr completely, and adds a new structure
called trace_buffer, that holds the buffer pointer, the per_cpu data
pointers, the time_start timestamp, and the cpu where the latency occurred.
The trace_array now has two trace_buffers, one for the normal trace and
one for the max trace or snapshot. By doing this, not only do we remove
the bloat from the max_trace but the instances of traces can now use
their own snapshot feature and not have just the top level global_trace have
the snapshot feature and latency tracers for itself.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-05 14:24:35 +00:00
|
|
|
struct trace_array *tr;
|
2019-12-13 18:58:57 +00:00
|
|
|
struct trace_buffer *buffer;
|
2013-03-05 14:24:35 +00:00
|
|
|
struct trace_array_cpu __percpu *data;
|
2016-12-21 19:32:01 +00:00
|
|
|
u64 time_start;
|
2013-03-05 14:24:35 +00:00
|
|
|
int cpu;
|
|
|
|
};
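
The "Consolidate max_tr" commit message describes swapping the running buffer with the snapshot buffer to save the current max latency. A minimal user-space sketch of that idea follows, assuming simplified stand-in types: the *_sketch names and snapshot_swap_sketch() are hypothetical, and locking, per-CPU data and buffer resets are deliberately left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Opaque stand-in for the ring buffer; only its address matters here. */
struct trace_buffer {
	int id;
};

/* Reduced copy of struct array_buffer: just the fields the swap touches. */
struct array_buffer_sketch {
	struct trace_buffer	*buffer;
	uint64_t		time_start;
	int			cpu;
};

/* Reduced trace_array: one live buffer and one max/snapshot buffer. */
struct trace_array_sketch {
	struct array_buffer_sketch array_buffer;
	struct array_buffer_sketch max_buffer;
};

/*
 * Hypothetical illustration: exchange the live and snapshot ring buffers,
 * then record on the snapshot side when and on which CPU it was taken.
 */
static void snapshot_swap_sketch(struct trace_array_sketch *tr,
				 int cpu, uint64_t time_start)
{
	struct trace_buffer *tmp = tr->array_buffer.buffer;

	tr->array_buffer.buffer = tr->max_buffer.buffer;
	tr->max_buffer.buffer = tmp;

	tr->max_buffer.cpu = cpu;
	tr->max_buffer.time_start = time_start;
}
```

Only the buffer pointers move; no trace data is copied, which is presumably why the swap is cheap enough to use from the latency tracers.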
|
|
|
|
|
2015-09-30 15:11:15 +00:00
|
|
|
#define TRACE_FLAGS_MAX_SIZE 32
|
|
|
|
|
2015-09-30 18:27:31 +00:00
|
|
|
struct trace_options {
|
|
|
|
struct tracer *tracer;
|
|
|
|
struct trace_option_dentry *topts;
|
|
|
|
};
|
|
|
|
|
2021-09-24 01:03:49 +00:00
|
|
|
struct trace_pid_list *trace_pid_list_alloc(void);
|
|
|
|
void trace_pid_list_free(struct trace_pid_list *pid_list);
|
|
|
|
bool trace_pid_list_is_set(struct trace_pid_list *pid_list, unsigned int pid);
|
|
|
|
int trace_pid_list_set(struct trace_pid_list *pid_list, unsigned int pid);
|
|
|
|
int trace_pid_list_clear(struct trace_pid_list *pid_list, unsigned int pid);
|
|
|
|
int trace_pid_list_first(struct trace_pid_list *pid_list, unsigned int *pid);
|
|
|
|
int trace_pid_list_next(struct trace_pid_list *pid_list, unsigned int pid,
|
|
|
|
unsigned int *next);
|
2015-09-24 15:33:26 +00:00
|
|
|
|
2020-03-25 23:51:19 +00:00
|
|
|
enum {
|
|
|
|
TRACE_PIDS = BIT(0),
|
|
|
|
TRACE_NO_PIDS = BIT(1),
|
|
|
|
};
|
|
|
|
|
|
|
|
static inline bool pid_type_enabled(int type, struct trace_pid_list *pid_list,
|
|
|
|
struct trace_pid_list *no_pid_list)
|
|
|
|
{
|
|
|
|
/* Return true if the pid list in type has pids */
|
|
|
|
return ((type & TRACE_PIDS) && pid_list) ||
|
|
|
|
((type & TRACE_NO_PIDS) && no_pid_list);
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline bool still_need_pid_events(int type, struct trace_pid_list *pid_list,
|
|
|
|
struct trace_pid_list *no_pid_list)
|
|
|
|
{
|
|
|
|
/*
|
|
|
|
* Turning off what is in @type, return true if the "other"
|
|
|
|
* pid list, still has pids in it.
|
|
|
|
*/
|
|
|
|
return (!(type & TRACE_PIDS) && pid_list) ||
|
|
|
|
(!(type & TRACE_NO_PIDS) && no_pid_list);
|
|
|
|
}
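
The two helpers above depend only on the type bits and on whether each list pointer is non-NULL. The following user-space sketch reproduces them so that their truth table can be checked outside the kernel; the BIT() definition and the empty-bodied struct trace_pid_list are stand-ins assumed for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the kernel's BIT() macro. */
#define BIT(n)	(1U << (n))

enum {
	TRACE_PIDS	= BIT(0),
	TRACE_NO_PIDS	= BIT(1),
};

/* Stand-in type: the helpers only test the pointers against NULL. */
struct trace_pid_list {
	int unused;
};

static inline bool pid_type_enabled(int type, struct trace_pid_list *pid_list,
				    struct trace_pid_list *no_pid_list)
{
	/* Return true if the pid list in type has pids */
	return ((type & TRACE_PIDS) && pid_list) ||
		((type & TRACE_NO_PIDS) && no_pid_list);
}

static inline bool still_need_pid_events(int type, struct trace_pid_list *pid_list,
					 struct trace_pid_list *no_pid_list)
{
	/* True if a list NOT selected by @type is still present */
	return (!(type & TRACE_PIDS) && pid_list) ||
		(!(type & TRACE_NO_PIDS) && no_pid_list);
}
```

For instance, when both lists exist and only TRACE_PIDS is being turned off, still_need_pid_events() returns true because the no_pid list remains in use.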
|
|
|
|
|
tracing: Add conditional snapshot
Currently, tracing snapshots are context-free - they capture the ring
buffer contents at the time the tracing_snapshot() function was
invoked, and nothing else. Additionally, they're always taken
unconditionally - the calling code can decide whether or not to take a
snapshot, but the data used to make that decision is kept separately
from the snapshot itself.
This change adds the ability to associate with each trace instance
some user data, along with an 'update' function that can use that data
to determine whether or not to actually take a snapshot. The update
function can then update that data along with any other state (as part
of the data presumably), if warranted.
Because snapshots are 'global' per-instance, only one user can enable
and use a conditional snapshot for any given trace instance. To
enable a conditional snapshot (see details in the function and data
structure comments), the user calls tracing_snapshot_cond_enable().
Similarly, to disable a conditional snapshot and free it up for other
users, tracing_snapshot_cond_disable() should be called.
To actually initiate a conditional snapshot, tracing_snapshot_cond()
should be called. tracing_snapshot_cond() will invoke the update()
callback, allowing the user to decide whether or not to actually take
the snapshot and update the user-defined data associated with the
snapshot. If the callback returns 'true', tracing_snapshot_cond()
will then actually take the snapshot and return.
This scheme allows for flexibility in snapshot implementations - for
example, by implementing slightly different update() callbacks,
snapshots can be taken in situations where the user is only interested
in taking a snapshot when a new maximum is hit versus when a value
changes in any way at all. Future patches will demonstrate both
cases.
Link: http://lkml.kernel.org/r/1bea07828d5fd6864a585f83b1eed47ce097eb45.1550100284.git.tom.zanussi@linux.intel.com
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-02-13 23:42:45 +00:00
|
|
|
typedef bool (*cond_update_fn_t)(struct trace_array *tr, void *cond_data);
|
|
|
|
|
|
|
|
/**
|
|
|
|
* struct cond_snapshot - conditional snapshot data and callback
|
|
|
|
*
|
|
|
|
* The cond_snapshot structure encapsulates a callback function and
|
|
|
|
* data associated with the snapshot for a given tracing instance.
|
|
|
|
*
|
|
|
|
* When a snapshot is taken conditionally, by invoking
|
|
|
|
* tracing_snapshot_cond(tr, cond_data), the cond_data passed in is
|
|
|
|
* passed in turn to the cond_snapshot.update() function. That data
|
|
|
|
* can be compared by the update() implementation with the cond_data
|
2020-10-10 14:09:24 +00:00
|
|
|
* contained within the struct cond_snapshot instance associated with
|
2019-02-13 23:42:45 +00:00
|
|
|
* the trace_array. Because the tr->max_lock is held throughout the
|
|
|
|
* update() call, the update() function can directly retrieve the
|
|
|
|
* cond_snapshot and cond_data associated with the per-instance
|
|
|
|
* snapshot associated with the trace_array.
|
|
|
|
*
|
|
|
|
* The cond_snapshot.update() implementation can save data to be
|
|
|
|
* associated with the snapshot if it decides to, and returns 'true'
|
|
|
|
* in that case, or it returns 'false' if the conditional snapshot
|
|
|
|
* shouldn't be taken.
|
|
|
|
*
|
|
|
|
* The cond_snapshot instance is created and associated with the
|
|
|
|
* user-defined cond_data by tracing_snapshot_cond_enable().
|
|
|
|
* Likewise, the cond_snapshot instance is destroyed and is no longer
|
|
|
|
* associated with the trace instance by
|
|
|
|
* tracing_snapshot_cond_disable().
|
|
|
|
*
|
|
|
|
* The method below is required.
|
|
|
|
*
|
|
|
|
* @update: When a conditional snapshot is invoked, the update()
|
|
|
|
* callback function is invoked with the tr->max_lock held. The
|
|
|
|
* update() implementation signals whether or not to actually
|
|
|
|
* take the snapshot, by returning 'true' if so, 'false' if no
|
|
|
|
* snapshot should be taken. Because the max_lock is held for
|
|
|
|
* the duration of update(), the implementation is safe to
|
2020-10-10 14:09:24 +00:00
|
|
|
* directly retrieve and save any implementation data it needs
|
2019-02-13 23:42:45 +00:00
|
|
|
* to in association with the snapshot.
|
|
|
|
*/
|
|
|
|
struct cond_snapshot {
|
|
|
|
void *cond_data;
|
|
|
|
cond_update_fn_t update;
|
|
|
|
};
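
As a rough illustration of the update() contract documented above, the following user-space sketch implements the "new maximum" case mentioned in the commit message. Everything outside cond_update_fn_t and struct cond_snapshot is an assumption made for the example: the one-field struct trace_array, struct max_tracker, update_on_new_max() and snapshot_cond_sketch() are hypothetical, and the tr->max_lock handling is omitted.

```c
#include <stdbool.h>
#include <stddef.h>

struct trace_array;

typedef bool (*cond_update_fn_t)(struct trace_array *tr, void *cond_data);

struct cond_snapshot {
	void			*cond_data;
	cond_update_fn_t	update;
};

/* Stand-in trace_array: only the per-instance cond_snapshot pointer. */
struct trace_array {
	struct cond_snapshot	*cond_snapshot;
};

/* Hypothetical user data saved alongside the snapshot. */
struct max_tracker {
	unsigned long max;
};

/* update() callback: request a snapshot only when a new maximum is seen. */
static bool update_on_new_max(struct trace_array *tr, void *cond_data)
{
	struct max_tracker *saved = tr->cond_snapshot->cond_data;
	unsigned long val = *(unsigned long *)cond_data;

	if (val <= saved->max)
		return false;	/* not a new maximum: no snapshot */

	saved->max = val;	/* remember the value tied to the snapshot */
	return true;
}

/*
 * Stand-in for tracing_snapshot_cond(): reports whether a snapshot would
 * be taken. The real function holds tr->max_lock around update().
 */
static bool snapshot_cond_sketch(struct trace_array *tr, void *cond_data)
{
	if (!tr->cond_snapshot || !tr->cond_snapshot->update)
		return false;
	return tr->cond_snapshot->update(tr, cond_data);
}
```

A callback for the "value changed in any way" case would differ only in its comparison (val != saved->max), which is the flexibility the commit message points to.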
|
|
|
|
|
2021-04-15 18:18:51 +00:00
|
|
|
/*
|
|
|
|
* struct trace_func_repeats - used to keep track of the consecutive
|
|
|
|
* (on the same CPU) calls of a single function.
|
|
|
|
*/
|
|
|
|
struct trace_func_repeats {
|
|
|
|
unsigned long ip;
|
|
|
|
unsigned long parent_ip;
|
|
|
|
unsigned long count;
|
|
|
|
u64 ts_last_call;
|
|
|
|
};
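
One way such a structure could be used to fold consecutive calls of the same function is sketched below. This is an assumption-laden illustration, not the kernel's implementation: func_repeats_update() is a hypothetical name, u64 is replaced by uint64_t, and the sketch ignores per-CPU storage and any counter limit.

```c
#include <stdbool.h>
#include <stdint.h>

struct trace_func_repeats {
	unsigned long	ip;
	unsigned long	parent_ip;
	unsigned long	count;
	uint64_t	ts_last_call;
};

/*
 * Hypothetical helper: returns true when (ip, parent_ip) matches the
 * previously recorded call, in which case the call is only counted.
 * Otherwise the new call site is recorded and the counter restarts.
 */
static bool func_repeats_update(struct trace_func_repeats *last,
				unsigned long ip, unsigned long parent_ip,
				uint64_t ts)
{
	if (last->ip == ip && last->parent_ip == parent_ip) {
		last->count++;
		last->ts_last_call = ts;
		return true;
	}

	last->ip = ip;
	last->parent_ip = parent_ip;
	last->count = 0;
	return false;
}
```

With this scheme, count holds the number of repeats seen after the first call, and ts_last_call holds the timestamp of the most recent repeat.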
|
|
|
|
|
2008-05-12 19:20:42 +00:00
|
|
|
/*
|
|
|
|
* The trace array - an array of per-CPU trace arrays. This is the
|
|
|
|
* highest level data structure that individual tracers deal with.
|
|
|
|
* They have on/off state as well:
|
|
|
|
*/
|
|
|
|
struct trace_array {
|
2012-05-04 03:09:03 +00:00
|
|
|
struct list_head list;
|
2012-08-03 20:10:49 +00:00
|
|
|
char *name;
|
2020-01-09 23:53:48 +00:00
|
|
|
struct array_buffer array_buffer;
|
2013-03-05 14:24:35 +00:00
|
|
|
#ifdef CONFIG_TRACER_MAX_TRACE
|
|
|
|
/*
|
|
|
|
* The max_buffer is used to snapshot the trace when a maximum
|
|
|
|
* latency is reached, or when the user initiates a snapshot.
|
|
|
|
* Some tracers will use this to store a maximum trace while
|
|
|
|
* it continues examining live traces.
|
|
|
|
*
|
2020-01-09 23:53:48 +00:00
|
|
|
* The buffers for the max_buffer are set up the same as the array_buffer
|
2013-03-05 14:24:35 +00:00
|
|
|
* When a snapshot is taken, the buffer of the max_buffer is swapped
|
2020-01-09 23:53:48 +00:00
|
|
|
* with the buffer of the array_buffer and the buffers are reset for
|
|
|
|
* the array_buffer so the tracing can continue.
|
2013-03-05 14:24:35 +00:00
|
|
|
*/
|
2020-01-09 23:53:48 +00:00
|
|
|
struct array_buffer max_buffer;
|
2013-03-05 23:25:02 +00:00
|
|
|
bool allocated_snapshot;
|
2024-02-20 20:23:07 +00:00
|
|
|
spinlock_t snapshot_trigger_lock;
|
|
|
|
unsigned int snapshot;
|
2014-01-14 16:28:38 +00:00
|
|
|
unsigned long max_latency;
|
2019-10-08 22:08:21 +00:00
|
|
|
#ifdef CONFIG_FSNOTIFY
|
|
|
|
struct dentry *d_max_latency;
|
|
|
|
struct work_struct fsnotify_work;
|
|
|
|
struct irq_work fsnotify_irqwork;
|
|
|
|
#endif
|
2013-03-05 14:24:35 +00:00
|
|
|
#endif
|
2024-06-12 23:19:38 +00:00
|
|
|
/* The below is for memory mapped ring buffer */
|
|
|
|
unsigned int mapped;
|
|
|
|
unsigned long range_addr_start;
|
|
|
|
unsigned long range_addr_size;
|
2024-06-12 23:19:44 +00:00
|
|
|
long text_delta;
|
|
|
|
long data_delta;
|
2024-06-12 23:19:38 +00:00
|
|
|
|
2015-09-24 15:33:26 +00:00
|
|
|
struct trace_pid_list __rcu *filtered_pids;
|
2020-03-25 23:51:19 +00:00
|
|
|
struct trace_pid_list __rcu *filtered_no_pids;
|
2014-01-14 15:04:59 +00:00
|
|
|
/*
|
|
|
|
* max_lock is used to protect the swapping of buffers
|
|
|
|
* when taking a max snapshot. The buffers themselves are
|
|
|
|
* protected by per_cpu spinlocks. But the action of the swap
|
|
|
|
* needs its own lock.
|
|
|
|
*
|
|
|
|
* This is defined as an arch_spinlock_t in order to help
|
|
|
|
* with performance when lockdep debugging is enabled.
|
|
|
|
*
|
|
|
|
* It is also used in other places outside the update_max_tr
|
|
|
|
* so it needs to be defined outside of the
|
|
|
|
* CONFIG_TRACER_MAX_TRACE.
|
|
|
|
*/
|
|
|
|
arch_spinlock_t max_lock;
|
2012-02-22 20:50:28 +00:00
|
|
|
int buffer_disabled;
|
2012-08-08 18:48:20 +00:00
|
|
|
#ifdef CONFIG_FTRACE_SYSCALLS
|
|
|
|
int sys_refcount_enter;
|
|
|
|
int sys_refcount_exit;
|
2015-05-05 14:09:53 +00:00
|
|
|
struct trace_event_file __rcu *enter_syscall_files[NR_syscalls];
|
|
|
|
struct trace_event_file __rcu *exit_syscall_files[NR_syscalls];
|
2012-08-08 18:48:20 +00:00
|
|
|
#endif
|
2012-05-11 17:29:49 +00:00
|
|
|
int stop_count;
|
|
|
|
int clock_id;
|
2015-09-30 18:27:31 +00:00
|
|
|
int nr_topts;
|
tracing: Only have rmmod clear buffers that its events were active in
Currently, when a module event is enabled, when that module is removed, it
clears all ring buffers. This is to prevent another module from being loaded
and having one of its trace event IDs from reusing a trace event ID of the
removed module. This could cause undesirable effects as the trace event of
the new module would be using its own processing algorithms to process raw
data of another event. To prevent this, when a module is loaded, if any of
its events have been used (signified by the WAS_ENABLED event call flag,
which is never cleared), all ring buffers are cleared, just in case any one
of them contains event data of the removed event.
The problem is, there's no reason to clear all ring buffers if only one (or
less than all of them) uses one of the events. Instead, only clear the ring
buffers that recorded the events of a module that is being removed.
To do this, instead of keeping the WAS_ENABLED flag with the trace event
call, move it to the per instance (per ring buffer) event file descriptor.
The event file descriptor maps each event to a separate ring buffer
instance. Then when the module is removed, only the ring buffers that
activated one of the module's events get cleared. The rest are not touched.
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-08-31 21:03:47 +00:00
|
|
|
bool clear_trace;
|
2018-11-30 02:38:42 +00:00
|
|
|
int buffer_percent;
|
2019-04-02 02:52:21 +00:00
|
|
|
unsigned int n_err_log_entries;
|
2012-05-11 17:29:49 +00:00
|
|
|
struct tracer *current_trace;
|
2015-09-30 13:42:05 +00:00
|
|
|
unsigned int trace_flags;
|
2015-09-30 15:11:15 +00:00
|
|
|
unsigned char trace_flags_index[TRACE_FLAGS_MAX_SIZE];
|
2012-05-04 03:09:03 +00:00
|
|
|
unsigned int flags;
|
2012-05-11 17:29:49 +00:00
|
|
|
raw_spinlock_t start_lock;
|
2023-12-13 14:37:01 +00:00
|
|
|
const char *system_names;
|
2019-04-02 02:52:21 +00:00
|
|
|
struct list_head err_log;
|
2012-05-04 03:09:03 +00:00
|
|
|
struct dentry *dir;
|
2012-05-11 17:29:49 +00:00
|
|
|
struct dentry *options;
|
|
|
|
struct dentry *percpu_dir;
|
eventfs: Remove eventfs_file and just use eventfs_inode
Instead of having a descriptor for every file represented in the eventfs
directory, only have the directory itself represented. Change the API to
send in a list of entries that represent all the files in the directory
(but not other directories). The entry list contains a name and a callback
function that will be used to create the files when they are accessed.
struct eventfs_inode *eventfs_create_events_dir(const char *name, struct dentry *parent,
const struct eventfs_entry *entries,
int size, void *data);
is used for the top level eventfs directory, and returns an eventfs_inode
that will be used by:
struct eventfs_inode *eventfs_create_dir(const char *name, struct eventfs_inode *parent,
const struct eventfs_entry *entries,
int size, void *data);
where both of the above take an array of struct eventfs_entry entries for
every file that is in the directory.
The entries are defined by:
typedef int (*eventfs_callback)(const char *name, umode_t *mode, void **data,
const struct file_operations **fops);
struct eventfs_entry {
const char *name;
eventfs_callback callback;
};
Where the name is the name of the file and the callback gets called when
the file is being created. The callback passes in the name (in case the
same callback is used for multiple files), a pointer to the mode, data and
fops. The data will be pointing to the data that was passed in
eventfs_create_dir() or eventfs_create_events_dir() but may be overridden
to point to something else, as it will be used to point to the
inode->i_private that is created. The information passed back from the
callback is used to create the dentry/inode.
If the callback fills the data and the file should be created, it must
return a positive number. On zero or negative, the file is ignored.
This logic may also be used as a prototype to convert entire pseudo file
systems into just-in-time allocation.
The "show_events_dentry" file has been updated to show the directories,
and any files they have.
With just the eventfs_file allocations:
Before after deltas for meminfo (in kB):
MemFree: -14360
MemAvailable: -14260
Buffers: 40
Cached: 24
Active: 44
Inactive: 48
Inactive(anon): 28
Active(file): 44
Inactive(file): 20
Dirty: -4
AnonPages: 28
Mapped: 4
KReclaimable: 132
Slab: 1604
SReclaimable: 132
SUnreclaim: 1472
Committed_AS: 12
Before after deltas for slabinfo:
<slab>: <objects> [ * <size> = <total>]
ext4_inode_cache 27 [* 1184 = 31968 ]
extent_status 102 [* 40 = 4080 ]
tracefs_inode_cache 144 [* 656 = 94464 ]
buffer_head 39 [* 104 = 4056 ]
shmem_inode_cache 49 [* 800 = 39200 ]
filp -53 [* 256 = -13568 ]
dentry 251 [* 192 = 48192 ]
lsm_file_cache 277 [* 32 = 8864 ]
vm_area_struct -14 [* 184 = -2576 ]
trace_event_file 1748 [* 88 = 153824 ]
kmalloc-1k 35 [* 1024 = 35840 ]
kmalloc-256 49 [* 256 = 12544 ]
kmalloc-192 -28 [* 192 = -5376 ]
kmalloc-128 -30 [* 128 = -3840 ]
kmalloc-96 10581 [* 96 = 1015776 ]
kmalloc-64 3056 [* 64 = 195584 ]
kmalloc-32 1291 [* 32 = 41312 ]
kmalloc-16 2310 [* 16 = 36960 ]
kmalloc-8 9216 [* 8 = 73728 ]
Free memory dropped by 14,360 kB
Available memory dropped by 14,260 kB
Total slab additions in size: 1,771,032 bytes
With this change:
Before after deltas for meminfo (in kB):
MemFree: -12084
MemAvailable: -11976
Buffers: 32
Cached: 32
Active: 72
Inactive: 168
Inactive(anon): 176
Active(file): 72
Inactive(file): -8
Dirty: 24
AnonPages: 196
Mapped: 8
KReclaimable: 148
Slab: 836
SReclaimable: 148
SUnreclaim: 688
Committed_AS: 324
Before after deltas for slabinfo:
<slab>: <objects> [ * <size> = <total>]
tracefs_inode_cache 144 [* 656 = 94464 ]
shmem_inode_cache -23 [* 800 = -18400 ]
filp -92 [* 256 = -23552 ]
dentry 179 [* 192 = 34368 ]
lsm_file_cache -3 [* 32 = -96 ]
vm_area_struct -13 [* 184 = -2392 ]
trace_event_file 1748 [* 88 = 153824 ]
kmalloc-1k -49 [* 1024 = -50176 ]
kmalloc-256 -27 [* 256 = -6912 ]
kmalloc-128 1864 [* 128 = 238592 ]
kmalloc-64 4685 [* 64 = 299840 ]
kmalloc-32 -72 [* 32 = -2304 ]
kmalloc-16 256 [* 16 = 4096 ]
total = 721352
Free memory dropped by 12,084 kB
Available memory dropped by 11,976 kB
Total slab additions in size: 721,352 bytes
That's over 2 MB in savings per instance for free and available memory,
and over 1 MB in savings per instance of slab memory.
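The savings can be re-derived from the two measurements. The sketch below only subtracts the figures quoted above; the enum names and saved() are made up for the illustration.

```c
#include <assert.h>

/* Figures copied from the two measurements above. */
enum {
	OLD_MEMFREE_DROP_KB  = 14360,
	NEW_MEMFREE_DROP_KB  = 12084,
	OLD_MEMAVAIL_DROP_KB = 14260,
	NEW_MEMAVAIL_DROP_KB = 11976,
	OLD_SLAB_ADD_BYTES   = 1771032,
	NEW_SLAB_ADD_BYTES   = 721352,
};

/* Savings: how much less one instance costs after the change. */
static long saved(long old_cost, long new_cost)
{
	assert(old_cost >= new_cost);
	return old_cost - new_cost;
}
```

That gives 2276 kB for MemFree, 2284 kB for MemAvailable, and 1,049,680 bytes for the slabs.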
Link: https://lore.kernel.org/linux-trace-kernel/20231003184059.4924468e@gandalf.local.home
Link: https://lore.kernel.org/linux-trace-kernel/20231004165007.43d79161@gandalf.local.home
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ajay Kaher <akaher@vmware.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-10-04 20:50:07 +00:00
|
|
|
struct eventfs_inode *event_dir;
|
2015-09-30 18:27:31 +00:00
|
|
|
struct trace_options *topts;
|
2012-05-04 03:09:03 +00:00
|
|
|
struct list_head systems;
|
|
|
|
struct list_head events;
|
2018-05-09 18:17:48 +00:00
|
|
|
struct trace_event_file *trace_marker_file;
|
2013-08-08 16:47:45 +00:00
|
|
|
cpumask_var_t tracing_cpumask; /* only trace on set CPUs */
|
tracing: Introduce pipe_cpumask to avoid race on trace_pipes
There is a race when the main trace_pipe and the per_cpu trace_pipes are
read concurrently with splice_read, which results in the data read out
being different from what was actually written.
As suggested by Steven:
> I believe we should add a ref count to trace_pipe and the per_cpu
> trace_pipes, where if they are opened, nothing else can read it.
>
> Opening trace_pipe locks all per_cpu ref counts, if any of them are
> open, then the trace_pipe open will fail (and releases any ref counts
> it had taken).
>
> Opening a per_cpu trace_pipe will up the ref count for just that
> CPU buffer. This will allow multiple tasks to read different per_cpu
> trace_pipe files, but will prevent the main trace_pipe file from
> being opened.
But because we only need to know whether per_cpu trace_pipe is open or
not, using a cpumask instead of using ref count may be easier.
After this patch, users will find that:
- Main trace_pipe can be opened by only one user, and if it is
opened, all per_cpu trace_pipes cannot be opened;
- Per_cpu trace_pipes can be opened by multiple users, but each per_cpu
trace_pipe can only be opened by one user. And if one of them is
opened, main trace_pipe cannot be opened.
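The two rules above can be modelled with a plain bitmask. This is a user-space sketch, not the patch itself: NR_CPUS_SKETCH, ALL_CPUS_SKETCH and the sketch_pipe_*() helpers are hypothetical, and a real implementation would also have to serialize these updates with a lock.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define NR_CPUS_SKETCH	8	/* hypothetical CPU count */
#define ALL_CPUS_SKETCH	(-1)	/* hypothetical marker for the main trace_pipe */

static const uint32_t full_mask = (1u << NR_CPUS_SKETCH) - 1;

/* Returns 0 on success, -EBUSY if the open conflicts with an earlier one. */
static int sketch_pipe_open(uint32_t *mask, int cpu)
{
	if (cpu == ALL_CPUS_SKETCH) {
		if (*mask != 0)		/* some pipe is already open */
			return -EBUSY;
		*mask = full_mask;	/* main pipe claims every CPU */
		return 0;
	}
	assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
	if (*mask & (1u << cpu))	/* this CPU, or the main pipe, is open */
		return -EBUSY;
	*mask |= 1u << cpu;
	return 0;
}

static void sketch_pipe_close(uint32_t *mask, int cpu)
{
	if (cpu == ALL_CPUS_SKETCH)
		*mask = 0;
	else
		*mask &= ~(1u << cpu);
}
```

A single mask is enough because the only question at open time is whether a given CPU's pipe is already taken.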
Link: https://lore.kernel.org/linux-trace-kernel/20230818022645.1948314-1-zhengyejian1@huawei.com
Suggested-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-08-18 02:26:45 +00:00
|
|
|
/* one per_cpu trace_pipe can be opened by only one user */
|
|
|
|
cpumask_var_t pipe_cpumask;
|
2013-03-06 20:27:24 +00:00
|
|
|
int ref;
|
2020-06-30 03:45:56 +00:00
|
|
|
int trace_ref;
|
2013-11-08 01:08:58 +00:00
|
|
|
#ifdef CONFIG_FUNCTION_TRACER
|
|
|
|
struct ftrace_ops *ops;
|
2016-04-22 22:11:33 +00:00
|
|
|
struct trace_pid_list __rcu *function_pids;
|
2020-03-20 03:19:06 +00:00
|
|
|
struct trace_pid_list __rcu *function_no_pids;
|
2024-06-03 19:07:12 +00:00
|
|
|
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
|
|
|
|
struct fgraph_ops *gops;
|
|
|
|
#endif
|
2017-04-05 17:12:55 +00:00
|
|
|
#ifdef CONFIG_DYNAMIC_FTRACE
|
2017-06-23 19:26:26 +00:00
|
|
|
/* All of these are protected by the ftrace_lock */
|
2017-04-05 17:12:55 +00:00
|
|
|
struct list_head func_probes;
|
2017-06-23 19:26:26 +00:00
|
|
|
struct list_head mod_trace;
|
|
|
|
struct list_head mod_notrace;
|
2017-04-05 17:12:55 +00:00
|
|
|
#endif
|
2013-11-08 01:08:58 +00:00
|
|
|
/* function tracing enabled */
|
|
|
|
int function_enabled;
|
|
|
|
#endif
|
2021-03-16 16:41:05 +00:00
|
|
|
int no_filter_buffering_ref;
|
2018-01-16 02:51:56 +00:00
|
|
|
struct list_head hist_vars;
|
tracing: Add conditional snapshot
Currently, tracing snapshots are context-free - they capture the ring
buffer contents at the time the tracing_snapshot() function was
invoked, and nothing else. Additionally, they're always taken
unconditionally - the calling code can decide whether or not to take a
snapshot, but the data used to make that decision is kept separately
from the snapshot itself.
This change adds the ability to associate with each trace instance
some user data, along with an 'update' function that can use that data
to determine whether or not to actually take a snapshot. The update
function can then update that data along with any other state (as part
of the data presumably), if warranted.
Because snapshots are 'global' per-instance, only one user can enable
and use a conditional snapshot for any given trace instance. To
enable a conditional snapshot (see details in the function and data
structure comments), the user calls tracing_snapshot_cond_enable().
Similarly, to disable a conditional snapshot and free it up for other
users, tracing_snapshot_cond_disable() should be called.
To actually initiate a conditional snapshot, tracing_snapshot_cond()
should be called. tracing_snapshot_cond() will invoke the update()
callback, allowing the user to decide whether or not to actually take
the snapshot and update the user-defined data associated with the
snapshot. If the callback returns 'true', tracing_snapshot_cond()
will then actually take the snapshot and return.
This scheme allows for flexibility in snapshot implementations - for
example, by implementing slightly different update() callbacks,
snapshots can be taken in situations where the user is only interested
in taking a snapshot when a new maximum is hit versus when a value
changes in any way at all. Future patches will demonstrate both
cases.
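The update() scheme can be sketched in user space as below. struct cond_snap_sketch and the helper names are hypothetical and do not mirror the kernel's struct cond_snapshot; a counter stands in for the actual buffer swap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model of a conditional snapshot. */
struct cond_snap_sketch {
	void *cond_data;			/* user data kept with the instance */
	bool (*update)(void *cond_data, void *new_value);
	unsigned int snapshots_taken;		/* stands in for the buffer swap */
};

/* Take a snapshot only when a new maximum is hit. */
static bool update_on_new_max(void *cond_data, void *new_value)
{
	unsigned long *max = cond_data;
	unsigned long val = *(unsigned long *)new_value;

	if (val <= *max)
		return false;
	*max = val;
	return true;
}

/* Take a snapshot whenever the value changes at all. */
static bool update_on_change(void *cond_data, void *new_value)
{
	unsigned long *last = cond_data;
	unsigned long val = *(unsigned long *)new_value;

	if (val == *last)
		return false;
	*last = val;
	return true;
}

/* Ask update() first; only snapshot when it returns true. */
static bool snapshot_cond(struct cond_snap_sketch *cs, void *new_value)
{
	assert(cs != NULL && cs->update != NULL);
	if (!cs->update(cs->cond_data, new_value))
		return false;
	cs->snapshots_taken++;
	return true;
}
```

The two callbacks differ only in their comparison, which is the flexibility the description refers to.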
Link: http://lkml.kernel.org/r/1bea07828d5fd6864a585f83b1eed47ce097eb45.1550100284.git.tom.zanussi@linux.intel.com
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-02-13 23:42:45 +00:00
|
|
|
#ifdef CONFIG_TRACER_SNAPSHOT
|
|
|
|
struct cond_snapshot *cond_snapshot;
|
|
|
|
#endif
|
2021-04-15 18:18:51 +00:00
|
|
|
struct trace_func_repeats __percpu *last_func_repeats;
|
2023-09-06 09:18:37 +00:00
|
|
|
/*
|
|
|
|
* On boot up, the ring buffer is set to the minimum size, so that
|
|
|
|
* we do not waste memory on systems that are not using tracing.
|
|
|
|
*/
|
|
|
|
bool ring_buffer_expanded;
|
2008-05-12 19:20:42 +00:00
|
|
|
};
|
|
|
|
|
2012-05-04 03:09:03 +00:00
|
|
|
enum {
|
|
|
|
TRACE_ARRAY_FL_GLOBAL = (1 << 0)
|
|
|
|
};
|
|
|
|
|
|
|
|
extern struct list_head ftrace_trace_arrays;
|
|
|
|
|
2013-07-02 02:37:54 +00:00
|
|
|
extern struct mutex trace_types_lock;
|
|
|
|
|
2013-07-02 19:30:53 +00:00
|
|
|
extern int trace_array_get(struct trace_array *tr);
|
tracing: Add tracing_check_open_get_tr()
Currently, most files in the tracefs directory test if tracing_disabled is
set. If so, they should return -ENODEV. tracing_disabled is set when
tracing is found to be broken. Originally this was done in case the ring
buffer was found to be corrupted, and we wanted to prevent reading it from
crashing the kernel. But it is also set if a tracing selftest fails on
boot. It's a one way switch. That is, once it is triggered, tracing is
disabled until reboot.
As most tracefs files can also be used by instances in the tracefs
directory, they need to be carefully done. Each instance has a trace_array
associated to it, and when the instance is removed, the trace_array is
freed. But if an instance is opened with a reference to the trace_array,
then it requires looking up the trace_array to get its ref counter (as there
could be a race with it being deleted and the open itself). Once it is
found, a reference is added to prevent the instance from being removed (and
the trace_array associated with it freed).
Combine the two checks (tracing_disabled and trace_array_get()) into a
single helper function. This will also make it easier to add lockdown to
tracefs later.
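The shape of such a combined helper can be sketched in user space as follows. The names (instance_sketch, disabled_sketch, check_open_get()) are hypothetical; the "registered" flag stands in for looking the instance up on the list of live trace arrays.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct instance_sketch {
	bool registered;	/* still on the list of live instances? */
	int ref;		/* reference count that blocks removal */
};

static bool disabled_sketch;	/* models the one way "disabled" switch */

/* Take a reference only if the instance still exists. */
static int instance_get(struct instance_sketch *inst)
{
	assert(inst != NULL);
	if (!inst->registered)
		return -ENODEV;
	inst->ref++;
	return 0;
}

/* Both checks in one place: global switch first, then the reference. */
static int check_open_get(struct instance_sketch *inst)
{
	if (disabled_sketch)
		return -ENODEV;
	if (inst && instance_get(inst) < 0)
		return -ENODEV;
	return 0;
}
```

Callers then have a single failure path on open, and a later policy check could be added in the same helper.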
Link: http://lkml.kernel.org/r/20191011135458.7399da44@gandalf.local.home
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-10-11 21:39:57 +00:00
|
|
|
extern int tracing_check_open_get_tr(struct trace_array *tr);
|
2020-01-29 18:59:21 +00:00
|
|
|
extern struct trace_array *trace_array_find(const char *instance);
|
|
|
|
extern struct trace_array *trace_array_find_get(const char *instance);
|
2013-07-02 19:30:53 +00:00
|
|
|
|
2021-03-16 16:41:07 +00:00
|
|
|
extern u64 tracing_event_time_stamp(struct trace_buffer *buffer, struct ring_buffer_event *rbe);
|
2021-03-16 16:41:05 +00:00
|
|
|
extern int tracing_set_filter_buffering(struct trace_array *tr, bool set);
|
2018-01-16 02:52:07 +00:00
|
|
|
extern int tracing_set_clock(struct trace_array *tr, const char *clockstr);
|
2018-01-16 02:51:39 +00:00
|
|
|
|
2018-01-16 02:51:48 +00:00
|
|
|
extern bool trace_clock_in_ns(struct trace_array *tr);
|
|
|
|
|
2012-05-04 03:09:03 +00:00
|
|
|
/*
|
|
|
|
* The global tracer (top) should be the first trace array added,
|
|
|
|
* but we check the flag anyway.
|
|
|
|
*/
|
|
|
|
static inline struct trace_array *top_trace_array(void)
|
|
|
|
{
|
|
|
|
struct trace_array *tr;
|
|
|
|
|
2014-06-10 17:53:50 +00:00
|
|
|
if (list_empty(&ftrace_trace_arrays))
|
2014-06-05 22:35:17 +00:00
|
|
|
return NULL;
|
|
|
|
|
2012-05-04 03:09:03 +00:00
|
|
|
tr = list_entry(ftrace_trace_arrays.prev,
|
|
|
|
typeof(*tr), list);
|
|
|
|
WARN_ON(!(tr->flags & TRACE_ARRAY_FL_GLOBAL));
|
|
|
|
return tr;
|
|
|
|
}
|
|
|
|
|
2008-10-01 14:52:51 +00:00
|
|
|
#define FTRACE_CMP_TYPE(var, type) \
|
|
|
|
__builtin_types_compatible_p(typeof(var), type *)
|
|
|
|
|
|
|
|
#undef IF_ASSIGN
|
2019-09-26 16:22:59 +00:00
|
|
|
#define IF_ASSIGN(var, entry, etype, id) \
|
|
|
|
if (FTRACE_CMP_TYPE(var, etype)) { \
|
|
|
|
var = (typeof(var))(entry); \
|
|
|
|
WARN_ON(id != 0 && (entry)->type != id); \
|
|
|
|
break; \
|
2008-10-01 14:52:51 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
/* Will cause compile errors if type is not found. */
|
|
|
|
extern void __ftrace_bad_type(void);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* The trace_assign_type is a verifier that the entry type is
|
|
|
|
* the same as the type being assigned. To add new types simply
|
|
|
|
* add a line with the following format:
|
|
|
|
*
|
|
|
|
* IF_ASSIGN(var, ent, type, id);
|
|
|
|
*
|
|
|
|
* Where "type" is the trace type that includes the trace_entry
|
|
|
|
* as the "ent" item. And "id" is the trace identifier that is
|
|
|
|
* used in the trace_type enum.
|
|
|
|
*
|
|
|
|
* If the type can have more than one id, then use zero.
|
|
|
|
*/
|
|
|
|
#define trace_assign_type(var, ent) \
|
|
|
|
do { \
|
|
|
|
IF_ASSIGN(var, ent, struct ftrace_entry, TRACE_FN); \
|
|
|
|
IF_ASSIGN(var, ent, struct ctx_switch_entry, 0); \
|
|
|
|
IF_ASSIGN(var, ent, struct stack_entry, TRACE_STACK); \
|
2008-11-22 11:28:47 +00:00
|
|
|
IF_ASSIGN(var, ent, struct userstack_entry, TRACE_USER_STACK);\
|
2008-10-01 14:52:51 +00:00
|
|
|
IF_ASSIGN(var, ent, struct print_entry, TRACE_PRINT); \
|
2009-03-12 17:24:49 +00:00
|
|
|
IF_ASSIGN(var, ent, struct bprint_entry, TRACE_BPRINT); \
|
2013-03-09 02:02:34 +00:00
|
|
|
IF_ASSIGN(var, ent, struct bputs_entry, TRACE_BPUTS); \
|
2016-06-23 16:45:36 +00:00
|
|
|
IF_ASSIGN(var, ent, struct hwlat_entry, TRACE_HWLAT); \
|
trace: Add osnoise tracer
In the context of high-performance computing (HPC), the Operating System
Noise (*osnoise*) refers to the interference experienced by an application
due to activities inside the operating system. In the context of Linux,
NMIs, IRQs, SoftIRQs, and any other system thread can cause noise to the
system. Moreover, hardware-related jobs can also cause noise, for example,
via SMIs.
The osnoise tracer leverages the hwlat_detector by running a similar
loop with preemption, SoftIRQs and IRQs enabled, thus allowing all
the sources of *osnoise* during its execution. Using the same approach
of hwlat, osnoise takes note of the entry and exit point of any
source of interferences, increasing a per-cpu interference counter. The
osnoise tracer also saves an interference counter for each source of
interference. The interference counter for NMI, IRQs, SoftIRQs, and
threads is increased anytime the tool observes these interferences' entry
events. When a noise happens without any interference from the operating
system level, the hardware noise counter increases, pointing to a
hardware-related noise. In this way, osnoise can account for any
source of interference. At the end of the period, the osnoise tracer
prints the sum of all noise, the max single noise, the percentage of CPU
available for the thread, and the counters for the noise sources.
Usage
Write the ASCII text "osnoise" into the current_tracer file of the
tracing system (generally mounted at /sys/kernel/tracing).
For example::
[root@f32 ~]# cd /sys/kernel/tracing/
[root@f32 tracing]# echo osnoise > current_tracer
It is possible to follow the trace by reading the trace file::
[root@f32 tracing]# cat trace
# tracer: osnoise
#
#                                _-----=> irqs-off
#                               / _----=> need-resched
#                              | / _---=> hardirq/softirq
#                              || / _--=> preempt-depth                            MAX
#                              || /                                             SINGLE   Interference counters:
#                              ||||               RUNTIME      NOISE  % OF CPU   NOISE   +-----------------------------+
#           TASK-PID      CPU# ||||    TIMESTAMP    IN US      IN US AVAILABLE   IN US     HW    NMI    IRQ   SIRQ THREAD
#              | |         |   ||||        |         |           |       |         |        |     |      |      |      |
           <...>-859     [000] ....    81.637220: 1000000        190  99.98100       9     18      0   1007     18      1
           <...>-860     [001] ....    81.638154: 1000000        656  99.93440      74     23      0   1006     16      3
           <...>-861     [002] ....    81.638193: 1000000       5675  99.43250     202      6      0   1013     25     21
           <...>-862     [003] ....    81.638242: 1000000        125  99.98750      45      1      0   1011     23      0
           <...>-863     [004] ....    81.638260: 1000000       1721  99.82790     168      7      0   1002     49     41
           <...>-864     [005] ....    81.638286: 1000000        263  99.97370      57      6      0   1006     26      2
           <...>-865     [006] ....    81.638302: 1000000        109  99.98910      21      3      0   1006     18      1
           <...>-866     [007] ....    81.638326: 1000000       7816  99.21840     107      8      0   1016     39     19
In addition to the regular trace fields (from TASK-PID to TIMESTAMP), the
tracer prints a message at the end of each period for each CPU that is
running an osnoise/CPU thread. The osnoise specific fields report:
- The RUNTIME IN US reports the amount of time in microseconds that
the osnoise thread kept looping reading the time.
- The NOISE IN US reports the sum of noise in microseconds observed
by the osnoise tracer during the associated runtime.
- The % OF CPU AVAILABLE reports the percentage of CPU available for
the osnoise thread during the runtime window.
- The MAX SINGLE NOISE IN US reports the maximum single noise observed
during the runtime window.
- The Interference counters display how many times each of the respective
interferences happened during the runtime window.
Note that the example above shows a high number of HW noise samples.
The reason is that this sample was taken on a virtual machine,
and the host interference is detected as a hardware interference.
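The % OF CPU AVAILABLE column in the sample output is consistent with (runtime - noise) / runtime. The sketch below checks that relation in integer arithmetic; the formula is inferred from the sample rows rather than taken from the tracer's source, and cpu_available_e5() is a made-up name.

```c
#include <assert.h>
#include <stdint.h>

/* Percentage of CPU available, scaled by 100000 (five decimal places). */
static uint64_t cpu_available_e5(uint64_t runtime_us, uint64_t noise_us)
{
	assert(runtime_us > 0 && noise_us <= runtime_us);
	return (runtime_us - noise_us) * 10000000ULL / runtime_us;
}
```

For the first row, a runtime of 1000000 us with 190 us of noise gives 9998100, that is 99.98100%.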
Tracer options
The tracer has a set of options inside the osnoise directory, they are:
- osnoise/cpus: CPUs at which an osnoise thread will execute.
- osnoise/period_us: the period of the osnoise thread.
- osnoise/runtime_us: how long an osnoise thread will look for noise.
- osnoise/stop_tracing_us: stop the system tracing if a single noise
higher than the configured value happens. Writing 0 disables this
option.
- osnoise/stop_tracing_total_us: stop the system tracing if total noise
higher than the configured value happens. Writing 0 disables this
option.
- tracing_threshold: the minimum delta between two time() reads to be
considered as noise, in us. When set to 0, the default value will
be used, which is currently 5 us.
Additional Tracing
In addition to the tracer, a set of tracepoints were added to
facilitate the identification of the osnoise source.
- osnoise:sample_threshold: printed anytime a noise is higher than
the configurable tolerance_ns.
- osnoise:nmi_noise: noise from NMI, including the duration.
- osnoise:irq_noise: noise from an IRQ, including the duration.
- osnoise:softirq_noise: noise from a SoftIRQ, including the
duration.
- osnoise:thread_noise: noise from a thread, including the duration.
Note that all the values are *net values*. For example, if while osnoise
is running, another thread preempts the osnoise thread, it will start a
thread_noise duration at the start. Then, an IRQ takes place, preempting
the thread_noise, starting an irq_noise. When the IRQ ends its execution,
it will compute its duration, and this duration will be subtracted from
the thread_noise, in such a way as to avoid the double accounting of the
IRQ execution. This logic is valid for all sources of noise.
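The net value accounting can be illustrated with a small user-space model. It is one possible way to get the same result, with hypothetical names (noise_sketch, noise_begin(), noise_add_nested(), noise_end()), and is not a description of how the tracer is implemented.

```c
#include <assert.h>
#include <stdint.h>

/* One in-flight interference (thread, IRQ, ...). */
struct noise_sketch {
	uint64_t start_ns;
	uint64_t nested_ns;	/* time used by interferences that preempted it */
};

static void noise_begin(struct noise_sketch *n, uint64_t now_ns)
{
	n->start_ns = now_ns;
	n->nested_ns = 0;
}

/* Charge the duration of a nested interference to the one it preempted. */
static void noise_add_nested(struct noise_sketch *n, uint64_t duration_ns)
{
	n->nested_ns += duration_ns;
}

/* Net duration: wall time minus everything that ran in between. */
static uint64_t noise_end(const struct noise_sketch *n, uint64_t now_ns)
{
	uint64_t gross = now_ns - n->start_ns;

	assert(gross >= n->nested_ns);
	return gross - n->nested_ns;
}
```

With a thread running from 1000 ns to 4000 ns and an IRQ from 2000 ns to 2500 ns inside it, the IRQ is reported as 500 ns and the thread as 2500 ns, so the IRQ time is counted once.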
Here is one example of the usage of these tracepoints::
osnoise/8-961 [008] d.h. 5789.857532: irq_noise: local_timer:236 start 5789.857529929 duration 1845 ns
osnoise/8-961 [008] dNh. 5789.858408: irq_noise: local_timer:236 start 5789.858404871 duration 2848 ns
migration/8-54 [008] d... 5789.858413: thread_noise: migration/8:54 start 5789.858409300 duration 3068 ns
osnoise/8-961 [008] .... 5789.858413: sample_threshold: start 5789.858404555 duration 8723 ns interferences 2
In this example, a noise sample of 8 microseconds was reported in the last
line, pointing to two interferences. Looking backward in the trace, the
two previous entries were about the migration thread running after a
timer IRQ execution. The first event is not part of the noise because
it took place one millisecond before.
It is worth noting that the sum of the durations reported in the
tracepoints is smaller than the eight us reported in the sample_threshold.
The reason lies in the overhead of the entry and exit code that happens
before and after any interference execution. This justifies the dual
approach: measuring thread and tracing.
Link: https://lkml.kernel.org/r/e649467042d60e7b62714c9c6751a56299d15119.1624372313.git.bristot@redhat.com
Cc: Phil Auld <pauld@redhat.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Kate Carcia <kcarcia@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexandre Chartre <alexandre.chartre@oracle.com>
Cc: Clark Willaims <williams@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
[
Made the following functions static:
trace_irqentry_callback()
trace_irqexit_callback()
trace_intel_irqentry_callback()
trace_intel_irqexit_callback()
Added to include/trace.h:
osnoise_arch_register()
osnoise_arch_unregister()
Fixed define logic for LATENCY_FS_NOTIFY
Reported-by: kernel test robot <lkp@intel.com>
]
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-06-22 14:42:27 +00:00
|
|
|
IF_ASSIGN(var, ent, struct osnoise_entry, TRACE_OSNOISE);\
|
2021-06-22 14:42:28 +00:00
|
|
|
IF_ASSIGN(var, ent, struct timerlat_entry, TRACE_TIMERLAT);\
|
2016-07-06 19:25:08 +00:00
|
|
|
IF_ASSIGN(var, ent, struct raw_data_entry, TRACE_RAW_DATA);\
|
2008-10-01 14:52:51 +00:00
|
|
|
IF_ASSIGN(var, ent, struct trace_mmiotrace_rw, \
|
|
|
|
TRACE_MMIO_RW); \
|
|
|
|
IF_ASSIGN(var, ent, struct trace_mmiotrace_map, \
|
|
|
|
TRACE_MMIO_MAP); \
|
2008-11-12 20:24:24 +00:00
|
|
|
IF_ASSIGN(var, ent, struct trace_branch, TRACE_BRANCH); \
|
2008-11-25 23:57:25 +00:00
|
|
|
IF_ASSIGN(var, ent, struct ftrace_graph_ent_entry, \
|
|
|
|
TRACE_GRAPH_ENT); \
|
|
|
|
IF_ASSIGN(var, ent, struct ftrace_graph_ret_entry, \
|
|
|
|
TRACE_GRAPH_RET); \
|
2021-04-15 18:18:50 +00:00
|
|
|
IF_ASSIGN(var, ent, struct func_repeats_entry, \
|
|
|
|
TRACE_FUNC_REPEATS); \
|
2008-10-01 14:52:51 +00:00
|
|
|
__ftrace_bad_type(); \
|
|
|
|
} while (0)
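The dispatch used by trace_assign_type() can be tried outside the kernel with GCC or Clang. The sketch below is a reduced model with hypothetical SK_ and sk_ names and two made-up entry types; it substitutes assert() for WARN_ON() and abort() for the call to __ftrace_bad_type().

```c
#include <assert.h>
#include <stdlib.h>

struct base_entry { int type; };
struct fn_entry   { struct base_entry ent; unsigned long ip; };
struct print_ent  { struct base_entry ent; const char *msg; };

enum { SK_FN = 1, SK_PRINT = 2 };

/* True at compile time when var is a pointer to the given type. */
#define SK_CMP_TYPE(var, type) \
	__builtin_types_compatible_p(__typeof__(var), type *)

/* Only the branch whose type matches var survives; it leaves the loop. */
#define SK_IF_ASSIGN(var, entry, etype, id)			\
	if (SK_CMP_TYPE(var, etype)) {				\
		var = (__typeof__(var))(entry);			\
		assert((id) == 0 || (entry)->type == (id));	\
		break;						\
	}

#define sk_assign_type(var, ent)					\
	do {								\
		SK_IF_ASSIGN(var, ent, struct fn_entry, SK_FN);		\
		SK_IF_ASSIGN(var, ent, struct print_ent, SK_PRINT);	\
		abort();	/* no listed type matched var */	\
	} while (0)
```

The do/while (0) wrapper exists so that break can skip the remaining branches and the fallback once one type has matched.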
|
2008-09-29 18:18:34 +00:00
|
|
|
|
2008-11-17 18:23:42 +00:00
|
|
|
/*
|
|
|
|
* An option specific to a tracer. This is a boolean value.
|
|
|
|
* The bit is the bit index that sets its value on the
|
|
|
|
* flags value in struct tracer_flags.
|
|
|
|
*/
|
|
|
|
struct tracer_opt {
|
2009-03-06 16:52:03 +00:00
|
|
|
const char *name; /* Will appear on the trace_options file */
|
|
|
|
u32 bit; /* Mask assigned in val field in tracer_flags */
|
2008-11-17 18:23:42 +00:00
|
|
|
};
|
|
|
|
|
|
|
|
/*
|
|
|
|
* The set of specific options for a tracer. Your tracer
|
|
|
|
* has to set the initial value of the flags val.
|
|
|
|
*/
|
|
|
|
struct tracer_flags {
|
|
|
|
u32 val;
|
2009-03-06 16:52:03 +00:00
|
|
|
struct tracer_opt *opts;
|
2016-03-08 13:37:01 +00:00
|
|
|
struct tracer *trace;
|
2008-11-17 18:23:42 +00:00
|
|
|
};
|
|
|
|
|
|
|
|
/* Makes it easier to define a tracer opt */
|
|
|
|
#define TRACER_OPT(s, b) .name = #s, .bit = b
|
|
|
|
|
2009-01-08 18:03:56 +00:00
|
|
|
|
2015-09-29 21:31:55 +00:00
|
|
|
struct trace_option_dentry {
|
|
|
|
struct tracer_opt *opt;
|
|
|
|
struct tracer_flags *flags;
|
|
|
|
struct trace_array *tr;
|
|
|
|
struct dentry *entry;
|
|
|
|
};
|
|
|
|
|
2009-02-11 01:25:00 +00:00
|
|
|
/**
|
2015-01-20 17:13:40 +00:00
|
|
|
* struct tracer - a specific tracer and its callbacks to interact with tracefs
|
2009-02-11 01:25:00 +00:00
|
|
|
* @name: the name chosen to select it on the available_tracers file
|
|
|
|
* @init: called when one switches to this tracer (echo name > current_tracer)
|
|
|
|
* @reset: called when one switches to another tracer
|
2015-12-22 14:44:33 +00:00
|
|
|
* @start: called when tracing is unpaused (echo 1 > tracing_on)
|
|
|
|
* @stop: called when tracing is paused (echo 0 > tracing_on)
|
2014-07-18 11:17:27 +00:00
|
|
|
* @update_thresh: called when tracing_thresh is updated
|
2009-02-11 01:25:00 +00:00
|
|
|
* @open: called when the trace file is opened
|
|
|
|
* @pipe_open: called when the trace_pipe file is opened
|
|
|
|
* @close: called when the trace file is released
|
2009-12-07 14:06:24 +00:00
|
|
|
* @pipe_close: called when the trace_pipe file is released
|
2009-02-11 01:25:00 +00:00
|
|
|
* @read: override the default read callback on trace_pipe
|
|
|
|
* @splice_read: override the default splice_read callback on trace_pipe
|
|
|
|
* @selftest: selftest to run on boot (see trace_selftest.c)
|
|
|
|
* @print_header: override the first lines that describe your columns
|
|
|
|
* @print_line: callback that prints a trace
|
|
|
|
* @set_flag: signals one of your private flags changed (trace_options file)
|
|
|
|
* @flags: your private flags
|
2008-05-12 19:20:42 +00:00
|
|
|
*/
|
|
|
|
struct tracer {
|
|
|
|
const char *name;
|
2008-11-16 04:57:26 +00:00
|
|
|
int (*init)(struct trace_array *tr);
|
2008-05-12 19:20:42 +00:00
|
|
|
void (*reset)(struct trace_array *tr);
|
ftrace: restructure tracing start/stop infrastructure
Impact: change where tracing is started up and stopped
Currently, when a new tracer is selected via echo'ing a tracer name into
the current_tracer file, the startup is only done if tracing_enabled is
set to one. If tracing_enabled is changed to zero (by echo'ing 0 into
the tracing_enabled file) a full shutdown is performed.
The full startup and shutdown of a tracer can be expensive and the
user can lose traces when echo'ing 0 into the tracing_enabled file,
because the process takes too long. There can also be places that
the user would like to start and stop the tracer several times and
doing the full startup and shutdown of a tracer might be too expensive.
This patch performs the full startup and shutdown when a tracer is
selected. It also adds a way to do a quick start or stop of a tracer.
The quick version is just a flag that prevents the tracing from
taking place, but the overhead of the code is still there.
For example, the startup of a tracer may enable tracepoints, or enable
the function tracer. The stop and start will just set a flag to
have the tracer ignore the calls when the tracepoint or function trace
is called. The overhead of the tracer may still be present when
the tracer is stopped, but no tracing will occur. Setting the tracer
to the 'nop' tracer (or any other tracer) will perform the shutdown
of the tracer which will disable the tracepoint or disable the
function tracer.
The tracing_enabled file will simply start or stop tracing.
This change is all internal. The end result for the user should be the same
as before. If tracing_enabled is not set, no trace will happen.
If tracing_enabled is set, then the trace will happen. The tracing_enabled
variable is static between tracers. Enabling tracing_enabled and
going to another tracer will keep tracing_enabled enabled. Same
is true with disabling tracing_enabled.
This patch will now provide a fast start/stop method to the users
for enabling or disabling tracing.
Note: There were two methods to the struct tracer that were never
used: The methods start and stop. These were to be used as a hook
to the reading of the trace output, but ended up not being
necessary. These two methods are now used to enable the start
and stop of each tracer, in case the tracer needs to do more than
just not write into the buffer. For example, the irqsoff tracer
must stop recording max latencies when tracing is stopped.
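The difference between the full startup/shutdown and the quick start/stop can be modelled as below. This is a user-space sketch with hypothetical names (tracer_sketch and the sketch_*() helpers); "hooked" stands for the expensive state such as enabled tracepoints.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct tracer_sketch {
	bool hooked;		/* expensive state: hooks are attached */
	bool stopped;		/* cheap flag toggled by quick start/stop */
	unsigned long events;	/* events actually recorded */
};

/* Full startup and shutdown, done when a tracer is selected or replaced. */
static void sketch_select(struct tracer_sketch *t)
{
	t->hooked = true;
	t->stopped = false;
}

static void sketch_deselect(struct tracer_sketch *t)
{
	t->hooked = false;
}

/* Quick versions: only flip a flag, the hooks stay in place. */
static void sketch_stop(struct tracer_sketch *t)  { t->stopped = true; }
static void sketch_start(struct tracer_sketch *t) { t->stopped = false; }

/* Called from a hook; the call still costs something while stopped. */
static void sketch_hit(struct tracer_sketch *t)
{
	assert(t != NULL);
	if (!t->hooked || t->stopped)
		return;
	t->events++;
}
```

Stopping and restarting this way never detaches the hooks, which is why it is fast but does not remove the overhead.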
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-05 21:05:44 +00:00
|
|
|
void (*start)(struct trace_array *tr);
|
|
|
|
void (*stop)(struct trace_array *tr);
|
2014-07-18 11:17:27 +00:00
|
|
|
int (*update_thresh)(struct trace_array *tr);
|
2008-05-12 19:20:42 +00:00
|
|
|
void (*open)(struct trace_iterator *iter);
|
2008-05-12 19:21:01 +00:00
|
|
|
void (*pipe_open)(struct trace_iterator *iter);
|
2008-05-12 19:20:42 +00:00
|
|
|
void (*close)(struct trace_iterator *iter);
|
2009-12-07 14:06:24 +00:00
|
|
|
void (*pipe_close)(struct trace_iterator *iter);
|
2008-05-12 19:21:01 +00:00
|
|
|
ssize_t (*read)(struct trace_iterator *iter,
|
|
|
|
struct file *filp, char __user *ubuf,
|
|
|
|
size_t cnt, loff_t *ppos);
|
2009-02-09 06:15:56 +00:00
|
|
|
ssize_t (*splice_read)(struct trace_iterator *iter,
|
|
|
|
struct file *filp,
|
|
|
|
loff_t *ppos,
|
|
|
|
struct pipe_inode_info *pipe,
|
|
|
|
size_t len,
|
|
|
|
unsigned int flags);
|
2008-05-12 19:20:44 +00:00
|
|
|
#ifdef CONFIG_FTRACE_STARTUP_TEST
|
|
|
|
int (*selftest)(struct tracer *trace,
|
|
|
|
struct trace_array *tr);
|
|
|
|
#endif
|
2008-11-25 08:12:31 +00:00
|
|
|
void (*print_header)(struct seq_file *m);
|
2008-09-29 18:18:34 +00:00
|
|
|
enum print_line_t (*print_line)(struct trace_iterator *iter);
|
2008-11-17 18:23:42 +00:00
|
|
|
/* If you handled the flag setting, return 0 */
|
2014-01-10 16:13:54 +00:00
|
|
|
int (*set_flag)(struct trace_array *tr,
|
|
|
|
u32 old_flags, u32 bit, int set);
|
2013-03-14 19:03:53 +00:00
|
|
|
/* Return 0 if OK with change, else return non-zero */
|
2014-01-10 22:51:01 +00:00
|
|
|
int (*flag_changed)(struct trace_array *tr,
|
2013-03-14 19:03:53 +00:00
|
|
|
u32 mask, int set);
|
2008-05-12 19:20:42 +00:00
|
|
|
struct tracer *next;
|
2009-03-06 16:52:03 +00:00
|
|
|
struct tracer_flags *flags;
|
2014-01-14 13:52:35 +00:00
|
|
|
int enabled;
|
2012-10-02 08:27:10 +00:00
|
|
|
bool print_max;
|
2013-11-07 03:42:48 +00:00
|
|
|
bool allow_instances;
|
tracing: Consolidate max_tr into main trace_array structure
Currently, the way the latency tracers and snapshot feature work
is to have a separate trace_array called "max_tr" that holds the
snapshot buffer. For latency tracers, this snapshot buffer is used
to swap the running buffer with this buffer to save the current max
latency.
The only items needed for the max_tr are really just a copy of the buffer
itself, the per_cpu data pointers, the time_start timestamp that states
when the max latency was triggered, and the cpu that the max latency
was triggered on. All other fields in trace_array are unused by the
max_tr, making the max_tr mostly bloat.
This change removes the max_tr completely, and adds a new structure
called trace_buffer, that holds the buffer pointer, the per_cpu data
pointers, the time_start timestamp, and the cpu where the latency occurred.
The trace_array now has two trace_buffers, one for the normal trace and
one for the max trace or snapshot. By doing this, not only do we remove
the bloat from the max_trace but the instances of traces can now use
their own snapshot feature and not have just the top level global_trace have
the snapshot feature and latency tracers for itself.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-03-05 14:24:35 +00:00
|
|
|
#ifdef CONFIG_TRACER_MAX_TRACE
|
2012-10-02 08:27:10 +00:00
|
|
|
bool use_max_tr;
|
2013-03-05 14:24:35 +00:00
|
|
|
#endif
|
2017-09-11 06:26:35 +00:00
|
|
|
/* True if tracer cannot be enabled in kernel param */
|
|
|
|
bool noboot;
|
2008-05-12 19:20:42 +00:00
|
|
|
};
|
|
|
|
|
2012-06-28 00:46:14 +00:00
|
|
|
static inline struct ring_buffer_iter *
|
|
|
|
trace_buffer_iter(struct trace_iterator *iter, int cpu)
|
|
|
|
{
|
2018-04-08 11:36:31 +00:00
|
|
|
return iter->buffer_iter ? iter->buffer_iter[cpu] : NULL;
|
2012-06-28 00:46:14 +00:00
|
|
|
}
|
|
|
|
|
2009-02-05 20:02:00 +00:00
|
|
|
int tracer_init(struct tracer *t, struct trace_array *tr);
|
ftrace: restructure tracing start/stop infrastructure
Impact: change where tracing is started up and stopped
Currently, when a new tracer is selected via echo'ing a tracer name into
the current_tracer file, the startup is only done if tracing_enabled is
set to one. If tracing_enabled is changed to zero (by echo'ing 0 into
the tracing_enabled file) a full shutdown is performed.
The full startup and shutdown of a tracer can be expensive and the
user can lose out traces when echo'ing in 0 to the tracing_enabled file,
because the process takes too long. There can also be places that
the user would like to start and stop the tracer several times and
doing the full startup and shutdown of a tracer might be too expensive.
This patch performs the full startup and shutdown when a tracer is
selected. It also adds a way to do a quick start or stop of a tracer.
The quick version is just a flag that prevents the tracing from
taking place, but the overhead of the code is still there.
For example, the startup of a tracer may enable tracepoints, or enable
the function tracer. The stop and start will just set a flag to
have the tracer ignore the calls when the tracepoint or function trace
is called. The overhead of the tracer may still be present when
the tracer is stopped, but no tracing will occur. Setting the tracer
to the 'nop' tracer (or any other tracer) will perform the shutdown
of the tracer which will disable the tracepoint or disable the
function tracer.
The tracing_enabled file will simply start or stop tracing.
This change is all internal. The end result for the user should be the same
as before. If tracing_enabled is not set, no trace will happen.
If tracing_enabled is set, then the trace will happen. The tracing_enabled
variable is static between tracers. Enabling tracing_enabled and
going to another tracer will keep tracing_enabled enabled. Same
is true with disabling tracing_enabled.
This patch will now provide a fast start/stop method to the users
for enabling or disabling tracing.
Note: There were two methods to the struct tracer that were never
used: The methods start and stop. These were to be used as a hook
to the reading of the trace output, but ended up not being
necessary. These two methods are now used to enable the start
and stop of each tracer, in case the tracer needs to do more than
just not write into the buffer. For example, the irqsoff tracer
must stop recording max latencies when tracing is stopped.
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
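The split between a full startup/shutdown and a quick start/stop described above can be sketched in a small userspace model. This is only an illustration under stated assumptions: `struct demo_tracer` and every `demo_*` function are hypothetical names, not the kernel's `struct tracer` or its methods.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; not the kernel's struct tracer. */
struct demo_tracer {
	bool hooks_installed;   /* expensive state, set up by the full startup */
	bool stopped;           /* cheap flag, toggled by the quick stop/start */
	unsigned long recorded; /* number of events written so far */
};

/* Full startup: done once, when the tracer is selected. */
static void demo_select(struct demo_tracer *t)
{
	assert(t != NULL);
	t->hooks_installed = true;
	t->stopped = false;
	t->recorded = 0;
}

/* Full shutdown: done when another tracer (e.g. 'nop') is selected. */
static void demo_deselect(struct demo_tracer *t)
{
	t->hooks_installed = false;
}

/* Quick stop/start: only flips a flag, the hooks stay installed. */
static void demo_quick_stop(struct demo_tracer *t)
{
	t->stopped = true;
}

static void demo_quick_start(struct demo_tracer *t)
{
	t->stopped = false;
}

/* Called from a hook; the call overhead remains even while stopped. */
static void demo_hook_fired(struct demo_tracer *t)
{
	if (!t->hooks_installed)
		return;
	if (t->stopped)
		return;
	t->recorded++;
}
```

In this model a quick stop leaves `hooks_installed` set, so the hook is still entered but records nothing, which mirrors the note above that the overhead may remain while no tracing occurs.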
2008-11-05 21:05:44 +00:00
|
|
|
int tracing_is_enabled(void);
|
2020-01-09 23:53:48 +00:00
|
|
|
void tracing_reset_online_cpus(struct array_buffer *buf);
|
2013-03-05 04:26:06 +00:00
|
|
|
void tracing_reset_all_online_cpus(void);
|
2022-11-23 19:25:57 +00:00
|
|
|
void tracing_reset_all_online_cpus_unlocked(void);
|
2008-05-12 19:20:42 +00:00
|
|
|
int tracing_open_generic(struct inode *inode, struct file *filp);
|
2019-10-11 23:12:21 +00:00
|
|
|
int tracing_open_generic_tr(struct inode *inode, struct file *filp);
|
2023-12-19 18:54:16 +00:00
|
|
|
int tracing_release_generic_tr(struct inode *inode, struct file *file);
|
2023-09-07 02:47:12 +00:00
|
|
|
int tracing_open_file_tr(struct inode *inode, struct file *filp);
|
|
|
|
int tracing_release_file_tr(struct inode *inode, struct file *filp);
|
2023-12-14 01:21:53 +00:00
|
|
|
int tracing_single_release_file_tr(struct inode *inode, struct file *filp);
|
2013-10-19 00:15:54 +00:00
|
|
|
bool tracing_is_disabled(void);
|
2018-08-01 20:08:57 +00:00
|
|
|
bool tracer_tracing_is_on(struct trace_array *tr);
|
2017-04-20 15:46:03 +00:00
|
|
|
void tracer_tracing_on(struct trace_array *tr);
|
|
|
|
void tracer_tracing_off(struct trace_array *tr);
|
2009-03-26 23:25:38 +00:00
|
|
|
struct dentry *trace_create_file(const char *name,
|
2011-07-24 08:33:43 +00:00
|
|
|
umode_t mode,
|
2009-03-26 23:25:38 +00:00
|
|
|
struct dentry *parent,
|
|
|
|
void *data,
|
|
|
|
const struct file_operations *fops);
|
|
|
|
|
2020-07-12 01:10:36 +00:00
|
|
|
int tracing_init_dentry(void);
|
2008-05-12 19:20:49 +00:00
|
|
|
|
tracing: Introduce trace_buffer_{lock_reserve,unlock_commit}
Impact: new API
These new functions do what previously was being open coded, reducing
the number of details ftrace plugin writers have to worry about.
It also standardizes the handling of stacktrace, userstacktrace and
other trace options we may introduce in the future.
With this patch, for instance, the blk tracer (and some others already
in the tree) can use the "userstacktrace" /d/tracing/trace_options
facility.
$ codiff /tmp/vmlinux.before /tmp/vmlinux.after
linux-2.6-tip/kernel/trace/trace.c:
trace_vprintk | -5
trace_graph_return | -22
trace_graph_entry | -26
trace_function | -45
__ftrace_trace_stack | -27
ftrace_trace_userstack | -29
tracing_sched_switch_trace | -66
tracing_stop | +1
trace_seq_to_user | -1
ftrace_trace_special | -63
ftrace_special | +1
tracing_sched_wakeup_trace | -70
tracing_reset_online_cpus | -1
13 functions changed, 2 bytes added, 355 bytes removed, diff: -353
linux-2.6-tip/block/blktrace.c:
__blk_add_trace | -58
1 function changed, 58 bytes removed, diff: -58
linux-2.6-tip/kernel/trace/trace.c:
trace_buffer_lock_reserve | +88
trace_buffer_unlock_commit | +86
2 functions changed, 174 bytes added, diff: +174
/tmp/vmlinux.after:
16 functions changed, 176 bytes added, 413 bytes removed, diff: -237
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Frédéric Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
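The reserve/commit pairing that this commit factors out can be illustrated with a toy single-writer buffer. This is a sketch under assumptions, not the ring buffer implementation: `struct demo_buffer`, `demo_lock_reserve()` and `demo_unlock_commit()` are hypothetical, and the real functions also take a length and handle the stacktrace options.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_SLOTS 4

/* Hypothetical entry: a common header plus one payload word. */
struct demo_entry {
	int type;
	unsigned int trace_ctx;
	unsigned long payload;
};

struct demo_buffer {
	struct demo_entry slots[DEMO_SLOTS];
	size_t reserved;  /* slots handed out to the writer */
	size_t committed; /* slots visible to readers */
};

/* Reserve a slot and fill in the common header in one place. */
static struct demo_entry *demo_lock_reserve(struct demo_buffer *b, int type,
					    unsigned int trace_ctx)
{
	struct demo_entry *e;

	if (b->reserved == DEMO_SLOTS)
		return NULL; /* buffer full */

	e = &b->slots[b->reserved++];
	e->type = type;
	e->trace_ctx = trace_ctx;
	e->payload = 0;
	return e;
}

/* Publish a previously reserved slot; the single writer commits in order. */
static void demo_unlock_commit(struct demo_buffer *b, struct demo_entry *e)
{
	assert(e == &b->slots[b->committed]);
	b->committed++;
}
```

A caller reserves, writes only its own payload, then commits; the header handling lives in the helper instead of being open coded in each tracer.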
2009-02-05 18:14:13 +00:00
|
|
|
struct ring_buffer_event;
|
|
|
|
|
2009-09-02 18:17:06 +00:00
|
|
|
struct ring_buffer_event *
|
2019-12-13 18:58:57 +00:00
|
|
|
trace_buffer_lock_reserve(struct trace_buffer *buffer,
|
2009-09-02 18:17:06 +00:00
|
|
|
int type,
|
|
|
|
unsigned long len,
|
2021-01-25 19:45:08 +00:00
|
|
|
unsigned int trace_ctx);
|
2009-02-05 18:14:13 +00:00
|
|
|
|
2024-06-12 23:19:39 +00:00
|
|
|
int ring_buffer_meta_seq_init(struct file *file, struct trace_buffer *buffer, int cpu);
|
|
|
|
|
2008-09-16 18:56:41 +00:00
|
|
|
struct trace_entry *tracing_get_trace_entry(struct trace_array *tr,
|
|
|
|
struct trace_array_cpu *data);
|
2009-02-02 22:29:21 +00:00
|
|
|
|
|
|
|
struct trace_entry *trace_find_next_entry(struct trace_iterator *iter,
|
|
|
|
int *ent_cpu, u64 *ent_ts);
|
|
|
|
|
2019-12-13 18:58:57 +00:00
|
|
|
void trace_buffer_unlock_commit_nostack(struct trace_buffer *buffer,
|
2016-11-24 01:28:38 +00:00
|
|
|
struct ring_buffer_event *event);
|
2012-10-11 16:14:25 +00:00
|
|
|
|
tracing: Add a verifier to check string pointers for trace events
It is a common mistake for someone writing a trace event to save a pointer
to a string in the TP_fast_assign() and then display that string pointer
in the TP_printk() with %s. The problem is that those two events may happen
a long time apart, where the source of the string may no longer exist.
The proper way to handle displaying any string that is not guaranteed to be
in the kernel core rodata section, is to copy it into the ring buffer via
the __string(), __assign_str() and __get_str() helper macros.
Add a check at run time while displaying the TP_printk() of events to make
sure that every %s referenced is safe to dereference, and if it is not,
trigger a warning and only show the address of the pointer, and the
dereferenced string if it can be safely retrieved with a
strncpy_from_kernel_nofault() call.
In order to not have to copy the parsing of vsnprintf() formats, or even
exporting its code, the verifier relies on vsnprintf() being able to
modify the va_list that is passed to it, and it remains modified after it
is called. This is the case for some architectures like x86_64, but other
architectures like x86_32 pass the va_list to vsnprintf() as a value not a
reference, and the verifier can not use it to parse the non string
arguments. Thus, at boot up, it is checked if vsnprintf() modifies the
passed in va_list or not, and a static branch will disable the verifier if
it's not compatible.
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
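Why copying the string at record time is the safe pattern can be shown with a userspace analogue. This is a sketch only: `struct demo_event`, `demo_record()` and `demo_print()` are hypothetical stand-ins and do not use the `__string()`, `__assign_str()` or `__get_str()` macros themselves.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define DEMO_STR_MAX 32

/* Hypothetical event record: the string bytes live inside the record. */
struct demo_event {
	char name[DEMO_STR_MAX];
};

/* Record time: copy the bytes, do not keep the caller's pointer. */
static void demo_record(struct demo_event *ev, const char *src)
{
	assert(src != NULL);
	/* snprintf() always NUL-terminates and truncates an overlong source. */
	snprintf(ev->name, sizeof(ev->name), "%s", src);
}

/* Print time: may run much later, after the source string has changed. */
static int demo_print(const struct demo_event *ev, char *out, size_t len)
{
	return snprintf(out, len, "name=%s", ev->name);
}
```

If the source buffer is overwritten or freed between the two calls, the printed text still reflects the value at record time, because the record owns its own copy.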
2021-02-26 03:00:57 +00:00
|
|
|
bool trace_is_tracepoint_string(const char *str);
|
2020-10-15 14:55:07 +00:00
|
|
|
const char *trace_event_format(struct trace_iterator *iter, const char *fmt);
|
2021-02-26 03:00:57 +00:00
|
|
|
void trace_check_vprintf(struct trace_iterator *iter, const char *fmt,
|
2022-12-05 10:21:52 +00:00
|
|
|
va_list ap) __printf(2, 0);
|
2023-03-28 18:51:56 +00:00
|
|
|
char *trace_iter_expand_format(struct trace_iterator *iter);
|
2020-10-15 14:55:07 +00:00
|
|
|
|
2010-08-05 14:22:23 +00:00
|
|
|
int trace_empty(struct trace_iterator *iter);
|
|
|
|
|
|
|
|
void *trace_find_next_entry_inc(struct trace_iterator *iter);
|
|
|
|
|
|
|
|
void trace_init_global_iter(struct trace_iterator *iter);
|
|
|
|
|
|
|
|
void tracing_iter_reset(struct trace_iterator *iter, int cpu);
|
|
|
|
|
2019-03-19 17:12:05 +00:00
|
|
|
unsigned long trace_total_entries_cpu(struct trace_array *tr, int cpu);
|
|
|
|
unsigned long trace_total_entries(struct trace_array *tr);
|
|
|
|
|
2008-05-12 19:20:49 +00:00
|
|
|
void trace_function(struct trace_array *tr,
|
|
|
|
unsigned long ip,
|
|
|
|
unsigned long parent_ip,
|
2021-01-25 19:45:08 +00:00
|
|
|
unsigned int trace_ctx);
|
2010-09-23 12:00:52 +00:00
|
|
|
void trace_graph_function(struct trace_array *tr,
|
|
|
|
unsigned long ip,
|
|
|
|
unsigned long parent_ip,
|
2021-01-25 19:45:08 +00:00
|
|
|
unsigned int trace_ctx);
|
tracing/latency: Fix header output for latency tracers
In case the graph tracer (CONFIG_FUNCTION_GRAPH_TRACER) or even the
function tracer (CONFIG_FUNCTION_TRACER) is not set, the latency tracers
do not display a proper latency header.
The involved/fixed latency tracers are:
wakeup_rt
wakeup
preemptirqsoff
preemptoff
irqsoff
The patch adds proper handling of tracer configuration options for latency
tracers, and displaying correct header info accordingly.
* The current output (for wakeup tracer) with both graph and function
tracers disabled is:
# tracer: wakeup
#
<idle>-0 0d.h5 1us+: 0:120:R + [000] 7: 0:R watchdog/0
<idle>-0 0d.h5 3us+: ttwu_do_activate.clone.1 <-try_to_wake_up
...
* The fixed output is:
# tracer: wakeup
#
# wakeup latency trace v1.1.5 on 3.1.0-tip+
# --------------------------------------------------------------------
# latency: 55 us, #4/4, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
# -----------------
# | task: migration/0-6 (uid:0 nice:0 policy:1 rt_prio:99)
# -----------------
#
# _------=> CPU#
# / _-----=> irqs-off
# | / _----=> need-resched
# || / _---=> hardirq/softirq
# ||| / _--=> preempt-depth
# |||| / delay
# cmd pid ||||| time | caller
# \ / ||||| \ | /
cat-1129 0d..4 1us : 1129:120:R + [000] 6: 0:R migration/0
cat-1129 0d..4 2us+: ttwu_do_activate.clone.1 <-try_to_wake_up
* The current output (for wakeup tracer) with only function
tracer enabled is:
# tracer: wakeup
#
cat-1140 0d..4 1us+: 1140:120:R + [000] 6: 0:R migration/0
cat-1140 0d..4 2us : ttwu_do_activate.clone.1 <-try_to_wake_up
* The fixed output is:
# tracer: wakeup
#
# wakeup latency trace v1.1.5 on 3.1.0-tip+
# --------------------------------------------------------------------
# latency: 207 us, #109/109, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:2)
# -----------------
# | task: watchdog/1-12 (uid:0 nice:0 policy:1 rt_prio:99)
# -----------------
#
# _------=> CPU#
# / _-----=> irqs-off
# | / _----=> need-resched
# || / _---=> hardirq/softirq
# ||| / _--=> preempt-depth
# |||| / delay
# cmd pid ||||| time | caller
# \ / ||||| \ | /
<idle>-0 1d.h5 1us+: 0:120:R + [001] 12: 0:R watchdog/1
<idle>-0 1d.h5 3us : ttwu_do_activate.clone.1 <-try_to_wake_up
Link: http://lkml.kernel.org/r/20111107150849.GE1807@m.brq.redhat.com
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-11-07 15:08:49 +00:00
|
|
|
void trace_latency_header(struct seq_file *m);
|
2010-04-02 17:01:22 +00:00
|
|
|
void trace_default_header(struct seq_file *m);
|
|
|
|
void print_trace_header(struct seq_file *m, struct trace_iterator *iter);
|
2008-05-12 19:20:42 +00:00
|
|
|
|
2024-06-03 19:07:11 +00:00
|
|
|
void trace_graph_return(struct ftrace_graph_ret *trace, struct fgraph_ops *gops);
|
|
|
|
int trace_graph_entry(struct ftrace_graph_ent *trace, struct fgraph_ops *gops);
|
2008-11-25 08:24:15 +00:00
|
|
|
|
2008-05-22 15:49:22 +00:00
|
|
|
void tracing_start_cmdline_record(void);
|
|
|
|
void tracing_stop_cmdline_record(void);
|
2017-06-27 02:01:55 +00:00
|
|
|
void tracing_start_tgid_record(void);
|
|
|
|
void tracing_stop_tgid_record(void);
|
|
|
|
|
2008-05-12 19:20:42 +00:00
|
|
|
int register_tracer(struct tracer *type);
|
2009-09-12 23:43:07 +00:00
|
|
|
int is_tracing_stopped(void);
|
2010-08-05 14:22:23 +00:00
|
|
|
|
2013-12-21 22:39:40 +00:00
|
|
|
loff_t tracing_lseek(struct file *file, loff_t offset, int whence);
|
|
|
|
|
2010-08-05 14:22:23 +00:00
|
|
|
extern cpumask_var_t __read_mostly tracing_buffer_mask;
|
|
|
|
|
|
|
|
#define for_each_tracing_cpu(cpu) \
|
|
|
|
for_each_cpu(cpu, tracing_buffer_mask)
|
2008-05-12 19:20:42 +00:00
|
|
|
|
|
|
|
extern unsigned long nsecs_to_usecs(unsigned long nsecs);
|
|
|
|
|
2010-02-25 23:36:43 +00:00
|
|
|
extern unsigned long tracing_thresh;
|
|
|
|
|
2016-04-14 11:38:13 +00:00
|
|
|
/* PID filtering */
|
2016-04-21 15:35:30 +00:00
|
|
|
|
|
|
|
extern int pid_max;
|
|
|
|
|
2016-04-14 11:38:13 +00:00
|
|
|
bool trace_find_filtered_pid(struct trace_pid_list *filtered_pids,
|
|
|
|
pid_t search_pid);
|
|
|
|
bool trace_ignore_this_task(struct trace_pid_list *filtered_pids,
|
2020-03-20 03:19:06 +00:00
|
|
|
struct trace_pid_list *filtered_no_pids,
|
2016-04-14 11:38:13 +00:00
|
|
|
struct task_struct *task);
|
|
|
|
void trace_filter_add_remove_task(struct trace_pid_list *pid_list,
|
|
|
|
struct task_struct *self,
|
|
|
|
struct task_struct *task);
|
2016-04-20 19:19:54 +00:00
|
|
|
void *trace_pid_next(struct trace_pid_list *pid_list, void *v, loff_t *pos);
|
|
|
|
void *trace_pid_start(struct trace_pid_list *pid_list, loff_t *pos);
|
|
|
|
int trace_pid_show(struct seq_file *m, void *v);
|
2016-04-21 15:35:30 +00:00
|
|
|
int trace_pid_write(struct trace_pid_list *filtered_pids,
|
|
|
|
struct trace_pid_list **new_pid_list,
|
|
|
|
const char __user *ubuf, size_t cnt);
|
2016-04-14 11:38:13 +00:00
|
|
|
|
2009-08-27 20:52:21 +00:00
|
|
|
#ifdef CONFIG_TRACER_MAX_TRACE
|
tracing: Add conditional snapshot
Currently, tracing snapshots are context-free - they capture the ring
buffer contents at the time the tracing_snapshot() function was
invoked, and nothing else. Additionally, they're always taken
unconditionally - the calling code can decide whether or not to take a
snapshot, but the data used to make that decision is kept separately
from the snapshot itself.
This change adds the ability to associate with each trace instance
some user data, along with an 'update' function that can use that data
to determine whether or not to actually take a snapshot. The update
function can then update that data along with any other state (as part
of the data presumably), if warranted.
Because snapshots are 'global' per-instance, only one user can enable
and use a conditional snapshot for any given trace instance. To
enable a conditional snapshot (see details in the function and data
structure comments), the user calls tracing_snapshot_cond_enable().
Similarly, to disable a conditional snapshot and free it up for other
users, tracing_snapshot_cond_disable() should be called.
To actually initiate a conditional snapshot, tracing_snapshot_cond()
should be called. tracing_snapshot_cond() will invoke the update()
callback, allowing the user to decide whether or not to actually take
the snapshot and update the user-defined data associated with the
snapshot. If the callback returns 'true', tracing_snapshot_cond()
will then actually take the snapshot and return.
This scheme allows for flexibility in snapshot implementations - for
example, by implementing slightly different update() callbacks,
snapshots can be taken in situations where the user is only interested
in taking a snapshot when a new maximum is hit versus when a value
changes in any way at all. Future patches will demonstrate both
cases.
Link: http://lkml.kernel.org/r/1bea07828d5fd6864a585f83b1eed47ce097eb45.1550100284.git.tom.zanussi@linux.intel.com
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
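The enable / update / disable flow described above can be modelled in userspace as follows. This is a sketch under assumptions: `struct demo_instance` and the `demo_*` functions are hypothetical, and the exact kernel signatures of `tracing_snapshot_cond_enable()` and `tracing_snapshot_cond()` are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_instance;

/* Hypothetical update callback: decides whether a snapshot is taken. */
typedef bool (*demo_update_fn)(struct demo_instance *inst, void *cond_data);

struct demo_instance {
	void *user_data;              /* state owned by the single user */
	demo_update_fn update;        /* NULL when no conditional user */
	unsigned int snapshots_taken; /* stands in for the buffer swap */
};

/* Only one user may hold the conditional snapshot per instance. */
static int demo_cond_enable(struct demo_instance *inst, void *user_data,
			    demo_update_fn update)
{
	assert(update != NULL);
	if (inst->update)
		return -1;
	inst->user_data = user_data;
	inst->update = update;
	return 0;
}

static void demo_cond_disable(struct demo_instance *inst)
{
	inst->update = NULL;
	inst->user_data = NULL;
}

/* Take a snapshot only if the update callback agrees. */
static bool demo_snapshot_cond(struct demo_instance *inst, void *cond_data)
{
	if (!inst->update)
		return false;
	if (!inst->update(inst, cond_data))
		return false;
	inst->snapshots_taken++;
	return true;
}

/* Example policy: snapshot only when a new maximum is hit. */
static bool demo_update_on_new_max(struct demo_instance *inst, void *cond_data)
{
	unsigned long *max = inst->user_data;
	unsigned long val = *(unsigned long *)cond_data;

	if (val <= *max)
		return false;
	*max = val; /* update the user data along with the decision */
	return true;
}
```

Swapping in a different callback, for example one that returns true whenever the value differs from the stored one, gives the "value changes in any way" behaviour without touching the snapshot path.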
2019-02-13 23:42:45 +00:00
|
|
|
void update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu,
|
|
|
|
void *cond_data);
|
2008-05-12 19:20:42 +00:00
|
|
|
void update_max_tr_single(struct trace_array *tr,
|
|
|
|
struct task_struct *tsk, int cpu);
|
|
|
|
|
2022-12-06 14:18:01 +00:00
|
|
|
#ifdef CONFIG_FSNOTIFY
|
2021-06-25 23:47:33 +00:00
|
|
|
#define LATENCY_FS_NOTIFY
|
|
|
|
#endif
|
2022-12-06 14:18:01 +00:00
|
|
|
#endif /* CONFIG_TRACER_MAX_TRACE */
|
2019-10-08 22:08:21 +00:00
|
|
|
|
2021-06-25 23:47:33 +00:00
|
|
|
#ifdef LATENCY_FS_NOTIFY
|
2019-10-08 22:08:21 +00:00
|
|
|
void latency_fsnotify(struct trace_array *tr);
|
|
|
|
#else
|
2019-11-15 03:43:58 +00:00
|
|
|
static inline void latency_fsnotify(struct trace_array *tr) { }
|
2019-10-08 22:08:21 +00:00
|
|
|
#endif
|
|
|
|
|
2009-07-29 15:51:13 +00:00
|
|
|
#ifdef CONFIG_STACKTRACE
|
2021-01-25 19:45:08 +00:00
|
|
|
void __trace_stack(struct trace_array *tr, unsigned int trace_ctx, int skip);
|
2009-07-29 15:51:13 +00:00
|
|
|
#else
|
2021-01-25 19:45:08 +00:00
|
|
|
static inline void __trace_stack(struct trace_array *tr, unsigned int trace_ctx,
|
|
|
|
int skip)
|
2009-07-29 15:51:13 +00:00
|
|
|
{
|
|
|
|
}
|
|
|
|
#endif /* CONFIG_STACKTRACE */
|
2009-01-16 00:12:40 +00:00
|
|
|
|
2021-04-15 18:18:52 +00:00
|
|
|
void trace_last_func_repeats(struct trace_array *tr,
|
|
|
|
struct trace_func_repeats *last_info,
|
|
|
|
unsigned int trace_ctx);
|
|
|
|
|
2016-12-21 19:32:01 +00:00
|
|
|
extern u64 ftrace_now(int cpu);
|
2008-05-12 19:20:42 +00:00
|
|
|
|
2009-03-16 23:20:15 +00:00
|
|
|
extern void trace_find_cmdline(int pid, char comm[]);
|
2017-06-27 02:01:55 +00:00
|
|
|
extern int trace_find_tgid(int pid);
|
2016-04-13 20:59:18 +00:00
|
|
|
extern void trace_event_follow_fork(struct trace_array *tr, bool enable);
|
2008-12-29 12:02:17 +00:00
|
|
|
|
2008-05-12 19:20:42 +00:00
|
|
|
#ifdef CONFIG_DYNAMIC_FTRACE
|
|
|
|
extern unsigned long ftrace_update_tot_cnt;
|
2019-10-01 18:38:07 +00:00
|
|
|
extern unsigned long ftrace_number_of_pages;
|
|
|
|
extern unsigned long ftrace_number_of_groups;
|
2017-04-05 17:12:55 +00:00
|
|
|
void ftrace_init_trace_array(struct trace_array *tr);
|
|
|
|
#else
|
|
|
|
static inline void ftrace_init_trace_array(struct trace_array *tr) { }
|
2012-07-20 17:45:59 +00:00
|
|
|
#endif
|
2008-05-12 19:20:54 +00:00
|
|
|
#define DYN_FTRACE_TEST_NAME trace_selftest_dynamic_test_func
|
|
|
|
extern int DYN_FTRACE_TEST_NAME(void);
|
2011-05-06 04:08:51 +00:00
|
|
|
#define DYN_FTRACE_TEST_NAME2 trace_selftest_dynamic_test_func2
|
|
|
|
extern int DYN_FTRACE_TEST_NAME2(void);
|
2008-05-12 19:20:42 +00:00
|
|
|
|
2023-09-06 09:18:37 +00:00
|
|
|
extern void trace_set_ring_buffer_expanded(struct trace_array *tr);
|
2009-07-01 02:47:05 +00:00
|
|
|
extern bool tracing_selftest_disabled;
|
|
|
|
|
2008-05-12 19:20:44 +00:00
|
|
|
#ifdef CONFIG_FTRACE_STARTUP_TEST
|
2020-12-08 08:54:09 +00:00
|
|
|
extern void __init disable_tracing_selftest(const char *reason);
|
|
|
|
|
2008-05-12 19:20:44 +00:00
|
|
|
extern int trace_selftest_startup_function(struct tracer *trace,
|
|
|
|
struct trace_array *tr);
|
2009-02-07 20:33:57 +00:00
|
|
|
extern int trace_selftest_startup_function_graph(struct tracer *trace,
|
|
|
|
struct trace_array *tr);
|
2008-05-12 19:20:44 +00:00
|
|
|
extern int trace_selftest_startup_irqsoff(struct tracer *trace,
|
|
|
|
struct trace_array *tr);
|
|
|
|
extern int trace_selftest_startup_preemptoff(struct tracer *trace,
|
|
|
|
struct trace_array *tr);
|
|
|
|
extern int trace_selftest_startup_preemptirqsoff(struct tracer *trace,
|
|
|
|
struct trace_array *tr);
|
|
|
|
extern int trace_selftest_startup_wakeup(struct tracer *trace,
|
|
|
|
struct trace_array *tr);
|
2008-09-19 10:06:43 +00:00
|
|
|
extern int trace_selftest_startup_nop(struct tracer *trace,
|
|
|
|
struct trace_array *tr);
|
2008-11-12 20:24:24 +00:00
|
|
|
extern int trace_selftest_startup_branch(struct tracer *trace,
|
|
|
|
struct trace_array *tr);
|
2013-07-18 18:41:51 +00:00
|
|
|
/*
|
|
|
|
* Tracer data references selftest functions that only occur
|
|
|
|
* on boot up. These can be __init functions. Thus, when selftests
|
|
|
|
* are enabled, then the tracers need to reference __init functions.
|
|
|
|
*/
|
|
|
|
#define __tracer_data __refdata
|
|
|
|
#else
|
2020-12-08 08:54:09 +00:00
|
|
|
static inline void __init disable_tracing_selftest(const char *reason)
|
|
|
|
{
|
|
|
|
}
|
2013-07-18 18:41:51 +00:00
|
|
|
/* Tracers are seldom changed. Optimize when selftests are disabled. */
|
|
|
|
#define __tracer_data __read_mostly
|
2008-05-12 19:20:44 +00:00
|
|
|
#endif /* CONFIG_FTRACE_STARTUP_TEST */
|
|
|
|
|
2008-05-12 19:20:45 +00:00
|
|
|
extern void *head_page(struct trace_array_cpu *data);
|
2016-12-21 19:32:01 +00:00
|
|
|
extern unsigned long long ns2usecs(u64 nsec);
|
2008-12-03 22:45:11 +00:00
|
|
|
extern int
|
2009-03-19 18:03:53 +00:00
|
|
|
trace_vbprintk(unsigned long ip, const char *fmt, va_list args);
|
2009-03-12 17:24:49 +00:00
|
|
|
extern int
|
2009-03-19 18:03:53 +00:00
|
|
|
trace_vprintk(unsigned long ip, const char *fmt, va_list args);
|
2009-09-03 23:11:07 +00:00
|
|
|
extern int
|
|
|
|
trace_array_vprintk(struct trace_array *tr,
|
|
|
|
unsigned long ip, const char *fmt, va_list args);
|
2019-12-13 18:58:57 +00:00
|
|
|
int trace_array_printk_buf(struct trace_buffer *buffer,
|
2013-03-05 14:24:35 +00:00
|
|
|
unsigned long ip, const char *fmt, ...);
|
2010-08-05 14:22:23 +00:00
|
|
|
void trace_printk_seq(struct trace_seq *s);
|
|
|
|
enum print_line_t print_trace_line(struct trace_iterator *iter);
|
2008-05-12 19:20:45 +00:00
|
|
|
|
2014-11-24 00:34:19 +00:00
|
|
|
extern char trace_find_mark(unsigned long long duration);
|
|
|
|
|
2017-06-23 19:26:26 +00:00
|
|
|
struct ftrace_hash;
|
|
|
|
|
|
|
|
struct ftrace_mod_load {
|
|
|
|
struct list_head list;
|
|
|
|
char *func;
|
|
|
|
char *module;
|
|
|
|
int enable;
|
|
|
|
};
|
|
|
|
|
2017-06-26 15:47:31 +00:00
|
|
|
enum {
|
|
|
|
FTRACE_HASH_FL_MOD = (1 << 0),
|
|
|
|
};
|
|
|
|
|
2017-01-20 02:44:46 +00:00
|
|
|
struct ftrace_hash {
|
|
|
|
unsigned long size_bits;
|
|
|
|
struct hlist_head *buckets;
|
|
|
|
unsigned long count;
|
2017-06-26 15:47:31 +00:00
|
|
|
unsigned long flags;
|
2017-01-20 02:44:46 +00:00
|
|
|
struct rcu_head rcu;
|
|
|
|
};
|
|
|
|
|
|
|
|
struct ftrace_func_entry *
|
|
|
|
ftrace_lookup_ip(struct ftrace_hash *hash, unsigned long ip);
|
|
|
|
|
2017-01-23 12:24:45 +00:00
|
|
|
static __always_inline bool ftrace_hash_empty(struct ftrace_hash *hash)
|
2017-01-20 02:44:46 +00:00
|
|
|
{
|
2017-06-26 15:47:31 +00:00
|
|
|
return !hash || !(hash->count || (hash->flags & FTRACE_HASH_FL_MOD));
|
2017-01-20 02:44:46 +00:00
|
|
|
}
|
|
|
|
|
2008-11-11 06:14:25 +00:00
|
|
|
/* Standard output formatting function used for function return traces */
|
2008-11-25 20:07:04 +00:00
|
|
|
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
|
2010-04-02 17:01:22 +00:00
|
|
|
|
|
|
|
/* Flag options */
|
|
|
|
#define TRACE_GRAPH_PRINT_OVERRUN 0x1
|
|
|
|
#define TRACE_GRAPH_PRINT_CPU 0x2
|
|
|
|
#define TRACE_GRAPH_PRINT_OVERHEAD 0x4
|
|
|
|
#define TRACE_GRAPH_PRINT_PROC 0x8
|
|
|
|
#define TRACE_GRAPH_PRINT_DURATION 0x10
|
|
|
|
#define TRACE_GRAPH_PRINT_ABS_TIME 0x20
|
2019-01-01 15:46:10 +00:00
|
|
|
#define TRACE_GRAPH_PRINT_REL_TIME 0x40
|
|
|
|
#define TRACE_GRAPH_PRINT_IRQS 0x80
|
|
|
|
#define TRACE_GRAPH_PRINT_TAIL 0x100
|
|
|
|
#define TRACE_GRAPH_SLEEP_TIME 0x200
|
|
|
|
#define TRACE_GRAPH_GRAPH_TIME 0x400
|
function_graph: Support recording and printing the return value of function
Analyzing system call failures with the function_graph tracer can be a
time-consuming process, particularly when locating the kernel function
that first returns an error in the trace logs. This change aims to
simplify the process by recording the function return value to the
'retval' member of 'ftrace_graph_ret' and printing it when outputting
the trace log.
We have introduced new trace options: funcgraph-retval and
funcgraph-retval-hex. The former controls whether to display the return
value, while the latter controls the display format.
Please note that even if a function's return type is void, a return
value will still be printed. You can simply ignore it.
This patch only establishes the fundamental infrastructure. Subsequent
patches will make this feature available on some commonly used processor
architectures.
Here is an example:
I attempted to attach the demo process to a cpu cgroup, but it failed:
echo `pidof demo` > /sys/fs/cgroup/cpu/test/tasks
-bash: echo: write error: Invalid argument
The strace logs indicate that the write system call returned -EINVAL(-22):
...
write(1, "273\n", 4) = -1 EINVAL (Invalid argument)
...
To capture trace logs during a write system call, use the following
commands:
cd /sys/kernel/debug/tracing/
echo 0 > tracing_on
echo > trace
echo *sys_write > set_graph_function
echo *spin* > set_graph_notrace
echo *rcu* >> set_graph_notrace
echo *alloc* >> set_graph_notrace
echo preempt* >> set_graph_notrace
echo kfree* >> set_graph_notrace
echo $$ > set_ftrace_pid
echo function_graph > current_tracer
echo 1 > options/funcgraph-retval
echo 0 > options/funcgraph-retval-hex
echo 1 > tracing_on
echo `pidof demo` > /sys/fs/cgroup/cpu/test/tasks
echo 0 > tracing_on
cat trace > ~/trace.log
To locate the root cause, search for error code -22 directly in the file
trace.log and identify the first function that returned -22. Once you
have identified this function, examine its code to determine the root
cause.
For example, in the trace log below, cpu_cgroup_can_attach
returned -22 first, so we can focus our analysis on this function to
identify the root cause.
...
1) | cgroup_migrate() {
1) 0.651 us | cgroup_migrate_add_task(); /* = 0xffff93fcfd346c00 */
1) | cgroup_migrate_execute() {
1) | cpu_cgroup_can_attach() {
1) | cgroup_taskset_first() {
1) 0.732 us | cgroup_taskset_next(); /* = 0xffff93fc8fb20000 */
1) 1.232 us | } /* cgroup_taskset_first = 0xffff93fc8fb20000 */
1) 0.380 us | sched_rt_can_attach(); /* = 0x0 */
1) 2.335 us | } /* cpu_cgroup_can_attach = -22 */
1) 4.369 us | } /* cgroup_migrate_execute = -22 */
1) 7.143 us | } /* cgroup_migrate = -22 */
...
Link: https://lkml.kernel.org/r/1fc502712c981e0e6742185ba242992170ac9da8.1680954589.git.pengdonglin@sangfor.com.cn
Tested-by: Florian Kauer <florian.kauer@linutronix.de>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Donglin Peng <pengdonglin@sangfor.com.cn>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-04-08 12:42:15 +00:00
|
|
|
#define TRACE_GRAPH_PRINT_RETVAL 0x800
|
|
|
|
#define TRACE_GRAPH_PRINT_RETVAL_HEX 0x1000
|
2013-11-06 19:50:06 +00:00
|
|
|
#define TRACE_GRAPH_PRINT_FILL_SHIFT 28
|
|
|
|
#define TRACE_GRAPH_PRINT_FILL_MASK (0x3 << TRACE_GRAPH_PRINT_FILL_SHIFT)
|
2010-04-02 17:01:22 +00:00
|
|
|
|
2015-09-29 23:06:50 +00:00
|
|
|
extern void ftrace_graph_sleep_time_control(bool enable);
|
2018-11-23 18:06:07 +00:00
|
|
|
|
|
|
|
#ifdef CONFIG_FUNCTION_PROFILER
|
2015-09-29 23:06:50 +00:00
|
|
|
extern void ftrace_graph_graph_time_control(bool enable);
|
2018-11-23 18:06:07 +00:00
|
|
|
#else
|
|
|
|
static inline void ftrace_graph_graph_time_control(bool enable) { }
|
|
|
|
#endif
|
2015-09-29 23:06:50 +00:00
|
|
|
|
2010-04-02 17:01:21 +00:00
|
|
|
extern enum print_line_t
|
|
|
|
print_graph_function_flags(struct trace_iterator *iter, u32 flags);
|
|
|
|
extern void print_graph_headers_flags(struct seq_file *s, u32 flags);
|
2014-11-12 19:57:38 +00:00
|
|
|
extern void
|
2009-03-24 03:12:58 +00:00
|
|
|
trace_print_graph_duration(unsigned long long duration, struct trace_seq *s);
|
2010-04-02 17:01:22 +00:00
|
|
|
extern void graph_trace_open(struct trace_iterator *iter);
|
|
|
|
extern void graph_trace_close(struct trace_iterator *iter);
|
|
|
|
extern int __trace_graph_entry(struct trace_array *tr,
|
|
|
|
struct ftrace_graph_ent *trace,
|
2021-01-25 19:45:08 +00:00
|
|
|
unsigned int trace_ctx);
|
2010-04-02 17:01:22 +00:00
|
|
|
extern void __trace_graph_return(struct trace_array *tr,
|
|
|
|
struct ftrace_graph_ret *trace,
|
2021-01-25 19:45:08 +00:00
|
|
|
unsigned int trace_ctx);
|
2024-06-03 19:07:16 +00:00
|
|
|
extern void init_array_fgraph_ops(struct trace_array *tr, struct ftrace_ops *ops);
|
|
|
|
extern int allocate_fgraph_ops(struct trace_array *tr, struct ftrace_ops *ops);
|
2024-06-03 19:07:12 +00:00
|
|
|
extern void free_fgraph_ops(struct trace_array *tr);
|
2010-04-02 17:01:22 +00:00
|
|
|
|
2024-06-03 19:07:20 +00:00
|
|
|
enum {
|
|
|
|
TRACE_GRAPH_FL = 1,
|
2024-06-03 19:07:21 +00:00
|
|
|
|
|
|
|
/*
|
|
|
|
* In the very unlikely case that an interrupt came in
|
|
|
|
* at a start of graph tracing, and we want to trace
|
|
|
|
* the function in that interrupt, the depth can be greater
|
|
|
|
* than zero, because of the preempted start of a previous
|
|
|
|
* trace. In an even more unlikely case, depth could be 2
|
|
|
|
* if a softirq interrupted the start of graph tracing,
|
|
|
|
* followed by an interrupt preempting a start of graph
|
|
|
|
* tracing in the softirq, and depth can even be 3
|
|
|
|
* if an NMI came in at the start of an interrupt function
|
|
|
|
* that preempted a softirq start of a function that
|
|
|
|
* preempted normal context!!!! Luckily, it can't be
|
|
|
|
* greater than 3, so the next two bits are a mask
|
|
|
|
* of what the depth is when we set TRACE_GRAPH_FL
|
|
|
|
*/
|
|
|
|
|
|
|
|
TRACE_GRAPH_DEPTH_START_BIT,
|
|
|
|
TRACE_GRAPH_DEPTH_END_BIT,
|
2024-06-03 19:07:22 +00:00
|
|
|
|
|
|
|
/*
|
|
|
|
* To implement set_graph_notrace, if this bit is set, we ignore
|
|
|
|
* function graph tracing of called functions, until the return
|
|
|
|
* function is called to clear it.
|
|
|
|
*/
|
|
|
|
TRACE_GRAPH_NOTRACE_BIT,
|
2024-06-03 19:07:20 +00:00
|
|
|
};
|
|
|
|
|
2024-06-03 19:07:22 +00:00
|
|
|
#define TRACE_GRAPH_NOTRACE (1 << TRACE_GRAPH_NOTRACE_BIT)
|
|
|
|
|
2024-06-03 19:07:21 +00:00
|
|
|
static inline unsigned long ftrace_graph_depth(unsigned long *task_var)
|
|
|
|
{
|
|
|
|
return (*task_var >> TRACE_GRAPH_DEPTH_START_BIT) & 3;
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline void ftrace_graph_set_depth(unsigned long *task_var, int depth)
|
|
|
|
{
|
|
|
|
*task_var &= ~(3 << TRACE_GRAPH_DEPTH_START_BIT);
|
|
|
|
*task_var |= (depth & 3) << TRACE_GRAPH_DEPTH_START_BIT;
|
|
|
|
}
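The two helpers above keep the saved depth in a two-bit field of the per-task variable. A minimal user-space sketch of the same packing follows. The `DEMO_` names are hypothetical stand-ins for the enum values above; as in that enum, the enumerator following `= 1` takes the value 2, so the depth field starts at bit 2.

```c
#include <assert.h>

/* Hypothetical stand-ins for TRACE_GRAPH_FL and the depth bit markers. */
enum {
	DEMO_FL = 1,          /* flag mask: bit 0 */
	DEMO_DEPTH_START_BIT, /* == 2: first bit of the 2-bit depth field */
	DEMO_DEPTH_END_BIT,   /* == 3: last bit of the depth field */
};

/* Read the 2-bit depth field back out of the task variable. */
static unsigned long demo_depth(const unsigned long *task_var)
{
	return (*task_var >> DEMO_DEPTH_START_BIT) & 3;
}

/* Clear the old depth field, then store the low two bits of depth. */
static void demo_set_depth(unsigned long *task_var, int depth)
{
	*task_var &= ~(3UL << DEMO_DEPTH_START_BIT);
	*task_var |= ((unsigned long)depth & 3) << DEMO_DEPTH_START_BIT;
}
```

Because the field is masked on both read and write, storing a depth never disturbs the flag in bit 0.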
|
2010-04-02 17:01:22 +00:00
|
|
|
|
2008-12-03 20:36:57 +00:00
|
|
|
#ifdef CONFIG_DYNAMIC_FTRACE
|
2020-02-01 07:27:04 +00:00
|
|
|
extern struct ftrace_hash __rcu *ftrace_graph_hash;
|
2020-02-05 05:57:02 +00:00
|
|
|
extern struct ftrace_hash __rcu *ftrace_graph_notrace_hash;
|
2008-12-03 20:36:57 +00:00
|
|
|
|
2024-06-03 19:07:20 +00:00
|
|
|
static inline int
|
|
|
|
ftrace_graph_addr(unsigned long *task_var, struct ftrace_graph_ent *trace)
|
2008-12-03 20:36:57 +00:00
|
|
|
{
|
tracing/fgraph: Fix set_graph_function from showing interrupts
The tracefs file set_graph_function is used to graph-trace only the functions
that are listed in that file (or all functions if the file is empty). The
way this is implemented is that the function graph tracer looks at every
function, and if the current depth is zero and the function matches
something in the file then it will trace that function. When other functions
are called, the depth will be greater than zero (because the original
function will be at depth zero), and all functions will be traced where the
depth is greater than zero.
The issue is that when a function is first entered, and the handler that
checks this logic is called, the depth is set to zero. If an interrupt comes
in and a function in the interrupt handler is traced, its depth will be
greater than zero and it will automatically be traced, even if the original
function was not. But because the logic only looks at depth it may trace
interrupts when it should not be.
The recent design change of the function graph tracer to fix other bugs
caused the depth to be zero while the function graph callback handler is
being called for a longer time, widening the race of this happening. This
bug was actually there for a longer time, but because the race window was so
small it seldom happened. The Fixes tag below is for the commit that widened
the race window, because that commit belongs to a series that will also help
fix the original bug.
Cc: stable@kernel.org
Fixes: 39eb456dacb5 ("function_graph: Use new curr_ret_depth to manage depth instead of curr_ret_stack")
Reported-by: Joe Lawrence <joe.lawrence@redhat.com>
Tested-by: Joe Lawrence <joe.lawrence@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-11-29 13:50:27 +00:00
|
|
|
unsigned long addr = trace->func;
|
2017-01-20 02:44:47 +00:00
|
|
|
int ret = 0;
|
2020-02-01 07:27:04 +00:00
|
|
|
struct ftrace_hash *hash;
|
2017-01-20 02:44:47 +00:00
|
|
|
|
|
|
|
preempt_disable_notrace();
|
|
|
|
|
2020-02-05 07:17:57 +00:00
|
|
|
/*
|
|
|
|
* Have to open code "rcu_dereference_sched()" because the
|
|
|
|
* function graph tracer can be called when RCU is not
|
|
|
|
* "watching".
|
2020-02-05 14:20:32 +00:00
|
|
|
* Protected with schedule_on_each_cpu(ftrace_sync)
|
2020-02-05 07:17:57 +00:00
|
|
|
*/
|
2020-02-01 07:27:04 +00:00
|
|
|
hash = rcu_dereference_protected(ftrace_graph_hash, !preemptible());
|
|
|
|
|
|
|
|
if (ftrace_hash_empty(hash)) {
|
2017-01-20 02:44:47 +00:00
|
|
|
ret = 1;
|
|
|
|
goto out;
|
2008-12-03 20:36:57 +00:00
|
|
|
}
|
|
|
|
|
2020-02-01 07:27:04 +00:00
|
|
|
if (ftrace_lookup_ip(hash, addr)) {
|
2018-11-29 13:50:27 +00:00
|
|
|
/*
|
|
|
|
* This needs to be cleared on the return functions
|
|
|
|
* when the depth is zero.
|
|
|
|
*/
|
2024-06-03 19:07:20 +00:00
|
|
|
*task_var |= TRACE_GRAPH_FL;
|
2024-06-03 19:07:21 +00:00
|
|
|
ftrace_graph_set_depth(task_var, trace->depth);
|
2018-11-29 13:50:27 +00:00
|
|
|
|
2017-01-20 02:44:47 +00:00
|
|
|
/*
|
|
|
|
* If no irqs are to be traced, but a set_graph_function
|
|
|
|
* is set, and called by an interrupt handler, we still
|
|
|
|
* want to trace it.
|
|
|
|
*/
|
2021-09-30 00:03:42 +00:00
|
|
|
if (in_hardirq())
|
2017-01-20 02:44:47 +00:00
|
|
|
trace_recursion_set(TRACE_IRQ_BIT);
|
|
|
|
else
|
|
|
|
trace_recursion_clear(TRACE_IRQ_BIT);
|
|
|
|
ret = 1;
|
|
|
|
}
|
|
|
|
|
|
|
|
out:
|
|
|
|
preempt_enable_notrace();
|
|
|
|
return ret;
|
2008-12-03 20:36:57 +00:00
|
|
|
}
|
2013-10-14 08:24:26 +00:00
|
|
|
|
2024-06-03 19:07:20 +00:00
|
|
|
static inline void
|
|
|
|
ftrace_graph_addr_finish(struct fgraph_ops *gops, struct ftrace_graph_ret *trace)
|
2018-11-29 13:50:27 +00:00
|
|
|
{
|
2024-06-03 19:07:20 +00:00
|
|
|
unsigned long *task_var = fgraph_get_task_var(gops);
|
|
|
|
|
|
|
|
if ((*task_var & TRACE_GRAPH_FL) &&
|
2024-06-03 19:07:21 +00:00
|
|
|
trace->depth == ftrace_graph_depth(task_var))
|
2024-06-03 19:07:20 +00:00
|
|
|
*task_var &= ~TRACE_GRAPH_FL;
|
2018-11-29 13:50:27 +00:00
|
|
|
}
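The entry and return halves pair up: the entry handler sets the flag and records the depth at which the filtered function was entered, and the return handler clears the flag only when a function returns at that same depth. A simplified user-space model of that pairing is sketched below. It assumes a single `unsigned long` per task, uses hypothetical `demo_` names, replaces the hash lookup with a `matches` argument, and leaves out the empty-hash and interrupt handling of the real code.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_FL          1UL /* hypothetical flag mask */
#define DEMO_DEPTH_SHIFT 2   /* hypothetical start of the depth field */

/* Entry side: returns true if the function should be traced. */
static bool demo_entry(unsigned long *task_var, int depth, bool matches)
{
	if (*task_var & DEMO_FL)
		return true;	/* nested inside a filtered function */
	if (!matches)
		return false;
	/* Remember that tracing started, and at which depth. */
	*task_var |= DEMO_FL;
	*task_var &= ~(3UL << DEMO_DEPTH_SHIFT);
	*task_var |= ((unsigned long)depth & 3) << DEMO_DEPTH_SHIFT;
	return true;
}

/* Return side: clear the flag when the function that set it returns. */
static void demo_return(unsigned long *task_var, int depth)
{
	unsigned long saved = (*task_var >> DEMO_DEPTH_SHIFT) & 3;

	if ((*task_var & DEMO_FL) && (unsigned long)depth == saved)
		*task_var &= ~DEMO_FL;
}
```

Comparing against the saved depth, rather than against zero, is what lets a filtered function that starts inside an interrupt clear its own flag without affecting the interrupted context.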
|
|
|
|
|
2013-10-14 08:24:26 +00:00
|
|
|
static inline int ftrace_graph_notrace_addr(unsigned long addr)
|
|
|
|
{
|
2017-01-20 02:44:47 +00:00
|
|
|
int ret = 0;
|
2020-02-05 05:57:02 +00:00
|
|
|
struct ftrace_hash *notrace_hash;
|
2013-10-14 08:24:26 +00:00
|
|
|
|
2017-01-20 02:44:47 +00:00
|
|
|
preempt_disable_notrace();
|
2013-10-14 08:24:26 +00:00
|
|
|
|
2020-02-05 07:17:57 +00:00
|
|
|
/*
|
|
|
|
* Have to open code "rcu_dereference_sched()" because the
|
|
|
|
* function graph tracer can be called when RCU is not
|
|
|
|
* "watching".
|
2020-02-05 14:20:32 +00:00
|
|
|
* Protected with schedule_on_each_cpu(ftrace_sync)
|
2020-02-05 07:17:57 +00:00
|
|
|
*/
|
2020-02-05 05:57:02 +00:00
|
|
|
notrace_hash = rcu_dereference_protected(ftrace_graph_notrace_hash,
|
|
|
|
!preemptible());
|
|
|
|
|
|
|
|
if (ftrace_lookup_ip(notrace_hash, addr))
|
2017-01-20 02:44:47 +00:00
|
|
|
ret = 1;
|
2013-10-14 08:24:26 +00:00
|
|
|
|
2017-01-20 02:44:47 +00:00
|
|
|
preempt_enable_notrace();
|
|
|
|
return ret;
|
2013-10-14 08:24:26 +00:00
|
|
|
}
|
2008-11-11 06:14:25 +00:00
|
|
|
#else
|
2024-06-03 19:07:20 +00:00
|
|
|
static inline int ftrace_graph_addr(unsigned long *task_var, struct ftrace_graph_ent *trace)
|
2008-12-04 08:18:28 +00:00
|
|
|
{
|
|
|
|
return 1;
|
2008-12-03 20:36:57 +00:00
|
|
|
}
|
2013-10-14 08:24:26 +00:00
|
|
|
|
|
|
|
static inline int ftrace_graph_notrace_addr(unsigned long addr)
|
|
|
|
{
|
|
|
|
return 0;
|
|
|
|
}
|
2024-06-03 19:07:20 +00:00
|
|
|
static inline void ftrace_graph_addr_finish(struct fgraph_ops *gops, struct ftrace_graph_ret *trace)
|
2018-11-29 13:50:27 +00:00
|
|
|
{ }
|
2008-12-03 20:36:57 +00:00
|
|
|
#endif /* CONFIG_DYNAMIC_FTRACE */
|
2016-12-09 00:28:28 +00:00
|
|
|
|
|
|
|
extern unsigned int fgraph_max_depth;
|
|
|
|
|
2024-06-03 19:07:20 +00:00
|
|
|
static inline bool
|
|
|
|
ftrace_graph_ignore_func(struct fgraph_ops *gops, struct ftrace_graph_ent *trace)
|
2016-12-09 00:28:28 +00:00
|
|
|
{
|
2024-06-03 19:07:20 +00:00
|
|
|
unsigned long *task_var = fgraph_get_task_var(gops);
|
|
|
|
|
2016-12-09 00:28:28 +00:00
|
|
|
/* Trace it when it is nested in, or is itself, an enabled function. */
|
2024-06-03 19:07:20 +00:00
|
|
|
return !((*task_var & TRACE_GRAPH_FL) ||
|
|
|
|
ftrace_graph_addr(task_var, trace)) ||
|
2016-12-09 00:28:28 +00:00
|
|
|
(trace->depth < 0) ||
|
|
|
|
(fgraph_max_depth && trace->depth >= fgraph_max_depth);
|
|
|
|
}
|
|
|
|
|
2024-06-03 19:07:16 +00:00
|
|
|
void fgraph_init_ops(struct ftrace_ops *dst_ops,
|
|
|
|
struct ftrace_ops *src_ops);
|
|
|
|
|
2008-12-03 20:36:57 +00:00
|
|
|
#else /* CONFIG_FUNCTION_GRAPH_TRACER */
|
2008-11-11 06:14:25 +00:00
|
|
|
static inline enum print_line_t
|
2010-04-02 17:01:21 +00:00
|
|
|
print_graph_function_flags(struct trace_iterator *iter, u32 flags)
|
2008-11-11 06:14:25 +00:00
|
|
|
{
|
|
|
|
return TRACE_TYPE_UNHANDLED;
|
|
|
|
}
|
2024-06-03 19:07:12 +00:00
|
|
|
static inline void free_fgraph_ops(struct trace_array *tr) { }
|
2024-06-03 19:07:16 +00:00
|
|
|
/* ftrace_ops may not be defined */
|
|
|
|
#define init_array_fgraph_ops(tr, ops) do { } while (0)
|
|
|
|
#define allocate_fgraph_ops(tr, ops) ({ 0; })
|
2008-12-03 20:36:57 +00:00
|
|
|
#endif /* CONFIG_FUNCTION_GRAPH_TRACER */
|
2008-11-11 06:14:25 +00:00
|
|
|
|
2009-10-13 20:33:52 +00:00
|
|
|
extern struct list_head ftrace_pids;
|
2008-12-03 20:36:59 +00:00
|
|
|
|
2009-06-25 05:30:12 +00:00
|
|
|
#ifdef CONFIG_FUNCTION_TRACER
|
2020-07-25 00:50:48 +00:00
|
|
|
|
|
|
|
#define FTRACE_PID_IGNORE -1
|
|
|
|
#define FTRACE_PID_TRACE -2
|
|
|
|
|
2017-03-31 23:21:41 +00:00
|
|
|
struct ftrace_func_command {
|
|
|
|
struct list_head list;
|
|
|
|
char *name;
|
2017-04-05 17:12:55 +00:00
|
|
|
int (*func)(struct trace_array *tr,
|
|
|
|
struct ftrace_hash *hash,
|
2017-03-31 23:21:41 +00:00
|
|
|
char *func, char *cmd,
|
|
|
|
char *params, int enable);
|
|
|
|
};
|
2013-06-28 02:18:06 +00:00
|
|
|
extern bool ftrace_filter_param __initdata;
|
2016-04-22 22:11:33 +00:00
|
|
|
static inline int ftrace_trace_task(struct trace_array *tr)
|
2008-12-03 20:36:59 +00:00
|
|
|
{
|
2020-07-25 00:50:48 +00:00
|
|
|
return this_cpu_read(tr->array_buffer.data->ftrace_ignore_pid) !=
|
|
|
|
FTRACE_PID_IGNORE;
|
2008-12-03 20:36:59 +00:00
|
|
|
}
|
2011-09-30 01:26:16 +00:00
|
|
|
extern int ftrace_is_dead(void);
|
2014-01-10 21:17:45 +00:00
|
|
|
int ftrace_create_function_files(struct trace_array *tr,
|
|
|
|
struct dentry *parent);
|
|
|
|
void ftrace_destroy_function_files(struct trace_array *tr);
|
2020-09-10 12:39:07 +00:00
|
|
|
int ftrace_allocate_ftrace_ops(struct trace_array *tr);
|
|
|
|
void ftrace_free_ftrace_ops(struct trace_array *tr);
|
2014-01-10 22:01:58 +00:00
|
|
|
void ftrace_init_global_array_ops(struct trace_array *tr);
|
|
|
|
void ftrace_init_array_ops(struct trace_array *tr, ftrace_func_t func);
|
|
|
|
void ftrace_reset_array_ops(struct trace_array *tr);
|
2016-04-22 22:11:33 +00:00
|
|
|
void ftrace_init_tracefs(struct trace_array *tr, struct dentry *d_tracer);
|
2016-07-05 14:04:34 +00:00
|
|
|
void ftrace_init_tracefs_toplevel(struct trace_array *tr,
|
|
|
|
struct dentry *d_tracer);
|
2017-04-17 02:44:27 +00:00
|
|
|
void ftrace_clear_pids(struct trace_array *tr);
|
2017-03-03 18:48:42 +00:00
|
|
|
int init_function_trace(void);
|
2017-04-17 02:44:28 +00:00
|
|
|
void ftrace_pid_follow_fork(struct trace_array *tr, bool enable);
|
2009-06-25 05:30:12 +00:00
|
|
|
#else
|
2016-04-22 22:11:33 +00:00
|
|
|
static inline int ftrace_trace_task(struct trace_array *tr)
|
2009-06-25 05:30:12 +00:00
|
|
|
{
|
|
|
|
return 1;
|
|
|
|
}
|
2011-09-30 01:26:16 +00:00
|
|
|
static inline int ftrace_is_dead(void) { return 0; }
|
2014-01-10 21:17:45 +00:00
|
|
|
static inline int
|
|
|
|
ftrace_create_function_files(struct trace_array *tr,
|
|
|
|
struct dentry *parent)
|
|
|
|
{
|
|
|
|
return 0;
|
|
|
|
}
|
2020-09-10 12:39:07 +00:00
|
|
|
static inline int ftrace_allocate_ftrace_ops(struct trace_array *tr)
|
|
|
|
{
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
static inline void ftrace_free_ftrace_ops(struct trace_array *tr) { }
|
2014-01-10 21:17:45 +00:00
|
|
|
static inline void ftrace_destroy_function_files(struct trace_array *tr) { }
|
2014-01-10 22:01:58 +00:00
|
|
|
static inline __init void
|
|
|
|
ftrace_init_global_array_ops(struct trace_array *tr) { }
|
|
|
|
static inline void ftrace_reset_array_ops(struct trace_array *tr) { }
|
2016-04-22 22:11:33 +00:00
|
|
|
static inline void ftrace_init_tracefs(struct trace_array *tr, struct dentry *d) { }
|
2016-07-05 14:04:34 +00:00
|
|
|
static inline void ftrace_init_tracefs_toplevel(struct trace_array *tr, struct dentry *d) { }
|
2017-04-17 02:44:27 +00:00
|
|
|
static inline void ftrace_clear_pids(struct trace_array *tr) { }
|
2017-03-03 18:48:42 +00:00
|
|
|
static inline int init_function_trace(void) { return 0; }
|
2017-04-17 02:44:28 +00:00
|
|
|
static inline void ftrace_pid_follow_fork(struct trace_array *tr, bool enable) { }
|
2014-01-10 22:01:58 +00:00
|
|
|
/* ftrace_func_t type is not defined, use macro instead of static inline */
|
|
|
|
#define ftrace_init_array_ops(tr, func) do { } while (0)
|
2014-01-10 21:17:45 +00:00
|
|
|
#endif /* CONFIG_FUNCTION_TRACER */
|
|
|
|
|
|
|
|
#if defined(CONFIG_FUNCTION_TRACER) && defined(CONFIG_DYNAMIC_FTRACE)
|
2017-03-31 23:01:14 +00:00
|
|
|
|
|
|
|
struct ftrace_probe_ops {
|
|
|
|
void (*func)(unsigned long ip,
|
|
|
|
unsigned long parent_ip,
|
2017-04-11 02:30:05 +00:00
|
|
|
struct trace_array *tr,
|
2017-04-03 22:18:47 +00:00
|
|
|
struct ftrace_probe_ops *ops,
|
tracing/ftrace: Add a better way to pass data via the probe functions
With the redesign of the registration and execution of the function probes
(triggers), data can now be passed from the setup of the probe to the probe
callers that are specific to the trace_array it is on. Although all probes
still only affect the toplevel trace array, this change will allow for
instances to have their own probes separated from other instances and the
top array.
That is, something like the stacktrace probe can be set to trace only in an
instance and not the toplevel trace array. This isn't implemented yet, but
this change lays the groundwork for it.
When a probe callback is triggered (someone writes the probe format into
set_ftrace_filter), it calls register_ftrace_function_probe() passing in
init_data that will be used to initialize the probe. Then for every matching
function, register_ftrace_function_probe() will call the probe_ops->init()
function with the init data that was passed to it, as well as an address to
a place holder that is associated with the probe and the instance. The first
occurrence will have a NULL in the pointer. The init() function will then
initialize it. If other probes are added, or more functions are part of the
probe, the place holder will be passed to the init() function with the place
holder data that it was initialized to the last time.
Then this place_holder is passed to each of the other probe_ops functions,
where it can be used in the function callback. When the probe_ops free()
function is called, it can be called either with the rip of the function
that is being removed from the probe, or zero, indicating that there are no
more functions attached to the probe, and the place holder is about to be
freed. This gives the probe_ops a way to free the data it assigned to the
place holder if it was allocated during the first init call.
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2017-04-20 02:39:44 +00:00
|
|
|
void *data);
|
2017-03-31 23:01:14 +00:00
|
|
|
int (*init)(struct ftrace_probe_ops *ops,
|
2017-04-11 02:30:05 +00:00
|
|
|
struct trace_array *tr,
|
2017-04-20 02:39:44 +00:00
|
|
|
unsigned long ip, void *init_data,
|
|
|
|
void **data);
|
2017-03-31 23:01:14 +00:00
|
|
|
void (*free)(struct ftrace_probe_ops *ops,
|
2017-04-11 02:30:05 +00:00
|
|
|
struct trace_array *tr,
|
2017-04-20 02:39:44 +00:00
|
|
|
unsigned long ip, void *data);
|
2017-03-31 23:01:14 +00:00
|
|
|
int (*print)(struct seq_file *m,
|
|
|
|
unsigned long ip,
|
|
|
|
struct ftrace_probe_ops *ops,
|
|
|
|
void *data);
|
|
|
|
};
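The commit message above describes a place holder that init() fills in on first use and that free() releases when it is called with an ip of zero. A user-space sketch of that contract follows. The per-probe counter and the `demo_` callbacks are hypothetical, and the ops and trace_array arguments of the real callbacks are dropped for brevity.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-probe data hung off the place holder. */
struct demo_probe_data {
	unsigned long nr_funcs;	/* functions currently attached */
};

/* Called once per matching function; *data is NULL the first time. */
static int demo_init(unsigned long ip, void *init_data, void **data)
{
	struct demo_probe_data *pd = *data;

	(void)ip;
	(void)init_data;
	if (!pd) {
		/* First call: allocate and publish through the place holder. */
		pd = calloc(1, sizeof(*pd));
		if (!pd)
			return -1;
		*data = pd;
	}
	pd->nr_funcs++;
	return 0;
}

/* ip != 0: one function was removed; ip == 0: the place holder goes away. */
static void demo_free(unsigned long ip, void *data)
{
	struct demo_probe_data *pd = data;

	if (!pd)
		return;
	if (ip)
		pd->nr_funcs--;
	else
		free(pd);
}
```

Allocating only when the place holder is still NULL means later init() calls for the same probe reuse the data set up by the first one.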
|
|
|
|
|
2017-04-04 00:58:35 +00:00
|
|
|
struct ftrace_func_mapper;
|
|
|
|
typedef int (*ftrace_mapper_func)(void *data);
|
|
|
|
|
|
|
|
struct ftrace_func_mapper *allocate_ftrace_func_mapper(void);
|
|
|
|
void **ftrace_func_mapper_find_ip(struct ftrace_func_mapper *mapper,
|
|
|
|
unsigned long ip);
|
|
|
|
int ftrace_func_mapper_add_ip(struct ftrace_func_mapper *mapper,
|
|
|
|
unsigned long ip, void *data);
|
|
|
|
void *ftrace_func_mapper_remove_ip(struct ftrace_func_mapper *mapper,
|
|
|
|
unsigned long ip);
|
|
|
|
void free_ftrace_func_mapper(struct ftrace_func_mapper *mapper,
|
|
|
|
ftrace_mapper_func free_func);
|
|
|
|
|
2017-03-31 23:01:14 +00:00
|
|
|
extern int
|
2017-04-05 17:12:55 +00:00
|
|
|
register_ftrace_function_probe(char *glob, struct trace_array *tr,
|
|
|
|
struct ftrace_probe_ops *ops, void *data);
|
2017-04-04 20:44:43 +00:00
|
|
|
extern int
|
2017-04-18 18:50:39 +00:00
|
|
|
unregister_ftrace_function_probe_func(char *glob, struct trace_array *tr,
|
|
|
|
struct ftrace_probe_ops *ops);
|
2017-05-16 17:51:26 +00:00
|
|
|
extern void clear_ftrace_function_probes(struct trace_array *tr);
|
2017-03-31 23:01:14 +00:00
|
|
|
|
2017-03-31 23:21:41 +00:00
|
|
|
int register_ftrace_command(struct ftrace_func_command *cmd);
|
|
|
|
int unregister_ftrace_command(struct ftrace_func_command *cmd);
|
|
|
|
|
2014-01-10 21:17:45 +00:00
|
|
|
void ftrace_create_filter_files(struct ftrace_ops *ops,
|
|
|
|
struct dentry *parent);
|
|
|
|
void ftrace_destroy_filter_files(struct ftrace_ops *ops);
|
2020-01-29 09:36:44 +00:00
|
|
|
|
|
|
|
extern int ftrace_set_filter(struct ftrace_ops *ops, unsigned char *buf,
|
|
|
|
int len, int reset);
|
|
|
|
extern int ftrace_set_notrace(struct ftrace_ops *ops, unsigned char *buf,
|
|
|
|
int len, int reset);
|
2014-01-10 21:17:45 +00:00
|
|
|
#else
|
2017-03-31 23:21:41 +00:00
|
|
|
struct ftrace_func_command;
|
|
|
|
|
|
|
|
static inline __init int register_ftrace_command(struct ftrace_func_command *cmd)
|
|
|
|
{
|
|
|
|
return -EINVAL;
|
|
|
|
}
|
|
|
|
static inline __init int unregister_ftrace_command(struct ftrace_func_command *cmd)
|
|
|
|
{
|
|
|
|
return -EINVAL;
|
|
|
|
}
|
2017-05-18 01:53:32 +00:00
|
|
|
static inline void clear_ftrace_function_probes(struct trace_array *tr)
|
|
|
|
{
|
|
|
|
}
|
|
|
|
|
2014-01-10 21:17:45 +00:00
|
|
|
/*
|
|
|
|
* The ops parameter passed in is usually undefined.
|
|
|
|
* This must be a macro.
|
|
|
|
*/
|
|
|
|
#define ftrace_create_filter_files(ops, parent) do { } while (0)
|
|
|
|
#define ftrace_destroy_filter_files(ops) do { } while (0)
|
|
|
|
#endif /* CONFIG_FUNCTION_TRACER && CONFIG_DYNAMIC_FTRACE */
|
2008-12-03 20:36:59 +00:00
|
|
|
|
2015-09-29 14:43:36 +00:00
|
|
|
bool ftrace_event_is_function(struct trace_event_call *call);
|
2012-02-15 14:51:52 +00:00
|
|
|
|
2009-09-11 15:29:27 +00:00
|
|
|
/*
|
|
|
|
* struct trace_parser - serves for reading the user input separated by spaces
|
|
|
|
* @cont: set if the input is not complete - no final space char was found
|
|
|
|
* @buffer: holds the parsed user input
|
2010-01-29 07:57:49 +00:00
|
|
|
* @idx: user input length
|
2009-09-11 15:29:27 +00:00
|
|
|
* @size: buffer size
|
|
|
|
*/
|
|
|
|
struct trace_parser {
|
|
|
|
bool cont;
|
|
|
|
char *buffer;
|
|
|
|
unsigned idx;
|
|
|
|
unsigned size;
|
|
|
|
};
|
|
|
|
|
|
|
|
static inline bool trace_parser_loaded(struct trace_parser *parser)
|
|
|
|
{
|
|
|
|
return (parser->idx != 0);
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline bool trace_parser_cont(struct trace_parser *parser)
|
|
|
|
{
|
|
|
|
return parser->cont;
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline void trace_parser_clear(struct trace_parser *parser)
|
|
|
|
{
|
|
|
|
parser->cont = false;
|
|
|
|
parser->idx = 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
extern int trace_parser_get_init(struct trace_parser *parser, int size);
|
|
|
|
extern void trace_parser_put(struct trace_parser *parser);
|
|
|
|
extern int trace_get_user(struct trace_parser *parser, const char __user *ubuf,
|
|
|
|
size_t cnt, loff_t *ppos);
|
|
|
|
|
2015-09-29 14:15:10 +00:00
|
|
|
/*
|
|
|
|
* Only create function graph options if function graph is configured.
|
|
|
|
*/
|
|
|
|
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
|
|
|
|
# define FGRAPH_FLAGS \
|
|
|
|
C(DISPLAY_GRAPH, "display-graph"),
|
|
|
|
#else
|
|
|
|
# define FGRAPH_FLAGS
|
|
|
|
#endif
|
|
|
|
|
2015-09-29 14:19:35 +00:00
|
|
|
#ifdef CONFIG_BRANCH_TRACER
|
|
|
|
# define BRANCH_FLAGS \
|
|
|
|
C(BRANCH, "branch"),
|
|
|
|
#else
|
|
|
|
# define BRANCH_FLAGS
|
|
|
|
#endif
|
|
|
|
|
2015-09-29 14:24:56 +00:00
|
|
|
#ifdef CONFIG_FUNCTION_TRACER
|
|
|
|
# define FUNCTION_FLAGS \
|
2017-04-17 02:44:28 +00:00
|
|
|
C(FUNCTION, "function-trace"), \
|
|
|
|
C(FUNC_FORK, "function-fork"),
|
2015-09-29 14:24:56 +00:00
|
|
|
# define FUNCTION_DEFAULT_FLAGS TRACE_ITER_FUNCTION
|
|
|
|
#else
|
|
|
|
# define FUNCTION_FLAGS
|
|
|
|
# define FUNCTION_DEFAULT_FLAGS 0UL
|
2017-04-17 02:44:28 +00:00
|
|
|
# define TRACE_ITER_FUNC_FORK 0UL
|
2015-09-29 14:24:56 +00:00
|
|
|
#endif
|
|
|
|
|
2015-09-29 19:38:55 +00:00
|
|
|
#ifdef CONFIG_STACKTRACE
|
|
|
|
# define STACK_FLAGS \
|
|
|
|
C(STACKTRACE, "stacktrace"),
|
|
|
|
#else
|
|
|
|
# define STACK_FLAGS
|
|
|
|
#endif
|
|
|
|
|
2008-05-12 19:21:00 +00:00
|
|
|
/*
|
|
|
|
* trace_iterator_flags is an enumeration that defines bit
|
|
|
|
* positions into trace_flags that controls the output.
|
|
|
|
*
|
|
|
|
* NOTE: These bits must match the trace_options array in
|
tracing: Use TRACE_FLAGS macro to keep enums and strings matched
Use a cute little macro trick to keep the names of the trace flags file
guaranteed to match the corresponding masks.
The macro TRACE_FLAGS is defined as a series of enum names followed by
the string name of the file that matches it. For example:
#define TRACE_FLAGS \
C(PRINT_PARENT, "print-parent"), \
C(SYM_OFFSET, "sym-offset"), \
C(SYM_ADDR, "sym-addr"), \
C(VERBOSE, "verbose"),
Now we can define the following:
#undef C
#define C(a, b) TRACE_ITER_##a##_BIT
enum trace_iterator_bits { TRACE_FLAGS };
The above creates:
enum trace_iterator_bits {
TRACE_ITER_PRINT_PARENT_BIT,
TRACE_ITER_SYM_OFFSET_BIT,
TRACE_ITER_SYM_ADDR_BIT,
TRACE_ITER_VERBOSE_BIT,
};
Then we can redefine C as:
#undef C
#define C(a, b) TRACE_ITER_##a = (1 << TRACE_ITER_##a##_BIT)
enum trace_iterator_flags { TRACE_FLAGS };
Which creates:
enum trace_iterator_flags {
TRACE_ITER_PRINT_PARENT = (1 << TRACE_ITER_PRINT_PARENT_BIT),
TRACE_ITER_SYM_OFFSET = (1 << TRACE_ITER_SYM_OFFSET_BIT),
TRACE_ITER_SYM_ADDR = (1 << TRACE_ITER_SYM_ADDR_BIT),
TRACE_ITER_VERBOSE = (1 << TRACE_ITER_VERBOSE_BIT),
};
Then finally we can create the list of file names:
#undef C
#define C(a, b) b
static const char *trace_options[] = {
TRACE_FLAGS
NULL
};
Which creates:
static const char *trace_options[] = {
"print-parent",
"sym-offset",
"sym-addr",
"verbose",
NULL
};
The importance of this is that the strings match the bit index.
trace_options[TRACE_ITER_SYM_ADDR_BIT] == "sym-addr"
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2015-09-29 13:43:30 +00:00
|
|
|
* trace.c (this macro guarantees it).
|
2008-05-12 19:21:00 +00:00
|
|
|
*/
|
2015-09-29 13:43:30 +00:00
|
|
|
#define TRACE_FLAGS \
|
|
|
|
C(PRINT_PARENT, "print-parent"), \
|
|
|
|
C(SYM_OFFSET, "sym-offset"), \
|
|
|
|
C(SYM_ADDR, "sym-addr"), \
|
|
|
|
C(VERBOSE, "verbose"), \
|
|
|
|
C(RAW, "raw"), \
|
|
|
|
C(HEX, "hex"), \
|
|
|
|
C(BIN, "bin"), \
|
|
|
|
C(BLOCK, "block"), \
|
2023-03-28 18:51:56 +00:00
|
|
|
C(FIELDS, "fields"), \
|
2015-09-29 13:43:30 +00:00
|
|
|
C(PRINTK, "trace_printk"), \
|
|
|
|
C(ANNOTATE, "annotate"), \
|
|
|
|
C(USERSTACKTRACE, "userstacktrace"), \
|
|
|
|
C(SYM_USEROBJ, "sym-userobj"), \
|
|
|
|
C(PRINTK_MSGONLY, "printk-msg-only"), \
|
|
|
|
C(CONTEXT_INFO, "context-info"), /* Print pid/cpu/time */ \
|
|
|
|
C(LATENCY_FMT, "latency-format"), \
|
|
|
|
C(RECORD_CMD, "record-cmd"), \
|
2017-06-27 02:01:55 +00:00
|
|
|
C(RECORD_TGID, "record-tgid"), \
|
2015-09-29 13:43:30 +00:00
|
|
|
C(OVERWRITE, "overwrite"), \
|
|
|
|
C(STOP_ON_FREE, "disable_on_free"), \
|
|
|
|
C(IRQ_INFO, "irq-info"), \
|
|
|
|
C(MARKERS, "markers"), \
|
2016-04-13 20:59:18 +00:00
|
|
|
C(EVENT_FORK, "event-fork"), \
|
2020-03-17 21:32:31 +00:00
|
|
|
C(PAUSE_ON_TRACE, "pause-on-trace"), \
|
2020-10-15 14:55:25 +00:00
|
|
|
C(HASH_PTR, "hash-ptr"), /* Print hashed pointer */ \
|
2015-09-29 14:24:56 +00:00
|
|
|
FUNCTION_FLAGS \
|
2015-09-29 14:19:35 +00:00
|
|
|
FGRAPH_FLAGS \
|
2015-09-29 19:38:55 +00:00
|
|
|
STACK_FLAGS \
|
2015-09-29 14:19:35 +00:00
|
|
|
BRANCH_FLAGS
|
2015-09-29 13:22:05 +00:00
|
|
|
|
2015-09-29 13:43:30 +00:00
|
|
|
/*
|
|
|
|
* By defining C, we can make TRACE_FLAGS a list of bit names
|
|
|
|
* that will define the bits for the flag masks.
|
|
|
|
*/
|
|
|
|
#undef C
|
|
|
|
#define C(a, b) TRACE_ITER_##a##_BIT
|
|
|
|
|
2015-09-29 22:13:33 +00:00
|
|
|
enum trace_iterator_bits {
|
|
|
|
TRACE_FLAGS
|
|
|
|
/* Make sure we don't go more than we have bits for */
|
|
|
|
TRACE_ITER_LAST_BIT
|
|
|
|
};
|
2015-09-29 13:43:30 +00:00
|
|
|
|
|
|
|
/*
|
|
|
|
* By redefining C, we can make TRACE_FLAGS a list of masks that
|
|
|
|
* use the bits as defined above.
|
|
|
|
*/
|
|
|
|
#undef C
|
|
|
|
#define C(a, b) TRACE_ITER_##a = (1 << TRACE_ITER_##a##_BIT)
|
|
|
|
|
|
|
|
enum trace_iterator_flags { TRACE_FLAGS };
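The three expansions of the flag list can be tried outside the kernel. The following is a minimal standalone sketch, assuming a shortened four-entry list; every DEMO_ and demo_ name is hypothetical and exists only for illustration, and the real list is TRACE_FLAGS above.

```c
#include <stddef.h>

/* Hypothetical, shortened flag list; not the kernel's full TRACE_FLAGS. */
#define DEMO_FLAGS \
	C(PRINT_PARENT, "print-parent"), \
	C(SYM_OFFSET, "sym-offset"), \
	C(SYM_ADDR, "sym-addr"), \
	C(VERBOSE, "verbose"),

/* Pass 1: each entry becomes a bit-position enumerator. */
#undef C
#define C(a, b) DEMO_ITER_##a##_BIT
enum demo_iterator_bits {
	DEMO_FLAGS
	DEMO_ITER_LAST_BIT	/* equals the number of entries above */
};

/* Pass 2: each entry becomes a mask built from its bit position. */
#undef C
#define C(a, b) DEMO_ITER_##a = (1 << DEMO_ITER_##a##_BIT)
enum demo_iterator_flags { DEMO_FLAGS };

/* Pass 3: each entry becomes its option-name string. */
#undef C
#define C(a, b) b
static const char *demo_options[] = {
	DEMO_FLAGS
	NULL
};

/* Return the option name for a bit index, or NULL if out of range. */
static const char *demo_option_name(int bit)
{
	if (bit < 0 || bit >= DEMO_ITER_LAST_BIT)
		return NULL;
	return demo_options[bit];
}
```

Because all three passes expand the same list, demo_option_name(DEMO_ITER_SYM_ADDR_BIT) yields "sym-addr" without any hand-maintained index table; adding an entry to the list updates the bits, the masks, and the strings together.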
|
2008-05-12 19:20:52 +00:00
|
|
|
|
2008-11-11 06:14:25 +00:00
|
|
|
/*
|
|
|
|
* TRACE_ITER_SYM_MASK masks the options in trace_flags that
|
|
|
|
* control the output of kernel symbols.
|
|
|
|
*/
|
|
|
|
#define TRACE_ITER_SYM_MASK \
|
|
|
|
(TRACE_ITER_PRINT_PARENT|TRACE_ITER_SYM_OFFSET|TRACE_ITER_SYM_ADDR)
|
|
|
|
|
2008-09-21 18:16:30 +00:00
|
|
|
extern struct tracer nop_trace;
|
|
|
|
|
2008-11-12 20:24:24 +00:00
|
|
|
#ifdef CONFIG_BRANCH_TRACER
|
2008-11-12 20:24:24 +00:00
|
|
|
extern int enable_branch_tracing(struct trace_array *tr);
|
|
|
|
extern void disable_branch_tracing(void);
|
|
|
|
static inline int trace_branch_enable(struct trace_array *tr)
|
2008-11-12 05:14:40 +00:00
|
|
|
{
|
2015-09-30 13:42:05 +00:00
|
|
|
if (tr->trace_flags & TRACE_ITER_BRANCH)
|
2008-11-12 20:24:24 +00:00
|
|
|
return enable_branch_tracing(tr);
|
2008-11-12 05:14:40 +00:00
|
|
|
return 0;
|
|
|
|
}
|
2008-11-12 20:24:24 +00:00
|
|
|
static inline void trace_branch_disable(void)
|
2008-11-12 05:14:40 +00:00
|
|
|
{
|
|
|
|
/* due to races, always disable */
|
2008-11-12 20:24:24 +00:00
|
|
|
disable_branch_tracing();
|
2008-11-12 05:14:40 +00:00
|
|
|
}
|
|
|
|
#else
|
2008-11-12 20:24:24 +00:00
|
|
|
static inline int trace_branch_enable(struct trace_array *tr)
|
2008-11-12 05:14:40 +00:00
|
|
|
{
|
|
|
|
return 0;
|
|
|
|
}
|
2008-11-12 20:24:24 +00:00
|
|
|
static inline void trace_branch_disable(void)
|
2008-11-12 05:14:40 +00:00
|
|
|
{
|
|
|
|
}
|
2008-11-12 20:24:24 +00:00
|
|
|
#endif /* CONFIG_BRANCH_TRACER */
|
2008-11-12 05:14:40 +00:00
|
|
|
|
2009-03-11 18:33:00 +00:00
|
|
|
/* set ring buffers to default size if not already done so */
|
2023-09-06 09:18:37 +00:00
|
|
|
int tracing_update_buffers(struct trace_array *tr);
|
2009-03-11 18:33:00 +00:00
|
|
|
|
2023-08-16 15:49:26 +00:00
|
|
|
union trace_synth_field {
|
|
|
|
u8 as_u8;
|
|
|
|
u16 as_u16;
|
|
|
|
u32 as_u32;
|
|
|
|
u64 as_u64;
|
|
|
|
struct trace_dynamic_info as_dynamic;
|
|
|
|
};
|
|
|
|
|
2009-03-22 08:30:39 +00:00
|
|
|
struct ftrace_event_field {
|
|
|
|
struct list_head link;
|
2013-02-28 01:41:37 +00:00
|
|
|
const char *name;
|
|
|
|
const char *type;
|
2009-08-07 02:33:02 +00:00
|
|
|
int filter_type;
|
2009-03-22 08:30:39 +00:00
|
|
|
int offset;
|
|
|
|
int size;
|
2009-04-28 08:04:53 +00:00
|
|
|
int is_signed;
|
2023-02-12 15:13:03 +00:00
|
|
|
int len;
|
2009-03-22 08:30:39 +00:00
|
|
|
};
|
|
|
|
|
2018-03-09 18:19:28 +00:00
|
|
|
struct prog_entry;
|
|
|
|
|
2009-04-28 08:04:47 +00:00
|
|
|
struct event_filter {
|
2018-03-09 18:19:28 +00:00
|
|
|
struct prog_entry __rcu *prog;
|
|
|
|
char *filter_string;
|
2009-04-28 08:04:47 +00:00
|
|
|
};
|
|
|
|
|
2009-03-22 08:31:17 +00:00
|
|
|
struct event_subsystem {
|
|
|
|
struct list_head list;
|
|
|
|
const char *name;
|
2009-07-20 02:20:53 +00:00
|
|
|
struct event_filter *filter;
|
2011-07-05 15:36:06 +00:00
|
|
|
int ref_count;
|
2009-03-22 08:31:17 +00:00
|
|
|
};
|
|
|
|
|
2015-05-13 18:59:40 +00:00
|
|
|
struct trace_subsystem_dir {
|
2012-05-04 03:09:03 +00:00
|
|
|
struct list_head list;
|
|
|
|
struct event_subsystem *subsystem;
|
|
|
|
struct trace_array *tr;
|
eventfs: Remove eventfs_file and just use eventfs_inode
Instead of having a descriptor for every file represented in the eventfs
directory, only have the directory itself represented. Change the API to
send in a list of entries that represent all the files in the directory
(but not other directories). The entry list contains a name and a callback
function that will be used to create the files when they are accessed.
struct eventfs_inode *eventfs_create_events_dir(const char *name, struct dentry *parent,
const struct eventfs_entry *entries,
int size, void *data);
is used for the top level eventfs directory, and returns an eventfs_inode
that will be used by:
struct eventfs_inode *eventfs_create_dir(const char *name, struct eventfs_inode *parent,
const struct eventfs_entry *entries,
int size, void *data);
where both of the above take an array of struct eventfs_entry entries for
every file that is in the directory.
The entries are defined by:
typedef int (*eventfs_callback)(const char *name, umode_t *mode, void **data,
const struct file_operations **fops);
struct eventfs_entry {
const char *name;
eventfs_callback callback;
};
Where the name is the name of the file and the callback gets called when
the file is being created. The callback passes in the name (in case the
same callback is used for multiple files), a pointer to the mode, data and
fops. The data will be pointing to the data that was passed in
eventfs_create_dir() or eventfs_create_events_dir() but may be overridden
to point to something else, as it will be used to point to the
inode->i_private that is created. The information passed back from the
callback is used to create the dentry/inode.
If the callback fills the data and the file should be created, it must
return a positive number. On zero or negative, the file is ignored.
This logic may also be used as a prototype to convert entire pseudo file
systems into just-in-time allocation.
The "show_events_dentry" file has been updated to show the directories,
and any files they have.
With just the eventfs_file allocations:
Before after deltas for meminfo (in kB):
MemFree: -14360
MemAvailable: -14260
Buffers: 40
Cached: 24
Active: 44
Inactive: 48
Inactive(anon): 28
Active(file): 44
Inactive(file): 20
Dirty: -4
AnonPages: 28
Mapped: 4
KReclaimable: 132
Slab: 1604
SReclaimable: 132
SUnreclaim: 1472
Committed_AS: 12
Before after deltas for slabinfo:
<slab>: <objects> [ * <size> = <total>]
ext4_inode_cache 27 [* 1184 = 31968 ]
extent_status 102 [* 40 = 4080 ]
tracefs_inode_cache 144 [* 656 = 94464 ]
buffer_head 39 [* 104 = 4056 ]
shmem_inode_cache 49 [* 800 = 39200 ]
filp -53 [* 256 = -13568 ]
dentry 251 [* 192 = 48192 ]
lsm_file_cache 277 [* 32 = 8864 ]
vm_area_struct -14 [* 184 = -2576 ]
trace_event_file 1748 [* 88 = 153824 ]
kmalloc-1k 35 [* 1024 = 35840 ]
kmalloc-256 49 [* 256 = 12544 ]
kmalloc-192 -28 [* 192 = -5376 ]
kmalloc-128 -30 [* 128 = -3840 ]
kmalloc-96 10581 [* 96 = 1015776 ]
kmalloc-64 3056 [* 64 = 195584 ]
kmalloc-32 1291 [* 32 = 41312 ]
kmalloc-16 2310 [* 16 = 36960 ]
kmalloc-8 9216 [* 8 = 73728 ]
Free memory dropped by 14,360 kB
Available memory dropped by 14,260 kB
Total slab additions in size: 1,771,032 bytes
With this change:
Before after deltas for meminfo (in kB):
MemFree: -12084
MemAvailable: -11976
Buffers: 32
Cached: 32
Active: 72
Inactive: 168
Inactive(anon): 176
Active(file): 72
Inactive(file): -8
Dirty: 24
AnonPages: 196
Mapped: 8
KReclaimable: 148
Slab: 836
SReclaimable: 148
SUnreclaim: 688
Committed_AS: 324
Before after deltas for slabinfo:
<slab>: <objects> [ * <size> = <total>]
tracefs_inode_cache 144 [* 656 = 94464 ]
shmem_inode_cache -23 [* 800 = -18400 ]
filp -92 [* 256 = -23552 ]
dentry 179 [* 192 = 34368 ]
lsm_file_cache -3 [* 32 = -96 ]
vm_area_struct -13 [* 184 = -2392 ]
trace_event_file 1748 [* 88 = 153824 ]
kmalloc-1k -49 [* 1024 = -50176 ]
kmalloc-256 -27 [* 256 = -6912 ]
kmalloc-128 1864 [* 128 = 238592 ]
kmalloc-64 4685 [* 64 = 299840 ]
kmalloc-32 -72 [* 32 = -2304 ]
kmalloc-16 256 [* 16 = 4096 ]
total = 721352
Free memory dropped by 12,084 kB
Available memory dropped by 11,976 kB
Total slab additions in size: 721,352 bytes
That's over 2 MB in savings per instance for free and available memory,
and over 1 MB in savings per instance of slab memory.
Link: https://lore.kernel.org/linux-trace-kernel/20231003184059.4924468e@gandalf.local.home
Link: https://lore.kernel.org/linux-trace-kernel/20231004165007.43d79161@gandalf.local.home
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ajay Kaher <akaher@vmware.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-10-04 20:50:07 +00:00
|
|
|
struct eventfs_inode *ei;
|
2012-05-04 03:09:03 +00:00
|
|
|
int ref_count;
|
|
|
|
int nr_events;
|
|
|
|
};
|
|
|
|
|
2016-04-27 14:13:46 +00:00
|
|
|
extern int call_filter_check_discard(struct trace_event_call *call, void *rec,
|
2019-12-13 18:58:57 +00:00
|
|
|
struct trace_buffer *buffer,
|
2016-04-27 14:13:46 +00:00
|
|
|
struct ring_buffer_event *event);
|
2016-04-28 16:04:13 +00:00
|
|
|
|
|
|
|
void trace_buffer_unlock_commit_regs(struct trace_array *tr,
|
2019-12-13 18:58:57 +00:00
|
|
|
struct trace_buffer *buffer,
|
2016-04-28 16:04:13 +00:00
|
|
|
struct ring_buffer_event *event,
|
2021-01-25 19:45:08 +00:00
|
|
|
unsigned int trace_ctx,
|
2016-04-28 16:04:13 +00:00
|
|
|
struct pt_regs *regs);
|
2016-04-29 21:44:01 +00:00
|
|
|
|
|
|
|
static inline void trace_buffer_unlock_commit(struct trace_array *tr,
|
2019-12-13 18:58:57 +00:00
|
|
|
struct trace_buffer *buffer,
|
2016-04-29 21:44:01 +00:00
|
|
|
struct ring_buffer_event *event,
|
2021-01-25 19:45:08 +00:00
|
|
|
unsigned int trace_ctx)
|
2016-04-29 21:44:01 +00:00
|
|
|
{
|
2021-01-25 19:45:08 +00:00
|
|
|
trace_buffer_unlock_commit_regs(tr, buffer, event, trace_ctx, NULL);
|
2016-04-29 21:44:01 +00:00
|
|
|
}
|
|
|
|
|
2024-02-20 14:06:16 +00:00
|
|
|
DECLARE_PER_CPU(bool, trace_taskinfo_save);
|
|
|
|
int trace_save_cmdline(struct task_struct *tsk);
|
|
|
|
int trace_create_savedcmd(void);
|
|
|
|
int trace_alloc_tgid_map(void);
|
|
|
|
void trace_free_saved_cmdlines_buffer(void);
|
|
|
|
|
|
|
|
extern const struct file_operations tracing_saved_cmdlines_fops;
|
|
|
|
extern const struct file_operations tracing_saved_tgids_fops;
|
|
|
|
extern const struct file_operations tracing_saved_cmdlines_size_fops;
|
|
|
|
|
2016-05-03 21:15:43 +00:00
|
|
|
DECLARE_PER_CPU(struct ring_buffer_event *, trace_buffered_event);
|
|
|
|
DECLARE_PER_CPU(int, trace_buffered_event_cnt);
|
|
|
|
void trace_buffered_event_disable(void);
|
|
|
|
void trace_buffered_event_enable(void);
|
|
|
|
|
2023-02-07 17:28:51 +00:00
|
|
|
void early_enable_events(struct trace_array *tr, char *buf, bool disable_first);
|
|
|
|
|
2016-05-03 21:15:43 +00:00
|
|
|
static inline void
|
2019-12-13 18:58:57 +00:00
|
|
|
__trace_event_discard_commit(struct trace_buffer *buffer,
|
2016-05-03 21:15:43 +00:00
|
|
|
struct ring_buffer_event *event)
|
|
|
|
{
|
|
|
|
if (this_cpu_read(trace_buffered_event) == event) {
|
2021-11-30 02:39:47 +00:00
|
|
|
/* Simply release the temp buffer and enable preemption */
|
2016-05-03 21:15:43 +00:00
|
|
|
this_cpu_dec(trace_buffered_event_cnt);
|
2021-11-30 02:39:47 +00:00
|
|
|
preempt_enable_notrace();
|
2016-05-03 21:15:43 +00:00
|
|
|
return;
|
|
|
|
}
|
2021-11-30 02:39:47 +00:00
|
|
|
/* ring_buffer_discard_commit() enables preemption */
|
2016-05-03 21:15:43 +00:00
|
|
|
ring_buffer_discard_commit(buffer, event);
|
|
|
|
}
|
|
|
|
|
2016-04-27 01:22:22 +00:00
|
|
|
/*
|
|
|
|
* Helper function for event_trigger_unlock_commit{_regs}().
|
|
|
|
* If there are event triggers attached to this event that require
|
2020-10-10 14:09:24 +00:00
|
|
|
* filtering against its fields, then they will be called as the
|
2016-04-27 01:22:22 +00:00
|
|
|
* entry already holds the field information of the current event.
|
|
|
|
*
|
|
|
|
* It also checks if the event should be discarded or not.
|
|
|
|
* It is to be discarded if the event is soft disabled and the
|
|
|
|
* event was only recorded to process triggers, or if the event
|
|
|
|
* filter is active and this event did not match the filters.
|
|
|
|
*
|
|
|
|
* Returns true if the event is discarded, false otherwise.
|
|
|
|
*/
|
|
|
|
static inline bool
|
|
|
|
__event_trigger_test_discard(struct trace_event_file *file,
|
2019-12-13 18:58:57 +00:00
|
|
|
struct trace_buffer *buffer,
|
2016-04-27 01:22:22 +00:00
|
|
|
struct ring_buffer_event *event,
|
|
|
|
void *entry,
|
|
|
|
enum event_trigger_type *tt)
|
|
|
|
{
|
|
|
|
unsigned long eflags = file->flags;
|
|
|
|
|
|
|
|
if (eflags & EVENT_FILE_FL_TRIGGER_COND)
|
2021-03-16 16:41:03 +00:00
|
|
|
*tt = event_triggers_call(file, buffer, entry, event);
|
2016-04-27 01:22:22 +00:00
|
|
|
|
2021-11-26 22:34:42 +00:00
|
|
|
if (likely(!(file->flags & (EVENT_FILE_FL_SOFT_DISABLED |
|
|
|
|
EVENT_FILE_FL_FILTERED |
|
|
|
|
EVENT_FILE_FL_PID_FILTER))))
|
|
|
|
return false;
|
|
|
|
|
|
|
|
if (file->flags & EVENT_FILE_FL_SOFT_DISABLED)
|
|
|
|
goto discard;
|
|
|
|
|
|
|
|
if (file->flags & EVENT_FILE_FL_FILTERED &&
|
|
|
|
!filter_match_preds(file->filter, entry))
|
|
|
|
goto discard;
|
|
|
|
|
|
|
|
if ((file->flags & EVENT_FILE_FL_PID_FILTER) &&
|
|
|
|
trace_event_ignore_this_pid(file))
|
|
|
|
goto discard;
|
2016-04-27 01:22:22 +00:00
|
|
|
|
2016-04-27 15:09:42 +00:00
|
|
|
return false;
|
2021-11-26 22:34:42 +00:00
|
|
|
discard:
|
|
|
|
__trace_event_discard_commit(buffer, event);
|
|
|
|
return true;
|
2016-04-27 01:22:22 +00:00
|
|
|
}
|
|
|
|
|
|
|
|
/**
|
|
|
|
* event_trigger_unlock_commit - handle triggers and finish event commit
|
2021-03-23 17:49:35 +00:00
|
|
|
* @file: The file pointer associated with the event
|
2016-04-27 01:22:22 +00:00
|
|
|
* @buffer: The ring buffer that the event is being written to
|
|
|
|
* @event: The event meta data in the ring buffer
|
|
|
|
* @entry: The event itself
|
2021-01-25 19:45:08 +00:00
|
|
|
* @trace_ctx: The tracing context flags.
|
2016-04-27 01:22:22 +00:00
|
|
|
*
|
|
|
|
* This is a helper function to handle triggers that require data
|
|
|
|
* from the event itself. It also tests the event against filters and
|
|
|
|
* if the event is soft disabled and should be discarded.
|
|
|
|
*/
|
|
|
|
static inline void
|
|
|
|
event_trigger_unlock_commit(struct trace_event_file *file,
|
2019-12-13 18:58:57 +00:00
|
|
|
struct trace_buffer *buffer,
|
2016-04-27 01:22:22 +00:00
|
|
|
struct ring_buffer_event *event,
|
2021-01-25 19:45:08 +00:00
|
|
|
void *entry, unsigned int trace_ctx)
|
2016-04-27 01:22:22 +00:00
|
|
|
{
|
|
|
|
enum event_trigger_type tt = ETT_NONE;
|
|
|
|
|
|
|
|
if (!__event_trigger_test_discard(file, buffer, event, entry, &tt))
|
2021-01-25 19:45:08 +00:00
|
|
|
trace_buffer_unlock_commit(file->tr, buffer, event, trace_ctx);
|
2016-04-27 01:22:22 +00:00
|
|
|
|
|
|
|
if (tt)
|
2018-05-07 20:02:14 +00:00
|
|
|
event_triggers_post_call(file, tt);
|
2016-04-27 01:22:22 +00:00
|
|
|
}
|
|
|
|
|
2011-01-28 03:54:33 +00:00
|
|
|
#define FILTER_PRED_INVALID ((unsigned short)-1)
|
|
|
|
#define FILTER_PRED_IS_RIGHT (1 << 15)
|
2011-01-28 04:16:51 +00:00
|
|
|
#define FILTER_PRED_FOLD (1 << 15)
|
2011-01-28 03:54:33 +00:00
|
|
|
|
2011-01-28 04:21:34 +00:00
|
|
|
/*
|
|
|
|
* The max preds is the size of unsigned short with
|
|
|
|
* two flags at the MSBs. One bit is used for both the IS_RIGHT
|
|
|
|
* and FOLD flags. The other is reserved.
|
|
|
|
*
|
|
|
|
* 2^14 preds is way more than enough.
|
|
|
|
*/
|
|
|
|
#define MAX_FILTER_PRED 16384
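The arithmetic in the comment above can be checked with a small standalone sketch: an unsigned short has 16 bits, two of which are set aside as flags, leaving 14 bits and therefore 2^14 = 16384 predicate indexes. The DEMO_ and demo_ names below are hypothetical helpers, not kernel API, and the packing scheme is an assumption made only to illustrate the bit budget.

```c
#include <stdbool.h>

#define DEMO_PRED_FLAG		(1u << 15)	/* mirrors the shared IS_RIGHT/FOLD bit */
#define DEMO_PRED_RESERVED	(1u << 14)	/* the other MSB, never set here */
#define DEMO_MAX_PRED		(1u << 14)	/* 16384 indexes: 0 .. 16383 */

/* Pack a predicate index and the flag bit into one unsigned short. */
static unsigned short demo_pred_pack(unsigned short index, bool flag)
{
	unsigned short v = index & (DEMO_MAX_PRED - 1);

	if (flag)
		v |= DEMO_PRED_FLAG;
	return v;
}

/* Recover the index by masking off the two MSBs. */
static unsigned short demo_pred_index(unsigned short v)
{
	return v & (DEMO_MAX_PRED - 1);
}

/* Test whether the flag bit is set. */
static bool demo_pred_flag(unsigned short v)
{
	return (v & DEMO_PRED_FLAG) != 0;
}
```

In this sketch the reserved bit is never set, so no packed value can equal the all-ones pattern that FILTER_PRED_INVALID uses.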
|
2011-01-28 04:19:49 +00:00
|
|
|
|
2009-03-22 08:31:04 +00:00
|
|
|
struct filter_pred;
|
2009-09-24 19:10:44 +00:00
|
|
|
struct regex;
|
2009-03-22 08:31:04 +00:00
|
|
|
|
2009-09-24 19:10:44 +00:00
|
|
|
typedef int (*regex_match_func)(char *str, struct regex *r, int len);
|
|
|
|
|
2009-09-24 19:31:51 +00:00
|
|
|
enum regex_type {
|
2009-10-15 03:21:12 +00:00
|
|
|
MATCH_FULL = 0,
|
2009-09-24 19:31:51 +00:00
|
|
|
MATCH_FRONT_ONLY,
|
|
|
|
MATCH_MIDDLE_ONLY,
|
|
|
|
MATCH_END_ONLY,
|
2016-10-05 11:58:15 +00:00
|
|
|
MATCH_GLOB,
|
ftrace: Allow enabling of filters via index of available_filter_functions
Enabling a large number of functions by echoing in a large subset of the
functions in available_filter_functions can take a very long time. The
process requires testing all functions registered by the function tracer
(which is in the 10s of thousands), and doing a kallsyms lookup to convert
the ip address into a name, then comparing that name with the string passed
in.
When a function causes the function tracer to crash the system, a binary
bisect of the available_filter_functions can be done to find the culprit.
But this requires passing in half of the functions in
available_filter_functions over and over again, which makes it basically an
O(n^2) operation. With 40,000 functions, that ends up being 1,600,000,000
operations! And enabling this can take over 20 minutes.
As a quick speed up, if a number is passed into one of the filter files,
instead of doing a search, it just enables the function at the corresponding
line of the available_filter_functions file. That is:
# echo 50 > set_ftrace_filter
# cat set_ftrace_filter
x86_pmu_commit_txn
# head -50 available_filter_functions | tail -1
x86_pmu_commit_txn
This allows setting of half the available_filter_functions to take place in
less than a second!
# time seq 20000 > set_ftrace_filter
real 0m0.042s
user 0m0.005s
sys 0m0.015s
# wc -l set_ftrace_filter
20000 set_ftrace_filter
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-02-11 20:00:48 +00:00
|
|
|
MATCH_INDEX,
|
2009-09-24 19:31:51 +00:00
|
|
|
};
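How a pattern string might map onto these match types can be illustrated with a simplified standalone classifier. This is a sketch under assumptions, not the body of filter_parse_regex(): the demo_ names are hypothetical, the pattern is not modified, and the `search` and `not` outputs of the real prototype are not modelled.

```c
#include <ctype.h>
#include <string.h>

enum demo_regex_type {
	DEMO_MATCH_FULL = 0,
	DEMO_MATCH_FRONT_ONLY,
	DEMO_MATCH_MIDDLE_ONLY,
	DEMO_MATCH_END_ONLY,
	DEMO_MATCH_GLOB,
	DEMO_MATCH_INDEX,
};

/* Classify a pattern by where its wildcards sit (illustrative rules only). */
static enum demo_regex_type demo_classify(const char *pat)
{
	size_t len = strlen(pat);
	enum demo_regex_type type = DEMO_MATCH_FULL;
	size_t i;

	/* A leading digit is treated as a line index, as in "echo 50". */
	if (len && isdigit((unsigned char)pat[0]))
		return DEMO_MATCH_INDEX;

	for (i = 0; i < len; i++) {
		if (pat[i] == '*') {
			if (i == 0)
				type = DEMO_MATCH_END_ONLY;	/* "*foo" */
			else if (i == len - 1)
				type = (type == DEMO_MATCH_END_ONLY) ?
					DEMO_MATCH_MIDDLE_ONLY :	/* "*foo*" */
					DEMO_MATCH_FRONT_ONLY;		/* "foo*" */
			else
				return DEMO_MATCH_GLOB;		/* "f*o" */
		} else if (strchr("[?\\", pat[i])) {
			return DEMO_MATCH_GLOB;			/* other glob syntax */
		}
	}
	return type;
}
```

Under these rules "sched*" compares only the front of a name, "*sched" only the end, "*sched*" searches the middle, and anything with an interior wildcard falls back to full glob matching.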
|
|
|
|
|
2009-09-24 19:10:44 +00:00
|
|
|
struct regex {
|
|
|
|
char pattern[MAX_FILTER_STR_VAL];
|
|
|
|
int len;
|
|
|
|
int field_len;
|
|
|
|
regex_match_func match;
|
|
|
|
};
|
|
|
|
|
2015-12-10 18:50:43 +00:00
|
|
|
static inline bool is_string_field(struct ftrace_event_field *field)
|
|
|
|
{
|
|
|
|
return field->filter_type == FILTER_DYN_STRING ||
|
2021-11-22 09:30:12 +00:00
|
|
|
field->filter_type == FILTER_RDYN_STRING ||
|
2015-12-10 18:50:43 +00:00
|
|
|
field->filter_type == FILTER_STATIC_STRING ||
|
2017-02-08 18:36:37 +00:00
|
|
|
field->filter_type == FILTER_PTR_STRING ||
|
|
|
|
field->filter_type == FILTER_COMM;
|
2015-12-10 18:50:43 +00:00
|
|
|
}

static inline bool is_function_field(struct ftrace_event_field *field)
{
	return field->filter_type == FILTER_TRACE_FN;
}

2009-09-24 19:31:51 +00:00
|
|
|
extern enum regex_type
|
|
|
|
filter_parse_regex(char *buff, int len, char **search, int *not);
|
2015-05-05 14:09:53 +00:00
|
|
|
extern void print_event_filter(struct trace_event_file *file,
|
2009-03-24 07:14:31 +00:00
|
|
|
struct trace_seq *s);
|
2015-05-05 14:09:53 +00:00
|
|
|
extern int apply_event_filter(struct trace_event_file *file,
|
tracing/filters: a better event parser
Replace the current event parser hack with a better one. Filters are
no longer specified predicate by predicate, but all at once and can
use parens and any of the following operators:
numeric fields:
==, !=, <, <=, >, >=
string fields:
==, !=
predicates can be combined with the logical operators:
&&, ||
examples:
"common_preempt_count > 4" > filter
"((sig >= 10 && sig < 15) || sig == 17) && comm != bash" > filter
If there was an error, the erroneous string along with an error
message can be seen by looking at the filter e.g.:
((sig >= 10 && sig < 15) || dsig == 17) && comm != bash
^
parse_error: Field not found
Currently the caret for an error always appears at the beginning of
the filter; a real position should be used, but the error message
should be useful even without it.
To clear a filter, '0' can be written to the filter file.
Filters can also be set or cleared for a complete subsystem by writing
the same filter as would be written to an individual event to the
filter file at the root of the subsystem. Note however, that if any
event in the subsystem lacks a field specified in the filter being
set, the set will fail and all filters in the subsystem are
automatically cleared. This change from the previous version was made
because using only the fields that happen to exist for a given event
would most likely result in a meaningless filter.
Because the logical operators are now implemented as predicates, the
maximum number of predicates in a filter was increased from 8 to 16.
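As an illustration of why the logical operators count toward the predicate limit, the sketch below evaluates a filter stored as a flat postfix list in which && and || occupy slots of their own. This is only one plausible representation, assumed for the example; `enum op`, `struct pred`, `MAX_PREDS` and `eval_postfix()` are hypothetical names and are not taken from the kernel source. The filter "(sig >= 10 && sig < 15) || sig == 17" uses three comparisons plus two logical operators, so it occupies five of the 16 slots.

```c
#include <stdbool.h>
#include <stddef.h>

/* Comparison operators plus the two logical operators. */
enum op { OP_EQ, OP_NE, OP_LT, OP_LE, OP_GT, OP_GE, OP_AND, OP_OR };

/* One slot of the filter; val is ignored for OP_AND and OP_OR. */
struct pred {
	enum op op;
	long val;
};

#define MAX_PREDS 16	/* illustrative limit, matching the number quoted above */

/*
 * Evaluate a postfix predicate list against a single numeric field.
 * Malformed lists (operand underflow, leftover operands) evaluate to false.
 */
static bool eval_postfix(const struct pred *p, size_t n, long field)
{
	bool stack[MAX_PREDS];
	size_t top = 0, i;

	if (n > MAX_PREDS)
		return false;

	for (i = 0; i < n; i++) {
		bool r;

		if (p[i].op == OP_AND || p[i].op == OP_OR) {
			bool a, b;

			if (top < 2)
				return false;
			b = stack[--top];
			a = stack[--top];
			r = (p[i].op == OP_AND) ? (a && b) : (a || b);
		} else {
			switch (p[i].op) {
			case OP_EQ: r = field == p[i].val; break;
			case OP_NE: r = field != p[i].val; break;
			case OP_LT: r = field <  p[i].val; break;
			case OP_LE: r = field <= p[i].val; break;
			case OP_GT: r = field >  p[i].val; break;
			case OP_GE: r = field >= p[i].val; break;
			default:    r = false; break;
			}
		}
		stack[top++] = r;
	}
	return top == 1 && stack[0];
}
```

In postfix order the example filter reads: sig >= 10, sig < 15, &&, sig == 17, ||.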
[ Impact: add new, extended trace-filter implementation ]
Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: fweisbec@gmail.com
Cc: Li Zefan <lizf@cn.fujitsu.com>
LKML-Reference: <1240905899.6416.121.camel@tropicana>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-28 08:04:59 +00:00
|
|
|
char *filter_string);
|
2015-05-13 18:59:40 +00:00
|
|
|
extern int apply_subsystem_event_filter(struct trace_subsystem_dir *dir,
|
tracing/filters: a better event parser
2009-04-28 08:04:59 +00:00
|
|
|
char *filter_string);
|
|
|
|
extern void print_subsystem_event_filter(struct event_subsystem *system,
|
2009-04-17 05:27:08 +00:00
|
|
|
struct trace_seq *s);
|
2009-08-07 02:33:02 +00:00
|
|
|
extern int filter_assign_type(const char *type);
|
2019-04-01 20:07:48 +00:00
|
|
|
extern int create_event_filter(struct trace_array *tr,
|
|
|
|
struct trace_event_call *call,
|
tracing: Add and use generic set_trigger_filter() implementation
Add a generic event_command.set_trigger_filter() op implementation and
have the current set of trigger commands use it - this essentially
gives them all support for filters.
Syntactically, filters are supported by adding 'if <filter>' just
after the command, in which case only events matching the filter will
invoke the trigger. For example, to add a filter to an
enable/disable_event command:
echo 'enable_event:system:event if common_pid == 999' > \
.../othersys/otherevent/trigger
The above command will only enable the system:event event if the
common_pid field in the othersys:otherevent event is 999.
As another example, to add a filter to a stacktrace command:
echo 'stacktrace if common_pid == 999' > \
.../somesys/someevent/trigger
The above command will only trigger a stacktrace if the common_pid
field in the event is 999.
The filter syntax is the same as that described in the 'Event
filtering' section of Documentation/trace/events.txt.
Because triggers can now use filters, the trigger-invoking logic needs
to be moved in those cases - e.g. for ftrace_raw_event_calls, if a
trigger has a filter associated with it, the trigger invocation now
needs to happen after the { assign; } part of the call, in order for
the trigger condition to be tested.
There's still a SOFT_DISABLED-only check at the top of e.g. the
ftrace_raw_events function, so when an event is soft disabled but not
because of the presence of a trigger, the original SOFT_DISABLED
behavior remains unchanged.
There's also a bit of trickiness in that some triggers need to avoid
being invoked while an event is currently in the process of being
logged, since the trigger may itself log data into the trace buffer.
Thus we make sure the current event is committed before invoking those
triggers. To do that, we split the trigger invocation in two - the
first part (event_triggers_call()) checks the filter using the current
trace record; if a command has the post_trigger flag set, it sets a
bit for itself in the return value, otherwise it directly invokes the
trigger. Once all commands have been either invoked or set their
return flag, event_triggers_call() returns. The current record is
then either committed or discarded; if any commands have deferred
their triggers, those commands are finally invoked following the close
of the current event by event_triggers_post_call().
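The two-phase split described above can be sketched in plain C. This is a simplified model under stated assumptions, not the kernel code: `struct trigger`, `triggers_call()` and `triggers_post_call()` are hypothetical names, the filter is reduced to a single pid comparison, and "invoking" a trigger just increments a counter. The first pass tests the filter against the current record and either fires the trigger immediately or sets its bit in the return value; the second pass runs after the record has been committed or discarded and fires only the deferred triggers.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified trigger; not the kernel's data structure. */
struct trigger {
	unsigned int type_bit;	/* unique bit identifying the command */
	bool post_trigger;	/* must wait until the record is closed */
	int filter_pid;		/* -1 means no filter is attached */
	int fired;		/* how often the trigger has been invoked */
};

/*
 * Phase one: test each filter against the current record (here just pid).
 * Triggers that may run immediately are invoked; the others set their bit
 * in the returned mask.
 */
static unsigned int triggers_call(struct trigger *t, size_t n, int pid)
{
	unsigned int deferred = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (t[i].filter_pid >= 0 && t[i].filter_pid != pid)
			continue;	/* filter did not match */
		if (t[i].post_trigger)
			deferred |= t[i].type_bit;
		else
			t[i].fired++;
	}
	return deferred;
}

/* Phase two: once the record is closed, invoke the deferred triggers. */
static void triggers_post_call(struct trigger *t, size_t n,
			       unsigned int deferred)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (deferred & t[i].type_bit)
			t[i].fired++;
	}
}
```

A caller would invoke `triggers_call()` while the record is still open, commit or discard the record, and then hand the returned mask to `triggers_post_call()`.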
To simplify the above and make it more efficient, the TRIGGER_COND bit
is introduced, which is set only if a soft-disabled trigger needs to
use the log record for filter testing or needs to wait until the
current log record is closed.
The syscall event invocation code is also changed in analogous ways.
Because event triggers need to be able to create and free filters,
this also adds a couple external wrappers for the existing
create_filter and free_filter functions, which are too generic to be
made extern functions themselves.
Link: http://lkml.kernel.org/r/7164930759d8719ef460357f143d995406e4eead.1382622043.git.tom.zanussi@linux.intel.com
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-10-24 13:59:29 +00:00
|
|
|
char *filter_str, bool set_str,
|
|
|
|
struct event_filter **filterp);
|
|
|
|
extern void free_event_filter(struct event_filter *filter);
|
2009-03-22 08:31:04 +00:00
|
|
|
|
2013-03-11 07:13:42 +00:00
|
|
|
struct ftrace_event_field *
|
2015-05-05 15:45:27 +00:00
|
|
|
trace_find_event_field(struct trace_event_call *call, char *name);
|
2010-04-22 14:35:55 +00:00
|
|
|
|
2010-07-02 03:07:32 +00:00
|
|
|
extern void trace_event_enable_cmd_record(bool enable);
|
2017-06-27 02:01:55 +00:00
|
|
|
extern void trace_event_enable_tgid_record(bool enable);
|
|
|
|
|
2018-05-08 19:09:27 +00:00
|
|
|
extern int event_trace_init(void);
|
2023-01-04 21:14:12 +00:00
|
|
|
extern int init_events(void);
|
2012-08-03 20:10:49 +00:00
|
|
|
extern int event_trace_add_tracer(struct dentry *parent, struct trace_array *tr);
|
2012-08-07 20:14:16 +00:00
|
|
|
extern int event_trace_del_tracer(struct trace_array *tr);
|
2020-09-24 16:40:08 +00:00
|
|
|
extern void __trace_early_add_events(struct trace_array *tr);
|
2010-07-02 03:07:32 +00:00
|
|
|
|
2018-05-08 19:06:38 +00:00
|
|
|
extern struct trace_event_file *__find_event_file(struct trace_array *tr,
|
|
|
|
const char *system,
|
|
|
|
const char *event);
|
2015-05-05 14:09:53 +00:00
|
|
|
extern struct trace_event_file *find_event_file(struct trace_array *tr,
|
|
|
|
const char *system,
|
|
|
|
const char *event);
|
2013-10-24 13:59:28 +00:00
|
|
|
|
tracing: Add basic event trigger framework
Add a 'trigger' file for each trace event, enabling 'trace event
triggers' to be set for trace events.
'trace event triggers' are patterned after the existing 'ftrace
function triggers' implementation except that triggers are written to
per-event 'trigger' files instead of to a single file such as the
'set_ftrace_filter' used for ftrace function triggers.
The implementation is meant to be entirely separate from ftrace
function triggers, in order to keep the respective implementations
relatively simple and to allow them to diverge.
The event trigger functionality is built on top of SOFT_DISABLE
functionality. It adds a TRIGGER_MODE bit to the ftrace_event_file
flags which is checked when any trace event fires. Triggers set for a
particular event need to be checked regardless of whether that event
is actually enabled or not - getting an event to fire even if it's not
enabled is what's already implemented by SOFT_DISABLE mode, so trigger
mode directly reuses that. Event triggers essentially inherit the soft
disable logic in __ftrace_event_enable_disable() while adding a bit of
logic and trigger reference counting via tm_ref on top of that in a
new trace_event_trigger_enable_disable() function. Because the base
__ftrace_event_enable_disable() code now needs to be invoked from
outside trace_events.c, a wrapper is also added for those usages.
The triggers for an event are actually invoked via a new function,
event_triggers_call(), and code is also added to invoke them for
ftrace_raw_event calls as well as syscall events.
The main part of the patch creates a new trace_events_trigger.c file
to contain the trace event triggers implementation.
The standard open, read, and release file operations are implemented
here.
The open() implementation sets up for the various open modes of the
'trigger' file. It creates and attaches the trigger iterator and sets
up the command parser. If opened for reading, it sets up the trigger
seq_ops.
The read() implementation parses the event trigger written to the
'trigger' file, looks up the trigger command, and passes it along to
that event_command's func() implementation for command-specific
processing.
The release() implementation does whatever cleanup is needed to
release the 'trigger' file, like releasing the parser and trigger
iterator, etc.
A couple of functions for event command registration and
unregistration are added, along with a list to add them to and a mutex
to protect them, as well as an (initially empty) registration function
to add the set of commands that will be added by future commits, and
call to it from the trace event initialization code.
Also added are a couple trigger-specific data structures needed for
these implementations such as a trigger iterator and a struct for
trigger-specific data.
A couple structs consisting mostly of functions meant to be implemented
in command-specific ways, event_command and event_trigger_ops, are
used by the generic event trigger command implementations. They're
being put into trace.h alongside the other trace_event data structures
and functions, in the expectation that they'll be needed in several
trace_event-related files such as trace_events_trigger.c and
trace_events.c.
The event_command.func() function is meant to be called by the trigger
parsing code in order to add a trigger instance to the corresponding
event. It essentially coordinates adding a live trigger instance to
the event, and arming the triggering event.
Every event_command func() implementation essentially does the
same thing for any command:
- choose ops - use the value of param to choose either a number or
count version of event_trigger_ops specific to the command
- do the register or unregister of those ops
- associate a filter, if specified, with the triggering event
The reg() and unreg() ops allow command-specific implementations for
event_trigger_op registration and unregistration, and the
get_trigger_ops() op allows command-specific event_trigger_ops
selection to be parameterized. When a trigger instance is added, the
reg() op essentially adds that trigger to the triggering event and
arms it, while unreg() does the opposite. The set_filter() function
is used to associate a filter with the trigger - if the command
doesn't specify a set_filter() implementation, the command will ignore
filters.
Each command has an associated trigger_type, which serves double duty,
both as a unique identifier for the command as well as a value that
can be used for setting a trigger mode bit during trigger invocation.
The signature of func() adds a pointer to the event_command struct,
used to invoke those functions, along with a command_data param that
can be passed to the reg/unreg functions. This allows func()
implementations to use command-specific blobs and supports code
re-use.
The event_trigger_ops.func() command corresponds to the trigger 'probe'
function that gets called when the triggering event is actually
invoked. The other functions are used to list the trigger when
needed, along with a couple mundane book-keeping functions.
This also moves event_file_data() into trace.h so it can be used
outside of trace_events.c.
Link: http://lkml.kernel.org/r/316d95061accdee070aac8e5750afba0192fa5b9.1382622043.git.tom.zanussi@linux.intel.com
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Idea-by: Steve Rostedt <rostedt@goodmis.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-10-24 13:59:24 +00:00
|
|
|
static inline void *event_file_data(struct file *filp)
|
|
|
|
{
|
locking/atomics: COCCINELLE/treewide: Convert trivial ACCESS_ONCE() patterns to READ_ONCE()/WRITE_ONCE()
Please do not apply this to mainline directly, instead please re-run the
coccinelle script shown below and apply its output.
For several reasons, it is desirable to use {READ,WRITE}_ONCE() in
preference to ACCESS_ONCE(), and new code is expected to use one of the
former. So far, there's been no reason to change most existing uses of
ACCESS_ONCE(), as these aren't harmful, and changing them results in
churn.
However, for some features, the read/write distinction is critical to
correct operation. To distinguish these cases, separate read/write
accessors must be used. This patch migrates (most) remaining
ACCESS_ONCE() instances to {READ,WRITE}_ONCE(), using the following
coccinelle script:
----
// Convert trivial ACCESS_ONCE() uses to equivalent READ_ONCE() and
// WRITE_ONCE()
// $ make coccicheck COCCI=/home/mark/once.cocci SPFLAGS="--include-headers" MODE=patch
virtual patch
@ depends on patch @
expression E1, E2;
@@
- ACCESS_ONCE(E1) = E2
+ WRITE_ONCE(E1, E2)
@ depends on patch @
expression E;
@@
- ACCESS_ONCE(E)
+ READ_ONCE(E)
----
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: davem@davemloft.net
Cc: linux-arch@vger.kernel.org
Cc: mpe@ellerman.id.au
Cc: shuah@kernel.org
Cc: snitzer@redhat.com
Cc: thor.thayer@linux.intel.com
Cc: tj@kernel.org
Cc: viro@zeniv.linux.org.uk
Cc: will.deacon@arm.com
Link: http://lkml.kernel.org/r/1508792849-3115-19-git-send-email-paulmck@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-10-23 21:07:29 +00:00
|
|
|
return READ_ONCE(file_inode(filp)->i_private);
|
tracing: Add basic event trigger framework
2013-10-24 13:59:24 +00:00
|
|
|
}
|
|
|
|
|
2009-05-06 02:33:45 +00:00
|
|
|
extern struct mutex event_mutex;
|
2009-04-10 17:52:20 +00:00
|
|
|
extern struct list_head ftrace_events;
|
2009-03-19 19:26:15 +00:00
|
|
|
|
tracing: Have format file honor EVENT_FILE_FL_FREED
When eventfs was introduced, special care had to be taken to coordinate the
freeing of the file meta data with the files that are exposed to user
space. The file meta data would have a ref count that is set when the file
is created and would be decremented and freed after the last user that
opened the file closed it. When the file meta data was to be freed, it
would set a flag (EVENT_FILE_FL_FREED) to denote that the file is freed,
and any new references made (like new opens or reads) would fail as it is
marked freed. This allowed other meta data to be freed after this flag was
set (under the event_mutex).
All the files that were dynamically created in the events directory had a
pointer to the file meta data and would call event_release() when the last
reference to the user space file was closed. This would be the time that it
is safe to free the file meta data.
A shortcut was made for the "format" file. Its i_private would point to
the "call" entry directly and not point to the file's meta data. This is
because all format files are the same for the same "call", so it was
thought there was no reason to differentiate them. The other files
maintain state (like the "enable", "trigger", etc). But this meant if the
file were to disappear, the "format" file would be unaware of it.
This caused a race that could be triggered via the user_events test (that
would create dynamic events and free them), and running a loop that would
read the user_events format files:
In one console run:
# cd tools/testing/selftests/user_events
# while true; do ./ftrace_test; done
And in another console run:
# cd /sys/kernel/tracing/
# while true; do cat events/user_events/__test_event/format; done 2>/dev/null
With KASAN memory checking, it would trigger a use-after-free bug report
(which was a real bug). This was because the format file was not checking
the file's meta data flag "EVENT_FILE_FL_FREED", so it would access the
event that the file meta data pointed to after the event was freed.
After inspection, there are other locations that were found to not check
the EVENT_FILE_FL_FREED flag when accessing the trace_event_file. Add a
new helper function: event_file_file() that will make sure that the
event_mutex is held, and will return NULL if the trace_event_file has the
EVENT_FILE_FL_FREED flag set. Have the first reference of the struct file
pointer use event_file_file() and check for NULL. Later uses can still use
the event_file_data() helper function if the event_mutex is still held and
was not released since the event_file_file() call.
Link: https://lore.kernel.org/all/20240719204701.1605950-1-minipli@grsecurity.net/
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Ajay Kaher <ajay.kaher@broadcom.com>
Cc: Ilkka Naulapää <digirigawa@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Dan Carpenter <dan.carpenter@linaro.org>
Cc: Beau Belgrave <beaub@linux.microsoft.com>
Cc: Florian Fainelli <florian.fainelli@broadcom.com>
Cc: Alexey Makhalov <alexey.makhalov@broadcom.com>
Cc: Vasavi Sirnapalli <vasavi.sirnapalli@broadcom.com>
Link: https://lore.kernel.org/20240730110657.3b69d3c1@gandalf.local.home
Fixes: b63db58e2fa5d ("eventfs/tracing: Add callback for release of an eventfs_inode")
Reported-by: Mathias Krause <minipli@grsecurity.net>
Tested-by: Mathias Krause <minipli@grsecurity.net>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-07-30 15:06:57 +00:00
/*
 * When the trace_event_file is the filp->i_private pointer,
 * it must be taken under the event_mutex lock, and then checked
 * if the EVENT_FILE_FL_FREED flag is set. If it is, then the
 * data pointed to by the trace_event_file can not be trusted.
 *
 * Use the event_file_file() to access the trace_event_file from
 * the filp the first time under the event_mutex and check for
 * NULL. If it is needed to be retrieved again and the event_mutex
 * is still held, then the event_file_data() can be used and it
 * is guaranteed to be valid.
 */
static inline struct trace_event_file *event_file_file(struct file *filp)
{
	struct trace_event_file *file;

	lockdep_assert_held(&event_mutex);
	file = READ_ONCE(file_inode(filp)->i_private);
	if (!file || file->flags & EVENT_FILE_FL_FREED)
		return NULL;
	return file;
}
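The check performed by the helper above can be modelled outside the kernel with a small sketch. The names here (`FILE_FL_FREED`, `struct meta`, `struct handle`, `handle_meta()`) are hypothetical stand-ins invented for the illustration, and the locking is only assumed: the caller is expected to hold whatever lock serializes updates to the flags, which the kernel helper asserts with lockdep. The point is that the first lookup through the handle returns NULL both when nothing is attached and when the metadata has already been marked freed.

```c
#include <stddef.h>

/* Hypothetical stand-in for EVENT_FILE_FL_FREED. */
#define FILE_FL_FREED	(1u << 0)

/* Hypothetical metadata object, playing the role of trace_event_file. */
struct meta {
	unsigned int flags;
	const char *name;
};

/* Hypothetical handle; priv plays the role of inode->i_private. */
struct handle {
	struct meta *priv;
};

/*
 * Return the metadata behind the handle, or NULL if there is none or it
 * has been marked freed. The caller is assumed to hold the lock that
 * protects the flags for as long as it uses the returned pointer.
 */
static struct meta *handle_meta(struct handle *h)
{
	struct meta *m = h->priv;

	if (!m || (m->flags & FILE_FL_FREED))
		return NULL;
	return m;
}
```

Callers treat a NULL return as "the file is gone" and bail out instead of dereferencing stale data.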
|
|
|
|
|
tracing: Add basic event trigger framework
Add a 'trigger' file for each trace event, enabling 'trace event
triggers' to be set for trace events.
'trace event triggers' are patterned after the existing 'ftrace
function triggers' implementation except that triggers are written to
per-event 'trigger' files instead of to a single file such as the
'set_ftrace_filter' used for ftrace function triggers.
The implementation is meant to be entirely separate from ftrace
function triggers, in order to keep the respective implementations
relatively simple and to allow them to diverge.
The event trigger functionality is built on top of SOFT_DISABLE
functionality. It adds a TRIGGER_MODE bit to the ftrace_event_file
flags which is checked when any trace event fires. Triggers set for a
particular event need to be checked regardless of whether that event
is actually enabled or not - getting an event to fire even if it's not
enabled is what's already implemented by SOFT_DISABLE mode, so trigger
mode directly reuses that. Event trigger essentially inherit the soft
disable logic in __ftrace_event_enable_disable() while adding a bit of
logic and trigger reference counting via tm_ref on top of that in a
new trace_event_trigger_enable_disable() function. Because the base
__ftrace_event_enable_disable() code now needs to be invoked from
outside trace_events.c, a wrapper is also added for those usages.
The triggers for an event are actually invoked via a new function,
event_triggers_call(), and code is also added to invoke them for
ftrace_raw_event calls as well as syscall events.
The main part of the patch creates a new trace_events_trigger.c file
to contain the trace event triggers implementation.
The standard open, read, and release file operations are implemented
here.
The open() implementation sets up for the various open modes of the
'trigger' file. It creates and attaches the trigger iterator and sets
up the command parser. If opened for reading set up the trigger
seq_ops.
The read() implementation parses the event trigger written to the
'trigger' file, looks up the trigger command, and passes it along to
that event_command's func() implementation for command-specific
processing.
The release() implementation does whatever cleanup is needed to
release the 'trigger' file, like releasing the parser and trigger
iterator, etc.
A couple of functions for event command registration and
unregistration are added, along with a list to add them to and a mutex
to protect them, as well as an (initially empty) registration function
to add the set of commands that will be added by future commits, and
call to it from the trace event initialization code.
also added are a couple trigger-specific data structures needed for
these implementations such as a trigger iterator and a struct for
trigger-specific data.
A couple structs consisting mostly of function meant to be implemented
in command-specific ways, event_command and event_trigger_ops, are
used by the generic event trigger command implementations. They're
being put into trace.h alongside the other trace_event data structures
and functions, in the expectation that they'll be needed in several
trace_event-related files such as trace_events_trigger.c and
trace_events.c.
The event_command.func() function is meant to be called by the trigger
parsing code in order to add a trigger instance to the corresponding
event. It essentially coordinates adding a live trigger instance to
the event, and arming the triggering the event.
Every event_command func() implementation essentially does the
same thing for any command:
- choose ops - use the value of param to choose either a number or
count version of event_trigger_ops specific to the command
- do the register or unregister of those ops
- associate a filter, if specified, with the triggering event
The reg() and unreg() ops allow command-specific implementations for
event_trigger_op registration and unregistration, and the
get_trigger_ops() op allows command-specific event_trigger_ops
selection to be parameterized. When a trigger instance is added, the
reg() op essentially adds that trigger to the triggering event and
arms it, while unreg() does the opposite. The set_filter() function
is used to associate a filter with the trigger - if the command
doesn't specify a set_filter() implementation, the command will ignore
filters.
Each command has an associated trigger_type, which serves double duty,
both as a unique identifier for the command as well as a value that
can be used for setting a trigger mode bit during trigger invocation.
The signature of func() adds a pointer to the event_command struct,
used to invoke those functions, along with a command_data param that
can be passed to the reg/unreg functions. This allows func()
implementations to use command-specific blobs and supports code
re-use.
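The func() flow described above (choose ops, register or unregister,
then attach a filter if the command supports one) can be sketched in
plain userspace C. This is only an illustration: every mock_* name
below is hypothetical and none of these types match the real
event_command or event_trigger_ops definitions in trace.h.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for event_trigger_ops; not the kernel struct. */
struct mock_trigger_ops {
	const char *name;
};

/* Hypothetical stand-in for the triggering event. */
struct mock_event {
	const struct mock_trigger_ops *armed;	/* NULL when nothing is registered */
	const char *filter;			/* NULL when no filter is attached */
};

/* Hypothetical stand-in for event_command. */
struct mock_command {
	const char *name;
	unsigned int trigger_type;	/* unique id, also usable as a mode bit */
	const struct mock_trigger_ops *(*get_trigger_ops)(const char *param);
	int (*reg)(struct mock_event *ev, const struct mock_trigger_ops *ops);
	void (*unreg)(struct mock_event *ev);
	int (*set_filter)(struct mock_event *ev, const char *filter);	/* may be NULL */
};

static const struct mock_trigger_ops plain_ops = { "plain" };
static const struct mock_trigger_ops count_ops = { "count" };

/* Pick the count variant when a param was given, the plain one otherwise. */
static const struct mock_trigger_ops *mock_get_ops(const char *param)
{
	return param ? &count_ops : &plain_ops;
}

/* Add the trigger to the event and arm it; fail if one is already armed. */
static int mock_reg(struct mock_event *ev, const struct mock_trigger_ops *ops)
{
	if (ev->armed)
		return -1;
	ev->armed = ops;
	return 0;
}

/* Disarm the trigger and drop any attached filter. */
static void mock_unreg(struct mock_event *ev)
{
	ev->armed = NULL;
	ev->filter = NULL;
}

static int mock_set_filter(struct mock_event *ev, const char *filter)
{
	ev->filter = filter;
	return 0;
}

/* Generic func(): choose ops, then register or unregister, then filter. */
static int mock_func(const struct mock_command *cmd, struct mock_event *ev,
		     int remove, const char *param, const char *filter)
{
	const struct mock_trigger_ops *ops = cmd->get_trigger_ops(param);
	int ret;

	if (remove) {
		cmd->unreg(ev);
		return 0;
	}

	ret = cmd->reg(ev, ops);
	if (ret)
		return ret;

	/* A command without set_filter() silently ignores filters. */
	if (filter && cmd->set_filter)
		ret = cmd->set_filter(ev, filter);

	return ret;
}
```

A caller would fill in a struct mock_command with the four helpers and
call mock_func() once per parsed trigger string. Passing the command
struct into func() is what lets one generic implementation serve
several commands.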
The event_trigger_ops.func() function corresponds to the trigger 'probe'
function that gets called when the triggering event is actually
invoked. The other functions are used to list the trigger when
needed, along with a couple of mundane book-keeping functions.
This also moves event_file_data() into trace.h so it can be used
outside of trace_events.c.
Link: http://lkml.kernel.org/r/316d95061accdee070aac8e5750afba0192fa5b9.1382622043.git.tom.zanussi@linux.intel.com
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Idea-by: Steve Rostedt <rostedt@goodmis.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-10-24 13:59:24 +00:00
|
|
|
extern const struct file_operations event_trigger_fops;
|
tracing: Add 'hist' event trigger command
'hist' triggers allow users to continually aggregate trace events,
which can then be viewed afterwards by simply reading a 'hist' file
containing the aggregation in a human-readable format.
The basic idea is very simple and boils down to a mechanism whereby
trace events, rather than being exhaustively dumped in raw form and
viewed directly, are automatically 'compressed' into meaningful tables
completely defined by the user.
This is done strictly via single-line command-line commands and
without the aid of any kind of programming language or interpreter.
A surprising number of typical use cases can be accomplished by users
via this simple mechanism. In fact, a large number of the tasks that
users typically do using the more complicated script-based tracing
tools, at least during the initial stages of an investigation, can be
accomplished by simply specifying a set of keys and values to be used
in the creation of a hash table.
The Linux kernel trace event subsystem happens to provide an extensive
list of keys and values ready-made for such a purpose in the form of
the event format files associated with each trace event. By simply
consulting the format file for field names of interest and by plugging
them into the hist trigger command, users can create an endless number
of useful aggregations to help with investigating various properties
of the system. See Documentation/trace/events.txt for examples.
hist triggers are implemented on top of the existing event trigger
infrastructure, and as such are consistent with the existing triggers
from a user's perspective as well.
The basic syntax follows the existing trigger syntax. Users start an
aggregation by writing a 'hist' trigger to the event of interest's
trigger file:
# echo hist:keys=xxx [ if filter] > event/trigger
Once a hist trigger has been set up, by default it continually
aggregates every matching event into a hash table using the event key
and a value field named 'hitcount'.
To view the aggregation at any point in time, simply read the 'hist'
file in the same directory as the 'trigger' file:
# cat event/hist
The detailed syntax provides additional options for user control, and
is described exhaustively in Documentation/trace/events.txt and in the
virtual tracing/README file in the tracing subsystem.
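The aggregation idea (each matching event bumps a 'hitcount' stored
under its key) can be sketched as a small fixed-size hash table in
userspace C. This is a simplified illustration only: the hist_* names
and the linear-probing layout are assumptions made for the sketch and
do not reflect how the kernel implements hist triggers.

```c
#include <stddef.h>
#include <stdint.h>

#define HIST_BUCKETS 64	/* arbitrary table size for this sketch */

struct hist_entry {
	uint64_t key;		/* value of the key field in the event */
	uint64_t hitcount;	/* number of events seen with this key */
	int used;
};

struct hist_table {
	struct hist_entry slots[HIST_BUCKETS];
};

/* Record one event with the given key; returns -1 if the table is full. */
static int hist_add(struct hist_table *t, uint64_t key)
{
	size_t start = (size_t)(key % HIST_BUCKETS);

	for (size_t i = 0; i < HIST_BUCKETS; i++) {
		struct hist_entry *e = &t->slots[(start + i) % HIST_BUCKETS];

		if (!e->used) {
			e->used = 1;
			e->key = key;
			e->hitcount = 1;
			return 0;
		}
		if (e->key == key) {
			e->hitcount++;
			return 0;
		}
	}
	return -1;
}

/* Look up the current hitcount for a key; 0 if it was never seen. */
static uint64_t hist_hitcount(const struct hist_table *t, uint64_t key)
{
	size_t start = (size_t)(key % HIST_BUCKETS);

	for (size_t i = 0; i < HIST_BUCKETS; i++) {
		const struct hist_entry *e = &t->slots[(start + i) % HIST_BUCKETS];

		if (!e->used)
			return 0;
		if (e->key == key)
			return e->hitcount;
	}
	return 0;
}
```

With a zero-initialized table, calling hist_add() once per event and
then walking the used slots gives the same kind of key/hitcount
listing that reading the 'hist' file is described as providing.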
Link: http://lkml.kernel.org/r/72d263b5e1853fe9c314953b65833c3aa75479f2.1457029949.git.tom.zanussi@linux.intel.com
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Tested-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2016-03-03 18:54:42 +00:00
|
|
|
extern const struct file_operations event_hist_fops;
|
2020-04-03 19:31:21 +00:00
|
|
|
extern const struct file_operations event_hist_debug_fops;
|
tracing: Introduce trace event injection
We have been trying to use rasdaemon to monitor hardware errors like
correctable memory errors. rasdaemon uses trace events to monitor
various hardware errors. To test it, we have to inject some hardware
errors, but unfortunately not all of them provide error injection.
MCE does provide a way to inject MCE errors, but errors like PCI
errors and devlink errors do not, and it is not easy to add error
injection to each of them. Instead, it is relatively easy to allow
users to inject trace events in a generic way, so that all trace
events can be injected.
This patch introduces trace event injection, where a new 'inject' file
is added to each tracepoint directory. Users can write key=value pairs
into this file to specify the value of each field of the trace event;
all unspecified fields are set to zero by default.
For example, for the net/net_dev_queue tracepoint, we can inject:
INJECT=/sys/kernel/debug/tracing/events/net/net_dev_queue/inject
echo "" > $INJECT
echo "name='test'" > $INJECT
echo "name='test' len=1024" > $INJECT
cat /sys/kernel/debug/tracing/trace
...
<...>-614 [000] .... 36.571483: net_dev_queue: dev= skbaddr=00000000fbf338c2 len=0
<...>-614 [001] .... 136.588252: net_dev_queue: dev=test skbaddr=00000000fbf338c2 len=0
<...>-614 [001] .N.. 208.431878: net_dev_queue: dev=test skbaddr=00000000fbf338c2 len=1024
Triggers fire as usual too:
echo "stacktrace if len == 1025" > /sys/kernel/debug/tracing/events/net/net_dev_queue/trigger
echo "len=1025" > $INJECT
cat /sys/kernel/debug/tracing/trace
...
bash-614 [000] .... 36.571483: net_dev_queue: dev= skbaddr=00000000fbf338c2 len=0
bash-614 [001] .... 136.588252: net_dev_queue: dev=test skbaddr=00000000fbf338c2 len=0
bash-614 [001] .N.. 208.431878: net_dev_queue: dev=test skbaddr=00000000fbf338c2 len=1024
bash-614 [001] .N.1 284.236349: <stack trace>
=> event_inject_write
=> vfs_write
=> ksys_write
=> do_syscall_64
=> entry_SYSCALL_64_after_hwframe
The only fields that can't be injected are string pointers: they
require constant string pointers, which can't be set up at run time.
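The key=value handling described above, with unspecified fields left
at zero, can be sketched in userspace C. This is a hedged illustration
only: struct mock_net_dev_queue and mock_inject_parse() are
hypothetical, cover just the 'name' and 'len' fields used in the
example, and are not the parser behind the 'inject' file.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical record with the two fields used in the example above. */
struct mock_net_dev_queue {
	char name[16];
	unsigned int len;
};

/*
 * Parse space-separated key=value pairs into *ev.
 * Fields that are not mentioned keep their zero value.
 * Returns 0 on success, -1 on malformed input or an unknown key.
 * Quoted strings containing spaces are not handled in this sketch.
 */
static int mock_inject_parse(const char *input, struct mock_net_dev_queue *ev)
{
	char buf[128];
	char *p = buf;

	memset(ev, 0, sizeof(*ev));
	if (strlen(input) >= sizeof(buf))
		return -1;
	strcpy(buf, input);

	while (*p) {
		char *tok, *eq;

		p += strspn(p, " \n");		/* skip separators */
		if (!*p)
			break;
		tok = p;
		p += strcspn(p, " \n");		/* find end of token */
		if (*p)
			*p++ = '\0';

		eq = strchr(tok, '=');
		if (!eq)
			return -1;
		*eq++ = '\0';

		if (strcmp(tok, "name") == 0) {
			size_t n = strlen(eq);

			/* String values must be wrapped in single quotes. */
			if (n < 2 || eq[0] != '\'' || eq[n - 1] != '\'')
				return -1;
			if (n - 2 >= sizeof(ev->name))
				return -1;
			memcpy(ev->name, eq + 1, n - 2);
			ev->name[n - 2] = '\0';
		} else if (strcmp(tok, "len") == 0) {
			char *end;
			unsigned long v = strtoul(eq, &end, 0);

			if (end == eq || *end != '\0')
				return -1;
			ev->len = (unsigned int)v;
		} else {
			return -1;
		}
	}
	return 0;
}
```

Feeding it the three strings from the example ("", "name='test'" and
"name='test' len=1024") yields records that differ only in the fields
that were actually named, which mirrors the zero-by-default behavior
described in the commit message.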
Link: http://lkml.kernel.org/r/20191130045218.18979-1-xiyou.wangcong@gmail.com
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-11-30 04:52:18 +00:00
|
|
|
extern const struct file_operations event_inject_fops;
|
2016-03-03 18:54:42 +00:00
|
|
|
|
|
|
|
#ifdef CONFIG_HIST_TRIGGERS
|
|
|
|
extern int register_trigger_hist_cmd(void);
|
2016-03-03 18:54:55 +00:00
|
|
|
extern int register_trigger_hist_enable_disable_cmds(void);
|
2016-03-03 18:54:42 +00:00
|
|
|
#else
|
|
|
|
static inline int register_trigger_hist_cmd(void) { return 0; }
|
2016-03-03 18:54:55 +00:00
|
|
|
static inline int register_trigger_hist_enable_disable_cmds(void) { return 0; }
|
2016-03-03 18:54:42 +00:00
|
|
|
#endif
|
2013-10-24 13:59:24 +00:00
|
|
|
|
|
|
|
extern int register_trigger_cmds(void);
|
|
|
|
extern void clear_event_triggers(struct trace_array *tr);
|
|
|
|
|
tracing: Add a probe that attaches to trace events
A new dynamic event is introduced: event probe. The event is attached
to an existing tracepoint and uses its fields as arguments. The user
can specify a custom format string for the new event, select which
tracepoint arguments will be printed, and how to print them.
An event probe is created by writing a configuration string to the
'dynamic_events' ftrace file:
e[:[SNAME/]ENAME] SYSTEM/EVENT [FETCHARGS] - Set an event probe
-:SNAME/ENAME - Delete an event probe
Where:
SNAME - System name, if omitted 'eprobes' is used.
ENAME - Name of the new event in SNAME, if omitted the SYSTEM_EVENT is used.
SYSTEM - Name of the system, where the tracepoint is defined, mandatory.
EVENT - Name of the tracepoint event in SYSTEM, mandatory.
FETCHARGS - Arguments:
<name>=$<field>[:TYPE] - Fetch given field of the tracepoint and print
it as given TYPE with given name. Supported
types are:
(u8/u16/u32/u64/s8/s16/s32/s64), basic type
(x8/x16/x32/x64), hexadecimal types
"string", "ustring" and bitfield.
For example, attach an event probe to the openat system call and print
the name of the file that will be opened:
echo "e:esys/eopen syscalls/sys_enter_openat file=\$filename:string" >> dynamic_events
A new dynamic event is created in events/esys/eopen/ directory. It
can be deleted with:
echo "-:esys/eopen" >> dynamic_events
Filters, triggers and histograms can be attached to the new event, and
it can be matched in synthetic events. There is one limitation: an
event probe cannot be attached to a kprobe, a uprobe or another event
probe.
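The 'e[:[SNAME/]ENAME] SYSTEM/EVENT' part of the syntax above, with
its two defaults, can be sketched as a small userspace C parser. This
is an illustration under stated assumptions: struct mock_eprobe_def,
copy_name() and mock_eprobe_parse() are hypothetical names, FETCHARGS
are ignored, and the real dynamic_events parser may validate input
differently.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical result of parsing one event probe definition. */
struct mock_eprobe_def {
	char sname[64];		/* system of the new event, default "eprobes" */
	char ename[128];	/* name of the new event, default SYSTEM_EVENT */
	char system[64];	/* system of the tracepoint being probed */
	char event[64];		/* tracepoint being probed */
};

/* Copy src into dst; fail instead of truncating. */
static int copy_name(char *dst, size_t size, const char *src)
{
	int n = snprintf(dst, size, "%s", src);

	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Returns 0 on success, -1 if the definition is malformed. */
static int mock_eprobe_parse(const char *def, struct mock_eprobe_def *out)
{
	char head[128] = "";
	char target[128] = "";
	const char *sname, *ename;
	char *slash, *name;

	memset(out, 0, sizeof(*out));
	if (sscanf(def, "%127s %127s", head, target) != 2)
		return -1;

	/* The first word must be "e" or "e:..." */
	if (head[0] != 'e' || (head[1] != '\0' && head[1] != ':'))
		return -1;

	/* SYSTEM/EVENT: both parts are mandatory. */
	slash = strchr(target, '/');
	if (!slash || slash == target || slash[1] == '\0')
		return -1;
	*slash = '\0';
	if (copy_name(out->system, sizeof(out->system), target) ||
	    copy_name(out->event, sizeof(out->event), slash + 1))
		return -1;

	/* Optional [SNAME/]ENAME after "e:". */
	name = (head[1] == ':') ? head + 2 : head + 1;
	slash = strchr(name, '/');
	if (slash) {
		*slash = '\0';
		sname = name;
		ename = slash + 1;
	} else {
		sname = "";
		ename = name;
	}

	if (copy_name(out->sname, sizeof(out->sname),
		      *sname ? sname : "eprobes"))
		return -1;

	if (*ename)
		return copy_name(out->ename, sizeof(out->ename), ename);

	/* No ENAME given: fall back to SYSTEM_EVENT. */
	snprintf(out->ename, sizeof(out->ename), "%s_%s",
		 out->system, out->event);
	return 0;
}
```

Parsing the definition from the example gives sname "esys" and ename
"eopen"; parsing a bare "e syscalls/sys_enter_openat" falls back to
the documented defaults.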
Link: https://lkml.kernel.org/r/20210812145805.2292326-1-tz.stoyanov@gmail.com
Link: https://lkml.kernel.org/r/20210819152825.142428383@goodmis.org
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Co-developed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-08-19 15:26:06 +00:00
|
|
|
enum {
|
|
|
|
EVENT_TRIGGER_FL_PROBE = BIT(0),
|
|
|
|
};
|
|
|
|
|
2013-10-24 13:59:24 +00:00
|
|
|
struct event_trigger_data {
|
|
|
|
unsigned long count;
|
|
|
|
int ref;
|
2021-08-19 15:26:06 +00:00
|
|
|
int flags;
|
tracing: Add basic event trigger framework
Add a 'trigger' file for each trace event, enabling 'trace event
triggers' to be set for trace events.
'trace event triggers' are patterned after the existing 'ftrace
function triggers' implementation except that triggers are written to
per-event 'trigger' files instead of to a single file such as the
'set_ftrace_filter' used for ftrace function triggers.
The implementation is meant to be entirely separate from ftrace
function triggers, in order to keep the respective implementations
relatively simple and to allow them to diverge.
The event trigger functionality is built on top of SOFT_DISABLE
functionality. It adds a TRIGGER_MODE bit to the ftrace_event_file
flags which is checked when any trace event fires. Triggers set for a
particular event need to be checked regardless of whether that event
is actually enabled or not - getting an event to fire even if it's not
enabled is what's already implemented by SOFT_DISABLE mode, so trigger
mode directly reuses that. Event triggers essentially inherit the soft
disable logic in __ftrace_event_enable_disable() while adding a bit of
logic and trigger reference counting via tm_ref on top of that in a
new trace_event_trigger_enable_disable() function. Because the base
__ftrace_event_enable_disable() code now needs to be invoked from
outside trace_events.c, a wrapper is also added for those usages.
The triggers for an event are actually invoked via a new function,
event_triggers_call(), and code is also added to invoke them for
ftrace_raw_event calls as well as syscall events.
The main part of the patch creates a new trace_events_trigger.c file
to contain the trace event triggers implementation.
The standard open, read, and release file operations are implemented
here.
The open() implementation sets up for the various open modes of the
'trigger' file. It creates and attaches the trigger iterator and sets
up the command parser. If opened for reading, it sets up the trigger
seq_ops.
The read() implementation parses the event trigger written to the
'trigger' file, looks up the trigger command, and passes it along to
that event_command's func() implementation for command-specific
processing.
The release() implementation does whatever cleanup is needed to
release the 'trigger' file, like releasing the parser and trigger
iterator, etc.
A couple of functions for event command registration and
unregistration are added, along with a list to add them to and a mutex
to protect them, as well as an (initially empty) registration function
to add the set of commands that will be added by future commits, and
a call to it from the trace event initialization code.
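A list of commands protected by a mutex, as described above, might look roughly like the following user-space sketch. It assumes a singly linked list and a pthread mutex in place of the kernel's list_head and mutex primitives; every name in it is hypothetical, as are the error codes chosen for the failure cases.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a registered trigger command. */
struct sketch_command {
	const char *name;
	struct sketch_command *next;
};

static struct sketch_command *sketch_commands;
static pthread_mutex_t sketch_cmd_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Add a command unless one with the same name is already registered. */
int sketch_register_command(struct sketch_command *cmd)
{
	struct sketch_command *p;
	int ret = 0;

	pthread_mutex_lock(&sketch_cmd_mutex);
	for (p = sketch_commands; p; p = p->next) {
		if (strcmp(p->name, cmd->name) == 0) {
			ret = -EBUSY;
			goto out;
		}
	}
	cmd->next = sketch_commands;
	sketch_commands = cmd;
out:
	pthread_mutex_unlock(&sketch_cmd_mutex);
	return ret;
}

/* Remove a previously registered command; -ENODEV if it is not listed. */
int sketch_unregister_command(struct sketch_command *cmd)
{
	struct sketch_command **pp;
	int ret = -ENODEV;

	pthread_mutex_lock(&sketch_cmd_mutex);
	for (pp = &sketch_commands; *pp; pp = &(*pp)->next) {
		if (*pp == cmd) {
			*pp = cmd->next;
			cmd->next = NULL;
			ret = 0;
			break;
		}
	}
	pthread_mutex_unlock(&sketch_cmd_mutex);
	return ret;
}
```

Both functions take the same lock for the whole walk, so a lookup can never observe a half-linked entry.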
Also added are a couple of trigger-specific data structures needed for
these implementations such as a trigger iterator and a struct for
trigger-specific data.
A couple of structs consisting mostly of functions meant to be implemented
in command-specific ways, event_command and event_trigger_ops, are
used by the generic event trigger command implementations. They're
being put into trace.h alongside the other trace_event data structures
and functions, in the expectation that they'll be needed in several
trace_event-related files such as trace_events_trigger.c and
trace_events.c.
The event_command.func() function is meant to be called by the trigger
parsing code in order to add a trigger instance to the corresponding
event. It essentially coordinates adding a live trigger instance to
the event and arming the triggering event.
Every event_command func() implementation essentially does the
same thing for any command:
- choose ops - use the value of param to choose either a number or
count version of event_trigger_ops specific to the command
- do the register or unregister of those ops
- associate a filter, if specified, with the triggering event
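The three steps listed above can be sketched as a single function. This is a user-space illustration under stated assumptions, not the kernel's func() signature: the struct, the two ops tables, and the `remove` argument are all hypothetical, and "registering" is reduced to setting a flag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical ops tables: one unlimited version, one counted version. */
struct sketch_trigger_ops {
	const char *label;
};

static const struct sketch_trigger_ops sketch_plain_ops = { "plain" };
static const struct sketch_trigger_ops sketch_count_ops = { "count" };

struct sketch_trigger {
	const struct sketch_trigger_ops *ops;
	long count;		/* -1 means unlimited */
	const char *filter;	/* NULL when no filter was given */
	int registered;
};

/* Returns 0 on success, -1 if param is not a non-negative number. */
int sketch_cmd_func(struct sketch_trigger *t, const char *param,
		    const char *filter, int remove)
{
	if (remove) {
		/* Unregister path: disarm and drop the filter. */
		t->registered = 0;
		t->filter = NULL;
		return 0;
	}

	/* Step 1: choose ops based on whether a count was supplied. */
	if (param && *param) {
		char *end;
		long n = strtol(param, &end, 10);

		if (*end != '\0' || n < 0)
			return -1;
		t->ops = &sketch_count_ops;
		t->count = n;
	} else {
		t->ops = &sketch_plain_ops;
		t->count = -1;
	}

	/* Step 2: register (here reduced to a flag). */
	t->registered = 1;

	/* Step 3: associate the filter, if one was specified. */
	t->filter = filter;
	return 0;
}

/* Small self-check: a counted trigger with a filter. */
void sketch_cmd_func_demo(void)
{
	struct sketch_trigger t = { 0 };

	assert(sketch_cmd_func(&t, "5", "common_pid == 999", 0) == 0);
	assert(t.ops == &sketch_count_ops && t.count == 5);
}
```

Keeping the ops choice separate from registration is what lets one generic func() serve several commands.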
The reg() and unreg() ops allow command-specific implementations for
event_trigger_ops registration and unregistration, and the
get_trigger_ops() op allows command-specific event_trigger_ops
selection to be parameterized. When a trigger instance is added, the
reg() op essentially adds that trigger to the triggering event and
arms it, while unreg() does the opposite. The set_filter() function
is used to associate a filter with the trigger - if the command
doesn't specify a set_filter() implementation, the command will ignore
filters.
Each command has an associated trigger_type, which serves double duty,
both as a unique identifier for the command as well as a value that
can be used for setting a trigger mode bit during trigger invocation.
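One way to read the "double duty" of trigger_type is as a one-bit-per-command enum: each value identifies a command, and the values can be OR-ed together into a mode mask. The sketch below assumes that reading; the enum constants and helper are hypothetical names, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical one-bit-per-command identifiers. */
enum sketch_trigger_type {
	SKETCH_TT_NONE        = 0,
	SKETCH_TT_TRACE_ONOFF = 1 << 0,
	SKETCH_TT_SNAPSHOT    = 1 << 1,
	SKETCH_TT_STACKTRACE  = 1 << 2,
};

struct sketch_cmd {
	const char *name;
	enum sketch_trigger_type trigger_type;
};

/* OR together the types of all commands whose triggers fired. */
unsigned int sketch_collect_modes(const struct sketch_cmd *fired, size_t n)
{
	unsigned int modes = SKETCH_TT_NONE;
	size_t i;

	for (i = 0; i < n; i++)
		modes |= (unsigned int)fired[i].trigger_type;
	return modes;
}

/* Small self-check: two commands yield a two-bit mask. */
void sketch_collect_modes_demo(void)
{
	const struct sketch_cmd fired[] = {
		{ "snapshot", SKETCH_TT_SNAPSHOT },
		{ "stacktrace", SKETCH_TT_STACKTRACE },
	};

	assert(sketch_collect_modes(fired, 2) ==
	       (SKETCH_TT_SNAPSHOT | SKETCH_TT_STACKTRACE));
}
```

Because each command owns exactly one bit, testing the mask against a command's trigger_type is enough to tell whether that command took part.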
The signature of func() adds a pointer to the event_command struct,
used to invoke those functions, along with a command_data param that
can be passed to the reg/unreg functions. This allows func()
implementations to use command-specific blobs and supports code
re-use.
The event_trigger_ops.func() function corresponds to the trigger 'probe'
function that gets called when the triggering event is actually
invoked. The other functions are used to list the trigger when
needed, along with a couple of mundane book-keeping functions.
This also moves event_file_data() into trace.h so it can be used
outside of trace_events.c.
Link: http://lkml.kernel.org/r/316d95061accdee070aac8e5750afba0192fa5b9.1382622043.git.tom.zanussi@linux.intel.com
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Idea-by: Steve Rostedt <rostedt@goodmis.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-10-24 13:59:24 +00:00
	struct event_trigger_ops	*ops;
	struct event_command		*cmd_ops;
2013-12-22 02:55:17 +00:00
	struct event_filter __rcu	*filter;
2013-10-24 13:59:24 +00:00
	char				*filter_str;
	void				*private_data;
2015-12-10 18:50:47 +00:00
	bool				paused;
2016-03-03 18:54:58 +00:00
	bool				paused_tmp;
2013-10-24 13:59:24 +00:00
	struct list_head		list;
2016-03-03 18:54:58 +00:00
	char				*name;
	struct list_head		named_list;
	struct event_trigger_data	*named_data;
2013-10-24 13:59:24 +00:00
};

2016-03-03 18:54:55 +00:00
/* Avoid typos */
#define ENABLE_EVENT_STR	"enable_event"
#define DISABLE_EVENT_STR	"disable_event"
#define ENABLE_HIST_STR		"enable_hist"
#define DISABLE_HIST_STR	"disable_hist"

struct enable_trigger_data {
	struct trace_event_file		*file;
	bool				enable;
	bool				hist;
};

extern int event_enable_trigger_print(struct seq_file *m,
				      struct event_trigger_data *data);
2022-02-04 22:12:05 +00:00
extern void event_enable_trigger_free(struct event_trigger_data *data);
2022-01-10 14:04:11 +00:00
extern int event_enable_trigger_parse(struct event_command *cmd_ops,
				      struct trace_event_file *file,
2022-02-04 22:12:06 +00:00
				      char *glob, char *cmd,
				      char *param_and_filter);
2016-03-03 18:54:55 +00:00
extern int event_enable_register_trigger(char *glob,
					 struct event_trigger_data *data,
					 struct trace_event_file *file);
extern void event_enable_unregister_trigger(char *glob,
					    struct event_trigger_data *test,
					    struct trace_event_file *file);
2015-12-10 18:50:44 +00:00
extern void trigger_data_free(struct event_trigger_data *data);
2022-02-04 22:12:05 +00:00
extern int event_trigger_init(struct event_trigger_data *data);
2015-12-10 18:50:44 +00:00
extern int trace_event_trigger_enable_disable(struct trace_event_file *file,
					      int trigger_enable);
extern void update_cond_flag(struct trace_event_file *file);
extern int set_trigger_filter(char *filter_str,
			      struct event_trigger_data *trigger_data,
			      struct trace_event_file *file);
2016-03-03 18:54:58 +00:00
extern struct event_trigger_data *find_named_trigger(const char *name);
extern bool is_named_trigger(struct event_trigger_data *test);
extern int save_named_trigger(const char *name,
			      struct event_trigger_data *data);
extern void del_named_trigger(struct event_trigger_data *data);
extern void pause_named_trigger(struct event_trigger_data *data);
extern void unpause_named_trigger(struct event_trigger_data *data);
extern void set_named_trigger_data(struct event_trigger_data *data,
				   struct event_trigger_data *named_data);
2018-01-16 02:51:56 +00:00
extern struct event_trigger_data *
get_named_trigger_data(struct event_trigger_data *data);
2015-12-10 18:50:44 +00:00
extern int register_event_command(struct event_command *cmd);
2016-03-03 18:54:55 +00:00
extern int unregister_event_command(struct event_command *cmd);
extern int register_trigger_hist_enable_disable_cmds(void);
2022-01-10 14:04:14 +00:00
extern bool event_trigger_check_remove(const char *glob);
extern bool event_trigger_empty_param(const char *param);
extern int event_trigger_separate_filter(char *param_and_filter, char **param,
					 char **filter, bool param_required);
extern struct event_trigger_data *
event_trigger_alloc(struct event_command *cmd_ops,
		    char *cmd,
		    char *param,
		    void *private_data);
extern int event_trigger_parse_num(char *trigger,
				   struct event_trigger_data *trigger_data);
extern int event_trigger_set_filter(struct event_command *cmd_ops,
				    struct trace_event_file *file,
				    char *param,
				    struct event_trigger_data *trigger_data);
extern void event_trigger_reset_filter(struct event_command *cmd_ops,
				       struct event_trigger_data *trigger_data);
extern int event_trigger_register(struct event_command *cmd_ops,
				  struct trace_event_file *file,
				  char *glob,
2022-02-04 22:12:04 +00:00
				  struct event_trigger_data *trigger_data);
extern void event_trigger_unregister(struct event_command *cmd_ops,
				     struct trace_event_file *file,
				     char *glob,
				     struct event_trigger_data *trigger_data);
2015-12-10 18:50:44 +00:00
2023-10-31 16:24:53 +00:00
extern void event_file_get(struct trace_event_file *file);
extern void event_file_put(struct trace_event_file *file);

2013-10-24 13:59:24 +00:00
/**
 * struct event_trigger_ops - callbacks for trace event triggers
 *
 * The methods in this structure provide per-event trigger hooks for
 * various trigger operations.
 *
2022-01-10 14:04:12 +00:00
 * The @init and @free methods are used during trigger setup and
 * teardown, typically called from an event_command's @parse()
 * function implementation.
 *
 * The @print method is used to print the trigger spec.
 *
 * The @trigger method is the function that actually implements the
 * trigger and is called in the context of the triggering event
 * whenever that event occurs.
 *
tracing: Add basic event trigger framework
Add a 'trigger' file for each trace event, enabling 'trace event
triggers' to be set for trace events.
'trace event triggers' are patterned after the existing 'ftrace
function triggers' implementation except that triggers are written to
per-event 'trigger' files instead of to a single file such as the
'set_ftrace_filter' used for ftrace function triggers.
The implementation is meant to be entirely separate from ftrace
function triggers, in order to keep the respective implementations
relatively simple and to allow them to diverge.
The event trigger functionality is built on top of SOFT_DISABLE
functionality. It adds a TRIGGER_MODE bit to the ftrace_event_file
flags which is checked when any trace event fires. Triggers set for a
particular event need to be checked regardless of whether that event
is actually enabled or not - getting an event to fire even if it's not
enabled is what's already implemented by SOFT_DISABLE mode, so trigger
mode directly reuses that. Event trigger essentially inherit the soft
disable logic in __ftrace_event_enable_disable() while adding a bit of
logic and trigger reference counting via tm_ref on top of that in a
new trace_event_trigger_enable_disable() function. Because the base
__ftrace_event_enable_disable() code now needs to be invoked from
outside trace_events.c, a wrapper is also added for those usages.
The triggers for an event are actually invoked via a new function,
event_triggers_call(), and code is also added to invoke them for
ftrace_raw_event calls as well as syscall events.
The main part of the patch creates a new trace_events_trigger.c file
to contain the trace event triggers implementation.
The standard open, read, and release file operations are implemented
here.
The open() implementation sets up for the various open modes of the
'trigger' file. It creates and attaches the trigger iterator and sets
up the command parser. If opened for reading, it sets up the trigger
seq_ops.
The write() implementation parses the event trigger written to the
'trigger' file, looks up the trigger command, and passes it along to
that event_command's func() implementation for command-specific
processing.
The release() implementation does whatever cleanup is needed to
release the 'trigger' file, like releasing the parser and trigger
iterator, etc.
A couple of functions for event command registration and
unregistration are added, along with a list to add them to and a mutex
to protect them, as well as an (initially empty) registration function
to add the set of commands that will be added by future commits, and
a call to it from the trace event initialization code.
Also added are a couple of trigger-specific data structures needed for
these implementations, such as a trigger iterator and a struct for
trigger-specific data.
A couple of structs consisting mostly of functions meant to be implemented
in command-specific ways, event_command and event_trigger_ops, are
used by the generic event trigger command implementations. They're
being put into trace.h alongside the other trace_event data structures
and functions, in the expectation that they'll be needed in several
trace_event-related files such as trace_events_trigger.c and
trace_events.c.
The event_command.func() function is meant to be called by the trigger
parsing code in order to add a trigger instance to the corresponding
event. It essentially coordinates adding a live trigger instance to
the event and arming the triggering event.
Every event_command func() implementation essentially does the
same thing for any command:
- choose ops - use the value of param to choose either a number or
count version of event_trigger_ops specific to the command
- do the register or unregister of those ops
- associate a filter, if specified, with the triggering event
The reg() and unreg() ops allow command-specific implementations for
event_trigger_op registration and unregistration, and the
get_trigger_ops() op allows command-specific event_trigger_ops
selection to be parameterized. When a trigger instance is added, the
reg() op essentially adds that trigger to the triggering event and
arms it, while unreg() does the opposite. The set_filter() function
is used to associate a filter with the trigger - if the command
doesn't specify a set_filter() implementation, the command will ignore
filters.
Each command has an associated trigger_type, which serves double duty,
both as a unique identifier for the command as well as a value that
can be used for setting a trigger mode bit during trigger invocation.
The signature of func() adds a pointer to the event_command struct,
used to invoke those functions, along with a command_data param that
can be passed to the reg/unreg functions. This allows func()
implementations to use command-specific blobs and supports code
re-use.
The event_trigger_ops.func() command corresponds to the trigger 'probe'
function that gets called when the triggering event is actually
invoked. The other functions are used to list the trigger when
needed, along with a couple mundane book-keeping functions.
This also moves event_file_data() into trace.h so it can be used
outside of trace_events.c.
Link: http://lkml.kernel.org/r/316d95061accdee070aac8e5750afba0192fa5b9.1382622043.git.tom.zanussi@linux.intel.com
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Idea-by: Steve Rostedt <rostedt@goodmis.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-10-24 13:59:24 +00:00
 * All the methods below, except for @init() and @free(), must be
 * implemented.
 *
2022-01-10 14:04:12 +00:00
 * @trigger: The trigger 'probe' function called when the triggering
2013-10-24 13:59:24 +00:00
 *	event occurs.  The data passed into this callback is the data
 *	that was supplied to the event_command @reg() function that
2015-12-10 18:50:45 +00:00
 *	registered the trigger (see struct event_command) along with
 *	the trace record, rec.
2013-10-24 13:59:24 +00:00
 *
 * @init: An optional initialization function called for the trigger
 *	when the trigger is registered (via the event_command @reg()
 *	function).  This can be used to perform per-trigger
 *	initialization such as incrementing a per-trigger reference
 *	count, for instance.  This is usually implemented by the
 *	generic utility function @event_trigger_init() (see
 *	trace_events_trigger.c).
 *
 * @free: An optional de-initialization function called for the
 *	trigger when the trigger is unregistered (via the
 *	event_command @unreg() function).  This can be used to perform
 *	per-trigger de-initialization such as decrementing a
 *	per-trigger reference count and freeing corresponding trigger
 *	data, for instance.  This is usually implemented by the
 *	generic utility function @event_trigger_free() (see
 *	trace_events_trigger.c).
 *
 * @print: The callback function invoked to have the trigger print
 *	itself.  This is usually implemented by a wrapper function
 *	that calls the generic utility function @event_trigger_print()
 *	(see trace_events_trigger.c).
 */
struct event_trigger_ops {
2022-01-10 14:04:12 +00:00
	void			(*trigger)(struct event_trigger_data *data,
					   struct trace_buffer *buffer,
					   void *rec,
					   struct ring_buffer_event *rbe);
2022-02-04 22:12:05 +00:00
	int			(*init)(struct event_trigger_data *data);
	void			(*free)(struct event_trigger_data *data);
2013-10-24 13:59:24 +00:00
	int			(*print)(struct seq_file *m,
					 struct event_trigger_data *data);
};
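
The ops table above can be illustrated with a small userspace sketch. This is not kernel code: every `demo_*` name is hypothetical, the stand-in types are far simpler than the real `struct event_trigger_data`, and `FILE *` stands in for `struct seq_file *`. It only shows the pattern the comment describes, assuming that registration runs the optional `init` hook, each event hit calls `trigger`, and unregistration runs the optional `free` hook.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical userspace stand-ins; the kernel's real types are richer. */
struct demo_trigger_data;

struct demo_trigger_ops {
	void (*trigger)(struct demo_trigger_data *data, void *rec);
	int  (*init)(struct demo_trigger_data *data);	/* optional */
	void (*free)(struct demo_trigger_data *data);	/* optional */
	int  (*print)(FILE *m, struct demo_trigger_data *data);
};

struct demo_trigger_data {
	const struct demo_trigger_ops *ops;
	int ref;		/* per-trigger reference count */
	unsigned long hits;	/* number of times the trigger fired */
};

/* The 'probe': runs each time the triggering event occurs. */
static void demo_trigger(struct demo_trigger_data *data, void *rec)
{
	(void)rec;	/* the trace record is unused in this sketch */
	data->hits++;
}

/* Per-trigger setup: take a reference. */
static int demo_init(struct demo_trigger_data *data)
{
	data->ref++;
	return 0;
}

/* Per-trigger teardown: drop the reference taken in demo_init(). */
static void demo_free(struct demo_trigger_data *data)
{
	assert(data->ref > 0);
	data->ref--;
	/* A real implementation would release the data once ref hits 0. */
}

/* Print the trigger spec. */
static int demo_print(FILE *m, struct demo_trigger_data *data)
{
	fprintf(m, "demo:hits=%lu\n", data->hits);
	return 0;
}

static const struct demo_trigger_ops demo_ops = {
	.trigger = demo_trigger,
	.init	 = demo_init,
	.free	 = demo_free,
	.print	 = demo_print,
};

/* Registration path: attach the ops and run the optional init hook. */
static int demo_register(struct demo_trigger_data *data,
			 const struct demo_trigger_ops *ops)
{
	data->ops = ops;
	data->ref = 0;
	data->hits = 0;
	return ops->init ? ops->init(data) : 0;
}

/* Unregistration path: run the optional free hook. */
static void demo_unregister(struct demo_trigger_data *data)
{
	if (data->ops->free)
		data->ops->free(data);
}

/* Event path: invoke the mandatory trigger callback. */
static void demo_event_hit(struct demo_trigger_data *data, void *rec)
{
	data->ops->trigger(data, rec);
}
```

A caller would pair `demo_register()` with `demo_unregister()` and call `demo_event_hit()` once per event. The NULL checks on `init` and `free` mirror the comment's statement that only those two hooks are optional.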

/**
 * struct event_command - callbacks and data members for event commands
 *
 * Event commands are invoked by users by writing the command name
 * into the 'trigger' file associated with a trace event.  The
 * parameters associated with a specific invocation of an event
 * command are used to create an event trigger instance, which is
 * added to the list of trigger instances associated with that trace
 * event.  When the event is hit, the set of triggers associated with
 * that event is invoked.
 *
 * The data members in this structure provide per-event command data
 * for various event commands.
 *
 * All the data members below, except for @post_trigger, must be set
 * for each event command.
 *
 * @name: The unique name that identifies the event command.  This is
 *	the name used when setting triggers via trigger files.
 *
 * @trigger_type: A unique id that identifies the event command
 *	'type'.  This value has two purposes, the first to ensure that
 *	only one trigger of the same type can be set at a given time
 *	for a particular event e.g. it doesn't make sense to have both
 *	a traceon and traceoff trigger attached to a single event at
 *	the same time, so traceon and traceoff have the same type
 *	though they have different names.  The @trigger_type value is
 *	also used as a bit value for deferring the actual trigger
 *	action until after the current event is finished.  Some
 *	commands need to do this if they themselves log to the trace
 *	buffer (see the @post_trigger() member below).  @trigger_type
 *	values are defined by adding new values to the trigger_type
2015-04-29 18:36:05 +00:00
 *	enum in include/linux/trace_events.h.
2013-10-24 13:59:24 +00:00
 *
2016-02-22 20:55:09 +00:00
 * @flags: See the enum event_command_flags below.
2015-12-10 18:50:48 +00:00
 *
2015-12-10 18:50:49 +00:00
 * All the methods below, except for @set_filter() and @unreg_all(),
 * must be implemented.
tracing: Add basic event trigger framework
Add a 'trigger' file for each trace event, enabling 'trace event
triggers' to be set for trace events.
'trace event triggers' are patterned after the existing 'ftrace
function triggers' implementation except that triggers are written to
per-event 'trigger' files instead of to a single file such as the
'set_ftrace_filter' used for ftrace function triggers.
The implementation is meant to be entirely separate from ftrace
function triggers, in order to keep the respective implementations
relatively simple and to allow them to diverge.
The event trigger functionality is built on top of SOFT_DISABLE
functionality. It adds a TRIGGER_MODE bit to the ftrace_event_file
flags which is checked when any trace event fires. Triggers set for a
particular event need to be checked regardless of whether that event
is actually enabled or not - getting an event to fire even if it's not
enabled is what's already implemented by SOFT_DISABLE mode, so trigger
mode directly reuses that. Event trigger essentially inherit the soft
disable logic in __ftrace_event_enable_disable() while adding a bit of
logic and trigger reference counting via tm_ref on top of that in a
new trace_event_trigger_enable_disable() function. Because the base
__ftrace_event_enable_disable() code now needs to be invoked from
outside trace_events.c, a wrapper is also added for those usages.
The triggers for an event are actually invoked via a new function,
event_triggers_call(), and code is also added to invoke them for
ftrace_raw_event calls as well as syscall events.
The main part of the patch creates a new trace_events_trigger.c file
to contain the trace event triggers implementation.
The standard open, read, and release file operations are implemented
here.
The open() implementation sets up for the various open modes of the
'trigger' file. It creates and attaches the trigger iterator and sets
up the command parser. If opened for reading, it sets up the trigger
seq_ops.
The write() implementation parses the event trigger written to the
'trigger' file, looks up the trigger command, and passes it along to
that event_command's func() implementation for command-specific
processing.
The release() implementation does whatever cleanup is needed to
release the 'trigger' file, like releasing the parser and trigger
iterator, etc.
A couple of functions for event command registration and
unregistration are added, along with a list to add them to and a mutex
to protect them, as well as an (initially empty) registration function
to add the set of commands that will be added by future commits, and
a call to it from the trace event initialization code.
Also added are a couple of trigger-specific data structures needed for
these implementations such as a trigger iterator and a struct for
trigger-specific data.
A couple of structs consisting mostly of functions meant to be implemented
in command-specific ways, event_command and event_trigger_ops, are
used by the generic event trigger command implementations. They're
being put into trace.h alongside the other trace_event data structures
and functions, in the expectation that they'll be needed in several
trace_event-related files such as trace_events_trigger.c and
trace_events.c.
The event_command.func() function is meant to be called by the trigger
parsing code in order to add a trigger instance to the corresponding
event. It essentially coordinates adding a live trigger instance to
the event, and arming the triggering event.
Every event_command func() implementation essentially does the
same thing for any command:
- choose ops - use the value of param to choose either a number or
count version of event_trigger_ops specific to the command
- do the register or unregister of those ops
- associate a filter, if specified, with the triggering event
The reg() and unreg() ops allow command-specific implementations for
event_trigger_op registration and unregistration, and the
get_trigger_ops() op allows command-specific event_trigger_ops
selection to be parameterized. When a trigger instance is added, the
reg() op essentially adds that trigger to the triggering event and
arms it, while unreg() does the opposite. The set_filter() function
is used to associate a filter with the trigger - if the command
doesn't specify a set_filter() implementation, the command will ignore
filters.
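The three steps and the optional set_filter() op described above can be sketched in userspace C. This is only an illustration under stated assumptions: every `sketch_` name is hypothetical, the structs are simplified stand-ins rather than the kernel's event_command or event_trigger_ops, and the ordering follows the list in this message, not necessarily the in-tree code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-ins; none of these are kernel definitions. */
struct sketch_trigger {
	const char *ops_name;	/* which ops variant was chosen */
	const char *filter;	/* NULL when no filter is attached */
	int registered;		/* 1 while armed on the event */
};

struct sketch_command {
	const char *name;
	/* Mandatory callbacks. */
	const char *(*get_trigger_ops)(const char *param);
	int (*reg)(struct sketch_trigger *t);
	void (*unreg)(struct sketch_trigger *t);
	/* Optional callback: NULL means user filters are ignored. */
	int (*set_filter)(struct sketch_trigger *t, const char *filter);
};

static const char *sketch_get_ops(const char *param)
{
	/* A non-empty param selects the counted variant. */
	return (param && *param) ? "count" : "plain";
}

static int sketch_reg(struct sketch_trigger *t)
{
	t->registered = 1;
	return 0;
}

static void sketch_unreg(struct sketch_trigger *t)
{
	t->registered = 0;
}

static int sketch_set_filter(struct sketch_trigger *t, const char *filter)
{
	t->filter = filter;
	return 0;
}

/* Mirrors the listed steps: choose ops, register, attach the filter. */
static int sketch_func(const struct sketch_command *cmd,
		       struct sketch_trigger *t,
		       const char *param, const char *filter)
{
	int ret;

	assert(cmd->get_trigger_ops && cmd->reg && cmd->unreg);

	t->ops_name = cmd->get_trigger_ops(param);
	t->filter = NULL;

	ret = cmd->reg(t);
	if (ret)
		return ret;

	/* A command without set_filter() simply ignores the filter. */
	if (filter && cmd->set_filter) {
		ret = cmd->set_filter(t, filter);
		if (ret)
			cmd->unreg(t);
	}
	return ret;
}

static const struct sketch_command sketch_with_filter = {
	.name = "with_filter",
	.get_trigger_ops = sketch_get_ops,
	.reg = sketch_reg,
	.unreg = sketch_unreg,
	.set_filter = sketch_set_filter,
};

static const struct sketch_command sketch_without_filter = {
	.name = "without_filter",
	.get_trigger_ops = sketch_get_ops,
	.reg = sketch_reg,
	.unreg = sketch_unreg,
	.set_filter = NULL,
};
```

Calling sketch_func() on sketch_without_filter with a filter string still registers the trigger but leaves its filter unset, which is the "command will ignore filters" behavior in miniature.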
Each command has an associated trigger_type, which serves double duty,
both as a unique identifier for the command as well as a value that
can be used for setting a trigger mode bit during trigger invocation.
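One way to picture the double duty of trigger_type is a small userspace sketch in which each type is a distinct single bit, so the same value both names a command and can be OR'ed into a mask while triggers are invoked. The `sketch_` and `SKETCH_TT_` names and values below are assumptions made for illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type values: one bit per command, so each is unique. */
enum sketch_trigger_type {
	SKETCH_TT_NONE		= 0,
	SKETCH_TT_TRACE_ONOFF	= 1 << 0,
	SKETCH_TT_SNAPSHOT	= 1 << 1,
	SKETCH_TT_STACKTRACE	= 1 << 2,
};

/* Fold the types of the triggers that fired into one mode mask. */
static unsigned int sketch_collect(const enum sketch_trigger_type *fired,
				   size_t n)
{
	unsigned int mask = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		unsigned int bit = (unsigned int)fired[i];

		/* Each type must be exactly one bit to act as an identifier. */
		assert(bit != 0 && (bit & (bit - 1)) == 0);
		mask |= bit;
	}
	return mask;
}

/* Test whether a given command type is present in the mask. */
static int sketch_pending(unsigned int mask, enum sketch_trigger_type type)
{
	return (mask & (unsigned int)type) != 0;
}
```

Because every type occupies its own bit, comparing two values for equality identifies the command, and the combined mask records which commands fired without any extra bookkeeping.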
The signature of func() adds a pointer to the event_command struct,
used to invoke those functions, along with a command_data param that
can be passed to the reg/unreg functions. This allows func()
implementations to use command-specific blobs and supports code
re-use.
The event_trigger_ops.func() command corresponds to the trigger 'probe'
function that gets called when the triggering event is actually
invoked. The other functions are used to list the trigger when
needed, along with a couple mundane book-keeping functions.
This also moves event_file_data() into trace.h so it can be used
outside of trace_events.c.
Link: http://lkml.kernel.org/r/316d95061accdee070aac8e5750afba0192fa5b9.1382622043.git.tom.zanussi@linux.intel.com
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Idea-by: Steve Rostedt <rostedt@goodmis.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-10-24 13:59:24 +00:00
|
|
|
*
|
2022-01-10 14:04:11 +00:00
|
|
|
* @parse: The callback function responsible for parsing and
|
2013-10-24 13:59:24 +00:00
|
|
|
* registering the trigger written to the 'trigger' file by the
|
|
|
|
* user. It allocates the trigger instance and registers it with
|
|
|
|
* the appropriate trace event. It makes use of the other
|
|
|
|
* event_command callback functions to orchestrate this, and is
|
|
|
|
* usually implemented by the generic utility function
|
|
|
|
* @event_trigger_callback() (see trace_events_trigger.c).
|
|
|
|
*
|
|
|
|
* @reg: Adds the trigger to the list of triggers associated with the
|
|
|
|
* event, and enables the event trigger itself, after
|
|
|
|
* initializing it (via the event_trigger_ops @init() function).
|
|
|
|
* This is also where commands can use the @trigger_type value to
|
|
|
|
* make the decision as to whether or not multiple instances of
|
|
|
|
* the trigger should be allowed. This is usually implemented by
|
|
|
|
* the generic utility function @register_trigger() (see
|
|
|
|
* trace_events_trigger.c).
|
|
|
|
*
|
|
|
|
* @unreg: Removes the trigger from the list of triggers associated
|
|
|
|
* with the event, and disables the event trigger itself, after
|
|
|
|
* freeing it (via the event_trigger_ops @free() function).
|
|
|
|
* This is usually implemented by the generic utility function
|
|
|
|
* @unregister_trigger() (see trace_events_trigger.c).
|
|
|
|
*
|
2015-12-10 18:50:49 +00:00
|
|
|
* @unreg_all: An optional function called to remove all the triggers
|
|
|
|
* from the list of triggers associated with the event. Called
|
|
|
|
* when a trigger file is opened in truncate mode.
|
|
|
|
*
|
2013-10-24 13:59:24 +00:00
|
|
|
* @set_filter: An optional function called to parse and set a filter
|
|
|
|
* for the trigger. If no @set_filter() method is set for the
|
|
|
|
* event command, filters set by the user for the command will be
|
|
|
|
* ignored. This is usually implemented by the generic utility
|
|
|
|
* function @set_trigger_filter() (see trace_events_trigger.c).
|
|
|
|
*
|
|
|
|
* @get_trigger_ops: The callback function invoked to retrieve the
|
|
|
|
* event_trigger_ops implementation associated with the command.
|
2022-01-10 14:04:11 +00:00
|
|
|
* This callback function allows a single event_command to
|
|
|
|
* support multiple trigger implementations via different sets of
|
|
|
|
* event_trigger_ops, depending on the value of the @param
|
|
|
|
* string.
|
2013-10-24 13:59:24 +00:00
|
|
|
*/
|
|
|
|
struct event_command {
|
|
|
|
struct list_head list;
|
|
|
|
char *name;
|
|
|
|
enum event_trigger_type trigger_type;
|
2016-02-22 20:55:09 +00:00
|
|
|
int flags;
|
2022-01-10 14:04:11 +00:00
|
|
|
int (*parse)(struct event_command *cmd_ops,
|
|
|
|
struct trace_event_file *file,
|
|
|
|
char *glob, char *cmd,
|
|
|
|
char *param_and_filter);
|
2013-10-24 13:59:24 +00:00
|
|
|
int (*reg)(char *glob,
|
|
|
|
struct event_trigger_data *data,
|
2015-05-05 14:09:53 +00:00
|
|
|
struct trace_event_file *file);
|
tracing: Add basic event trigger framework
Add a 'trigger' file for each trace event, enabling 'trace event
triggers' to be set for trace events.
'trace event triggers' are patterned after the existing 'ftrace
function triggers' implementation except that triggers are written to
per-event 'trigger' files instead of to a single file such as the
'set_ftrace_filter' used for ftrace function triggers.
The implementation is meant to be entirely separate from ftrace
function triggers, in order to keep the respective implementations
relatively simple and to allow them to diverge.
The event trigger functionality is built on top of SOFT_DISABLE
functionality. It adds a TRIGGER_MODE bit to the ftrace_event_file
flags which is checked when any trace event fires. Triggers set for a
particular event need to be checked regardless of whether that event
is actually enabled or not - getting an event to fire even if it's not
enabled is what's already implemented by SOFT_DISABLE mode, so trigger
mode directly reuses that. Event trigger essentially inherit the soft
disable logic in __ftrace_event_enable_disable() while adding a bit of
logic and trigger reference counting via tm_ref on top of that in a
new trace_event_trigger_enable_disable() function. Because the base
__ftrace_event_enable_disable() code now needs to be invoked from
outside trace_events.c, a wrapper is also added for those usages.
The triggers for an event are actually invoked via a new function,
event_triggers_call(), and code is also added to invoke them for
ftrace_raw_event calls as well as syscall events.
The main part of the patch creates a new trace_events_trigger.c file
to contain the trace event triggers implementation.
The standard open, read, and release file operations are implemented
here.
The open() implementation sets up for the various open modes of the
'trigger' file. It creates and attaches the trigger iterator and sets
up the command parser. If opened for reading set up the trigger
seq_ops.
The read() implementation parses the event trigger written to the
'trigger' file, looks up the trigger command, and passes it along to
that event_command's func() implementation for command-specific
processing.
The release() implementation does whatever cleanup is needed to
release the 'trigger' file, like releasing the parser and trigger
iterator, etc.
A couple of functions for event command registration and
unregistration are added, along with a list to add them to and a mutex
to protect them, as well as an (initially empty) registration function
to add the set of commands that will be added by future commits, and
call to it from the trace event initialization code.
also added are a couple trigger-specific data structures needed for
these implementations such as a trigger iterator and a struct for
trigger-specific data.
A couple structs consisting mostly of function meant to be implemented
in command-specific ways, event_command and event_trigger_ops, are
used by the generic event trigger command implementations. They're
being put into trace.h alongside the other trace_event data structures
and functions, in the expectation that they'll be needed in several
trace_event-related files such as trace_events_trigger.c and
trace_events.c.
The event_command.func() function is meant to be called by the trigger
parsing code in order to add a trigger instance to the corresponding
event. It essentially coordinates adding a live trigger instance to
the event, and arming the triggering event.
Every event_command func() implementation essentially does the
same thing for any command:
- choose ops - use the value of param to choose either a number or
count version of event_trigger_ops specific to the command
- do the register or unregister of those ops
- associate a filter, if specified, with the triggering event
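The three steps above can be sketched in plain userspace C. This is only an illustration under stated assumptions: every `sketch_*` name, the simplified structs, and the choice to set the filter before registering are hypothetical and are not the kernel's actual definitions or ordering.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the kernel's real structures. */
struct sketch_trigger_ops {
	const char *name;
};

struct sketch_trigger {
	const struct sketch_trigger_ops *ops;
	const char *filter;	/* NULL when no 'if <filter>' clause was given */
	int registered;
};

struct sketch_command {
	const struct sketch_trigger_ops *(*get_trigger_ops)(const char *param);
	int (*reg)(struct sketch_trigger *t);
	void (*unreg)(struct sketch_trigger *t);
	int (*set_filter)(struct sketch_trigger *t, const char *filter);
};

static const struct sketch_trigger_ops plain_ops = { "plain" };
static const struct sketch_trigger_ops count_ops = { "count" };

/* Choose the counted variant when a parameter (e.g. "5") is present. */
static const struct sketch_trigger_ops *sketch_get_ops(const char *param)
{
	return param ? &count_ops : &plain_ops;
}

static int sketch_reg(struct sketch_trigger *t)
{
	t->registered = 1;
	return 0;
}

static void sketch_unreg(struct sketch_trigger *t)
{
	t->registered = 0;
}

static int sketch_set_filter(struct sketch_trigger *t, const char *filter)
{
	t->filter = filter;
	return 0;
}

static const struct sketch_command sketch_cmd = {
	.get_trigger_ops = sketch_get_ops,
	.reg = sketch_reg,
	.unreg = sketch_unreg,
	.set_filter = sketch_set_filter,
};

/* Generic func(): choose ops, then unregister, or set filter and register. */
int sketch_command_func(const struct sketch_command *cmd,
			struct sketch_trigger *t,
			const char *param, const char *filter, int remove)
{
	int ret;

	assert(cmd != NULL && t != NULL);

	t->ops = cmd->get_trigger_ops(param);

	if (remove) {
		cmd->unreg(t);
		return 0;
	}

	/* A command without set_filter() silently ignores filters. */
	if (filter && cmd->set_filter) {
		ret = cmd->set_filter(t, filter);
		if (ret)
			return ret;
	}

	return cmd->reg(t);
}
```

In this sketch the filter is attached before reg() so that a rejected filter never leaves an armed trigger behind; that ordering is a design choice of the example only.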
The reg() and unreg() ops allow command-specific implementations for
event_trigger_op registration and unregistration, and the
get_trigger_ops() op allows command-specific event_trigger_ops
selection to be parameterized. When a trigger instance is added, the
reg() op essentially adds that trigger to the triggering event and
arms it, while unreg() does the opposite. The set_filter() function
is used to associate a filter with the trigger - if the command
doesn't specify a set_filter() implementation, the command will ignore
filters.
Each command has an associated trigger_type, which serves double duty,
both as a unique identifier for the command as well as a value that
can be used for setting a trigger mode bit during trigger invocation.
The signature of func() adds a pointer to the event_command struct,
used to invoke those functions, along with a command_data param that
can be passed to the reg/unreg functions. This allows func()
implementations to use command-specific blobs and supports code
re-use.
The event_trigger_ops.func() command corresponds to the trigger 'probe'
function that gets called when the triggering event is actually
invoked. The other functions are used to list the trigger when
needed, along with a couple of mundane book-keeping functions.
This also moves event_file_data() into trace.h so it can be used
outside of trace_events.c.
Link: http://lkml.kernel.org/r/316d95061accdee070aac8e5750afba0192fa5b9.1382622043.git.tom.zanussi@linux.intel.com
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Idea-by: Steve Rostedt <rostedt@goodmis.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-10-24 13:59:24 +00:00
|
|
|
void (*unreg)(char *glob,
|
|
|
|
struct event_trigger_data *data,
|
2015-05-05 14:09:53 +00:00
|
|
|
struct trace_event_file *file);
|
2015-12-10 18:50:49 +00:00
|
|
|
void (*unreg_all)(struct trace_event_file *file);
|
2013-10-24 13:59:24 +00:00
|
|
|
int (*set_filter)(char *filter_str,
|
|
|
|
struct event_trigger_data *data,
|
2015-05-05 14:09:53 +00:00
|
|
|
struct trace_event_file *file);
|
2013-10-24 13:59:24 +00:00
	struct event_trigger_ops *(*get_trigger_ops)(char *cmd, char *param);
};
2016-02-22 20:55:09 +00:00
/**
 * enum event_command_flags - flags for struct event_command
 *
 * @POST_TRIGGER: A flag that says whether or not this command needs
 *	to have its action delayed until after the current event has
 *	been closed.  Some triggers need to avoid being invoked while
 *	an event is currently in the process of being logged, since
 *	the trigger may itself log data into the trace buffer.  Thus
 *	we make sure the current event is committed before invoking
 *	those triggers.  To do that, the trigger invocation is split
 *	in two - the first part checks the filter using the current
 *	trace record; if a command has the @post_trigger flag set, it
 *	sets a bit for itself in the return value, otherwise it
 *	directly invokes the trigger.  Once all commands have been
 *	either invoked or set their return flag, the current record is
 *	either committed or discarded.  At that point, if any commands
 *	have deferred their triggers, those commands are finally
 *	invoked following the close of the current event.  In other
 *	words, if the event_trigger_ops @func() probe implementation
 *	itself logs to the trace buffer, this flag should be set,
 *	otherwise it can be left unspecified.
 *
 * @NEEDS_REC: A flag that says whether or not this command needs
 *	access to the trace record in order to perform its function,
 *	regardless of whether or not it has a filter associated with
 *	it (filters make a trigger require access to the trace record
 *	but are not always present).
 */
enum event_command_flags {
	EVENT_CMD_FL_POST_TRIGGER	= 1,
	EVENT_CMD_FL_NEEDS_REC		= 2,
};

static inline bool event_command_post_trigger(struct event_command *cmd_ops)
{
	return cmd_ops->flags & EVENT_CMD_FL_POST_TRIGGER;
}

static inline bool event_command_needs_rec(struct event_command *cmd_ops)
{
	return cmd_ops->flags & EVENT_CMD_FL_NEEDS_REC;
}
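The two-phase invocation described in the @POST_TRIGGER comment can be modelled in a few lines of userspace C. This is a minimal sketch, not the kernel implementation: all `pt_*` names, the `type_bit` field, and the filter callback shape are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag mirroring the idea of EVENT_CMD_FL_POST_TRIGGER. */
enum pt_flags { PT_FL_POST_TRIGGER = 1 };

struct pt_trigger {
	unsigned int type_bit;			/* unique bit per command */
	unsigned int flags;
	bool (*filter)(const void *rec);	/* NULL means "no filter" */
	void (*probe)(struct pt_trigger *t);
	int hits;
};

/* Example probe: just counts how often the trigger fired. */
void pt_count_probe(struct pt_trigger *t)
{
	t->hits++;
}

/* Example filter: the record is assumed to be a single int pid. */
bool pt_pid_is_999(const void *rec)
{
	return *(const int *)rec == 999;
}

/*
 * Phase 1: runs while the record is still open. Triggers whose filter
 * matches either fire immediately or, if flagged, set their bit in the
 * return value so they can run later.
 */
unsigned int pt_triggers_call(struct pt_trigger *trigs, size_t n,
			      const void *rec)
{
	unsigned int deferred = 0;
	size_t i;

	assert(trigs != NULL || n == 0);

	for (i = 0; i < n; i++) {
		struct pt_trigger *t = &trigs[i];

		if (t->filter && !t->filter(rec))
			continue;
		if (t->flags & PT_FL_POST_TRIGGER) {
			deferred |= t->type_bit;
			continue;
		}
		t->probe(t);
	}
	return deferred;
}

/* Phase 2: runs after the record was committed or discarded. */
void pt_triggers_post_call(struct pt_trigger *trigs, size_t n,
			   unsigned int deferred)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (trigs[i].type_bit & deferred)
			trigs[i].probe(&trigs[i]);
}
```

A caller would invoke pt_triggers_call() before committing the record, commit or discard it, and only then pass the returned mask to pt_triggers_post_call().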
2015-05-05 14:09:53 +00:00
|
|
|
extern int trace_event_enable_disable(struct trace_event_file *file,
|
2013-10-24 13:59:24 +00:00
|
|
|
int enable, int soft_disable);
|
2013-10-24 13:59:26 +00:00
|
|
|
extern int tracing_alloc_snapshot(void);
|
tracing: Add conditional snapshot
Currently, tracing snapshots are context-free - they capture the ring
buffer contents at the time the tracing_snapshot() function was
invoked, and nothing else. Additionally, they're always taken
unconditionally - the calling code can decide whether or not to take a
snapshot, but the data used to make that decision is kept separately
from the snapshot itself.
This change adds the ability to associate with each trace instance
some user data, along with an 'update' function that can use that data
to determine whether or not to actually take a snapshot. The update
function can then update that data along with any other state (as part
of the data presumably), if warranted.
Because snapshots are 'global' per-instance, only one user can enable
and use a conditional snapshot for any given trace instance. To
enable a conditional snapshot (see details in the function and data
structure comments), the user calls tracing_snapshot_cond_enable().
Similarly, to disable a conditional snapshot and free it up for other
users, tracing_snapshot_cond_disable() should be called.
To actually initiate a conditional snapshot, tracing_snapshot_cond()
should be called. tracing_snapshot_cond() will invoke the update()
callback, allowing the user to decide whether or not to actually take
the snapshot and update the user-defined data associated with the
snapshot. If the callback returns 'true', tracing_snapshot_cond()
will then actually take the snapshot and return.
This scheme allows for flexibility in snapshot implementations - for
example, by implementing slightly different update() callbacks,
snapshots can be taken in situations where the user is only interested
in taking a snapshot when a new maximum is hit versus when a value
changes in any way at all. Future patches will demonstrate both
cases.
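The difference between the two update() policies mentioned above can be sketched as a small userspace model. Everything here is hypothetical: the `cs_*` names, the simplified callback signature, and the use of -1 for "already in use" are assumptions for illustration and do not reflect the kernel's cond_update_fn_t or its error codes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback: decide whether 'value' warrants a snapshot. */
typedef bool (*cs_update_fn)(void *cond_data, long value);

struct cs_state {
	long last;		/* user data kept alongside the snapshot */
};

struct cs_instance {
	void *cond_data;
	cs_update_fn update;	/* NULL while no user has enabled it */
	int snapshots;		/* stands in for "a snapshot was taken" */
};

/* Only one user may own the conditional snapshot of an instance. */
int cs_enable(struct cs_instance *inst, void *cond_data, cs_update_fn update)
{
	assert(inst != NULL && update != NULL);

	if (inst->update)
		return -1;
	inst->cond_data = cond_data;
	inst->update = update;
	return 0;
}

void cs_disable(struct cs_instance *inst)
{
	inst->cond_data = NULL;
	inst->update = NULL;
}

/* Take a snapshot only if the update() callback agrees. */
void cs_snapshot_cond(struct cs_instance *inst, long value)
{
	if (inst->update && inst->update(inst->cond_data, value))
		inst->snapshots++;
}

/* Policy 1: snapshot only when a new maximum is hit. */
bool cs_update_on_max(void *cond_data, long value)
{
	struct cs_state *st = cond_data;

	if (value <= st->last)
		return false;
	st->last = value;
	return true;
}

/* Policy 2: snapshot whenever the value changes at all. */
bool cs_update_on_change(void *cond_data, long value)
{
	struct cs_state *st = cond_data;

	if (value == st->last)
		return false;
	st->last = value;
	return true;
}
```

Both callbacks update the user data themselves, which matches the description that update() both decides and records state.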
Link: http://lkml.kernel.org/r/1bea07828d5fd6864a585f83b1eed47ce097eb45.1550100284.git.tom.zanussi@linux.intel.com
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-02-13 23:42:45 +00:00
extern void tracing_snapshot_cond(struct trace_array *tr, void *cond_data);
extern int tracing_snapshot_cond_enable(struct trace_array *tr, void *cond_data, cond_update_fn_t update);

extern int tracing_snapshot_cond_disable(struct trace_array *tr);
extern void *tracing_cond_snapshot_data(struct trace_array *tr);
2013-10-24 13:59:24 +00:00
|
|
|
|
2009-03-12 18:19:25 +00:00
extern const char *__start___trace_bprintk_fmt[];
extern const char *__stop___trace_bprintk_fmt[];
2013-07-12 21:07:27 +00:00
extern const char *__start___tracepoint_str[];
extern const char *__stop___tracepoint_str[];
2015-09-29 22:21:35 +00:00
|
|
|
void trace_printk_control(bool enabled);
|
2012-10-11 14:15:05 +00:00
|
|
|
void trace_printk_start_comm(void);
|
2013-03-14 19:03:53 +00:00
|
|
|
int trace_keep_overwrite(struct tracer *tracer, u32 mask, int set);
|
2012-05-11 17:29:49 +00:00
|
|
|
int set_tracer_flag(struct trace_array *tr, unsigned int mask, int enabled);
|
tracing: Add percpu buffers for trace_printk()
Currently, trace_printk() uses a single buffer to write into
to calculate the size and format needed to save the trace. To
do this safely in an SMP environment, a spin_lock() is taken
to only allow one writer at a time to the buffer. But this could
also affect what is being traced, and add synchronization that
would not be there otherwise.
Ideally, using percpu buffers would be useful, but since trace_printk()
is only used in development, having per cpu buffers for something
never used is a waste of space. Thus, the use of the trace_bprintk()
format section is changed to be used for static fmts as well as dynamic ones.
Then at boot up, we can check if the section that holds the trace_printk
formats is non-empty, and if it does contain something, then we
know a trace_printk() has been added to the kernel. At this time
the trace_printk per cpu buffers are allocated. A check is also
done at module load time in case a module is added that contains a
trace_printk().
Once the buffers are allocated, they are never freed. If you use
a trace_printk() then you should know what you are doing.
A buffer is made for each type of context:
normal
softirq
irq
nmi
The context is checked and the appropriate buffer is used.
This allows for totally lockless usage of trace_printk(),
and they no longer even disable interrupts.
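The per-context buffer idea can be illustrated with a small userspace model. This is only a sketch: the `tp_*` names, the fixed CPU count, and the buffer size are assumptions, and the caller passes the context explicitly here, whereas the kernel would work it out from the current execution context.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* One buffer per execution context, as listed in the commit message. */
enum tp_ctx {
	TP_CTX_NORMAL,
	TP_CTX_SOFTIRQ,
	TP_CTX_IRQ,
	TP_CTX_NMI,
	TP_NR_CTX
};

#define TP_NR_CPUS	4	/* hypothetical CPU count */
#define TP_BUF_SIZE	128	/* hypothetical buffer size */

/* Per-CPU, per-context scratch buffers; no lock is needed to pick one. */
static char tp_bufs[TP_NR_CPUS][TP_NR_CTX][TP_BUF_SIZE];

char *tp_get_buf(unsigned int cpu, enum tp_ctx ctx)
{
	assert(cpu < TP_NR_CPUS);
	assert((unsigned int)ctx < TP_NR_CTX);
	return tp_bufs[cpu][ctx];
}

/* Format into the buffer owned by (cpu, ctx) and return the length. */
int tp_format(unsigned int cpu, enum tp_ctx ctx, const char *fmt, ...)
{
	va_list ap;
	int len;

	va_start(ap, fmt);
	len = vsnprintf(tp_get_buf(cpu, ctx), TP_BUF_SIZE, fmt, ap);
	va_end(ap);
	return len;
}
```

Because an interrupt that preempts normal context on the same CPU writes into a different buffer, the interrupted formatting is not clobbered, which is what makes the lockless scheme workable.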
Requested-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-09-22 18:01:55 +00:00
|
|
|
|
2020-01-29 09:36:44 +00:00
/* Used from boot time tracer */
extern int trace_set_options(struct trace_array *tr, char *option);
extern int tracing_set_tracer(struct trace_array *tr, const char *buf);
extern ssize_t tracing_resize_ring_buffer(struct trace_array *tr,
					  unsigned long size, int cpu_id);
extern int tracing_set_cpumask(struct trace_array *tr,
				cpumask_var_t tracing_cpumask_new);

2017-09-22 19:58:20 +00:00
|
|
|
#define MAX_EVENT_NAME_LEN 64
|
|
|
|
|
|
|
|
extern ssize_t trace_parse_run_command(struct file *file,
|
|
|
|
const char __user *buffer, size_t count, loff_t *ppos,
|
2021-02-01 19:48:11 +00:00
|
|
|
int (*createfn)(const char *));
|
2017-09-22 19:58:20 +00:00
|
|
|
|
2019-03-31 23:48:15 +00:00
|
|
|
extern unsigned int err_pos(char *cmd, const char *str);
|
2019-04-02 02:52:21 +00:00
|
|
|
extern void tracing_log_err(struct trace_array *tr,
|
|
|
|
const char *loc, const char *cmd,
|
2022-01-27 21:44:19 +00:00
|
|
|
const char **errs, u8 type, u16 pos);
|
2019-03-31 23:48:15 +00:00
|
|
|
|
2013-03-09 05:40:58 +00:00
|
|
|
/*
|
|
|
|
* Normal trace_printk() and friends allocate special buffers
|
|
|
|
* to do the manipulation, as well as save the print formats
|
|
|
|
* into sections to display. But the trace infrastructure wants
|
|
|
|
* to use these without the added overhead at the price of being
|
|
|
|
* a bit slower (used mainly for warnings, where we don't care
|
|
|
|
* about performance). The internal_trace_puts() is for such
|
|
|
|
* a purpose.
|
|
|
|
*/
|
|
|
|
#define internal_trace_puts(str) __trace_puts(_THIS_IP_, str, strlen(str))
|
|
|
|
|
2009-09-12 23:26:21 +00:00
|
|
|
#undef FTRACE_ENTRY
|
2019-10-24 20:26:59 +00:00
|
|
|
#define FTRACE_ENTRY(call, struct_name, id, tstruct, print) \
|
2015-05-05 15:45:27 +00:00
|
|
|
extern struct trace_event_call \
|
2014-04-07 22:39:20 +00:00
|
|
|
__aligned(4) event_##call;
|
2009-09-12 23:26:21 +00:00
|
|
|
#undef FTRACE_ENTRY_DUP
|
2019-10-24 20:26:59 +00:00
|
|
|
#define FTRACE_ENTRY_DUP(call, struct_name, id, tstruct, print) \
|
|
|
|
FTRACE_ENTRY(call, struct_name, id, PARAMS(tstruct), PARAMS(print))
|
2016-06-29 10:56:48 +00:00
|
|
|
#undef FTRACE_ENTRY_PACKED
|
2019-10-24 20:26:59 +00:00
|
|
|
#define FTRACE_ENTRY_PACKED(call, struct_name, id, tstruct, print) \
|
|
|
|
FTRACE_ENTRY(call, struct_name, id, PARAMS(tstruct), PARAMS(print))
|
2016-06-29 10:56:48 +00:00
|
|
|
|
2009-09-12 23:26:21 +00:00
|
|
|
#include "trace_entries.h"
|
2009-03-31 05:48:49 +00:00
|
|
|
|
2012-04-13 08:52:59 +00:00
|
|
|
#if defined(CONFIG_PERF_EVENTS) && defined(CONFIG_FUNCTION_TRACER)
|
2015-05-05 15:45:27 +00:00
|
|
|
int perf_ftrace_event_register(struct trace_event_call *call,
|
2012-02-15 14:51:52 +00:00
|
|
|
enum trace_reg type, void *data);
|
|
|
|
#else
|
|
|
|
#define perf_ftrace_event_register NULL
|
2012-04-13 08:52:59 +00:00
|
|
|
#endif
|
2012-02-15 14:51:52 +00:00
|
|
|
|
2014-12-13 01:05:10 +00:00
|
|
|
#ifdef CONFIG_FTRACE_SYSCALLS
|
|
|
|
void init_ftrace_syscalls(void);
|
2015-12-10 18:50:46 +00:00
|
|
|
const char *get_syscall_name(int syscall);
|
2014-12-13 01:05:10 +00:00
|
|
|
#else
|
|
|
|
static inline void init_ftrace_syscalls(void) { }
|
2015-12-10 18:50:46 +00:00
|
|
|
static inline const char *get_syscall_name(int syscall)
|
|
|
|
{
|
|
|
|
return NULL;
|
|
|
|
}
|
2014-12-13 01:05:10 +00:00
|
|
|
#endif
|
|
|
|
|
|
|
|
#ifdef CONFIG_EVENT_TRACING
|
|
|
|
void trace_event_init(void);
|
2017-05-31 21:56:48 +00:00
|
|
|
void trace_event_eval_update(struct trace_eval_map **map, int len);
|
2020-01-29 09:36:44 +00:00
|
|
|
/* Used from boot time tracer */
|
|
|
|
extern int ftrace_set_clr_event(struct trace_array *tr, char *buf, int set);
|
|
|
|
extern int trigger_process_regex(struct trace_event_file *file, char *buff);
|
2014-12-13 01:05:10 +00:00
|
|
|
#else
|
|
|
|
static inline void __init trace_event_init(void) { }
|
2017-05-31 21:56:48 +00:00
|
|
|
static inline void trace_event_eval_update(struct trace_eval_map **map, int len) { }
|
2014-12-13 01:05:10 +00:00
|
|
|
#endif
|
|
|
|
|
2018-05-28 14:56:36 +00:00
|
|
|
#ifdef CONFIG_TRACER_SNAPSHOT
|
|
|
|
void tracing_snapshot_instance(struct trace_array *tr);
|
|
|
|
int tracing_alloc_snapshot_instance(struct trace_array *tr);
|
2024-02-20 20:23:07 +00:00
|
|
|
int tracing_arm_snapshot(struct trace_array *tr);
|
|
|
|
void tracing_disarm_snapshot(struct trace_array *tr);
|
2018-05-28 14:56:36 +00:00
|
|
|
#else
|
|
|
|
static inline void tracing_snapshot_instance(struct trace_array *tr) { }
|
|
|
|
static inline int tracing_alloc_snapshot_instance(struct trace_array *tr)
|
|
|
|
{
|
|
|
|
return 0;
|
|
|
|
}
|
2024-02-20 20:23:07 +00:00
|
|
|
static inline int tracing_arm_snapshot(struct trace_array *tr) { return 0; }
|
|
|
|
static inline void tracing_disarm_snapshot(struct trace_array *tr) { }
|
2018-05-28 14:56:36 +00:00
|
|
|
#endif
|
|
|
|
|
2018-08-09 01:28:05 +00:00
|
|
|
#ifdef CONFIG_PREEMPT_TRACER
|
|
|
|
void tracer_preempt_on(unsigned long a0, unsigned long a1);
|
|
|
|
void tracer_preempt_off(unsigned long a0, unsigned long a1);
|
|
|
|
#else
|
|
|
|
static inline void tracer_preempt_on(unsigned long a0, unsigned long a1) { }
|
|
|
|
static inline void tracer_preempt_off(unsigned long a0, unsigned long a1) { }
|
|
|
|
#endif
|
|
|
|
#ifdef CONFIG_IRQSOFF_TRACER
|
|
|
|
void tracer_hardirqs_on(unsigned long a0, unsigned long a1);
|
|
|
|
void tracer_hardirqs_off(unsigned long a0, unsigned long a1);
|
|
|
|
#else
|
|
|
|
static inline void tracer_hardirqs_on(unsigned long a0, unsigned long a1) { }
|
|
|
|
static inline void tracer_hardirqs_off(unsigned long a0, unsigned long a1) { }
|
|
|
|
#endif
|
|
|
|
|
2019-05-23 12:45:35 +00:00
|
|
|
/*
|
|
|
|
* Reset the state of the trace_iterator so that it can read consumed data.
|
|
|
|
* Normally, the trace_iterator is used for reading the data when it is not
|
|
|
|
* consumed, and must retain state.
|
|
|
|
*/
|
|
|
|
static __always_inline void trace_iterator_reset(struct trace_iterator *iter)
|
|
|
|
{
|
2021-12-10 01:22:45 +00:00
|
|
|
memset_startat(iter, 0, seq);
|
2019-05-23 12:45:35 +00:00
|
|
|
iter->pos = -1;
|
|
|
|
}
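A user-space sketch of what the reset does may help here: everything from the `seq` member to the end of the structure is zeroed, and the fields before it are left alone. The structure below is a hypothetical stand-in, not the real `struct trace_iterator`, and plain `memset()` with `offsetof()` is used in place of the kernel's `memset_startat()` helper.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for struct trace_iterator: members before `seq`
 * are long-lived, members from `seq` onward are per-read state.
 */
struct demo_iter {
	void *tr;	/* kept across resets */
	int cpu_file;	/* kept across resets */
	char seq[16];	/* first member that gets cleared */
	int ent_size;
	long pos;
};

/*
 * Zero everything from `seq` to the end of the struct, then mark the
 * position as "before the first entry", mirroring trace_iterator_reset().
 */
void demo_iter_reset(struct demo_iter *iter)
{
	const size_t start = offsetof(struct demo_iter, seq);

	assert(iter != NULL);
	memset((char *)iter + start, 0, sizeof(*iter) - start);
	iter->pos = -1;
}
```

The design relies on member order: anything that must survive a reset has to be declared before `seq`.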
|
|
|
|
|
2020-10-13 14:17:53 +00:00
|
|
|
/* Check the name is good for event/group/fields */
|
2022-11-22 17:23:45 +00:00
|
|
|
static inline bool __is_good_name(const char *name, bool hash_ok)
|
2020-10-13 14:17:53 +00:00
|
|
|
{
|
2022-11-22 17:23:45 +00:00
|
|
|
if (!isalpha(*name) && *name != '_' && (!hash_ok || *name != '-'))
|
2020-10-13 14:17:53 +00:00
|
|
|
return false;
|
|
|
|
while (*++name != '\0') {
|
2022-11-22 17:23:45 +00:00
|
|
|
if (!isalpha(*name) && !isdigit(*name) && *name != '_' &&
|
|
|
|
(!hash_ok || *name != '-'))
|
2020-10-13 14:17:53 +00:00
|
|
|
return false;
|
|
|
|
}
|
|
|
|
return true;
|
|
|
|
}
|
|
|
|
|
2022-11-22 17:23:45 +00:00
|
|
|
/* Check the name is good for event/group/fields */
|
|
|
|
static inline bool is_good_name(const char *name)
|
|
|
|
{
|
|
|
|
return __is_good_name(name, false);
|
|
|
|
}
|
|
|
|
|
|
|
|
/* Check the name is good for system */
|
|
|
|
static inline bool is_good_system_name(const char *name)
|
|
|
|
{
|
|
|
|
return __is_good_name(name, true);
|
|
|
|
}
|
|
|
|
|
tracing: Add a probe that attaches to trace events
A new dynamic event is introduced: event probe. The event is attached
to an existing tracepoint and uses its fields as arguments. The user
can specify a custom format string for the new event, and select which
tracepoint arguments are printed and how to print them.
An event probe is created by writing a configuration string to the
'dynamic_events' ftrace file:
e[:[SNAME/]ENAME] SYSTEM/EVENT [FETCHARGS] - Set an event probe
-:SNAME/ENAME - Delete an event probe
Where:
SNAME - System name, if omitted 'eprobes' is used.
ENAME - Name of the new event in SNAME, if omitted the SYSTEM_EVENT is used.
SYSTEM - Name of the system, where the tracepoint is defined, mandatory.
EVENT - Name of the tracepoint event in SYSTEM, mandatory.
FETCHARGS - Arguments:
<name>=$<field>[:TYPE] - Fetch the given field of the tracepoint and print
it as given TYPE with given name. Supported
types are:
(u8/u16/u32/u64/s8/s16/s32/s64), basic type
(x8/x16/x32/x64), hexadecimal types
"string", "ustring" and bitfield.
For example, attach an event probe to the openat system call and print the
name of the file that will be opened:
echo "e:esys/eopen syscalls/sys_enter_openat file=\$filename:string" >> dynamic_events
A new dynamic event is created in events/esys/eopen/ directory. It
can be deleted with:
echo "-:esys/eopen" >> dynamic_events
Filters, triggers and histograms can be attached to the new event, and it
can be matched in synthetic events. There is one limitation: an event probe
cannot be attached to a kprobe, a uprobe or another event probe.
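The naming rules in the definition syntax above can be sketched as a small user-space parser. This is an illustration under stated assumptions, not the kernel's parser: `struct demo_eprobe` and `demo_parse_eprobe()` are hypothetical, FETCHARGS are ignored, names are not validated, and the default ENAME is assumed to be SYSTEM and EVENT joined with an underscore.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define DEMO_NAME_LEN 64

/* Resolved names for one event probe definition (hypothetical type). */
struct demo_eprobe {
	char group[DEMO_NAME_LEN];	/* SNAME, default "eprobes" */
	char event[2 * DEMO_NAME_LEN];	/* ENAME, default SYSTEM_EVENT */
	char system[DEMO_NAME_LEN];	/* SYSTEM of the target tracepoint */
	char target[DEMO_NAME_LEN];	/* EVENT of the target tracepoint */
};

/*
 * Parse "e[:[SNAME/]ENAME]" (cmd) and "SYSTEM/EVENT" (tp).
 * Returns 0 on success, -1 on a malformed definition.
 */
int demo_parse_eprobe(const char *cmd, const char *tp, struct demo_eprobe *ep)
{
	const char *name = "";
	const char *slash;

	assert(cmd && tp && ep);

	/* SYSTEM and EVENT are both mandatory. */
	slash = strchr(tp, '/');
	if (!slash || slash == tp || slash[1] == '\0')
		return -1;
	snprintf(ep->system, sizeof(ep->system), "%.*s", (int)(slash - tp), tp);
	snprintf(ep->target, sizeof(ep->target), "%s", slash + 1);

	/* The command is "e", optionally followed by ":" and a name. */
	if (cmd[0] != 'e')
		return -1;
	if (cmd[1] == ':')
		name = cmd + 2;
	else if (cmd[1] != '\0')
		return -1;

	slash = strchr(name, '/');
	if (slash) {
		/* Both SNAME and ENAME were given. */
		if (slash == name || slash[1] == '\0')
			return -1;
		snprintf(ep->group, sizeof(ep->group), "%.*s",
			 (int)(slash - name), name);
		snprintf(ep->event, sizeof(ep->event), "%s", slash + 1);
		return 0;
	}

	/* SNAME omitted: fall back to the default group. */
	snprintf(ep->group, sizeof(ep->group), "eprobes");
	if (*name != '\0')
		snprintf(ep->event, sizeof(ep->event), "%s", name);
	else	/* ENAME omitted too: assumed SYSTEM_EVENT form */
		snprintf(ep->event, sizeof(ep->event), "%s_%s",
			 ep->system, ep->target);
	return 0;
}
```

For the example in this message, parsing "e:esys/eopen" against "syscalls/sys_enter_openat" yields group "esys" and event "eopen"; a bare "e" would fall back to group "eprobes" and, under the assumption above, event "syscalls_sys_enter_openat".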
Link: https://lkml.kernel.org/r/20210812145805.2292326-1-tz.stoyanov@gmail.com
Link: https://lkml.kernel.org/r/20210819152825.142428383@goodmis.org
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Co-developed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-08-19 15:26:06 +00:00
|
|
|
/* Convert certain expected symbols into '_' when generating event names */
|
|
|
|
static inline void sanitize_event_name(char *name)
|
|
|
|
{
|
|
|
|
for (; *name != '\0'; name++)
|
|
|
|
if (*name == ':' || *name == '.')
|
|
|
|
*name = '_';
|
|
|
|
}
|
|
|
|
|
trace: Add a generic function to read/write u64 values from tracefs
The hwlat detector and (in preparation for) the osnoise/timerlat tracers
have a set of u64 parameters that the user can read/write via tracefs.
For instance, we have hwlat_detector's window and width.
To reduce the code duplication, hwlat's window and width share the same
read function. However, they do not share the write functions because
they do different parameter checks. For instance, the width needs to
be smaller than the window, while the window needs to be larger
than the width. The same pattern repeats on osnoise/timerlat, and
a large portion of the code was devoted to the write function.
Despite having different checks, the write functions have the same
structure:
read a user-space buffer
take the lock that protects the value
check for minimum and maximum acceptable values
save the value
release the lock
return success or error
To reduce the code duplication also in the write functions, this patch
provides a generic read and write implementation for u64 values that
need to be within some minimum and/or maximum parameters, while
(potentially) being protected by a lock.
To use this interface, the structure trace_min_max_param needs to be
filled:
struct trace_min_max_param {
struct mutex *lock;
u64 *val;
u64 *min;
u64 *max;
};
The desired value is stored in the variable pointed to by *val. If *min
points to a minimum acceptable value, it will be checked during the
write operation. Likewise, if *max points to a maximum allowable value,
it will be checked during the write operation. Finally, if *lock points
to a mutex, it will be taken at the beginning of the operation and
released at the end.
The definition of a trace_min_max_param needs to be passed as the
(private) *data for tracefs_create_file(), and the trace_min_max_fops
(added by this patch) as the *fops file_operations.
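The common write structure described above (take the lock, check the bounds, save the value, release the lock, return success or error) can be sketched in user space as follows. This is not the kernel's trace_min_max_fops code: `struct demo_min_max_param` and `demo_min_max_write()` are hypothetical, a pthread mutex stands in for the kernel mutex, and parsing the user-space buffer is left out.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdint.h>

/*
 * User-space stand-in for struct trace_min_max_param. Every pointer
 * except val may be NULL, in which case that step is skipped.
 */
struct demo_min_max_param {
	pthread_mutex_t *lock;
	uint64_t *val;
	uint64_t *min;
	uint64_t *max;
};

/*
 * Store new_val if it lies within the optional bounds.
 * Returns 0 on success or -EINVAL if a bound is violated.
 */
int demo_min_max_write(struct demo_min_max_param *p, uint64_t new_val)
{
	int err = 0;

	assert(p && p->val);

	if (p->lock)
		pthread_mutex_lock(p->lock);

	if (p->min && new_val < *p->min)
		err = -EINVAL;
	if (p->max && new_val > *p->max)
		err = -EINVAL;
	if (!err)
		*p->val = new_val;

	if (p->lock)
		pthread_mutex_unlock(p->lock);

	return err;
}
```

Because the bounds are pointers rather than constants, one parameter can be bounded by another live value: in a hwlat-like setup, the width's `max` could point at the window variable, so a later change to the window is picked up by the next write without any extra code.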
Link: https://lkml.kernel.org/r/3e35760a7c8b5c55f16ae5ad5fc54a0e71cbe647.1624372313.git.bristot@redhat.com
Cc: Phil Auld <pauld@redhat.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Kate Carcia <kcarcia@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexandre Chartre <alexandre.chartre@oracle.com>
Cc: Clark Williams <williams@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-06-22 14:42:23 +00:00
|
|
|
/*
|
|
|
|
* This is a generic way to read and write a u64 value from a file in tracefs.
|
|
|
|
*
|
|
|
|
* The value is stored on the variable pointed by *val. The value needs
|
|
|
|
* to be at least *min and at most *max. The write is protected by an
|
|
|
|
* existing *lock.
|
|
|
|
*/
|
|
|
|
struct trace_min_max_param {
|
|
|
|
struct mutex *lock;
|
|
|
|
u64 *val;
|
|
|
|
u64 *min;
|
|
|
|
u64 *max;
|
|
|
|
};
|
|
|
|
|
|
|
|
#define U64_STR_SIZE 24 /* 20 digits max */
|
|
|
|
|
|
|
|
extern const struct file_operations trace_min_max_fops;
|
|
|
|
|
rv: Add Runtime Verification (RV) interface
RV is a lightweight (yet rigorous) method that complements classical
exhaustive verification techniques (such as model checking and
theorem proving) with a more practical approach to complex systems.
RV works by analyzing the trace of the system's actual execution,
comparing it against a formal specification of the system behavior.
RV can give precise information on the runtime behavior of the
monitored system while enabling a reaction to unexpected
events, avoiding, for example, the propagation of a failure on
safety-critical systems.
The development of this interface is rooted in the work behind the
paper:
De Oliveira, Daniel Bristot; Cucinotta, Tommaso; De Oliveira, Romulo
Silva. Efficient formal verification for the Linux kernel. In:
International Conference on Software Engineering and Formal Methods.
Springer, Cham, 2019. p. 315-332.
And:
De Oliveira, Daniel Bristot. Automata-based formal analysis
and verification of the real-time Linux kernel. PhD Thesis, 2020.
The RV interface resembles the tracing/ interface on purpose. The current
path for the RV interface is /sys/kernel/tracing/rv/.
It presents these files:
"available_monitors"
- List the available monitors, one per line.
For example:
# cat available_monitors
wip
wwnr
"enabled_monitors"
- Lists the enabled monitors, one per line;
- Writing to it enables a given monitor;
- Writing a monitor name with a '!' prefix disables it;
- Truncating the file disables all enabled monitors.
For example:
# cat enabled_monitors
# echo wip > enabled_monitors
# echo wwnr >> enabled_monitors
# cat enabled_monitors
wip
wwnr
# echo '!wip' >> enabled_monitors
# cat enabled_monitors
wwnr
# echo > enabled_monitors
# cat enabled_monitors
#
Note that more than one monitor can be enabled concurrently.
"monitoring_on"
- A general on/off switch for monitoring. Note
that it does not disable enabled monitors or detach events;
it only stops the per-entity monitors from monitoring the events
received from the system. It resembles the "tracing_on" switch.
"monitors/"
Each monitor will have its own directory inside "monitors/". There
the monitor-specific files are presented.
The "monitors/" directory resembles the "events" directory on
tracefs.
For example:
# cd monitors/wip/
# ls
desc enable
# cat desc
wakeup in preemptive per-cpu testing monitor.
# cat enable
0
For further information, see the comments in the header of
kernel/trace/rv/rv.c from this patch.
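The "enabled_monitors" semantics described above (a plain name enables, a '!' prefix disables, truncation disables everything) can be sketched in user space as follows. This is a minimal model and not the code in kernel/trace/rv/rv.c: the monitor table and the `demo_*` functions are hypothetical, and each call is assumed to receive one monitor name with no trailing newline.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical monitor table, using the names from the example session. */
struct demo_monitor {
	const char *name;
	bool enabled;
};

struct demo_monitor demo_monitors[] = {
	{ "wip", false },
	{ "wwnr", false },
};

#define DEMO_NR_MONITORS (sizeof(demo_monitors) / sizeof(demo_monitors[0]))

/* Models truncating the file: every enabled monitor is disabled. */
void demo_monitors_truncate(void)
{
	size_t i;

	for (i = 0; i < DEMO_NR_MONITORS; i++)
		demo_monitors[i].enabled = false;
}

/*
 * Models one write: "name" enables the monitor, "!name" disables it.
 * Returns 0 on success or -1 if no monitor has that name.
 */
int demo_monitor_write(const char *buf)
{
	bool enable = true;
	size_t i;

	assert(buf != NULL);

	if (*buf == '!') {
		enable = false;
		buf++;
	}

	for (i = 0; i < DEMO_NR_MONITORS; i++) {
		if (strcmp(demo_monitors[i].name, buf) == 0) {
			demo_monitors[i].enabled = enable;
			return 0;
		}
	}
	return -1;
}
```

Replaying the session above against this model, writing "wip" and then "wwnr" leaves both enabled, writing "!wip" leaves only wwnr enabled, and a truncation disables both.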
Link: https://lkml.kernel.org/r/a4bfe038f50cb047bfb343ad0e12b0e646ab308b.1659052063.git.bristot@kernel.org
Cc: Wim Van Sebroeck <wim@linux-watchdog.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Marco Elver <elver@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Gabriele Paoloni <gpaoloni@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Tao Zhou <tao.zhou@linux.dev>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-trace-devel@vger.kernel.org
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-07-29 09:38:40 +00:00
|
|
|
#ifdef CONFIG_RV
|
|
|
|
extern int rv_init_interface(void);
|
|
|
|
#else
|
|
|
|
static inline int rv_init_interface(void)
|
|
|
|
{
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
#endif
|
|
|
|
|
2008-05-12 19:20:42 +00:00
|
|
|
#endif /* _LINUX_KERNEL_TRACE_H */
|