linux/kernel
Daniel Borkmann 7e3f977edd perf, events: add non-linear data support for raw records
This patch adds support for non-linear data on raw records. It
extends raw records to have one or multiple fragments that will
be written linearly into the ring slot, where each fragment can
optionally have a custom callback handler to walk and extract
complex, possibly non-linear data.

If a callback handler is provided for a fragment, then the new
__output_custom() will be used instead of __output_copy() for
the perf_output_sample() part. perf_prepare_sample() does all
the size calculation only once, so perf_output_sample() doesn't
need to redo the same work anymore, meaning real_size and padding
will be cached in the raw record. The raw record becomes 32 bytes
in size without holes; to not increase it further and to avoid
doing unnecessary recalculations in fast-path, we can reuse
next pointer of the last fragment, idea here is borrowed from
ZERO_OR_NULL_PTR(), which should keep the perf_output_sample()
path for PERF_SAMPLE_RAW minimal.

This facility is needed for BPF's event output helper as a first
user that will, in a follow-up, add an additional perf_raw_frag
to its perf_raw_record in order to be able to more efficiently
dump skb context after a linear head meta data related to it.
skbs can be non-linear and thus need a custom output function to
dump buffers. Currently, the skb data needs to be copied twice;
with the help of __output_custom() this work only needs to be
done once. Future users could be things like XDP/BPF programs
that work on different context though and would thus also have
a different callback function.

The few users of raw records are adapted to initialize their frag
data from the raw record itself, no change in behavior for them.
The code is based upon a PoC diff provided by Peter Zijlstra [1].

  [1] http://thread.gmane.org/gmane.linux.network/421294

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-15 14:23:56 -07:00
..
bpf bpf: make inode code explicitly non-modular 2016-07-11 13:52:43 -07:00
configs
debug mm/init: Add 'rodata=off' boot cmdline parameter to disable read-only kernel mappings 2016-02-22 08:51:37 +01:00
events perf, events: add non-linear data support for raw records 2016-07-15 14:23:56 -07:00
gcov gcov: disable for COMPILE_TEST 2016-05-10 17:12:49 +02:00
irq irqchip updates for 4.7-rc1: 2016-06-03 15:05:51 +02:00
livepatch Merge branches 'for-4.7/core', 'for-4.7/livepatching-doc' and 'for-4.7/livepatching-ppc64' into for-linus 2016-05-17 12:06:35 +02:00
locking locking: avoid passing around 'thread_info' in mutex debugging code 2016-06-23 12:11:17 -07:00
power oom, suspend: fix oom_reaper vs. oom_killer_disable race 2016-06-24 17:23:52 -07:00
printk printk/nmi: flush NMI messages on the system panic 2016-05-20 17:58:30 -07:00
rcu debugobjects: insulate non-fixup logic related to static obj from fixup callbacks 2016-05-19 19:12:14 -07:00
sched sched/core: Allow kthreads to fall back to online && !active cpus 2016-06-24 08:26:53 +02:00
time timer: Export destroy_hrtimer_on_stack() 2016-05-31 11:44:08 -07:00
trace perf, events: add non-linear data support for raw records 2016-07-15 14:23:56 -07:00
.gitignore certs: add .gitignore to stop git nagging about x509_certificate_list 2015-10-21 15:18:35 +01:00
acct.c
async.c async: export current_is_async() 2015-11-19 17:51:48 +01:00
audit_fsnotify.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
audit_tree.c audit: cleanup prune_tree_thread 2016-04-04 09:46:47 -04:00
audit_watch.c don't bother with ->d_inode->i_sb - it's always equal to ->d_sb 2016-04-10 17:11:51 -04:00
audit.c Merge branch 'stable-4.7' of git://git.infradead.org/users/pcmoore/audit 2016-06-29 15:18:47 -07:00
audit.h audit: move audit_get_tty to reduce scope and kabi changes 2016-06-28 15:48:48 -04:00
auditfilter.c audit: Fix typo in comment 2016-02-08 11:25:39 -05:00
auditsc.c Merge branch 'stable-4.7' of git://git.infradead.org/users/pcmoore/audit 2016-06-29 15:18:47 -07:00
backtracetest.c
bounds.c
capability.c
cgroup_freezer.c cgroup: kill cgrp_ss_priv[CGROUP_CANFORK_COUNT] and friends 2015-12-03 10:24:08 -05:00
cgroup_pids.c cgroup_pids: fix a typo. 2015-12-14 14:54:37 -05:00
cgroup.c cgroup: Add cgroup_get_from_fd 2016-07-01 16:30:38 -04:00
compat.c
configs.c
context_tracking.c context_tracking: Switch to new static_branch API 2015-11-24 09:56:43 +01:00
cpu_pm.c
cpu.c sched/hotplug: Make activate() the last hotplug step 2016-05-06 14:58:25 +02:00
cpuset.c cpuset: use static key better and convert to new API 2016-05-19 19:12:14 -07:00
crash_dump.c
cred.c kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
delayacct.c kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
dma.c
elfcore.c
exec_domain.c
exit.c wait: allow sys_waitid() to accept __WNOTHREAD/__WCLONE/__WALL 2016-05-23 17:04:14 -07:00
extable.c
fork.c Fix build break in fork.c when THREAD_SIZE < PAGE_SIZE 2016-06-25 06:01:28 -07:00
freezer.c
futex_compat.c ptrace: use fsuid, fsgid, effective creds for fs access checks 2016-01-20 17:09:18 -08:00
futex.c futex: Calculate the futex key based on a tail page for file-based futexes 2016-06-08 19:23:54 +02:00
groups.c
hung_task.c kernel/hung_task.c: use timeout diff when timeout is updated 2016-03-22 15:36:02 -07:00
irq_work.c treewide: Remove old email address 2015-11-23 09:44:58 +01:00
jump_label.c locking/static_key: Fix concurrent static_key_slow_inc() 2016-06-24 08:23:16 +02:00
kallsyms.c kallsyms: add support for relative offsets in kallsyms address table 2016-03-15 16:55:16 -07:00
kcmp.c ptrace: use fsuid, fsgid, effective creds for fs access checks 2016-01-20 17:09:18 -08:00
Kconfig.freezer
Kconfig.hz
Kconfig.locks
Kconfig.preempt
kcov.c kernel/kcov: unproxify debugfs file's fops 2016-06-15 04:56:35 -07:00
kexec_core.c s390/kexec: consolidate crash_map/unmap_reserved_pages() and arch_kexec_protect(unprotect)_crashkres() 2016-05-23 17:04:14 -07:00
kexec_file.c kexec: introduce a protection mechanism for the crashkernel reserved memory 2016-05-23 17:04:14 -07:00
kexec_internal.h kexec: move some memembers and definitions within the scope of CONFIG_KEXEC_FILE 2016-01-20 17:09:18 -08:00
kexec.c s390/kexec: consolidate crash_map/unmap_reserved_pages() and arch_kexec_protect(unprotect)_crashkres() 2016-05-23 17:04:14 -07:00
kmod.c kmod: don't run async usermode helper as a child of kworker thread 2015-10-23 17:55:10 +09:00
kprobes.c
ksysfs.c rcu: Remove TINY_RCU bloat from pointless boot parameters 2015-12-07 16:59:37 -08:00
kthread.c
latencytop.c sched/debug: Make schedstats a runtime tunable that is disabled by default 2016-02-09 11:54:23 +01:00
Makefile ELF/MIPS build fix 2016-05-23 17:04:14 -07:00
membarrier.c
memremap.c memremap: add arch specific hook for MEMREMAP_WB mappings 2016-04-04 10:26:41 +02:00
module_signing.c KEYS: Move the point of trust determination to __key_link() 2016-04-11 22:43:43 +01:00
module-internal.h
module.c module: preserve Elf information for livepatch modules 2016-04-01 15:00:10 +02:00
notifier.c
nsproxy.c cgroup: introduce cgroup namespaces 2016-02-16 13:04:58 -05:00
padata.c kernel/padata.c: hide unused functions 2016-05-19 19:12:14 -07:00
panic.c printk/nmi: flush NMI messages on the system panic 2016-05-20 17:58:30 -07:00
params.c Nothing exciting, minor tweaks and cleanups. 2015-11-09 15:53:39 -08:00
pid_namespace.c
pid.c remove lots of IS_ERR_VALUE abuses 2016-05-27 15:26:11 -07:00
profile.c profile: hide unused functions when !CONFIG_PROC_FS 2016-03-22 15:36:02 -07:00
ptrace.c ptrace: change __ptrace_unlink() to clear ->ptrace under ->siglock 2016-03-22 15:36:02 -07:00
range.c
reboot.c
relay.c kernel/relay.c: fix potential memory leak 2016-06-09 14:23:11 -07:00
resource.c /proc/iomem: only expose physical resource addresses to privileged users 2016-04-14 12:56:09 -07:00
seccomp.c Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus 2016-05-19 10:02:26 -07:00
signal.c kernel/signal.c: convert printk(KERN_<LEVEL> ...) to pr_<level>(...) 2016-05-23 17:04:14 -07:00
smp.c Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-03-15 13:50:29 -07:00
smpboot.c cpu/hotplug: Unpark smpboot threads from the state machine 2016-03-01 20:36:56 +01:00
smpboot.h cpu/hotplug: Create hotplug threads 2016-03-01 20:36:56 +01:00
softirq.c arch, ftrace: for KASAN put hard/soft IRQ entries into separate sections 2016-03-25 16:37:42 -07:00
stacktrace.c
stop_machine.c kernel/stop_machine.c: remove CONFIG_SMP dependencies 2016-01-16 11:17:24 -08:00
sys_ni.c vfs: add copy_file_range syscall and vfs helper 2015-12-01 14:00:53 -05:00
sys.c prctl: make PR_SET_THP_DISABLE wait for mmap_sem killable 2016-05-23 17:04:14 -07:00
sysctl_binary.c kernel/sysctl_binary.c: use generic UUID library 2016-05-20 17:58:30 -07:00
sysctl.c Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-05-25 17:05:40 -07:00
task_work.c
taskstats.c taskstats: use the libnl API to align nlattr on 64-bit 2016-04-23 20:13:25 -04:00
test_kprobes.c
torture.c rcutorture: Dump trace buffer upon shutdown 2016-04-21 13:47:04 -07:00
tracepoint.c kernel/...: convert pr_warning to pr_warn 2016-03-22 15:36:02 -07:00
tsacct.c time, acct: Drop irq save & restore from __acct_update_integrals() 2016-02-29 09:53:09 +01:00
uid16.c
up.c
user_namespace.c kernel/*: switch to memdup_user_nul() 2016-01-04 10:27:55 -05:00
user-return-notifier.c
user.c
utsname_sysctl.c
utsname.c
watchdog.c watchdog: don't run proc_watchdog_update if new value is same as old 2016-03-17 15:09:34 -07:00
workqueue_internal.h sched/core: Get rid of 'cpu' argument in wq_worker_sleeping() 2016-03-02 10:28:47 -05:00
workqueue.c debugobjects: insulate non-fixup logic related to static obj from fixup callbacks 2016-05-19 19:12:14 -07:00