linux/kernel/bpf
Song Liu 615755a77b bpf: extend stackmap to save binary_build_id+offset instead of address
Currently, bpf stackmap store address for each entry in the call trace.
To map these addresses to user space files, it is necessary to maintain
the mapping from these virtual address to symbols in the binary. Usually,
the user space profiler (such as perf) has to scan /proc/pid/maps at the
beginning of profiling, and monitor mmap2() calls afterwards. Given the
cost of maintaining the address map, this solution is not practical for
system wide profiling that is always on.

This patch tries to solve this problem with a variation of stackmap. This
variation is enabled by flag BPF_F_STACK_BUILD_ID. Instead of storing
addresses, the variation stores ELF file build_id + offset.

Build ID is a 20-byte unique identifier for ELF files. The following
command shows the Build ID of /bin/bash:

  [user@]$ readelf -n /bin/bash
  ...
    Build ID: XXXXXXXXXX
  ...

With BPF_F_STACK_BUILD_ID, bpf_get_stackid() tries to parse Build ID
for each entry in the call trace, and translate it into the following
struct:

  struct bpf_stack_build_id_offset {
          __s32           status;
          unsigned char   build_id[BPF_BUILD_ID_SIZE];
          union {
                  __u64   offset;
                  __u64   ip;
          };
  };

The search of build_id is limited to the first page of the file, and this
page should be in page cache. Otherwise, we fallback to store ip for this
entry (ip field in struct bpf_stack_build_id_offset). This requires the
build_id to be stored in the first page. A quick survey of binary and
dynamic library files in a few different systems shows that almost all
binary and dynamic library files have build_id in the first page.

Build_id is only meaningful for user stack. If a kernel stack is added to
a stackmap with BPF_F_STACK_BUILD_ID, it will automatically fallback to
only store ip (status == BPF_STACK_BUILD_ID_IP). Similarly, if build_id
lookup failed for some reason, it will also fallback to store ip.

User space can access struct bpf_stack_build_id_offset with bpf
syscall BPF_MAP_LOOKUP_ELEM. It is necessary for user space to
maintain mapping from build id to binary files. This mostly static
mapping is much easier to maintain than per process address maps.

Note: Stackmap with build_id only works in non-nmi context at this time.
This is because we need to take mm->mmap_sem for find_vma(). If this
changes, we would like to allow build_id lookup in nmi context.

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-03-15 01:09:28 +01:00
..
arraymap.c bpf: add schedule points in percpu arrays management 2018-02-22 21:27:06 +01:00
bpf_lru_list.c bpf: lru: Lower the PERCPU_NR_SCANS from 16 to 4 2017-04-17 13:55:52 -04:00
bpf_lru_list.h bpf: Only set node->ref = 1 if it has not been set 2017-09-01 09:57:39 -07:00
cgroup.c bpf/cgroup: fix a verification error for a CGROUP_DEVICE type prog 2017-12-19 01:43:29 +01:00
core.c bpf: fix bpf_prog_array_copy_to_user warning from perf event prog query 2018-02-14 08:59:37 -08:00
cpumap.c bpf: cpumap: use GFP_KERNEL instead of GFP_ATOMIC in __cpu_map_entry_alloc() 2018-02-14 15:34:27 +01:00
devmap.c bpf: add helper for copying attrs to struct bpf_map 2018-01-14 23:36:29 +01:00
disasm.c bpf: allow for correlation of maps and helpers in dump 2017-12-20 18:09:40 -08:00
disasm.h bpf: annotate bpf_insn_print_t with __printf 2018-01-17 01:15:05 +01:00
hashtab.c bpf: add helper for copying attrs to struct bpf_map 2018-01-14 23:36:29 +01:00
helpers.c bpf: rename ARG_PTR_TO_STACK 2017-01-09 16:56:27 -05:00
inode.c bpf: comment why dots in filenames under BPF virtual FS are not allowed 2018-03-09 10:30:30 +01:00
lpm_trie.c bpf: fix rcu lockdep warning for lpm_trie map_free callback 2018-02-22 21:29:12 +01:00
Makefile bpf: only build sockmap with CONFIG_INET 2018-01-04 19:01:14 +01:00
map_in_map.c bpf: Add syscall lookup support for fd array and htab 2017-06-29 13:13:25 -04:00
map_in_map.h bpf: Add syscall lookup support for fd array and htab 2017-06-29 13:13:25 -04:00
offload.c bpf: offload: report device information about offloaded maps 2018-01-18 22:54:25 +01:00
percpu_freelist.c bpf: fix lockdep splat 2017-11-15 19:46:32 +09:00
percpu_freelist.h bpf: introduce percpu_freelist 2016-03-08 15:28:31 -05:00
sockmap.c bpf: fix sock_map_alloc() error path 2018-02-13 19:19:15 -08:00
stackmap.c bpf: extend stackmap to save binary_build_id+offset instead of address 2018-03-15 01:09:28 +01:00
syscall.c bpf: Use the IS_FD_ARRAY() macro in map_update_elem() 2018-01-25 18:05:24 -08:00
tnum.c bpf/verifier: track signed and unsigned min/max values 2017-08-08 17:51:34 -07:00
verifier.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-03-06 01:20:46 -05:00