39aead8373
88 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
|
f5a489dc81 |
perf metricgroup: Pass pmu_event structure as a parameter for arch_get_runtimeparam()
This patch adds passing of pmu_event as a parameter in function 'arch_get_runtimeparam' which can be used to get details like if the event is percore/perchip. Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Acked-by: Ian Rogers <irogers@google.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: John Garry <john.garry@huawei.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Clarke <pc@us.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Link: http://lore.kernel.org/lkml/20200907064133.75090-5-kjain@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
00e4db5125 |
perf tools changes for v5.9
New features: - Introduce controlling how 'perf stat' and 'perf record' works via a control file descriptor, allowing starting with events configured but disabled until commands are received via the control file descriptor. This allows, for instance for tools such as Intel VTune to make further use of perf as its Linux platform driver. - Improve 'perf record' to to register in a perf.data file header the clockid used to help later correlate things like syslog files and perf events recorded. - Add basic syscall and find_next_bit benchmarks to 'perf bench'. - Allow using computed metrics in calculating other metrics. For instance: { .metric_expr = "l2_rqsts.demand_data_rd_hit + l2_rqsts.pf_hit + l2_rqsts.rfo_hit", .metric_name = "DCache_L2_All_Hits", }, { .metric_expr = "max(l2_rqsts.all_demand_data_rd - l2_rqsts.demand_data_rd_hit, 0) + l2_rqsts.pf_miss + l2_rqsts.rfo_miss", .metric_name = "DCache_L2_All_Miss", }, { .metric_expr = "dcache_l2_all_hits + dcache_l2_all_miss", .metric_name = "DCache_L2_All", } - Add suport for 'd_ratio', '>' and '<' operators to the expression resolver used in calculating metrics in 'perf stat'. Support for new kernel features: - Support TEXT_POKE and KSYMBOL_TYPE_OOL perf metadata events to cope with things like ftrace, trampolines, i.e. changes in the kernel text that gets in the way of properly decoding Intel PT hardware traces, for instance. Intel PT: - Add various knobs to reduce the volume of Intel PT traces by reducing the level of details such as decoding just some types of packets (e.g., FUP/TIP, PSB+), also filtering by time range. - Add new itrace options (log flags to the 'd' option, error flags to the 'e' one, etc), controlling how Intel PT is transformed into perf events, document some missing options (e.g., how to synthesize callchains). BPF: - Properly report BPF errors when parsing events. - Do not setup side-band events if LIBBPF is not linked, fixing a segfault. Libraries: - Improvements on the libtraceevent plugin mechanism. - Improve libtracevent support for KVM trace events SVM exit reasons. - Add a libtracevent plugins for decoding syscalls/sys_enter_futex and for tlb_flush. - Ensure sample_period is set libpfm4 events in 'perf test'. - Fixup libperf namespacing, to make sure what is in libperf has the perf_ namespace while what is now only in tools/perf/ doesn't use that prefix. Arch specific: - Improve the testing of vendor events and metrics in 'perf test'. - Allow no ARM CoreSight hardware tracer sink to be specified on command line. - Fix arm_spe_x recording when mixed with other perf events. - Add s390 idle functions 'psw_idle' and 'psw_idle_exit' to list of idle symbols. - List kernel supplied event aliases for arm64 in 'perf list'. - Add support for extended register capability in PowerPC 9 and 10. - Added nest IMC power9 metric events. Miscellaneous: - No need to setup sample_regs_intr/sample_regs_user for dummy events. - Update various copies of kernel headers, some causing perf to handle new syscalls, MSRs, etc. - Improve usage of flex and yacc, enabling warnings and addressing the fallout. - Add missing '--output' option to 'perf kmem' so that it can pass it along to 'perf record'. - 'perf probe' fixes related to adding multiple probes on the same address for the same event. - Make 'perf probe' warn if the target function is a GNU indirect function. - Remove //anon mmap events from 'perf inject jit' to fix supporting both using ELF files for generated functions and the perf-PID.map approaches. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Test results: The first ones are container based builds of tools/perf with and without libelf support. Where clang is available, it is also used to build perf with/without libelf, and building with LIBCLANGLLVM=1 (built-in clang) with gcc and clang when clang and its devel libraries are installed. The objtool and samples/bpf/ builds are disabled now that I'm switching from using the sources in a local volume to fetching them from a http server to build it inside the container, to make it easier to build in a container cluster. Those will come back later. Several are cross builds, the ones with -x-ARCH and the android one, and those may not have all the features built, due to lack of multi-arch devel packages, available and being used so far on just a few, like debian:experimental-x-{arm64,mipsel}. The 'perf test' one will perform a variety of tests exercising tools/perf/util/, tools/lib/{bpf,traceevent,etc}, as well as run perf commands with a variety of command line event specifications to then intercept the sys_perf_event syscall to check that the perf_event_attr fields are set up as expected, among a variety of other unit tests. Then there is the 'make -C tools/perf build-test' ones, that build tools/perf/ with a variety of feature sets, exercising the build with an incomplete set of features as well as with a complete one. It is planned to have it run on each of the containers mentioned above, using some container orchestration infrastructure. Get in contact if interested in helping having this in place. fedora:rawhide with python3 and gcc 10.1.1-2 is failing (10.1.1-1 on fedora:32 works), fixes will be provided soon. clearlinux:latest is failing on libbpf, there is a fix already in the bpf tree. The ones failing when linking with libllvm, not the default build, were restricted to clang-9/llvm-9, working with anything before or after, e.g., using clang-8 on ubuntu:19.10 and clang-11 on debian:experimental fixed the build in those environments. # export PERF_TARBALL=http://192.168.124.1/perf/perf-5.8.0.tar.xz # dm 1 alpine:3.4 : Ok gcc (Alpine 5.3.0) 5.3.0, clang version 3.8.0 (tags/RELEASE_380/final) 2 alpine:3.5 : Ok gcc (Alpine 6.2.1) 6.2.1 20160822, clang version 3.8.1 (tags/RELEASE_381/final) 3 alpine:3.6 : Ok gcc (Alpine 6.3.0) 6.3.0, clang version 4.0.0 (tags/RELEASE_400/final) 4 alpine:3.7 : Ok gcc (Alpine 6.4.0) 6.4.0, Alpine clang version 5.0.0 (tags/RELEASE_500/final) (based on LLVM 5.0.0) 5 alpine:3.8 : Ok gcc (Alpine 6.4.0) 6.4.0, Alpine clang version 5.0.1 (tags/RELEASE_501/final) (based on LLVM 5.0.1) 6 alpine:3.9 : Ok gcc (Alpine 8.3.0) 8.3.0, Alpine clang version 5.0.1 (tags/RELEASE_502/final) (based on LLVM 5.0.1) 7 alpine:3.10 : Ok gcc (Alpine 8.3.0) 8.3.0, Alpine clang version 8.0.0 (tags/RELEASE_800/final) (based on LLVM 8.0.0) 8 alpine:3.11 : Ok gcc (Alpine 9.2.0) 9.2.0, Alpine clang version 9.0.0 (https://git.alpinelinux.org/aports f7f0d2c2b8bcd6a5843401a9a702029556492689) (based on LLVM 9.0.0) 9 alpine:3.12 : Ok gcc (Alpine 9.3.0) 9.3.0, Alpine clang version 10.0.0 (https://gitlab.alpinelinux.org/alpine/aports.git 7445adce501f8473efdb93b17b5eaf2f1445ed4c) 10 alpine:edge : Ok gcc (Alpine 9.3.0) 9.3.0, Alpine clang version 10.0.0 (git://git.alpinelinux.org/aports 7445adce501f8473efdb93b17b5eaf2f1445ed4c) 11 alt:p8 : Ok x86_64-alt-linux-gcc (GCC) 5.3.1 20151207 (ALT p8 5.3.1-alt3.M80P.1), clang version 3.8.0 (tags/RELEASE_380/final) 12 alt:p9 : Ok x86_64-alt-linux-gcc (GCC) 8.4.1 20200305 (ALT p9 8.4.1-alt0.p9.1), clang version 7.0.1 13 alt:sisyphus : Ok x86_64-alt-linux-gcc (GCC) 9.2.1 20200123 (ALT Sisyphus 9.2.1-alt3), clang version 10.0.0 14 amazonlinux:1 : Ok gcc (GCC) 7.2.1 20170915 (Red Hat 7.2.1-2), clang version 3.6.2 (tags/RELEASE_362/final) 15 amazonlinux:2 : Ok gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-6), clang version 7.0.1 (Amazon Linux 2 7.0.1-1.amzn2.0.2) 16 android-ndk:r12b-arm : Ok arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease) 17 android-ndk:r15c-arm : Ok arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease) 18 centos:6 : Ok gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23) 19 centos:7 : Ok gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39) 20 centos:8 : Ok gcc (GCC) 8.3.1 20191121 (Red Hat 8.3.1-5), clang version 9.0.1 (Red Hat 9.0.1-2.module_el8.2.0+309+0c7b6b03) 21 clearlinux:latest : FAIL gcc (Clear Linux OS for Intel Architecture) 10.2.1 20200723 releases/gcc-10.2.0-3-g677b80db41, clang version 10.0.1 gcc (Clear Linux OS for Intel Architecture) 10.2.1 20200723 releases/gcc-10.2.0-3-g677b80db41 btf.c: In function 'btf__parse_raw': btf.c:625:28: error: 'btf' may be used uninitialized in this function [-Werror=maybe-uninitialized] 625 | return err ? ERR_PTR(err) : btf; | ~~~~~~~~~~~~~~~~~~~^~~~~ 22 debian:8 : Ok gcc (Debian 4.9.2-10+deb8u2) 4.9.2, Debian clang version 3.5.0-10 (tags/RELEASE_350/final) (based on LLVM 3.5.0) 23 debian:9 : Ok gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, clang version 3.8.1-24 (tags/RELEASE_381/final) 24 debian:10 : Ok gcc (Debian 8.3.0-6) 8.3.0, clang version 7.0.1-8 (tags/RELEASE_701/final) 25 debian:experimental : Ok gcc (Debian 10.2.0-3) 10.2.0, Debian clang version 11.0.0-+rc1-1 26 debian:experimental-x-arm64 : Ok aarch64-linux-gnu-gcc (Debian 9.3.0-8) 9.3.0 27 debian:experimental-x-mips : Ok mips-linux-gnu-gcc (Debian 8.3.0-19) 8.3.0 28 debian:experimental-x-mips64 : Ok mips64-linux-gnuabi64-gcc (Debian 9.3.0-8) 9.3.0 29 debian:experimental-x-mipsel : Ok mipsel-linux-gnu-gcc (Debian 9.2.1-8) 9.2.1 20190909 30 fedora:20 : Ok gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-7) 31 fedora:22 : Ok gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6), clang version 3.5.0 (tags/RELEASE_350/final) 32 fedora:23 : Ok gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6), clang version 3.7.0 (tags/RELEASE_370/final) 33 fedora:24 : Ok gcc (GCC) 6.3.1 20161221 (Red Hat 6.3.1-1), clang version 3.8.1 (tags/RELEASE_381/final) 34 fedora:24-x-ARC-uClibc : Ok arc-linux-gcc (ARCompact ISA Linux uClibc toolchain 2017.09-rc2) 7.1.1 20170710 35 fedora:25 : Ok gcc (GCC) 6.4.1 20170727 (Red Hat 6.4.1-1), clang version 3.9.1 (tags/RELEASE_391/final) 36 fedora:26 : Ok gcc (GCC) 7.3.1 20180130 (Red Hat 7.3.1-2), clang version 4.0.1 (tags/RELEASE_401/final) 37 fedora:27 : Ok gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-6), clang version 5.0.2 (tags/RELEASE_502/final) 38 fedora:28 : Ok gcc (GCC) 8.3.1 20190223 (Red Hat 8.3.1-2), clang version 6.0.1 (tags/RELEASE_601/final) 39 fedora:29 : Ok gcc (GCC) 8.3.1 20190223 (Red Hat 8.3.1-2), clang version 7.0.1 (Fedora 7.0.1-6.fc29) 40 fedora:30 : Ok gcc (GCC) 9.3.1 20200408 (Red Hat 9.3.1-2), clang version 8.0.0 (Fedora 8.0.0-3.fc30) 41 fedora:30-x-ARC-glibc : Ok arc-linux-gcc (ARC HS GNU/Linux glibc toolchain 2019.03-rc1) 8.3.1 20190225 42 fedora:30-x-ARC-uClibc : Ok arc-linux-gcc (ARCv2 ISA Linux uClibc toolchain 2019.03-rc1) 8.3.1 20190225 43 fedora:31 : Ok gcc (GCC) 9.3.1 20200408 (Red Hat 9.3.1-2), clang version 9.0.1 (Fedora 9.0.1-2.fc31) 44 fedora:32 : Ok gcc (GCC) 10.1.1 20200507 (Red Hat 10.1.1-1), clang version 10.0.0 (Fedora 10.0.0-2.fc32) 45 fedora:rawhide : FAIL gcc (GCC) 10.2.1 20200723 (Red Hat 10.2.1-1), clang version 10.0.0 (Fedora 10.0.0-10.fc33) gcc (GCC) 10.2.1 20200723 (Red Hat 10.2.1-1) util/scripting-engines/trace-event-python.c: In function 'python_start_script': util/scripting-engines/trace-event-python.c:1595:2: error: 'visibility' attribute ignored [-Werror=attributes] 1595 | PyMODINIT_FUNC (*initfunc)(void); | ^~~~~~~~~~~~~~ 46 gentoo-stage3-amd64:latest : Ok gcc (Gentoo 9.3.0-r1 p3) 9.3.0 47 mageia:5 : Ok gcc (GCC) 4.9.2, clang version 3.5.2 (tags/RELEASE_352/final) 48 mageia:6 : Ok gcc (Mageia 5.5.0-1.mga6) 5.5.0, clang version 3.9.1 (tags/RELEASE_391/final) 49 mageia:7 : Ok gcc (Mageia 8.3.1-0.20190524.1.mga7) 8.3.1 20190524, clang version 8.0.0 (Mageia 8.0.0-1.mga7) 50 manjaro:latest : Ok gcc (GCC) 9.2.0, clang version 9.0.0 (tags/RELEASE_900/final) 51 openmandriva:cooker : Ok gcc (GCC) 10.0.0 20200502 (OpenMandriva), clang version 10.0.1 52 opensuse:15.0 : Ok gcc (SUSE Linux) 7.4.1 20190424 [gcc-7-branch revision 270538], clang version 5.0.1 (tags/RELEASE_501/final 312548) 53 opensuse:15.1 : Ok gcc (SUSE Linux) 7.5.0, clang version 7.0.1 (tags/RELEASE_701/final 349238) 54 opensuse:15.2 : Ok gcc (SUSE Linux) 7.5.0, clang version 9.0.1 55 opensuse:42.3 : Ok gcc (SUSE Linux) 4.8.5, clang version 3.8.0 (tags/RELEASE_380/final 262553) 56 opensuse:tumbleweed : Ok gcc (SUSE Linux) 10.2.1 20200728 [revision c0438ced53bcf57e4ebb1c38c226e41571aca892], clang version 10.0.1 57 oraclelinux:6 : Ok gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23.0.1) 58 oraclelinux:7 : Ok gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39.0.5) 59 oraclelinux:8 : Ok gcc (GCC) 8.3.1 20191121 (Red Hat 8.3.1-5.0.3), clang version 9.0.1 (Red Hat 9.0.1-2.0.1.module+el8.2.0+5599+9ed9ef6d) 60 ubuntu:12.04 : Ok gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3, Ubuntu clang version 3.0-6ubuntu3 (tags/RELEASE_30/final) (based on LLVM 3.0) 61 ubuntu:14.04 : Ok gcc (Ubuntu 4.8.4-2ubuntu1~14.04.4) 4.8.4 62 ubuntu:16.04 : Ok gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609, clang version 3.8.0-2ubuntu4 (tags/RELEASE_380/final) 63 ubuntu:16.04-x-arm : Ok arm-linux-gnueabihf-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609 64 ubuntu:16.04-x-arm64 : Ok aarch64-linux-gnu-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609 65 ubuntu:16.04-x-powerpc : Ok powerpc-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609 66 ubuntu:16.04-x-powerpc64 : Ok powerpc64-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609 67 ubuntu:16.04-x-powerpc64el : Ok powerpc64le-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609 68 ubuntu:16.04-x-s390 : Ok s390x-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609 69 ubuntu:18.04 : Ok gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, clang version 6.0.0-1ubuntu2 (tags/RELEASE_600/final) 70 ubuntu:18.04-x-arm : Ok arm-linux-gnueabihf-gcc (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04) 7.5.0 71 ubuntu:18.04-x-arm64 : Ok aarch64-linux-gnu-gcc (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04) 7.5.0 72 ubuntu:18.04-x-m68k : Ok m68k-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0 73 ubuntu:18.04-x-powerpc : Ok powerpc-linux-gnu-gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0 74 ubuntu:18.04-x-powerpc64 : Ok powerpc64-linux-gnu-gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0 75 ubuntu:18.04-x-powerpc64el : Ok powerpc64le-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0 76 ubuntu:18.04-x-riscv64 : Ok riscv64-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0 77 ubuntu:18.04-x-s390 : Ok s390x-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0 78 ubuntu:18.04-x-sh4 : Ok sh4-linux-gnu-gcc (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0 79 ubuntu:18.04-x-sparc64 : Ok sparc64-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0 80 ubuntu:18.10 : Ok gcc (Ubuntu 8.3.0-6ubuntu1~18.10.1) 8.3.0, clang version 7.0.0-3 (tags/RELEASE_700/final) 81 ubuntu:19.04 : Ok gcc (Ubuntu 8.3.0-6ubuntu1) 8.3.0, clang version 8.0.0-3 (tags/RELEASE_800/final) 82 ubuntu:19.04-x-alpha : Ok alpha-linux-gnu-gcc (Ubuntu 8.3.0-6ubuntu1) 8.3.0 83 ubuntu:19.04-x-arm64 : Ok aarch64-linux-gnu-gcc (Ubuntu/Linaro 8.3.0-6ubuntu1) 8.3.0 84 ubuntu:19.04-x-hppa : Ok hppa-linux-gnu-gcc (Ubuntu 8.3.0-6ubuntu1) 8.3.0 85 ubuntu:19.10 : Ok gcc (Ubuntu 9.2.1-9ubuntu2) 9.2.1 20191008, clang version 8.0.1-3build1 (tags/RELEASE_801/final) 86 219.74 ubuntu:20.04 : Ok gcc (Ubuntu 9.3.0-10ubuntu2) 9.3.0, clang version 10.0.0-4ubuntu1 # # uname -a Linux quaco 5.7.12-200.fc32.x86_64 #1 SMP Sat Aug 1 16:13:38 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux # git log --oneline -1 |
||
|
6665598658 |
perf tools powerpc: Add support for extended regs in power10
Added support for supported regs which are new in power10 ( MMCR3, SIER2, SIER3 ) to sample_reg_mask in the tool side to use with `-I?` option. Also added PVR check to send extended mask for power10 at kernel while capturing extended regs in each sample. Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Reviewed-by: Kajol Jain <kjain@linux.ibm.com> Reviewed-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michael Neuling <mikey@neuling.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
33583e6950 |
perf tools powerpc: Add support for extended register capability
Add extended regs to sample_reg_mask in the tool side to use with `-I?` option. Perf tools side uses extended mask to display the platform supported register names (with -I? option) to the user and also send this mask to the kernel to capture the extended registers in each sample. Hence decide the mask value based on the processor version. Currently definitions for `mfspr`, `SPRN_PVR` are part of `arch/powerpc/util/header.c`. Move this to a header file so that these definitions can be re-used in other source files as well. Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com> Reviewed-by: Kajol Jain <kjain@linux.ibm.com> Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Reviewed--by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michael Neuling <mikey@neuling.org> <mikey@neuling.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: linuxppc-dev@lists.ozlabs.org [Decide extended mask at run time based on platform] Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
0f10228c6f |
KVM: PPC: Fix typo on H_DISABLE_AND_GET hcall
On PAPR+ the hcall() on 0x1B0 is called H_DISABLE_AND_GET, but got defined as H_DISABLE_AND_GETC instead. This define was introduced with a typo in commit <b13a96cfb055> ("[PATCH] powerpc: Extends HCALL interface for InfiniBand usage"), and was later used without having the typo noticed. Signed-off-by: Leonardo Bras <leobras.c@gmail.com> Acked-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200707004812.190765-1-leobras.c@gmail.com |
||
|
5cf0e8ebc2 |
perf libdw: Fix off-by 1 relative directory includes
This is currently working due to extra include paths in the build. Before: $ cd tools/perf/arch/arm64/util $ ls -la ../../util/unwind-libdw.h ls: cannot access '../../util/unwind-libdw.h': No such file or directory After: $ ls -la ../../../util/unwind-libdw.h -rw-r----- 1 irogers irogers 553 Apr 17 14:31 ../../../util/unwind-libdw.h Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lore.kernel.org/lkml/20200529225232.207532-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
39548e50e6 |
perf powerpc: Don't ignore sym-handling.c file
Commit |
||
|
efc0cdc9ed |
perf evsel: Rename perf_evsel__{str,int}val() and other tracepoint field metehods to to evsel__*()
As those are not 'struct evsel' methods, not part of tools/lib/perf/, aka libperf, to whom the perf_ prefix belongs. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
1e1a873dc6 |
perf metricgroups: Enhance JSON/metric infrastructure to handle "?"
Patch enhances current metric infrastructure to handle "?" in the metric expression. The "?" can be use for parameters whose value not known while creating metric events and which can be replace later at runtime to the proper value. It also add flexibility to create multiple events out of single metric event added in JSON file. Patch adds function 'arch_get_runtimeparam' which is a arch specific function, returns the count of metric events need to be created. By default it return 1. This infrastructure needed for hv_24x7 socket/chip level events. "hv_24x7" chip level events needs specific chip-id to which the data is requested. Function 'arch_get_runtimeparam' implemented in header.c which extract number of sockets from sysfs file "sockets" under "/sys/devices/hv_24x7/interface/". With this patch basically we are trying to create as many metric events as define by runtime_param. For that one loop is added in function 'metricgroup__add_metric', which create multiple events at run time depend on return value of 'arch_get_runtimeparam' and merge that event in 'group_list'. To achieve that we are actually passing this parameter value as part of `expr__find_other` function and changing "?" present in metric expression with this value. As in our JSON file, there gonna be single metric event, and out of which we are creating multiple events. To understand which data count belongs to which parameter value, we also printing param value in generic_metric function. For example, command:# ./perf stat -M PowerBUS_Frequency -C 0 -I 1000 1.000101867 9,356,933 hv_24x7/pm_pb_cyc,chip=0/ # 2.3 GHz PowerBUS_Frequency_0 1.000101867 9,366,134 hv_24x7/pm_pb_cyc,chip=1/ # 2.3 GHz PowerBUS_Frequency_1 2.000314878 9,365,868 hv_24x7/pm_pb_cyc,chip=0/ # 2.3 GHz PowerBUS_Frequency_0 2.000314878 9,366,092 hv_24x7/pm_pb_cyc,chip=1/ # 2.3 GHz PowerBUS_Frequency_1 So, here _0 and _1 after PowerBUS_Frequency specify parameter value. Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Anju T Sudhakar <anju@linux.vnet.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Joe Mario <jmario@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Mamatha Inamdar <mamatha4@linux.vnet.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@ozlabs.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lore.kernel.org/lkml/20200401203340.31402-5-kjain@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
7eec00a747 |
perf symbols: Consolidate symbol fixup issue
After copying Arm64's perf archive with object files and perf.data file to x86 laptop, the x86's perf kernel symbol resolution fails. It outputs 'unknown' for all symbols parsing. This issue is root caused by the function elf__needs_adjust_symbols(), x86 perf tool uses one weak version, Arm64 (and powerpc) has rewritten their own version. elf__needs_adjust_symbols() decides if need to parse symbols with the relative offset address; but x86 building uses the weak function which misses to check for the elf type 'ET_DYN', so that it cannot parse symbols in Arm DSOs due to the wrong result from elf__needs_adjust_symbols(). The DSO parsing should not depend on any specific architecture perf building; e.g. x86 perf tool can parse Arm and Arm64 DSOs, vice versa. And confirmed by Naveen N. Rao that powerpc64 kernels are not being built as ET_DYN anymore and change to ET_EXEC. This patch removes the arch specific functions for Arm64 and powerpc and changes elf__needs_adjust_symbols() as a common function. In the common elf__needs_adjust_symbols(), it checks an extra condition 'ET_DYN' for elf header type. With this fixing, the Arm64 DSO can be parsed properly with x86's perf tool. Before: # perf script main 3258 1 branches: 0 [unknown] ([unknown]) => ffff800010c4665c [unknown] ([kernel.kallsyms]) main 3258 1 branches: ffff800010c46670 [unknown] ([kernel.kallsyms]) => ffff800010c4eaec [unknown] ([kernel.kallsyms]) main 3258 1 branches: ffff800010c4eaec [unknown] ([kernel.kallsyms]) => ffff800010c4eb00 [unknown] ([kernel.kallsyms]) main 3258 1 branches: ffff800010c4eb08 [unknown] ([kernel.kallsyms]) => ffff800010c4e780 [unknown] ([kernel.kallsyms]) main 3258 1 branches: ffff800010c4e7a0 [unknown] ([kernel.kallsyms]) => ffff800010c4eeac [unknown] ([kernel.kallsyms]) main 3258 1 branches: ffff800010c4eebc [unknown] ([kernel.kallsyms]) => ffff800010c4ed80 [unknown] ([kernel.kallsyms]) After: # perf script main 3258 1 branches: 0 [unknown] ([unknown]) => ffff800010c4665c coresight_timeout+0x54 ([kernel.kallsyms]) main 3258 1 branches: ffff800010c46670 coresight_timeout+0x68 ([kernel.kallsyms]) => ffff800010c4eaec etm4_enable_hw+0x3cc ([kernel.kallsyms]) main 3258 1 branches: ffff800010c4eaec etm4_enable_hw+0x3cc ([kernel.kallsyms]) => ffff800010c4eb00 etm4_enable_hw+0x3e0 ([kernel.kallsyms]) main 3258 1 branches: ffff800010c4eb08 etm4_enable_hw+0x3e8 ([kernel.kallsyms]) => ffff800010c4e780 etm4_enable_hw+0x60 ([kernel.kallsyms]) main 3258 1 branches: ffff800010c4e7a0 etm4_enable_hw+0x80 ([kernel.kallsyms]) => ffff800010c4eeac etm4_enable+0x2d4 ([kernel.kallsyms]) main 3258 1 branches: ffff800010c4eebc etm4_enable+0x2e4 ([kernel.kallsyms]) => ffff800010c4ed80 etm4_enable+0x1a8 ([kernel.kallsyms]) v3: Changed to check for ET_DYN across all architectures. v2: Fixed Arm64 and powerpc native building. Reported-by: Mike Leach <mike.leach@linaro.org> Signed-off-by: Leo Yan <leo.yan@linaro.org> Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Jiri Olsa <jolsa@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Allison Randal <allison@lohutok.net> Cc: Enrico Weigelt <info@metux.net> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: John Garry <john.garry@huawei.com> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com> Link: http://lore.kernel.org/lkml/20200306015759.10084-1-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
441b62acd9 |
tools: Fix off-by 1 relative directory includes
This is currently working due to extra include paths in the build. Committer testing: $ cd tools/include/uapi/asm/ Before this patch: $ ls -la ../../arch/x86/include/uapi/asm/errno.h ls: cannot access '../../arch/x86/include/uapi/asm/errno.h': No such file or directory $ After this patch; $ ls -la ../../../arch/x86/include/uapi/asm/errno.h -rw-rw-r--. 1 acme acme 31 Feb 20 12:42 ../../../arch/x86/include/uapi/asm/errno.h $ Check that that is still under tools/, i.e. hasn't escaped into the main kernel sources: $ cd ../../../arch/x86/include/uapi/asm/ $ pwd /home/acme/git/perf/tools/arch/x86/include/uapi/asm $ Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexios Zavras <alexios.zavras@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Igor Lubashev <ilubashe@akamai.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wei Li <liwei391@huawei.com> Link: http://lore.kernel.org/lkml/20200306071110.130202-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
a910e4666d |
perf parse: Report initial event parsing error
Record the first event parsing error and report. Implementing feedback from Jiri Olsa: https://lkml.org/lkml/2019/10/28/680 An example error is: $ tools/perf/perf stat -e c/c/ WARNING: multiple event parsing errors event syntax error: 'c/c/' \___ unknown term valid terms: event,filter_rem,filter_opc0,edge,filter_isoc,filter_tid,filter_loc,filter_nc,inv,umask,filter_opc1,tid_en,thresh,filter_all_op,filter_not_nm,filter_state,filter_nm,config,config1,config2,name,period,percore Initial error: event syntax error: 'c/c/' \___ Cannot find PMU `c'. Missing kernel support? Run 'perf list' for a list of valid events Usage: perf stat [<options>] [<command>] -e, --event <event> event selector. use 'perf list' to list available events Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Allison Randal <allison@lohutok.net> Cc: Andi Kleen <ak@linux.intel.com> Cc: Anju T Sudhakar <anju@linux.vnet.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Richter <tmricht@linux.ibm.com> Link: http://lore.kernel.org/lkml/20191116074652.9960-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
f67001a4a0 |
perf tools: Propagate get_cpuid() error
For consistency, propagate the exact cause for get_cpuid() to have failed. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-9ig269f7ktnhh99g4l15vpu2@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
2bff2b8285 |
perf kvm stat: Set 'trace_cycles' as default event for 'perf kvm record' in powerpc
Use 'trace_imc/trace_cycles' as the default event for 'perf kvm record' in powerpc. Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com> Reviewed-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lore.kernel.org/lkml/20190718181749.30612-3-anju@linux.vnet.ibm.com [ Add missing pmu.h header, needed because this patch uses pmu_have_event() ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
124eb5f82b |
perf kvm: Add arch neutral function to choose event for perf kvm record
'perf kvm record' uses 'cycles'(if the user did not specify any event) as the default event to profile the guest. This will not provide any proper samples from the guest incase of powerpc architecture, since in powerpc the PMUs are controlled by the guest rather than the host. Patch adds a function to pick an arch specific event for 'perf kvm record', instead of selecting 'cycles' as a default event for all architectures. For powerpc this function checks for any user specified event, and if there isn't any it returns invalid instead of proceeding with 'cycles' event. Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com> Reviewed-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lore.kernel.org/lkml/20190718181749.30612-2-anju@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
9c9e754fb8 |
perf callchain: Remove needless event.h include
All we need is a bunch of struct forward declarations and then add event.h to the only place that was getting it indirectly via callchain.h. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-qq2xhyuxcvx5vmxha9otjd8d@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
fb71c86cc8 |
perf tools: Remove util.h from where it is not needed
Check that it is not needed and remove, fixing up some fallout for places where it was only serving to get something else. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-9h6dg6lsqe2usyqjh5rrues4@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
4a903c2e15 |
perf tools: Remove debug.h from places where it is not needed
Pruning a bit more the includes dependency tree. Building this thing on lots of containers takes time, we better reduce the time per build, each container is doing 6 builds when clang and clang-devel are available, and the plan is to do a 'make -C tools/perf build-test' that have many more. Also helps when doing normal development, as touching some random file will have a much reduced chance of triggering lots of rebuilds. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-r889ur2cxe16m91m2a4pl15p@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
d3300a3c4e |
perf symbols: Move mem_info and branch_info out of symbol.h
The mem_info struct goes to mem-events.h and branch_info goes to branch.h, where they belong, this way we can remove several headers from symbols.h and trim the include dependency tree more. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-aupw71xnravcsu2xoabfmhpc@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
b1d1b094f7 |
perf symbols: Move symsrc prototypes to a separate header
So that we can remove dso.h from symbol.h and reduce the header dependency tree. Fixup cases where struct dso guts are needed but were obtained via symbol.h, indirectly. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-ip683cegt306ncu3gsz7ii21@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
4cb3c6d546 |
perf event: Remove needless include directives from event.h
bpf.h and build-id.h are not needed at all in event.h, remove them. And fixup the fallout of files that were getting needed stuff from this now pruned include. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-rdm3dgtlrndmmnlc4bafsg3b@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
0ac25fd0a0 |
perf tools: Remove perf.h from source files not needing it
With the movement of lots of stuff out of perf.h to other headers we ended up not needing it in lots of places, remove it from those places. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-c718m0sxxwp73lp9d8vpihb4@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
0f98b11c61 |
perf evlist: Rename perf_evlist__new() to evlist__new()
Rename perf_evlist__new() to evlist__new(), so we don't have a name clash when we add perf_evlist__new() in libperf. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-9-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
63503dba87 |
perf evlist: Rename struct perf_evlist to struct evlist
Rename struct perf_evlist to struct evlist, so we don't have a name clash when we add struct perf_evlist in libperf. Committer notes: Added fixes to build on arm64, from Jiri and from me (tools/perf/util/cs-etm.c) Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-6-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
32dcd021d0 |
perf evsel: Rename struct perf_evsel to struct evsel
Rename struct perf_evsel to struct evsel, so we don't have a name clash when we add struct perf_evsel in libperf. Committer notes: Added fixes for arm64, provided by Jiri. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190721112506.12306-5-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
7f7c536f23 |
tools lib: Adopt zalloc()/zfree() from tools/perf
Eroding a bit more the tools/perf/util/util.h hodpodge header. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-natazosyn9rwjka25tvcnyi0@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
215a0d305c |
perf tools: Add missing headers, mostly stdlib.h
Part of the erosion of util/util.h, that will lose its include stdlib.h, we need to add it to places where it is needed but was getting it indirectly. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-1imnqezw99ahc07fjeb51qby@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
d2912cb15b |
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500
Based on 2 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation # extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 4122 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Enrico Weigelt <info@metux.net> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
2874c5fd28 |
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 3029 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190527070032.746973796@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
5ff328836d |
perf tools: Rename build libperf to perf
Rename build libperf to perf, because it's used to build perf. The libperf build object name will be used for libperf library. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20190213123246.4015-4-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
6854daa07a |
perf/core improvements and fixes:
Hardware tracing: Adrian Hunter: - Handle calls optimized into jumps to a different symbol in the thread stack routines used to process hardware traces (Adrian Hunter) Intel PT: Adrian Hunter: - Fix overlap calculation for padding. - Fix CYC timestamp calculation after OVF. - Packet splitting can only happen in 32-bit. - Add timestamp to auxtrace errors. ARM CoreSight: Leo Yan: - Add last instruction information in packet - Set sample flags for instruction range, exception and return packets and for a trace discontinuity. - Add exception number in exception packet - Change tuple from traceID-CPU# to traceID-metadata - Add traceID in packet Mathieu Poirier: - Add "sinks" group to PMU directory - Use event attributes to send sink information to kernel - Remove set_drv_config() API, no longer used. perf annotate: Jiri Olsa: - Delay symbol annotation to the resort phase, speeding up 'perf report' startup. perf record: Alexey Budankov: - Allow binding userspace buffers to NUMA nodes. Symbols: Adrian Hunter: - Fix calculating of symbol sizes when splitting kallsyms into maps for kcore processing. Vendor events: William Cohen: - Intel: Fix Load_Miss_Real_Latency on CLX Misc: Arnaldo Carvalho de Melo: - Streamline headers, removing includes when all that is needed are just forward declarations, fixup the fallout for cases where headers should have been explicitely included but were instead obtained indirectly, by sheer luck. - Add fallback versions for CPU_{OR,EQUAL}(), so that code using it continue to build on older systems where those were not yet introduced or in systems using some other libc than the GNU one where those helpers aren't present. Documentation: Changbin Du: - Add documentation for BPF event selection. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCXFsqugAKCRCyPKLppCJ+ JzpwAQDEh1mNZoxfdGZEi9d+8p2hnRlOs3GOUG4iGnqAYfae4QEAkMJ0V1wrmkdw NXgV+PgWfDcgbD4Cn90eWA8M6KEcbgA= =ogOF -----END PGP SIGNATURE----- Merge tag 'perf-core-for-mingo-5.1-20190206' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: Hardware tracing: Adrian Hunter: - Handle calls optimized into jumps to a different symbol in the thread stack routines used to process hardware traces (Adrian Hunter) Intel PT: Adrian Hunter: - Fix overlap calculation for padding. - Fix CYC timestamp calculation after OVF. - Packet splitting can only happen in 32-bit. - Add timestamp to auxtrace errors. ARM CoreSight: Leo Yan: - Add last instruction information in packet - Set sample flags for instruction range, exception and return packets and for a trace discontinuity. - Add exception number in exception packet - Change tuple from traceID-CPU# to traceID-metadata - Add traceID in packet Mathieu Poirier: - Add "sinks" group to PMU directory - Use event attributes to send sink information to kernel - Remove set_drv_config() API, no longer used. perf annotate: Jiri Olsa: - Delay symbol annotation to the resort phase, speeding up 'perf report' startup. perf record: Alexey Budankov: - Allow binding userspace buffers to NUMA nodes. Symbols: Adrian Hunter: - Fix calculating of symbol sizes when splitting kallsyms into maps for kcore processing. Vendor events: William Cohen: - Intel: Fix Load_Miss_Real_Latency on CLX Misc: Arnaldo Carvalho de Melo: - Streamline headers, removing includes when all that is needed are just forward declarations, fixup the fallout for cases where headers should have been explicitely included but were instead obtained indirectly, by sheer luck. - Add fallback versions for CPU_{OR,EQUAL}(), so that code using it continue to build on older systems where those were not yet introduced or in systems using some other libc than the GNU one where those helpers aren't present. Documentation: Changbin Du: - Add documentation for BPF event selection. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
|
5afbb37c68 |
perf powerpc kvm-stat: Add missing evlist.h header
This header was being obtained indirectly, by sheer luck, add it. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-c3h8oyav16iu5ivput8n4wt6@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
5691903a6f |
perf kvm stat: Replace kvm-stat.h includes with forward declarations
To reduce the include header dependency tree and speed up perf builds. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-dngwaxuhfnhksawgdpo6e74n@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
d6e4ae499f |
perf powerpc: Add missing headers to skip-callchain-idx.c
It uses several structs but don't explicitely includes the headers where they are defined, getting them by sheer luck from one of the headers it includes, since those are being streamlined to avoid unnecessary rebuilds when changes are made to a random header, they will break, fix them now so that they continue to build. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Maynard Johnson <maynard@us.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Sandipan Das <sandipan@linux.ibm.com> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Link: https://lkml.kernel.org/n/tip-j1nyksegpnz36wi3qx2p46i1@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
f0fabf9c89 |
perf mem/c2c: Fix perf_mem_events to support powerpc
PowerPC hardware does not have a builtin latency filter (--ldlat) for the "mem-load" event and perf_mem_events by default includes "/ldlat=30/" which is causing a failure on PowerPC. Refactor the code to support "perf mem/c2c" on PowerPC. This patch depends on kernel side changes done my Madhavan: https://lists.ozlabs.org/pipermail/linuxppc-dev/2018-December/182596.html Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Dick Fowles <fowles@inreach.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Joe Mario <jmario@redhat.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Namhyung Kim <namhyung@kernel.org> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/20190129132412.771-1-ravi.bangoria@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
6529870cb0 |
powerpc/perf: Update perf_regs structure to include MMCRA
On each sample, Monitor Mode Control Register A (MMCRA) content is
saved in pt_regs. MMCRA does not have a entry as-is in the pt_regs but
instead, MMCRA content is saved in the "dsisr" register of pt_regs.
Patch adds another entry to the perf_regs structure to include the
"MMCRA" printing which internally maps to the "dsisr" of pt_regs.
It also check for the MMCRA availability in the platform and present
value accordingly
mpe: This was the 2nd patch in a series with commit
|
||
|
333804dc3b |
powerpc/perf: Update perf_regs structure to include SIER
On each sample, Sample Instruction Event Register (SIER) content is saved in pt_regs. SIER does not have a entry as-is in the pt_regs but instead, SIER content is saved in the "dar" register of pt_regs. Patch adds another entry to the perf_regs structure to include the "SIER" printing which internally maps to the "dar" of pt_regs. It also check for the SIER availability in the platform and present value accordingly Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> |
||
|
9d67121a4f |
Merge remote-tracking branch 'remotes/powerpc/topic/ppc-kvm' into kvm-ppc-next
This merges in the "ppc-kvm" topic branch of the powerpc tree to get a series of commits that touch both general arch/powerpc code and KVM code. These commits will be merged both via the KVM tree and the powerpc tree. Signed-off-by: Paul Mackerras <paulus@ozlabs.org> |
||
|
d24ea8a733 |
KVM: PPC: Book3S: Simplify external interrupt handling
Currently we use two bits in the vcpu pending_exceptions bitmap to indicate that an external interrupt is pending for the guest, one for "one-shot" interrupts that are cleared when delivered, and one for interrupts that persist until cleared by an explicit action of the OS (e.g. an acknowledge to an interrupt controller). The BOOK3S_IRQPRIO_EXTERNAL bit is used for one-shot interrupt requests and BOOK3S_IRQPRIO_EXTERNAL_LEVEL is used for persisting interrupts. In practice BOOK3S_IRQPRIO_EXTERNAL never gets used, because our Book3S platforms generally, and pseries in particular, expect external interrupt requests to persist until they are acknowledged at the interrupt controller. That combined with the confusion introduced by having two bits for what is essentially the same thing makes it attractive to simplify things by only using one bit. This patch does that. With this patch there is only BOOK3S_IRQPRIO_EXTERNAL, and by default it has the semantics of a persisting interrupt. In order to avoid breaking the ABI, we introduce a new "external_oneshot" flag which preserves the behaviour of the KVM_INTERRUPT ioctl with the KVM_INTERRUPT_SET argument. Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@ozlabs.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> |
||
|
fa694160cc |
perf probe powerpc: Ignore SyS symbols irrespective of endianness
This makes sure that the SyS symbols are ignored for any powerpc system,
not just the big endian ones.
Reported-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
Reviewed-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Fixes:
|
||
|
354b064b8e |
perf probe powerpc: Fix trace event post-processing
In some cases, a symbol may have multiple aliases. Attempting to add an
entry probe for such symbols results in a probe being added at an
incorrect location while it fails altogether for return probes. This is
only applicable for binaries with debug information.
During the arch-dependent post-processing, the offset from the start of
the symbol at which the probe is to be attached is determined and added
to the start address of the symbol to get the probe's location. In case
there are multiple aliases, this offset gets added multiple times for
each alias of the symbol and we end up with an incorrect probe location.
This can be verified on a powerpc64le system as shown below.
$ nm /lib/modules/$(uname -r)/build/vmlinux | grep "sys_open$"
...
c000000000414290 T __se_sys_open
c000000000414290 T sys_open
$ objdump -d /lib/modules/$(uname -r)/build/vmlinux | grep -A 10 "<__se_sys_open>:"
c000000000414290 <__se_sys_open>:
c000000000414290: 19 01 4c 3c addis r2,r12,281
c000000000414294: 70 c4 42 38 addi r2,r2,-15248
c000000000414298: a6 02 08 7c mflr r0
c00000000041429c: e8 ff a1 fb std r29,-24(r1)
c0000000004142a0: f0 ff c1 fb std r30,-16(r1)
c0000000004142a4: f8 ff e1 fb std r31,-8(r1)
c0000000004142a8: 10 00 01 f8 std r0,16(r1)
c0000000004142ac: c1 ff 21 f8 stdu r1,-64(r1)
c0000000004142b0: 78 23 9f 7c mr r31,r4
c0000000004142b4: 78 1b 7e 7c mr r30,r3
For both the entry probe and the return probe, the probe location
should be _text+4276888 (0xc000000000414298). Since another alias
exists for 'sys_open', the post-processing code will end up adding
the offset (8 for powerpc64le) twice and perf will attempt to add
the probe at _text+4276896 (0xc0000000004142a0) instead.
Before:
# perf probe -v -a sys_open
probe-definition(0): sys_open
symbol:sys_open file:(null) line:0 offset:0 return:0 lazy:(null)
0 arguments
Looking at the vmlinux_path (8 entries long)
Using /lib/modules/4.18.0-rc8+/build/vmlinux for symbols
Open Debuginfo file: /lib/modules/4.18.0-rc8+/build/vmlinux
Try to find probe point from debuginfo.
Symbol sys_open address found : c000000000414290
Matched function: __se_sys_open [2ad03a0]
Probe point found: __se_sys_open+0
Found 1 probe_trace_events.
Opening /sys/kernel/debug/tracing/kprobe_events write=1
Writing event: p:probe/sys_open _text+4276896
Added new event:
probe:sys_open (on sys_open)
...
# perf probe -v -a sys_open%return $retval
probe-definition(0): sys_open%return
symbol:sys_open file:(null) line:0 offset:0 return:1 lazy:(null)
0 arguments
Looking at the vmlinux_path (8 entries long)
Using /lib/modules/4.18.0-rc8+/build/vmlinux for symbols
Open Debuginfo file: /lib/modules/4.18.0-rc8+/build/vmlinux
Try to find probe point from debuginfo.
Symbol sys_open address found : c000000000414290
Matched function: __se_sys_open [2ad03a0]
Probe point found: __se_sys_open+0
Found 1 probe_trace_events.
Opening /sys/kernel/debug/tracing/README write=0
Opening /sys/kernel/debug/tracing/kprobe_events write=1
Parsing probe_events: p:probe/sys_open _text+4276896
Group:probe Event:sys_open probe:p
Writing event: r:probe/sys_open__return _text+4276896
Failed to write event: Invalid argument
Error: Failed to add events. Reason: Invalid argument (Code: -22)
After:
# perf probe -v -a sys_open
probe-definition(0): sys_open
symbol:sys_open file:(null) line:0 offset:0 return:0 lazy:(null)
0 arguments
Looking at the vmlinux_path (8 entries long)
Using /lib/modules/4.18.0-rc8+/build/vmlinux for symbols
Open Debuginfo file: /lib/modules/4.18.0-rc8+/build/vmlinux
Try to find probe point from debuginfo.
Symbol sys_open address found : c000000000414290
Matched function: __se_sys_open [2ad03a0]
Probe point found: __se_sys_open+0
Found 1 probe_trace_events.
Opening /sys/kernel/debug/tracing/kprobe_events write=1
Writing event: p:probe/sys_open _text+4276888
Added new event:
probe:sys_open (on sys_open)
...
# perf probe -v -a sys_open%return $retval
probe-definition(0): sys_open%return
symbol:sys_open file:(null) line:0 offset:0 return:1 lazy:(null)
0 arguments
Looking at the vmlinux_path (8 entries long)
Using /lib/modules/4.18.0-rc8+/build/vmlinux for symbols
Open Debuginfo file: /lib/modules/4.18.0-rc8+/build/vmlinux
Try to find probe point from debuginfo.
Symbol sys_open address found : c000000000414290
Matched function: __se_sys_open [2ad03a0]
Probe point found: __se_sys_open+0
Found 1 probe_trace_events.
Opening /sys/kernel/debug/tracing/README write=0
Opening /sys/kernel/debug/tracing/kprobe_events write=1
Parsing probe_events: p:probe/sys_open _text+4276888
Group:probe Event:sys_open probe:p
Writing event: r:probe/sys_open__return _text+4276888
Added new event:
probe:sys_open__return (on sys_open%return)
...
Reported-by: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Fixes:
|
||
|
9068533e4f |
perf powerpc: Fix callchain ip filtering when return address is in a register
For powerpc64, perf will filter out the second entry in the callchain, i.e. the LR value, if the return address of the function corresponding to the probed location has already been saved on its caller's stack. The state of the return address is determined using debug information. At any point within a function, if the return address is already saved somewhere, a DWARF expression can tell us about its location. If the return address in still in LR only, no DWARF expression would exist. Typically, the instructions in a function's prologue first copy the LR value to R0 and then pushes R0 on to the stack. If LR has already been copied to R0 but R0 is yet to be pushed to the stack, we can still get a DWARF expression that says that the return address is in R0. This is indicating that getting a DWARF expression for the return address does not guarantee the fact that it has already been saved on the stack. This can be observed on a powerpc64le system running Fedora 27 as shown below. # objdump -d /usr/lib64/libc-2.26.so | less ... 000000000015af20 <inet_pton>: 15af20: 0b 00 4c 3c addis r2,r12,11 15af24: e0 c1 42 38 addi r2,r2,-15904 15af28: a6 02 08 7c mflr r0 15af2c: f0 ff c1 fb std r30,-16(r1) 15af30: f8 ff e1 fb std r31,-8(r1) 15af34: 78 1b 7f 7c mr r31,r3 15af38: 78 23 83 7c mr r3,r4 15af3c: 78 2b be 7c mr r30,r5 15af40: 10 00 01 f8 std r0,16(r1) 15af44: c1 ff 21 f8 stdu r1,-64(r1) 15af48: 28 00 81 f8 std r4,40(r1) ... # readelf --debug-dump=frames-interp /usr/lib64/libc-2.26.so | less ... 00027024 0000000000000024 00027028 FDE cie=00000000 pc=000000000015af20..000000000015af88 LOC CFA r30 r31 ra 000000000015af20 r1+0 u u u 000000000015af34 r1+0 c-16 c-8 r0 000000000015af48 r1+64 c-16 c-8 c+16 000000000015af5c r1+0 c-16 c-8 c+16 000000000015af78 r1+0 u u ... # perf probe -x /usr/lib64/libc-2.26.so -a inet_pton+0x18 # perf record -e probe_libc:inet_pton -g ping -6 -c 1 ::1 # perf script Before: ping 2829 [005] 512917.460174: probe_libc:inet_pton: (7fff7e2baf38) 7fff7e2baf38 __GI___inet_pton+0x18 (/usr/lib64/libc-2.26.so) 7fff7e2705b4 getaddrinfo+0x164 (/usr/lib64/libc-2.26.so) 12f152d70 _init+0xbfc (/usr/bin/ping) 7fff7e1836a0 generic_start_main.isra.0+0x140 (/usr/lib64/libc-2.26.so) 7fff7e183898 __libc_start_main+0xb8 (/usr/lib64/libc-2.26.so) 0 [unknown] ([unknown]) After: ping 2829 [005] 512917.460174: probe_libc:inet_pton: (7fff7e2baf38) 7fff7e2baf38 __GI___inet_pton+0x18 (/usr/lib64/libc-2.26.so) 7fff7e26fa54 gaih_inet.constprop.7+0xf44 (/usr/lib64/libc-2.26.so) 7fff7e2705b4 getaddrinfo+0x164 (/usr/lib64/libc-2.26.so) 12f152d70 _init+0xbfc (/usr/bin/ping) 7fff7e1836a0 generic_start_main.isra.0+0x140 (/usr/lib64/libc-2.26.so) 7fff7e183898 __libc_start_main+0xb8 (/usr/lib64/libc-2.26.so) 0 [unknown] ([unknown]) Reported-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Sandipan Das <sandipan@linux.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Maynard Johnson <maynard@us.ibm.com> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/66e848a7bdf2d43b39210a705ff6d828a0865661.1530724939.git.sandipan@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
c715fcfda5 |
perf powerpc: Fix callchain ip filtering
For powerpc64, redundant entries in the callchain are filtered out by
determining the state of the return address and the stack frame using
DWARF debug information.
For making these filtering decisions we must analyze the debug
information for the location corresponding to the program counter value,
i.e. the first entry in the callchain, and not the LR value; otherwise,
perf may filter out either the second or the third entry in the
callchain incorrectly.
This can be observed on a powerpc64le system running Fedora 27 as shown
below.
Case 1 - Attaching a probe at inet_pton+0x8 (binary offset 0x15af28).
Return address is still in LR and a new stack frame is not yet
allocated. The LR value, i.e. the second entry, should not be
filtered out.
# objdump -d /usr/lib64/libc-2.26.so | less
...
000000000010eb10 <gaih_inet.constprop.7>:
...
10fa48: 78 bb e4 7e mr r4,r23
10fa4c: 0a 00 60 38 li r3,10
10fa50: d9 b4 04 48 bl 15af28 <inet_pton+0x8>
10fa54: 00 00 00 60 nop
10fa58: ac f4 ff 4b b 10ef04 <gaih_inet.constprop.7+0x3f4>
...
0000000000110450 <getaddrinfo>:
...
1105a8: 54 00 ff 38 addi r7,r31,84
1105ac: 58 00 df 38 addi r6,r31,88
1105b0: 69 e5 ff 4b bl 10eb18 <gaih_inet.constprop.7+0x8>
1105b4: 78 1b 71 7c mr r17,r3
1105b8: 50 01 7f e8 ld r3,336(r31)
...
000000000015af20 <inet_pton>:
15af20: 0b 00 4c 3c addis r2,r12,11
15af24: e0 c1 42 38 addi r2,r2,-15904
15af28: a6 02 08 7c mflr r0
15af2c: f0 ff c1 fb std r30,-16(r1)
15af30: f8 ff e1 fb std r31,-8(r1)
...
# perf probe -x /usr/lib64/libc-2.26.so -a inet_pton+0x8
# perf record -e probe_libc:inet_pton -g ping -6 -c 1 ::1
# perf script
Before:
ping 4507 [002] 514985.546540: probe_libc:inet_pton: (7fffa7dbaf28)
7fffa7dbaf28 __GI___inet_pton+0x8 (/usr/lib64/libc-2.26.so)
7fffa7d705b4 getaddrinfo+0x164 (/usr/lib64/libc-2.26.so)
13fb52d70 _init+0xbfc (/usr/bin/ping)
7fffa7c836a0 generic_start_main.isra.0+0x140 (/usr/lib64/libc-2.26.so)
7fffa7c83898 __libc_start_main+0xb8 (/usr/lib64/libc-2.26.so)
0 [unknown] ([unknown])
After:
ping 4507 [002] 514985.546540: probe_libc:inet_pton: (7fffa7dbaf28)
7fffa7dbaf28 __GI___inet_pton+0x8 (/usr/lib64/libc-2.26.so)
7fffa7d6fa54 gaih_inet.constprop.7+0xf44 (/usr/lib64/libc-2.26.so)
7fffa7d705b4 getaddrinfo+0x164 (/usr/lib64/libc-2.26.so)
13fb52d70 _init+0xbfc (/usr/bin/ping)
7fffa7c836a0 generic_start_main.isra.0+0x140 (/usr/lib64/libc-2.26.so)
7fffa7c83898 __libc_start_main+0xb8 (/usr/lib64/libc-2.26.so)
0 [unknown] ([unknown])
Case 2 - Attaching a probe at _int_malloc+0x180 (binary offset 0x9cf10).
Return address in still in LR and a new stack frame has already
been allocated but not used. The caller's caller, i.e. the third
entry, is invalid and should be filtered out and not the second
one.
# objdump -d /usr/lib64/libc-2.26.so | less
...
000000000009cd90 <_int_malloc>:
9cd90: 17 00 4c 3c addis r2,r12,23
9cd94: 70 a3 42 38 addi r2,r2,-23696
9cd98: 26 00 80 7d mfcr r12
9cd9c: f8 ff e1 fb std r31,-8(r1)
9cda0: 17 00 e4 3b addi r31,r4,23
9cda4: d8 ff 61 fb std r27,-40(r1)
9cda8: 78 23 9b 7c mr r27,r4
9cdac: 1f 00 bf 2b cmpldi cr7,r31,31
9cdb0: f0 ff c1 fb std r30,-16(r1)
9cdb4: b0 ff c1 fa std r22,-80(r1)
9cdb8: 78 1b 7e 7c mr r30,r3
9cdbc: 08 00 81 91 stw r12,8(r1)
9cdc0: 11 ff 21 f8 stdu r1,-240(r1)
9cdc4: 4c 01 9d 41 bgt cr7,9cf10 <_int_malloc+0x180>
9cdc8: 20 00 a4 2b cmpldi cr7,r4,32
...
9cf08: 00 00 00 60 nop
9cf0c: 00 00 42 60 ori r2,r2,0
9cf10: e4 06 ff 7b rldicr r31,r31,0,59
9cf14: 40 f8 a4 7f cmpld cr7,r4,r31
9cf18: 68 05 9d 41 bgt cr7,9d480 <_int_malloc+0x6f0>
...
000000000009e3c0 <tcache_init.part.4>:
...
9e420: 40 02 80 38 li r4,576
9e424: 78 fb e3 7f mr r3,r31
9e428: 71 e9 ff 4b bl 9cd98 <_int_malloc+0x8>
9e42c: 00 00 a3 2f cmpdi cr7,r3,0
9e430: 78 1b 7e 7c mr r30,r3
...
000000000009f7a0 <__libc_malloc>:
...
9f8f8: 00 00 89 2f cmpwi cr7,r9,0
9f8fc: 1c ff 9e 40 bne cr7,9f818 <__libc_malloc+0x78>
9f900: c9 ea ff 4b bl 9e3c8 <tcache_init.part.4+0x8>
9f904: 00 00 00 60 nop
9f908: e8 90 22 e9 ld r9,-28440(r2)
...
# perf probe -x /usr/lib64/libc-2.26.so -a _int_malloc+0x180
# perf record -e probe_libc:_int_malloc -g ./test-malloc
# perf script
Before:
test-malloc 6554 [009] 515975.797403: probe_libc:_int_malloc: (7fffa6e6cf10)
7fffa6e6cf10 _int_malloc+0x180 (/usr/lib64/libc-2.26.so)
7fffa6dd0000 [unknown] (/usr/lib64/libc-2.26.so)
7fffa6e6f904 malloc+0x164 (/usr/lib64/libc-2.26.so)
7fffa6e6f9fc malloc+0x25c (/usr/lib64/libc-2.26.so)
100006b4 main+0x38 (/home/testuser/test-malloc)
7fffa6df36a0 generic_start_main.isra.0+0x140 (/usr/lib64/libc-2.26.so)
7fffa6df3898 __libc_start_main+0xb8 (/usr/lib64/libc-2.26.so)
0 [unknown] ([unknown])
After:
test-malloc 6554 [009] 515975.797403: probe_libc:_int_malloc: (7fffa6e6cf10)
7fffa6e6cf10 _int_malloc+0x180 (/usr/lib64/libc-2.26.so)
7fffa6e6e42c tcache_init.part.4+0x6c (/usr/lib64/libc-2.26.so)
7fffa6e6f904 malloc+0x164 (/usr/lib64/libc-2.26.so)
7fffa6e6f9fc malloc+0x25c (/usr/lib64/libc-2.26.so)
100006b4 main+0x38 (/home/sandipan/test-malloc)
7fffa6df36a0 generic_start_main.isra.0+0x140 (/usr/lib64/libc-2.26.so)
7fffa6df3898 __libc_start_main+0xb8 (/usr/lib64/libc-2.26.so)
0 [unknown] ([unknown])
Signed-off-by: Sandipan Das <sandipan@linux.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Maynard Johnson <maynard@us.ibm.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Fixes:
|
||
|
143c99f6ac |
perf report powerpc: Fix crash if callchain is empty
For some cases, the callchain provided by the kernel may be empty. So, the callchain ip filtering code will cause a crash if we do not check whether the struct ip_callchain pointer is NULL before accessing any members. This can be observed on a powerpc64le system running Fedora 27 as shown below. # perf record -b -e cycles:u ls Before: # perf report --branch-history perf: Segmentation fault -------- backtrace -------- perf[0x1027615c] linux-vdso64.so.1(__kernel_sigtramp_rt64+0x0)[0x7fff856304d8] perf(arch_skip_callchain_idx+0x44)[0x10257c58] perf[0x1017f2e4] perf(thread__resolve_callchain+0x124)[0x1017ff5c] perf(sample__resolve_callchain+0xf0)[0x10172788] ... After: # perf report --branch-history Samples: 25 of event 'cycles:u', Event count (approx.): 2306870 Overhead Source:Line Symbol Shared Object + 11.60% _init+35736 [.] _init ls + 9.84% strcoll_l.c:137 [.] __strcoll_l libc-2.26.so + 9.16% memcpy.S:175 [.] __memcpy_power7 libc-2.26.so + 9.01% gconv_charset.h:54 [.] _nl_find_locale libc-2.26.so + 8.87% dl-addr.c:52 [.] _dl_addr libc-2.26.so + 8.83% _init+236 [.] _init ls ... Reported-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Sandipan Das <sandipan@linux.ibm.com> Acked-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20180611104049.11048-1-sandipan@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
4546263d72 |
perf thread: Introduce thread__find_symbol()
Out of thread__find_addr_location(..., MAP__FUNCTION, ...), idea here is to continue removing references to MAP__{FUNCTION,VARIABLE} ahead of getting both types of symbols in the same rbtree, as various places do two lookups, looking first at MAP__FUNCTION, then at MAP__VARIABLE. So thread__find_symbol() will eventually do just that, and 'struct symbol' will have the symbol type, for code that cares about that. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-n7528en9e08yd3flzmb26tth@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
4b3a2716dd |
perf probe: Find versioned symbols from map
Commit
|
||
|
54e32dc0f8 |
perf pmu: Pass pmu as a parameter to get_cpuid_str()
The cpuid string will not be same on all CPUs on heterogeneous platforms like ARM's big.LITTLE, adding provision(using pmu->cpus) to find cpuid string from associated CPUs of PMU CORE device. Also optimise arguments to function pmu_add_cpu_aliases. Signed-off-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Jayachandran C <jnair@caviumnetworks.com> Cc: Jonathan Cameron <jonathan.cameron@huawei.com> Cc: linux-arm-kernel@lists.infradead.org Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Robert Richter <robert.richter@cavium.com> Cc: Shaokun Zhang <zhangshaokun@hisilicon.com> Link: http://lkml.kernel.org/r/20171016183222.25750-2-ganapatrao.kulkarni@cavium.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
b24413180f |
License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
|
544abd44c7 |
perf probe: Allow placing uprobes in alternate namespaces.
Teaches perf how to place a uprobe on a file that's in a different mount namespace. The user must add the probe using the --target-ns argument to perf probe. Once it has been placed, it may be recorded against without further namespace-specific commands. Signed-off-by: Krister Johansen <kjlx@templeofstupid.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> [ PPC build fixed by Ravi: ] Link: http://lkml.kernel.org/r/1500287542-6219-1-git-send-email-ravi.bangoria@linux.vnet.ibm.com Cc: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com> [ Fix !HAVE_DWARF_SUPPORT build ] Link: http://lkml.kernel.org/r/1499305693-1599-4-git-send-email-kjlx@templeofstupid.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
a7f0fda085 |
perf unwind: Support for powerpc
Porting PPC to libdw only needs an architecture-specific hook to move the register state from perf to libdw. The ARM and x86 architectures already use libdw, and it is useful to have as much common code for the unwinder as possible. Mark Wielaard has contributed a frame-based unwinder to libdw, so that unwinding works even for binaries that do not have CFI information. In addition, libunwind is always preferred to libdw by the build machinery so this cannot introduce regressions on machines that have both libunwind and libdw installed. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Milian Wolff <milian.wolff@kdab.com> Acked-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/1496312681-20133-1-git-send-email-pbonzini@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |