Move the get_main_thread function from db-export.c to thread.c so that
it can be used elsewhere.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1464051145-19968-2-git-send-email-andi@firstfloor.org
[ Removed leftover bits from db-export.h ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Before this patch, a simple 'perf record' could fail if kptr_restrict is
set to 1 (for normal user) or 2 (for root):
# perf record ls
WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted,
check /proc/sys/kernel/kptr_restrict.
Samples in kernel functions may not be resolved if a suitable vmlinux
file is not found in the buildid cache or in the vmlinux path.
Samples in kernel modules won't be resolved at all.
If some relocation was applied (e.g. kexec) symbols may be misresolved
even with a suitable vmlinux or kallsyms file.
Segmentation fault (core dumped)
This patch skips perf_event__synthesize_kernel_mmap() when kptr is not
available.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fixes: 45e9005690 ("perf machine: Do not bail out if not managing to read ref reloc symbol")
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1464081688-167940-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If kptr_restrict is set to 2, even root is not allowed to see pointers.
This patch checks kptr_restrict even if euid == 0. For root, report
error if kptr_restrict is 2.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1464081688-167940-1-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull perf updates from Ingo Molnar:
"Mostly tooling and PMU driver fixes, but also a number of late updates
such as the reworking of the call-chain size limiting logic to make
call-graph recording more robust, plus tooling side changes for the
new 'backwards ring-buffer' extension to the perf ring-buffer"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (34 commits)
perf record: Read from backward ring buffer
perf record: Rename variable to make code clear
perf record: Prevent reading invalid data in record__mmap_read
perf evlist: Add API to pause/resume
perf trace: Use the ptr->name beautifier as default for "filename" args
perf trace: Use the fd->name beautifier as default for "fd" args
perf report: Add srcline_from/to branch sort keys
perf evsel: Record fd into perf_mmap
perf evsel: Add overwrite attribute and check write_backward
perf tools: Set buildid dir under symfs when --symfs is provided
perf trace: Only auto set call-graph to "dwarf" when syscalls are being traced
perf annotate: Sort list of recognised instructions
perf annotate: Fix identification of ARM blt and bls instructions
perf tools: Fix usage of max_stack sysctl
perf callchain: Stop validating callchains by the max_stack sysctl
perf trace: Fix exit_group() formatting
perf top: Use machine->kptr_restrict_warned
perf trace: Warn when trying to resolve kernel addresses with kptr_restrict=1
perf machine: Do not bail out if not managing to read ref reloc symbol
perf/x86/intel/p4: Trival indentation fix, remove space
...
Introduce rb_find_range() to find start and end position from a backward
ring buffer.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1463987628-163563-5-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
record__mmap_read() writes data from ring buffer into perf.data. 'head'
is maintained by the kernel, points to the last written record.
'old' is maintained by perf, points to the record read in previous
round. record__mmap_read() saves data from 'old' to 'head' to
perf.data.
The names of these variables are not very intutive. In addition,
when dealing with backward writing ring buffer, the md->prev pointer
should point to 'head' instead of the last byte it got.
Add 'start' and 'end' pointer to make code clear and set md->prev to
'head' instead of the moved 'old' pointer. This patch doesn't change
behavior since:
buf = &data[old & md->mask];
size = head - old;
old += size; <--- Here, old == head
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1463987628-163563-4-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When record__mmap_read() requires data more than the size of ring
buffer, drop those data to avoid accessing invalid memory.
This can happen when reading from overwritable ring buffer, which
should be avoided. However, check this for robustness.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1463987628-163563-3-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf_evlist__toggle_{pause,resume}() are introduced to pause/resume
events in an evlist. Utilize PERF_EVENT_IOC_PAUSE_OUTPUT ioctl.
Following commits use them to ensure overwrite ring buffer is paused
before reading.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1463987628-163563-2-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
[ Return -1, like all other ioctl() usage in evlist.c, rename 'pause'
arg to avoid breaking the build on ubuntu 12.04 and other old systems ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Auto-attach the ptr->name beautifier to syscall args "filename", "path"
and "pathname" if they are of type "const char *".
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-jxii4qmcgoppftv0zdvml9d7@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Noticed when the 'setsockopt' 'fd' arg wasn't being formatted via
the SCA_FD beautifier, so just remove the setting of "fd" args to
SCA_FD and do it when reading the syscall info, like we do for
args of type "pid_t", i.e. "fd" as the name should be enough as
the decision to use the SFA_FD beautifier. For odd cases we can
just do it explicitely.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-0qissgetiuqmqyj4b6ancmpn@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a fd field into struct perf_mmap so that perf can track the mmap fd.
This feature will be used for toggling overwrite ring buffers.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1463762315-155689-3-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add 'overwrite' attribute to evsel to mark whether this event is
overwritable. The following commits will support syntax like:
# perf record -e cycles/overwrite/ ...
An overwritable evsel requires kernel support for the
perf_event_attr.write_backward ring buffer feature.
Add it to perf_missing_feature.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1463762315-155689-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
User visible:
- We should not use the current value of the kernel.perf_event_max_stack as the
default value for --max-stack in tools that can process perf.data files, they
will only match if that sysctl wasn't changed from its default value at the
time the perf.data file was recorded, fix it.
This fixes a bug where a 'perf record -a --call-graph dwarf ; perf report'
produces a glibc invalid free backtrace (Arnaldo Carvalho de Melo)
- Provide a better warning when running 'perf trace' on a system where the
kernel.kptr_restrict is set to 1, similar to the one produced by 'perf record',
noticed on ubuntu 16.04 where this is the default kptr_restrict setting.
(Arnaldo Carvalho de Melo)
- Fix ordering of instructions in the annotation code, noticed when annotating
ARM binaries, now that table is auto-ordered at first use, to avoid more such
problems (Chris Ryder)
- Set buildid dir under symfs when --symfs is provided (He Kuang)
- Fix the 'exit_group()' syscall output in 'perf trace' (Arnaldo Carvalho de Melo)
- Only auto set call-graph to "dwarf" in 'perf trace' when syscalls are being
traced (Arnaldo Carvalho de Melo)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJXPyWGAAoJENZQFvNTUqpAb0IQAJnk65WicWUuMppPic3owIb7
KWudqyRtGY+3Rq1HkuWr49IeC0sEAQw5OAP3fiwJuwBap/+7VtIkKFzGr/+3N1lC
lncj8Pmap52SraqKmQbR9GV5fqr6/I1MW28uEFLvI0bBYkGRm55QX8wjGlM7AMUu
m47ok/y07An6OFSJoXV+Sn/gbRJVOayjw/1o9cYy6s/6j2ZV3gN8rUiuHVwtfIhd
OKoxGPQUX1RVmWrzy6aG718JM8pt1tdQ6zd7AgEBbqvevT8TMxTgE8EzSw3TyQXL
+O3CF91gmWm7SE8LxKJ0VhGazMJ9IFj/HcK6XV4BAJAX7IDCBmoRJdrGvbDIG6WC
wHoBkMlpGkq2Y1eBsYQzbmj0ZE80jx8qvi6GETOnTTmaUyCBB3HI+1yUwYZIKVVN
aqZUPfbJBlJU8MMj1SMS/QU30TKtdeCA3QSJEQC6jf8AinLgqIs2a+3FshHzQ1Tq
2l55ySdPTpcUiTK9demBJP5GZTHSr8UfIXXZvuD2pu3zrf/5YdWxOP11HppreffA
vG9C71HJrAbiSlFmjWOF9qra//hN4DD3c7F5ikBePWGk1Hvf0NfjO4hDZUMva4yn
bXsB203+qe/R2G2sT8OfCYV1UnEp+lOMSDJXzLv7YS4rD8Nq/tPKmMXcNV7WIHWO
cLf0Ati8z4DBGXFGaTsI
=mOmQ
-----END PGP SIGNATURE-----
Merge tag 'perf-core-for-mingo-20160520' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
User visible changes:
- We should not use the current value of the kernel.perf_event_max_stack as the
default value for --max-stack in tools that can process perf.data files, they
will only match if that sysctl wasn't changed from its default value at the
time the perf.data file was recorded, fix it.
This fixes a bug where a 'perf record -a --call-graph dwarf ; perf report'
produces a glibc invalid free backtrace (Arnaldo Carvalho de Melo)
- Provide a better warning when running 'perf trace' on a system where the
kernel.kptr_restrict is set to 1, similar to the one produced by 'perf record',
noticed on ubuntu 16.04 where this is the default kptr_restrict setting.
(Arnaldo Carvalho de Melo)
- Fix ordering of instructions in the annotation code, noticed when annotating
ARM binaries, now that table is auto-ordered at first use, to avoid more such
problems (Chris Ryder)
- Set buildid dir under symfs when --symfs is provided (He Kuang)
- Fix the 'exit_group()' syscall output in 'perf trace' (Arnaldo Carvalho de Melo)
- Only auto set call-graph to "dwarf" in 'perf trace' when syscalls are being
traced (Arnaldo Carvalho de Melo)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Highlights:
- Support for Power ISA 3.0 (Power9) Radix Tree MMU from Aneesh Kumar K.V
- Live patching support for ppc64le (also merged via livepatching.git)
Various cleanups & minor fixes from:
- Aaro Koskinen, Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V,
Chris Smart, Daniel Axtens, Frederic Barrat, Gavin Shan, Ian Munsie, Lennart
Sorensen, Madhavan Srinivasan, Mahesh Salgaonkar, Markus Elfring, Michael
Ellerman, Oliver O'Halloran, Paul Gortmaker, Paul Mackerras, Rashmica Gupta,
Russell Currey, Suraj Jitindar Singh, Thiago Jung Bauermann, Valentin
Rothberg, Vipin K Parashar.
General:
- Update LMB associativity index during DLPAR add/remove from Nathan Fontenot
- Fix branching to OOL handlers in relocatable kernel from Hari Bathini
- Add support for userspace Power9 copy/paste from Chris Smart
- Always use STRICT_MM_TYPECHECKS from Michael Ellerman
- Add mask of possible MMU features from Michael Ellerman
PCI:
- Enable pass through of NVLink to guests from Alexey Kardashevskiy
- Cleanups in preparation for powernv PCI hotplug from Gavin Shan
- Don't report error in eeh_pe_reset_and_recover() from Gavin Shan
- Restore initial state in eeh_pe_reset_and_recover() from Gavin Shan
- Revert "powerpc/eeh: Fix crash in eeh_add_device_early() on Cell" from Guilherme G. Piccoli
- Remove the dependency on EEH struct in DDW mechanism from Guilherme G. Piccoli
selftests:
- Test cp_abort during context switch from Chris Smart
- Add several tests for transactional memory support from Rashmica Gupta
perf:
- Add support for sampling interrupt register state from Anju T
- Add support for unwinding perf-stackdump from Chandan Kumar
cxl:
- Configure the PSL for two CAPI ports on POWER8NVL from Philippe Bergheaud
- Allow initialization on timebase sync failures from Frederic Barrat
- Increase timeout for detection of AFU mmio hang from Frederic Barrat
- Handle num_of_processes larger than can fit in the SPA from Ian Munsie
- Ensure PSL interrupt is configured for contexts with no AFU IRQs from Ian Munsie
- Add kernel API to allow a context to operate with relocate disabled from Ian Munsie
- Check periodically the coherent platform function's state from Christophe Lombard
Freescale:
- Updates from Scott: "Contains 86xx fixes, minor device tree fixes, an erratum
workaround, and a kconfig dependency fix."
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJXPsGzAAoJEFHr6jzI4aWAVoAP/iKdrDe0eYHlVAE9SqnbsiZs
lgDxdsC8P3fsmP1G9o/HkKhC82zHl/La8Ztz8dtqa+LkSzbfliWP1ztJsI7GsBFo
tyCKzWnX9Rwvd3meHu/o/SQ29TNLm/PbPyyRqpj5QPbJ8XCXkAXR7ZZZqjvcMsJW
/AgIr7Cgf53tl9oZzzl/c7CnNHhMq+NBdA71vhWtUx+T97wfJEGyKW6HhZyHDbEU
iAki7fu77ZpEqC/Fh9swf0dCGBJ+a132NoMVo0AdV7EQLznUYlQpQEqa+1PyHZOP
/ArOzf2mDg6m3PfCo1eiB07v8PnVZ3llEUbVAJNg3GUxbE4SHrqq/kwm0iElm3p/
DvFxerCwdX9vmskJX4wDs+pSZRabXYj9XVMptsgFzA4joWrqqb7mBHqaort88YcY
YSljEt1bHyXmiJ+dBya40qARsWUkCVN7ZgEzdxckq0KI3w7g2tqpqIbO2lClWT6t
B3GpqQ4jp34+d1M14FB91fIGK7tMvOhSInE0Mv9+tPvRsepXqiiU/SwdAtRlr3m2
zs/K+4FYcVjJ3Rmpgc+tI38PbZxHe212I35YN6L1LP+4ZfAtzz0NyKdooTIBtkbO
19pX4WbBjKq8zK+YutrySncBIrbnI6VjW51vtRhgVKZliPFO/6zKagyU6FbxM+E5
udQES+t3F/9gvtxgxtDe
=YvyQ
-----END PGP SIGNATURE-----
Merge tag 'powerpc-4.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
"Highlights:
- Support for Power ISA 3.0 (Power9) Radix Tree MMU from Aneesh Kumar K.V
- Live patching support for ppc64le (also merged via livepatching.git)
Various cleanups & minor fixes from:
- Aaro Koskinen, Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V,
Chris Smart, Daniel Axtens, Frederic Barrat, Gavin Shan, Ian Munsie,
Lennart Sorensen, Madhavan Srinivasan, Mahesh Salgaonkar, Markus Elfring,
Michael Ellerman, Oliver O'Halloran, Paul Gortmaker, Paul Mackerras,
Rashmica Gupta, Russell Currey, Suraj Jitindar Singh, Thiago Jung
Bauermann, Valentin Rothberg, Vipin K Parashar.
General:
- Update LMB associativity index during DLPAR add/remove from Nathan
Fontenot
- Fix branching to OOL handlers in relocatable kernel from Hari Bathini
- Add support for userspace Power9 copy/paste from Chris Smart
- Always use STRICT_MM_TYPECHECKS from Michael Ellerman
- Add mask of possible MMU features from Michael Ellerman
PCI:
- Enable pass through of NVLink to guests from Alexey Kardashevskiy
- Cleanups in preparation for powernv PCI hotplug from Gavin Shan
- Don't report error in eeh_pe_reset_and_recover() from Gavin Shan
- Restore initial state in eeh_pe_reset_and_recover() from Gavin Shan
- Revert "powerpc/eeh: Fix crash in eeh_add_device_early() on Cell"
from Guilherme G Piccoli
- Remove the dependency on EEH struct in DDW mechanism from Guilherme
G Piccoli
selftests:
- Test cp_abort during context switch from Chris Smart
- Add several tests for transactional memory support from Rashmica
Gupta
perf:
- Add support for sampling interrupt register state from Anju T
- Add support for unwinding perf-stackdump from Chandan Kumar
cxl:
- Configure the PSL for two CAPI ports on POWER8NVL from Philippe
Bergheaud
- Allow initialization on timebase sync failures from Frederic Barrat
- Increase timeout for detection of AFU mmio hang from Frederic
Barrat
- Handle num_of_processes larger than can fit in the SPA from Ian
Munsie
- Ensure PSL interrupt is configured for contexts with no AFU IRQs
from Ian Munsie
- Add kernel API to allow a context to operate with relocate disabled
from Ian Munsie
- Check periodically the coherent platform function's state from
Christophe Lombard
Freescale:
- Updates from Scott: "Contains 86xx fixes, minor device tree fixes,
an erratum workaround, and a kconfig dependency fix."
* tag 'powerpc-4.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (192 commits)
powerpc/86xx: Fix PCI interrupt map definition
powerpc/86xx: Move pci1 definition to the include file
powerpc/fsl: Fix build of the dtb embedded kernel images
powerpc/fsl: Fix rcpm compatible string
powerpc/fsl: Remove FSL_SOC dependency from FSL_LBC
powerpc/fsl-pci: Add a workaround for PCI 5 errata
powerpc/fsl: Fix SPI compatible on t208xrdb and t1040rdb
powerpc/powernv/npu: Add PE to PHB's list
powerpc/powernv: Fix insufficient memory allocation
powerpc/iommu: Remove the dependency on EEH struct in DDW mechanism
Revert "powerpc/eeh: Fix crash in eeh_add_device_early() on Cell"
powerpc/eeh: Drop unnecessary label in eeh_pe_change_owner()
powerpc/eeh: Ignore handlers in eeh_pe_reset_and_recover()
powerpc/eeh: Restore initial state in eeh_pe_reset_and_recover()
powerpc/eeh: Don't report error in eeh_pe_reset_and_recover()
Revert "powerpc/powernv: Exclude root bus in pnv_pci_reset_secondary_bus()"
powerpc/powernv/npu: Enable NVLink pass through
powerpc/powernv/npu: Rework TCE Kill handling
powerpc/powernv/npu: Add set/unset window helpers
powerpc/powernv/ioda2: Export debug helper pe_level_printk()
...
This patch moves the reference of buildid dir to 'symfs/.debug' and
skips the local buildid dir when '--symfs' is given, so that every
single file opened by perf is relative to symfs directory now.
Signed-off-by: He Kuang <hekuang@huawei.com>
Acked-by: David Ahern <dsahern@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1463658462-85131-2-git-send-email-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When --min-stack or --max-stack is passwd but --no-syscalls is also in
effect, there is no point in automatically setting '--call-graph dwarf'.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-pq922i7h9wef0pho1dqpttvn@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently the list of instructions recognised by perf annotate has to be
explicitly written in sorted order. This makes it easy to make mistakes
when adding new instructions. Sort the list of instructions on first
access.
Signed-off-by: Chris Ryder <chris.ryder@arm.com>
Acked-by: Pawel Moll <pawel.moll@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/4268febaf32f47f322c166fb2fe98cfec7041e11.1463676839.git.chris.ryder@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The ARM blt and bls instructions are not correctly identified when
parsing assembly because the list of recognised instructions must be
sorted by name. Swap the ordering of blt and bls.
Signed-off-by: Chris Ryder <chris.ryder@arm.com>
Acked-by: Pawel Moll <pawel.moll@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/560e196b7c79b7ff853caae13d8719a31479cb1a.1463676839.git.chris.ryder@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We cannot limit processing stacks from the current value of the sysctl,
as we may be processing perf.data files, possibly from other machines.
Instead use the old PERF_MAX_STACK_DEPTH, the sysctl default, that can
be overriden using --max-stack or equivalent.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Fixes: 4cb93446c5 ("perf tools: Set the maximum allowed stack from /proc/sys/kernel/perf_event_max_stack")
Link: http://lkml.kernel.org/n/tip-eqeutsr7n7wy0c36z24ytvii@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As thread__resolve_callchain_sample can be used for handling perf.data
files, that could've been recorded with a large max_stack sysctl setting
than what the system used for analysis has set.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Link: http://lkml.kernel.org/n/tip-2995bt2g5yq2m05vga4kip6m@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This doesn't return, so there is no raw_syscalls:sys_exit for it, add
the ending ')', without any return value, since it is void.
Reported-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-vh2mii0g4qlveuc4joufbipu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Its now there, no need to have it too.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-y18oeou494uy11im7u9to0dx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Hook into the libtraceevent plugin kernel symbol resolver to warn the
user that that can't happen with kptr_restrict=1.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-9gc412xx1gl0lvqj1d1xwlyb@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This means the user can't access /proc/kallsyms, for instance, because
/proc/sys/kernel/kptr_restrict is set to 1.
Instead leave the ref_reloc_sym as NULL and code using it will cope.
This allows 'perf trace' to work on such systems for !root, the only
issue would be when trying to resolve kernel symbols, which happens,
for instance, in some libtracevent plugins. A warning for that case
will be provided in the next patch in this series.
Noticed in Ubuntu 16.04, that comes with kptr_restrict=1.
Reported-by: Milian Wolff <milian.wolff@kdab.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-knpu3z4iyp2dxpdfm798fac4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
User visible:
- Honour the kernel.perf_event_max_stack knob more precisely by not counting
PERF_CONTEXT_{KERNEL,USER} when deciding when to stop adding entries to
the perf_sample->ip_callchain[] array (Arnaldo Carvalho de Melo)
- Fix identation of 'stalled-backend-cycles' in 'perf stat' (Namhyung Kim)
- Update runtime using 'cpu-clock' event in 'perf stat' (Namhyung Kim)
- Use 'cpu-clock' for cpu targets in 'perf stat' (Namhyung Kim)
- Avoid fractional digits for integer scales in 'perf stat' (Andi Kleen)
- Store vdso buildid unconditionally, as it appears in callchains and
we're not checking those when creating the build-id table, so we
end up not being able to resolve VDSO symbols when doing analysis
on a different machine than the one where recording was done, possibly
of a different arch even (arm -> x86_64) (He Kuang)
Infrastructure:
- Generalize max_stack sysctl handler, will be used for configuring
multiple kernel knobs related to callchains (Arnaldo Carvalho de Melo)
Cleanups:
- Introduce DSO__NAME_KALLSYMS and DSO__NAME_KCORE, to stop using
open coded strings (Masami Hiramatsu)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJXOn7eAAoJENZQFvNTUqpAsOAP/3f/XJekPQAnMcKRBp2noCuj
nRu1kBltVJyP8iOU5PKSJwel4F9ykNNMl+/rzzxHDo13IM8uc+HnZOJZ6e9mJIJ1
xqjdqM4EDlYYoFApJzCjTK6CMlevCazosdQT1bbmMDYVPc2uQR/GnutFrzqf/Plg
hEougIGtfrdy85g95CRdxpy2yMwDK4EwsiDRm9ib1hnuamQZl97buWemBVqSJmLY
p82E2aMU5Fv5+B8AO4I7V88ZmgpmryjxpM+LjffgNUDSKsSHrlG4NiQ3znV1bgst
Rc++w78+qxoIozOu6/IX8eSI2L/1eyM/yQ6Qre0KuvYXCl+NopTAYSSJlaA4tyHF
c55z7HucuyATN3PrFRHlbWUT/RMIVC0j0lnZOc7SJLl90hJQ+nv0iZcbYwMbeHu1
3LGlcd9jDwQYiClbaT9ATxZJ8B9An0/k/HJdatbAHN0wRomP2Ozz/qD2nmEbUwpV
sCyLOo/LJkvVkuUjSg6ZiOArNIk4iTSPSAUV+SAL6YOEOZMAX5ISUJQ174+zFC9a
gqtVsCXvwLIsndXb8ys1r9/fit/MUci0OzKX3SG1K765+E4Bk23KcAgMNbM/a7lp
ZmHDXMC+yBYcnYNnaxkp7c55CWUlKGOeR4e+KmB99KoeIleYgPhD2UM5beo61TmN
yUEPtiiFiZmTRkiAu83R
=7OdF
-----END PGP SIGNATURE-----
Merge tag 'perf-core-for-mingo-20160516' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
User visible changes:
- Honour the kernel.perf_event_max_stack knob more precisely by not counting
PERF_CONTEXT_{KERNEL,USER} when deciding when to stop adding entries to
the perf_sample->ip_callchain[] array (Arnaldo Carvalho de Melo)
- Fix identation of 'stalled-backend-cycles' in 'perf stat' (Namhyung Kim)
- Update runtime using 'cpu-clock' event in 'perf stat' (Namhyung Kim)
- Use 'cpu-clock' for cpu targets in 'perf stat' (Namhyung Kim)
- Avoid fractional digits for integer scales in 'perf stat' (Andi Kleen)
- Store vdso buildid unconditionally, as it appears in callchains and
we're not checking those when creating the build-id table, so we
end up not being able to resolve VDSO symbols when doing analysis
on a different machine than the one where recording was done, possibly
of a different arch even (arm -> x86_64) (He Kuang)
Infrastructure changes:
- Generalize max_stack sysctl handler, will be used for configuring
multiple kernel knobs related to callchains (Arnaldo Carvalho de Melo)
Cleanups:
- Introduce DSO__NAME_KALLSYMS and DSO__NAME_KCORE, to stop using
open coded strings (Masami Hiramatsu)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The perf_sample->ip_callchain->nr value includes all the entries in the
ip_callchain->ip[] array, real addresses and PERF_CONTEXT_{KERNEL,USER,etc},
while what the user expects is that what is in the kernel.perf_event_max_stack
sysctl or in the upcoming per event perf_event_attr.sample_max_stack knob be
honoured in terms of IP addresses in the stack trace.
So match the kernel support and validate chain->nr taking into account
both kernel.perf_event_max_stack and kernel.perf_event_max_contexts_per_stack.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Link: http://lkml.kernel.org/n/tip-mgx0jpzfdq4uq4abfa40byu0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Instead of using a raw string, use DSO__NAME_KALLSYMS and
DSO__NAME_KCORE macros for kallsyms and kcore.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160515031935.4017.50971.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently only the task-clock event updates the runtime_nsec so it
cannot show the metric when using cpu-clock events. However cpu clock
works basically same as task-clock, so no need to not update the runtime
IMHO.
Before:
# perf stat -a -e cpu-clock,context-switches,page-faults,cycles sleep 0.1
Performance counter stats for 'system wide':
1217.759506 cpu-clock (msec)
93 context-switches
61 page-faults
18,958,022 cycles
0.101393794 seconds time elapsed
After:
Performance counter stats for 'system wide':
1220.471884 cpu-clock (msec) # 12.013 CPUs utilized
118 context-switches # 0.097 K/sec
59 page-faults # 0.048 K/sec
17,941,247 cycles # 0.015 GHz
0.101594777 seconds time elapsed
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1463119263-5569-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When unwinding callchains on a different machine, vdso info should be
available so the unwind process won't be interrupted if address falls
into vdso region. But in most cases, the addresses of sample events are
not in vdso range, the buildid of a zero hit vdso won't be stored into
perf.data.
This patch stores vdso buildid regardless of whether the vdso is hit or
not.
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1463042596-61703-3-git-send-email-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When the scaling factor is a full integer don't display fractional
digits. This avoids unnecessary .00 output for topdown metrics with
scale factors.
v2: Remove redundant check.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1462489447-31832-7-git-send-email-andi@firstfloor.org
[ Rename 'round' to 'stat_round' as 'round' is defined in math.h,
included by this patch, and this breaks the build on ubuntu 12.04 ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull perf updates from Ingo Molnar:
"Bigger kernel side changes:
- Add backwards writing capability to the perf ring-buffer code,
which is preparation for future advanced features like robust
'overwrite support' and snapshot mode. (Wang Nan)
- Add pause and resume ioctls for the perf ringbuffer (Wang Nan)
- x86 Intel cstate code cleanups and reorgnization (Thomas Gleixner)
- x86 Intel uncore and CPU PMU driver updates (Kan Liang, Peter
Zijlstra)
- x86 AUX (Intel PT) related enhancements and updates (Alexander
Shishkin)
- x86 MSR PMU driver enhancements and updates (Huang Rui)
- ... and lots of other changes spread out over 40+ commits.
Biggest tooling side changes:
- 'perf trace' features and enhancements. (Arnaldo Carvalho de Melo)
- BPF tooling updates (Wang Nan)
- 'perf sched' updates (Jiri Olsa)
- 'perf probe' updates (Masami Hiramatsu)
- ... plus 200+ other enhancements, fixes and cleanups to tools/
The merge commits, the shortlog and the changelogs contain a lot more
details"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (249 commits)
perf/core: Disable the event on a truncated AUX record
perf/x86/intel/pt: Generate PMI in the STOP region as well
perf buildid-cache: Use lsdir() for looking up buildid caches
perf symbols: Use lsdir() for the search in kcore cache directory
perf tools: Use SBUILD_ID_SIZE where applicable
perf tools: Fix lsdir to set errno correctly
perf trace: Move seccomp args beautifiers to tools/perf/trace/beauty/
perf trace: Move flock op beautifier to tools/perf/trace/beauty/
perf build: Add build-test for debug-frame on arm/arm64
perf build: Add build-test for libunwind cross-platforms support
perf script: Fix export of callchains with recursion in db-export
perf script: Fix callchain addresses in db-export
perf script: Fix symbol insertion behavior in db-export
perf symbols: Add dso__insert_symbol function
perf scripting python: Use Py_FatalError instead of die()
perf tools: Remove xrealloc and ALLOC_GROW
perf help: Do not use ALLOC_GROW in add_cmd_list
perf pmu: Make pmu_formats_string to check return value of strbuf
perf header: Make topology checkers to check return value of strbuf
perf tools: Make alias handler to check return value of strbuf
...
After 0161028b7c ("perf/core: Change the default paranoia level to 2")
'perf stat' fails for users without CAP_SYS_ADMIN, so just use
'perf_evsel__fallback()' to have the same behaviour as 'perf record',
i.e. set perf_event_attr.exclude_kernel to 1.
Now:
[acme@jouet linux]$ perf stat usleep 1
Performance counter stats for 'usleep 1':
0.352536 task-clock:u (msec) # 0.423 CPUs utilized
0 context-switches:u # 0.000 K/sec
0 cpu-migrations:u # 0.000 K/sec
49 page-faults:u # 0.139 M/sec
309,407 cycles:u # 0.878 GHz
243,791 instructions:u # 0.79 insn per cycle
49,622 branches:u # 140.757 M/sec
3,884 branch-misses:u # 7.83% of all branches
0.000834174 seconds time elapsed
[acme@jouet linux]$
Reported-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-b20jmx4dxt5hpaa9t2rroi0o@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now with the default for the kernel.perf_event_paranoid sysctl being 2 [1]
we need to fall back to :u, i.e. to set perf_event_attr.exclude_kernel
to 1.
Before:
[acme@jouet linux]$ perf record usleep 1
Error:
You may not have permission to collect stats.
Consider tweaking /proc/sys/kernel/perf_event_paranoid,
which controls use of the performance events system by
unprivileged users (without CAP_SYS_ADMIN).
The current value is 2:
-1: Allow use of (almost) all events by all users
>= 0: Disallow raw tracepoint access by users without CAP_IOC_LOCK
>= 1: Disallow CPU event access by users without CAP_SYS_ADMIN
>= 2: Disallow kernel profiling by users without CAP_SYS_ADMIN
[acme@jouet linux]$
After:
[acme@jouet linux]$ perf record usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.016 MB perf.data (7 samples) ]
[acme@jouet linux]$ perf evlist
cycles:u
[acme@jouet linux]$ perf evlist -v
cycles:u: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, exclude_kernel: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
[acme@jouet linux]$
And if the user turns on verbose mode, an explanation will appear:
[acme@jouet linux]$ perf record -v usleep 1
Warning:
kernel.perf_event_paranoid=2, trying to fall back to excluding kernel samples
mmap size 528384B
[ perf record: Woken up 1 times to write data ]
Looking at the vmlinux_path (8 entries long)
Using /lib/modules/4.6.0-rc7+/build/vmlinux for symbols
[ perf record: Captured and wrote 0.016 MB perf.data (7 samples) ]
[acme@jouet linux]$
[1] 0161028b7c ("perf/core: Change the default paranoia level to 2")
Reported-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-b20jmx4dxt5hpaa9t2rroi0o@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We were showing a hardcoded default value for the kernel.perf_event_paranoid
sysctl, now that it became more paranoid (1 -> 2 [1]), this would need to be
updated, instead show the current value:
[acme@jouet linux]$ perf record ls
Error:
You may not have permission to collect stats.
Consider tweaking /proc/sys/kernel/perf_event_paranoid,
which controls use of the performance events system by
unprivileged users (without CAP_SYS_ADMIN).
The current value is 2:
-1: Allow use of (almost) all events by all users
>= 0: Disallow raw tracepoint access by users without CAP_IOC_LOCK
>= 1: Disallow CPU event access by users without CAP_SYS_ADMIN
>= 2: Disallow kernel profiling by users without CAP_SYS_ADMIN
[acme@jouet linux]$
[1] 0161028b7c ("perf/core: Change the default paranoia level to 2")
Reported-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-0gc4rdpg8d025r5not8s8028@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If not, tell the user that:
config/Makefile:273: Old libdw.h, finding variables at given 'perf probe' point will not work, install elfutils-devel/libdw-dev >= 0.157
And return -ENOTSUPP in die_get_var_range(), failing features that
need it, like the one pointed out above.
This fixes the build on older systems, such as Ubuntu 12.04.5.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Vinson Lee <vlee@freedesktop.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-9l7luqkq4gfnx7vrklkq4obs@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To fix the build on Fedora Rawhide (gcc 6.0.0 20160311 (Red Hat 6.0.0-0.17):
CC /tmp/build/perf/arch/x86/util/dwarf-regs.o
arch/x86/util/dwarf-regs.c:66:36: error: 'x86_32_regoffset_table' defined but not used [-Werror=unused-const-variable=]
static const struct pt_regs_offset x86_32_regoffset_table[] = {
^~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-fghuksc1u8ln82bof4lwcj0o@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The readdir() function is thread safe as long as just one thread uses a
DIR, which is the case when parsing tracepoint event definitions, to
avoid breaking the build with glibc-2.23.90 (upcoming 2.24), use it
instead of readdir_r().
See: http://man7.org/linux/man-pages/man3/readdir.3.html
"However, in modern implementations (including the glibc implementation),
concurrent calls to readdir() that specify different directory streams
are thread-safe. In cases where multiple threads must read from the
same directory stream, using readdir() with external synchronization is
still preferable to the use of the deprecated readdir_r(3) function."
Noticed while building on a Fedora Rawhide docker container.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-wddn49r6bz6wq4ee3dxbl7lo@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The readdir() function is thread safe as long as just one thread uses a
DIR, which is the case in thread_map, so, to avoid breaking the build
with glibc-2.23.90 (upcoming 2.24), use it instead of readdir_r().
See: http://man7.org/linux/man-pages/man3/readdir.3.html
"However, in modern implementations (including the glibc implementation),
concurrent calls to readdir() that specify different directory streams
are thread-safe. In cases where multiple threads must read from the
same directory stream, using readdir() with external synchronization is
still preferable to the use of the deprecated readdir_r(3) function."
Noticed while building on a Fedora Rawhide docker container.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-del8h2a0f40z75j4r42l96l0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The readdir() function is thread safe as long as just one thread uses a
DIR, which is the case in 'perf script', so, to avoid breaking the build
with glibc-2.23.90 (upcoming 2.24), use it instead of readdir_r().
See: http://man7.org/linux/man-pages/man3/readdir.3.html
"However, in modern implementations (including the glibc implementation),
concurrent calls to readdir() that specify different directory streams
are thread-safe. In cases where multiple threads must read from the
same directory stream, using readdir() with external synchronization is
still preferable to the use of the deprecated readdir_r(3) function."
Noticed while building on a Fedora Rawhide docker container.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-mt3xz7n2hl49ni2vx7kuq74g@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The readdir() function is thread safe as long as just one thread uses a
DIR, which is the case when synthesizing events for pre-existing threads
by traversing /proc, so, to avoid breaking the build with glibc-2.23.90
(upcoming 2.24), use it instead of readdir_r().
See: http://man7.org/linux/man-pages/man3/readdir.3.html
"However, in modern implementations (including the glibc implementation),
concurrent calls to readdir() that specify different directory streams
are thread-safe. In cases where multiple threads must read from the
same directory stream, using readdir() with external synchronization is
still preferable to the use of the deprecated readdir_r(3) function."
Noticed while building on a Fedora Rawhide docker container.
CC /tmp/build/perf/util/event.o
util/event.c: In function '__event__synthesize_thread':
util/event.c:466:2: error: 'readdir_r' is deprecated [-Werror=deprecated-declarations]
while (!readdir_r(tasks, &dirent, &next) && next) {
^~~~~
In file included from /usr/include/features.h:368:0,
from /usr/include/stdint.h:25,
from /usr/lib/gcc/x86_64-redhat-linux/6.0.0/include/stdint.h:9,
from /git/linux/tools/include/linux/types.h:6,
from util/event.c:1:
/usr/include/dirent.h:189:12: note: declared here
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-i1vj7nyjp2p750rirxgrfd3c@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Use new lsdir() for looking up buildid caches. This changes logic a bit
to ignore all dot files, since the build-id cache must not start with
dot.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160511135217.23943.94596.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Use lsdir() to search in kcore cache directory. This also avoids
checking hidden dot directory entries, because kcore cache directories
must always have the name from timestamps when taking the kcore
snapshots, and it never start with dot.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160511135208.23943.68071.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Use the existing SBUILD_ID_SIZE macro instead of the equivalent
BUILD_ID_SIZE * 2 + 1 expression for allocating a buffer for build-id
strings.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160511135159.23943.57120.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix lsdir() to set correct positive error number (ENOMEM). Since
"errno" must have a positive error number instead of negative number,
fix lsdir to set it correctly.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: e1ce726e1d ("perf tools: Add lsdir() helper to read a directory")
Link: http://lkml.kernel.org/r/20160511135127.23943.40644.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To reduce the size of builtin-trace.c.
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/n/tip-ovxifncj34ynrjjseg33lil3@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To reduce the size of builtin-trace.c.
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/n/tip-c4c47w2a2jx13terl2p2hros@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When an IP with an unresolved symbol occurs in the callchain more than
once (ie. recursion), then duplicate symbols can be created because
the callchain nodes are never updated after they are first created.
To fix this issue we call dso__find_symbol whenever we encounter a NULL
symbol, in case we already added a symbol at that IP since we started
traversing the callchain.
This change prevents duplicate symbols from being exported when duplicate
IPs are present in the callchain.
Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1462937209-6032-5-git-send-email-cphlipot0@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Remove the call to map_ip() to adjust al.addr, because it has already
been called when assembling the callchain, in:
thread__resolve_callchain_sample(perf_sample)
add_callchain_ip(ip = perf_sample->callchain->ips[j])
thread__find_addr_location(addr = ip)
thread__find_addr_map(addr) {
al->addr = addr
if (al->map)
al->addr = al->map->map_ip(al->map, al->addr);
}
Calling it a second time can result in incorrect addresses being used.
This can have effects such as duplicate symbols being created and
exported.
Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1462937209-6032-4-git-send-email-cphlipot0@gmail.com
[ Show the callchain where it is done, to help reviewing this change down the line ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Use the dso__insert_symbol function instead of symbols__insert() in
order to properly update the dso symbol cache.
If the cache is not updated, then duplicate symbols can be
unintentionally created, inserted, and exported.
This change prevents duplicate symbols from being exported due to
dso__find_symbol() using a stale symbol cache.
Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1462937209-6032-3-git-send-email-cphlipot0@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The current method for inserting symbols is to use the symbols__insert()
function. However symbols__insert() does not update the dso symbol
cache. This causes problems in the following scenario:
1. symbol not found at addr using dso__find_symbol
2. symbol inserted at addr using the existing symbols__insert function
3. symbol still not found at addr using dso__find_symbol() because cache isn't
updated. This is undesired behavior.
The undesired behavior in (3) is addressed by creating a new function,
dso__insert_symbol() to both insert the symbol and update the symbol
cache if necessary.
If dso__insert_symbol() is used in (2) instead of symbols__insert(),
then the undesired behavior in (3) is avoided.
Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1462937209-6032-2-git-send-email-cphlipot0@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It probably is equivalent, but that seems to be the "pythonic" way of
dieing? Anyway, one less die() in the tools/perf codebase.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Chris Phlipot <cphlipot0@gmail.com>
Link: http://lkml.kernel.org/n/tip-nlzgepdv2818zs4e7faif9tu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
User visible changes:
- Recording 'dwarf' callchains do not need DWARF unwinding support (He Kuang)
- Print recently added perf_event_attr.write_backward bit flag in -vv
verbose mode (Arnaldo Carvalho de Melo)
- Fix incorrect python db-export error message in 'perf script' (Chris Phlipot)
- Fix handling of zero-length symbols (Chris Phlipot)
- perf stat: Scale values by unit before metrics (Andi Kleen)
Infrastructure changes:
- Rewrite strbuf not to die(), making tools using it to check its
return value instead (Masami Hiramatsu)
- Support reading from backward ring buffer, add a 'perf test' entry
for it (Wang Nan)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
On some architectures (powerpc in particular), the number of registers
exceeds what can be represented in an integer bitmask. Ensure we
generate the proper bitmask on such platforms.
Fixes: 71ad0f5e4 ("perf tools: Support for DWARF CFI unwinding on post processing")
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Adds support for unwinding user stack dump by linking with libunwind.
Signed-off-by: Chandan Kumar <chandan.kumar@linux.vnet.ibm.com>
Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Replace ALLOC_GROW with normal realloc code in add_cmd_list() so that it
can handle errors directly.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160510054752.6158.30562.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Make pmu_formats_string() to check return value of strbuf APIs so that
it can detect errors in it.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160510054744.6158.37810.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Make topology checkers to check the return value of strbuf APIs so that
it can detect errors in it.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160510054735.6158.98650.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Make alias handler and sq_quote_argv to check the return value of strbuf
APIs.
In sq_quote_argv() calls die(), but this fix handles strbuf failure as a
special case and returns to caller, since the caller - handle_alias()
also has to check the return value of other strbuf APIs and those checks
can be merged to one if() statement.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160510054725.6158.84597.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Make check_emacsclient_version() to check the return value of strbuf
APIs so that it can handle errors in strbuf.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160510054716.6158.11755.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Check the return value of strbuf APIs in perf-probe
related code, so that it can handle errors in strbuf.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160510054707.6158.69861.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rewrite strbuf implementation not to use die() nor xrealloc(). Instead
of die(), now most of the API returns error code or 0 if succeeded.
Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160510054658.6158.24080.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This change introduces a fix to symbols__find, so that it is able to
find symbols of length zero (where start == end).
The current code has the following problem:
- The current implementation of symbols__find is unable to find any symbols
of length zero.
- The db-export framework explicitly creates zero length symbols at
locations where no symbol currently exists.
The combination of the two above behaviors results in behavior similar
to the example below.
1. addr_location is created for a sample, but symbol is unable to be
resolved.
2. db export creates an "unknown" symbol of length zero at that address
and inserts it into the dso.
3. A new sample comes in at the same address, but symbol__find is unable
to find the zero length symbol, so it is still unresolved.
4. db export sees the symbol is unresolved, and allocated a duplicate
symbol, even though it already did this in step 2.
This behavior continues every time an address without symbol information
is seen, which causes a very large number of these symbols to be
allocated.
The effect of this fix can be observed by looking at the contents of an
exported database before/after the fix (generated with
scripts/python/export-to-postgresql.py)
Ex.
BEFORE THE CHANGE:
example_db=# select count(*) from symbols;
count
--------
900213
(1 row)
example_db=# select count(*) from symbols where symbols.name='unknown';
count
--------
897355
(1 row)
example_db=# select count(*) from symbols where symbols.name!='unknown';
count
-------
2858
(1 row)
AFTER THE CHANGE:
example_db=# select count(*) from symbols;
count
-------
25217
(1 row)
example_db=# select count(*) from symbols where name='unknown';
count
-------
22359
(1 row)
example_db=# select count(*) from symbols where name!='unknown';
count
-------
2858
(1 row)
Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1462612620-25008-1-git-send-email-cphlipot0@gmail.com
[ Moved the test to later in the rb_tree tests, as this not the likely case ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now we can see if it is set when using verbose mode in various tools,
such as 'perf test':
# perf test -vv back
45: Test backward reading from ring buffer :
--- start ---
<SNIP>
------------------------------------------------------------
perf_event_attr:
type 2
size 112
config 0x98
{ sample_period, sample_freq } 1
sample_type IP|TID|TIME|CPU|PERIOD|RAW
disabled 1
mmap 1
comm 1
task 1
sample_id_all 1
exclude_guest 1
mmap2 1
comm_exec 1
write_backward 1
------------------------------------------------------------
sys_perf_event_open: pid 20911 cpu -1 group_fd -1 flags 0x8
<SNIP>
---- end ----
Test backward reading from ring buffer: Ok
#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-kxv05kv9qwl5of7rzfeiiwbv@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This test checks reading from backward ring buffer.
Test result:
# ~/perf test 'ring buffer'
45: Test backward reading from ring buffer : Ok
The test case is a while loop which calls prctl(PR_SET_NAME) multiple
times. Each prctl should issue 2 events: one PERF_RECORD_SAMPLE, one
PERF_RECORD_COMM.
The first round creates a relative large ring buffer (256 pages). It can
afford all events. Read from it and check the count of each type of
events.
The second round creates a small ring buffer (1 page) and makes it
overwritable. Check the correctness of the buffer.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1462758471-89706-3-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf_evlist__mmap_read_backward() is introduced for reading backward
ring buffer. Since direction for reading such ring buffer is different
from the direction kernel writing to it, and since user need to fetch
most recent record from it, a perf_evlist__mmap_read_catchup() is
introduced to move the reading pointer to the end of the buffer.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1462758471-89706-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix the error message printed when attempting and failing to create the
call path root incorrectly references the call return process.
This change fixes the message to properly reference the failure to
create the call path root.
Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1462612620-25008-2-git-send-email-cphlipot0@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Scale values by unit before passing them to the metrics printing
functions. This is needed for TopDown, because it needs to scale the
slots correctly by pipeline width / SMTness.
For existing metrics it shouldn't make any difference, as those
generally use events that don't have any units.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1462489447-31832-8-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There is no need to check for DWARF unwinding support when using the
'dwarf' callchain record method, as this will only ask the kernel to
collect stack dumps for later DWARF CFI processing, which can be done in
another machine, where the support for DWARF unwinding need to be
present.
Signed-off-by: He Kuang <hekuang@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ekaterina Tumanova <tumanova@linux.vnet.ibm.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1462525154-125656-2-git-send-email-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To reduce the size of builtin-trace.c.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-vb8dpy7bptkf219q5c25ulfp@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To reduce the size of builtin-trace.c.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-jt293541hv9od7gqw6lilioh@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To reduce the size of builtin-trace.c.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-qecqxwwtreio6eaatfv58yq5@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add debug output of raw counter values per CPU when perf stat -v is
specified, together with their cpu numbers. This is very useful to
debug problems with per core counters, where we can normally only see
aggregated values.
v2: Make it depend on -vv, not -v
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1461787251-6702-12-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update the export-to-postgresql.py to support the newly introduced
callchain export.
callchains are added into the existing call_paths table and can now
be associated with samples when the "callpaths" commandline option
is used with the script.
Ex.:
$ perf script -s export-to-postgresql.py example_db all callchains
Includes the following changes to enable callchain export via the python export
APIs:
- Add the "callchains" commandline option, which is used to enable
callchain export by setting the perf_db_export_callchains global
- Add perf_db_export_callchains checks for call_path table creation
and population.
- Add call_path_id to samples_table to conform with the new API
example usage and output using a small test app:
test_app.c:
volatile int x = 0;
void inc_x_loop()
{
int i;
for(i=0; i<100000000; i++)
x++;
}
void a()
{
inc_x_loop();
}
void b()
{
inc_x_loop();
}
int main()
{
a();
b();
return 0;
}
example usage:
$ gcc -g -O0 test_app.c
$ perf record --call-graph=dwarf ./a.out
[ perf record: Woken up 77 times to write data ]
[ perf record: Captured and wrote 19.373 MB perf.data (2404 samples) ]
$ perf script -s scripts/python/export-to-postgresql.py
example_db all callchains
$ psql example_db
example_db=#
SELECT
(SELECT name FROM symbols WHERE id = cps.symbol_id) as symbol,
(SELECT name FROM symbols WHERE id =
(SELECT symbol_id from call_paths where id = cps.parent_id))
as parent_symbol,
sum(period) as event_count
FROM samples join call_paths as cps on call_path_id = cps.id
GROUP BY cps.id,evsel_id
ORDER BY event_count DESC
LIMIT 5;
symbol | parent_symbol | event_count
------------------+--------------------------+-------------
inc_x_loop | a | 734250982
inc_x_loop | b | 731028057
unknown | unknown | 1335858
task_tick_fair | scheduler_tick | 1238842
update_wall_time | tick_do_update_jiffies64 | 650373
(5 rows)
The above data shows total "self time" in cycles for each call path that was
sampled. It is intended to demonstrate how it accounts separately for the two
ways to reach the "inc_x_loop" function(via "a" and "b"). Recursive common
table expressions can be used as well to get cumulative time spent in a
function as well, but that is beyond the scope of this basic example.
Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1461831551-12213-7-git-send-email-cphlipot0@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This change allows python scripts to be able to utilize the recent
changes to the db export api allowing the export of call_paths derived
from sampled callchains. These call paths are also now associated with
the samples from which they were derived.
- This feature is enabled by setting "perf_db_export_callchains" to true
- When enabled, samples that have callchain information will have the
callchains exported via call_path_table
- The call_path_id field is added to sample_table to enable association of
samples with the corresponding callchain stored in the call paths
table. A call_path_id of 0 will be exported if there is no
corresponding callchain.
- When "perf_db_export_callchains" and "perf_db_export_calls" are both
set to True, the call path root data structure will be shared. This
prevents duplicating of data and call path ids that would result from
building two separate call path trees in memory.
- The call_return_processor structure definition was relocated to the header
file to make its contents visible to db-export.c. This enables the
sharing of call path trees between the two features, as mentioned
above.
This change is visible to python scripts using the python db export api.
The change is backwards compatible with scripts written against the
previous API, assuming that the scripts model the sample_table function
after the one in export-to-postgresql.py script by allowing for
additional arguments to be added in the future. ie. using *x as the
final argument of the sample_table function.
Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1461831551-12213-6-git-send-email-cphlipot0@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The exported sample now contains a reference to the call_path_id that
represents its callchain.
While callchains themselves are nice to have, being able to associate
them with samples makes them much more useful, and can allow for such
things as determining how much cumulative time is spent in a particular
function. This information is normally possible to get from the call
return processor. However, when doing normal sampling, call/return
information is not available, thus necessitating the need for
associating samples directly with call paths.
This commit include changes to db-export layer to make this information
available for subsequent patches in this change set, but by itself, does
not make any changes visible to the user.
Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1461831551-12213-5-git-send-email-cphlipot0@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This change enables the db export api to export callchains. This is
accomplished by adding callchains obtained from samples to the
call_path_root structure and exporting them via the current call path
export API.
While the current API does support exporting call paths, this is not
supported when sampling. This commit addresses that missing feature by
allowing the export of call paths when callchains are present in
samples.
Summary:
- This feature is activated by initializing the call_path_root member
inside the db_export structure to a non-null value.
- Callchains are resolved with thread__resolve_callchain() and then stored
and exported by adding a call path under call path root.
- Symbol and DSO for each callchain node are exported via db_ids_from_al()
This commit puts in place infrastructure to be used by subsequent commits,
and by itself, does not introduce any user-visible changes.
Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1461831551-12213-4-git-send-email-cphlipot0@gmail.com
[ Made adjustments suggested by Adrian Hunter, see thread via this cset's Link: tag ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move the call path handling code out of thread-stack.c and
thread-stack.h to allow other components that are not part of
thread-stack to create call paths.
Summary:
- Create call-path.c and call-path.h and add them to the build.
- Move all call path related code out of thread-stack.c and thread-stack.h
and into call-path.c and call-path.h.
- A small subset of structures and functions are now visible through
call-path.h, which is required for thread-stack.c to continue to
compile.
This change is a prerequisite for subsequent patches in this change set
and by itself contains no user-visible changes.
Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1461831551-12213-3-git-send-email-cphlipot0@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The existing implementation of thread__resolve_callchain, under certain
circumstances, can assemble callchain entries in the incorrect order.
The callchain entries are resolved incorrectly for a sample when all of
the following conditions are met:
1. callchain_param.order is set to ORDER_CALLER
2. thread__resolve_callchain_sample is able to resolve callchain entries
for the sample.
3. unwind__get_entries is also able to resolve callchain entries for the
sample.
The fix is accomplished by reversing the order in which
thread__resolve_callchain_sample and unwind__get_entries are called when
callchain_param.order is set to ORDER_CALLER.
Unwind specific code from thread__resolve_callchain is also moved into a
new static function to improve readability of the fix.
How to Reproduce the Existing Bug:
Modifying perf script to print call trees in the opposite order or
applying the remaining patches from this series and comparing the
results output from export-to-postgtresql.py are the easiest ways to see
the bug, however it can still be seen in current builds using perf
report.
Here is how i can reproduce the bug using perf report:
# perf record --call-graph=dwarf stress -c 1 -t 5
when i run this command:
# perf report --call-graph=flat,0,0,callee
This callchain, containing kernel (handle_irq_event, etc) and userspace
samples (__libc_start_main, etc) is contained in the output, which looks
correct (callee order):
gen8_irq_handler
handle_irq_event_percpu
handle_irq_event
handle_edge_irq
handle_irq
do_IRQ
ret_from_intr
__random
rand
0x558f2a04dded
0x558f2a04c774
__libc_start_main
0x558f2a04dcd9
Now run this command using caller order:
# perf report --call-graph=flat,0,0,caller
It is expected to see the exact reverse of the above when using caller
order (with "0x558f2a04dcd9" at the top and "gen8_irq_handler" at the
bottom) in the output, but it is nowhere to be found.
instead you see this:
ret_from_intr
do_IRQ
handle_irq
handle_edge_irq
handle_irq_event
handle_irq_event_percpu
gen8_irq_handler
0x558f2a04dcd9
__libc_start_main
0x558f2a04c774
0x558f2a04dded
rand
__random
Notice how internally the kernel symbols are reversed and the user space
symbols are reversed, but the kernel symbols still appear above the user
space symbols.
if this patch is applied and perf script is re-run, you will see the
expected output (with "0x558f2a04dcd9" at the top and "gen8_irq_handler"
at the bottom):
0x558f2a04dcd9
__libc_start_main
0x558f2a04c774
0x558f2a04dded
rand
__random
ret_from_intr
do_IRQ
handle_irq
handle_edge_irq
handle_irq_event
handle_irq_event_percpu
gen8_irq_handler
Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1461831551-12213-2-git-send-email-cphlipot0@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The test to check if the arg format had been read from the
syscall:sys_enter_name/format file was looking at the list of non-commom
fields, and if that is empty, it would think it had failed to read it,
because it doesn't exist, for instance, for the clone() syscall.
So instead before dumping the raw syscall args list check
IS_ERR(sc->tp_format), if that is true, then an attempt was made to read
the format file and failed, in which case dump the raw arg list values.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ls7pmdqb2xy9339vdburwvnk@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In perf_mmap__read(), give better names to pointers. Original name 'old'
and 'head' directly related to pointers in ring buffer control page. For
backward ring buffer, the meaning of 'head' point is not 'the first byte
of free space', but 'the first byte of the last record'. To reduce
confusion, rename 'old' to 'start', 'head' to 'end'. 'start' -> 'end'
is the direction the records should be read from.
Change parameter order.
Change 'overwrite' to 'check_messup'. When reading from 'head', no need
to check messup for for backward ring buffer.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1461723563-67451-3-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Extract event reader from perf_evlist__mmap_read() to perf__mmap_read().
Future commit will feed it with manually computed 'head' and 'old'
pointers.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1461723563-67451-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
ppc64le functions have a Global Entry Point (GEP) and a Local Entry
Point (LEP). While placing a probe, we always prefer the LEP since it
catches function calls through both the GEP and the LEP. In order to do
this, we fixup the function entry points during elf symbol table lookup
to point to the LEPs. This works, but breaks 'perf test kallsyms' since
the symbols loaded from the symbol table (pointing to the LEP) do not
match the symbols in kallsyms.
To fix this, we do not adjust all the symbols during symbol table load.
Instead, we note down st_other in a newly introduced arch-specific
member of perf symbol structure, and later use this to adjust the probe
trace point.
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Cc: Mark Wielaard <mjw@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/6be7c2b17e370100c2f79dd444509df7929bdd3e.1460451721.git.naveen.n.rao@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
So far, we used to treat probe point offsets as being offset from the
LEP. However, userspace applications (objdump/readelf) always show
disassembly and offsets from the function GEP. This is confusing to the
user as we will end up probing at an address different from what the
user expects when looking at the function disassembly with
readelf/objdump. Fix this by changing how we modify probe address with
perf.
If only the function name is provided, we assume the user needs the LEP.
Otherwise, if an offset is specified, we assume that the user knows the
exact address to probe based on function disassembly, and so we just
place the probe from the GEP offset.
Finally, kretprobe was also broken with kallsyms as we were trying to
specify an offset. This patch also fixes that issue.
Reported-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Cc: Mark Wielaard <mjw@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/75df860aad8216bf4b9bcd10c6351ecc0e3dee54.1460451721.git.naveen.n.rao@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now we have sort dimensions private for struct hists,
we need to make dimension booleans hists specific as
well.
Moving sort__has_comm into struct perf_hpp_list.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1462276488-26683-8-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now we have sort dimensions private for struct hists, we need to make
dimension booleans hists specific as well.
Moving sort__has_thread into struct perf_hpp_list.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1462276488-26683-7-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now we have sort dimensions private for struct hists, we need to make
dimension booleans hists specific as well.
Moving sort__has_socket into struct perf_hpp_list.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1462276488-26683-6-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now we have sort dimensions private for struct hists, we need to make
dimension booleans hists specific as well.
Moving sort__has_dso into struct perf_hpp_list.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1462276488-26683-5-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now we have sort dimensions private for struct hists, we need to make
dimension booleans hists specific as well.
Moving sort__has_sym into struct perf_hpp_list.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1462276488-26683-4-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now we have sort dimensions private for struct hists, we need to make
dimension booleans hists specific as well.
Moving sort__has_parent into struct perf_hpp_list.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1462276488-26683-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now we have sort dimensions private for struct hists, we need to make
dimension booleans hists specific as well.
Moving sort__need_collapse into struct perf_hpp_list.
Adding hists__has macro to easily access this info perf struct hists
object.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1462276488-26683-2-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
That field is only updated when we use the "sched:sched_stat_runtime"
tracepoint, and that is only done so far when we use the '--stat' command line
option, without it we get just zeros, confusing the users:
Without this patch:
# trace -a -s sleep 1
<SNIP>
qemu-system-x86 (9931), 468 events, 9.6%, 0.000 msec
syscall calls total min avg max stddev
(msec) (msec) (msec) (msec) (%)
---------- ------ --------- --------- --------- --------- ------
ppoll 98 982.374 0.000 10.024 29.983 12.65%
write 34 0.401 0.005 0.012 0.027 5.49%
ioctl 102 0.347 0.002 0.003 0.007 3.08%
firefox (10871), 1856 events, 38.2%, 0.000 msec
(msec) (msec) (msec) (msec) (%)
---------- ------ --------- --------- --------- --------- ------
poll 395 934.873 0.000 2.367 17.120 11.51%
recvmsg 395 0.988 0.001 0.003 0.021 4.20%
read 106 0.460 0.002 0.004 0.007 3.17%
futex 24 0.108 0.001 0.004 0.010 10.05%
mmap 2 0.041 0.016 0.021 0.026 23.92%
write 6 0.027 0.004 0.004 0.005 2.52%
After this patch that ', 0.000 msecs' gets suppressed when --stat is not
in use.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-p7emqrsw7900tdkg43v9l1e1@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Sometimes we want to sort an existing rbtree by a different key,
introduce a template for that, that needs only to be provided the
rbtree root and the number of entries in it.
To do that a new rbtree will be created with extra space for each entry,
where possibly pre-calculated keys will be stored to be used in the
resort process and also later, when using the newly sorted rbtree.
Please check the following two changesets to see it in use for resorting
stats for threads and its syscalls in 'perf trace --summary'.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-9l6e1q34lmf3wwdeewstyakg@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To be used, for instance, for pre-allocating an rb_tree array for
sorting by other keys besides the current pid one.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ja0ifkwue7ttjhbwijn6g6eu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Using sizeof on a malloced pointer type will return the wordsize which
can often cause one to allocate a buffer much smaller than it is needed.
So, here do not use sizeof on pointer type.
Note that this has no effect on runtime because 'dsos' is a pointer to a
pointer.
Problem found using Coccinelle.
Signed-off-by: Vaishali Thakkar <vaishali.thakkar@oracle.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1461862017-23358-1-git-send-email-vaishali.thakkar@oracle.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To reduce the size of builtin-trace.c.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-11zxg3qitk6bw2x30135k9z4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
With 'perf record --switch-output' without -a, record__synthesize() in
record__switch_output() won't generate tracking events because there's
no thread_map in evlist. Which causes newly created perf.data doesn't
contain map and comm information.
This patch creates a fake thread_map and directly call
perf_event__synthesize_thread_map() for those events.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1461178794-40467-8-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The cost of buildid cache processing is high: reading all events in
output perf.data, opening each elf file to read buildids then copying
them into ~/.debug directory. In switch output mode, these heavy works
block perf from receiving perf events for too long.
Enable no-buildid and no-buildid-cache by default if --switch-output is
provided. Still allow user use --no-no-buildid to explicitly enable
buildid in this case.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1461178794-40467-6-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
[ Updated man page ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Without this patch, the last output doesn't have timestamp appended if
--timestamp-filename is not explicitly provided. For example:
# perf record -a --switch-output &
[1] 11224
# kill -s SIGUSR2 11224
[ perf record: dump data: Woken up 1 times ]
# [ perf record: Dump perf.data.2015122622372823 ]
# fg
perf record -a --switch-output
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.027 MB perf.data (540 samples) ]
# ls -l
total 836
-rw------- 1 root root 33256 Dec 26 22:37 perf.data <---- *Odd*
-rw------- 1 root root 817156 Dec 26 22:37 perf.data.2015122622372823
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1461178794-40467-5-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
[ Updated man page, that also got an entry for --timestamp-filename ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
auxtrace_snapshot_state matches the trigger model. Use trigger to
implement it. auxtrace_snapshot_state and auxtrace_snapshot_err are
absorbed.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1461178794-40467-3-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Use 'trigger' to model operations which need to be executed when an
event (a signal, for example) is observed.
States and transits:
OFF--(on)--> READY --(hit)--> HIT
^ |
| (ready)
| |
\_____________/
is_hit and is_ready are two key functions to query the state of a
trigger. is_hit means the event already happen; is_ready means the
trigger is waiting for the event.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1461178794-40467-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The error messages returned by this method should not have an ending
newline, fix the two cases where it was.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-8af0pazzhzl3dluuh8p7ar7p@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When the kernel allows tweaking perf_event_max_stack and the event being
setup has PERF_SAMPLE_CALLCHAIN in its perf_event_attr.sample_type, tell
the user that tweaking /proc/sys/kernel/perf_event_max_stack may solve
the problem.
Before:
# echo 32000 > /proc/sys/kernel/perf_event_max_stack
# perf record -g usleep 1
Error:
The sys_perf_event_open() syscall returned with 12 (Cannot allocate memory) for event (cycles:ppp).
/bin/dmesg may provide additional information.
No CONFIG_PERF_EVENTS=y kernel support configured?
#
After:
# echo 64000 > /proc/sys/kernel/perf_event_max_stack
# perf record -g usleep 1
Error:
Not enough memory to setup event with callchain.
Hint: Try tweaking /proc/sys/kernel/perf_event_max_stack
Hint: Current value: 64000
#
Suggested-by: David Ahern <dsahern@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ebv0orelj1s1ye857vhb82ov@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There is an upper limit to what tooling considers a valid callchain,
and it was tied to the hardcoded value in the kernel,
PERF_MAX_STACK_DEPTH (127), now that this can be tuned via a sysctl,
make it read it and use that as the upper limit, falling back to
PERF_MAX_STACK_DEPTH for kernels where this sysctl isn't present.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-yjqsd30nnkogvj5oyx9ghir9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Propagate the error instead.
Cc: David Ahern <dsahern@gmail.com>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-z6erjg35d1gekevwujoa0223@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Introduced in commit 4babf2c5ef ("x86: wire up preadv2 and pwritev2").
This will make 'perf trace' aware of them.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-vojoylgce2cetsy36446s5ny@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Perf is not able to register probe in kernel module when dwarf supprt
is not there(and so it goes for symtab). Perf passes full path of
module where only module name is required which is causing the problem.
This patch fixes this issue.
Before applying patch:
$ dpkg -s libdw-dev
dpkg-query: package 'libdw-dev' is not installed and no information is...
$ sudo ./perf probe -m /linux/samples/kprobes/kprobe_example.ko kprobe_init
Added new event:
probe:kprobe_init (on kprobe_init in /linux/samples/kprobes/kprobe_example.ko)
You can now use it in all perf tools, such as:
perf record -e probe:kprobe_init -aR sleep 1
$ sudo cat /sys/kernel/debug/tracing/kprobe_events
p:probe/kprobe_init /linux/samples/kprobes/kprobe_example.ko:kprobe_init
$ sudo ./perf record -a -e probe:kprobe_init
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.105 MB perf.data ]
$ sudo ./perf script # No output here
After applying patch:
$ sudo ./perf probe -m /linux/samples/kprobes/kprobe_example.ko kprobe_init
Added new event:
probe:kprobe_init (on kprobe_init in kprobe_example)
You can now use it in all perf tools, such as:
perf record -e probe:kprobe_init -aR sleep 1
$ sudo cat /sys/kernel/debug/tracing/kprobe_events
p:probe/kprobe_init kprobe_example:kprobe_init
$ sudo ./perf record -a -e probe:kprobe_init
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.105 MB perf.data (2 samples) ]
$ sudo ./perf script
insmod 13990 [002] 5961.216833: probe:kprobe_init: ...
insmod 13995 [002] 5962.889384: probe:kprobe_init: ...
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1461680741-12517-1-git-send-email-ravi.bangoria@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Perf can add a probe on kernel module which has not been loaded yet.
The current implementation finds the module name from path. But if the
filename is different from the actual module name then perf fails to
register a probe while loading module because of mismatch in the names.
For example, samples/kobject/kobject-example.ko is loaded as
kobject_example.
Before applying patch:
$ sudo ./perf probe -m /linux/samples/kobject/kobject-example.ko foo_show
Added new event:
probe:foo_show (on foo_show in kobject-example)
You can now use it in all perf tools, such as:
perf record -e probe:foo_show -aR sleep 1
$ cat /sys/kernel/debug/tracing/kprobe_events
p:probe/foo_show kobject-example:foo_show
$ insmod kobject-example.ko
$ lsmod
Module Size Used by
kobject_example 16384 0
Generate read to /sys/kernel/kobject_example/foo while recording data
with below command
$ sudo ./perf record -e probe:foo_show -a
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.093 MB perf.data ]
$./perf report --stdio -F overhead,comm,dso,sym
Error:
The perf.data.old file has no samples!
After applying patch:
$ sudo ./perf probe -m /linux/samples/kobject/kobject-example.ko foo_show
Added new event:
probe:foo_show (on foo_show in kobject_example)
You can now use it in all perf tools, such as:
perf record -e probe:foo_show -aR sleep 1
$ sudo cat /sys/kernel/debug/tracing/kprobe_events
p:probe/foo_show kobject_example:foo_show
$ insmod kobject-example.ko
$ lsmod
Module Size Used by
kobject_example 16384 0
Generate read to /sys/kernel/kobject_example/foo while recording data
with below command
$ sudo ./perf record -e probe:foo_show -a
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.097 MB perf.data (8 samples) ]
$ sudo ./perf report --stdio -F overhead,comm,dso,sym
...
# Samples: 8 of event 'probe:foo_show'
# Event count (approx.): 8
#
# Overhead Command Shared Object Symbol
# ........ ....... ................. ............
#
100.00% cat [kobject_example] [k] foo_show
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1461680741-12517-2-git-send-email-ravi.bangoria@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We get notifications for threads that gets created while we're tracing,
but for preexisting threads we may end not having synthesized them, like
when tracing a 'perf trace' session that will use '--pid' to trace some
other thread.
And besides we should probably stop synthesizing those records and
instead read thread information in a lazy way, i.e. just when we need,
like done in this patch:
Now the 'pid_t' argument in 'perf_event_open' gets translated to a COMM:
# perf trace -e perf_event_open perf stat -e cycles -p 31601
0.027 ( 0.027 ms): perf/23393 perf_event_open(attr_uptr: 0x2fdd0d8, pid: 31601 (abrt-dump-journ), cpu: -1, group_fd: -1, flags: FD_CLOEXEC)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
= 3
^C
And in other syscalls containing pid_t without thread->comm_set at the
time of the formatting.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ioeps6dlwst17d6oozc9shtk@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Will be used for lazy comm loading in 'perf trace'.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-7ogbkuoka1y2qsmcckqxvl5m@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Leave it alone so that it ends up assigned to SCA_PID via its type,
'pid_t', that will look up the pid on the machine thread rb_tree and
possibly find its COMM.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-r7dujgmhtxxfajuunpt1bkuo@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To reduce the size of builtin-trace.c.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-8r3gmymyn3r0ynt4yuzspp9g@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Set kprobe group name as "probe" if it is not given.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160426090413.11891.95640.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Since other methods return 0 if succeeded (or filedesc), let
probe_file__add_event() return 0 instead of the length of written bytes.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160426090303.11891.18232.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As a utility function, add lsdir() which reads given directory and store
entry name into a strlist. lsdir accepts a filter function so that user
can filter out unneeded entries.
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160426090242.11891.79014.stgit@devbox
[ Do not use the 'dirname' it is used in some distros ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix a bug to close target elf file in get_text_start_address().
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160426064737.1443.44093.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Don't read broken data after 'head' pointer.
Following commits will feed perf_evlist__mmap_read() with some 'head'
pointers not maintained by kernel. If 'head' pointer breaks an event, we
should avoid reading from the broken event. This can happen in backward
ring buffer.
For example:
old head
| |
V V
+---+------+----------+----+-----+--+
|..E|D....D|C........C|B..B|A....|E.|
+---+------+----------+----+-----+--+
'old' pointer points to the beginning of 'A' and trying read from it,
but 'A' has been overwritten. In this case, don't try to read from 'A',
simply return NULL.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1461637738-62722-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The check for the maximum code is off-by-one; the current comparison of
a code that is INTEL_PT_ERR_MAX will cause the strlcpy to perform an out
of bounds array access on the intel_pt_err_msgs array.
Fix this with a >= comparison.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1461524203-10224-1-git-send-email-colin.king@canonical.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Given that the 'val' parameter is ignored for FUTEX_LOCK_PI, get rid of
the bogus deadlock detection flag in the wrapper code and avoid the
extra argument, making it resemble its unlock counterpart. And if
nothing else, we already only pass 0 anyway.
Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Davidlohr Bueso <dbueso@suse.de>
Link: http://lkml.kernel.org/r/1461208447-29328-1-git-send-email-dave@stgolabs.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The current assert check is checking an assignment, which will always be
true. Instead, the assert should be checking if scale is equal to 0.122
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1461419154-16918-1-git-send-email-colin.king@canonical.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To check deeply nested page fault callchains.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-wuji34xx003kr88nmqt6jkgf@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-shj0fazntmskhjild5i6x73l@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This fixes a bug caused by an unitialized callchain cursor. The crash
frist appeared in:
6f736735e3 ("perf evsel: Require that callchains be resolved before
calling fprintf_{sym,callchain}")
The callchain cursor is a struct that contains pointers, that when
uninitialized will cause unpredictable behavior (usually a crash)
when trying to append to the callchain.
The existing implementation has the following issues:
1. The callchain cursor used is not initialized, resulting in
unpredictable behavior when used.
2. The cursor is declared on the stack. Even if it is properly initalized,
the implmentation will leak memory when the function returns,
since all the references to the callchain_nodes allocated by
callchain_cursor_append will be lost when the cursor goes out of
scope.
3. Storing the cursor on the stack is inefficient. Even if memory is
properly freed when it goes out of scope, a performance penalty
will be incurred due to reallocation of callchain nodes.
callchain_cursor_append is designed to avoid these reallocations
when an existing cursor is reused.
This patch fixes the crash by replacing cursor_callchain with a reference
to the global callchain_cursor which also resolves all 3 issues mentioned
above.
How to reproduce the crash:
$ perf record --call-graph=dwarf stress -t 1 -c 1
$ perf script > /dev/null
Segfault
Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: 6f736735e3 ("perf evsel: Require that callchains be resolved before calling fprintf_{sym,callchain}")
Link: http://lkml.kernel.org/r/1461119531-2529-1-git-send-email-cphlipot0@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Prep work for next patches, where we'll need access to the created
evsels, to possibly configure callchains.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-2pcgsgnkgellhlcao4aub8tu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
write_buildid() increments 'name_len' with intention to take into
account trailing zero byte. However, 'name_len' was already incremented
in machine__write_buildid_table() before. So this leads to
out-of-bounds read in do_write():
$ ./perf record sleep 0
[ perf record: Woken up 1 times to write data ]
=================================================================
==15899==ERROR: AddressSanitizer: global-buffer-overflow on address 0x00000099fc92 at pc 0x7f1aa9c7eab5 bp 0x7fff940f84d0 sp 0x7fff940f7c78
READ of size 19 at 0x00000099fc92 thread T0
#0 0x7f1aa9c7eab4 (/usr/lib/gcc/x86_64-pc-linux-gnu/5.3.0/libasan.so.2+0x44ab4)
#1 0x649c5b in do_write util/header.c:67
#2 0x649c5b in write_padded util/header.c:82
#3 0x57e8bc in write_buildid util/build-id.c:239
#4 0x57e8bc in machine__write_buildid_table util/build-id.c:278
...
0x00000099fc92 is located 0 bytes to the right of global variable '*.LC99' defined in 'util/symbol.c' (0x99fc80) of size 18
'*.LC99' is ascii string '[kernel.kallsyms]'
...
Shadow bytes around the buggy address:
0x00008012bf80: f9 f9 f9 f9 00 00 00 00 00 00 03 f9 f9 f9 f9 f9
=>0x00008012bf90: 00 00[02]f9 f9 f9 f9 f9 00 00 00 00 00 05 f9 f9
0x00008012bfa0: f9 f9 f9 f9 00 03 f9 f9 f9 f9 f9 f9 00 00 00 00
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1461053847-5633-1-git-send-email-aryabinin@virtuozzo.com
[ Remove the off-by one at the origin, to keep len(s) == strlen(s) assumption ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
Build fixes:
- Fix 'perf trace' build when DWARF unwind isn't available (Arnaldo Carvalho de Melo)
- Remove x86 references from arch-neutral Build, fixing it in !x86 arches,
reported as breaking the build for powerpc64le in linux-next (Arnaldo Carvalho de Melo)
Infrastructure changes:
- Do memset() variable 'st' using the correct size in the jit code (Colin Ian King)
- Fix postgresql ubuntu 'perf script' install instructions (Chris Phlipot)
- Use callchain_param more thoroughly when checking how callchains were
configured, eventually will be the only way to look for callchain parameters
(Arnaldo Carvalho de Melo)
- Fix some issues in the 'perf test kallsyms' entry (Arnaldo Carvalho de Melo)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Add sample_reg_mask array with pt_regs registers.
This is needed for printing supported regs ( -I? option).
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
One of the branches leading to an error had no debug message emitted,
fix it, the new lines are:
# perf test -v kallsyms
<SNIP>
0xffffffff81001000: diff name v: xen_hypercall_set_trap_table k: hypercall_page
0xffffffff810691f0: diff name v: try_to_free_pud_page k: try_to_free_pmd_page
<SNIP>
0xffffffff8150bb20: diff name v: wakeup_expire_count_show.part.5 k: wakeup_active_count_show.part.7
0xffffffff816bc7f0: diff name v: phys_switch_id_show.part.11 k: phys_port_name_show.part.12
0xffffffff817bbb90: diff name v: __do_softirq k: __softirqentry_text_start
<SNIP>
This in turn exercises another bug, still under investigation, because those
aliases _are_ in kallsyms, with the same name...
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: ab414dcda8 ("perf test: Fixup aliases checking in the 'vmlinux matches kallsyms' test")
Link: http://lkml.kernel.org/n/tip-5fhea7a54a54gsmagu9obpr4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Before:
# perf test -v kallsyms
<SNIP>
Maps only in vmlinux:
ffffffff81d5e000-ffffffff81ec3ac8 115e000 [kernel].init.text
ffffffff81ec3ac8-ffffffffa0000000 12c3ac8 [kernel].exit.text
ffffffffa0000000-ffffffffa000c000 0 [fjes]
ffffffffa000c000-ffffffffa0017000 0 [video]
ffffffffa0017000-ffffffffa001c000 0 [grace]
<SNIP>
ffffffffa0a7f000-ffffffffa0ba5000 0 [xfs]
ffffffffa0ba5000-ffffffffffffffff 0 [veth]
Maps in vmlinux with a different name in kallsyms:
Maps only in kallsyms:
ffff880000100000-ffff88001000b000 80000103000 [kernel.kallsyms]
ffff88001000b000-ffff880100000000 8001000e000 [kernel.kallsyms]
ffff880100000000-ffffc90000000000 80100003000 [kernel.kallsyms]
<SNIP>
ffffffffa0000000-ffffffffff600000 7fffa0003000 [kernel.kallsyms]
ffffffffff600000-ffffffffffffffff 7fffff603000 [kernel.kallsyms]
test child finished with -1
---- end ----
vmlinux symtab matches kallsyms: FAILED!
#
After:
# perf test -v 1
1: vmlinux symtab matches kallsyms :
--- start ---
test child forked, pid 7058
Looking at the vmlinux_path (8 entries long)
Using /lib/modules/4.6.0-rc1+/build/vmlinux for symbols
0xffffffff81076870: diff end addr for aesni_gcm_dec v: 0xffffffff810791f2 k: 0xffffffff81076902
0xffffffff81079200: diff end addr for aesni_gcm_enc v: 0xffffffff8107bb03 k: 0xffffffff81079292
0xffffffff8107e8d0: diff end addr for aesni_gcm_enc_avx_gen2 v: 0xffffffff81083e76 k: 0xffffffff8107e943
0xffffffff81083e80: diff end addr for aesni_gcm_dec_avx_gen2 v: 0xffffffff81089611 k: 0xffffffff81083ef3
0xffffffff81089990: diff end addr for aesni_gcm_enc_avx_gen4 v: 0xffffffff8108e7c4 k: 0xffffffff81089a03
0xffffffff8108e7d0: diff end addr for aesni_gcm_dec_avx_gen4 v: 0xffffffff810937ef k: 0xffffffff8108e843
Maps only in vmlinux:
ffffffff81d5e000-ffffffff81ec3ac8 115e000 [kernel].init.text
ffffffff81ec3ac8-ffffffffa0000000 12c3ac8 [kernel].exit.text
Maps in vmlinux with a different name in kallsyms:
Maps only in kallsyms:
test child finished with -1
---- end ----
vmlinux symtab matches kallsyms: FAILED!
#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 8e0cf965f9 ("perf symbols: Add support for reading from /proc/kcore")
Link: http://lkml.kernel.org/n/tip-n6vrwt9t89w8k769y349govx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Before the support for using /proc/kcore was introduced, the kallsyms
routines used /proc/modules and the first 'perf test' entry expected
finding maps for each module in the system, which is not the case with
the kcore code. Provide a way to ignore kcore files so that the test can
have its expectations met.
Improving the test to cover kcore files as well needs to be done.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ek5urnu103dlhfk4l6pcw041@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It will already be dealt with generating the syscalltbl.c file in the
x86 arch specific Build files, namely via 'archheaders'.
This fixes the build on !x86 arches, as reported for powerpcle
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 1b700c9975 ("perf tools: Build syscall table .c header from kernel's syscall_64.tbl")
Link: http://lkml.kernel.org/r/20160415212831.GT9056@kernel.org
[ Removed the syscalltbl.o altogether, as per Jiri's suggestion ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The current code is memsetting the 'struct stat' variable 'st' with the size of
'stat' (which turns out to be 1 byte) rather than the size of variable 'sz'.
Committer notes:
sizeof(function) isn't valid, the result depends on the compiler used, with
gcc, enabling pedantic warnings we get:
$ cat sizeof_function.c
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <stdio.h>
int main(void)
{
printf("sizeof(stat)=%zd, stat=%p\n", sizeof(stat), stat);
return 0;
}
$ readelf -sW sizeof_function | grep -w stat
49: 0000000000400630 16 FUNC WEAK HIDDEN 13 stat
$ cc -pedantic sizeof_function.c -o sizeof_function
sizeof_function.c: In function ‘main’:
sizeof_function.c:8:46: warning: invalid application of ‘sizeof’ to a function type [-Wpointer-arith]
printf("sizeof(stat)=%zd, stat=%p\n", sizeof(stat), stat);
^
$ ./sizeof_function
sizeof(stat)=1, stat=0x400630
$
Standard C, section 6.5.3.4:
"The sizeof operator shall not be applied to an expression that has function
type or an incomplete type, to the parenthesized name of such a type,
or to an expression that designates a bit-field member."
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Fixes: 9b07e27f88 ("perf inject: Add jitdump mmap injection support")
Link: http://lkml.kernel.org/r/1461020838-9260-1-git-send-email-colin.king@canonical.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The current instructions for setting up an Ubuntu system for using the
export-to-postgresql.py script are incorrect.
The instructions in the script have been updated to work on newer
versions of ubuntu.
-Add missing dependencies to apt-get command:
python-pyside.qtsql, libqt4-sql-psql
-Add '-s' option to createuser command to force the user to be a
superuser since the command doesn't prompt as indicated in the
current instructions.
Tested on: Ubuntu 14.04, Ubuntu 16.04(beta)
Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1461056164-14914-3-git-send-email-cphlipot0@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
One more step in the direction of using just callchain_param for
callchain parameters.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-3b1o9kb2dc94zldz0klckti6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-u701i6qpecgm9jiat52i8l98@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We have callchain_param.enabled for that.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-silwqjc2t25ls42dsvg28pp5@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We have callchain_param.enabled, so no need to have something just for
'perf report' to do the same thing.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-wbeisubpualwogwi5u8utnt1@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Trying to move in the direction of using callchain_param for all
callchain parameters, eventually ditching them from symbol_conf.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-kixllia6r26mz45ng056zq7z@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Found by code inspection, while looking at thread__resolve_callchain()
callsites, one had it, the other didn't.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-6r8i2afd3523thuuaxl39yhk@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-5i07ivw1yjsweb7gztr255jd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tracing a workload that uses transactions gave a seg fault as follows:
perf record -e intel_pt// workload
perf report
Program received signal SIGSEGV, Segmentation fault.
0x000000000054b58c in intel_pt_reset_last_branch_rb (ptq=0x1a36110)
at util/intel-pt.c:929
929 ptq->last_branch_rb->nr = 0;
(gdb) p ptq->last_branch_rb
$1 = (struct branch_stack *) 0x0
(gdb) up
1148 intel_pt_reset_last_branch_rb(ptq);
(gdb) l
1143 if (ret)
1144 pr_err("Intel Processor Trace: failed to deliver transaction event
1145 ret);
1146
1147 if (pt->synth_opts.callchain)
1148 intel_pt_reset_last_branch_rb(ptq);
1149
1150 return ret;
1151 }
1152
(gdb) p pt->synth_opts.callchain
$2 = true
(gdb)
(gdb) bt
#0 0x000000000054b58c in intel_pt_reset_last_branch_rb (ptq=0x1a36110)
#1 0x000000000054c1e0 in intel_pt_synth_transaction_sample (ptq=0x1a36110)
#2 0x000000000054c5b2 in intel_pt_sample (ptq=0x1a36110)
Caused by checking the 'callchain' flag when it should have been the
'last_branch' flag. Fix that.
Reported-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org # v4.4+
Fixes: f14445ee72 ("perf intel-pt: Support generating branch stack")
Link: http://lkml.kernel.org/r/1460977068-11566-1-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The variable is initialized and then conditionally set to a different
value, but not used when DWARF unwinding is not available, bummer, write
1000 times: "Run make -C tools/perf build-test"...
builtin-trace.c: In function ‘cmd_trace’:
builtin-trace.c:3112:6: error: variable ‘max_stack_user_set’ set but not
used [-Werror=unused-but-set-variable]
bool max_stack_user_set = true;
^
cc1: all warnings being treated as err
Fix it by marking it as __maybe_unused.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 0561499326 ("perf trace: Make --(min,max}-stack imply "--call-graph dwarf"")
Link: http://lkml.kernel.org/n/tip-85r40c5hhv6jnmph77l1hgsr@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To reduce the chances we'll overflow the mmap buffer, manual fine tuning
trumps this.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-wxygbxmp1v9mng1ea28wet02@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When the user doesn't set --mmap-pages, perf_evlist__mmap() will do it
by reading the maximum possible for a non-root user from the
/proc/sys/kernel/perf_event_mlock_kb file.
Expose that function so that 'perf trace' can, for root users, to bump
mmap-pages to a higher value for root, based on the contents of this
proc file.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-xay69plylwibpb3l4isrpl1k@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If one uses:
# perf trace --min-stack 16
Then it implicitly means that callgraphs should be enabled, and the best
option in terms of widespread availability is "dwarf".
Further work needed to choose a better alternative, LBR, in capable
systems.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-xtjmnpkyk42npekxz3kynzmx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To be able to call it outside option parsing, like when setting a
default --call-graph parameter in 'perf trace' when just --min-stack is
used.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-xay69plylwibpb3l4isrpl1k@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
They still use functions that would drag more stuff to the python
binding, where these fprintf methods are not used, so separate it.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-xfp0mgq3hh3px61di6ixi1jk@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Similar to the one in the other tools (report, script, top).
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-lh7kk5a5t3erwxw31ah0cgar@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Works just like with 'perf report'. In some cases we may want to have
more than 127 entries, the default maximum.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-mqkz2p5ok2978gztb0vsnocc@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Not used at all, nuke it.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-jf2w8ce8nl3wso3vuodg5jci@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This way the print routine merely does printing, not requiring access to
the resolving machinery, which helps disentangling the object files and
easing creating subsets with a limited functionality set.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-2ti2jbra8fypdfawwwm3aee3@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To disentangle symbol printing from all the code related to symbol
tables, resolution of addresses to symbols, etc.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-eik9g3hbtdc7ddv57f1d4v3p@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
# perf test -v python
16: Try 'import perf' in python, checking link problems :
--- start ---
test child forked, pid 672
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: /tmp/build/perf/python/perf.so: undefined symbol:
symbol_conf
test child finished with -1
---- end ----
Try 'import perf' in python, checking link problems: FAILED!
#
To fix it just pass a parameter to perf_evsel__fprintf_sym telling if
callchains should be printed.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-comrsr20bsnr8bg0n6rfwv12@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The recent perf_evsel__fprintf_callchain() move to evsel.c added several
new symbol requirements to the python binding, for instance:
# perf test -v python
16: Try 'import perf' in python, checking link problems :
--- start ---
test child forked, pid 18030
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: /tmp/build/perf/python/perf.so: undefined symbol:
callchain_cursor
test child finished with -1
---- end ----
Try 'import perf' in python, checking link problems: FAILED!
#
This would require linking against callchain.c to access to the global
callchain_cursor variables.
Since lots of functions already receive as a parameter a
callchain_cursor struct pointer, make that be the case for some more
function so that we can start phasing out usage of yet another global
variable.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-djko3097eyg2rn66v2qcqfvn@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To reduce the size of builtin-trace.c.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ao91htwxdqwlwxr47gbluou1@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently show_config() has a problem when user and system config files
have the same config variables i.e.:
# cat ~/.perfconfig
[top]
children = false
When $(sysconfdir) is /usr/local/etc
# cat /usr/local/etc/perfconfig
[top]
children = true
Before:
# perf config --user --list
top.children=false
# perf config --system --list
top.children=true
# perf config --list
top.children=true
top.children=false
Because perf_config() can call show_config() each the config file (user
and system). Fix it.
After:
# perf config --user --list
top.children=false
# perf config --system --list
top.children=true
# perf config --list
top.children=false
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1460620401-23430-3-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This infrastructure code was designed for upcoming features of
'perf config'.
That collect config key-value pairs from user and system config files
(i.e. user wide ~/.perfconfig and system wide $(sysconfdir)/perfconfig)
to manage perf's configs.
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1460620401-23430-2-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This option appends current timestamp to the output file name.
For example:
# perf record -a --timestamp-filename
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Dump perf.data.2015122622265847 ]
[ perf record: Captured and wrote 0.742 MB perf.data.<timestamp> (90 samples) ]
# ls
perf.data.201512262226584
The timestamp will be useful for identifying each perf.data after the
'perf record' support for generating multiple output files gets
introduced.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1460535673-159866-5-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
auxtrace_snapshot_enable has only two states (0/1). Turns it into a
triple states enum so SIGUSR2 handler can safely do other works without
triggering auxtrace snapshot.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1460535673-159866-4-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf_data_file__switch() closes current output file, renames it, then
open a new one to continue recording. It will be used by 'perf record'
to split output into multiple perf.data files.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1460535673-159866-3-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
ordered_events__free() leaves linked lists and timestamps not cleared,
so unable to be reused after ordered_events__free(). Which is inconvenient
after 'perf record' supports generating multiple perf.data output and
process build-ids for each of them.
Use ordered_events__reinit() for this.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1460535673-159866-2-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
[ Split from larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
'perf record' will use this when outputting multiple perf.data files.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1460535673-159866-2-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
[ Split from larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To better organize all these beautifiers.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-zrw5zz7cnrs44o5osouyutvt@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To better organize all these beautifiers.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-zbr27mdy9ssdhux3ib2nfa7j@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Were the detached tarball (make perf-tar-src-pkg) build was failing because
those definitions aren't available in the system headers.
On RHEL7, for instance:
builtin-trace.c: In function ‘syscall_arg__scnprintf_getrandom_flags’:
builtin-trace.c:1113:14: error: ‘GRND_RANDOM’ undeclared (first use in this function)
P_FLAG(RANDOM);
^
builtin-trace.c:1114:14: error: ‘GRND_NONBLOCK’ undeclared (first use in this function)
P_FLAG(NONBLOCK);
^
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-r8496g24a3kbqynvk6617b0e@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Were the detached tarball (make perf-tar-src-pkg) build was failing because
those definitions aren't available in the system headers.
On RHEL7, for instance:
builtin-trace.c: In function ‘syscall_arg__scnprintf_seccomp_op’:
builtin-trace.c:1069:7: error: ‘SECCOMP_SET_MODE_STRICT’ undeclared (first use in this function)
P_SECCOMP_SET_MODE_OP(STRICT);
^
builtin-trace.c:1069:7: note: each undeclared identifier is reported only once for each function it appears in
builtin-trace.c:1070:7: error: ‘SECCOMP_SET_MODE_FILTER’ undeclared (first use in this function)
P_SECCOMP_SET_MODE_OP(FILTER);
^
builtin-trace.c: In function ‘syscall_arg__scnprintf_seccomp_flags’:
builtin-trace.c:1091:14: error: ‘SECCOMP_FILTER_FLAG_TSYNC’ undeclared (first use in this function)
P_FLAG(TSYNC);
^
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-4f8dzzwd7g6l5dzz693u7kul@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Doesn't make sense and was causing a segfault, fix it.
# trace -e clone --no-syscalls --event sched:*exec firefox
The -e option can't be used with --no-syscalls.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ccrahezikdk2uebptzr1eyyi@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Those were converted to be evsel methods long ago, move the
source to where it belongs.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-vja8rjmkw3gd5ungaeyb5s2j@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Introducing --cpus option that will display only given cpus. Could be
used together with color-cpus option.
$ perf sched map --cpus 0,1
*A0 309999.786924 secs A0 => rcu_sched:7
*. 309999.786930 secs
*B0 . 309999.786931 secs B0 => rcuos/2:25
B0 *A0 309999.786947 secs
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1460467771-26532-9-git-send-email-jolsa@kernel.org
[ Added entry to man page ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding --color-cpus option to display selected cpus with background
color (red by default). It helps on navigating through the perf sched
map output.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1460467771-26532-8-git-send-email-jolsa@kernel.org
[ Added entry to man page ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding --color-pids option to display selected pids in color (blue by
default). It helps on navigating through the 'perf sched map' output.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1460467771-26532-7-git-send-email-jolsa@kernel.org
[ Added entry to man page ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It will be used in following patch.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1460467771-26532-6-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As preparation for next patch.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1460467771-26532-5-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding cpu_map__has() to return bool of cpu presence in cpus map.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1460467771-26532-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding thread_map__has() to return bool of pid presence in threads map.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1460467771-26532-2-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We already were able to ask for callchains for a specific event:
# trace -e nanosleep --call dwarf --event sched:sched_switch/call-graph=fp/ usleep 1
This would enable tracing just the "nanosleep" syscall, with callchains
at syscall exit and would ask the kernel for frame pointer callchains to
be enabled for the "sched:sched_switch" tracepoint event, its just that
we were not resolving the callchain and printing it in 'perf trace', do
it:
# trace -e nanosleep --call dwarf --event sched:sched_switch/call-graph=fp/ usleep 1
0.425 ( 0.013 ms): usleep/6718 nanosleep(rqtp: 0x7ffcc1d16e20) ...
0.425 ( ): sched:sched_switch:usleep:6718 [120] S ==> swapper/2:0 [120])
__schedule+0xfe200402 ([kernel.kallsyms])
schedule+0xfe200035 ([kernel.kallsyms])
do_nanosleep+0xfe20006f ([kernel.kallsyms])
hrtimer_nanosleep+0xfe2000dc ([kernel.kallsyms])
sys_nanosleep+0xfe20007a ([kernel.kallsyms])
do_syscall_64+0xfe200062 ([kernel.kallsyms])
return_from_SYSCALL_64+0xfe200000 ([kernel.kallsyms])
__nanosleep+0xffff008b8cbe2010 (/usr/lib64/libc-2.22.so)
0.486 ( 0.073 ms): usleep/6718 ... [continued]: nanosleep()) = 0
__nanosleep+0x10 (/usr/lib64/libc-2.22.so)
usleep+0x34 (/usr/lib64/libc-2.22.so)
main+0x1eb (/usr/bin/usleep)
__libc_start_main+0xf0 (/usr/lib64/libc-2.22.so)
_start+0x29 (/usr/bin/usleep)
#
Pretty compact, huh? DWARF callchains for raw_syscalls:sys_exit + frame
pointer callchains for a tracepoint, if your hardware supports LBR, go
wild with /call-graph=lbr/, guess the next step is to lift this from
'perf script':
-F, --fields <str> comma separated output fields prepend with 'type:'. Valid types: hw,sw,trace,raw.
Fields: comm,tid,pid,time,cpu,event,trace,ip,sym,dso,addr,symoff,period,iregs,brstack,brstacksym,flags
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-2e7yiv5hqdm8jywlmfivvx2v@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
- Automagically create a 'bpf-output' event, easing the setup of BPF
C "scripts" that produce output via the perf ring buffer. Now it is
just a matter of calling any perf tool, such as 'trace', with a C
source file that references the __bpf_stdout__ output channel and
that channel will be created and connected to the script:
# trace -e nanosleep --event test_bpf_stdout.c usleep 1
0.013 ( 0.013 ms): usleep/2818 nanosleep(rqtp: 0x7ffcead45f40 ) ...
0.013 ( ): __bpf_stdout__:Raise a BPF event!..)
0.015 ( ): perf_bpf_probe:func_begin:(ffffffff81112460))
0.261 ( ): __bpf_stdout__:Raise a BPF event!..)
0.262 ( ): perf_bpf_probe:func_end:(ffffffff81112460 <- ffffffff81003d92))
0.264 ( 0.264 ms): usleep/2818 ... [continued]: nanosleep()) = 0
#
Further work is needed to reduce the number of lines in a perf bpf C source
file, this being the part where we greatly reduce the command line setup (Wang Nan)
- 'perf trace' now supports callchains, with 'trace --call-graph dwarf' using
libunwind, just like 'perf top', to ask the kernel for stack dumps for CFI
processing. This reduces the overhead by asking just for userspace callchains
and also only for the syscall exit tracepoint (raw_syscalls:sys_exit)
(Milian Wolff, Arnaldo Carvalho de Melo)
Try it with, for instance:
# perf trace --call dwarf ping 127.0.0.1
An excerpt of a system wide 'perf trace --call dwarf" session is at:
https://fedorapeople.org/~acme/perf/perf-trace--call-graph-dwarf--all-cpus.txt
You may need to bump the number of mmap pages, using -m/--mmap-pages,
but on a Broadwell machine the defaults allowed system wide tracing to
work without losing that many records, experiment with just some
syscalls, like:
# perf trace --call dwarf -e nanosleep,futex
All the targets available for 'perf record', 'perf top' (--pid, --tid, --cpu,
etc) should work. Also --duration may be interesting to try.
To get filenames from in various syscalls pointer args (open, ettc), add this
to the mix:
# perf probe 'vfs_getname=getname_flags:72 pathname=filename:string'
Making this work is next in line:
# trace --call dwarf --ev sched:sched_switch/call-graph=fp/ usleep 1
I.e. honouring per-tracepoint callchains in 'perf trace' in addition to
in raw_syscalls:sys_exit.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJXDFQyAAoJENZQFvNTUqpAZCsP/2Q1Q8XfNpNZm7+JPrZDYUkm
KBxR3WSP8l46G8hJO2SBKHgXDv6EOCsfL/lvtLv18IHrz9pSTLZFPgl3a889iOnz
/d2pC/ydlDQ9yPR28cELb7gKMB0OF+rUqdZIWBqSM84LnvsYHgY6CntEIejfc2wf
jiVYHkug2dcUOmfgFpV4Jp3m6J8Okf9w9+/W4n+mkcS6o9WJvKCCiTMWoOwDDkyQ
gfEGN7YJt2iYLg4AhsG9ZJa+XKye53znjodpFNLCVbozXbZ4YSEbogR0qKJksHfH
5uabD2bEu2y0LiC9694xp5FLFM9tGML3Nr0JAkq6Jd230Ho4XyUy+/ZD0Lq0BHnv
HdIR7T4+wUYVKUf/ZW8gbPShR63UJ6qrgfLE8yZMxG0WKzh3XIQtg/BcxLw8XPi8
aF/IQt/om2KXPVEZv6SjNMp9DdmydeZ4KPrA9q2BGhbQzC2Ast7e6pHKouxbRrpb
mOSfLgDcqPFp75ZpIbFatKdg6S8VNKtFgF8wWAGrACtLboKa5PDS3El56BSNx2IA
6pexLuhaD8ndwvHP1F6nQQAHvFn5q4FKEg2fU0Pq8VnUN8SxrCvVjZZR3SjM+tGy
V5GHzJ7GTn9Cm2fwllrD/tndzPWQsbFA0UuLZPwVoxq2Lt2HC0YG30+SupsAVZrx
fCANHt3ci+qU1OCQAlIP
=jBSC
-----END PGP SIGNATURE-----
Merge tag 'perf-core-for-mingo-20160411' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements from Arnaldo Carvalho de Melo:
User visible changes:
- Automagically create a 'bpf-output' event, easing the setup of BPF
C "scripts" that produce output via the perf ring buffer. Now it is
just a matter of calling any perf tool, such as 'trace', with a C
source file that references the __bpf_stdout__ output channel and
that channel will be created and connected to the script:
# trace -e nanosleep --event test_bpf_stdout.c usleep 1
0.013 ( 0.013 ms): usleep/2818 nanosleep(rqtp: 0x7ffcead45f40 ) ...
0.013 ( ): __bpf_stdout__:Raise a BPF event!..)
0.015 ( ): perf_bpf_probe:func_begin:(ffffffff81112460))
0.261 ( ): __bpf_stdout__:Raise a BPF event!..)
0.262 ( ): perf_bpf_probe:func_end:(ffffffff81112460 <- ffffffff81003d92))
0.264 ( 0.264 ms): usleep/2818 ... [continued]: nanosleep()) = 0
#
Further work is needed to reduce the number of lines in a perf bpf C source
file, this being the part where we greatly reduce the command line setup (Wang Nan)
- 'perf trace' now supports callchains, with 'trace --call-graph dwarf' using
libunwind, just like 'perf top', to ask the kernel for stack dumps for CFI
processing. This reduces the overhead by asking just for userspace callchains
and also only for the syscall exit tracepoint (raw_syscalls:sys_exit)
(Milian Wolff, Arnaldo Carvalho de Melo)
Try it with, for instance:
# perf trace --call dwarf ping 127.0.0.1
An excerpt of a system wide 'perf trace --call dwarf" session is at:
https://fedorapeople.org/~acme/perf/perf-trace--call-graph-dwarf--all-cpus.txt
You may need to bump the number of mmap pages, using -m/--mmap-pages,
but on a Broadwell machine the defaults allowed system wide tracing to
work without losing that many records, experiment with just some
syscalls, like:
# perf trace --call dwarf -e nanosleep,futex
All the targets available for 'perf record', 'perf top' (--pid, --tid, --cpu,
etc) should work. Also --duration may be interesting to try.
To get filenames from in various syscalls pointer args (open, ettc), add this
to the mix:
# perf probe 'vfs_getname=getname_flags:72 pathname=filename:string'
Making this work is next in line:
# trace --call dwarf --ev sched:sched_switch/call-graph=fp/ usleep 1
I.e. honouring per-tracepoint callchains in 'perf trace' in addition to
in raw_syscalls:sys_exit.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
User visible:
- Beautify more syscall arguments in 'perf trace', using the type column in
tracepoint /format fields to attach, for instance, a pid_t resolver to the
thread COMM, also attach a mode_t beautifier in the same fashion
(Arnaldo Carvalho de Melo)
- Build the syscall table id <-> name resolver using the same .tbl file
used in the kernel to generate headers, to avoid the delay in getting
new syscalls supported in the audit-libs external dependency, done so
far only for x86_64 (Arnaldo Carvalho de Melo)
- Improve the documentation of event specifications (Andi Kleen)
- Process update events in 'perf script', fixing up this use case:
# perf stat -a -I 1000 -e cycles record | perf script -s script.py
- Shared object symbol adjustment fixes, fixing symbol resolution in
Android (Wang Nan)
Infrastructure:
- Add dedicated unwind addr_space member into thread struct, to allow
tools to use thread->priv, noticed while working on having callchains
in 'perf trace' (Jiri Olsa)
Build fixes:
- Fix the build in Ubuntu 12.04 (Arnaldo Carvalho de Melo, Vinson Lee)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJXB646AAoJENZQFvNTUqpAvroP/1BE/cViqrM3v351qkPDh2Dw
Mat7ZDTDUxsZ2XqzJQTsRiao1oi2ALF1Kyh3UzgRLqsDuahQHhwpm4M9pegtgDsu
I77yGGFVw/rdZLdJGHrGXE8HeOrvW6NKk7HqmnMlwpLpNkMfPVpqlq82/rUHcA46
F98/7DTG/PNstbIxuCgV3fI1aB8/v+lVGLw+S7vnf/Xuba4MNdqTQNEzkXaAubzv
c2I6YBwSO3v+oVgu3Vr679gGxNx3P/iGkgIb2WqJxezNiZUVdh4McJ4Dg3/mm5hj
Q1scLuW2MGXy3WaQW4i7sjoAwlOX+4kkUO8l5yk67GXKLwstnx4SWJNyeupMxDcp
UKl2mPcZIPaCPjR7aFnR6Xvxv71QkEEgeQlvR1gwAZeONBtZ9kOFD2jJfK5tbwyS
GG3rF5dB4KJqjr31tDcGYxuti4ggiJIvi/ujyrKvJUl7JChkJVDbqENOK35qistR
iHV1nSEQM1O2+NNaH4Ise9WYjQOH5tERc9L9XOpaR9x6KDsz1TR7bqqBu8HdASJs
v9x02RaPjDc/x3MUrhZ1sEDOqEqLO2eHhq1stnFTz4x9r7k+YPySS43FtnARB1CZ
H8Pc21gyxt1pzO0ogvnLoji1715+DaSaEWas+vrfzJ1cJ4Tf6v5zYEuOfYJhDQrC
dK75s0KDR6DCO/cwUEr1
=AF/P
-----END PGP SIGNATURE-----
Merge tag 'perf-core-for-mingo-20160408' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
User visible changes:
- Beautify more syscall arguments in 'perf trace', using the type column in
tracepoint /format fields to attach, for instance, a pid_t resolver to the
thread COMM, also attach a mode_t beautifier in the same fashion
(Arnaldo Carvalho de Melo)
- Build the syscall table id <-> name resolver using the same .tbl file
used in the kernel to generate headers, to avoid the delay in getting
new syscalls supported in the audit-libs external dependency, done so
far only for x86_64 (Arnaldo Carvalho de Melo)
- Improve the documentation of event specifications (Andi Kleen)
- Process update events in 'perf script', fixing up this use case:
# perf stat -a -I 1000 -e cycles record | perf script -s script.py
- Shared object symbol adjustment fixes, fixing symbol resolution in
Android (Wang Nan)
Infrastructure changes:
- Add dedicated unwind addr_space member into thread struct, to allow
tools to use thread->priv, noticed while working on having callchains
in 'perf trace' (Jiri Olsa)
Build fixes:
- Fix the build in Ubuntu 12.04 (Arnaldo Carvalho de Melo, Vinson Lee)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The fprintf_sym() and fprintf_callchain() methods now allow users to
change the existing behaviour of showing "[unknown]" as the name of
unresolved symbols to instead show "[0x123456]", i.e. its address.
The current patch doesn't change tools to use this facility, the results
from 'perf trace' and 'perf script' cotinue like:
70.109 ( 0.001 ms): qemu-system-x8/10153 poll(ufds: 0x7f2d93ffe870, nfds: 1) = 0 Timeout
[unknown] (/usr/lib64/libc-2.22.so)
[unknown] (/usr/lib64/libspice-server.so.1.10.0)
[unknown] (/usr/lib64/libspice-server.so.1.10.0)
[unknown] (/usr/lib64/libspice-server.so.1.10.0)
start_thread+0xca (/usr/lib64/libpthread-2.22.so)
__clone+0x6d (/usr/lib64/libc-2.22.so)
The next patch will make 'perf trace' use the new formatting.
Suggested-by: Milian Wolff <milian.wolff@kdab.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-fja1ods5vqpg42mdz09xcz3r@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We don't need the callchains at the syscall enter tracepoint, just when
finishing it at syscall exit, so reduce the overhead by asking for
callchains just at syscall exit.
Suggested-by: Milian Wolff <milian.wolff@kdab.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-fja1ods5vqpg42mdz09xcz3r@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The rename is for consistency with the parameter name.
Make it public for fine grained control of which evsels should have
callchains enabled, like, for instance, will be done in the next
changesets in 'perf trace', to enable callchains just on the
"raw_syscalls:sys_exit" tracepoint.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-og8vup111rn357g4yagus3ao@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For fiddling with sample_type fields in all evsels in an evlist.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-dg6yavctt0hzl2tsgfb43qsr@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Instead receive a callchain_param pointer to configure callchain
aspects, not doing so if NULL is passed.
This will allow fine grained control over which evsels in an evlist
gets callchains enabled.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-2mupip6khc92mh5x4nw9to82@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In 'perf trace' we're just interested in printing callchains, and we
don't want to use the symbol_conf.use_callchain, so move the callchain
part to a new method.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-kcn3romzivcpxb3u75s9nz33@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As it receives a FILE, and its more than just the IP, which can even be
requested not to be printed.
For consistency with other similar methods in tools/perf/, name it as
perf_evsel__fprintf_sym() and make it return the number of bytes
printed, just like 'fprintf(3)'
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-84gawlqa3lhk63nf0t9vnqnn@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now, one can print the call chain for every encountered sys_exit event,
e.g.:
$ perf trace -e nanosleep --call-graph dwarf path/to/ex_sleep
1005.757 (1000.090 ms): ex_sleep/13167 nanosleep(...) = 0
syscall_slow_exit_work ([kernel.kallsyms])
syscall_return_slowpath ([kernel.kallsyms])
int_ret_from_sys_call ([kernel.kallsyms])
__nanosleep (/usr/lib/libc-2.23.so)
[unknown] (/usr/lib/libQt5Core.so.5.6.0)
QThread::sleep (/usr/lib/libQt5Core.so.5.6.0)
main (path/to/ex_sleep)
__libc_start_main (/usr/lib/libc-2.23.so)
_start (path/to/ex_sleep)
Note that it is advised to increase the number of mmap pages to prevent
event losses when using this new feature. Often, adding `-m 10M` to the
`perf trace` invocation is enough.
This feature is also available in strace when built with libunwind via
`strace -k`. Performance wise, this solution is much better:
$ time find path/to/linux &> /dev/null
real 0m0.051s
user 0m0.013s
sys 0m0.037s
$ time perf trace -m 800M --call-graph dwarf find path/to/linux &> /dev/null
real 0m2.624s
user 0m1.203s
sys 0m1.333s
$ time strace -k find path/to/linux &> /dev/null
real 0m35.398s
user 0m10.403s
sys 0m23.173s
Note that it is currently not possible to configure the print output.
Adding such a feature, similar to what is available in `perf script` via
its `--fields` knob can be added later on.
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
LPU-Reference: 1460115255-17648-1-git-send-email-milian.wolff@kdab.com
[ Split from a larger patch, do not print the IP, left align,
remove dup call symbol__init(), added man page entry ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For callchains, etc where we want it to align just below the syscall
name, for instance, in 'perf trace'
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-uk9ekchd67651c625ltaur5y@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch allows cloning bpf-output event configuration among multiple
bpf scripts. If there exist a map named '__bpf_output__' and not
configured using 'map:__bpf_output__.event=', this patch clones the
configuration of another '__bpf_stdout__' map. For example, following
command:
# perf trace --ev bpf-output/no-inherit,name=evt/ \
--ev ./test_bpf_trace.c/map:__bpf_stdout__.event=evt/ \
--ev ./test_bpf_trace2.c usleep 100000
equals to:
# perf trace --ev bpf-output/no-inherit,name=evt/ \
--ev ./test_bpf_trace.c/map:__bpf_stdout__.event=evt/ \
--ev ./test_bpf_trace2.c/map:__bpf_stdout__.event=evt/ \
usleep 100000
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1460128045-97310-4-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To fix the build on Fedora Rawhide (gcc 6.0.0 20160311 (Red Hat 6.0.0-0.17):
CC /tmp/build/perf/arch/x86/util/dwarf-regs.o
arch/x86/util/dwarf-regs.c:66:36: error: 'x86_32_regoffset_table' defined but not used [-Werror=unused-const-variable=]
static const struct pt_regs_offset x86_32_regoffset_table[] = {
^~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-fghuksc1u8ln82bof4lwcj0o@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The readdir() function is thread safe as long as just one thread uses a
DIR, which is the case when parsing tracepoint event definitions, to
avoid breaking the build with glibc-2.23.90 (upcoming 2.24), use it
instead of readdir_r().
See: http://man7.org/linux/man-pages/man3/readdir.3.html
"However, in modern implementations (including the glibc implementation),
concurrent calls to readdir() that specify different directory streams
are thread-safe. In cases where multiple threads must read from the
same directory stream, using readdir() with external synchronization is
still preferable to the use of the deprecated readdir_r(3) function."
Noticed while building on a Fedora Rawhide docker container.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-wddn49r6bz6wq4ee3dxbl7lo@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The readdir() function is thread safe as long as just one thread uses a
DIR, which is the case when synthesizing events for pre-existing threads
by traversing /proc, so, to avoid breaking the build with glibc-2.23.90
(upcoming 2.24), use it instead of readdir_r().
See: http://man7.org/linux/man-pages/man3/readdir.3.html
"However, in modern implementations (including the glibc implementation),
concurrent calls to readdir() that specify different directory streams
are thread-safe. In cases where multiple threads must read from the
same directory stream, using readdir() with external synchronization is
still preferable to the use of the deprecated readdir_r(3) function."
Noticed while building on a Fedora Rawhide docker container.
CC /tmp/build/perf/util/event.o
util/event.c: In function '__event__synthesize_thread':
util/event.c:466:2: error: 'readdir_r' is deprecated [-Werror=deprecated-declarations]
while (!readdir_r(tasks, &dirent, &next) && next) {
^~~~~
In file included from /usr/include/features.h:368:0,
from /usr/include/stdint.h:25,
from /usr/lib/gcc/x86_64-redhat-linux/6.0.0/include/stdint.h:9,
from /git/linux/tools/include/linux/types.h:6,
from util/event.c:1:
/usr/include/dirent.h:189:12: note: declared here
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-i1vj7nyjp2p750rirxgrfd3c@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The readdir() function is thread safe as long as just one thread uses a
DIR, which is the case in thread_map, so, to avoid breaking the build
with glibc-2.23.90 (upcoming 2.24), use it instead of readdir_r().
See: http://man7.org/linux/man-pages/man3/readdir.3.html
"However, in modern implementations (including the glibc implementation),
concurrent calls to readdir() that specify different directory streams
are thread-safe. In cases where multiple threads must read from the
same directory stream, using readdir() with external synchronization is
still preferable to the use of the deprecated readdir_r(3) function."
Noticed while building on a Fedora Rawhide docker container.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-del8h2a0f40z75j4r42l96l0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The readdir() function is thread safe as long as just one thread uses a
DIR, which is the case in 'perf script', so, to avoid breaking the build
with glibc-2.23.90 (upcoming 2.24), use it instead of readdir_r().
See: http://man7.org/linux/man-pages/man3/readdir.3.html
"However, in modern implementations (including the glibc implementation),
concurrent calls to readdir() that specify different directory streams
are thread-safe. In cases where multiple threads must read from the
same directory stream, using readdir() with external synchronization is
still preferable to the use of the deprecated readdir_r(3) function."
Noticed while building on a Fedora Rawhide docker container.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-mt3xz7n2hl49ni2vx7kuq74g@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
He Kuang reported a problem that perf fails to get correct symbol on
Android platform in [1]. The problem can be reproduced on normal x86_64
platform. I will describe the reproducing steps in detail at the end of
commit message.
The reason of this problem is the missing of symbol adjustment for normal
shared objects. In most of the cases skipping adjustment is okay. However,
when '.text' section have different 'address' and 'offset' the result is wrong.
I checked all shared objects in my working platform, only wine dll objects and
debug objects (in .debug) have this problem. However, it is common on Android.
For example:
$ readelf -S ./libsurfaceflinger.so | grep \.text
[10] .text PROGBITS 0000000000029030 00012030
This patch enables symbol adjustment for dynamic objects so the symbol
address got from elfutils would be adjusted correctly.
Now nearly all types of ELF files should adjust symbols. Makes
ss->adjust_symbols default to true.
Steps to reproduce the problem:
$ cat ./Makefile
PWD := $(shell pwd)
LDFLAGS += "-Wl,-rpath=$(PWD)"
CFLAGS += -g
main: main.c libbuggy.so
libbuggy.so: buggy.c
gcc -g -shared -fPIC -Wl,-Ttext-segment=0x200000 $< -o $@
clean:
rm -rf main libbuggy.so *.o
$ cat ./buggy.c
int fib(int x)
{
return (x == 0) ? 1 : (x == 1) ? 1 : fib(x - 1) + fib(x - 2);
}
$ cat ./main.c
#include <stdio.h>
extern int fib(int x);
int main()
{
int i;
for (i = 0; i < 40; i++)
printf("%d\n", fib(i));
return 0;
}
$ make
$ perf record ./main
...
$ perf report --stdio
# Overhead Command Shared Object Symbol
# ........ ....... ................. ...............................
#
14.97% main libbuggy.so [.] 0x000000000000066c
8.68% main libbuggy.so [.] 0x00000000000006aa
8.52% main libbuggy.so [.] fib@plt
7.95% main libbuggy.so [.] 0x0000000000000664
5.94% main libbuggy.so [.] 0x00000000000006a9
5.35% main libbuggy.so [.] 0x0000000000000678
...
The correct result should be (after this patch):
# Overhead Command Shared Object Symbol
# ........ ....... ................. ...............................
#
91.47% main libbuggy.so [.] fib
8.52% main libbuggy.so [.] fib@plt
0.00% main [kernel.kallsyms] [k] kmem_cache_free
[1] http://lkml.kernel.org/g/1452567507-54013-1-git-send-email-hekuang@huawei.com
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1460024671-64774-3-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In this patch, the offset of '.text' section is stored into dso
and used here to re-calculate address to objdump.
In most of the cases, executable code is in '.text' section, so the
adjustment made to a symbol in dso__load_sym (using
sym.st_value -= shdr.sh_addr - shdr.sh_offset) should equal to
'sym.st_value -= dso->text_offset'. Therefore, adding text_offset back
get objdump address from symbol address (rip). However, it is not true
for kernel and kernel module since there could be multiple executable
sections with different offset. Exclude kernel for this reason.
After this patch, even dso->adjust_symbols is set to true for shared
objects, map__rip_2objdump() and map__objdump_2mem() would return
correct result, so perf behavior of annotate won't be changed.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1460024671-64774-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We used libaudit to map ids to syscall names and vice-versa, but that
imposes a delay in supporting new syscalls, having to wait for libaudit
to get those new syscalls on its tables.
To remove that delay, for x86_64 initially, grab a copy of
arch/x86/entry/syscalls/syscall_64.tbl and use it to generate those
tables.
Syscalls currently not available in audit-libs:
# trace -e copy_file_range,membarrier,mlock2,pread64,pwrite64,timerfd_create,userfaultfd
Error: Invalid syscall copy_file_range, membarrier, mlock2, pread64, pwrite64, timerfd_create, userfaultfd
Hint: try 'perf list syscalls:sys_enter_*'
Hint: and: 'man syscalls'
#
With this patch:
# trace -e copy_file_range,membarrier,mlock2,pread64,pwrite64,timerfd_create,userfaultfd
8505.733 ( 0.010 ms): gnome-shell/2519 timerfd_create(flags: 524288) = 36
8506.688 ( 0.005 ms): gnome-shell/2519 timerfd_create(flags: 524288) = 40
30023.097 ( 0.025 ms): qemu-system-x8/24629 pwrite64(fd: 18, buf: 0x7f63ae382000, count: 4096, pos: 529592320) = 4096
31268.712 ( 0.028 ms): qemu-system-x8/24629 pwrite64(fd: 18, buf: 0x7f63afd8b000, count: 4096, pos: 2314133504) = 4096
31268.854 ( 0.016 ms): qemu-system-x8/24629 pwrite64(fd: 18, buf: 0x7f63afda2000, count: 4096, pos: 2314137600) = 4096
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-51xfjbxevdsucmnbc4ka5r88@git.kernel.org
[ Added make dep for 'prepare' in 'LIBPERF_IN', fix by Wang Nan to fix parallell build ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tools should use a mechanism similar to arch/x86/entry/syscalls/ to
generate a header file with the definitions for two variables:
static const char *syscalltbl_x86_64[] = {
[0] = "read",
[1] = "write",
<SNIP>
[324] = "membarrier",
[325] = "mlock2",
[326] = "copy_file_range",
};
static const int syscalltbl_x86_64_max_id = 326;
In a per arch file that should then be included in
tools/perf/util/syscalltbl.c.
First one will be for x86_64.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-02uuamkxgccczdth8komspgp@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We're using libaudit for doing name to id and id to syscall name
translations, but that makes 'perf trace' to have to wait for newer
libaudit versions supporting recently added syscalls, such as
"userfaultfd" at the time of this changeset.
We have all the information right there, in the kernel sources, so move
this code to a separate place, wrapped behind functions that will
progressively use the kernel source files to extract the syscall table
for use in 'perf trace'.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-i38opd09ow25mmyrvfwnbvkj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Andreas reported following command produces no output:
# cat test.py
#!/usr/bin/env python
def stat__krava(cpu, thread, time, val, ena, run):
print "event %s cpu %d, thread %d, time %d, val %d, ena %d, run %d" % \
("krava", cpu, thread, time, val, ena, run)
# perf stat -a -I 1000 -e cycles,"cpu/config=0x6530160,name=krava/" record | perf script -s test.py
^C
#
The reason is that 'perf script' does not process event update events and
will never get the event name update thus the python callback is never
called.
The fix is just to add already existing callback we use in 'perf stat
report'.
Committer note:
After the patch:
# perf stat -a -I 1000 -e cycles,"cpu/config=0x6530160,name=krava/" record | perf script -s test.py
event krava cpu -1, thread -1, time 1000239179, val 1789051, ena 4000690920, run 4000690920
event krava cpu -1, thread -1, time 2000479061, val 2391338, ena 4000879596, run 4000879596
event krava cpu -1, thread -1, time 3000740802, val 1939121, ena 4000977209, run 4000977209
event krava cpu -1, thread -1, time 4001006730, val 2356115, ena 4001000489, run 4001000489
^C
#
Reported-by: Andreas Hollmann <hollmann@in.tum.de>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1460013073-18444-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Milian reported issue with thread::priv, which was double booked by perf
trace and DWARF unwind code. So using those together is impossible at
the moment.
Moving DWARF unwind private data into separate variable so perf trace
can keep using thread::priv.
Reported-and-Tested-by: Milian Wolff <milian.wolff@kdab.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andreas Hollmann <hollmann@in.tum.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1460013073-18444-2-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To be used in cases for both sides trim.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andreas Hollmann <hollmann@in.tum.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1460013073-18444-1-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Document some features for specifying events in the perf list manpage:
- Event groups
- Leader sampling
- How to specify raw PMU events in the new syntax
- Global versus per process PMUs.
- Access restrictions
- Fix Intel SDM URL
v2: Lots of new content. address review feedback.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1459810686-15913-1-git-send-email-andi@firstfloor.org
[ Add quotes to some keywords, such as "any" ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This ended up triggering these warnings when building on Ubuntu 12.04.5:
util/scripting-engines/trace-event-perl.c: In function 'perl_process_callchain':
util/scripting-engines/trace-event-perl.c:293:4: error: value computed is not used [-Werror=unused-value]
util/scripting-engines/trace-event-perl.c:294:4: error: value computed is not used [-Werror=unused-value]
util/scripting-engines/trace-event-perl.c:295:4: error: value computed is not used [-Werror=unused-value]
util/scripting-engines/trace-event-perl.c:297:4: error: value computed is not used [-Werror=unused-value]
util/scripting-engines/trace-event-perl.c:309:4: error: value computed is not used [-Werror=unused-value]
cc1: all warnings being treated as errors
mv: cannot stat `/tmp/build/perf/util/scripting-engines/.trace-event-perl.o.tmp': No such file or directory
make[4]: *** [/tmp/build/perf/util/scripting-engines/trace-event-perl.o] Error 1
Fix it by doing error checking when building the perl data structures
related to callchains.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Dima Kogan <dima@secretsauce.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@gmail.com>
Fixes: f7380c12ec ("perf script perl: Perl scripts now get a backtrace, like the python ones")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If not, tell the user that:
config/Makefile:273: Old libdw.h, finding variables at given 'perf probe' point will not work, install elfutils-devel/libdw-dev >= 0.157
And return -ENOTSUPP in die_get_var_range(), failing features that
need it, like the one pointed out above.
This fixes the build on older systems, such as Ubuntu 12.04.5.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Vinson Lee <vlee@freedesktop.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-9l7luqkq4gfnx7vrklkq4obs@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix build error on Ubuntu 12.04.5 with GCC 4.6.3.
CC util/config.o
util/config.c: In function ‘perf_buildid_config’:
util/config.c:384:15: error: declaration of ‘dirname’ shadows a global declaration [-Werror=shadow]
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 9cb5987c82 ("perf config: Rework buildid_dir_command_config to perf_buildid_config")
Link: http://lkml.kernel.org/r/1459807659-9020-1-git-send-email-vlee@freedesktop.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
That is used in both live runs, i.e.:
# trace ls
As when processing events recorded in a perf.data file:
# trace -i perf.data
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-901l6yebnzeqg7z8mbaf49xb@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently the max value of format is calculated by the bits number. It
relies on the continuity of the format.
However, uncore event format is not continuous. E.g. uncore qpi event
format can be 0-7,21.
If bit 21 is set, there is parsing issues as below.
$ perf stat -a -e uncore_qpi_0/event=0x200002,umask=0x8/
event syntax error: '..pi_0/event=0x200002,umask=0x8/'
\___ value too big for format, maximum is 511
This patch return the real max value by setting all possible bits to 1.
Signed-off-by: Kan Liang <kan.liang@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1459365375-14285-1-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For Intel PT / BTS, define the environment variable that selects TSC
timestamps in the jitdump file.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1457426333-30260-1-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Intel PT uses TSC as a timestamp, so add support for using TSC instead
of the monotonic clock. Use of TSC is selected by an environment
variable "JITDUMP_USE_ARCH_TIMESTAMP" and flagged in the jitdump file
with flag JITDUMP_FLAGS_ARCH_TIMESTAMP.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1457426330-30226-1-git-send-email-adrian.hunter@intel.com
[ Added the fixup from He Kuang to make it build on other arches, ]
[ such as aarch64, to avoid inserting this bisectiong breakage upstream ]
Link: http://lkml.kernel.org/r/1459482572-129494-1-git-send-email-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Intel PT uses the time members from the perf_event_mmap_page to convert
between TSC and perf time.
Due to a lack of foresight when Intel PT was implemented, those time
members were recorded in the (implementation dependent) AUXTRACE_INFO
event, the structure of which is generally inaccessible outside of the
Intel PT decoder. However now the conversion between TSC and perf time
is needed when processing a jitdump file when Intel PT has been used for
tracing.
So add a user event to record the time members. 'perf record' will
synthesize the event if the information is available. And session
processing will put a copy of the event on the session so that tools
like 'perf inject' can easily access it.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1457426324-30158-1-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We catch this record to provide a visual indication that events are
getting lost, then call the default method to allow extra logging shared
with the other tools to take place.
This extra logging was done twice because we were continuing to the
"default" clause where machine__process_event() will end up calling
machine__process_lost_event() again, fix it.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-wus2zlhw3qo24ye84ewu4aqw@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
User visible:
- Add support for skipping itrace instructions, useful to fast forward
processor trace (Intel PT, BTS) to right after initialization code at the start
of a workload (Andi Kleen)
- Add support for backtraces in perl 'perf script's (Dima Kogan)
- Add -U/-K (--all-user/--all-kernel) options to 'perf mem' (Jiri Olsa)
- Make -f/--force option documentation consistent across tools (Jiri Olsa)
Infrastructure:
- Add 'perf test' to check for event times (Jiri Olsa)
- 'perf config' cleanups (Taeung Song)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJW++RsAAoJENZQFvNTUqpAxrgP/A8i+A+WqTok8jEvmeCnWiqQ
b8ZIdiLOM0hrDHrZwSXvmOqyicix1mOaHfS1INm76i9C0Olwh2SRla4/MIYVI5vI
MAdvXIABgm0ychNh+XtleE16gXRWuFfc/7NBZk3f36bYUXhErOGxFuAUFlK1iOZ3
0D4rZaJkvVNc2LG01W56uZxueD5R0oCae8ctuVH7cqiCtWH6geIkrOHW2xk6Fj30
IU48Oq5C0p4q1n2YHpNudpngUH+Sr9GSMKPtRYCtGkuD/AgDzKfIDrqppeIK33Yo
SXKXkTCK2+cCajCbAFQgLf8xP+GLSgbk2fnQMsSXmEjnng7w0s0bu+Yew33qVjnx
/Hiyl0kkFD3xuSdmqlQE15ltrB1s1RHtBqK10SxZVQ7MNFLg5lltpMddEf/d/U3Z
U/jii+RYJKVIRnS/QwqRZdwnsRwTCeGB3KyZU55mM0P0PN3+U+EMTRasWvdyE7ou
7rSnhKdY9swybZyU0Z084LOWHSIEfYeMD/JofqDOrLpR/3ZKIFbmWAv5kcPp+tXL
CMoNPY2QIvo7GQXy3Qju8cthi/Td5GO1vr2InYgmm2IqksOT7OgvTssxbfbYOB/M
ii6wrkT/WILQNoMThoZXm+dFJP+6Lxsu0KbpEVBytnd0tmSgnHVPlFeGLB3Qfblv
d2mZss5ibOsMpo0f22hm
=NGjt
-----END PGP SIGNATURE-----
Merge tag 'perf-core-for-mingo-20160330' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes:
User visible changes:
- Add support for skipping itrace instructions, useful to fast forward
processor trace (Intel PT, BTS) to right after initialization code at the start
of a workload (Andi Kleen)
- Add support for backtraces in perl 'perf script's (Dima Kogan)
- Add -U/-K (--all-user/--all-kernel) options to 'perf mem' (Jiri Olsa)
- Make -f/--force option documentation consistent across tools (Jiri Olsa)
Infrastructure changes:
- Add 'perf test' to check for event times (Jiri Olsa)
- 'perf config' cleanups (Taeung Song)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Commit 9b07e27f88 ("perf inject: Add jitdump mmap injection support")
incorrectly assumed that PowerPC is big endian only.
Simplify things by consolidating the define of GEN_ELF_ENDIAN and checking
for __BYTE_ORDER == __BIG_ENDIAN.
The PowerPC checks were also incorrect, they do not match what gcc
emits. We should first look for __powerpc64__, then __powerpc__.
Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: Carl Love <cel@us.ibm.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Fixes: 9b07e27f88 ("perf inject: Add jitdump mmap injection support")
Link: http://lkml.kernel.org/r/20160329175944.33a211cc@kryten
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The 4b3a321223 ("perf hists browser: Support flat callchains") commit
over-aggressively tried to optimize callchain_node__init_have_children().
That lead to --tui mode not allowing to expand call chain elements if a
call chain element had only one parent. That's why --inverted callgraphs
looked halfway sane, but plain ones didn't.
Revert that individual optimization, it wasn't really related to the
rest of the commit.
Signed-off-by: Andres Freund <andres@anarazel.de>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Fixes: 4b3a321223 ("perf hists browser: Support flat callchains")
Link: http://lkml.kernel.org/r/20160330190245.GB13305@awork2.anarazel.de
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When using 'perf script' to look at PT traces it is often useful to
ignore the initialization code at the beginning.
On larger traces which may have many millions of instructions in
initialization code doing that in a pipeline can be very slow, with perf
script spending a lot of CPU time calling printf and writing data.
This patch adds an extension to the --itrace argument that skips 'n'
events (instructions, branches or transactions) at the beginning. This
is much more efficient.
v2:
Add support for BTS (Adrian Hunter)
Document in itrace.txt
Fix branch check
Check transactions and instructions too
Committer note:
To test intel_pt one needs to make sure VT-x isn't active, i.e.
stopping KVM guests on the test machine, as described by Andi Kleen
at http://lkml.kernel.org/r/20160301234953.GD23621@tassilo.jf.intel.com
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1459187142-20035-1-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We have some infrastructure to use perl or python to analyze logs
generated by perf. Prior to this patch, only the python tools had
access to backtrace information. This patch makes this information
available to perl scripts as well. Example:
Let's look at malloc() calls made by the seq utility. First we
create a probe point:
$ perf probe -x /lib/x86_64-linux-gnu/libc.so.6 malloc
Added new events:
...
Now we run seq, while monitoring malloc() calls with perf
$ perf record --call-graph=dwarf -e probe_libc:malloc seq 5
1
2
3
4
5
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.064 MB perf.data (6 samples) ]
We can use perf to look at its log to see the malloc calls and the backtrace
$ perf script
seq 14195 [000] 1927993.748254: probe_libc:malloc: (7f9ff8edd320) bytes=0x22
7f9ff8edd320 malloc (/lib/x86_64-linux-gnu/libc-2.22.so)
7f9ff8e8eab0 set_binding_values.part.0 (/lib/x86_64-linux-gnu/libc-2.22.so)
7f9ff8e8eda1 __bindtextdomain (/lib/x86_64-linux-gnu/libc-2.22.so)
401b22 main (/usr/bin/seq)
7f9ff8e82610 __libc_start_main (/lib/x86_64-linux-gnu/libc-2.22.so)
402799 _start (/usr/bin/seq)
...
We can also use the scripting facilities. We create a skeleton perl
script that simply prints out the events
$ perf script -g perl
generated Perl script: perf-script.pl
We can then use this script to see the malloc() calls with a
backtrace. Prior to this patch, the backtrace was not available to
the perl scripts.
$ perf script -s perf-script.pl
probe_libc::malloc 0 1927993.748254260 14195 seq __probe_ip=140325052863264, bytes=34
[7f9ff8edd320] malloc
[7f9ff8e8eab0] set_binding_values.part.0
[7f9ff8e8eda1] __bindtextdomain
[401b22] main
[7f9ff8e82610] __libc_start_main
[402799] _start
...
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/r/87mvphzld0.fsf@secretsauce.net
Signed-off-by: Dima Kogan <dima@secretsauce.net>
Change the variable name 'v' to 'home' to make it more readable.
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1459099340-16911-3-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To avoid repeated calling perf_config() remove
buildid_dir_command_config() and add new perf_buildid_config into
perf_default_config.
Because perf_config() is already called with perf_default_config at
main().
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1459099340-16911-2-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This test creates software event 'cpu-clock' attaches it in several ways
and checks that enabled and running times match.
Committer notes:
Testing it:
[acme@jouet linux]$ perf test -v times
44: Test events times :
--- start ---
test child forked, pid 27170
attaching to spawned child, enable on exec
OK : ena 307328, run 307328
attaching to current thread as enabled
OK : ena 7826, run 7826
attaching to current thread as disabled
OK : ena 738, run 738
attaching to CPU 0 as enabled
SKIP : not enough rights
attaching to CPU 0 as enabled
SKIP : not enough rights
test child finished with -2
---- end ----
Test events times: Skip
[acme@jouet linux]$
[root@jouet ~]# perf test times
44: Test events times : Ok
[root@jouet ~]# perf test -v times
44: Test events times :
--- start ---
test child forked, pid 27306
attaching to spawned child, enable on exec
OK : ena 479290, run 479290
attaching to current thread as enabled
OK : ena 11356, run 11356
attaching to current thread as disabled
OK : ena 987, run 987
attaching to CPU 0 as enabled
OK : ena 3717, run 3717
attaching to CPU 0 as enabled
OK : ena 2323, run 2323
test child finished with 0
---- end ----
Test events times: Ok
[root@jouet ~]#
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1458823940-24583-7-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
No need to export hists__collapse_insert_entry function.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1458823940-24583-4-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add -U/-K (--all-user/--all-kernel) options to use the perf record
--all-user/--all-kernel options.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1458823940-24583-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In 473398a21d ("perf tools: Add cpumode to struct perf_sample"), I
missed some places where perf_sample fields are directly initialized in
addition to what is done in perf_evsel__parse_sample(), namely when
synthesizing PERF_RECORD_{MMAP*,COMM,FORK,EXIT} for pre-existing threads
and also in intel_pt and intel_bts when synthesizing events from
processor trace, the jitdump code also was affected, fix it.
The problem was noticed with running:
# perf record -e intel_pt//u true
# perf script
Where the samples wouldn't get resolved because perf_sample.cpumode
would be left as zero, i.e. PERF_RECORD_MISC_CPUMODE_UNKNOWN, not
resolving as kernel, hypervisor or user cpu modes.
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 473398a21d ("perf tools: Add cpumode to struct perf_sample")
Link: http://lkml.kernel.org/n/tip-n5sdauxgk24d5nun8kuuu2mh@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Commit 531d241063 ("perf tools: Do not include stringify.h from the
kernel sources") seems to have accidentially removed the inclusion of
"util/header.h" from "arch/powerpc/util/header.c".
"util/header.h" provides the prototype for get_cpuid() and is needed to
build perf on Powerpc:
arch/powerpc/util/header.c:17:1: error: no previous prototype for 'get_cpuid' [-Werror=missing-prototypes]
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: 531d241063 ("perf tools: Do not include stringify.h from the kernel sources")
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
[ Included "util.h" too, to get the scnprintf() prototype ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull perf fixes from Ingo Molnar:
"This tree contains various perf fixes on the kernel side, plus three
hw/event-enablement late additions:
- Intel Memory Bandwidth Monitoring events and handling
- the AMD Accumulated Power Mechanism reporting facility
- more IOMMU events
... and a final round of perf tooling updates/fixes"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (44 commits)
perf llvm: Use strerror_r instead of the thread unsafe strerror one
perf llvm: Use realpath to canonicalize paths
perf tools: Unexport some methods unused outside strbuf.c
perf probe: No need to use formatting strbuf method
perf help: Use asprintf instead of adhoc equivalents
perf tools: Remove unused perf_pathdup, xstrdup functions
perf tools: Do not include stringify.h from the kernel sources
tools include: Copy linux/stringify.h from the kernel
tools lib traceevent: Remove redundant CPU output
perf tools: Remove needless 'extern' from function prototypes
perf tools: Simplify die() mechanism
perf tools: Remove unused DIE_IF macro
perf script: Remove lots of unused arguments
perf thread: Rename perf_event__preprocess_sample_addr to thread__resolve
perf machine: Rename perf_event__preprocess_sample to machine__resolve
perf tools: Add cpumode to struct perf_sample
perf tests: Forward the perf_sample in the dwarf unwind test
perf tools: Remove misplaced __maybe_unused
perf list: Fix documentation of :ppp
perf bench numa: Fix assertion for nodes bitfield
...
A change on kernel files included by the 'perf bench memcpy' code grew some new
include deps, breaking the detached tarball build:
$ make -C tools/perf build-test
make: Entering directory '/home/acme/git/linux/tools/perf'
- tarpkg: ./tests/perf-targz-src-pkg .
tests/make:302: recipe for target 'tarpkg' failed
make[1]: *** [tarpkg] Error 2
Makefile:102: recipe for target 'build-test' failed
make: *** [build-test] Error 2
make: Leaving directory '/home/acme/git/linux/tools/perf'
$ cat tools/perf/tarpkg
./tests/perf-targz-src-pkg .
PERF_VERSION = 4.5.g05f5ec
PERF_VERSION = 4.5.g05f5ec
In file included from bench/mem-memcpy-x86-64-asm.S:9:0:
bench/../../../arch/x86/lib/memcpy_64.S:5:29: fatal error: asm/cpufeatures.h: No such file or directory
compilation terminated.
mv: cannot stat ‘bench/.mem-memcpy-x86-64-asm.o.tmp’: No such file or directory
make[5]: *** [bench/mem-memcpy-x86-64-asm.o] Error 1
make[5]: *** Waiting for unfinished jobs....
make[4]: *** [bench] Error 2
make[4]: *** Waiting for unfinished jobs....
make[3]: *** [perf-in.o] Error 2
make[3]: *** Waiting for unfinished jobs....
make[2]: *** [all] Error 2
$
Add arch/*/include/asm/*features.h to tools/perf/MANIFEST so that we can
continue to use detached tarballs to build perf.
Now it builds ok, doing it manually:
$ make help | grep perf
perf-tar-src-pkg - Build perf-4.5.0.tar source tarball
perf-targz-src-pkg - Build perf-4.5.0.tar.gz source tarball
perf-tarbz2-src-pkg - Build perf-4.5.0.tar.bz2 source tarball
perf-tarxz-src-pkg - Build perf-4.5.0.tar.xz source tarball
$ ls -la perf-4.5.0.tar
ls: cannot access perf-4.5.0.tar: No such file or directory
$ make perf-tar-src-pkg
TAR
PERF_VERSION = 4.5.g32c25b
$ ls -la perf-4.5.0.tar
-rw-rw-r--. 1 acme acme 6318080 Mar 24 11:52 perf-4.5.0.tar
$ mv perf-4.5.0.tar /tmp
$ cd /tmp
$ tar xf perf-4.5.0.tar
$ cd perf-4.5.0/tools/perf
$ make > /dev/null
PERF_VERSION = 4.5.g32c25b
$ ls -la perf
-rwxrwxr-x. 1 acme acme 14046416 Mar 24 11:53 perf
$ ./perf --version
perf version 4.5.g32c25b
$ perf bench
Usage:
perf bench [<common options>] <collection> <benchmark> [<options>]
# List of all available benchmark collections:
sched: Scheduler and IPC benchmarks
mem: Memory access benchmarks
numa: NUMA scheduling and MM benchmarks
futex: Futex stressing benchmarks
all: All benchmarks
$ perf bench mem
# List of available benchmarks for collection 'mem':
memcpy: Benchmark for memcpy() functions
memset: Benchmark for memset() functions
all: Run all memory access benchmarks
$ perf bench mem memcpy
# Running 'mem/memcpy' benchmark:
# function 'default' (Default memcpy() provided by glibc)
# Copying 1MB bytes ...
15.024038 GB/sec
# function 'x86-64-unrolled' (unrolled memcpy() in arch/x86/lib/memcpy_64.S)
# Copying 1MB bytes ...
17.438616 GB/sec
# function 'x86-64-movsq' (movsq-based memcpy() in arch/x86/lib/memcpy_64.S)
# Copying 1MB bytes ...
25.040064 GB/sec
# function 'x86-64-movsb' (movsb-based memcpy() in arch/x86/lib/memcpy_64.S)
# Copying 1MB bytes ...
25.040064 GB/sec
$
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-2c2sncwffuabw58fj1pw86gu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
So we need to trow away just stdout, leaving stderr to be caught by
the build tests infrastructure, so that we can see what went wrong
when the tarpkg build test fails:
$ make -C tools/perf build-test
make: Entering directory '/home/acme/git/linux/tools/perf'
- tarpkg: ./tests/perf-targz-src-pkg .
tests/make:302: recipe for target 'tarpkg' failed
make[1]: *** [tarpkg] Error 2
Makefile:102: recipe for target 'build-test' failed
make: *** [build-test] Error 2
make: Leaving directory '/home/acme/git/linux/tools/perf'
$ cat tools/perf/tarpkg
./tests/perf-targz-src-pkg .
PERF_VERSION = 4.5.g05f5ec
PERF_VERSION = 4.5.g05f5ec
In file included from bench/mem-memcpy-x86-64-asm.S:9:0:
bench/../../../arch/x86/lib/memcpy_64.S:5:29: fatal error: asm/cpufeatures.h: No such file or directory
compilation terminated.
mv: cannot stat ‘bench/.mem-memcpy-x86-64-asm.o.tmp’: No such file or directory
make[5]: *** [bench/mem-memcpy-x86-64-asm.o] Error 1
make[5]: *** Waiting for unfinished jobs....
make[4]: *** [bench] Error 2
make[4]: *** Waiting for unfinished jobs....
make[3]: *** [perf-in.o] Error 2
make[3]: *** Waiting for unfinished jobs....
make[2]: *** [all] Error 2
$
So the test flow is:
1. Run: 'make -C tools/perf build-test'
2. One of its tests failed, in this case, the 'tarpkg' one
3. Look at what went wrong, by looking at the output of that test, in
tools/perf/tarpkg
Admittedly, this should be shortcircuited to showing what went wrong directly
from the 'make build-test' step, but lets first fix this tarpkg one and the
problem it spotted, which should be fixed by adding some extra file to the
tools/perf/MANIFEST so that detached tarballs continue being self contained and
build successfully.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ynld6egoxolmftcddpnd7oh6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-5njrq9dltckgm624omw9ljgu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To kill the last user of make_nonrelative_path(), that gets ditched,
one more panicking function killed.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-3hu56rvyh4q5gxogovb6ko8a@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-nq1wvtky4mpu0nupjyar7sbw@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We have addch() for chars, add() for fixed size data, and addstr() for
variable length strings, use them.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-0ap02fn2xtvpduj2j6b2o1j4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
That doesn't chekcs malloc return and that, when using strbuf, if it
can't grow, just explodes away via die().
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-vr8qsjbwub7e892hpa9msz95@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-s87zi5d03m6rz622y1z6rlsa@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Use instead the copy just made to tools/include/linux/.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-q736w12nwy98x5ox2hamp5ow@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-w246stf7ponfamclsai6b9zo@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This should die altogether, but for now lets remove a bit of this stuff,
as it is not used at all.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ade3n99xscldhg5mx2vzd8p3@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-elxg25jd4dhwod4wqbko87qh@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Since none of the perf_event fields are used anymore, just the
perf_sample ones, and since this resolves to (map, symbol) from data
structures within struct thread, rename it to thread__resolve and make
the argument ordering similar to the one in machine__resolve().
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-2b33hs9bp550tezzlhl4kejh@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Since we only deal with fields in the passed struct perf_sample move
this method to struct machine, that is where the perf_sample fields
will be resolved to a struct addr_location, i.e. thread, map, symbol,
etc.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-a1ww2lbm2vbuqsv4p7ilubu9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To avoid parsing event->header.misc in many locations.
This will also allow setting perf.sample.{ip,cpumode} in a single place,
from tracepoint fields, as needed by 'perf kvm' with PPC guests, where
the guest hardware counters is not available at the host.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-qp3yradhyt6q3wl895b1aat0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It _will_ be used, no sense in receiving it and nor fowarding it along.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ht8v5et209wuoh5o6nh9pzyq@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
All over the tree.
Cc: David Ahern <dsahern@gmail.com>
cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/n/tip-8nzhnokxyp8y4v7gf0j00oyb@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Correctly document what is implemented for :ppp on Intel CPUs in recent
kernels.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1458575793-12091-2-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Comparing bits and bytes in numa benchmark assertion
I hit the issue on two socket Power8 machine presenting its numa nodes
as 0,1,16,17 (according to numactl). Therefore I got error (and hang of
parent process):
perf: bench/numa.c:296: bind_to_memnode: Assertion `!(g->p.nr_nodes > (int)sizeof(nodemask))' failed.
This is obviously false positive. We can fit all the 18 nodes into
bitfield of 8 bytes (long on 64b architecture).
Signed-off-by: Jakub Jelen <jakuje@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jakub Jelen <jjelen@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: trivial@kernel.org
Link: http://lkml.kernel.org/r/1458388687-24421-1-git-send-email-jakuje@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull 'objtool' stack frame validation from Ingo Molnar:
"This tree adds a new kernel build-time object file validation feature
(ONFIG_STACK_VALIDATION=y): kernel stack frame correctness validation.
It was written by and is maintained by Josh Poimboeuf.
The motivation: there's a category of hard to find kernel bugs, most
of them in assembly code (but also occasionally in C code), that
degrades the quality of kernel stack dumps/backtraces. These bugs are
hard to detect at the source code level. Such bugs result in
incorrect/incomplete backtraces most of time - but can also in some
rare cases result in crashes or other undefined behavior.
The build time correctness checking is done via the new 'objtool'
user-space utility that was written for this purpose and which is
hosted in the kernel repository in tools/objtool/. The tool's (very
simple) UI and source code design is shaped after Git and perf and
shares quite a bit of infrastructure with tools/perf (which tooling
infrastructure sharing effort got merged via perf and is already
upstream). Objtool follows the well-known kernel coding style.
Objtool does not try to check .c or .S files, it instead analyzes the
resulting .o generated machine code from first principles: it decodes
the instruction stream and interprets it. (Right now objtool supports
the x86-64 architecture.)
From tools/objtool/Documentation/stack-validation.txt:
"The kernel CONFIG_STACK_VALIDATION option enables a host tool named
objtool which runs at compile time. It has a "check" subcommand
which analyzes every .o file and ensures the validity of its stack
metadata. It enforces a set of rules on asm code and C inline
assembly code so that stack traces can be reliable.
Currently it only checks frame pointer usage, but there are plans to
add CFI validation for C files and CFI generation for asm files.
For each function, it recursively follows all possible code paths
and validates the correct frame pointer state at each instruction.
It also follows code paths involving special sections, like
.altinstructions, __jump_table, and __ex_table, which can add
alternative execution paths to a given instruction (or set of
instructions). Similarly, it knows how to follow switch statements,
for which gcc sometimes uses jump tables."
When this new kernel option is enabled (it's disabled by default), the
tool, if it finds any suspicious assembly code pattern, outputs
warnings in compiler warning format:
warning: objtool: rtlwifi_rate_mapping()+0x2e7: frame pointer state mismatch
warning: objtool: cik_tiling_mode_table_init()+0x6ce: call without frame pointer save/setup
warning: objtool:__schedule()+0x3c0: duplicate frame pointer save
warning: objtool:__schedule()+0x3fd: sibling call from callable instruction with changed frame pointer
... so that scripts that pick up compiler warnings will notice them.
All known warnings triggered by the tool are fixed by the tree, most
of the commits in fact prepare the kernel to be warning-free. Most of
them are bugfixes or cleanups that stand on their own, but there are
also some annotations of 'special' stack frames for justified cases
such entries to JIT-ed code (BPF) or really special boot time code.
There are two other long-term motivations behind this tool as well:
- To improve the quality and reliability of kernel stack frames, so
that they can be used for optimized live patching.
- To create independent infrastructure to check the correctness of
CFI stack frames at build time. CFI debuginfo is notoriously
unreliable and we cannot use it in the kernel as-is without extra
checking done both on the kernel side and on the build side.
The quality of kernel stack frames matters to debuggability as well,
so IMO we can merge this without having to consider the live patching
or CFI debuginfo angle"
* 'core-objtool-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (52 commits)
objtool: Only print one warning per function
objtool: Add several performance improvements
tools: Copy hashtable.h into tools directory
objtool: Fix false positive warnings for functions with multiple switch statements
objtool: Rename some variables and functions
objtool: Remove superflous INIT_LIST_HEAD
objtool: Add helper macros for traversing instructions
objtool: Fix false positive warnings related to sibling calls
objtool: Compile with debugging symbols
objtool: Detect infinite recursion
objtool: Prevent infinite recursion in noreturn detection
objtool: Detect and warn if libelf is missing and don't break the build
tools: Support relative directory path for 'O='
objtool: Support CROSS_COMPILE
x86/asm/decoder: Use explicitly signed chars
objtool: Enable stack metadata validation on 64-bit x86
objtool: Add CONFIG_STACK_VALIDATION option
objtool: Add tool to perform compile-time stack metadata validation
x86/kprobes: Mark kretprobe_trampoline() stack frame as non-standard
sched: Always inline context_switch()
...
Store DSO's .text offset into DSO, used for VDSOs and will also be used for
other needs, like handling kernel modules.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456479154-136027-2-git-send-email-wangnan0@huawei.com
[ Extracted from larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As it is used by several other tools, better move it outside tools/perf.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-34s9kue3xq9w5mijdmfrfx8s@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In tracepoints, it's possible to print gfp flags in a human-friendly
format through a macro show_gfp_flags(), which defines a translation
array and passes is to __print_flags(). Since the following patch will
introduce support for gfp flags printing in printk(), it would be nice
to reuse the array. This is not straightforward, since __print_flags()
can't simply reference an array defined in a .c file such as mm/debug.c
- it has to be a macro to allow the macro magic to communicate the
format to userspace tools such as trace-cmd.
The solution is to create a macro __def_gfpflag_names which is used both
in show_gfp_flags(), and to define the gfpflag_names[] array in
mm/debug.c.
On the other hand, mm/debug.c also defines translation tables for page
flags and vma flags, and desire was expressed (but not implemented in
this series) to use these also from tracepoints. Thus, this patch also
renames the events/gfpflags.h file to events/mmflags.h and moves the
table definitions there, using the same macro approach as for gfpflags.
This allows translating all three kinds of mm-specific flags both in
tracepoints and printk.
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Michal Hocko <mhocko@suse.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When updating tracing's show_gfp_flags() I have noticed that perf's
gfp_compact_table is also outdated. Fill in the missing flags and place
a note in gfp.h to increase chance that future updates are synced.
Convert the __GFP_X flags from "GFP_X" to "__GFP_X" strings in line with
the previous patch.
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: David Rientjes <rientjes@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The topology test case of 'perf test' seems to be broken on my x86
system - due to the comparison of a "core-id" with # of CPUs online.
There are 8 online CPUs:
$ cat /sys/devices/system/cpu/online
0-7
but core-ids are not sequential and some core-ids exceed the number
of online CPUs.
$ cat /sys/devices/system/cpu/cpu?/topology/core_id
0
1
9
10
0
1
9
10
Looks like we can safely remove the check. Output before:
$ perf --version
perf version 4.4.rc1.g34258a
$ perf test -v topo
36: Test topology in session :
--- start ---
test child forked, pid 5906
templ file: /tmp/perf-test-vCwWG3
core_id number is too big.You may need to upgrade the perf tool.
test child interrupted
---- end ----
Test topology in session: FAILED!
and after:
$ perf test -v topo
36: Test topology in session :
--- start ---
test child forked, pid 6532
templ file: /tmp/perf-test-y10wFJ
CPU 0, core 0, socket 0
CPU 1, core 1, socket 0
CPU 2, core 9, socket 0
CPU 3, core 10, socket 0
CPU 4, core 0, socket 1
CPU 5, core 1, socket 1
CPU 6, core 9, socket 1
CPU 7, core 10, socket 1
test child finished with 0
---- end ----
Test topology in session: Ok
Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Jan Stancek <jstancek@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Link: http://lkml.kernel.org/r/20151203233219.GA27696@us.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add metric only support for -A too. This requires a new print function
that prints the metrics in the right order.
v2: Fix manpage
v3: Simplify nrcpus computation
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1457049458-28956-7-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a new mode to only print metrics. Sometimes we don't care about the
raw values, just want the computed metrics. This allows more compact
printing, so with -I each sample is only a single line. This also
allows easier plotting and processing with other tools.
The main target is with using --topdown, but it also works with -T and
standard perf stat. A few metrics are not supported.
To avoiding having to hardcode all the metrics in the code it uses a two
pass approach: first compute dummy metrics and only print the headers in
the print_metric callback. Then use the callback to print the actual
values.
There are some additional changes in the stat printout code to handle
all metrics being on a single line.
One issue is that the column code doesn't know in advance what events
are not supported by the CPU, and it would be hard to find out as this
could change based on dynamic conditions. That causes empty columns in
some cases.
The output can be fairly wide, often you may need more than 80 columns.
Example:
% perf stat -a -I 1000 --metric-only
1.001452803 frontend cycles idle insn per cycle stalled cycles per insn branch-misses of all branches
1.001452803 158.91% 0.66 2.39 2.92%
2.002192321 180.63% 0.76 2.08 2.96%
3.003088282 150.59% 0.62 2.57 2.84%
4.004369835 196.20% 0.98 1.62 3.79%
5.005227314 231.98% 0.84 1.90 4.71%
v2: Lots of updates.
v3: Use slightly narrower columns
v4: Add comment
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1457049458-28956-6-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
With all the recently added fields in the perf stat CSV output we should
finally document them in the man page. Do this here.
v2: Fix fields in documentation (Jiri)
v3: fix order of fields again (Jiri)
v4: Change order again.
v5: Document more fields (Jiri)
v6: Move time stamp first
v7: More fixes (Jiri)
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1457049458-28956-5-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The context menu in TUI hists browser checks corresponding sort keys
when creating the menu item. But hotkey actions lacks these checks so
it can filter using incorrect info.
For example, default sort key of 'perf top' doesn't contain 'comm' or
'pid' sort key so each hist entry's thread info is not reliable. Thus
it should prohibit using thread filter on 't' key.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1457533253-21419-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The commit 2eafd410e6 ("perf hists browser: Only 'Zoom into thread'
only when sort order has 'pid'") disabled thread filtering in hist
browser for the default sort key. However the he->thread is still valid
even if 'pid' sort key is not given. Only thing it should not use is
the pid (or tid) of the thread. So allow to filter by thread when
'comm' sort key is given and show pid only if 'pid' sort key is given.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1457536490-24084-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The sort__has_comm variable is to check whether the comm sort key is
given. This is necessary to support thread filtering in the TUI hists
browser later.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1457533253-21419-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When hierarchy mode is enabled, each entry in a hierarchy level shares
the period. IOW an upper level entry's period is the sum of lower level
entries. Thus perf uses only one of them to calculate the total period
of hists. It was lowest-level (leaf) entries but it has a problem when
it comes to filters.
If a filter is applied, entries in the same level will be filtered or
not. But upper level entries still have period of their sum including
filtered one. So total sum of upper level entries will not be same as
sum of lower level entries.
This resulted in entries having more than 100% of overhead and it can be
produced using perf top with filter(s).
Reported-and-Tested-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1457531222-18130-8-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The nr_sort_keys field is to carry the number of sort entries in a
hpp_list or hists to determine the depth of indentation of a hist entry.
As it's only used in hierarchy mode and now we have used nr_hpp_node for
this reason, there's no need to keep it anymore. Let's get rid of it.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1457531222-18130-7-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The hist_browser__fprintf_hierarchy_entry() if to dump current output
into a file so it needs to be sync-ed with the corresponding function
hist_browser__show_hierarchy_entry(). So use hists->nr_hpp_node to
indent width and use first fmt_node to print overhead columns instead of
checking whether it's a sort entry (or dynamic entry).
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1457531222-18130-6-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It's not used anymore and the output format is accessed by the hpp_list
pointer instead when hierarchy is enabled. Let's get rid of it.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1457531222-18130-5-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When a command-line filter is applied in hierarchy mode, output is
broken especially when filtering on lower level. The higher level
entries doesn't show up so it's hard to see the results.
Also it needs to handle multi sort keys in a single hierarchy level.
Before:
$ perf report --hierarchy -s 'cpu,{dso,comm}' --comms swapper --stdio
...
# Overhead CPU / Shared Object+Command
# ........... ...........................
#
13.79% [kernel.vmlinux] swapper
31.71% 000
13.80% [kernel.vmlinux] swapper
0.43% [e1000e] swapper
11.89% [kernel.vmlinux] swapper
9.18% [kernel.vmlinux] swapper
After:
# Overhead CPU / Shared Object+Command
# ........... ...............................
#
33.09% 003
13.79% [kernel.vmlinux] swapper
31.71% 000
13.80% [kernel.vmlinux] swapper
0.43% [e1000e] swapper
21.90% 002
11.89% [kernel.vmlinux] swapper
13.30% 001
9.18% [kernel.vmlinux] swapper
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1457531222-18130-4-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Those functions are for checkinf if a given perf_hpp_fmt is a
filter-related sort entry. With hierarchy mode, it needs to check
filters on the hist entries with its own hpp format list.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1457531222-18130-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When hierarchy mode is enabled each output format is in a separate hpp
list. So when applying a filter it should check all formats in the
list. Currently it only checks a single ->fmt field which was not set
properly.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1457531222-18130-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Build jitdump only on architectures defined in util/genelf.h file, to avoid
breaking the build on such arches.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Colin Ian King <colin.king@canonical.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Davidlohr Bueso <dbueso@suse.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Mel Gorman <mgorman@suse.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20160310164113.GA11357@krava.redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There's no need to use a const char pointer, we can used char pointer
from the beginning and omit the unnecessary cast.
Reported-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160308184230.GB7897@krava.redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pass perf_hpp_list all the way through setup_sort_list so that the sort
entry can be added on the arbitrary list.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20160309100417.GA30910@krava.redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Remove the union in evsel so that the database id and priv pointer can
be used simultainously without conflicting and crashing.
Detailed Description for the fixed bug follows:
perf script crashes with a segmentation fault on user space tool version
4.5.rc7.ge2857b when using the python database export API. It works
properly in 4.4 and prior versions.
the crash fist appeared in:
cfc8874a48 ("perf script: Process cpu/threads maps")
How to reproduce the bug:
Remove any temporary files left over from a previous crash (if you have
already attemped to reproduce the bug):
$ rm -r test_db-perf-data
$ dropdb test_db
$ perf record timeout 1 yes >/dev/null
$ perf script -s scripts/python/export-to-postgresql.py test_db
Stack Trace:
Program received signal SIGSEGV, Segmentation fault.
__GI___libc_free (mem=0x1) at malloc.c:2929
2929 malloc.c: No such file or directory.
(gdb) bt
at util/stat.c:122
argv=<optimized out>, prefix=<optimized out>) at builtin-script.c:2231
argc=argc@entry=4, argv=argv@entry=0x7fffffffdf70) at perf.c:390
at perf.c:451
Signed-off-by: Chris Phlipot <cphlipot0@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: cfc8874a48 ("perf script: Process cpu/threads maps")
Link: http://lkml.kernel.org/r/1457500314-8912-1-git-send-email-cphlipot0@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
While building on a Docker container for ubuntu and installing package
by package one ends up with:
MKDIR /tmp/build/util/
CC /tmp/build/util/genelf.o
util/genelf.c:22:19: fatal error: dwarf.h: No such file or directory
#include <dwarf.h>
^
compilation terminated.
mv: cannot stat '/tmp/build/util/.genelf.o.tmp': No such file or directory
Because the jitdump code needs the DWARF related development packages to
be installed. So make it dependent on that so that the build can succeed
without jitdump support.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-le498robnmxd40237wej3w62@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The following upcoming upstream commit:
92b0729c34 ("x86/mm, x86/mce: Add memcpy_mcsafe()")
Adds _ASM_EXTABLE_FAULT(), which is not available in user-space
and breaks the build.
We don't really need _ASM_EXTABLE_FAULT() in user-space, so simply
wrap it to nothing.
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Now hpp formats are linked using perf_hpp_list_node when hierarchy is
enabled. Like in stdio, use this info to print entries with multiple
sort keys in a single hierarchy properly.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1457361308-514-8-git-send-email-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Now hpp formats are linked using perf_hpp_list_node when hierarchy is
enabled. Like in stdio, use this info to print entries with multiple
sort keys in a single hierarchy properly.
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1457361308-514-7-git-send-email-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Now hpp formats are linked using perf_hpp_list_node when hierarchy is
enabled. Use this info to print entries with multiple sort keys in a
single hierarchy properly.
For example, the below example shows using 4 sort keys with 2 levels.
$ perf report --hierarchy -s '{prev_pid,prev_comm},{next_pid,next_comm}' \
--percent-limit 1 -i perf.data.sched
...
# Overhead prev_pid+prev_comm / next_pid+next_comm
# ........... .......................................
#
22.36% 0 swapper/0
9.48% 17773 transmission-gt
5.25% 109 kworker/0:1H
1.53% 6524 Xephyr
21.39% 17773 transmission-gt
9.52% 0 swapper/0
9.04% 0 swapper/2
1.78% 0 swapper/3
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1457361308-514-6-git-send-email-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
When multiple sort keys are used in a single hierarchy, it should indent
using number of hierarchy levels instead of number of sort keys.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1457361308-514-5-git-send-email-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This implements having multiple sort keys in a single hierarchy level.
Originally only single sort key is supported for each level, but now
using the group syntax with '{ }', it can set more than one sort key in
one level. Note that now it needs to quote in order to prevent shell
interpretation.
For example:
$ perf report --hierarchy -s '{comm,dso},sym'
...
# Overhead Command / Shared Object / Symbol
# .............. ..........................................
#
48.67% swapper [kernel.vmlinux]
34.42% [k] intel_idle
1.30% [k] __tick_nohz_idle_enter
1.03% [k] cpuidle_reflect
8.87% firefox libpthread-2.22.so
6.60% [.] __GI___libc_recvmsg
1.18% [.] pthread_cond_signal@@GLIBC_2.3.2
1.09% [.] 0x000000000000ff4b
6.11% Xorg libc-2.22.so
5.27% [.] __memcpy_sse2_unaligned
In the above example, the command name and the shared object name are
shown on the same line but the symbol name is on the different line.
Since the first two are grouped by '{}', they are in the same level.
Suggested-and-Tested=by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1457361308-514-4-git-send-email-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Now each hists has its own hpp lists in hierarchy. So instead of having
a pointer to a single perf_hpp_fmt in a hist entry, make it point the
hpp_list for its level. This will be used to support multiple sort keys
in a single hierarchy level.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1457361308-514-3-git-send-email-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The perf_hpp__setup_hists_formats() is to build hists-specific output
formats (and sort keys). Currently it's only used in order to build the
output format in a hierarchy with same sort keys, but it could be used
with different sort keys in non-hierarchy mode later.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1457361308-514-2-git-send-email-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
I'm surprised this remained undocumented since at least 2011. And it is
actually a very useful switch, as Steve and I came to realize recently.
Add the text from
2cba3ffb9a ("perf stat: Add -d -d and -d -d -d options to show more CPU events")
which added the incrementing aspect to -d.
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Davidlohr Bueso <dbueso@suse.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mel Gorman <mgorman@suse.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 2cba3ffb9a ("perf stat: Add -d -d and -d -d -d options to show more CPU events")
Link: http://lkml.kernel.org/r/1457347294-32546-1-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The level field is to distinguish levels in the hierarchy mode.
Currently each column (perf_hpp_fmt) has a different level.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1457103582-28396-2-git-send-email-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Commit b9511cd761 ("perf/x86: Fix time_shift in perf_event_mmap_page")
altered the time conversion algorithms documented in the perf_event.h
header file, to use 64-bit shifts. That was done to make the code more
future-proof (i.e. some time in the future a 32-bit shift could be
allowed). Reflect those changes in perf tools.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1457005856-6143-9-git-send-email-adrian.hunter@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Move clockid validation into jit_process() so it can later be made
conditional.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1457005856-6143-6-git-send-email-adrian.hunter@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
In preparation for moving clockid validation into jit_process().
Previously a return value of zero meant the processing had been done and
non-zero meant either the processing was not done (i.e. not the jitdump
file mmap event) or an error occurred.
Change it so that zero means the processing was not done, one means the
processing was done and successful, and negative values are an error.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1457005856-6143-5-git-send-email-adrian.hunter@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Some of the stubs are identical so just have one function for them.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1457005856-6143-3-git-send-email-adrian.hunter@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Currently, when injecting build ids, if there is AUX data then 'perf
inject' hits all DSOs because it is not known which DSOs the trace data
would hit.
That needs to be done for JIT injection also, and in fact there is no
reason to distinguish what kind of injection is being done. That is,
any time there is AUX data and the HEADER_BUID_ID feature flag is set,
and the AUX data is not being processed, then hit all DSOs. This patch
does that.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1457005856-6143-2-git-send-email-adrian.hunter@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The return type is not defined, so it defaults to int, however, the
function is not returning anything, so this is clearly not correct. Make
it a void function.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1457008214-14393-1-git-send-email-colin.king@canonical.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
When running objtool on a ppc64le host to analyze x86 binaries, it
reports a lot of false warnings like:
ipc/compat_mq.o: warning: objtool: compat_SyS_mq_open()+0x91: can't find jump dest instruction at .text+0x3a5
The warnings are caused by the x86 instruction decoder setting the wrong
value for the jump instruction's immediate field because it assumes that
"char == signed char", which isn't true for all architectures. When
converting char to int, gcc sign-extends on x86 but doesn't sign-extend
on ppc64le.
According to the gcc man page, that's a feature, not a bug:
> Each kind of machine has a default for what "char" should be. It is
> either like "unsigned char" by default or like "signed char" by
> default.
>
> Ideally, a portable program should always use "signed char" or
> "unsigned char" when it depends on the signedness of an object.
Conform to the "standards" by changing the "char" casts to "signed
char". This results in no actual changes to the object code on x86.
Note: the x86 decoder now lives in three different locations in the
kernel tree, which are all kept in sync via makefile checks and
warnings: in-kernel, perf, and objtool. This fixes all three locations.
Eventually we should probably try to at least converge the two separate
"tools" locations into a single shared location.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/9dd4161719b20e6def9564646d68bfbe498c549f.1456962210.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Add an extra check for frontend stalled in the metrics. This avoids an
extra column for the --metric-only case when the CPU does not support
frontend stalled.
v2: Add separate init function
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1456858672-21594-8-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The sa_flags field is not being initialized, so a garbage value is being
passed to sigaction. Initialize it to zero.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1456923322-29697-1-git-send-email-colin.king@canonical.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
That got broken by d3a72fd818 ("perf report: Fix indentation of
dynamic entries in hierarchy"), by using the evlist in setup_sorting()
without checking if it is NULL, as done in some 'perf test' entries:
$ find tools/ -name "*.c" | xargs grep 'setup_sorting(NULL);'
tools/perf/tests/hists_output.c: setup_sorting(NULL);
tools/perf/tests/hists_output.c: setup_sorting(NULL);
tools/perf/tests/hists_output.c: setup_sorting(NULL);
tools/perf/tests/hists_output.c: setup_sorting(NULL);
tools/perf/tests/hists_output.c: setup_sorting(NULL);
tools/perf/tests/hists_cumulate.c: setup_sorting(NULL);
tools/perf/tests/hists_cumulate.c: setup_sorting(NULL);
tools/perf/tests/hists_cumulate.c: setup_sorting(NULL);
tools/perf/tests/hists_cumulate.c: setup_sorting(NULL);
$
Fix it.
Before:
[root@jouet ~]# perf test
<SNIP>
15: Test matching and linking multiple hists : FAILED!
16: Try 'import perf' in python, checking link problems : Ok
17: Test breakpoint overflow signal handler : Ok
18: Test breakpoint overflow sampling : Ok
19: Test number of exit event of a simple workload : Ok
20: Test software clock events have valid period values : Ok
21: Test object code reading : Ok
22: Test sample parsing : Ok
23: Test using a dummy software event to keep tracking : Ok
24: Test parsing with no sample_id_all bit set : Ok
25: Test filtering hist entries : FAILED!
26: Test mmap thread lookup : Ok
27: Test thread mg sharing : Ok
28: Test output sorting of hist entries : FAILED!
29: Test cumulation of child hist entries : FAILED!
<SNIP>
After the patch the above failed tests complete successfully.
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: d3a72fd818 ("perf report: Fix indentation of dynamic entries in hierarchy")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The 'command_line' variable is free'd twice if db_export__branch_types()
fails. To avoid this, defer the free'ing of 'command_line' to after this
call so that the error return path will just free 'command_line' once.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Javi Merino <javi.merino@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1456875980-25606-1-git-send-email-colin.king@canonical.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now support CSV output for metrics. With the new output callbacks this
is relatively straight forward by creating new callbacks.
This allows to easily plot metrics from CSV files.
The new line callback needs to know the number of fields to skip them
correctly
Example output before:
% perf stat -x, true
0.200687,,task-clock,200687,100.00
0,,context-switches,200687,100.00
0,,cpu-migrations,200687,100.00
40,,page-faults,200687,100.00
730871,,cycles,203601,100.00
551056,,stalled-cycles-frontend,203601,100.00
<not supported>,,stalled-cycles-backend,0,100.00
385523,,instructions,203601,100.00
78028,,branches,203601,100.00
3946,,branch-misses,203601,100.00
After:
% perf stat -x, true
.502457,,task-clock,502457,100.00,0.485,CPUs utilized
0,,context-switches,502457,100.00,0.000,K/sec
0,,cpu-migrations,502457,100.00,0.000,K/sec
45,,page-faults,502457,100.00,0.090,M/sec
644692,,cycles,509102,100.00,1.283,GHz
423470,,stalled-cycles-frontend,509102,100.00,65.69,frontend cycles idle
<not supported>,,stalled-cycles-backend,0,100.00,,,,
492701,,instructions,509102,100.00,0.76,insn per cycle
,,,,,0.86,stalled cycles per insn
97767,,branches,509102,100.00,194.578,M/sec
4788,,branch-misses,509102,100.00,4.90,of all branches
or easier readable
$ perf stat -x, -o x.csv true
$ column -s, -t x.csv
0.490635 task-clock 490635 100.00 0.489 CPUs utilized
0 context-switches 490635 100.00 0.000 K/sec
0 cpu-migrations 490635 100.00 0.000 K/sec
45 page-faults 490635 100.00 0.092 M/sec
629080 cycles 497698 100.00 1.282 GHz
409498 stalled-cycles-frontend 497698 100.00 65.09 frontend cycles idle
<not supported> stalled-cycles-backend 0 100.00
491424 instructions 497698 100.00 0.78 insn per cycle
0.83 stalled cycles per insn
97278 branches 497698 100.00 198.270 M/sec
4569 branch-misses 497698 100.00 4.70 of all branches
Two new fields are added: metric value and metric name.
v2: Split out function argument changes
v3: Reenable metrics for real.
v4: Fix wrong hunk from refactoring.
v5: Remove extra "noise" printing (Jiri), but add it to the not counted case.
Print empty metrics for not counted.
v6: Avoid outputting metric on empty format.
v7: Print metric at the end
v8: Remove extra run, ena fields
v9: Avoid extra new line for unsupported counters
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/r/1456785386-19481-3-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf_evlist__mmap_ex() can fail without setting errno (for example, fail
in condition checking. In this case all syscall is success).
If this happen, record__open() incorrectly returns 0. Force setting rc
is a quick way to avoid this problem, or we have to follow all possible
code path in perf_evlist__mmap_ex() to make sure there's at least one
system call before returning an error.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456479154-136027-30-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move code for finalizing 'perf.data' to record__finish_output(). It will
be used by following commits to split output to multiple files.
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456479154-136027-23-git-send-email-wangnan0@huawei.com
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Create record__synthesize(). It can be used to create tracking events
for each perf.data after perf supporting splitting into multiple
outputs.
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456479154-136027-20-git-send-email-wangnan0@huawei.com
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Commits in a BPF patchkit will extract kernel and module synthesizing
code into a separated function and call it multiple times. This patch
replace 'if (err < 0)' using WARN_ONCE, makes sure the error message
show one time.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456479154-136027-19-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
After babeltrace commit 5cec03e402aa ("ir: copy variants and sequences
when setting a field path"), 'perf data convert' gets incorrect result
if there's bpf output data. For example:
# perf data convert --to-ctf ./out.ctf
# babeltrace ./out.ctf
[10:44:31.186045346] (+?.?????????) evt: { cpu_id = 0 }, { perf_ip = 0xFFFFFFFF810E7DD1, perf_tid = 23819, perf_pid = 23819, perf_id = 518, raw_len = 3, raw_data = [ [0] = 0xC028E32F, [1] = 0x815D0100, [2] = 0x1000000 ] }
[10:44:31.286101003] (+0.100055657) evt: { cpu_id = 0 }, { perf_ip = 0xFFFFFFFF8105B609, perf_tid = 23819, perf_pid = 23819, perf_id = 518, raw_len = 3, raw_data = [ [0] = 0x35D9F1EB, [1] = 0x15D81, [2] = 0x2 ] }
The expected result of the first sample should be:
raw_data = [ [0] = 0x2FE328C0, [1] = 0x15D81, [2] = 0x1 ] }
however, 'perf data convert' output big endian value to resuling CTF
file.
The reason is a internal change (or a bug?) of babeltrace.
Before this patch, at the first add_bpf_output_values(), byte order of
all integer type is uncertain (is 0, neither 1234 (le) nor 4321 (be)).
It would be fixed by:
perf_evlist__deliver_sample
-> process_sample_event
-> ctf_stream
...
->bt_ctf_trace_add_stream_class
->bt_ctf_field_type_structure_set_byte_order
->bt_ctf_field_type_integer_set_byte_order
during creating the stream.
However, the babeltrace commit mentioned above duplicates types in
sequence to prevent potential conflict in following call stack and link
the newly allocated type into the 'raw_data' sequence:
perf_evlist__deliver_sample
-> process_sample_event
-> ctf_stream
...
-> bt_ctf_trace_add_stream_class
-> bt_ctf_stream_class_resolve_types
...
-> bt_ctf_field_type_sequence_copy
->bt_ctf_field_type_integer_copy
This happens before byte order setting, so only the newly allocated
type is initialized, the byte order of original type perf choose to
create the first raw_data is still uncertain.
Byte order in CTF output is not related to byte order in perf.data.
Setting it to anything other than BT_CTF_BYTE_ORDER_NATIVE solves this
problem (only BT_CTF_BYTE_ORDER_NATIVE needs to be fixed). To reduce
behavior changing, set byte order according to compiling options.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456479154-136027-10-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ingo reported regression on display format of big numbers, which is
missing separators (in default perf stat output).
triton:~/tip> perf stat -a sleep 1
...
127008602 cycles # 0.011 GHz
279538533 stalled-cycles-frontend # 220.09% frontend cycles idle
119213269 instructions # 0.94 insn per cycle
This is caused by recent change:
perf stat: Check existence of frontend/backed stalled cycles
that added call to pmu_have_event, that subsequently calls
perf_pmu__parse_scale, which has a bug in locale handling.
The lc string returned from setlocale, that we use to store old locale
value, may be allocated in static storage. Getting a dynamic copy to
make it survive another setlocale call.
$ perf stat ls
...
2,360,602 cycles # 3.080 GHz
2,703,090 instructions # 1.15 insn per cycle
546,031 branches # 712.511 M/sec
Committer note:
Since the patch introducing the regression didn't made to perf/core,
move it to just before where the regression was introduced, so that we
don't break bisection for this feature.
Reported-by: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20160303095348.GA24511@krava.redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Format fields of a syscall have the first variable '__syscall_nr' or
'nr' that mean the syscall number. But it isn't relevant here so drop
it.
'nr' among fields of syscall was renamed '__syscall_nr'. So add
exception handling to drop '__syscall_nr' and modify the comment for
this excpetion handling.
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1456492465-5946-1-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The util/python-ext-sources file contains source files required to build
the python extension relative to $(srctree)/tools/perf,
Such a file path $(FILE).c is handed over to the python extension build
system, which builds the final object in the
$(PYTHON_EXTBUILD)/tmp/$(FILE).o path.
After the build is done all files from $(PYTHON_EXTBUILD)lib/ are
carried as the result binaries.
Above system fails when we add source file relative to ../lib, which we
do for:
../lib/bitmap.c
../lib/find_bit.c
../lib/hweight.c
../lib/rbtree.c
All above objects will be built like:
$(PYTHON_EXTBUILD)/tmp/../lib/bitmap.c
$(PYTHON_EXTBUILD)/tmp/../lib/find_bit.c
$(PYTHON_EXTBUILD)/tmp/../lib/hweight.c
$(PYTHON_EXTBUILD)/tmp/../lib/rbtree.c
which accidentally happens to be final library path:
$(PYTHON_EXTBUILD)/lib/
Changing setup.py to pass full paths of source files to Extension build
class and thus keep all built objects under $(PYTHON_EXTBUILD)tmp
directory.
Reported-by: Jeff Bastian <jbastian@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Josh Boyer <jwboyer@fedoraproject.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: stable@vger.kernel.org # v4.2+
Link: http://lkml.kernel.org/r/20160227201350.GB28494@krava.redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Without this patch BPF map configuration is not applied.
Command like this:
# ./perf trace --ev bpf-output/no-inherit,name=evt/ \
--ev ./test_bpf_trace.c/map:channel.event=evt/ \
usleep 100000
Load BPF files without error, but since map:channel.event=evt is not
applied, bpf-output event not work.
This patch allows 'perf trace' load and run BPF scripts.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456479154-136027-7-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf_evlist__set_filter() tries to set filter to every evsel linked in
the evlist. However, since filters can only be applied to tracepoints,
checking type of evsel before calling perf_evsel__set_filter() would be
better.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456479154-136027-6-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Before this patch each subcommand calls perf_config() by themself,
reading the default configuration together with subcommand specific
options. If a subcommand doesn't have it own options, it needs to call
'perf_config(perf_default_config, NULL)' to ensure .perfconfig is
loaded.
This patch brings perf_config(perf_default_config, NULL) to the very
start of main(), so subcommands don't need to do it.
After this patch, 'llvm.clang-path' works for 'perf trace'.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Suggested-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456479154-136027-4-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The column width of dynamic entries is updated when comparing hist
entries. However some unique entries can miss the chance to update. So
move the update to output resort stage to make sure every entry will get
called before display.
To do that, abuse ->sort callback to update the width when the third
argument is NULL. When resorting entries in normal path, it never be
NULL so it should be fine IMHO.
Before:
# Overhead ptr / bytes_req / gfp_flags
# .............. ..........................................
#
37.50% 0xffff8803f7669400
37.50% 448
37.50% GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
10.42% 0xffff8803f766be00
8.33% 96
8.33% GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
2.08% 512
2.08% GFP_KERNEL|GFP_NOWARN|GFP_REPEAT|GFP <-- here
After:
# Overhead ptr / bytes_req / gfp_flags
# .............. .....................................................
#
37.50% 0xffff8803f7669400
37.50% 448
37.50% GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
10.42% 0xffff8803f766be00
8.33% 96
8.33% GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
2.08% 512
2.08% GFP_KERNEL|GFP_NOWARN|GFP_REPEAT|GFP_NOMEMALLOC
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456512767-1164-5-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When dynamic sort key is used it might not show pretty printed output.
This is because the trace output was not set only for the first dynamic
sort key. During hierarchy_insert_entry() it missed to pass the
trace_output to dynamic entries. Also even if it did, only first entry
will have it. Subsequent entries might set it during collapsing stage
but it's not guaranteed.
Before:
$ perf report --hierarchy --stdio -s ptr,bytes_req,gfp_flags -g none
#
# Overhead ptr / bytes_req / gfp_flags
# .............. ..........................................
#
37.50% 0xffff8803f7669400
37.50% 448
37.50% 66080
10.42% 0xffff8803f766be00
8.33% 96
8.33% 66080
2.08% 512
2.08% 67280
After:
#
# Overhead ptr / bytes_req / gfp_flags
# .............. ..........................................
#
37.50% 0xffff8803f7669400
37.50% 448
37.50% GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
10.42% 0xffff8803f766be00
8.33% 96
8.33% GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
2.08% 512
2.08% GFP_KERNEL|GFP_NOWARN|GFP_REPEAT|GFP
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456512767-1164-4-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The dynamic entries are right-aligned unlike other entries since it
usually has numeric value. But for the hierarchy mode, left alignment
is more appropriate IMHO. Also trim spaces on the left so that we can
easily identify the hierarchy.
Before:
$ perf report --hierarchy -i perf.data.kmem -s gfp_flags,ptr,bytes_req --stdio -g none
...
#
# Overhead gfp_flags / ptr / bytes_req
# .............. .................................................................................................
#
91.67% GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
37.50% 0xffff8803f7669400
37.50% 448
8.33% 0xffff8803f766be00
8.33% 96
4.17% 0xffff8800d156dc00
4.17% 704
After:
# Overhead gfp_flags / ptr / bytes_req
# .............. ....................................
#
91.67% GFP_ATOMIC|GFP_NOWARN|GFP_NOMEMALLOC
37.50% 0xffff8803f7669400
37.50% 448
8.33% 0xffff8803f766be00
8.33% 96
4.17% 0xffff8800d156dc00
4.17% 704
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456512767-1164-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When dynamic entries are used in the hierarchy mode with multiple
events, the output might not be aligned properly. In the hierarchy
mode, the each sort column is indented using total number of sort keys.
So it keeps track of number of sort keys when adding them. However
a dynamic sort key can be added more than once when multiple events have
same field names. This results in unnecessarily long indentation in the
output.
For example perf kmem records following events:
$ perf evlist --trace-fields -i perf.data.kmem
kmem:kmalloc: trace_fields: call_site,ptr,bytes_req,bytes_alloc,gfp_flags
kmem:kmalloc_node: trace_fields: call_site,ptr,bytes_req,bytes_alloc,gfp_flags,node
kmem:kfree: trace_fields: call_site,ptr
kmem:kmem_cache_alloc: trace_fields: call_site,ptr,bytes_req,bytes_alloc,gfp_flags
kmem:kmem_cache_alloc_node: trace_fields: call_site,ptr,bytes_req,bytes_alloc,gfp_flags,node
kmem:kmem_cache_free: trace_fields: call_site,ptr
kmem:mm_page_alloc: trace_fields: page,order,gfp_flags,migratetype
kmem:mm_page_free: trace_fields: page,order
As you can see, many field names shared between kmem events. So adding
'ptr' dynamic sort key alone will set nr_sort_keys to 6. And this adds
many unnecessary spaces between columns.
Before:
$ perf report -i perf.data.kmem --hierarchy -s ptr -g none --stdio
...
# Overhead ptr
# ....................... ...................................
#
99.89% 0xffff8803ffb79720
0.06% 0xffff8803d228a000
0.03% 0xffff8803f7678f00
0.00% 0xffff880401dc5280
0.00% 0xffff880406172380
0.00% 0xffff8803ffac3a00
0.00% 0xffff8803ffac1600
After:
# Overhead ptr
# ........ ....................
#
99.89% 0xffff8803ffb79720
0.06% 0xffff8803d228a000
0.03% 0xffff8803f7678f00
0.00% 0xffff880401dc5280
0.00% 0xffff880406172380
0.00% 0xffff8803ffac3a00
0.00% 0xffff8803ffac1600
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456512767-1164-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When hist_entry__cmp() and hist_entry__collapse() are called, they
should check if the dynamic entry is comparing matching hists only.
Otherwise it might access different hists resulting in incorrect output.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456512767-1164-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Like the stdio, it should show messages about omitted hierarchy
entries. Please refer the previous commit for more details.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456488800-28124-5-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Like the stdio, it should show messages about omitted hierarchy entries.
Please refer the previous commit for more details.
As it needs to check an entry is omitted or not multiple times, add the
has_no_entry field in the hist entry.
Suggested-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456488800-28124-4-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The previous patch introduced __rb_hierarchy_next() function with
various move direction like HMD_FORCE_CHILD but missed to change using
it some place.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456488800-28124-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When the hierarchy mode is used, some entries might be omiited due to a
percent limit or filter. In this case the output hierarchy is different
than other entries. Add an informative message to users about this.
For example, when 4% of percent limit is applied:
Before:
# Overhead Command / Shared Object / Symbol
# .............. ..........................................
#
49.09% swapper
48.67% [kernel.vmlinux]
34.42% [k] intel_idle
11.51% firefox
8.87% libpthread-2.22.so
6.60% [.] __GI___libc_recvmsg
10.49% gnome-shell
4.74% libc-2.22.so
10.08% Xorg
6.11% libc-2.22.so
5.27% [.] __memcpy_sse2_unaligned
6.15% perf
Note that, gnome-shell/libc has no symbols and perf has no dso/symbols.
With that patch the output will look like below:
After:
# Overhead Command / Shared Object / Symbol
# .............. ..........................................
#
49.09% swapper
48.67% [kernel.vmlinux]
34.42% [k] intel_idle
11.51% firefox
8.87% libpthread-2.22.so
6.60% [.] __GI___libc_recvmsg
10.49% gnome-shell
4.74% libc-2.22.so
no entry >= 4.00%
10.08% Xorg
6.11% libc-2.22.so
5.27% [.] __memcpy_sse2_unaligned
6.15% perf
no entry >= 4.00%
Suggested-and-Tested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456488800-28124-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The hists__overhead_width() is to calculate width occupied by the
overhead (and others) columns before the sort columns.
The hist_entry__has_hiearchy_children() is to check whether an entry has
lower entries (children) in the hierarchy to be shown in the output.
This means the children should not be filtered out and above the percent
limit.
These two functions will be used to show information when all children
of an entry is omitted by the percent limit (or filter).
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456488800-28124-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
script_spec_register() called two functions: script_spec__find() and
script_spec__findnew(). But this way script_spec__find() gets called
two times, directly and via script_spec__findnew().
So remove script_spec__findnew() and make script_spec_register() only
call once script_spec__find().
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1456413190-12378-1-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
After collecting samples for events 'syscalls:', perf-script with python
script doesn't occasionally work generating a segmentation fault.
The reason is that the print fmt is empty and a value of
event->print_fmt.args is NULL, so dereferencing the null pointer results
in a segmentation fault i.e.:
# perf record -e syscalls:*
# perf script -g python
# perf script -s perf-script.py
in trace_begin
syscalls__sys_enter_brk 3 79841.832099154 3777 test.sh syscall_nr=12, brk=0
... (omitted) ...
Segmentation fault (core dumped)
For example, a format of sys_enter_getuid() hasn't
print fmt as below.
# cat /sys/kernel/debug/tracing/events/syscalls/sys_enter_getuid/format
name: sys_enter_getuid
ID: 188
format:
field:unsigned short common_type; offset:0; size:2; signed:0;
field:unsigned char common_flags; offset:2; size:1; signed:0;
field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
field:int common_pid; offset:4; size:4; signed:1;
field:int syscall_nr; offset:8; size:4; signed:1;
print fmt: ""
So add exception handling to avoid this problem.
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456413179-12331-1-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In 1d55e8ef34 ("perf tools: Introduce opt_event_config nonterminal") I
removed the unconditional "'/' '/'" for pmu events such as
"intel_pt//" but forgot to use opt_event_config where it expected some
event_config, oops. Fix it.
Noticed when trying to use:
# perf record -e intel_pt// -a sleep 1
event syntax error: 'intel_pt//'
\___ parser error
Run 'perf list' for a list of valid events
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
-e, --event <event> event selector. use 'perf list' to list available events
#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 1d55e8ef34 ("perf tools: Introduce opt_event_config nonterminal")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch improves the error message given by jvmti Makefile when the
alternatives command cannot be found. It now suggests the user locates
the root of their Java installation and pass it with JDIR=
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1456378056-18812-1-git-send-email-eranian@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
No need to use strbuf there, its just a simple alloc+formatting, which
asprintf does just fine.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-6q6cxfhk8c8ypg3tfpo0i2iy@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Support hierarchy output for perf-top using --hierarchy option.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456326830-30456-19-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In the hierarchy mode, hist entries should decay their children too.
Also update hists__delete_entry() to be able to free child entries.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456326830-30456-18-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The --hierarchy option is to show output in hierarchy mode. It extends
folding/unfolding in the TUI and GTK browsers to support sort items as
well as callchains. Users can toggle the items to see the performance
result at wanted level.
$ perf report --hierarchy --tui
Overhead Command / Shared Object / Symbol
--------------------------------------------------
+ 32.96% gnome-shell
- 15.11% swapper
- 14.97% [kernel.vmlinux]
6.82% [k] intel_idle
0.66% [k] menu_select
0.43% [k] __hrtimer_start_range_ns
...
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456326830-30456-17-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The hierarchy output mode is to group entries for each level so that
user can see higher level picture more easily.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456326830-30456-16-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Implement hierarchy mode in TUI. The output is look like stdio but it
also supports to fold/unfold children dynamically.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456326830-30456-14-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The 'C' and 'E' keys are to collapse/expand all hist entries. Update
nr_hierarchy_entries properly in this case.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456326830-30456-13-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add nr_hierarchy_entries field to keep current number of (unfolded) hist
entries. And the hist_entry->nr_rows carries number of direct children.
But in the hierarchy mode, entry can have grand children and callchains.
So update the number properly using hierarchy_count_rows() when toggling
the folded state (by pressing ENTER key).
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456326830-30456-12-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The hierarchy output mode is to group entries so the existing columns
won't fit to the new output. Treat all sort keys as a single column and
separate headers by "/".
# Overhead Command / Shared Object
# ........... ................................
#
15.11% swapper
14.97% [kernel.vmlinux]
0.09% [libahci]
0.05% [iwlwifi]
...
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456326830-30456-11-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The hierarchy output mode is to group entries for each level so that
user can see higher level picture more easily. It also helps to find
out which component is most costly. The output will look like below:
15.11% swapper
14.97% [kernel.vmlinux]
0.09% [libahci]
0.05% [iwlwifi]
10.29% irq/33-iwlwifi
6.45% [kernel.vmlinux]
1.41% [mac80211]
1.15% [iwldvm]
1.14% [iwlwifi]
0.14% [cfg80211]
4.81% firefox
3.92% libxul.so
0.34% [kernel.vmlinux]
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456326830-30456-10-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It'll be used for hierarchy output mode to indent entries properly.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456326830-30456-9-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In hierarchy mode, a filter can affect periods of entries in upper
hierarchy. So it needs to resort the hists after filter.
For example, let's look at following example:
Overhead Command / Shared Object / Symbol
------------ --------------------------------
30.00% perf
20.00% perf
10.00% main
5.00% pr_debug
5.00% memcpy
10.00% [kernel.vmlinux]
8.00% memset
2.00% cpu_idle
If we apply simbol filter for 'mem' it should look like this
13.00% perf
8.00% [kernel.vmlinux]
8.00% memset
5.00% perf
5.00% memcpy
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456326830-30456-8-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The hists__filter_hierarchy() function implements filtering in hierarchy
mode. Now we have hist_entry__filter() so use it for entries in the
hierarchy. It returns 3 kind of values.
A negative value means that it's not filtered by this type. It marks
current entry as filtered tentatively so if a lower level entry removes
the filter it also removes the all parent so that we can find the entry
in the output.
Zero means it's filtered out by this type. A positive value means it's
not filtered so it removes the filter and shows in the output. In these
cases, it moves to next entry since lower level entry won't match by
this type of filter anymore. Thus all children will be filtered or not
together.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456326830-30456-7-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The hist_entry__filter() function is to filter hist entries using sort
key related info. This is needed to support hierarchy mode since each
hist entry will be associated with a hpp fmt which has a sort key. So
each entry should compare to only matching type of filters.
To do that, add the ->se_filter callback field to struct sort_entry.
This callback takes 'type' argument which determines whether it's
matching sort key or not. It returns -1 for non-matching type, 0 for
filtered entry and 1 for not filtered entries.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456326830-30456-6-git-send-email-namhyung@kernel.org
[ 'socket' is reserved in sys/socket.h, so replace it with 'sk' ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The rb_hierarchy_{next,prev,last} functions are to traverse all hist
entries in a hierarchy. They will be used by various function which
supports hierarchy output.
As the rb_hierarchy_next() is used to traverse the whole hierarchy, it
sometime needs to visit entries regardless of current folding state. So
add enum hierarchy_move_dir and pass it to __rb_hierarchy_next() for
those cases.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456326830-30456-5-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For hierarchical output, each entry must be sorted in their rbtree
(hroot) properly. Add hists__hierarchy_output_resort() to do the job.
Note that those hierarchy entries share the period counts, it'd be
important to update the hists->stats only once (for leaves).
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456326830-30456-4-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In the hierarchical view, entries will be grouped and sorted on the
first key, and then on the second key, and so on. Add the
he->hroot_{in,out} fields to keep the lower level entries. Actually this
can share space, in a union, with callchain's 'sorted_root' since the
hroots are only used by non-leaf entries and callchain is only used by
leaf entries.
It also adds the 'parent_he' and 'depth' fields which can be used by browsers.
This patch only implements collapsing part which creates internal
entries for each sort key. These need to be sorted by output_sort stage
and to be displayed properly in the later patch(es).
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456326830-30456-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The 'trace', 'srcline' and 'srcfile' sort keys updates hist entry's
field later. With the hierarchy mode, those fields are passed to a
matching entry so it needs to identify the sort keys.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1456326830-30456-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move code printing binray data from trace_event() to utils.c and allows
passing different printer. Further commits will use this logic to print
bpf output event.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456312845-111583-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Moving strncat call into scnprintf to easily track number of displayed
bytes. It will be used in following patch.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1456303616-26926-13-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Moving strncat/strcpy calls into scnprintf to easily track number of
displayed bytes. It will be used in following patch.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1456303616-26926-12-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Moving strncat/strcpy calls into scnprintf to easily track number of
displayed bytes. It will be used in following patch.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1456303616-26926-11-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Moving strncat/strcpy calls into scnprintf to easily track
number of displayed bytes. It will be used in following patch.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1456303616-26926-10-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move meminfo's lck display function into mem-events.c object, so it
could be reused later from script code.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1456303616-26926-9-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move meminfo's snp display function into mem-events.c object, so it
could be reused later from script code.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1456303616-26926-8-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move meminfo's lvl display function into mem-events.c object, so it
could be reused later from script code.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1456303616-26926-7-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move meminfo's tlb display function into mem-events.c object, so it
could be reused later from script code.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1456303616-26926-6-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Wrap perf_mem_events[].name into perf_mem_events__name() so we could alter the
events name if needed.
This will be handy when changing latency settings for loads event in following
patch.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1456303616-26926-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Check if current kernel support available memory events and display the
status within -e list option:
$ perf mem record -e list
ldlat-loads : available
ldlat-stores : available
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1456303616-26926-2-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
No users, nuke them.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-kfv2wo8xann8t97wdalttcx7@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It is the only user of this function, just use the strlen() to skip
the prefix.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-blao710l5cd5hmwrhy51ftgq@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When an error happens during alias parsing currently the complete
parsing of all attributes of the PMU is stopped. This is breaks old perf
on a newer kernel that may have not-yet-know alias attributes (such as
.scale or .per-pkg).
Continue when some attribute is unparseable.
This is IMHO a stable candidate and should be backported to older
versions to avoid problems with newer kernels.
v2: Print warnings when something goes wrong.
v3: Change warning to debug output
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: stable@vger.kernel.org # v3.6+
Link: http://lkml.kernel.org/r/1455749095-18358-1-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding data_src and weight column definitions, so it's displayed for
related sample types.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1455525293-8671-22-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There's no need to define extra macros for that.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1455525293-8671-13-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding -e option for perf mem record command, to be able to specify
memory event directly.
Get list of available events:
$ perf mem record -e list
ldlat-loads
ldlat-stores
Monitor ldlat-loads:
$ perf mem record -e ldlat-loads true
Committer notes:
Further testing:
# perf mem record -e ldlat-loads true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.020 MB perf.data (10 samples) ]
# perf evlist
cpu/mem-loads,ldlat=30/P
#
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1455525293-8671-6-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It will ease up configuration of memory events and addition of other
memory events in following patches.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1455525293-8671-5-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It'll be used in following patches.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1455525293-8671-4-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It'll be used in following patches.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1455525293-8671-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Users can pass options to tracepoints defined in the BPF script. For
example:
# perf record -e ./test.c/no-inherit/ bash
# dd if=/dev/zero of=/dev/null count=10000
# exit
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.022 MB perf.data (139 samples) ]
(no-inherit works, only the sys_read issued by bash are captured, at
least 10000 sys_read issued by dd are skipped.)
test.c:
#define SEC(NAME) __attribute__((section(NAME), used))
SEC("func=sys_read")
int bpf_func__sys_read(void *ctx)
{
return 1;
}
char _license[] SEC("license") = "GPL";
int _version SEC("version") = LINUX_VERSION_CODE;
no-inherit is applied to the kprobe event defined in test.c.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456132275-98875-10-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch introduces basic facilities to support config different slots
in a BPF map one by one.
array.nr_ranges and array.ranges are introduced into 'struct
parse_events_term', where ranges is an array of indices range (start,
length) which will be configured by this config term. nr_ranges is the
size of the array. The array is passed to 'struct bpf_map_priv'. To
indicate the new type of configuration, BPF_MAP_KEY_RANGES is added as a
new key type. bpf_map_config_foreach_key() is extended to iterate over
those indices instead of all possible keys.
Code in this commit will be enabled by following commit which enables
the indices syntax for array configuration.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456132275-98875-8-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
bpf__apply_obj_config() is introduced as the core API to apply object
config options to all BPF objects. This patch also does the real work
for setting values for BPF_MAP_TYPE_PERF_ARRAY maps by inserting value
stored in map's private field into the BPF map.
This patch is required because we are not always able to set all BPF
config during parsing. Further patch will set events created by perf to
BPF_MAP_TYPE_PERF_EVENT_ARRAY maps, which is not exist until
perf_evsel__open().
bpf_map_foreach_key() is introduced to iterate over each key needs to be
configured. This function would be extended to support more map types
and different key settings.
In perf record, before start recording, call bpf__apply_config() to turn
on all BPF config options.
Test result:
# cat ./test_bpf_map_1.c
/************************ BEGIN **************************/
#include <uapi/linux/bpf.h>
#define SEC(NAME) __attribute__((section(NAME), used))
struct bpf_map_def {
unsigned int type;
unsigned int key_size;
unsigned int value_size;
unsigned int max_entries;
};
static void *(*map_lookup_elem)(struct bpf_map_def *, void *) =
(void *)BPF_FUNC_map_lookup_elem;
static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
(void *)BPF_FUNC_trace_printk;
struct bpf_map_def SEC("maps") channel = {
.type = BPF_MAP_TYPE_ARRAY,
.key_size = sizeof(int),
.value_size = sizeof(int),
.max_entries = 1,
};
SEC("func=sys_nanosleep")
int func(void *ctx)
{
int key = 0;
char fmt[] = "%d\n";
int *pval = map_lookup_elem(&channel, &key);
if (!pval)
return 0;
trace_printk(fmt, sizeof(fmt), *pval);
return 0;
}
char _license[] SEC("license") = "GPL";
int _version SEC("version") = LINUX_VERSION_CODE;
/************************* END ***************************/
# echo "" > /sys/kernel/debug/tracing/trace
# ./perf record -e './test_bpf_map_1.c/map:channel.value=11/' usleep 10
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.012 MB perf.data ]
# cat /sys/kernel/debug/tracing/trace
# tracer: nop
#
# entries-in-buffer/entries-written: 1/1 #P:8
[SNIP]
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
usleep-18593 [007] d... 2394714.395539: : 11
# ./perf record -e './test_bpf_map_1.c/map:channel.value=101/' usleep 10
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.012 MB perf.data ]
# cat /sys/kernel/debug/tracing/trace
# tracer: nop
#
# entries-in-buffer/entries-written: 1/1 #P:8
[SNIP]
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
usleep-18593 [007] d... 2394714.395539: : 11
usleep-19000 [006] d... 2394831.057840: : 101
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456132275-98875-6-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch adds the final step for BPF map configuration. A new syntax
is appended into parser so user can config BPF objects through '/' '/'
enclosed config terms.
After this patch, following syntax is available:
# perf record -e ./test_bpf_map_1.c/map:channel.value=10/ ...
It would takes effect after appling following commits.
Test result:
# cat ./test_bpf_map_1.c
/************************ BEGIN **************************/
#include <uapi/linux/bpf.h>
#define SEC(NAME) __attribute__((section(NAME), used))
struct bpf_map_def {
unsigned int type;
unsigned int key_size;
unsigned int value_size;
unsigned int max_entries;
};
static void *(*map_lookup_elem)(struct bpf_map_def *, void *) =
(void *)BPF_FUNC_map_lookup_elem;
static int (*trace_printk)(const char *fmt, int fmt_size, ...) =
(void *)BPF_FUNC_trace_printk;
struct bpf_map_def SEC("maps") channel = {
.type = BPF_MAP_TYPE_ARRAY,
.key_size = sizeof(int),
.value_size = sizeof(int),
.max_entries = 1,
};
SEC("func=sys_nanosleep")
int func(void *ctx)
{
int key = 0;
char fmt[] = "%d\n";
int *pval = map_lookup_elem(&channel, &key);
if (!pval)
return 0;
trace_printk(fmt, sizeof(fmt), *pval);
return 0;
}
char _license[] SEC("license") = "GPL";
int _version SEC("version") = LINUX_VERSION_CODE;
/************************* END ***************************/
- Normal case:
# ./perf record -e './test_bpf_map_1.c/map:channel.value=10/' usleep 10
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.012 MB perf.data ]
- Error case:
# ./perf record -e './test_bpf_map_1.c/map:channel.value/' usleep 10
event syntax error: '..ps:channel:value/'
\___ Config value not set (missing '=')
Hint: Valid config term:
map:[<arraymap>]:value=[value]
(add -v to see detail)
Run 'perf list' for a list of valid events
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
-e, --event <event> event selector. use 'perf list' to list available events
# ./perf record -e './test_bpf_map_1.c/xmap:channel.value=10/' usleep 10
event syntax error: '..pf_map_1.c/xmap:channel.value=10/'
\___ Invalid object config option
[SNIP]
# ./perf record -e './test_bpf_map_1.c/map:xchannel.value=10/' usleep 10
event syntax error: '..p_1.c/map:xchannel.value=10/'
\___ Target map not exist
[SNIP]
# ./perf record -e './test_bpf_map_1.c/map:channel.xvalue=10/' usleep 10
event syntax error: '..ps:channel.xvalue=10/'
\___ Invalid object map config option
[SNIP]
# ./perf record -e './test_bpf_map_1.c/map:channel.value=x10/' usleep 10
event syntax error: '..nnel.value=x10/'
\___ Incorrect value type for map
[SNIP]
Change BPF_MAP_TYPE_ARRAY to '1' in test_bpf_map_1.c:
# ./perf record -e './test_bpf_map_1.c/map:channel.value=10/' usleep 10
event syntax error: '..ps:channel.value=10/'
\___ Can't use this config term to this type of map
Hint: Valid config term:
map:[<arraymap>].value=[value]
(add -v to see detail)
Signed-off-by: Wang Nan <wangnan0@huawei.com>
[for parser part]
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456132275-98875-5-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
bpf__config_obj() is introduced as a core API to config BPF object after
loading. One configuration option of maps is introduced. After this
patch BPF object can accept assignments like:
map:my_map.value=1234
(map.my_map.value looks pretty. However, there's a small but hard to fix
problem related to flex's greedy matching. Please see [1]. Choose ':'
to avoid it in a simpler way.)
This patch is more complex than the work it does because the
consideration of extension. In designing BPF map configuration, the
following things should be considered:
1. Array indices selection: perf should allow user setting different
value for different slots in an array, with syntax like:
map:my_map.value[0,3...6]=1234;
2. A map should be set by different config terms, each for a part
of it. For example, set each slot to the pid of a thread;
3. Type of value: integer is not the only valid value type. A perf
counter can also be put into a map after commit 35578d7984
("bpf: Implement function bpf_perf_event_read() that get the
selected hardware PMU counter")
4. For a hash table, it should be possible to use a string or other
value as a key;
5. It is possible that map configuration is unable to be setup
during parsing. A perf counter is an example.
Therefore, this patch does the following:
1. Instead of updating map element during parsing, this patch stores
map config options in 'struct bpf_map_priv'. Following patches
will apply those configs at an appropriate time;
2. Link map operations in a list so a map can have multiple config
terms attached, so different parts can be configured separately;
3. Make 'struct bpf_map_priv' extensible so that the following patches
can add new types of keys and operations;
4. Use bpf_obj_config__map_funcs array to support more map config options.
Since the patch changing the event parser to parse BPF object config is
relative large, I've put it in another commit. Code in this patch can be
tested after applying the next patch.
[1] http://lkml.kernel.org/g/564ED621.4050500@huawei.com
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1456132275-98875-4-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
[ Changes "maps:my_map.value" to "map:my_map.value", improved error messages ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The dynamic entry is created for each field in a tracepoint event.
Since they have no fixed hpp format index, it should skip when
perf_hpp__reset_width() is called.
This caused following assertion failure..
$ perf record -e sched:sched_switch -a sleep 1
$ perf report -s comm,next_pid --stdio
perf: ui/hist.c:651: perf_hpp__reset_width:
Assertion `!(fmt->idx >= PERF_HPP__MAX_INDEX)' failed.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1456064558-13086-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It missed to update column length of the 'trace' sort key in the
hists__calc_col_len() so it might truncate the output. It calculated
the column length in the ->cmp() callback originally but it doesn't
guarantee it's called always.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1456064558-13086-5-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The srcline, srcfile and trace sort keys can have long entries. With
commit 89fee70943 ("perf hists: Do column alignment on the format
iterator"), it now aligns output with hist_entry__snprintf_alignment().
So each (possibly long) sort entries don't need to do it themselves.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1456101153-14519-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Normally the hist entry's srcline and/or srcfile is set during sorting.
However sometime it's possible to a hist entry's srcline is not set yet
after the sorting. This is because the entry is so unique and other
sort keys already make it distinct. Then the srcline/file sort didn't
have a chance to be called during the sorting. In that case it has NULL
srcline/srcfile field and shows nothing.
Before:
$ perf report -s comm,sym,srcline
...
Overhead Command Symbol
-----------------------------------------------------------------
34.42% swapper [k] intel_idle intel_idle.c:0
2.44% perf [.] __poll_nocancel (null)
1.70% gnome-shell [k] fw_domains_get (null)
1.04% Xorg [k] sock_poll (null)
After:
34.42% swapper [k] intel_idle intel_idle.c:0
2.44% perf [.] __poll_nocancel .:0
1.70% gnome-shell [k] fw_domains_get fw_domains_get+42
1.04% Xorg [k] sock_poll socket.c:0
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1456101111-14400-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
A dynamic entry is created for each tracepoint event. When it sets up
the sort key, it checks with existing keys using ->equal() callback.
But it missed to set the ->equal for dynamic entries. The following
segfault was due to the missing ->equal() callback.
(gdb) bt
#0 0x0000000000140003 in ?? ()
#1 0x0000000000537769 in fmt_equal (b=0x2106980, a=0x21067a0) at ui/hist.c:548
#2 perf_hpp__setup_output_field (list=0x8c6d80 <perf_hpp_list>) at ui/hist.c:560
#3 0x00000000004e927e in setup_sorting (evlist=<optimized out>) at util/sort.c:2642
#4 0x000000000043cf50 in cmd_report (argc=<optimized out>, argv=<optimized out>, prefix=<optimized out>)
at builtin-report.c:932
#5 0x00000000004865a1 in run_builtin (p=p@entry=0x8bbce0 <commands+192>, argc=argc@entry=7,
argv=argv@entry=0x7ffd24d56ce0) at perf.c:390
#6 0x000000000042dc1f in handle_internal_command (argv=0x7ffd24d56ce0, argc=7) at perf.c:451
#7 run_argv (argv=0x7ffd24d56a70, argcp=0x7ffd24d56a7c) at perf.c:495
#8 main (argc=7, argv=0x7ffd24d56ce0) at perf.c:620
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1456064558-13086-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Older compilers don't like this, for instance, on RHEL6.7:
CC /tmp/build/perf/util/parse-events.o
util/parse-events.c:844: error: redefinition of typedef ‘config_term_func_t’
util/parse-events.c:353: note: previous declaration of ‘config_term_func_t’ was here
So remove the second definition, that should've been just moved in 43d0b97817
("perf tools: Enable config and setting names for legacy cache events"), not copied.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 43d0b97817 ("perf tools: Enable config and setting names for legacy cache events")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In RHEL 6.7:
CC /tmp/build/perf/util/parse-events.o
cc1: warnings being treated as errors
util/parse-events.c: In function ‘parse_events_add_cache’:
util/parse-events.c:366: error: declaration of ‘error’ shadows a global declaration
util/util.h:136: error: shadowed declaration is here
Rename it to 'err'.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 43d0b97817 ("perf tools: Enable config and setting names for legacy cache events")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If it returns an error, warn user and bail out instead of silently
ignoring it.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1455631723-17345-9-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently hists__collapse_resort() and hists__collapse_insert_entry()
don't return an error code. Now that callchain_merge() can check for
errors, abort and pass the error to the user. A later patch can add
more work which also can fail.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1455631723-17345-8-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now it can check the error case, so check and pass it to the caller.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1455631723-17345-7-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now create_child() and add_child() return errors so check and pass it
to the caller.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1455631723-17345-6-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The append_chain() might return either result of match_chain() or other
(error) code. But match_chain() can return any value in s64 type so
it's hard to check the error case. Add new enum match_result and make
match_chain() return non-negative values only so that we can check the
error cases.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1455631723-17345-5-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Memory allocation in the fill_node() can fail so change its return type
to int and check it in add_child() too.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1455631723-17345-4-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The create_child() in add_child() can return NULL in case of memory
allocation failure. So check the return value and bail out. The proper
error handling will be added later.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1455631723-17345-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently 'perf top --tui' decrements percentage of all entries on any
key press. This is because it adds total period as new samples are
added to hists. As perf-top does it currently but added samples are not
passed to the display thread, the percentages are decresing
continuously.
So separate total period stat into a different variable so that it
cannot affect the output total period. This new total period stats are
used only for calcualating callchain percent limit.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 0f58474ec8 ("perf hists: Update hists' total period when adding entries")
Link: http://lkml.kernel.org/r/1455631723-17345-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To remove duplicated code that differs only in using the matching
'/a,b,c/' part or NULL if no event configuration is done ('//' or no
pair of slashes at all).
Will be used by some new targets allowing the configuration of hardware
events, etc.
Lifted part of the 'opt_event_config' nonterminal from a patch by Wang
Nan.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/n/tip-e3xzpx9cqsmwnaguaxyw6r42@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Following commits will make more events obey /name=newname/ options.
This patch makes pmu_event_name() a generic helper.
Makes new get_config_name() accept NULL input to make life easier.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1455882283-79592-12-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
'perf stat' accepts some config terms but doesn't apply them. For
example:
# perf stat -e 'instructions/no-inherit/' -e 'instructions/inherit/' bash
# ls
# exit
Performance counter stats for 'bash':
266258061 instructions/no-inherit/
266258061 instructions/inherit/
1.402183915 seconds time elapsed
The result is confusing, because user may expect the first
'instructions' event exclude the 'ls' command.
This patch forbid most of these config terms for 'perf stat'.
Result:
# ./perf stat -e 'instructions/no-inherit/' -e 'instructions/inherit/' bash
event syntax error: 'instructions/no-inherit/'
\___ 'no-inherit' is not usable in 'perf stat'
...
We can add blocked config terms back when 'perf stat' really supports them.
This patch also removes unavailable config term from error message:
# ./perf stat -e 'instructions/badterm/' ls
event syntax error: 'instructions/badterm/'
\___ unknown term
valid terms: config,config1,config2,name
# ./perf stat -e 'cpu/badterm/' ls
event syntax error: 'cpu/badterm/'
\___ unknown term
valid terms: pc,any,inv,edge,cmask,event,in_tx,ldlat,umask,in_tx_cp,offcore_rsp,config,config1,config2,name
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1455882283-79592-11-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
config_term_names[] is introduced for future commits which will be able
to retrieve the config name through the config term.
Utilize this array in parse_events_formats_error_string() so the missing
'{,no-}inherit' terms are added.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1455882283-79592-10-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
According to man pages, asprintf returns -1 when failure. This patch
fixes two incorrect return value checker.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Cc: stable@vger.kernel.org # v4.4+
Fixes: ffeb883e56 ("perf tools: Show proper error message for wrong terms of hw/sw events")
Link: http://lkml.kernel.org/r/1455882283-79592-5-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The name of bpf_prog_priv__clear() doesn't follow perf's naming
convention. bpf_prog_priv__delete() seems to be a better name. However,
bpf_prog_priv__delete() should be a method of 'struct bpf_prog_priv',
but its first parameter is 'struct bpf_program'.
It is callback from libbpf to clear priv structures when destroying a
bpf program. It is actually a method of bpf_program (libbpf object), but
bpf_program__ functions should be provided by libbpf.
This patch removes the prefix of that function.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1455882283-79592-4-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo pointed out that the earlier cb110f4710 ("perf stat: Move
noise/running printing into printout") change changed behavior for not
counted counters. This patch fixes it again.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Fixes: cb110f4710 ("perf stat: Move noise/running printing into printout")
Link: http://lkml.kernel.org/r/1455749045-18098-2-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Using 4 kHz is not necessary and sometimes is more than what was
auto-tuned:
# dmesg | grep max_sample_rate | tail -2
[ 2499.144373] perf interrupt took too long (2501 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
[ 3592.413606] perf interrupt took too long (5069 > 5000), lowering kernel.perf_event_max_sample_rate to 25000
Simulating a auto-tune of 2000 we make the test fail, as reported
by Steven Noonan for one of his machines, so reduce it to 500 HZ,
it is enough to get a good number of samples for this test:
# perf test -v 21 2>&1 | grep '^Reading object code for memory address' | tee /tmp/out | tail -5
Reading object code for memory address: 0x479f40
Reading object code for memory address: 0x7f29b7eea80d
Reading object code for memory address: 0x7f29b7eea80d
Reading object code for memory address: 0x7f29b7eea800
Reading object code for memory address: 0xffffffff813b2f23
[root@jouet ~]# wc -l /tmp/out
40 /tmp/out
[root@jouet ~]#
For systems that auto-tune below that, the previous patches will tell the
user what is happening so that he may either ignore the result of this test or
bump /proc/sys/kernel/perf_event_max_sample_rate.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Steven Noonan <steven@uplinklabs.net>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-6kufyy1iprdfzrbtuqgxir70@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Before:
# perf test -v "code reading" 2>&1 | tail -4
perf_evlist__open failed
test child finished with -1
---- end ----
Test object code reading: FAILED!
#
After:
# perf test -v "code reading" 2>&1 | tail -7
perf_evlist__open() failed!
Error: Invalid argument.
Hint: Check /proc/sys/kernel/perf_event_max_sample_rate.
Hint: The current value is 1000 and 4000 is being requested.
test child finished with -1
---- end ----
Test object code reading: FAILED!
#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Steven Noonan <steven@uplinklabs.net>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ifbx7vmrc38loe6317owz2jx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When running the "code reading" test we get:
# perf test -v "code reading" 2>&1 | tail -5
Parsing event 'cycles:u'
perf_evlist__open failed
test child finished with -1
---- end ----
Test object code reading: FAILED!
#
And with -vv we get the errno value, -22, i.e. -EINVAL, but we can do
better and handle the case at hand, with this patch it becomes:
# perf test -v "code reading" 2>&1 | tail -7
perf_evlist__open() failed!
Error: Invalid argument.
Hint: Check /proc/sys/kernel/perf_event_max_sample_rate.
Hint: The current value is 1000 and 4000 is being requested.
test child finished with -1
---- end ----
Test object code reading: FAILED!
#
Next patch will make this 'perf test' entry to use perf_evlist__strerror()
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Steven Noonan <steven@uplinklabs.net>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-i31ai6kfefn75eapejjokfhc@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Allow user to easily switch all events to user or kernel space with simple
--all-user or --all-kernel options.
This will be handy within perf mem/c2c wrappers to switch easily monitoring
modes.
Committer note:
Testing it:
# perf record --all-kernel --all-user -a sleep 2
Error: option `all-user' cannot be used with all-kernel
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
--all-user Configure all used events to run in user space.
--all-kernel Configure all used events to run in kernel space.
# perf record --all-user --all-kernel -a sleep 2
Error: option `all-kernel' cannot be used with all-user
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
--all-kernel Configure all used events to run in kernel space.
--all-user Configure all used events to run in user space.
# perf record --all-user -a sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.416 MB perf.data (162 samples) ]
# perf report | grep '\[k\]'
# perf record --all-kernel -a sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.423 MB perf.data (296 samples) ]
# perf report | grep '\[\.\]'
#
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1455525293-8671-2-git-send-email-jolsa@kernel.org
[ Made those options to be mutually exclusive ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We were dropping the reference we possibly held but not obtaining one
for the new maps, which we will drop at perf_evlist__delete(), fix it.
This was caught by Steven Noonan in some of the machines which would
produce this output when caught by glibc debug mechanisms:
$ sudo perf test 21
21: Test object code reading :***
Error in `perf': corrupted double-linked list: 0x00000000023ffcd0 ***
======= Backtrace: =========
/usr/lib/libc.so.6(+0x72055)[0x7f25be0f3055]
/usr/lib/libc.so.6(+0x779b6)[0x7f25be0f89b6]
/usr/lib/libc.so.6(+0x7a0ed)[0x7f25be0fb0ed]
/usr/lib/libc.so.6(__libc_calloc+0xba)[0x7f25be0fceda]
perf(parse_events_lex_init_extra+0x38)[0x4cfff8]
perf(parse_events+0x55)[0x4a0615]
perf(perf_evlist__config+0xcf)[0x4eeb2f]
perf[0x479f82]
perf(test__code_reading+0x1e)[0x47ad4e]
perf(cmd_test+0x5dd)[0x46452d]
perf[0x47f4e3]
perf(main+0x603)[0x42c723]
/usr/lib/libc.so.6(__libc_start_main+0xf0)[0x7f25be0a1610]
perf(_start+0x29)[0x42c859]
Further investigation using valgrind led to the reference count imbalance fixed
in this patch.
Reported-and-Tested-by: Steven Noonan <steven@uplinklabs.net>
Report-Link: http://lkml.kernel.org/r/CAKbGBLjC2Dx5vshxyGmQkcD+VwiAQLbHoXA9i7kvRB2-2opHZQ@mail.gmail.com
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: f30a79b012 ("perf tools: Add reference counting for cpu_map object")
Link: http://lkml.kernel.org/n/tip-j0u1bdhr47sa511sgg76kb8h@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move the running/noise printing into printout to avoid duplicated code
in the callers.
v2: Merged with other patches. Remove unnecessary hunk.
Readd hunk that ended in earlier patch.
v3: Fix noise/running output in CSV mode
v4: Merge with later patch that also moves not supported printing.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1454173616-17710-4-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now that we can modify the metrics printout functions easily, it's
straight forward to support metric printing for interval mode. All that
is needed is to print the time stamp on every new line. Pass the prefix
into the context and print it out.
v2: Move wrong hunk to here.
Committer note:
Before:
[root@jouet ~]# perf stat -I 1000 -e instructions,cycles sleep 1
# time counts unit events
1.000168216 538,913 instructions
1.000168216 748,765 cycles
1.000660048 153,741 instructions
1.000660048 214,066 cycles
After:
# perf stat -I 1000 -e instructions,cycles sleep 1
# time counts unit events
1.000215928 519,620 instructions # 0.69 insn per cycle
1.000215928 752,003 cycles
1.000946033 148,502 instructions # 0.33 insn per cycle
1.000946033 160,104 cycles
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1454173616-17710-3-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Abstract the printing of shadow metrics. Instead of every metric calling
fprintf directly and taking care of indentation, use two call backs: one
to print metrics and another to start a new line.
This will allow adding metrics to CSV mode and also using them for other
purposes.
The computation of padding is now done in the central callback, instead
of every metric doing it manually. This makes it easier to add new
metrics.
v2: Refactor functions, printout now does more. Move
shadow printing. Improve fallback callbacks. Don't
use void * callback data.
v3: Remove unnecessary hunk. Add typedef for new_line
v4: Remove unnecessary hunk. Don't print metrics for CSV/interval
mode yet. Move printout change to separate patch.
v5: Fix bisect bugs. Avoid bogus frontend cycles printing.
Fix indentation in different aggregation modes.
v6: Delay newline handling
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1454173616-17710-2-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Setting libapi debug output functions to use perf functions.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1455465826-8426-5-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adhering to the naming convention used when va_args is in a printf like
function, e.g. stdio.h.
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-b5l3wt77ct28dcnriguxtvn6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We already moved similar functions in here, also it'll be useful for
sysfs__read_str addition in following patch.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1455465826-8426-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch modifies the jvmti makefile to check if the
/usr/sbin/java-update-alternatives utility is present. If so, then use
it, if not then use the altenatives command.
This helps handle the difference between Ubuntu and Fedora Linux
distributions.
Signed-off-by: Stephane Eranian <eranian@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1455604661-9357-1-git-send-email-eranian@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
fixing the following problems, for instance, on RHEL6.7:
CC /tmp/build/perf/tests/bp_signal.o
cc1: warnings being treated as errors
tests/bp_signal.c: In function ‘__event’:
tests/bp_signal.c:106: error: declaration of ‘signal’ shadows a global declaration
/usr/include/signal.h:101: error: shadowed declaration is here
tests/bp_signal.c: In function ‘bp_event’:
tests/bp_signal.c:144: error: declaration of ‘signal’ shadows a global declaration
/usr/include/signal.h:101: error: shadowed declaration is here
tests/bp_signal.c: In function ‘wp_event’:
tests/bp_signal.c:149: error: declaration of ‘signal’ shadows a global declaration
/usr/include/signal.h:101: error: shadowed declaration is here
mv: cannot stat `/tmp/build/perf/tests/.bp_signal.o.tmp': No such file or directory
make[3]: *** [/tmp/build/perf/tests/bp_signal.o] Error 1
make[2]: *** [tests] Error 2
make[1]: *** [/tmp/build/perf/perf-in.o] Error 2
make[1]: *** Waiting for unfinished jobs....
Reported-by: Vinson Lee <vlee@freedesktop.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: He Kuang <hekuang@huawei.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: pi3orama@163.com
Fixes: 8fd34e1cce ("perf test: Improve bp_signal")
Link: http://lkml.kernel.org/n/tip-wlpx6tik1b0jirlkw64bv400@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
A new patch in libbabeltrace [1] reveals a object leak problem in
'perf data' CTF support: perf code never releases the event_class
which is allocated in add_event() and stored in evsel's private field.
If libbabeltrace has the above patch applied, leaking event_class
prevents the writer from being destroyed and flushing metadata. For
example:
$ perf record ls
perf.data
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.012 MB perf.data (12 samples) ]
$ perf data convert --to-ctf ./out.ctf
[ perf data convert: Converted 'perf.data' into CTF data './out.ctf' ]
[ perf data convert: Converted and wrote 0.000 MB (12 samples) ]
$ cat ./out.ctf/metadata
$ ls -l ./out.ctf/metadata
-rw-r----- 1 w00229757 mm 0 Jan 27 10:49 ./out.ctf/metadata
The correct result should be:
...
$ cat ./out.ctf/metadata
/* CTF 1.8 */
trace {
[SNIP]
$ ls -l ./out.ctf/metadata
-rw-r----- 1 w00229757 mm 2446 Jan 27 10:52 ./out.ctf/metadata
The full story is:
Patch [1] of babeltrace redesigns its reference counting scheme. In that
patch:
* writer <- trace (bt_ctf_writer_create)
* trace <- stream_class (bt_ctf_trace_add_stream_class)
* stream_class <- event_class (bt_ctf_stream_class_add_event_class)
('<-' means 'is a parent of')
Holding of event_class causes reference count of corresponding 'writer'
to increase through parent chain. Perf expects that 'writer' is released
(so metadata is flushed) through bt_ctf_writer_put() in
ctf_writer__cleanup(). However, since it never releases event_class, the
reference of 'writer' won't be dropped, so bt_ctf_writer_put() won't
lead to the release of writer.
Before this CTF patch, !(writer <- trace). Even with event_class leaking,
the writer ends up being released.
[1] e6a8e8e474
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1454680939-24963-6-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To follow convention used in other tools/perf/ areas. Also remove the
need to check if it is NULL before calling the destructor, again, to
follow convention that goes back to free().
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/n/tip-w6owu7rb8a46gvunlinxaqwx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fixing a leak, since code calling parse_events__free_terms() expect it
to free the list_head too.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
[ Spun off from another patch ]
Link: http://lkml.kernel.org/r/1454680939-24963-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In these two cases, a 'perf test' entry and in the PMU code the
list_head is on the stack, so we can't use perf_event__free_terms()
(soon to be renamed to perf_event_terms__delete()), because it will
free the list_head as well.
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/n/tip-i956ryjhz97gnnqe8iqe7m7s@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Purges 'struct parse_event_term' entries from a list_head.
Some users need this because they don't allocate space for the list
head, it maybe on the stack or embedded into some other struct.
Next patch will convert users that need just purging and then the
perf_events__free_terms() routine will free the list head as well,
finally being renamed to perf_events_terms__delete().
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/n/tip-4w3zl4ifcl0ed0j4bu3tckqp@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We were just freeing them, better unlink and init its nodes to catch
bugs faster if we keep dangling references to them.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
[ Spun off from another patch, use list_del_init() instead of list_del() ]
Link: http://lkml.kernel.org/r/1454680939-24963-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We were doing column alignment in the format function for each cell,
returning a string padded with spaces so that when the next column is
printed the cursor is at its column alignment.
This ends up needlessly printing trailing spaces, do it at the format
iterator, that is where we know if it is needed, i.e. if there is more
columns to be printed.
This eliminates the need for triming lines when doing a dump using 'P'
in the TUI browser and also produces far saner results with things like
piping 'perf report' to 'less'.
Right now only the formatters for sym->name and the 'locked' column
(perf mem report), that are the ones that end up at the end of lines
in the default 'perf report', 'perf top' and 'perf mem report' tools,
the others will be done in a subsequent patch.
In the end the 'width' parameter for the formatters now mean, in
'printf' terms, the 'precision', where before it was the field 'width'.
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/n/tip-s7iwl2gj23w92l6tibnrcqzr@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-4j67nvlfwbnkg85b969ewnkr@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To print syscall names, the audit-libs-python package is required.. If
not installed, it prints this error string:
# perf script syscall-counts
Install the audit-libs-python package to get syscall names.
But the package name is different in Ubuntu, mention that in the error
message, similar to a error message of util/trace-event-scripting.c:
# perf script syscall-counts
Install the audit-libs-python package to get syscall names.
For example:
# apt-get install python-audit (Ubuntu)
# yum install audit-libs-python (Fedora)
etc.
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1455018790-13425-1-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To compile for little-endian systems, you need to pass -EL to CC and LD.
EXTRA_CFLAGS works to pass -EL to CC.
Add EXTRA_LDFLAGS to pass -EL to LD.
Signed-off-by: Zubair Lutfullah Kakakhel <Zubair.Kakakhel@imgtec.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1455024818-15842-1-git-send-email-Zubair.Kakakhel@imgtec.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Before this patch, if a sample is triggered inside a module not in
/lib/modules/`uname -r`/, even if the module is in buildid-cache, 'perf
report' will still be unable to find the correct symbol. For example:
# rm -rf ~/.debug/
# perf buildid-cache -a ./mymodule.ko
# perf probe -m ./mymodule.ko -a get_mymodule_val
Added new event:
probe:get_mymodule_val (on get_mymodule_val in mymodule)
You can now use it in all perf tools, such as:
perf record -e probe:get_mymodule_val -aR sleep 1
# perf record -e probe:get_mymodule_val cat /proc/mymodule
mymodule:3
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.011 MB perf.data (1 samples) ]
# perf report --stdio
[SNIP]
#
# Overhead Command Shared Object Symbol
# ........ ....... ................ ......................
#
100.00% cat [mymodule] [k] 0x0000000000000001
# perf report -vvvv --stdio
dso__load_sym: adjusting symbol: st_value: 0 sh_addr: 0 sh_offset: 0x70
symbol__new: get_mymodule_val 0x70-0x8a
[SNIP]
This is caused by dso__load() -> dso__load_sym(). In dso__load(), kmod
is true only when its file is found in some well know directories. All
files loaded from buildid-cache are treated as user programs. Following
dso__load_sym() set map->pgoff incorrectly.
This patch gives kernel modules in buildid-cache a chance to adjust
value of kmod. After dso__load() get the type of symbols, if it is
buildid, check the last 3 chars of original filename against '.ko', and
adjust the value of kmod if the file is a kernel module.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1454680939-24963-3-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The '--system' option means $(sysconfdir)/perfconfig and '--user' means
$HOME/.perfconfig. If none is used, both system and user config file are
read. E.g.:
# perf config [<file-option>] [options]
With an specific config file:
# perf config --user | --system
or both user and system config file:
# perf config
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1455126685-32367-2-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch adds source line information support to perf for jitted code.
The source line info must be emitted by the runtime, such as JVMTI.
Perf injects extract the source line info from the jitdump file and adds
the corresponding .debug_lines section in the ELF image generated for
each jitted function.
The source line enables matching any address in the profile with a
source file and line number.
The improvement is visible in perf annotate with the source code
displayed alongside the assembly code.
The dwarf code leverages the support from OProfile which is also
released under GPLv2. Copyright 2007 OProfile authors.
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Carl Love <cel@us.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John McCutchan <johnmccutchan@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sonny Rao <sonnyrao@chromium.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1448874143-7269-5-git-send-email-eranian@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This is a standalone JVMTI library to help profile Java jitted code with perf
record/perf report. The library is not installed or compiled automatically by
perf Makefile. It is not used directly by perf. It is arch agnostic and has
been tested on X86 and ARM. It needs to be used with a Java runtime, such as
OpenJDK, as follows:
$ java -agentpath:libjvmti.so .......
See the "Committer Notes" below on how to build it.
When used this way, java will generate a jitdump binary file in
$HOME/.debug/java/jit/java-jit-*
This binary dump file contains information to help symbolize and
annotate jitted code.
The jitdump information must be injected into the perf.data file
using:
$ perf inject --jit -i perf.data -o perf.data.jitted
This injects the MMAP records to cover the jitted code and also generates
one ELF image for each jitted function. The ELF images are created in the
same subdir as the jitdump file. The MMAP records point there too.
Then, to visualize the function or asm profile, simply use the regular
perf commands:
$ perf report -i perf.data.jitted
or
$ perf annotate -i perf.data.jitted
JVMTI agent code adapted from the OProfile's opagent code.
This version of the JVMTI agent is using the CLOCK_MONOTONIC as the time
source to timestamp jit samples. To correlate with perf_events samples,
it needs to run on kernel 4.0.0-rc5+ or later with the following commit
from Peter Zijlstra:
34f439278c ("perf: Add per event clockid support")
With this patch recording jitted code is done as follows:
$ perf record -k mono -- java -agentpath:libjvmti.so .......
--------------------------------------------------------------------------
Committer Notes:
Extended testing instructions:
$ cd tools/perf/jvmti/
$ dnf install java-devel
$ make
Then, create some simple java stuff to record some samples:
$ cat hello.java
public class hello {
public static void main(String[] args) {
System.out.println("Hello, World");
}
}
$ javac hello.java
$ java hello
Hello, World
$
And then record it using this jvmti thing:
$ perf record -k mono java -agentpath:/home/acme/git/linux/tools/perf/jvmti/libjvmti.so hello
java: jvmti: jitdump in /home/acme/.debug/jit/java-jit-20160205.XXWIEDls/jit-1908.dump
Hello, World
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.030 MB perf.data (268 samples) ]
$
Now lets insert the PERF_RECORD_MMAP2 records to point jitted mmaps to
files created by the agent:
$ perf inject --jit -i perf.data -o perf.data.jitted
And finally see that it did its job:
$ perf report -D -i perf.data.jitted | grep PERF_RECORD_MMAP2 | tail -5
79197149129422 0xfe10 [0xa0]: PERF_RECORD_MMAP2 1908/1923: [0x7f172428bd60(0x80) @ 0x40 fd:02 1840554 1]: --xs /home/acme/.debug/jit/java-jit-20160205.XXWIEDls/jitted-1908-283.so
79197149235701 0xfeb0 [0xa0]: PERF_RECORD_MMAP2 1908/1923: [0x7f172428ba60(0x180) @ 0x40 fd:02 1840555 1]: --xs /home/acme/.debug/jit/java-jit-20160205.XXWIEDls/jitted-1908-284.so
79197149250558 0xff50 [0xa0]: PERF_RECORD_MMAP2 1908/1923: [0x7f172428b860(0x180) @ 0x40 fd:02 1840556 1]: --xs /home/acme/.debug/jit/java-jit-20160205.XXWIEDls/jitted-1908-285.so
79197149714746 0xfff0 [0xa0]: PERF_RECORD_MMAP2 1908/1923: [0x7f172428b660(0x180) @ 0x40 fd:02 1840557 1]: --xs /home/acme/.debug/jit/java-jit-20160205.XXWIEDls/jitted-1908-286.so
79197149806558 0x10090 [0xa0]: PERF_RECORD_MMAP2 1908/1923: [0x7f172428b460(0x180) @ 0x40 fd:02 1840558 1]: --xs /home/acme/.debug/jit/java-jit-20160205.XXWIEDls/jitted-1908-287.so
$
So:
$ perf report -D -i perf.data | grep PERF_RECORD_MMAP2 | wc -l
Failed to open /tmp/perf-1908.map, continuing without symbols
21
$ perf report -D -i perf.data.jitted | grep PERF_RECORD_MMAP2 | wc -l
307
$ echo $((307 - 21))
286
$
286 extra PERF_RECORD_MMAP2 records.
All for thise tiny, with just one function, ELF files:
$ file /home/acme/.debug/jit/java-jit-20160205.XXWIEDls/jitted-1908-9.so
/home/acme/.debug/jit/java-jit-20160205.XXWIEDls/jitted-1908-9.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), corrupted program header size, BuildID[sha1]=ae54a2ebc3ecf0ba547bfc8cabdea1519df5203f, not stripped
$ readelf -sw /home/acme/.debug/jit/java-jit-20160205.XXWIEDls/jitted-1908-9.so
Symbol table '.symtab' contains 2 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 0000000000000040 9 FUNC LOCAL DEFAULT 1 atomic_cmpxchg_long
$
Inserted into the build-id cache:
$ ls -la ~/.debug/.build-id/ae/54a2ebc3ecf0ba547bfc8cabdea1519df5203f
lrwxrwxrwx. 1 acme acme 111 Feb 5 11:30 /home/acme/.debug/.build-id/ae/54a2ebc3ecf0ba547bfc8cabdea1519df5203f -> ../../home/acme/.debug/jit/java-jit-20160205.XXWIEDls/jitted-1908-9.so/ae54a2ebc3ecf0ba547bfc8cabdea1519df5203f
Note: check why 'file' reports that 'corrupted program header size'.
With a stupid java hog to do some profiling:
$ cat hog.java
public class hog {
private static double do_something_else(int i) {
double total = 0;
while (i > 0) {
total += Math.log(i--);
}
return total;
}
private static double do_something(int i) {
double total = 0;
while (i > 0) {
total += Math.sqrt(i--) + do_something_else(i / 100);
}
return total;
}
public static void main(String[] args) {
System.out.println(String.format("%s=%f & %f", args[0],
do_something(Integer.parseInt(args[0])),
do_something_else(Integer.parseInt(args[1]))));
}
}
$ javac hog.java
$ perf record -F 10000 -g -k mono java -agentpath:/home/acme/git/linux/tools/perf/jvmti/libjvmti.so hog 100000 2345000
java: jvmti: jitdump in /home/acme/.debug/jit/java-jit-20160205.XX4sqd14/jit-8670.dump
100000=291561592.669602 & 32050989.778714
[ perf record: Woken up 6 times to write data ]
[ perf record: Captured and wrote 1.536 MB perf.data (12538 samples) ]
$ perf inject --jit -i perf.data -o perf.data.jitted
Looking at the 'perf report' TUI, at one expanded callchain leading
to the jitted code:
$ perf report --no-children -i perf.data.jitted
Samples: 12K of event 'cycles:pp', Event count (approx.): 3829569932
Overhead Comm Shared Object Symbol
- 93.38% java jitted-8670-291.so [.] class hog.do_something_else(int)
class hog.do_something_else(int)
- Interpreter
- 75.86% call_stub
JavaCalls::call_helper
jni_invoke_static
jni_CallStaticVoidMethod
JavaMain
start_thread
- 17.52% JavaCalls::call_helper
jni_invoke_static
jni_CallStaticVoidMethod
JavaMain
start_thread
Signed-off-by: Stephane Eranian <eranian@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Carl Love <cel@us.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John McCutchan <johnmccutchan@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sonny Rao <sonnyrao@chromium.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1448874143-7269-4-git-send-email-eranian@google.com
[ Made it build on fedora23, added some build/usage instructions ]
[ Check if filename != NULL in compiled_method_load_cb, fixing segfault ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch adds a --jit/-j option to perf inject.
This options injects MMAP records into the perf.data file to cover the
jitted code mmaps. It also emits ELF images for each function in the
jidump file. Those images are created where the jitdump file is. The
MMAP records point to that location as well.
Typical flow:
$ perf record -k mono -- java -agentpath:libpjvmti.so java_class
$ perf inject --jit -i perf.data -o perf.data.jitted
$ perf report -i perf.data.jitted
Note that jitdump.h support is not limited to Java, it works with any
jitted environment modified to emit the jitdump file format, include
those where code can be jitted multiple times and moved around.
The jitdump.h format is adapted from the Oprofile project.
The genelf.c (ELF binary generation) depends on MD5 hash encoding for
the buildid. To enable this, libssl-dev must be installed. If not, then
genelf.c defaults to using urandom to generate the buildid, which is not
ideal. The Makefile auto-detects the presence on libssl-dev.
This version mmaps the jitdump file to create a marker MMAP record in
the perf.data file. The marker is used to detect jitdump and cause perf
inject to inject the jitted mmaps and generate ELF images for jitted
functions.
In V8, the following fixes and changes were made among other things:
- the jidump header format include a new flags field to be used
to carry information about the configuration of the runtime agent.
Contributed by: Adrian Hunter <adrian.hunter@intel.com>
- Fix mmap pgoff: MMAP event pgoff must be the offset within the ELF file
at which the code resides.
Contributed by: Adrian Hunter <adrian.hunter@intel.com>
- Fix ELF virtual addresses: perf tools expect the ELF virtual addresses of dynamic
objects to match the file offset.
Contributed by: Adrian Hunter <adrian.hunter@intel.com>
- JIT MMAP injection does not obey finished_round semantics. JIT MMAP injection injects all
MMAP events in one go, so it does not obey finished_round semantics, so drop the
finished_round events from the output perf.data file.
Contributed by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Carl Love <cel@us.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John McCutchan <johnmccutchan@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sonny Rao <sonnyrao@chromium.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1448874143-7269-3-git-send-email-eranian@google.com
[ Moved inject.build_ids ordering bits to a separate patch, fixed the NO_LIBELF=1 build ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To make sure the mmap records are ordered correctly and so that the
correct especially due to jitted code mmaps.
We cannot generate the buildid hit list and inject the jit mmaps (will
come right after this patch) in at the same time for now.
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Carl Love <cel@us.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John McCutchan <johnmccutchan@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sonny Rao <sonnyrao@chromium.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1448874143-7269-3-git-send-email-eranian@google.com
[ Carved out from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Will be used to generate build-ids in the jitdump code.
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Carl Love <cel@us.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John McCutchan <johnmccutchan@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sonny Rao <sonnyrao@chromium.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1448874143-7269-3-git-send-email-eranian@google.com
[ tools/perf/Makefile.perf comment about NO_LIBCRYPTO and added it to tests/make ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add Java function descriptor demangling support. Something bfd cannot
do.
Use the JAVA_DEMANGLE_NORET flag to avoid decoding the return type of
functions.
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Carl Love <cel@us.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John McCutchan <johnmccutchan@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sonny Rao <sonnyrao@chromium.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1448874143-7269-2-git-send-email-eranian@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Steam frequently puts game binaries in folders with spaces.
Note: "(deleted)" markers are now treated as part of the file name.
Signed-off-by: Marcin Ślusarz <marcin.slusarz@gmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Fixes: 6064803313 ("perf tools: Use sscanf for parsing /proc/pid/maps")
Link: http://lkml.kernel.org/r/20160119190303.GA17579@marcin-Inspiron-7720
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-jhmnf9g7y9ryqcjql00unk5y@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Do not parallelize 'clean' with other targets, figure out if it is
present and do it first, then the other targets.
Noticed with:
tools/perf> make -j24 clean all
LD arch/libperf-in.o
LD plugin_xen-in.o
arch//libperf-in.o: file not recognized: File truncated
make[3]: *** [arch/libperf-in.o] Error 1
make[2]: *** [arch] Error 2
make[2]: *** Waiting for unfinished jobs....
AR libapi.a
Reported-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-kb0qs29zbz7hxn32mc5zbsoz@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Explain 'man.viewer' variable and how to add new man viewer tools.
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1454577913-16401-6-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Explain 'report' section's variables:
'percent-limit', 'queue-size' and 'children'.
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1454577913-16401-4-git-send-email-treeze.taeung@gmail.com
[ Fix some grammar issues, add some more info ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Explain 'call-graph' section and its variables:
'record-mode', 'dump-size', 'print-type', 'order', 'sort-key',
'threshold' and 'print-limit'.
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1454577913-16401-3-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This option controls display of column headers (like 'Overhead' and
'Symbol') in 'report' and 'top'. If this option is false, they are
hidden. This option is only applied to TUI.
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1454577913-16401-2-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
So that we do less visual searching on the 'make build-test' output to
see the feature related variables:
After:
$ make -C tools/perf build-test
<SNIP>
make_no_newt_O: cd . && make NO_NEWT=1 FEATURES_DUMP=/home/acme/git/linux/tools/perf/BUILD_TEST_FEATURE_DUMP O=/tmp/tmp.dz55IX DESTDIR=/tmp/tmp.X29xxo
make_tags_O: cd . && make tags FEATURES_DUMP=/home/acme/git/linux/tools/perf/BUILD_TEST_FEATURE_DUMP O=/tmp/tmp.6ecLh8 DESTDIR=/tmp/tmp.6vIla578Ho
make_util_pmu_bison_o_O: cd . && make util/pmu-bison.o FEATURES_DUMP=/home/acme/git/linux/tools/perf/BUILD_TEST_FEATURE_DUMP O=/tmp/tmp.SVPM2G DESTDIR=/tmp/tmp.C0oAam
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-dx4krgzqa566v1pedrbrcchi@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Since this is the name that 'make' will look for if no explicit -f file
is passed.
This in turn makes the output of 'build-test' more compact:
Before:
$ perf stat make -C tools/perf build-test
<SNIP>
cd . && make FEATURE_DUMP_COPY=/home/acme/git/linux/tools/perf/BUILD_TEST_FEATURE_DUMP feature-dump
make_no_libaudit_O: cd . && make -f Makefile O=/tmp/tmp.tHIa0Kkk2Y DESTDIR=/tmp/tmp.foK7rckkVi NO_LIBAUDIT=1 FEATURES_DUMP=/home/acme/git/linux/tools/perf/BUILD_TEST_FEATURE_DUMP
<SNIP>
After:
$ perf stat make -C tools/perf build-test
<SNIP>
make_no_libaudit_O: cd . && make O=/tmp/tmp.tHIa0Kkk2Y DESTDIR=/tmp/tmp.foK7rckkVi NO_LIBAUDIT=1 FEATURES_DUMP=/home/acme/git/linux/tools/perf/BUILD_TEST_FEATURE_DUMP
<SNIP>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-m440lb8dkfsywsyah0htif6t@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
New features:
- Add 'L' hotkey to dynamicly set the percent threshold for histogram
entries and callchains, i.e. dynamicly do what the --percent-limit
command line option to 'top' and 'report' does. (Namhyung Kim)
Infrastructure:
- Per hists field and sort lists, that will be used, for instance,
in the c2c tool (Jiri Olsa)
Documentation:
- Update documentation of --sort and --perf-limit options
for 'perf report' (Namhyung Kim)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWsiK5AAoJENZQFvNTUqpADJoQAIVBlW29iw+qshg0bWM+1xw/
RBKgTCLT2VPK9wE00qIQOmZjf4ZRHLWDBVDf8iB36UMhY4q4vUaWYRJu8dZ/IoPx
AQPg5JFe8INd2voDHo6E8vCdS58Mk7NWRdWs9dctVHGa7LjdDdLhYNLcghIps8Pb
4cSLltmaa/XGxBvu526g6TRSaoaCIVO1W0wxNvoc3XFipI7zqtgCasLr5Hb6RxdD
BUC7WYuJ5n96EqR0vjeV7ajv9m65+RaLFIIa6vjUIEdAB+3sDIpfiMpkIprAzoze
TcsGpSz5+5yAbnezgIoHaRvFWzd/hune62sZmCySd1E3+5Vb8JbUo4WxNTAlTwdS
XbwY1VyLYDnKFVqQSAElFbeXad8kbkd06tzLgUOTfoeHWKMp5qxfH/myLNjF42pK
d5mQ5TnN/ESsj3D/3TcgBzmIOGIS5R2evKsIIwip4rSGAP901U+RRBMDE87vn0M2
TsqrkKPZgQcYTA9FKlWIPCdT0Y+yYr0jvgRabN9rQMCp5r7Wb7hrpwHGMPp16hbT
1mnMm8N+moIMv1Wp0R5C75bls2uFaOiS3Y7XyORjqfwHMfzLUqPWPSmDwtSrbE8a
FTDGInJubTJRXdjTE/0+d+zZbpiEqTybfjOce3AcB2KBMfSJaQRvLcvAei21GMWC
hrBhLY6ngbgd7jD1RTHg
=foQI
-----END PGP SIGNATURE-----
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
New features:
- Add 'L' hotkey to dynamicly set the percent threshold for histogram
entries and callchains, i.e. dynamicly do what the --percent-limit
command line option to 'top' and 'report' does. (Namhyung Kim)
Infrastructure changes:
- Per hists field and sort lists, that will be used, for instance,
in the c2c tool (Jiri Olsa)
Documentation changes:
- Update documentation of --sort and --perf-limit options
for 'perf report' (Namhyung Kim)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
- tracepoint_error() can receive e=NULL, robustify it, fixes a problem noticed
with a very specific combination: Machine with Intel PT (e.g. Broadwell),
kernel with no perf_event_attr.context_switch feature (e.g. 4.2) and unreadable
tracefs (for instance !root users), making the fallback from
perf_event_attr.context_switch to the sched:sched_switch tracepoint to fail
reading its info from tracefs, fix it. (Adrian Hunter)
- Fix segfault in intel pt, by making it follow the 'struct thread' lifetime cycle
checking expectations, noticed for instance, when processing perf.data files with
Intel PT data using 'perf script' and when exiting 'perf report' (Adrian Hunter)
- Fix CFI usage from .eh_frame and .debug_frame, which sometimes requires that we
fallback from .eh_frame to .debug_frame in architectures such as PowerPC (Hemant Kumar)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWsjJzAAoJENZQFvNTUqpAF2EP/2UJGz/mi7O3wu90YO7BnglD
WizS/Y4fTLDfoz9+hwiUscHMvBOpAYMRbGX73AyHCP5qPsv1fRyW5jPCyCqpvDWB
TU86CX0t9CEXAj3mKTwShqIXiY9Hmf0lOwxAxY+Y5I12utirqHzZreilBhNvHStz
ESYXpTKsDNdU08Zu7nLmKqIlFLnRvY+7sL55+rgcw7DWaIcivpAF8b8RX7iJyoJk
fL7dkebXDtZvQpBZ4A8TniACjqebfpg1BSiZ7c9NDIs7YMB+2VPDzXrySP2Oq3q6
u8rZtwn8/0idZ5Es2LWU68QXJL0Z6q7p74BZ+/IO1jSTviegu8CQTfIHRfyx+ur4
IZroUuEPDz9tFw7q8tUt/D48Qbh7rOIFBYUHtbq9e0g1WfW2g0NzP/EseNQxkica
uZdfn98cHZyeGiNLhRAjqTwGmZTlV7EoNh6282i7PwyJ9J5nOs36f3Tuo1bekVp+
qtugNbE2xebwwCiBSAHbsQcIrKnyL+bcgSrDzKAP5kBz9r58TQad+CUNQ/IHyhdr
q66RYEy3cdmotcPKtK5jxNMYoSoJzlGEpX3FKXZNHkpRIkNp1vZA6MlSIiHOo/A8
eUg6O55XBRJLYdZ/Q2Vb1t1X83g2789o4tgW0tOUtwBwCW7AIQq7w2m08aUCbQf5
2/HaTdhyxMx6ra34dzS1
=0cG5
-----END PGP SIGNATURE-----
Merge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
Pull perf/urgent fixes from Arnaldo Carvalho de Melo:
- tracepoint_error() can receive e=NULL, robustify it, fixes a problem noticed
with a very specific combination: Machine with Intel PT (e.g. Broadwell),
kernel with no perf_event_attr.context_switch feature (e.g. 4.2) and unreadable
tracefs (for instance !root users), making the fallback from
perf_event_attr.context_switch to the sched:sched_switch tracepoint to fail
reading its info from tracefs, fix it. (Adrian Hunter)
- Fix segfault in intel PT, by making it follow the 'struct thread' lifetime cycle
checking expectations, noticed for instance, when processing perf.data files with
Intel PT data using 'perf script' and when exiting 'perf report' (Adrian Hunter)
- Fix CFI usage from .eh_frame and .debug_frame, which sometimes requires that we
fallback from .eh_frame to .debug_frame in architectures such as PowerPC (Hemant Kumar)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
We broke interval data displays with commit:
3f416f22d1 ("perf stat: Do not clean event's private stats")
This commit removed stats cleaning, which is important for '-r' option
to carry counters data over the whole run. But it's necessary to clean
it for interval mode, otherwise the displayed value is avg of all
previous values.
Before:
$ perf stat -e cycles -a -I 1000 record
# time counts unit events
1.000240796 75,216,287 cycles
2.000512791 107,823,524 cycles
$ perf stat report
# time counts unit events
1.000240796 75,216,287 cycles
2.000512791 91,519,906 cycles
Now:
$ perf stat report
# time counts unit events
1.000240796 75,216,287 cycles
2.000512791 107,823,524 cycles
Notice the second value being bigger (91,.. < 107,..).
This could be easily verified by using perf script which displays raw
stat data:
$ perf script
CPU THREAD VAL ENA RUN TIME EVENT
0 -1 23855779 1000209530 1000209530 1000240796 cycles
1 -1 33340397 1000224964 1000224964 1000240796 cycles
2 -1 15835415 1000226695 1000226695 1000240796 cycles
3 -1 2184696 1000228245 1000228245 1000240796 cycles
0 -1 97014312 2000514533 2000514533 2000512791 cycles
1 -1 46121497 2000543795 2000543795 2000512791 cycles
2 -1 32269530 2000543566 2000543566 2000512791 cycles
3 -1 7634472 2000544108 2000544108 2000512791 cycles
The sum of the first 4 values is the first interval aggregated value:
23855779 + 33340397 + 15835415 + 2184696 = 75,216,287
The sum of the second 4 values minus first value is the second interval
aggregated value:
97014312 + 46121497 + 32269530 + 7634472 - 75216287 = 107,823,524
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1454485436-20639-1-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add 'L' key action to change the percent limit applied to both of hist
entries and callchains.
Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1454508683-5735-4-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The --percent-limit option was changed to be applied to callchains as
well as to hist entries recently, but it missed to update the doc.
Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1454508683-5735-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The description of the memory sort key (used by --mem-mode) was
misplaced. Move it under the --sort option so that it can be referenced
properly.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1454508683-5735-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
With the hist object having the perf_hpp_list we can now iterate sort
format entries based in the hists object. Adding
hists__for_each_sort_list macro to do that.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-27-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
With the hist object having the perf_hpp_list we can now iterate output
format entries based in the hists object. Adding hists__for_each_format
macro to do that.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-26-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding hpp_list into struct hists object.
Initializing struct hists_evsel hists object to carry global
perf_hpp_list list.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-25-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding struct perf_hpp_list argument to following helper functions:
void perf_hpp__setup_output_field(struct perf_hpp_list *list);
void perf_hpp__reset_output_field(struct perf_hpp_list *list);
void perf_hpp__append_sort_keys(struct perf_hpp_list *list);
so they could be used on hists's hpp_list.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-24-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Passing perf_hpp_list all the way through setup_output_list so the
output entry could be added on the arbitrary list.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-19-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding 2 perf_hpp_list register helpers:
perf_hpp_list__column_register()
perf_hpp_list__register_sort_field()
to be called within existing helpers:
perf_hpp__column_register()
perf_hpp__register_sort_field()
to register format entries within global perf_hpp_list object.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-17-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Introducing perf_hpp_list__init function to have an easy way to
initialize perf_hpp_list struct.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-16-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Gather output and sort lists under struct perf_hpp_list, so we could
have multiple instancies of sort/output format entries.
Replacing current perf_hpp__list and perf_hpp__sort_list lists with
single perf_hpp_list instance.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-15-git-send-email-jolsa@kernel.org
[ Renamed fields to .{fields,sorts} as suggested by Namhyung and acked by Jiri ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Separating output fields parsing into setup_output_list function, so
it's separated from field_order string setup and could be reused later
in following patches.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-14-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Separating sort fields parsing into setup_sort_list function, so it's
separated from sort_order string setup and could be reused later in
following patches.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-13-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
With multiple list holding format entries, we need the support properly
releasing format output/sort fields.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-12-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Those functions are no longer needed. They operate over perf_hpp__format
array which is now used only as template for dynamic entries.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-11-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently we use static output fields, because we have single global
list of all sort/output fields.
We will add hists specific sort and output lists in following patches,
so we need all format entries to be dynamically allocated. Adding
support to allocate output sort field.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-10-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The ui initialization changes hpp format callbacks, based on the used
browser. Thus we need this init being processed before setup_sorting.
Replica of a patch by Jiri for 'perf report'.
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-9-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The ui initialization changes hpp format callbacks, based on the used
browser. Thus we need this init being processed before setup_sorting.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-9-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now that we have the 'equal' method implemented for hpp format entries
we can ease up the logic in the following functions and make them
generic wrt comparing format entries:
perf_hpp__setup_output_field
perf_hpp__append_sort_keys
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-8-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding 'hpp__equal' callback function to compare hpp output format
entries.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-7-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To easily compare format entries and make it available for all kinds of
format entries.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-6-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We are going to add dynamic hpp format fields, so we need to make the
'len' change for the format itself, not in the perf_hpp__format
template.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-5-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently there's no way of comparing hpp format entries, which is
needed in following patches.
Adding _idx fields into struct perf_hpp_fmt to recognize and be able to
compare hpp format entries.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-4-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding evsel specific function to sort hists_evsel based hists. The
hists__output_resort can be now used to sort common hists object.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently hists__output_resort() depends on hists based on hists_evsel
struct, but we need to be able to sort common hists as well.
Cutting out the sorting base sorting code into output_resort
function, so it can be reused in following patch.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-2-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
User visible:
- Make --percent-limit apply to callchains also and fix some bugs
related to --percent-limit (Namhyung Kim)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWr9ItAAoJENZQFvNTUqpAmDcP/A0DqDIsttDMrJtsTb+DeMEt
HaOjOZT9usZW8khAkEsk62i6lg/ewVqajalzFkfByG0gAJ5u/Xi1YQo64MzXpLjV
DZ0Nlqujhb6PowO9eRPra6UAEiq+88Gzn+y+XzYqVsPVLAK/d8Ck9ALWo33gIBhc
uq32fpp79zrCgfq8pOhvWMaMmRqmpyUiwCjiFCgUs1FD2NjdwGWSfH6XqxVdojVv
/s1agYu+E9WJ74Df2upoIUxiFcG4+aT6Y4li3N1XaATrWoiqrkSyp1uVwOZ9H4i0
9OyIhDzR0aar8z0aVJJmccqfGpC9LLWaf5YkYqK6A8vI0x5FyCu4TieeKCMJ5k7S
1AO2E6FGsQ/vOJx/LvVGrEAmUog/kZ8q4OmudpmGBcHJ9PGHpnUg/6uAij2Nwyxo
68oL4kgZFTrC5Cxdr1W+8Z/4Z9piNzArs2SSr5PfHWyzAB35WEKXCwoDy1uQ2q7d
XIUa+6Gvldc5iRjrulY8YCqwhltfx9LiCWdOYmEpS2BGIeWzTQIinYNzVwCTP7Av
tsLKaGx4/O5iZf1yuMaOXx9nXK6N87gb9il8sSQD2AZVPIkBTkE5mKYycqXblqUV
wFH4oZ4QKTPnbwV2gOHsjOKABhsm6Jop8vpgKZtF3May5K9lNx6Ivq4KyqA+uSht
BpYuVeCKwKHyT2uwSQf4
=3p/S
-----END PGP SIGNATURE-----
Merge tag 'perf-core-for-mingo-3' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core callchain fixes and improvements from Arnaldo Carvalho de Melo <acme@redhat.com:
User visible changes:
- Make --percent-limit apply to callchains also and fix some bugs
related to --percent-limit (Namhyung Kim)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
- Port 'perf kvm stat' to PowerPC (Hemant Kumar)
Infrastructure:
- Use the 'feature-dump' target to do the feature checks just once and then
add code to reuse that in the tests/make makefile, speeding up the
'make -C tools/perf build-test' target (Wang Nan)
- Reduce the number of tests the 'build-test' target do to those that don't
pollute the source tree (Arnaldo Carvalho de Melo)
- Improve the output of the build tests a bit by aligning the name of the
tests, more can be done to filter out uninteresting info in the output
(Arnaldo Carvalho de Melo)
- Add perf_evlist pointer to *info_priv_size(), more prep work for
supporting the coresight architecture (Mathieu Poirier)
- Improve the 'perf test bp_signal' test (Wang Nan)
- Check environment before starting the BPF 'perf test', so that we can just
'Skip' older kernels instead of 'FAIL'ing them (Wang Nan)
- Fix cpumode of synthesized buildid event (Wang Nan)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWq9VUAAoJENZQFvNTUqpAi0QP/2i7itD/P9wkGLsPb+HbNlX+
umTxbhdKkyLc9WI8hGfdXXtdHcdkHUAmuZ/DzMbOcnGUE0mJdL6dphslsm1VFslP
Q9sAj43BjWddEKfka1ylos1u/nDhpdpRX7bkRaepA9Zl0P0BSPXv+S28GO3jttxX
uadXN9K7Amsa8tibKicxgLTUhZH05lmhPO00xGHuhQ6EQHcaw8VDYUlA+Wrh+NIa
jIVnRE5q/hBwOyFQR/1gal8N5w2vO0vCglQmGQTEDjgQVMf/cSZChUlVqtxcDxcu
FIDE42+jAnbESmVkBHq2n8ZvNxHOVlG6hTqZOqeiqs+tyfw7fYnGf+tkFPgBIEXP
hB/hwgCJVBbbYo5hzT12eBz7UeWwn1ljqTpTnBrCaOl05MwvN4bMAMFVBXPLQHtm
47AsyaOXEli9RaRwgdcYGVUhqIPTa2Ql2vPRb1PmQ3ugBqqLyUpYOox8WUYQv2g9
sd61KMoXxUiuNsoq0ZXXkjWBeEBz2joRQYrlBQ0tZR8m06UA8FXLUXFopAUZKHGh
7w8BTXRRCc9lEm/pWfHjVykObRlHew0qcDihybtMVsNGpUQzqKh7A8b2DmMvmRrJ
BmnUBQA8kFiE4BJSdOdqwH8PpDRYpTCg0a6cyK4RDlm7isX2ho40edstspEO1N4n
BUG1zE5SIPC1o1MSFxBn
=CFBX
-----END PGP SIGNATURE-----
Merge tag 'perf-core-for-mingo-2' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf tooling changes from Arnaldo Carvalho de Melo:
New features:
- Port 'perf kvm stat' to PowerPC (Hemant Kumar)
Infrastructure changes:
- Use the 'feature-dump' target to do the feature checks just once and then
add code to reuse that in the tests/make makefile, speeding up the
'make -C tools/perf build-test' target (Wang Nan)
- Reduce the number of tests the 'build-test' target do to those that don't
pollute the source tree (Arnaldo Carvalho de Melo)
- Improve the output of the build tests a bit by aligning the name of the
tests, more can be done to filter out uninteresting info in the output
(Arnaldo Carvalho de Melo)
- Add perf_evlist pointer to *info_priv_size(), more prep work for
supporting the coresight architecture (Mathieu Poirier)
- Improve the 'perf test bp_signal' test (Wang Nan)
- Check environment before starting the BPF 'perf test', so that we can just
'Skip' older kernels instead of 'FAIL'ing them (Wang Nan)
- Fix cpumode of synthesized buildid event (Wang Nan)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
User visible:
- Rename the "colors.code" ~/.perfconfig variable to "colors.jump_arrows",
as it controls just the that UI element in the annotate browser (Taeung Song)
- Avoid trying to read ELF symtabs from device files, noticed while doing
memory profiling work (Jiri Olsa)
- Improve context detection when offering options in the hists browser,
i.e. some options don't make sense when the browser is not working with
a perf.data file ('perf top' mode), only in 'perf report' mode, like
scripting (Namhyung Kim)
Infrastructure:
- Elliminate duplication in the hists browser filter functions, getting the
common part into a function that receives callbacks for filtering by
DSO, thread, etc (Namhyung Kim)
- Fix misleadingly indented assignment, found using
gcc6 -Wmisleading-indentation (Markus Trippelsdorf)
- Handle LLVM relocation oddities in libbpf, introducing a 'perf test' that
detects such problems and then fixing the problem, so that the test now
passes (Wang Nan)
- More improvements to the build infrastructure to allow reusing the
feature detection facilities (Wang Nan)
- Auto initialize the globals needed by cpu__max_{cpu,node}() routines
(Arnaldo Carvalho de Melo)
Documentation:
- Document the perf sysctls in Documentation/sysctl/kernel.txt (Ben Hutchings)
- Document a bunch more ~/.perfconfig knobs (Taeung Song)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWp8VgAAoJENZQFvNTUqpA8gAQAJY4pLDDeK6rZAqAY5fD//JJ
aETuF0icXErkboef5usFk0MGVzK8HOWFOD5YnVAPXsGRqTHZ9ix3xw5sBFDaZKrP
zagySidfHxrPDvtSW6doCjtg571dFaEHWUL48kT8ZpH9vwGDs42Gl/hjEY2P91zK
uNktNoHvbHMUOxoMIp9zyCcV5WEWTog8RwCp53QrxxNrLYIT40wpADQIvuKNgqEP
wIQyC2pgLv9ra27fXThauDes+a/TWLfURtxoeGgiDaIFmOi2t5VeN8D+DxXskKIB
GtYF7Wxk5U+gELsAo5cZKS5Hyf13LqmwL4Jy/Th5jWaObyNXU2ZwnB3zXxZ3Dmvu
keiOY8EmoOoKqOhjUVfsdvsVy0tNhObIJYhlqyOfQg+EqizR0PVlkDxWvODEKkkA
T+dWXm183aXwCsHKM0EhAPgsVAJ/U9+lQjHro/lPq/i54oOogL/aBsVvUjNKo6Od
m6q2ezgFZRHuPMLmOYhJaxtpvOirQkxORZZx2wgzgs5AsJly+ydoR3ETdhAD76Sg
QGSKdTCziDA8KM0Vul6mjoqNlASpUM9cN6uLlv4c26pmf1krwleILMFqzaoYV3iE
3y/ebiRyj2luwKSXELNjcs/7GzCfN3h8sjP6AQ+q0fuWH3zU6+J9oKi+KevYBr8J
fFEX6MNxdxOY92mXDPZa
=cuZc
-----END PGP SIGNATURE-----
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
User visible changes:
- Rename the "colors.code" ~/.perfconfig variable to "colors.jump_arrows",
as it controls just the that UI element in the annotate browser (Taeung Song)
- Avoid trying to read ELF symtabs from device files, noticed while doing
memory profiling work (Jiri Olsa)
- Improve context detection when offering options in the hists browser,
i.e. some options don't make sense when the browser is not working with
a perf.data file ('perf top' mode), only in 'perf report' mode, like
scripting (Namhyung Kim)
Infrastructure changes:
- Elliminate duplication in the hists browser filter functions, getting the
common part into a function that receives callbacks for filtering by
DSO, thread, etc. (Namhyung Kim)
- Fix misleadingly indented assignment, found using
gcc6 -Wmisleading-indentation (Markus Trippelsdorf)
- Handle LLVM relocation oddities in libbpf, introducing a 'perf test' that
detects such problems and then fixing the problem, so that the test now
passes (Wang Nan)
- More improvements to the build infrastructure to allow reusing the
feature detection facilities (Wang Nan)
- Auto initialize the globals needed by cpu__max_{cpu,node}() routines
(Arnaldo Carvalho de Melo)
Documentation changes:
- Document the perf sysctls in Documentation/sysctl/kernel.txt (Ben Hutchings)
- Document a bunch more ~/.perfconfig knobs (Taeung Song)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
'perf probe' through debuginfo__find_probes() in util/probe-finder.c
checks for the functions' frame descriptions in either .eh_frame section
of an ELF or the .debug_frame.
The check is based on whether either one of these sections is present.
Depending on distro, toolchain defaults, architetcutre, build flags,
etc., CFI might be found in either .eh_frame and/or .debug_frame.
Sometimes, it may happen that, .eh_frame, even if present, may not be
complete and may miss some descriptions.
Therefore, to be sure, to find the CFI covering an address we will
always have to investigate both if available.
For e.g., in powerpc, this may happen:
$ gcc -g bin.c -o bin
$ objdump --dwarf ./bin
<1><145>: Abbrev Number: 7 (DW_TAG_subprogram)
<146> DW_AT_external : 1
<146> DW_AT_name : (indirect string, offset: 0x9e): main
<14a> DW_AT_decl_file : 1
<14b> DW_AT_decl_line : 39
<14c> DW_AT_prototyped : 1
<14c> DW_AT_type : <0x57>
<150> DW_AT_low_pc : 0x100007b8
If the .eh_frame and .debug_frame are checked for the same binary, we
will find that, .eh_frame (although present) doesn't contain a
description for "main" function.
But, .debug_frame has a description:
000000d8 00000024 00000000 FDE cie=00000000 pc=100007b8..10000838
DW_CFA_advance_loc: 16 to 100007c8
DW_CFA_def_cfa_offset: 144
DW_CFA_offset_extended_sf: r65 at cfa+16
...
Due to this (since, perf checks whether .eh_frame is present and goes on
searching for that address inside that frame), perf is unable to process
the probes:
# perf probe -x ./bin main
Failed to get call frame on 0x100007b8
Error: Failed to add events.
To avoid this issue, we need to check both the sections (.eh_frame and
.debug_frame), which is done in this patch.
Note that, we can always force everything into both .eh_frame and
.debug_frame by:
$ gcc bin.c -fasynchronous-unwind-tables -fno-dwarf2-cfi-asm -g -o bin
Signed-off-by: Hemant Kumar <hemant@linux.vnet.ibm.com>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Mark Wielaard <mjw@redhat.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1454426806-13974-1-git-send-email-hemant@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
intel_pt_process_auxtrace_info() creates a pt->unknown_thread thread
that eventually needs to be freed by the last thread__put() on it, when
its refcount hits zero, which may happen in
intel_pt_process_auxtrace_info() error handling path and triggers the
following segfault, which would happen as well at intel_pt_free, when
tools using this intel_pt codebase frees up resources:
# perf record -I -e intel_pt/tsc=1,noretcomp=1/u /bin/ls
0 a anaconda-ks.cfg bin perf.data perf.data.old perf-f23-bringup.todo
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.217 MB perf.data ]
#
# perf script -F event,comm,pid,tid,time,addr,ip,sym,dso,iregs
Samples for 'instructions:u' event do not have IREGS attribute set. Cannot print 'iregs' field.
intel_pt_synth_events: failed to synthesize 'instructions' event type
Segmentation fault (core dumped)
#
The problem is: there's a union in 'struct thread' combines a list_head
and a rb_node. The standard life cycle of a thread is: init rb_node in
the constructor, insert it into machine->threads rbtree using rb_node,
move it to machine->dead_threads using list_head, clean in the last
thread__put: list_del_init(&thread->node).
In the above command, it clean a thread before adding it into list,
causes the above segfault.
Since pt->unknown_thread will never live in an rbtree, initialize its
list node so that when list_del_init() is done on it we don't segfault.
After this patch:
# perf script -F event,comm,pid,tid,time,addr,ip,sym,dso,iregs
Samples for 'instructions:u' event do not have IREGS attribute set. Cannot print 'iregs' field.
intel_pt_synth_events: failed to synthesize 'instructions' event type
0x248 [0x88]: failed to process type: 70
#
Reported-by: Tong Zhang <ztong@vt.edu>
Reported-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Link: http://lkml.kernel.org/r/1454296865-19749-1-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When all callchains of a hist entry is percent-limited, do not add a
blank line at the end. It makes the entry look like it doesn't have
callchains.
Reported-and-Tested-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20160128122454.GA27446@danjae.kornet
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When there's only a single callchain, perf doesn't print its percentage
in front of the symbols. This is because it assumes that the percentage
is same as parents. But if a percent limit is applied, it's possible
that there are actually a couple of child nodes but only one of them is
shown. In this case it should display the percent to prevent
misunderstanding of its percentage is same as the parent's.
For example, let's see the following callchain.
$ perf report --no-children --percent-limit 0.01 --tui
...
- 0.06% sleep [kernel.vmlinux] [k] kmem_cache_alloc_trace
kmem_cache_alloc_trace
- perf_event_mmap
- 0.04% mmap_region
do_mmap_pgoff
- vm_mmap_pgoff
+ 0.02% sys_mmap_pgoff
+ 0.02% vm_mmap
+ 0.02% mprotect_fixup
Current code omits the percent if 'mmap_region' becomes the only node
when percent limit is set to 0.03%, its percent is not 0.06% but users
will assume it incorrectly.
Before:
$ perf report --no-children --percent-limit 0.03 --tui
...
0.06% sleep [kernel.vmlinux] [k] kmem_cache_alloc_trace
kmem_cache_alloc_trace
- perf_event_mmap
- mmap_region
do_mmap_pgoff
vm_mmap_pgoff
After:
$ perf report --no-children --percent-limit 0.03 --tui
...
0.06% sleep [kernel.vmlinux] [k] kmem_cache_alloc_trace
kmem_cache_alloc_trace
- perf_event_mmap
- 0.04% mmap_region
do_mmap_pgoff
vm_mmap_pgoff
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1453909257-26015-10-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pass parent node's total period to callchain print functions. This info
is needed by later patch to determine whether it can omit percent or not
correctly.
No functional change intended.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1453909257-26015-9-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The commit 8c430a3486 ("perf hists browser: Support folded
callchains") missed to update hist_browser__dump() so it always shows
graph-style callchains regardless of current setting.
To fix that, factor out callchain printing code and rename the existing
function which prints graph-style callchain.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 8c430a3486 ("perf hists browser: Support folded callchains")
Link: http://lkml.kernel.org/r/1453909257-26015-8-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When there's only a single callchain, perf doesn't print its percentage
in front of the symbols. This is because it assumes that the percentage
is same as parents. But if a percent limit is applied, it's possible
that there are actually a couple of child nodes but only one of them is
shown. In this case it should display the percent to prevent
misunderstanding of its percentage is same as the parent's.
For example, let's see the following callchain.
$ perf report -s comm --percent-limit 0.01 --stdio
...
9.95% swapper
|
|--7.57%--intel_idle
| cpuidle_enter_state
| cpuidle_enter
| call_cpuidle
| cpu_startup_entry
| |
| |--4.89%--start_secondary
| |
| --2.68%--rest_init
| start_kernel
| x86_64_start_reservations
| x86_64_start_kernel
|
|--0.15%--__schedule
| |
| |--0.13%--schedule
| | schedule_preempt_disable
| | cpu_startup_entry
| | |
| | |--0.09%--start_secondary
| | |
| | --0.04%--rest_init
| | start_kernel
| | x86_64_start_reservations
| | x86_64_start_kernel
| |
| --0.01%--schedule_preempt_disabled
| cpu_startup_entry
...
Current code omits the percent if 'intel_idle' becomes the only node
when percent limit is set to 0.5%, its percent is not 9.95% but users
will assume it incorrectly.
Before:
$ perf report --percent-limit 0.5 --stdio
...
9.95% swapper
|
---intel_idle
cpuidle_enter_state
cpuidle_enter
call_cpuidle
cpu_startup_entry
|
|--4.89%--start_secondary
|
--2.68%--rest_init
start_kernel
x86_64_start_reservations
x86_64_start_kernel
After:
$ perf report --percent-limit 0.5 --stdio
...
9.95% swapper
|
--7.57%--intel_idle
cpuidle_enter_state
cpuidle_enter
call_cpuidle
cpu_startup_entry
|
|--4.89%--start_secondary
|
--2.68%--rest_init
start_kernel
x86_64_start_reservations
x86_64_start_kernel
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1453909257-26015-7-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pass hist entry's period to graph callchain print function. This info
is needed by later patch to determine whether it can omit percentage of
top-level node or not.
No functional change intended.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1453909257-26015-6-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It's just a wrapper function to align the start position ofcallchains to
'comm' of each thread if it's a first sort key. But it doesn't not work
with tracepoint events and also with upcoming hierarchy view.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1453909257-26015-5-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently --percent-limit option only works for hist entries. However
it'd be better to have same effect to callchains as well
Requested-by: Andi Kleen <andi@firstfloor.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1453909257-26015-4-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently the hist entry addition path doesn't update total_period of
hists and it's calculated during 'resort' path. But the resort path
needs to know the total period before doing its job because it's used
for calculating percent limit of callchains in hist entries.
So this patch update the total period during the addition path. It
makes the percent limit of callchains working (again).
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1453909257-26015-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The total period should be get using hists__total_period() since it
takes filtered entries into account. In addition, if callchain mode is
'fractal', the total period should be the entry's period.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1453909257-26015-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fixes segmentation fault using, for instance:
(gdb) run record -I -e intel_pt/tsc=1,noretcomp=1/u /bin/ls
Starting program: /home/acme/bin/perf record -I -e intel_pt/tsc=1,noretcomp=1/u /bin/ls
Missing separate debuginfos, use: dnf debuginfo-install glibc-2.22-7.fc23.x86_64
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Program received signal SIGSEGV, Segmentation fault.
0 x00000000004b9ea5 in tracepoint_error (e=0x0, err=13, sys=0x19b1370 "sched", name=0x19a5d00 "sched_switch") at util/parse-events.c:410
(gdb) bt
#0 0x00000000004b9ea5 in tracepoint_error (e=0x0, err=13, sys=0x19b1370 "sched", name=0x19a5d00 "sched_switch") at util/parse-events.c:410
#1 0x00000000004b9fc5 in add_tracepoint (list=0x19a5d20, idx=0x7fffffffb8c0, sys_name=0x19b1370 "sched", evt_name=0x19a5d00 "sched_switch", err=0x0, head_config=0x0)
at util/parse-events.c:433
#2 0x00000000004ba334 in add_tracepoint_event (list=0x19a5d20, idx=0x7fffffffb8c0, sys_name=0x19b1370 "sched", evt_name=0x19a5d00 "sched_switch", err=0x0, head_config=0x0)
at util/parse-events.c:498
#3 0x00000000004bb699 in parse_events_add_tracepoint (list=0x19a5d20, idx=0x7fffffffb8c0, sys=0x19b1370 "sched", event=0x19a5d00 "sched_switch", err=0x0, head_config=0x0)
at util/parse-events.c:936
#4 0x00000000004f6eda in parse_events_parse (_data=0x7fffffffb8b0, scanner=0x19a49d0) at util/parse-events.y:391
#5 0x00000000004bc8e5 in parse_events__scanner (str=0x663ff2 "sched:sched_switch", data=0x7fffffffb8b0, start_token=258) at util/parse-events.c:1361
#6 0x00000000004bca57 in parse_events (evlist=0x19a5220, str=0x663ff2 "sched:sched_switch", err=0x0) at util/parse-events.c:1401
#7 0x0000000000518d5f in perf_evlist__can_select_event (evlist=0x19a3b90, str=0x663ff2 "sched:sched_switch") at util/record.c:253
#8 0x0000000000553c42 in intel_pt_track_switches (evlist=0x19a3b90) at arch/x86/util/intel-pt.c:364
#9 0x00000000005549d1 in intel_pt_recording_options (itr=0x19a2c40, evlist=0x19a3b90, opts=0x8edf68 <record+232>) at arch/x86/util/intel-pt.c:664
#10 0x000000000051e076 in auxtrace_record__options (itr=0x19a2c40, evlist=0x19a3b90, opts=0x8edf68 <record+232>) at util/auxtrace.c:539
#11 0x0000000000433368 in cmd_record (argc=1, argv=0x7fffffffde60, prefix=0x0) at builtin-record.c:1264
#12 0x000000000049bec2 in run_builtin (p=0x8fa2a8 <commands+168>, argc=5, argv=0x7fffffffde60) at perf.c:390
#13 0x000000000049c12a in handle_internal_command (argc=5, argv=0x7fffffffde60) at perf.c:451
#14 0x000000000049c278 in run_argv (argcp=0x7fffffffdcbc, argv=0x7fffffffdcb0) at perf.c:495
#15 0x000000000049c60a in main (argc=5, argv=0x7fffffffde60) at perf.c:618
(gdb)
Intel PT attempts to find the sched:sched_switch tracepoint but that seg
faults if tracefs is not readable, because the error reporting structure
is null, as errors are not reported when automatically adding
tracepoints. Fix by checking before using.
Committer note:
This doesn't take place in a kernel that supports
perf_event_attr.context_switch, that is the default way that will be
used for tracking context switches, only in older kernels, like 4.2, in
a machine with Intel PT (e.g. Broadwell) for non-priviledged users.
Further info from a similar patch by Wang:
The error is in tracepoint_error: it assumes the 'e' parameter is valid.
However, there are many situation a parse_event() can be called without
parse_events_error. See result of
$ grep 'parse_events(.*NULL)' ./tools/perf/ -r'
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Tong Zhang <ztong@vt.edu>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: stable@vger.kernel.org # v4.4+
Fixes: 196581717d ("perf tools: Enhance parsing events tracepoint error output")
Link: http://lkml.kernel.org/r/1453809921-24596-2-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull perf fixes from Thomas Gleixner:
"This is much bigger than typical fixes, but Peter found a category of
races that spurred more fixes and more debugging enhancements. Work
started before the merge window, but got finished only now.
Aside of that this contains the usual small fixes to perf and tools.
Nothing particular exciting"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (43 commits)
perf: Remove/simplify lockdep annotation
perf: Synchronously clean up child events
perf: Untangle 'owner' confusion
perf: Add flags argument to perf_remove_from_context()
perf: Clean up sync_child_event()
perf: Robustify event->owner usage and SMP ordering
perf: Fix STATE_EXIT usage
perf: Update locking order
perf: Remove __free_event()
perf/bpf: Convert perf_event_array to use struct file
perf: Fix NULL deref
perf/x86: De-obfuscate code
perf/x86: Fix uninitialized value usage
perf: Fix race in perf_event_exit_task_context()
perf: Fix orphan hole
perf stat: Do not clean event's private stats
perf hists: Fix HISTC_MEM_DCACHELINE width setting
perf annotate browser: Fix behaviour of Shift-Tab with nothing focussed
perf tests: Remove wrong semicolon in while loop in CQM test
perf: Synchronously free aux pages in case of allocation failure
...
$ make -C tools/perf build-test
make[1]: Entering directory `/home/acme/git/linux/tools/perf'
make_pure_O: cd . && make -f Makefile O=/tmp/tmp.mPx0Cmik3f DESTDIR=/tmp/tmp.U0SUmVbtJm
make_clean_all_O: cd . && make -f Makefile O=/tmp/tmp.Yl5UzhTU7T DESTDIR=/tmp/tmp.fop1E4jdER clean all
make_debug_O: cd . && make -f Makefile O=/tmp/tmp.pMn2ozBoXC DESTDIR=/tmp/tmp.azxhDp5sEp DEBUG=1
make_no_libperl_O: cd . && make -f Makefile O=/tmp/tmp.qJPiINMtA7 DESTDIR=/tmp/tmp.KNMrLeGDxZ NO_LIBPERL=1
<SNIP>
More needs to be done to make it more compact, i.e. elide the '-f Makefile',
remove that 'cd . &&', move the DESTDIR= and O= to the end, as they don't
convey that much information besides the fact that they are being set to some
random directory just for this build, move the meat, i.e. the meaningful
feature disabling bits to the start, etc.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-wir3w3o4f1nmbgcxgnx8cj9c@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Powerpc provides hcall events that also provides insights into guest
behaviour. Enhance perf kvm stat to record and analyze hcall events.
- To trace hcall events :
perf kvm stat record
- To show the results :
perf kvm stat report --event=hcall
The result shows the number of hypervisor calls from the guest grouped
by their respective reasons displayed with the frequency.
This patch makes use of two additional tracepoints
"kvm_hv:kvm_hcall_enter" and "kvm_hv:kvm_hcall_exit". To map the hcall
codes to their respective names, it needs a mapping. Such mapping is
added in this patch in book3s_hcalls.h.
# pgrep qemu
A sample output :
19378
60515
2 VMs running.
# perf kvm stat record -a
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 4.153 MB perf.data.guest (39624
samples) ]
# perf kvm stat report -p 60515 --event=hcall
Analyze events for all VMs, all VCPUs:
HCALL-EVENT Samples Samples% Time% MinTime MaxTime AvgTime
H_IPI 822 66.08% 88.10% 0.63us 11.38us 2.05us (+- 1.42%)
H_SEND_CRQ 144 11.58% 3.77% 0.41us 0.88us 0.50us (+- 1.47%)
H_VIO_SIGNAL 118 9.49% 2.86% 0.37us 0.83us 0.47us (+- 1.43%)
H_PUT_TERM_CHAR 76 6.11% 2.07% 0.37us 0.90us 0.52us (+- 2.43%)
H_GET_TERM_CHAR 74 5.95% 2.23% 0.37us 1.70us 0.58us (+- 4.77%)
H_RTAS 6 0.48% 0.85% 1.10us 9.25us 2.70us (+-48.57%)
H_PERFMON 4 0.32% 0.12% 0.41us 0.96us 0.59us (+-20.92%)
Total Samples:1244, Total events handled time:1916.69us.
Signed-off-by: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Scott Wood <scottwood@freescale.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/1453962787-15376-4-git-send-email-hemant@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf kvm can be used to analyze guest exit reasons. This support already
exists in x86. Hence, porting it to powerpc.
- To trace KVM events :
perf kvm stat record
If many guests are running, we can track for a specific guest by using
--pid as in : perf kvm stat record --pid <pid>
- To see the results :
perf kvm stat report
The result shows the number of exits (from the guest context to
host/hypervisor context) grouped by their respective exit reasons with
their frequency.
Since, different powerpc machines have different KVM tracepoints, this
patch discovers the available tracepoints dynamically and accordingly
looks for them. If any single tracepoint is not present, this support
won't be enabled for reporting. To record, this will fail if any of the
events we are looking to record isn't available. Right now, its only
supported on PowerPC Book3S_HV architectures.
To analyze the different exits, group them and present them (in a slight
descriptive way) to the user, we need a mapping between the "exit code"
(dumped in the kvm_guest_exit tracepoint data) and to its related
Interrupt vector description (exit reason). This patch adds this mapping
in book3s_hv_exits.h.
It records on two available KVM tracepoints for book3s_hv:
"kvm_hv:kvm_guest_exit" and "kvm_hv:kvm_guest_enter".
Here is a sample o/p:
# pgrep qemu
19378
60515
2 Guests are running on the host.
# perf kvm stat record -a
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 4.153 MB perf.data.guest (39624
samples) ]
# perf kvm stat report -p 60515
Analyze events for pid(s) 60515, all VCPUs:
VM-EXIT Samples Samples% Time% MinTime MaxTime Avg time
SYSCALL 9141 63.67% 7.49% 1.26us 5782.39us 9.87us (+- 6.46%)
H_DATA_STORAGE 4114 28.66% 5.07% 1.72us 4597.68us 14.84us (+-20.06%)
HV_DECREMENTER 418 2.91% 4.26% 0.70us 30002.22us 122.58us (+-70.29%)
EXTERNAL 392 2.73% 0.06% 0.64us 104.10us 1.94us (+-18.83%)
RETURN_TO_HOST 287 2.00% 83.11% 1.53us 124240.15us 3486.52us (+-16.81%)
H_INST_STORAGE 5 0.03% 0.00% 1.88us 3.73us 2.39us (+-14.20%)
Total Samples:14357, Total events handled time:1203918.42us.
Signed-off-by: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Scott Wood <scottwood@freescale.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/1453962787-15376-3-git-send-email-hemant@linux.vnet.ibm.com
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch removes the "const" qualifier from kvm_events_tp declaration
to account for the fact that some architectures may need to update this
variable dynamically. For instance, powerpc will need to update this
variable dynamically depending on the machine type.
Signed-off-by: Hemant Kumar <hemant@linux.vnet.ibm.com>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Scott Wood <scottwood@freescale.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/1453962787-15376-2-git-send-email-hemant@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Its better to remove the dependency on uapi/kvm_perf.h to allow dynamic
discovery of kvm events (if its needed). To do this, some extern
variables have been introduced with which we can keep the generic
functions generic.
Signed-off-by: Hemant Kumar <hemant@linux.vnet.ibm.com>
Acked-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Scott Wood <scottwood@freescale.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/1453962787-15376-1-git-send-email-hemant@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
'perf record' knows whether buildid cache is enabled (via
--no-no-buildid-cache) deliberately. Buildid cache can be turned off in
some situations.
Output switching support needs this feature to turn off buildid cache
by default.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1453715801-7732-33-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Timestamp generation becomes a public available helper. Which will
be used by 'perf record', help it output to split output file based
on time.
For example:
perf.data.2015122620363710
perf.data.2015122620364092
perf.data.2015122620365423
...
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1453715801-7732-27-git-send-email-wangnan0@huawei.com
Signed-off-by: He Kuang <hekuang@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Will Deacon [1] has some question on patch [2]. This patch improves
test__bp_signal so we can test:
1. A watchpoint and a breakpoint that fire on the same instruction
2. Nested signals
Test result:
On x86_64 and ARM64 (result are similar with patch [2] on ARM64):
# ./perf test -v signal
17: Test breakpoint overflow signal handler :
--- start ---
test child forked, pid 10213
count1 1, count2 3, count3 2, overflow 3, overflows_2 3
test child finished with 0
---- end ----
Test breakpoint overflow signal handler: Ok
So at least 2 cases Will doubted are handled correctly.
[1] http://lkml.kernel.org/g/20160104165535.GI1616@arm.com
[2] http://lkml.kernel.org/g/1450921362-198371-1-git-send-email-wangnan0@huawei.com
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: He Kuang <hekuang@huawei.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1453715801-7732-9-git-send-email-wangnan0@huawei.com
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Copying perf to old kernel system results:
# perf test bpf
37: Test BPF filter :
37.1: Test basic BPF filtering : FAILED!
37.2: Test BPF prologue generation : Skip
However, in case when kernel doesn't support a test case it should
return 'Skip', 'FAILED!' should be reserved for kernel tests for when
the kernel supports a feature that then fails to work as advertised.
This patch checks environment before real testcase.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1453715801-7732-7-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There is a nasty confusion that, for kernel module, dso->kernel is not
necessary to be DSO_TYPE_KERNEL or DSO_TYPE_GUEST_KERNEL. These two
enums are for vmlinux. See thread [1]. We tried to fix this part but it
is costy.
Code machine__write_buildid_table() is another unfortunate function fall
into this trap that, when issuing buildid event for a kernel module,
cpumode it gives to the event is PERF_RECORD_MISC_USER, not
PERF_RECORD_MISC_KERNEL.
However, even with this bug, most of the time it doesn't causes real
problem. I find this issue when trying to use a perf before commit
3d39ac5386 ("perf machine: No need to have two DSOs lists") to parse a
perf.data generated by newest perf.
[1] https://lkml.org/lkml/2015/9/21/908
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1454089251-203152-1-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
On some architecture the size of the private header may be dependent on
the number of tracers used in the session. As such adding a "struct
perf_evlist *" parameter, which should contain all the required
information.
Also adjusting the existing client of the interface to take the new
parameter into account.
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Al Grant <al.grant@arm.com>
Cc: Chunyan Zhang <zhang.chunyan@linaro.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-doc@vger.kernel.org
Cc: Mike Leach <mike.leach@arm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Rabin Vincent <rabin@rab.in>
Cc: Tor Jeremiassen <tor@ti.com>
Link: http://lkml.kernel.org/r/1452807977-8069-22-git-send-email-mathieu.poirier@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The 'tools/perf/test/make' makefile has in its default, 'all' target
builds that will pollute the source code directory, i.e. that will not
use O= variable.
The 'build-test' should be run as often as possible, preferrably after
each non strictly non-code commit, so speed it up by selecting just
the O= targets.
Furthermore it tests both the Makefile.perf file, that is normally
driven by the main Makefile, and the Makefile, reduce the time in half
by having just MK=Makefile, the most usual, tested by 'build-test'.
Please run:
make -C tools/perf -f tests/make
from time to time for testing also the in-place build tests.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-jrt9utscsiqkmjy3ccufostd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To prevent the feature check tests to run repeately, one time per
'tests/make' target/test, this patch utilizes the previously introduced
'feature-dump' make target and FEATURES_DUMP variable, making sure that
the feature checkers run only once when doing build-test for normal test
cases.
However, since standard users doesn't reuse features dump result, we'd
better give an option to check their behaviors. The above feature
should be used to make build-test faster only. Only utilize it for
build-test.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1454068269-235999-1-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
'make feature-dump' should give a stable result, so even 'NO_SOMETHING=1'
is given (for babeltrace, if LIBBABELTRACE=1 is not given), we should
try to detect those feature and {C,LD}FLAGS. Build or not should be
controled independent.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Link: http://lkml.kernel.org/r/1454047050-204993-1-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Since it was always checking if the initialization was done, use that
branch to do the initialization if not done already.
With this we reduce the number of exported globals from these files.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20160125212955.GG22501@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The script and data-switch context menu are only meaningful when it
deals with a data file. So add a check so that it cannot be shown when
perf-top is run.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>,
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1453555902-18401-4-git-send-email-namhyung@kernel.org
[ Use goto skip_scripting instead of two is_report_browser() tests ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Set FEATURE_TESTS to 'all' so all possible feature checkers are
executed. Without this setting the output feature dump file miss some
feature, for example, liberity. Select all checker so we won't get an
incomplete feature dump file.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/1453715801-7732-5-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There's a bug in LLVM that it can generate unneeded relocation
information. See [1] and [2]. Libbpf should check the target section of
a relocation symbol.
This patch adds a testcase which references a global variable (BPF
doesn't support global variables). Before fixing libbpf, the new test
case can be loaded into kernel, the global variable acts like the first
map. It is incorrect.
Result:
# ~/perf test BPF
37: Test BPF filter :
37.1: Test basic BPF filtering : Ok
37.2: Test BPF prologue generation : Ok
37.3: Test BPF relocation checker : FAILED!
# ~/perf test -v BPF
...
libbpf: loading object '[bpf_relocation_test]' from buffer
libbpf: section .strtab, size 126, link 0, flags 0, type=3
libbpf: section .text, size 0, link 0, flags 6, type=1
libbpf: section .data, size 0, link 0, flags 3, type=1
libbpf: section .bss, size 0, link 0, flags 3, type=8
libbpf: section func=sys_write, size 104, link 0, flags 6, type=1
libbpf: found program func=sys_write
libbpf: section .relfunc=sys_write, size 16, link 10, flags 0, type=9
libbpf: section maps, size 16, link 0, flags 3, type=1
libbpf: maps in [bpf_relocation_test]: 16 bytes
libbpf: section license, size 4, link 0, flags 3, type=1
libbpf: license of [bpf_relocation_test] is GPL
libbpf: section version, size 4, link 0, flags 3, type=1
libbpf: kernel version of [bpf_relocation_test] is 40400
libbpf: section .symtab, size 144, link 1, flags 0, type=2
libbpf: map 0 is "my_table"
libbpf: collecting relocating info for: 'func=sys_write'
libbpf: relocation: insn_idx=7
Success unexpectedly: libbpf error when dealing with relocation
test child finished with -1
---- end ----
Test BPF filter subtest 2: FAILED!
[1] https://llvm.org/bugs/show_bug.cgi?id=26243
[2] https://patchwork.ozlabs.org/patch/571385/
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1453715801-7732-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There are cases where looking at just the next and prev entries is not
enough, like with:
$ readelf -sW /usr/lib/debug/lib/modules/4.3.3-301.fc23.x86_64/vmlinux | grep ffffffff81065ec0
4979: ffffffff81065ec0 53 FUNC LOCAL DEFAULT 1 try_to_free_pud_page
4980: ffffffff81065ec0 53 FUNC LOCAL DEFAULT 1 try_to_free_pte_page
4981: ffffffff81065ec0 53 FUNC LOCAL DEFAULT 1 try_to_free_pmd_page
So just search by name to see if the symbol is in kallsyms.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-jj1vlljg7ol4i713l60rt5ai@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To be used in the 'vmlinux matches kallsyms' 'perf test' entry.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-m56g1853lz2c6nhnqxibq4jd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now that we check more strictly what each of the menu entries need, we
can stop bailing out when 'sym' is not in the --sort order, instead we
let each option be added if what it needs is present.
This way, for instance, we can run scripts on all samples, see DSO map
details when 'dso' is in the --sort provided, etc.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>,
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1452960197-5323-9-git-send-email-namhyung@kernel.org
[ Carved out from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For consistency with the other sort order checks.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>,
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1452960197-5323-9-git-send-email-namhyung@kernel.org
[ Carved out from a larger patch, moved check to add_socket_opt() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We can't offer a zoom into DSO when a bucket (struct hist_entry) may
have samples for more than one DSO, i.e. when 'dso' is not part of
the sort order, ditto for 'Map details', fix it.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>,
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1452960197-5323-9-git-send-email-namhyung@kernel.org
[ Carved out from a larger patch, moved check to add_{dso,map}_opt() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When this feature was introduced a check was made if there was a
resolved symbol under the cursor, it got lost in commit ea7cd59233
("perf hists browser: Split popup menu actions - part 2"), reinstate it.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>,
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: ea7cd59233 ("perf hists browser: Split popup menu actions - part 2")
Link: http://lkml.kernel.org/r/1452960197-5323-9-git-send-email-namhyung@kernel.org
[ Carved out from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We can't offer a zoom into thread when a bucket (struct hist_entry) may
have samples for more than one thread, i.e. when 'pid' is not part of
the sort order, fix it.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>,
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1452960197-5323-9-git-send-email-namhyung@kernel.org
[ Carved out from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now the UI browsers will be able to offer thread related operations only
if the thread is part of the sort order in use, i.e. if hist_entry stats
are all for a single thread.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>,
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1452960197-5323-9-git-send-email-namhyung@kernel.org
[ Carved out from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Explain 'annotate' section and its variables.
'hide_src_code', 'use_offset', 'jump_arrows',
'show_linenr', 'show_nr_jump' and 'show_total_period'.
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1452253193-30502-5-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Explain 'tui' and 'gtk' sections and these variables.
'top', 'report' and 'annotate'
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1452253193-30502-3-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Explain 'colors' section and its variables, used for The variables for
customizing the colors used in the output for the 'report', 'top' and
'annotate' in the TUI, those are:
'top', 'medium', 'normal', 'selected',
'jump_arrows', 'addr' and 'root'.
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1452253193-30502-2-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
USe 'jump_arrows' config name instead of 'code' on 'colors' section.
'colors.code' config is only for jump arrows on assembly code listings
i.e.
│ ┌──jmp 1333
│ │ xchg %ax,%ax
│ │ mov %r15,%r10
│ └─→cmp %r15,%r14
But this config name seems unfit.
'jump_arrows' is more descriptive than 'code'.
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1452240971-25418-1-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf_event_paranoid was only documented in source code and a perf error
message. Copy the documentation from the error message to
Documentation/sysctl/kernel.txt.
perf_cpu_time_max_percent was already documented but missing from the
list at the top, so add it there.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-doc@vger.kernel.org
Link: http://lkml.kernel.org/r/20160119213515.GG2637@decadent.org.uk
[ Remove reference to external Documentation file, provide info inline, as before ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The hists__filter_by_xxx functions share same logic with different
filters. Factor out the common code into the hists__filter_by_type.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1453252521-24398-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The --exclude-other option sets HIST_FILTER__PARENT bit and it's only
set when a hist entry was created. DSO filters don't change this so no
need to have the check in hists__filter_by_dso() IMHO.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1453252521-24398-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There's no need for the following functions to be global:
perf_evsel__reset_stat_priv
perf_evsel__alloc_stat_priv
perf_evsel__free_stat_priv
perf_evsel__alloc_prev_raw_counts
perf_evsel__free_prev_raw_counts
perf_evsel__alloc_stats
They all ended up in util/stat.c, and they no longer need to be called
from outside this object.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453290995-18485-5-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
With mem sampling we could get data source within mapped device file.
Processing such sample would block during report phase on trying to read
the device file.
Chacking for device files and skip the processing if it's detected.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453290995-18485-2-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
One line in perf_pmu__parse_unit() is indented wrongly, leading to a
warning (=> error) from gcc 6:
util/pmu.c:156:3: error: statement is indented as if it were guarded by... [-Werror=misleading-indentation]
sret = read(fd, alias->unit, UNIT_MAX_LEN);
^~~~
util/pmu.c:153:2: note: ...this 'if' clause, but it is not
if (fd == -1)
^~
Signed-off-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: 410136f5dd ("tools/perf/stat: Add event unit and scale support")
Link: http://lkml.kernel.org/r/20151214154440.GC1409@x4
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Mel reported stddev reporting was broken due to following commit:
106a94a0f8 ("perf stat: Introduce read_counters function")
This commit merged interval and overall counters reading into single
read_counters function.
The old interval code cleaned the stddev data for some reason (it's
never displayed in interval mode) and the mentioned commit kept on
cleaning the stddev data in merged function, which resulted in the
stddev not being displayed.
Removing the wrong stddev data cleanup init_stats call.
Reported-and-Tested-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: stable@vger.kernel.org # v4.2+
Fixes: 106a94a0f8 ("perf stat: Introduce read_counters function")
Link: http://lkml.kernel.org/r/1453290995-18485-4-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The issue was pointed out by gcc-6's -Wmisleading-indentation.
Signed-off-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: c97cf42219 ("perf top: Live TUI Annotation")
Link: http://lkml.kernel.org/r/20151214154403.GB1409@x4
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The while loop was spinning. Fix by removing a semicolon.
The issue was pointed out by gcc-6's -Wmisleading-indentation.
Signed-off-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: 035827e9f2 ("perf tests: Add Intel CQM test")
Link: http://lkml.kernel.org/r/20151214154335.GA1409@x4
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
- Remove usage of ib_query_device and instead store attributes in
ib_device struct
- Move iopoll out of block and into lib, rename to irqpoll, and use
in several places in the rdma stack as our new completion queue
polling library mechanism. Update the other block drivers that
already used iopoll to use the new mechanism too.
- Replace the per-entry GID table locks with a single GID table lock
- IPoIB multicast cleanup
- Cleanups to the IB MR facility
- Add support for 64bit extended IB counters
- Fix for netlink oops while parsing RDMA nl messages
- RoCEv2 support for the core IB code
- mlx4 RoCEv2 support
- mlx5 RoCEv2 support
- Cross Channel support for mlx5
- Timestamp support for mlx5
- Atomic support for mlx5
- Raw QP support for mlx5
- MAINTAINERS update for mlx4/mlx5
- Misc ocrdma, qib, nes, usNIC, cxgb3, cxgb4, mlx4, mlx5 updates
- Add support for remote invalidate to the iSER driver (pushed through the
RDMA tree due to dependencies, acknowledged by nab)
- Update to NFSoRDMA (pushed through the RDMA tree due to dependencies,
acknowledged by Bruce)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWoSygAAoJELgmozMOVy/dDjsP/2vbTda2MvQfkfkGEZBQdJSg
095RN0gQgCJdg78lAl8yuaK8r4VN/7uefpDtFdudH1I/Pei7X0wxN9R1UzFNG4KR
AD53lz92IVPs15328SbPR2kvNWISR9aBFQo3rlElq3Grqlp0EMn2Ou1vtu87rekF
aMllxr8Nl0uZhP+eWusOsYpJUUtwirLgRnrAyfqo2UxZh/TMIroT0TCx1KXjVcAg
dhDARiZAdu3OgSc6OsWqmH+DELEq6dFVA5F+DDBGAb8bFZqlJc7cuMHWInwNsNXT
so4bnEQ835alTbsdYtqs5DUNS8heJTAJP4Uz0ehkTh/uNCcvnKeUTw1c2P/lXI1k
7s33gMM+0FXj0swMBw0kKwAF2d9Hhus9UAN7NwjBuOyHcjGRd5q7SAnfWkvKx000
s9jVW19slb2I38gB58nhjOh8s+vXUArgxnV1+kTia1+bJSR5swvVoWRicRXdF0vh
TvLX/BjbSIU73g1TnnLNYoBTV3ybFKQ6bVdQW7fzSTDs54dsI1vvdHXi3bYZCpnL
HVwQTZRfEzkvb0AdKbcvf8p/TlaAHem3ODqtO1eHvO4if1QJBSn+SptTEeJVYYdK
n4B3l/dMoBH4JXJUmEHB9jwAvYOpv/YLAFIvdL7NFwbqGNsC3nfXFcmkVORB1W3B
KEMcM2we4bz+uyKMjEAD
=5oO7
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Pull rdma updates from Doug Ledford:
"Initial roundup of 4.5 merge window patches
- Remove usage of ib_query_device and instead store attributes in
ib_device struct
- Move iopoll out of block and into lib, rename to irqpoll, and use
in several places in the rdma stack as our new completion queue
polling library mechanism. Update the other block drivers that
already used iopoll to use the new mechanism too.
- Replace the per-entry GID table locks with a single GID table lock
- IPoIB multicast cleanup
- Cleanups to the IB MR facility
- Add support for 64bit extended IB counters
- Fix for netlink oops while parsing RDMA nl messages
- RoCEv2 support for the core IB code
- mlx4 RoCEv2 support
- mlx5 RoCEv2 support
- Cross Channel support for mlx5
- Timestamp support for mlx5
- Atomic support for mlx5
- Raw QP support for mlx5
- MAINTAINERS update for mlx4/mlx5
- Misc ocrdma, qib, nes, usNIC, cxgb3, cxgb4, mlx4, mlx5 updates
- Add support for remote invalidate to the iSER driver (pushed
through the RDMA tree due to dependencies, acknowledged by nab)
- Update to NFSoRDMA (pushed through the RDMA tree due to
dependencies, acknowledged by Bruce)"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (169 commits)
IB/mlx5: Unify CQ create flags check
IB/mlx5: Expose Raw Packet QP to user space consumers
{IB, net}/mlx5: Move the modify QP operation table to mlx5_ib
IB/mlx5: Support setting Ethernet priority for Raw Packet QPs
IB/mlx5: Add Raw Packet QP query functionality
IB/mlx5: Add create and destroy functionality for Raw Packet QP
IB/mlx5: Refactor mlx5_ib_qp to accommodate other QP types
IB/mlx5: Allocate a Transport Domain for each ucontext
net/mlx5_core: Warn on unsupported events of QP/RQ/SQ
net/mlx5_core: Add RQ and SQ event handling
net/mlx5_core: Export transport objects
IB/mlx5: Expose CQE version to user-space
IB/mlx5: Add CQE version 1 support to user QPs and SRQs
IB/mlx5: Fix data validation in mlx5_ib_alloc_ucontext
IB/sa: Fix netlink local service GFP crash
IB/srpt: Remove redundant wc array
IB/qib: Improve ipoib UD performance
IB/mlx4: Advertise RoCE v2 support
IB/mlx4: Create and use another QP1 for RoCEv2
IB/mlx4: Enable send of RoCE QP1 packets with IP/UDP headers
...
Introducing FEATURES_DUMP make variable to provide features
detection dump file and bypass the feature detection.
The intention is to use this during build tests to skip
repeated features detection, like:
Get feature dump static build into /tmp/fd file:
$ make feature-dump FEATURE_DUMP_COPY=/tmp/fd LDFLAGS=-static
BUILD: Doing 'make -j4' parallel build
Auto-detecting system features:
... dwarf: [ OFF ]
SNIP
FEATURE-DUMP file copied into /tmp/fd
Use /tmp/fd to build perf:
$ make FEATURES_DUMP=/tmp/fd LDFLAGS=-static
$ file perf
perf: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), statically linked, for ...
Suggested-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1452830421-77757-7-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To provide FEATURE-DUMP into $(FEATURE_DUMP_COPY) if defined, with no
further action.
Get feature dump of the current build:
$ make feature-dump
BUILD: Doing 'make -j4' parallel build
Auto-detecting system features:
... dwarf: [ on ]
FEATURE-DUMP file available in FEATURE-DUMP
Get feature dump static build into /tmp/fd file:
$ make feature-dump FEATURE_DUMP_COPY=/tmp/fd LDFLAGS=-static
BUILD: Doing 'make -j4' parallel build
Auto-detecting system features:
... dwarf: [ OFF ]
SNIP
FEATURE-DUMP file copied into /tmp/fd
Suggested-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1452830421-77757-6-git-send-email-wangnan0@huawei.com
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Kernel makefile only follows an 'O' option passed from command line
explicitely. In build-test with 'O' option set, kernel makefile
contaminate kernel source directory. Build test also fail if we don't
create output directory manually.
K_O_OPT is added and passed to kernel makefile if 'O' is passed
to build-test.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1452830421-77757-5-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If an 'O' is passed to 'make build-test', many 'test -x' and 'test -f'
will fail because perf resides in a different directory. Fix this by
computing PERF_OUT according to 'O' and test correct output files.
For make_kernelsrc and make_kernelsrc_tools, set KBUILD_OUTPUT_DIR
instead because the path is different from others ($(O)/perf vs
$(O)/tools/perf).
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1452830421-77757-4-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Unlike tools/perf/Makefile, tools/perf/Makefile.perf obey 'O' option
when it is passed through cmdline only, due to code in
tools/scripts/Makefile.include:
ifneq ($(O),)
ifeq ($(origin O), command line)
...
ABSOLUTE_O := $(shell cd $(O) ; pwd)
OUTPUT := $(ABSOLUTE_O)/$(if $(subdir),$(subdir)/)
endif
endif
This patch passes 'O' to Makefile.perf through cmdline explicitly
to make it follow O variable during build-test.
'make clean' should have identical 'O' option with 'make'. If not,
config-clean may error.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1452830421-77757-3-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
'make build-test' is painful because of time consuming. In a full test,
all test cases are built twice with tools/perf/Makefile and
tools/perf/Makefile.perf. 'Makefile' automatically computes parallel
options for make, but 'Makefile.perf' not, so all test cases is built
with one job. It is very slow.
This patch adds '-j' options to Makefile.perf testing. It computes
parallel building options like what tools/perf/Makefile does, and pass
'-j' option to Makefile.perf test.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1452687442-6186-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We need to use the long name (the filename) when reading the build-id
from a DSO. Using the short name doesn't work for (at least) vDSOs.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20160113172301.GT28542@decadent.org.uk
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
While recording guest samples in host using perf kvm record, it will
populate unprocessable sample error, though samples will be recorded
properly. While generating report using perf kvm report, no samples will
be processed and same error will populate. We have seen this behaviour
with upstream perf(4.4-rc3) on x86 and ppc64 hardware.
Reason behind this failure is, when it tries to fetch machine from
rb_tree of machines, it fails. As a part of tracing a bug, we figured
out that this code was incorrectly refactored in commit 54245fdc35
("perf session: Remove wrappers to machines__find").
This patch will change the functionality such that if it can't fetch
machine in first trial, it will create one node of machine and add that to
rb_tree. So next time when it tries to fetch same machine from rb_tree,
it won't fail. Actually it was the case before refactoring of code in
aforementioned commit.
This patch is generated from acme perf/core branch.
Below I've mention an example that demonstrate the behaviour before and
after applying patch.
Before applying patch:
[Note: One needs to run guest before recording data in host]
ravi@ravi-bangoria:~$ ./perf kvm record -a
Warning:
5903 unprocessable samples recorded.
Do you have a KVM guest running and not using 'perf kvm'?
[ perf record: Captured and wrote 1.409 MB perf.data.guest (285 samples) ]
ravi@ravi-bangoria:~$ ./perf kvm report --stdio
Warning:
5903 unprocessable samples recorded.
Do you have a KVM guest running and not using 'perf kvm'?
# To display the perf.data header info, please use --header/--header-only options.
#
# Total Lost Samples: 0
#
# Samples: 285 of event 'cycles'
# Event count (approx.): 88715406
#
# Overhead Command Shared Object Symbol
# ........ ....... ............. ......
#
# (For a higher level overview, try: perf report --sort comm,dso)
#
After applying patch:
ravi@ravi-bangoria:~$ ./perf kvm record -a
[ perf record: Captured and wrote 1.188 MB perf.data.guest (17 samples) ]
ravi@ravi-bangoria:~$ ./perf kvm report --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
# Total Lost Samples: 0
#
# Samples: 17 of event 'cycles'
# Event count (approx.): 700746
#
# Overhead Command Shared Object Symbol
# ........ ....... ................ ......................
#
34.19% :5758 [unknown] [g] 0xffffffff818682ab
22.79% :5758 [unknown] [g] 0xffffffff812dc7f8
22.79% :5758 [unknown] [g] 0xffffffff818650d0
14.83% :5758 [unknown] [g] 0xffffffff8161a1b6
2.49% :5758 [unknown] [g] 0xffffffff818692bf
0.48% :5758 [unknown] [g] 0xffffffff81869253
0.05% :5758 [unknown] [g] 0xffffffff81869250
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org # v3.19+
Fixes: 54245fdc35 ("perf session: Remove wrappers to machines__find")
Link: http://lkml.kernel.org/r/1449471302-11283-1-git-send-email-ravi.bangoria@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Some people don't install perf, but just use compiled version in the
source. Fallback to lookup the source directory for those poor guys. :)
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1452334589-8782-4-git-send-email-namhyung@kernel.org
[ Make perf_tip() return NULL for ENOENT, making the fallback to really take place ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When a tip message contains a percent sign, it was treated printf format
specifier so broken string was printed like below.
Tip: Limit to show entries above 577nly: perf report --percent-limit 5
^^^
As ui_browser__show receives format string, pass additional "%s" so that
the help (tip) message can be printed as is.
Tip: Limit to show entries above 5% only: perf report --percent-limit 5
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1452509594-13616-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It'll be used to locate perf document directory to find tips.txt in case
it's not installed on the system.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1452334589-8782-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If strlist_config.dirname is present, the strlist__new() tries to load
stirngs from dirname/list file first but if it failes it falls back to
add 'list' as string. But sometimes it's not desired so adds new
file_only field to prevent it.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1452334589-8782-2-git-send-email-namhyung@kernel.org
[ Add documentation for strlist_config::file_only, in the struct definition */
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The --buildid-all option is to record build-id of all DSOs in the file.
It might be very costly to postprocess samples to find which DSO hits.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1452519429-31779-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf_event__synthesize_mmap_events() issues mmap2 events, but the memory
of that event is allocated using:
mmap_event = malloc(sizeof(mmap_event->mmap) + machine->id_hdr_size);
If path of mmap source file is long (near PATH_MAX), random crash would
happen. Should use sizeof(mmap_event->mmap2).
Fix two memory allocations.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1452593524-138970-1-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Markus reported gcc 6 complains when compiling perf stat command:
builtin-stat.c:1591:27: error: ‘recort_usage’ defined but not used
[-Werror=unused-const-variable]
static const char * const recort_usage[] = {
^~~~~~~~~~~~
I fixed the typo and realized we already export record_usage, so I also
prefixed it with stat (and included report_usage).
Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1452591329-27620-1-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
All hists test cases forget to reset err after using it to hold an
error code. If error occure in setup_fake_machine() it incorrectly
return TEST_OK.
This patch fixes it.
Suggested-and-Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1452520124-2073-13-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Commit 71d6de64fe ("perf test: Fix hist testcases when kptr_restrict is on")
solves a double free problem when 'perf test hist' calling
setup_fake_machine(). However, the result is still incorrect. For example:
$ ./perf test -v 'filtering hist entries'
25: Test filtering hist entries :
--- start ---
test child forked, pid 4186
Cannot create kernel maps
test child finished with 0
---- end ----
Test filtering hist entries: Ok
In this case the body of this test is not get executed at all, but the
result is 'Ok'.
Actually, in setup_fake_machine() there's no need to create real kernel
maps. What we want are the fake maps. This patch removes the
machine__create_kernel_maps() in setup_fake_machine(), so it won't be
affected by kptr_restrict setting.
Test result:
$ cat /proc/sys/kernel/kptr_restrict
1
$ ~/perf test -v hist
15: Test matching and linking multiple hists :
--- start ---
test child forked, pid 24031
test child finished with 0
---- end ----
Test matching and linking multiple hists: Ok
[SNIP]
Suggested-and-Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1452520124-2073-12-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
After this patch other directories can use this architecture detector
without directly including it from perf's directory. Libbpf would
utilize it to get proper $(ARCH) so it can receive correct uapi include
directory.
Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1452520124-2073-8-git-send-email-wangnan0@huawei.com
[ Add missing srctree definition in tests/make ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@kernel.org>
make_kernelsrc and make_kernelsrc_tools are skiped if a previous
build-test is done, because 'make build-test' creates two files with
same names. To avoid this, they should be included in .PHONY list.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1452520124-2073-3-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
On some system the perf-config is broken, causes link failure like this:
/usr/lib64/python2.7/config/libpython2.7.a(posixmodule.o): In function `posix_forkpty':
/opt/wangnan/yocto-build/tmp-eglibc/work/x86_64-oe-linux/python/2.7.3-r0.3.1/Python-2.7.3/./Modules/posixmodule.c:3816: undefined reference to `forkpty'
/usr/lib64/python2.7/config/libpython2.7.a(posixmodule.o): In function `posix_openpty':
/opt/wangnan/yocto-build/tmp-eglibc/work/x86_64-oe-linux/python/2.7.3-r0.3.1/Python-2.7.3/./Modules/posixmodule.c:3756: undefined reference to `openpty'
collect2: error: ld returned 1 exit status
make[1]: *** [/home/wangnan/kernel-hydrogen/tools/perf/out/perf] Error 1
make: *** [all] Error 2
$ python-config --libs
-lpthread -ldl -lpthread -lutil -lm -lpython2.7
In this case a '-lutil' should be appended to -lpython2.7.
(I know we have --start-group and --end-group. I can see them in command
line of collect2 by strace. However it doesn't work. Seems I have a
broken environment?)
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1452520124-2073-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding missing bitmap.[ch] sources to the MANIFEST file. Fixes building
'make perf-*-src-pkg' generated tarballs.
Reported-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Fixes: 915b0882c3 ("tools lib: Move bitmap.[ch] from tools/perf/ to tools/{lib,include}/")
Link: http://lkml.kernel.org/r/1452509693-13452-5-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently we don't synthesize data mmap by default. It depends on -d
option, that enables data address sampling.
But we've seen cases (softice) where DWARF unwinder went through non
executable mmaps, which we need to lookup in MAP__VARIABLE tree.
Making data mmaps to be synthesized for dwarf unwind as well.
Reported-by: Noel Grandin <noelgrandin@gmail.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20160107133022.GA32115@krava.brq.redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We've seen cases (softice) where DWARF unwinder went through non
executable mmaps, which we need to lookup in MAP__VARIABLE tree.
Reported-and-Tested-by: Noel Grandin <noelgrandin@gmail.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1452158050-28061-6-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We've seen cases (softice) where DWARF unwinder went through non
executable mmaps, which we need to lookup in MAP__VARIABLE tree.
Reported-and-Tested-by: Noel Grandin <noelgrandin@gmail.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1452158050-28061-5-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The find_map helper is already there, so let's use it.
Also we're going to introduce wider search in following patch, so it'll
be easier to make this change on single place.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Noel Grandin <noelgrandin@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1452158050-28061-4-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Replacing them with perf_evsel__(enable|disable).
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Noel Grandin <noelgrandin@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1452158050-28061-2-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
'perf record' uses perf_evsel__open() to open events and passes the
evsel->cpus and evsel->threads. Many tests and some tools instead use
perf_evlist__open() which passes instead evlist->cpus and
evlist->threads.
Make perf_evlist__open() follow the 'perf record' behaviour so that a
consistent approach is taken.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Noel Grandin <noelgrandin@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1452158050-28061-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently perf report only shows a help message "For a higher level
overview, try: perf report --sort comm,dso" unconditionally (even if
the sort keys were used). Add more help tips and show randomly.
Load tips from ${prefix}/share/doc/perf-tip/tips.txt file.
$ perf report | tail
0.10% swapper [kernel.vmlinux] [k] irq_exit
0.09% swapper [kernel.vmlinux] [k] flush_smp_call_function_queue
0.08% swapper [kernel.vmlinux] [k] native_write_msr_safe
0.03% swapper [kernel.vmlinux] [k] group_sched_in
0.01% perf [kernel.vmlinux] [k] native_write_msr_safe
#
# (Tip: Search options using a keyword: perf report -h <keyword>)
#
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1452166913-27046-1-git-send-email-namhyung@kernel.org
[ Renamed it to perf_tip() and the parameter dirname to dirpath to fix the build on older distros ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
These are necessary for multi threaded sample processing:
- hists__get__get_rotate_entries_in()
- hists__collapse_insert_entry()
- __hists__init()
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Noel Grandin <noelgrandin@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1452158050-28061-14-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Using perf_hpp__register_sort_field interface instead of directly adding
the entry.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Noel Grandin <noelgrandin@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1452158050-28061-13-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We currently set 'overhead' and 'overhead_children' as default sort keys
within perf_hpp__init function by directly adding into the sort list.
This patch adds 'overhead' and 'overhead_children' in text form into
sort_keys and let them be added by standard sort dimension interface.
We need to eliminate dirrect sort_list additions to be able to add
support for hists specific sort keys.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Noel Grandin <noelgrandin@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1452158050-28061-12-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Besides lockdep we use all the 'tools/lib' code in perf, so include it
completely in tags.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Noel Grandin <noelgrandin@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1452158050-28061-10-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
These lost headers are found in arm64 cross buildings, failing to build
perf using tarballs generated using:
$ make perf-targz-src-pkg
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1452263041-225488-3-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The trace command still appears in help message when you run simple
'perf' command.
It's because the generate-cmdlist.sh does not care about the
HAVE_LIBAUDIT_SUPPORT dependency of trace command and puts it into
generated common_cmds array.
Wrapping trace command under HAVE_LIBAUDIT_SUPPORT dependency, which
will exclude it from common_cmds array if HAVE_LIBAUDIT_SUPPORT is not
set.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Noel Grandin <noelgrandin@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1452158050-28061-8-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The event group view feature is to see related events together. To use
the group view, events should be recorded as a group with a dedicated
syntax of surrounding events by braces (-e '{ evt1, evt2, ... }').
Also 'perf report' also requires the --group option to enable it.
However it's almost always beneficial to use the group view to see the
group events as it's more expressive. And I think it's more natural to
see events together if they are recorded as a group.
Thus this patch changes the default value to enable it. If users don't
want to see like it and keep the original behavior, they can set the
report.group config variable to false and/or use --no-group option in
the 'perf report' command line.
Requested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Taeung Song <treeze.taeung@gmail.com>
Link: http://lkml.kernel.org/r/1448807057-3506-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It missed to decay periods in callchains when decaying hist entries.
This resulted in more than 100 percent overhead in callchains in the
fractal style output.
Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1451963160-17196-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Need to move the bitmap.[ch] things from tools/perf/ to tools/lib, will
be done in the next patches.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: George Spelvin <linux@horizon.com
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: http://lkml.kernel.org/n/tip-5fys65wkd7gu8j7a7xgukc5t@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The commit that introduced it should've moved it to the same place, plus
the 'tools/' prefix, but instead moved it to a bogus tools/lib/util/
directory, being the only file there.
Move it to tools/lib/find_bit.c, picking the name for the file where
these routines live since:
8f6f19dd51 ("lib: move find_last_bit to lib/find_next_bit.c")
Next step is to make tools/lib/find_bit.c to differ from lib/find_bit.c
just in removing what is not used by tools/.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: George Spelvin <linux@horizon.com
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: http://lkml.kernel.org/n/tip-p391cex5mqvahp4pwrton87n@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Before:
$ perf test -v cqm
48: Test intel cqm nmi context read :
--- start ---
test child forked, pid 1681
parse_events failed
test child finished with -2
---- end ----
Test intel cqm nmi context read: Skip
$
After:
$ perf test -v cqm
48: Test intel cqm nmi context read :
--- start ---
test child forked, pid 1681
parse_events failed, is "intel_cqm/llc_occupancy/" available?
test child finished with -2
---- end ----
Test intel cqm nmi context read: Skip
$
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Matt Fleming <matt.fleming@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-eidpiv5x4nkbsx37xwikbnir@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We were asking for a 4kHz sample_freq, making the test fail needlessly
when the system reduced /proc/sys/kernel/perf_event_max_sample_rate
below that.
Before:
# perf test -vv dummy
23: Test using a dummy software event to keep tracking :
--- start ---
test child forked, pid 32421
------------------------------------------------------------
perf_event_attr:
type 1
size 112
config 0x9
{ sample_period, sample_freq } 4000
sample_type IP|TID|ID|PERIOD
<SNIP>
sys_perf_event_open failed, error -22
Unable to open dummy and cycles event
test child finished with -2
---- end ----
Test using a dummy software event to keep tracking: Skip
#
[root@zoo ~]# cat /proc/sys/kernel/perf_event_max_sample_rate
1000
After:
[root@zoo ~]# perf test dummy
23: Test using a dummy software event to keep tracking : Ok
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-487iquegrs2379e5n0pi0tcp@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fixing this problem, introduced recently:
$ perf test python
16: Try 'import perf' in python, checking link problems : FAILED!
In verbose mode we find out what is missing:
$ perf test -v python
16: Try 'import perf' in python, checking link problems :
--- start ---
test child forked, pid 24894
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: /tmp/build/perf/python/perf.so: undefined symbol: find_next_bit
test child finished with -1
---- end ----
Try 'import perf' in python, checking link problems: FAILED!
$
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: f77b57ad4f ("perf cpu_map: Add cpu_map__new_event function")
Link: http://lkml.kernel.org/n/tip-rajx0zkz6czdrnvvwf0jp76p@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We're not looking at PERF_RECORD_SAMPLE entries and now by default we
use PERF_COUNT_SW_DUMMY, so just remove that setting.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-cly7cnotktv5rqao13pkorem@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As we're test just the !PERF_RECORD_SAMPLE records.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-qp8radcz3il4q9wbnseh337d@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For case where all we need is an evlist with just an "dummy" evsel,
like in some 'perf test' entries.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-q52le0pblm2k3ncvyilelr9z@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We were asking for a 4kHz sample_freq, making the test fail needlessly
when the system reduced /proc/sys/kernel/perf_event_max_sample_rate
below that.
In this test we only look at the PERF_SAMPLE_TIME fields in PERF_RECORD_
meta events, no need to set sample_freq.
Thanks to Namhyung for suggesting that max_sample_rate could be the
reason for the test failure, seeing the 'perf test -vv' output I sent.
Before:
# echo 1000 > /proc/sys/kernel/perf_event_max_sample_rate
# perf test TSC
45: Test converting perf time to TSC : FAILED!
After:
# perf test TSC
45: Test converting perf time to TSC : Ok
# cat /proc/sys/kernel/perf_event_max_sample_rate
1000
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-lcob05qhawkuvsyuu9g1fld5@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch fixes a bug in __perf_pmu__new_alias() whereby the
alias->snapshot field was not initialized to false. This led to random
alias->snapshot value for an alias and was breaking some measurements
such as:
$ perf stat -a -e uncore_imc/data_reads/ -I 1000 sleep 100
Because the event ended up being treated as snapshot mode, when it is
not.
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1452106201-13073-1-git-send-email-eranian@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding stat-cpi.py as an example of how to do stat scripting.
It computes the CPI metrics from cycles and instructions events.
The CPI is based performance metric showing the Cycles Per Instructions
ratio, which helps to identify cycles-hungry code.
Following stat record/report/script combinations could be used:
- get CPI for given workload
$ perf stat -e cycles,instructions record ls
SNIP
Performance counter stats for 'ls':
2,904,431 cycles
3,346,878 instructions # 1.15 insns per cycle
0.001782686 seconds time elapsed
$ perf script -s ./scripts/python/stat-cpi.py
0.001783: cpu -1, thread -1 -> cpi 0.867803 (2904431/3346878)
$ perf stat -e cycles,instructions record ls | perf script -s ./scripts/python/stat-cpi.py
SNIP
0.001730: cpu -1, thread -1 -> cpi 0.869026 (2928292/3369627)
- get CPI systemwide:
$ perf stat -e cycles,instructions -a -I 1000 record sleep 3
# time counts unit events
1.000158618 594,274,711 cycles (100.00%)
1.000158618 441,898,250 instructions
2.000350973 567,649,705 cycles (100.00%)
2.000350973 432,669,206 instructions
3.000559210 561,940,430 cycles (100.00%)
3.000559210 420,403,465 instructions
3.000670798 780,105 cycles (100.00%)
3.000670798 326,516 instructions
$ perf script -s ./scripts/python/stat-cpi.py
1.000159: cpu -1, thread -1 -> cpi 1.344823 (594274711/441898250)
2.000351: cpu -1, thread -1 -> cpi 1.311972 (567649705/432669206)
3.000559: cpu -1, thread -1 -> cpi 1.336669 (561940430/420403465)
3.000671: cpu -1, thread -1 -> cpi 2.389178 (780105/326516)
$ perf stat -e cycles,instructions -a -I 1000 record sleep 3 | perf script -s ./scripts/python/stat-cpi.py
1.000202: cpu -1, thread -1 -> cpi 1.035091 (940778881/908885530)
2.000392: cpu -1, thread -1 -> cpi 1.442600 (627493992/434974455)
3.000545: cpu -1, thread -1 -> cpi 1.353612 (741463930/547766890)
3.000622: cpu -1, thread -1 -> cpi 2.642110 (784083/296764)
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1452077397-31958-4-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We can't convert u16 cpu_map_entries::cpu[x] value directly to int,
because it could hold -1, which would be converted as 65535.
Adding special treatment for -1, which is not real cpu number, to be
converted to (int -1).
Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1452077397-31958-2-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add support to get stat events data in perf python scripts.
The python script shall implement the following new interface to process
stat data:
def stat__<event_name>_[<modifier>](cpu, thread, time, val, ena, run):
- is called for every stat event for given counter,
if user monitors 'cycles,instructions:u" following
callbacks should be defined:
def stat__cycles(cpu, thread, time, val, ena, run):
def stat__instructions_u(cpu, thread, time, val, ena, run):
def stat__interval(time):
- is called for every interval with its time,
in non interval mode it's called after last
stat event with total measured time in ns
The rest of the current interface stays untouched..
Please check example CPI metrics script in following patch
with command line examples in changelogs.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1452028152-26762-8-git-send-email-jolsa@kernel.org
[ Rename 'time' parameters to 'tstamp', to fix the build in older distros ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Implement struct scripting_ops::(process_stat|process_stat_interval)
handlers - calling scripting handlers from stat events handlers.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1452028152-26762-6-git-send-email-jolsa@kernel.org
[ Rename 'time' parameters to 'tstamp', to fix the build in older distros ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Python and perl scripting code will define those callbacks and get stat
data.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1452028152-26762-5-git-send-email-jolsa@kernel.org
[ Rename 'time' parameters to 'tstamp', to fix the build in older distros ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding processing of stat config event and initialize stat_config
object.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1452028152-26762-4-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding processing of cpu/threads maps. Configuring session's evlist with
these maps.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1452028152-26762-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For pipe sessions we need to keep sample_type zero, because script's
perf_evsel__check_attr is triggered by sample_type != 0, and the check
would fail on stat session.
I was tempted to keep it zero unconditionally, but the pipe session is
sufficient. In perf.data session we are guarded by HEADER_STAT feature.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1452028152-26762-2-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1451991518-25673-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When a perf.data file has multiple events, it's likely to be similar
(tracepoint) events. In that case, they might have same field name so
add all of them to sort keys instead of bailing out.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1451991518-25673-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Using FEATURE-DUMP in bpf subproject for features detection in case bpf
is built via perf. Keeping the current features detection otherwise.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Wang Nan <wangnan0@huawei.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama <pi3orama@163.com>
Link: http://lkml.kernel.org/r/1450893514-9158-6-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We decide what dwarf unwind to choose way after the Makefile.feature
makefile is included. The $(dwarf-post-unwind) is not even set at that
time. For the same reason it was never included in FEATURE-DUMP file.
Moving it into perf VF=1 verbose display.
$ make VF=1
BUILD: Doing 'make -j4' parallel build
Auto-detecting system features:
... dwarf: [ on ]
...
... LIBUNWIND_DIR:
... LIBDW_DIR:
... DWARF post unwind library: libunwind
...
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Wang Nan <wangnan0@huawei.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama <pi3orama@163.com>
Link: http://lkml.kernel.org/r/1450893514-9158-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When an evlist contains tracepoint events only, use 'trace' sort key as
default. If --raw-trace option was given, use 'trace_fields' instead.
This will make users more convenient to see trace result.
Suggested-and-Acked-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1450804030-29193-14-git-send-email-namhyung@kernel.org
[ Check evlist in get_default_sort_order() fixing a segfault in 'perf test hists' reported by Jiri Olsa ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The dynamic sort key requires event name but specifying full event name
is rather inconvenient. This patch adds more ways to identify the event
in a more compact way.
1. If session has just one event, event name can be omitted.
2. Events can be accessed by index preceded by a percent sign.
3. A part of the name can be used, if it's not ambiguous. The partial
name should not contain ':' in it.
4. Full system + event name is still used, it should contain ':'.
So in the below example all does same thing:
$ perf record -e sched:sched_switch -a sleep 1
$ perf report -s next_pid,next_comm
$ perf report -s %1.next_pid,%1.next_comm
$ perf report -s switch.next_pid,switch.next_comm
$ perf report -s sched:sched_switch.next_pid,sched:sched_switch.next_comm
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1450804030-29193-10-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The 'trace' sort key is to show tracepoint event output using either
print fmt or plugin. For example sched_switch event (using plugin) will
show output like below:
# perf record -e sched:sched_switch -a usleep 10
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.197 MB perf.data (69 samples) ]
#
$ perf report -s trace --stdio
...
# Overhead Trace output
# ........ ...................................................
#
9.48% swapper/0:0 [120] R ==> transmission-gt:17773 [120]
9.48% transmission-gt:17773 [120] S ==> swapper/0:0 [120]
9.04% swapper/2:0 [120] R ==> transmission-gt:17773 [120]
8.92% transmission-gt:17773 [120] S ==> swapper/2:0 [120]
5.25% swapper/0:0 [120] R ==> kworker/0:1H:109 [100]
5.21% kworker/0:1H:109 [100] S ==> swapper/0:0 [120]
1.78% swapper/3:0 [120] R ==> transmission-gt:17773 [120]
1.78% transmission-gt:17773 [120] S ==> swapper/3:0 [120]
1.53% Xephyr:6524 [120] S ==> swapper/0:0 [120]
1.53% swapper/0:0 [120] R ==> Xephyr:6524 [120]
1.17% swapper/2:0 [120] R ==> irq/33-iwlwifi:233 [49]
1.13% irq/33-iwlwifi:233 [49] S ==> swapper/2:0 [120]
Note that the 'trace' sort key works only for tracepoint events. If
it's used to other type of events, just "N/A" will be printed.
Suggested-and-acked-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1450804030-29193-8-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Each tracepoint event has format string for print to improve
readability. Try to parse the output and match the field name. If it
finds one, use that for the result. If not, fallbacks to the original
output.
For example, sort on kmem:kmalloc.gfp_flags looks like below:
(Note: libtraceevent plugins are not installed on my system. They might
affect the output below)
Before:
# Overhead Command gfp_flags
# ........ ....... ..........
#
99.89% perf 32848
0.06% sleep 208
0.03% perf 32976
0.01% perf 208
After:
# Overhead Command gfp_flags
# ........ ....... ...................
#
99.89% perf GFP_NOFS|GFP_ZERO
0.06% sleep GFP_KERNEL
0.03% perf GFP_KERNEL|GFP_ZERO
0.01% perf GFP_KERNEL
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1450804030-29193-7-git-send-email-namhyung@kernel.org
[ Fixed clash with earlier, updated patch in this patchkit ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This is a preparation to support dynamic sort keys for tracepoint
events. Dynamic sort keys can be created for specific fields in trace
events so it needs the event information.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1450804030-29193-5-git-send-email-namhyung@kernel.org
[ Moving the evlist creation earlier in top was split to a previous patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This is a preparation to support dynamic sort keys for tracepoint
events. Dynamic sort keys can be created for specific fields in trace
events so it needs the event information, so we need to pass the evlist
to the sort routines, create it sooner so that the next patch can do
that.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1450804030-29193-5-git-send-email-namhyung@kernel.org
[ Split from the patch passing the evlist to the sort routines ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The raw_data and raw_size fields are to provide tracepoint specific
information. They will be used by dynamic sort keys later.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1450923377-18641-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This is a preparation to add more info into the hist_entry. Also it
already passes too many argument, so passing sample directly will reduce
the overhead of the function call.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1450804030-29193-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Allowing to override record aggr_mode. It's possible to use perf stat
like:
$ perf stat report -A
$ perf stat report --per-core
$ perf stat report --per-socket
To customize the recorded aggregate mode regardless what was used during
the stat record command.
Reported-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1446734469-11352-19-git-send-email-jolsa@kernel.org
[ Renamed 'stat' parameter to 'st' to fix 'already defined' build error with older distros (e.g. RHEL6.7) ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding processing of event update events, so perf stat report can store
additional info for events - unit,scale,name.
Committer note:
Before:
# perf stat record -e power/energy-cores/ -a
^C
Performance counter stats for 'system wide':
77.41 Joules power/energy-cores/
1.597176695 seconds time elapsed
# perf stat report
Performance counter stats for '/home/acme/bin/perf stat record -e power/energy-cores/ -a':
332,488,114,176 power/energy-cores/
1.597176695 seconds time elapsed
#
After, using the same perf.data file generated in the "Before" case
above:
# perf stat report
Performance counter stats for '/home/acme/bin/perf stat record -e power/energy-cores/ -a':
77.41 Joules power/energy-cores/
1.597176695 seconds time elapsed
#
Reported-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1446734469-11352-17-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding processing of stat and stat round events.
The stat data com in stat events, using generic function
process_stat_round_event to store data under perf_evsel object.
The stat-round events comes each interval or as last event in non
interval mode. The function process_stat_round_event process stored data
for each perf_evsel object and print it out.
Committer note:
After this patch:
$ perf stat record usleep 1
Performance counter stats for 'usleep 1':
0.498381 task-clock (msec) # 0.571 CPUs utilized
2 context-switches # 0.004 M/sec
0 cpu-migrations # 0.000 K/sec
149 page-faults # 0.299 M/sec
1,271,635 cycles # 2.552 GHz
928,712 stalled-cycles-frontend # 73.03% frontend cycles idle
663,286 stalled-cycles-backend # 52.16% backend cycles idle
792,614 instructions # 0.62 insns per cycle
# 1.17 stalled cycles per insn
136,850 branches # 274.589 M/sec
<not counted> branch-misses (0.00%)
0.000873419 seconds time elapsed
$
$ perf stat report
Performance counter stats for '/home/acme/bin/perf stat record usleep 1':
0.498381 task-clock (msec) # 0.571 CPUs utilized
2 context-switches # 0.004 M/sec
0 cpu-migrations # 0.000 K/sec
149 page-faults # 0.299 M/sec
1,271,635 cycles # 2.552 GHz
928,712 stalled-cycles-frontend # 73.03% frontend cycles idle
663,286 stalled-cycles-backend # 52.16% backend cycles idle
792,614 instructions # 0.62 insns per cycle
# 1.17 stalled cycles per insn
136,850 branches # 274.589 M/sec
<not counted> branch-misses (0.00%)
0.000873419 seconds time elapsed
$
Reported-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1446734469-11352-16-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
So we have csv_sep properly initialized before report command leg.
Reported-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1446734469-11352-18-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Using perf.data's perf_env data to initialize aggregate config.
Reported-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1446734469-11352-15-git-send-email-jolsa@kernel.org
[ s/stat/st/g, s/socket/socket_id/g to fix 'already defined' build error with older distros (e.g. RHEL6.7) ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding processing of stat config event and initialize stat_config
object.
Reported-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1446734469-11352-14-git-send-email-jolsa@kernel.org
[ Renamed 'stat' parameter to 'st' to fix 'already defined' build error with older distros (e.g. RHEL6.7) ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding processing of cpu/threads maps. Configuring session's evlist with
these maps.
Reported-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1446734469-11352-13-git-send-email-jolsa@kernel.org
[ s/stat/st/g, s/time/tm/g parameters to fix 'already defined' build error with older distros (e.g. RHEL6.7) ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding 'perf stat report' command support. ATM it only processes attr
events and display nothing.
Reported-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1446734469-11352-12-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Synthesize other events stuff not carried within attr event - unit,
scale, name.
Reported-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1446734469-11352-11-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We currently don't support storing multiple session in perf.data,
so we can't allow -r option in stat record.
$ perf stat -e cycles -r 2 record ls
Cannot use -r option with perf stat record.
Committer note:
Before this patch we would a perf.data file such as:
$ perf stat -e cycles -r 2 record ls
<SNIP>
Performance counter stats for 'ls' (2 runs):
3,935,236 cycles
0.002353261 seconds time elapsed ( +- 4.76% )
$ perf report -D | grep PERF_RECORD | grep ROUND
0xf0 [0]: failed to process type: 16
Error:
failed to process sample
$
Reported-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1446734469-11352-10-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Writing stat round events on 'perf stat record' for each interval round.
In non interval mode we store round event after the last stat event.
Committer note:
After the patch:
$ perf report -D | grep PERF_RECORD | grep ROUND
0x852 [0x18]: PERF_RECORD_STAT_ROUND
$
Reported-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1446734469-11352-9-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Allowing storing stat record data into pipe, so report tools
(report/script) could read data directly from record.
Committer note:
Before this patch:
$ perf stat record -o - usleep 1 | perf report -i -
incompatible file format (rerun with -v to learn more)
$ perf stat record -o - usleep 1 | perf script -i -
incompatible file format (rerun with -v to learn more)
$ ls -la perf.data
ls: cannot access perf.data: No such file or directory
$
After:
$ perf stat record -o - usleep 1 | perf report -i -
# To display the perf.data header info, please use
# --header/--header-only options.
#
Error:
The - file has no samples!
$ perf stat record -o - usleep 1 | perf script -i -
Display of symbols requested but neither sample IP nor sample address
is selected. Hence, no addresses to convert to symbols.
0 [0x80]: failed to process type: 64
$ ls -la perf.data
ls: cannot access perf.data: No such file or directory
$
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1446734469-11352-7-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Store event IDs in evlist object so it get stored into perf.data file.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1446734469-11352-6-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Will be used to storing the event IDs in evlist object so it get stored
into perf.data file.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1446734469-11352-6-git-send-email-jolsa@kernel.org
[ Split from the patch storing the ids in the perf.data file ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Synthesizing needed stat record data for report/script:
- cpu/thread maps
- stat config
Committer note:
New records generated on a perf.data file with this patch:
$ perf report -D | grep PERF_RECORD_
0x568 [0x28]: PERF_RECORD_THREAD_MAP nr: 1 thread: 29097
0x590 [0x12]: PERF_RECORD_CPU_MAP nr: 1 cpu: 65535
0x5a2 [0x40]: PERF_RECORD_STAT_CONFIG
$
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1446734469-11352-5-git-send-email-jolsa@kernel.org
[ Adjusted wrt kernel PERF_RECORD_MMAP added when introducing 'perf stat record' ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Disabling all non stat related features.
Also as we now enable STAT feature in the data file, adding code to
instruct session open to skip sample type checking for stat data files.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1446734469-11352-4-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add 'perf stat record' command support. It creates simple (header only)
perf.data file ATM.
The record command could be specified anywhere among stat options. All
stat command options are valid for stat record command with '-o' option
exception. If specified for record command it denotes the perf data file
name.
Committer note:
Set sample_type to PERF_SAMPLE_IDENTIFIER, which should be harmless
while avoiding that older tools show confusing messages, for instance,
with sample_type = 0, we get:
$ perf stat record usleep 1
Performance counter stats for 'usleep 1':
0.630237 task-clock (msec) # 0.528 CPUs utilized
1 context-switches # 0.002 M/sec
0 cpu-migrations # 0.000 K/sec
52 page-faults # 0.083 M/sec
978,312 cycles # 1.552 GHz
671,931 stalled-cycles-frontend # 68.68% frontend cycles idle
<not supported> stalled-cycles-backend
646,379 instructions # 0.66 insns per cycle
# 1.04 stalled cycles per insn
131,046 branches # 207.931 M/sec
7,073 branch-misses # 5.40% of all branches
0.001193240 seconds time elapsed
$ oldperf evlist
WARNING: The perf.data file's data size field is 0 which is unexpected.
Was the 'perf record' command properly terminated?
non matching sample_type
$
While with sample_type set to PERF_SAMPLE_IDENTIFIER, after we re-run 'perf
stat record usleep' we get:
$ oldperf evlist
WARNING: The perf.data file's data size field is 0 which is unexpected.
Was the 'perf record' command properly terminated?
task-clock
context-switches
cpu-migrations
page-faults
cycles
stalled-cycles-frontend
stalled-cycles-backend
instructions
branches
branch-misses
$
Which at least shows the names of the events in the perf.data file.
Additionally, such files, when passed to 'perf report' will produce:
$ oldperf report --stdio
WARNING: The perf.data file's data size field is 0 which is unexpected.
Was the 'perf record' command properly terminated?
Warning:
Kernel address maps (/proc/{kallsyms,modules}) were restricted.
Check /proc/sys/kernel/kptr_restrict before running 'perf record'.
As no suitable kallsyms nor vmlinux was found, kernel samples
can't be resolved.
Samples in kernel modules can't be resolved as well.
Error:
The perf.data file has no samples!
# To display the perf.data header info, please use --header/--header-only options.
#
$
Which is confusing and can be solved by just adding the kernel mmap record,
which will also remove that warning about the data size field being equal to
zero, after generating the mmap record:
$ perf stat record usleep 1
Performance counter stats for 'usleep 1':
0.600796 task-clock (msec) # 0.478 CPUs utilized
1 context-switches # 0.002 M/sec
0 cpu-migrations # 0.000 K/sec
54 page-faults # 0.090 M/sec
886,844 cycles # 1.476 GHz
582,169 stalled-cycles-frontend # 65.65% frontend cycles idle
<not supported> stalled-cycles-backend
638,344 instructions # 0.72 insns per cycle
# 0.91 stalled cycles per insn
130,204 branches # 216.719 M/sec
7,500 branch-misses # 5.76% of all branches
0.001255897 seconds time elapsed
$ oldperf evlist
task-clock
context-switches
cpu-migrations
page-faults
cycles
stalled-cycles-frontend
stalled-cycles-backend
instructions
branches
branch-misses
$ oldperf report --stdio
Error:
The perf.data file has no samples!
# To display the perf.data header info, please use --header/--header-only options.
#
[acme@zoo linux]$
No warnings, sensible output about what are the events in the perf.data file and also
a "file has no samples" message, which indeed it doesn't.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: htp://lkml.kernel.org/r/1446734469-11352-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Introducing the 'stat' feature to mark a perf.data as created by the
'perf stat record' command. It contains no data.
It's needed so that the report tools (report/script) can differentiate
sampling data from counting data, because they need to be treated in a
different way.
In the future it might be used to store the version of the stat storage
system used.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1445784728-21732-28-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The 'perf report -D' command will now display detailed output for these
newly added events:
event_update
thread_map
cpu_map
stat
stat_config
stat_round
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1445784728-21732-27-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To display a 'event update' event for raw dump.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1445784728-21732-26-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding the cpumask 'event update' event, that stores/transfer the
cpumask for a event.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1445784728-21732-25-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding name type 'event update' event, that stores/transfer events name.
Event's name is stored within perf.data's EVENT_DESC feature, but we
don't have it if we get the report data from pipe.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1445784728-21732-24-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
A__allocdding scale type 'event update' event, that stores/transfer
events scale value. The PMU events can define the scale
value which is used to multiply events data.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1445784728-21732-23-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding unit type 'event update' event, that stores/transfer events unit
name. The unit name is part of the perf stat output data.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1445784728-21732-22-git-send-email-jolsa@kernel.org
[ Rename __alloc() to __new() for consistency ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It'll serve as a base event for additional event attributes details,
that are not part of the attr event.
At the moment this event is just a dummy one without any specific
functionality. The type value will distinguish the update event details.
It'll come in the following patches.
The idea for this event is to be extensible for any update that the
event might need in the future.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1445784728-21732-21-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Introducing the following functions to display the stat events for raw
dump.
perf_event__fprintf_stat
perf_event__fprintf_stat_round
perf_event__fprintf_stat_config
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1445784728-21732-20-git-send-email-jolsa@kernel.org
[ s/stat/st/g and s/round/rd/g parameters to fix 'already defined' build error with older distros (e.g. RHEL6.7) ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Introduce the perf_event__synthesize_stat_round function to
synthesize a 'struct stat_round_event'.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1445784728-21732-19-git-send-email-jolsa@kernel.org
[ Renamed 'time' parameter to 'evtime' to fix build on older systems ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding the stat round event to be stored after each stat interval round,
so that report tools (report/script) gets notified and process interval
data.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1445784728-21732-18-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Introducing the perf_event__process_stat_event function to process a
'struct perf_stat' data from a stat event.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1445784728-21732-17-git-send-email-jolsa@kernel.org
[ Renamed 'stat' parameter to 'st' to fix 'already defined' build error with older distros (e.g. RHEL6.7) ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Introduce the perf_event__synthesize_stat function to synthesize a
'struct stat_event'.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1445784728-21732-16-git-send-email-jolsa@kernel.org
[ Renamed 'stat' parameter to 'st' to fix 'already defined' build error with older distros (e.g. RHEL6.7) ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding a stat event to store a 'struct perf_counter_values' for a given
event/cpu/thread.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1445784728-21732-15-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Introducing the perf_event__read_stat_config function to read a struct
perf_stat_config object data from a stat config event.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1445784728-21732-14-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Introduce the perf_event__synthesize_stat_config to synthesize a 'struct
perf_stat_config'.
Storing the stat config in the form of tag-value pairs will, I believe,
sort out future version extensibility issues.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1445784728-21732-13-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding the stat config event to pass/store stat config data, so report
tools (report/script) know how to interpret stat data.
The config data is stored in a 'tag|value' way to allow for easy
extension and backwards compatibility.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1445784728-21732-12-git-send-email-jolsa@kernel.org
[ stat_config_term_event -> stat_config_event_entry ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To display a cpu_map event for raw dump.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1445784728-21732-11-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Introducing the cpu_map__new_event function to create a struct cpu_map
object from a cpu_map event.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1445784728-21732-10-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Introduce the perf_event__synthesize_cpu_map function to synthesize a
struct cpu_map.
Added generic interface:
cpu_map_data__alloc
cpu_map_data__synthesize
to make the cpu_map synthesizing usable for other events.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1445784728-21732-9-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding the cpu_map event to pass/store cpu maps as data in
a pipe/perf.data.
We store maps in 2 formats:
- list of cpus
- mask of cpus
The format that takes less space is selected transparently in the
following patch.
The interface is made generic, so we could add the cpumap event data
into another event in the following patches.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1445784728-21732-8-git-send-email-jolsa@kernel.org
[ cpu_map_data_cpus -> cpu_map_entries, cpu_map_data_mask -> cpu_map_mask ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To display a thread_map event for a raw dump.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1445784728-21732-7-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Introducing the thread_map__new_event function to create a struct
thread_map object from a thread_map event.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1445784728-21732-6-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Introduce the perf_event__synthesize_thread_map2 function to synthesize
struct thread_map.
The perf_event__synthesize_thread_map name is already taken for
synthesizing the complete threads data (comm/mmap/fork).
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1445784728-21732-5-git-send-email-jolsa@kernel.org
[ Rename thread_map_data_event to thread_map_event_entry ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding the thread_map event to pass/store thread maps as data in
the pipe/perf.data.
Storing the thread ID along with the standard comm[16] thread name string.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Kan Liang <kan.liang@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1445784728-21732-4-git-send-email-jolsa@kernel.org
[ Renamed thread_map_data_event to thread_map_event_entry ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move the subcommand-related files from perf to a new library named
libsubcmd.a.
Since we're moving files anyway, go ahead and rename 'exec_cmd.*' to
'exec-cmd.*' to be consistent with the naming of all the other files.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/c0a838d4c878ab17fee50998811612b2281355c1.1450193761.git.jpoimboe@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For the files that will be moved to the subcmd library, remove all their
perf-specific includes and duplicate any needed functionality.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/6e12946f0f26ce4d543d34db68d9dae3c8551cb9.1450193761.git.jpoimboe@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In preparation for moving exec_cmd.c and run-command.c out of perf and
into a library, remove 'perf' from all the symbol names.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/bc3ee82b40b8f396b644fa49e0f7260ce442635b.1450193761.git.jpoimboe@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Introduce and use new astrcat() and astrcatf() functions which replace
the strbuf functionality for subcmd.
For now they duplicate strbuf's die-on-allocation-error policy.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/957d207e1254406fa11fc2e405e75a7e405aad8f.1450193761.git.jpoimboe@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Create init functions for exec_cmd.c and pager.c. This allows their
configuration to be specified at runtime so they can be split out into a
separate library which can be used by other programs. Their
configuration is stored in a shared subcmd_config struct.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/21f5f6b38da72c985a8dcfa185700d03e7eecd1d.1450193761.git.jpoimboe@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Generally, calling exit() from a library is bad practice. Eventually
these functions might be redesigned so that they don't exit. For now,
just document the fact that they do.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/97b1af06cc3b18dd0f49e655d6d659eaa64ecde5.1450193761.git.jpoimboe@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
strlcpy() will be needed by the subcmd library. Move it to the shared
tools/lib/string.c file which can be used by other tools.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/71e2804b973bf39ad3d3b9be10f99f2ea630be46.1450193761.git.jpoimboe@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Post processing at 'perf record' takes a long time on big machines.
What it does is to find the build-id of binaries found in the event
stream, so that it can make sure, at 'report' time, that the symtabs (be
it ELF, kallsyms, etc) being used to resolve symbols are the ones
matching the binaries found at 'record' time.
Sometimes we just want to skip this processing of events at the end of
the session to get quicker results, making sure the binaries haven't
changed from 'record' to 'report' time.
Add a new config option to control this behavior.
The record.build-id config variable can have one of the following
values:
- cache: post-process data and save/update the binaries into the
build-id cache (in ~/.debug). This is the default.
- no-cache: post-process the data but not update the build-id cache.
Same effect as using the -N option.
- skip: skip post-processing and do not update the cache.
Same effect as using the -B option.
Reported-and-Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Taeung Song <treeze.taeung@gmail.com>
Link: http://lkml.kernel.org/r/1450144196-22957-1-git-send-email-namhyung@kernel.org
[ Added some more text to the documentation ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Make perf-record command support --vmlinux option if BPF_PROLOGUE is on.
'perf record' needs vmlinux as the source of DWARF info to generate
prologue for BPF programs, so path of vmlinux should be specified.
Short name 'k' has been taken by 'clockid'. This patch skips the short
option name and uses '--vmlinux' for vmlinux path.
Documentation is also updated.
Test result:
In a production (or broken) environment:
(by:
# rm -rf ~/.debug/
# mv /lib/modules/`uname -r`/build/vmlinux /tmp/
)
# ./perf record -e ./test_bpf_base.c ls
Failed to find the path for kernel: No such file or directory
event syntax error: './test_bpf_base.c'
\___ You need to check probing points in BPF file
...
# ./perf record --vmlinux /tmp/vmlinux -e ./test_bpf_base.c ls
...
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.011 MB perf.data ]
Help messages when build with NO_LIBBPF:
# ./perf record -h
--transaction sample transaction flags (special events only)
--vmlinux <file> vmlinux pathname
(not built-in because NO_LIBBPF=1)
# ./perf record --vmlinux /tmp/vmlinux ls /
Warning: option `vmlinux' is being ignored because NO_LIBBPF=1
...
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.011 MB perf.data (11 samples) ]
Help messages when build with NO_DWARF:
# ./perf record -h
--transaction sample transaction flags (special events only)
--vmlinux <file> vmlinux pathname
(not built-in because NO_DWARF=1)
Signed-off-by: He Kuang <hekuang@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1450089563-122430-15-git-send-email-wangnan0@huawei.com
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch keeps options of perf builtins same in all conditions. If
one option is disabled because of compiling options, users should be
notified.
Masami suggested another implementation in [1] that, by adding a
OPTION_NEXT_DEPENDS option before those options in the 'struct option'
array, options parser knows an option is disabled. However, in some
cases this array is reordered (options__order()). In addition, in
parse-option.c that array is const, so we can't simply merge
information in decorator option into the affacted option.
This patch chooses a simpler implementation that, introducing a
set_option_nobuild() function and two option parsing flags. Builtins
with such options should call set_option_nobuild() before option
parsing. The complexity of this patch is because we want some of options
can be skipped safely. In this case their arguments should also be
consumed.
Options in 'perf record' and 'perf probe' are fixed in this patch.
[1] http://lkml.kernel.org/g/50399556C9727B4D88A595C8584AAB3752627CD4@GSjpTKYDCembx32.service.hitachi.net
Test result:
Normal case:
# ./perf probe --vmlinux /tmp/vmlinux sys_write
Added new event:
probe:sys_write (on sys_write)
You can now use it in all perf tools, such as:
perf record -e probe:sys_write -aR sleep 1
Build with NO_DWARF=1:
# ./perf probe -L sys_write
Error: switch `L' is not available because NO_DWARF=1
Usage: perf probe [<options>] 'PROBEDEF' ['PROBEDEF' ...]
or: perf probe [<options>] --add 'PROBEDEF' [--add 'PROBEDEF' ...]
or: perf probe [<options>] --del '[GROUP:]EVENT' ...
or: perf probe --list [GROUP:]EVENT ...
or: perf probe [<options>] --funcs
-L, --line <FUNC[:RLN[+NUM|-RLN2]]|SRC:ALN[+NUM|-ALN2]>
Show source code lines.
(not built-in because NO_DWARF=1)
# ./perf probe -k /tmp/vmlinux sys_write
Warning: switch `k' is being ignored because NO_DWARF=1
Added new event:
probe:sys_write (on sys_write)
You can now use it in all perf tools, such as:
perf record -e probe:sys_write -aR sleep 1
# ./perf probe --vmlinux /tmp/vmlinux sys_write
Warning: option `vmlinux' is being ignored because NO_DWARF=1
Added new event:
[SNIP]
# ./perf probe -l
Usage: perf probe [<options>] 'PROBEDEF' ['PROBEDEF' ...]
or: perf probe [<options>] --add 'PROBEDEF' [--add 'PROBEDEF' ...]
...
-k, --vmlinux <file> vmlinux pathname
(not built-in because NO_DWARF=1)
-L, --line <FUNC[:RLN[+NUM|-RLN2]]|SRC:ALN[+NUM|-ALN2]>
Show source code lines.
(not built-in because NO_DWARF=1)
...
-V, --vars <FUNC[@SRC][+OFF|%return|:RL|;PT]|SRC:AL|SRC;PT>
Show accessible variables on PROBEDEF
(not built-in because NO_DWARF=1)
--externs Show external variables too (with --vars only)
(not built-in because NO_DWARF=1)
--no-inlines Don't search inlined functions
(not built-in because NO_DWARF=1)
--range Show variables location range in scope (with --vars only)
(not built-in because NO_DWARF=1)
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1450089563-122430-14-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
help_unknown_cmd() is quite perf-specific because it relies on some
perf_config*() functions. Move it and its supporting functions out into
a separate file so that help.c can be moved to a library.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/562d918bcaaf340c1ae3e47586b3f0ae33b9918b.1449965119.git.jpoimboe@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
PERF_PAGER_IN_USE doesn't seem to be used anywhere, so let's remove it.
This will also make it easier to move pager.c into a separate library.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/ed9e8370db9811746dc590544cf48c36dcfb1731.1449965119.git.jpoimboe@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move the 'pager' function prototypes into a new pager.h so that the
pager code can be moved out to a library.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/ba7c316474dd6bfc047e5c6dc4dcab39a982caf5.1449965119.git.jpoimboe@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
'LIB_PATH' is a misnomer because there are multiple library paths.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/c10df0b749a27f05cc531fe06b8dd71a329341fa.1449965119.git.jpoimboe@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add some missing files to the 'make clean' target.
Reported-and-Acked-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/8b1f5a5bd66a652be071d423e64aaa994254be31.1449965119.git.jpoimboe@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Because the Build file writes source code to the generated llvm-src-*.c
files, it should be listed as one of the dependencies, so that any
future changes to the code being echoed won't require a 'make clean'.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/9b9886c295750dc83cbbb29a665d280f9c5e8b3e.1449965119.git.jpoimboe@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently if kptr_restrict is enabled, all hist tests failed with
segfaults. This is because machine__create_kernel_maps() in
setup_fake_machine() failed in that situation, and it called
machine__delete() on the error path. But outer callers again called
machines__exit() causing double free for the host machine.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1450062673-22312-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
[The kernel patch needed for this is in tip now (b16a5b52eb perf/x86:
Add option to disable ...) So this user tools patch to make use of it
should be merged now]
Automatically disable collecting branch flags and cycles with
--call-graph lbr. This allows avoiding a bunch of extra MSR
reads in the PMI on Skylake.
When the kernel doesn't support the new flags they are automatically
cleared in the fallback code.
v2: Switch to use branch_sample_type instead of sample_type.
Adjust description.
Fix the fallback logic.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1449879144-29074-1-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We should always return from thread__new(), the constructor, with the
object with a reference count of one, so that:
struct thread *thread = thread__new();
thread__put(thread);
Will call thread__delete().
If any reference is made to that 'thread' variable, it better use
thread__get(thread) to hold a reference.
We were returning with thread->refcnt set to zero, fix it and some cases
where thread__delete() was being called, which were not a problem
because just one reference was being used, now that we set it to 1, use
thread__put() instead.
Reported-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-4b9mkuk66to4ecckpmpvqx6s@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
I.e. don't exit with the signal number, instead set the signal handler
to the default one and then raise it again.
Noticed while trying to dump the stack at segfaults in the 'perf test'
forked process used to run each test, that inspects signal info at
each test.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-5x5r176wnoqxi5p6id05wv9w@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The new name is irq_poll as iopoll is already taken. Better suggestions
welcome.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
There are so many test cases use stack allocated 'struct machine'.
Including:
test__hists_link
test__hists_filter
test__mmap_thread_lookup
test__thread_mg_share
test__hists_output
test__hists_cumulate
Also, in non-test code (for example, machine__new_host()) there are
code use 'malloc()' to alloc struct machine.
These are dangerous operations, cause some tests fail or hung in
machines__exit(). For example, in
machines__exit ->
machine__destroy_kernel_maps ->
map_groups__remove ->
maps__remove ->
pthread_rwlock_wrlock
a incorrectly initialized lock causes unintended behavior.
This patch memset(0) that structure in machine__init() to ensure all
fields in 'struct machine' are initialized to zero.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1449541544-67621-17-git-send-email-wangnan0@huawei.com
[ Use memset, see 'man bzero' ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add hexadecimal u32 to base data type, which is useful for raw output
because raw data is u32 aligned.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1449541544-67621-12-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The 'he' cannot be NULL since it's caller hist_iter__top_callback() is
called only if iter->he is not NULL (see hist_entry_iter__add). So
setting 'sym' before the condition to simplify the code.
Also make it clearer that the top->symbol_filter_entry check is only
meaningful on stdio mode (i.e. when use_browser is 0).
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1449802616-16170-4-git-send-email-namhyung@kernel.org
[ Complete the simplification replacing one more he->ms.sym with sym ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The ui__has_annotation() inside perf_top__record_precise_ip() should be
removed since it returns true only for TUI (and when sort key has
symbol). However the 'perf top --stdio' also supports annotation for a
symbol which was specified by 's' key action.
Actually it already does the necessary checks before calling the
function. So it's ok to get rid of the check here.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1449802616-16170-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The perf_top__record_precise_ip() releases and regrabs the
he->hists->lock because it can sleep if there's an error. But it should
be done conditionally as it slows down the fast path.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1449802616-16170-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We call map->unmap_ip() before the function and call map->map_ip()
inside the function. This is meaningless and look strange since only
one of the two checks 'map'. Let's use al->addr directly.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1449802616-16170-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix dso__load_sym to put dso because dsos__add already got it.
Refcnt debugger explain the problem:
----
==== [0] ====
Unreclaimed dso: 0x19dd200
Refcount +1 => 1 at
./perf(dso__new+0x1ff) [0x4a62df]
./perf(dso__load_sym+0xe89) [0x503509]
./perf(dso__load_vmlinux+0xbf) [0x4aa77f]
./perf(dso__load_vmlinux_path+0x8c) [0x4aa8dc]
./perf() [0x50539a]
./perf(convert_perf_probe_events+0xd79) [0x50ad39]
./perf() [0x45600f]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f74dd0efaf5]
./perf() [0x4220a9]
Refcount +1 => 2 at
./perf(dso__get+0x34) [0x4a65f4]
./perf(map__new2+0x76) [0x4be216]
./perf(dso__load_sym+0xee1) [0x503561]
./perf(dso__load_vmlinux+0xbf) [0x4aa77f]
./perf(dso__load_vmlinux_path+0x8c) [0x4aa8dc]
./perf() [0x50539a]
./perf(convert_perf_probe_events+0xd79) [0x50ad39]
./perf() [0x45600f]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f74dd0efaf5]
./perf() [0x4220a9]
Refcount +1 => 3 at
./perf(dsos__add+0xf3) [0x4a6bc3]
./perf(dso__load_sym+0xfc1) [0x503641]
./perf(dso__load_vmlinux+0xbf) [0x4aa77f]
./perf(dso__load_vmlinux_path+0x8c) [0x4aa8dc]
./perf() [0x50539a]
./perf(convert_perf_probe_events+0xd79) [0x50ad39]
./perf() [0x45600f]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f74dd0efaf5]
./perf() [0x4220a9]
Refcount -1 => 2 at
./perf(dso__put+0x2f) [0x4a664f]
./perf(map_groups__exit+0xb9) [0x4bee29]
./perf(machine__delete+0xb0) [0x4b93d0]
./perf(exit_probe_symbol_maps+0x28) [0x506718]
./perf() [0x45628a]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f74dd0efaf5]
./perf() [0x4220a9]
Refcount -1 => 1 at
./perf(dso__put+0x2f) [0x4a664f]
./perf(machine__delete+0xfe) [0x4b941e]
./perf(exit_probe_symbol_maps+0x28) [0x506718]
./perf() [0x45628a]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f74dd0efaf5]
./perf() [0x4220a9]
----
So, in the dso__load_sym, dso is gotten 3 times, by dso__new,
map__new2, and dsos__add. The last 2 is actually released by
map_groups and machine__delete correspondingly. However, the
first reference by dso__new, is never released.
Committer note:
Changed the place where the reference count is dropped to:
Fix it by dropping it right after creating curr_map, since we know that
either that operation failed and we need to drop the dso refcount or
that it succeed and we have it referenced via curr_map->dso.
Then only drop the curr_map refcount after we call dsos__add() to make
sure we hold a reference to it via curr_map->dso.
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20151209021118.10245.49869.stgit@localhost.localdomain
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Note that since the thread was already inserted to the session
list, it will be released when the session is released.
Also, in perf_session__register_idle_thread() failure path,
the thread should be put before returning.
Refcnt debugger shows that the perf_session__register_idle_thread
gets the returned thread, but the caller (__cmd_top) does not
put the returned idle thread.
----
==== [0] ====
Unreclaimed thread@0x24e6240
Refcount +1 => 0 at
./perf(thread__new+0xe5) [0x4c8a75]
./perf(machine__findnew_thread+0x9a) [0x4bbdba]
./perf(perf_session__register_idle_thread+0x28) [0x4c63c8]
./perf(cmd_top+0xd7d) [0x43cf6d]
./perf() [0x47ba35]
./perf(main+0x617) [0x4225b7]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f06027c5af5]
./perf() [0x42272d]
Refcount +1 => 1 at
./perf(thread__get+0x2c) [0x4c8bcc]
./perf(machine__findnew_thread+0xee) [0x4bbe0e]
./perf(perf_session__register_idle_thread+0x28) [0x4c63c8]
./perf(cmd_top+0xd7d) [0x43cf6d]
./perf() [0x47ba35]
./perf(main+0x617) [0x4225b7]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f06027c5af5]
./perf() [0x42272d]
Refcount +1 => 2 at
./perf(thread__get+0x2c) [0x4c8bcc]
./perf(machine__findnew_thread+0x112) [0x4bbe32]
./perf(perf_session__register_idle_thread+0x28) [0x4c63c8]
./perf(cmd_top+0xd7d) [0x43cf6d]
./perf() [0x47ba35]
./perf(main+0x617) [0x4225b7]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f06027c5af5]
./perf() [0x42272d]
----
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20151209021122.10245.69707.stgit@localhost.localdomain
[ Drop the refcount in perf_session__register_idle_thread() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
After sample processing is done, hist entries are in both of
hists->entries and hists->entries_in (or hists->entries_collapsed). So
I guess perf report does not have leaks on hists.
But for perf top, it's possible to have half-processed entries which are
only in hists->entries_in. Eventually they will go to the
hists->entries and get freed but they cannot be deleted by current
hists__delete_entries(). This patch adds hists__delete_all_entries
function to delete those entries.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-and-Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1449734015-9148-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Since all of its users call before setup_browser(), there's no need to
call exit_browser() inside of the function.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1449716459-23004-8-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This is necessary to get rid of the browser dependency from
usage_with_options() and its friends. Because we validate the targets
which are used to create the cpu/thread maps and inform the user about
any override performed via the chosen UI, we don't need to call the
usage routine for that.
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-slu7lj7buzpwgop1vo9la8ma@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This is necessary to get rid of the browser dependency from
usage_with_options() and its friends. Because there's no code
changing the argc and argv, it'd be ok to check it early.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1449716459-23004-5-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Calling setup_browser(false) with use_browser = 0 is meaningless.
Just get rid of it. This is necessary to remove the browser
dependency from usage_with_options() and friends.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1449716459-23004-4-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move setup_browser after all necessary initialization is done. This is
to remove the browser dependency from usage_with_options and friends.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1449716459-23004-3-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This is necessary to get rid of the browser dependency from
usage_with_options() and its friends. Because there's no code changing
the argc and argv, it'd be ok to check it early.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1449716459-23004-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move cmd_version() to its own file so that help.c can be moved to a
library.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/e908b1b68f20ab6d8d33941d5571c23110622e60.1449548395.git.jpoimboe@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf_env__set_cmdline() only saves the arguments the first time it's
called. It doesn't need to be called every time the options and
suboptions are parsed. Instead it can just be called once.
This also has the advantage of making the option parsing code less
perf-specific so it can be moved out to a library.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/19b76a5aa1b688bd635bd65d80bbc103a978d75e.1449548395.git.jpoimboe@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The term functions are needed by help.c which is going to be moved into
a separate library. Move them out of util.c and into their own file.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/9a39c854dd156b55ebda57e427594c9a59dcb40f.1449548395.git.jpoimboe@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix write_numa_topology to put cpu_map instead of free because cpu_map
is managed based on refcnt.
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20151209021135.10245.79046.stgit@localhost.localdomain
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix machine.vmlinux_maps to make sure to clear the old one if it is
renewal. This can leak the previous maps on the vmlinux_maps because
those are just overwritten.
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20151209021133.10245.93730.stgit@localhost.localdomain
[ Simplified the memset, same end result ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Since hists__init doesn't set the destructor of hists_evsel (which is an
extended evsel structure), when hists_evsel is released, the extended
part of the hists_evsel is not deleted (note that the hists_evsel object
itself is freed).
This fixes it to add a destructor for hists__evsel and to set it up.
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20151209021129.10245.28710.stgit@localhost.localdomain
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix cmd_stat() to release cpu_map objects (aggr_map and
cpus_aggr_map) afterwards.
refcnt debugger shows that the cmd_stat initializes cpu_map
but not puts it.
----
# ./perf stat -v ls
....
REFCNT: BUG: Unreclaimed objects found.
==== [0] ====
Unreclaimed cpu_map@0x29339c0
Refcount +1 => 1 at
./perf(cpu_map__empty_new+0x6d) [0x4e64bd]
./perf(cmd_stat+0x5fe) [0x43594e]
./perf() [0x47b785]
./perf(main+0x617) [0x422587]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f2dff420af5]
./perf() [0x4226fd]
REFCNT: Total 1 objects are not reclaimed.
"cpu_map" leaks 1 objects
----
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20151209021127.10245.93697.stgit@localhost.localdomain
[ Remove NULL checks before calling the put operation, it checks it already ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Boris reported that 'perf top' is unusable on his default 'black on
white' terminal, which uses (eye friendly) light-grey as a background
color.
The reason is that the TUI cursor for the current selection line uses
HE_COLORSET_SELECTED, and that has a default background color of
'lightgrey' - which is a common terminal background choice and thus
the colors conflict.
Use yellow as the background color instead: that should be an uncommon
terminal background, yet it's still ergonomic on both black and
white/grey terminals.
[ It would be a better solution to straight out detect color
collisions and resolve them reasonably by converting them to RGB and
calculating color space distances, but I was unable to find
proper documentation for SLtt_get_color_object() to recover the
current color scheme so I gave up ... Yellow works well enough. ]
Reported-and-Tested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Binderman <dcb314@hotmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20150305103213.GA23046@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>