When annotating multiple entries, for instance, when running simply as:
$ perf annotate
the right and left keys, as well as TAB can be used to cycle thru the
multiple symbols being annotated.
If one doesn't like TUI annotate, disable it by editing ~/.perfconfig
and adding:
[tui]
annotate = off
Just like it is possible for report.
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
One day we'll have support for the "dump raw trace in ASCII" in the TUI
frontend, but till then, use the tty code.
Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We need to have stdio.h included with _GNU_SOURCEfopr getline,
which is broken with the inclusion of build-id.h.
Keep util.h included first in hist.c
Fixes:
util/hist.c: Dans la fonction «hist_entry__parse_objdump_line» :
util/hist.c:938: attention : déclaration implicite de la fonction « «getline» »
util/hist.c:938: attention : nested extern declaration of «getline»
make: *** [util/hist.o] Erreur 1
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1274438919-5104-1-git-send-regression-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
It seems a waste of space to create a buffer per
event, share it per-cpu.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <20100521090710.634824884@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Since it is not allowed to create cross-cpu (or
cross-task) buffers, this option is no longer valid.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <20100521090710.582740993@chello.nl>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Using the same scheme as for git's/perf's pager setup, i.e. if one
doesn't want to, on a newt enabled perf binary, to disable the TUI for
'perf report', its just a matter of doing:
[root@doppio linux-2.6-tip]# printf "[tui]\n\nreport = off\n" >
/root/.perfconfig
[root@doppio linux-2.6-tip]# cat /root/.perfconfig
[tui]
report = off
[root@doppio linux-2.6-tip]#
System wide settings are also possible, by editing /etc/perfconfig, etc,
i.e. the git machinery for config files applies to perf as well, so when
in doubt where to put your settings, consult the git documentation, if
it fails, please let us know.
Suggested-by: Ingo Molnar <mingo@elte.hu>
Discussed-with: Stephane Eranian <eranian@google.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Perf record repeatedly calls gettimeofday() which adds noise to the performance
measurements. Since gettimeofday() is only used for the error printf, delete
it.
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <20100518225240.GC25589@sgi.com>
Signed-off-by: Russ Anderson <rja@sgi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The FunctionFS gadget may provide the source/sink interface
not as the first interface (with id == 0) but some different
interface hence a code to find the interface number is
required.
(Note that you will still configure the gadget to report
idProduct == 0xa4a4 (an "echo 0xa4a4
>/sys/module/g_ffs/parameters/usb_product" should suffice) or
configure host to handle 0x0525:0xa4ac devices using the
usbtest driver.)
Signed-off-by: Michal Nazarewicz <m.nazarewicz@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
The testusb program just issues ioctls to perform the tests
implemented by the kernel driver. It can generate a variety
of transfer patterns; you should make sure to test both regular
streaming and mixes of transfer sizes (including short transfers).
For more information on how this can be used and on USB testing
refer to <URL:http://www.linux-usb.org/usbtest/>.
Signed-off-by: Michal Nazarewicz <m.nazarewicz@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
This adds an example user-space FunctionFS driver which
implements a source/sink interface used for testing.
Signed-off-by: Michal Nazarewicz <m.nazarewicz@samsung.com>
Cc: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
We were still using the pathname found on the MMAP event, that could not
be the one we used when recording, so use the build-id cache for that,
only falling back to use the pathname in the MMAP event if no build-ids
are available.
With this we now also are able to do secure, seamless offline annotation.
Example:
[root@doppio linux-2.6-tip]# perf report -g none -v 2> /dev/null | head -10
8.12% Xorg /usr/lib64/libpixman-1.so.0.14.0 0x0000000000026d02 B [.] pixman_rasterize_edges
4.68% firefox /usr/lib64/xulrunner-1.9.1/libxul.so 0x00000000005dbdba B [.] 0x000000005dbdba
3.70% swapper /lib/modules/2.6.34-rc6/build/vmlinux 0xffffffff81022cea ! [k] read_hpet
2.96% init /lib/modules/2.6.34-rc6/build/vmlinux 0xffffffff81022cea ! [k] read_hpet
2.73% swapper /lib/modules/2.6.34-rc6/build/vmlinux 0xffffffff8100a738 ! [k] mwait_idle_with_hints
[root@doppio linux-2.6-tip]# perf annotate -v pixman_rasterize_edges 2>&1 | grep Executing
Executing: objdump --start-address=0x000000371ce26670 --stop-address=0x000000371ce2709f -dS /root/.debug/.build-id/bd/6ac5199137aaeb279f864717d8d061477466c1|grep -v /root/.debug/.build-id/bd/6ac5199137aaeb279f864717d8d061477466c1|expand
[root@doppio linux-2.6-tip]# perf buildid-list | grep libpixman-1.so.0.14.0
bd6ac5199137aaeb279f864717d8d061477466c1 /usr/lib64/libpixman-1.so.0.14.0
[root@doppio linux-2.6-tip]#
Reported-by: Stephane Eranian <eranian@google.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Just like if one is using the stdio based pager, or more/less, for that
matter.
Suggested-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Accessing trace values of an 8 size may end up in a segfault
on archs that can't deal with misaligned access, which is the
case for sparc 64. This is because PERF_SAMPLE_RAW are aligned
to 4 and not to 8.
Fix this on the macros that get the values of 8 size.
This fixes segfaults on perf tools in sparc 64.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: David Miller <davem@davemloft.net>
This is a small fix for a problem affecting live-mode, introduced
recently:
root@tropicana:~# perf trace rwtop
perf trace started with Perl
script /root/libexec/perf-core/scripts/perl/rwtop.pl
Fatal: did not read header event
commit d00a47cce5 added a skip()
function to skip over e.g. header_page, but this doesn't work for
live mode. This patch re-implements skip() to use read() instead of
lseek() to fix that.
Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1273032130.6383.28.camel@tropicana>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
The changes made to support host and guest machines in a session, that
started when the 'perf kvm' tool was introduced ended up introducing a
bug where the host_machine was not having its DSOs traversed for
build-id processing.
Fix it by moving some methods to the right classes and considering the
host_machine when processing build-ids.
Reported-by: Tom Zanussi <tzanussi@gmail.com>
Reported-by: Stephane Eranian <eranian@google.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In __dsos__read_build_ids if the dso already had its build-id read,
don't try again.
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
All the functions that call this can handle the equivalent, non
panic'ing wrapped routines.
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Functions that were calling xzalloc also returned -1 when, for other
reasons, it could fail, and the calleds are coping with failures, so
stop using die() and xzalloc().
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
That could leave filedescriptors open and leak memory. Also stop using
xmalloc, use malloc and handle results just like other error cases in
the same routine that used it.
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Without the bloated cplus_demangle from binutils, i.e building with:
$ make NO_DEMANGLE=1 O=~acme/git/build/perf -j3 -C tools/perf/ install
Before:
text data bss dec hex filename
471851 29280 4025056 4526187 45106b /home/acme/bin/perf
After:
[acme@doppio linux-2.6-tip]$ size ~/bin/perf
text data bss dec hex filename
446886 29232 4008576 4484694 446e56 /home/acme/bin/perf
So its a 5.3% size reduction in code, but the interesting part is in the git
diff --stat output:
19 files changed, 20 insertions(+), 1909 deletions(-)
If we ever need some of the things we got from git but weren't using, we just
have to go to the git repo and get fresh, uptodate source code bits.
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It is hard to read very large numbers so provide an option to perf stat
to separate thousands using a separator. The patch leverages the locale
support of stdio. You need to set your LC_NUMERIC appropriately, for
instance LC_NUMERIC=en_US.UTF8. You need to pass -B to activate this
feature. This way existing scripts parsing the output do not need to be
changed. Here is an example.
$ perf stat noploop 2
noploop for 2 seconds
Performance counter stats for 'noploop 2':
1998.347031 task-clock-msecs # 0.998 CPUs
61 context-switches # 0.000 M/sec
0 CPU-migrations # 0.000 M/sec
118 page-faults # 0.000 M/sec
4,138,410,900 cycles # 2070.917 M/sec (scaled from 70.01%)
2,062,650,268 instructions # 0.498 IPC (scaled from 70.01%)
2,057,653,466 branches # 1029.678 M/sec (scaled from 70.01%)
40,267 branch-misses # 0.002 % (scaled from 30.04%)
2,055,961,348 cache-references # 1028.831 M/sec (scaled from 30.03%)
53,725 cache-misses # 0.027 M/sec (scaled from 30.02%)
2.001393933 seconds time elapsed
$ perf stat -B noploop 2
noploop for 2 seconds
Performance counter stats for 'noploop 2':
1998.297883 task-clock-msecs # 0.998 CPUs
59 context-switches # 0.000 M/sec
0 CPU-migrations # 0.000 M/sec
119 page-faults # 0.000 M/sec
4,131,380,160 cycles # 2067.450 M/sec (scaled from 70.01%)
2,059,096,507 instructions # 0.498 IPC (scaled from 70.01%)
2,054,681,303 branches # 1028.216 M/sec (scaled from 70.01%)
25,650 branch-misses # 0.001 % (scaled from 30.05%)
2,056,283,014 cache-references # 1029.017 M/sec (scaled from 30.03%)
47,097 cache-misses # 0.024 M/sec (scaled from 30.02%)
2.001391016 seconds time elapsed
Cc: David S. Miller <davem@davemloft.net>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <4bf28fe8.914ed80a.01ca.fffff5f5@mx.google.com>
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (311 commits)
perf tools: Add mode to build without newt support
perf symbols: symbol inconsistency message should be done only at verbose=1
perf tui: Add explicit -lslang option
perf options: Type check all the remaining OPT_ variants
perf options: Type check OPT_BOOLEAN and fix the offenders
perf options: Check v type in OPT_U?INTEGER
perf options: Introduce OPT_UINTEGER
perf tui: Add workaround for slang < 2.1.4
perf record: Fix bug mismatch with -c option definition
perf options: Introduce OPT_U64
perf tui: Add help window to show key associations
perf tui: Make <- exit menus too
perf newt: Add single key shortcuts for zoom into DSO and threads
perf newt: Exit browser unconditionally when CTRL+C, q or Q is pressed
perf newt: Fix the 'A'/'a' shortcut for annotate
perf newt: Make <- exit the ui_browser
x86, perf: P4 PMU - fix counters management logic
perf newt: Make <- zoom out filters
perf report: Report number of events, not samples
perf hist: Clarify events_stats fields usage
...
Fix up trivial conflicts in kernel/fork.c and tools/perf/builtin-record.c
slang versions <= 2.0.6 have a "#if HAVE_LONG_LONG" that breaks the
build if it isn't defined. Use the equivalent one that glibc has on
features.h.
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Check elfutils version, and if it is old don't compile CFI analysis code. This
allows to compile perf with old elfutils.
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Tested-by: Stephane Eranian <eranian@google.com>
Reported-by: Robert Richter <robert.richter@amd.com>
Cc: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <20100510171207.26029.97604.stgit@localhost6.localdomain6>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
make NO_NEWT=1
Will avoid building the newt (tui) support.
Suggested-by: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
That happened for an old perf.data file that had no fake MMAP events for
the kernel modules, but even then it should warn once for each module,
not one time for every symbol in every module not found.
Reported-by: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
At least on rawhide using -lnewt is not enough if we use SLang routines
directly, so add an explicit -lslang since we use SLang routines.
Reported-by: Ingo Molnar <mingo@elte.hu>
Tested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
OPT_SET_INT was renamed to OPT_SET_UINT since the only use in these
tools is to set something that has an enum type, that is builtin
compatible with unsigned int.
Several string constifications were done to make OPT_STRING require a
const char * type.
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To avoid problems like the one fixed by Stephane Eranian in 3de29ca, now
we'll got this instead:
bench/sched-messaging.c:259: error: negative width in bit-field ‘<anonymous>’
bench/sched-messaging.c:261: error: negative width in bit-field ‘<anonymous>’
Which is rather cryptic, but is how BUILD_BUG_ON_ZERO works, so kernel
hackers should be already used to this.
With it in place found some problems, fixed by changing the affected
variables to sensible types or changed some OPT_INTEGER to OPT_UINTEGER.
Next csets will go thru converting each of the remaining OPT_ so that
review can be made easier by grouping changes per type per patch.
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For unsigned int options to be parsed, next patches will make use of it.
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Older versions of the slang library didn't used the 'const' specifier,
causing problems with modern compilers of this kind:
util/newt.c:252: error: passing argument 1 of ‘SLsmg_printf’ discards
qualifiers from pointer target type
Fix it by using some wrappers that when needed const the affected
parameters back to plain (char *).
Reported-by: Lin Ming <ming.m.lin@intel.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <20100517145421.GD29052@ghostprotocols.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The -c option defines the user requested sampling period. It was implemented
using an unsigned int variable but the type of the option was OPT_LONG. Thus,
the option parser was overwriting memory belonging to other variables, namely
the mmap_pages leading to a zero page sampling buffer. The bug was exposed only
when compiling at -O0, probably because the compiler was padding variables at
higher optimization levels.
This patch fixes this problem by declaring user_interval as u64. This also
avoids wrap-around issues for large period on 32-bit systems.
Commiter note:
Made it use OPT_U64(user_interval) after implementing OPT_U64 in the
previous patch.
Cc: David S. Miller <davem@davemloft.net>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <4bf11ae9.e88cd80a.06b0.ffffa8e3@mx.google.com>
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We have things like user_interval (-c/--count) in 'perf record' that
needs this.
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In fact it is now added to the hot key list when newt_form__new is used,
allowing us to remove the explicit assignment in all its users.
The visible change is that <- will exit the menu that pops up when -> is
pressed (and Enter when callchains are not being used).
Suggested-by: Ingo Molnar <mingo@elte.hu>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
'D'/'d' for zooming into the DSO in the current highlighted hist entry,
'T'/'t' for zooming into the current thread.
Suggested-by: Ingo Molnar <mingo@elte.hu>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Right now that means that pressing the left arrow willl make the symbol
annotation window to exit back to the main symbol histogram browser.
This is another improvement on the UI fastpath, i.e. just the arrows and
enter are enough for most browsing.
Suggested-by: Ingo Molnar <mingo@elte.hu>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
After we use the filters to zoom into DSOs or threads, we can use <-
(left arrow) to zoom out from the last filter applied.
It is still possible to zoom out of order by using the popup menu.
With this we now have the zoom out operation on the browsing fast path,
by allowing fast navigation using just the four arrors and the enter key
to expand collapse callchains.
Suggested-by: Ingo Molnar <mingo@elte.hu>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Number of samples is meaningless after we switched to auto-freq, so
report the number of events, i.e. not the sum of the different periods,
but the number PERF_RECORD_SAMPLE emitted by the kernel.
While doing this I noticed that naming "count" to the sum of all the
event periods can be confusing, so rename it to .period, just like in
struct sample.data, so that we become more consistent.
This helps with the next step, that was to record in struct hist_entry
the number of sample events for each instance, we need that because we
use it to generate the number of events when applying filters to the
tree of hist entries like it is being done in the TUI report browser.
Suggested-by: Ingo Molnar <mingo@elte.hu>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The events_stats.total field is too generic, rename it to .total_period,
and also add a comment explaining that it is the sum of all the .period
fields in samples, that is needed because we use auto-freq to avoid
sampling artifacts.
Ditto for events_stats.lost, that is the sum of all lost_event.lost
fields, i.e. the number of events the kernel dropped.
Looking at the users, builtin-sched.c can make use of these fields and
stop doing it again.
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This is one more thing that started global but are more useful per hist
or per session.
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
option option -> option
special special -> special
Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <1273747165-17242-1-git-send-email-kirr@mns.spb.ru>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
By default, event inheritance across fork and pthread_create was on but the -i
option of stat and record, which enabled inheritance, led to believe it was off
by default.
This patch fixes this logic by inverting the meaning of the -i option. By
default inheritance is on whether you attach to a process (-p), a thread (-t)
or start a process. If you pass -i, then you turn off inheritance. Turning off
inheritance if you don't need it, helps limit perf resource usage as well.
The patch also fixes perf stat -t xxxx and perf record -t xxxx which did not
start the counters.
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <4bea9d2f.d60ce30a.0b5b.08e1@mx.google.com>
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
hist.c needs to include util.h so that it gets stdio.h
inclusion with __GNU_SOURCE defined.
Fixes:
util/hist.c: In function ‘hist_entry__parse_objdump_line’:
util/hist.c:931: erreur: implicit declaration of function ‘getline’
util/hist.c:931: erreur: nested extern declaration of ‘getline’
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1273772836-11533-1-git-send-regression-fweisbec@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix mistake in a parameter type of the no-newt hists__browse()
version.
Fixes:
builtin-report.c: In function ‘__cmd_report’:
builtin-report.c:314: erreur: incompatible type for argument 1 of ‘hists__browse’
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1273771378-8577-1-git-send-regression-fweisbec@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Usually "_text" is enough, but I received reports that its not always
available, so fallback to "_stext" for the symbol we use to check if we
need to apply any relocation to all the symbols in the kernel symtab,
for when, for instance, kexec is being used.
Reported-by: Darren Hart <dvhltc@us.ibm.com>
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Now we don't anymore use popen to run 'perf annotate' for the selected
symbol, instead we collect per address samplings when processing samples
in 'perf report' if we're using the newt browser, then we use this data
directly to do annotation.
Done this way we can actually traverse the objdump_line objects
directly, matching the addresses to the collected samples and colouring
them appropriately using lower level slang routines.
The new ui_browser class will be reused for the main, callchain aware,
histogram browser, when it will be made generic and don't assume that
the objects are always instances of the objdump_line class maintained
using list_heads.
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Initially this was just to be able to have a printf like method to
prepare the formatted string and then pass to newtPushHelpLine, but as
we already have for ui_progress, etc, its a step in identifying a
restricted, highlevel set of widgets we can then have implementations
for multiple widget sets (GTK, etc).
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For Fedora, I want to force perf to link against libiberty.a for
cplus_demangle, rather than libbfd.a for bfd_demangle due to licensing insanity
on binutils. (libiberty is LGPL2, libbfd is GPL3.)
If we just rely on autodetection, we'll end up with libbfd linked against us,
since they're both in binutils-static in the buildroot.
Cc: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <20100510204335.GA7565@bombadil.infradead.org>
Signed-off-by: Kyle McMartin <kyle@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Check whether elfutils is older than 0.138 (from which version checking
routine has been introduced). And if so, set NO_DWARF because it is hard
to check the API dependency without version checking.
Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Reported-by: Robert Richter <robert.richter@amd.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <20100511045953.9913.19485.stgit@localhost6.localdomain6>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Those are really not specific to the newt code, can be used by other UI
frontends.
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The raw_field_ptr() helper, used to retrieve the address of a field
inside a trace event, treats every strings as if they were dynamic
ie: having a secondary level of indirection to retrieve their
contents.
FIELD_IS_STRING doesn't mean FIELD_IS_DYNAMIC, we only need to
compute the secondary dereference for the latter case.
This fixes perf sched segfaults, bad cmdline report and may be
some other bugs.
Reported-by: Jason Baron <jbaron@redhat.com>
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
A small fix for the syscall counts script:
- silence the match output in the shell script
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <1273466820-9330-10-git-send-email-tzanussi@gmail.com>
Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
A small fix for the syscall counts by pid script:
- silence the match output in the shell script
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <1273466820-9330-9-git-send-email-tzanussi@gmail.com>
Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
A small fixe for the failed syscalls by pid script:
- silence the match output in the shell script
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <1273466820-9330-8-git-send-email-tzanussi@gmail.com>
Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Only print the script start/stop messages in verbose mode - users
normally don't care and it just clutters up the output.
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <1273466820-9330-7-git-send-email-tzanussi@gmail.com>
Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Some minor fixes for the workqueue-stats script:
- Fix nuisance 'use of uninitialized value' warnings
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <1273466820-9330-6-git-send-email-tzanussi@gmail.com>
Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
A couple of fixes for the rwtop script:
- printing the totals and clearing the hashes in the signal handler
eventually leads to various random and serious problems when running
the rwtop script continuously. Moving the print_totals() calls to
the event handlers solves that problem, and the event handlers are
invoked frequently enough that it doesn't affect the timeliness of
the output.
- Fix nuisance 'use of uninitialized value' warnings
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Message-Id: <1273466820-9330-4-git-send-email-tzanussi@gmail.com>
Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Some minor fixes for the rw-by-pid script:
- Fix nuisance 'use of uninitialized value' warnings
- Change the failed read/write sections to sort by error counts
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <1273466820-9330-3-git-send-email-tzanussi@gmail.com>
Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
A couple small fixes for the failed syscalls script:
- The script description says it can be restricted to a specific comm,
make it so.
- silence the match output in the shell script
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <1273466820-9330-2-git-send-email-tzanussi@gmail.com>
Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Better done when we are adding entries, be it initially of when we're
re-sorting the histograms.
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In cbbc79a we introduced support for multiple events by introducing a
new "event_stat_id" struct and then made several perf_session methods
receive a point to it instead of a pointer to perf_session, and kept the
event_stats and hists rb_tree in perf_session.
While working on the new newt based browser, I realised that it would be
better to introduce a new class, "hists" (short for "histograms"),
renaming the "event_stat_id" struct and the perf_session methods that
were really "hists" methods, as they manipulate only struct hists
members, not touching anything in the other perf_session members.
Other optimizations, such as calculating the maximum lenght of a symbol
name present in an hists instance will be possible as we add them,
avoiding a re-traversal just for finding that information.
The rationale for the name "hists" to replace "event_stat_id" is that we
may have multiple sets of hists for the same event_stat id, as, for
instance, the 'perf diff' tool has, so event stat id is not what
characterizes what this struct and the functions that manipulate it do.
Cc: Eric B Munson <ebmunson@us.ibm.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Using machines__create_kernel_maps(..., HOST_KERNEL_ID) it would create
another machine instance for the host machine, and since 1f626bc we have
it out of the machines rb_tree.
Fix it by using machine__create_kernel_maps(&self->host_machine)
directly.
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Instead of newtAddComponents(just-one-entry, NULL), that is not needed
if, like in this browser, we're adding just one component at a time.
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Works by adding a third parameter to the '-g' argument, after the graph
type and minimum percentage, for example:
[root@doppio linux-2.6-tip]# perf report -g fractal,0.5,2
Will show only the first two symbols where at least 0.5% of the samples
took place.
All the other symbols that don't fall outside these constraints will be
put together in the last entry, prefixed with "[...]" and the total
percentage for them.
Suggested-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We have just one host on a given session, and that is the most common
setup right now, so embed a ->host_machine struct machine instance
directly in the perf_session class, check if we're looking for it before
going to the rb_tree.
This also fixes a problem found when we try to process old perf.data
files where we didn't have MMAP events for the kernel and modules and
thus don't create the kernel maps, do it in event__preprocess_sample if
it wasn't already.
Reported-by: Ingo Molnar <mingo@elte.hu>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Zhang, Yanmin <yanmin_zhang@linux.intel.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Which can happen when processing old files that had no fake kernel MMAP,
events.
That shouldn't result in perf_session__create_kernel_maps not being
called, this will be fixed in a followup patch, for now do these checks
to avoid segfaulting.
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
By using BITS_PER_LONG / 4, that is the number of chars that will be
used in such cases as the DSO "name".
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch drops "-a" from the default arguments passed to
perf record by perf lock.
If a user wants to do a system wide record of lock events,
perf lock record -a <program> <argument> ...
is enough for this purpose.
This can reduce the size of the perf.data file.
% sudo ./perf lock record whoami
root
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.439 MB perf.data (~19170 samples) ]
% sudo ./perf lock record -a whoami # with -a option
root
[ perf record: Woken up 0 times to write data ]
[ perf record: Captured and wrote 48.962 MB perf.data (~2139197 samples) ]
Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
LKML-Reference: Message-Id: <1273306229-5216-1-git-send-email-mitake@dcl.info.waseda.ac.jp>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
And with that fix at least one bug:
The first hit for an entry, the one that calls malloc to create a new
instance in __perf_session__add_hist_entry, wasn't adding the count to
the per cpumode (PERF_RECORD_MISC_USER, etc) total variable.
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Some events, such as the PERF_RECORD_FINISHED_ROUND event consist of
only an event header and no data. In this case, a 0-length payload
will be read, and the 0 return value will be wrongly interpreted as an
'unexpected end of event stream'.
This patch allows for proper handling of data-less events by skipping
0-length reads.
Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
LKML-Reference: <1273038527.6383.51.camel@tropicana>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
When a lock is acquired after beeing contended, we update the
wait time statistics for the given lock.
But if the min wait time is updated, we don't check the max wait
time. This is wrong because the first time we update the wait time,
we want to update both min and max wait time.
Before:
Name acquired contended total wait (ns) max wait (ns) min wait (ns)
key 8 1 21656 0 21656
After:
Name acquired contended total wait (ns) max wait (ns) min wait (ns)
key 8 1 21656 21656 21656
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Fix the cast made to get the bad rate. It is made in the result
instead of the operands. We need the operands to be cast in double,
otherwise the result will always be zero.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Use enum to get a human view of bad_hist indexes and
put bad histogram output in its own function.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
This adds the "info" subcommand to perf lock which can be used
to dump metadata like threads or addresses of lock instances.
"map" was removed because info should do the work for it.
This will be useful not only for debugging but also for ordinary
analyzing.
v2: adding example of usage
% sudo ./perf lock info -t
| Thread ID: comm
| 0: swapper
| 1: init
| 18: migration/5
| 29: events/2
| 32: events/5
| 33: events/6
...
% sudo ./perf lock info -m
| Address of instance: name of class
| 0xffff8800b95adae0: &(&sighand->siglock)->rlock
| 0xffff8800bbb41ae0: &(&sighand->siglock)->rlock
| 0xffff8800bf165ae0: &(&sighand->siglock)->rlock
| 0xffff8800b9576a98: &p->cred_guard_mutex
| 0xffff8800bb890a08: &(&p->alloc_lock)->rlock
| 0xffff8800b9522a08: &(&p->alloc_lock)->rlock
| 0xffff8800bb8aaa08: &(&p->alloc_lock)->rlock
| 0xffff8800bba72a08: &(&p->alloc_lock)->rlock
| 0xffff8800bf18ea08: &(&p->alloc_lock)->rlock
| 0xffff8800b8a0d8a0: &(&ip->i_lock)->mr_lock
| 0xffff88009bf818a0: &(&ip->i_lock)->mr_lock
| 0xffff88004c66b8a0: &(&ip->i_lock)->mr_lock
| 0xffff8800bb6478a0: &(shost->host_lock)->rlock
v3: fixed some problems Frederic pointed out
* better rbtree tracking in dump_threads()
* removed printf() and used pr_info() and pr_debug()
Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
LKML-Reference: <1272863520-16179-1-git-send-email-mitake@dcl.info.waseda.ac.jp>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
The current events reordering algorithm is based on a heuristic that
gets broken once we deal with a very fast flow of events.
Indeed the time period based flushing is not suitable anymore
in the following case, assuming we have a flush period of two
seconds.
CPU 0 | CPU 1
|
cnt1 timestamps | cnt1 timestamps
|
0 | 0
1 | 1
2 | 2
3 | 3
[...] | [...]
4 seconds later
If we spend too much time to read the buffers (case of a lot of
events to record in each buffers or when we have a lot of CPU buffers
to read), in the next pass the CPU 0 buffer could contain a slice
of several seconds of events. We'll read them all and notice we've
reached the period to flush. In the above example we flush the first
half of the CPU 0 buffer, then we read the CPU 1 buffer where we
have events that were on the flush slice and then the reordering
fails.
It's simple to reproduce with:
perf lock record perf bench sched messaging
To solve this, we use a new solution that doesn't rely on an
heuristical time slice period anymore but on a deterministic basis
based on how perf record does its job.
perf record saves the buffers through passes. A pass is a tour
on every buffers from every CPUs. This is made in order: for
each CPU we read the buffers of every counters. So the more
buffers we visit, the later will be the timstamps of their events.
When perf record finishes a pass it records a
PERF_RECORD_FINISHED_ROUND pseudo event.
We record the max timestamp t found in the pass n. Assuming these
timestamps are monotonic across cpus, we know that if a buffer
still has events with timestamps below t, they will be all available
and then read in the pass n + 1.
Hence when we start to read the pass n + 2, we can safely flush every
events with timestamps below t.
============ PASS n =================
CPU 0 | CPU 1
|
cnt1 timestamps | cnt2 timestamps
1 | 2
2 | 3
- | 4 <--- max recorded
============ PASS n + 1 ==============
CPU 0 | CPU 1
|
cnt1 timestamps | cnt2 timestamps
3 | 5
4 | 6
5 | 7 <---- max recorded
Flush every events below timestamp 4
============ PASS n + 2 ==============
CPU 0 | CPU 1
|
cnt1 timestamps | cnt2 timestamps
6 | 8
7 | 9
- | 10
Flush every events below timestamp 7
etc...
It also works on perf.data versions that don't have
PERF_RECORD_FINISHED_ROUND pseudo events. The difference is that
the events will be only flushed in the end of the perf.data
processing. It will then consume more memory and scale less with
large perf.data files.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
In order to provide a more rubust and deterministic reordering
algorithm, we need to know when we reach a point where we just
did a pass through over every counter buffers to read every thing
they had.
This patch introduces a new PERF_RECORD_FINISHED_ROUND pseudo event
that only consist in an event header and doesn't need to contain
anything.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
This patch improves 'perf report -h' output for the
'--call-graph' command line option by enumerating the
different output types.
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1273332783-4268-1-git-send-email-penberg@cs.helsinki.fi>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
It was x86 specific and imcomplete at that, improve the situation by
making it clear where the example provided applies and by adding the
URLs for the Intel and AMD manuals where this is discussed in depth.
Acked-by: Robert Richter <robert.richter@amd.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Robert Richter <robert.richter@amd.com>
Reported-by: Robert Richter <robert.richter@amd.com
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rename perf_event_attr::precise to perf_event_attr::precise_ip and
widen it to 2 bits. This new field describes the required precision of
the PERF_SAMPLE_IP field:
0 - SAMPLE_IP can have arbitrary skid
1 - SAMPLE_IP must have constant skid
2 - SAMPLE_IP requested to have 0 skid
3 - SAMPLE_IP must have 0 skid
And modify the Intel PEBS code accordingly. The PEBS implementation
now supports up to precise_ip == 2, where we perform the IP fixup.
Also s/PERF_RECORD_MISC_EXACT/&_IP/ to clarify its meaning, this bit
should be set for each PERF_SAMPLE_IP field known to match the actual
instruction triggering the event.
This new scheme allows for a PEBS mode that uses the buffer for more
than a single event.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Using explanation given by Ingo Molnar in the oprofile mailing list.
Suggested-by: Nick Black <dank@qemfd.net>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Nick Black <dank@qemfd.net>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix a couple of inefficiencies and redundancies related to
have_tracepoints() and its use when checking whether to write
TRACE_INFO.
First, there's no need to use get_tracepoints_path() in
have_tracepoints() - we really just want the part that checks whether
any attributes correspondo to tracepoints.
Second, we really don't care about raw_samples per se - tracepoints
are always raw_samples. In any case, the have_tracepoints() check
should be sufficient to decide whether or not to write TRACE_INFO.
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>,
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <1273030770.6383.6.camel@tropicana>
Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The first was always using the ->long_name, while the later used
->short_name if verbose was not set, resulting in the dso column to be
much wider than needed most of the time.
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
On a large machine we spend a lot of time in perf_header__find_attr when
running perf report.
If we are parsing a file without PERF_SAMPLE_ID then for each sample we call
perf_header__find_attr and loop through all counter IDs, never finding a match.
As the machine gets larger there are more per cpu counters and we spend an
awful lot of time in there.
The patch below initialises each sample id to -1ULL and checks for this in
perf_header__find_attr. We may need to do something more intelligent eventually
(eg a hash lookup from counter id to attr) but this at least fixes the most
common usage of perf report.
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Eric B Munson <ebmunson@us.ibm.com>
Acked-by: Eric B Munson <ebmunson@us.ibm.com>
LKML-Reference: <20100504111915.GB14636@kryten>
Signed-off-by: Anton Blanchard <anton@samba.org>
--
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>