linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-18 01:51:53 +00:00

Author	SHA1	Message	Date
Arnaldo Carvalho de Melo	cee75ac7ec	perf hist: Clarify events_stats fields usage The events_stats.total field is too generic, rename it to .total_period, and also add a comment explaining that it is the sum of all the .period fields in samples, that is needed because we use auto-freq to avoid sampling artifacts. Ditto for events_stats.lost, that is the sum of all lost_event.lost fields, i.e. the number of events the kernel dropped. Looking at the users, builtin-sched.c can make use of these fields and stop doing it again. Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-05-14 13:16:55 -03:00
Arnaldo Carvalho de Melo	c8446b9bda	perf hist: Make event__totals per hists This is one more thing that started global but are more useful per hist or per session. Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-05-14 10:36:42 -03:00
Kirill Smelkov	5d2be7cb19	perf trace scripts: Fix typos in perf-trace-python.txt option option -> option special special -> special Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <1273747165-17242-1-git-send-email-kirr@mns.spb.ru> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-05-13 17:10:40 -03:00
Stephane Eranian	2e6cdf996b	perf tools: change event inheritance logic in stat and record By default, event inheritance across fork and pthread_create was on but the -i option of stat and record, which enabled inheritance, led to believe it was off by default. This patch fixes this logic by inverting the meaning of the -i option. By default inheritance is on whether you attach to a process (-p), a thread (-t) or start a process. If you pass -i, then you turn off inheritance. Turning off inheritance if you don't need it, helps limit perf resource usage as well. The patch also fixes perf stat -t xxxx and perf record -t xxxx which did not start the counters. Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: David S. Miller <davem@davemloft.net> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <4bea9d2f.d60ce30a.0b5b.08e1@mx.google.com> Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-05-13 16:39:12 -03:00
Frederic Weisbecker	8a0ecfb8b4	perf hist: Fix missing getline declaration hist.c needs to include util.h so that it gets stdio.h inclusion with __GNU_SOURCE defined. Fixes: util/hist.c: In function ‘hist_entry__parse_objdump_line’: util/hist.c:931: erreur: implicit declaration of function ‘getline’ util/hist.c:931: erreur: nested extern declaration of ‘getline’ Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1273772836-11533-1-git-send-regression-fweisbec@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-05-13 16:32:58 -03:00
Frederic Weisbecker	8769e1c717	perf hist: Fix hists__browse no-newt case Fix mistake in a parameter type of the no-newt hists__browse() version. Fixes: builtin-report.c: In function ‘__cmd_report’: builtin-report.c:314: erreur: incompatible type for argument 1 of ‘hists__browse’ Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <1273771378-8577-1-git-send-regression-fweisbec@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-05-13 16:32:28 -03:00
Arnaldo Carvalho de Melo	46db2c3205	perf record: Add a fallback to the reference relocation symbol Usually "_text" is enough, but I received reports that its not always available, so fallback to "_stext" for the symbol we use to check if we need to apply any relocation to all the symbols in the kernel symtab, for when, for instance, kexec is being used. Reported-by: Darren Hart <dvhltc@us.ibm.com> Reported-by: Steven Rostedt <rostedt@goodmis.org> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-05-13 07:55:29 +02:00
Arnaldo Carvalho de Melo	ef7b93a119	perf report: Librarize the annotation code and use it in the newt browser Now we don't anymore use popen to run 'perf annotate' for the selected symbol, instead we collect per address samplings when processing samples in 'perf report' if we're using the newt browser, then we use this data directly to do annotation. Done this way we can actually traverse the objdump_line objects directly, matching the addresses to the collected samples and colouring them appropriately using lower level slang routines. The new ui_browser class will be reused for the main, callchain aware, histogram browser, when it will be made generic and don't assume that the objects are always instances of the objdump_line class maintained using list_heads. Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-05-11 23:23:20 -03:00
Arnaldo Carvalho de Melo	3798ed7bc7	perf ui: Add ui_helpline methods Initially this was just to be able to have a printf like method to prepare the formatted string and then pass to newtPushHelpLine, but as we already have for ui_progress, etc, its a step in identifying a restricted, highlevel set of widgets we can then have implementations for multiple widget sets (GTK, etc). Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-05-11 18:01:23 -03:00
Kyle McMartin	d11c7addfe	perf symbols: allow forcing use of cplus_demangle For Fedora, I want to force perf to link against libiberty.a for cplus_demangle, rather than libbfd.a for bfd_demangle due to licensing insanity on binutils. (libiberty is LGPL2, libbfd is GPL3.) If we just rely on autodetection, we'll end up with libbfd linked against us, since they're both in binutils-static in the buildroot. Cc: Ingo Molnar <mingo@elte.hu> LKML-Reference: <20100510204335.GA7565@bombadil.infradead.org> Signed-off-by: Kyle McMartin <kyle@redhat.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-05-11 12:43:11 -03:00
Masami Hiramatsu	6b3c4ef504	perf probe: Check older elfutils and set NO_DWARF Check whether elfutils is older than 0.138 (from which version checking routine has been introduced). And if so, set NO_DWARF because it is hard to check the API dependency without version checking. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Reported-by: Robert Richter <robert.richter@amd.com> Cc: Robert Richter <robert.richter@amd.com> Cc: Ingo Molnar <mingo@elte.hu> LKML-Reference: <20100511045953.9913.19485.stgit@localhost6.localdomain6> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-05-11 12:43:11 -03:00
Arnaldo Carvalho de Melo	b09e0190ac	perf hist: Adopt filter by dso and by thread methods from the newt browser Those are really not specific to the newt code, can be used by other UI frontends. Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-05-11 12:43:10 -03:00
Frederic Weisbecker	de068ec048	perf: Fix static strings treated like dynamic ones The raw_field_ptr() helper, used to retrieve the address of a field inside a trace event, treats every strings as if they were dynamic ie: having a secondary level of indirection to retrieve their contents. FIELD_IS_STRING doesn't mean FIELD_IS_DYNAMIC, we only need to compute the secondary dereference for the latter case. This fixes perf sched segfaults, bad cmdline report and may be some other bugs. Reported-by: Jason Baron <jbaron@redhat.com> Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Tom Zanussi <tzanussi@gmail.com>	2010-05-11 09:14:24 +02:00
Tom Zanussi	e61a639a79	perf/trace/scripting: syscall-counts script cleanup A small fix for the syscall counts script: - silence the match output in the shell script Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> LKML-Reference: <1273466820-9330-10-git-send-email-tzanussi@gmail.com> Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-05-10 19:51:02 -03:00
Tom Zanussi	79e653f1bf	perf/trace/scripting: syscall-counts-by-pid script cleanup A small fix for the syscall counts by pid script: - silence the match output in the shell script Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> LKML-Reference: <1273466820-9330-9-git-send-email-tzanussi@gmail.com> Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-05-10 19:51:01 -03:00
Tom Zanussi	a4ab0c1297	perf/trace/scripting: failed-syscalls-by-pid script cleanup A small fixe for the failed syscalls by pid script: - silence the match output in the shell script Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> LKML-Reference: <1273466820-9330-8-git-send-email-tzanussi@gmail.com> Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-05-10 19:51:00 -03:00
Tom Zanussi	3824a4e8da	perf/trace/scripting: don't show script start/stop messages by default Only print the script start/stop messages in verbose mode - users normally don't care and it just clutters up the output. Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> LKML-Reference: <1273466820-9330-7-git-send-email-tzanussi@gmail.com> Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-05-10 19:50:59 -03:00
Tom Zanussi	a3412d9b35	perf/trace/scripting: workqueue-stats script cleanup Some minor fixes for the workqueue-stats script: - Fix nuisance 'use of uninitialized value' warnings Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> LKML-Reference: <1273466820-9330-6-git-send-email-tzanussi@gmail.com> Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-05-10 19:50:58 -03:00
Tom Zanussi	e366728d57	perf/trace/scripting: wakeup-latency script cleanup Some minor fixes for the wakeup-latency script: - Fix nuisance 'use of uninitialized value' warnings - Avoid divide-by-zero error Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> LKML-Reference: <1273466820-9330-5-git-send-email-tzanussi@gmail.com> Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-05-10 19:50:57 -03:00
Tom Zanussi	e88a4bfbcd	perf/trace/scripting: rwtop script cleanup A couple of fixes for the rwtop script: - printing the totals and clearing the hashes in the signal handler eventually leads to various random and serious problems when running the rwtop script continuously. Moving the print_totals() calls to the event handlers solves that problem, and the event handlers are invoked frequently enough that it doesn't affect the timeliness of the output. - Fix nuisance 'use of uninitialized value' warnings Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Message-Id: <1273466820-9330-4-git-send-email-tzanussi@gmail.com> Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-05-10 19:50:56 -03:00
Tom Zanussi	6922c3d772	perf/trace/scripting: rw-by-pid script cleanup Some minor fixes for the rw-by-pid script: - Fix nuisance 'use of uninitialized value' warnings - Change the failed read/write sections to sort by error counts Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> LKML-Reference: <1273466820-9330-3-git-send-email-tzanussi@gmail.com> Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-05-10 19:50:55 -03:00
Tom Zanussi	c3f5fd287a	perf/trace/scripting: failed-syscalls script cleanup A couple small fixes for the failed syscalls script: - The script description says it can be restricted to a specific comm, make it so. - silence the match output in the shell script Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> LKML-Reference: <1273466820-9330-2-git-send-email-tzanussi@gmail.com> Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-05-10 19:50:54 -03:00
Arnaldo Carvalho de Melo	fefb0b94bb	perf hist: Calculate max_sym name len and nr_entries Better done when we are adding entries, be it initially of when we're re-sorting the histograms. Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-05-10 19:49:08 -03:00
Arnaldo Carvalho de Melo	1c02c4d2e9	perf hist: Introduce hists class and move lots of methods to it In cbbc79a we introduced support for multiple events by introducing a new "event_stat_id" struct and then made several perf_session methods receive a point to it instead of a pointer to perf_session, and kept the event_stats and hists rb_tree in perf_session. While working on the new newt based browser, I realised that it would be better to introduce a new class, "hists" (short for "histograms"), renaming the "event_stat_id" struct and the perf_session methods that were really "hists" methods, as they manipulate only struct hists members, not touching anything in the other perf_session members. Other optimizations, such as calculating the maximum lenght of a symbol name present in an hists instance will be possible as we add them, avoiding a re-traversal just for finding that information. The rationale for the name "hists" to replace "event_stat_id" is that we may have multiple sets of hists for the same event_stat id, as, for instance, the 'perf diff' tool has, so event stat id is not what characterizes what this struct and the functions that manipulate it do. Cc: Eric B Munson <ebmunson@us.ibm.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-05-10 13:13:49 -03:00
Arnaldo Carvalho de Melo	d118f8ba6a	perf session: create_kernel_maps should use ->host_machine Using machines__create_kernel_maps(..., HOST_KERNEL_ID) it would create another machine instance for the host machine, and since `1f626bc` we have it out of the machines rb_tree. Fix it by using machine__create_kernel_maps(&self->host_machine) directly. Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-05-10 12:51:05 -03:00
Arnaldo Carvalho de Melo	cdd5b75b0c	perf callchains: Use zalloc to allocate objects Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-05-10 10:57:39 -03:00
Arnaldo Carvalho de Melo	7f8264539c	perf newt: Use newtAddComponent() Instead of newtAddComponents(just-one-entry, NULL), that is not needed if, like in this browser, we're adding just one component at a time. Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-05-10 10:51:25 -03:00
Ingo Molnar	1f0ac7183f	Merge branch 'perf/test' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into perf/core	2010-05-10 08:20:19 +02:00
Arnaldo Carvalho de Melo	232a5c948d	perf report: Allow limiting the number of entries to print in callchains Works by adding a third parameter to the '-g' argument, after the graph type and minimum percentage, for example: [root@doppio linux-2.6-tip]# perf report -g fractal,0.5,2 Will show only the first two symbols where at least 0.5% of the samples took place. All the other symbols that don't fall outside these constraints will be put together in the last entry, prefixed with "[...]" and the total percentage for them. Suggested-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Arjan van de Ven <arjan@linux.intel.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-05-09 21:15:35 -03:00
Arnaldo Carvalho de Melo	1f626bc368	perf session: Embed the host machine data on perf_session We have just one host on a given session, and that is the most common setup right now, so embed a ->host_machine struct machine instance directly in the perf_session class, check if we're looking for it before going to the rb_tree. This also fixes a problem found when we try to process old perf.data files where we didn't have MMAP events for the kernel and modules and thus don't create the kernel maps, do it in event__preprocess_sample if it wasn't already. Reported-by: Ingo Molnar <mingo@elte.hu> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Tom Zanussi <tzanussi@gmail.com> Cc: Zhang, Yanmin <yanmin_zhang@linux.intel.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-05-09 21:14:52 -03:00
Arnaldo Carvalho de Melo	4cc4945844	perf symbols: Check if a struct machine instance was found Which can happen when processing old files that had no fake kernel MMAP, events. That shouldn't result in perf_session__create_kernel_maps not being called, this will be fixed in a followup patch, for now do these checks to avoid segfaulting. Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-05-09 21:14:07 -03:00
Arnaldo Carvalho de Melo	3ceb0d4438	perf symbols: Consider unresolved DSOs in the dso__col_widt calculation By using BITS_PER_LONG / 4, that is the number of chars that will be used in such cases as the DSO "name". Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-05-09 18:32:32 -03:00
Hitoshi Mitake	76ba7e846f	perf lock: Drop "-a" option from cmd_record() default arguments set This patch drops "-a" from the default arguments passed to perf record by perf lock. If a user wants to do a system wide record of lock events, perf lock record -a <program> <argument> ... is enough for this purpose. This can reduce the size of the perf.data file. % sudo ./perf lock record whoami root [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.439 MB perf.data (~19170 samples) ] % sudo ./perf lock record -a whoami # with -a option root [ perf record: Woken up 0 times to write data ] [ perf record: Captured and wrote 48.962 MB perf.data (~2139197 samples) ] Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Jason Baron <jbaron@redhat.com> Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> LKML-Reference: Message-Id: <1273306229-5216-1-git-send-email-mitake@dcl.info.waseda.ac.jp> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>	2010-05-09 21:52:27 +02:00
Arnaldo Carvalho de Melo	28e2a106d1	perf hist: Simplify the insertion of new hist_entry instances And with that fix at least one bug: The first hit for an entry, the one that calls malloc to create a new instance in __perf_session__add_hist_entry, wasn't adding the count to the per cpumode (PERF_RECORD_MISC_USER, etc) total variable. Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-05-09 13:10:39 -03:00
Arnaldo Carvalho de Melo	39d1e1b1e2	perf report: Fix leak of resolved callchains array on error path Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-05-09 13:07:39 -03:00
Arnaldo Carvalho de Melo	139633c6a4	perf callchain: Move validate_callchain to callchain lib Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-05-09 13:07:05 -03:00
Tom Zanussi	794e43b56c	perf/live-mode: Handle payload-less events Some events, such as the PERF_RECORD_FINISHED_ROUND event consist of only an event header and no data. In this case, a 0-length payload will be read, and the 0 return value will be wrongly interpreted as an 'unexpected end of event stream'. This patch allows for proper handling of data-less events by skipping 0-length reads. Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Masami Hiramatsu <mhiramat@redhat.com> LKML-Reference: <1273038527.6383.51.camel@tropicana> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>	2010-05-09 13:49:52 +02:00
Frederic Weisbecker	90c0e5fc7b	perf lock: Always check min AND max wait time When a lock is acquired after beeing contended, we update the wait time statistics for the given lock. But if the min wait time is updated, we don't check the max wait time. This is wrong because the first time we update the wait time, we want to update both min and max wait time. Before: Name acquired contended total wait (ns) max wait (ns) min wait (ns) key 8 1 21656 0 21656 After: Name acquired contended total wait (ns) max wait (ns) min wait (ns) key 8 1 21656 21656 21656 Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>	2010-05-09 13:45:30 +02:00
Frederic Weisbecker	5efe08cf68	perf: Fix perf lock bad rate Fix the cast made to get the bad rate. It is made in the result instead of the operands. We need the operands to be cast in double, otherwise the result will always be zero. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>	2010-05-09 13:45:29 +02:00
Frederic Weisbecker	84c7a21791	perf: Humanize lock flags in perf lock Use an enum instead of plain constants for lock flags. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>	2010-05-09 13:45:27 +02:00
Frederic Weisbecker	10350ec362	perf: Cleanup perf lock broken states Use enum to get a human view of bad_hist indexes and put bad histogram output in its own function. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>	2010-05-09 13:45:26 +02:00
Hitoshi Mitake	26242d859c	perf lock: Add "info" subcommand for dumping misc information This adds the "info" subcommand to perf lock which can be used to dump metadata like threads or addresses of lock instances. "map" was removed because info should do the work for it. This will be useful not only for debugging but also for ordinary analyzing. v2: adding example of usage % sudo ./perf lock info -t \| Thread ID: comm \| 0: swapper \| 1: init \| 18: migration/5 \| 29: events/2 \| 32: events/5 \| 33: events/6 ... % sudo ./perf lock info -m \| Address of instance: name of class \| 0xffff8800b95adae0: &(&sighand->siglock)->rlock \| 0xffff8800bbb41ae0: &(&sighand->siglock)->rlock \| 0xffff8800bf165ae0: &(&sighand->siglock)->rlock \| 0xffff8800b9576a98: &p->cred_guard_mutex \| 0xffff8800bb890a08: &(&p->alloc_lock)->rlock \| 0xffff8800b9522a08: &(&p->alloc_lock)->rlock \| 0xffff8800bb8aaa08: &(&p->alloc_lock)->rlock \| 0xffff8800bba72a08: &(&p->alloc_lock)->rlock \| 0xffff8800bf18ea08: &(&p->alloc_lock)->rlock \| 0xffff8800b8a0d8a0: &(&ip->i_lock)->mr_lock \| 0xffff88009bf818a0: &(&ip->i_lock)->mr_lock \| 0xffff88004c66b8a0: &(&ip->i_lock)->mr_lock \| 0xffff8800bb6478a0: &(shost->host_lock)->rlock v3: fixed some problems Frederic pointed out * better rbtree tracking in dump_threads() * removed printf() and used pr_info() and pr_debug() Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Jason Baron <jbaron@redhat.com> Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> LKML-Reference: <1272863520-16179-1-git-send-email-mitake@dcl.info.waseda.ac.jp> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>	2010-05-09 13:45:24 +02:00
Frederic Weisbecker	d6b17bebd7	perf: Provide a new deterministic events reordering algorithm The current events reordering algorithm is based on a heuristic that gets broken once we deal with a very fast flow of events. Indeed the time period based flushing is not suitable anymore in the following case, assuming we have a flush period of two seconds. CPU 0 \| CPU 1 \| cnt1 timestamps \| cnt1 timestamps \| 0 \| 0 1 \| 1 2 \| 2 3 \| 3 [...] \| [...] 4 seconds later If we spend too much time to read the buffers (case of a lot of events to record in each buffers or when we have a lot of CPU buffers to read), in the next pass the CPU 0 buffer could contain a slice of several seconds of events. We'll read them all and notice we've reached the period to flush. In the above example we flush the first half of the CPU 0 buffer, then we read the CPU 1 buffer where we have events that were on the flush slice and then the reordering fails. It's simple to reproduce with: perf lock record perf bench sched messaging To solve this, we use a new solution that doesn't rely on an heuristical time slice period anymore but on a deterministic basis based on how perf record does its job. perf record saves the buffers through passes. A pass is a tour on every buffers from every CPUs. This is made in order: for each CPU we read the buffers of every counters. So the more buffers we visit, the later will be the timstamps of their events. When perf record finishes a pass it records a PERF_RECORD_FINISHED_ROUND pseudo event. We record the max timestamp t found in the pass n. Assuming these timestamps are monotonic across cpus, we know that if a buffer still has events with timestamps below t, they will be all available and then read in the pass n + 1. Hence when we start to read the pass n + 2, we can safely flush every events with timestamps below t. ============ PASS n ================= CPU 0 \| CPU 1 \| cnt1 timestamps \| cnt2 timestamps 1 \| 2 2 \| 3 - \| 4 <--- max recorded ============ PASS n + 1 ============== CPU 0 \| CPU 1 \| cnt1 timestamps \| cnt2 timestamps 3 \| 5 4 \| 6 5 \| 7 <---- max recorded Flush every events below timestamp 4 ============ PASS n + 2 ============== CPU 0 \| CPU 1 \| cnt1 timestamps \| cnt2 timestamps 6 \| 8 7 \| 9 - \| 10 Flush every events below timestamp 7 etc... It also works on perf.data versions that don't have PERF_RECORD_FINISHED_ROUND pseudo events. The difference is that the events will be only flushed in the end of the perf.data processing. It will then consume more memory and scale less with large perf.data files. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Tom Zanussi <tzanussi@gmail.com> Cc: Masami Hiramatsu <mhiramat@redhat.com>	2010-05-09 13:43:42 +02:00
Frederic Weisbecker	9840280757	perf: Introduce a new "round of buffers read" pseudo event In order to provide a more rubust and deterministic reordering algorithm, we need to know when we reach a point where we just did a pass through over every counter buffers to read every thing they had. This patch introduces a new PERF_RECORD_FINISHED_ROUND pseudo event that only consist in an event header and doesn't need to contain anything. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Tom Zanussi <tzanussi@gmail.com> Cc: Masami Hiramatsu <mhiramat@redhat.com>	2010-05-09 13:43:42 +02:00
Pekka Enberg	e157eb8341	perf report: Document '--call-graph' better for usage This patch improves 'perf report -h' output for the '--call-graph' command line option by enumerating the different output types. Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1273332783-4268-1-git-send-email-penberg@cs.helsinki.fi> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-05-08 18:11:44 +02:00
Ingo Molnar	ed82702155	Merge branch 'perf' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux-2.6 into perf/core	2010-05-08 10:02:57 +02:00
Arnaldo Carvalho de Melo	1cf4a0632c	perf list: Improve the raw hw event descriptor documentation It was x86 specific and imcomplete at that, improve the situation by making it clear where the example provided applies and by adding the URLs for the Intel and AMD manuals where this is discussed in depth. Acked-by: Robert Richter <robert.richter@amd.com> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Tom Zanussi <tzanussi@gmail.com> Cc: Robert Richter <robert.richter@amd.com> Reported-by: Robert Richter <robert.richter@amd.com LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-05-07 14:07:05 -03:00
Peter Zijlstra	ab608344bc	perf, x86: Improve the PEBS ABI Rename perf_event_attr::precise to perf_event_attr::precise_ip and widen it to 2 bits. This new field describes the required precision of the PERF_SAMPLE_IP field: 0 - SAMPLE_IP can have arbitrary skid 1 - SAMPLE_IP must have constant skid 2 - SAMPLE_IP requested to have 0 skid 3 - SAMPLE_IP must have 0 skid And modify the Intel PEBS code accordingly. The PEBS implementation now supports up to precise_ip == 2, where we perform the IP fixup. Also s/PERF_RECORD_MISC_EXACT/&_IP/ to clarify its meaning, this bit should be set for each PERF_SAMPLE_IP field known to match the actual instruction triggering the event. This new scheme allows for a PEBS mode that uses the buffer for more than a single event. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Stephane Eranian <eranian@google.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-05-07 11:31:02 +02:00
Arnaldo Carvalho de Melo	4778e0e8c6	perf tools: Fixup minor doc formatting issues Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-05-05 11:23:27 -03:00
Arnaldo Carvalho de Melo	9e32a3cb06	perf list: Add explanation about raw hardware event descriptors Using explanation given by Ingo Molnar in the oprofile mailing list. Suggested-by: Nick Black <dank@qemfd.net> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Nick Black <dank@qemfd.net> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-05-05 11:20:05 -03:00
Tom Zanussi	db620b1c2f	perf/record: simplify TRACE_INFO tracepoint check Fix a couple of inefficiencies and redundancies related to have_tracepoints() and its use when checking whether to write TRACE_INFO. First, there's no need to use get_tracepoints_path() in have_tracepoints() - we really just want the part that checks whether any attributes correspondo to tracepoints. Second, we really don't care about raw_samples per se - tracepoints are always raw_samples. In any case, the have_tracepoints() check should be sufficient to decide whether or not to write TRACE_INFO. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu>, Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <1273030770.6383.6.camel@tropicana> Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-05-05 11:12:53 -03:00
Arnaldo Carvalho de Melo	9890948d85	perf report: Make dso__calc_col_width agree with hist_entry__dso_snprintf The first was always using the ->long_name, while the later used ->short_name if verbose was not set, resulting in the dso column to be much wider than needed most of the time. Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-05-05 09:49:48 -03:00
Ingo Molnar	c4f3b5a2d7	Merge branch 'perf' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux-2.6 into perf/core	2010-05-04 18:31:47 +02:00
Anton Blanchard	02bf60aad7	perf: Fix performance issue with perf report On a large machine we spend a lot of time in perf_header__find_attr when running perf report. If we are parsing a file without PERF_SAMPLE_ID then for each sample we call perf_header__find_attr and loop through all counter IDs, never finding a match. As the machine gets larger there are more per cpu counters and we spend an awful lot of time in there. The patch below initialises each sample id to -1ULL and checks for this in perf_header__find_attr. We may need to do something more intelligent eventually (eg a hash lookup from counter id to attr) but this at least fixes the most common usage of perf report. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Eric B Munson <ebmunson@us.ibm.com> Acked-by: Eric B Munson <ebmunson@us.ibm.com> LKML-Reference: <20100504111915.GB14636@kryten> Signed-off-by: Anton Blanchard <anton@samba.org> -- Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-05-04 10:54:09 -03:00
Arnaldo Carvalho de Melo	11d232ec28	perf inject: Add missing bits New commands need to have Documentation and be added to command-list.txt so that they can appear when 'perf' is called withouth any subcommand: [root@doppio linux-2.6-tip]# perf usage: perf [--version] [--help] COMMAND [ARGS] The most commonly used perf commands are: annotate Read perf.data (created by perf record) and display annotated code archive Create archive with object files with build-ids found in perf.data file bench General framework for benchmark suites buildid-cache Manage build-id cache. buildid-list List the buildids in a perf.data file diff Read two perf.data files and display the differential profile inject Filter to augment the events stream with additional information kmem Tool to trace/measure kernel memory(slab) properties kvm Tool to trace/measure kvm guest os list List all symbolic event types lock Analyze lock events probe Define new dynamic tracepoints record Run a command and record its profile into perf.data report Read perf.data (created by perf record) and display the profile sched Tool to trace/measure scheduler properties (latencies) stat Run a command and gather performance counter statistics test Runs sanity tests. timechart Tool to visualize total system behavior during a workload top System profiling tool. trace Read perf.data (created by perf record) and display trace output See 'perf help COMMAND' for more information on a specific command. [root@doppio linux-2.6-tip]# The new 'perf inject' command hadn't so it wasn't appearing on that list. Also fix the long option, that should have no spaces in it, rename the faulty one to be '--build-ids', instead of '--inject build-ids'. Reported-by: Ingo Molnar <mingo@elte.hu> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-05-04 10:48:22 -03:00
Tom Zanussi	63e0c7715a	perf: record TRACE_INFO only if using tracepoints and SAMPLE_RAW The current perf code implicitly assumes SAMPLE_RAW means tracepoints are being used, but doesn't check for that. It happily records the TRACE_INFO even if SAMPLE_RAW is used without tracepoints, but when the perf data is read it won't go any further when it finds TRACE_INFO but no tracepoints, and displays misleading errors. This adds a check for both in perf-record, and won't record TRACE_INFO unless both are true. This at least allows perf report -D to dump raw events, and avoids triggering a misleading error condition in perf trace. It doesn't actually enable the non-tracepoint raw events to be displayed in perf trace, since perf trace currently only deals with tracepoint events. Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1272865861.7932.16.camel@tropicana> Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-05-03 10:31:48 -03:00
Ingo Molnar	0806ebd974	Merge branch 'perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into perf/core	2010-05-03 08:29:35 +02:00
Arnaldo Carvalho de Melo	090f7204df	perf inject: Refactor read_buildid function Into two functions, one that actually reads the build_id for the dso if it wasn't already read, and another taht will inject the event if the build_id is available. Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-05-02 19:46:36 -03:00
Arnaldo Carvalho de Melo	2c9faa0600	perf record: Don't exit in live mode when no tracepoints are enabled With this I was able to actually test Tom Zanussi's two previous patches in my usual perf testing ways, i.e. without any tracepoints activated. Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-05-02 13:37:24 -03:00
Tom Zanussi	454c407ec1	perf: add perf-inject builtin Currently, perf 'live mode' writes build-ids at the end of the session, which isn't actually useful for processing live mode events. What would be better would be to have the build-ids sent before any of the samples that reference them, which can be done by processing the event stream and retrieving the build-ids on the first hit. Doing that in perf-record itself, however, is off-limits. This patch introduces perf-inject, which does the same job while leaving perf-record untouched. Normal mode perf still records the build-ids at the end of the session as it should, but for live mode, perf-inject can be injected in between the record and report steps e.g.: perf record -o - ./hackbench 10 \| perf inject -v -b \| perf report -v -i - perf-inject reads a perf-record event stream and repipes it to stdout. At any point the processing code can inject other events into the event stream - in this case build-ids (-b option) are read and injected as needed into the event stream. Build-ids are just the first user of perf-inject - potentially anything that needs userspace processing to augment the trace stream with additional information could make use of this facility. Cc: Ingo Molnar <mingo@elte.hu> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frédéric Weisbecker <fweisbec@gmail.com> LKML-Reference: <1272696080-16435-3-git-send-email-tzanussi@gmail.com> Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-05-02 13:36:56 -03:00
Tom Zanussi	789688faef	perf/live: don't synthesize build ids at the end of a live mode trace It doesn't really make sense to record the build ids at the end of a live mode session - live mode samples need that information during the trace rather than at the end. Leave event__synthesize_build_id() in place, however; we'll still be using that to synthesize build ids in a more timely fashion in a future patch. Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <1272696080-16435-2-git-send-email-tzanussi@gmail.com> Signed-off-by: Tom Zanussi <tzanussi@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-05-02 12:04:05 -03:00
Arnaldo Carvalho de Melo	fb72014d98	perf tools: Don't use code surrounded by __KERNEL__ We need to refactor code to be explicitely shared by the kernel and at least the tools/ userspace programs, so, till we do that, copy the bare minimum bitmap/bitops code needed by tools/perf. Reported-by: "H. Peter Anvin" <hpa@zytor.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-05-02 12:00:44 -03:00
Frederic Weisbecker	d00a47cce5	perf: Fix warning while reading ring buffer headers commit `e9e94e3bd8` "perf trace: Ignore "overwrite" field if present in /events/header_page" makes perf trace launching spurious warnings about unexpected tokens read: Warning: Error: expected type 6 but read 4 This change tries to handle the overcommit field in the header_page file whenever this field is present or not. The problem is that if this field is not present, we try to find it and give up in the middle of the line when we realize we are actually dealing with another field, which is the "data" one. And this failure abandons the file pointer in the middle of the "data" description line: field: u64 timestamp; offset:0; size:8; signed:0; field: local_t commit; offset:8; size:8; signed:1; field: char data; offset:16; size:4080; signed:1; ^^^ Here What happens next is that we want to read this line to parse the data field, but we fail because the pointer is not in the beginning of the line. We could probably fix that by rewinding the pointer. But in fact we don't care much about these headers that only concern the ftrace ring-buffer. We don't use them from perf. Just skip this part of perf.data, but don't remove it from recording to stay compatible with olders perf.data Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Stephane Eranian <eranian@google.com> Cc: Steven Rostedt <rostedt@goodmis.org>	2010-05-01 04:31:48 +02:00
Frederic Weisbecker	e5a5f1f015	perf: Remove leftover useless options to record trace events from scripts -f, -c 1, -R are now useless for trace events recording, moreover -M is useless and event hurts. Remove them from the documentation examples and from record scripts. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Tom Zanussi <tzanussi@gmail.com>	2010-04-30 19:55:00 +02:00
Arnaldo Carvalho de Melo	1c6a800cde	perf test: Initial regression testing command First an example with the first internal test: [acme@doppio linux-2.6-tip]$ perf test 1: vmlinux symtab matches kallsyms: Ok So it run just one test, that is "vmlinux symtab matches kallsyms", and it was successful. If we run it in verbose mode, we'll see details about errors and extra warnings for non-fatal problems: [acme@doppio linux-2.6-tip]$ perf test -v 1: vmlinux symtab matches kallsyms: --- start --- Looking at the vmlinux_path (5 entries long) No build_id in vmlinux, ignoring it No build_id in /boot/vmlinux, ignoring it No build_id in /boot/vmlinux-2.6.34-rc4-tip+, ignoring it Using /lib/modules/2.6.34-rc4-tip+/build/vmlinux for symbols Maps only in vmlinux: ffffffff81cb81b1-ffffffff81e1149b 0 [kernel].init.text ffffffff81e1149c-ffffffff9fffffff 0 [kernel].exit.text ffffffffff600000-ffffffffff6000ff 0 [kernel].vsyscall_0 ffffffffff600100-ffffffffff6003ff 0 [kernel].vsyscall_fn ffffffffff600400-ffffffffff6007ff 0 [kernel].vsyscall_1 ffffffffff600800-ffffffffffffffff 0 [kernel].vsyscall_2 Maps in vmlinux with a different name in kallsyms: ffffffffff600000-ffffffffff6000ff 0 [kernel].vsyscall_0 in kallsyms as [kernel].0 ffffffffff600100-ffffffffff6003ff 0 [kernel].vsyscall_fn in kallsyms as: *ffffffffff600100-ffffffffff60012f 0 [kernel].2 ffffffffff600400-ffffffffff6007ff 0 [kernel].vsyscall_1 in kallsyms as [kernel].6 ffffffffff600800-ffffffffffffffff 0 [kernel].vsyscall_2 in kallsyms as [kernel].8 Maps only in kallsyms: ffffffffff600130-ffffffffff6003ff 0 [kernel].4 ---- end ---- vmlinux symtab matches kallsyms: Ok [acme@doppio linux-2.6-tip]$ In the above case we only know the name of the non contiguous kernel ranges in the address space when reading the symbol information from the ELF symtab in vmlinux. The /proc/kallsyms file lack this, we only notice they are separate because there are modules after the kernel and after that more kernel functions, so we need to have a module rbtree backed by the module .ko path to get symtabs in the vmlinux case. The tool uses it to match by address to emit appropriate warning, but don't considers this fatal. The .init.text and .exit.text ines, of course, aren't in kallsyms, so I left these cases just as extra info in verbose mode. The end of the sections also aren't in kallsyms, so we the symbols layer does another pass and sets the end addresses as the next map start minus one, which sometimes pads, causing harmless mismatches. But at least the symbols match, tested it by copying /proc/kallsyms to /tmp/kallsyms and doing changes to see if they were detected. This first test also should serve as a first stab at documenting the symbol library by providing a self contained example that exercises it together with comments about what is being done. More tests to check if actions done on a monitored app, like doing mmaps, etc, makes the kernel generate the expected events should be added next. Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-04-29 18:59:23 -03:00
Arnaldo Carvalho de Melo	5c0541d53e	perf symbols: Add machine helper routines Created when writing the first 'perf test' regression testing routine. Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-04-29 15:25:23 -03:00
Arnaldo Carvalho de Melo	18acde52b8	perf tools: Create $(OUTPUT)arch/$(ARCH)/util/ directory So that "make -C tools/perf O=/tmp/some/path" works again. Problem introduced in: `cd932c5` "perf: Move arch specific code into separate arch director" Cc: Ian Munsie <imunsie@au.ibm.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-04-27 22:29:45 -03:00
Arnaldo Carvalho de Melo	cbf6968098	perf machines: Make the machines class adopt the dsos__fprintf methods Now those methods don't operate on a global list of dsos, but on lists of machines, so make this clear by renaming the functions. Cc: Avi Kivity <avi@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Zhang, Yanmin <yanmin_zhang@linux.intel.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-04-27 21:22:44 -03:00
Arnaldo Carvalho de Melo	d28c62232e	perf machine: Adopt some map_groups functions Those functions operated on members now grouped in 'struct machine', so move those methods to this new class. The changes made to 'perf probe' shows that using this abstraction inserting probes on guests almost got supported for free. Cc: Avi Kivity <avi@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Masami Hiramatsu <mhiramat@redhat.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Zhang, Yanmin <yanmin_zhang@linux.intel.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-04-27 21:21:18 -03:00
Arnaldo Carvalho de Melo	48ea8f5470	perf machine: Pass buffer size to machine__mmap_name Don't blindly assume that the size of the buffer is enough, use snprintf. Cc: Avi Kivity <avi@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Zhang, Yanmin <yanmin_zhang@linux.intel.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-04-27 21:19:05 -03:00
Arnaldo Carvalho de Melo	23346f21b2	perf tools: Rename "kernel_info" to "machine" struct kernel_info and kerninfo__ are too vague, what they really describe are machines, virtual ones or hosts. There are more changes to introduce helpers to shorten function calls and to make more clear what is really being done, but I left that for subsequent patches. Cc: Avi Kivity <avi@redhat.com> Cc: Frédéric Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Zhang, Yanmin <yanmin_zhang@linux.intel.com> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-04-27 21:17:50 -03:00
Ingo Molnar	462b04e28a	Merge branch 'perf' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux-2.6 into perf/core	2010-04-27 11:16:54 +02:00
Stefan Hajnoczi	f93830fbb0	perf tools: Fix libdw-dev package name in error message The headers required for DWARF support are provided by the libdw-dev package in Debian-based distros. This patch corrects the elfutils-dev package name to libdw-dev in the Makefile error message when libdw.h is not found. Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Masami Hiramatsu <mhiramat@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Tom Zanussi <tzanussi@gmail.com> LKML-Reference: <1272292023-9869-1-git-send-email-stefanha@linux.vnet.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-04-26 15:39:54 -03:00
Masami Hiramatsu	ef4a356574	perf probe: Add --max-probes option Add --max-probes option to change the maximum limit of findable probe points per event, since inlined function can be expanded into thousands of probe points. Default value is 128. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> LKML-Reference: <20100421195640.24664.62984.stgit@localhost6.localdomain6> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-04-26 15:35:20 -03:00
Masami Hiramatsu	5d1ee0413c	perf probe: Fix to exit callback soon after finding too many probe points Fix to exit callback soon after finding too many probe points. Don't try to continue searching because it already failed. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> LKML-Reference: <20100421195632.24664.42598.stgit@localhost6.localdomain6> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-04-26 15:33:08 -03:00
Masami Hiramatsu	15eca306ec	perf probe: Fix to use symtab only if no debuginfo Fix perf probe to use symtab only if there is no debuginfo, because debuginfo has more information than symtab. If we can't find a function in debuginfo, we never find it in symtab. Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> LKML-Reference: <20100421195624.24664.46214.stgit@localhost6.localdomain6> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-04-26 15:32:37 -03:00
Masami Hiramatsu	0ab061cd52	perf tools: Initialize dso->node member in dso__new If dso->node member is not initialized, it causes a segmentation fault when adding to other lists. It should be initilized in dso__new(). Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> LKML-Reference: : <20100421195616.24664.89980.stgit@localhost6.localdomain6> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-04-26 15:31:32 -03:00
William Cohen	cfadf9d4ac	perf: Some perf-kvm documentation edits asciidoc does not allow the "===" to be longer than the line above it. Also fix a couple types and formatting errors. Signed-off-by: William Cohen <wcohen@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@elte.hu> LKML-Reference: <4BD204C5.9000504@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>	2010-04-24 03:50:49 +02:00
Frederic Weisbecker	e1889d75af	perf: Add a perf trace option to check samples ordering reliability To ensure sample events time reordering is reliable, add a -d option to perf trace to check that automatically. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> Cc: Ingo Molnar <mingo@elte.hu> Cc: Masami Hiramatsu <mhiramat@redhat.com> Cc: Tom Zanussi <tzanussi@gmail.com>	2010-04-24 03:50:48 +02:00
Frederic Weisbecker	9df9bbba9f	perf: Use generic sample reordering in perf timechart Use the new generic sample events reordering from perf timechart, this drops the ad hoc sample reordering it was using before. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> Cc: Ingo Molnar <mingo@elte.hu> Cc: Masami Hiramatsu <mhiramat@redhat.com> Cc: Tom Zanussi <tzanussi@gmail.com> Cc: Arjan van de Ven <arjan@linux.intel.com>	2010-04-24 03:50:46 +02:00
Frederic Weisbecker	e0a808c65c	perf: Use generic sample reordering in perf trace Use the new generic sample events reordering from perf trace. Before that, the displayed traces were ordered as they were in the input as recorded by perf record (not time ordered). This makes eventually perf trace displaying the events as beeing time ordered. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> Cc: Ingo Molnar <mingo@elte.hu> Cc: Masami Hiramatsu <mhiramat@redhat.com> Cc: Tom Zanussi <tzanussi@gmail.com>	2010-04-24 03:50:45 +02:00
Frederic Weisbecker	587570d4cc	perf: Use generic sample reordering in perf kmem Use the new generic sample events reordering from perf kmem, this drops the need of multiplexing the buffers on record time, improving the scalability of perf kmem. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> Cc: Ingo Molnar <mingo@elte.hu> Cc: Masami Hiramatsu <mhiramat@redhat.com> Cc: Tom Zanussi <tzanussi@gmail.com> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Li Zefan <lizf@cn.fujitsu.com>	2010-04-24 03:50:44 +02:00
Frederic Weisbecker	a64eae703b	perf: Use generic sample reordering in perf sched Use the new generic sample events reordering from perf sched, this drops the need of multiplexing the buffers on record time, improving the scalability of perf sched. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> Cc: Ingo Molnar <mingo@elte.hu> Cc: Masami Hiramatsu <mhiramat@redhat.com> Cc: Tom Zanussi <tzanussi@gmail.com>	2010-04-24 03:50:42 +02:00
Frederic Weisbecker	c61e52ee70	perf: Generalize perf lock's sample event reordering to the session layer The sample events recorded by perf record are not time ordered because we have one buffer per cpu for each event (even demultiplexed per task/per cpu for task bound events). But when we read trace events we want them to be ordered by time because many state machines are involved. There are currently two ways perf tools deal with that: - use -M to multiplex every buffers (perf sched, perf kmem) But this creates a lot of contention in SMP machines on record time. - use a post-processing time reordering (perf timechart, perf lock) The reordering used by timechart is simple but doesn't scale well with huge flow of events, in terms of performance and memory use (unusable with perf lock for example). Perf lock has its own samples reordering that flushes its memory use in a regular basis and that uses a sorting based on the previous event queued (a new event to be queued is close to the previous one most of the time). This patch proposes to export perf lock's samples reordering facility to the session layer that reads the events. So if a tool wants to get ordered sample events, it needs to set its struct perf_event_ops::ordered_samples to true and that's it. This prepares tracing based perf tools to get rid of the need to use buffers multiplexing (-M) or to implement their own reordering. Also lower the flush period to 2 as it's sufficient already. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> Cc: Ingo Molnar <mingo@elte.hu> Cc: Masami Hiramatsu <mhiramat@redhat.com> Cc: Tom Zanussi <tzanussi@gmail.com>	2010-04-24 03:49:58 +02:00
Stephane Eranian	5710fcad7c	perf: Fix initialization bug in parse_single_tracepoint_event() The parse_single_tracepoint_event() was setting some attributes before it validated the event was indeed a tracepoint event. This caused problems with other initialization routines like in the builtin-top.c module whereby sample_period is not set if not 0. Signed-off-by: Stephane Eranian <eranian@google.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@elte.hu> LKML-Reference: <4bcf232b.698fd80a.6fbe.ffffb737@mx.google.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>	2010-04-24 03:24:09 +02:00
Hitoshi Mitake	e4cef1f650	perf lock: Fix state machine to recognize lock sequence Previous state machine of perf lock was really broken. This patch improves it a little. This patch prepares the list of state machine that represents lock sequences for each threads. These state machines can be one of these sequences: 1) acquire -> acquired -> release 2) acquire -> contended -> acquired -> release 3) acquire (w/ try) -> release 4) acquire (w/ read) -> release The case of 4) is a little special. Double acquire of read lock is allowed, so the state machine counts read lock number, and permits double acquire and release. But, things are not so simple. Something in my model is still wrong. I counted the number of lock instances with bad sequence, and ratio is like this (case of tracing whoami): bad:233, total:2279 version 2: * threads are now identified with tid, not pid * prepared SEQ_STATE_READ_ACQUIRED for read lock. * bunch of struct lock_seq_stat is now linked list * debug information enhanced (this have to be removed someday) e.g. \| === output for debug=== \| \| bad:233, total:2279 \| bad rate:0.000000 \| histogram of events caused bad sequence \| acquire: 165 \| acquired: 0 \| contended: 0 \| release: 68 Signed-off-by: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Jason Baron <jbaron@redhat.com> Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com> Cc: Ingo Molnar <mingo@elte.hu> LKML-Reference: <1271852634-9351-1-git-send-email-mitake@dcl.info.waseda.ac.jp> [rename SEQ_STATE_UNINITED to SEQ_STATE_UNINITIALIZED] Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>	2010-04-24 03:23:14 +02:00
Ian Munsie	fead7960f0	perf probe: Add PowerPC DWARF register number mappings This adds mappings from the register numbers from DWARF to the register names used in the PowerPC Regs and Stack Access API. This allows perf probe to be used to record variable contents on PowerPC. This requires the functionality represented by the config symbol HAVE_REGS_AND_STACK_ACCESS_API in order to function, although it will compile without it. That functionality is added for PowerPC in commit `359e4284` ("powerpc: Add kprobe-based event tracer"). Signed-off-by: Ian Munsie <imunsie@au.ibm.com> Acked-by: Masami Hiramatsu <mhiramat@redhat.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2010-04-22 13:48:31 +10:00
Ian Munsie	cd932c5939	perf: Move arch specific code into separate arch directory The perf userspace tool included some architecture specific code to map registers from the DWARF register number into the names used by the regs and stack access API. This moves the architecture specific code out into a separate arch/x86 directory along with the infrastructure required to use it. Signed-off-by: Ian Munsie <imunsie@au.ibm.com> Acked-by: Masami Hiramatsu <mhiramat@redhat.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2010-04-22 13:48:31 +10:00
Frederic Weisbecker	6eca8cc35b	perf: Fix perf probe build error When we run into dry run mode, we want to make write_kprobe_trace_event to succeed on writing the event. Let's initialize it to 0. Fixes the following build error: util/probe-event.c:1266: attention : «ret» may be used uninitialized in this function util/probe-event.c:1266: note: «ret» was declared here Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> LKML-Reference: <1271808065-25290-1-git-send-regression-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-04-21 09:39:52 +02:00
Zhang, Yanmin	a1645ce12a	perf: 'perf kvm' tool for monitoring guest performance from host Here is the patch of userspace perf tool. Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2010-04-19 12:37:24 +03:00
Ingo Molnar	b5a80b7e91	Merge branch 'perf' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux-2.6 into perf/core	2010-04-15 09:16:51 +02:00
Ingo Molnar	84b13fd596	Merge branch 'perf/live' into perf/core Conflicts: tools/perf/builtin-record.c Merge reason: add the live tracing feature, resolve conflict. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-04-15 09:13:26 +02:00
Frederic Weisbecker	f921281930	perf: Make the trace events sample period default to 1 Trace events are mostly used for tracing and then require not to be lost when possible. As opposite to hardware events that really require to trigger after a given sample period, trace events mostly need to trigger everytime. It is a frustrating experience to trace with perf and realize we lost a lot of events because we forgot the "-c 1" option. Then default sample_period to 1 for trace events but let the user override it. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de>	2010-04-15 04:12:52 +02:00
Frederic Weisbecker	bdef3b02ce	perf: Always record tracepoints raw samples from perf record Trace events are mostly used for tracing rather than simple counting. Don't bother anymore with adding -R when using them, just record raw samples of trace events every time. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de>	2010-04-15 04:12:52 +02:00
Frederic Weisbecker	7865e817e9	perf: Make -f the default for perf record Force the overwriting mode by default if append mode is not explicit. Adding -f every time one uses perf on a daily basis quickly becomes a burden. Keep the -f among the options though to avoid breaking some random users scripts. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de>	2010-04-15 04:12:51 +02:00
Thomas Gleixner	a1e2f60e3e	perf: Fix dynamic field detection Checking if a tracing field is an array with a dynamic length requires to check the field type and seek the "__data_loc" string that prepends the actual type, as can be found in a trace event format file: field:__data_loc char[] name; offset:16; size:4; signed:1; But we actually use strcmp() to check if the field type fully matches "__data_loc", which may fail as we trip over the rest of the type. To fix this, use strncmp to only check if it starts with "__data_loc". Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Tom Zanussi <tzanussi@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> LKML-Reference: <1271282283-23721-1-git-send-regression-fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-04-15 01:34:46 +02:00
Masami Hiramatsu	f6c903f585	perf probe: Show function entry line as probe-able Function entry line should be shown as probe-able line, because each function has declared line attribute. LKML-Reference: <20100414224007.14630.96915.stgit@localhost6.localdomain6> Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-04-14 17:45:39 -03:00
Masami Hiramatsu	de1439d8a5	perf probe: Support DW_OP_plus_uconst in DW_AT_data_member_location DW_OP_plus_uconst can be used for DW_AT_data_member_location. This patch adds DW_OP_plus_uconst support when getting structure member offset. Commiter note: Fixed up the size_t format specifier in one case: cc1: warnings being treated as errors util/probe-finder.c: In function ‘die_get_data_member_location’: util/probe-finder.c:270: error: format ‘%d’ expects type ‘int’, but argument 4 has type ‘size_t’ make: *** [/home/acme/git/build/perf/util/probe-finder.o] Error 1 LKML-Reference: <20100414223958.14630.5230.stgit@localhost6.localdomain6> Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-04-14 17:44:00 -03:00
Masami Hiramatsu	dda4ab34fe	perf probe: Fix line range to show end line Line range should reject the range if the number of lines is 0 (e.g. "sched.c:1024+0"), and it should show the lines include the end of line number (e.g. "sched.c:1024-2048" should show 2048th line). LKML-Reference: <20100414223950.14630.42263.stgit@localhost6.localdomain6> Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-04-14 17:41:30 -03:00
Masami Hiramatsu	d3b63d7ae0	perf probe: Fix a bug that --line range can be overflow Since line_finder.lno_s/e are signed int but line_range.start/end are unsigned int, it is possible to be overflow when converting line_range->start/end to line_finder->lno_s/e. This changes line_range.start/end and line_list.line to signed int and adds overflow checks when setting line_finder.lno_s/e. LKML-Reference: <20100414223942.14630.72730.stgit@localhost6.localdomain6> Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2010-04-14 17:41:21 -03:00

1 2 3 4 5 ...

1100 Commits