Commit Graph

1926 Commits

Author SHA1 Message Date
Masami Hiramatsu
baad2d3e69 perf probe: Rename DIE_FIND_CB_FOUND to DIE_FIND_CB_END
Since die_find/walk* callbacks use DIE_FIND_CB_FOUND for
both of failed and found cases, it should be "END"
instead "FOUND" for avoiding confusion.

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Reported-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Link: http://lkml.kernel.org/r/20110627072709.6528.45706.stgit@fedora15
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-07-15 15:55:57 -04:00
Sonny Rao
259032bfe3 perf: Robustify proc and debugfs file recording
While attempting to create a timechart of boot up I found perf didn't
tolerate modules being loaded/unloaded.  This patch fixes this by
reading the file once and then writing the size read at the correct
point in the file.  It also simplifies the code somewhat.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: Sonny Rao <sonnyrao@chromium.org>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Link: http://lkml.kernel.org/r/10011.1310614483@neuling.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-07-14 15:53:01 -04:00
Anton Blanchard
5d67be97f8 perf report/annotate/script: Add option to specify a CPU range
Add an option to perf report/annotate/script to specify which
CPUs to operate on. This enables us to take a single system wide
profile and analyse each CPU (or group of CPUs) in isolation.

This was useful when profiling a multiprocess workload where the
bottleneck was on one CPU but this was hidden in the overall
profile. Per process and per thread breakdowns didn't help
because multiple processes were running on each CPU and no
single process consumed an entire CPU.

The patch converts the list of CPUs returned by cpu_map__new
into a bitmap for fast lookup. I wanted to use -C to be
consistent with perf top/record/stat, but unfortunately perf
report already uses -C <comms>.

 v2: Incorporate suggestions from David Ahern:
	- Added -c to perf script
	- Check that SAMPLE_CPU is set when -c is used
	- Update documentation

 v3: Create perf_session__cpu_bitmap()

Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Link: http://lkml.kernel.org/r/20110704215750.11647eb9@kryten
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-05 10:44:44 +02:00
Christoph Lameter
9da4714a2d slub: slabinfo update for cmpxchg handling
Update the statistics handling and the slabinfo tool to include the new
statistics in the reports it generates.

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2011-07-02 13:26:57 +03:00
Zhengyu He
3ae9a34d74 perf stat: Add noise output for csv mode
Previously, when you want perf-stat to output the statistics in
csv mode, no information of the noise will be printed out.

For example right now we output this --repeat information:

 ./perf stat -r3 -x, sleep 1
 1.164789,task-clock
 8,context-switches
 0,CPU-migrations
 219,page-faults
 3337800,cycles

With this patch, the output will be appended with an additional
entry for the noise value:

 ./perf stat -r3 -x, sleep 1
 1.164789,task-clock,3.75%
 8,context-switches,75.00%
 0,CPU-migrations,100.00%
 219,page-faults,0.00%
 3337800,cycles,3.36%

Signed-off-by: Zhengyu He <zhengyuh@google.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Stephane Eranian <eranian@google.com>
Cc: Venkatesh Pallipadi <venki@google.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1308861942-4945-1-git-send-email-zhengyuh@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-01 12:52:40 +02:00
Ingo Molnar
343a031f3c Merge branch 'perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into perf/core 2011-07-01 11:51:58 +02:00
Ingo Molnar
10e6962765 Merge commit 'v3.0-rc5' into perf/core
Merge reason: Pick up the latest fixes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-01 10:28:46 +02:00
Frederic Weisbecker
cb1955b86c perf tools: Only display parent field if explictly sorted
We don't need to display the parent field if the parent
sorting machinery is only used for parent filtering
(as in "-p foo").

However if parent filtering is used in combination with
explicit parent sorting ( -s parent), we want to
display it.

Result with:

  perf report -p kernel_thread -s parent

Before:

 # Overhead  Parent symbol
 # ........  .............
 #
     0.07%
            |
            --- ioread8
                ata_sff_check_status
                ata_sff_tf_load
                ata_sff_qc_issue
                ata_bmdma_qc_issue
                ata_qc_issue
                ata_scsi_translate
                ata_scsi_queuecmd
                scsi_dispatch_cmd
                scsi_request_fn
                __blk_run_queue
                __make_request
                generic_make_request
                submit_bio
                submit_bh
                journal_submit_commit_record
                jbd2_journal_commit_transaction
                kjournald2
                kthread
                kernel_thread_helpe

After:

 # Overhead  Parent symbol
 # ........  .............
 #
     0.07%  kernel_thread_helper
            |
            --- ioread8
                ata_sff_check_status
                ata_sff_tf_load
                ata_sff_qc_issue
                ata_bmdma_qc_issue
                ata_qc_issue
                ata_scsi_translate
                ata_scsi_queuecmd
                scsi_dispatch_cmd
                scsi_request_fn
                __blk_run_queue
                __make_request
                generic_make_request
                submit_bio
                submit_bh
                journal_submit_commit_record
                jbd2_journal_commit_transaction
                kjournald2
                kthread
                kernel_thread_helper

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Sam Liao <phyomh@gmail.com>
2011-06-30 00:26:49 +02:00
Frederic Weisbecker
fd8ea21276 perf tools: Allow sort dimensions to be registered more than once
So that the parent sort dimension can be registered twice: once
if we add it as an explicit sort dimension (-s parent) and twice
if we request a parent filter (-p foo).

We'll have only one parent sort dimension in the end but this
allows to override the default parent filter with we gave in "-p"
option. The goal of this is to prepare to allow the use of
"-s parent" and "-p foo" at the same time, ie: sort by filtered
parent.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Sam Liao <phyomh@gmail.com>
2011-06-30 00:26:41 +02:00
Frederic Weisbecker
e84d21227c perf tools: Don't display ignored entries on stdio ui
As for newt ui, don't display entries that have been marked
as ignored.

The practical current effect of this is to make parent
filtering really working. Before, entries that were ignored
were given a null parent but were still displayed. This
resulted in some weird effects:

 # Overhead      Command      Shared Object        Symbol
 # ........  ...........  .................  ............
 #
^A
                   |
                   --- __lock_acquire
                      |
                      |--95.97%-- lock_acquire
                      |          |
                      |          |--30.75%-- _raw_spin_lock

Discard these from the stdio display.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Sam Liao <phyomh@gmail.com>
2011-06-30 00:26:33 +02:00
Frederic Weisbecker
2fd701bc78 perf tools: Remove sort print helpers declarations
These are probably some old leftovers.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Sam Liao <phyomh@gmail.com>
2011-06-30 00:26:19 +02:00
Frederic Weisbecker
872a878fb1 perf tools: Make sort operations static
These don't need to be globally visible.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Sam Liao <phyomh@gmail.com>
2011-06-30 00:25:12 +02:00
Sam Liao
d797fdc5c5 perf tools: Add inverted call graph report support.
Add "caller/callee" option to support inverted butterfly report,
in the inverted report (with caller option), the call graph start
from the callee's ancestor. Users can use such view to catch system's
performance bottleneck from a sysprof like view. Using this option
with specified sort order like pid gives us high level view of call
graph statistics.

Also add "-G" alias for inverted call graph.

Signed-off-by: Sam Liao <phyomh@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2011-06-30 00:24:30 +02:00
Linus Torvalds
8816ead9d8 Merge branches 'perf-urgent-for-linus', 'sched-urgent-for-linus', 'timers-urgent-for-linus' and 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  tools/perf: Fix static build of perf tool
  tracing: Fix regression in printk_formats file

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  generic-ipi: Fix kexec boot crash by initializing call_single_queue before enabling interrupts

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  clocksource: Make watchdog robust vs. interruption
  timerfd: Fix wakeup of processes when timer is cancelled on clock change

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, MAINTAINERS: Add x86 MCE people
  x86, efi: Do not reserve boot services regions within reserved areas
2011-06-19 09:00:18 -07:00
Linus Torvalds
357ed6b1a1 Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  rcu: Move RCU_BOOST #ifdefs to header file
  rcu: use softirq instead of kthreads except when RCU_BOOST=y
  rcu: Use softirq to address performance regression
  rcu: Simplify curing of load woes
2011-06-19 08:56:56 -07:00
Linus Torvalds
7cc2ed0589 Merge branch 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6
* 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6:
  kbuild: Call depmod.sh via shell
  perf: clear out make flags when calling kernel make kernelver
2011-06-16 10:26:58 -07:00
Ingo Molnar
b4f9f2b64a Merge commit 'v3.0-rc3' into perf/core
Merge reason: add the latest fixes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-06-16 13:23:22 +02:00
Mathias Krause
203db2952b tools/perf: Fix static build of perf tool
To build a statically linked version of the perf tool all needed
libraries must be added in the correct order to get the symbols
resolved. Currently this is broken when, e.g. python or newt
support is enabled -- libpython needs libpthread which is an
unconditional link dependency of the perf tool; libslang needs
libm, another unconditional dependency. To solve the problem in
the long run without the need to keep track of transitive
library dependencies, simply make the linker look at the EXTLIBS
multiple times until it has all symbols resolved.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Link: http://lkml.kernel.org/r/1308171818-20370-1-git-send-email-minipli@googlemail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-06-16 10:17:39 +02:00
Andy Whitcroft
37aa9a2eb4 perf: clear out make flags when calling kernel make kernelver
When generating the perf version from the kernel version using 'make
kernelver' it is necessary to clear out any MAKEFLAGS otherwise they may
trigger additional output which pollute the contents.

Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Michal Marek <mmarek@suse.cz>
2011-06-15 22:12:55 +02:00
Steven Rostedt
0df213ca31 ktest: Require one TEST_START in config file
There has been too many times that I put in one too many SKIP
TEST_STARTs and start the test with the default randconfig by accident
that I added this to have ktest ask the user for which test they want to
run if no TEST_START is specified.

Now if I accidently start the test with all TEST_STARTs skipped, ktest
asks what test do I want to run, and I now have a chance to kill it
before it does a make mrproper on my build directory.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-06-14 20:51:37 -04:00
Steven Rostedt
ddf607e5f8 ktest: Add helper function to avoid duplicate code
Several places had the following code:

    get_grub_index;
    get_version;
    install;

    start_monitor;
    return monitor;

Creating a function "start_monitor_and_boot()" replaces these mulitple
uses with a single call.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-06-14 20:49:13 -04:00
Steven Rostedt
1990207d53 ktest: Add IGNORE_WARNINGS to ignore warnings in some patches
Doing a patchcheck test, there may be warnings that gcc produces which
may be OK, and the test should not fail on that commit. By adding a
IGNORE_WARNINGS option to list a space delimited SHA1s that are ignored
lets the user avoid having the test fail on certain commits.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-06-14 20:46:25 -04:00
Steven Rostedt
e7b1344189 ktest: Fix tar extracting of modules to target
The tar command to create the module directory is cjf, but the
extraction only had xf. This works on most versions of tar, but some
versions of tar require xjf for extraction as well.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-06-14 20:44:36 -04:00
Steven Rostedt
4892063043 ktest: Have the testing tmp dir include machine name
As multiple tests may be executed by the same server, have the test
machine name add uniqueness to the value of the temp directory.
Otherwise the temp directories may overwrite each other's tests.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-06-14 20:43:59 -04:00
Steven Rostedt
0bd6c1a38f ktest: Add POST/PRE_BUILD options
There are some cases that a patch may be needed to apply to the kernel
in patchcheck or bisect tests. Adding a PRE_BUILD option to apply the
patch and POST_BUILD to remove it, allows for this to be done easily.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-06-14 20:39:31 -04:00
Shaohua Li
09223371de rcu: Use softirq to address performance regression
Commit a26ac2455ffcf3(rcu: move TREE_RCU from softirq to kthread)
introduced performance regression. In an AIM7 test, this commit degraded
performance by about 40%.

The commit runs rcu callbacks in a kthread instead of softirq. We observed
high rate of context switch which is caused by this. Out test system has
64 CPUs and HZ is 1000, so we saw more than 64k context switch per second
which is caused by RCU's per-CPU kthread.  A trace showed that most of
the time the RCU per-CPU kthread doesn't actually handle any callbacks,
but instead just does a very small amount of work handling grace periods.
This means that RCU's per-CPU kthreads are making the scheduler do quite
a bit of work in order to allow a very small amount of RCU-related
processing to be done.

Alex Shi's analysis determined that this slowdown is due to lock
contention within the scheduler.  Unfortunately, as Peter Zijlstra points
out, the scheduler's real-time semantics require global action, which
means that this contention is inherent in real-time scheduling.  (Yes,
perhaps someone will come up with a workaround -- otherwise, -rt is not
going to do well on large SMP systems -- but this patch will work around
this issue in the meantime.  And "the meantime" might well be forever.)

This patch therefore re-introduces softirq processing to RCU, but only
for core RCU work.  RCU callbacks are still executed in kthread context,
so that only a small amount of RCU work runs in softirq context in the
common case.  This should minimize ksoftirqd execution, allowing us to
skip boosting of ksoftirqd for CONFIG_RCU_BOOST=y kernels.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Tested-by: "Alex,Shi" <alex.shi@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2011-06-14 15:25:39 -07:00
Steven Rostedt
db05cfefce ktest: Allow initrd processing without modules defined
When a config is set with CONFIG_MODULES=n, it does not mean that the
kernel does not need an initrd to boot. For systems that depend on LVM
and such, an initrd must run first.

If POST_INSTALL is defined, then run the post install regardless if
modules are needed or not.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-06-13 11:09:22 -04:00
Steven Rostedt
23715c3c9a ktest: Have LOG_FILE evaluate options as well
The LOG_FILE variable needs to evaluate the $ options as well.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-06-13 11:03:34 -04:00
Steven Rostedt
ecaf8e5213 ktest: Have wait on stdio honor bug timeout
After a bug is found, the STOP_AFTER_FAILURE timeout is used to
determine how much output should be printed before breaking out
of the monitor loop. This is to get things like call traces and
enough infromation about the bug to help determine what caused it.

The STOP_AFTER_FAILURE is usually much shorter than the TIMEOUT
that is used to determine when to quit after no more stdio is given.

But since the stdio read uses a wait on I/O, the STOP_AFTER_FAILURE is
only checked after we get something from I/O. But if the I/O does
not return any more data, we wait the TIMEOUT period instead, even
though we already triggered a bug report.

The wait on I/O should honor the STOP_AFTER_FAILURE time if a bug has
been found.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-06-13 10:48:10 -04:00
Steven Rostedt
fcb3f16a4f ktest: Implement our own force min config
Using the build KCONFIG_ALLCONFIG environment variable to force
the min config may not always work properly. Since ktest is
written in perl, it is trivial to read and replace the current
config with the configs specified by the min config.

Now the min config (and add configs) are read by perl and before
a make is done, these configs in the .config file are replaced
by the version in the min config.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-06-13 10:40:58 -04:00
Steven Rostedt
9064af5206 ktest: Add TEST_NAME option
Searching through several tests, it gets confusing which test result
is for which test. By adding the TEST_NAME option, the user can tell
which test result belongs to which test.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-06-13 10:38:48 -04:00
Steven Rostedt
30f75da5ff ktest: Add CONFIG_BISECT_GOOD option
Currently the config_bisect compares the min config with the
CONFIG_BISECT config. There may be another config that we know
is good that we want to ignore configs on. By passing in this
config it will ignore the options that are set in the good config.

Note: This only ignores the config, it does not (yet) handle
options that are different between the two configs. If the good
config has "SLAB" set and the bad config has "SLUB" it will not
find the bug if the bug had to do with changing these two options.

This is something that I intend to implement in the future.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-06-13 10:35:35 -04:00
Steven Rostedt
f1a5b96219 ktest: Add detection of triple faults
When a triple fault happens in a test, no call trace nor panic
is displayed. Instead, the system reboots to the good kernel.
Since the good kernel may display a boot prompt that matches the
success string, ktest may think that the test succeeded, when it
did not.

Detecting triple faults is tricky because it is hard to generalize
what a reboot looks like. The best that we can come up with for now
is to examine the Linux banner. If we detect that the Linux banner
matches the test we want to test, then look to see if we hit another
Linux banner with a different kernel is booted. This can be assumed
to be a triple fault.

We can't just check for two Linux banners because things like
early printk may cause the Linux banner to be displayed twice. Checking
for different kernel versions should be the safe bet.

If this for some reason detects a false triple boot. A new ktest
config option is also created:

 DETECT_TRIPLE_FAULT

This can be set to 0 to disable this checking.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-06-13 10:30:00 -04:00
Steven Rostedt
cd4f1d536c ktest: Notify reason to break out of monitoring boot
Different timeouts can cause the ktest monitor to break out of the
loop. It becomes annoying that one does not know the reason why
it exited the monitor loop. Display the cause of the reason why
the loop was exited.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-06-13 10:26:27 -04:00
Linus Torvalds
6aecceccf5 Merge branch 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6
* 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6:
  perf: Use make kernelversion instead of parsing the Makefile
  kbuild: Hack for depmod not handling X.Y versions
  kbuild: Move depmod call to a separate script
  kbuild: Fix <linux/version.h> for empty SUBLEVEL or PATCHLEVEL
  kbuild: Fix KERNELVERSION for empty SUBLEVEL or PATCHLEVEL
  kbuild: silence Nothing to be done for 'all' message
2011-06-09 16:27:42 -07:00
Michal Marek
5d61b9fd19 perf: Use make kernelversion instead of parsing the Makefile
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: Michal Marek <mmarek@suse.cz>
2011-06-09 23:05:54 +02:00
Linus Torvalds
33726bf214 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf: Fix comments in include/linux/perf_event.h
  perf: Comment /proc/sys/kernel/perf_event_paranoid to be part of user ABI
  perf python: Fix argument name list of read_on_cpu()
  perf evlist: Don't die if sample_{id_all|type} is invalid
  perf python: Use exception to propagate errors
  perf evlist: Remove dependency on debug routines
  perf, cgroups: Fix up for new API
2011-06-08 08:36:15 -07:00
Ingo Molnar
3ce2a0bc9d Merge branch 'perf/urgent' into perf/core
Conflicts:
	tools/perf/util/python.c

Merge reason: resolve the conflict with perf/urgent.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-06-04 12:28:05 +02:00
Linus Torvalds
9a44fde343 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-ktest
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-ktest:
  ktest: Ignore unset values of the minconfig in config_bisect
  ktest: Fix result of rebooting the kernel
  ktest: Fix off-by-one in config bisect result
2011-06-04 07:58:48 +09:00
Frederic Weisbecker
b273fa9716 perf python: Fix argument name list of read_on_cpu()
Mandatory arguments need to be present in the argument name list, as
well as optional arguments, otherwise python barfs:

	# ./python/twatch.py
	Traceback (most recent call last):
	  File "./python/twatch.py", line 41, in <module>
	    main()
	  File "./python/twatch.py", line 32, in main
	    event = evlist.read_on_cpu(cpu)
	RuntimeError: more argument specifiers than keyword list entries

Hence, add cpu to the name list.

Cc: David Ahern <daahern@cisco.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: http://lkml.kernel.org/r/1301588863-20210-1-git-send-email-fweisbec@gmail.com
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-06-03 10:09:22 -03:00
Arnaldo Carvalho de Melo
56722381b8 perf evlist: Don't die if sample_{id_all|type} is invalid
Fixes two more cases where the python binding would not load:

. Not finding die(), which it shouldn't anyway, not good to just stop the
  world because some particular perf.data file is invalid, just propagate
  the error to the caller.

. Not finding perf_sample_size: fix it by moving it from event.c to evsel,
  where it belongs, as most cases are moving to operate on an evsel object.o

One of the fixed problems:

[root@emilia ~]# python
>>> import perf
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: perf_sample_size
>>>
[root@emilia ~]#

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-1hkj7b2cvgbfnoizsekjb6c9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-06-03 10:07:52 -03:00
Arnaldo Carvalho de Melo
9c850d6c4b perf python: Use exception to propagate errors
We were using pr_debug to tell the user about not being able to parse a sample
where we should really use the python way of reporting errors: exceptions.

Fixes this problem:

[root@emilia ~]# python
>>> import perf
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: eprintf
>>>
[root@emilia ~]

As we want to keep the objects linked in the python binding (and in the future
in a shared library) minimal.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-m9dba9kaluas0kq8r58z191c@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-06-03 10:07:01 -03:00
Arnaldo Carvalho de Melo
d21cc9f67d perf evlist: Remove dependency on debug routines
So far we avoided having to link debug.o in the python binding, keep it
that way by not using ui__warning() in evlist.c.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-4wtew8hd3g7ejnlehtspys2t@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-06-03 10:05:23 -03:00
David Ahern
7cec092238 perf script: Add printing of sample address
Resolve to a function or variable if possible and if the sym option is
enabled.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1306782503-22002-1-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-06-02 13:31:01 -03:00
David Ahern
610723f24e perf script: Make printing of dso a separate field option
The 'sym' option displays both the function name and the DSO it comes
from. Split the display of the dso into a separate option.  This allows
display of the ip address and symbol without the dso, thus shortening
line lengths - and decluttering the output a bit.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1306528124-25861-3-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-06-02 13:29:14 -03:00
David Ahern
787bef174f perf script: "sym" field really means show IP data
Currently the "sym" output field is used to dump instruction pointers
and callchain stack. Sample addresses can also be converted to symbols,
so the meaning of "sym" needs to be fixed. This patch adds an "ip"
option and if it is selected the user can also opt to dump symbols for
them. If the user opts to dump IP without syms only the address is
shown.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1306528124-25861-2-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-06-02 13:28:34 -03:00
David Ahern
2cee77c450 perf stat: clarify unsupported events from uncounted events
perf stat continues running even if the event list contains counters
that are not supported. The resulting output then contains <not counted>
for those events which gets confusing as to which events are supported,
but not counted and which are not supported.

Before:

perf stat -ddd -- sleep 1

      Performance counter stats for 'sleep 1':

          0.571283 task-clock                #    0.001 CPUs utilized
                 1 context-switches          #    0.002 M/sec
                 0 CPU-migrations            #    0.000 M/sec
               157 page-faults               #    0.275 M/sec
         1,037,707 cycles                    #    1.816 GHz
     <not counted> stalled-cycles-frontend
     <not counted> stalled-cycles-backend
           654,499 instructions              #    0.63  insns per cycle
           136,129 branches                  #  238.286 M/sec
     <not counted> branch-misses
     <not counted> L1-dcache-loads
     <not counted> L1-dcache-load-misses
     <not counted> LLC-loads
     <not counted> LLC-load-misses
     <not counted> L1-icache-loads
     <not counted> L1-icache-load-misses
     <not counted> dTLB-loads
     <not counted> dTLB-load-misses
     <not counted> iTLB-loads
     <not counted> iTLB-load-misses
     <not counted> L1-dcache-prefetches
     <not counted> L1-dcache-prefetch-misses

       1.001004836 seconds time elapsed

After:

perf stat -ddd -- sleep 1

 Performance counter stats for 'sleep 1':

          1.350326 task-clock                #    0.001 CPUs utilized
                 2 context-switches          #    0.001 M/sec
                 0 CPU-migrations            #    0.000 M/sec
               157 page-faults               #    0.116 M/sec
            11,986 cycles                    #    0.009 GHz
   <not supported> stalled-cycles-frontend
   <not supported> stalled-cycles-backend
           496,986 instructions              #   41.46  insns per cycle
           138,065 branches                  #  102.246 M/sec
             7,245 branch-misses             #    5.25% of all branches
     <not counted> L1-dcache-loads
     <not counted> L1-dcache-load-misses
     <not counted> LLC-loads
     <not counted> LLC-load-misses
     <not counted> L1-icache-loads
     <not counted> L1-icache-load-misses
     <not counted> dTLB-loads
     <not counted> dTLB-load-misses
     <not counted> iTLB-loads
     <not counted> iTLB-load-misses
     <not counted> L1-dcache-prefetches
   <not supported> L1-dcache-prefetch-misses

       1.002397333 seconds time elapsed

v1->v2:
changed supported type from int to bool

v2->v3
fixed vertical alignment of new struct element

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1306767359-13221-1-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-06-02 13:26:15 -03:00
Frederic Weisbecker
64348153c6 perf python: Cleanup useless double NULL termination in method arg names
The list of methods argument names only needs to be NULL terminated
once. Remove the second ones.

Cc: David Ahern <daahern@cisco.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: http://lkml.kernel.org/r/1301588863-20210-2-git-send-email-fweisbec@gmail.com
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-06-02 13:21:26 -03:00
Frederic Weisbecker
e95cc02880 perf python: Fix argument name list of read_on_cpu()
Mandatory arguments need to be present in the argument name list, as
well as optional arguments, otherwise python barfs:

	# ./python/twatch.py
	Traceback (most recent call last):
	  File "./python/twatch.py", line 41, in <module>
	    main()
	  File "./python/twatch.py", line 32, in main
	    event = evlist.read_on_cpu(cpu)
	RuntimeError: more argument specifiers than keyword list entries

Hence, add cpu to the name list.

Cc: David Ahern <daahern@cisco.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: http://lkml.kernel.org/r/1301588863-20210-1-git-send-email-fweisbec@gmail.com
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-06-02 13:21:07 -03:00
Steven Rostedt
9bf7174949 ktest: Ignore unset values of the minconfig in config_bisect
By ignoring the unset values of the minconfig in deciding
what to test in the config_bisect can cause the problem
config from being tested too.

Just do not test the configs that are set in the minconfig.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-06-02 11:57:57 -04:00
Steven Rostedt
4da46da2d2 ktest: Fix result of rebooting the kernel
The command that is called that reboots the kernel may fail
but the return code is not passed back to the ktest.pl script.
This is because a ';' is used between the two commands and
if the second command fails, only the first command's return
code is returned. Using a '&&' between the two commands fixes
this.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-06-02 11:57:56 -04:00
Steven Rostedt
4c8cc55b3c ktest: Fix off-by-one in config bisect result
Because in perl the array size returned by $#arr, is the last
index and not the actually size of the array, we end the config
bisect early, thinking there is only one config left when there
are in fact two. Thus the result has a 50% chance of picking
the correct config that caused the problem.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-06-02 11:57:41 -04:00
Arnaldo Carvalho de Melo
c2a70653af perf evlist: Don't die if sample_{id_all|type} is invalid
Fixes two more cases where the python binding would not load:

. Not finding die(), which it shouldn't anyway, not good to just stop the
  world because some particular perf.data file is invalid, just propagate
  the error to the caller.

. Not finding perf_sample_size: fix it by moving it from event.c to evsel,
  where it belongs, as most cases are moving to operate on an evsel object.o

One of the fixed problems:

[root@emilia ~]# python
>>> import perf
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: perf_sample_size
>>>
[root@emilia ~]#

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-1hkj7b2cvgbfnoizsekjb6c9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-06-02 11:04:54 -03:00
Arnaldo Carvalho de Melo
5c6970af2f perf python: Use exception to propagate errors
We were using pr_debug to tell the user about not being able to parse a sample
where we should really use the python way of reporting errors: exceptions.

Fixes this problem:

[root@emilia ~]# python
>>> import perf
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: eprintf
>>>
[root@emilia ~]

As we want to keep the objects linked in the python binding (and in the future
in a shared library) minimal.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-m9dba9kaluas0kq8r58z191c@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-06-02 10:55:10 -03:00
Arnaldo Carvalho de Melo
bccdaba044 perf evlist: Remove dependency on debug routines
So far we avoided having to link debug.o in the python binding, keep it
that way by not using ui__warning() in evlist.c.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-4wtew8hd3g7ejnlehtspys2t@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-06-02 10:41:41 -03:00
Michael S. Tsirkin
4423fe40b0 virtio_test: support event index
Add ability to test the new event idx feature,
enable by default.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-05-30 11:14:15 +09:30
Arnaldo Carvalho de Melo
e4a338d05d perf top: Don't stop if no kernel symtab is found
We now just warn the user about the fact and go on providing just
userspace samples.

This fixes a problem when no vmlinux is explicetely passed by the user,
thus symbol_conf.vmlinux_name is NULL, no suitable vmlinux is found, and
then we get:

 aldebaran:~> perf top -p 7557
 [kernel.kallsyms] with build id 44d9a989eabbd79e486bc079d6b743d397c204e0
 not found, continuing without symbols
 The (null) file can't be used

Reported-by: Ingo Molnar <mingo@elte.hu>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: http://lkml.kernel.org/n/tip-cj2g81hn64wv2bipmqk4fy2m@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-05-27 16:02:29 -03:00
Arnaldo Carvalho de Melo
5f6f558097 perf top: Handle kptr_restrict
Reported-by: Ingo Molnar <mingo@elte.hu>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: http://lkml.kernel.org/n/tip-cyl5zmi1nu35vyu7l5im2pyv@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-05-27 16:02:25 -03:00
Arnaldo Carvalho de Melo
59fb1ee95e perf top: Remove unused macro
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: http://lkml.kernel.org/n/tip-weqbs0tkk2u0qp1xxdxxosfg@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-05-27 16:02:20 -03:00
David Ahern
4af4c9550c perf events: initialize fd array to -1 instead of 0
perf_evsel__alloc_fd allocates an array of file descriptors with the
memory initialized to 0. The array has dimensions for cpus and threads.

Later, __perf_evsel__open calls sys_perf_event_open for each cpu and thread
dimensions. If the open fails for any of the cpus or threads then the fd's
for this event are closed and the fd entry in the array is set to -1. Now,
if the first attempt fails for the event (e.g., the event is not supported)
the remaining dimensions (cpu > 0 and thread > 0) are not touched and left
at the initialized value of 0.

builtin-stat catches ENOENT and ENOSYS failures and allows the command to
continue. The end result is that stat attempts to read from an fd of 0 which
of course is stdin and so the command hangs until you type ctrl-D.

Resolve by initializing the array to -1 since an fd < 0 is already
handled.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1306511914-8016-1-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-05-27 16:02:12 -03:00
Arnaldo Carvalho de Melo
646aaea615 perf tools: Make sure kptr_restrict warnings fit 80 col terms
Suggested-by: Ingo Molnar <mingo@elte.hu>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: http://lkml.kernel.org/n/tip-i1p8vrhq7xveyui6t1sc914e@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-05-27 16:02:09 -03:00
Arnaldo Carvalho de Melo
75911c9bd1 perf tools: Fix build on older systems
Where /usr/include/linux/const.h is not present, e.g. RHEL5.

Reported-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: http://lkml.kernel.org/n/tip-ypcw2mu0w7dl1rrc6ncz3pee@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-05-26 11:16:29 -03:00
Arnaldo Carvalho de Melo
ec80fde746 perf symbols: Handle /proc/sys/kernel/kptr_restrict
Perf uses /proc/modules to figure out where kernel modules are loaded.

With the advent of kptr_restrict, non root users get zeroes for all module
start addresses.

So check if kptr_restrict is non zero and don't generate the syntethic
PERF_RECORD_MMAP events for them.

Warn the user about it in perf record and in perf report.

In perf report the reference relocation symbol being zero means that
kptr_restrict was set, thus /proc/kallsyms has only zeroed addresses, so don't
use it to fixup symbol addresses when using a valid kallsyms (in the buildid
cache) or vmlinux (in the vmlinux path) build-id located automatically or
specified by the user.

Provide an explanation about it in 'perf report' if kernel samples were taken,
checking if a suitable vmlinux or kallsyms was found/specified.

Restricted /proc/kallsyms don't go to the buildid cache anymore.

Example:

 [acme@emilia ~]$ perf record -F 100000 sleep 1

 WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted, check
 /proc/sys/kernel/kptr_restrict.

 Samples in kernel functions may not be resolved if a suitable vmlinux file is
 not found in the buildid cache or in the vmlinux path.

 Samples in kernel modules won't be resolved at all.

 If some relocation was applied (e.g. kexec) symbols may be misresolved even
 with a suitable vmlinux or kallsyms file.

 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.005 MB perf.data (~231 samples) ]
 [acme@emilia ~]$

 [acme@emilia ~]$ perf report --stdio
 Kernel address maps (/proc/{kallsyms,modules}) were restricted,
 check /proc/sys/kernel/kptr_restrict before running 'perf record'.

 If some relocation was applied (e.g. kexec) symbols may be misresolved.

 Samples in kernel modules can't be resolved as well.

 # Events: 13  cycles
 #
 # Overhead  Command      Shared Object                 Symbol
 # ........  .......  .................  .....................
 #
    20.24%    sleep  [kernel.kallsyms]  [k] page_fault
    20.04%    sleep  [kernel.kallsyms]  [k] filemap_fault
    19.78%    sleep  [kernel.kallsyms]  [k] __lru_cache_add
    19.69%    sleep  ld-2.12.so         [.] memcpy
    14.71%    sleep  [kernel.kallsyms]  [k] dput
     4.70%    sleep  [kernel.kallsyms]  [k] flush_signal_handlers
     0.73%    sleep  [kernel.kallsyms]  [k] perf_event_comm
     0.11%    sleep  [kernel.kallsyms]  [k] native_write_msr_safe

 #
 # (For a higher level overview, try: perf report --sort comm,dso)
 #
 [acme@emilia ~]$

This is because it found a suitable vmlinux (build-id checked) in
/lib/modules/2.6.39-rc7+/build/vmlinux (use -v in perf report to see the long
file name).

If we remove that file from the vmlinux path:

 [root@emilia ~]# mv /lib/modules/2.6.39-rc7+/build/vmlinux \
		     /lib/modules/2.6.39-rc7+/build/vmlinux.OFF
 [acme@emilia ~]$ perf report --stdio
 [kernel.kallsyms] with build id 57298cdbe0131f6871667ec0eaab4804dcf6f562
 not found, continuing without symbols

 Kernel address maps (/proc/{kallsyms,modules}) were restricted, check
 /proc/sys/kernel/kptr_restrict before running 'perf record'.

 As no suitable kallsyms nor vmlinux was found, kernel samples can't be
 resolved.

 Samples in kernel modules can't be resolved as well.

 # Events: 13  cycles
 #
 # Overhead  Command      Shared Object  Symbol
 # ........  .......  .................  ......
 #
    80.31%    sleep  [kernel.kallsyms]  [k] 0xffffffff8103425a
    19.69%    sleep  ld-2.12.so         [.] memcpy

 #
 # (For a higher level overview, try: perf report --sort comm,dso)
 #
 [acme@emilia ~]$

Reported-by: Stephane Eranian <eranian@google.com>
Suggested-by: David Miller <davem@davemloft.net>
Cc: Dave Jones <davej@redhat.com>
Cc: David Miller <davem@davemloft.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Kees Cook <kees.cook@canonical.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: http://lkml.kernel.org/n/tip-mt512joaxxbhhp1odop04yit@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-05-26 11:15:25 -03:00
Jesper Juhl
ea7659fb2b perf: Remove duplicate headers
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Cc: Tom Zanussi <tom.zanussi@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: trivial@kernel.org
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Link: http://lkml.kernel.org/r/alpine.LNX.2.00.1105261011290.17400@swampdragon.chaosbits.net
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-26 13:49:57 +02:00
Linus Torvalds
5214638384 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf tools: Fix sample type size calculation in 32 bits archs
  profile: Use vzalloc() rather than vmalloc() & memset()
2011-05-23 21:20:48 -07:00
Frederic Weisbecker
0f61f3e4db perf tools: Fix sample type size calculation in 32 bits archs
The shift used here to count the number of bits set in
the mask doesn't work above the low part for archs that
are not 64 bits.

Fix the constant used for the shift.

This fixes a 32-bit perf top failure reported by Eric Dumazet:

	Can't parse sample, err = -14
	Can't parse sample, err = -14
	...

Reported-and-tested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com
Link: http://lkml.kernel.org/r/1306200686-17317-1-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-24 04:33:24 +02:00
Linus Torvalds
19504828b4 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf tools: Fix sample size bit operations
  perf tools: Fix ommitted mmap data update on remap
  watchdog: Change the default timeout and configure nmi watchdog period based on watchdog_thresh
  watchdog: Disable watchdog when thresh is zero
  watchdog: Only disable/enable watchdog if neccessary
  watchdog: Fix rounding bug in get_sample_period()
  perf tools: Propagate event parse error handling
  perf tools: Robustify dynamic sample content fetch
  perf tools: Pre-check sample size before parsing
  perf tools: Move evlist sample helpers to evlist area
  perf tools: Remove junk code in mmap size handling
  perf tools: Check we are able to read the event size on mmap
2011-05-23 09:25:52 -07:00
Linus Torvalds
57d19e80f4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits)
  b43: fix comment typo reqest -> request
  Haavard Skinnemoen has left Atmel
  cris: typo in mach-fs Makefile
  Kconfig: fix copy/paste-ism for dell-wmi-aio driver
  doc: timers-howto: fix a typo ("unsgined")
  perf: Only include annotate.h once in tools/perf/util/ui/browsers/annotate.c
  md, raid5: Fix spelling error in comment ('Ofcourse' --> 'Of course').
  treewide: fix a few typos in comments
  regulator: change debug statement be consistent with the style of the rest
  Revert "arm: mach-u300/gpio: Fix mem_region resource size miscalculations"
  audit: acquire creds selectively to reduce atomic op overhead
  rtlwifi: don't touch with treewide double semicolon removal
  treewide: cleanup continuations and remove logging message whitespace
  ath9k_hw: don't touch with treewide double semicolon removal
  include/linux/leds-regulator.h: fix syntax in example code
  tty: fix typo in descripton of tty_termios_encode_baud_rate
  xtensa: remove obsolete BKL kernel option from defconfig
  m68k: fix comment typo 'occcured'
  arch:Kconfig.locks Remove unused config option.
  treewide: remove extra semicolons
  ...
2011-05-23 09:12:26 -07:00
Linus Torvalds
2e77defc5d Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-ktest
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-ktest:
  ktest: Allow options to be used by other options
  ktest: Create variables for the ktest config files
  ktest: Reboot after each patchcheck run
  ktest: Reboot to good kernel after every bisect run
  ktest: If test failed due to timeout, print that
  ktest: Fix post install command
2011-05-23 08:20:44 -07:00
Frederic Weisbecker
3cb6d15408 perf tools: Fix sample size bit operations
What we want is to count the number of bits in the mask,
not some other random operation written in the middle
of the night.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1306148788-6179-2-git-send-email-fweisbec@gmail.com
[ Fixed perf_event__names[] alignment which was nearby and hurting my eyes ... ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-23 13:26:36 +02:00
Frederic Weisbecker
998bedc8c5 perf tools: Fix ommitted mmap data update on remap
Commit eac9eacee1 "perf tools: Check we are able to read the event
size on mmap" brought a check to ensure we can read the size of the
event before dereferencing it, and do a remap otherwise to move the
buffer forward.

However that remap was ommitting all the necessary work to
update the new page offset, head, and to unmap previous pages,
etc...

To fix this, gather all the code that fetches the event in a
seperate helper which does all the necessary checks about the
header/event size and tells us anytime a remap is needed.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1306148788-6179-3-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-23 13:22:57 +02:00
Ingo Molnar
3ac1bbcf13 Merge branch 'perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into perf/urgent
Conflicts:
	tools/perf/builtin-top.c

Semantic conflict:
	util/include/linux/list.h        # fix prefetch.h removal fallout

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-22 10:10:01 +02:00
Frederic Weisbecker
5538becaec perf tools: Propagate event parse error handling
Better handle event parsing error by propagating the details
in upper layers or by dumping some failure message. So that
the user knows he has some crazy events in the batch.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
2011-05-22 03:38:49 +02:00
Frederic Weisbecker
98e1da905c perf tools: Robustify dynamic sample content fetch
Ensure the size of the dynamic fields such as callchains
or raw events don't overlap the whole event boundaries.

This prevents from dereferencing junk if the given size of
the callchain goes too eager.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
2011-05-22 03:38:48 +02:00
Frederic Weisbecker
a285412479 perf tools: Pre-check sample size before parsing
Check that the total size of the sample fields having a fixed
size do not exceed the one of the whole event. This robustifies
the sample parsing.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
2011-05-22 03:38:36 +02:00
Frederic Weisbecker
74429964d8 perf tools: Move evlist sample helpers to evlist area
These APIs should belong to evlist.c as they may not be
exclusively tied to the headers.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com
2011-05-22 03:12:29 +02:00
Frederic Weisbecker
dd5f5fd108 perf tools: Remove junk code in mmap size handling
size is overriden later and used only then. Those
lines are only junk, probably a leftover.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
2011-05-22 03:12:28 +02:00
Frederic Weisbecker
eac9eacee1 perf tools: Check we are able to read the event size on mmap
Check we have enough mmaped space to read the current event
size from its headers, otherwise we may dereference some
hell there.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
2011-05-22 03:12:13 +02:00
Linus Torvalds
268bb0ce3e sanitize <linux/prefetch.h> usage
Commit e66eed651f ("list: remove prefetching from regular list
iterators") removed the include of prefetch.h from list.h, which
uncovered several cases that had apparently relied on that rather
obscure header file dependency.

So this fixes things up a bit, using

   grep -L linux/prefetch.h $(git grep -l '[^a-z_]prefetchw*(' -- '*.[ch]')
   grep -L 'prefetchw*(' $(git grep -l 'linux/prefetch.h' -- '*.[ch]')

to guide us in finding files that either need <linux/prefetch.h>
inclusion, or have it despite not needing it.

There are more of them around (mostly network drivers), but this gets
many core ones.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-20 12:50:29 -07:00
Steven Rostedt
2a62512bce ktest: Allow options to be used by other options
There are cases where one ktest option may be used within another
ktest option. Allow them to be reused just like config variables
but there are evaluated at time of test not config processing time.

Thus having something like:

MAKE_CMD = make ARCH=${ARCH}

TEST_START
ARCH = powerpc

TEST_START
ARCH = arm

Will have the arch defined for each test iteration.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-05-20 15:48:59 -04:00
Steven Rostedt
77d942ceac ktest: Create variables for the ktest config files
I found that I constantly reuse information for each test case.
It would be nice to just define a variable to reuse.

For example I may have:

TEST_START
[...]
TEST = ssh root@mybox /path/to/my/script

TEST_START
[...]
TEST = ssh root@mybox /path/to/my/script

[etc]

The issue is, I may wont to change that script or one of the other
fields. Then I need to update each line individually.

With the addition of config variables (variables only used during parsing
the config) we can simplify the config files. These variables can
also be defined multiple times and each time the new value will
overwrite the old value.

The convention to use a config variable over a ktest option is to use :=
instead of =.

Now we could do:

USER := root
TARGET := mybox
TEST_SCRIPT := /path/to/my/script
TEST_CASE := ${USER}@${TARGET} ${TEST_SCRIPT}

TEST_START
[...]
TEST = ${TEST_CASE}

TEST_START
[...]
TEST = ${TEST_CASE}

[etc]

Now we just need to update the variables at the top.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-05-20 15:26:26 -04:00
Steven Rostedt
27d934b287 ktest: Reboot after each patchcheck run
The patches being checked may not leave the kernel in a state
that the next run will allow the new kernel to be copied to the
machine. Reboot to a known good kernel before continuing to the
next kernel to test.

Added option PATCHCHECK_SLEEP_TIME for the max time to sleep between
patchcheck reboots.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-05-20 09:21:02 -04:00
Steven Rostedt
4025bc62dd ktest: Reboot to good kernel after every bisect run
Reboot after each bisect run regardless if the bisect passed
or failed. The test may just be to boot the kernel and that kernel
may not have a way to copy the next kerne to it. Reboot to a known
good kernel after each bisect run.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-05-20 09:16:29 -04:00
Steven Rostedt
4d62bf51ac ktest: If test failed due to timeout, print that
If the test failed due to timeout for boot, print a message saying
so. Otherwise the user will be confused to why their test just failed.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-05-20 09:14:35 -04:00
Steven Rostedt
ca6a21f874 ktest: Fix post install command
The command to run post install (for those that want initrds) was
broken. Instead of doing a substitution for the $KERNEL_VERSION
variable. It was replacing the entire command with nothing.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-05-20 09:11:58 -04:00
Linus Torvalds
eb04f2f04e Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (78 commits)
  Revert "rcu: Decrease memory-barrier usage based on semi-formal proof"
  net,rcu: convert call_rcu(prl_entry_destroy_rcu) to kfree
  batman,rcu: convert call_rcu(softif_neigh_free_rcu) to kfree_rcu
  batman,rcu: convert call_rcu(neigh_node_free_rcu) to kfree()
  batman,rcu: convert call_rcu(gw_node_free_rcu) to kfree_rcu
  net,rcu: convert call_rcu(kfree_tid_tx) to kfree_rcu()
  net,rcu: convert call_rcu(xt_osf_finger_free_rcu) to kfree_rcu()
  net/mac80211,rcu: convert call_rcu(work_free_rcu) to kfree_rcu()
  net,rcu: convert call_rcu(wq_free_rcu) to kfree_rcu()
  net,rcu: convert call_rcu(phonet_device_rcu_free) to kfree_rcu()
  perf,rcu: convert call_rcu(swevent_hlist_release_rcu) to kfree_rcu()
  perf,rcu: convert call_rcu(free_ctx) to kfree_rcu()
  net,rcu: convert call_rcu(__nf_ct_ext_free_rcu) to kfree_rcu()
  net,rcu: convert call_rcu(net_generic_release) to kfree_rcu()
  net,rcu: convert call_rcu(netlbl_unlhsh_free_addr6) to kfree_rcu()
  net,rcu: convert call_rcu(netlbl_unlhsh_free_addr4) to kfree_rcu()
  security,rcu: convert call_rcu(sel_netif_free) to kfree_rcu()
  net,rcu: convert call_rcu(xps_dev_maps_release) to kfree_rcu()
  net,rcu: convert call_rcu(xps_map_release) to kfree_rcu()
  net,rcu: convert call_rcu(rps_map_release) to kfree_rcu()
  ...
2011-05-19 18:14:34 -07:00
Linus Torvalds
80fe02b5da Merge branches 'sched-core-for-linus' and 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (60 commits)
  sched: Fix and optimise calculation of the weight-inverse
  sched: Avoid going ahead if ->cpus_allowed is not changed
  sched, rt: Update rq clock when unthrottling of an otherwise idle CPU
  sched: Remove unused parameters from sched_fork() and wake_up_new_task()
  sched: Shorten the construction of the span cpu mask of sched domain
  sched: Wrap the 'cfs_rq->nr_spread_over' field with CONFIG_SCHED_DEBUG
  sched: Remove unused 'this_best_prio arg' from balance_tasks()
  sched: Remove noop in alloc_rt_sched_group()
  sched: Get rid of lock_depth
  sched: Remove obsolete comment from scheduler_tick()
  sched: Fix sched_domain iterations vs. RCU
  sched: Next buddy hint on sleep and preempt path
  sched: Make set_*_buddy() work on non-task entities
  sched: Remove need_migrate_task()
  sched: Move the second half of ttwu() to the remote cpu
  sched: Restructure ttwu() some more
  sched: Rename ttwu_post_activation() to ttwu_do_wakeup()
  sched: Remove rq argument from ttwu_stat()
  sched: Remove rq->lock from the first half of ttwu()
  sched: Drop rq->lock from sched_exec()
  ...

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: Fix rt_rq runtime leakage bug
2011-05-19 17:41:22 -07:00
Linus Torvalds
df48d8716e Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (107 commits)
  perf stat: Add more cache-miss percentage printouts
  perf stat: Add -d -d and -d -d -d options to show more CPU events
  ftrace/kbuild: Add recordmcount files to force full build
  ftrace: Add self-tests for multiple function trace users
  ftrace: Modify ftrace_set_filter/notrace to take ops
  ftrace: Allow dynamically allocated function tracers
  ftrace: Implement separate user function filtering
  ftrace: Free hash with call_rcu_sched()
  ftrace: Have global_ops store the functions that are to be traced
  ftrace: Add ops parameter to ftrace_startup/shutdown functions
  ftrace: Add enabled_functions file
  ftrace: Use counters to enable functions to trace
  ftrace: Separate hash allocation and assignment
  ftrace: Create a global_ops to hold the filter and notrace hashes
  ftrace: Use hash instead for FTRACE_FL_FILTER
  ftrace: Replace FTRACE_FL_NOTRACE flag with a hash of ignored functions
  perf bench, x86: Add alternatives-asm.h wrapper
  x86, 64-bit: Fix copy_[to/from]_user() checks for the userspace address limit
  x86, mem: memset_64.S: Optimize memset by enhanced REP MOVSB/STOSB
  x86, mem: memmove_64.S: Optimize memmove by enhanced REP MOVSB/STOSB
  ...
2011-05-19 17:36:08 -07:00
Ingo Molnar
c3305257cd perf stat: Add more cache-miss percentage printouts
Print out the cache-miss percentage as well if the cache refs were
collected, for all the generic cache event types.

Before:

   11,103,723,230 dTLB-loads                #  622.471 M/sec                    ( +-  0.30% )
       87,065,337 dTLB-load-misses          #    4.881 M/sec                    ( +-  0.90% )

After:

   11,353,713,242 dTLB-loads                #  626.020 M/sec                    ( +-  0.35% )
      113,393,472 dTLB-load-misses          #    1.00% of all dTLB cache hits   ( +-  0.49% )

Also ASCII color highlight too high percentages, them when it's executed on the console.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/n/tip-lkhwxsevdbd9a8nymx0vxc3y@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-19 14:30:50 +02:00
Ingo Molnar
2cba3ffb9a perf stat: Add -d -d and -d -d -d options to show more CPU events
Print even more detailed statistics if requested via perf stat -d:

       -d:          detailed events, L1 and LLC data cache
    -d -d:     more detailed events, dTLB and iTLB events
 -d -d -d:     very detailed events, adding prefetch events

Full output looks like this now:

 Performance counter stats for '/home/mingo/hackbench 10' (5 runs):

       1703.674707 task-clock                #    8.709 CPUs utilized            ( +-  4.19% )
            49,068 context-switches          #    0.029 M/sec                    ( +- 16.66% )
             8,303 CPU-migrations            #    0.005 M/sec                    ( +- 24.90% )
            17,397 page-faults               #    0.010 M/sec                    ( +-  0.46% )
     2,345,389,239 cycles                    #    1.377 GHz                      ( +-  4.61% ) [55.90%]
     1,884,503,527 stalled-cycles-frontend   #   80.35% frontend cycles idle     ( +-  5.67% ) [50.39%]
       743,919,737 stalled-cycles-backend    #   31.72% backend  cycles idle     ( +-  8.75% ) [49.91%]
     1,314,416,379 instructions              #    0.56  insns per cycle
                                             #    1.43  stalled cycles per insn  ( +-  2.53% ) [60.87%]
       272,592,567 branches                  #  160.003 M/sec                    ( +-  1.74% ) [56.56%]
         3,794,846 branch-misses             #    1.39% of all branches          ( +-  6.59% ) [58.50%]
       449,982,778 L1-dcache-loads           #  264.125 M/sec                    ( +-  2.47% ) [49.88%]
        22,404,961 L1-dcache-load-misses     #    4.98% of all L1-dcache hits    ( +-  6.08% ) [55.05%]
         6,204,750 LLC-loads                 #    3.642 M/sec                    ( +-  8.91% ) [43.75%]
         1,837,411 LLC-load-misses           #    1.078 M/sec                    ( +-  7.27% ) [12.07%]
       411,440,421 L1-icache-loads           #  241.502 M/sec                    ( +-  5.60% ) [36.52%]
        27,556,832 L1-icache-load-misses     #   16.175 M/sec                    ( +-  7.46% ) [46.72%]
       464,067,627 dTLB-loads                #  272.392 M/sec                    ( +-  4.46% ) [54.17%]
        10,765,648 dTLB-load-misses          #    6.319 M/sec                    ( +-  3.18% ) [48.68%]
     1,273,080,386 iTLB-loads                #  747.256 M/sec                    ( +-  3.38% ) [47.53%]
           117,481 iTLB-load-misses          #    0.069 M/sec                    ( +- 14.99% ) [47.01%]
         4,590,653 L1-dcache-prefetches      #    2.695 M/sec                    ( +-  4.49% ) [46.19%]
         1,712,660 L1-dcache-prefetch-misses #    1.005 M/sec                    ( +-  3.75% ) [44.82%]

        0.195622057  seconds time elapsed  ( +-  6.84% )

Also clean up the attribute construction code to be appending, and factor
it out into add_default_attributes().

Tweak the coverage percentage printout a bit, so that it's easier to view it
alongside the +- sttddev colum.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/n/tip-to3kgu04449s64062val8b62@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-19 14:29:51 +02:00
Ingo Molnar
b313207286 perf bench, x86: Add alternatives-asm.h wrapper
perf bench needs this to build the kernel's memcpy routine:

In file included from bench/mem-memcpy-x86-64-asm.S:2:0:
bench/../../../arch/x86/lib/memcpy_64.S:7:33: fatal error: asm/alternative-asm.h: No such file or directory

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/n/tip-c5d41xibgullk8h2280q4gv0@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-18 21:00:44 +02:00
Stephane Eranian
94692349c4 perf: Fix multi-event parsing bug
This patch fixes an issue with event parsing.
The following commit appears to have broken the
ability to specify a comma separated list of events:

   commit ceb53fbf6d
   Author: Ingo Molnar <mingo@elte.hu>
   Date:   Wed Apr 27 04:06:33 2011 +0200

       perf stat: Fail more clearly when an invalid modifier is specified

This patch fixes this while preserving the desired effect:

$ perf stat -e instructions:u,instructions:k ls /dev/null /dev/null

 Performance counter stats for 'ls /dev/null':

            365956 instructions:u           #    0.00  insns per cycle
            731806 instructions:k           #    0.00  insns per cycle

        0.001108862  seconds time elapsed

$ perf stat -e task-clock-msecs true
invalid event modifier: '-msecs'
Run 'perf list' for a list of valid events and modifiers

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: acme@redhat.com
Cc: peterz@infradead.org
Cc: fweisbec@gmail.com
Link: http://lkml.kernel.org/r/20110517133619.GA6999@quad
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-17 20:45:36 +02:00
Ingo Molnar
52004ea7ca Merge branch 'perf/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux-2.6 into perf/urgent 2011-05-15 19:41:00 +02:00
Arnaldo Carvalho de Melo
aece948f5d perf evlist: Fix per thread mmap setup
The PERF_EVENT_IOC_SET_OUTPUT ioctl was returning -EINVAL when using
--pid when monitoring multithreaded apps, as we can only share a ring
buffer for events on the same thread if not doing per cpu.

Fix it by using per thread ring buffers.

Tested with:

[root@felicio ~]# tuna -t 26131 -CP | nl
  1                      thread       ctxt_switches
  2    pid SCHED_ rtpri affinity voluntary nonvoluntary             cmd
  3 26131   OTHER     0      0,1  10814276      2397830 chromium-browse
  4  642    OTHER     0      0,1     14688            0 chromium-browse
  5  26148  OTHER     0      0,1    713602       115479 chromium-browse
  6  26149  OTHER     0      0,1    801958         2262 chromium-browse
  7  26150  OTHER     0      0,1   1271128          248 chromium-browse
  8  26151  OTHER     0      0,1         3            0 chromium-browse
  9  27049  OTHER     0      0,1     36796            9 chromium-browse
 10  618    OTHER     0      0,1     14711            0 chromium-browse
 11  661    OTHER     0      0,1     14593            0 chromium-browse
 12  29048  OTHER     0      0,1     28125            0 chromium-browse
 13  26143  OTHER     0      0,1   2202789          781 chromium-browse
[root@felicio ~]#

So 11 threads under pid 26131, then:

[root@felicio ~]# perf record -F 50000 --pid 26131

[root@felicio ~]# grep perf_event /proc/`pidof perf`/maps | nl
  1 7fa4a2538000-7fa4a25b9000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
  2 7fa4a25b9000-7fa4a263a000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
  3 7fa4a263a000-7fa4a26bb000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
  4 7fa4a26bb000-7fa4a273c000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
  5 7fa4a273c000-7fa4a27bd000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
  6 7fa4a27bd000-7fa4a283e000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
  7 7fa4a283e000-7fa4a28bf000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
  8 7fa4a28bf000-7fa4a2940000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
  9 7fa4a2940000-7fa4a29c1000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
 10 7fa4a29c1000-7fa4a2a42000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
 11 7fa4a2a42000-7fa4a2ac3000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
[root@felicio ~]#

11 mmaps, one per thread since we didn't specify any CPU list, so we need one
mmap per thread and:

[root@felicio ~]# perf record -F 50000 --pid 26131
^M
^C[ perf record: Woken up 79 times to write data ]
[ perf record: Captured and wrote 20.614 MB perf.data (~900639 samples) ]

[root@felicio ~]# perf report -D | grep PERF_RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort -n | uniq -c | sort -nr | nl
     1	 371310 26131
     2	  96516 26148
     3	  95694 26149
     4	  95203 26150
     5	   7291 26143
     6	     87 27049
     7	     76 661
     8	     60 29048
     9	     47 618
    10	     43 642
[root@felicio ~]#

Ok, one of the threads, 26151 was quiescent, so no samples there, but all the
others are there.

Then, if I specify one CPU:

[root@felicio ~]# perf record -F 50000 --pid 26131 --cpu 1
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.680 MB perf.data (~29730 samples) ]

[root@felicio ~]# perf report -D | grep PERF_RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort -n | uniq -c | sort -nr | nl
     1	   8444 26131
     2	   2584 26149
     3	   2518 26148
     4	   2324 26150
     5	    123 26143
     6	      9 661
     7	      9 29048
[root@felicio ~]#

This machine has two cores, so fewer threads appeared on the radar, and:

[root@felicio ~]# grep perf_event /proc/`pidof perf`/maps | nl
 1 7f484b922000-7f484b9a3000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
[root@felicio ~]#

Just one mmap, as now we can use just one per-cpu buffer instead of the
per-thread needed in the previous case.

For global profiling:

[root@felicio ~]# perf record -F 50000 -a
^C[ perf record: Woken up 26 times to write data ]
[ perf record: Captured and wrote 7.128 MB perf.data (~311412 samples) ]

[root@felicio ~]# grep perf_event /proc/`pidof perf`/maps | nl
     1	7fb49b435000-7fb49b4b6000 rwxs 00000000 00:09 4064                       anon_inode:[perf_event]
     2	7fb49b4b6000-7fb49b537000 rwxs 00000000 00:09 4064                       anon_inode:[perf_event]
[root@felicio ~]#

It uses per-cpu buffers.

For just one thread:

[root@felicio ~]# perf record -F 50000 --tid 26148
^C[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.330 MB perf.data (~14426 samples) ]

[root@felicio ~]# perf report -D | grep PERF_RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort -n | uniq -c | sort -nr | nl
     1	   9969 26148
[root@felicio ~]#

[root@felicio ~]# grep perf_event /proc/`pidof perf`/maps | nl
     1	7f286a51b000-7f286a59c000 rwxs 00000000 00:09 4064                       anon_inode:[perf_event]
[root@felicio ~]#

Tested-by: David Ahern <dsahern@gmail.com>
Tested-by: Lin Ming <ming.m.lin@intel.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: http://lkml.kernel.org/r/20110426204401.GB1746@ghostprotocols.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-05-15 10:02:14 -03:00
Arnaldo Carvalho de Melo
b901941819 perf tools: Honour the cpu list parameter when also monitoring a thread list
The perf_evlist__create_maps was discarding the --cpu parameter when a
--pid or --tid was specified, fix that.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: http://lkml.kernel.org/r/20110426204401.GB1746@ghostprotocols.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-05-15 09:32:52 -03:00
Ingo Molnar
9cb5baba5e Merge commit 'v2.6.39-rc7' into sched/core 2011-05-12 09:36:18 +02:00
Lin Ming
2b348a7798 perf probe: Fix the missed parameter initialization
pubname_callback_param::found should be initialized to 0 in
fastpath lookup, the structure is on the stack and
uninitialized otherwise.

Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Link: http://lkml.kernel.org/r/1304066518-30420-2-git-send-email-ming.m.lin@intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-10 17:06:23 +02:00
Ingo Molnar
932fed4e2e Merge commit 'v2.6.39-rc7' into perf/core
Merge reason: pull in the latest fixes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-10 17:05:45 +02:00
Jesper Juhl
2d9ff66da6 perf: Only include annotate.h once in tools/perf/util/ui/browsers/annotate.c
Including "../../annotate.h" once in
tools/perf/util/ui/browsers/annotate.c is enough. No need to do it twice.

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-05-10 10:20:07 +02:00
Lin Ming
174a7b1f96 perf tools: Makefile: Use gcc to determine ARCH
The original Makefile uses "uname -m" to determine ARCH.
This causes problem on x86 when compile perf tool on 32 bit
userspace with a 64 bit kernel.

 bench/../../../arch/x86/lib/memcpy_64.S: Assembler messages:
 bench/../../../arch/x86/lib/memcpy_64.S:28: Error: bad register name `%rdi'

This is because "uname -m" returns x86_64 and memcpy_64.S is
included in 32 bit build.

Reported-by: Riccardo Magliocchetti <riccardo.magliocchetti@gmail.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Link: http://lkml.kernel.org/r/1304743274.3132.17.camel@localhost
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-07 11:40:59 +02:00
Paul E. McKenney
a26ac2455f rcu: move TREE_RCU from softirq to kthread
If RCU priority boosting is to be meaningful, callback invocation must
be boosted in addition to preempted RCU readers.  Otherwise, in presence
of CPU real-time threads, the grace period ends, but the callbacks don't
get invoked.  If the callbacks don't get invoked, the associated memory
doesn't get freed, so the system is still subject to OOM.

But it is not reasonable to priority-boost RCU_SOFTIRQ, so this commit
moves the callback invocations to a kthread, which can be boosted easily.

Also add comments and properly synchronized all accesses to
rcu_cpu_kthread_task, as suggested by Lai Jiangshan.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2011-05-05 23:16:54 -07:00
David Ahern
c63ca0c01d perf stat: Tell user about unsupported events in the list
Similar to perf-record, tell user about unsupported events
that will not be counted if invoked in verbose mode.

e.g.,

 $ perf stat -e dTLB-prefetch-misses -v -- sleep 1
 dTLB-prefetch-misses event is not supported by the kernel.
 dTLB-prefetch-misses: 0 0 0

 Performance counter stats for 'sleep 1':

     <not counted> dTLB-prefetch-misses

        1.001884783  seconds time elapsed

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/1304114655-10600-1-git-send-email-dsahern@gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-30 00:18:14 +02:00
Ingo Molnar
947b4ad1d1 perf list: Fix max event string size
Recent stalled-cycles event names were larger than the 40 chars printout
used by perf list.

Extend that, make it robust for future extensions and also adjust alignments
in face of wider event names.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-7y40wib8n009io7hjpn1dsrm@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-29 22:52:54 +02:00
Ingo Molnar
370faf1dd0 perf stat: Fail softly on unsupported events
David Ahern reported this perf stat failure:

> # /tmp/build-perf/perf stat -- sleep 1
>   Error: stalled-cycles-frontend event is not supported.
>   Fatal: Not all events could be opened.
>
> This is a Dell R410 with an E5620 processor.

Fail in a softer fashion on unknown/unsupported events.

Reported-by: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-7y40wib8n006io7hjpn1dsrm@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-29 16:22:33 +02:00
Ingo Molnar
fce3c786d3 perf stat: Leave more room for percentages
Triple digit percentages do not fit otherwise.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-7y40wib8n005io7hjpn1dsrm@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-29 16:06:09 +02:00
Ingo Molnar
2b427e14b7 perf stat: Adjust stall cycles warning percentages
Adjust to color thresholds to better match the percentages seen in
real workloads. Both are now a bit more sensitive.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-7y40wib8n004io7hjpn1dsrm@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-29 14:35:57 +02:00
Ingo Molnar
d3d1e86da0 perf stat: Analyze front-end and back-end stall counts
Sample output:

 Performance counter stats for './loop_1b':

        873.691065 task-clock               #    1.000 CPUs utilized
                 1 context-switches         #    0.000 M/sec
                 1 CPU-migrations           #    0.000 M/sec
                96 page-faults              #    0.000 M/sec
     2,012,637,222 cycles                   #    2.304 GHz                      (66.58%)
     1,001,397,911 stalled-cycles-frontend  #   49.76% frontend cycles idle     (66.58%)
         7,523,398 stalled-cycles-backend   #    0.37%  backend cycles idle     (66.76%)
     2,004,551,046 instructions             #    1.00  insns per cycle
                                            #    0.50  stalled cycles per insn  (66.80%)
     1,001,304,992 branches                 # 1146.063 M/sec                    (66.76%)
            39,453 branch-misses            #    0.00% of all branches          (66.64%)

        0.874046121  seconds time elapsed

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-7y40wib8n003io7hjpn1dsrm@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-29 14:35:55 +02:00
Ingo Molnar
129c04cb8c perf tools: Add front-end and back-end stalled cycles support
Update perf tooling to deal with front-end and back-end stalled cycles events.

Add both the default 'perf stat' output.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-7y40wib8n002io7hjpn1dsrm@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-29 14:35:49 +02:00
Ingo Molnar
ede7029004 perf stat: Fix compatibility behavior
Instead of failing on an unknown event, when new perf stat is run on
older kernels:

  $ ./perf stat true
  Error: open_counter returned with 22 (Invalid argument). /bin/dmesg
  may provide additional information.

  Fatal: Not all events could be opened.

Just ignore EINVAL and ENOSYS, we'll print the results as not counted:

 Performance counter stats for 'true':

          0.239483 task-clock               #    0.493 CPUs utilized
                 0 context-switches         #    0.000 M/sec
                 0 CPU-migrations           #    0.000 M/sec
                86 page-faults              #    0.359 M/sec
           704,766 cycles                   #    2.943 GHz
     <not counted> stalled-cycles
           381,961 instructions             #    0.54  insns per cycle
            69,626 branches                 #  290.735 M/sec
             4,594 branch-misses            #    6.60% of all branches

        0.000485883  seconds time elapsed

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-7y40wib8n1eqio5hjpn3dsrm@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-28 08:48:42 +02:00
Ingo Molnar
f9cef0a90c perf stat: Add --sync/-S option
--sync will tell perf stat to run sync() before starting a command.

This allows IO-heavy tests to be used with --repeat, without one
iteration impacting the other.

Elapsed time will stabilize for example:

  before:        3.971525714  seconds time elapsed  ( +-  8.56% )
  after:         3.211098537  seconds time elapsed  ( +-  1.52% )

So measurements will be more accurate.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-7y40wib8n1eqio7hjpn1dsrm@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-28 08:39:39 +02:00
Ingo Molnar
9ceb1c3d1f perf stat: Fix printout vertical alignment
Before:

 |
 | Performance counter stats for '/home/mingo/hackbench 20' (5 runs):
 |
 |        71,321,607 instructions:u           #    0.42  insns per cycle  ( +-  0.00% )
 |       168,040,009 cycles:u                 #    0.000 GHz                      ( +-  0.81% )
 |
 |        1.468002368  seconds time elapsed  ( +-  1.33% )
 |

After:

 |
 | Performance counter stats for '/home/mingo/hackbench 20' (5 runs):
 |
 |        71,321,607 instructions:u           #    0.42  insns per cycle          ( +-  0.00% )
 |       168,040,009 cycles:u                 #    0.000 GHz                      ( +-  0.81% )
 |
 |        1.468002368  seconds time elapsed  ( +-  1.33% )
 |

The last column (stddev noise) is properly aligned, vertically.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-7y40wib8n1eqio7hjpn0dsrm@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-27 17:48:55 +02:00
Ingo Molnar
c6264deff7 perf stat: Add -d/--detailed flag to run with a lot of events
Add the new -d/--detailed flag, which generates a pretty detailed event list:

 Performance counter stats for './hackbench 10' (10 runs):

       1514.287888 task-clock               #   10.897 CPUs utilized            ( +-  3.05% )
            39,698 context-switches         #    0.026 M/sec                    ( +- 12.19% )
             8,147 CPU-migrations           #    0.005 M/sec                    ( +- 16.55% )
            17,918 page-faults              #    0.012 M/sec                    ( +-  0.37% )
     2,944,504,050 cycles                   #    1.944 GHz                      ( +-  3.89% )  (32.60%)
     1,043,971,283 stalled-cycles           #   35.45% of all cycles are idle   ( +-  5.22% )  (44.48%)
     1,655,906,768 instructions             #    0.56  insns per cycle
                                            #    0.63  stalled cycles per insn  ( +-  1.95% )  (55.09%)
       338,832,373 branches                 #  223.757 M/sec                    ( +-  1.96% )  (64.47%)
         3,892,416 branch-misses            #    1.15% of all branches          ( +-  5.49% )  (73.12%)
       606,410,482 L1-dcache-loads          #  400.459 M/sec                    ( +-  1.29% )  (71.21%)
        31,204,395 L1-dcache-load-misses    #    5.15% of all L1-dcache hits    ( +-  3.04% )  (60.43%)
         3,922,751 LLC-loads                #    2.590 M/sec                    ( +-  6.80% )  (46.87%)
         5,037,288 LLC-load-misses          #    3.327 M/sec                    ( +-  3.56% )  (13.00%)

        0.138966828  seconds time elapsed  ( +-  4.11% )

This can be used "at a glance" for narrower analysis.

-d can also be used in addition to other -e events, to further expand an event list.

Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-cxs98quixs3qyvdqx3goojc4@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-26 21:03:16 +02:00
Ingo Molnar
8bb6c79f24 perf stat: Print out miss/hit ratio for L1 data-cache events
Print out this kind of l1-dcache-misses percentage:

 Performance counter stats for './bw_tcp localhost':

    29,956,262,201 cycles                   #    3.002 GHz                      (scaled from 85.14%)
     8,255,209,558 stalled-cycles           #   27.56% of all cycles are idle   (scaled from 86.56%)
     1,206,130,308 l1-dcache-misses         #   40.49% of all L1-dcache hits    (scaled from 86.30%)
     2,978,756,779 l1-dcache-refs           #  298.512 M/sec                    (scaled from 70.02%)
     8,861,956,159 instructions             #    0.30  insns per cycle
                                            #    0.93  stalled cycles per insn  (scaled from 84.27%)
     1,644,306,068 branches                 #  164.782 M/sec                    (scaled from 86.43%)
        74,778,443 branch-misses            #    4.55% of all branches          (scaled from 70.69%)
       9978.695711 task-clock               #    0.693 CPUs utilized

       14.404347983  seconds time elapsed

And color the result depending on the severity of cache-trashing.

Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-54gmz0zymaid84zcs7joq02p@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-26 20:32:24 +02:00
Ingo Molnar
c78df6c1d4 perf stat: Print branch misses warning colors
Print the missed-branches percentage with different warning level ASCII colors,
as the percentage passes the 5%/10%/20% thresholds.

These thresholds are set to relatively low levels, because on most CPUs even a
moderate percentage of branch-misses already shows up as a slowdown.

Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-ybqukg7p86leiup7gl03ecgk@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-26 20:04:58 +02:00
Ingo Molnar
a5d243d04a perf stat: Print stalled cycles warning colors
Print the stalled-cycles percentage with different warning level ASCII colors,
as the percentage passes the 25%/50%/75% thresholds.

Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-e25zz44rcms7mu9az4fu5zp0@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-26 20:04:58 +02:00
Ingo Molnar
f99844cb76 perf stat: Fix -nan% output in perf stat noise printouts
Before:

                 0 CPU-migrations           #    0.000 M/sec                    ( +-  -nan% )

After:

                 0 CPU-migrations           #    0.000 M/sec                    ( +-  0.00% )

Also factor out the noise printing function.

Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-z89h2v1bk1mikcbsf7e6v34q@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-26 20:04:57 +02:00
Ingo Molnar
1fc570ad89 perf stat: Add stalled cycles to the default output
The new default output looks like this:

 Performance counter stats for './loop_1b_instructions':

        236.010686 task-clock               #    0.996 CPUs utilized
                 0 context-switches         #    0.000 M/sec
                 0 CPU-migrations           #    0.000 M/sec
                99 page-faults              #    0.000 M/sec
       756,487,646 cycles                   #    3.205 GHz
       354,938,996 stalled-cycles           #   46.92% of all cycles are idle
     1,001,403,797 instructions             #    1.32  insns per cycle
                                            #    0.35  stalled cycles per insn
       100,279,773 branches                 #  424.895 M/sec
            12,646 branch-misses            #    0.013 % of all branches

        0.236902540  seconds time elapsed

We dropped cache-refs and cache-misses and added stalled-cycles - this is a
more generic "how well utilized is the CPU" metric.

If the stalled-cycles ratio is too high then more specific measurements can be
taken to figure out the source of the inefficiency.

Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-pbpl2l4mn797s69bclfpwkwn@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-26 20:04:57 +02:00
Ingo Molnar
481f988a01 perf stat: Add stalled cycles accounting, prettify the resulting output
Add stalled cycles accounting and use it to print the "cycles stalled per
instruction" value.

Also change the unit of the cycles output from M/sec to GHz - this is more
intuitive.

Prettify the output to:

 Performance counter stats for './loop_1b_instructions':

        239.775036 task-clock               #    0.997 CPUs utilized
       761,903,912 cycles                   #    3.178 GHz
       356,620,620 stalled-cycles           #   46.81% of all cycles are idle
     1,001,578,351 instructions             #    1.31  insns per cycle
                                            #    0.36  stalled cycles per insn
            14,782 cache-references         #    0.062 M/sec
             5,694 cache-misses             #   38.520 % of all cache refs

        0.240493656  seconds time elapsed

Also adjust the --repeat output to make the percentages align vertically:

 Performance counter stats for './loop_1b_instructions' (10 runs):

        236.096793 task-clock               #    0.997 CPUs utilized             ( +-   0.011% )
       756,553,086 cycles                   #    3.204 GHz                       ( +-   0.002% )
       354,942,692 stalled-cycles           #   46.92% of all cycles are idle    ( +-   0.008% )
     1,001,389,700 instructions             #    1.32  insns per cycle
                                            #    0.35  stalled cycles per insn   ( +-   0.000% )
            10,166 cache-references         #    0.043 M/sec                     ( +-   0.742% )
               468 cache-misses             #    4.608 % of all cache refs       ( +-  13.385% )

        0.236874136  seconds time elapsed   ( +- 0.01% )

Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-uapziqny39601apdmmhoz7hk@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-26 20:04:56 +02:00
Ingo Molnar
dcd9936a5a perf stat: Factor our shadow stats
Create update_shadow_stats() which is then used in both read_counter_aggr()
and read_counter().

This not only simplifies the code but also fixes a bug: HW_CACHE_REFERENCES
was not updated in read_counter().

Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-9uc55z3g88r47exde7zxjm6p@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-26 20:04:56 +02:00
Ingo Molnar
749141d926 perf stat: Make all displayed event names parseable as well
Right now we display this by default:

          0.202204 task-clock-msecs         #      0.282 CPUs
                 0 context-switches         #      0.000 M/sec
                 0 CPU-migrations           #      0.000 M/sec
                85 page-faults              #      0.420 M/sec

The task-clock-msecs event cannot actually be passed back as an
event name, the event name we recognize is 'task-clock'.

So change the output of the cpu-clock and task-clock events
to be idempotent.

( Units should be printed out in the right-side column, if needed. )

Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-lexrnbzy09asscgd4f7oac4i@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-26 20:04:55 +02:00
Ingo Molnar
ceb53fbf6d perf stat: Fail more clearly when an invalid modifier is specified
Currently we fail without printing any error message on "perf stat -e task-clock-msecs".

The reason is that the task-clock event is matched and the "-msecs" postfix is assumed
to be an event modifier - but is not recognized.

This patch changes the code to be more informative:

 $ perf stat -e task-clock-msecs true
 invalid event modifier: '-msecs'
 Run 'perf list' for a list of valid events and modifiers

And restructures the return value of parse_event_modifier() to allow
the printing of all variants of invalid event modifiers.

Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-wlaw3dvz1ly6wple8l52cfca@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-26 20:04:55 +02:00
Ingo Molnar
b908debd4e perf tools: Accept case-insensitive symbolic event variants
We currently fail on something like '-e CPU-migrations', with:

  invalid or unsupported event: 'CPU-migrations'

While 'CPU-migrations' is how we actually print out the event
in the default perf stat output:

 Performance counter stats for 'true':

          0.202204 task-clock-msecs         #      0.282 CPUs
                 0 context-switches         #      0.000 M/sec
                 0 CPU-migrations           #      0.000 M/sec

So change the matching to be case-insensitive.

Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-omcm3edjjtx83a4kh2e244se@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-26 20:04:55 +02:00
Ingo Molnar
d58f4c82fe perf stat: Print cache misses as percentage
Before:

       113,393,041 cache-references         #     83.636 M/sec
         7,052,454 cache-misses             #      5.202 M/sec

After:

       112,589,441 cache-references         #     87.925 M/sec
         6,556,354 cache-misses             #      5.823 %

misses/hits percentages are more expressive than absolute numbers
or rates.

(Also prettify the CPUs printout line to not have a trailing whitespace.)

Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-axm28f43x439bl41zkvfzd63@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-26 20:04:54 +02:00
Ingo Molnar
11ba2b85f5 perf stat: Print stalled cycles percentage
Print:

           611,527 cycles
           400,553 instructions             # (  0.71 instructions per cycle )
            77,809 stalled-cycles           # ( 12.71% of all cycles )

        0.000610987  seconds time elapsed

Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Link: http://lkml.kernel.org/n/tip-fd6x8r1cpyb6zhlrc4ix8m45@git.kernel.org
2011-04-26 20:04:54 +02:00
Ingo Molnar
94403f8863 perf events: Add stalled cycles generic event - PERF_COUNT_HW_STALLED_CYCLES
The new PERF_COUNT_HW_STALLED_CYCLES event tries to approximate
cycles the CPU does nothing useful, because it is stalled on a
cache-miss or some other condition.

Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-fue11vymwqsoo5to72jxxjyl@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-26 20:04:53 +02:00
Jiri Kosina
07f9479a40 Merge branch 'master' into for-next
Fast-forwarded to current state of Linus' tree as there are patches to be
applied for files that didn't exist on the old branch.
2011-04-26 10:22:59 +02:00
Jonathan Corbet
625f2a378e sched: Get rid of lock_depth
Neil Brown pointed out that lock_depth somehow escaped the BKL
removal work.  Let's get rid of it now.

Note that the perf scripting utilities still have a bunch of
code for dealing with common_lock_depth in tracepoints; I have
left that in place in case anybody wants to use that code with
older kernels.

Suggested-by: Neil Brown <neilb@suse.de>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20110422111910.456c0e84@bike.lwn.net
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-24 13:18:38 +02:00
David Ahern
9cbdb70209 perf script: improve validation of sample attributes for output fields
Check for required sample attributes using evsel rather than sample_type
in the session header. If the attribute for a default field is not
present for the event type (e.g., new command operating on file from
older kernel) the field is removed from the output list.

Expected event types must exist. For example, if a user specifies

  -f trace:time,trace -f sw:time,cpu,sym

the perf.data file must contain both tracepoints and software events
(ie., it is an error if either does not exist in the file).

Attribute checking is done once at the beginning of perf-script rather
than for each sample.

v1 -> v2:
- addressed comments from acme

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1302148460-570-1-git-send-email-daahern@cisco.com
Signed-off-by: David Ahern <daahern@cisco.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-04-20 07:27:52 -03:00
Arun Sharma
0817a6a3a4 perf script: Add support for PERF_TYPE_RAW
Useful for getting stack traces for hardware events not handled by
PERF_TYPE_HARDWARE.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Arun Sharma <asharma@fb.com>
Link: http://lkml.kernel.org/n/tip-qimdcdpekjqxuyqovy4kjusx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-04-19 11:08:58 -03:00
Michael Witten
f18568aae5 perf tools: git mv tools/perf/{features-tests.mak,config/}
Signed-off-by: Michael Witten <mfwitten@gmail.com>
Link: http://lkml.kernel.org/n/tip-a6zhefjayuounko1tk5sjji2@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-04-19 08:18:36 -03:00
Michael Witten
7fbd065f5a perf tools: Move `try-cc'
The `try-cc' user-defined function was in tools/perf/feature-tests.mak;
this commit moves it to tools/perf/config/utilities.mak.

Signed-off-by: Michael Witten <mfwitten@gmail.com>
Link: http://lkml.kernel.org/n/tip-bqhwcuxsrve0iodn6q4ejaoi@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-04-19 08:18:36 -03:00
Michael Witten
ced465c400 perf tools: Makefile: PYTHON{,_CONFIG} to bandage Python 3 incompatibility
Currently, Python 3 is not supported by perf's code; this
can cause the build to fail for systems that have Python 3
installed as the default python:

  python{,-config}

The Correct Solution is to write compatibility code so that
Python 3 works out-of-the-box.

However, users often have an ancillary Python 2 installed:

  python2{,-config}

Therefore, a quick fix is to allow the user to specify those
ancillary paths as the python binaries that Makefile should
use, thereby avoiding Python 3 altogether; as an added benefit,
the Python binaries may be installed in non-standard locations
without the need for updating any PATH variable.

This commit adds the ability to set PYTHON and/or PYTHON_CONFIG
either as environment variables or as make variables on the
command line; the paths may be relative, and usually only PYTHON
is necessary in order for PYTHON_CONFIG to be defined implicitly.
Some rudimentary error checking is performed when the user
explicitly specifies a value for any of these variables.

In addition, this commit introduces significantly robust makefile
infrastructure for working with paths and communicating with the
shell; it's currently only used for handling Python, but I hope
it will prove useful in refactoring the makefiles.

Thanks to:

  Raghavendra D Prabhu <rprabhu@wnohang.net>

for motivating this patch.

Acked-by: Raghavendra D Prabhu <rprabhu@wnohang.net>
Link: http://lkml.kernel.org/r/e987828e-87ec-4973-95e7-47f10f5d9bab-mfwitten@gmail.com
Signed-off-by: Michael Witten <mfwitten@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-04-19 08:18:36 -03:00
Michael Witten
3643b133f2 perf tools: Makefile: Clean up `python/perf.so' rule
There is no need for a subshell or an explicit `export';
as per the POSIX Shell Command Language specification:

  http://pubs.opengroup.org/onlinepubs/009695399/utilities/xcu_chap02.html#tag_02_09_01
  http://pubs.opengroup.org/onlinepubs/009695399/utilities/xcu_chap02.html#tag_02_10_02

It is only necessary to include the environment variable
assignment just before the command to be run.

Also, it is better to use single-quotes, because GNU make
might expand `$(BASIC_CFLAGS)' into something that the shell
could interpret within double-quotes.

Acked-by: Raghavendra D Prabhu <rprabhu@wnohang.net>
Link: http://lkml.kernel.org/n/tip-58n38o02ocuzrm9qh096hsf5@git.kernel.org
Signed-off-by: Michael Witten <mfwitten@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-04-19 08:18:36 -03:00
Arnaldo Carvalho de Melo
aeafcbaf4f perf symbols: Give more useful names to 'self' parameters
One more installment on an area that is mostly dormant.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-04-19 08:18:35 -03:00
Ingo Molnar
68d2cf25d3 Merge branch 'perf/urgent' into perf/core
Merge reason: we'll be queueing up dependent changes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-19 07:56:17 +02:00
Arnaldo Carvalho de Melo
5d2cd90922 perf evsel: Fix use of inherit
perf stat doesn't mmap and its perfectly fine for it to use task-bound
counters with inheritance.

So set the attr.inherit on the caller and leave the syscall itself to
validate it.

When the mmap fails perf_evlist__mmap will just emit a warning if this
is the failure reason.

Reported-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: http://lkml.kernel.org/r/20110414170121.GC3229@ghostprotocols.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-04-15 12:52:28 -03:00
Lin Ming
db9a9cbc81 perf hists browser: Fix seg fault when annotate null symbol
In hists browser, press hotkey 'a' to annotate current symbol.

Now it causes segment fault if 'a' is pressed on a null symbol.

Here are 2 small bugs:
- In perf_evsel__hists_browse, the condition check after 'a' is pressed
  is not correct, we should check ->sym instead of ->map.
- In symbol__tui_annotate we must check whether sym is NULL or not
  before getting annotation structure.

This patch fixes above 2 small bugs.

Link: http://lkml.kernel.org/r/1302244286.4106.36.camel@minggr.sh.intel.com
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-04-15 12:51:49 -03:00
Justin P. Mattock
6eab04a876 treewide: remove extra semicolons
Signed-off-by: Justin P. Mattock <justinmattock@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-04-10 17:01:05 +02:00
Eric Dumazet
621d26567f perf: Fix a build error with some GCC versions
Fix this:

 util/cgroup.c: In function ‘open_cgroup’:
 util/cgroup.c:16:16: error: ‘saved_ptr’ may be used uninitialized in this function
 util/cgroup.c:16:16: note: ‘saved_ptr’ was declared here

Apparently newer GCC (4.6) can figure out that this variable is properly
initialized - but some versions of GCC (such as 4.5.2) need help.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-08 17:40:21 +02:00
Linus Torvalds
8b9686ff4d Merge branches 'x86-fixes-for-linus', 'sched-fixes-for-linus', 'timers-fixes-for-linus', 'irq-fixes-for-linus' and 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86-32, fpu: Fix FPU exception handling on non-SSE systems
  x86, hibernate: Initialize mmu_cr4_features during boot
  x86-32, NUMA: Fix ACPI NUMA init broken by recent x86-64 change
  x86: visws: Fixup irq overhaul fallout

* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: Clean up rebalance_domains() load-balance interval calculation

* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86/mrst/vrtc: Fix boot crash in mrst_rtc_init()
  rtc, x86/mrst/vrtc: Fix boot crash in rtc_read_alarm()

* 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  genirq: Fix cpumask leak in __setup_irq()

* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf probe: Fix listing incorrect line number with inline function
  perf probe: Fix to find recursively inlined function
  perf probe: Fix multiple --vars options behavior
  perf probe: Fix to remove redundant close
  perf probe: Fix to ensure function declared file
2011-04-07 12:12:58 -07:00
Linus Torvalds
42933bac11 Merge branch 'for-linus2' of git://git.profusion.mobi/users/lucas/linux-2.6
* 'for-linus2' of git://git.profusion.mobi/users/lucas/linux-2.6:
  Fix common misspellings
2011-04-07 11:14:49 -07:00
Masami Hiramatsu
1d46ea2a6a perf probe: Fix listing incorrect line number with inline function
Fix a bug showing incorrect line number when a probe is put on the head of an
inline function. This patch updates find_perf_probe_point() and introduces new
rules to get correct line number.

 - If debuginfo doesn't have a correct file name, we shouldn't return line
   number too, because, without file name, line number is meaningless.

 - If the address is in a function, it stores the function name and the offset
   from the function entry.

   - If the address is on a line, it tries to get the relative line number from
     the function entry line, except for the address is same as the entry
     address of the function (in this case, the relative line number should
     be 0).

     - If the address is in an inline function entry (call-site), it uses the
       inline function call line number as the line on which the address is.

   - If the address is in an inline function body, it stores the inline
     function name and offset from the inline function call site instead of the
     (non-inlined) function.

Cc: 2nddept-manager@sdl.hitachi.co.jp
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20110330092605.2132.11629.stgit@ltc236.sdl.hitachi.co.jp>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-04-05 15:38:12 -03:00
Masami Hiramatsu
1d878083c2 perf probe: Fix to find recursively inlined function
Fix die_find_inlinefunc() to return correct innermost inlined function
at given address. Without this fix, it returns the outermost inlined
function.

Cc: 2nddept-manager@sdl.hitachi.co.jp
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20110330092559.2132.78634.stgit@ltc236.sdl.hitachi.co.jp>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-04-05 15:36:47 -03:00
Masami Hiramatsu
cc446446ff perf probe: Fix multiple --vars options behavior
Fix a bug that perf-probe fails to initialize libdwfl and shows incorrect error
when user gives multiple --vars options.

Cc: 2nddept-manager@sdl.hitachi.co.jp
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20110330092553.2132.42691.stgit@ltc236.sdl.hitachi.co.jp>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-04-05 15:36:04 -03:00
Masami Hiramatsu
f0c4801a17 perf probe: Fix to remove redundant close
Since dwfl_end() closes given fd with dwfl, caller doesn't need to close its fd
when finishing process.

Cc: 2nddept-manager@sdl.hitachi.co.jp
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20110330092547.2132.93728.stgit@ltc236.sdl.hitachi.co.jp>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-04-05 15:35:16 -03:00
Masami Hiramatsu
7d21635ac5 perf probe: Fix to ensure function declared file
Fix to ensure function declared file matches given file name. This fixes
a potential bug.

As I've commented on Lin Ming's fastpath enhancement, decl_file should
be checked on each probe point if user gives a probe point as func@file.

Cc: 2nddept-manager@sdl.hitachi.co.jp
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20110330092541.2132.3584.stgit@ltc236.sdl.hitachi.co.jp>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-04-05 15:34:53 -03:00
Lucas De Marchi
25985edced Fix common misspellings
Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2011-03-31 11:26:23 -03:00
Frederic Weisbecker
800cd25c12 perf: mmap 512 kiB by default
The default setting of perf record is to mmap 128 pages if the user
did not override with -m.

However the page size may vary accross different architecture
settings, giving different default size between each.

Moreover the kernel side still has a default max number of mlocked
pages of 512 kiB + 1 page for unprivileged users. 128 + 1 pages
with page size > 4096 overlaps this threshold.

Thus, better adapt to this limitation and set the default number of
pages to fit those 512 kiB + 1 page.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <1301535324-9735-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-31 13:02:55 +02:00
Arnaldo Carvalho de Melo
176fcc5c5f perf script: Add more documentation about the -f/--fields parameters
Using the commit log for 2c9e45f.

Cc: David Ahern <daahern@cisco.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-30 15:30:43 -03:00
David Ahern
2c9e45f7a2 perf script: If type not given fields apply to all event types
Allow:
  perf script -f <fields>

to be equivalent to:
  perf script -f trace:<fields> -f sw:<fields> -f hw:<fields>

i.e., the specified fields apply to all event types if the type string
is not given.

The field (-f) arguments are processed in the order received. A later
usage can reset a prior request. e.g.,

  -f trace: -f comm,tid,time,sym

The first -f suppresses trace events (field list is ""), but then the second
invocation sets the fields to comm,tid,time,sym. In this case a warning is
given to the user:

  "Overriding previous field request for all events."

Alternativey, consider the order:

  -f comm,tid,time,sym -f trace:

The first -f sets the fields for all events and the second -f suppresses trace
events. The user is given a warning message about the override, and the result
of the above is that only S/W and H/W events are displayed with the given
fields.

For the 'wildcard' option if a user selected field is invalid for an event
type, a message is displayed to the user that the option is ignored for that
type. For example:

  perf script -f comm,tid,trace 2>&1 | less
  'trace' not valid for hardware events. Ignoring.
  'trace' not valid for software events. Ignoring.

Alternatively, if the type is given an invalid field is specified it is an
error. For example:

    perf script -v -f sw:comm,tid,trace 2>&1 | less
    'trace' not valid for software events.

At this point usage is displayed, and perf-script exits.

Finally, a user may not set fields to none for all event types.
i.e., -f "" is not allowed.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org
LPU-Reference: <1300377801-27246-1-git-send-email-daahern@cisco.com>
Signed-off-by: David Ahern <daahern@cisco.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-30 14:43:11 -03:00