Commit Graph

401580 Commits

Author SHA1 Message Date
Steven Rostedt
0497a9ebaf tools lib traceevent: Add direct access to dynamic arrays
Jiri Olsa was writing a plugin for the cfg80211_tx_mlme_mgmt trace
event, and was not able to get the implemented function working.
The event's print fmt looks like:

   "netdev:%s(%d), ftype:0x%.2x", REC->name, REC->ifindex,
            __le16_to_cpup((__le16 *)__get_dynamic_array(frame))

As there's no helper function for __le16_to_cpup(), Jiri was creating one
with a plugin. But unfortunately, it would not work even though he set
up the plugin correctly.

The problem is that the function parameters do not handle the helper
function "__get_dynamic_array()", and that passes in a NULL pointer.

Adding PRINT_DYNAMIC_ARRAY direct support to eval_num_arg() allows the
use of __get_dynamic_array() in function parameters.

Reported-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20131111160810.0ba9df7d@gandalf.local.home
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-12 17:23:44 -03:00
Arnaldo Carvalho de Melo
602ad878d4 perf target: Shorten perf_target__ to target__
Getting unwieldly long, for this app domain should be descriptive enough
and the use of __ to separate the class from the method names should
help with avoiding clashes with other code bases.

Reported-by: David Ahern <dsahern@gmail.com>
Suggested-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20131112113427.GA4053@ghostprotocols.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-12 16:51:03 -03:00
Adrian Hunter
48095b721c perf tests: Handle throttle events in 'object code reading' test
Unhandled events cause an error that fails the test, fix it.

Reported-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/5281DFE5.3000909@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-12 16:37:54 -03:00
David Ahern
33c2dcfdfe perf evlist: Refactor mmap_pages parsing
Logic will be re-used for the out-pages argument for mmap based writes
in perf-record.

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1384267617-3446-4-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-12 16:33:22 -03:00
David Ahern
9639837e95 perf evlist: Round mmap pages to power 2 - v2
Currently perf requires the -m / --mmap_pages option to be a power of 2.

To be more user friendly perf should automatically round this up to the
next power of 2.

Currently:
  $ perf record -m 3 -a -- sleep 1
  --mmap_pages/-m value must be a power of two.sleep: Terminated

With patch:
  $ perf record -m 3 -a -- sleep 1
  rounding mmap pages size to 16384 (4 pages)
  ...

v2: Add bytes units to rounding message per Ingo's request. Other
    suggestions (e.g., prefixing INFO) should be addressed by wrapping
    pr_info to catch all instances.

Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1384267617-3446-3-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-12 16:31:53 -03:00
David Ahern
8973504be7 perf record: Fix segfault with --no-mmap-pages
Adrian reported a segfault when using --no-out-pages:

$ tools/perf/perf record -vv --no-out-pages uname
Segmentation fault (core dumped)

The same occurs with --no-mmap-pages. Fix by checking that str is
non-NULL before parsing it.

Signed-off-by: David Ahern <dsahern@gmail.com>
Reported-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1384267617-3446-2-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-12 16:30:54 -03:00
David Ahern
fd2eabaf16 perf trace: Add summary only option
Per request from Pekka make --summary a summary only option meaning do
not show the individual system calls. Add another option to see all
syscalls along with the summary. In addition use 's' and 'S' as
shortcuts for the options.

Requested-by: Pekka Enberg <penberg@kernel.org>
Signed-off-by: David Ahern <dsahern@gmail.com>
Tested-by: Pekka Enberg <penberg@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pekka Enberg <penberg@kernel.org>
Link: http://lkml.kernel.org/r/1384273875-3751-1-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-12 16:24:38 -03:00
Pekka Enberg
99ff715054 perf trace: Simplify '--summary' output
The output of 'perf trace --summary' tries to be too cute with
formatting and makes it very hard to read.  Simplify it in the spirit of
"strace -c":

[penberg@localhost libtrading]$ perf trace -a --duration 10000 --summary -- sleep 1
^C
 Summary of events:

 dbus-daemon (555), 10 events, 0.0%, 0.000 msec

                                                    msec/call
   syscall            calls      min      avg      max stddev
   --------------- -------- -------- -------- -------- ------
   sendmsg                2    0.002    0.005    0.008  55.00
   recvmsg                2    0.002    0.003    0.005  44.00
   epoll_wait             1    0.000    0.000    0.000   0.00

 NetworkManager (667), 56 events, 0.0%, 0.000 msec

                                                    msec/call
   syscall            calls      min      avg      max stddev
   --------------- -------- -------- -------- -------- ------
   poll                   2    0.000    0.002    0.003 100.00
   sendmsg               10    0.004    0.007    0.016  15.41
   recvmsg               16    0.002    0.003    0.005   8.24

 zfs-fuse (669), 4 events, 0.0%, 0.000 msec

                                                    msec/call
   syscall            calls      min      avg      max stddev
   --------------- -------- -------- -------- -------- ------
   futex                  2    0.000    0.001    0.002 100.00

Signed-off-by: Pekka Enberg <penberg@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Link: http://lkml.kernel.org/r/1384267334-18953-1-git-send-email-penberg@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-12 13:00:38 -03:00
Pekka Enberg
7f7a4138c6 perf trace: Change syscall summary duration order
Switch duration order to minimum, average, maximum for the '--summary'
command line option because it's more natural to read.

Signed-off-by: Pekka Enberg <penberg@kernel.org>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Link: http://lkml.kernel.org/r/1384265410-12344-1-git-send-email-penberg@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-12 13:00:38 -03:00
Adrian Hunter
3fe2130523 perf tests: Compensate lower sample freq with longer test loop
Doesn't work for me:

./perf test -v 19
19: Test software clock events have valid period values    :
--- start ---
mmap size 528384B
mmap size 528384B
All (0) samples have period value of 1!
---- end ----
Test software clock events have valid period values: FAILED!

Compensate the lower freq introduced in 67c1e4a53b with a longer loop,

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/5281D3B8.2030104@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-12 13:00:37 -03:00
Namhyung Kim
003824e8c2 perf trace: Fix segfault on perf trace -i perf.data
When replaying a previous record session, it'll get a segfault since it
doesn't initialize raw_syscalls enter/exit tracepoint's evsel->priv for
caching the format fields.

So fix it by properly initializing sys_enter/exit evsels that comes from
reading the perf.data file header.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1384237500-22991-2-git-send-email-namhyung@kernel.org
[ Split the syscall tp field caching part in the previous patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-12 13:00:37 -03:00
Namhyung Kim
96695d4402 perf trace: Separate tp syscall field caching into init routine to be reused
We need to set this in evsels coming out of a perf.data file header, not
just for new ones created for live sessions.

So separate the code that caches the syscall entry/exit tracepoint
format fields into a new function that will be used in the next
changeset.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20131112115700.GC4053@ghostprotocols.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-12 13:00:36 -03:00
Namhyung Kim
73faab3a42 perf trace: Beautify fifth argument of mmap() as fd
The fifth argument of mmap syscall is fd and it often contains -1 as a
value for anon mappings.  Without this patch it doesn't show the file
name as well as it shows -1 as 4294967295.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1384237500-22991-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-12 13:00:23 -03:00
Arnaldo Carvalho de Melo
67c1e4a53b perf tests: Use lower sample_freq in sw clock event period test
We were using it at 10 kHz, which doesn't work in machines where somehow
the max freq was auto reduced by the kernel:

[root@ssdandy ~]# perf test 19
19: Test software clock events have valid period values    : FAILED!
[root@ssdandy ~]# perf test -v 19
19: Test software clock events have valid period values    :
--- start ---
Couldn't open evlist: Invalid argument
---- end ----
Test software clock events have valid period values: FAILED!
[root@ssdandy ~]#

[root@ssdandy ~]# cat /proc/sys/kernel/perf_event_max_sample_rate
7000

Reducing it to 500 Hz should be good enough for this test and also
shouldn't affect what it is testing.

But warn the user if it fails, informing the knob and the freq tried.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-548rhj1uo6xbwnxa95kw3hqe@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-11 16:43:34 -03:00
Arnaldo Carvalho de Melo
d0b849e9bc perf tests: Check return of perf_evlist__open sw clock event period test
We were not checking if we successfully opened the counters, i.e. if
sys_perf_event_open worked, when it doesn't in this test, we were
continuing anyway and then segfaulting when trying to access the file
descriptor array, that at that point had been freed in perf_evlist__open
error path:

[root@ssdandy ~]# perf test -v 19
19: Test software clock events have valid period values    :
--- start ---
Segmentation fault (core dumped)
[root@ssdandy ~]#

Do the check and bail out instead.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-6qy8ljkn0e9hm7bh7keo5z68@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-11 16:28:42 -03:00
David Ahern
a9986fad66 perf record: Move existing write_output into helper function
Code move only; no logic changes. In preparation for the mmap based
output option in the next patch.

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1383884605-30968-2-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-11 15:56:40 -03:00
Adrian Hunter
410f178603 perf record: Use correct return type for write()
write() returns a 'ssize_t' not an 'int'.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1383906470-21002-1-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-11 15:56:40 -03:00
Namhyung Kim
7524f63b99 perf tools: Prevent condition that all sort keys are elided
If given sort keys are all elided there'll be no output except for the
overhead column - actually the TUI shows a noisy output.  In this case
it'd be better to show up the sort keys rather than elide.

Before:

  $ perf report -s comm -c perf
  (...)
  # Overhead
  # ........
  #
     100.00%

After:

  $ perf report -s comm -c perf
  (...)
  # Overhead  Command
  # ........  .......
  #
     100.00%     perf

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1383900822-14609-1-git-send-email-namhyung@kernel.org
[ Us curly braces around multi-line statements, as requested by Ingo Molnar ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-11 15:56:40 -03:00
Arnaldo Carvalho de Melo
a33fbd56ec perf machine: Simplify synthesize_threads method
Several tools (top, kvm) don't need to be called back to process each of
the syntheiszed records, instead relying on the machine__process_event
function to change the per machine data structures that represent
threads and mmaps, so provide a way to ask for this common idiom.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-pusqibp8n3c4ynegd1frn4zd@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-11 15:56:40 -03:00
Arnaldo Carvalho de Melo
58d925dced perf machine: Introduce synthesize_threads method out of open coded equivalent
Further simplifications to be done on following patch, as most tools
don't use the callback, using instead just the canned
machine__process_event one.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-r1m0vuuj3cat4bampno9yc8d@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-11 15:56:39 -03:00
Arnaldo Carvalho de Melo
62605dc50c perf record: Synthesize non-exec MMAP records when --data used
When perf_event_attr.mmap_data is set the kernel will generate
PERF_RECORD_MMAP events when non-exec (data, SysV mem) mmaps are
created, so we need to synthesize from /proc/pid/maps for existing
threads, as we do for exec mmaps.

Right now just 'perf record' does it, but any other tool that uses
perf_event__synthesize_thread(s|map) can request it.

Reported-by: Don Zickus <dzickus@redhat.com>
Tested-by: Don Zickus <dzickus@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Bill Gray <bgray@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Fowles <rfowles@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-ihwzraikx23ian9txinogvv2@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-11 15:56:39 -03:00
Arnaldo Carvalho de Melo
ef503831d8 perf evsel: Remove idx parm from constructor
Most uses of the evsel constructor are followed by a call to
perf_evlist__add with an idex of evlist->nr_entries, so make rename
the current constructor to perf_evsel__new_idx and remove the need
for passing the constructor for the common case.

We still need the new_idx variant because the way groups are handled,
with evsel->nr_members holding the number of entries in an evlist,
partitioning the evlist into sublists inside a single linked list.

This asks for a clarifying refactoring, but for now simplify the non
parser cases, so that tool writers don't have to bother with evsel idx
setting.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-zy9tskx6jqm2rmw7468zze2a@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-11 15:56:39 -03:00
Patrick Palka
d53e57d039 perf ui tui progress: Don't force a refresh during progress update
Each call to tui_progress__update() would forcibly refresh the entire
screen.  This is somewhat inefficient and causes noticable flickering
during the startup of perf-report, especially on large/slow terminals.

It looks like the force-refresh in tui_progress__update() serves no
purpose other than to clear the screen so that the progress bar of a
previous operation does not subsume that of a subsequent operation.  But
we can do just that in a much more efficient manner by clearing only the
region that a previous progress bar may have occupied before repainting
the new progress bar.  Then the force-refresh could be removed with no
change in visuals.

This patch disables the slow force-refresh in tui_progress__update() and
instead calls SLsmg_fill_region() on the entire area that the progress
bar may occupy before repainting it.  This change makes the startup of
perf-report much faster and appear much "smoother".

It turns out that this was a big bottleneck in the startup speed of
perf-report -- with this patch, perf-report starts up ~2x faster (1.1s
vs 0.55s) on my machines.  (These numbers were measured by running "time
perf report" on an 8MB perf.data and pressing 'q' immediately.)

Signed-off-by: Patrick Palka <patrick@parcs.ath.cx>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1382747149-9716-1-git-send-email-patrick@parcs.ath.cx
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-11 15:56:39 -03:00
Ingo Molnar
caea6cf521 Merge branch 'uprobes/core' of git://git.kernel.org/pub/scm/linux/kernel/git/oleg/misc into perf/core
Pull uprobes fixes from Oleg Nesterov.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-11 09:44:16 +01:00
Oleg Nesterov
2ded0980a6 uprobes: Fix the memory out of bound overwrite in copy_insn()
1. copy_insn() doesn't look very nice, all calculations are
   confusing and it is not immediately clear why do we read
   the 2nd page first.

2. The usage of inode->i_size is wrong on 32-bit machines.

3. "Instruction at end of binary" logic is simply wrong, it
   doesn't handle the case when uprobe->offset > inode->i_size.

   In this case "bytes" overflows, and __copy_insn() writes to
   the memory outside of uprobe->arch.insn.

   Yes, uprobe_register() checks i_size_read(), but this file
   can be truncated after that. All i_size checks are racy, we
   do this only to catch the obvious mistakes.

Change copy_insn() to call __copy_insn() in a loop, simplify
and fix the bytes/nbytes calculations.

Note: we do not care if we read extra bytes after inode->i_size
if we got the valid page. This is fine because the task gets the
same page after page-fault, and arch_uprobe_analyze_insn() can't
know how many bytes were actually read anyway.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2013-11-09 17:05:43 +01:00
Oleg Nesterov
70d7f98722 uprobes: Fix the wrong usage of current->utask in uprobe_copy_process()
Commit aa59c53fd4 "uprobes: Change uprobe_copy_process() to dup
xol_area" has a stupid typo, we need to setup t->utask->vaddr but
the code wrongly uses current->utask.

Even with this bug dup_xol_work() works "in practice", but only
because get_unmapped_area(NULL, TASK_SIZE - PAGE_SIZE) likely
returns the same address every time.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2013-11-09 17:05:41 +01:00
Ingo Molnar
4d9218daae perf/core improvements and fixes:
. Fix version when building out of tree, as when using one of these:
 
   $ make help | grep perf
     perf-tar-src-pkg    - Build perf-3.12.0.tar source tarball
     perf-targz-src-pkg  - Build perf-3.12.0.tar.gz source tarball
     perf-tarbz2-src-pkg - Build perf-3.12.0.tar.bz2 source tarball
     perf-tarxz-src-pkg  - Build perf-3.12.0.tar.xz source tarball
   $
 
   from David Ahern.
 
 . Don't relookup fields by name in each sample in 'trace'.
 
 . 'perf record' code cleanups, from David Ahern.
 
 . Remove unneeded include in sort.h, from Rodrigo Campos.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.14 (GNU/Linux)
 
 iQIcBAABAgAGBQJSe6sYAAoJENZQFvNTUqpAhPQQAJVMtHBgprppa5DxMPr6WJD6
 mhTEQmg6fXGBxikol1vUsyM2QTu3QJab9dkFol9SUyzhVYO2cwJ3Pe5C26kwrt/h
 BagdTWJJkvsxn/wd8zXc8R/GaTYYz7uF3QgQboJ6puMJYc/VVnOIb6Ab6d6EFxtB
 85KCm6FmEEwYu81z3NQUjOloByeXifr+fCnNSgha8riUW9eCiTmb0Q2L+HcBNjcU
 WmRJMYV60MwRqF/zNJwisH4w+zF0kpClLu6vuQ/QCkA0x943D4rgJ5EhFZOh8JYy
 bmZe9CjSuFF+08/O4o7ITGOFwxqyu9vEmCq8y51VC8H/wEQlIxeZZMCqh4Hjfa7b
 GXKXvhX5+VLmJEZBXxrCdS8i8UdlPdyoywegrz/JmjRzsYpCqAq/Hc2mHuC8CyC6
 iOeuGg2mo7NZukF0ODS5LohUUZL+G1ZI6IjomgqaISsLyKre2lc5to+4+aY8gUif
 Xb/9nNUTlw1ye42W1aZ0EWAgFQFz4jMKl36ZqdoBxBxwg8wRxJOs3PSLK44T5VlR
 IrYzosxK7iXUnpCGboFbwy37AQJ/XRtBZAoDV4g8i+LnQs9vInQbWBdySHxv62Ov
 8oNBQiCL0L0rZOrP+xvnZSxMXJ2+FTO4OGWzwZf+U4e0IWctWfU6NZpxy0U8Hi9Q
 TQsXQ4Dmo0yVimtTj/Pq
 =u5JX
 -----END PGP SIGNATURE-----

Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

  * Fix version when building out of tree, as when using one of these:

    $ make help | grep perf
      perf-tar-src-pkg    - Build perf-3.12.0.tar source tarball
      perf-targz-src-pkg  - Build perf-3.12.0.tar.gz source tarball
      perf-tarbz2-src-pkg - Build perf-3.12.0.tar.bz2 source tarball
      perf-tarxz-src-pkg  - Build perf-3.12.0.tar.xz source tarball
    $

    from David Ahern.

  * Don't relookup fields by name in each sample in 'trace',
    by Arnaldo Carvalho de Melo.

  * 'perf record' code cleanups, from David Ahern.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-07 16:24:57 +01:00
Rodrigo Campos
8ce000e838 perf tools: Remove unneeded include
There is no point in sort.h including itself.

The include was added when the file was created, in commit "perf tools:
Create util/sort.and use it" (dd68ada2d) and added a include to "sort.h"
in lot of files (all the files that started using the file). It was
probably added by mistake on sort.h too.

Signed-off-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1383776454-10595-1-git-send-email-rodrigo@sdfg.com.ar
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-07 11:51:19 -03:00
David Ahern
7ab75cffd6 perf record: Remove post_processing_offset variable
Duplicates the data_offset from header in the session.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1383763297-27066-4-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-07 11:01:59 -03:00
David Ahern
f34b9001f9 perf record: Remove advance_output function
1 line function with only 1 user; might as well embed directly.

Signed-off-by: David Ahern <dsahern@gmail.com>
Suggested-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1383763297-27066-3-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-07 10:43:15 -03:00
David Ahern
57706abc19 perf record: Refactor feature handling into a separate function
Code move only. No logic changes.

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1383763297-27066-2-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-07 10:42:26 -03:00
Arnaldo Carvalho de Melo
77170988ff perf trace: Don't relookup fields by name in each sample
Instead do the lookups just when creating the tracepoints, initially for
the most common, raw_syscalls:sys_{enter,exit}.

It works by having evsel->priv have a per tracepoint structure with
entries for the fields, for direct access, with the offset and a
function to get the value from the sample, doing the swap if needed.

Using a simple workload that does M millions write syscalls, we go from:

 # perf stat -i -e cycles /tmp/oldperf trace ./sc_hello 100 > /dev/null

 Performance counter stats for '/tmp/oldperf trace ./sc_hello 100':

     8,366,771,459 cycles

       2.668025928 seconds time elapsed

 # perf stat -i -e cycles perf trace ./sc_hello 100 > /dev/null

 Performance counter stats for 'perf trace ./sc_hello 100':

     8,345,187,650 cycles

       2.631748425 seconds time elapsed

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-eyfhvoo510a5i10b27dnvm88@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-07 10:40:47 -03:00
David Ahern
a614d01bdd perf tools: Fix version when building out of tree
When building perf out of tree:

  $ make perf-tar-src-pkg
  $ tar -xf perf-<ver>.tar -C /tmp
  $ cd /tmp/perf<ver>
  $ make -C tools/perf

you get this warning message:
    make[1]: *** No rule to make target `kernelversion'.  Stop.

Fix it by saving the perf version in the tar file and using that for the
out of tree builds.

v2: removed short form request and fixed up version string from usual output.

Signed-off-by: David Ahern <dsahern@gmail.com>
Suggested-by: Ingo Molnar <mingo@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1383753335-25782-1-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-07 10:40:47 -03:00
Arnaldo Carvalho de Melo
744a971940 perf evsel: Ditch evsel->handler.data field
Not needed since this cset:

  fcf65bf149: perf evsel: Cache associated event_format

So lets trim this struct a bit.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-j8setslokt0goiwxq9dogzqm@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2013-11-07 10:40:47 -03:00
Ingo Molnar
8a4d0b56b0 Merge branch 'uprobes/core' of git://git.kernel.org/pub/scm/linux/kernel/git/oleg/misc into perf/core
Pull uprobes updates from Oleg Nesterov:

 " [...] this way the upcoming ARM port doesn't (almost) need
   changes outside of arch/arm and thus it would be simpler to
   route everything via the ARM trees. "

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-07 08:46:13 +01:00
Oleg Nesterov
f72d41fa90 uprobes: Export write_opcode() as uprobe_write_opcode()
set_swbp() and set_orig_insn() are __weak, but this is pointless
because write_opcode() is static.

Export write_opcode() as uprobe_write_opcode() for the upcoming
arm port, this way it can actually override set_swbp() and use
__opcode_to_mem_arm(bpinsn) instead if UPROBE_SWBP_INSN.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2013-11-06 20:00:09 +01:00
Oleg Nesterov
8a8de66c4f uprobes: Introduce arch_uprobe->ixol
Currently xol_get_insn_slot() assumes that we should simply copy
arch_uprobe->insn[] which is (ignoring arch_uprobe_analyze_insn)
just the copy of the original insn.

This is not true for arm which needs to create another insn to
execute it out-of-line.

So this patch simply adds the new member, ->ixol into the union.
This doesn't make any difference for x86 and powerpc, but arm
can divorce insn/ixol and initialize the correct xol insn in
arch_uprobe_analyze_insn().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2013-11-06 20:00:05 +01:00
Oleg Nesterov
736e89d9f7 uprobes: Kill module_init() and module_exit()
Turn module_init() into __initcall() and kill module_exit().

This code can't be compiled as a module so these module_*()
calls only add the confusion, especially if arch-dependant
code needs its own initialization hooks.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2013-11-06 19:59:50 +01:00
David A. Long
3820b4d278 uprobes: Move function declarations out of arch
Move the function declarations from the arch headers to the common
header, since only the function bodies are architecture-specific.
These changes are from Vincent Rabin's uprobes patch.

[ oleg: update arch/powerpc/include/asm/uprobes.h ]

Signed-off-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: David A. Long <dave.long@linaro.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2013-11-06 19:59:37 +01:00
Yan, Zheng
f891d8cfb8 perf/x86/intel: Add Ivy Bridge-EP uncore IRP box support
Unlike other uncore boxes, IRP boxes live in PCI buses with no UBOX
device. For PCI bus without UBOX device, we find the next bus that
has UBOX device and use its 'bus to socket' mapping.

Besides the counter/control registers in IRP boxes are not properly
aligned.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: eranian@google.com
Cc: "Yan Zheng" <zheng.z.yan@intel.com>
Link: http://lkml.kernel.org/r/1383197815-17706-2-git-send-email-zheng.z.yan@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 12:34:31 +01:00
Yan, Zheng
d1e8f4a836 perf/x86/intel/uncore: Add filter support for IvyBridge-EP QPI boxes
The encoding for filter registers of IvyBridge-EP uncore QPI boxes is
completely the same as SandyBridge-EP.

Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: eranian@google.com
Cc: "Yan Zheng" <zheng.z.yan@intel.com>
Link: http://lkml.kernel.org/r/1383197815-17706-1-git-send-email-zheng.z.yan@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 12:34:29 +01:00
Oleg Nesterov
c7e548b45c perf: Factor out strncpy() in perf_event_mmap_event()
While this is really minor, but strncpy() does the unnecessary
zero-padding till the end of tmp[16] and it is called every time
we are going to use the string literal.

Turn these strncpy()'s into the single strlcpy() under the new
label, saves 72 bytes.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20131017182417.GA17753@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 12:34:28 +01:00
Peter Zijlstra
a94d342b9c tools/perf: Add required memory barriers
To match patch bf378d341e ("perf: Fix perf ring buffer memory
ordering") change userspace to also adhere to the ordering outlined.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Michael Neuling <mikey@neuling.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: james.hogan@imgtec.com
Cc: Vince Weaver <vince@deater.net>
Cc: Victor Kaplansky <VICTORK@il.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Michael Ellerman <michael@ellerman.id.au>
Link: http://lkml.kernel.org/r/20131030104246.GH16117@laptop.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 12:34:26 +01:00
Peter Zijlstra
0a196848ca perf: Fix arch_perf_out_copy_user default
The arch_perf_output_copy_user() default of
__copy_from_user_inatomic() returns bytes not copied, while all other
argument functions given DEFINE_OUTPUT_COPY() return bytes copied.

Since copy_from_user_nmi() is the odd duck out by returning bytes
copied where all other *copy_{to,from}* functions return bytes not
copied, change it over and ammend DEFINE_OUTPUT_COPY() to expect bytes
not copied.

Oddly enough DEFINE_OUTPUT_COPY() already returned bytes not copied
while expecting its worker functions to return bytes copied.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: will.deacon@arm.com
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20131030201622.GR16117@laptop.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 12:34:25 +01:00
Peter Zijlstra
394570b793 perf: Update a stale comment
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Michael Neuling <mikey@neuling.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: james.hogan@imgtec.com
Cc: Vince Weaver <vince@deater.net>
Cc: Victor Kaplansky <VICTORK@il.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Anton Blanchard <anton@samba.org>
Link: http://lkml.kernel.org/n/tip-9s5mze78gmlz19agt39i8rii@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 12:34:23 +01:00
Peter Zijlstra
524feca5e9 perf: Optimize perf_output_begin() -- address calculation
Rewrite the handle address calculation code to be clearer.

Saves 8 bytes on x86_64-defconfig.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Michael Neuling <mikey@neuling.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: james.hogan@imgtec.com
Cc: Vince Weaver <vince@deater.net>
Cc: Victor Kaplansky <VICTORK@il.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Anton Blanchard <anton@samba.org>
Link: http://lkml.kernel.org/n/tip-3trb2n2henb9m27tncef3ag7@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 12:34:22 +01:00
Peter Zijlstra
d20a973f46 perf: Optimize perf_output_begin() -- lost_event case
Avoid touching the lost_event and sample_data cachelines twince. Its
not like we end up doing less work, but it might help to keep all
accesses to these cachelines in one place.

Due to code shuffle, this looses 4 bytes on x86_64-defconfig.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Michael Neuling <mikey@neuling.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: james.hogan@imgtec.com
Cc: Vince Weaver <vince@deater.net>
Cc: Victor Kaplansky <VICTORK@il.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Anton Blanchard <anton@samba.org>
Link: http://lkml.kernel.org/n/tip-zfxnc58qxj0eawdoj31hhupv@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 12:34:21 +01:00
Peter Zijlstra
85f59edf96 perf: Optimize perf_output_begin()
There's no point in re-doing the memory-barrier when we fail the
cmpxchg(). Also placing it after the space reservation loop makes it
clearer it only separates the userpage->tail read from the data
stores.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Michael Neuling <mikey@neuling.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: james.hogan@imgtec.com
Cc: Vince Weaver <vince@deater.net>
Cc: Victor Kaplansky <VICTORK@il.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Anton Blanchard <anton@samba.org>
Link: http://lkml.kernel.org/n/tip-c19u6egfldyx86tpyc3zgkw9@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 12:34:20 +01:00
Peter Zijlstra
c72b42a3dd perf: Add unlikely() to the ring-buffer code
Add unlikely() annotations to 'slow' paths:

When having a sampling event but no output buffer; you have bigger
issues -- also the bail is still faster than actually doing the work.

When having a sampling event but a control page only buffer, you have
bigger issues -- again the bail is still faster than actually doing
work.

Optimize for the case where you're not loosing events -- again, not
doing the work is still faster but make sure that when you have to
actually do work its as fast as possible.

The typical watermark is 1/2 the buffer size, so most events will not
take this path.

Shrinks perf_output_begin() by 16 bytes on x86_64-defconfig.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Michael Neuling <mikey@neuling.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: james.hogan@imgtec.com
Cc: Vince Weaver <vince@deater.net>
Cc: Victor Kaplansky <VICTORK@il.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Anton Blanchard <anton@samba.org>
Link: http://lkml.kernel.org/n/tip-wlg3jew3qnutm8opd0hyeuwn@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 12:34:19 +01:00
Peter Zijlstra
26c86da882 perf: Simplify the ring-buffer code
By using CIRC_SPACE() we can obviate the need for perf_output_space().

Shrinks the size of perf_output_begin() by 17 bytes on
x86_64-defconfig.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Michael Neuling <mikey@neuling.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: james.hogan@imgtec.com
Cc: Vince Weaver <vince@deater.net>
Cc: Victor Kaplansky <VICTORK@il.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Anton Blanchard <anton@samba.org>
Link: http://lkml.kernel.org/n/tip-vtb0xb0llebmsdlfn1v5vtfj@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06 12:34:18 +01:00