Moving 'struct perf_counts' and associated functions into separate
object, so we could remove stat.c object dependency from python build.
It makes the python code to build properly, because it fails to load due
to missing stat-shadow.c object dependency if some patches from Kan
Liang are applied.
So apply this one, then Kan's.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20150807105103.GB8624@krava.brq.redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Previous patches introduce llvm__compile_bpf() to compile source file to
eBPF object. This patch adds testcase to test it. It also tests libbpf
by opening generated object after applying next patch which introduces
HAVE_LIBBPF_SUPPORT option.
Since llvm__compile_bpf() prints long messages which users who don't
explicitly test llvm doesn't care, this patch set verbose to -1 to
suppress all debug, warning and error message, and hint user use 'perf
test -v' to see the full output.
For the same reason, if clang is not found in PATH and there's no [llvm]
section in .perfconfig, skip this test.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/n/1436445342-1402-17-git-send-email-wangnan0@huawei.com
[ Add tools/lib/bpf/ to tools/perf/MANIFEST, so that the tarball targets build ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To help user find correct kernel include options, this patch extracts
them from kbuild system by an embedded script kinc_fetch_script, which
creates a temporary directory, generates Makefile and an empty dummy.o
then use the Makefile to fetch $(NOSTDINC_FLAGS), $(LINUXINCLUDE) and
$(EXTRA_CFLAGS) options. The result is passed to compiler script using
'KERNEL_INC_OPTIONS' environment variable.
Because options from kbuild contains relative path like
'Iinclude/generated/uapi', the work directory must be changed. This is
done by previous patch.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1436445342-1402-16-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch detects kernel build directory by checking the existence of
include/generated/autoconf.h.
clang working directory is changed to kbuild directory if it is found,
to help user use relative include path. Following patch will detect
kernel include directory, which contains relative include patch so this
workdir changing is needed.
Users are allowed to set 'kbuild-dir = ""' manually to disable this
checking.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/n/tip-owyfwfbemrjn0tlj6tgk2nf5@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This is the core patch for supporting eBPF on-the-fly compiling, does
the following work:
1. Search clang compiler using search_program().
2. Run command template defined in llvm-bpf-cmd-template option in
[llvm] config section using read_from_pipe(). Patch of clang and
source code path is injected into shell command using environment
variable using force_set_env().
Commiter notice:
When building with DEBUG=1 we get a compiler error that gets fixed with
the same approach described in commit b236512280:
perf kmem: Fix compiler warning about may be accessing uninitialized variable
The last argument to strtok_r doesn't need to be initialized, its
just a placeholder to make this routine reentrant, but gcc doesn't know
about that and complains, breaking the build, fix it by setting it to
NULL.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/n/1436445342-1402-14-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch introduces [llvm] config section with 5 options. Following
patches will use then to config llvm dynamica compiling.
'llvm-utils.[ch]' is introduced in this patch for holding all
llvm/clang related stuffs.
Example:
[llvm]
# Path to clang. If omit, search it from $PATH.
clang-path = "/path/to/clang"
# Cmdline template. Following line shows its default value.
# Environment variable is used to passing options.
#
# *NOTE*: -D__KERNEL__ MUST appears before $CLANG_OPTIONS,
# so user have a chance to use -U__KERNEL__ in $CLANG_OPTIONS
# to cancel it.
clang-bpf-cmd-template = "$CLANG_EXEC -D__KERNEL__ $CLANG_OPTIONS \
$KERNEL_INC_OPTIONS -Wno-unused-value \
-Wno-pointer-sign -working-directory \
$WORKING_DIR -c $CLANG_SOURCE -target \
bpf -O2 -o -"
# Options passed to clang, will be passed to cmdline by
# $CLANG_OPTIONS.
clang-opt = "-Wno-unused-value -Wno-pointer-sign"
# kbuild directory. If not set, use /lib/modules/`uname -r`/build.
# If set to "" deliberately, skip kernel header auto-detector.
kbuild-dir = "/path/to/kernel/build"
# Options passed to 'make' when detecting kernel header options.
kbuild-opts = "ARCH=x86_64"
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1437477214-149684-1-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Extend the event parser maximum error index from 10 to 13. That allows
PMU config terms of up to 10 characters to display un-truncated in the
error message.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1437150840-31811-17-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently the value of a PMU config term is silently truncated if it is
too big. This is an impediment to validating the value for other
criteria later on i.e. the user provides an invalid value that gets
truncated to a valid one.
The maximum value validation is only done for the parser where the error
is passed back to the user. In other cases the silent truncation
continues so as not to affect tools that perhaps rely on it.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1437150840-31811-16-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add perf_pmu__format_bits() to get the format bits for a PMU config
term. Intel PT will use this to validate terms and to record format
bits to enable later interpreting the config from the attribute stored
in the perf.data file.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1437150840-31811-15-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
PERF_ITRACE_PERIOD_INSTRUCTIONS is zero so it got overwritten by the
default period type.
Fix by checking if the period type was set rather than if the value was
zero when applying the default.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1437150840-31811-12-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Display the cycles by default in branch sort mode.
To make enough room for the new column I removed dso_to. It is usually
redundant with dso_from.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1437233094-12844-9-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now that we can process branch data in annotate it makes sense to
support enabling branch recording from top too. Most of the code needed
for this is already in shared code with report. But we need to add:
- The option parsing code (using shared code from the previous patch)
- Document the options
- Set up the IPC/cycles accounting state in the top session
- Call the accounting code in the hist iter callback
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1437233094-12844-8-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add two new columns to the annotate display and display the average
cycles and the compute IPC if available.
When the LBR was not in any branch mode the IPC computation is
automatically disabled. We still display the cycle information.
Example output (with made up numbers):
The second column is the IPC and third average cycles.
│ __attribute__((noinline)) f2()
│ {
5.15 0.07 │ push %rbp
0.01 0.07 │ mov %rsp,%rbp
│ c = a / b;
9.87 0.07 │ mov a,%eax
0.07 │ mov b,%ecx
0.07 │ cltd
4.92 0.07 123│ idiv %ecx
70.79 0.07 │ mov %eax,__TMC_END__
│ }
9.25 0.07 │ pop %rbp
0.01 0.07 123│ ← retq
v2: Fix display problems.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1437233094-12844-7-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Compute the IPC and the basic block cycles for the annotate display.
IPC is computed by counting the instructions, and then dividing the
accounted cycles by that count.
The actual IPC computation can only be done at annotate time, because we
need to parse the objdump output first to know the number of
instructions in the basic block.
The cycles/IPC are also put into the perf function annotation so that
the display code can show them.
Again basic block overlaps are not handled, with the longest winning,
but there are some heuristics to hide the IPC when the longest is not
the most common.
v2: Compute IPC correctly.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1437233094-12844-6-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Call the earlier added cycle histogram infrastructure from the perf
report hist iter callback. For this we walk the branch records.
This allows to use cycle histograms when browsing perf report annotate.
v2: Rename flag
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1437233094-12844-5-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This adds the basic infrastructure to keep track of cycle counts per
basic block for annotate. We allocate an array similar to the normal
accounting, and then account branch cycles there.
We handle two cases:
cycles per basic block with start and cycles per branch (these are later
used for either IPC or just cycles per BB)
In the start case we cannot handle overlaps, so always the longest basic
block wins.
For the cycles per branch case everything is accurately accounted.
v2: Remove unnecessary checks. Slight restructure. Move
symbol__get_annotation to another patch. Move histogram allocation.
v3: Merged with current tree
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1437233094-12844-4-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Later patches need to cheaply check that the branch mode is in ANY. Add
a new function to check all event attrs and add a flag to the report
state, which is then initialized.
v2: Rename flag
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1437233094-12844-3-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
cycles is a new branch_info field available on some CPUs that indicates
the time deltas between branches in the LBR.
Add a sort key and output code for the cycles to allow to display the
basic block cycles individually in perf report.
We also pass in the cycles for weight when LBRs are processed, which
allows to get global and local weight, to get an estimate of the total
cost.
And also print the cycles information for perf report -D. I also added
printing for the previously missing LBR flags (mispredict etc.)
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1437233094-12844-2-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf currently fails to build on MIPS as there is no
tools/perf/arch/mips/Build file. Adding an empty file fixes this as
there are no MIPS-specific sources to build.
It looks like the same is needed for Alpha and PA-RISC, though I
haven't been able to test those.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Fixes: 5e8c0fb6a9 ("perf build: Add arch x86 objects building")
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1438704627.7315.2.camel@decadent.org.uk
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Moving counter processing code into stat object as
perf_stat__process_counter.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1437481927-29538-8-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Passing 'struct perf_stat_config' into process_counter(), so that we can
make process_counter() non static and use it from other places.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1437481927-29538-7-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Moving 'interval' into struct perf_stat_config. The point is to
centralize the base stat config so it could be used localy together with
other stat routines in other parts of perf code.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1437481927-29538-6-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Moving 'output' into struct perf_stat_config. The point is to centralize
the base stat config so it could be used localy together with other stat
routines in other parts of perf code.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1437481927-29538-5-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Moving 'scale' into struct perf_stat_config. The point is to centralize
the base stat config so it could be used localy together with other stat
routines in other parts of perf code.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1437481927-29538-4-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Moving 'aggr_mode' into new struct. The point is to centralize the base
stat config so it could be used localy together with other stat routines
in other parts of perf code.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1437481927-29538-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Commit 7b6ff0bdbf ("perf probe ppc64le:
Fixup function entry if using kallsyms lookup") adds 'struct map' into
probe-event.h but not forward declares it. This patch fixes it.
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Fixes: 7b6ff0bdbf ("perf probe ppc64le: Fixup function entry if using kallsyms lookup")
Link: http://lkml.kernel.org/n/1436445342-1402-30-git-send-email-wangnan0@huawei.com
[ No need to include map.h, just forward declare 'struct map' ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
va_args alternative to eprintf().
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/n/1436445342-1402-19-git-send-email-wangnan0@huawei.com
[ split from another patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Without this patch, it is cumbersome to read the trace output but
ignoring the normal, potentially verbose, output of the debuggee. One
common example is doing something like the following:
perf trace -s find /tmp > /dev/null
Without this patch, the trace summary will be lost. Now, it will still
be printed at the end. This behavior is also applied by strace.
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: David Ahern <dsahern@gmail.com>
Link: http://lkml.kernel.org/n/tip-tqnks6y2cnvm5f9g2dsfr7zl@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
color_vprintf was including the length of the invisible escape sequences
in its return argument. Don't include them to make the return value
usable for indentation calculations.
v2: Add comment, rebase
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1438649408-20807-3-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Seems like it's always '\n' through color_fprintf_ln, which is not used
at all, removing.. ;-)
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1438649408-20807-2-git-send-email-andi@firstfloor.org
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pass global callchain_param into parse_callchain_record_opt and
perf_evsel__config_callgraph as parameter. So we can reuse these
functions to parse/config local param for callchain.
Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1438677022-34296-3-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patchkit adds the ability to turn off time stamps per event.
One usaful case for partial time is to work with per-event callgraph to
enable "PEBS threshold > 1" (https://lkml.org/lkml/2015/5/10/196), which
can significantly reduce the sampling overhead.
The event samples with time stamps off will not be ordered.
Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1438677022-34296-2-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To work like strace and dereference syscall pointer args we need to
insert probes (or tracepoints) right after we copy those bytes from
userspace.
Since we're formatting the syscall args at raw_syscalls:sys_enter time,
we need to have a formatter that just stores the position where, later,
when we get the probe:vfs_getname, we can insert the pointer contents.
Now, if a probe:vfs_getname with this format is in place:
# perf probe -l
probe:vfs_getname (on getname_flags:72@/home/git/linux/fs/namei.c with pathname)
That was, in this case, put in place with:
# perf probe 'vfs_getname=getname_flags:72 pathname=filename:string'
Added new event:
probe:vfs_getname (on getname_flags:72 with pathname=filename:string)
You can now use it in all perf tools, such as:
perf record -e probe:vfs_getname -aR sleep 1
#
Then 'perf trace' will notice that and do the pointer -> contents
expansion:
# trace -e open touch /tmp/bla
0.165 (0.010 ms): touch/17752 open(filename: /etc/ld.so.cache, flags: CLOEXEC) = 3
0.195 (0.011 ms): touch/17752 open(filename: /lib64/libc.so.6, flags: CLOEXEC) = 3
0.512 (0.012 ms): touch/17752 open(filename: /usr/lib/locale/locale-archive, flags: CLOEXEC) = 3
0.582 (0.012 ms): touch/17752 open(filename: /tmp/bla, flags: CREAT|NOCTTY|NONBLOCK|WRONLY, mode: 438) = 3
#
Roughly equivalent to strace's output:
# strace -rT -e open touch /tmp/bla
0.000000 open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3 <0.000039>
0.000317 open("/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3 <0.000102>
0.001461 open("/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3 <0.000072>
0.000405 open("/tmp/bla", O_WRONLY|O_CREAT|O_NOCTTY|O_NONBLOCK, 0666) = 3 <0.000055>
0.000641 +++ exited with 0 +++
#
Now we need to either look for at all syscalls that are marked as
pointers and have some well known names ("filename", "pathname", etc)
and set the arg formatter to the one used for the "open" syscall in this
patch.
This implementation works for syscalls with just a string being copied
from userspace, for matching syscalls with more than one string being
copied via the same probe/trace point (vfs_getname) we need to extend
the vfs_getname probe spec to include the pointer too, but there are
some problems with that in 'perf probe' or the kernel kprobes code, need
to investigate before considering supporting multiple strings per
syscall.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Milian Wolff <mail@milianw.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-xvuwx6nuj8cf389kf9s2ue2s@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We were using it as a magic number, 1024, fix that.
Eventually we need to stop doing it per line, and do it per
arg, traversing the args at output time, to avoid the memmove()
calls that will be used in the next cset to replace pointers
present at raw_syscalls:sys_enter time with its contents that
appear at probe:vfs_getname time, before raw_syscalls:sys_exit
time.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Milian Wolff <mail@milianw.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-4sz3wid39egay1pp8qmbur4u@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
So that we can later decide if we will store where to expand the
pathname once we are handling vfs_getname or if we should instead
just go on and straight away print the pointer.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Milian Wolff <mail@milianw.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-ytxk5s5jpc50wahffmlxgxuw@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We were accessing trace->syscalls.events members even when that struct
wasn't initialized, i.e. --no-syscalls was specified on the command
line, fix it to show that, still in debug mode, when we have an event
qualifier list, i.e. when we actually are doing subset syscall tracing.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Milian Wolff <mail@milianw.de>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Fixes: 19867b6186 ("perf trace: Use event filters for the event qualifier list")
Link: http://lkml.kernel.org/n/tip-7980ym6vujgh3yiai0cqzc88@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The libtraceevent handler (session->tevent) is only initialized when
there are tracepoints in a perf.data event list, so do not call
pevent_set_function_resolve() in those cases, fixing a segfault.
Reported-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-xyynkucl5p4bcs13zi4i4b1f@git.kernel.org
Report-link: http://lkml.kernel.org/r/20150803174113.GA20282@krava.redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
User visible:
- Force period term to overload global settings, i.e. previously this
command line:
$ perf record -e 'cpu/instructions,period=20000/',cycles -c 1000 sleep 1
would result in both events having a period equal to 1000, with the fix we
get something saner:
$ perf evlist -v | grep period
cpu/instructions,period=20000/: ... { sample_period, sample_freq }: 20000, ...
cycles: ... { sample_period, sample_freq }: 1000 ...
$
(Jiri Olsa)
Infrastructure:
- Use the dummy software event with freq=0 in the twatch.py python
binding example, to avoid disabling nohz (Arnaldo Carvalho de Melo)
- Add some missing constants to the python binding (Arnaldo Carvalho de Melo)
- Fix mismatched declarations for elf_getphdrnum, that happens
only in the corner case where this function is not found on
the system (Arnaldo Carvalho de Melo).
- Adding build test for having ending double slash (Jiri Olsa)
- Introduce callgraph_set for callgraph option (Kan Liang)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJVuk2cAAoJENZQFvNTUqpA15IQAIwfXkfs6we+5+VOXp35bKrl
EiXVTBZPo1IW+duas0exur8nEmdYV9VMuZE8t5WpldOsARxJKjnKyB+muFCDMKqA
3yeBXPATawxRKjqIsPVucksFXGHf19v6Nsh2plqY+qqFynKxm7DibcMxIIFMeqY0
VwdUzHEojIMQ2pzHA7Ef2eSdzSgAUqO06+O9BBB+udizcCAonF5KZh4tw5n2795E
gz0rvnYVC8q7EU2oSKEWwWyj2Ti07iaC0b/adg6jY9OU0Mnlx0K3MkREmq6KQjoz
GrayIknp0CoatLPbpuPf9jz3si7lL/WErl3F3Qeg1lfzAPdGDakmfufQmgyHhEfF
in0qAYxYKMnsgRblTWynOMUWISfdKlhjsofXFv3hXOB2iWbulHLU7WdS8ieyuiGq
N0jcYEII4+/qk+Wi/XbiCujOmaZdvG+slSmx9JgZwXhj4kRiBkUYeNk/JCdNhzgX
u6fse5lBQRI+YDmNXe+QQxUTpL+jpx2OnmpD8v2Yx4YvvLN/SU47Y9VYw9YWmkDq
NySRFn/bBc/zlrT6EnYI7ENpydLIovS+Wa8WSmzQFCDGcbWl2TSKjcVAB1aqhhsk
IAMnzv5/0ybcx8WiYcGqd6Z4gh+WuUenzHGKE3sSmSqbbRRS1WvBYeyeZaU1IGkW
8lrAASPd8X3YP76aP8ww
=XBSW
-----END PGP SIGNATURE-----
Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
User visible changes:
- Force period term to overload global settings, i.e. previously this
command line:
$ perf record -e 'cpu/instructions,period=20000/',cycles -c 1000 sleep 1
would result in both events having a period equal to 1000, with the fix we
get something saner:
$ perf evlist -v | grep period
cpu/instructions,period=20000/: ... { sample_period, sample_freq }: 20000, ...
cycles: ... { sample_period, sample_freq }: 1000 ...
$
(Jiri Olsa)
Infrastructure changes:
- Use the dummy software event with freq=0 in the twatch.py python
binding example, to avoid disabling nohz. (Arnaldo Carvalho de Melo)
- Add some missing constants to the python binding. (Arnaldo Carvalho de Melo)
- Fix mismatched declarations for elf_getphdrnum, that happens
only in the corner case where this function is not found on
the system. (Arnaldo Carvalho de Melo)
- Add build test for having ending double slash. (Jiri Olsa)
- Introduce callgraph_set for callgraph option. (Kan Liang)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pawel Moll reported build issue for having extra slash (/) at the end of
the prefix variable.
$ make prefix=/usr/local/
CC tests/attr.o
tests/attr.c: In function ‘test__attr’:
tests/attr.c:168:50: error: expected ‘)’ before ‘;’ token
snprintf(path_perf, PATH_MAX, "%s/perf", BINDIR);
^
tests/attr.c:176:1: error: expected ‘;’ before ‘}’ token
}
^
tests/attr.c:176:1: error: control reaches end of non-void function [-Werror=return-type]
}
^
cc1: all warnings being treated as errors
Adding automated test case for this.
Reported-by: Pawel Moll <pawel.moll@arm.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20150727182417.GD20509@krava.brq.redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Introduce callgraph_set to indicate whether the callgraph option was set
by user.
Signed-off-by: Kan Liang <kan.liang@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1438162936-59698-4-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently the command line option settings beats the per event period
settings:
With no global settings, we get per-event configuration:
$ perf record -e 'cpu/instructions,period=20000/' sleep 1
$ perf evlist -v
... { sample_period, sample_freq }: 20000 ...
With 'c' option period setup, we get 'c' option value:
$ perf record -e 'cpu/instructions,period=20000/' -c 1000 sleep 1
$ perf evlist -v
... { sample_period, sample_freq }: 1000 ...
This patch makes the per-event settings overload the global 'c' option
setup:
$ perf record -e 'cpu/instructions,period=20000/' -c 1000 sleep 1
$ perf evlist -v
... { sample_period, sample_freq }: 20000 ...
I think the making the per-event settings to overload any other config
makes more sense than current state. However it breaks the current
'period' term handling, which might cause some noise.. so let's see ;-).
Also fixing parse event tests with the new behaviour.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1438162936-59698-3-git-send-email-kan.liang@intel.com
Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add support to overload any global settings for event and force user
specified term value. It will be useful for new time and backtrace
terms.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1438162936-59698-2-git-send-email-kan.liang@intel.com
Signed-off-by: Kan Liang <kan.liang@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The semantic associated in tools/perf/ with foo__delete(instance) is to
release all resources referenced by 'instance' members and then release
the memory for 'instance' itself.
The perf_session_env__delete() function isn't doing this, it just does
the first part, but the space used by 'instance' itself isn't freed, as
it is embedded in a larger structure, that will be freed at other stage.
For these cases we se foo__exit(), i.e. the usage is:
void foo__delete(foo)
{
if (foo) {
foo__exit(foo);
free(foo);
}
}
But when we have something like:
struct bar {
struct foo foo;
. . .
}
Then we can't really call foo__delete(&bar.foo), we must have this
instead:
void bar__exit(bar)
{
foo__exit(&bar.foo);
/* free other bar-> resources */
}
void bar__delete(bar)
{
if (bar) {
bar__exit(bar);
free(bar);
}
}
So just rename perf_session_env__delete() to perf_session_env__exit().
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-djbgpcfo5udqptx3q0flwtmk@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When HAVE_ELF_GETPHDRNUM_SUPPORT is false we trip on this problem:
CC /tmp/build/perf/util/symbol-elf.o
util/symbol-elf.c:41:12: error: static declaration of ‘elf_getphdrnum’ follows non-static declaration
static int elf_getphdrnum(Elf *elf, size_t *dst)
^
In file included from util/symbol.h:19:0,
from util/symbol-elf.c:8:
/usr/include/libelf.h:206:12: note: previous declaration of ‘elf_getphdrnum’ was here
extern int elf_getphdrnum (Elf *__elf, size_t *__dst);
^
MKDIR /tmp/build/perf/bench/
/home/git/linux/tools/build/Makefile.build:68: recipe for target '/tmp/build/perf/util/symbol-elf.o' failed
make[3]: *** [/tmp/build/perf/util/symbol-elf.o] Error 1
Fix it.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-qcmekyfedmov4sxr0wahcikr@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>