Commit Graph

12236 Commits

Author SHA1 Message Date
Arnaldo Carvalho de Melo
374fbe5606 tools headers: Synchronize KVM arch ABI headers
To pick up changes from these csets:

  da9a1446d2 ("KVM: s390: provide a capability for AIS state migration")
  5c5196da4e ("KVM: arm/arm64: Support EL1 phys timer register access in set/get reg")

None of which affects buildint tools/perf/.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Christoffer Dall <cdall@linaro.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-dd72s6izo4qdzt1isowlz8ji@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-28 14:31:01 -03:00
Arnaldo Carvalho de Melo
485be0cb0c tools headers: Synchronize drm/i915_drm.h
To pick up the changes from these csets:

  bf64e0b00e ("drm/i915: Expand I915_PARAM_HAS_SCHEDULER into a capability bitmask")
  ac14fbd460 ("drm/i915/scheduler: Support user-defined priorities")
  822a4b6732 ("drm/i915: Don't use BIT() in UAPI section")
  3fd3a6ffe2 ("drm/i915: Simplify i915_reg_read_ioctl")

None of them affects how the tools are built, this os done just to
silence this perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/drm/i915_drm.h' differs from latest version at 'include/uapi/drm/i915_drm.h'

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-d2gor8brpcowe7bcxovjhqwm@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-28 14:30:42 -03:00
Arnaldo Carvalho de Melo
8ce6d5eb01 tools headers uapi: Synchronize drm/drm.h
To pick up the new ioctls added in these csets:

  3064abfa93 ("drm: Add CRTC_GET_SEQUENCE and CRTC_QUEUE_SEQUENCE ioctls [v3]")
  62884cd386 ("drm: Add four ioctls for managing drm mode object leases [v7]")

That will be automatically decoded (the ioctl cmd parameter, the structs
will be supported when we start using eBPF for that, which is in the
works).

This silences this warning when building tools/perf:

  Warning: Kernel ABI header at 'tools/include/uapi/drm/drm.h' differs from latest version at 'include/uapi/drm/drm.h'

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Keith Packard <keithp@keithp.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-bivwf1pkfmi1ugpswbsxd9e9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-28 14:30:16 -03:00
Arnaldo Carvalho de Melo
0f1aabeb49 tools headers: Synchronize perf_event.h header
To get the changes in the 085b30625e ("perf/core: Add
PERF_AUX_FLAG_COLLISION to report colliding samples") commit, that will
be eventually used by perf to handle the ARM SPE architecture.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Will Deacon <will.deacon@arm.com>
Link: https://lkml.kernel.org/n/tip-178ohv0oy0csq3kzfdk8ky4n@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-28 14:29:49 -03:00
Arnaldo Carvalho de Melo
8536913189 tools headers: Synchronize kernel ABI headers wrt SPDX tags
Two more, that were just in perf/core and thus weren't covered by Ingo's
latest headers synch, kcmp.h and prctl.h, silencing this:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/kcmp.h' differs from latest version at 'include/uapi/linux/kcmp.h'
  Warning: Kernel ABI header at 'tools/include/uapi/linux/prctl.h' differs from latest version at 'include/uapi/linux/prctl.h'

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-2a0r7iybyqpkftllyy5t9hfk@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-28 14:29:20 -03:00
Ingo Molnar
0b44cfb8e4 tools/headers: Synchronize kernel x86 UAPI headers
Two x86 headers got modified in this merge window:

 arch/x86/include/asm/cpufeatures.h
 arch/x86/include/asm/disabled-features.h

To support x86 UMIP feature, to add new AVX instructions, plus cleanups.

None of those changes have an effect on tooling, so do a plain copy.

Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-28 14:29:05 -03:00
Adrian Hunter
51cacdc898 perf intel-pt: Bring instruction decoder files into line with the kernel
There are just a few new defines which do not affect perf tools.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Link: http://lkml.kernel.org/r/1511253326-22308-3-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-28 14:28:49 -03:00
Thomas Richter
996548499d perf test: Fix test 21 for s390x
Test case 21 (Number of exit events of a simple workload) fails on
s390x. The reason is the invalid sample frequency supplied for this
test. On s390x the minimum sample frequency is much higher (see output
of /proc/service_levels).

Supply a save sample frequency value for s390x to fix this.  The value
will be adjusted by the s390x CPUMF frequency convertion function to a
value well below the sysctl kernel.perf_event_max_sample_rate value.

Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
LPU-Reference: 20171123114611.93397-1-tmricht@linux.vnet.ibm.com
Link: https://lkml.kernel.org/n/tip-1ynblyhi1n81idpido59nt1y@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-28 14:28:26 -03:00
Satheesh Rajendran
321a7c35c9 perf bench numa: Fixup discontiguous/sparse numa nodes
Certain systems are designed to have sparse/discontiguous nodes.  On
such systems, 'perf bench numa' hangs, shows wrong number of nodes and
shows values for non-existent nodes. Handle this by only taking nodes
that are exposed by kernel to userspace.

Signed-off-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1edbcd353c009e109e93d78f2f46381930c340fe.1511368645.git.sathnaga@linux.vnet.ibm.com
Signed-off-by: Balamuruhan S <bala24@linux.vnet.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-28 14:28:10 -03:00
Jiri Olsa
bdaab8c4b3 perf top: Use signal interface for SIGWINCH handler
There's no need for SA_SIGINFO data in SIGWINCH handler, switching it to
register the handler via signal interface as we do for the rest of the
signals in perf top.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-elxp1vdnaog1scaj13cx7cu0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-28 14:27:43 -03:00
Jiri Olsa
89d0aeab42 perf top: Fix window dimensions change handling
The stdio perf top crashes when we change the terminal
window size. The reason is that we assumed we get the
perf_top pointer as a signal handler argument which is
not the case.

Changing the SIGWINCH handler logic to change global
resize variable, which is checked in the main thread
loop.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ysuzwz77oev1ftgvdscn9bpu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-28 14:27:20 -03:00
Arnaldo Carvalho de Melo
df7ccfa21e perf top: Ignore kptr_restrict when not sampling the kernel
If all events have attr.exclude_kernel set, no need to look at
kptr_restrict.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-yegpzg5bf2im69g0tfizqaqz@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-28 14:26:49 -03:00
Arnaldo Carvalho de Melo
b0ebd811af perf record: Ignore kptr_restrict when not sampling the kernel
If we're not sampling the kernel, we shouldn't care about kptr_restrict
neither synthesize anything for assisting in resolving kernel samples,
like the reference relocation symbol or kernel modules information.

Before:

  $ cat /proc/sys/kernel/kptr_restrict /proc/sys/kernel/perf_event_paranoid
  2
  2
  $ perf record sleep 1
  WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted,
  check /proc/sys/kernel/kptr_restrict.

  Samples in kernel functions may not be resolved if a suitable vmlinux
  file is not found in the buildid cache or in the vmlinux path.

  Samples in kernel modules won't be resolved at all.

  If some relocation was applied (e.g. kexec) symbols may be misresolved
  even with a suitable vmlinux or kallsyms file.

  Couldn't record kernel reference relocation symbol
  Symbol resolution may be skewed if relocation was used (e.g. kexec).
  Check /proc/kallsyms permission or run as root.
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.001 MB perf.data (8 samples) ]
  $ perf evlist -v
  cycles:uppp: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, exclude_kernel: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
  $

After:

  $ perf record sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.001 MB perf.data (10 samples) ]
  $

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-t025e9zftbx2b8cq2w01g5e5@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-28 14:26:33 -03:00
Arnaldo Carvalho de Melo
3f0a4c873c perf report: Ignore kptr_restrict when not sampling the kernel
If none of the evsels has attr.exclude_kernel set to zero, no kernel
samples, so no point in warning the user about problems in processing
kernel samples, as there will be none.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-7dn926v3at8txxkky92aesz2@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-28 14:24:52 -03:00
Arnaldo Carvalho de Melo
5b0d1cb406 perf evlist: Add helper to check if attr.exclude_kernel is set in all evsels
The warning about kptr_restrict needs to be emitted only when it is set
and we ask for kernel space samples, so add a helper to help with that.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-fh7drty6yljei9gxxzer6eup@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-28 14:23:43 -03:00
Thomas Richter
d5c5e46aa7 perf test shell: Fix test case probe libc's inet_pton on s390x
The 'perf test' case "probe libc's inet_pton & backtrace it with ping"
fails on s390x. The reason is the 'realpath /lib64/ld*.so.* | uniq' line
which returns 2 libraries:

        root@s35lp76 shell]# realpath /lib64/ld*.so.* | uniq
        /usr/lib64/ld-2.26.so
        /usr/lib64/ld_pre_smc.so.1.0.1
        [root@s35lp76 shell]

This output makes the "perf probe" command lines invalid.

Use ldd tool to find out the libraries required by "bash" and check if
symbol "inet_pton" is part of the "libc" library.  Some distros do not
have a /lib64 directory.

I have also added a check for the existence of an IPv6 network interface
before it is being used.

Committer changes:

We can't really use ldd for libc, as in some systems, such as x86_64, it
has hardlinks and then ldd sees one and the kernel the other, so grep
for libc in /proc/self/maps to get the one we'll receive from
PERF_RECORD_MMAP.

Thomas checked this change and acked it.

Signed-off-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Suggested-by: Hendrik Brückner <brueckner@linux.vnet.ibm.com>
Reviewed-by: Hendrik Brückner <brueckner@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20171114133409.GN8836@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-28 14:23:16 -03:00
Thomas Richter
ccafc38f1c perf test shell: Fix check open filename arg using 'perf trace' on s390x
This 'perf test' case fails on s390x. The 'touch' command on s390x uses
the 'openat' system call to open the file named on the command line:

[root@s35lp76 perf]# perf probe -l
  probe:vfs_getname    (on getname_flags:72@fs/namei.c with pathname)
[root@s35lp76 perf]# perf trace -e open touch /tmp/abc
     0.400 ( 0.015 ms): touch/27542 open(filename:
		/usr/lib/locale/locale-archive, flags: CLOEXEC) = 3
[root@s35lp76 perf]#

There is no 'open' system call for file '/tmp/abc'. Instead the 'openat'
system call is used:

[root@s35lp76 perf]# strace touch /tmp/abc
    execve("/usr/bin/touch", ["touch", "/tmp/abc"], 0x3ffd547ec98
			/* 30 vars */) = 0
    [...]
    openat(AT_FDCWD, "/tmp/abc", O_WRONLY|O_CREAT|O_NOCTTY|O_NONBLOCK, 0666) = 3
    [...]

On s390x the 'egrep' command does not find a matching pattern and
returns an error.

Fix this for s390x create a platform dependent command line to enable
the 'perf probe' call to listen to the 'openat' system call and get the
expected output.

Signed-off-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
LPU-Reference: 20171114071847.2381-1-tmricht@linux.vnet.ibm.com
Link: http://lkml.kernel.org/n/tip-3qf38jk0prz54rhmhyu871my@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-28 14:22:56 -03:00
Ravi Bangoria
05d0e62d9f perf annotate: Do not truncate instruction names at 6 chars
There are many instructions, esp on PowerPC, whose mnemonics are longer
than 6 characters. Using precision limit causes truncation of such
mnemonics.

Fix this by removing precision limit. Note that, 'width' is still 6, so
alignment won't get affected for length <= 6.

Before:

   li     r11,-1
   xscvdp vs1,vs1
   add.   r10,r10,r11

After:

  li     r11,-1
  xscvdpsxds vs1,vs1
  add.   r10,r10,r11

Reported-by: Donald Stence <dstence@us.ibm.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Taeung Song <treeze.taeung@gmail.com>
Link: http://lkml.kernel.org/r/20171114032540.4564-1-ravi.bangoria@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-28 14:22:31 -03:00
Namhyung Kim
af98f2273f perf help: Fix a bug during strstart() conversion
The commit 8e99b6d453 changed prefixcmp() to strstart() but missed to
change the return value in some place.  It makes perf help print
annoying output even for sane config items like below:

  $ perf help
  '.root': unsupported man viewer sub key.
  ...

Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Sihyeon Jang <uneedsihyeon@gmail.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20171114001542.GA16464@sejong
Fixes: 8e99b6d453 ("tools include: Adopt strstarts() from the kernel")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-28 14:21:18 -03:00
Arnaldo Carvalho de Melo
4a2233b194 perf machine: Guard against NULL in machine__exit()
A recent fix for 'perf trace' introduced a bug where
machine__exit(trace->host) could be called while trace->host was still
NULL, so make this more robust by guarding against NULL, just like
free() does.

The problem happens, for instance, when !root users try to run 'perf
trace':

  [acme@jouet linux]$ trace
  Error:	No permissions to read /sys/kernel/debug/tracing/events/raw_syscalls/sys_(enter|exit)
  Hint:	Try 'sudo mount -o remount,mode=755 /sys/kernel/debug/tracing'

  perf: Segmentation fault
  Obtained 7 stack frames.
  [0x4f1b2e]
  /lib64/libc.so.6(+0x3671f) [0x7f43a1dd971f]
  [0x4f3fec]
  [0x47468b]
  [0x42a2db]
  /lib64/libc.so.6(__libc_start_main+0xe9) [0x7f43a1dc3509]
  [0x42a6c9]
  Segmentation fault (core dumped)
  [acme@jouet linux]$

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrei Vagin <avagin@openvz.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Vasily Averin <vvs@virtuozzo.com>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 33974a414c ("perf trace: Call machine__exit() at exit")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-28 14:21:01 -03:00
Arnaldo Carvalho de Melo
501e5bbec3 perf script: Fix --per-event-dump for auxtrace synth evsels
When processing PERF_RECORD_AUXTRACE_INFO several perf_evsel entries
will be synthesized and inserted into session->evlist, eventually ending
in perf_script.tool.sample(), which ends up calling builtin-script.c's
process_event(), that expects evsel->priv to be a perf_evsel_script
object with a valid FILE pointer in fp.

So we need to intercept the processing of PERF_RECORD_AUXTRACE_INFO and
then setup evsel->priv for these newly created perf_evsel instances, do
it to fix the segfault in process_event() trying to use a NULL for that
FILE pointer.

Reported-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: yuzhoujian <yuzhoujian@didichuxing.com>
Fixes: a14390fde6 ("perf script: Allow creating per-event dump files")
Link: http://lkml.kernel.org/n/tip-bthnur8r8de01gxvn2qayx6e@git.kernel.org
[ Merge fix by Ravi Bangoria before pushing upstream to preserv bisectability ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-28 14:20:45 -03:00
Arnaldo Carvalho de Melo
8e2d8e2042 perf evsel: Fix up leftover perf_evsel_stat usage via evsel->priv
I forgot one conversion, which got noticed by Thomas when running:

  $ perf stat  -e '{cpu-clock,instructions}' kill
  kill: not enough arguments
  Segmentation fault (core dumped)
  $

Fix it, those stats are in evsel->stats, not anymore in evsel->priv.

Reported-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Tested-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: e669e833da ("perf evsel: Restore evsel->priv as a tool private area")
Link: http://lkml.kernel.org/r/20171109150046.GN4333@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-28 14:20:32 -03:00
Andrei Vagin
35c33633ab perf trace: Fix an exit code of trace__symbols_init
Currently if trace_event__register_resolver() fails, we return -errno,
but we can't be sure that errno isn't zero in this case.

Signed-off-by: Andrei Vagin <avagin@openvz.org>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Vasily Averin <vvs@virtuozzo.com>
Link: http://lkml.kernel.org/r/20171108002246.8924-2-avagin@openvz.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-28 14:20:15 -03:00
Andi Kleen
59622fd496 perf record: Fix -c/-F options for cpu event aliases
The Intel PMU event aliases have a implicit period= specifier to set the
default period.

Unfortunately this breaks overriding these periods with -c or -F,
because the alias terms look like they are user specified to the
internal parser, and user specified event qualifiers override the
command line options.

Track that they are coming from aliases by adding a "weak" state to the
term. Any weak terms don't override command line options.

I only did it for -c/-F for now, I think that's the only case that's
broken currently.

Before:

$ perf record -c 1000 -vv -e uops_issued.any
...
  { sample_period, sample_freq }   2000003

After:

$ perf record -c 1000 -vv -e uops_issued.any
...
  { sample_period, sample_freq }   1000

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20171020202755.21410-2-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-28 14:19:39 -03:00
Arnaldo Carvalho de Melo
dffdcbdbb0 perf record: Generate PERF_RECORD_{MMAP,COMM,EXEC} with --delay
When we use an initial delay, e.g.: 'perf record --delay 1000', we do not
enable the events until that delay has passed after we started the workload,
including the tracking event, i.e. the one for which we have attr.mmap, etc,
enabled to ask the kernel to generate the PERF_RECORD_{MMAP,COMM,EXEC} metadata
events that will then allow us to resolve addresses in samples to the map, dso
and symbol. There will be a shadow that even synthesizing samples won't cover,
i.e. the workload that we start and other processes forking while we
wait for the initial delay to expire.

So use a dummy event to be the tracking one and make it be enabled on exec.

Before:

  # perf record --delay 1000 stress --cpu 1 --timeout 5
  stress: info: [9029] dispatching hogs: 1 cpu, 0 io, 0 vm, 0 hdd
  stress: info: [9029] successful run completed in 5s
  [ perf record: Woken up 3 times to write data ]
  [ perf record: Captured and wrote 0.624 MB perf.data (15908 samples) ]
  # perf script | head
      :9031 9031 32001.826888:       1 cycles:ppp: ffffffff831aa30d event_function (/lib/modules/4.14.0-rc6+/build/vmlinux)
      :9031 9031 32001.826893:       1 cycles:ppp: ffffffff8300d1a0 intel_bts_enable_local (/lib/modules/4.14.0-rc6+/build/vmlinux)
      :9031 9031 32001.826895:       7 cycles:ppp: ffffffff83023870 sched_clock (/lib/modules/4.14.0-rc6+/build/vmlinux)
      :9031 9031 32001.826897:     103 cycles:ppp: ffffffff8300c331 intel_pmu_handle_irq (/lib/modules/4.14.0-rc6+/build/vmlinux)
      :9031 9031 32001.826899:    1615 cycles:ppp: ffffffff830231f8 native_sched_clock (/lib/modules/4.14.0-rc6+/build/vmlinux)
      :9031 9031 32001.826902:   26724 cycles:ppp: ffffffff8384c6a7 native_irq_return_iret (/lib/modules/4.14.0-rc6+/build/vmlinux)
      :9031 9031 32001.826913:  329739 cycles:ppp:     7fb2a5410932 [unknown] ([unknown])
      :9031 9031 32001.827033: 1225451 cycles:ppp:     7fb2a5410930 [unknown] ([unknown])
      :9031 9031 32001.827474: 1391725 cycles:ppp:     7fb2a5410930 [unknown] ([unknown])
      :9031 9031 32001.827978: 1233697 cycles:ppp:     7fb2a5410928 [unknown] ([unknown])
  #

After:

  # perf record --delay 1000 stress --cpu 1 --timeout 5
  stress: info: [9741] dispatching hogs: 1 cpu, 0 io, 0 vm, 0 hdd
  stress: info: [9741] successful run completed in 5s
  [ perf record: Woken up 3 times to write data ]
  [ perf record: Captured and wrote 0.751 MB perf.data (15976 samples) ]
  # perf script | head
     stress  9742 32110.959106:          1 cycles:ppp:  ffffffff831b26f6 __perf_event_task_sched_in (/lib/modules/4.14.0-rc6+/build/vmlinux)
     stress 9742 32110.959110:       1 cycles:ppp: ffffffff8300c2e9 intel_pmu_handle_irq (/lib/modules/4.14.0-rc6+/build/vmlinux)
     stress 9742 32110.959112:       7 cycles:ppp: ffffffff830231e0 native_sched_clock (/lib/modules/4.14.0-rc6+/build/vmlinux)
     stress 9742 32110.959115:     101 cycles:ppp: ffffffff83023870 sched_clock (/lib/modules/4.14.0-rc6+/build/vmlinux)
     stress 9742 32110.959117:    1533 cycles:ppp: ffffffff830231f8 native_sched_clock (/lib/modules/4.14.0-rc6+/build/vmlinux)
     stress 9742 32110.959119:   23992 cycles:ppp: ffffffff831b0900 ctx_sched_in (/lib/modules/4.14.0-rc6+/build/vmlinux)
     stress 9742 32110.959129:  329406 cycles:ppp:     7f4b1b661930 __random_r (/usr/lib64/libc-2.25.so)
     stress 9742 32110.959249: 1288322 cycles:ppp:     5566e1e7cbc9 hogcpu (/usr/bin/stress)
     stress 9742 32110.959712: 1464046 cycles:ppp:     7f4b1b66179e __random (/usr/lib64/libc-2.25.so)
     stress 9742 32110.960241: 1266918 cycles:ppp:     7f4b1b66195b __random_r (/usr/lib64/libc-2.25.so)
  #

Reported-by: Bram Stolk <b.stolk@gmail.com>
Tested-by: Bram Stolk <b.stolk@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 6619a53ef7 ("perf record: Add --initial-delay option")
Link: http://lkml.kernel.org/n/tip-nrdfchshqxf7diszhxcecqb9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-28 14:19:21 -03:00
Arnaldo Carvalho de Melo
555b4ec4d5 perf evlist: Set the correct idx when adding dummy events
The evsel->idx field is used mainly to access the right bucket in
per-event arrays such as the annotation ones, but also to set
evsel->tracking, that in turn will decide what of the events will ask
for PERF_RECORD_{MMAP,COMM,EXEC} to be generated, i.e. which
perf_event_attr will have its mmap, etc fields set.

When we were adding the "dummy" event using perf_evlist__add_dummy() we
were not setting it correctly, which could result in multiple tracking
events.

Now that I'll try using a dummy event to be the tracking one when using
'perf record --delay', i.e. when we process the --delay
setting we may already have the evlist set up, like with:

  perf record -e cycles,instructions --delay 1000 ./workload

We will need to add a "dummy" event, then reset evsel->tracking for the
first event, "cycles", and set it instead to the dummy one, and also
setting its attr.enable_on_exec, so that we get the PERF_RECORD_MMAP,
etc metadata events while waiting to enable the explicitely requested
events, so lets get this straight and set the right evsel->idx.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Bram Stolk <b.stolk@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-nrdfchshqxf7diszhxcecqb9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-28 14:19:00 -03:00
Linus Torvalds
d6ec9d9a4d Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 core updates from Ingo Molnar:
 "Note that in this cycle most of the x86 topics interacted at a level
  that caused them to be merged into tip:x86/asm - but this should be a
  temporary phenomenon, hopefully we'll back to the usual patterns in
  the next merge window.

  The main changes in this cycle were:

  Hardware enablement:

   - Add support for the Intel UMIP (User Mode Instruction Prevention)
     CPU feature. This is a security feature that disables certain
     instructions such as SGDT, SLDT, SIDT, SMSW and STR. (Ricardo Neri)

     [ Note that this is disabled by default for now, there are some
       smaller enhancements in the pipeline that I'll follow up with in
       the next 1-2 days, which allows this to be enabled by default.]

   - Add support for the AMD SEV (Secure Encrypted Virtualization) CPU
     feature, on top of SME (Secure Memory Encryption) support that was
     added in v4.14. (Tom Lendacky, Brijesh Singh)

   - Enable new SSE/AVX/AVX512 CPU features: AVX512_VBMI2, GFNI, VAES,
     VPCLMULQDQ, AVX512_VNNI, AVX512_BITALG. (Gayatri Kammela)

  Other changes:

   - A big series of entry code simplifications and enhancements (Andy
     Lutomirski)

   - Make the ORC unwinder default on x86 and various objtool
     enhancements. (Josh Poimboeuf)

   - 5-level paging enhancements (Kirill A. Shutemov)

   - Micro-optimize the entry code a bit (Borislav Petkov)

   - Improve the handling of interdependent CPU features in the early
     FPU init code (Andi Kleen)

   - Build system enhancements (Changbin Du, Masahiro Yamada)

   - ... plus misc enhancements, fixes and cleanups"

* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (118 commits)
  x86/build: Make the boot image generation less verbose
  selftests/x86: Add tests for the STR and SLDT instructions
  selftests/x86: Add tests for User-Mode Instruction Prevention
  x86/traps: Fix up general protection faults caused by UMIP
  x86/umip: Enable User-Mode Instruction Prevention at runtime
  x86/umip: Force a page fault when unable to copy emulated result to user
  x86/umip: Add emulation code for UMIP instructions
  x86/cpufeature: Add User-Mode Instruction Prevention definitions
  x86/insn-eval: Add support to resolve 16-bit address encodings
  x86/insn-eval: Handle 32-bit address encodings in virtual-8086 mode
  x86/insn-eval: Add wrapper function for 32 and 64-bit addresses
  x86/insn-eval: Add support to resolve 32-bit address encodings
  x86/insn-eval: Compute linear address in several utility functions
  resource: Fix resource_size.cocci warnings
  X86/KVM: Clear encryption attribute when SEV is active
  X86/KVM: Decrypt shared per-cpu variables when SEV is active
  percpu: Introduce DEFINE_PER_CPU_DECRYPTED
  x86: Add support for changing memory encryption attribute in early boot
  x86/io: Unroll string I/O when SEV is active
  x86/boot: Add early boot support when running with SEV active
  ...
2017-11-13 14:13:48 -08:00
Linus Torvalds
31486372a1 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
 "The main changes in this cycle were:

  Kernel:

   - kprobes updates: use better W^X patterns for code modifications,
     improve optprobes, remove jprobes. (Masami Hiramatsu, Kees Cook)

   - core fixes: event timekeeping (enabled/running times statistics)
     fixes, perf_event_read() locking fixes and cleanups, etc. (Peter
     Zijlstra)

   - Extend x86 Intel free-running PEBS support and support x86
     user-register sampling in perf record and perf script. (Andi Kleen)

  Tooling:

   - Completely rework the way inline frames are handled. Instead of
     querying for the inline nodes on-demand in the individual tools, we
     now create proper callchain nodes for inlined frames. (Milian
     Wolff)

   - 'perf trace' updates (Arnaldo Carvalho de Melo)

   - Implement a way to print formatted output to per-event files in
     'perf script' to facilitate generate flamegraphs, elliminating the
     need to write scripts to do that separation (yuzhoujian, Arnaldo
     Carvalho de Melo)

   - Update vendor events JSON metrics for Intel's Broadwell, Broadwell
     Server, Haswell, Haswell Server, IvyBridge, IvyTown, JakeTown,
     Sandy Bridge, Skylake, SkyLake Server - and Goldmont Plus V1 (Andi
     Kleen, Kan Liang)

   - Multithread the synthesizing of PERF_RECORD_ events for
     pre-existing threads in 'perf top', speeding up that phase, greatly
     improving the user experience in systems such as Intel's Knights
     Mill (Kan Liang)

   - Introduce the concept of weak groups in 'perf stat': try to set up
     a group, but if it's not schedulable fallback to not using a group.
     That gives us the best of both worlds: groups if they work, but
     still a usable fallback if they don't. E.g: (Andi Kleen)

   - perf sched timehist enhancements (David Ahern)

   - ... various other enhancements, updates, cleanups and fixes"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (139 commits)
  kprobes: Don't spam the build log with deprecation warnings
  arm/kprobes: Remove jprobe test case
  arm/kprobes: Fix kretprobe test to check correct counter
  perf srcline: Show correct function name for srcline of callchains
  perf srcline: Fix memory leak in addr2inlines()
  perf trace beauty kcmp: Beautify arguments
  perf trace beauty: Implement pid_fd beautifier
  tools include uapi: Grab a copy of linux/kcmp.h
  perf callchain: Fix double mapping al->addr for children without self period
  perf stat: Make --per-thread update shadow stats to show metrics
  perf stat: Move the shadow stats scale computation in perf_stat__update_shadow_stats
  perf tools: Add perf_data_file__write function
  perf tools: Add struct perf_data_file
  perf tools: Rename struct perf_data_file to perf_data
  perf script: Print information about per-event-dump files
  perf trace beauty prctl: Generate 'option' string table from kernel headers
  tools include uapi: Grab a copy of linux/prctl.h
  perf script: Allow creating per-event dump files
  perf evsel: Restore evsel->priv as a tool private area
  perf script: Use event_format__fprintf()
  ...
2017-11-13 13:05:08 -08:00
Linus Torvalds
8e9a2dba86 Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core locking updates from Ingo Molnar:
 "The main changes in this cycle are:

   - Another attempt at enabling cross-release lockdep dependency
     tracking (automatically part of CONFIG_PROVE_LOCKING=y), this time
     with better performance and fewer false positives. (Byungchul Park)

   - Introduce lockdep_assert_irqs_enabled()/disabled() and convert
     open-coded equivalents to lockdep variants. (Frederic Weisbecker)

   - Add down_read_killable() and use it in the VFS's iterate_dir()
     method. (Kirill Tkhai)

   - Convert remaining uses of ACCESS_ONCE() to
     READ_ONCE()/WRITE_ONCE(). Most of the conversion was Coccinelle
     driven. (Mark Rutland, Paul E. McKenney)

   - Get rid of lockless_dereference(), by strengthening Alpha atomics,
     strengthening READ_ONCE() with smp_read_barrier_depends() and thus
     being able to convert users of lockless_dereference() to
     READ_ONCE(). (Will Deacon)

   - Various micro-optimizations:

        - better PV qspinlocks (Waiman Long),
        - better x86 barriers (Michael S. Tsirkin)
        - better x86 refcounts (Kees Cook)

   - ... plus other fixes and enhancements. (Borislav Petkov, Juergen
     Gross, Miguel Bernal Marin)"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (70 commits)
  locking/x86: Use LOCK ADD for smp_mb() instead of MFENCE
  rcu: Use lockdep to assert IRQs are disabled/enabled
  netpoll: Use lockdep to assert IRQs are disabled/enabled
  timers/posix-cpu-timers: Use lockdep to assert IRQs are disabled/enabled
  sched/clock, sched/cputime: Use lockdep to assert IRQs are disabled/enabled
  irq_work: Use lockdep to assert IRQs are disabled/enabled
  irq/timings: Use lockdep to assert IRQs are disabled/enabled
  perf/core: Use lockdep to assert IRQs are disabled/enabled
  x86: Use lockdep to assert IRQs are disabled/enabled
  smp/core: Use lockdep to assert IRQs are disabled/enabled
  timers/hrtimer: Use lockdep to assert IRQs are disabled/enabled
  timers/nohz: Use lockdep to assert IRQs are disabled/enabled
  workqueue: Use lockdep to assert IRQs are disabled/enabled
  irq/softirqs: Use lockdep to assert IRQs are disabled/enabled
  locking/lockdep: Add IRQs disabled/enabled assertion APIs: lockdep_assert_irqs_enabled()/disabled()
  locking/pvqspinlock: Implement hybrid PV queued/unfair locks
  locking/rwlocks: Fix comments
  x86/paravirt: Set up the virt_spin_lock_key after static keys get initialized
  block, locking/lockdep: Assign a lock_class per gendisk used for wait_for_completion()
  workqueue: Remove now redundant lock acquisitions wrt. workqueue flushes
  ...
2017-11-13 12:38:26 -08:00
Linus Torvalds
6098850e7e Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU updates from Ingo Molnar:
 "The main changes in this cycle are:

   - Documentation updates

   - RCU CPU stall-warning updates

   - Torture-test updates

   - Miscellaneous fixes

  Size wise the biggest updates are to documentation. Excluding
  documentation most of the code increase comes from a single commit
  which expands debugging"

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
  srcu: Add parameters to SRCU docbook comments
  doc: Rewrite confusing statement about memory barriers
  memory-barriers.txt: Fix typo in pairing example
  rcu/segcblist: Include rcupdate.h
  rcu: Add extended-quiescent-state testing advice
  rcu: Suppress lockdep false-positive ->boost_mtx complaints
  rcu: Do not include rtmutex_common.h unconditionally
  torture: Provide TMPDIR environment variable to specify tmpdir
  rcutorture: Dump writer stack if stalled
  rcutorture: Add interrupt-disable capability to stall-warning tests
  rcu: Suppress RCU CPU stall warnings while dumping trace
  rcu: Turn off tracing before dumping trace
  rcu: Make RCU CPU stall warnings check for irq-disabled CPUs
  sched,rcu: Make cond_resched() provide RCU quiescent state
  sched: Make resched_cpu() unconditional
  irq_work: Map irq_work_on_queue() to irq_work_on() in !SMP
  rcu: Create call_rcu_tasks() kthread at boot time
  rcu: Fix up pending cbs check in rcu_prepare_for_idle
  memory-barriers: Rework multicopy-atomicity section
  memory-barriers: Replace uses of "transitive"
  ...
2017-11-13 12:18:10 -08:00
Ingo Molnar
505ee76761 tooling/headers: Sync the tools/include/uapi/drm/i915_drm.h UAPI header
Last minute upstream update to one of the UAPI headers - sync it with tooling,
to address this warning:

  Warning: Kernel ABI header at 'tools/include/uapi/drm/i915_drm.h' differs from latest version at 'include/uapi/drm/i915_drm.h'

Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-11 09:08:43 +01:00
Ingo Molnar
529b3ca832 Merge branch 'perf/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
Pull perf tooling fixes from Arnaldo Carvalho de Melo.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-11 09:03:59 +01:00
Michael S. Tsirkin
450cbdd012 locking/x86: Use LOCK ADD for smp_mb() instead of MFENCE
MFENCE appears to be way slower than a locked instruction - let's use
LOCK ADD unconditionally, as we always did on old 32-bit.

Performance testing results:

  perf stat -r 10 -- ./virtio_ring_0_9 --sleep --host-affinity 0 --guest-affinity 0
  Before:
         0.922565990 seconds time elapsed                                          ( +-  1.15% )
  After:
         0.578667024 seconds time elapsed                                          ( +-  1.21% )

i.e. about ~60% faster.

Just poking at SP would be the most natural, but if we then read the
value from SP, we get a false dependency which will slow us down.

This was noted in this article:

  http://shipilev.net/blog/2014/on-the-fence-with-dependencies/

And is easy to reproduce by sticking a barrier in a small non-inline
function.

So let's use a negative offset - which avoids this problem since we
build with the red zone disabled.

For userspace, use an address just below the redzone.

The one difference between LOCK ADD and MFENCE is that LOCK ADD does
not affect CLFLUSH, previous patches converted all uses of CLFLUSH to
call mb(), such that changes to smp_mb() won't affect it.

Update mb/rmb/wmb() on 32-bit to use the negative offset, too, for
consistency.

As a follow-up, it might be worth considering switching users
of CLFLUSH to another API (e.g. clflush_mb()?) - we will
then be able to convert mb() to smp_mb() again.

Also arguably, GCC should switch to use LOCK ADD for __sync_synchronize().
This might be worth pursuing separately.

Suggested-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: qemu-devel@nongnu.org
Cc: virtualization@lists.linux-foundation.org
Link: http://lkml.kernel.org/r/1509118355-4890-1-git-send-email-mst@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-10 13:43:44 +01:00
Andrei Vagin
33974a414c perf trace: Call machine__exit() at exit
Otherwise 'perf trace' leaves a temporary file /tmp/perf-vdso.so-XXXXXX.

  $ perf trace -o log true
  $ ls -l /tmp/perf-vdso.*
  -rw------- 1 root root 8192 Nov  8 03:08 /tmp/perf-vdso.so-5bCpD0

Signed-off-by: Andrei Vagin <avagin@openvz.org>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Vasily Averin <vvs@virtuozzo.com>
Link: http://lkml.kernel.org/r/20171108002246.8924-1-avagin@openvz.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-09 10:17:32 -03:00
Jiri Olsa
a271bfaf30 perf tools: Fix eBPF event specification parsing
Looks like I've reached the new level of stupidity, adding missing braces.

Committer testing:

Given the following eBPF C filter, that will add a record when it
returns true, i.e. when the tv_nsec variable is > 2000ns, should be
built and installed via sys_bpf(), but fails to do so before this patch:

  # cat filter.c
  #include <uapi/linux/bpf.h>
  #define SEC(NAME) __attribute__((section(NAME), used))

  SEC("func=hrtimer_nanosleep rqtp->tv_nsec")
  int func(void *ctx, int err, long nsec)
  {
	  return nsec > 1000;
  }
  char _license[] SEC("license") = "GPL";
  int _version SEC("version") = LINUX_VERSION_CODE;
  #

  # perf trace -e nanosleep,filter.c usleep 1
  invalid or unsupported event: 'filter.c'
  Run 'perf list' for a list of valid events

   Usage: perf trace [<options>] [<command>]
      or: perf trace [<options>] -- <command> [<options>]
      or: perf trace record [<options>] [<command>]
      or: perf trace record [<options>] -- <command> [<options>]

      -e, --event <event>   event/syscall selector. use 'perf list' to list available events
  #

And works again after it is applied, the nothing is inserted when the co

  # perf trace -e *sleep,filter.c usleep 1
     0.000 ( 0.066 ms): usleep/23994 nanosleep(rqtp: 0x7ffead94a0d0) = 0
  # perf trace -e *sleep,filter.c usleep 2
     0.000 ( 0.008 ms): usleep/24378 nanosleep(rqtp: 0x7fffa021ba50) ...
     0.008 (         ): perf_bpf_probe:func:(ffffffffb410cb30) tv_nsec=2000)
     0.000 ( 0.066 ms): usleep/24378  ... [continued]: nanosleep()) = 0
  #

The intent of 9445464bb8 is kept:

  # perf stat -e 'cpu/uops_executed.core,krava/'  true
  event syntax error: '..cuted.core,krava/'
                                    \___ unknown term

  valid terms: cmask,pc,event,edge,in_tx,any,ldlat,inv,umask,in_tx_cp,offcore_rsp,config,config1,config2,name,period
  Run 'perf list' for a list of valid events

   Usage: perf stat [<options>] [<command>]

      -e, --event <event>   event selector. use 'perf list' to list available events
  #
  # perf stat -e 'cpu/uops_executed.core,period=1/'  true

   Performance counter stats for 'true':

           808,332      cpu/uops_executed.core,period=1/

       0.002997237 seconds time elapsed

  #

Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes: 9445464bb8 ("perf tools: Unwind properly location after REJECT")
Link: http://lkml.kernel.org/n/tip-diea0ihbwpxfw6938huv3whj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-09 10:10:58 -03:00
Jiri Olsa
b6af53b7d6 perf tools: Add "reject" option for parse-events.l
Arnaldo reported broken builds in some distros using a newer flex
release, 2.6.4, found in Alpine Linux 3.6 and Edge, with flex not
spotting the REJECT macro:

  CC       /tmp/build/perf/util/parse-events-flex.o
  util/parse-events.l: In function 'parse_events_lex':
  /tmp/build/perf/util/parse-events-flex.c:4734:16: error: \
  'reject_used_but_not_detected' undeclared (first use in this function)

It's happening because we put the REJECT under another USER_REJECT macro
in following commit:

  9445464bb8 perf tools: Unwind properly location after REJECT

Fortunately flex provides option for force it to use REJECT, adding it
to parse-events.l.

Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Reviewed-by: Andi Kleen <andi@firstfloor.org>
Tested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes: 9445464bb8 ("perf tools: Unwind properly location after REJECT")
Link: http://lkml.kernel.org/n/tip-7kdont984mw12ijk7rji6b8p@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-11-09 10:09:03 -03:00
Ricardo Neri
a9e017d561 selftests/x86: Add tests for the STR and SLDT instructions
The STR and SLDT instructions are not valid when running on virtual-8086
mode and generate an invalid operand exception. These two instructions are
protected by the Intel User-Mode Instruction Prevention (UMIP) security
feature. In protected mode, if UMIP is enabled, these instructions generate
a general protection fault if called from CPL > 0. Linux traps the general
protection fault and emulates the instructions sgdt, sidt and smsw; but not
str and sldt.

These tests are added to verify that the emulation code does not emulate
these two instructions but the expected invalid operand exception is
seen.

Tests fallback to exit with INT3 in case emulation does happen.

Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: ricardo.neri@intel.com
Link: http://lkml.kernel.org/r/1509935277-22138-13-git-send-email-ricardo.neri-calderon@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-08 11:16:25 +01:00
Ricardo Neri
9390afebe1 selftests/x86: Add tests for User-Mode Instruction Prevention
Certain user space programs that run on virtual-8086 mode may utilize
instructions protected by the User-Mode Instruction Prevention (UMIP)
security feature present in new Intel processors: SGDT, SIDT and SMSW. In
such a case, a general protection fault is issued if UMIP is enabled. When
such a fault happens, the kernel traps it and emulates the results of
these instructions with dummy values. The purpose of this new
test is to verify whether the impacted instructions can be executed
without causing such #GP. If no #GP exceptions occur, we expect to exit
virtual-8086 mode from INT3.

The instructions protected by UMIP are executed in representative use
cases:

 a) displacement-only memory addressing
 b) register-indirect memory addressing
 c) results stored directly in operands

Unfortunately, it is not possible to check the results against a set of
expected values because no emulation will occur in systems that do not
have the UMIP feature. Instead, results are printed for verification. A
simple verification is done to ensure that results of all tests are
identical.

Signed-off-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chen Yucong <slaoub@gmail.com>
Cc: Chris Metcalf <cmetcalf@mellanox.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi V. Shankar <ravi.v.shankar@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: ricardo.neri@intel.com
Link: http://lkml.kernel.org/r/1509935277-22138-12-git-send-email-ricardo.neri-calderon@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-08 11:16:24 +01:00
Andy Lutomirski
fec8f5ae17 selftests/x86/ldt_get: Add a few additional tests for limits
We weren't testing the .limit and .limit_in_pages fields very well.
Add more tests.

This addition seems to trigger the "bits 16:19 are undefined" issue
that was fixed in an earlier patch.  I think that, at least on my
CPU, the high nibble of the limit ends in LAR bits 16:19.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/5601c15ea9b3113d288953fd2838b18bedf6bc67.1509794321.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-07 11:13:43 +01:00
Andy Lutomirski
adedf2893c selftests/x86/ldt_gdt: Run most existing LDT test cases against the GDT as well
Now that the main test infrastructure supports the GDT, run tests
that will pass the kernel's GDT permission tests against the GDT.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/686a1eda63414da38fcecc2412db8dba1ae40581.1509794321.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-07 11:13:43 +01:00
Andy Lutomirski
d744dcad39 selftests/x86/ldt_gdt: Add infrastructure to test set_thread_area()
Much of the test design could apply to set_thread_area() (i.e. GDT),
not just modify_ldt().  Add set_thread_area() to the
install_valid_mode() helper.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/02c23f8fba5547007f741dc24c3926e5284ede02.1509794321.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-07 11:13:42 +01:00
Andy Lutomirski
d60ad744c9 selftests/x86/ldt_gdt: Robustify against set_thread_area() and LAR oddities
Bits 19:16 of LAR's result are undefined, and some upcoming
improvements to the test case seem to trigger this.  Mask off those
bits to avoid spurious failures.

commit 5b781c7e31 ("x86/tls: Forcibly set the accessed bit in TLS
segments") adds a valid case in which LAR's output doesn't quite
agree with set_thread_area()'s input.  This isn't triggered in the
test as is, but it will be if we start calling set_thread_area()
with the accessed bit clear.  Work around this discrepency.

I've added a Fixes tag so that -stable can pick this up if neccesary.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 5b781c7e31 ("x86/tls: Forcibly set the accessed bit in TLS segments")
Link: http://lkml.kernel.org/r/b82f3f89c034b53580970ac865139fd8863f44e2.1509794321.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-07 11:13:42 +01:00
Andy Lutomirski
693cb5580f selftests/x86/protection_keys: Fix syscall NR redefinition warnings
On new enough glibc, the pkey syscalls numbers are available.  Check
first before defining them to avoid warnings like:

protection_keys.c:198:0: warning: "SYS_pkey_alloc" redefined

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1fbef53a9e6befb7165ff855fc1a7d4788a191d6.1509794321.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-07 11:13:42 +01:00
Ingo Molnar
b3d9a13681 Merge branch 'linus' into x86/asm, to pick up fixes and resolve conflicts
Conflicts:
	arch/x86/kernel/cpu/Makefile

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-07 10:53:06 +01:00
Ingo Molnar
8c5db92a70 Merge branch 'linus' into locking/core, to resolve conflicts
Conflicts:
	include/linux/compiler-clang.h
	include/linux/compiler-gcc.h
	include/linux/compiler-intel.h
	include/uapi/linux/stddef.h

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-07 10:32:44 +01:00
Ingo Molnar
15bcdc9477 Merge branch 'linus' into perf/core, to fix conflicts
Conflicts:
	tools/perf/arch/arm/annotate/instructions.c
	tools/perf/arch/arm64/annotate/instructions.c
	tools/perf/arch/powerpc/annotate/instructions.c
	tools/perf/arch/s390/annotate/instructions.c
	tools/perf/arch/x86/tests/intel-cqm.c
	tools/perf/ui/tui/progress.c
	tools/perf/util/zlib.c

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-07 10:30:18 +01:00
Linus Torvalds
9d9cc4aa00 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "Various fixes:

   - synchronize kernel and tooling headers

   - cgroup support fix

   - two tooling fixes"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  tools/headers: Synchronize kernel ABI headers
  perf/cgroup: Fix perf cgroup hierarchy support
  perf tools: Unwind properly location after REJECT
  perf symbols: Fix memory corruption because of zero length symbols
2017-11-05 11:44:39 -08:00
Ingo Molnar
fb7df12d64 tools/headers: Synchronize kernel ABI headers
After the SPDX license tags were added a number of tooling headers got out of
sync with their kernel variants, generating lots of build warnings.

Sync them:

 - tools/arch/x86/include/asm/disabled-features.h,
   tools/arch/x86/include/asm/required-features.h,
   tools/include/linux/hash.h:

     Remove the SPDX tag where the kernel version does not have it.

 - tools/include/asm-generic/bitops/__fls.h,
   tools/include/asm-generic/bitops/arch_hweight.h,
   tools/include/asm-generic/bitops/const_hweight.h,
   tools/include/asm-generic/bitops/fls.h,
   tools/include/asm-generic/bitops/fls64.h,
   tools/include/uapi/asm-generic/ioctls.h,
   tools/include/uapi/asm-generic/mman-common.h,
   tools/include/uapi/sound/asound.h,
   tools/include/uapi/linux/kvm.h,
   tools/include/uapi/linux/perf_event.h,
   tools/include/uapi/linux/sched.h,
   tools/include/uapi/linux/vhost.h,
   tools/include/uapi/sound/asound.h:

     Add the SPDX tag of the respective kernel header.

 - tools/include/uapi/linux/bpf_common.h,
   tools/include/uapi/linux/fcntl.h,
   tools/include/uapi/linux/hw_breakpoint.h,
   tools/include/uapi/linux/mman.h,
   tools/include/uapi/linux/stat.h,

     Change the tag to the kernel header version:

       -/* SPDX-License-Identifier: GPL-2.0 */
       +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */

Also sync other header details:

 - include/uapi/sound/asound.h:

     Fix pointless end of line whitespace noise the header grew in this cycle.

 - tools/arch/x86/lib/memcpy_64.S:

     Sync the code and add tools/include/asm/export.h with dummy wrappers
     to support building the kernel side code in a tooling header environment.

 - tools/include/uapi/asm-generic/mman.h,
   tools/include/uapi/linux/bpf.h:

     Sync other details that don't impact tooling's use of the ABIs.

Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-04 09:27:46 +01:00
Josh Poimboeuf
da0db32bbe objtool: Resync objtool's instruction decoder source code copy with the kernel's latest version
This fixes the following warning:

  warning: objtool: x86 instruction decoder differs from kernel

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/013315a808ccf5580abc293808827c8e2b5e1354.1509719152.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-04 08:54:06 +01:00
Ingo Molnar
294cbd05e3 Merge branch 'linus' into perf/urgent, to pick up dependent commits
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-03 12:30:12 +01:00