- Support instruction latency in 'perf report'. With both memory latency
(weight) and instruction latency information, users can locate expensive load
instructions and understand the time spent in different stages.
- Extend 'perf c2c' to display the number of loads which were blocked by data
or address conflict.
- Add 'perf stat' support for L2 topdown events on systems such as Intel's
Sapphire Rapids server.
- Add support for PERF_SAMPLE_CODE_PAGE_SIZE in various tools, as a sort key, for instance:
perf report --stdio --sort=comm,symbol,code_page_size
- New 'perf daemon' command to run long-running sessions while providing a way to control
the enablement of events without restarting a traditional 'perf record' session.
- Enable counting events for BPF programs in 'perf stat' just like for other
targets (tid, cgroup, cpu, etc), e.g.:
# perf stat -e ref-cycles,cycles -b 254 -I 1000
1.487903822 115,200 ref-cycles
1.487903822 86,012 cycles
2.489147029 80,560 ref-cycles
2.489147029 73,784 cycles
^C#
The example above counts 'cycles' and 'ref-cycles' for the BPF program with id 254.
It is similar to the 'bpftool prog profile' command, but more flexible.
- Support the new layout for PERF_RECORD_MMAP2 to carry the DSO build-id using infrastructure
generalised from the eBPF subsystem. This removes the need to traverse the perf.data file
to collect build-ids at the end of 'perf record' sessions and helps with long-running
sessions where binaries can get replaced in updates, leading to possible mis-resolution
of symbols.
- Support filtering by hex address in 'perf script'.
- Support DSO filter in 'perf script', like in other perf tools.
- Add namespaces support to 'perf inject'.
- Add support for SDT (DTrace-style markers) events on ARM64.
perf record:
- Fix handling of eventfd() when draining a buffer in 'perf record'.
- Improvements to the generation of metadata events for pre-existing threads (mmaps, comm, etc),
speeding up the work done at the start of system-wide or per-CPU 'perf record' sessions.
Hardware tracing:
- Initial support for tracing KVM with Intel PT.
- Intel PT fixes for IPC.
- Support Intel PT PSB (synchronization packets) events.
- Automatically group aux-output events to overcome --filter syntax.
- Enable PERF_SAMPLE_DATA_SRC on ARM's SPE.
- Update ARM's CoreSight hardware tracing OpenCSD library to v1.0.0.
perf annotate TUI:
- Fix handling of 'k' ("show line number") hotkey.
- Fix jump parsing for C++ code.
perf probe:
- Add protection to avoid endless loop.
cgroups:
- Avoid reading cgroup mountpoint multiple times, caching it.
- Fix handling of cgroup v1/v2 in mixed hierarchy.
Symbol resolving:
- Add OCaml symbol demangling.
- Further fixes for handling PE executables when using perf with Wine and .exe/.dll files.
- Fix 'perf unwind' DSO handling.
- Resolve symbols against debug file first, to deal with artifacts related to LTO.
- Fix gap between kernel end and module start on powerpc.
Reporting tools:
- The DSO filter shouldn't show samples in unresolved maps.
- Improve debuginfod support in various tools.
build ids:
- Fix 16-byte build ids in 'perf buildid-cache', add a 'perf test' entry for that case.
perf test:
- Support for PERF_SAMPLE_WEIGHT_STRUCT.
- Add test case for PERF_SAMPLE_CODE_PAGE_SIZE.
- Shell-based tests for the 'perf daemon' commands ('start', 'stop', 'reconfig', 'list', etc).
- ARM cs-etm 'perf test' fixes.
- Add parse-metric memory bandwidth testcase.
Compiler related:
- Fix 'perf probe' kretprobe issue caused by gcc 11 bug when used with -fpatchable-function-entry.
- Fix ARM64 build with gcc 11's -Wformat-overflow.
- Fix unaligned access in sample parsing test.
- Fix printf conversion specifier for IP addresses on arm64, s390 and powerpc.
Arch specific:
- Support exposing Performance Monitor Counter SPRs as part of extended regs on powerpc.
- Add JSON 'perf stat' metrics for ARM64's imx8mp, imx8mq and imx8mn DDR, fix imx8mm ones.
- Fix common and uarch events for ARM64's A76 and Ampere eMag.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Merge tag 'perf-tools-for-v5.12-2020-02-19' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tool updates from Arnaldo Carvalho de Melo:
* tag 'perf-tools-for-v5.12-2020-02-19' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (148 commits)
perf buildid-cache: Don't skip 16-byte build-ids
perf buildid-cache: Add test for 16-byte build-id
perf symbol: Remove redundant libbfd checks
perf test: Output the sub testing result in cs-etm
perf test: Suppress logs in cs-etm testing
perf tools: Fix arm64 build error with gcc-11
perf intel-pt: Add documentation for tracing virtual machines
perf intel-pt: Split VM-Entry and VM-Exit branches
perf intel-pt: Adjust sample flags for VM-Exit
perf intel-pt: Allow for a guest kernel address filter
perf intel-pt: Support decoding of guest kernel
perf machine: Factor out machine__idle_thread()
perf machine: Factor out machines__find_guest()
perf intel-pt: Amend decoder to track the NR flag
perf intel-pt: Retain the last PIP packet payload as is
perf intel_pt: Add vmlaunch and vmresume as branches
perf script: Add branch types for VM-Entry and VM-Exit
perf auxtrace: Automatically group aux-output events
perf test: Fix unaligned access in sample parsing test
perf tools: Support arch specific PERF_SAMPLE_WEIGHT_STRUCT processing
...
151 lines
4.5 KiB
Makefile
# SPDX-License-Identifier: GPL-2.0
ifneq ($(O),)
ifeq ($(origin O), command line)
dummy := $(if $(shell cd $(PWD); test -d $(O) || echo $(O)),$(error O=$(O) does not exist),)
ABSOLUTE_O := $(shell cd $(PWD); cd $(O) ; pwd)
OUTPUT := $(ABSOLUTE_O)/$(if $(subdir),$(subdir)/)
COMMAND_O := O=$(ABSOLUTE_O)
ifeq ($(objtree),)
objtree := $(O)
endif
endif
endif

# check that the output directory actually exists
ifneq ($(OUTPUT),)
OUTDIR := $(shell cd $(OUTPUT) && pwd)
$(if $(OUTDIR),, $(error output directory "$(OUTPUT)" does not exist))
endif

#
# Include saner warnings here, which can catch bugs:
#
EXTRA_WARNINGS := -Wbad-function-cast
EXTRA_WARNINGS += -Wdeclaration-after-statement
EXTRA_WARNINGS += -Wformat-security
EXTRA_WARNINGS += -Wformat-y2k
EXTRA_WARNINGS += -Winit-self
EXTRA_WARNINGS += -Wmissing-declarations
EXTRA_WARNINGS += -Wmissing-prototypes
EXTRA_WARNINGS += -Wnested-externs
EXTRA_WARNINGS += -Wno-system-headers
EXTRA_WARNINGS += -Wold-style-definition
EXTRA_WARNINGS += -Wpacked
EXTRA_WARNINGS += -Wredundant-decls
EXTRA_WARNINGS += -Wstrict-prototypes
EXTRA_WARNINGS += -Wswitch-default
EXTRA_WARNINGS += -Wswitch-enum
EXTRA_WARNINGS += -Wundef
EXTRA_WARNINGS += -Wwrite-strings
EXTRA_WARNINGS += -Wformat

CC_NO_CLANG := $(shell $(CC) -dM -E -x c /dev/null | grep -Fq "__clang__"; echo $$?)

# Makefiles suck: This macro sets a default value of $(2) for the
# variable named by $(1), unless the variable has been set by
# environment or command line. This is necessary for CC and AR
# because make sets default values, so the simpler ?= approach
# won't work as expected.
define allow-override
  $(if $(or $(findstring environment,$(origin $(1))),\
            $(findstring command line,$(origin $(1)))),,\
    $(eval $(1) = $(2)))
endef

# Allow setting various cross-compile vars or setting CROSS_COMPILE as a prefix.
$(call allow-override,CC,$(CROSS_COMPILE)gcc)
$(call allow-override,AR,$(CROSS_COMPILE)ar)
$(call allow-override,LD,$(CROSS_COMPILE)ld)
$(call allow-override,CXX,$(CROSS_COMPILE)g++)
$(call allow-override,STRIP,$(CROSS_COMPILE)strip)

ifneq ($(LLVM),)
HOSTAR ?= llvm-ar
HOSTCC ?= clang
HOSTLD ?= ld.lld
else
HOSTAR ?= ar
HOSTCC ?= gcc
HOSTLD ?= ld
endif

# Some tools require Clang, LLC and/or LLVM utils
CLANG ?= clang
LLC ?= llc
LLVM_CONFIG ?= llvm-config
LLVM_OBJCOPY ?= llvm-objcopy
LLVM_STRIP ?= llvm-strip

ifeq ($(CC_NO_CLANG), 1)
EXTRA_WARNINGS += -Wstrict-aliasing=3
endif

# Hack to avoid type-punned warnings on old systems such as RHEL5:
# We should be changing CFLAGS and checking gcc version, but this
# will do for now and keep the above -Wstrict-aliasing=3 in place
# in newer systems.
# Needed for the __raw_cmpxchg in tools/arch/x86/include/asm/cmpxchg.h
#
# See https://lkml.org/lkml/2006/11/28/253 and https://gcc.gnu.org/gcc-4.8/changes.html,
# that takes into account Linus's comments (search for Wshadow) for the reasoning about
# -Wshadow not being interesting before gcc 4.8.

ifneq ($(filter 3.%,$(MAKE_VERSION)),) # make-3
EXTRA_WARNINGS += -fno-strict-aliasing
EXTRA_WARNINGS += -Wno-shadow
else
EXTRA_WARNINGS += -Wshadow
endif

ifneq ($(findstring $(MAKEFLAGS), w),w)
PRINT_DIR = --no-print-directory
else
NO_SUBDIR = :
endif

ifneq ($(findstring s,$(filter-out --%,$(MAKEFLAGS))),)
silent=1
endif

#
# Define a callable command for descending to a new directory
#
# Call by doing: $(call descend,directory[,target])
#
descend = \
  +mkdir -p $(OUTPUT)$(1) && \
  $(MAKE) $(COMMAND_O) subdir=$(if $(subdir),$(subdir)/$(1),$(1)) $(PRINT_DIR) -C $(1) $(2)

QUIET_SUBDIR0 = +$(MAKE) $(COMMAND_O) -C # space to separate -C and subdir
QUIET_SUBDIR1 =

ifneq ($(silent),1)
ifneq ($(V),1)
QUIET_CC = @echo ' CC '$@;
QUIET_CC_FPIC = @echo ' CC FPIC '$@;
QUIET_CLANG = @echo ' CLANG '$@;
QUIET_AR = @echo ' AR '$@;
QUIET_LINK = @echo ' LINK '$@;
QUIET_MKDIR = @echo ' MKDIR '$@;
QUIET_GEN = @echo ' GEN '$@;
QUIET_SUBDIR0 = +@subdir=
QUIET_SUBDIR1 = ;$(NO_SUBDIR) \
  echo ' SUBDIR '$$subdir; \
  $(MAKE) $(PRINT_DIR) -C $$subdir
QUIET_FLEX = @echo ' FLEX '$@;
QUIET_BISON = @echo ' BISON '$@;
QUIET_GENSKEL = @echo ' GEN-SKEL '$@;

descend = \
  +@echo ' DESCEND '$(1); \
  mkdir -p $(OUTPUT)$(1) && \
  $(MAKE) $(COMMAND_O) subdir=$(if $(subdir),$(subdir)/$(1),$(1)) $(PRINT_DIR) -C $(1) $(2)

QUIET_CLEAN = @printf ' CLEAN %s\n' $1;
QUIET_INSTALL = @printf ' INSTALL %s\n' $1;
QUIET_UNINST = @printf ' UNINST %s\n' $1;
endif
endif

pound := \#
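
The comment on allow-override in the Makefile above explains why a plain ?= is not enough for CC and AR: make predefines them, so their origin is "default" rather than "undefined". As a minimal sketch of that behaviour, the standalone makefile below reuses the macro and prints the resulting value and origin of CC. The file name demo.mk, the 'show' target and the aarch64-linux-gnu- prefix are illustrative assumptions and are not part of the kernel tree.

```make
# demo.mk: hypothetical standalone demo of the allow-override macro.

# Same macro as in the Makefile above: assign $(2) to the variable named
# by $(1) unless it came from the environment or the command line.
define allow-override
  $(if $(or $(findstring environment,$(origin $(1))),\
            $(findstring command line,$(origin $(1)))),,\
    $(eval $(1) = $(2)))
endef

# CC has origin "default" unless the caller sets it, so "CC ?= ..." would
# be silently ignored here; allow-override replaces the built-in default.
$(call allow-override,CC,$(CROSS_COMPILE)gcc)

# Print the final value and where it came from.
.PHONY: show
show: ; @echo "CC=$(CC) (origin: $(origin CC))"
```

Assuming GNU make and no CC in the environment, 'make -f demo.mk show' should report CC=gcc with origin "file", and 'make -f demo.mk show CROSS_COMPILE=aarch64-linux-gnu-' should report CC=aarch64-linux-gnu-gcc. Running 'make -f demo.mk show CC=clang' leaves the caller's value alone and reports origin "command line".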