Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf updates from Ingo Molnar:
 "The main kernel side changes were:

   - uprobes enhancements (Masami Hiramatsu)

   - Uncore group events enhancements (David Carrillo-Cisneros)

   - x86 Intel: Add support for Skylake server uncore PMUs (Kan Liang)

   - x86 Intel: LBR cleanups and enhancements, for better branch
     annotation tracking (Peter Zijlstra)

   - x86 Intel: Add support for PTWRITE and power event tracing
     (Alexander Shishkin)

   - ... various fixes, cleanups and smaller enhancements.

  Lots of tooling changes - a couple of highlights:

   - Support event group view with hierarchy mode in 'perf top' and
     'perf report' (Namhyung Kim)

     e.g.:

     $ perf record -e '{cycles,instructions}' make
     $ perf report --hierarchy --stdio
     ...
     #   Overhead  Command / Shared Object / Symbol
     # ......................  ..................................
     ...
     25.74%  27.18%sh
     19.96%  24.14%libc-2.24.so
      9.55%  14.64%[.] __strcmp_sse2
      1.54%   0.00%[.] __tfind
      1.07%   1.13%[.] _int_malloc
      0.95%   0.00%[.] __strchr_sse2
      0.89%   1.39%[.] __tsearch
      0.76%   0.00%[.] strlen

   - Add branch stack / basic block info to 'perf annotate --stdio',
     where for each branch, we add an asm comment after the instruction
     with information on how often it was taken and predicted. See
     example with color output at:

       http://vger.kernel.org/~acme/perf/annotate_basic_blocks.png

     (Peter Zijlstra)

   - Add support for using symbols in address filters with Intel PT and
     ARM CoreSight (hardware assisted tracing facilities) (Adrian
     Hunter, Mathieu Poirier)

   - Add support for interacting with Coresight PMU ETMs/PTMs, that are
     IP blocks to perform hardware assisted tracing on a ARM CPU core
     (Mathieu Poirier)

   - Support generating cross arch probes, i.e. if you specify a vmlinux
     file for different arch than the one in the host machine,

        $ perf probe --definition function_name args

     will generate the probe definition string needed to append to the
     target machine /sys/kernel/debug/tracing/kprobes_events file, using
     scripting (Masami Hiramatsu).

   - Allow configuring the default 'perf report -s' sort order in
     ~/.perfconfig, for instance, "sym,dso" may be more fitting for
     kernel developers. (Arnaldo Carvalho de Melo)

   - ... plus lots of other changes, refactorings, features and fixes"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (149 commits)
  perf tests: Add dwarf unwind test for powerpc
  perf probe: Match linkage name with mangled name
  perf probe: Fix to cut off incompatible chars from group name
  perf probe: Skip if the function address is 0
  perf probe: Ignore the error of finding inline instance
  perf intel-pt: Fix decoding when there are address filters
  perf intel-pt: Enable decoder to handle TIP.PGD with missing IP
  perf intel-pt: Read address filter from AUXTRACE_INFO event
  perf intel-pt: Record address filter in AUXTRACE_INFO event
  perf intel-pt: Add a helper function for processing AUXTRACE_INFO
  perf intel-pt: Fix missing error codes processing auxtrace_info
  perf intel-pt: Add support for recording the max non-turbo ratio
  perf intel-pt: Fix snapshot overlap detection decoder errors
  perf probe: Increase debug level of SDT debug messages
  perf record: Add support for using symbols in address filters
  perf symbols: Add dso__last_symbol()
  perf record: Fix error paths
  perf record: Rename label 'out_symbol_exit'
  perf script: Fix vanished idle symbols
  perf evsel: Add support for address filters
  ...
This commit is contained in:
Linus Torvalds 2016-10-03 12:47:28 -07:00
commit 12b7bcb43e
179 changed files with 6053 additions and 988 deletions

View File

@ -44,8 +44,8 @@ Synopsis of kprobe_events
+|-offs(FETCHARG) : Fetch memory at FETCHARG +|- offs address.(**)
NAME=FETCHARG : Set NAME as the argument name of FETCHARG.
FETCHARG:TYPE : Set TYPE as the type of FETCHARG. Currently, basic types
(u8/u16/u32/u64/s8/s16/s32/s64), "string" and bitfield
are supported.
(u8/u16/u32/u64/s8/s16/s32/s64), hexadecimal types
(x8/x16/x32/x64), "string" and bitfield are supported.
(*) only for return probe.
(**) this is useful for fetching a field of data structures.
@ -54,7 +54,10 @@ Types
-----
Several types are supported for fetch-args. Kprobe tracer will access memory
by given type. Prefix 's' and 'u' means those types are signed and unsigned
respectively. Traced arguments are shown in decimal (signed) or hex (unsigned).
respectively. 'x' prefix implies it is unsigned. Traced arguments are shown
in decimal ('s' and 'u') or hexadecimal ('x'). Without type casting, 'x32'
or 'x64' is used depends on the architecture (e.g. x86-32 uses x32, and
x86-64 uses x64).
String type is a special type, which fetches a "null-terminated" string from
kernel space. This means it will fail and store NULL if the string container
has been paged out.

View File

@ -40,8 +40,8 @@ Synopsis of uprobe_tracer
+|-offs(FETCHARG) : Fetch memory at FETCHARG +|- offs address.(**)
NAME=FETCHARG : Set NAME as the argument name of FETCHARG.
FETCHARG:TYPE : Set TYPE as the type of FETCHARG. Currently, basic types
(u8/u16/u32/u64/s8/s16/s32/s64), "string" and bitfield
are supported.
(u8/u16/u32/u64/s8/s16/s32/s64), hexadecimal types
(x8/x16/x32/x64), "string" and bitfield are supported.
(*) only for return probe.
(**) this is useful for fetching a field of data structures.
@ -50,7 +50,10 @@ Types
-----
Several types are supported for fetch-args. Uprobe tracer will access memory
by given type. Prefix 's' and 'u' means those types are signed and unsigned
respectively. Traced arguments are shown in decimal (signed) or hex (unsigned).
respectively. 'x' prefix implies it is unsigned. Traced arguments are shown
in decimal ('s' and 'u') or hexadecimal ('x'). Without type casting, 'x32'
or 'x64' is used depends on the architecture (e.g. x86-32 uses x32, and
x86-64 uses x64).
String type is a special type, which fetches a "null-terminated" string from
user space.
Bitfield is another special type, which takes 3 parameters, bit-width, bit-

View File

@ -1125,6 +1125,11 @@ F: drivers/hwtracing/coresight/*
F: Documentation/trace/coresight.txt
F: Documentation/devicetree/bindings/arm/coresight.txt
F: Documentation/ABI/testing/sysfs-bus-coresight-devices-*
F: tools/perf/arch/arm/util/pmu.c
F: tools/perf/arch/arm/util/auxtrace.c
F: tools/perf/arch/arm/util/cs-etm.c
F: tools/perf/arch/arm/util/cs-etm.h
F: tools/perf/util/cs-etm.h
ARM/CORGI MACHINE SUPPORT
M: Richard Purdie <rpurdie@rpsys.net>

View File

@ -1201,6 +1201,9 @@ static int x86_pmu_add(struct perf_event *event, int flags)
* If group events scheduling transaction was started,
* skip the schedulability test here, it will be performed
* at commit time (->commit_txn) as a whole.
*
* If commit fails, we'll call ->del() on all events
* for which ->add() was called.
*/
if (cpuc->txn_flags & PERF_PMU_TXN_ADD)
goto done_collect;
@ -1223,6 +1226,14 @@ done_collect:
cpuc->n_added += n - n0;
cpuc->n_txn += n - n0;
if (x86_pmu.add) {
/*
* This is before x86_pmu_enable() will call x86_pmu_start(),
* so we enable LBRs before an event needs them etc..
*/
x86_pmu.add(event);
}
ret = 0;
out:
return ret;
@ -1346,7 +1357,7 @@ static void x86_pmu_del(struct perf_event *event, int flags)
event->hw.flags &= ~PERF_X86_EVENT_COMMITTED;
/*
* If we're called during a txn, we don't need to do anything.
* If we're called during a txn, we only need to undo x86_pmu.add.
* The events never got scheduled and ->cancel_txn will truncate
* the event_list.
*
@ -1354,7 +1365,7 @@ static void x86_pmu_del(struct perf_event *event, int flags)
* an event added during that same TXN.
*/
if (cpuc->txn_flags & PERF_PMU_TXN_ADD)
return;
goto do_del;
/*
* Not a TXN, therefore cleanup properly.
@ -1384,6 +1395,15 @@ static void x86_pmu_del(struct perf_event *event, int flags)
--cpuc->n_events;
perf_event_update_userpage(event);
do_del:
if (x86_pmu.del) {
/*
* This is after x86_pmu_stop(); so we disable LBRs after any
* event can need them etc..
*/
x86_pmu.del(event);
}
}
int x86_pmu_handle_irq(struct pt_regs *regs)

View File

@ -1906,13 +1906,6 @@ static void intel_pmu_disable_event(struct perf_event *event)
cpuc->intel_ctrl_host_mask &= ~(1ull << hwc->idx);
cpuc->intel_cp_status &= ~(1ull << hwc->idx);
/*
* must disable before any actual event
* because any event may be combined with LBR
*/
if (needs_branch_stack(event))
intel_pmu_lbr_disable(event);
if (unlikely(hwc->config_base == MSR_ARCH_PERFMON_FIXED_CTR_CTRL)) {
intel_pmu_disable_fixed(hwc);
return;
@ -1924,6 +1917,14 @@ static void intel_pmu_disable_event(struct perf_event *event)
intel_pmu_pebs_disable(event);
}
static void intel_pmu_del_event(struct perf_event *event)
{
if (needs_branch_stack(event))
intel_pmu_lbr_del(event);
if (event->attr.precise_ip)
intel_pmu_pebs_del(event);
}
static void intel_pmu_enable_fixed(struct hw_perf_event *hwc)
{
int idx = hwc->idx - INTEL_PMC_IDX_FIXED;
@ -1967,12 +1968,6 @@ static void intel_pmu_enable_event(struct perf_event *event)
intel_pmu_enable_bts(hwc->config);
return;
}
/*
* must enabled before any actual event
* because any event may be combined with LBR
*/
if (needs_branch_stack(event))
intel_pmu_lbr_enable(event);
if (event->attr.exclude_host)
cpuc->intel_ctrl_guest_mask |= (1ull << hwc->idx);
@ -1993,6 +1988,14 @@ static void intel_pmu_enable_event(struct perf_event *event)
__x86_pmu_enable_event(hwc, ARCH_PERFMON_EVENTSEL_ENABLE);
}
static void intel_pmu_add_event(struct perf_event *event)
{
if (event->attr.precise_ip)
intel_pmu_pebs_add(event);
if (needs_branch_stack(event))
intel_pmu_lbr_add(event);
}
/*
* Save and restart an expired event. Called by NMI contexts,
* so it has to be careful about preempting normal event ops:
@ -3291,6 +3294,8 @@ static __initconst const struct x86_pmu intel_pmu = {
.enable_all = intel_pmu_enable_all,
.enable = intel_pmu_enable_event,
.disable = intel_pmu_disable_event,
.add = intel_pmu_add_event,
.del = intel_pmu_del_event,
.hw_config = intel_pmu_hw_config,
.schedule_events = x86_schedule_events,
.eventsel = MSR_ARCH_PERFMON_EVENTSEL0,

View File

@ -806,9 +806,65 @@ struct event_constraint *intel_pebs_constraints(struct perf_event *event)
return &emptyconstraint;
}
static inline bool pebs_is_enabled(struct cpu_hw_events *cpuc)
/*
* We need the sched_task callback even for per-cpu events when we use
* the large interrupt threshold, such that we can provide PID and TID
* to PEBS samples.
*/
static inline bool pebs_needs_sched_cb(struct cpu_hw_events *cpuc)
{
return (cpuc->pebs_enabled & ((1ULL << MAX_PEBS_EVENTS) - 1));
return cpuc->n_pebs && (cpuc->n_pebs == cpuc->n_large_pebs);
}
static inline void pebs_update_threshold(struct cpu_hw_events *cpuc)
{
struct debug_store *ds = cpuc->ds;
u64 threshold;
if (cpuc->n_pebs == cpuc->n_large_pebs) {
threshold = ds->pebs_absolute_maximum -
x86_pmu.max_pebs_events * x86_pmu.pebs_record_size;
} else {
threshold = ds->pebs_buffer_base + x86_pmu.pebs_record_size;
}
ds->pebs_interrupt_threshold = threshold;
}
static void
pebs_update_state(bool needed_cb, struct cpu_hw_events *cpuc, struct pmu *pmu)
{
/*
* Make sure we get updated with the first PEBS
* event. It will trigger also during removal, but
* that does not hurt:
*/
bool update = cpuc->n_pebs == 1;
if (needed_cb != pebs_needs_sched_cb(cpuc)) {
if (!needed_cb)
perf_sched_cb_inc(pmu);
else
perf_sched_cb_dec(pmu);
update = true;
}
if (update)
pebs_update_threshold(cpuc);
}
void intel_pmu_pebs_add(struct perf_event *event)
{
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
struct hw_perf_event *hwc = &event->hw;
bool needed_cb = pebs_needs_sched_cb(cpuc);
cpuc->n_pebs++;
if (hwc->flags & PERF_X86_EVENT_FREERUNNING)
cpuc->n_large_pebs++;
pebs_update_state(needed_cb, cpuc, event->ctx->pmu);
}
void intel_pmu_pebs_enable(struct perf_event *event)
@ -816,12 +872,9 @@ void intel_pmu_pebs_enable(struct perf_event *event)
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
struct hw_perf_event *hwc = &event->hw;
struct debug_store *ds = cpuc->ds;
bool first_pebs;
u64 threshold;
hwc->config &= ~ARCH_PERFMON_EVENTSEL_INT;
first_pebs = !pebs_is_enabled(cpuc);
cpuc->pebs_enabled |= 1ULL << hwc->idx;
if (event->hw.flags & PERF_X86_EVENT_PEBS_LDLAT)
@ -830,46 +883,34 @@ void intel_pmu_pebs_enable(struct perf_event *event)
cpuc->pebs_enabled |= 1ULL << 63;
/*
* When the event is constrained enough we can use a larger
* threshold and run the event with less frequent PMI.
* Use auto-reload if possible to save a MSR write in the PMI.
* This must be done in pmu::start(), because PERF_EVENT_IOC_PERIOD.
*/
if (hwc->flags & PERF_X86_EVENT_FREERUNNING) {
threshold = ds->pebs_absolute_maximum -
x86_pmu.max_pebs_events * x86_pmu.pebs_record_size;
if (first_pebs)
perf_sched_cb_inc(event->ctx->pmu);
} else {
threshold = ds->pebs_buffer_base + x86_pmu.pebs_record_size;
/*
* If not all events can use larger buffer,
* roll back to threshold = 1
*/
if (!first_pebs &&
(ds->pebs_interrupt_threshold > threshold))
perf_sched_cb_dec(event->ctx->pmu);
}
/* Use auto-reload if possible to save a MSR write in the PMI */
if (hwc->flags & PERF_X86_EVENT_AUTO_RELOAD) {
ds->pebs_event_reset[hwc->idx] =
(u64)(-hwc->sample_period) & x86_pmu.cntval_mask;
}
}
if (first_pebs || ds->pebs_interrupt_threshold > threshold)
ds->pebs_interrupt_threshold = threshold;
void intel_pmu_pebs_del(struct perf_event *event)
{
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
struct hw_perf_event *hwc = &event->hw;
bool needed_cb = pebs_needs_sched_cb(cpuc);
cpuc->n_pebs--;
if (hwc->flags & PERF_X86_EVENT_FREERUNNING)
cpuc->n_large_pebs--;
pebs_update_state(needed_cb, cpuc, event->ctx->pmu);
}
void intel_pmu_pebs_disable(struct perf_event *event)
{
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
struct hw_perf_event *hwc = &event->hw;
struct debug_store *ds = cpuc->ds;
bool large_pebs = ds->pebs_interrupt_threshold >
ds->pebs_buffer_base + x86_pmu.pebs_record_size;
if (large_pebs)
if (cpuc->n_pebs == cpuc->n_large_pebs)
intel_pmu_drain_pebs_buffer();
cpuc->pebs_enabled &= ~(1ULL << hwc->idx);
@ -879,9 +920,6 @@ void intel_pmu_pebs_disable(struct perf_event *event)
else if (event->hw.flags & PERF_X86_EVENT_PEBS_ST)
cpuc->pebs_enabled &= ~(1ULL << 63);
if (large_pebs && !pebs_is_enabled(cpuc))
perf_sched_cb_dec(event->ctx->pmu);
if (cpuc->enabled)
wrmsrl(MSR_IA32_PEBS_ENABLE, cpuc->pebs_enabled);

View File

@ -380,7 +380,6 @@ static void __intel_pmu_lbr_save(struct x86_perf_task_context *task_ctx)
void intel_pmu_lbr_sched_task(struct perf_event_context *ctx, bool sched_in)
{
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
struct x86_perf_task_context *task_ctx;
/*
@ -390,31 +389,21 @@ void intel_pmu_lbr_sched_task(struct perf_event_context *ctx, bool sched_in)
*/
task_ctx = ctx ? ctx->task_ctx_data : NULL;
if (task_ctx) {
if (sched_in) {
if (sched_in)
__intel_pmu_lbr_restore(task_ctx);
cpuc->lbr_context = ctx;
} else {
else
__intel_pmu_lbr_save(task_ctx);
}
return;
}
/*
* When sampling the branck stack in system-wide, it may be
* necessary to flush the stack on context switch. This happens
* when the branch stack does not tag its entries with the pid
* of the current task. Otherwise it becomes impossible to
* associate a branch entry with a task. This ambiguity is more
* likely to appear when the branch stack supports priv level
* filtering and the user sets it to monitor only at the user
* level (which could be a useful measurement in system-wide
* mode). In that case, the risk is high of having a branch
* stack with branch from multiple tasks.
*/
if (sched_in) {
* Since a context switch can flip the address space and LBR entries
* are not tagged with an identifier, we need to wipe the LBR, even for
* per-cpu events. You simply cannot resolve the branches from the old
* address space.
*/
if (sched_in)
intel_pmu_lbr_reset();
cpuc->lbr_context = ctx;
}
}
static inline bool branch_user_callstack(unsigned br_sel)
@ -422,7 +411,7 @@ static inline bool branch_user_callstack(unsigned br_sel)
return (br_sel & X86_BR_USER) && (br_sel & X86_BR_CALL_STACK);
}
void intel_pmu_lbr_enable(struct perf_event *event)
void intel_pmu_lbr_add(struct perf_event *event)
{
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
struct x86_perf_task_context *task_ctx;
@ -430,27 +419,38 @@ void intel_pmu_lbr_enable(struct perf_event *event)
if (!x86_pmu.lbr_nr)
return;
/*
* Reset the LBR stack if we changed task context to
* avoid data leaks.
*/
if (event->ctx->task && cpuc->lbr_context != event->ctx) {
intel_pmu_lbr_reset();
cpuc->lbr_context = event->ctx;
}
cpuc->br_sel = event->hw.branch_reg.reg;
if (branch_user_callstack(cpuc->br_sel) && event->ctx &&
event->ctx->task_ctx_data) {
if (branch_user_callstack(cpuc->br_sel) && event->ctx->task_ctx_data) {
task_ctx = event->ctx->task_ctx_data;
task_ctx->lbr_callstack_users++;
}
cpuc->lbr_users++;
/*
* Request pmu::sched_task() callback, which will fire inside the
* regular perf event scheduling, so that call will:
*
* - restore or wipe; when LBR-callstack,
* - wipe; otherwise,
*
* when this is from __perf_event_task_sched_in().
*
* However, if this is from perf_install_in_context(), no such callback
* will follow and we'll need to reset the LBR here if this is the
* first LBR event.
*
* The problem is, we cannot tell these cases apart... but we can
* exclude the biggest chunk of cases by looking at
* event->total_time_running. An event that has accrued runtime cannot
* be 'new'. Conversely, a new event can get installed through the
* context switch path for the first time.
*/
perf_sched_cb_inc(event->ctx->pmu);
if (!cpuc->lbr_users++ && !event->total_time_running)
intel_pmu_lbr_reset();
}
void intel_pmu_lbr_disable(struct perf_event *event)
void intel_pmu_lbr_del(struct perf_event *event)
{
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
struct x86_perf_task_context *task_ctx;
@ -467,12 +467,6 @@ void intel_pmu_lbr_disable(struct perf_event *event)
cpuc->lbr_users--;
WARN_ON_ONCE(cpuc->lbr_users < 0);
perf_sched_cb_dec(event->ctx->pmu);
if (cpuc->enabled && !cpuc->lbr_users) {
__intel_pmu_lbr_disable();
/* avoid stale pointer */
cpuc->lbr_context = NULL;
}
}
void intel_pmu_lbr_enable_all(bool pmi)

View File

@ -69,6 +69,8 @@ static struct pt_cap_desc {
PT_CAP(psb_cyc, 0, CR_EBX, BIT(1)),
PT_CAP(ip_filtering, 0, CR_EBX, BIT(2)),
PT_CAP(mtc, 0, CR_EBX, BIT(3)),
PT_CAP(ptwrite, 0, CR_EBX, BIT(4)),
PT_CAP(power_event_trace, 0, CR_EBX, BIT(5)),
PT_CAP(topa_output, 0, CR_ECX, BIT(0)),
PT_CAP(topa_multiple_entries, 0, CR_ECX, BIT(1)),
PT_CAP(single_range_output, 0, CR_ECX, BIT(2)),
@ -259,10 +261,16 @@ fail:
#define RTIT_CTL_MTC (RTIT_CTL_MTC_EN | \
RTIT_CTL_MTC_RANGE)
#define RTIT_CTL_PTW (RTIT_CTL_PTW_EN | \
RTIT_CTL_FUP_ON_PTW)
#define PT_CONFIG_MASK (RTIT_CTL_TSC_EN | \
RTIT_CTL_DISRETC | \
RTIT_CTL_CYC_PSB | \
RTIT_CTL_MTC)
RTIT_CTL_MTC | \
RTIT_CTL_PWR_EVT_EN | \
RTIT_CTL_FUP_ON_PTW | \
RTIT_CTL_PTW_EN)
static bool pt_event_valid(struct perf_event *event)
{
@ -311,6 +319,20 @@ static bool pt_event_valid(struct perf_event *event)
return false;
}
if (config & RTIT_CTL_PWR_EVT_EN &&
!pt_cap_get(PT_CAP_power_event_trace))
return false;
if (config & RTIT_CTL_PTW) {
if (!pt_cap_get(PT_CAP_ptwrite))
return false;
/* FUPonPTW without PTW doesn't make sense */
if ((config & RTIT_CTL_FUP_ON_PTW) &&
!(config & RTIT_CTL_PTW_EN))
return false;
}
return true;
}

View File

@ -26,11 +26,14 @@
#define RTIT_CTL_CYCLEACC BIT(1)
#define RTIT_CTL_OS BIT(2)
#define RTIT_CTL_USR BIT(3)
#define RTIT_CTL_PWR_EVT_EN BIT(4)
#define RTIT_CTL_FUP_ON_PTW BIT(5)
#define RTIT_CTL_CR3EN BIT(7)
#define RTIT_CTL_TOPA BIT(8)
#define RTIT_CTL_MTC_EN BIT(9)
#define RTIT_CTL_TSC_EN BIT(10)
#define RTIT_CTL_DISRETC BIT(11)
#define RTIT_CTL_PTW_EN BIT(12)
#define RTIT_CTL_BRANCH_EN BIT(13)
#define RTIT_CTL_MTC_RANGE_OFFSET 14
#define RTIT_CTL_MTC_RANGE (0x0full << RTIT_CTL_MTC_RANGE_OFFSET)
@ -91,6 +94,8 @@ enum pt_capabilities {
PT_CAP_psb_cyc,
PT_CAP_ip_filtering,
PT_CAP_mtc,
PT_CAP_ptwrite,
PT_CAP_power_event_trace,
PT_CAP_topa_output,
PT_CAP_topa_multiple_entries,
PT_CAP_single_range_output,

View File

@ -357,6 +357,8 @@ static int rapl_pmu_event_init(struct perf_event *event)
if (event->cpu < 0)
return -EINVAL;
event->event_caps |= PERF_EV_CAP_READ_ACTIVE_PKG;
/*
* check event is known (determines counter)
*/
@ -765,6 +767,8 @@ static const struct x86_cpu_id rapl_cpu_match[] __initconst = {
X86_RAPL_MODEL_MATCH(INTEL_FAM6_SKYLAKE_MOBILE, skl_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_SKYLAKE_DESKTOP, skl_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_SKYLAKE_X, hsx_rapl_init),
X86_RAPL_MODEL_MATCH(INTEL_FAM6_ATOM_GOLDMONT, hsw_rapl_init),
{},
};

View File

@ -664,6 +664,8 @@ static int uncore_pmu_event_init(struct perf_event *event)
event->cpu = box->cpu;
event->pmu_private = box;
event->event_caps |= PERF_EV_CAP_READ_ACTIVE_PKG;
event->hw.idx = -1;
event->hw.last_tag = ~0ULL;
event->hw.extra_reg.idx = EXTRA_REG_NONE;
@ -683,7 +685,8 @@ static int uncore_pmu_event_init(struct perf_event *event)
/* fixed counters have event field hardcoded to zero */
hwc->config = 0ULL;
} else {
hwc->config = event->attr.config & pmu->type->event_mask;
hwc->config = event->attr.config &
(pmu->type->event_mask | ((u64)pmu->type->event_mask_ext << 32));
if (pmu->type->ops->hw_config) {
ret = pmu->type->ops->hw_config(box, event);
if (ret)
@ -1321,6 +1324,11 @@ static const struct intel_uncore_init_fun skl_uncore_init __initconst = {
.pci_init = skl_uncore_pci_init,
};
static const struct intel_uncore_init_fun skx_uncore_init __initconst = {
.cpu_init = skx_uncore_cpu_init,
.pci_init = skx_uncore_pci_init,
};
static const struct x86_cpu_id intel_uncore_match[] __initconst = {
X86_UNCORE_MODEL_MATCH(INTEL_FAM6_NEHALEM_EP, nhm_uncore_init),
X86_UNCORE_MODEL_MATCH(INTEL_FAM6_NEHALEM, nhm_uncore_init),
@ -1343,6 +1351,7 @@ static const struct x86_cpu_id intel_uncore_match[] __initconst = {
X86_UNCORE_MODEL_MATCH(INTEL_FAM6_XEON_PHI_KNL, knl_uncore_init),
X86_UNCORE_MODEL_MATCH(INTEL_FAM6_SKYLAKE_DESKTOP,skl_uncore_init),
X86_UNCORE_MODEL_MATCH(INTEL_FAM6_SKYLAKE_MOBILE, skl_uncore_init),
X86_UNCORE_MODEL_MATCH(INTEL_FAM6_SKYLAKE_X, skx_uncore_init),
{},
};

View File

@ -44,6 +44,7 @@ struct intel_uncore_type {
unsigned perf_ctr;
unsigned event_ctl;
unsigned event_mask;
unsigned event_mask_ext;
unsigned fixed_ctr;
unsigned fixed_ctl;
unsigned box_ctl;
@ -120,6 +121,7 @@ struct intel_uncore_box {
};
#define UNCORE_BOX_FLAG_INITIATED 0
#define UNCORE_BOX_FLAG_CTL_OFFS8 1 /* event config registers are 8-byte apart */
struct uncore_event_desc {
struct kobj_attribute attr;
@ -172,6 +174,9 @@ static inline unsigned uncore_pci_fixed_ctr(struct intel_uncore_box *box)
static inline
unsigned uncore_pci_event_ctl(struct intel_uncore_box *box, int idx)
{
if (test_bit(UNCORE_BOX_FLAG_CTL_OFFS8, &box->flags))
return idx * 8 + box->pmu->type->event_ctl;
return idx * 4 + box->pmu->type->event_ctl;
}
@ -377,6 +382,8 @@ int bdx_uncore_pci_init(void);
void bdx_uncore_cpu_init(void);
int knl_uncore_pci_init(void);
void knl_uncore_cpu_init(void);
int skx_uncore_pci_init(void);
void skx_uncore_cpu_init(void);
/* perf_event_intel_uncore_nhmex.c */
void nhmex_uncore_cpu_init(void);

View File

@ -388,6 +388,8 @@ static int snb_uncore_imc_event_init(struct perf_event *event)
event->cpu = box->cpu;
event->pmu_private = box;
event->event_caps |= PERF_EV_CAP_READ_ACTIVE_PKG;
event->hw.idx = -1;
event->hw.last_tag = ~0ULL;
event->hw.extra_reg.idx = EXTRA_REG_NONE;

View File

@ -1,6 +1,10 @@
/* SandyBridge-EP/IvyTown uncore support */
#include "uncore.h"
/* SNB-EP pci bus to socket mapping */
#define SNBEP_CPUNODEID 0x40
#define SNBEP_GIDNIDMAP 0x54
/* SNB-EP Box level control */
#define SNBEP_PMON_BOX_CTL_RST_CTRL (1 << 0)
#define SNBEP_PMON_BOX_CTL_RST_CTRS (1 << 1)
@ -264,15 +268,72 @@
SNBEP_PCU_MSR_PMON_CTL_OCC_INVERT | \
SNBEP_PCU_MSR_PMON_CTL_OCC_EDGE_DET)
/* SKX pci bus to socket mapping */
#define SKX_CPUNODEID 0xc0
#define SKX_GIDNIDMAP 0xd4
/* SKX CHA */
#define SKX_CHA_MSR_PMON_BOX_FILTER_TID (0x1ffULL << 0)
#define SKX_CHA_MSR_PMON_BOX_FILTER_LINK (0xfULL << 9)
#define SKX_CHA_MSR_PMON_BOX_FILTER_STATE (0x3ffULL << 17)
#define SKX_CHA_MSR_PMON_BOX_FILTER_REM (0x1ULL << 32)
#define SKX_CHA_MSR_PMON_BOX_FILTER_LOC (0x1ULL << 33)
#define SKX_CHA_MSR_PMON_BOX_FILTER_ALL_OPC (0x1ULL << 35)
#define SKX_CHA_MSR_PMON_BOX_FILTER_NM (0x1ULL << 36)
#define SKX_CHA_MSR_PMON_BOX_FILTER_NOT_NM (0x1ULL << 37)
#define SKX_CHA_MSR_PMON_BOX_FILTER_OPC0 (0x3ffULL << 41)
#define SKX_CHA_MSR_PMON_BOX_FILTER_OPC1 (0x3ffULL << 51)
#define SKX_CHA_MSR_PMON_BOX_FILTER_C6 (0x1ULL << 61)
#define SKX_CHA_MSR_PMON_BOX_FILTER_NC (0x1ULL << 62)
#define SKX_CHA_MSR_PMON_BOX_FILTER_ISOC (0x1ULL << 63)
/* SKX IIO */
#define SKX_IIO0_MSR_PMON_CTL0 0xa48
#define SKX_IIO0_MSR_PMON_CTR0 0xa41
#define SKX_IIO0_MSR_PMON_BOX_CTL 0xa40
#define SKX_IIO_MSR_OFFSET 0x20
#define SKX_PMON_CTL_TRESH_MASK (0xff << 24)
#define SKX_PMON_CTL_TRESH_MASK_EXT (0xf)
#define SKX_PMON_CTL_CH_MASK (0xff << 4)
#define SKX_PMON_CTL_FC_MASK (0x7 << 12)
#define SKX_IIO_PMON_RAW_EVENT_MASK (SNBEP_PMON_CTL_EV_SEL_MASK | \
SNBEP_PMON_CTL_UMASK_MASK | \
SNBEP_PMON_CTL_EDGE_DET | \
SNBEP_PMON_CTL_INVERT | \
SKX_PMON_CTL_TRESH_MASK)
#define SKX_IIO_PMON_RAW_EVENT_MASK_EXT (SKX_PMON_CTL_TRESH_MASK_EXT | \
SKX_PMON_CTL_CH_MASK | \
SKX_PMON_CTL_FC_MASK)
/* SKX IRP */
#define SKX_IRP0_MSR_PMON_CTL0 0xa5b
#define SKX_IRP0_MSR_PMON_CTR0 0xa59
#define SKX_IRP0_MSR_PMON_BOX_CTL 0xa58
#define SKX_IRP_MSR_OFFSET 0x20
/* SKX UPI */
#define SKX_UPI_PCI_PMON_CTL0 0x350
#define SKX_UPI_PCI_PMON_CTR0 0x318
#define SKX_UPI_PCI_PMON_BOX_CTL 0x378
#define SKX_PMON_CTL_UMASK_EXT 0xff
/* SKX M2M */
#define SKX_M2M_PCI_PMON_CTL0 0x228
#define SKX_M2M_PCI_PMON_CTR0 0x200
#define SKX_M2M_PCI_PMON_BOX_CTL 0x258
DEFINE_UNCORE_FORMAT_ATTR(event, event, "config:0-7");
DEFINE_UNCORE_FORMAT_ATTR(event2, event, "config:0-6");
DEFINE_UNCORE_FORMAT_ATTR(event_ext, event, "config:0-7,21");
DEFINE_UNCORE_FORMAT_ATTR(use_occ_ctr, use_occ_ctr, "config:7");
DEFINE_UNCORE_FORMAT_ATTR(umask, umask, "config:8-15");
DEFINE_UNCORE_FORMAT_ATTR(umask_ext, umask, "config:8-15,32-39");
DEFINE_UNCORE_FORMAT_ATTR(qor, qor, "config:16");
DEFINE_UNCORE_FORMAT_ATTR(edge, edge, "config:18");
DEFINE_UNCORE_FORMAT_ATTR(tid_en, tid_en, "config:19");
DEFINE_UNCORE_FORMAT_ATTR(inv, inv, "config:23");
DEFINE_UNCORE_FORMAT_ATTR(thresh9, thresh, "config:24-35");
DEFINE_UNCORE_FORMAT_ATTR(thresh8, thresh, "config:24-31");
DEFINE_UNCORE_FORMAT_ATTR(thresh6, thresh, "config:24-29");
DEFINE_UNCORE_FORMAT_ATTR(thresh5, thresh, "config:24-28");
@ -280,6 +341,8 @@ DEFINE_UNCORE_FORMAT_ATTR(occ_sel, occ_sel, "config:14-15");
DEFINE_UNCORE_FORMAT_ATTR(occ_invert, occ_invert, "config:30");
DEFINE_UNCORE_FORMAT_ATTR(occ_edge, occ_edge, "config:14-51");
DEFINE_UNCORE_FORMAT_ATTR(occ_edge_det, occ_edge_det, "config:31");
DEFINE_UNCORE_FORMAT_ATTR(ch_mask, ch_mask, "config:36-43");
DEFINE_UNCORE_FORMAT_ATTR(fc_mask, fc_mask, "config:44-46");
DEFINE_UNCORE_FORMAT_ATTR(filter_tid, filter_tid, "config1:0-4");
DEFINE_UNCORE_FORMAT_ATTR(filter_tid2, filter_tid, "config1:0");
DEFINE_UNCORE_FORMAT_ATTR(filter_tid3, filter_tid, "config1:0-5");
@ -288,18 +351,26 @@ DEFINE_UNCORE_FORMAT_ATTR(filter_cid, filter_cid, "config1:5");
DEFINE_UNCORE_FORMAT_ATTR(filter_link, filter_link, "config1:5-8");
DEFINE_UNCORE_FORMAT_ATTR(filter_link2, filter_link, "config1:6-8");
DEFINE_UNCORE_FORMAT_ATTR(filter_link3, filter_link, "config1:12");
DEFINE_UNCORE_FORMAT_ATTR(filter_link4, filter_link, "config1:9-12");
DEFINE_UNCORE_FORMAT_ATTR(filter_nid, filter_nid, "config1:10-17");
DEFINE_UNCORE_FORMAT_ATTR(filter_nid2, filter_nid, "config1:32-47");
DEFINE_UNCORE_FORMAT_ATTR(filter_state, filter_state, "config1:18-22");
DEFINE_UNCORE_FORMAT_ATTR(filter_state2, filter_state, "config1:17-22");
DEFINE_UNCORE_FORMAT_ATTR(filter_state3, filter_state, "config1:17-23");
DEFINE_UNCORE_FORMAT_ATTR(filter_state4, filter_state, "config1:18-20");
DEFINE_UNCORE_FORMAT_ATTR(filter_state5, filter_state, "config1:17-26");
DEFINE_UNCORE_FORMAT_ATTR(filter_rem, filter_rem, "config1:32");
DEFINE_UNCORE_FORMAT_ATTR(filter_loc, filter_loc, "config1:33");
DEFINE_UNCORE_FORMAT_ATTR(filter_nm, filter_nm, "config1:36");
DEFINE_UNCORE_FORMAT_ATTR(filter_not_nm, filter_not_nm, "config1:37");
DEFINE_UNCORE_FORMAT_ATTR(filter_local, filter_local, "config1:33");
DEFINE_UNCORE_FORMAT_ATTR(filter_all_op, filter_all_op, "config1:35");
DEFINE_UNCORE_FORMAT_ATTR(filter_nnm, filter_nnm, "config1:37");
DEFINE_UNCORE_FORMAT_ATTR(filter_opc, filter_opc, "config1:23-31");
DEFINE_UNCORE_FORMAT_ATTR(filter_opc2, filter_opc, "config1:52-60");
DEFINE_UNCORE_FORMAT_ATTR(filter_opc3, filter_opc, "config1:41-60");
DEFINE_UNCORE_FORMAT_ATTR(filter_opc_0, filter_opc0, "config1:41-50");
DEFINE_UNCORE_FORMAT_ATTR(filter_opc_1, filter_opc1, "config1:51-60");
DEFINE_UNCORE_FORMAT_ATTR(filter_nc, filter_nc, "config1:62");
DEFINE_UNCORE_FORMAT_ATTR(filter_c6, filter_c6, "config1:61");
DEFINE_UNCORE_FORMAT_ATTR(filter_isoc, filter_isoc, "config1:63");
@ -1153,7 +1224,7 @@ static struct pci_driver snbep_uncore_pci_driver = {
/*
* build pci bus to socket mapping
*/
static int snbep_pci2phy_map_init(int devid)
static int snbep_pci2phy_map_init(int devid, int nodeid_loc, int idmap_loc, bool reverse)
{
struct pci_dev *ubox_dev = NULL;
int i, bus, nodeid, segment;
@ -1168,12 +1239,12 @@ static int snbep_pci2phy_map_init(int devid)
break;
bus = ubox_dev->bus->number;
/* get the Node ID of the local register */
err = pci_read_config_dword(ubox_dev, 0x40, &config);
err = pci_read_config_dword(ubox_dev, nodeid_loc, &config);
if (err)
break;
nodeid = config;
/* get the Node ID mapping */
err = pci_read_config_dword(ubox_dev, 0x54, &config);
err = pci_read_config_dword(ubox_dev, idmap_loc, &config);
if (err)
break;
@ -1207,11 +1278,20 @@ static int snbep_pci2phy_map_init(int devid)
raw_spin_lock(&pci2phy_map_lock);
list_for_each_entry(map, &pci2phy_map_head, list) {
i = -1;
for (bus = 255; bus >= 0; bus--) {
if (map->pbus_to_physid[bus] >= 0)
i = map->pbus_to_physid[bus];
else
map->pbus_to_physid[bus] = i;
if (reverse) {
for (bus = 255; bus >= 0; bus--) {
if (map->pbus_to_physid[bus] >= 0)
i = map->pbus_to_physid[bus];
else
map->pbus_to_physid[bus] = i;
}
} else {
for (bus = 0; bus <= 255; bus++) {
if (map->pbus_to_physid[bus] >= 0)
i = map->pbus_to_physid[bus];
else
map->pbus_to_physid[bus] = i;
}
}
}
raw_spin_unlock(&pci2phy_map_lock);
@ -1224,7 +1304,7 @@ static int snbep_pci2phy_map_init(int devid)
int snbep_uncore_pci_init(void)
{
int ret = snbep_pci2phy_map_init(0x3ce0);
int ret = snbep_pci2phy_map_init(0x3ce0, SNBEP_CPUNODEID, SNBEP_GIDNIDMAP, true);
if (ret)
return ret;
uncore_pci_uncores = snbep_pci_uncores;
@ -1788,7 +1868,7 @@ static struct pci_driver ivbep_uncore_pci_driver = {
int ivbep_uncore_pci_init(void)
{
int ret = snbep_pci2phy_map_init(0x0e1e);
int ret = snbep_pci2phy_map_init(0x0e1e, SNBEP_CPUNODEID, SNBEP_GIDNIDMAP, true);
if (ret)
return ret;
uncore_pci_uncores = ivbep_pci_uncores;
@ -2897,7 +2977,7 @@ static struct pci_driver hswep_uncore_pci_driver = {
int hswep_uncore_pci_init(void)
{
int ret = snbep_pci2phy_map_init(0x2f1e);
int ret = snbep_pci2phy_map_init(0x2f1e, SNBEP_CPUNODEID, SNBEP_GIDNIDMAP, true);
if (ret)
return ret;
uncore_pci_uncores = hswep_pci_uncores;
@ -3186,7 +3266,7 @@ static struct pci_driver bdx_uncore_pci_driver = {
int bdx_uncore_pci_init(void)
{
int ret = snbep_pci2phy_map_init(0x6f1e);
int ret = snbep_pci2phy_map_init(0x6f1e, SNBEP_CPUNODEID, SNBEP_GIDNIDMAP, true);
if (ret)
return ret;
@ -3196,3 +3276,525 @@ int bdx_uncore_pci_init(void)
}
/* end of BDX uncore support */
/* SKX uncore support */
static struct intel_uncore_type skx_uncore_ubox = {
.name = "ubox",
.num_counters = 2,
.num_boxes = 1,
.perf_ctr_bits = 48,
.fixed_ctr_bits = 48,
.perf_ctr = HSWEP_U_MSR_PMON_CTR0,
.event_ctl = HSWEP_U_MSR_PMON_CTL0,
.event_mask = SNBEP_U_MSR_PMON_RAW_EVENT_MASK,
.fixed_ctr = HSWEP_U_MSR_PMON_UCLK_FIXED_CTR,
.fixed_ctl = HSWEP_U_MSR_PMON_UCLK_FIXED_CTL,
.ops = &ivbep_uncore_msr_ops,
.format_group = &ivbep_uncore_ubox_format_group,
};
static struct attribute *skx_uncore_cha_formats_attr[] = {
&format_attr_event.attr,
&format_attr_umask.attr,
&format_attr_edge.attr,
&format_attr_tid_en.attr,
&format_attr_inv.attr,
&format_attr_thresh8.attr,
&format_attr_filter_tid4.attr,
&format_attr_filter_link4.attr,
&format_attr_filter_state5.attr,
&format_attr_filter_rem.attr,
&format_attr_filter_loc.attr,
&format_attr_filter_nm.attr,
&format_attr_filter_all_op.attr,
&format_attr_filter_not_nm.attr,
&format_attr_filter_opc_0.attr,
&format_attr_filter_opc_1.attr,
&format_attr_filter_nc.attr,
&format_attr_filter_c6.attr,
&format_attr_filter_isoc.attr,
NULL,
};
static struct attribute_group skx_uncore_chabox_format_group = {
.name = "format",
.attrs = skx_uncore_cha_formats_attr,
};
static struct event_constraint skx_uncore_chabox_constraints[] = {
UNCORE_EVENT_CONSTRAINT(0x11, 0x1),
UNCORE_EVENT_CONSTRAINT(0x36, 0x1),
EVENT_CONSTRAINT_END
};
static struct extra_reg skx_uncore_cha_extra_regs[] = {
SNBEP_CBO_EVENT_EXTRA_REG(0x0334, 0xffff, 0x4),
SNBEP_CBO_EVENT_EXTRA_REG(0x0534, 0xffff, 0x4),
SNBEP_CBO_EVENT_EXTRA_REG(0x0934, 0xffff, 0x4),
SNBEP_CBO_EVENT_EXTRA_REG(0x1134, 0xffff, 0x4),
SNBEP_CBO_EVENT_EXTRA_REG(0x2134, 0xffff, 0x4),
SNBEP_CBO_EVENT_EXTRA_REG(0x8134, 0xffff, 0x4),
};
static u64 skx_cha_filter_mask(int fields)
{
u64 mask = 0;
if (fields & 0x1)
mask |= SKX_CHA_MSR_PMON_BOX_FILTER_TID;
if (fields & 0x2)
mask |= SKX_CHA_MSR_PMON_BOX_FILTER_LINK;
if (fields & 0x4)
mask |= SKX_CHA_MSR_PMON_BOX_FILTER_STATE;
return mask;
}
static struct event_constraint *
skx_cha_get_constraint(struct intel_uncore_box *box, struct perf_event *event)
{
return __snbep_cbox_get_constraint(box, event, skx_cha_filter_mask);
}
static int skx_cha_hw_config(struct intel_uncore_box *box, struct perf_event *event)
{
struct hw_perf_event_extra *reg1 = &event->hw.extra_reg;
struct extra_reg *er;
int idx = 0;
for (er = skx_uncore_cha_extra_regs; er->msr; er++) {
if (er->event != (event->hw.config & er->config_mask))
continue;
idx |= er->idx;
}
if (idx) {
reg1->reg = HSWEP_C0_MSR_PMON_BOX_FILTER0 +
HSWEP_CBO_MSR_OFFSET * box->pmu->pmu_idx;
reg1->config = event->attr.config1 & skx_cha_filter_mask(idx);
reg1->idx = idx;
}
return 0;
}
static struct intel_uncore_ops skx_uncore_chabox_ops = {
/* There is no frz_en for chabox ctl */
.init_box = ivbep_uncore_msr_init_box,
.disable_box = snbep_uncore_msr_disable_box,
.enable_box = snbep_uncore_msr_enable_box,
.disable_event = snbep_uncore_msr_disable_event,
.enable_event = hswep_cbox_enable_event,
.read_counter = uncore_msr_read_counter,
.hw_config = skx_cha_hw_config,
.get_constraint = skx_cha_get_constraint,
.put_constraint = snbep_cbox_put_constraint,
};
static struct intel_uncore_type skx_uncore_chabox = {
.name = "cha",
.num_counters = 4,
.perf_ctr_bits = 48,
.event_ctl = HSWEP_C0_MSR_PMON_CTL0,
.perf_ctr = HSWEP_C0_MSR_PMON_CTR0,
.event_mask = HSWEP_S_MSR_PMON_RAW_EVENT_MASK,
.box_ctl = HSWEP_C0_MSR_PMON_BOX_CTL,
.msr_offset = HSWEP_CBO_MSR_OFFSET,
.num_shared_regs = 1,
.constraints = skx_uncore_chabox_constraints,
.ops = &skx_uncore_chabox_ops,
.format_group = &skx_uncore_chabox_format_group,
};
static struct attribute *skx_uncore_iio_formats_attr[] = {
&format_attr_event.attr,
&format_attr_umask.attr,
&format_attr_edge.attr,
&format_attr_inv.attr,
&format_attr_thresh9.attr,
&format_attr_ch_mask.attr,
&format_attr_fc_mask.attr,
NULL,
};
static struct attribute_group skx_uncore_iio_format_group = {
.name = "format",
.attrs = skx_uncore_iio_formats_attr,
};
static struct event_constraint skx_uncore_iio_constraints[] = {
UNCORE_EVENT_CONSTRAINT(0x83, 0x3),
UNCORE_EVENT_CONSTRAINT(0x88, 0xc),
UNCORE_EVENT_CONSTRAINT(0x95, 0xc),
UNCORE_EVENT_CONSTRAINT(0xc0, 0xc),
UNCORE_EVENT_CONSTRAINT(0xc5, 0xc),
UNCORE_EVENT_CONSTRAINT(0xd4, 0xc),
EVENT_CONSTRAINT_END
};
static void skx_iio_enable_event(struct intel_uncore_box *box,
struct perf_event *event)
{
struct hw_perf_event *hwc = &event->hw;
wrmsrl(hwc->config_base, hwc->config | SNBEP_PMON_CTL_EN);
}
static struct intel_uncore_ops skx_uncore_iio_ops = {
.init_box = ivbep_uncore_msr_init_box,
.disable_box = snbep_uncore_msr_disable_box,
.enable_box = snbep_uncore_msr_enable_box,
.disable_event = snbep_uncore_msr_disable_event,
.enable_event = skx_iio_enable_event,
.read_counter = uncore_msr_read_counter,
};
static struct intel_uncore_type skx_uncore_iio = {
.name = "iio",
.num_counters = 4,
.num_boxes = 5,
.perf_ctr_bits = 48,
.event_ctl = SKX_IIO0_MSR_PMON_CTL0,
.perf_ctr = SKX_IIO0_MSR_PMON_CTR0,
.event_mask = SKX_IIO_PMON_RAW_EVENT_MASK,
.event_mask_ext = SKX_IIO_PMON_RAW_EVENT_MASK_EXT,
.box_ctl = SKX_IIO0_MSR_PMON_BOX_CTL,
.msr_offset = SKX_IIO_MSR_OFFSET,
.constraints = skx_uncore_iio_constraints,
.ops = &skx_uncore_iio_ops,
.format_group = &skx_uncore_iio_format_group,
};
static struct attribute *skx_uncore_formats_attr[] = {
&format_attr_event.attr,
&format_attr_umask.attr,
&format_attr_edge.attr,
&format_attr_inv.attr,
&format_attr_thresh8.attr,
NULL,
};
static struct attribute_group skx_uncore_format_group = {
.name = "format",
.attrs = skx_uncore_formats_attr,
};
static struct intel_uncore_type skx_uncore_irp = {
.name = "irp",
.num_counters = 2,
.num_boxes = 5,
.perf_ctr_bits = 48,
.event_ctl = SKX_IRP0_MSR_PMON_CTL0,
.perf_ctr = SKX_IRP0_MSR_PMON_CTR0,
.event_mask = SNBEP_PMON_RAW_EVENT_MASK,
.box_ctl = SKX_IRP0_MSR_PMON_BOX_CTL,
.msr_offset = SKX_IRP_MSR_OFFSET,
.ops = &skx_uncore_iio_ops,
.format_group = &skx_uncore_format_group,
};
static struct intel_uncore_ops skx_uncore_pcu_ops = {
IVBEP_UNCORE_MSR_OPS_COMMON_INIT(),
.hw_config = hswep_pcu_hw_config,
.get_constraint = snbep_pcu_get_constraint,
.put_constraint = snbep_pcu_put_constraint,
};
static struct intel_uncore_type skx_uncore_pcu = {
.name = "pcu",
.num_counters = 4,
.num_boxes = 1,
.perf_ctr_bits = 48,
.perf_ctr = HSWEP_PCU_MSR_PMON_CTR0,
.event_ctl = HSWEP_PCU_MSR_PMON_CTL0,
.event_mask = SNBEP_PCU_MSR_PMON_RAW_EVENT_MASK,
.box_ctl = HSWEP_PCU_MSR_PMON_BOX_CTL,
.num_shared_regs = 1,
.ops = &skx_uncore_pcu_ops,
.format_group = &snbep_uncore_pcu_format_group,
};
static struct intel_uncore_type *skx_msr_uncores[] = {
&skx_uncore_ubox,
&skx_uncore_chabox,
&skx_uncore_iio,
&skx_uncore_irp,
&skx_uncore_pcu,
NULL,
};
static int skx_count_chabox(void)
{
struct pci_dev *chabox_dev = NULL;
int bus, count = 0;
while (1) {
chabox_dev = pci_get_device(PCI_VENDOR_ID_INTEL, 0x208d, chabox_dev);
if (!chabox_dev)
break;
if (count == 0)
bus = chabox_dev->bus->number;
if (bus != chabox_dev->bus->number)
break;
count++;
}
pci_dev_put(chabox_dev);
return count;
}
void skx_uncore_cpu_init(void)
{
skx_uncore_chabox.num_boxes = skx_count_chabox();
uncore_msr_uncores = skx_msr_uncores;
}
static struct intel_uncore_type skx_uncore_imc = {
.name = "imc",
.num_counters = 4,
.num_boxes = 6,
.perf_ctr_bits = 48,
.fixed_ctr_bits = 48,
.fixed_ctr = SNBEP_MC_CHy_PCI_PMON_FIXED_CTR,
.fixed_ctl = SNBEP_MC_CHy_PCI_PMON_FIXED_CTL,
.event_descs = hswep_uncore_imc_events,
.perf_ctr = SNBEP_PCI_PMON_CTR0,
.event_ctl = SNBEP_PCI_PMON_CTL0,
.event_mask = SNBEP_PMON_RAW_EVENT_MASK,
.box_ctl = SNBEP_PCI_PMON_BOX_CTL,
.ops = &ivbep_uncore_pci_ops,
.format_group = &skx_uncore_format_group,
};
static struct attribute *skx_upi_uncore_formats_attr[] = {
&format_attr_event_ext.attr,
&format_attr_umask_ext.attr,
&format_attr_edge.attr,
&format_attr_inv.attr,
&format_attr_thresh8.attr,
NULL,
};
static struct attribute_group skx_upi_uncore_format_group = {
.name = "format",
.attrs = skx_upi_uncore_formats_attr,
};
static void skx_upi_uncore_pci_init_box(struct intel_uncore_box *box)
{
struct pci_dev *pdev = box->pci_dev;
__set_bit(UNCORE_BOX_FLAG_CTL_OFFS8, &box->flags);
pci_write_config_dword(pdev, SKX_UPI_PCI_PMON_BOX_CTL, IVBEP_PMON_BOX_CTL_INT);
}
static struct intel_uncore_ops skx_upi_uncore_pci_ops = {
.init_box = skx_upi_uncore_pci_init_box,
.disable_box = snbep_uncore_pci_disable_box,
.enable_box = snbep_uncore_pci_enable_box,
.disable_event = snbep_uncore_pci_disable_event,
.enable_event = snbep_uncore_pci_enable_event,
.read_counter = snbep_uncore_pci_read_counter,
};
static struct intel_uncore_type skx_uncore_upi = {
.name = "upi",
.num_counters = 4,
.num_boxes = 3,
.perf_ctr_bits = 48,
.perf_ctr = SKX_UPI_PCI_PMON_CTR0,
.event_ctl = SKX_UPI_PCI_PMON_CTL0,
.event_mask = SNBEP_QPI_PCI_PMON_RAW_EVENT_MASK,
.event_mask_ext = SKX_PMON_CTL_UMASK_EXT,
.box_ctl = SKX_UPI_PCI_PMON_BOX_CTL,
.ops = &skx_upi_uncore_pci_ops,
.format_group = &skx_upi_uncore_format_group,
};
static void skx_m2m_uncore_pci_init_box(struct intel_uncore_box *box)
{
struct pci_dev *pdev = box->pci_dev;
__set_bit(UNCORE_BOX_FLAG_CTL_OFFS8, &box->flags);
pci_write_config_dword(pdev, SKX_M2M_PCI_PMON_BOX_CTL, IVBEP_PMON_BOX_CTL_INT);
}
static struct intel_uncore_ops skx_m2m_uncore_pci_ops = {
.init_box = skx_m2m_uncore_pci_init_box,
.disable_box = snbep_uncore_pci_disable_box,
.enable_box = snbep_uncore_pci_enable_box,
.disable_event = snbep_uncore_pci_disable_event,
.enable_event = snbep_uncore_pci_enable_event,
.read_counter = snbep_uncore_pci_read_counter,
};
static struct intel_uncore_type skx_uncore_m2m = {
.name = "m2m",
.num_counters = 4,
.num_boxes = 2,
.perf_ctr_bits = 48,
.perf_ctr = SKX_M2M_PCI_PMON_CTR0,
.event_ctl = SKX_M2M_PCI_PMON_CTL0,
.event_mask = SNBEP_PMON_RAW_EVENT_MASK,
.box_ctl = SKX_M2M_PCI_PMON_BOX_CTL,
.ops = &skx_m2m_uncore_pci_ops,
.format_group = &skx_uncore_format_group,
};
static struct event_constraint skx_uncore_m2pcie_constraints[] = {
UNCORE_EVENT_CONSTRAINT(0x23, 0x3),
EVENT_CONSTRAINT_END
};
static struct intel_uncore_type skx_uncore_m2pcie = {
.name = "m2pcie",
.num_counters = 4,
.num_boxes = 4,
.perf_ctr_bits = 48,
.constraints = skx_uncore_m2pcie_constraints,
.perf_ctr = SNBEP_PCI_PMON_CTR0,
.event_ctl = SNBEP_PCI_PMON_CTL0,
.event_mask = SNBEP_PMON_RAW_EVENT_MASK,
.box_ctl = SNBEP_PCI_PMON_BOX_CTL,
.ops = &ivbep_uncore_pci_ops,
.format_group = &skx_uncore_format_group,
};
static struct event_constraint skx_uncore_m3upi_constraints[] = {
UNCORE_EVENT_CONSTRAINT(0x1d, 0x1),
UNCORE_EVENT_CONSTRAINT(0x1e, 0x1),
UNCORE_EVENT_CONSTRAINT(0x40, 0x7),
UNCORE_EVENT_CONSTRAINT(0x4e, 0x7),
UNCORE_EVENT_CONSTRAINT(0x4f, 0x7),
UNCORE_EVENT_CONSTRAINT(0x50, 0x7),
UNCORE_EVENT_CONSTRAINT(0x51, 0x7),
UNCORE_EVENT_CONSTRAINT(0x52, 0x7),
EVENT_CONSTRAINT_END
};
static struct intel_uncore_type skx_uncore_m3upi = {
.name = "m3upi",
.num_counters = 3,
.num_boxes = 3,
.perf_ctr_bits = 48,
.constraints = skx_uncore_m3upi_constraints,
.perf_ctr = SNBEP_PCI_PMON_CTR0,
.event_ctl = SNBEP_PCI_PMON_CTL0,
.event_mask = SNBEP_PMON_RAW_EVENT_MASK,
.box_ctl = SNBEP_PCI_PMON_BOX_CTL,
.ops = &ivbep_uncore_pci_ops,
.format_group = &skx_uncore_format_group,
};
enum {
SKX_PCI_UNCORE_IMC,
SKX_PCI_UNCORE_M2M,
SKX_PCI_UNCORE_UPI,
SKX_PCI_UNCORE_M2PCIE,
SKX_PCI_UNCORE_M3UPI,
};
static struct intel_uncore_type *skx_pci_uncores[] = {
[SKX_PCI_UNCORE_IMC] = &skx_uncore_imc,
[SKX_PCI_UNCORE_M2M] = &skx_uncore_m2m,
[SKX_PCI_UNCORE_UPI] = &skx_uncore_upi,
[SKX_PCI_UNCORE_M2PCIE] = &skx_uncore_m2pcie,
[SKX_PCI_UNCORE_M3UPI] = &skx_uncore_m3upi,
NULL,
};
static const struct pci_device_id skx_uncore_pci_ids[] = {
{ /* MC0 Channel 0 */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x2042),
.driver_data = UNCORE_PCI_DEV_FULL_DATA(10, 2, SKX_PCI_UNCORE_IMC, 0),
},
{ /* MC0 Channel 1 */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x2046),
.driver_data = UNCORE_PCI_DEV_FULL_DATA(10, 6, SKX_PCI_UNCORE_IMC, 1),
},
{ /* MC0 Channel 2 */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x204a),
.driver_data = UNCORE_PCI_DEV_FULL_DATA(11, 2, SKX_PCI_UNCORE_IMC, 2),
},
{ /* MC1 Channel 0 */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x2042),
.driver_data = UNCORE_PCI_DEV_FULL_DATA(12, 2, SKX_PCI_UNCORE_IMC, 3),
},
{ /* MC1 Channel 1 */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x2046),
.driver_data = UNCORE_PCI_DEV_FULL_DATA(12, 6, SKX_PCI_UNCORE_IMC, 4),
},
{ /* MC1 Channel 2 */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x204a),
.driver_data = UNCORE_PCI_DEV_FULL_DATA(13, 2, SKX_PCI_UNCORE_IMC, 5),
},
{ /* M2M0 */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x2066),
.driver_data = UNCORE_PCI_DEV_FULL_DATA(8, 0, SKX_PCI_UNCORE_M2M, 0),
},
{ /* M2M1 */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x2066),
.driver_data = UNCORE_PCI_DEV_FULL_DATA(9, 0, SKX_PCI_UNCORE_M2M, 1),
},
{ /* UPI0 Link 0 */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x2058),
.driver_data = UNCORE_PCI_DEV_FULL_DATA(14, 0, SKX_PCI_UNCORE_UPI, 0),
},
{ /* UPI0 Link 1 */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x2058),
.driver_data = UNCORE_PCI_DEV_FULL_DATA(15, 0, SKX_PCI_UNCORE_UPI, 1),
},
{ /* UPI1 Link 2 */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x2058),
.driver_data = UNCORE_PCI_DEV_FULL_DATA(16, 0, SKX_PCI_UNCORE_UPI, 2),
},
{ /* M2PCIe 0 */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x2088),
.driver_data = UNCORE_PCI_DEV_FULL_DATA(21, 1, SKX_PCI_UNCORE_M2PCIE, 0),
},
{ /* M2PCIe 1 */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x2088),
.driver_data = UNCORE_PCI_DEV_FULL_DATA(22, 1, SKX_PCI_UNCORE_M2PCIE, 1),
},
{ /* M2PCIe 2 */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x2088),
.driver_data = UNCORE_PCI_DEV_FULL_DATA(23, 1, SKX_PCI_UNCORE_M2PCIE, 2),
},
{ /* M2PCIe 3 */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x2088),
.driver_data = UNCORE_PCI_DEV_FULL_DATA(21, 5, SKX_PCI_UNCORE_M2PCIE, 3),
},
{ /* M3UPI0 Link 0 */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x204C),
.driver_data = UNCORE_PCI_DEV_FULL_DATA(18, 0, SKX_PCI_UNCORE_M3UPI, 0),
},
{ /* M3UPI0 Link 1 */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x204D),
.driver_data = UNCORE_PCI_DEV_FULL_DATA(18, 1, SKX_PCI_UNCORE_M3UPI, 1),
},
{ /* M3UPI1 Link 2 */
PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x204C),
.driver_data = UNCORE_PCI_DEV_FULL_DATA(18, 4, SKX_PCI_UNCORE_M3UPI, 2),
},
{ /* end: all zeroes */ }
};
static struct pci_driver skx_uncore_pci_driver = {
.name = "skx_uncore",
.id_table = skx_uncore_pci_ids,
};
int skx_uncore_pci_init(void)
{
/* need to double check pci address */
int ret = snbep_pci2phy_map_init(0x2014, SKX_CPUNODEID, SKX_GIDNIDMAP, false);
if (ret)
return ret;
uncore_pci_uncores = skx_pci_uncores;
uncore_pci_driver = &skx_uncore_pci_driver;
return 0;
}
/* end of SKX uncore support */

View File

@ -194,12 +194,13 @@ struct cpu_hw_events {
*/
struct debug_store *ds;
u64 pebs_enabled;
int n_pebs;
int n_large_pebs;
/*
* Intel LBR bits
*/
int lbr_users;
void *lbr_context;
struct perf_branch_stack lbr_stack;
struct perf_branch_entry lbr_entries[MAX_LBR_ENTRIES];
struct er_account *lbr_sel;
@ -508,6 +509,8 @@ struct x86_pmu {
void (*enable_all)(int added);
void (*enable)(struct perf_event *);
void (*disable)(struct perf_event *);
void (*add)(struct perf_event *);
void (*del)(struct perf_event *);
int (*hw_config)(struct perf_event *event);
int (*schedule_events)(struct cpu_hw_events *cpuc, int n, int *assign);
unsigned eventsel;
@ -888,6 +891,10 @@ extern struct event_constraint intel_skl_pebs_event_constraints[];
struct event_constraint *intel_pebs_constraints(struct perf_event *event);
void intel_pmu_pebs_add(struct perf_event *event);
void intel_pmu_pebs_del(struct perf_event *event);
void intel_pmu_pebs_enable(struct perf_event *event);
void intel_pmu_pebs_disable(struct perf_event *event);
@ -906,9 +913,9 @@ u64 lbr_from_signext_quirk_wr(u64 val);
void intel_pmu_lbr_reset(void);
void intel_pmu_lbr_enable(struct perf_event *event);
void intel_pmu_lbr_add(struct perf_event *event);
void intel_pmu_lbr_disable(struct perf_event *event);
void intel_pmu_lbr_del(struct perf_event *event);
void intel_pmu_lbr_enable_all(bool pmi);

View File

@ -339,6 +339,24 @@ static inline int bitmap_parse(const char *buf, unsigned int buflen,
return __bitmap_parse(buf, buflen, 0, maskp, nmaskbits);
}
/*
* bitmap_from_u64 - Check and swap words within u64.
* @mask: source bitmap
* @dst: destination bitmap
*
* In 32-bit Big Endian kernel, when using (u32 *)(&val)[*]
* to read u64 mask, we will get the wrong word.
* That is "(u32 *)(&val)[0]" gets the upper 32 bits,
* but we expect the lower 32-bits of u64.
*/
static inline void bitmap_from_u64(unsigned long *dst, u64 mask)
{
dst[0] = mask & ULONG_MAX;
if (sizeof(mask) > sizeof(unsigned long))
dst[1] = mask >> 32;
}
#endif /* __ASSEMBLY__ */
#endif /* __LINUX_BITMAP_H */

View File

@ -510,9 +510,15 @@ typedef void (*perf_overflow_handler_t)(struct perf_event *,
struct perf_sample_data *,
struct pt_regs *regs);
enum perf_group_flag {
PERF_GROUP_SOFTWARE = 0x1,
};
/*
* Event capabilities. For event_caps and groups caps.
*
* PERF_EV_CAP_SOFTWARE: Is a software event.
* PERF_EV_CAP_READ_ACTIVE_PKG: A CPU event (or cgroup event) that can be read
* from any CPU in the package where it is active.
*/
#define PERF_EV_CAP_SOFTWARE BIT(0)
#define PERF_EV_CAP_READ_ACTIVE_PKG BIT(1)
#define SWEVENT_HLIST_BITS 8
#define SWEVENT_HLIST_SIZE (1 << SWEVENT_HLIST_BITS)
@ -568,7 +574,12 @@ struct perf_event {
struct hlist_node hlist_entry;
struct list_head active_entry;
int nr_siblings;
int group_flags;
/* Not serialized. Only written during event initialization. */
int event_caps;
/* The cumulative AND of all event_caps for events in this group. */
int group_caps;
struct perf_event *group_leader;
struct pmu *pmu;
void *pmu_private;
@ -774,6 +785,9 @@ struct perf_cpu_context {
#ifdef CONFIG_CGROUP_PERF
struct perf_cgroup *cgrp;
#endif
struct list_head sched_cb_entry;
int sched_cb_usage;
};
struct perf_output_handle {
@ -985,7 +999,7 @@ static inline bool is_sampling_event(struct perf_event *event)
*/
static inline int is_software_event(struct perf_event *event)
{
return event->pmu->task_ctx_nr == perf_sw_context;
return event->event_caps & PERF_EV_CAP_SOFTWARE;
}
extern struct static_key perf_swevent_enabled[PERF_COUNT_SW_MAX];

View File

@ -1475,8 +1475,7 @@ list_add_event(struct perf_event *event, struct perf_event_context *ctx)
if (event->group_leader == event) {
struct list_head *list;
if (is_software_event(event))
event->group_flags |= PERF_GROUP_SOFTWARE;
event->group_caps = event->event_caps;
list = ctx_group_list(event, ctx);
list_add_tail(&event->group_entry, list);
@ -1630,9 +1629,7 @@ static void perf_group_attach(struct perf_event *event)
WARN_ON_ONCE(group_leader->ctx != event->ctx);
if (group_leader->group_flags & PERF_GROUP_SOFTWARE &&
!is_software_event(event))
group_leader->group_flags &= ~PERF_GROUP_SOFTWARE;
group_leader->group_caps &= event->event_caps;
list_add_tail(&event->group_entry, &group_leader->sibling_list);
group_leader->nr_siblings++;
@ -1723,7 +1720,7 @@ static void perf_group_detach(struct perf_event *event)
sibling->group_leader = sibling;
/* Inherit group flags from the previous leader */
sibling->group_flags = event->group_flags;
sibling->group_caps = event->group_caps;
WARN_ON_ONCE(sibling->ctx != event->ctx);
}
@ -1832,6 +1829,8 @@ group_sched_out(struct perf_event *group_event,
struct perf_event *event;
int state = group_event->state;
perf_pmu_disable(ctx->pmu);
event_sched_out(group_event, cpuctx, ctx);
/*
@ -1840,6 +1839,8 @@ group_sched_out(struct perf_event *group_event,
list_for_each_entry(event, &group_event->sibling_list, group_entry)
event_sched_out(event, cpuctx, ctx);
perf_pmu_enable(ctx->pmu);
if (state == PERF_EVENT_STATE_ACTIVE && group_event->attr.exclusive)
cpuctx->exclusive = 0;
}
@ -2145,7 +2146,7 @@ static int group_can_go_on(struct perf_event *event,
/*
* Groups consisting entirely of software events can always go on.
*/
if (event->group_flags & PERF_GROUP_SOFTWARE)
if (event->group_caps & PERF_EV_CAP_SOFTWARE)
return 1;
/*
* If an exclusive group is already on, no other hardware
@ -2491,7 +2492,7 @@ static int __perf_event_stop(void *info)
* while restarting.
*/
if (sd->restart)
event->pmu->start(event, PERF_EF_START);
event->pmu->start(event, 0);
return 0;
}
@ -2837,19 +2838,36 @@ unlock:
}
}
static DEFINE_PER_CPU(struct list_head, sched_cb_list);
void perf_sched_cb_dec(struct pmu *pmu)
{
struct perf_cpu_context *cpuctx = this_cpu_ptr(pmu->pmu_cpu_context);
this_cpu_dec(perf_sched_cb_usages);
if (!--cpuctx->sched_cb_usage)
list_del(&cpuctx->sched_cb_entry);
}
void perf_sched_cb_inc(struct pmu *pmu)
{
struct perf_cpu_context *cpuctx = this_cpu_ptr(pmu->pmu_cpu_context);
if (!cpuctx->sched_cb_usage++)
list_add(&cpuctx->sched_cb_entry, this_cpu_ptr(&sched_cb_list));
this_cpu_inc(perf_sched_cb_usages);
}
/*
* This function provides the context switch callback to the lower code
* layer. It is invoked ONLY when the context switch callback is enabled.
*
* This callback is relevant even to per-cpu events; for example multi event
* PEBS requires this to provide PID/TID information. This requires we flush
* all queued PEBS records before we context switch to a new task.
*/
static void perf_pmu_sched_task(struct task_struct *prev,
struct task_struct *next,
@ -2857,34 +2875,24 @@ static void perf_pmu_sched_task(struct task_struct *prev,
{
struct perf_cpu_context *cpuctx;
struct pmu *pmu;
unsigned long flags;
if (prev == next)
return;
local_irq_save(flags);
list_for_each_entry(cpuctx, this_cpu_ptr(&sched_cb_list), sched_cb_entry) {
pmu = cpuctx->unique_pmu; /* software PMUs will not have sched_task */
rcu_read_lock();
if (WARN_ON_ONCE(!pmu->sched_task))
continue;
list_for_each_entry_rcu(pmu, &pmus, entry) {
if (pmu->sched_task) {
cpuctx = this_cpu_ptr(pmu->pmu_cpu_context);
perf_ctx_lock(cpuctx, cpuctx->task_ctx);
perf_pmu_disable(pmu);
perf_ctx_lock(cpuctx, cpuctx->task_ctx);
pmu->sched_task(cpuctx->task_ctx, sched_in);
perf_pmu_disable(pmu);
pmu->sched_task(cpuctx->task_ctx, sched_in);
perf_pmu_enable(pmu);
perf_ctx_unlock(cpuctx, cpuctx->task_ctx);
}
perf_pmu_enable(pmu);
perf_ctx_unlock(cpuctx, cpuctx->task_ctx);
}
rcu_read_unlock();
local_irq_restore(flags);
}
static void perf_event_switch(struct task_struct *task,
@ -3416,6 +3424,22 @@ struct perf_read_data {
int ret;
};
static int find_cpu_to_read(struct perf_event *event, int local_cpu)
{
int event_cpu = event->oncpu;
u16 local_pkg, event_pkg;
if (event->group_caps & PERF_EV_CAP_READ_ACTIVE_PKG) {
event_pkg = topology_physical_package_id(event_cpu);
local_pkg = topology_physical_package_id(local_cpu);
if (event_pkg == local_pkg)
return local_cpu;
}
return event_cpu;
}
/*
* Cross CPU call to read the hardware event
*/
@ -3537,7 +3561,7 @@ u64 perf_event_read_local(struct perf_event *event)
static int perf_event_read(struct perf_event *event, bool group)
{
int ret = 0;
int ret = 0, cpu_to_read, local_cpu;
/*
* If event is enabled and currently active on a CPU, update the
@ -3549,6 +3573,11 @@ static int perf_event_read(struct perf_event *event, bool group)
.group = group,
.ret = 0,
};
local_cpu = get_cpu();
cpu_to_read = find_cpu_to_read(event, local_cpu);
put_cpu();
/*
* Purposely ignore the smp_call_function_single() return
* value.
@ -3559,7 +3588,7 @@ static int perf_event_read(struct perf_event *event, bool group)
* Therefore, either way, we'll have an up-to-date event count
* after this.
*/
(void)smp_call_function_single(event->oncpu, __perf_event_read, &data, 1);
(void)smp_call_function_single(cpu_to_read, __perf_event_read, &data, 1);
ret = data.ret;
} else if (event->state == PERF_EVENT_STATE_INACTIVE) {
struct perf_event_context *ctx = event->ctx;
@ -5350,9 +5379,10 @@ perf_output_sample_regs(struct perf_output_handle *handle,
struct pt_regs *regs, u64 mask)
{
int bit;
DECLARE_BITMAP(_mask, 64);
for_each_set_bit(bit, (const unsigned long *) &mask,
sizeof(mask) * BITS_PER_BYTE) {
bitmap_from_u64(_mask, mask);
for_each_set_bit(bit, _mask, sizeof(mask) * BITS_PER_BYTE) {
u64 val;
val = perf_reg_value(regs, bit);
@ -9505,6 +9535,9 @@ SYSCALL_DEFINE5(perf_event_open,
goto err_alloc;
}
if (pmu->task_ctx_nr == perf_sw_context)
event->event_caps |= PERF_EV_CAP_SOFTWARE;
if (group_leader &&
(is_software_event(event) != is_software_event(group_leader))) {
if (is_software_event(event)) {
@ -9518,7 +9551,7 @@ SYSCALL_DEFINE5(perf_event_open,
*/
pmu = group_leader->pmu;
} else if (is_software_event(group_leader) &&
(group_leader->group_flags & PERF_GROUP_SOFTWARE)) {
(group_leader->group_caps & PERF_EV_CAP_SOFTWARE)) {
/*
* In case the group is a pure software group, and we
* try to add a hardware event, move the whole group to
@ -10453,6 +10486,8 @@ static void __init perf_event_init_all_cpus(void)
INIT_LIST_HEAD(&per_cpu(pmu_sb_events.list, cpu));
raw_spin_lock_init(&per_cpu(pmu_sb_events.lock, cpu));
INIT_LIST_HEAD(&per_cpu(sched_cb_list, cpu));
}
}

View File

@ -150,7 +150,7 @@ static loff_t vaddr_to_offset(struct vm_area_struct *vma, unsigned long vaddr)
* Returns 0 on success, -EFAULT on failure.
*/
static int __replace_page(struct vm_area_struct *vma, unsigned long addr,
struct page *page, struct page *kpage)
struct page *old_page, struct page *new_page)
{
struct mm_struct *mm = vma->vm_mm;
spinlock_t *ptl;
@ -161,49 +161,49 @@ static int __replace_page(struct vm_area_struct *vma, unsigned long addr,
const unsigned long mmun_end = addr + PAGE_SIZE;
struct mem_cgroup *memcg;
err = mem_cgroup_try_charge(kpage, vma->vm_mm, GFP_KERNEL, &memcg,
err = mem_cgroup_try_charge(new_page, vma->vm_mm, GFP_KERNEL, &memcg,
false);
if (err)
return err;
/* For try_to_free_swap() and munlock_vma_page() below */
lock_page(page);
lock_page(old_page);
mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end);
err = -EAGAIN;
ptep = page_check_address(page, mm, addr, &ptl, 0);
ptep = page_check_address(old_page, mm, addr, &ptl, 0);
if (!ptep) {
mem_cgroup_cancel_charge(kpage, memcg, false);
mem_cgroup_cancel_charge(new_page, memcg, false);
goto unlock;
}
get_page(kpage);
page_add_new_anon_rmap(kpage, vma, addr, false);
mem_cgroup_commit_charge(kpage, memcg, false, false);
lru_cache_add_active_or_unevictable(kpage, vma);
get_page(new_page);
page_add_new_anon_rmap(new_page, vma, addr, false);
mem_cgroup_commit_charge(new_page, memcg, false, false);
lru_cache_add_active_or_unevictable(new_page, vma);
if (!PageAnon(page)) {
dec_mm_counter(mm, mm_counter_file(page));
if (!PageAnon(old_page)) {
dec_mm_counter(mm, mm_counter_file(old_page));
inc_mm_counter(mm, MM_ANONPAGES);
}
flush_cache_page(vma, addr, pte_pfn(*ptep));
ptep_clear_flush_notify(vma, addr, ptep);
set_pte_at_notify(mm, addr, ptep, mk_pte(kpage, vma->vm_page_prot));
set_pte_at_notify(mm, addr, ptep, mk_pte(new_page, vma->vm_page_prot));
page_remove_rmap(page, false);
if (!page_mapped(page))
try_to_free_swap(page);
page_remove_rmap(old_page, false);
if (!page_mapped(old_page))
try_to_free_swap(old_page);
pte_unmap_unlock(ptep, ptl);
if (vma->vm_flags & VM_LOCKED)
munlock_vma_page(page);
put_page(page);
munlock_vma_page(old_page);
put_page(old_page);
err = 0;
unlock:
mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);
unlock_page(page);
unlock_page(old_page);
return err;
}

View File

@ -4123,6 +4123,30 @@ static const char readme_msg[] =
"\t\t\t traces\n"
#endif
#endif /* CONFIG_STACK_TRACER */
#ifdef CONFIG_KPROBE_EVENT
" kprobe_events\t\t- Add/remove/show the kernel dynamic events\n"
"\t\t\t Write into this file to define/undefine new trace events.\n"
#endif
#ifdef CONFIG_UPROBE_EVENT
" uprobe_events\t\t- Add/remove/show the userspace dynamic events\n"
"\t\t\t Write into this file to define/undefine new trace events.\n"
#endif
#if defined(CONFIG_KPROBE_EVENT) || defined(CONFIG_UPROBE_EVENT)
"\t accepts: event-definitions (one definition per line)\n"
"\t Format: p|r[:[<group>/]<event>] <place> [<args>]\n"
"\t -:[<group>/]<event>\n"
#ifdef CONFIG_KPROBE_EVENT
"\t place: [<module>:]<symbol>[+<offset>]|<memaddr>\n"
#endif
#ifdef CONFIG_UPROBE_EVENT
"\t place: <path>:<offset>\n"
#endif
"\t args: <name>=fetcharg[:type]\n"
"\t fetcharg: %<register>, @<address>, @<symbol>[+|-<offset>],\n"
"\t $stack<index>, $stack, $retval, $comm\n"
"\t type: s8/16/32/64, u8/16/32/64, x8/16/32/64, string,\n"
"\t b<bit-width>@<bit-offset>/<container-size>\n"
#endif
" events/\t\t- Directory containing all trace event subsystems:\n"
" enable\t\t- Write 0/1 to enable/disable tracing of all events\n"
" events/<system>/\t- Directory containing all trace events for <system>:\n"

View File

@ -253,6 +253,10 @@ static const struct fetch_type kprobes_fetch_type_table[] = {
ASSIGN_FETCH_TYPE(s16, u16, 1),
ASSIGN_FETCH_TYPE(s32, u32, 1),
ASSIGN_FETCH_TYPE(s64, u64, 1),
ASSIGN_FETCH_TYPE_ALIAS(x8, u8, u8, 0),
ASSIGN_FETCH_TYPE_ALIAS(x16, u16, u16, 0),
ASSIGN_FETCH_TYPE_ALIAS(x32, u32, u32, 0),
ASSIGN_FETCH_TYPE_ALIAS(x64, u64, u64, 0),
ASSIGN_FETCH_TYPE_END
};

View File

@ -36,24 +36,28 @@ const char *reserved_field_names[] = {
};
/* Printing in basic type function template */
#define DEFINE_BASIC_PRINT_TYPE_FUNC(type, fmt) \
int PRINT_TYPE_FUNC_NAME(type)(struct trace_seq *s, const char *name, \
#define DEFINE_BASIC_PRINT_TYPE_FUNC(tname, type, fmt) \
int PRINT_TYPE_FUNC_NAME(tname)(struct trace_seq *s, const char *name, \
void *data, void *ent) \
{ \
trace_seq_printf(s, " %s=" fmt, name, *(type *)data); \
return !trace_seq_has_overflowed(s); \
} \
const char PRINT_TYPE_FMT_NAME(type)[] = fmt; \
NOKPROBE_SYMBOL(PRINT_TYPE_FUNC_NAME(type));
const char PRINT_TYPE_FMT_NAME(tname)[] = fmt; \
NOKPROBE_SYMBOL(PRINT_TYPE_FUNC_NAME(tname));
DEFINE_BASIC_PRINT_TYPE_FUNC(u8 , "0x%x")
DEFINE_BASIC_PRINT_TYPE_FUNC(u16, "0x%x")
DEFINE_BASIC_PRINT_TYPE_FUNC(u32, "0x%x")
DEFINE_BASIC_PRINT_TYPE_FUNC(u64, "0x%Lx")
DEFINE_BASIC_PRINT_TYPE_FUNC(s8, "%d")
DEFINE_BASIC_PRINT_TYPE_FUNC(s16, "%d")
DEFINE_BASIC_PRINT_TYPE_FUNC(s32, "%d")
DEFINE_BASIC_PRINT_TYPE_FUNC(s64, "%Ld")
DEFINE_BASIC_PRINT_TYPE_FUNC(u8, u8, "%u")
DEFINE_BASIC_PRINT_TYPE_FUNC(u16, u16, "%u")
DEFINE_BASIC_PRINT_TYPE_FUNC(u32, u32, "%u")
DEFINE_BASIC_PRINT_TYPE_FUNC(u64, u64, "%Lu")
DEFINE_BASIC_PRINT_TYPE_FUNC(s8, s8, "%d")
DEFINE_BASIC_PRINT_TYPE_FUNC(s16, s16, "%d")
DEFINE_BASIC_PRINT_TYPE_FUNC(s32, s32, "%d")
DEFINE_BASIC_PRINT_TYPE_FUNC(s64, s64, "%Ld")
DEFINE_BASIC_PRINT_TYPE_FUNC(x8, u8, "0x%x")
DEFINE_BASIC_PRINT_TYPE_FUNC(x16, u16, "0x%x")
DEFINE_BASIC_PRINT_TYPE_FUNC(x32, u32, "0x%x")
DEFINE_BASIC_PRINT_TYPE_FUNC(x64, u64, "0x%Lx")
/* Print type function for string type */
int PRINT_TYPE_FUNC_NAME(string)(struct trace_seq *s, const char *name,

View File

@ -149,6 +149,11 @@ DECLARE_BASIC_PRINT_TYPE_FUNC(s8);
DECLARE_BASIC_PRINT_TYPE_FUNC(s16);
DECLARE_BASIC_PRINT_TYPE_FUNC(s32);
DECLARE_BASIC_PRINT_TYPE_FUNC(s64);
DECLARE_BASIC_PRINT_TYPE_FUNC(x8);
DECLARE_BASIC_PRINT_TYPE_FUNC(x16);
DECLARE_BASIC_PRINT_TYPE_FUNC(x32);
DECLARE_BASIC_PRINT_TYPE_FUNC(x64);
DECLARE_BASIC_PRINT_TYPE_FUNC(string);
#define FETCH_FUNC_NAME(method, type) fetch_##method##_##type
@ -203,7 +208,7 @@ DEFINE_FETCH_##method(u32) \
DEFINE_FETCH_##method(u64)
/* Default (unsigned long) fetch type */
#define __DEFAULT_FETCH_TYPE(t) u##t
#define __DEFAULT_FETCH_TYPE(t) x##t
#define _DEFAULT_FETCH_TYPE(t) __DEFAULT_FETCH_TYPE(t)
#define DEFAULT_FETCH_TYPE _DEFAULT_FETCH_TYPE(BITS_PER_LONG)
#define DEFAULT_FETCH_TYPE_STR __stringify(DEFAULT_FETCH_TYPE)
@ -234,6 +239,10 @@ ASSIGN_FETCH_FUNC(file_offset, ftype), \
#define ASSIGN_FETCH_TYPE(ptype, ftype, sign) \
__ASSIGN_FETCH_TYPE(#ptype, ptype, ftype, sizeof(ftype), sign, #ptype)
/* If ptype is an alias of atype, use this macro (show atype in format) */
#define ASSIGN_FETCH_TYPE_ALIAS(ptype, atype, ftype, sign) \
__ASSIGN_FETCH_TYPE(#ptype, ptype, ftype, sizeof(ftype), sign, #atype)
#define ASSIGN_FETCH_TYPE_END {}
#define FETCH_TYPE_STRING 0

View File

@ -211,6 +211,10 @@ static const struct fetch_type uprobes_fetch_type_table[] = {
ASSIGN_FETCH_TYPE(s16, u16, 1),
ASSIGN_FETCH_TYPE(s32, u32, 1),
ASSIGN_FETCH_TYPE(s64, u64, 1),
ASSIGN_FETCH_TYPE_ALIAS(x8, u8, u8, 0),
ASSIGN_FETCH_TYPE_ALIAS(x16, u16, u16, 0),
ASSIGN_FETCH_TYPE_ALIAS(x32, u32, u32, 0),
ASSIGN_FETCH_TYPE_ALIAS(x64, u64, u64, 0),
ASSIGN_FETCH_TYPE_END
};

View File

@ -0,0 +1,47 @@
#ifndef TOOLS_ARCH_ALPHA_UAPI_ASM_MMAN_FIX_H
#define TOOLS_ARCH_ALPHA_UAPI_ASM_MMAN_FIX_H
#define MADV_DODUMP 17
#define MADV_DOFORK 11
#define MADV_DONTDUMP 16
#define MADV_DONTFORK 10
#define MADV_DONTNEED 6
#define MADV_FREE 8
#define MADV_HUGEPAGE 14
#define MADV_MERGEABLE 12
#define MADV_NOHUGEPAGE 15
#define MADV_NORMAL 0
#define MADV_RANDOM 1
#define MADV_REMOVE 9
#define MADV_SEQUENTIAL 2
#define MADV_UNMERGEABLE 13
#define MADV_WILLNEED 3
#define MAP_ANONYMOUS 0x10
#define MAP_DENYWRITE 0x02000
#define MAP_EXECUTABLE 0x04000
#define MAP_FILE 0
#define MAP_FIXED 0x100
#define MAP_GROWSDOWN 0x01000
#define MAP_HUGETLB 0x100000
#define MAP_LOCKED 0x08000
#define MAP_NONBLOCK 0x40000
#define MAP_NORESERVE 0x10000
#define MAP_POPULATE 0x20000
#define MAP_PRIVATE 0x02
#define MAP_SHARED 0x01
#define MAP_STACK 0x80000
#define PROT_EXEC 0x4
#define PROT_GROWSDOWN 0x01000000
#define PROT_GROWSUP 0x02000000
#define PROT_NONE 0x0
#define PROT_READ 0x1
#define PROT_SEM 0x8
#define PROT_WRITE 0x2
/* MADV_HWPOISON is undefined on alpha, fix it for perf */
#define MADV_HWPOISON 100
/* MADV_SOFT_OFFLINE is undefined on alpha, fix it for perf */
#define MADV_SOFT_OFFLINE 101
/* MAP_32BIT is undefined on alpha, fix it for perf */
#define MAP_32BIT 0
/* MAP_UNINITIALIZED is undefined on alpha, fix it for perf */
#define MAP_UNINITIALIZED 0
#endif

View File

@ -0,0 +1,6 @@
#ifndef TOOLS_ARCH_ARC_UAPI_ASM_MMAN_FIX_H
#define TOOLS_ARCH_ARC_UAPI_ASM_MMAN_FIX_H
#include <uapi/asm-generic/mman.h>
/* MAP_32BIT is undefined on arc, fix it for perf */
#define MAP_32BIT 0
#endif

View File

@ -0,0 +1,6 @@
#ifndef TOOLS_ARCH_ARM_UAPI_ASM_MMAN_FIX_H
#define TOOLS_ARCH_ARM_UAPI_ASM_MMAN_FIX_H
#include <uapi/asm-generic/mman.h>
/* MAP_32BIT is undefined on arm, fix it for perf */
#define MAP_32BIT 0
#endif

View File

@ -0,0 +1,6 @@
#ifndef TOOLS_ARCH_ARM64_UAPI_ASM_MMAN_FIX_H
#define TOOLS_ARCH_ARM64_UAPI_ASM_MMAN_FIX_H
#include <uapi/asm-generic/mman.h>
/* MAP_32BIT is undefined on arm64, fix it for perf */
#define MAP_32BIT 0
#endif

View File

@ -0,0 +1,6 @@
#ifndef TOOLS_ARCH_FRV_UAPI_ASM_MMAN_FIX_H
#define TOOLS_ARCH_FRV_UAPI_ASM_MMAN_FIX_H
#include <uapi/asm-generic/mman.h>
/* MAP_32BIT is undefined on frv, fix it for perf */
#define MAP_32BIT 0
#endif

View File

@ -0,0 +1,6 @@
#ifndef TOOLS_ARCH_H8300_UAPI_ASM_MMAN_FIX_H
#define TOOLS_ARCH_H8300_UAPI_ASM_MMAN_FIX_H
#include <uapi/asm-generic/mman.h>
/* MAP_32BIT is undefined on h8300, fix it for perf */
#define MAP_32BIT 0
#endif

View File

@ -0,0 +1,6 @@
#ifndef TOOLS_ARCH_HEXAGON_UAPI_ASM_MMAN_FIX_H
#define TOOLS_ARCH_HEXAGON_UAPI_ASM_MMAN_FIX_H
#include <uapi/asm-generic/mman.h>
/* MAP_32BIT is undefined on hexagon, fix it for perf */
#define MAP_32BIT 0
#endif

View File

@ -0,0 +1,6 @@
#ifndef TOOLS_ARCH_IA64_UAPI_ASM_MMAN_FIX_H
#define TOOLS_ARCH_IA64_UAPI_ASM_MMAN_FIX_H
#include <uapi/asm-generic/mman.h>
/* MAP_32BIT is undefined on ia64, fix it for perf */
#define MAP_32BIT 0
#endif

View File

@ -0,0 +1,6 @@
#ifndef TOOLS_ARCH_M32R_UAPI_ASM_MMAN_FIX_H
#define TOOLS_ARCH_M32R_UAPI_ASM_MMAN_FIX_H
#include <uapi/asm-generic/mman.h>
/* MAP_32BIT is undefined on m32r, fix it for perf */
#define MAP_32BIT 0
#endif

View File

@ -0,0 +1,6 @@
#ifndef TOOLS_ARCH_MICROBLAZE_UAPI_ASM_MMAN_FIX_H
#define TOOLS_ARCH_MICROBLAZE_UAPI_ASM_MMAN_FIX_H
#include <uapi/asm-generic/mman.h>
/* MAP_32BIT is undefined on microblaze, fix it for perf */
#define MAP_32BIT 0
#endif

View File

@ -0,0 +1,46 @@
#ifndef TOOLS_ARCH_MIPS_UAPI_ASM_MMAN_FIX_H
#define TOOLS_ARCH_MIPS_UAPI_ASM_MMAN_FIX_H
#define MADV_DODUMP 17
#define MADV_DOFORK 11
#define MADV_DONTDUMP 16
#define MADV_DONTFORK 10
#define MADV_DONTNEED 4
#define MADV_FREE 8
#define MADV_HUGEPAGE 14
#define MADV_HWPOISON 100
#define MADV_MERGEABLE 12
#define MADV_NOHUGEPAGE 15
#define MADV_NORMAL 0
#define MADV_RANDOM 1
#define MADV_REMOVE 9
#define MADV_SEQUENTIAL 2
#define MADV_UNMERGEABLE 13
#define MADV_WILLNEED 3
#define MAP_ANONYMOUS 0x0800
#define MAP_DENYWRITE 0x2000
#define MAP_EXECUTABLE 0x4000
#define MAP_FILE 0
#define MAP_FIXED 0x010
#define MAP_GROWSDOWN 0x1000
#define MAP_HUGETLB 0x80000
#define MAP_LOCKED 0x8000
#define MAP_NONBLOCK 0x20000
#define MAP_NORESERVE 0x0400
#define MAP_POPULATE 0x10000
#define MAP_PRIVATE 0x002
#define MAP_SHARED 0x001
#define MAP_STACK 0x40000
#define PROT_EXEC 0x04
#define PROT_GROWSDOWN 0x01000000
#define PROT_GROWSUP 0x02000000
#define PROT_NONE 0x00
#define PROT_READ 0x01
#define PROT_SEM 0x10
#define PROT_WRITE 0x02
/* MADV_SOFT_OFFLINE is undefined on mips, fix it for perf */
#define MADV_SOFT_OFFLINE 101
/* MAP_32BIT is undefined on mips, fix it for perf */
#define MAP_32BIT 0
/* MAP_UNINITIALIZED is undefined on mips, fix it for perf */
#define MAP_UNINITIALIZED 0
#endif

View File

@ -0,0 +1,6 @@
#ifndef TOOLS_ARCH_MN10300_UAPI_ASM_MMAN_FIX_H
#define TOOLS_ARCH_MN10300_UAPI_ASM_MMAN_FIX_H
#include <uapi/asm-generic/mman.h>
/* MAP_32BIT is undefined on mn10300, fix it for perf */
#define MAP_32BIT 0
#endif

View File

@ -0,0 +1,47 @@
#ifndef TOOLS_ARCH_PARISC_UAPI_ASM_MMAN_FIX_H
#define TOOLS_ARCH_PARISC_UAPI_ASM_MMAN_FIX_H
#define MADV_DODUMP 70
#define MADV_DOFORK 11
#define MADV_DONTDUMP 69
#define MADV_DONTFORK 10
#define MADV_DONTNEED 4
#define MADV_FREE 8
#define MADV_HUGEPAGE 67
#define MADV_MERGEABLE 65
#define MADV_NOHUGEPAGE 68
#define MADV_NORMAL 0
#define MADV_RANDOM 1
#define MADV_REMOVE 9
#define MADV_SEQUENTIAL 2
#define MADV_UNMERGEABLE 66
#define MADV_WILLNEED 3
#define MAP_ANONYMOUS 0x10
#define MAP_DENYWRITE 0x0800
#define MAP_EXECUTABLE 0x1000
#define MAP_FILE 0
#define MAP_FIXED 0x04
#define MAP_GROWSDOWN 0x8000
#define MAP_HUGETLB 0x80000
#define MAP_LOCKED 0x2000
#define MAP_NONBLOCK 0x20000
#define MAP_NORESERVE 0x4000
#define MAP_POPULATE 0x10000
#define MAP_PRIVATE 0x02
#define MAP_SHARED 0x01
#define MAP_STACK 0x40000
#define PROT_EXEC 0x4
#define PROT_GROWSDOWN 0x01000000
#define PROT_GROWSUP 0x02000000
#define PROT_NONE 0x0
#define PROT_READ 0x1
#define PROT_SEM 0x8
#define PROT_WRITE 0x2
/* MADV_HWPOISON is undefined on parisc, fix it for perf */
#define MADV_HWPOISON 100
/* MADV_SOFT_OFFLINE is undefined on parisc, fix it for perf */
#define MADV_SOFT_OFFLINE 101
/* MAP_32BIT is undefined on parisc, fix it for perf */
#define MAP_32BIT 0
/* MAP_UNINITIALIZED is undefined on parisc, fix it for perf */
#define MAP_UNINITIALIZED 0
#endif

View File

@ -0,0 +1,15 @@
#ifndef TOOLS_ARCH_POWERPC_UAPI_ASM_MMAN_FIX_H
#define TOOLS_ARCH_POWERPC_UAPI_ASM_MMAN_FIX_H
#define MAP_DENYWRITE 0x0800
#define MAP_EXECUTABLE 0x1000
#define MAP_GROWSDOWN 0x0100
#define MAP_HUGETLB 0x40000
#define MAP_LOCKED 0x80
#define MAP_NONBLOCK 0x10000
#define MAP_NORESERVE 0x40
#define MAP_POPULATE 0x8000
#define MAP_STACK 0x20000
#include <uapi/asm-generic/mman-common.h>
/* MAP_32BIT is undefined on powerpc, fix it for perf */
#define MAP_32BIT 0
#endif

View File

@ -0,0 +1,6 @@
#ifndef TOOLS_ARCH_S390_UAPI_ASM_MMAN_FIX_H
#define TOOLS_ARCH_S390_UAPI_ASM_MMAN_FIX_H
#include <uapi/asm-generic/mman.h>
/* MAP_32BIT is undefined on s390, fix it for perf */
#define MAP_32BIT 0
#endif

View File

@ -0,0 +1,6 @@
#ifndef TOOLS_ARCH_SCORE_UAPI_ASM_MMAN_FIX_H
#define TOOLS_ARCH_SCORE_UAPI_ASM_MMAN_FIX_H
#include <uapi/asm-generic/mman.h>
/* MAP_32BIT is undefined on score, fix it for perf */
#define MAP_32BIT 0
#endif

View File

@ -0,0 +1,6 @@
#ifndef TOOLS_ARCH_SH_UAPI_ASM_MMAN_FIX_H
#define TOOLS_ARCH_SH_UAPI_ASM_MMAN_FIX_H
#include <uapi/asm-generic/mman.h>
/* MAP_32BIT is undefined on sh, fix it for perf */
#define MAP_32BIT 0
#endif

View File

@ -0,0 +1,15 @@
#ifndef TOOLS_ARCH_SPARC_UAPI_ASM_MMAN_FIX_H
#define TOOLS_ARCH_SPARC_UAPI_ASM_MMAN_FIX_H
#define MAP_DENYWRITE 0x0800
#define MAP_EXECUTABLE 0x1000
#define MAP_GROWSDOWN 0x0200
#define MAP_HUGETLB 0x40000
#define MAP_LOCKED 0x100
#define MAP_NONBLOCK 0x10000
#define MAP_NORESERVE 0x40
#define MAP_POPULATE 0x8000
#define MAP_STACK 0x20000
#include <uapi/asm-generic/mman-common.h>
/* MAP_32BIT is undefined on sparc, fix it for perf */
#define MAP_32BIT 0
#endif

View File

@ -0,0 +1,15 @@
#ifndef TOOLS_ARCH_TILE_UAPI_ASM_MMAN_FIX_H
#define TOOLS_ARCH_TILE_UAPI_ASM_MMAN_FIX_H
#define MAP_DENYWRITE 0x0800
#define MAP_EXECUTABLE 0x1000
#define MAP_GROWSDOWN 0x0100
#define MAP_HUGETLB 0x4000
#define MAP_LOCKED 0x0200
#define MAP_NONBLOCK 0x0080
#define MAP_NORESERVE 0x0400
#define MAP_POPULATE 0x0040
#define MAP_STACK MAP_GROWSDOWN
#include <uapi/asm-generic/mman-common.h>
/* MAP_32BIT is undefined on tile, fix it for perf */
#define MAP_32BIT 0
#endif

View File

@ -0,0 +1,5 @@
#ifndef TOOLS_ARCH_X86_UAPI_ASM_MMAN_FIX_H
#define TOOLS_ARCH_X86_UAPI_ASM_MMAN_FIX_H
#define MAP_32BIT 0x40
#include <uapi/asm-generic/mman.h>
#endif

View File

@ -0,0 +1,47 @@
#ifndef TOOLS_ARCH_XTENSA_UAPI_ASM_MMAN_FIX_H
#define TOOLS_ARCH_XTENSA_UAPI_ASM_MMAN_FIX_H
#define MADV_DODUMP 17
#define MADV_DOFORK 11
#define MADV_DONTDUMP 16
#define MADV_DONTFORK 10
#define MADV_DONTNEED 4
#define MADV_FREE 8
#define MADV_HUGEPAGE 14
#define MADV_MERGEABLE 12
#define MADV_NOHUGEPAGE 15
#define MADV_NORMAL 0
#define MADV_RANDOM 1
#define MADV_REMOVE 9
#define MADV_SEQUENTIAL 2
#define MADV_UNMERGEABLE 13
#define MADV_WILLNEED 3
#define MAP_ANONYMOUS 0x0800
#define MAP_DENYWRITE 0x2000
#define MAP_EXECUTABLE 0x4000
#define MAP_FILE 0
#define MAP_FIXED 0x010
#define MAP_GROWSDOWN 0x1000
#define MAP_HUGETLB 0x80000
#define MAP_LOCKED 0x8000
#define MAP_NONBLOCK 0x20000
#define MAP_NORESERVE 0x0400
#define MAP_POPULATE 0x10000
#define MAP_PRIVATE 0x002
#define MAP_SHARED 0x001
#define MAP_STACK 0x40000
#define PROT_EXEC 0x4
#define PROT_GROWSDOWN 0x01000000
#define PROT_GROWSUP 0x02000000
#define PROT_NONE 0x0
#define PROT_READ 0x1
#define PROT_SEM 0x10
#define PROT_WRITE 0x2
/* MADV_HWPOISON is undefined on xtensa, fix it for perf */
#define MADV_HWPOISON 100
/* MADV_SOFT_OFFLINE is undefined on xtensa, fix it for perf */
#define MADV_SOFT_OFFLINE 101
/* MAP_32BIT is undefined on xtensa, fix it for perf */
#define MAP_32BIT 0
/* MAP_UNINITIALIZED is undefined on xtensa, fix it for perf */
#define MAP_UNINITIALIZED 0
#endif

View File

@ -0,0 +1,39 @@
/*
* Copyright(C) 2015 Linaro Limited. All rights reserved.
* Author: Mathieu Poirier <mathieu.poirier@linaro.org>
*
* This program is free software; you can redistribute it and/or modify it
* under the terms of the GNU General Public License version 2 as published by
* the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful, but WITHOUT
* ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
* FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
* more details.
*
* You should have received a copy of the GNU General Public License along with
* this program. If not, see <http://www.gnu.org/licenses/>.
*/
#ifndef _LINUX_CORESIGHT_PMU_H
#define _LINUX_CORESIGHT_PMU_H
#define CORESIGHT_ETM_PMU_NAME "cs_etm"
#define CORESIGHT_ETM_PMU_SEED 0x10
/* ETMv3.5/PTM's ETMCR config bit */
#define ETM_OPT_CYCACC 12
#define ETM_OPT_TS 28
static inline int coresight_get_trace_id(int cpu)
{
/*
* A trace ID of value 0 is invalid, so let's start at some
* random value that fits in 7 bits and go from there. Since
* the common convention is to have data trace IDs be I(N) + 1,
* set instruction trace IDs as a function of the CPU number.
*/
return (CORESIGHT_ETM_PMU_SEED + (cpu * 2));
}
#endif

View File

@ -0,0 +1,12 @@
#ifndef _TOOLS_LINUX_TIME64_H
#define _TOOLS_LINUX_TIME64_H
#define MSEC_PER_SEC 1000L
#define USEC_PER_MSEC 1000L
#define NSEC_PER_USEC 1000L
#define NSEC_PER_MSEC 1000000L
#define USEC_PER_SEC 1000000L
#define NSEC_PER_SEC 1000000000L
#define FSEC_PER_SEC 1000000000000000LL
#endif /* _LINUX_TIME64_H */

View File

@ -0,0 +1,75 @@
#ifndef __ASM_GENERIC_MMAN_COMMON_H
#define __ASM_GENERIC_MMAN_COMMON_H
/*
Author: Michael S. Tsirkin <mst@mellanox.co.il>, Mellanox Technologies Ltd.
Based on: asm-xxx/mman.h
*/
#define PROT_READ 0x1 /* page can be read */
#define PROT_WRITE 0x2 /* page can be written */
#define PROT_EXEC 0x4 /* page can be executed */
#define PROT_SEM 0x8 /* page may be used for atomic ops */
#define PROT_NONE 0x0 /* page can not be accessed */
#define PROT_GROWSDOWN 0x01000000 /* mprotect flag: extend change to start of growsdown vma */
#define PROT_GROWSUP 0x02000000 /* mprotect flag: extend change to end of growsup vma */
#define MAP_SHARED 0x01 /* Share changes */
#define MAP_PRIVATE 0x02 /* Changes are private */
#define MAP_TYPE 0x0f /* Mask for type of mapping */
#define MAP_FIXED 0x10 /* Interpret addr exactly */
#define MAP_ANONYMOUS 0x20 /* don't use a file */
#ifdef CONFIG_MMAP_ALLOW_UNINITIALIZED
# define MAP_UNINITIALIZED 0x4000000 /* For anonymous mmap, memory could be uninitialized */
#else
# define MAP_UNINITIALIZED 0x0 /* Don't support this flag */
#endif
/*
* Flags for mlock
*/
#define MLOCK_ONFAULT 0x01 /* Lock pages in range after they are faulted in, do not prefault */
#define MS_ASYNC 1 /* sync memory asynchronously */
#define MS_INVALIDATE 2 /* invalidate the caches */
#define MS_SYNC 4 /* synchronous memory sync */
#define MADV_NORMAL 0 /* no further special treatment */
#define MADV_RANDOM 1 /* expect random page references */
#define MADV_SEQUENTIAL 2 /* expect sequential page references */
#define MADV_WILLNEED 3 /* will need these pages */
#define MADV_DONTNEED 4 /* don't need these pages */
/* common parameters: try to keep these consistent across architectures */
#define MADV_FREE 8 /* free pages only if memory pressure */
#define MADV_REMOVE 9 /* remove these pages & resources */
#define MADV_DONTFORK 10 /* don't inherit across fork */
#define MADV_DOFORK 11 /* do inherit across fork */
#define MADV_HWPOISON 100 /* poison a page for testing */
#define MADV_SOFT_OFFLINE 101 /* soft offline page for testing */
#define MADV_MERGEABLE 12 /* KSM may merge identical pages */
#define MADV_UNMERGEABLE 13 /* KSM may not merge identical pages */
#define MADV_HUGEPAGE 14 /* Worth backing with hugepages */
#define MADV_NOHUGEPAGE 15 /* Not worth backing with hugepages */
#define MADV_DONTDUMP 16 /* Explicity exclude from the core dump,
overrides the coredump filter bits */
#define MADV_DODUMP 17 /* Clear the MADV_DONTDUMP flag */
/* compatibility flags */
#define MAP_FILE 0
/*
* When MAP_HUGETLB is set bits [26:31] encode the log2 of the huge page size.
* This gives us 6 bits, which is enough until someone invents 128 bit address
* spaces.
*
* Assume these are all power of twos.
* When 0 use the default page size.
*/
#define MAP_HUGE_SHIFT 26
#define MAP_HUGE_MASK 0x3f
#endif /* __ASM_GENERIC_MMAN_COMMON_H */

View File

@ -0,0 +1,22 @@
#ifndef __ASM_GENERIC_MMAN_H
#define __ASM_GENERIC_MMAN_H
#include <uapi/asm-generic/mman-common.h>
#define MAP_GROWSDOWN 0x0100 /* stack-like segment */
#define MAP_DENYWRITE 0x0800 /* ETXTBSY */
#define MAP_EXECUTABLE 0x1000 /* mark it as an executable */
#define MAP_LOCKED 0x2000 /* pages are locked */
#define MAP_NORESERVE 0x4000 /* don't check for reservations */
#define MAP_POPULATE 0x8000 /* populate (prefault) pagetables */
#define MAP_NONBLOCK 0x10000 /* do not block on IO */
#define MAP_STACK 0x20000 /* give out an address that is best suited for process/thread stacks */
#define MAP_HUGETLB 0x40000 /* create a huge page mapping */
/* Bits [26:31] are reserved, see mman-common.h for MAP_HUGETLB usage */
#define MCL_CURRENT 1 /* lock all current mappings */
#define MCL_FUTURE 2 /* lock all future mappings */
#define MCL_ONFAULT 4 /* lock all pages that are faulted in */
#endif /* __ASM_GENERIC_MMAN_H */

View File

@ -0,0 +1,13 @@
#ifndef _UAPI_LINUX_MMAN_H
#define _UAPI_LINUX_MMAN_H
#include <uapi/asm/mman.h>
#define MREMAP_MAYMOVE 1
#define MREMAP_FIXED 2
#define OVERCOMMIT_GUESS 0
#define OVERCOMMIT_ALWAYS 1
#define OVERCOMMIT_NEVER 2
#endif /* _UAPI_LINUX_MMAN_H */

View File

@ -34,6 +34,10 @@
#define TRACEFS_MAGIC 0x74726163
#endif
#ifndef HUGETLBFS_MAGIC
#define HUGETLBFS_MAGIC 0x958458f6
#endif
static const char * const sysfs__fs_known_mountpoints[] = {
"/sys",
0,
@ -67,6 +71,10 @@ static const char * const tracefs__known_mountpoints[] = {
0,
};
static const char * const hugetlbfs__known_mountpoints[] = {
0,
};
struct fs {
const char *name;
const char * const *mounts;
@ -80,6 +88,7 @@ enum {
FS__PROCFS = 1,
FS__DEBUGFS = 2,
FS__TRACEFS = 3,
FS__HUGETLBFS = 4,
};
#ifndef TRACEFS_MAGIC
@ -107,6 +116,11 @@ static struct fs fs__entries[] = {
.mounts = tracefs__known_mountpoints,
.magic = TRACEFS_MAGIC,
},
[FS__HUGETLBFS] = {
.name = "hugetlbfs",
.mounts = hugetlbfs__known_mountpoints,
.magic = HUGETLBFS_MAGIC,
},
};
static bool fs__read_mounts(struct fs *fs)
@ -265,6 +279,7 @@ FS(sysfs, FS__SYSFS);
FS(procfs, FS__PROCFS);
FS(debugfs, FS__DEBUGFS);
FS(tracefs, FS__TRACEFS);
FS(hugetlbfs, FS__HUGETLBFS);
int filename__read_int(const char *filename, int *value)
{

View File

@ -21,6 +21,7 @@ FS(sysfs)
FS(procfs)
FS(debugfs)
FS(tracefs)
FS(hugetlbfs)
#undef FS

View File

@ -110,6 +110,14 @@ Given a $HOME/.perfconfig like this:
order = caller
sort-key = function
[report]
# Defaults
sort-order = comm,dso,symbol
percent-limit = 0
queue-size = 0
children = true
group = true
Variables
~~~~~~~~~
@ -382,6 +390,10 @@ call-graph.*::
histogram entry. Default is 0 which means no limitation.
report.*::
report.sort_order::
Allows changing the default sort order from "comm,dso,symbol" to
some other default, for instance "sym,dso" may be more fitting for
kernel developers.
report.percent-limit::
This one is mostly the same as call-graph.threshold but works for
histogram entries. Entries having an overhead lower than this

View File

@ -21,6 +21,8 @@ or
'perf probe' [options] --vars='PROBEPOINT'
or
'perf probe' [options] --funcs
or
'perf probe' [options] --definition='PROBE' [...]
DESCRIPTION
-----------
@ -34,6 +36,8 @@ OPTIONS
-k::
--vmlinux=PATH::
Specify vmlinux path which has debuginfo (Dwarf binary).
Only when using this with --definition, you can give an offline
vmlinux file.
-m::
--module=MODNAME|PATH::
@ -96,6 +100,11 @@ OPTIONS
can also list functions in a user space executable / shared library.
This also can accept a FILTER rule argument.
-D::
--definition=::
Show trace-event definition converted from given probe-event instead
of write it into tracing/[k,u]probe_events.
--filter=FILTER::
(Only for --vars and --funcs) Set filter. FILTER is a combination of glob
pattern, see FILTER PATTERN for detail.
@ -176,13 +185,12 @@ Each probe argument follows below syntax.
'NAME' specifies the name of this argument (optional). You can use the name of local variable, local data structure member (e.g. var->field, var.field2), local array with fixed index (e.g. array[1], var->array[0], var->pointer[2]), or kprobe-tracer argument format (e.g. $retval, %ax, etc). Note that the name of this argument will be set as the last member name if you specify a local data structure member (e.g. field2 for 'var->field1.field2'.)
'$vars' and '$params' special arguments are also available for NAME, '$vars' is expanded to the local variables (including function parameters) which can access at given probe point. '$params' is expanded to only the function parameters.
'TYPE' casts the type of this argument (optional). If omitted, perf probe automatically set the type based on debuginfo. Currently, basic types (u8/u16/u32/u64/s8/s16/s32/s64), signedness casting (u/s), "string" and bitfield are supported. (see TYPES for detail)
'TYPE' casts the type of this argument (optional). If omitted, perf probe automatically set the type based on debuginfo (*). Currently, basic types (u8/u16/u32/u64/s8/s16/s32/s64), hexadecimal integers (x/x8/x16/x32/x64), signedness casting (u/s), "string" and bitfield are supported. (see TYPES for detail)
On x86 systems %REG is always the short form of the register: for example %AX. %RAX or %EAX is not valid.
TYPES
-----
Basic types (u8/u16/u32/u64/s8/s16/s32/s64) are integer types. Prefix 's' and 'u' means those types are signed and unsigned respectively. Traced arguments are shown in decimal (signed) or hex (unsigned). You can also use 's' or 'u' to specify only signedness and leave its size auto-detected by perf probe.
Basic types (u8/u16/u32/u64/s8/s16/s32/s64) and hexadecimal integers (x8/x16/x32/x64) are integer types. Prefix 's' and 'u' means those types are signed and unsigned respectively, and 'x' means that is shown in hexadecimal format. Traced arguments are shown in decimal (sNN/uNN) or hex (xNN). You can also use 's' or 'u' to specify only signedness and leave its size auto-detected by perf probe. Moreover, you can use 'x' to explicitly specify to be shown in hexadecimal (the size is also auto-detected).
String type is a special type, which fetches a "null-terminated" string from kernel space. This means it will fail and store NULL if the string container has been paged out. You can specify 'string' type only for the local variable or structure member which is an array of or a pointer to 'char' or 'unsigned char' type.
Bitfield is another special type, which takes 3 parameters, bit-width, bit-offset, and container-size (usually 32). The syntax is;

View File

@ -35,15 +35,15 @@ OPTIONS
- a symbolically formed PMU event like 'pmu/param1=0x3,param2/' where
'param1', 'param2', etc are defined as formats for the PMU in
/sys/bus/event_sources/devices/<pmu>/format/*.
/sys/bus/event_source/devices/<pmu>/format/*.
- a symbolically formed event like 'pmu/config=M,config1=N,config3=K/'
where M, N, K are numbers (in decimal, hex, octal format). Acceptable
values for each of 'config', 'config1' and 'config2' are defined by
corresponding entries in /sys/bus/event_sources/devices/<pmu>/format/*
corresponding entries in /sys/bus/event_source/devices/<pmu>/format/*
param1 and param2 are defined as formats for the PMU in:
/sys/bus/event_sources/devices/<pmu>/format/*
/sys/bus/event_source/devices/<pmu>/format/*
There are also some params which are not defined in .../<pmu>/format/*.
These params can be used to overload default config values per event.
@ -60,6 +60,18 @@ OPTIONS
Note: If user explicitly sets options which conflict with the params,
the value set by the params will be overridden.
Also not defined in .../<pmu>/format/* are PMU driver specific
configuration parameters. Any configuration parameter preceded by
the letter '@' is not interpreted in user space and sent down directly
to the PMU driver. For example:
perf record -e some_event/@cfg1,@cfg2=config/ ...
will see 'cfg1' and 'cfg2=config' pushed to the PMU driver associated
with the event for further processing. There is no restriction on
what the configuration parameters are, as long as their semantic is
understood and supported by the PMU driver.
- a hardware breakpoint event in the form of '\mem:addr[/len][:access]'
where addr is the address in memory you want to break in.
Access is the memory access type (read, write, execute) it can
@ -77,9 +89,62 @@ OPTIONS
--filter=<filter>::
Event filter. This option should follow a event selector (-e) which
selects tracepoint event(s). Multiple '--filter' options are combined
selects either tracepoint event(s) or a hardware trace PMU
(e.g. Intel PT or CoreSight).
- tracepoint filters
In the case of tracepoints, multiple '--filter' options are combined
using '&&'.
- address filters
A hardware trace PMU advertises its ability to accept a number of
address filters by specifying a non-zero value in
/sys/bus/event_source/devices/<pmu>/nr_addr_filters.
Address filters have the format:
filter|start|stop|tracestop <start> [/ <size>] [@<file name>]
Where:
- 'filter': defines a region that will be traced.
- 'start': defines an address at which tracing will begin.
- 'stop': defines an address at which tracing will stop.
- 'tracestop': defines a region in which tracing will stop.
<file name> is the name of the object file, <start> is the offset to the
code to trace in that file, and <size> is the size of the region to
trace. 'start' and 'stop' filters need not specify a <size>.
If no object file is specified then the kernel is assumed, in which case
the start address must be a current kernel memory address.
<start> can also be specified by providing the name of a symbol. If the
symbol name is not unique, it can be disambiguated by inserting #n where
'n' selects the n'th symbol in address order. Alternately #0, #g or #G
select only a global symbol. <size> can also be specified by providing
the name of a symbol, in which case the size is calculated to the end
of that symbol. For 'filter' and 'tracestop' filters, if <size> is
omitted and <start> is a symbol, then the size is calculated to the end
of that symbol.
If <size> is omitted and <start> is '*', then the start and size will
be calculated from the first and last symbols, i.e. to trace the whole
file.
If symbol names (or '*') are provided, they must be surrounded by white
space.
The filter passed to the kernel is not necessarily the same as entered.
To see the filter that is passed, use the -v option.
The kernel may not be able to configure a trace region if it is not
within a single mapping. MMAP events (or /proc/<pid>/maps) can be
examined to determine if that is a possibility.
Multiple filters can be separated with space or comma.
--exclude-perf::
Don't record events issued by perf itself. This option should follow
a event selector (-e) which selects tracepoint event(s). It adds a

View File

@ -437,6 +437,10 @@ in pmu-tools parser. This allows to read perf.data from python and dump it.
quipper
The quipper C++ parser is available at
https://chromium.googlesource.com/chromiumos/platform/chromiumos-wide-profiling/
https://chromium.googlesource.com/chromiumos/platform2
It is under the chromiumos-wide-profiling/ subdirectory. This library can
convert a perf data file to a protobuf and vice versa.
Unfortunately this parser tends to be many versions behind and may not be able
to parse data files generated by recent perf.

View File

@ -27,3 +27,12 @@
use_offset = true
jump_arrows = true
show_nr_jumps = false
[report]
# Defaults
sort-order = comm,dso,symbol
percent-limit = 0
queue-size = 0
children = true
group = true

View File

@ -60,14 +60,18 @@ tools/include/asm-generic/bitops.h
tools/include/linux/atomic.h
tools/include/linux/bitops.h
tools/include/linux/compiler.h
tools/include/linux/coresight-pmu.h
tools/include/linux/filter.h
tools/include/linux/hash.h
tools/include/linux/kernel.h
tools/include/linux/list.h
tools/include/linux/log2.h
tools/include/uapi/asm-generic/mman-common.h
tools/include/uapi/asm-generic/mman.h
tools/include/uapi/linux/bpf.h
tools/include/uapi/linux/bpf_common.h
tools/include/uapi/linux/hw_breakpoint.h
tools/include/uapi/linux/mman.h
tools/include/uapi/linux/perf_event.h
tools/include/linux/poison.h
tools/include/linux/rbtree.h
@ -77,4 +81,6 @@ tools/include/linux/stringify.h
tools/include/linux/types.h
tools/include/linux/err.h
tools/include/linux/bitmap.h
tools/include/linux/time64.h
tools/arch/*/include/uapi/asm/mman.h
tools/arch/*/include/uapi/asm/perf_regs.h

View File

@ -746,10 +746,13 @@ ifdef LIBBABELTRACE
endif
ifndef NO_AUXTRACE
ifeq ($(feature-get_cpuid), 0)
msg := $(warning Your gcc lacks the __get_cpuid() builtin, disables support for auxtrace/Intel PT, please install a newer gcc);
NO_AUXTRACE := 1
else
ifeq ($(ARCH),x86)
ifeq ($(feature-get_cpuid), 0)
msg := $(warning Your gcc lacks the __get_cpuid() builtin, disables support for auxtrace/Intel PT, please install a newer gcc);
NO_AUXTRACE := 1
endif
endif
ifndef NO_AUXTRACE
$(call detected,CONFIG_AUXTRACE)
CFLAGS += -DHAVE_AUXTRACE_SUPPORT
endif

View File

@ -165,7 +165,7 @@ SUBCMD_DIR = $(srctree)/tools/lib/subcmd/
# non-config cases
config := 1
NON_CONFIG_TARGETS := clean TAGS tags cscope help
NON_CONFIG_TARGETS := clean TAGS tags cscope help install-doc
ifdef MAKECMDGOALS
ifeq ($(filter-out $(NON_CONFIG_TARGETS),$(MAKECMDGOALS)),)
@ -429,6 +429,18 @@ $(PERF_IN): prepare FORCE
@(test -f ../../include/asm-generic/bitops/fls64.h && ( \
(diff -B ../include/asm-generic/bitops/fls64.h ../../include/asm-generic/bitops/fls64.h >/dev/null) \
|| echo "Warning: tools/include/asm-generic/bitops/fls64.h differs from kernel" >&2 )) || true
@(test -f ../../include/linux/coresight-pmu.h && ( \
(diff -B ../include/linux/coresight-pmu.h ../../include/linux/coresight-pmu.h >/dev/null) \
|| echo "Warning: tools/include/linux/coresight-pmu.h differs from kernel" >&2 )) || true
@(test -f ../../include/uapi/asm-generic/mman-common.h && ( \
(diff -B ../include/uapi/asm-generic/mman-common.h ../../include/uapi/asm-generic/mman-common.h >/dev/null) \
|| echo "Warning: tools/include/uapi/asm-generic/mman-common.h differs from kernel" >&2 )) || true
@(test -f ../../include/uapi/asm-generic/mman.h && ( \
(diff -B -I "^#include <\(uapi/\)*asm-generic/mman-common.h>$$" ../include/uapi/asm-generic/mman.h ../../include/uapi/asm-generic/mman.h >/dev/null) \
|| echo "Warning: tools/include/uapi/asm-generic/mman.h differs from kernel" >&2 )) || true
@(test -f ../../include/uapi/linux/mman.h && ( \
(diff -B -I "^#include <\(uapi/\)*asm/mman.h>$$" ../include/uapi/linux/mman.h ../../include/uapi/linux/mman.h >/dev/null) \
|| echo "Warning: tools/include/uapi/linux/mman.h differs from kernel" >&2 )) || true
$(Q)$(MAKE) $(build)=perf
$(OUTPUT)perf: $(PERFLIBS) $(PERF_IN) $(LIBTRACEEVENT_DYNAMIC_LIST)

View File

@ -0,0 +1,9 @@
#ifdef DEFINE_DWARF_REGSTR_TABLE
/* This is included in perf/util/dwarf-regs.c */
static const char * const arm_regstr_tbl[] = {
"%r0", "%r1", "%r2", "%r3", "%r4",
"%r5", "%r6", "%r7", "%r8", "%r9", "%r10",
"%fp", "%ip", "%sp", "%lr", "%pc",
};
#endif

View File

@ -2,3 +2,5 @@ libperf-$(CONFIG_DWARF) += dwarf-regs.o
libperf-$(CONFIG_LOCAL_LIBUNWIND) += unwind-libunwind.o
libperf-$(CONFIG_LIBDW_DWARF_UNWIND) += unwind-libdw.o
libperf-$(CONFIG_AUXTRACE) += pmu.o auxtrace.o cs-etm.o

View File

@ -0,0 +1,54 @@
/*
* Copyright(C) 2015 Linaro Limited. All rights reserved.
* Author: Mathieu Poirier <mathieu.poirier@linaro.org>
*
* This program is free software; you can redistribute it and/or modify it
* under the terms of the GNU General Public License version 2 as published by
* the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful, but WITHOUT
* ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
* FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
* more details.
*
* You should have received a copy of the GNU General Public License along with
* this program. If not, see <http://www.gnu.org/licenses/>.
*/
#include <stdbool.h>
#include <linux/coresight-pmu.h>
#include "../../util/auxtrace.h"
#include "../../util/evlist.h"
#include "../../util/pmu.h"
#include "cs-etm.h"
struct auxtrace_record
*auxtrace_record__init(struct perf_evlist *evlist, int *err)
{
struct perf_pmu *cs_etm_pmu;
struct perf_evsel *evsel;
bool found_etm = false;
cs_etm_pmu = perf_pmu__find(CORESIGHT_ETM_PMU_NAME);
if (evlist) {
evlist__for_each_entry(evlist, evsel) {
if (cs_etm_pmu &&
evsel->attr.type == cs_etm_pmu->type)
found_etm = true;
}
}
if (found_etm)
return cs_etm_record_init(err);
/*
* Clear 'err' even if we haven't found a cs_etm event - that way perf
* record can still be used even if tracers aren't present. The NULL
* return value will take care of telling the infrastructure HW tracing
* isn't available.
*/
*err = 0;
return NULL;
}

View File

@ -0,0 +1,617 @@
/*
* Copyright(C) 2015 Linaro Limited. All rights reserved.
* Author: Mathieu Poirier <mathieu.poirier@linaro.org>
*
* This program is free software; you can redistribute it and/or modify it
* under the terms of the GNU General Public License version 2 as published by
* the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful, but WITHOUT
* ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
* FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
* more details.
*
* You should have received a copy of the GNU General Public License along with
* this program. If not, see <http://www.gnu.org/licenses/>.
*/
#include <api/fs/fs.h>
#include <linux/bitops.h>
#include <linux/coresight-pmu.h>
#include <linux/kernel.h>
#include <linux/log2.h>
#include <linux/types.h>
#include "cs-etm.h"
#include "../../perf.h"
#include "../../util/auxtrace.h"
#include "../../util/cpumap.h"
#include "../../util/evlist.h"
#include "../../util/evsel.h"
#include "../../util/pmu.h"
#include "../../util/thread_map.h"
#include "../../util/cs-etm.h"
#include <stdlib.h>
#define ENABLE_SINK_MAX 128
#define CS_BUS_DEVICE_PATH "/bus/coresight/devices/"
struct cs_etm_recording {
struct auxtrace_record itr;
struct perf_pmu *cs_etm_pmu;
struct perf_evlist *evlist;
bool snapshot_mode;
size_t snapshot_size;
};
static bool cs_etm_is_etmv4(struct auxtrace_record *itr, int cpu);
static int cs_etm_parse_snapshot_options(struct auxtrace_record *itr,
struct record_opts *opts,
const char *str)
{
struct cs_etm_recording *ptr =
container_of(itr, struct cs_etm_recording, itr);
unsigned long long snapshot_size = 0;
char *endptr;
if (str) {
snapshot_size = strtoull(str, &endptr, 0);
if (*endptr || snapshot_size > SIZE_MAX)
return -1;
}
opts->auxtrace_snapshot_mode = true;
opts->auxtrace_snapshot_size = snapshot_size;
ptr->snapshot_size = snapshot_size;
return 0;
}
static int cs_etm_recording_options(struct auxtrace_record *itr,
struct perf_evlist *evlist,
struct record_opts *opts)
{
struct cs_etm_recording *ptr =
container_of(itr, struct cs_etm_recording, itr);
struct perf_pmu *cs_etm_pmu = ptr->cs_etm_pmu;
struct perf_evsel *evsel, *cs_etm_evsel = NULL;
const struct cpu_map *cpus = evlist->cpus;
bool privileged = (geteuid() == 0 || perf_event_paranoid() < 0);
ptr->evlist = evlist;
ptr->snapshot_mode = opts->auxtrace_snapshot_mode;
evlist__for_each_entry(evlist, evsel) {
if (evsel->attr.type == cs_etm_pmu->type) {
if (cs_etm_evsel) {
pr_err("There may be only one %s event\n",
CORESIGHT_ETM_PMU_NAME);
return -EINVAL;
}
evsel->attr.freq = 0;
evsel->attr.sample_period = 1;
cs_etm_evsel = evsel;
opts->full_auxtrace = true;
}
}
/* no need to continue if at least one event of interest was found */
if (!cs_etm_evsel)
return 0;
if (opts->use_clockid) {
pr_err("Cannot use clockid (-k option) with %s\n",
CORESIGHT_ETM_PMU_NAME);
return -EINVAL;
}
/* we are in snapshot mode */
if (opts->auxtrace_snapshot_mode) {
/*
* No size were given to '-S' or '-m,', so go with
* the default
*/
if (!opts->auxtrace_snapshot_size &&
!opts->auxtrace_mmap_pages) {
if (privileged) {
opts->auxtrace_mmap_pages = MiB(4) / page_size;
} else {
opts->auxtrace_mmap_pages =
KiB(128) / page_size;
if (opts->mmap_pages == UINT_MAX)
opts->mmap_pages = KiB(256) / page_size;
}
} else if (!opts->auxtrace_mmap_pages && !privileged &&
opts->mmap_pages == UINT_MAX) {
opts->mmap_pages = KiB(256) / page_size;
}
/*
* '-m,xyz' was specified but no snapshot size, so make the
* snapshot size as big as the auxtrace mmap area.
*/
if (!opts->auxtrace_snapshot_size) {
opts->auxtrace_snapshot_size =
opts->auxtrace_mmap_pages * (size_t)page_size;
}
/*
* -Sxyz was specified but no auxtrace mmap area, so make the
* auxtrace mmap area big enough to fit the requested snapshot
* size.
*/
if (!opts->auxtrace_mmap_pages) {
size_t sz = opts->auxtrace_snapshot_size;
sz = round_up(sz, page_size) / page_size;
opts->auxtrace_mmap_pages = roundup_pow_of_two(sz);
}
/* Snapshost size can't be bigger than the auxtrace area */
if (opts->auxtrace_snapshot_size >
opts->auxtrace_mmap_pages * (size_t)page_size) {
pr_err("Snapshot size %zu must not be greater than AUX area tracing mmap size %zu\n",
opts->auxtrace_snapshot_size,
opts->auxtrace_mmap_pages * (size_t)page_size);
return -EINVAL;
}
/* Something went wrong somewhere - this shouldn't happen */
if (!opts->auxtrace_snapshot_size ||
!opts->auxtrace_mmap_pages) {
pr_err("Failed to calculate default snapshot size and/or AUX area tracing mmap pages\n");
return -EINVAL;
}
}
/* We are in full trace mode but '-m,xyz' wasn't specified */
if (opts->full_auxtrace && !opts->auxtrace_mmap_pages) {
if (privileged) {
opts->auxtrace_mmap_pages = MiB(4) / page_size;
} else {
opts->auxtrace_mmap_pages = KiB(128) / page_size;
if (opts->mmap_pages == UINT_MAX)
opts->mmap_pages = KiB(256) / page_size;
}
}
/* Validate auxtrace_mmap_pages provided by user */
if (opts->auxtrace_mmap_pages) {
unsigned int max_page = (KiB(128) / page_size);
size_t sz = opts->auxtrace_mmap_pages * (size_t)page_size;
if (!privileged &&
opts->auxtrace_mmap_pages > max_page) {
opts->auxtrace_mmap_pages = max_page;
pr_err("auxtrace too big, truncating to %d\n",
max_page);
}
if (!is_power_of_2(sz)) {
pr_err("Invalid mmap size for %s: must be a power of 2\n",
CORESIGHT_ETM_PMU_NAME);
return -EINVAL;
}
}
if (opts->auxtrace_snapshot_mode)
pr_debug2("%s snapshot size: %zu\n", CORESIGHT_ETM_PMU_NAME,
opts->auxtrace_snapshot_size);
if (cs_etm_evsel) {
/*
* To obtain the auxtrace buffer file descriptor, the auxtrace
* event must come first.
*/
perf_evlist__to_front(evlist, cs_etm_evsel);
/*
* In the case of per-cpu mmaps, we need the CPU on the
* AUX event.
*/
if (!cpu_map__empty(cpus))
perf_evsel__set_sample_bit(cs_etm_evsel, CPU);
}
/* Add dummy event to keep tracking */
if (opts->full_auxtrace) {
struct perf_evsel *tracking_evsel;
int err;
err = parse_events(evlist, "dummy:u", NULL);
if (err)
return err;
tracking_evsel = perf_evlist__last(evlist);
perf_evlist__set_tracking_event(evlist, tracking_evsel);
tracking_evsel->attr.freq = 0;
tracking_evsel->attr.sample_period = 1;
/* In per-cpu case, always need the time of mmap events etc */
if (!cpu_map__empty(cpus))
perf_evsel__set_sample_bit(tracking_evsel, TIME);
}
return 0;
}
static u64 cs_etm_get_config(struct auxtrace_record *itr)
{
u64 config = 0;
struct cs_etm_recording *ptr =
container_of(itr, struct cs_etm_recording, itr);
struct perf_pmu *cs_etm_pmu = ptr->cs_etm_pmu;
struct perf_evlist *evlist = ptr->evlist;
struct perf_evsel *evsel;
evlist__for_each_entry(evlist, evsel) {
if (evsel->attr.type == cs_etm_pmu->type) {
/*
* Variable perf_event_attr::config is assigned to
* ETMv3/PTM. The bit fields have been made to match
* the ETMv3.5 ETRMCR register specification. See the
* PMU_FORMAT_ATTR() declarations in
* drivers/hwtracing/coresight/coresight-perf.c for
* details.
*/
config = evsel->attr.config;
break;
}
}
return config;
}
static size_t
cs_etm_info_priv_size(struct auxtrace_record *itr __maybe_unused,
struct perf_evlist *evlist __maybe_unused)
{
int i;
int etmv3 = 0, etmv4 = 0;
const struct cpu_map *cpus = evlist->cpus;
/* cpu map is not empty, we have specific CPUs to work with */
if (!cpu_map__empty(cpus)) {
for (i = 0; i < cpu_map__nr(cpus); i++) {
if (cs_etm_is_etmv4(itr, cpus->map[i]))
etmv4++;
else
etmv3++;
}
} else {
/* get configuration for all CPUs in the system */
for (i = 0; i < cpu__max_cpu(); i++) {
if (cs_etm_is_etmv4(itr, i))
etmv4++;
else
etmv3++;
}
}
return (CS_ETM_HEADER_SIZE +
(etmv4 * CS_ETMV4_PRIV_SIZE) +
(etmv3 * CS_ETMV3_PRIV_SIZE));
}
static const char *metadata_etmv3_ro[CS_ETM_PRIV_MAX] = {
[CS_ETM_ETMCCER] = "mgmt/etmccer",
[CS_ETM_ETMIDR] = "mgmt/etmidr",
};
static const char *metadata_etmv4_ro[CS_ETMV4_PRIV_MAX] = {
[CS_ETMV4_TRCIDR0] = "trcidr/trcidr0",
[CS_ETMV4_TRCIDR1] = "trcidr/trcidr1",
[CS_ETMV4_TRCIDR2] = "trcidr/trcidr2",
[CS_ETMV4_TRCIDR8] = "trcidr/trcidr8",
[CS_ETMV4_TRCAUTHSTATUS] = "mgmt/trcauthstatus",
};
static bool cs_etm_is_etmv4(struct auxtrace_record *itr, int cpu)
{
bool ret = false;
char path[PATH_MAX];
int scan;
unsigned int val;
struct cs_etm_recording *ptr =
container_of(itr, struct cs_etm_recording, itr);
struct perf_pmu *cs_etm_pmu = ptr->cs_etm_pmu;
/* Take any of the RO files for ETMv4 and see if it present */
snprintf(path, PATH_MAX, "cpu%d/%s",
cpu, metadata_etmv4_ro[CS_ETMV4_TRCIDR0]);
scan = perf_pmu__scan_file(cs_etm_pmu, path, "%x", &val);
/* The file was read successfully, we have a winner */
if (scan == 1)
ret = true;
return ret;
}
static int cs_etm_get_ro(struct perf_pmu *pmu, int cpu, const char *path)
{
char pmu_path[PATH_MAX];
int scan;
unsigned int val = 0;
/* Get RO metadata from sysfs */
snprintf(pmu_path, PATH_MAX, "cpu%d/%s", cpu, path);
scan = perf_pmu__scan_file(pmu, pmu_path, "%x", &val);
if (scan != 1)
pr_err("%s: error reading: %s\n", __func__, pmu_path);
return val;
}
static void cs_etm_get_metadata(int cpu, u32 *offset,
struct auxtrace_record *itr,
struct auxtrace_info_event *info)
{
u32 increment;
u64 magic;
struct cs_etm_recording *ptr =
container_of(itr, struct cs_etm_recording, itr);
struct perf_pmu *cs_etm_pmu = ptr->cs_etm_pmu;
/* first see what kind of tracer this cpu is affined to */
if (cs_etm_is_etmv4(itr, cpu)) {
magic = __perf_cs_etmv4_magic;
/* Get trace configuration register */
info->priv[*offset + CS_ETMV4_TRCCONFIGR] =
cs_etm_get_config(itr);
/* Get traceID from the framework */
info->priv[*offset + CS_ETMV4_TRCTRACEIDR] =
coresight_get_trace_id(cpu);
/* Get read-only information from sysFS */
info->priv[*offset + CS_ETMV4_TRCIDR0] =
cs_etm_get_ro(cs_etm_pmu, cpu,
metadata_etmv4_ro[CS_ETMV4_TRCIDR0]);
info->priv[*offset + CS_ETMV4_TRCIDR1] =
cs_etm_get_ro(cs_etm_pmu, cpu,
metadata_etmv4_ro[CS_ETMV4_TRCIDR1]);
info->priv[*offset + CS_ETMV4_TRCIDR2] =
cs_etm_get_ro(cs_etm_pmu, cpu,
metadata_etmv4_ro[CS_ETMV4_TRCIDR2]);
info->priv[*offset + CS_ETMV4_TRCIDR8] =
cs_etm_get_ro(cs_etm_pmu, cpu,
metadata_etmv4_ro[CS_ETMV4_TRCIDR8]);
info->priv[*offset + CS_ETMV4_TRCAUTHSTATUS] =
cs_etm_get_ro(cs_etm_pmu, cpu,
metadata_etmv4_ro
[CS_ETMV4_TRCAUTHSTATUS]);
/* How much space was used */
increment = CS_ETMV4_PRIV_MAX;
} else {
magic = __perf_cs_etmv3_magic;
/* Get configuration register */
info->priv[*offset + CS_ETM_ETMCR] = cs_etm_get_config(itr);
/* Get traceID from the framework */
info->priv[*offset + CS_ETM_ETMTRACEIDR] =
coresight_get_trace_id(cpu);
/* Get read-only information from sysFS */
info->priv[*offset + CS_ETM_ETMCCER] =
cs_etm_get_ro(cs_etm_pmu, cpu,
metadata_etmv3_ro[CS_ETM_ETMCCER]);
info->priv[*offset + CS_ETM_ETMIDR] =
cs_etm_get_ro(cs_etm_pmu, cpu,
metadata_etmv3_ro[CS_ETM_ETMIDR]);
/* How much space was used */
increment = CS_ETM_PRIV_MAX;
}
/* Build generic header portion */
info->priv[*offset + CS_ETM_MAGIC] = magic;
info->priv[*offset + CS_ETM_CPU] = cpu;
/* Where the next CPU entry should start from */
*offset += increment;
}
static int cs_etm_info_fill(struct auxtrace_record *itr,
struct perf_session *session,
struct auxtrace_info_event *info,
size_t priv_size)
{
int i;
u32 offset;
u64 nr_cpu, type;
const struct cpu_map *cpus = session->evlist->cpus;
struct cs_etm_recording *ptr =
container_of(itr, struct cs_etm_recording, itr);
struct perf_pmu *cs_etm_pmu = ptr->cs_etm_pmu;
if (priv_size != cs_etm_info_priv_size(itr, session->evlist))
return -EINVAL;
if (!session->evlist->nr_mmaps)
return -EINVAL;
/* If the cpu_map is empty all CPUs are involved */
nr_cpu = cpu_map__empty(cpus) ? cpu__max_cpu() : cpu_map__nr(cpus);
/* Get PMU type as dynamically assigned by the core */
type = cs_etm_pmu->type;
/* First fill out the session header */
info->type = PERF_AUXTRACE_CS_ETM;
info->priv[CS_HEADER_VERSION_0] = 0;
info->priv[CS_PMU_TYPE_CPUS] = type << 32;
info->priv[CS_PMU_TYPE_CPUS] |= nr_cpu;
info->priv[CS_ETM_SNAPSHOT] = ptr->snapshot_mode;
offset = CS_ETM_SNAPSHOT + 1;
/* cpu map is not empty, we have specific CPUs to work with */
if (!cpu_map__empty(cpus)) {
for (i = 0; i < cpu_map__nr(cpus) && offset < priv_size; i++)
cs_etm_get_metadata(cpus->map[i], &offset, itr, info);
} else {
/* get configuration for all CPUs in the system */
for (i = 0; i < cpu__max_cpu(); i++)
cs_etm_get_metadata(i, &offset, itr, info);
}
return 0;
}
static int cs_etm_find_snapshot(struct auxtrace_record *itr __maybe_unused,
int idx, struct auxtrace_mmap *mm,
unsigned char *data __maybe_unused,
u64 *head, u64 *old)
{
pr_debug3("%s: mmap index %d old head %zu new head %zu size %zu\n",
__func__, idx, (size_t)*old, (size_t)*head, mm->len);
*old = *head;
*head += mm->len;
return 0;
}
static int cs_etm_snapshot_start(struct auxtrace_record *itr)
{
struct cs_etm_recording *ptr =
container_of(itr, struct cs_etm_recording, itr);
struct perf_evsel *evsel;
evlist__for_each_entry(ptr->evlist, evsel) {
if (evsel->attr.type == ptr->cs_etm_pmu->type)
return perf_evsel__disable(evsel);
}
return -EINVAL;
}
static int cs_etm_snapshot_finish(struct auxtrace_record *itr)
{
struct cs_etm_recording *ptr =
container_of(itr, struct cs_etm_recording, itr);
struct perf_evsel *evsel;
evlist__for_each_entry(ptr->evlist, evsel) {
if (evsel->attr.type == ptr->cs_etm_pmu->type)
return perf_evsel__enable(evsel);
}
return -EINVAL;
}
static u64 cs_etm_reference(struct auxtrace_record *itr __maybe_unused)
{
return (((u64) rand() << 0) & 0x00000000FFFFFFFFull) |
(((u64) rand() << 32) & 0xFFFFFFFF00000000ull);
}
static void cs_etm_recording_free(struct auxtrace_record *itr)
{
struct cs_etm_recording *ptr =
container_of(itr, struct cs_etm_recording, itr);
free(ptr);
}
static int cs_etm_read_finish(struct auxtrace_record *itr, int idx)
{
struct cs_etm_recording *ptr =
container_of(itr, struct cs_etm_recording, itr);
struct perf_evsel *evsel;
evlist__for_each_entry(ptr->evlist, evsel) {
if (evsel->attr.type == ptr->cs_etm_pmu->type)
return perf_evlist__enable_event_idx(ptr->evlist,
evsel, idx);
}
return -EINVAL;
}
struct auxtrace_record *cs_etm_record_init(int *err)
{
struct perf_pmu *cs_etm_pmu;
struct cs_etm_recording *ptr;
cs_etm_pmu = perf_pmu__find(CORESIGHT_ETM_PMU_NAME);
if (!cs_etm_pmu) {
*err = -EINVAL;
goto out;
}
ptr = zalloc(sizeof(struct cs_etm_recording));
if (!ptr) {
*err = -ENOMEM;
goto out;
}
ptr->cs_etm_pmu = cs_etm_pmu;
ptr->itr.parse_snapshot_options = cs_etm_parse_snapshot_options;
ptr->itr.recording_options = cs_etm_recording_options;
ptr->itr.info_priv_size = cs_etm_info_priv_size;
ptr->itr.info_fill = cs_etm_info_fill;
ptr->itr.find_snapshot = cs_etm_find_snapshot;
ptr->itr.snapshot_start = cs_etm_snapshot_start;
ptr->itr.snapshot_finish = cs_etm_snapshot_finish;
ptr->itr.reference = cs_etm_reference;
ptr->itr.free = cs_etm_recording_free;
ptr->itr.read_finish = cs_etm_read_finish;
*err = 0;
return &ptr->itr;
out:
return NULL;
}
static FILE *cs_device__open_file(const char *name)
{
struct stat st;
char path[PATH_MAX];
const char *sysfs;
sysfs = sysfs__mountpoint();
if (!sysfs)
return NULL;
snprintf(path, PATH_MAX,
"%s" CS_BUS_DEVICE_PATH "%s", sysfs, name);
printf("path: %s\n", path);
if (stat(path, &st) < 0)
return NULL;
return fopen(path, "w");
}
static __attribute__((format(printf, 2, 3)))
int cs_device__print_file(const char *name, const char *fmt, ...)
{
va_list args;
FILE *file;
int ret = -EINVAL;
va_start(args, fmt);
file = cs_device__open_file(name);
if (file) {
ret = vfprintf(file, fmt, args);
fclose(file);
}
va_end(args);
return ret;
}
int cs_etm_set_drv_config(struct perf_evsel_config_term *term)
{
int ret;
char enable_sink[ENABLE_SINK_MAX];
snprintf(enable_sink, ENABLE_SINK_MAX, "%s/%s",
term->val.drv_cfg, "enable_sink");
ret = cs_device__print_file(enable_sink, "%d", 1);
if (ret < 0)
return ret;
return 0;
}

View File

@ -0,0 +1,26 @@
/*
* Copyright(C) 2015 Linaro Limited. All rights reserved.
* Author: Mathieu Poirier <mathieu.poirier@linaro.org>
*
* This program is free software; you can redistribute it and/or modify it
* under the terms of the GNU General Public License version 2 as published by
* the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful, but WITHOUT
* ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
* FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
* more details.
*
* You should have received a copy of the GNU General Public License along with
* this program. If not, see <http://www.gnu.org/licenses/>.
*/
#ifndef INCLUDE__PERF_CS_ETM_H__
#define INCLUDE__PERF_CS_ETM_H__
#include "../../util/evsel.h"
struct auxtrace_record *cs_etm_record_init(int *err);
int cs_etm_set_drv_config(struct perf_evsel_config_term *term);
#endif

View File

@ -0,0 +1,36 @@
/*
* Copyright(C) 2015 Linaro Limited. All rights reserved.
* Author: Mathieu Poirier <mathieu.poirier@linaro.org>
*
* This program is free software; you can redistribute it and/or modify it
* under the terms of the GNU General Public License version 2 as published by
* the Free Software Foundation.
*
* This program is distributed in the hope that it will be useful, but WITHOUT
* ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
* FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
* more details.
*
* You should have received a copy of the GNU General Public License along with
* this program. If not, see <http://www.gnu.org/licenses/>.
*/
#include <string.h>
#include <linux/coresight-pmu.h>
#include <linux/perf_event.h>
#include "cs-etm.h"
#include "../../util/pmu.h"
struct perf_event_attr
*perf_pmu__get_default_config(struct perf_pmu *pmu __maybe_unused)
{
#ifdef HAVE_AUXTRACE_SUPPORT
if (!strcmp(pmu->name, CORESIGHT_ETM_PMU_NAME)) {
/* add ETM default config here */
pmu->selectable = true;
pmu->set_drv_config = cs_etm_set_drv_config;
}
#endif
return NULL;
}

View File

@ -0,0 +1,13 @@
#ifdef DEFINE_DWARF_REGSTR_TABLE
/* This is included in perf/util/dwarf-regs.c */
static const char * const aarch64_regstr_tbl[] = {
"%r0", "%r1", "%r2", "%r3", "%r4",
"%r5", "%r6", "%r7", "%r8", "%r9",
"%r10", "%r11", "%r12", "%r13", "%r14",
"%r15", "%r16", "%r17", "%r18", "%r19",
"%r20", "%r21", "%r22", "%r23", "%r24",
"%r25", "%r26", "%r27", "%r28", "%r29",
"%lr", "%sp",
};
#endif

View File

@ -1,2 +1,6 @@
libperf-$(CONFIG_DWARF) += dwarf-regs.o
libperf-$(CONFIG_LOCAL_LIBUNWIND) += unwind-libunwind.o
libperf-$(CONFIG_AUXTRACE) += ../../arm/util/pmu.o \
../../arm/util/auxtrace.o \
../../arm/util/cs-etm.o

View File

@ -1 +1,2 @@
libperf-y += util/
libperf-y += tests/

View File

@ -0,0 +1,13 @@
#ifndef ARCH_TESTS_H
#define ARCH_TESTS_H
#ifdef HAVE_DWARF_UNWIND_SUPPORT
struct thread;
struct perf_sample;
int test__arch_unwind_sample(struct perf_sample *sample,
struct thread *thread);
#endif
extern struct test arch_tests[];
#endif

View File

@ -0,0 +1,27 @@
#ifdef DEFINE_DWARF_REGSTR_TABLE
/* This is included in perf/util/dwarf-regs.c */
/*
* Reference:
* http://refspecs.linuxfoundation.org/ELF/ppc64/PPC-elf64abi-1.9.html
* http://refspecs.linux-foundation.org/elf/elfspec_ppc.pdf
*/
#define REG_DWARFNUM_NAME(reg, idx) [idx] = "%" #reg
static const char * const powerpc_regstr_tbl[] = {
"%gpr0", "%gpr1", "%gpr2", "%gpr3", "%gpr4",
"%gpr5", "%gpr6", "%gpr7", "%gpr8", "%gpr9",
"%gpr10", "%gpr11", "%gpr12", "%gpr13", "%gpr14",
"%gpr15", "%gpr16", "%gpr17", "%gpr18", "%gpr19",
"%gpr20", "%gpr21", "%gpr22", "%gpr23", "%gpr24",
"%gpr25", "%gpr26", "%gpr27", "%gpr28", "%gpr29",
"%gpr30", "%gpr31",
REG_DWARFNUM_NAME(msr, 66),
REG_DWARFNUM_NAME(ctr, 109),
REG_DWARFNUM_NAME(link, 108),
REG_DWARFNUM_NAME(xer, 101),
REG_DWARFNUM_NAME(dar, 119),
REG_DWARFNUM_NAME(dsisr, 118),
};
#endif

View File

@ -5,6 +5,8 @@
#include <linux/types.h>
#include <asm/perf_regs.h>
void perf_regs_load(u64 *regs);
#define PERF_REGS_MASK ((1ULL << PERF_REG_POWERPC_MAX) - 1)
#define PERF_REGS_MAX PERF_REG_POWERPC_MAX
#ifdef __powerpc64__

View File

@ -0,0 +1,4 @@
libperf-$(CONFIG_DWARF_UNWIND) += regs_load.o
libperf-$(CONFIG_DWARF_UNWIND) += dwarf-unwind.o
libperf-y += arch-tests.o

View File

@ -0,0 +1,15 @@
#include <string.h>
#include "tests/tests.h"
#include "arch-tests.h"
struct test arch_tests[] = {
#ifdef HAVE_DWARF_UNWIND_SUPPORT
{
.desc = "Test dwarf unwind",
.func = test__dwarf_unwind,
},
#endif
{
.func = NULL,
},
};

View File

@ -0,0 +1,62 @@
#include <string.h>
#include "perf_regs.h"
#include "thread.h"
#include "map.h"
#include "event.h"
#include "debug.h"
#include "tests/tests.h"
#include "arch-tests.h"
#define STACK_SIZE 8192
static int sample_ustack(struct perf_sample *sample,
struct thread *thread, u64 *regs)
{
struct stack_dump *stack = &sample->user_stack;
struct map *map;
unsigned long sp;
u64 stack_size, *buf;
buf = malloc(STACK_SIZE);
if (!buf) {
pr_debug("failed to allocate sample uregs data\n");
return -1;
}
sp = (unsigned long) regs[PERF_REG_POWERPC_R1];
map = map_groups__find(thread->mg, MAP__VARIABLE, (u64) sp);
if (!map) {
pr_debug("failed to get stack map\n");
free(buf);
return -1;
}
stack_size = map->end - sp;
stack_size = stack_size > STACK_SIZE ? STACK_SIZE : stack_size;
memcpy(buf, (void *) sp, stack_size);
stack->data = (char *) buf;
stack->size = stack_size;
return 0;
}
int test__arch_unwind_sample(struct perf_sample *sample,
struct thread *thread)
{
struct regs_dump *regs = &sample->user_regs;
u64 *buf;
buf = calloc(1, sizeof(u64) * PERF_REGS_MAX);
if (!buf) {
pr_debug("failed to allocate sample uregs data\n");
return -1;
}
perf_regs_load(buf);
regs->abi = PERF_SAMPLE_REGS_ABI;
regs->regs = buf;
regs->mask = PERF_REGS_MASK;
return sample_ustack(sample, thread, buf);
}

View File

@ -0,0 +1,94 @@
#include <linux/linkage.h>
/* Offset is based on macros from arch/powerpc/include/uapi/asm/ptrace.h. */
#define R0 0
#define R1 1 * 8
#define R2 2 * 8
#define R3 3 * 8
#define R4 4 * 8
#define R5 5 * 8
#define R6 6 * 8
#define R7 7 * 8
#define R8 8 * 8
#define R9 9 * 8
#define R10 10 * 8
#define R11 11 * 8
#define R12 12 * 8
#define R13 13 * 8
#define R14 14 * 8
#define R15 15 * 8
#define R16 16 * 8
#define R17 17 * 8
#define R18 18 * 8
#define R19 19 * 8
#define R20 20 * 8
#define R21 21 * 8
#define R22 22 * 8
#define R23 23 * 8
#define R24 24 * 8
#define R25 25 * 8
#define R26 26 * 8
#define R27 27 * 8
#define R28 28 * 8
#define R29 29 * 8
#define R30 30 * 8
#define R31 31 * 8
#define NIP 32 * 8
#define CTR 35 * 8
#define LINK 36 * 8
#define XER 37 * 8
.globl perf_regs_load
perf_regs_load:
std 0, R0(3)
std 1, R1(3)
std 2, R2(3)
std 3, R3(3)
std 4, R4(3)
std 5, R5(3)
std 6, R6(3)
std 7, R7(3)
std 8, R8(3)
std 9, R9(3)
std 10, R10(3)
std 11, R11(3)
std 12, R12(3)
std 13, R13(3)
std 14, R14(3)
std 15, R15(3)
std 16, R16(3)
std 17, R17(3)
std 18, R18(3)
std 19, R19(3)
std 20, R20(3)
std 21, R21(3)
std 22, R22(3)
std 23, R23(3)
std 24, R24(3)
std 25, R25(3)
std 26, R26(3)
std 27, R27(3)
std 28, R28(3)
std 29, R29(3)
std 30, R30(3)
std 31, R31(3)
/* store NIP */
mflr 4
std 4, NIP(3)
/* Store LR */
std 4, LINK(3)
/* Store XER */
mfxer 4
std 4, XER(3)
/* Store CTR */
mfctr 4
std 4, CTR(3)
/* Restore original value of r4 */
ld 4, R4(3)
blr

View File

@ -108,7 +108,7 @@ void arch__post_process_probe_trace_events(struct perf_probe_event *pev,
int i = 0;
map = get_target_map(pev->target, pev->uprobes);
if (!map || map__load(map, NULL) < 0)
if (!map || map__load(map) < 0)
return;
for (i = 0; i < ntevs; i++) {

View File

@ -0,0 +1,8 @@
#ifdef DEFINE_DWARF_REGSTR_TABLE
/* This is included in perf/util/dwarf-regs.c */
static const char * const s390_regstr_tbl[] = {
"%r0", "%r1", "%r2", "%r3", "%r4", "%r5", "%r6", "%r7",
"%r8", "%r9", "%r10", "%r11", "%r12", "%r13", "%r14", "%r15",
};
#endif

View File

@ -0,0 +1,25 @@
#ifdef DEFINE_DWARF_REGSTR_TABLE
/* This is included in perf/util/dwarf-regs.c */
const char * const sh_regstr_tbl[] = {
"r0",
"r1",
"r2",
"r3",
"r4",
"r5",
"r6",
"r7",
"r8",
"r9",
"r10",
"r11",
"r12",
"r13",
"r14",
"r15",
"pc",
"pr",
};
#endif

View File

@ -0,0 +1,18 @@
#ifdef DEFINE_DWARF_REGSTR_TABLE
/* This is included in perf/util/dwarf-regs.c */
static const char * const sparc_regstr_tbl[] = {
"%g0", "%g1", "%g2", "%g3", "%g4", "%g5", "%g6", "%g7",
"%o0", "%o1", "%o2", "%o3", "%o4", "%o5", "%sp", "%o7",
"%l0", "%l1", "%l2", "%l3", "%l4", "%l5", "%l6", "%l7",
"%i0", "%i1", "%i2", "%i3", "%i4", "%i5", "%fp", "%i7",
"%f0", "%f1", "%f2", "%f3", "%f4", "%f5", "%f6", "%f7",
"%f8", "%f9", "%f10", "%f11", "%f12", "%f13", "%f14", "%f15",
"%f16", "%f17", "%f18", "%f19", "%f20", "%f21", "%f22", "%f23",
"%f24", "%f25", "%f26", "%f27", "%f28", "%f29", "%f30", "%f31",
"%f32", "%f33", "%f34", "%f35", "%f36", "%f37", "%f38", "%f39",
"%f40", "%f41", "%f42", "%f43", "%f44", "%f45", "%f46", "%f47",
"%f48", "%f49", "%f50", "%f51", "%f52", "%f53", "%f54", "%f55",
"%f56", "%f57", "%f58", "%f59", "%f60", "%f61", "%f62", "%f63",
};
#endif

View File

@ -0,0 +1,14 @@
#ifdef DEFINE_DWARF_REGSTR_TABLE
/* This is included in perf/util/dwarf-regs.c */
static const char * const x86_32_regstr_tbl[] = {
"%ax", "%cx", "%dx", "%bx", "$stack",/* Stack address instead of %sp */
"%bp", "%si", "%di",
};
static const char * const x86_64_regstr_tbl[] = {
"%ax", "%dx", "%cx", "%bx", "%si", "%di",
"%bp", "%sp", "%r8", "%r9", "%r10", "%r11",
"%r12", "%r13", "%r14", "%r15",
};
#endif

View File

@ -62,6 +62,7 @@ struct intel_pt_recording {
size_t snapshot_ref_buf_size;
int snapshot_ref_cnt;
struct intel_pt_snapshot_ref *snapshot_refs;
size_t priv_size;
};
static int intel_pt_parse_terms_with_default(struct list_head *formats,
@ -273,11 +274,37 @@ intel_pt_pmu_default_config(struct perf_pmu *intel_pt_pmu)
return attr;
}
static size_t
intel_pt_info_priv_size(struct auxtrace_record *itr __maybe_unused,
struct perf_evlist *evlist __maybe_unused)
static const char *intel_pt_find_filter(struct perf_evlist *evlist,
struct perf_pmu *intel_pt_pmu)
{
return INTEL_PT_AUXTRACE_PRIV_SIZE;
struct perf_evsel *evsel;
evlist__for_each_entry(evlist, evsel) {
if (evsel->attr.type == intel_pt_pmu->type)
return evsel->filter;
}
return NULL;
}
static size_t intel_pt_filter_bytes(const char *filter)
{
size_t len = filter ? strlen(filter) : 0;
return len ? roundup(len + 1, 8) : 0;
}
static size_t
intel_pt_info_priv_size(struct auxtrace_record *itr, struct perf_evlist *evlist)
{
struct intel_pt_recording *ptr =
container_of(itr, struct intel_pt_recording, itr);
const char *filter = intel_pt_find_filter(evlist, ptr->intel_pt_pmu);
ptr->priv_size = (INTEL_PT_AUXTRACE_PRIV_MAX * sizeof(u64)) +
intel_pt_filter_bytes(filter);
return ptr->priv_size;
}
static void intel_pt_tsc_ctc_ratio(u32 *n, u32 *d)
@ -302,9 +329,13 @@ static int intel_pt_info_fill(struct auxtrace_record *itr,
bool cap_user_time_zero = false, per_cpu_mmaps;
u64 tsc_bit, mtc_bit, mtc_freq_bits, cyc_bit, noretcomp_bit;
u32 tsc_ctc_ratio_n, tsc_ctc_ratio_d;
unsigned long max_non_turbo_ratio;
size_t filter_str_len;
const char *filter;
u64 *info;
int err;
if (priv_size != INTEL_PT_AUXTRACE_PRIV_SIZE)
if (priv_size != ptr->priv_size)
return -EINVAL;
intel_pt_parse_terms(&intel_pt_pmu->format, "tsc", &tsc_bit);
@ -317,6 +348,13 @@ static int intel_pt_info_fill(struct auxtrace_record *itr,
intel_pt_tsc_ctc_ratio(&tsc_ctc_ratio_n, &tsc_ctc_ratio_d);
if (perf_pmu__scan_file(intel_pt_pmu, "max_nonturbo_ratio",
"%lu", &max_non_turbo_ratio) != 1)
max_non_turbo_ratio = 0;
filter = intel_pt_find_filter(session->evlist, ptr->intel_pt_pmu);
filter_str_len = filter ? strlen(filter) : 0;
if (!session->evlist->nr_mmaps)
return -EINVAL;
@ -351,6 +389,17 @@ static int intel_pt_info_fill(struct auxtrace_record *itr,
auxtrace_info->priv[INTEL_PT_TSC_CTC_N] = tsc_ctc_ratio_n;
auxtrace_info->priv[INTEL_PT_TSC_CTC_D] = tsc_ctc_ratio_d;
auxtrace_info->priv[INTEL_PT_CYC_BIT] = cyc_bit;
auxtrace_info->priv[INTEL_PT_MAX_NONTURBO_RATIO] = max_non_turbo_ratio;
auxtrace_info->priv[INTEL_PT_FILTER_STR_LEN] = filter_str_len;
info = &auxtrace_info->priv[INTEL_PT_FILTER_STR_LEN] + 1;
if (filter_str_len) {
size_t len = intel_pt_filter_bytes(filter);
strncpy((char *)info, filter, len);
info += len >> 3;
}
return 0;
}

View File

@ -0,0 +1,8 @@
#ifdef DEFINE_DWARF_REGSTR_TABLE
/* This is included in perf/util/dwarf-regs.c */
static const char * const xtensa_regstr_tbl[] = {
"a0", "a1", "a2", "a3", "a4", "a5", "a6", "a7",
"a8", "a9", "a10", "a11", "a12", "a13", "a14", "a15",
};
#endif

View File

@ -16,6 +16,7 @@
#include <subcmd/parse-options.h>
#include <linux/compiler.h>
#include <linux/kernel.h>
#include <linux/time64.h>
#include <errno.h>
#include "bench.h"
#include "futex.h"
@ -62,7 +63,7 @@ static void print_summary(void)
printf("Requeued %d of %d threads in %.4f ms (+-%.2f%%)\n",
requeued_avg,
nthreads,
requeuetime_avg/1e3,
requeuetime_avg / USEC_PER_MSEC,
rel_stddev_stats(requeuetime_stddev, requeuetime_avg));
}
@ -184,7 +185,7 @@ int bench_futex_requeue(int argc, const char **argv,
if (!silent) {
printf("[Run %d]: Requeued %d of %d threads in %.4f ms\n",
j + 1, nrequeued, nthreads, runtime.tv_usec/1e3);
j + 1, nrequeued, nthreads, runtime.tv_usec / (double)USEC_PER_MSEC);
}
/* everybody should be blocked on futex2, wake'em up */

View File

@ -15,6 +15,7 @@
#include <subcmd/parse-options.h>
#include <linux/compiler.h>
#include <linux/kernel.h>
#include <linux/time64.h>
#include <errno.h>
#include "bench.h"
#include "futex.h"
@ -156,7 +157,7 @@ static void print_run(struct thread_data *waking_worker, unsigned int run_num)
printf("[Run %d]: Avg per-thread latency (waking %d/%d threads) "
"in %.4f ms (+-%.2f%%)\n", run_num + 1, wakeup_avg,
nblocked_threads, waketime_avg/1e3,
nblocked_threads, waketime_avg / USEC_PER_MSEC,
rel_stddev_stats(waketime_stddev, waketime_avg));
}
@ -172,7 +173,7 @@ static void print_summary(void)
printf("Avg per-thread latency (waking %d/%d threads) in %.4f ms (+-%.2f%%)\n",
wakeup_avg,
nblocked_threads,
waketime_avg/1e3,
waketime_avg / USEC_PER_MSEC,
rel_stddev_stats(waketime_stddev, waketime_avg));
}

View File

@ -16,6 +16,7 @@
#include <subcmd/parse-options.h>
#include <linux/compiler.h>
#include <linux/kernel.h>
#include <linux/time64.h>
#include <errno.h>
#include "bench.h"
#include "futex.h"
@ -81,7 +82,7 @@ static void print_summary(void)
printf("Wokeup %d of %d threads in %.4f ms (+-%.2f%%)\n",
wakeup_avg,
nthreads,
waketime_avg/1e3,
waketime_avg / USEC_PER_MSEC,
rel_stddev_stats(waketime_stddev, waketime_avg));
}
@ -182,7 +183,7 @@ int bench_futex_wake(int argc, const char **argv,
if (!silent) {
printf("[Run %d]: Wokeup %d of %d threads in %.4f ms\n",
j + 1, nwoken, nthreads, runtime.tv_usec/1e3);
j + 1, nwoken, nthreads, runtime.tv_usec / (double)USEC_PER_MSEC);
}
for (i = 0; i < nthreads; i++) {

View File

@ -21,6 +21,7 @@
#include <string.h>
#include <sys/time.h>
#include <errno.h>
#include <linux/time64.h>
#define K 1024
@ -89,7 +90,7 @@ static u64 get_cycles(void)
static double timeval2double(struct timeval *ts)
{
return (double)ts->tv_sec + (double)ts->tv_usec / (double)1000000;
return (double)ts->tv_sec + (double)ts->tv_usec / (double)USEC_PER_SEC;
}
#define print_bps(x) do { \

View File

@ -30,6 +30,7 @@
#include <sys/wait.h>
#include <sys/prctl.h>
#include <sys/types.h>
#include <linux/time64.h>
#include <numa.h>
#include <numaif.h>
@ -1004,7 +1005,7 @@ static void calc_convergence(double runtime_ns_max, double *convergence)
if (strong && process_groups == g->p.nr_proc) {
if (!*convergence) {
*convergence = runtime_ns_max;
tprintf(" (%6.1fs converged)\n", *convergence/1e9);
tprintf(" (%6.1fs converged)\n", *convergence / NSEC_PER_SEC);
if (g->p.measure_convergence) {
g->all_converged = true;
g->stop_work = true;
@ -1012,7 +1013,7 @@ static void calc_convergence(double runtime_ns_max, double *convergence)
}
} else {
if (*convergence) {
tprintf(" (%6.1fs de-converged)", runtime_ns_max/1e9);
tprintf(" (%6.1fs de-converged)", runtime_ns_max / NSEC_PER_SEC);
*convergence = 0;
}
tprintf("\n");
@ -1022,7 +1023,7 @@ static void calc_convergence(double runtime_ns_max, double *convergence)
static void show_summary(double runtime_ns_max, int l, double *convergence)
{
tprintf("\r # %5.1f%% [%.1f mins]",
(double)(l+1)/g->p.nr_loops*100.0, runtime_ns_max/1e9 / 60.0);
(double)(l+1)/g->p.nr_loops*100.0, runtime_ns_max / NSEC_PER_SEC / 60.0);
calc_convergence(runtime_ns_max, convergence);
@ -1179,8 +1180,8 @@ static void *worker_thread(void *__tdata)
if (details >= 3) {
timersub(&stop, &start, &diff);
runtime_ns_max = diff.tv_sec * 1000000000;
runtime_ns_max += diff.tv_usec * 1000;
runtime_ns_max = diff.tv_sec * NSEC_PER_SEC;
runtime_ns_max += diff.tv_usec * NSEC_PER_USEC;
if (details >= 0) {
printf(" #%2d / %2d: %14.2lf nsecs/op [val: %016"PRIx64"]\n",
@ -1192,23 +1193,23 @@ static void *worker_thread(void *__tdata)
continue;
timersub(&stop, &start0, &diff);
runtime_ns_max = diff.tv_sec * 1000000000ULL;
runtime_ns_max += diff.tv_usec * 1000ULL;
runtime_ns_max = diff.tv_sec * NSEC_PER_SEC;
runtime_ns_max += diff.tv_usec * NSEC_PER_USEC;
show_summary(runtime_ns_max, l, &convergence);
}
gettimeofday(&stop, NULL);
timersub(&stop, &start0, &diff);
td->runtime_ns = diff.tv_sec * 1000000000ULL;
td->runtime_ns += diff.tv_usec * 1000ULL;
td->speed_gbs = bytes_done / (td->runtime_ns / 1e9) / 1e9;
td->runtime_ns = diff.tv_sec * NSEC_PER_SEC;
td->runtime_ns += diff.tv_usec * NSEC_PER_USEC;
td->speed_gbs = bytes_done / (td->runtime_ns / NSEC_PER_SEC) / 1e9;
getrusage(RUSAGE_THREAD, &rusage);
td->system_time_ns = rusage.ru_stime.tv_sec * 1000000000ULL;
td->system_time_ns += rusage.ru_stime.tv_usec * 1000ULL;
td->user_time_ns = rusage.ru_utime.tv_sec * 1000000000ULL;
td->user_time_ns += rusage.ru_utime.tv_usec * 1000ULL;
td->system_time_ns = rusage.ru_stime.tv_sec * NSEC_PER_SEC;
td->system_time_ns += rusage.ru_stime.tv_usec * NSEC_PER_USEC;
td->user_time_ns = rusage.ru_utime.tv_sec * NSEC_PER_SEC;
td->user_time_ns += rusage.ru_utime.tv_usec * NSEC_PER_USEC;
free_data(thread_data, g->p.bytes_thread);
@ -1469,7 +1470,7 @@ static int __bench_numa(const char *name)
}
/* Wait for all the threads to start up: */
while (g->nr_tasks_started != g->p.nr_tasks)
usleep(1000);
usleep(USEC_PER_MSEC);
BUG_ON(g->nr_tasks_started != g->p.nr_tasks);
@ -1488,9 +1489,9 @@ static int __bench_numa(const char *name)
timersub(&stop, &start, &diff);
startup_sec = diff.tv_sec * 1000000000.0;
startup_sec += diff.tv_usec * 1000.0;
startup_sec /= 1e9;
startup_sec = diff.tv_sec * NSEC_PER_SEC;
startup_sec += diff.tv_usec * NSEC_PER_USEC;
startup_sec /= NSEC_PER_SEC;
tprintf(" threads initialized in %.6f seconds.\n", startup_sec);
tprintf(" #\n");
@ -1529,14 +1530,14 @@ static int __bench_numa(const char *name)
tprintf("\n ###\n");
tprintf("\n");
runtime_sec_max = diff.tv_sec * 1000000000.0;
runtime_sec_max += diff.tv_usec * 1000.0;
runtime_sec_max /= 1e9;
runtime_sec_max = diff.tv_sec * NSEC_PER_SEC;
runtime_sec_max += diff.tv_usec * NSEC_PER_USEC;
runtime_sec_max /= NSEC_PER_SEC;
runtime_sec_min = runtime_ns_min/1e9;
runtime_sec_min = runtime_ns_min / NSEC_PER_SEC;
bytes = g->bytes_done;
runtime_avg = (double)runtime_ns_sum / g->p.nr_tasks / 1e9;
runtime_avg = (double)runtime_ns_sum / g->p.nr_tasks / NSEC_PER_SEC;
if (g->p.measure_convergence) {
print_res(name, runtime_sec_max,
@ -1562,7 +1563,7 @@ static int __bench_numa(const char *name)
print_res(name, bytes / 1e9,
"GB,", "data-total", "GB data processed, total");
print_res(name, runtime_sec_max * 1e9 / (bytes / g->p.nr_tasks),
print_res(name, runtime_sec_max * NSEC_PER_SEC / (bytes / g->p.nr_tasks),
"nsecs,", "runtime/byte/thread","nsecs/byte/thread runtime");
print_res(name, bytes / g->p.nr_tasks / 1e9 / runtime_sec_max,
@ -1581,9 +1582,9 @@ static int __bench_numa(const char *name)
snprintf(tname, 32, "process%d:thread%d", p, t);
print_res(tname, td->speed_gbs,
"GB/sec", "thread-speed", "GB/sec/thread speed");
print_res(tname, td->system_time_ns / 1e9,
print_res(tname, td->system_time_ns / NSEC_PER_SEC,
"secs", "thread-system-time", "system CPU time/thread");
print_res(tname, td->user_time_ns / 1e9,
print_res(tname, td->user_time_ns / NSEC_PER_SEC,
"secs", "thread-user-time", "user CPU time/thread");
}
}

View File

@ -29,6 +29,7 @@
#include <poll.h>
#include <limits.h>
#include <err.h>
#include <linux/time64.h>
#define DATASIZE 100
@ -312,11 +313,11 @@ int bench_sched_messaging(int argc, const char **argv,
thread_mode ? "threads" : "processes");
printf(" %14s: %lu.%03lu [sec]\n", "Total time",
diff.tv_sec,
(unsigned long) (diff.tv_usec/1000));
(unsigned long) (diff.tv_usec / USEC_PER_MSEC));
break;
case BENCH_FORMAT_SIMPLE:
printf("%lu.%03lu\n", diff.tv_sec,
(unsigned long) (diff.tv_usec/1000));
(unsigned long) (diff.tv_usec / USEC_PER_MSEC));
break;
default:
/* reaching here is something disaster */

View File

@ -25,6 +25,7 @@
#include <sys/time.h>
#include <sys/types.h>
#include <sys/syscall.h>
#include <linux/time64.h>
#include <pthread.h>
@ -153,24 +154,24 @@ int bench_sched_pipe(int argc, const char **argv, const char *prefix __maybe_unu
printf("# Executed %d pipe operations between two %s\n\n",
loops, threaded ? "threads" : "processes");
result_usec = diff.tv_sec * 1000000;
result_usec = diff.tv_sec * USEC_PER_SEC;
result_usec += diff.tv_usec;
printf(" %14s: %lu.%03lu [sec]\n\n", "Total time",
diff.tv_sec,
(unsigned long) (diff.tv_usec/1000));
(unsigned long) (diff.tv_usec / USEC_PER_MSEC));
printf(" %14lf usecs/op\n",
(double)result_usec / (double)loops);
printf(" %14d ops/sec\n",
(int)((double)loops /
((double)result_usec / (double)1000000)));
((double)result_usec / (double)USEC_PER_SEC)));
break;
case BENCH_FORMAT_SIMPLE:
printf("%lu.%03lu\n",
diff.tv_sec,
(unsigned long) (diff.tv_usec / 1000));
(unsigned long) (diff.tv_usec / USEC_PER_MSEC));
break;
default:

View File

@ -30,6 +30,7 @@
#include "util/tool.h"
#include "util/data.h"
#include "arch/common.h"
#include "util/block-range.h"
#include <dlfcn.h>
#include <linux/bitmap.h>
@ -46,6 +47,103 @@ struct perf_annotate {
DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
};
/*
* Given one basic block:
*
* from to branch_i
* * ----> *
* |
* | block
* v
* * ----> *
* from to branch_i+1
*
* where the horizontal are the branches and the vertical is the executed
* block of instructions.
*
* We count, for each 'instruction', the number of blocks that covered it as
* well as count the ratio each branch is taken.
*
* We can do this without knowing the actual instruction stream by keeping
* track of the address ranges. We break down ranges such that there is no
* overlap and iterate from the start until the end.
*
* @acme: once we parse the objdump output _before_ processing the samples,
* we can easily fold the branch.cycles IPC bits in.
*/
static void process_basic_block(struct addr_map_symbol *start,
struct addr_map_symbol *end,
struct branch_flags *flags)
{
struct symbol *sym = start->sym;
struct annotation *notes = sym ? symbol__annotation(sym) : NULL;
struct block_range_iter iter;
struct block_range *entry;
/*
* Sanity; NULL isn't executable and the CPU cannot execute backwards
*/
if (!start->addr || start->addr > end->addr)
return;
iter = block_range__create(start->addr, end->addr);
if (!block_range_iter__valid(&iter))
return;
/*
* First block in range is a branch target.
*/
entry = block_range_iter(&iter);
assert(entry->is_target);
entry->entry++;
do {
entry = block_range_iter(&iter);
entry->coverage++;
entry->sym = sym;
if (notes)
notes->max_coverage = max(notes->max_coverage, entry->coverage);
} while (block_range_iter__next(&iter));
/*
* Last block in rage is a branch.
*/
entry = block_range_iter(&iter);
assert(entry->is_branch);
entry->taken++;
if (flags->predicted)
entry->pred++;
}
static void process_branch_stack(struct branch_stack *bs, struct addr_location *al,
struct perf_sample *sample)
{
struct addr_map_symbol *prev = NULL;
struct branch_info *bi;
int i;
if (!bs || !bs->nr)
return;
bi = sample__resolve_bstack(sample, al);
if (!bi)
return;
for (i = bs->nr - 1; i >= 0; i--) {
/*
* XXX filter against symbol
*/
if (prev)
process_basic_block(prev, &bi[i].from, &bi[i].flags);
prev = &bi[i].to;
}
free(bi);
}
static int perf_evsel__add_sample(struct perf_evsel *evsel,
struct perf_sample *sample,
struct addr_location *al,
@ -72,6 +170,12 @@ static int perf_evsel__add_sample(struct perf_evsel *evsel,
return 0;
}
/*
* XXX filtered samples can still have branch entires pointing into our
* symbol and are missed.
*/
process_branch_stack(sample->branch_stack, al, sample);
sample->period = 1;
sample->weight = 1;
@ -204,8 +308,6 @@ static int __cmd_annotate(struct perf_annotate *ann)
struct perf_evsel *pos;
u64 total_nr_samples;
machines__set_symbol_filter(&session->machines, symbol__annotate_init);
if (ann->cpu_list) {
ret = perf_session__cpu_bitmap(session, ann->cpu_list,
ann->cpu_bitmap);
@ -367,7 +469,10 @@ int cmd_annotate(int argc, const char **argv, const char *prefix __maybe_unused)
if (annotate.session == NULL)
return -1;
symbol_conf.priv_size = sizeof(struct annotation);
ret = symbol__annotation_init();
if (ret < 0)
goto out_delete;
symbol_conf.try_vmlinux_path = true;
ret = symbol__init(&annotate.session->header.env);

View File

@ -1033,7 +1033,9 @@ static int hpp__entry_global(struct perf_hpp_fmt *_fmt, struct perf_hpp *hpp,
}
static int hpp__header(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
struct hists *hists __maybe_unused)
struct hists *hists __maybe_unused,
int line __maybe_unused,
int *span __maybe_unused)
{
struct diff_hpp_fmt *dfmt =
container_of(fmt, struct diff_hpp_fmt, fmt);

View File

@ -429,7 +429,7 @@ static int perf_event__inject_buildid(struct perf_tool *tool,
if (al.map != NULL) {
if (!al.map->dso->hit) {
al.map->dso->hit = 1;
if (map__load(al.map, NULL) >= 0) {
if (map__load(al.map) >= 0) {
dso__inject_build_id(al.map->dso, tool, machine);
/*
* If this fails, too bad, let the other side

View File

@ -330,7 +330,7 @@ static int build_alloc_func_list(void)
}
kernel_map = machine__kernel_map(machine);
if (map__load(kernel_map, NULL) < 0) {
if (map__load(kernel_map) < 0) {
pr_err("cannot load kernel map\n");
return -ENOENT;
}
@ -979,7 +979,7 @@ static void __print_slab_result(struct rb_root *root,
if (is_caller) {
addr = data->call_site;
if (!raw_ip)
sym = machine__find_kernel_function(machine, addr, &map, NULL);
sym = machine__find_kernel_function(machine, addr, &map);
} else
addr = data->ptr;
@ -1043,8 +1043,7 @@ static void __print_page_alloc_result(struct perf_session *session, int n_lines)
char *caller = buf;
data = rb_entry(next, struct page_stat, node);
sym = machine__find_kernel_function(machine, data->callsite,
&map, NULL);
sym = machine__find_kernel_function(machine, data->callsite, &map);
if (sym && sym->name)
caller = sym->name;
else
@ -1086,8 +1085,7 @@ static void __print_page_caller_result(struct perf_session *session, int n_lines
char *caller = buf;
data = rb_entry(next, struct page_stat, node);
sym = machine__find_kernel_function(machine, data->callsite,
&map, NULL);
sym = machine__find_kernel_function(machine, data->callsite, &map);
if (sym && sym->name)
caller = sym->name;
else

View File

@ -24,6 +24,7 @@
#include <sys/timerfd.h>
#endif
#include <linux/time64.h>
#include <termios.h>
#include <semaphore.h>
#include <pthread.h>
@ -362,7 +363,7 @@ static bool handle_end_event(struct perf_kvm_stat *kvm,
if (!skip_event(decode)) {
pr_info("%" PRIu64 " VM %d, vcpu %d: %s event took %" PRIu64 "usec\n",
sample->time, sample->pid, vcpu_record->vcpu_id,
decode, time_diff/1000);
decode, time_diff / NSEC_PER_USEC);
}
}
@ -608,15 +609,15 @@ static void print_result(struct perf_kvm_stat *kvm)
pr_info("%10llu ", (unsigned long long)ecount);
pr_info("%8.2f%% ", (double)ecount / kvm->total_count * 100);
pr_info("%8.2f%% ", (double)etime / kvm->total_time * 100);
pr_info("%9.2fus ", (double)min / 1e3);
pr_info("%9.2fus ", (double)max / 1e3);
pr_info("%9.2fus ( +-%7.2f%% )", (double)etime / ecount/1e3,
pr_info("%9.2fus ", (double)min / NSEC_PER_USEC);
pr_info("%9.2fus ", (double)max / NSEC_PER_USEC);
pr_info("%9.2fus ( +-%7.2f%% )", (double)etime / ecount / NSEC_PER_USEC,
kvm_event_rel_stddev(vcpu, event));
pr_info("\n");
}
pr_info("\nTotal Samples:%" PRIu64 ", Total events handled time:%.2fus.\n\n",
kvm->total_count, kvm->total_time / 1e3);
kvm->total_count, kvm->total_time / (double)NSEC_PER_USEC);
if (kvm->lost_events)
pr_info("\nLost events: %" PRIu64 "\n\n", kvm->lost_events);

View File

@ -326,6 +326,11 @@ static int perf_add_probe_events(struct perf_probe_event *pevs, int npevs)
if (ret < 0)
goto out_cleanup;
if (params.command == 'D') { /* it shows definition */
ret = show_probe_trace_events(pevs, npevs);
goto out_cleanup;
}
ret = apply_perf_probe_events(pevs, npevs);
if (ret < 0)
goto out_cleanup;
@ -454,6 +459,14 @@ out:
return ret;
}
#ifdef HAVE_DWARF_SUPPORT
#define PROBEDEF_STR \
"[EVENT=]FUNC[@SRC][+OFF|%return|:RL|;PT]|SRC:AL|SRC;PT [[NAME=]ARG ...]"
#else
#define PROBEDEF_STR "[EVENT=]FUNC[+OFF|%return] [[NAME=]ARG ...]"
#endif
static int
__cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
{
@ -479,13 +492,7 @@ __cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
opt_set_filter_with_command, DEFAULT_LIST_FILTER),
OPT_CALLBACK('d', "del", NULL, "[GROUP:]EVENT", "delete a probe event.",
opt_set_filter_with_command),
OPT_CALLBACK('a', "add", NULL,
#ifdef HAVE_DWARF_SUPPORT
"[EVENT=]FUNC[@SRC][+OFF|%return|:RL|;PT]|SRC:AL|SRC;PT"
" [[NAME=]ARG ...]",
#else
"[EVENT=]FUNC[+OFF|%return] [[NAME=]ARG ...]",
#endif
OPT_CALLBACK('a', "add", NULL, PROBEDEF_STR,
"probe point definition, where\n"
"\t\tGROUP:\tGroup name (optional)\n"
"\t\tEVENT:\tEvent name\n"
@ -503,6 +510,9 @@ __cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
"\t\tARG:\tProbe argument (kprobe-tracer argument format.)\n",
#endif
opt_add_probe_event),
OPT_CALLBACK('D', "definition", NULL, PROBEDEF_STR,
"Show trace event definition of given traceevent for k/uprobe_events.",
opt_add_probe_event),
OPT_BOOLEAN('f', "force", &probe_conf.force_add, "forcibly add events"
" with existing name"),
OPT_CALLBACK('L', "line", NULL,
@ -548,6 +558,7 @@ __cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
set_option_flag(options, 'a', "add", PARSE_OPT_EXCLUSIVE);
set_option_flag(options, 'd', "del", PARSE_OPT_EXCLUSIVE);
set_option_flag(options, 'D', "definition", PARSE_OPT_EXCLUSIVE);
set_option_flag(options, 'l', "list", PARSE_OPT_EXCLUSIVE);
#ifdef HAVE_DWARF_SUPPORT
set_option_flag(options, 'L', "line", PARSE_OPT_EXCLUSIVE);
@ -600,6 +611,14 @@ __cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
*/
symbol_conf.try_vmlinux_path = (symbol_conf.vmlinux_name == NULL);
/*
* Except for --list, --del and --add, other command doesn't depend
* nor change running kernel. So if user gives offline vmlinux,
* ignore its buildid.
*/
if (!strchr("lda", params.command) && symbol_conf.vmlinux_name)
symbol_conf.ignore_vmlinux_buildid = true;
switch (params.command) {
case 'l':
if (params.uprobes) {
@ -643,7 +662,9 @@ __cmd_probe(int argc, const char **argv, const char *prefix __maybe_unused)
return ret;
}
break;
case 'D':
case 'a':
/* Ensure the last given target is used */
if (params.target && !params.target_used) {
pr_err(" Error: -x/-m must follow the probe definitions.\n");

View File

@ -22,6 +22,7 @@
#include "util/evlist.h"
#include "util/evsel.h"
#include "util/debug.h"
#include "util/drv_configs.h"
#include "util/session.h"
#include "util/tool.h"
#include "util/symbol.h"
@ -42,7 +43,7 @@
#include <sched.h>
#include <sys/mman.h>
#include <asm/bug.h>
#include <linux/time64.h>
struct record {
struct perf_tool tool;
@ -96,7 +97,7 @@ backward_rb_find_range(void *buf, int mask, u64 head, u64 *start, u64 *end)
*start = head;
while (true) {
if (evt_head - head >= (unsigned int)size) {
pr_debug("Finshed reading backward ring buffer: rewind\n");
pr_debug("Finished reading backward ring buffer: rewind\n");
if (evt_head - head > (unsigned int)size)
evt_head -= pheader->size;
*end = evt_head;
@ -106,7 +107,7 @@ backward_rb_find_range(void *buf, int mask, u64 head, u64 *start, u64 *end)
pheader = (struct perf_event_header *)(buf + (evt_head & mask));
if (pheader->size == 0) {
pr_debug("Finshed reading backward ring buffer: get start\n");
pr_debug("Finished reading backward ring buffer: get start\n");
*end = evt_head;
return 0;
}
@ -383,6 +384,7 @@ static int record__open(struct record *rec)
struct perf_evlist *evlist = rec->evlist;
struct perf_session *session = rec->session;
struct record_opts *opts = &rec->opts;
struct perf_evsel_config_term *err_term;
int rc = 0;
perf_evlist__config(evlist, opts, &callchain_param);
@ -412,6 +414,14 @@ try_again:
goto out;
}
if (perf_evlist__apply_drv_configs(evlist, &pos, &err_term)) {
error("failed to set config \"%s\" on event %s with %d (%s)\n",
err_term->val.drv_cfg, perf_evsel__name(pos), errno,
str_error_r(errno, msg, sizeof(msg)));
rc = -1;
goto out;
}
rc = record__mmap(rec);
if (rc)
goto out;
@ -954,7 +964,7 @@ static int __cmd_record(struct record *rec, int argc, const char **argv)
}
if (opts->initial_delay) {
usleep(opts->initial_delay * 1000);
usleep(opts->initial_delay * USEC_PER_MSEC);
perf_evlist__enable(rec->evlist);
}
@ -1563,29 +1573,39 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
if (!rec->itr) {
rec->itr = auxtrace_record__init(rec->evlist, &err);
if (err)
return err;
goto out;
}
err = auxtrace_parse_snapshot_options(rec->itr, &rec->opts,
rec->opts.auxtrace_snapshot_opts);
if (err)
return err;
goto out;
/*
* Allow aliases to facilitate the lookup of symbols for address
* filters. Refer to auxtrace_parse_filters().
*/
symbol_conf.allow_aliases = true;
symbol__init(NULL);
err = auxtrace_parse_filters(rec->evlist);
if (err)
goto out;
if (dry_run)
return 0;
goto out;
err = bpf__setup_stdout(rec->evlist);
if (err) {
bpf__strerror_setup_stdout(rec->evlist, err, errbuf, sizeof(errbuf));
pr_err("ERROR: Setup BPF stdout failed: %s\n",
errbuf);
return err;
goto out;
}
err = -ENOMEM;
symbol__init(NULL);
if (symbol_conf.kptr_restrict)
pr_warning(
"WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted,\n"
@ -1633,7 +1653,7 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
if (rec->evlist->nr_entries == 0 &&
perf_evlist__add_default(rec->evlist) < 0) {
pr_err("Not enough memory for event selector list\n");
goto out_symbol_exit;
goto out;
}
if (rec->opts.target.tid && !rec->opts.no_inherit_set)
@ -1653,7 +1673,7 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
ui__error("%s", errbuf);
err = -saved_errno;
goto out_symbol_exit;
goto out;
}
err = -ENOMEM;
@ -1662,7 +1682,7 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
err = auxtrace_record__options(rec->itr, rec->evlist, &rec->opts);
if (err)
goto out_symbol_exit;
goto out;
/*
* We take all buildids when the file contains
@ -1674,11 +1694,11 @@ int cmd_record(int argc, const char **argv, const char *prefix __maybe_unused)
if (record_opts__config(&rec->opts)) {
err = -EINVAL;
goto out_symbol_exit;
goto out;
}
err = __cmd_record(&record, argc, argv);
out_symbol_exit:
out:
perf_evlist__delete(rec->evlist);
symbol__exit();
auxtrace_record__free(rec->itr);

View File

@ -89,6 +89,10 @@ static int report__config(const char *var, const char *value, void *cb)
rep->queue_size = perf_config_u64(var, value);
return 0;
}
if (!strcmp(var, "report.sort_order")) {
default_sort_order = strdup(value);
return 0;
}
return 0;
}
@ -931,7 +935,6 @@ repeat:
if (symbol_conf.report_hierarchy) {
/* disable incompatible options */
symbol_conf.event_group = false;
symbol_conf.cumulate_callchain = false;
if (field_order) {
@ -980,9 +983,9 @@ repeat:
* implementation.
*/
if (ui__has_annotation()) {
symbol_conf.priv_size = sizeof(struct annotation);
machines__set_symbol_filter(&session->machines,
symbol__annotate_init);
ret = symbol__annotation_init();
if (ret < 0)
goto error;
/*
* For searching by name on the "Browse map details".
* providing it only in verbose mode not to bloat too

View File

@ -26,6 +26,7 @@
#include <pthread.h>
#include <math.h>
#include <api/fs/fs.h>
#include <linux/time64.h>
#define PR_SET_NAME 15 /* Set process name */
#define MAX_CPUS 4096
@ -199,7 +200,7 @@ static u64 get_nsecs(void)
clock_gettime(CLOCK_MONOTONIC, &ts);
return ts.tv_sec * 1000000000ULL + ts.tv_nsec;
return ts.tv_sec * NSEC_PER_SEC + ts.tv_nsec;
}
static void burn_nsecs(struct perf_sched *sched, u64 nsecs)
@ -223,7 +224,7 @@ static void sleep_nsecs(u64 nsecs)
static void calibrate_run_measurement_overhead(struct perf_sched *sched)
{
u64 T0, T1, delta, min_delta = 1000000000ULL;
u64 T0, T1, delta, min_delta = NSEC_PER_SEC;
int i;
for (i = 0; i < 10; i++) {
@ -240,7 +241,7 @@ static void calibrate_run_measurement_overhead(struct perf_sched *sched)
static void calibrate_sleep_measurement_overhead(struct perf_sched *sched)
{
u64 T0, T1, delta, min_delta = 1000000000ULL;
u64 T0, T1, delta, min_delta = NSEC_PER_SEC;
int i;
for (i = 0; i < 10; i++) {
@ -452,8 +453,8 @@ static u64 get_cpu_usage_nsec_parent(void)
err = getrusage(RUSAGE_SELF, &ru);
BUG_ON(err);
sum = ru.ru_utime.tv_sec*1e9 + ru.ru_utime.tv_usec*1e3;
sum += ru.ru_stime.tv_sec*1e9 + ru.ru_stime.tv_usec*1e3;
sum = ru.ru_utime.tv_sec * NSEC_PER_SEC + ru.ru_utime.tv_usec * NSEC_PER_USEC;
sum += ru.ru_stime.tv_sec * NSEC_PER_SEC + ru.ru_stime.tv_usec * NSEC_PER_USEC;
return sum;
}
@ -667,12 +668,12 @@ static void run_one_test(struct perf_sched *sched)
sched->run_avg = delta;
sched->run_avg = (sched->run_avg * (sched->replay_repeat - 1) + delta) / sched->replay_repeat;
printf("#%-3ld: %0.3f, ", sched->nr_runs, (double)delta / 1000000.0);
printf("#%-3ld: %0.3f, ", sched->nr_runs, (double)delta / NSEC_PER_MSEC);
printf("ravg: %0.2f, ", (double)sched->run_avg / 1e6);
printf("ravg: %0.2f, ", (double)sched->run_avg / NSEC_PER_MSEC);
printf("cpu: %0.2f / %0.2f",
(double)sched->cpu_usage / 1e6, (double)sched->runavg_cpu_usage / 1e6);
(double)sched->cpu_usage / NSEC_PER_MSEC, (double)sched->runavg_cpu_usage / NSEC_PER_MSEC);
#if 0
/*
@ -680,8 +681,8 @@ static void run_one_test(struct perf_sched *sched)
* accurate than the sched->sum_exec_runtime based statistics:
*/
printf(" [%0.2f / %0.2f]",
(double)sched->parent_cpu_usage/1e6,
(double)sched->runavg_parent_cpu_usage/1e6);
(double)sched->parent_cpu_usage / NSEC_PER_MSEC,
(double)sched->runavg_parent_cpu_usage / NSEC_PER_MSEC);
#endif
printf("\n");
@ -696,13 +697,13 @@ static void test_calibrations(struct perf_sched *sched)
u64 T0, T1;
T0 = get_nsecs();
burn_nsecs(sched, 1e6);
burn_nsecs(sched, NSEC_PER_MSEC);
T1 = get_nsecs();
printf("the run test took %" PRIu64 " nsecs\n", T1 - T0);
T0 = get_nsecs();
sleep_nsecs(1e6);
sleep_nsecs(NSEC_PER_MSEC);
T1 = get_nsecs();
printf("the sleep test took %" PRIu64 " nsecs\n", T1 - T0);
@ -1213,10 +1214,10 @@ static void output_lat_thread(struct perf_sched *sched, struct work_atoms *work_
avg = work_list->total_lat / work_list->nb_atoms;
printf("|%11.3f ms |%9" PRIu64 " | avg:%9.3f ms | max:%9.3f ms | max at: %13.6f s\n",
(double)work_list->total_runtime / 1e6,
work_list->nb_atoms, (double)avg / 1e6,
(double)work_list->max_lat / 1e6,
(double)work_list->max_lat_at / 1e9);
(double)work_list->total_runtime / NSEC_PER_MSEC,
work_list->nb_atoms, (double)avg / NSEC_PER_MSEC,
(double)work_list->max_lat / NSEC_PER_MSEC,
(double)work_list->max_lat_at / NSEC_PER_SEC);
}
static int pid_cmp(struct work_atoms *l, struct work_atoms *r)
@ -1491,7 +1492,7 @@ static int map_switch_event(struct perf_sched *sched, struct perf_evsel *evsel,
if (sched->map.cpus && !cpu_map__has(sched->map.cpus, this_cpu))
goto out;
color_fprintf(stdout, color, " %12.6f secs ", (double)timestamp/1e9);
color_fprintf(stdout, color, " %12.6f secs ", (double)timestamp / NSEC_PER_SEC);
if (new_shortname) {
const char *pid_color = color;
@ -1753,7 +1754,7 @@ static int perf_sched__lat(struct perf_sched *sched)
printf(" -----------------------------------------------------------------------------------------------------------------\n");
printf(" TOTAL: |%11.3f ms |%9" PRIu64 " |\n",
(double)sched->all_runtime / 1e6, sched->all_count);
(double)sched->all_runtime / NSEC_PER_MSEC, sched->all_count);
printf(" ---------------------------------------------------\n");

View File

@ -24,6 +24,7 @@
#include "util/thread-stack.h"
#include <linux/bitmap.h>
#include <linux/stringify.h>
#include <linux/time64.h>
#include "asm/bug.h"
#include "util/mem-events.h"
@ -464,9 +465,9 @@ static void print_sample_start(struct perf_sample *sample,
if (PRINT_FIELD(TIME)) {
nsecs = sample->time;
secs = nsecs / NSECS_PER_SEC;
nsecs -= secs * NSECS_PER_SEC;
usecs = nsecs / NSECS_PER_USEC;
secs = nsecs / NSEC_PER_SEC;
nsecs -= secs * NSEC_PER_SEC;
usecs = nsecs / NSEC_PER_USEC;
if (nanosecs)
printf("%5lu.%09llu: ", secs, nsecs);
else
@ -521,11 +522,11 @@ static void print_sample_brstacksym(struct perf_sample *sample,
thread__find_addr_map(thread, sample->cpumode, MAP__FUNCTION, from, &alf);
if (alf.map)
alf.sym = map__find_symbol(alf.map, alf.addr, NULL);
alf.sym = map__find_symbol(alf.map, alf.addr);
thread__find_addr_map(thread, sample->cpumode, MAP__FUNCTION, to, &alt);
if (alt.map)
alt.sym = map__find_symbol(alt.map, alt.addr, NULL);
alt.sym = map__find_symbol(alt.map, alt.addr);
symbol__fprintf_symname_offs(alf.sym, &alf, stdout);
putchar('/');

Some files were not shown because too many files have changed in this diff Show More