linux/kernel/trace
Kalesh Singh 722eddaa40 tracing/histogram: Optimize division by a power of 2
The division is a slow operation. If the divisor is a power of 2, use a
shift instead.

Results were obtained using Android's version of perf (simpleperf[1]) as
described below:

1. hist_field_div() is modified to call 2 test functions:
   test_hist_field_div_[not]_optimized(); passing them the
   same args. Use noinline and volatile to ensure these are
   not optimized out by the compiler.
2. Create a hist event trigger that uses division:
      events/kmem/rss_stat$ echo 'hist:keys=common_pid:x=size/<divisor>'
         >> trigger
      events/kmem/rss_stat$ echo 'hist:keys=common_pid:vals=$x'
         >> trigger
3. Run Android's lmkd_test[2] to generate rss_stat events, and
   record CPU samples with Android's simpleperf:
      simpleperf record -a --exclude-perf --post-unwind=yes -m 16384 -g
         -f 2000 -o perf.data

== Results ==

Divisor is a power of 2 (divisor == 32):

   test_hist_field_div_not_optimized  | 8,717,091 cpu-cycles
   test_hist_field_div_optimized      | 1,643,137 cpu-cycles

If the divisor is a power of 2, the optimized version is ~5.3x faster.

Divisor is not a power of 2 (divisor == 33):

   test_hist_field_div_not_optimized  | 4,444,324 cpu-cycles
   test_hist_field_div_optimized      | 5,497,958 cpu-cycles

If the divisor is not a power of 2, as expected, the optimized version is
slightly slower (~24% slower).

[1] https://android.googlesource.com/platform/system/extras/+/master/simpleperf/doc/README.md
[2] https://cs.android.com/android/platform/superproject/+/master:system/memory/lmkd/tests/lmkd_test.cpp

Link: https://lkml.kernel.org/r/20211025200852.3002369-7-kaleshsingh@google.com

Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-10-26 20:27:20 -04:00
..
blktrace.c blktrace: Fix uaf in blk_trace access after removing by sysfs 2021-09-24 11:06:15 -06:00
bpf_trace.c bpf: Fix bpf-next builds without CONFIG_BPF_EVENTS 2021-08-25 19:41:39 -07:00
bpf_trace.h bpf: Use dedicated bpf_trace_printk event instead of trace_printk() 2020-07-13 16:55:49 -07:00
error_report-traces.c tracing: add error_report_end trace point 2021-02-26 09:41:02 -08:00
fgraph.c x86/ftrace: Make function graph use ftrace directly 2021-10-20 23:44:43 -04:00
ftrace_internal.h
ftrace.c ftrace: Make ftrace_profile_pages_init static 2021-10-25 22:27:07 -04:00
Kconfig tracing: Simplify the Kconfig dependency of FTRACE 2021-08-16 11:37:20 -04:00
kprobe_event_gen_test.c
Makefile tracing: Place trace_pid_list logic into abstract functions 2021-10-05 17:30:08 -04:00
pid_list.c tracing: Initialize upper and lower vars in pid_list_refill_irq() 2021-10-07 09:56:38 -04:00
pid_list.h tracing: Create a sparse bitmask for pid filtering 2021-10-05 17:38:45 -04:00
power-traces.c
preemptirq_delay_test.c kernel: trace: preemptirq_delay_test: add cpu affinity 2021-02-02 17:02:07 -05:00
ring_buffer_benchmark.c sched,tracing: Convert to sched_set_fifo() 2020-07-29 11:43:53 +02:00
ring_buffer.c tracing/perf: Add interrupt_context_level() helper 2021-10-19 20:33:20 -04:00
rpm-traces.c
synth_event_gen_test.c tracing: Fix various typos in comments 2021-03-23 14:08:18 -04:00
trace_benchmark.c tracing: Fix some typos in comments 2020-11-10 20:39:40 -05:00
trace_benchmark.h
trace_boot.c tracing: Fix missing trace_boot_init_histograms kstrdup NULL checks 2021-10-26 09:18:10 -04:00
trace_branch.c tracing: Merge irqflags + preempt counter. 2021-02-02 17:02:06 -05:00
trace_clock.c tracing: Do no increment trace_clock_global() by one 2021-06-18 09:10:00 -04:00
trace_dynevent.c tracing: Disable "other" permission bits in the tracefs files 2021-10-08 18:08:43 -04:00
trace_dynevent.h tracing: Add DYNAMIC flag for dynamic events 2021-08-18 18:10:32 -04:00
trace_entries.h trace: Add timerlat tracer 2021-06-25 19:57:24 -04:00
trace_eprobe.c tracing: Fix some alloc_event_probe() error handling bugs 2021-09-07 20:58:23 -04:00
trace_event_perf.c tracing: Have dynamic events have a ref counter 2021-08-18 18:13:47 -04:00
trace_events_filter_test.h
trace_events_filter.c tracing: Update create_system_filter() kernel-doc comment 2021-03-25 16:04:35 -04:00
trace_events_hist.c tracing/histogram: Optimize division by a power of 2 2021-10-26 20:27:20 -04:00
trace_events_inject.c tracing: Merge irqflags + preempt counter. 2021-02-02 17:02:06 -05:00
trace_events_synth.c tracing: Disable "other" permission bits in the tracefs files 2021-10-08 18:08:43 -04:00
trace_events_trigger.c tracing: Add a probe that attaches to trace events 2021-08-20 14:18:40 -04:00
trace_events.c tracing: Disable "other" permission bits in the tracefs files 2021-10-08 18:08:43 -04:00
trace_export.c tracing: Fix some typos in comments 2020-11-10 20:39:40 -05:00
trace_functions_graph.c tracing: in_irq() cleanup 2021-10-13 18:19:41 -04:00
trace_functions.c tracing: Add "func_no_repeats" option for function tracing 2021-04-15 14:50:02 -04:00
trace_hwlat.c tracing/hwlat: Make some internal symbols static 2021-10-26 09:18:28 -04:00
trace_irqsoff.c tracing: Merge irqflags + preempt counter. 2021-02-02 17:02:06 -05:00
trace_kdb.c kdb: Rename members of struct kdbtab_t 2021-07-27 17:05:06 +01:00
trace_kprobe_selftest.c
trace_kprobe_selftest.h
trace_kprobe.c tracing: Disable "other" permission bits in the tracefs files 2021-10-08 18:08:43 -04:00
trace_mmiotrace.c tracing: Remove definition of DEBUG in trace_mmiotrace.c 2021-02-02 17:02:07 -05:00
trace_nop.c
trace_osnoise.c trace/timerlat: Add migrate-disabled field to the timerlat header 2021-10-25 23:02:36 -04:00
trace_output.c tracing: Show kretprobe unknown indicator only for kretprobe_trampoline 2021-09-30 21:24:08 -04:00
trace_output.h ftrace: Add recording of functions that caused recursion 2020-11-06 08:42:26 -05:00
trace_preemptirq.c lockdep: fix order in trace_hardirqs_off_caller() 2020-09-14 10:08:07 +02:00
trace_printk.c tracing: Disable "other" permission bits in the tracefs files 2021-10-08 18:08:43 -04:00
trace_probe_tmpl.h tracing/probes: Have process_fetch_insn() take a void * instead of pt_regs 2021-08-19 09:09:03 -04:00
trace_probe.c tracing: Add a probe that attaches to trace events 2021-08-20 14:18:40 -04:00
trace_probe.h tracing: Add a probe that attaches to trace events 2021-08-20 14:18:40 -04:00
trace_recursion_record.c tracing: Disable "other" permission bits in the tracefs files 2021-10-08 18:08:43 -04:00
trace_sched_switch.c
trace_sched_wakeup.c tracing: Change variable type as bool for clean-up 2021-06-30 09:19:14 -04:00
trace_selftest_dynamic.c
trace_selftest.c tracing: Fix selftest config check for function graph start up test 2021-10-21 14:18:48 -04:00
trace_seq.c tracing: Fix various typos in comments 2021-03-23 14:08:18 -04:00
trace_stack.c tracing: Disable "other" permission bits in the tracefs files 2021-10-08 18:08:43 -04:00
trace_stat.c tracing: Disable "other" permission bits in the tracefs files 2021-10-08 18:08:43 -04:00
trace_stat.h
trace_synth.h tracing: synth events: increase max fields count 2021-09-08 15:29:16 -04:00
trace_syscalls.c tracing: Merge irqflags + preempt counter. 2021-02-02 17:02:06 -05:00
trace_uprobe.c tracing: Disable "other" permission bits in the tracefs files 2021-10-08 18:08:43 -04:00
trace.c tracing: Disable "other" permission bits in the tracefs files 2021-10-08 18:08:43 -04:00
trace.h tracing: in_irq() cleanup 2021-10-13 18:19:41 -04:00
tracing_map.c tracing/cfi: Fix cmp_entries_* functions signature mismatch 2021-10-19 20:33:20 -04:00
tracing_map.h tracing: Fix some typos in comments 2020-11-10 20:39:40 -05:00