linux/include
Thomas Gleixner 4396e058c5 timekeeping: Provide fast and NMI safe access to CLOCK_MONOTONIC
Tracers want a correlated time between the kernel instrumentation and
user space. We really do not want to export sched_clock() to user
space, so we need to provide something sensible for this.

Using separate data structures with a non-blocking sequence count
based update mechanism allows us to do that. The data structure
required for the readout has a sequence counter and two copies of the
timekeeping data.

On the update side:

  smp_wmb();
  tkf->seq++;
  smp_wmb();
  update(tkf->base[0], tk);
  smp_wmb();
  tkf->seq++;
  smp_wmb();
  update(tkf->base[1], tk);

On the reader side:

  do {
     seq = tkf->seq;
     smp_rmb();
     idx = seq & 0x01;
     now = now(tkf->base[idx]);
     smp_rmb();
  } while (seq != tkf->seq);

So if an NMI hits during the update of base[0], it will use base[1],
which is still consistent. However, the resulting timestamp is not
guaranteed to be monotonic across an update.
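
The update and reader sequences above can be sketched in user-space C11
as follows. This is only an illustration under stated assumptions, not
the kernel implementation: struct tk_base, struct tk_fast and the
tk_fast_*() helpers are hypothetical names, the slope is simplified to
an integer number of nanoseconds per cycle, and the release/acquire
fences merely approximate smp_wmb()/smp_rmb(). Strictly conforming C11
would also need atomic accesses to the copied fields.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for one copy of the timekeeping data. */
struct tk_base {
	uint64_t base_mono;	/* monotonic time at cycle_last, in ns */
	uint64_t cycle_last;	/* counter value at the last update */
	uint64_t slope;		/* simplified: ns per counter cycle */
};

/* Sequence counter plus two copies, as described in the text. */
struct tk_fast {
	atomic_uint seq;
	struct tk_base base[2];
};

static void tk_fast_init(struct tk_fast *tkf, const struct tk_base *tk)
{
	atomic_init(&tkf->seq, 0);
	tkf->base[0] = *tk;
	tkf->base[1] = *tk;
}

/* Update side: never blocks, readers always find one stable copy. */
static void tk_fast_update(struct tk_fast *tkf, const struct tk_base *tk)
{
	atomic_thread_fence(memory_order_release);
	/* seq becomes odd: readers are steered to base[1] */
	atomic_fetch_add_explicit(&tkf->seq, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	tkf->base[0] = *tk;
	atomic_thread_fence(memory_order_release);
	/* seq becomes even: readers are steered back to base[0] */
	atomic_fetch_add_explicit(&tkf->seq, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	tkf->base[1] = *tk;
}

/* Reader side: retry if seq changed while the copy was being read. */
static uint64_t tk_fast_read(struct tk_fast *tkf, uint64_t cycles)
{
	unsigned int seq;
	uint64_t now;

	do {
		seq = atomic_load_explicit(&tkf->seq, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		const struct tk_base *b = &tkf->base[seq & 0x01];
		now = b->base_mono + (cycles - b->cycle_last) * b->slope;
		atomic_thread_fence(memory_order_acquire);
	} while (seq != atomic_load_explicit(&tkf->seq, memory_order_relaxed));

	return now;
}
```

A reader that interrupts tk_fast_update() between the two increments
sees an odd seq and therefore computes from base[1], which still holds
the previous, fully written data.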

The timestamp is calculated by:

	now = base_mono + clock_delta * slope

So if the update lowers the slope, readers that are forced onto the
not-yet-updated second copy are still using the old, steeper slope.

 tmono
 ^
 |    o  n
 |   o n
 |  u
 | o
 |o
 |12345678---> reader order

 o = old slope
 u = update
 n = new slope

So reader 6 will observe time going backwards versus reader 5.
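
The effect can be reproduced with a few made-up numbers. The sketch
below is illustrative only: struct tk_base, tk_now() and tk_advance()
are hypothetical names, and the slope is simplified to an integer
number of nanoseconds per cycle.

```c
#include <stdint.h>

/* Hypothetical stand-in for one copy of the timekeeping data. */
struct tk_base {
	uint64_t base_mono;	/* monotonic time at cycle_last, in ns */
	uint64_t cycle_last;	/* counter value at the last update */
	uint64_t slope;		/* simplified: ns per counter cycle */
};

/* now = base_mono + clock_delta * slope */
static uint64_t tk_now(const struct tk_base *b, uint64_t cycles)
{
	return b->base_mono + (cycles - b->cycle_last) * b->slope;
}

/*
 * Build the post-update data: time is continuous at the update point,
 * only the slope changes from there on.
 */
static struct tk_base tk_advance(const struct tk_base *old, uint64_t cycles,
				 uint64_t new_slope)
{
	struct tk_base next = {
		.base_mono = tk_now(old, cycles),
		.cycle_last = cycles,
		.slope = new_slope,
	};

	return next;
}
```

With an old slope of 3 and an update at cycle 100 that lowers the slope
to 2, a reader at cycle 110 that is still steered to the old copy
computes 0 + 110 * 3 = 330, while a reader at cycle 111 that already
uses the new copy computes 300 + 11 * 2 = 322, i.e. an earlier time for
a later read.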

While other CPUs are likely to be able to observe that, a CPU can
only observe it locally when an NMI hits in the middle of the update.
Timestamps taken from that NMI context might be ahead of the
timestamps that follow. Callers need to be aware of that and deal
with it.
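
One possible way for a caller to deal with it, shown purely as an
assumption and not something this change prescribes, is to clamp
timestamps while post-processing a recorded stream in a single thread.
ts_clamp() is a hypothetical helper, not a kernel API.

```c
#include <stdint.h>

/*
 * Return a non-decreasing timestamp: if 'now' is behind the last value
 * handed out, repeat the last value instead of going backwards.
 * '*last' is consumer-private state, so no locking is involved.
 */
static uint64_t ts_clamp(uint64_t *last, uint64_t now)
{
	if (now < *last)
		return *last;

	*last = now;
	return now;
}
```

The price is that events around an update can end up with identical
timestamps, so a consumer that needs strict ordering has to rely on
something other than the timestamp alone.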

V2: Got rid of clock monotonic raw and reorganized the data
    structures. Folded in the barrier fix from Mathieu.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2014-07-23 15:01:55 -07:00
acpi ACPI / i915: ignore firmware requests for backlight change 2014-07-07 23:38:05 +02:00
asm-generic core: fix typo in percpu read_mostly section 2014-07-01 16:45:22 -04:00
clocksource ARM: pxa: Add non device-tree timer link to clocksource 2014-07-23 12:02:39 +02:00
crypto
drm sound fixes for 3.16-rc4 2014-07-04 08:56:57 -07:00
dt-bindings This batch of fixes is for a handful of clock drivers from Allwinner, 2014-07-13 12:21:04 -07:00
keys
kvm
linux timekeeping: Provide fast and NMI safe access to CLOCK_MONOTONIC 2014-07-23 15:01:55 -07:00
math-emu
media Merge branch 'topic/omap3isp' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media 2014-06-12 23:04:28 -07:00
memory
misc
net ipv4: fix dst race in sk_dst_get() 2014-06-25 17:41:44 -07:00
pcmcia
ras
rdma Merge branches 'core', 'cxgb3', 'cxgb4', 'iser', 'iwpm', 'misc', 'mlx4', 'mlx5', 'noio', 'ocrdma', 'qib', 'srp' and 'usnic' into for-next 2014-06-10 10:12:14 -07:00
rxrpc
scsi SCSI for-linus on 20140705 2014-07-06 12:08:30 -07:00
sound ALSA: control: Protect user controls against concurrent access 2014-06-18 15:12:33 +02:00
target target: Report correct response length for some commands 2014-06-11 12:15:30 -07:00
trace tracing: Add __field_struct macro for TRACE_EVENT() 2014-06-21 00:18:42 -04:00
uapi Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs 2014-07-04 08:53:53 -07:00
video Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linux 2014-06-12 11:32:30 -07:00
xen Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2014-06-12 14:27:40 -07:00
Kbuild