linux/arch/x86/include/asm
Ross Zwisler 67a3e8fe90 nd_blk: change aperture mapping from WC to WB
This should yield a sizeable performance gain for reads: roughly 27x in
the rough comparison below, where I used PMEM to compare reads through
write-combining (WC) and write-back (WB) mappings.  The test was run on
an arbitrary lab machine.  The two runs use different counts, so compare
throughput rather than elapsed time.

PMEM reads from a write-combining mapping:
	# dd of=/dev/null if=/dev/pmem0 bs=4096 count=100000
	100000+0 records in
	100000+0 records out
	409600000 bytes (410 MB) copied, 9.2855 s, 44.1 MB/s

PMEM reads from a write-back mapping:
	# dd of=/dev/null if=/dev/pmem0 bs=4096 count=1000000
	1000000+0 records in
	1000000+0 records out
	4096000000 bytes (4.1 GB) copied, 3.44034 s, 1.2 GB/s

To be able to safely support a write-back aperture I needed to add
support for the "read flush" _DSM flag, as outlined in the DSM spec:

http://pmem.io/documents/NVDIMM_DSM_Interface_Example.pdf

This flag tells the ND BLK driver that it needs to flush the cache lines
associated with the aperture after the aperture is moved but before any
new data is read.  This ensures that any stale cache lines from the
previous contents of the aperture will be discarded from the processor
cache, and the new data will be read properly from the DIMM.  We know
that the cache lines are clean and will be discarded without any
writeback because either a) the previous aperture operation was a read,
and we never modified the contents of the aperture, or b) the previous
aperture operation was a write and we must have written back the dirtied
contents of the aperture to the DIMM before the I/O was completed.
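
The ordering described above (move the aperture, discard stale lines,
then read) can be illustrated with a toy model.  This is only a sketch:
the toy_dimm and toy_cache structures and every function name below are
hypothetical and are not taken from the ND BLK driver.  It shows why a
read without the invalidate step returns the previous window's data.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define APERTURE_SIZE 8
#define NUM_BLOCKS 2

/* Hypothetical model: backing media plus the block the aperture exposes. */
struct toy_dimm {
	char blocks[NUM_BLOCKS][APERTURE_SIZE];
	int window;
};

/* Hypothetical model: the processor's cached copy of the aperture. */
struct toy_cache {
	char lines[APERTURE_SIZE];
	bool valid;
};

static void move_aperture(struct toy_dimm *d, int block)
{
	assert(block >= 0 && block < NUM_BLOCKS);
	d->window = block;
}

/* Clean lines are simply dropped; nothing is written back. */
static void invalidate(struct toy_cache *c)
{
	c->valid = false;
}

/* A cached read only goes to the DIMM when no valid copy exists. */
static void aperture_read(struct toy_dimm *d, struct toy_cache *c, char *dst)
{
	if (!c->valid) {
		memcpy(c->lines, d->blocks[d->window], APERTURE_SIZE);
		c->valid = true;
	}
	memcpy(dst, c->lines, APERTURE_SIZE);
}

/* Move the window, optionally flush, then read through the cache. */
static void blk_read(struct toy_dimm *d, struct toy_cache *c, int block,
		     bool read_flush, char *dst)
{
	move_aperture(d, block);
	if (read_flush)
		invalidate(c);
	aperture_read(d, c, dst);
}
```

In this model, reading block 0 and then block 1 with read_flush set to
false returns block 0's bytes for the second read, because the cached
copy is still marked valid after the window has moved.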

To support the "read flush" flag I added a generic routine to invalidate
cache lines, mmio_flush_range().  It is guarded by the
ARCH_HAS_MMIO_FLUSH Kconfig symbol and is currently implemented only on
x86.
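
As a rough user-space illustration of what flushing a range of cache
lines involves, the sketch below walks a buffer one cache line at a
time.  It is not the kernel routine: flush_range_sketch() is a
hypothetical name, the 64-byte line size is an assumption, and the
_mm_clflush()/_mm_mfence() intrinsics are used only where SSE2 is
available (elsewhere the flush is a no-op so the loop still compiles).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#define flush_line(p)	_mm_clflush(p)
#define full_fence()	_mm_mfence()
#else
/* Assumption: no portable flush here, so fall back to a no-op. */
#define flush_line(p)	((void)(p))
#define full_fence()	((void)0)
#endif

#define CACHE_LINE 64u	/* assumed line size */

/*
 * Flush every cache line overlapping [addr, addr + size) and return
 * the number of lines visited.
 */
static size_t flush_range_sketch(const void *addr, size_t size)
{
	uintptr_t start, end, p;
	size_t lines = 0;

	assert(addr != NULL);

	/* Round the start down to a line boundary. */
	start = (uintptr_t)addr & ~(uintptr_t)(CACHE_LINE - 1);
	end = (uintptr_t)addr + size;

	full_fence();
	for (p = start; p < end; p += CACHE_LINE) {
		flush_line((const void *)p);
		lines++;
	}
	full_fence();

	return lines;
}
```

Rounding the start address down means an unaligned range that straddles
a boundary visits both lines it touches.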

Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-08-27 19:38:28 -04:00
crypto x86/fpu: Rename i387.h to fpu/api.h 2015-05-19 15:47:30 +02:00
fpu x86/fpu, sched: Dynamically allocate 'struct fpu' 2015-07-18 03:42:35 +02:00
numachip
trace Merge branch 'x86-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-06-22 17:59:09 -07:00
uv x86: UV BAU: Increase maximum CPUs per socket/hub 2014-11-03 13:49:24 +01:00
xen xen: Add __GFP_DMA flag when xen_swiotlb_init gets free pages on ARM 2015-05-06 15:02:58 +01:00
a.out-core.h
acenv.h ACPICA: Linux: Add support to exclude <asm/acenv.h> inclusion. 2014-07-23 01:10:44 +02:00
acpi.h x86/xen: Override ACPI IRQ management callback __acpi_unregister_gsi 2015-01-20 11:44:41 +01:00
agp.h
alternative-asm.h x86/alternatives: Document macros 2015-05-06 11:25:31 +02:00
alternative.h x86/alternatives, x86/fpu: Add 'alternatives_patched' debug flag and use it in xsave_state() 2015-05-19 15:48:03 +02:00
amd_nb.h x86/gart: Check for GART support before accessing GART registers 2015-05-06 11:15:53 +02:00
apb_timer.h
apic_flat_64.h
apic.h x86: Consolidate irq entering inlines 2015-05-15 16:04:49 +02:00
apicdef.h
apm.h
arch_hweight.h
archrandom.h
asm-offsets.h
asm.h x86/asm/uaccess: Unify the ALIGN_DESTINATION macro 2015-05-14 07:25:34 +02:00
atomic64_32.h
atomic64_64.h x86/asm: Always inline atomics 2015-04-22 08:14:41 +02:00
atomic.h x86: Force inlining of atomic ops 2015-05-08 12:55:50 +02:00
barrier.h locking/arch: Rename set_mb() to smp_store_mb() 2015-05-19 08:32:00 +02:00
bios_ebda.h
bitops.h Make ARCH_HAS_FAST_MULTIPLIER a real config variable 2014-09-13 11:14:53 -07:00
boot.h
bootparam_utils.h
bug.h
bugs.h
cache.h
cacheflush.h nd_blk: change aperture mapping from WC to WB 2015-08-27 19:38:28 -04:00
calgary.h
ce4100.h
checksum_32.h
checksum_64.h
checksum.h
clocksource.h
cmdline.h
cmpxchg_32.h x86: Simplify __HAVE_ARCH_CMPXCHG tests 2014-07-11 17:28:51 -07:00
cmpxchg_64.h x86: Simplify __HAVE_ARCH_CMPXCHG tests 2014-07-11 17:28:51 -07:00
cmpxchg.h arch: Remove __ARCH_HAVE_CMPXCHG 2015-05-13 10:55:42 +02:00
compat.h x86/asm/entry/64: Save user RSP in pt_regs->sp on SYSCALL64 fastpath 2015-03-10 13:56:10 +01:00
context_tracking.h
cpu_device_id.h
cpu.h x86: Use common outgoing-CPU-notification code 2015-03-11 13:22:35 -07:00
cpufeature.h x86_64, asm: Work around AMD SYSRET SS descriptor attribute issue 2015-04-26 17:57:38 -07:00
cpumask.h
crash.h kexec: support for kexec on panic using new system call 2014-08-08 15:57:33 -07:00
current.h
debugreg.h perf/x86/amd: AMD support for bp_len > HW_BREAKPOINT_LEN_8 2014-12-03 15:14:26 +01:00
delay.h
desc_defs.h
desc.h x86/traps: Separate set_intr_gate() and clean up early_trap_init() 2015-03-05 00:47:29 +01:00
device.h
disabled-features.h x86, mpx: Add MPX to disabled features 2014-11-18 00:58:53 +01:00
div64.h
dma-mapping.h x86: Deinline dma_free_attrs() 2015-05-05 20:48:02 +02:00
dma.h x86/mm: Fix zone ranges boot printout 2014-12-11 11:35:02 +01:00
dmi.h
e820.h mm: move memtest under mm 2015-04-14 16:49:06 -07:00
edac.h EDAC: Cleanup atomic_scrub mess 2015-05-28 15:31:53 +02:00
efi.h x86/fpu: Rename i387.h to fpu/api.h 2015-05-19 15:47:30 +02:00
elf.h mm: fold arch_randomize_brk into ARCH_HAS_ELF_RANDOMIZE 2015-04-14 16:49:05 -07:00
emergency-restart.h
entry_arch.h Merge branch 'x86/ras' into x86/core, to fix conflicts 2015-06-07 15:35:27 +02:00
espfix.h x86/espfix: Add 'cpu' parameter to init_espfix_ap() 2015-07-06 15:00:33 +02:00
exec.h
fb.h x86: Use new cache mode type in include/asm/fb.h 2014-11-16 11:04:24 +01:00
fixmap.h Merge branch 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2014-12-10 14:24:20 -08:00
floppy.h
frame.h x86/debug: Remove perpetually broken, unmaintainable dwarf annotations 2015-06-02 07:57:48 +02:00
ftrace.h ftrace/x86: Move MCOUNT_SAVE_FRAME out of header file 2014-12-01 14:07:16 -05:00
futex.h
gart.h
genapic.h
geode.h
gpio.h
hardirq.h Merge branch 'x86/ras' into x86/core, to fix conflicts 2015-06-07 15:35:27 +02:00
highmem.h x86: mm: Re-use the early_ioremap fixed area 2014-11-03 13:40:44 +01:00
hpet.h x86/MSI: Clean up unused MSI related code and interfaces 2015-04-24 15:36:49 +02:00
hugetlb.h mm/hugetlb: remove arch_prepare/release_hugepage from arch headers 2015-06-25 17:00:35 -07:00
hw_breakpoint.h perf/x86/amd: AMD support for bp_len > HW_BREAKPOINT_LEN_8 2014-12-03 15:14:26 +01:00
hw_irq.h Merge branch 'x86/ras' into x86/core, to fix conflicts 2015-06-07 15:35:27 +02:00
hypertransport.h
hypervisor.h hypervisor/x86/xen: Unset X86_BUG_SYSRET_SS_ATTRS on Xen PV guests 2015-05-05 18:27:43 +01:00
i8259.h
ia32_unistd.h
ia32.h
idle.h
imr.h x86/intel/quark: Add Isolated Memory Regions for Quark X1000 2015-02-18 23:22:47 +01:00
inat_types.h
inat.h
init.h
insn.h x86/asm/decoder: Fix and enforce max instruction size in the insn decoder 2015-02-19 00:01:24 +01:00
inst.h
intel_mid_vrtc.h
intel_pmc_ipc.h intel_pmc_ipc: Update kerneldoc formatting 2015-07-09 11:23:15 -07:00
intel_scu_ipc.h
intel-mid.h x86, intel-mid: remove Intel MID specific serial support 2015-03-07 03:25:18 +01:00
io_apic.h x86: Cleanup irq_domain ops 2015-04-24 15:36:55 +02:00
io.h nd_blk: change aperture mapping from WC to WB 2015-08-27 19:38:28 -04:00
iomap.h
iommu_table.h x86/iommu: Fix header comments regarding standard and _FINISH macros 2015-04-09 10:56:31 +02:00
iommu.h
iosf_mbi.h
ipi.h
irq_regs.h
irq_remapping.h iommu, x86: Provide irq_remapping_cap() interface 2015-06-12 11:33:52 +02:00
irq_vectors.h Merge branch 'x86/ras' into x86/core, to fix conflicts 2015-06-07 15:35:27 +02:00
irq_work.h x86: Tell irq work about self IPI support 2014-09-13 18:38:29 +02:00
irq.h x86/irq: Define a global vector for VT-d Posted-Interrupts 2015-05-19 15:51:17 +02:00
irqdomain.h x86/irq: Move irqdomain specific code into asm/irqdomain.h 2015-04-24 15:36:55 +02:00
irqflags.h x86/asm/entry: Drop now unused ENABLE_INTERRUPTS_SYSEXIT32 2015-04-03 10:34:19 +02:00
ist.h
jump_label.h jump_label: Allow asm/jump_label.h to be included in assembly 2015-04-09 09:40:23 +02:00
kasan.h x86/kasan: Fix KASAN shadow region page tables 2015-07-06 14:53:13 +02:00
kbdleds.h
Kbuild mm: clean up per architecture MM hook header files 2015-07-17 16:39:53 -07:00
kdebug.h
kexec-bzimage64.h kexec-bzImage64: support for loading bzImage using 64bit entry 2014-08-08 15:57:33 -07:00
kexec.h kexec: support for kexec on panic using new system call 2014-08-08 15:57:33 -07:00
kgdb.h
kmap_types.h
kmemcheck.h
kprobes.h kprobes/x86: Remove stale ARCH_SUPPORTS_KPROBES_ON_FTRACE define 2014-10-17 07:18:34 +02:00
kvm_emulate.h KVM: x86: stubs for SMM support 2015-06-04 16:01:45 +02:00
kvm_guest.h
kvm_host.h KVM: count number of assigned devices 2015-07-10 13:25:26 +02:00
kvm_para.h x86: Use bool function return values of true/false not 1/0 2015-03-31 18:05:09 +02:00
lguest_hcall.h lguest: remove NOTIFY call and eventfd facility. 2015-02-11 16:47:46 +10:30
lguest.h lguest: suppress interrupts for single insn, not range. 2015-03-24 11:52:08 +10:30
linkage.h
livepatch.h livepatch: x86: make kASLR logic more accurate 2015-04-29 16:51:33 +02:00
local64.h
local.h
mach_timer.h
mach_traps.h
math_emu.h
mc146818rtc.h x86: Simplify __HAVE_ARCH_CMPXCHG tests 2014-07-11 17:28:51 -07:00
mce.h x86/mce: Add infrastructure to support Local MCE 2015-06-07 15:33:14 +02:00
microcode_amd.h x86/microcode: Correct CPU family related variable types 2015-06-07 15:38:15 +02:00
microcode_intel.h x86/microcode/intel: Rename get_matching_sig() 2015-05-18 09:32:37 +02:00
microcode.h x86/microcode: Parse built-in microcode early 2015-05-06 11:24:53 +02:00
misc.h
mmconfig.h
mmu_context.h x86, perf: Fix static_key bug in load_mm_cr4() 2015-07-10 10:24:38 +02:00
mmu.h perf/x86: Only allow rdpmc if a perf_event is mapped 2015-02-04 12:10:47 +01:00
mmx.h
mmzone_32.h
mmzone_64.h
mmzone.h
module.h
mpspec_def.h
mpspec.h x86, apic: Remove mps_oem_check callback 2014-07-31 08:05:42 -07:00
mpx.h x86/mpx: Support 32-bit binaries on 64-bit kernels 2015-06-09 12:24:34 +02:00
mshyperv.h
msi.h x86/MSI: Use hierarchical irqdomains to manage MSI interrupts 2015-04-24 15:36:49 +02:00
msidef.h
msr-index.h x86/uapi: Do not export <asm/msr-index.h> as part of the user API headers 2015-06-07 15:36:04 +02:00
msr.h Merge branch 'x86/asm' into x86/core, to prepare for new patch 2015-06-08 20:48:20 +02:00
mtrr.h x86/mm/mtrr: Avoid #ifdeffery with phys_wc_to_mtrr_index() 2015-05-27 14:41:00 +02:00
mutex_32.h x86: Simplify __HAVE_ARCH_CMPXCHG tests 2014-07-11 17:28:51 -07:00
mutex_64.h
mutex.h
mwait.h sched/idle/x86: Restore mwait_idle() to fix boot hangs, to improve power savings and to improve performance 2015-03-16 11:14:21 +01:00
nmi.h
nops.h
numa_32.h
numa.h x86/mm/numa: Drop dead code and rename setup_node_data() to setup_alloc_data() 2014-09-16 08:55:10 +02:00
olpc_ofw.h
olpc.h
page_32_types.h x86_64, traps: Stop using IST for #SS 2014-11-23 13:56:19 -08:00
page_32.h
page_64_types.h kasan: enable stack instrumentation 2015-02-13 21:21:41 -08:00
page_64.h x86_64,vsyscall: Make vsyscall emulation configurable 2014-11-03 21:44:57 +01:00
page_types.h x86, mm: support huge I/O mapping capability I/F 2015-04-14 16:49:04 -07:00
page.h arm64,ia64,ppc,s390,sh,tile,um,x86,mm: remove default gate area 2014-08-08 15:57:27 -07:00
paravirt_types.h Merge branch 'locking/core' into x86/core, to prepare for dependent patch 2015-06-03 10:07:35 +02:00
paravirt.h locking/pvqspinlock: Rename QUEUED_SPINLOCK to QUEUED_SPINLOCKS 2015-05-11 09:52:09 +02:00
parport.h
pat.h x86/mm/pat: Emulate PAT when it is disabled 2015-06-07 15:28:52 +02:00
pci_64.h
pci_x86.h Revert "x86/PCI: Refine the way to release PCI IRQ resources" 2015-03-20 14:56:19 +01:00
pci-direct.h
pci-functions.h
pci.h Merge branch 'for-4.2/sg' of git://git.kernel.dk/linux-block 2015-06-25 15:22:36 -07:00
percpu.h x86-64: Use RIP-relative addressing for most per-CPU accesses 2014-11-04 20:43:14 +01:00
perf_event_p4.h percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t 2014-08-28 08:58:57 -04:00
perf_event.h perf/x86/amd/ibs: Update IBS MSRs and feature definitions 2014-11-12 15:12:32 +01:00
pgalloc.h x86: expose number of page table levels on Kconfig level 2015-04-14 16:49:02 -07:00
pgtable_32_types.h x86: mm: Re-use the early_ioremap fixed area 2014-11-03 13:40:44 +01:00
pgtable_32.h x86: Remove set_pmd_pfn 2014-09-01 10:15:31 +02:00
pgtable_64_types.h x86: expose number of page table levels on Kconfig level 2015-04-14 16:49:02 -07:00
pgtable_64.h mm: remove remaining references to NUMA hinting bits and helpers 2015-02-12 18:54:08 -08:00
pgtable_types.h x86/mm/pat: Add pgprot_writethrough() 2015-06-07 15:28:58 +02:00
pgtable-2level_types.h x86: expose number of page table levels on Kconfig level 2015-04-14 16:49:02 -07:00
pgtable-2level.h x86: drop _PAGE_FILE and pte_file()-related helpers 2015-02-10 14:30:33 -08:00
pgtable-3level_types.h x86: expose number of page table levels on Kconfig level 2015-04-14 16:49:02 -07:00
pgtable-3level.h x86: drop _PAGE_FILE and pte_file()-related helpers 2015-02-10 14:30:33 -08:00
pgtable.h mm: clarify that the function operates on hugepage pte 2015-06-24 17:49:44 -07:00
platform_sst_audio.h ASoC: Intel: mrfld: Define sst_res_info for acpi 2014-10-27 18:02:38 +00:00
pm-trace.h PM / sleep: add pm-trace support for suspending phase 2015-03-18 15:54:27 +01:00
pmc_atom.h x86: pmc_atom: Expose contents of PSS 2015-01-20 12:50:14 +01:00
pmem.h nd_blk: change aperture mapping from WC to WB 2015-08-27 19:38:28 -04:00
posix_types.h
preempt.h preempt: Use preempt_schedule_context() as the official tracing preemption point 2015-06-07 15:57:42 +02:00
probe_roms.h
processor-cyrix.h
processor-flags.h
processor.h x86/fpu, sched: Dynamically allocate 'struct fpu' 2015-07-18 03:42:35 +02:00
prom.h
proto.h x86/asm/entry: Untangle 'system_call' into two entry points: entry_SYSCALL_64 and entry_INT80_32 2015-06-08 09:14:21 +02:00
ptrace.h x86/asm/entry/32: Really make user_mode() work correctly for VM86 mode 2015-05-29 09:46:40 +02:00
pvclock-abi.h x86: kvmclock: add flag to indicate pvclock counts from zero 2015-05-29 14:01:39 +02:00
pvclock.h x86: kvmclock: drop rdtsc_barrier() 2015-05-07 11:29:48 +02:00
qrwlock.h
qspinlock_paravirt.h locking/pvqspinlock, x86: Implement the paravirt qspinlock call patching 2015-05-08 12:37:09 +02:00
qspinlock.h locking/pvqspinlock, x86: Implement the paravirt qspinlock call patching 2015-05-08 12:37:09 +02:00
realmode.h
reboot_fixups.h
reboot.h
required-features.h
rio.h
rmwcc.h
rtc.h
rwsem.h
seccomp.h x86: switch to using asm-generic for seccomp.h 2015-04-17 09:04:10 -04:00
sections.h
segment.h x86/asm/irq: Stop relying on magic JMP behavior for early_idt_handlers 2015-06-02 09:39:40 +02:00
serial.h serial: 8250: remove Kconfig indirection 2015-05-06 22:27:00 +02:00
setup_arch.h
setup.h x86: kaslr: fix build due to missing ALIGN definition 2015-04-29 21:54:54 +02:00
shmparam.h
sigcontext.h x86/signal/64: Remove 'fs' and 'gs' from sigcontext 2015-03-17 09:25:26 +01:00
sigframe.h
sighandling.h x86/signal: Remove pax argument from restore_sigcontext 2015-04-06 09:06:39 +02:00
signal.h
simd.h x86/fpu: Rename i387.h to fpu/api.h 2015-05-19 15:47:30 +02:00
smap.h x86/smap: Use ALTERNATIVE macro 2015-02-23 13:44:14 +01:00
smp.h x86: Remove cpu_sibling_mask() and cpu_core_mask() 2015-05-27 15:22:17 +02:00
sparsemem.h
special_insns.h x86/mm: Add kerneldoc comments for pcommit_sfence() 2015-05-11 10:38:44 +02:00
spinlock_types.h locking/pvqspinlock: Rename QUEUED_SPINLOCK to QUEUED_SPINLOCKS 2015-05-11 09:52:09 +02:00
spinlock.h locking/pvqspinlock: Rename QUEUED_SPINLOCK to QUEUED_SPINLOCKS 2015-05-11 09:52:09 +02:00
sta2x11.h
stackprotector.h x86/fpu: Move various internal function prototypes to fpu/internal.h 2015-05-19 15:47:48 +02:00
stacktrace.h
string_32.h
string_64.h x86_64: kasan: add interceptors for memset/memmove/memcpy functions 2015-02-13 21:21:41 -08:00
string.h
suspend_32.h x86/fpu: Rename i387.h to fpu/api.h 2015-05-19 15:47:30 +02:00
suspend_64.h x86/fpu: Rename i387.h to fpu/api.h 2015-05-19 15:47:30 +02:00
suspend.h
svm.h
swiotlb.h
switch_to.h sched/x86_64: Don't save flags on context switch 2014-10-28 11:11:30 +01:00
sync_bitops.h
sys_ia32.h
syscall.h
syscalls.h
sysfb.h
tce.h
thread_info.h x86/entry: Define 'cpu_current_top_of_stack' for 64-bit code 2015-05-08 13:50:02 +02:00
time.h
timer.h
timex.h
tlb.h
tlbflush.h x86: Store a per-cpu shadow copy of CR4 2015-02-04 12:10:42 +01:00
topology.h Merge branch 'x86-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-06-22 17:59:09 -07:00
trace_clock.h
traps.h x86/mce/amd: Introduce deferred error interrupt handler 2015-05-07 10:23:32 +02:00
tsc.h
uaccess_32.h Merge branch 'x86-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-06-22 17:59:09 -07:00
uaccess_64.h x86: clean up/fix 'copy_in_user()' tail zeroing 2015-04-08 14:28:45 -07:00
uaccess.h mm/uaccess, mm/fault: Clarify that uaccess may only sleep if pagefaults are enabled 2015-05-19 08:39:14 +02:00
unaligned.h
unistd.h
uprobes.h
user32.h
user_32.h
user_64.h
user.h x86/fpu: Rename xsave.header::xstate_bv to 'xfeatures' 2015-05-19 15:47:35 +02:00
vdso.h x86, vdso: Move the vvar area before the vdso text 2014-07-11 16:57:51 -07:00
vga.h x86, ia64: Move EFI_FB vga_default_device() initialization to pci_vga_fixup() 2014-07-10 16:48:48 -06:00
vgtod.h x86, vdso: Use asm volatile in __getcpu 2014-12-23 13:05:30 -08:00
virtext.h x86: Store a per-cpu shadow copy of CR4 2015-02-04 12:10:42 +01:00
vm86.h
vmx.h KVM: VMX: Add PML support in VMX 2015-01-30 09:39:54 +01:00
vsyscall.h x86_64,vsyscall: Make vsyscall emulation configurable 2014-11-03 21:44:57 +01:00
vvar.h x86,vdso: Use LSL unconditionally for vgetcpu 2014-11-03 13:41:53 +01:00
word-at-a-time.h
x2apic.h
x86_init.h x86: Remove more unmodified io_apic_ops 2015-04-24 15:36:54 +02:00
xor_32.h x86/fpu: Rename i387.h to fpu/api.h 2015-05-19 15:47:30 +02:00
xor_64.h
xor_avx.h x86/fpu: Rename i387.h to fpu/api.h 2015-05-19 15:47:30 +02:00
xor.h x86/fpu: Rename i387.h to fpu/api.h 2015-05-19 15:47:30 +02:00