mirror of
https://github.com/torvalds/linux.git
synced 2024-11-15 08:31:55 +00:00
cf8fb5533f
I blame Mikey for this. He elevated my slightly dubious testcase: to benchmark status. And naturally we need to be number 1 at creating zeros. So lets improve __clear_user some more. As Paul suggests we can use dcbz for large lengths. This patch gets the destination cacheline aligned then uses dcbz on whole cachelines. Before: 10485760000 bytes (10 GB) copied, 0.414744 s, 25.3 GB/s After: 10485760000 bytes (10 GB) copied, 0.268597 s, 39.0 GB/s 39 GB/s, a new record. Signed-off-by: Anton Blanchard <anton@samba.org> Tested-by: Olof Johansson <olof@lixom.net> Acked-by: Olof Johansson <olof@lixom.net> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> |
||
---|---|---|
.. | ||
alloc.c | ||
checksum_32.S | ||
checksum_64.S | ||
checksum_wrappers_64.c | ||
code-patching.c | ||
copy_32.S | ||
copypage_64.S | ||
copypage_power7.S | ||
copyuser_64.S | ||
copyuser_power7.S | ||
crtsavres.S | ||
devres.c | ||
div64.S | ||
feature-fixups-test.S | ||
feature-fixups.c | ||
hweight_64.S | ||
ldstfp.S | ||
locks.c | ||
Makefile | ||
mem_64.S | ||
memcpy_64.S | ||
memcpy_power7.S | ||
rheap.c | ||
sstep.c | ||
string_64.S | ||
string.S | ||
usercopy_64.c | ||
vmx-helper.c |