AArch64 is capable of 128-bit memory accesses without alignment restrictions, which makes it both possible and highly practical to slurp up a typical 20-byte IP header in just 2 loads. Implement our own version of ip_fast_checksum() to take advantage of that, resulting in considerably fewer instructions and memory accesses than the generic version. We can also get more optimal code generation for csum_fold() by defining it a slightly different way round from the generic version, so throw that into the mix too. Suggested-by: Luke Starrett <luke.starrett@broadcom.com> Acked-by: Luke Starrett <luke.starrett@broadcom.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> |
||
---|---|---|
.. | ||
boot | ||
configs | ||
crypto | ||
include | ||
kernel | ||
kvm | ||
lib | ||
mm | ||
net | ||
xen | ||
Kconfig | ||
Kconfig.debug | ||
Kconfig.platforms | ||
Makefile |