linux/lib
Anton Blanchard e269b08517 iommu: inline iommu_num_pages
A profile of a network benchmark showed iommu_num_pages rather high up:

     0.52%  iommu_num_pages

Looking at the profile, an integer divide is taking almost all of the time:

      %
         :      c000000000376ea4 <.iommu_num_pages>:
    1.93 :      c000000000376ea4:       fb e1 ff f8     std     r31,-8(r1)
    0.00 :      c000000000376ea8:       f8 21 ff c1     stdu    r1,-64(r1)
    0.00 :      c000000000376eac:       7c 3f 0b 78     mr      r31,r1
    3.86 :      c000000000376eb0:       38 84 ff ff     addi    r4,r4,-1
    0.00 :      c000000000376eb4:       38 05 ff ff     addi    r0,r5,-1
    0.00 :      c000000000376eb8:       7c 84 2a 14     add     r4,r4,r5
   46.95 :      c000000000376ebc:       7c 00 18 38     and     r0,r0,r3
   45.66 :      c000000000376ec0:       7c 84 02 14     add     r4,r4,r0
    0.00 :      c000000000376ec4:       7c 64 2b 92     divdu   r3,r4,r5
    0.00 :      c000000000376ec8:       38 3f 00 40     addi    r1,r31,64
    0.00 :      c000000000376ecc:       eb e1 ff f8     ld      r31,-8(r1)
    1.61 :      c000000000376ed0:       4e 80 00 20     blr

Since every caller of iommu_num_pages passes in a constant power of two
for the I/O page size, we can inline the function so that the compiler
replaces the divide with a shift. The entire function is only a few
instructions once optimised, so it is a good candidate for inlining
overall.
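
The idea can be sketched in plain C. This is a minimal userspace
illustration, not the patch itself: the parameter names, the local
DIV_ROUND_UP macro and the self-check helper are assumptions made for
this example. When the function is inlined and the page size is a
compile-time constant power of two, an optimising compiler can turn the
division into a right shift.

```c
#include <assert.h>

/* Local stand-in for a round-up division helper (assumed for this sketch). */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Number of I/O pages touched by the range [addr, addr + len).
 * io_page_size is assumed to be a power of two. Because the function
 * is static inline, a constant io_page_size at the call site lets the
 * compiler replace the divide with a shift.
 */
static inline unsigned long iommu_num_pages(unsigned long addr,
					    unsigned long len,
					    unsigned long io_page_size)
{
	/* Offset of addr within its page, plus the length of the range. */
	unsigned long size = (addr & (io_page_size - 1)) + len;

	return DIV_ROUND_UP(size, io_page_size);
}

/* Hypothetical self-check showing typical calls with a constant page size. */
static inline void iommu_num_pages_selfcheck(void)
{
	/* Aligned, exactly one page long: one page. */
	assert(iommu_num_pages(0x1000UL, 4096UL, 4096UL) == 1);
	/* 32 bytes starting 16 bytes before a page boundary: two pages. */
	assert(iommu_num_pages(0x1ff0UL, 0x20UL, 4096UL) == 2);
}
```

If a caller passed a page size that is not known at compile time, the
code would still be correct but the divide would remain.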

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-08-09 20:45:05 -07:00
..
lzo lib: add support for LZO-compressed kernels 2010-01-11 09:34:04 -08:00
reed_solomon
zlib_deflate
zlib_inflate inflate_fast: sout is already a short so ptr arith was off by one. 2010-03-12 15:52:44 -08:00
.gitignore
argv_split.c tree-wide: convert open calls to remove spaces to skip_spaces() lib function 2009-12-15 08:53:32 -08:00
atomic64_test.c ARM: 6213/1: atomic64_test: add ARM as supported architecture 2010-07-27 10:43:46 +01:00
atomic64.c lib: Fix atomic64_add_unless return value convention 2010-03-01 11:38:46 -08:00
audit.c
bcd.c
bitmap.c Revert "cpusets: randomize node rotor used in cpuset_mem_spread_node()" 2010-05-30 09:00:03 -07:00
bitrev.c
btree.c lib/btree: fix possible NULL pointer dereference 2010-05-15 12:48:10 -07:00
bug.c panic: Allow warnings to set different taint flags 2010-05-19 08:36:48 +01:00
bust_spinlocks.c
check_signature.c
checksum.c
cmdline.c
cpu-notifier-error-inject.c fault-injection: add CPU notifier error injection module 2010-05-27 09:12:48 -07:00
cpumask.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
crc7.c
crc16.c
crc32.c revert "crc32: use __BYTE_ORDER macro for endian detection" 2010-05-26 08:19:23 -07:00
crc32defs.h
crc-ccitt.c
crc-itu-t.c
crc-t10dif.c
ctype.c ctype: constify read-only _ctype string 2009-12-15 08:53:32 -08:00
debug_locks.c rcu: Introduce lockdep-based checking to RCU read-side primitives 2010-02-25 09:40:59 +01:00
debugobjects.c Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2010-05-18 08:17:58 -07:00
dec_and_lock.c
decompress_bunzip2.c bzip2: Add missing checks for malloc returning NULL 2009-12-15 14:04:19 -08:00
decompress_inflate.c
decompress_unlzma.c
decompress_unlzo.c lib: fix the use of LZO to decompress initramfs images 2010-04-24 11:31:25 -07:00
decompress.c Add LZO compression support for initramfs and old-style initrd 2010-01-11 09:34:05 -08:00
devres.c lib/devres.c: fix comment typo 2010-07-11 22:16:32 +02:00
div64.c
dma-debug.c dma-debug: Cleanup for copy-loop in filter_write() 2010-04-07 14:36:27 +02:00
dump_stack.c
dynamic_debug.c module: initialize module dynamic debug later 2010-07-04 20:17:22 -07:00
extable.c
fault-inject.c
find_last_bit.c
find_next_bit.c
flex_array.c flex_array: fix the panic when calling flex_array_alloc() without __GFP_ZERO 2010-04-24 11:31:24 -07:00
gcd.c
gen_crc32table.c crc32: major optimization 2010-05-25 08:07:06 -07:00
genalloc.c genalloc: fix allocation from end of pool 2010-06-29 15:29:30 -07:00
halfmd4.c
hexdump.c lib: introduce common method to convert hex digits 2010-05-25 08:07:05 -07:00
hweight.c x86: Add optimized popcnt variants 2010-04-06 15:52:11 -07:00
idr.c idr: fix RCU lockdep splat in idr_get_next() 2010-06-23 06:50:45 -07:00
inflate.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
int_sqrt.c
iomap_copy.c
iomap.c
iommu-helper.c iommu: inline iommu_num_pages 2010-08-09 20:45:05 -07:00
ioremap.c x86, ioremap: Fix incorrect physical address handling in PAE mode 2010-07-09 11:42:03 -07:00
irq_regs.c
is_single_threaded.c
kasprintf.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
Kconfig lmb: rename to memblock 2010-07-14 17:14:00 +10:00
Kconfig.debug Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6 2010-08-06 11:36:30 -07:00
Kconfig.kgdb mips,kgdb: kdb low level trap catch and stack trace 2010-05-20 21:04:26 -05:00
Kconfig.kmemcheck
kernel_lock.c bkl: Fixup core_lock fallout 2009-12-14 23:55:33 +01:00
klist.c
kobject_uevent.c kobject: free memory if netlink_kernel_create() fails 2010-06-04 13:27:52 -07:00
kobject.c sysfs: Comment sysfs directory tagging logic 2010-05-21 09:37:31 -07:00
kref.c kref: remove kref_set 2010-05-21 09:37:29 -07:00
lcm.c block: Fix overrun in lcm() and move it to lib 2010-03-15 12:47:59 +01:00
libcrc32c.c
list_debug.c
list_sort.c lib: revise list_sort() header comment 2010-03-06 11:26:35 -08:00
locking-selftest-hardirq.h
locking-selftest-mutex.h
locking-selftest-rlock-hardirq.h
locking-selftest-rlock-softirq.h
locking-selftest-rlock.h
locking-selftest-rsem.h
locking-selftest-softirq.h
locking-selftest-spin-hardirq.h
locking-selftest-spin-softirq.h
locking-selftest-spin.h
locking-selftest-wlock-hardirq.h
locking-selftest-wlock-softirq.h
locking-selftest-wlock.h
locking-selftest-wsem.h
locking-selftest.c
lru_cache.c
Makefile lmb: rename to memblock 2010-07-14 17:14:00 +10:00
nlattr.c
parser.c parser: remove unnecessary strlen() 2009-12-15 08:53:33 -08:00
percpu_counter.c tmpfs: add accurate compare function to percpu_counter library 2010-08-09 20:44:58 -07:00
plist.c plist: Make plist debugging raw_spinlock aware 2009-12-14 23:55:33 +01:00
prio_heap.c
prio_tree.c
proportions.c
radix-tree.c radix-tree: omplement function radix_tree_range_tag_if_tagged 2010-08-09 20:44:59 -07:00
random32.c Merge branch 'master' into for-next 2010-06-16 18:08:13 +02:00
ratelimit.c ratelimit: fix the return value when __ratelimit() fails to acquire the lock 2010-04-07 08:38:04 -07:00
rational.c lib/rational.c needs module.h 2010-01-11 09:34:05 -08:00
rbtree.c rbtree: Undo augmented trees performance damage and regression 2010-07-05 14:43:50 +02:00
reciprocal_div.c
rwsem-spinlock.c rwsem generic spinlock: use IRQ save/restore spinlocks 2010-04-07 16:15:05 -07:00
rwsem.c rwsem: Test for no active locks in __rwsem_do_wake undo code 2010-05-12 18:23:34 -07:00
scatterlist.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
sha1.c
show_mem.c mm: use the same log level for show_mem() 2010-03-06 11:26:27 -08:00
smp_processor_id.c
sort.c
spinlock_debug.c locking: Further name space cleanups 2009-12-14 23:55:33 +01:00
string_helpers.c
string.c lib/string.c: simplify strnstr() 2010-03-06 11:26:35 -08:00
swiotlb.c swiotlb: Make swiotlb bookkeeping functions visible in the header file. 2010-06-07 11:59:27 -04:00
syscall.c
textsearch.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
ts_bm.c
ts_fsm.c
ts_kmp.c
uuid.c Unified UUID/GUID definition 2010-05-19 22:40:47 -04:00
vsprintf.c vsprintf: Recursive vsnprintf: Add "%pV", struct va_format 2010-07-04 10:40:17 -07:00