linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-10 14:11:52 +00:00


David Hildenbrand d7f861b9c4 mm/mmu_gather: add __tlb_remove_folio_pages()

Add __tlb_remove_folio_pages(), which will remove multiple consecutive pages that belong to the same large folio, instead of only a single page. We'll be using this function when optimizing unmapping/zapping of large folios that are mapped by PTEs.

We're using the remaining spare bit in an encoded_page to indicate that the next encoded page in an array actually contains the shifted "nr_pages". Teach the swap/freeing code about putting multiple folio references, and the delayed rmap handling about removing page ranges of a folio.

This extension still allows for gathering almost as many small folios as we used to (-1, because we have to prepare for a possibly bigger next entry), while also allowing for gathering consecutive pages that belong to the same large folio.

Note that we don't pass the folio pointer, because it is not required for now. Further, we don't support page_size != PAGE_SIZE; it won't be required for simple PTE batching.

We have to provide a separate s390 implementation, but it's fairly straightforward.

Another, more invasive and likely more expensive, approach would be to use folio+range or a PFN range instead of page+nr_pages. But we should do that consistently for the whole mmu_gather. For now, let's keep it simple and add "nr_pages" only.

Note that it is now possible to gather significantly more pages: in the past, we were able to gather ~10000 pages; now we can also gather ~5000 folio fragments that span multiple pages. A folio fragment on x86-64 can span up to 512 pages (2 MiB THP) and on arm64 with 64k pages in theory 8192 pages (512 MiB THP). Gathering more memory is not considered something we should worry about, especially because these are already corner cases.

While we can gather more total memory, we won't free more folio fragments. As long as page freeing time primarily only depends on the number of involved folios, there is no effective change for !preempt configurations. However, we'll adjust tlb_batch_pages_flush() separately to handle corner cases where page freeing time grows proportionally with the actual memory size.

Link: https://lkml.kernel.org/r/20240214204435.167852-9-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yin Fengwei <fengwei.yin@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

2024-02-22 15:27:17 -08:00
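The encoding described in the commit message above can be illustrated with a small userspace sketch. This is not the kernel code: every name below (sketch_page, sketch_batch, sketch_gather, sketch_total_pages, the SKETCH_* constants) is hypothetical, the batch size is arbitrary, and the sketch only assumes that page pointers are aligned so their low bits are free. A low flag bit in one slot marks that the following slot holds a shifted page count, and the batch always keeps room for a two-slot entry, which is where the "-1" for small folios comes from.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names here are hypothetical and only illustrate the idea. */
#define SKETCH_FLAG_MASK      ((uintptr_t)3)  /* low bits free due to alignment */
#define SKETCH_NR_PAGES_NEXT  ((uintptr_t)2)  /* next slot holds a page count */
#define SKETCH_NR_SHIFT       2               /* keep the count clear of flag bits */
#define SKETCH_BATCH_MAX      8               /* arbitrary size for the sketch */

/* Stand-in for a page descriptor; aligned so the low pointer bits are zero. */
struct sketch_page { _Alignas(8) long dummy; };

struct sketch_batch {
    size_t nr;                          /* number of slots in use */
    uintptr_t slots[SKETCH_BATCH_MAX];  /* encoded pointers and counts */
};

/* Gather nr_pages consecutive pages starting at page.
 * Returns 1 on success, 0 if the batch has no room left. */
static int sketch_gather(struct sketch_batch *b, struct sketch_page *page,
                         unsigned long nr_pages)
{
    /* Always keep room for a two-slot entry: this is the "-1" cost. */
    if (b->nr + 2 > SKETCH_BATCH_MAX)
        return 0;
    assert(((uintptr_t)page & SKETCH_FLAG_MASK) == 0);

    if (nr_pages == 1) {
        /* A single page takes one slot, with no flag set. */
        b->slots[b->nr++] = (uintptr_t)page;
    } else {
        /* Flag the pointer, then store the shifted count in the next slot. */
        b->slots[b->nr++] = (uintptr_t)page | SKETCH_NR_PAGES_NEXT;
        b->slots[b->nr++] = (uintptr_t)nr_pages << SKETCH_NR_SHIFT;
    }
    return 1;
}

/* Walk the batch and count how many pages it covers in total. */
static unsigned long sketch_total_pages(const struct sketch_batch *b)
{
    unsigned long total = 0;

    for (size_t i = 0; i < b->nr; i++) {
        if (b->slots[i] & SKETCH_NR_PAGES_NEXT)
            total += (unsigned long)(b->slots[++i] >> SKETCH_NR_SHIFT);
        else
            total += 1;
    }
    return total;
}
```

In this sketch, gathering one single page and one 512-page fragment uses three slots and covers 513 pages, and a batch of 8 slots accepts at most 7 single pages.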
damon	mm/damon/sysfs: handle 'state' file inputs for every sampling interval if possible	2024-02-22 10:24:55 -08:00
kasan	kasan/test: avoid gcc warning for intentional overflow	2024-02-22 10:24:58 -08:00
kfence	KFENCE: cleanup kfence_guarded_alloc() after CONFIG_SLAB removal	2023-12-05 11:17:58 +01:00
kmsan	mm: kmsan: remove runtime checks from kmsan_unpoison_memory()	2024-02-22 10:24:41 -08:00
backing-dev.c	blk-wbt: Fix detection of dirty-throttled tasks	2024-02-06 09:44:03 -07:00
balloon_compaction.c
bootmem_info.c	bootmem: use kmemleak_free_part_phys in put_page_bootmem	2023-10-25 16:47:13 -07:00
cma_debug.c
cma_sysfs.c	mm/cma: add sysfs file 'release_pages_success'	2024-02-22 10:24:57 -08:00
cma.c	mm/cma: add sysfs file 'release_pages_success'	2024-02-22 10:24:57 -08:00
cma.h	mm/cma: add sysfs file 'release_pages_success'	2024-02-22 10:24:57 -08:00
compaction.c	mm: compaction: limit the suitable target page order to be less than cc->order	2024-02-22 15:27:16 -08:00
debug_page_alloc.c	mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER	2024-01-08 15:27:15 -08:00
debug_page_ref.c
debug_vm_pgtable.c	mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER	2024-01-08 15:27:15 -08:00
debug.c	mm: update validate_mm() to use vma iterator	2023-06-09 16:25:31 -07:00
dmapool_test.c
dmapool.c	mm/mempool/dmapool: remove CONFIG_DEBUG_SLAB ifdefs	2023-12-05 11:17:58 +01:00
early_ioremap.c	mm/early_ioremap.c: improve the execution efficiency of early_ioremap_setup()	2023-06-09 16:25:56 -07:00
fadvise.c	mm: remove unnecessary pagevec includes	2023-06-23 16:59:31 -07:00
fail_page_alloc.c	mm: page_alloc: split out FAIL_PAGE_ALLOC	2023-06-09 16:25:23 -07:00
failslab.c
filemap.c	mm: add pfn_swap_entry_folio()	2024-02-21 16:00:03 -08:00
folio-compat.c	mm: remove page_add_new_anon_rmap and lru_cache_add_inactive_or_unevictable	2023-12-29 11:58:27 -08:00
gup_test.c	Merge mm-hotfixes-stable into mm-stable to pick up depended-upon changes.	2023-06-23 16:58:19 -07:00
gup_test.h
gup.c	mm: convert page_try_share_anon_rmap() to folio_try_share_anon_rmap_[pte|pmd]()	2023-12-29 11:58:56 -08:00
highmem.c	x86/kexec: use pr_err() instead of kexec_dprintk() when an error occurs	2023-12-29 12:22:28 -08:00
hmm.c	mm: enable page walking API to lock vmas during the walk	2023-08-21 13:07:20 -07:00
huge_memory.c	userfaultfd: handle zeropage moves by UFFDIO_MOVE	2024-02-22 10:24:48 -08:00
hugetlb_cgroup.c	mm, hugetlb: remove HUGETLB_CGROUP_MIN_ORDER	2023-10-18 14:34:17 -07:00
hugetlb_vmemmap.c	mm: hugetlb_vmemmap: move mmap lock to vmemmap_remap_range()	2023-12-12 10:57:08 -08:00
hugetlb_vmemmap.h	mm: hugetlb_vmemmap: fix reference to nonexistent file	2023-10-25 16:47:14 -07:00
hugetlb.c	mm/hugetlb: move page order check inside hugetlb_cma_reserve()	2024-02-22 10:24:59 -08:00
hwpoison-inject.c
init-mm.c	mm: Deprecate pasid field	2023-12-12 10:11:32 +01:00
internal.h	mm/mmap: introduce vma_set_range()	2024-02-22 10:24:40 -08:00
interval_tree.c
io-mapping.c
ioremap.c	mm: ioremap: remove unneeded ioremap_allowed and iounmap_allowed	2023-08-18 10:12:36 -07:00
Kconfig	mm/zswap: only support zswap_exclusive_loads_enabled	2024-02-22 10:24:54 -08:00
Kconfig.debug	mm/slab: remove CONFIG_SLAB from all Kconfig and Makefile	2023-12-05 11:14:40 +01:00
khugepaged.c	mm: convert mm_counter_file() to take a folio	2024-02-21 16:00:04 -08:00
kmemleak.c	kmemleak: avoid RCU stalls when freeing metadata for per-CPU pointers	2023-12-12 10:57:07 -08:00
ksm.c	mm: convert page_try_share_anon_rmap() to folio_try_share_anon_rmap_[pte|pmd]()	2023-12-29 11:58:56 -08:00
list_lru.c	mm/zswap: stop lru list shrinking when encounter warm region	2024-02-22 10:24:54 -08:00
maccess.c
madvise.c	mm/madvise: don't forget to leave lazy MMU mode in madvise_cold_or_pageout_pte_range()	2024-02-07 21:20:35 -08:00
Makefile	mm/slab: remove CONFIG_SLAB from all Kconfig and Makefile	2023-12-05 11:14:40 +01:00
mapping_dirty_helpers.c	mm: fix clean_record_shared_mapping_range kernel-doc	2023-08-24 16:20:30 -07:00
memblock.c	mm/memblock: add MEMBLOCK_RSRV_NOINIT into flagname[] array	2024-02-20 14:20:49 -08:00
memcontrol.c	mm: memcg: use larger batches for proactive reclaim	2024-02-22 10:24:52 -08:00
memfd.c	memfd: drop warning for missing exec-related flags	2023-10-04 10:32:22 -07:00
memory_hotplug.c	mm/memory_hotplug: export mhp_supports_memmap_on_memory()	2024-02-22 10:24:40 -08:00
memory-failure.c	mm/memory-failure: fix crash in split_huge_page_to_list from soft_offline_page	2024-02-07 21:20:34 -08:00
memory-tiers.c	mm/demotion: print demotion targets	2024-02-22 10:24:55 -08:00
memory.c	mm/memory: factor out zapping folio pte into zap_present_folio_pte()	2024-02-22 15:27:17 -08:00
mempolicy.c	mm/mempolicy: protect task interleave functions with tsk->mems_allowed_seq	2024-02-22 10:24:47 -08:00
mempool.c	Many singleton patches against the MM code. The patch series which	2024-01-09 11:18:47 -08:00
memremap.c	mm: remove stale example from comment	2023-12-29 11:58:26 -08:00
memtest.c	mm: memtest: convert to memtest_report_meminfo()	2023-08-21 13:37:47 -07:00
migrate_device.c	mm: convert page_try_share_anon_rmap() to folio_try_share_anon_rmap_[pte|pmd]()	2023-12-29 11:58:56 -08:00
migrate.c	mm/migrate: preserve exact soft-dirty state	2024-02-22 10:24:55 -08:00
mincore.c	mm: enable page walking API to lock vmas during the walk	2023-08-21 13:07:20 -07:00
mlock.c	mm: mlock: avoid folio_within_range() on KSM pages	2023-10-25 16:47:14 -07:00
mm_init.c	efi: disable mirror feature during crashkernel	2024-01-12 15:20:47 -08:00
mm_slot.h
mmap_lock.c
mmap.c	mm/mmap: pass vma to vma_merge()	2024-02-22 10:24:52 -08:00
mmu_gather.c	mm/mmu_gather: add __tlb_remove_folio_pages()	2024-02-22 15:27:17 -08:00
mmu_notifier.c	mmu_notifiers: rename invalidate_range notifier	2023-08-18 10:12:41 -07:00
mmzone.c	zswap: shrink zswap pool based on memory pressure	2023-12-12 10:57:02 -08:00
mprotect.c	mprotect: use pfn_swap_entry_folio	2024-02-21 16:00:03 -08:00
mremap.c	mm: abstract VMA merge and extend into vma_merge_extend() helper	2023-10-18 14:34:18 -07:00
msync.c
nommu.c	Many singleton patches against the MM code. The patch series which are	2023-11-02 19:38:47 -10:00
oom_kill.c	mm, oom:dump_tasks add rss detailed information printing	2023-12-10 16:51:53 -08:00
page_alloc.c	mm and cache_info: remove unnecessary CPU cache info update	2024-02-22 10:24:41 -08:00
page_counter.c
page_ext.c	mm/page_ext: move functions around for minor cleanups to page_ext	2023-08-18 10:12:31 -07:00
page_idle.c
page_io.c	zswap: memcontrol: implement zswap writeback disabling	2023-12-29 20:22:11 -08:00
page_isolation.c	mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER	2024-01-08 15:27:15 -08:00
page_owner.c	mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER	2024-01-08 15:27:15 -08:00
page_poison.c	mm/page_poison: replace kmap_atomic() with kmap_local_page()	2023-12-10 16:51:50 -08:00
page_reporting.c	mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER	2024-01-08 15:27:15 -08:00
page_reporting.h
page_table_check.c	mm: convert page_table_check_pte_set() to page_table_check_ptes_set()	2023-08-24 16:20:18 -07:00
page_vma_mapped.c	mm: thp: introduce multi-size THP sysfs interface	2023-12-20 14:48:12 -08:00
page-writeback.c	block-6.8-2024-02-10	2024-02-10 08:02:48 -08:00
pagewalk.c	mm: pagewalk: assert write mmap lock only for walking the user page tables	2023-12-10 16:51:53 -08:00
percpu-internal.h	percpu-internal/pcpu_chunk: re-layout pcpu_chunk structure to reduce false sharing	2023-06-19 16:19:29 -07:00
percpu-km.c
percpu-stats.c
percpu-vm.c
percpu.c	mm: Introduce flush_cache_vmap_early()	2023-12-14 00:23:17 -08:00
pgalloc-track.h
pgtable-generic.c	mm/pgtable: notes on pte_offset_map[_lock]()	2023-08-18 10:12:25 -07:00
process_vm_access.c	mm: fix process_vm_rw page counts	2023-12-10 16:51:39 -08:00
ptdump.c	mm: ptdump: add check_wx_pages debugfs attribute	2024-02-22 10:24:47 -08:00
readahead.c	readahead: use ilog2 instead of a while loop in page_cache_ra_order()	2024-02-22 10:24:38 -08:00
rmap.c	mm: convert mm_counter_file() to take a folio	2024-02-21 16:00:04 -08:00
rodata_test.c
secretmem.c	mm/secretmem: use a folio in secretmem_fault()	2023-08-21 13:38:02 -07:00
shmem_quota.c	shmem: Add default quota limit mount options	2023-08-09 09:15:40 +02:00
shmem.c	header cleanups for 6.8	2024-01-10 16:43:55 -08:00
show_mem.c	mm, treewide: introduce NR_PAGE_ORDERS	2024-01-08 15:27:15 -08:00
shrinker_debug.c	mm: shrinker: convert shrinker_rwsem to mutex	2023-10-04 10:32:26 -07:00
shrinker.c	mm: shrinker: use kvzalloc_node() from expand_one_shrinker_info()	2024-01-05 09:58:32 -08:00
shuffle.c
shuffle.h	mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER	2024-01-08 15:27:15 -08:00
slab_common.c	slub: use a folio in __kmalloc_large_node	2024-01-05 10:17:46 -08:00
slab.h	mm/slab: move kmalloc() functions from slab_common.c to slub.c	2023-12-06 11:57:21 +01:00
slub.c	Many singleton patches against the MM code. The patch series which	2024-01-09 11:18:47 -08:00
sparse-vmemmap.c	mm/vmemmap: allow architectures to override how vmemmap optimization works	2023-08-18 10:12:53 -07:00
sparse.c	mm/memory_hotplug: introduce MEM_PREPARE_ONLINE/MEM_FINISH_OFFLINE notifiers	2024-02-21 16:00:01 -08:00
swap_cgroup.c
swap_slots.c	mm/zswap: invalidate zswap entry when swap entry free	2024-02-22 10:24:54 -08:00
swap_state.c	mm/mmu_gather: add __tlb_remove_folio_pages()	2024-02-22 15:27:17 -08:00
swap.c	mm/mmu_gather: add __tlb_remove_folio_pages()	2024-02-22 15:27:17 -08:00
swap.h	mm/swap: fix race when skipping swapcache	2024-02-20 14:20:48 -08:00
swapfile.c	mm/zswap: invalidate zswap entry when swap entry free	2024-02-22 10:24:54 -08:00
truncate.c	fs: convert error_remove_page to error_remove_folio	2023-12-10 16:51:42 -08:00
usercopy.c
userfaultfd.c	userfaultfd: handle zeropage moves by UFFDIO_MOVE	2024-02-22 10:24:48 -08:00
util.c	mm/util: use kmap_local_page() in memcmp_pages()	2023-12-10 16:51:49 -08:00
vmalloc.c	mm/vmalloc: fix the unchecked dereference warning in vread_iter()	2023-11-01 12:38:35 -07:00
vmpressure.c	eventfd: simplify eventfd_signal()	2023-11-28 14:08:38 +01:00
vmscan.c	mm/mglru: improve swappiness handling	2024-02-22 10:24:58 -08:00
vmstat.c	mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER	2024-01-08 15:27:15 -08:00
workingset.c	mm: ratelimit stat flush from workingset shrinker	2024-01-05 10:17:45 -08:00
z3fold.c	mm/z3fold: remove obsolete comment for struct z3fold_pool	2023-08-21 13:37:51 -07:00
zbud.c	mm: zswap: remove shrink from zpool interface	2023-06-19 16:19:27 -07:00
zpool.c	mm: zswap: remove shrink from zpool interface	2023-06-19 16:19:27 -07:00
zsmalloc.c	mm: zsmalloc: return -ENOSPC rather than -EINVAL in zs_malloc while size is too large	2024-01-05 10:17:47 -08:00
zswap.c	mm/zswap: optimize and cleanup the invalidation of duplicate entry	2024-02-22 10:24:57 -08:00