linux/arch/x86/mm
Thomas Gleixner ea72ce5da2 x86/kaslr: Expose and use the end of the physical memory address space
iounmap() on x86 occasionally fails to unmap because the provided valid
ioremap address is not below high_memory. It turned out that this
happens due to KASLR.

KASLR uses the full address space between PAGE_OFFSET and vaddr_end to
randomize the starting points of the direct map, vmalloc and vmemmap
regions.  It thereby limits the size of the direct map by using the
installed memory size plus an extra configurable margin for hot-plug
memory.  This limitation is done to gain more randomization space
because otherwise only the holes between the direct map, vmalloc,
vmemmap and vaddr_end would be usable for randomizing.

The limited direct map size is not exposed to the rest of the kernel, so
the memory hot-plug and resource management related code paths still
operate under the assumption that the available address space can be
determined with MAX_PHYSMEM_BITS.

request_free_mem_region() allocates from (1 << MAX_PHYSMEM_BITS) - 1
downwards.  That means the first allocation happens past the end of the
direct map and if unlucky this address is in the vmalloc space, which
causes high_memory to become greater than VMALLOC_START and consequently
causes iounmap() to fail for valid ioremap addresses.

MAX_PHYSMEM_BITS cannot be changed for that because the randomization
does not align with address bit boundaries and there are other places
which actually require to know the maximum number of address bits.  All
remaining usage sites of MAX_PHYSMEM_BITS have been analyzed and found
to be correct.

Cure this by exposing the end of the direct map via PHYSMEM_END and use
that for the memory hot-plug and resource management related places
instead of relying on MAX_PHYSMEM_BITS. In the KASLR case PHYSMEM_END
maps to a variable which is initialized by the KASLR initialization and
otherwise it is based on MAX_PHYSMEM_BITS as before.

To prevent future hickups add a check into add_pages() to catch callers
trying to add memory above PHYSMEM_END.

Fixes: 0483e1fa6e ("x86/mm: Implement ASLR for kernel memory regions")
Reported-by: Max Ramanouski <max8rr8@gmail.com>
Reported-by: Alistair Popple <apopple@nvidia.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-By: Max Ramanouski <max8rr8@gmail.com>
Tested-by: Alistair Popple <apopple@nvidia.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Reviewed-by: Kees Cook <kees@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/87ed6soy3z.ffs@tglx
2024-08-20 13:44:57 +02:00
..
pat - 875fa64577da ("mm/hugetlb_vmemmap: fix race with speculative PFN 2024-07-21 17:15:46 -07:00
amdtopology.c x86/mm/numa: Move early mptable evaluation into common code 2024-02-15 22:07:41 +01:00
cpu_entry_area.c x86/mm: Do not shuffle CPU entry areas without KASLR 2023-03-22 10:42:47 -07:00
debug_pagetables.c x86/bugs: Rename CONFIG_PAGE_TABLE_ISOLATION => CONFIG_MITIGATION_PAGE_TABLE_ISOLATION 2024-01-10 10:52:28 +01:00
dump_pagetables.c - Sumanth Korikkar has taught s390 to allocate hotplug-time page frames 2024-03-14 17:43:30 -07:00
extable.c x86/extable: Remove unused fixup type EX_TYPE_COPY 2024-04-04 17:01:40 +02:00
fault.c The usual shower of singleton fixes and minor series all over MM, 2024-05-19 09:21:03 -07:00
highmem_32.c x86/mm: Include asm/numa.h for set_highmem_pages_init() 2023-05-18 11:56:18 -07:00
hugetlbpage.c treewide: use initializer for struct vm_unmapped_area_info 2024-04-25 20:56:27 -07:00
ident_map.c x86/mm: Introduce kernel_ident_mapping_free() 2024-06-17 17:46:22 +02:00
init_32.c mm/treewide: replace pmd_large() with pmd_leaf() 2024-03-06 13:04:19 -08:00
init_64.c x86/kaslr: Expose and use the end of the physical memory address space 2024-08-20 13:44:57 +02:00
init.c The usual shower of singleton fixes and minor series all over MM, 2024-05-19 09:21:03 -07:00
iomap_32.c
ioremap.c x86/ioremap: Add hypervisor callback for private MMIO mapping in coco VM 2023-03-26 23:42:40 +02:00
kasan_init_64.c mm/treewide: replace pud_large() with pud_leaf() 2024-03-06 13:04:19 -08:00
kaslr.c x86/kaslr: Expose and use the end of the physical memory address space 2024-08-20 13:44:57 +02:00
kmmio.c x86/mm/kmmio: Remove redundant preempt_disable() 2022-12-12 10:54:48 -05:00
kmsan_shadow.c x86: kmsan: handle CPU entry area 2022-10-03 14:03:26 -07:00
maccess.c x86/mm: Disallow vsyscall page read for copy_from_kernel_nofault() 2024-02-15 19:21:39 -08:00
Makefile kbuild: use $(src) instead of $(srctree)/$(src) for source directory 2024-05-10 04:34:52 +09:00
mem_encrypt_amd.c - Add support for running the kernel in a SEV-SNP guest, over a Secure 2024-07-16 11:12:25 -07:00
mem_encrypt_boot.S x86/mm: Remove P*D_PAGE_MASK and P*D_PAGE_SIZE macros 2022-12-15 10:37:27 -08:00
mem_encrypt_identity.c - Sumanth Korikkar has taught s390 to allocate hotplug-time page frames 2024-03-14 17:43:30 -07:00
mem_encrypt.c x86/sev: Add callback to apply RMP table fixups for kexec 2024-04-29 11:21:09 +02:00
mm_internal.h
mmap.c mm: switch mm->get_unmapped_area() to a flag 2024-04-25 20:56:25 -07:00
mmio-mod.c x86: Replace cpumask_weight() with cpumask_empty() where appropriate 2022-04-10 22:35:38 +02:00
numa_32.c fix missing vmalloc.h includes 2024-04-25 20:55:49 -07:00
numa_64.c
numa_emulation.c x86/mm: Replace nodes_weight() with nodes_empty() where appropriate 2022-04-10 22:35:38 +02:00
numa_internal.h
numa.c x86/mm/numa: Use NUMA_NO_NODE when calling memblock_set_node() 2024-06-06 22:20:39 +03:00
pf_in.c
pf_in.h
pgprot.c x86/mm: move protection_map[] inside the platform 2022-07-17 17:14:38 -07:00
pgtable_32.c
pgtable.c minmax: add a few more MIN_T/MAX_T users 2024-07-28 13:41:14 -07:00
physaddr.c
physaddr.h
pkeys.c x86/pkeys: Clarify PKRU_AD_KEY macro 2022-06-07 16:06:33 -07:00
pti.c x86/mm: Fix PTI for i386 some more 2024-08-07 15:35:01 +02:00
srat.c x86/apic: Wrap APIC ID validation into an inline 2023-08-09 11:58:30 -07:00
testmmiotrace.c
tlb.c - Sumanth Korikkar has taught s390 to allocate hotplug-time page frames 2024-03-14 17:43:30 -07:00