Merge branch 'akpm' (patches from Andrew)

Merge first patch-bomb from Andrew Morton:

 - A few hotfixes which missed 4.4 because I was asleep.  cc'ed to
   -stable

 - A few misc fixes

 - OCFS2 updates

 - Part of MM.  Including pretty large changes to page-flags handling
   and to thp management which have been buffered up for 2-3 cycles now.

  I have a lot of MM material this time.

[ It turns out the THP part wasn't quite ready, so that got dropped from
  this series  - Linus ]

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (117 commits)
  zsmalloc: reorganize struct size_class to pack 4 bytes hole
  mm/zbud.c: use list_last_entry() instead of list_tail_entry()
  zram/zcomp: do not zero out zcomp private pages
  zram: pass gfp from zcomp frontend to backend
  zram: try vmalloc() after kmalloc()
  zram/zcomp: use GFP_NOIO to allocate streams
  mm: add tracepoint for scanning pages
  drivers/base/memory.c: fix kernel warning during memory hotplug on ppc64
  mm/page_isolation: use macro to judge the alignment
  mm: fix noisy sparse warning in LIBCFS_ALLOC_PRE()
  mm: rework virtual memory accounting
  include/linux/memblock.h: fix ordering of 'flags' argument in comments
  mm: move lru_to_page to mm_inline.h
  Documentation/filesystems: describe the shared memory usage/accounting
  memory-hotplug: don't BUG() in register_memory_resource()
  hugetlb: make mm and fs code explicitly non-modular
  mm/swapfile.c: use list_for_each_entry_safe in free_swap_count_continuations
  mm: /proc/pid/clear_refs: no need to clear VM_SOFTDIRTY in clear_soft_dirty_pmd()
  mm: make sure isolate_lru_page() is never called for tail page
  vmstat: make vmstat_updater deferrable again and shut down on idle
  ...
Linus Torvalds 2016-01-15 11:41:44 -08:00
commit 875fc4f5dd
176 changed files with 1853 additions and 1285 deletions


@@ -169,6 +169,9 @@ read the file /proc/PID/status:
  VmLck:         0 kB
  VmHWM:       476 kB
  VmRSS:       476 kB
+ RssAnon:     352 kB
+ RssFile:     120 kB
+ RssShmem:      4 kB
  VmData:      156 kB
  VmStk:        88 kB
  VmExe:        68 kB
@@ -231,14 +234,20 @@ Table 1-2: Contents of the status files (as of 4.1)
  VmSize                      total program size
  VmLck                       locked memory size
  VmHWM                       peak resident set size ("high water mark")
- VmRSS                       size of memory portions
+ VmRSS                       size of memory portions. It contains the three
+                             following parts (VmRSS = RssAnon + RssFile + RssShmem)
+ RssAnon                     size of resident anonymous memory
+ RssFile                     size of resident file mappings
+ RssShmem                    size of resident shmem memory (includes SysV shm,
+                             mapping of tmpfs and shared anonymous mappings)
  VmData                      size of data, stack, and text segments
  VmStk                       size of data, stack, and text segments
  VmExe                       size of text segment
  VmLib                       size of shared library code
  VmPTE                       size of page table entries
  VmPMD                       size of second level page tables
- VmSwap                      size of swap usage (the number of referred swapents)
+ VmSwap                      amount of swap used by anonymous private data
+                             (shmem swap usage is not included)
  HugetlbPages                size of hugetlb memory portions
  Threads                     number of threads
  SigQ                        number of signals queued/max. number for queue
@@ -265,7 +274,8 @@ Table 1-3: Contents of the statm files (as of 2.6.8-rc3)
  Field    Content
  size     total program size (pages)            (same as VmSize in status)
  resident size of memory portions (pages)       (same as VmRSS in status)
- shared   number of pages that are shared       (i.e. backed by a file)
+ shared   number of pages that are shared       (i.e. backed by a file, same
+          as RssFile+RssShmem in status)
  trs      number of pages that are 'code'       (not including libs; broken,
                                                  includes data segment)
  lrs      number of pages of library            (always 0 on 2.6)
@@ -459,7 +469,10 @@ and a page is modified, the file page is replaced by a private anonymous copy.
 hugetlbfs page which is *not* counted in "RSS" or "PSS" field for historical
 reasons. And these are not included in {Shared,Private}_{Clean,Dirty} field.
 "Swap" shows how much would-be-anonymous memory is also used, but out on swap.
-"SwapPss" shows proportional swap share of this mapping.
+For shmem mappings, "Swap" includes also the size of the mapped (and not
+replaced by copy-on-write) part of the underlying shmem object out on swap.
+"SwapPss" shows proportional swap share of this mapping. Unlike "Swap", this
+does not take into account swapped out page of underlying shmem objects.
 "Locked" indicates whether the mapping is locked in memory or not.
 
 "VmFlags" field deserves a separate description. This member represents the kernel
@@ -842,6 +855,7 @@ Dirty:               968 kB
 Writeback:             0 kB
 AnonPages:        861800 kB
 Mapped:           280372 kB
+Shmem:               644 kB
 Slab:             284364 kB
 SReclaimable:     159856 kB
 SUnreclaim:       124508 kB
@@ -898,6 +912,7 @@ MemAvailable: An estimate of how much memory is available for starting new
     AnonPages: Non-file backed pages mapped into userspace page tables
 AnonHugePages: Non-file backed huge pages mapped into userspace page tables
        Mapped: files which have been mmaped, such as libraries
+        Shmem: Total memory used by shared memory (shmem) and tmpfs
          Slab: in-kernel data structures cache
  SReclaimable: Part of Slab, that might be reclaimed, such as caches
    SUnreclaim: Part of Slab, that cannot be reclaimed on memory pressure
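
For illustration only (not part of this series): a minimal userspace sketch that reads the new RssAnon/RssFile/RssShmem fields documented above from /proc/self/status, so their sum can be compared against VmRSS. The helper name is made up; it assumes a kernel that already carries these patches.

#include <stdio.h>
#include <string.h>

/* hypothetical helper: return a "<key> <value> kB" field in kB, or -1 */
static long field_kb(const char *buf, const char *key)
{
	const char *p = strstr(buf, key);
	long val = -1;

	if (p)
		sscanf(p + strlen(key), " %ld", &val);
	return val;
}

int main(void)
{
	char buf[8192];
	size_t n;
	FILE *f = fopen("/proc/self/status", "r");

	if (!f)
		return 1;
	n = fread(buf, 1, sizeof(buf) - 1, f);
	buf[n] = '\0';
	fclose(f);

	printf("VmRSS    = %ld kB\n", field_kb(buf, "VmRSS:"));
	printf("RssAnon  = %ld kB\n", field_kb(buf, "RssAnon:"));
	printf("RssFile  = %ld kB\n", field_kb(buf, "RssFile:"));
	printf("RssShmem = %ld kB\n", field_kb(buf, "RssShmem:"));
	return 0;
}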


@@ -17,10 +17,10 @@ RAM, where you have to create an ordinary filesystem on top. Ramdisks
 cannot swap and you do not have the possibility to resize them.
 
 Since tmpfs lives completely in the page cache and on swap, all tmpfs
-pages currently in memory will show up as cached. It will not show up
-as shared or something like that. Further on you can check the actual
-RAM+swap use of a tmpfs instance with df(1) and du(1).
+pages will be shown as "Shmem" in /proc/meminfo and "Shared" in
+free(1). Notice that these counters also include shared memory
+(shmem, see ipcs(1)). The most reliable way to get the count is
+using df(1) and du(1).
 
 tmpfs has the following uses:


@@ -608,6 +608,10 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
			cut the overhead, others just disable the usage. So
			only cgroup_disable=memory is actually worthy}
 
+	cgroup.memory=	[KNL] Pass options to the cgroup memory controller.
+			Format: <string>
+			nosocket -- Disable socket memory accounting.
+
	checkreqprot	[SELINUX] Set initial checkreqprot flag value.
			Format: { "0" | "1" }
			See security/selinux/Kconfig help text.


@@ -42,6 +42,8 @@ Currently, these files are in /proc/sys/vm:
 - min_slab_ratio
 - min_unmapped_ratio
 - mmap_min_addr
+- mmap_rnd_bits
+- mmap_rnd_compat_bits
 - nr_hugepages
 - nr_overcommit_hugepages
 - nr_trim_pages		(only if CONFIG_MMU=n)
@@ -485,6 +487,33 @@ against future potential kernel bugs.
 
 ==============================================================
 
+mmap_rnd_bits:
+
+This value can be used to select the number of bits to use to
+determine the random offset to the base address of vma regions
+resulting from mmap allocations on architectures which support
+tuning address space randomization. This value will be bounded
+by the architecture's minimum and maximum supported values.
+
+This value can be changed after boot using the
+/proc/sys/vm/mmap_rnd_bits tunable
+
+==============================================================
+
+mmap_rnd_compat_bits:
+
+This value can be used to select the number of bits to use to
+determine the random offset to the base address of vma regions
+resulting from mmap allocations for applications run in
+compatibility mode on architectures which support tuning address
+space randomization. This value will be bounded by the
+architecture's minimum and maximum supported values.
+
+This value can be changed after boot using the
+/proc/sys/vm/mmap_rnd_compat_bits tunable
+
+==============================================================
+
 nr_hugepages
 
 Change the minimum size of the hugepage pool.
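
The two mmap_rnd tunables documented above are plain sysctl files, so a minimal sketch of reading the current setting from userspace (for illustration only, assuming a kernel built with HAVE_ARCH_MMAP_RND_BITS and this series applied) could look like:

#include <stdio.h>

int main(void)
{
	FILE *f = fopen("/proc/sys/vm/mmap_rnd_bits", "r");
	int bits;

	if (!f)
		return 1;
	if (fscanf(f, "%d", &bits) != 1) {
		fclose(f);
		return 1;
	}
	fclose(f);
	printf("mmap base randomization: %d bits\n", bits);
	return 0;
}

Writing a new value works the same way (as root, echo a number into /proc/sys/vm/mmap_rnd_bits); the kernel clamps it to the per-architecture minimum and maximum described above.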


@ -511,6 +511,74 @@ config ARCH_HAS_ELF_RANDOMIZE
- arch_mmap_rnd() - arch_mmap_rnd()
- arch_randomize_brk() - arch_randomize_brk()
config HAVE_ARCH_MMAP_RND_BITS
bool
help
An arch should select this symbol if it supports setting a variable
number of bits for use in establishing the base address for mmap
allocations, has MMU enabled and provides values for both:
- ARCH_MMAP_RND_BITS_MIN
- ARCH_MMAP_RND_BITS_MAX
config ARCH_MMAP_RND_BITS_MIN
int
config ARCH_MMAP_RND_BITS_MAX
int
config ARCH_MMAP_RND_BITS_DEFAULT
int
config ARCH_MMAP_RND_BITS
int "Number of bits to use for ASLR of mmap base address" if EXPERT
range ARCH_MMAP_RND_BITS_MIN ARCH_MMAP_RND_BITS_MAX
default ARCH_MMAP_RND_BITS_DEFAULT if ARCH_MMAP_RND_BITS_DEFAULT
default ARCH_MMAP_RND_BITS_MIN
depends on HAVE_ARCH_MMAP_RND_BITS
help
This value can be used to select the number of bits to use to
determine the random offset to the base address of vma regions
resulting from mmap allocations. This value will be bounded
by the architecture's minimum and maximum supported values.
This value can be changed after boot using the
/proc/sys/vm/mmap_rnd_bits tunable
config HAVE_ARCH_MMAP_RND_COMPAT_BITS
bool
help
An arch should select this symbol if it supports running applications
in compatibility mode, supports setting a variable number of bits for
use in establishing the base address for mmap allocations, has MMU
enabled and provides values for both:
- ARCH_MMAP_RND_COMPAT_BITS_MIN
- ARCH_MMAP_RND_COMPAT_BITS_MAX
config ARCH_MMAP_RND_COMPAT_BITS_MIN
int
config ARCH_MMAP_RND_COMPAT_BITS_MAX
int
config ARCH_MMAP_RND_COMPAT_BITS_DEFAULT
int
config ARCH_MMAP_RND_COMPAT_BITS
int "Number of bits to use for ASLR of mmap base address for compatible applications" if EXPERT
range ARCH_MMAP_RND_COMPAT_BITS_MIN ARCH_MMAP_RND_COMPAT_BITS_MAX
default ARCH_MMAP_RND_COMPAT_BITS_DEFAULT if ARCH_MMAP_RND_COMPAT_BITS_DEFAULT
default ARCH_MMAP_RND_COMPAT_BITS_MIN
depends on HAVE_ARCH_MMAP_RND_COMPAT_BITS
help
This value can be used to select the number of bits to use to
determine the random offset to the base address of vma regions
resulting from mmap allocations for compatible applications This
value will be bounded by the architecture's minimum and maximum
supported values.
This value can be changed after boot using the
/proc/sys/vm/mmap_rnd_compat_bits tunable
config HAVE_COPY_THREAD_TLS config HAVE_COPY_THREAD_TLS
bool bool
help help


@ -37,6 +37,7 @@ config ARM
select HAVE_ARCH_BITREVERSE if (CPU_32v7M || CPU_32v7) && !CPU_32v6 select HAVE_ARCH_BITREVERSE if (CPU_32v7M || CPU_32v7) && !CPU_32v6
select HAVE_ARCH_JUMP_LABEL if !XIP_KERNEL && !CPU_ENDIAN_BE32 && MMU select HAVE_ARCH_JUMP_LABEL if !XIP_KERNEL && !CPU_ENDIAN_BE32 && MMU
select HAVE_ARCH_KGDB if !CPU_ENDIAN_BE32 && MMU select HAVE_ARCH_KGDB if !CPU_ENDIAN_BE32 && MMU
select HAVE_ARCH_MMAP_RND_BITS if MMU
select HAVE_ARCH_SECCOMP_FILTER if (AEABI && !OABI_COMPAT) select HAVE_ARCH_SECCOMP_FILTER if (AEABI && !OABI_COMPAT)
select HAVE_ARCH_TRACEHOOK select HAVE_ARCH_TRACEHOOK
select HAVE_ARM_SMCCC if CPU_V7 select HAVE_ARM_SMCCC if CPU_V7
@ -311,6 +312,14 @@ config MMU
Select if you want MMU-based virtualised addressing space Select if you want MMU-based virtualised addressing space
support by paged memory management. If unsure, say 'Y'. support by paged memory management. If unsure, say 'Y'.
config ARCH_MMAP_RND_BITS_MIN
default 8
config ARCH_MMAP_RND_BITS_MAX
default 14 if PAGE_OFFSET=0x40000000
default 15 if PAGE_OFFSET=0x80000000
default 16
# #
# The "ARM system type" choice list is ordered alphabetically by option # The "ARM system type" choice list is ordered alphabetically by option
# text. Please add new entries in the option alphabetic order. # text. Please add new entries in the option alphabetic order.


@@ -173,8 +173,7 @@ unsigned long arch_mmap_rnd(void)
 {
	unsigned long rnd;
 
-	/* 8 bits of randomness in 20 address space bits */
-	rnd = (unsigned long)get_random_int() % (1 << 8);
+	rnd = (unsigned long)get_random_int() & ((1 << mmap_rnd_bits) - 1);
 
	return rnd << PAGE_SHIFT;
 }


@ -52,6 +52,8 @@ config ARM64
select HAVE_ARCH_JUMP_LABEL select HAVE_ARCH_JUMP_LABEL
select HAVE_ARCH_KASAN if SPARSEMEM_VMEMMAP && !(ARM64_16K_PAGES && ARM64_VA_BITS_48) select HAVE_ARCH_KASAN if SPARSEMEM_VMEMMAP && !(ARM64_16K_PAGES && ARM64_VA_BITS_48)
select HAVE_ARCH_KGDB select HAVE_ARCH_KGDB
select HAVE_ARCH_MMAP_RND_BITS
select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT
select HAVE_ARCH_SECCOMP_FILTER select HAVE_ARCH_SECCOMP_FILTER
select HAVE_ARCH_TRACEHOOK select HAVE_ARCH_TRACEHOOK
select HAVE_BPF_JIT select HAVE_BPF_JIT
@ -107,6 +109,33 @@ config ARCH_PHYS_ADDR_T_64BIT
config MMU config MMU
def_bool y def_bool y
config ARCH_MMAP_RND_BITS_MIN
default 14 if ARM64_64K_PAGES
default 16 if ARM64_16K_PAGES
default 18
# max bits determined by the following formula:
# VA_BITS - PAGE_SHIFT - 3
config ARCH_MMAP_RND_BITS_MAX
default 19 if ARM64_VA_BITS=36
default 24 if ARM64_VA_BITS=39
default 27 if ARM64_VA_BITS=42
default 30 if ARM64_VA_BITS=47
default 29 if ARM64_VA_BITS=48 && ARM64_64K_PAGES
default 31 if ARM64_VA_BITS=48 && ARM64_16K_PAGES
default 33 if ARM64_VA_BITS=48
default 14 if ARM64_64K_PAGES
default 16 if ARM64_16K_PAGES
default 18
config ARCH_MMAP_RND_COMPAT_BITS_MIN
default 7 if ARM64_64K_PAGES
default 9 if ARM64_16K_PAGES
default 11
config ARCH_MMAP_RND_COMPAT_BITS_MAX
default 16
config NO_IOPORT_MAP config NO_IOPORT_MAP
def_bool y if !PCI def_bool y if !PCI
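
A quick check of the "VA_BITS - PAGE_SHIFT - 3" formula noted in the hunk above (arithmetic added for illustration, not from the patch): with 4K pages (PAGE_SHIFT = 12) and ARM64_VA_BITS=39 the maximum is 39 - 12 - 3 = 24, and with 64K pages (PAGE_SHIFT = 16) and ARM64_VA_BITS=48 it is 48 - 16 - 3 = 29, both matching the defaults listed here.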


@@ -51,8 +51,12 @@ unsigned long arch_mmap_rnd(void)
 {
	unsigned long rnd;
 
-	rnd = (unsigned long)get_random_int() & STACK_RND_MASK;
+#ifdef CONFIG_COMPAT
+	if (test_thread_flag(TIF_32BIT))
+		rnd = (unsigned long)get_random_int() & ((1 << mmap_rnd_compat_bits) - 1);
+	else
+#endif
+	rnd = (unsigned long)get_random_int() & ((1 << mmap_rnd_bits) - 1);
 
	return rnd << PAGE_SHIFT;
 }


@ -2332,8 +2332,7 @@ pfm_smpl_buffer_alloc(struct task_struct *task, struct file *filp, pfm_context_t
*/ */
insert_vm_struct(mm, vma); insert_vm_struct(mm, vma);
vm_stat_account(vma->vm_mm, vma->vm_flags, vma->vm_file, vm_stat_account(vma->vm_mm, vma->vm_flags, vma_pages(vma));
vma_pages(vma));
up_write(&task->mm->mmap_sem); up_write(&task->mm->mmap_sem);
/* /*


@ -81,7 +81,10 @@ static struct resource code_resource = {
}; };
unsigned long memory_start; unsigned long memory_start;
EXPORT_SYMBOL(memory_start);
unsigned long memory_end; unsigned long memory_end;
EXPORT_SYMBOL(memory_end);
void __init setup_arch(char **); void __init setup_arch(char **);
int get_cpuinfo(char *); int get_cpuinfo(char *);


@ -767,7 +767,7 @@ static int __init spufs_init(void)
ret = -ENOMEM; ret = -ENOMEM;
spufs_inode_cache = kmem_cache_create("spufs_inode_cache", spufs_inode_cache = kmem_cache_create("spufs_inode_cache",
sizeof(struct spufs_inode_info), 0, sizeof(struct spufs_inode_info), 0,
SLAB_HWCACHE_ALIGN, spufs_init_once); SLAB_HWCACHE_ALIGN|SLAB_ACCOUNT, spufs_init_once);
if (!spufs_inode_cache) if (!spufs_inode_cache)
goto out; goto out;


@ -603,10 +603,7 @@ static void gmap_zap_swap_entry(swp_entry_t entry, struct mm_struct *mm)
else if (is_migration_entry(entry)) { else if (is_migration_entry(entry)) {
struct page *page = migration_entry_to_page(entry); struct page *page = migration_entry_to_page(entry);
if (PageAnon(page)) dec_mm_counter(mm, mm_counter(page));
dec_mm_counter(mm, MM_ANONPAGES);
else
dec_mm_counter(mm, MM_FILEPAGES);
} }
free_swap_and_cache(entry); free_swap_and_cache(entry);
} }


@ -83,6 +83,8 @@ config X86
select HAVE_ARCH_KASAN if X86_64 && SPARSEMEM_VMEMMAP select HAVE_ARCH_KASAN if X86_64 && SPARSEMEM_VMEMMAP
select HAVE_ARCH_KGDB select HAVE_ARCH_KGDB
select HAVE_ARCH_KMEMCHECK select HAVE_ARCH_KMEMCHECK
select HAVE_ARCH_MMAP_RND_BITS if MMU
select HAVE_ARCH_MMAP_RND_COMPAT_BITS if MMU && COMPAT
select HAVE_ARCH_SECCOMP_FILTER select HAVE_ARCH_SECCOMP_FILTER
select HAVE_ARCH_SOFT_DIRTY if X86_64 select HAVE_ARCH_SOFT_DIRTY if X86_64
select HAVE_ARCH_TRACEHOOK select HAVE_ARCH_TRACEHOOK
@ -184,6 +186,20 @@ config HAVE_LATENCYTOP_SUPPORT
config MMU config MMU
def_bool y def_bool y
config ARCH_MMAP_RND_BITS_MIN
default 28 if 64BIT
default 8
config ARCH_MMAP_RND_BITS_MAX
default 32 if 64BIT
default 16
config ARCH_MMAP_RND_COMPAT_BITS_MIN
default 8
config ARCH_MMAP_RND_COMPAT_BITS_MAX
default 16
config SBUS config SBUS
bool bool


@@ -69,14 +69,14 @@ unsigned long arch_mmap_rnd(void)
 {
	unsigned long rnd;
 
-	/*
-	 * 8 bits of randomness in 32bit mmaps, 20 address space bits
-	 * 28 bits of randomness in 64bit mmaps, 40 address space bits
-	 */
	if (mmap_is_ia32())
-		rnd = (unsigned long)get_random_int() % (1<<8);
+#ifdef CONFIG_COMPAT
+		rnd = (unsigned long)get_random_int() & ((1 << mmap_rnd_compat_bits) - 1);
+#else
+		rnd = (unsigned long)get_random_int() & ((1 << mmap_rnd_bits) - 1);
+#endif
	else
-		rnd = (unsigned long)get_random_int() % (1<<28);
+		rnd = (unsigned long)get_random_int() & ((1 << mmap_rnd_bits) - 1);
 
	return rnd << PAGE_SHIFT;
 }
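
To see what the mask arch_mmap_rnd() now uses amounts to, a small userspace sketch (for illustration only, not part of the patch; assumes 4 KiB pages) evaluates the same expression for a few bit counts:

#include <stdio.h>

#define PAGE_SHIFT 12	/* assumed 4 KiB pages */

int main(void)
{
	/* 8 and 28 are the old hard-coded x86 values, 32 the new 64-bit maximum */
	int bits[] = { 8, 28, 32 };
	unsigned int i;

	for (i = 0; i < sizeof(bits) / sizeof(bits[0]); i++) {
		unsigned long long max_off =
			((1ULL << bits[i]) - 1) << PAGE_SHIFT;
		printf("%2d bits -> mmap base offset up to %llu KiB\n",
		       bits[i], max_off >> 10);
	}
	return 0;
}

With 28 bits, for example, the base can move by up to (2^28 - 1) pages, roughly 1 TiB of address space on 64-bit.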


@ -450,8 +450,7 @@ memory_probe_store(struct device *dev, struct device_attribute *attr,
const char *buf, size_t count) const char *buf, size_t count)
{ {
u64 phys_addr; u64 phys_addr;
int nid; int nid, ret;
int i, ret;
unsigned long pages_per_block = PAGES_PER_SECTION * sections_per_block; unsigned long pages_per_block = PAGES_PER_SECTION * sections_per_block;
ret = kstrtoull(buf, 0, &phys_addr); ret = kstrtoull(buf, 0, &phys_addr);
@ -461,15 +460,12 @@ memory_probe_store(struct device *dev, struct device_attribute *attr,
if (phys_addr & ((pages_per_block << PAGE_SHIFT) - 1)) if (phys_addr & ((pages_per_block << PAGE_SHIFT) - 1))
return -EINVAL; return -EINVAL;
for (i = 0; i < sections_per_block; i++) { nid = memory_add_physaddr_to_nid(phys_addr);
nid = memory_add_physaddr_to_nid(phys_addr); ret = add_memory(nid, phys_addr,
ret = add_memory(nid, phys_addr, MIN_MEMORY_BLOCK_SIZE * sections_per_block);
PAGES_PER_SECTION << PAGE_SHIFT);
if (ret)
goto out;
phys_addr += MIN_MEMORY_BLOCK_SIZE; if (ret)
} goto out;
ret = count; ret = count;
out: out:
@ -618,7 +614,6 @@ static int init_memory_block(struct memory_block **memory,
base_memory_block_id(scn_nr) * sections_per_block; base_memory_block_id(scn_nr) * sections_per_block;
mem->end_section_nr = mem->start_section_nr + sections_per_block - 1; mem->end_section_nr = mem->start_section_nr + sections_per_block - 1;
mem->state = state; mem->state = state;
mem->section_count++;
start_pfn = section_nr_to_pfn(mem->start_section_nr); start_pfn = section_nr_to_pfn(mem->start_section_nr);
mem->phys_device = arch_get_memory_phys_device(start_pfn); mem->phys_device = arch_get_memory_phys_device(start_pfn);
@ -672,6 +667,7 @@ int register_new_memory(int nid, struct mem_section *section)
ret = init_memory_block(&mem, section, MEM_OFFLINE); ret = init_memory_block(&mem, section, MEM_OFFLINE);
if (ret) if (ret)
goto out; goto out;
mem->section_count++;
} }
if (mem->section_count == sections_per_block) if (mem->section_count == sections_per_block)
@ -692,7 +688,7 @@ unregister_memory(struct memory_block *memory)
device_unregister(&memory->dev); device_unregister(&memory->dev);
} }
static int remove_memory_block(unsigned long node_id, static int remove_memory_section(unsigned long node_id,
struct mem_section *section, int phys_device) struct mem_section *section, int phys_device)
{ {
struct memory_block *mem; struct memory_block *mem;
@ -716,7 +712,7 @@ int unregister_memory_section(struct mem_section *section)
if (!present_section(section)) if (!present_section(section))
return -EINVAL; return -EINVAL;
return remove_memory_block(0, section, 0); return remove_memory_section(0, section, 0);
} }
#endif /* CONFIG_MEMORY_HOTREMOVE */ #endif /* CONFIG_MEMORY_HOTREMOVE */


@ -74,18 +74,18 @@ static void zcomp_strm_free(struct zcomp *comp, struct zcomp_strm *zstrm)
* allocate new zcomp_strm structure with ->private initialized by * allocate new zcomp_strm structure with ->private initialized by
* backend, return NULL on error * backend, return NULL on error
*/ */
static struct zcomp_strm *zcomp_strm_alloc(struct zcomp *comp) static struct zcomp_strm *zcomp_strm_alloc(struct zcomp *comp, gfp_t flags)
{ {
struct zcomp_strm *zstrm = kmalloc(sizeof(*zstrm), GFP_KERNEL); struct zcomp_strm *zstrm = kmalloc(sizeof(*zstrm), flags);
if (!zstrm) if (!zstrm)
return NULL; return NULL;
zstrm->private = comp->backend->create(); zstrm->private = comp->backend->create(flags);
/* /*
* allocate 2 pages. 1 for compressed data, plus 1 extra for the * allocate 2 pages. 1 for compressed data, plus 1 extra for the
* case when compressed size is larger than the original one * case when compressed size is larger than the original one
*/ */
zstrm->buffer = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, 1); zstrm->buffer = (void *)__get_free_pages(flags | __GFP_ZERO, 1);
if (!zstrm->private || !zstrm->buffer) { if (!zstrm->private || !zstrm->buffer) {
zcomp_strm_free(comp, zstrm); zcomp_strm_free(comp, zstrm);
zstrm = NULL; zstrm = NULL;
@ -120,8 +120,16 @@ static struct zcomp_strm *zcomp_strm_multi_find(struct zcomp *comp)
/* allocate new zstrm stream */ /* allocate new zstrm stream */
zs->avail_strm++; zs->avail_strm++;
spin_unlock(&zs->strm_lock); spin_unlock(&zs->strm_lock);
/*
zstrm = zcomp_strm_alloc(comp); * This function can be called in swapout/fs write path
* so we can't use GFP_FS|IO. And it assumes we already
* have at least one stream in zram initialization so we
* don't do best effort to allocate more stream in here.
* A default stream will work well without further multiple
* streams. That's why we use NORETRY | NOWARN.
*/
zstrm = zcomp_strm_alloc(comp, GFP_NOIO | __GFP_NORETRY |
__GFP_NOWARN);
if (!zstrm) { if (!zstrm) {
spin_lock(&zs->strm_lock); spin_lock(&zs->strm_lock);
zs->avail_strm--; zs->avail_strm--;
@ -209,7 +217,7 @@ static int zcomp_strm_multi_create(struct zcomp *comp, int max_strm)
zs->max_strm = max_strm; zs->max_strm = max_strm;
zs->avail_strm = 1; zs->avail_strm = 1;
zstrm = zcomp_strm_alloc(comp); zstrm = zcomp_strm_alloc(comp, GFP_KERNEL);
if (!zstrm) { if (!zstrm) {
kfree(zs); kfree(zs);
return -ENOMEM; return -ENOMEM;
@ -259,7 +267,7 @@ static int zcomp_strm_single_create(struct zcomp *comp)
comp->stream = zs; comp->stream = zs;
mutex_init(&zs->strm_lock); mutex_init(&zs->strm_lock);
zs->zstrm = zcomp_strm_alloc(comp); zs->zstrm = zcomp_strm_alloc(comp, GFP_KERNEL);
if (!zs->zstrm) { if (!zs->zstrm) {
kfree(zs); kfree(zs);
return -ENOMEM; return -ENOMEM;


@ -33,7 +33,7 @@ struct zcomp_backend {
int (*decompress)(const unsigned char *src, size_t src_len, int (*decompress)(const unsigned char *src, size_t src_len,
unsigned char *dst); unsigned char *dst);
void *(*create)(void); void *(*create)(gfp_t flags);
void (*destroy)(void *private); void (*destroy)(void *private);
const char *name; const char *name;


@ -10,17 +10,26 @@
#include <linux/kernel.h> #include <linux/kernel.h>
#include <linux/slab.h> #include <linux/slab.h>
#include <linux/lz4.h> #include <linux/lz4.h>
#include <linux/vmalloc.h>
#include <linux/mm.h>
#include "zcomp_lz4.h" #include "zcomp_lz4.h"
static void *zcomp_lz4_create(void) static void *zcomp_lz4_create(gfp_t flags)
{ {
return kzalloc(LZ4_MEM_COMPRESS, GFP_KERNEL); void *ret;
ret = kmalloc(LZ4_MEM_COMPRESS, flags);
if (!ret)
ret = __vmalloc(LZ4_MEM_COMPRESS,
flags | __GFP_HIGHMEM,
PAGE_KERNEL);
return ret;
} }
static void zcomp_lz4_destroy(void *private) static void zcomp_lz4_destroy(void *private)
{ {
kfree(private); kvfree(private);
} }
static int zcomp_lz4_compress(const unsigned char *src, unsigned char *dst, static int zcomp_lz4_compress(const unsigned char *src, unsigned char *dst,


@ -10,17 +10,26 @@
#include <linux/kernel.h> #include <linux/kernel.h>
#include <linux/slab.h> #include <linux/slab.h>
#include <linux/lzo.h> #include <linux/lzo.h>
#include <linux/vmalloc.h>
#include <linux/mm.h>
#include "zcomp_lzo.h" #include "zcomp_lzo.h"
static void *lzo_create(void) static void *lzo_create(gfp_t flags)
{ {
return kzalloc(LZO1X_MEM_COMPRESS, GFP_KERNEL); void *ret;
ret = kmalloc(LZO1X_MEM_COMPRESS, flags);
if (!ret)
ret = __vmalloc(LZO1X_MEM_COMPRESS,
flags | __GFP_HIGHMEM,
PAGE_KERNEL);
return ret;
} }
static void lzo_destroy(void *private) static void lzo_destroy(void *private)
{ {
kfree(private); kvfree(private);
} }
static int lzo_compress(const unsigned char *src, unsigned char *dst, static int lzo_compress(const unsigned char *src, unsigned char *dst,


@ -106,7 +106,8 @@ static int __init init_lustre_lite(void)
rc = -ENOMEM; rc = -ENOMEM;
ll_inode_cachep = kmem_cache_create("lustre_inode_cache", ll_inode_cachep = kmem_cache_create("lustre_inode_cache",
sizeof(struct ll_inode_info), sizeof(struct ll_inode_info),
0, SLAB_HWCACHE_ALIGN, NULL); 0, SLAB_HWCACHE_ALIGN|SLAB_ACCOUNT,
NULL);
if (ll_inode_cachep == NULL) if (ll_inode_cachep == NULL)
goto out_cache; goto out_cache;


@ -575,7 +575,7 @@ static int v9fs_init_inode_cache(void)
v9fs_inode_cache = kmem_cache_create("v9fs_inode_cache", v9fs_inode_cache = kmem_cache_create("v9fs_inode_cache",
sizeof(struct v9fs_inode), sizeof(struct v9fs_inode),
0, (SLAB_RECLAIM_ACCOUNT| 0, (SLAB_RECLAIM_ACCOUNT|
SLAB_MEM_SPREAD), SLAB_MEM_SPREAD|SLAB_ACCOUNT),
v9fs_inode_init_once); v9fs_inode_init_once);
if (!v9fs_inode_cache) if (!v9fs_inode_cache)
return -ENOMEM; return -ENOMEM;


@ -271,7 +271,7 @@ static int __init init_inodecache(void)
adfs_inode_cachep = kmem_cache_create("adfs_inode_cache", adfs_inode_cachep = kmem_cache_create("adfs_inode_cache",
sizeof(struct adfs_inode_info), sizeof(struct adfs_inode_info),
0, (SLAB_RECLAIM_ACCOUNT| 0, (SLAB_RECLAIM_ACCOUNT|
SLAB_MEM_SPREAD), SLAB_MEM_SPREAD|SLAB_ACCOUNT),
init_once); init_once);
if (adfs_inode_cachep == NULL) if (adfs_inode_cachep == NULL)
return -ENOMEM; return -ENOMEM;


@ -132,7 +132,7 @@ static int __init init_inodecache(void)
affs_inode_cachep = kmem_cache_create("affs_inode_cache", affs_inode_cachep = kmem_cache_create("affs_inode_cache",
sizeof(struct affs_inode_info), sizeof(struct affs_inode_info),
0, (SLAB_RECLAIM_ACCOUNT| 0, (SLAB_RECLAIM_ACCOUNT|
SLAB_MEM_SPREAD), SLAB_MEM_SPREAD|SLAB_ACCOUNT),
init_once); init_once);
if (affs_inode_cachep == NULL) if (affs_inode_cachep == NULL)
return -ENOMEM; return -ENOMEM;


@ -91,7 +91,7 @@ int __init afs_fs_init(void)
afs_inode_cachep = kmem_cache_create("afs_inode_cache", afs_inode_cachep = kmem_cache_create("afs_inode_cache",
sizeof(struct afs_vnode), sizeof(struct afs_vnode),
0, 0,
SLAB_HWCACHE_ALIGN, SLAB_HWCACHE_ALIGN|SLAB_ACCOUNT,
afs_i_init_once); afs_i_init_once);
if (!afs_inode_cachep) { if (!afs_inode_cachep) {
printk(KERN_NOTICE "kAFS: Failed to allocate inode cache\n"); printk(KERN_NOTICE "kAFS: Failed to allocate inode cache\n");


@ -434,7 +434,7 @@ befs_init_inodecache(void)
befs_inode_cachep = kmem_cache_create("befs_inode_cache", befs_inode_cachep = kmem_cache_create("befs_inode_cache",
sizeof (struct befs_inode_info), sizeof (struct befs_inode_info),
0, (SLAB_RECLAIM_ACCOUNT| 0, (SLAB_RECLAIM_ACCOUNT|
SLAB_MEM_SPREAD), SLAB_MEM_SPREAD|SLAB_ACCOUNT),
init_once); init_once);
if (befs_inode_cachep == NULL) { if (befs_inode_cachep == NULL) {
pr_err("%s: Couldn't initialize inode slabcache\n", __func__); pr_err("%s: Couldn't initialize inode slabcache\n", __func__);


@ -270,7 +270,7 @@ static int __init init_inodecache(void)
bfs_inode_cachep = kmem_cache_create("bfs_inode_cache", bfs_inode_cachep = kmem_cache_create("bfs_inode_cache",
sizeof(struct bfs_inode_info), sizeof(struct bfs_inode_info),
0, (SLAB_RECLAIM_ACCOUNT| 0, (SLAB_RECLAIM_ACCOUNT|
SLAB_MEM_SPREAD), SLAB_MEM_SPREAD|SLAB_ACCOUNT),
init_once); init_once);
if (bfs_inode_cachep == NULL) if (bfs_inode_cachep == NULL)
return -ENOMEM; return -ENOMEM;


@ -437,7 +437,7 @@ int bdev_write_page(struct block_device *bdev, sector_t sector,
if (!ops->rw_page || bdev_get_integrity(bdev)) if (!ops->rw_page || bdev_get_integrity(bdev))
return -EOPNOTSUPP; return -EOPNOTSUPP;
result = blk_queue_enter(bdev->bd_queue, GFP_KERNEL); result = blk_queue_enter(bdev->bd_queue, GFP_NOIO);
if (result) if (result)
return result; return result;
@ -595,7 +595,7 @@ void __init bdev_cache_init(void)
bdev_cachep = kmem_cache_create("bdev_cache", sizeof(struct bdev_inode), bdev_cachep = kmem_cache_create("bdev_cache", sizeof(struct bdev_inode),
0, (SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT| 0, (SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|
SLAB_MEM_SPREAD|SLAB_PANIC), SLAB_MEM_SPREAD|SLAB_ACCOUNT|SLAB_PANIC),
init_once); init_once);
err = register_filesystem(&bd_type); err = register_filesystem(&bd_type);
if (err) if (err)


@ -9161,7 +9161,8 @@ int btrfs_init_cachep(void)
{ {
btrfs_inode_cachep = kmem_cache_create("btrfs_inode", btrfs_inode_cachep = kmem_cache_create("btrfs_inode",
sizeof(struct btrfs_inode), 0, sizeof(struct btrfs_inode), 0,
SLAB_RECLAIM_ACCOUNT | SLAB_MEM_SPREAD, init_once); SLAB_RECLAIM_ACCOUNT | SLAB_MEM_SPREAD | SLAB_ACCOUNT,
init_once);
if (!btrfs_inode_cachep) if (!btrfs_inode_cachep)
goto fail; goto fail;


@ -639,8 +639,8 @@ static int __init init_caches(void)
ceph_inode_cachep = kmem_cache_create("ceph_inode_info", ceph_inode_cachep = kmem_cache_create("ceph_inode_info",
sizeof(struct ceph_inode_info), sizeof(struct ceph_inode_info),
__alignof__(struct ceph_inode_info), __alignof__(struct ceph_inode_info),
(SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD), SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD|
ceph_inode_init_once); SLAB_ACCOUNT, ceph_inode_init_once);
if (ceph_inode_cachep == NULL) if (ceph_inode_cachep == NULL)
return -ENOMEM; return -ENOMEM;


@ -1092,7 +1092,7 @@ cifs_init_inodecache(void)
cifs_inode_cachep = kmem_cache_create("cifs_inode_cache", cifs_inode_cachep = kmem_cache_create("cifs_inode_cache",
sizeof(struct cifsInodeInfo), sizeof(struct cifsInodeInfo),
0, (SLAB_RECLAIM_ACCOUNT| 0, (SLAB_RECLAIM_ACCOUNT|
SLAB_MEM_SPREAD), SLAB_MEM_SPREAD|SLAB_ACCOUNT),
cifs_init_once); cifs_init_once);
if (cifs_inode_cachep == NULL) if (cifs_inode_cachep == NULL)
return -ENOMEM; return -ENOMEM;


@ -74,9 +74,9 @@ static void init_once(void *foo)
int __init coda_init_inodecache(void) int __init coda_init_inodecache(void)
{ {
coda_inode_cachep = kmem_cache_create("coda_inode_cache", coda_inode_cachep = kmem_cache_create("coda_inode_cache",
sizeof(struct coda_inode_info), sizeof(struct coda_inode_info), 0,
0, SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD, SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD|
init_once); SLAB_ACCOUNT, init_once);
if (coda_inode_cachep == NULL) if (coda_inode_cachep == NULL)
return -ENOMEM; return -ENOMEM;
return 0; return 0;


@ -1571,7 +1571,8 @@ struct dentry *__d_alloc(struct super_block *sb, const struct qstr *name)
dentry->d_iname[DNAME_INLINE_LEN-1] = 0; dentry->d_iname[DNAME_INLINE_LEN-1] = 0;
if (name->len > DNAME_INLINE_LEN-1) { if (name->len > DNAME_INLINE_LEN-1) {
size_t size = offsetof(struct external_name, name[1]); size_t size = offsetof(struct external_name, name[1]);
struct external_name *p = kmalloc(size + name->len, GFP_KERNEL); struct external_name *p = kmalloc(size + name->len,
GFP_KERNEL_ACCOUNT);
if (!p) { if (!p) {
kmem_cache_free(dentry_cache, dentry); kmem_cache_free(dentry_cache, dentry);
return NULL; return NULL;
@ -3415,7 +3416,7 @@ static void __init dcache_init(void)
* of the dcache. * of the dcache.
*/ */
dentry_cache = KMEM_CACHE(dentry, dentry_cache = KMEM_CACHE(dentry,
SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|SLAB_MEM_SPREAD); SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|SLAB_MEM_SPREAD|SLAB_ACCOUNT);
/* Hash may have been set up in dcache_init_early */ /* Hash may have been set up in dcache_init_early */
if (!hashdist) if (!hashdist)


@ -663,6 +663,7 @@ static struct ecryptfs_cache_info {
struct kmem_cache **cache; struct kmem_cache **cache;
const char *name; const char *name;
size_t size; size_t size;
unsigned long flags;
void (*ctor)(void *obj); void (*ctor)(void *obj);
} ecryptfs_cache_infos[] = { } ecryptfs_cache_infos[] = {
{ {
@ -684,6 +685,7 @@ static struct ecryptfs_cache_info {
.cache = &ecryptfs_inode_info_cache, .cache = &ecryptfs_inode_info_cache,
.name = "ecryptfs_inode_cache", .name = "ecryptfs_inode_cache",
.size = sizeof(struct ecryptfs_inode_info), .size = sizeof(struct ecryptfs_inode_info),
.flags = SLAB_ACCOUNT,
.ctor = inode_info_init_once, .ctor = inode_info_init_once,
}, },
{ {
@ -755,8 +757,8 @@ static int ecryptfs_init_kmem_caches(void)
struct ecryptfs_cache_info *info; struct ecryptfs_cache_info *info;
info = &ecryptfs_cache_infos[i]; info = &ecryptfs_cache_infos[i];
*(info->cache) = kmem_cache_create(info->name, info->size, *(info->cache) = kmem_cache_create(info->name, info->size, 0,
0, SLAB_HWCACHE_ALIGN, info->ctor); SLAB_HWCACHE_ALIGN | info->flags, info->ctor);
if (!*(info->cache)) { if (!*(info->cache)) {
ecryptfs_free_kmem_caches(); ecryptfs_free_kmem_caches();
ecryptfs_printk(KERN_WARNING, "%s: " ecryptfs_printk(KERN_WARNING, "%s: "


@ -94,9 +94,9 @@ static void init_once(void *foo)
static int __init init_inodecache(void) static int __init init_inodecache(void)
{ {
efs_inode_cachep = kmem_cache_create("efs_inode_cache", efs_inode_cachep = kmem_cache_create("efs_inode_cache",
sizeof(struct efs_inode_info), sizeof(struct efs_inode_info), 0,
0, SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD, SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD|
init_once); SLAB_ACCOUNT, init_once);
if (efs_inode_cachep == NULL) if (efs_inode_cachep == NULL)
return -ENOMEM; return -ENOMEM;
return 0; return 0;


@ -194,8 +194,8 @@ static int init_inodecache(void)
{ {
exofs_inode_cachep = kmem_cache_create("exofs_inode_cache", exofs_inode_cachep = kmem_cache_create("exofs_inode_cache",
sizeof(struct exofs_i_info), 0, sizeof(struct exofs_i_info), 0,
SLAB_RECLAIM_ACCOUNT | SLAB_MEM_SPREAD, SLAB_RECLAIM_ACCOUNT | SLAB_MEM_SPREAD |
exofs_init_once); SLAB_ACCOUNT, exofs_init_once);
if (exofs_inode_cachep == NULL) if (exofs_inode_cachep == NULL)
return -ENOMEM; return -ENOMEM;
return 0; return 0;


@ -203,7 +203,7 @@ static int __init init_inodecache(void)
ext2_inode_cachep = kmem_cache_create("ext2_inode_cache", ext2_inode_cachep = kmem_cache_create("ext2_inode_cache",
sizeof(struct ext2_inode_info), sizeof(struct ext2_inode_info),
0, (SLAB_RECLAIM_ACCOUNT| 0, (SLAB_RECLAIM_ACCOUNT|
SLAB_MEM_SPREAD), SLAB_MEM_SPREAD|SLAB_ACCOUNT),
init_once); init_once);
if (ext2_inode_cachep == NULL) if (ext2_inode_cachep == NULL)
return -ENOMEM; return -ENOMEM;


@ -966,7 +966,7 @@ static int __init init_inodecache(void)
ext4_inode_cachep = kmem_cache_create("ext4_inode_cache", ext4_inode_cachep = kmem_cache_create("ext4_inode_cache",
sizeof(struct ext4_inode_info), sizeof(struct ext4_inode_info),
0, (SLAB_RECLAIM_ACCOUNT| 0, (SLAB_RECLAIM_ACCOUNT|
SLAB_MEM_SPREAD), SLAB_MEM_SPREAD|SLAB_ACCOUNT),
init_once); init_once);
if (ext4_inode_cachep == NULL) if (ext4_inode_cachep == NULL)
return -ENOMEM; return -ENOMEM;


@ -1541,8 +1541,9 @@ MODULE_ALIAS_FS("f2fs");
static int __init init_inodecache(void) static int __init init_inodecache(void)
{ {
f2fs_inode_cachep = f2fs_kmem_cache_create("f2fs_inode_cache", f2fs_inode_cachep = kmem_cache_create("f2fs_inode_cache",
sizeof(struct f2fs_inode_info)); sizeof(struct f2fs_inode_info), 0,
SLAB_RECLAIM_ACCOUNT|SLAB_ACCOUNT, NULL);
if (!f2fs_inode_cachep) if (!f2fs_inode_cachep)
return -ENOMEM; return -ENOMEM;
return 0; return 0;


@ -677,7 +677,7 @@ static int __init fat_init_inodecache(void)
fat_inode_cachep = kmem_cache_create("fat_inode_cache", fat_inode_cachep = kmem_cache_create("fat_inode_cache",
sizeof(struct msdos_inode_info), sizeof(struct msdos_inode_info),
0, (SLAB_RECLAIM_ACCOUNT| 0, (SLAB_RECLAIM_ACCOUNT|
SLAB_MEM_SPREAD), SLAB_MEM_SPREAD|SLAB_ACCOUNT),
init_once); init_once);
if (fat_inode_cachep == NULL) if (fat_inode_cachep == NULL)
return -ENOMEM; return -ENOMEM;


@ -37,11 +37,12 @@ static void *alloc_fdmem(size_t size)
* vmalloc() if the allocation size will be considered "large" by the VM. * vmalloc() if the allocation size will be considered "large" by the VM.
*/ */
if (size <= (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER)) { if (size <= (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER)) {
void *data = kmalloc(size, GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY); void *data = kmalloc(size, GFP_KERNEL_ACCOUNT |
__GFP_NOWARN | __GFP_NORETRY);
if (data != NULL) if (data != NULL)
return data; return data;
} }
return vmalloc(size); return __vmalloc(size, GFP_KERNEL_ACCOUNT | __GFP_HIGHMEM, PAGE_KERNEL);
} }
static void __free_fdtable(struct fdtable *fdt) static void __free_fdtable(struct fdtable *fdt)
@ -126,7 +127,7 @@ static struct fdtable * alloc_fdtable(unsigned int nr)
if (unlikely(nr > sysctl_nr_open)) if (unlikely(nr > sysctl_nr_open))
nr = ((sysctl_nr_open - 1) | (BITS_PER_LONG - 1)) + 1; nr = ((sysctl_nr_open - 1) | (BITS_PER_LONG - 1)) + 1;
fdt = kmalloc(sizeof(struct fdtable), GFP_KERNEL); fdt = kmalloc(sizeof(struct fdtable), GFP_KERNEL_ACCOUNT);
if (!fdt) if (!fdt)
goto out; goto out;
fdt->max_fds = nr; fdt->max_fds = nr;
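
The allocation strategy above -- try a cheap kmalloc() first, fall back to vmalloc(), now with memcg charging via GFP_KERNEL_ACCOUNT -- is the same pattern the zram zcomp backends adopt earlier in this series. A condensed sketch of the pattern (illustrative only, with made-up helper names, not the actual fs/file.c code) could look like:

#include <linux/slab.h>
#include <linux/vmalloc.h>
#include <linux/mm.h>

static void *alloc_accounted(size_t size)
{
	void *p;

	/* avoid costly reclaim/compaction and the allocation-failure warning */
	p = kmalloc(size, GFP_KERNEL_ACCOUNT | __GFP_NOWARN | __GFP_NORETRY);
	if (p)
		return p;

	/* fall back to virtually contiguous, still memcg-accounted memory */
	return __vmalloc(size, GFP_KERNEL_ACCOUNT | __GFP_HIGHMEM, PAGE_KERNEL);
}

static void free_accounted(void *p)
{
	kvfree(p);	/* works for both kmalloc and vmalloc memory */
}

kvfree() picks the right free routine, so callers need not remember which path succeeded.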


@ -1255,8 +1255,8 @@ static int __init fuse_fs_init(void)
int err; int err;
fuse_inode_cachep = kmem_cache_create("fuse_inode", fuse_inode_cachep = kmem_cache_create("fuse_inode",
sizeof(struct fuse_inode), sizeof(struct fuse_inode), 0,
0, SLAB_HWCACHE_ALIGN, SLAB_HWCACHE_ALIGN|SLAB_ACCOUNT,
fuse_inode_init_once); fuse_inode_init_once);
err = -ENOMEM; err = -ENOMEM;
if (!fuse_inode_cachep) if (!fuse_inode_cachep)


@ -114,7 +114,8 @@ static int __init init_gfs2_fs(void)
gfs2_inode_cachep = kmem_cache_create("gfs2_inode", gfs2_inode_cachep = kmem_cache_create("gfs2_inode",
sizeof(struct gfs2_inode), sizeof(struct gfs2_inode),
0, SLAB_RECLAIM_ACCOUNT| 0, SLAB_RECLAIM_ACCOUNT|
SLAB_MEM_SPREAD, SLAB_MEM_SPREAD|
SLAB_ACCOUNT,
gfs2_init_inode_once); gfs2_init_inode_once);
if (!gfs2_inode_cachep) if (!gfs2_inode_cachep)
goto fail; goto fail;


@ -483,8 +483,8 @@ static int __init init_hfs_fs(void)
int err; int err;
hfs_inode_cachep = kmem_cache_create("hfs_inode_cache", hfs_inode_cachep = kmem_cache_create("hfs_inode_cache",
sizeof(struct hfs_inode_info), 0, SLAB_HWCACHE_ALIGN, sizeof(struct hfs_inode_info), 0,
hfs_init_once); SLAB_HWCACHE_ALIGN|SLAB_ACCOUNT, hfs_init_once);
if (!hfs_inode_cachep) if (!hfs_inode_cachep)
return -ENOMEM; return -ENOMEM;
err = register_filesystem(&hfs_fs_type); err = register_filesystem(&hfs_fs_type);


@ -663,7 +663,7 @@ static int __init init_hfsplus_fs(void)
int err; int err;
hfsplus_inode_cachep = kmem_cache_create("hfsplus_icache", hfsplus_inode_cachep = kmem_cache_create("hfsplus_icache",
HFSPLUS_INODE_SIZE, 0, SLAB_HWCACHE_ALIGN, HFSPLUS_INODE_SIZE, 0, SLAB_HWCACHE_ALIGN|SLAB_ACCOUNT,
hfsplus_init_once); hfsplus_init_once);
if (!hfsplus_inode_cachep) if (!hfsplus_inode_cachep)
return -ENOMEM; return -ENOMEM;


@ -223,7 +223,7 @@ static struct inode *hostfs_alloc_inode(struct super_block *sb)
{ {
struct hostfs_inode_info *hi; struct hostfs_inode_info *hi;
hi = kmalloc(sizeof(*hi), GFP_KERNEL); hi = kmalloc(sizeof(*hi), GFP_KERNEL_ACCOUNT);
if (hi == NULL) if (hi == NULL)
return NULL; return NULL;
hi->fd = -1; hi->fd = -1;


@ -261,7 +261,7 @@ static int init_inodecache(void)
hpfs_inode_cachep = kmem_cache_create("hpfs_inode_cache", hpfs_inode_cachep = kmem_cache_create("hpfs_inode_cache",
sizeof(struct hpfs_inode_info), sizeof(struct hpfs_inode_info),
0, (SLAB_RECLAIM_ACCOUNT| 0, (SLAB_RECLAIM_ACCOUNT|
SLAB_MEM_SPREAD), SLAB_MEM_SPREAD|SLAB_ACCOUNT),
init_once); init_once);
if (hpfs_inode_cachep == NULL) if (hpfs_inode_cachep == NULL)
return -ENOMEM; return -ENOMEM;


@ -4,11 +4,11 @@
* Nadia Yvette Chambers, 2002 * Nadia Yvette Chambers, 2002
* *
* Copyright (C) 2002 Linus Torvalds. * Copyright (C) 2002 Linus Torvalds.
* License: GPL
*/ */
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
#include <linux/module.h>
#include <linux/thread_info.h> #include <linux/thread_info.h>
#include <asm/current.h> #include <asm/current.h>
#include <linux/sched.h> /* remove ASAP */ #include <linux/sched.h> /* remove ASAP */
@ -738,7 +738,7 @@ static struct inode *hugetlbfs_get_inode(struct super_block *sb,
/* /*
* The policy is initialized here even if we are creating a * The policy is initialized here even if we are creating a
* private inode because initialization simply creates an * private inode because initialization simply creates an
* an empty rb tree and calls spin_lock_init(), later when we * an empty rb tree and calls rwlock_init(), later when we
* call mpol_free_shared_policy() it will just return because * call mpol_free_shared_policy() it will just return because
* the rb tree will still be empty. * the rb tree will still be empty.
*/ */
@ -1202,7 +1202,6 @@ static struct file_system_type hugetlbfs_fs_type = {
.mount = hugetlbfs_mount, .mount = hugetlbfs_mount,
.kill_sb = kill_litter_super, .kill_sb = kill_litter_super,
}; };
MODULE_ALIAS_FS("hugetlbfs");
static struct vfsmount *hugetlbfs_vfsmount[HUGE_MAX_HSTATE]; static struct vfsmount *hugetlbfs_vfsmount[HUGE_MAX_HSTATE];
@ -1322,7 +1321,7 @@ static int __init init_hugetlbfs_fs(void)
error = -ENOMEM; error = -ENOMEM;
hugetlbfs_inode_cachep = kmem_cache_create("hugetlbfs_inode_cache", hugetlbfs_inode_cachep = kmem_cache_create("hugetlbfs_inode_cache",
sizeof(struct hugetlbfs_inode_info), sizeof(struct hugetlbfs_inode_info),
0, 0, init_once); 0, SLAB_ACCOUNT, init_once);
if (hugetlbfs_inode_cachep == NULL) if (hugetlbfs_inode_cachep == NULL)
goto out2; goto out2;
@ -1356,26 +1355,4 @@ static int __init init_hugetlbfs_fs(void)
out2: out2:
return error; return error;
} }
fs_initcall(init_hugetlbfs_fs)
static void __exit exit_hugetlbfs_fs(void)
{
struct hstate *h;
int i;
/*
* Make sure all delayed rcu free inodes are flushed before we
* destroy cache.
*/
rcu_barrier();
kmem_cache_destroy(hugetlbfs_inode_cachep);
i = 0;
for_each_hstate(h)
kern_unmount(hugetlbfs_vfsmount[i++]);
unregister_filesystem(&hugetlbfs_fs_type);
}
module_init(init_hugetlbfs_fs)
module_exit(exit_hugetlbfs_fs)
MODULE_LICENSE("GPL");


@ -1883,7 +1883,7 @@ void __init inode_init(void)
sizeof(struct inode), sizeof(struct inode),
0, 0,
(SLAB_RECLAIM_ACCOUNT|SLAB_PANIC| (SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|
SLAB_MEM_SPREAD), SLAB_MEM_SPREAD|SLAB_ACCOUNT),
init_once); init_once);
/* Hash may have been set up in inode_init_early */ /* Hash may have been set up in inode_init_early */


@ -94,7 +94,7 @@ static int __init init_inodecache(void)
isofs_inode_cachep = kmem_cache_create("isofs_inode_cache", isofs_inode_cachep = kmem_cache_create("isofs_inode_cache",
sizeof(struct iso_inode_info), sizeof(struct iso_inode_info),
0, (SLAB_RECLAIM_ACCOUNT| 0, (SLAB_RECLAIM_ACCOUNT|
SLAB_MEM_SPREAD), SLAB_MEM_SPREAD|SLAB_ACCOUNT),
init_once); init_once);
if (isofs_inode_cachep == NULL) if (isofs_inode_cachep == NULL)
return -ENOMEM; return -ENOMEM;


@ -387,7 +387,7 @@ static int __init init_jffs2_fs(void)
jffs2_inode_cachep = kmem_cache_create("jffs2_i", jffs2_inode_cachep = kmem_cache_create("jffs2_i",
sizeof(struct jffs2_inode_info), sizeof(struct jffs2_inode_info),
0, (SLAB_RECLAIM_ACCOUNT| 0, (SLAB_RECLAIM_ACCOUNT|
SLAB_MEM_SPREAD), SLAB_MEM_SPREAD|SLAB_ACCOUNT),
jffs2_i_init_once); jffs2_i_init_once);
if (!jffs2_inode_cachep) { if (!jffs2_inode_cachep) {
pr_err("error: Failed to initialise inode cache\n"); pr_err("error: Failed to initialise inode cache\n");


@ -898,7 +898,7 @@ static int __init init_jfs_fs(void)
jfs_inode_cachep = jfs_inode_cachep =
kmem_cache_create("jfs_ip", sizeof(struct jfs_inode_info), 0, kmem_cache_create("jfs_ip", sizeof(struct jfs_inode_info), 0,
SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD, SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD|SLAB_ACCOUNT,
init_once); init_once);
if (jfs_inode_cachep == NULL) if (jfs_inode_cachep == NULL)
return -ENOMEM; return -ENOMEM;


@ -541,14 +541,7 @@ static struct kernfs_node *__kernfs_new_node(struct kernfs_root *root,
if (!kn) if (!kn)
goto err_out1; goto err_out1;
/* ret = ida_simple_get(&root->ino_ida, 1, 0, GFP_KERNEL);
* If the ino of the sysfs entry created for a kmem cache gets
* allocated from an ida layer, which is accounted to the memcg that
* owns the cache, the memcg will get pinned forever. So do not account
* ino ida allocations.
*/
ret = ida_simple_get(&root->ino_ida, 1, 0,
GFP_KERNEL | __GFP_NOACCOUNT);
if (ret < 0) if (ret < 0)
goto err_out2; goto err_out2;
kn->ino = ret; kn->ino = ret;


@ -1,6 +1,6 @@
config LOGFS config LOGFS
tristate "LogFS file system" tristate "LogFS file system"
depends on (MTD || BLOCK) depends on MTD || (!MTD && BLOCK)
select ZLIB_INFLATE select ZLIB_INFLATE
select ZLIB_DEFLATE select ZLIB_DEFLATE
select CRC32 select CRC32


@ -409,7 +409,8 @@ const struct super_operations logfs_super_operations = {
int logfs_init_inode_cache(void) int logfs_init_inode_cache(void)
{ {
logfs_inode_cache = kmem_cache_create("logfs_inode_cache", logfs_inode_cache = kmem_cache_create("logfs_inode_cache",
sizeof(struct logfs_inode), 0, SLAB_RECLAIM_ACCOUNT, sizeof(struct logfs_inode), 0,
SLAB_RECLAIM_ACCOUNT|SLAB_ACCOUNT,
logfs_init_once); logfs_init_once);
if (!logfs_inode_cache) if (!logfs_inode_cache)
return -ENOMEM; return -ENOMEM;


@ -91,7 +91,7 @@ static int __init init_inodecache(void)
minix_inode_cachep = kmem_cache_create("minix_inode_cache", minix_inode_cachep = kmem_cache_create("minix_inode_cache",
sizeof(struct minix_inode_info), sizeof(struct minix_inode_info),
0, (SLAB_RECLAIM_ACCOUNT| 0, (SLAB_RECLAIM_ACCOUNT|
SLAB_MEM_SPREAD), SLAB_MEM_SPREAD|SLAB_ACCOUNT),
init_once); init_once);
if (minix_inode_cachep == NULL) if (minix_inode_cachep == NULL)
return -ENOMEM; return -ENOMEM;


@ -82,7 +82,7 @@ static int init_inodecache(void)
ncp_inode_cachep = kmem_cache_create("ncp_inode_cache", ncp_inode_cachep = kmem_cache_create("ncp_inode_cache",
sizeof(struct ncp_inode_info), sizeof(struct ncp_inode_info),
0, (SLAB_RECLAIM_ACCOUNT| 0, (SLAB_RECLAIM_ACCOUNT|
SLAB_MEM_SPREAD), SLAB_MEM_SPREAD|SLAB_ACCOUNT),
init_once); init_once);
if (ncp_inode_cachep == NULL) if (ncp_inode_cachep == NULL)
return -ENOMEM; return -ENOMEM;


@ -1969,7 +1969,7 @@ static int __init nfs_init_inodecache(void)
nfs_inode_cachep = kmem_cache_create("nfs_inode_cache", nfs_inode_cachep = kmem_cache_create("nfs_inode_cache",
sizeof(struct nfs_inode), sizeof(struct nfs_inode),
0, (SLAB_RECLAIM_ACCOUNT| 0, (SLAB_RECLAIM_ACCOUNT|
SLAB_MEM_SPREAD), SLAB_MEM_SPREAD|SLAB_ACCOUNT),
init_once); init_once);
if (nfs_inode_cachep == NULL) if (nfs_inode_cachep == NULL)
return -ENOMEM; return -ENOMEM;


@ -1416,7 +1416,8 @@ static int __init nilfs_init_cachep(void)
{ {
nilfs_inode_cachep = kmem_cache_create("nilfs2_inode_cache", nilfs_inode_cachep = kmem_cache_create("nilfs2_inode_cache",
sizeof(struct nilfs_inode_info), 0, sizeof(struct nilfs_inode_info), 0,
SLAB_RECLAIM_ACCOUNT, nilfs_inode_init_once); SLAB_RECLAIM_ACCOUNT|SLAB_ACCOUNT,
nilfs_inode_init_once);
if (!nilfs_inode_cachep) if (!nilfs_inode_cachep)
goto fail; goto fail;


@ -199,8 +199,7 @@ void fsnotify_unmount_inodes(struct super_block *sb)
break; break;
} }
spin_unlock(&next_i->i_lock); spin_unlock(&next_i->i_lock);
next_i = list_entry(next_i->i_sb_list.next, next_i = list_next_entry(next_i, i_sb_list);
struct inode, i_sb_list);
} }
/* /*


@ -92,9 +92,6 @@
#include "fsnotify.h" #include "fsnotify.h"
struct srcu_struct fsnotify_mark_srcu; struct srcu_struct fsnotify_mark_srcu;
static DEFINE_SPINLOCK(destroy_lock);
static LIST_HEAD(destroy_list);
static DECLARE_WAIT_QUEUE_HEAD(destroy_waitq);
void fsnotify_get_mark(struct fsnotify_mark *mark) void fsnotify_get_mark(struct fsnotify_mark *mark)
{ {
@ -168,10 +165,19 @@ void fsnotify_detach_mark(struct fsnotify_mark *mark)
atomic_dec(&group->num_marks); atomic_dec(&group->num_marks);
} }
static void
fsnotify_mark_free_rcu(struct rcu_head *rcu)
{
struct fsnotify_mark *mark;
mark = container_of(rcu, struct fsnotify_mark, g_rcu);
fsnotify_put_mark(mark);
}
/* /*
* Free fsnotify mark. The freeing is actually happening from a kthread which * Free fsnotify mark. The freeing is actually happening from a call_srcu
* first waits for srcu period end. Caller must have a reference to the mark * callback. Caller must have a reference to the mark or be protected by
* or be protected by fsnotify_mark_srcu. * fsnotify_mark_srcu.
*/ */
void fsnotify_free_mark(struct fsnotify_mark *mark) void fsnotify_free_mark(struct fsnotify_mark *mark)
{ {
@ -186,10 +192,7 @@ void fsnotify_free_mark(struct fsnotify_mark *mark)
mark->flags &= ~FSNOTIFY_MARK_FLAG_ALIVE; mark->flags &= ~FSNOTIFY_MARK_FLAG_ALIVE;
spin_unlock(&mark->lock); spin_unlock(&mark->lock);
spin_lock(&destroy_lock); call_srcu(&fsnotify_mark_srcu, &mark->g_rcu, fsnotify_mark_free_rcu);
list_add(&mark->g_list, &destroy_list);
spin_unlock(&destroy_lock);
wake_up(&destroy_waitq);
/* /*
* Some groups like to know that marks are being freed. This is a * Some groups like to know that marks are being freed. This is a
@ -385,11 +388,7 @@ err:
spin_unlock(&mark->lock); spin_unlock(&mark->lock);
spin_lock(&destroy_lock); call_srcu(&fsnotify_mark_srcu, &mark->g_rcu, fsnotify_mark_free_rcu);
list_add(&mark->g_list, &destroy_list);
spin_unlock(&destroy_lock);
wake_up(&destroy_waitq);
return ret; return ret;
} }
@ -492,40 +491,3 @@ void fsnotify_init_mark(struct fsnotify_mark *mark,
atomic_set(&mark->refcnt, 1); atomic_set(&mark->refcnt, 1);
mark->free_mark = free_mark; mark->free_mark = free_mark;
} }
static int fsnotify_mark_destroy(void *ignored)
{
struct fsnotify_mark *mark, *next;
struct list_head private_destroy_list;
for (;;) {
spin_lock(&destroy_lock);
/* exchange the list head */
list_replace_init(&destroy_list, &private_destroy_list);
spin_unlock(&destroy_lock);
synchronize_srcu(&fsnotify_mark_srcu);
list_for_each_entry_safe(mark, next, &private_destroy_list, g_list) {
list_del_init(&mark->g_list);
fsnotify_put_mark(mark);
}
wait_event_interruptible(destroy_waitq, !list_empty(&destroy_list));
}
return 0;
}
static int __init fsnotify_mark_init(void)
{
struct task_struct *thread;
thread = kthread_run(fsnotify_mark_destroy, NULL,
"fsnotify_mark");
if (IS_ERR(thread))
panic("unable to start fsnotify mark destruction thread.");
return 0;
}
device_initcall(fsnotify_mark_init);


@ -3139,8 +3139,8 @@ static int __init init_ntfs_fs(void)
ntfs_big_inode_cache = kmem_cache_create(ntfs_big_inode_cache_name, ntfs_big_inode_cache = kmem_cache_create(ntfs_big_inode_cache_name,
sizeof(big_ntfs_inode), 0, sizeof(big_ntfs_inode), 0,
SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD, SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD|
ntfs_big_inode_init_once); SLAB_ACCOUNT, ntfs_big_inode_init_once);
if (!ntfs_big_inode_cache) { if (!ntfs_big_inode_cache) {
pr_crit("Failed to create %s!\n", ntfs_big_inode_cache_name); pr_crit("Failed to create %s!\n", ntfs_big_inode_cache_name);
goto big_inode_err_out; goto big_inode_err_out;


@ -164,7 +164,7 @@ static int ocfs2_dinode_insert_check(struct ocfs2_extent_tree *et,
struct ocfs2_extent_rec *rec); struct ocfs2_extent_rec *rec);
static int ocfs2_dinode_sanity_check(struct ocfs2_extent_tree *et); static int ocfs2_dinode_sanity_check(struct ocfs2_extent_tree *et);
static void ocfs2_dinode_fill_root_el(struct ocfs2_extent_tree *et); static void ocfs2_dinode_fill_root_el(struct ocfs2_extent_tree *et);
static struct ocfs2_extent_tree_operations ocfs2_dinode_et_ops = { static const struct ocfs2_extent_tree_operations ocfs2_dinode_et_ops = {
.eo_set_last_eb_blk = ocfs2_dinode_set_last_eb_blk, .eo_set_last_eb_blk = ocfs2_dinode_set_last_eb_blk,
.eo_get_last_eb_blk = ocfs2_dinode_get_last_eb_blk, .eo_get_last_eb_blk = ocfs2_dinode_get_last_eb_blk,
.eo_update_clusters = ocfs2_dinode_update_clusters, .eo_update_clusters = ocfs2_dinode_update_clusters,
@ -286,7 +286,7 @@ static void ocfs2_xattr_value_update_clusters(struct ocfs2_extent_tree *et,
le32_add_cpu(&vb->vb_xv->xr_clusters, clusters); le32_add_cpu(&vb->vb_xv->xr_clusters, clusters);
} }
static struct ocfs2_extent_tree_operations ocfs2_xattr_value_et_ops = { static const struct ocfs2_extent_tree_operations ocfs2_xattr_value_et_ops = {
.eo_set_last_eb_blk = ocfs2_xattr_value_set_last_eb_blk, .eo_set_last_eb_blk = ocfs2_xattr_value_set_last_eb_blk,
.eo_get_last_eb_blk = ocfs2_xattr_value_get_last_eb_blk, .eo_get_last_eb_blk = ocfs2_xattr_value_get_last_eb_blk,
.eo_update_clusters = ocfs2_xattr_value_update_clusters, .eo_update_clusters = ocfs2_xattr_value_update_clusters,
@ -332,7 +332,7 @@ static void ocfs2_xattr_tree_update_clusters(struct ocfs2_extent_tree *et,
le32_add_cpu(&xb->xb_attrs.xb_root.xt_clusters, clusters); le32_add_cpu(&xb->xb_attrs.xb_root.xt_clusters, clusters);
} }
static struct ocfs2_extent_tree_operations ocfs2_xattr_tree_et_ops = { static const struct ocfs2_extent_tree_operations ocfs2_xattr_tree_et_ops = {
.eo_set_last_eb_blk = ocfs2_xattr_tree_set_last_eb_blk, .eo_set_last_eb_blk = ocfs2_xattr_tree_set_last_eb_blk,
.eo_get_last_eb_blk = ocfs2_xattr_tree_get_last_eb_blk, .eo_get_last_eb_blk = ocfs2_xattr_tree_get_last_eb_blk,
.eo_update_clusters = ocfs2_xattr_tree_update_clusters, .eo_update_clusters = ocfs2_xattr_tree_update_clusters,
@ -379,7 +379,7 @@ static void ocfs2_dx_root_fill_root_el(struct ocfs2_extent_tree *et)
et->et_root_el = &dx_root->dr_list; et->et_root_el = &dx_root->dr_list;
} }
static struct ocfs2_extent_tree_operations ocfs2_dx_root_et_ops = { static const struct ocfs2_extent_tree_operations ocfs2_dx_root_et_ops = {
.eo_set_last_eb_blk = ocfs2_dx_root_set_last_eb_blk, .eo_set_last_eb_blk = ocfs2_dx_root_set_last_eb_blk,
.eo_get_last_eb_blk = ocfs2_dx_root_get_last_eb_blk, .eo_get_last_eb_blk = ocfs2_dx_root_get_last_eb_blk,
.eo_update_clusters = ocfs2_dx_root_update_clusters, .eo_update_clusters = ocfs2_dx_root_update_clusters,
@ -425,7 +425,7 @@ ocfs2_refcount_tree_extent_contig(struct ocfs2_extent_tree *et,
return CONTIG_NONE; return CONTIG_NONE;
} }
static struct ocfs2_extent_tree_operations ocfs2_refcount_tree_et_ops = { static const struct ocfs2_extent_tree_operations ocfs2_refcount_tree_et_ops = {
.eo_set_last_eb_blk = ocfs2_refcount_tree_set_last_eb_blk, .eo_set_last_eb_blk = ocfs2_refcount_tree_set_last_eb_blk,
.eo_get_last_eb_blk = ocfs2_refcount_tree_get_last_eb_blk, .eo_get_last_eb_blk = ocfs2_refcount_tree_get_last_eb_blk,
.eo_update_clusters = ocfs2_refcount_tree_update_clusters, .eo_update_clusters = ocfs2_refcount_tree_update_clusters,
@ -438,7 +438,7 @@ static void __ocfs2_init_extent_tree(struct ocfs2_extent_tree *et,
struct buffer_head *bh, struct buffer_head *bh,
ocfs2_journal_access_func access, ocfs2_journal_access_func access,
void *obj, void *obj,
struct ocfs2_extent_tree_operations *ops) const struct ocfs2_extent_tree_operations *ops)
{ {
et->et_ops = ops; et->et_ops = ops;
et->et_root_bh = bh; et->et_root_bh = bh;
@ -6174,8 +6174,7 @@ int ocfs2_begin_truncate_log_recovery(struct ocfs2_super *osb,
} }
bail: bail:
if (tl_inode) iput(tl_inode);
iput(tl_inode);
brelse(tl_bh); brelse(tl_bh);
if (status < 0) { if (status < 0) {


@ -54,7 +54,7 @@
*/ */
struct ocfs2_extent_tree_operations; struct ocfs2_extent_tree_operations;
struct ocfs2_extent_tree { struct ocfs2_extent_tree {
struct ocfs2_extent_tree_operations *et_ops; const struct ocfs2_extent_tree_operations *et_ops;
struct buffer_head *et_root_bh; struct buffer_head *et_root_bh;
struct ocfs2_extent_list *et_root_el; struct ocfs2_extent_list *et_root_el;
struct ocfs2_caching_info *et_ci; struct ocfs2_caching_info *et_ci;


@ -1780,8 +1780,8 @@ static ssize_t o2hb_region_dev_store(struct config_item *item,
} }
++live_threshold; ++live_threshold;
atomic_set(&reg->hr_steady_iterations, live_threshold); atomic_set(&reg->hr_steady_iterations, live_threshold);
/* unsteady_iterations is double the steady_iterations */ /* unsteady_iterations is triple the steady_iterations */
atomic_set(&reg->hr_unsteady_iterations, (live_threshold << 1)); atomic_set(&reg->hr_unsteady_iterations, (live_threshold * 3));
hb_task = kthread_run(o2hb_thread, reg, "o2hb-%s", hb_task = kthread_run(o2hb_thread, reg, "o2hb-%s",
reg->hr_item.ci_name); reg->hr_item.ci_name);


@ -376,17 +376,6 @@ struct dlm_lock
lksb_kernel_allocated:1; lksb_kernel_allocated:1;
}; };
#define DLM_LKSB_UNUSED1 0x01
#define DLM_LKSB_PUT_LVB 0x02
#define DLM_LKSB_GET_LVB 0x04
#define DLM_LKSB_UNUSED2 0x08
#define DLM_LKSB_UNUSED3 0x10
#define DLM_LKSB_UNUSED4 0x20
#define DLM_LKSB_UNUSED5 0x40
#define DLM_LKSB_UNUSED6 0x80
enum dlm_lockres_list { enum dlm_lockres_list {
DLM_GRANTED_LIST = 0, DLM_GRANTED_LIST = 0,
DLM_CONVERTING_LIST = 1, DLM_CONVERTING_LIST = 1,


@ -2388,8 +2388,8 @@ static void dlm_deref_lockres_worker(struct dlm_work_item *item, void *data)
spin_lock(&res->spinlock); spin_lock(&res->spinlock);
BUG_ON(res->state & DLM_LOCK_RES_DROPPING_REF); BUG_ON(res->state & DLM_LOCK_RES_DROPPING_REF);
__dlm_wait_on_lockres_flags(res, DLM_LOCK_RES_SETREF_INPROG);
if (test_bit(node, res->refmap)) { if (test_bit(node, res->refmap)) {
__dlm_wait_on_lockres_flags(res, DLM_LOCK_RES_SETREF_INPROG);
dlm_lockres_clear_refmap_bit(dlm, res, node); dlm_lockres_clear_refmap_bit(dlm, res, node);
cleared = 1; cleared = 1;
} }
@ -2519,6 +2519,11 @@ static int dlm_migrate_lockres(struct dlm_ctxt *dlm,
spin_lock(&dlm->master_lock); spin_lock(&dlm->master_lock);
ret = dlm_add_migration_mle(dlm, res, mle, &oldmle, name, ret = dlm_add_migration_mle(dlm, res, mle, &oldmle, name,
namelen, target, dlm->node_num); namelen, target, dlm->node_num);
/* get an extra reference on the mle.
* otherwise the assert_master from the new
* master will destroy this.
*/
dlm_get_mle_inuse(mle);
spin_unlock(&dlm->master_lock); spin_unlock(&dlm->master_lock);
spin_unlock(&dlm->spinlock); spin_unlock(&dlm->spinlock);
@ -2544,7 +2549,7 @@ static int dlm_migrate_lockres(struct dlm_ctxt *dlm,
} }
fail: fail:
if (oldmle) { if (ret != -EEXIST && oldmle) {
/* master is known, detach if not already detached */ /* master is known, detach if not already detached */
dlm_mle_detach_hb_events(dlm, oldmle); dlm_mle_detach_hb_events(dlm, oldmle);
dlm_put_mle(oldmle); dlm_put_mle(oldmle);
@ -2554,6 +2559,7 @@ fail:
if (mle_added) { if (mle_added) {
dlm_mle_detach_hb_events(dlm, mle); dlm_mle_detach_hb_events(dlm, mle);
dlm_put_mle(mle); dlm_put_mle(mle);
dlm_put_mle_inuse(mle);
} else if (mle) { } else if (mle) {
kmem_cache_free(dlm_mle_cache, mle); kmem_cache_free(dlm_mle_cache, mle);
mle = NULL; mle = NULL;
@ -2571,17 +2577,6 @@ fail:
* ensure that all assert_master work is flushed. */ * ensure that all assert_master work is flushed. */
flush_workqueue(dlm->dlm_worker); flush_workqueue(dlm->dlm_worker);
/* get an extra reference on the mle.
* otherwise the assert_master from the new
* master will destroy this.
* also, make sure that all callers of dlm_get_mle
* take both dlm->spinlock and dlm->master_lock */
spin_lock(&dlm->spinlock);
spin_lock(&dlm->master_lock);
dlm_get_mle_inuse(mle);
spin_unlock(&dlm->master_lock);
spin_unlock(&dlm->spinlock);
/* notify new node and send all lock state */ /* notify new node and send all lock state */
/* call send_one_lockres with migration flag. /* call send_one_lockres with migration flag.
* this serves as notice to the target node that a * this serves as notice to the target node that a
@ -3050,7 +3045,7 @@ int dlm_migrate_request_handler(struct o2net_msg *msg, u32 len, void *data,
int ret = 0; int ret = 0;
if (!dlm_grab(dlm)) if (!dlm_grab(dlm))
return -EINVAL; return 0;
name = migrate->name; name = migrate->name;
namelen = migrate->namelen; namelen = migrate->namelen;
@ -3141,7 +3136,8 @@ static int dlm_add_migration_mle(struct dlm_ctxt *dlm,
mlog(0, "tried to migrate %.*s, but some " mlog(0, "tried to migrate %.*s, but some "
"process beat me to it\n", "process beat me to it\n",
namelen, name); namelen, name);
ret = -EEXIST; spin_unlock(&tmp->spinlock);
return -EEXIST;
} else { } else {
/* bad. 2 NODES are trying to migrate! */ /* bad. 2 NODES are trying to migrate! */
mlog(ML_ERROR, "migration error mle: " mlog(ML_ERROR, "migration error mle: "
@ -3312,6 +3308,15 @@ top:
mle->new_master != dead_node) mle->new_master != dead_node)
continue; continue;
if (mle->new_master == dead_node && mle->inuse) {
mlog(ML_NOTICE, "%s: target %u died during "
"migration from %u, the MLE is "
"still keep used, ignore it!\n",
dlm->name, dead_node,
mle->master);
continue;
}
/* If we have reached this point, this mle needs to be /* If we have reached this point, this mle needs to be
* removed from the list and freed. */ * removed from the list and freed. */
dlm_clean_migration_mle(dlm, mle); dlm_clean_migration_mle(dlm, mle);


@ -1373,6 +1373,7 @@ int dlm_mig_lockres_handler(struct o2net_msg *msg, u32 len, void *data,
char *buf = NULL; char *buf = NULL;
struct dlm_work_item *item = NULL; struct dlm_work_item *item = NULL;
struct dlm_lock_resource *res = NULL; struct dlm_lock_resource *res = NULL;
unsigned int hash;
if (!dlm_grab(dlm)) if (!dlm_grab(dlm))
return -EINVAL; return -EINVAL;
@ -1400,7 +1401,10 @@ int dlm_mig_lockres_handler(struct o2net_msg *msg, u32 len, void *data,
/* lookup the lock to see if we have a secondary queue for this /* lookup the lock to see if we have a secondary queue for this
* already... just add the locks in and this will have its owner * already... just add the locks in and this will have its owner
* and RECOVERY flag changed when it completes. */ * and RECOVERY flag changed when it completes. */
res = dlm_lookup_lockres(dlm, mres->lockname, mres->lockname_len); hash = dlm_lockid_hash(mres->lockname, mres->lockname_len);
spin_lock(&dlm->spinlock);
res = __dlm_lookup_lockres(dlm, mres->lockname, mres->lockname_len,
hash);
if (res) { if (res) {
/* this will get a ref on res */ /* this will get a ref on res */
/* mark it as recovering/migrating and hash it */ /* mark it as recovering/migrating and hash it */
@ -1421,13 +1425,16 @@ int dlm_mig_lockres_handler(struct o2net_msg *msg, u32 len, void *data,
mres->lockname_len, mres->lockname); mres->lockname_len, mres->lockname);
ret = -EFAULT; ret = -EFAULT;
spin_unlock(&res->spinlock); spin_unlock(&res->spinlock);
spin_unlock(&dlm->spinlock);
dlm_lockres_put(res); dlm_lockres_put(res);
goto leave; goto leave;
} }
res->state |= DLM_LOCK_RES_MIGRATING; res->state |= DLM_LOCK_RES_MIGRATING;
} }
spin_unlock(&res->spinlock); spin_unlock(&res->spinlock);
spin_unlock(&dlm->spinlock);
} else { } else {
spin_unlock(&dlm->spinlock);
/* need to allocate, just like if it was /* need to allocate, just like if it was
* mastered here normally */ * mastered here normally */
res = dlm_new_lockres(dlm, mres->lockname, mres->lockname_len); res = dlm_new_lockres(dlm, mres->lockname, mres->lockname_len);
@ -2450,11 +2457,7 @@ static void __dlm_hb_node_down(struct dlm_ctxt *dlm, int idx)
* perhaps later we can genericize this for other waiters. */ * perhaps later we can genericize this for other waiters. */
wake_up(&dlm->migration_wq); wake_up(&dlm->migration_wq);
if (test_bit(idx, dlm->recovery_map)) set_bit(idx, dlm->recovery_map);
mlog(0, "domain %s, node %u already added "
"to recovery map!\n", dlm->name, idx);
else
set_bit(idx, dlm->recovery_map);
} }
void dlm_hb_node_down_cb(struct o2nm_node *node, int idx, void *data) void dlm_hb_node_down_cb(struct o2nm_node *node, int idx, void *data)


@ -421,7 +421,7 @@ int dlm_unlock_lock_handler(struct o2net_msg *msg, u32 len, void *data,
} }
if (!dlm_grab(dlm)) if (!dlm_grab(dlm))
return DLM_REJECTED; return DLM_FORWARD;
mlog_bug_on_msg(!dlm_domain_fully_joined(dlm), mlog_bug_on_msg(!dlm_domain_fully_joined(dlm),
"Domain %s not fully joined!\n", dlm->name); "Domain %s not fully joined!\n", dlm->name);


@ -638,7 +638,7 @@ static int __init init_dlmfs_fs(void)
dlmfs_inode_cache = kmem_cache_create("dlmfs_inode_cache", dlmfs_inode_cache = kmem_cache_create("dlmfs_inode_cache",
sizeof(struct dlmfs_inode_private), sizeof(struct dlmfs_inode_private),
0, (SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT| 0, (SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|
SLAB_MEM_SPREAD), SLAB_MEM_SPREAD|SLAB_ACCOUNT),
dlmfs_init_once); dlmfs_init_once);
if (!dlmfs_inode_cache) { if (!dlmfs_inode_cache) {
status = -ENOMEM; status = -ENOMEM;


@ -2432,12 +2432,6 @@ bail:
* done this we have to return AOP_TRUNCATED_PAGE so the aop method * done this we have to return AOP_TRUNCATED_PAGE so the aop method
* that called us can bubble that back up into the VFS who will then * that called us can bubble that back up into the VFS who will then
* immediately retry the aop call. * immediately retry the aop call.
*
* We do a blocking lock and immediate unlock before returning, though, so that
* the lock has a great chance of being cached on this node by the time the VFS
* calls back to retry the aop. This has a potential to livelock as nodes
* ping locks back and forth, but that's a risk we're willing to take to avoid
* the lock inversion simply.
*/ */
int ocfs2_inode_lock_with_page(struct inode *inode, int ocfs2_inode_lock_with_page(struct inode *inode,
struct buffer_head **ret_bh, struct buffer_head **ret_bh,
@ -2449,8 +2443,6 @@ int ocfs2_inode_lock_with_page(struct inode *inode,
ret = ocfs2_inode_lock_full(inode, ret_bh, ex, OCFS2_LOCK_NONBLOCK); ret = ocfs2_inode_lock_full(inode, ret_bh, ex, OCFS2_LOCK_NONBLOCK);
if (ret == -EAGAIN) { if (ret == -EAGAIN) {
unlock_page(page); unlock_page(page);
if (ocfs2_inode_lock(inode, ret_bh, ex) == 0)
ocfs2_inode_unlock(inode, ex);
ret = AOP_TRUNCATED_PAGE; ret = AOP_TRUNCATED_PAGE;
} }


@ -1302,6 +1302,14 @@ int ocfs2_getattr(struct vfsmount *mnt,
} }
generic_fillattr(inode, stat); generic_fillattr(inode, stat);
/*
* If there is inline data in the inode, the inode will normally not
* have data blocks allocated (it may have an external xattr block).
* Report at least one sector for such files, so tools like tar, rsync,
* others don't incorrectly think the file is completely sparse.
*/
if (unlikely(OCFS2_I(inode)->ip_dyn_features & OCFS2_INLINE_DATA_FL))
stat->blocks += (stat->size + 511)>>9;
/* We set the blksize from the cluster size for performance */ /* We set the blksize from the cluster size for performance */
stat->blksize = osb->s_clustersize; stat->blksize = osb->s_clustersize;
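As an aside, the block count added above is plain 512-byte-sector rounding; a minimal standalone sketch of that arithmetic (hypothetical sizes, not part of the commit):

	/* Illustrative only: round a byte count up to 512-byte sectors,
	 * as ocfs2_getattr() now does for inline-data inodes. */
	#include <stdio.h>

	static unsigned long long bytes_to_sectors(unsigned long long size)
	{
		return (size + 511) >> 9;
	}

	int main(void)
	{
		printf("%llu\n", bytes_to_sectors(100));   /* 1 sector  */
		printf("%llu\n", bytes_to_sectors(1024));  /* 2 sectors */
		return 0;
	}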


@ -606,9 +606,7 @@ bail:
if (gb_inode) if (gb_inode)
mutex_unlock(&gb_inode->i_mutex); mutex_unlock(&gb_inode->i_mutex);
if (gb_inode) iput(gb_inode);
iput(gb_inode);
brelse(bh); brelse(bh);
return status; return status;


@ -1042,8 +1042,7 @@ void ocfs2_journal_shutdown(struct ocfs2_super *osb)
// up_write(&journal->j_trans_barrier); // up_write(&journal->j_trans_barrier);
done: done:
if (inode) iput(inode);
iput(inode);
} }
static void ocfs2_clear_journal_error(struct super_block *sb, static void ocfs2_clear_journal_error(struct super_block *sb,
@ -1687,9 +1686,7 @@ done:
if (got_lock) if (got_lock)
ocfs2_inode_unlock(inode, 1); ocfs2_inode_unlock(inode, 1);
if (inode) iput(inode);
iput(inode);
brelse(bh); brelse(bh);
return status; return status;
@ -1796,8 +1793,7 @@ static int ocfs2_trylock_journal(struct ocfs2_super *osb,
ocfs2_inode_unlock(inode, 1); ocfs2_inode_unlock(inode, 1);
bail: bail:
if (inode) iput(inode);
iput(inode);
return status; return status;
} }


@ -358,8 +358,7 @@ int ocfs2_load_local_alloc(struct ocfs2_super *osb)
bail: bail:
if (status < 0) if (status < 0)
brelse(alloc_bh); brelse(alloc_bh);
if (inode) iput(inode);
iput(inode);
trace_ocfs2_load_local_alloc(osb->local_alloc_bits); trace_ocfs2_load_local_alloc(osb->local_alloc_bits);
@ -473,8 +472,7 @@ out_mutex:
iput(main_bm_inode); iput(main_bm_inode);
out: out:
if (local_alloc_inode) iput(local_alloc_inode);
iput(local_alloc_inode);
kfree(alloc_copy); kfree(alloc_copy);
} }
@ -1327,9 +1325,7 @@ bail:
brelse(main_bm_bh); brelse(main_bm_bh);
if (main_bm_inode) iput(main_bm_inode);
iput(main_bm_inode);
kfree(alloc_copy); kfree(alloc_copy);
if (ac) if (ac)


@ -1683,8 +1683,7 @@ bail:
if (new_inode) if (new_inode)
sync_mapping_buffers(old_inode->i_mapping); sync_mapping_buffers(old_inode->i_mapping);
if (new_inode) iput(new_inode);
iput(new_inode);
ocfs2_free_dir_lookup_result(&target_lookup_res); ocfs2_free_dir_lookup_result(&target_lookup_res);
ocfs2_free_dir_lookup_result(&old_entry_lookup); ocfs2_free_dir_lookup_result(&old_entry_lookup);
@ -2373,6 +2372,15 @@ int ocfs2_orphan_del(struct ocfs2_super *osb,
(unsigned long long)OCFS2_I(orphan_dir_inode)->ip_blkno, (unsigned long long)OCFS2_I(orphan_dir_inode)->ip_blkno,
name, strlen(name)); name, strlen(name));
status = ocfs2_journal_access_di(handle,
INODE_CACHE(orphan_dir_inode),
orphan_dir_bh,
OCFS2_JOURNAL_ACCESS_WRITE);
if (status < 0) {
mlog_errno(status);
goto leave;
}
/* find it's spot in the orphan directory */ /* find it's spot in the orphan directory */
status = ocfs2_find_entry(name, strlen(name), orphan_dir_inode, status = ocfs2_find_entry(name, strlen(name), orphan_dir_inode,
&lookup); &lookup);
@ -2388,15 +2396,6 @@ int ocfs2_orphan_del(struct ocfs2_super *osb,
goto leave; goto leave;
} }
status = ocfs2_journal_access_di(handle,
INODE_CACHE(orphan_dir_inode),
orphan_dir_bh,
OCFS2_JOURNAL_ACCESS_WRITE);
if (status < 0) {
mlog_errno(status);
goto leave;
}
/* do the i_nlink dance! :) */ /* do the i_nlink dance! :) */
orphan_fe = (struct ocfs2_dinode *) orphan_dir_bh->b_data; orphan_fe = (struct ocfs2_dinode *) orphan_dir_bh->b_data;
if (S_ISDIR(inode->i_mode)) if (S_ISDIR(inode->i_mode))


@ -322,8 +322,7 @@ static void __ocfs2_free_slot_info(struct ocfs2_slot_info *si)
if (si == NULL) if (si == NULL)
return; return;
if (si->si_inode) iput(si->si_inode);
iput(si->si_inode);
if (si->si_bh) { if (si->si_bh) {
for (i = 0; i < si->si_blocks; i++) { for (i = 0; i < si->si_blocks; i++) {
if (si->si_bh[i]) { if (si->si_bh[i]) {
@ -503,8 +502,17 @@ int ocfs2_find_slot(struct ocfs2_super *osb)
trace_ocfs2_find_slot(osb->slot_num); trace_ocfs2_find_slot(osb->slot_num);
status = ocfs2_update_disk_slot(osb, si, osb->slot_num); status = ocfs2_update_disk_slot(osb, si, osb->slot_num);
if (status < 0) if (status < 0) {
mlog_errno(status); mlog_errno(status);
/*
* if write block failed, invalidate slot to avoid overwrite
* slot during dismount in case another node rightly has mounted
*/
spin_lock(&osb->osb_lock);
ocfs2_invalidate_slot(si, osb->slot_num);
osb->slot_num = OCFS2_INVALID_SLOT;
spin_unlock(&osb->osb_lock);
}
bail: bail:
return status; return status;


@ -1280,6 +1280,8 @@ static int ocfs2_parse_options(struct super_block *sb,
int status, user_stack = 0; int status, user_stack = 0;
char *p; char *p;
u32 tmp; u32 tmp;
int token, option;
substring_t args[MAX_OPT_ARGS];
trace_ocfs2_parse_options(is_remount, options ? options : "(none)"); trace_ocfs2_parse_options(is_remount, options ? options : "(none)");
@ -1298,9 +1300,6 @@ static int ocfs2_parse_options(struct super_block *sb,
} }
while ((p = strsep(&options, ",")) != NULL) { while ((p = strsep(&options, ",")) != NULL) {
int token, option;
substring_t args[MAX_OPT_ARGS];
if (!*p) if (!*p)
continue; continue;
@ -1367,7 +1366,6 @@ static int ocfs2_parse_options(struct super_block *sb,
mopt->atime_quantum = option; mopt->atime_quantum = option;
break; break;
case Opt_slot: case Opt_slot:
option = 0;
if (match_int(&args[0], &option)) { if (match_int(&args[0], &option)) {
status = 0; status = 0;
goto bail; goto bail;
@ -1376,7 +1374,6 @@ static int ocfs2_parse_options(struct super_block *sb,
mopt->slot = (s16)option; mopt->slot = (s16)option;
break; break;
case Opt_commit: case Opt_commit:
option = 0;
if (match_int(&args[0], &option)) { if (match_int(&args[0], &option)) {
status = 0; status = 0;
goto bail; goto bail;
@ -1388,7 +1385,6 @@ static int ocfs2_parse_options(struct super_block *sb,
mopt->commit_interval = HZ * option; mopt->commit_interval = HZ * option;
break; break;
case Opt_localalloc: case Opt_localalloc:
option = 0;
if (match_int(&args[0], &option)) { if (match_int(&args[0], &option)) {
status = 0; status = 0;
goto bail; goto bail;
@ -1726,8 +1722,7 @@ static int ocfs2_statfs(struct dentry *dentry, struct kstatfs *buf)
ocfs2_inode_unlock(inode, 0); ocfs2_inode_unlock(inode, 0);
status = 0; status = 0;
bail: bail:
if (inode) iput(inode);
iput(inode);
if (status) if (status)
mlog_errno(status); mlog_errno(status);
@ -1771,7 +1766,7 @@ static int ocfs2_initialize_mem_caches(void)
sizeof(struct ocfs2_inode_info), sizeof(struct ocfs2_inode_info),
0, 0,
(SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT| (SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|
SLAB_MEM_SPREAD), SLAB_MEM_SPREAD|SLAB_ACCOUNT),
ocfs2_inode_init_once); ocfs2_inode_init_once);
ocfs2_dquot_cachep = kmem_cache_create("ocfs2_dquot_cache", ocfs2_dquot_cachep = kmem_cache_create("ocfs2_dquot_cache",
sizeof(struct ocfs2_dquot), sizeof(struct ocfs2_dquot),


@ -443,7 +443,7 @@ static int __init init_openprom_fs(void)
sizeof(struct op_inode_info), sizeof(struct op_inode_info),
0, 0,
(SLAB_RECLAIM_ACCOUNT | (SLAB_RECLAIM_ACCOUNT |
SLAB_MEM_SPREAD), SLAB_MEM_SPREAD | SLAB_ACCOUNT),
op_inode_init_once); op_inode_init_once);
if (!op_inode_cachep) if (!op_inode_cachep)
return -ENOMEM; return -ENOMEM;


@ -95,7 +95,8 @@ void __init proc_init_inodecache(void)
proc_inode_cachep = kmem_cache_create("proc_inode_cache", proc_inode_cachep = kmem_cache_create("proc_inode_cache",
sizeof(struct proc_inode), sizeof(struct proc_inode),
0, (SLAB_RECLAIM_ACCOUNT| 0, (SLAB_RECLAIM_ACCOUNT|
SLAB_MEM_SPREAD|SLAB_PANIC), SLAB_MEM_SPREAD|SLAB_ACCOUNT|
SLAB_PANIC),
init_once); init_once);
} }


@ -57,11 +57,8 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
/* /*
* Estimate the amount of memory available for userspace allocations, * Estimate the amount of memory available for userspace allocations,
* without causing swapping. * without causing swapping.
*
* Free memory cannot be taken below the low watermark, before the
* system starts swapping.
*/ */
available = i.freeram - wmark_low; available = i.freeram - totalreserve_pages;
/* /*
* Not all the page cache can be freed, otherwise the system will * Not all the page cache can be freed, otherwise the system will


@ -14,6 +14,7 @@
#include <linux/swapops.h> #include <linux/swapops.h>
#include <linux/mmu_notifier.h> #include <linux/mmu_notifier.h>
#include <linux/page_idle.h> #include <linux/page_idle.h>
#include <linux/shmem_fs.h>
#include <asm/elf.h> #include <asm/elf.h>
#include <asm/uaccess.h> #include <asm/uaccess.h>
@ -22,9 +23,13 @@
void task_mem(struct seq_file *m, struct mm_struct *mm) void task_mem(struct seq_file *m, struct mm_struct *mm)
{ {
unsigned long data, text, lib, swap, ptes, pmds; unsigned long text, lib, swap, ptes, pmds, anon, file, shmem;
unsigned long hiwater_vm, total_vm, hiwater_rss, total_rss; unsigned long hiwater_vm, total_vm, hiwater_rss, total_rss;
anon = get_mm_counter(mm, MM_ANONPAGES);
file = get_mm_counter(mm, MM_FILEPAGES);
shmem = get_mm_counter(mm, MM_SHMEMPAGES);
/* /*
* Note: to minimize their overhead, mm maintains hiwater_vm and * Note: to minimize their overhead, mm maintains hiwater_vm and
* hiwater_rss only when about to *lower* total_vm or rss. Any * hiwater_rss only when about to *lower* total_vm or rss. Any
@ -35,11 +40,10 @@ void task_mem(struct seq_file *m, struct mm_struct *mm)
hiwater_vm = total_vm = mm->total_vm; hiwater_vm = total_vm = mm->total_vm;
if (hiwater_vm < mm->hiwater_vm) if (hiwater_vm < mm->hiwater_vm)
hiwater_vm = mm->hiwater_vm; hiwater_vm = mm->hiwater_vm;
hiwater_rss = total_rss = get_mm_rss(mm); hiwater_rss = total_rss = anon + file + shmem;
if (hiwater_rss < mm->hiwater_rss) if (hiwater_rss < mm->hiwater_rss)
hiwater_rss = mm->hiwater_rss; hiwater_rss = mm->hiwater_rss;
data = mm->total_vm - mm->shared_vm - mm->stack_vm;
text = (PAGE_ALIGN(mm->end_code) - (mm->start_code & PAGE_MASK)) >> 10; text = (PAGE_ALIGN(mm->end_code) - (mm->start_code & PAGE_MASK)) >> 10;
lib = (mm->exec_vm << (PAGE_SHIFT-10)) - text; lib = (mm->exec_vm << (PAGE_SHIFT-10)) - text;
swap = get_mm_counter(mm, MM_SWAPENTS); swap = get_mm_counter(mm, MM_SWAPENTS);
@ -52,6 +56,9 @@ void task_mem(struct seq_file *m, struct mm_struct *mm)
"VmPin:\t%8lu kB\n" "VmPin:\t%8lu kB\n"
"VmHWM:\t%8lu kB\n" "VmHWM:\t%8lu kB\n"
"VmRSS:\t%8lu kB\n" "VmRSS:\t%8lu kB\n"
"RssAnon:\t%8lu kB\n"
"RssFile:\t%8lu kB\n"
"RssShmem:\t%8lu kB\n"
"VmData:\t%8lu kB\n" "VmData:\t%8lu kB\n"
"VmStk:\t%8lu kB\n" "VmStk:\t%8lu kB\n"
"VmExe:\t%8lu kB\n" "VmExe:\t%8lu kB\n"
@ -65,7 +72,10 @@ void task_mem(struct seq_file *m, struct mm_struct *mm)
mm->pinned_vm << (PAGE_SHIFT-10), mm->pinned_vm << (PAGE_SHIFT-10),
hiwater_rss << (PAGE_SHIFT-10), hiwater_rss << (PAGE_SHIFT-10),
total_rss << (PAGE_SHIFT-10), total_rss << (PAGE_SHIFT-10),
data << (PAGE_SHIFT-10), anon << (PAGE_SHIFT-10),
file << (PAGE_SHIFT-10),
shmem << (PAGE_SHIFT-10),
mm->data_vm << (PAGE_SHIFT-10),
mm->stack_vm << (PAGE_SHIFT-10), text, lib, mm->stack_vm << (PAGE_SHIFT-10), text, lib,
ptes >> 10, ptes >> 10,
pmds >> 10, pmds >> 10,
@ -82,10 +92,11 @@ unsigned long task_statm(struct mm_struct *mm,
unsigned long *shared, unsigned long *text, unsigned long *shared, unsigned long *text,
unsigned long *data, unsigned long *resident) unsigned long *data, unsigned long *resident)
{ {
*shared = get_mm_counter(mm, MM_FILEPAGES); *shared = get_mm_counter(mm, MM_FILEPAGES) +
get_mm_counter(mm, MM_SHMEMPAGES);
*text = (PAGE_ALIGN(mm->end_code) - (mm->start_code & PAGE_MASK)) *text = (PAGE_ALIGN(mm->end_code) - (mm->start_code & PAGE_MASK))
>> PAGE_SHIFT; >> PAGE_SHIFT;
*data = mm->total_vm - mm->shared_vm; *data = mm->data_vm + mm->stack_vm;
*resident = *shared + get_mm_counter(mm, MM_ANONPAGES); *resident = *shared + get_mm_counter(mm, MM_ANONPAGES);
return mm->total_vm; return mm->total_vm;
} }
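The new RssAnon/RssFile/RssShmem breakdown emitted above is visible directly in /proc/PID/status; a minimal userspace sketch (assumes a kernel with this patch applied) that dumps those lines for the current process:

	/* Illustrative only: print the RssAnon, RssFile and RssShmem lines
	 * produced by the updated task_mem() above. */
	#include <stdio.h>
	#include <string.h>

	int main(void)
	{
		FILE *f = fopen("/proc/self/status", "r");
		char line[256];

		if (!f)
			return 1;
		while (fgets(line, sizeof(line), f)) {
			if (strncmp(line, "Rss", 3) == 0)
				fputs(line, stdout);
		}
		fclose(f);
		return 0;
	}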
@ -451,6 +462,7 @@ struct mem_size_stats {
unsigned long private_hugetlb; unsigned long private_hugetlb;
u64 pss; u64 pss;
u64 swap_pss; u64 swap_pss;
bool check_shmem_swap;
}; };
static void smaps_account(struct mem_size_stats *mss, struct page *page, static void smaps_account(struct mem_size_stats *mss, struct page *page,
@ -485,6 +497,19 @@ static void smaps_account(struct mem_size_stats *mss, struct page *page,
} }
} }
#ifdef CONFIG_SHMEM
static int smaps_pte_hole(unsigned long addr, unsigned long end,
struct mm_walk *walk)
{
struct mem_size_stats *mss = walk->private;
mss->swap += shmem_partial_swap_usage(
walk->vma->vm_file->f_mapping, addr, end);
return 0;
}
#endif
static void smaps_pte_entry(pte_t *pte, unsigned long addr, static void smaps_pte_entry(pte_t *pte, unsigned long addr,
struct mm_walk *walk) struct mm_walk *walk)
{ {
@ -512,6 +537,19 @@ static void smaps_pte_entry(pte_t *pte, unsigned long addr,
} }
} else if (is_migration_entry(swpent)) } else if (is_migration_entry(swpent))
page = migration_entry_to_page(swpent); page = migration_entry_to_page(swpent);
} else if (unlikely(IS_ENABLED(CONFIG_SHMEM) && mss->check_shmem_swap
&& pte_none(*pte))) {
page = find_get_entry(vma->vm_file->f_mapping,
linear_page_index(vma, addr));
if (!page)
return;
if (radix_tree_exceptional_entry(page))
mss->swap += PAGE_SIZE;
else
page_cache_release(page);
return;
} }
if (!page) if (!page)
@ -671,6 +709,31 @@ static int show_smap(struct seq_file *m, void *v, int is_pid)
}; };
memset(&mss, 0, sizeof mss); memset(&mss, 0, sizeof mss);
#ifdef CONFIG_SHMEM
if (vma->vm_file && shmem_mapping(vma->vm_file->f_mapping)) {
/*
* For shared or readonly shmem mappings we know that all
* swapped out pages belong to the shmem object, and we can
* obtain the swap value much more efficiently. For private
* writable mappings, we might have COW pages that are
* not affected by the parent swapped out pages of the shmem
* object, so we have to distinguish them during the page walk.
* Unless we know that the shmem object (or the part mapped by
* our VMA) has no swapped out pages at all.
*/
unsigned long shmem_swapped = shmem_swap_usage(vma);
if (!shmem_swapped || (vma->vm_flags & VM_SHARED) ||
!(vma->vm_flags & VM_WRITE)) {
mss.swap = shmem_swapped;
} else {
mss.check_shmem_swap = true;
smaps_walk.pte_hole = smaps_pte_hole;
}
}
#endif
/* mmap_sem is held in m_start */ /* mmap_sem is held in m_start */
walk_page_vma(vma, &smaps_walk); walk_page_vma(vma, &smaps_walk);
@ -817,9 +880,6 @@ static inline void clear_soft_dirty_pmd(struct vm_area_struct *vma,
pmd = pmd_wrprotect(pmd); pmd = pmd_wrprotect(pmd);
pmd = pmd_clear_soft_dirty(pmd); pmd = pmd_clear_soft_dirty(pmd);
if (vma->vm_flags & VM_SOFTDIRTY)
vma->vm_flags &= ~VM_SOFTDIRTY;
set_pmd_at(vma->vm_mm, addr, pmdp, pmd); set_pmd_at(vma->vm_mm, addr, pmdp, pmd);
} }
#else #else


@ -365,7 +365,7 @@ static int init_inodecache(void)
qnx4_inode_cachep = kmem_cache_create("qnx4_inode_cache", qnx4_inode_cachep = kmem_cache_create("qnx4_inode_cache",
sizeof(struct qnx4_inode_info), sizeof(struct qnx4_inode_info),
0, (SLAB_RECLAIM_ACCOUNT| 0, (SLAB_RECLAIM_ACCOUNT|
SLAB_MEM_SPREAD), SLAB_MEM_SPREAD|SLAB_ACCOUNT),
init_once); init_once);
if (qnx4_inode_cachep == NULL) if (qnx4_inode_cachep == NULL)
return -ENOMEM; return -ENOMEM;


@ -625,7 +625,7 @@ static int init_inodecache(void)
qnx6_inode_cachep = kmem_cache_create("qnx6_inode_cache", qnx6_inode_cachep = kmem_cache_create("qnx6_inode_cache",
sizeof(struct qnx6_inode_info), sizeof(struct qnx6_inode_info),
0, (SLAB_RECLAIM_ACCOUNT| 0, (SLAB_RECLAIM_ACCOUNT|
SLAB_MEM_SPREAD), SLAB_MEM_SPREAD|SLAB_ACCOUNT),
init_once); init_once);
if (!qnx6_inode_cachep) if (!qnx6_inode_cachep)
return -ENOMEM; return -ENOMEM;


@ -626,7 +626,8 @@ static int __init init_inodecache(void)
sizeof(struct sizeof(struct
reiserfs_inode_info), reiserfs_inode_info),
0, (SLAB_RECLAIM_ACCOUNT| 0, (SLAB_RECLAIM_ACCOUNT|
SLAB_MEM_SPREAD), SLAB_MEM_SPREAD|
SLAB_ACCOUNT),
init_once); init_once);
if (reiserfs_inode_cachep == NULL) if (reiserfs_inode_cachep == NULL)
return -ENOMEM; return -ENOMEM;


@ -619,8 +619,8 @@ static int __init init_romfs_fs(void)
romfs_inode_cachep = romfs_inode_cachep =
kmem_cache_create("romfs_i", kmem_cache_create("romfs_i",
sizeof(struct romfs_inode_info), 0, sizeof(struct romfs_inode_info), 0,
SLAB_RECLAIM_ACCOUNT | SLAB_MEM_SPREAD, SLAB_RECLAIM_ACCOUNT | SLAB_MEM_SPREAD |
romfs_i_init_once); SLAB_ACCOUNT, romfs_i_init_once);
if (!romfs_inode_cachep) { if (!romfs_inode_cachep) {
pr_err("Failed to initialise inode cache\n"); pr_err("Failed to initialise inode cache\n");


@ -419,7 +419,8 @@ static int __init init_inodecache(void)
{ {
squashfs_inode_cachep = kmem_cache_create("squashfs_inode_cache", squashfs_inode_cachep = kmem_cache_create("squashfs_inode_cache",
sizeof(struct squashfs_inode_info), 0, sizeof(struct squashfs_inode_info), 0,
SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT, init_once); SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT|SLAB_ACCOUNT,
init_once);
return squashfs_inode_cachep ? 0 : -ENOMEM; return squashfs_inode_cachep ? 0 : -ENOMEM;
} }


@ -346,7 +346,7 @@ int __init sysv_init_icache(void)
{ {
sysv_inode_cachep = kmem_cache_create("sysv_inode_cache", sysv_inode_cachep = kmem_cache_create("sysv_inode_cache",
sizeof(struct sysv_inode_info), 0, sizeof(struct sysv_inode_info), 0,
SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD, SLAB_RECLAIM_ACCOUNT|SLAB_MEM_SPREAD|SLAB_ACCOUNT,
init_once); init_once);
if (!sysv_inode_cachep) if (!sysv_inode_cachep)
return -ENOMEM; return -ENOMEM;


@ -2248,8 +2248,8 @@ static int __init ubifs_init(void)
ubifs_inode_slab = kmem_cache_create("ubifs_inode_slab", ubifs_inode_slab = kmem_cache_create("ubifs_inode_slab",
sizeof(struct ubifs_inode), 0, sizeof(struct ubifs_inode), 0,
SLAB_MEM_SPREAD | SLAB_RECLAIM_ACCOUNT, SLAB_MEM_SPREAD | SLAB_RECLAIM_ACCOUNT |
&inode_slab_ctor); SLAB_ACCOUNT, &inode_slab_ctor);
if (!ubifs_inode_slab) if (!ubifs_inode_slab)
return -ENOMEM; return -ENOMEM;


@ -179,7 +179,8 @@ static int __init init_inodecache(void)
udf_inode_cachep = kmem_cache_create("udf_inode_cache", udf_inode_cachep = kmem_cache_create("udf_inode_cache",
sizeof(struct udf_inode_info), sizeof(struct udf_inode_info),
0, (SLAB_RECLAIM_ACCOUNT | 0, (SLAB_RECLAIM_ACCOUNT |
SLAB_MEM_SPREAD), SLAB_MEM_SPREAD |
SLAB_ACCOUNT),
init_once); init_once);
if (!udf_inode_cachep) if (!udf_inode_cachep)
return -ENOMEM; return -ENOMEM;


@ -1427,7 +1427,7 @@ static int __init init_inodecache(void)
ufs_inode_cachep = kmem_cache_create("ufs_inode_cache", ufs_inode_cachep = kmem_cache_create("ufs_inode_cache",
sizeof(struct ufs_inode_info), sizeof(struct ufs_inode_info),
0, (SLAB_RECLAIM_ACCOUNT| 0, (SLAB_RECLAIM_ACCOUNT|
SLAB_MEM_SPREAD), SLAB_MEM_SPREAD|SLAB_ACCOUNT),
init_once); init_once);
if (ufs_inode_cachep == NULL) if (ufs_inode_cachep == NULL)
return -ENOMEM; return -ENOMEM;


@ -84,6 +84,7 @@ kmem_zalloc(size_t size, xfs_km_flags_t flags)
#define KM_ZONE_HWALIGN SLAB_HWCACHE_ALIGN #define KM_ZONE_HWALIGN SLAB_HWCACHE_ALIGN
#define KM_ZONE_RECLAIM SLAB_RECLAIM_ACCOUNT #define KM_ZONE_RECLAIM SLAB_RECLAIM_ACCOUNT
#define KM_ZONE_SPREAD SLAB_MEM_SPREAD #define KM_ZONE_SPREAD SLAB_MEM_SPREAD
#define KM_ZONE_ACCOUNT SLAB_ACCOUNT
#define kmem_zone kmem_cache #define kmem_zone kmem_cache
#define kmem_zone_t struct kmem_cache #define kmem_zone_t struct kmem_cache


@ -1714,8 +1714,8 @@ xfs_init_zones(void)
xfs_inode_zone = xfs_inode_zone =
kmem_zone_init_flags(sizeof(xfs_inode_t), "xfs_inode", kmem_zone_init_flags(sizeof(xfs_inode_t), "xfs_inode",
KM_ZONE_HWALIGN | KM_ZONE_RECLAIM | KM_ZONE_SPREAD, KM_ZONE_HWALIGN | KM_ZONE_RECLAIM | KM_ZONE_SPREAD |
xfs_fs_inode_init_once); KM_ZONE_ACCOUNT, xfs_fs_inode_init_once);
if (!xfs_inode_zone) if (!xfs_inode_zone)
goto out_destroy_efi_zone; goto out_destroy_efi_zone;


@ -1,6 +1,8 @@
#ifndef __ASM_MEMORY_MODEL_H #ifndef __ASM_MEMORY_MODEL_H
#define __ASM_MEMORY_MODEL_H #define __ASM_MEMORY_MODEL_H
#include <linux/pfn.h>
#ifndef __ASSEMBLY__ #ifndef __ASSEMBLY__
#if defined(CONFIG_FLATMEM) #if defined(CONFIG_FLATMEM)
@ -72,7 +74,7 @@
/* /*
* Convert a physical address to a Page Frame Number and back * Convert a physical address to a Page Frame Number and back
*/ */
#define __phys_to_pfn(paddr) ((unsigned long)((paddr) >> PAGE_SHIFT)) #define __phys_to_pfn(paddr) PHYS_PFN(paddr)
#define __pfn_to_phys(pfn) PFN_PHYS(pfn) #define __pfn_to_phys(pfn) PFN_PHYS(pfn)
#define page_to_pfn __page_to_pfn #define page_to_pfn __page_to_pfn


@ -27,10 +27,10 @@ struct vfsmount;
/* The hash is always the low bits of hash_len */ /* The hash is always the low bits of hash_len */
#ifdef __LITTLE_ENDIAN #ifdef __LITTLE_ENDIAN
#define HASH_LEN_DECLARE u32 hash; u32 len; #define HASH_LEN_DECLARE u32 hash; u32 len
#define bytemask_from_count(cnt) (~(~0ul << (cnt)*8)) #define bytemask_from_count(cnt) (~(~0ul << (cnt)*8))
#else #else
#define HASH_LEN_DECLARE u32 len; u32 hash; #define HASH_LEN_DECLARE u32 len; u32 hash
#define bytemask_from_count(cnt) (~(~0ul >> (cnt)*8)) #define bytemask_from_count(cnt) (~(~0ul >> (cnt)*8))
#endif #endif
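For context, HASH_LEN_DECLARE is expanded inside an anonymous struct/union (as in struct qstr in the same header), so the trailing semicolon now belongs at the use site. A sketch of that pattern (kernel context assumed, hypothetical struct name):

	/* Sketch: mirrors the struct qstr layout in <linux/dcache.h>;
	 * the macro now supplies only "u32 hash; u32 len" and the
	 * use site adds the ';'. */
	struct example_name {
		union {
			struct {
				HASH_LEN_DECLARE;
			};
			u64 hash_len;
		};
		const unsigned char *name;
	};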


@ -220,7 +220,10 @@ struct fsnotify_mark {
/* List of marks by group->i_fsnotify_marks. Also reused for queueing /* List of marks by group->i_fsnotify_marks. Also reused for queueing
* mark into destroy_list when it's waiting for the end of SRCU period * mark into destroy_list when it's waiting for the end of SRCU period
* before it can be freed. [group->mark_mutex] */ * before it can be freed. [group->mark_mutex] */
struct list_head g_list; union {
struct list_head g_list;
struct rcu_head g_rcu;
};
/* Protects inode / mnt pointers, flags, masks */ /* Protects inode / mnt pointers, flags, masks */
spinlock_t lock; spinlock_t lock;
/* List of marks for inode / vfsmount [obj_lock] */ /* List of marks for inode / vfsmount [obj_lock] */


@ -30,7 +30,7 @@ struct vm_area_struct;
#define ___GFP_HARDWALL 0x20000u #define ___GFP_HARDWALL 0x20000u
#define ___GFP_THISNODE 0x40000u #define ___GFP_THISNODE 0x40000u
#define ___GFP_ATOMIC 0x80000u #define ___GFP_ATOMIC 0x80000u
#define ___GFP_NOACCOUNT 0x100000u #define ___GFP_ACCOUNT 0x100000u
#define ___GFP_NOTRACK 0x200000u #define ___GFP_NOTRACK 0x200000u
#define ___GFP_DIRECT_RECLAIM 0x400000u #define ___GFP_DIRECT_RECLAIM 0x400000u
#define ___GFP_OTHER_NODE 0x800000u #define ___GFP_OTHER_NODE 0x800000u
@ -73,11 +73,15 @@ struct vm_area_struct;
* *
* __GFP_THISNODE forces the allocation to be satisified from the requested * __GFP_THISNODE forces the allocation to be satisified from the requested
* node with no fallbacks or placement policy enforcements. * node with no fallbacks or placement policy enforcements.
*
* __GFP_ACCOUNT causes the allocation to be accounted to kmemcg (only relevant
* to kmem allocations).
*/ */
#define __GFP_RECLAIMABLE ((__force gfp_t)___GFP_RECLAIMABLE) #define __GFP_RECLAIMABLE ((__force gfp_t)___GFP_RECLAIMABLE)
#define __GFP_WRITE ((__force gfp_t)___GFP_WRITE) #define __GFP_WRITE ((__force gfp_t)___GFP_WRITE)
#define __GFP_HARDWALL ((__force gfp_t)___GFP_HARDWALL) #define __GFP_HARDWALL ((__force gfp_t)___GFP_HARDWALL)
#define __GFP_THISNODE ((__force gfp_t)___GFP_THISNODE) #define __GFP_THISNODE ((__force gfp_t)___GFP_THISNODE)
#define __GFP_ACCOUNT ((__force gfp_t)___GFP_ACCOUNT)
/* /*
* Watermark modifiers -- controls access to emergency reserves * Watermark modifiers -- controls access to emergency reserves
@ -104,7 +108,6 @@ struct vm_area_struct;
#define __GFP_HIGH ((__force gfp_t)___GFP_HIGH) #define __GFP_HIGH ((__force gfp_t)___GFP_HIGH)
#define __GFP_MEMALLOC ((__force gfp_t)___GFP_MEMALLOC) #define __GFP_MEMALLOC ((__force gfp_t)___GFP_MEMALLOC)
#define __GFP_NOMEMALLOC ((__force gfp_t)___GFP_NOMEMALLOC) #define __GFP_NOMEMALLOC ((__force gfp_t)___GFP_NOMEMALLOC)
#define __GFP_NOACCOUNT ((__force gfp_t)___GFP_NOACCOUNT)
/* /*
* Reclaim modifiers * Reclaim modifiers
@ -197,6 +200,9 @@ struct vm_area_struct;
* GFP_KERNEL is typical for kernel-internal allocations. The caller requires * GFP_KERNEL is typical for kernel-internal allocations. The caller requires
* ZONE_NORMAL or a lower zone for direct access but can direct reclaim. * ZONE_NORMAL or a lower zone for direct access but can direct reclaim.
* *
* GFP_KERNEL_ACCOUNT is the same as GFP_KERNEL, except the allocation is
* accounted to kmemcg.
*
* GFP_NOWAIT is for kernel allocations that should not stall for direct * GFP_NOWAIT is for kernel allocations that should not stall for direct
* reclaim, start physical IO or use any filesystem callback. * reclaim, start physical IO or use any filesystem callback.
* *
@ -236,6 +242,7 @@ struct vm_area_struct;
*/ */
#define GFP_ATOMIC (__GFP_HIGH|__GFP_ATOMIC|__GFP_KSWAPD_RECLAIM) #define GFP_ATOMIC (__GFP_HIGH|__GFP_ATOMIC|__GFP_KSWAPD_RECLAIM)
#define GFP_KERNEL (__GFP_RECLAIM | __GFP_IO | __GFP_FS) #define GFP_KERNEL (__GFP_RECLAIM | __GFP_IO | __GFP_FS)
#define GFP_KERNEL_ACCOUNT (GFP_KERNEL | __GFP_ACCOUNT)
#define GFP_NOWAIT (__GFP_KSWAPD_RECLAIM) #define GFP_NOWAIT (__GFP_KSWAPD_RECLAIM)
#define GFP_NOIO (__GFP_RECLAIM) #define GFP_NOIO (__GFP_RECLAIM)
#define GFP_NOFS (__GFP_RECLAIM | __GFP_IO) #define GFP_NOFS (__GFP_RECLAIM | __GFP_IO)
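A hedged usage sketch of the new opt-in accounting flag (hypothetical structure and function names; kernel context assumed):

	/* Sketch: a long-lived, user-triggered allocation that should be
	 * charged to the caller's memory cgroup opts in with __GFP_ACCOUNT
	 * via GFP_KERNEL_ACCOUNT; plain GFP_KERNEL allocations are not
	 * charged to kmemcg. */
	struct foo_ctx *foo_ctx_alloc(void)
	{
		struct foo_ctx *ctx;

		ctx = kzalloc(sizeof(*ctx), GFP_KERNEL_ACCOUNT);
		if (!ctx)
			return NULL;
		return ctx;
	}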
@ -271,7 +278,7 @@ static inline int gfpflags_to_migratetype(const gfp_t gfp_flags)
static inline bool gfpflags_allow_blocking(const gfp_t gfp_flags) static inline bool gfpflags_allow_blocking(const gfp_t gfp_flags)
{ {
return (bool __force)(gfp_flags & __GFP_DIRECT_RECLAIM); return !!(gfp_flags & __GFP_DIRECT_RECLAIM);
} }
#ifdef CONFIG_HIGHMEM #ifdef CONFIG_HIGHMEM
@ -377,10 +384,11 @@ static inline enum zone_type gfp_zone(gfp_t flags)
static inline int gfp_zonelist(gfp_t flags) static inline int gfp_zonelist(gfp_t flags)
{ {
if (IS_ENABLED(CONFIG_NUMA) && unlikely(flags & __GFP_THISNODE)) #ifdef CONFIG_NUMA
return 1; if (unlikely(flags & __GFP_THISNODE))
return ZONELIST_NOFALLBACK;
return 0; #endif
return ZONELIST_FALLBACK;
} }
/* /*


@ -263,20 +263,18 @@ struct file *hugetlb_file_setup(const char *name, size_t size, vm_flags_t acct,
struct user_struct **user, int creat_flags, struct user_struct **user, int creat_flags,
int page_size_log); int page_size_log);
static inline int is_file_hugepages(struct file *file) static inline bool is_file_hugepages(struct file *file)
{ {
if (file->f_op == &hugetlbfs_file_operations) if (file->f_op == &hugetlbfs_file_operations)
return 1; return true;
if (is_file_shm_hugepages(file))
return 1;
return 0; return is_file_shm_hugepages(file);
} }
#else /* !CONFIG_HUGETLBFS */ #else /* !CONFIG_HUGETLBFS */
#define is_file_hugepages(file) 0 #define is_file_hugepages(file) false
static inline struct file * static inline struct file *
hugetlb_file_setup(const char *name, size_t size, vm_flags_t acctflag, hugetlb_file_setup(const char *name, size_t size, vm_flags_t acctflag,
struct user_struct **user, int creat_flags, struct user_struct **user, int creat_flags,


@ -216,10 +216,10 @@ void __next_mem_pfn_range(int *idx, int nid, unsigned long *out_start_pfn,
* for_each_free_mem_range - iterate through free memblock areas * for_each_free_mem_range - iterate through free memblock areas
* @i: u64 used as loop variable * @i: u64 used as loop variable
* @nid: node selector, %NUMA_NO_NODE for all nodes * @nid: node selector, %NUMA_NO_NODE for all nodes
* @flags: pick from blocks based on memory attributes
* @p_start: ptr to phys_addr_t for start address of the range, can be %NULL * @p_start: ptr to phys_addr_t for start address of the range, can be %NULL
* @p_end: ptr to phys_addr_t for end address of the range, can be %NULL * @p_end: ptr to phys_addr_t for end address of the range, can be %NULL
* @p_nid: ptr to int for nid of the range, can be %NULL * @p_nid: ptr to int for nid of the range, can be %NULL
* @flags: pick from blocks based on memory attributes
* *
* Walks over free (memory && !reserved) areas of memblock. Available as * Walks over free (memory && !reserved) areas of memblock. Available as
* soon as memblock is initialized. * soon as memblock is initialized.
@ -232,10 +232,10 @@ void __next_mem_pfn_range(int *idx, int nid, unsigned long *out_start_pfn,
* for_each_free_mem_range_reverse - rev-iterate through free memblock areas * for_each_free_mem_range_reverse - rev-iterate through free memblock areas
* @i: u64 used as loop variable * @i: u64 used as loop variable
* @nid: node selector, %NUMA_NO_NODE for all nodes * @nid: node selector, %NUMA_NO_NODE for all nodes
* @flags: pick from blocks based on memory attributes
* @p_start: ptr to phys_addr_t for start address of the range, can be %NULL * @p_start: ptr to phys_addr_t for start address of the range, can be %NULL
* @p_end: ptr to phys_addr_t for end address of the range, can be %NULL * @p_end: ptr to phys_addr_t for end address of the range, can be %NULL
* @p_nid: ptr to int for nid of the range, can be %NULL * @p_nid: ptr to int for nid of the range, can be %NULL
* @flags: pick from blocks based on memory attributes
* *
* Walks over free (memory && !reserved) areas of memblock in reverse * Walks over free (memory && !reserved) areas of memblock in reverse
* order. Available as soon as memblock is initialized. * order. Available as soon as memblock is initialized.
@ -325,10 +325,10 @@ phys_addr_t memblock_mem_size(unsigned long limit_pfn);
phys_addr_t memblock_start_of_DRAM(void); phys_addr_t memblock_start_of_DRAM(void);
phys_addr_t memblock_end_of_DRAM(void); phys_addr_t memblock_end_of_DRAM(void);
void memblock_enforce_memory_limit(phys_addr_t memory_limit); void memblock_enforce_memory_limit(phys_addr_t memory_limit);
int memblock_is_memory(phys_addr_t addr); bool memblock_is_memory(phys_addr_t addr);
int memblock_is_map_memory(phys_addr_t addr); int memblock_is_map_memory(phys_addr_t addr);
int memblock_is_region_memory(phys_addr_t base, phys_addr_t size); int memblock_is_region_memory(phys_addr_t base, phys_addr_t size);
int memblock_is_reserved(phys_addr_t addr); bool memblock_is_reserved(phys_addr_t addr);
bool memblock_is_region_reserved(phys_addr_t base, phys_addr_t size); bool memblock_is_region_reserved(phys_addr_t base, phys_addr_t size);
extern void __memblock_dump_all(void); extern void __memblock_dump_all(void);
@ -399,6 +399,11 @@ static inline unsigned long memblock_region_reserved_end_pfn(const struct memblo
region < (memblock.memblock_type.regions + memblock.memblock_type.cnt); \ region < (memblock.memblock_type.regions + memblock.memblock_type.cnt); \
region++) region++)
#define for_each_memblock_type(memblock_type, rgn) \
idx = 0; \
rgn = &memblock_type->regions[idx]; \
for (idx = 0; idx < memblock_type->cnt; \
idx++,rgn = &memblock_type->regions[idx])
#ifdef CONFIG_ARCH_DISCARD_MEMBLOCK #ifdef CONFIG_ARCH_DISCARD_MEMBLOCK
#define __init_memblock __meminit #define __init_memblock __meminit
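A rough usage sketch of the new iterator added above (assumes early-boot code that can see struct memblock internals; note that 'idx' and 'rgn' are declared by the caller):

	/* Illustrative only: walk every region of memblock.memory. */
	int idx;
	struct memblock_region *rgn;
	struct memblock_type *type = &memblock.memory;

	for_each_memblock_type(type, rgn) {
		pr_debug("region %d: base=%pa size=%pa\n",
			 idx, &rgn->base, &rgn->size);
	}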


@ -85,32 +85,10 @@ enum mem_cgroup_events_target {
MEM_CGROUP_NTARGETS, MEM_CGROUP_NTARGETS,
}; };
/*
* Bits in struct cg_proto.flags
*/
enum cg_proto_flags {
/* Currently active and new sockets should be assigned to cgroups */
MEMCG_SOCK_ACTIVE,
/* It was ever activated; we must disarm static keys on destruction */
MEMCG_SOCK_ACTIVATED,
};
struct cg_proto { struct cg_proto {
struct page_counter memory_allocated; /* Current allocated memory. */ struct page_counter memory_allocated; /* Current allocated memory. */
struct percpu_counter sockets_allocated; /* Current number of sockets. */
int memory_pressure; int memory_pressure;
long sysctl_mem[3]; bool active;
unsigned long flags;
/*
* memcg field is used to find which memcg we belong directly
* Each memcg struct can hold more than one cg_proto, so container_of
* won't really cut.
*
* The elegant solution would be having an inverse function to
* proto_cgroup in struct proto, but that means polluting the structure
* for everybody, instead of just for memcg users.
*/
struct mem_cgroup *memcg;
}; };
#ifdef CONFIG_MEMCG #ifdef CONFIG_MEMCG
@ -192,6 +170,9 @@ struct mem_cgroup {
unsigned long low; unsigned long low;
unsigned long high; unsigned long high;
/* Range enforcement for interrupt charges */
struct work_struct high_work;
unsigned long soft_limit; unsigned long soft_limit;
/* vmpressure notifications */ /* vmpressure notifications */
@ -268,6 +249,10 @@ struct mem_cgroup {
struct wb_domain cgwb_domain; struct wb_domain cgwb_domain;
#endif #endif
#ifdef CONFIG_INET
unsigned long socket_pressure;
#endif
/* List of events which userspace want to receive */ /* List of events which userspace want to receive */
struct list_head event_list; struct list_head event_list;
spinlock_t event_list_lock; spinlock_t event_list_lock;
@ -275,7 +260,8 @@ struct mem_cgroup {
struct mem_cgroup_per_node *nodeinfo[0]; struct mem_cgroup_per_node *nodeinfo[0];
/* WARNING: nodeinfo must be the last member here */ /* WARNING: nodeinfo must be the last member here */
}; };
extern struct cgroup_subsys_state *mem_cgroup_root_css;
extern struct mem_cgroup *root_mem_cgroup;
/** /**
* mem_cgroup_events - count memory events against a cgroup * mem_cgroup_events - count memory events against a cgroup
@ -308,18 +294,34 @@ struct lruvec *mem_cgroup_page_lruvec(struct page *, struct zone *);
bool task_in_mem_cgroup(struct task_struct *task, struct mem_cgroup *memcg); bool task_in_mem_cgroup(struct task_struct *task, struct mem_cgroup *memcg);
struct mem_cgroup *mem_cgroup_from_task(struct task_struct *p); struct mem_cgroup *mem_cgroup_from_task(struct task_struct *p);
struct mem_cgroup *parent_mem_cgroup(struct mem_cgroup *memcg);
static inline static inline
struct mem_cgroup *mem_cgroup_from_css(struct cgroup_subsys_state *css){ struct mem_cgroup *mem_cgroup_from_css(struct cgroup_subsys_state *css){
return css ? container_of(css, struct mem_cgroup, css) : NULL; return css ? container_of(css, struct mem_cgroup, css) : NULL;
} }
#define mem_cgroup_from_counter(counter, member) \
container_of(counter, struct mem_cgroup, member)
struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *, struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *,
struct mem_cgroup *, struct mem_cgroup *,
struct mem_cgroup_reclaim_cookie *); struct mem_cgroup_reclaim_cookie *);
void mem_cgroup_iter_break(struct mem_cgroup *, struct mem_cgroup *); void mem_cgroup_iter_break(struct mem_cgroup *, struct mem_cgroup *);
/**
* parent_mem_cgroup - find the accounting parent of a memcg
* @memcg: memcg whose parent to find
*
* Returns the parent memcg, or NULL if this is the root or the memory
* controller is in legacy no-hierarchy mode.
*/
static inline struct mem_cgroup *parent_mem_cgroup(struct mem_cgroup *memcg)
{
if (!memcg->memory.parent)
return NULL;
return mem_cgroup_from_counter(memcg->memory.parent, memory);
}
static inline bool mem_cgroup_is_descendant(struct mem_cgroup *memcg, static inline bool mem_cgroup_is_descendant(struct mem_cgroup *memcg,
struct mem_cgroup *root) struct mem_cgroup *root)
{ {
@ -671,12 +673,6 @@ void mem_cgroup_count_vm_event(struct mm_struct *mm, enum vm_event_item idx)
} }
#endif /* CONFIG_MEMCG */ #endif /* CONFIG_MEMCG */
enum {
UNDER_LIMIT,
SOFT_LIMIT,
OVER_LIMIT,
};
#ifdef CONFIG_CGROUP_WRITEBACK #ifdef CONFIG_CGROUP_WRITEBACK
struct list_head *mem_cgroup_cgwb_list(struct mem_cgroup *memcg); struct list_head *mem_cgroup_cgwb_list(struct mem_cgroup *memcg);
@ -703,20 +699,35 @@ static inline void mem_cgroup_wb_stats(struct bdi_writeback *wb,
#endif /* CONFIG_CGROUP_WRITEBACK */ #endif /* CONFIG_CGROUP_WRITEBACK */
struct sock; struct sock;
#if defined(CONFIG_INET) && defined(CONFIG_MEMCG_KMEM)
void sock_update_memcg(struct sock *sk); void sock_update_memcg(struct sock *sk);
void sock_release_memcg(struct sock *sk); void sock_release_memcg(struct sock *sk);
bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages);
void mem_cgroup_uncharge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages);
#if defined(CONFIG_MEMCG) && defined(CONFIG_INET)
extern struct static_key_false memcg_sockets_enabled_key;
#define mem_cgroup_sockets_enabled static_branch_unlikely(&memcg_sockets_enabled_key)
static inline bool mem_cgroup_under_socket_pressure(struct mem_cgroup *memcg)
{
#ifdef CONFIG_MEMCG_KMEM
if (memcg->tcp_mem.memory_pressure)
return true;
#endif
do {
if (time_before(jiffies, memcg->socket_pressure))
return true;
} while ((memcg = parent_mem_cgroup(memcg)));
return false;
}
#else #else
static inline void sock_update_memcg(struct sock *sk) #define mem_cgroup_sockets_enabled 0
static inline bool mem_cgroup_under_socket_pressure(struct mem_cgroup *memcg)
{ {
return false;
} }
static inline void sock_release_memcg(struct sock *sk) #endif
{
}
#endif /* CONFIG_INET && CONFIG_MEMCG_KMEM */
#ifdef CONFIG_MEMCG_KMEM #ifdef CONFIG_MEMCG_KMEM
extern struct static_key memcg_kmem_enabled_key; extern struct static_key_false memcg_kmem_enabled_key;
extern int memcg_nr_cache_ids; extern int memcg_nr_cache_ids;
void memcg_get_cache_ids(void); void memcg_get_cache_ids(void);
@ -732,7 +743,7 @@ void memcg_put_cache_ids(void);
static inline bool memcg_kmem_enabled(void) static inline bool memcg_kmem_enabled(void)
{ {
return static_key_false(&memcg_kmem_enabled_key); return static_branch_unlikely(&memcg_kmem_enabled_key);
} }
static inline bool memcg_kmem_is_active(struct mem_cgroup *memcg) static inline bool memcg_kmem_is_active(struct mem_cgroup *memcg)
@ -766,15 +777,13 @@ static inline int memcg_cache_id(struct mem_cgroup *memcg)
return memcg ? memcg->kmemcg_id : -1; return memcg ? memcg->kmemcg_id : -1;
} }
struct kmem_cache *__memcg_kmem_get_cache(struct kmem_cache *cachep); struct kmem_cache *__memcg_kmem_get_cache(struct kmem_cache *cachep, gfp_t gfp);
void __memcg_kmem_put_cache(struct kmem_cache *cachep); void __memcg_kmem_put_cache(struct kmem_cache *cachep);
static inline bool __memcg_kmem_bypass(gfp_t gfp) static inline bool __memcg_kmem_bypass(void)
{ {
if (!memcg_kmem_enabled()) if (!memcg_kmem_enabled())
return true; return true;
if (gfp & __GFP_NOACCOUNT)
return true;
if (in_interrupt() || (!current->mm) || (current->flags & PF_KTHREAD)) if (in_interrupt() || (!current->mm) || (current->flags & PF_KTHREAD))
return true; return true;
return false; return false;
@ -791,7 +800,9 @@ static inline bool __memcg_kmem_bypass(gfp_t gfp)
static __always_inline int memcg_kmem_charge(struct page *page, static __always_inline int memcg_kmem_charge(struct page *page,
gfp_t gfp, int order) gfp_t gfp, int order)
{ {
if (__memcg_kmem_bypass(gfp)) if (__memcg_kmem_bypass())
return 0;
if (!(gfp & __GFP_ACCOUNT))
return 0; return 0;
return __memcg_kmem_charge(page, gfp, order); return __memcg_kmem_charge(page, gfp, order);
} }
@ -810,16 +821,15 @@ static __always_inline void memcg_kmem_uncharge(struct page *page, int order)
/** /**
* memcg_kmem_get_cache: selects the correct per-memcg cache for allocation * memcg_kmem_get_cache: selects the correct per-memcg cache for allocation
* @cachep: the original global kmem cache * @cachep: the original global kmem cache
* @gfp: allocation flags.
* *
* All memory allocated from a per-memcg cache is charged to the owner memcg. * All memory allocated from a per-memcg cache is charged to the owner memcg.
*/ */
static __always_inline struct kmem_cache * static __always_inline struct kmem_cache *
memcg_kmem_get_cache(struct kmem_cache *cachep, gfp_t gfp) memcg_kmem_get_cache(struct kmem_cache *cachep, gfp_t gfp)
{ {
if (__memcg_kmem_bypass(gfp)) if (__memcg_kmem_bypass())
return cachep; return cachep;
return __memcg_kmem_get_cache(cachep); return __memcg_kmem_get_cache(cachep, gfp);
} }
static __always_inline void memcg_kmem_put_cache(struct kmem_cache *cachep) static __always_inline void memcg_kmem_put_cache(struct kmem_cache *cachep)

Some files were not shown because too many files have changed in this diff