linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-11 06:31:49 +00:00

Author	SHA1	Message	Date
Vincent Whitchurch	e994f5b677	squashfs: cache partial compressed blocks Before commit `93e72b3c61` ("squashfs: migrate from ll_rw_block usage to BIO"), compressed blocks read by squashfs were cached in the page cache, but that is not the case after that commit. That has lead to squashfs having to re-read a lot of sectors from disk/flash. For example, the first sectors of every metadata block need to be read twice from the disk. Once partially to read the length, and a second time to read the block itself. Also, in linear reads of large files, the last sectors of one data block are re-read from disk when reading the next data block, since the compressed blocks are of variable sizes and not aligned to device blocks. This extra I/O results in a degrade in read performance of, for example, ~16% in one scenario on my ARM platform using squashfs with dm-verity and NAND. Since the decompressed data is cached in the page cache or squashfs' internal metadata and fragment caches, caching _all_ compressed pages would lead to a lot of double caching and is undesirable. But make the code cache any disk blocks which were only partially requested, since these are the ones likely to include data which is needed by other file system blocks. This restores read performance in my test scenario. The compressed block caching is only applied when the disk block size is equal to the page size, to avoid having to deal with caching sub-page reads. [akpm@linux-foundation.org: fs/squashfs/block.c needs linux/pagemap.h] [vincent.whitchurch@axis.com: fix page update race] Link: https://lkml.kernel.org/r/20230526-squashfs-cache-fixup-v1-1-d54a7fa23e7b@axis.com [vincent.whitchurch@axis.com: fix page indices] Link: https://lkml.kernel.org/r/20230526-squashfs-cache-fixup-v1-2-d54a7fa23e7b@axis.com [akpm@linux-foundation.org: fix layout, per hch] Link: https://lkml.kernel.org/r/20230510-squashfs-cache-v4-1-3bd394e1ee71@axis.com Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Phillip Lougher <phillip@squashfs.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-06-09 17:44:14 -07:00
Linus Torvalds	6e8948a063	fs.idmapped.squashfs.v6.2 -----BEGIN PGP SIGNATURE----- iQEzBAABCgAdFiEEkJ7fqygL5xE8uB3PUwOZruX0MckFAmOTX4IACgkQUwOZruX0 McmDJgf8DlnoL7R88j0OZCLEpuxCOw+j1ZmQhUYz7XmDYPi8zDQCZtcx7scqQIlR oYb6fxP3NOLdsPzt/K2QNuzM8Yv2kRGskZPzojcBzkMwGBlFgyjLjGCOG93hmTcD +nXtod7QzgUvpv7w9bnLXZMw8WvU78UUsUzwhN5jgSmnpmhozZKV8xfIzJYlRUJ+ dNYRZ9O374W+NZOX97nki7oR19cdL5uzpU53ZXMm9DMlLVgcB7yqqS37ABmqS37K I9wJsfcPOsd60f5FUWIm0uxGvAC+p5Z+rG4MJytbDFBfvLlhnJOHRwE+sn5b53wI LuAr9AHF1AHuwuNswatjumJdW9qr4Q== =/IDh -----END PGP SIGNATURE----- Merge tag 'fs.idmapped.squashfs.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping Pull squashfs update from Seth Forshee: "This is a simple patch to enable idmapped mounts for squashfs. All functionality squashfs needs to support idmapped mounts is already implemented in generic VFS code, so all that is needed is to set FS_ALLOW_IDMAP in fs_flags" * tag 'fs.idmapped.squashfs.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping: squashfs: enable idmapped mounts	2022-12-12 20:24:51 -08:00
Baokun Li	c7e8d3279c	squashfs: fix null-ptr-deref in squashfs_fill_super When squashfs_read_table() returns an error or `sb->s_magic != SQUASHFS_MAGIC`, enters the error branch and calls msblk->thread_ops->destroy(msblk) to destroy msblk. However, msblk->thread_ops has not been initialized. Therefore, the following problem is triggered: ================================================================== BUG: KASAN: null-ptr-deref in squashfs_fill_super+0xe7a/0x13b0 Read of size 8 at addr 0000000000000008 by task swapper/0/1 CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.1.0-rc3-next-20221031 #367 Call Trace: <TASK> dump_stack_lvl+0x73/0x9f print_report+0x743/0x759 kasan_report+0xc0/0x120 __asan_load8+0xd3/0x140 squashfs_fill_super+0xe7a/0x13b0 get_tree_bdev+0x27b/0x450 squashfs_get_tree+0x19/0x30 vfs_get_tree+0x49/0x150 path_mount+0xaae/0x1350 init_mount+0xad/0x100 do_mount_root+0xbc/0x1d0 mount_block_root+0x173/0x316 mount_root+0x223/0x236 prepare_namespace+0x1eb/0x237 kernel_init_freeable+0x528/0x576 kernel_init+0x29/0x250 ret_from_fork+0x1f/0x30 </TASK> ================================================================== To solve this issue, msblk->thread_ops is initialized immediately after msblk is assigned a value. Link: https://lkml.kernel.org/r/20221101073343.3961562-1-libaokun1@huawei.com Fixes: b0645770d3c7 ("squashfs: add the mount parameter theads=<single\|multi\|percpu>") Signed-off-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Xiaoming Ni <nixiaoming@huawei.com> Reviewed-by: Phillip Lougher <phillip@squashfs.org.uk> Cc: Yu Kuai <yukuai3@huawei.com> Cc: Zhang Yi <yi.zhang@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-11-18 13:55:09 -08:00
Xiaoming Ni	fb40fe04f9	squashfs: allows users to configure the number of decompression threads The maximum number of threads in the decompressor_multi.c file is fixed and cannot be adjusted according to user needs. Therefore, the mount parameter needs to be added to allow users to configure the number of threads as required. The upper limit is num_online_cpus() * 2. Link: https://lkml.kernel.org/r/20221019030930.130456-3-nixiaoming@huawei.com Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com> Reviewed-by: Phillip Lougher <phillip@squashfs.org.uk> Cc: Jianguo Chen <chenjianguo3@huawei.com> Cc: Jubin Zhong <zhongjubin@huawei.com> Cc: Zhang Yi <yi.zhang@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-11-18 13:55:08 -08:00
Xiaoming Ni	80f784098f	squashfs: add the mount parameter theads=<single\|multi\|percpu> Patch series 'squashfs: Add the mount parameter "threads="'. Currently, Squashfs supports multiple decompressor parallel modes. However, this mode can be configured only during kernel building and does not support flexible selection during runtime. In the current patch set, the mount parameter "threads=" is added to allow users to select the parallel decompressor mode and configure the number of decompressors when mounting a file system. "threads=<single\|multi\|percpu\|1\|2\|3\|...>" The upper limit is num_online_cpus() * 2. This patch (of 2): Squashfs supports three decompression concurrency modes: Single-thread mode: concurrent reads are blocked and the memory overhead is small. Multi-thread mode/percpu mode: reduces concurrent read blocking but increases memory overhead. The corresponding schema must be fixed at compile time. During mounting, the concurrent decompression mode cannot be adjusted based on file read blocking. The mount parameter theads=<single\|multi\|percpu> is added to select the concurrent decompression mode of a single SquashFS file system image. Link: https://lkml.kernel.org/r/20221019030930.130456-1-nixiaoming@huawei.com Link: https://lkml.kernel.org/r/20221019030930.130456-2-nixiaoming@huawei.com Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com> Reviewed-by: Phillip Lougher <phillip@squashfs.org.uk> Cc: Jianguo Chen <chenjianguo3@huawei.com> Cc: Jubin Zhong <zhongjubin@huawei.com> Cc: Zhang Yi <yi.zhang@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-11-18 13:55:08 -08:00
Michael Weiß	42da66ac7b	squashfs: enable idmapped mounts For squashfs all needed functionality for idmapped mounts is already implemented by the generic handlers in the VFS. Thus, it is sufficient to just enable the corresponding FS_ALLOW_IDMAP flag to support idmapped mounts. We use this for unprivileged (user namespaced) containers based on squashfs images as rootfs in GyroidOS. A simple test using the mount-idmapped tool executed as user with uid=1000 looks as follows: $ mkdir test $ echo "test" > test/test_file $ mksquashfs test/ fs.img $ sudo mkdir /mnt/test $ sudo mkdir /mnt/mapped $ sudo mount fs.img -o loop /mnt/test/ $ sudo ./mount-idmapped --map-mount b:1000:2000:1 /mnt/test/ /mnt/mapped/ $ mount \| tail -n2 fs.img on /mnt/test type squashfs (ro,relatime,errors=continue) fs.img on /mnt/mapped type squashfs (ro,relatime,idmapped,errors=continue) $ ls -lan /mnt/test/ total 5 drwxr-xr-x 2 1000 1000 32 Okt 24 13:36 . drwxr-xr-x 6 0 0 4096 Okt 24 13:38 .. -rw-r--r-- 1 1000 1000 5 Okt 24 13:36 test_file $ ls -lan /mnt/mapped/ total 5 drwxr-xr-x 2 2000 2000 32 Okt 24 13:36 . drwxr-xr-x 6 0 0 4096 Okt 24 13:38 .. -rw-r--r-- 1 2000 2000 5 Okt 24 13:36 test_file Signed-off-by: Michael Weiß <michael.weiss@aisec.fraunhofer.de> Reviewed-by: Christian Brauner <brauner@kernel.org> Reviewed-by: Phillip Lougher <phillip@squashfs.org.uk> Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>	2022-11-07 10:24:22 +01:00
Hsin-Yi Wang	0c12185728	Revert "squashfs: provide backing_dev_info in order to disable read-ahead" Patch series "Implement readahead for squashfs", v7. Commit 9eec1d897139("squashfs: provide backing_dev_info in order to disable read-ahead") mitigates the performance drop issue for squashfs by closing readahead for it. This series implements readahead callback for squashfs. This patch (of 4): This reverts `9eec1d8971` ("squashfs: provide backing_dev_info in order to disable read-ahead"). Revert closing the readahead to squashfs since the readahead callback for squashfs is implemented. Link: https://lkml.kernel.org/r/20220617083810.337573-1-hsinyi@chromium.org Link: https://lkml.kernel.org/r/20220617083810.337573-2-hsinyi@chromium.org Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org> Suggested-by: Xiongwei Song <Xiongwei.Song@windriver.com> Cc: Phillip Lougher <phillip@squashfs.org.uk> Cc: Matthew Wilcox <willy@infradead.org> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Zheng Liang <zhengliang6@huawei.com> Cc: Zhang Yi <yi.zhang@huawei.com> Cc: Hou Tao <houtao1@huawei.com> Cc: Miao Xie <miaoxie@huawei.com> Cc: kernel test robot <lkp@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-07-29 18:12:34 -07:00
Matthew Wilcox (Oracle)	124cfc154f	squashfs: Convert squashfs to read_folio This is a "weak" conversion which converts straight back to using pages. A full conversion should be performed at some point, hopefully by someone familiar with the filesystem. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>	2022-05-09 16:21:46 -04:00
Muchun Song	fd60b28842	fs: allocate inode by using alloc_inode_sb() The inode allocation is supposed to use alloc_inode_sb(), so convert kmem_cache_alloc() of all filesystems to alloc_inode_sb(). Link: https://lkml.kernel.org/r/20220228122126.37293-5-songmuchun@bytedance.com Signed-off-by: Muchun Song <songmuchun@bytedance.com> Acked-by: Theodore Ts'o <tytso@mit.edu> [ext4] Acked-by: Roman Gushchin <roman.gushchin@linux.dev> Cc: Alex Shi <alexs@kernel.org> Cc: Anna Schumaker <Anna.Schumaker@Netapp.com> Cc: Chao Yu <chao@kernel.org> Cc: Dave Chinner <david@fromorbit.com> Cc: Fam Zheng <fam.zheng@bytedance.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kari Argillander <kari.argillander@gmail.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Xiongchun Duan <duanxiongchun@bytedance.com> Cc: Yang Shi <shy828301@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2022-03-22 15:57:03 -07:00
Zheng Liang	9eec1d8971	squashfs: provide backing_dev_info in order to disable read-ahead Commit `c1f6925e10` ("mm: put readahead pages in cache earlier") causes the read performance of squashfs to deteriorate.Through testing, we find that the performance will be back by closing the readahead of squashfs. So we want to learn the way of ubifs, provides backing_dev_info and disable read-ahead We tested the following data by fio. squashfs image blocksize=128K test command: fio --name basic --bs=? --filename="/mnt/test_file" --rw=? --iodepth=1 --ioengine=psync --runtime=200 --time_based turn on squashfs readahead in 5.10 kernel bs(k) read/randread MB/s 4 randread 271 128 randread 231 1024 randread 246 4 read 310 128 read 245 1024 read 247 turn off squashfs readahead in 5.10 kernel bs(k) read/randread MB/s 4 randread 293 128 randread 330 1024 randread 363 4 read 338 128 read 360 1024 read 365 turn on squashfs readahead and revert the commit c1f6925e1091("mm: put readahead pages in cache earlier") in 5.10 kernel bs(k) read/randread MB/s 4 randread 289 128 randread 306 1024 randread 335 4 read 337 128 read 336 1024 read 338 Link: https://lkml.kernel.org/r/20211116113141.1391026-1-zhengliang6@huawei.com Signed-off-by: Zheng Liang <zhengliang6@huawei.com> Reviewed-by: Phillip Lougher <phillip@squashfs.org.uk> Cc: Zhang Yi <yi.zhang@huawei.com> Cc: Hou Tao <houtao1@huawei.com> Cc: Miao Xie <miaoxie@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2022-01-15 16:30:24 +02:00
Christoph Hellwig	be9a7b3e15	squashfs: use bdev_nr_bytes instead of open coding it Use the proper helper to read the block device size. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Phillip Lougher <phillip@squashfs.org.uk> Link: https://lore.kernel.org/r/20211018101130.1838532-24-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-18 14:43:23 -06:00
Vincent Whitchurch	10dde05b89	squashfs: add option to panic on errors Add an errors=panic mount option to make squashfs trigger a panic when errors are encountered, similar to several other filesystems. This allows a kernel dump to be saved using which the corruption can be analysed and debugged. Inspired by a pre-fs_context patch by Anton Eliasson. Link: https://lkml.kernel.org/r/20210527125019.14511-1-vincent.whitchurch@axis.com Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com> Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2021-06-29 10:53:46 -07:00
Phillip Lougher	f37aa4c736	squashfs: add more sanity checks in id lookup Sysbot has reported a number of "slab-out-of-bounds reads" and "use-after-free read" errors which has been identified as being caused by a corrupted index value read from the inode. This could be because the metadata block is uncompressed, or because the "compression" bit has been corrupted (turning a compressed block into an uncompressed block). This patch adds additional sanity checks to detect this, and the following corruption. 1. It checks against corruption of the ids count. This can either lead to a larger table to be read, or a smaller than expected table to be read. In the case of a too large ids count, this would often have been trapped by the existing sanity checks, but this patch introduces a more exact check, which can identify too small values. 2. It checks the contents of the index table for corruption. Link: https://lkml.kernel.org/r/20210204130249.4495-3-phillip@squashfs.org.uk Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk> Reported-by: syzbot+b06d57ba83f604522af2@syzkaller.appspotmail.com Reported-by: syzbot+c021ba012da41ee9807c@syzkaller.appspotmail.com Reported-by: syzbot+5024636e8b5fd19f0f19@syzkaller.appspotmail.com Reported-by: syzbot+bcbc661df46657d0fa4f@syzkaller.appspotmail.com Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2021-02-09 17:26:44 -08:00
Al Viro	6d1349c769	[PATCH] reduce boilerplate in fsid handling Get rid of boilerplate in most of ->statfs() instances... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2020-09-18 16:45:50 -04:00
Linus Torvalds	bc7d9aee3f	Merge branch 'work.mount2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull misc mount API conversions from Al Viro: "Conversions to new API for shmem and friends and for mount_mtd()-using filesystems. As for the rest of the mount API conversions in -next, some of them belong in the individual trees (e.g. binderfs one should definitely go through android folks, after getting redone on top of their changes). I'm going to drop those and send the rest (trivial ones + stuff ACKed by maintainers) in a separate series - by that point they are independent from each other. Some stuff has already migrated into individual trees (NFS conversion, for example, or FUSE stuff, etc.); those presumably will go through the regular merges from corresponding trees." * 'work.mount2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: vfs: Make fs_parse() handle fs_param_is_fd-type params better vfs: Convert ramfs, shmem, tmpfs, devtmpfs, rootfs to use the new mount API shmem_parse_one(): switch to use of fs_parse() shmem_parse_options(): take handling a single option into a helper shmem_parse_options(): don't bother with mpol in separate variable shmem_parse_options(): use a separate structure to keep the results make shmem_fill_super() static make ramfs_fill_super() static devtmpfs: don't mix {ramfs,shmem}_fill_super() with mount_single() vfs: Convert squashfs to use the new mount API mtd: Kill mount_mtd() vfs: Convert jffs2 to use the new mount API vfs: Convert cramfs to use the new mount API vfs: Convert romfs to use the new mount API vfs: Add a single-or-reconfig keying to vfs_get_super()	2019-09-19 10:06:57 -07:00
David Howells	5a2be1288b	vfs: Convert squashfs to use the new mount API Convert the squashfs filesystem to the new internal mount API as the old one will be obsoleted and removed. This allows greater flexibility in communication of mount parameters between userspace, the VFS and the filesystem. See Documentation/filesystems/mount_api.txt for more information. Signed-off-by: David Howells <dhowells@redhat.com> cc: Phillip Lougher <phillip@squashfs.org.uk> cc: squashfs-devel@lists.sourceforge.net Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2019-09-05 14:34:26 -04:00
Deepa Dinamani	22b139691f	fs: Fill in max and min timestamps in superblock Fill in the appropriate limits to avoid inconsistencies in the vfs cached inode times when timestamps are outside the permitted range. Even though some filesystems are read-only, fill in the timestamps to reflect the on-disk representation. Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Acked-By: Tigran Aivazian <aivazian.tigran@gmail.com> Acked-by: Jeff Layton <jlayton@kernel.org> Cc: aivazian.tigran@gmail.com Cc: al@alarsen.net Cc: coda@cs.cmu.edu Cc: darrick.wong@oracle.com Cc: dushistov@mail.ru Cc: dwmw2@infradead.org Cc: hch@infradead.org Cc: jack@suse.com Cc: jaharkes@cs.cmu.edu Cc: luisbg@kernel.org Cc: nico@fluxnic.net Cc: phillip@squashfs.org.uk Cc: richard@nod.at Cc: salah.triki@gmail.com Cc: shaggy@kernel.org Cc: linux-xfs@vger.kernel.org Cc: codalist@coda.cs.cmu.edu Cc: linux-ext4@vger.kernel.org Cc: linux-mtd@lists.infradead.org Cc: jfs-discussion@lists.sourceforge.net Cc: reiserfs-devel@vger.kernel.org	2019-08-30 07:27:17 -07:00
Thomas Gleixner	68252eb5f8	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 35 Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 or at your option any later version this program is distributed in the hope that it will be useful but without any warranty without even the implied warranty of merchantability or fitness for a particular purpose see the gnu general public license for more details you should have received a copy of the gnu general public license along with this program if not write to the free software foundation 51 franklin street fifth floor boston ma 02110 1301 usa extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 23 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190520170857.458548087@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-05-24 17:27:11 +02:00
Al Viro	56b5af1931	squashfs: switch to ->free_inode() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2019-05-01 22:43:25 -04:00
Linus Torvalds	71755ee535	squashfs: more metadata hardening The squashfs fragment reading code doesn't actually verify that the fragment is inside the fragment table. The end result _is_ verified to be inside the image when actually reading the fragment data, but before that is done, we may end up taking a page fault because the fragment table itself might not even exist. Another report from Anatoly and his endless squashfs image fuzzing. Reported-by: Анатолий Тросиненко <anatoly.trosinenko@gmail.com> Acked-by:: Phillip Lougher <phillip.lougher@gmail.com>, Cc: Willy Tarreau <w@1wt.eu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2018-08-02 09:32:23 -07:00
Linus Torvalds	1751e8a6cb	Rename superblock flags (MS_xyz -> SB_xyz) This is a pure automated search-and-replace of the internal kernel superblock flags. The s_flags are now called SB_, with the names and the values for the moment mirroring the MS_ flags that they're equivalent to. Note how the MS_xyz flags are the ones passed to the mount system call, while the SB_xyz flags are what we then use in sb->s_flags. The script to do this was: # places to look in; re security/: it generally should not* be # touched (that stuff parses mount(2) arguments directly), but # there are two places where we really deal with superblock flags. FILES="drivers/mtd drivers/staging/lustre fs ipc mm \ include/linux/fs.h include/uapi/linux/bfs_fs.h \ security/apparmor/apparmorfs.c security/apparmor/include/lib.h" # the list of MS_... constants SYMS="RDONLY NOSUID NODEV NOEXEC SYNCHRONOUS REMOUNT MANDLOCK \ DIRSYNC NOATIME NODIRATIME BIND MOVE REC VERBOSE SILENT \ POSIXACL UNBINDABLE PRIVATE SLAVE SHARED RELATIME KERNMOUNT \ I_VERSION STRICTATIME LAZYTIME SUBMOUNT NOREMOTELOCK NOSEC BORN \ ACTIVE NOUSER" SED_PROG= for i in $SYMS; do SED_PROG="$SED_PROG -e s/MS_$i/SB_$i/g"; done # we want files that contain at least one of MS_..., # with fs/namespace.c and fs/pnode.c excluded. L=$(for i in $SYMS; do git grep -w -l MS_$i $FILES; done\| sort\|uniq\|grep -v '^fs/namespace.c'\|grep -v '^fs/pnode.c') for f in $L; do sed -i $f $SED_PROG; done Requested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-11-27 13:05:09 -08:00
Kirill A. Shutemov	09cbfeaf1a	mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced long time ago with promise that one day it will be possible to implement page cache with bigger chunks than PAGE_SIZE. This promise never materialized. And unlikely will. We have many places where PAGE_CACHE_SIZE assumed to be equal to PAGE_SIZE. And it's constant source of confusion on whether PAGE_CACHE_* or PAGE_* constant should be used in a particular case, especially on the border between fs and mm. Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much breakage to be doable. Let's stop pretending that pages in page cache are special. They are not. The changes are pretty straight-forward: - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>; - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN}; - page_cache_get() -> get_page(); - page_cache_release() -> put_page(); This patch contains automated changes generated with coccinelle using script below. For some reason, coccinelle doesn't patch header files. I've called spatch for them manually. The only adjustment after coccinelle is revert of changes to PAGE_CAHCE_ALIGN definition: we are going to drop it later. There are few places in the code where coccinelle didn't reach. I'll fix them manually in a separate patch. Comments and documentation also will be addressed with the separate patch. virtual patch @@ expression E; @@ - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ expression E; @@ - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) + E @@ @@ - PAGE_CACHE_SHIFT + PAGE_SHIFT @@ @@ - PAGE_CACHE_SIZE + PAGE_SIZE @@ @@ - PAGE_CACHE_MASK + PAGE_MASK @@ expression E; @@ - PAGE_CACHE_ALIGN(E) + PAGE_ALIGN(E) @@ expression E; @@ - page_cache_get(E) + get_page(E) @@ expression E; @@ - page_cache_release(E) + put_page(E) Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-04-04 10:41:08 -07:00
Vladimir Davydov	5d097056c9	kmemcg: account certain kmem allocations to memcg Mark those kmem allocations that are known to be easily triggered from userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them accounted to memcg. For the list, see below: - threadinfo - task_struct - task_delay_info - pid - cred - mm_struct - vm_area_struct and vm_region (nommu) - anon_vma and anon_vma_chain - signal_struct - sighand_struct - fs_struct - files_struct - fdtable and fdtable->full_fds_bits - dentry and external_name - inode for all filesystems. This is the most tedious part, because most filesystems overwrite the alloc_inode method. The list is far from complete, so feel free to add more objects. Nevertheless, it should be close to "account everything" approach and keep most workloads within bounds. Malevolent users will be able to breach the limit, but this was possible even with the former "account everything" approach (simply because it did not account everything in fact). [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Tejun Heo <tj@kernel.org> Cc: Greg Thelen <gthelen@google.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-01-14 16:00:49 -08:00
Dmitry Monakhov	a1c6f05733	fs: use block_device name vsprintf helper Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2016-01-06 13:03:18 -05:00
Fabian Frederick	c811f5f41e	fs/squashfs/super.c: logging cleanup - Convert printk to pr_foo() - Add pr_fmt for future logging entries - Coalesce formats Signed-off-by: Fabian Frederick <fabf@skynet.be> Cc: Phillip Lougher <phillip@squashfs.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-08-06 18:01:13 -07:00
Theodore Ts'o	02b9984d64	fs: push sync_filesystem() down to the file system's remount_fs() Previously, the no-op "mount -o mount /dev/xxx" operation when the file system is already mounted read-write causes an implied, unconditional syncfs(). This seems pretty stupid, and it's certainly documented or guaraunteed to do this, nor is it particularly useful, except in the case where the file system was mounted rw and is getting remounted read-only. However, it's possible that there might be some file systems that are actually depending on this behavior. In most file systems, it's probably fine to only call sync_filesystem() when transitioning from read-write to read-only, and there are some file systems where this is not needed at all (for example, for a pseudo-filesystem or something like romfs). Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: linux-fsdevel@vger.kernel.org Cc: Christoph Hellwig <hch@infradead.org> Cc: Artem Bityutskiy <dedekind1@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Evgeniy Dushistov <dushistov@mail.ru> Cc: Jan Kara <jack@suse.cz> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Anders Larsen <al@alarsen.net> Cc: Phillip Lougher <phillip@squashfs.org.uk> Cc: Kees Cook <keescook@chromium.org> Cc: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz> Cc: Petr Vandrovec <petr@vandrovec.name> Cc: xfs@oss.sgi.com Cc: linux-btrfs@vger.kernel.org Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Cc: codalist@coda.cs.cmu.edu Cc: linux-ext4@vger.kernel.org Cc: linux-f2fs-devel@lists.sourceforge.net Cc: fuse-devel@lists.sourceforge.net Cc: cluster-devel@redhat.com Cc: linux-mtd@lists.infradead.org Cc: jfs-discussion@lists.sourceforge.net Cc: linux-nfs@vger.kernel.org Cc: linux-nilfs@vger.kernel.org Cc: linux-ntfs-dev@lists.sourceforge.net Cc: ocfs2-devel@oss.oracle.com Cc: reiserfs-devel@vger.kernel.org	2014-03-13 10:14:33 -04:00
Phillip Lougher	9508c6b90b	Squashfs: Refactor decompressor interface and code The decompressor interface and code was written from the point of view of single-threaded operation. In doing so it mixed a lot of single-threaded implementation specific aspects into the decompressor code and elsewhere which makes it difficult to seamlessly support multiple different decompressor implementations. This patch does the following: 1. It removes compressor_options parsing from the decompressor init() function. This allows the decompressor init() function to be dynamically called to instantiate multiple decompressors, without the compressor options needing to be read and parsed each time. 2. It moves threading and all sleeping operations out of the decompressors. In doing so, it makes the decompressors non-blocking wrappers which only deal with interfacing with the decompressor implementation. 3. It splits decompressor.[ch] into decompressor generic functions in decompressor.[ch], and moves the single threaded decompressor implementation into decompressor_single.c. The result of this patch is Squashfs should now be able to support multiple decompressors by adding new decompressor_xxx.c files with specialised implementations of the functions in decompressor_single.c Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk> Reviewed-by: Minchan Kim <minchan@kernel.org>	2013-11-20 03:35:18 +00:00
Eric W. Biederman	3e64fe5b21	fs: Limit sys_mount to only request filesystem modules. (Part 3) Somehow I failed to add the MODULE_ALIAS_FS for cifs, hostfs, hpfs, squashfs, and udf despite what I thought were my careful checks :( Add them now. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2013-03-11 07:09:48 -07:00
Kirill A. Shutemov	8c0a853770	fs: push rcu_barrier() from deactivate_locked_super() to filesystems There's no reason to call rcu_barrier() on every deactivate_locked_super(). We only need to make sure that all delayed rcu free inodes are flushed before we destroy related cache. Removing rcu_barrier() from deactivate_locked_super() affects some fast paths. E.g. on my machine exit_group() of a last process in IPC namespace takes 0.07538s. rcu_barrier() takes 0.05188s of that time. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-10-02 21:35:55 -04:00
Linus Torvalds	8563f8786e	Add an extra mount time sanity check, plus some code cleanups and bug fixes. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJPc65LAAoJEJAch/D1fbHUEswP/RzwQ7NFsGe+RgC7YygUn+nP qQma8CUBXMPNKLzGl+BGqG+TtNdBDEVGvNrHYtfn54tI7N1qliHTcMqfHv4SF1nZ QFKwqhBpEOFSbHQg4ts/N01Pa2Ilqw2A4L/8CBjqUkEZ8qOorI8sLp1Xb254YoVk G4oP+dY/YIEXZxYhIerevIkpNElkqTB2dZAZ/uhNcdHkKIRyAvqyay6F04YdGqI+ r2JfzhPS0T70PbrBHur1ed7iAHYOtgrxgB89CS3jJ5X1iG+iK8i+Xsn18fBFOQPd ULaSwfdJY5xALqxrEyuO1VxP1uEGAmn2+aOPQP/KIapLmIBGaZXHjC8H3uMqGhZ6 /Y6ZnoH4NJBANn+HXN3iwqQZ8+cw+HUgzJdyZwp6d8SEBM1KsESWeR2t+U6Zvr8L sLS5inXjbS3O6B07GV58liyCFLEXtmEHj3GCtnnWvp44Vjax57hbegQzKxe8C+3D YqBf/fx9WKIA5Ojbx5fGUaz7BQ2fczMuzrwNQB05bZAdHqSuKs3dWpGpjtKcelwp k1BO+kstuwE/dRiAxWZ3lpMQ9GLNmAGg1DgqWEKRMXuThwgxhXwf0sAshhGYCmL6 IdkUqC95Be8/D5i9yxbY8TGIV4rmhV8xDR9j8cIVHtiqbuGZT1jPhXUXXJVWw/y2 Us/awa6sC0qlL18jHV35 =ywVL -----END PGP SIGNATURE----- Merge tag 'squashfs-updates' of git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-next Pull squashfs updates from Phillip Lougher: "Add an extra mount time sanity check, plus some code cleanups and bug fixes." * tag 'squashfs-updates' of git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-next: Squashfs: add mount time sanity check for block_size and block_log match Squashfs: fix f_pos check in get_dir_index_using_offset Squashfs: get rid of obsolete definitions in header file Squashfs: remove redundant length initialisation in squashfs_lookup Squashfs: remove redundant length initialisation in squashfs_readdir Squashfs: update comment removing reference to zlib only Squashfs: use define instead of constant	2012-03-28 18:05:54 -07:00
Al Viro	48fde701af	switch open-coded instances of d_make_root() to new helper Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-03-20 21:29:35 -04:00
Phillip Lougher	4b0180a49f	Squashfs: add mount time sanity check for block_size and block_log match Squashfs currently has a sanity check for block_size less than or equal to the maximum block_size (1 Mbyte). This catches some superblock corruption, but obviously with a block_size maximum of 1 Mbyte there's 7 correct values (4K, 8K, 16K, 32K, ... etc) and a lot of incorrect values which are not caught by this check. The Squashfs superblock, however, has both a block_size and a block_log (2^block_log == block_size). Checking that the block_size matches the block_log is a much more robust check. Corruption of the superblock is unlikely to produce values which match, and it also ensures the block_size is an exact power of two. Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>	2012-03-10 03:01:02 +00:00
Linus Torvalds	96e80a7851	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-next * git://git.kernel.org/pub/scm/linux/kernel/git/pkl/squashfs-next: Squashfs: fix i_blocks calculation with extended regular files Squashfs: fix mount time sanity check for corrupted superblock Squashfs: optimise squashfs_cache_get entry search Squashfs: Update documentation to include xattrs Squashfs: add missing block release on error condition	2012-01-13 10:34:57 -08:00
Al Viro	6b520e0565	vfs: fix the stupidity with i_dentry in inode destructors Seeing that just about every destructor got that INIT_LIST_HEAD() copied into it, there is no point whatsoever keeping this INIT_LIST_HEAD in inode_init_once(); the cost of taking it into inode_init_always() will be negligible for pipes and sockets and negative for everything else. Not to mention the removal of boilerplate code from ->destroy_inode() instances... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-01-03 22:52:40 -05:00
Phillip Lougher	cc37f75a9f	Squashfs: fix mount time sanity check for corrupted superblock A Squashfs filesystem containing nothing but an empty directory, although unusual and ultimately pointless, is still valid. The directory_table >= next_table sanity check rejects these filesystems as invalid because the directory_table is empty and equal to next_table. Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>	2012-01-02 17:47:14 +00:00
Phillip Lougher	7657cacf47	Squashfs: Add an option to set dev block size to 4K This commit adds an option to set the device block size used to 4K. By default Squashfs sets the device block size (sb_min_blocksize) to 1K or the smallest block size supported by the block device (if larger). This, because blocks are packed together and unaligned in Squashfs, should reduce latency. This, however, gives poor performance on MTD NAND devices where the optimal I/O size is 4K (even though the devices can support smaller block sizes). Using a 4K device block size may also improve overall I/O performance for some file access patterns (e.g. sequential accesses of files in filesystem order) on all media. Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>	2011-11-02 17:25:50 +00:00
Phillip Lougher	d5b72ce15e	Squashfs: Fix sanity check patches on big-endian systems le64 values should be swapped when accessing on big-endian systems. Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>	2011-05-29 10:03:09 +01:00
Phillip Lougher	d7f2ff6718	Squashfs: update email address My existing email address may stop working in a month or two, so update email to one that will continue working. Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>	2011-05-26 10:49:11 +01:00
Phillip Lougher	1094a4a611	Squashfs: add extra sanity checks at mount time Add some extra sanity checks of the inode and directory structures. Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>	2011-05-25 18:21:33 +01:00
Phillip Lougher	1cac63cc9b	Squashfs: add sanity checks to fragment reading at mount time Fsfuzzer generates corrupted filesystems which throw a warn_on in kmalloc. One of these is due to a corrupted superblock fragments field. Fix this by checking that the number of bytes to be read (and allocated) does not extend into the next filesystem structure. Also add a couple of other sanity checks of the mount-time fragment table structures. Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>	2011-05-25 18:21:33 +01:00
Phillip Lougher	ac51a0a713	Squashfs: add sanity checks to lookup table reading at mount time Fsfuzzer generates corrupted filesystems which throw a warn_on in kmalloc. One of these is due to a corrupted superblock inodes field. Fix this by checking that the number of bytes to be read (and allocated) does not extend into the next filesystem structure. Also add a couple of other sanity checks of the mount-time lookup table structures. Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>	2011-05-25 18:21:32 +01:00
Phillip Lougher	37986f63c8	Squashfs: add sanity checks to id reading at mount time Fsfuzzer generates corrupted filesystems which throw a warn_on in kmalloc. One of these is due to a corrupted superblock no_ids field. Fix this by checking that the number of bytes to be read (and allocated) does not extend into the next filesystem structure. Also add a couple of other sanity checks of the mount-time id table structures. Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>	2011-05-25 18:21:32 +01:00
Phillip Lougher	76e002f755	Squashfs: reverse order of filesystem table reading Reverse order of table reading from mostly first to last in placement order, to last to first. This is to enable extra superblock sanity checks to be added in later patches. Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>	2011-05-25 18:21:31 +01:00
Phillip Lougher	82de647e1f	Squashfs: move table allocation into squashfs_read_table() This eliminates a lot of duplicate code. Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>	2011-05-25 18:21:31 +01:00
Phillip Lougher	003a3194d3	Squashfs: wrap squashfs_mount() definition Squashfs_get_sb() to squashfs_mount() conversion (commit `152a0836`) results in line over 80 characters. Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>	2011-02-28 18:43:48 +00:00
Phillip Lougher	b7fc0ff09d	Squashfs: extend decompressor framework to handle compression options Extend decompressor framework to handle compression options stored in the filesystem. These options can be used by the relevant decompressor at initialisation time to over-ride defaults. The presence of compression options in the filesystem is indicated by the COMP_OPT filesystem flag. If present the data is read from the filesystem and passed to the decompressor init function. The decompressor init function signature has been extended to take this data. Also update the init function signature in the glib, lzo and xz decompressor wrappers. Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>	2011-02-28 18:21:59 +00:00
Nick Piggin	fa0d7e3de6	fs: icache RCU free inodes RCU free the struct inode. This will allow: - Subsequent store-free path walking patch. The inode must be consulted for permissions when walking, so an RCU inode reference is a must. - sb_inode_list_lock to be moved inside i_lock because sb list walkers who want to take i_lock no longer need to take sb_inode_list_lock to walk the list in the first place. This will simplify and optimize locking. - Could remove some nested trylock loops in dcache code - Could potentially simplify things a bit in VM land. Do not need to take the page lock to follow page->mapping. The downsides of this is the performance cost of using RCU. In a simple creat/unlink microbenchmark, performance drops by about 10% due to inability to reuse cache-hot slab objects. As iterations increase and RCU freeing starts kicking over, this increases to about 20%. In cases where inode lifetimes are longer (ie. many inodes may be allocated during the average life span of a single inode), a lot of this cache reuse is not applicable, so the regression caused by this patch is smaller. The cache-hot regression could largely be avoided by using SLAB_DESTROY_BY_RCU, however this adds some complexity to list walking and store-free path walking, so I prefer to implement this at a later date, if it is shown to be a win in real situations. I haven't found a regression in any non-micro benchmark so I doubt it will be a problem. Signed-off-by: Nick Piggin <npiggin@kernel.dk>	2011-01-07 17:50:26 +11:00
Al Viro	152a083666	new helper: mount_bdev() ... and switch of the obvious get_sb_bdev() users to ->mount() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-10-29 04:16:13 -04:00
Arnd Bergmann	3dbc4b32d0	BKL: Remove BKL from squashfs The BKL is only used in put_super and fill_super, which are both protected by the superblocks s_umount rw_semaphore. Therefore it is safe to remove the BKL entirely. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Phillip Lougher <phillip@lougher.demon.co.uk>	2010-10-04 21:10:50 +02:00
Jan Blunck	db71922217	BKL: Explicitly add BKL around get_sb/fill_super This patch is a preparation necessary to remove the BKL from do_new_mount(). It explicitly adds calls to lock_kernel()/unlock_kernel() around get_sb/fill_super operations for filesystems that still uses the BKL. I've read through all the code formerly covered by the BKL inside do_kern_mount() and have satisfied myself that it doesn't need the BKL any more. do_kern_mount() is already called without the BKL when mounting the rootfs and in nfsctl. do_kern_mount() calls vfs_kern_mount(), which is called from various places without BKL: simple_pin_fs(), nfs_do_clone_mount() through nfs_follow_mountpoint(), afs_mntpt_do_automount() through afs_mntpt_follow_link(). Both later functions are actually the filesystems follow_link inode operation. vfs_kern_mount() is calling the specified get_sb function and lets the filesystem do its job by calling the given fill_super function. Therefore I think it is safe to push down the BKL from the VFS to the low-level filesystems get_sb/fill_super operation. [arnd: do not add the BKL to those file systems that already don't use it elsewhere] Signed-off-by: Jan Blunck <jblunck@infradead.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Christoph Hellwig <hch@infradead.org>	2010-10-04 21:10:10 +02:00

1 2

65 Commits