Commit Graph

660882 Commits

Author SHA1 Message Date
Cyrill Gorcunov
c857ab640c fs,eventpoll: don't test for bitfield with stack value
In case if epoll_ctl is called with operation EPOLL_CTL_DEL then
@epds.events variable allocated on stack may contain random bits which
we test then for EPOLLEXCLUSIVE.  Since currently the test look like

	if (epds.events & EPOLLEXCLUSIVE) {
		if (op == EPOLL_CTL_MOD)
			goto error_tgt_fput;
		if (op == EPOLL_CTL_ADD && (is_file_epoll(tf.file) ||
				(epds.events & ~EPOLLEXCLUSIVE_OK_BITS)))
			goto error_tgt_fput;
	}

Nothing serious will happen even if epds.events has this bit set, still
better to be on safe side and make sure that we're to test this bit at
all.

Link: http://lkml.kernel.org/r/20170214154935.GG1850@uranus.lan
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrey Vagin <avagin@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27 18:43:45 -08:00
Tetsuo Handa
e3b5a342ab include/linux/pid.h: use for_each_thread() in do_each_pid_thread()
while_each_pid_thread() is using while_each_thread(), which is unsafe
under RCU lock according to commit 0c740d0afc ("introduce
for_each_thread() to replace the buggy while_each_thread()").  Use
for_each_thread() in do_each_pid_thread() which is safe under RCU lock.

Link: http://lkml.kernel.org/r/201702011947.DBD56740.OMVHOLOtSJFFFQ@I-love.SAKURA.ne.jp
Link: http://lkml.kernel.org/r/1486041779-4401-2-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27 18:43:45 -08:00
Lorenzo Stoakes
369f2679f7 rapidio: use get_user_pages_unlocked()
Moving from get_user_pages() to get_user_pages_unlocked() simplifies the
code and takes advantage of VM_FAULT_RETRY functionality when faulting
in pages.

Link: http://lkml.kernel.org/r/20170103205024.6704-1-lstoakes@gmail.com
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Acked-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27 18:43:45 -08:00
Pratyush Anand
464920104b /proc/kcore: update physical address for kcore ram and text
Currently all the p_paddr of PT_LOAD headers are assigned to 0, which is
not true and could be misleading, since 0 is a valid physical address.

User space tools like makedumpfile needs to know physical address for
PT_LOAD segments of direct mapped regions.  Therefore this patch updates
paddr for such regions.  It also sets an invalid paddr (-1) for other
regions, so that user space tool can know whether a physical address
provided in PT_LOAD is correct or not.

I do not know why it was 0, which is a valid physical address.  But
certainly, it might break some user space tools, and those need to be
fixed.  For example, see following code from kexec-tools

kexec/kexec-elf.c:build_mem_phdrs()

                    if ((phdr->p_paddr + phdr->p_memsz) < phdr->p_paddr) {
                            /* The memory address wraps */
                            if (probe_debug) {
                                    fprintf(stderr, "ELF address wrap around\n");
                            }
                            return -1;
                    }

We do not need to perform above check for an invalid physical address.

I think, kexec-tools and makedumpfile will need fixup.  I already have
those fixup which will be sent upstream once this patch makes through.
Pro with this approach is that, it will help to calculate variable like
page_offset, phys_base from PT_LOAD even when they are randomized and
therefore will reduce many variable and version specific values in user
space tools.

Having an ASLR offset information can help to translate an identity
mapped virtual address to a physical address.  But that would be an
additional field in PT_LOAD header structure and an arch dependent
value.

Moreover, sending a valid physical address like 0 does not seem right.
So, IMHO it is better to fix that and send valid physical address when
available (identity mapped).

Link: http://lkml.kernel.org/r/f951340d2917cdd2a329fae9837a83f2059dc3b2.1485318868.git.panand@redhat.com
Signed-off-by: Pratyush Anand <panand@redhat.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Dave Anderson <anderson@redhat.com>
Cc: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp>
Cc: Simon Horman <simon.horman@netronome.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27 18:43:45 -08:00
Stas Sergeev
0c49ad4155 tools/testing/selftests/sigaltstack/sas.c: improve output of sigaltstack testcase
Currently it uses %i for bitmasks, which makes it difficult to properly
decode the values.  Use %x instead.

Link: http://lkml.kernel.org/r/b7b4c45d-2f21-de6c-d1c8-16c8386da27c@list.ru
Signed-off-by: Stas Sergeev <stsp@users.sourceforge.net>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27 18:43:45 -08:00
Stas Sergeev
441398d378 sigaltstack: support SS_AUTODISARM for CONFIG_COMPAT
Currently SS_AUTODISARM is not supported in compatibility mode, but does
not return -EINVAL either.  This makes dosemu built with -m32 on x86_64
to crash.  Also the kernel's sigaltstack selftest fails if compiled with
-m32.

This patch adds the needed support.

Link: http://lkml.kernel.org/r/20170205101213.8163-2-stsp@list.ru
Signed-off-by: Stas Sergeev <stsp@users.sourceforge.net>
Cc: Milosz Tanski <milosz@adfin.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
Cc: Waiman Long <Waiman.Long@hpe.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dmitry Safonov <dsafonov@virtuozzo.com>
Cc: Wang Xiaoqiang <wangxq10@lzu.edu.cn>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27 18:43:45 -08:00
Fabian Frederick
b899ba7d8c fs/reiserfs: atomically read inode size
See i_size_read() comments in include/linux/fs.h

Link: http://lkml.kernel.org/r/20170123174701.30394-1-fabf@skynet.be
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27 18:43:45 -08:00
Fabian Frederick
a3f2235012 hfsplus: atomically read inode size
See i_size_read() comments in include/linux/fs.h

Link: http://lkml.kernel.org/r/20170123175338.3840-1-fabf@skynet.be
Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27 18:43:45 -08:00
Ian Kent
092a53452b autofs: take more care to not update last_used on path walk
GUI environments seem to be becoming more agressive at scanning
filesystems, to the point where autofs cannot expire mounts at all.

This is one key reason the update of the autofs dentry info last_used
field is done in the expire system when the dentry is seen to be in use.

But somewhere along the way instances of the update has crept back into
the autofs path walk functions which, with the more aggressive file
access patterns, is preventing expiration.

Changing the update in the path walk functions allows autofs to at least
make progress in spite of frequent immediate re-mounts from file
accesses.

Link: http://lkml.kernel.org/r/148577167169.9801.1377050092212016834.stgit@pluto.themaw.net
Signed-off-by: Ian Kent <raven@themaw.net>
Cc: Tomohiro Kusumi <tkusumi@tuxera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27 18:43:45 -08:00
Tomohiro Kusumi
3bb2fbdaba autofs: remove duplicated AUTOFS_DEV_IOCTL_SIZE definition
This macro is already defined in uapi header.  Also use this macro where
possible.

Link: http://lkml.kernel.org/r/148577166656.9801.10322423666945951186.stgit@pluto.themaw.net
Signed-off-by: Tomohiro Kusumi <tkusumi@tuxera.com>
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27 18:43:45 -08:00
Tomohiro Kusumi
0fae77feca autofs: add command enum/macros for root-dir ioctls
Sync root-dir ioctl with misc-char-dev ioctl's enum/macro format since
these two types of ioctls aren't completely independent of each other in
terms of command nr.  No functional changes.

Link: http://lkml.kernel.org/r/148577166143.9801.15511796506678428145.stgit@pluto.themaw.net
Signed-off-by: Tomohiro Kusumi <tkusumi@tuxera.com>
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27 18:43:45 -08:00
Tomohiro Kusumi
8848808094 autofs: update ioctl documentation regarding struct autofs_dev_ioctl
This is the same as bf72eda5 except that it's a different file.  Sync
documentation with changes made by 730c9eec in 2009.

Link: http://lkml.kernel.org/r/148577165630.9801.6081791213151121657.stgit@pluto.themaw.net
Signed-off-by: Tomohiro Kusumi <tkusumi@tuxera.com>
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27 18:43:45 -08:00
Tomohiro Kusumi
c5dd2ea064 autofs: fix wrong ioctl documentation regarding devid
This is the same as d8732841 except that it's a different file.  A
caller has no devid input, and devid is obtained via superblock.

Link: http://lkml.kernel.org/r/148577165119.9801.16967562019122274820.stgit@pluto.themaw.net
Signed-off-by: Tomohiro Kusumi <tkusumi@tuxera.com>
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27 18:43:45 -08:00
Tomohiro Kusumi
d0846dd5af autofs: fix typo in Documentation
Link: http://lkml.kernel.org/r/148577164606.9801.12571810310561599401.stgit@pluto.themaw.net
Signed-off-by: Tomohiro Kusumi <tkusumi@tuxera.com>
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27 18:43:45 -08:00
Tomohiro Kusumi
8723890d1d autofs: remove wrong comment
This format seems to have been taken from device mapper header, but
autofs has no such file:function in both kernel and userspace.

Link: http://lkml.kernel.org/r/148577164094.9801.4775075118014742496.stgit@pluto.themaw.net
Signed-off-by: Tomohiro Kusumi <tkusumi@tuxera.com>
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27 18:43:45 -08:00
Luis R. Rodriguez
7d134b2ce6 kprobes: move kprobe declarations to asm-generic/kprobes.h
Often all is needed is these small helpers, instead of compiler.h or a
full kprobes.h.  This is important for asm helpers, in fact even some
asm/kprobes.h make use of these helpers...  instead just keep a generic
asm file with helpers useful for asm code with the least amount of
clutter as possible.

Likewise we need now to also address what to do about this file for both
when architectures have CONFIG_HAVE_KPROBES, and when they do not.  Then
for when architectures have CONFIG_HAVE_KPROBES but have disabled
CONFIG_KPROBES.

Right now most asm/kprobes.h do not have guards against CONFIG_KPROBES,
this means most architecture code cannot include asm/kprobes.h safely.
Correct this and add guards for architectures missing them.
Additionally provide architectures that not have kprobes support with
the default asm-generic solution.  This lets us force asm/kprobes.h on
the header include/linux/kprobes.h always, but most importantly we can
now safely include just asm/kprobes.h on architecture code without
bringing the full kitchen sink of header files.

Two architectures already provided a guard against CONFIG_KPROBES on its
kprobes.h: sh, arch.  The rest of the architectures needed gaurds added.
We avoid including any not-needed headers on asm/kprobes.h unless
kprobes have been enabled.

In a subsequent atomic change we can try now to remove compiler.h from
include/linux/kprobes.h.

During this sweep I've also identified a few architectures defining a
common macro needed for both kprobes and ftrace, that of the definition
of the breakput instruction up.  Some refer to this as
BREAKPOINT_INSTRUCTION.  This must be kept outside of the #ifdef
CONFIG_KPROBES guard.

[mcgrof@kernel.org: fix arm64 build]
  Link: http://lkml.kernel.org/r/CAB=NE6X1WMByuARS4mZ1g9+W=LuVBnMDnh_5zyN0CLADaVh=Jw@mail.gmail.com
[sfr@canb.auug.org.au: fixup for kprobes declarations moving]
  Link: http://lkml.kernel.org/r/20170214165933.13ebd4f4@canb.auug.org.au
Link: http://lkml.kernel.org/r/20170203233139.32682-1-mcgrof@kernel.org
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27 18:43:45 -08:00
Dan Streetman
fd5bb66cd9 zswap: don't param_set_charp while holding spinlock
Change the zpool/compressor param callback function to release the
zswap_pools_lock spinlock before calling param_set_charp, since that
function may sleep when it calls kmalloc with GFP_KERNEL.

While this problem has existed for a while, I wasn't able to trigger it
using a tight loop changing either/both the zpool and compressor params; I
think it's very unlikely to be an issue on the stable kernels, especially
since most zswap users will change the compressor and/or zpool from sysfs
only one time each boot - or zero times, if they add the params to the
kernel boot.

Fixes: c99b42c352 ("zswap: use charp for zswap param strings")
Link: http://lkml.kernel.org/r/20170126155821.4545-1-ddstreet@ieee.org
Signed-off-by: Dan Streetman <dan.streetman@canonical.com>
Reported-by: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27 18:43:45 -08:00
Dan Streetman
bae21db88b zswap: clear compressor or zpool param if invalid at init
If either the compressor and/or zpool param are invalid at boot, and
their default value is also invalid, set the param to the empty string
to indicate there is no compressor and/or zpool configured.  This allows
users to check the sysfs interface to see which param needs changing.

Link: http://lkml.kernel.org/r/20170124200259.16191-4-ddstreet@ieee.org
Signed-off-by: Dan Streetman <dan.streetman@canonical.com>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27 18:43:45 -08:00
Dan Streetman
ae3d89a7e0 zswap: allow initialization at boot without pool
Allow zswap to initialize at boot even if it can't create its pool due
to a failure to create a zpool and/or compressor.  Allow those to be
created later, from the sysfs module param interface.

Link: http://lkml.kernel.org/r/20170124200259.16191-3-ddstreet@ieee.org
Signed-off-by: Dan Streetman <dan.streetman@canonical.com>
Cc: Seth Jennings <sjenning@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27 18:43:45 -08:00
Arnd Bergmann
01cddfe990 mm,fs,dax: mark dax_iomap_pmd_fault as const
The two alternative implementations of dax_iomap_fault have different
prototypes, and one of them is obviously wrong as seen from this build
warning:

  fs/dax.c: In function 'dax_iomap_fault':
  fs/dax.c:1462:35: error: passing argument 2 of 'dax_iomap_pmd_fault' discards 'const' qualifier from pointer target type [-Werror=discarded-qualifiers]

This marks the argument 'const' as in all the related functions.

Fixes: a2d581675d ("mm,fs,dax: change ->pmd_fault to ->huge_fault")
Link: http://lkml.kernel.org/r/20170227203349.3318733-1-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-27 18:43:45 -08:00
Dave Airlie
a44ddbcbbd Merge tag 'drm-misc-next-fixes-2017-02-27' of git://anongit.freedesktop.org/git/drm-misc into drm-next
Misc fixes for the 4.11 merge window.

- vmwgfx drm_control node compat patch
- rockchip&zte fix
- compat32 support for dma-buf ioctl (cc: stable ofc, since this is a
  massive fumble. oops)

* tag 'drm-misc-next-fixes-2017-02-27' of git://anongit.freedesktop.org/git/drm-misc:
  dma-buf: add support for compat ioctl
  drm/vmwgfx: Work around drm removal of control nodes
  drm/rockchip: cdn-dp: Fix error handling
  drm/rockchip: add extcon dependency for DP
  drm: zte: fix static checker warning on variable 'fmt'
2017-02-28 12:28:00 +10:00
Bhumika Goyal
96297aee8b ide: palm_bk3710: add __initdata to palm_bk3710_port_info
The object palm_bk3710_port_info of type ide_port_info is never
referenced anywhere after initialization by palm_bk3710_probe. It is
also passed as a parameter to ide_host_add which is called from the init
function but this call doesn't store the object reference anywhere, and
it only dereferences the values of the fields. Therefore add __initdata
to its declaration.

Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-02-27 20:43:26 -05:00
Trond Myklebust
ff7d11797e nfsd: Fix display of the version string
The current display code assumes that v4 minor version 0 is tracked by
the call to nfsd_vers(). Now it is tracked by nfsd_minorversion(), and
so we need to adjust the display code.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-02-27 18:04:17 -05:00
Trond Myklebust
d3635ff07e nfsd: fix configuration of supported minor versions
When the user turns off all minor versions of NFSv4, that should be
equivalent to turning off NFSv4 support, so a mount attempt using NFSv4
should get RPC_PROG_MISMATCH, not NFSERR_MINOR_VERS_MISMATCH.

Allow the user to use either '4.0' or '4' to enable or disable minor
version 0.  Other minor versions are still enabled or disabled using the
'4.x' format.

Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2017-02-27 18:04:08 -05:00
Jaegeuk Kim
8b107f5b97 f2fs: avoid to issue redundant discard commands
If segs_per_sec is over 1 like under SMR, previously f2fs issues discard
commands redundantly on the same section, since we didn't move end position
for the previous discard command.

E.g.,

                       start  end
                         |    |
      prefree_bitmap = [01111100111100]

And, after issue discard for this section,
                             end      start
                              |        |
      prefree_bitmap = [01111100111100]

Select this section again by searching from (end + 1),
                             start  end
                                |   |
      prefree_bitmap = [01111100111100]

Fixes: 36abef4e79 ("f2fs: introduce mode=lfs mount option")
Cc: <stable@vger.kernel.org>
Cc: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-27 13:44:08 -08:00
Linus Torvalds
45554b2357 Commit 79c6f448c8 ("tracing: Fix hwlat kthread migration") fixed a
bug that was caused by a race condition in initializing the hwlat
 thread. When fixing this code, I realized that it should have been done
 differently. Instead of doing the rewrite and sending that to stable,
 I just sent the above commit to fix the bug that should be back ported.
 
 This commit is on top of the quick fix commit to rewrite the code the
 way it should have been written in the first place.
 -----BEGIN PGP SIGNATURE-----
 
 iQExBAABCAAbBQJYtDsNFBxyb3N0ZWR0QGdvb2RtaXMub3JnAAoJEMm5BfJq2Y3L
 +CQH/0BhSwjiCnlHAkHNFKn47O0yDtxBLj8ar4bUQRacDXeQyAGDP13hn3q3LvG9
 CRzDXaYrusA3fjGcgmtyU33am6LK84dPn5u2HSyEalDZNBel8l6oYLUZVWLgef02
 x43949nOeBy+KO02Y118zGyxFEPtYBnCVpguMa4vdVgnr04gECo2VH5FjnLMslKM
 W1j/WrbVaO8WObh7X01JZzozWwp3McW4x6H8PUWaHjnhN/Iv6+YGtNN/Sa4cq4V/
 CyCfxrZvN/Y/uMSGzVlhuXxeRc2PRsjjmAqN+8P4KZIGW5BstdiWbrVj+KaBLR9z
 6QERD3atiEIYI/QGEep7ZH795PI=
 =Bdwk
 -----END PGP SIGNATURE-----

Merge tag 'trace-v4.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull another tracing update from Steven Rostedt:
 "Commit 79c6f448c8 ("tracing: Fix hwlat kthread migration") fixed a
  bug that was caused by a race condition in initializing the hwlat
  thread. When fixing this code, I realized that it should have been
  done differently. Instead of doing the rewrite and sending that to
  stable, I just sent the above commit to fix the bug that should be
  back ported.

  This commit is on top of the quick fix commit to rewrite the code the
  way it should have been written in the first place"

* tag 'trace-v4.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Clean up the hwlat binding code
2017-02-27 13:36:19 -08:00
Linus Torvalds
79b17ea740 This release has no new tracing features, just clean ups, minor fixes
and small optimizations.
 -----BEGIN PGP SIGNATURE-----
 
 iQExBAABCAAbBQJYtDiAFBxyb3N0ZWR0QGdvb2RtaXMub3JnAAoJEMm5BfJq2Y3L
 KygH/3sxuM9MCeJ29JsjmV49fHcNqryNZdvSadmnysPm+dFPiI6IgIIbh5R8H89b
 2V2gfQSmOTKHu3/wvJr/MprkGP275sWlZPORYFLDl/+NE/3q7g0NKOMWunLcv6dH
 QQRJIFjSMeGawA3KYBEcwBYMlgNd2VgtTxqLqSBhWth5omV6UevJNHhe3xzZ4nEE
 YbRX2mxwOuRHOyFp0Hem+Bqro4z1VXJ6YDxOvae2PP8krrIhIHYw9EI22GK68a2g
 EyKqKPPaEzfU8IjHIQCqIZta5RufnCrDbfHU0CComPANBRGO7g+ZhLO11a/Z316N
 lyV7JqtF680iem7NKcQlwEwhlLE=
 =HJnl
 -----END PGP SIGNATURE-----

Merge tag 'trace-v4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing updates from Steven Rostedt:
 "This release has no new tracing features, just clean ups, minor fixes
  and small optimizations"

* tag 'trace-v4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (25 commits)
  tracing: Remove outdated ring buffer comment
  tracing/probes: Fix a warning message to show correct maximum length
  tracing: Fix return value check in trace_benchmark_reg()
  tracing: Use modern function declaration
  jump_label: Reduce the size of struct static_key
  tracing/probe: Show subsystem name in messages
  tracing/hwlat: Update old comment about migration
  timers: Make flags output in the timer_start tracepoint useful
  tracing: Have traceprobe_probes_write() not access userspace unnecessarily
  tracing: Have COMM event filter key be treated as a string
  ftrace: Have set_graph_function handle multiple functions in one write
  ftrace: Do not hold references of ftrace_graph_{notrace_}hash out of graph_lock
  tracing: Reset parser->buffer to allow multiple "puts"
  ftrace: Have set_graph_functions handle write with RDWR
  ftrace: Reset fgd->hash in ftrace_graph_write()
  ftrace: Replace (void *)1 with a meaningful macro name FTRACE_GRAPH_EMPTY
  ftrace: Create a slight optimization on searching the ftrace_hash
  tracing: Add ftrace_hash_key() helper function
  ftrace: Convert graph filter to use hash tables
  ftrace: Expose ftrace_hash_empty and ftrace_lookup_ip
  ...
2017-02-27 13:26:17 -08:00
Hou Pengyang
37e79cd31c f2fs: fix a plint compile warning
fix such pclint warning:
...
Loss of precision (arg. no. 2) (unsigned long long to unsigned int))

Signed-off-by: Hou Pengyang <houpengyang@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-27 10:51:21 -08:00
Hou Pengyang
b8d96a30b6 f2fs: add f2fs_drop_inode tracepoint
Signed-off-by: Hou Pengyang <houpengyang@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-27 10:51:20 -08:00
Masato Suzuki
7bb3a371d1 f2fs: Fix zoned block device support
The introduction of the multi-device feature partially broke the support
for zoned block devices. In the function f2fs_scan_devices, sbi->devs
allocation and initialization is skipped in the case of a single device
mount. This result in no device information structure being allocated
for the device. This is fine if the device is a regular device, but in
the case of a zoned block device, the device zone type array is not
initialized, which causes the function __f2fs_issue_discard_zone to fail
as get_blkz_type is unable to determine the zone type of a section.

Fix this by always allocating and initializing the sbi->devs device
information array even in the case of a single device if that device is
zoned. For this particular case, make sure to obtain a reference on the
single device so that the call to blkdev_put() in destroy_device_list
operates as expected.

Fixes: 3c62be17d4 ("f2fs: support multiple devices")
Cc: <stable@vger.kernel.org> # v4.10
Signed-off-by: Masato Suzuki <masato.suzuki@wdc.com>
Acked-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-27 10:40:12 -08:00
Yunlei He
4fcf589ad4 f2fs: remove redundant set_page_dirty()
This patch remove redundant set_page_dirty in truncate_blocks

Signed-off-by: Yunlei He <heyunlei@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-27 10:40:11 -08:00
Chao Yu
a3ebfe4fd8 f2fs: fix to enlarge size of write_io_dummy mempool
It needs to double cache size of write_io_dummy mempool, otherwise we may
run out of cache in scenraio of Data/Node IOs were issued concurrently.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-27 10:40:10 -08:00
Chao Yu
b6895e8f99 f2fs: fix memory leak of write_io_dummy mempool during umount
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-27 10:40:09 -08:00
Chao Yu
540faedb00 f2fs: fix to update F2FS_{CP_}WB_DATA count correctly
We should only account F2FS_{CP_}WB_DATA IOs for write path, fix it.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-27 10:40:08 -08:00
Kinglong Mee
f0cdbfe6ef f2fs: use MAX_FREE_NIDS for the free nids target
F2FS has define MAX_FREE_NIDS for maximum of cached free nids target.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-27 10:07:48 -08:00
Chao Yu
4ac912427c f2fs: introduce free nid bitmap
In scenario of intensively node allocation, free nids will be ran out
soon, then it needs to stop to load free nids by traversing NAT blocks,
in worse case, if NAT blocks does not be cached in memory, it generates
IOs which slows down our foreground operations.

In order to speed up node allocation, in this patch we introduce a new
free_nid_bitmap array, so there is an bitmap table for each NAT block,
Once the NAT block is loaded, related bitmap cache will be switched on,
and bitmap will be set during traversing nat entries in NAT block, later
we can query and update nid usage status in memory completely.

With such implementation, I expect performance of node allocation can be
improved in the long-term after filesystem image is mounted.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-27 10:07:47 -08:00
Kinglong Mee
ced2c7ea8e f2fs: new helper cur_cp_crc() getting crc in f2fs_checkpoint
There are four places that getting the crc value in f2fs_checkpoint,
just add a new helper cur_cp_crc for them.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-27 10:07:47 -08:00
Kinglong Mee
727ebb091e f2fs: update the comment of default nr_pages to skipping
Fixes: 2c237ebaa4 ("f2fs: avoid writing node/metapages during writes")
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-27 10:07:46 -08:00
Kinglong Mee
0ec4a5b647 f2fs: drop the duplicate pval in f2fs_getxattr
Fixes: ba38c27eb9 ("f2fs: enhance lookup xattr")
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-27 10:07:45 -08:00
Kinglong Mee
5f35a2cd5b f2fs: Don't update the xattr data that same as the exist
f2fs removes the old xattr data and appends the new data although
the new data is same as the exist.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-27 10:07:44 -08:00
Chao Yu
317e130096 f2fs: kill __is_extent_same
Since commit ee6d182f2a ("f2fs: remove syncing inode page in all the
cases") delayed inode element updating from inode cache to node page
cache, so once largest cached extent is updated, we can make inode dirty
immediately instead of checking and updating it in the end of extent
cache update.

The above commit didn't clean up unneeded codes in extent_cache.c, let's
finish the job in this patch.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-27 10:07:43 -08:00
Hou Pengyang
19f4e688f8 f2fs: avoid bggc->fggc when enough free segments are avaliable after cp
We use has_not_enough_free_secs to check if there are enough free segments,

    	(free_sections(sbi) + freed) <=
	        (node_secs + 2 * dent_secs + imeta_secs +
			         reserved_sections(sbi) + needed);

Under scenario with large number of dirty nodes, these nodes would be flushed
during cp, as a result, right side of the inequality would be decreased, while
left side stays unchanged if these nodes are flushed in SSR way, which means
there are enough free segments after this cp.

For this case, we just do a bggc instead of fggc.

Signed-off-by: Hou Pengyang <houpengyang@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-27 10:07:37 -08:00
Chao Yu
d27c3d89db f2fs: select target segment with closer temperature in SSR mode
In SSR mode, we can allocate target segment which has different
temperature type from the type of current block, in order to avoid
mixing coldest and hottest data/node as much as possible, change
SSR allocation policy to select closer temperature for current
block prior.

Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-27 09:59:56 -08:00
Chao Yu
55523519bc f2fs: show simple call stack in fault injection message
Previously kernel message can show that in which function we do the
injection, but unfortunately, most of the caller are the same, for
tracking more information of injection path, it needs to show upper
caller's name. This patch supports that ability.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-27 09:59:55 -08:00
Yunlei He
dd7b2333e6 f2fs: no need lock_op in f2fs_write_inline_data
Similar as f2fs_write_inode, f2fs_write_inline_data just
mark inode page dirty, so it's no need to write inline data
under read lock of cp_rwsem.

Signed-off-by: Yunlei He <heyunlei@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-27 09:59:55 -08:00
Jaegeuk Kim
22ad0b6ab4 f2fs: add bitmaps for empty or full NAT blocks
This patches adds bitmaps to represent empty or full NAT blocks containing
free nid entries.

If we can find valid crc|cp_ver in the last block of checkpoint pack, we'll
use these bitmaps when building free nids. In order to avoid checkpointing
burden, up-to-date bitmaps will be flushed only during umount time. So,
normally we can get this gain, but when power-cut happens, we rely on fsck.f2fs
which recovers this bitmap again.

After this patch, we build free nids from nid #0 at mount time to make more
full NAT blocks, but in runtime, we check empty NAT blocks to load free nids
without loading any NAT pages from disk.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-27 09:59:54 -08:00
Yunlei He
5e8256ac2e f2fs: replace rw semaphore extent_tree_lock with mutex lock
This patch replace rw semaphore extent_tree_lock with mutex lock
for no read cases with this lock.

Signed-off-by: Yunlei He <heyunlei@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-27 09:59:53 -08:00
Kinglong Mee
3f2be04304 f2fs: avoid m_flags overlay when allocating more data blocks
When more than one data blocks are allocated, the F2FS_MAP_UNWRITTEN/MAPPED
flags will be overlapped by F2FS_MAP_NEW at the later times.

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-27 09:59:52 -08:00
Hou Pengyang
6bfaf7b150 f2fs: remove unsafe bitmap checking
proc A:                      proc B:
- writeback_sb_inodes
- __writeback_single_inode
- do_writepages
- f2fs_write_node_pages
- f2fs_balance_fs_bg         - write_checkpoint
- build_free_nids            - flush_nat_entries
- __build_free_nids          - __flush_nat_entry_set
- ra_meta_pages              - get_next_nat_page
- current_nat_addr           - set_to_next_nat
[do nat_bitmap checking]     - f2fs_change_bit

For proc A, nat_bitmap and nat_bitmap_mir would be compared without lock_op and
nm_i->nat_tree_lock, while proc B is changing nat_bitmap/nat_bitmap_ver in cp.

So it is normal for nat_bitmap/nat_bitmap diffrence under such scenario.

This patch fix this by removing the monitoring point.

[Fix: 599a09b f2fs: check in-memory nat version bitmap]
Signed-off-by: Hou Pengyang <houpengyang@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-27 09:59:51 -08:00
Hou Pengyang
e15882b6c6 f2fs: init local extent_info to avoid stale stack info in tp
To avoid such stale(fops, blk, len) info in f2fs_lookup_extent_tree_end tp

dio-23095 [005] ...1 17878.856859: f2fs_lookup_extent_tree_end:
			dev = (259,30), ino = 856, pgofs = 0,
			ext_info(fofs: 3441207040, blk: 4294967232, len: 3481143808)

Signed-off-by: Hou Pengyang <houpengyang@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-02-27 09:59:50 -08:00