Previously, we skip dentry block writes when wbc is SYNC_NONE with no memory
pressure and the number of dirty pages is pretty small.
But, we didn't skip for normal data writes, which gives us not much big impact
on overall performance.
Moreover, by skipping some data writes, kworker falls into infinite loop to try
to write blocks, when many dir inodes have only one dentry block.
So, this patch removes skipping data writes.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Protecting recovery flow by using cp_rwsem is not needed, since we have
prevent triggering any checkpoint by locking cp_mutex previously.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
In update_sit_info, we use div_u64 to handle 'u64 divide u64' case, but
div_u64 can only handle 32-bits divisor, so our divisor with u64 type
passed to div_u64 will overflow, result in the wrong calculation when
show debug info of f2fs as below:
BDF: 464, avg. vblocks: 23509
(BDF should never exceed 100)
So change to use div64_u64 to handle this case correctly.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This patch adds a new helper __try_update_largest_extent for cleanup.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This fixes error handling for calls to various functions in the
function recover_inline_data to check if these particular functions
either return a error code or the boolean value false to signal their
caller they have failed internally and if this arises return false
to signal failure immediately to the caller of recover_inline_data
as we cannot continue after failures to calling either the function
truncate_inline_inode or truncate_blocks.
Signed-off-by: Nicholas Krause <xerofoify@gmail.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Swith extent_cache option dynamically when remount may casue consistency
issue between extent cache and dnode page. Fix in this patch to avoid
that condition.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
We introduce F2FS_GET_BLOCK_READ in commit e2b4e2bc88 ("f2fs: fix
incorrect mapping for bmap"), but forget to use this flag in the right
place, fix it.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
truncate_data_blocks_range can do in batches truncation which makes all
changes in dnode page content, dnode page status, extent cache, block
count updating together.
But previously, truncate_hole() always truncates one block in dnode page
at a time by invoking truncate_data_blocks_range(,1), which make thing
slow.
This patch changes truncate_hole() to do in batches truncation for all
target blocks in one direct node inside truncate_data_blocks_range, which
can make our punch hole operation in ->fallocate more efficent.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Fix 2 potential problems:
1. when largest extent needs to be invalidated, it will be reset in
__drop_largest_extent, which makes __is_extent_same after always
return false, and largest extent unchanged. Now we update it properly.
2. when extent is split and the latter part remains in tree, next_en
should be the latter part instead of next extent of original extent.
It will cause merge failure if there is in-place update, although
there is not, I think this fix will still makes codes less ambiguous.
This patch also simplifies codes of invalidating extents, and optimizes the
procedues that split extent into two.
There are a few modifications after last patch:
1. prev_en now is updated properly.
2. more codes and branches are simplified.
Signed-off-by: Fan li <fanofcode.li@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
now we update extent by range, fofs may not be on the largest
extent if the new extent overlaps with it. so add a new function
to drop largest extent properly.
Signed-off-by: Fan li <fanofcode.li@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This patch avoids to produce new checkpoint blocks before the previous meta
pages were written completely.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
We got dentry pages from high_mem, and its address space directly goes into the
decryption path via f2fs_fname_disk_to_usr.
But, sg_init_one assumes the address is not from high_mem, so we can get this
panic since it doesn't call kmap_high but kunmap_high is triggered at the end.
kernel BUG at ../../../../../../kernel/mm/highmem.c:290!
Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
...
(kunmap_high+0xb0/0xb8) from [<c0114534>] (__kunmap_atomic+0xa0/0xa4)
(__kunmap_atomic+0xa0/0xa4) from [<c035f028>] (blkcipher_walk_done+0x128/0x1ec)
(blkcipher_walk_done+0x128/0x1ec) from [<c0366c24>] (crypto_cbc_decrypt+0xc0/0x170)
(crypto_cbc_decrypt+0xc0/0x170) from [<c0367148>] (crypto_cts_decrypt+0xc0/0x114)
(crypto_cts_decrypt+0xc0/0x114) from [<c035ea98>] (async_decrypt+0x40/0x48)
(async_decrypt+0x40/0x48) from [<c032ca34>] (f2fs_fname_disk_to_usr+0x124/0x304)
(f2fs_fname_disk_to_usr+0x124/0x304) from [<c03056fc>] (f2fs_fill_dentries+0xac/0x188)
(f2fs_fill_dentries+0xac/0x188) from [<c03059c8>] (f2fs_readdir+0x1f0/0x300)
(f2fs_readdir+0x1f0/0x300) from [<c0218054>] (vfs_readdir+0x90/0xb4)
(vfs_readdir+0x90/0xb4) from [<c0218418>] (SyS_getdents64+0x64/0xcc)
(SyS_getdents64+0x64/0xcc) from [<c0105ba0>] (ret_fast_syscall+0x0/0x30)
Cc: <stable@vger.kernel.org>
Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
In this patch, we try to reorganize f2fs_map_blocks to make block mapping
flow more clear by using following structure:
/* check status of mapping */
if (unmapped) {
/* blkaddr == NULL_ADDR || blkaddr == NEW_ADDR */
if (create) {
/* write path, handle dio write case here */
alloc_and_map;
} else {
/*
* handle read cases from all call paths:
* 1. generic read;
* 2. dio read;
* 3. fiemap;
* 4. bmap
*/
}
}
/* map buffer_header */
Besides, this patch handles the missing case correctly for dio write:
When we fail in __allocate_data_blocks, then in f2fs_map_blocks, we will
not allocate blocks correctly for preallocated blocks, but returning with
an unmapped buffer head, which will result in failure of dio write.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
We have potential overflow issue when calculating size of object, when
we left shift index with PAGE_CACHE_SHIFT bits, if type of index has only
32-bits space in 32-bit architecture, left shifting will incur overflow,
i.e:
pgoff_t index = 0xFFFFFFFF;
loff_t size = index << PAGE_CACHE_SHIFT;
size: 0xFFFFF000
So we should cast index with 64-bits type to avoid this issue.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
When shrinking extent cache, we have two steps in the flow:
1) shrink objects which are unreferenced by inodes;
2) shrink objects from LRU list of extent cache.
In step 1, if we haven't shrunk enough number of objects, we will try
step 2, but before that we didn't update the searching position which
may point to last inode index in global extent tree, result in failing
to shrink objects by traversing the all inodes' extent tree.
In this patch, we reset searching position to beginning of global extent
tree for fixing.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This patch changes to verify file type early in f2fs_fallocate for
cleanup, meanwhile this also fixes to add missing verification for
expand_inode_data.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
As comment says, we don't need to call f2fs_lock_op in write_inode to prevent
from producing dirty node pages all the time.
That happens only when there is not enough free sections and we can avoid that
by calling balance_fs in prior to that.
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This number is referenced by checkpoint under node_write lock.
Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
This fixes the incorrect return statement at the end of the function
f2fs_ioc_release_volatile_write's body for returning zero as this is
incorrect due to the function call before this return statement to
the function punch_hole being able to fail and we should return this
function's return fail directly in order to signal to callers of the
function f2fs_ioc_release_volatile if a failure arises with this call
to punch_hole fails.
Signed-off-by: Nicholas Krause <xerofoify@gmail.com>
Reviewed-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Rename trace_f2fs_update_extent_tree to trace_f2fs_update_extent_tree_range,
then expand and enable it to trace in batches extent info updates.
Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
- A couple of locking fixes for RT kernels
- Avoid printing bogus initrd warnings when initrd isn't present
- Performance fix for random mmap file readahead
- Typo fix
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABCgAGBQJWFO+aAAoJELescNyEwWM0z14IALpleyenZXl+xqxMjNyOXouG
/2SbTZH7iD/vnfCL6G7/Poq00I2ghtBSRFGXajfg7V0mjH1HTfVXVN19IXaUUwjW
IfAMqSyC43dDBdsGn3A1ZqPRNk+chxjwz7zimGqPowuM87C4aj/sqetBSnuybZtB
lSYfCFGpDj8cpsJ0xwYYhuq8xUgixQMslTj+rVAFtfsLkDHUu175l+vP7t2xOv5v
bmyPlz15O/v9febnLYFVFPWZB2IWfvaFkR30qUGsMe6BGWdGDe/RGUPksZDLlPdL
Yj4AKq+9Bx0lPvO+vNEqfvScKdVjMpttVEMfi2cQ8kUbD1rRPZ7ZTHsfcEOuB1w=
=Zi+I
-----END PGP SIGNATURE-----
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"This addresses a couple of issues found with RT, a broken initrd
message in the console log and a simple performance fix for some MMC
workloads.
Summary:
- A couple of locking fixes for RT kernels
- Avoid printing bogus initrd warnings when initrd isn't present
- Performance fix for random mmap file readahead
- Typo fix"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: replace read_lock to rcu lock in call_break_hook
arm64: Don't relocate non-existent initrd
arm64: convert patch_lock to raw lock
arm64: readahead: fault retry breaks mmap file read random detection
arm64: debug: Fix typo in debug-monitors.c
* fbdev: Minor fixes to broadsheetfb, fsl-diu-fb, mb862xxfb, tridentfb, omapfb
* display-timing: Fix memory leak in error path
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWFQlvAAoJEPo9qoy8lh711iYQAK4tBVPRPRPgNFkCWfeqaRvd
UvJcLrsVddmwFqQj+CW+PxHWqhFEcSRFaaJmWirBAtumGCAYxy2BeFjY6H32JzvK
tFVIwt1GuaO+89d3xq6AuvamiH9X4pPuu/bVfIWAP4hfpKsuKAQ+UtdPHRsRTkBe
2PVKLc4ghxPMnp0XdcDYRmJ0RrejctISmmelScc6waqhRC8+bnEcVZB5CP/N8iM9
+PvoxmS2e7Cn6ZxMMqSrEsFwpqtxuUhzrIEVrcbfAnymp37uiKai76FJe1JEev3T
gXnqXAsfMbdNjtVpHQVeb3B43ZsslEruXOQZHyhW9CH2+7QgOsVfE9PDyIEDxM+9
UZH8jLwh0EhLhnfBMBmeUmy9mQXXUhyQ5qBXJ4ubBc304tS/2LnVvh507WbzKEhY
2gk0ZmqsSzpn1+NoZ0HEkZM1toG7sWEo/KuYehyifzNrPikG6jzPqbdF64ZED5Ev
mS+fOfIus4vCXX/7cf4waDXzOhlZl+arAEOM1+FK9bE80JPZnK1xZQain7wrPSHR
uaM5AAtbPSVj3w5lk1ntY7ck/2R+DRKjwVGr8Qm7EdWnE5LQcUrS619q77Y9NJ7r
sqnfMToKarqAnfcU/4sQ2ObV8nfhsZQimeffOBVRnbMlRIo7aD+d+n4WXAICNvH+
3A/vqTo/pbWL9TBXce/E
=hZoA
-----END PGP SIGNATURE-----
Merge tag 'fbdev-fixes-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux
Pull fbdev fixes from Tomi Valkeinen:
- fbdev: Minor fixes to broadsheetfb, fsl-diu-fb, mb862xxfb, tridentfb,
omapfb
- display-timing: Fix memory leak in error path
* tag 'fbdev-fixes-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux:
video: of: fix memory leak
fbdev: broadsheetfb: fix memory leak
OMAPDSS: panel-sony-acx565akm: Export OF module alias information
fbdev: omap2: connector-dvi: use of_get_i2c_adapter_by_node interface
tridentfb: Fix set_lwidth on TGUI9440 and CYBER9320
tridentfb: fix hang on Blade3D with CONFIG_CC_OPTIMIZE_FOR_SIZE
video: fbdev: mb862xx: Fix module autoload for OF platform driver
video: fbdev: fsl: Fix the sleep function for FSL DIU module
A couple of fixes for the debugfs information on the register map,
fixing issues with very small reads potentially causing underflows and
wraparounds.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJWFPtTAAoJECTWi3JdVIfQlRUH/1j/cFl0+eDBnjQjg25+bMOn
HhGQtxcGlOXIwDzRc7vCzuH8RjqJnY6jLlYzAjbYiIEoRAHVOEtWZCeSMN8bTfaK
lUe2XN6TLrAweJYP+LC+NoIXKzW4jUDXG3ZqhkRllsnn9eoLsIFBxBV+E5jS4LNo
8AFMfdlzbKaEIrAu9w6c5pHmgwacyXETOKej+GuJ/RP6a5SRPAS1FWCznLc8gY7n
NBh+vMeYdtRV57PpXxVa/sAFjqkglwXhf4C7Gkhcv5Ecu4utf5LqL3w9i5Al5Rmz
5DxO3JaFdK4ae74wdo8iCFddXJBu+TiB9UB6mvhrZ63W8ECw3xjcE4PcYuL3BN0=
=QeA5
-----END PGP SIGNATURE-----
Merge tag 'regmap-fix-v4.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap
Pull regmap fixes from Mark Brown:
"A couple of fixes for the debugfs information on the register map,
fixing issues with very small reads potentially causing underflows and
wraparounds"
* tag 'regmap-fix-v4.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: debugfs: Don't bother actually printing when calculating max length
regmap: debugfs: Ensure we don't underflow when printing access masks
A couple of very minor fixes, one for error handling in the Davinci
driver probe function and another making the Renesas sh-msiof DT binding
documentation correspond to what's actually implemented.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJWFPjZAAoJECTWi3JdVIfQvWcH/0aOEA6dcgurWr4IlWpI9vS0
ueMrT3CEDwyJOhkizxMTQSDs7K01OGOag1Dq6nZm2k38niFD7CeUU6iNXZrKoTXf
TSBiqgRitZ5HnpYJhYz/HP8HTktAZVuFaTD2IpSX6+gqiTinIYKVX4MrnZmIcRjR
uGXosAd2n3552hXjcDVOuRlmvjmd30wP96/hc0/24yfrehEcBoksM4dgnpMDC5d4
OQaP6Y8ZXkqeYYrL+xN4i6w9wwTdt+Io+7cYZoRVKJ3+CB9jHAnONRJzTPqRWEDz
vhOA2/fHmvLierP1JySz1CZzYdVekVpPYsobmsYHKddoLXBLvlaYWJajj1kTMlc=
=RMP1
-----END PGP SIGNATURE-----
Merge tag 'spi-fix-v4.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi
Pull spi fixes from Mark Brown:
"A couple of very minor fixes, one for error handling in the Davinci
driver probe function and another making the Renesas sh-msiof DT
binding documentation correspond to what's actually implemented"
* tag 'spi-fix-v4.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: sh-msiof: Match renesas,rx-fifo-size in DT bindings doc with driver
spi: davinci: fix handling platform_get_irq result
Two fixes here, one device specific fix for axp20x and a core fix for
cases where one regulator is supplying another which broke probe
deferral, substituting in a dummy regulator too aggressively.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJWFO2YAAoJECTWi3JdVIfQl6YH/iBuGvFaDNJn6qBI1/qae9KO
HQYWda7fVKB8l/4aotV4tfzW0ikIDMeIzSzvekpZK6SKgVnKAr3dzKLL3tkDT9IE
tl5ibs5nJOynGk30liVSALfUFU2IPqV0ikNQxbhBq13mIR9YJinFuow8kcCAWoSj
BJUIE/cp6fHoDXWMqJDETCBtwERBmU7VZ6WWhaZZKTrmMzM0zdfNyjisrOTx+Npm
k2+4zAs8ep7MCn0Q9V4Q+DdNOq3zZPcMjx6yMLZt71pJBiG+TP/wK89AIAq7vyjk
y5qUqdFBN/4cvtmLCLI4F9lThSBgRHLSl1NOJc9TWQgu/xT8YEmK2xqE9J+cyLQ=
=xZTQ
-----END PGP SIGNATURE-----
Merge tag 'regulator-fix-v4.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator fixes from Mark Brown:
"Two fixes here, one device specific fix for axp20x and a core fix for
cases where one regulator is supplying another which broke probe
deferral, substituting in a dummy regulator too aggressively"
* tag 'regulator-fix-v4.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: core: Handle probe deferral from DT when resolving supplies
regulator: axp20x: Fix enable bit indexes for DCDC4 and DCDC5
If of_parse_display_timing() fails we are printing an error message and
jumping to the error path but we missed freeing "dt".
Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Pull strscpy fixes from Chris Metcalf :
"This patch series fixes up a couple of architecture issues where
strscpy wasn't configured correctly (missing on h8300, duplicating
local and asm-generic copies on powerpc and tile).
It also adds a use of zero_bytemask() to the final store for strscpy
to avoid writing uninitialized data to the destination. However, to
make this work we had to add support for zero_bytemask() to the two
architectures that didn't have it (alpha and tile), because they were
providing their own local copies, but didn't provide the
zero_bytemask() that was previously only required when building with
CONFIG_DCACHE_WORD_ACCESS"
[ Side note: there is still no actual users of strscpy except for the
one preexisting use in arch/tile that predates the generic version.
So this is all about fixing the infrastructure so that we eventually
can start using it. - Linus ]
* 'strscpy' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
strscpy: zero any trailing garbage bytes in the destination
word-at-a-time.h: support zero_bytemask() on alpha and tile
word-at-a-time.h: fix some Kbuild files
* mxc_nand: a "refactoring only" change in 4.3-rc1 had some bad pointer
(array) arithmetic. Fix that
* sunxi_nand:
- Fix an old list manipulation / memory management bug in the device
release() code path
- Correct a few mistakes in OOB write support
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJWFALTAAoJEFySrpd9RFgt6gIP/in7Zl3mpVZi3D36ui6+EIiF
cHSLItczBtUg5JrE3gm7RzXsOI8CE/ExvN0Yqc1uHM5EDHZWHWIKlRxdD9CrhIv8
srM2icbG4AJBeiizCuufok3/nOMjt5azHX6RdDA2BClYP2T5GH/CthJabfTHRpqb
yhDs7WRAbS7jPhmLe0rl3yTBnMnY798bzzo7VQD0g95lgOU4K2Wj78NyuMx5NSsK
l8mH0GuNne7cHCh1z4HSiSzSyJQoTnjhd8sA1B/39t/V89EHm87P7iYN4NQ7zayK
vq9GEOrm1/jVDCPxGihJkt3jkt4QmnQeU8z+Ne42muySvvP+eXpihfX/xSstBYQl
QsTX/odoU7UpBwy9gd3dO/2o5s/+/E16VtEKEWJfExVaRHwhSmrBY/aSAnzGMomh
JSwOkgI5r/hToDB+sQ9/MANwQ1kq1gZIIzaANkdK+rm0x0HuVh3ODZYC3xBexgdf
5FVAaLhxMQaFhGv7JGgL5aPZAZzboUn0awmMBdiH5JWA5hCuCEUkyN6ggcTqvJlM
wxS4lkofpnEfTVNiOjitS+nPnXj8Vn3OB5fKUoPe0u33GZuJraq+OW6gCgSZtSWp
+cOxC1WiREOkxei6+MiR5eOF2kU4uZUGjUeZeNs1QdXUNkSTX66DtmT/ep6WrtVi
aDI1fczLhpi966uI05gw
=A7qE
-----END PGP SIGNATURE-----
Merge tag 'for-linus-20151006' of git://git.infradead.org/linux-mtd
Pull MTD fixes from Brian Norris:
"A few MTD fixes:
- mxc_nand: a "refactoring only" change in 4.3-rc1 had some bad
pointer (array) arithmetic. Fix that
- sunxi_nand:
- Fix an old list manipulation / memory management bug in the device
release() code path
- Correct a few mistakes in OOB write support"
* tag 'for-linus-20151006' of git://git.infradead.org/linux-mtd:
mxc_nand: fix copy_spare
mtd: nand: sunxi: fix sunxi_nand_chips_cleanup()
mtd: nand: sunxi: fix OOB handling in ->write_xxx() functions
Highlights include:
Bugfixes:
- Fix a use-after-free bug in the RPC/RDMA client
- Fix a write performance regression
- Fix up page writeback accounting
- Don't try to reclaim unused state owners
- Fix a NFSv4 nograce recovery hang
- reset states to use open_stateid when returning delegation voluntarily
- Fix a tracepoint NULL-pointer dereference
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWFIgBAAoJEGcL54qWCgDy3qwQAJrMvwiO0shZe9+PsUZcDIhw
1CnDmWYafJmpNGK+YEZatI+tdR9pwSYXdfiCGj/Ijfvl1PXUgyVAmNARAB9oFUza
DVvZjqJ6aiFzeawGC8f2IfwY4XcAy4+BIZOiwp2JafepRnoSgZl24olKbO4cQ7UD
i5IaDrYYvAxefsUoRogEF19H1y8zC1yUA2aDKrriV6A9rEZSbaZLRfS8BHppXBjY
w0OP74neD4rnn/rL0YDEdsjiI17W7QwoMk05yzOJH3wQt/Y4Ll/lwLO4y3URpIGF
wzHzMIeggGPPEM9e1JixPc3Y9F9kCHW8YjGJ3xxY2C6q8vt7dzpaVhh10AxycZtZ
gcbepjMhoL7gJqu5DQ/0S86Sb5jNaL0KlUDsEnqtOfe3/UiyTJ/f57TMfdscm+wI
pdyFFtxUHcFueO1a2XuEOuSIUFzFuwIQ2aiHlbu90ev04dd7dqzU0PffhRlzu3tJ
8+ZHQMbSmotUmhxlpI+VA4rG0JUsaLY09chH5r0NvsXm0LR+z3vX7Q6oONN7IBDv
5hULj4ecB69smBv+FjQyVUAu0LiahINAGu0p0wEjTdBwFMic5qpVVfhTs8qrkGRZ
M8RYrANtVhY17fJf5WF7Wyt58icAWRKDHslGdzUav+2VFBfNK1ZeG+QhYYqDNF5k
SkJsG4iCIN9JazwqfqJI
=aoNS
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-4.3-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfixes from Trond Myklebust:
"Highlights include:
Bugfixes:
- Fix a use-after-free bug in the RPC/RDMA client
- Fix a write performance regression
- Fix up page writeback accounting
- Don't try to reclaim unused state owners
- Fix a NFSv4 nograce recovery hang
- reset states to use open_stateid when returning delegation
voluntarily
- Fix a tracepoint NULL-pointer dereference"
* tag 'nfs-for-4.3-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
NFS: Fix a tracepoint NULL-pointer dereference
nfs4: reset states to use open_stateid when returning delegation voluntarily
NFSv4: Fix a nograce recovery hang
NFSv4.1: nfs4_opendata_check_deleg needs to handle NFS4_OPEN_CLAIM_DELEG_CUR_FH
NFSv4: Don't try to reclaim unused state owners
NFS: Fix a write performance regression
NFS: Fix up page writeback accounting
xprtrdma: disconnect and flush cqs before freeing buffers
This reverts commit 998ef75ddb.
The commit itself does not appear to be buggy per se, but it is exposing
a bug in ext4 (and Ted thinks ext3 too, but we solved that by getting
rid of it). It's too late in the release cycle to really worry about
this, even if Dave Hansen has a patch that may actually fix the
underlying ext4 problem. We can (and should) revisit this for the next
release.
The problem is that moving the prefaulting later now exposes a special
case with partially successful writes that isn't handled correctly. And
the prefaulting likely isn't normally even that much of a performance
issue - it looks like at least one reason Dave saw this in his
performance tests is that he also ran them on Skylake that now supports
the new SMAP code, which makes the normally very cheap user space
prefaulting noticeably more expensive.
Bisected-and-acked-by: Ted Ts'o <tytso@mit.edu>
Analyzed-and-acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Running xfstest generic/013 with the tracepoint nfs:nfs4_open_file
enabled produces a NULL-pointer dereference when calculating fileid and
filehandle of the opened file. Fix this by checking if state is NULL
before trying to use the inode pointer.
Reported-by: Olga Kornievskaia <aglo@umich.edu>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
It's possible that the destination can be shadowed in userspace
(as, for example, the perf buffers are now). So we should take
care not to leak data that could be inspected by userspace.
Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
arch/tile added word-at-a-time.h after the patch that added generic-y
entries; the generic-y entry is now stale.
arch/h8300 is newer than the generic-y patch for word-at-a-time.h,
and needs a generic-y entry.
arch/powerpc seems to have gotten a generic-y entry by mistake in
the first patch; this change removes it.
Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
When booting a kernel without an initrd, the kernel reports that it
moves -1 bytes worth, having gone through the motions with initrd_start
equal to initrd_end:
Moving initrd from [4080000000-407fffffff] to [9fff49000-9fff48fff]
Prevent this by bailing out early when the initrd size is zero (i.e. we
have no initrd), avoiding the confusing message and other associated
work.
Fixes: 1570f0d7ab ("arm64: support initrd outside kernel linear map")
Cc: Mark Salter <msalter@redhat.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
- Fix VM save performance regression with x86 PV guests.
- Make kexec work in x86 PVHVM guests (if Xen has the soft-reset ABI).
- Other minor fixes.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJWE8wQAAoJEFxbo/MsZsTRVTMH/0eqSg2M78wv4sBl234Y3FE9
AN8KFUdlkK7VN9v0uuLMDSKIWNUuFJIvo/2rElWGRiX2Q+/pfnQg3ZSFhub9S8uL
T4LCvmG9viRFb2oUz792ewqncSw3X98Jpto4smA820gJRjndBSWm5HUKUtPAkv1M
l5DFMEgOeHbu+wCbKD/ZPEt5K9GsIaNviSNoWtYHirZwrd00oLmNbWp+g8lIGQiT
3vLW0SaZzjL6akKxihb/p3WZ9eNmyz8yk0V7dItUEVUB9qoaDDLJ5qIRSHHWTWQD
Jza/GE32VallZLuEXGG5/D86MsnyVYHC+lZtwo2IptOGm8v7WuZRv094wI1ev5c=
=aiDw
-----END PGP SIGNATURE-----
Merge tag 'for-linus-4.3b-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen bug fixes from David Vrabel:
- Fix VM save performance regression with x86 PV guests
- Make kexec work in x86 PVHVM guests (if Xen has the soft-reset ABI)
- Other minor fixes.
* tag 'for-linus-4.3b-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
x86/xen/p2m: hint at the last populated P2M entry
x86/xen: Do not clip xen_e820_map to xen_e820_map_entries when sanitizing map
x86/xen: Support kexec/kdump in HVM guests by doing a soft reset
xen/x86: Don't try to write syscall-related MSRs for PV guests
xen: use correct type for HYPERVISOR_memory_op()
Pull s390 fixes from Martin Schwidefsky:
"Three bug fixes and an update to the default configuration"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/defconfig: set SCSI_DH=y
s390/vtime: correct scaled cputime of partially idle CPUs
s390/boot/decompression: disable floating point in decompressor
s390/numa: use correct type for node_to_cpumask_map
Pull CIFS fixes from Steve French:
"Two fixes for problems pointed out by automated tools.
Thanks PaX/grsecurity team and Dan Carpenter (and the Smatch tool)"
* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
[CIFS] Update cifs version number
[SMB3] Do not fall back to SMBWriteX in set_file_size error cases
[SMB3] Missing null tcon check
With commit 633d6f17cd (x86/xen: prepare
p2m list for memory hotplug) the P2M may be sized to accomdate a much
larger amount of memory than the domain currently has.
When saving a domain, the toolstack must scan all the P2M looking for
populated pages. This results in a performance regression due to the
unnecessary scanning.
Instead of reporting (via shared_info) the maximum possible size of
the P2M, hint at the last PFN which might be populated. This hint is
increased as new leaves are added to the P2M (in the expectation that
they will be used for populated entries).
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Cc: <stable@vger.kernel.org> # 4.0+
When running kprobe test on arm64 rt kernel, it reports the below warning:
root@qemu7:~# modprobe kprobe_example
BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:917
in_atomic(): 0, irqs_disabled(): 128, pid: 484, name: modprobe
CPU: 0 PID: 484 Comm: modprobe Not tainted 4.1.6-rt5 #2
Hardware name: linux,dummy-virt (DT)
Call trace:
[<ffffffc0000891b8>] dump_backtrace+0x0/0x128
[<ffffffc000089300>] show_stack+0x20/0x30
[<ffffffc00061dae8>] dump_stack+0x1c/0x28
[<ffffffc0000bbad0>] ___might_sleep+0x120/0x198
[<ffffffc0006223e8>] rt_spin_lock+0x28/0x40
[<ffffffc000622b30>] __aarch64_insn_write+0x28/0x78
[<ffffffc000622e48>] aarch64_insn_patch_text_nosync+0x18/0x48
[<ffffffc000622ee8>] aarch64_insn_patch_text_cb+0x70/0xa0
[<ffffffc000622f40>] aarch64_insn_patch_text_sync+0x28/0x48
[<ffffffc0006236e0>] arch_arm_kprobe+0x38/0x48
[<ffffffc00010e6f4>] arm_kprobe+0x34/0x50
[<ffffffc000110374>] register_kprobe+0x4cc/0x5b8
[<ffffffbffc002038>] kprobe_init+0x38/0x7c [kprobe_example]
[<ffffffc000084240>] do_one_initcall+0x90/0x1b0
[<ffffffc00061c498>] do_init_module+0x6c/0x1cc
[<ffffffc0000fd0c0>] load_module+0x17f8/0x1db0
[<ffffffc0000fd8cc>] SyS_finit_module+0xb4/0xc8
Convert patch_lock to raw loc kto avoid this issue.
Although the problem is found on rt kernel, the fix should be applicable to
mainline kernel too.
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Yang Shi <yang.shi@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
This is the arm64 portion of commit 45cac65b0f ("readahead: fault
retry breaks mmap file read random detection"), which was absent from
the initial port and has since gone unnoticed. The original commit says:
> .fault now can retry. The retry can break state machine of .fault. In
> filemap_fault, if page is miss, ra->mmap_miss is increased. In the second
> try, since the page is in page cache now, ra->mmap_miss is decreased. And
> these are done in one fault, so we can't detect random mmap file access.
>
> Add a new flag to indicate .fault is tried once. In the second try, skip
> ra->mmap_miss decreasing. The filemap_fault state machine is ok with it.
With this change, Mark reports that:
> Random read improves by 250%, sequential read improves by 40%, and
> random write by 400% to an eMMC device with dm crypto wrapped around it.
Cc: Shaohua Li <shli@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Mark Salyzyn <salyzyn@android.com>
Signed-off-by: Riley Andrews <riandrews@android.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Pull strscpy string copy function implementation from Chris Metcalf.
Chris sent this during the merge window, but I waffled back and forth on
the pull request, which is why it's going in only now.
The new "strscpy()" function is definitely easier to use and more secure
than either strncpy() or strlcpy(), both of which are horrible nasty
interfaces that have serious and irredeemable problems.
strncpy() has a useless return value, and doesn't NUL-terminate an
overlong result. To make matters worse, it pads a short result with
zeroes, which is a performance disaster if you have big buffers.
strlcpy(), by contrast, is a mis-designed "fix" for strlcpy(), lacking
the insane NUL padding, but having a differently broken return value
which returns the original length of the source string. Which means
that it will read characters past the count from the source buffer, and
you have to trust the source to be properly terminated. It also makes
error handling fragile, since the test for overflow is unnecessarily
subtle.
strscpy() avoids both these problems, guaranteeing the NUL termination
(but not excessive padding) if the destination size wasn't zero, and
making the overflow condition very obvious by returning -E2BIG. It also
doesn't read past the size of the source, and can thus be used for
untrusted source data too.
So why did I waffle about this for so long?
Every time we introduce a new-and-improved interface, people start doing
these interminable series of trivial conversion patches.
And every time that happens, somebody does some silly mistake, and the
conversion patch to the improved interface actually makes things worse.
Because the patch is mindnumbing and trivial, nobody has the attention
span to look at it carefully, and it's usually done over large swatches
of source code which means that not every conversion gets tested.
So I'm pulling the strscpy() support because it *is* a better interface.
But I will refuse to pull mindless conversion patches. Use this in
places where it makes sense, but don't do trivial patches to fix things
that aren't actually known to be broken.
* 'strscpy' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
tile: use global strscpy() rather than private copy
string: provide strscpy()
Make asm/word-at-a-time.h available on all architectures