linux

Author	SHA1	Message	Date
Trond Myklebust	68eaba4ca9	NFS: Fix the verifier for case sensitive filesystem in nfs_atomic_open() Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2022-01-06 14:00:20 -05:00
Trond Myklebust	00bdadc7ac	NFS: Add a helper to remove case-insensitive aliases When dealing with case insensitive names, the client has no idea how the server performs the mapping, so cannot collapse the dentries into a single representative. So both rename and unlink need to deal with the fact that there could be several dentries representing the file, and have to somehow force them to be revalidated. Use d_prune_aliases() as a big hammer approach. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2022-01-06 14:00:20 -05:00
Trond Myklebust	8ce37abdeb	NFS: Invalidate negative dentries on all case insensitive directory changes If we create a file, rename it, or hardlink it, then we need to assume that cached negative dentries need to be revalidated. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2022-01-06 14:00:20 -05:00
Trond Myklebust	98ca3ee60b	NFSv4: Just don't cache negative dentries on case insensitive servers If the directory contents change, we cannot rely on the negative dentry being cacheable. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2022-01-06 14:00:20 -05:00
Trond Myklebust	1ab5be4ac5	NFSv4: Add some support for case insensitive filesystems Add capabilities to allow the NFS client to recognise when it is dealing with case insensitive and case preserving filesystems. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2022-01-06 14:00:20 -05:00
Trond Myklebust	b05bf5c63b	NFSv4.1: Fix uninitialised variable in devicenotify When decode_devicenotify_args() exits with no entries, we need to ensure that the struct cb_devicenotifyargs is initialised to { 0, NULL } in order to avoid problems in nfs4_callback_devicenotify(). Reported-by: <rtm@csail.mit.edu> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2022-01-06 14:00:20 -05:00
Xiaoke Wang	fbd2057e53	nfs: nfs4client: check the return value of kstrdup() kstrdup() returns NULL when an internal memory error happens, so it is better to check its return value to catch the memory error in time. Signed-off-by: Xiaoke Wang <xkernel.wang@foxmail.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2022-01-06 14:00:20 -05:00
Olga Kornievskaia	2c52c8376d	NFSv4 only print the label when it's queried When the bitmask of the attributes doesn't include the security label, don't bother printing it. Since the label might not be null terminated, adjust the printing format accordingly. Signed-off-by: Olga Kornievskaia <kolga@netapp.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2022-01-06 14:00:20 -05:00
Jiapeng Chong	c4f0396688	SUNRPC: clean up some inconsistent indenting Eliminate the following smatch warning: net/sunrpc/xprtsock.c:1912 xs_local_connect() warn: inconsistent indenting. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2022-01-06 14:00:20 -05:00
Xu Wang	35e0f9a9af	sunrpc: Remove unneeded null check In g_verify_token_header, the null check of 'ret' is unneeded to be done twice. Signed-off-by: Xu Wang <vulab@iscas.ac.cn> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2022-01-06 14:00:20 -05:00
Gustavo A. R. Silva	c72a826829	nfs41: pnfs: filelayout: Replace one-element array with flexible-array member There is a regular need in the kernel to provide a way to declare having a dynamically sized set of trailing elements in a structure. Kernel code should always use “flexible array members”[1] for these cases. The older style of one-element or zero-length arrays should no longer be used[2]. Refactor the code a bit according to the use of a flexible-array member in struct nfs4_file_layout_dsaddr instead of a one-element array, and use the struct_size() helper. This helps with the ongoing efforts to globally enable -Warray-bounds and get us closer to being able to tighten the FORTIFY_SOURCE routines on memcpy(). This issue was found with the help of Coccinelle and audited and fixed, manually. [1] https://en.wikipedia.org/wiki/Flexible_array_member [2] https://www.kernel.org/doc/html/v5.10/process/deprecated.html#zero-length-and-one-element-arrays Link: https://github.com/KSPP/linux/issues/79 Link: https://github.com/KSPP/linux/issues/109 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2022-01-06 14:00:20 -05:00
Pierguido Lambri	4b0c359b81	SUNRPC: Add source address/port to rpc_socket* traces The rpc_socket* traces now show also the source address and port. An example is: kworker/u17:1-951 [005] 134218.925343: rpc_socket_close: socket:[46913] srcaddr=192.168.100.187:793 dstaddr=192.168.100.129:2049 state=4 (DISCONNECTING) sk_state=7 (CLOSE) kworker/u17:0-242 [006] 134360.841370: rpc_socket_connect: error=-115 socket:[56322] srcaddr=192.168.100.187:769 dstaddr=192.168.100.129:2049 state=2 (CONNECTING) sk_state=2 (SYN_SENT) <idle>-0 [006] 134360.841859: rpc_socket_state_change: socket:[56322] srcaddr=192.168.100.187:769 dstaddr=192.168.100.129:2049 state=2 (CONNECTING) sk_state=1 (ESTABLISHED) Signed-off-by: Pierguido Lambri <plambri@redhat.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2022-01-06 14:00:20 -05:00
Trond Myklebust	6ff9d99bb8	NFS: Ensure the server has an up to date ctime before renaming Renaming a file is required by POSIX to update the file ctime, so ensure that the file data is synced to disk so that we don't clobber the updated ctime by writing back after the rename. Fixes: `f2c2c552f1` ("NFS: Move delegation recall into the NFSv4 callback for rename_setup()") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2022-01-06 14:00:20 -05:00
Trond Myklebust	204975036b	NFS: Ensure the server has an up to date ctime before hardlinking Creating a hard link is required by POSIX to update the file ctime, so ensure that the file data is synced to disk so that we don't clobber the updated ctime by writing back after creating the hard link. Fixes: `9f76827287` ("NFS: Move the delegation return down into nfs4_proc_link()") Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2022-01-06 14:00:20 -05:00
NeilBrown	6238aec83f	NFS: don't store 'struct cred *' in struct nfs_access_entry Storing the 'struct cred *' in nfs_access_entry is problematic. An active 'cred' can keep a 'struct key *' active, and a quota is imposed on the number of such keys that a user can maintain. Cached 'nfs_access_entry' structs have indefinite lifetime, and having these keep 'struct key's alive imposes on that quota. So remove the 'struct cred *' and replace it with the fields we need: kuid_t, kgid_t, and struct group_info *. This makes the 'struct nfs_access_entry' 64 bits larger. New function "access_cmp" is introduced which is identical to cred_fscmp() except that the second arg is an 'nfs_access_entry', rather than a 'cred'. Fixes: `b68572e07c` ("NFS: change access cache to use 'struct cred'.") Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2022-01-06 14:00:20 -05:00
NeilBrown	73fbb3fa64	NFS: pass cred explicitly for access tests Storing the 'struct cred *' in nfs_access_entry is problematic. An active 'cred' can keep a 'struct key *' active, and a quota is imposed on the number of such keys that a user can maintain. Cached 'nfs_access_entry' structs have indefinite lifetime, and having these keep 'struct key's alive imposes on that quota. So a future patch will remove the ->cred ref from nfs_access_entry. To prepare, change various functions to not assume there is a 'cred' in the nfs_access_entry, but to pass the cred around explicitly. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2022-01-06 14:00:20 -05:00
NeilBrown	b5e7b59c34	NFS: change nfs_access_get_cached to only report the mask Currently the nfs_access_get_cached family of functions report a 'struct nfs_access_entry' as the result, with both .mask and .cred set. However the .cred is never used. This is probably good, as there is no guarantee that it won't be freed before use. Change to only report the 'mask' - as this is all that is used or needed. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>	2022-01-06 14:00:19 -05:00
Darrick J. Wong	7e937bb3cb	xfs: warn about inodes with project id of -1 Inodes aren't supposed to have a project id of -1U (aka 4294967295) but the kernel hasn't always validated FSSETXATTR correctly. Flag this as something for the sysadmin to check out. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2022-01-06 10:43:30 -08:00
Darrick J. Wong	eae44cb341	xfs: hold quota inode ILOCK_EXCL until the end of dqalloc Online fsck depends on callers holding ILOCK_EXCL from the time they decide to update a block mapping until after they've updated the reverse mapping records to guarantee the stability of both mapping records. Unfortunately, the quota code drops ILOCK_EXCL at the first transaction roll in the dquot allocation process, which breaks that assertion. This leads to sporadic failures in the online rmap repair code if the repair code grabs the AGF after bmapi_write maps a new block into the quota file's data fork but before it can finish the deferred rmap update. Fix this by rewriting the function to hold the ILOCK until after the transaction commit like all other bmap updates do, and get rid of the dqread wrapper that does nothing but complicate the codebase. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Dave Chinner <dchinner@redhat.com>	2022-01-06 10:43:30 -08:00
Jiapeng Chong	f4901a182d	xfs: Remove redundant assignment of mp mp is being initialized to log->l_mp but this is never read as record is overwritten later on. Remove the redundant assignment. Cleans up the following clang-analyzer warning: fs/xfs/xfs_log_recover.c:3543:20: warning: Value stored to 'mp' during its initialization is never read [clang-analyzer-deadcode.DeadStores]. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>	2022-01-06 10:43:30 -08:00
Dave Chinner	8dc9384b7d	xfs: reduce kvmalloc overhead for CIL shadow buffers Oh, let me count the ways that the kvmalloc API sucks dog eggs. The problem is when we are logging lots of large objects, we hit kvmalloc really damn hard with costly order allocations, and behaviour utterly sucks: - 49.73% xlog_cil_commit - 31.62% kvmalloc_node - 29.96% __kmalloc_node - 29.38% kmalloc_large_node - 29.33% __alloc_pages - 24.33% __alloc_pages_slowpath.constprop.0 - 18.35% __alloc_pages_direct_compact - 17.39% try_to_compact_pages - compact_zone_order - 15.26% compact_zone 5.29% __pageblock_pfn_to_page 3.71% PageHuge - 1.44% isolate_migratepages_block 0.71% set_pfnblock_flags_mask 1.11% get_pfnblock_flags_mask - 0.81% get_page_from_freelist - 0.59% _raw_spin_lock_irqsave - do_raw_spin_lock __pv_queued_spin_lock_slowpath - 3.24% try_to_free_pages - 3.14% shrink_node - 2.94% shrink_slab.constprop.0 - 0.89% super_cache_count - 0.66% xfs_fs_nr_cached_objects - 0.65% xfs_reclaim_inodes_count 0.55% xfs_perag_get_tag 0.58% kfree_rcu_shrink_count - 2.09% get_page_from_freelist - 1.03% _raw_spin_lock_irqsave - do_raw_spin_lock __pv_queued_spin_lock_slowpath - 4.88% get_page_from_freelist - 3.66% _raw_spin_lock_irqsave - do_raw_spin_lock __pv_queued_spin_lock_slowpath - 1.63% __vmalloc_node - __vmalloc_node_range - 1.10% __alloc_pages_bulk - 0.93% __alloc_pages - 0.92% get_page_from_freelist - 0.89% rmqueue_bulk - 0.69% _raw_spin_lock - do_raw_spin_lock __pv_queued_spin_lock_slowpath 13.73% memcpy_erms - 2.22% kvfree On this workload, that's almost a dozen CPUs all trying to compact and reclaim memory inside kvmalloc_node at the same time. Yet it is regularly falling back to vmalloc despite all that compaction, page and shrinker reclaim that direct reclaim is doing. Copying all the metadata is taking far less CPU time than allocating the storage! Direct reclaim should be considered extremely harmful. 
This is a high frequency, high throughput, CPU usage and latency sensitive allocation. We've got memory there, and we're using kvmalloc to allow memory allocation to avoid doing lots of work to try to do contiguous allocations. Except it still does lots of costly work that is unnecessary. Worse: the only way to avoid the slowpath page allocation trying to do compaction on costly allocations is to turn off direct reclaim (i.e. remove __GFP_RECLAIM_DIRECT from the gfp flags). Unfortunately, the stupid kvmalloc API then says "oh, this isn't a GFP_KERNEL allocation context, so you only get kmalloc!". This cuts off the vmalloc fallback, and this leads to almost instant OOM problems which ends up in filesystems deadlocks, shutdowns and/or kernel crashes. I want some basic kvmalloc behaviour: - kmalloc for a contiguous range with fail fast semantics - no compaction direct reclaim if the allocation enters the slow path. - run normal vmalloc (i.e. GFP_KERNEL) if kmalloc fails The really, really stupid part about this is these kvmalloc() calls are run under memalloc_nofs task context, so all the allocations are always reduced to GFP_NOFS regardless of the fact that kvmalloc requires GFP_KERNEL to be passed in. IOWs, we're already telling kvmalloc to behave differently to the gfp flags we pass in, but it still won't allow vmalloc to be run with anything other than GFP_KERNEL. So, this patch open codes the kvmalloc() in the commit path to have the above described behaviour. The result is we more than halve the CPU time spend doing kvmalloc() in this path and transaction commits with 64kB objects in them more than doubles. i.e. 
we get ~5x reduction in CPU usage per costly-sized kvmalloc() invocation and the profile looks like this: - 37.60% xlog_cil_commit 16.01% memcpy_erms - 8.45% __kmalloc - 8.04% kmalloc_order_trace - 8.03% kmalloc_order - 7.93% alloc_pages - 7.90% __alloc_pages - 4.05% __alloc_pages_slowpath.constprop.0 - 2.18% get_page_from_freelist - 1.77% wake_all_kswapds .... - __wake_up_common_lock - 0.94% _raw_spin_lock_irqsave - 3.72% get_page_from_freelist - 2.43% _raw_spin_lock_irqsave - 5.72% vmalloc - 5.72% __vmalloc_node_range - 4.81% __get_vm_area_node.constprop.0 - 3.26% alloc_vmap_area - 2.52% _raw_spin_lock - 1.46% _raw_spin_lock 0.56% __alloc_pages_bulk - 4.66% kvfree - 3.25% vfree - __vfree - 3.23% __vunmap - 1.95% remove_vm_area - 1.06% free_vmap_area_noflush - 0.82% _raw_spin_lock - 0.68% _raw_spin_lock - 0.92% _raw_spin_lock - 1.40% kfree - 1.36% __free_pages - 1.35% __free_pages_ok - 1.02% _raw_spin_lock_irqsave It's worth noting that over 50% of the CPU time spent allocating these shadow buffers is now spent on spinlocks. So the shadow buffer allocation overhead is greatly reduced by getting rid of direct reclaim from kmalloc, and could probably be made even less costly if vmalloc() didn't use global spinlocks to protect it's structures. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Allison Henderson <allison.henderson@oracle.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>	2022-01-06 10:43:30 -08:00
Greg Kroah-Hartman	219aac5d46	xfs: sysfs: use default_groups in kobj_type There are currently 2 ways to create a set of sysfs files for a kobj_type, through the default_attrs field, and the default_groups field. Move the xfs sysfs code to use default_groups field which has been the preferred way since `aa30f47cf6` ("kobject: Add support for default attribute groups to kobj_type") so that we can soon get rid of the obsolete default_attrs field. Cc: "Darrick J. Wong" <djwong@kernel.org> Cc: linux-xfs@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Darrick J. Wong <djwong@kernel.org>	2022-01-06 10:43:30 -08:00
Greg Kroah-Hartman	1745e857e7	md: use default_groups in kobj_type There are currently 2 ways to create a set of sysfs files for a kobj_type, through the default_attrs field, and the default_groups field. Move the md rdev sysfs code to use default_groups field which has been the preferred way since commit `aa30f47cf6` ("kobject: Add support for default attribute groups to kobj_type") so that we can soon get rid of the obsolete default_attrs field. Cc: Song Liu <song@kernel.org> Cc: linux-raid@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Song Liu <song@kernel.org>	2022-01-06 10:42:50 -08:00
Rob Herring	3e718b4475	spi: dt-bindings: mediatek,spi-mtk-nor: Fix example 'interrupts' property A phandle for 'interrupts' value is wrong and should be one or more numbers. Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20220106182518.1435497-9-robh@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org>	2022-01-06 18:35:26 +00:00
Christophe JAILLET	0dbc416218	ice: Use bitmap_free() to free bitmap kfree() and bitmap_free() are the same. But using the latter is more consistent when freeing memory allocated with bitmap_zalloc(). Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-01-06 10:15:25 -08:00
Christophe JAILLET	e75ed29db5	ice: Optimize a few bitmap operations When a bitmap is local to a function, it is safe to use the non-atomic __[set|clear]_bit(). No concurrent accesses can occur. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-01-06 10:15:25 -08:00
Christophe JAILLET	a5c259b162	ice: Slightly simplify ice_find_free_recp_res_idx The 'possible_idx' bitmap is set just after it is zeroed, so we can save the first step. The 'free_idx' bitmap is used only at the end of the function as the result of a bitmap xor operation. So there is no need to explicitly zero it before. So, slightly simplify the code and remove 2 useless 'bitmap_zero()' calls. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-01-06 10:15:25 -08:00
Wojciech Drewek	c1e5da5dd4	ice: improve switchdev's slow-path In current switchdev implementation, every VF PR is assigned to individual ring on switchdev ctrl VSI. For slow-path traffic, there is a mapping VF->ring done in software based on src_vsi value (by calling ice_eswitch_get_target_netdev function). With this change, HW solution is introduced which is more efficient. For each VF, src MAC (VF's MAC) filter will be created, which forwards packets to the corresponding switchdev ctrl VSI queue based on src MAC address. This filter has to be removed and then replayed in case of resetting one VF. Keep information about this rule in repr->mac_rule, thanks to that we know which rule has to be removed and replayed for a given VF. In case of CORE/GLOBAL all rules are removed automatically. We have to take care of readding them. This is done by ice_replay_vsi_adv_rule. When driver leaves switchdev mode, remove all advanced rules from switchdev ctrl VSI. This is done by ice_rem_adv_rule_for_vsi. Flag repr->rule_added is needed because in some cases reset might be triggered before VF sends request to add MAC. Co-developed-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-01-06 10:15:09 -08:00
Yang Yingliang	31834aaa4e	ACPI: pfr_update: Fix return value check in pfru_write() In case of error, memremap() returns NULL pointer not ERR_PTR(). The IS_ERR() test in the return value check should be replaced with NULL test. Fixes: `0db89fa243` ("ACPI: Introduce Platform Firmware Runtime Update device driver") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Acked-by: Chen Yu <yu.c.chen@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2022-01-06 18:53:31 +01:00
Huang Rui	6c4ab1b86d	x86, sched: Fix undefined reference to init_freq_invariance_cppc() build error The init_freq_invariance_cppc function is implemented in smpboot and depends on CONFIG_SMP. MODPOST vmlinux.symvers MODINFO modules.builtin.modinfo GEN modules.builtin LD .tmp_vmlinux.kallsyms1 ld: drivers/acpi/cppc_acpi.o: in function `acpi_cppc_processor_probe': /home/ray/brahma3/linux/drivers/acpi/cppc_acpi.c:819: undefined reference to `init_freq_invariance_cppc' make: *** [Makefile:1161: vmlinux] Error 1 See https://lore.kernel.org/lkml/484af487-7511-647e-5c5b-33d4429acdec@infradead.org/. Fixes: `41ea667227` ("x86, sched: Calculate frequency invariance for AMD systems") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Huang Rui <ray.huang@amd.com> [ rjw: Subject edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2022-01-06 18:52:07 +01:00
Max Gurtovoy	ca2770c65b	IB/iser: Align coding style across driver The following changes were made: 1. Align function signatures to 80 characters per line. 2. Remove tabs for variable assignment and use 1 space instead. 3. Don't compare to NULL in "if" clause. 4. Remove strange indentations. This will ease on the maintenance of the driver for the future. Link: https://lore.kernel.org/r/20211215135721.3662-7-mgurtovoy@nvidia.com Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-01-06 13:47:08 -04:00
Palmer Dabbelt	d4cb5d3630	RISC-V: Clean up the defconfigs It's been a while since cleaning up the defconfigs, so I manually checked up on each change. This found a handful of minor issues, which have been fixed in-line.	2022-01-06 09:42:26 -08:00
Palmer Dabbelt	ce3fe7a4ac	RISC-V: defconfigs: Remove redundant K210 DT source The "k210_generic" DT has been the default in Kconfig since `67d96729a9` ("riscv: Update Canaan Kendryte K210 device tree"), so drop it from the defconfigs to avoid diff with savedefconfig. Reviewed-by: Anup Patel <anup@brainfault.org> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2022-01-06 09:41:03 -08:00
Huang Rui	a2e6840b37	cpufreq: amd-pstate: Fix Kconfig dependencies for AMD P-State The AMD P-State driver is based on the ACPI CPPC function, so ACPI should be a dependency of this driver in the kernel config. In file included from ../drivers/cpufreq/amd-pstate.c:40:0: ../include/acpi/processor.h:226:2: error: unknown type name ‘phys_cpuid_t’ phys_cpuid_t phys_id; /* CPU hardware ID such as APIC ID for x86 */ ^~~~~~~~~~~~ ../include/acpi/processor.h:355:1: error: unknown type name ‘phys_cpuid_t’; did you mean ‘phys_addr_t’? phys_cpuid_t acpi_get_phys_id(acpi_handle, int type, u32 acpi_id); ^~~~~~~~~~~~ phys_addr_t CC drivers/rtc/rtc-rv3029c2.o ../include/acpi/processor.h:356:1: error: unknown type name ‘phys_cpuid_t’; did you mean ‘phys_addr_t’? phys_cpuid_t acpi_map_madt_entry(u32 acpi_id); ^~~~~~~~~~~~ phys_addr_t ../include/acpi/processor.h:357:20: error: unknown type name ‘phys_cpuid_t’; did you mean ‘phys_addr_t’? int acpi_map_cpuid(phys_cpuid_t phys_id, u32 acpi_id); ^~~~~~~~~~~~ phys_addr_t See https://lore.kernel.org/lkml/20e286d4-25d7-fb6e-31a1-4349c805aae3@infradead.org/. Reported-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Huang Rui <ray.huang@amd.com> [ rjw: Subject edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2022-01-06 18:31:33 +01:00
Yang Li	bdc4fd3d48	cpufreq: amd-pstate: Fix struct amd_cpudata kernel-doc comment Add the description of @req and @boost_supported in struct amd_cpudata kernel-doc comment to remove warnings found by running scripts/kernel-doc, which is caused by using 'make W=1'. drivers/cpufreq/amd-pstate.c:104: warning: Function parameter or member 'req' not described in 'amd_cpudata' drivers/cpufreq/amd-pstate.c:104: warning: Function parameter or member 'boost_supported' not described in 'amd_cpudata' Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Acked-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2022-01-06 18:28:26 +01:00
Victor Raj	c36a2b9716	ice: replay advanced rules after reset ice_replay_vsi_adv_rule will replay advanced rules for a given VSI. Exit this function when list of rules for given recipe is empty. Do not add rule when given vsi_handle does not match vsi_handle from the rule info. Use ICE_MAX_NUM_RECIPES instead of ICE_SW_LKUP_LAST in order to find advanced rules as well. Signed-off-by: Victor Raj <victor.raj@intel.com> Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-01-06 09:19:40 -08:00
Matthew Wilcox (Oracle)	07f910f9b7	mm: Remove slab from struct page All members of struct slab can now be removed from struct page. This shrinks the definition of struct page by 30 LOC, making it easier to understand. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>	2022-01-06 18:06:58 +01:00
Vlastimil Babka	9cc960a164	Merge branch 'core' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu into slab-struct_slab-part2-v1 Merge iommu tree for a series that removes usage of struct page 'freelist' field.	2022-01-06 18:03:29 +01:00
Yassine Oudjana	3b247eeaec	ASoC: wcd9335: Keep a RX port value for each SLIM RX mux Currently, rx_port_value is a single unsigned int that gets overwritten when slim_rx_mux_put() is called for any RX mux, then the same value is read when slim_rx_mux_get() is called for any of them. This results in slim_rx_mux_get() reporting the last value set by slim_rx_mux_put() regardless of which SLIM RX mux is in question. Turn rx_port_value into an array and store a separate value for each SLIM RX mux. Signed-off-by: Yassine Oudjana <y.oudjana@protonmail.com> Link: https://lore.kernel.org/r/20220104033356.343685-1-y.oudjana@protonmail.com Signed-off-by: Mark Brown <broonie@kernel.org>	2022-01-06 16:41:58 +00:00
Xiao Ni	0c031fd37f	md: Move alloc/free acct bioset in to personality bioset acct is only needed for raid0 and raid5. Therefore, md_run only allocates it for raid0 and raid5. However, this does not cover personality takeover, which may cause uninitialized bioset. For example, the following repro steps: mdadm -CR /dev/md0 -l1 -n2 /dev/loop0 /dev/loop1 mdadm --wait /dev/md0 mkfs.xfs /dev/md0 mdadm /dev/md0 --grow -l5 mount /dev/md0 /mnt causes panic like: [ 225.933939] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 225.934903] #PF: supervisor instruction fetch in kernel mode [ 225.935639] #PF: error_code(0x0010) - not-present page [ 225.936361] PGD 0 P4D 0 [ 225.936677] Oops: 0010 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN PTI [ 225.937525] CPU: 27 PID: 1133 Comm: mount Not tainted 5.16.0-rc3+ #706 [ 225.938416] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-2.module_el8.4.0+547+a85d02ba 04/01/2014 [ 225.939922] RIP: 0010:0x0 [ 225.940289] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6. 
[ 225.941196] RSP: 0018:ffff88815897eff0 EFLAGS: 00010246 [ 225.941897] RAX: 0000000000000000 RBX: 0000000000092800 RCX: ffffffff81370a39 [ 225.942813] RDX: dffffc0000000000 RSI: 0000000000000000 RDI: 0000000000092800 [ 225.943772] RBP: 1ffff1102b12fe04 R08: fffffbfff0b43c01 R09: fffffbfff0b43c01 [ 225.944807] R10: ffffffff85a1e007 R11: fffffbfff0b43c00 R12: ffff88810eaaaf58 [ 225.945757] R13: 0000000000000000 R14: ffff88810eaaafb8 R15: ffff88815897f040 [ 225.946709] FS: 00007ff3f2505080(0000) GS:ffff888fb5e00000(0000) knlGS:0000000000000000 [ 225.947814] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 225.948556] CR2: ffffffffffffffd6 CR3: 000000015aa5a006 CR4: 0000000000370ee0 [ 225.949537] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 225.950455] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 225.951414] Call Trace: [ 225.951787] <TASK> [ 225.952120] mempool_alloc+0xe5/0x250 [ 225.952625] ? mempool_resize+0x370/0x370 [ 225.953187] ? rcu_read_lock_sched_held+0xa1/0xd0 [ 225.953862] ? rcu_read_lock_bh_held+0xb0/0xb0 [ 225.954464] ? sched_clock_cpu+0x15/0x120 [ 225.955019] ? find_held_lock+0xac/0xd0 [ 225.955564] bio_alloc_bioset+0x1ed/0x2a0 [ 225.956080] ? lock_downgrade+0x3a0/0x3a0 [ 225.956644] ? bvec_alloc+0xc0/0xc0 [ 225.957135] bio_clone_fast+0x19/0x80 [ 225.957651] raid5_make_request+0x1370/0x1b70 [ 225.958286] ? sched_clock_cpu+0x15/0x120 [ 225.958797] ? __lock_acquire+0x8b2/0x3510 [ 225.959339] ? raid5_get_active_stripe+0xce0/0xce0 [ 225.959986] ? lock_is_held_type+0xd8/0x130 [ 225.960528] ? rcu_read_lock_sched_held+0xa1/0xd0 [ 225.961135] ? rcu_read_lock_bh_held+0xb0/0xb0 [ 225.961703] ? sched_clock_cpu+0x15/0x120 [ 225.962232] ? lock_release+0x27a/0x6c0 [ 225.962746] ? do_wait_intr_irq+0x130/0x130 [ 225.963302] ? lock_downgrade+0x3a0/0x3a0 [ 225.963815] ? lock_release+0x6c0/0x6c0 [ 225.964348] md_handle_request+0x342/0x530 [ 225.964888] ? set_in_sync+0x170/0x170 [ 225.965397] ? 
blk_queue_split+0x133/0x150 [ 225.965988] ? __blk_queue_split+0x8b0/0x8b0 [ 225.966524] ? submit_bio_checks+0x3b2/0x9d0 [ 225.967069] md_submit_bio+0x127/0x1c0 [...] Fix this by moving alloc/free of acct bioset to pers->run and pers->free. While we are on this, properly handle md_integrity_register() error in raid0_run(). Fixes: `daee202471` (md: check level before create and exit io_acct_set) Cc: stable@vger.kernel.org Acked-by: Guoqing Jiang <guoqing.jiang@linux.dev> Signed-off-by: Xiao Ni <xni@redhat.com> Signed-off-by: Song Liu <song@kernel.org>	2022-01-06 08:37:03 -08:00
Dirk Müller	36dacddbf0	lib/raid6: Use strict priority ranking for pq gen() benchmarking On x86_64, currently 3 variants of AVX512, 3 variants of AVX2 and 3 variants of SSE2 are benchmarked on initialization, taking between 144-153 jiffies. Testing across a hardware pool of various generations of Intel CPUs, I could not find a single case where SSE2 won over AVX2 or AVX512. There are cases where AVX2 wins over AVX512, however. Change "prefer" into an integer priority field (similar to how recov selection works) to have more than one ranking level available, which is backwards compatible with existing behavior. Give AVX2/512 variants higher priority over SSE2 in order to skip SSE testing when AVX is available. In an AVX2/x86_64/HZ=250 case this saves on the order of 200ms of initialization time. Signed-off-by: Dirk Müller <dmueller@suse.de> Acked-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Song Liu <song@kernel.org>	2022-01-06 08:37:03 -08:00
Dirk Müller	38640c4809	lib/raid6: skip benchmark of non-chosen xor_syndrome functions In commit `fe5cbc6e06` ("md/raid6 algorithms: delta syndrome functions") a xor_syndrome() benchmarking was added also to the raid6_choose_gen() function. However, the results of that benchmarking were intentionally discarded and did not influence the choice. It picked the xor_syndrome() variant related to the best performing gen_syndrome(). Reduce runtime of raid6_choose_gen() without modifying its outcome by only benchmarking the xor_syndrome() of the best gen_syndrome() variant. For a HZ=250 x86_64 system with avx2 and without avx512 this removes 5 out of 6 xor() benchmarks, saving 340ms of raid6 initialization time. Signed-off-by: Dirk Müller <dmueller@suse.de> Signed-off-by: Song Liu <song@kernel.org>	2022-01-06 08:37:03 -08:00
Randy Dunlap	dd3dc5f416	md: fix spelling of "its" Use the possessive "its" instead of the contraction "it's" in printed messages. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Song Liu <song@kernel.org> Cc: linux-raid@vger.kernel.org Signed-off-by: Song Liu <song@kernel.org>	2022-01-06 08:37:03 -08:00
Vishal Verma	bf2c411bb1	md: raid456 add nowait support Returns EAGAIN in case the raid456 driver would block waiting for reshape. Reviewed-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Vishal Verma <vverma@digitalocean.com> Signed-off-by: Song Liu <song@kernel.org>	2022-01-06 08:37:02 -08:00
Vishal Verma	c9aa889b03	md: raid10 add nowait support This adds nowait support to the RAID10 driver. Very similar to raid1 driver changes. It makes the RAID10 driver return EAGAIN in situations where it would otherwise wait, e.g.: - Waiting for the barrier, - Reshape operation, - Discard operation. wait_barrier() and regular_request_wait() are modified to return bool so that a skipped wait can be reported. They return true if the wait was performed or was not required, and false if a wait was required but not performed because of nowait. Reviewed-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Vishal Verma <vverma@digitalocean.com> Signed-off-by: Song Liu <song@kernel.org>	2022-01-06 08:37:02 -08:00
Vishal Verma	5aa705039c	md: raid1 add nowait support This adds nowait support to the RAID1 driver. It makes the RAID1 driver return EAGAIN in situations where it would otherwise wait, e.g.: - Waiting for the barrier. wait_barrier() is modified to return bool so that a skipped wait can be reported. It returns true if the wait was performed or was not required, and false if a wait was required but not performed because of nowait. Reviewed-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Vishal Verma <vverma@digitalocean.com> Signed-off-by: Song Liu <song@kernel.org>	2022-01-06 08:37:02 -08:00
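The boolean contract of wait_barrier() described in the two nowait commits above can be modelled as below. This is a simplified Python sketch, not the driver code: the `Barrier` class, the `submit` helper and the way blocking is simulated are assumptions made for illustration, and the real drivers operate on bios rather than returning an integer.

```python
import errno
from dataclasses import dataclass

EAGAIN = errno.EAGAIN


@dataclass
class Barrier:
    raised: bool = False   # True while something (e.g. resync) holds the barrier
    waits: int = 0         # number of simulated blocking waits

    def wait_barrier(self, nowait: bool) -> bool:
        if not self.raised:
            return True            # no wait was required
        if nowait:
            return False           # wait required but not performed
        # Stand-in for blocking: pretend the barrier was lowered while we slept.
        self.waits += 1
        self.raised = False
        return True                # wait was performed


def submit(barrier: Barrier, nowait: bool) -> int:
    """Hypothetical request path: 0 on success, -EAGAIN if it would block."""
    if not barrier.wait_barrier(nowait):
        return -EAGAIN
    return 0


print(submit(Barrier(raised=True), nowait=True))
print(submit(Barrier(raised=True), nowait=False))
```

The caller only needs the one return value to distinguish "safe to proceed" from "would have blocked", which is why a bool is sufficient in this model.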
Vishal Verma	f51d46d0e7	md: add support for REQ_NOWAIT Commit `021a24460d` ("block: add QUEUE_FLAG_NOWAIT") added support for checking whether a given bdev supports handling of REQ_NOWAIT or not. Since then commit `6abc49468e` ("dm: add support for REQ_NOWAIT and enable it for linear target") added support for REQ_NOWAIT for dm. This uses a similar approach to incorporate REQ_NOWAIT for md based bios. This patch was tested using the t/io_uring tool within FIO. An nvme drive was partitioned into 2 partitions and a simple raid 0 configuration /dev/md0 was created. md0 : active raid0 nvme4n1p1[1] nvme4n1p2[0] 937423872 blocks super 1.2 512k chunks Before patch: $ ./t/io_uring /dev/md0 -p 0 -a 0 -d 1 -r 100 Running top while the above runs: $ ps -eL | grep $(pidof io_uring) 38396 38396 pts/2 00:00:00 io_uring 38396 38397 pts/2 00:00:15 io_uring 38396 38398 pts/2 00:00:13 iou-wrk-38397 We can see the iou-wrk-38397 io worker thread, which gets created when io_uring sees that the underlying device (/dev/md0 in this case) doesn't support nowait. After patch: $ ./t/io_uring /dev/md0 -p 0 -a 0 -d 1 -r 100 Running top while the above runs: $ ps -eL | grep $(pidof io_uring) 38341 38341 pts/2 00:10:22 io_uring 38341 38342 pts/2 00:10:37 io_uring With this patch applied, we don't see any io worker thread being created, which indicates that io_uring saw that the underlying device does support nowait. This is the exact behaviour noticed on a dm device which also supports nowait. For all the other raid personalities except raid0, we would need to train the pieces that involve the make_request fn in order for them to correctly handle REQ_NOWAIT. Reviewed-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Vishal Verma <vverma@digitalocean.com> Signed-off-by: Song Liu <song@kernel.org>	2022-01-06 08:37:02 -08:00
Mariusz Tkaczyk	a92ce0feff	md: drop queue limitation for RAID1 and RAID10 As suggested by Neil Brown[1], this limitation seems to be deprecated. With plugging in use, writes are processed behind the raid thread and conf->pending_count is not increased. This limitation occurs only if caller doesn't use plugs. It can be avoided and often it is (with plugging). There are no reports that queue is growing to enormous size so remove queue limitation for non-plugged IOs too. [1] https://lore.kernel.org/linux-raid/162496301481.7211.18031090130574610495@noble.neil.brown.name Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Signed-off-by: Song Liu <song@kernel.org>	2022-01-06 08:37:02 -08:00
Davidlohr Bueso	770b1d216d	md/raid5: play nice with PREEMPT_RT raid_run_ops() relies on the implicitly disabled preemption for its percpu ops, although this is really about CPU locality. This breaks RT semantics as it can take regular (and thus sleeping) spinlocks, such as stripe_lock. Add a local_lock such that non-RT does not change and continues to just map to preempt_disable/enable, but makes RT happy as the region will use a per-CPU spinlock and thus be preemptible and still guarantee CPU locality. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Song Liu <songliubraving@fb.com>	2022-01-06 08:37:02 -08:00
Ajit Kumar Pandey	7112550890	ASoC: amd: acp: acp-mach: Change default RT1019 amp dev id RT1019 components were initially registered with i2c1 and i2c2 but are now changed to i2c0 and i2c1 on most of our AMD platforms. Change the default rt1019 components to 10EC1019:00 and 10EC1019:01, which is aligned with most AMD machines. Any exception to the rt1019 device ids in near-future board designs can be handled using a dmi based quirk for that machine. Signed-off-by: Ajit Kumar Pandey <AjitKumar.Pandey@amd.com> Link: https://lore.kernel.org/r/20220106150525.396170-1-AjitKumar.Pandey@amd.com Signed-off-by: Mark Brown <broonie@kernel.org>	2022-01-06 16:18:54 +00:00

1073228 Commits