Commit Graph

1294519 Commits

Author SHA1 Message Date
Michal Koutný
af000ce852 cgroup: Do not report unavailable v1 controllers in /proc/cgroups
This is a followup to CONFIG-urability of cpuset and memory controllers
for v1 hierarchies. Make the output in /proc/cgroups reflect that
!CONFIG_CPUSETS_V1 is like !CONFIG_CPUSETS and
!CONFIG_MEMCG_V1 is like !CONFIG_MEMCG.

The intended effect is that hiding the unavailable controllers will hint
users not to try mounting them on v1.

Signed-off-by: Michal Koutný <mkoutny@suse.com>
Reviewed-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2024-09-10 10:04:28 -10:00
Michal Koutný
3c41382e92 cgroup: Disallow mounting v1 hierarchies without controller implementation
The configs that disable some v1 controllers would still allow mounting
them but with no controller-specific files. (Making such hierarchies
equivalent to named v1 hierarchies.) To achieve behavior consistent with
actual out-compilation of a whole controller, the mounts should treat
respective controllers as non-existent.

Wrap implementation into a helper function, leverage legacy_files to
detect compiled out controllers. The effect is that mounts on v1 would
fail and produce a message like:
  [ 1543.999081] cgroup: Unknown subsys name 'memory'

Signed-off-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2024-09-10 10:03:43 -10:00
Michal Koutný
659f90f863 cgroup/cpuset: Expose cpuset filesystem with cpuset v1 only
The cpuset filesystem is a legacy interface to cpuset controller with
(pre-)v1 features. It makes little sense to co-mount it on systems
without cpuset v1, so do not build it when cpuset v1 is not built
neither.

Signed-off-by: Michal Koutný <mkoutny@suse.com>
Reviewed-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2024-09-10 10:02:12 -10:00
Waiman Long
8c7e22fc91 cgroup/cpuset: Move cpu.h include to cpuset-internal.h
The newly created cpuset-v1.c file uses cpus_read_lock/unlock() functions
which are defined in cpu.h but not included in cpuset-internal.h yet
leading to compilation error under certain kernel configurations.  Fix it
by moving the cpu.h include from cpuset.c to cpuset-internal.h. While
at it, sort the include files in alphabetic order.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202408311612.mQTuO946-lkp@intel.com/
Fixes: 047b830974 ("cgroup/cpuset: move relax_domain_level to cpuset-v1.c")
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2024-09-04 10:46:49 -10:00
Chen Ridong
3f9319c691 cgroup/cpuset: add sefltest for cpuset v1
There is only hotplug test for cpuset v1, just add base read/write test
for cpuset v1.

Signed-off-by: Chen Ridong <chenridong@huawei.com>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2024-08-30 10:00:17 -10:00
Chen Ridong
1abab1ba07 cgroup/cpuset: guard cpuset-v1 code under CONFIG_CPUSETS_V1
This patch introduces CONFIG_CPUSETS_V1 and guard cpuset-v1 code under
CONFIG_CPUSETS_V1. The default value of CONFIG_CPUSETS_V1 is N, so that
user who adopted v2 don't have 'pay' for cpuset v1.

Signed-off-by: Chen Ridong <chenridong@huawei.com>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2024-08-30 10:00:16 -10:00
Chen Ridong
381b53c3b5 cgroup/cpuset: rename functions shared between v1 and v2
Some functions name declared in cpuset-internel.h are generic. To avoid
confilicting with other variables for the same name, rename these
functions with cpuset_/cpuset1_ prefix to make them unique to cpuset.

Signed-off-by: Chen Ridong <chenridong@huawei.com>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2024-08-30 10:00:16 -10:00
Chen Ridong
b0ced9d378 cgroup/cpuset: move v1 interfaces to cpuset-v1.c
Move legacy cpuset controller interfaces files and corresponding code
into cpuset-v1.c. 'update_flag', 'cpuset_write_resmask' and
'cpuset_common_seq_show' are also used for v1, so declare them in
cpuset-internal.h.

'cpuset_write_s64', 'cpuset_read_s64' and 'fmeter_getrate' are only used
cpuset-v1.c now, make it static.

Signed-off-by: Chen Ridong <chenridong@huawei.com>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2024-08-30 10:00:16 -10:00
Chen Ridong
be126b5b1b cgroup/cpuset: move validate_change_legacy to cpuset-v1.c
The validate_change_legacy functions is used for v1, move it to
cpuset-v1.c. And two micro 'cpuset_for_each_child' and
'cpuset_for_each_descendant_pre' are common for v1 and v2, move them to
cpuset-internal.h.

Signed-off-by: Chen Ridong <chenridong@huawei.com>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2024-08-30 10:00:16 -10:00
Chen Ridong
23ca5237e3 cgroup/cpuset: move legacy hotplug update to cpuset-v1.c
There are some differents about hotplug update between cpuset v1 and
cpuset v2. Move the legacy code to cpuset-v1.c.

'update_tasks_cpumask' and 'update_tasks_nodemask' are both used in cpuset
v1 and cpuset v2, declare them in cpuset-internal.h.

The change from original code is that use callback_lock helpers to get
callback_lock lock/unlock.

Signed-off-by: Chen Ridong <chenridong@huawei.com>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2024-08-30 10:00:16 -10:00
Chen Ridong
530020f28f cgroup/cpuset: add callback_lock helper
To modify cpuset, both cpuset_mutex and callback_lock are needed. Add
helpers for cpuset-v1 to get callback_lock.

Signed-off-by: Chen Ridong <chenridong@huawei.com>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2024-08-30 10:00:16 -10:00
Chen Ridong
90eec9548d cgroup/cpuset: move memory_spread to cpuset-v1.c
'memory_spread' is only set in cpuset v1. move corresponding code into
cpuset-v1.c.

Currently, 'cpuset_update_task_spread_flags' and 'update_tasks_flags' are
exposed to cpuset.c.

Signed-off-by: Chen Ridong <chenridong@huawei.com>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2024-08-30 10:00:16 -10:00
Chen Ridong
047b830974 cgroup/cpuset: move relax_domain_level to cpuset-v1.c
Setting domain level is not supported at cpuset v2, so move corresponding
code into cpuset-v1.c.

The 'cpuset_write_s64' and 'cpuset_read_s64' are only used for setting
domain level, move them to cpuset-v1.c. Currently, expose to cpuset.c.
After cpuset legacy interface files are move to cpuset-v1.c, they can
be static. The 'rebuild_sched_domains_locked' is exposed to cpuset-v1.c.

The change from original code is that using 'cpuset_lock' and
'cpuset_unlock' functions to lock or unlock cpuset_mutex.

Signed-off-by: Chen Ridong <chenridong@huawei.com>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2024-08-30 10:00:15 -10:00
Chen Ridong
49434094ef cgroup/cpuset: move memory_pressure to cpuset-v1.c
Collection of memory_pressure can be enabled by writing 1 to the cpuset
file 'memory_pressure_enabled', which is only for cpuset-v1. Therefore,
move the corresponding code to cpuset-v1.c.

Currently, the 'fmeter_init' and 'fmeter_getrate' functions are called
at cpuset.c, so expose them to cpuset.c.

Signed-off-by: Chen Ridong <chenridong@huawei.com>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2024-08-30 10:00:15 -10:00
Chen Ridong
619a33efa0 cgroup/cpuset: move common code to cpuset-internal.h
Move some declarations that will be used for cpuset v1 and v2,
including 'cpuset struct', 'cpuset_flagbits_t', cpuset_filetype_t,etc.
No logical change.

Signed-off-by: Chen Ridong <chenridong@huawei.com>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2024-08-30 10:00:15 -10:00
Chen Ridong
71e934a808 cgroup/cpuset: introduce cpuset-v1.c
This patch introduces the cgroup/cpuset-v1.c source file which will be
used for all legacy (cgroup v1) cpuset cgroup code. It also introduces
cgroup/cpuset-internal.h to keep declarations shared between
cgroup/cpuset.c and cpuset/cpuset-v1.c.

As of now, let's compile it if CONFIG_CPUSET is set. Later on it can be
switched to use a separate config option, so that the legacy code won't be
compiled if not required.

Signed-off-by: Chen Ridong <chenridong@huawei.com>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2024-08-30 10:00:15 -10:00
Waiman Long
43a17fcfcc selftest/cgroup: Make test_cpuset_prs.sh deal with pre-isolated CPUs
Since isolated CPUs can be reserved at boot time via the "isolcpus"
boot command line option, these pre-isolated CPUs may interfere with
testing done by test_cpuset_prs.sh.

With the previous commit that incorporates those boot time isolated CPUs
into "cpuset.cpus.isolated", we can check for those before testing is
started to make sure that there will be no interference.  Otherwise,
this test will be skipped if incorrect test failure can happen.

As "cpuset.cpus.isolated" is now available in a non cgroup_debug kernel,
we don't need to check for its existence anymore.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2024-08-30 09:23:39 -10:00
Waiman Long
c188f33c86 cgroup/cpuset: Account for boot time isolated CPUs
With the "isolcpus" boot command line parameter, we are able to
create isolated CPUs at boot time. These isolated CPUs aren't fully
accounted for in the cpuset code. For instance, the root cgroup's
"cpuset.cpus.isolated" control file does not include the boot time
isolated CPUs. Fix that by looking for pre-isolated CPUs at init time.

The prstate_housekeeping_conflict() function does check the
HK_TYPE_DOMAIN housekeeping cpumask to make sure that CPUs outside of it
can only be used in isolated partition. Given the fact that we are going
to make housekeeping cpumasks dynamic, the current check may not be right
anymore. Save the boot time HK_TYPE_DOMAIN cpumask and check against
it instead of the upcoming dynamic HK_TYPE_DOMAIN housekeeping cpumask.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2024-08-30 09:23:39 -10:00
Chen Ridong
3c2acae888 cgroup/cpuset: remove use_parent_ecpus of cpuset
use_parent_ecpus is used to track whether the children are using the
parent's effective_cpus. When a parent's effective_cpus is changed
due to changes in a child partition's effective_xcpus, any child
using parent'effective_cpus must call update_cpumasks_hier. However,
if a child is not a valid partition, it is sufficient to determine
whether to call update_cpumasks_hier based on whether the child's
effective_cpus is going to change. To make the code more succinct,
it is suggested to remove use_parent_ecpus.

Signed-off-by: Chen Ridong <chenridong@huawei.com>
Reviewed-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2024-08-20 08:51:48 -10:00
Chen Ridong
9414f68d45 cgroup/cpuset: remove fetch_xcpus
Both fetch_xcpus and user_xcpus functions are used to retrieve the value
of exclusive_cpus. If exclusive_cpus is not set, cpus_allowed is the
implicit value used as exclusive in a local partition. I can not imagine
a scenario where effective_xcpus is not empty when exclusive_cpus is
empty. Therefore, I suggest removing the fetch_xcpus function.

Signed-off-by: Chen Ridong <chenridong@huawei.com>
Reviewed-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2024-08-20 08:51:34 -10:00
Chen Ridong
e55f45b4ba cgroup/cpuset: Correct invalid remote parition prs
When enable a remote partition, I found that:

cd /sys/fs/cgroup/
mkdir test
mkdir test/test1
echo +cpuset > cgroup.subtree_control
echo +cpuset >  test/cgroup.subtree_control
echo 3 > test/test1/cpuset.cpus
echo root > test/test1/cpuset.cpus.partition
cat test/test1/cpuset.cpus.partition
root invalid (Parent is not a partition root)

The parent of a remote partition could not be a root. This is due to the
emtpy effective_xcpus. It would be better to prompt the message "invalid
cpu list in cpuset.cpus.exclusive".

Signed-off-by: Chen Ridong <chenridong@huawei.com>
Reviewed-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2024-08-20 08:51:16 -10:00
Chen Ridong
d1a92d2d6c cgroup: update some statememt about delegation
The comment in cgroup_file_write is missing some interfaces, such as
'cgroup.threads'. All delegatable files are listed in
'/sys/kernel/cgroup/delegate', so update the comment in cgroup_file_write.
Besides, add a statement that files outside the namespace shouldn't be
visible from inside the delegated namespace.

tj: Reflowed text for consistency.

Signed-off-by: Chen Ridong <chenridong@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2024-08-19 12:16:17 -10:00
Waiman Long
9b103943ab cgroup: Fix incorrect WARN_ON_ONCE() in css_release_work_fn()
It turns out that the WARN_ON_ONCE() call in css_release_work_fn
introduced by commit ab03125268 ("cgroup: Show # of subsystem CSSes
in cgroup.stat") is incorrect. Although css->nr_descendants must be
0 when a css is released and ready to be freed, the corresponding
cgrp->nr_dying_subsys[ss->id] may not be 0 if a subsystem is activated
and deactivated multiple times with one or more of its previous
activation leaving behind dying csses.

Fix the incorrect warning by removing the cgrp->nr_dying_subsys check.

Fixes: ab03125268 ("cgroup: Show # of subsystem CSSes in cgroup.stat")
Closes: https://lore.kernel.org/cgroups/6f301773-2fce-4602-a391-8af7ef00b2fb@redhat.com/T/#t
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2024-08-09 06:35:29 -10:00
Waiman Long
92841d6e23 selftest/cgroup: Add new test cases to test_cpuset_prs.sh
Add new test cases to test_cpuset_prs.sh to cover corner cases reported
in previous fix commits.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2024-08-05 10:57:42 -10:00
Waiman Long
99570300d3 cgroup/cpuset: Check for partition roots with overlapping CPUs
With the previous commit that eliminates the overlapping partition
root corner cases in the hotplug code, the partition roots passed down
to generate_sched_domains() should not have overlapping CPUs. Enable
overlapping cpuset check for v2 and warn if that happens.

This patch also has the benefit of increasing test coverage of the new
Union-Find cpuset merging code to cgroup v2.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2024-08-05 10:57:38 -10:00
Tejun Heo
bc3c27516f Merge branch 'cgroup/for-6.11-fixes' into cgroup/for-6.12
cgroup/for-6.12 is about to receive updates that are dependent on changes
from both for-6.11-fixes and for-6.12. Pull in for-6.11-fixes.

Signed-off-by: Tejun Heo <tj@kernel.org>
2024-08-05 10:56:49 -10:00
Waiman Long
ff0ce721ec cgroup/cpuset: Eliminate unncessary sched domains rebuilds in hotplug
It was found that some hotplug operations may cause multiple
rebuild_sched_domains_locked() calls. Some of those intermediate calls
may use cpuset states not in the final correct form leading to incorrect
sched domain setting.

Fix this problem by using the existing force_rebuild flag to inhibit
immediate rebuild_sched_domains_locked() calls if set and only doing
one final call at the end. Also renaming the force_rebuild flag to
force_sd_rebuild to make its meaning for clear.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2024-08-05 10:54:25 -10:00
Waiman Long
311a1bdc44 cgroup/cpuset: Clear effective_xcpus on cpus_allowed clearing only if cpus.exclusive not set
Commit e2ffe502ba ("cgroup/cpuset: Add cpuset.cpus.exclusive for
v2") adds a user writable cpuset.cpus.exclusive file for setting
exclusive CPUs to be used for the creation of partitions. Since then
effective_xcpus depends on both the cpuset.cpus and cpuset.cpus.exclusive
setting. If cpuset.cpus.exclusive is set, effective_xcpus will depend
only on cpuset.cpus.exclusive.  When it is not set, effective_xcpus
will be set according to the cpuset.cpus value when the cpuset becomes
a valid partition root.

When cpuset.cpus is being cleared by the user, effective_xcpus should
only be cleared when cpuset.cpus.exclusive is not set. However, that
is not currently the case.

  # cd /sys/fs/cgroup/
  # mkdir test
  # echo +cpuset > cgroup.subtree_control
  # cd test
  # echo 3 > cpuset.cpus.exclusive
  # cat cpuset.cpus.exclusive.effective
  3
  # echo > cpuset.cpus
  # cat cpuset.cpus.exclusive.effective // was cleared

Fix it by clearing effective_xcpus only if cpuset.cpus.exclusive is
not set.

Fixes: e2ffe502ba ("cgroup/cpuset: Add cpuset.cpus.exclusive for v2")
Cc: stable@vger.kernel.org # v6.7+
Reported-by: Chen Ridong <chenridong@huawei.com>
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2024-08-05 10:52:40 -10:00
Chen Ridong
959ab6350a cgroup/cpuset: fix panic caused by partcmd_update
We find a bug as below:
BUG: unable to handle page fault for address: 00000003
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 3 PID: 358 Comm: bash Tainted: G        W I        6.6.0-10893-g60d6
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/4
RIP: 0010:partition_sched_domains_locked+0x483/0x600
Code: 01 48 85 d2 74 0d 48 83 05 29 3f f8 03 01 f3 48 0f bc c2 89 c0 48 9
RSP: 0018:ffffc90000fdbc58 EFLAGS: 00000202
RAX: 0000000100000003 RBX: ffff888100b3dfa0 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000002fe80
RBP: ffff888100b3dfb0 R08: 0000000000000001 R09: 0000000000000000
R10: ffffc90000fdbcb0 R11: 0000000000000004 R12: 0000000000000002
R13: ffff888100a92b48 R14: 0000000000000000 R15: 0000000000000000
FS:  00007f44a5425740(0000) GS:ffff888237d80000(0000) knlGS:0000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000100030973 CR3: 000000010722c000 CR4: 00000000000006e0
Call Trace:
 <TASK>
 ? show_regs+0x8c/0xa0
 ? __die_body+0x23/0xa0
 ? __die+0x3a/0x50
 ? page_fault_oops+0x1d2/0x5c0
 ? partition_sched_domains_locked+0x483/0x600
 ? search_module_extables+0x2a/0xb0
 ? search_exception_tables+0x67/0x90
 ? kernelmode_fixup_or_oops+0x144/0x1b0
 ? __bad_area_nosemaphore+0x211/0x360
 ? up_read+0x3b/0x50
 ? bad_area_nosemaphore+0x1a/0x30
 ? exc_page_fault+0x890/0xd90
 ? __lock_acquire.constprop.0+0x24f/0x8d0
 ? __lock_acquire.constprop.0+0x24f/0x8d0
 ? asm_exc_page_fault+0x26/0x30
 ? partition_sched_domains_locked+0x483/0x600
 ? partition_sched_domains_locked+0xf0/0x600
 rebuild_sched_domains_locked+0x806/0xdc0
 update_partition_sd_lb+0x118/0x130
 cpuset_write_resmask+0xffc/0x1420
 cgroup_file_write+0xb2/0x290
 kernfs_fop_write_iter+0x194/0x290
 new_sync_write+0xeb/0x160
 vfs_write+0x16f/0x1d0
 ksys_write+0x81/0x180
 __x64_sys_write+0x21/0x30
 x64_sys_call+0x2f25/0x4630
 do_syscall_64+0x44/0xb0
 entry_SYSCALL_64_after_hwframe+0x78/0xe2
RIP: 0033:0x7f44a553c887

It can be reproduced with cammands:
cd /sys/fs/cgroup/
mkdir test
cd test/
echo +cpuset > ../cgroup.subtree_control
echo root > cpuset.cpus.partition
cat /sys/fs/cgroup/cpuset.cpus.effective
0-3
echo 0-3 > cpuset.cpus // taking away all cpus from root

This issue is caused by the incorrect rebuilding of scheduling domains.
In this scenario, test/cpuset.cpus.partition should be an invalid root
and should not trigger the rebuilding of scheduling domains. When calling
update_parent_effective_cpumask with partcmd_update, if newmask is not
null, it should recheck newmask whether there are cpus is available
for parect/cs that has tasks.

Fixes: 0c7f293efc ("cgroup/cpuset: Add cpuset.cpus.exclusive.effective for v2")
Cc: stable@vger.kernel.org # v6.7+
Signed-off-by: Chen Ridong <chenridong@huawei.com>
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2024-08-05 10:52:40 -10:00
Xiu Jianfeng
4980f71202 cgroup/pids: Remove unreachable paths of pids_{can,cancel}_fork
According to the implementation of cgroup_css_set_fork(), it will fail
if cset cannot be found and the can_fork/cancel_fork methods will not
be called in this case, which means that the argument 'cset' for these
methods must not be NULL, so remove the unrechable paths in them.

Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Reviewed-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2024-08-05 10:32:16 -10:00
Xavier
563ea1f5f8 Documentation: Fix the compilation errors in union_find.rst
Fix the compilation errors and warnings caused by merging
Documentation/core-api/union_find.rst and
Documentation/translations/zh_CN/core-api/union_find.rst.

Signed-off-by: Xavier <xavier_qy@163.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2024-08-02 08:58:16 -10:00
Waiman Long
ab03125268 cgroup: Show # of subsystem CSSes in cgroup.stat
Cgroup subsystem state (CSS) is an abstraction in the cgroup layer to
help manage different structures in various cgroup subsystems by being
an embedded element inside a larger structure like cpuset or mem_cgroup.

The /proc/cgroups file shows the number of cgroups for each of the
subsystems.  With cgroup v1, the number of CSSes is the same as the
number of cgroups.  That is not the case anymore with cgroup v2. The
/proc/cgroups file cannot show the actual number of CSSes for the
subsystems that are bound to cgroup v2.

So if a v2 cgroup subsystem is leaking cgroups (usually memory cgroup),
we can't tell by looking at /proc/cgroups which cgroup subsystems may
be responsible.

As cgroup v2 had deprecated the use of /proc/cgroups, the hierarchical
cgroup.stat file is now being extended to show the number of live and
dying CSSes associated with all the non-inhibited cgroup subsystems that
have been bound to cgroup v2. The number includes CSSes in the current
cgroup as well as in all the descendants underneath it.  This will help
us pinpoint which subsystems are responsible for the increasing number
of dying (nr_dying_descendants) cgroups.

The CSSes dying counts are stored in the cgroup structure itself
instead of inside the CSS as suggested by Johannes. This will allow
us to accurately track dying counts of cgroup subsystems that have
recently been disabled in a cgroup. It is now possible that a zero
subsystem number is coupled with a non-zero dying subsystem number.

The cgroup-v2.rst file is updated to discuss this new behavior.

With this patch applied, a sample output from root cgroup.stat file
was shown below.

	nr_descendants 56
	nr_subsys_cpuset 1
	nr_subsys_cpu 43
	nr_subsys_io 43
	nr_subsys_memory 56
	nr_subsys_perf_event 57
	nr_subsys_hugetlb 1
	nr_subsys_pids 56
	nr_subsys_rdma 1
	nr_subsys_misc 1
	nr_dying_descendants 30
	nr_dying_subsys_cpuset 0
	nr_dying_subsys_cpu 0
	nr_dying_subsys_io 0
	nr_dying_subsys_memory 30
	nr_dying_subsys_perf_event 0
	nr_dying_subsys_hugetlb 0
	nr_dying_subsys_pids 0
	nr_dying_subsys_rdma 0
	nr_dying_subsys_misc 0

Another sample output from system.slice/cgroup.stat was:

	nr_descendants 34
	nr_subsys_cpuset 0
	nr_subsys_cpu 32
	nr_subsys_io 32
	nr_subsys_memory 34
	nr_subsys_perf_event 35
	nr_subsys_hugetlb 0
	nr_subsys_pids 34
	nr_subsys_rdma 0
	nr_subsys_misc 0
	nr_dying_descendants 30
	nr_dying_subsys_cpuset 0
	nr_dying_subsys_cpu 0
	nr_dying_subsys_io 0
	nr_dying_subsys_memory 30
	nr_dying_subsys_perf_event 0
	nr_dying_subsys_hugetlb 0
	nr_dying_subsys_pids 0
	nr_dying_subsys_rdma 0
	nr_dying_subsys_misc 0

Note that 'debug' controller wasn't used to provide this information because
the controller is not recommended in productions kernels, also many of them
won't enable CONFIG_CGROUP_DEBUG by default.

Similar information could be retrieved with debuggers like drgn but that's
also not always available (e.g. lockdown) and the additional cost of runtime
tracking here is deemed marginal.

tj: Added Michal's paragraphs on why this is not added the debug controller
    to the commit message.

Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
Reviewed-by: Kamalesh Babulal <kamalesh.babulal@oracle.com>
Cc: Michal Koutný <mkoutny@suse.com>
Link: http://lkml.kernel.org/r/20240715150034.2583772-1-longman@redhat.com
Signed-off-by: Tejun Heo <tj@kernel.org>
2024-07-31 07:00:02 -10:00
Xavier
8a895c2e6a cpuset: use Union-Find to optimize the merging of cpumasks
The process of constructing scheduling domains
 involves multiple loops and repeated evaluations, leading to numerous
 redundant and ineffective assessments that impact code efficiency.

Here, we use union-find to optimize the merging of cpumasks. By employing
path compression and union by rank, we effectively reduce the number of
lookups and merge comparisons.

Signed-off-by: Xavier <xavier_qy@163.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2024-07-30 13:04:50 -10:00
Xavier
93c8332c83 Union-Find: add a new module in kernel library
This patch implements a union-find data structure in the kernel library,
which includes operations for allocating nodes, freeing nodes,
finding the root of a node, and merging two nodes.

Signed-off-by: Xavier <xavier_qy@163.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2024-07-30 13:04:36 -10:00
Chen Ridong
4a711dd910 cgroup/cpuset: add decrease attach_in_progress helpers
There are several functions to decrease attach_in_progress, and they
will wake up cpuset_attach_wq when attach_in_progress is zero. So,
add a helper to make it concise.

Signed-off-by: Chen Ridong <chenridong@huawei.com>
Reviewed-by: Kamalesh Babulal <kamalesh.babulal@oracle.com>
Reviewed-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2024-07-30 12:38:14 -10:00
Xiu Jianfeng
c149c4a48b cgroup/cpuset: Remove cpuset_slab_spread_rotor
Since the SLAB implementation was removed in v6.8, so the
cpuset_slab_spread_rotor is no longer used and can be removed.

Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Reviewed-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2024-07-30 12:29:08 -10:00
Xiu Jianfeng
d72a00a848 cgroup/pids: Avoid spurious event notification
Currently when a process in a group forks and fails due to it's
parent's max restriction, all the cgroups from 'pids_forking' to root
will generate event notifications but only the cgroups from
'pids_over_limit' to root will increase the counter of PIDCG_MAX.

Consider this scenario: there are 4 groups A, B, C,and D, the
relationships are as follows, and user is watching on C.pids.events.

root->A->B->C->D

When a process in D forks and fails due to B.max restriction, the
user will get a spurious event notification because when he wakes up
and reads C.pids.events, he will find that the content has not changed.

To address this issue, only the cgroups from 'pids_over_limit' to root
will have their PIDCG_MAX counters increased and event notifications
generated.

Fixes: 385a635cac ("cgroup/pids: Make event counters hierarchical")
Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2024-07-30 12:13:19 -10:00
Chen Ridong
d632604757 cgroup/cpuset: remove child_ecpus_count
The child_ecpus_count variable was previously used to update
sibling cpumask when parent's effective_cpus is updated. However, it became
obsolete after commit e2ffe502ba ("cgroup/cpuset: Add
cpuset.cpus.exclusive for v2"). It should be removed.

tj: Restored {} for style consistency.

Signed-off-by: Chen Ridong <chenridong@huawei.com>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2024-07-30 09:45:26 -10:00
Linus Torvalds
8400291e28 Linux 6.11-rc1 2024-07-28 14:19:55 -07:00
Linus Torvalds
a0c04bd55a Kbuild fixes for v6.11
- Fix RPM package build error caused by an incorrect locale setup
 
  - Mark modules.weakdep as ghost in RPM package
 
  - Fix the odd combination of -S and -c in stack protector scripts, which
    is an error with the latest Clang
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmamlN8VHG1hc2FoaXJv
 eUBrZXJuZWwub3JnAAoJED2LAQed4NsG828P/R3MRpmWBhMrw/JmvpJ3hyby6sZA
 VcwOMLmmiIFdmVK5NKqWRkOBfD2JlRrbuuCrJKGJIX1lo4h/TifQf+gu8s8bN38h
 ZLo7KMzn4kCX91jSwhnOBczw2jEnKbrQrMoktTl7k9GC+pnebf9pzdxXzXWIf5Rs
 Z91XuueFrOZ/lDOVjv/675MPrp9M4tM2aZPo3RaKbbPFAaSPDihSa3iKyRDqPd6K
 1rzqgt3+fvkVVO8HZjm12XBg5AoL2bYvaJtF12zZkFEhJRRmXmGQxTVbZSh/PP+z
 D+BCFUTAt0gr8rQSLTtW2PStjvZc0nEG6Mpnz/OiQCEcMycLgEq+YkD9TbQISX8q
 FiQMVPe9YV7AWSYZd5RgMHxgeB2RDd6e6Z4yKATeatJ46UpoV6Kg1JczBJTs1b5X
 bb9Q+nZniioyzc3ys5UQchTu01bzv17GsWya6ipZWf8Lf+yvMipoADuV6MOs6YUh
 TEFjoA/cGiGQQ5uLHDZ69n8dJ+FRZfMvW4dQIGE2kf92e0lQKrKwa1WXQ78+rPMn
 j8wVYB8kiqgDW+ObxTJgr0NakbqMEBQsGxwv0W4TRjD9fN4kXcWqN/bKq2IfTwxk
 mSi/ASo900XHLCGKtBH6vNoi6bkJXC7YoSBEIulpYaxCf5S0VPHZ5iFrQpdhqKSZ
 qNfFHDoZFyTdPGZk
 =HYew
 -----END PGP SIGNATURE-----

Merge tag 'kbuild-fixes-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild fixes from Masahiro Yamada:

 - Fix RPM package build error caused by an incorrect locale setup

 - Mark modules.weakdep as ghost in RPM package

 - Fix the odd combination of -S and -c in stack protector scripts,
   which is an error with the latest Clang

* tag 'kbuild-fixes-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kbuild: Fix '-S -c' in x86 stack protector scripts
  kbuild: rpm-pkg: ghost modules.weakdep file
  kbuild: rpm-pkg: Fix C locale setup
2024-07-28 14:02:48 -07:00
Linus Torvalds
017fa3e891 minmax: simplify and clarify min_t()/max_t() implementation
This simplifies the min_t() and max_t() macros by no longer making them
work in the context of a C constant expression.

That means that you can no longer use them for static initializers or
for array sizes in type definitions, but there were only a couple of
such uses, and all of them were converted (famous last words) to use
MIN_T/MAX_T instead.

Cc: David Laight <David.Laight@aculab.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-07-28 13:50:01 -07:00
Linus Torvalds
4477b39c32 minmax: add a few more MIN_T/MAX_T users
Commit 3a7e02c040 ("minmax: avoid overly complicated constant
expressions in VM code") added the simpler MIN_T/MAX_T macros in order
to avoid some excessive expansion from the rather complicated regular
min/max macros.

The complexity of those macros stems from two issues:

 (a) trying to use them in situations that require a C constant
     expression (in static initializers and for array sizes)

 (b) the type sanity checking

and MIN_T/MAX_T avoids both of these issues.

Now, in the whole (long) discussion about all this, it was pointed out
that the whole type sanity checking is entirely unnecessary for
min_t/max_t which get a fixed type that the comparison is done in.

But that still leaves min_t/max_t unnecessarily complicated due to
worries about the C constant expression case.

However, it turns out that there really aren't very many cases that use
min_t/max_t for this, and we can just force-convert those.

This does exactly that.

Which in turn will then allow for much simpler implementations of
min_t()/max_t().  All the usual "macros in all upper case will evaluate
the arguments multiple times" rules apply.

We should do all the same things for the regular min/max() vs MIN/MAX()
cases, but that has the added complexity of various drivers defining
their own local versions of MIN/MAX, so that needs another level of
fixes first.

Link: https://lore.kernel.org/all/b47fad1d0cf8449886ad148f8c013dae@AcuMS.aculab.com/
Cc: David Laight <David.Laight@aculab.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-07-28 13:41:14 -07:00
Linus Torvalds
7e2d0ba732 This pull request contains updates (actually, just fixes) for UBI and UBIFS:
- Many fixes for power-cut issues by Zhihao Cheng
 - Another ubiblock error path fix
 - ubiblock section mismatch fix
 - Misc fixes all over the place
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCAA0FiEEdgfidid8lnn52cLTZvlZhesYu8EFAmamiikWHHJpY2hhcmRA
 c2lnbWEtc3Rhci5hdAAKCRBm+VmF6xi7wTy6EACEj6VcTRei5Ndt2QXNiKHNgJ3j
 yV0QmLWUF5brfXbUsoP9oeKL/Y9s+02maDUkOJoHGcHRClLyHkIMMHXYy5gpXyiI
 h4zTcx3x33+oddYJeqdZYDDpWwk8RsqE1UjB8i7q+DF7vznWqYl3/+G7coH8ZD8M
 +//3AzWjD/U9Gj3KVuyBgok0OmnwFJ36w3Ckoj7NKxbFQrCmxa7qtLqseuiHumhu
 HjAMPS1V+Hmw6qw/s+hs6NdBhOvmO56s1wwT26mji+kQVMuh6tG3YNo2tyM7AHSv
 cGJDVGuDmO3F6iKqom3edjxHetWmCN3gvCkacpE6G/9Q1x3XO7vWhPfl6JHv+KTu
 zsc0WDhmNKkxUIHgBKvFGCylo6Ui4rIn5iFihXBiGTjB52tBqPU+0STDfcKcqz6U
 BrgCiKdAuX7tZT00tuFJE9eWbZvPVWQii0aNw5uLi9GwXOFmLP/VAi+FDQ0hsNSF
 XQxgWAd9b7eLmyWUrn6ORj2+FIzjmF6oxBfUOHgE4J1GPggIDXii9NDJcma3XSTn
 OcqpCJe5SOc38M7ldwbZup53fl+DFvsbnbDofrGCvhSTCqAyGx5su3m40YsIVRvI
 fpCk6L/6vGp/ONw90Vq/KKeIkAJaHnqi6KbnfyDtF6LSwW7cb4+Cw69BWvpP0gQb
 m1hTm4vIaPvD25840g==
 =0y6/
 -----END PGP SIGNATURE-----

Merge tag 'ubifs-for-linus-6.11-rc1-take2' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs

Pull UBI and UBIFS updates from Richard Weinberger:

 - Many fixes for power-cut issues by Zhihao Cheng

 - Another ubiblock error path fix

 - ubiblock section mismatch fix

 - Misc fixes all over the place

* tag 'ubifs-for-linus-6.11-rc1-take2' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs:
  ubi: Fix ubi_init() ubiblock_exit() section mismatch
  ubifs: add check for crypto_shash_tfm_digest
  ubifs: Fix inconsistent inode size when powercut happens during appendant writing
  ubi: block: fix null-pointer-dereference in ubiblock_create()
  ubifs: fix kernel-doc warnings
  ubifs: correct UBIFS_DFS_DIR_LEN macro definition and improve code clarity
  mtd: ubi: Restore missing cleanup on ubi_init() failure path
  ubifs: dbg_orphan_check: Fix missed key type checking
  ubifs: Fix unattached inode when powercut happens in creating
  ubifs: Fix space leak when powercut happens in linking tmpfile
  ubifs: Move ui->data initialization after initializing security
  ubifs: Fix adding orphan entry twice for the same inode
  ubifs: Remove insert_dead_orphan from replaying orphan process
  Revert "ubifs: ubifs_symlink: Fix memleak of inode->i_link in error path"
  ubifs: Don't add xattr inode into orphan area
  ubifs: Fix unattached xattr inode if powercut happens after deleting
  mtd: ubi: avoid expensive do_div() on 32-bit machines
  mtd: ubi: make ubi_class constant
  ubi: eba: properly rollback inside self_check_eba
2024-07-28 11:51:51 -07:00
Nathan Chancellor
3415b10a03 kbuild: Fix '-S -c' in x86 stack protector scripts
After a recent change in clang to stop consuming all instances of '-S'
and '-c' [1], the stack protector scripts break due to the kernel's use
of -Werror=unused-command-line-argument to catch cases where flags are
not being properly consumed by the compiler driver:

  $ echo | clang -o - -x c - -S -c -Werror=unused-command-line-argument
  clang: error: argument unused during compilation: '-c' [-Werror,-Wunused-command-line-argument]

This results in CONFIG_STACKPROTECTOR getting disabled because
CONFIG_CC_HAS_SANE_STACKPROTECTOR is no longer set.

'-c' and '-S' both instruct the compiler to stop at different stages of
the pipeline ('-S' after compiling, '-c' after assembling), so having
them present together in the same command makes little sense. In this
case, the test wants to stop before assembling because it is looking at
the textual assembly output of the compiler for either '%fs' or '%gs',
so remove '-c' from the list of arguments to resolve the error.

All versions of GCC continue to work after this change, along with
versions of clang that do or do not contain the change mentioned above.

Cc: stable@vger.kernel.org
Fixes: 4f7fd4d7a7 ("[PATCH] Add the -fstack-protector option to the CFLAGS")
Fixes: 60a5317ff0 ("x86: implement x86_32 stack protector")
Link: 6461e53781 [1]
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-07-29 03:47:00 +09:00
Richard Weinberger
92a286e902 ubi: Fix ubi_init() ubiblock_exit() section mismatch
Since ubiblock_exit() is now called from an init function,
the __exit section no longer makes sense.

Cc: Ben Hutchings <bwh@kernel.org>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202407131403.wZJpd8n2-lkp@intel.com/
Signed-off-by: Richard Weinberger <richard@nod.at>
Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>
2024-07-28 20:08:25 +02:00
Linus Torvalds
e172f1e906 turbostat release 2024.07.26
Enable turbostat extensions to add both perf and PMT
 (Intel Platform Monitoring Technology) counters via the cmdline.
 
 Demonstrate PMT access with built-in support for Meteor Lake's Die%c6 counter.
 -----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEE67dNfPFP+XUaA73mB9BFOha3NhcFAmaj7c8UHGxlbi5icm93
 bkBpbnRlbC5jb20ACgkQB9BFOha3Nhe6XBAArHsMOw7r1dF94yctBukD94szLasz
 9BGI64NNVYz4pUlM0BUayZzr0kYxMonLvKMHcg1XaeCF4DRByUyhM86QfPAUVskx
 qEY3raRu18wTlfku/90jYm5AM09Dp846zOf4dtrV2Io/JM8TLqo9gAsHtZI5Qaiu
 bR9nPL4vjysnIiUG5aowBhYBnI9xvAecxbID/qfOofZMGyb6h06xlXPkDkD2KTkL
 F/5Bv7AWohAQ7cX8EEg865eQCea4YDnQtcZg4yYRMkA78bVfE1WzCgA+qjb1dJ0A
 2bbL6m7cBvikWkii82VwYWzPLbJTlQDIpQRKxRQP4SvsRc0NinZmS5d4AO9I63h3
 JfIjtj7NEhzyPnaykJqcsgyOI3lBFbcEOUckutj0M8S3B6UJSC9WQnU2D6sGuQcW
 iwav+zbsEkxL3ebYkOLuTEGLhqJFy6ZNxvPVAh+Q63jcBxraZoHVqG1VbmtqslnE
 fSrhZ/hJIqo1F25X7DsMjyk7txDQYed9g/EArQWBb+DiL/ggvaO+FT/TGE3hCGCQ
 dt2J1+hhshHesQIhF4/9sj/wHrw8RA08BVqGWzAOvwx69wevvSszSQrGU5ISypTv
 9JfeOXGvc72ZQzC8MaotQfyHmxIIJ5ZQ6wFAcThc1Og39OU2+CbfIC5tLIqRoLR6
 sEWH07ty9KpN0aQ=
 =wsSy
 -----END PGP SIGNATURE-----

Merge tag 'v6.11-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux

Pull turbostat updates from Len Brown:

 - Enable turbostat extensions to add both perf and PMT (Intel
   Platform Monitoring Technology) counters via the cmdline

 - Demonstrate PMT access with built-in support for Meteor Lake's
   Die C6 counter

* tag 'v6.11-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
  tools/power turbostat: version 2024.07.26
  tools/power turbostat: Include umask=%x in perf counter's config
  tools/power turbostat: Document PMT in turbostat.8
  tools/power turbostat: Add MTL's PMT DC6 builtin counter
  tools/power turbostat: Add early support for PMT counters
  tools/power turbostat: Add selftests for added perf counters
  tools/power turbostat: Add selftests for SMI, APERF and MPERF counters
  tools/power turbostat: Move verbose counter messages to level 2
  tools/power turbostat: Move debug prints from stdout to stderr
  tools/power turbostat: Fix typo in turbostat.8
  tools/power turbostat: Add perf added counter example to turbostat.8
  tools/power turbostat: Fix formatting in turbostat.8
  tools/power turbostat: Extend --add option with perf counters
  tools/power turbostat: Group SMI counter with APERF and MPERF
  tools/power turbostat: Add ZERO_ARRAY for zero initializing builtin array
  tools/power turbostat: Replace enum rapl_source and cstate_source with counter_source
  tools/power turbostat: Remove anonymous union from rapl_counter_info_t
  tools/power/turbostat: Switch to new Intel CPU model defines
2024-07-28 10:52:15 -07:00
Linus Torvalds
e62f81bbd2 CXL for v6.11 merge window
New Changes:
 - Refactor to a common struct for DRAM and general media CXL events
 - Add abstract distance calculation support for CXL
 - Add CXL maturity map documentation to detail current state of CXL enabling
 - Add warning on mixed CXL VH and RCH/RCD hierachy to inform unsupported config
 - Replace ENXIO with EBUSY for inject poison limit reached via debugfs
 - Replace ENXIO with EBUSY for inject poison cxl-test support
 - XOR math fixup for DPA to SPA translation. Current math works for MODULO arithmetic
   where HPA==SPA, however not for XOR decode.
 - Move pci config read in cxl_dvsec_rr_decode() to avoid unnecessary acess
 
 Fixes:
 - Add a fix to address race condition in CXL memory hotplug notifier
 - Add missing MODULE_DESCRIPTION() for CXL modules
 - Fix incorrect vendor debug UUID define
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE5DAy15EJMCV1R6v9YGjFFmlTOEoFAmahJiMACgkQYGjFFmlT
 OEq8URAArcnzmH9yLvgE2pFOtaKg34vIGDWZGC4R1LpTnFEea04FuJslmxEKNgWo
 DJgPt9VZ66ump/oIvzcbvgLl/yMCTbnSxt5U6J8G5EmpO50PvxOTeWnEgAYVa0NH
 Diuzk/aF4GA94T3w+iAOzYx2N36kF+ezsY3/kqSORT7MC+DipSSUaPUiJcjr6FC6
 /ZIwkhhRi51ONJ8IgaXD+oEU9kxx7WUEyZoQZrJ9bv8/fGbeEfqy04pz2xDKHmLD
 rlQjm3l9um67VMsCvZ62Ce14HXqM213jZ3l0FmYjO4GbdXd2+0ZmIRNAb5vvTG9n
 5cY8vNsL6fND9FKkxlcRSdzI/O/vV+gcU+jzJxiul0p5fWHh/gaYjVH7fFq3dYc+
 vYE5lr97BfyA61bdmylIc2xwDH4yNKVQLZZPVTz5XTxfzBjYCjLPb5vGQKfg/nrB
 N66wjCIWLfCH6DqusUXem1c6BSrrjob8MwXpg00eBE0AA4ihieiy5fxuApnv9mI2
 f809AXRV1k24s5upStZ9iGZSEILBBqiw/KwDyWfRvxjNz36Z1Q2eiXBwbHrVQHBa
 PFtRPPFsZ9+ouIG/8otFaLwDQdITRdA0+drG8lmJ+gs8239Z3eIMMS0+CYdLDbva
 S8vo4POOQSS+cVUjLkC9zIxwPaXq96TLIkCtiLI9xUx5eIzv4K0=
 =HaEG
 -----END PGP SIGNATURE-----

Merge tag 'cxl-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull CXL updates from Dave Jiang:
 "Core:

   - A CXL maturity map has been added to the documentation to detail
     the current state of CXL enabling.

     It provides the status of the current state of various CXL features
     to inform current and future contributors of where things are and
     which areas need contribution.

   - A notifier handler has been added in order for a newly created CXL
     memory region to trigger the abstract distance metrics calculation.

     This should bring parity for CXL memory to the same level vs
     hotplugged DRAM for NUMA abstract distance calculation. The
     abstract distance reflects relative performance used for memory
     tiering handling.

   - An addition for XOR math has been added to address the CXL DPA to
     SPA translation.

     CXL address translation did not support address interleave math
     with XOR prior to this change.

  Fixes:

   - Fix to address race condition in the CXL memory hotplug notifier

   - Add missing MODULE_DESCRIPTION() for CXL modules

   - Fix incorrect vendor debug UUID define

  Misc:

   - A warning has been added to inform users of an unsupported
     configuration when mixing CXL VH and RCH/RCD hierarchies

   - The ENXIO error code has been replaced with EBUSY for inject poison
     limit reached via debugfs and cxl-test support

   - Moving the PCI config read in cxl_dvsec_rr_decode() to avoid
     unnecessary PCI config reads

   - A refactor to a common struct for DRAM and general media CXL
     events"

* tag 'cxl-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
  cxl/core/pci: Move reading of control register to immediately before usage
  cxl: Remove defunct code calculating host bridge target positions
  cxl/region: Verify target positions using the ordered target list
  cxl: Restore XOR'd position bits during address translation
  cxl/core: Fold cxl_trace_hpa() into cxl_dpa_to_hpa()
  cxl/test: Replace ENXIO with EBUSY for inject poison limit reached
  cxl/memdev: Replace ENXIO with EBUSY for inject poison limit reached
  cxl/acpi: Warn on mixed CXL VH and RCH/RCD Hierarchy
  cxl/core: Fix incorrect vendor debug UUID define
  Documentation: CXL Maturity Map
  cxl/region: Simplify cxl_region_nid()
  cxl/region: Support to calculate memory tier abstract distance
  cxl/region: Fix a race condition in memory hotplug notifier
  cxl: add missing MODULE_DESCRIPTION() macros
  cxl/events: Use a common struct for DRAM and General Media events
2024-07-28 09:33:28 -07:00
Linus Torvalds
7b5d481889 Two small fixes to silent the compiler and static analyzers tools from
Ben Dooks and Jeff Johnson.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRIEdicmeMNCZKVCdo6u2Upsdk6RAUCZqA9+gAKCRA6u2Upsdk6
 RJETAQDN9OkX2GJlekEo5NPVD531ekV4G7OZMWTrmPKRINClZQEAj9Spt2zP5v4V
 413unRBro9nuKfGgTaquXoHlCuPE+wE=
 =L/SP
 -----END PGP SIGNATURE-----

Merge tag 'unicode-next-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/krisman/unicode

Pull unicode update from Gabriel Krisman Bertazi:
 "Two small fixes to silence the compiler and static analyzers tools
  from Ben Dooks and Jeff Johnson"

* tag 'unicode-next-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/krisman/unicode:
  unicode: add MODULE_DESCRIPTION() macros
  unicode: make utf8 test count static
2024-07-28 09:14:11 -07:00
Jose Ignacio Tornos Martinez
d01c14074b kbuild: rpm-pkg: ghost modules.weakdep file
In the same way as for other similar files, mark as ghost the new file
generated by depmod for configured weak dependencies for modules,
modules.weakdep, so that although it is not included in the package,
claim the ownership on it.

Signed-off-by: Jose Ignacio Tornos Martinez <jtornosm@redhat.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-07-28 17:07:03 +09:00
Linus Torvalds
5437f30d34 six smb3 client fixes
-----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmalhJwACgkQiiy9cAdy
 T1GbRgv+NPJ07ZtG7D4EosxCHiBETQS9oezS1Ulbv78YdEBHfP/9T+pYcCh+3qZC
 Sa2HQlB1y3lLZNrhYQrVtyECtVcsdeUloXf6IIczBMAtCeS7FZ0+U8B07+9vJHGz
 9p0paXOkRbOQ2JtYevsRN41Q0HxjvWqHSet/Y2tM8cj0M3yjCPHvJCFv3OC9ZUTV
 AyZZdYFoDFIYmW75459wq/80IADXhkSIsH/8IStTpshVhJbVdyGpr8FTrtW7G0m7
 prYKEzXtgdvzM1CVlfR9boyf5HqUDvcHuV0ZBFjBOx7A3kXiShdRh7PFmDaY1vqX
 o3qgmmjTntX9aRR3zL9GYuayGD8XsXFPotWbuGniKLraX5WJNXe3o8OKybXgivoY
 OEXnkmlyp4GcggmWZpPCqq7J5J+YcLQImCKXxfQI7HjToI9cy7aNZ6qh9g0LIQBm
 9totZcp5AMGk9Sbdf+MUeJ3cx8+3o26kc8a5MCV6fCPt/x7XNKG33ZRd5lne6rxr
 WX4neGG4
 =nzTc
 -----END PGP SIGNATURE-----

Merge tag '6.11-rc-smb-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6

Pull more smb client updates from Steve French:

 - fix for potential null pointer use in init cifs

 - additional dynamic trace points to improve debugging of some common
   scenarios

 - two SMB1 fixes (one addressing reconnect with POSIX extensions, one a
   mount parsing error)

* tag '6.11-rc-smb-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6:
  smb3: add dynamic trace point for session setup key expired failures
  smb3: add four dynamic tracepoints for copy_file_range and reflink
  smb3: add dynamic tracepoint for reflink errors
  cifs: mount with "unix" mount option for SMB1 incorrectly handled
  cifs: fix reconnect with SMB1 UNIX Extensions
  cifs: fix potential null pointer use in destroy_workqueue in init_cifs error path
2024-07-27 20:08:07 -07:00