Commit Graph

42 Commits

Author SHA1 Message Date
Heiko Carstens
50ab9a9a60 s390/smp,topology: add polarization member to pcpu struct
The cpu polarization member is the only per cpu state that is not part
of the pcpu structure. So add it there and have everything in one place.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:45:09 +02:00
Heiko Carstens
fade4dc491 s390/sysinfo,topology: fix cpu topology maximum nesting detection
The maximum nesting of the cpu topology is evaluated when /proc/sysinfo
is the first time read. This happens without a lock and a concurrent
reader on a different cpu can see and use an invalid intermediate value.
Besides the fact that this race is quite unlikely the worst thing that
could happen is that /proc/sysinfo would contain bogus information about
the machine's cpu topology.
Nevertheless this should be fixed. So move the detection code to the
early machine detection code and since now the value is early available
use it in the topology code as well.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:45:08 +02:00
Heiko Carstens
7860913279 s390/topology: remove sysinfo header include, add forward declaration instead
Any change to sysinfo.h causes a whole kernel recompile since sysinfo.h is
included by topology.h, which again is used nearly everywhere.
So remove that include and add a forward declaration instead.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:45:06 +02:00
Heiko Carstens
a53c8fab3f s390/comments: unify copyright messages and remove file names
Remove the file name from the comment at top of many files. In most
cases the file name was wrong anyway, so it's rather pointless.

Also unify the IBM copyright statement. We did have a lot of sightly
different statements and wanted to change them one after another
whenever a file gets touched. However that never happened. Instead
people start to take the old/"wrong" statements to use as a template
for new files.
So unify all of them in one go.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2012-07-20 11:15:04 +02:00
Martin Schwidefsky
8b646bd759 [S390] rework smp code
Define struct pcpu and merge some of the NR_CPUS arrays into it, including
__cpu_logical_map, current_set and smp_cpu_state. Split smp related
functions to those operating on physical cpus and the functions operating
on a logical cpu number. Make the functions for physical cpus use a
pointer to a struct pcpu. This hides the knowledge about cpu addresses in
smp.c, entry[64].S and swsusp_asm64.S, thus remove the sigp.h header.

The PSW restart mechanism is used to start secondary cpus, calling a
function on an online cpu, calling a function on the ipl cpu, and for
the nmi signal. Replace the different assembler functions with a
single function restart_int_handler. The new entry point calls a function
whose pointer is stored in the lowcore of the target cpu and it can wait
for the source cpu to stop. This covers all existing use cases.

Overall the code is now simpler and there are ~380 lines less code.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-03-11 11:59:28 -04:00
Linus Torvalds
72f318897e Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (31 commits)
  [S390] disassembler: mark exception causing instructions
  [S390] Enable exception traces by default
  [S390] return address of compat signals
  [S390] sysctl: get rid of dead declaration
  [S390] dasd: fix fixpoint divide exception in define_extent
  [S390] dasd: add sanity check to detect path connection error
  [S390] qdio: fix kernel panic for zfcp 31-bit
  [S390] Add s390x description to Documentation/kdump/kdump.txt
  [S390] Add VMCOREINFO_SYMBOL(high_memory) to vmcoreinfo
  [S390] dasd: fix expiration handling for recovery requests
  [S390] outstanding interrupts vs. smp_send_stop
  [S390] ipc: call generic sys_ipc demultiplexer
  [S390] zcrypt: Fix error return codes.
  [S390] zcrypt: Rework length parameter checking.
  [S390] cleanup trap handling
  [S390] Remove Kerntypes leftovers
  [S390] topology: increase poll frequency if change is anticipated
  [S390] entry[64].S improvements
  [S390] make arch/s390 subdirectories depend on config option
  [S390] kvm: move cmf host id constant out of lowcore
  ...

Fix up conflicts in arch/s390/kernel/{smp.c,topology.c} due to the
sysdev removal clashing with "topology: get rid of ifdefs" which moved
some of that code around.
2012-01-09 08:11:13 -08:00
Greg Kroah-Hartman
ff4b8a57f0 Merge branch 'driver-core-next' into Linux 3.2
This resolves the conflict in the arch/arm/mach-s3c64xx/s3c6400.c file,
and it fixes the build error in the arch/x86/kernel/microcode_core.c
file, that the merge did not catch.

The microcode_core.c patch was provided by Stephen Rothwell
<sfr@canb.auug.org.au> who was invaluable in the merge issues involved
with the large sysdev removal process in the driver-core tree.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2012-01-06 11:42:52 -08:00
Heiko Carstens
d68bddb732 [S390] topology: increase poll frequency if change is anticipated
Increase cpu topology change poll frequency if a change is anticipated.
Otherwise a user might be a bit confused to have to wait up to a minute
in order to see a change this should be visible immediatly.
However there is no guarantee that the change will happen during the
time frame the poll frequency is increased.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-12-27 11:27:12 +01:00
Heiko Carstens
4baeb964d9 [S390] topology: cleanup z10 topology handling
Cleanup z10 topology handling. This adds some more code but hopefully
the result is more readable and easier to maintain.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-12-27 11:27:11 +01:00
Heiko Carstens
83a24e3290 [S390] topology: get rid of ifdefs
Remove all ifdefs from topology code and also only compile it for the
CONFIG_SCHED_BOOK case. The new code selects SCHED_MC if SCHED_BOOK is
selected. SCHED_MC without SCHED_BOOK is not possible anymore.
Furthermore various sysfs attributes are not available anymore for the
!SCHED_BOOK case. In particular all attributes that correspond to
CPU polarization.
But since all real world kernels have SCHED_BOOK selected anyway this
doesn't matter too much.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-12-27 11:27:10 +01:00
Kay Sievers
8a25a2fd12 cpu: convert 'cpu' and 'machinecheck' sysdev_class to a regular subsystem
This moves the 'cpu sysdev_class' over to a regular 'cpu' subsystem
and converts the devices to regular devices. The sysdev drivers are
implemented as subsystem interfaces now.

After all sysdev classes are ported to regular driver core entities, the
sysdev implementation will be entirely removed from the kernel.

Userspace relies on events and generic sysfs subsystem infrastructure
from sysdev devices, which are made available with this conversion.

Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Borislav Petkov <bp@amd64.org>
Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk>
Cc: Len Brown <lenb@kernel.org>
Cc: Zhang Rui <rui.zhang@intel.com>
Cc: Dave Jones <davej@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-12-21 14:29:42 -08:00
Heiko Carstens
f6bf1a8acd [S390] topology: fix topology on z10 machines
Make sure that all cpus in a book on a z10 appear as book siblings
and not as core siblings. This fixes some performance regressions that
appeared after the book scheduling domain got introduced.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-11-14 11:19:09 +01:00
Sebastian Ott
caa04f69df [S390] topology: fix alloc_masks annotation
Fix this warning:
WARNING: vmlinux.o(.text+0x199b6): Section mismatch in reference from
the function alloc_masks() to the function .init.text:__alloc_bootmem()

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-10-30 15:16:15 +01:00
Heiko Carstens
d7b250e2a2 [S390] irq: merge irq.c and s390_ext.c
Merge irq.c and s390_ext.c into irq.c. That way all external interrupt
related functions are together.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-05-26 09:48:24 +02:00
KOSAKI Motohiro
0f1959f506 [S390] convert old cpumask API into new one
Adapt new API.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2011-05-23 10:24:31 +02:00
Heiko Carstens
0b52783d4f [S390] topology: fix cpu masks for topology=off case
Fix cpu masks for 'topology=off' case. Folding of the scheduling domains
happen in such a way that everything belongs to the MC domain instead
of the CPU doimain.
This should fix a performance regression introduced with
eafd2b6d "[S390] topology: use default MC domain initializer" and also
makes sure we have the same behavious as if CONFIG_SCHED_MC was not
selected at all.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2010-10-29 16:50:50 +02:00
Heiko Carstens
96f4a70d8e [S390] topology: export cpu topology via proc/sysinfo
Export the cpu configuration topology via sysinfo. Two new lines are
introduced:

CPU Topology HW:      0 0 0 4 6 4
CPU Topology SW:      0 0 0 0 4 24

The HW line describes the cpu topology nesting levels when the maximum
nesting level is used to get the corresponding SYSIB.
The SW line describes what Linux is actually using. In this case it
supports only two levels (CONFIG_SCHED_BOOK off) and therefore the
hardware folded the two lower levels in the SYSIB response block.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2010-10-25 16:10:21 +02:00
Heiko Carstens
c30f91b6a2 [S390] topology: move topology sysinfo code
Move the topology sysinfo SYSIB definitions to the proper place in
asm/sysinfo.h where they should be.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2010-10-25 16:10:21 +02:00
Heiko Carstens
9186d7a9cf [S390] topology: clean up facility detection
Move cpu topology facility detection to early setup code where it
should be.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2010-10-25 16:10:21 +02:00
Martin Schwidefsky
14375bc4eb [S390] cleanup facility list handling
Store the facility list once at system startup with stfl/stfle and
reuse the result for all facility tests.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2010-10-25 16:10:21 +02:00
Heiko Carstens
c9af3fa9e1 [S390] topology: change default
Switch default value of the kernel parameter 'topology' from off to on.
Various performance measurements have finally shown that there are no
(known) regressions anywhere.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2010-10-25 16:10:20 +02:00
Heiko Carstens
4cb14bc8c5 topology, s390: Add z11 cpu topology support
Use the extended cpu topology information that z11 machines provide
to improve the scheduler's decision making.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20100831082844.604956770@de.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-09-09 20:41:55 +02:00
Heiko Carstens
10d3858950 [S390] topology: expose core identifier
Provide a topology_core_id define which makes sure that the contents of
/sys/devices/system/cpu/cpuX/topology/core_id
indeed do contain the core id and not always 0.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2010-05-17 10:00:16 +02:00
Julia Lawall
d7015c120e [S390] arch/s390/kernel: Add missing unlock
In the default case the lock is not unlocked.  The return is
converted to a goto, to share the unlock at the end of the function.

A simplified version of the semantic patch that finds this problem is as
follows: (http://coccinelle.lip6.fr/)

// <smpl>
@r exists@
expression E1;
identifier f;
@@

f (...) { <+...
* spin_lock_irq (E1,...);
... when != E1
* return ...;
...+> }
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2010-04-09 13:43:01 +02:00
Heiko Carstens
fb380aadfe [S390] Move __cpu_logical_map to smp.c
Finally move it to the place where it belongs to and make get rid of
it for !CONFIG_SMP.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2010-01-13 20:44:45 +01:00
Rusty Russell
6f7a321d5f [S390] cpumask: remove cpu_coregroup_map
Impact: cleanup

cpu_coregroup_mask is the New Hotness.

As S/390 uses theirs internally, so we just make it static.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2009-03-26 15:24:32 +01:00
Rusty Russell
33edcf133b Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 2008-12-30 08:02:35 +10:30
Linus Torvalds
1db2a5c11e Merge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6
* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6: (85 commits)
  [S390] provide documentation for hvc_iucv kernel parameter.
  [S390] convert ctcm printks to dev_xxx and pr_xxx macros.
  [S390] convert zfcp printks to pr_xxx macros.
  [S390] convert vmlogrdr printks to pr_xxx macros.
  [S390] convert zfcp dumper printks to pr_xxx macros.
  [S390] convert cpu related printks to pr_xxx macros.
  [S390] convert qeth printks to dev_xxx and pr_xxx macros.
  [S390] convert sclp printks to pr_xxx macros.
  [S390] convert iucv printks to dev_xxx and pr_xxx macros.
  [S390] convert ap_bus printks to pr_xxx macros.
  [S390] convert dcssblk and extmem printks messages to pr_xxx macros.
  [S390] convert monwriter printks to pr_xxx macros.
  [S390] convert s390 debug feature printks to pr_xxx macros.
  [S390] convert monreader printks to pr_xxx macros.
  [S390] convert appldata printks to pr_xxx macros.
  [S390] convert setup printks to pr_xxx macros.
  [S390] convert hypfs printks to pr_xxx macros.
  [S390] convert time printks to pr_xxx macros.
  [S390] convert cpacf printks to pr_xxx macros.
  [S390] convert cio printks to pr_xxx macros.
  ...
2008-12-28 12:33:21 -08:00
Rusty Russell
9be3eec2c8 cpumask: cpu_coregroup_mask(): s390
Like cpu_coregroup_map, but returns a (const) pointer.

Compile-tested on s390 (defconfig).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Mike Travis <travis@sgi.com>
2008-12-26 22:23:42 +10:30
Martin Schwidefsky
395d31d40c [S390] convert cpu related printks to pr_xxx macros.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:25 +01:00
Heiko Carstens
349f1b671a [S390] cpu topology: remove dead code
Interrupts haven't been implemented. So remove the dead code.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:15 +01:00
Heiko Carstens
2b1a61f0a8 [S390] cpu topology: introduce kernel parameter
Introduce a topology=[on|off] kernel parameter which allows to switch
cpu topology on/off. Default will be off, since it looks like that for
some workloards this doesn't behave very well (on s390).

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:14 +01:00
Heiko Carstens
f414f5f153 [S390] cpu topology: dont destroy cpu sets on topology change
Call rebuild_sched_domains instead of arch_reinit_sched_domains if
cpu topology changes. This leaves cpu sets alone which otherwise would
be destroyed.
If and how it makes sense to define cpu sets on a virtualized
architecture is another question.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:38:07 +01:00
Heiko Carstens
5439050f9f [S390] cpu topology: fix cpu_core_map initialization
Common code doesn't call arch_update_cpu_topology() anymore on
cpu hotplug. But our architecture backend relied on that in order to
update the cpu_core_map. For machines without cpu topology support
this leads uninitialized cpu_core_maps for later on added cpus.

To solve this just initialize the maps with cpu_possible_map, since
that will be always valid for machines without topology support.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:38:07 +01:00
Heiko Carstens
ee79d1bdb6 sched: let arch_update_cpu_topology indicate if topology changed
Change arch_update_cpu_topology so it returns 1 if the cpu topology changed
and 0 if it didn't change. This will be useful for the next patch which adds
a call to this function in partition_sched_domains.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-12-12 13:47:21 +01:00
Heiko Carstens
74af283102 [S390] cpu topology: fix locking
cpu_coregroup_map used to grab a mutex on s390 since it was only
called from process context.
Since c7c22e4d5c "block: add support
for IO CPU affinity" this is not true anymore.
It now also gets called from softirq context.

To prevent possible deadlocks change this in architecture code and
use a spinlock instead of a mutex.

Cc: stable@kernel.org
Cc: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-11-14 18:18:54 +01:00
Oleg Nesterov
69b895fd13 S390 topology: don't use kthread() for arch_reinit_sched_domains()
Now that it is safe to use get_online_cpus() we can revert

	[S390] cpu topology: Fix possible deadlock.
	commit: fd781fa25c

and call arch_reinit_sched_domains() directly from topology_work_fn().

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Gautham R Shenoy <ego@in.ibm.com>
Tested-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Max Krasnyansky <maxk@qualcomm.com>
Cc: Paul Jackson <pj@sgi.com>
Cc: Paul Menage <menage@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Vegard Nossum <vegard.nossum@gmail.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-07-25 10:53:40 -07:00
Julia Lawall
402a3998ba [S390] arch/s390: Eliminate NULL test and memset after alloc_bootmem
As noted by Akinobu Mita in patch b1fceac2b9,
alloc_bootmem and related functions never return NULL and always return a
zeroed region of memory.  Thus a NULL test or memset after calls to these
functions is unnecessary.

 arch/s390/kernel/topology.c |    2 --
 1 file changed, 2 deletions(-)

This was fixed using the following semantic patch.
(http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@@
expression E;
statement S;
@@

E = \(alloc_bootmem\|alloc_bootmem_low\|alloc_bootmem_pages\|alloc_bootmem_low_pages\)(...)
... when != E
(
- BUG_ON (E == NULL);
|
- if (E == NULL) S
)

@@
expression E,E1;
@@

E = \(alloc_bootmem\|alloc_bootmem_low\|alloc_bootmem_pages\|alloc_bootmem_low_pages\)(...)
... when != E
- memset(E,0,E1);
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-07-14 10:02:14 +02:00
Heiko Carstens
fd781fa25c [S390] cpu topology: Fix possible deadlock.
When we get a notification that cpu topology changed, we schedule a
work struct which just calls arch_reinit_sched_domains. This function
in turn calls get_online_cpus() which results int the lockdep warning
below.

After all it turnded out that it's not legal to call get_online_cpus()
from the context of a multi-threaded work queue.
It could deadlock this way:

process 0 (events/cpu-x):
-> run_workqueue
-> removes my work_struct from the work queue
-> calls work_struct->fn
-> get_online_cpus()
-> locks on cpu_hotplug.lock since process 1 below is doing cpu hotplug

process 1:
-> cpu_down (for cpu-x)
-> cpu_hotplug_begin (holds cpu_hotplug.lock now)
-> cpu-x dead
-> notifier_call_chain with CPU_DEAD
-> cleanup_workqueue_thread
-> flush_cpu_workqueue (succeeds)
-> kthread_stop for events/cpu-x
  -> now kthread_stop waits for my work_struct to complete from within
     process 0. -> dead.

A single threaded workqueue wouldn't have such problems, however there is
no such common queue available and it's not worth to create one for the
very rare calls to arch_reinit_sched_domains.

So we just create a kernel thread from our work struct which calls
arch_reinit_sched_domains and are done with it.

Thanks to Oleg Nesterov and Peter Zijlstra for helping me figuring out
that this isn't a false positive lockdep warning:

=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.25-03562-g3dc5063-dirty #12
-------------------------------------------------------
events/3/14 is trying to acquire lock:
 (&cpu_hotplug.lock){--..}, at: [<0000000000076094>] get_online_cpus+0x50/0x78

but task is already holding lock:
 (topology_work){--..}, at: [<0000000000059cde>] run_workqueue+0x106/0x278

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (topology_work){--..}:
       [<000000000006fc74>] __lock_acquire+0x1010/0x111c
       [<000000000006fe40>] lock_acquire+0xc0/0xf8
       [<0000000000059d48>] run_workqueue+0x170/0x278
       [<0000000000059edc>] worker_thread+0x8c/0xf0
       [<000000000005f5bc>] kthread+0x68/0xa0
       [<000000000001a33e>] kernel_thread_starter+0x6/0xc
       [<000000000001a338>] kernel_thread_starter+0x0/0xc

-> #1 (events){--..}:
       [<000000000006fc74>] __lock_acquire+0x1010/0x111c
       [<000000000006fe40>] lock_acquire+0xc0/0xf8
       [<000000000005a23c>] cleanup_workqueue_thread+0x60/0xa8
       [<00000000003b2ab8>] workqueue_cpu_callback+0xbc/0x170
       [<00000000003bba80>] notifier_call_chain+0x5c/0xa4
       [<00000000000655a2>] __raw_notifier_call_chain+0x26/0x38
       [<00000000000655e2>] raw_notifier_call_chain+0x2e/0x40
       [<0000000000075e00>] cpu_down+0x228/0x31c
       [<00000000003b1dd8>] store_online+0x64/0xb8
       [<00000000001e7128>] sysdev_store+0x48/0x58
       [<0000000000121cd2>] sysfs_write_file+0x126/0x1c0
       [<00000000000c1944>] vfs_write+0xb0/0x15c
       [<00000000000c20e6>] sys_write+0x56/0x88
       [<0000000000027a68>] sys32_write+0x34/0x4c
       [<0000000000023f70>] sysc_noemu+0x10/0x16
       [<0000000077f3f186>] 0x77f3f186

-> #0 (&cpu_hotplug.lock){--..}:
       [<000000000006fa84>] __lock_acquire+0xe20/0x111c
       [<000000000006fe40>] lock_acquire+0xc0/0xf8
       [<00000000003b701c>] mutex_lock_nested+0xd0/0x364
       [<0000000000076094>] get_online_cpus+0x50/0x78
       [<000000000003a03e>] arch_reinit_sched_domains+0x26/0x58
       [<000000000002700e>] topology_work_fn+0x26/0x34
       [<0000000000059d4e>] run_workqueue+0x176/0x278
       [<0000000000059edc>] worker_thread+0x8c/0xf0
       [<000000000005f5bc>] kthread+0x68/0xa0
       [<000000000001a33e>] kernel_thread_starter+0x6/0xc
       [<000000000001a338>] kernel_thread_starter+0x0/0xc

other info that might help us debug this:

2 locks held by events/3/14:
 #0:  (events){--..}, at: [<0000000000059cde>] run_workqueue+0x106/0x278
 #1:  (topology_work){--..}, at: [<0000000000059cde>] run_workqueue+0x106/0x278

stack backtrace:
CPU: 3 Not tainted 2.6.25-03562-g3dc5063-dirty #12
Process events/3 (pid: 14, task: 000000002fb04038, ksp: 000000002fb0bd70)
0400000000000000 000000002fb0ba40 0000000000000002 0000000000000000
       000000002fb0bae0 000000002fb0ba58 000000002fb0ba58 0000000000016488
       0000000000000000 000000002fb0bd70 0000000000000000 0000000000000000
       000000002fb0ba40 000000000000000c 000000002fb0ba40 000000002fb0bab0
       00000000003c99e0 0000000000016488 000000002fb0ba40 000000002fb0ba90
Call Trace:
([<00000000000163fc>] show_trace+0x138/0x158)
 [<00000000000164e2>] show_stack+0xc6/0xf8
 [<0000000000016624>] dump_stack+0xb0/0xc0
 [<000000000006cd36>] print_circular_bug_tail+0xa2/0xb4
 [<000000000006fa84>] __lock_acquire+0xe20/0x111c
 [<000000000006fe40>] lock_acquire+0xc0/0xf8
 [<00000000003b701c>] mutex_lock_nested+0xd0/0x364
 [<0000000000076094>] get_online_cpus+0x50/0x78
 [<000000000003a03e>] arch_reinit_sched_domains+0x26/0x58
 [<000000000002700e>] topology_work_fn+0x26/0x34
 [<0000000000059d4e>] run_workqueue+0x176/0x278
 [<0000000000059edc>] worker_thread+0x8c/0xf0
 [<000000000005f5bc>] kthread+0x68/0xa0
 [<000000000001a33e>] kernel_thread_starter+0x6/0xc
 [<000000000001a338>] kernel_thread_starter+0x0/0xc
INFO: lockdep is turned off.

Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-04-30 13:38:45 +02:00
Heiko Carstens
d00aa4e7d0 [S390] Add topology_core_siblings to topology.h
This exposes the core siblings to user space via sysfs.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2008-04-30 13:38:45 +02:00
Heiko Carstens
c10fde0d9e [S390] Vertical cpu management.
If vertical cpu polarization is active then the hypervisor will
dispatch certain cpus for a longer time than other cpus for maximum
performance. For example if a guest would have three virtual cpus,
each of them with a share of 33 percent, then in case of vertical
cpu polarization all of the processing time would be combined to a
single cpu which would run all the time, while the other two cpus
would get nearly no cpu time.

There are three different types of vertical cpus: high, medium and
low. Low cpus hardly get any real cpu time, while high cpus get a
full real cpu. Medium cpus get something in between.

In order to switch between the two possible modes (default is
horizontal) a 0 for horizontal polarization or a 1 for vertical
polarization must be written to the dispatching sysfs attribute:

/sys/devices/system/cpu/dispatching

The polarization of each single cpu can be figured out by the
polarization sysfs attribute of each cpu:

/sys/devices/system/cpu/cpuX/polarization

horizontal, vertical:high, vertical:medium, vertical:low or unknown.

When switching polarization the polarization attribute may contain
the value unknown until the configuration change is done and the
kernel has figured out the new polarization of each cpu.

Note that running a system with different types of vertical cpus may
result in significant performance regressions. If possible only one
type of vertical cpus should be used. All other cpus should be
offlined.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2008-04-17 07:47:01 +02:00
Heiko Carstens
dbd70fb499 [S390] cpu topology support for s390.
Add s390 backend so we can give the scheduler some hints about the
cpu topology.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2008-04-17 07:47:01 +02:00