linux

Author	SHA1	Message	Date
Steven Rostedt	b2821ae68b	trace: fix default boot up tracer Peter Zijlstra started the functionality to start up a default tracer at bootup. This patch finishes the work. Now if you add 'ftrace=<tracer>' to the command line, when that tracer is registered on bootup, that tracer is selected and starts tracing. Note, all selftests for tracers that are registered after this tracer are disabled. This prevents the selftests from disturbing the running tracer, or the running tracer from disturbing the selftest. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-03 06:26:12 +01:00
Ingo Molnar	dc573f9b20	Merge branches 'tracing/ftrace', 'tracing/kmemtrace' and 'linus' into tracing/core	2009-02-03 06:25:38 +01:00
Linus Torvalds	31c952dcf8	Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched_rt: don't use first_cpu on cpumask created with cpumask_and sched: fix buddie group latency sched: clear buddies more aggressively sched: symmetric sync vs avg_overlap sched: fix sync wakeups cpuset: fix possible deadlock in async_rebuild_sched_domains	2009-02-02 19:26:29 -08:00
Eric Dumazet	720eba31f4	modules: Use a better scheme for refcounting Current refcounting for modules (done if CONFIG_MODULE_UNLOAD=y) is using a lot of memory. Each 'struct module' contains an [NR_CPUS] array of full cache lines. This patch uses existing infrastructure (percpu_modalloc() & percpu_modfree()) to allocate percpu space for the refcount storage. Instead of wasting NR_CPUS*128 bytes (on i386), we now use nr_cpu_ids*sizeof(local_t) bytes. On a typical distro, where NR_CPUS=8, shipping 2000 modules, we reduce size of module files by about 2 Mbytes. (1Kb per module) Instead of having all refcounters in the same memory node - with TLB misses because of vmalloc() - this new implementation permits to have better NUMA properties, since each CPU will use storage on its preferred node, thanks to percpu storage. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-02-02 19:17:55 -08:00
Rusty Russell	3d398703ef	sched_rt: don't use first_cpu on cpumask created with cpumask_and cpumask_and() only initializes nr_cpu_ids bits, so the (deprecated) first_cpu() might find one of those uninitialized bits if nr_cpu_ids is less than NR_CPUS (as it can be for CONFIG_CPUMASK_OFFSTACK). Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-01 10:49:52 +01:00
Peter Zijlstra	a571bbeafb	sched: fix buddie group latency Similar to the previous patch, by not clearing buddies we can select entities past their run quota, which can increase latency. This means we have to clear group buddies as well. Do not use the group clear for pick_next_task(), otherwise that'll get O(n^2). Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-01 10:49:51 +01:00
Mike Galbraith	a9f3e2b549	sched: clear buddies more aggressively It was noticed that a task could get re-elected past its run quota due to buddy affinities. This could increase latency a little. Cure it by more aggressively clearing buddy state. We do so in two situations: - when we force preempt - when we select a buddy to run Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-01 10:49:50 +01:00
Peter Zijlstra	1596e29773	sched: symmetric sync vs avg_overlap Reinstate the weakening of the sync hint if set. This yields a more symmetric usage of avg_overlap. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-01 10:49:49 +01:00
Peter Zijlstra	d942fb6c7d	sched: fix sync wakeups Pawel Dziekonski reported that the openssl benchmark and his quantum chemistry application both show slowdowns due to the scheduler under-parallelizing execution. The reason is that pipe wakeups still do 'sync' wakeups, which override the normal buddy wakeup logic - even if waker and wakee are loosely coupled. Fix an inversion of logic in the buddy wakeup code. Reported-by: Pawel Dziekonski <dzieko@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-01 10:49:06 +01:00
Linus Torvalds	1347e965f5	Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: generic-ipi: use per cpu data for single cpu ipi calls cpumask: convert lib/smp_processor_id to new cpumask ops signals, debug: fix BUG: using smp_processor_id() in preemptible code in print_fatal_signal()	2009-01-31 15:55:05 -08:00
Linus Torvalds	ac56b94f80	Merge branch 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: irq: export __set_irq_handler() and handle_level_irq()	2009-01-31 15:54:30 -08:00
Linus Torvalds	5b2d3e6d54	Merge branch 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: hrtimer: prevent negative expiry value after clock_was_set() hrtimers: allow the hot-unplugging of all cpus hrtimers: increase clock min delta threshold while interrupt hanging	2009-01-31 15:54:06 -08:00
Linus Torvalds	f6490438fc	Merge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, ds, bts: cleanup/fix DS configuration ring-buffer: reset timestamps when ring buffer is reset trace: set max latency variable to zero on default trace: stop all recording to ring buffer on ftrace_dump trace: print ftrace_dump at KERN_EMERG log level ring_buffer: reset write when reserve buffer fail tracing/function-graph-tracer: fix a regression while suspend to disk ring-buffer: fix alignment problem	2009-01-31 15:53:30 -08:00
Thomas Gleixner	b0a9b5111a	hrtimer: prevent negative expiry value after clock_was_set() Impact: prevent false positive WARN_ON() in clockevents_program_event() clock_was_set() changes the base->offset of CLOCK_REALTIME and enforces the reprogramming of the clockevent device to expire timers which are based on CLOCK_REALTIME. If the clock change is large enough then the subtraction of the timer expiry value and base->offset can become negative which triggers the warning in clockevents_program_event(). Check the subtraction result and set a negative value to 0. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-01-30 22:35:34 +01:00
Sebastien Dugue	94df7de028	hrtimers: allow the hot-unplugging of all cpus Impact: fix CPU hotplug hang on Power6 testbox On architectures that support offlining all cpus (at least powerpc/pseries), hot-unplugging the tick_do_timer_cpu can result in a system hang. This comes from the fact that if the cpu going down happens to be the cpu doing the tick, then as the tick_do_timer_cpu handover happens after the cpu is dead (via the CPU_DEAD notification), we're left without ticks, jiffies are frozen and any task relying on timers (msleep, ...) is stuck. That's particularly the case for the cpu looping in __cpu_die() waiting for the dying cpu to be dead. This patch addresses this by having the tick_do_timer_cpu handover happen earlier during the CPU_DYING notification. For this, a new clockevent notification type is introduced (CLOCK_EVT_NOTIFY_CPU_DYING) which is triggered in hrtimer_cpu_notify(). Signed-off-by: Sebastien Dugue <sebastien.dugue@bull.net> Cc: <stable@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-30 22:35:29 +01:00
Frederic Weisbecker	7f22391cbe	hrtimers: increase clock min delta threshold while interrupt hanging Impact: avoid timer IRQ hanging slow systems While using the function graph tracer on a virtualized system, the hrtimer_interrupt can hang the system in an infinite loop. This can be caused in several situations: - the hardware is very slow and HZ is set too high - something intrusive is slowing the system down (tracing under emulation) ... and the next clock events to program are always before the current time. This patch implements a reasonable compromise: if such a situation is detected, we share the CPU's time in 1/4 to process the hrtimer interrupts. This is enough to keep the system running without serious starvation. It has been successfully tested under VirtualBox with 1000 HZ and 100 HZ with function graph tracer launched. In both cases, the clock events were increased until about 25 ms periodic ticks, which means 40 HZ. So we change a hard to debug hang into a warning message and a system that still manages to limp along. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-30 22:35:10 +01:00
Steven Rostedt	d7240b9880	generic-ipi: use per cpu data for single cpu ipi calls The smp_call_function can be passed a wait parameter telling it to wait for all the functions running on other CPUs to complete before returning, or to return without waiting. Unfortunately, this is currently just a suggestion and not mandatory. That is, the smp_call_function can decide not to return and wait instead. The reason for this is because it uses kmalloc to allocate storage to send to the called CPU and that CPU will free it when it is done. But if we fail to allocate the storage, the stack is used instead. This means we must wait for the called CPU to finish before continuing. Unfortunately, some callers do not abide by this hint and act as if the non-wait option is mandatory. The MTRR code for instance will deadlock if the smp_call_function is set to wait. This is because the smp_call_function will wait for the other CPUs to finish their called functions, but those functions are waiting on the caller to continue. This patch changes the generic smp_call_function code to use per cpu variables if the allocation of the data fails for a single CPU call. The smp_call_function_many will fall back to the smp_call_function_single if it fails its alloc. The smp_call_function_single is modified to not force the wait state. Since we now are using a single data per cpu we must synchronize the callers to prevent a second caller modifying the data before the first called IPI functions complete. To do so, I added a flag to the call_single_data called CSD_FLAG_LOCK. When the single CPU is called (which can be called when a many call fails an alloc), we set the LOCK bit on this per cpu data. When the caller finishes it clears the LOCK bit. The caller must wait till the LOCK bit is cleared before setting it. When it is cleared, there is no IPI function using it. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Jens Axboe <jens.axboe@oracle.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-30 18:31:08 +01:00
Randy Dunlap	ecf441b593	kmemtrace: fix printk formats, fix Geert Uytterhoeven wrote: > %4zu? Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Acked-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-30 16:12:33 +01:00
Paul Menage	839ec5452e	cgroup: fix root_count when mount fails due to busy subsystem root_count was being incremented in cgroup_get_sb() after all error checking was complete, but decremented in cgroup_kill_sb(), which can be called on a superblock that we gave up on due to an error. This patch changes cgroup_kill_sb() to only decrement root_count if the root was previously linked into the list of roots. Signed-off-by: Paul Menage <menage@google.com> Tested-by: Serge Hallyn <serue@us.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-29 18:04:45 -08:00
Paul Menage	804b3c28a4	cgroups: add cpu_relax() calls in css_tryget() and cgroup_clear_css_refs() css_tryget() and cgroup_clear_css_refs() contain polling loops; these loops should have cpu_relax calls in them to reduce cross-cache traffic. Signed-off-by: Paul Menage <menage@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-29 18:04:45 -08:00
Li Zefan	1404f06565	cgroups: fix lock inconsistency in cgroup_clone() I fixed a bug in cgroup_clone() in Linus' tree in commit `7b574b7` ("cgroups: fix a race between cgroup_clone and umount") without noticing there was a cleanup patch in -mm tree that should be rebased (now commit `104cbd5`, "cgroups: use task_lock() for access tsk->cgroups safe in cgroup_clone()"), thus resulted in lock inconsistency. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Paul Menage <menage@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-29 18:04:45 -08:00
KAMEZAWA Hiroyuki	baef99a08a	cgroups: use hierarchy mutex in creation failure path Now, cgrp->sibling is handled under hierarchy mutex. The error path should do so, too. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Paul Menage <menage@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-29 18:04:43 -08:00
Arnaldo Carvalho de Melo	b3a8c34886	trace_sched_wakeup: Remove unused variable Impact: cleanup Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-29 14:31:03 +01:00
Arnaldo Carvalho de Melo	f04109bf1b	trace: Use tracing_reset_online_cpus in more places Impact: cleanup Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Frédéric Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-29 14:28:31 +01:00
David Daney	97179fd46d	cpumask fallout: Initialize irq_default_affinity earlier Move the initialization of irq_default_affinity to early_irq_init as core_initcall is too late. irq_default_affinity can be used in init_IRQ and potentially timer and SMP init as well. All of these happen before core_initcall. Moving the initialization to early_irq_init ensures that it is initialized before it is used. Signed-off-by: David Daney <ddaney@caviumnetworks.com> Acked-by: Mike Travis <travis@sgi.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-27 16:06:55 -08:00
David Daney	1267a8df20	Make irq_*_affinity depend on CONFIG_GENERIC_HARDIRQS too. In interrupt.h these functions are declared only if CONFIG_GENERIC_HARDIRQS is set. We should define them under identical conditions. Signed-off-by: David Daney <ddaney@caviumnetworks.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-01-27 16:06:49 -08:00
Linus Torvalds	490a8d70cd	Merge branch 'hibern_fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev * 'hibern_fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: SATA PIIX: Blacklist system that spins off disks during ACPI power off SATA Sil: Blacklist system that spins off disks during ACPI power off SATA AHCI: Blacklist system that spins off disks during ACPI power off SATA: Blacklisting of systems that spin off disks during ACPI power off DMI: Introduce dmi_first_match to make the interface more flexible Hibernation: Introduce system_entering_hibernation	2009-01-27 07:50:41 -08:00
Ingo Molnar	4a66a82be7	Merge branches 'tracing/blktrace', 'tracing/kmemtrace' and 'tracing/urgent' into tracing/core	2009-01-27 14:30:57 +01:00
Rafael J. Wysocki	abfe2d7b91	Hibernation: Introduce system_entering_hibernation Introduce boolean function system_entering_hibernation() returning 'true' during the last phase of hibernation, in which devices are being put into low power states and the sleep state (for example, ACPI S4) is finally entered. Some device drivers need such a function to check if the system is in the final phase of hibernation. In particular, some SATA drivers are going to use it for blacklisting systems in which the disks should not be spun down during the last phase of hibernation (the BIOS will do that anyway). Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>	2009-01-27 02:15:45 -05:00
Ed Swierk	3a9f84d354	signals, debug: fix BUG: using smp_processor_id() in preemptible code in print_fatal_signal() With print-fatal-signals=1 on a kernel with CONFIG_PREEMPT=y, sending an unexpected signal to a process causes a BUG: using smp_processor_id() in preemptible code. get_signal_to_deliver() releases the siglock before calling print_fatal_signal(), which calls show_regs(), which calls smp_processor_id(), which is not supposed to be called from a preemptible thread. Make sure show_regs() runs with preemption disabled. Signed-off-by: Ed Swierk <eswierk@aristanetworks.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-27 00:36:19 +01:00
Linus Torvalds	2034563ca3	Merge git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-fixes * git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild-fixes: kbuild: fix kbuild.txt typos kbuild: print usage with no arguments in scripts/config Revert "kbuild: strip generated symbols from *.ko"	2009-01-26 15:10:37 -08:00
Linus Torvalds	37f5fed555	Merge branch 'sh/for-2.6.29' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6 * 'sh/for-2.6.29' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6: (22 commits) dma-coherent: Restore dma_alloc_from_coherent() large alloc fall back policy. dma-coherent: per-device coherent area is in pages, not bytes. sh: fix unaligned and nonexistent address handling nommu: Stub in vm_map_ram()/vm_unmap_ram()/vm_unmap_aliases(). sh: fix sh-sci / early printk build on sh7723 sh: export the sh7343 JPU to user space sh: update defconfigs. serial: sh-sci: Fix up SH7720/SH7721 SCI build. sh: Kill off obsolete busses from arch/sh/Kconfig. sh: sh7785lcr/highlander/hp6xx need linux/irq.h. sh: Migo-R MMC support using spi_gpio and mmc_spi. sh: ap325rxa MMC support using spi_gpio and mmc_spi sh: mach-x3proto: needs linux/irq.h. sh: Drop the BKL from sys_execve() on SH-5. sh: convert rsk7203 to use smsc911x. sh: convert magicpanelr2 platform to use smsc911x. sh: convert ap325rxa platform to use smsc911x. sh: mach-migor: Add tw9910 support. sh: mach-migor: Delete soc_camera_platform setup. sh: mach-migor: Add ov772x support. ...	2009-01-26 10:12:08 -08:00
Linus Torvalds	3386c05bdb	Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: debugobjects: add and use INIT_WORK_ON_STACK rcu: remove duplicate CONFIG_RCU_CPU_STALL_DETECTOR relay: fix lock imbalance in relay_late_setup_files oprofile: fix uninitialized use of struct op_entry rcu: move Kconfig menu softlock: fix false panic which can occur if softlockup_thresh is reduced rcu: add __cpuinit to rcu_init_percpu_data()	2009-01-26 09:47:56 -08:00
Linus Torvalds	1e70c7f7a9	Merge branch 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: hrtimers: fix inconsistent lock state on resume in hres_timers_resume time-sched.c: tick_nohz_update_jiffies should be static locking, hpet: annotate false positive warning kernel/fork.c: unused variable 'ret' itimers: remove the per-cpu-ish-ness	2009-01-26 09:47:43 -08:00
Linus Torvalds	810ee58de2	Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (29 commits) xen: unitialised return value in xenbus_write_transaction x86: fix section mismatch warning x86: unmask CPUID levels on Intel CPUs, fix x86: work around PAGE_KERNEL_WC not getting WC in iomap_atomic_prot_pfn. x86: use standard PIT frequency xen: handle highmem pages correctly when shrinking a domain x86, mm: fix pte_free() xen: actually release memory when shrinking domain x86: unmask CPUID levels on Intel CPUs x86: add MSR_IA32_MISC_ENABLE bits to <asm/msr-index.h> x86: fix PTE corruption issue while mapping RAM using /dev/mem x86: mtrr fix debug boot parameter x86: fix page attribute corruption with cpa() Revert "x86: signal: change type of paramter for sys_rt_sigreturn()" x86: use early clobbers in usercopy*.c x86: remove kernel_physical_mapping_init() from init section fix: crash: IP: __bitmap_intersects+0x48/0x73 cpufreq: use work_on_cpu in acpi-cpufreq.c for drv_read and drv_write work_on_cpu: Use our own workqueue. work_on_cpu: don't try to get_online_cpus() in work_on_cpu. ...	2009-01-26 09:47:28 -08:00
Arnaldo Carvalho de Melo	c71a896154	blktrace: add ftrace plugin Impact: New way of using the blktrace infrastructure This drops the requirement of userspace utilities to use the blktrace facility. Configuration is done thru sysfs, adding a "trace" directory to the partition directory where blktrace can be enabled for the associated request_queue. The same filters present in the IOCTL interface are present as sysfs device attributes. The /sys/block/sdX/sdXN/trace/enable file allows tracing without any filters. The other files in this directory: pid, act_mask, start_lba and end_lba can be used with the same meaning as with the IOCTL interface. Using the sysfs interface will only setup the request_queue->blk_trace fields, tracing will only take place when the "blk" tracer is selected via the ftrace interface, as in the following example: To see the trace, one can use the /d/tracing/trace file or the /d/tracing/trace_pipe file, with semantics defined in the ftrace documentation in Documentation/ftrace.txt. [root@f10-1 ~]# cat /t/trace kjournald-305 [000] 3046.491224: 8,1 A WBS 6367 + 8 <- (8,1) 6304 kjournald-305 [000] 3046.491227: 8,1 Q R 6367 + 8 [kjournald] kjournald-305 [000] 3046.491236: 8,1 G RB 6367 + 8 [kjournald] kjournald-305 [000] 3046.491239: 8,1 P NS [kjournald] kjournald-305 [000] 3046.491242: 8,1 I RBS 6367 + 8 [kjournald] kjournald-305 [000] 3046.491251: 8,1 D WB 6367 + 8 [kjournald] kjournald-305 [000] 3046.491610: 8,1 U WS [kjournald] 1 <idle>-0 [000] 3046.511914: 8,1 C RS 6367 + 8 [6367] [root@f10-1 ~]# The default line context (prefix) format is the one described in the ftrace documentation, with the blktrace specific bits using its existing format, described in blkparse(8). If one wants to have the classic blktrace formatting, this is possible by using: [root@f10-1 ~]# echo blk_classic > /t/trace_options [root@f10-1 ~]# cat /t/trace 8,1 0 3046.491224 305 A WBS 6367 + 8 <- (8,1) 6304 8,1 0 3046.491227 305 Q R 6367 + 8 [kjournald] 8,1 0 3046.491236 305 G RB 6367 + 8 [kjournald] 8,1 0 3046.491239 305 P NS [kjournald] 8,1 0 3046.491242 305 I RBS 6367 + 8 [kjournald] 8,1 0 3046.491251 305 D WB 6367 + 8 [kjournald] 8,1 0 3046.491610 305 U WS [kjournald] 1 8,1 0 3046.511914 0 C RS 6367 + 8 [6367] [root@f10-1 ~]# Using the ftrace standard format allows more flexibility, such as the ability of asking for backtraces via trace_options: [root@f10-1 ~]# echo noblk_classic > /t/trace_options [root@f10-1 ~]# echo stacktrace > /t/trace_options [root@f10-1 ~]# cat /t/trace kjournald-305 [000] 3318.826779: 8,1 A WBS 6375 + 8 <- (8,1) 6312 kjournald-305 [000] 3318.826782: <= submit_bio <= submit_bh <= sync_dirty_buffer <= journal_commit_transaction <= kjournald <= kthread <= child_rip kjournald-305 [000] 3318.826836: 8,1 Q R 6375 + 8 [kjournald] kjournald-305 [000] 3318.826837: <= generic_make_request <= submit_bio <= submit_bh <= sync_dirty_buffer <= journal_commit_transaction <= kjournald <= kthread Please read the ftrace documentation to use additional, standardized tracing filters such as /d/tracing/trace_cpumask, etc. See also /d/tracing/trace_mark to add comments in the trace stream, that is equivalent to the /d/block/sdaN/msg interface. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-26 14:40:53 +01:00
Arnaldo Carvalho de Melo	9011262a37	ftrace: add ftrace_vprintk Impact: new helper function Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-26 14:40:53 +01:00
Randy Dunlap	cc2f6d90e9	kmemtrace: fix printk format warnings Fix kmemtrace printk warnings: kernel/trace/kmemtrace.c:142: warning: format '%4ld' expects type 'long int', but argument 3 has type 'size_t' kernel/trace/kmemtrace.c:147: warning: format '%4ld' expects type 'long int', but argument 3 has type 'size_t' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Acked-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-26 14:03:51 +01:00
Ingo Molnar	5ce1b1ed27	Merge branches 'tracing/ftrace' and 'tracing/function-graph-tracer' into tracing/core	2009-01-26 14:01:52 +01:00
Frederic Weisbecker	9005f3ebeb	tracing/function-graph-tracer: various fixes and features This patch brings various bugfixes: - Drop the first irrelevant task switch on the very beginning of a trace. - Drop the OVERHEAD word from the headers, the DURATION word is sufficient and will not overlap other columns. - Make the headers fit well their respective columns whatever the selected options. Ie, default options: # tracer: function_graph # # CPU DURATION FUNCTION CALLS # | | | | | | | 1) 0.646 us | } 1) | mem_cgroup_del_lru_list() { 1) 0.624 us | lookup_page_cgroup(); 1) 1.970 us | } echo funcgraph-proc > trace_options # tracer: function_graph # # CPU TASK/PID DURATION FUNCTION CALLS # | | | | | | | | | 0) bash-2937 | 0.895 us | } 0) bash-2937 | 0.888 us | __rcu_read_unlock(); 0) bash-2937 | 0.864 us | conv_uni_to_pc(); 0) bash-2937 | 1.015 us | __rcu_read_lock(); echo nofuncgraph-cpu > trace_options echo nofuncgraph-proc > trace_options # tracer: function_graph # # DURATION FUNCTION CALLS # | | | | | | 3.752 us | native_pud_val(); 0.616 us | native_pud_val(); 0.624 us | native_pmd_val(); About features, one can now disable the duration (this will hide the overhead too for convenient reasons and because one doesn't need overhead if it hasn't the duration): echo nofuncgraph-duration > trace_options # tracer: function_graph # # FUNCTION CALLS # | | | | cap_vm_enough_memory() { __vm_enough_memory() { vm_acct_memory(); } } } And at last, an option to print the absolute time: //Restart from default options echo funcgraph-abstime > trace_options # tracer: function_graph # # TIME CPU DURATION FUNCTION CALLS # | | | | | | | | 261.339774 | 1) + 42.823 us | } 261.339775 | 1) 1.045 us | _spin_lock_irq(); 261.339777 | 1) 0.940 us | _spin_lock_irqsave(); 261.339778 | 1) 0.752 us | _spin_unlock_irqrestore(); 261.339780 | 1) 0.857 us | _spin_unlock_irq(); 261.339782 | 1) | flush_to_ldisc() { 261.339783 | 1) | tty_ldisc_ref() { 261.339783 | 1) | tty_ldisc_try() { 261.339784 | 1) 1.075 us | _spin_lock_irqsave(); 261.339786 | 1) 0.842 us | _spin_unlock_irqrestore(); 261.339788 | 1) 4.211 us | } 261.339788 | 1) 5.662 us | } The format is seconds.usecs. I guess no one needs the nanosec precision here, the main goal is to have an overview about the general timings of events, and to see the place when the trace switches from one cpu to another. ie: 274.874760 | 1) 0.676 us | _spin_unlock(); 274.874762 | 1) 0.609 us | native_load_sp0(); 274.874763 | 1) 0.602 us | native_load_tls(); 274.878739 | 0) 0.722 us | } 274.878740 | 0) 0.714 us | native_pmd_val(); 274.878741 | 0) 0.730 us | native_pmd_val(); Here there is a 4000 usecs difference when we switch the cpu. Changes in V2: - Completely fix the first pointless task switch. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-23 11:18:08 +01:00
Steven Rostedt	7e49fcce1b	trace, lockdep: manual preempt count adding for local_bh_disable Impact: fix to preempt trace triggering lockdep check_flag failure In local_bh_disable, the use of add_preempt_count causes the preempt tracer to start recording the time preemption is off. But because it already modified the preempt_count to show softirqs disabled, and before it called the lockdep code to handle this, it causes a state that lockdep can not handle. The preempt tracer will reset the ring buffer on start of a trace, and the ring buffer reset code does a spin_lock_irqsave. This calls into lockdep and lockdep will fail when it detects the invalid state of having softirqs disabled but the internal current->softirqs_enabled is still set. The fix is to manually add the SOFTIRQ_OFFSET to preempt count and call the preempt tracer code outside the lockdep critical area. Thanks to Peter Zijlstra for suggesting this solution. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-23 11:10:57 +01:00
Steven Rostedt	b06a830183	trace: fix logic to start/stop counting The logic in the tracing_start/stop code prevents the WARN_ON from ever detecting if a start/stop pair was mismatched. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-23 11:10:45 +01:00
Steven Rostedt	94523e818f	trace: remove internal irqsoff disabling for trace output Impact: cleanup of duplicate features The trace output disables the ring buffer and prevents tracing to occur. The code in irqsoff to do the same thing is no longer needed. This patch removes it. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-23 11:10:36 +01:00
Steven Rostedt	91a8d07d82	ring-buffer: reset timestamps when ring buffer is reset Impact: fix bad times of recent resets The ring buffer needs to reset its timestamps when the buffer is reset, otherwise the timestamps are stale and might be used to calculate times in the buffer causing funny timestamps to appear. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-22 10:31:58 +01:00
Steven Rostedt	69507c0653	ring-buffer: reset timestamps when ring buffer is reset Impact: fix bad times of recent resets The ring buffer needs to reset its timestamps when the buffer is reset, otherwise the timestamps are stale and might be used to calculate times in the buffer causing funny timestamps to appear. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-22 10:27:54 +01:00
Steven Rostedt	f8ec1062f5	wakeup-tracer: show scheduling data in output Impact: better data for wakeup tracer This patch adds the wakeup and schedule calls that are used by the scheduler tracer to make the wakeup tracer more readable. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-22 10:27:39 +01:00
Steven Rostedt	3244351c31	trace: separate out rt tasks from wakeup tracer Impact: add option to trace all tasks or just RT tasks The current wakeup tracer only traces RT task wakeups. This is fine for those interested in wake up timings of RT tasks, but it is useless for those that are interested in the causes of long wakeups for non RT tasks. This patch creates a "wakeup_rt" to implement the tracing of just RT tasks (as the current "wakeup" does). And makes "wakeup" now trace all tasks as an average developer would expect. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-22 10:27:22 +01:00
Steven Rostedt	97b17efe45	ring-buffer: do not swap if recording is disabled If the ring buffer recording has been disabled, do not let swapping of ring buffers occur. Simply return -EAGAIN. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-22 10:27:16 +01:00
Steven Rostedt	5bc4564b22	trace: do not disable wake up tracer on output of trace Impact: fix to erased trace output To try not to have the outputting of a trace interfere with the wakeup tracer, it would disable tracing while the output was printing. But if a trace had started when it was disabled, it can show a partial trace. To try to solve this, on closing of the tracer, it would clear the trace buffer. The latency tracers (wakeup and irqsoff) have two buffers. One for recording and one for holding the max trace that is printed. The clearing of the trace above should only affect the recording buffer. But for some reason it would move the erased trace to the print buffer. Probably due to a race with the closing of the trace and the saving of the max trace. The above is all pretty useless, and if the user does not want the printing of the trace to be traced itself, then the user can manually disable tracing. This patch removes all the code that tries to keep the output of the tracer from modifying the trace. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-01-22 10:26:50 +01:00
Thomas Gleixner	6552ebae25	Merge branch 'core/debugobjects' into core/urgent	2009-01-22 10:03:02 +01:00
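
Commit b0a9b5111a ("hrtimer: prevent negative expiry value after clock_was_set()") describes checking the result of subtracting base->offset from the timer expiry and setting a negative value to 0. A minimal sketch of that clamp follows. It is not the kernel's code: the type `ns_t` and the function `clamp_expiry` are hypothetical names chosen for illustration, and plain signed 64-bit nanoseconds stand in for the kernel's time type.

```c
#include <stdint.h>

/* Hypothetical stand-in for a kernel time value: signed nanoseconds. */
typedef int64_t ns_t;

/*
 * Illustrative only (not the kernel implementation): subtract the clock
 * base offset from a timer's expiry and clamp a negative result to 0,
 * as the commit message describes.
 */
static ns_t clamp_expiry(ns_t timer_expires, ns_t base_offset)
{
    ns_t expires = timer_expires - base_offset;

    /* A large clock change can push the difference below zero. */
    if (expires < 0)
        expires = 0;

    return expires;
}
```

With this sketch, an expiry of 40 against an offset of 100 yields 0 rather than -60, which is the situation the commit says triggered the warning in clockevents_program_event().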

5987 Commits