Commit Graph

43942 Commits

Author SHA1 Message Date
Linus Torvalds
f7fc06e3a4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lrg/voltage-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lrg/voltage-2.6: (42 commits)
  regulator: Fix _regulator_get_voltage if get_voltage callback is NULL
  USB: TWL6025 allow different regulator name
  REGULATOR: TWL6025: add support to twl-regulator
  regulator: twl6030: do not write to _GRP for regulator disable
  regulator: twl6030: do not write to _GRP for regulator enable
  TPS65911: Comparator: Add comparator driver
  TPS65911: Add support for added GPIO lines
  GPIO: TPS65910: Move driver to drivers/gpio/
  TPS65911: Add new irq definitions
  regulator: tps65911: Add new chip version
  MFD: TPS65910: Add support for TPS65911 device
  regulator: Fix off-by-one value range checking for mc13xxx_regulator_get_voltage
  regulator: mc13892: Fix voltage unit in test case.
  regulator: Remove MAX8997_REG_BUCK1DVS/MAX8997_REG_BUCK2DVS/MAX8997_REG_BUCK5DVS macros
  mfd: Fix off-by-one value range checking for tps65910_i2c_write
  regulator: Only apply voltage constraints from consumers that set them
  regulator: If we can't configure optimum mode we're always in the best one
  regulator: max8997: remove useless code
  regulator: Fix memory leak in max8998_pmic_probe failure path
  regulator: Fix desc_id for tps65023/6507x/65910
  ...
2011-05-27 10:13:01 -07:00
Linus Torvalds
ea0ca3a843 Merge git://git.infradead.org/battery-2.6
* git://git.infradead.org/battery-2.6:
  PXA: Use dev_pm_ops in z2_battery
  ds2760_battery: Fix rated capacity of the hx4700 1800mAh battery
  ds2760_battery: Fix indexing of the 4 active full EEPROM registers
  power: Make test_power driver more dynamic.
  bq27x00_battery: Name of cycle count property
  max8903_charger: Add GENERIC_HARDIRQS as a dependency (fixes S390 build)
  ARM: RX-51: Enable isp1704 power on/off
  isp1704_charger: Allow board specific powering routine
  gpio-charger: Add gpio_charger_resume
  power_supply: Add driver for MAX8903 charger
2011-05-27 10:12:35 -07:00
Jorge Eduardo Candelaria
6851ad3ab3 TPS65911: Comparator: Add comparator driver
This driver adds functionality to the tps65911 chip driver.

Two of the comparators are configurable by software and measures
VCCS voltage to detect high or low voltage scenarios.

Signed-off-by: Jorge Eduardo Candelaria <jedu@slimlogic.co.uk>
Acked-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
2011-05-27 10:49:29 +01:00
Jorge Eduardo Candelaria
11ad14f86a TPS65911: Add support for added GPIO lines
GPIO 1 to 8 are added for TPS65911 chip version. The gpio driver
now handles more than one gpio lines. Subsequent versions of the
chip family can add new GPIO lines with minimal driver changes.

Signed-off-by: Jorge Eduardo Candelaria <jedu@slimlogic.co.uk>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
2011-05-27 10:49:29 +01:00
Jorge Eduardo Candelaria
a2974732ca TPS65911: Add new irq definitions
TPS65911 adds new interrupt sources, as well as two new registers
to handle them, one for interrupt status and one for interrupt
masking. The added irqs are:

-VMBCH2 - Low and High threshold
-GPIO1-8 - Rising and falling edge detection
-WTCHDG - Watchdog interrupt
-PWRDN	- PWRDN reset interrupt

The code should handle these new registers only when the chip
version is TPS65911.

Signed-off-by: Jorge Eduardo Candelaria <jedu@slimlogic.co.uk>
Acked-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
2011-05-27 10:49:10 +01:00
Jorge Eduardo Candelaria
a320e3c3d6 regulator: tps65911: Add new chip version
The tps65911 chip introduces new features, including changes in
the regulator module.

- VDD1 and VDD2 remain unchanged.
- VDD3 is now named VDDCTRL and has a wider voltage range.
- LDOs are now named LDO1...8 and voltage ranges are sequential,
  making LDOs easier to handle.

Signed-off-by: Jorge Eduardo Candelaria <jedu@slimlogic.co.uk>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
2011-05-27 10:49:10 +01:00
Jorge Eduardo Candelaria
795570561c MFD: TPS65910: Add support for TPS65911 device
The TPS65911 is the next generation of the TPS65910 family of
PMIC chips. It adds a few features:

- Watchdog Timer
- PWM & LED generators
- Comparators for system control status

It also adds a set of Interrupts and GPIOs, among other things.

The driver exports a function to identify between different
versions of the tps65910 family, allowing other modules to
identify the capabilities of the current chip.

Signed-off-by: Jorge Eduardo Candelaria <jedu@slimlogic.co.uk>
Acked-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
2011-05-27 10:49:10 +01:00
Axel Lin
ecb9c4f595 regulator: Remove MAX8997_REG_BUCK1DVS/MAX8997_REG_BUCK2DVS/MAX8997_REG_BUCK5DVS macros
In current implementation, the original macro implementation assumes the caller
pass the parameter starting from 1 (to match the register names in datasheet).
Thus we have unneeded plus one then minus one operations
when using MAX8997_REG_BUCK1DVS/MAX8997_REG_BUCK2DVS/MAX8997_REG_BUCK5DVS macros.

This patch removes these macros to avoid unneeded plus one then minus one operations
without reducing readability.

Signed-off-by: Axel Lin <axel.lin@gmail.com>
Acked-by: Kyungmin Park <kyungmin.park@samsung.com>
Acked-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
2011-05-27 10:49:10 +01:00
Graeme Gregory
518fb721de TPS65910: Add tps65910 regulator driver
The regulator module consists of 3 DCDCs and 8 LDOs. The output
voltages are configurable and are meant to supply power to the
main processor and other components

Signed-off-by: Graeme Gregory <gg@slimlogic.co.uk>
Signed-off-by: Jorge Eduardo Candelaria <jedu@slimlogic.co.uk>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
2011-05-27 10:49:08 +01:00
Graeme Gregory
e3471bdc27 TPS65910: IRQ: Add interrupt controller
This module controls the interrupt handling for the tps chip. The
interrupt sources are the following:

- GPIO falling/rising edge detection
- Battery voltage below/above threshold
- PWRON signal
- PWRHOLD signal
- Temperature detection
- RTC alarm and periodic event

Signed-off-by: Graeme Gregory <gg@slimlogic.co.uk>
Signed-off-by: Jorge Eduardo Candelaria <jedu@slimlogic.co.uk>
Reviewed-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Acked-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
2011-05-27 10:48:43 +01:00
Graeme Gregory
2537df722d TPS65910: GPIO: Add GPIO driver
TPS65910 has one configurable GPIO that can be used for several
purposes. Subsequent versions of the TPS chip support more than
one GPIO.

Signed-off-by: Graeme Gregory <gg@slimlogic.co.uk>
Signed-off-by: Jorge Eduardo Candelaria <jedu@slimlogic.co.uk>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
2011-05-27 10:48:23 +01:00
Graeme Gregory
27c6750ec5 MFD: TPS65910: Add new mfd device for TPS65910
The TPS65910 chip is a power management IC for multimedia and handheld
devices. It contains the following components:

- Regulators
- GPIO controller
- RTC

The tps65910 core driver is registered as a platform driver and provides
communication through I2C with the host device for the different
components.

Signed-off-by: Graeme Gregory <gg@slimlogic.co.uk>
Signed-off-by: Jorge Eduardo Candelaria <jedu@slimlogic.co.uk>
Acked-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
2011-05-27 10:35:22 +01:00
Mark Brown
bf5892a816 regulator: Support voltage offsets to compensate for drops in system
Some systems, particularly physically large systems used for early
prototyping, may experience substantial voltage drops between the regulator
and the consumers as a result of long traces in the system. With these
systems voltages may need to be set higher than requested in order to
ensure reliable system operation.

Allow systems to work around such hardware issues by allowing constraints
to supply an offset to be applied to any requested and reported voltages.
This is not ideal, especially since the voltage drop may be load dependant,
but is sufficient for most affected systems, it is not expected to be used
in production hardware. The offset is applied after all constraint
processing so constraints should be specified in terms of consumer values
not physically configured values.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
2011-05-27 10:34:37 +01:00
Mark Brown
492c826b9f regulator: Remove supply_regulator_dev from machine configuration
supply_regulator_dev (using a struct pointer) has been deprecated in favour
of supply_regulator (using a regulator name) for quite a few releases
now with a warning generated if it is used and there are no current in tree
users so just remove the code.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
2011-05-27 10:34:37 +01:00
Akinobu Mita
63e424c844 arch: remove CONFIG_GENERIC_FIND_{NEXT_BIT,BIT_LE,LAST_BIT}
By the previous style change, CONFIG_GENERIC_FIND_NEXT_BIT,
CONFIG_GENERIC_FIND_BIT_LE, and CONFIG_GENERIC_FIND_LAST_BIT are not used
to test for existence of find bitops anymore.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Greg Ungerer <gerg@uclinux.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-26 17:12:38 -07:00
Akinobu Mita
19de85ef57 bitops: add #ifndef for each of find bitops
The style that we normally use in asm-generic is to test the macro itself
for existence, so in asm-generic, do:

	#ifndef find_next_zero_bit_le
	extern unsigned long find_next_zero_bit_le(const void *addr,
		unsigned long size, unsigned long offset);
	#endif

and in the architectures, write

	static inline unsigned long find_next_zero_bit_le(const void *addr,
		unsigned long size, unsigned long offset)
	#define find_next_zero_bit_le find_next_zero_bit_le

This adds the #ifndef for each of the find bitops in the generic header
and source files.

Suggested-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Greg Ungerer <gerg@uclinux.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-26 17:12:38 -07:00
Sisir Koppaka
26498e89e8 pid: fix typo in function description
finds is misspelt as finr. No functional change.

Signed-off-by: Sisir Koppaka <sisir.koppaka@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-26 17:12:37 -07:00
Alexey Dobriyan
074127367a ipmi: convert to seq_file interface
The ->read_proc interface is going away, convert to seq_file.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc:Corey Minyard <minyard@acm.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-26 17:12:37 -07:00
Olaf Hering
997c136f51 fs/proc/vmcore.c: add hook to read_from_oldmem() to check for non-ram pages
The balloon driver in a Xen guest frees guest pages and marks them as
mmio.  When the kernel crashes and the crash kernel attempts to read the
oldmem via /proc/vmcore a read from ballooned pages will generate 100%
load in dom0 because Xen asks qemu-dm for the page content.  Since the
reads come in as 8byte requests each ballooned page is tried 512 times.

With this change a hook can be registered which checks wether the given
pfn is really ram.  The hook has to return a value > 0 for ram pages, a
value < 0 on error (because the hypercall is not known) and 0 for non-ram
pages.

This will reduce the time to read /proc/vmcore.  Without this change a
512M guest with 128M crashkernel region needs 200 seconds to read it, with
this change it takes just 2 seconds.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-26 17:12:37 -07:00
Jiri Slaby
3864601387 mm: extract exe_file handling from procfs
Setup and cleanup of mm_struct->exe_file is currently done in fs/proc/.
This was because exe_file was needed only for /proc/<pid>/exe.  Since we
will need the exe_file functionality also for core dumps (so core name can
contain full binary path), built this functionality always into the
kernel.

To achieve that move that out of proc FS to the kernel/ where in fact it
should belong.  By doing that we can make dup_mm_exe_file static.  Also we
can drop linux/proc_fs.h inclusion in fs/exec.c and kernel/fork.c.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-26 17:12:36 -07:00
Mike Frysinger
edeafa74e6 asm-generic/ptrace.h: start a common low level ptrace helper
This is a series of low level ptrace unification steps to make it easier
for common code (like KGDB) to poke at register state.  This also avoids
having to duplicate higher level operations for most ports which don't
have special needs for accessing things.

This patch:

This implements a bunch of helper funcs for poking the registers of a
ptrace structure.  Now common code should be able to portably update
specific registers (like kgdb updating the PC).

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Sergei Shtylyov <sshtylyov@mvista.com>
Cc: Dongdong Deng <dongdong.deng@windriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-26 17:12:36 -07:00
Ying Han
456f998ec8 memcg: add the pagefault count into memcg stats
Two new stats in per-memcg memory.stat which tracks the number of page
faults and number of major page faults.

  "pgfault"
  "pgmajfault"

They are different from "pgpgin"/"pgpgout" stat which count number of
pages charged/discharged to the cgroup and have no meaning of reading/
writing page to disk.

It is valuable to track the two stats for both measuring application's
performance as well as the efficiency of the kernel page reclaim path.
Counting pagefaults per process is useful, but we also need the aggregated
value since processes are monitored and controlled in cgroup basis in
memcg.

Functional test: check the total number of pgfault/pgmajfault of all
memcgs and compare with global vmstat value:

  $ cat /proc/vmstat | grep fault
  pgfault 1070751
  pgmajfault 553

  $ cat /dev/cgroup/memory.stat | grep fault
  pgfault 1071138
  pgmajfault 553
  total_pgfault 1071142
  total_pgmajfault 553

  $ cat /dev/cgroup/A/memory.stat | grep fault
  pgfault 199
  pgmajfault 0
  total_pgfault 199
  total_pgmajfault 0

Performance test: run page fault test(pft) wit 16 thread on faulting in
15G anon pages in 16G container.  There is no regression noticed on the
"flt/cpu/s"

Sample output from pft:

  TAG pft:anon-sys-default:
    Gb  Thr CLine   User     System     Wall    flt/cpu/s fault/wsec
    15   16   1     0.67s   233.41s    14.76s   16798.546 266356.260

  +-------------------------------------------------------------------------+
      N           Min           Max        Median           Avg        Stddev
  x  10     16682.962     17344.027     16913.524     16928.812      166.5362
  +  10     16695.568     16923.896     16820.604     16824.652     84.816568
  No difference proven at 95.0% confidence

[akpm@linux-foundation.org: fix build]
[hughd@google.com: shmem fix]
Signed-off-by: Ying Han <yinghan@google.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-26 17:12:36 -07:00
Ying Han
1bac180bd2 memcg: rename mem_cgroup_zone_nr_pages() to mem_cgroup_zone_nr_lru_pages()
The caller of the function has been renamed to zone_nr_lru_pages(), and
this is just fixing up in the memcg code.  The current name is easily to
be mis-read as zone's total number of pages.

Signed-off-by: Ying Han <yinghan@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-26 17:12:35 -07:00
KAMEZAWA Hiroyuki
246e87a939 memcg: fix get_scan_count() for small targets
During memory reclaim we determine the number of pages to be scanned per
zone as

	(anon + file) >> priority.
Assume
	scan = (anon + file) >> priority.

If scan < SWAP_CLUSTER_MAX, the scan will be skipped for this time and
priority gets higher.  This has some problems.

  1. This increases priority as 1 without any scan.
     To do scan in this priority, amount of pages should be larger than 512M.
     If pages>>priority < SWAP_CLUSTER_MAX, it's recorded and scan will be
     batched, later. (But we lose 1 priority.)
     If memory size is below 16M, pages >> priority is 0 and no scan in
     DEF_PRIORITY forever.

  2. If zone->all_unreclaimabe==true, it's scanned only when priority==0.
     So, x86's ZONE_DMA will never be recoverred until the user of pages
     frees memory by itself.

  3. With memcg, the limit of memory can be small. When using small memcg,
     it gets priority < DEF_PRIORITY-2 very easily and need to call
     wait_iff_congested().
     For doing scan before priorty=9, 64MB of memory should be used.

Then, this patch tries to scan SWAP_CLUSTER_MAX of pages in force...when

  1. the target is enough small.
  2. it's kswapd or memcg reclaim.

Then we can avoid rapid priority drop and may be able to recover
all_unreclaimable in a small zones.  And this patch removes nr_saved_scan.
 This will allow scanning in this priority even when pages >> priority is
very small.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Ying Han <yinghan@google.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-26 17:12:35 -07:00
Ying Han
889976dbcb memcg: reclaim memory from nodes in round-robin order
Presently, memory cgroup's direct reclaim frees memory from the current
node.  But this has some troubles.  Usually when a set of threads works in
a cooperative way, they tend to operate on the same node.  So if they hit
limits under memcg they will reclaim memory from themselves, damaging the
active working set.

For example, assume 2 node system which has Node 0 and Node 1 and a memcg
which has 1G limit.  After some work, file cache remains and the usages
are

   Node 0:  1M
   Node 1:  998M.

and run an application on Node 0, it will eat its foot before freeing
unnecessary file caches.

This patch adds round-robin for NUMA and adds equal pressure to each node.
When using cpuset's spread memory feature, this will work very well.

But yes, a better algorithm is needed.

[akpm@linux-foundation.org: comment editing]
[kamezawa.hiroyu@jp.fujitsu.com: fix time comparisons]
Signed-off-by: Ying Han <yinghan@google.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Mel Gorman <mel@csn.ul.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-26 17:12:35 -07:00
Ying Han
0ae5e89c60 memcg: count the soft_limit reclaim in global background reclaim
The global kswapd scans per-zone LRU and reclaims pages regardless of the
cgroup. It breaks memory isolation since one cgroup can end up reclaiming
pages from another cgroup. Instead we should rely on memcg-aware target
reclaim including per-memcg kswapd and soft_limit hierarchical reclaim under
memory pressure.

In the global background reclaim, we do soft reclaim before scanning the
per-zone LRU. However, the return value is ignored. This patch is the first
step to skip shrink_zone() if soft_limit reclaim does enough work.

This is part of the effort which tries to reduce reclaiming pages in global
LRU in memcg. The per-memcg background reclaim patchset further enhances the
per-cgroup targetting reclaim, which I should have V4 posted shortly.

Try running multiple memory intensive workloads within seperate memcgs. Watch
the counters of soft_steal in memory.stat.

  $ cat /dev/cgroup/A/memory.stat | grep 'soft'
  soft_steal 240000
  soft_scan 240000
  total_soft_steal 240000
  total_soft_scan 240000

This patch:

In the global background reclaim, we do soft reclaim before scanning the
per-zone LRU.  However, the return value is ignored.

We would like to skip shrink_zone() if soft_limit reclaim does enough
work.  Also, we need to make the memory pressure balanced across per-memcg
zones, like the logic vm-core.  This patch is the first step where we
start with counting the nr_scanned and nr_reclaimed from soft_limit
reclaim into the global scan_control.

Signed-off-by: Ying Han <yinghan@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Balbir Singh <balbir@in.ibm.com>
Acked-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-26 17:12:35 -07:00
Andrew Morton
f042e707ee mm: move enum vm_event_item into a standalone header file
enums are problematic because they cannot be forward-declared:

  akpm2:/home/akpm> cat t.c

  enum foo;

  static inline void bar(enum foo f)
  {
  }
  akpm2:/home/akpm> gcc -c t.c
  t.c:4: error: parameter 1 ('f') has incomplete type

So move the enum's definition into a standalone header file which can be used
wherever its definition is needed.

Cc: Ying Han <yinghan@google.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-26 17:12:34 -07:00
Daniel Lezcano
a77aea9201 cgroup: remove the ns_cgroup
The ns_cgroup is an annoying cgroup at the namespace / cgroup frontier and
leads to some problems:

  * cgroup creation is out-of-control
  * cgroup name can conflict when pids are looping
  * it is not possible to have a single process handling a lot of
    namespaces without falling in a exponential creation time
  * we may want to create a namespace without creating a cgroup

  The ns_cgroup was replaced by a compatibility flag 'clone_children',
  where a newly created cgroup will copy the parent cgroup values.
  The userspace has to manually create a cgroup and add a task to
  the 'tasks' file.

This patch removes the ns_cgroup as suggested in the following thread:

https://lists.linux-foundation.org/pipermail/containers/2009-June/018616.html

The 'cgroup_clone' function is removed because it is no longer used.

This is a userspace-visible change.  Commit 45531757b4 ("cgroup: notify
ns_cgroup deprecated") (merged into 2.6.27) caused the kernel to emit a
printk warning users that the feature is planned for removal.  Since that
time we have heard from XXX users who were affected by this.

Signed-off-by: Daniel Lezcano <daniel.lezcano@free.fr>
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Jamal Hadi Salim <hadi@cyberus.ca>
Reviewed-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Paul Menage <menage@google.com>
Acked-by: Matt Helsley <matthltc@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-26 17:12:34 -07:00
Ben Blum
f780bdb7c1 cgroups: add per-thread subsystem callbacks
Add cgroup subsystem callbacks for per-thread attachment in atomic contexts

Add can_attach_task(), pre_attach(), and attach_task() as new callbacks
for cgroups's subsystem interface.  Unlike can_attach and attach, these
are for per-thread operations, to be called potentially many times when
attaching an entire threadgroup.

Also, the old "bool threadgroup" interface is removed, as replaced by
this.  All subsystems are modified for the new interface - of note is
cpuset, which requires from/to nodemasks for attach to be globally scoped
(though per-cpuset would work too) to persist from its pre_attach to
attach_task and attach.

This is a pre-patch for cgroup-procs-writable.patch.

Signed-off-by: Ben Blum <bblum@andrew.cmu.edu>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Matt Helsley <matthltc@us.ibm.com>
Reviewed-by: Paul Menage <menage@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-26 17:12:34 -07:00
Ben Blum
4714d1d32d cgroups: read-write lock CLONE_THREAD forking per threadgroup
Adds functionality to read/write lock CLONE_THREAD fork()ing per-threadgroup

Add an rwsem that lives in a threadgroup's signal_struct that's taken for
reading in the fork path, under CONFIG_CGROUPS.  If another part of the
kernel later wants to use such a locking mechanism, the CONFIG_CGROUPS
ifdefs should be changed to a higher-up flag that CGROUPS and the other
system would both depend on.

This is a pre-patch for cgroup-procs-write.patch.

Signed-off-by: Ben Blum <bblum@andrew.cmu.edu>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Matt Helsley <matthltc@us.ibm.com>
Reviewed-by: Paul Menage <menage@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-26 17:12:34 -07:00
Wolfram Sang
9796cc964d drivers/rtc/rtc-mxc.c: remove defines already included in rtc.h
[akpm@linux-foundation.org: retain the code comments]
Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Cc: Vladimir Zapolskiy <vzapolskiy@gmail.com>
Cc: Alessandro Zummo <alessandro.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-26 17:12:33 -07:00
Jesse Gross
704f15ddb5 flex_array: avoid divisions when accessing elements
On most architectures division is an expensive operation and accessing an
element currently requires four of them.  This performance penalty
effectively precludes flex arrays from being used on any kind of fast
path.  However, two of these divisions can be handled at creation time and
the others can be replaced by a reciprocal divide, completely avoiding
real divisions on access.

[eparis@redhat.com: rebase on top of changes to support 0 len elements]
[eparis@redhat.com: initialize part_nr when array fits entirely in base]
Signed-off-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-26 17:12:33 -07:00
Linus Torvalds
fce637e392 Merge branches 'core-fixes-for-linus' and 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  seqlock: Get rid of SEQLOCK_UNLOCKED

* 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  irq: Remove smp_affinity_list when unregister irq proc
2011-05-26 12:19:11 -07:00
Linus Torvalds
8b29336fe0 Merge branch 'gpio/next' of git://git.secretlab.ca/git/linux-2.6
* 'gpio/next' of git://git.secretlab.ca/git/linux-2.6:
  gpio/via: rename VIA local config struct
  basic_mmio_gpio: split into a gpio library and platform device
  gpio: remove some legacy comments in build files
  gpio: add trace events for setting direction and value
  gpio/pca953x: Use handle_simple_irq instead of handle_edge_irq
  gpiolib: export gpiochip_find
  gpio: remove redundant Kconfig depends on GPIOLIB
  basic_mmio_gpio: convert to non-__raw* accessors
  basic_mmio_gpio: support direction registers
  basic_mmio_gpio: support different input/output registers
  basic_mmio_gpio: detect output method at probe time
  basic_mmio_gpio: request register regions
  basic_mmio_gpio: allow overriding number of gpio
  basic_mmio_gpio: convert to platform_{get,set}_drvdata()
  basic_mmio_gpio: remove runtime width/endianness evaluation
2011-05-26 12:14:41 -07:00
Linus Torvalds
9f1912c48c Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6: (57 commits)
  regulator: Fix 88pm8607.c printk format warning
  input: Add support for Qualcomm PMIC8XXX power key
  input: Add Qualcomm pm8xxx keypad controller driver
  mfd: Add omap-usbhs runtime PM support
  mfd: Fix ASIC3 SD Host Controller Configuration size
  mfd: Fix omap_usbhs_alloc_children error handling
  mfd: Fix omap usbhs crash when rmmoding ehci or ohci
  mfd: Add ASIC3 LED support
  leds: Add ASIC3 LED support
  mfd: Update twl4030-code maintainer e-mail address
  mfd: Correct the name and bitmask for ab8500-gpadc BTempPullUp
  mfd: Add manual ab8500-gpadc batt temp activation for AB8500 3.0
  mfd: Provide ab8500-core enumerators for chip cuts
  mfd: Check twl4030-power remove script error condition after i2cwrite
  mfd: Fix twl6030 irq definitions
  mfd: Add phoenix lite (twl6025) support to twl6030
  mfd: Avoid to use constraint name in 88pm860x regulator driver
  mfd: Remove checking on max8925 regulator[0]
  mfd: Remove unused parameter from 88pm860x API
  mfd: Avoid to allocate 88pm860x static platform data
  ...
2011-05-26 12:14:20 -07:00
Linus Torvalds
4c171acc20 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  RDMA/cma: Save PID of ID's owner
  RDMA/cma: Add support for netlink statistics export
  RDMA/cma: Pass QP type into rdma_create_id()
  RDMA: Update exported headers list
  RDMA/cma: Export enum cma_state in <rdma/rdma_cm.h>
  RDMA/nes: Add a check for strict_strtoul()
  RDMA/cxgb3: Don't post zero-byte read if endpoint is going away
  RDMA/cxgb4: Use completion objects for event blocking
  IB/srp: Fix integer -> pointer cast warnings
  IB: Add devnode methods to cm_class and umad_class
  IB/mad: Return EPROTONOSUPPORT when an RDMA device lacks the QP required
  IB/uverbs: Add devnode method to set path/mode
  RDMA/ucma: Add .nodename/.mode to tell userspace where to create device node
  RDMA: Add netlink infrastructure
  RDMA: Add error handling to ib_core_init()
2011-05-26 12:13:57 -07:00
Linus Torvalds
20e0ec119b Merge branch 'spi/next' of git://git.secretlab.ca/git/linux-2.6
* 'spi/next' of git://git.secretlab.ca/git/linux-2.6:
  spi/amba-pl022: work in polling or interrupt mode if pl022_dma_probe fails
  spi/spi_s3c24xx: Use spi_bitbang_stop instead of spi_unregister_master in s3c24xx_spi_remove
  spi/spi_nuc900: Use spi_bitbang_stop instead of spi_unregister_master in nuc900_spi_remove
  spi/spi_tegra: use spi_unregister_master() instead of spi_master_put()
  spi/spi_sh: use spi_unregister_master instead of spi_master_put in remove path
  spi: Use void pointers for data in simple SPI I/O operations
  spi/pl022: use cpu_relax in the busy loop
  spi/pl022: mark driver non-experimental
  spi/pl022: timeout on polled transfer v2
  spi/dw_spi: improve the interrupt mode with the batch ops
  spi/dw_spi: change poll mode transfer from byte ops to batch ops
  spi/dw_spi: remove the un-necessary flush()
  spi/dw_spi: unify the low level read/write routines
2011-05-26 12:13:22 -07:00
Linus Torvalds
6ddb4518c7 Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc/4xx: Adding PCIe MSI support
  powerpc: Fix irq_free_virt by adjusting bounds before loop
  powerpc/irq: Protect irq_radix_revmap_lookup against irq_free_virt
  powerpc/irq: Check desc in handle_one_irq and expand generic_handle_irq
  powerpc/irq: Always free duplicate IRQ_LEGACY hosts
  powerpc/irq: Remove stale and misleading comment
  powerpc/cell: Rename ipi functions to match current abstractions
  powerpc/cell: Use common smp ipi actions
  Remove unused MSG_ flags in linux/smp.h
  powerpc/pseries: Update MAX_HCALL_OPCODE to reflect page coalescing
  powerpc/oprofile: Handle events that raise an exception without overflowing
  powerpc/ftrace: Implement raw syscall tracepoints on PowerPC
2011-05-26 12:11:17 -07:00
Linus Torvalds
be93d8cfba Fix build with !HUGETLBFS
I stupidly broke the case of CONFIG_HUGETLBFS=n when doing the
conversion to vm_flags_t in commit ca16d140af ("mm: don't access
vm_flags as 'int'").  And my 'allyesconfig' build didn't find it, for
obvious reasons..

Include <linux/mm_types.h> in <linux/hugetlb.h>.  The problem could have
been avoided by just turning the hugetlb_file_setup() error wrapper into
a macro, but mm_types.h is a reasonable include in this file.

Reported-by: Richard -rw- Weinberger <richard.weinberger@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-26 12:03:50 -07:00
Linus Torvalds
f8d613e2a6 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/djm/tmem
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/djm/tmem:
  xen: cleancache shim to Xen Transcendent Memory
  ocfs2: add cleancache support
  ext4: add cleancache support
  btrfs: add cleancache support
  ext3: add cleancache support
  mm/fs: add hooks to support cleancache
  mm: cleancache core ops functions and config
  fs: add field to superblock to support cleancache
  mm/fs: cleancache documentation

Fix up trivial conflict in fs/btrfs/extent_io.c due to includes
2011-05-26 10:50:56 -07:00
Trilok Soni
92d57a73e4 input: Add support for Qualcomm PMIC8XXX power key
Add support for PMIC8XXX power key driven over dedicated
KYPD_PWR_N pin.

Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Trilok Soni <tsoni@codeaurora.org>
Signed-off-by: Anirudh Ghayal <aghayal@codeaurora.org>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2011-05-26 19:45:54 +02:00
Trilok Soni
39325b59d8 input: Add Qualcomm pm8xxx keypad controller driver
Add Qualcomm PMIC8XXX based keypad controller driver
supporting upto 18x8 matrix configuration.

Acked-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Trilok Soni <tsoni@codeaurora.org>
Signed-off-by: Anirudh Ghayal <aghayal@codeaurora.org>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2011-05-26 19:45:53 +02:00
Paul Parsons
74e32d1b68 mfd: Fix ASIC3 SD Host Controller Configuration size
The size of the TC6380AF SD Host Controller Configuration area is 0x200 bytes (assuming registers are aligned on 32-bit boundaries), not 0x400 bytes. Source: Toshiba TC6380AF Specification sections 4.2 and 4.3.1

Signed-off-by: Paul Parsons <lost.distance@yahoo.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2011-05-26 19:45:51 +02:00
Paul Parsons
7d9e7e9fbd leds: Add ASIC3 LED support
Add LED support for the HTC ASIC3. Underlying support is provided by the mfd/asic3 and leds/leds-asic3 drivers. An example configuration is provided by the pxa/hx4700 platform.

Signed-off-by: Paul Parsons <lost.distance@yahoo.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2011-05-26 19:45:46 +02:00
Peter Ujfalusi
4a7c00cd94 mfd: Update twl4030-code maintainer e-mail address
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2011-05-26 19:45:45 +02:00
Linus Walleij
863dde5bfa mfd: Provide ab8500-core enumerators for chip cuts
Since functionality in MFD cells may need to be adjusted according to
chip revision, let's enumerate them and keep track of them.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2011-05-26 19:45:42 +02:00
Graeme Gregory
6523b148b4 mfd: Fix twl6030 irq definitions
The charger fault IRQs from the twl will in future patches be handled
by a seperate IRQ handler in the charger driver than the general charger
IRQ. Give them different IRQ numbers now to allow the charger driver to
be merged in the future.

Signed-off-by: Graeme Gregory <gg@slimlogic.co.uk>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2011-05-26 19:45:40 +02:00
Graeme Gregory
521d8ec3f0 mfd: Add phoenix lite (twl6025) support to twl6030
Phoenix Lite is based on the twl6030 family of PMICs. It has mostly the
same feature set of twl6030 but with small changes. The codec block has
also been removed. It also has a new charger block and new features in
its ADC block. VUSB handling also differs.

Signed-off-by: Graeme Gregory <gg@slimlogic.co.uk>
Reviewed-by: Mark Brown <broonie@opensource.wolfsonicro.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2011-05-26 19:45:39 +02:00
Haojian Zhuang
008b30408c mfd: Add rtc support to 88pm860x
Enable rtc function in 88pm860x PMIC.

Signed-off-by: Haojian Zhuang <haojian.zhuang@marvell.com>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2011-05-26 19:45:34 +02:00
Abhijeet Dharmapurikar
c013f0a56c mfd: Add pm8xxx irq support
Add support for the irq controller in Qualcomm 8xxx pmic. The 8xxx
interrupt controller provides control for gpio and mpp configured as
interrupts in addition to other subdevice interrupts. The interrupt
controller also provides a way to read the real time status of an
interrupt. This real time status is the only way one can get the
input values of gpio and mpp lines.

Signed-off-by: Abhijeet Dharmapurikar <adharmap@codeaurora.org>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2011-05-26 19:45:28 +02:00