So we can avoid export memblock_reserved_init_regions()
Suggested by Ben.
-v2: use __init_memblock attribute
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
This is a wrapper for memblock_find_base() using slightly different
arguments (start,end instead of start,size for example) in order to
make it easier to convert existing arch/x86 code.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Arch code can define ARCH_DISCARD_MEMBLOCK in asm/memblock.h,
which in turns causes memblock code and data to go respectively
into the .init and .initdata sections. This will be used by the
x86 architecture.
If ARCH_DISCARD_MEMBLOCK is defined, the debugfs files to inspect
the memblock arrays after boot are not created.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This should make it easier to catch/debug incorrect use when
the CONFIG_ option isn't set.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
And ensure we don't hand out 0 as a valid allocation. We put the
low limit at PAGE_SIZE arbitrarily.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
will used by x86 memblock_x86_find_in_range_node and nobootmem replacement
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This exposes memblock_debug and associated memblock_dbg() macro,
along with memblock_can_resize so that x86 can use these when
ported to use memblock
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The former is now strict, it will fail if it cannot honor the allocation
within the node, while the later implements the previous semantic which
falls back to allocating anywhere.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We now provide a default (weak) implementation of memblock_nid_range()
which uses the early_pfn_map[] if CONFIG_ARCH_POPULATES_NODE_MAP
is set. Sparc still needs to use its own method due to the way
the pages can be scattered between nodes.
This implementation is inefficient due to our main algorithm and
callback construct wanting to work on an ascending addresses bases
while early_pfn_map[] would rather work with nid's (it's unsorted
at that stage). But it should work and we can look into improving
it subsequently, possibly using arch compile options to chose a
different algorithm alltogether.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Some archs such as ARM want to avoid coalescing accross things such
as the lowmem/highmem boundary or similar. This provides the option
to control it via an arch callback for which a weak default is provided
which always allows coalescing.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This is in preparation for having resizable arrays.
Note that we still allocate one more than needed, this is unchanged from
the previous implementation.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Right now, both the "memory" and "reserved" memblock_type structures have
a "size" member. It represents the calculated memory size in the former
case and is unused in the latter.
This moves it out to the main memblock structure instead
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Let's not waste space and cycles on archs that don't support >32-bit
physical address space.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The RMA (RMO is a misnomer) is a concept specific to ppc64 (in fact
server ppc64 though I hijack it on embedded ppc64 for similar purposes)
and represents the area of memory that can be accessed in real mode
(aka with MMU off), or on embedded, from the exception vectors (which
is bolted in the TLB) which pretty much boils down to the same thing.
We take that out of the generic MEMBLOCK data structure and move it into
arch/powerpc where it belongs, renaming it to "RMA" while at it.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This introduce memblock.current_limit which is used to limit allocations
from memblock_alloc() or memblock_alloc_base(..., MEMBLOCK_ALLOC_ACCESSIBLE).
The old MEMBLOCK_ALLOC_ANYWHERE changes value from 0 to ~(u64)0 and can still
be used with memblock_alloc_base() to allocate really anywhere.
It is -no-longer- cropped to MEMBLOCK_REAL_LIMIT which disappears.
Note to archs: I'm leaving the default limit to MEMBLOCK_ALLOC_ANYWHERE. I
strongly recommend that you ensure that you set an appropriate limit
during boot in order to guarantee that an memblock_alloc() at any time
results in something that is accessible with a simple __va().
The reason is that a subsequent patch will introduce the ability for
the array to resize itself by reallocating itself. The MEMBLOCK core will
honor the current limit when performing those allocations.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Walk memblock's using for_each_memblock() and use memblock_region_base/end_pfn() for
getting to PFNs.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To make it fast, we steal ARM's binary search for memblock_is_memory()
and we use that to also the replace existing implementation of
memblock_is_reserved().
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: (22 commits)
9p: fix sparse warnings in new xattr code
fs/9p: remove sparse warning in vfs_inode
fs/9p: destroy fid on failed remove
fs/9p: Prevent parallel rename when doing fid_lookup
fs/9p: Add support user. xattr
net/9p: Implement TXATTRCREATE 9p call
net/9p: Implement attrwalk 9p call
9p: Implement LOPEN
fs/9p: This patch implements TLCREATE for 9p2000.L protocol.
9p: Implement TMKDIR
9p: Implement TMKNOD
9p: Define and implement TSYMLINK for 9P2000.L
9p: Define and implement TLINK for 9P2000.L
9p: Define and implement TLINK for 9P2000.L
9p: Implement client side of setattr for 9P2000.L protocol.
9p: getattr client implementation for 9P2000.L protocol.
fs/9p: Pass the correct user credentials during attach
net/9p: Handle the server returned error properly
9p: readdir implementation for 9p2000.L
9p: Make use of iounit for read/write
...
* 'devel' of master.kernel.org:/home/rmk/linux-2.6-arm: (291 commits)
ARM: AMBA: Add pclk support to AMBA bus infrastructure
ARM: 6278/2: fix regression in RealView after the introduction of pclk
ARM: 6277/1: mach-shmobile: Allow users to select HZ, default to 128
ARM: 6276/1: mach-shmobile: remove duplicate NR_IRQS_LEGACY
ARM: 6246/1: mmci: support larger MMCIDATALENGTH register
ARM: 6245/1: mmci: enable hardware flow control on Ux500 variants
ARM: 6244/1: mmci: add variant data and default MCICLOCK support
ARM: 6243/1: mmci: pass power_mode to the translate_vdd callback
ARM: 6274/1: add global control registers definition header file for nuc900
mx2_camera: fix type of dma buffer virtual address pointer
mx2_camera: Add soc_camera support for i.MX25/i.MX27
arm/imx/gpio: add spinlock protection
ARM: Add support for the LPC32XX arch
ARM: LPC32XX: Arch config menu supoport and makefiles
ARM: LPC32XX: Phytec 3250 platform support
ARM: LPC32XX: Misc support functions
ARM: LPC32XX: Serial support code
ARM: LPC32XX: System suspend support
ARM: LPC32XX: GPIO, timer, and IRQ drivers
ARM: LPC32XX: Clock driver
...
TXATTRCREATE: Prepare a fid for setting xattr value on a file system object.
size[4] TXATTRCREATE tag[2] fid[4] name[s] attr_size[8] flags[4]
size[4] RXATTRCREATE tag[2]
txattrcreate gets a fid pointing to xattr. This fid can later be
used to set the xattr value.
flag value is derived from set Linux setxattr. The manpage says
"The flags parameter can be used to refine the semantics of the operation.
XATTR_CREATE specifies a pure create, which fails if the named attribute
exists already. XATTR_REPLACE specifies a pure replace operation, which
fails if the named attribute does not already exist. By default (no flags),
the extended attribute will be created if need be, or will simply replace
the value if the attribute exists."
The actual setxattr operation happens when the fid is clunked. At that point
the written byte count and the attr_size specified in TXATTRCREATE should be
same otherwise an error will be returned.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
TXATTRWALK: Descend a ATTR namespace
size[4] TXATTRWALK tag[2] fid[4] newfid[4] name[s]
size[4] RXATTRWALK tag[2] size[8]
txattrwalk gets a fid pointing to xattr. This fid can later be
used to read the xattr value. If name is NULL the fid returned
can be used to get the list of extended attribute associated to
the file system object.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Implement 9p2000.L version of open(LOPEN) interface in 9p client.
For LOPEN, no need to convert the flags to and from 9p mode to VFS mode.
Synopsis:
size[4] Tlopen tag[2] fid[4] mode[4]
size[4] Rlopen tag[2] qid[13] iounit[4]
[Fix mode bit format - jvrao@linux.vnet.ibm.com]
Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbegren <ericvh@gmail.com>
SYNOPSIS
size[4] Tlcreate tag[2] fid[4] name[s] flags[4] mode[4] gid[4]
size[4] Rlcreate tag[2] qid[13] iounit[4]
DESCRIPTION
The Tlreate request asks the file server to create a new regular file with the
name supplied, in the directory (dir) represented by fid.
The mode argument specifies the permissions to use. New file is created with
the uid if the fid and with supplied gid.
The flags argument represent Linux access mode flags with which the caller
is requesting to open the file with. Protocol allows all the Linux access
modes but it is upto the server to allow/disallow any of these acess modes.
If the server doesn't support any of the access mode, it is expected to
return error.
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Implement TMKDIR as part of 2000.L Work
Synopsis
size[4] Tmkdir tag[2] fid[4] name[s] mode[4] gid[4]
size[4] Rmkdir tag[2] qid[13]
Description
mkdir asks the file server to create a directory with given name,
mode and gid. The qid for the new directory is returned with
the mkdir reply message.
Note: 72 is selected as the opcode for TMKDIR from the reserved list.
Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Synopsis
size[4] Tmknod tag[2] fid[4] name[s] mode[4] major[4] minor[4] gid[4]
size[4] Rmknod tag[2] qid[13]
Description
mknod asks the file server to create a device node with given major and
minor number, mode and gid. The qid for the new device node is returned
with the mknod reply message.
[sripathik@in.ibm.com: Fix error handling code]
Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Create a symbolic link
SYNOPSIS
size[4] Tsymlink tag[2] fid[4] name[s] symtgt[s] gid[4]
size[4] Rsymlink tag[2] qid[13]
DESCRIPTION
Create a symbolic link named 'name' pointing to 'symtgt'.
gid represents the effective group id of the caller.
The permissions of a symbolic link are irrelevant hence it is omitted
from the protocol.
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Reviewed-by: Sripathi Kodi <sripathik@in.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
This patch adds a helper function to get the dentry from inode and
uses it in creating a Hardlink
SYNOPSIS
size[4] Tlink tag[2] dfid[4] oldfid[4] newpath[s]
size[4] Rlink tag[2]
DESCRIPTION
Create a link 'newpath' in directory pointed by dfid linking to oldfid path.
[sripathik@in.ibm.com : p9_client_link should not free req structure
if p9_client_rpc has returned an error.]
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
SYNOPSIS
size[4] Tsetattr tag[2] attr[n]
size[4] Rsetattr tag[2]
DESCRIPTION
The setattr command changes some of the file status information.
attr resembles the iattr structure used in Linux kernel. It
specifies which status parameter is to be changed and to what
value. It is laid out as follows:
valid[4]
specifies which status information is to be changed. Possible
values are:
ATTR_MODE (1 << 0)
ATTR_UID (1 << 1)
ATTR_GID (1 << 2)
ATTR_SIZE (1 << 3)
ATTR_ATIME (1 << 4)
ATTR_MTIME (1 << 5)
ATTR_ATIME_SET (1 << 7)
ATTR_MTIME_SET (1 << 8)
The last two bits represent whether the time information
is being sent by the client's user space. In the absense
of these bits the server always uses server's time.
mode[4]
File permission bits
uid[4]
Owner id of file
gid[4]
Group id of the file
size[8]
File size
atime_sec[8]
Time of last file access, seconds
atime_nsec[8]
Time of last file access, nanoseconds
mtime_sec[8]
Time of last file modification, seconds
mtime_nsec[8]
Time of last file modification, nanoseconds
Explanation of the patches:
--------------------------
*) The kernel just copies relevent contents of iattr structure to
p9_iattr_dotl structure and passes it down to the client. The
only check it has is calling inode_change_ok()
*) The p9_iattr_dotl structure does not have ctime and ia_file
parameters because I don't think these are needed in our case.
The client user space can request updating just ctime by calling
chown(fd, -1, -1). This is handled on server side without a need
for putting ctime on the wire.
*) The server currently supports changing mode, time, ownership and
size of the file.
*) 9P RFC says "Either all the changes in wstat request happen, or
none of them does: if the request succeeds, all changes were made;
if it fails, none were."
I have not done anything to implement this specifically because I
don't see a reason.
Signed-off-by: Sripathi Kodi <sripathik@in.ibm.com>
Signed-off-by: Venkateswararao Jujjuri <jvrao@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
SYNOPSIS
size[4] Tgetattr tag[2] fid[4] request_mask[8]
size[4] Rgetattr tag[2] lstat[n]
DESCRIPTION
The getattr transaction inquires about the file identified by fid.
request_mask is a bit mask that specifies which fields of the
stat structure is the client interested in.
The reply will contain a machine-independent directory entry,
laid out as follows:
st_result_mask[8]
Bit mask that indicates which fields in the stat structure
have been populated by the server
qid.type[1]
the type of the file (directory, etc.), represented as a bit
vector corresponding to the high 8 bits of the file's mode
word.
qid.vers[4]
version number for given path
qid.path[8]
the file server's unique identification for the file
st_mode[4]
Permission and flags
st_uid[4]
User id of owner
st_gid[4]
Group ID of owner
st_nlink[8]
Number of hard links
st_rdev[8]
Device ID (if special file)
st_size[8]
Size, in bytes
st_blksize[8]
Block size for file system IO
st_blocks[8]
Number of file system blocks allocated
st_atime_sec[8]
Time of last access, seconds
st_atime_nsec[8]
Time of last access, nanoseconds
st_mtime_sec[8]
Time of last modification, seconds
st_mtime_nsec[8]
Time of last modification, nanoseconds
st_ctime_sec[8]
Time of last status change, seconds
st_ctime_nsec[8]
Time of last status change, nanoseconds
st_btime_sec[8]
Time of creation (birth) of file, seconds
st_btime_nsec[8]
Time of creation (birth) of file, nanoseconds
st_gen[8]
Inode generation
st_data_version[8]
Data version number
request_mask and result_mask bit masks contain the following bits
#define P9_STATS_MODE 0x00000001ULL
#define P9_STATS_NLINK 0x00000002ULL
#define P9_STATS_UID 0x00000004ULL
#define P9_STATS_GID 0x00000008ULL
#define P9_STATS_RDEV 0x00000010ULL
#define P9_STATS_ATIME 0x00000020ULL
#define P9_STATS_MTIME 0x00000040ULL
#define P9_STATS_CTIME 0x00000080ULL
#define P9_STATS_INO 0x00000100ULL
#define P9_STATS_SIZE 0x00000200ULL
#define P9_STATS_BLOCKS 0x00000400ULL
#define P9_STATS_BTIME 0x00000800ULL
#define P9_STATS_GEN 0x00001000ULL
#define P9_STATS_DATA_VERSION 0x00002000ULL
#define P9_STATS_BASIC 0x000007ffULL
#define P9_STATS_ALL 0x00003fffULL
This patch implements the client side of getattr implementation for
9P2000.L. It introduces a new structure p9_stat_dotl for getting
Linux stat information along with QID. The data layout is similar to
stat structure in Linux user space with the following major
differences:
inode (st_ino) is not part of data. Instead qid is.
device (st_dev) is not part of data because this doesn't make sense
on the client.
All time variables are 64 bit wide on the wire. The kernel seems to use
32 bit variables for these variables. However, some of the architectures
have used 64 bit variables and glibc exposes 64 bit variables to user
space on some architectures. Hence to be on the safer side we have made
these 64 bit in the protocol. Refer to the comments in
include/asm-generic/stat.h
There are some additional fields: st_btime_sec, st_btime_nsec, st_gen,
st_data_version apart from the bitmask, st_result_mask. The bit mask
is filled by the server to indicate which stat fields have been
populated by the server. Currently there is no clean way for the
server to obtain these additional fields, so it sends back just the
basic fields.
Signed-off-by: Sripathi Kodi <sripathik@in.ibm.com>
Signed-off-by: Eric Van Hensbegren <ericvh@gmail.com>
This patch implements the kernel part of readdir() implementation for 9p2000.L
Change from V3: Instead of inode, server now sends qids for each dirent
SYNOPSIS
size[4] Treaddir tag[2] fid[4] offset[8] count[4]
size[4] Rreaddir tag[2] count[4] data[count]
DESCRIPTION
The readdir request asks the server to read the directory specified by 'fid'
at an offset specified by 'offset' and return as many dirent structures as
possible that fit into count bytes. Each dirent structure is laid out as
follows.
qid.type[1]
the type of the file (directory, etc.), represented as a bit
vector corresponding to the high 8 bits of the file's mode
word.
qid.vers[4]
version number for given path
qid.path[8]
the file server's unique identification for the file
offset[8]
offset into the next dirent.
type[1]
type of this directory entry.
name[256]
name of this directory entry.
This patch adds v9fs_dir_readdir_dotl() as the readdir() call for 9p2000.L.
This function sends P9_TREADDIR command to the server. In response the server
sends a buffer filled with dirent structures. This is different from the
existing v9fs_dir_readdir() call which receives stat structures from the server.
This results in significant speedup of readdir() on large directories.
For example, doing 'ls >/dev/null' on a directory with 10000 files on my
laptop takes 1.088 seconds with the existing code, but only takes 0.339 seconds
with the new readdir.
Signed-off-by: Sripathi Kodi <sripathik@in.ibm.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
Found with makes headers_check:
include/linux/virtio_9p.h:15: found __[us]{8,16,32,64} type without #include <linux/types.h>
Signed-off-by: Fang Wenqi <antonf@turbolinux.com.cn>
Signed-off-by: Eric Van Hensbergen <ericvh@gmail.com>
nfs_commit_inode() needs to be defined irrespectively of whether or not
we are supporting NFSv3 and NFSv4.
Allow the compiler to optimise away code in the NFSv2-only case by
converting it into an inlined stub function.
Reported-and-tested-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Some platforms gate the pclk (APB - the bus - clock) to the peripherals
for power saving, along with the functional clock. When devices are
accessed without pclk enabled, the kernel will oops.
This gives them two options:
1. Leave all clocks on all the time.
2. Attempt to gate pclk along with the functional clock.
(With some hardware, pclk and the functional clock are gated by a single
bit in a register.)
(1) has the disadvantage that it causes increased power usage, which is
bad news for battery operated devices. (2) can lead to kernel oops if
registers are accessed without the functional clock being enabled.
So, introduce the apb_pclk signal in such a way existing drivers don't
need to be updated. Essentially, this means we guarantee that:
1. pclk will be enabled whenever the driver is bound to a device -
from probe() to remove() time.
2. pclk will also be enabled when reading the primecell IDs from the device.
In order to allow drivers to be incrementally updated to achieve greater
power savings, we provide two additional calls to allow drivers to
manage the pclk - amba_pclk_enable()/amba_pclk_disable().
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
See https://bugzilla.kernel.org/show_bug.cgi?id=16056
If other processes are blocked waiting for kswapd to free up some memory so
that they can make progress, then we cannot allow kswapd to block on those
processes.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org
Fix __task_cred()'s lockdep check by removing the following validation
condition:
lockdep_tasklist_lock_is_held()
as commit_creds() does not take the tasklist_lock, and nor do most of the
functions that call it, so this check is pointless and it can prevent
detection of the RCU lock not being held if the tasklist_lock is held.
Instead, add the following validation condition:
task->exit_state >= 0
to permit the access if the target task is dead and therefore unable to change
its own credentials.
Fix __task_cred()'s comment to:
(1) discard the bit that says that the caller must prevent the target task
from being deleted. That shouldn't need saying.
(2) Add a comment indicating the result of __task_cred() should not be passed
directly to get_cred(), but rather than get_task_cred() should be used
instead.
Also put a note into the documentation to enforce this point there too.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It's possible for get_task_cred() as it currently stands to 'corrupt' a set of
credentials by incrementing their usage count after their replacement by the
task being accessed.
What happens is that get_task_cred() can race with commit_creds():
TASK_1 TASK_2 RCU_CLEANER
-->get_task_cred(TASK_2)
rcu_read_lock()
__cred = __task_cred(TASK_2)
-->commit_creds()
old_cred = TASK_2->real_cred
TASK_2->real_cred = ...
put_cred(old_cred)
call_rcu(old_cred)
[__cred->usage == 0]
get_cred(__cred)
[__cred->usage == 1]
rcu_read_unlock()
-->put_cred_rcu()
[__cred->usage == 1]
panic()
However, since a tasks credentials are generally not changed very often, we can
reasonably make use of a loop involving reading the creds pointer and using
atomic_inc_not_zero() to attempt to increment it if it hasn't already hit zero.
If successful, we can safely return the credentials in the knowledge that, even
if the task we're accessing has released them, they haven't gone to the RCU
cleanup code.
We then change task_state() in procfs to use get_task_cred() rather than
calling get_cred() on the result of __task_cred(), as that suffers from the
same problem.
Without this change, a BUG_ON in __put_cred() or in put_cred_rcu() can be
tripped when it is noticed that the usage count is not zero as it ought to be,
for example:
kernel BUG at kernel/cred.c:168!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/kernel/mm/ksm/run
CPU 0
Pid: 2436, comm: master Not tainted 2.6.33.3-85.fc13.x86_64 #1 0HR330/OptiPlex
745
RIP: 0010:[<ffffffff81069881>] [<ffffffff81069881>] __put_cred+0xc/0x45
RSP: 0018:ffff88019e7e9eb8 EFLAGS: 00010202
RAX: 0000000000000001 RBX: ffff880161514480 RCX: 00000000ffffffff
RDX: 00000000ffffffff RSI: ffff880140c690c0 RDI: ffff880140c690c0
RBP: ffff88019e7e9eb8 R08: 00000000000000d0 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000040 R12: ffff880140c690c0
R13: ffff88019e77aea0 R14: 00007fff336b0a5c R15: 0000000000000001
FS: 00007f12f50d97c0(0000) GS:ffff880007400000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f8f461bc000 CR3: 00000001b26ce000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process master (pid: 2436, threadinfo ffff88019e7e8000, task ffff88019e77aea0)
Stack:
ffff88019e7e9ec8 ffffffff810698cd ffff88019e7e9ef8 ffffffff81069b45
<0> ffff880161514180 ffff880161514480 ffff880161514180 0000000000000000
<0> ffff88019e7e9f28 ffffffff8106aace 0000000000000001 0000000000000246
Call Trace:
[<ffffffff810698cd>] put_cred+0x13/0x15
[<ffffffff81069b45>] commit_creds+0x16b/0x175
[<ffffffff8106aace>] set_current_groups+0x47/0x4e
[<ffffffff8106ac89>] sys_setgroups+0xf6/0x105
[<ffffffff81009b02>] system_call_fastpath+0x16/0x1b
Code: 48 8d 71 ff e8 7e 4e 15 00 85 c0 78 0b 8b 75 ec 48 89 df e8 ef 4a 15 00
48 83 c4 18 5b c9 c3 55 8b 07 8b 07 48 89 e5 85 c0 74 04 <0f> 0b eb fe 65 48 8b
04 25 00 cc 00 00 48 3b b8 58 04 00 00 75
RIP [<ffffffff81069881>] __put_cred+0xc/0x45
RSP <ffff88019e7e9eb8>
---[ end trace df391256a100ebdd ]---
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Platforms may have some external power control which need to be
controlled from board specific code. Rename the translate_vdd()
callback to vdd_handler() and pass it the power mode.
Acked-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Rabin Vincent <rabin.vincent@stericsson.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>