* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI: Disable ASPM when _OSC control is not granted for PCIe services
PCI: Changing ASPM policy, via /sys, to POWERSAVE could cause NMIs
PCI: PCIe links may not get configured for ASPM under POWERSAVE mode
PCI/ACPI: Report ASPM support to BIOS if not disabled from command line
* git://git.infradead.org/battery-2.6: (30 commits)
bq20z75: Fix time and temp units
bq20z75: Fix issues with present and suspend
z2_battery: Fix count of properties
s3c_adc_battery: Fix method names when PM not set
z2_battery: Add MODULE_DEVICE_TABLE
ds2782_battery: Add MODULE_DEVICE_TABLE
bq20z75: Add MODULE_DEVICE_TABLE
power_supply: Update power_supply_is_watt_property
bq20z75: Add i2c retry mechanism
bq20z75: Add optional battery detect gpio
twl4030_charger: Make the driver atomic notifier safe
bq27x00: Use single i2c_transfer call for property read
bq27x00: Cleanup bq27x00_i2c_read
bq27x00: Minor cleanups
bq27x00: Give more specific reports on battery status
bq27x00: Add MODULE_DEVICE_TABLE
bq27x00: Add new properties
bq27x00: Poll battery state
bq27x00: Cache battery registers
bq27x00: Add bq27000 support
...
* git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm:
dm stripe: implement merge method
dm mpath: allow table load with no priority groups
dm mpath: fail message ioctl if specified path is not valid
dm ioctl: add flag to wipe buffers for secure data
dm ioctl: prepare for crypt key wiping
dm crypt: wipe keys string immediately after key is set
dm: add flakey target
dm: fix opening log and cow devices for read only tables
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf, x86: Complain louder about BIOSen corrupting CPU/PMU state and continue
perf, x86: P4 PMU - Read proper MSR register to catch unflagged overflows
perf symbols: Look at .dynsym again if .symtab not found
perf build-id: Add quirk to deal with perf.data file format breakage
perf session: Pass evsel in event_ops->sample()
perf: Better fit max unprivileged mlock pages for tools needs
perf_events: Fix stale ->cgrp pointer in update_cgrp_time_from_cpuctx()
perf top: Fix uninitialized 'counter' variable
tracing: Fix set_ftrace_filter probe function display
perf, x86: Fix Intel fixed counters base initialization
* 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
genirq: Provide locked setter for chip, handler, name
genirq: Provide a lockdep helper
genirq; Remove the last leftovers of the old sparse irq code
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
futex: Fix WARN_ON() test for UP
WARN_ON_SMP(): Allow use in if() statements on UP
x86, dumpstack: Use %pB format specifier for stack trace
vsprintf: Introduce %pB format specifier
lockdep: Remove unused 'factor' variable from lockdep_stats_show()
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (43 commits)
ext4: fix a BUG in mb_mark_used during trim.
ext4: unused variables cleanup in fs/ext4/extents.c
ext4: remove redundant set_buffer_mapped() in ext4_da_get_block_prep()
ext4: add more tracepoints and use dev_t in the trace buffer
ext4: don't kfree uninitialized s_group_info members
ext4: add missing space in printk's in __ext4_grp_locked_error()
ext4: add FITRIM to compat_ioctl.
ext4: handle errors in ext4_clear_blocks()
ext4: unify the ext4_handle_release_buffer() api
ext4: handle errors in ext4_rename
jbd2: add COW fields to struct jbd2_journal_handle
jbd2: add the b_cow_tid field to journal_head struct
ext4: Initialize fsync transaction ids in ext4_new_inode()
ext4: Use single thread to perform DIO unwritten convertion
ext4: optimize ext4_bio_write_page() when no extent conversion is needed
ext4: skip orphan cleanup if fs has unknown ROCOMPAT features
ext4: use the nblocks arg to ext4_truncate_restart_trans()
ext4: fix missing iput of root inode for some mount error paths
ext4: make FIEMAP and delayed allocation play well together
ext4: suppress verbose debugging information if malloc-debug is off
...
Fi up conflicts in fs/ext4/super.c due to workqueue changes
Some archs want to print extra information for certain irq_chips which
is per irq and not per chip. Allow them to provide a chip callback to
print the chip name and the extra information.
PowerPC wants to print the LEVEL/EDGE type information. Make it configurable.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6: (9356 commits)
[media] rc: update for bitop name changes
fs: simplify iget & friends
fs: pull inode->i_lock up out of writeback_single_inode
fs: rename inode_lock to inode_hash_lock
fs: move i_wb_list out from under inode_lock
fs: move i_sb_list out from under inode_lock
fs: remove inode_lock from iput_final and prune_icache
fs: Lock the inode LRU list separately
fs: factor inode disposal
fs: protect inode->i_state with inode->i_lock
lib, arch: add filter argument to show_mem and fix private implementations
SLUB: Write to per cpu data when allocating it
slub: Fix debugobjects with lockless fastpath
autofs4: Do not potentially dereference NULL pointer returned by fget() in autofs_dev_ioctl_setpipefd()
autofs4 - remove autofs4_lock
autofs4 - fix d_manage() return on rcu-walk
autofs4 - fix autofs4_expire_indirect() traversal
autofs4 - fix dentry leak in autofs4_expire_direct()
autofs4 - reinstate last used update on access
vfs - check non-mountpoint dentry might block in __follow_mount_rcu()
...
NOTE!
This merge commit was created to fix compilation error. The block
tree was merged upstream and removed the 'elv_queue_empty()'
function which the new 'mtdswap' driver is using. So a simple
merge of the mtd tree with upstream does not compile. And the
mtd tree has already be published, so re-basing it is not an option.
To fix this unfortunate situation, I had to merge upstream into the
mtd-2.6.git tree without committing, put the fixup patch on top of
this, and then commit this. The result is that we do not have commits
which do not compile.
In other words, this merge commit "merges" 3 things: the MTD tree, the
upstream tree, and the fixup patch.
On sh-mobile platforms the SDHI driver was using the tmio_mmc SD/SDIO
MFD cell driver. Now that the tmio_mmc driver has been split into a
core and a separate MFD glue, we can support SDHI natively without the
need to emulate an MFD controller. This also allows to support systems
with an on-SoC SDHI controller and a separate MFD with a TMIO core.
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
Both WARN_ON() and WARN_ON_SMP() should be able to be used in
an if statement.
if (WARN_ON_SMP(foo)) { ... }
Because WARN_ON_SMP() is defined as a do { } while (0) on UP,
it can not be used this way.
Convert it to the same form that WARN_ON() is, even when
CONFIG_SMP is off.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Darren Hart <dvhart@linux.intel.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
LKML-Reference: <20110317192208.444147791@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
It's a big no-no to use pgprot_noncached() when mmap'ing such buffers
into userspace since they are mapped cachable in kernel space.
This can cause all sort of interesting things ranging from to garbled
sound to lockups on various architectures. I've observed that usb-audio
is broken on powerpc 4xx for example because of that.
Also remove the now unused snd_pcm_lib_mmap_noncached(). It's
an arch business to know when to use uncached mappings, there's
already hacks for MIPS inside snd_pcm_default_mmap() and other
archs are supposed to use dma_mmap_coherent().
(See my separate patch that adds dma_mmap_coherent() to powerpc)
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: <stable@kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
When we set up the flow informations in ip_route_newports(), we take
the address informations from the the rt_key_src and rt_key_dst fields
of the rtable. They appear to be empty. So take the address
informations from rt_src and rt_dst instead. This issue was introduced
by commit 5e2b61f784 ("ipv4: Remove
flowi from struct rtable.")
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
fs: simplify iget & friends
fs: pull inode->i_lock up out of writeback_single_inode
fs: rename inode_lock to inode_hash_lock
fs: move i_wb_list out from under inode_lock
fs: move i_sb_list out from under inode_lock
fs: remove inode_lock from iput_final and prune_icache
fs: Lock the inode LRU list separately
fs: factor inode disposal
fs: protect inode->i_state with inode->i_lock
autofs4: Do not potentially dereference NULL pointer returned by fget() in autofs_dev_ioctl_setpipefd()
autofs4 - remove autofs4_lock
autofs4 - fix d_manage() return on rcu-walk
autofs4 - fix autofs4_expire_indirect() traversal
autofs4 - fix dentry leak in autofs4_expire_direct()
autofs4 - reinstate last used update on access
vfs - check non-mountpoint dentry might block in __follow_mount_rcu()
All that remains of the inode_lock is protecting the inode hash list
manipulation and traversals. Rename the inode_lock to
inode_hash_lock to reflect it's actual function.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Protect the inode writeback list with a new global lock
inode_wb_list_lock and use it to protect the list manipulations and
traversals. This lock replaces the inode_lock as the inodes on the
list can be validity checked while holding the inode->i_lock and
hence the inode_lock is no longer needed to protect the list.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Protect inode state transitions and validity checks with the
inode->i_lock. This enables us to make inode state transitions
independently of the inode_lock and is the first step to peeling
away the inode_lock from the code.
This requires that __iget() is done atomically with i_state checks
during list traversals so that we don't race with another thread
marking the inode I_FREEING between the state check and grabbing the
reference.
Also remove the unlock_new_inode() memory barrier optimisation
required to avoid taking the inode_lock when clearing I_NEW.
Simplify the code by simply taking the inode->i_lock around the
state change and wakeup. Because the wakeup is no longer tricky,
remove the wake_up_inode() function and open code the wakeup where
necessary.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Move the scope value out of the fib alias entries and into fib_info,
so that we always use the correct scope when recomputing the nexthop
cached source address.
Reported-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit ddd588b5dd ("oom: suppress nodes that are not allowed from
meminfo on oom kill") moved lib/show_mem.o out of lib/lib.a, which
resulted in build warnings on all architectures that implement their own
versions of show_mem():
lib/lib.a(show_mem.o): In function `show_mem':
show_mem.c:(.text+0x1f4): multiple definition of `show_mem'
arch/sparc/mm/built-in.o:(.text+0xd70): first defined here
The fix is to remove __show_mem() and add its argument to show_mem() in
all implementations to prevent this breakage.
Architectures that implement their own show_mem() actually don't do
anything with the argument yet, but they could be made to filter nodes
that aren't allowed in the current context in the future just like the
generic implementation.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reported-by: James Bottomley <James.Bottomley@hansenpartnership.com>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Any operation that:
1) Brings up an interface
2) Adds an IP address to an interface
3) Deletes an IP address from an interface
can potentially invalidate the nh_saddr value, requiring
it to be recomputed.
Perform the recomputation lazily using a generation ID.
Reported-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'drm-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm/vblank: update recently added vbl interface to be more future proof.
drm radeon: Return -EINVAL on wrong pm sysfs access
drm/radeon/kms: fix hardcoded EDID handling
Revert "drm/i915: Don't save/restore hardware status page address register"
drm/i915: Avoid unmapping pages from a NULL address space
drm/i915: Fix use after free within tracepoint
drm/i915: Restore missing command flush before interrupt on BLT ring
drm/i915: Disable pagefaults along execbuffer relocation fast path
drm/i915: Fix computation of pitch for dumb bo creator
drm/i915: report correct render clock frequencies on SNB
drm/i915/dp: Correct the order of deletion for ghost eDP devices
drm/i915: Fix tiling corruption from pipelined fencing
drm/i915: Re-enable self-refresh
drm/i915: Prevent racy removal of request from client list
drm/i915: skip redundant operations whilst enabling pipes and planes
drm/i915: Remove surplus POSTING_READs before wait_for_vblank
drm/radeon/kms: prefer legacy pll algo for tv-out
drm: check for modesetting on modeset ioctls
drm/kernel: vblank wait on crtc > 1
drm: Fix use-after-free in drm_gem_vm_close()
changes LAYOUTGET and GETDEVICEINFO XDR parsing to:
- not use vmap, which doesn't work on incoherent archs
- use xdr_stream parsing for all xdr
Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
When sec=<something> is not presented as a mount option,
we should attempt to determine what security flavor the
server is using.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
A submount may use different security than the parent
mount does. We should figure out what sec flavor the
submount uses at mount time.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
A later patch will need to perform a lookup using an
alternate client with a different security flavor.
This patch adds support for doing that on NFS v4.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
* 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block: (65 commits)
Documentation/iostats.txt: bit-size reference etc.
cfq-iosched: removing unnecessary think time checking
cfq-iosched: Don't clear queue stats when preempt.
blk-throttle: Reset group slice when limits are changed
blk-cgroup: Only give unaccounted_time under debug
cfq-iosched: Don't set active queue in preempt
block: fix non-atomic access to genhd inflight structures
block: attempt to merge with existing requests on plug flush
block: NULL dereference on error path in __blkdev_get()
cfq-iosched: Don't update group weights when on service tree
fs: assign sb->s_bdi to default_backing_dev_info if the bdi is going away
block: Require subsystems to explicitly allocate bio_set integrity mempool
jbd2: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging
jbd: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging
fs: make fsync_buffers_list() plug
mm: make generic_writepages() use plugging
blk-cgroup: Add unaccounted time to timeslice_used.
block: fixup plugging stubs for !CONFIG_BLOCK
block: remove obsolete comments for blkdev_issue_zeroout.
blktrace: Use rq->cmd_flags directly in blk_add_trace_rq.
...
Fix up conflicts in fs/{aio.c,super.c}
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-2.6-mn10300:
MN10300: gcc 4.6 vs am33 inline assembly
MN10300: Deprecate gdbstub
MN10300: Allow KGDB to use the MN10300 serial ports
MN10300: Emulate single stepping in KGDB on MN10300
MN10300: Generalise kernel debugger kernel halt, reboot or power off hook
KGDB: Notify GDB of machine halt, reboot or power off
MN10300: Use KGDB
MN10300: Create generic kernel debugger hooks
MN10300: Create general kernel debugger cache flushing
MN10300: Introduce a general config option for kernel debugger hooks
MN10300: The icache invalidate functions should disable the icache first
MN10300: gdbstub: Restrict single-stepping to non-preemptable non-SMP configs
* 'rmobile-latest' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
mmc: Add MMC_PROGRESS_*
mmc, ARM: Rename SuperH Mobile ARM zboot helpers
ARM: mach-shmobile: add coherent DMA mask to CEU camera devices
ARM: mach-shmobile: Dynamic backlight control for Mackerel
* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6: (442 commits)
[media] videobuf2-dma-contig: make cookie() return a pointer to dma_addr_t
[media] sh_mobile_ceu_camera: Do not call vb2's mem_ops directly
[media] V4L: soc-camera: explicitly require V4L2_BUF_TYPE_VIDEO_CAPTURE
[media] v4l: soc-camera: Store negotiated buffer settings
[media] rc: interim support for 32-bit NEC-ish scancodes
[media] mceusb: topseed 0x0011 needs gen3 init for tx to work
[media] lirc_zilog: error out if buffer read bytes != chunk size
[media] lirc: silence some compile warnings
[media] hdpvr: use same polling interval as other OS
[media] ir-kbd-i2c: pass device code w/key in hauppauge case
[media] rc/keymaps: Remove the obsolete rc-rc5-tv keymap
[media] remove the old RC_MAP_HAUPPAUGE_NEW RC map
[media] rc/keymaps: Rename Hauppauge table as rc-hauppauge
[media] rc-rc5-hauppauge-new: Fix Hauppauge Grey mapping
[media] rc-rc5-hauppauge-new: Add support for the old Black RC
[media] rc-rc5-hauppauge-new: Add the old control to the table
[media] rc-winfast: Fix the keycode tables
[media] a800: Fix a few wrong IR key assignments
[media] opera1: Use multimedia keys instead of an app-specific mapping
[media] dw2102: Use multimedia keys instead of an app-specific mapping
...
Fix up trivial conflicts (remove/modify and some real conflicts) in:
arch/arm/mach-omap2/devices.c
drivers/staging/Kconfig
drivers/staging/Makefile
drivers/staging/dabusb/dabusb.c
drivers/staging/dabusb/dabusb.h
drivers/staging/easycap/easycap_ioctl.c
drivers/staging/usbvideo/usbvideo.c
drivers/staging/usbvideo/vicam.c
* 'devicetree/merge' of git://git.secretlab.ca/git/linux-2.6:
spi/pl022: Add loopback support for the SPI on 5500
spi/omap_mcspi: Fix broken last word xfer
of/flattree: minor cleanups
dt: eliminate OF_NO_DEEP_PROBE and test for NULL match table
dt: protect against NULL matches passed to of_match_node()
dt: Refactor of_platform_bus_probe()
This is my second attempt to make this enum generally available.
The first attempt added MMCIF_PROGRESS_* to include/linux/mmc/sh_mmcif.h.
However this is not sufficiently generic as the enum will be
used by SDHI boot code.
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (42 commits)
ACPI: minor printk format change in acpi_pad
ACPI: make acpi_pad /sys output more readable
ACPICA: Update version to 20110316
ACPICA: Header support for SLIC table
ACPI: Make sure the FADT is at least rev 2 before using the reset register
ACPI: Bug compatibility for Windows on the ACPI reboot vector
ACPICA: Fix access width for reset vector
ACPI battery: fribble sysfs files from a resume notifier
ACPI button: remove unused procfs I/F
ACPI, APEI, Add PCIe AER error information printing support
PCIe, AER, use pre-generated prefix in error information printing
ACPI, APEI, Add ERST record ID cache
ACPI: Use syscore_ops instead of sysdev class and sysdev
ACPI: Remove the unused EC sysdev class
ACPI: use __cpuinit for the acpi_processor_set_pdc() call tree
ACPI: use __init where possible in processor driver
Thermal_Framework-Fix_crash_during_hwmon_unregister
ACPICA: Update version to 20110211.
ACPICA: Add mechanism to defer _REG methods for some installed handlers
ACPICA: Add support for FunctionalFixedHW in acpi_ut_get_region_name
...
Add DM_SECURE_DATA_FLAG which userspace can use to ensure
that all buffers allocated for dm-ioctl are wiped
immediately after use.
The user buffer is wiped as well (we do not want to keep
and return sensitive data back to userspace if the flag is set).
Wiping is useful for cryptsetup to ensure that the key
is present in memory only in defined places and only
for the time needed.
(For crypt, key can be present in table during load or table
status, wait and message commands).
Signed-off-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
This makes the interface a bit cleaner by leaving a single gap in the
vblank bit space instead of creating two gaps.
Suggestions from Michel on mailing list/irc.
Reviewed-by: Michel Dänzer <michel@daenzer.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The %pB format specifier is for stack backtrace. Its handler
sprint_backtrace() does symbol lookup using (address-1) to
ensure the address will not point outside of the function.
If there is a tail-call to the function marked "noreturn",
gcc optimized out the code after the call then causes saved
return address points outside of the function (i.e. the start
of the next function), so pollutes call trace somewhat.
This patch adds the %pB printk mechanism that allows architecture
call-trace printout functions to improve backtrace printouts.
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-arch@vger.kernel.org
LKML-Reference: <1300934550-21394-1-git-send-email-namhyung@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
commit fd245a4adb (net_sched: move TCQ_F_THROTTLED flag)
added a race.
qdisc_watchdog() is run from softirq, so special care should be taken or
we can lose one state transition (THROTTLED/RUNNING)
Prior to fd245a4adb, we were manipulating q->flags (qdisc->flags &=
~TCQ_F_THROTTLED;) and this manipulation could only race with
qdisc_warn_nonwc().
Since we want to avoid atomic ops in qdisc fast path - it was the
meaning of commit 3711210576 (QDISC_STATE_RUNNING dont need atomic
bit ops) - fix is to move THROTTLE bit into 'state' field, this one
being manipulated with SMP and IRQ safe operations.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
deal with races in /proc/*/{syscall,stack,personality}
proc: enable writing to /proc/pid/mem
proc: make check_mem_permission() return an mm_struct on success
proc: hold cred_guard_mutex in check_mem_permission()
proc: disable mem_write after exec
mm: implement access_remote_vm
mm: factor out main logic of access_process_vm
mm: use mm_struct to resolve gate vma's in __get_user_pages
mm: arch: rename in_gate_area_no_task to in_gate_area_no_mm
mm: arch: make in_gate_area take an mm_struct instead of a task_struct
mm: arch: make get_gate_vma take an mm_struct instead of a task_struct
x86: mark associated mm when running a task in 32 bit compatibility mode
x86: add context tag to mark mm when running a task in 32-bit compatibility mode
auxv: require the target to be tracable (or yourself)
close race in /proc/*/environ
report errors in /proc/*/*map* sanely
pagemap: close races with suid execve
make sessionid permissions in /proc/*/task/* match those in /proc/*
fix leaks in path_lookupat()
Fix up trivial conflicts in fs/proc/base.c
* 'devel' of master.kernel.org:/home/rmk/linux-2.6-arm: (35 commits)
ARM: Update (and cut down) mach-types
ARM: 6771/1: vexpress: add support for multiple core tiles
ARM: 6797/1: hw_breakpoint: Fix newlines in WARNings
ARM: 6751/1: vexpress: select applicable errata workarounds in Kconfig
ARM: 6753/1: omap4: Enable ARM local timers with OMAP4430 es1.0 exception
ARM: 6759/1: smp: Select local timers vs broadcast timer support runtime
ARM: pgtable: add pud-level code
ARM: 6673/1: LPAE: use phys_addr_t instead of unsigned long for start of membanks
ARM: Use long long format when printing meminfo physical addresses
ARM: integrator: add Integrator/CP sched_clock support
ARM: realview/vexpress: consolidate SMP bringup code
ARM: realview/vexpress: consolidate localtimer support
ARM: integrator/versatile: consolidate FPGA IRQ handling code
ARM: rationalize versatile family Kconfig/Makefile
ARM: realview: remove old AMBA device DMA definitions
ARM: versatile: remove old AMBA device DMA definitions
ARM: vexpress: use new init_early for clock tree and sched_clock init
ARM: realview: use new init_early for clock tree and sched_clock init
ARM: versatile: use new init_early for clock tree and sched_clock init
ARM: integrator: use new init_early for clock tree init
...
The Xen PV drivers in a crashed HVM guest can not connect to the dom0
backend drivers because both frontend and backend drivers are still in
connected state. To run the connection reset function only in case of a
crashdump, the is_kdump_kernel() function needs to be available for the PV
driver modules.
Consolidate elfcorehdr_addr, setup_elfcorehdr and saved_max_pfn into
kernel/crash_dump.c Also export elfcorehdr_addr to make is_kdump_kernel()
usable for modules.
Leave 'elfcorehdr' as early_param(). This changes powerpc from __setup()
to early_param(). It adds an address range check from x86 also on ia64
and powerpc.
[akpm@linux-foundation.org: additional #includes]
[akpm@linux-foundation.org: remove elfcorehdr_addr export]
[akpm@linux-foundation.org: fix for Tejun's mm/nobootmem.c changes]
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
And give it a kernel-doc comment.
[akpm@linux-foundation.org: btrfs changed in linux-next]
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Daniel Lezcano <daniel.lezcano@free.fr>
Acked-by: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cheat for now and say all files belong to init_user_ns. Next step will be
to let superblocks belong to a user_ns, and derive inode_userns(inode)
from inode->i_sb->s_user_ns. Finally we'll introduce more flexible
arrangements.
Changelog:
Feb 15: make is_owner_or_cap take const struct inode
Feb 23: make is_owner_or_cap bool
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>
Acked-by: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
CAP_IPC_OWNER and CAP_IPC_LOCK can be checked against current_user_ns(),
because the resource comes from current's own ipc namespace.
setuid/setgid are to uids in own namespace, so again checks can be against
current_user_ns().
Changelog:
Jan 11: Use task_ns_capable() in place of sched_capable().
Jan 11: Use nsown_capable() as suggested by Bastian Blank.
Jan 11: Clarify (hopefully) some logic in futex and sched.c
Feb 15: use ns_capable for ipc, not nsown_capable
Feb 23: let copy_ipcs handle setting ipc_ns->user_ns
Feb 23: pass ns down rather than taking it from current
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>
Acked-by: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Changelog:
Feb 15: Don't set new ipc->user_ns if we didn't create a new
ipc_ns.
Feb 23: Move extern declaration to ipc_namespace.h, and group
fwd declarations at top.
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>
Acked-by: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
So we can let type safety keep things sane, and as a bonus we can remove
the declaration of init_user_ns in capability.h.
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Daniel Lezcano <daniel.lezcano@free.fr>
Cc: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ptrace is allowed to tasks in the same user namespace according to the
usual rules (i.e. the same rules as for two tasks in the init user
namespace). ptrace is also allowed to a user namespace to which the
current task the has CAP_SYS_PTRACE capability.
Changelog:
Dec 31: Address feedback by Eric:
. Correct ptrace uid check
. Rename may_ptrace_ns to ptrace_capable
. Also fix the cap_ptrace checks.
Jan 1: Use const cred struct
Jan 11: use task_ns_capable() in place of ptrace_capable().
Feb 23: same_or_ancestore_user_ns() was not an appropriate
check to constrain cap_issubset. Rather, cap_issubset()
only is meaningful when both capsets are in the same
user_ns.
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>
Acked-by: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Changelog:
Feb 23: let clone_uts_ns() handle setting uts->user_ns
To do so we need to pass in the task_struct who'll
get the utsname, so we can get its user_ns.
Feb 23: As per Oleg's coment, just pass in tsk, instead of two
of its members.
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>
Acked-by: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- Introduce ns_capable to test for a capability in a non-default
user namespace.
- Teach cap_capable to handle capabilities in a non-default
user namespace.
The motivation is to get to the unprivileged creation of new
namespaces. It looks like this gets us 90% of the way there, with
only potential uid confusion issues left.
I still need to handle getting all caps after creation but otherwise I
think I have a good starter patch that achieves all of your goals.
Changelog:
11/05/2010: [serge] add apparmor
12/14/2010: [serge] fix capabilities to created user namespaces
Without this, if user serge creates a user_ns, he won't have
capabilities to the user_ns he created. THis is because we
were first checking whether his effective caps had the caps
he needed and returning -EPERM if not, and THEN checking whether
he was the creator. Reverse those checks.
12/16/2010: [serge] security_real_capable needs ns argument in !security case
01/11/2011: [serge] add task_ns_capable helper
01/11/2011: [serge] add nsown_capable() helper per Bastian Blank suggestion
02/16/2011: [serge] fix a logic bug: the root user is always creator of
init_user_ns, but should not always have capabilities to
it! Fix the check in cap_capable().
02/21/2011: Add the required user_ns parameter to security_capable,
fixing a compile failure.
02/23/2011: Convert some macros to functions as per akpm comments. Some
couldn't be converted because we can't easily forward-declare
them (they are inline if !SECURITY, extern if SECURITY). Add
a current_user_ns function so we can use it in capability.h
without #including cred.h. Move all forward declarations
together to the top of the #ifdef __KERNEL__ section, and use
kernel-doc format.
02/23/2011: Per dhowells, clean up comment in cap_capable().
02/23/2011: Per akpm, remove unreachable 'return -EPERM' in cap_capable.
(Original written and signed off by Eric; latest, modified version
acked by him)
[akpm@linux-foundation.org: fix build]
[akpm@linux-foundation.org: export current_user_ns() for ecryptfs]
[serge.hallyn@canonical.com: remove unneeded extra argument in selinux's task_has_capability]
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>
Acked-by: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The expected course of development for user namespaces targeted
capabilities is laid out at https://wiki.ubuntu.com/UserNamespace.
Goals:
- Make it safe for an unprivileged user to unshare namespaces. They
will be privileged with respect to the new namespace, but this should
only include resources which the unprivileged user already owns.
- Provide separate limits and accounting for userids in different
namespaces.
Status:
Currently (as of 2.6.38) you can clone with the CLONE_NEWUSER flag to
get a new user namespace if you have the CAP_SYS_ADMIN, CAP_SETUID, and
CAP_SETGID capabilities. What this gets you is a whole new set of
userids, meaning that user 500 will have a different 'struct user' in
your namespace than in other namespaces. So any accounting information
stored in struct user will be unique to your namespace.
However, throughout the kernel there are checks which
- simply check for a capability. Since root in a child namespace
has all capabilities, this means that a child namespace is not
constrained.
- simply compare uid1 == uid2. Since these are the integer uids,
uid 500 in namespace 1 will be said to be equal to uid 500 in
namespace 2.
As a result, the lxc implementation at lxc.sf.net does not use user
namespaces. This is actually helpful because it leaves us free to
develop user namespaces in such a way that, for some time, user
namespaces may be unuseful.
Bugs aside, this patchset is supposed to not at all affect systems which
are not actively using user namespaces, and only restrict what tasks in
child user namespace can do. They begin to limit privilege to a user
namespace, so that root in a container cannot kill or ptrace tasks in the
parent user namespace, and can only get world access rights to files.
Since all files currently belong to the initila user namespace, that means
that child user namespaces can only get world access rights to *all*
files. While this temporarily makes user namespaces bad for system
containers, it starts to get useful for some sandboxing.
I've run the 'runltplite.sh' with and without this patchset and found no
difference.
This patch:
copy_process() handles CLONE_NEWUSER before the rest of the namespaces.
So in the case of clone(CLONE_NEWUSER|CLONE_NEWUTS) the new uts namespace
will have the new user namespace as its owner. That is what we want,
since we want root in that new userns to be able to have privilege over
it.
Changelog:
Feb 15: don't set uts_ns->user_ns if we didn't create
a new uts_ns.
Feb 23: Move extern init_user_ns declaration from
init/version.c to utsname.h.
Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Daniel Lezcano <daniel.lezcano@free.fr>
Acked-by: David Howells <dhowells@redhat.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patchset is a cleanup and a preparation to unshare the pid namespace.
These prerequisites prepare for Eric's patchset to give a file descriptor
to a namespace and join an existing namespace.
This patch:
It turns out that the existing assignment in copy_process of the
child_reaper can handle the initial assignment of child_reaper we just
need to generalize the test in kernel/fork.c
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@free.fr>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Serge E. Hallyn <serge@hallyn.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Changes mport ID and host destination ID assignment to implement unified
method common to all mport drivers. Makes "riohdid=" kernel command line
parameter common for all architectures with support for more that one host
destination ID assignment.
Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Li Yang <leoli@freescale.com>
Cc: Thomas Moll <thomas.moll@sysgo.com>
Cc: Micha Nelissen <micha@neli.hopto.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Subsystem initialization sequence modified to support presence of multiple
RapidIO controllers in the system. The new sequence is compatible with
initialization of PCI devices.
Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Li Yang <leoli@freescale.com>
Cc: Thomas Moll <thomas.moll@sysgo.com>
Cc: Micha Nelissen <micha@neli.hopto.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This set of patches eliminates RapidIO dependency on PowerPC architecture
and makes it available to other architectures (x86 and MIPS). It also
enables support of new platform independent RapidIO controllers such as
PCI-to-SRIO and PCI Express-to-SRIO.
This patch:
Extend number of mport callback functions to eliminate direct linking of
architecture specific mport operations.
Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Kumar Gala <galak@kernel.crashing.org>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Li Yang <leoli@freescale.com>
Cc: Thomas Moll <thomas.moll@sysgo.com>
Cc: Micha Nelissen <micha@neli.hopto.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
1. namelen is declared "unsigned short" which hints for "maybe space savings".
Indeed in 2.4 struct proc_dir_entry looked like:
struct proc_dir_entry {
unsigned short low_ino;
unsigned short namelen;
Now, low_ino is "unsigned int", all savings were gone for a long time.
"struct proc_dir_entry" is not that countless to worry about it's size,
anyway.
2. converting from unsigned short to int/unsigned int can only create
problems, we better play it safe.
Space is not really conserved, because of natural alignment for the next
field. sizeof(struct proc_dir_entry) remains the same.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In struct page_cgroup, we have a full word for flags but only a few are
reserved. Use the remaining upper bits to encode, depending on
configuration, the node or the section, to enable page_cgroup-to-page
lookups without a direct pointer.
This saves a full word for every page in a system with memory cgroups
enabled.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
It is one logical function, no need to have it split up.
Also, get rid of some checks from the inner function that ensured the
sanity of the outer function.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Instead of passing a whole struct page_cgroup to this function, let it
take only what it really needs from it: the struct mem_cgroup and the
page.
This has the advantage that reading pc->mem_cgroup is now done at the same
place where the ordering rules for this pointer are enforced and
explained.
It is also in preparation for removing the pc->page backpointer.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add checks at allocating or freeing a page whether the page is used (iow,
charged) from the view point of memcg.
This check may be useful in debugging a problem and we did similar checks
before the commit 52d4b9ac(memcg: allocate all page_cgroup at boot).
This patch adds some overheads at allocating or freeing memory, so it's
enabled only when CONFIG_DEBUG_VM is enabled.
Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Since transparent huge pages, checking whether memory cgroups are below
their limits is no longer enough, but the actual amount of chargeable
space is important.
To not have more than one limit-checking interface, replace
memory_cgroup_check_under_limit() and memory_cgroup_check_margin() with a
single memory_cgroup_margin() that returns the chargeable space and leaves
the comparison to the callsite.
Soft limits are now checked the other way round, by using the already
existing function that returns the amount by which soft limits are
exceeded: res_counter_soft_limit_excess().
Also remove all the corresponding functions on the res_counter side that
are now no longer used.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Soft limit reclaim continues until the usage is below the current soft
limit, but the documented semantics are actually that soft limit reclaim
will push usage back until the soft limits are met again.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
minix bit operations are only used by minix filesystem and useless by
other modules. Because byte order of inode and block bitmaps is different
on each architecture like below:
m68k:
big-endian 16bit indexed bitmaps
h8300, microblaze, s390, sparc, m68knommu:
big-endian 32 or 64bit indexed bitmaps
m32r, mips, sh, xtensa:
big-endian 32 or 64bit indexed bitmaps for big-endian mode
little-endian bitmaps for little-endian mode
Others:
little-endian bitmaps
In order to move minix bit operations from asm/bitops.h to architecture
independent code in minix filesystem, this provides two config options.
CONFIG_MINIX_FS_BIG_ENDIAN_16BIT_INDEXED is only selected by m68k.
CONFIG_MINIX_FS_NATIVE_ENDIAN is selected by the architectures which use
native byte order bitmaps (h8300, microblaze, s390, sparc, m68knommu,
m32r, mips, sh, xtensa). The architectures which always use little-endian
bitmaps do not select these options.
Finally, we can remove minix bit operations from asm/bitops.h for all
architectures.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Greg Ungerer <gerg@uclinux.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Andreas Schwab <schwab@linux-m68k.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Michal Simek <monstr@monstr.eu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Cc: Chris Zankel <chris@zankel.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
As the result of conversions, there are no users of ext2 non-atomic bit
operations except for ext2 filesystem itself. Now we can put them into
architecture independent code in ext2 filesystem, and remove from
asm/bitops.h for all architectures.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
As a preparation for removing ext2 non-atomic bit operations from
asm/bitops.h. This converts ext2 non-atomic bit operations to
little-endian bit operations.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
As a preparation for removing ext2 non-atomic bit operations from
asm/bitops.h. This converts ext2 non-atomic bit operations to
little-endian bit operations.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Jan Kara <jack@suse.cz>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
As a preparation for removing ext2 non-atomic bit operations from
asm/bitops.h. This converts ext2 non-atomic bit operations to
little-endian bit operations.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Introduce little-endian bit operations to the big-endian architectures
which do not have native little-endian bit operations and the
little-endian architectures. (alpha, avr32, blackfin, cris, frv, h8300,
ia64, m32r, mips, mn10300, parisc, sh, sparc, tile, x86, xtensa)
These architectures can just include generic implementation
(asm-generic/bitops/le.h).
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Mikael Starvik <starvik@axis.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Matthew Wilcox <willy@debian.org>
Cc: Grant Grundler <grundler@parisc-linux.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Zankel <chris@zankel.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
Acked-by: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This makes the little-endian bitops take any pointer types by changing the
prototypes and adding casts in the preprocessor macros.
That would seem to at least make all the filesystem code happier, and they
can continue to do just something like
#define ext2_set_bit __test_and_set_bit_le
(or whatever the exact sequence ends up being).
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Mikael Starvik <starvik@axis.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Matthew Wilcox <willy@debian.org>
Cc: Grant Grundler <grundler@parisc-linux.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Zankel <chris@zankel.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Hans-Christian Egtvedt <hans-christian.egtvedt@atmel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch series introduces little-endian bit operations in asm/bitops.h
for all architectures and converts all ext2 non-atomic and minix bit
operations to use little-endian bit operations. It enables us to remove
ext2 non-atomic and minix bit operations from asm/bitops.h. The reason
they should be removed from asm/bitops.h is as follows:
For ext2 non-atomic bit operations, they are used for little-endian byte
order bitmap access by some filesystems and modules. But using ext2_*()
functions on a module other than ext2 filesystem makes some feel strange.
For minix bit operations, they are only used by minix filesystem and are
useless by other modules. Because byte order of inode and block bitmap is
This patch:
In order to make the forthcoming changes smaller, this merges macro
definisions in asm-generic/bitops/le.h for big-endian and little-endian as
much as possible.
This also removes unused BITOP_WORD macro.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Introduce Kconfig option allowing architectures where sysdev
operations used during system suspend, resume and shutdown have been
completely replaced with struct sycore_ops operations to avoid
building sysdev code that will never be used.
Make callbacks in struct sys_device and struct sysdev_driver depend
on ARCH_NO_SYSDEV_OPS to allows us to verify if all of the references
have been actually removed from the code the given architecture
depends on.
Make x86 select ARCH_NO_SYSDEV_OPS.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
There are no users of OF_NO_DEEP_PROBE, and of_match_node() now
gracefully handles being passed a NULL pointer, so the checks at the
top of of_platform_bus_probe can be dropped.
While at it, consolidate the root node pointer check to be easier to
read and tidy up related comments.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Provide an alternative to access_process_vm that allows the caller to obtain a
reference to the supplied mm_struct.
Signed-off-by: Stephen Wilson <wilsons@start.ca>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Now that gate vma's are referenced with respect to a particular mm and not a
particular task it only makes sense to propagate the change to this predicate as
well.
Signed-off-by: Stephen Wilson <wilsons@start.ca>
Reviewed-by: Michel Lespinasse <walken@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Morally, the question of whether an address lies in a gate vma should be asked
with respect to an mm, not a particular task. Moreover, dropping the dependency
on task_struct will help make existing and future operations on mm's more
flexible and convenient.
Signed-off-by: Stephen Wilson <wilsons@start.ca>
Reviewed-by: Michel Lespinasse <walken@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Morally, the presence of a gate vma is more an attribute of a particular mm than
a particular task. Moreover, dropping the dependency on task_struct will help
make both existing and future operations on mm's more flexible and convenient.
Signed-off-by: Stephen Wilson <wilsons@start.ca>
Reviewed-by: Michel Lespinasse <walken@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The filelayout driver sends LAYOUTCOMMIT only when COMMIT goes to
the data server (as opposed to the MDS) and the data server WRITE
is not NFS_FILE_SYNC.
Only whole file layout support means that there is only one IOMODE_RW layout
segment.
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Alexandros Batsakis <batsakis@netapp.com>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com>
Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
Signed-off-by: Mingyang Guo <guomingyang@nrchpc.ac.cn>
Signed-off-by: Tao Guo <guotao@nrchpc.ac.cn>
Signed-off-by: Zhang Jingwang <zhangjingwang@nrchpc.ac.cn>
Tested-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Implement all the hooks created in the previous patches.
This requires exporting quite a few functions and adding a few
structure fields.
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
We create three major hooks for the pnfs code.
pnfs_mark_request_commit() is called during writeback_done from
nfs_mark_request_commit, which gives the driver an opportunity to
claim it wants control over commiting a particular req.
pnfs_choose_commit_list() is called from nfs_scan_list
to choose which list a given req should be added to, based on
where we intend to send it for COMMIT. It is up to the driver
to have preallocated list headers for each destination it may need.
pnfs_commit_list() is how the driver actually takes control, it is
used instead of nfs_commit_list().
In order to pass information between the above functions, we create
a union in nfs_page to hold a lseg (which is possible because the req is
not on any list while in transition), and add some flags to indicate
if we need to use the pnfs code.
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Using blue flame can improve latency by allowing the HW to more efficiently
access the WQE. This patch presents two functions that are used to allocate or
release HW resources for using blue flame; the caller need to supply a struct
mlx4_bf object when allocating resources. Consumers that make use of this API
should post doorbells to the UAR object pointed by the initialized struct
mlx4_bf;
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
The mlx4_en module now uses the new steering mechanism.
The RX packets are now steered through the MCG table instead
of Mac table for unicast, and default entry for multicast.
The feature is enabled through INIT_HCA
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
The same packet steering mechanism would be used both for IB and Ethernet,
Both multicasts and unicasts.
This commit prepares the general infrastructure for this.
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
HW revision is derived from device ID and rev id.
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il>
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
The driver queries the FW for WOL support.
Ethtool get/set_wol is implemented accordingly.
Only magic packets are supported at the time.
Signed-off-by: Igor Yarovinsky <igory@mellanox.co.il>
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adding a pool of MSI-X vectors and EQs that can be used explicitly by mlx4_core
customers (mlx4_ib, mlx4_en). The consumers will assign their own names to the
interrupt vectors. Those vectors are not opened at mlx4 device initialization,
opened by demand.
Changed the max number of possible EQs according to the new scheme, no longer relies on
on number of cores.
The new functionality is exposed through mlx4_assign_eq() and mlx4_release_eq().
Customers that do not use the new API will get completion vectors as before.
Signed-off-by: Markuze Alex <markuze@mellanox.co.il>
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some irq_set_type() callbacks need to change the chip and the handler
when the trigger mode changes. We have already a (misnomed) setter
function for the handler which can be called from irq_set_type().
Provide one which allows to set chip and name as well. Put the
misnomed function under the COMPAT switch and provide a replacement.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Some irq chips need to call genirq functions for nested chips from
their callbacks. That upsets lockdep. So they need to set a different
lock class for those nested chips. Provide a helper function to avoid
open access to irq_desc.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Some filesystems (such as ext4) can return the same cookie value for
multiple files. If we try to start a readdir with one of these cookies,
the server will return the first file found with a cookie of the same
value. This can cause the client to enter an infinite loop.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
nfs_opendir() created a context that held much more information than we
need for a readdir. This patch introduces a slimmed-down
nfs_open_dir_context that contains only the cookie and the cred used for
RPC operations. The new context will eventually be used to help detect
readdir loops.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
The xmit path can sleep with a page kmapped in the network
xmit code while it waits for space to open up, so we have to use
kmap instead of kmap atomic in that path.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>
This patch adds a target_core_mib.c statistics conversion for
backend context struct se_subsystem_dev + struct se_device config_group
based statistics in target_core_device.c using CONFIGFS_EATTR()
based struct config_item_types from target_core_stat.c code.
The conversion from backend /proc/scsi_target/mib/ context output to configfs
default groups+attributes include scsi_dev, scsi_lu, and scsi_tgt_dev output
from within individual:
/sys/kernel/config/target/core/$HBA/DEV/
The legacy procfs output now appear as individual configfs attributes under:
*) $HBA/$DEV/statistics/scsi_dev:
|-- indx
|-- inst
|-- ports
`-- role
*) $HBA/$DEV/statistics/scsi_lu:
|-- creation_time
|-- dev
|-- dev_type
|-- full_stat
|-- hs_num_cmds
|-- indx
|-- inst
|-- lu_name
|-- lun
|-- num_cmds
|-- prod
|-- read_mbytes
|-- resets
|-- rev
|-- state_bit
|-- status
|-- vend
`-- write_mbytes
*) $HBA/$DEV/statistics/scsi_tgt_dev:
|-- indx
|-- inst
|-- non_access_lus
|-- num_lus
|-- resets
`-- status
The conversion from backend /proc/scsi_target/mib/ context output to configfs
default groups+attributes include scsi_port, scsi_tgt_port and scsi_transport
output from within individual:
/sys/kernel/config/target/fabric/$WWN/tpgt_$TPGT/lun/lun_$LUN_ID/statistics/
The legacy procfs output now appear as individual configfs attributes under:
*) fabric/$WWN/tpgt_$TPGT/lun/lun_$LUN_ID/statistics/scsi_port
|-- busy_count
|-- dev
|-- indx
|-- inst
`-- role
*) fabric/$WWN/tpgt_$TPGT/lun/lun_$LUN_ID/statistics/scsi_tgt_port
|-- dev
|-- hs_in_cmds
|-- in_cmds
|-- indx
|-- inst
|-- name
|-- port_index
|-- read_mbytes
`-- write_mbytes
*) fabric/$WWN/tpgt_$TPGT/lun/lun_$LUN_ID/statistics/scsi_transport
|-- dev_name
|-- device
|-- indx
`-- inst
The conversion from backend /proc/scsi_target/mib/ context output to configfs
default groups+attributes include scsi_att_intr_port and scsi_auth_intr output
from within individual:
/sys/kernel/config/target/fabric/$WWN/tpgt_$TPGT/acls/$INITIATOR_WWN/lun_$LUN_ID/statistics/
The legacy procfs output now appear as individual configfs attributes under:
*) acls/$INITIATOR_WWN/lun_$LUN_ID/statistics/scsi_att_intr_port
|-- dev
|-- indx
|-- inst
|-- port
|-- port_auth_indx
`-- port_ident
*) acls/$INITIATOR_WWN/lun_$LUN_ID/statistics/scsi_auth_intr
|-- att_count
|-- creation_time
|-- dev
|-- dev_or_port
|-- hs_num_cmds
|-- indx
|-- inst
|-- intr_name
|-- map_indx
|-- num_cmds
|-- port
|-- read_mbytes
|-- row_status
`-- write_mbytes
Also, this includes adding struct target_fabric_configfs_template->
tfc_wwn_fabric_stats_cit and ->tfc_tpg_nacl_stat_cit respectively for
use during target_core_fabric_configfs.c:target_fabric_setup_cits()
Signed-off-by: Nicholas A. Bellinger <nab@linux-iscsi.org>
Signed-off-by: James Bottomley <James.Bottomley@suse.de>