Make node look as if it was on hlist, with hlist_del()
working correctly. Usable without any locking...
Convert a couple of places where we want to do that to
inode->i_hash.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Now, rw_verify_area() checsk f_pos is negative or not. And if negative,
returns -EINVAL.
But, some special files as /dev/(k)mem and /proc/<pid>/mem etc.. has
negative offsets. And we can't do any access via read/write to the
file(device).
So introduce FMODE_UNSIGNED_OFFSET to allow negative file offsets.
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Andrew,
Could you please review this patch, you probably are the right guy to
take it, because it crosses fs and net trees.
Note : /proc/sys/fs/file-nr is a read-only file, so this patch doesnt
depend on previous patch (sysctl: fix min/max handling in
__do_proc_doulongvec_minmax())
Thanks !
[PATCH V4] fs: allow for more than 2^31 files
Robin Holt tried to boot a 16TB system and found af_unix was overflowing
a 32bit value :
<quote>
We were seeing a failure which prevented boot. The kernel was incapable
of creating either a named pipe or unix domain socket. This comes down
to a common kernel function called unix_create1() which does:
atomic_inc(&unix_nr_socks);
if (atomic_read(&unix_nr_socks) > 2 * get_max_files())
goto out;
The function get_max_files() is a simple return of files_stat.max_files.
files_stat.max_files is a signed integer and is computed in
fs/file_table.c's files_init().
n = (mempages * (PAGE_SIZE / 1024)) / 10;
files_stat.max_files = n;
In our case, mempages (total_ram_pages) is approx 3,758,096,384
(0xe0000000). That leaves max_files at approximately 1,503,238,553.
This causes 2 * get_max_files() to integer overflow.
</quote>
Fix is to let /proc/sys/fs/file-nr & /proc/sys/fs/file-max use long
integers, and change af_unix to use an atomic_long_t instead of
atomic_t.
get_max_files() is changed to return an unsigned long.
get_nr_files() is changed to return a long.
unix_nr_socks is changed from atomic_t to atomic_long_t, while not
strictly needed to address Robin problem.
Before patch (on a 64bit kernel) :
# echo 2147483648 >/proc/sys/fs/file-max
# cat /proc/sys/fs/file-max
-18446744071562067968
After patch:
# echo 2147483648 >/proc/sys/fs/file-max
# cat /proc/sys/fs/file-max
2147483648
# cat /proc/sys/fs/file-nr
704 0 2147483648
Reported-by: Robin Holt <holt@sgi.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: David Miller <davem@davemloft.net>
Reviewed-by: Robin Holt <holt@sgi.com>
Tested-by: Robin Holt <holt@sgi.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
__block_write_begin and block_prepare_write are identical except for slightly
different calling conventions. Convert all callers to the __block_write_begin
calling conventions and drop block_prepare_write.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Hugetlbfs used to need it, but after the destroy_inode and evict_inode
changes it's not required anymore.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Add a new helper to write out the inode using the writeback code,
that is including the correct dirty bit and list manipulation. A few
of filesystems already opencode this, and a lot of others should be
using it instead of using write_inode_now which also writes out the
data.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* 'davinci-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-davinci: (50 commits)
davinci: fix remaining board support after io_pgoffst removal
davinci: mityomapl138: make file local data static
arm/davinci: remove duplicated include
davinci: Initial support for Omapl138-Hawkboard
davinci: MityDSP-L138/MityARM-1808 read MAC address from I2C Prom
davinci: add tnetv107x touchscreen platform device
input: add driver for tnetv107x touchscreen controller
davinci: add keypad config for tnetv107x evm board
davinci: add tnetv107x keypad platform device
input: add driver for tnetv107x on-chip keypad controller
net: davinci_emac: cleanup unused cpdma code
net: davinci_emac: switch to new cpdma layer
net: davinci_emac: separate out cpdma code
net: davinci_emac: cleanup unused mdio emac code
omap: cleanup unused davinci mdio arch code
davinci: cleanup mdio arch code and switch to phy_id
net: davinci_emac: switch to new mdio
omap: add mdio platform devices
davinci: add mdio platform devices
net: davinci_emac: separate out davinci mdio
...
Fix up trivial conflict in drivers/input/keyboard/Kconfig (two entries
added next to each other - one from the davinci merge, one from the
input merge)
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6: (365 commits)
ALSA: hda - Disable sticky PCM stream assignment for AD codecs
ALSA: usb - Creative USB X-Fi volume knob support
ALSA: ca0106: Use card specific dac id for mute controls.
ALSA: ca0106: Allow different sound cards to use different SPI channel mappings.
ALSA: ca0106: Create a nice spot for mapping channels to dacs.
ALSA: ca0106: Move enabling of front dac out of hardcoded setup sequence.
ALSA: ca0106: Pull out dac powering routine into separate function.
ALSA: ca0106 - add Sound Blaster 5.1vx info.
ASoC: tlv320dac33: Use usleep_range for delays
ALSA: usb-audio: add Novation Launchpad support
ALSA: hda - Add workarounds for CT-IBG controllers
ALSA: hda - Fix wrong TLV mute bit for STAC/IDT codecs
ASoC: tpa6130a2: Error handling for broken chip
ASoC: max98088: Staticise m98088_eq_band
ASoC: soc-core: Fix codec->name memory leak
ALSA: hda - Apply ideapad quirk to Acer laptops with Cxt5066
ALSA: hda - Add some workarounds for Creative IBG
ALSA: hda - Fix wrong SPDIF NID assignment for CA0110
ALSA: hda - Fix codec rename rules for ALC662-compatible codecs
ALSA: hda - Add alc_init_jacks() call to other codecs
...
* 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/dvrabel/uwb:
uwb: Orphan the UWB and WUSB subsystems
uwb: Remove the WLP subsystem and drivers
* 'next-devicetree' of git://git.secretlab.ca/git/linux-2.6:
mtd/m25p80: add support to parse the partitions by OF node
of/irq: of_irq.c needs to include linux/irq.h
of/mips: Cleanup some include directives/files.
of/mips: Add device tree support to MIPS
of/flattree: Eliminate need to provide early_init_dt_scan_chosen_arch
of/device: Rework to use common platform_device_alloc() for allocating devices
of/xsysace: Fix OF probing on little-endian systems
of: use __be32 types for big-endian device tree data
of/irq: remove references to NO_IRQ in drivers/of/platform.c
of/promtree: add package-to-path support to pdt
of/promtree: add of_pdt namespace to pdt code
of/promtree: no longer call prom_ functions directly; use an ops structure
of/promtree: make drivers/of/pdt.c no longer sparc-only
sparc: break out some PROM device-tree building code out into drivers/of
of/sparc: convert various prom_* functions to use phandle
sparc: stop exporting openprom.h header
powerpc, of_serial: Endianness issues setting up the serial ports
of: MTD: Fix OF probing on little-endian systems
of: GPIO: Fix OF probing on little-endian systems
Replace the BKL with a mutex to protect the venus_comm structure which
binds the mountpoint with the character device and holds the upcall
queues.
Signed-off-by: Yoshihisa Abe <yoshiabe@cs.cmu.edu>
Signed-off-by: Jan Harkes <jaharkes@cs.cmu.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Now that shared inode state is locked using the cii->c_lock, the BKL is
only used to protect the upcall queues used to communicate with the
userspace cache manager. The remaining state is all local and we can
push the lock further down into coda_upcall().
Signed-off-by: Yoshihisa Abe <yoshiabe@cs.cmu.edu>
Signed-off-by: Jan Harkes <jaharkes@cs.cmu.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We mostly need it to protect cached user permissions. The c_flags field
is advisory, reading the wrong value is harmless and in the worst case
we hit a slow path where we have to make an extra upcall to the
userspace cache manager when revalidating a dentry or inode.
Signed-off-by: Yoshihisa Abe <yoshiabe@cs.cmu.edu>
Signed-off-by: Jan Harkes <jaharkes@cs.cmu.edu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6: (110 commits)
sh: i2c-sh7760: Replase from ctrl_* to __raw_*
sh: clkfwk: Shuffle around to match the intc split up.
sh: clkfwk: modify for_each_frequency end condition
sh: fix clk_get() error handling
sh: clkfwk: Fix fault in frequency iterator.
sh: clkfwk: Add a helper for rate rounding by divisor ranges.
sh: clkfwk: Abstract rate rounding helper.
sh: clkfwk: support clock remapping.
sh: pci: Convert to upper/lower_32_bits() helpers.
sh: mach-sdk7786: Add support for the FPGA SRAM.
sh: Provide a generic SRAM pool for tiny memories.
sh: pci: Support secondary FPGA-driven PCIe clocks on SDK7786.
sh: pci: Support slot 4 routing on SDK7786.
sh: Fix up PMB locking.
sh: mach-sdk7786: Add support for fpga gpios.
sh: use pr_fmt for clock framework, too.
sh: remove name and id from struct clk
sh: free-without-alloc fix for sh_mobile_lcdcfb
sh: perf: Set up perf_max_events.
sh: perf: Support SH-X3 hardware counters.
...
Fix up trivial conflicts (perf_max_events got removed) in arch/sh/kernel/perf_event.c
Improve performance of the sske operation by using the nonquiescing
variant if the affected page has no mappings established. On machines
with no support for the new sske variant the mask bit will be ignored.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
The only Wimedia LLC Protocol (WLP) hardware was an Intel i1480 chip
with a beta release of firmware that was never commercially available as
a product. This hardware and firmware is no longer available as Intel
sold their UWB/WLP IP. I also see little prospect of other WLP
capable hardware ever being available.
Signed-off-by: David Vrabel <david.vrabel@csr.com>
* 'devel' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/edac: (25 commits)
i7300_edac: Properly initialize per-csrow memory size
V4L/DVB: i7300_edac: better initialize page counts
MAINTAINERS: Add maintainer for i7300-edac driver
i7300-edac: CodingStyle cleanup
i7300_edac: Improve comments
i7300_edac: Cleanup: reorganize the file contents
i7300_edac: Properly detect channel on CE errors
i7300_edac: enrich FBD error info for corrected errors
i7300_edac: enrich FBD error info for fatal errors
i7300_edac: pre-allocate a buffer used to prepare err messages
i7300_edac: Fix MTR x4/x8 detection logic
i7300_edac: Make the debug messages coherent with the others
i7300_edac: Cleanup: remove get_error_info logic
i7300_edac: Add a code to cleanup error registers
i7300_edac: Add support for reporting FBD errors
i7300_edac: Properly detect the type of error correction
i7300_edac: Detect if the device is on single mode
i7300_edac: Adds detection for enhanced scrub mode on x8
i7300_edac: Clear the error bit after reading
i7300_edac: Add error detection code for global errors
...
This reverts commit 7681bfeecc.
Conflicts:
include/linux/genhd.h
It has numerous issues with the cleanup path and non-elevator
devices. Revert it for now so we can come up with a clean
version without rushing things.
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6: (27 commits)
SLUB: Fix memory hotplug with !NUMA
slub: Move functions to reduce #ifdefs
slub: Enable sysfs support for !CONFIG_SLUB_DEBUG
SLUB: Optimize slab_free() debug check
slub: Move NUMA-related functions under CONFIG_NUMA
slub: Add lock release annotation
slub: Fix signedness warnings
slub: extract common code to remove objects from partial list without locking
SLUB: Pass active and inactive redzone flags instead of boolean to debug functions
slub: reduce differences between SMP and NUMA
Revert "Slub: UP bandaid"
percpu: clear memory allocated with the km allocator
percpu: use percpu allocator on UP too
percpu: reduce PCPU_MIN_UNIT_SIZE to 32k
vmalloc: pcpu_get/free_vm_areas() aren't needed on UP
SLUB: Fix merged slab cache names
Slub: UP bandaid
slub: fix SLUB_RESILIENCY_TEST for dynamic kmalloc caches
slub: Fix up missing kmalloc_cache -> kmem_cache_node case for memoryhotplug
slub: Add dummy functions for the !SLUB_DEBUG case
...
* 'kvm-updates/2.6.37' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (321 commits)
KVM: Drop CONFIG_DMAR dependency around kvm_iommu_map_pages
KVM: Fix signature of kvm_iommu_map_pages stub
KVM: MCE: Send SRAR SIGBUS directly
KVM: MCE: Add MCG_SER_P into KVM_MCE_CAP_SUPPORTED
KVM: fix typo in copyright notice
KVM: Disable interrupts around get_kernel_ns()
KVM: MMU: Avoid sign extension in mmu_alloc_direct_roots() pae root address
KVM: MMU: move access code parsing to FNAME(walk_addr) function
KVM: MMU: audit: check whether have unsync sps after root sync
KVM: MMU: audit: introduce audit_printk to cleanup audit code
KVM: MMU: audit: unregister audit tracepoints before module unloaded
KVM: MMU: audit: fix vcpu's spte walking
KVM: MMU: set access bit for direct mapping
KVM: MMU: cleanup for error mask set while walk guest page table
KVM: MMU: update 'root_hpa' out of loop in PAE shadow path
KVM: x86 emulator: Eliminate compilation warning in x86_decode_insn()
KVM: x86: Fix constant type in kvm_get_time_scale
KVM: VMX: Add AX to list of registers clobbered by guest switch
KVM guest: Move a printk that's using the clock before it's ready
KVM: x86: TSC catchup mode
...
* 'i2c-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
i2c-viapro: Don't log nacks
i2c/pca954x: Remove __devinit and __devexit from probe and remove functions
MAINTAINERS: Add maintainer for PCA9541 I2C bus master selector driver
i2c/mux: Driver for PCA9541 I2C Master Selector
i2c: Optimize function i2c_detect()
i2c: Discard warning message on device instantiation from user-space
i2c-amd8111: Add proper error handling
i2c: Change to new flag variable
i2c: Remove unneeded inclusions of <linux/i2c-id.h>
i2c: Let i2c_parent_is_i2c_adapter return the parent adapter
i2c: Simplify i2c_parent_is_i2c_adapter
i2c-pca-platform: Change device name of request_irq
i2c: Fix Kconfig dependencies
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (47 commits)
HID: fix mismerge in hid-lg
HID: hidraw: fix window in hidraw_release
HID: hid-sony: override usbhid_output_raw_report for Sixaxis
HID: add absolute axis resolution calculation
HID: force feedback support for Logitech RumblePad gamepad
HID: support STmicroelectronics and Sitronix with hid-stantuml driver
HID: magicmouse: Adjust major / minor axes to scale
HID: Fix for problems with eGalax/DWAV multi-touch-screen
HID: waltop: add support for Waltop Slim Tablet 12.1 inch
HID: add NOGET quirk for AXIS 295 Video Surveillance Joystick
HID: usbhid: remove unused hiddev_driver
HID: magicmouse: Use hid-input parsing rather than bypassing it
HID: trivial formatting fix
HID: Add support for Logitech Speed Force Wireless gaming wheel
HID: don't Send Feature Reports on Interrupt Endpoint
HID: 3m: Adjust major / minor axes to scale
HID: 3m: Correct touchscreen emulation
HID: 3m: Convert to MT slots
HID: 3m: Output proper orientation range
HID: 3m: Adjust to sequential MT HID protocol
...
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: Makefile - replace the use of <module>-objs with <module>-y
crypto: hifn_795x - use cancel_delayed_work_sync()
crypto: talitos - sparse check endian fixes
crypto: talitos - fix checkpatch warning
crypto: talitos - fix warning: 'alg' may be used uninitialized in this function
crypto: cryptd - Adding the AEAD interface type support to cryptd
crypto: n2_crypto - Niagara2 driver needs to depend upon CRYPTO_DES
crypto: Kconfig - update broken web addresses
crypto: omap-sham - Adjust DMA parameters
crypto: fips - FIPS requires algorithm self-tests
crypto: omap-aes - OMAP2/3 AES hw accelerator driver
crypto: updates to enable omap aes
padata: add missing __percpu markup in include/linux/padata.h
MAINTAINERS: Add maintainer entries for padata/pcrypt
Only i2c devices can have their type set to i2c_adapter_type, so
testing the bus type is redundant.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Michael Lawnick <ml.lawnick@gmx.de>
Breaks otherwise if CONFIG_IOMMU_API is not set.
KVM-Stable-Tag.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
This just changes some names to better reflect the usage they
will be given. Separated out to keep confusion to a minimum.
Signed-off-by: Zachary Amsden <zamsden@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Instead of blindly attempting to inject an event before each guest entry,
check for a possible event first in vcpu->requests. Sites that can trigger
event injection are modified to set KVM_REQ_EVENT:
- interrupt, nmi window opening
- ppr updates
- i8259 output changes
- local apic irr changes
- rflags updates
- gif flag set
- event set on exit
This improves non-injecting entry performance, and sets the stage for
non-atomic injection.
Signed-off-by: Avi Kivity <avi@redhat.com>
This patch introduces a mmu-callback to translate gpa
addresses in the walk_addr code. This is later used to
translate l2_gpa addresses into l1_gpa addresses.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Now that we have all the level interrupt magic in place, let's
expose the capability to user space, so it can make use of it!
Signed-off-by: Alexander Graf <agraf@suse.de>
There is a bugs in this function, we call gfn_to_pfn() and kvm_mmu_gva_to_gpa_read() in
atomic context(kvm_mmu_audit() is called under the spinlock(mmu_lock)'s protection).
This patch fix it by:
- introduce gfn_to_pfn_atomic instead of gfn_to_pfn
- get the mapping gfn from kvm_mmu_page_get_gfn()
And it adds 'notrap' ptes check in unsync/direct sps
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Introduce this function to get consecutive gfn's pages, it can reduce
gup's overload, used by later patch
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Introduce hva_to_pfn_atomic(), it's the fast path and can used in atomic
context, the later patch will use it
Signed-off-by: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
We need to tell the guest the opcodes that make up a hypercall through
interfaces that are controlled by userspace. So we need to add a call
for userspace to allow it to query those opcodes so it can pass them
on.
This is required because the hypercall opcodes can change based on
the hypervisor conditions. If we're running in hardware accelerated
hypervisor mode, a hypercall looks different from when we're running
without hardware acceleration.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
Currently x86 is the only architecture that uses kvm_guest_init(). With
PowerPC we're getting a second user, but the signature is different there
and we don't need to export it, as it uses the normal kernel init framework.
So let's move the x86 specific definition of that function over to the x86
specfic header file.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
We will be introducing a method to project the shared page in guest context.
As soon as we're talking about this coupling, the shared page is colled magic
page.
This patch introduces simple defines, so the follow-up patches are easier to
read.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
To communicate with KVM directly we need to plumb some sort of interface
between the guest and KVM. Usually those interfaces use hypercalls.
This hypercall implementation is described in the last patch of the series
in a special documentation file. Please read that for further information.
This patch implements stubs to handle KVM PPC hypercalls on the host and
guest side alike.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>