Commit Graph

43123 Commits

Author SHA1 Message Date
Linus Torvalds
e66eed651f list: remove prefetching from regular list iterators
This is removes the use of software prefetching from the regular list
iterators.  We don't want it.  If you do want to prefetch in some
iterator of yours, go right ahead.  Just don't expect the iterator to do
it, since normally the downsides are bigger than the upsides.

It also replaces <linux/prefetch.h> with <linux/const.h>, because the
use of LIST_POISON ends up needing it.  <linux/poison.h> is sadly not
self-contained, and including prefetch.h just happened to hide that.

Suggested by David Miller (networking has a lot of regular lists that
are often empty or a single entry, and prefetching is not going to do
anything but add useless instructions).

Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: linux-arch@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-19 14:15:29 -07:00
Linus Torvalds
75d65a425c hlist: remove software prefetching in hlist iterators
They not only increase the code footprint, they actually make things
slower rather than faster.  On internationally acclaimed benchmarks
("make -j16" on an already fully built kernel source tree) the hlist
prefetching slows down the build by up to 1%.

(Almost all of it comes from hlist_for_each_entry_rcu() as used by
avc_has_perm_noaudit(), which is very hot due to all the pathname
lookups to see if there is anything to do).

The cause seems to be two-fold:

 - on at least some Intel cores, prefetch(NULL) ends up with some
   microarchitectural stall due to the TLB miss that it incurs.  The
   hlist case triggers this very commonly, since the NULL pointer is the
   last entry in the list.

 - the prefetch appears to cause more D$ activity, probably because it
   prefetches hash list entries that are never actually used (because we
   ended the search early due to a hit).

Regardless, the numbers clearly say that the implicit prefetching is
simply a bad idea.  If some _particular_ user of the hlist iterators
wants to prefetch the next list entry, they can do so themselves
explicitly, rather than depend on all list iterators doing so
implicitly.

Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: David S. Miller <davem@davemloft.net>
Cc: linux-arch@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-19 13:50:07 -07:00
Linus Torvalds
fce519588a Merge branch 'devicetree/merge' of git://git.secretlab.ca/git/linux-2.6
* 'devicetree/merge' of git://git.secretlab.ca/git/linux-2.6:
  drivercore: revert addition of of_match to struct device
  of: fix race when matching drivers
2011-05-18 13:25:57 -07:00
Grant Likely
b1608d69cb drivercore: revert addition of of_match to struct device
Commit b826291c, "drivercore/dt: add a match table pointer to struct
device" added an of_match pointer to struct device to cache the
of_match_table entry discovered at driver match time.  This was unsafe
because matching is not an atomic operation with probing a driver.  If
two or more drivers are attempted to be matched to a driver at the
same time, then the cached matching entry pointer could get
overwritten.

This patch reverts the of_match cache pointer and reworks all users to
call of_match_device() directly instead.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-05-18 12:32:23 -06:00
Milton Miller
01294d8262 of: fix race when matching drivers
If two drivers are probing devices at the same time, both will write
their match table result to the dev->of_match cache at the same time.

Only write the result if the device matches.

In a thread titled "SBus devices sometimes detected, sometimes not",
Meelis reported his SBus hme was not detected about 50% of the time.
From the debug suggested by Grant it was obvious another driver matched
some devices between the call to match the hme and the hme discovery
failling.

Reported-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: Milton Miller <miltonm@bga.com>
[grant.likely: modified to only call of_match_device() once]
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-05-18 10:19:36 -06:00
Linus Torvalds
a2b9c1f620 Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
  block: don't delay blk_run_queue_async
  scsi: remove performance regression due to async queue run
  blk-throttle: Use task_subsys_state() to determine a task's blkio_cgroup
  block: rescan partitions on invalidated devices on -ENOMEDIA too
  cdrom: always check_disk_change() on open
  block: unexport DISK_EVENT_MEDIA_CHANGE for legacy/fringe drivers
2011-05-18 06:49:02 -07:00
Randy Dunlap
f12a20fc9b procfs: add stub for proc_mkdir_mode()
Provide a stub for proc_mkdir_mode() when CONFIG_PROC_FS is not
enabled, just like the stub for proc_mkdir().

Fixes this linux-next build error:

  drivers/net/wireless/airo.c:4504: error: implicit declaration of function 'proc_mkdir_mode'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: "John W. Linville" <linville@tuxdriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-18 02:55:24 -07:00
Jens Axboe
9937a5e2f3 scsi: remove performance regression due to async queue run
Commit c21e6beb removed our queue request_fn re-enter
protection, and defaulted to always running the queues from
kblockd to be safe. This was a known potential slow down,
but should be safe.

Unfortunately this is causing big performance regressions for
some, so we need to improve this logic. Looking into the details
of the re-enter, the real issue is on requeue of requests.

Requeue of requests upon seeing a BUSY condition from the device
ends up re-running the queue, causing traces like this:

scsi_request_fn()
        scsi_dispatch_cmd()
                scsi_queue_insert()
                        __scsi_queue_insert()
                                scsi_run_queue()
					scsi_request_fn()
						...

potentially causing the issue we want to avoid. So special
case the requeue re-run of the queue, but improve it to offload
the entire run of local queue and starved queue from a single
workqueue callback. This is a lot better than potentially
kicking off a workqueue run for each device seen.

This also fixes the issue of the local device going into recursion,
since the above mentioned commit never moved that queue run out
of line.

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-17 11:04:44 +02:00
Linus Torvalds
477de0de44 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc:
  Revert "mmc: fix a race between card-detect rescan and clock-gate work instances"
2011-05-16 18:36:47 -07:00
Linus Torvalds
7c21738efd Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6
* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
  drm: Take lock around probes for drm_fb_helper_hotplug_event
  drm/i915: Revert i915.semaphore=1 default from 47ae63e0
  vga_switcheroo: don't toggle-switch devices
  drm/radeon/kms: add some evergreen/ni safe regs
  drm/radeon/kms: fix extended lvds info parsing
  drm/radeon/kms: fix tiling reg on fusion
2011-05-16 08:47:31 -07:00
Chris Ball
86f315bbb2 Revert "mmc: fix a race between card-detect rescan and clock-gate work instances"
This reverts commit 26fc8775b5, which has
been reported to cause boot/resume-time crashes for some users:

https://bbs.archlinux.org/viewtopic.php?id=118751.

Signed-off-by: Chris Ball <cjb@laptop.org>
Cc: <stable@kernel.org>
2011-05-16 11:32:26 -04:00
Chris Wilson
752d2635eb drm: Take lock around probes for drm_fb_helper_hotplug_event
We need to hold the dev->mode_config.mutex whilst detecting the output
status. But we also need to drop it for the call into
drm_fb_helper_single_fb_probe(), which indirectly acquires the lock when
attaching the fbcon.

Failure to do so exposes a race with normal output probing. Detected by
adding some warnings that the mutex is held to the backend detect routines:

[   17.772456] WARNING: at drivers/gpu/drm/i915/intel_crt.c:471 intel_crt_detect+0x3e/0x373 [i915]()
[   17.772458] Hardware name: Latitude E6400
[   17.772460] Modules linked in: ....
[   17.772582] Pid: 11, comm: kworker/0:1 Tainted: G        W 2.6.38.4-custom.2 #8
[   17.772584] Call Trace:
[   17.772591]  [<ffffffff81046af5>] ? warn_slowpath_common+0x78/0x8c
[   17.772603]  [<ffffffffa03f3e5c>] ? intel_crt_detect+0x3e/0x373 [i915]
[   17.772612]  [<ffffffffa0355d49>] ?  drm_helper_probe_single_connector_modes+0xbf/0x2af [drm_kms_helper]
[   17.772619]  [<ffffffffa03534d5>] ?  drm_fb_helper_probe_connector_modes+0x39/0x4d [drm_kms_helper]
[   17.772625]  [<ffffffffa0354760>] ?  drm_fb_helper_hotplug_event+0xa5/0xc3 [drm_kms_helper]
[   17.772633]  [<ffffffffa035577f>] ? output_poll_execute+0x146/0x17c [drm_kms_helper]
[   17.772638]  [<ffffffff81193c01>] ? cfq_init_queue+0x247/0x345
[   17.772644]  [<ffffffffa0355639>] ? output_poll_execute+0x0/0x17c [drm_kms_helper]
[   17.772648]  [<ffffffff8105b540>] ? process_one_work+0x193/0x28e
[   17.772652]  [<ffffffff8105c6bc>] ? worker_thread+0xef/0x172
[   17.772655]  [<ffffffff8105c5cd>] ? worker_thread+0x0/0x172
[   17.772658]  [<ffffffff8105c5cd>] ? worker_thread+0x0/0x172
[   17.772663]  [<ffffffff8105f767>] ? kthread+0x7a/0x82
[   17.772668]  [<ffffffff8100a724>] ? kernel_thread_helper+0x4/0x10
[   17.772671]  [<ffffffff8105f6ed>] ? kthread+0x0/0x82
[   17.772674]  [<ffffffff8100a720>] ? kernel_thread_helper+0x0/0x10

Reported-by:  Frederik Himpe <fhimpe@telenet.be>
References: https://bugs.freedesktop.org/show_bug.cgi?id=36394
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-05-16 12:01:43 +10:00
Linus Torvalds
eed631e0d7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable
* git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable:
  Btrfs: fix FS_IOC_SETFLAGS ioctl
  Btrfs: fix FS_IOC_GETFLAGS ioctl
  fs: remove FS_COW_FL
  Btrfs: fix easily get into ENOSPC in mixed case
  Prevent oopsing in posix_acl_valid()
2011-05-15 10:22:10 -07:00
Li Zefan
e1e8fb6a1f fs: remove FS_COW_FL
FS_COW_FL and FS_NOCOW_FL were newly introduced to control per file
COW in btrfs, but FS_NOCOW_FL is sufficient.

The fact is we don't have corresponding BTRFS_INODE_COW flag.

COW is default, and FS_NOCOW_FL can be used to switch off COW for
a single file.

If we mount btrfs with nodatacow, a newly created file will be set with
the FS_NOCOW_FL flag. So to turn on COW for it, we can just clear the
FS_NOCOW_FL flag.

Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Signed-off-by: Chris Mason <chris.mason@oracle.com>
2011-05-14 16:10:26 -04:00
Linus Torvalds
298eaaad0e Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  bridge: fix forwarding of IPv6
  bonding,llc: Fix structure sizeof incompatibility for some PDUs
  ipv6: restore correct ECN handling on TCP xmit
  ne-h8300: Fix regression caused during net_device_ops conversion
  hydra: Fix regression caused during net_device_ops conversion
  zorro8390: Fix regression caused during net_device_ops conversion
  sfc: Always map MCDI shared memory as uncacheable
  ehea: Fix memory hotplug oops
  libertas: fix cmdpendingq locking
  iwlegacy: fix IBSS mode crashes
  ath9k: Fix a warning due to a queued work during S3 state
  mac80211: don't start the dynamic ps timer if not associated
2011-05-13 15:20:51 -07:00
Linus Torvalds
cf70cc5b9d Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
  NFSv4.1: Ensure that layoutget uses the correct gfp modes
  NFSv4.1: remove pnfs_layout_hdr from pnfs_destroy_all_layouts tmp_list
  NFSv41: Resend on NFS4ERR_RETRY_UNCACHED_REP
2011-05-13 15:19:39 -07:00
Vitalii Demianets
a10e146676 bonding,llc: Fix structure sizeof incompatibility for some PDUs
With some combinations of arch/compiler (e.g. arm-linux-gcc) the sizeof
operator on structure returns value greater than expected. In cases when the
structure is used for mapping PDU fields it may lead to unexpected results
(such as holes and alignment problems in skb data). __packed prevents this
undesired behavior.

Signed-off-by: Vitalii Demianets <vitas@nppfactor.kiev.ua>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-13 15:13:24 -04:00
Serge E. Hallyn
47a150edc2 Cache user_ns in struct cred
If !CONFIG_USERNS, have current_user_ns() defined to (&init_user_ns).

Get rid of _current_user_ns.  This requires nsown_capable() to be
defined in capability.c rather than as static inline in capability.h,
so do that.

Request_key needs init_user_ns defined at current_user_ns if
!CONFIG_USERNS, so forward-declare that in cred.h if !CONFIG_USERNS
at current_user_ns() define.

Compile-tested with and without CONFIG_USERNS.

Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com>
[ This makes a huge performance difference for acl_permission_check(),
  up to 30%.  And that is one of the hottest kernel functions for loads
  that are pathname-lookup heavy.  ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-13 11:45:33 -07:00
Steinar H. Gunderson
ca06707022 ipv6: restore correct ECN handling on TCP xmit
Since commit e9df2e8fd8 (Use appropriate sock tclass setting for
routing lookup) we lost ability to properly add ECN codemarks to ipv6
TCP frames.

It seems like TCP_ECN_send() calls INET_ECN_xmit(), which only sets the
ECN bit in the IPv4 ToS field (inet_sk(sk)->tos), but after the patch,
what's checked is inet6_sk(sk)->tclass, which is a completely different
field.

Close bug https://bugzilla.kernel.org/show_bug.cgi?id=34322

[Eric Dumazet] : added the INET_ECN_dontxmit() fix and replace macros
by inline functions for clarity.

Signed-off-by: Steinar H. Gunderson <sgunderson@bigfoot.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-12 18:52:14 -04:00
Linus Torvalds
df43938bc5 Merge branch 'fbmem'
* fbmem:
  fbmem: make read/write/ioctl use the frame buffer at open time
  fbcon: add lifetime refcount to opened frame buffers
2011-05-12 10:42:36 -07:00
Linus Torvalds
698b368275 fbcon: add lifetime refcount to opened frame buffers
This just adds the refcount and the new registration lock logic.  It
does not (for example) actually change the read/write/ioctl routines to
actually use the frame buffer that was opened: those function still end
up alway susing whatever the current frame buffer is at the time of the
call.

Without this, if something holds the frame buffer open over a
framebuffer switch, the close() operation after the switch will access a
fb_info that has been free'd by the unregistering of the old frame
buffer.

(The read/write/ioctl operations will normally not cause problems,
because they will - illogically - pick up the new fbcon instead.  But a
switch that happens just as one of those is going on might see problems
too, the window is just much smaller: one individual op rather than the
whole open-close sequence.)

This use-after-free is apparently fairly easily triggered by the Ubuntu
11.04 boot sequence.

Acked-by: Tim Gardner <tim.gardner@canonical.com>
Tested-by: Daniel J Blueman <daniel.blueman@gmail.com>
Tested-by: Anca Emanuel <anca.emanuel@gmail.com>
Cc: Bruno Prémont <bonbons@linux-vserver.org>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Andy Whitcroft <andy.whitcroft@canonical.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-12 07:37:51 -07:00
Trond Myklebust
a75b9df9d3 NFSv4.1: Ensure that layoutget uses the correct gfp modes
Currently, writebacks may end up recursing back into the filesystem due to
GFP_KERNEL direct reclaims in the pnfs subsystem.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-05-11 22:52:13 -04:00
Mel Gorman
1d929b7a84 mm: tracing: add missing GFP flags to tracing
include/linux/gfp.h and include/trace/events/gfpflags.h are out of sync.
When tracing is enabled, certain flags are not recognised and the text
output is less useful as a result.  Add the missing flags.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-11 18:50:45 -07:00
Andi Kleen
ee85c2e145 mm: add alloc_pages_exact_nid()
Add a alloc_pages_exact_nid() that allocates on a specific node.

The naming is quite broken, but fixing that would need a larger renaming
action.

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: tweak comment]
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: David Rientjes <rientjes@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-11 18:50:45 -07:00
Yinghai Lu
8f389a99b6 mm: use alloc_bootmem_node_nopanic() on really needed path
Stefan found nobootmem does not work on his system that has only 8M of
RAM.  This causes an early panic:

  BIOS-provided physical RAM map:
   BIOS-88: 0000000000000000 - 000000000009f000 (usable)
   BIOS-88: 0000000000100000 - 0000000000840000 (usable)
  bootconsole [earlyser0] enabled
  Notice: NX (Execute Disable) protection missing in CPU or disabled in BIOS!
  DMI not present or invalid.
  last_pfn = 0x840 max_arch_pfn = 0x100000
  init_memory_mapping: 0000000000000000-0000000000840000
  8MB LOWMEM available.
    mapped low ram: 0 - 00840000
    low ram: 0 - 00840000
  Zone PFN ranges:
    DMA      0x00000001 -> 0x00001000
    Normal   empty
  Movable zone start PFN for each node
  early_node_map[2] active PFN ranges
      0: 0x00000001 -> 0x0000009f
      0: 0x00000100 -> 0x00000840
  BUG: Int 6: CR2 (null)
       EDI c034663c  ESI (null)  EBP c0329f38  ESP c0329ef4
       EBX c0346380  EDX 00000006  ECX ffffffff  EAX fffffff4
       err (null)  EIP c0353191   CS c0320060  flg 00010082
  Stack: (null) c030c533 000007cd (null) c030c533 00000001 (null) (null)
         00000003 0000083f 00000018 00000002 00000002 c0329f6c c03534d6 (null)
         (null) 00000100 00000840 (null) c0329f64 00000001 00001000 (null)
  Pid: 0, comm: swapper Not tainted 2.6.36 #5
  Call Trace:
   [<c02e3707>] ? 0xc02e3707
   [<c035e6e5>] 0xc035e6e5
   [<c0353191>] ? 0xc0353191
   [<c03534d6>] 0xc03534d6
   [<c034f1cd>] 0xc034f1cd
   [<c034a824>] 0xc034a824
   [<c03513cb>] ? 0xc03513cb
   [<c0349432>] 0xc0349432
   [<c0349066>] 0xc0349066

It turns out that we should ignore the low limit of 16M.

Use alloc_bootmem_node_nopanic() in this case.

[akpm@linux-foundation.org: less mess]
Signed-off-by: Yinghai LU <yinghai@kernel.org>
Reported-by: Stefan Hellermann <stefan@the2masters.de>
Tested-by: Stefan Hellermann <stefan@the2masters.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@linux.intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@kernel.org>		[2.6.34+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-11 18:50:44 -07:00
Linus Torvalds
9f381a61f5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (27 commits)
  slcan: fix ldisc->open retval
  net/usb: mark LG VL600 LTE modem ethernet interface as WWAN
  xfrm: Don't allow esn with disabled anti replay detection
  xfrm: Assign the inner mode output function to the dst entry
  net: dev_close() should check IFF_UP
  vlan: fix GVRP at dismantle time
  netfilter: revert a2361c8735
  netfilter: IPv6: fix DSCP mangle code
  netfilter: IPv6: initialize TOS field in REJECT target module
  IPVS: init and cleanup restructuring
  IPVS: Change of socket usage to enable name space exit.
  netfilter: ebtables: only call xt_compat_add_offset once per rule
  netfilter: fix ebtables compat support
  netfilter: ctnetlink: fix timestamp support for new conntracks
  pch_gbe: support ML7223 IOH
  PCH_GbE : Fixed the issue of checksum judgment
  PCH_GbE : Fixed the issue of collision detection
  NET: slip, fix ldisc->open retval
  be2net: Fixed bugs related to PVID.
  ehea: fix wrongly reported speed and port
  ...
2011-05-10 17:39:01 -07:00
David S. Miller
9bbc052d5e Merge branch 'pablo/nf-2.6-updates' of git://1984.lsi.us.es/net-2.6 2011-05-10 15:04:35 -07:00
Steffen Klassert
43a4dea4c9 xfrm: Assign the inner mode output function to the dst entry
As it is, we assign the outer modes output function to the dst entry
when we create the xfrm bundle. This leads to two problems on interfamily
scenarios. We might insert ipv4 packets into ip6_fragment when called
from xfrm6_output. The system crashes if we try to fragment an ipv4
packet with ip6_fragment. This issue was introduced with git commit
ad0081e4 (ipv6: Fragment locally generated tunnel-mode IPSec6 packets
as needed). The second issue is, that we might insert ipv4 packets in
netfilter6 and vice versa on interfamily scenarios.

With this patch we assign the inner mode output function to the dst entry
when we create the xfrm bundle. So xfrm4_output/xfrm6_output from the inner
mode is used and the right fragmentation and netfilter functions are called.
We switch then to outer mode with the output_finish functions.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-05-10 15:03:34 -07:00
Hans Schillstrom
7a4f0761fc IPVS: init and cleanup restructuring
DESCRIPTION
This patch tries to restore the initial init and cleanup
sequences that was before namspace patch.
Netns also requires action when net devices unregister
which has never been implemented. I.e this patch also
covers when a device moves into a network namespace,
and has to be released.

IMPLEMENTATION
The number of calls to register_pernet_device have been
reduced to one for the ip_vs.ko
Schedulers still have their own calls.

This patch adds a function __ip_vs_service_cleanup()
and an enable flag for the netfilter hooks.

The nf hooks will be enabled when the first service is loaded
and never disabled again, except when a namespace exit starts.

Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
[horms@verge.net.au: minor edit to changelog]
Signed-off-by: Simon Horman <horms@verge.net.au>
2011-05-10 09:52:47 +02:00
Mikulas Patocka
a09a79f668 Don't lock guardpage if the stack is growing up
Linux kernel excludes guard page when performing mlock on a VMA with
down-growing stack. However, some architectures have up-growing stack
and locking the guard page should be excluded in this case too.

This patch fixes lvm2 on PA-RISC (and possibly other architectures with
up-growing stack). lvm2 calculates number of used pages when locking and
when unlocking and reports an internal error if the numbers mismatch.

[ Patch changed fairly extensively to also fix /proc/<pid>/maps for the
  grows-up case, and to move things around a bit to clean it all up and
  share the infrstructure with the /proc bits.

  Tested on ia64 that has both grow-up and grow-down segments  - Linus ]

Signed-off-by: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
Tested-by: Tony Luck <tony.luck@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-09 16:22:07 -07:00
Linus Torvalds
fd98a5d780 Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6
* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
  drm/radeon/kms: add pci id to acer travelmate quirk for 5730
  drm/radeon: fix order of doing things in radeon_crtc_cursor_set
  drm: mm: fix debug output
  drm/radeon/kms: ATPX switcheroo fixes
  drm/nouveau: Fix a crash at card takedown for NV40 and older cards
2011-05-09 09:09:04 -07:00
Daniel Vetter
2bbd449255 drm: mm: fix debug output
The looping helper didn't do anything due to a superficial
semicolon. Furthermore one of the two dump functions suffered
from copy&paste fail.

While staring at the code I've also noticed that the replace
helper (currently unused) is a bit broken.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-05-09 09:14:45 +10:00
Linus Torvalds
8b061610da Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf tools: Makefile: Use gcc to determine ARCH
  perf events, x86: Fix Intel Nehalem and Westmere last level cache event definitions
  hw_breakpoints, powerpc: Fix CONFIG_HAVE_HW_BREAKPOINT off-case in ptrace_set_debugreg()
  sh, hw_breakpoints: Fix racy access to ptrace breakpoints
  arm, hw_breakpoints: Fix racy access to ptrace breakpoints
  powerpc, hw_breakpoints: Fix racy access to ptrace breakpoints
  x86, hw_breakpoints: Fix racy access to ptrace breakpoints
  ptrace: Prepare to fix racy accesses on task breakpoints
2011-05-07 13:17:37 -07:00
Arjan van de Ven
a3a4a5acd3 Regression: partial revert "tracing: Remove lock_depth from event entry"
This partially reverts commit e6e1e25935.

That commit changed the structure layout of the trace structure, which
in turn broke PowerTOP (1.9x generation) quite badly.

I appreciate not wanting to expose the variable in question, and
PowerTOP was not using it, so I've replaced the variable with just a
padding field - that way if in the future a new field is needed it can
just use this padding field.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-06 13:20:59 -07:00
Ingo Molnar
4d70230bb4 Merge branch 'master' of ssh://master.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 into perf/urgent 2011-05-06 08:11:28 +02:00
Linus Torvalds
8a3d8ed027 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
  flex_arrays: allow zero length flex arrays
  flex_array: flex_array_prealloc takes a number of elements, not an end
  SELinux: pass last path component in may_create
2011-05-04 14:21:08 -07:00
Thomas Gleixner
30106b8ce2 slub: Fix the lockless code on 32-bit platforms with no 64-bit cmpxchg
The SLUB allocator use of the cmpxchg_double logic was wrong: it
actually needs the irq-safe one.

That happens automatically when we use the native unlocked 'cmpxchg8b'
instruction, but when compiling the kernel for older x86 CPUs that do
not support that instruction, we fall back to the generic emulation
code.

And if you don't specify that you want the irq-safe version, the generic
code ends up just open-coding the cmpxchg8b equivalent without any
protection against interrupts or preemption.  Which definitely doesn't
work for SLUB.

This was reported by Werner Landgraf <w.landgraf@ru.ru>, who saw
instability with his distro-kernel that was compiled to support pretty
much everything under the sun.  Most big Linux distributions tend to
compile for PPro and later, and would never have noticed this problem.

This also fixes the prototypes for the irqsafe cmpxchg_double functions
to use 'bool' like they should.

[ Btw, that whole "generic code defaults to no protection" design just
  sounds stupid - if the code needs no protection, there is no reason to
  use "cmpxchg_double" to begin with.  So we should probably just remove
  the unprotected version entirely as pointless.   - Linus ]

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reported-and-tested-by: werner <w.landgraf@ru.ru>
Acked-and-tested-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Tejun Heo <tj@kernel.org>
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1105041539050.3005@ionos
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-04 14:20:20 -07:00
Ingo Molnar
98bb318864 Merge branch 'perf/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into perf/urgent 2011-05-04 20:33:42 +02:00
James Morris
6f23928454 Merge branch 'for-linus' of git://git.infradead.org/users/eparis/selinux into for-linus 2011-05-04 11:59:34 +10:00
Alex Deucher
8aeb96f802 drm/radeon/kms: fix gart setup on fusion parts (v2)
Out of the entire GART/VM subsystem, the hw designers changed
the location of 3 regs.

v2: airlied: add parameter for userspace to work from.

Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-05-04 10:16:40 +10:00
Alex Deucher
e2c85d8e39 drm/radeon/kms: add some new pci ids
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-05-04 09:28:59 +10:00
Linus Torvalds
9a6cd4b45a Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc:
  mmc: sdhci: Check mrq != NULL in sdhci_tasklet_finish
  mmc: sdhci: Check mrq->cmd in sdhci_tasklet_finish
  mmc: tmio: fix .set_ios(MMC_POWER_UP) handling
  mmc: fix a race between card-detect rescan and clock-gate work instances
  mmc: omap: Fix possible NULL pointer deref
  mmc: core: mmc_add_card(): fix missing break in switch statement
  mmc: sdhci-pci: Fix error case in sdhci_pci_probe_slot()
2011-05-03 09:24:44 -07:00
Linus Torvalds
497ff03444 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: wm831x-ts - move BTN_TOUCH reporting to data transfer
  Input: wm831x-ts - allow IRQ flags to be specified
  Input: wm831x-ts - fix races with IRQ management
2011-05-02 20:26:32 -07:00
Linus Torvalds
5933f2ae35 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (47 commits)
  sysctl: net: call unregister_net_sysctl_table where needed
  Revert: veth: remove unneeded ifname code from veth_newlink()
  smsc95xx: fix reset check
  tg3: Fix failure to enable WoL by default when possible
  networking: inappropriate ioctl operation should return ENOTTY
  amd8111e: trivial typo spelling: Negotitate -> Negotiate
  ipv4: don't spam dmesg with "Using LC-trie" messages
  af_unix: Only allow recv on connected seqpacket sockets.
  mii: add support of pause frames in mii_get_an
  net: ftmac100: fix scheduling while atomic during PHY link status change
  usbnet: Transfer of maintainership
  usbnet: add support for some Huawei modems with cdc-ether ports
  bnx2: cancel timer on device removal
  iwl4965: fix "Received BA when not expected"
  iwlagn: fix "Received BA when not expected"
  dsa/mv88e6131: fix unknown multicast/broadcast forwarding on mv88e6085
  usbnet: Resubmit interrupt URB if device is open
  iwl4965: fix "TX Power requested while scanning"
  iwlegacy: led stay solid on when no traffic
  b43: trivial: update module info about ucode16_mimo firmware
  ...
2011-05-02 18:00:43 -07:00
Jean Delvare
a6e5e2be44 i2c-i801: Move device ID definitions to driver
Move the SMBus device ID definitions of recent devices from pci_ids.h
to the i2c-i801.c driver file. They don't have to be shared, as they
are clearly identified and only used in this driver. In the future,
such IDs will go to i2c-i801 directly. This will make adding support
for new devices much faster and easier, as it will avoid cross-
subsystem patch sets and merge conflicts.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Seth Heasley <seth.heasley@intel.com>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-05-01 18:18:49 +02:00
Linus Torvalds
fafc9929c6 Merge branch 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6
* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
  drm/i915: restore only the mode of this driver on lastclose (v2)
  drm/radeon/kms: add info query for tile pipes
  drm/radeon/kms: add missing safe regs for 6xx/7xx
  drm: select FRAMEBUFFER_CONSOLE_PRIMARY if we have FRAMEBUFFER_CONSOLE
2011-04-28 13:14:02 -07:00
Linus Torvalds
9cab1ba421 Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
  nfs: don't lose MS_SYNCHRONOUS on remount of noac mount
  NFS: Return meaningful status from decode_secinfo()
  NFSv4: Ensure we request the ordinary fileid when doing readdirplus
  NFSv4: Ensure that clientid and session establishment can time out
  SUNRPC: Allow RPC calls to return ETIMEDOUT instead of EIO
  NFSv4.1: Don't loop forever in nfs4_proc_create_session
  NFSv4: Handle NFS4ERR_WRONGSEC outside of nfs4_handle_exception()
  NFSv4.1: Don't update sequence number if rpc_task is not sent
  NFSv4.1: Ensure state manager thread dies on last umount
  SUNRPC: Fix the SUNRPC Kerberos V RPCSEC_GSS module dependencies
  NFS: Use correct variable for page bounds checking
  NFS: don't negotiate when user specifies sec flavor
  NFS: Attempt mount with default sec flavor first
  NFS: flav_array honors NFS_MAX_SECFLAVORS
  NFS: Fix infinite loop in gss_create_upcall()
  Don't mark_inode_dirty_sync() while holding lock
  NFS: Get rid of pointless test in nfs_commit_done
  NFS: Remove unused argument from nfs_find_best_sec()
  NFS: Eliminate duplicate call to nfs_mark_request_dirty
  NFS: Remove dead code from nfs_fs_mount()
2011-04-28 13:13:07 -07:00
Eric Paris
5d30b10bd6 flex_array: flex_array_prealloc takes a number of elements, not an end
Change flex_array_prealloc to take the number of elements for which space
should be allocated instead of the last (inclusive) element. Users
and documentation are updated accordingly.  flex_arrays got introduced before
they had users.  When folks started using it, they ended up needing a
different API than was coded up originally.  This swaps over to the API that
folks apparently need.

Based-on-patch-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Eric Paris <eparis@redhat.com>
Tested-by: Chris Richards <gizmo@giz-works.com>
Acked-by: Dave Hansen <dave@linux.vnet.ibm.com>
Cc: stable@kernel.org [2.6.38+]
2011-04-28 16:12:47 -04:00
Paul Stewart
68972efa65 usbnet: Resubmit interrupt URB if device is open
Resubmit interrupt URB if device is open.  Use a flag set in
usbnet_open() to determine this state.  Also kill and free
interrupt URB in usbnet_disconnect().

[Rebased off git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git]

Signed-off-by: Paul Stewart <pstew@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-28 12:56:09 -07:00
Andrea Arcangeli
78f11a2557 mm: thp: fix /dev/zero MAP_PRIVATE and vm_flags cleanups
The huge_memory.c THP page fault was allowed to run if vm_ops was null
(which would succeed for /dev/zero MAP_PRIVATE, as the f_op->mmap wouldn't
setup a special vma->vm_ops and it would fallback to regular anonymous
memory) but other THP logics weren't fully activated for vmas with vm_file
not NULL (/dev/zero has a not NULL vma->vm_file).

So this removes the vm_file checks so that /dev/zero also can safely use
THP (the other albeit safer approach to fix this bug would have been to
prevent the THP initial page fault to run if vm_file was set).

After removing the vm_file checks, this also makes huge_memory.c stricter
in khugepaged for the DEBUG_VM=y case.  It doesn't replace the vm_file
check with a is_pfn_mapping check (but it keeps checking for VM_PFNMAP
under VM_BUG_ON) because for a is_cow_mapping() mapping VM_PFNMAP should
only be allowed to exist before the first page fault, and in turn when
vma->anon_vma is null (so preventing khugepaged registration).  So I tend
to think the previous comment saying if vm_file was set, VM_PFNMAP might
have been set and we could still be registered in khugepaged (despite
anon_vma was not NULL to be registered in khugepaged) was too paranoid.
The is_linear_pfn_mapping check is also I think superfluous (as described
by comment) but under DEBUG_VM it is safe to stay.

Addresses https://bugzilla.kernel.org/show_bug.cgi?id=33682

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reported-by: Caspar Zhang <bugs@casparzhang.com>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: <stable@kernel.org>		[2.6.38.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-04-28 11:28:20 -07:00