Commit Graph

426690 Commits

Author SHA1 Message Date
Linus Torvalds
16e5a2ed59 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking updates from David Miller:

 1) Fix flexcan build on big endian, from Arnd Bergmann

 2) Correctly attach cpsw to GPIO bitbang MDIO driver, from Stefan Roese

 3) udp_add_offload has to use GFP_ATOMIC since it can be invoked from
    non-sleepable contexts.  From Or Gerlitz

 4) vxlan_gro_receive() does not iterate over all possible flows
    properly, fix also from Or Gerlitz

 5) CAN core doesn't use a proper SKB destructor when it hooks up
    sockets to SKBs.  Fix from Oliver Hartkopp

 6) ip_tunnel_xmit() can use an uninitialized route pointer, fix from
    Eric Dumazet

 7) Fix address family assignment in IPVS, from Michal Kubecek

 8) Fix ath9k build on ARM, from Sujith Manoharan

 9) Make sure fail_over_mac only applies for the correct bonding modes,
    from Ding Tianhong

10) The udp offload code doesn't use RCU correctly, from Shlomo Pongratz

11) Handle gigabit features properly in generic PHY code, from Florian
    Fainelli

12) Don't blindly invoke link operations in
    rtnl_link_get_slave_info_data_size, they are optional.  Fix from
    Fernando Luis Vazquez Cao

13) Add USB IDs for Netgear Aircard 340U, from Bjørn Mork

14) Handle netlink packet padding properly in openvswitch, from Thomas
    Graf

15) Fix oops when deleting chains in nf_tables, from Patrick McHardy

16) Fix RX stalls in xen-netback driver, from Zoltan Kiss

17) Fix deadlock in mac80211 stack, from Emmanuel Grumbach

18) inet_nlmsg_size() forgets to consider ifa_cacheinfo, fix from Geert
    Uytterhoeven

19) tg3_change_mtu() can deadlock, fix from Nithin Sujir

20) Fix regression in setting SCTP local source addresses on accepted
    sockets, caused by some generic ipv6 socket changes.  Fix from
    Matija Glavinic Pecotic

21) IPPROTO_* must be pure defines, otherwise module aliases don't get
    constructed properly.  Fix from Jan Moskyto

22) IPV6 netconsole setup doesn't work properly unless an explicit
    source address is specified, fix from Sabrina Dubroca

23) Use __GFP_NORETRY for high order skb page allocations in
    sock_alloc_send_pskb and skb_page_frag_refill.  From Eric Dumazet

24) Fix a regression added in netconsole over bridging, from Cong Wang

25) TCP uses an artificial offset of 1ms for SRTT, but this doesn't jive
    well with TCP pacing which needs the SRTT to be accurate.  Fix from
    Eric Dumazet

26) Several cases of missing header file includes from Rashika Kheria

27) Add ZTE MF667 device ID to qmi_wwan driver, from Raymond Wanyoike

28) TCP Small Queues doesn't handle nonagle properly in some corner
    cases, fix from Eric Dumazet

29) Remove extraneous read_unlock in bond_enslave, whoops.  From Ding
    Tianhong

30) Fix 9p trans_virtio handling of vmalloc buffers, from Richard Yao

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (136 commits)
  6lowpan: fix lockdep splats
  alx: add missing stats_lock spinlock init
  9p/trans_virtio.c: Fix broken zero-copy on vmalloc() buffers
  bonding: remove unwanted bond lock for enslave processing
  USB2NET : SR9800 : One chip USB2.0 USB2NET SR9800 Device Driver Support
  tcp: tsq: fix nonagle handling
  bridge: Prevent possible race condition in br_fdb_change_mac_address
  bridge: Properly check if local fdb entry can be deleted when deleting vlan
  bridge: Properly check if local fdb entry can be deleted in br_fdb_delete_by_port
  bridge: Properly check if local fdb entry can be deleted in br_fdb_change_mac_address
  bridge: Fix the way to check if a local fdb entry can be deleted
  bridge: Change local fdb entries whenever mac address of bridge device changes
  bridge: Fix the way to find old local fdb entries in br_fdb_change_mac_address
  bridge: Fix the way to insert new local fdb entries in br_fdb_changeaddr
  bridge: Fix the way to find old local fdb entries in br_fdb_changeaddr
  tcp: correct code comment stating 3 min timeout for FIN_WAIT2, we only do 1 min
  net: vxge: Remove unused device pointer
  net: qmi_wwan: add ZTE MF667
  3c59x: Remove unused pointer in vortex_eisa_cleanup()
  net: fix 'ip rule' iif/oif device rename
  ...
2014-02-11 12:05:55 -08:00
J. Bruce Fields
09bdc2d70d nfsd4: fix acl buffer overrun
4ac7249ea5 "nfsd: use get_acl and ->set_acl" forgets to set the size in
the case get_acl() succeeds, so _posix_to_nfsv4_one() can then write
past the end of its allocation.
Symptoms were slab corruption warnings.

Also, some minor cleanup while we're here.  (Among other things, note
that the first few lines guarantee that pacl is non-NULL.)

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2014-02-11 13:48:11 -05:00
Steven Rostedt (Red Hat)
d651aa1d68 ring-buffer: Fix first commit on sub-buffer having non-zero delta
Each sub-buffer (buffer page) has a full 64 bit timestamp. The events on
that page use a 27 bit delta against that timestamp in order to save on
bits written to the ring buffer. If the time between events is larger than
what the 27 bits can hold, a "time extend" event is added to hold the
entire 64 bit timestamp again and the events after that hold a delta from
that timestamp.

As a "time extend" is always paired with an event, it is logical to just
allocate the event with the time extend, to make things a bit more efficient.

Unfortunately, when the pairing code was written, it removed the "delta = 0"
from the first commit on a page, causing the events on the page to be
slightly skewed.
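
The delta scheme described above can be sketched in userspace C. This is
only an illustration under stated assumptions: encode_event(), struct
encoded_event and DELTA_BITS are hypothetical names, not the ring buffer's
actual API, and the real code tracks considerably more state.

```c
#include <assert.h>
#include <stdint.h>

#define DELTA_BITS 27
#define DELTA_MAX  ((1ULL << DELTA_BITS) - 1)

/* Hypothetical summary of what would be written for one event. */
struct encoded_event {
	int      needs_time_extend; /* 1 if a "time extend" must accompany it */
	uint32_t delta;             /* value stored in the 27-bit delta field */
};

/*
 * prev_ts:       timestamp the delta is measured against
 * ts:            timestamp of the event being written
 * first_on_page: non-zero for the first commit on a sub-buffer
 */
static struct encoded_event encode_event(uint64_t prev_ts, uint64_t ts,
					 int first_on_page)
{
	struct encoded_event ev = { 0, 0 };
	uint64_t delta;

	assert(ts >= prev_ts);

	/* The page header already carries the full timestamp, so the
	 * first event on the page must be written with a delta of 0. */
	if (first_on_page)
		return ev;

	delta = ts - prev_ts;
	if (delta > DELTA_MAX) {
		/* Too large for 27 bits: pair the event with a time extend. */
		ev.needs_time_extend = 1;
		return ev;
	}

	ev.delta = (uint32_t)delta;
	return ev;
}
```

In this sketch the bug described above would correspond to dropping the
first_on_page check, so that the first event inherits a non-zero delta.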

Fixes: 69d1b839f7 "ring-buffer: Bind time extend and data events together"
Cc: stable@vger.kernel.org # 2.6.37+
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2014-02-11 13:38:54 -05:00
Linus Walleij
8858d88a25 ARM: ux500: disable msp2 device tree node
Commit 70b41abc15
"ARM: ux500: move MSP pin control to the device tree"
accidentally activated MSP2, giving rise to a boot scroll
scream as the kernel attempts to probe a driver for it and
fails to obtain DMA channel 14.

Fix this up by marking the node disabled again.

Acked-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Kevin Hilman <khilman@linaro.org>
2014-02-11 09:21:45 -08:00
Christoph Hellwig
49f5baa510 blk-mq: pair blk_mq_start_request / blk_mq_requeue_request
Make sure we have a proper pairing between starting and requeueing
requests.  Move the dma drain and REQ_END setup into blk_mq_start_request,
and make sure blk_mq_requeue_request properly undoes them, giving us
a pair of functions to prepare and unprepare a request without leaving
side effects.

Together this ensures we always clean up properly after
BLK_MQ_RQ_QUEUE_BUSY returns from ->queue_rq.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-02-11 09:34:08 -07:00
Christoph Hellwig
1e93b8c274 blk-mq: dont assume rq->errors is set when returning an error from ->queue_rq
rq->errors never has been part of the communication protocol between drivers
and the block stack and most drivers will not have initialized it.

Return -EIO to upper layers when the driver returns BLK_MQ_RQ_QUEUE_ERROR
unconditionally.  If a driver wants to return a different error it can easily
do so by returning success after calling blk_mq_end_io itself.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-02-11 09:34:07 -07:00
Kent Overstreet
8423ae3d7a block: Fix cloning of discard/write same bios
Immutable biovecs changed the way bio segments are treated in such a way that
bio_for_each_segment() cannot now do what we want for discard/write same bios,
since bi_size means something completely different for them.

Fortunately discard and write same bios never have more than a single biovec, so
bio_for_each_segment() is unnecessary and not terribly meaningful for them, but
we still have to special case them in a few places.

Signed-off-by: Kent Overstreet <kmo@daterainc.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-02-11 08:40:45 -07:00
Paul Bolle
d8320b2d2e ia64/xen: Remove Xen support for ia64 even more
Commit d52eefb47d ("ia64/xen: Remove Xen support for ia64") removed
the Kconfig symbol XEN_XENCOMM. But it didn't remove the code depending
on that symbol. Remove that code now.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Acked-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2014-02-11 10:12:37 -05:00
David Vrabel
564eb714f5 xen: install xen/gntdev.h and xen/gntalloc.h
xen/gntdev.h and xen/gntalloc.h both provide userspace ABIs so they
should be installed.

CC: stable@vger.kernel.org
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2014-02-11 10:12:36 -05:00
David Vrabel
97253eeeb7 xen/events: bind all new interdomain events to VCPU0
Commit fc087e1073 (xen/events: remove
unnecessary init_evtchn_cpu_bindings()) causes a regression.

The kernel-side VCPU binding was not being correctly set for newly
allocated or bound interdomain events.  In ARM guests where 2-level
events were used, this would result in no interdomain events being
handled because the kernel-side VCPU masks would all be clear.

x86 guests would work because the irq affinity was set during irq
setup and this would set the correct kernel-side VCPU binding.

Fix this by properly initializing the kernel-side VCPU binding in
bind_evtchn_to_irq().

Reported-and-tested-by: Julien Grall <julien.grall@linaro.org>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2014-02-11 10:12:34 -05:00
Tomi Valkeinen
c56812fc85 OMAPDSS: fix fck field types
'fck' field in dpi and sdi clock calculation struct is 'unsigned long
long', even though it should be 'unsigned long'. This hasn't caused any
issues so far.

Fix the field's type.

Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2014-02-11 16:19:46 +02:00
Tomi Valkeinen
eec77da274 OMAPDSS: DISPC: decimation rounding fix
The driver uses DIV_ROUND_UP when calculating decimated width & height.
For example, when decimating with 3, the width is calculated as:

  width = DIV_ROUND_UP(width, decim_x);

This yields bad results for some values. For example, 800/3=266.666...,
which is rounded to 267. When the input width is set to 267, and pixel
increment is set to 3, this causes the dispc to read a line of 801
pixels, i.e. it reads a wrong pixel at the end of the line.

Even more pressing, the above rounding causes a BUG() in pixinc(), as
the value of 801 is used to calculate row increment, leading to a bad
value being passed to pixinc().

This patch fixes the decimation by removing the DIV_ROUND_UP()s when
calculating width and height for decimation.
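
The arithmetic in the example above can be checked with a small standalone
C sketch. The helper names (decimate_round_up, decimate_truncate,
pixels_read) are hypothetical and exist only for this illustration; the
pixels_read() model simply follows the description above, where the
decimated width multiplied by the pixel increment gives the line length
read.

```c
#include <assert.h>

/* Same definition as the kernel macro of this name. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Decimated size as computed before the fix. */
static unsigned int decimate_round_up(unsigned int size, unsigned int decim)
{
	assert(decim > 0);
	return DIV_ROUND_UP(size, decim);
}

/* Decimated size with plain truncating division, as after the fix. */
static unsigned int decimate_truncate(unsigned int size, unsigned int decim)
{
	assert(decim > 0);
	return size / decim;
}

/* Source pixels touched when stepping by `decim` for each output pixel. */
static unsigned int pixels_read(unsigned int decimated, unsigned int decim)
{
	return decimated * decim;
}
```

With a width of 800 and a decimation of 3, the rounded-up variant gives 267
and a read of 801 pixels, one past the end of the line, while the truncating
variant gives 266 and a read of 798 pixels.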

Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
2014-02-11 16:19:41 +02:00
Mark Brown
cf20662db4 Merge remote-tracking branches 'spi/fix/doc', 'spi/fix/nuc900' and 'spi/fix/rspi' into spi-linus 2014-02-11 12:08:27 +00:00
Mark Brown
797d0dec8a Merge remote-tracking branch 'spi/fix/core' into spi-linus 2014-02-11 12:08:26 +00:00
Mika Kuoppala
1d2cb9a54a drm/i915: Pair va_copy with va_end in i915_error_vprintf
Each invocation of va_copy() must be matched by a corresponding
invocation of va_end() in the same function.
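
As an illustration of the pairing rule, here is a minimal userspace sketch
of two-pass string formatting. It is not the i915 code: format_two_pass()
and format_alloc() are hypothetical helpers, and malloc() stands in for
whatever allocator the real function uses.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Format into a freshly allocated buffer; the caller frees the result. */
static char *format_two_pass(const char *fmt, va_list args)
{
	va_list tmp;
	char *buf;
	int len;

	/* Pass 1: measure the output using a copy of the argument list. */
	va_copy(tmp, args);
	len = vsnprintf(NULL, 0, fmt, tmp);
	va_end(tmp);	/* every va_copy() needs a va_end() in this function */

	if (len < 0)
		return NULL;

	buf = malloc((size_t)len + 1);
	if (!buf)
		return NULL;

	/* Pass 2: produce the output with the original list. */
	vsnprintf(buf, (size_t)len + 1, fmt, args);
	return buf;
}

/* Variadic front end for the helper above. */
static char *format_alloc(const char *fmt, ...)
{
	va_list args;
	char *s;

	va_start(args, fmt);
	s = format_two_pass(fmt, args);
	va_end(args);
	return s;
}
```

A call such as format_alloc("%s-%d", "err", 42) returns a heap string that
the caller releases with free().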

This regression has been introduced in

commit e29bb4ebbf
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Sep 20 10:20:59 2013 +0100

    drm/i915: Use a temporary va_list for two-pass string handling

Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-02-11 11:53:15 +01:00
Daniel Vetter
a2d213dd77 drm/i915: Fix intel_pipe_to_cpu_transcoder for UMS
We don't have all the drm_crtc&co hanging around in that case.

This regression has been introduced in

commit 391f75e2bf
Author: Ville Syrjälä <ville.syrjala@linux.intel.com>
Date:   Wed Sep 25 19:55:26 2013 +0300

    drm/i915: Fix pre-CTG vblank counter

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=69521
Cc: stable@vger.kernel.org (for 3.13 only)
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-02-11 11:53:09 +01:00
Paul Gortmaker
2c45aada34 genirq: Add missing irq_to_desc export for CONFIG_SPARSE_IRQ=n
In allmodconfig builds for sparc and any other arch which does
not set CONFIG_SPARSE_IRQ, the following will be seen at modpost:

  CC [M]  lib/cpu-notifier-error-inject.o
  CC [M]  lib/pm-notifier-error-inject.o
ERROR: "irq_to_desc" [drivers/gpio/gpio-mcp23s08.ko] undefined!
make[2]: *** [__modpost] Error 1

This happens because commit 3911ff30f5 ("genirq: export
handle_edge_irq() and irq_to_desc()") added one export for it, but
there were actually two instances of it, in an if/else clause for
CONFIG_SPARSE_IRQ.  Add the second one.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: stable@vger.kernel.org	# 3.4+
Link: http://lkml.kernel.org/r/1392057610-11514-1-git-send-email-paul.gortmaker@windriver.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2014-02-11 10:30:36 +01:00
Benjamin Herrenschmidt
cd15b04844 powerpc/powernv: Add iommu DMA bypass support for IODA2
This patch adds support for creating a direct iommu "bypass"
window on IODA2 bridges (such as Power8), allowing 64-bit DMA
capable devices to bypass iommu page translation completely and
thus significantly improving DMA performance.

Additionally, this adds a hook to the struct iommu_table so that
the IOMMU API / VFIO can disable the bypass when external ownership
is requested, since in that case, the device will be used by an
environment such as userspace or a KVM guest which must not be
allowed to bypass translations.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-02-11 16:07:37 +11:00
Dave Airlie
379dd277ed Merge tag 'drm-intel-fixes-2014-02-06' of ssh://git.freedesktop.org/git/drm-intel into drm-next
Just minor stuff really, one vlv dp fix and two patches to tune down some
opregion sanity check. Plus MAINTAINERS update for the new git repo, which
is the only reason I've really bothered with this pull request.

* tag 'drm-intel-fixes-2014-02-06' of ssh://git.freedesktop.org/git/drm-intel:
  drm/i915: demote opregion excessive timeout WARN_ONCE to DRM_INFO_ONCE
  drm: add DRM_INFO_ONCE() to print a one-time DRM_INFO() message
  MAINTAINERS: Update drm/i915 git repo
  drm/i915: vlv: fix DP PHY lockup due to invalid PP sequencer setup
2014-02-11 12:57:27 +10:00
Dave Airlie
30d4442500 Merge branch 'exynos-drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-next
This pull request fixes memory leak issue in exynos_drm_open() and
multiplatform breakage for ipp/gsc. And also including some cleanups.

* 'exynos-drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos:
  drm/exynos: Convert to use the standard hdmi.h header
  drm/exynos: Fix trivial typo
  drm/exynos: Remove unnecessary semicolon
  drm/exynos: Fix multiplatform breakage for ipp/gsc
  drm/exynos: Fix freeing issues in exynos_drm_drv.c
2014-02-11 12:56:57 +10:00
Dave Airlie
7431105b14 Merge branch 'msm-next' of git://people.freedesktop.org/~robclark/linux into drm-next
Compared to original fixes pull req that I sent yesterday, this adds
one more fix that I found for a synchronization issue which starts to
crop up when we use XA in DDX for 2d accel on 3d core.  In particular,
accelerating presentation blit triggers this problem.

* 'msm-next' of git://people.freedesktop.org/~robclark/linux:
  drm/msm: bigger synchronization hammer
  drm/msm: fix deadlock in bo create fail path
  drm/msm/mdp4: cursor fixes
  drm/msm/mdp4: pageflip fixes
  drm/msm/mdp5: fix ref leaks in error paths
  drm/msm: fix inconsequential typo
2014-02-11 12:56:17 +10:00
Eric Dumazet
20e7c4e80d 6lowpan: fix lockdep splats
When a device ndo_start_xmit() calls again dev_queue_xmit(),
lockdep can complain because dev_queue_xmit() is re-entered and the
spinlocks protecting tx queues share a common lockdep class.

Same issue was fixed for bonding/l2tp/ppp in commits

0daa230302 ("[PATCH] bonding: lockdep annotation")
49ee49202b ("bonding: set qdisc_tx_busylock to avoid LOCKDEP splat")
23d3b8bfb8 ("net: qdisc busylock needs lockdep annotations ")
303c07db48 ("ppp: set qdisc_tx_busylock to avoid LOCKDEP splat ")

Reported-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-10 17:51:29 -08:00
John Greene
3e5ccc29f7 alx: add missing stats_lock spinlock init
Trivial fix for init time stack trace occurring in
alx_get_stats64 upon start up. Should have been part of
commit adding the spinlock:
f1b6b106 alx: add alx_get_stats64 operation

Signed-off-by: John Greene <jogreene@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-10 17:50:35 -08:00
Richard Yao
b6f52ae2f0 9p/trans_virtio.c: Fix broken zero-copy on vmalloc() buffers
The 9p-virtio transport does zero copy on things larger than 1024 bytes
in size. It accomplishes this by returning the physical addresses of
pages to the virtio-pci device. At present, the translation is usually a
bit shift.

That approach produces an invalid page address when we read/write to
vmalloc buffers, such as those used for Linux kernel modules. Any
attempt to load a Linux kernel module from 9p-virtio produces the
following stack.

[<ffffffff814878ce>] p9_virtio_zc_request+0x45e/0x510
[<ffffffff814814ed>] p9_client_zc_rpc.constprop.16+0xfd/0x4f0
[<ffffffff814839dd>] p9_client_read+0x15d/0x240
[<ffffffff811c8440>] v9fs_fid_readn+0x50/0xa0
[<ffffffff811c84a0>] v9fs_file_readn+0x10/0x20
[<ffffffff811c84e7>] v9fs_file_read+0x37/0x70
[<ffffffff8114e3fb>] vfs_read+0x9b/0x160
[<ffffffff81153571>] kernel_read+0x41/0x60
[<ffffffff810c83ab>] copy_module_from_fd.isra.34+0xfb/0x180

Subsequently, QEMU will die printing:

qemu-system-x86_64: virtio: trying to map MMIO memory

This patch enables 9p-virtio to correctly handle this case. This not
only enables us to load Linux kernel modules off virtfs, but also
enables ZFS file-based vdevs on virtfs to be used without killing QEMU.

Special thanks to both Avi Kivity and Alexander Graf for their
interpretation of QEMU backtraces. Without their guidance, tracking down
this bug would have taken much longer. Also, special thanks to Linus
Torvalds for his insightful explanation of why this should use
is_vmalloc_addr() instead of is_vmalloc_or_module_addr():

https://lkml.org/lkml/2014/2/8/272

Signed-off-by: Richard Yao <ryao@gentoo.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-10 17:48:54 -08:00
dingtianhong
6b8790b500 bonding: remove unwanted bond lock for enslave processing
The bond enslave processing doesn't hold bond->lock anymore,
so releasing the unlocked rw lock causes a warning message.
Remove the unwanted read_unlock(&bond->lock).

Cc: Jay Vosburgh <fubar@us.ibm.com>
Cc: Veaceslav Falico <vfalico@redhat.com>
Cc: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Acked-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-10 16:54:29 -08:00
Liu Junliang
19a38d8e0a USB2NET : SR9800 : One chip USB2.0 USB2NET SR9800 Device Driver Support
Signed-off-by: Liu Junliang <liujunliang_ljl@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-10 16:53:06 -08:00
Anton Blanchard
ea961a828f powerpc: Fix endian issues in kexec and crash dump code
We expose a number of OF properties in the kexec and crash dump code
and these need to be big endian.

Cc: stable@vger.kernel.org # v3.13
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-02-11 11:24:52 +11:00
Kevin Hao
04a341138d powerpc/ppc32: Fix the bug in the init of non-base exception stack for UP
We allocate one specific exception stack for each kind of
non-base exception for every CPU. For ppc32 the CPU hard ID is
used as the subscript to get the specific exception stack for
one CPU. But for a UP kernel, there is only one element in
each kind of exception stack array. We would get stuck if the
CPU hard ID is not equal to '0'. So in this case we should use the
subscript '0' no matter what the CPU hard ID is.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-02-11 11:24:52 +11:00
Michael Ellerman
d2b496e5e1 powerpc/xmon: Don't signal we've entered until we're finished printing
Currently we set our cpu's bit in cpus_in_xmon, and then we take the
output lock and print the exception information.

This can race with the master cpu entering the command loop and printing
the backtrace. The result is that the backtrace gets garbled with
another cpu's exception print out.

Fix it by delaying the set of cpus_in_xmon until we are finished
printing.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-02-11 11:24:51 +11:00
Michael Ellerman
1507589787 powerpc/xmon: Fix timeout loop in get_output_lock()
As far as I can tell, our 70s era timeout loop in get_output_lock() is
generating no code.

This leads to the hostile takeover happening more or less simultaneously
on all cpus. The result is "interesting", some example output that is
more readable than most:

    cpu 0x1: Vector: 100 (Scypsut e0mx bR:e setV)e catto xc0p:u[ c 00
    c0:0  000t0o0V0erc0td:o5 rfc28050000]0c00 0 0  0 6t(pSrycsV1ppuot
    uxe 1m 2 0Rx21e3:0s0ce000c00000t00)00 60602oV2SerucSayt0y 0p 1sxs

Fix it by using udelay() in the timeout loop. The wait time and check
frequency are arbitrary, but seem to work OK. We already rely on
udelay() working so this is not a new dependency.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-02-11 11:24:51 +11:00
Michael Ellerman
730efb6193 powerpc/xmon: Don't loop forever in get_output_lock()
If we enter with xmon_speaker != 0 we skip the first cmpxchg(), we also
skip the while loop because xmon_speaker != last_speaker (0) - meaning we
skip the second cmpxchg() also.

Following that code path the compiler sees no memory barriers and so is
within its rights to never reload xmon_speaker. The end result is we loop
forever.

This manifests as all cpus being in xmon ('c' command), but they refuse
to take control when you switch to them ('c x' for cpu # x).

I have seen this deadlock in practice and also checked the generated code to
confirm this is what's happening.

The simplest fix is just to always try the cmpxchg().
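
The idea of always attempting the exchange can be sketched in userspace
with C11 atomics standing in for the kernel's cmpxchg(). This is an
illustration under stated assumptions, not the xmon code:
get_output_lock_sketch(), its max_rounds and spins_per_round parameters and
the speaker variable are hypothetical, and the loop is bounded only so that
the sketch can be exercised from a single thread.

```c
#include <assert.h>
#include <stdatomic.h>

static atomic_int speaker;	/* 0 means nobody holds the output lock */

/* Returns 1 once `me` owns the lock, 0 if max_rounds ran out first. */
static int get_output_lock_sketch(int me, int max_rounds,
				  long spins_per_round)
{
	assert(me != 0);

	if (atomic_load(&speaker) == me)
		return 1;

	for (int round = 0; round < max_rounds; round++) {
		int last = 0;
		long timeout = spins_per_round;

		/* Always try the exchange. It is an atomic access, so the
		 * owner is re-read from memory on every round; on failure
		 * `last` is updated to the current owner. */
		if (atomic_compare_exchange_strong(&speaker, &last, me))
			return 1;

		while (atomic_load(&speaker) == last) {
			int expected = last;

			if (--timeout > 0)
				continue;
			/* Timed out: take the lock from the stale owner. */
			if (atomic_compare_exchange_strong(&speaker,
							   &expected, me))
				return 1;
			break;
		}
	}
	return 0;
}
```

Because every round starts with an atomic operation on the shared variable,
the compiler cannot hoist the load of the lock owner out of the loop, which
is the failure mode described above.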

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-02-11 11:24:50 +11:00
Anshuman Khandual
b4d6c06c8d powerpc/perf: Configure BHRB filter before enabling PMU interrupts
Right now the config_bhrb() PMU specific call happens after
write_mmcr0(), which actually enables the PMU for event counting and
interrupts. So there is a small window of time where the PMU and BHRB
run without the required HW branch filter (if any) enabled in BHRB.

This can cause some of the branch samples to be collected through BHRB
without any filter applied and hence affects the correctness of
the results. This patch moves the BHRB config function call before
enabling interrupts.

Here are some data points captured via trace prints which depict how we
could get PMU interrupts with BHRB filter NOT enabled with a standard
perf record command line (asking for branch record information as well).

    $ perf record -j any_call ls

Before the patch:-

    ls-1962  [003] d...  2065.299590: .perf_event_interrupt: MMCRA: 40000000000
    ls-1962  [003] d...  2065.299603: .perf_event_interrupt: MMCRA: 40000000000
    ...

    All the PMU interrupts before this point did not have the requested
    HW branch filter enabled in the MMCRA.

    ls-1962  [003] d...  2065.299647: .perf_event_interrupt: MMCRA: 40040000000
    ls-1962  [003] d...  2065.299662: .perf_event_interrupt: MMCRA: 40040000000

After the patch:-

    ls-1850  [008] d...   190.311828: .perf_event_interrupt: MMCRA: 40040000000
    ls-1850  [008] d...   190.311848: .perf_event_interrupt: MMCRA: 40040000000

    All the PMU interrupts have the requested HW BHRB branch filter
    enabled in MMCRA.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
[mpe: Fixed up whitespace and cleaned up changelog]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-02-11 11:24:50 +11:00
Nathan Fontenot
0ba3e10116 crypto/nx/nx-842: Fix handling of vmalloc addresses
The powerpc specific nx-842 compression driver does not currently
handle translating a vmalloc address to a physical address.

The current driver uses __pa() for all addresses which does not
properly handle vmalloc addresses and thus causes a failure since
we do not pass a proper physical address to the hypervisor.

This patch adds a routine to convert an address to a physical
address by checking for vmalloc addresses and handling them properly.

Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
 ---
 drivers/crypto/nx/nx-842.c |   29 +++++++++++++++++++----------
 1 file changed, 19 insertions(+), 10 deletions(-)
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-02-11 11:24:49 +11:00
Michael Ellerman
8d4887ee30 powerpc/pseries: Select ARCH_RANDOM on pseries
We have a driver for the ARCH_RANDOM hook in rng.c, so we should select
ARCH_RANDOM on pseries.

Without this the build breaks if you turn ARCH_RANDOM off.

This hasn't broken the build because pseries_defconfig doesn't specify a
value for PPC_POWERNV, which is default y, and selects ARCH_RANDOM.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-02-11 11:24:49 +11:00
Michael Ellerman
2fdd313f54 powerpc/perf: Add Power8 cache & TLB events
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-02-11 11:24:48 +11:00
Laurent Dufour
3b830c824a powerpc/relocate fix relocate processing in LE mode
The relocation code does not work in little endian mode because the r_info
field, which is a 64-bit value, should be read from the right offset.

The current code is optimized to read the r_info field as a 32-bit value
starting at the middle of the double word (offset 12). When running in LE
mode, the value read is not correct since only the MSB is read.

This patch removes this optimization, which consists of dealing with a
32-bit value instead of a 64-bit one. This way it works in both big and
little endian modes.
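
The offset problem can be reproduced with a small, host-independent C
sketch. It assumes a 24-byte relocation entry with r_info stored as a
64-bit field at offset 8, which matches the offset 12 mentioned above for
the middle of the double word. The helper names are hypothetical and the
value 22 is just an example.

```c
#include <assert.h>
#include <stdint.h>

/* Store a 64-bit value byte by byte in the requested byte order. */
static void store_u64(unsigned char *dst, uint64_t v, int little_endian)
{
	for (int i = 0; i < 8; i++) {
		int shift = little_endian ? 8 * i : 8 * (7 - i);

		dst[i] = (unsigned char)(v >> shift);
	}
}

/* Load an n-byte unsigned value in the requested byte order. */
static uint64_t load_bytes(const unsigned char *src, int n,
			   int little_endian)
{
	uint64_t v = 0;

	for (int i = 0; i < n; i++) {
		int shift = little_endian ? 8 * i : 8 * (n - 1 - i);

		v |= (uint64_t)src[i] << shift;
	}
	return v;
}

/* The shortcut: read only 32 bits at offset 12 of the entry. */
static uint64_t r_info_via_32bit_shortcut(uint64_t r_info, int little_endian)
{
	unsigned char entry[24] = { 0 };

	store_u64(entry + 8, r_info, little_endian);
	return load_bytes(entry + 12, 4, little_endian);
}

/* The straightforward way: read all 64 bits at offset 8. */
static uint64_t r_info_via_64bit_load(uint64_t r_info, int little_endian)
{
	unsigned char entry[24] = { 0 };

	store_u64(entry + 8, r_info, little_endian);
	return load_bytes(entry + 8, 8, little_endian);
}
```

For a small r_info value the shortcut returns the value itself in big
endian byte order but 0 in little endian byte order, because offset 12 then
holds the upper half of the field. The full 64-bit load returns the same
value in both cases.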

Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-02-11 11:24:48 +11:00
Mahesh Salgaonkar
429d2e8342 powerpc: Fix kdump hang issue on p8 with relocation on exception enabled.
On p8 systems, with the relocation on exception feature enabled, we are seeing
the kdump kernel hang at interrupt vector 0xc*4400. The reason is that, with
this feature enabled, exceptions are raised with the MMU (IR=DR=1) ON with the
default offset of 0xc*4000. Since the exception is raised in virtual mode, it
requires the vector region to be executable, without which it fails to
fetch and execute instructions at 0xc*4xxx. For the default kernel, since the
kernel is loaded at real 0, the htab mappings set the entire kernel text region
executable. But for a relocatable kernel (e.g. the kdump case) we only copy
the interrupt vectors down to real 0 and never mark that region as
executable, because on p7 and below we always get exceptions in real mode.

This patch fixes this issue by marking the htab mapping range that overlaps
with the interrupt vector region as executable for a relocatable kernel.

Thanks to Ben who helped me to debug this issue and find the root cause.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-02-11 11:24:47 +11:00
Mahesh Salgaonkar
3ec8b78fcc powerpc/pseries: Disable relocation on exception while going down during crash.
Disable relocation on exception while going down, even in the kdump case.
This is because we are about to clear the htab mappings while kexec-ing into
the kdump kernel and we may run into issues if we still have AIL ON.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-02-11 11:24:47 +11:00
Thadeu Lima de Souza Cascardo
8cc6b6cd87 powerpc/eeh: Drop taken reference to driver on eeh_rmv_device
Commit f5c57710dd ("powerpc/eeh: Use
partial hotplug for EEH unaware drivers") introduces eeh_rmv_device,
which may grab a reference to a driver, but not release it.

That prevents a driver from being removed after it has gone through EEH
recovery.

This patch drops the reference if it was taken.

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Acked-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-02-11 11:24:46 +11:00
Paul Gortmaker
0215b4aa06 powerpc: Fix build failure in sysdev/mpic.c for MPIC_WEIRD=y
Commit 446f6d06fa ("powerpc/mpic: Properly
set default triggers") breaks the mpc7447_hpc_defconfig as follows:

  CC      arch/powerpc/sysdev/mpic.o
arch/powerpc/sysdev/mpic.c: In function 'mpic_set_irq_type':
arch/powerpc/sysdev/mpic.c:886:9: error: case label does not reduce to an integer constant
arch/powerpc/sysdev/mpic.c:890:9: error: case label does not reduce to an integer constant
arch/powerpc/sysdev/mpic.c:894:9: error: case label does not reduce to an integer constant
arch/powerpc/sysdev/mpic.c:898:9: error: case label does not reduce to an integer constant

Looking at the cpp output (gcc 4.7.3), I see:

   case mpic->hw_set[MPIC_IDX_VECPRI_SENSE_EDGE] |
        mpic->hw_set[MPIC_IDX_VECPRI_POLARITY_POSITIVE]:

The pointer into an array appears because CONFIG_MPIC_WEIRD=y is set
for this platform, thus enabling the following:

  -------------------
  #ifdef CONFIG_MPIC_WEIRD
  static u32 mpic_infos[][MPIC_IDX_END] = {
        [0] = { /* Original OpenPIC compatible MPIC */

  [...]

  #define MPIC_INFO(name) mpic->hw_set[MPIC_IDX_##name]

  #else /* CONFIG_MPIC_WEIRD */

  #define MPIC_INFO(name) MPIC_##name

  #endif /* CONFIG_MPIC_WEIRD */
  -------------------

Here we convert the case section to if/else if, and also add
the equivalent of a default case to warn about unknown types.
Boot tested on sbc8548, build tested on all defconfigs.
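
The switch-to-if/else conversion can be illustrated with a standalone
sketch. Case labels must be integer constant expressions, so once the
values come from a table looked up at run time, only an if/else chain
compiles. Everything here is hypothetical (struct hypothetical_pic,
PIC_INFO, the bit values, the trigger names); it mirrors the shape of the
problem, not the mpic driver itself.

```c
#include <assert.h>

enum {
	IDX_SENSE_EDGE,
	IDX_SENSE_LEVEL,
	IDX_POLARITY_POSITIVE,
	IDX_POLARITY_NEGATIVE,
	IDX_END
};

/* Made-up per-controller bit values, looked up at run time. */
static const unsigned int example_hw_set[IDX_END] = {
	[IDX_SENSE_EDGE]        = 0x00000000,
	[IDX_SENSE_LEVEL]       = 0x00400000,
	[IDX_POLARITY_POSITIVE] = 0x00800000,
	[IDX_POLARITY_NEGATIVE] = 0x00000000,
};

struct hypothetical_pic {
	const unsigned int *hw_set;
};

static const struct hypothetical_pic example_pic = { example_hw_set };

/* Expands to an array access, which is not a constant expression. */
#define PIC_INFO(name) (pic->hw_set[IDX_##name])

enum trigger {
	TRIG_UNKNOWN,
	TRIG_EDGE_RISING,
	TRIG_EDGE_FALLING,
	TRIG_LEVEL_HIGH,
	TRIG_LEVEL_LOW
};

static enum trigger decode_trigger(const struct hypothetical_pic *pic,
				   unsigned int vecpri)
{
	/* "case PIC_INFO(...) | PIC_INFO(...):" would not compile here. */
	if (vecpri == (PIC_INFO(SENSE_EDGE) | PIC_INFO(POLARITY_POSITIVE)))
		return TRIG_EDGE_RISING;
	else if (vecpri == (PIC_INFO(SENSE_EDGE) | PIC_INFO(POLARITY_NEGATIVE)))
		return TRIG_EDGE_FALLING;
	else if (vecpri == (PIC_INFO(SENSE_LEVEL) | PIC_INFO(POLARITY_POSITIVE)))
		return TRIG_LEVEL_HIGH;
	else if (vecpri == (PIC_INFO(SENSE_LEVEL) | PIC_INFO(POLARITY_NEGATIVE)))
		return TRIG_LEVEL_LOW;

	/* Equivalent of a default case: the type is not recognized. */
	return TRIG_UNKNOWN;
}
```

The final return plays the role of the default case mentioned above, where
the real change warns about unknown types.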

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-02-11 11:24:45 +11:00
Linus Torvalds
6792dfe383 Merge branch 'akpm' (patches from Andrew Morton)
Merge misc fixes from Andrew Morton:
 "A bunch of fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  ocfs2: check existence of old dentry in ocfs2_link()
  ocfs2: update inode size after zeroing the hole
  ocfs2: fix issue that ocfs2_setattr() does not deal with new_i_size==i_size
  mm/memory-failure.c: move refcount only in !MF_COUNT_INCREASED
  smp.h: fix x86+cpu.c sparse warnings about arch nonboot CPU calls
  mm: fix page leak at nfs_symlink()
  slub: do not assert not having lock in removing freed partial
  gitignore: add all.config
  ocfs2: fix ocfs2_sync_file() if filesystem is readonly
  drivers/edac/edac_mc_sysfs.c: poll timeout cannot be zero
  fs/file.c:fdtable: avoid triggering OOMs from alloc_fdmem
  xen: properly account for _PAGE_NUMA during xen pte translations
  mm/slub.c: list_lock may not be held in some circumstances
  drivers/md/bcache/extents.c: use %zi to format size_t
  vmcore: prevent PT_NOTE p_memsz overflow during header update
  drivers/message/i2o/i2o_config.c: fix deadlock in compat_ioctl(I2OGETIOPS)
  Documentation/: update 00-INDEX files
  checkpatch: fix detection of git repository
  get_maintainer: fix detection of git repository
  drivers/misc/sgi-gru/grukdump.c: unlocking should be conditional in gru_dump_context()
2014-02-10 16:03:16 -08:00
Xue jiufei
0e048316ff ocfs2: check existence of old dentry in ocfs2_link()
The linkat system call first calls user_path_at() to check that the old
dentry exists, and then calls vfs_link()->ocfs2_link() to do the actual
work.  A race can occur when node A creates a hard link to a file while
node B removes it.

         Node A                          Node B
user_path_at()
  ->ocfs2_lookup(),
finds that the old dentry exists
                                rm file, add its inode (say inodeA)
                                to orphan_dir

call ocfs2_link(), create a
hard link for inodeA.

                                rm the link, add inodeA to orphan_dir
                                again

When the orphan_scan work starts, it calls ocfs2_queue_orphans() to do
the main work.  It first traverses the entries in orphan_dir, linking all
inodes in this orphan_dir into a list that looks like this:

	inodeA->inodeB->...->inodeA

When traversing this list, it falls into a loop, calling iput() again
and again, and finally triggers BUG_ON(inode->i_state & I_CLEAR).
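
A toy model of how such a loop can form, assuming each inode carries a
single intrusive "next" pointer and the scan prepends every directory
entry it finds.  All names are hypothetical and this is not the ocfs2
code; it only shows that queueing the same inode twice rewrites its
pointer and closes the list into a cycle.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical inode with one intrusive list pointer. */
struct demo_inode {
	int ino;
	struct demo_inode *next_orphan;
};

/* Prepend each orphan-dir entry to the list, as a simple scan might. */
static struct demo_inode *demo_queue_orphans(struct demo_inode **entries,
					     size_t n)
{
	struct demo_inode *head = NULL;

	for (size_t i = 0; i < n; i++) {
		assert(entries[i] != NULL);
		/* A second entry for the same inode overwrites its pointer. */
		entries[i]->next_orphan = head;
		head = entries[i];
	}
	return head;
}

/* Floyd's tortoise-and-hare check: returns 1 if the list loops. */
static int demo_has_cycle(const struct demo_inode *head)
{
	const struct demo_inode *slow = head, *fast = head;

	while (fast && fast->next_orphan) {
		slow = slow->next_orphan;
		fast = fast->next_orphan->next_orphan;
		if (slow == fast)
			return 1;
	}
	return 0;
}
```

With entries {A, B, A} the result is A->B->A->..., so a walker that
releases each element never reaches the end of the list.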

Signed-off-by: joyce <xuejiufei@huawei.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-02-10 16:01:43 -08:00
Junxiao Bi
c7d2cbc364 ocfs2: update inode size after zeroing the hole
fs-writeback releases, without the page lock, dirty pages whose offsets
are beyond the inode size; the release happens in
block_write_full_page_endio().  If the inode size is not updated, dirty
pages in file holes may be released before they are flushed to disk, so
the holes will contain non-zero data, which causes md5sum mismatches on
sparse files.

To reproduce the bug, find a big sparse file with many holes, such as a
VM image file.  Its actual size should be bigger than the available
memory so that writeback runs more frequently.  Tar it with the -S
option, then untar it and check its md5sum again and again until you get
a wrong md5sum.

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Younger Liu <younger.liu@huawei.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-02-10 16:01:43 -08:00
Younger Liu
d62e74be12 ocfs2: fix issue that ocfs2_setattr() does not deal with new_i_size==i_size
The issue scenario is as follows:

- Create a small file and fallocate a large amount of disk space for it
  with the FALLOC_FL_KEEP_SIZE option.

- ftruncate the file back to the original size.  The free disk space
  does not return to its previous value.  This is a real bug that is
  fixed in this patch.

To solve the issue above, we modified ocfs2_setattr(): if
attr->ia_size != i_size_read(inode), it calls ocfs2_truncate_file() and
truncates the disk space to attr->ia_size.

Signed-off-by: Younger Liu <younger.liu@huawei.com>
Reviewed-by: Jie Liu <jeff.liu@oracle.com>
Tested-by: Jie Liu <jeff.liu@oracle.com>
Cc: Joel Becker <jlbec@evilplan.org>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Sunil Mushran <sunil.mushran@gmail.com>
Reviewed-by: Jensen <shencanquan@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-02-10 16:01:43 -08:00
Naoya Horiguchi
8d547ff4ac mm/memory-failure.c: move refcount only in !MF_COUNT_INCREASED
mce-test detected a test failure when injecting an error into a thp tail
page.  This is because we take a refcount on the tail page in
madvise_hwpoison() while the fix in commit a3e0f9e47d
("mm/memory-failure.c: transfer page count from head page to tail page
after split thp") assumes that we always take refcount on the head page.

When a real memory error happens we take refcount on the head page where
memory_failure() is called without MF_COUNT_INCREASED set, so it seems
to me that testing memory error on thp tail page using madvise makes
little sense.

This patch cancels moving refcount in !MF_COUNT_INCREASED for valid
testing.
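
As a general illustration of testing a flag such as MF_COUNT_INCREASED,
the sketch below contrasts a bitwise test with a logical one.  The flag
values and helper names are hypothetical, and it is an assumption that
this is the kind of expression the s/&&/&/ note refers to.

```c
#include <assert.h>

/* Hypothetical flag bits, not the kernel's definitions. */
enum {
	DEMO_MF_COUNT_INCREASED = 1 << 0,
	DEMO_MF_OTHER = 1 << 1
};

/* Bitwise test: true only when the given bit is clear in flags. */
static int demo_flag_clear_bitwise(int flags, int flag)
{
	assert(flag != 0);
	return !(flags & flag);
}

/* Logical test: true only when flags is zero, whichever bits are set. */
static int demo_flag_clear_logical(int flags, int flag)
{
	assert(flag != 0);
	return !(flags && flag);
}
```

The two agree when flags is zero or contains only the tested bit, and
disagree as soon as any other bit is set.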

[akpm@linux-foundation.org: s/&&/&/]
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Chen Gong <gong.chen@linux.intel.com>
Cc: <stable@vger.kernel.org>	[3.9+: a3e0f9e47d]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-02-10 16:01:43 -08:00
Paul Gortmaker
fb37bb04d6 smp.h: fix x86+cpu.c sparse warnings about arch nonboot CPU calls
Use what we already do for arch_disable_smp_support() to fix these:

  arch/x86/kernel/smpboot.c:1155:6: warning: symbol 'arch_enable_nonboot_cpus_begin' was not declared. Should it be static?
  arch/x86/kernel/smpboot.c:1160:6: warning: symbol 'arch_enable_nonboot_cpus_end' was not declared. Should it be static?
  kernel/cpu.c:512:13: warning: symbol 'arch_enable_nonboot_cpus_begin' was not declared. Should it be static?
  kernel/cpu.c:516:13: warning: symbol 'arch_enable_nonboot_cpus_end' was not declared. Should it be static?

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-02-10 16:01:42 -08:00
Rafael Aquini
a0b54adda3 mm: fix page leak at nfs_symlink()
Changes in commit a0b8cab3b9 ("mm: remove lru parameter from
__pagevec_lru_add and remove parts of pagevec API") have introduced a
call to add_to_page_cache_lru() which causes a leak in nfs_symlink() as
now the page gets an extra refcount that is not dropped.

Jan Stancek observed and reported the leak effect while running test8
from Connectathon Testsuite.  After several iterations over the test
case, which creates several symlinks on a NFS mountpoint, the test
system was quickly getting into an out-of-memory scenario.

This patch fixes the page leak by dropping that extra refcount
add_to_page_cache_lru() is grabbing.
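
The refcount imbalance can be pictured with a toy model.  Everything
below is hypothetical (struct demo_page and the demo_ helpers are not
kernel APIs); the only assumption carried over from the description is
that adding a page to the cache takes a reference of its own.

```c
#include <assert.h>

/* Hypothetical page with a bare reference counter. */
struct demo_page {
	int refcount;
	int in_cache;
};

static void demo_get(struct demo_page *p)
{
	p->refcount++;
}

static void demo_put(struct demo_page *p)
{
	assert(p->refcount > 0);
	p->refcount--;
}

/* Allocation hands the caller one reference. */
static void demo_alloc(struct demo_page *p)
{
	p->refcount = 1;
	p->in_cache = 0;
}

/* Adding to the cache takes a second reference on top of the caller's. */
static void demo_add_to_cache(struct demo_page *p)
{
	demo_get(p);
	p->in_cache = 1;
}

/* Eviction drops only the reference the cache took. */
static void demo_evict(struct demo_page *p)
{
	p->in_cache = 0;
	demo_put(p);
}

/* Returns the refcount left after eviction: 0 means freed, >0 means leaked. */
static int demo_symlink(struct demo_page *p, int drop_extra)
{
	demo_alloc(p);
	demo_add_to_cache(p);
	if (drop_extra)
		demo_put(p);	/* balance the extra reference */
	demo_evict(p);
	return p->refcount;
}
```

Without the balancing put, every call leaves one reference behind, which
matches the steady memory growth reported for repeated symlink creation.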

Signed-off-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Rafael Aquini <aquini@redhat.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Jeff Layton <jlayton@redhat.com>
Cc: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: <stable@vger.kernel.org>	[3.11.x+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-02-10 16:01:42 -08:00
Steven Rostedt
1e4dd9461f slub: do not assert not having lock in removing freed partial
Vladimir reported the following issue:

Commit c65c1877bd ("slub: use lockdep_assert_held") requires
remove_partial() to be called with n->list_lock held, but free_partial()
called from kmem_cache_close() on cache destruction does not follow this
rule, leading to a warning:

  WARNING: CPU: 0 PID: 2787 at mm/slub.c:1536 __kmem_cache_shutdown+0x1b2/0x1f0()
  Modules linked in:
  CPU: 0 PID: 2787 Comm: modprobe Tainted: G        W    3.14.0-rc1-mm1+ #1
  Hardware name:
   0000000000000600 ffff88003ae1dde8 ffffffff816d9583 0000000000000600
   0000000000000000 ffff88003ae1de28 ffffffff8107c107 0000000000000000
   ffff880037ab2b00 ffff88007c240d30 ffffea0001ee5280 ffffea0001ee52a0
  Call Trace:
    __kmem_cache_shutdown+0x1b2/0x1f0
    kmem_cache_destroy+0x43/0xf0
    xfs_destroy_zones+0x103/0x110 [xfs]
    exit_xfs_fs+0x38/0x4e4 [xfs]
    SyS_delete_module+0x19a/0x1f0
    system_call_fastpath+0x16/0x1b

His solution was to add a spinlock in order to quiet lockdep.  Although
taking the lock there would cause no contention, it also requires
disabling interrupts, which has a larger impact on the system.

Instead of adding a spinlock to a location where it is not needed just
for lockdep, make a __remove_partial() function that does not test if
the list_lock is held, as no one should hold it while the cache is being
freed.

Also added a __add_partial() function that does not do the lock
validation either, as it is not needed for the creation of the cache.
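
The split between a checked and an unchecked helper can be sketched as
below.  This is a user-space stand-in with hypothetical names: a plain
flag replaces the real lock, assert() replaces lockdep_assert_held(),
and an _unchecked suffix replaces the double-underscore prefix, which is
reserved outside the kernel.

```c
#include <assert.h>

/* Hypothetical per-node state; lock_held stands in for n->list_lock. */
struct demo_node {
	int lock_held;
	unsigned long nr_partial;
};

/* Unchecked variant: does the list bookkeeping and nothing else. */
static void demo_remove_partial_unchecked(struct demo_node *n)
{
	n->nr_partial--;
}

/* Checked variant for normal callers, which must hold the lock. */
static void demo_remove_partial(struct demo_node *n)
{
	assert(n->lock_held);	/* stand-in for lockdep_assert_held() */
	demo_remove_partial_unchecked(n);
}
```

A teardown path that is the sole user of the node can call the unchecked
variant directly, while every other caller keeps the lock validation.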

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Reported-by: Vladimir Davydov <vdavydov@parallels.com>
Suggested-by: David Rientjes <rientjes@google.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Vladimir Davydov <vdavydov@parallels.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-02-10 16:01:42 -08:00
Borislav Petkov
25fba9bebe gitignore: add all.config
This is used by kbuild to load preset Kconfig options.  We need to
ignore it, otherwise git clean kills it.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Michal Marek <mmarek@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-02-10 16:01:42 -08:00
Younger Liu
a987c7ca7f ocfs2: fix ocfs2_sync_file() if filesystem is readonly
If the filesystem is read-only, there is no need to flush the drive's
caches or force any uncommitted transactions.
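
A minimal sketch of this early-exit pattern, using a hypothetical
struct demo_fs rather than the ocfs2 data structures.  The -EROFS
return value follows the bracketed note in this message; the rest is
illustrative only.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical filesystem state. */
struct demo_fs {
	int readonly;
	int flushes;	/* counts cache flushes actually issued */
};

static int demo_sync_file(struct demo_fs *fs)
{
	assert(fs != NULL);

	/* Nothing can be dirty on a read-only filesystem: skip the flush. */
	if (fs->readonly)
		return -EROFS;

	fs->flushes++;	/* stand-in for the cache flush and commit */
	return 0;
}
```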

[akpm@linux-foundation.org: return -EROFS, not 0]
Signed-off-by: Younger Liu <younger.liucn@gmail.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-02-10 16:01:42 -08:00