Commit Graph

648609 Commits

Author SHA1 Message Date
Linus Torvalds
e28ac1fc31 Contained in this update:
- Fix free space request handling when low on disk space
 - Remove redundant log failure error messages
 - Free truncate dirty pages instead of letting them build up forever
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCgAGBQJYdnh5AAoJEPh/dxk0SrTrI5sP/2FYAxrj3Iw684e92PWeIe1c
 pZir+0LEmlXgly4wprli7/BO6WZjESuMF3huBZZxtYLx3UUvGRavHup620z3Xa2e
 YRdSiBnvchOe3F0UfFO9wT15vEOzwWS61FfX/g35t4mD6ItY0XISesTx4+XA3Cqi
 8zf7OAjXI5WQietthwc9zmfhpyWgyw8CkeXVtqd5whNJN/6E80wh5IE6D6cJ1xkX
 2qNicrrruTUACPyJxrTrR/0kkoxDoYgBapy3kqV0R1uK+ttNrGlGjfdapKs0tqMb
 ezksj0sy/rx+HjfGEHx52mTiYRRTHEsGt/yMa1pT8o0p2wvR0nxnvzkvV8DeMoX3
 0wYRxATsv/t8Oog+ug6khB/FtppSGJML+XxG3bV9itgCkAIFAbb7vZRrThnzVqOr
 ChbQvBhchY4lYA1pef862QHLRraJF84HuY/ypyE6DX+nQWkGjDfEeHFPcm6eVSXG
 xEplK8wjs09pSklH/OQD+GsO4eedBHc4hdzDuBTjCmt867/Lk8Uw4RIjGMYjbBx2
 ovU3pTHfdlO6/qJmfSWnBAAZjKAjna/p47ZfDLzio23hb1hK68Qe/z8+tPnAzkHW
 rowA6KfBi13tX6Njds/lceiYMrnMRPqNBA+Og1wCyljP2Q4shbADa6czrFDzPn6P
 xZbfhKK2CTbYSvxi5S3G
 =8bsr
 -----END PGP SIGNATURE-----

Merge tag 'xfs-for-linus-4.10-rc4-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs fixes from Darrick Wong:
 "As promised last week, here's some stability fixes from Christoph and
  Jan Kara:

   - fix free space request handling when low on disk space

   - remove redundant log failure error messages

   - free truncated dirty pages instead of letting them build up
     forever"

* tag 'xfs-for-linus-4.10-rc4-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: Timely free truncated dirty pages
  xfs: don't print warnings when xfs_log_force fails
  xfs: don't rely on ->total in xfs_alloc_space_available
  xfs: adjust allocation length in xfs_alloc_space_available
  xfs: fix bogus minleft manipulations
  xfs: bump up reserved blocks in xfs_alloc_set_aside
2017-01-12 11:06:26 -08:00
John Garry
6f724fb303 i2c: print correct device invalid address
In of_i2c_register_device(), when the check for
device address validity fails we print the info.addr,
which has not been assigned properly.

Fix this by printing the actual invalid address.

Signed-off-by: John Garry <john.garry@huawei.com>
Reviewed-by: Vladimir Zapolskiy <vz@mleia.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Fixes: b4e2f6ac12 ("i2c: apply DT flags when probing")
Cc: stable@kernel.org
2017-01-12 20:06:15 +01:00
Dmitry Torokhov
331c342552 i2c: do not enable fall back to Host Notify by default
Falling back unconditionally to HostNotify as primary client's interrupt
breaks some drivers which alter their functionality depending on whether
interrupt is present or not, so let's introduce a board flag telling I2C
core explicitly if we want wired interrupt or HostNotify-based one:
I2C_CLIENT_HOST_NOTIFY.

For DT-based systems we introduce "host-notify" property that we convert
to I2C_CLIENT_HOST_NOTIFY board flag.

Tested-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Acked-by: Pali Rohár <pali.rohar@gmail.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
2017-01-12 20:06:15 +01:00
Vlad Tsyrklevich
30f939feae i2c: fix kernel memory disclosure in dev interface
i2c_smbus_xfer() does not always fill an entire block, allowing
kernel stack memory disclosure through the temp variable. Clear
it before it's read to.

Signed-off-by: Vlad Tsyrklevich <vlad@tsyrklevich.net>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Cc: stable@kernel.org
2017-01-12 20:06:10 +01:00
Linus Torvalds
9ca277eba0 remoteproc fixes for v4.10
This fixes two regressions that has been reported to be introduced in
 v4.10-rc1.
 
 * The first fix corrects an incorrect usage of the kref api.
 
 * The second reverts the change to make the resource table read-only. As the
   space each vdev resource is used as virtio device config space it must be
   shared with the remote.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJYdqtvAAoJEAsfOT8Nma3F58IP/A88sWYN0Fp8Zv7tU1H1sk4B
 YFRCbf+g+9fFUEA55infRbCrQxzPV5ZINeIV8hmsFgradSnMLntTrl0KUpgjYr7q
 eH6Nt7AHkloPME+fGV2m3aVrQGQDYOMaJ5kgRoFMn4YFU5KekcCwnwuwgxiSJiq5
 PBVhUZfFUoBEWfAcCh9palkUZ7V5FMtSu5ZujPu6rpEwpchZqla4hTCVCDJsZsQo
 HpF+Df30MSML8PiFtvX8aEKz4KzKKAFENAs0FQL22YcyhMF2A0LRy/LfVoL2ANDc
 c0CVAEQueVr5diACVYtUzYQKJPib/Lw97OqO2rlh3F6AxsLhOcM46agx28OtJEKv
 NRJRKKX5xWpXofsO+MyZ0odpesGWOOecouTbnmL6BgHeJKW2BVbkntbv/VDFD0f4
 BIxCioLNCY/zGZ/FCConmceqoXoMJwgL5ZHp2vcL4icwZj4M//B1zQGBmWmauCZG
 x/SA7A6r8DZ/Y6GyUP4rpzA0m3GUhEvFU4N4nEaQjBe9tVsougD2NIlgoDcARoO8
 e7CiKNuAqViSeMJzlNmDBARDoQFCDoGLZhhS4qQsyMCmd6ySmu2x5MVNEuMRUpp3
 wafRU5CMHJw/EUyFlybNQmkReE7RDhWuTZ42UBK+Zx1orhu+fH6ulPgPTaxKgp+d
 x4XcbiZU4iqAobZ/wWsP
 =GUNc
 -----END PGP SIGNATURE-----

Merge tag 'rproc-v4.10-fixes' of git://github.com/andersson/remoteproc

Pull remoteproc fixes from Bjorn Andersson:
 "This fixes two regressions that have been reported to be introduced in
  v4.10-rc1.

   - correct an incorrect usage of the kref api

   - revert the change to make the resource table read-only. As the
     space each vdev resource is used as virtio device config space it
     must be shared with the remote"

* tag 'rproc-v4.10-fixes' of git://github.com/andersson/remoteproc:
  Revert "remoteproc: Merge table_ptr and cached_table pointers"
  remoteproc: fix vdev reference management
2017-01-12 11:00:22 -08:00
Linus Torvalds
1d865da79e rpmsg fixes for v4.10
This fixes a regression introduced in v4.10-rc1 that prohibits multiple
 channels with the same name but different endpoint addresses to be used.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJYdqptAAoJEAsfOT8Nma3F3xcP/R2zVEsl7tbFt+DJdXTi6ZvK
 U+pTRuyGpao1oQVbcUf/aQ+IYqO5Ndr06qTzwnBI/44DYlHBihqskUepmvUYapNl
 Pl9n3S0BdDSuN8OpR8QJMLaWIzSbz9CM5bM8St7nORUy21rjPL89EkuUjaq/lU1P
 hr0HNfe/naOgcp93u/w7j+tS0afakjrNFxgXaU8UCasOSoXw1Uw5/aivIp1Oc6mN
 dOpEFykRHA1fTL5giMEWAqoh6tWHDfVaeyNSGcL8/KxoT2W9dSJ8ed37uLm2pnNK
 9e6lTTO8yRpOIT99XCytsZrezhASamg0PawY3OSQ8YE8gLqIpTRAz/YPyQaw83e3
 GwRb/zHINf4GsuyopAj8pIstJj57pB5M5rnMI/sUZsV0BRjFhOUPkGsuS7enCOfp
 CyNlIWXOOBirBefpeZLnYoYN+IIDKqhZ4/en1IMp04A6BNr4RrjlPguE4krgaK21
 yaAGrDMhCr5V+RQeOzwMtuTkNog8cML7CRwXhsFu8NmrNIsEj+eTL0aEzxFERIo4
 hkaLPIUyy8Njy2SCsp1hETDajrVagPWUw49zMcRAkaMQ5ev+wJnrNe+/7NNBzIXI
 PgtZQ0Eq0sE8KCEc37o5mt6CQv/vn8KRq+I6KoLmdxihD5QyxOcoQMpRjlH0vIrx
 5HiloewfU6PfoS1HqfB/
 =xcek
 -----END PGP SIGNATURE-----

Merge tag 'rpmsg-v4.10-fixes' of git://github.com/andersson/remoteproc

Pull rpmsg fixes from Bjorn Andersson:
 "This fixes a regression introduced in v4.10-rc1 that prohibits
  multiple channels with the same name but different endpoint addresses
  to be used"

* tag 'rpmsg-v4.10-fixes' of git://github.com/andersson/remoteproc:
  rpmsg: virtio_rpmsg_bus: fix channel creation
2017-01-12 10:58:16 -08:00
Linus Torvalds
95ce13138e Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
Pull HID fixes from Jiri Kosina:

 - device descriptor length validation fix to hid-cypress driver from
   Greg

 - introduction of a short delay into i2c-hid, which is not really
   mandated by the spec, but fixes Asus Touchpads

 - Petzl USB connectable flashlight quirk from myself

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
  HID: i2c-hid: Add sleep between POWER ON and RESET
  HID: hid-cypress: validate length of report
  HID: ignore Petzl USB headlamp
2017-01-12 10:55:28 -08:00
Linus Torvalds
cb38b45346 Merge branch 'scsi-target-for-v4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/bvanassche/linux
Pull scsi target fixes from Bart Van Assche:

 - a series of bug fixes for the XCOPY implementation from David
   Disseldorp

 - one bug fix for the ibmvscsis driver, a driver that is used for
   communication between partitions on IBM POWER systems.

* 'scsi-target-for-v4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/bvanassche/linux:
  ibmvscsis: Fix srp_transfer_data fail return code
  target: support XCOPY requests without parameters
  target: check for XCOPY parameter truncation
  target: use XCOPY segment descriptor CSCD IDs
  target: check XCOPY segment descriptor CSCD IDs
  target: simplify XCOPY wwn->se_dev lookup helper
  target: return UNSUPPORTED TARGET/SEGMENT DESC TYPE CODE sense
  target: bounds check XCOPY total descriptor list length
  target: bounds check XCOPY segment descriptor list
  target: use XCOPY TOO MANY TARGET DESCRIPTORS sense
  target: add XCOPY target/segment desc sense codes
2017-01-12 10:41:20 -08:00
Geng, Jichao
84fcc2d2bd ceph: fix get_oldest_context()
For no snapshot case, we should use ci->truncate_{seq,size}.

Fixes: 5f743e4566 ("ceph: record truncate size/seq for snap data writeback")
Signed-off-by: Geng, Jichao <geng.jichao@h3c.com>
Signed-off-by: Yan, Zheng <zyan@redhat.com>
2017-01-12 19:31:01 +01:00
Yan, Zheng
cc8e834293 ceph: fix mds cluster availability check
We should apply the check after getting the initial mdsmap.

Fixes: e9e427f0a1 ("ceph: check availability of mds cluster on mount")
Link: http://tracker.ceph.com/issues/18161
Signed-off-by: Yan, Zheng <zyan@redhat.com>
2017-01-12 19:31:01 +01:00
Linus Torvalds
607ae5f269 Merge tag 'md/4.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md
Pull md fixes from Shaohua Li:
 "Basically one fix for raid5 cache which is merged in this cycle,
  others are trival fixes"

* tag 'md/4.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md:
  md/raid5: Use correct IS_ERR() variation on pointer check
  md: cleanup mddev flag clear for takeover
  md/r5cache: fix spelling mistake on "recoverying"
  md/r5cache: assign conf->log before r5l_load_log()
  md/r5cache: simplify handling of sh->log_start in recovery
  md/raid5-cache: removes unnecessary write-through mode judgments
  md/raid10: Refactor raid10_make_request
  md/raid1: Refactor raid1_make_request
2017-01-12 10:17:59 -08:00
Ard Biesheuvel
41c066f2c4 arm64: assembler: make adr_l work in modules under KASLR
When CONFIG_RANDOMIZE_MODULE_REGION_FULL=y, the offset between loaded
modules and the core kernel may exceed 4 GB, putting symbols exported
by the core kernel out of the reach of the ordinary adrp/add instruction
pairs used to generate relative symbol references. So make the adr_l
macro emit a movz/movk sequence instead when executing in module context.

While at it, remove the pointless special case for the stack pointer.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-01-12 18:10:52 +00:00
Greg Kroah-Hartman
97f9c5f211 USB-serial fixes for v4.10-rc4
These fixes address a number of issues in the ch341 driver and includes
 a partial revert of a change in how we set the line settings that went
 into 4.10-rc1 but which turned out to have undesired side effects. This
 included deasserting the modem-control lines when configuring the
 device, but also prevented a certain class of CH340 devices from working
 with the driver.
 
 Included are also two fixes for two minor information leaks in
 kl5kusb105 and ch341 due to failures to detect short control transfers.
 
 Signed-off-by: Johan Hovold <johan@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIuBAABCAAYBQJYd5F7ERxqb2hhbkBrZXJuZWwub3JnAAoJEEEN5E/e4bSVJPoP
 /2nebFgZd4efzi1NQNwowwjF4f0tIRx1ysMEfwf0l5QjtY8ZFoXmkIb0q6puIqSB
 6sl1s+1t+FMXlnIXYLO6nSxMl18o/hOCZ/1fw7rj6QnbtSiLxTrUa13tdRujsWLW
 1JFltCqThnToIxzms5ns+Pxhkq1T+cyLD0dCa++8VFNmr20E+ZGip3KGiu1exT/V
 lJPV+sEMaCDlqnKhnW313KDcJ5zL/lxPDwKdFD0+6+uTOcoeZ7wNUaTHDUhNyFWV
 d0nqxp9bQ2mENIVGIkZy+gewBQHJuiw+oo6qjDIbUn3mSMcS08D22+ZTEc56sLl5
 cbVxr5KAVqz5v/XovY6b9lvaBUTytvm2g9xaLk7QoY3Mn8r8TPWK5PxbZ7ZO62v6
 bi7+FUUp2UW7NoGXbz315PsomsalCwQSj2PS96Zoh82mY7qYxORsRuPrdFe6RjHL
 EVQMoq2HmlhwBa7bJmrm9h52Za8LBc0qH+szpwCLubCbnDQlsHLF+bQgrNs29c4f
 07JjhjtK5ihNU00llren1pJHRtgszoqOArjiU6BNpCBs16VBxWp9ZtZv1shd1Pyp
 v1VYPidmdGNUC+PdWe35rqYwaNIQa9yF/+0J0c0iVxbyQ/1IxSKLe8R7SQJGLYGE
 pjtrD9sZixFGwF0Xz9mR96DRMTWbCRIzMVGE5cBFIkIB
 =inRb
 -----END PGP SIGNATURE-----

Merge tag 'usb-serial-4.10-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-linus

Johan writes:

USB-serial fixes for v4.10-rc4

These fixes address a number of issues in the ch341 driver and includes
a partial revert of a change in how we set the line settings that went
into 4.10-rc1 but which turned out to have undesired side effects. This
included deasserting the modem-control lines when configuring the
device, but also prevented a certain class of CH340 devices from working
with the driver.

Included are also two fixes for two minor information leaks in
kl5kusb105 and ch341 due to failures to detect short control transfers.

Signed-off-by: Johan Hovold <johan@kernel.org>
2017-01-12 18:17:38 +01:00
Damien Le Moal
f99e86485c block: Rename blk_queue_zone_size and bdev_zone_size
All block device data fields and functions returning a number of 512B
sectors are by convention named xxx_sectors while names in the form
xxx_size are generally used for a number of bytes. The blk_queue_zone_size
and bdev_zone_size functions were not following this convention so rename
them.

No functional change is introduced by this patch.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>

Collapsed the two patches, they were nonsensically split and broke
bisection.

Signed-off-by: Jens Axboe <axboe@fb.com>
2017-01-12 07:58:32 -07:00
Paolo Bonzini
33ab91103b KVM: x86: fix emulation of "MOV SS, null selector"
This is CVE-2017-2583.  On Intel this causes a failed vmentry because
SS's type is neither 3 nor 7 (even though the manual says this check is
only done for usable SS, and the dmesg splat says that SS is unusable!).
On AMD it's worse: svm.c is confused and sets CPL to 0 in the vmcb.

The fix fabricates a data segment descriptor when SS is set to a null
selector, so that CPL and SS.DPL are set correctly in the VMCS/vmcb.
Furthermore, only allow setting SS to a NULL selector if SS.RPL < 3;
this in turn ensures CPL < 3 because RPL must be equal to CPL.

Thanks to Andy Lutomirski and Willy Tarreau for help in analyzing
the bug and deciphering the manuals.

Reported-by: Xiaohan Zhang <zhangxiaohan1@huawei.com>
Fixes: 79d5b4c3cd
Cc: stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-12 15:17:13 +01:00
Jike Song
19c816e8e4 capability: export has_capability
has_capability() is sometimes needed by modules to test capability
for specified task other than current, so export it.

Cc: Kirti Wankhede <kwankhede@nvidia.com>
Signed-off-by: Jike Song <jike.song@intel.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2017-01-12 07:01:56 -07:00
Wanpeng Li
546d87e5c9 KVM: x86: fix NULL deref in vcpu_scan_ioapic
Reported by syzkaller:

    BUG: unable to handle kernel NULL pointer dereference at 00000000000001b0
    IP: _raw_spin_lock+0xc/0x30
    PGD 3e28eb067
    PUD 3f0ac6067
    PMD 0
    Oops: 0002 [#1] SMP
    CPU: 0 PID: 2431 Comm: test Tainted: G           OE   4.10.0-rc1+ #3
    Call Trace:
     ? kvm_ioapic_scan_entry+0x3e/0x110 [kvm]
     kvm_arch_vcpu_ioctl_run+0x10a8/0x15f0 [kvm]
     ? pick_next_task_fair+0xe1/0x4e0
     ? kvm_arch_vcpu_load+0xea/0x260 [kvm]
     kvm_vcpu_ioctl+0x33a/0x600 [kvm]
     ? hrtimer_try_to_cancel+0x29/0x130
     ? do_nanosleep+0x97/0xf0
     do_vfs_ioctl+0xa1/0x5d0
     ? __hrtimer_init+0x90/0x90
     ? do_nanosleep+0x5b/0xf0
     SyS_ioctl+0x79/0x90
     do_syscall_64+0x6e/0x180
     entry_SYSCALL64_slow_path+0x25/0x25
    RIP: _raw_spin_lock+0xc/0x30 RSP: ffffa43688973cc0

The syzkaller folks reported a NULL pointer dereference due to
ENABLE_CAP succeeding even without an irqchip.  The Hyper-V
synthetic interrupt controller is activated, resulting in a
wrong request to rescan the ioapic and a NULL pointer dereference.

    #include <sys/ioctl.h>
    #include <sys/mman.h>
    #include <sys/types.h>
    #include <linux/kvm.h>
    #include <pthread.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    #ifndef KVM_CAP_HYPERV_SYNIC
    #define KVM_CAP_HYPERV_SYNIC 123
    #endif

    void* thr(void* arg)
    {
	struct kvm_enable_cap cap;
	cap.flags = 0;
	cap.cap = KVM_CAP_HYPERV_SYNIC;
	ioctl((long)arg, KVM_ENABLE_CAP, &cap);
	return 0;
    }

    int main()
    {
	void *host_mem = mmap(0, 0x1000, PROT_READ|PROT_WRITE,
			MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
	int kvmfd = open("/dev/kvm", 0);
	int vmfd = ioctl(kvmfd, KVM_CREATE_VM, 0);
	struct kvm_userspace_memory_region memreg;
	memreg.slot = 0;
	memreg.flags = 0;
	memreg.guest_phys_addr = 0;
	memreg.memory_size = 0x1000;
	memreg.userspace_addr = (unsigned long)host_mem;
	host_mem[0] = 0xf4;
	ioctl(vmfd, KVM_SET_USER_MEMORY_REGION, &memreg);
	int cpufd = ioctl(vmfd, KVM_CREATE_VCPU, 0);
	struct kvm_sregs sregs;
	ioctl(cpufd, KVM_GET_SREGS, &sregs);
	sregs.cr0 = 0;
	sregs.cr4 = 0;
	sregs.efer = 0;
	sregs.cs.selector = 0;
	sregs.cs.base = 0;
	ioctl(cpufd, KVM_SET_SREGS, &sregs);
	struct kvm_regs regs = { .rflags = 2 };
	ioctl(cpufd, KVM_SET_REGS, &regs);
	ioctl(vmfd, KVM_CREATE_IRQCHIP, 0);
	pthread_t th;
	pthread_create(&th, 0, thr, (void*)(long)cpufd);
	usleep(rand() % 10000);
	ioctl(cpufd, KVM_RUN, 0);
	pthread_join(th, 0);
	return 0;
    }

This patch fixes it by failing ENABLE_CAP if without an irqchip.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Fixes: 5c919412fe (kvm/x86: Hyper-V synthetic interrupt controller)
Cc: stable@vger.kernel.org # 4.5+
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-12 14:52:52 +01:00
Wanpeng Li
4f3dbdf47e KVM: eventfd: fix NULL deref irqbypass consumer
Reported syzkaller:

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
    IP: irq_bypass_unregister_consumer+0x9d/0xb70 [irqbypass]
    PGD 0

    Oops: 0002 [#1] SMP
    CPU: 1 PID: 125 Comm: kworker/1:1 Not tainted 4.9.0+ #1
    Workqueue: kvm-irqfd-cleanup irqfd_shutdown [kvm]
    task: ffff9bbe0dfbb900 task.stack: ffffb61802014000
    RIP: 0010:irq_bypass_unregister_consumer+0x9d/0xb70 [irqbypass]
    Call Trace:
     irqfd_shutdown+0x66/0xa0 [kvm]
     process_one_work+0x16b/0x480
     worker_thread+0x4b/0x500
     kthread+0x101/0x140
     ? process_one_work+0x480/0x480
     ? kthread_create_on_node+0x60/0x60
     ret_from_fork+0x25/0x30
    RIP: irq_bypass_unregister_consumer+0x9d/0xb70 [irqbypass] RSP: ffffb61802017e20
    CR2: 0000000000000008

The syzkaller folks reported a NULL pointer dereference that due to
unregister an consumer which fails registration before. The syzkaller
creates two VMs w/ an equal eventfd occasionally. So the second VM
fails to register an irqbypass consumer. It will make irqfd as inactive
and queue an workqueue work to shutdown irqfd and unregister the irqbypass
consumer when eventfd is closed. However, the second consumer has been
initialized though it fails registration. So the token(same as the first
VM's) is taken to unregister the consumer through the workqueue, the
consumer of the first VM is found and unregistered, then NULL deref incurred
in the path of deleting consumer from the consumers list.

This patch fixes it by making irq_bypass_register/unregister_consumer()
looks for the consumer entry based on consumer pointer itself instead of
token matching.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Cc: stable@vger.kernel.org
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-12 14:42:34 +01:00
Steve Rutherford
129a72a0d3 KVM: x86: Introduce segmented_write_std
Introduces segemented_write_std.

Switches from emulated reads/writes to standard read/writes in fxsave,
fxrstor, sgdt, and sidt.  This fixes CVE-2017-2584, a longstanding
kernel memory leak.

Since commit 283c95d0e3 ("KVM: x86: emulate FXSAVE and FXRSTOR",
2016-11-09), which is luckily not yet in any final release, this would
also be an exploitable kernel memory *write*!

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: stable@vger.kernel.org
Fixes: 96051572c8
Fixes: 283c95d0e3
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Steve Rutherford <srutherford@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-12 14:34:58 +01:00
David Matlack
cef84c302f KVM: x86: flush pending lapic jump label updates on module unload
KVM's lapic emulation uses static_key_deferred (apic_{hw,sw}_disabled).
These are implemented with delayed_work structs which can still be
pending when the KVM module is unloaded. We've seen this cause kernel
panics when the kvm_intel module is quickly reloaded.

Use the new static_key_deferred_flush() API to flush pending updates on
module unload.

Signed-off-by: David Matlack <dmatlack@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-12 14:33:17 +01:00
David Matlack
b6416e6101 jump_labels: API for flushing deferred jump label updates
Modules that use static_key_deferred need a way to synchronize with
any delayed work that is still pending when the module is unloaded.
Introduce static_key_deferred_flush() which flushes any pending
jump label updates.

Signed-off-by: David Matlack <dmatlack@google.com>
Cc: stable@vger.kernel.org
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-12 14:33:16 +01:00
Josh Poimboeuf
ff3f7e2475 x86/entry: Fix the end of the stack for newly forked tasks
When unwinding a task, the end of the stack is always at the same offset
right below the saved pt_regs, regardless of which syscall was used to
enter the kernel.  That convention allows the unwinder to verify that a
stack is sane.

However, newly forked tasks don't always follow that convention, as
reported by the following unwinder warning seen by Dave Jones:

  WARNING: kernel stack frame pointer at ffffc90001443f30 in kworker/u8:8:30468 has bad value           (null)

The warning was due to the following call chain:

  (ftrace handler)
  call_usermodehelper_exec_async+0x5/0x140
  ret_from_fork+0x22/0x30

The problem is that ret_from_fork() doesn't create a stack frame before
calling other functions.  Fix that by carefully using the frame pointer
macros.

In addition to conforming to the end of stack convention, this also
makes related stack traces more sensible by making it clear to the user
that ret_from_fork() was involved.

Reported-by: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/8854cdaab980e9700a81e9ebf0d4238e4bbb68ef.1483978430.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-12 09:28:29 +01:00
Josh Poimboeuf
2c96b2fe9c x86/unwind: Include __schedule() in stack traces
In the following commit:

  0100301bfd ("sched/x86: Rewrite the switch_to() code")

... the layout of the 'inactive_task_frame' struct was designed to have
a frame pointer header embedded in it, so that the unwinder could use
the 'bp' and 'ret_addr' fields to report __schedule() on the stack (or
ret_from_fork() for newly forked tasks which haven't actually run yet).

Finish the job by changing get_frame_pointer() to return a pointer to
inactive_task_frame's 'bp' field rather than 'bp' itself.  This allows
the unwinder to start one frame higher on the stack, so that it properly
reports __schedule().

Reported-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/598e9f7505ed0aba86e8b9590aa528c6c7ae8dcd.1483978430.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-12 09:28:28 +01:00
Josh Poimboeuf
84936118bd x86/unwind: Disable KASAN checks for non-current tasks
There are a handful of callers to save_stack_trace_tsk() and
show_stack() which try to unwind the stack of a task other than current.
In such cases, it's remotely possible that the task is running on one
CPU while the unwinder is reading its stack from another CPU, causing
the unwinder to see stack corruption.

These cases seem to be mostly harmless.  The unwinder has checks which
prevent it from following bad pointers beyond the bounds of the stack.
So it's not really a bug as long as the caller understands that
unwinding another task will not always succeed.

In such cases, it's possible that the unwinder may read a KASAN-poisoned
region of the stack.  Account for that by using READ_ONCE_NOCHECK() when
reading the stack of another task.

Use READ_ONCE() when reading the stack of the current task, since KASAN
warnings can still be useful for finding bugs in that case.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Jones <davej@codemonkey.org.uk>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/4c575eb288ba9f73d498dfe0acde2f58674598f1.1483978430.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-12 09:28:27 +01:00
Josh Poimboeuf
900742d89c x86/unwind: Silence warnings for non-current tasks
There are a handful of callers to save_stack_trace_tsk() and
show_stack() which try to unwind the stack of a task other than current.
In such cases, it's remotely possible that the task is running on one
CPU while the unwinder is reading its stack from another CPU, causing
the unwinder to see stack corruption.

These cases seem to be mostly harmless.  The unwinder has checks which
prevent it from following bad pointers beyond the bounds of the stack.
So it's not really a bug as long as the caller understands that
unwinding another task will not always succeed.

Since stack "corruption" on another task's stack isn't necessarily a
bug, silence the warnings when unwinding tasks other than current.

Reported-by: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Miroslav Benes <mbenes@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/00d8c50eea3446c1524a2a755397a3966629354c.1483978430.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-12 09:28:27 +01:00
Brendan McGrath
a89af4abdf HID: i2c-hid: Add sleep between POWER ON and RESET
Support for the Asus Touchpad was recently added. It turns out this
device can fail initialisation (and become unusable) when the RESET
command is sent too soon after the POWER ON command.

Unfortunately the i2c-hid specification does not specify the need for
a delay between these two commands. But it was discovered the Windows
driver has a 1ms delay.

As a result, this patch modifies the i2c-hid module to add a sleep
inbetween the POWER ON and RESET commands which lasts between 1ms and 5ms.

See https://github.com/vlasenko/hid-asus-dkms/issues/24 for further
details.

Signed-off-by: Brendan McGrath <redmcg@redmandi.dyndns.org>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2017-01-11 21:55:35 +01:00
Linus Torvalds
ba836a6f5a Merge branch 'akpm' (patches from Andrew)
Merge fixes from Andrew Morton:
 "27 fixes.

  There are three patches that aren't actually fixes. They're simple
  function renamings which are nice-to-have in mainline as ongoing net
  development depends on them."

* akpm: (27 commits)
  timerfd: export defines to userspace
  mm/hugetlb.c: fix reservation race when freeing surplus pages
  mm/slab.c: fix SLAB freelist randomization duplicate entries
  zram: support BDI_CAP_STABLE_WRITES
  zram: revalidate disk under init_lock
  mm: support anonymous stable page
  mm: add documentation for page fragment APIs
  mm: rename __page_frag functions to __page_frag_cache, drop order from drain
  mm: rename __alloc_page_frag to page_frag_alloc and __free_page_frag to page_frag_free
  mm, memcg: fix the active list aging for lowmem requests when memcg is enabled
  mm: don't dereference struct page fields of invalid pages
  mailmap: add codeaurora.org names for nameless email commits
  signal: protect SIGNAL_UNKILLABLE from unintentional clearing.
  mm: pmd dirty emulation in page fault handler
  ipc/sem.c: fix incorrect sem_lock pairing
  lib/Kconfig.debug: fix frv build failure
  mm: get rid of __GFP_OTHER_NODE
  mm: fix remote numa hits statistics
  mm: fix devm_memremap_pages crash, use mem_hotplug_{begin, done}
  ocfs2: fix crash caused by stale lvb with fsdlm plugin
  ...
2017-01-11 11:15:15 -08:00
Dan Carpenter
73da4268fd vfio-mdev: remove some dead code
We set info.count to 1 in mtty_get_irq_info() so static checkers
complain that, "Why do we have impossible conditions?"  The answer is
that it seems to be left over dead code that can be safely removed.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2017-01-11 12:12:37 -07:00
Dan Carpenter
5c677869e0 vfio-mdev: buffer overflow in ioctl()
This is a sample driver for documentation so the impact is probably
pretty low.  But we should check that bar_index is valid so we
don't write beyond the end of the mdev_state->region_info[] array.

Fixes: 9d1a546c53 ("docs: Sample driver to demonstrate how to use Mediated device framework.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2017-01-11 12:12:29 -07:00
Dan Carpenter
6ed0993a0b vfio-mdev: return -EFAULT if copy_to_user() fails
The copy_to_user() function returns the number of bytes which it wasn't
able to copy but we want to return a negative error code.

Fixes: 9d1a546c53 ("docs: Sample driver to demonstrate how to use Mediated device framework.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Kirti Wankhede <kwankhede@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2017-01-11 12:06:35 -07:00
Takashi Iwai
6cf4569ce3 ASoC: Fixes for v4.10
As well as the usual smattering of driver specific fixes collected since
 the merge window this has one particularly important fix to the core for
 handling of aux_devs which was broken during the merge window by some of
 the componentization refactoring.
 -----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCAAxFiEEreZoqmdXGLWf4p/qJNaLcl1Uh9AFAlh2as0THGJyb29uaWVA
 a2VybmVsLm9yZwAKCRAk1otyXVSH0LBrB/92Z6gD0GrbQjP6LkMJ0SwmAMjWOOy+
 hDTr8m9CNwSHwQW/L+rAnyS8WBB46jiJ4/mTw6Sz7YIyY0Xdv5RY7IPPuWC92JQd
 jA+0lcfGe0p86ZvVhK2tye+EHTBqKgfIzO2Sl5XNzaQZiw0S8g/FjJIjBABOGkty
 oyK2iYHAW5H7aNVZfoXR9QQBqWniSh5hh06tCDs7Gy90zlKSOoWDUUfux5pubzVR
 mXOxTnie6bU7Rf0IKzdAQ5EI3zt2XT3XtFgv47VYp4bKW8LbkSo8JCVORGymoq+c
 k+Oc8YPbpAY5Jh4tZ9tSup1Ce7DJvE1sf4VOuHkAoXjKO+Pjp+/qTo50
 =KUQm
 -----END PGP SIGNATURE-----

Merge tag 'asoc-fix-v4.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v4.10

As well as the usual smattering of driver specific fixes collected since
the merge window this has one particularly important fix to the core for
handling of aux_devs which was broken during the merge window by some of
the componentization refactoring.
2017-01-11 19:49:27 +01:00
Jan Kara
0a417b8dc1 xfs: Timely free truncated dirty pages
Commit 99579ccec4 "xfs: skip dirty pages in ->releasepage()" started
to skip dirty pages in xfs_vm_releasepage() which also has the effect
that if a dirty page is truncated, it does not get freed by
block_invalidatepage() and is lingering in LRU list waiting for reclaim.
So a simple loop like:

while true; do
	dd if=/dev/zero of=file bs=1M count=100
	rm file
done

will keep using more and more memory until we hit low watermarks and
start pagecache reclaim which will eventually reclaim also the truncate
pages. Keeping these truncated (and thus never usable) pages in memory
is just a waste of memory, is unnecessarily stressing page cache
reclaim, and reportedly also leads to anonymous mmap(2) returning ENOMEM
prematurely.

So instead of just skipping dirty pages in xfs_vm_releasepage(), return
to old behavior of skipping them only if they have delalloc or unwritten
buffers and fix the spurious warnings by warning only if the page is
clean.

CC: stable@vger.kernel.org
CC: Brian Foster <bfoster@redhat.com>
CC: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Petr Tůma <petr.tuma@d3s.mff.cuni.cz>
Fixes: 99579ccec4
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-01-11 10:20:04 -08:00
Linus Torvalds
cff3b2c4b3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Fix rtlwifi crash, from Larry Finger.

 2) Memory disclosure in appletalk ipddp routing code, from Vlad
    Tsyrklevich.

 3) r8152 can erroneously split an RX packet into multiple URBs if the
    Rx FIFO is not empty when we suspend. Fix this by waiting for the
    FIFO to empty before suspending. From Hayes Wang.

 4) Two GRO fixes (enter slow path when not enough SKB tail room exists,
    disable frag0 optimizations when there are IPV6 extension headers)
    from Eric Dumazet and Herbert Xu.

 5) A series of mlx5e bug fixes (do source udp port offloading for
    tunnels properly, Ip fragment matching fixes, handling firmware
    errors properly when installing TC rules, etc.) from Saeed Mahameed,
    Or Gerlitz, Roi Dayan, Hadar Hen Zion, Gil Rockah, and Daniel
    Jurgens.

 6) Two VRF fixes from David Ahern (don't skip multipath selection for
    VRF paths, disallow VRF to be configured with table ID 0).

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (35 commits)
  net: vrf: do not allow table id 0
  net: phy: marvell: fix Marvell 88E1512 used in SGMII mode
  sctp: Fix spelling mistake: "Atempt" -> "Attempt"
  net: ipv4: Fix multipath selection with vrf
  cgroup: move CONFIG_SOCK_CGROUP_DATA to init/Kconfig
  gro: use min_t() in skb_gro_reset_offset()
  net/mlx5: Only cancel recovery work when cleaning up device
  net/mlx5e: Remove WARN_ONCE from adaptive moderation code
  net/mlx5e: Un-register uplink representor on nic_disable
  net/mlx5e: Properly handle FW errors while adding TC rules
  net/mlx5e: Fix kbuild warnings for uninitialized parameters
  net/mlx5e: Set inline mode requirements for matching on IP fragments
  net/mlx5e: Properly get address type of encapsulation IP headers
  net/mlx5e: TC ipv4 tunnel encap offload error flow fixes
  net/mlx5e: Warn when rejecting offload attempts of IP tunnels
  net/mlx5e: Properly handle offloading of source udp port for IP tunnels
  gro: Disable frag0 optimization on IPv6 ext headers
  gro: Enter slow-path if there is no tailroom
  mlx4: Return EOPNOTSUPP instead of ENOTSUPP
  net/af_iucv: don't use paged skbs for TX on HiperSockets
  ...
2017-01-11 09:52:12 -08:00
Linus Torvalds
a6b6e61650 Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fix from Herbert Xu:
 "This fixes a regression in aesni that renders it useless if it's
  built-in with a modular pcbc configuration"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: aesni - Fix failure when built-in with modular pcbc
2017-01-11 09:28:13 -08:00
Guilherme G. Piccoli
b5a10c5f75 nvme: apply DELAY_BEFORE_CHK_RDY quirk at probe time too
Commit 54adc01055 ("nvme/quirk: Add a delay before checking for adapter
readiness") introduced a quirk to adapters that cannot read the bit
NVME_CSTS_RDY right after register NVME_REG_CC is set; these adapters
need a delay or else the action of reading the bit NVME_CSTS_RDY could
somehow corrupt adapter's registers state and it never recovers.

When this quirk was added, we checked ctrl->tagset in order to avoid
quirking in probe time, supposing we would never require such delay
during probe. Well, it was too optimistic; we in fact need this quirk
at probe time in some cases, like after a kexec.

In some experiments, after abnormal shutdown of machine (aka power cord
unplug), we booted into our bootloader in Power, which is a Linux kernel,
and kexec'ed into another distro. If this kexec is too quick, we end up
reaching the probe of NVMe adapter in that distro when adapter is in
bad state (not fully initialized on our bootloader). What happens next
is that nvme_wait_ready() is unable to complete, except if the quirk is
enabled.

So, this patch removes the original ctrl->tagset verification in order
to enable the quirk even on probe time.

Fixes: 54adc01055 ("nvme/quirk: Add a delay before checking for adapter readiness")
Reported-by: Andrew Byrne <byrneadw@ie.ibm.com>
Reported-by: Jaime A. H. Gomez <jahgomez@mx1.ibm.com>
Reported-by: Zachary D. Myers <zdmyers@us.ibm.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Acked-by: Jeffrey Lien <Jeff.Lien@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2017-01-11 17:21:35 +01:00
Christoph Hellwig
1392370ee7 nvme-rdma: fix nvme_rdma_queue_is_ready
Now that we don't abuse the cmd field in struct request for nvme command
passthrough this function needs to be converted to the proper accessor
as well.

Fixes: d49187e97e ("nvme: introduce struct nvme_request")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
2017-01-11 17:20:39 +01:00
Mathias Nyman
d6169d0409 xhci: fix deadlock at host remove by running watchdog correctly
If a URB is killed while the host is removed we can end up in a situation
where the hub thread takes the roothub device lock, and waits for
the URB to be given back by xhci-hcd, blocking the host remove code.

xhci-hcd tries to stop the endpoint and give back the urb, but can't
as the host is removed from PCI bus at the same time, preventing the normal
way of giving back urb.

Instead we need to rely on the stop command timeout function to give back
the urb. This xhci_stop_endpoint_command_watchdog() timeout function
used a XHCI_STATE_DYING flag to indicate if the timeout function is already
running, but later this flag has been taking into use in other places to
mark that xhci is dying.

Remove checks for XHCI_STATE_DYING in xhci_urb_dequeue. We are still
checking that reading from pci state does not return 0xffffffff or that
host is not halted before trying to stop the endpoint.

This whole area of stopping endpoints, giving back URBs, and the wathdog
timeout need rework, this fix focuses on solving a specific deadlock
issue that we can then send to stable before any major rework.

Cc: <stable@vger.kernel.org>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-11 16:52:13 +01:00
Colin King
ad5013d569 perf/x86/intel: Use ULL constant to prevent undefined shift behaviour
When x86_pmu.num_counters is 32 the shift of the integer constant 1 is
exceeding 32bit and therefor undefined behaviour.

Fix this by shifting 1ULL instead of 1.

Reported-by: CoverityScan CID#1192105 ("Bad bit shift operation")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Link: http://lkml.kernel.org/r/20170111114310.17928-1-colin.king@canonical.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-01-11 16:43:30 +01:00
David Ahern
24c63bbc18 net: vrf: do not allow table id 0
Frank reported that vrf devices can be created with a table id of 0.
This breaks many of the run time table id checks and should not be
allowed. Detect this condition at create time and fail with EINVAL.

Fixes: 193125dbd8 ("net: Introduce VRF device driver")
Reported-by: Frank Kellermann <frank.kellermann@atos.net>
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-11 10:04:01 -05:00
Russell King
a13c06525a net: phy: marvell: fix Marvell 88E1512 used in SGMII mode
When an Marvell 88E1512 PHY is connected to a nic in SGMII mode, the
fiber page is used for the SGMII host-side connection.  The PHY driver
notices that SUPPORTED_FIBRE is set, so it tries reading the fiber page
for the link status, and ends up reading the MAC-side status instead of
the outgoing (copper) link.  This leads to incorrect results reported
via ethtool.

If the PHY is connected via SGMII to the host, ignore the fiber page.
However, continue to allow the existing power management code to
suspend and resume the fiber page.

Fixes: 6cfb3bcc06 ("Marvell phy: check link status in case of fiber link.")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-11 10:02:37 -05:00
Colin Ian King
eb004603c8 sctp: Fix spelling mistake: "Atempt" -> "Attempt"
Trivial fix to spelling mistake in WARN_ONCE message

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-11 10:01:01 -05:00
David Ahern
7a18c5b9fb net: ipv4: Fix multipath selection with vrf
fib_select_path does not call fib_select_multipath if oif is set in the
flow struct. For VRF use cases oif is always set, so multipath route
selection is bypassed. Use the FLOWI_FLAG_SKIP_NH_OIF to skip the oif
check similar to what is done in fib_table_lookup.

Add saddr and proto to the flow struct for the fib lookup done by the
VRF driver to better match hash computation for a flow.

Fixes: 613d09b30f ("net: Use VRF device index for lookups on TX")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-11 09:59:55 -05:00
Arnd Bergmann
73b3514735 cgroup: move CONFIG_SOCK_CGROUP_DATA to init/Kconfig
We now 'select SOCK_CGROUP_DATA' but Kconfig complains that this is
not right when CONFIG_NET is disabled and there is no socket interface:

warning: (CGROUP_BPF) selects SOCK_CGROUP_DATA which has unmet direct dependencies (NET)

I don't know what the correct solution for this is, but simply removing
the dependency on NET from SOCK_CGROUP_DATA by moving it out of the
'if NET' section avoids the warning and does not produce other build
errors.

Fixes: 483c4933ea ("cgroup: Fix CGROUP_BPF config")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-11 09:47:10 -05:00
Chris Mason
0bf70aebf1 Merge branch 'tracepoint-updates-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.10 2017-01-11 06:26:12 -08:00
Eric Dumazet
7cfd5fd5a9 gro: use min_t() in skb_gro_reset_offset()
On 32bit arches, (skb->end - skb->data) is not 'unsigned int',
so we shall use min_t() instead of min() to avoid a compiler error.

Fixes: 1272ce87fa ("gro: Enter slow-path if there is no tailroom")
Reported-by: kernel test robot <fengguang.wu@intel.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-11 08:15:40 -05:00
Prarit Bhargava
6d6daa2094 perf/x86/intel/uncore: Fix hardcoded socket 0 assumption in the Haswell init code
hswep_uncore_cpu_init() uses a hardcoded physical package id 0 for the boot
cpu. This works as long as the boot CPU is actually on the physical package
0, which is normaly the case after power on / reboot.

But it fails with a NULL pointer dereference when a kdump kernel is started
on a secondary socket which has a different physical package id because the
locigal package translation for physical package 0 does not exist.

Use the logical package id of the boot cpu instead of hard coded 0.

[ tglx: Rewrote changelog once more ]

Fixes: cf6d445f68 ("perf/x86/uncore: Track packages, not per CPU data")
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Harish Chegondi <harish.chegondi@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1483628965-2890-1-git-send-email-prarit@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-01-11 12:13:21 +01:00
Johan Hovold
2d5a9c72d0 USB: serial: ch341: fix control-message error handling
A short control transfer would currently fail to be detected, something
which could lead to stale buffer data being used as valid input.

Check for short transfers, and make sure to log any transfer errors.

Note that this also avoids leaking heap data to user space (TIOCMGET)
and the remote device (break control).

Fixes: 6ce7610478 ("USB: Driver for CH341 USB-serial adaptor")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
2017-01-11 12:08:57 +01:00
Huang Shijie
69d012345a arm64: hugetlb: fix the wrong return value for huge_ptep_set_access_flags
In current code, the @changed always returns the last one's status for
the huge page with the contiguous bit set. This is really not what we
want. Even one of the PTEs is changed, we should tell it to the caller.

This patch fixes this issue.

Fixes: 66b3923a1a ("arm64: hugetlb: add support for PTE contiguous bit")
Cc: <stable@vger.kernel.org> # 4.5.x-
Signed-off-by: Huang Shijie <shijie.huang@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2017-01-11 10:26:40 +00:00
Augusto Mecking Caringi
c8a6a09c1c vme: Fix wrong pointer utilization in ca91cx42_slave_get
In ca91cx42_slave_get function, the value pointed by vme_base pointer is
set through:

*vme_base = ioread32(bridge->base + CA91CX42_VSI_BS[i]);

So it must be dereferenced to be used in calculation of pci_base:

*pci_base = (dma_addr_t)*vme_base + pci_offset;

This bug was caught thanks to the following gcc warning:

drivers/vme/bridges/vme_ca91cx42.c: In function ‘ca91cx42_slave_get’:
drivers/vme/bridges/vme_ca91cx42.c:467:14: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
*pci_base = (dma_addr_t)vme_base + pci_offset;

Signed-off-by: Augusto Mecking Caringi <augustocaringi@gmail.com>
Acked-By: Martyn Welch <martyn@welchs.me.uk>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-01-11 10:42:16 +01:00
Frederic Weisbecker
24b91e360e nohz: Fix collision between tick and other hrtimers
When the tick is stopped and an interrupt occurs afterward, we check on
that interrupt exit if the next tick needs to be rescheduled. If it
doesn't need any update, we don't want to do anything.

In order to check if the tick needs an update, we compare it against the
clockevent device deadline. Now that's a problem because the clockevent
device is at a lower level than the tick itself if it is implemented
on top of hrtimer.

Every hrtimer share this clockevent device. So comparing the next tick
deadline against the clockevent device deadline is wrong because the
device may be programmed for another hrtimer whose deadline collides
with the tick. As a result we may end up not reprogramming the tick
accidentally.

In a worst case scenario under full dynticks mode, the tick stops firing
as it is supposed to every 1hz, leaving /proc/stat stalled:

      Task in a full dynticks CPU
      ----------------------------

      * hrtimer A is queued 2 seconds ahead
      * the tick is stopped, scheduled 1 second ahead
      * tick fires 1 second later
      * on tick exit, nohz schedules the tick 1 second ahead but sees
        the clockevent device is already programmed to that deadline,
        fooled by hrtimer A, the tick isn't rescheduled.
      * hrtimer A is cancelled before its deadline
      * tick never fires again until an interrupt happens...

In order to fix this, store the next tick deadline to the tick_sched
local structure and reuse that value later to check whether we need to
reprogram the clock after an interrupt.

On the other hand, ts->sleep_length still wants to know about the next
clock event and not just the tick, so we want to improve the related
comment to avoid confusion.

Reported-by: James Hartsock <hartsjc@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Reviewed-by: Wanpeng Li <wanpeng.li@hotmail.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Rik van Riel <riel@redhat.com>
Link: http://lkml.kernel.org/r/1483539124-5693-1-git-send-email-fweisbec@gmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2017-01-11 10:41:33 +01:00