linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-19 18:41:48 +00:00

Author	SHA1	Message	Date
Linus Torvalds	9a9594efe5	Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull SMP hotplug updates from Thomas Gleixner: "This update is primarily a cleanup of the CPU hotplug locking code. The hotplug locking mechanism is an open coded RWSEM, which allows recursive locking. The main problem with that is the recursive nature as it evades the full lockdep coverage and hides potential deadlocks. The rework replaces the open coded RWSEM with a percpu RWSEM and establishes full lockdep coverage that way. The bulk of the changes fix up recursive locking issues and address the now fully reported potential deadlocks all over the place. Some of these deadlocks have been observed in the RT tree, but on mainline the probability was low enough to hide them away." * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (37 commits) cpu/hotplug: Constify attribute_group structures powerpc: Only obtain cpu_hotplug_lock if called by rtasd ARM/hw_breakpoint: Fix possible recursive locking for arch_hw_breakpoint_init cpu/hotplug: Remove unused check_for_tasks() function perf/core: Don't release cred_guard_mutex if not taken cpuhotplug: Link lock stacks for hotplug callbacks acpi/processor: Prevent cpu hotplug deadlock sched: Provide is_percpu_thread() helper cpu/hotplug: Convert hotplug locking to percpu rwsem s390: Prevent hotplug rwsem recursion arm: Prevent hotplug rwsem recursion arm64: Prevent cpu hotplug rwsem recursion kprobes: Cure hotplug lock ordering issues jump_label: Reorder hotplug lock and jump_label_lock perf/tracing/cpuhotplug: Fix locking order ACPI/processor: Use cpu_hotplug_disable() instead of get_online_cpus() PCI: Replace the racy recursion prevention PCI: Use cpu_hotplug_disable() instead of get_online_cpus() perf/x86/intel: Drop get_online_cpus() in intel_snb_check_microcode() x86/perf: Drop EXPORT of perf_check_microcode ...	2017-07-03 18:08:06 -07:00
Linus Torvalds	e0f3e8f14d	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Martin Schwidefsky: "The bulk of the s390 patches for 4.13. Some new things but mostly bug fixes and cleanups. Noteworthy changes: - The SCM block driver is converted to blk-mq - Switch s390 to 5 level page tables. The virtual address space for a user space process can now have up to 16EB-4KB. - Introduce a ELF phdr flag for qemu to avoid the global vm.alloc_pgste which forces all processes to large page tables - A couple of PCI improvements to improve error recovery - Included is the merge of the base support for proper machine checks for KVM" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (52 commits) s390/dasd: Fix faulty ENODEV for RO sysfs attribute s390/pci: recognize name clashes with uids s390/pci: provide more debug information s390/pci: fix handling of PEC 306 s390/pci: improve pci hotplug s390/pci: introduce clp_get_state s390/pci: improve error handling during fmb (de)registration s390/pci: improve unreg_ioat error handling s390/pci: improve error handling during interrupt deregistration s390/pci: don't cleanup in arch_setup_msi_irqs KVM: s390: Backup the guest's machine check info s390/nmi: s390: New low level handling for machine check happening in guest s390/fpu: export save_fpu_regs for all configs s390/kvm: avoid global config of vm.alloc_pgste=1 s390: rename struct psw_bits members s390: rename psw_bits enums s390/mm: use correct address space when enabling DAT s390/cio: introduce io_subchannel_type s390/ipl: revert Load Normal semantics for LPAR CCW-type re-IPL s390/dumpstack: remove raw stack dump ...	2017-07-03 15:39:36 -07:00
Linus Torvalds	c6b1e36c8f	Merge branch 'for-4.13/block' of git://git.kernel.dk/linux-block Pull core block/IO updates from Jens Axboe: "This is the main pull request for the block layer for 4.13. Not a huge round in terms of features, but there's a lot of churn related to some core cleanups. Note this depends on the UUID tree pull request, that Christoph already sent out. This pull request contains: - A series from Christoph, unifying the error/stats codes in the block layer. We now use blk_status_t everywhere, instead of using different schemes for different places. - Also from Christoph, some cleanups around request allocation and IO scheduler interactions in blk-mq. - And yet another series from Christoph, cleaning up how we handle and do bounce buffering in the block layer. - A blk-mq debugfs series from Bart, further improving on the support we have for exporting internal information to aid debugging IO hangs or stalls. - Also from Bart, a series that cleans up the request initialization differences across types of devices. - A series from Goldwyn Rodrigues, allowing the block layer to return failure if we will block and the user asked for non-blocking. - Patch from Hannes for supporting setting loop devices block size to that of the underlying device. - Two series of patches from Javier, fixing various issues with lightnvm, particular around pblk. - A series from me, adding support for write hints. This comes with NVMe support as well, so applications can help guide data placement on flash to improve performance, latencies, and write amplification. - A series from Ming, improving and hardening blk-mq support for stopping/starting and quiescing hardware queues. - Two pull requests for NVMe updates. Nothing major on the feature side, but lots of cleanups and bug fixes. From the usual crew. - A series from Neil Brown, greatly improving the bio rescue set support. Most notably, this kills the bio rescue work queues, if we don't really need them. 
- Lots of other little bug fixes that are all over the place" * 'for-4.13/block' of git://git.kernel.dk/linux-block: (217 commits) lightnvm: pblk: set line bitmap check under debug lightnvm: pblk: verify that cache read is still valid lightnvm: pblk: add initialization check lightnvm: pblk: remove target using async. I/Os lightnvm: pblk: use vmalloc for GC data buffer lightnvm: pblk: use right metadata buffer for recovery lightnvm: pblk: schedule if data is not ready lightnvm: pblk: remove unused return variable lightnvm: pblk: fix double-free on pblk init lightnvm: pblk: fix bad le64 assignations nvme: Makefile: remove dead build rule blk-mq: map all HWQ also in hyperthreaded system nvmet-rdma: register ib_client to not deadlock in device removal nvme_fc: fix error recovery on link down. nvmet_fc: fix crashes on bad opcodes nvme_fc: Fix crash when nvme controller connection fails. nvme_fc: replace ioabort msleep loop with completion nvme_fc: fix double calls to nvme_cleanup_cmd() nvme-fabrics: verify that a controller returns the correct NQN nvme: simplify nvme_dev_attrs_are_visible ...	2017-07-03 10:34:51 -07:00
Linus Torvalds	81e3e04489	UUID/GUID updates: - introduce the new uuid_t/guid_t types that are going to replace the somewhat confusing uuid_be/uuid_le types and make the terminology fit the various specs, as well as the userspace libuuid library. (me, based on a previous version from Amir) - consolidated generic uuid/guid helper functions lifted from XFS and libnvdimm (Amir and me) - conversions to the new types and helpers (Amir, Andy and me) -----BEGIN PGP SIGNATURE----- iQI/BAABCAApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAllZfmILHGhjaEBsc3Qu ZGUACgkQD55TZVIEUYMvyg/9EvWHOOsSdeDykCK3KdH2uIqnxwpl+m7ljccaGJIc MmaH0KnsP9p/Cuw5hESh2tYlmCYN7pmYziNXpf/LRS65/HpEYbs4oMqo8UQsN0UM 2IXHfXY0HnCoG5OixH8RNbFTkxuGphsTY8meaiDr6aAmqChDQI2yGgQLo3WM2/Qe R9N1KoBWH/bqY6dHv+urlFwtsREm2fBH+8ovVma3TO73uZCzJGLJBWy3anmZN+08 uYfdbLSyRN0T8rqemVdzsZ2SrpHYkIsYGUZV43F581vp8e/3OKMoMxpWRRd9fEsa MXmoaHcLJoBsyVSFR9lcx3axKrhAgBPZljASbbA0h49JneWXrzghnKBQZG2SnEdA ktHQ2sE4Yb5TZSvvWEKMQa3kXhEfIbTwgvbHpcDr5BUZX8WvEw2Zq8e7+Mi4+KJw QkvFC1S96tRYO2bxdJX638uSesGUhSidb+hJ/edaOCB/GK+sLhUdDTJgwDpUGmyA xVXTF51ramRS2vhlbzN79x9g33igIoNnG4/PV0FPvpCTSqxkHmPc5mK6Vals1lqt cW6XfUjSQECq5nmTBtYDTbA/T+8HhBgSQnrrvmferjJzZUFGr/7MXl+Evz2x4CjX OBQoAMu241w6Vp3zoXqxzv+muZ/NLar52M/zbi9TUjE0GvvRNkHvgCC4NmpIlWYJ Sxg= =J/4P -----END PGP SIGNATURE----- Merge tag 'uuid-for-4.13' of git://git.infradead.org/users/hch/uuid Pull uuid subsystem from Christoph Hellwig: "This is the new uuid subsystem, in which Amir, Andy and I have started consolidating our uuid/guid helpers and improving the types used for them. Note that various other subsystems have pulled in this tree, so I'd like it to go in early. UUID/GUID summary: - introduce the new uuid_t/guid_t types that are going to replace the somewhat confusing uuid_be/uuid_le types and make the terminology fit the various specs, as well as the userspace libuuid library. 
(me, based on a previous version from Amir) - consolidated generic uuid/guid helper functions lifted from XFS and libnvdimm (Amir and me) - conversions to the new types and helpers (Amir, Andy and me)" * tag 'uuid-for-4.13' of git://git.infradead.org/users/hch/uuid: (34 commits) ACPI: hns_dsaf_acpi_dsm_guid can be static mmc: sdhci-pci: make guid intel_dsm_guid static uuid: Take const on input of uuid_is_null() and guid_is_null() thermal: int340x_thermal: fix compile after the UUID API switch thermal: int340x_thermal: Switch to use new generic UUID API acpi: always include uuid.h ACPI: Switch to use generic guid_t in acpi_evaluate_dsm() ACPI / extlog: Switch to use new generic UUID API ACPI / bus: Switch to use new generic UUID API ACPI / APEI: Switch to use new generic UUID API acpi, nfit: Switch to use new generic UUID API MAINTAINERS: add uuid entry tmpfs: generate random sb->s_uuid scsi_debug: switch to uuid_t nvme: switch to uuid_t sysctl: switch to use uuid_t partitions/ldm: switch to use uuid_t overlayfs: use uuid_t instead of uuid_be fs: switch ->s_uuid to uuid_t ima/policy: switch to use uuid_t ...	2017-07-03 09:55:26 -07:00
Tobias Klauser	6474924e2b	arch: remove unused macro/function thread_saved_pc() The only user of thread_saved_pc() in non-arch-specific code was removed in commit `8243d55977` ("sched/core: Remove pointless printout in sched_show_task()"). Remove the implementations as well. Some architectures use thread_saved_pc() in their arch-specific code. Leave their thread_saved_pc() intact. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-06-28 16:13:57 -07:00
Martin Schwidefsky	9e293b5a70	s390,kvm: provide plumbing for machines checks when running guests This provides the basic plumbing for handling machine checks when running guests -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.14 (GNU/Linux) iQIcBAABAgAGBQJZU4QPAAoJEBF7vIC1phx8GZsP/2P4nxWXBj0NS/dNq54/u7HU Va/zHIG7nUX81WZi8OCkPRlvb1RlcgNpIdw3Ar+BueFE6/qwVWBSdstVJCg6JSn4 L8T1srSeV6yQEPq1/I9S8ERYtbC8bOC3dDF6g+KyaKYnICjq5yC01+86MKSVfLTI vFMPWY/PPCgECtXHjGpWBW6HjofRH3/H+XQbxaoTUyHKwWKdtvWer9K2V7Mc/Cf8 XsyLY2Xq0Y5MBsJs+71Qw8+0R041Et5I3H7Od9lIc3SFYNoenQpk5oTtsujMtDG1 ccMPZKErYI4wHE3Hy1ozK+MdFNbepUk3RBI3oXU25tpFPG3OPuksnOqCVN/iZmm+ le9RuUi9WOOsuygPj2dsnx5v+aheedEcYWqvQ/qrNlP3pXNcpl+8waM6eke8HyCK 1JKcqqGKBNX5wKNE9b5sRTHINWK12EVCQyVrgLlZaXoXLa40NpJPjtV27vr3ttVl WmGYgwMUTo15Rdr0NSJlXl8iCgIFtWMHvuRhIgp8pBuWWb28zr6aX4w++JPwOOMZ e4rzn55giCBDnjjDFQK2Knv5XxwnMKafYMxZXfC8gLr5ELjnI6vZDN+1zhT5L2S9 uXd8l6rLN2qik57RzPV6YEDS0iybZnx5HF/ZPrNoFigJpdD7/0jFS5K5N0i+AhV5 UQmGhSGnI7Teguc45mHT =CTzL -----END PGP SIGNATURE----- Merge tag 'nmiforkvm' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into features Pull kvm patches from Christian Borntraeger: "s390,kvm: provide plumbing for machines checks when running guests" This provides the basic plumbing for handling machine checks when running guests	2017-06-28 12:57:47 +02:00
Sebastian Ott	312e8462ab	s390/pci: recognize name clashes with uids When uid checking is enabled firmware guarantees uniqueness of the uids and we use them for device enumeration. Tests have shown that uid checking can be toggled at runtime. This is unfortunate since it can lead to name clashes. Recognize these name clashes by allocating bits in zpci_domain even for firmware provided ids. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2017-06-28 07:32:14 +02:00
Sebastian Ott	be2c36769f	s390/pci: provide more debug information Add some debug data to observe the lifetime of the architecture specific device information. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2017-06-28 07:32:14 +02:00
Sebastian Ott	01553d9a2b	s390/pci: fix handling of PEC 306 In contrast to other hotplug events PEC 0x306 isn't about a single but multiple devices. Also there's no information on what happened to these devices. We correctly handled hotplug that way but failed to handle hot-unplug. This patch addresses that and implements hot-unplug of multiple devices via PEC 306. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2017-06-28 07:32:13 +02:00
Sebastian Ott	623bd44d3f	s390/pci: improve pci hotplug PCI hotplug events basically notify about the new state of a function. Unfortunately some hypervisors implement hotplug events in a way where it is not clear what the new state of the function should be. Use clp_get_state to find the current state of the function and handle accordingly. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2017-06-28 07:32:12 +02:00
Sebastian Ott	783684f1f6	s390/pci: introduce clp_get_state Code handling pci hotplug needs to determine the configuration state of a pci function. Implement clp_get_state as a wrapper for list pci functions. Also change enum zpci_state to match the configuration state values. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2017-06-28 07:32:11 +02:00
Sebastian Ott	4e5bd7803b	s390/pci: improve error handling during fmb (de)registration Cleanup in zpci_fmb_enable_device when fmb registration fails. Also don't free the fmb when deregistration fails in zpci_fmb_disable_device but handle error situations when a function was hot-unplugged. Also remove the mod_pci helper since it is no longer used. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2017-06-28 07:32:10 +02:00
Sebastian Ott	7257083491	s390/pci: improve unreg_ioat error handling DMA tables are freed in zpci_dma_exit_device regardless of the return code of zpci_unregister_ioat. This could lead to a use after free. On the other hand during function hot-unplug, zpci_unregister_ioat will always fail since the function is already gone. So let zpci_unregister_ioat report success when the function is gone but don't cleanup the dma table when a function could still have it in access. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2017-06-28 07:32:09 +02:00
Sebastian Ott	4dfbd3efe3	s390/pci: improve error handling during interrupt deregistration When we ask a function to stop creating interrupts this may fail due to the function being already gone (e.g. after hot-unplug). Consequently we don't free associated resources like summary bits and bit vectors used for irq processing. This could lead to situations where we ran out of these resources and fail to setup new interrupts. The fix is to just ignore the errors in cases where we can be sure no new interrupts are generated. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2017-06-28 07:32:08 +02:00
Sebastian Ott	795818e8bf	s390/pci: don't cleanup in arch_setup_msi_irqs After failures in arch_setup_msi_irqs common code calls arch_teardown_msi_irqs. Thus, remove cleanup code from arch_setup_msi_irqs. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2017-06-28 07:32:05 +02:00
QingFeng Hao	da72ca4d40	KVM: s390: Backup the guest's machine check info When a machine check happens in the guest, related mcck info (mcic, external damage code, ...) is stored in the vcpu's lowcore on the host. Then the machine check handler's low-level part is executed, followed by the high-level part. If the high-level part's execution is interrupted by a new machine check happening on the same vcpu on the host, the mcck info in the lowcore is overwritten with the new machine check's data. If the high-level part's execution is scheduled to a different cpu, the mcck info in the lowcore is uncertain. Therefore, for both cases, the further reinjection to the guest will use the wrong data. Let's backup the mcck info in the lowcore to the sie page for further reinjection, so that the right data will be used. Add new member into struct sie_page to store related machine check's info of mcic, failing storage address and external damage code. Signed-off-by: QingFeng Hao <haoqf@linux.vnet.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2017-06-27 16:05:38 +02:00
QingFeng Hao	c929500d7a	s390/nmi: s390: New low level handling for machine check happening in guest Add the logic to check if the machine check happens when the guest is running. If yes, set the exit reason -EINTR in the machine check's interrupt handler. Refactor s390_do_machine_check to avoid panicking the host for some kinds of machine checks which happen when guest is running. Reinject the instruction processing damage's machine checks including Delayed Access Exception instead of damaging the host if it happens in the guest because it could be caused by improper update on TLB entry or other software case and impacts the guest only. Signed-off-by: QingFeng Hao <haoqf@linux.vnet.ibm.com> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2017-06-27 16:05:27 +02:00
Linus Torvalds	9d646c97e1	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 bugfix from Martin Schwidefsky: "One last s390 patch for 4.12 Revert the re-IPL semantics back to the v4.7 state. It turned out that the memory layout may change due to memory hotplug if load-normal is used" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/ipl: revert Load Normal semantics for LPAR CCW-type re-IPL	2017-06-26 11:58:21 -07:00
Jens Axboe	f95a0d6a95	Merge commit '8e8320c9315c' into for-4.13/block Pull in the fix for shared tags, as it conflicts with the pending changes in for-4.13/block. We already pulled in v4.12-rc5 to solve other conflicts or get fixes that went into 4.12, so not a lot of changes in this merge. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2017-06-22 21:55:24 -06:00
Radim Krčmář	d6aa07c169	KVM: s390: fix shadow table handling for nested guests Some odd-ball cases (real-space designation ASCEs) are handled wrong for the shadow page tables. Fix it. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.14 (GNU/Linux) iQIcBAABAgAGBQJZS6aoAAoJEBF7vIC1phx8Y9gP/0UO7OBobxB10k3SP3aQtisw oILlXRxvEskkv6RiTUJGvHwILHiigIVXtIWFIDx+tpX70ifx0/id7KtLiQoEnuqm bmt2lsU1VQnO7siJmGvXZvZ4Da6BonlqT6bJkGSHiP2oOGgZByQFlQ3E04ZtyTJ+ Uc8nSAAsZrZDMCT+2P9OXLT/t3/dGw5C1vI5fiBmyweR4qXjXxGvWw5VtvA0nT4/ m/vTuEevTymmQeV7LyV0x/Ru3RV9yU2QUVQrctcPrkWicvTdWO/Ml+z+/Q0OHh2A 5B3sjlS0Dq5qqF6dRlh0YHDV00uuyrOuBSSH2p5Dhdo3+55fy44U243zZayDzarK 1xrv13iOus0e2GbMxhhB3hWE8E8gw4t9XUyl4ipJdiTGH0IRCS3Wi/yz8aqCS0w/ 1bY858p3cvV+SqTfmeQdvN0ZhcYDGaIPwxheeClE6DKHKr2PBqW2NnSHDBy3tyhD 5Lz1Xkn0RFxybb8TJhNdY0i2MxprFQdHAvVmhBvLTfspintO0nYygZjPme2OtYhZ P7bS2p8F8aR32HyDsN1nUGwwlYpuBAGcwQ/yuGdz11uEfcOPnI40GrWyHakBMzx4 krrK9WnF7WT1bcqgmB46YvUc+hAuG5smqsUxa1XqLkxOKRvkncYKgYAPj9dg+o8E Y+i+/SKxAqhTJcf2loHP =SHME -----END PGP SIGNATURE----- Merge tag 'kvm-s390-master-4.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux KVM: s390: fix shadow table handling for nested guests Some odd-ball cases (real-space designation ASCEs) are handled wrong for the shadow page tables. Fix it.	2017-06-22 16:13:06 +02:00
Heiko Carstens	addb63c18a	KVM: s390: gaccess: fix real-space designation asce handling for gmap shadows For real-space designation asces the asce origin part is only a token. The asce token origin must not be used to generate an effective address for storage references. This however is erroneously done within kvm_s390_shadow_tables(). Furthermore within the same function the wrong parts of virtual addresses are used to generate a corresponding real address (e.g. the region second index is used as region first index). Both of the above can result in incorrect address translations. Only for real space designations with a token origin of zero and addresses below one megabyte the translation was correct. Furthermore replace a "!asce.r" statement with a "!*fake" statement to make it more obvious that a specific condition has nothing to do with the architecture, but with the fake handling of real space designations. Fixes: `3218f7094b` ("s390/mm: support real-space for gmap shadows") Cc: David Hildenbrand <david@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>	2017-06-22 12:53:34 +02:00
Hugh Dickins	1be7107fbe	mm: larger stack guard gap, between vmas Stack guard page is a useful feature to reduce a risk of stack smashing into a different mapping. We have been using a single page gap which is sufficient to prevent having stack adjacent to a different mapping. But this seems to be insufficient in the light of the stack usage in userspace. E.g. glibc uses as large as 64kB alloca() in many commonly used functions. Others use constructs like gid_t buffer[NGROUPS_MAX] which is 256kB or stack strings with MAX_ARG_STRLEN. This will become especially dangerous for suid binaries and the default no limit for the stack size limit because those applications can be tricked to consume a large portion of the stack and a single glibc call could jump over the guard page. These attacks are not theoretical, unfortunately. Make those attacks less probable by increasing the stack guard gap to 1MB (on systems with 4k pages; but make it depend on the page size because systems with larger base pages might cap stack allocations in the PAGE_SIZE units) which should cover larger alloca() and VLA stack allocations. It is obviously not a full fix because the problem is somehow inherent, but it should reduce attack space a lot. One could argue that the gap size should be configurable from userspace, but that can be done later when somebody finds that the new 1MB is wrong for some special case applications. For now, add a kernel command line option (stack_guard_gap) to specify the stack gap size (in page units). Implementation wise, first delete all the old code for stack guard page: because although we could get away with accounting one extra page in a stack vma, accounting a larger gap can break userspace - case in point, a program run with "ulimit -S -v 20000" failed when the 1MB gap was counted for RLIMIT_AS; similar problems could come with RLIMIT_MLOCK and strict non-overcommit mode. 
Instead of keeping gap inside the stack vma, maintain the stack guard gap as a gap between vmas: using vm_start_gap() in place of vm_start (or vm_end_gap() in place of vm_end if VM_GROWSUP) in just those few places which need to respect the gap - mainly arch_get_unmapped_area(), and the vma tree's subtree_gap support for that. Original-patch-by: Oleg Nesterov <oleg@redhat.com> Original-patch-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Tested-by: Helge Deller <deller@gmx.de> # parisc Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-06-19 21:50:20 +08:00
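The gap arithmetic described in the stack guard commit above can be modeled in user-space C. This is an illustrative sketch, not the kernel implementation: `struct vma_model`, `gap_bytes()`, `start_with_gap()` and `MODEL_PAGE_SHIFT` are hypothetical names, 4 KiB pages are assumed, and the 256-page figure used in the usage note is inferred from the message's "1MB on systems with 4k pages".

```c
#include <stdbool.h>

/* Assumption: 4 KiB pages, matching the commit message's 1MB example. */
#define MODEL_PAGE_SHIFT 12UL

/* Hypothetical, simplified stand-in for a VMA; not the kernel structure. */
struct vma_model {
	unsigned long vm_start;
	unsigned long vm_end;
	bool grows_down;	/* true for a downward-growing stack mapping */
};

/* Convert a gap given in pages (the unit the message names) to bytes. */
static unsigned long gap_bytes(unsigned long gap_pages)
{
	return gap_pages << MODEL_PAGE_SHIFT;
}

/*
 * Lowest address that neighbouring mappings must stay clear of.
 * For a downward-growing stack the guard gap lies below vm_start;
 * other mappings are returned unchanged. Clamp to 0 on underflow.
 */
static unsigned long start_with_gap(const struct vma_model *vma,
				    unsigned long gap_pages)
{
	unsigned long start = vma->vm_start;

	if (vma->grows_down) {
		start -= gap_bytes(gap_pages);
		if (start > vma->vm_start)	/* subtraction wrapped around */
			start = 0;
	}
	return start;
}
```

With a 256-page gap, `gap_bytes(256)` is 1 MiB, so a stack mapping starting at `0x40000000` would keep other mappings below `0x3ff00000` in this model. Keeping the gap outside the mapping, rather than inside it, is what the message credits with avoiding the RLIMIT_AS accounting problem.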
Heiko Carstens	4130b28f56	s390/ipl: revert Load Normal semantics for LPAR CCW-type re-IPL This reverts the two commits `7afbeb6df2` ("s390/ipl: always use load normal for CCW-type re-IPL") `0f7451ff3a` ("s390/ipl: use load normal for LPAR re-ipl") The two commits did not take into account that behavior of standby memory changes fundamentally if the re-IPL method is changed from Load Clear to Load Normal. In case of the old re-IPL clear method all memory that was initially in standby state will be put into standby state again within the re-IPL process. Or in other words: memory that was brought online before a re-IPL will be offline again after a reboot. Given that we use different re-IPL methods depending on the hypervisor and CCW-type vs SCSI re-IPL it is not easy to tell in advance when and why memory will stay online or will be offline after a re-IPL. This does also have other side effects, since memory that is online from the beginning will be in ZONE_NORMAL by default vs ZONE_MOVABLE for memory that is offline. Therefore, before the change, a user could online and offline memory easily since standby memory was always in ZONE_NORMAL. After the change, and a re-IPL, this depended on which memory parts were online before the re-IPL. From a usability point of view the current behavior is more than suboptimal. Therefore revert these changes until we have a better solution and get back to a consistent behavior. The bad thing about this is that the time required for a re-IPL will be significantly increased for configurations with several 100GB or 1TB of memory. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2017-06-14 15:35:31 +02:00
Martin Schwidefsky	f044f4c588	s390/fpu: export save_fpu_regs for all configs The save_fpu_regs function is a general API that is supposed to be usable for modules as well. Remove the #ifdef that hides the symbol for CONFIG_KVM=n. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2017-06-13 13:03:43 +02:00
Martin Schwidefsky	23fefe119c	s390/kvm: avoid global config of vm.alloc_pgste=1 The system control vm.alloc_pgste is used to control the size of the page tables, either 2K or 4K. The idea is that a KVM host sets the vm.alloc_pgste control to 1 which causes all new processes to run with 4K page tables. For a non-kvm system the control should stay off to save on memory used for page tables. Trouble is that distributions choose to set the control globally to be able to run KVM guests. This wastes memory on non-KVM systems. Introduce the PT_S390_PGSTE ELF segment type to "mark" the qemu executable with it. All executables with this (empty) segment in its ELF phdr array will be started with 4K page tables. Any executable without PT_S390_PGSTE will run with the default 2K page tables. This removes the need to set vm.alloc_pgste=1 for a KVM host and minimizes the waste of memory for page tables. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2017-06-13 13:03:41 +02:00
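The marker check described in the vm.alloc_pgste commit above amounts to a scan over the ELF program header array. The sketch below is a user-space model under stated assumptions: `struct phdr_model` is a hypothetical one-field stand-in for a real program header, `wants_4k_page_tables()` is an invented name, and the numeric value used for the marker type (the start of the processor-specific range) is an assumption that should be checked against the s390 headers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in holding only the segment type of a program header. */
struct phdr_model {
	uint32_t p_type;
};

#define MODEL_PT_LOAD		1u		/* ordinary loadable segment */
/* Assumption: marker type value; verify against the s390 ELF headers. */
#define MODEL_PT_S390_PGSTE	0x70000000u

/*
 * Return true if any program header carries the (empty) marker segment,
 * i.e. the executable asks to be started with 4K page tables.
 * Executables without the marker keep the default 2K page tables.
 */
static bool wants_4k_page_tables(const struct phdr_model *phdr, size_t count)
{
	for (size_t i = 0; i < count; i++) {
		if (phdr[i].p_type == MODEL_PT_S390_PGSTE)
			return true;
	}
	return false;
}
```

Because the decision is made per executable, only binaries that carry the marker (qemu, in the message's example) pay for the larger page tables, which is why the global sysctl setting is no longer needed.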
Christoph Hellwig	fdd050b5b3	Merge branch 'uuid-types' of bombadil.infradead.org:public_git/uuid into nvme-base	2017-06-13 11:45:14 +02:00
Linus Torvalds	2ab99b001d	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Martin Schwidefsky: - A fix for KVM to avoid kernel oopses in case of host protection faults due to runtime instrumentation - A fix for the AP bus to avoid dead devices after unbind / bind - A fix for a compile warning merged from the vfio_ccw tree - Updated default configurations * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390: update defconfig s390/zcrypt: Fix blocking queue device after unbind/bind. s390/vfio_ccw: make some symbols static s390/kvm: do not rely on the ILC on kvm host protection fauls	2017-06-13 15:07:11 +09:00
Jens Axboe	8f66439eec	Linux 4.12-rc5 -----BEGIN PGP SIGNATURE----- iQEcBAABAgAGBQJZPdbLAAoJEHm+PkMAQRiGx4wH/1nCjfnl6fE8oJ24/1gEAOUh biFdqJkYZmlLYHVtYfLm4Ueg4adJdg0wx6qM/4RaAzmQVvLfDV34bc1qBf1+P95G kVF+osWyXrZo5cTwkwapHW/KNu4VJwAx2D1wrlxKDVG5AOrULH1pYOYGOpApEkZU 4N+q5+M0ce0GJpqtUZX+UnI33ygjdDbBxXoFKsr24B7eA0ouGbAJ7dC88WcaETL+ 2/7tT01SvDMo0jBSV0WIqlgXwZ5gp3yPGnklC3F4159Yze6VFrzHMKS/UpPF8o8E W9EbuzwxsKyXUifX2GY348L1f+47glen/1sedbuKnFhP6E9aqUQQJXvEO7ueQl4= =m2Gx -----END PGP SIGNATURE----- Merge tag 'v4.12-rc5' into for-4.13/block We've already got a few conflicts and upcoming work depends on some of the changes that have gone into mainline as regression fixes for this series. Pull in 4.12-rc5 to resolve these conflicts and make it easier on down stream trees to continue working on 4.13 changes. Signed-off-by: Jens Axboe <axboe@kernel.dk>	2017-06-12 08:30:13 -06:00
Heiko Carstens	a752598254	s390: rename struct psw_bits members Rename a couple of the struct psw_bits members so it is more obvious for what they are good. Initially I thought using the single character names from the PoP would be sufficient and obvious, but admittedly that is not true. The current implementation is not easy to use, if one has to look into the source file to figure out which member represents the 'per' bit (which is the 'r' member). Therefore rename the members to sane names that are identical to the uapi psw mask defines: r -> per i -> io e -> ext t -> dat m -> mcheck w -> wait p -> pstate Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2017-06-12 16:26:02 +02:00
Heiko Carstens	8bb3fdd686	s390: rename psw_bits enums The address space enums that must be used when modifying the address space part of a psw with the psw_bits() macro can easily be confused with the psw defines that are used to mask and compare directly the mask part of a psw. We have e.g. PSW_AS_PRIMARY vs PSW_ASC_PRIMARY. To avoid confusion rename the PSW_AS_* enums to PSW_BITS_AS_. In addition also rename the PSW_AMODE_ enums, so they also follow the same naming scheme: PSW_BITS_AMODE_*. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2017-06-12 16:26:02 +02:00
Heiko Carstens	60c497014e	s390/mm: use correct address space when enabling DAT Right now the kernel uses the primary address space until finally the switch to the correct home address space will be done when the idle PSW will be loaded within psw_idle(). Correct this and simply use the home address space when DAT is enabled for the first time. This doesn't really fix a bug, but fixes odd behavior. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2017-06-12 16:26:02 +02:00
Heiko Carstens	ead1dec8ed	s390/ipl: revert Load Normal semantics for LPAR CCW-type re-IPL This reverts the two commits `7afbeb6df2` ("s390/ipl: always use load normal for CCW-type re-IPL") `0f7451ff3a` ("s390/ipl: use load normal for LPAR re-ipl") The two commits did not take into account that behavior of standby memory changes fundamentally if the re-IPL method is changed from Load Clear to Load Normal. In case of the old re-IPL clear method all memory that was initially in standby state will be put into standby state again within the re-IPL process. Or in other words: memory that was brought online before a re-IPL will be offline again after a reboot. Given that we use different re-IPL methods depending on the hypervisor and CCW-type vs SCSI re-IPL it is not easy to tell in advance when and why memory will stay online or will be offline after a re-IPL. This does also have other side effects, since memory that is online from the beginning will be in ZONE_NORMAL by default vs ZONE_MOVABLE for memory that is offline. Therefore, before the change, a user could online and offline memory easily since standby memory was always in ZONE_NORMAL. After the change, and a re-IPL, this depended on which memory parts were online before the re-IPL. From a usability point of view the current behavior is more than suboptimal. Therefore revert these changes until we have a better solution and get back to a consistent behavior. The bad thing about this is that the time required for a re-IPL will be significantly increased for configurations with several 100GB or 1TB of memory. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2017-06-12 16:26:01 +02:00
Heiko Carstens	2b7b9817c2	s390/dumpstack: remove raw stack dump Remove raw stack dumps that are printed before call traces in case of a warning, or the 'l' sysrq trigger (show a stack backtrace for all active CPUs). Besides that a raw stack dump should not be shown for the 'l' sysrq trigger the value of the dump is close to zero. That's also why we don't print it in case of a panic since ages anymore. That this is still printed on warnings is just a leftover. So get rid of this completely. The following won't be printed anymore with this change: Stack: 00000000bbc4fbc8 00000000bbc4fc58 0000000000000003 0000000000000000 00000000bbc4fcf8 00000000bbc4fc70 00000000bbc4fc70 0000000000000020 000000007fe00098 00000000bfe8be00 00000000bbc4fe94 000000000000000a 000000000000000c 00000000bbc4fcc0 0000000000000000 0000000000000000 000000000095b930 0000000000113366 00000000bbc4fc58 00000000bbc4fca0 Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2017-06-12 16:26:01 +02:00
Logan Gunthorpe	7e9710af23	s390: provide default ioremap and iounmap declaration Move the CONFIG_PCI device so that ioremap and iounmap are always available. This looks safe as there's nothing PCI specific in the implementation of these functions. I have designs to use these functions in scatterlist.c where they'd likely never be called without CONFIG_PCI set, but this is needed to compile such changes. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Sebastian Ott <sebott@linux.vnet.ibm.com> Cc: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2017-06-12 16:26:00 +02:00
Thomas Richter	c39457ff1f	s390/perf: fix null string in perf list pmu command Command 'perf list pmu' displays events which contain an invalid string "(null)=xxx", where xxx is the pmu event name, for example: cpum_cf/AES_BLOCKED_CYCLES,(null)=AES_BLOCKED_CYCLES/ This is not correct, the invalid string should not be displayed at all. It is caused by an obsolete term in the sysfs attribute file for each s390 CPUMF counter event. Reading from the sysfs file also displays the event name. Fix this by omitting the event name. This patch makes s390 CPUMF sysfs files consistent with other platforms. This is an interface change between user and kernel but does not break anything. Reading from a counter event sysfs file should only list terms mentioned in the /sys/bus/event_source/devices/<cpumf>/format directory. Name is not listed. Reported-by: Zvonko Kosic <zvonko.kosic@de.ibm.com> Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2017-06-12 16:26:00 +02:00
Heiko Carstens	cc18b460dc	s390/mm: add p?d_folded() helper functions Introduce and use p?d_folded() functions to clarify the page table code a bit more. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2017-06-12 16:26:00 +02:00
Heiko Carstens	f96c6f72bc	s390/mm: remove incorrect _REGION3_ENTRY_ORIGIN define _REGION3_ENTRY_ORIGIN defines a wrong mask which can be used to extract a segment table origin from a region 3 table entry. It removes only the lower 11 instead of 12 bits from a region 3 table entry. Luckily this bit is currently always zero, so nothing bad happened yet. In order to avoid future bugs just remove the region 3 specific mask and use the correct generic _REGION_ENTRY_ORIGIN mask. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2017-06-12 16:26:00 +02:00
Martin Schwidefsky	f5bbd72198	s390/ptrace: guarded storage regset for the current task The regset functions for guarded storage are supposed to work on the current task as well. For task == current add the required load and store instructions for the guarded storage control block. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2017-06-12 16:25:59 +02:00
Heiko Carstens	53d7f25f09	s390/facilities: remove stfle requirement All call sites of "stfle" check if the instruction is available before executing it. Therefore there is no reason to have the corresponding facility bit set within the architecture level set. This removes the last more or less odd bit from the list. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2017-06-12 16:25:59 +02:00
Thomas Huth	8aa8680aa3	s390: Remove 'message security assist' from the list of vital facilities The code in arch/s390/crypto checks for the availability of the 'message security assist' facility on its own, either by using module_cpu_feature_match(MSA, ...) or by checking the facility bit during cpacf_query(). Thus setting the MSA facility bit in gen_facilities.c as hard requirement is not necessary. We can remove it here, so that the kernel can also run on systems that do not provide the MSA facility yet (like the emulated environment of QEMU, for example). Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2017-06-12 16:25:59 +02:00
Heiko Carstens	fe7b274729	s390/fault: use _ASCE_ORIGIN instead of PAGE_MASK When masking an ASCE to get its origin use the corresponding define instead of the unrelated PAGE_MASK. This doesn't fix a bug since both masks are identical. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2017-06-12 16:25:59 +02:00
Heiko Carstens	bf10b6687c	s390/smp: use sigp condition code define Use proper define instead of open-coding the condition code value. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2017-06-12 16:25:58 +02:00
Christian Borntraeger	9cf8edb7a3	s390/smp: fix false positive kmemleak of mcesa data structure I get number of CPUs - 1 kmemleak hits like unreferenced object 0x37ec6f000 (size 1024): comm "swapper/0", pid 1, jiffies 4294937330 (age 889.690s) hex dump (first 32 bytes): 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk backtrace: [<000000000034a848>] kmem_cache_alloc+0x2b8/0x3d0 [<00000000001164de>] __cpu_up+0x456/0x488 [<000000000016f60c>] bringup_cpu+0x4c/0xd0 [<000000000016d5d2>] cpuhp_invoke_callback+0xe2/0x9e8 [<000000000016f3c6>] cpuhp_up_callbacks+0x5e/0x110 [<000000000016f988>] _cpu_up+0xe0/0x158 [<000000000016faf0>] do_cpu_up+0xf0/0x110 [<0000000000dae1ee>] smp_init+0x126/0x130 [<0000000000d9bd04>] kernel_init_freeable+0x174/0x2e0 [<000000000089fc62>] kernel_init+0x2a/0x148 [<00000000008adce2>] kernel_thread_starter+0x6/0xc [<00000000008adcdc>] kernel_thread_starter+0x0/0xc [<ffffffffffffffff>] 0xffffffffffffffff The pointer of this data structure is stored in the prefix page of that CPU together with some extra bits ORed into the low bits. Mark the data structure as non-leak. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2017-06-12 16:25:58 +02:00
Harald Freudenberger	c4684f98d3	s390/crypto: fix aes/paes Kconfig dependency The s390_paes and the s390_aes kernel module used just one config symbol CONFIG_CRYPTO_AES. As paes has a dependency to PKEY and this requires ZCRYPT the aes module also had a dependency to the zcrypt device driver which is not true. Fixed by introducing a new config symbol CONFIG_CRYPTO_PAES which has dependencies to PKEY and ZCRYPT. Removed the dependency for the aes module to ZCRYPT. Signed-off-by: Harald Freudenberger <freude@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2017-06-12 16:25:58 +02:00
Martin Schwidefsky	35bb092a91	s390/vdso: use _install_special_mapping to establish vdso Switch to the improved _install_special_mapping function to install the vdso mapping. This has two advantages, the arch_vma_name function is not needed anymore and the vdso vma still has its name after its memory location has been changed with mremap. Tested-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2017-06-12 16:25:57 +02:00
Martin Schwidefsky	b29e061bb7	s390/cputime: simplify account_system_index_scaled The account_system_index_scaled gets two cputime values, a raw value derived from CPU timer deltas and a scaled value. The scaled value is always calculated from the raw value, the code can be simplified by moving the scale_vtime call into account_system_index_scaled. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2017-06-12 16:25:57 +02:00
Heiko Carstens	6c386da799	s390: use two more generic header files I missed at least these two header files where we can make use of the generic ones. vga.h is another one, however that is already addressed by a patch from Jiri Slaby. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2017-06-12 16:25:57 +02:00
Heiko Carstens	d12a3d6036	s390/mm: add __rcu annotations Add __rcu annotations so sparse correctly warns only if "slot" gets dereferenced without using rcu_dereference(). Right now we get warnings because of the missing annotation: arch/s390/mm/gmap.c:135:17: warning: incorrect type in assignment (different address spaces) arch/s390/mm/gmap.c:135:17: expected void slot arch/s390/mm/gmap.c:135:17: got void [noderef] <asn:4> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2017-06-12 16:25:55 +02:00
Heiko Carstens	92acfb7406	s390: add missing header includes for type checking Add missing include statements to make sure that prototypes match implementation. As reported by sparse: arch/s390/crypto/arch_random.c:18:1: warning: symbol 's390_arch_random_available' was not declared. Should it be static? arch/s390/kernel/traps.c:279:13: warning: symbol 'trap_init' was not declared. Should it be static? Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2017-06-12 16:25:55 +02:00
Martin Schwidefsky	1aea9b3f92	s390/mm: implement 5 level page tables Add the logic to upgrade the page table for a 64-bit process to five levels. This increases the TASK_SIZE from 8PB to 16EB-4K. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2017-06-12 16:25:54 +02:00
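
The TASK_SIZE figures quoted in commit 1aea9b3f92 can be checked with a little arithmetic. The sketch below is not taken from the kernel source. It assumes the usual s390 translation layout (4K pages, an 8-bit page-table index, and 11-bit indexes for the segment table and each region table), and all names in it are illustrative only.

```python
# Assumed s390 layout: 4K pages, 256-entry page tables, and 2048-entry
# segment/region tables. These constants are an assumption for this sketch,
# not values copied from the kernel headers.
PAGE_SHIFT = 12
LEVEL_BITS = [
    ("page table", 8),
    ("segment table", 11),
    ("region-third table", 11),
    ("region-second table", 11),
    ("region-first table", 11),
]

PAGE_SIZE = 1 << PAGE_SHIFT
PB = 1 << 50
EB = 1 << 60


def address_bits(levels):
    """Number of virtual address bits covered by the lowest `levels` tables."""
    return PAGE_SHIFT + sum(bits for _, bits in LEVEL_BITS[:levels])


def span_bytes(levels):
    """Size in bytes of the address range reachable with `levels` tables."""
    return 1 << address_bits(levels)


# Four levels cover 53 bits; five levels cover the full 64 bits.
four_level_span = span_bytes(4)
five_level_span = span_bytes(5)

# The "16EB-4K" figure is the full 64-bit span minus one 4K page.
task_size_5 = five_level_span - PAGE_SIZE

if __name__ == "__main__":
    print(address_bits(4), four_level_span // PB)  # bits, size in PB
    print(address_bits(5), five_level_span // EB)  # bits, size in EB
    print(hex(task_size_5))
```

Under these assumptions, four levels give 2^53 bytes (8PB) and five levels give 2^64 bytes (16EB), which matches the commit message once one page is subtracted from the top of the range.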

5265 Commits