Commit Graph

721807 Commits

Author SHA1 Message Date
Cyril Bur
86cd6d9802 powerpc/opal: Rework the opal-async interface
Future work will add an opal_async_wait_response_interruptible()
which will call wait_event_interruptible(). This work requires extra
token state to be tracked as wait_event_interruptible() can return and
the caller could release the token before OPAL responds.

Currently token state is tracked with two bitfields which are 64 bits
big but may not need to be as OPAL informs Linux how many async tokens
there are. It also uses an array indexed by token to store response
messages for each token.

The bitfields make it difficult to add more state and also provide a
hard maximum as to how many tokens there can be - it is possible that
OPAL will inform Linux that there are more than 64 tokens.

Rather than add a bitfield to track the extra state, rework the
internals slightly.

Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
[mpe: Fix __opal_async_get_token() when no tokens are free]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-06 20:33:54 +11:00
Cyril Bur
59cf9a1cfc powerpc/opal: Make __opal_async_{get, release}_token() static
There are no callers of both __opal_async_get_token() and
__opal_async_release_token().

This patch also removes the possibility of "emergency through
synchronous call to __opal_async_get_token()" as such it makes more
sense to initialise opal_sync_sem for the maximum number of async
tokens.

Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-06 20:20:26 +11:00
Cyril Bur
efe6941450 mtd: powernv_flash: Don't return -ERESTARTSYS on interrupted token acquisition
Because the MTD core might split up a read() or write() from userspace
into several calls to the driver, we may fail to get a token but already
have done some work, best to return -EINTR back to userspace and have
them decide what to do.

Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-06 20:20:26 +11:00
Cyril Bur
e32ec15a2d mtd: powernv_flash: Remove pointless goto in driver init
powernv_flash_probe() has pointless goto statements which jump to the
end of the function to simply return a variable. Rather than checking
for error and going to the label, just return the error as soon as it is
detected.

Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-06 20:20:26 +11:00
Cyril Bur
25ee52e669 mtd: powernv_flash: Don't treat OPAL_SUCCESS as an error
While this driver expects to interact asynchronously, OPAL is well
within its rights to return OPAL_SUCCESS to indicate that the operation
completed without the need for a callback. We shouldn't treat
OPAL_SUCCESS as an error rather we should wrap up and return promptly to
the caller.

Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-06 20:20:26 +11:00
Cyril Bur
44e2aa2b16 mtd: powernv_flash: Use WARN_ON_ONCE() rather than BUG_ON()
BUG_ON() should be reserved in situations where we can not longer
guarantee the integrity of the system. In the case where
powernv_flash_async_op() receives an impossible op, we can still
guarantee the integrity of the system.

Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-06 20:20:26 +11:00
William A. Kennington III
71e24d7731 powerpc/opal: Fix EBUSY bug in acquiring tokens
The current code checks the completion map to look for the first token
that is complete. In some cases, a completion can come in but the
token can still be on lease to the caller processing the completion.
If this completed but unreleased token is the first token found in the
bitmap by another tasks trying to acquire a token, then the
__test_and_set_bit call will fail since the token will still be on
lease. The acquisition will then fail with an EBUSY.

This patch reorganizes the acquisition code to look at the
opal_async_token_map for an unleased token. If the token has no lease
it must have no outstanding completions so we should never see an
EBUSY, unless we have leased out too many tokens. Since
opal_async_get_token_inrerruptible is protected by a semaphore, we
will practically never see EBUSY anymore.

Fixes: 8d72482322 ("powerpc/powernv: Infrastructure to support OPAL async completion")
Signed-off-by: William A. Kennington III <wak@google.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-06 20:17:41 +11:00
Borislav Petkov
c7da092a1f x86/mm: Define _PAGE_TABLE using _KERNPG_TABLE
... so that the difference is obvious.

No functionality change.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20171103102028.20284-1-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-06 09:50:14 +01:00
Ingo Molnar
75ec4eb3dc Merge branch 'x86/mm' into x86/asm, to pick up pending changes
Concentrate x86 MM and asm related changes into a single super-topic,
in preparation for larger changes.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-06 09:49:28 +01:00
Gustavo A. R. Silva
abfa2b377f crypto: chcr - Replace _manual_ swap with swap macro
Make use of the swap macro and remove unnecessary variable temp.
This makes the code easier to read and maintain.

This code was detected with the help of Coccinelle.

Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-11-06 14:45:35 +08:00
Boris BREZILLON
9c90344648 crypto: marvell - Add a NULL entry at the end of mv_cesa_plat_id_table[]
struct platform_device_id should be NULL terminated to let the core detect
where the last entry is.

Fixes: 07c50a8be41a ("crypto: marvell - Add a platform_device_id table")
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-11-06 14:45:07 +08:00
Jim Quigley
e5cc6e79bb hwrng: virtio - Virtio RNG devices need to be re-registered after suspend/resume
The patch for

commit: 5c06273401
Author: Amit Shah <amit.shah@redhat.com>
Date:   Sun Jul 27 07:34:01 2014 +0930

    virtio: rng: delay hwrng_register() till driver is ready

moved the call to hwrng_register() out of the probe routine into the scan
routine. We need to call hwrng_register() after a suspend/restore cycle
to re-register the device, but the scan function is not invoked for the
restore. Add the call to hwrng_register() to virtio_restore().

Reviewed-by: Liam Merwick <Liam.Merwick@oracle.com>
Signed-off-by: Jim Quigley <Jim.Quigley@oracle.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-11-06 14:45:06 +08:00
Tudor-Dan Ambarus
747f6ec6e8 crypto: atmel - remove empty functions
Pointer members of an object with static storage duration, if not
explicitly initialized, will be initialized to a NULL pointer.
The crypto API checks if these pointers are not NULL before using them,
therefore we can safely remove these empty functions.

Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-11-06 14:45:05 +08:00
Tudor-Dan Ambarus
f560acc3bb crypto: ecdh - remove empty exit()
Pointer members of an object with static storage duration, if not
explicitly initialized, will be initialized to a NULL pointer. The crypto
API checks if this pointer is not NULL before using it, we are safe to
remove the function.

Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-11-06 14:45:04 +08:00
Salvatore Benedetto
3c24f992a4 MAINTAINERS: update maintainer for qat
Removing myself as I'm not longer following QAT development.

Signed-off-by: Salvatore Benedetto <salvatore.benedetto@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-11-06 14:45:01 +08:00
Horia Geantă
dfcd8393ef crypto: caam - remove unused param of ctx_map_to_sec4_sg()
ctx_map_to_sec4_sg() function, added in
commit 045e36780f ("crypto: caam - ahash hmac support")
has never used the "desc" parameter, so let's drop it.

Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-11-06 14:45:01 +08:00
Horia Geantă
f2ac677465 crypto: caam - remove unneeded edesc zeroization
Extended descriptor allocation has been changed by
commit dde20ae9d6 ("crypto: caam - Change kmalloc to kzalloc to avoid residual data")
to provide zeroized memory, meaning we no longer have to sanitize
its members - edesc->src_nents and edesc->dst_dma.

Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-11-06 14:45:00 +08:00
Arnd Bergmann
edfd17ff39 powerpc/eeh: Stop using do_gettimeofday()
This interface is inefficient and deprecated because of the y2038
overflow.

ktime_get_seconds() is an appropriate replacement here, since it
has sufficient granularity but is more efficient and uses monotonic
time.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-06 17:40:00 +11:00
Dave Airlie
8a6fb5b582 Merge branch 'drm-next-4.15' of git://people.freedesktop.org/~agd5f/linux into drm-next
some more amd/ttm fixes.

* 'drm-next-4.15' of git://people.freedesktop.org/~agd5f/linux:
  drm/ttm: Downgrade pr_err to pr_debug for memory allocation failures
  drm/ttm: Always and only destroy bo->ttm_resv in ttm_bo_release_list
  drm/amd/amdgpu: Enabling ACP clock in hw_init (v2)
  drm/amdgpu/virt: don't dereference undefined 'module' struct
2017-11-06 16:18:59 +10:00
Vaibhav Jain
cbb55eeb49 cxl: Rework the implementation of cxl_stop_trace_psl9()
Presently the PSL9 specific cxl_stop_trace_psl9() only stops the RX0
traces on the CXL adapter when a PSL error irq is triggered. The patch
updates the function to stop all the traces arrays and move them to
the FIN state. The implementation issues the mmio to TRACECFG register
to stop the trace array iff it already not in FIN state. This prevents
the issue of trace data being reset in case of multiple stop mmio
issued for a single trace array.

Also the patch does some refactoring of existing cxl_stop_trace_psl9()
and cxl_stop_trace_psl8() functions by moving them to 'pci.c' from
'debugfs.c' file and marking them as static.

Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-06 16:48:17 +11:00
Sandipan Das
ac0761ebcb bpf: take advantage of stack_depth tracking in powerpc JIT
Take advantage of stack_depth tracking, originally introduced for
x64, in powerpc JIT as well. Round up allocated stack by 16 bytes
to make sure it stays aligned for functions called from JITed bpf
program.

Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-06 16:48:16 +11:00
Michael Ellerman
632f057416 powerpc/tm: Don't check for WARN in TM Bad Thing handling
Currently when we take a TM Bad Thing program check exception, we
search the bug table to see if the program check was generated by a
WARN/WARN_ON etc.

That makes no sense, the WARN macros use trap instructions, which
should never generate a TM Bad Thing exception. If they ever did that
would be a bug and we should oops.

We do have some hand-coded bugs in tm.S, using EMIT_BUG_ENTRY, but
those are all BUGs not WARNs, and they all use trap instructions
anyway. Almost certainly this check was incorrectly copied from the
REASON_TRAP handling in the same function.

Remove it.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-By: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-06 16:48:16 +11:00
Michael Ellerman
1fd6c02207 powerpc/mm: Add a CONFIG option to choose if radix is used by default
Currently if the hardware supports the radix MMU we will use
it, *unless* "disable_radix" is passed on the kernel command line.

However some users would like the reverse semantics. ie. The kernel
uses the hash MMU by default, unless radix is explicitly requested on
the command line.

So add a CONFIG option to choose whether we use radix by default or
not, and expand the disable_radix command line option to allow
"disable_radix=no" which *enables* radix.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-06 16:48:15 +11:00
Michael Ellerman
4e00374704 powerpc/64s: Replace CONFIG_PPC_STD_MMU_64 with CONFIG_PPC_BOOK3S_64
CONFIG_PPC_STD_MMU_64 indicates support for the "standard" powerpc MMU
on 64-bit CPUs. The "standard" MMU refers to the hash page table MMU
found in "server" processors, from IBM mainly.

Currently CONFIG_PPC_STD_MMU_64 is == CONFIG_PPC_BOOK3S_64. While it's
annoying to have two symbols that always have the same value, it's not
quite annoying enough to bother removing one.

However with the arrival of Power9, we now have the situation where
CONFIG_PPC_STD_MMU_64 is enabled, but the kernel is running using the
Radix MMU - *not* the "standard" MMU. So it is now actively confusing
to use it, because it implies that code is disabled or inactive when
the Radix MMU is in use, however that is not necessarily true.

So s/CONFIG_PPC_STD_MMU_64/CONFIG_PPC_BOOK3S_64/, and do some minor
formatting updates of some of the affected lines.

This will be a pain for backports, but c'est la vie.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-06 16:48:14 +11:00
Michael Ellerman
c1807e3f84 powerpc/64: Free up CPU_FTR_ICSWX
The last user of CPU_FTR_ICSWX was removed in commit
6ff4d3e966 ("powerpc: Remove old unused icswx based coprocessor
support"), so free the bit up for future use.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-06 16:48:14 +11:00
Aneesh Kumar K.V
7f142661d4 powerpc/mm/hash: Add pr_fmt() to hash_utils64.c
Make the printks look a bit nicer by adding a prefix.

Radix config now do
 radix-mmu: Page sizes from device-tree:
 radix-mmu: Page size shift = 12 AP=0x0
 radix-mmu: Page size shift = 16 AP=0x5
 radix-mmu: Page size shift = 21 AP=0x1
 radix-mmu: Page size shift = 30 AP=0x2

This patch update hash config to do similar dmesg output. With the patch we have

 hash-mmu: Page sizes from device-tree:
 hash-mmu: base_shift=12: shift=12, sllp=0x0000, avpnm=0x00000000, tlbiel=1, penc=0
 hash-mmu: base_shift=12: shift=16, sllp=0x0000, avpnm=0x00000000, tlbiel=1, penc=7
 hash-mmu: base_shift=12: shift=24, sllp=0x0000, avpnm=0x00000000, tlbiel=1, penc=56
 hash-mmu: base_shift=16: shift=16, sllp=0x0110, avpnm=0x00000000, tlbiel=1, penc=1
 hash-mmu: base_shift=16: shift=24, sllp=0x0110, avpnm=0x00000000, tlbiel=1, penc=8
 hash-mmu: base_shift=20: shift=20, sllp=0x0111, avpnm=0x00000000, tlbiel=0, penc=2
 hash-mmu: base_shift=24: shift=24, sllp=0x0100, avpnm=0x00000001, tlbiel=0, penc=0
 hash-mmu: base_shift=34: shift=34, sllp=0x0120, avpnm=0x000007ff, tlbiel=0, penc=3

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-06 16:48:13 +11:00
Christophe Leroy
6b148a7ce7 powerpc/ipic: Fix status get and status clear
IPIC Status is provided by register IPIC_SERSR and not by IPIC_SERMR
which is the mask register.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-06 16:48:13 +11:00
Alexey Kardashevskiy
d6f934fd48 powerpc/powernv: Reserve a hole which appears after enabling IOV
In order to make generic IOV code work, the physical function IOV BAR
should start from offset of the first VF. Since M64 segments share
PE number space across PHB, and some PEs may be in use at the time
when IOV is enabled, the existing code shifts the IOV BAR to the index
of the first PE/VF. This creates a hole in IOMEM space which can be
potentially taken by some other device.

This reserves a temporary hole on a parent and releases it when IOV is
disabled; the temporary resources are stored in pci_dn to avoid
kmalloc/free.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-06 16:48:12 +11:00
Tyrel Datwyler
b8f89fea59 powerpc/pseries/vio: Dispose of virq mapping on vdevice unregister
When a vdevice is DLPAR removed from the system the vio subsystem
doesn't bother unmapping the virq from the irq_domain. As a result we
have a virq mapped to a hardware irq that is no longer valid for the
irq_domain. A side effect is that we are left with /proc/irq/<irq#>
affinity entries, and attempts to modify the smp_affinity of the irq
will fail.

In the following observed example the kernel log is spammed by
ics_rtas_set_affinity errors after the removal of a VSCSI adapter.
This is a result of irqbalance trying to adjust the affinity every 10
seconds.

  rpadlpar_io: slot U8408.E8E.10A7ACV-V5-C25 removed
  ics_rtas_set_affinity: ibm,set-xive irq=655385 returns -3
  ics_rtas_set_affinity: ibm,set-xive irq=655385 returns -3

This patch fixes the issue by calling irq_dispose_mapping() on the
virq of the viodev on unregister.

Fixes: f2ab621996 ("powerpc/pseries: Add PFO support to the VIO bus")
Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-06 16:48:11 +11:00
Nicholas Piggin
30b49ec798 powerpc/64s/radix: Fix process table entry cache invalidation
According to the architecture, the process table entry cache must be
flushed with tlbie RIC=2.

Currently the process table entry is set to invalid right before the
PID is returned to the allocator, with no invalidation. This works on
existing implementations that are known to not cache the process table
entry for any except the current PIDR.

It is architecturally correct and cleaner to invalidate with RIC=2
after clearing the process table entry and before the PID is returned
to the allocator. This can be done in arch_exit_mmap that runs before
the final flush, and to ensure the final flush (fullmm) is always a
RIC=2 variant.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-06 16:48:10 +11:00
Nicholas Piggin
dffe8449c5 powerpc/64s/radix: Improve preempt handling in TLB code
Preempt should be consistently disabled for mm_is_thread_local tests,
so bring the rest of these under preempt_disable().

Preempt does not need to be disabled for the mm->context.id tests,
which allows simplification and removal of gotos.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-06 16:48:10 +11:00
Nicholas Piggin
63c9d8a4b3 powerpc/powernv: Use FIXUP_ENDIAN_HV in OPAL return
Close the recoverability gap for OPAL calls by using FIXUP_ENDIAN_HV
in the return path.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-06 16:48:09 +11:00
Nicholas Piggin
8ca9c08d0c powerpc/book3s: Add an HV variant of FIXUP_ENDIAN that is recoverable
Add an HV variant of FIXUP_ENDIAN which uses HSRR[01] and does not
clear MSR[RI], which improves recoverability.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-06 16:48:08 +11:00
Nicholas Piggin
f848ea7f59 powerpc/book3s: Use label for FIXUP_ENDIAN macro branch
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-06 16:48:07 +11:00
Nicholas Piggin
ff967900c9 powerpc/64: Fix latency tracing for lazy irq replay
When returning from an exception to a soft-enabled context, pending
IRQs are replayed but IRQ tracing is not reset, so a number of them
can get chained together into the same IRQ-disabled trace.

Fix this by having __check_irq_replay re-set IRQ trace. This is
conceptually where we respond to the next interrupt, so it fits the
semantics of the IRQ tracer.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-06 16:48:07 +11:00
Nicholas Piggin
6de6638b35 KVM: PPC: Book3S HV: Handle host system reset in guest mode
If the host takes a system reset interrupt while a guest is running,
the CPU must exit the guest before processing the host exception
handler.

After this patch, taking a sysrq+x with a CPU running in a guest
gives a trace like this:

   cpu 0x27: Vector: 100 (System Reset) at [c000000fdf5776f0]
       pc: c008000010158b80: kvmppc_run_core+0x16b8/0x1ad0 [kvm_hv]
       lr: c008000010158b80: kvmppc_run_core+0x16b8/0x1ad0 [kvm_hv]
       sp: c000000fdf577850
      msr: 9000000002803033
     current = 0xc000000fdf4b1e00
     paca    = 0xc00000000fd4d680	 softe: 3	 irq_happened: 0x01
       pid   = 6608, comm = qemu-system-ppc
   Linux version 4.14.0-rc7-01489-g47e1893a404a-dirty #26 SMP
   [c000000fdf577a00] c008000010159dd4 kvmppc_vcpu_run_hv+0x3dc/0x12d0 [kvm_hv]
   [c000000fdf577b30] c0080000100a537c kvmppc_vcpu_run+0x44/0x60 [kvm]
   [c000000fdf577b60] c0080000100a1ae0 kvm_arch_vcpu_ioctl_run+0x118/0x310 [kvm]
   [c000000fdf577c00] c008000010093e98 kvm_vcpu_ioctl+0x530/0x7c0 [kvm]
   [c000000fdf577d50] c000000000357bf8 do_vfs_ioctl+0xd8/0x8c0
   [c000000fdf577df0] c000000000358448 SyS_ioctl+0x68/0x100
   [c000000fdf577e30] c00000000000b220 system_call+0x58/0x6c
   --- Exception: c01 (System Call) at 00007fff76868df0
   SP (7fff7069baf0) is in userspace

Fixes: e36d0a2ed5 ("powerpc/powernv: Implement NMI IPI with OPAL_SIGNAL_SYSTEM_RESET")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-11-06 16:48:06 +11:00
Ingo Molnar
19c5787a5f Merge branch 'x86/fpu' into x86/asm, to pick up fix
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-06 01:43:13 +01:00
Chao Yu
bb06664a53 f2fs: avoid race in between GC and block exchange
During block exchange in {insert,collapse,move}_range, page-block mapping
is unstable due to mapping moving or recovery, so there should be no
concurrent cache read operation rely on such mapping, nor cache write
operation to mess up block exchange.

So this patch let background GC be aware of that.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-11-05 16:42:10 -08:00
Fan Li
f6986ede80 f2fs: save a multiplication for last_nid calculation
Use a slightly easier way to calculate last_nid.

Signed-off-by: Fan li <fanofcode.li@samsung.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-11-05 16:42:09 -08:00
Chao Yu
2b60311dd1 f2fs: fix summary info corruption
Sometimes, after running generic/270 of fstest, fsck reports summary
info and actual position of block address in direct node becoming
inconsistent.

The root cause is race in between __f2fs_replace_block and change_curseg
as below:

Thread A				Thread B
- __clone_blkaddrs
 - f2fs_replace_block
  - __f2fs_replace_block
   - segnoA = GET_SEGNO(sbi, blkaddrA);
   - type = se->type:=CURSEG_HOT_DATA
   - if (!IS_CURSEG(sbi, segnoA))
         type = CURSEG_WARM_DATA
					- allocate_data_block
					 - allocate_segment
					  - get_ssr_segment
					  - change_curseg(segnoA, CURSEG_HOT_DATA)
   - change_curseg(segnoA, CURSEG_WARM_DATA)
    - reset_curseg
     - __set_sit_entry_type
      - change se->type from CURSEG_HOT_DATA to CURSEG_WARM_DATA

So finally, hot curseg locates in segnoA, but type of segnoA becomes
CURSEG_WARM_DATA.

Then if we invoke __f2fs_replace_block(blkaddrB, blkaddrA, true, false),
as blkaddrA locates in segnoA, so we will move warm type curseg to segnoA,
then change its summary cache and writeback it to summary block.

But segnoA is used by hot type curseg too, once it moves or persist, it
will cover summary block content with inner old summary cache, result in
inconsistent status.

This patch tries to fix this issue by introduce global curseg lock to avoid
race in between __f2fs_replace_block and change_curseg.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-11-05 16:42:08 -08:00
Chao Yu
0537b81153 f2fs: remove dead code in update_meta_page
After commit a468f0ef51 ("f2fs: use crc and cp version to determine
roll-forward recovery"), last caller of update_meta_page passing @src
with NULL is gone, so remove related dead code there.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-11-05 16:42:07 -08:00
Chao Yu
dee668c143 f2fs: remove unneeded semicolon
Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-11-05 16:42:06 -08:00
Jeff Layton
d1954ab4c9 f2fs: don't bother with inode->i_version
f2fs does not set the SB_I_VERSION flag, so the i_version will never
be incremented on write. It was recently changed to increment the
i_version on a quota write, which isn't necessary here.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-11-05 16:42:05 -08:00
Chao Yu
bf34c93d26 f2fs: check curseg space before foreground GC
When we are closing to trigger foreground GC, if there are only a few
of dirty metas, we can log these dirty metas in left space of opened
segments instead of triggering foreground GC.

With this patch, total count of foreground GC triggered by
test/generic/* of fstest suit reduce from 254 to 184.

So let's do the check before foreground GC anyway.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-11-05 16:42:04 -08:00
Chao Yu
3d26fa6be3 f2fs: use rw_semaphore to protect SIT cache
There are some cases user didn't update SIT cache under this lock,
so let's use rw_semaphore instead of mutex to enhance concurrently
accessing.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-11-05 16:42:03 -08:00
Jaegeuk Kim
ea6767337f f2fs: support quota sys files
This patch supports hidden quota files in the system, which will be used for
Android. It requires up-to-date f2fs-tools later than v1.9.0.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-11-05 16:42:02 -08:00
Jaegeuk Kim
234a968961 f2fs: add quota_ino feature infra
This patch adds quota_ino feature infra to be used for quota files.

Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-11-05 16:42:01 -08:00
Fan Li
37a0ab2a3b f2fs: optimize __update_nat_bits
Make three modification for __update_nat_bits:
1. Take the codes of dealing the nat with nid 0 out of the loop
    Such nat only needs to be dealt with once at beginning.
2. Use " nat_index == 0" instead of " start_nid == 0" to decide if it's the first nat block
    It's better that we don't assume @start_nid is the first nid of the nat block it's in.
3. Use " if (nat_blk->entries[i].block_addr != NULL_ADDR)" to explicitly comfirm the value of block_addr
    use constant to make sure the codes is right, even if the value of NULL_ADDR changes.

Signed-off-by: Fan li <fanofcode.li@samsung.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-11-05 16:42:00 -08:00
Yunlei He
f15194fcfa f2fs: modify for accurate fggc node io stat
modify for accurate fggc node io stat

Signed-off-by: Yunlei He <heyunlei@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-11-05 16:41:59 -08:00
Yunlong Song
65f1b80b33 Revert "f2fs: handle dirty segments inside refresh_sit_entry"
This reverts commit 5e443818fa

The commit should be reverted because call sequence of below two parts
of code must be kept:
a. update sit information, it needs to be updated before segment
allocation since latter allocation may trigger SSR, and SSR allocation
needs latest valid block information of all segments.
b. update segment status, it needs to be updated after segment allocation
since we can skip updating current opened segment status.

Fixes: 5e443818fa ("f2fs: handle dirty segments inside refresh_sit_entry")
Suggested-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Yunlong Song <yunlong.song@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
[Jaegeuk Kim: remove refresh_sit_entry function]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-11-05 16:41:58 -08:00