Commit Graph

351 Commits

Author SHA1 Message Date
Dave Jiang
11bac80004 mm, fs: reduce fault, page_mkwrite, and pfn_mkwrite to take only vmf
->fault(), ->page_mkwrite(), and ->pfn_mkwrite() calls do not need to
take a vma and vmf parameter when the vma already resides in vmf.

Remove the vma parameter to simplify things.

[arnd@arndb.de: fix ARM build]
  Link: http://lkml.kernel.org/r/20170125223558.1451224-1-arnd@arndb.de
Link: http://lkml.kernel.org/r/148521301778.19116.10840599906674778980.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jan Kara <jack@suse.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-24 17:46:54 -08:00
Andrew Donnellan
171ed0fcd8 cxl: fix nested locking hang during EEH hotplug
Commit 14a3ae34bf ("cxl: Prevent read/write to AFU config space while AFU
not configured") introduced a rwsem to fix an invalid memory access that
occurred when someone attempts to access the config space of an AFU on a
vPHB whilst the AFU is deconfigured, such as during EEH recovery.

It turns out that it's possible to run into a nested locking issue when EEH
recovery fails and a full device hotplug is required.
cxl_pci_error_detected() deconfigures the AFU, taking a writer lock on
configured_rwsem. When EEH recovery fails, the EEH code calls
pci_hp_remove_devices() to remove the device, which in turn calls
cxl_remove() -> cxl_pci_remove_afu() -> pci_deconfigure_afu(), which tries
to grab the writer lock that's already held.

Standard rwsem semantics don't express what we really want to do here and
don't allow for nested locking. Fix this by replacing the rwsem with an
atomic_t which we can control more finely. Allow the AFU to be locked
multiple times so long as there are no readers.

Fixes: 14a3ae34bf ("cxl: Prevent read/write to AFU config space while AFU not configured")
Cc: stable@vger.kernel.org # v4.9+
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-02-21 21:32:52 +11:00
Andrew Donnellan
39d4087152 cxl: Fix build when CONFIG_DEBUG_FS=n
Stub out the debugfs functions so that the build doesn't break when
CONFIG_DEBUG_FS=n.

Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-02-03 21:59:26 +11:00
Andrew Donnellan
14a3ae34bf cxl: Prevent read/write to AFU config space while AFU not configured
During EEH recovery, we deconfigure all AFUs whilst leaving the
corresponding vPHB and virtual PCI device in place.

If something attempts to interact with the AFU's PCI config space (e.g.
running lspci) after the AFU has been deconfigured and before it's
reconfigured, cxl_pcie_{read,write}_config() will read invalid values from
the deconfigured struct cxl_afu and proceed to Oops when they try to
dereference pointers that have been set to NULL during deconfiguration.

Add a rwsem to struct cxl_afu so we can prevent interaction with config
space while the AFU is deconfigured.

Reported-by: Pradipta Ghosh <pradghos@in.ibm.com>
Suggested-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org # v4.9+
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-01-25 13:34:24 +11:00
Vaibhav Jain
d7b1946c79 cxl: Force psl data-cache flush during device shutdown
This change adds a force psl data cache flush during device shutdown
callback. This should reduce a possibility of psl holding a dirty
cache line while the CAPP is being reinitialized, which may result in
a UE [load/store] machine check error.

Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-01-25 13:34:23 +11:00
Greg Kurz
0e17166d37 cxl: Drop unused header asm/pnv-pci.h
The kernel API does not use anything from this header file.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-01-25 13:34:23 +11:00
Linus Torvalds
de399813b5 powerpc updates for 4.10
Highlights include:
 
  - Support for the kexec_file_load() syscall, which is a prereq for secure and
    trusted boot.
 
  - Prevent kernel execution of userspace on P9 Radix (similar to SMEP/PXN).
 
  - Sort the exception tables at build time, to save time at boot, and store
    them as relative offsets to save space in the kernel image & memory.
 
  - Allow building the kernel with thin archives, which should allow us to build
    an allyesconfig once some other fixes land.
 
  - Build fixes to allow us to correctly rebuild when changing the kernel endian
    from big to little or vice versa.
 
  - Plumbing so that we can avoid doing a full mm TLB flush on P9 Radix.
 
  - Initial stack protector support (-fstack-protector).
 
  - Support for dumping the radix (aka. Linux) and hash page tables via debugfs.
 
  - Fix an oops in cxl coredump generation when cxl_get_fd() is used.
 
  - Freescale updates from Scott: "Highlights include 8xx hugepage support,
    qbman fixes/cleanup, device tree updates, and some misc cleanup."
 
  - Many and varied fixes and minor enhancements as always.
 
 Thanks to:
   Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V, Anshuman Khandual,
   Anton Blanchard, Balbir Singh, Bartlomiej Zolnierkiewicz, Christophe Jaillet,
   Christophe Leroy, Denis Kirjanov, Elimar Riesebieter, Frederic Barrat,
   Gautham R. Shenoy, Geliang Tang, Geoff Levand, Jack Miller, Johan Hovold,
   Lars-Peter Clausen, Libin, Madhavan Srinivasan, Michael Neuling, Nathan
   Fontenot, Naveen N. Rao, Nicholas Piggin, Pan Xinhui, Peter Senna Tschudin,
   Rashmica Gupta, Rui Teng, Russell Currey, Scott Wood, Simon Guo, Suraj
   Jitindar Singh, Thiago Jung Bauermann, Tobias Klauser, Vaibhav Jain.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJYU4YSAAoJEFHr6jzI4aWAC4gQALtIAqqPon0Cd5b/FVVcMbW7
 mMqB2b/0FGEl5GoRTzGUDaQqElilm6AEVfHO86C7DFji/a6olneFfw87iz+mtWuZ
 JvrNq68ZiSnoeszdUy4MgtXFLb5sTzNMev4skaHfjI9E5CepWBoR0zH4G+kNVnd5
 WSgudv8Cq4Px+MEuTOigt3QYjHzZ3cw/XNOOm9c+oGj+PDW4O9UItVI+S1WLoey4
 rAB2nRcLMDPuwfRQC9XsF3zEbkv4h1dEXo/EBRuRpcF+0lLTzFw1lv1WE8OxlUmS
 kAXbty3dIytBfSbtJT0c0Ps6sfQ4HFhu6ZV2fjnxNTz2KDkBIN7LBYHmBYiqY9oZ
 9zvbUWtfiTu5ocfRtTq7rC/Hcj4Kbr9S9F/FvXR0WyDsKgu4xxAovqC3gcn6YjYK
 Rr1tcCI4nUzyhVJVmd+OEhUvc5JbFy9aGage+YeOyejfvvSbXIunaxWlPjoDkvim
 Vjl+UKU8gw51XFssqY5ZBi/HNlMFKYedLpMFp/fItnLglhj50V0eFWkpDgdSCYom
 vo9ifPLZx8n8m8De3H7TV4E0F4gCHcTeqZdu7tW9AAUVM6iLJcDLm3asGmtNh21t
 snOHNOJ5QSIno6ezUUg29T6VBjbPh46fdJJSlIZrEe8OzLZ1haGyttf0tD00PQvY
 Z2W/m3gxafnOeGgBqvyv
 =xOzf
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:
 "Highlights include:

   - Support for the kexec_file_load() syscall, which is a prereq for
     secure and trusted boot.

   - Prevent kernel execution of userspace on P9 Radix (similar to
     SMEP/PXN).

   - Sort the exception tables at build time, to save time at boot, and
     store them as relative offsets to save space in the kernel image &
     memory.

   - Allow building the kernel with thin archives, which should allow us
     to build an allyesconfig once some other fixes land.

   - Build fixes to allow us to correctly rebuild when changing the
     kernel endian from big to little or vice versa.

   - Plumbing so that we can avoid doing a full mm TLB flush on P9
     Radix.

   - Initial stack protector support (-fstack-protector).

   - Support for dumping the radix (aka. Linux) and hash page tables via
     debugfs.

   - Fix an oops in cxl coredump generation when cxl_get_fd() is used.

   - Freescale updates from Scott: "Highlights include 8xx hugepage
     support, qbman fixes/cleanup, device tree updates, and some misc
     cleanup."

   - Many and varied fixes and minor enhancements as always.

  Thanks to:
    Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V, Anshuman
    Khandual, Anton Blanchard, Balbir Singh, Bartlomiej Zolnierkiewicz,
    Christophe Jaillet, Christophe Leroy, Denis Kirjanov, Elimar
    Riesebieter, Frederic Barrat, Gautham R. Shenoy, Geliang Tang, Geoff
    Levand, Jack Miller, Johan Hovold, Lars-Peter Clausen, Libin,
    Madhavan Srinivasan, Michael Neuling, Nathan Fontenot, Naveen N.
    Rao, Nicholas Piggin, Pan Xinhui, Peter Senna Tschudin, Rashmica
    Gupta, Rui Teng, Russell Currey, Scott Wood, Simon Guo, Suraj
    Jitindar Singh, Thiago Jung Bauermann, Tobias Klauser, Vaibhav Jain"

[ And thanks to Michael, who took time off from a new baby to get this
  pull request done.   - Linus ]

* tag 'powerpc-4.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (174 commits)
  powerpc/fsl/dts: add FMan node for t1042d4rdb
  powerpc/fsl/dts: add sg_2500_aqr105_phy4 alias on t1024rdb
  powerpc/fsl/dts: add QMan and BMan nodes on t1024
  powerpc/fsl/dts: add QMan and BMan nodes on t1023
  soc/fsl/qman: test: use DEFINE_SPINLOCK()
  powerpc/fsl-lbc: use DEFINE_SPINLOCK()
  powerpc/8xx: Implement support of hugepages
  powerpc: get hugetlbpage handling more generic
  powerpc: port 64 bits pgtable_cache to 32 bits
  powerpc/boot: Request no dynamic linker for boot wrapper
  soc/fsl/bman: Use resource_size instead of computation
  soc/fsl/qe: use builtin_platform_driver
  powerpc/fsl_pmc: use builtin_platform_driver
  powerpc/83xx/suspend: use builtin_platform_driver
  powerpc/ftrace: Fix the comments for ftrace_modify_code
  powerpc/perf: macros for power9 format encoding
  powerpc/perf: power9 raw event format encoding
  powerpc/perf: update attribute_group data structure
  powerpc/perf: factor out the event format field
  powerpc/mm/iommu, vfio/spapr: Put pages on VFIO container shutdown
  ...
2016-12-16 09:26:42 -08:00
Jan Kara
1a29d85eb0 mm: use vmf->address instead of of vmf->virtual_address
Every single user of vmf->virtual_address typed that entry to unsigned
long before doing anything with it so the type of virtual_address does
not really provide us any additional safety.  Just use masked
vmf->address which already has the appropriate type.

Link: http://lkml.kernel.org/r/1479460644-25076-3-git-send-email-jack@suse.cz
Signed-off-by: Jan Kara <jack@suse.cz>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-12-14 16:04:09 -08:00
Geliang Tang
7184bc2ddb cxl: drop duplicate header sched.h
Drop duplicate header sched.h from native.c.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-25 14:07:50 +11:00
Andrew Donnellan
3382a6220f cxl: Fix coccinelle warnings
Fix the following coccinelle warnings:

  drivers/misc/cxl/debugfs.c:46:0-23: WARNING: fops_io_x64 should be
      defined with DEFINE_DEBUGFS_ATTRIBUTE
  drivers/misc/cxl/guest.c:890:5-26: WARNING: Comparison to bool
  drivers/misc/cxl/irq.c:107:3-23: WARNING: Assignment of bool to 0/1
  drivers/misc/cxl/native.c:57:2-3: Unneeded semicolon
  drivers/misc/cxl/native.c:170:2-3: Unneeded semicolon

Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-23 22:57:49 +11:00
Frederic Barrat
bdecf76e31 cxl: Fix coredump generation when cxl_get_fd() is used
If a process dumps core while owning a cxl file descriptor obtained
from an AFU driver (e.g. cxlflash) through the cxl_get_fd() API, the
following error occurs:

  [  868.027591] Unable to handle kernel paging request for data at address ...
  [  868.027778] Faulting instruction address: 0xc00000000035edb0
  cpu 0x8c: Vector: 300 (Data Access) at [c000003c688275e0]
      pc: c00000000035edb0: elf_core_dump+0xd60/0x1300
      lr: c00000000035ed80: elf_core_dump+0xd30/0x1300
      sp: c000003c68827860
     msr: 9000000100009033
     dar: c
  dsisr: 40000000
   current = 0xc000003c68780000
   paca    = 0xc000000001b73200   softe: 0        irq_happened: 0x01
      pid   = 46725, comm = hxesurelock
  enter ? for help
  [c000003c68827a60] c00000000036948c do_coredump+0xcec/0x11e0
  [c000003c68827c20] c0000000000ce9e0 get_signal+0x540/0x7b0
  [c000003c68827d10] c000000000017354 do_signal+0x54/0x2b0
  [c000003c68827e00] c00000000001777c do_notify_resume+0xbc/0xd0
  [c000003c68827e30] c000000000009838 ret_from_except_lite+0x64/0x68
  --- Exception: 300 (Data Access) at 00003fff98ad2918

The root cause is that the address_space structure for the file
doesn't define a 'host' member.

When cxl allocates a file descriptor, it's using the anonymous inode
to back the file, but allocates a private address_space for each
context. The private address_space allows to track memory allocation
for each context. cxl doesn't define the 'host' member of the address
space, i.e. the inode. We don't want to define it as the anonymous
inode, since there's no longer a 1-to-1 relation between address_space
and inode.

To fix it, instead of using the anonymous inode, we introduce a simple
pseudo filesystem so that cxl can allocate its own inodes. So we now
have one inode for each file and address_space. The pseudo filesystem
is only mounted on the first allocation of a file descriptor by
cxl_get_fd().

Tested with cxlflash.

Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-18 23:02:17 +11:00
Vaibhav Jain
abf051be68 cxl: Do adapter fence check before handling afu interrupt
If an afu interrupt is in flight when an eeh error is triggered the
control still reaches the function native_irq_multiplexed and the
PE-Handle read from the CXL_PSL_PEHandle_An register is 0xffff. The
function then erroneously assumes that the interrupt belonged to a
detached context and generates a warning with full stack dump in the
kernel log complaining:

"Unable to demultiplex CXL PSL IRQ for PE 65535 DSISR ffffffff DAR
ffffffff. (Possible AFU HW issue - was a term/remove acked with
outstanding transactions"

To fix this the patch adds new code to the function
native_irq_multiplexed function to compares the read value of register
CXL_PSL_PEHandle_An to ~0ULL. If true then logs a warning message
saying that the interrupt is being ignored and returns IRQ_HANDLED from
the irq handler.

Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-18 22:41:08 +11:00
Christophe Jaillet
bb81733de2 cxl: Fix error handling in _cxl_pci_associate_default_context()
'cxl_dev_context_init()' returns an error pointer in case of error, not
NULL. So test it with IS_ERR.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-18 22:41:08 +11:00
Christophe Jaillet
28e323e5a0 cxl: Fix error handling in _cxl_cx4_setup_msi_irqs()
'cxl_dev_context_init()' returns an error pointer in case of error, not
NULL. So test it with IS_ERR.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-18 22:41:08 +11:00
Christophe Jaillet
5cd4f5cec2 cxl: Fix memory allocation failure test
'cxl_context_alloc()' does not return an error pointer. It is just a
shortcut for a call to 'kzalloc' with 'sizeof(struct cxl_context)' as the
size parameter.

So its return value should be compared with NULL.
While fixing it, simplify a bit the code.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-11-18 22:41:08 +11:00
Vaibhav Jain
a05b82d514 cxl: Fix leaking pid refs in some error paths
In some error paths in functions cxl_start_context and
afu_ioctl_start_work pid references to the current & group-leader tasks
can leak after they are taken. This patch fixes these error paths to
release these pid references before exiting the error path.

Fixes: 7b8ad495d5 ("cxl: Fix DSI misses when the context owning task exits")
Cc: stable@vger.kernel.org # v4.5+
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reported-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-24 11:38:27 +11:00
Vaibhav Jain
70b565bbdb cxl: Prevent adapter reset if an active context exists
This patch prevents resetting the cxl adapter via sysfs in presence of
one or more active cxl_context on it. This protects against an
unrecoverable error caused by PSL owning a dirty cache line even after
reset and host tries to touch the same cache line. In case a force reset
of the card is required irrespective of any active contexts, the int
value -1 can be stored in the 'reset' sysfs attribute of the card.

The patch introduces a new atomic_t member named contexts_num inside
struct cxl that holds the number of active context attached to the card
, which is checked against '0' before proceeding with the reset. To
prevent against a race condition where a context is activated just after
reset check is performed, the contexts_num is atomically set to '-1'
after reset-check to indicate that no more contexts can be activated on
the card anymore.

Before activating a context we atomically test if contexts_num is
non-negative and if so, increment its value by one. In case the value of
contexts_num is negative then it indicates that the card is about to be
reset and context activation is error-ed out at that point.

Fixes: 62fa19d4b4 ("cxl: Add ability to reset the card")
Cc: stable@vger.kernel.org # v4.0+
Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-19 20:35:39 +11:00
Andrew Donnellan
735840b44b cxl: replace loop with for_each_child_of_node(), remove unneeded of_node_put()
Rewrite the cxl_guest_init_afu() loop in cxl_of_probe() to use
for_each_child_of_node() rather than a hand-coded for loop.

Remove the useless of_node_put(afu_np) call after the loop, where it's
guaranteed that afu_np == NULL.

Reported-by: SF Markus Elfring <elfring@users.sourceforge.net>
Reported-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04 16:19:23 +11:00
Frederic Barrat
aaa2245ed8 cxl: Flush PSL cache before resetting the adapter
If the capi link is going down while the PSL owns a dirty cache line,
any access from the host for that data could lead to an Uncorrectable
Error.

So when resetting the capi adapter through sysfs, make sure the PSL
cache is flushed. It won't help if there are any active Process
Elements on the card, as the cache would likely get new dirty cache
lines immediately, but if resetting an idle adapter, it should avoid
any bad surprises from data left over from terminated Process Elements.

Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-10-04 16:16:42 +11:00
Frederic Barrat
b135077b83 cxl: Fix informational message
When set_sl_ops() is called, the adapter data structure is not fully
initialized yet. Therefore the device name is not showing up in the
trace. Fix is simply to get the device name from the pci_dev
structure.

Fixes: 6d382616ac ("cxl: Abstract the differences between the PSL and XSL")
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-09-13 17:37:11 +10:00
Andrew Donnellan
6f38a8b9a4 cxl: use pcibios_free_controller_deferred() when removing vPHBs
When cxl removes a vPHB, it's possible that the pci_controller may be freed
before all references to the devices on the vPHB have been released. This
in turn causes an invalid memory access when the devices are eventually
released, as pcibios_release_device() attempts to call the phb's
release_device hook.

In cxl_pci_vphb_remove(), remove the existing call to
pcibios_free_controller(). Instead, use
pcibios_free_controller_deferred() to free the pci_controller after all
devices have been released. Export pci_set_host_bridge_release() so we can
do this.

Cc: stable@vger.kernel.org
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2016-08-22 11:09:33 +10:00
Frederic Barrat
c6d2ee09c2 cxl: Set psl_fir_cntl to production environment value
Switch the setting of psl_fir_cntl from debug to production
environment recommended value. It mostly affects the PSL behavior when
an error is raised in psl_fir1/2.

Tested with cxlflash.

Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-10 14:02:45 +10:00
Andrew Donnellan
6fd40f192a cxl: Fix sparse warnings
Make native_irq_wait() static and use NULL rather than 0 to initialise
phb->cfg_data in cxl_pci_vphb_add() to remove sparse warnings.

Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-09 16:52:02 +10:00
Andrew Donnellan
164793379a cxl: Fix NULL dereference in cxl_context_init() on PowerVM guests
Commit f67a6722d6 ("cxl: Workaround PE=0 hardware limitation in
Mellanox CX4") added a "min_pe" field to struct cxl_service_layer_ops,
to allow us to work around a Mellanox CX-4 hardware limitation.

When allocating the PE number in cxl_context_init(), we read from
ctx->afu->adapter->native->sl_ops->min_pe to get the minimum PE number.
Unsurprisingly, in a PowerVM guest ctx->afu->adapter->native is NULL,
and guests don't have a cxl_service_layer_ops struct anywhere.

Move min_pe from struct cxl_service_layer_ops to struct cxl so it's
accessible in both native and PowerVM environments. For the Mellanox
CX-4, set the min_pe value in set_sl_ops().

Fixes: f67a6722d6 ("cxl: Workaround PE=0 hardware limitation in Mellanox CX4")
Reported-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-08-09 16:52:02 +10:00
Andrew Donnellan
8fbaa51d43 cxl: fix potential NULL dereference in free_adapter()
If kzalloc() fails when allocating adapter->guest in
cxl_guest_init_adapter(), we call free_adapter() before erroring out.
free_adapter() in turn attempts to dereference adapter->guest, which in
this case is NULL.

In free_adapter(), skip the adapter->guest cleanup if adapter->guest is
NULL.

Fixes: 14baf4d9c7 ("cxl: Add guest-specific code")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-19 20:12:29 +10:00
Andrew Donnellan
1e44727a0b cxl: remove dead Kconfig options
Remove the CXL_KERNEL_API and CXL_EEH Kconfig options, as they were only
needed to coordinate the merging of the cxlflash driver. Also remove the
stub implementation of cxl_perst_reloads_same_image() in cxlflash which is
only used if CXL_EEH isn't defined (i.e. never).

Suggested-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Acked-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-19 20:12:29 +10:00
Andrew Donnellan
b0b5e5918a cxl: Add cxl_check_and_switch_mode() API to switch bi-modal cards
Add a new API, cxl_check_and_switch_mode() to allow for switching of
bi-modal CAPI cards, such as the Mellanox CX-4 network card.

When a driver requests to switch a card to CAPI mode, use PCI hotplug
infrastructure to remove all PCI devices underneath the slot. We then write
an updated mode control register to the CAPI VSEC, hot reset the card, and
reprobe the card.

As the card may present a different set of PCI devices after the mode
switch, use the infrastructure provided by the pnv_php driver and the OPAL
PCI slot management facilities to ensure that:

  * the old devices are removed from both the OPAL and Linux device trees
  * the new devices are probed by OPAL and added to the OPAL device tree
  * the new devices are added to the Linux device tree and probed through
    the regular PCI device probe path

As such, introduce a new option, CONFIG_CXL_BIMODAL, with a dependency on
the pnv_php driver.

Refactor existing code that touches the mode control register in the
regular single mode case into a new function, setup_cxl_protocol_area().

Co-authored-by: Ian Munsie <imunsie@au1.ibm.com>
Cc: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Reviewed-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-14 20:28:11 +10:00
Ian Munsie
f67a6722d6 cxl: Workaround PE=0 hardware limitation in Mellanox CX4
The CX4 card cannot cope with a context with PE=0 due to a hardware
limitation, resulting in:

[   34.166577] command failed, status limits exceeded(0x8), syndrome 0x5a7939
[   34.166580] mlx5_core 0000:01:00.1: Failed allocating uar, aborting

Since the kernel API allocates a default context very early during
device init that will almost certainly get Process Element ID 0 there is
no easy way for us to extend the API to allow the Mellanox to inform us
of this limitation ahead of time.

Instead, work around the issue by extending the XSL structure to include
a minimum PE to allocate. Although the bug is not in the XSL, it is the
easiest place to work around this limitation given that the CX4 is
currently the only card that uses an XSL.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-14 20:28:07 +10:00
Ian Munsie
a2f67d5ee8 cxl: Add support for interrupts on the Mellanox CX4
The Mellanox CX4 in cxl mode uses a hybrid interrupt model, where
interrupts are routed from the networking hardware to the XSL using the
MSIX table, and from there will be transformed back into an MSIX
interrupt using the cxl style interrupts (i.e. using IVTE entries and
ranges to map a PE and AFU interrupt number to an MSIX address).

We want to hide the implementation details of cxl interrupts as much as
possible. To this end, we use a special version of the MSI setup &
teardown routines in the PHB while in cxl mode to allocate the cxl
interrupts and configure the IVTE entries in the process element.

This function does not configure the MSIX table - the CX4 card uses a
custom format in that table and it would not be appropriate to fill that
out in generic code. The rest of the functionality is similar to the
"Full MSI-X mode" described in the CAIA, and this could be easily
extended to support other adapters that use that mode in the future.

The interrupts will be associated with the default context. If the
maximum number of interrupts per context has been limited (e.g. by the
mlx5 driver), it will automatically allocate additional kernel contexts
to associate extra interrupts as required. These contexts will be
started using the same WED that was used to start the default context.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-14 20:27:08 +10:00
Ian Munsie
cbce0917e2 cxl: Add preliminary workaround for CX4 interrupt limitation
The Mellanox CX4 has a hardware limitation where only 4 bits of the
AFU interrupt number can be passed to the XSL when sending an interrupt,
limiting it to only 15 interrupts per context (AFU interrupt number 0 is
invalid).

In order to overcome this, we will allocate additional contexts linked
to the default context as extra address space for the extra interrupts -
this will be implemented in the next patch.

This patch adds the preliminary support to allow this, by way of adding
a linked list in the context structure that we use to keep track of the
contexts dedicated to interrupts, and an API to simultaneously iterate
over the related context structures, AFU interrupt numbers and hardware
interrupt numbers. The point of using a single API to iterate these is
to hide some of the details of the iteration from external code, and to
reduce the number of APIs that need to be exported via base.c to allow
built in code to call.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-14 20:27:02 +10:00
Ian Munsie
79384e4b71 cxl: Add kernel APIs to get & set the max irqs per context
These APIs will be used by the Mellanox CX4 support. While they function
standalone to configure existing behaviour, their primary purpose is to
allow the Mellanox driver to inform the cxl driver of a hardware
limitation, which will be used in a future patch.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-14 20:26:57 +10:00
Ian Munsie
317f5ef1b3 cxl: Add support for using the kernel API with a real PHB
This hooks up support for using the kernel API with a real PHB. After
the AFU initialisation has completed it calls into the PHB code to pass
it the AFU that will be used by other peer physical functions on the
adapter.

The cxl_pci_to_afu API is extended to work with peer PCI devices,
retrieving the peer AFU from the PHB. This API may also now return an
error if it is called on a PCI device that is not associated with either
a cxl vPHB or a peer PCI device to an AFU, and this error is propagated
down.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-14 20:26:56 +10:00
Ian Munsie
e4f5fc001a cxl: Do not create vPHB if there are no AFU configuration records
The vPHB model of the cxl kernel API is a hierarchy where the AFU is
represented by the vPHB, and it's AFU configuration records are exposed
as functions under that vPHB. If there are no AFU configuration records
we will create a vPHB with nothing under it, which is a waste of
resources and will opt us into EEH handling despite not having anything
special to handle.

This also does not make sense for cards using the peer model of the cxl
kernel API, where the other functions of the device are exposed via
additional peer physical functions rather than AFU configuration
records. This model will also not work with the existing EEH handling in
the cxl driver, as that is designed around the vPHB model.

Skip creating the vPHB for AFUs without any AFU configuration records,
and opt out of EEH handling for them.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-14 20:26:53 +10:00
Ian Munsie
a19bd79e31 cxl: Allow a default context to be associated with an external pci_dev
The cxl kernel API has a concept of a default context associated with
each PCI device under the virtual PHB. The Mellanox CX4 will also use
the cxl kernel API, but it does not use a virtual PHB - rather, the AFU
appears as a physical function as a peer to the networking functions.

In order to allow the kernel API to work with those networking
functions, we will need to associate a default context with them as
well. To this end, refactor the corresponding code to do this in vphb.c
and export it so that it can be called from the PHB code.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-14 20:26:52 +10:00
Ian Munsie
62ccf2d2ef cxl: Move cxl_afu_get / cxl_afu_put to base
The Mellanox CX4 uses a model where the AFU is one physical function of
the device, and is used by other peer physical functions of the same
device. This will require those other devices to grab a reference on the
AFU when they are initialised to make sure that it does not go away
during their lifetime.

Move the AFU refcount functions to base.c so they can be called from
the PHB code.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-14 20:26:36 +10:00
Ian Munsie
48b3adf334 cxl: Enable bus mastering for devices using CAPP DMA mode
Devices that use CAPP DMA mode (such as the Mellanox CX4) require bus
master to be enabled in order for the CAPI traffic to flow. This should
be harmless to enable for other cxl devices, so unconditionally enable
it in the adapter init flow.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-14 20:26:35 +10:00
Ian Munsie
4e56f858bd cxl: Add cxl_slot_is_supported API
This extends the check that the adapter is in a CAPI capable slot so
that it may be called by external users in the kernel API. This will be
used by the upcoming Mellanox CX4 support, which needs to know ahead of
time if the card can be switched to cxl mode so that it can leave it in
PCI mode if it is not.

This API takes a parameter to check if CAPP DMA mode is supported, which
it currently only allows on P8NVL systems, since that mode currently has
issues accessing memory < 4GB on P8, and we cannot realistically avoid
that.

This API does not currently check if a CAPP unit is available (i.e. not
already assigned to another PHB) on P8. Doing so would be racy since it
is assigned on a first come first serve basis, and so long as CAPP DMA
mode is not supported on P8 we don't need this, since the only
anticipated user of this API requires CAPP DMA mode.

Cc: Philippe Bergheaud <felix@linux.vnet.ibm.com>
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-14 20:26:33 +10:00
Wei Yongjun
fc9f75ef2f cxl: Use for_each_compatible_node() macro
Use for_each_compatible_node() macro instead of open coding it.

Generated by Coccinelle.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-14 20:26:25 +10:00
Philippe Bergheaud
3b3dcd61fa cxl: Ignore CAPI adapters misplaced in switched slots
One should not attempt to switch a PHB into CAPI mode if there is
a switch between the PHB and the adapter. This patch modifies the
cxl driver to ignore CAPI adapters misplaced in switched slots.

Signed-off-by: Philippe Bergheaud <felix@linux.vnet.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-08 22:23:57 +10:00
Paul Gortmaker
e00878be3f cxl: make base more explicitly non-modular
The Kconfig/Makefile currently controlling compilation of this code is:

drivers/misc/cxl/Kconfig:config CXL_BASE
drivers/misc/cxl/Kconfig:       bool

drivers/misc/cxl/Makefile:obj-$(CONFIG_CXL_BASE)          += base.o

...meaning that it currently is not being built as a module by anyone.

Lets convert the one module_init into device_initcall so that
when reading the driver it more clear that it is builtin-only.

Since module_init translates to device_initcall in the non-modular
case, the init ordering remains unchanged with this commit.

We don't replace module.h with init.h since the file is doing
other modular stuff (module_get/put) even though it is built-in.

Cc: Ian Munsie <imunsie@au1.ibm.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-08 22:22:58 +10:00
Philippe Bergheaud
6e0c50f9e8 cxl: Refine slice error debug messages
The PSL Slice Error Register (PSL_SERR_An) reports implementation
dependent AFU errors, in the form of a bitmap. The PSL_SERR_An
register content is printed in the form of hex dump debug message.

This patch decodes the PSL_ERR_An register contents, and prints a
specific error message for each possible error bit. It also dumps
the secondary registers AFU_ERR_An and PSL_DSISR_An, that may
contain extra debug information.

This patch also removes the large WARN message that used to report
the cxl slice error interrupt, and replaces it by a short informative
message, that draws attention to AFU implementation errors.

Signed-off-by: Philippe Bergheaud <felix@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-08 22:22:03 +10:00
Ian Munsie
f5c9df9a44 cxl: Fix NULL pointer dereference on kernel contexts with no AFU interrupts
If a kernel context is initialised and does not have any AFU interrupts
allocated it will cause a NULL pointer dereference when the context is
detached since the irq_names list will not have been initialised.

Move the initialisation of the irq_names list into the cxl_context_init
routine so that it will be valid for the entire lifetime of the context
and will not cause a NULL pointer dereference.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-08 22:13:34 +10:00
Ian Munsie
2a4f667aad cxl: Workaround XSL bug that does not clear the RA bit after a reset
An issue was noted in our debug logs where the XSL would leave the RA
bit asserted after an AFU reset operation, which would effectively
prevent further AFU reset operations from working.

Workaround the issue by clearing the RA bit with an MMIO write if it is
still asserted after any AFU control operation.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-08 22:13:06 +10:00
Ian Munsie
5e7823c9bc cxl: Fix bug where AFU disable operation had no effect
The AFU disable operation has a bug where it will not clear the enable
bit and therefore will have no effect. To date this has likely been
masked by fact that we perform an AFU reset before the disable, which
also has the effect of clearing the enable bit, making the following
disable operation effectively a noop on most hardware. This patch
modifies the afu_control function to take a parameter to clear from the
AFU control register so that the disable operation can clear the
appropriate bit.

This bug was uncovered on the Mellanox CX4, which uses an XSL rather
than a PSL. On the XSL the reset operation will not complete while the
AFU is enabled, meaning the enable bit was still set at the start of the
disable and as a result this bug was hit and the disable also timed out.

Because of this difference in behaviour between the PSL and XSL, this
patch now makes the reset dependent on the card using a PSL to avoid
waiting for a timeout on the XSL. It is entirely possible that we may be
able to drop the reset altogether if it turns out we only ever needed it
due to this bug - however I am not willing to drop it without further
regression testing and have added comments to the code explaining the
background.

This also fixes a small issue where the AFU_Cntl register was read
outside of the lock that protects it.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-08 22:11:18 +10:00
Ian Munsie
2224b6719b cxl: Fix allocating a minimum of 2 pages for the SPA
The Scheduled Process Area is allocated dynamically with enough pages to
fit at least as many processes as the AFU descriptor indicated. Since
the calculation is non-trivial, it does this by calculating how many
processes could fit in an allocation of a given order, and increasing
that order until it can fit enough processes or hits the maximum
supported size.

Currently, it will start this search using a SPA of 2 pages instead of
1. This can waste a page of memory if the AFU's maximum number of
supported processes was small enough to fit in one page.

Fix the algorithm to start the search at 1 page.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-08 22:10:44 +10:00
Ian Munsie
49e9c99f47 cxl: Fix allowing bogus AFU descriptors with 0 maximum processes
If the AFU descriptor of an AFU directed AFU indicates that it supports
0 maximum processes, we will accept that value and attempt to use it.
The SPA will still be allocated (with 2 pages due to another minor bug
and room for 958 processes), and when a context is allocated we will
pass the value of 0 to idr_alloc as the maximum. However, idr_alloc will
treat that as meaning no maximum and will allocate a context number and
we return a valid context.

Conceivably, this could lead to a buffer overflow of the SPA if more
than 958 contexts were allocated, however this is mitigated by the fact
that there are no known AFUs in the wild with a bogus AFU descriptor
like this, and that only the root user is allowed to flash an AFU image
to a card.

Add a check when validating the AFU descriptor to reject any with 0
maximum processes.

We do still allow a dedicated process only AFU to indicate that it
supports 0 contexts even though that is forbidden in the architecture,
as in that case we ignore the value and use 1 instead. This is just on
the off-chance that such a dedicated process AFU may exist (not that I
am aware of any), since their developers are less likely to have cared
about this value at all.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-07-08 22:10:42 +10:00
Michael Neuling
ad42de859f cxl: Add set and get private data to context struct
This provides AFU drivers a means to associate private data with a cxl
context. This is particularly intended for make the new callbacks for
driver specific events easier for AFU drivers to use, as they can easily
get back to any private data structures they may use.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Philippe Bergheaud <felix@linux.vnet.ibm.com
Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-28 18:35:08 +10:00
Philippe Bergheaud
b810253bd9 cxl: Add mechanism for delivering AFU driver specific events
This adds an afu_driver_ops structure with fetch_event() and
event_delivered() callbacks. An AFU driver such as cxlflash can fill
this out and associate it with a context to enable passing custom AFU
specific events to userspace.

This also adds a new kernel API function cxl_context_pending_events(),
that the AFU driver can use to notify the cxl driver that new specific
events are ready to be delivered, and wake up anyone waiting on the
context wait queue.

The current count of AFU driver specific events is stored in the field
afu_driver_events of the context structure.

The cxl driver checks the afu_driver_events count during poll, select,
read, etc. calls to check if an AFU driver specific event is pending,
and calls fetch_event() to obtain and deliver that event. This way, the
cxl driver takes care of all the usual locking semantics around these
calls and handles all the generic cxl events, so that the AFU driver
only needs to worry about it's own events.

fetch_event() return a struct cxl_event_afu_driver_reserved, allocated
by the AFU driver, and filled in with the specific event information and
size. Total event size (header + data) should not be greater than
CXL_READ_MIN_SIZE (4K).

Th cxl driver prepends an appropriate cxl event header, copies the event
to userspace, and finally calls event_delivered() to return the status of
the operation to the AFU driver. The event is identified by the context
and cxl_event_afu_driver_reserved pointers.

Since AFU drivers provide their own means for userspace to obtain the
AFU file descriptor (i.e. cxlflash uses an ioctl on their scsi file
descriptor to obtain the AFU file descriptor) and the generic cxl driver
will never use this event, the ABI of the event is up to each individual
AFU driver.

Signed-off-by: Philippe Bergheaud <felix@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-28 18:34:56 +10:00
Frederic Barrat
a430739009 cxl: Make vPHB device node match adapter's
On bare-metal, when a device is attached to the cxl card, lsvpd shows
a location code such as (with cxlflash):
     # lsvpd -l sg22
     ...
     *YL U78CB.001.WZS0073-P1-C33-B0-T0-L0
which makes it hard to easily identify the cxl adapter owning the
flash device, since in this example C33 refers to a P8 processor.

lsvpd looks in the parent devices until it finds a location code, so the
device node for the vPHB ends up being used.

By reusing the device node of the adapter for the vPHB, lsvpd shows:
     # lsvpd -l sg16
     ...
     *YL U78C9.001.WZS09XA-P1-C7-B1-T0-L3
where C7 is the PCI slot of the cxl adapter.

On powerVM, the vPHB was already using the adapter device node, so
there's no change there.

Tested by cxlflash on bare-metal and powerVM.

Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-16 23:11:30 +10:00
Ian Munsie
b385c9e971 cxl: Add support for CAPP DMA mode
This adds support for using CAPP DMA mode, which is required for XSL
based cards such as the Mellanox CX4 to function.

This is currently an RFC as it depends on the corresponding support to
be merged into skiboot first, which was submitted here:
http://patchwork.ozlabs.org/patch/625582/

In the event that the skiboot on the system does not have the above
support, it will indicate as such in the kernel log and abort the init
process.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-16 23:10:26 +10:00
Frederic Barrat
6d382616ac cxl: Abstract the differences between the PSL and XSL
The XSL (Translation Service Layer) is a stripped down version of the
PSL (Power Service Layer) used in some cards such as the Mellanox CX4.

Like the PSL, it implements the CAIA architecture, but has a number of
differences, mostly in it's implementation dependent registers. This
adds an ops structure to abstract these differences to bring initial
support for XSL CAPI devices.

The XSL does not implement the optional architected SERR register,
however while it treats it as a reserved register and should work with
no special treatment, attempting to access it will cause the XSL_FEC
(First Error Capture) register to be filled out, preventing it from
capturing any subsequent errors. Therefore, this patch also prevents the
kernel from trying to set up the SERR register so that the FEC register
may still be useful, and to save one interrupt.

The XSL also uses a special DMA cxl mode, which uses a slightly
different init sequence for the CAPP and PHB. The kernel support for
this will be in a future patch once the corresponding support has been
merged into skiboot.

Co-authored-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-16 23:08:54 +10:00
Ian Munsie
292841b096 cxl: Update process element after allocating interrupts
In the kernel API, it is possible to attempt to allocate AFU interrupts
after already starting a context. Since the process element structure
used by the hardware is only filled out at the time the context is
started, it will not be updated with the interrupt numbers that have
just been allocated and therefore AFU interrupts will not work unless
they were allocated prior to starting the context.

This can present some difficulties as each CAPI enabled PCI device in
the kernel API has a default context, which may need to be started very
early to enable translations, potentially before interrupts can easily
be set up.

This patch makes the API more flexible to allow interrupts to be
allocated after a context has already been started and takes care of
updating the PE structure used by the hardware and notifying it to
discard any cached copy it may have.

The update is currently performed via a terminate/remove/add sequence.
This is necessary on some hardware such as the XSL that does not
properly support the update LLCMD.

Note that this is only supported on powernv at present - attempting to
perform this ordering on PowerVM will raise a warning.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-16 23:08:49 +10:00
Andrew Donnellan
64417a3989 cxl: static-ify variables to fix sparse warnings
Make a couple more variables static. Found by sparse.

Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: fbarrat@linux.vnet.ibm.com
Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-06-16 22:49:27 +10:00
Linus Torvalds
c04a588029 powerpc updates for 4.7
Highlights:
  - Support for Power ISA 3.0 (Power9) Radix Tree MMU from Aneesh Kumar K.V
  - Live patching support for ppc64le (also merged via livepatching.git)
 
 Various cleanups & minor fixes from:
  - Aaro Koskinen, Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V,
    Chris Smart, Daniel Axtens, Frederic Barrat, Gavin Shan, Ian Munsie, Lennart
    Sorensen, Madhavan Srinivasan, Mahesh Salgaonkar, Markus Elfring, Michael
    Ellerman, Oliver O'Halloran, Paul Gortmaker, Paul Mackerras, Rashmica Gupta,
    Russell Currey, Suraj Jitindar Singh, Thiago Jung Bauermann, Valentin
    Rothberg, Vipin K Parashar.
 
 General:
  - Update LMB associativity index during DLPAR add/remove from Nathan Fontenot
  - Fix branching to OOL handlers in relocatable kernel from Hari Bathini
  - Add support for userspace Power9 copy/paste from Chris Smart
  - Always use STRICT_MM_TYPECHECKS from Michael Ellerman
  - Add mask of possible MMU features from Michael Ellerman
 
 PCI:
  - Enable pass through of NVLink to guests from Alexey Kardashevskiy
  - Cleanups in preparation for powernv PCI hotplug from Gavin Shan
  - Don't report error in eeh_pe_reset_and_recover() from Gavin Shan
  - Restore initial state in eeh_pe_reset_and_recover() from Gavin Shan
  - Revert "powerpc/eeh: Fix crash in eeh_add_device_early() on Cell" from Guilherme G. Piccoli
  - Remove the dependency on EEH struct in DDW mechanism from Guilherme G. Piccoli
 
 selftests:
  - Test cp_abort during context switch from Chris Smart
  - Add several tests for transactional memory support from Rashmica Gupta
 
 perf:
  - Add support for sampling interrupt register state from Anju T
  - Add support for unwinding perf-stackdump from Chandan Kumar
 
 cxl:
  - Configure the PSL for two CAPI ports on POWER8NVL from Philippe Bergheaud
  - Allow initialization on timebase sync failures from Frederic Barrat
  - Increase timeout for detection of AFU mmio hang from Frederic Barrat
  - Handle num_of_processes larger than can fit in the SPA from Ian Munsie
  - Ensure PSL interrupt is configured for contexts with no AFU IRQs from Ian Munsie
  - Add kernel API to allow a context to operate with relocate disabled from Ian Munsie
  - Check periodically the coherent platform function's state from Christophe Lombard
 
 Freescale:
  - Updates from Scott: "Contains 86xx fixes, minor device tree fixes, an erratum
    workaround, and a kconfig dependency fix."
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXPsGzAAoJEFHr6jzI4aWAVoAP/iKdrDe0eYHlVAE9SqnbsiZs
 lgDxdsC8P3fsmP1G9o/HkKhC82zHl/La8Ztz8dtqa+LkSzbfliWP1ztJsI7GsBFo
 tyCKzWnX9Rwvd3meHu/o/SQ29TNLm/PbPyyRqpj5QPbJ8XCXkAXR7ZZZqjvcMsJW
 /AgIr7Cgf53tl9oZzzl/c7CnNHhMq+NBdA71vhWtUx+T97wfJEGyKW6HhZyHDbEU
 iAki7fu77ZpEqC/Fh9swf0dCGBJ+a132NoMVo0AdV7EQLznUYlQpQEqa+1PyHZOP
 /ArOzf2mDg6m3PfCo1eiB07v8PnVZ3llEUbVAJNg3GUxbE4SHrqq/kwm0iElm3p/
 DvFxerCwdX9vmskJX4wDs+pSZRabXYj9XVMptsgFzA4joWrqqb7mBHqaort88YcY
 YSljEt1bHyXmiJ+dBya40qARsWUkCVN7ZgEzdxckq0KI3w7g2tqpqIbO2lClWT6t
 B3GpqQ4jp34+d1M14FB91fIGK7tMvOhSInE0Mv9+tPvRsepXqiiU/SwdAtRlr3m2
 zs/K+4FYcVjJ3Rmpgc+tI38PbZxHe212I35YN6L1LP+4ZfAtzz0NyKdooTIBtkbO
 19pX4WbBjKq8zK+YutrySncBIrbnI6VjW51vtRhgVKZliPFO/6zKagyU6FbxM+E5
 udQES+t3F/9gvtxgxtDe
 =YvyQ
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:
 "Highlights:
   - Support for Power ISA 3.0 (Power9) Radix Tree MMU from Aneesh Kumar K.V
   - Live patching support for ppc64le (also merged via livepatching.git)

  Various cleanups & minor fixes from:
   - Aaro Koskinen, Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V,
     Chris Smart, Daniel Axtens, Frederic Barrat, Gavin Shan, Ian Munsie,
     Lennart Sorensen, Madhavan Srinivasan, Mahesh Salgaonkar, Markus Elfring,
     Michael Ellerman, Oliver O'Halloran, Paul Gortmaker, Paul Mackerras,
     Rashmica Gupta, Russell Currey, Suraj Jitindar Singh, Thiago Jung
     Bauermann, Valentin Rothberg, Vipin K Parashar.

  General:
   - Update LMB associativity index during DLPAR add/remove from Nathan
     Fontenot
   - Fix branching to OOL handlers in relocatable kernel from Hari Bathini
   - Add support for userspace Power9 copy/paste from Chris Smart
   - Always use STRICT_MM_TYPECHECKS from Michael Ellerman
   - Add mask of possible MMU features from Michael Ellerman

  PCI:
   - Enable pass through of NVLink to guests from Alexey Kardashevskiy
   - Cleanups in preparation for powernv PCI hotplug from Gavin Shan
   - Don't report error in eeh_pe_reset_and_recover() from Gavin Shan
   - Restore initial state in eeh_pe_reset_and_recover() from Gavin Shan
   - Revert "powerpc/eeh: Fix crash in eeh_add_device_early() on Cell"
     from Guilherme G Piccoli
   - Remove the dependency on EEH struct in DDW mechanism from Guilherme
     G Piccoli

  selftests:
   - Test cp_abort during context switch from Chris Smart
   - Add several tests for transactional memory support from Rashmica
     Gupta

  perf:
   - Add support for sampling interrupt register state from Anju T
   - Add support for unwinding perf-stackdump from Chandan Kumar

  cxl:
   - Configure the PSL for two CAPI ports on POWER8NVL from Philippe
     Bergheaud
   - Allow initialization on timebase sync failures from Frederic Barrat
   - Increase timeout for detection of AFU mmio hang from Frederic
     Barrat
   - Handle num_of_processes larger than can fit in the SPA from Ian
     Munsie
   - Ensure PSL interrupt is configured for contexts with no AFU IRQs
     from Ian Munsie
   - Add kernel API to allow a context to operate with relocate disabled
     from Ian Munsie
   - Check periodically the coherent platform function's state from
     Christophe Lombard

  Freescale:
   - Updates from Scott: "Contains 86xx fixes, minor device tree fixes,
     an erratum workaround, and a kconfig dependency fix."

* tag 'powerpc-4.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (192 commits)
  powerpc/86xx: Fix PCI interrupt map definition
  powerpc/86xx: Move pci1 definition to the include file
  powerpc/fsl: Fix build of the dtb embedded kernel images
  powerpc/fsl: Fix rcpm compatible string
  powerpc/fsl: Remove FSL_SOC dependency from FSL_LBC
  powerpc/fsl-pci: Add a workaround for PCI 5 errata
  powerpc/fsl: Fix SPI compatible on t208xrdb and t1040rdb
  powerpc/powernv/npu: Add PE to PHB's list
  powerpc/powernv: Fix insufficient memory allocation
  powerpc/iommu: Remove the dependency on EEH struct in DDW mechanism
  Revert "powerpc/eeh: Fix crash in eeh_add_device_early() on Cell"
  powerpc/eeh: Drop unnecessary label in eeh_pe_change_owner()
  powerpc/eeh: Ignore handlers in eeh_pe_reset_and_recover()
  powerpc/eeh: Restore initial state in eeh_pe_reset_and_recover()
  powerpc/eeh: Don't report error in eeh_pe_reset_and_recover()
  Revert "powerpc/powernv: Exclude root bus in pnv_pci_reset_secondary_bus()"
  powerpc/powernv/npu: Enable NVLink pass through
  powerpc/powernv/npu: Rework TCE Kill handling
  powerpc/powernv/npu: Add set/unset window helpers
  powerpc/powernv/ioda2: Export debug helper pe_level_printk()
  ...
2016-05-20 10:12:41 -07:00
Christophe Lombard
266eab8f32 cxl: Check periodically the coherent platform function's state
In the PowerVM environment, the PHYP CoherentAccel component manages
the state of the Coherent Accelerator Processor Interface adapter and
virtualizes CAPI resources, handles CAPP, PSL, PSL Slice errors - and
interrupts - and provides a new set of hcalls for the OS APIs to utilize
Accelerator Function Unit (AFU).

During the course of operation, a coherent platform function can
encounter errors. Some possible reason for errors are:
• Hardware recoverable and unrecoverable errors
• Transient and over-threshold correctable errors

PHYP implements its own state model for the coherent platform function.
The state of the AFU is available through a hcall.

The current implementation of the cxl driver, for the PowerVM
environment, checks this state of the AFU only when an action is
requested - open a device, ioctl command, memory map, attach/detach a
process - from an external driver - cxlflash, libcxl. If an error is
detected the cxl driver handles the error according the content of the
Power Architecture Platform Requirements document.

But in case of low-level troubles (or error injection), the PHYP
component may reset the card and change the AFU state. The PHYP
interface doesn't provide any way to be notified when that happens thus
implies that the cxl driver:
• cannot handle immediatly the state change of the AFU.
• cannot notify other drivers (cxlflash, ...)

The purpose of this patch is to wake up the cpu periodically to check
the current state of each AFU and to see if we need to enter an error
recovery path.

Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:11 +10:00
Ian Munsie
7a0d85d313 cxl: Add kernel API to allow a context to operate with relocate disabled
cxl devices typically access memory using an MMU in much the same way as
the CPU, and each context includes a state register much like the MSR in
the CPU. Like the CPU, the state register includes a bit to enable
relocation, which we currently always enable.

In some cases, it may be desirable to allow a device to access memory
using real addresses instead of effective addresses, so this adds a new
API, cxl_set_translation_mode, that can be used to disable relocation
on a given kernel context. This can allow for the creation of a special
privileged context that the device can use if it needs relocation
disabled, and can use regular contexts at times when it needs relocation
enabled.

This interface is only available to users of the kernel API for obvious
reasons, and will never be supported in a virtualised environment.

This will be used by the upcoming cxl support in the mlx5 driver.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:10 +10:00
Ian Munsie
3c206fa77a cxl: Ensure PSL interrupt is configured for contexts with no AFU IRQs
In the cxl kernel API, it is possible to create a context and start it
without allocating any interrupts. Since we assign or allocate the PSL
interrupt when allocating AFU interrupts this will lead to a situation
where we start the context with no means to take any faults.

The user API is not affected as it always goes through the cxl interrupt
allocation code paths and will have the PSL interrupt allocated or
assigned, even if no AFU interrupts were requested.

This checks that at least one interrupt is configured at the time of
attach, and if not it will assign the multiplexed PSL interrupt for
powernv, or allocate a single interrupt for PowerVM.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:10 +10:00
Ian Munsie
0e5b5ba17a cxl: Remove duplicate #defines
These defines are not used, but other equivalent definitions
(CXL_SPA_SW_CMD_*) are used. Remove the unused defines.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:09 +10:00
Ian Munsie
895a79805c cxl: Handle num_of_processes larger than can fit in the SPA
num_of_process is a 16 bit field, theoretically allowing an AFU to
support 16K processes, however the scheduled process area currently has
a maximum size of 1MB, which limits the maximum number of processes to
7704.

Some AFUs may not necessarily care what the limit is and just want to be
able to use the maximum by setting the field to 16K. To allow these to
work, detect this situation and use the maximum size for the SPA.

Downgrade the WARN_ON to a dev_warn.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-11 21:54:09 +10:00
Aneesh Kumar K.V
ac29c64089 powerpc/mm: Replace _PAGE_USER with _PAGE_PRIVILEGED
_PAGE_PRIVILEGED means the page can be accessed only by the kernel. This
is done to keep pte bits similar to PowerISA 3.0 Radix PTE format. User
pages are now marked by clearing _PAGE_PRIVILEGED bit.

Previously we allowed the kernel to have a privileged page in the lower
address range (USER_REGION). With this patch such access is denied.

We also prevent a kernel access to a non-privileged page in higher
address range (ie, REGION_ID != 0).

Both the above access scenarios should never happen.

Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Jeremy Kerr <jk@ozlabs.org>
Cc: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-01 18:32:26 +10:00
Aneesh Kumar K.V
c7d54842de powerpc/mm: Use _PAGE_READ to indicate Read access
This splits the _PAGE_RW bit into _PAGE_READ and _PAGE_WRITE. It also
removes the dependency on _PAGE_USER for implying read only. Few things
to note here is that, we have read implied with write and execute
permission. Hence we should always find _PAGE_READ set on hash pte
fault.

We still can't switch PROT_NONE to !(_PAGE_RWX). Auto numa depends on
marking a prot none pte _PAGE_WRITE. (For more details look at
b191f9b106 "mm: numa: preserve PTE write permissions across a NUMA
hinting fault")

Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Jeremy Kerr <jk@ozlabs.org>
Cc: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-05-01 18:32:21 +10:00
Michael Neuling
2bc79ffcbb cxl: Poll for outstanding IRQs when detaching a context
When detaching contexts, we may still have interrupts in the system
which are yet to be delivered to any CPU and be acked in the PSL.
This can result in a subsequent unrelated process getting an spurious
IRQ or an interrupt for a non-existent context.

This polls the PSL to ensure that the PSL is clear of IRQs for the
detached context, before removing the context from the idr.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Tested-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Tested-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-04-27 12:04:48 +10:00
Michael Neuling
d6776bba44 cxl: Keep IRQ mappings on context teardown
Keep IRQ mappings on context teardown.  This won't leak IRQs as if we
allocate the mapping again, the generic code will give the same
mapping used last time.

Doing this works around a race in the generic code. Masking the
interrupt introduces a race which can crash the kernel or result in
IRQ that is never EOIed. The lost of EOI results in all subsequent
mappings to the same HW IRQ never receiving an interrupt.

We've seen this race with cxl test cases which are doing heavy context
startup and teardown at the same time as heavy interrupt load.

A fix to the generic code is being investigated also.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Cc: stable@vger.kernel.org # 3.8
Tested-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Tested-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-04-27 12:04:31 +10:00
Aneesh Kumar K.V
3b1dbfa14f cxl: Fix DAR check & use REGION_ID instead of opencoding
The current code will set _PAGE_USER to the access flags for any
fault address, because the ~ operation will be true for all address we
take a fault on. But setting _PAGE_USER also means that the fault will
be handled only if the page table have _PAGE_USER set. Hence there is
no security hole with the current code.

Now if it is an user space access, then the change in this patch really
don't have an impact because we have (!ctx->kernel) set true
and we take the if condition true.

Now kernel context created fault on an address in the kernel range
will result in a fault loop because we will not insert the
hash pte due to access and pte permission mismatch. This patch fix
the above issue.

Fixes: f204e0b8ce ("cxl: Driver code for powernv PCIe based cards for userspace access")
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-04-26 21:06:36 +10:00
Frederic Barrat
4aec6ec0da cxl: Increase timeout for detection of AFU mmio hang
PSL designers recommend a larger value for the mmio hang pulse, 256 us
instead of 1 us. The CAIA architecture states that it needs to be
smaller than 1/2 of the RTOS timeout set in the PHB for outbound
non-posted transactions, which is still (easily) the case here.

Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Tested-by: Frank Haverkamp <haver@linux.vnet.ibm.com>
Tested-by: Manoj Kumar <manoj@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-04-22 21:45:50 +10:00
Frederic Barrat
e009a7e858 cxl: Allow initialization on timebase sync failures
Failure to synchronize the PSL timebase currently prevents the
initialization of the cxl card, thus rendering the card useless. This
is too extreme for a feature which is rarely used, if at all. No
hardware AFUs or software is currently using PSL timebase.

This patch still tries to synchronize the PSL timebase when the card
is initialized, but ignores the error if it can't. Instead, it reports
a status via /sys.

Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-04-22 21:45:44 +10:00
Markus Elfring
1050e689a6 cxl: Delete an unnecessary check before the function call "kfree"
The kfree() function tests whether its argument is NULL and then
returns immediately. Thus the test around the call is not needed.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-04-12 21:05:21 +10:00
Philippe Bergheaud
aa14138a51 cxl: Configure the PSL for two CAPI ports on POWER8NVL
The POWER8NVL chip has two CAPI ports.  Configure the PSL to route
data to the port corresponding to the CAPP unit.

Signed-off-by: Philippe Bergheaud <felix@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-04-11 20:30:46 +10:00
Frederic Barrat
d88e397f12 cxl: Remove dead code
Function cxl_get_phys_dev() was removed from the kernel API by a
previous patch, but it's actually dead code. Remove it.

Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-04-11 11:23:39 +10:00
Linus Torvalds
d5e2d00898 powerpc updates for 4.6
Highlights:
  - Restructure Linux PTE on Book3S/64 to Radix format from Paul Mackerras
  - Book3s 64 MMU cleanup in preparation for Radix MMU from Aneesh Kumar K.V
  - Add POWER9 cputable entry from Michael Neuling
  - FPU/Altivec/VSX save/restore optimisations from Cyril Bur
  - Add support for new ftrace ABI on ppc64le from Torsten Duwe
 
 Various cleanups & minor fixes from:
  - Adam Buchbinder, Andrew Donnellan, Balbir Singh, Christophe Leroy, Cyril
    Bur, Luis Henriques, Madhavan Srinivasan, Pan Xinhui, Russell Currey,
    Sukadev Bhattiprolu, Suraj Jitindar Singh.
 
 General:
  - atomics: Allow architectures to define their own __atomic_op_* helpers from
    Boqun Feng
  - Implement atomic{, 64}_*_return_* variants and acquire/release/relaxed
    variants for (cmp)xchg from Boqun Feng
  - Add powernv_defconfig from Jeremy Kerr
  - Fix BUG_ON() reporting in real mode from Balbir Singh
  - Add xmon command to dump OPAL msglog from Andrew Donnellan
  - Add xmon command to dump process/task similar to ps(1) from Douglas Miller
  - Clean up memory hotplug failure paths from David Gibson
 
 pci/eeh:
  - Redesign SR-IOV on PowerNV to give absolute isolation between VFs from Wei
    Yang.
  - EEH Support for SRIOV VFs from Wei Yang and Gavin Shan.
  - PCI/IOV: Rename and export virtfn_{add, remove} from Wei Yang
  - PCI: Add pcibios_bus_add_device() weak function from Wei Yang
  - MAINTAINERS: Update EEH details and maintainership from Russell Currey
 
 cxl:
  - Support added to the CXL driver for running on both bare-metal and
    hypervisor systems, from Christophe Lombard and Frederic Barrat.
  - Ignore probes for virtual afu pci devices from Vaibhav Jain
 
 perf:
  - Export Power8 generic and cache events to sysfs from Sukadev Bhattiprolu
  - hv-24x7: Fix usage with chip events, display change in counter values,
    display domain indices in sysfs, eliminate domain suffix in event names,
    from Sukadev Bhattiprolu
 
 Freescale:
  - Updates from Scott: "Highlights include 8xx optimizations, 32-bit checksum
    optimizations, 86xx consolidation, e5500/e6500 cpu hotplug, more fman and
    other dt bits, and minor fixes/cleanup."
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJW69OrAAoJEFHr6jzI4aWAe5EQAJw/hE6WBQc6a7Tj70AnXOqR
 qk/m5pZjuTwQxfBteIvHR1pE5eXdlvtAjcD254LVkFkAbIn19W/h2k0VX/nlee7P
 n/VRHRifjtGmukqHrPYJJ7ua9mNlY7pxh3leGSixBFASnSWqMxNNNziNQtSTcuCs
 TjHiw6NkZ/kzeunA4bAfE4yHVUZjmL74oiS9JbLyaVHqoW4fqWLlh26AKo2yYMZI
 qPicBBG4HBi3FGvoexnKxlJNdcV4HO7LzDjJmCSfUKYCJi+Pw19T5qmhso0q0qVz
 vHg/A8HNeG4Hn83pNVmLeQSAIQRZ3DvTtcLgbjPo+TVwm/hzrRRBWipTeOVbkLW8
 2bcOXT4t7LWUq15EAJ1LYgYZGzcLrfRfUeOcuQ1TWd3+PcfY9pE7FmizsxAAfaVe
 E9j9mpz4XnIqBtWkFHneTIHkQ5OWptyKuZJEaYH0nut4VsP0k8NarkseafGqBPu7
 5eG83gbiQbCVixfOgblV9eocJ29JcwpjPAY4CZSGJimShg909FV7WRgZgJkKWrbK
 dBRco8Jcp4VglGfo2qymv7Uj4KwQoypBREOhiKUvrAsVlDxPfx+bcskhjGu9xGDC
 xs/+nme0/lKa/wg5K4C3mQ1GAlkMWHI0ojhJjsyODbetup5UbkEu03wjAaTdO9dT
 Y6ptGm0rYAJluPNlziFj
 =qkAt
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:
 "This was delayed a day or two by some build-breakage on old toolchains
  which we've now fixed.

  There's two PCI commits both acked by Bjorn.

  There's one commit to mm/hugepage.c which is (co)authored by Kirill.

  Highlights:
   - Restructure Linux PTE on Book3S/64 to Radix format from Paul
     Mackerras
   - Book3s 64 MMU cleanup in preparation for Radix MMU from Aneesh
     Kumar K.V
   - Add POWER9 cputable entry from Michael Neuling
   - FPU/Altivec/VSX save/restore optimisations from Cyril Bur
   - Add support for new ftrace ABI on ppc64le from Torsten Duwe

  Various cleanups & minor fixes from:
   - Adam Buchbinder, Andrew Donnellan, Balbir Singh, Christophe Leroy,
     Cyril Bur, Luis Henriques, Madhavan Srinivasan, Pan Xinhui, Russell
     Currey, Sukadev Bhattiprolu, Suraj Jitindar Singh.

  General:
   - atomics: Allow architectures to define their own __atomic_op_*
     helpers from Boqun Feng
   - Implement atomic{, 64}_*_return_* variants and acquire/release/
     relaxed variants for (cmp)xchg from Boqun Feng
   - Add powernv_defconfig from Jeremy Kerr
   - Fix BUG_ON() reporting in real mode from Balbir Singh
   - Add xmon command to dump OPAL msglog from Andrew Donnellan
   - Add xmon command to dump process/task similar to ps(1) from Douglas
     Miller
   - Clean up memory hotplug failure paths from David Gibson

  pci/eeh:
   - Redesign SR-IOV on PowerNV to give absolute isolation between VFs
     from Wei Yang.
   - EEH Support for SRIOV VFs from Wei Yang and Gavin Shan.
   - PCI/IOV: Rename and export virtfn_{add, remove} from Wei Yang
   - PCI: Add pcibios_bus_add_device() weak function from Wei Yang
   - MAINTAINERS: Update EEH details and maintainership from Russell
     Currey

  cxl:
   - Support added to the CXL driver for running on both bare-metal and
     hypervisor systems, from Christophe Lombard and Frederic Barrat.
   - Ignore probes for virtual afu pci devices from Vaibhav Jain

  perf:
   - Export Power8 generic and cache events to sysfs from Sukadev
     Bhattiprolu
   - hv-24x7: Fix usage with chip events, display change in counter
     values, display domain indices in sysfs, eliminate domain suffix in
     event names, from Sukadev Bhattiprolu

  Freescale:
   - Updates from Scott: "Highlights include 8xx optimizations, 32-bit
     checksum optimizations, 86xx consolidation, e5500/e6500 cpu
     hotplug, more fman and other dt bits, and minor fixes/cleanup"

* tag 'powerpc-4.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (179 commits)
  powerpc: Fix unrecoverable SLB miss during restore_math()
  powerpc/8xx: Fix do_mtspr_cpu6() build on older compilers
  powerpc/rcpm: Fix build break when SMP=n
  powerpc/book3e-64: Use hardcoded mttmr opcode
  powerpc/fsl/dts: Add "jedec,spi-nor" flash compatible
  powerpc/T104xRDB: add tdm riser card node to device tree
  powerpc32: PAGE_EXEC required for inittext
  powerpc/mpc85xx: Add pcsphy nodes to FManV3 device tree
  powerpc/mpc85xx: Add MDIO bus muxing support to the board device tree(s)
  powerpc/86xx: Introduce and use common dtsi
  powerpc/86xx: Update device tree
  powerpc/86xx: Move dts files to fsl directory
  powerpc/86xx: Switch to kconfig fragments approach
  powerpc/86xx: Update defconfigs
  powerpc/86xx: Consolidate common platform code
  powerpc32: Remove one insn in mulhdu
  powerpc32: small optimisation in flush_icache_range()
  powerpc: Simplify test in __dma_sync()
  powerpc32: move xxxxx_dcache_range() functions inline
  powerpc32: Remove clear_pages() and define clear_page() inline
  ...
2016-03-19 15:38:41 -07:00
Linus Torvalds
8eee93e257 Char/Misc patches for 4.6-rc1
Here is the big char/misc driver update for 4.6-rc1.
 
 The majority of the patches here is hwtracing and some new mic drivers,
 but there's a lot of other driver updates as well.  Full details in the
 shortlog.
 
 All have been in linux-next for a while with no reported issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iEYEABECAAYFAlbp9IcACgkQMUfUDdst+ykyJgCeLTC2QNGrh51kiJglkVJ0yD36
 q4MAn0NkvSX2+iv5Jq8MaX6UQoRa4Nun
 =MNjR
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-4.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc updates from Greg KH:
 "Here is the big char/misc driver update for 4.6-rc1.

  The majority of the patches here is hwtracing and some new mic
  drivers, but there's a lot of other driver updates as well.  Full
  details in the shortlog.

  All have been in linux-next for a while with no reported issues"

* tag 'char-misc-4.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (238 commits)
  goldfish: Fix build error of missing ioremap on UM
  nvmem: mediatek: Fix later provider initialization
  nvmem: imx-ocotp: Fix return value of imx_ocotp_read
  nvmem: Fix dependencies for !HAS_IOMEM archs
  char: genrtc: replace blacklist with whitelist
  drivers/hwtracing: make coresight-etm-perf.c explicitly non-modular
  drivers: char: mem: fix IS_ERROR_VALUE usage
  char: xillybus: Fix internal data structure initialization
  pch_phub: return -ENODATA if ROM can't be mapped
  Drivers: hv: vmbus: Support kexec on ws2012 r2 and above
  Drivers: hv: vmbus: Support handling messages on multiple CPUs
  Drivers: hv: utils: Remove util transport handler from list if registration fails
  Drivers: hv: util: Pass the channel information during the init call
  Drivers: hv: vmbus: avoid unneeded compiler optimizations in vmbus_wait_for_unload()
  Drivers: hv: vmbus: remove code duplication in message handling
  Drivers: hv: vmbus: avoid wait_for_completion() on crash
  Drivers: hv: vmbus: don't loose HVMSG_TIMER_EXPIRED messages
  misc: at24: replace memory_accessor with nvmem_device_read
  eeprom: 93xx46: extend driver to plug into the NVMEM framework
  eeprom: at25: extend driver to plug into the NVMEM framework
  ...
2016-03-17 13:47:50 -07:00
Linus Torvalds
63e30271b0 PCI changes for the v4.6 merge window:
Enumeration
     Disable IO/MEM decoding for devices with non-compliant BARs (Bjorn Helgaas)
     Mark Broadwell-EP Home Agent & PCU as having non-compliant BARs (Bjorn Helgaas
 
   Resource management
     Mark shadow copy of VGA ROM as IORESOURCE_PCI_FIXED (Bjorn Helgaas)
     Don't assign or reassign immutable resources (Bjorn Helgaas)
     Don't enable/disable ROM BAR if we're using a RAM shadow copy (Bjorn Helgaas)
     Set ROM shadow location in arch code, not in PCI core (Bjorn Helgaas)
     Remove arch-specific IORESOURCE_ROM_SHADOW size from sysfs (Bjorn Helgaas)
     ia64: Use ioremap() instead of open-coded equivalent (Bjorn Helgaas)
     ia64: Keep CPU physical (not virtual) addresses in shadow ROM resource (Bjorn Helgaas)
     MIPS: Keep CPU physical (not virtual) addresses in shadow ROM resource (Bjorn Helgaas)
     Remove unused IORESOURCE_ROM_COPY and IORESOURCE_ROM_BIOS_COPY (Bjorn Helgaas)
     Don't leak memory if sysfs_create_bin_file() fails (Bjorn Helgaas)
     rcar: Remove PCI_PROBE_ONLY handling (Lorenzo Pieralisi)
     designware: Remove PCI_PROBE_ONLY handling (Lorenzo Pieralisi)
 
   Virtualization
     Wait for up to 1000ms after FLR reset (Alex Williamson)
     Support SR-IOV on any function type (Kelly Zytaruk)
     Add ACS quirk for all Cavium devices (Manish Jaggi)
 
   AER
     Rename pci_ops_aer to aer_inj_pci_ops (Bjorn Helgaas)
     Restore pci_ops pointer while calling original pci_ops (David Daney)
     Fix aer_inject error codes (Jean Delvare)
     Use dev_warn() in aer_inject (Jean Delvare)
     Log actual error causes in aer_inject (Jean Delvare)
     Log aer_inject error injections (Jean Delvare)
 
   VPD
     Prevent VPD access for buggy devices (Babu Moger)
     Move pci_read_vpd() and pci_write_vpd() close to other VPD code (Bjorn Helgaas)
     Move pci_vpd_release() from header file to pci/access.c (Bjorn Helgaas)
     Remove struct pci_vpd_ops.release function pointer (Bjorn Helgaas)
     Rename VPD symbols to remove unnecessary "pci22" (Bjorn Helgaas)
     Fold struct pci_vpd_pci22 into struct pci_vpd (Bjorn Helgaas)
     Sleep rather than busy-wait for VPD access completion (Bjorn Helgaas)
     Update VPD definitions (Hannes Reinecke)
     Allow access to VPD attributes with size 0 (Hannes Reinecke)
     Determine actual VPD size on first access (Hannes Reinecke)
 
   Generic host bridge driver
     Move structure definitions to separate header file (David Daney)
     Add pci_host_common_probe(), based on gen_pci_probe() (David Daney)
     Expose pci_host_common_probe() for use by other drivers (David Daney)
 
   Altera host bridge driver
     Fix altera_pcie_link_is_up() (Ley Foon Tan)
 
   Cavium ThunderX host bridge driver
     Add PCIe host driver for ThunderX processors (David Daney)
     Add driver for ThunderX-pass{1,2} on-chip devices (David Daney)
 
   Freescale i.MX6 host bridge driver
     Add DT bindings to configure PHY Tx driver settings (Justin Waters)
     Move imx6_pcie_reset_phy() near other PHY handling functions (Lucas Stach)
     Move PHY reset into imx6_pcie_establish_link() (Lucas Stach)
     Remove broken Gen2 workaround (Lucas Stach)
     Move link up check into imx6_pcie_wait_for_link() (Lucas Stach)
 
   Freescale Layerscape host bridge driver
     Add "fsl,ls2085a-pcie" compatible ID (Yang Shi)
 
   Intel VMD host bridge driver
     Attach VMD resources to parent domain's resource tree (Jon Derrick)
     Set bus resource start to 0 (Keith Busch)
 
   Microsoft Hyper-V host bridge driver
     Add fwnode_handle to x86 pci_sysdata (Jake Oshins)
     Look up IRQ domain by fwnode_handle (Jake Oshins)
     Add paravirtual PCI front-end for Microsoft Hyper-V VMs (Jake Oshins)
 
   NVIDIA Tegra host bridge driver
     Add pci_ops.{add,remove}_bus() callbacks (Thierry Reding)
     Implement ->{add,remove}_bus() callbacks (Thierry Reding)
     Remove unused struct tegra_pcie.num_ports field (Thierry Reding)
     Track bus -> CPU mapping (Thierry Reding)
     Remove misleading PHYS_OFFSET (Thierry Reding)
 
   Renesas R-Car host bridge driver
     Depend on ARCH_RENESAS, not ARCH_SHMOBILE (Simon Horman)
 
   Synopsys DesignWare host bridge driver
     ARC: Add PCI support (Joao Pinto)
     Add generic dw_pcie_wait_for_link() (Joao Pinto)
     Add default link up check if sub-driver doesn't override (Joao Pinto)
     Add driver for prototyping kits based on ARC SDP (Joao Pinto)
 
   TI Keystone host bridge driver
     Defer probing if devm_phy_get() returns -EPROBE_DEFER (Shawn Lin)
 
   Xilinx AXI host bridge driver
     Use of_pci_get_host_bridge_resources() to parse DT (Bharat Kumar Gogada)
     Remove dependency on ARM-specific struct hw_pci (Bharat Kumar Gogada)
     Don't call pci_fixup_irqs() on Microblaze (Bharat Kumar Gogada)
     Update Zynq binding with Microblaze node (Bharat Kumar Gogada)
     microblaze: Support generic Xilinx AXI PCIe Host Bridge IP driver (Bharat Kumar Gogada)
 
   Xilinx NWL host bridge driver
     Add support for Xilinx NWL PCIe Host Controller (Bharat Kumar Gogada)
 
   Miscellaneous
     Check device_attach() return value always (Bjorn Helgaas)
     Move pci_set_flags() from asm-generic/pci-bridge.h to linux/pci.h (Bjorn Helgaas)
     Remove includes of empty asm-generic/pci-bridge.h (Bjorn Helgaas)
     ARM64: Remove generated include of asm-generic/pci-bridge.h (Bjorn Helgaas)
     Remove empty asm-generic/pci-bridge.h (Bjorn Helgaas)
     Remove includes of asm/pci-bridge.h (Bjorn Helgaas)
     Consolidate PCI DMA constants and interfaces in linux/pci-dma-compat.h (Bjorn Helgaas)
     unicore32: Remove unused HAVE_ARCH_PCI_SET_DMA_MASK definition (Bjorn Helgaas)
     Cleanup pci/pcie/Kconfig whitespace (Andreas Ziegler)
     Include pci/hotplug Kconfig directly from pci/Kconfig (Bjorn Helgaas)
     Include pci/pcie/Kconfig directly from pci/Kconfig (Bogicevic Sasa)
     frv: Remove stray pci_{alloc,free}_consistent() declaration (Christoph Hellwig)
     Move pci_dma_* helpers to common code (Christoph Hellwig)
     Add PCI_CLASS_SERIAL_USB_DEVICE definition (Heikki Krogerus)
     Add QEMU top-level IDs for (sub)vendor & device (Robin H. Johnson)
     Fix broken URL for Dell biosdevname (Naga Venkata Sai Indubhaskar Jupudi)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJW6XgMAAoJEFmIoMA60/r8Yq4P/1nNwwZPikU+9Z8k0HyGPll6
 vqXBOYj/wlbAxJTzH2weaoyUamFrwvsKaO3Vap3xHkAeTFPD/Dp0TipCCNMrZ82Z
 j1y83JJpenkRyX6ifLARCNYpOtvnvgzSrO9x7Sb2Xfqb64dPb7+jGAfOpGNzhKsO
 n1nj/L7RGx8Q6fNFGf8ANMXKTsdkdL+1pdwegjUXmD5WdOT+oW8DmqVbhyfSKwl0
 E8r4Ml2lIg7Qd5Wu5iKMIBsR0+5HEyrwV7ch92wXChwKfoRwG70qnn7FGdc0y5ZB
 XvJuj8UD5UeMxEUeoRa9SwU6wWQT3Q9e6BzMS+P+43z36SPYjMfy/Xffv054z/bY
 rQomLjuGxNLESpmfNK5JfKxWoe2YNXjHQIDWMrAHyNlwdKJbYiwPcxnZJhvOa/eB
 p0QYcGS7O43STjibG9PZhzeq8tuSJRshxi0W6iB9QlqO8qs8nJQxIO+sZj/vl4yz
 lSnswWcV9062KITl8Fe9xDw244/RTz1xSVCdldlSoDhJyeMOjRvzS8raUMyyVmbA
 YULsI3l2iCl+fwDm/T21o7hJG966oYdAmgEv7lc7BWfgEAMg//LZXvMzVvrPFB2D
 R77u/0idtOciVJrmnO/x9DnQO2hzro9SLmVH6m0+0YU4wSSpZfGn98PCrtkatOAU
 c8zT9dJgyJVE3Z7cnPJ4
 =otsF
 -----END PGP SIGNATURE-----

Merge tag 'pci-v4.6-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI updates from Bjorn Helgaas:
 "PCI changes for v4.6:

  Enumeration:
   - Disable IO/MEM decoding for devices with non-compliant BARs (Bjorn Helgaas)
   - Mark Broadwell-EP Home Agent & PCU as having non-compliant BARs (Bjorn Helgaas

  Resource management:
   - Mark shadow copy of VGA ROM as IORESOURCE_PCI_FIXED (Bjorn Helgaas)
   - Don't assign or reassign immutable resources (Bjorn Helgaas)
   - Don't enable/disable ROM BAR if we're using a RAM shadow copy (Bjorn Helgaas)
   - Set ROM shadow location in arch code, not in PCI core (Bjorn Helgaas)
   - Remove arch-specific IORESOURCE_ROM_SHADOW size from sysfs (Bjorn Helgaas)
   - ia64: Use ioremap() instead of open-coded equivalent (Bjorn Helgaas)
   - ia64: Keep CPU physical (not virtual) addresses in shadow ROM resource (Bjorn Helgaas)
   - MIPS: Keep CPU physical (not virtual) addresses in shadow ROM resource (Bjorn Helgaas)
   - Remove unused IORESOURCE_ROM_COPY and IORESOURCE_ROM_BIOS_COPY (Bjorn Helgaas)
   - Don't leak memory if sysfs_create_bin_file() fails (Bjorn Helgaas)
   - rcar: Remove PCI_PROBE_ONLY handling (Lorenzo Pieralisi)
   - designware: Remove PCI_PROBE_ONLY handling (Lorenzo Pieralisi)

  Virtualization:
   - Wait for up to 1000ms after FLR reset (Alex Williamson)
   - Support SR-IOV on any function type (Kelly Zytaruk)
   - Add ACS quirk for all Cavium devices (Manish Jaggi)

  AER:
   - Rename pci_ops_aer to aer_inj_pci_ops (Bjorn Helgaas)
   - Restore pci_ops pointer while calling original pci_ops (David Daney)
   - Fix aer_inject error codes (Jean Delvare)
   - Use dev_warn() in aer_inject (Jean Delvare)
   - Log actual error causes in aer_inject (Jean Delvare)
   - Log aer_inject error injections (Jean Delvare)

  VPD:
   - Prevent VPD access for buggy devices (Babu Moger)
   - Move pci_read_vpd() and pci_write_vpd() close to other VPD code (Bjorn Helgaas)
   - Move pci_vpd_release() from header file to pci/access.c (Bjorn Helgaas)
   - Remove struct pci_vpd_ops.release function pointer (Bjorn Helgaas)
   - Rename VPD symbols to remove unnecessary "pci22" (Bjorn Helgaas)
   - Fold struct pci_vpd_pci22 into struct pci_vpd (Bjorn Helgaas)
   - Sleep rather than busy-wait for VPD access completion (Bjorn Helgaas)
   - Update VPD definitions (Hannes Reinecke)
   - Allow access to VPD attributes with size 0 (Hannes Reinecke)
   - Determine actual VPD size on first access (Hannes Reinecke)

  Generic host bridge driver:
   - Move structure definitions to separate header file (David Daney)
   - Add pci_host_common_probe(), based on gen_pci_probe() (David Daney)
   - Expose pci_host_common_probe() for use by other drivers (David Daney)

  Altera host bridge driver:
   - Fix altera_pcie_link_is_up() (Ley Foon Tan)

  Cavium ThunderX host bridge driver:
   - Add PCIe host driver for ThunderX processors (David Daney)
   - Add driver for ThunderX-pass{1,2} on-chip devices (David Daney)

  Freescale i.MX6 host bridge driver:
   - Add DT bindings to configure PHY Tx driver settings (Justin Waters)
   - Move imx6_pcie_reset_phy() near other PHY handling functions (Lucas Stach)
   - Move PHY reset into imx6_pcie_establish_link() (Lucas Stach)
   - Remove broken Gen2 workaround (Lucas Stach)
   - Move link up check into imx6_pcie_wait_for_link() (Lucas Stach)

  Freescale Layerscape host bridge driver:
   - Add "fsl,ls2085a-pcie" compatible ID (Yang Shi)

  Intel VMD host bridge driver:
   - Attach VMD resources to parent domain's resource tree (Jon Derrick)
   - Set bus resource start to 0 (Keith Busch)

  Microsoft Hyper-V host bridge driver:
   - Add fwnode_handle to x86 pci_sysdata (Jake Oshins)
   - Look up IRQ domain by fwnode_handle (Jake Oshins)
   - Add paravirtual PCI front-end for Microsoft Hyper-V VMs (Jake Oshins)

  NVIDIA Tegra host bridge driver:
   - Add pci_ops.{add,remove}_bus() callbacks (Thierry Reding)
   - Implement ->{add,remove}_bus() callbacks (Thierry Reding)
   - Remove unused struct tegra_pcie.num_ports field (Thierry Reding)
   - Track bus -> CPU mapping (Thierry Reding)
   - Remove misleading PHYS_OFFSET (Thierry Reding)

  Renesas R-Car host bridge driver:
   - Depend on ARCH_RENESAS, not ARCH_SHMOBILE (Simon Horman)

  Synopsys DesignWare host bridge driver:
   - ARC: Add PCI support (Joao Pinto)
   - Add generic dw_pcie_wait_for_link() (Joao Pinto)
   - Add default link up check if sub-driver doesn't override (Joao Pinto)
   - Add driver for prototyping kits based on ARC SDP (Joao Pinto)

  TI Keystone host bridge driver:
   - Defer probing if devm_phy_get() returns -EPROBE_DEFER (Shawn Lin)

  Xilinx AXI host bridge driver:
   - Use of_pci_get_host_bridge_resources() to parse DT (Bharat Kumar Gogada)
   - Remove dependency on ARM-specific struct hw_pci (Bharat Kumar Gogada)
   - Don't call pci_fixup_irqs() on Microblaze (Bharat Kumar Gogada)
   - Update Zynq binding with Microblaze node (Bharat Kumar Gogada)
   - microblaze: Support generic Xilinx AXI PCIe Host Bridge IP driver (Bharat Kumar Gogada)

  Xilinx NWL host bridge driver:
   - Add support for Xilinx NWL PCIe Host Controller (Bharat Kumar Gogada)

  Miscellaneous:
   - Check device_attach() return value always (Bjorn Helgaas)
   - Move pci_set_flags() from asm-generic/pci-bridge.h to linux/pci.h (Bjorn Helgaas)
   - Remove includes of empty asm-generic/pci-bridge.h (Bjorn Helgaas)
   - ARM64: Remove generated include of asm-generic/pci-bridge.h (Bjorn Helgaas)
   - Remove empty asm-generic/pci-bridge.h (Bjorn Helgaas)
   - Remove includes of asm/pci-bridge.h (Bjorn Helgaas)
   - Consolidate PCI DMA constants and interfaces in linux/pci-dma-compat.h (Bjorn Helgaas)
   - unicore32: Remove unused HAVE_ARCH_PCI_SET_DMA_MASK definition (Bjorn Helgaas)
   - Cleanup pci/pcie/Kconfig whitespace (Andreas Ziegler)
   - Include pci/hotplug Kconfig directly from pci/Kconfig (Bjorn Helgaas)
   - Include pci/pcie/Kconfig directly from pci/Kconfig (Bogicevic Sasa)
   - frv: Remove stray pci_{alloc,free}_consistent() declaration (Christoph Hellwig)
   - Move pci_dma_* helpers to common code (Christoph Hellwig)
   - Add PCI_CLASS_SERIAL_USB_DEVICE definition (Heikki Krogerus)
   - Add QEMU top-level IDs for (sub)vendor & device (Robin H. Johnson)
   - Fix broken URL for Dell biosdevname (Naga Venkata Sai Indubhaskar Jupudi)"

* tag 'pci-v4.6-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (94 commits)
  PCI: Add PCI_CLASS_SERIAL_USB_DEVICE definition
  PCI: designware: Add driver for prototyping kits based on ARC SDP
  PCI: designware: Add default link up check if sub-driver doesn't override
  PCI: designware: Add generic dw_pcie_wait_for_link()
  PCI: Cleanup pci/pcie/Kconfig whitespace
  PCI: Simplify pci_create_attr() control flow
  PCI: Don't leak memory if sysfs_create_bin_file() fails
  PCI: Simplify sysfs ROM cleanup
  PCI: Remove unused IORESOURCE_ROM_COPY and IORESOURCE_ROM_BIOS_COPY
  MIPS: Loongson 3: Keep CPU physical (not virtual) addresses in shadow ROM resource
  MIPS: Loongson 3: Use temporary struct resource * to avoid repetition
  ia64/PCI: Keep CPU physical (not virtual) addresses in shadow ROM resource
  ia64/PCI: Use ioremap() instead of open-coded equivalent
  ia64/PCI: Use temporary struct resource * to avoid repetition
  PCI: Clean up pci_map_rom() whitespace
  PCI: Remove arch-specific IORESOURCE_ROM_SHADOW size from sysfs
  PCI: thunder: Add driver for ThunderX-pass{1,2} on-chip devices
  PCI: thunder: Add PCIe host driver for ThunderX processors
  PCI: generic: Expose pci_host_common_probe() for use by other drivers
  PCI: generic: Add pci_host_common_probe(), based on gen_pci_probe()
  ...
2016-03-16 14:45:55 -07:00
Vaibhav Jain
17eb3eef19 cxl: Ignore probes for virtual afu pci devices
Add a check at the beginning of cxl_probe function to ignore virtual pci
devices created for each afu registered. This fixes the the errors
messages logged about missing CXL vsec, when cxl probe is unable to
find necessary vsec entries in device pci config space. The error
message logged are of the form :

cxl-pci 0004:00:00.0: ABORTING: CXL VSEC not found!
cxl-pci 0004:00:00.0: cxl_init_adapter failed: -19

Cc: Ian Munsie <imunsie@au1.ibm.com>
Cc: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Reviewed-by: fbarrat@linux.vnet.ibm.com
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 23:40:03 +11:00
Frederic Barrat
0d3a13fbf1 cxl: Remove cxl_get_phys_dev() kernel API
The cxl_get_phys_dev() API returns a struct device pointer which could
belong to either a struct pci_dev (bare-metal) or platform_device
(powerVM). To avoid potential problems in drivers, remove that API. It
was introduced to allow drivers to read the VPD of the adapter, but
the cxl driver now provides the cxl_pci_read_adapter_vpd() API for
that purpose.

Co-authored-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 23:40:02 +11:00
Christophe Lombard
e7a801ad15 cxl: Add tracepoints around the cxl hcall
To ease debugging, add a few tracepoints around the cxl hcalls.

Co-authored-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 23:40:01 +11:00
Christophe Lombard
0d400f77c1 cxl: Adapter failure handling
Check the AFU state whenever an API is called. The hypervisor may
issue a reset of the adapter when it detects a fault. When it happens,
it launches an error recovery which will either move the AFU to a
permanent failure state, or in the disabled state.
If the AFU is found to be disabled, detach all existing contexts from
it before issuing a AFU reset to re-enable it.

Before detaching contexts, notify any kernel driver through the EEH
callbacks of the AFU pci device.

Co-authored-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 23:40:00 +11:00
Frederic Barrat
d601ea918b cxl: Support the cxl kernel API from a guest
Like on bare-metal, the cxl driver creates a virtual PHB and a pci
device for the AFU. The configuration space of the device is mapped to
the configuration record of the AFU.

Reuse the code defined in afu_cr_read8|16|32() when reading the
configuration space of the AFU device.

Even though the (virtual) AFU device is a pci device, the adapter is
not. So a driver using the cxl kernel API cannot read the VPD of the
adapter through the usual PCI interface. Therefore, we add a call to
the cxl kernel API:
ssize_t cxl_read_adapter_vpd(struct pci_dev *dev, void *buf, size_t count);

Co-authored-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 23:40:00 +11:00
Frederic Barrat
b40844aa55 cxl: Parse device tree and create cxl device(s) at boot
Add new entry point to scan the device tree at boot in a guest,
looking for cxl devices.

Co-authored-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 23:39:59 +11:00
Christophe Lombard
594ff7d067 cxl: Support to flash a new image on the adapter from a guest
The new flash.c file contains the logic to flash a new image on the
adapter, through a hcall. It is an iterative process, with chunks of
data of 1M at a time. There are also 2 phases: write and verify. The
flash operation itself is driven from a user-land tool.
Once flashing is successful, an rtas call is made to update the device
tree with the new properties values for the adapter and the AFU(s)

Add a new char device for the adapter, so that the flash tool can
access the card, even if there is no valid AFU on it.

Co-authored-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 23:39:56 +11:00
Christophe Lombard
4752876c71 cxl: sysfs support for guests
Filter out a few adapter parameters which don't make sense in a guest.
Document the changes.

Co-authored-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 23:39:40 +11:00
Christophe Lombard
14baf4d9c7 cxl: Add guest-specific code
The new of.c file contains code to parse the device tree to find out
about cxl adapters and AFUs.

guest.c implements the guest-specific callbacks for the backend API.

The process element ID is not known until the context is attached, so
we have to separate the context ID assigned by the cxl driver from the
process element ID visible to the user applications. In bare-metal,
the 2 IDs match.

Co-authored-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
[mpe: Fix SMP=n build, fix PSERIES=n build, minor whitespace fixes]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 23:36:52 +11:00
Christophe Lombard
cbffa3a514 cxl: Separate bare-metal fields in adapter and AFU data structures
Introduce sub-structures containing the bare-metal specific fields in
the structures describing the adapter (struct cxl) and AFU (struct
cxl_afu).
Update all their references.

Co-authored-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 13:05:54 +11:00
Christophe Lombard
444c4ba461 cxl: New hcalls to support cxl adapters
The hypervisor calls provide an interface with a coherent platform
facility and function. It matches version 0.16 of the 'PAPR changes'
document.

The following hcalls are supported:
H_ATTACH_CA_PROCESS    Attach a process element to a coherent platform
                       function.
H_DETACH_CA_PROCESS    Detach a process element from a coherent
                       platform function.
H_CONTROL_CA_FUNCTION  Allow the partition to manipulate or query
                       certain coherent platform function behaviors.
H_COLLECT_CA_INT_INFO  Collect interrupt info about a coherent.
                       platform function after an interrupt occurred
H_CONTROL_CA_FAULTS    Control the operation of a coherent platform
                       function after a fault occurs.
H_DOWNLOAD_CA_FACILITY Support for downloading a base adapter image to
                       the coherent platform facility, and for
                       validating the entire image after the download.
H_CONTROL_CA_FACILITY  Allow the partition to manipulate or query
                       certain coherent platform facility behaviors.

Co-authored-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 13:05:52 +11:00
Frederic Barrat
73d55c3b59 cxl: IRQ allocation for guests
The PSL interrupt cannot be multiplexed in a guest, as it is not
supported by the hypervisor. So an interrupt will be allocated
for it for each context. It will still be the first interrupt found in
the first interrupt range, but is treated almost like any other AFU
interrupt when creating/deleting the context. Only the handler is
different. Rework the code so that the range 0 is treated like the
other ranges.

Co-authored-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 13:05:50 +11:00
Frederic Barrat
6d625ed9a7 cxl: Update cxl_irq() prototype
The context parameter when calling cxl_irq() should be strongly typed.

Co-authored-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 13:05:49 +11:00
Frederic Barrat
ea2d1f95ef cxl: Isolate a few bare-metal-specific calls
A few functions are mostly common between bare-metal and guest and
just need minor tuning. To avoid crowding the backend API, introduce a
few 'if' based on the CPU being in HV mode.

Co-authored-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 13:05:47 +11:00
Frederic Barrat
2b04cf310b cxl: Rename some bare-metal specific functions
Rename a few functions, changing the 'cxl_' prefix to either
'cxl_pci_' or 'cxl_native_', to make clear that the implementation is
bare-metal specific.

Those functions will have an equivalent implementation for a guest in
a later patch.

Co-authored-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 13:05:46 +11:00
Frederic Barrat
5be587b111 cxl: Introduce implementation-specific API
The backend API (in cxl.h) lists some low-level functions whose
implementation is different on bare-metal and in a guest. Each
environment implements its own functions, and the common code uses
them through function pointers, defined in cxl_backend_ops

Co-authored-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 13:05:43 +11:00
Frederic Barrat
cca44c0192 cxl: Define process problem state area at attach time only
CXL kernel API was defining the process problem state area during
context initialization, making it possible to map the problem state
area before attaching the context. This won't work on a powerVM
guest. So force the logical behavior, like in userspace: attach first,
then map the problem state area.
Remove calls to cxl_assign_psn_space during init. The function is
already called on the attach paths.

Co-authored-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 13:05:41 +11:00
Frederic Barrat
d56d301b51 cxl: Move bare-metal specific code to specialized files
Move a few functions around to better separate code specific to
bare-metal environment from code which will be commonly used between
guest and bare-metal.

Code specific to bare-metal is meant to be in native.c or pci.c
only. It's basically anything which touches the card p1 registers,
some p2 registers not needed from a guest and the PCI interface.

Co-authored-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 13:05:39 +11:00
Christophe Lombard
8633186209 cxl: Move common code away from bare-metal-specific files
Move around some functions which will be accessed from the bare-metal
and guest environments.
Code in native.c and pci.c is meant to be bare-metal specific.
Other files contain code which may be shared with guests.

Co-authored-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-03-09 13:05:37 +11:00
Frederic Barrat
923adb1646 cxl: Fix PSL timebase synchronization detection
The PSL timebase synchronization is seemingly failing for
configuration not including VIRT_CPU_ACCOUNTING_NATIVE. The driver
shows the following trace in dmesg:
PSL: Timebase sync: giving up!

The PSL timebase register is actually syncing correctly, but the cxl
driver is not detecting it. Fix is to use the proper timebase-to-time
conversion.

Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org> # 4.3+
Acked-by: Michael Neuling <mikey@neuling.org>
Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reviewed-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-02-29 21:36:00 +11:00
Geliang Tang
85016ff33f misc: cxl: use kobj_to_dev()
Use kobj_to_dev() instead of open-coding it.

Signed-off-by: Geliang Tang <geliangtang@163.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-02-07 23:01:45 -08:00
Bjorn Helgaas
952bbcb078 PCI: Remove includes of asm/pci-bridge.h
Drivers should include asm/pci-bridge.h only when they need the arch-
specific things provided there.  Outside of the arch/ directories, the only
drivers that actually need things provided by asm/pci-bridge.h are the
powerpc RPA hotplug drivers in drivers/pci/hotplug/rpa*.

Remove the includes of asm/pci-bridge.h from the other drivers, adding an
include of linux/pci.h if necessary.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2016-02-05 16:29:28 -06:00
Linus Torvalds
f689b742f2 powerpc updates for 4.5
- Ground work for the new Power9 MMU from Aneesh Kumar K.V
  - Optimise FP/VMX/VSX context switching from Anton Blanchard
 
  - Various cleanups from Krzysztof Kozlowski, John Ogness, Rashmica Gupta,
    Russell Currey, Gavin Shan, Daniel Axtens, Michael Neuling, Andrew Donnellan
  - Allow wrapper to work on non-english system from Laurent Vivier
  - Add rN aliases to the pt_regs_offset table from Rashmica Gupta
  - Fix module autoload for rackmeter & axonram drivers from Luis de Bethencourt
  - Include KVM guest test in all interrupt vectors from Paul Mackerras
  - Fix DSCR inheritance over fork() from Anton Blanchard
  - Make value-returning atomics & {cmp}xchg* & their atomic_ versions fully ordered from Boqun Feng
  - Print MSR TM bits in oops messages from Michael Neuling
  - Add TM signal return & invalid stack selftests from Michael Neuling
  - Limit EPOW reset event warnings from Vipin K Parashar
  - Remove the Cell QPACE code from Rashmica Gupta
  - Append linux_banner to exception information in xmon from Rashmica Gupta
  - Add selftest to check if VSRs are corrupted from Rashmica Gupta
  - Remove broken GregorianDay() from Daniel Axtens
  - Import Anton's context_switch2 benchmark into selftests from Michael Ellerman
  - Add selftest script to test HMI functionality from Daniel Axtens
  - Remove obsolete OPAL v2 support from Stewart Smith
  - Make enter_rtas() private from Michael Ellerman
  - PPR exception cleanups from Michael Ellerman
  - Add page soft dirty tracking from Laurent Dufour
  - Add support for Nvlink NPUs from Alistair Popple
  - Add support for kexec on 476fpe from Alistair Popple
  - Enable kernel CPU dlpar from sysfs from Nathan Fontenot
  - Copy only required pieces of the mm_context_t to the paca from Michael Neuling
  - Add a kmsg_dumper that flushes OPAL console output on panic from Russell Currey
  - Implement save_stack_trace_regs() to enable kprobe stack tracing from Steven Rostedt
  - Add HWCAP bits for Power9 from Michael Ellerman
  - Fix _PAGE_PTE breaking swapoff from Aneesh Kumar K.V
  - Fix _PAGE_SWP_SOFT_DIRTY breaking swapoff from Hugh Dickins
  - scripts/recordmcount.pl: support data in text section on powerpc from Ulrich Weigand
  - Handle R_PPC64_ENTRY relocations in modules from Ulrich Weigand
 
  - cxl: Fix possible idr warning when contexts are released from Vaibhav Jain
  - cxl: use correct operator when writing pcie config space values from Andrew Donnellan
  - cxl: Fix DSI misses when the context owning task exits from Vaibhav Jain
  - cxl: fix build for GCC 4.6.x from Brian Norris
  - cxl: use -Werror only with CONFIG_PPC_WERROR from Brian Norris
  - cxl: Enable PCI device ID for future IBM CXL adapter from Uma Krishnan
 
  - Freescale updates from Scott: Highlights include moving QE code out of
    arch/powerpc (to be shared with arm), device tree updates, and minor fixes.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWmIxeAAoJEFHr6jzI4aWAA+cQAIXAw4WfVWJ2V4ZK+1eKfB57
 fdXG71PuXG+WYIWy71ly8keLHdzzD1NQ2OUB64bUVRq202nRgVc15ZYKRJ/FE/sP
 SkxaQ2AG/2kI2EflWshOi0Lu9qaZ+LMHJnszIqE/9lnGSB2kUI/cwsSXgziiMKXR
 XNci9v14SdDd40YV/6BSZXoxApwyq9cUbZ7rnzFLmz4hrFuKmB/L3LABDF8QcpH7
 sGt/YaHGOtqP0UX7h5KQTFLGe1OPvK6NWixSXeZKQ71ED6cho1iKUEOtBA9EZeIN
 QM5JdHFWgX8MMRA0OHAgidkSiqO38BXjmjkVYWoIbYz7Zax3ThmrDHB4IpFwWnk3
 l7WBykEXY7KEqpZzbh0GFGehZWzVZvLnNgDdvpmpk/GkPzeYKomBj7ZZfm3H1yGD
 BTHPwuWCTX+/K75yEVNO8aJO12wBg7DRl4IEwBgqhwU8ga4FvUOCJkm+SCxA1Dnn
 qlpS7qPwTXNIEfKMJcxp5X0KiwDY1EoOotd4glTN0jbeY5GEYcxe+7RQ302GrYxP
 zcc8EGLn8h6BtQvV3ypNHF5l6QeTW/0ZlO9c236tIuUQ5gQU39SQci7jQKsYjSzv
 BB1XdLHkbtIvYDkmbnr1elbeJCDbrWL9rAXRUTRyfuCzaFWTfZmfVNe8c8qwDMLk
 TUxMR/38aI7bLcIQjwj9
 =R5bX
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:
 "Core:
   - Ground work for the new Power9 MMU from Aneesh Kumar K.V
   - Optimise FP/VMX/VSX context switching from Anton Blanchard

  Misc:
   - Various cleanups from Krzysztof Kozlowski, John Ogness, Rashmica
     Gupta, Russell Currey, Gavin Shan, Daniel Axtens, Michael Neuling,
     Andrew Donnellan
   - Allow wrapper to work on non-english system from Laurent Vivier
   - Add rN aliases to the pt_regs_offset table from Rashmica Gupta
   - Fix module autoload for rackmeter & axonram drivers from Luis de
     Bethencourt
   - Include KVM guest test in all interrupt vectors from Paul Mackerras
   - Fix DSCR inheritance over fork() from Anton Blanchard
   - Make value-returning atomics & {cmp}xchg* & their atomic_ versions
     fully ordered from Boqun Feng
   - Print MSR TM bits in oops messages from Michael Neuling
   - Add TM signal return & invalid stack selftests from Michael Neuling
   - Limit EPOW reset event warnings from Vipin K Parashar
   - Remove the Cell QPACE code from Rashmica Gupta
   - Append linux_banner to exception information in xmon from Rashmica
     Gupta
   - Add selftest to check if VSRs are corrupted from Rashmica Gupta
   - Remove broken GregorianDay() from Daniel Axtens
   - Import Anton's context_switch2 benchmark into selftests from
     Michael Ellerman
   - Add selftest script to test HMI functionality from Daniel Axtens
   - Remove obsolete OPAL v2 support from Stewart Smith
   - Make enter_rtas() private from Michael Ellerman
   - PPR exception cleanups from Michael Ellerman
   - Add page soft dirty tracking from Laurent Dufour
   - Add support for Nvlink NPUs from Alistair Popple
   - Add support for kexec on 476fpe from Alistair Popple
   - Enable kernel CPU dlpar from sysfs from Nathan Fontenot
   - Copy only required pieces of the mm_context_t to the paca from
     Michael Neuling
   - Add a kmsg_dumper that flushes OPAL console output on panic from
     Russell Currey
   - Implement save_stack_trace_regs() to enable kprobe stack tracing
     from Steven Rostedt
   - Add HWCAP bits for Power9 from Michael Ellerman
   - Fix _PAGE_PTE breaking swapoff from Aneesh Kumar K.V
   - Fix _PAGE_SWP_SOFT_DIRTY breaking swapoff from Hugh Dickins
   - scripts/recordmcount.pl: support data in text section on powerpc
     from Ulrich Weigand
   - Handle R_PPC64_ENTRY relocations in modules from Ulrich Weigand

  cxl:
   - cxl: Fix possible idr warning when contexts are released from
     Vaibhav Jain
   - cxl: use correct operator when writing pcie config space values
     from Andrew Donnellan
   - cxl: Fix DSI misses when the context owning task exits from Vaibhav
     Jain
   - cxl: fix build for GCC 4.6.x from Brian Norris
   - cxl: use -Werror only with CONFIG_PPC_WERROR from Brian Norris
   - cxl: Enable PCI device ID for future IBM CXL adapter from Uma
     Krishnan

  Freescale:
   - Freescale updates from Scott: Highlights include moving QE code out
     of arch/powerpc (to be shared with arm), device tree updates, and
     minor fixes"

* tag 'powerpc-4.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (149 commits)
  powerpc/module: Handle R_PPC64_ENTRY relocations
  scripts/recordmcount.pl: support data in text section on powerpc
  powerpc/powernv: Fix OPAL_CONSOLE_FLUSH prototype and usages
  powerpc/mm: fix _PAGE_SWP_SOFT_DIRTY breaking swapoff
  powerpc/mm: Fix _PAGE_PTE breaking swapoff
  cxl: Enable PCI device ID for future IBM CXL adapter
  cxl: use -Werror only with CONFIG_PPC_WERROR
  cxl: fix build for GCC 4.6.x
  powerpc: Add HWCAP bits for Power9
  powerpc/powernv: Reserve PE#0 on NPU
  powerpc/powernv: Change NPU PE# assignment
  powerpc/powernv: Fix update of NVLink DMA mask
  powerpc/powernv: Remove misleading comment in pci.c
  powerpc: Implement save_stack_trace_regs() to enable kprobe stack tracing
  powerpc: Fix build break due to paca mm_context_t changes
  cxl: Fix DSI misses when the context owning task exits
  MAINTAINERS: Update Scott Wood's e-mail address
  powerpc/powernv: Fix minor off-by-one error in opal_mce_check_early_recovery()
  powerpc: Fix style of self-test config prompts
  powerpc/powernv: Only delay opal_rtc_read() retry when necessary
  ...
2016-01-15 13:18:47 -08:00
Uma Krishnan
68adb7bfd6 cxl: Enable PCI device ID for future IBM CXL adapter
Add support for future IBM Coherent Accelerator (CXL) device
with ID of 0x0601.

Signed-off-by: Uma Krishnan <ukrishn@linux.vnet.ibm.com>
Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-01-11 20:46:25 +11:00
Brian Norris
57f7c39325 cxl: use -Werror only with CONFIG_PPC_WERROR
Some developers really like to have -Werror enabled for their code, as
it helps to ensure warning free code. Others don't want -Werror, as it
(for example) can cause problems when newer (or older) compilers have
different sets of warnings, or new warnings can appear just when turning
up the warning level (e.g., make W=1 or W=2). Thus, it seems prudent to
have the use of -Werror be configurable.

It so happens that cxl is only built on PowerPC, and PowerPC already
has a nice set of Kconfig options for this, under CONFIG_PPC_WERROR. So
let's use that, and the world is a happy place again! (Note that
PPC_WERROR defaults to =y, so the common case compile should still be
enforcing -Werror.)

Fixes: d3d73f4b38 ("cxl: Compile with -Werror")
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-01-11 20:30:53 +11:00
Brian Norris
aa09545589 cxl: fix build for GCC 4.6.x
GCC 4.6.3 does not support -Wno-unused-const-variable. Instead, use the
kbuild infrastructure that checks if this options exists.

Fixes: 2cd55c68c0 ("cxl: Fix build failure due to -Wunused-variable behaviour change")
Suggested-by: Michal Marek <mmarek@suse.com>
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-01-11 20:30:52 +11:00
Vaibhav Jain
7b8ad495d5 cxl: Fix DSI misses when the context owning task exits
Presently when a user-space process issues CXL_IOCTL_START_WORK ioctl we
store the pid of the current task_struct and use it to get pointer to
the mm_struct of the process, while processing page or segment faults
from the capi card. However this causes issues when the thread that had
originally issued the start-work ioctl exits in which case the stored
pid is no more valid and the cxl driver is unable to handle faults as
the mm_struct corresponding to process is no more accessible.

This patch fixes this issue by using the mm_struct of the next alive
task in the thread group. This is done by iterating over all the tasks
in the thread group starting from thread group leader and calling
get_task_mm on each one of them. When a valid mm_struct is obtained the
pid of the associated task is stored in the context replacing the
exiting one for handling future faults.

The patch introduces a new function named get_mem_context that checks if
the current task pointed to by ctx->pid is dead? If yes it performs the
steps described above. Also a new variable cxl_context.glpid is
introduced which stores the pid of the thread group leader associated
with the context owning task.

Reported-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Reported-by: Frank Haverkamp <HAVERKAM@de.ibm.com>
Suggested-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-01-05 16:28:25 +11:00
Frederic Barrat
e606e035cc cxl: Set endianess of kernel contexts
A process element (defined in CAIA) keeps track of the endianess of
contexts through the Little Endian (LE) bit of the State Register. It
is currently set for user contexts, but was somehow forgotten for
kernel contexts, so this patch fixes it.
It could lead to erratic behavior from an AFU when the context is
attached through the kernel API.

Fixes: 2f663527bd ("cxl: Configure PSL for kernel contexts and merge code")
Cc: stable@vger.kernel.org # 4.2+
Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com>
Suggested-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-12-08 16:57:01 +11:00
Andrew Donnellan
48f0f6b717 cxl: use correct operator when writing pcie config space values
When writing a value to config space, cxl_pcie_write_config() calls
cxl_pcie_config_info() to obtain a mask and shift value, shifts the new
value accordingly, then uses the mask to combine the shifted value with the
existing value at the address as part of a read-modify-write pattern.

Currently, we use a logical OR operator rather than a bitwise OR operator,
which means any use of this function results in an incorrect value being
written. Replace the logical OR operator with a bitwise OR operator so the
value is written correctly.

Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: stable@vger.kernel.org
Fixes: 6f7f0b3df6 ("cxl: Add AFU virtual PHB and kernel API")
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-11-24 14:21:27 +11:00
Vaibhav Jain
1b5df59e50 cxl: Fix possible idr warning when contexts are released
An idr warning is reported when a context is release after the capi card
is unbound from the cxl driver via sysfs. Below are the steps to
reproduce:

1. Create multiple afu contexts in an user-space application using libcxl.
2. Unbind capi card from cxl using command of form
   echo <capi-card-pci-addr> > /sys/bus/pci/drivers/cxl-pci/unbind
3. Exit/kill the application owning afu contexts.

After above steps a warning message is usually seen in the kernel logs
of the form "idr_remove called for id=<context-id> which is not
allocated."

This is caused by the function cxl_release_afu which destroys the
contexts_idr table. So when a context is release no entry for context pe
is found in the contexts_idr table and idr code prints this warning.

This patch fixes this issue by increasing & decreasing the ref-count on
the afu device when a context is initialized or when its freed
respectively. This prevents the afu from being released until all the
afu contexts have been released. The patch introduces two new functions
namely cxl_afu_get/put that manage the ref-count on the afu device.

Also the patch removes code inside cxl_dev_context_init that increases ref
on the afu device as its guaranteed to be alive during this function.

Reported-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-11-24 14:21:27 +11:00
Linus Torvalds
2f4bf528ec powerpc updates for 4.4
- Kconfig: remove BE-only platforms from LE kernel build from Boqun Feng
  - Refresh ps3_defconfig from Geoff Levand
  - Emit GNU & SysV hashes for the vdso from Michael Ellerman
  - Define an enum for the bolted SLB indexes from Anshuman Khandual
  - Use a local to avoid multiple calls to get_slb_shadow() from Michael Ellerman
  - Add gettimeofday() benchmark from Michael Neuling
  - Avoid link stack corruption in __get_datapage() from Michael Neuling
  - Add virt_to_pfn and use this instead of opencoding from Aneesh Kumar K.V
  - Add ppc64le_defconfig from Michael Ellerman
  - pseries: extract of_helpers module from Andy Shevchenko
  - Correct string length in pseries_of_derive_parent() from Nathan Fontenot
  - Free the MSI bitmap if it was slab allocated from Denis Kirjanov
  - Shorten irq_chip name for the SIU from Christophe Leroy
  - Wait 1s for secondaries to enter OPAL during kexec from Samuel Mendoza-Jonas
  - Fix _ALIGN_* errors due to type difference. from Aneesh Kumar K.V
  - powerpc/pseries/hvcserver: don't memset pi_buff if it is null from Colin Ian King
  - Disable hugepd for 64K page size. from Aneesh Kumar K.V
  - Differentiate between hugetlb and THP during page walk from Aneesh Kumar K.V
  - Make PCI non-optional for pseries from Michael Ellerman
  - Individual System V IPC system calls from Sam bobroff
  - Add selftest of unmuxed IPC calls from Michael Ellerman
  - discard .exit.data at runtime from Stephen Rothwell
  - Delete old orphaned PrPMC 280/2800 DTS and boot file. from Paul Gortmaker
  - Use of_get_next_parent to simplify code from Christophe Jaillet
  - Paginate some xmon output from Sam bobroff
  - Add some more elements to the xmon PACA dump from Michael Ellerman
  - Allow the tm-syscall selftest to build with old headers from Michael Ellerman
  - Run EBB selftests only on POWER8 from Denis Kirjanov
  - Drop CONFIG_TUNE_CELL in favour of CONFIG_CELL_CPU from Michael Ellerman
  - Avoid reference to potentially freed memory in prom.c from Christophe Jaillet
  - Quieten boot wrapper output with run_cmd from Geoff Levand
  - EEH fixes and cleanups from Gavin Shan
  - Fix recursive fenced PHB on Broadcom shiner adapter from Gavin Shan
  - Use of_get_next_parent() in of_get_ibm_chip_id() from Michael Ellerman
  - Fix section mismatch warning in msi_bitmap_alloc() from Denis Kirjanov
  - Fix ps3-lpm white space from Rudhresh Kumar J
  - Fix ps3-vuart null dereference from Colin King
  - nvram: Add missing kfree in error path from Christophe Jaillet
  - nvram: Fix function name in some errors messages. from Christophe Jaillet
  - drivers/macintosh: adb: fix misleading Kconfig help text from Aaro Koskinen
  - agp/uninorth: fix a memleak in create_gatt_table from Denis Kirjanov
  - cxl: Free virtual PHB when removing from Andrew Donnellan
  - scripts/kconfig/Makefile: Allow KBUILD_DEFCONFIG to be a target from Michael Ellerman
  - scripts/kconfig/Makefile: Fix KBUILD_DEFCONFIG check when building with O= from Michael Ellerman
 
  - Freescale updates from Scott: Highlights include 64-bit book3e kexec/kdump
    support, a rework of the qoriq clock driver, device tree changes including
    qoriq fman nodes, support for a new 85xx board, and some fixes.
 
  - MPC5xxx updates from Anatolij: Highlights include a driver for MPC512x
    LocalPlus Bus FIFO with its device tree binding documentation, mpc512x
    device tree updates and some minor fixes.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWPEZgAAoJEFHr6jzI4aWANjYQAKX2Q/95hqKfCuF5FBcUmtMC
 Pu/Nff027MVzxZ2ApDcvvLGps5Nz2bn3nIhc9zjkXc5E8DuL6X3Yl8ce7qyNcc3g
 cJJ8RvtUo6J1OMWetXFehtPYniAAwKMhZYKnj0+WnLr2SyH/Vhl3ehDkFbGyPtuH
 r+2E7krFjfVgU+bzciIFnOaDekFuFN/pXWMb6e6zQyBJe9N8ZIp96uouGCebKVd0
 VDLItzdaKErT8JFfbymMPvZm3V0rMVx4WWu3kAbQX8LrD5a18NF1zrjAOHRXc61n
 kkk8/DPuNOon1PbXXyiS5BcFyZRe+KE3VBnoW5sOMqMIRg5WdO1oU3e2pEfXMO8+
 leXYwFLXiKzUZuOgQG2QiUhrzD2yC1o6/TJWATv0dSl9AwrecgPX+Vj6X357slAf
 A9E3eMy5tgnpndBWZmvZS3W7YDKH+NkeZ+Q40+NErAlqr++ErrTcKVndk5vWlYTT
 7mMZeTXagX66al/k5ATKqwB7iUSpnYHSAa9fcUYPSM2FnXsDxPyeJGkBbcoOmkGj
 QrpgNYOvJaUJd076goZCV39v0c1xpfV9/9kyVch8HUadf6JcjpVZwYnbGw2qlJjh
 ZanuBG2VOeSwaKQqXiRBSBetnpAg8CVpFjDmX9wOBfSek2wxEJqDX/vQExdbIDQQ
 pUs7vnUxLzhmW/x+ygOI
 =YwcM
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:

 - Kconfig: remove BE-only platforms from LE kernel build from Boqun
   Feng
 - Refresh ps3_defconfig from Geoff Levand
 - Emit GNU & SysV hashes for the vdso from Michael Ellerman
 - Define an enum for the bolted SLB indexes from Anshuman Khandual
 - Use a local to avoid multiple calls to get_slb_shadow() from Michael
   Ellerman
 - Add gettimeofday() benchmark from Michael Neuling
 - Avoid link stack corruption in __get_datapage() from Michael Neuling
 - Add virt_to_pfn and use this instead of opencoding from Aneesh Kumar
   K.V
 - Add ppc64le_defconfig from Michael Ellerman
 - pseries: extract of_helpers module from Andy Shevchenko
 - Correct string length in pseries_of_derive_parent() from Nathan
   Fontenot
 - Free the MSI bitmap if it was slab allocated from Denis Kirjanov
 - Shorten irq_chip name for the SIU from Christophe Leroy
 - Wait 1s for secondaries to enter OPAL during kexec from Samuel
   Mendoza-Jonas
 - Fix _ALIGN_* errors due to type difference, from Aneesh Kumar K.V
 - powerpc/pseries/hvcserver: don't memset pi_buff if it is null from
   Colin Ian King
 - Disable hugepd for 64K page size, from Aneesh Kumar K.V
 - Differentiate between hugetlb and THP during page walk from Aneesh
   Kumar K.V
 - Make PCI non-optional for pseries from Michael Ellerman
 - Individual System V IPC system calls from Sam bobroff
 - Add selftest of unmuxed IPC calls from Michael Ellerman
 - discard .exit.data at runtime from Stephen Rothwell
 - Delete old orphaned PrPMC 280/2800 DTS and boot file, from Paul
   Gortmaker
 - Use of_get_next_parent to simplify code from Christophe Jaillet
 - Paginate some xmon output from Sam bobroff
 - Add some more elements to the xmon PACA dump from Michael Ellerman
 - Allow the tm-syscall selftest to build with old headers from Michael
   Ellerman
 - Run EBB selftests only on POWER8 from Denis Kirjanov
 - Drop CONFIG_TUNE_CELL in favour of CONFIG_CELL_CPU from Michael
   Ellerman
 - Avoid reference to potentially freed memory in prom.c from Christophe
   Jaillet
 - Quieten boot wrapper output with run_cmd from Geoff Levand
 - EEH fixes and cleanups from Gavin Shan
 - Fix recursive fenced PHB on Broadcom shiner adapter from Gavin Shan
 - Use of_get_next_parent() in of_get_ibm_chip_id() from Michael
   Ellerman
 - Fix section mismatch warning in msi_bitmap_alloc() from Denis
   Kirjanov
 - Fix ps3-lpm white space from Rudhresh Kumar J
 - Fix ps3-vuart null dereference from Colin King
 - nvram: Add missing kfree in error path from Christophe Jaillet
 - nvram: Fix function name in some errors messages, from Christophe
   Jaillet
 - drivers/macintosh: adb: fix misleading Kconfig help text from Aaro
   Koskinen
 - agp/uninorth: fix a memleak in create_gatt_table from Denis Kirjanov
 - cxl: Free virtual PHB when removing from Andrew Donnellan
 - scripts/kconfig/Makefile: Allow KBUILD_DEFCONFIG to be a target from
   Michael Ellerman
 - scripts/kconfig/Makefile: Fix KBUILD_DEFCONFIG check when building
   with O= from Michael Ellerman
 - Freescale updates from Scott: Highlights include 64-bit book3e
   kexec/kdump support, a rework of the qoriq clock driver, device tree
   changes including qoriq fman nodes, support for a new 85xx board, and
   some fixes.
 - MPC5xxx updates from Anatolij: Highlights include a driver for
   MPC512x LocalPlus Bus FIFO with its device tree binding
   documentation, mpc512x device tree updates and some minor fixes.

* tag 'powerpc-4.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (106 commits)
  powerpc/msi: Fix section mismatch warning in msi_bitmap_alloc()
  powerpc/prom: Use of_get_next_parent() in of_get_ibm_chip_id()
  powerpc/pseries: Correct string length in pseries_of_derive_parent()
  powerpc/e6500: hw tablewalk: make sure we invalidate and write to the same tlb entry
  powerpc/mpc85xx: Add FSL QorIQ DPAA FMan support to the SoC device tree(s)
  powerpc/mpc85xx: Create dts components for the FSL QorIQ DPAA FMan
  powerpc/fsl: Add #clock-cells and clockgen label to clockgen nodes
  powerpc: handle error case in cpm_muram_alloc()
  powerpc: mpic: use IRQCHIP_SKIP_SET_WAKE instead of redundant mpic_irq_set_wake
  powerpc/book3e-64: Enable kexec
  powerpc/book3e-64/kexec: Set "r4 = 0" when entering spinloop
  powerpc/booke: Only use VIRT_PHYS_OFFSET on booke32
  powerpc/book3e-64/kexec: Enable SMP release
  powerpc/book3e-64/kexec: create an identity TLB mapping
  powerpc/book3e-64: Don't limit paca to 256 MiB
  powerpc/book3e/kdump: Enable crash_kexec_wait_realmode
  powerpc/book3e: support CONFIG_RELOCATABLE
  powerpc/booke64: Fix args to copy_and_flush
  powerpc/book3e-64: rename interrupt_end_book3e with __end_interrupts
  powerpc/e6500: kexec: Handle hardware threads
  ...
2015-11-05 23:38:43 -08:00
Andrew Donnellan
2e1a2556eb cxl: Free virtual PHB when removing
When adding a vPHB in cxl_pci_vphb_add(), we allocate a pci_controller
struct using pcibios_alloc_controller(). However, we don't free it in
cxl_pci_vphb_remove(), causing a leak.

Call pcibios_free_controller() in cxl_pci_vphb_remove() to free the vPHB
data structure correctly.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-15 20:31:58 +11:00
Christophe Lombard
4108efb02d cxl: Fix number of allocated pages in SPA
The scheduled process area is currently allocated before assigning the
correct maximum processes to the AFU, which will mean we only ever
allocate a fixed number of pages for the scheduled process area. This
will limit us to 958 processes with 2 x 64K pages. If we try to use more
processes than that we'd probably overrun the buffer and corrupt memory
or crash.

AFUs that require three or more interrupts per process will not be
affected as they are already limited to less processes than that, but we
could hit it on an AFU that requires 0, 1 or 2 interrupts per process,
or when using 4K pages.

This patch moves the initialisation of the num_procs to before the SPA
allocation so that enough pages will be allocated for the number of
processes that the AFU supports.

Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com>
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Cc: <stable@vger.kernel.org> # 3.18+
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-07 16:44:46 +11:00
Philippe Bergheaud
d79e6801b1 cxl: Workaround malformed pcie packets on some cards
This works around a pcie host bridge defect on some cards, that can cause
malformed Transaction Layer Packet (TLP) errors to be erroneously reported.

The upper nibble of the vendor section PSL revision is used to distinguish
between different cards. The affected ones have it set to 0.

Signed-off-by: Philippe Bergheaud <felix@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-06 20:59:46 +11:00
Andrew Donnellan
5f81b95fe2 cxl: fix leak of ctx->mapping when releasing kernel API contexts
When a context is created via the kernel API, ctx->mapping is allocated
within the kernel and thus needs to be freed when the context is freed.
reclaim_ctx() attempts to do this for contexts with the ctx->kernelapi flag
set, but afu_release() (which can be called from the kernel API through
cxl_fd_release()) sets ctx->mapping to NULL before calling
cxl_context_free() to free the context.

Add a check to afu_release() so that the mappings in contexts created via
the kernel API are left alone so reclaim_ctx() can free them.

Reported-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Fixes: 6f7f0b3df6 ("cxl: Add AFU virtual PHB and kernel API")
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-01 11:50:12 +10:00
Andrew Donnellan
52adee580d cxl: fix leak of ctx->irq_bitmap when releasing context via kernel API
At present, ctx->irq_bitmap is freed in afu_release_irqs(), which is called
from afu_release() via cxl_context_detach().

Move the freeing of ctx->irq_bitmap from afu_release_irqs() to
reclaim_ctx() (called through cxl_context_free()) so it's freed when
releasing a context via the kernel API (cxl_release_context()) or the
userspace API (afu_release()).

Reported-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Fixes: 6f7f0b3df6 ("cxl: Add AFU virtual PHB and kernel API")
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-01 11:49:32 +10:00
Andrew Donnellan
8dde152ea3 cxl: fix leak of IRQ names in cxl_free_afu_irqs()
cxl_free_afu_irqs() doesn't free IRQ names when it releases an AFU's IRQ
ranges. The userspace API equivalent in afu_release_irqs() calls
afu_irq_name_free() to release the IRQ names.

Call afu_irq_name_free() in cxl_free_afu_irqs() to release the IRQ names.
Make afu_irq_name_free() non-static to allow this.

Reported-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Fixes: 6f7f0b3df6 ("cxl: Add AFU virtual PHB and kernel API")
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-10-01 11:49:32 +10:00
Vaibhav Jain
d6eb71a6d2 cxl: Fix lockdep warning while creating afu_err_buff attribute
Presently a lockdep warning is reported during creation of afu_err_buff
bin_attribute for the afu. This is caused due to the variable attr.key
not pointing to a static class key, hence the function lockdep_init_map
reports this warning:

 BUG: key <some-address> not in .data!

The patch fixes this issue by calling sysfs_attr_init on the
attr_eb.attr structure before populating it with the afu_err_buff file
details. This will populate the attr.key variable with a static class
key so that lockdep_init_map stops complaining about the lockdep key not
being static.

Reported-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Reviewed-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-09-23 20:57:13 +10:00
Ian Munsie
2cd55c68c0 cxl: Fix build failure due to -Wunused-variable behaviour change
A recent change in gcc caused this build failure:

/var/lib/jenkins/workspace/gcc_kernel_build/linux/drivers/misc/cxl/cxl.h:72:27:
error: ‘CXL_PSL_DLCNTL’ defined but not used [-Werror=unused-const-variable]
 static const cxl_p1_reg_t CXL_PSL_DLCNTL  = {0x0060};

Because of this gcc commit:

Commit 1bca8cbd0c68366f07277f98ce6963e10c2aa617 by mark
PR28901 -Wunused-variable ignores unused const initialised variables in C
12 years ago it was decided that -Wunused-variable shouldn't warn about
static const variables because some code used const static char rcsid[]
strings which were never used but wanted in the code anyway. But as the
bug points out this hides some real bugs. These days the usage of
rcsids is not very popular anymore. So this patch changes the default
to warn about unused static const variables in C with
-Wunused-variable. And it adds a new option -Wno-unused-const-variable
to turn this warning off. For C++ this new warning is off by default,
since const variables can be used as #defines in C++. New testcases for
the new defaults in C and C++ are included testing the new warning and
suppressing it with an unused attribute or using
-Wno-unused-const-variable. gcc/ChangeLog

The cxl driver uses static consts in place of #defines in some cases
for type safety, so this change causes the driver to fail to build on
new copilers as these constants are not all used in every file that
imports the header. Suppress the warning for this driver to return to
the old behaviour of -Wunused-variable.

Reported-by: Anton Blanchard <anton@au1.ibm.com>
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-09-15 19:33:51 +10:00
Daniel Axtens
2925c2fdf1 cxl: Fix unbalanced pci_dev_get in cxl_probe
Currently the first thing we do in cxl_probe is to grab a reference
on the pci device. Later on, we call device_register on our adapter.
In our remove path, we call device_unregister, but we never call
pci_dev_put. We therefore leak the device every time we do a
reflash.

device_register/unregister is sufficient to hold the reference.
Therefore, drop the call to pci_dev_get.

Here's why this is safe.
The proposed cxl_probe(pdev) calls cxl_adapter_init:
    a) init calls cxl_adapter_alloc, which creates a struct cxl,
       conventionally called adapter. This struct contains a
       device entry, adapter->dev.

    b) init calls cxl_configure_adapter, where we set
       adapter->dev.parent = &dev->dev (here dev is the pci dev)

So at this point, the cxl adapter's device's parent is the PCI
device that I want to be refcounted properly.

    c) init calls cxl_register_adapter
       *) cxl_register_adapter calls device_register(&adapter->dev)

So now we're in device_register, where dev is the adapter device, and
we want to know if the PCI device is safe after we return.

device_register(&adapter->dev) calls device_initialize() and then
device_add().

device_add() does a get_device(). device_add() also explicitly grabs
the device's parent, and calls get_device() on it:

         parent = get_device(dev->parent);

So therefore, device_register() takes a lock on the parent PCI dev,
which is what pci_dev_get() was guarding. pci_dev_get() can therefore
be safely removed.

Fixes: f204e0b8ce ("cxl: Driver code for powernv PCIe based cards for userspace access")
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Axtens <dja@axtens.net>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-09-15 19:33:38 +10:00
Andrew Donnellan
7d1647dc4b cxl: abort cxl_pci_enable_device_hook() if PCI channel is offline
cxl_pci_enable_device_hook() is called when attempting to enable an AFU
sitting on a vPHB. At present, the state of the underlying CXL card's PCI
channel is only checked when it calls cxl_afu_check_and_enable() at the
very end, after it has already set DMA options and initialised a default
context.

Check the CXL card's link status before setting DMA options or initialising
a default context. If the link is down, print a warning and return
immediately.

Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-09-07 20:14:24 +10:00
Linus Torvalds
ff474e8ca8 powerpc updates for 4.3
- Support "hybrid" iommu/direct DMA ops for coherent_mask < dma_mask from Benjamin Herrenschmidt
  - EEH fixes for SRIOV from Gavin
  - Introduce rtas_get_sensor_fast() for IRQ handlers from Thomas Huth
  - Use hardware RNG for arch_get_random_seed_* not arch_get_random_* from Paul Mackerras
  - Seccomp filter support from Michael Ellerman
  - opal_cec_reboot2() handling for HMIs & machine checks from Mahesh Salgaonkar
  - Add powerpc timebase as a trace clock source from Naveen N. Rao
  - Misc cleanups in the xmon, signal & SLB code from Anshuman Khandual
  - Add an inline function to update POWER8 HID0 from Gautham R. Shenoy
  - Fix pte_pagesize_index() crash on 4K w/64K hash from Michael Ellerman
  - Drop support for 64K local store on 4K kernels from Michael Ellerman
  - move dma_get_required_mask() from pnv_phb to pci_controller_ops from Andrew Donnellan
  - Initialize distance lookup table from drconf path from Nikunj A Dadhania
  - Enable RTC class support from Vaibhav Jain
  - Disable automatically blocked PCI config from Gavin Shan
  - Add LEDs driver for PowerNV platform from Vasant Hegde
  - Fix endianness issues in the HVSI driver from Laurent Dufour
  - Kexec endian fixes from Samuel Mendoza-Jonas
  - Fix corrupted pdn list from Gavin Shan
  - Fix fenced PHB caused by eeh_slot_error_detail() from Gavin Shan
 
  - Freescale updates from Scott: Highlights include 32-bit memcpy/memset
    optimizations, checksum optimizations, 85xx config fragments and updates,
    device tree updates, e6500 fixes for non-SMP, and misc cleanup and minor
    fixes.
 
  - A ton of cxl updates & fixes:
   - Add explicit precision specifiers from Rasmus Villemoes
   - use more common format specifier from Rasmus Villemoes
   - Destroy cxl_adapter_idr on module_exit from Johannes Thumshirn
   - Destroy afu->contexts_idr on release of an afu from Johannes Thumshirn
   - Compile with -Werror from Daniel Axtens
   - EEH support from Daniel Axtens
   - Plug irq_bitmap getting leaked in cxl_context from Vaibhav Jain
   - Add alternate MMIO error handling from Ian Munsie
   - Allow release of contexts which have been OPENED but not STARTED from Andrew Donnellan
   - Remove use of macro DEFINE_PCI_DEVICE_TABLE from Vaishali Thakkar
   - Release irqs if memory allocation fails from Vaibhav Jain
   - Remove racy attempt to force EEH invocation in reset from Daniel Axtens
   - Fix + cleanup error paths in cxl_dev_context_init from Ian Munsie
   - Fix force unmapping mmaps of contexts allocated through the kernel api from Ian Munsie
   - Set up and enable PSL Timebase from Philippe Bergheaud
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJV5+GzAAoJEFHr6jzI4aWA0iAP/jcd0kNaNBzLgcDKKygKdgz4
 xn4EWu81vfMfZYWesb0ATrjlH0hLsRxSXoFUqUMhtJTa5kNAoCIaz/M8WBALS50h
 aT+i7br4WEU2j2FcaMyP3iAZx/2hl+2utODJSHPRWPkec1fUDBfEyBf++e520RWM
 HUQGIGZXh8yq7KMA96Pwhsvls9vOB8hS2UdU/NS8ff3J5jFvXC1/WmF2qfzJBS1V
 8iHyz26Jl8+dJ+et7iC2oD5XQAjIH1oJgOyPVPBzAQttfi8RjuVzRA30TfPBAUwI
 lC9nlmPy6bCe4kiQYWVB1z7GegHyW/9vkeuMj/u8mZbqpaayMEMZmd2C3iNDXNHx
 i2NSvdln539t4qWYsV2v6lVCfa/ayDHD73Wackj5Dk394tzXnpCPhxNzc2yKEd5v
 h7vwYc9jBhsbfSCSogaM+gSHJ1APgCidggHJMYYNA2nN2u6V62RpsMB7zp/1+Q2v
 yqYdD8oYF4Dm21x/ujaNFrlizROD46WS0UqdJ3yP6HAqRYIpRXtibmpECJgt1n5h
 HjADEci4hQ2UQxdMdp/Q5KZnPTJebBtrZrmkW5r6cZBUaTB5TVkFaEWN44CT/Loh
 tMNeA3qOBN06CaQS2WL3UUUWpbZq9fSbWuUZ5lWZDb5AOyRxe5eWVYNLkiyIXozY
 L24l1bYdBhXahnjoS/kc
 =n9+X
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:

 - support "hybrid" iommu/direct DMA ops for coherent_mask < dma_mask
   from Benjamin Herrenschmidt

 - EEH fixes for SRIOV from Gavin

 - introduce rtas_get_sensor_fast() for IRQ handlers from Thomas Huth

 - use hardware RNG for arch_get_random_seed_* not arch_get_random_*
   from Paul Mackerras

 - seccomp filter support from Michael Ellerman

 - opal_cec_reboot2() handling for HMIs & machine checks from Mahesh
   Salgaonkar

 - add powerpc timebase as a trace clock source from Naveen N.  Rao

 - misc cleanups in the xmon, signal & SLB code from Anshuman Khandual

 - add an inline function to update POWER8 HID0 from Gautham R.  Shenoy

 - fix pte_pagesize_index() crash on 4K w/64K hash from Michael Ellerman

 - drop support for 64K local store on 4K kernels from Michael Ellerman

 - move dma_get_required_mask() from pnv_phb to pci_controller_ops from
   Andrew Donnellan

 - initialize distance lookup table from drconf path from Nikunj A
   Dadhania

 - enable RTC class support from Vaibhav Jain

 - disable automatically blocked PCI config from Gavin Shan

 - add LEDs driver for PowerNV platform from Vasant Hegde

 - fix endianness issues in the HVSI driver from Laurent Dufour

 - kexec endian fixes from Samuel Mendoza-Jonas

 - fix corrupted pdn list from Gavin Shan

 - fix fenced PHB caused by eeh_slot_error_detail() from Gavin Shan

 - Freescale updates from Scott: Highlights include 32-bit memcpy/memset
   optimizations, checksum optimizations, 85xx config fragments and
   updates, device tree updates, e6500 fixes for non-SMP, and misc
   cleanup and minor fixes.

 - a ton of cxl updates & fixes:
    - add explicit precision specifiers from Rasmus Villemoes
    - use more common format specifier from Rasmus Villemoes
    - destroy cxl_adapter_idr on module_exit from Johannes Thumshirn
    - destroy afu->contexts_idr on release of an afu from Johannes
      Thumshirn
    - compile with -Werror from Daniel Axtens
    - EEH support from Daniel Axtens
    - plug irq_bitmap getting leaked in cxl_context from Vaibhav Jain
    - add alternate MMIO error handling from Ian Munsie
    - allow release of contexts which have been OPENED but not STARTED
      from Andrew Donnellan
    - remove use of macro DEFINE_PCI_DEVICE_TABLE from Vaishali Thakkar
    - release irqs if memory allocation fails from Vaibhav Jain
    - remove racy attempt to force EEH invocation in reset from Daniel
      Axtens
    - fix + cleanup error paths in cxl_dev_context_init from Ian Munsie
    - fix force unmapping mmaps of contexts allocated through the kernel
      api from Ian Munsie
    - set up and enable PSL Timebase from Philippe Bergheaud

* tag 'powerpc-4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (140 commits)
  cxl: Set up and enable PSL Timebase
  cxl: Fix force unmapping mmaps of contexts allocated through the kernel api
  cxl: Fix + cleanup error paths in cxl_dev_context_init
  powerpc/eeh: Fix fenced PHB caused by eeh_slot_error_detail()
  powerpc/pseries: Cleanup on pci_dn_reconfig_notifier()
  powerpc/pseries: Fix corrupted pdn list
  powerpc/powernv: Enable LEDS support
  powerpc/iommu: Set default DMA offset in dma_dev_setup
  cxl: Remove racy attempt to force EEH invocation in reset
  cxl: Release irqs if memory allocation fails
  cxl: Remove use of macro DEFINE_PCI_DEVICE_TABLE
  powerpc/powernv: Fix mis-merge of OPAL support for LEDS driver
  powerpc/powernv: Reset HILE before kexec_sequence()
  powerpc/kexec: Reset secondary cpu endianness before kexec
  powerpc/hvsi: Fix endianness issues in the HVSI driver
  leds/powernv: Add driver for PowerNV platform
  powerpc/powernv: Create LED platform device
  powerpc/powernv: Add OPAL interfaces for accessing and modifying system LED states
  powerpc/powernv: Fix the log message when disabling VF
  cxl: Allow release of contexts which have been OPENED but not STARTED
  ...
2015-09-03 16:41:38 -07:00
Philippe Bergheaud
390fd5929f cxl: Set up and enable PSL Timebase
This patch configures the PSL Timebase function and enables it,
after the CAPP has been initialized by OPAL.

Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Philippe Bergheaud <felix@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-30 18:56:34 +10:00
Ian Munsie
55e07668fb cxl: Fix force unmapping mmaps of contexts allocated through the kernel api
The cxl user api uses the address_space associated with the file when we
need to force unmap all cxl mmap regions (e.g. on eeh, driver detach,
etc). Currently, contexts allocated through the kernel api do not do
this and instead skip the mmap invalidation, potentially allowing them
to poke at the hardware after such an event, which may cause all sorts
of trouble.

This patch allocates an address_space for cxl contexts allocated through
the kernel api so that the same invalidate path will for these contexts
as well. We don't use the anonymous inode's address_space, as doing so
could invalidate any mmaps of completely unrelated drivers using
anonymous file descriptors.

This patch also introduces a kernelapi flag, so we know when freeing the
context if the address_space was allocated by us and needs to be freed.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-30 18:47:26 +10:00
Ian Munsie
af2a50bb0c cxl: Fix + cleanup error paths in cxl_dev_context_init
If the cxl_context_alloc() call fails, we return immediately without
releasing the reference on the AFU device, allowing it to leak.

This patch switches to using goto style error handling so that the
device is released in common code for both error paths, and will also
simplify things if we add additional initialisation in this function in
the future.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-30 18:47:26 +10:00
Daniel Axtens
9d8e27673c cxl: Remove racy attempt to force EEH invocation in reset
cxl_reset currently PERSTs the slot, and then repeatedly tries to
read MMIO space in order to kick off EEH.

There are 2 problems with this: it's unnecessary, and it's racy.

It's unnecessary because the PERST will bring down the PHB link.
That will be picked up by the CAPP, which will send out an HMI.
Skiboot, noticing an HMI from the CAPP, will send an OPAL
notification to the kernel, which will trigger EEH recovery.

It's also racy: the EEH recovery triggered by the CAPP will
eventually cause the MMIO space to have its mapping invalidated
and the pointer NULLed out. This races with our attempt to read
the MMIO space. This is causing OOPSes in testing.

Simply drop all the attempts to force EEH detection, and trust
that Skiboot will send the notification and that we'll act on it.
The Skiboot code to send the EEH notification has been in Skiboot
for as long as CAPP recovery has been supported, so we don't need
to worry about breaking obscure setups with ancient firmware.

Cc: Ryan Grimm <grimm@linux.vnet.ibm.com>
Cc: stable@vger.kernel.org
Fixes: 62fa19d4b4 ("cxl: Add ability to reset the card")
Signed-off-by: Daniel Axtens <dja@axtens.net>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-27 13:51:36 +10:00
Vaibhav Jain
a6897f3966 cxl: Release irqs if memory allocation fails
This minor patch plugs a potential irq leak in case of a memory
allocation failure inside function the afu_allocate_irqs. Presently the
irqs allocated to the context gets leaked if allocation of either
one of context irq_bitmap or irq_names fails.

Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-27 13:51:18 +10:00
Vaishali Thakkar
f47f966fbe cxl: Remove use of macro DEFINE_PCI_DEVICE_TABLE
Macro DEFINE_PCI_DEVICE_TABLE is deprecated. So, here use
struct pci_device_id instead of DEFINE_PCI_DEVICE_TABLE with
the goal of getting rid of this macro completely.

The Coccinelle semantic patch that performs this transformation
is as follows:

@@
identifier a;
declarer name DEFINE_PCI_DEVICE_TABLE;
initializer i;
@@
- DEFINE_PCI_DEVICE_TABLE(a)
+ const struct pci_device_id a[]
= i;

Signed-off-by: Vaishali Thakkar <vthakkar1994@gmail.com>
Acked-by: Michael Neuling <mikey@neuling.org>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-22 21:21:39 +10:00
Andrew Donnellan
7c26b9cf53 cxl: Allow release of contexts which have been OPENED but not STARTED
If we open a context but do not start it (either because we do not attempt
to start it, or because it fails to start for some reason), we are left
with a context in state OPENED. Previously, cxl_release_context() only
allowed releasing contexts in state CLOSED, so attempting to release an
OPENED context would fail.

In particular, this bug causes available contexts to run out after some EEH
failures, where drivers attempt to release contexts that have failed to
start.

Allow releasing contexts in any state with a value lower than STARTED, i.e.
OPENED or CLOSED (we can't release a STARTED context as it's currently
using the hardware, and we assume that contexts in any new states which may
be added in future with a value higher than STARTED are also unsafe to
release).

Cc: stable@vger.kernel.org
Fixes: 6f7f0b3df6 ("cxl: Add AFU virtual PHB and kernel API")
Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Daniel Axtens <dja@axtens.net>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-20 16:15:23 +10:00
Ian Munsie
d9232a3da8 cxl: Add alternate MMIO error handling
userspace programs using cxl currently have to use two strategies for
dealing with MMIO errors simultaneously. They have to check every read
for a return of all Fs in case the adapter has gone away and the kernel
has not yet noticed, and they have to deal with SIGBUS in case the
kernel has already noticed, invalidated the mapping and marked the
context as failed.

In order to simplify things, this patch adds an alternative approach
where the kernel will return a page filled with Fs instead of delivering
a SIGBUS. This allows userspace to only need to deal with one of these
two error paths, and is intended for use in libraries that use cxl
transparently and may not be able to safely install a signal handler.

This approach will only work if certain constraints are met. Namely, if
the application is both reading and writing to an address in the problem
state area it cannot assume that a non-FF read is OK, as it may just be
reading out a value it has previously written. Further - since only one
page is used per context a write to a given offset would be visible when
reading the same offset from a different page in the mapping (this only
applies within a single context, not between contexts).

An application could deal with this by e.g. making sure it also reads
from a read-only offset after any reads to a read/write offset.

Due to these constraints, this functionality must be explicitly
requested by userspace when starting the context by passing in the
CXL_START_WORK_ERR_FF flag.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Acked-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-18 19:34:43 +10:00
Vaibhav Jain
8c7dd08a8c cxl: Plug irq_bitmap getting leaked in cxl_context
This patch plugs the leak of irq_bitmap, allocated as part of
initialization of cxl_context struct; during the call to
afu_allocate_irqs. The bitmap is now release during the call to function
afu_release_irqs.

Reported-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com>
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-17 13:56:32 +10:00
Daniel Axtens
25901632c9 cxl: Add CONFIG_CXL_EEH symbol
CONFIG_CXL_EEH is for CXL's EEH related code.

Other drivers can depend on or #ifdef on this symbol to configure
PERST behaviour, allowing CXL to participate in the EEH process.

Reviewed-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-17 13:56:29 +10:00
Daniel Axtens
9e8df8a219 cxl: EEH support
EEH (Enhanced Error Handling) allows a driver to recover from the
temporary failure of an attached PCI card. Enable basic CXL support
for EEH.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-14 21:32:08 +10:00
Daniel Axtens
13e68d8bd0 cxl: Allow the kernel to trust that an image won't change on PERST.
Provide a kernel API and a sysfs entry which allow a user to specify
that when a card is PERSTed, it's image will stay the same, allowing
it to participate in EEH.

cxl_reset is used to reflash the card. In that case, we cannot safely
assert that the image will not change. Therefore, disallow cxl_reset
if the flag is set.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-14 21:32:07 +10:00
Daniel Axtens
4e1efb403c cxl: Don't remove AFUs/vPHBs in cxl_reset
If the driver doesn't participate in EEH, the AFUs will be removed
by cxl_remove, which will be invoked by EEH.

If the driver does particpate in EEH, the vPHB needs to stick around
so that the it can particpate.

In both cases, we shouldn't remove the AFU/vPHB.

Reviewed-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-14 21:32:07 +10:00
Daniel Axtens
d76427b0d8 cxl: Refactor AFU init/teardown
As with an adapter, some aspects of initialisation are done only once
in the lifetime of an AFU: for example, allocating memory, or setting
up sysfs/debugfs files.

However, we may want to be able to do some parts of the initialisation
multiple times: for example, in error recovery we want to be able to
tear down and then re-map IO memory and IRQs.

Therefore, refactor AFU init/teardown as follows.

 - Create two new functions: 'cxl_configure_afu', and its pair
   'cxl_deconfigure_afu'. As with the adapter functions,
   these (de)configure resources that do not need to last the entire
   lifetime of the AFU.

 - Allocating and releasing memory remain the task of 'cxl_alloc_afu'
   and 'cxl_release_afu'.

 - Once-only functions that do not involve allocating/releasing memory
   stay in the overarching 'cxl_init_afu'/'cxl_remove_afu' pair.
   However, the task of picking an AFU mode and activating it has been
   broken out.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-14 21:32:07 +10:00
Daniel Axtens
c044c41542 cxl: Refactor adaptor init/teardown
Some aspects of initialisation are done only once in the lifetime of
an adapter: for example, allocating memory for the adapter,
allocating the adapter number, or setting up sysfs/debugfs files.

However, we may want to be able to do some parts of the
initialisation multiple times: for example, in error recovery we
want to be able to tear down and then re-map IO memory and IRQs.

Therefore, refactor CXL init/teardown as follows.

 - Keep the overarching functions 'cxl_init_adapter' and its pair,
   'cxl_remove_adapter'.

 - Move all 'once only' allocation/freeing steps to the existing
   'cxl_alloc_adapter' function, and its pair 'cxl_release_adapter'
   (This involves moving allocation of the adapter number out of
   cxl_init_adapter.)

 - Create two new functions: 'cxl_configure_adapter', and its pair
   'cxl_deconfigure_adapter'. These two functions 'wire up' the
   hardware --- they (de)configure resources that do not need to
   last the entire lifetime of the adapter

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-14 21:32:05 +10:00
Daniel Axtens
575e6986f0 cxl: Clean up adapter MMIO unmap path.
- MMIO pointer unmapping is guarded by a null pointer check.
   However, iounmap doesn't null the pointer, just invalidate it.
   Therefore, explicitly null the pointer after unmapping.

 - afu_desc_mmio also needs to be unmapped.

 - PCI regions are allocated in cxl_map_adapter_regs.
   Therefore they should be released in unmap, not elsewhere.

Acked-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-14 21:32:05 +10:00
Daniel Axtens
e640d2fc81 cxl: Make IRQ release idempotent
Check if an IRQ is mapped before releasing it.

This will simplify future EEH code by allowing unconditional unmapping
of IRQs.

Acked-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-14 21:32:04 +10:00
Daniel Axtens
05155772f6 cxl: Allocate and release the SPA with the AFU
Previously the SPA was allocated and freed upon entering and leaving
AFU-directed mode. This causes some issues for error recovery - contexts
hold a pointer inside the SPA, and they may persist after the AFU has
been detached.

We would ideally like to allocate the SPA when the AFU is allocated, and
release it until the AFU is released. However, we don't know how big the
SPA needs to be until we read the AFU descriptor.

Therefore, restructure the code:

 - Allocate the SPA only once, on the first attach.

 - Release the SPA only when the entire AFU is being released (not
   detached). Guard the release with a NULL check, so we don't free
   if it was never allocated (e.g. dedicated mode)

Acked-by: Cyril Bur <cyrilbur@gmail.com>
Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-14 21:32:04 +10:00
Daniel Axtens
0b3f9c757c cxl: Drop commands if the PCI channel is not in normal state
If the PCI channel has gone down, don't attempt to poke the hardware.

We need to guard every time cxl_whatever_(read|write) is called. This
is because a call to those functions will dereference an offset into an
mmio register, and the mmio mappings get invalidated in the EEH
teardown.

Check in the read/write functions in the header.
We give them the same semantics as usual PCI operations:
 - a write to a channel that is down is ignored.
 - a read from a channel that is down returns all fs.

Also, we try to access the MMIO space of a vPHB device as part of the
PCI disable path. Because that's a read that bypasses most of our usual
checks, we handle it explicitly.

As far as user visible warnings go:
 - Check link state in file ops, return -EIO if down.
 - Be reasonably quiet if there's an error in a teardown path,
   or when we already know the hardware is going down.
 - Throw a big WARN if someone tries to start a CXL operation
   while the card is down. This gives a useful stacktrace for
   debugging whatever is doing that.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-14 21:32:03 +10:00
Daniel Axtens
588b34be20 cxl: Convert MMIO read/write macros to inline functions
We're about to make these more complex, so make them functions
first.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-14 21:32:02 +10:00
Daniel Axtens
83c3fee7e7 cxl: sparse: Silence iomem warning in debugfs file creation
An IO address, tagged with __iomem, is passed to debugfs_create_file
as private data. This requires that it be cast to void *. The cast
drops the __iomem annotation and so creates a sparse warning:

  drivers/misc/cxl/debugfs.c:51:57: warning: cast removes address space of expression

The address space marker is added back in the file operations
(fops_io_u64).

Silence the warning with __force.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Acked-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-12 14:49:29 +10:00
Daniel Axtens
3d6b040e73 cxl: sparse: Make declarations static
A few declarations were identified by sparse as needing to be static:

  drivers/misc/cxl/irq.c:408:6: warning: symbol 'afu_irq_name_free' was not declared. Should it be static?
  drivers/misc/cxl/irq.c:467:6: warning: symbol 'afu_register_hwirqs' was not declared. Should it be static?
  drivers/misc/cxl/file.c:254:6: warning: symbol 'afu_compat_ioctl' was not declared. Should it be static?
  drivers/misc/cxl/file.c:399:30: warning: symbol 'afu_master_fops' was not declared. Should it be static?

Make them static.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Acked-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-12 14:49:09 +10:00
Daniel Axtens
d3d73f4b38 cxl: Compile with -Werror
It's a good idea, and it brings us in line with the rest of arch/powerpc.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Acked-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-11 07:43:40 +10:00
Daniel Axtens
368857c16c cxl: Don't ignore add_process_element() result when attaching context
Currently when attaching a context in dedicated mode, we ignore the
result of add_process_element(), which could potentially fail.

If add_process_element() returns an error, pass it back to the caller.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-08-06 15:10:19 +10:00
Vladimir Zapolskiy
acb921a5a1 misc: cxl: clean up afu_read_config()
The sanity checks for overflow are not needed, because this is done on
caller side in fs/sysfs/file.c

Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
Cc: Ian Munsie <imunsie@au1.ibm.com>
Acked-by: Michael Neuling <mikey@neuling.org>
Reviewed-by: Daniel Axtens <dja@axtens.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-08-05 13:53:39 -07:00
Johannes Thumshirn
bd664f892e cxl: Destroy afu->contexts_idr on release of an afu
Destroy afu->contexts_idr on release of an afu, reclaiming the allocated
memory.

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-07-16 14:15:07 +10:00
Johannes Thumshirn
b2a02ac65e cxl: Destroy cxl_adapter_idr on module_exit
Destroy cxl_adapter_idr on module exit, reclaiming the allocated memory.

This was detected by the following semantic patch (written by Luis Rodriguez
<mcgrof@suse.com>)
<SmPL>
@ defines_module_init @
declarer name module_init, module_exit;
declarer name DEFINE_IDR;
identifier init;
@@

module_init(init);

@ defines_module_exit @
identifier exit;
@@

module_exit(exit);

@ declares_idr depends on defines_module_init && defines_module_exit @
identifier idr;
@@

DEFINE_IDR(idr);

@ on_exit_calls_destroy depends on declares_idr && defines_module_exit @
identifier declares_idr.idr, defines_module_exit.exit;
@@

exit(void)
{
 ...
 idr_destroy(&idr);
 ...
}

@ missing_module_idr_destroy depends on declares_idr && defines_module_exit && !on_exit_calls_destroy @
identifier declares_idr.idr, defines_module_exit.exit;
@@

exit(void)
{
 ...
 +idr_destroy(&idr);
}
</SmPL>

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-07-16 14:14:55 +10:00
Rasmus Villemoes
de36953843 cxl: use more common format specifier
A precision of 16 (%.16llx) has the same effect as a field width of 16
along with passing the 0 flag (%016llx), but the latter is much more
common in the kernel tree. Update cxl to use that.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-07-13 10:10:54 +10:00
Rasmus Villemoes
80c394fab8 cxl: Add explicit precision specifiers
C99 says that a precision given as simply '.' with no following digits
or * should be interpreted as 0. The kernel's printf implementation,
however, treats this case as if the precision was omitted. C99 also
says that if both the precision and value are 0, no digits should be
printed. Even if the kernel followed C99 to the letter, I don't think
that would be particularly useful in these cases. For consistency with
most other format strings in the file, use an explicit precision of 16
and add a 0x prefix.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-07-13 10:10:54 +10:00
Daniel Axtens
2c069a118f cxl: Check if afu is not null in cxl_slbia
The pointer to an AFU in the adapter's list of AFUs can be null
if we're in the process of removing AFUs. The afu_list_lock
doesn't guard against this.

Say we have 2 slices, and we're in the process of removing cxl.
 - We remove the AFUs in order (see cxl_remove). In cxl_remove_afu
   for AFU 0, we take the lock, set adapter->afu[0] = NULL, and
   release the lock.
 - Then we get an slbia. In cxl_slbia we take the lock, and set
   afu = adapter->afu[0], which is NULL.
 - Therefore our attempt to check afu->enabled will blow up.

Therefore, check if afu is a null pointer before dereferencing it.

Cc: stable@vger.kernel.org
Signed-off-by: Daniel Axtens <dja@axtens.net>
Acked-by: Michael Neuling <mikey@neuling.org>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-07-10 11:44:25 +10:00
Ian Munsie
10a5894f2d cxl: Fix off by one error allowing subsequent mmap page to be accessed
It was discovered that if a process mmaped their problem state area they
were able to access one page more than expected, potentially allowing
them to access the problem state area of an unrelated process.

This was due to a simple off by one error in the mmap fault handler
introduced in 0712dc7e73 ("cxl: Fix issues
when unmapping contexts"), which is fixed in this patch.

Cc: stable@vger.kernel.org
Fixes: 0712dc7e73 ("cxl: Fix issues when unmapping contexts")
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-07-08 15:17:47 +10:00
Ian Munsie
5caaf53468 cxl: Fail mmap if requested mapping is larger than assigned problem state area
This patch makes the mmap call fail outright if the requested region is
larger than the problem state area assigned to the context so the error
is reported immediately rather than waiting for an attempt to access an
address out of bounds.

Although we never expect users to map more than the assigned problem
state area and are not aware of anyone doing this (other than for
testing), this does have the potential to break users if someone has
used a larger range regardless. I'm submitting it for consideration, but
if this change is not considered acceptable the previous patch is
sufficient to prevent access out of bounds without breaking anyone.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-07-08 15:17:46 +10:00
Michael Neuling
3f8dc44d88 cxl: Fix refcounting in kernel API
Currently the kernel API AFU dev refcounting is done on context start and stop.
This patch moves this refcounting to context init and release, bringing it
inline with how the userspace API does it.

Without this we've seen the refcounting on the AFU get out of whack between the
user and kernel API usage.  This causes the AFU structures to be freed when
they are actually still in use.

This fixes some kref warnings we've been seeing and spurious ErrIVTE IRQs.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-07-07 19:38:37 +10:00
Daniel Axtens
8c00d5c9d3 cxl: Test the correct mmio space before unmapping
Before freeing p2n, test p2n, not p1n.

Signed-off-by: Daniel Axtens <dja@axtens.net>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-07-06 20:24:35 +10:00
Maninder Singh
14f21189df cxl/vphb.c: Use phb pointer after NULL check
static Anlaysis detected below error:-
(error) Possible null pointer dereference: phb

So, Use phb after NULL check.

Signed-off-by: Maninder Singh <maninder1.s@samsung.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-07-06 20:24:34 +10:00
Michael Neuling
f293106917 cxl: Fix typo in debug print
Fix typo in debug print. p1_base() should be p2_base(). No change other
than to the debug output.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-19 17:10:30 +10:00
Michael Neuling
e0fcdc2010 cxl: Add CXL_KERNEL_API config option
Add CXL_KERNEL_API config option so drivers which depend on this new
functionality won't be enabled until this is visible.

This is useful for merging the cxlflash driver which comes in via the SCSI
tree.  The cxlflash driver can depend on CXL_KERNEL_API, hence it won't be
enabled in the SCSI tree until this new config option is merged via the powerpc
tree.  Hence all trees will be bisectable at all times.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-19 17:10:30 +10:00
Michael Neuling
f67b4938af cxl: Reset default context for vPHB on release
When we release the device, we should also invalidate the default context.
With this cxl_get_context() will return null after removal.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-07 19:12:39 +10:00
Michael Neuling
6f7f0b3df6 cxl: Add AFU virtual PHB and kernel API
This patch does two things.

Firstly it presents the Accelerator Function Unit (AFUs) behind the POWER
Service Layer (PSL) as PCI devices on a virtual PCI Host Bridge (vPHB).  This
in in addition to the PSL being a PCI device itself.

As part of the Coherent Accelerator Interface Architecture (CAIA) AFUs can
provide an AFU configuration.  This AFU configuration recored is architected to
be the same as a PCI config space.

This patch sets discovers the AFU configuration records, provides AFU config
space read/write functions to these configuration records.  It then enumerates
the PCI bus.  It also hooks in PCI ops where appropriate.  It also destroys the
vPHB when the physical card is removed.

Secondly, it add an in kernel API for AFU to use CXL.  AFUs must present a
driver that firstly binds as a PCI device.  This PCI device can then be using
to do CXL specific operations (that can't sit in the PCI ops) using this API.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03 13:27:20 +10:00
Michael Neuling
0520336afe cxl: Export file ops for use by API
The cxl kernel API will allow drivers other than cxl to export a file
descriptor which has the same userspace API.  These file descriptors will be
able to be used against libcxl.

This exports those file ops for use by other drivers.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03 13:27:20 +10:00
Michael Neuling
ec249dd860 cxl: Move include file cxl.h -> cxl-base.h
This moves the current include file from cxl.h -> cxl-base.h.  This current
include file is used only to pass information between the base driver that
needs to be built into the kernel and the cxl module.

This is to make way for a new include/misc/cxl.h which will
contain just the kernel API for other driver to use

Signed-off-by: Michael Neuling <mikey@neuling.org>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03 13:27:19 +10:00
Michael Neuling
406e12ec0b cxl: Cleanup Makefile
Cleanup Makefile by fixing line wrapping.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03 13:27:19 +10:00
Michael Neuling
7bb5d91a4d cxl: Rework context lifetimes
This reworks contexts lifetimes a bit to enable the kernel API where we may
want to reuse contexts. Here we will want to start and stop contexts without
freeing them.

Start context does the get pid & ctx so stop context will need to do the puts.
Here we move put pid & ctx to the detach context path which will become part of
the stop context path.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03 13:27:19 +10:00
Michael Neuling
2f663527bd cxl: Configure PSL for kernel contexts and merge code
This updates AFU directed and dedicated modes for contexts attached to the
kernel.

The SR (similar to the MSR in the core) calculation is getting
quite complex and is duplicated in AFU directed and dedicated
modes.  This patch also merges this SR calculation for these modes.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03 13:27:19 +10:00
Michael Neuling
c358d84b4e cxl: Split afu_register_irqs() function
Split the afu_register_irqs() function so that different parts can
be useful elsewhere.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03 13:27:19 +10:00
Michael Neuling
a6b07d8257 cxl: Only check pid for userspace contexts
We only need to check the pid attached to this context for userspace contexts.
Kernel contexts can skip this check.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03 13:27:18 +10:00
Michael Neuling
1a1a94b876 cxl: Export some symbols
Export some symbols which will soon be used elsewhere in this driver.

Now they are global we rename them so to avoid collisions.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03 13:27:18 +10:00
Michael Neuling
b12994fbfe cxl: cxl_afu_reset() -> __cxl_afu_reset()
Rename cxl_afu_reset() to __cxl_afu_reset() to we can reuse this function name
in the API.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03 13:27:18 +10:00
Michael Neuling
eda3693c84 cxl: Rework detach context functions
Rework __detach_context() and cxl_context_detach() so we can reuse them in the
kernel API.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03 13:27:18 +10:00
Michael Neuling
6428832a7b cxl: Add cookie parameter to afu_release_irqs()
Add cookie parameter to afu_release_irqs() so that we can pass in a different
cookie than the context structure.  This will be useful for other kernel
drivers that want to call this but get their own cookie back in the interrupt
handler.

Update all existing call sites.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03 13:27:17 +10:00
Michael Neuling
bfcdc8fffe cxl: Dump debug info on the AFU configuration record
Now that we parse the AFU Configuration record, dump some info on it when in
debug mode.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03 13:27:17 +10:00
Michael Neuling
7f436b534d cxl: Fix error path on probe
When probing we call pci_enable_device() but don't call pci_disable_device() on
fail. This causes refcounting issues in the PCI subsystem if a second driver
tries to bind to the same device.

This patch adds the pci_disable_device() to the probe error path. This error
path is hit when this cxl driver tries to bind to AFUs (on the vPHB) rather
than the physical device.

Signed-off-by: Michael Neuling <mikey@neuling.org>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03 13:27:17 +10:00
Ian Munsie
bee30c7045 cxl: Re-order card init to check the VSEC earlier
When we expose AFUs as virtual PCI devices, they may look like the physical
CAPI PCI card.  ie they may have the same vendor/device IDs.

We want to avoid these AFUs binding to this driver and any init this driver may
do.

Re-order card init to check the VSEC earlier before assigning BARs or
activating CXL.  Also change the dev used in early prints as the adapter struct
may not be inited at this earlier stage.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03 13:27:17 +10:00
Michael Neuling
69c3a73c81 cxl: Remove unnecessarily verbose print in cxl_remove()
Signed-off-by: Michael Neuling <mikey@neuling.org>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03 13:27:17 +10:00
Michael Neuling
aa70775e9a cxl: Add shutdown hook
Signed-off-by: Michael Neuling <mikey@neuling.org>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03 13:27:16 +10:00
Ian Munsie
8ac75b96be cxl: Use call_rcu to reduce latency when releasing the afu fd
The afu fd release path was identified as a significant bottleneck in
the overall performance of cxl. While an optimal AFU design would
minimise the need to close & reopen the AFU fd, it is not always
practical to avoid.

The bottleneck seems to be down to the call to synchronize_rcu(), which
will block until every other thread is guaranteed to be out of an RCU
critical section. Replace it with call_rcu() to free the context
structures later so we can return to the application sooner.

This reduces the time spent in the fd release path from 13356 usec to
13.3 usec - about a 100x speed up.

Reported-by: Fei K Chen <uchen@cn.ibm.com>
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03 13:27:15 +10:00
Vaibhav Jain
e36f6fe1f7 cxl: Export AFU error buffer via sysfs
Export the "AFU Error Buffer" via sysfs attribute (afu_err_buf). AFU
error buffer is used by the AFU to report application specific
errors. The contents of this buffer are AFU specific and are intended to
be interpreted by the application interacting with the afu.

Suggested-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03 13:27:15 +10:00
Vaibhav Jain
27d4dc7116 cxl: Implement an ioctl to fetch afu card-id, offset-id and mode
Given a file descriptor on an afu device, libcxl currently uses the
major/minor number obtained from fstat on the fd to construct path to
the afu's sysfs directory. However it is possible that rather than using
one of the device in /dev/cxl, a kernel driver creates its own device
which export generic cxl interface to the userspace. This causes
problems with libcxl as it tries to use a wrong major/minor number to
construct the sysfs path and fail.

So this patch introduces a new ioctl called CXL_IOCTL_GET_AFU_ID on the
afu file descriptor to fetch the cxl_afu_id struct that holds the
card/offset-id and mode information. These info is then used by libcxl to
construct the correct path to the afu sysfs directory.

Testing:
	- Build against pseries be/le configs
	- Testing with corresponding libcxl changes to verify that it constructs
	  right sysfs path to the afu.

Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-03 13:27:15 +10:00
Linus Torvalds
d3f180ea1a powerpc updates for 3.20
Including:
 
 - Update of all defconfigs
 - Addition of a bunch of config options to modernise our defconfigs
 - Some PS3 updates from Geoff
 - Optimised memcmp for 64 bit from Anton
 - Fix for kprobes that allows 'perf probe' to work from Naveen
 - Several cxl updates from Ian & Ryan
 - Expanded support for the '24x7' PMU from Cody & Sukadev
 - Freescale updates from Scott:
   "Highlights include 8xx optimizations, some more work on datapath device
    tree content, e300 machine check support, t1040 corenet error reporting,
    and various cleanups and fixes."
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJU2/LSAAoJEFHr6jzI4aWATDAQAKPU6v2Mq0sLnGst69waHU/Q
 vvpIq9hqVeSr6znHhrnazc3iQTLk0acqIdxUl/dT+5ADhi9+FxGD5Ckk+BH1DDve
 g6mQelSMlVZF9hKonHsbr4iUuTUyZyx2vj2qjdgOaRiv9Xubq6vUFNeolq3AeHxv
 J33vqRTmowj3VJ52u+V1dmzXQGfUye7DG2jHpjXoBieZsroTvyuYm5GoIPblWFO6
 zbYRh6IitALnQRtXfwIManPyWMkJti9JX8PwDkmvacr+V+MXbrksHpIOITMhNlo1
 WsVnFMpxuk80XuUfhaKZgISgBSfCqBckvKDn2QwztF2/kBnV6Su5xiOKVgouzM6B
 myy+maiMZlNJlNjqdMK5v2bqMXICP048zgfMbDN2e1K25jSSlRawt0RngoCQO2EP
 7aWmEDAlL3shgzkl68pj1fevQokxC/40C1yExIgAa9C31+bjtMz4Xb1SfN1SSveW
 7uWEY/eG9eLsrSE1CeBDvh6B8BRdyuIHgPhux4Tgc/bUtBGFQ29NuXwKh3QCeEy9
 9wWrRGx3U69eP06Ey7P5js3jPTQs80bjJewyGaiPQF5XHB89To8Dg8VfXjEV49Dx
 Pa3OLL5QsQloKfEBiEhQeGfKYImC00pVYAxc0qpmnr9T+25Ri1TLdF1EBAwriSYE
 5p9kSW+ZIht0lvzsdPNm
 =xDU3
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-3.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux

Pull powerpc updates from Michael Ellerman:

 - Update of all defconfigs

 - Addition of a bunch of config options to modernise our defconfigs

 - Some PS3 updates from Geoff

 - Optimised memcmp for 64 bit from Anton

 - Fix for kprobes that allows 'perf probe' to work from Naveen

 - Several cxl updates from Ian & Ryan

 - Expanded support for the '24x7' PMU from Cody & Sukadev

 - Freescale updates from Scott:
    "Highlights include 8xx optimizations, some more work on datapath
     device tree content, e300 machine check support, t1040 corenet
     error reporting, and various cleanups and fixes"

* tag 'powerpc-3.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux: (102 commits)
  cxl: Add missing return statement after handling AFU errror
  cxl: Fail AFU initialisation if an invalid configuration record is found
  cxl: Export optional AFU configuration record in sysfs
  powerpc/mm: Warn on flushing tlb page in kernel context
  powerpc/powernv: Add OPAL soft-poweroff routine
  powerpc/perf/hv-24x7: Document sysfs event description entries
  powerpc/perf/hv-gpci: add the remaining gpci requests
  powerpc/perf/{hv-gpci, hv-common}: generate requests with counters annotated
  powerpc/perf/hv-24x7: parse catalog and populate sysfs with events
  perf: define EVENT_DEFINE_RANGE_FORMAT_LITE helper
  perf: add PMU_EVENT_ATTR_STRING() helper
  perf: provide sysfs_show for struct perf_pmu_events_attr
  powerpc/kernel: Avoid initializing device-tree pointer twice
  powerpc: Remove old compile time disabled syscall tracing code
  powerpc/kernel: Make syscall_exit a local label
  cxl: Fix device_node reference counting
  powerpc/mm: bail out early when flushing TLB page
  powerpc: defconfigs: add MTD_SPI_NOR (new dependency for M25P80)
  perf/powerpc: reset event hw state when adding it to the PMU
  powerpc/qe: Use strlcpy()
  ...
2015-02-11 18:15:38 -08:00
Ian Munsie
a6130ed253 cxl: Add missing return statement after handling AFU errror
We were missing a return statement in the PSL interrupt handler in the
case of an AFU error, which would trigger an "Unhandled CXL PSL IRQ"
warning. We do actually handle these type of errors (by notifying
userspace), so add the missing return IRQ_HANDLED so we don't throw
unecessary warnings.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-02-06 11:17:27 +11:00
Ian Munsie
3d5be0392f cxl: Fail AFU initialisation if an invalid configuration record is found
If an AFU claims to have a configuration record but doesn't actually
contain a vendor and device ID, fail the AFU initialisation. Right now
this is just a way of politely letting AFU developers know that they
need to fix their config space, but later on we may expose the AFUs as
actual PCI devices in their own right and don't want to inadvertendly
expose an AFU with a bad config space.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-02-06 11:16:58 +11:00
Ian Munsie
b087e6190d cxl: Export optional AFU configuration record in sysfs
An AFU may optionally contain one or more PCIe like configuration
records, which can be used to identify the AFU.

This patch adds support for exposing the raw config space and the
vendor, device and class code under sysfs. These will appear in a
subdirectory of the AFU device corresponding with the configuration
record number, e.g.

cat /sys/class/cxl/afu0.0/cr0/vendor
0x1014

cat /sys/class/cxl/afu0.0/cr0/device
0x4350

cat /sys/class/cxl/afu0.0/cr0/class
0x120000

hexdump -C /sys/class/cxl/afu0.0/cr0/config
00000000  14 10 50 43 00 00 00 00  06 00 00 12 00 00 00 00  |..PC............|
00000010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000100

These files behave in much the same way as the equivalent files for PCI
devices, with one exception being that the config file is currently
read-only and restricted to the root user. It is not necessarily
required to be this strict, but we currently do not have a compelling
use-case to make it writable and/or world-readable, so I erred on the
side of being restrictive.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-02-06 11:16:56 +11:00
Ryan Grimm
6f963ec2d6 cxl: Fix device_node reference counting
When unbinding and rebinding the driver on a system with a card in PHB0, this
error condition is reached after a few attempts:

ERROR: Bad of_node_put() on /pciex@3fffe40000000
CPU: 0 PID: 3040 Comm: bash Not tainted 3.18.0-rc3-12545-g3627ffe #152
Call Trace:
[c000000721acb5c0] [c00000000086ef94] .dump_stack+0x84/0xb0 (unreliable)
[c000000721acb640] [c00000000073a0a8] .of_node_release+0xd8/0xe0
[c000000721acb6d0] [c00000000044bc44] .kobject_release+0x74/0xe0
[c000000721acb760] [c0000000007394fc] .of_node_put+0x1c/0x30
[c000000721acb7d0] [c000000000545cd8] .cxl_probe+0x1a98/0x1d50
[c000000721acb900] [c0000000004845a0] .local_pci_probe+0x40/0xc0
[c000000721acb980] [c000000000484998] .pci_device_probe+0x128/0x170
[c000000721acba30] [c00000000052400c] .driver_probe_device+0xac/0x2a0
[c000000721acbad0] [c000000000522468] .bind_store+0x108/0x160
[c000000721acbb70] [c000000000521448] .drv_attr_store+0x38/0x60
[c000000721acbbe0] [c000000000293840] .sysfs_kf_write+0x60/0xa0
[c000000721acbc50] [c000000000292500] .kernfs_fop_write+0x140/0x1d0
[c000000721acbcf0] [c000000000208648] .vfs_write+0xd8/0x260
[c000000721acbd90] [c000000000208b18] .SyS_write+0x58/0x100
[c000000721acbe30] [c000000000009258] syscall_exit+0x0/0x98

We are missing a call to of_node_get(). pnv_pci_to_phb_node() should
call of_node_get() otherwise np's reference count isn't incremented and
it might go away. Rename pnv_pci_to_phb_node() to pnv_pci_get_phb_node()
so it's clear it calls of_node_get().

Signed-off-by: Ryan Grimm <grimm@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-02-02 14:51:31 +11:00
Ryan Grimm
62fa19d4b4 cxl: Add ability to reset the card
Adds reset to sysfs which will PERST the card. If load_image_on_perst is set
to "user" or "factory", the PERST will cause that image to be loaded.

load_image_on_perst is set to "user" for production.

"none" could be used for debugging. The PSL trace arrays are preserved which
then can be read through debugfs.

PERST also triggers CAPP recovery. An HMI comes in, which is handled by EEH.
EEH unbinds the driver, calls into Sapphire to reinitialize the PHB, then
rebinds the driver.

Signed-off-by: Ryan Grimm <grimm@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-01-22 17:31:52 +11:00
Ryan Grimm
1212aa1c8c cxl: Enable CAPP recovery
Turning snoops on is the last step in CAPP recovery. Sapphire is expected to
have reinitialized the PHB and done the previous recovery steps.

Add mode argument to opal call to do this. Driver can turn snoops off although
it does not currently.

Signed-off-by: Ryan Grimm <grimm@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-01-22 17:31:52 +11:00
Ryan Grimm
95bc11bcd1 cxl: Add image control to sysfs
load_image_on_perst identifies whether a PERST will cause the image to be
flashed to the card. And if so, which image.

Valid entries are: "none", "user" and "factory".

A value of "none" means PERST will not cause the image to be flashed. A
power cycle to the pcie slot is required to load the image.

"user" loads the user provided image and "factory" loads the factory image upon
PERST.

sysfs updates the cxl struct in the driver then calls cxl_update_image_control
to write the vals in the VSEC.

Signed-off-by: Ryan Grimm <grimm@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-01-22 17:31:51 +11:00
Ryan Grimm
4beb5421ba cxl: Use image state defaults for reloading FPGA
Select defaults such that a PERST causes flash image reload.  Select which
image based on what the card is set up to load.

CXL_VSEC_PERST_LOADS_IMAGE selects whether PERST assertion causes flash image
load.

CXL_VSEC_PERST_SELECT_USER selects which image is loaded on the next PERST.

cxl_update_image_control writes these bits into the VSEC.

Signed-off-by: Ryan Grimm <grimm@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-01-22 17:31:51 +11:00
Ian Munsie
9bcf28cdb2 cxl: Add tracepoints
This patch adds tracepoints throughout the cxl driver, which can provide
insight into:

- Context lifetimes
- Commands sent to the PSL and AFU and their completion status
- Segment and page table misses and their resolution
- PSL and AFU interrupts
- slbia calls from the powerpc copro_fault code

These tracepoints are mostly intended to aid in debugging (particularly
for new AFU designs), and may be useful standalone or in conjunction
with hardware traces collected by the PSL (read out via the trace
interface in debugfs) and AFUs.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-01-22 17:31:51 +11:00
Colin Ian King
d3383aaae9 cxl: remove redundant increment of hwirq
hwirq has not been initialized, however it is being incremented
and also not being referenced in a loop.  This error was detected with
cppcheck:

[drivers/misc/cxl/irq.c:439]: (error) Uninitialized variable: hwirq

Commit 80fa93fce3 ("cxl: Name interrupts in /proc/interrupt")
introduced this error.

This is a simple fix that removes the redundant increment.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-By: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-01-22 17:31:51 +11:00
Ian Munsie
0712dc7e73 cxl: Fix issues when unmapping contexts
An issue was introduced with "cxl: Unmap MMIO regions when detaching a
context" (b123429e6a) where closing a
context normally could also unmap the problem state area of other
contexts currently using the AFU.

It was also discovered that after a context's MMIO space had been
unmapped it would read 0s when accessing it, whereas the expected
behaviour was for the access to fail altogether.

In order to address these issues, this patch does two things:

- Forced mmap unmapping is only done when we are forcefully detaching
  all contexts, and not in the normal detach path. Since the normal
  context close path is tied to the file release any mmaps must have
  already been released so we don't need to worry in that case.

- The mmap path now uses a vm_operations_struct with a fault handler.
  The fault handler ensures that the context is in started state,
  otherwise it fails the access attempt with a SIGBUS.

Fixes: b123429e6a ("cxl: Unmap MMIO regions when detaching a context")
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-01-12 16:40:02 +11:00
Ian Munsie
db7933f392 cxl: Disable SPAP register when freeing SPA
When we deactivate the AFU directed mode we free the scheduled process
area, but did not clear the register in the hardware that has a pointer
to it.

This should be fine since we will have already cleared out every context
and we won't do anything that would cause the hardware to access it
until after we have allocated a new one, but just to be safe this patch
clears out the register when we free the page.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-12-29 15:45:44 +11:00
Ian Munsie
d6a6af2c18 cxl: Disable AFU debug flag
Upon inspection of the implementation specific registers, it was
discovered that the high bit of the implementation specific RXCTL
register was enabled, which enables the DEADB00F debug feature.

The debug feature causes MMIO reads to a disabled AFU to respond with
0xDEADB00F instead of all Fs. In general this should not be visible as
the kernel will only allow MMIO access to enabled AFUs, but there may be
some circumstances where an AFU may become disabled while it is use.
One such case would be an AFU designed to only be used in the dedicated
process mode and to disable itself after it has completed it's work
(however even in that case the effects of this debug flag would be
limited as the userspace application must have completed any required
MMIO accesses before the AFU disables itself with or without the flag).

This patch removes the debug flag and replaces the magic value
programmed into this register with a preprocessor define so it is
clearer what the rest of this initialisation does.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-12-29 15:45:43 +11:00
Ian Munsie
13da704682 cxl: Early return from cxl_handle_fault for a shut down context
If a context is being detached and we get a translation fault for it
there is little point getting it's mm and handling the fault, so just
respond with an address error and return earlier.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-12-29 15:45:43 +11:00
Ian Munsie
456295e284 cxl: Fix leaking interrupts if attach process fails
In this particular error path we have already allocated the AFU
interrupts, but have not yet set the status to STARTED. The detach
context code will only attempt to release the interrupts if the context
is in state STARTED, so in this case the interrupts would remain
allocated.

This patch releases the AFU interrupts immediately if the attach call
fails to prevent them leaking.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-12-29 15:45:43 +11:00
Ian Munsie
b123429e6a cxl: Unmap MMIO regions when detaching a context
If we need to force detach a context (e.g. due to EEH or simply force
unbinding the driver) we should prevent the userspace contexts from
being able to access the Problem State Area MMIO region further, which
they may have mapped with mmap().

This patch unmaps any mapped MMIO regions when detaching a userspace
context.

Cc: stable@vger.kernel.org
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-12-12 13:06:48 +11:00
Ian Munsie
a98e6e9f4e cxl: Add timeout to process element commands
In the event that something goes wrong in the hardware and it is unable
to complete a process element comment we would end up polling forever,
effectively making the associated process unkillable.

This patch adds a timeout to the process element command code path, so
that we will give up if the hardware does not respond in a reasonable
time.

Cc: stable@vger.kernel.org
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-12-12 13:06:47 +11:00
Ian Munsie
ee41d11d53 cxl: Change contexts_lock to a mutex to fix sleep while atomic bug
We had a known sleep while atomic bug if a CXL device was forcefully
unbound while it was in use. This could occur as a result of EEH, or
manually induced with something like this while the device was in use:

echo 0000:01:00.0 > /sys/bus/pci/drivers/cxl-pci/unbind

The issue was that in this code path we iterated over each context and
forcefully detached it with the contexts_lock spin lock held, however
the detach also needed to take the spu_mutex, and call schedule.

This patch changes the contexts_lock to a mutex so that we are not in
atomic context while doing the detach, thereby avoiding the sleep while
atomic.

Also delete the related TODO comment, which suggested an alternate
solution which turned out to not be workable.

Cc: stable@vger.kernel.org
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-12-12 13:06:47 +11:00
Aneesh Kumar K.V
aefa5688c0 powerpc/mm: don't do tlbie for updatepp request with NO HPTE fault
upatepp can get called for a nohpte fault when we find from the linux
page table that the translation was hashed before. In that case
we are sure that there is no existing translation, hence we could
avoid doing tlbie.

We could possibly race with a parallel fault filling the TLB. But
that should be ok because updatepp is only ever relaxing permissions.
We also look at linux pte permission bits when filling hash pte
permission bits. We also hold the linux pte busy bits while
inserting/updating a hashpte entry, hence a paralle update of
linux pte is not possible. On the other hand mprotect involves
ptep_modify_prot_start which cause a hpte invalidate and not updatepp.

Performance number:
We use randbox_access_bench written by Anton.

Kernel with THP disabled and smaller hash page table size.

    86.60%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_updatepp
     2.10%  random_access_b  random_access_bench              [.] doit
     1.99%  random_access_b  [kernel.kallsyms]                [k] .do_raw_spin_lock
     1.85%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_insert
     1.26%  random_access_b  [kernel.kallsyms]                [k] .native_flush_hash_range
     1.18%  random_access_b  [kernel.kallsyms]                [k] .__delay
     0.69%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_remove
     0.37%  random_access_b  [kernel.kallsyms]                [k] .clear_user_page
     0.34%  random_access_b  [kernel.kallsyms]                [k] .__hash_page_64K
     0.32%  random_access_b  [kernel.kallsyms]                [k] fast_exception_return
     0.30%  random_access_b  [kernel.kallsyms]                [k] .hash_page_mm

With Fix:

    27.54%  random_access_b  random_access_bench              [.] doit
    22.90%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_insert
     5.76%  random_access_b  [kernel.kallsyms]                [k] .native_hpte_remove
     5.20%  random_access_b  [kernel.kallsyms]                [k] fast_exception_return
     5.12%  random_access_b  [kernel.kallsyms]                [k] .__hash_page_64K
     4.80%  random_access_b  [kernel.kallsyms]                [k] .hash_page_mm
     3.31%  random_access_b  [kernel.kallsyms]                [k] data_access_common
     1.84%  random_access_b  [kernel.kallsyms]                [k] .trace_hardirqs_on_caller

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-12-05 16:26:15 +11:00
Michael Neuling
80fa93fce3 cxl: Name interrupts in /proc/interrupt
Currently all interrupts generated by cxl are named "cxl".  This is not very
informative as we can't distinguish between cards, AFUs, error interrupts, user
contexts and user interrupts numbers.  Being able to distinguish them is useful
for setting affinity.

This patch gives each of these names in /proc/interrupts.

A two card CAPI system, with afu0.0 having 2 active contexts each with 4 user
IRQs each, will now look like this:

    % grep cxl /proc/interrupts
    444:          0  OPAL ICS 141312 Level     cxl-card1-err
    445:          0  OPAL ICS 141313 Level     cxl-afu1.0-err
    446:          0  OPAL ICS 141314 Level     cxl-afu1.0
    462:          0  OPAL ICS 2052 Level     cxl-afu0.0-pe0-1
    463:      75517  OPAL ICS 2053 Level     cxl-afu0.0-pe0-2
    468:          0  OPAL ICS 2054 Level     cxl-afu0.0-pe0-3
    469:          0  OPAL ICS 2055 Level     cxl-afu0.0-pe0-4
    470:          0  OPAL ICS 2056 Level     cxl-afu0.0-pe1-1
    471:      75506  OPAL ICS 2057 Level     cxl-afu0.0-pe1-2
    472:          0  OPAL ICS 2058 Level     cxl-afu0.0-pe1-3
    473:          0  OPAL ICS 2059 Level     cxl-afu0.0-pe1-4
    502:       1066  OPAL ICS 2050 Level     cxl-afu0.0
    514:          0  OPAL ICS 2048 Level     cxl-card0-err
    515:          0  OPAL ICS 2049 Level     cxl-afu0.0-err

Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-18 13:01:39 +11:00
Ian Munsie
bc78b05bb4 cxl: Return error to PSL if IRQ demultiplexing fails & print clearer warning
If an AFU has a hardware bug that causes it to acknowledge a context
terminate or remove while that context has outstanding transactions, it
is possible for the kernel to receive an interrupt for that context
after we have removed it from the context list.

The kernel will not be able to demultiplex the interrupt (or worse - if
we have already reallocated the process handle we could mis-attribute it
to the new context), and printed a big scary warning.

It did not acknowledge the interrupt, which would effectively halt
further translation fault processing on the PSL.

This patch makes the warning clearer about the likely cause of the issue
(i.e. hardware bug) to make it obvious to future AFU designers of what
needs to be fixed. It also prints out the process handle which can then
be matched up with hardware and software traces for debugging.

It also acknowledges the interrupt to the PSL with either an address
error or acknowledge, so that the PSL can continue with other
translations.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-11-18 12:58:38 +11:00
Ian Munsie
eb01d4c238 cxl: Fix PSL error due to duplicate segment table entries
In certain circumstances the PSL (Power Service Layer, which provides
translation services for CXL hardware) can send an interrupt for a
segment miss that the kernel has already handled. This can happen if
multiple translations for the same segment are queued in the PSL before
the kernel has restarted the first translation.

The CXL driver does not expect this situation and does not check if a
segment had already been handled. This could cause a duplicate segment
table entry which in turn caused a PSL error taking down the card.

This patch fixes the issue by checking for existing entries in the
segment table that match the segment we are trying to insert, so as to
avoid inserting duplicate entries.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-28 19:52:52 +11:00
Ian Munsie
b03a7f578a cxl: Refactor cxl_load_segment() and find_free_sste()
This moves the segment table hash calculation from cxl_load_segment()
into find_free_sste() since that is the only place it is actually used.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-28 19:52:24 +11:00
Ian Munsie
5100a9d644 cxl: Disable secondary hash in segment table
This patch simplifies the process of finding a free segment table entry
by disabling the secondary hash. This reduces the number of possible
entries in the segment table for a given address from 16 to 8.

Due to the large segment sizes we use it is extremely unlikely that the
secondary hash would ever have been used in practice, so this should not
have any negative impacts and may even improve performance due to the
reduced number of comparisons that software & hardware need to perform.

This patch clears the SC bit in the hardware's state register
(CXL_PSL_SR_An) to disable the secondary hash in the hardware since we
can no longer fill out entries using it.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-28 19:52:07 +11:00
Ian Munsie
d53ba6b3bb cxl: Fix afu_read() not doing finish_wait() on signal or non-blocking
If afu_read() returned due to a signal or the AFU file descriptor being
opened non-blocking it would not call finish_wait() before returning,
which could lead to a crash later when something else wakes up the wait
queue.

This patch restructures the wait logic to ensure that the cleanup is
done correctly.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-09 11:29:57 +11:00
Ian Munsie
881632c905 cxl: Add driver to Kbuild and Makefiles
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-08 20:15:58 +11:00
Ian Munsie
f204e0b8ce cxl: Driver code for powernv PCIe based cards for userspace access
This is the core of the cxl driver.

It adds support for using cxl cards in the powernv environment only (ie POWER8
bare metal). It allows access to cxl accelerators by userspace using the
/dev/cxl/afuM.N char devices.

The kernel driver has no knowledge of the function implemented by the
accelerator. It provides services to userspace via the /dev/cxl/afuM.N
devices. When a program opens this device and runs the start work IOCTL, the
accelerator will have coherent access to that processes memory using the same
virtual addresses. That process may mmap the device to access any MMIO space
the accelerator provides.  Also, reads on the device will allow interrupts to
be received. These services are further documented in a later patch in
Documentation/powerpc/cxl.txt.

Documentation of the cxl hardware architecture and userspace API is provided in
subsequent patches.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-08 20:15:57 +11:00