Commit Graph

191 Commits

Author SHA1 Message Date
Linus Torvalds
57fa2369ab CFI on arm64 series for v5.13-rc1
- Clean up list_sort prototypes (Sami Tolvanen)
 
 - Introduce CONFIG_CFI_CLANG for arm64 (Sami Tolvanen)
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmCHCR8ACgkQiXL039xt
 wCZyFQ//fnUZaXR2K354zDyW6CJljMf+d94RF6rH+J6eMTH2/HXa5v0iJokwABLf
 ussP6qF4k5wtmI22Gm9A5Zc3e4iiry5pC0jOdk0mk4gzWwFN9MdgNxJZIGA3xqhS
 bsBK4AGrVKjtZl48G1/ZxJuNDeJhVp6GNK2n6/Gl4rZF6R7D/Upz0XelyJRdDpcM
 HIGma7jZl6xfGU0mdWCzpOGK1zdMca1WVs7A4YuurSbLn5PZJrcNVWLouDqt/Si2
 AduSri1gyPClicgvqWjMOzhUpuw/nJtBLRl1x1EsWk/KSZ1/uNVjlewfzdN4fZrr
 zbtFr2gLubYLK6JOX7/LqoHlOTgE3tYLL+WIVN75DsOGZBKgHhmebTmWLyqzV0SL
 oqcyM5d3ucC6msdtAK5Fv4MSp8rpjqlK1Ha4SGRT6kC2wut7AhZ3KD7eyRIz8mV9
 Sa9mhignGFJnTEUp+LSbYdrAudgSKxB40WyXPmswAXX4VJFRD4ONrrcAON/SzkUT
 Hw/JdFRCKkJjgwNQjIQoZcUNMTbFz2PlNIEnjJWm38YImQKQlCb2mXaZKCwBkf45
 aheCZk17eKoxTCXFMd+KxlyNEtS2yBfq/PpZgvw7GW/pfFbWUg1+2O41LnihIe5v
 zu0hN1wNCQqgfxiMZqX1OTb9C/2vybzGsXILt+9nppjZ8EBU7iU=
 =wU6U
 -----END PGP SIGNATURE-----

Merge tag 'cfi-v5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull CFI on arm64 support from Kees Cook:
 "This builds on last cycle's LTO work, and allows the arm64 kernels to
  be built with Clang's Control Flow Integrity feature. This feature has
  happily lived in Android kernels for almost 3 years[1], so I'm excited
  to have it ready for upstream.

  The wide diffstat is mainly due to the treewide fixing of mismatched
  list_sort prototypes. Other things in core kernel are to address
  various CFI corner cases. The largest code portion is the CFI runtime
  implementation itself (which will be shared by all architectures
  implementing support for CFI). The arm64 pieces are Acked by arm64
  maintainers rather than coming through the arm64 tree since carrying
  this tree over there was going to be awkward.

  CFI support for x86 is still under development, but is pretty close.
  There are a handful of corner cases on x86 that need some improvements
  to Clang and objtool, but otherwise works well.

  Summary:

   - Clean up list_sort prototypes (Sami Tolvanen)

   - Introduce CONFIG_CFI_CLANG for arm64 (Sami Tolvanen)"

* tag 'cfi-v5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  arm64: allow CONFIG_CFI_CLANG to be selected
  KVM: arm64: Disable CFI for nVHE
  arm64: ftrace: use function_nocfi for ftrace_call
  arm64: add __nocfi to __apply_alternatives
  arm64: add __nocfi to functions that jump to a physical address
  arm64: use function_nocfi with __pa_symbol
  arm64: implement function_nocfi
  psci: use function_nocfi for cpu_resume
  lkdtm: use function_nocfi
  treewide: Change list_sort to use const pointers
  bpf: disable CFI in dispatcher functions
  kallsyms: strip ThinLTO hashes from static functions
  kthread: use WARN_ON_FUNCTION_MISMATCH
  workqueue: use WARN_ON_FUNCTION_MISMATCH
  module: ensure __cfi_check alignment
  mm: add generic function_nocfi macro
  cfi: add __cficanonical
  add support for Clang CFI
2021-04-27 10:16:46 -07:00
Sami Tolvanen
4f0f586bf0 treewide: Change list_sort to use const pointers
list_sort() internally casts the comparison function passed to it
to a different type with constant struct list_head pointers, and
uses this pointer to call the functions, which trips indirect call
Control-Flow Integrity (CFI) checking.

Instead of removing the consts, this change defines the
list_cmp_func_t type and changes the comparison function types of
all list_sort() callers to use const pointers, thus avoiding type
mismatches.

Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210408182843.1754385-10-samitolvanen@google.com
2021-04-08 16:04:22 -07:00
Bob Moore
cf16b05c60 ACPICA: ACPI 6.4: NFIT: add Location Cookie field
Also, update struct size to reflect these changes in nfit core driver.

ACPICA commit af60199a9a1de9e6844929fd4cc22334522ed195

Link: https://github.com/acpica/acpica/commit/af60199a
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Erik Kaneda <erik.kaneda@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-04-07 19:09:00 +02:00
Dan Williams
637464c59e ACPI: NFIT: Fix flexible_array.cocci warnings
Julia and 0day report:

    Zero-length and one-element arrays are deprecated, see
    Documentation/process/deprecated.rst
    Flexible-array members should be used instead.

However, a straight conversion to flexible arrays yields:

    drivers/acpi/nfit/core.c:2276:4: error: flexible array member in a struct with no named members
    drivers/acpi/nfit/core.c:2287:4: error: flexible array member in a struct with no named members

Instead, just use plain arrays not embedded flexible arrays.

Cc: Denis Efremov <efremov@linux.com>
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Julia Lawall <julia.lawall@inria.fr>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-01-11 12:56:49 -08:00
Dan Williams
9a7e3d7f05 ACPI: NFIT: Fix input validation of bus-family
Dan reports that smatch thinks userspace can craft an out-of-bound bus
family number. However, nd_cmd_clear_to_send() blocks all non-zero
values of bus-family since only the kernel can initiate these commands.
However, in the speculation path, family is a user controlled array
index value so mask it for speculation safety. Also, since the
nd_cmd_clear_to_send() safety is non-obvious and possibly may change in
the future include input validation as if userspace could get past the
nd_cmd_clear_to_send() gatekeeper.

Link: http://lore.kernel.org/r/20201111113000.GA1237157@mwanda
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: 6450ddbd5d ("ACPI: NFIT: Define runtime firmware activation commands")
Cc: <stable@vger.kernel.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2020-11-23 17:43:53 -08:00
Zhen Lei
1a57b1a3e1 ACPI/nfit: avoid accessing uninitialized memory in acpi_nfit_ctl()
The ACPI_ALLOCATE() does not zero the "buf", so when the condition
"integer->type != ACPI_TYPE_INTEGER" in int_to_buf() is met, the result
is unpredictable in acpi_nfit_ctl().

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/20201118073517.1884-1-thunder.leizhen@huawei.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2020-11-18 11:46:18 -08:00
Maximilian Luz
c6237b210d ACPI: Fix whitespace inconsistencies
Replaces spaces with tabs where spaces have been (inconsistently) used
for indentation and removes trailing whitespaces.

Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-11-09 19:08:06 +01:00
Zhang Qilong
85f971b65a ACPI: NFIT: Fix comparison to '-ENXIO'
Initial value of rc is '-ENXIO', and we should
use the initial value to check it.

Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
[ rjw: Subject edit ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-10-27 19:34:56 +01:00
Rafael J. Wysocki
e4174ff78b Merge branch 'acpi-numa'
* acpi-numa:
  docs: mm: numaperf.rst Add brief description for access class 1.
  node: Add access1 class to represent CPU to memory characteristics
  ACPI: HMAT: Fix handling of changes from ACPI 6.2 to ACPI 6.3
  ACPI: Let ACPI know we support Generic Initiator Affinity Structures
  x86: Support Generic Initiator only proximity domains
  ACPI: Support Generic Initiator only domains
  ACPI / NUMA: Add stub function for pxm_to_node()
  irq-chip/gic-v3-its: Fix crash if ITS is in a proximity domain without processor or memory
  ACPI: Remove side effect of partly creating a node in acpi_get_node()
  ACPI: Rename acpi_map_pxm_to_online_node() to pxm_to_online_node()
  ACPI: Remove side effect of partly creating a node in acpi_map_pxm_to_online_node()
  ACPI: Do not create new NUMA domains from ACPI static tables that are not SRAT
  ACPI: Add out of bounds and numa_off protections to pxm_to_node()
2020-10-13 14:44:50 +02:00
Jonathan Cameron
4eb3723f18 ACPI: Rename acpi_map_pxm_to_online_node() to pxm_to_online_node()
As this function is no longer allowed to create new mappings
let us rename it to reflect this.

Note all nodes should already exist before any of the users
of this function are called.

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-09-24 12:57:38 +02:00
Jonathan Cameron
01feba590c ACPI: Do not create new NUMA domains from ACPI static tables that are not SRAT
Several ACPI static tables contain references to proximity domains.

ACPI 6.3 has clarified that only entries in SRAT may define a new
domain (sec 5.2.16).

Those tables described in the ACPI spec have additional clarifying text.

NFIT: Table 5-132,

"Integer that represents the proximity domain to which the memory
 belongs. This number must match with corresponding entry in the
 SRAT table."

HMAT: Table 5-145,

"... This number must match with the corresponding entry in the SRAT
 table's processor affinity structure ... if the initiator is a processor,
 or the Generic Initiator Affinity Structure if the initiator is a generic
 initiator".

IORT and DMAR are defined by external specifications.

Intel Virtualization Technology for Directed I/O Rev 3.1 does not make any
explicit statements, but the general SRAT statement above will still apply.
https://software.intel.com/sites/default/files/managed/c5/15/vt-directed-io-spec.pdf

IO Remapping Table, Platform Design Document rev D, also makes not explicit
statement, but refers to ACPI SRAT table for more information and again the
generic SRAT statement above applies.
https://developer.arm.com/documentation/den0049/d/

In conclusion, any proximity domain specified in these tables, should be a
reference to a proximity domain also found in SRAT, and they should not be
able to instantiate a new domain.  Hence we switch to pxm_to_node() which
will only return existing nodes.

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Barry Song <song.bao.hua@hisilicon.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-09-24 12:57:37 +02:00
Wang Qing
5f155515d3 ACPI: NFIT: Use kobj_to_dev() instead
Use kobj_to_dev() instead of container_of()

Signed-off-by: Wang Qing <wangqing@vivo.com>
Acked-by: Vishal Verma <vishal.l.verma@intel.com>
[ rjw: Subject edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-09-15 19:37:29 +02:00
Linus Torvalds
4bf5e36118 libnvdimm for 5.9
- Add 'Runtime Firmware Activation' support for NVDIMMs that advertise
   the relevant capability
 - Misc libnvdimm and DAX cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQT9vPEBxh63bwxRYEEPzq5USduLdgUCXzHodgAKCRAPzq5USduL
 djTjAQD1THDmizHn16zd94ueygh/BXfN0zyeVvQH352ol7kdfQEAj2A7YJ9XBbBY
 JC6/CNd+OiB9W88lLOUf3Waj1a7cUQ8=
 =Q6qn
 -----END PGP SIGNATURE-----

Merge tag 'libnvdimm-for-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Pull libnvdimm updayes from Vishal Verma:
 "You'd normally receive this pull request from Dan Williams, but he's
  busy watching a newborn (Congrats Dan!), so I'm watching libnvdimm
  this cycle.

  This adds a new feature in libnvdimm - 'Runtime Firmware Activation',
  and a few small cleanups and fixes in libnvdimm and DAX. I'd
  originally intended to make separate topic-based pull requests - one
  for libnvdimm, and one for DAX, but some of the DAX material fell out
  since it wasn't quite ready.

  Summary:

   - add 'Runtime Firmware Activation' support for NVDIMMs that
     advertise the relevant capability

   - misc libnvdimm and DAX cleanups"

* tag 'libnvdimm-for-5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  libnvdimm/security: ensure sysfs poll thread woke up and fetch updated attr
  libnvdimm/security: the 'security' attr never show 'overwrite' state
  libnvdimm/security: fix a typo
  ACPI: NFIT: Fix ARS zero-sized allocation
  dax: Fix incorrect argument passed to xas_set_err()
  ACPI: NFIT: Add runtime firmware activate support
  PM, libnvdimm: Add runtime firmware activation support
  libnvdimm: Convert to DEVICE_ATTR_ADMIN_RO()
  drivers/dax: Expand lock scope to cover the use of addresses
  fs/dax: Remove unused size parameter
  dax: print error message by pr_info() in __generic_fsdax_supported()
  driver-core: Introduce DEVICE_ATTR_ADMIN_{RO,RW}
  tools/testing/nvdimm: Emulate firmware activation commands
  tools/testing/nvdimm: Prepare nfit_ctl_test() for ND_CMD_CALL emulation
  tools/testing/nvdimm: Add command debug messages
  tools/testing/nvdimm: Cleanup dimm index passing
  ACPI: NFIT: Define runtime firmware activation commands
  ACPI: NFIT: Move bus_dsm_mask out of generic nvdimm_bus_descriptor
  libnvdimm: Validate command family indices
2020-08-11 10:59:19 -07:00
Dan Williams
9f1048d47e ACPI: NFIT: Fix ARS zero-sized allocation
Pending commit in -next "devres: handle zero size in devm_kmalloc()"
triggers a boot regression due to the ARS implementation expecting NULL
from a zero-sized allocation. Avoid the zero-sized allocation by
skipping ARS, otherwise crashes with the following signature when
de-referencing ZERO_SIZE_PTR.

     BUG: kernel NULL pointer dereference, address: 0000000000000018
     #PF: supervisor read access in kernel mode
     #PF: error_code(0x0000) - not-present page
     RIP: 0010:__acpi_nfit_scrub+0x28a/0x350 [nfit]
     [..]
     Call Trace:
       ? acpi_nfit_query_poison+0x6a/0x180 [nfit]
       acpi_nfit_scrub+0x36/0xb0 [nfit]
       process_one_work+0x23c/0x580
       worker_thread+0x50/0x3b0

Otherwise the implementation correctly aborts when NULL is returned from
devm_kzalloc() in ars_status_alloc().

Link: https://lore.kernel.org/r/159624590643.3037264.14157533719042907758.stgit@dwillia2-desk3.amr.corp.intel.com
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
2020-08-03 12:36:34 -06:00
Dan Williams
a1facc1fff ACPI: NFIT: Add runtime firmware activate support
Plumb the platform specific backend for the generic libnvdimm firmware
activate interface. Register dimm level operations to arm/disarm
activation, and register bus level operations to report the dynamic
platform-quiesce time relative to the number of dimms armed for firmware
activation.

A new nfit-specific bus attribute "firmware_activate_noidle" is added to
allow the activation to switch between platform enforced, and OS
opportunistic device quiesce. In other words, let the hibernate cycle
handle in-flight device-dma rather than the platform attempting to
increase PCI-E timeouts and the like.

Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
2020-07-28 19:29:22 -06:00
Alexander A. Klimov
4ce7796632 ACPI: Replace HTTP links with HTTPS ones
Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.

Deterministic algorithm:
For each file:
  If not .svg:
    For each line:
      If doesn't contain `\bxmlns\b`:
        For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
	  If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`:
            If both the HTTP and HTTPS versions
            return 200 OK and serve the same content:
              Replace HTTP with HTTPS.

Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de>
Acked-by: Vishal Verma <vishal.l.verma@intel.com>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-07-27 14:47:08 +02:00
Dan Williams
916566ae78 tools/testing/nvdimm: Emulate firmware activation commands
Augment the existing firmware update emulation to track activations and
validate proper update vs activate sequencing.

The DIMM firmware activate capability has a concept of a maximum amount
of time platform firmware will quiesce the system relative to how many
DIMMs are being activated in parallel. Simulate that DIMM activation
happens serially, 1 second per-DIMM, and limit the max at 3 seconds. The
nfit_test0 bus emulates 5 DIMMs so it will take 2 activations to update
all DIMMs.

Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Reported-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
2020-07-25 19:34:48 -06:00
Dan Williams
6450ddbd5d ACPI: NFIT: Define runtime firmware activation commands
Platform reboots are expensive. Towards reducing downtime to apply
firmware updates the Intel NVDIMM command definition is growing support
for applying live firmware updates that only require temporarily
suspending memory traffic instead of a full reboot.

Follow-on commits add support for triggering firmware activation, this
patch only defines the commands, adds probe support, and validates that
they are blocked via the ioctl path. The ioctl-path block ensures that
the OS is in charge since these commands have side effects only the OS
can handle. Specifically firmware activation may cause the memory
controller to be quiesced on the order of 100s of milliseconds. In that
case Linux ensure the activation only takes place while the OS is in a
suspend state.

Link: https://pmem.io/documents/IntelOptanePMem_DSM_Interface-V2.0.pdf
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Len Brown <lenb@kernel.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
2020-07-25 19:34:47 -06:00
Dan Williams
d46e6a2176 ACPI: NFIT: Move bus_dsm_mask out of generic nvdimm_bus_descriptor
DSMs are strictly an ACPI mechanism, evict the bus_dsm_mask concept from
the generic 'struct nvdimm_bus_descriptor' object.

As a side effect the test facility ->bus_nfit_cmd_force_en is no longer
necessary. The test infrastructure can communicate that information
directly in ->bus_dsm_mask.

Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
2020-07-25 19:34:47 -06:00
Dan Williams
92fe2aa859 libnvdimm: Validate command family indices
The ND_CMD_CALL format allows for a general passthrough of passlisted
commands targeting a given command set. However there is no validation
of the family index relative to what the bus supports.

- Update the NFIT bus implementation (the only one that supports
  ND_CMD_CALL passthrough) to also passlist the valid set of command
  family indices.

- Update the generic __nd_ioctl() path to validate that field on behalf
  of all implementations.

Fixes: 31eca76ba2 ("nfit, libnvdimm: limited/whitelisted dimm command marshaling mechanism")
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Len Brown <lenb@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
2020-07-25 19:34:47 -06:00
Linus Torvalds
d74b15dbbb libnvdimm for 5.8
- Small collection of cleanups to rework usage of ->queuedata and the
   GUID api.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEf41QbsdZzFdA8EfZHtKRamZ9iAIFAl7j+bAACgkQHtKRamZ9
 iAIIzw//UuZBJSTG58Vohb5qwXqmauVfK002HZu7sFOIc5MU8HQOvvBFoBG259nQ
 3f7ugemwJnI4nyBaeSuovvmzwjIZTy5N9QAgBxoTulHZENbsvoER2UimSDz2JPeD
 rcst2ka0uP4csRAxdrJFKYC22Uu074vWairsrmf1yRRNTcbNJFZAVmVBcExD55q8
 u6yZjH2hIU8CFGM7VbhQtynVj7q1YgmrsSMK0bq7pYAD7ciVrgWqlNgVvkr5kE8E
 RnnNpwnilxWfxtBjQoYNNFP1tvbXtiqvUz6yUjD9jZGLgJP6ad9Lrwqz+Qv/WVoK
 wwE+ZpIyAINDpof48DAvFVS0ZdgbOyHOc173aFaPa/kmwH6o1e9PZ8FzPyGVzuiF
 PfH7vs4q7Q768R87N6ltElUbX+BY/ycdtfhdpTL6ppK30GWGbV4GxU/y51T4P8QO
 dPNBPzR55QKdupjq3Jth/9Ter+DOBwe6K4QO1O1RX6nr+Znnop3I33oVHlT62Wl9
 6wgyHzKI/s0u0S4YHBbu9KrnKTBfQdqp0bQ6i9nO4fTI5m5z/H70RnpFs2AZSiOY
 XRWIrDG1GR34g7mxT/kfYfZ8EUIIOtbp6/PxoSZJX8+UsdfAK40+/odF9oJ/L8IB
 bV63Xn41TaIHCulbIK3DoWHobJ6ALYTtMb6auqblQfV47BL1FoQ=
 =Bhhc
 -----END PGP SIGNATURE-----

Merge tag 'libnvdimm-for-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Pull libnvdimm updates from Dan Williams:
 "Small collection of cleanups to rework usage of ->queuedata and the
  GUID api"

* tag 'libnvdimm-for-5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  nvdimm/pmem: stop using ->queuedata
  nvdimm/btt: stop using ->queuedata
  nvdimm/blk: stop using ->queuedata
  libnvdimm: Replace guid_copy() with import_guid() where it makes sense
2020-06-13 13:04:36 -07:00
Andy Shevchenko
f37b451f45 libnvdimm: Replace guid_copy() with import_guid() where it makes sense
There is a specific API to treat raw data as GUID, i.e. import_guid().
Use it instead of guid_copy() with explicit casting.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20200422130539.45636-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2020-04-22 16:26:03 -07:00
Tony Luck
23ba710a08 x86/mce: Fix all mce notifiers to update the mce->kflags bitmask
If the handler took any action to log or deal with the error, set a bit
in mce->kflags so that the default handler on the end of the machine
check chain can see what has been done.

Get rid of NOTIFY_STOP returns. Make the EDAC and dev-mcelog handlers
skip over errors already processed by CEC.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Tony Luck <tony.luck@intel.com>
Link: https://lkml.kernel.org/r/20200214222720.13168-5-tony.luck@intel.com
2020-04-14 15:59:26 +02:00
Dan Williams
f6d2b802f8 Merge branch 'for-5.7/libnvdimm' into libnvdimm-for-next
- Introduce 'zero_page_range' as a dax operation. This facilitates
  filesystem-dax operation without a block-device.

- Advertise a persistence-domain for of_pmem and papr_scm. The
  persistence domain indicates where cpu-store cycles need to reach in
  the platform-memory subsystem before the platform will consider them
  power-fail protected.

- Fixup some flexible-array declarations.
2020-04-02 19:55:17 -07:00
Dan Williams
91bf79bcb6 Merge branch 'for-5.6/libnvdimm-fixes' into libnvdimm-for-next
Pick up some miscellaneous minor fixes, that missed v5.6-final,
including a some smatch reports in the ioctl path and some unit test
compilation fixups.
2020-04-02 19:47:12 -07:00
Gustavo A. R. Silva
4b56640608 ACPI: NFIT: Replace zero-length array with flexible-array member
The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array member[1][2],
introduced in C99:

struct foo {
        int stuff;
        struct boo array[];
};

By making use of the mechanism above, we will get a compiler warning
in case the flexible array does not occur last in the structure, which
will help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.

Also, notice that, dynamic memory allocations won't be affected by
this change:

"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]

This issue was found with the help of Coccinelle.

[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 7649773293 ("cxgb3/l2t: Fix undefined behaviour")

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Link: https://lore.kernel.org/r/20200319195046.GA452@embeddedor.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2020-03-25 17:08:42 -07:00
Dan Williams
a0e374525d libnvdimm/region: Introduce NDD_LABELING
The NDD_ALIASING flag is used to indicate where pmem capacity might
alias with blk capacity and require labeling. It is also used to
indicate whether the DIMM supports labeling. Separate this latter
capability into its own flag so that the NDD_ALIASING flag is scoped to
true aliased configurations.

To my knowledge aliased configurations only exist in the ACPI spec,
there are no known platforms that ship this support in production.

This clarity allows namespace-capacity alignment constraints around
interleave-ways to be relaxed.

Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Oliver O'Halloran <oohall@gmail.com>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Link: https://lore.kernel.org/r/158041477856.3889308.4212605617834097674.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2020-03-17 12:23:21 -07:00
Dan Carpenter
01091c496f acpi/nfit: improve bounds checking for 'func'
The 'func' variable can come from the user in the __nd_ioctl().  If it's
too high then the (1 << func) shift in acpi_nfit_clear_to_send() is
undefined.  In acpi_nfit_ctl() we pass 'func' to test_bit(func, &dsm_mask)
which could result in an out of bounds access.

To fix these issues, I introduced the NVDIMM_CMD_MAX (31) define and
updated nfit_dsm_revid() to use that define as well instead of magic
numbers.

Fixes: 11189c1089 ("acpi/nfit: Fix command-supported detection")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/20200225161927.hvftuq7kjn547fyj@kili.mountain
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2020-02-28 18:21:52 -08:00
Dan Williams
e755799aef libnvdimm: Move nvdimm_bus_attribute_group to device_type
A 'struct device_type' instance can carry default attributes for the
device. Use this facility to remove the export of
nvdimm_bus_attribute_group and put the responsibility on the core rather
than leaf implementations to define this attribute.

Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Oliver O'Halloran" <oohall@gmail.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Link: https://lore.kernel.org/r/157309903815.1582359.6418211876315050283.stgit@dwillia2-desk3.amr.corp.intel.com
2019-11-19 09:52:12 -08:00
Dan Williams
360eba7ebd libnvdimm: Move nvdimm_attribute_group to device_type
A 'struct device_type' instance can carry default attributes for the
device. Use this facility to remove the export of
nvdimm_attribute_group and put the responsibility on the core rather
than leaf implementations to define this attribute.

Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Oliver O'Halloran" <oohall@gmail.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Link: https://lore.kernel.org/r/157309903201.1582359.10966209746585062329.stgit@dwillia2-desk3.amr.corp.intel.com
2019-11-19 09:52:12 -08:00
Dan Williams
4ce79fa97e libnvdimm: Move nd_mapping_attribute_group to device_type
A 'struct device_type' instance can carry default attributes for the
device. Use this facility to remove the export of
nd_mapping_attribute_group and put the responsibility on the core rather
than leaf implementations to define this attribute.

Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Oliver O'Halloran" <oohall@gmail.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Link: https://lore.kernel.org/r/157309902686.1582359.6749533709859492704.stgit@dwillia2-desk3.amr.corp.intel.com
2019-11-19 09:52:12 -08:00
Dan Williams
7c4fc8cde1 libnvdimm: Move nd_region_attribute_group to device_type
A 'struct device_type' instance can carry default attributes for the
device. Use this facility to remove the export of
nd_region_attribute_group and put the responsibility on the core rather
than leaf implementations to define this attribute.

Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Oliver O'Halloran" <oohall@gmail.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Link: https://lore.kernel.org/r/157309902169.1582359.16828508538444551337.stgit@dwillia2-desk3.amr.corp.intel.com
2019-11-19 09:52:12 -08:00
Dan Williams
e2f6a0e348 libnvdimm: Move nd_numa_attribute_group to device_type
A 'struct device_type' instance can carry default attributes for the
device. Use this facility to remove the export of
nd_numa_attribute_group and put the responsibility on the core rather
than leaf implementations to define this attribute.

Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Oliver O'Halloran" <oohall@gmail.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Link: https://lore.kernel.org/r/157401269537.43284.14411189404186877352.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2019-11-19 09:51:54 -08:00
Dan Williams
adbb68293f libnvdimm: Move nd_device_attribute_group to device_type
A 'struct device_type' instance can carry default attributes for the
device. Use this facility to remove the export of
nd_device_attribute_group and put the responsibility on the core rather
than leaf implementations to define this attribute.

For regions this creates a new nd_region_attribute_groups[] added to the
per-region device-type instances.

Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: "Oliver O'Halloran" <oohall@gmail.com>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Link: https://lore.kernel.org/r/157309901138.1582359.12909354140826530394.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2019-11-17 09:17:39 -08:00
Dan Carpenter
edffc70f50 ACPI: NFIT: Fix unlock on error in scrub_show()
We change the locking in this function and forgot to update this error
path so we are accidentally still holding the "dev->lockdep_mutex".

Fixes: 87a30e1f05 ("driver-core, libnvdimm: Let device subsystems add local lockdep coverage")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Cc: 5.3+ <stable@vger.kernel.org> # 5.3+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-10-22 11:37:13 +02:00
Dan Williams
d78c620a2e libnvdimm/security: Introduce a 'frozen' attribute
In the process of debugging a system with an NVDIMM that was failing to
unlock it was found that the kernel is reporting 'locked' while the DIMM
security interface is 'frozen'. Unfortunately the security state is
tracked internally as an enum which prevents it from communicating the
difference between 'locked' and 'locked + frozen'. It follows that the
enum also prevents the kernel from communicating 'unlocked + frozen'
which would be useful for debugging why security operations like 'change
passphrase' are disabled.

Ditch the security state enum for a set of flags and introduce a new
sysfs attribute explicitly for the 'frozen' state. The regression risk
is low because the 'frozen' state was already blocked behind the
'locked' state, but will need to revisit if there were cases where
applications need 'frozen' to show up in the primary 'security'
attribute. The expectation is that communicating 'frozen' is mostly a
helper for debug and status monitoring.

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reported-by: Jeff Moyer <jmoyer@redhat.com>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Link: https://lore.kernel.org/r/156686729474.184120.5835135644278860826.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2019-08-29 13:49:13 -07:00
Linus Torvalds
523634db14 libnvdimm fixes v5.3-rc2
- Fix duplicate device_unregister() calls (multiple threads competing to
   do unregister work when scheduling device removal from a sysfs attribute
   of the self-same device).
 
 - Fix badblocks registration order bug. Ensure region badblocks are
   initialized in advance of namespace registration.
 
 - Fix a deadlock between the bus lock and probe operations.
 
 - Export device-core infrastructure to coordinate async operations via
   the device ->dead state.
 
 - Add device-core infrastructure to validate device_lock() usage with
   lockdep.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJdO39XAAoJEB7SkWpmfYgCzbEQAJigRJecrz+OyICGmIAeNSy5
 hF6Cv+TPuccpnINNaULS7aJStv4Zl/3SxG5GkivKDk11Xs02VrLzv1m3nDxEOVwc
 6LwRwcM7U3UtROzI5gjfT5StgBU4xvlQYKiYV5oxAXoQ5amApqbl3NgfH3qmCaXR
 QqWhd7v7TiNZ1QWlnmRBw+j0YLbS1dHyaSAf4KZwnL6fVKmqxtfDxny5tG6jdDuq
 olPue6nFAA+ebxyAsKR9VQVmcxDwuG0bJ/GUD6IeOQp/Eh6hcv2AfcVjp4Iwn/aM
 n1dIXASFwKr6DoOXZgnUbfXMVGzq1qKHPNgzUvtK6SApZlcm+TnyIOfj0/6BNp9q
 Bae1RMRwo5Wa5oAQed3CutvUUQAPa5WrW95E0/4T+dkcutkRnxL6akn/c87qQ4nL
 F30zpL8U4UdeaJ5maEIqJ/mtAc9deHiFnO/k216+xvDcY3NGqvzY4PsUBAMep8i2
 FgoaBr0hmTkb0KTMI858ChQrT+sjqwJIa854g7b4VxrQz93WYPABRK9ZhMSBEJ8b
 rGCeNqvvq0G6dSN6e8bS6P/4EEk76nZAJUYKoMYmj3WuwYuY4Sxb86eFIudNeSEe
 EqRGaefaZrqEL6LJTHScCk+55BgYSEOrDdip1lSWGdNHjvgZeIOZrgCrqrm/H72c
 mkoCAzdA4drQ0D4ZbKrC
 =mhIp
 -----END PGP SIGNATURE-----

Merge tag 'libnvdimm-fixes-5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Pull libnvdimm fixes from Dan Williams:
 "A collection of locking and async operations fixes for v5.3-rc2. These
  had been soaking in a branch targeting the merge window, but missed
  due to a regression hunt. This fixed up version has otherwise been in
  -next this past week with no reported issues.

  In order to gain confidence in the locking changes the pull also
  includes a debug / instrumentation patch to enable lockdep coverage
  for libnvdimm subsystem operations that depend on the device_lock for
  exclusion. As mentioned in the changelog it is a hack, but it works
  and documents the locking expectations of the sub-system in a way that
  others can use lockdep to verify. The driver core touches got an ack
  from Greg.

  Summary:

   - Fix duplicate device_unregister() calls (multiple threads competing
     to do unregister work when scheduling device removal from a sysfs
     attribute of the self-same device).

   - Fix badblocks registration order bug. Ensure region badblocks are
     initialized in advance of namespace registration.

   - Fix a deadlock between the bus lock and probe operations.

   - Export device-core infrastructure to coordinate async operations
     via the device ->dead state.

   - Add device-core infrastructure to validate device_lock() usage with
     lockdep"

* tag 'libnvdimm-fixes-5.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  driver-core, libnvdimm: Let device subsystems add local lockdep coverage
  libnvdimm/bus: Fix wait_nvdimm_bus_probe_idle() ABBA deadlock
  libnvdimm/bus: Stop holding nvdimm_bus_list_mutex over __nd_ioctl()
  libnvdimm/bus: Prepare the nd_ioctl() path to be re-entrant
  libnvdimm/region: Register badblocks before namespaces
  libnvdimm/bus: Prevent duplicate device_unregister() calls
  drivers/base: Introduce kill_device()
2019-07-27 08:25:51 -07:00
Dan Williams
87a30e1f05 driver-core, libnvdimm: Let device subsystems add local lockdep coverage
For good reason, the standard device_lock() is marked
lockdep_set_novalidate_class() because there is simply no sane way to
describe the myriad ways the device_lock() ordered with other locks.
However, that leaves subsystems that know their own local device_lock()
ordering rules to find lock ordering mistakes manually. Instead,
introduce an optional / additional lockdep-enabled lock that a subsystem
can acquire in all the same paths that the device_lock() is acquired.

A conversion of the NFIT driver and NVDIMM subsystem to a
lockdep-validate device_lock() scheme is included. The
debug_nvdimm_lock() implementation implements the correct lock-class and
stacking order for the libnvdimm device topology hierarchy.

Yes, this is a hack, but hopefully it is a useful hack for other
subsystems device_lock() debug sessions. Quoting Greg:

    "Yeah, it feels a bit hacky but it's really up to a subsystem to mess up
     using it as much as anything else, so user beware :)

     I don't object to it if it makes things easier for you to debug."

Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Vishal Verma <vishal.l.verma@intel.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/156341210661.292348.7014034644265455704.stgit@dwillia2-desk3.amr.corp.intel.com
2019-07-18 16:23:27 -07:00
Pankaj Gupta
c5d4355d10 libnvdimm: nd_region flush callback support
This patch adds functionality to perform flush from guest
to host over VIRTIO. We are registering a callback based
on 'nd_region' type. virtio_pmem driver requires this special
flush function. For rest of the region types we are registering
existing flush function. Report error returned by host fsync
failure to userspace.

Signed-off-by: Pankaj Gupta <pagupta@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2019-07-05 15:19:10 -07:00
Thomas Gleixner
5b497af42f treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 295
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of version 2 of the gnu general public license as
  published by the free software foundation this program is
  distributed in the hope that it will be useful but without any
  warranty without even the implied warranty of merchantability or
  fitness for a particular purpose see the gnu general public license
  for more details

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 64 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190529141901.894819585@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-05 17:36:38 +02:00
Thomas Gleixner
ec8f24b7fa treewide: Add SPDX license identifier - Makefile/Kconfig
Add SPDX license identifiers to all Make/Kconfig files which:

 - Have no license information of any form

These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:

  GPL-2.0-only

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-21 10:50:46 +02:00
Dave Jiang
d2e5b6436c libnvdimm/security, acpi/nfit: unify zero-key for all security commands
With zero-key defined, we can remove previous detection of key id 0 or null
key in order to deal with a zero-key situation. Syncing all security
commands to use the zero-key. Helper functions are introduced to return the
data that points to the actual key payload or the zero_key. This helps
uniformly handle the key material even with zero_key.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2019-03-30 08:27:07 -07:00
Dan Williams
351f339faa acpi/nfit: Always dump _DSM output payload
The dynamic-debug statements for command payload output only get emitted
when the command is not ND_CMD_CALL. Move the output payload dumping
ahead of the early return path for ND_CMD_CALL.

Fixes: 31eca76ba2 ("...whitelisted dimm command marshaling mechanism")
Reported-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2019-03-22 16:19:18 -07:00
Linus Torvalds
f67e3fb489 device-dax for 5.1
* Replace the /sys/class/dax device model with /sys/bus/dax, and include
   a compat driver so distributions can opt-in to the new ABI.
 
 * Allow for an alternative driver for the device-dax address-range
 
 * Introduce the 'kmem' driver to hotplug / assign a device-dax
   address-range to the core-mm.
 
 * Arrange for the device-dax target-node to be onlined so that the newly
   added memory range can be uniquely referenced by numa apis.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJchWpGAAoJEB7SkWpmfYgCJk8P/0Q1DINszUDO/vKjJ09cDs9P
 Jw3it6GBIL50rDOu9QdcprSpwYDD0h1mLAV/m6oa3bVO+p4uWGvnxaxRx2HN2c/v
 vhZFtUDpHlqR63vzWMNVKRprYixCRJDUr6xQhhCcE3ak/ELN6w7LWfikKVWv15UL
 MfR96IQU38f+xRda/zSXnL9606Dvkvu/inEHj84lRcHIwj3sQAUalrE8bR3O32gZ
 bDg/l5kzT49o8ZXUo/TegvRSSSZpJmOl2DD0RW+ax5q3NI2bOXFrVDUKBKxf/hcQ
 E/V9i57TrqQx0GqRhnU7rN/v53cFZGGs31TEEIB/xs3bzCnADxwXcjL5b5K005J6
 vJjBA2ODBewHFK3uVx46Hy1iV4eCtZWj4QrMnrjdSrjXOfbF5GTbWOhPFgoq7TWf
 S7VqFEf3I2gDPaMq4o8Ej1kLH4HMYeor2NSOZjyvGn87rSZ3ZIQguwbaNIVl+itz
 gdDt0ZOU0BgOBkV+rZIeZDaGdloWCHcDPL15CkZaOZyzdWhfEZ7dod6ad+9udilU
 EUPH62RgzXZtfm5zpebYyjNVLbb9pLZ0nT+UypyGR6zqWx1SqU3mXi63NFXPco+x
 XA9j//edPeI6NHg2CXLEh8DLuCg3dG1zWRJANkiF+niBwyCR8CHtGWAoY6soXbKe
 2UrXGcIfXxyJ8V9v8v4q
 =hfa3
 -----END PGP SIGNATURE-----

Merge tag 'devdax-for-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Pull device-dax updates from Dan Williams:
 "New device-dax infrastructure to allow persistent memory and other
  "reserved" / performance differentiated memories, to be assigned to
  the core-mm as "System RAM".

  Some users want to use persistent memory as additional volatile
  memory. They are willing to cope with potential performance
  differences, for example between DRAM and 3D Xpoint, and want to use
  typical Linux memory management apis rather than a userspace memory
  allocator layered over an mmap() of a dax file. The administration
  model is to decide how much Persistent Memory (pmem) to use as System
  RAM, create a device-dax-mode namespace of that size, and then assign
  it to the core-mm. The rationale for device-dax is that it is a
  generic memory-mapping driver that can be layered over any "special
  purpose" memory, not just pmem. On subsequent boots udev rules can be
  used to restore the memory assignment.

  One implication of using pmem as RAM is that mlock() no longer keeps
  data off persistent media. For this reason it is recommended to enable
  NVDIMM Security (previously merged for 5.0) to encrypt pmem contents
  at rest. We considered making this recommendation an actively enforced
  requirement, but in the end decided to leave it as a distribution /
  administrator policy to allow for emulation and test environments that
  lack security capable NVDIMMs.

  Summary:

   - Replace the /sys/class/dax device model with /sys/bus/dax, and
     include a compat driver so distributions can opt-in to the new ABI.

   - Allow for an alternative driver for the device-dax address-range

   - Introduce the 'kmem' driver to hotplug / assign a device-dax
     address-range to the core-mm.

   - Arrange for the device-dax target-node to be onlined so that the
     newly added memory range can be uniquely referenced by numa apis"

NOTE! I'm not entirely happy with the whole "PMEM as RAM" model because
we currently have special - and very annoying rules in the kernel about
accessing PMEM only with the "MC safe" accessors, because machine checks
inside the regular repeat string copy functions can be fatal in some
(not described) circumstances.

And apparently the PMEM modules can cause that a lot more than regular
RAM.  The argument is that this happens because PMEM doesn't necessarily
get scrubbed at boot like RAM does, but that is planned to be added for
the user space tooling.

Quoting Dan from another email:
 "The exposure can be reduced in the volatile-RAM case by scanning for
  and clearing errors before it is onlined as RAM. The userspace tooling
  for that can be in place before v5.1-final. There's also runtime
  notifications of errors via acpi_nfit_uc_error_notify() from
  background scrubbers on the DIMM devices. With that mechanism the
  kernel could proactively clear newly discovered poison in the volatile
  case, but that would be additional development more suitable for v5.2.

  I understand the concern, and the need to highlight this issue by
  tapping the brakes on feature development, but I don't see PMEM as RAM
  making the situation worse when the exposure is also there via DAX in
  the PMEM case. Volatile-RAM is arguably a safer use case since it's
  possible to repair pages where the persistent case needs active
  application coordination"

* tag 'devdax-for-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  device-dax: "Hotplug" persistent memory for use like normal RAM
  mm/resource: Let walk_system_ram_range() search child resources
  mm/memory-hotplug: Allow memory resources to be children
  mm/resource: Move HMM pr_debug() deeper into resource code
  mm/resource: Return real error codes from walk failures
  device-dax: Add a 'modalias' attribute to DAX 'bus' devices
  device-dax: Add a 'target_node' attribute
  device-dax: Auto-bind device after successful new_id
  acpi/nfit, device-dax: Identify differentiated memory with a unique numa-node
  device-dax: Add /sys/class/dax backwards compatibility
  device-dax: Add support for a dax override driver
  device-dax: Move resource pinning+mapping into the common driver
  device-dax: Introduce bus + driver model
  device-dax: Start defining a dax bus model
  device-dax: Remove multi-resource infrastructure
  device-dax: Kill dax_region base
  device-dax: Kill dax_region ida
2019-03-16 13:05:32 -07:00
Dan Williams
4083014e32 Merge branch 'for-5.1/nfit/ars' into libnvdimm-for-next
Merge several updates to the ARS implementation. Highlights include:

* Support retrieval of short-ARS results if the ARS state is "requires
  continuation", and even if the "no_init_ars" module parameter is
  specified.
* Allow busy-polling of the kernel ARS state by allowing root to reset
  the exponential back-off timer.
* Filter potentially stale ARS results by tracking query-ARS relative to
  the previous start-ARS.
2019-03-11 12:37:55 -07:00
Dan Williams
451fed24e9 Merge branch 'for-5.1/libnvdimm' into libnvdimm-for-next
Merge miscellaneous libnvdimm sub-system updates for v5.1. Highlights
include:

* Support for the Hyper-V family of device-specific-methods (DSMs)
* Several fixes and workarounds for Hyper-V compatibility.
* Fix for the support to cache the dirty-shutdown-count at init.
2019-03-11 12:13:42 -07:00
Toshi Kani
5c9d62d002 acpi/nfit: Update NFIT flags error message
ACPI NFIT flags field reports major errors on NVDIMM, which need
user's attention.

Update the current log to a proper error message with dev_err().
The current message string is kept for grep-compatibility.

Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Robert Elliott <elliott@hpe.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2019-03-01 09:44:59 -08:00
Dan Williams
78153dd45e nfit/ars: Avoid stale ARS results
Gate ARS result consumption on whether the OS issued start-ARS since the
previous consumption. The BIOS may only clear its result buffers after a
successful start-ARS.

Fixes: 0caeef63e6 ("libnvdimm: Add a poison list and export badblocks")
Cc: <stable@vger.kernel.org>
Reported-by: Krzysztof Rusocki <krzysztof.rusocki@intel.com>
Reported-by: Vishal Verma <vishal.l.verma@intel.com>
Reviewed-by: Toshi Kani <toshi.kani@hpe.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2019-02-20 14:18:59 -08:00
Dan Williams
5479b2757f nfit/ars: Allow root to busy-poll the ARS state machine
The ARS implementation implements exponential back-off on the poll
interval to prevent high-frequency access to the DIMM / platform
interface. Depending on when the ARS completes the poll interval may
exceed the completion event by minutes. Allow root to reset the timeout
each time it probes the status. A one-second timeout is still enforced,
but root can otherwise can control the poll interval.

Fixes: bc6ba80858 ("nfit, address-range-scrub: rework and simplify ARS...")
Cc: <stable@vger.kernel.org>
Reported-by: Erwin Tsaur <erwin.tsaur@oracle.com>
Reviewed-by: Toshi Kani <toshi.kani@hpe.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2019-02-20 14:18:59 -08:00
Dan Williams
e34b8252a3 nfit/ars: Introduce scrub_flags
In preparation for introducing new flags to gate whether ARS results are
stale, or poll the completion state, convert the existing flags to an
unsigned long with enumerated values. This conversion allows the flags
to be atomically updated outside of ->init_mutex.

Reviewed-by: Toshi Kani <toshi.kani@hpe.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2019-02-20 14:18:59 -08:00