CXL for v6.11 merge window

New Changes:
 - Refactor to a common struct for DRAM and general media CXL events
 - Add abstract distance calculation support for CXL
 - Add CXL maturity map documentation to detail current state of CXL enabling
 - Add warning on mixed CXL VH and RCH/RCD hierachy to inform unsupported config
 - Replace ENXIO with EBUSY for inject poison limit reached via debugfs
 - Replace ENXIO with EBUSY for inject poison cxl-test support
 - XOR math fixup for DPA to SPA translation. Current math works for MODULO arithmetic
   where HPA==SPA, however not for XOR decode.
 - Move pci config read in cxl_dvsec_rr_decode() to avoid unnecessary acess
 
 Fixes:
 - Add a fix to address race condition in CXL memory hotplug notifier
 - Add missing MODULE_DESCRIPTION() for CXL modules
 - Fix incorrect vendor debug UUID define
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE5DAy15EJMCV1R6v9YGjFFmlTOEoFAmahJiMACgkQYGjFFmlT
 OEq8URAArcnzmH9yLvgE2pFOtaKg34vIGDWZGC4R1LpTnFEea04FuJslmxEKNgWo
 DJgPt9VZ66ump/oIvzcbvgLl/yMCTbnSxt5U6J8G5EmpO50PvxOTeWnEgAYVa0NH
 Diuzk/aF4GA94T3w+iAOzYx2N36kF+ezsY3/kqSORT7MC+DipSSUaPUiJcjr6FC6
 /ZIwkhhRi51ONJ8IgaXD+oEU9kxx7WUEyZoQZrJ9bv8/fGbeEfqy04pz2xDKHmLD
 rlQjm3l9um67VMsCvZ62Ce14HXqM213jZ3l0FmYjO4GbdXd2+0ZmIRNAb5vvTG9n
 5cY8vNsL6fND9FKkxlcRSdzI/O/vV+gcU+jzJxiul0p5fWHh/gaYjVH7fFq3dYc+
 vYE5lr97BfyA61bdmylIc2xwDH4yNKVQLZZPVTz5XTxfzBjYCjLPb5vGQKfg/nrB
 N66wjCIWLfCH6DqusUXem1c6BSrrjob8MwXpg00eBE0AA4ihieiy5fxuApnv9mI2
 f809AXRV1k24s5upStZ9iGZSEILBBqiw/KwDyWfRvxjNz36Z1Q2eiXBwbHrVQHBa
 PFtRPPFsZ9+ouIG/8otFaLwDQdITRdA0+drG8lmJ+gs8239Z3eIMMS0+CYdLDbva
 S8vo4POOQSS+cVUjLkC9zIxwPaXq96TLIkCtiLI9xUx5eIzv4K0=
 =HaEG
 -----END PGP SIGNATURE-----

Merge tag 'cxl-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull CXL updates from Dave Jiang:
 "Core:

   - A CXL maturity map has been added to the documentation to detail
     the current state of CXL enabling.

     It provides the status of the current state of various CXL features
     to inform current and future contributors of where things are and
     which areas need contribution.

   - A notifier handler has been added in order for a newly created CXL
     memory region to trigger the abstract distance metrics calculation.

     This should bring parity for CXL memory to the same level vs
     hotplugged DRAM for NUMA abstract distance calculation. The
     abstract distance reflects relative performance used for memory
     tiering handling.

   - An addition for XOR math has been added to address the CXL DPA to
     SPA translation.

     CXL address translation did not support address interleave math
     with XOR prior to this change.

  Fixes:

   - Fix to address race condition in the CXL memory hotplug notifier

   - Add missing MODULE_DESCRIPTION() for CXL modules

   - Fix incorrect vendor debug UUID define

  Misc:

   - A warning has been added to inform users of an unsupported
     configuration when mixing CXL VH and RCH/RCD hierarchies

   - The ENXIO error code has been replaced with EBUSY for inject poison
     limit reached via debugfs and cxl-test support

   - Moving the PCI config read in cxl_dvsec_rr_decode() to avoid
     unnecessary PCI config reads

   - A refactor to a common struct for DRAM and general media CXL
     events"

* tag 'cxl-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
  cxl/core/pci: Move reading of control register to immediately before usage
  cxl: Remove defunct code calculating host bridge target positions
  cxl/region: Verify target positions using the ordered target list
  cxl: Restore XOR'd position bits during address translation
  cxl/core: Fold cxl_trace_hpa() into cxl_dpa_to_hpa()
  cxl/test: Replace ENXIO with EBUSY for inject poison limit reached
  cxl/memdev: Replace ENXIO with EBUSY for inject poison limit reached
  cxl/acpi: Warn on mixed CXL VH and RCH/RCD Hierarchy
  cxl/core: Fix incorrect vendor debug UUID define
  Documentation: CXL Maturity Map
  cxl/region: Simplify cxl_region_nid()
  cxl/region: Support to calculate memory tier abstract distance
  cxl/region: Fix a race condition in memory hotplug notifier
  cxl: add missing MODULE_DESCRIPTION() macros
  cxl/events: Use a common struct for DRAM and General Media events
This commit is contained in:
Linus Torvalds 2024-07-28 09:33:28 -07:00
commit e62f81bbd2
19 changed files with 442 additions and 212 deletions

View File

@ -14,9 +14,10 @@ Description:
event to its internal Informational Event log, updates the
Event Status register, and if configured, interrupts the host.
It is not an error to inject poison into an address that
already has poison present and no error is returned. The
inject_poison attribute is only visible for devices supporting
the capability.
already has poison present and no error is returned. If the
device returns 'Inject Poison Limit Reached' an -EBUSY error
is returned to the user. The inject_poison attribute is only
visible for devices supporting the capability.
What: /sys/kernel/debug/memX/clear_poison

View File

@ -9,4 +9,6 @@ Compute Express Link
memory-devices
maturity-map
.. only:: subproject and html

View File

@ -0,0 +1,202 @@
.. SPDX-License-Identifier: GPL-2.0
.. include:: <isonum.txt>
===========================================
Compute Express Link Subsystem Maturity Map
===========================================
The Linux CXL subsystem tracks the dynamic `CXL specification
<https://computeexpresslink.org/cxl-specification-landing-page>`_ that
continues to respond to new use cases with new features, capability
updates and fixes. At any given point some aspects of the subsystem are
more mature than others. While the periodic pull requests summarize the
`work being incorporated each merge window
<https://lore.kernel.org/linux-cxl/?q=s%3APULL+s%3ACXL+tc%3Atorvalds+NOT+s%3ARe>`_,
those do not always convey progress relative to a starting point and a
future end goal.
What follows is a coarse breakdown of the subsystem's major
responsibilities along with a maturity score. The expectation is that
the change-history of this document provides an overview summary of the
subsystem maturation over time.
The maturity scores are:
- [3] Mature: Work in this area is complete and no changes on the horizon.
Note that this score can regress from one kernel release to the next
based on new test results or end user reports.
- [2] Stabilizing: Major functionality operational, common cases are
mature, but known corner cases are still a work in progress.
- [1] Initial: Capability that has exited the Proof of Concept phase, but
may still have significant gaps to close and fixes to apply as real
world testing occurs.
- [0] Known gap: Feature is on a medium to long term horizon to
implement. If the specification has a feature that does not even have
a '0' score in this document, there is a good chance that no one in
the linux-cxl@vger.kernel.org community has started to look at it.
- X: Out of scope for kernel enabling, or kernel enabling not required
Feature and Capabilities
========================
Enumeration / Provisioning
--------------------------
All of the fundamental enumeration an object model of the subsystem is
in place, but there are several corner cases that are pending closure.
* [2] CXL Window Enumeration
* [0] :ref:`Extended-linear memory-side cache <extended-linear>`
* [0] Low Memory-hole
* [0] Hetero-interleave
* [2] Switch Enumeration
* [0] CXL register enumeration link-up dependency
* [2] HDM Decoder Configuration
* [0] Decoder target and granularity constraints
* [2] Performance enumeration
* [3] Endpoint CDAT
* [3] Switch CDAT
* [1] CDAT to Core-mm integration
* [1] x86
* [0] Arm64
* [0] All other arch.
* [0] Shared link
* [2] Hotplug
(see CXL Window Enumeration)
* [0] Handle Soft Reserved conflicts
* [0] :ref:`RCH link status <rch-link-status>`
* [0] Fabrics / G-FAM (chapter 7)
* [0] Global Access Endpoint
RAS
---
In many ways CXL can be seen as a standardization of what would normally
be handled by custom EDAC drivers. The open development here is
mainly caused by the enumeration corner cases above.
* [3] Component events (OS)
* [2] Component events (FFM)
* [1] Endpoint protocol errors (OS)
* [1] Endpoint protocol errors (FFM)
* [0] Switch protocol errors (OS)
* [1] Switch protocol errors (FFM)
* [2] DPA->HPA Address translation
* [1] XOR Interleave translation
(see CXL Window Enumeration)
* [1] Memory Failure coordination
* [0] Scrub control
* [2] ACPI error injection EINJ
* [0] EINJ v2
* [X] Compliance DOE
* [2] Native error injection
* [3] RCH error handling
* [1] VH error handling
* [0] PPR
* [0] Sparing
* [0] Device built in test
Mailbox commands
----------------
* [3] Firmware update
* [3] Health / Alerts
* [1] :ref:`Background commands <background-commands>`
* [3] Sanitization
* [3] Security commands
* [3] RAW Command Debug Passthrough
* [0] CEL-only-validation Passthrough
* [0] Switch CCI
* [3] Timestamp
* [1] PMEM labels
* [0] PMEM GPF / Dirty Shutdown
* [0] Scan Media
PMU
---
* [1] Type 3 PMU
* [0] Switch USP/ DSP, Root Port
Security
--------
* [X] CXL Trusted Execution Environment Security Protocol (TSP)
* [X] CXL IDE (subsumed by TSP)
Memory-pooling
--------------
* [1] Hotplug of LDs (via PCI hotplug)
* [0] Dynamic Capacity Device (DCD) Support
Multi-host sharing
------------------
* [0] Hardware coherent shared memory
* [0] Software managed coherency shared memory
Multi-host memory
-----------------
* [0] Dynamic Capacity Device Support
* [0] Sharing
Accelerator
-----------
* [0] Accelerator memory enumeration HDM-D (CXL 1.1/2.0 Type-2)
* [0] Accelerator memory enumeration HDM-DB (CXL 3.0 Type-2)
* [0] CXL.cache 68b (CXL 2.0)
* [0] CXL.cache 256b Cache IDs (CXL 3.0)
User Flow Support
-----------------
* [0] HPA->DPA Address translation (need xormaps export solution)
Details
=======
.. _extended-linear:
* **Extended-linear memory-side cache**: An HMAT proposal to enumerate the presence of a
memory-side cache where the cache capacity extends the SRAT address
range capacity. `See the ECN
<https://lore.kernel.org/linux-cxl/6650e4f835a0e_195e294a8@dwillia2-mobl3.amr.corp.intel.com.notmuch/>`_
for more details:
.. _rch-link-status:
* **RCH Link Status**: RCH (Restricted CXL Host) topologies, end up
hiding some standard registers like PCIe Link Status / Capabilities in
the CXL RCRB (Root Complex Register Block).
.. _background-commands:
* **Background commands**: The CXL background command mechanism is
awkward as the single slot is monopolized potentially indefinitely by
various commands. A `cancel on conflict
<http://lore.kernel.org/r/66035c2e8ba17_770232948b@dwillia2-xfh.jf.intel.com.notmuch>`_
facility is needed to make sure the kernel can ensure forward progress
of priority commands.

View File

@ -5613,6 +5613,7 @@ M: Ira Weiny <ira.weiny@intel.com>
M: Dan Williams <dan.j.williams@intel.com>
L: linux-cxl@vger.kernel.org
S: Maintained
F: Documentation/driver-api/cxl
F: drivers/cxl/
F: include/linux/einj-cxl.h
F: include/linux/cxl-event.h

View File

@ -22,56 +22,42 @@ static const guid_t acpi_cxl_qtg_id_guid =
GUID_INIT(0xF365F9A6, 0xA7DE, 0x4071,
0xA6, 0x6A, 0xB4, 0x0C, 0x0B, 0x4F, 0x8E, 0x52);
/*
* Find a targets entry (n) in the host bridge interleave list.
* CXL Specification 3.0 Table 9-22
*/
static int cxl_xor_calc_n(u64 hpa, struct cxl_cxims_data *cximsd, int iw,
int ig)
{
int i = 0, n = 0;
u8 eiw;
/* IW: 2,4,6,8,12,16 begin building 'n' using xormaps */
if (iw != 3) {
for (i = 0; i < cximsd->nr_maps; i++)
n |= (hweight64(hpa & cximsd->xormaps[i]) & 1) << i;
}
/* IW: 3,6,12 add a modulo calculation to 'n' */
if (!is_power_of_2(iw)) {
if (ways_to_eiw(iw, &eiw))
return -1;
hpa &= GENMASK_ULL(51, eiw + ig);
n |= do_div(hpa, 3) << i;
}
return n;
}
static struct cxl_dport *cxl_hb_xor(struct cxl_root_decoder *cxlrd, int pos)
static u64 cxl_xor_hpa_to_spa(struct cxl_root_decoder *cxlrd, u64 hpa)
{
struct cxl_cxims_data *cximsd = cxlrd->platform_data;
struct cxl_switch_decoder *cxlsd = &cxlrd->cxlsd;
struct cxl_decoder *cxld = &cxlsd->cxld;
int ig = cxld->interleave_granularity;
int iw = cxld->interleave_ways;
int n = 0;
u64 hpa;
int hbiw = cxlrd->cxlsd.nr_targets;
u64 val;
int pos;
if (dev_WARN_ONCE(&cxld->dev,
cxld->interleave_ways != cxlsd->nr_targets,
"misconfigured root decoder\n"))
return NULL;
/* No xormaps for host bridge interleave ways of 1 or 3 */
if (hbiw == 1 || hbiw == 3)
return hpa;
hpa = cxlrd->res->start + pos * ig;
/*
* For root decoders using xormaps (hbiw: 2,4,6,8,12,16) restore
* the position bit to its value before the xormap was applied at
* HPA->DPA translation.
*
* pos is the lowest set bit in an XORMAP
* val is the XORALLBITS(HPA & XORMAP)
*
* XORALLBITS: The CXL spec (3.1 Table 9-22) defines XORALLBITS
* as an operation that outputs a single bit by XORing all the
* bits in the input (hpa & xormap). Implement XORALLBITS using
* hweight64(). If the hamming weight is even the XOR of those
* bits results in val==0, if odd the XOR result is val==1.
*/
/* Entry (n) is 0 for no interleave (iw == 1) */
if (iw != 1)
n = cxl_xor_calc_n(hpa, cximsd, iw, ig);
for (int i = 0; i < cximsd->nr_maps; i++) {
if (!cximsd->xormaps[i])
continue;
pos = __ffs(cximsd->xormaps[i]);
val = (hweight64(hpa & cximsd->xormaps[i]) & 1);
hpa = (hpa & ~(1ULL << pos)) | (val << pos);
}
if (n < 0)
return NULL;
return cxlrd->cxlsd.target[n];
return hpa;
}
struct cxl_cxims_context {
@ -361,7 +347,6 @@ static int __cxl_parse_cfmws(struct acpi_cedt_cfmws *cfmws,
struct cxl_port *root_port = ctx->root_port;
struct cxl_cxims_context cxims_ctx;
struct device *dev = ctx->dev;
cxl_calc_hb_fn cxl_calc_hb;
struct cxl_decoder *cxld;
unsigned int ways, i, ig;
int rc;
@ -389,13 +374,9 @@ static int __cxl_parse_cfmws(struct acpi_cedt_cfmws *cfmws,
if (rc)
return rc;
if (cfmws->interleave_arithmetic == ACPI_CEDT_CFMWS_ARITHMETIC_MODULO)
cxl_calc_hb = cxl_hb_modulo;
else
cxl_calc_hb = cxl_hb_xor;
struct cxl_root_decoder *cxlrd __free(put_cxlrd) =
cxl_root_decoder_alloc(root_port, ways, cxl_calc_hb);
cxl_root_decoder_alloc(root_port, ways);
if (IS_ERR(cxlrd))
return PTR_ERR(cxlrd);
@ -434,6 +415,9 @@ static int __cxl_parse_cfmws(struct acpi_cedt_cfmws *cfmws,
cxlrd->qos_class = cfmws->qtg_id;
if (cfmws->interleave_arithmetic == ACPI_CEDT_CFMWS_ARITHMETIC_XOR)
cxlrd->hpa_to_spa = cxl_xor_hpa_to_spa;
rc = cxl_decoder_add(cxld, target_map);
if (rc)
return rc;
@ -482,6 +466,8 @@ struct cxl_chbs_context {
unsigned long long uid;
resource_size_t base;
u32 cxl_version;
int nr_versions;
u32 saved_version;
};
static int cxl_get_chbs_iter(union acpi_subtable_headers *header, void *arg,
@ -490,22 +476,31 @@ static int cxl_get_chbs_iter(union acpi_subtable_headers *header, void *arg,
struct cxl_chbs_context *ctx = arg;
struct acpi_cedt_chbs *chbs;
if (ctx->base != CXL_RESOURCE_NONE)
return 0;
chbs = (struct acpi_cedt_chbs *) header;
if (ctx->uid != chbs->uid)
return 0;
ctx->cxl_version = chbs->cxl_version;
if (!chbs->base)
return 0;
if (chbs->cxl_version == ACPI_CEDT_CHBS_VERSION_CXL11 &&
chbs->length != CXL_RCRB_SIZE)
return 0;
if (!chbs->base)
return 0;
if (ctx->saved_version != chbs->cxl_version) {
/*
* cxl_version cannot be overwritten before the next two
* checks, then use saved_version
*/
ctx->saved_version = chbs->cxl_version;
ctx->nr_versions++;
}
if (ctx->base != CXL_RESOURCE_NONE)
return 0;
if (ctx->uid != chbs->uid)
return 0;
ctx->cxl_version = chbs->cxl_version;
ctx->base = chbs->base;
return 0;
@ -529,10 +524,19 @@ static int cxl_get_chbs(struct device *dev, struct acpi_device *hb,
.uid = uid,
.base = CXL_RESOURCE_NONE,
.cxl_version = UINT_MAX,
.saved_version = UINT_MAX,
};
acpi_table_parse_cedt(ACPI_CEDT_TYPE_CHBS, cxl_get_chbs_iter, ctx);
if (ctx->nr_versions > 1) {
/*
* Disclaim eRCD support given some component register may
* only be found via CHBCR
*/
dev_info(dev, "Unsupported platform config, mixed Virtual Host and Restricted CXL Host hierarchy.");
}
return 0;
}
@ -921,6 +925,7 @@ static void __exit cxl_acpi_exit(void)
/* load before dax_hmem sees 'Soft Reserved' CXL ranges */
subsys_initcall(cxl_acpi_init);
module_exit(cxl_acpi_exit);
MODULE_DESCRIPTION("CXL ACPI: Platform Support");
MODULE_LICENSE("GPL v2");
MODULE_IMPORT_NS(CXL);
MODULE_IMPORT_NS(ACPI);

View File

@ -28,12 +28,12 @@ int cxl_region_init(void);
void cxl_region_exit(void);
int cxl_get_poison_by_endpoint(struct cxl_port *port);
struct cxl_region *cxl_dpa_to_region(const struct cxl_memdev *cxlmd, u64 dpa);
u64 cxl_trace_hpa(struct cxl_region *cxlr, const struct cxl_memdev *cxlmd,
u64 dpa);
u64 cxl_dpa_to_hpa(struct cxl_region *cxlr, const struct cxl_memdev *cxlmd,
u64 dpa);
#else
static inline u64
cxl_trace_hpa(struct cxl_region *cxlr, const struct cxl_memdev *cxlmd, u64 dpa)
static inline u64 cxl_dpa_to_hpa(struct cxl_region *cxlr,
const struct cxl_memdev *cxlmd, u64 dpa)
{
return ULLONG_MAX;
}

View File

@ -875,10 +875,10 @@ void cxl_event_trace_record(const struct cxl_memdev *cxlmd,
guard(rwsem_read)(&cxl_region_rwsem);
guard(rwsem_read)(&cxl_dpa_rwsem);
dpa = le64_to_cpu(evt->common.phys_addr) & CXL_DPA_MASK;
dpa = le64_to_cpu(evt->media_hdr.phys_addr) & CXL_DPA_MASK;
cxlr = cxl_dpa_to_region(cxlmd, dpa);
if (cxlr)
hpa = cxl_trace_hpa(cxlr, cxlmd, dpa);
hpa = cxl_dpa_to_hpa(cxlr, cxlmd, dpa);
if (event_type == CXL_CPER_EVENT_GEN_MEDIA)
trace_cxl_general_media(cxlmd, type, cxlr, hpa,

View File

@ -338,10 +338,6 @@ int cxl_dvsec_rr_decode(struct device *dev, int d,
if (rc)
return rc;
rc = pci_read_config_word(pdev, d + CXL_DVSEC_CTRL_OFFSET, &ctrl);
if (rc)
return rc;
if (!(cap & CXL_DVSEC_MEM_CAPABLE)) {
dev_dbg(dev, "Not MEM Capable\n");
return -ENXIO;
@ -368,6 +364,10 @@ int cxl_dvsec_rr_decode(struct device *dev, int d,
* disabled, and they will remain moot after the HDM Decoder
* capability is enabled.
*/
rc = pci_read_config_word(pdev, d + CXL_DVSEC_CTRL_OFFSET, &ctrl);
if (rc)
return rc;
info->mem_enabled = FIELD_GET(CXL_DVSEC_MEM_ENABLE, ctrl);
if (!info->mem_enabled)
return 0;

View File

@ -1733,21 +1733,6 @@ static int decoder_populate_targets(struct cxl_switch_decoder *cxlsd,
return 0;
}
struct cxl_dport *cxl_hb_modulo(struct cxl_root_decoder *cxlrd, int pos)
{
struct cxl_switch_decoder *cxlsd = &cxlrd->cxlsd;
struct cxl_decoder *cxld = &cxlsd->cxld;
int iw;
iw = cxld->interleave_ways;
if (dev_WARN_ONCE(&cxld->dev, iw != cxlsd->nr_targets,
"misconfigured root decoder\n"))
return NULL;
return cxlrd->cxlsd.target[pos % iw];
}
EXPORT_SYMBOL_NS_GPL(cxl_hb_modulo, CXL);
static struct lock_class_key cxl_decoder_key;
/**
@ -1807,7 +1792,6 @@ static int cxl_switch_decoder_init(struct cxl_port *port,
* cxl_root_decoder_alloc - Allocate a root level decoder
* @port: owning CXL root of this decoder
* @nr_targets: static number of downstream targets
* @calc_hb: which host bridge covers the n'th position by granularity
*
* Return: A new cxl decoder to be registered by cxl_decoder_add(). A
* 'CXL root' decoder is one that decodes from a top-level / static platform
@ -1815,8 +1799,7 @@ static int cxl_switch_decoder_init(struct cxl_port *port,
* topology.
*/
struct cxl_root_decoder *cxl_root_decoder_alloc(struct cxl_port *port,
unsigned int nr_targets,
cxl_calc_hb_fn calc_hb)
unsigned int nr_targets)
{
struct cxl_root_decoder *cxlrd;
struct cxl_switch_decoder *cxlsd;
@ -1838,7 +1821,6 @@ struct cxl_root_decoder *cxl_root_decoder_alloc(struct cxl_port *port,
return ERR_PTR(rc);
}
cxlrd->calc_hb = calc_hb;
mutex_init(&cxlrd->range_lock);
cxld = &cxlsd->cxld;
@ -2356,5 +2338,6 @@ static void cxl_core_exit(void)
subsys_initcall(cxl_core_init);
module_exit(cxl_core_exit);
MODULE_DESCRIPTION("CXL: Core Compute Express Link support");
MODULE_LICENSE("GPL v2");
MODULE_IMPORT_NS(CXL);

View File

@ -9,6 +9,7 @@
#include <linux/uuid.h>
#include <linux/sort.h>
#include <linux/idr.h>
#include <linux/memory-tiers.h>
#include <cxlmem.h>
#include <cxl.h>
#include "core.h"
@ -1632,10 +1633,13 @@ static int cxl_region_attach_position(struct cxl_region *cxlr,
const struct cxl_dport *dport, int pos)
{
struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
struct cxl_switch_decoder *cxlsd = &cxlrd->cxlsd;
struct cxl_decoder *cxld = &cxlsd->cxld;
int iw = cxld->interleave_ways;
struct cxl_port *iter;
int rc;
if (cxlrd->calc_hb(cxlrd, pos) != dport) {
if (dport != cxlrd->cxlsd.target[pos % iw]) {
dev_dbg(&cxlr->dev, "%s:%s invalid target position for %s\n",
dev_name(&cxlmd->dev), dev_name(&cxled->cxld.dev),
dev_name(&cxlrd->cxlsd.cxld.dev));
@ -2310,6 +2314,7 @@ static void unregister_region(void *_cxlr)
int i;
unregister_memory_notifier(&cxlr->memory_notifier);
unregister_mt_adistance_algorithm(&cxlr->adist_notifier);
device_del(&cxlr->dev);
/*
@ -2386,14 +2391,23 @@ static bool cxl_region_update_coordinates(struct cxl_region *cxlr, int nid)
return true;
}
static int cxl_region_nid(struct cxl_region *cxlr)
{
struct cxl_region_params *p = &cxlr->params;
struct resource *res;
guard(rwsem_read)(&cxl_region_rwsem);
res = p->res;
if (!res)
return NUMA_NO_NODE;
return phys_to_target_node(res->start);
}
static int cxl_region_perf_attrs_callback(struct notifier_block *nb,
unsigned long action, void *arg)
{
struct cxl_region *cxlr = container_of(nb, struct cxl_region,
memory_notifier);
struct cxl_region_params *p = &cxlr->params;
struct cxl_endpoint_decoder *cxled = p->targets[0];
struct cxl_decoder *cxld = &cxled->cxld;
struct memory_notify *mnb = arg;
int nid = mnb->status_change_nid;
int region_nid;
@ -2401,7 +2415,7 @@ static int cxl_region_perf_attrs_callback(struct notifier_block *nb,
if (nid == NUMA_NO_NODE || action != MEM_ONLINE)
return NOTIFY_DONE;
region_nid = phys_to_target_node(cxld->hpa_range.start);
region_nid = cxl_region_nid(cxlr);
if (nid != region_nid)
return NOTIFY_DONE;
@ -2411,6 +2425,27 @@ static int cxl_region_perf_attrs_callback(struct notifier_block *nb,
return NOTIFY_OK;
}
static int cxl_region_calculate_adistance(struct notifier_block *nb,
unsigned long nid, void *data)
{
struct cxl_region *cxlr = container_of(nb, struct cxl_region,
adist_notifier);
struct access_coordinate *perf;
int *adist = data;
int region_nid;
region_nid = cxl_region_nid(cxlr);
if (nid != region_nid)
return NOTIFY_OK;
perf = &cxlr->coord[ACCESS_COORDINATE_CPU];
if (mt_perf_to_adistance(perf, adist))
return NOTIFY_OK;
return NOTIFY_STOP;
}
/**
* devm_cxl_add_region - Adds a region to a decoder
* @cxlrd: root decoder
@ -2453,6 +2488,10 @@ static struct cxl_region *devm_cxl_add_region(struct cxl_root_decoder *cxlrd,
cxlr->memory_notifier.priority = CXL_CALLBACK_PRI;
register_memory_notifier(&cxlr->memory_notifier);
cxlr->adist_notifier.notifier_call = cxl_region_calculate_adistance;
cxlr->adist_notifier.priority = 100;
register_mt_adistance_algorithm(&cxlr->adist_notifier);
rc = devm_add_action_or_reset(port->uport_dev, unregister_region, cxlr);
if (rc)
return ERR_PTR(rc);
@ -2816,20 +2855,13 @@ struct cxl_region *cxl_dpa_to_region(const struct cxl_memdev *cxlmd, u64 dpa)
return ctx.cxlr;
}
static bool cxl_is_hpa_in_range(u64 hpa, struct cxl_region *cxlr, int pos)
static bool cxl_is_hpa_in_chunk(u64 hpa, struct cxl_region *cxlr, int pos)
{
struct cxl_region_params *p = &cxlr->params;
int gran = p->interleave_granularity;
int ways = p->interleave_ways;
u64 offset;
/* Is the hpa within this region at all */
if (hpa < p->res->start || hpa > p->res->end) {
dev_dbg(&cxlr->dev,
"Addr trans fail: hpa 0x%llx not in region\n", hpa);
return false;
}
/* Is the hpa in an expected chunk for its pos(-ition) */
offset = hpa - p->res->start;
offset = do_div(offset, gran * ways);
@ -2842,15 +2874,26 @@ static bool cxl_is_hpa_in_range(u64 hpa, struct cxl_region *cxlr, int pos)
return false;
}
static u64 cxl_dpa_to_hpa(u64 dpa, struct cxl_region *cxlr,
struct cxl_endpoint_decoder *cxled)
u64 cxl_dpa_to_hpa(struct cxl_region *cxlr, const struct cxl_memdev *cxlmd,
u64 dpa)
{
struct cxl_root_decoder *cxlrd = to_cxl_root_decoder(cxlr->dev.parent);
u64 dpa_offset, hpa_offset, bits_upper, mask_upper, hpa;
struct cxl_region_params *p = &cxlr->params;
int pos = cxled->pos;
struct cxl_endpoint_decoder *cxled = NULL;
u16 eig = 0;
u8 eiw = 0;
int pos;
for (int i = 0; i < p->nr_targets; i++) {
cxled = p->targets[i];
if (cxlmd == cxled_to_memdev(cxled))
break;
}
if (!cxled || cxlmd != cxled_to_memdev(cxled))
return ULLONG_MAX;
pos = cxled->pos;
ways_to_eiw(p->interleave_ways, &eiw);
granularity_to_eig(p->interleave_granularity, &eig);
@ -2884,29 +2927,23 @@ static u64 cxl_dpa_to_hpa(u64 dpa, struct cxl_region *cxlr,
/* Apply the hpa_offset to the region base address */
hpa = hpa_offset + p->res->start;
if (!cxl_is_hpa_in_range(hpa, cxlr, cxled->pos))
/* Root decoder translation overrides typical modulo decode */
if (cxlrd->hpa_to_spa)
hpa = cxlrd->hpa_to_spa(cxlrd, hpa);
if (hpa < p->res->start || hpa > p->res->end) {
dev_dbg(&cxlr->dev,
"Addr trans fail: hpa 0x%llx not in region\n", hpa);
return ULLONG_MAX;
}
/* Simple chunk check, by pos & gran, only applies to modulo decodes */
if (!cxlrd->hpa_to_spa && (!cxl_is_hpa_in_chunk(hpa, cxlr, pos)))
return ULLONG_MAX;
return hpa;
}
u64 cxl_trace_hpa(struct cxl_region *cxlr, const struct cxl_memdev *cxlmd,
u64 dpa)
{
struct cxl_region_params *p = &cxlr->params;
struct cxl_endpoint_decoder *cxled = NULL;
for (int i = 0; i < p->nr_targets; i++) {
cxled = p->targets[i];
if (cxlmd == cxled_to_memdev(cxled))
break;
}
if (!cxled || cxlmd != cxled_to_memdev(cxled))
return ULLONG_MAX;
return cxl_dpa_to_hpa(dpa, cxlr, cxled);
}
static struct lock_class_key cxl_pmem_region_key;
static int cxl_pmem_region_alloc(struct cxl_region *cxlr)

View File

@ -340,23 +340,23 @@ TRACE_EVENT(cxl_general_media,
),
TP_fast_assign(
CXL_EVT_TP_fast_assign(cxlmd, log, rec->hdr);
CXL_EVT_TP_fast_assign(cxlmd, log, rec->media_hdr.hdr);
__entry->hdr_uuid = CXL_EVENT_GEN_MEDIA_UUID;
/* General Media */
__entry->dpa = le64_to_cpu(rec->phys_addr);
__entry->dpa = le64_to_cpu(rec->media_hdr.phys_addr);
__entry->dpa_flags = __entry->dpa & CXL_DPA_FLAGS_MASK;
/* Mask after flags have been parsed */
__entry->dpa &= CXL_DPA_MASK;
__entry->descriptor = rec->descriptor;
__entry->type = rec->type;
__entry->transaction_type = rec->transaction_type;
__entry->channel = rec->channel;
__entry->rank = rec->rank;
__entry->descriptor = rec->media_hdr.descriptor;
__entry->type = rec->media_hdr.type;
__entry->transaction_type = rec->media_hdr.transaction_type;
__entry->channel = rec->media_hdr.channel;
__entry->rank = rec->media_hdr.rank;
__entry->device = get_unaligned_le24(rec->device);
memcpy(__entry->comp_id, &rec->component_id,
CXL_EVENT_GEN_MED_COMP_ID_SIZE);
__entry->validity_flags = get_unaligned_le16(&rec->validity_flags);
__entry->validity_flags = get_unaligned_le16(&rec->media_hdr.validity_flags);
__entry->hpa = hpa;
if (cxlr) {
__assign_str(region_name);
@ -440,19 +440,19 @@ TRACE_EVENT(cxl_dram,
),
TP_fast_assign(
CXL_EVT_TP_fast_assign(cxlmd, log, rec->hdr);
CXL_EVT_TP_fast_assign(cxlmd, log, rec->media_hdr.hdr);
__entry->hdr_uuid = CXL_EVENT_DRAM_UUID;
/* DRAM */
__entry->dpa = le64_to_cpu(rec->phys_addr);
__entry->dpa = le64_to_cpu(rec->media_hdr.phys_addr);
__entry->dpa_flags = __entry->dpa & CXL_DPA_FLAGS_MASK;
__entry->dpa &= CXL_DPA_MASK;
__entry->descriptor = rec->descriptor;
__entry->type = rec->type;
__entry->transaction_type = rec->transaction_type;
__entry->validity_flags = get_unaligned_le16(rec->validity_flags);
__entry->channel = rec->channel;
__entry->rank = rec->rank;
__entry->descriptor = rec->media_hdr.descriptor;
__entry->type = rec->media_hdr.type;
__entry->transaction_type = rec->media_hdr.transaction_type;
__entry->validity_flags = get_unaligned_le16(rec->media_hdr.validity_flags);
__entry->channel = rec->media_hdr.channel;
__entry->rank = rec->media_hdr.rank;
__entry->nibble_mask = get_unaligned_le24(rec->nibble_mask);
__entry->bank_group = rec->bank_group;
__entry->bank = rec->bank;
@ -704,8 +704,8 @@ TRACE_EVENT(cxl_poison,
if (cxlr) {
__assign_str(region);
memcpy(__entry->uuid, &cxlr->params.uuid, 16);
__entry->hpa = cxl_trace_hpa(cxlr, cxlmd,
__entry->dpa);
__entry->hpa = cxl_dpa_to_hpa(cxlr, cxlmd,
__entry->dpa);
} else {
__assign_str(region);
memset(__entry->uuid, 0, 16);

View File

@ -434,14 +434,13 @@ struct cxl_switch_decoder {
};
struct cxl_root_decoder;
typedef struct cxl_dport *(*cxl_calc_hb_fn)(struct cxl_root_decoder *cxlrd,
int pos);
typedef u64 (*cxl_hpa_to_spa_fn)(struct cxl_root_decoder *cxlrd, u64 hpa);
/**
* struct cxl_root_decoder - Static platform CXL address decoder
* @res: host / parent resource for region allocations
* @region_id: region id for next region provisioning event
* @calc_hb: which host bridge covers the n'th position by granularity
* @hpa_to_spa: translate CXL host-physical-address to Platform system-physical-address
* @platform_data: platform specific configuration data
* @range_lock: sync region autodiscovery by address range
* @qos_class: QoS performance class cookie
@ -450,7 +449,7 @@ typedef struct cxl_dport *(*cxl_calc_hb_fn)(struct cxl_root_decoder *cxlrd,
struct cxl_root_decoder {
struct resource *res;
atomic_t region_id;
cxl_calc_hb_fn calc_hb;
cxl_hpa_to_spa_fn hpa_to_spa;
void *platform_data;
struct mutex range_lock;
int qos_class;
@ -524,6 +523,7 @@ struct cxl_region_params {
* @params: active + config params for the region
* @coord: QoS access coordinates for the region
* @memory_notifier: notifier for setting the access coordinates to node
* @adist_notifier: notifier for calculating the abstract distance of node
*/
struct cxl_region {
struct device dev;
@ -536,6 +536,7 @@ struct cxl_region {
struct cxl_region_params params;
struct access_coordinate coord[ACCESS_COORDINATE_MAX];
struct notifier_block memory_notifier;
struct notifier_block adist_notifier;
};
struct cxl_nvdimm_bridge {
@ -774,9 +775,7 @@ bool is_root_decoder(struct device *dev);
bool is_switch_decoder(struct device *dev);
bool is_endpoint_decoder(struct device *dev);
struct cxl_root_decoder *cxl_root_decoder_alloc(struct cxl_port *port,
unsigned int nr_targets,
cxl_calc_hb_fn calc_hb);
struct cxl_dport *cxl_hb_modulo(struct cxl_root_decoder *cxlrd, int pos);
unsigned int nr_targets);
struct cxl_switch_decoder *cxl_switch_decoder_alloc(struct cxl_port *port,
unsigned int nr_targets);
int cxl_decoder_add(struct cxl_decoder *cxld, int *target_map);

View File

@ -161,7 +161,7 @@ struct cxl_mbox_cmd {
C(FWRESET, -ENXIO, "FW failed to activate, needs cold reset"), \
C(HANDLE, -ENXIO, "one or more Event Record Handles were invalid"), \
C(PADDR, -EFAULT, "physical address specified is invalid"), \
C(POISONLMT, -ENXIO, "poison injection limit has been reached"), \
C(POISONLMT, -EBUSY, "poison injection limit has been reached"), \
C(MEDIAFAILURE, -ENXIO, "permanent issue with the media"), \
C(ABORT, -ENXIO, "background cmd was aborted by device"), \
C(SECURITY, -ENXIO, "not valid in the current security state"), \
@ -563,7 +563,7 @@ enum cxl_opcode {
0x3b, 0x3f, 0x17)
#define DEFINE_CXL_VENDOR_DEBUG_UUID \
UUID_INIT(0xe1819d9, 0x11a9, 0x400c, 0x81, 0x1f, 0xd6, 0x07, 0x19, \
UUID_INIT(0x5e1819d9, 0x11a9, 0x400c, 0x81, 0x1f, 0xd6, 0x07, 0x19, \
0x40, 0x3d, 0x86)
struct cxl_mbox_get_supported_logs {

View File

@ -253,6 +253,7 @@ static struct cxl_driver cxl_mem_driver = {
module_cxl_driver(cxl_mem_driver);
MODULE_DESCRIPTION("CXL: Memory Expansion");
MODULE_LICENSE("GPL v2");
MODULE_IMPORT_NS(CXL);
MODULE_ALIAS_CXL(CXL_DEVICE_MEMORY_EXPANDER);

View File

@ -1066,5 +1066,6 @@ static void __exit cxl_pci_driver_exit(void)
module_init(cxl_pci_driver_init);
module_exit(cxl_pci_driver_exit);
MODULE_DESCRIPTION("CXL: PCI manageability");
MODULE_LICENSE("GPL v2");
MODULE_IMPORT_NS(CXL);

View File

@ -453,6 +453,7 @@ static __exit void cxl_pmem_exit(void)
cxl_driver_unregister(&cxl_nvdimm_bridge_driver);
}
MODULE_DESCRIPTION("CXL PMEM: Persistent Memory Support");
MODULE_LICENSE("GPL v2");
module_init(cxl_pmem_init);
module_exit(cxl_pmem_exit);

View File

@ -209,6 +209,7 @@ static struct cxl_driver cxl_port_driver = {
};
module_cxl_driver(cxl_port_driver);
MODULE_DESCRIPTION("CXL: Port enumeration and services");
MODULE_LICENSE("GPL v2");
MODULE_IMPORT_NS(CXL);
MODULE_ALIAS_CXL(CXL_DEVICE_PORT);

View File

@ -21,6 +21,21 @@ struct cxl_event_record_hdr {
u8 reserved[15];
} __packed;
struct cxl_event_media_hdr {
struct cxl_event_record_hdr hdr;
__le64 phys_addr;
u8 descriptor;
u8 type;
u8 transaction_type;
/*
* The meaning of Validity Flags from bit 2 is
* different across DRAM and General Media records
*/
u8 validity_flags[2];
u8 channel;
u8 rank;
} __packed;
#define CXL_EVENT_RECORD_DATA_LENGTH 0x50
struct cxl_event_generic {
struct cxl_event_record_hdr hdr;
@ -33,14 +48,7 @@ struct cxl_event_generic {
*/
#define CXL_EVENT_GEN_MED_COMP_ID_SIZE 0x10
struct cxl_event_gen_media {
struct cxl_event_record_hdr hdr;
__le64 phys_addr;
u8 descriptor;
u8 type;
u8 transaction_type;
u8 validity_flags[2];
u8 channel;
u8 rank;
struct cxl_event_media_hdr media_hdr;
u8 device[3];
u8 component_id[CXL_EVENT_GEN_MED_COMP_ID_SIZE];
u8 reserved[46];
@ -52,14 +60,7 @@ struct cxl_event_gen_media {
*/
#define CXL_EVENT_DER_CORRECTION_MASK_SIZE 0x20
struct cxl_event_dram {
struct cxl_event_record_hdr hdr;
__le64 phys_addr;
u8 descriptor;
u8 type;
u8 transaction_type;
u8 validity_flags[2];
u8 channel;
u8 rank;
struct cxl_event_media_hdr media_hdr;
u8 nibble_mask[3];
u8 bank_group;
u8 bank;
@ -95,21 +96,13 @@ struct cxl_event_mem_module {
u8 reserved[0x3d];
} __packed;
/*
* General Media or DRAM Event Common Fields
* - provides common access to phys_addr
*/
struct cxl_event_common {
struct cxl_event_record_hdr hdr;
__le64 phys_addr;
} __packed;
union cxl_event {
struct cxl_event_generic generic;
struct cxl_event_gen_media gen_media;
struct cxl_event_dram dram;
struct cxl_event_mem_module mem_module;
struct cxl_event_common common;
/* dram & gen_media event header */
struct cxl_event_media_hdr media_hdr;
} __packed;
/*

View File

@ -385,19 +385,21 @@ struct cxl_test_gen_media {
struct cxl_test_gen_media gen_media = {
.id = CXL_EVENT_GEN_MEDIA_UUID,
.rec = {
.hdr = {
.length = sizeof(struct cxl_test_gen_media),
.flags[0] = CXL_EVENT_RECORD_FLAG_PERMANENT,
/* .handle = Set dynamically */
.related_handle = cpu_to_le16(0),
.media_hdr = {
.hdr = {
.length = sizeof(struct cxl_test_gen_media),
.flags[0] = CXL_EVENT_RECORD_FLAG_PERMANENT,
/* .handle = Set dynamically */
.related_handle = cpu_to_le16(0),
},
.phys_addr = cpu_to_le64(0x2000),
.descriptor = CXL_GMER_EVT_DESC_UNCORECTABLE_EVENT,
.type = CXL_GMER_MEM_EVT_TYPE_DATA_PATH_ERROR,
.transaction_type = CXL_GMER_TRANS_HOST_WRITE,
/* .validity_flags = <set below> */
.channel = 1,
.rank = 30,
},
.phys_addr = cpu_to_le64(0x2000),
.descriptor = CXL_GMER_EVT_DESC_UNCORECTABLE_EVENT,
.type = CXL_GMER_MEM_EVT_TYPE_DATA_PATH_ERROR,
.transaction_type = CXL_GMER_TRANS_HOST_WRITE,
/* .validity_flags = <set below> */
.channel = 1,
.rank = 30
},
};
@ -409,18 +411,20 @@ struct cxl_test_dram {
struct cxl_test_dram dram = {
.id = CXL_EVENT_DRAM_UUID,
.rec = {
.hdr = {
.length = sizeof(struct cxl_test_dram),
.flags[0] = CXL_EVENT_RECORD_FLAG_PERF_DEGRADED,
/* .handle = Set dynamically */
.related_handle = cpu_to_le16(0),
.media_hdr = {
.hdr = {
.length = sizeof(struct cxl_test_dram),
.flags[0] = CXL_EVENT_RECORD_FLAG_PERF_DEGRADED,
/* .handle = Set dynamically */
.related_handle = cpu_to_le16(0),
},
.phys_addr = cpu_to_le64(0x8000),
.descriptor = CXL_GMER_EVT_DESC_THRESHOLD_EVENT,
.type = CXL_GMER_MEM_EVT_TYPE_INV_ADDR,
.transaction_type = CXL_GMER_TRANS_INTERNAL_MEDIA_SCRUB,
/* .validity_flags = <set below> */
.channel = 1,
},
.phys_addr = cpu_to_le64(0x8000),
.descriptor = CXL_GMER_EVT_DESC_THRESHOLD_EVENT,
.type = CXL_GMER_MEM_EVT_TYPE_INV_ADDR,
.transaction_type = CXL_GMER_TRANS_INTERNAL_MEDIA_SCRUB,
/* .validity_flags = <set below> */
.channel = 1,
.bank_group = 5,
.bank = 2,
.column = {0xDE, 0xAD},
@ -474,11 +478,11 @@ static int mock_set_timestamp(struct cxl_dev_state *cxlds,
static void cxl_mock_add_event_logs(struct mock_event_store *mes)
{
put_unaligned_le16(CXL_GMER_VALID_CHANNEL | CXL_GMER_VALID_RANK,
&gen_media.rec.validity_flags);
&gen_media.rec.media_hdr.validity_flags);
put_unaligned_le16(CXL_DER_VALID_CHANNEL | CXL_DER_VALID_BANK_GROUP |
CXL_DER_VALID_BANK | CXL_DER_VALID_COLUMN,
&dram.rec.validity_flags);
&dram.rec.media_hdr.validity_flags);
mes_add_event(mes, CXL_EVENT_TYPE_INFO, &maint_needed);
mes_add_event(mes, CXL_EVENT_TYPE_INFO,
@ -1131,27 +1135,28 @@ static bool mock_poison_dev_max_injected(struct cxl_dev_state *cxlds)
return (count >= poison_inject_dev_max);
}
static bool mock_poison_add(struct cxl_dev_state *cxlds, u64 dpa)
static int mock_poison_add(struct cxl_dev_state *cxlds, u64 dpa)
{
/* Return EBUSY to match the CXL driver handling */
if (mock_poison_dev_max_injected(cxlds)) {
dev_dbg(cxlds->dev,
"Device poison injection limit has been reached: %d\n",
MOCK_INJECT_DEV_MAX);
return false;
poison_inject_dev_max);
return -EBUSY;
}
for (int i = 0; i < MOCK_INJECT_TEST_MAX; i++) {
if (!mock_poison_list[i].cxlds) {
mock_poison_list[i].cxlds = cxlds;
mock_poison_list[i].dpa = dpa;
return true;
return 0;
}
}
dev_dbg(cxlds->dev,
"Mock test poison injection limit has been reached: %d\n",
MOCK_INJECT_TEST_MAX);
return false;
return -ENXIO;
}
static bool mock_poison_found(struct cxl_dev_state *cxlds, u64 dpa)
@ -1175,10 +1180,8 @@ static int mock_inject_poison(struct cxl_dev_state *cxlds,
dev_dbg(cxlds->dev, "DPA: 0x%llx already poisoned\n", dpa);
return 0;
}
if (!mock_poison_add(cxlds, dpa))
return -ENXIO;
return 0;
return mock_poison_add(cxlds, dpa);
}
static bool mock_poison_del(struct cxl_dev_state *cxlds, u64 dpa)