mirror of
https://github.com/torvalds/linux.git
synced 2024-11-10 22:21:40 +00:00
VFIO updates for v5.18-rc1
- Introduce new device migration uAPI and implement device specific mlx5 vfio-pci variant driver supporting new protocol (Jason Gunthorpe, Yishai Hadas, Leon Romanovsky) - New HiSilicon acc vfio-pci variant driver, also supporting migration interface (Shameer Kolothum, Longfang Liu) - D3hot fixes for vfio-pci-core (Abhishek Sahu) - Document new vfio-pci variant driver acceptance criteria (Alex Williamson) - Fix UML build unresolved ioport_{un}map() functions (Alex Williamson) - Fix MAINTAINERS due to header movement (Lukas Bulwahn) -----BEGIN PGP SIGNATURE----- iQJPBAABCAA5FiEEQvbATlQL0amee4qQI5ubbjuwiyIFAmI6HGwbHGFsZXgud2ls bGlhbXNvbkByZWRoYXQuY29tAAoJECObm247sIsiyxcP/18Mh4eYJudvqU7ARH/H 8E2V+5YhkdVG088KZcB/sTEfVKAbROZrJ3zxkZMXU/OU2cYELHG2pgaI8yCMzHJK krz+kZ2p+nA/AMKp8V0xB0MCspTpX/3/6zHV2wDals+gTTLH34N0r6swh0wCjoSa wN+3ahE+c6KkX41H8X2Dup5YVM4ohg8MbCd3jSIFBrRDj6SMRGr7zytezCdLhnVs TwadlReOYSqKsuvcVnHObWbsOj5WCmuld2u9j0kTPknRm6VtxkfNFQTpKk3sbAcO SaPwDP0485plwCVZkNJELZVaF+qYIFW5WZLD5wlJNoH/mZE68a5BKbYFKSLt1gs3 ntYdktcmsBLVQxTNxcZ6/gwEV2/wuY6v7C3cm0jT0AqXgPIdOqrwlzafTwP+Z/KU TC9x4EzPPvdsnBCut0XJZg4QUNlJ7Cp+62vxXqhLGPA2cd4tjGO/8B1KOm05B7VQ 2XiDtlsW7pwx4v6jRPPdvoqUMd5qqjKF9RepTktirUSXv8z6NIjSyzGn3HZLrk6f 7AHnlltUg56y/c6hmLxe25PrXKpGqO1fFIcuPYpC+IbBHrE4NVqOhi3ieoonO5GZ nwe6IT/fLxsLOudUG/dJ3swuoE8o2Glf17rV9e53K8zF9J9LoFJQsqSFbUzR17pD NGN+nA8dWFmmLDS4uYiY9WBg =Sv96 -----END PGP SIGNATURE----- Merge tag 'vfio-v5.18-rc1' of https://github.com/awilliam/linux-vfio Pull VFIO updates from Alex Williamson: - Introduce new device migration uAPI and implement device specific mlx5 vfio-pci variant driver supporting new protocol (Jason Gunthorpe, Yishai Hadas, Leon Romanovsky) - New HiSilicon acc vfio-pci variant driver, also supporting migration interface (Shameer Kolothum, Longfang Liu) - D3hot fixes for vfio-pci-core (Abhishek Sahu) - Document new vfio-pci variant driver acceptance criteria (Alex Williamson) - Fix UML build unresolved ioport_{un}map() functions (Alex Williamson) - Fix MAINTAINERS due to header movement (Lukas Bulwahn) * tag 'vfio-v5.18-rc1' of https://github.com/awilliam/linux-vfio: (31 commits) vfio-pci: Provide reviewers and acceptance criteria for variant drivers MAINTAINERS: adjust entry for header movement in hisilicon qm driver hisi_acc_vfio_pci: Use its own PCI reset_done error handler hisi_acc_vfio_pci: Add support for VFIO live migration crypto: hisilicon/qm: Set the VF QM state register hisi_acc_vfio_pci: Add helper to retrieve the struct pci_driver hisi_acc_vfio_pci: Restrict access to VF dev BAR2 migration region hisi_acc_vfio_pci: add new vfio_pci driver for HiSilicon ACC devices hisi_acc_qm: Move VF PCI device IDs to common header crypto: hisilicon/qm: Move few definitions to common header crypto: hisilicon/qm: Move the QM header to include/linux vfio/mlx5: Fix to not use 0 as NULL pointer PCI/IOV: Fix wrong kernel-doc identifier vfio/mlx5: Use its own PCI reset_done error handler vfio/pci: Expose vfio_pci_core_aer_err_detected() vfio/mlx5: Implement vfio_pci driver for mlx5 devices vfio/mlx5: Expose migration commands over mlx5 device vfio: Remove migration protocol v1 documentation vfio: Extend the device migration protocol with RUNNING_P2P vfio: Define device migration protocol v2 ...
This commit is contained in:
commit
7403e6d826
@ -103,6 +103,7 @@ available subsections can be seen below.
|
||||
sync_file
|
||||
vfio-mediated-device
|
||||
vfio
|
||||
vfio-pci-device-specific-driver-acceptance
|
||||
xilinx/index
|
||||
xillybus
|
||||
zorro
|
||||
|
@ -0,0 +1,35 @@
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
|
||||
Acceptance criteria for vfio-pci device specific driver variants
|
||||
================================================================
|
||||
|
||||
Overview
|
||||
--------
|
||||
The vfio-pci driver exists as a device agnostic driver using the
|
||||
system IOMMU and relying on the robustness of platform fault
|
||||
handling to provide isolated device access to userspace. While the
|
||||
vfio-pci driver does include some device specific support, further
|
||||
extensions for yet more advanced device specific features are not
|
||||
sustainable. The vfio-pci driver has therefore split out
|
||||
vfio-pci-core as a library that may be reused to implement features
|
||||
requiring device specific knowledge, ex. saving and loading device
|
||||
state for the purposes of supporting migration.
|
||||
|
||||
In support of such features, it's expected that some device specific
|
||||
variants may interact with parent devices (ex. SR-IOV PF in support of
|
||||
a user assigned VF) or other extensions that may not be otherwise
|
||||
accessible via the vfio-pci base driver. Authors of such drivers
|
||||
should be diligent not to create exploitable interfaces via these
|
||||
interactions or allow unchecked userspace data to have an effect
|
||||
beyond the scope of the assigned device.
|
||||
|
||||
New driver submissions are therefore requested to have approval via
|
||||
sign-off/ack/review/etc for any interactions with parent drivers.
|
||||
Additionally, drivers should make an attempt to provide sufficient
|
||||
documentation for reviewers to understand the device specific
|
||||
extensions, for example in the case of migration data, how is the
|
||||
device state composed and consumed, which portions are not otherwise
|
||||
available to the user via vfio-pci, what safeguards exist to validate
|
||||
the data, etc. To that extent, authors should additionally expect to
|
||||
require reviews from at least one of the listed reviewers, in addition
|
||||
to the overall vfio maintainer.
|
@ -103,3 +103,4 @@ to do something different in the near future.
|
||||
../nvdimm/maintainer-entry-profile
|
||||
../riscv/patch-acceptance
|
||||
../driver-api/media/maintainer-entry-profile
|
||||
../driver-api/vfio-pci-device-specific-driver-acceptance
|
||||
|
25
MAINTAINERS
25
MAINTAINERS
@ -8722,9 +8722,9 @@ L: linux-crypto@vger.kernel.org
|
||||
S: Maintained
|
||||
F: Documentation/ABI/testing/debugfs-hisi-zip
|
||||
F: drivers/crypto/hisilicon/qm.c
|
||||
F: drivers/crypto/hisilicon/qm.h
|
||||
F: drivers/crypto/hisilicon/sgl.c
|
||||
F: drivers/crypto/hisilicon/zip/
|
||||
F: include/linux/hisi_acc_qm.h
|
||||
|
||||
HISILICON ROCE DRIVER
|
||||
M: Wenpeng Liang <liangwenpeng@huawei.com>
|
||||
@ -20399,6 +20399,13 @@ L: kvm@vger.kernel.org
|
||||
S: Maintained
|
||||
F: drivers/vfio/fsl-mc/
|
||||
|
||||
VFIO HISILICON PCI DRIVER
|
||||
M: Longfang Liu <liulongfang@huawei.com>
|
||||
M: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
|
||||
L: kvm@vger.kernel.org
|
||||
S: Maintained
|
||||
F: drivers/vfio/pci/hisilicon/
|
||||
|
||||
VFIO MEDIATED DEVICE DRIVERS
|
||||
M: Kirti Wankhede <kwankhede@nvidia.com>
|
||||
L: kvm@vger.kernel.org
|
||||
@ -20408,12 +20415,28 @@ F: drivers/vfio/mdev/
|
||||
F: include/linux/mdev.h
|
||||
F: samples/vfio-mdev/
|
||||
|
||||
VFIO PCI DEVICE SPECIFIC DRIVERS
|
||||
R: Jason Gunthorpe <jgg@nvidia.com>
|
||||
R: Yishai Hadas <yishaih@nvidia.com>
|
||||
R: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
|
||||
R: Kevin Tian <kevin.tian@intel.com>
|
||||
L: kvm@vger.kernel.org
|
||||
S: Maintained
|
||||
P: Documentation/driver-api/vfio-pci-device-specific-driver-acceptance.rst
|
||||
F: drivers/vfio/pci/*/
|
||||
|
||||
VFIO PLATFORM DRIVER
|
||||
M: Eric Auger <eric.auger@redhat.com>
|
||||
L: kvm@vger.kernel.org
|
||||
S: Maintained
|
||||
F: drivers/vfio/platform/
|
||||
|
||||
VFIO MLX5 PCI DRIVER
|
||||
M: Yishai Hadas <yishaih@nvidia.com>
|
||||
L: kvm@vger.kernel.org
|
||||
S: Maintained
|
||||
F: drivers/vfio/pci/mlx5/
|
||||
|
||||
VGA_SWITCHEROO
|
||||
R: Lukas Wunner <lukas@wunner.de>
|
||||
S: Maintained
|
||||
|
@ -4,7 +4,7 @@
|
||||
#define __HISI_HPRE_H
|
||||
|
||||
#include <linux/list.h>
|
||||
#include "../qm.h"
|
||||
#include <linux/hisi_acc_qm.h>
|
||||
|
||||
#define HPRE_SQE_SIZE sizeof(struct hpre_sqe)
|
||||
#define HPRE_PF_DEF_Q_NUM 64
|
||||
|
@ -68,8 +68,7 @@
|
||||
#define HPRE_REG_RD_INTVRL_US 10
|
||||
#define HPRE_REG_RD_TMOUT_US 1000
|
||||
#define HPRE_DBGFS_VAL_MAX_LEN 20
|
||||
#define HPRE_PCI_DEVICE_ID 0xa258
|
||||
#define HPRE_PCI_VF_DEVICE_ID 0xa259
|
||||
#define PCI_DEVICE_ID_HUAWEI_HPRE_PF 0xa258
|
||||
#define HPRE_QM_USR_CFG_MASK GENMASK(31, 1)
|
||||
#define HPRE_QM_AXI_CFG_MASK GENMASK(15, 0)
|
||||
#define HPRE_QM_VFG_AX_MASK GENMASK(7, 0)
|
||||
@ -111,8 +110,8 @@
|
||||
static const char hpre_name[] = "hisi_hpre";
|
||||
static struct dentry *hpre_debugfs_root;
|
||||
static const struct pci_device_id hpre_dev_ids[] = {
|
||||
{ PCI_DEVICE(PCI_VENDOR_ID_HUAWEI, HPRE_PCI_DEVICE_ID) },
|
||||
{ PCI_DEVICE(PCI_VENDOR_ID_HUAWEI, HPRE_PCI_VF_DEVICE_ID) },
|
||||
{ PCI_DEVICE(PCI_VENDOR_ID_HUAWEI, PCI_DEVICE_ID_HUAWEI_HPRE_PF) },
|
||||
{ PCI_DEVICE(PCI_VENDOR_ID_HUAWEI, PCI_DEVICE_ID_HUAWEI_HPRE_VF) },
|
||||
{ 0, }
|
||||
};
|
||||
|
||||
@ -242,7 +241,7 @@ MODULE_PARM_DESC(uacce_mode, UACCE_MODE_DESC);
|
||||
|
||||
static int pf_q_num_set(const char *val, const struct kernel_param *kp)
|
||||
{
|
||||
return q_num_set(val, kp, HPRE_PCI_DEVICE_ID);
|
||||
return q_num_set(val, kp, PCI_DEVICE_ID_HUAWEI_HPRE_PF);
|
||||
}
|
||||
|
||||
static const struct kernel_param_ops hpre_pf_q_num_ops = {
|
||||
@ -921,7 +920,7 @@ static int hpre_debugfs_init(struct hisi_qm *qm)
|
||||
qm->debug.sqe_mask_len = HPRE_SQE_MASK_LEN;
|
||||
hisi_qm_debug_init(qm);
|
||||
|
||||
if (qm->pdev->device == HPRE_PCI_DEVICE_ID) {
|
||||
if (qm->pdev->device == PCI_DEVICE_ID_HUAWEI_HPRE_PF) {
|
||||
ret = hpre_ctrl_debug_init(qm);
|
||||
if (ret)
|
||||
goto failed_to_create;
|
||||
@ -958,7 +957,7 @@ static int hpre_qm_init(struct hisi_qm *qm, struct pci_dev *pdev)
|
||||
qm->sqe_size = HPRE_SQE_SIZE;
|
||||
qm->dev_name = hpre_name;
|
||||
|
||||
qm->fun_type = (pdev->device == HPRE_PCI_DEVICE_ID) ?
|
||||
qm->fun_type = (pdev->device == PCI_DEVICE_ID_HUAWEI_HPRE_PF) ?
|
||||
QM_HW_PF : QM_HW_VF;
|
||||
if (qm->fun_type == QM_HW_PF) {
|
||||
qm->qp_base = HPRE_PF_DEF_Q_BASE;
|
||||
@ -1191,6 +1190,12 @@ static struct pci_driver hpre_pci_driver = {
|
||||
.driver.pm = &hpre_pm_ops,
|
||||
};
|
||||
|
||||
struct pci_driver *hisi_hpre_get_pf_driver(void)
|
||||
{
|
||||
return &hpre_pci_driver;
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(hisi_hpre_get_pf_driver);
|
||||
|
||||
static void hpre_register_debugfs(void)
|
||||
{
|
||||
if (!debugfs_initialized())
|
||||
|
@ -15,7 +15,7 @@
|
||||
#include <linux/uacce.h>
|
||||
#include <linux/uaccess.h>
|
||||
#include <uapi/misc/uacce/hisi_qm.h>
|
||||
#include "qm.h"
|
||||
#include <linux/hisi_acc_qm.h>
|
||||
|
||||
/* eq/aeq irq enable */
|
||||
#define QM_VF_AEQ_INT_SOURCE 0x0
|
||||
@ -33,23 +33,6 @@
|
||||
#define QM_ABNORMAL_EVENT_IRQ_VECTOR 3
|
||||
|
||||
/* mailbox */
|
||||
#define QM_MB_CMD_SQC 0x0
|
||||
#define QM_MB_CMD_CQC 0x1
|
||||
#define QM_MB_CMD_EQC 0x2
|
||||
#define QM_MB_CMD_AEQC 0x3
|
||||
#define QM_MB_CMD_SQC_BT 0x4
|
||||
#define QM_MB_CMD_CQC_BT 0x5
|
||||
#define QM_MB_CMD_SQC_VFT_V2 0x6
|
||||
#define QM_MB_CMD_STOP_QP 0x8
|
||||
#define QM_MB_CMD_SRC 0xc
|
||||
#define QM_MB_CMD_DST 0xd
|
||||
|
||||
#define QM_MB_CMD_SEND_BASE 0x300
|
||||
#define QM_MB_EVENT_SHIFT 8
|
||||
#define QM_MB_BUSY_SHIFT 13
|
||||
#define QM_MB_OP_SHIFT 14
|
||||
#define QM_MB_CMD_DATA_ADDR_L 0x304
|
||||
#define QM_MB_CMD_DATA_ADDR_H 0x308
|
||||
#define QM_MB_PING_ALL_VFS 0xffff
|
||||
#define QM_MB_CMD_DATA_SHIFT 32
|
||||
#define QM_MB_CMD_DATA_MASK GENMASK(31, 0)
|
||||
@ -103,19 +86,12 @@
|
||||
#define QM_DB_CMD_SHIFT_V1 16
|
||||
#define QM_DB_INDEX_SHIFT_V1 32
|
||||
#define QM_DB_PRIORITY_SHIFT_V1 48
|
||||
#define QM_DOORBELL_SQ_CQ_BASE_V2 0x1000
|
||||
#define QM_DOORBELL_EQ_AEQ_BASE_V2 0x2000
|
||||
#define QM_QUE_ISO_CFG_V 0x0030
|
||||
#define QM_PAGE_SIZE 0x0034
|
||||
#define QM_QUE_ISO_EN 0x100154
|
||||
#define QM_CAPBILITY 0x100158
|
||||
#define QM_QP_NUN_MASK GENMASK(10, 0)
|
||||
#define QM_QP_DB_INTERVAL 0x10000
|
||||
#define QM_QP_MAX_NUM_SHIFT 11
|
||||
#define QM_DB_CMD_SHIFT_V2 12
|
||||
#define QM_DB_RAND_SHIFT_V2 16
|
||||
#define QM_DB_INDEX_SHIFT_V2 32
|
||||
#define QM_DB_PRIORITY_SHIFT_V2 48
|
||||
|
||||
#define QM_MEM_START_INIT 0x100040
|
||||
#define QM_MEM_INIT_DONE 0x100044
|
||||
@ -693,7 +669,7 @@ static void qm_mb_pre_init(struct qm_mailbox *mailbox, u8 cmd,
|
||||
}
|
||||
|
||||
/* return 0 mailbox ready, -ETIMEDOUT hardware timeout */
|
||||
static int qm_wait_mb_ready(struct hisi_qm *qm)
|
||||
int hisi_qm_wait_mb_ready(struct hisi_qm *qm)
|
||||
{
|
||||
u32 val;
|
||||
|
||||
@ -701,6 +677,7 @@ static int qm_wait_mb_ready(struct hisi_qm *qm)
|
||||
val, !((val >> QM_MB_BUSY_SHIFT) &
|
||||
0x1), POLL_PERIOD, POLL_TIMEOUT);
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(hisi_qm_wait_mb_ready);
|
||||
|
||||
/* 128 bit should be written to hardware at one time to trigger a mailbox */
|
||||
static void qm_mb_write(struct hisi_qm *qm, const void *src)
|
||||
@ -726,14 +703,14 @@ static void qm_mb_write(struct hisi_qm *qm, const void *src)
|
||||
|
||||
static int qm_mb_nolock(struct hisi_qm *qm, struct qm_mailbox *mailbox)
|
||||
{
|
||||
if (unlikely(qm_wait_mb_ready(qm))) {
|
||||
if (unlikely(hisi_qm_wait_mb_ready(qm))) {
|
||||
dev_err(&qm->pdev->dev, "QM mailbox is busy to start!\n");
|
||||
goto mb_busy;
|
||||
}
|
||||
|
||||
qm_mb_write(qm, mailbox);
|
||||
|
||||
if (unlikely(qm_wait_mb_ready(qm))) {
|
||||
if (unlikely(hisi_qm_wait_mb_ready(qm))) {
|
||||
dev_err(&qm->pdev->dev, "QM mailbox operation timeout!\n");
|
||||
goto mb_busy;
|
||||
}
|
||||
@ -745,8 +722,8 @@ mb_busy:
|
||||
return -EBUSY;
|
||||
}
|
||||
|
||||
static int qm_mb(struct hisi_qm *qm, u8 cmd, dma_addr_t dma_addr, u16 queue,
|
||||
bool op)
|
||||
int hisi_qm_mb(struct hisi_qm *qm, u8 cmd, dma_addr_t dma_addr, u16 queue,
|
||||
bool op)
|
||||
{
|
||||
struct qm_mailbox mailbox;
|
||||
int ret;
|
||||
@ -762,6 +739,7 @@ static int qm_mb(struct hisi_qm *qm, u8 cmd, dma_addr_t dma_addr, u16 queue,
|
||||
|
||||
return ret;
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(hisi_qm_mb);
|
||||
|
||||
static void qm_db_v1(struct hisi_qm *qm, u16 qn, u8 cmd, u16 index, u8 priority)
|
||||
{
|
||||
@ -1351,7 +1329,7 @@ static int qm_get_vft_v2(struct hisi_qm *qm, u32 *base, u32 *number)
|
||||
u64 sqc_vft;
|
||||
int ret;
|
||||
|
||||
ret = qm_mb(qm, QM_MB_CMD_SQC_VFT_V2, 0, 0, 1);
|
||||
ret = hisi_qm_mb(qm, QM_MB_CMD_SQC_VFT_V2, 0, 0, 1);
|
||||
if (ret)
|
||||
return ret;
|
||||
|
||||
@ -1725,12 +1703,12 @@ static int dump_show(struct hisi_qm *qm, void *info,
|
||||
|
||||
static int qm_dump_sqc_raw(struct hisi_qm *qm, dma_addr_t dma_addr, u16 qp_id)
|
||||
{
|
||||
return qm_mb(qm, QM_MB_CMD_SQC, dma_addr, qp_id, 1);
|
||||
return hisi_qm_mb(qm, QM_MB_CMD_SQC, dma_addr, qp_id, 1);
|
||||
}
|
||||
|
||||
static int qm_dump_cqc_raw(struct hisi_qm *qm, dma_addr_t dma_addr, u16 qp_id)
|
||||
{
|
||||
return qm_mb(qm, QM_MB_CMD_CQC, dma_addr, qp_id, 1);
|
||||
return hisi_qm_mb(qm, QM_MB_CMD_CQC, dma_addr, qp_id, 1);
|
||||
}
|
||||
|
||||
static int qm_sqc_dump(struct hisi_qm *qm, const char *s)
|
||||
@ -1842,7 +1820,7 @@ static int qm_eqc_aeqc_dump(struct hisi_qm *qm, char *s, size_t size,
|
||||
if (IS_ERR(xeqc))
|
||||
return PTR_ERR(xeqc);
|
||||
|
||||
ret = qm_mb(qm, cmd, xeqc_dma, 0, 1);
|
||||
ret = hisi_qm_mb(qm, cmd, xeqc_dma, 0, 1);
|
||||
if (ret)
|
||||
goto err_free_ctx;
|
||||
|
||||
@ -2495,7 +2473,7 @@ unlock:
|
||||
|
||||
static int qm_stop_qp(struct hisi_qp *qp)
|
||||
{
|
||||
return qm_mb(qp->qm, QM_MB_CMD_STOP_QP, 0, qp->qp_id, 0);
|
||||
return hisi_qm_mb(qp->qm, QM_MB_CMD_STOP_QP, 0, qp->qp_id, 0);
|
||||
}
|
||||
|
||||
static int qm_set_msi(struct hisi_qm *qm, bool set)
|
||||
@ -2763,7 +2741,7 @@ static int qm_sq_ctx_cfg(struct hisi_qp *qp, int qp_id, u32 pasid)
|
||||
return -ENOMEM;
|
||||
}
|
||||
|
||||
ret = qm_mb(qm, QM_MB_CMD_SQC, sqc_dma, qp_id, 0);
|
||||
ret = hisi_qm_mb(qm, QM_MB_CMD_SQC, sqc_dma, qp_id, 0);
|
||||
dma_unmap_single(dev, sqc_dma, sizeof(struct qm_sqc), DMA_TO_DEVICE);
|
||||
kfree(sqc);
|
||||
|
||||
@ -2804,7 +2782,7 @@ static int qm_cq_ctx_cfg(struct hisi_qp *qp, int qp_id, u32 pasid)
|
||||
return -ENOMEM;
|
||||
}
|
||||
|
||||
ret = qm_mb(qm, QM_MB_CMD_CQC, cqc_dma, qp_id, 0);
|
||||
ret = hisi_qm_mb(qm, QM_MB_CMD_CQC, cqc_dma, qp_id, 0);
|
||||
dma_unmap_single(dev, cqc_dma, sizeof(struct qm_cqc), DMA_TO_DEVICE);
|
||||
kfree(cqc);
|
||||
|
||||
@ -3514,6 +3492,12 @@ static void hisi_qm_pci_uninit(struct hisi_qm *qm)
|
||||
pci_disable_device(pdev);
|
||||
}
|
||||
|
||||
static void hisi_qm_set_state(struct hisi_qm *qm, u8 state)
|
||||
{
|
||||
if (qm->ver > QM_HW_V2 && qm->fun_type == QM_HW_VF)
|
||||
writel(state, qm->io_base + QM_VF_STATE);
|
||||
}
|
||||
|
||||
/**
|
||||
* hisi_qm_uninit() - Uninitialize qm.
|
||||
* @qm: The qm needed uninit.
|
||||
@ -3542,6 +3526,7 @@ void hisi_qm_uninit(struct hisi_qm *qm)
|
||||
dma_free_coherent(dev, qm->qdma.size,
|
||||
qm->qdma.va, qm->qdma.dma);
|
||||
}
|
||||
hisi_qm_set_state(qm, QM_NOT_READY);
|
||||
up_write(&qm->qps_lock);
|
||||
|
||||
qm_irq_unregister(qm);
|
||||
@ -3655,7 +3640,7 @@ static int qm_eq_ctx_cfg(struct hisi_qm *qm)
|
||||
return -ENOMEM;
|
||||
}
|
||||
|
||||
ret = qm_mb(qm, QM_MB_CMD_EQC, eqc_dma, 0, 0);
|
||||
ret = hisi_qm_mb(qm, QM_MB_CMD_EQC, eqc_dma, 0, 0);
|
||||
dma_unmap_single(dev, eqc_dma, sizeof(struct qm_eqc), DMA_TO_DEVICE);
|
||||
kfree(eqc);
|
||||
|
||||
@ -3684,7 +3669,7 @@ static int qm_aeq_ctx_cfg(struct hisi_qm *qm)
|
||||
return -ENOMEM;
|
||||
}
|
||||
|
||||
ret = qm_mb(qm, QM_MB_CMD_AEQC, aeqc_dma, 0, 0);
|
||||
ret = hisi_qm_mb(qm, QM_MB_CMD_AEQC, aeqc_dma, 0, 0);
|
||||
dma_unmap_single(dev, aeqc_dma, sizeof(struct qm_aeqc), DMA_TO_DEVICE);
|
||||
kfree(aeqc);
|
||||
|
||||
@ -3723,11 +3708,11 @@ static int __hisi_qm_start(struct hisi_qm *qm)
|
||||
if (ret)
|
||||
return ret;
|
||||
|
||||
ret = qm_mb(qm, QM_MB_CMD_SQC_BT, qm->sqc_dma, 0, 0);
|
||||
ret = hisi_qm_mb(qm, QM_MB_CMD_SQC_BT, qm->sqc_dma, 0, 0);
|
||||
if (ret)
|
||||
return ret;
|
||||
|
||||
ret = qm_mb(qm, QM_MB_CMD_CQC_BT, qm->cqc_dma, 0, 0);
|
||||
ret = hisi_qm_mb(qm, QM_MB_CMD_CQC_BT, qm->cqc_dma, 0, 0);
|
||||
if (ret)
|
||||
return ret;
|
||||
|
||||
@ -3767,6 +3752,7 @@ int hisi_qm_start(struct hisi_qm *qm)
|
||||
if (!ret)
|
||||
atomic_set(&qm->status.flags, QM_START);
|
||||
|
||||
hisi_qm_set_state(qm, QM_READY);
|
||||
err_unlock:
|
||||
up_write(&qm->qps_lock);
|
||||
return ret;
|
||||
|
@ -4,7 +4,7 @@
|
||||
#ifndef __HISI_SEC_V2_H
|
||||
#define __HISI_SEC_V2_H
|
||||
|
||||
#include "../qm.h"
|
||||
#include <linux/hisi_acc_qm.h>
|
||||
#include "sec_crypto.h"
|
||||
|
||||
/* Algorithm resource per hardware SEC queue */
|
||||
|
@ -20,8 +20,7 @@
|
||||
|
||||
#define SEC_VF_NUM 63
|
||||
#define SEC_QUEUE_NUM_V1 4096
|
||||
#define SEC_PF_PCI_DEVICE_ID 0xa255
|
||||
#define SEC_VF_PCI_DEVICE_ID 0xa256
|
||||
#define PCI_DEVICE_ID_HUAWEI_SEC_PF 0xa255
|
||||
|
||||
#define SEC_BD_ERR_CHK_EN0 0xEFFFFFFF
|
||||
#define SEC_BD_ERR_CHK_EN1 0x7ffff7fd
|
||||
@ -229,7 +228,7 @@ static const struct debugfs_reg32 sec_dfx_regs[] = {
|
||||
|
||||
static int sec_pf_q_num_set(const char *val, const struct kernel_param *kp)
|
||||
{
|
||||
return q_num_set(val, kp, SEC_PF_PCI_DEVICE_ID);
|
||||
return q_num_set(val, kp, PCI_DEVICE_ID_HUAWEI_SEC_PF);
|
||||
}
|
||||
|
||||
static const struct kernel_param_ops sec_pf_q_num_ops = {
|
||||
@ -317,8 +316,8 @@ module_param_cb(uacce_mode, &sec_uacce_mode_ops, &uacce_mode, 0444);
|
||||
MODULE_PARM_DESC(uacce_mode, UACCE_MODE_DESC);
|
||||
|
||||
static const struct pci_device_id sec_dev_ids[] = {
|
||||
{ PCI_DEVICE(PCI_VENDOR_ID_HUAWEI, SEC_PF_PCI_DEVICE_ID) },
|
||||
{ PCI_DEVICE(PCI_VENDOR_ID_HUAWEI, SEC_VF_PCI_DEVICE_ID) },
|
||||
{ PCI_DEVICE(PCI_VENDOR_ID_HUAWEI, PCI_DEVICE_ID_HUAWEI_SEC_PF) },
|
||||
{ PCI_DEVICE(PCI_VENDOR_ID_HUAWEI, PCI_DEVICE_ID_HUAWEI_SEC_VF) },
|
||||
{ 0, }
|
||||
};
|
||||
MODULE_DEVICE_TABLE(pci, sec_dev_ids);
|
||||
@ -748,7 +747,7 @@ static int sec_core_debug_init(struct hisi_qm *qm)
|
||||
regset->base = qm->io_base;
|
||||
regset->dev = dev;
|
||||
|
||||
if (qm->pdev->device == SEC_PF_PCI_DEVICE_ID)
|
||||
if (qm->pdev->device == PCI_DEVICE_ID_HUAWEI_SEC_PF)
|
||||
debugfs_create_file("regs", 0444, tmp_d, regset, &sec_regs_fops);
|
||||
|
||||
for (i = 0; i < ARRAY_SIZE(sec_dfx_labels); i++) {
|
||||
@ -766,7 +765,7 @@ static int sec_debug_init(struct hisi_qm *qm)
|
||||
struct sec_dev *sec = container_of(qm, struct sec_dev, qm);
|
||||
int i;
|
||||
|
||||
if (qm->pdev->device == SEC_PF_PCI_DEVICE_ID) {
|
||||
if (qm->pdev->device == PCI_DEVICE_ID_HUAWEI_SEC_PF) {
|
||||
for (i = SEC_CLEAR_ENABLE; i < SEC_DEBUG_FILE_NUM; i++) {
|
||||
spin_lock_init(&sec->debug.files[i].lock);
|
||||
sec->debug.files[i].index = i;
|
||||
@ -908,7 +907,7 @@ static int sec_qm_init(struct hisi_qm *qm, struct pci_dev *pdev)
|
||||
qm->sqe_size = SEC_SQE_SIZE;
|
||||
qm->dev_name = sec_name;
|
||||
|
||||
qm->fun_type = (pdev->device == SEC_PF_PCI_DEVICE_ID) ?
|
||||
qm->fun_type = (pdev->device == PCI_DEVICE_ID_HUAWEI_SEC_PF) ?
|
||||
QM_HW_PF : QM_HW_VF;
|
||||
if (qm->fun_type == QM_HW_PF) {
|
||||
qm->qp_base = SEC_PF_DEF_Q_BASE;
|
||||
@ -1120,6 +1119,12 @@ static struct pci_driver sec_pci_driver = {
|
||||
.driver.pm = &sec_pm_ops,
|
||||
};
|
||||
|
||||
struct pci_driver *hisi_sec_get_pf_driver(void)
|
||||
{
|
||||
return &sec_pci_driver;
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(hisi_sec_get_pf_driver);
|
||||
|
||||
static void sec_register_debugfs(void)
|
||||
{
|
||||
if (!debugfs_initialized())
|
||||
|
@ -1,9 +1,9 @@
|
||||
// SPDX-License-Identifier: GPL-2.0
|
||||
/* Copyright (c) 2019 HiSilicon Limited. */
|
||||
#include <linux/dma-mapping.h>
|
||||
#include <linux/hisi_acc_qm.h>
|
||||
#include <linux/module.h>
|
||||
#include <linux/slab.h>
|
||||
#include "qm.h"
|
||||
|
||||
#define HISI_ACC_SGL_SGE_NR_MIN 1
|
||||
#define HISI_ACC_SGL_NR_MAX 256
|
||||
|
@ -7,7 +7,7 @@
|
||||
#define pr_fmt(fmt) "hisi_zip: " fmt
|
||||
|
||||
#include <linux/list.h>
|
||||
#include "../qm.h"
|
||||
#include <linux/hisi_acc_qm.h>
|
||||
|
||||
enum hisi_zip_error_type {
|
||||
/* negative compression */
|
||||
|
@ -15,8 +15,7 @@
|
||||
#include <linux/uacce.h>
|
||||
#include "zip.h"
|
||||
|
||||
#define PCI_DEVICE_ID_ZIP_PF 0xa250
|
||||
#define PCI_DEVICE_ID_ZIP_VF 0xa251
|
||||
#define PCI_DEVICE_ID_HUAWEI_ZIP_PF 0xa250
|
||||
|
||||
#define HZIP_QUEUE_NUM_V1 4096
|
||||
|
||||
@ -246,7 +245,7 @@ MODULE_PARM_DESC(uacce_mode, UACCE_MODE_DESC);
|
||||
|
||||
static int pf_q_num_set(const char *val, const struct kernel_param *kp)
|
||||
{
|
||||
return q_num_set(val, kp, PCI_DEVICE_ID_ZIP_PF);
|
||||
return q_num_set(val, kp, PCI_DEVICE_ID_HUAWEI_ZIP_PF);
|
||||
}
|
||||
|
||||
static const struct kernel_param_ops pf_q_num_ops = {
|
||||
@ -268,8 +267,8 @@ module_param_cb(vfs_num, &vfs_num_ops, &vfs_num, 0444);
|
||||
MODULE_PARM_DESC(vfs_num, "Number of VFs to enable(1-63), 0(default)");
|
||||
|
||||
static const struct pci_device_id hisi_zip_dev_ids[] = {
|
||||
{ PCI_DEVICE(PCI_VENDOR_ID_HUAWEI, PCI_DEVICE_ID_ZIP_PF) },
|
||||
{ PCI_DEVICE(PCI_VENDOR_ID_HUAWEI, PCI_DEVICE_ID_ZIP_VF) },
|
||||
{ PCI_DEVICE(PCI_VENDOR_ID_HUAWEI, PCI_DEVICE_ID_HUAWEI_ZIP_PF) },
|
||||
{ PCI_DEVICE(PCI_VENDOR_ID_HUAWEI, PCI_DEVICE_ID_HUAWEI_ZIP_VF) },
|
||||
{ 0, }
|
||||
};
|
||||
MODULE_DEVICE_TABLE(pci, hisi_zip_dev_ids);
|
||||
@ -838,7 +837,7 @@ static int hisi_zip_qm_init(struct hisi_qm *qm, struct pci_dev *pdev)
|
||||
qm->sqe_size = HZIP_SQE_SIZE;
|
||||
qm->dev_name = hisi_zip_name;
|
||||
|
||||
qm->fun_type = (pdev->device == PCI_DEVICE_ID_ZIP_PF) ?
|
||||
qm->fun_type = (pdev->device == PCI_DEVICE_ID_HUAWEI_ZIP_PF) ?
|
||||
QM_HW_PF : QM_HW_VF;
|
||||
if (qm->fun_type == QM_HW_PF) {
|
||||
qm->qp_base = HZIP_PF_DEF_Q_BASE;
|
||||
@ -1013,6 +1012,12 @@ static struct pci_driver hisi_zip_pci_driver = {
|
||||
.driver.pm = &hisi_zip_pm_ops,
|
||||
};
|
||||
|
||||
struct pci_driver *hisi_zip_get_pf_driver(void)
|
||||
{
|
||||
return &hisi_zip_pci_driver;
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(hisi_zip_get_pf_driver);
|
||||
|
||||
static void hisi_zip_register_debugfs(void)
|
||||
{
|
||||
if (!debugfs_initialized())
|
||||
|
@ -478,6 +478,11 @@ static int mlx5_internal_err_ret_value(struct mlx5_core_dev *dev, u16 op,
|
||||
case MLX5_CMD_OP_QUERY_VHCA_STATE:
|
||||
case MLX5_CMD_OP_MODIFY_VHCA_STATE:
|
||||
case MLX5_CMD_OP_ALLOC_SF:
|
||||
case MLX5_CMD_OP_SUSPEND_VHCA:
|
||||
case MLX5_CMD_OP_RESUME_VHCA:
|
||||
case MLX5_CMD_OP_QUERY_VHCA_MIGRATION_STATE:
|
||||
case MLX5_CMD_OP_SAVE_VHCA_STATE:
|
||||
case MLX5_CMD_OP_LOAD_VHCA_STATE:
|
||||
*status = MLX5_DRIVER_STATUS_ABORTED;
|
||||
*synd = MLX5_DRIVER_SYND;
|
||||
return -EIO;
|
||||
@ -675,6 +680,11 @@ const char *mlx5_command_str(int command)
|
||||
MLX5_COMMAND_STR_CASE(MODIFY_VHCA_STATE);
|
||||
MLX5_COMMAND_STR_CASE(ALLOC_SF);
|
||||
MLX5_COMMAND_STR_CASE(DEALLOC_SF);
|
||||
MLX5_COMMAND_STR_CASE(SUSPEND_VHCA);
|
||||
MLX5_COMMAND_STR_CASE(RESUME_VHCA);
|
||||
MLX5_COMMAND_STR_CASE(QUERY_VHCA_MIGRATION_STATE);
|
||||
MLX5_COMMAND_STR_CASE(SAVE_VHCA_STATE);
|
||||
MLX5_COMMAND_STR_CASE(LOAD_VHCA_STATE);
|
||||
default: return "unknown command opcode";
|
||||
}
|
||||
}
|
||||
|
@ -1620,6 +1620,7 @@ static void remove_one(struct pci_dev *pdev)
|
||||
struct devlink *devlink = priv_to_devlink(dev);
|
||||
|
||||
devlink_unregister(devlink);
|
||||
mlx5_sriov_disable(pdev);
|
||||
mlx5_crdump_disable(dev);
|
||||
mlx5_drain_health_wq(dev);
|
||||
mlx5_uninit_one(dev);
|
||||
@ -1882,6 +1883,50 @@ static struct pci_driver mlx5_core_driver = {
|
||||
.sriov_set_msix_vec_count = mlx5_core_sriov_set_msix_vec_count,
|
||||
};
|
||||
|
||||
/**
|
||||
* mlx5_vf_get_core_dev - Get the mlx5 core device from a given VF PCI device if
|
||||
* mlx5_core is its driver.
|
||||
* @pdev: The associated PCI device.
|
||||
*
|
||||
* Upon return the interface state lock stay held to let caller uses it safely.
|
||||
* Caller must ensure to use the returned mlx5 device for a narrow window
|
||||
* and put it back with mlx5_vf_put_core_dev() immediately once usage was over.
|
||||
*
|
||||
* Return: Pointer to the associated mlx5_core_dev or NULL.
|
||||
*/
|
||||
struct mlx5_core_dev *mlx5_vf_get_core_dev(struct pci_dev *pdev)
|
||||
__acquires(&mdev->intf_state_mutex)
|
||||
{
|
||||
struct mlx5_core_dev *mdev;
|
||||
|
||||
mdev = pci_iov_get_pf_drvdata(pdev, &mlx5_core_driver);
|
||||
if (IS_ERR(mdev))
|
||||
return NULL;
|
||||
|
||||
mutex_lock(&mdev->intf_state_mutex);
|
||||
if (!test_bit(MLX5_INTERFACE_STATE_UP, &mdev->intf_state)) {
|
||||
mutex_unlock(&mdev->intf_state_mutex);
|
||||
return NULL;
|
||||
}
|
||||
|
||||
return mdev;
|
||||
}
|
||||
EXPORT_SYMBOL(mlx5_vf_get_core_dev);
|
||||
|
||||
/**
|
||||
* mlx5_vf_put_core_dev - Put the mlx5 core device back.
|
||||
* @mdev: The mlx5 core device.
|
||||
*
|
||||
* Upon return the interface state lock is unlocked and caller should not
|
||||
* access the mdev any more.
|
||||
*/
|
||||
void mlx5_vf_put_core_dev(struct mlx5_core_dev *mdev)
|
||||
__releases(&mdev->intf_state_mutex)
|
||||
{
|
||||
mutex_unlock(&mdev->intf_state_mutex);
|
||||
}
|
||||
EXPORT_SYMBOL(mlx5_vf_put_core_dev);
|
||||
|
||||
static void mlx5_core_verify_params(void)
|
||||
{
|
||||
if (prof_sel >= ARRAY_SIZE(profile)) {
|
||||
|
@ -164,6 +164,7 @@ void mlx5_sriov_cleanup(struct mlx5_core_dev *dev);
|
||||
int mlx5_sriov_attach(struct mlx5_core_dev *dev);
|
||||
void mlx5_sriov_detach(struct mlx5_core_dev *dev);
|
||||
int mlx5_core_sriov_configure(struct pci_dev *dev, int num_vfs);
|
||||
void mlx5_sriov_disable(struct pci_dev *pdev);
|
||||
int mlx5_core_sriov_set_msix_vec_count(struct pci_dev *vf, int msix_vec_count);
|
||||
int mlx5_core_enable_hca(struct mlx5_core_dev *dev, u16 func_id);
|
||||
int mlx5_core_disable_hca(struct mlx5_core_dev *dev, u16 func_id);
|
||||
|
@ -161,7 +161,7 @@ static int mlx5_sriov_enable(struct pci_dev *pdev, int num_vfs)
|
||||
return err;
|
||||
}
|
||||
|
||||
static void mlx5_sriov_disable(struct pci_dev *pdev)
|
||||
void mlx5_sriov_disable(struct pci_dev *pdev)
|
||||
{
|
||||
struct mlx5_core_dev *dev = pci_get_drvdata(pdev);
|
||||
int num_vfs = pci_num_vf(dev->pdev);
|
||||
@ -205,19 +205,8 @@ int mlx5_core_sriov_set_msix_vec_count(struct pci_dev *vf, int msix_vec_count)
|
||||
mlx5_get_default_msix_vec_count(dev, pci_num_vf(pf));
|
||||
|
||||
sriov = &dev->priv.sriov;
|
||||
|
||||
/* Reversed translation of PCI VF function number to the internal
|
||||
* function_id, which exists in the name of virtfn symlink.
|
||||
*/
|
||||
for (id = 0; id < pci_num_vf(pf); id++) {
|
||||
if (!sriov->vfs_ctx[id].enabled)
|
||||
continue;
|
||||
|
||||
if (vf->devfn == pci_iov_virtfn_devfn(pf, id))
|
||||
break;
|
||||
}
|
||||
|
||||
if (id == pci_num_vf(pf) || !sriov->vfs_ctx[id].enabled)
|
||||
id = pci_iov_vf_id(vf);
|
||||
if (id < 0 || !sriov->vfs_ctx[id].enabled)
|
||||
return -EINVAL;
|
||||
|
||||
return mlx5_set_msix_vec_count(dev, id + 1, msix_vec_count);
|
||||
|
@ -33,6 +33,49 @@ int pci_iov_virtfn_devfn(struct pci_dev *dev, int vf_id)
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(pci_iov_virtfn_devfn);
|
||||
|
||||
int pci_iov_vf_id(struct pci_dev *dev)
|
||||
{
|
||||
struct pci_dev *pf;
|
||||
|
||||
if (!dev->is_virtfn)
|
||||
return -EINVAL;
|
||||
|
||||
pf = pci_physfn(dev);
|
||||
return (((dev->bus->number << 8) + dev->devfn) -
|
||||
((pf->bus->number << 8) + pf->devfn + pf->sriov->offset)) /
|
||||
pf->sriov->stride;
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(pci_iov_vf_id);
|
||||
|
||||
/**
|
||||
* pci_iov_get_pf_drvdata - Return the drvdata of a PF
|
||||
* @dev: VF pci_dev
|
||||
* @pf_driver: Device driver required to own the PF
|
||||
*
|
||||
* This must be called from a context that ensures that a VF driver is attached.
|
||||
* The value returned is invalid once the VF driver completes its remove()
|
||||
* callback.
|
||||
*
|
||||
* Locking is achieved by the driver core. A VF driver cannot be probed until
|
||||
* pci_enable_sriov() is called and pci_disable_sriov() does not return until
|
||||
* all VF drivers have completed their remove().
|
||||
*
|
||||
* The PF driver must call pci_disable_sriov() before it begins to destroy the
|
||||
* drvdata.
|
||||
*/
|
||||
void *pci_iov_get_pf_drvdata(struct pci_dev *dev, struct pci_driver *pf_driver)
|
||||
{
|
||||
struct pci_dev *pf_dev;
|
||||
|
||||
if (!dev->is_virtfn)
|
||||
return ERR_PTR(-EINVAL);
|
||||
pf_dev = dev->physfn;
|
||||
if (pf_dev->driver != pf_driver)
|
||||
return ERR_PTR(-EINVAL);
|
||||
return pci_get_drvdata(pf_dev);
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(pci_iov_get_pf_drvdata);
|
||||
|
||||
/*
|
||||
* Per SR-IOV spec sec 3.3.10 and 3.3.11, First VF Offset and VF Stride may
|
||||
* change when NumVFs changes.
|
||||
|
@ -43,4 +43,9 @@ config VFIO_PCI_IGD
|
||||
|
||||
To enable Intel IGD assignment through vfio-pci, say Y.
|
||||
endif
|
||||
|
||||
source "drivers/vfio/pci/mlx5/Kconfig"
|
||||
|
||||
source "drivers/vfio/pci/hisilicon/Kconfig"
|
||||
|
||||
endif
|
||||
|
@ -7,3 +7,7 @@ obj-$(CONFIG_VFIO_PCI_CORE) += vfio-pci-core.o
|
||||
vfio-pci-y := vfio_pci.o
|
||||
vfio-pci-$(CONFIG_VFIO_PCI_IGD) += vfio_pci_igd.o
|
||||
obj-$(CONFIG_VFIO_PCI) += vfio-pci.o
|
||||
|
||||
obj-$(CONFIG_MLX5_VFIO_PCI) += mlx5/
|
||||
|
||||
obj-$(CONFIG_HISI_ACC_VFIO_PCI) += hisilicon/
|
||||
|
15
drivers/vfio/pci/hisilicon/Kconfig
Normal file
15
drivers/vfio/pci/hisilicon/Kconfig
Normal file
@ -0,0 +1,15 @@
|
||||
# SPDX-License-Identifier: GPL-2.0-only
|
||||
config HISI_ACC_VFIO_PCI
|
||||
tristate "VFIO PCI support for HiSilicon ACC devices"
|
||||
depends on ARM64 || (COMPILE_TEST && 64BIT)
|
||||
depends on VFIO_PCI_CORE
|
||||
depends on PCI_MSI
|
||||
depends on CRYPTO_DEV_HISI_QM
|
||||
depends on CRYPTO_DEV_HISI_HPRE
|
||||
depends on CRYPTO_DEV_HISI_SEC2
|
||||
depends on CRYPTO_DEV_HISI_ZIP
|
||||
help
|
||||
This provides generic PCI support for HiSilicon ACC devices
|
||||
using the VFIO framework.
|
||||
|
||||
If you don't know what to do here, say N.
|
4
drivers/vfio/pci/hisilicon/Makefile
Normal file
4
drivers/vfio/pci/hisilicon/Makefile
Normal file
@ -0,0 +1,4 @@
|
||||
# SPDX-License-Identifier: GPL-2.0-only
|
||||
obj-$(CONFIG_HISI_ACC_VFIO_PCI) += hisi-acc-vfio-pci.o
|
||||
hisi-acc-vfio-pci-y := hisi_acc_vfio_pci.o
|
||||
|
1326
drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
Normal file
1326
drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c
Normal file
File diff suppressed because it is too large
Load Diff
116
drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.h
Normal file
116
drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.h
Normal file
@ -0,0 +1,116 @@
|
||||
/* SPDX-License-Identifier: GPL-2.0 */
|
||||
/* Copyright (c) 2021 HiSilicon Ltd. */
|
||||
|
||||
#ifndef HISI_ACC_VFIO_PCI_H
|
||||
#define HISI_ACC_VFIO_PCI_H
|
||||
|
||||
#include <linux/hisi_acc_qm.h>
|
||||
|
||||
#define MB_POLL_PERIOD_US 10
|
||||
#define MB_POLL_TIMEOUT_US 1000
|
||||
#define QM_CACHE_WB_START 0x204
|
||||
#define QM_CACHE_WB_DONE 0x208
|
||||
#define QM_MB_CMD_PAUSE_QM 0xe
|
||||
#define QM_ABNORMAL_INT_STATUS 0x100008
|
||||
#define QM_IFC_INT_STATUS 0x0028
|
||||
#define SEC_CORE_INT_STATUS 0x301008
|
||||
#define HPRE_HAC_INT_STATUS 0x301800
|
||||
#define HZIP_CORE_INT_STATUS 0x3010AC
|
||||
#define QM_QUE_ISO_CFG 0x301154
|
||||
|
||||
#define QM_VFT_CFG_RDY 0x10006c
|
||||
#define QM_VFT_CFG_OP_WR 0x100058
|
||||
#define QM_VFT_CFG_TYPE 0x10005c
|
||||
#define QM_VFT_CFG 0x100060
|
||||
#define QM_VFT_CFG_OP_ENABLE 0x100054
|
||||
#define QM_VFT_CFG_DATA_L 0x100064
|
||||
#define QM_VFT_CFG_DATA_H 0x100068
|
||||
|
||||
#define ERROR_CHECK_TIMEOUT 100
|
||||
#define CHECK_DELAY_TIME 100
|
||||
|
||||
#define QM_SQC_VFT_BASE_SHIFT_V2 28
|
||||
#define QM_SQC_VFT_BASE_MASK_V2 GENMASK(15, 0)
|
||||
#define QM_SQC_VFT_NUM_SHIFT_V2 45
|
||||
#define QM_SQC_VFT_NUM_MASK_V2 GENMASK(9, 0)
|
||||
|
||||
/* RW regs */
|
||||
#define QM_REGS_MAX_LEN 7
|
||||
#define QM_REG_ADDR_OFFSET 0x0004
|
||||
|
||||
#define QM_XQC_ADDR_OFFSET 32U
|
||||
#define QM_VF_AEQ_INT_MASK 0x0004
|
||||
#define QM_VF_EQ_INT_MASK 0x000c
|
||||
#define QM_IFC_INT_SOURCE_V 0x0020
|
||||
#define QM_IFC_INT_MASK 0x0024
|
||||
#define QM_IFC_INT_SET_V 0x002c
|
||||
#define QM_QUE_ISO_CFG_V 0x0030
|
||||
#define QM_PAGE_SIZE 0x0034
|
||||
|
||||
#define QM_EQC_DW0 0X8000
|
||||
#define QM_AEQC_DW0 0X8020
|
||||
|
||||
struct acc_vf_data {
|
||||
#define QM_MATCH_SIZE offsetofend(struct acc_vf_data, qm_rsv_state)
|
||||
/* QM match information */
|
||||
#define ACC_DEV_MAGIC 0XCDCDCDCDFEEDAACC
|
||||
u64 acc_magic;
|
||||
u32 qp_num;
|
||||
u32 dev_id;
|
||||
u32 que_iso_cfg;
|
||||
u32 qp_base;
|
||||
u32 vf_qm_state;
|
||||
/* QM reserved match information */
|
||||
u32 qm_rsv_state[3];
|
||||
|
||||
/* QM RW regs */
|
||||
u32 aeq_int_mask;
|
||||
u32 eq_int_mask;
|
||||
u32 ifc_int_source;
|
||||
u32 ifc_int_mask;
|
||||
u32 ifc_int_set;
|
||||
u32 page_size;
|
||||
|
||||
/* QM_EQC_DW has 7 regs */
|
||||
u32 qm_eqc_dw[7];
|
||||
|
||||
/* QM_AEQC_DW has 7 regs */
|
||||
u32 qm_aeqc_dw[7];
|
||||
|
||||
/* QM reserved 5 regs */
|
||||
u32 qm_rsv_regs[5];
|
||||
u32 padding;
|
||||
/* qm memory init information */
|
||||
u64 eqe_dma;
|
||||
u64 aeqe_dma;
|
||||
u64 sqc_dma;
|
||||
u64 cqc_dma;
|
||||
};
|
||||
|
||||
struct hisi_acc_vf_migration_file {
|
||||
struct file *filp;
|
||||
struct mutex lock;
|
||||
bool disabled;
|
||||
|
||||
struct acc_vf_data vf_data;
|
||||
size_t total_length;
|
||||
};
|
||||
|
||||
struct hisi_acc_vf_core_device {
|
||||
struct vfio_pci_core_device core_device;
|
||||
u8 deferred_reset:1;
|
||||
/* for migration state */
|
||||
struct mutex state_mutex;
|
||||
enum vfio_device_mig_state mig_state;
|
||||
struct pci_dev *pf_dev;
|
||||
struct pci_dev *vf_dev;
|
||||
struct hisi_qm *pf_qm;
|
||||
struct hisi_qm vf_qm;
|
||||
u32 vf_qm_state;
|
||||
int vf_id;
|
||||
/* for reset handler */
|
||||
spinlock_t reset_lock;
|
||||
struct hisi_acc_vf_migration_file *resuming_migf;
|
||||
struct hisi_acc_vf_migration_file *saving_migf;
|
||||
};
|
||||
#endif /* HISI_ACC_VFIO_PCI_H */
|
10
drivers/vfio/pci/mlx5/Kconfig
Normal file
10
drivers/vfio/pci/mlx5/Kconfig
Normal file
@ -0,0 +1,10 @@
|
||||
# SPDX-License-Identifier: GPL-2.0-only
|
||||
config MLX5_VFIO_PCI
|
||||
tristate "VFIO support for MLX5 PCI devices"
|
||||
depends on MLX5_CORE
|
||||
depends on VFIO_PCI_CORE
|
||||
help
|
||||
This provides migration support for MLX5 devices using the VFIO
|
||||
framework.
|
||||
|
||||
If you don't know what to do here, say N.
|
4
drivers/vfio/pci/mlx5/Makefile
Normal file
4
drivers/vfio/pci/mlx5/Makefile
Normal file
@ -0,0 +1,4 @@
|
||||
# SPDX-License-Identifier: GPL-2.0-only
|
||||
obj-$(CONFIG_MLX5_VFIO_PCI) += mlx5-vfio-pci.o
|
||||
mlx5-vfio-pci-y := main.o cmd.o
|
||||
|
259
drivers/vfio/pci/mlx5/cmd.c
Normal file
259
drivers/vfio/pci/mlx5/cmd.c
Normal file
@ -0,0 +1,259 @@
|
||||
// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
|
||||
/*
|
||||
* Copyright (c) 2021-2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved
|
||||
*/
|
||||
|
||||
#include "cmd.h"
|
||||
|
||||
int mlx5vf_cmd_suspend_vhca(struct pci_dev *pdev, u16 vhca_id, u16 op_mod)
|
||||
{
|
||||
struct mlx5_core_dev *mdev = mlx5_vf_get_core_dev(pdev);
|
||||
u32 out[MLX5_ST_SZ_DW(suspend_vhca_out)] = {};
|
||||
u32 in[MLX5_ST_SZ_DW(suspend_vhca_in)] = {};
|
||||
int ret;
|
||||
|
||||
if (!mdev)
|
||||
return -ENOTCONN;
|
||||
|
||||
MLX5_SET(suspend_vhca_in, in, opcode, MLX5_CMD_OP_SUSPEND_VHCA);
|
||||
MLX5_SET(suspend_vhca_in, in, vhca_id, vhca_id);
|
||||
MLX5_SET(suspend_vhca_in, in, op_mod, op_mod);
|
||||
|
||||
ret = mlx5_cmd_exec_inout(mdev, suspend_vhca, in, out);
|
||||
mlx5_vf_put_core_dev(mdev);
|
||||
return ret;
|
||||
}
|
||||
|
||||
int mlx5vf_cmd_resume_vhca(struct pci_dev *pdev, u16 vhca_id, u16 op_mod)
|
||||
{
|
||||
struct mlx5_core_dev *mdev = mlx5_vf_get_core_dev(pdev);
|
||||
u32 out[MLX5_ST_SZ_DW(resume_vhca_out)] = {};
|
||||
u32 in[MLX5_ST_SZ_DW(resume_vhca_in)] = {};
|
||||
int ret;
|
||||
|
||||
if (!mdev)
|
||||
return -ENOTCONN;
|
||||
|
||||
MLX5_SET(resume_vhca_in, in, opcode, MLX5_CMD_OP_RESUME_VHCA);
|
||||
MLX5_SET(resume_vhca_in, in, vhca_id, vhca_id);
|
||||
MLX5_SET(resume_vhca_in, in, op_mod, op_mod);
|
||||
|
||||
ret = mlx5_cmd_exec_inout(mdev, resume_vhca, in, out);
|
||||
mlx5_vf_put_core_dev(mdev);
|
||||
return ret;
|
||||
}
|
||||
|
||||
int mlx5vf_cmd_query_vhca_migration_state(struct pci_dev *pdev, u16 vhca_id,
|
||||
size_t *state_size)
|
||||
{
|
||||
struct mlx5_core_dev *mdev = mlx5_vf_get_core_dev(pdev);
|
||||
u32 out[MLX5_ST_SZ_DW(query_vhca_migration_state_out)] = {};
|
||||
u32 in[MLX5_ST_SZ_DW(query_vhca_migration_state_in)] = {};
|
||||
int ret;
|
||||
|
||||
if (!mdev)
|
||||
return -ENOTCONN;
|
||||
|
||||
MLX5_SET(query_vhca_migration_state_in, in, opcode,
|
||||
MLX5_CMD_OP_QUERY_VHCA_MIGRATION_STATE);
|
||||
MLX5_SET(query_vhca_migration_state_in, in, vhca_id, vhca_id);
|
||||
MLX5_SET(query_vhca_migration_state_in, in, op_mod, 0);
|
||||
|
||||
ret = mlx5_cmd_exec_inout(mdev, query_vhca_migration_state, in, out);
|
||||
if (ret)
|
||||
goto end;
|
||||
|
||||
*state_size = MLX5_GET(query_vhca_migration_state_out, out,
|
||||
required_umem_size);
|
||||
|
||||
end:
|
||||
mlx5_vf_put_core_dev(mdev);
|
||||
return ret;
|
||||
}
|
||||
|
||||
int mlx5vf_cmd_get_vhca_id(struct pci_dev *pdev, u16 function_id, u16 *vhca_id)
|
||||
{
|
||||
struct mlx5_core_dev *mdev = mlx5_vf_get_core_dev(pdev);
|
||||
u32 in[MLX5_ST_SZ_DW(query_hca_cap_in)] = {};
|
||||
int out_size;
|
||||
void *out;
|
||||
int ret;
|
||||
|
||||
if (!mdev)
|
||||
return -ENOTCONN;
|
||||
|
||||
out_size = MLX5_ST_SZ_BYTES(query_hca_cap_out);
|
||||
out = kzalloc(out_size, GFP_KERNEL);
|
||||
if (!out) {
|
||||
ret = -ENOMEM;
|
||||
goto end;
|
||||
}
|
||||
|
||||
MLX5_SET(query_hca_cap_in, in, opcode, MLX5_CMD_OP_QUERY_HCA_CAP);
|
||||
MLX5_SET(query_hca_cap_in, in, other_function, 1);
|
||||
MLX5_SET(query_hca_cap_in, in, function_id, function_id);
|
||||
MLX5_SET(query_hca_cap_in, in, op_mod,
|
||||
MLX5_SET_HCA_CAP_OP_MOD_GENERAL_DEVICE << 1 |
|
||||
HCA_CAP_OPMOD_GET_CUR);
|
||||
|
||||
ret = mlx5_cmd_exec_inout(mdev, query_hca_cap, in, out);
|
||||
if (ret)
|
||||
goto err_exec;
|
||||
|
||||
*vhca_id = MLX5_GET(query_hca_cap_out, out,
|
||||
capability.cmd_hca_cap.vhca_id);
|
||||
|
||||
err_exec:
|
||||
kfree(out);
|
||||
end:
|
||||
mlx5_vf_put_core_dev(mdev);
|
||||
return ret;
|
||||
}
|
||||
|
||||
static int _create_state_mkey(struct mlx5_core_dev *mdev, u32 pdn,
|
||||
struct mlx5_vf_migration_file *migf, u32 *mkey)
|
||||
{
|
||||
size_t npages = DIV_ROUND_UP(migf->total_length, PAGE_SIZE);
|
||||
struct sg_dma_page_iter dma_iter;
|
||||
int err = 0, inlen;
|
||||
__be64 *mtt;
|
||||
void *mkc;
|
||||
u32 *in;
|
||||
|
||||
inlen = MLX5_ST_SZ_BYTES(create_mkey_in) +
|
||||
sizeof(*mtt) * round_up(npages, 2);
|
||||
|
||||
in = kvzalloc(inlen, GFP_KERNEL);
|
||||
if (!in)
|
||||
return -ENOMEM;
|
||||
|
||||
MLX5_SET(create_mkey_in, in, translations_octword_actual_size,
|
||||
DIV_ROUND_UP(npages, 2));
|
||||
mtt = (__be64 *)MLX5_ADDR_OF(create_mkey_in, in, klm_pas_mtt);
|
||||
|
||||
for_each_sgtable_dma_page(&migf->table.sgt, &dma_iter, 0)
|
||||
*mtt++ = cpu_to_be64(sg_page_iter_dma_address(&dma_iter));
|
||||
|
||||
mkc = MLX5_ADDR_OF(create_mkey_in, in, memory_key_mkey_entry);
|
||||
MLX5_SET(mkc, mkc, access_mode_1_0, MLX5_MKC_ACCESS_MODE_MTT);
|
||||
MLX5_SET(mkc, mkc, lr, 1);
|
||||
MLX5_SET(mkc, mkc, lw, 1);
|
||||
MLX5_SET(mkc, mkc, rr, 1);
|
||||
MLX5_SET(mkc, mkc, rw, 1);
|
||||
MLX5_SET(mkc, mkc, pd, pdn);
|
||||
MLX5_SET(mkc, mkc, bsf_octword_size, 0);
|
||||
MLX5_SET(mkc, mkc, qpn, 0xffffff);
|
||||
MLX5_SET(mkc, mkc, log_page_size, PAGE_SHIFT);
|
||||
MLX5_SET(mkc, mkc, translations_octword_size, DIV_ROUND_UP(npages, 2));
|
||||
MLX5_SET64(mkc, mkc, len, migf->total_length);
|
||||
err = mlx5_core_create_mkey(mdev, mkey, in, inlen);
|
||||
kvfree(in);
|
||||
return err;
|
||||
}
|
||||
|
||||
int mlx5vf_cmd_save_vhca_state(struct pci_dev *pdev, u16 vhca_id,
|
||||
struct mlx5_vf_migration_file *migf)
|
||||
{
|
||||
struct mlx5_core_dev *mdev = mlx5_vf_get_core_dev(pdev);
|
||||
u32 out[MLX5_ST_SZ_DW(save_vhca_state_out)] = {};
|
||||
u32 in[MLX5_ST_SZ_DW(save_vhca_state_in)] = {};
|
||||
u32 pdn, mkey;
|
||||
int err;
|
||||
|
||||
if (!mdev)
|
||||
return -ENOTCONN;
|
||||
|
||||
err = mlx5_core_alloc_pd(mdev, &pdn);
|
||||
if (err)
|
||||
goto end;
|
||||
|
||||
err = dma_map_sgtable(mdev->device, &migf->table.sgt, DMA_FROM_DEVICE,
|
||||
0);
|
||||
if (err)
|
||||
goto err_dma_map;
|
||||
|
||||
err = _create_state_mkey(mdev, pdn, migf, &mkey);
|
||||
if (err)
|
||||
goto err_create_mkey;
|
||||
|
||||
MLX5_SET(save_vhca_state_in, in, opcode,
|
||||
MLX5_CMD_OP_SAVE_VHCA_STATE);
|
||||
MLX5_SET(save_vhca_state_in, in, op_mod, 0);
|
||||
MLX5_SET(save_vhca_state_in, in, vhca_id, vhca_id);
|
||||
MLX5_SET(save_vhca_state_in, in, mkey, mkey);
|
||||
MLX5_SET(save_vhca_state_in, in, size, migf->total_length);
|
||||
|
||||
err = mlx5_cmd_exec_inout(mdev, save_vhca_state, in, out);
|
||||
if (err)
|
||||
goto err_exec;
|
||||
|
||||
migf->total_length =
|
||||
MLX5_GET(save_vhca_state_out, out, actual_image_size);
|
||||
|
||||
mlx5_core_destroy_mkey(mdev, mkey);
|
||||
mlx5_core_dealloc_pd(mdev, pdn);
|
||||
dma_unmap_sgtable(mdev->device, &migf->table.sgt, DMA_FROM_DEVICE, 0);
|
||||
mlx5_vf_put_core_dev(mdev);
|
||||
|
||||
return 0;
|
||||
|
||||
err_exec:
|
||||
mlx5_core_destroy_mkey(mdev, mkey);
|
||||
err_create_mkey:
|
||||
dma_unmap_sgtable(mdev->device, &migf->table.sgt, DMA_FROM_DEVICE, 0);
|
||||
err_dma_map:
|
||||
mlx5_core_dealloc_pd(mdev, pdn);
|
||||
end:
|
||||
mlx5_vf_put_core_dev(mdev);
|
||||
return err;
|
||||
}
|
||||
|
||||
int mlx5vf_cmd_load_vhca_state(struct pci_dev *pdev, u16 vhca_id,
|
||||
struct mlx5_vf_migration_file *migf)
|
||||
{
|
||||
struct mlx5_core_dev *mdev = mlx5_vf_get_core_dev(pdev);
|
||||
u32 out[MLX5_ST_SZ_DW(save_vhca_state_out)] = {};
|
||||
u32 in[MLX5_ST_SZ_DW(save_vhca_state_in)] = {};
|
||||
u32 pdn, mkey;
|
||||
int err;
|
||||
|
||||
if (!mdev)
|
||||
return -ENOTCONN;
|
||||
|
||||
mutex_lock(&migf->lock);
|
||||
if (!migf->total_length) {
|
||||
err = -EINVAL;
|
||||
goto end;
|
||||
}
|
||||
|
||||
err = mlx5_core_alloc_pd(mdev, &pdn);
|
||||
if (err)
|
||||
goto end;
|
||||
|
||||
err = dma_map_sgtable(mdev->device, &migf->table.sgt, DMA_TO_DEVICE, 0);
|
||||
if (err)
|
||||
goto err_reg;
|
||||
|
||||
err = _create_state_mkey(mdev, pdn, migf, &mkey);
|
||||
if (err)
|
||||
goto err_mkey;
|
||||
|
||||
MLX5_SET(load_vhca_state_in, in, opcode,
|
||||
MLX5_CMD_OP_LOAD_VHCA_STATE);
|
||||
MLX5_SET(load_vhca_state_in, in, op_mod, 0);
|
||||
MLX5_SET(load_vhca_state_in, in, vhca_id, vhca_id);
|
||||
MLX5_SET(load_vhca_state_in, in, mkey, mkey);
|
||||
MLX5_SET(load_vhca_state_in, in, size, migf->total_length);
|
||||
|
||||
err = mlx5_cmd_exec_inout(mdev, load_vhca_state, in, out);
|
||||
|
||||
mlx5_core_destroy_mkey(mdev, mkey);
|
||||
err_mkey:
|
||||
dma_unmap_sgtable(mdev->device, &migf->table.sgt, DMA_TO_DEVICE, 0);
|
||||
err_reg:
|
||||
mlx5_core_dealloc_pd(mdev, pdn);
|
||||
end:
|
||||
mlx5_vf_put_core_dev(mdev);
|
||||
mutex_unlock(&migf->lock);
|
||||
return err;
|
||||
}
|
36
drivers/vfio/pci/mlx5/cmd.h
Normal file
36
drivers/vfio/pci/mlx5/cmd.h
Normal file
@ -0,0 +1,36 @@
|
||||
/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
|
||||
/*
|
||||
* Copyright (c) 2021-2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
|
||||
*/
|
||||
|
||||
#ifndef MLX5_VFIO_CMD_H
|
||||
#define MLX5_VFIO_CMD_H
|
||||
|
||||
#include <linux/kernel.h>
|
||||
#include <linux/mlx5/driver.h>
|
||||
|
||||
struct mlx5_vf_migration_file {
|
||||
struct file *filp;
|
||||
struct mutex lock;
|
||||
bool disabled;
|
||||
|
||||
struct sg_append_table table;
|
||||
size_t total_length;
|
||||
size_t allocated_length;
|
||||
|
||||
/* Optimize mlx5vf_get_migration_page() for sequential access */
|
||||
struct scatterlist *last_offset_sg;
|
||||
unsigned int sg_last_entry;
|
||||
unsigned long last_offset;
|
||||
};
|
||||
|
||||
int mlx5vf_cmd_suspend_vhca(struct pci_dev *pdev, u16 vhca_id, u16 op_mod);
|
||||
int mlx5vf_cmd_resume_vhca(struct pci_dev *pdev, u16 vhca_id, u16 op_mod);
|
||||
int mlx5vf_cmd_query_vhca_migration_state(struct pci_dev *pdev, u16 vhca_id,
|
||||
size_t *state_size);
|
||||
int mlx5vf_cmd_get_vhca_id(struct pci_dev *pdev, u16 function_id, u16 *vhca_id);
|
||||
int mlx5vf_cmd_save_vhca_state(struct pci_dev *pdev, u16 vhca_id,
|
||||
struct mlx5_vf_migration_file *migf);
|
||||
int mlx5vf_cmd_load_vhca_state(struct pci_dev *pdev, u16 vhca_id,
|
||||
struct mlx5_vf_migration_file *migf);
|
||||
#endif /* MLX5_VFIO_CMD_H */
|
676
drivers/vfio/pci/mlx5/main.c
Normal file
676
drivers/vfio/pci/mlx5/main.c
Normal file
@ -0,0 +1,676 @@
|
||||
// SPDX-License-Identifier: GPL-2.0-only
|
||||
/*
|
||||
* Copyright (c) 2021-2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved
|
||||
*/
|
||||
|
||||
#include <linux/device.h>
|
||||
#include <linux/eventfd.h>
|
||||
#include <linux/file.h>
|
||||
#include <linux/interrupt.h>
|
||||
#include <linux/iommu.h>
|
||||
#include <linux/module.h>
|
||||
#include <linux/mutex.h>
|
||||
#include <linux/notifier.h>
|
||||
#include <linux/pci.h>
|
||||
#include <linux/pm_runtime.h>
|
||||
#include <linux/types.h>
|
||||
#include <linux/uaccess.h>
|
||||
#include <linux/vfio.h>
|
||||
#include <linux/sched/mm.h>
|
||||
#include <linux/vfio_pci_core.h>
|
||||
#include <linux/anon_inodes.h>
|
||||
|
||||
#include "cmd.h"
|
||||
|
||||
/* Arbitrary to prevent userspace from consuming endless memory */
|
||||
#define MAX_MIGRATION_SIZE (512*1024*1024)
|
||||
|
||||
struct mlx5vf_pci_core_device {
|
||||
struct vfio_pci_core_device core_device;
|
||||
u16 vhca_id;
|
||||
u8 migrate_cap:1;
|
||||
u8 deferred_reset:1;
|
||||
/* protect migration state */
|
||||
struct mutex state_mutex;
|
||||
enum vfio_device_mig_state mig_state;
|
||||
/* protect the reset_done flow */
|
||||
spinlock_t reset_lock;
|
||||
struct mlx5_vf_migration_file *resuming_migf;
|
||||
struct mlx5_vf_migration_file *saving_migf;
|
||||
};
|
||||
|
||||
static struct page *
|
||||
mlx5vf_get_migration_page(struct mlx5_vf_migration_file *migf,
|
||||
unsigned long offset)
|
||||
{
|
||||
unsigned long cur_offset = 0;
|
||||
struct scatterlist *sg;
|
||||
unsigned int i;
|
||||
|
||||
/* All accesses are sequential */
|
||||
if (offset < migf->last_offset || !migf->last_offset_sg) {
|
||||
migf->last_offset = 0;
|
||||
migf->last_offset_sg = migf->table.sgt.sgl;
|
||||
migf->sg_last_entry = 0;
|
||||
}
|
||||
|
||||
cur_offset = migf->last_offset;
|
||||
|
||||
for_each_sg(migf->last_offset_sg, sg,
|
||||
migf->table.sgt.orig_nents - migf->sg_last_entry, i) {
|
||||
if (offset < sg->length + cur_offset) {
|
||||
migf->last_offset_sg = sg;
|
||||
migf->sg_last_entry += i;
|
||||
migf->last_offset = cur_offset;
|
||||
return nth_page(sg_page(sg),
|
||||
(offset - cur_offset) / PAGE_SIZE);
|
||||
}
|
||||
cur_offset += sg->length;
|
||||
}
|
||||
return NULL;
|
||||
}
|
||||
|
||||
static int mlx5vf_add_migration_pages(struct mlx5_vf_migration_file *migf,
|
||||
unsigned int npages)
|
||||
{
|
||||
unsigned int to_alloc = npages;
|
||||
struct page **page_list;
|
||||
unsigned long filled;
|
||||
unsigned int to_fill;
|
||||
int ret;
|
||||
|
||||
to_fill = min_t(unsigned int, npages, PAGE_SIZE / sizeof(*page_list));
|
||||
page_list = kvzalloc(to_fill * sizeof(*page_list), GFP_KERNEL);
|
||||
if (!page_list)
|
||||
return -ENOMEM;
|
||||
|
||||
do {
|
||||
filled = alloc_pages_bulk_array(GFP_KERNEL, to_fill, page_list);
|
||||
if (!filled) {
|
||||
ret = -ENOMEM;
|
||||
goto err;
|
||||
}
|
||||
to_alloc -= filled;
|
||||
ret = sg_alloc_append_table_from_pages(
|
||||
&migf->table, page_list, filled, 0,
|
||||
filled << PAGE_SHIFT, UINT_MAX, SG_MAX_SINGLE_ALLOC,
|
||||
GFP_KERNEL);
|
||||
|
||||
if (ret)
|
||||
goto err;
|
||||
migf->allocated_length += filled * PAGE_SIZE;
|
||||
/* clean input for another bulk allocation */
|
||||
memset(page_list, 0, filled * sizeof(*page_list));
|
||||
to_fill = min_t(unsigned int, to_alloc,
|
||||
PAGE_SIZE / sizeof(*page_list));
|
||||
} while (to_alloc > 0);
|
||||
|
||||
kvfree(page_list);
|
||||
return 0;
|
||||
|
||||
err:
|
||||
kvfree(page_list);
|
||||
return ret;
|
||||
}
|
||||
|
||||
static void mlx5vf_disable_fd(struct mlx5_vf_migration_file *migf)
|
||||
{
|
||||
struct sg_page_iter sg_iter;
|
||||
|
||||
mutex_lock(&migf->lock);
|
||||
/* Undo alloc_pages_bulk_array() */
|
||||
for_each_sgtable_page(&migf->table.sgt, &sg_iter, 0)
|
||||
__free_page(sg_page_iter_page(&sg_iter));
|
||||
sg_free_append_table(&migf->table);
|
||||
migf->disabled = true;
|
||||
migf->total_length = 0;
|
||||
migf->allocated_length = 0;
|
||||
migf->filp->f_pos = 0;
|
||||
mutex_unlock(&migf->lock);
|
||||
}
|
||||
|
||||
static int mlx5vf_release_file(struct inode *inode, struct file *filp)
|
||||
{
|
||||
struct mlx5_vf_migration_file *migf = filp->private_data;
|
||||
|
||||
mlx5vf_disable_fd(migf);
|
||||
mutex_destroy(&migf->lock);
|
||||
kfree(migf);
|
||||
return 0;
|
||||
}
|
||||
|
||||
static ssize_t mlx5vf_save_read(struct file *filp, char __user *buf, size_t len,
|
||||
loff_t *pos)
|
||||
{
|
||||
struct mlx5_vf_migration_file *migf = filp->private_data;
|
||||
ssize_t done = 0;
|
||||
|
||||
if (pos)
|
||||
return -ESPIPE;
|
||||
pos = &filp->f_pos;
|
||||
|
||||
mutex_lock(&migf->lock);
|
||||
if (*pos > migf->total_length) {
|
||||
done = -EINVAL;
|
||||
goto out_unlock;
|
||||
}
|
||||
if (migf->disabled) {
|
||||
done = -ENODEV;
|
||||
goto out_unlock;
|
||||
}
|
||||
|
||||
len = min_t(size_t, migf->total_length - *pos, len);
|
||||
while (len) {
|
||||
size_t page_offset;
|
||||
struct page *page;
|
||||
size_t page_len;
|
||||
u8 *from_buff;
|
||||
int ret;
|
||||
|
||||
page_offset = (*pos) % PAGE_SIZE;
|
||||
page = mlx5vf_get_migration_page(migf, *pos - page_offset);
|
||||
if (!page) {
|
||||
if (done == 0)
|
||||
done = -EINVAL;
|
||||
goto out_unlock;
|
||||
}
|
||||
|
||||
page_len = min_t(size_t, len, PAGE_SIZE - page_offset);
|
||||
from_buff = kmap_local_page(page);
|
||||
ret = copy_to_user(buf, from_buff + page_offset, page_len);
|
||||
kunmap_local(from_buff);
|
||||
if (ret) {
|
||||
done = -EFAULT;
|
||||
goto out_unlock;
|
||||
}
|
||||
*pos += page_len;
|
||||
len -= page_len;
|
||||
done += page_len;
|
||||
buf += page_len;
|
||||
}
|
||||
|
||||
out_unlock:
|
||||
mutex_unlock(&migf->lock);
|
||||
return done;
|
||||
}
|
||||
|
||||
static const struct file_operations mlx5vf_save_fops = {
|
||||
.owner = THIS_MODULE,
|
||||
.read = mlx5vf_save_read,
|
||||
.release = mlx5vf_release_file,
|
||||
.llseek = no_llseek,
|
||||
};
|
||||
|
||||
static struct mlx5_vf_migration_file *
|
||||
mlx5vf_pci_save_device_data(struct mlx5vf_pci_core_device *mvdev)
{
	struct mlx5_vf_migration_file *migf;
	int ret;

	migf = kzalloc(sizeof(*migf), GFP_KERNEL);
	if (!migf)
		return ERR_PTR(-ENOMEM);

	migf->filp = anon_inode_getfile("mlx5vf_mig", &mlx5vf_save_fops, migf,
					O_RDONLY);
	if (IS_ERR(migf->filp)) {
		int err = PTR_ERR(migf->filp);

		kfree(migf);
		return ERR_PTR(err);
	}

	stream_open(migf->filp->f_inode, migf->filp);
	mutex_init(&migf->lock);

	ret = mlx5vf_cmd_query_vhca_migration_state(
		mvdev->core_device.pdev, mvdev->vhca_id, &migf->total_length);
	if (ret)
		goto out_free;

	ret = mlx5vf_add_migration_pages(
		migf, DIV_ROUND_UP_ULL(migf->total_length, PAGE_SIZE));
	if (ret)
		goto out_free;

	ret = mlx5vf_cmd_save_vhca_state(mvdev->core_device.pdev,
					 mvdev->vhca_id, migf);
	if (ret)
		goto out_free;
	return migf;
out_free:
	fput(migf->filp);
	return ERR_PTR(ret);
}

static ssize_t mlx5vf_resume_write(struct file *filp, const char __user *buf,
				   size_t len, loff_t *pos)
{
	struct mlx5_vf_migration_file *migf = filp->private_data;
	loff_t requested_length;
	ssize_t done = 0;

	if (pos)
		return -ESPIPE;
	pos = &filp->f_pos;

	if (*pos < 0 ||
	    check_add_overflow((loff_t)len, *pos, &requested_length))
		return -EINVAL;

	if (requested_length > MAX_MIGRATION_SIZE)
		return -ENOMEM;

	mutex_lock(&migf->lock);
	if (migf->disabled) {
		done = -ENODEV;
		goto out_unlock;
	}

	if (migf->allocated_length < requested_length) {
		done = mlx5vf_add_migration_pages(
			migf,
			DIV_ROUND_UP(requested_length - migf->allocated_length,
				     PAGE_SIZE));
		if (done)
			goto out_unlock;
	}

	while (len) {
		size_t page_offset;
		struct page *page;
		size_t page_len;
		u8 *to_buff;
		int ret;

		page_offset = (*pos) % PAGE_SIZE;
		page = mlx5vf_get_migration_page(migf, *pos - page_offset);
		if (!page) {
			if (done == 0)
				done = -EINVAL;
			goto out_unlock;
		}

		page_len = min_t(size_t, len, PAGE_SIZE - page_offset);
		to_buff = kmap_local_page(page);
		ret = copy_from_user(to_buff + page_offset, buf, page_len);
		kunmap_local(to_buff);
		if (ret) {
			done = -EFAULT;
			goto out_unlock;
		}
		*pos += page_len;
		len -= page_len;
		done += page_len;
		buf += page_len;
		migf->total_length += page_len;
	}
out_unlock:
	mutex_unlock(&migf->lock);
	return done;
}

static const struct file_operations mlx5vf_resume_fops = {
	.owner = THIS_MODULE,
	.write = mlx5vf_resume_write,
	.release = mlx5vf_release_file,
	.llseek = no_llseek,
};

static struct mlx5_vf_migration_file *
mlx5vf_pci_resume_device_data(struct mlx5vf_pci_core_device *mvdev)
{
	struct mlx5_vf_migration_file *migf;

	migf = kzalloc(sizeof(*migf), GFP_KERNEL);
	if (!migf)
		return ERR_PTR(-ENOMEM);

	migf->filp = anon_inode_getfile("mlx5vf_mig", &mlx5vf_resume_fops, migf,
					O_WRONLY);
	if (IS_ERR(migf->filp)) {
		int err = PTR_ERR(migf->filp);

		kfree(migf);
		return ERR_PTR(err);
	}
	stream_open(migf->filp->f_inode, migf->filp);
	mutex_init(&migf->lock);
	return migf;
}
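/*
 * Illustrative only, not part of this patch: with the v2 protocol the saved
 * device state is an opaque byte stream.  A userspace sketch of moving it
 * from the source device's saving FD (opened O_RDONLY above) to the
 * destination device's resuming FD (opened O_WRONLY above) could look like
 * the following; the helper name is hypothetical and short writes are
 * ignored for brevity.
 */
#include <unistd.h>

static int copy_migration_stream(int saving_fd, int resuming_fd)
{
	char buf[4096];
	ssize_t n;

	/* read() hits the source's save fops, write() the resume fops above */
	while ((n = read(saving_fd, buf, sizeof(buf))) > 0) {
		if (write(resuming_fd, buf, n) != n)
			return -1;
	}
	return n < 0 ? -1 : 0;
}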
static void mlx5vf_disable_fds(struct mlx5vf_pci_core_device *mvdev)
{
	if (mvdev->resuming_migf) {
		mlx5vf_disable_fd(mvdev->resuming_migf);
		fput(mvdev->resuming_migf->filp);
		mvdev->resuming_migf = NULL;
	}
	if (mvdev->saving_migf) {
		mlx5vf_disable_fd(mvdev->saving_migf);
		fput(mvdev->saving_migf->filp);
		mvdev->saving_migf = NULL;
	}
}

static struct file *
mlx5vf_pci_step_device_state_locked(struct mlx5vf_pci_core_device *mvdev,
				    u32 new)
{
	u32 cur = mvdev->mig_state;
	int ret;

	if (cur == VFIO_DEVICE_STATE_RUNNING_P2P && new == VFIO_DEVICE_STATE_STOP) {
		ret = mlx5vf_cmd_suspend_vhca(
			mvdev->core_device.pdev, mvdev->vhca_id,
			MLX5_SUSPEND_VHCA_IN_OP_MOD_SUSPEND_RESPONDER);
		if (ret)
			return ERR_PTR(ret);
		return NULL;
	}

	if (cur == VFIO_DEVICE_STATE_STOP && new == VFIO_DEVICE_STATE_RUNNING_P2P) {
		ret = mlx5vf_cmd_resume_vhca(
			mvdev->core_device.pdev, mvdev->vhca_id,
			MLX5_RESUME_VHCA_IN_OP_MOD_RESUME_RESPONDER);
		if (ret)
			return ERR_PTR(ret);
		return NULL;
	}

	if (cur == VFIO_DEVICE_STATE_RUNNING && new == VFIO_DEVICE_STATE_RUNNING_P2P) {
		ret = mlx5vf_cmd_suspend_vhca(
			mvdev->core_device.pdev, mvdev->vhca_id,
			MLX5_SUSPEND_VHCA_IN_OP_MOD_SUSPEND_INITIATOR);
		if (ret)
			return ERR_PTR(ret);
		return NULL;
	}

	if (cur == VFIO_DEVICE_STATE_RUNNING_P2P && new == VFIO_DEVICE_STATE_RUNNING) {
		ret = mlx5vf_cmd_resume_vhca(
			mvdev->core_device.pdev, mvdev->vhca_id,
			MLX5_RESUME_VHCA_IN_OP_MOD_RESUME_INITIATOR);
		if (ret)
			return ERR_PTR(ret);
		return NULL;
	}

	if (cur == VFIO_DEVICE_STATE_STOP && new == VFIO_DEVICE_STATE_STOP_COPY) {
		struct mlx5_vf_migration_file *migf;

		migf = mlx5vf_pci_save_device_data(mvdev);
		if (IS_ERR(migf))
			return ERR_CAST(migf);
		get_file(migf->filp);
		mvdev->saving_migf = migf;
		return migf->filp;
	}

	if ((cur == VFIO_DEVICE_STATE_STOP_COPY && new == VFIO_DEVICE_STATE_STOP)) {
		mlx5vf_disable_fds(mvdev);
		return NULL;
	}

	if (cur == VFIO_DEVICE_STATE_STOP && new == VFIO_DEVICE_STATE_RESUMING) {
		struct mlx5_vf_migration_file *migf;

		migf = mlx5vf_pci_resume_device_data(mvdev);
		if (IS_ERR(migf))
			return ERR_CAST(migf);
		get_file(migf->filp);
		mvdev->resuming_migf = migf;
		return migf->filp;
	}

	if (cur == VFIO_DEVICE_STATE_RESUMING && new == VFIO_DEVICE_STATE_STOP) {
		ret = mlx5vf_cmd_load_vhca_state(mvdev->core_device.pdev,
						 mvdev->vhca_id,
						 mvdev->resuming_migf);
		if (ret)
			return ERR_PTR(ret);
		mlx5vf_disable_fds(mvdev);
		return NULL;
	}

	/*
	 * vfio_mig_get_next_state() does not use arcs other than the above
	 */
	WARN_ON(true);
	return ERR_PTR(-EINVAL);
}

/*
 * This function is called in all state_mutex unlock cases to
 * handle a 'deferred_reset' if exists.
 */
static void mlx5vf_state_mutex_unlock(struct mlx5vf_pci_core_device *mvdev)
{
again:
	spin_lock(&mvdev->reset_lock);
	if (mvdev->deferred_reset) {
		mvdev->deferred_reset = false;
		spin_unlock(&mvdev->reset_lock);
		mvdev->mig_state = VFIO_DEVICE_STATE_RUNNING;
		mlx5vf_disable_fds(mvdev);
		goto again;
	}
	mutex_unlock(&mvdev->state_mutex);
	spin_unlock(&mvdev->reset_lock);
}

static struct file *
mlx5vf_pci_set_device_state(struct vfio_device *vdev,
			    enum vfio_device_mig_state new_state)
{
	struct mlx5vf_pci_core_device *mvdev = container_of(
		vdev, struct mlx5vf_pci_core_device, core_device.vdev);
	enum vfio_device_mig_state next_state;
	struct file *res = NULL;
	int ret;

	mutex_lock(&mvdev->state_mutex);
	while (new_state != mvdev->mig_state) {
		ret = vfio_mig_get_next_state(vdev, mvdev->mig_state,
					      new_state, &next_state);
		if (ret) {
			res = ERR_PTR(ret);
			break;
		}
		res = mlx5vf_pci_step_device_state_locked(mvdev, next_state);
		if (IS_ERR(res))
			break;
		mvdev->mig_state = next_state;
		if (WARN_ON(res && new_state != mvdev->mig_state)) {
			fput(res);
			res = ERR_PTR(-EINVAL);
			break;
		}
	}
	mlx5vf_state_mutex_unlock(mvdev);
	return res;
}

static int mlx5vf_pci_get_device_state(struct vfio_device *vdev,
				       enum vfio_device_mig_state *curr_state)
{
	struct mlx5vf_pci_core_device *mvdev = container_of(
		vdev, struct mlx5vf_pci_core_device, core_device.vdev);

	mutex_lock(&mvdev->state_mutex);
	*curr_state = mvdev->mig_state;
	mlx5vf_state_mutex_unlock(mvdev);
	return 0;
}

static void mlx5vf_pci_aer_reset_done(struct pci_dev *pdev)
{
	struct mlx5vf_pci_core_device *mvdev = dev_get_drvdata(&pdev->dev);

	if (!mvdev->migrate_cap)
		return;

	/*
	 * As the higher VFIO layers are holding locks across reset and using
	 * those same locks with the mm_lock we need to prevent ABBA deadlock
	 * with the state_mutex and mm_lock.
	 * In case the state_mutex was taken already we defer the cleanup work
	 * to the unlock flow of the other running context.
	 */
	spin_lock(&mvdev->reset_lock);
	mvdev->deferred_reset = true;
	if (!mutex_trylock(&mvdev->state_mutex)) {
		spin_unlock(&mvdev->reset_lock);
		return;
	}
	spin_unlock(&mvdev->reset_lock);
	mlx5vf_state_mutex_unlock(mvdev);
}

static int mlx5vf_pci_open_device(struct vfio_device *core_vdev)
{
	struct mlx5vf_pci_core_device *mvdev = container_of(
		core_vdev, struct mlx5vf_pci_core_device, core_device.vdev);
	struct vfio_pci_core_device *vdev = &mvdev->core_device;
	int vf_id;
	int ret;

	ret = vfio_pci_core_enable(vdev);
	if (ret)
		return ret;

	if (!mvdev->migrate_cap) {
		vfio_pci_core_finish_enable(vdev);
		return 0;
	}

	vf_id = pci_iov_vf_id(vdev->pdev);
	if (vf_id < 0) {
		ret = vf_id;
		goto out_disable;
	}

	ret = mlx5vf_cmd_get_vhca_id(vdev->pdev, vf_id + 1, &mvdev->vhca_id);
	if (ret)
		goto out_disable;

	mvdev->mig_state = VFIO_DEVICE_STATE_RUNNING;
	vfio_pci_core_finish_enable(vdev);
	return 0;
out_disable:
	vfio_pci_core_disable(vdev);
	return ret;
}

static void mlx5vf_pci_close_device(struct vfio_device *core_vdev)
{
	struct mlx5vf_pci_core_device *mvdev = container_of(
		core_vdev, struct mlx5vf_pci_core_device, core_device.vdev);

	mlx5vf_disable_fds(mvdev);
	vfio_pci_core_close_device(core_vdev);
}

static const struct vfio_device_ops mlx5vf_pci_ops = {
	.name = "mlx5-vfio-pci",
	.open_device = mlx5vf_pci_open_device,
	.close_device = mlx5vf_pci_close_device,
	.ioctl = vfio_pci_core_ioctl,
	.device_feature = vfio_pci_core_ioctl_feature,
	.read = vfio_pci_core_read,
	.write = vfio_pci_core_write,
	.mmap = vfio_pci_core_mmap,
	.request = vfio_pci_core_request,
	.match = vfio_pci_core_match,
	.migration_set_state = mlx5vf_pci_set_device_state,
	.migration_get_state = mlx5vf_pci_get_device_state,
};

static int mlx5vf_pci_probe(struct pci_dev *pdev,
			    const struct pci_device_id *id)
{
	struct mlx5vf_pci_core_device *mvdev;
	int ret;

	mvdev = kzalloc(sizeof(*mvdev), GFP_KERNEL);
	if (!mvdev)
		return -ENOMEM;
	vfio_pci_core_init_device(&mvdev->core_device, pdev, &mlx5vf_pci_ops);

	if (pdev->is_virtfn) {
		struct mlx5_core_dev *mdev =
			mlx5_vf_get_core_dev(pdev);

		if (mdev) {
			if (MLX5_CAP_GEN(mdev, migration)) {
				mvdev->migrate_cap = 1;
				mvdev->core_device.vdev.migration_flags =
					VFIO_MIGRATION_STOP_COPY |
					VFIO_MIGRATION_P2P;
				mutex_init(&mvdev->state_mutex);
				spin_lock_init(&mvdev->reset_lock);
			}
			mlx5_vf_put_core_dev(mdev);
		}
	}

	ret = vfio_pci_core_register_device(&mvdev->core_device);
	if (ret)
		goto out_free;

	dev_set_drvdata(&pdev->dev, mvdev);
	return 0;

out_free:
	vfio_pci_core_uninit_device(&mvdev->core_device);
	kfree(mvdev);
	return ret;
}

static void mlx5vf_pci_remove(struct pci_dev *pdev)
{
	struct mlx5vf_pci_core_device *mvdev = dev_get_drvdata(&pdev->dev);

	vfio_pci_core_unregister_device(&mvdev->core_device);
	vfio_pci_core_uninit_device(&mvdev->core_device);
	kfree(mvdev);
}

static const struct pci_device_id mlx5vf_pci_table[] = {
	{ PCI_DRIVER_OVERRIDE_DEVICE_VFIO(PCI_VENDOR_ID_MELLANOX, 0x101e) }, /* ConnectX Family mlx5Gen Virtual Function */
	{}
};

MODULE_DEVICE_TABLE(pci, mlx5vf_pci_table);

static const struct pci_error_handlers mlx5vf_err_handlers = {
	.reset_done = mlx5vf_pci_aer_reset_done,
	.error_detected = vfio_pci_core_aer_err_detected,
};

static struct pci_driver mlx5vf_pci_driver = {
	.name = KBUILD_MODNAME,
	.id_table = mlx5vf_pci_table,
	.probe = mlx5vf_pci_probe,
	.remove = mlx5vf_pci_remove,
	.err_handler = &mlx5vf_err_handlers,
};

static void __exit mlx5vf_pci_cleanup(void)
{
	pci_unregister_driver(&mlx5vf_pci_driver);
}

static int __init mlx5vf_pci_init(void)
{
	return pci_register_driver(&mlx5vf_pci_driver);
}

module_init(mlx5vf_pci_init);
module_exit(mlx5vf_pci_cleanup);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Max Gurtovoy <mgurtovoy@nvidia.com>");
MODULE_AUTHOR("Yishai Hadas <yishaih@nvidia.com>");
MODULE_DESCRIPTION(
	"MLX5 VFIO PCI - User Level meta-driver for MLX5 device family");
@@ -130,6 +130,7 @@ static const struct vfio_device_ops vfio_pci_ops = {
	.open_device = vfio_pci_open_device,
	.close_device = vfio_pci_core_close_device,
	.ioctl = vfio_pci_core_ioctl,
	.device_feature = vfio_pci_core_ioctl_feature,
	.read = vfio_pci_core_read,
	.write = vfio_pci_core_write,
	.mmap = vfio_pci_core_mmap,

@@ -228,6 +228,19 @@ int vfio_pci_set_power_state(struct vfio_pci_core_device *vdev, pci_power_t stat
	if (!ret) {
		/* D3 might be unsupported via quirk, skip unless in D3 */
		if (needs_save && pdev->current_state >= PCI_D3hot) {
			/*
			 * The current PCI state will be saved locally in
			 * 'pm_save' during the D3hot transition. When the
			 * device state is changed to D0 again with the current
			 * function, then pci_store_saved_state() will restore
			 * the state and will free the memory pointed by
			 * 'pm_save'. There are few cases where the PCI power
			 * state can be changed to D0 without the involvement
			 * of the driver. For these cases, free the earlier
			 * allocated memory first before overwriting 'pm_save'
			 * to prevent the memory leak.
			 */
			kfree(vdev->pm_save);
			vdev->pm_save = pci_store_saved_state(pdev);
		} else if (needs_restore) {
			pci_load_and_free_saved_state(pdev, &vdev->pm_save);

@@ -322,6 +335,17 @@ void vfio_pci_core_disable(struct vfio_pci_core_device *vdev)
	/* For needs_reset */
	lockdep_assert_held(&vdev->vdev.dev_set->lock);

	/*
	 * This function can be invoked while the power state is non-D0.
	 * This function calls __pci_reset_function_locked() which internally
	 * can use pci_pm_reset() for the function reset. pci_pm_reset() will
	 * fail if the power state is non-D0. Also, for the devices which
	 * have NoSoftRst-, the reset function can cause the PCI config space
	 * reset without restoring the original state (saved locally in
	 * 'vdev->pm_save').
	 */
	vfio_pci_set_power_state(vdev, PCI_D0);

	/* Stop the device from further DMA */
	pci_clear_master(pdev);

@@ -921,6 +945,19 @@ long vfio_pci_core_ioctl(struct vfio_device *core_vdev, unsigned int cmd,
			return -EINVAL;

		vfio_pci_zap_and_down_write_memory_lock(vdev);

		/*
		 * This function can be invoked while the power state is non-D0.
		 * If pci_try_reset_function() has been called while the power
		 * state is non-D0, then pci_try_reset_function() will
		 * internally set the power state to D0 without vfio driver
		 * involvement. For the devices which have NoSoftRst-, the
		 * reset function can cause the PCI config space reset without
		 * restoring the original state (saved locally in
		 * 'vdev->pm_save').
		 */
		vfio_pci_set_power_state(vdev, PCI_D0);

		ret = pci_try_reset_function(vdev->pdev);
		up_write(&vdev->memory_lock);

@@ -1114,71 +1151,51 @@ hot_reset_release:

		return vfio_pci_ioeventfd(vdev, ioeventfd.offset,
					  ioeventfd.data, count, ioeventfd.fd);
	} else if (cmd == VFIO_DEVICE_FEATURE) {
		struct vfio_device_feature feature;
		uuid_t uuid;

		minsz = offsetofend(struct vfio_device_feature, flags);

		if (copy_from_user(&feature, (void __user *)arg, minsz))
			return -EFAULT;

		if (feature.argsz < minsz)
			return -EINVAL;

		/* Check unknown flags */
		if (feature.flags & ~(VFIO_DEVICE_FEATURE_MASK |
				      VFIO_DEVICE_FEATURE_SET |
				      VFIO_DEVICE_FEATURE_GET |
				      VFIO_DEVICE_FEATURE_PROBE))
			return -EINVAL;

		/* GET & SET are mutually exclusive except with PROBE */
		if (!(feature.flags & VFIO_DEVICE_FEATURE_PROBE) &&
		    (feature.flags & VFIO_DEVICE_FEATURE_SET) &&
		    (feature.flags & VFIO_DEVICE_FEATURE_GET))
			return -EINVAL;

		switch (feature.flags & VFIO_DEVICE_FEATURE_MASK) {
		case VFIO_DEVICE_FEATURE_PCI_VF_TOKEN:
			if (!vdev->vf_token)
				return -ENOTTY;

			/*
			 * We do not support GET of the VF Token UUID as this
			 * could expose the token of the previous device user.
			 */
			if (feature.flags & VFIO_DEVICE_FEATURE_GET)
				return -EINVAL;

			if (feature.flags & VFIO_DEVICE_FEATURE_PROBE)
				return 0;

			/* Don't SET unless told to do so */
			if (!(feature.flags & VFIO_DEVICE_FEATURE_SET))
				return -EINVAL;

			if (feature.argsz < minsz + sizeof(uuid))
				return -EINVAL;

			if (copy_from_user(&uuid, (void __user *)(arg + minsz),
					   sizeof(uuid)))
				return -EFAULT;

			mutex_lock(&vdev->vf_token->lock);
			uuid_copy(&vdev->vf_token->uuid, &uuid);
			mutex_unlock(&vdev->vf_token->lock);

			return 0;
		default:
			return -ENOTTY;
		}
	}

	return -ENOTTY;
}
EXPORT_SYMBOL_GPL(vfio_pci_core_ioctl);

static int vfio_pci_core_feature_token(struct vfio_device *device, u32 flags,
				       void __user *arg, size_t argsz)
{
	struct vfio_pci_core_device *vdev =
		container_of(device, struct vfio_pci_core_device, vdev);
	uuid_t uuid;
	int ret;

	if (!vdev->vf_token)
		return -ENOTTY;
	/*
	 * We do not support GET of the VF Token UUID as this could
	 * expose the token of the previous device user.
	 */
	ret = vfio_check_feature(flags, argsz, VFIO_DEVICE_FEATURE_SET,
				 sizeof(uuid));
	if (ret != 1)
		return ret;

	if (copy_from_user(&uuid, arg, sizeof(uuid)))
		return -EFAULT;

	mutex_lock(&vdev->vf_token->lock);
	uuid_copy(&vdev->vf_token->uuid, &uuid);
	mutex_unlock(&vdev->vf_token->lock);
	return 0;
}

int vfio_pci_core_ioctl_feature(struct vfio_device *device, u32 flags,
				void __user *arg, size_t argsz)
{
	switch (flags & VFIO_DEVICE_FEATURE_MASK) {
	case VFIO_DEVICE_FEATURE_PCI_VF_TOKEN:
		return vfio_pci_core_feature_token(device, flags, arg, argsz);
	default:
		return -ENOTTY;
	}
}
EXPORT_SYMBOL_GPL(vfio_pci_core_ioctl_feature);

static ssize_t vfio_pci_rw(struct vfio_pci_core_device *vdev, char __user *buf,
			   size_t count, loff_t *ppos, bool iswrite)
{

@@ -1891,8 +1908,8 @@ void vfio_pci_core_unregister_device(struct vfio_pci_core_device *vdev)
}
EXPORT_SYMBOL_GPL(vfio_pci_core_unregister_device);

static pci_ers_result_t vfio_pci_aer_err_detected(struct pci_dev *pdev,
						  pci_channel_state_t state)
pci_ers_result_t vfio_pci_core_aer_err_detected(struct pci_dev *pdev,
						pci_channel_state_t state)
{
	struct vfio_pci_core_device *vdev;
	struct vfio_device *device;

@@ -1914,6 +1931,7 @@ static pci_ers_result_t vfio_pci_aer_err_detected(struct pci_dev *pdev,

	return PCI_ERS_RESULT_CAN_RECOVER;
}
EXPORT_SYMBOL_GPL(vfio_pci_core_aer_err_detected);

int vfio_pci_core_sriov_configure(struct pci_dev *pdev, int nr_virtfn)
{

@@ -1936,7 +1954,7 @@ int vfio_pci_core_sriov_configure(struct pci_dev *pdev, int nr_virtfn)
EXPORT_SYMBOL_GPL(vfio_pci_core_sriov_configure);

const struct pci_error_handlers vfio_pci_core_err_handlers = {
	.error_detected = vfio_pci_aer_err_detected,
	.error_detected = vfio_pci_core_aer_err_detected,
};
EXPORT_SYMBOL_GPL(vfio_pci_core_err_handlers);

@@ -2055,6 +2073,18 @@ static int vfio_pci_dev_set_hot_reset(struct vfio_device_set *dev_set,
	}
	cur_mem = NULL;

	/*
	 * The pci_reset_bus() will reset all the devices in the bus.
	 * The power state can be non-D0 for some of the devices in the bus.
	 * For these devices, the pci_reset_bus() will internally set
	 * the power state to D0 without vfio driver involvement.
	 * For the devices which have NoSoftRst-, the reset function can
	 * cause the PCI config space reset without restoring the original
	 * state (saved locally in 'vdev->pm_save').
	 */
	list_for_each_entry(cur, &dev_set->device_list, vdev.dev_set_list)
		vfio_pci_set_power_state(cur, PCI_D0);

	ret = pci_reset_bus(pdev);

err_undo:

@@ -2108,6 +2138,18 @@ static bool vfio_pci_dev_set_try_reset(struct vfio_device_set *dev_set)
	if (!pdev)
		return false;

	/*
	 * The pci_reset_bus() will reset all the devices in the bus.
	 * The power state can be non-D0 for some of the devices in the bus.
	 * For these devices, the pci_reset_bus() will internally set
	 * the power state to D0 without vfio driver involvement.
	 * For the devices which have NoSoftRst-, the reset function can
	 * cause the PCI config space reset without restoring the original
	 * state (saved locally in 'vdev->pm_save').
	 */
	list_for_each_entry(cur, &dev_set->device_list, vdev.dev_set_list)
		vfio_pci_set_power_state(cur, PCI_D0);

	ret = pci_reset_bus(pdev);
	if (ret)
		return false;

@@ -288,6 +288,7 @@ out:
	return done;
}

#ifdef CONFIG_VFIO_PCI_VGA
ssize_t vfio_pci_vga_rw(struct vfio_pci_core_device *vdev, char __user *buf,
			size_t count, loff_t *ppos, bool iswrite)
{

@@ -355,6 +356,7 @@ ssize_t vfio_pci_vga_rw(struct vfio_pci_core_device *vdev, char __user *buf,

	return done;
}
#endif

static void vfio_pci_ioeventfd_do_write(struct vfio_pci_ioeventfd *ioeventfd,
					bool test_mem)
@@ -1557,15 +1557,303 @@ static int vfio_device_fops_release(struct inode *inode, struct file *filep)
	return 0;
}

/*
 * vfio_mig_get_next_state - Compute the next step in the FSM
 * @cur_fsm - The current state the device is in
 * @new_fsm - The target state to reach
 * @next_fsm - Pointer to the next step to get to new_fsm
 *
 * Return 0 upon success, otherwise -errno
 * Upon success the next step in the state progression between cur_fsm and
 * new_fsm will be set in next_fsm.
 *
 * This breaks down requests for combination transitions into smaller steps and
 * returns the next step to get to new_fsm. The function may need to be called
 * multiple times before reaching new_fsm.
 *
 */
int vfio_mig_get_next_state(struct vfio_device *device,
			    enum vfio_device_mig_state cur_fsm,
			    enum vfio_device_mig_state new_fsm,
			    enum vfio_device_mig_state *next_fsm)
{
	enum { VFIO_DEVICE_NUM_STATES = VFIO_DEVICE_STATE_RUNNING_P2P + 1 };
	/*
	 * The coding in this table requires the driver to implement the
	 * following FSM arcs:
	 *         RESUMING -> STOP
	 *         STOP -> RESUMING
	 *         STOP -> STOP_COPY
	 *         STOP_COPY -> STOP
	 *
	 * If P2P is supported then the driver must also implement these FSM
	 * arcs:
	 *         RUNNING -> RUNNING_P2P
	 *         RUNNING_P2P -> RUNNING
	 *         RUNNING_P2P -> STOP
	 *         STOP -> RUNNING_P2P
	 * Without P2P the driver must implement:
	 *         RUNNING -> STOP
	 *         STOP -> RUNNING
	 *
	 * The coding will step through multiple states for some combination
	 * transitions; if all optional features are supported, this means the
	 * following ones:
	 *         RESUMING -> STOP -> RUNNING_P2P
	 *         RESUMING -> STOP -> RUNNING_P2P -> RUNNING
	 *         RESUMING -> STOP -> STOP_COPY
	 *         RUNNING -> RUNNING_P2P -> STOP
	 *         RUNNING -> RUNNING_P2P -> STOP -> RESUMING
	 *         RUNNING -> RUNNING_P2P -> STOP -> STOP_COPY
	 *         RUNNING_P2P -> STOP -> RESUMING
	 *         RUNNING_P2P -> STOP -> STOP_COPY
	 *         STOP -> RUNNING_P2P -> RUNNING
	 *         STOP_COPY -> STOP -> RESUMING
	 *         STOP_COPY -> STOP -> RUNNING_P2P
	 *         STOP_COPY -> STOP -> RUNNING_P2P -> RUNNING
	 */
	static const u8 vfio_from_fsm_table[VFIO_DEVICE_NUM_STATES][VFIO_DEVICE_NUM_STATES] = {
		[VFIO_DEVICE_STATE_STOP] = {
			[VFIO_DEVICE_STATE_STOP] = VFIO_DEVICE_STATE_STOP,
			[VFIO_DEVICE_STATE_RUNNING] = VFIO_DEVICE_STATE_RUNNING_P2P,
			[VFIO_DEVICE_STATE_STOP_COPY] = VFIO_DEVICE_STATE_STOP_COPY,
			[VFIO_DEVICE_STATE_RESUMING] = VFIO_DEVICE_STATE_RESUMING,
			[VFIO_DEVICE_STATE_RUNNING_P2P] = VFIO_DEVICE_STATE_RUNNING_P2P,
			[VFIO_DEVICE_STATE_ERROR] = VFIO_DEVICE_STATE_ERROR,
		},
		[VFIO_DEVICE_STATE_RUNNING] = {
			[VFIO_DEVICE_STATE_STOP] = VFIO_DEVICE_STATE_RUNNING_P2P,
			[VFIO_DEVICE_STATE_RUNNING] = VFIO_DEVICE_STATE_RUNNING,
			[VFIO_DEVICE_STATE_STOP_COPY] = VFIO_DEVICE_STATE_RUNNING_P2P,
			[VFIO_DEVICE_STATE_RESUMING] = VFIO_DEVICE_STATE_RUNNING_P2P,
			[VFIO_DEVICE_STATE_RUNNING_P2P] = VFIO_DEVICE_STATE_RUNNING_P2P,
			[VFIO_DEVICE_STATE_ERROR] = VFIO_DEVICE_STATE_ERROR,
		},
		[VFIO_DEVICE_STATE_STOP_COPY] = {
			[VFIO_DEVICE_STATE_STOP] = VFIO_DEVICE_STATE_STOP,
			[VFIO_DEVICE_STATE_RUNNING] = VFIO_DEVICE_STATE_STOP,
			[VFIO_DEVICE_STATE_STOP_COPY] = VFIO_DEVICE_STATE_STOP_COPY,
			[VFIO_DEVICE_STATE_RESUMING] = VFIO_DEVICE_STATE_STOP,
			[VFIO_DEVICE_STATE_RUNNING_P2P] = VFIO_DEVICE_STATE_STOP,
			[VFIO_DEVICE_STATE_ERROR] = VFIO_DEVICE_STATE_ERROR,
		},
		[VFIO_DEVICE_STATE_RESUMING] = {
			[VFIO_DEVICE_STATE_STOP] = VFIO_DEVICE_STATE_STOP,
			[VFIO_DEVICE_STATE_RUNNING] = VFIO_DEVICE_STATE_STOP,
			[VFIO_DEVICE_STATE_STOP_COPY] = VFIO_DEVICE_STATE_STOP,
			[VFIO_DEVICE_STATE_RESUMING] = VFIO_DEVICE_STATE_RESUMING,
			[VFIO_DEVICE_STATE_RUNNING_P2P] = VFIO_DEVICE_STATE_STOP,
			[VFIO_DEVICE_STATE_ERROR] = VFIO_DEVICE_STATE_ERROR,
		},
		[VFIO_DEVICE_STATE_RUNNING_P2P] = {
			[VFIO_DEVICE_STATE_STOP] = VFIO_DEVICE_STATE_STOP,
			[VFIO_DEVICE_STATE_RUNNING] = VFIO_DEVICE_STATE_RUNNING,
			[VFIO_DEVICE_STATE_STOP_COPY] = VFIO_DEVICE_STATE_STOP,
			[VFIO_DEVICE_STATE_RESUMING] = VFIO_DEVICE_STATE_STOP,
			[VFIO_DEVICE_STATE_RUNNING_P2P] = VFIO_DEVICE_STATE_RUNNING_P2P,
			[VFIO_DEVICE_STATE_ERROR] = VFIO_DEVICE_STATE_ERROR,
		},
		[VFIO_DEVICE_STATE_ERROR] = {
			[VFIO_DEVICE_STATE_STOP] = VFIO_DEVICE_STATE_ERROR,
			[VFIO_DEVICE_STATE_RUNNING] = VFIO_DEVICE_STATE_ERROR,
			[VFIO_DEVICE_STATE_STOP_COPY] = VFIO_DEVICE_STATE_ERROR,
			[VFIO_DEVICE_STATE_RESUMING] = VFIO_DEVICE_STATE_ERROR,
			[VFIO_DEVICE_STATE_RUNNING_P2P] = VFIO_DEVICE_STATE_ERROR,
			[VFIO_DEVICE_STATE_ERROR] = VFIO_DEVICE_STATE_ERROR,
		},
	};

	static const unsigned int state_flags_table[VFIO_DEVICE_NUM_STATES] = {
		[VFIO_DEVICE_STATE_STOP] = VFIO_MIGRATION_STOP_COPY,
		[VFIO_DEVICE_STATE_RUNNING] = VFIO_MIGRATION_STOP_COPY,
		[VFIO_DEVICE_STATE_STOP_COPY] = VFIO_MIGRATION_STOP_COPY,
		[VFIO_DEVICE_STATE_RESUMING] = VFIO_MIGRATION_STOP_COPY,
		[VFIO_DEVICE_STATE_RUNNING_P2P] =
			VFIO_MIGRATION_STOP_COPY | VFIO_MIGRATION_P2P,
		[VFIO_DEVICE_STATE_ERROR] = ~0U,
	};

	if (WARN_ON(cur_fsm >= ARRAY_SIZE(vfio_from_fsm_table) ||
		    (state_flags_table[cur_fsm] & device->migration_flags) !=
			state_flags_table[cur_fsm]))
		return -EINVAL;

	if (new_fsm >= ARRAY_SIZE(vfio_from_fsm_table) ||
	    (state_flags_table[new_fsm] & device->migration_flags) !=
		state_flags_table[new_fsm])
		return -EINVAL;

	/*
	 * Arcs touching optional and unsupported states are skipped over. The
	 * driver will instead see an arc from the original state to the next
	 * logical state, as per the above comment.
	 */
	*next_fsm = vfio_from_fsm_table[cur_fsm][new_fsm];
	while ((state_flags_table[*next_fsm] & device->migration_flags) !=
			state_flags_table[*next_fsm])
		*next_fsm = vfio_from_fsm_table[*next_fsm][new_fsm];

	return (*next_fsm != VFIO_DEVICE_STATE_ERROR) ? 0 : -EINVAL;
}
EXPORT_SYMBOL_GPL(vfio_mig_get_next_state);
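/*
 * Illustrative only, not part of this patch: since vfio_mig_get_next_state()
 * returns one arc at a time, a variant driver's migration_set_state op is
 * expected to call it in a loop until the target state is reached (the mlx5
 * driver earlier in this series does exactly this).  Function and variable
 * names here are hypothetical.
 */
static int example_walk_fsm(struct vfio_device *vdev,
			    enum vfio_device_mig_state cur,
			    enum vfio_device_mig_state target)
{
	enum vfio_device_mig_state next;
	int ret;

	while (cur != target) {
		ret = vfio_mig_get_next_state(vdev, cur, target, &next);
		if (ret)
			return ret;	/* unreachable or invalid transition */
		/* a real driver applies the single arc cur -> next here */
		cur = next;
	}
	return 0;
}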

/*
 * Convert the drivers's struct file into a FD number and return it to userspace
 */
static int vfio_ioct_mig_return_fd(struct file *filp, void __user *arg,
				   struct vfio_device_feature_mig_state *mig)
{
	int ret;
	int fd;

	fd = get_unused_fd_flags(O_CLOEXEC);
	if (fd < 0) {
		ret = fd;
		goto out_fput;
	}

	mig->data_fd = fd;
	if (copy_to_user(arg, mig, sizeof(*mig))) {
		ret = -EFAULT;
		goto out_put_unused;
	}
	fd_install(fd, filp);
	return 0;

out_put_unused:
	put_unused_fd(fd);
out_fput:
	fput(filp);
	return ret;
}

static int
vfio_ioctl_device_feature_mig_device_state(struct vfio_device *device,
					   u32 flags, void __user *arg,
					   size_t argsz)
{
	size_t minsz =
		offsetofend(struct vfio_device_feature_mig_state, data_fd);
	struct vfio_device_feature_mig_state mig;
	struct file *filp = NULL;
	int ret;

	if (!device->ops->migration_set_state ||
	    !device->ops->migration_get_state)
		return -ENOTTY;

	ret = vfio_check_feature(flags, argsz,
				 VFIO_DEVICE_FEATURE_SET |
				 VFIO_DEVICE_FEATURE_GET,
				 sizeof(mig));
	if (ret != 1)
		return ret;

	if (copy_from_user(&mig, arg, minsz))
		return -EFAULT;

	if (flags & VFIO_DEVICE_FEATURE_GET) {
		enum vfio_device_mig_state curr_state;

		ret = device->ops->migration_get_state(device, &curr_state);
		if (ret)
			return ret;
		mig.device_state = curr_state;
		goto out_copy;
	}

	/* Handle the VFIO_DEVICE_FEATURE_SET */
	filp = device->ops->migration_set_state(device, mig.device_state);
	if (IS_ERR(filp) || !filp)
		goto out_copy;

	return vfio_ioct_mig_return_fd(filp, arg, &mig);
out_copy:
	mig.data_fd = -1;
	if (copy_to_user(arg, &mig, sizeof(mig)))
		return -EFAULT;
	if (IS_ERR(filp))
		return PTR_ERR(filp);
	return 0;
}

static int vfio_ioctl_device_feature_migration(struct vfio_device *device,
						u32 flags, void __user *arg,
						size_t argsz)
{
	struct vfio_device_feature_migration mig = {
		.flags = device->migration_flags,
	};
	int ret;

	if (!device->ops->migration_set_state ||
	    !device->ops->migration_get_state)
		return -ENOTTY;

	ret = vfio_check_feature(flags, argsz, VFIO_DEVICE_FEATURE_GET,
				 sizeof(mig));
	if (ret != 1)
		return ret;
	if (copy_to_user(arg, &mig, sizeof(mig)))
		return -EFAULT;
	return 0;
}
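/*
 * Illustrative only, not part of this patch: how userspace is expected to
 * reach the two handlers above through the VFIO_DEVICE_FEATURE ioctl.  The
 * structure and flag names are the ones introduced by this series; the
 * helper name is hypothetical and error handling is minimal.
 */
#include <sys/ioctl.h>
#include <linux/vfio.h>

static int example_enter_stop_copy(int device_fd)
{
	struct {
		struct vfio_device_feature hdr;
		struct vfio_device_feature_mig_state state;
	} req = {
		.hdr.argsz = sizeof(req),
		.hdr.flags = VFIO_DEVICE_FEATURE_SET |
			     VFIO_DEVICE_FEATURE_MIG_DEVICE_STATE,
		.state.device_state = VFIO_DEVICE_STATE_STOP_COPY,
	};

	if (ioctl(device_fd, VFIO_DEVICE_FEATURE, &req))
		return -1;
	/* on success, data_fd carries the saved device state byte stream */
	return req.state.data_fd;
}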

static int vfio_ioctl_device_feature(struct vfio_device *device,
				     struct vfio_device_feature __user *arg)
{
	size_t minsz = offsetofend(struct vfio_device_feature, flags);
	struct vfio_device_feature feature;

	if (copy_from_user(&feature, arg, minsz))
		return -EFAULT;

	if (feature.argsz < minsz)
		return -EINVAL;

	/* Check unknown flags */
	if (feature.flags &
	    ~(VFIO_DEVICE_FEATURE_MASK | VFIO_DEVICE_FEATURE_SET |
	      VFIO_DEVICE_FEATURE_GET | VFIO_DEVICE_FEATURE_PROBE))
		return -EINVAL;

	/* GET & SET are mutually exclusive except with PROBE */
	if (!(feature.flags & VFIO_DEVICE_FEATURE_PROBE) &&
	    (feature.flags & VFIO_DEVICE_FEATURE_SET) &&
	    (feature.flags & VFIO_DEVICE_FEATURE_GET))
		return -EINVAL;

	switch (feature.flags & VFIO_DEVICE_FEATURE_MASK) {
	case VFIO_DEVICE_FEATURE_MIGRATION:
		return vfio_ioctl_device_feature_migration(
			device, feature.flags, arg->data,
			feature.argsz - minsz);
	case VFIO_DEVICE_FEATURE_MIG_DEVICE_STATE:
		return vfio_ioctl_device_feature_mig_device_state(
			device, feature.flags, arg->data,
			feature.argsz - minsz);
	default:
		if (unlikely(!device->ops->device_feature))
			return -EINVAL;
		return device->ops->device_feature(device, feature.flags,
						   arg->data,
						   feature.argsz - minsz);
	}
}

static long vfio_device_fops_unl_ioctl(struct file *filep,
				       unsigned int cmd, unsigned long arg)
{
	struct vfio_device *device = filep->private_data;

	if (unlikely(!device->ops->ioctl))
		return -EINVAL;

	return device->ops->ioctl(device, cmd, arg);
	switch (cmd) {
	case VFIO_DEVICE_FEATURE:
		return vfio_ioctl_device_feature(device, (void __user *)arg);
	default:
		if (unlikely(!device->ops->ioctl))
			return -EINVAL;
		return device->ops->ioctl(device, cmd, arg);
	}
}

static ssize_t vfio_device_fops_read(struct file *filep, char __user *buf,
@@ -34,6 +34,41 @@
#define QM_WUSER_M_CFG_ENABLE		0x1000a8
#define WUSER_M_CFG_ENABLE		0xffffffff

/* mailbox */
#define QM_MB_CMD_SQC			0x0
#define QM_MB_CMD_CQC			0x1
#define QM_MB_CMD_EQC			0x2
#define QM_MB_CMD_AEQC			0x3
#define QM_MB_CMD_SQC_BT		0x4
#define QM_MB_CMD_CQC_BT		0x5
#define QM_MB_CMD_SQC_VFT_V2		0x6
#define QM_MB_CMD_STOP_QP		0x8
#define QM_MB_CMD_SRC			0xc
#define QM_MB_CMD_DST			0xd

#define QM_MB_CMD_SEND_BASE		0x300
#define QM_MB_EVENT_SHIFT		8
#define QM_MB_BUSY_SHIFT		13
#define QM_MB_OP_SHIFT			14
#define QM_MB_CMD_DATA_ADDR_L		0x304
#define QM_MB_CMD_DATA_ADDR_H		0x308
#define QM_MB_MAX_WAIT_CNT		6000

/* doorbell */
#define QM_DOORBELL_CMD_SQ		0
#define QM_DOORBELL_CMD_CQ		1
#define QM_DOORBELL_CMD_EQ		2
#define QM_DOORBELL_CMD_AEQ		3

#define QM_DOORBELL_SQ_CQ_BASE_V2	0x1000
#define QM_DOORBELL_EQ_AEQ_BASE_V2	0x2000
#define QM_QP_MAX_NUM_SHIFT		11
#define QM_DB_CMD_SHIFT_V2		12
#define QM_DB_RAND_SHIFT_V2		16
#define QM_DB_INDEX_SHIFT_V2		32
#define QM_DB_PRIORITY_SHIFT_V2		48
#define QM_VF_STATE			0x60

/* qm cache */
#define QM_CACHE_CTL			0x100050
#define SQC_CACHE_ENABLE		BIT(0)

@@ -128,6 +163,11 @@ enum qm_debug_file {
	DEBUG_FILE_NUM,
};

enum qm_vf_state {
	QM_READY = 0,
	QM_NOT_READY,
};

struct qm_dfx {
	atomic64_t err_irq_cnt;
	atomic64_t aeq_irq_cnt;

@@ -414,6 +454,10 @@ pci_ers_result_t hisi_qm_dev_slot_reset(struct pci_dev *pdev);
void hisi_qm_reset_prepare(struct pci_dev *pdev);
void hisi_qm_reset_done(struct pci_dev *pdev);

int hisi_qm_wait_mb_ready(struct hisi_qm *qm);
int hisi_qm_mb(struct hisi_qm *qm, u8 cmd, dma_addr_t dma_addr, u16 queue,
	       bool op);

struct hisi_acc_sgl_pool;
struct hisi_acc_hw_sgl *hisi_acc_sg_buf_map_to_hw_sgl(struct device *dev,
	struct scatterlist *sgl, struct hisi_acc_sgl_pool *pool,

@@ -438,4 +482,9 @@ void hisi_qm_pm_init(struct hisi_qm *qm);
int hisi_qm_get_dfx_access(struct hisi_qm *qm);
void hisi_qm_put_dfx_access(struct hisi_qm *qm);
void hisi_qm_regs_dump(struct seq_file *s, struct debugfs_regset32 *regset);

/* Used by VFIO ACC live migration driver */
struct pci_driver *hisi_sec_get_pf_driver(void);
struct pci_driver *hisi_hpre_get_pf_driver(void);
struct pci_driver *hisi_zip_get_pf_driver(void);
#endif
@@ -1143,6 +1143,9 @@ int mlx5_dm_sw_icm_alloc(struct mlx5_core_dev *dev, enum mlx5_sw_icm_type type,
int mlx5_dm_sw_icm_dealloc(struct mlx5_core_dev *dev, enum mlx5_sw_icm_type type,
			   u64 length, u16 uid, phys_addr_t addr, u32 obj_id);

struct mlx5_core_dev *mlx5_vf_get_core_dev(struct pci_dev *pdev);
void mlx5_vf_put_core_dev(struct mlx5_core_dev *mdev);

#ifdef CONFIG_MLX5_CORE_IPOIB
struct net_device *mlx5_rdma_netdev_alloc(struct mlx5_core_dev *mdev,
					  struct ib_device *ibdev,

@@ -127,6 +127,11 @@ enum {
	MLX5_CMD_OP_QUERY_SF_PARTITION            = 0x111,
	MLX5_CMD_OP_ALLOC_SF                      = 0x113,
	MLX5_CMD_OP_DEALLOC_SF                    = 0x114,
	MLX5_CMD_OP_SUSPEND_VHCA                  = 0x115,
	MLX5_CMD_OP_RESUME_VHCA                   = 0x116,
	MLX5_CMD_OP_QUERY_VHCA_MIGRATION_STATE    = 0x117,
	MLX5_CMD_OP_SAVE_VHCA_STATE               = 0x118,
	MLX5_CMD_OP_LOAD_VHCA_STATE               = 0x119,
	MLX5_CMD_OP_CREATE_MKEY                   = 0x200,
	MLX5_CMD_OP_QUERY_MKEY                    = 0x201,
	MLX5_CMD_OP_DESTROY_MKEY                  = 0x202,

@@ -1757,7 +1762,9 @@ struct mlx5_ifc_cmd_hca_cap_bits {
	u8         reserved_at_682[0x1];
	u8         log_max_sf[0x5];
	u8         apu[0x1];
	u8         reserved_at_689[0x7];
	u8         reserved_at_689[0x4];
	u8         migration[0x1];
	u8         reserved_at_68e[0x2];
	u8         log_min_sf_size[0x8];
	u8         max_num_sf_partitions[0x8];

@@ -11518,4 +11525,142 @@ enum {
	MLX5_MTT_PERM_RW = MLX5_MTT_PERM_READ | MLX5_MTT_PERM_WRITE,
};

enum {
	MLX5_SUSPEND_VHCA_IN_OP_MOD_SUSPEND_INITIATOR = 0x0,
	MLX5_SUSPEND_VHCA_IN_OP_MOD_SUSPEND_RESPONDER = 0x1,
};

struct mlx5_ifc_suspend_vhca_in_bits {
	u8         opcode[0x10];
	u8         uid[0x10];

	u8         reserved_at_20[0x10];
	u8         op_mod[0x10];

	u8         reserved_at_40[0x10];
	u8         vhca_id[0x10];

	u8         reserved_at_60[0x20];
};

struct mlx5_ifc_suspend_vhca_out_bits {
	u8         status[0x8];
	u8         reserved_at_8[0x18];

	u8         syndrome[0x20];

	u8         reserved_at_40[0x40];
};

enum {
	MLX5_RESUME_VHCA_IN_OP_MOD_RESUME_RESPONDER = 0x0,
	MLX5_RESUME_VHCA_IN_OP_MOD_RESUME_INITIATOR = 0x1,
};

struct mlx5_ifc_resume_vhca_in_bits {
	u8         opcode[0x10];
	u8         uid[0x10];

	u8         reserved_at_20[0x10];
	u8         op_mod[0x10];

	u8         reserved_at_40[0x10];
	u8         vhca_id[0x10];

	u8         reserved_at_60[0x20];
};

struct mlx5_ifc_resume_vhca_out_bits {
	u8         status[0x8];
	u8         reserved_at_8[0x18];

	u8         syndrome[0x20];

	u8         reserved_at_40[0x40];
};

struct mlx5_ifc_query_vhca_migration_state_in_bits {
	u8         opcode[0x10];
	u8         uid[0x10];

	u8         reserved_at_20[0x10];
	u8         op_mod[0x10];

	u8         reserved_at_40[0x10];
	u8         vhca_id[0x10];

	u8         reserved_at_60[0x20];
};

struct mlx5_ifc_query_vhca_migration_state_out_bits {
	u8         status[0x8];
	u8         reserved_at_8[0x18];

	u8         syndrome[0x20];

	u8         reserved_at_40[0x40];

	u8         required_umem_size[0x20];

	u8         reserved_at_a0[0x160];
};

struct mlx5_ifc_save_vhca_state_in_bits {
	u8         opcode[0x10];
	u8         uid[0x10];

	u8         reserved_at_20[0x10];
	u8         op_mod[0x10];

	u8         reserved_at_40[0x10];
	u8         vhca_id[0x10];

	u8         reserved_at_60[0x20];

	u8         va[0x40];

	u8         mkey[0x20];

	u8         size[0x20];
};

struct mlx5_ifc_save_vhca_state_out_bits {
	u8         status[0x8];
	u8         reserved_at_8[0x18];

	u8         syndrome[0x20];

	u8         actual_image_size[0x20];

	u8         reserved_at_60[0x20];
};

struct mlx5_ifc_load_vhca_state_in_bits {
	u8         opcode[0x10];
	u8         uid[0x10];

	u8         reserved_at_20[0x10];
	u8         op_mod[0x10];

	u8         reserved_at_40[0x10];
	u8         vhca_id[0x10];

	u8         reserved_at_60[0x20];

	u8         va[0x40];

	u8         mkey[0x20];

	u8         size[0x20];
};

struct mlx5_ifc_load_vhca_state_out_bits {
	u8         status[0x8];
	u8         reserved_at_8[0x18];

	u8         syndrome[0x20];

	u8         reserved_at_40[0x40];
};

#endif /* MLX5_IFC_H */
@@ -2166,7 +2166,8 @@ void __iomem *pci_ioremap_wc_bar(struct pci_dev *pdev, int bar);
#ifdef CONFIG_PCI_IOV
int pci_iov_virtfn_bus(struct pci_dev *dev, int id);
int pci_iov_virtfn_devfn(struct pci_dev *dev, int id);

int pci_iov_vf_id(struct pci_dev *dev);
void *pci_iov_get_pf_drvdata(struct pci_dev *dev, struct pci_driver *pf_driver);
int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn);
void pci_disable_sriov(struct pci_dev *dev);

@@ -2194,6 +2195,18 @@ static inline int pci_iov_virtfn_devfn(struct pci_dev *dev, int id)
{
	return -ENOSYS;
}

static inline int pci_iov_vf_id(struct pci_dev *dev)
{
	return -ENOSYS;
}

static inline void *pci_iov_get_pf_drvdata(struct pci_dev *dev,
					    struct pci_driver *pf_driver)
{
	return ERR_PTR(-EINVAL);
}

static inline int pci_enable_sriov(struct pci_dev *dev, int nr_virtfn)
{ return -ENODEV; }

@@ -2529,6 +2529,9 @@
#define PCI_DEVICE_ID_KORENIX_JETCARDF3	0x17ff

#define PCI_VENDOR_ID_HUAWEI		0x19e5
#define PCI_DEVICE_ID_HUAWEI_ZIP_VF	0xa251
#define PCI_DEVICE_ID_HUAWEI_SEC_VF	0xa256
#define PCI_DEVICE_ID_HUAWEI_HPRE_VF	0xa259

#define PCI_VENDOR_ID_NETRONOME		0x19ee
#define PCI_DEVICE_ID_NETRONOME_NFP4000	0x4000
@@ -33,6 +33,7 @@ struct vfio_device {
	struct vfio_group *group;
	struct vfio_device_set *dev_set;
	struct list_head dev_set_list;
	unsigned int migration_flags;

	/* Members below here are private, not for driver use */
	refcount_t refcount;

@@ -55,6 +56,17 @@ struct vfio_device {
 * @match: Optional device name match callback (return: 0 for no-match, >0 for
 *         match, -errno for abort (ex. match with insufficient or incorrect
 *         additional args)
 * @device_feature: Optional, fill in the VFIO_DEVICE_FEATURE ioctl
 * @migration_set_state: Optional callback to change the migration state for
 *         devices that support migration. It's mandatory for
 *         VFIO_DEVICE_FEATURE_MIGRATION migration support.
 *         The returned FD is used for data transfer according to the FSM
 *         definition. The driver is responsible to ensure that FD reaches end
 *         of stream or error whenever the migration FSM leaves a data transfer
 *         state or before close_device() returns.
 * @migration_get_state: Optional callback to get the migration state for
 *         devices that support migration. It's mandatory for
 *         VFIO_DEVICE_FEATURE_MIGRATION migration support.
 */
struct vfio_device_ops {
	char	*name;

@@ -69,8 +81,44 @@ struct vfio_device_ops {
	int	(*mmap)(struct vfio_device *vdev, struct vm_area_struct *vma);
	void	(*request)(struct vfio_device *vdev, unsigned int count);
	int	(*match)(struct vfio_device *vdev, char *buf);
	int	(*device_feature)(struct vfio_device *device, u32 flags,
				  void __user *arg, size_t argsz);
	struct file *(*migration_set_state)(
		struct vfio_device *device,
		enum vfio_device_mig_state new_state);
	int (*migration_get_state)(struct vfio_device *device,
				   enum vfio_device_mig_state *curr_state);
};

/**
 * vfio_check_feature - Validate user input for the VFIO_DEVICE_FEATURE ioctl
 * @flags: Arg from the device_feature op
 * @argsz: Arg from the device_feature op
 * @supported_ops: Combination of VFIO_DEVICE_FEATURE_GET and SET the driver
 *                 supports
 * @minsz: Minimum data size the driver accepts
 *
 * For use in a driver's device_feature op. Checks that the inputs to the
 * VFIO_DEVICE_FEATURE ioctl are correct for the driver's feature. Returns 1 if
 * the driver should execute the get or set, otherwise the relevant
 * value should be returned.
 */
static inline int vfio_check_feature(u32 flags, size_t argsz, u32 supported_ops,
				     size_t minsz)
{
	if ((flags & (VFIO_DEVICE_FEATURE_GET | VFIO_DEVICE_FEATURE_SET)) &
	    ~supported_ops)
		return -EINVAL;
	if (flags & VFIO_DEVICE_FEATURE_PROBE)
		return 0;
	/* Without PROBE one of GET or SET must be requested */
	if (!(flags & (VFIO_DEVICE_FEATURE_GET | VFIO_DEVICE_FEATURE_SET)))
		return -EINVAL;
	if (argsz < minsz)
		return -EINVAL;
	return 1;
}
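/*
 * Illustrative only, not part of this patch: a minimal device_feature op for
 * a hypothetical driver feature that only supports GET of a single u32,
 * built on vfio_check_feature() above.  Names are made up for the example.
 */
static int example_device_feature(struct vfio_device *device, u32 flags,
				  void __user *arg, size_t argsz)
{
	u32 val = 0;	/* hypothetical per-device value */
	int ret;

	ret = vfio_check_feature(flags, argsz, VFIO_DEVICE_FEATURE_GET,
				 sizeof(val));
	if (ret != 1)
		return ret;	/* 0 for PROBE-only, -errno on bad input */

	if (copy_to_user(arg, &val, sizeof(val)))
		return -EFAULT;
	return 0;
}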

void vfio_init_group_dev(struct vfio_device *device, struct device *dev,
			 const struct vfio_device_ops *ops);
void vfio_uninit_group_dev(struct vfio_device *device);

@@ -82,6 +130,11 @@ extern void vfio_device_put(struct vfio_device *device);

int vfio_assign_device_set(struct vfio_device *device, void *set_id);

int vfio_mig_get_next_state(struct vfio_device *device,
			    enum vfio_device_mig_state cur_fsm,
			    enum vfio_device_mig_state new_fsm,
			    enum vfio_device_mig_state *next_fsm);

/*
 * External user API
 */
@@ -159,8 +159,17 @@ extern ssize_t vfio_pci_config_rw(struct vfio_pci_core_device *vdev,
extern ssize_t vfio_pci_bar_rw(struct vfio_pci_core_device *vdev, char __user *buf,
			       size_t count, loff_t *ppos, bool iswrite);

#ifdef CONFIG_VFIO_PCI_VGA
extern ssize_t vfio_pci_vga_rw(struct vfio_pci_core_device *vdev, char __user *buf,
			       size_t count, loff_t *ppos, bool iswrite);
#else
static inline ssize_t vfio_pci_vga_rw(struct vfio_pci_core_device *vdev,
				      char __user *buf, size_t count,
				      loff_t *ppos, bool iswrite)
{
	return -EINVAL;
}
#endif

extern long vfio_pci_ioeventfd(struct vfio_pci_core_device *vdev, loff_t offset,
			       uint64_t data, int count, int fd);

@@ -220,6 +229,8 @@ int vfio_pci_core_sriov_configure(struct pci_dev *pdev, int nr_virtfn);
extern const struct pci_error_handlers vfio_pci_core_err_handlers;
long vfio_pci_core_ioctl(struct vfio_device *core_vdev, unsigned int cmd,
		unsigned long arg);
int vfio_pci_core_ioctl_feature(struct vfio_device *device, u32 flags,
				void __user *arg, size_t argsz);
ssize_t vfio_pci_core_read(struct vfio_device *core_vdev, char __user *buf,
		size_t count, loff_t *ppos);
ssize_t vfio_pci_core_write(struct vfio_device *core_vdev, const char __user *buf,

@@ -230,6 +241,8 @@ int vfio_pci_core_match(struct vfio_device *core_vdev, char *buf);
int vfio_pci_core_enable(struct vfio_pci_core_device *vdev);
void vfio_pci_core_disable(struct vfio_pci_core_device *vdev);
void vfio_pci_core_finish_enable(struct vfio_pci_core_device *vdev);
pci_ers_result_t vfio_pci_core_aer_err_detected(struct pci_dev *pdev,
						pci_channel_state_t state);

static inline bool vfio_pci_is_vga(struct pci_dev *pdev)
{
@ -323,7 +323,7 @@ struct vfio_region_info_cap_type {
|
||||
#define VFIO_REGION_TYPE_PCI_VENDOR_MASK (0xffff)
|
||||
#define VFIO_REGION_TYPE_GFX (1)
|
||||
#define VFIO_REGION_TYPE_CCW (2)
|
||||
#define VFIO_REGION_TYPE_MIGRATION (3)
|
||||
#define VFIO_REGION_TYPE_MIGRATION_DEPRECATED (3)
|
||||
|
||||
/* sub-types for VFIO_REGION_TYPE_PCI_* */
|
||||
|
||||
@ -405,225 +405,29 @@ struct vfio_region_gfx_edid {
|
||||
#define VFIO_REGION_SUBTYPE_CCW_CRW (3)
|
||||
|
||||
/* sub-types for VFIO_REGION_TYPE_MIGRATION */
|
||||
#define VFIO_REGION_SUBTYPE_MIGRATION (1)
|
||||
|
||||
/*
|
||||
* The structure vfio_device_migration_info is placed at the 0th offset of
|
||||
* the VFIO_REGION_SUBTYPE_MIGRATION region to get and set VFIO device related
|
||||
* migration information. Field accesses from this structure are only supported
|
||||
* at their native width and alignment. Otherwise, the result is undefined and
|
||||
* vendor drivers should return an error.
|
||||
*
|
||||
* device_state: (read/write)
|
||||
* - The user application writes to this field to inform the vendor driver
|
||||
* about the device state to be transitioned to.
|
||||
* - The vendor driver should take the necessary actions to change the
|
||||
* device state. After successful transition to a given state, the
|
||||
* vendor driver should return success on write(device_state, state)
|
||||
* system call. If the device state transition fails, the vendor driver
|
||||
* should return an appropriate -errno for the fault condition.
|
||||
* - On the user application side, if the device state transition fails,
|
||||
* that is, if write(device_state, state) returns an error, read
|
||||
* device_state again to determine the current state of the device from
|
||||
* the vendor driver.
|
||||
* - The vendor driver should return previous state of the device unless
|
||||
* the vendor driver has encountered an internal error, in which case
|
||||
* the vendor driver may report the device_state VFIO_DEVICE_STATE_ERROR.
|
||||
* - The user application must use the device reset ioctl to recover the
|
||||
* device from VFIO_DEVICE_STATE_ERROR state. If the device is
|
||||
* indicated to be in a valid device state by reading device_state, the
|
||||
* user application may attempt to transition the device to any valid
|
||||
* state reachable from the current state or terminate itself.
|
||||
*
|
||||
* device_state consists of 3 bits:
|
||||
* - If bit 0 is set, it indicates the _RUNNING state. If bit 0 is clear,
|
||||
* it indicates the _STOP state. When the device state is changed to
|
||||
* _STOP, driver should stop the device before write() returns.
|
||||
* - If bit 1 is set, it indicates the _SAVING state, which means that the
|
||||
* driver should start gathering device state information that will be
|
||||
* provided to the VFIO user application to save the device's state.
|
||||
* - If bit 2 is set, it indicates the _RESUMING state, which means that
|
||||
* the driver should prepare to resume the device. Data provided through
|
||||
* the migration region should be used to resume the device.
|
||||
* Bits 3 - 31 are reserved for future use. To preserve them, the user
|
||||
* application should perform a read-modify-write operation on this
|
||||
* field when modifying the specified bits.
|
||||
*
|
||||
* +------- _RESUMING
|
||||
* |+------ _SAVING
|
||||
* ||+----- _RUNNING
|
||||
* |||
|
||||
* 000b => Device Stopped, not saving or resuming
|
||||
* 001b => Device running, which is the default state
|
||||
* 010b => Stop the device & save the device state, stop-and-copy state
|
||||
* 011b => Device running and save the device state, pre-copy state
|
||||
* 100b => Device stopped and the device state is resuming
|
||||
* 101b => Invalid state
|
||||
* 110b => Error state
|
||||
* 111b => Invalid state
|
||||
*
|
||||
* State transitions:
|
||||
*
|
||||
* _RESUMING _RUNNING Pre-copy Stop-and-copy _STOP
|
||||
* (100b) (001b) (011b) (010b) (000b)
|
||||
* 0. Running or default state
|
||||
* |
|
||||
*
|
||||
* 1. Normal Shutdown (optional)
|
||||
* |------------------------------------->|
|
||||
*
|
||||
* 2. Save the state or suspend
|
||||
* |------------------------->|---------->|
|
||||
*
|
||||
* 3. Save the state during live migration
|
||||
* |----------->|------------>|---------->|
|
||||
*
|
||||
* 4. Resuming
|
||||
* |<---------|
|
||||
*
|
||||
* 5. Resumed
|
||||
* |--------->|
|
||||
*
|
||||
* 0. Default state of VFIO device is _RUNNING when the user application starts.
|
||||
* 1. During normal shutdown of the user application, the user application may
|
||||
* optionally change the VFIO device state from _RUNNING to _STOP. This
|
||||
* transition is optional. The vendor driver must support this transition but
|
||||
* must not require it.
|
||||
* 2. When the user application saves state or suspends the application, the
|
||||
* device state transitions from _RUNNING to stop-and-copy and then to _STOP.
|
||||
* On state transition from _RUNNING to stop-and-copy, driver must stop the
|
||||
* device, save the device state and send it to the application through the
|
||||
* migration region. The sequence to be followed for such transition is given
|
||||
* below.
|
||||
* 3. In live migration of user application, the state transitions from _RUNNING
|
||||
* to pre-copy, to stop-and-copy, and to _STOP.
|
||||
* On state transition from _RUNNING to pre-copy, the driver should start
|
||||
* gathering the device state while the application is still running and send
|
||||
* the device state data to application through the migration region.
|
||||
* On state transition from pre-copy to stop-and-copy, the driver must stop
|
||||
* the device, save the device state and send it to the user application
|
||||
* through the migration region.
|
||||
* Vendor drivers must support the pre-copy state even for implementations
|
||||
* where no data is provided to the user before the stop-and-copy state. The
|
||||
* user must not be required to consume all migration data before the device
|
||||
* transitions to a new state, including the stop-and-copy state.
|
||||
* The sequence to be followed for above two transitions is given below.
|
||||
* 4. To start the resuming phase, the device state should be transitioned from
|
||||
* the _RUNNING to the _RESUMING state.
|
||||
* In the _RESUMING state, the driver should use the device state data
|
||||
* received through the migration region to resume the device.
|
||||
* 5. After providing saved device data to the driver, the application should
|
||||
* change the state from _RESUMING to _RUNNING.
|
||||
*
|
||||
* reserved:
|
||||
* Reads on this field return zero and writes are ignored.
|
||||
*
|
||||
* pending_bytes: (read only)
|
||||
* The number of pending bytes still to be migrated from the vendor driver.
|
||||
*
|
||||
* data_offset: (read only)
|
||||
* The user application should read data_offset field from the migration
|
||||
* region. The user application should read the device data from this
|
||||
* offset within the migration region during the _SAVING state or write
|
||||
* the device data during the _RESUMING state. See below for details of
|
||||
* sequence to be followed.
|
||||
*
|
||||
* data_size: (read/write)
|
||||
* The user application should read data_size to get the size in bytes of
|
||||
* the data copied in the migration region during the _SAVING state and
|
||||
* write the size in bytes of the data copied in the migration region
|
||||
* during the _RESUMING state.
|
||||
*
|
||||
* The format of the migration region is as follows:
|
||||
* ------------------------------------------------------------------
|
||||
* |vfio_device_migration_info| data section |
|
||||
* | | /////////////////////////////// |
|
||||
* ------------------------------------------------------------------
|
||||
* ^ ^
|
||||
* offset 0-trapped part data_offset
|
||||
*
|
||||
* The structure vfio_device_migration_info is always followed by the data
|
||||
* section in the region, so data_offset will always be nonzero. The offset
|
||||
* from where the data is copied is decided by the kernel driver. The data
|
||||
* section can be trapped, mmapped, or partitioned, depending on how the kernel
|
||||
* driver defines the data section. The data section partition can be defined
|
||||
* as mapped by the sparse mmap capability. If mmapped, data_offset must be
|
||||
* page aligned, whereas initial section which contains the
|
||||
* vfio_device_migration_info structure, might not end at the offset, which is
|
||||
* page aligned. The user is not required to access through mmap regardless
|
||||
* of the capabilities of the region mmap.
|
||||
* The vendor driver should determine whether and how to partition the data
|
||||
* section. The vendor driver should return data_offset accordingly.
|
||||
*
|
||||
* The sequence to be followed while in the pre-copy state and the
* stop-and-copy state is as follows:
* a. Read pending_bytes, indicating the start of a new iteration to get device
* data. Repeated reads of pending_bytes at this stage should have no side
* effects.
* If pending_bytes == 0, the user application should not iterate to get data
* for that device.
* If pending_bytes > 0, perform the following steps.
* b. Read data_offset, indicating that the vendor driver should make data
* available through the data section. The vendor driver should return this
* read operation only after data is available from (region + data_offset)
* to (region + data_offset + data_size).
* c. Read data_size, which is the amount of data in bytes available through
* the migration region.
* Reads of data_offset and data_size should return the offset and size of
* the current buffer if the user application reads them more than once here.
* d. Read data_size bytes of data from (region + data_offset) in the
* migration region.
* e. Process the data.
* f. Read pending_bytes, which indicates to the vendor driver that the data
* from the previous iteration has been consumed. If pending_bytes > 0, go to
* step b.
*
* The user application can transition from the _SAVING|_RUNNING
* (pre-copy) state to the _SAVING (stop-and-copy) state regardless of the
* number of pending bytes. The user application should iterate in _SAVING
* (stop-and-copy) until pending_bytes is 0. A sketch of this loop is given
* below.
*
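* A minimal sketch of that save loop, assuming the whole region is trapped and
* accessed with pread() at offset "region_off" of the device fd; "device_fd",
* "region_off", "buf" and send_to_destination() are placeholders, not part of
* this header:
*
*   __u64 pending, off, sz;
*
*   pread(device_fd, &pending, sizeof(pending), region_off +
*         offsetof(struct vfio_device_migration_info, pending_bytes));
*   while (pending) {
*           pread(device_fd, &off, sizeof(off), region_off +
*                 offsetof(struct vfio_device_migration_info, data_offset));
*           pread(device_fd, &sz, sizeof(sz), region_off +
*                 offsetof(struct vfio_device_migration_info, data_size));
*           pread(device_fd, buf, sz, region_off + off);
*           send_to_destination(buf, sz);
*           pread(device_fd, &pending, sizeof(pending), region_off +
*                 offsetof(struct vfio_device_migration_info, pending_bytes));
*   }
*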
* The sequence to be followed while _RESUMING device state is as follows:
* While data for this device is available, repeat the following steps:
* a. Read data_offset, from which the user application should write data.
* b. Write migration data starting at (migration region + data_offset) for
* the length determined by data_size from the migration source.
* c. Write data_size, which indicates to the vendor driver that data has been
* written to the migration region. The vendor driver must return from this
* write operation only after consuming the data. The vendor driver should
* apply the user-provided migration region data to the device resume state.
* A sketch of this loop is given below.
*
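* A minimal sketch of the resume side, using the same placeholder names as the
* save-loop sketch above and a hypothetical get_from_source() that yields the
* (buf, sz) pairs recorded during saving:
*
*   __u64 off, sz;
*
*   while (get_from_source(buf, &sz)) {
*           pread(device_fd, &off, sizeof(off), region_off +
*                 offsetof(struct vfio_device_migration_info, data_offset));
*           pwrite(device_fd, buf, sz, region_off + off);
*           pwrite(device_fd, &sz, sizeof(sz), region_off +
*                 offsetof(struct vfio_device_migration_info, data_size));
*   }
*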
* If an error occurs during the above sequences, the vendor driver can return
* an error code for the next read() or write() operation, which will terminate
* the loop. The user application should then take the next necessary action,
* for example, failing the migration or terminating the user application.
*
* For the user application, the data is opaque. The user application should
* write data in the same order as it was received, and the data should be of
* the same transaction size as at the source.
*/
#define VFIO_REGION_SUBTYPE_MIGRATION_DEPRECATED (1)

struct vfio_device_migration_info {
        __u32 device_state;      /* VFIO device state */
#define VFIO_DEVICE_STATE_STOP     (0)
#define VFIO_DEVICE_STATE_RUNNING  (1 << 0)
#define VFIO_DEVICE_STATE_SAVING   (1 << 1)
#define VFIO_DEVICE_STATE_RESUMING (1 << 2)
#define VFIO_DEVICE_STATE_MASK     (VFIO_DEVICE_STATE_RUNNING | \
                                    VFIO_DEVICE_STATE_SAVING | \
                                    VFIO_DEVICE_STATE_RESUMING)
#define VFIO_DEVICE_STATE_V1_STOP     (0)
#define VFIO_DEVICE_STATE_V1_RUNNING  (1 << 0)
#define VFIO_DEVICE_STATE_V1_SAVING   (1 << 1)
#define VFIO_DEVICE_STATE_V1_RESUMING (1 << 2)
#define VFIO_DEVICE_STATE_MASK     (VFIO_DEVICE_STATE_V1_RUNNING | \
                                    VFIO_DEVICE_STATE_V1_SAVING | \
                                    VFIO_DEVICE_STATE_V1_RESUMING)

#define VFIO_DEVICE_STATE_VALID(state) \
        (state & VFIO_DEVICE_STATE_RESUMING ? \
        (state & VFIO_DEVICE_STATE_MASK) == VFIO_DEVICE_STATE_RESUMING : 1)
        (state & VFIO_DEVICE_STATE_V1_RESUMING ? \
        (state & VFIO_DEVICE_STATE_MASK) == VFIO_DEVICE_STATE_V1_RESUMING : 1)

#define VFIO_DEVICE_STATE_IS_ERROR(state) \
        ((state & VFIO_DEVICE_STATE_MASK) == (VFIO_DEVICE_STATE_SAVING | \
                                              VFIO_DEVICE_STATE_RESUMING))
        ((state & VFIO_DEVICE_STATE_MASK) == (VFIO_DEVICE_STATE_V1_SAVING | \
                                              VFIO_DEVICE_STATE_V1_RESUMING))

#define VFIO_DEVICE_STATE_SET_ERROR(state) \
        ((state & ~VFIO_DEVICE_STATE_MASK) | VFIO_DEVICE_SATE_SAVING | \
                                             VFIO_DEVICE_STATE_RESUMING)
        ((state & ~VFIO_DEVICE_STATE_MASK) | VFIO_DEVICE_STATE_V1_SAVING | \
                                             VFIO_DEVICE_STATE_V1_RESUMING)

        __u32 reserved;
        __u64 pending_bytes;
@@ -1002,6 +806,186 @@ struct vfio_device_feature {
*/
#define VFIO_DEVICE_FEATURE_PCI_VF_TOKEN (0)

/*
* Indicates the device can support the migration API through
* VFIO_DEVICE_FEATURE_MIG_DEVICE_STATE. If this GET succeeds, the RUNNING and
* ERROR states are always supported. Support for additional states is
* indicated via the flags field; at least VFIO_MIGRATION_STOP_COPY must be
* set.
*
* VFIO_MIGRATION_STOP_COPY means that STOP, STOP_COPY and
* RESUMING are supported.
*
* VFIO_MIGRATION_STOP_COPY | VFIO_MIGRATION_P2P means that RUNNING_P2P
* is supported in addition to the STOP_COPY states.
*
* Other combinations of flags have behavior to be defined in the future.
*/
struct vfio_device_feature_migration {
        __aligned_u64 flags;
#define VFIO_MIGRATION_STOP_COPY    (1 << 0)
#define VFIO_MIGRATION_P2P          (1 << 1)
};
#define VFIO_DEVICE_FEATURE_MIGRATION 1
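/*
* For example, userspace might probe for migration support roughly as follows
* (a minimal sketch, not part of the uAPI; error handling is omitted and
* "device_fd" is assumed to be an open VFIO device file descriptor):
*
*   char buf[sizeof(struct vfio_device_feature) +
*            sizeof(struct vfio_device_feature_migration)] = {};
*   struct vfio_device_feature *feature = (void *)buf;
*   struct vfio_device_feature_migration *mig = (void *)feature->data;
*   int supported = 0;
*
*   feature->argsz = sizeof(buf);
*   feature->flags = VFIO_DEVICE_FEATURE_GET | VFIO_DEVICE_FEATURE_MIGRATION;
*   if (!ioctl(device_fd, VFIO_DEVICE_FEATURE, feature) &&
*       (mig->flags & VFIO_MIGRATION_STOP_COPY))
*           supported = 1;
*/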
/*
* Upon VFIO_DEVICE_FEATURE_SET, execute a migration state change on the VFIO
* device. The new state is supplied in device_state, see enum
* vfio_device_mig_state for details.
*
* The kernel migration driver must fully transition the device to the new
* state value before the operation returns to the user.
*
* The kernel migration driver must not generate asynchronous device state
* transitions outside of manipulation by the user or the VFIO_DEVICE_RESET
* ioctl as described above.
*
* If this function fails then the current device_state may be the original
* operating state or some other state along the combination transition path.
* The user can then decide if it should execute a VFIO_DEVICE_RESET, attempt
* to return to the original state, or attempt to return to some other state
* such as RUNNING or STOP.
*
* If the new_state starts a new data transfer session then the FD associated
* with that session is returned in data_fd. The user is responsible for
* closing this FD when it is finished. The user must consider the migration
* data stream carried over the FD to be opaque and must preserve the byte
* order of the stream. The user is not required to preserve buffer
* segmentation when writing the data stream during the RESUMING operation.
*
* Upon VFIO_DEVICE_FEATURE_GET, get the current migration state of the VFIO
* device; data_fd will be -1.
*/
struct vfio_device_feature_mig_state {
        __u32 device_state; /* From enum vfio_device_mig_state */
        __s32 data_fd;
};
#define VFIO_DEVICE_FEATURE_MIG_DEVICE_STATE 2
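/*
* A sketch of using SET to enter STOP_COPY and drain the returned data_fd
* (illustrative only, not part of the uAPI; "device_fd", "chunk" and
* save_chunk() are placeholders):
*
*   char buf[sizeof(struct vfio_device_feature) +
*            sizeof(struct vfio_device_feature_mig_state)] = {};
*   struct vfio_device_feature *feature = (void *)buf;
*   struct vfio_device_feature_mig_state *mig = (void *)feature->data;
*   ssize_t n;
*
*   feature->argsz = sizeof(buf);
*   feature->flags = VFIO_DEVICE_FEATURE_SET |
*                    VFIO_DEVICE_FEATURE_MIG_DEVICE_STATE;
*   mig->device_state = VFIO_DEVICE_STATE_STOP_COPY;
*   if (ioctl(device_fd, VFIO_DEVICE_FEATURE, feature))
*           return -1;
*   while ((n = read(mig->data_fd, chunk, sizeof(chunk))) > 0)
*           save_chunk(chunk, n);
*   close(mig->data_fd);
*/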
/*
* The device migration Finite State Machine is described by the enum
* vfio_device_mig_state. Some of the FSM arcs will create a migration data
* transfer session by returning an FD; in this case the migration data will
* flow over the FD using read() and write() as discussed below.
*
* There are 5 states to support VFIO_MIGRATION_STOP_COPY:
*  RUNNING - The device is running normally
*  STOP - The device does not change the internal or external state
*  STOP_COPY - The device internal state can be read out
*  RESUMING - The device is stopped and is loading a new internal state
*  ERROR - The device has failed and must be reset
*
* And 1 optional state to support VFIO_MIGRATION_P2P:
*  RUNNING_P2P - RUNNING, except the device cannot do peer to peer DMA
*
* The FSM takes actions on the arcs between FSM states. The driver implements
* the following behavior for the FSM arcs:
*
* RUNNING_P2P -> STOP
* STOP_COPY -> STOP
*   While in STOP the device must cease operation. The device must not
*   generate interrupts, DMA, or any other change to external state. It must
*   not change its internal state. When stopped, the device and kernel
*   migration driver must accept and respond to interaction to support
*   external subsystems in the STOP state, for example PCI MSI-X and PCI
*   config space. Failure by the user to restrict device access while in STOP
*   must not result in error conditions outside the user context (ex. host
*   system faults).
*
*   The STOP_COPY arc will terminate a data transfer session.
*
* RESUMING -> STOP
*   Leaving RESUMING terminates a data transfer session and indicates the
*   device should complete processing of the data delivered by write(). The
*   kernel migration driver should complete the incorporation of data written
*   to the data transfer FD into the device internal state and perform
*   final validity and consistency checking of the new device state. If the
*   user provided data is found to be incomplete, inconsistent, or otherwise
*   invalid, the migration driver must fail the SET_STATE ioctl and
*   optionally go to the ERROR state as described below.
*
*   While in STOP the device behaves as described for the STOP arcs above.
*
*   To abort a RESUMING session the device must be reset.
*
* RUNNING_P2P -> RUNNING
*   While in RUNNING the device is fully operational: the device may generate
*   interrupts, DMA, respond to MMIO, all vfio device regions are functional,
*   and the device may advance its internal state.
*
* RUNNING -> RUNNING_P2P
* STOP -> RUNNING_P2P
*   While in RUNNING_P2P the device is partially running in the P2P quiescent
*   state defined below.
*
* STOP -> STOP_COPY
*   This arc begins the process of saving the device state and will return a
*   new data_fd.
*
*   While in the STOP_COPY state the device has the same behavior as STOP,
*   with the addition that the data transfer session continues to stream the
*   migration state. End of stream on the FD indicates the entire device
*   state has been transferred.
*
*   The user should take steps to restrict access to vfio device regions
*   while the device is in STOP_COPY or risk corruption of the device
*   migration data stream.
*
* STOP -> RESUMING
*   Entering the RESUMING state starts a process of restoring the device
*   state and will return a new data_fd. The data stream fed into the data_fd
*   should be taken from the data transfer output of a single FD during
*   saving from a compatible device. The migration driver may alter/reset the
*   internal device state for this arc if required to prepare the device to
*   receive the migration data.
*
* any -> ERROR
*   ERROR cannot be specified as a device state; however, any transition
*   request can be failed with an errno return and may then move the
*   device_state into ERROR. In this case the device was unable to execute
*   the requested arc and was also unable to restore the device to any valid
*   device_state. To recover from ERROR, VFIO_DEVICE_RESET must be used to
*   return the device_state back to RUNNING.
*
* The optional peer to peer (P2P) quiescent state is intended to be a
* quiescent state for the device for the purposes of managing multiple
* devices within a user context where peer-to-peer DMA between devices may be
* active. The RUNNING_P2P state must prevent the device from initiating any
* new P2P DMA transactions. If the device can identify P2P transactions then
* it can stop only P2P DMA, otherwise it must stop all DMA. The migration
* driver must complete any such outstanding operations prior to completing
* the FSM arc into a P2P state. If the device does not support P2P, then for
* the purposes of this specification the P2P states behave as though the
* device is fully running. As while in STOP or STOP_COPY, the user must not
* touch the device, otherwise the state may be exited.
*
* The remaining possible transitions are interpreted as combinations of the
* above FSM arcs. As there are multiple paths through the FSM arcs, the path
* should be selected based on the following rules:
*   - Select the shortest path.
* Refer to vfio_mig_get_next_state() for the result of the algorithm.
*
* The automatic transit through the FSM arcs that make up the combination
* transition is invisible to the user. When working with combination arcs the
* user may see any step along the path in the device_state if SET_STATE
* fails. When handling these types of errors users should anticipate future
* revisions of this protocol using new states and those states becoming
* visible in this case.
*
* The optional states cannot be used with SET_STATE if the device does not
* support them. The user can discover whether these states are supported by
* using VFIO_DEVICE_FEATURE_MIGRATION. By using combination transitions the
* user can avoid knowing about these optional states if the kernel driver
* supports them.
*/
enum vfio_device_mig_state {
        VFIO_DEVICE_STATE_ERROR = 0,
        VFIO_DEVICE_STATE_STOP = 1,
        VFIO_DEVICE_STATE_RUNNING = 2,
        VFIO_DEVICE_STATE_STOP_COPY = 3,
        VFIO_DEVICE_STATE_RESUMING = 4,
        VFIO_DEVICE_STATE_RUNNING_P2P = 5,
};
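/*
* A sketch of the corresponding resume flow on the destination (illustrative
* only; set_state() is assumed to wrap the VFIO_DEVICE_FEATURE_MIG_DEVICE_STATE
* ioctl as in the example above, and load_chunk()/"chunk" are placeholder
* sources of the previously saved data):
*
*   int data_fd = set_state(device_fd, VFIO_DEVICE_STATE_RESUMING);
*   ssize_t n;
*
*   while ((n = load_chunk(chunk, sizeof(chunk))) > 0)
*           write(data_fd, chunk, n);
*   set_state(device_fd, VFIO_DEVICE_STATE_STOP);
*   set_state(device_fd, VFIO_DEVICE_STATE_RUNNING);
*   close(data_fd);
*/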
/* -------- API for Type1 VFIO IOMMU -------- */

/**