linux/drivers/s390/scsi
Benjamin Block b76becde2b scsi: zfcp: fix request object use-after-free in send path causing seqno errors
With a recent change to our send path for FSF commands we introduced a
possible use-after-free of request-objects, that might further lead to
zfcp crafting bad requests, which the FCP channel correctly complains
about with an error (FSF_PROT_SEQ_NUMB_ERROR). This error is then handled
by an adapter-wide recovery.

The following sequence illustrates the possible use-after-free:

    Send Path:

        int zfcp_fsf_open_port(struct zfcp_erp_action *erp_action)
        {
                struct zfcp_fsf_req *req;
                ...
                spin_lock_irq(&qdio->req_q_lock);
        //                     ^^^^^^^^^^^^^^^^
        //                     protects QDIO queue during sending
                ...
                req = zfcp_fsf_req_create(qdio,
                                          FSF_QTCB_OPEN_PORT_WITH_DID,
                                          SBAL_SFLAGS0_TYPE_READ,
                                          qdio->adapter->pool.erp_req);
        //            ^^^^^^^^^^^^^^^^^^^
        //            allocation of the request-object
                ...
                retval = zfcp_fsf_req_send(req);
                ...
                spin_unlock_irq(&qdio->req_q_lock);
                return retval;
        }

        static int zfcp_fsf_req_send(struct zfcp_fsf_req *req)
        {
                struct zfcp_adapter *adapter = req->adapter;
                struct zfcp_qdio *qdio = adapter->qdio;
                ...
                zfcp_reqlist_add(adapter->req_list, req);
        //      ^^^^^^^^^^^^^^^^
        //      add request to our driver-internal hash-table for tracking
        //      (protected by separate lock req_list->lock)
                ...
                if (zfcp_qdio_send(qdio, &req->qdio_req)) {
        //          ^^^^^^^^^^^^^^
        //          hand-off the request to FCP channel;
        //          the request can complete at any point now
                        ...
                }

                /* Don't increase for unsolicited status */
                if (!zfcp_fsf_req_is_status_read_buffer(req))
        //           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        //           possible use-after-free
                        adapter->fsf_req_seq_no++;
        //                       ^^^^^^^^^^^^^^^^
        //                       because of the use-after-free we might
        //                       miss this accounting, and as follow-up
        //                       this results in the FCP channel error
        //                       FSF_PROT_SEQ_NUMB_ERROR
                adapter->req_no++;

                return 0;
        }

        static inline bool
        zfcp_fsf_req_is_status_read_buffer(struct zfcp_fsf_req *req)
        {
                return req->qtcb == NULL;
        //             ^^^^^^^^^
        //             possible use-after-free
        }

    Response Path:

        void zfcp_fsf_reqid_check(struct zfcp_qdio *qdio, int sbal_idx)
        {
                ...
                struct zfcp_fsf_req *fsf_req;
                ...
                for (idx = 0; idx < QDIO_MAX_ELEMENTS_PER_BUFFER; idx++) {
                        ...
                        fsf_req = zfcp_reqlist_find_rm(adapter->req_list,
                                                       req_id);
        //                        ^^^^^^^^^^^^^^^^^^^^
        //                        remove request from our driver-internal
        //                        hash-table (lock req_list->lock)
                        ...
                        zfcp_fsf_req_complete(fsf_req);
                }
        }

        static void zfcp_fsf_req_complete(struct zfcp_fsf_req *req)
        {
                ...
                if (likely(req->status & ZFCP_STATUS_FSFREQ_CLEANUP))
                        zfcp_fsf_req_free(req);
        //              ^^^^^^^^^^^^^^^^^
        //              free memory for request-object
                else
                        complete(&req->completion);
        //              ^^^^^^^^
        //              completion notification for code-paths that wait
        //              synchronous for the completion of the request; in
        //              those the memory is freed separately
        }

The result of the use-after-free only affects the send path, and can not
lead to any data corruption. In case we miss the sequence-number
accounting, because the memory was already re-purposed, the next FSF
command will fail with said FCP channel error, and we will recover the
whole adapter. This causes no additional errors, but it slows down
traffic.  There is a slight chance of the same thing happen again
recursively after the adapter recovery, but so far this has not been seen.

This was seen under z/VM, where the send path might run on a virtual CPU
that gets scheduled away by z/VM, while the return path might still run,
and so create the necessary timing. Running with KASAN can also slow down
the kernel sufficiently to run into this user-after-free, and then see the
report by KASAN.

To fix this, simply pull the test for the sequence-number accounting in
front of the hand-off to the FCP channel (this information doesn't change
during hand-off), but leave the sequence-number accounting itself where it
is.

To make future regressions of the same kind less likely, add comments to
all closely related code-paths.

Signed-off-by: Benjamin Block <bblock@linux.ibm.com>
Fixes: f9eca02276 ("scsi: zfcp: drop duplicate fsf_command from zfcp_fsf_req which is also in QTCB header")
Cc: <stable@vger.kernel.org> #5.0+
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-07-11 21:04:22 -04:00
..
Makefile s390: add a few more SPDX identifiers 2017-12-05 07:51:09 +01:00
zfcp_aux.c scsi: zfcp: fix sysfs block queue limit output for max_segment_size 2019-01-29 01:14:59 -05:00
zfcp_ccw.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
zfcp_dbf.c scsi: zfcp: silence all W=1 build warnings for existing kdoc 2018-11-15 15:01:18 -05:00
zfcp_dbf.h scsi: zfcp: silence remaining kdoc warnings in header files 2018-11-15 15:01:18 -05:00
zfcp_def.h scsi: zfcp: use enum zfcp_erp_steps for struct zfcp_erp_action.step 2018-11-15 15:01:18 -05:00
zfcp_erp.c scsi: zfcp: fix scsi_eh host reset with port_forced ERP for non-NPIV FCP devices 2019-03-27 21:26:12 -04:00
zfcp_ext.h scsi: zfcp: fix scsi_eh host reset with port_forced ERP for non-NPIV FCP devices 2019-03-27 21:26:12 -04:00
zfcp_fc.c scsi: s390: zfcp_fc: use sg helper to iterate over scatterlist 2019-06-20 15:21:33 -04:00
zfcp_fc.h scsi: zfcp: silence remaining kdoc warnings in header files 2018-11-15 15:01:18 -05:00
zfcp_fsf.c scsi: zfcp: fix request object use-after-free in send path causing seqno errors 2019-07-11 21:04:22 -04:00
zfcp_fsf.h scsi: zfcp: silence remaining kdoc warnings in header files 2018-11-15 15:01:18 -05:00
zfcp_qdio.c s390/qdio: make SBAL address array type-safe 2019-02-07 11:57:07 +01:00
zfcp_qdio.h scsi: zfcp: silence remaining kdoc warnings in header files 2018-11-15 15:01:18 -05:00
zfcp_reqlist.h scsi: zfcp: silence remaining kdoc warnings in header files 2018-11-15 15:01:18 -05:00
zfcp_scsi.c scsi: zfcp: fix scsi_eh host reset with port_forced ERP for non-NPIV FCP devices 2019-03-27 21:26:12 -04:00
zfcp_sysfs.c scsi: zfcp: support SCSI_ADAPTER_RESET via scsi_host sysfs attribute host_reset 2018-05-18 11:28:15 -04:00
zfcp_unit.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00