linux/drivers/gpu/drm/amd/amdkfd
Jason Gunthorpe 0029cab314 drm/amdkfd: fix a use after free race with mmu_notifer unregister
When using mmu_notifier_unregister_no_release() the caller must ensure
there is an SRCU synchronize before the mn memory is freed, otherwise
use-after-free races are possible, for instance:

     CPU0                                      CPU1
                                      invalidate_range_start
                                         hlist_for_each_entry_rcu(..)
 mmu_notifier_unregister_no_release(&p->mn)
 kfree(mn)
                                      if (mn->ops->invalidate_range_end)

The error unwind in amdkfd misses the SRCU synchronization.

amdkfd keeps the kfd_process around until the mm is released, so split the
flow to fully initialize the kfd_process and register it for find_process,
and with the notifier. Past this point the kfd_process does not need to be
cleaned up as it is fully ready.

The final failable step does a vm_mmap() and does not seem to impact the
kfd_process global state. Since it also cannot be undone (and already has
problems with undo if it internally fails), it has to be last.

This way we don't have to try to unwind the mmu_notifier_register() and
avoid the problem with the SRCU.
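
The ordering described above (finish every undoable step, publish the
object, and run the one step that cannot be undone last) can be modelled
in plain userspace C. This is only an illustrative sketch, not the amdkfd
code: every demo_* name is hypothetical, the linked list stands in for the
process table and notifier registration, and demo_release_all() stands in
for the teardown that happens when the mm is released.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for a kfd_process; not the kernel struct. */
struct demo_process {
	bool registered;     /* visible to lookups and the simulated notifier */
	bool mapping_ready;  /* outcome of the final, non-undoable step */
	struct demo_process *next;
};

/* Simulated lookup table (assumption: a simple singly linked list). */
static struct demo_process *demo_table;

/* Failable step that is trivially undone: nothing is published yet. */
static struct demo_process *demo_alloc_and_init(void)
{
	return calloc(1, sizeof(struct demo_process));
}

/*
 * Publish the object. From here on a concurrent reader could hold a
 * pointer to it, so the creation path must never free it directly.
 */
static void demo_register(struct demo_process *p)
{
	p->next = demo_table;
	p->registered = true;
	demo_table = p;
}

/* Final failable step, deliberately last; fail_map simulates an error. */
static int demo_final_step(struct demo_process *p, bool fail_map)
{
	if (fail_map)
		return -1;
	p->mapping_ready = true;
	return 0;
}

/*
 * Returns 0 on success, -1 on error. On a late error the object stays
 * registered and is only freed later by demo_release_all().
 */
static int demo_create(bool fail_map, struct demo_process **out)
{
	struct demo_process *p;

	*out = NULL;
	p = demo_alloc_and_init();
	if (!p)
		return -1; /* nothing published: no unwind needed */

	demo_register(p);
	*out = p;
	return demo_final_step(p, fail_map);
}

/* Single teardown path; returns how many objects were freed. */
static size_t demo_release_all(void)
{
	size_t freed = 0;

	while (demo_table) {
		struct demo_process *p = demo_table;

		assert(p->registered);
		demo_table = p->next;
		free(p);
		freed++;
	}
	return freed;
}
```

Because demo_create() has no error path that frees a published object,
there is no point at which a reader could observe freed memory; the cost
is that an object whose final step failed lingers until the common
teardown runs.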

Along the way this also fixes various other error unwind bugs in the flow.

Fixes: 45102048f7 ("amdkfd: Add process queue manager module")
Link: https://lore.kernel.org/r/20190806231548.25242-10-jgg@ziepe.ca
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-08-20 09:35:02 -03:00
cik_event_interrupt.c drm/amdkfd: Simplify kfd2kgd interface 2018-11-05 14:21:07 -05:00
cik_int.h drm/amdkfd: Clean up reference of radeon 2018-07-11 22:33:08 -04:00
cik_regs.h drm/amdkfd: Delete a duplicate statement in set_pasid_vmid_mapping() 2018-11-05 14:21:13 -05:00
cwsr_trap_handler_gfx8.asm drm/amdkfd: Fix gfx8 MEM_VIOL exception handler 2019-05-24 12:21:01 -05:00
cwsr_trap_handler_gfx9.asm drm/amdkfd: Preserve ttmp[4:5] instead of ttmp[14:15] 2019-05-24 12:21:01 -05:00
cwsr_trap_handler_gfx10.asm drm/amdkfd: Add navi10 support to amdkfd. (v3) 2019-06-21 18:59:24 -05:00
cwsr_trap_handler.h drm/amdkfd: Add navi10 support to amdkfd. (v3) 2019-06-21 18:59:24 -05:00
Kconfig treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
kfd_chardev.c drm main pull request for v5.3-rc1 (sans mm changes) 2019-07-15 19:04:27 -07:00
kfd_crat.c drm/amdkfd: Add navi10 support to amdkfd. (v3) 2019-06-21 18:59:24 -05:00
kfd_crat.h drm/amdkfd: Adjust weight to represent num_hops info when report xgmi iolink 2019-05-24 12:20:48 -05:00
kfd_dbgdev.c drm/amdkfd: Clean up reference of radeon 2018-07-11 22:33:08 -04:00
kfd_dbgdev.h drm/amdkfd: Clean up reference of radeon 2018-07-11 22:33:08 -04:00
kfd_dbgmgr.c drm/amdkfd: Make sched_policy a per-device setting 2018-01-04 17:17:43 -05:00
kfd_dbgmgr.h
kfd_debugfs.c amdkfd: no need to check return value of debugfs_create functions 2019-06-13 13:59:49 -05:00
kfd_device_queue_manager_cik.c drm/amdkfd: Introduce asic-specific mqd_manager_init function 2019-05-24 12:21:02 -05:00
kfd_device_queue_manager_v9.c drm/amdkfd: Consistently apply noretry setting 2019-07-16 13:02:55 -05:00
kfd_device_queue_manager_v10.c drm/amdkfd: Add navi10 support to amdkfd. (v3) 2019-06-21 18:59:24 -05:00
kfd_device_queue_manager_vi.c drm/amdkfd: Introduce asic-specific mqd_manager_init function 2019-05-24 12:21:02 -05:00
kfd_device_queue_manager.c drm/amdkfd: fix cp hang in eviction 2019-07-11 14:37:24 -05:00
kfd_device_queue_manager.h drm/amdkfd: Add navi10 support to amdkfd. (v3) 2019-06-21 18:59:24 -05:00
kfd_device.c drm/amdkfd: remove an unused variable 2019-07-02 16:14:22 -05:00
kfd_doorbell.c drm/amdkfd: Fix kernel queue 64 bit doorbell offset calculation 2018-07-11 22:33:01 -04:00
kfd_events.c drm/amdkfd: Cosmetic cleanup 2019-05-24 12:20:48 -05:00
kfd_events.h drm/amdkfd: Implement GPU reset handlers in KFD 2018-07-11 22:32:56 -04:00
kfd_flat_memory.c drm/amdkfd: Add navi10 support to amdkfd. (v3) 2019-06-21 18:59:24 -05:00
kfd_int_process_v9.c drm/amdkfd: Workaround PASID missing in gfx9 interrupt payload under non HWS 2018-11-19 16:38:14 -05:00
kfd_interrupt.c drm/amdkfd: fix zero reading of VMID and PASID for Hawaii 2018-07-11 22:32:51 -04:00
kfd_iommu.c drm/amdkfd: Initialize HSA_CAP_ATS_PRESENT capability in topology codes 2019-06-11 12:57:25 -05:00
kfd_iommu.h drm/amdkfd: Centralize IOMMUv2 code and make it conditional 2017-12-08 19:22:12 -05:00
kfd_kernel_queue_cik.c drm/amdkfd: Add 64-bit doorbell and wptr support to kernel queue 2018-04-08 22:03:51 -04:00
kfd_kernel_queue_v9.c drm/amdkfd: Disable idle optimization for chained runlist 2019-07-03 14:32:10 -05:00
kfd_kernel_queue_v10.c drm/amdkfd: Add navi10 support to amdkfd. (v3) 2019-06-21 18:59:24 -05:00
kfd_kernel_queue_vi.c drm/amdkfd: Delete alloc_format field from map_queue struct 2019-05-24 12:21:03 -05:00
kfd_kernel_queue.c drm/amdkfd: Add navi10 support to amdkfd. (v3) 2019-06-21 18:59:24 -05:00
kfd_kernel_queue.h drm/amdkfd: Add navi10 support to amdkfd. (v3) 2019-06-21 18:59:24 -05:00
kfd_module.c drm/amdkfd: Add procfs-style information for KFD processes 2019-06-20 11:34:00 -05:00
kfd_mqd_manager_cik.c drm/amdkfd: Separate mqd allocation and initialization 2019-06-11 12:56:59 -05:00
kfd_mqd_manager_v9.c drm/amdkfd: Separate mqd allocation and initialization 2019-06-11 12:56:59 -05:00
kfd_mqd_manager_v10.c drm/amdkfd: Add navi10 support to amdkfd. (v3) 2019-06-21 18:59:24 -05:00
kfd_mqd_manager_vi.c drm/amdkfd: Separate mqd allocation and initialization 2019-06-11 12:56:59 -05:00
kfd_mqd_manager.c drm/amdkfd: Separate mqd allocation and initialization 2019-06-11 12:56:59 -05:00
kfd_mqd_manager.h drm/amdkfd: Separate mqd allocation and initialization 2019-06-11 12:56:59 -05:00
kfd_packet_manager.c drm/amdkfd: Print a warning when the runlist becomes oversubscribed 2019-07-03 14:31:26 -05:00
kfd_pasid.c drm/amdkfd: Simplify kfd2kgd interface 2018-11-05 14:21:07 -05:00
kfd_pm4_headers_ai.h drm/amdkfd: Add chained_runlist_idle_disable flag to pm4_mes_runlist 2019-07-03 14:32:04 -05:00
kfd_pm4_headers_diq.h
kfd_pm4_headers_vi.h drm/amdkfd: Delete alloc_format field from map_queue struct 2019-05-24 12:21:03 -05:00
kfd_pm4_headers.h
kfd_pm4_opcodes.h
kfd_priv.h drm/amdkfd: Consistently apply noretry setting 2019-07-16 13:02:55 -05:00
kfd_process_queue_manager.c drm/amdkfd: Remove GWS from process during uninit 2019-07-17 13:34:31 -05:00
kfd_process.c drm/amdkfd: fix a use after free race with mmu_notifer unregister 2019-08-20 09:35:02 -03:00
kfd_queue.c drm/amdkfd: use %px to print user space address instead of %p 2018-05-01 17:56:04 -04:00
kfd_topology.c drm/amdkfd: Add navi10 support to amdkfd. (v3) 2019-06-21 18:59:24 -05:00
kfd_topology.h drm/amdkfd: Add gws number to kfd topology node properties 2019-05-28 14:43:58 -05:00
Makefile drm/amdkfd: Add navi10 support to amdkfd. (v3) 2019-06-21 18:59:24 -05:00
soc15_int.h drm/amdkfd: Add SOC15 interrupt processing support 2018-04-10 17:33:10 -04:00