linux

Author	SHA1	Message	Date
Nicholas Kazlauskas	ad4de27f48	drm/amdgpu: Add module parameter for specifying default ABM level [Why] It's non trivial to configure or specify an ABM reduction level for userspace outside of X. There is also no method to specify the default ABM value at boot time. A parameter should be added to configure this. [How] Expose a module parameter that can specify the default ABM level to use for eDP connectors on DC enabled hardware that loads the DMCU firmware. The default is still disabled (0), but levels can range from 1-4. Levels control how much the backlight can be reduced, with being the least amount of reduction and four being the most reduction. Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Reviewed-by: David Francis <david.francis@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-06-11 12:45:18 -05:00
Monk Liu	ae1589f669	drm/amdgpu: drop the incorrect soft_reset for SRIOV It's incorrect to do soft reset for SRIOV, when GFX hang the WREG would stuck there becuase it goes KIQ way. the GPU reset counter is incorrect: always increase twice for each timedout Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-06-11 12:44:52 -05:00
James Zhu	df0a8064be	drm/amdgpu: Add GDS clearing workaround in later init for gfx9 Since Hardware bug, GDS exist ECC error after cold boot up, adding GDS clearing workaround in later init for gfx9. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-06-11 12:44:47 -05:00
Tom St Denis	b4559a1646	drm/amd/amdgpu: remove vram_page_split kernel option (v3) This option is no longer needed. The default code paths are now the only option. v2: Add HPAGE support and a default for non contiguous maps v3: Misread 512 pages as MiB ... Signed-off-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-06-11 12:44:23 -05:00
Prike Liang	80f41f84ae	drm/amd/amdgpu: add RLC firmware to support raven1 refresh Use SMU firmware version to indentify the raven1 refresh device and then load homologous RLC FW. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Suggested-by: Huang Rui<Ray.Huang@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-06-11 12:40:06 -05:00
Trigger Huang	e0301317ac	drm/amdgpu: Hardcode reg access using L1 security Under Vega10 SR-IOV VF, L1 register access mode should be enabled by default as the non-security VF will no longer be supported. Signed-off-by: Trigger Huang <Trigger.Huang@amd.com> Reviewed-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-06-11 12:39:51 -05:00
Shirish S	e038b9016a	drm/amdgpu/{uvd,vcn}: fetch ring's read_ptr after alloc [What] readptr read always returns zero, since most likely these blocks are either power or clock gated. [How] fetch rptr after amdgpu_ring_alloc() which informs the power management code that the block is about to be used and hence the gating is turned off. Signed-off-by: Louis Li <Ching-shih.Li@amd.com> Signed-off-by: Shirish S <shirish.s@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-06-11 12:39:44 -05:00
Louis Li	91c9c23e43	drm/amdgpu: fix ring test failure issue during s3 in vce 3.0 (V2) [What] vce ring test fails consistently during resume in s3 cycle, due to mismatch read & write pointers. On debug/analysis its found that rptr to be compared is not being correctly updated/read, which leads to this failure. Below is the failure signature: [drm:amdgpu_vce_ring_test_ring] ERROR amdgpu: ring 12 test failed [drm:amdgpu_device_ip_resume_phase2] ERROR resume of IP block <vce_v3_0> failed -110 [drm:amdgpu_device_resume] ERROR amdgpu_device_ip_resume failed (-110). [How] fetch rptr appropriately, meaning move its read location further down in the code flow. With this patch applied the s3 failure is no more seen for >5k s3 cycles, which otherwise is pretty consistent. V2: remove reduntant fetch of rptr Signed-off-by: Louis Li <Ching-shih.Li@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-06-11 12:39:26 -05:00
James Zhu	052af915d8	drm/amdgpu: Fixed missing to clear some EDC count EDC counts are related to instance and se. They are not the same for different type of EDC. EDC clearing are changed to base on individual EDC's instance and SE number. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-06-11 11:57:38 -05:00
Christian König	55c2e5a160	drm/amdgpu: stop removing BOs from the LRU v3 This avoids OOM situations when we have lots of threads submitting at the same time. v3: apply this to the whole driver, not just CS Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Tested-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-06-11 11:52:19 -05:00
Christian König	94de7349f7	drm/amdgpu: create GDS, GWS and OA in system domain And only move them in on validation. This allows for better control when multiple processes are fighting over those resources. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Tested-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-06-11 11:52:12 -05:00
Christian König	a3e7738d57	drm/amdgpu: drop some validation failure messages The messages about amdgpu_cs_list_validate are duplicated because the caller will complain into the logs as well and we can also get interrupted by a signal here. Also fix the the caller to not report -EAGAIN from validation. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Tested-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-06-11 11:52:03 -05:00
Hawking Zhang	5a6bfe0960	drm/amdgpu/psp: udpate ta_ras interface header ras ta interface header need to be updated to match with latest ta fw updates Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-06-11 11:51:28 -05:00
Alex Deucher	72a14e9b23	Revert "drm/amdgpu: add DRIVER_SYNCOBJ_TIMELINE to amdgpu" This reverts commit `8d8a5a64a8`. Wait until KHR exposes the VLK support. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-06-05 22:18:09 -05:00
Alex Deucher	beff74bc6e	drm/amdgpu: fix a race in GPU reset with IB test (v2) Split late_init into two functions, one (do_late_init) which just does the hw init, and late_init which calls do_late_init and schedules the IB test work. Call do_late_init in the GPU reset code to run the init code, but not schedule the IB test code. The IB test code is called directly in the gpu reset code so no need to run the IB tests in a separate work thread. If we do, we end up racing. v2: Rework late_init. Pull out the mgpu fan boost and xgmi pstate code into late_init so they get called in all cases. rename the late_init worker thread to delayed work since it's just the IB tests now which can happen later. Schedule the work at init and resume time. It's not needed at reset time because the IB tests are called directly. Reviewed-by: Christian König <christian.koenig@amd.com> Cc: Xinhui Pan <xinhui.pan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-06-05 22:18:09 -05:00
xinhui pan	c53e4db712	drm/amdgpu: cancel late_init_work before gpu reset gpu reset will run late_init and schedule the late_init_work. if we keep triggering gpu reset in a short time, there are potenial races. Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-06-05 22:18:09 -05:00
Christian König	6e58ab7ac7	drm/ttm: Make LRU removal optional v2 We are already doing this for DMA-buf imports and also for amdgpu VM BOs for quite a while now. If this doesn't run into any problems we are probably going to stop removing BOs from the LRU altogether. v2: drop BUG_ON from ttm_bo_add_to_lru Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Tested-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-05-31 10:39:34 -05:00
Emily Deng	bdb50274d0	drm/amdgpu/sriov: Correct some register program method For the VF, some registers only could be programmed with RLC. Signed-off-by: Emily Deng <Emily.Deng@amd.com> Reviewed-by: Trigger Huang <Trigger.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-05-31 10:39:33 -05:00
Oak Zeng	443e902eee	drm/amdkfd: Return proper error code for gws alloc API Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-05-31 10:39:33 -05:00
Emily Deng	789142eb8b	drm/amdgpu:Fix the unpin warning about csb buffer As it will destroy clear_state_obj, and also will unpin it in the gfx_v9_0_sw_fini, so don't need to call amdgpu_bo_free_kernel in gfx_v9_0_sw_fini, or it will have unpin warning. Signed-off-by: Emily Deng <Emily.Deng@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-05-31 10:39:29 -05:00
xinhui pan	efb426d581	drm/amdgpu: ras injection use gpu address injection need a valid gpu address. Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-05-31 10:39:29 -05:00
Dave Airlie	91c1ead6ae	Merge branch 'drm-next-5.3' of git://people.freedesktop.org/~agd5f/linux into drm-next New stuff for 5.3: - Add new thermal sensors for vega asics - Various RAS fixes - Add sysfs interface for memory interface utilization - Use HMM rather than mmu notifier for user pages - Expose xgmi topology via kfd - SR-IOV fixes - Fixes for manual driver reload - Add unique identifier for vega asics - Clean up user fence handling with UVD/VCE/VCN blocks - Convert DC to use core bpc attribute rather than a custom one - Add GWS support for KFD - Vega powerplay improvements - Add CRC support for DCE 12 - SR-IOV support for new security policy - Various cleanups From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190529220944.14464-1-alexander.deucher@amd.com	2019-05-31 10:04:39 +10:00
Emily Deng	394e9a14c6	drm/amdgpu: Need to set the baco cap before baco reset For passthrough, after rebooted the VM, driver will do a baco reset before doing other driver initialization during loading driver. For doing the baco reset, it will first check the baco reset capability. So first need to set the cap from the vbios information or baco reset won't be enabled. Signed-off-by: Emily Deng <Emily.Deng@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-05-28 14:47:42 -05:00
Alex Deucher	d55f33da54	drm/amdgpu/soc15: skip reset on init Not necessary on soc15 and breaks driver reload on server cards. Acked-by: Amber Lin <Amber.Lin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-05-28 14:47:20 -05:00
Chunming Zhou	8d8a5a64a8	drm/amdgpu: add DRIVER_SYNCOBJ_TIMELINE to amdgpu Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Flora Cui <Flora.Cui@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-05-28 14:45:46 -05:00
Oak Zeng	71efab6a30	drm/amdgpu: Add function to add/remove gws to kfd process GWS bo is shared between all kfd processes. Add function to add gws to kfd process's bo list so gws can be evicted from and restored for process. Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-05-28 14:44:18 -05:00
Oak Zeng	ca66fb8fbb	drm/amdgpu: Add interface to alloc gws from amdgpu Add amdgpu_amdkfd interface to alloc and free gws from amdgpu Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-05-28 14:44:05 -05:00
Oak Zeng	29e764621b	drm/amdkfd: Add gws number to kfd topology node properties Add amdgpu_amdkfd interface to get num_gws and add num_gws to /sys/class/kfd/kfd/topology/nodes/x/properties. Only report num_gws if MEC FW support GWS barriers. Currently it is determined by a module parameter which will be replaced with MEC FW version check when firmware is ready. Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-05-28 14:43:58 -05:00
Dave Airlie	88cd7a2c1b	drm-misc-next for v5.3, try #2 : Cross-subsystem Changes: - Fix device tree bindings in drm-misc-next after a botched merge. Core Changes: - Docbook fix for drm_hdmi_infoframe_set_hdr_metadata. Driver Changes: - mediatek: Fix compiler warning after merging the HDR series. - vc4: Rework binner bo handling. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEuXvWqAysSYEJGuVH/lWMcqZwE8MFAlznr6oACgkQ/lWMcqZw E8Mjbw//Rf2KeOyNYOpaUjzUIXjdGNKCSLG+MYbBzJLbdj6hywAi8tS6aS89d1qW CCBzPTUWFktuUVuHqIpZwNTPLndXzPvyC9v1BafKkF6Tkod1usBMaXD1266giAbC pKkJrejqeeQtYNfAQIGDzD/ndxXptw+mwK7DgRvMIQSGYuMCm+p5cG0RBtLV7Ijv fXIromzIQ+YUuOIyGRgmXW9zDUaieztovrLtIzpYALzTPZb5dqrJiuv3SKIiB4EK mlTprRqHbHpYLHHNhFrO2blfi/50+SThEHvUBP8rkMf3nu3nhQSMQrPtxJSfL71e 1nAWvIYkLY7lKid7ugFvsZL+1L0zgG6XnsqHs5/x5x/LGDK1jVCEGG/DdsXVjGFj XH8zdLBi3PrmwbKy/HHCh6QD5Iwtg4qm8Dfjjfil4XNQDI8pK8q8TaVMZETn3YRC 63JtZq8nBnrWgT57N/28apkymsHdz2QK99Yyc+GflFhhHsoNy6LhP+OqzW11rIas ANxZrF5CR8rudtoo2QeMkHcvkbIvDTQOPPuW6LXdXuqkhi91NFmgkxCCecFfpO74 QvTiBQHrlb8zqTMZJ/j6uSBTFNOXI2NxXTKUBMJ2O3FcyVqvpL+HutVPcBuIw3mM FNvCI1M9rVH1qFOZ+t1y9ceebuHPy6xYwuak6fKDwzOwJOmOMFI= =2K7c -----END PGP SIGNATURE----- Merge tag 'drm-misc-next-2019-05-24' of git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for v5.3, try #2: UAPI Changes: - Add HDR source metadata property. - Make drm.h compile on GNU/kFreeBSD by including stdint.h - Clarify how the userspace reviewer has to review new kernel UAPI. - Clarify that for using new UAPI, merging to drm-next or drm-misc-next should be enough. Cross-subsystem Changes: - video/hdmi: Add unpack function for DRM infoframes. - Device tree bindings: * Updating a property for Mali Midgard GPUs * Updating a property for STM32 DSI panel * Adding support for FriendlyELEC HD702E 800x1280 panel * Adding support for Evervision VGG804821 800x480 5.0" WVGA TFT panel * Adding support for the EDT ET035012DM6 3.5" 320x240 QVGA 24-bit RGB TFT. * Adding support for Three Five displays TFC S9700RTWV43TR-01B 800x480 panel with resistive touch found on TI's AM335X-EVM. * Adding support for EDT ETM0430G0DH6 480x272 panel. - Add OSD101T2587-53TS driver with DT bindings. - Add Samsung S6E63M0 panel driver with DT bindings. - Add VXT VL050-8048NT-C01 800x480 panel with DT bindings. - Dma-buf: - Make mmap callback actually optional. - Documentation updates. - Fix debugfs refcount inbalance. - Remove unused sync_dump function. - Fix device tree bindings in drm-misc-next after a botched merge. Core Changes: - Add support for HDR infoframes and related EDID parsing. - Remove prime sg_table caching, now done inside dma-buf. - Add shiny new drm_gem_vram helpers for simple VRAM drivers; with some fixes to the new API on top. - Small fix to job cleanup without timeout handler. - Documentation fixes to drm_fourcc. - Replace lookups of drm_format with struct drm_format_info; remove functions that become obsolete by this conversion. - Remove double include in bridge/panel.c and some drivers. - Remove drmP.h include from drm/edid and drm/dp. - Fix null pointer deref in drm_fb_helper_hotplug_event(). - Remove most members from drm_fb_helper_crtc, only mode_set is kept. - Remove race of fb helpers with userspace; only restore mode when userspace is not master. - Move legacy setup from drm_file.c to drm_legacy_misc.c - Rework scheduler job destruction. - drm/bus was removed, remove from TODO. - Add __drm_atomic_helper_crtc_reset() to subclass crtc_state, and convert some drivers to use it (conversion is not complete yet). - Bump vblank timeout wait to 100 ms for atomic. - Docbook fix for drm_hdmi_infoframe_set_hdr_metadata. Driver Changes: - sun4i: Use DRM_GEM_CMA_VMAP_DRIVER_OPS instead of definining manually. - v3d: Small cleanups, adding support for compute shaders, reservation/synchronization fixes and job management refactoring, fixes MMU and debugfs. - lima: Fix null pointer in irq handler on startup, set default timeout for scheduled jobs. - stm/ltdc: Assorted fixes and adding FB modifier support. - amdgpu: Avoid hw reset if guilty job was already signaled. - virtio: Add seqno to fences, add trace events, use correct flags for fence allocation. - Convert AST, bochs, mgag200, vboxvideo, hisilicon to the new drm_gem_vram API. - sun6i_mipi_dsi: Support DSI GENERIC_SHORT_WRITE_2 transfers. - bochs: Small fix to use PTR_RET_OR_ZERO and driver unload. - gma500: header fixes - cirrus: Remove unused files. - mediatek: Fix compiler warning after merging the HDR series. - vc4: Rework binner bo handling. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/052875a5-27ba-3832-60c2-193d950afdff@linux.intel.com	2019-05-28 08:59:11 +10:00
Tom St Denis	74abc2210e	drm/amd/doc: Add RAS documentation to guide Acked-by: Slava Abramov <slava.abramov@amd.com> Signed-off-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-05-24 12:46:49 -05:00
Tom St Denis	1c1e53f7f2	drm/amd/doc: Add XGMI sysfs documentation Acked-by: Slava Abramov <slava.abramov@amd.com> Signed-off-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-05-24 12:46:49 -05:00
Nicholas Kazlauskas	1825fd34e8	drm/amd/display: Switch the custom "max bpc" property to the DRM prop [Why] The custom "max bpc" property was added to limit color depth while the DRM one was still being merged. It's been a few kernel versions since then and this TODO was still sticking around. [How] Attach the DRM max bpc property to the connector and drop all of our custom property management. Set the max bpc to 8 by default since DRM defaults to the max in the range which would be 16 in this case. No behavioral changes are intended with this patch, it should just be a refactor. v2: Don't force 8bpc when no state is given Cc: Leo Li <sunpeng.li@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-05-24 12:25:56 -05:00
Kent Russell	fb2dbfd242	drm/amdgpu: Add Unique Identifier sysfs file unique_id v2 Add a file that provides a Unique ID for the GPU. This will persist across machines and is guaranteed to be unique. This is only available for GFX9 and newer, so older ASICs will not have this file in the sysfs pool v2: Store it in adev for ASICs that don't have a hwmgr Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-05-24 12:25:32 -05:00
Felix Kuehling	1986a3b022	drm/amdgpu: Improve error handling for HMM Use unsigned long for number of pages. Check that pfns are valid after hmm_vma_fault. If they are not, return an error instead of continuing with invalid page pointers and PTEs. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-05-24 12:21:02 -05:00
Philip Yang	b9c5eb5b80	drm/amdgpu: more descriptive message if HMM not enabled If using old kernel config file, CONFIG_ZONE_DEVICE is not selected, so CONFIG_HMM and CONFIG_HMM_MIRROR is not enabled, the current driver error message "Failed to register MMU notifier" is not clear. Inform user with more descriptive message on how to fix the missing kernel config option. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109808 Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-05-24 12:21:02 -05:00
Philip Yang	6826cb3b92	drm/amdgpu: support userptr cross VMAs case with HMM userptr may cross two VMAs if the forked child process (not call exec after fork) malloc buffer, then free it, and then malloc larger size buf, kerenl will create new VMA adjacent to old VMA which was cloned from parent process, some pages of userptr are in the first VMA, the rest pages are in the second VMA. HMM expects range only have one VMA, loop over all VMAs in the address range, create multiple ranges to handle this case. See is_mergeable_anon_vma in mm/mmap.c for details. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-05-24 12:21:02 -05:00
Philip Yang	6c55d6e90e	drm/amdkfd: support concurrent userptr update for HMM Userptr restore may have concurrent userptr invalidation after hmm_vma_fault adds the range to the hmm->ranges list, needs call hmm_vma_range_done to remove the range from hmm->ranges list first, then reschedule the restore worker. Otherwise hmm_vma_fault will add same range to the list, this will cause loop in the list because range->next point to range itself. Add function untrack_invalid_user_pages to reduce code duplication. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-05-24 12:21:02 -05:00
Philip Yang	ad595b8634	drm/amdgpu: fix HMM config dependency issue Only select HMM_MIRROR will get kernel config dependency warnings if CONFIG_HMM is missing in the config. Add depends on HMM will solve the issue. Add conditional compilation to fix compilation errors if HMM_MIRROR is not enabled as HMM config is not enabled. Remove unused function amdgpu_ttm_tt_mark_user_pages. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-05-24 12:21:02 -05:00
Philip Yang	899fbde146	drm/amdgpu: replace get_user_pages with HMM mirror helpers Use HMM helper function hmm_vma_fault() to get physical pages backing userptr and start CPU page table update track of those pages. Then use hmm_vma_range_done() to check if those pages are updated before amdgpu_cs_submit for gfx or before user queues are resumed for kfd. If userptr pages are updated, for gfx, amdgpu_cs_ioctl will restart from scratch, for kfd, restore worker is rescheduled to retry. HMM simplify the CPU page table concurrent update check, so remove guptasklock, mmu_invalidations, last_set_pages fields from amdgpu_ttm_tt struct. HMM does not pin the page (increase page ref count), so remove related operations like release_pages(), put_page(), mark_page_dirty(). Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-05-24 12:21:02 -05:00
Philip Yang	2c5a51f570	drm/amdgpu: use HMM callback to replace mmu notifier Replace our MMU notifier with hmm_mirror_ops.sync_cpu_device_pagetables callback. Enable CONFIG_HMM and CONFIG_HMM_MIRROR as a dependency in DRM_AMDGPU_USERPTR Kconfig. It supports both KFD userptr and gfx userptr paths. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-05-24 12:21:01 -05:00
shaoyunl	e14ba95b90	drm/amdgpu: Use heavy weight for tlb invalidation on xgmi configuration There is a bug found in vml2 xgmi logic: mtype is always sent as NC on the VMC to TC interface for a page walk, regardless of whether the request is being sent to local or remote GPU. NC means non-coherent and will cause the VMC return data to be cached in the TCC (versus UC – uncached will not cache the data). Since the page table updates are being done by SDMA/HDP, then TCC will never be updated and the GC VML2 will continue to hit on the TCC and never get the updated page tables and result in a fault. Heave weigh tlb invalidation does a WB/INVAL of the L1/L2 GL data caches so TCC will not be hit on next request Signed-off-by: shaoyunl <Shaoyun.Liu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-05-24 12:21:01 -05:00
Alex Deucher	dbaa922b57	drm/amdgpu: use pcie_bandwidth_available rather than open coding it It does the same thing we were doing already. I though it needed work for gen3/4 speeds, but that seems to be covered already. Reviewed-by: Evan Quan <evan.quan@amd.com> Acked-by: Michel Dänzer <michel.daenzer@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-05-24 12:21:01 -05:00
Slava Abramov	d6ee400e79	drm/amdgpu: use div64_ul for 32-bit compatibility v1 v1: replace casting to unsigned long with div64_ul Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Slava Abramov <slava.abramov@amd.com> Tested-by: Slava Abramov <slava.abramov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-05-24 12:21:01 -05:00
Colin Ian King	e70a26b303	drm/amdgpu: fix spelling mistake "retrived" -> "retrieved" There is a spelling mistake in a DRM_ERROR error message. Fix this. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-05-24 12:21:00 -05:00
Alex Deucher	e74609cb42	drm/amdgpu/vega20: use mode1 reset for RAS and XGMI If RAS or XGMI are enabled, you have to use mode1 reset rather than BACO. Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-05-24 12:21:00 -05:00
Evan Quan	fe75a32371	drm/amd/powerplay: support ppfeatures sysfs interface on sw smu routine Support ppfeatures sysfs interface on Vega20 sw smu routine. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-05-24 12:20:59 -05:00
Ori Messinger	5bb2353273	drm/amdgpu: Report firmware versions with sysfs v2 Firmware versions can be found as separate sysfs files at: /sys/class/drm/cardX/device/fw_version (where X is the card number) The firmware versions are displayed in hexadecimal. v2: Moved sysfs files to subfolder Signed-off-by: Ori Messinger <ori.messinger@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-05-24 12:20:52 -05:00
Leo Liu	9dc7b02a3c	drm/amdgpu: make VCN DPG pause mode detached from general VCN It should be attached to VCN 1.0 Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-05-24 12:20:52 -05:00
Leo Liu	05eee12dd6	drm/amdgpu: move the VCN DPG mode read and write to VCN Since this is VCN specific and only used by VCN Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-05-24 12:20:52 -05:00
Tiecheng Zhou	fe2b5323d2	drm/amdgpu/sriov: Need to initialize the HDP_NONSURFACE_BAStE it requires to initialize HDP_NONSURFACE_BASE, so as to avoid using the value left by a previous VM under sriov scenario. v2: it should not hurt baremetal, generalize it for both sriov and baremetal Signed-off-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Tiecheng Zhou <Tiecheng.Zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-05-24 12:20:52 -05:00

1 2 3 4 5 ...

5461 Commits