Under xgmi setup,some sysfs fail to create for the second time of kmd
driver loading. It's due to sysfs nodes are not removed appropriately
in the last unlod time.
Changes of this patch:
1. remove sysfs for dev_attr_xgmi_error
2. remove sysfs_link adev->dev->kobj with target name.
And it only needs to be removed once for a xgmi setup
3. remove sysfs_link hive->kobj with target name
In amdgpu_xgmi_remove_device:
1. amdgpu_xgmi_sysfs_rem_dev_info needs to be run per device
2. amdgpu_xgmi_sysfs_destroy needs to be run on the last node of
device.
v2: initialize array with memset
Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The prompts will contain pci address(segment/bus/port/function),
severity(warn or error) and some keywords(GPU, amdgpu). Also this
address the issue that pci bus retrieved by PCI_BUS_NUM(adev->pdev->devfn)
is wrong.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Testing on a Polaris11 gpu with DCE-11.2 suggests that it
seems to work fine there, so optimistically enable it for
DCE-11 and later.
v2: drop DCE 11.0 hunk. Carrizo (DCE 11.0) has a HW bug where FP16
scaling doesn't work. The upscale and downscale factors were
intended to block those FP16 cases and reject the commit but
nobody ever added those to atomic check. Once those are added
to atomic check, this can be re-enabled.
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Expose support for DRM_FORMAT_ABGR16161616F and
DRM_FORMAT_XBGR16161616F to the DRM core, complementing
the already existing xRGB ordered fp16 formats.
These are especially useful for creating presentable
swapchains in Vulkan for VK_FORMAT_R16G16B16A16_SFLOAT.
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
SCRATCH2 is used to keep decode wptr as a workaround
which fix a hardware DPG decode wptr update bug for
vcn2.5 beforehand.
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Return statements in functions returning bool should use
true/false instead of 1/0.
drivers/gpu/drm/amd/amdkfd/kfd_int_process_v9.c:40:9-10:
WARNING: return of 0/1 in function 'event_interrupt_isr_v9' with return type bool
Generated by: scripts/coccinelle/misc/boolreturn.cocci
Signed-off-by: Aishwarya Ramakrishnan <aishwaryarj100@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why & How]
One call was forcing stutter on instead of looking at the debug option.
Ensure we always check the debug option unless we want to force stutter
off.
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
At bringup we want to be able to disable various power features.
[How]
These features are already exposed as dc_debug_options and exercised
on other OSes. Create a new dc_debug_mask module parameter and expose
relevant bits, in particular
* DC_DISABLE_PIPE_SPLIT
* DC_DISABLE_STUTTER
* DC_DISABLE_DSC
* DC_DISABLE_CLOCK_GATING
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
the amdgpu device attribute node will be created accordding to sriov vf
mode at runtime.
cleanup unnecessary sriov check in attribute operation function.
Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
When this flag is set in the CS IB flags, it causes
a memory cache flush of the GFX.
v2:
Move new flag to drm_amdgpu_cs_chunk_ib.flags
Bump up UAPI version
Remove condition on job != null to emit mem_sync
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
unified amdgpu device attribute node functions:
1. add some helper functions to create amdgpu device attribute node.
2. create device node according to device attr flags on different VF mode.
3. rename some functions name to adapt a new interface.
v2:
1. remove ATTR_STATE_DEAD, ATTR_STATE_ALIVE enum.
2. rename callback function perform to attr_update.
3. modify some variable names
Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
the swsmu or powerplay(hwmgr) need to handle task according to different VF mode,
this function to help query vf mode.
vf mode:
1. SRIOV_VF_MODE_BARE_METAL: the driver work on host OS (PF)
2. SRIOV_VF_MODE_ONE_VF : the driver work on guest OS with one VF
3. SRIOV_VF_MODE_MULTI_VF : the driver work on guest OS with multi VF
Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
When GPU got timeout, it would notify an interested part
of an opportunity to dump info before actual GPU reset.
A usermode app would open 'autodump' node under debugfs system
and poll() for readable/writable. When a GPU reset is due,
amdgpu would notify usermode app through wait_queue_head and give
it 10 minutes to dump info.
After usermode app has done its work, this 'autodump' node is closed.
On node closure, amdgpu gets to know the dump is done through
the completion that is triggered in release().
There is no write or read callback because necessary info can be
obtained through dmesg and umr. Messages back and forth between
usermode app and amdgpu are unnecessary.
v2: (1) changed 'registered' to 'app_listening'
(2) add a mutex in open() to prevent race condition
v3 (chk): grab the reset lock to avoid race in autodump_open,
rename debugfs file to amdgpu_autodump,
provide autodump_read as well,
style and code cleanups
v4: add 'bool app_listening' to differentiate situations, so that
the node can be reopened; also, there is no need to wait for
completion when no app is waiting for a dump.
v5: change 'bool app_listening' to 'enum amdgpu_autodump_state'
add 'app_state_mutex' for race conditions:
(1)Only 1 user can open this file node
(2)wait_dump() can only take effect after poll() executed.
(3)eliminated the race condition between release() and
wait_dump()
v6: removed 'enum amdgpu_autodump_state' and 'app_state_mutex'
removed state checking in amdgpu_debugfs_wait_dump
Improve on top of version 3 so that the node can be reopened.
v7: move reinit_completion into open() so that only one user
can open it.
v8: remove complete_all() from amdgpu_debugfs_wait_dump().
Signed-off-by: Jiange Zhao <Jiange.Zhao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The variable ret is being initializeed with a value that is never read
and it is being updated later with a new value. The initialization
is redundant and can be removed.
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
There is no need to use amdgpu_mm_wreg_mmio_rlc()
during initialization time because this interface
is only designed for debugfs case to access the
registers which are only permitted by RLCG during
run-time. Therefore, turn back rlcg write for gfx_v10.
If we not turn back it, it will raise amdgpu load failure.
[ 54.904333] amdgpu: SMU driver if version not matched
[ 54.904393] amdgpu: SMU is initialized successfully!
[ 54.905971] [drm] kiq ring mec 2 pipe 1 q 0
[ 55.115416] amdgpu 0000:00:06.0: [drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring gfx_0.0.0 test failed (-110)
[ 55.118877] [drm:amdgpu_device_init [amdgpu]] *ERROR* hw_init of IP block <gfx_v10_0> failed -110
[ 55.126587] amdgpu 0000:00:06.0: amdgpu_device_ip_init failed
[ 55.133466] amdgpu 0000:00:06.0: Fatal error during GPU init
Signed-off-by: Yintian Tao <yttao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
'ctxid' is used to distinguish different events raised from SMC.
0x3 and 0x4 are for AC and DC power mode.
V2: update the way to retrieve the ctxid and change the log level
to debug
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fix the following gcc warning:
drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c:65:18: warning: ‘crtc_offsets’
defined but not used [-Wunused-const-variable=]
static const u32 crtc_offsets[6] =
^~~~~~~~~~~~
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
For MST case: when update_config is called to disable a stream,
this clears the settings for all the streams on that link.
We should only clear the settings for the stream that was disabled.
[How]
Clear the settings after the call to remove display is called.
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Signed-off-by: Leo (Hanghong) Ma <hanghong.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
On my raven1 system (rev c6) with VBIOS 113-RAVEN-114 GFXOFF is
not stable (resulting in large block tiling noise in some applications).
Disabling GFXOFF via the quirk list fixes the problems for me.
Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
since for sriov, baco happens on host side, no need and meaning
to judge is baco.
also, since kiq reads strap0 in here, if kiq is not ready
or gpu reset(kiq resume) happens after this read, would fail
to read and wrongly set baco as true(1).
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We only need to set DPM_FLAG_NEVER_SKIP for the legacy ATPX
BOCO case. D3cold and BACO work as expected.
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We should be checking whether the driver enabled runtime pm
rather than whether the asic supports BOCO or BACO. That said
in general they are equivalent unless the user has disabled
runpm or it has been disabled for a specific asic.
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
get_hive_id/get_node_id/get_topology_info/set_topology_info
are common xgmi command supported by TA for all the ASICs
that support xgmi link. They should be implemented as common
helper functions to avoid duplicated code per IP generation
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drm-misc-next for 5.8:
UAPI Changes:
Cross-subsystem Changes:
* MAINTAINERS: restore alphabetical order; update cirrus driver
* Dcomuentation: document visionix, chronteli, ite vendor prefices; update
documentation for Chrontel CH7033, IT6505, IVO, BOE,
Panasonic, Chunghwa, AUO bindings; convert dw_mipi_dsi.txt
to YAML; remove todo item for drm_display_mode.hsync removal;
Core Changes:
* drm: add devm_drm_dev_alloc() for managed allocations of drm_device;
use DRM_MODESET_LOCK_ALL_*() in mode-object code; remove
drm_display_mode.hsync; small cleanups of unused variables,
compiler warnings and static functions
* drm/client: dual-lincensing: GPL-2.0 or MIT
* drm/mm: optimize tree searches in rb_hole_addr()
Driver Changes:
* drm/{many}: use devm_drm_dev_alloc(); don't use drm_device.dev_private
* drm/ast: don't double-assign to drm_crtc_funcs.set_config; drop
drm_connector_register()
* drm/bochs: drop drm_connector_register()
* drm/bridge: add support for Chrontel ch7033; fix stack usage with
old gccs; return error pointer in drm_panel_bridge_add()
* drm/cirrus: Move to tiny
* drm/dp_mst: don't use 2nd sideband tx slot; revert "Remove single tx
msg restriction"
* drm/lima: support runtime PM;
* drm/meson: limit modes wrt chipset
* drm/panel: add support for Visionox rm69299; fix clock on
boe-tv101wum-n16; fix panel type for AUO G101EVN10;
add support for Ivo M133NFW4 R0; add support for BOE
NV133FHM-N61; add support for AUO G121EAN01.4, G156XTN01.0,
G190EAN01
* drm/pl111: improve vexpress init; fix module auto-loading
* drm/stm: read number of endpoints from device tree
* drm/vboxvideo: use managed PCI functions; drop DRM_MTRR_WC
* drm/vkms: fix use-after-free in vkms_gem_create(); enable cursor
support by default
* fbdev: use boolean values in several drivers
* fbdev/controlfb: fix COMPILE_TEST
* fbdev/w100fb: fix double-free bug
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20200507072503.GA10979@linux-uq9g