Building a randconfig here triggered:
ERROR: modpost: "pm_suspend_target_state" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!
because the module export of that symbol happens in
kernel/power/suspend.c which is enabled with CONFIG_SUSPEND.
The ifdef guards in amdgpu_acpi_is_s0ix_supported(), however, test for
CONFIG_PM_SLEEP which is defined like this:
config PM_SLEEP
def_bool y
depends on SUSPEND || HIBERNATE_CALLBACKS
and that randconfig has:
# CONFIG_SUSPEND is not set
CONFIG_HIBERNATE_CALLBACKS=y
leading to the module export missing.
Change the ifdeffery to depend directly on CONFIG_SUSPEND.
Fixes: 5706cb3c91 ("drm/amdgpu: fix checking pmops when PM_SLEEP is not enabled")
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/YSP6Lv53QV0cOAsd@zn.tnic
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
schedule_delayed_work does not push back the work if it was already
scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms
after the first time GFXOFF was disabled and re-enabled, even if GFXOFF
was disabled and re-enabled again during those 100 ms.
This resulted in frame drops / stutter with the upcoming mutter 41
release on Navi 14, due to constantly enabling GFXOFF in the HW and
disabling it again (for getting the GPU clock counter).
To fix this, call cancel_delayed_work_sync when the disable count
transitions from 0 to 1, and only schedule the delayed work on the
reverse transition, not if the disable count was already 0. This makes
sure the delayed work doesn't run at unexpected times, and allows it to
be lock-free.
v2:
* Use cancel_delayed_work_sync & mutex_trylock instead of
mod_delayed_work.
v3:
* Make amdgpu_device_delay_enable_gfx_off lock-free (Christian König)
v4:
* Fix race condition between amdgpu_gfx_off_ctrl incrementing
adev->gfx.gfx_off_req_count and amdgpu_device_delay_enable_gfx_off
checking for it to be 0 (Evan Quan)
Cc: stable@vger.kernel.org
Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> # v3
Acked-by: Christian König <christian.koenig@amd.com> # v3
Signed-off-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For some reason we run into an use case where a BO is already pinned
into GTT, but should be pinned into VRAM|GTT again.
Handle that case gracefully as well.
Reviewed-by: Shashank Sharma <Shashank.sharma@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
schedule_delayed_work does not push back the work if it was already
scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms
after the first time GFXOFF was disabled and re-enabled, even if GFXOFF
was disabled and re-enabled again during those 100 ms.
This resulted in frame drops / stutter with the upcoming mutter 41
release on Navi 14, due to constantly enabling GFXOFF in the HW and
disabling it again (for getting the GPU clock counter).
To fix this, call cancel_delayed_work_sync when the disable count
transitions from 0 to 1, and only schedule the delayed work on the
reverse transition, not if the disable count was already 0. This makes
sure the delayed work doesn't run at unexpected times, and allows it to
be lock-free.
v2:
* Use cancel_delayed_work_sync & mutex_trylock instead of
mod_delayed_work.
v3:
* Make amdgpu_device_delay_enable_gfx_off lock-free (Christian König)
v4:
* Fix race condition between amdgpu_gfx_off_ctrl incrementing
adev->gfx.gfx_off_req_count and amdgpu_device_delay_enable_gfx_off
checking for it to be 0 (Evan Quan)
Cc: stable@vger.kernel.org
Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> # v3
Acked-by: Christian König <christian.koenig@amd.com> # v3
Signed-off-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For some reason we run into an use case where a BO is already pinned
into GTT, but should be pinned into VRAM|GTT again.
Handle that case gracefully as well.
Reviewed-by: Shashank Sharma <Shashank.sharma@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
[Why]
In some cases when we unload driver, warning call trace
will show up in vram_mgr_fini which claims that LRU is not empty, caused
by the ttm bo inside delay deleted queue.
[How]
We should flush delayed work to make sure the delay deleting is done.
Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
aldebaran supports up to 16 xgmi physical nodes.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
We have a S3 issue on that SKU with BACO enabled. Will bring back this
when that root caused.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
MMSCH 1.0 doesn't have major/minor version, only verison.
Signed-off-by: Zhigang Luo <zhigang.luo@amd.com>
Reviewed by Shaoyun.liu <Shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The TA has a limit to the amount of data that can be retrieved from
GET_TOPOLOGY. For setups that exceed this limit, the xGMI topology
needs to be re-initialized and data needs to be re-fetched from the
extended link records by setting a flag in the shared command buffer.
The number of hops and the number of links must be accumulated by the
driver. Other data points are all fetched from the first request.
Because the TA has already exceeded its link record limit, it
cannot hold bidirectional information. Otherwise the driver would
have to do more than two fetches so the driver has to reflect the
topology information in the opposite direction.
v2: squashed with internal reviewed fix
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Hawking Zhang <hawking.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Currently, the readout of fan speed pwm is transited into percent-based
and then pwm-based. However, the transition into percent-based is totally
unnecessary and make the final output less accurate.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Delete ras_if->name in the RAS ctx structure and remove related lines.
Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
When guest received FLR notification from host, it would
lock adapter into reset state. There will be no more
job submission and hardware access after that.
Then it should send a response to host that it has prepared
for host reset.
Signed-off-by: Jiange Zhao <Jiange.Zhao@amd.com>
Signed-off-by: Peng Ju Zhou <PengJu.Zhou@amd.com>
Reviewed-by: Emily.Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Why: Previously hw fence is alloced separately with job.
It caused historical lifetime issues and corner cases.
The ideal situation is to take fence to manage both job
and fence's lifetime, and simplify the design of gpu-scheduler.
How:
We propose to embed hw_fence into amdgpu_job.
1. We cover the normal job submission by this method.
2. For ib_test, and submit without a parent job keep the
legacy way to create a hw fence separately.
v2:
use AMDGPU_FENCE_FLAG_EMBED_IN_JOB_BIT to show that the fence is
embedded in a job.
v3:
remove redundant variable ring in amdgpu_job
v4:
add tdr sequence support for this feature. Add a job_run_counter to
indicate whether this job is a resubmit job.
v5
add missing handling in amdgpu_fence_enable_signaling
Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com>
Signed-off-by: Jack Zhang <Jack.Zhang7@hotmail.com>
Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed by: Monk Liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
UAPI Changes:
Cross-subsystem Changes:
- Add lockdep_assert(once) helpers.
Core Changes:
- Add lockdep assert to drm_is_current_master_locked.
- Fix typos in dma-buf documentation.
- Mark drm irq midlayer as legacy only.
- Fix GPF in udmabuf_create.
- Rename member to correct value in drm_edid.h
Driver Changes:
- Build fix to make nouveau build with NOUVEAU_BACKLIGHT.
- Add MI101AIT-ICP1, LTTD800480070-L6WWH-RT panels.
- Assorted fixes to bridge/it66121, anx7625.
- Add custom crtc_state to simple helpers, and use it to
convert pll handling in mgag200 to atomic.
- Convert drivers to use offset-adjusted framebuffer bo mappings.
- Assorted small fixes and fix for a use-after-free in vmwgfx.
- Convert remaining callers of non-legacy drivers to use linux irqs directly.
- Small cleanup in ingenic.
- Small fixes to virtio and ti-sn65dsi86.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEuXvWqAysSYEJGuVH/lWMcqZwE8MFAmEVd7cACgkQ/lWMcqZw
E8Pw4hAApFWMPThb1OGxAsozhfOAc+Nhd/XjClWrEeu+auowge0OJzjUTOpUci7o
NrL5GWLPDoIXm5/8eLm6B9k+hPmsuLQUgriMMpPyulvgckSYiLKzFEygbKtilfsJ
k+ucLYlZ93ADLIcUIaode5eeWIbtsiuXNdfpH91uLKT2Lo1k4azJTfLJl0t1TB78
uAumaeLXirmqfyN8U/zgPSmaNyTgBlMjMrJG/0+NGaoiq3sKtC/vCU/4r2aEcsMd
ZrYbAJph5ND/u449P4vk2mDNmpz6GE1i8y1zeq7OWq+EOFHaUv0M2v1ZGTl9NSCw
CascPEb+Hzt4eMFEvpVn9Vz/p7NpVeT78733YkF6jO898pruDbJFz/1PFLUcWp5S
jssvgYyhqTrLTCabg8OMrkww8S5on+qcRPVjrpaND2QJUrF58acrEQVCJmCf/XIg
G2XXotZxv96upWj1n/bZz/KcNN7xc+J6z9SEZml/elz5er9msbL8OFEq9bxWxfKL
xjDY0CwvRtkwnxkin6UpZHYlGGciY1j3pCMHiRpQNiHDQOfNE2pFbKHN0lWE7jiS
wQaSMhxC8cjM1TpfHCeHmM522QS6oqJbPev9Z+2twO8KtfetbbBE3xe+Rm/Rf7eM
vPRvHavWaRDlkq742qZqsiqP0aiqgr04i73Kw5mGLOCT/42uCuU=
=G09r
-----END PGP SIGNATURE-----
Merge tag 'drm-misc-next-2021-08-12' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for v5.15:
UAPI Changes:
Cross-subsystem Changes:
- Add lockdep_assert(once) helpers.
Core Changes:
- Add lockdep assert to drm_is_current_master_locked.
- Fix typos in dma-buf documentation.
- Mark drm irq midlayer as legacy only.
- Fix GPF in udmabuf_create.
- Rename member to correct value in drm_edid.h
Driver Changes:
- Build fix to make nouveau build with NOUVEAU_BACKLIGHT.
- Add MI101AIT-ICP1, LTTD800480070-L6WWH-RT panels.
- Assorted fixes to bridge/it66121, anx7625.
- Add custom crtc_state to simple helpers, and use it to
convert pll handling in mgag200 to atomic.
- Convert drivers to use offset-adjusted framebuffer bo mappings.
- Assorted small fixes and fix for a use-after-free in vmwgfx.
- Convert remaining callers of non-legacy drivers to use linux irqs directly.
- Small cleanup in ingenic.
- Small fixes to virtio and ti-sn65dsi86.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1cf2d7fc-402d-1852-574a-21cbbd2eaebf@linux.intel.com
Wherever possible, replace constructs that match either
generic_handle_irq(irq_find_mapping()) or
generic_handle_irq(irq_linear_revmap()) to a single call to
generic_handle_domain_irq().
Signed-off-by: Marc Zyngier <maz@kernel.org>
The variable val is declared without initialization, and its address is
passed to amdgpu_i2c_get_byte(). In this function, the value of val is
accessed in:
DRM_DEBUG("i2c 0x%02x 0x%02x read failed\n",
addr, *val);
Also, when amdgpu_i2c_get_byte() returns, val may remain uninitialized,
but it is accessed in:
val &= ~amdgpu_connector->router.ddc_mux_control_pin;
To fix this possible uninitialized-variable access, initialize val to 0 in
amdgpu_i2c_router_select_ddc_port().
Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
Signed-off-by: Tuo Li <islituo@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch adds support to program trap handler settings
when loading driver with software scheduler (sched_policy=2).
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Suggested-by: Jay Cornwall <Jay.Cornwall@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Drop the DRM IRQ midlayer in favor of Linux IRQ interfaces. DRM's
IRQ helpers are mostly useful for UMS drivers. Modern KMS drivers
don't benefit from using it.
DRM IRQ callbacks are now being called directly or inlined.
The interrupt number returned by pci_msi_vector() is now stored
in struct amdgpu_irq. Calls to pci_msi_vector() can fail and return
a negative errno code. Abort initlaizaton in thi case. The DRM IRQ
midlayer does not handle this correctly.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210803090704.32152-2-tzimmermann@suse.de
There may be multiple instances and only one is harvested.
v2: fix typo in commit message
Fixes: 83a0b86391 ("drm/amdgpu: add judgement when add ip blocks (v2)")
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1673
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
There may be multiple instances and only one is harvested.
v2: fix typo in commit message
Fixes: 83a0b86391 ("drm/amdgpu: add judgement when add ip blocks (v2)")
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1673
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
There was an "if" statement that did nothing so it was removed.
Signed-off-by: Sergio Miguéns Iglesias <sergio@lony.xyz>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Don't use "begin kernel-doc notation" (/**) for comments that are
not kernel-doc. This eliminates warnings reported by the 0day bot.
drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c:89: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
* This shader is used to clear VGPRS and LDS, and also write the input
drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c:209: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
* The below shaders are used to clear SGPRS, and also write the input
drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c:301: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
* This shader is used to clear the uninitiated sgprs after the above
Fixes: 0e0036c7d1 ("drm/amdgpu: fix no full coverage issue for gprs initialization")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: Dennis Li <Dennis.Li@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
When init failed in early init stage, amdgpu_object has
not been initialized, so hasn't the ttm delayed queue functions.
Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com>
Reviewed-by: Emily.Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
- Extend wait time and add retry, currently 6s * 2times
- Change timing algorithm
Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Signed-off-by: Peng Ju Zhou <PengJu.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Check whether the kcalloc() fails and return -ENOMEM if it does.
Fixes: 84ec374bd5 ("drm/amdgpu: create amdgpu_vkms (v4)")
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
If the platform uses BOCO, don't use BACO in runtime suspend.
We could end up executing the BACO path if the platform supports
both.
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1669
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
The varialbe gtt in the function amdgpu_ttm_tt_populate() and
amdgpu_ttm_tt_unpopulate() is guaranteed to be not NULL in the context.
Thus the null-pointer checks are redundant and can be dropped.
Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Tuo Li <islituo@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
If the platform uses BOCO, don't use BACO in runtime suspend.
We could end up executing the BACO path if the platform supports
both.
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1669
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add the MODE register into the per-wave debug information.
This register holds state such as FP rounding and denorm
modes, which exceptions are enabled, and active clamping
modes.
Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The previous logic is recording the amount of valid vcn instances
to use them on SRIOV, it is a hard task due to the vcn accessment is
based on the index of the vcn instance.
Check if the vcn instance enabled before do instance init.
Signed-off-by: Peng Ju Zhou <PengJu.Zhou@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
resolved bug with incorrect PSP BL cmd IDs
Signed-off-by: John Clements <john.clements@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Move dce_virtual into amdgpu_vkms and update all references to
dce_virtual with amdgpu_vkms.
v2: Removed more references to dce_virtual.
v3: Restored display modes from previous implementation.
Signed-off-by: Ryan Taylor <Ryan.Taylor@amd.com>
Reported-by: kernel test robot <lkp@intel.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Remove obsolete functions and variables from dce_virtual.
Signed-off-by: Ryan Taylor <Ryan.Taylor@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Modify the VKMS driver into an api that dce_virtual can use to create
virtual displays that obey drm's atomic modesetting api.
v2: Made local functions static.
v3: Switched vkms_output kzalloc for kcalloc.
Cleanup patches by moving display mode fixes to this patch.
v4: Update atomic_check and atomic_update to comply with new kms api.
Signed-off-by: Ryan Taylor <Ryan.Taylor@amd.com>
Reported-by: kernel test robot <lkp@intel.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Duplicate include header file <drm/drm_drv.h>
line 28: #include <drm/drm_drv.h>
line 44: #include <drm/drm_drv.h>
Signed-off-by: zhouchuangao <zhouchuangao@vivo.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
In amdgpu_fence_driver_hw_fini, no need to call drm_sched_fini to stop
scheduler in s3 test, otherwise, fence related failure will arrive
after resume. To fix this and for a better clean up, move drm_sched_fini
from fence_hw_fini to fence_sw_fini, as it's part of driver shutdown, and
should never be called in hw_fini.
v2: rename amdgpu_fence_driver_init to amdgpu_fence_driver_sw_init,
to keep sw_init and sw_fini paired.
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1668
Fixes: 8d35a25961 ("drm/amdgpu: adjust fence driver enable sequence")
Suggested-by: Christian König <christian.koenig@amd.com>
Tested-by: Mike Lothian <mike@fireburn.co.uk>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fix the channel_index table layout to fetch the correct
channel_index when calculating physical address from
normalized address during page retirement.
Also, fix the number of UMC instances and number of channels
within each UMC instance for Aldebaran.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-By: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
'pm_suspend_target_state' is only available when CONFIG_PM_SLEEP
is set/enabled. OTOH, when both SUSPEND and HIBERNATION are not set,
PM_SLEEP is not set, so this variable cannot be used.
../drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c: In function ‘amdgpu_acpi_is_s0ix_active’:
../drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c:1046:11: error: ‘pm_suspend_target_state’ undeclared (first use in this function); did you mean ‘__KSYM_pm_suspend_target_state’?
return pm_suspend_target_state == PM_SUSPEND_TO_IDLE;
^~~~~~~~~~~~~~~~~~~~~~~
__KSYM_pm_suspend_target_state
Also use shorter IS_ENABLED(CONFIG_foo) notation for checking the
2 config symbols.
Fixes: 91e273712a ("drm/amdgpu: Check pmops for desired suspend state")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: linux-next@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
'pm_suspend_target_state' is only available when CONFIG_PM_SLEEP
is set/enabled. OTOH, when both SUSPEND and HIBERNATION are not set,
PM_SLEEP is not set, so this variable cannot be used.
../drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c: In function ‘amdgpu_acpi_is_s0ix_active’:
../drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c:1046:11: error: ‘pm_suspend_target_state’ undeclared (first use in this function); did you mean ‘__KSYM_pm_suspend_target_state’?
return pm_suspend_target_state == PM_SUSPEND_TO_IDLE;
^~~~~~~~~~~~~~~~~~~~~~~
__KSYM_pm_suspend_target_state
Also use shorter IS_ENABLED(CONFIG_foo) notation for checking the
2 config symbols.
Fixes: 91e273712a ("drm/amdgpu: Check pmops for desired suspend state")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: linux-next@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
If GC has entered CGPG, ringing doorbell > first page doesn't wakeup GC.
Enlarge CP_MEC_DOORBELL_RANGE_UPPER to workaround this issue.
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This reverts commit 53d0533049.
Revert reason: The issue has been resolved.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This reverts commit 4e7b93ca52.
Revert reason: The issue has been resolved.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This reverts commit 7ed9876c97.
Revert reason: The issue has been resolved.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This reverts commit 024d8811c9.
Revert reason: The issue has been resolved.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
If one GTT BO has been evicted/swapped out, it should sit in CPU domain.
TTM only alloc struct ttm_resource instead of struct ttm_range_mgr_node
for sysMem.
Now when we update mapping for such invalidated BOs, we might walk out
of bounds of struct ttm_resource.
Three possible fix:
1) Let sysMem manager alloc struct ttm_range_mgr_node, like
ttm_range_manager does.
2) Pass pages_addr to update_mapping function too, but need memset
pages_addr[] to zero when unpopulate.
3) Init amdgpu_res_cursor directly.
bug is detected by kfence.
==================================================================
BUG: KFENCE: out-of-bounds read in amdgpu_vm_bo_update_mapping+0x564/0x6e0
Out-of-bounds read at 0x000000008ea93fe9 (64B right of kfence-#167):
amdgpu_vm_bo_update_mapping+0x564/0x6e0 [amdgpu]
amdgpu_vm_bo_update+0x282/0xa40 [amdgpu]
amdgpu_vm_handle_moved+0x19e/0x1f0 [amdgpu]
amdgpu_cs_vm_handling+0x4e4/0x640 [amdgpu]
amdgpu_cs_ioctl+0x19e7/0x23c0 [amdgpu]
drm_ioctl_kernel+0xf3/0x180 [drm]
drm_ioctl+0x2cb/0x550 [drm]
amdgpu_drm_ioctl+0x5e/0xb0 [amdgpu]
kfence-#167 [0x000000008e11c055-0x000000001f676b3e
ttm_sys_man_alloc+0x35/0x80 [ttm]
ttm_resource_alloc+0x39/0x50 [ttm]
ttm_bo_swapout+0x252/0x5a0 [ttm]
ttm_device_swapout+0x107/0x180 [ttm]
ttm_global_swapout+0x6f/0x130 [ttm]
ttm_tt_populate+0xb1/0x2a0 [ttm]
ttm_bo_handle_move_mem+0x17e/0x1d0 [ttm]
ttm_mem_evict_first+0x59d/0x9c0 [ttm]
ttm_bo_mem_space+0x39f/0x400 [ttm]
ttm_bo_validate+0x13c/0x340 [ttm]
ttm_bo_init_reserved+0x269/0x540 [ttm]
amdgpu_bo_create+0x1d1/0xa30 [amdgpu]
amdgpu_bo_create_user+0x40/0x80 [amdgpu]
amdgpu_gem_object_create+0x71/0xc0 [amdgpu]
amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x2f2/0xcd0 [amdgpu]
kfd_ioctl_alloc_memory_of_gpu+0xe2/0x330 [amdgpu]
kfd_ioctl+0x461/0x690 [amdgpu]
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
If GC has entered CGPG, ringing doorbell > first page doesn't wakeup GC.
Enlarge CP_MEC_DOORBELL_RANGE_UPPER to workaround this issue.
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
If one GTT BO has been evicted/swapped out, it should sit in CPU domain.
TTM only alloc struct ttm_resource instead of struct ttm_range_mgr_node
for sysMem.
Now when we update mapping for such invalidated BOs, we might walk out
of bounds of struct ttm_resource.
Three possible fix:
1) Let sysMem manager alloc struct ttm_range_mgr_node, like
ttm_range_manager does.
2) Pass pages_addr to update_mapping function too, but need memset
pages_addr[] to zero when unpopulate.
3) Init amdgpu_res_cursor directly.
bug is detected by kfence.
==================================================================
BUG: KFENCE: out-of-bounds read in amdgpu_vm_bo_update_mapping+0x564/0x6e0
Out-of-bounds read at 0x000000008ea93fe9 (64B right of kfence-#167):
amdgpu_vm_bo_update_mapping+0x564/0x6e0 [amdgpu]
amdgpu_vm_bo_update+0x282/0xa40 [amdgpu]
amdgpu_vm_handle_moved+0x19e/0x1f0 [amdgpu]
amdgpu_cs_vm_handling+0x4e4/0x640 [amdgpu]
amdgpu_cs_ioctl+0x19e7/0x23c0 [amdgpu]
drm_ioctl_kernel+0xf3/0x180 [drm]
drm_ioctl+0x2cb/0x550 [drm]
amdgpu_drm_ioctl+0x5e/0xb0 [amdgpu]
kfence-#167 [0x000000008e11c055-0x000000001f676b3e
ttm_sys_man_alloc+0x35/0x80 [ttm]
ttm_resource_alloc+0x39/0x50 [ttm]
ttm_bo_swapout+0x252/0x5a0 [ttm]
ttm_device_swapout+0x107/0x180 [ttm]
ttm_global_swapout+0x6f/0x130 [ttm]
ttm_tt_populate+0xb1/0x2a0 [ttm]
ttm_bo_handle_move_mem+0x17e/0x1d0 [ttm]
ttm_mem_evict_first+0x59d/0x9c0 [ttm]
ttm_bo_mem_space+0x39f/0x400 [ttm]
ttm_bo_validate+0x13c/0x340 [ttm]
ttm_bo_init_reserved+0x269/0x540 [ttm]
amdgpu_bo_create+0x1d1/0xa30 [amdgpu]
amdgpu_bo_create_user+0x40/0x80 [amdgpu]
amdgpu_gem_object_create+0x71/0xc0 [amdgpu]
amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x2f2/0xcd0 [amdgpu]
kfd_ioctl_alloc_memory_of_gpu+0xe2/0x330 [amdgpu]
kfd_ioctl+0x461/0x690 [amdgpu]
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The function is ready on psp firmware, and enable it by default.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fence driver was enabled per ring when sw init on per IP block before.
Change to enable all the fence driver at the same time after
amdgpu_device_ip_init finished.
Rename some function related to fence to make it reasonable for read.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Added BL loading support for soc/intf/dbg drivers
Signed-off-by: John Clements <john.clements@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Detect psp driver binaries packed into FW and try to load the FW
Signed-off-by: John Clements <john.clements@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
It won't need to clear the xxx_PSP_DEBUG registers, because firmware
will handle this change.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
On platforms that support multiple backlights, register
each one separately. This lets us manage them independently
rather than registering a single backlight and applying the
same settings to both.
v2: fix typo:
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Roman Li <Roman.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This reverts commit 4e7b93ca52.
Revert reason: The issue has been resolved.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This reverts commit 7ed9876c97.
Revert reason: The issue has been resolved.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This reverts commit 024d8811c9.
Revert reason: The issue has been resolved.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
User might change the suspend behaviour from OS.
[How]
Check with pm for target suspend state and set s0ix
flag only for s2idle state.
v2: User might change default suspend state, use target state
v3: squash in build fix
Suggested-by: Lijo Lazar <Lijo.Lazar@amd.com>
Signed-off-by: Pratik Vishwakarma <Pratik.Vishwakarma@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Rename amdgpu_acpi_is_s0ix_supported to better explain
functionality by renaming to amdgpu_acpi_is_s0ix_active
Signed-off-by: Pratik Vishwakarma <Pratik.Vishwakarma@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
User might change the suspend behaviour from OS.
[How]
Check with pm for target suspend state and set s0ix
flag only for s2idle state.
v2: User might change default suspend state, use target state
v3: squash in build fix
Suggested-by: Lijo Lazar <Lijo.Lazar@amd.com>
Signed-off-by: Pratik Vishwakarma <Pratik.Vishwakarma@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
In case when psp_init_asd_microcode() fails to load ASD microcode file,
psp_v12_0_init_microcode() tries to print the firmware filename that
failed to load before bailing out.
This is wrong because:
- the firmware filename it would want it print is an incorrect one as
psp_init_asd_microcode() and psp_v12_0_init_microcode() are loading
different filenames
- it tries to print fw_name, but that's not yet been initialized by that
time, so it prints random stack contents, e.g.
amdgpu 0000:04:00.0: Direct firmware load for amdgpu/renoir_asd.bin failed with error -2
amdgpu 0000:04:00.0: amdgpu: fail to initialize asd microcode
amdgpu 0000:04:00.0: amdgpu: psp v12.0: Failed to load firmware "\xfeTO\x8e\xff\xff"
Fix that by bailing out immediately, instead of priting the bogus error
message.
Reported-by: Vojtech Pavlik <vojtech@ucw.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
This reverts commit 4192f7b576.
It is not true (as stated in the reverted commit changelog) that we never
unmap the BAR on failure; it actually does happen properly on
amdgpu_driver_load_kms() -> amdgpu_driver_unload_kms() ->
amdgpu_device_fini() error path.
What's worse, this commit actually completely breaks resource freeing on
probe failure (like e.g. failure to load microcode), as
amdgpu_driver_unload_kms() notices adev->rmmio being NULL and bails too
early, leaving all the resources that'd normally be freed in
amdgpu_acpi_fini() and amdgpu_device_fini() still hanging around, leading
to all sorts of oopses when someone tries to, for example, access the
sysfs and procfs resources which are still around while the driver is
gone.
Fixes: 4192f7b576 ("drm/amdgpu: unmap register bar on device init failure")
Reported-by: Vojtech Pavlik <vojtech@ucw.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmD95yIeHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGqp0H/j/xHL20EHaUJOaV
iJjnfGyjtnkLC5FCoV/q/v9sFuSW2p4W1nyF8/eIgVKObef94Mg4/xxaHQrWIM56
cbzK9aIcD9InAuImJ6lju4fqjNmFrt2x7mhfzjPKqmhfINfZ5CohpLFN5XdOwzYC
l+ZgmUUl7GLDAND2M6rtkc7AOk4qTyAySDvvPFELE/uNgV4EKaENSIWofHhEzW5v
Yk+4agawaFTfa6H9+uMVYZBOcEKwheQ0E2tcOJvHJT8Mwm8MFoC/B7fLY5zxIdN2
7A7r/7qbSQmSDSjOgwKS4ZOjom0xGSD+V+596SzET6jkbahR2HJ/mrFvmD7GNEoW
OWJPjzI=
=vzIM
-----END PGP SIGNATURE-----
Backmerge tag 'v5.14-rc3' into drm-next
Linux 5.14-rc3
Daniel said we should pull the nouveau fix from fixes in here, probably
a good plan.
Signed-off-by: Dave Airlie <airlied@redhat.com>
They are initalized by hardware during power up phase,
starting from sdma v5_2 generation
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
In pass-through mode, after mode 1 reset, msix enablement status would
lost and never receives interrupt again. So, we should restore msix
status after mode 1 reset.
Signed-off-by: Chengzhe Liu <ChengZhe.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The fail reason is that the vfgate is disabled
Signed-off-by: Roy Sun <Roy.Sun@amd.com>
Reviewed-by: Peng Ju Zhou <PengJu.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
On Sienna Cichlid, in pass-through mode, if we unload the driver in BACO
mode(RTPM), then the kernel would receive thousands of interrupts.
That's because there is doorbell monitor interrupt on BIF, so KVM keeps
injecting interrupts to the guest VM. So we should clear the doorbell
interrupt status after BACO exit.
v2: Modify coding style and commit message
Signed-off-by: Chengzhe Liu <ChengZhe.Liu@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Asic cyan_skilfish2 won't support RLC autoload when using
front door loading. We just use PSP to load firmware like
gfx9 here.
So add autoload_supported flag check instead of just
checking firmware load type for RLC autoload.
Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Enable SMU support for cyan_skilfish.
v2: Squash in fix (Alex)
Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Will switch to front door loading by default after this function is
stable.
v2: use APU flags (Alex)
Signed-off-by: Lang Yu <lang.yu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add psp v11.0.8 to ip block initialization.
v2: use APU flags (Alex)
Signed-off-by: Lang Yu <lang.yu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
nbio version is 2.3.
v2: Make it more explicit (Alex)
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
v2: squash in updates from Ray
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add gfx support for cyan_skillfish.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add gmc support for cyan_skillfish.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use backdoor loading.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Same as Navi10.
v2: squash in updates (Alex)
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add cp/rlc fw loading support and gfx golden setting.
v2: squash in updates (Alex)
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add ip blocks for cyan_skillfish.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Use FAMILY_NV for cyan_skillfish.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add ip offset definition for cyan_skillfish and initialize it.
v2: squash in ip_offset updates (Alex)
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>