Commit Graph

133 Commits

Author SHA1 Message Date
Kees Cook
bec2dd6969 drm/msm/adreno: Remove VLA usage
In the quest to remove all stack VLA usage from the kernel[1], this
switches to using a kasprintf()ed buffer. Return paths are updated
to free the allocation.

[1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-08-05 10:07:09 -04:00
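
A minimal sketch of the stack-VLA-to-kasprintf() conversion described above, using a hypothetical firmware-name buffer rather than the driver's actual code:

  #include <linux/device.h>
  #include <linux/slab.h>     /* kasprintf(), kfree() */

  /* Before: a VLA sized by a runtime value lived on the stack, e.g.
   *   char newname[strlen("qcom/") + strlen(fwname) + 1];
   * After: allocate the exact size on the heap and free it on every
   * return path.
   */
  static int example_load(struct device *dev, const char *fwname)
  {
          char *newname;
          int ret = 0;

          newname = kasprintf(GFP_KERNEL, "qcom/%s", fwname);
          if (!newname)
                  return -ENOMEM;

          /* ... use newname, e.g. request_firmware(&fw, newname, dev) ... */

          kfree(newname);
          return ret;
  }
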
Arnd Bergmann
3530a17f4d drm/msm/gpu: avoid deprecated do_gettimeofday
All users of do_gettimeofday() have been removed, but this one recently
crept in, along with an incorrect printing of the microseconds portion.

This converts it to using ktime_get_real_timespec64() as a direct
replacement, and adds the leading zeroes. I considered using monotonic
times (ktime_get()) instead, but as this timestamp appears to only
be used for humans rather than compared with other timestamps, the
real time domain is probably good enough.

Fixes: e43b045e2c82 ("drm/msm/gpu: Capture the state of the GPU")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-07-30 08:50:12 -04:00
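
A sketch of the replacement pattern, assuming the timespec64-based ktime_get_real_ts64() accessor and a hypothetical state struct; the %06 width supplies the leading zeroes mentioned above:

  #include <linux/seq_file.h>
  #include <linux/time64.h>
  #include <linux/timekeeping.h>

  struct example_state {
          struct timespec64 time;
  };

  static void example_capture_time(struct example_state *state)
  {
          /* Wall-clock time; the value is only ever read by humans. */
          ktime_get_real_ts64(&state->time);
  }

  static void example_show_time(struct seq_file *m, const struct example_state *state)
  {
          /* Microseconds zero-padded to six digits. */
          seq_printf(m, "time: %lld.%06ld\n",
                     (long long)state->time.tv_sec,
                     state->time.tv_nsec / NSEC_PER_USEC);
  }
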
Jordan Crouse
cdb95931de drm/msm/gpu: Add the buffer objects from the submit to the crash dump
For hangs, copy out the contents of the buffer objects attached to the
guilty submission and print them in the crash dump report.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-07-30 08:50:10 -04:00
Jordan Crouse
50f8d21863 drm/msm/adreno: Add a5xx specific registers for the GPU state
HLSQ, SP and TP registers are only accessible from a special
aperture and to make matters worse the aperture is blocked from
the CPU on targets that can support secure rendering. Luckily the
GPU hardware has its own purpose built register dumper that can
access the registers from the aperture. Add a5xx specific code
to program the crashdumper and retrieve the wayward registers
and dump them for the crash state.

Also, remove a block of registers from the regular CPU-accessible
list that aren't useful for debug, which helps reduce the size
of the crash state file by a goodly amount.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-07-30 08:50:06 -04:00
Jordan Crouse
43a56687d1 drm/msm/adreno: Add ringbuffer data to the GPU state
Add the contents of each ringbuffer to the GPU state and dump the
data in the crash file encoded with ascii85. To save space only
the used portions of the ringbuffer are dumped.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-07-30 08:50:03 -04:00
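
A hedged sketch of dumping just the used portion of a ring as ascii85, assuming the kernel's <linux/ascii85.h> helper and a hypothetical ring layout with start/next pointers:

  #include <linux/ascii85.h>   /* ascii85_encode(), ASCII85_BUFSZ */
  #include <linux/types.h>
  #include <drm/drm_print.h>

  struct example_ring {
          u32 *start;          /* beginning of the ringbuffer */
          u32 *next;           /* one past the last written dword */
  };

  static void example_dump_ring(struct drm_printer *p, const struct example_ring *ring)
  {
          char out[ASCII85_BUFSZ];
          const u32 *ptr;

          /* Only the used region [start, next) is emitted, to save space. */
          for (ptr = ring->start; ptr < ring->next; ptr++)
                  drm_printf(p, "%s", ascii85_encode(*ptr, out));
          drm_printf(p, "\n");
  }
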
Jordan Crouse
bcf1d9fa5d drm/msm/adreno: Convert the show/crash file format
Convert the format of the 'show' debugfs file and the crash
dump to a format resembling YAML. This should be easier to
parse and be more flexible for future changes and expansions.

v2: Use a standard .rst for the msm crashdump documentation

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-07-30 08:50:00 -04:00
Jordan Crouse
c0fec7f562 drm/msm/gpu: Capture the GPU state on a GPU hang
Capture the GPU state on a GPU hang and store it for later playback
via the devcoredump facility. Only one crash state is stored at a
time on the assumption that the first hang is usually the most
interesting. The existing crash state can be cleared after capturing
it and then a new one will be captured on the next hang.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-07-30 08:49:56 -04:00
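
A sketch of handing a captured state to devcoredump; the state struct and callbacks are hypothetical, and dev_coredumpm() takes ownership of the data until user space reads or discards the dump:

  #include <linux/device.h>
  #include <linux/devcoredump.h>
  #include <linux/module.h>
  #include <linux/slab.h>
  #include <linux/string.h>

  struct example_crash_state {
          char *buf;
          size_t len;
  };

  static ssize_t example_dump_read(char *buffer, loff_t offset, size_t count,
                                   void *data, size_t datalen)
  {
          struct example_crash_state *state = data;

          if (offset >= state->len)
                  return 0;
          if (count > state->len - offset)
                  count = state->len - offset;
          memcpy(buffer, state->buf + offset, count);
          return count;
  }

  static void example_dump_free(void *data)
  {
          struct example_crash_state *state = data;

          kfree(state->buf);
          kfree(state);
  }

  static void example_report_crash(struct device *dev, struct example_crash_state *state)
  {
          /* Exposes the dump under /sys/class/devcoredump/ until it is read. */
          dev_coredumpm(dev, THIS_MODULE, state, state->len, GFP_KERNEL,
                        example_dump_read, example_dump_free);
  }
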
Jordan Crouse
4f776f4511 drm/msm/gpu: Convert the GPU show function to use the GPU state
Convert the existing GPU show function to use the GPU state to
dump the information rather than reading it directly from the hardware.
This will require an additional step to capture the state before
dumping it for the existing nodes but it will greatly facilitate reusing
the same code for dumping a previously captured state from a GPU hang.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-07-30 08:49:48 -04:00
Jordan Crouse
e00e473d98 drm/msm/gpu: Capture the state of the GPU
Add the infrastructure to capture the current state of the GPU and
store it in memory so that it can be dumped later.

For now grab the same basic ringbuffer information and registers
that are provided by the debugfs 'gpu' node but obviously this should
be extended to capture a much larger set of GPU information.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-07-30 08:49:45 -04:00
Jordan Crouse
64709686db drm/msm/gpu: Increase the pm runtime autosuspend for 5xx
Experimentation shows that resuming power quickly after suspending
ends up forcing a system hang for unknown reasons on 5xx targets.
To avoid cycling the power too much (especially during init)
turn up the autosuspend time for a5xx to 250ms and use
pm_runtime_put_autosuspend() when applicable.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-07-25 07:51:04 -04:00
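
A sketch of the autosuspend setup and put pattern, with the 250 ms delay mentioned above; everything except the pm_runtime_* calls is hypothetical:

  #include <linux/pm_runtime.h>

  static void example_pm_setup(struct device *dev)
  {
          /* Keep the GPU powered for 250 ms after the last use so quick
           * suspend/resume cycles are avoided.
           */
          pm_runtime_set_autosuspend_delay(dev, 250);
          pm_runtime_use_autosuspend(dev);
          pm_runtime_enable(dev);
  }

  static void example_pm_put(struct device *dev)
  {
          /* Instead of a plain pm_runtime_put(), defer the suspend by the
           * autosuspend delay.
           */
          pm_runtime_mark_last_busy(dev);
          pm_runtime_put_autosuspend(dev);
  }
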
Bjorn Andersson
79d57bf6fa drm/msm: Trigger fence completion from GPU
Interrupt commands cause the CP to trigger an interrupt as the command
is processed, regardless of whether the GPU is done processing previous
commands. This is seen by the interrupt being delivered before the
fence is written on 8974 and is likely the cause of the additional
CP_WAIT_FOR_IDLE workaround found for a306, which causes the CP to
wait for the GPU to go idle before triggering the interrupt.

Instead we can set the (undocumented) BIT(31) of the CACHE_FLUSH_TS
which will cause a special CACHE_FLUSH_TS interrupt to be triggered from
the GPU as the write event is processed.

Add CACHE_FLUSH_TS to the IRQ masks of A3xx and A4xx and remove the
workaround for A306.

Suggested-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-03-19 06:33:36 -04:00
Jordan Crouse
9de43e79c1 drm/msm/adreno: Use generic function to load firmware to a buffer object
Move a5xx specific code to load firmware into a buffer object to
the generic Adreno code. This will come in useful for future targets.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-20 10:41:22 -05:00
Jordan Crouse
c5e3548c29 drm/msm/adreno: Define a list of firmware files to load per target
The number and type of firmware files required differs for each
target. Instead of using a fixed struct member for each possible
firmware file use a generic list of files that should be loaded
on boot.  Use some semi-target specific enums to help each target
find the appropriate firmware(s) that it needs to load.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-20 10:41:22 -05:00
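
A sketch of the idea: a semi-target-specific enum indexes a per-target table of firmware names, so each entry lists only the files that target needs (names and fields here are illustrative, not the driver's exact structures):

  /* Index into the per-target firmware name table. */
  enum example_fw {
          EXAMPLE_FW_PM4,
          EXAMPLE_FW_PFP,
          EXAMPLE_FW_GPMU,        /* only some targets have a GPMU */
          EXAMPLE_FW_MAX,
  };

  struct example_gpu_info {
          unsigned int revision;
          const char *fw[EXAMPLE_FW_MAX]; /* NULL entries are simply skipped */
  };

  static const struct example_gpu_info gpulist[] = {
          {
                  .revision = 330,
                  .fw = {
                          [EXAMPLE_FW_PM4] = "a330_pm4.fw",
                          [EXAMPLE_FW_PFP] = "a330_pfp.fw",
                  },
          }, {
                  .revision = 530,
                  .fw = {
                          [EXAMPLE_FW_PM4]  = "a530_pm4.fw",
                          [EXAMPLE_FW_PFP]  = "a530_pfp.fw",
                          [EXAMPLE_FW_GPMU] = "a530v3_gpmu.fw2",
                  },
          },
  };
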
Jordan Crouse
f91c14ab44 drm/msm: Add devfreq support for the GPU
Add support for devfreq to dynamically control the GPU frequency.
By default try to use the 'simple_ondemand' governor which can
adjust the frequency based on GPU load.

v2: Fix __aeabi_uldivmod issue from the 0 day bot and use
devfreq_recommended_opp() as suggested by Rob.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-01-10 14:30:03 -05:00
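
A hedged sketch of wiring a device into devfreq with the simple_ondemand governor; the profile callbacks are stubs and the clock/counter plumbing is left out:

  #include <linux/devfreq.h>
  #include <linux/err.h>
  #include <linux/pm_opp.h>

  static int example_devfreq_target(struct device *dev, unsigned long *freq, u32 flags)
  {
          struct dev_pm_opp *opp;

          /* Round the requested frequency to a valid OPP before programming it. */
          opp = devfreq_recommended_opp(dev, freq, flags);
          if (IS_ERR(opp))
                  return PTR_ERR(opp);
          dev_pm_opp_put(opp);

          /* ... clk_set_rate(core_clk, *freq) or similar would go here ... */
          return 0;
  }

  static int example_devfreq_status(struct device *dev, struct devfreq_dev_status *status)
  {
          /* Report busy/total time and current frequency from hardware counters. */
          status->busy_time = 0;
          status->total_time = 0;
          status->current_frequency = 0;
          return 0;
  }

  static struct devfreq_dev_profile example_profile = {
          .polling_ms     = 10,
          .target         = example_devfreq_target,
          .get_dev_status = example_devfreq_status,
  };

  /* At init: devm_devfreq_add_device(dev, &example_profile, "simple_ondemand", NULL); */
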
Jordan Crouse
999ae6edc1 drm/msm/adreno: Move clock parsing to adreno_gpu_init()
Move the clock parsing to adreno_gpu_init() to allow for target
specific probing and manipulation of the clock tables.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-01-10 08:58:42 -05:00
Jordan Crouse
1babd706b4 drm/msm/gpu: Remove unused bus scaling code
Remove the downstream bus scaling code. It is only needed for
compatibility with a downstream or vendor kernel. Get it out of the
way to clear space for devfreq support.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-01-10 08:58:42 -05:00
Colin Ian King
3a9016ba0e drm/msm: fix spelling mistake: "ringubffer" -> "ringbuffer"
Trivial fix to spelling mistake in DRM_DEV_ERROR error message

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-12-13 11:01:20 -05:00
Jordan Crouse
b1fc2839d2 drm/msm: Implement preemption for A5XX targets
Implement preemption for A5XX targets - this allows multiple
ringbuffers for different priorities with automatic preemption
of a lower priority ringbuffer if a higher one is ready.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-10-28 11:01:38 -04:00
Jordan Crouse
4d87fc32df drm/msm: Make the value of RB_CNTL (almost) generic
We use a global ringbuffer size and block size for all targets and
at least for 5XX preemption we need to know the value of RB_CNTL
in several locations, so it makes sense to calculate it once and use
it everywhere.

The only monkey wrench is that we need to disable the RPTR shadow
for A430 targets, but that only needs to be done once and doesn't
affect A5XX, so we can OR in the value at init time.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-10-28 11:01:38 -04:00
Jordan Crouse
4c7085a5d5 drm/msm: Shadow current pointer in the ring until command is complete
Add a shadow pointer to track the current command being written into
the ring. Don't commit it as 'cur' until the command is submitted.
Because 'cur' is used to construct the software copy of the wptr, this
ensures that somebody peeking in on the ring doesn't assume that a
command is in flight while it is being written. This isn't a huge deal
with a single ring (though technically the hangcheck could assume
the system is prematurely busy when it isn't), but it will be rather
important for preemption, where the decision to preempt is based
on a non-empty ringbuffer. Without a shadow, an aggressive preemption
scheme could assume that the ringbuffer is non-empty and switch to it
before the CPU is done writing the command, and boom.

Even though preemption won't be supported for all targets, because of
the way the code is organized it is simpler to make this generic for
all targets. The extra load for non-preemption targets should be
minimal.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-10-28 11:01:37 -04:00
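
A minimal sketch of the shadow pointer described above, with hypothetical names: writes go through 'next' and only become visible through 'cur' once the command is committed:

  #include <linux/types.h>

  struct example_ring {
          u32 *start;
          u32 *end;
          u32 *cur;     /* committed: observers compute wptr from this */
          u32 *next;    /* shadow: where the CPU is currently writing */
  };

  static inline void example_ring_emit(struct example_ring *ring, u32 data)
  {
          *(ring->next++) = data;       /* not yet visible via 'cur' */
  }

  static inline void example_ring_commit(struct example_ring *ring)
  {
          /* Only now do hangcheck or a preemption decision see the new
           * commands as part of the ring contents.
           */
          ring->cur = ring->next;
  }
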
Jordan Crouse
a6e29a0eea drm/msm: Add a parameter query for the number of ringbuffers
In order to manage ringbuffer priority to its fullest userspace
should know how many ringbuffers it has to work with. Add a
parameter to return the number of active rings.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-10-28 11:01:37 -04:00
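
A minimal sketch of such a parameter query; the param id, struct and ioctl plumbing are hypothetical stand-ins for the driver's existing get_param path:

  #include <linux/errno.h>
  #include <linux/types.h>

  #define EXAMPLE_PARAM_NR_RINGS 0x07   /* hypothetical uapi value */

  struct example_gpu {
          int nr_rings;
  };

  static int example_get_param(struct example_gpu *gpu, u32 param, u64 *value)
  {
          switch (param) {
          case EXAMPLE_PARAM_NR_RINGS:
                  /* Tells userspace how many priority levels it can target. */
                  *value = gpu->nr_rings;
                  return 0;
          default:
                  return -EINVAL;
          }
  }
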
Jordan Crouse
f97decac5f drm/msm: Support multiple ringbuffers
Add the infrastructure to support the idea of multiple ringbuffers.
Assign each ringbuffer an id and use that as an index for the various
ring specific operations.

The biggest delta is to support legacy fences. Each fence gets its own
sequence number but the legacy functions expect to use a unique integer.
To handle this we return a unique identifier for each submission but
map it to a specific ring/sequence under the covers. Newer users use
a dma_fence pointer anyway so they don't care about the actual sequence
ID or ring.

The actual mechanics for multiple ringbuffers are very target specific
so this code just allows for the possibility but still only defines
one ringbuffer for each target family.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-10-28 11:01:36 -04:00
Jordan Crouse
cd414f3d93 drm/msm: Move memptrs to msm_gpu
When we move to multiple ringbuffers we're going to store the data
in the memptrs on a per-ring basis. In order to prepare for that
move the current memptrs from the adreno namespace into msm_gpu.
This is way cleaner and immediately lets us kill off some sub
functions so there is much less cost later when we do move to
per-ring structs.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-10-28 11:01:36 -04:00
Rob Clark
2c41ef1b6f drm/msm/adreno: deal with linux-firmware fw paths
When firmware was added to linux-firmware, it was put in a qcom sub-
directory, unlike what we'd been using before.  For a300_pfp.fw and
a300_pm4.fw symlinks were created, but we'd prefer not to have to do
this in the future.  So add support to look in both places when
loading firmware.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-10-28 11:01:31 -04:00
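
A sketch of the two-path lookup: try the "qcom/" prefixed name used by linux-firmware first, then fall back to the bare name; the helper name is hypothetical:

  #include <linux/device.h>
  #include <linux/err.h>
  #include <linux/firmware.h>
  #include <linux/slab.h>

  static const struct firmware *example_request_fw(struct device *dev, const char *fwname)
  {
          const struct firmware *fw;
          char *newname;
          int ret;

          newname = kasprintf(GFP_KERNEL, "qcom/%s", fwname);
          if (!newname)
                  return ERR_PTR(-ENOMEM);

          /* Prefer the linux-firmware location ... */
          ret = request_firmware(&fw, newname, dev);
          if (ret)
                  /* ... but fall back to the legacy, un-prefixed path. */
                  ret = request_firmware(&fw, fwname, dev);

          kfree(newname);
          return ret ? ERR_PTR(ret) : fw;
  }
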
Rob Clark
e8f3de96a9 drm/msm/adreno: split out helper to load fw
Prep work for the next patch.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-10-28 11:01:31 -04:00
Rob Clark
eec874ce5f drm/msm/adreno: load gpu at probe/bind time
Previously, in an effort to defer initializing the gpu until firmware
was available (ie. rootfs mounted), the gpu was not loaded when the
subdevice was bound. This resulted in clks/etc being requested in a
place where devm couldn't really help unwind if something failed.

Instead move request_firmware() to gpu->hw_init() and construct the gpu
earlier in adreno_bind().  To avoid the rest of the driver needing to
be aware of a gpu that hasn't managed to load firmware and hw_init()
yet, stash the gpu ptr in the adreno device's drvdata, and don't set
priv->gpu until hw_init() succeeds.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-10-28 11:01:31 -04:00
Jordan Crouse
8223286d62 drm/msm: Add a helper function for in-kernel buffer allocations
Nearly all of the in-kernel buffer allocations allocate a buffer object,
virtual address and GPU iova at the same time. Make a helper function to
handle the details.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
[dropped msm_fbdev conversion to new helper, since it interferes with
display-handover work, where we want to separate allocation and mapping]
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-08-22 13:19:17 -04:00
Jordan Crouse
1267a4dfe0 drm/msm: Attach the GPU MMU when it is created
Currently the GPU MMU is attached in the adreno_gpu code but as
more and more of the GPU initialization moves to the generic
GPU path we have a need to map and use GPU memory earlier and
earlier.  There isn't any reason to defer attaching the MMU
until later so attach it right after the address space is
created so it can be used immediately.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-08-22 13:19:15 -04:00
Archit Taneja
541de4c9c9 drm/msm/adreno: Prevent unclocked access when retrieving timestamps
msm_gpu's get_timestamp() op (called by the MSM_GET_PARAM ioctl) can
result in register accesses. We need our power domain and clocks to
be active for that. Make sure they are enabled here.

Signed-off-by: Archit Taneja <architt@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-08-01 19:20:13 -04:00
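
A sketch of the pattern: bracket the register access with a runtime-PM get/put so the power domain and clocks are guaranteed to be on; the counter read itself is a stub:

  #include <linux/device.h>
  #include <linux/pm_runtime.h>

  static u64 example_read_counter(struct device *dev)
  {
          return 0;       /* stands in for the real register read */
  }

  static int example_get_timestamp(struct device *dev, u64 *value)
  {
          int ret;

          ret = pm_runtime_get_sync(dev);
          if (ret < 0) {
                  pm_runtime_put_noidle(dev);
                  return ret;
          }

          *value = example_read_counter(dev);

          pm_runtime_put(dev);
          return 0;
  }
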
Sushmita Susheelendra
0e08270a1f drm/msm: Separate locking of buffer resources from struct_mutex
Buffer object specific resources like pages, domains, sg list
need not be protected with struct_mutex. They can be protected
with a buffer object level lock. This simplifies locking and
makes it easier to avoid potential recursive locking scenarios
for SVM involving mmap_sem and struct_mutex. This also removes
unnecessary serialization when creating buffer objects, and
between buffer object creation and GPU command submission.

Signed-off-by: Sushmita Susheelendra <ssusheel@codeaurora.org>
[robclark: squash in handling new locking for shrinker]
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-06-17 08:03:07 -04:00
Rob Clark
8bdcd949bb drm/msm: pass address-space to _get_iova() and friends
No functional change, that will come later.  But this will make it
easier to deal with dynamically created address spaces (ie. per-
process pagetables for gpu).

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-06-16 11:16:04 -04:00
Rob Clark
cb1e38181a drm/msm: fix locking inconsistency for gpu->hw_init()
Most, but not all, paths were calling this with struct_mutex held.  The
fast-path in msm_gem_get_iova() (plus some sub-code-paths that only run
the first time) was masking this issue.

So let's just always hold struct_mutex for hw_init().  And sprinkle some
WARN_ON()'s and might_lock() to avoid this sort of problem in the
future.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-06-16 11:16:01 -04:00
Jordan Crouse
42a105e9cf drm/msm: Remove memptrs->wptr
memptrs->wptr seems to be unused. Remove it to avoid
confusing the upcoming preemption code.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-06-16 11:16:01 -04:00
Jordan Crouse
5770fc7a56 drm/msm: Add a struct to pass configuration to msm_gpu_init()
The amount of information that we need to pass into msm_gpu_init()
is steadily increasing, so add a new struct to stabilize the function
call and make it easier to add new configuration down the line.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-06-16 11:16:00 -04:00
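
A sketch of the idea: collect the growing set of init parameters into one struct so new fields can be added without touching every caller (field names are illustrative):

  #include <linux/types.h>

  struct example_gpu_config {
          const char *ioname;     /* MMIO region name */
          const char *irqname;    /* interrupt name */
          int nr_rings;           /* how many ringbuffers to create */
          u64 va_start;           /* GPU virtual address range */
          u64 va_end;
  };

  /* A caller fills in only what it cares about and passes a single pointer:
   *
   *   static const struct example_gpu_config a5xx_config = {
   *           .ioname   = "kgsl_3d0_reg_memory",
   *           .irqname  = "kgsl_3d0_irq",
   *           .nr_rings = 4,
   *   };
   *   example_gpu_init(pdev, gpu, &a5xx_config);
   */
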
Jordan Crouse
bf5af4ae87 drm/msm: Hard code the GPU "slow frequency"
Some A3XX and A4XX GPU targets required that the GPU clock be
programmed to a non-zero value when it was disabled, so 27MHz was
chosen as the "invalid" frequency.

Even though newer targets do not have the same clock restrictions
we still write 27MHz on clock disable and expect the clock subsystem
to round down to zero.

For unknown reasons, even though the slow clock speed is always
27MHz and it isn't actually a functional level, the legacy device tree
frequency tables always defined it and then did gymnastics to work
around it.

Instead of playing the same silly games, just hard code the "slow" clock
speed in the code as 27MHz and save ourselves a bit of infrastructure.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-04-08 06:59:37 -04:00
Jordan Crouse
e3689e470f drm/msm: Add MSM_PARAM_GMEM_BASE
User space needs to know where GMEM starts so that it can
set up the addressing correctly.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-04-08 06:59:36 -04:00
Jordan Crouse
ee546cd34a drm/msm: Reference count address spaces
There are reasons for a memory object to outlive the file descriptor
that created it and so the address space that a buffer object is
attached to must also outlive the file descriptor. Reference count
the address space so that it can remain viable until all the objects
have released their addresses.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-04-08 06:59:36 -04:00
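
A hedged sketch of reference counting an address space with a kref so it stays alive until the last buffer object drops its reference:

  #include <linux/kernel.h>
  #include <linux/kref.h>
  #include <linux/slab.h>

  struct example_aspace {
          struct kref kref;
          /* ... iommu domain / page tables, etc ... */
  };

  static void example_aspace_destroy(struct kref *kref)
  {
          struct example_aspace *aspace =
                  container_of(kref, struct example_aspace, kref);

          /* detach the mmu and tear down page tables here */
          kfree(aspace);
  }

  static inline struct example_aspace *example_aspace_get(struct example_aspace *aspace)
  {
          if (aspace)
                  kref_get(&aspace->kref);
          return aspace;
  }

  static inline void example_aspace_put(struct example_aspace *aspace)
  {
          if (aspace)
                  kref_put(&aspace->kref, example_aspace_destroy);
  }
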
Jordan Crouse
9873ef0743 drm/msm: Make sure to detach the MMU during GPU cleanup
We should be detaching the MMU before destroying the address
space. To do this cleanly, the detach has to happen in
adreno_gpu_cleanup() because it needs access to structs
in adreno_gpu.c.  Plus it is better symmetry to have
the attach and detach at the same code level.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-04-08 06:59:36 -04:00
Rob Clark
de098e5fb1 drm/msm/adreno: reset ringbuffer in hw_init
We need to do this also in the resume path when we need to re-hw_init().

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-04-08 06:59:31 -04:00
Rob Clark
eeb754746b drm/msm/gpu: use pm-runtime
We need to use pm-runtime properly when IOMMU is using device_link() to
control its own clocks.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-04-08 06:59:31 -04:00
Rob Clark
c3c3ab199b drm/msm/gpu: move suspend/resume into debugfs->show
Each of the per-generation callbacks was doing this.  Let's just simplify
and move it into the toplevel show() fxn.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-04-08 06:59:31 -04:00
Rob Clark
4e09b95d72 drm/msm: drop quirks binding
This was never documented or used in upstream dtb.  It is used by
downstream bindings from android device kernels.  But the quirks are
a property of the gpu revision, and as such are redundant to be listed
separately in dt.  Instead, move the quirks to the device table.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-02-06 11:28:42 -05:00
Rob Clark
de85d2b35a drm/msm: fix potential null ptr issue in non-iommu case
Fixes: 9cb07b099fb ("drm/msm: support multiple address spaces")
Reported-by: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-01-13 10:23:00 -05:00
Jordan Crouse
88b333b0ed drm/msm: Ensure that the hardware write pointer is valid
Currently the value written to CP_RB_WPTR is calculated on the fly as
(rb->next - rb->start). But as the code is designed rb->next is wrapped
before writing the commands so if a series of commands happened to
fit perfectly in the ringbuffer, rb->next would end up being equal to
rb->size / 4 and thus result in an out of bounds address to CP_RB_WPTR.

The easiest way to fix this is to mask WPTR when writing it to the
hardware; it makes the hardware happy and the rest of the ringbuffer
math appears to work and there isn't any point in upsetting anything.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
[squash in is_power_of_2() check]
Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-12-29 15:02:58 -05:00
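
A sketch of the fix with hypothetical ring fields: because the ring size (in dwords) is a power of two, the modulo keeps the value written to the register in bounds even when the write pointer has just wrapped to the very end:

  #include <linux/log2.h>
  #include <linux/types.h>

  struct example_ring {
          u32 *start;
          u32 *next;
          u32 size;       /* in bytes; the CP works in 32-bit dwords */
  };

  static u32 example_get_wptr(const struct example_ring *ring)
  {
          return (u32)(ring->next - ring->start) % (ring->size / 4);
  }

  /* At ring creation, insisting on a power-of-two size keeps the modulo
   * cheap and the rest of the ring math valid:
   *
   *   if (WARN_ON(!is_power_of_2(size)))
   *           return ERR_PTR(-EINVAL);
   */
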
Jordan Crouse
b5f103ab98 drm/msm: gpu: Add A5XX target support
Add support for the A5XX family of Adreno GPUs.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-11-28 15:14:15 -05:00
Jordan Crouse
4ac277cd9d drm/msm: Disable interrupts during init
Disable the interrupt during the init sequence to avoid having
interrupts fired for errors and other things that we are not
ready to handle while initializing.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-11-28 15:14:14 -05:00
Jordan Crouse
fb03998192 drm/msm: Add adreno_gpu_write64()
Add a new generic function to write a "64" bit value. This isn't
actually a 64-bit operation, it just writes the upper and lower
32 bits of a 64-bit value to specified LO and HI registers.  If
a particular target doesn't support one of the registers it can
mark that register as SKIP and writes/reads from that register
will be quietly dropped.

This can be immediately put in place for the ringbuffer base and
the RPTR address.  Both writes are converted to use
adreno_gpu_write64() with their respective high and low registers
and the high register appropriately marked as SKIP for both 32 bit
targets (a3xx and a4xx). When a5xx comes it will define valid target
registers for the 'hi' option and everything else will just work.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-11-28 15:14:12 -05:00
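
A sketch of the helper described above: two 32-bit writes to LO/HI registers, with "skipped" registers quietly dropped; the register-table details and MMIO layout are hypothetical:

  #include <linux/io.h>
  #include <linux/kernel.h>    /* lower_32_bits() / upper_32_bits() */

  #define EXAMPLE_REG_SKIP 0xffffffff  /* "this target has no such register" */

  struct example_gpu {
          void __iomem *mmio;
  };

  static void example_gpu_write(struct example_gpu *gpu, u32 reg, u32 data)
  {
          if (reg == EXAMPLE_REG_SKIP)
                  return;         /* quietly drop writes to unsupported registers */
          writel(data, gpu->mmio + (reg << 2));
  }

  static void example_gpu_write64(struct example_gpu *gpu, u32 lo, u32 hi, u64 data)
  {
          /* Not an atomic 64-bit write: just the two halves, LO then HI. */
          example_gpu_write(gpu, lo, lower_32_bits(data));
          example_gpu_write(gpu, hi, upper_32_bits(data));
  }
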
Jordan Crouse
c4a8d47560 drm/msm: gpu: Return error on hw_init failure
When the GPU hardware init function fails (like say, ME_INIT timed
out) return error instead of blindly continuing on. This gives us
a small chance of saving the system before it goes boom.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-11-28 15:14:11 -05:00
Rob Clark
398efc46f8 drm/msm/adreno: move scratch register dumping to per-gen code
Scratch registers move, annoyingly enough, in a5xx.  Move to
per-generation aNxx_recover() fxn.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-11-28 15:14:09 -05:00
Rob Clark
667ce33e57 drm/msm: support multiple address spaces
We can have various combinations of 64b and 32b address space, ie. 64b
CPU but 32b display and gpu, or 64b CPU and GPU but 32b display.  So
best to decouple the device iova's from mmap offset.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-11-27 11:23:09 -05:00
Rob Clark
6b597ce2f7 drm/msm: deal with arbitrary # of cmd buffers
For some optimizations coming on the userspace side, splitting larger
draw or gmem cmds into multiple cmdstream buffers, we need to support
much more than the previous small/arbitrary limit.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-07-16 10:09:08 -04:00
Rob Clark
18f23049f6 drm/msm: change gem->vmap() to get/put
Before we can add vmap shrinking, we really need to know which vmap'ings
are currently being used.  So switch to get/put interface.  Stubbed put
fxns for now.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-07-16 10:09:07 -04:00
Rob Clark
69a834c28f drm/msm: deal with exhausted vmap space better
Some, but not all, callers of obj->vmap() would check the return value
with IS_ERR().  So let's actually return an error if vmap() fails.  And
fix up the call-sites that were not handling this properly.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-06-04 14:45:48 -04:00
Rob Clark
1193c3bcb5 drm/msm: drop return from gpu->submit()
At this point, there is nothing left to fail.  And submit already has a
fence assigned and is added to the submit_list.  Any problems from here
on out are asynchronous (ie. hangcheck/recovery).

Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-05-08 10:22:18 -04:00
Rob Clark
2755734390 drm/msm: fix ->last_fence() after recover
It is no longer true that we discard all in-flight submits on recover
(these days we only discard the first one that hung).  After the first
re-submitted batch completes it would overwrite the fence with a correct
value, but there would be a window of time which showed all re-submitted
batches as already complete.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-05-08 10:22:15 -04:00
Rob Clark
b6295f9a38 drm/msm: 'struct fence' conversion
Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-05-08 10:22:15 -04:00
Rob Clark
ca762a8ae7 drm/msm: introduce msm_fence_context
Better encapsulate the per-timeline stuff into fence-context.  For now
there is just a single fence-context, but eventually we'll also have one
per-CRTC to enable fully explicit fencing.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-05-08 10:19:51 -04:00
Rob Clark
6c77d1abe6 drm/msm: add timestamp param
We need this for GL_TIMESTAMP queries.

Note: currently only supported on a4xx.. a3xx doesn't have this
always-on counter.  I think we could emulate it with the one CP
counter that is available, but for now it is of limited usefulness
on a3xx (since we can't seem to do time-elapsed queries in any sane
way with the existing firmware on a3xx, and if you are trying to do
profiling on a tiler you want time-elapsed).  We can add that later
if it becomes useful.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-03-03 11:55:32 -05:00
Craig Stout
7d0c5ee9f0 drm/msm/adreno: get CP_RPTR from register instead of shadow memory
As described in the downstream/kgsl driver:
Sometimes the RPTR shadow memory is unreliable causing timeouts
in adreno_idle().  Read it directly from the register instead.

Signed-off-by: Craig Stout <cstout@chromium.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-03-03 11:55:28 -05:00
Craig Stout
357ff00b08 drm/msm/adreno: support for adreno 430.
Signed-off-by: Craig Stout <cstout@chromium.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2016-03-03 11:55:27 -05:00
Rob Clark
4102a9e532 drm/msm: add max-freq gpu param to uapi
We need this in userspace for interpreting some of the perf ctrs.

Note: possibly not quite sufficient if we had some frequency mgmt
approach other than race-to-idle. Not really sure what the best
thing to do would be if we did, although displaying results as a
percentage of max frequency seems sensible(ish).

Signed-off-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-02-11 06:25:54 +10:00
Rob Clark
d735fdc35b drm/msm: workaround for missing irq on a306/8x16
Signed-off-by: Rob Clark <robdclark@gmail.com>
2015-06-11 13:11:01 -04:00
Rob Clark
6490ad4740 drm/msm: clarify downstream bus scaling
A few spots in the driver have support for downstream android
CONFIG_MSM_BUS_SCALING.  This is mainly to simplify backporting the
driver for various devices which do not have sufficient upstream
kernel support.  But the intentionally dead code seems to cause
some confusion.  Rename the #define to make this more clear.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2015-06-11 13:11:01 -04:00
Rob Clark
2671618551 drm/msm/adreno: dump scratch regs and other info on hang
Dump a bit more info when the GPU hangs, without having hang_debug
enabled (which dumps a *lot* of registers).  Also dump the scratch
registers, as they are useful for determining where in the cmdstream
the GPU hung (and they seem always safe to read when GPU has hung).

Note that the freedreno gallium driver emits increasing counter values
to SCRATCH6 (to identify tile #) and SCRATCH7 (to identify draw #), so
these two in particular can be used to "triangulate" where in the
cmdstream the GPU hung.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2015-06-11 13:11:00 -04:00
Rob Clark
774449ebcb drm/msm: fix locking inconsistencies in gpu->destroy()
In error paths, this was being called without struct_mutex held.
Leading to panics like:

  msm 1a00000.qcom,mdss_mdp: No memory protection without IOMMU
  Kernel panic - not syncing: BUG!
  CPU: 0 PID: 1409 Comm: cat Not tainted 4.0.0-dirty #4
  Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
  Call trace:
  [<ffffffc000089c78>] dump_backtrace+0x0/0x118
  [<ffffffc000089da0>] show_stack+0x10/0x20
  [<ffffffc0006686d4>] dump_stack+0x84/0xc4
  [<ffffffc0006678b4>] panic+0xd0/0x210
  [<ffffffc0003e1ce4>] drm_gem_object_free+0x5c/0x60
  [<ffffffc000402870>] adreno_gpu_cleanup+0x60/0x80
  [<ffffffc0004035a0>] a3xx_destroy+0x20/0x70
  [<ffffffc0004036f4>] a3xx_gpu_init+0x84/0x108
  [<ffffffc0004018b8>] adreno_load_gpu+0x58/0x190
  [<ffffffc000419dac>] msm_open+0x74/0x88
  [<ffffffc0003e0a48>] drm_open+0x168/0x400
  [<ffffffc0003e7210>] drm_stub_open+0xa8/0x118
  [<ffffffc0001a0e84>] chrdev_open+0x94/0x198
  [<ffffffc000199f88>] do_dentry_open+0x208/0x310
  [<ffffffc00019a4c4>] vfs_open+0x44/0x50
  [<ffffffc0001aa26c>] do_last.isra.14+0x2c4/0xc10
  [<ffffffc0001aac38>] path_openat+0x80/0x5e8
  [<ffffffc0001ac354>] do_filp_open+0x2c/0x98
  [<ffffffc00019b60c>] do_sys_open+0x13c/0x228
  [<ffffffc00019b72c>] SyS_openat+0xc/0x18
  CPU1: stopping

But there isn't any particularly good reason to hold struct_mutex for
teardown, so just standardize on calling it without the mutex held and
use the _unlocked() versions for GEM obj unref'ing.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2015-05-15 09:28:27 -04:00
Markus Elfring
5acb07ea80 drm/msm: Deletion of unnecessary checks before the function call "release_firmware"
The release_firmware() function tests whether its argument is NULL
and then returns immediately. Thus the test around the call is not needed.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Reviewed-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2014-12-17 10:59:49 -05:00
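
The before/after shape of this kind of cleanup, with hypothetical fields; release_firmware() already returns immediately for a NULL argument:

  #include <linux/firmware.h>

  struct example_fw {
          const struct firmware *pm4;
          const struct firmware *pfp;
  };

  static void example_cleanup(struct example_fw *fw)
  {
          /* Previously wrapped in "if (fw->pm4)" / "if (fw->pfp)" checks,
           * which the function itself makes redundant.
           */
          release_firmware(fw->pm4);
          release_firmware(fw->pfp);
  }
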
Aravind Ganesan
23bd62fd41 drm/msm: a4xx support for msm-drm
Added a4xx GPU support.

Signed-off-by: Aravind Ganesan <aravindg@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
2014-11-16 14:27:40 -05:00
Aravind Ganesan
91b74e9761 drm/msm: Handle register offset differences between a3xx and a4xx
Register offsets have changed between a3xx and a4xx GPUs.
To be able to access these registers in common code, we create
a lookup table and a set of read/write APIs to access the
registers through the lookup table.

Signed-off-by: Aravind Ganesan <aravindg@codeaurora.org>
[robclark: remove REG_ADRENO_UNDEFINED, just use zero, and minor
tweaks for latest generated headers]
Signed-off-by: Rob Clark <robdclark@gmail.com>
2014-11-16 14:27:39 -05:00
Rob Clark
0122f96fc2 drm/msm/adreno: slight init order cleanup
Move anything that can fail after call to base class msm_gpu_init().
This way, if we fail, active_list has already been initialized so we
don't trip 'WARN_ON(!list_empty(&gpu->active_list))' in
msm_gpu_cleanup().

Signed-off-by: Rob Clark <robdclark@gmail.com>
2014-11-16 14:22:42 -05:00
Rob Clark
3bcefb0497 drm/msm/adreno: push dump/show stuff to base class
Add ptr to list of interesting registers to 'struct adreno_gpu' and use
that to move most of the debugfs show and register dump bits down into
adreno_gpu.  This will avoid duplication as support for additional
adreno generations is added.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2014-09-10 11:19:09 -04:00
Rob Clark
3526e9fb4f drm/msm/adreno: bit of init refactoring
Push a few bits down into adreno_gpu so they won't have to be duplicated
as support for additional adreno generations is added.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2014-09-10 11:19:09 -04:00
Rob Clark
e2550b7a7d drm/msm/adreno: move decision about what gpu to load
Move this into adreno_device, and decide based on gpu revision
rather than just assuming a3xx.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2014-09-10 11:19:08 -04:00
Rob Clark
a1ad352333 drm/msm: fix potential deadlock in gpu init
Somewhere along the way, the firmware loader sprouted another lock
dependency, resulting in possible deadlock scenario:

 &dev->struct_mutex --> &sb->s_type->i_mutex_key#2 --> &mm->mmap_sem

which is problematic vs things like gem mmap.

So introduce a separate mutex to synchronize gpu init.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2014-08-04 11:55:29 -04:00
Rob Clark
944fc36c31 drm/msm: use upstream iommu
Downstream kernel IOMMU had a non-standard way of dealing with multiple
devices and multiple ports/contexts.  We don't need that on upstream
kernel, so rip out the crazy.

Note that we have to move the pinning of the ringbuffer to after the
IOMMU is attached.  No idea how that managed to work properly on the
downstream kernel.

For now, I am leaving the IOMMU port name stuff in place, to simplify
things for folks trying to backport latest drm/msm to device kernels.
Once we no longer have to care about pre-DT kernels, we can drop this
and instead backport upstream IOMMU driver.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2014-08-04 11:55:29 -04:00
Rob Clark
4e1cbaa3eb drm/msm: add chip-id param
Some of the w/a or different behavior of userspace blob driver seem to
be keyed to gpu patch revision, rather than gpu-id.  So expose the full
chip-id to userspace so it can DTRT.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2014-03-31 10:27:46 -04:00
Rob Clark
0963756fe5 drm/msm: spin helper
Helper macro to simplify places where we need to poll with timeout
waiting for gpu.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2014-03-31 10:27:45 -04:00
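
A sketch of a poll-with-timeout helper in this spirit; the 100 ms budget and 1 us busy-wait step are illustrative choices, not the driver's actual values:

  #include <linux/delay.h>
  #include <linux/errno.h>
  #include <linux/jiffies.h>

  /* Evaluate X repeatedly until it is true or the timeout expires. */
  #define example_spin_until(X) ({                                  \
          int __ret = -ETIMEDOUT;                                   \
          unsigned long __t = jiffies + msecs_to_jiffies(100);      \
          do {                                                      \
                  if (X) {                                          \
                          __ret = 0;                                \
                          break;                                    \
                  }                                                 \
                  udelay(1);                                        \
          } while (time_before(jiffies, __t));                      \
          __ret;                                                    \
  })

  /* Usage: ret = example_spin_until(readl(status) & BIT(0)); */
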
Rob Clark
5b6ef08e4b drm/msm: add hang_debug module param
msm.hang_debug=y will dump out current register values if the gpu locks
up, for easier debugging.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2014-03-31 10:27:45 -04:00
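
A sketch of how such a module parameter is declared; the recovery path then checks the flag before dumping registers:

  #include <linux/module.h>
  #include <linux/moduleparam.h>

  static bool hang_debug;
  MODULE_PARM_DESC(hang_debug, "Dump registers when the GPU hangs (can be slow!)");
  module_param(hang_debug, bool, 0600);

  /* In the hang/recovery path:
   *
   *   if (hang_debug)
   *           example_dump_registers(gpu);
   */
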
Rob Clark
5545996817 drm/msm: add a330/apq8x74
Add support for adreno 330.  Not too much different, just a few
differences in initial configuration plus setting OCMEM base.
Userspace support is already in upstream mesa.

Note that the existing DT code is simply using the bindings from
downstream android kernel, to simplify porting of this driver to
existing devices.  These do not constitute any committed/stable
DT ABI.  The addition of proper DT bindings will be a subsequent
patch, at which point (as best as possible) I will try to support
either upstream bindings or what is found in downstream android
kernel, so that existing device DT files can be used.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2014-01-09 14:44:06 -05:00
Rob Clark
871d812aa4 drm/msm: add support for non-IOMMU systems
Add a VRAM carveout that is used for systems which do not have an IOMMU.

The VRAM carveout uses CMA.  The arch code must set up a CMA pool for the
device (preferably in highmem.. a 256m-512m VRAM pool in lowmem is not
cool).  The user can configure the VRAM pool size using msm.vram module
param.

Technically, the abstraction of IOMMU behind msm_mmu is not strictly
needed, but it simplifies the GEM code a bit, and will be useful later
when I add support for a2xx devices with GPUMMU, so I decided to keep
this part.

It appears to be possible to configure the GPU to restrict access to
addresses within the VRAM pool, but this is not done yet.  So for now
the GPU will refuse to load if there is no sort of mmu.  Once address
based limits are supported and tested to confirm that we aren't giving
the GPU access to arbitrary memory, this restriction can be lifted.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2014-01-09 14:38:58 -05:00
Rob Clark
3b57f23b1c drm/msm: add missing MODULE_FIRMWARE()s
Signed-off-by: Rob Clark <robdclark@gmail.com>
2014-01-09 14:38:57 -05:00
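
Declaring the firmware names lets initramfs and packaging tools know which blobs the module may request; the exact file list below is illustrative:

  #include <linux/module.h>

  /* One line per firmware file the driver may load. */
  MODULE_FIRMWARE("a300_pm4.fw");
  MODULE_FIRMWARE("a300_pfp.fw");
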
Rob Clark
26791c48e1 drm/msm: hangcheck harder
If gpu locks up with the rptr shortly beyond the wrap-around point in
the ringbuffer, because the rptr was not reset (but wptr is, by virtue
of resetting rb->cur), we could end up in a scenario where we think
there is not enough space in the ringbuffer for the next cmds.  And
since the CP won't reset rptr until after processing an IB, this leaves
things in a sort of deadlock.

So reset rptr too.  And a bit more spiffing up of hangcheck to make
things easier to debug.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2013-09-10 13:56:59 -04:00
Rob Clark
bd6f82d828 drm/msm: add basic hangcheck/recovery mechanism
A basic, no-frills recovery mechanism in case the gpu gets wedged.  We
could try to be a bit more fancy and restart the next submit after the
one that got wedged, but for now keep it simple.  This is enough to
recover things if, for example, the gpu hangs mid way through a piglit
run.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2013-08-24 14:57:19 -04:00
Rob Clark
7198e6b031 drm/msm: add a3xx gpu support
Add initial support for a3xx 3d core.

So far, with hardware that I've seen to date, we can have:
 + zero, one, or two z180 2d cores
 + a3xx or a2xx 3d core, which share a common CP (the firmware
   for the CP seems to implement some different PM4 packet types
   but the basics of cmdstream submission are the same)

Which means that the eventual complete "class" hierarchy, once
support for all past and present hw is in place, becomes:
 + msm_gpu
   + adreno_gpu
     + a3xx_gpu
     + a2xx_gpu
   + z180_gpu

This commit splits out the parts that will eventually be common
between a2xx/a3xx into adreno_gpu, and the parts that are even
common to z180 into msm_gpu.

Note that there is no cmdstream validation required.  All memory access
from the GPU is via IOMMU/MMU.  So as long as you don't map silly things
to the GPU, there isn't much damage that the GPU can do.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2013-08-24 14:57:18 -04:00