Tegra host supports HW busy detection and timeouts based on the
count programmed in SDHCI_TIMEOUT_CONTROL register and max busy
timeout it supports is 11s in finite busy wait mode.
Some operations like SLEEP_AWAKE, ERASE and flush cache through
SWITCH commands take longer than 11s and Tegra host supports
infinite HW busy wait mode where HW waits forever till the card
is busy without HW timeout.
This patch implements Tegra specific set_timeout sdhci_ops to allow
switching between finite and infinite HW busy detection wait modes
based on the device command expected operation time.
Signed-off-by: Sowjanya Komatineni <skomatineni@nvidia.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/1583941675-9884-1-git-send-email-skomatineni@nvidia.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
When SDHC gets reset (E.g. in runtime suspend path), CQE also gets
reset and goes to disable state. But s/w state still points it as CQE
is in enabled state. Since s/w and h/w states goes out of sync,
it results in s/w request timeout for subsequent CQE requests.
To synchronize CQE s/w and h/w state during SDHC reset,
explicitly deactivate CQE just before SDHC reset.
Signed-off-by: Veerabhadrarao Badiganti <vbadigan@codeaurora.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/1583503724-13943-3-git-send-email-vbadigan@codeaurora.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array member[1][2],
introduced in C99:
struct foo {
int stuff;
struct boo array[];
};
By making use of the mechanism above, we will get a compiler warning
in case the flexible array does not occur last in the structure, which
will help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.
Also, notice that, dynamic memory allocations won't be affected by
this change:
"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]
This issue was found with the help of Coccinelle.
[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 7649773293 ("cxgb3/l2t: Fix undefined behaviour")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/20200226223125.GA20630@embeddedor
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Currently, when use standard tuning, driver default disable DMA just before
send tuning command. But on i.MX8 usdhc, this is not enough. Need also clear
DMA_SEL. If not, once the DMA_SEL select AMDA2 before, even dma already disabled,
when send tuning command, usdhc will still prefetch the ADMA script from wrong
DMA address, then we will see IOMMU report some error which show lack of TLB
mapping.
Signed-off-by: Haibo Chen <haibo.chen@nxp.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/1582100757-20683-7-git-send-email-haibo.chen@nxp.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Since L4.15, community involve the commit 105819c8a545 ("mmc: core: use mrq->sbc
when sending CMD23 for RPMB"), let the usdhc to decide whether to use ACMD23 for
RPMB. This CMD23 for RPMB need to set the bit 31 to its argument, if not, the
RPMB write operation will return general fail.
According to the sdhci logic, SDMA mode will disable the ACMD23, and only in
ADMA mode, it will chose to use ACMD23 if the host support. But according to
debug, and confirm with IC, the imx6qpdl/imx6sx/imx6sl/imx7d do not support
the ACMD23 feature completely. These SoCs only use the 16 bit block count of
the register 0x4 (BLOCK_ATT) as the CMD23's argument in ACMD23 mode, which
means it will ignore the upper 16 bit of the CMD23's argument. This will block
the reliable write operation in RPMB, because RPMB reliable write need to set
the bit31 of the CMD23's argument. This is the hardware limitation. So for
imx6qpdl/imx6sx/imx6sl/imx7d, it need to broke the ACMD23 for eMMC, SD card do
not has this limitation, because SD card do not support reliable write.
For imx6ul/imx6ull/imx6sll/imx7ulp/imx8, it support the ACMD23 completely, it
change to use the 0x0 register (DS_ADDR) to put the CMD23's argument in ADMA mode.
This patch add a new flag ESDHC_FLAG_BROKEN_AUTO_CMD23, and add this flag to
imx6q/imx6sx/imx6sl/imx7d, and due to the imx6sll share the same compatible string
with imx6sx before, and imx6sll do not has this limitation, so add a new compatible
string for imx6sll.
Signed-off-by: Haibo Chen <haibo.chen@nxp.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/1582100757-20683-4-git-send-email-haibo.chen@nxp.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
After set the STROBE SLV delay target value, it need to wait some
time to let the usdhc lock the REF and SLV clock. In normal case,
1~2us is enough for imx8/imx6 and imx7d, and 4~5us is enough for
imx7ulp, but when do reboot stress test or do the bind/unbind stress
test, sometimes need to wait about 10us to get the status lock.
This patch optimize delay handle method, only print the warning
message if the status is still not lock after 1ms delay.
Signed-off-by: Haibo Chen <haibo.chen@nxp.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/1582100757-20683-3-git-send-email-haibo.chen@nxp.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
strobe-dll-delay-target is the delay cell add on the strobe line.
Strobe line the the uSDHC loopback read clock which is use in HS400
mode. Different strobe-dll-delay-target may need to set for different
board/SoC. If this delay cell is not set to an appropriate value,
we may see some read operation meet CRC error after HS400 mode select
which already pass the tuning.
This patch add the strobe-dll-delay-target setting in driver, so that
user can easily config this delay cell in dts file.
Signed-off-by: Haibo Chen <haibo.chen@nxp.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/1582100757-20683-1-git-send-email-haibo.chen@nxp.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
When pm_runtime_suspend is run, a call to SCFW power off the SS (SS is a
power domain, usdhc belong to this SS power domain) in which the resource
resides is made. The SCFW can power off the SS if no other resource in
active in that SS. If so, all state associated with all the resources within
the SS that is powered off is lost, this includes the clock rates, clock state
etc. When pm_runtime_resume is called, the SS associated with that resource
is powered up. But the clocks are left in the default state.
This patch restore clock rate in pm_runtime_resume, make sure the
clock is right rather than depending on the default state setting
by SCFW.
Signed-off-by: Haibo Chen <haibo.chen@nxp.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/1582100563-20555-5-git-send-email-haibo.chen@nxp.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
For Mega/Mix enabled SoCs like MX7D and MX6SX, uSDHC will lost power in
LP mode no matter whether the MMC_KEEP_POWER flag is set or not.
This may cause state misalign between kernel and HW, especially for
SDIO3.0 WiFi cards.
e.g. SDIO WiFi driver usually will keep power during system suspend.
And after resume, no card re-enumeration called.
But the tuning state is lost due to Mega/Mix.
Then CRC error may happen during next data transfer.
So we should always fire a mmc_retune_needed() for such type SoC
to tell MMC core retuning is needed for next data transfer.
mmc: sdhci-esdhci-imx: retune needed for Mega/Mix enabled SoCs
Signed-off-by: Haibo Chen <haibo.chen@nxp.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/1582100563-20555-4-git-send-email-haibo.chen@nxp.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
A variant may need to define some actions before and after a voltage
switch.
This patch adds 2 callbacks to manage signal voltage switch in the struct
mmci_host_ops. ->pre_sig_volt_switch() allows to prepare a signal voltage
switch before sending the SD_SWITCH_VOLTAGE command (CMD11).
->post_sig_volt_switch callback allows specific actions to be executed,
after the I/O signal voltage level has been changed.
Signed-off-by: Ludovic Barre <ludovic.barre@st.com>
Link: https://lore.kernel.org/r/20200128090636.13689-8-ludovic.barre@st.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
The hardware delay block is used to align the sampling clock on the data
received by SDMMC. It is mandatory for SDMMC to support the SDR104 mode.
The delay block is used to generate an output clock which is dephased from
the input clock. The phase of the output clock must be programmed by the
execute tuning interface.
Signed-off-by: Ludovic Barre <ludovic.barre@st.com>
Link: https://lore.kernel.org/r/20200128090636.13689-7-ludovic.barre@st.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
sg_dma_xxx should be used after a dma_map_sg call has been done to get bus
addresses of each of the SG entries and their lengths. But mmci_host_ops
validate_data can be called before dma_map_sg. This patch replaces theses
macros by sg->offset and sg->length which are always defined.
Signed-off-by: Ludovic Barre <ludovic.barre@st.com>
Link: https://lore.kernel.org/r/20200128090636.13689-2-ludovic.barre@st.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
When using the host software queue, it will trigger the next request in
irq handler without a context switch. But the sdhci_request() can not be
called in interrupt context when using host software queue for some host
drivers, due to the get_cd() ops can be sleepable.
But for some host drivers, such as Spreadtrum host driver, the card is
nonremovable, so the get_cd() ops is not sleepable, which means we can
complete the data request and trigger the next request in irq handler
to remove the context switch for the Spreadtrum host driver.
As suggested by Adrian, we should introduce a request_atomic() API to
indicate that a request can be called in interrupt context to remove
the context switch when using mmc host software queue. But this should
be done in another thread to convert the users of mmc host software queue.
Thus we can introduce a variable in struct sdhci_host to indicate that
we will always to defer to complete requests when using the host software
queue.
Suggested-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
Signed-off-by: Baolin Wang <baolin.wang7@gmail.com>
Link: https://lore.kernel.org/r/e693e7a29beb3c1922b333f4603ea81f43d5c5b1.1581478568.git.baolin.wang7@gmail.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Now the MMC read/write stack will always wait for previous request is
completed by mmc_blk_rw_wait(), before sending a new request to hardware,
or queue a work to complete request, that will bring context switching
overhead and spend some extra time to poll the card for busy completion
for I/O writes via sending CMD13, especially for high I/O per second
rates, to affect the IO performance.
Thus this patch introduces MMC software queue interface based on the
hardware command queue engine's interfaces, which is similar with the
hardware command queue engine's idea, that can remove the context
switching. Moreover we set the default queue depth as 64 for software
queue, which allows more requests to be prepared, merged and inserted
into IO scheduler to improve performance, but we only allow 2 requests
in flight, that is enough to let the irq handler always trigger the
next request without a context switch, as well as avoiding a long latency.
Moreover the host controller should support HW busy detection for I/O
operations when enabling the host software queue. That means, the host
controller must not complete a data transfer request, until after the
card stops signals busy.
From the fio testing data in cover letter, we can see the software
queue can improve some performance with 4K block size, increasing
about 16% for random read, increasing about 90% for random write,
though no obvious improvement for sequential read and write.
Moreover we can expand the software queue interface to support MMC
packed request or packed command in future.
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
Signed-off-by: Baolin Wang <baolin.wang7@gmail.com>
Link: https://lore.kernel.org/r/4409c1586a9b3ed20d57ad2faf6c262fc3ccb6e2.1581478568.git.baolin.wang7@gmail.com
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>