Add a function mmc_detect_card_removed() which upper layers can use to
determine immediately if a card has been removed. This function should
be called after an I/O request fails so that all queued I/O requests
can be errored out immediately instead of waiting for the card device
to be removed.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Sujit Reddy Thumma <sthumma@codeaurora.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
This patch adds support for sdio UHS cards per the version 3.0
spec.
UHS mode is only enabled for version 3.0 cards when both the
host and the controller support UHS modes.
1.8v signaling support is removed if both the card and the
host do not support UHS. This is done to maintain
compatibility and some system/card combinations break when
1.8v signaling is enabled when the host does not support UHS.
Signed-off-by: Philip Rakity <prakity@marvell.com>
Signed-off-by: Aaron Lu <Aaron.lu@amd.com>
Reviewed-by: Arindam Nath <arindam.nath@amd.com>
Tested-by: Bing Zhao <bzhao@marvell.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
Current clock gating framework disables the MCI clock as soon as the
request is completed and enables it when a request arrives. This aggressive
clock gating framework, when enabled, cause following issues:
When there are back-to-back requests from the Queue layer, we unnecessarily
end up disabling and enabling the clocks between these requests since 8MCLK
clock cycles is a very short duration compared to the time delay between
back to back requests reaching the MMC layer. This overhead can effect the
overall performance depending on how long the clock enable and disable
calls take which is platform dependent. For example on some platforms we
can have clock control not on the local processor, but on a different
subsystem and the time taken to perform the clock enable/disable can add
significant overhead.
Also if the host controller driver decides to disable the host clock too
when mmc_set_ios function is called with ios.clock=0, it adds additional
delay and it is highly possible that the next request had already arrived
and unnecessarily blocked in enabling the clocks. This is seen frequently
when the processor is executing at high speeds and in multi-core platforms
thus reduces the overall throughput compared to if clock gating is
disabled.
Fix this by delaying turning off the clocks by posting request on
delayed workqueue. Also cancel the unscheduled pending work, if any,
when there is access to card.
sysfs entry is provided to tune the delay as needed, default
value set to 200ms.
Signed-off-by: Sujit Reddy Thumma <sthumma@codeaurora.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
This patch is to expose the actual SDCLK frequency in
/sys/kernel/debug/mmcX/ios entry.
For example, if the max clk for a normal speed card is 20MHz this
is reported in /sys/kernel/debug/mmcX/ios. Unfortunately the actual
SDCLK frequency (i.e. Baseclock / divisor) is not reported at all:
for example, in that case, on Arasan HC, it should be 48/4=12 (MHz).
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
This patch allows any block size to be set on the SDIO link,
and still have an arbitrary sized packet (adjusted in size by
using sdio_align_size) transferred in an optimal way
(preferably one transfer).
Previously if the block size was larger than the default of
512 bytes and the transfer size was exactly one block size
(possibly thanks to using sdio_align_size to get an optimal
transfer size), it was sent as a number of byte transfers instead
of one block transfer. Also if the number of blocks was
(max_blocks * N) + 1, the tranfer would be conducted with a number
of blocks and finished off with a number of byte transfers.
When doing this change it was also possible to break out the quirk
for broken byte mode in a much cleaner way, and collect the logic of
when to do byte or block transfer in one function instead of two.
Signed-off-by: Stefan Nilsson XK <stefan.xk.nilsson@stericsson.com>
Signed-off-by: Ulf Hansson <ulf.hansson@stericsson.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
Add new macros for the high speed 50MHz case, rather than having
a confusing reuse of the value for UHS SDR50, which is 100MHz.
Reported-by: Aaron Lu <aaron.lu@amd.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
When SDIO runtime PM was originally introduced, we immediately faced
two regressions with two different chipsets, and in response decided
not to enable it by default.
With the recent work on the 8686 we hoped we found all the gotchas,
so 08da834 did make sense (at least experimentally).
Unfortunately we now see that some setups out there still refuse to
work when SDIO runtime PM is enabled by default (see
http://www.spinics.net/lists/linux-mmc/msg11161.html), and obviously
we can't live with these kind of regressions.
This reverts commit 08da834a24.
Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
Cc: Daniel Drake <dsd@laptop.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
mmc_suspend_host() tries to claim host during suspend
and release it only when the bus suspend operation is
compeleted. If CONFIG_MMC_UNSAFE_RESUME is defined and
the host is flagged as removable, mmc_suspend_host()
tries to remove the card. In this process, the file system
sync can get blocked trying to acquire host which is already
claimed by mmc_suspend_host() causing deadlock.
Fix this deadlock by releasing host before ->remove() is called.
Signed-off-by: Sujit Reddy Thumma <sthumma@codeaurora.org>
Acked-by: Ulf Hansson <ulf.hansson@stericsson.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
Fix wrong bus_ops->sleep check. (This isn't expected to have real-world
consequences, because the mmc core always defines both 'awake' and
'sleep' ops.)
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
The eMMC 4.5 devices respond to only RESET and AWAKE command in the
sleep state. Hence the mmc switch command to notify power off state
should be sent before the device enters sleep state.
This patch fixes the same.
Signed-off-by: Girish K S <girish.shivananjappa@linaro.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
This patch skips the setting of the power notify state variable
for non eMMC 4.5 devices. Also fixes the problem of omap_hsmmc
noisy/broken for suspend resume reported by Kevin Hilman.
Signed-off-by: Girish K S <girish.shivananjappa@linaro.org>
Acked-by: Ulf Hansson <ulf.hansson@stericsson.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
Adds a quirk that sets the data read timeout to a fixed value instead
of relying on the information in the CSD. The timeout value chosen
is 300ms since that has proven enough for the problematic cards found,
but could be increased if other cards require this.
This patch also enables this quirk for certain Micron cards known to
have this problem.
Signed-off-by: Stefan Nilsson XK <stefan.xk.nilsson@stericsson.com>
Signed-off-by: Ulf Hansson <ulf.hansson@stericsson.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Cc: <stable@kernel.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
Once the implicit use of module.h is prevented, these files will
fail to find the stat.h header content.
Fix up the implicit usage expectations in advance of the cleanup.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
These two basic defines were everywhere, simply because module.h
was also everywhere. But we are cleaning up the latter. So make
the exporters actually call out their need for the include.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Code cleanup, putting all eMMC 4.5 detection cases together.
This patch removes one if-statement and assembles all. And it also
removes variable initialization below else-statement -- all members
of card structure are already set to zero at card-init.
Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
While trying to suspend the mmc host there could still be
ongoing requests that we need to wait for. At the same time
a device driver must respond to a suspend request rather quickly.
Instead of potentially wait "forever" by claiming the host we now
"try" to claim the host instead. If it fails, -EBUSY is returned.
Signed-off-by: Ulf Hansson <ulf.hansson@stericsson.com>
Reviewed-by: Sujit Reddy Thumma <sthumma@codeaurora.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
Table 6-2: CCCR bit Definitions, address 00h. Part E1 SDIO Simplified
Specification Version 3.00, Feb. 25, 2011.
This patch has been tested with Marvell WLAN device SD8797.
Signed-off-by: Bing Zhao <bzhao@marvell.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
Here is Essential conditions to indicate Version 3.00 Card
(SD_SPEC=2 and SD_SPEC3=1) :
(1) The card shall support CMD6
(2) The card shall support CMD8
(3) The card shall support CMD42
(4) User area capacity shall be up to 2GB (SDSC) or 32GB (SDHC)
User area capacity shall be more than or equal to 32GB and
up to 2TB (SDXC)
(5) Speed Class shall be supported (SDHC or SDXC)
So even if SD card doesn't support any of the newly defined
UHS-I bus speed mode, it can advertise itself as SD3.0 cards
as long as it supports all the essential conditions of
SD3.0 cards. Given this, these type of cards should atleast
run in High Speed mode @50MHZ if it supports HS.
But current initialization sequence for SD3.0 cards is
such that these non-UHS-I SD3.0 cards runs in Default
Speed mode @25MHz.
This patch makes sure that these non-UHS-I SD3.0 cards run
in High Speed Mode @50MHz.
Tested this patch with SanDisk Extreme SDHC 8GB Class 10 card.
Reported-by: "Hiremath, Vaibhav" <hvaibhav@ti.com>
Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
HPI command is defined in eMMC4.41.
This feature is important for eMMC4.5 devices.
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
This patch adds cache feature of eMMC4.5 Spec.
If device supports cache capability, host can utilize some specific
operations.
Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com>
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
This patch will apply the generic CMD6 timeout to switch command
for power class.
Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
MMC v4.5 supports the DISCARD feature (CMD38). It's different from
trim and there's no check bit. Currently it's only supported at v4.5.
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
In the v4.5, there's no secure erase & trim support.
Instead it supports the sanitize feature.
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
This patch adds support for the power off notify feature, available in
eMMC 4.5 devices. If the host has support for this feature, then the
mmc core will notify the device by setting the POWER_OFF_NOTIFICATION
byte in the extended csd register with a value of 1 (POWER_ON).
For suspend mode short timeout is used, whereas for the normal poweroff
long timeout is used.
Signed-off-by: Girish K S <girish.shivananjappa@linaro.org>
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
All the files using printk function for displaying kernel messages
in the mmc driver have been replaced with corresponding macro.
Signed-off-by: Girish K S <girish.shivananjappa@linaro.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
EXT_CSD[248] includes the default maximum timeout for CMD6.
This field is added at eMMC4.5 Spec. And it can be used for default
timeout except for some operations which don't define the timeout
(i.e. background operation, sanitize, flush cache) in eMMC4.5 Spec.
Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com>
Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
mmc_request_done() is sometimes called from interrupt or other atomic
context. Mostly all mmc_request_done() does is complete(), however it
contains code to retry on error, which uses ->request(). As the error
path is certainly not performance critical, this may be moved to the
waiting function mmc_wait_for_req_done().
This allows ->request() to use runtime PM get_sync() and guarantee it
is never in an atomic context.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ulf Hansson <ulf.hansson@stericsson.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
Commit "mmc: add module param to set fault injection attributes" adds
a module_param to this file. But it is relying on the old implicit
"module.h is everywhere" behaviour, and without the explicit include
of moduleparam.h, the pending module.h split up produces this error:
core/debugfs.c:28:35: error: expected ')' before numeric constant
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
It allows gerneral purpose partitions in MMC Device. And I try to simply
make mmc_blk_alloc_parts using mmc_part structure suggested by Andrei
Warkentin. After patching, we see general purpose partitions like this:
> cat /proc/partitions
179 0 847872 mmcblk0
179 192 4096 mmcblk0gp3
179 160 4096 mmcblk0gp2
179 128 4096 mmcblk0gp1
179 96 1052672 mmcblk0gp0
179 64 1024 mmcblk0boot1
179 32 1024 mmcblk0boot0
Signed-off-by: Namjae Jeon <linkinjeon@gmail.com>
Acked-by: Andrei Warkentin <awarkentin@vmware.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
f39b2dd9d ("mmc: core: Bus width testing needs to handle suspend/resume")
added code to only compare read-only ext_csd fields in bus width testing
code, yet it's comparing some fields that are never set.
The affected fields are ext_csd.raw_erased_mem_count and
ext_csd.raw_partition_support.
Signed-off-by: Andrei Warkentin <andrey.warkentin@gmail.com>
Acked-by: Philip Rakity <prakity@marvell.com>
Cc: <stable@kernel.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
This patch adds the power class selection feature available for mmc
versions 4.0 and above. During the enumeration stage before switching
to the lower data bus, check if the power class is supported for the
current bus width. If the power class is available then switch to the
power class and use the higher data bus. If power class is not supported
then switch to the lower data bus in a worst case.
Signed-off-by: Girish K S <girish.shivananjappa@linaro.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
Replace setup("fail_mmc_request") and faulty "ifdef KERNEL" with
a simple module_param(). The module param mmc_core.fail_request
may be used to set the fault injection attributes during boot time
or module load time.
Signed-off-by: Per Forlin <per.forlin@linaro.org>
Reviewed-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
This is a minor fix. It makes mmc_ios_show print proper string when the
host's timing is one of the newly added UHS-I modes.
Signed-off-by: Aaron Lu <aaron.lu@amd.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
For cards that support hardware reset (just eMMC), try a reset and
retry before returning an I/O error. However this is not done for
ECC errors and is never done twice for the same operation type
(READ, WRITE, DISCARD, SECURE DISCARD) until that type of operation
again succeeds.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
eMMC's may have a hardware reset line. This patch provides a
host controller operation to implement hardware reset and
a function to reset and reinitialize the card. Also, for MMC,
the reset is always performed before initialization.
The host must set the new host capability MMC_CAP_HW_RESET
to enable hardware reset.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
The err condition in post_req() is set to undo a call made to pre_req()
that hasn't been started yet. The err condition is not set if an MMC
request returns an error.
Signed-off-by: Per Forlin <per.forlin@linaro.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
Earlier all cards where initiated with bus mode set as OPENDRAIN, and then
later switched to PUSHPULL. According to the MMC/SD/SDIO specifications
only MMC cards use OPENDRAIN during init. For both SD and SDIO the bus
mode shall be PUSHPULL before attempting to init the card.
The consequence of having incorrect bus mode can lead to not being able
to detect the card. Therefore the default behavior have now been changed
to PUSHPULL in mmc_power_up, and will only be temporarily switched when
trying to attach or init a MMC card.
Signed-off-by: Stefan Nilsson XK <stefan.xk.nilsson@stericsson.com>
Signed-off-by: Ulf HANSSON <ulf.hansson@stericsson.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
Adds a quirk which can be turned on for SDIO devices that do not support
512 byte requests in byte mode during CMD53. These requests will always
be sent in block mode instead.
This patch also enables this quirk for ST-Ericsson CW1200 WLAN device.
Signed-off-by: Stefan Nilsson XK <stefan.xk.nilsson@stericsson.com>
Signed-off-by: Ulf HANSSON <ulf.hansson@stericsson.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
During a rescan operation mmc_attach(sd|mmc|sdio) functions are
called. The error handling in these function can trigger a detach
of the bus, which also meant a power off. This is not notified by
the rescan operation which then continues to the next attach function.
If a power off has been done, the framework must never send any
new commands to the host driver, without first doing a new power up.
This will most likely trigger any host driver to hang.
Moving power off out of detach and instead handle power off
separately when it is actually needed, solves the issue.
Signed-off-by: Ulf Hansson <ulf.hansson@stericsson.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Cc: <stable@kernel.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
Put MMC to sleep if it supports SLEEP/AWAKE (CMD5) in the mmc suspend
so that Vcc (NAND core) can be cut to minimize power consumption.
eMMC put into SLEEP can respond to CMD0 or H/W reset or CMD5.
Current implemention on resume from suspend relies on CMD0 in
mmc_init_card to get out of SLEEP mode.
Signed-off-by: Balaji T K <balajitk@ti.com>
Acked-by: Venkatraman S <svenkatr@ti.com>
Reviewed-by: Subhash Jadavani <subhashj@codeaurora.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
Stress-testing the runtime power management of libertas_sdio
through a rmmod/insmod loop revealed that it is quite easy to
cause an ETIMEDOUT failure in mmc_sdio_power_restore() leading to:
libertas_sdio: probe of mmc1:0001:1 failed with error -16
Experimentation shows that a very short delay (100us) is needed in
the power down path before the card can be successfully booted again.
We know that this setup is lacking poweroff clamps on the card's power
lines, but as only a short delay is needed, apply this unconditionally.
Also bump up to 1ms sleep for extra legroom.
Signed-off-by: Daniel Drake <dsd@laptop.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
Fix the sparse warning output "warning: Using plain integer as NULL pointer"
Signed-off-by: Venkatraman S <svenkatr@ti.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
Now that we have improved the runtime power management powerup/powerdown
code, we believe that MMC_CAP_POWER_OFF_CARD is no longer necessary:
runtime PM should now work everywhere.
The only hard evidence for introducing MMC_CAP_POWER_OFF_CARD was the
Marvell sd8686 wifi chip, which was believed to require external gpio
manipulation which wasn't supported by some boards.
After further investigation it was realized (and confirmed by Marvell
folks) that sd8686 requirements can be fulfilled by changing the reset
sequence itself, even if no external gpio is manipulated.
For further information, see the following thread:
http://www.mail-archive.com/linux-mmc@vger.kernel.org/msg04289.html
Enable this trivially for a release or two. If no problems are reported,
we will follow up with a more extensive patch to remove this flag
altogether. If problems are reported, we can look at whitelist/blacklist
possibilities as before.
Signed-off-by: Daniel Drake <dsd@laptop.org>
Acked-by: Ohad Ben-Cohen <ohad@wizery.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
This adds support to inject data errors after a completed host transfer.
The mmc core will return error even though the host transfer is successful.
This simple fault injection proved to be very useful to test the
non-blocking error handling in the mmc_blk_issue_rw_rq().
Random faults can also test how the host driver handles pre_req()
and post_req() in case of errors.
Signed-off-by: Per Forlin <per.forlin@linaro.org>
Acked-by: Akinobu Mita <akinobu.mita@gmail.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
mmc_sd_init_uhs_card function sets the driver type, current limit
and bus speed mode on card as well as on host controller side.
Currently bus speed mode is set by sending CMD6 to card and
immediately setting the timing mode in host controller. But
then before initiating tuning sequence, it also tries to set
current limit by sending CMD6 to card which results in data
timeout errors in controller if bus speed mode is SDR50/SDR104 mode.
So basically bus speed mode should be set only after current limit
is set in the card and immediately after setting the bus speed mode,
tuning sequence should be initiated.
Signed-off-by: Subhash Jadavani <subhashj@codeaurora.org>
Reviewed-by: Arindam Nath <arindam.nath@amd.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
The default multithread workqueue can cause the same work to be executed
concurrently on a different CPUs. This isn't really suitable for clock
gating as it might already gated the clock and gating it twice results both
host->clk_old and host->ios.clock to be set to 0.
To prevent this from happening we use system_nrt_wq instead.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Tested-by: Chris Ball <cjb@laptop.org>
Cc: <stable@kernel.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
We have seen at least two different races when clock gating kicks in in a
middle of ios structure update.
First one happens when ios->clock is changed outside of aggressive clock
gating framework, for example via mmc_set_clock(). The race might happen
when we run following code:
mmc_set_ios():
...
if (ios->clock > 0)
mmc_set_ungated(host);
Now if gating kicks in right after the condition check we end up setting
host->clk_gated to false even though we have just gated the clock. Next
time a request is started we try to ungate and restore the clock in
mmc_host_clk_hold(). However since we have host->clk_gated set to false the
original clock is not restored.
This eventually will cause the host controller to hang since its clock is
disabled while we are trying to issue a request. For example on Intel
Medfield platform we see:
[ 13.818610] mmc2: Timeout waiting for hardware interrupt.
[ 13.818698] sdhci: =========== REGISTER DUMP (mmc2)===========
[ 13.818753] sdhci: Sys addr: 0x00000000 | Version: 0x00008901
[ 13.818804] sdhci: Blk size: 0x00000000 | Blk cnt: 0x00000000
[ 13.818853] sdhci: Argument: 0x00000000 | Trn mode: 0x00000000
[ 13.818903] sdhci: Present: 0x1fff0000 | Host ctl: 0x00000001
[ 13.818951] sdhci: Power: 0x0000000d | Blk gap: 0x00000000
[ 13.819000] sdhci: Wake-up: 0x00000000 | Clock: 0x00000000
[ 13.819049] sdhci: Timeout: 0x00000000 | Int stat: 0x00000000
[ 13.819098] sdhci: Int enab: 0x00ff00c3 | Sig enab: 0x00ff00c3
[ 13.819147] sdhci: AC12 err: 0x00000000 | Slot int: 0x00000000
[ 13.819196] sdhci: Caps: 0x6bee32b2 | Caps_1: 0x00000000
[ 13.819245] sdhci: Cmd: 0x00000000 | Max curr: 0x00000000
[ 13.819292] sdhci: Host ctl2: 0x00000000
[ 13.819331] sdhci: ADMA Err: 0x00000000 | ADMA Ptr: 0x00000000
[ 13.819377] sdhci: ===========================================
[ 13.919605] mmc2: Reset 0x2 never completed.
and it never recovers.
Second race might happen while running mmc_power_off():
static void mmc_power_off(struct mmc_host *host)
{
host->ios.clock = 0;
host->ios.vdd = 0;
[ clock gating kicks in here ]
/*
* Reset ocr mask to be the highest possible voltage supported for
* this mmc host. This value will be used at next power up.
*/
host->ocr = 1 << (fls(host->ocr_avail) - 1);
if (!mmc_host_is_spi(host)) {
host->ios.bus_mode = MMC_BUSMODE_OPENDRAIN;
host->ios.chip_select = MMC_CS_DONTCARE;
}
host->ios.power_mode = MMC_POWER_OFF;
host->ios.bus_width = MMC_BUS_WIDTH_1;
host->ios.timing = MMC_TIMING_LEGACY;
mmc_set_ios(host);
}
If the clock gating worker kicks in while we are only partially updated the
ios structure the host controller gets incomplete ios and might not work as
supposed. Again on Intel Medfield platform we get:
[ 4.185349] kernel BUG at drivers/mmc/host/sdhci.c:1155!
[ 4.185422] invalid opcode: 0000 [#1] PREEMPT SMP
[ 4.185509] Modules linked in:
[ 4.185565]
[ 4.185608] Pid: 4, comm: kworker/0:0 Not tainted 3.0.0+ #240 Intel Corporation Medfield/iCDKA
[ 4.185742] EIP: 0060:[<c136364e>] EFLAGS: 00010083 CPU: 0
[ 4.185827] EIP is at sdhci_set_power+0x3e/0xd0
[ 4.185891] EAX: f5ff98e0 EBX: f5ff98e0 ECX: 00000000 EDX: 00000001
[ 4.185970] ESI: f5ff977c EDI: f5ff9904 EBP: f644fe98 ESP: f644fe94
[ 4.186049] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[ 4.186125] Process kworker/0:0 (pid: 4, ti=f644e000 task=f644c0e0 task.ti=f644e000)
[ 4.186219] Stack:
[ 4.186257] f5ff98e0 f644feb0 c1365173 00000282 f5ff9460 f5ff96e0 f5ff96e0 f644feec
[ 4.186418] c1355bd8 f644c0e0 c1499c3d f5ff96e0 f644fed4 00000006 f5ff96e0 00000286
[ 4.186579] f644fedc c107922b f644feec 00000286 f5ff9460 f5ff9700 f644ff10 c135839e
[ 4.186739] Call Trace:
[ 4.186802] [<c1365173>] sdhci_set_ios+0x1c3/0x340
[ 4.186883] [<c1355bd8>] mmc_gate_clock+0x68/0x120
[ 4.186963] [<c1499c3d>] ? _raw_spin_unlock_irqrestore+0x4d/0x60
[ 4.187052] [<c107922b>] ? trace_hardirqs_on+0xb/0x10
[ 4.187134] [<c135839e>] mmc_host_clk_gate_delayed+0xbe/0x130
[ 4.187219] [<c105ec09>] ? process_one_work+0xf9/0x5b0
[ 4.187300] [<c135841d>] mmc_host_clk_gate_work+0xd/0x10
[ 4.187379] [<c105ec82>] process_one_work+0x172/0x5b0
[ 4.187457] [<c105ec09>] ? process_one_work+0xf9/0x5b0
[ 4.187538] [<c1358410>] ? mmc_host_clk_gate_delayed+0x130/0x130
[ 4.187625] [<c105f3c8>] worker_thread+0x118/0x330
[ 4.187700] [<c1496cee>] ? preempt_schedule+0x2e/0x50
[ 4.187779] [<c105f2b0>] ? rescuer_thread+0x1f0/0x1f0
[ 4.187857] [<c1062cf4>] kthread+0x74/0x80
[ 4.187931] [<c1062c80>] ? __init_kthread_worker+0x60/0x60
[ 4.188015] [<c149acfa>] kernel_thread_helper+0x6/0xd
[ 4.188079] Code: 81 fa 00 00 04 00 0f 84 a7 00 00 00 7f 21 81 fa 80 00 00 00 0f 84 92 00 00 00 81 fa 00 00 0
[ 4.188780] EIP: [<c136364e>] sdhci_set_power+0x3e/0xd0 SS:ESP 0068:f644fe94
[ 4.188898] ---[ end trace a7b23eecc71777e4 ]---
This BUG() comes from the fact that ios.power_mode was still in previous
value (MMC_POWER_ON) and ios.vdd was set to zero.
We prevent these by inhibiting the clock gating while we update the ios
structure.
Both problems can be reproduced by simply running the device in a reboot
loop.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Tested-by: Chris Ball <cjb@laptop.org>
Cc: <stable@kernel.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
As per suggestion by Linus Walleij:
> If you think the names of the functions are confusing then
> you may rename them, say like this:
>
> mmc_host_clk_ungate() -> mmc_host_clk_hold()
> mmc_host_clk_gate() -> mmc_host_clk_release()
>
> Which would make the usecases more clear
(This is CC'd to stable@ because the next two patches, which fix
observable races, depend on it.)
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Cc: <stable@kernel.org>
Signed-off-by: Chris Ball <cjb@laptop.org>