linux/drivers/net/wireless/ath
Brian Norris 38faed1504 ath10k: perform crash dump collection in workqueue
Commit 25733c4e67 ("ath10k: pci: use mutex for diagnostic window CE
polling") introduced a regression where we try to sleep (grab a mutex)
in an atomic context:

[  233.602619] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:254
[  233.602626] in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper/0
[  233.602636] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W         5.1.0-rc2 #4
[  233.602642] Hardware name: Google Scarlet (DT)
[  233.602647] Call trace:
[  233.602663]  dump_backtrace+0x0/0x11c
[  233.602672]  show_stack+0x20/0x28
[  233.602681]  dump_stack+0x98/0xbc
[  233.602690]  ___might_sleep+0x154/0x16c
[  233.602696]  __might_sleep+0x78/0x88
[  233.602704]  mutex_lock+0x2c/0x5c
[  233.602717]  ath10k_pci_diag_read_mem+0x68/0x21c [ath10k_pci]
[  233.602725]  ath10k_pci_diag_read32+0x48/0x74 [ath10k_pci]
[  233.602733]  ath10k_pci_dump_registers+0x5c/0x16c [ath10k_pci]
[  233.602741]  ath10k_pci_fw_crashed_dump+0xb8/0x548 [ath10k_pci]
[  233.602749]  ath10k_pci_napi_poll+0x60/0x128 [ath10k_pci]
[  233.602757]  net_rx_action+0x140/0x388
[  233.602766]  __do_softirq+0x1b0/0x35c
[...]

ath10k_pci_fw_crashed_dump() is called from NAPI contexts, and firmware
memory dumps are retrieved using the diag memory interface.

A simple reproduction case is to run this on QCA6174A /
WLAN.RM.4.4.1-00132-QCARMSWP-1, which happens to be a way to b0rk the
firmware:

  dd if=/sys/kernel/debug/ieee80211/phy0/ath10k/mem_value bs=4K count=1
of=/dev/null

(NB: simulated firmware crashes, via debugfs, don't trigger firmware
dumps.)

The fix is to move the crash-dump into a workqueue context, and avoid
relying on 'data_lock' for most mutual exclusion. We only keep using it
here for protecting 'fw_crash_counter', while the rest of the coredump
buffers are protected by a new 'dump_mutex'.

I've tested the above with simulated firmware crashes (debugfs 'reset'
file), real firmware crashes (the 'dd' command above), and a variety of
reboot and suspend/resume configurations on QCA6174A.

Reported here:
http://lkml.kernel.org/linux-wireless/20190325202706.GA68720@google.com

Fixes: 25733c4e67 ("ath10k: pci: use mutex for diagnostic window CE polling")
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-04-29 17:24:37 +03:00
..
ar5523 ath: Convert timers to use timer_setup() 2017-10-27 16:54:19 +03:00
ath5k ath5k: Remove unused BUG_ON 2018-10-01 18:32:46 +03:00
ath6kl Merge ath-next from git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git 2019-02-08 14:10:35 +02:00
ath9k mac80211: make ieee80211_schedule_txq schedule empty TXQs 2019-04-08 13:31:31 +02:00
ath10k ath10k: perform crash dump collection in workqueue 2019-04-29 17:24:37 +03:00
carl9170 carl9170: clean up a clamp() call 2019-02-19 17:18:14 +02:00
wcn36xx cross-tree: phase out dma_zalloc_coherent() 2019-01-08 07:58:37 -05:00
wil6210 wil6210: check null pointer in _wil_cfg80211_merge_extra_ies 2019-02-28 11:25:09 +02:00
ath.h ath: Remove unnecessary ath_bcast_mac and use eth_broadcast_addr 2018-03-29 12:10:26 +03:00
debug.c
dfs_pattern_detector.c ath: add support to get the detected radar specifications 2018-05-25 13:15:21 +03:00
dfs_pattern_detector.h ath: add support to get the detected radar specifications 2018-05-25 13:15:21 +03:00
dfs_pri_detector.c
dfs_pri_detector.h ath: add support to get the detected radar specifications 2018-05-25 13:15:21 +03:00
hw.c
Kconfig net/wireless: fix spaces and grammar copy/paste in vendor Kconfig help text 2018-03-13 18:52:25 +02:00
key.c
main.c
Makefile License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
reg.h
regd_common.h ath: regd: add extra US coutry codes 2019-02-07 17:02:19 +02:00
regd.c ath: Fix updating radar flags for coutry code India 2017-04-19 17:09:26 +03:00
regd.h ath: regd: add extra US coutry codes 2019-02-07 17:02:19 +02:00
spectral_common.h
trace.c
trace.h