Currently, _everything_ in cfg80211 holds the RTNL, and if you
have a slow USB device (or a few) you can get some bad lock
contention on that.
Fix that by re-adding a mutex to each wiphy/rdev as we had at
some point, so we have locking for the wireless_dev lists and
all the other things in there, and also so that drivers still
don't have to worry too much about it (they still won't get
parallel calls for a single device).
Then, we can restrict the RTNL to a few cases where we add or
remove interfaces and really need the added protection. Some
of the global list management still also uses the RTNL, since
we need to have it anyway for netdev management, but we only
hold the RTNL for very short periods of time here.
Link: https://lore.kernel.org/r/20210122161942.81df9f5e047a.I4a8e1a60b18863ea8c5e6d3a0faeafb2d45b2f40@changeid
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> [marvell driver issues]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
In D3 resume flow, avoid the following race where sending
packets before updating the sequence number (sequence
number received from the wowlan status command response):
Thread 1:
__iwl_mvm_resume clears IWL_MVM_STATUS_IN_D3 and is cut
by thread 2 before reaching iwl_mvm_query_wakeup_reasons.
Thread 2:
iwl_mvm_mac_itxq_xmit calls iwl_mvm_tx_skb since
IWL_MVM_STATUS_IN_D3 is not set using a wrong sequence number.
Thread 1:
__iwl_mvm_resume continues and calls iwl_mvm_query_wakeup_reasons
updating the sequence number received from the firmware.
The next packet that will be sent now will cause sysassert 0x1096.
Fix the bug by moving 'clear IWL_MVM_STATUS_IN_D3' to after
sending the wowlan status command and updating the sequence
number.
Signed-off-by: Shaul Triebitz <shaul.triebitz@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/iwlwifi.20210122144849.fe927ec939c6.I103d3321fb55da7e6c6c51582cfadf94eb8b6c58@changeid
If we spin for a long time in memory reads that (for some reason in
hardware) take a long time, then we'll eventually get messages such
as
watchdog: BUG: soft lockup - CPU#2 stuck for 24s! [kworker/2:2:272]
This is because the reading really does take a very long time, and
we don't schedule, so we're hogging the CPU with this task, at least
if CONFIG_PREEMPT is not set, e.g. with CONFIG_PREEMPT_VOLUNTARY=y.
Previously I misinterpreted the situation and thought that this was
only going to happen if we had interrupts disabled, and then fixed
this (which is good anyway, however), but that didn't always help;
looking at it again now I realized that the spin unlock will only
reschedule if CONFIG_PREEMPT is used.
In order to avoid this issue, change the code to cond_resched() if
we've been spinning for too long here.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Fixes: 04516706bb ("iwlwifi: pcie: limit memory read spin time")
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/iwlwifi.20210115130253.217a9d6a6a12.If964cb582ab0aaa94e81c4ff3b279eaafda0fd3f@changeid
I noticed that the flow that triggers an NMI on the firmware
for old devices (tested on 7265) doesn't work.
Apparently, the firmware / device is still in low power when
we write the register that triggers the NMI. We call the
"grab_nic_access" function to make sure the device is awake
but that wasn't enough. I played with this and noticed that
if we wait 1 ms after the device reports it is awake before
we write to the NMI register, the device always sees our
write and the firmware gets properly asserted.
Triggering an NMI to the firmware can be done with the
debugfs hook:
echo 1 > /sys/kernel/debug/iwlwifi/0000\:00\:03.0/iwlmvm/fw_nmi
What happened before is that the firmware would just stall
without running its NMI routine. Because of that the driver
wouldn't get the "firmware crashed" interrupt. After a while
the driver would notice that the firmware is not responding
to some command and it would read the error data from the
firmware, but this data is populated in the NMI service
routine in the firmware which was not called. So in the logs
it looked like:
iwlwifi 0000:00:03.0: Error sending REPLY_ERROR: time out after 2000ms.
iwlwifi 0000:00:03.0: Current CMD queue read_ptr 33 write_ptr 34
iwlwifi 0000:00:03.0: Loaded firmware version: 29.09bd31e1.0 7265D-29.ucode
iwlwifi 0000:00:03.0: 0x00000000 | ADVANCED_SYSASSERT
iwlwifi 0000:00:03.0: 0x00000000 | trm_hw_status0
iwlwifi 0000:00:03.0: 0x00000000 | trm_hw_status1
iwlwifi 0000:00:03.0: 0x00000000 | branchlink2
iwlwifi 0000:00:03.0: 0x00000000 | interruptlink1
iwlwifi 0000:00:03.0: 0x00000000 | interruptlink2
iwlwifi 0000:00:03.0: 0x00000000 | data1
iwlwifi 0000:00:03.0: 0x00000000 | data2
iwlwifi 0000:00:03.0: 0x00000000 | data3
iwlwifi 0000:00:03.0: 0x00000000 | beacon time
iwlwifi 0000:00:03.0: 0x00000000 | tsf low
...
With this fix, immediately after we trigger the NMI to the
firmware, we get the expected:
iwlwifi 0000:00:03.0: Microcode SW error detected. Restarting 0x2000000.
iwlwifi 0000:00:03.0: Start IWL Error Log Dump:
iwlwifi 0000:00:03.0: Status: 0x00000040, count: 6
iwlwifi 0000:00:03.0: Loaded firmware version: 29.09bd31e1.0 7265D-29.ucode
iwlwifi 0000:00:03.0: 0x00000084 | NMI_INTERRUPT_UNKNOWN
iwlwifi 0000:00:03.0: 0x000002F1 | trm_hw_status0
iwlwifi 0000:00:03.0: 0x00000000 | trm_hw_status1
iwlwifi 0000:00:03.0: 0x00043D6C | branchlink2
iwlwifi 0000:00:03.0: 0x0004AFD6 | interruptlink1
iwlwifi 0000:00:03.0: 0x000008C4 | interruptlink2
iwlwifi 0000:00:03.0: 0x00000000 | data1
iwlwifi 0000:00:03.0: 0x00000080 | data2
iwlwifi 0000:00:03.0: 0x07030000 | data3
iwlwifi 0000:00:03.0: 0x003FD4C3 | beacon time
iwlwifi 0000:00:03.0: 0x00C22AC3 | tsf low
iwlwifi 0000:00:03.0: 0x00000000 | tsf hi
iwlwifi 0000:00:03.0: 0x00000000 | time gp1
iwlwifi 0000:00:03.0: 0x00C22AC3 | time gp2
iwlwifi 0000:00:03.0: 0x00000001 | uCode revision type
iwlwifi 0000:00:03.0: 0x0000001D | uCode version major
Notice the first line: "Microcode SW error detected:" which
is printed in the driver's ISR, which means that the driver
actually got an interrupt from the firmware saying that it
crashed. And then we have the properly populated error data.
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/iwlwifi.20210115130252.70e67cc75d88.I6615cad4361862e7f3c9f2d3cafb6a8c61e16781@changeid
Fixes the following W=1 kernel build warning(s):
drivers/net/wireless/intel/iwlwifi/fw/acpi.c:123: warning: Function parameter or member 'dev' not described in 'iwl_acpi_get_dsm_object'
drivers/net/wireless/intel/iwlwifi/fw/acpi.c:123: warning: Function parameter or member 'rev' not described in 'iwl_acpi_get_dsm_object'
drivers/net/wireless/intel/iwlwifi/fw/acpi.c:123: warning: Function parameter or member 'func' not described in 'iwl_acpi_get_dsm_object'
drivers/net/wireless/intel/iwlwifi/fw/acpi.c:123: warning: Function parameter or member 'args' not described in 'iwl_acpi_get_dsm_object'
drivers/net/wireless/intel/iwlwifi/fw/acpi.c:142: warning: Function parameter or member 'dev' not described in 'iwl_acpi_get_dsm_u8'
drivers/net/wireless/intel/iwlwifi/fw/acpi.c:142: warning: Function parameter or member 'rev' not described in 'iwl_acpi_get_dsm_u8'
drivers/net/wireless/intel/iwlwifi/fw/acpi.c:142: warning: Function parameter or member 'func' not described in 'iwl_acpi_get_dsm_u8'
Cc: Johannes Berg <johannes.berg@intel.com>
Cc: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Cc: Luca Coelho <luciano.coelho@intel.com>
Cc: Intel Linux Wireless <linuxwifi@intel.com>
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: linux-wireless@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20201126133152.3211309-17-lee.jones@linaro.org
Also add documentation for 'mvm'.
Fixes the following W=1 kernel build warning(s):
drivers/net/wireless/intel/iwlwifi/mvm/rs.c:400: warning: cannot understand function prototype: 'const u16 expected_tpt_legacy[IWL_RATE_COUNT] = '
drivers/net/wireless/intel/iwlwifi/mvm/rs.c:684: warning: Function parameter or member 'mvm' not described in '_rs_collect_tx_data'
drivers/net/wireless/intel/iwlwifi/mvm/rs.c:684: warning: Function parameter or member 'tbl' not described in '_rs_collect_tx_data'
drivers/net/wireless/intel/iwlwifi/mvm/rs.c:684: warning: Function parameter or member 'scale_index' not described in '_rs_collect_tx_data'
drivers/net/wireless/intel/iwlwifi/mvm/rs.c:684: warning: Function parameter or member 'attempts' not described in '_rs_collect_tx_data'
drivers/net/wireless/intel/iwlwifi/mvm/rs.c:684: warning: Function parameter or member 'successes' not described in '_rs_collect_tx_data'
drivers/net/wireless/intel/iwlwifi/mvm/rs.c:684: warning: Function parameter or member 'window' not described in '_rs_collect_tx_data'
drivers/net/wireless/intel/iwlwifi/mvm/rs.c:2677: warning: duplicate section name 'NOTE'
drivers/net/wireless/intel/iwlwifi/mvm/rs.c:2682: warning: Function parameter or member 'mvm' not described in 'rs_initialize_lq'
drivers/net/wireless/intel/iwlwifi/mvm/rs.c:2682: warning: Function parameter or member 'sta' not described in 'rs_initialize_lq'
drivers/net/wireless/intel/iwlwifi/mvm/rs.c:2682: warning: Function parameter or member 'lq_sta' not described in 'rs_initialize_lq'
drivers/net/wireless/intel/iwlwifi/mvm/rs.c:2682: warning: Function parameter or member 'band' not described in 'rs_initialize_lq'
drivers/net/wireless/intel/iwlwifi/mvm/rs.c:3761: warning: Function parameter or member 'mvm' not described in 'rs_program_fix_rate'
drivers/net/wireless/intel/iwlwifi/mvm/rs.c:3761: warning: Function parameter or member 'lq_sta' not described in 'rs_program_fix_rate'
drivers/net/wireless/intel/iwlwifi/mvm/rs.c:4213: warning: Function parameter or member 'mvm' not described in 'iwl_mvm_tx_protection'
Cc: Johannes Berg <johannes.berg@intel.com>
Cc: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Cc: Luca Coelho <luciano.coelho@intel.com>
Cc: Intel Linux Wireless <linuxwifi@intel.com>
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: linux-wireless@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20201126133152.3211309-3-lee.jones@linaro.org
When we get a channel switch with a very long quiet period, we schedule
a work to disconnect after a while. This work runs in background. In the
meanwhile, we keep getting beacons and sending FW modify command for each.
This has a potential race, where we modify the CSA after we aborted it.
Protect the flow by setting csa_failed to true in case we abort, and check
it before sending the modify command.
This required a modification to the way we treat csa_failed in
iwl_mvm_post_channel_switch:
1. The variable isn't being reset anymore, so we can still look at it in
iwl_mvm_channel_switch_rx_beacon. This is fine, since we reset it when
starting a new CSA.
2. There is no more early return in case of csa_failed. This is fine,
since before this patch csa_failed was set only for GO, and for GO the
function is only resetting the power settings, which we want to restore
even in case of failure.
Signed-off-by: Sara Sharon <sara.sharon@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20201209231352.b023856bdf39.I4ed0149e0018fe5e1ae3c2a1cbc614954016063f@changeid
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>