linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-17 17:41:44 +00:00

Author	SHA1	Message	Date
Daniel Borkmann	85ddd9c317	Merge branch 'bpf-sockmap-tls-fixes' John Fastabend says: ==================== To date our usage of sockmap/tls has been fairly simple, the BPF programs did only well-defined pop, push, pull and apply/cork operations. Now that we started to push more complex programs into sockmap we uncovered a series of issues addressed here. Further OpenSSL3.0 version should be released soon with kTLS support so its important to get any remaining issues on BPF and kTLS support resolved. Additionally, I have a patch under development to allow sockmap to be enabled/disabled at runtime for Cilium endpoints. This allows us to stress the map insert/delete with kTLS more than previously where Cilium only added the socket to the map when it entered ESTABLISHED state and never touched it from the control path side again relying on the sockets own close() hook to remove it. To test I have a set of test cases in test_sockmap.c that expose these issues. Once we get fixes here merged and in bpf-next I'll submit the tests to bpf-next tree to ensure we don't regress again. Also I've run these patches in the Cilium CI with OpenSSL (master branch) this will run tools such as netperf, ab, wrk2, curl, etc. to get a broad set of testing. I'm aware of two more issues that we are working to resolve in another couple (probably two) patches. First we see an auth tag corruption in kTLS when sending small 1byte chunks under stress. I've not pinned this down yet. But, guessing because its under 1B stress tests it must be some error path being triggered. And second we need to ensure BPF RX programs are not skipped when kTLS ULP is loaded. This breaks some of the sockmap selftests when running with kTLS. I'll send a follow up for this. v2: I dropped a patch that added !0 size check in tls_push_record this originated from a panic I caught awhile ago with a trace in the crypto stack. But I can not reproduce it anymore so will dig into that and send another patch later if needed. 
Anyways after a bit of thought it would be nicer if tls/crypto/bpf didn't require special case handling for the !0 size. ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2020-01-15 23:26:23 +01:00
John Fastabend	7361d44896	bpf: Sockmap/tls, fix pop data with SK_DROP return code When user returns SK_DROP we need to reset the number of copied bytes to indicate to the user the bytes were dropped and not sent. If we don't reset the copied arg sendmsg will return as if those bytes were copied giving the user a positive return value. This works as expected today except in the case where the user also pops bytes. In the pop case the sg.size is reduced but we don't correctly account for this when copied bytes is reset. The popped bytes are not accounted for and we return a small positive value potentially confusing the user. The reason this happens is due to a typo where we do the wrong comparison when accounting for pop bytes. In this fix notice the if/else is not needed and that we have a similar problem if we push data except it's not visible to the user because if delta is larger than the sg.size we return a negative value so it appears as an error regardless. Fixes: `7246d8ed4d` ("bpf: helper to pop data from messages") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/bpf/20200111061206.8028-9-john.fastabend@gmail.com	2020-01-15 23:26:13 +01:00
John Fastabend	9aaaa56845	bpf: Sockmap/tls, skmsg can have wrapped skmsg that needs extra chaining It's possible through a set of push, pop, apply helper calls to construct a skmsg, which is just a ring of scatterlist elements, with the start value larger than the end value. For example, end start \|_0_\|_1_\| ... \|_n_\|_n+1_\| Where end points at 1 and start points at n so that the valid elements are the set {n, n+1, 0, 1}. Currently, because we don't build the correct chain only {n, n+1} will be sent. This adds a check and sg_chain call to correctly submit the above to the crypto and tls send path. Fixes: `d3b18ad31f` ("tls: add bpf support to sk_msg handling") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/bpf/20200111061206.8028-8-john.fastabend@gmail.com	2020-01-15 23:26:13 +01:00
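The wrapped-ring case in 9aaaa56845 can be illustrated with a toy model. This is a sketch, not the kernel code: `linear_indices` and `ring_indices` are hypothetical helpers, the ring is just an index space of `size` slots, and `end` is treated as inclusive only to match the example set {n, n+1, 0, 1} given in the commit text.

```python
def linear_indices(start, size):
    """Walk from start to the last slot of the array, without wrapping."""
    return list(range(start, size))


def ring_indices(start, end, size):
    """Walk the ring from start to end (inclusive here), wrapping at size."""
    out = []
    i = start
    while True:
        out.append(i)
        if i == end:
            break
        i = (i + 1) % size  # wrap back to slot 0 after the last slot
    return out


n = 6
size = n + 2  # slots 0 .. n+1, as in the commit's diagram

# Without the extra chaining only the tail of the array is covered.
print(linear_indices(n, size))
# Following the wrap covers the elements that sit at the front of the array too.
print(ring_indices(n, 1, size))
```

The first call covers only the slots from `start` to the end of the array, which corresponds to the "{n, n+1} will be sent" symptom; the second follows the wrap the way the added chaining is described as doing.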
John Fastabend	d468e4775c	bpf: Sockmap/tls, tls_sw can create a plaintext buf > encrypt buf It is possible to build a plaintext buffer using push helper that is larger than the allocated encrypt buffer. When this record is pushed to crypto layers this can result in a NULL pointer dereference because the crypto API expects the encrypt buffer is large enough to fit the plaintext buffer. Kernel splat below. To resolve catch the cases this can happen and split the buffer into two records to send individually. Unfortunately, there is still one case to handle where the split creates a zero sized buffer. In this case we merge the buffers and unmark the split. This happens when apply is zero and user pushed data beyond encrypt buffer. This fixes the original case as well because the split allocated an encrypt buffer larger than the plaintext buffer and the merge simply moves the pointers around so we now have a reference to the new (larger) encrypt buffer. Perhaps its not ideal but it seems the best solution for a fixes branch and avoids handling these two cases, (a) apply that needs split and (b) non apply case. The are edge cases anyways so optimizing them seems not necessary unless someone wants later in next branches. [ 306.719107] BUG: kernel NULL pointer dereference, address: 0000000000000008 [...] [ 306.747260] RIP: 0010:scatterwalk_copychunks+0x12f/0x1b0 [...] [ 306.770350] Call Trace: [ 306.770956] scatterwalk_map_and_copy+0x6c/0x80 [ 306.772026] gcm_enc_copy_hash+0x4b/0x50 [ 306.772925] gcm_hash_crypt_remain_continue+0xef/0x110 [ 306.774138] gcm_hash_crypt_continue+0xa1/0xb0 [ 306.775103] ? 
gcm_hash_crypt_continue+0xa1/0xb0 [ 306.776103] gcm_hash_assoc_remain_continue+0x94/0xa0 [ 306.777170] gcm_hash_assoc_continue+0x9d/0xb0 [ 306.778239] gcm_hash_init_continue+0x8f/0xa0 [ 306.779121] gcm_hash+0x73/0x80 [ 306.779762] gcm_encrypt_continue+0x6d/0x80 [ 306.780582] crypto_gcm_encrypt+0xcb/0xe0 [ 306.781474] crypto_aead_encrypt+0x1f/0x30 [ 306.782353] tls_push_record+0x3b9/0xb20 [tls] [ 306.783314] ? sk_psock_msg_verdict+0x199/0x300 [ 306.784287] bpf_exec_tx_verdict+0x3f2/0x680 [tls] [ 306.785357] tls_sw_sendmsg+0x4a3/0x6a0 [tls] test_sockmap test signature to trigger bug, [TEST]: (1, 1, 1, sendmsg, pass,redir,start 1,end 2,pop (1,2),ktls,): Fixes: `d3b18ad31f` ("tls: add bpf support to sk_msg handling") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/bpf/20200111061206.8028-7-john.fastabend@gmail.com	2020-01-15 23:26:13 +01:00
John Fastabend	cf21e9ba1e	bpf: Sockmap/tls, msg_push_data may leave end mark in place Leaving an incorrect end mark in place when passing to crypto layer will cause crypto layer to stop processing data before all data is encrypted. To fix clear the end mark on push data instead of expecting users of the helper to clear the mark value after the fact. This happens when we push data into the middle of a skmsg and have room for it so we don't do a set of copies that already clear the end flag. Fixes: `6fff607e2f` ("bpf: sk_msg program helper bpf_msg_push_data") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/bpf/20200111061206.8028-6-john.fastabend@gmail.com	2020-01-15 23:26:13 +01:00
John Fastabend	6562e29cf6	bpf: Sockmap, skmsg helper overestimates push, pull, and pop bounds In the push, pull, and pop helpers operating on skmsg objects to make data writable or insert/remove data we use this bounds check to ensure specified data is valid, /* Bounds checks: start and pop must be inside message */ if (start >= offset + l \|\| last >= msg->sg.size) return -EINVAL; The problem here is offset has already included the length of the current element the 'l' above. So start could be past the end of the scatterlist element in the case where start also points into an offset on the last skmsg element. To fix do the accounting slightly different by adding the length of the previous entry to offset at the start of the iteration. And ensure its initialized to zero so that the first iteration does nothing. Fixes: `604326b41a` ("bpf, sockmap: convert to generic sk_msg interface") Fixes: `6fff607e2f` ("bpf: sk_msg program helper bpf_msg_push_data") Fixes: `7246d8ed4d` ("bpf: helper to pop data from messages") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/bpf/20200111061206.8028-5-john.fastabend@gmail.com	2020-01-15 23:26:13 +01:00
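The accounting change described in 6562e29cf6 can be modeled with plain lists of element lengths. This is a minimal sketch under assumptions: `start_in_bounds_old` and `start_in_bounds_fixed` are hypothetical names, only the `start >= offset + l` half of the check is modeled, and the loops follow the prose of the commit message rather than the kernel source.

```python
def start_in_bounds_old(lengths, start):
    """Old accounting: offset already includes the current element's length."""
    offset, l = 0, 0
    for l in lengths:
        if start < offset + l:
            return True
        offset += l
    # On loop exit, offset is the total size and l is the last length,
    # so offset + l overestimates the bound by one element.
    return not (start >= offset + l)


def start_in_bounds_fixed(lengths, start):
    """Fixed accounting: add the previous element's length at loop start."""
    offset, l = 0, 0  # l starts at zero so the first iteration adds nothing
    for cur in lengths:
        offset += l
        l = cur
        if start < offset + l:
            return True
    # On loop exit, offset + l equals the total size.
    return not (start >= offset + l)


lengths = [4, 4]  # two elements, 8 bytes in total
for start in (7, 8, 9):
    print(start, start_in_bounds_old(lengths, start),
          start_in_bounds_fixed(lengths, start))
```

With two 4-byte elements, offsets 8 and 9 lie past the end of the message; the old variant accepts them while the fixed variant rejects them.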
John Fastabend	33bfe20dd7	bpf: Sockmap/tls, push write_space updates through ulp updates When sockmap sock with TLS enabled is removed we cleanup bpf/psock state and call tcp_update_ulp() to push updates to TLS ULP on top. However, we don't push the write_space callback up and instead simply overwrite the op with the psock stored previous op. This may or may not be correct so to ensure we don't overwrite the TLS write space hook pass this field to the ULP and have it fixup the ctx. This completes a previous fix that pushed the ops through to the ULP but at the time missed doing this for write_space, presumably because write_space TLS hook was added around the same time. Fixes: `95fa145479` ("bpf: sockmap/tls, close can race with map free") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/bpf/20200111061206.8028-4-john.fastabend@gmail.com	2020-01-15 23:26:13 +01:00
John Fastabend	7e81a35302	bpf: Sockmap, ensure sock lock held during tear down The sock_map_free() and sock_hash_free() paths used to delete sockmap and sockhash maps walk the maps and destroy psock and bpf state associated with the socks in the map. When done the socks no longer have BPF programs attached and will function normally. This can happen while the socks in the map are still "live" meaning data may be sent/received during the walk. Currently, though we don't take the sock_lock when the psock and bpf state is removed through this path. Specifically, this means we can be writing into the ops structure pointers such as sendmsg, sendpage, recvmsg, etc. while they are also being called from the networking side. This is not safe, we never used proper READ_ONCE/WRITE_ONCE semantics here if we believed it was safe. Further its not clear to me its even a good idea to try and do this on "live" sockets while networking side might also be using the socket. Instead of trying to reason about using the socks from both sides lets realize that every use case I'm aware of rarely deletes maps, in fact kubernetes/Cilium case builds map at init and never tears it down except on errors. So lets do the simple fix and grab sock lock. This patch wraps sock deletes from maps in sock lock and adds some annotations so we catch any other cases easier. Fixes: `604326b41a` ("bpf, sockmap: convert to generic sk_msg interface") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/bpf/20200111061206.8028-3-john.fastabend@gmail.com	2020-01-15 23:26:13 +01:00
John Fastabend	4da6a196f9	bpf: Sockmap/tls, during free we may call tcp_bpf_unhash() in loop When a sockmap is free'd and a socket in the map is enabled with tls we tear down the bpf context on the socket, the psock struct and state, and then call tcp_update_ulp(). The tcp_update_ulp() call is to inform the tls stack it needs to update its saved sock ops so that when the tls socket is later destroyed it doesn't try to call the now destroyed psock hooks. This is about keeping stacked ULPs in good shape so they always have the right set of stacked ops. However, recently unhash() hook was removed from TLS side. But, the sockmap/bpf side is not doing any extra work to update the unhash op when is torn down instead expecting TLS side to manage it. So both TLS and sockmap believe the other side is managing the op and instead no one updates the hook so it continues to point at tcp_bpf_unhash(). When unhash hook is called we call tcp_bpf_unhash() which detects the psock has already been destroyed and calls sk->sk_prot_unhash() which calls tcp_bpf_unhash() yet again and so on looping and hanging the core. To fix have sockmap tear down logic fixup the stale pointer. Fixes: `5d92e631b8` ("net/tls: partially revert fix transition through disconnect with close") Reported-by: syzbot+83979935eb6304f8cd46@syzkaller.appspotmail.com Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com> Acked-by: Song Liu <songliubraving@fb.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/bpf/20200111061206.8028-2-john.fastabend@gmail.com	2020-01-15 23:26:13 +01:00
David S. Miller	567110f147	Merge branch 'stmmac-Fix-selftests-in-Synopsys-AXS101-board' Jose Abreu says: ==================== net: stmmac: Fix selftests in Synopsys AXS101 board Set of fixes for selftests so that they work in Synopsys AXS101 board. Final output: $ ethtool -t eth0 The test result is PASS The test extra info: 1. MAC Loopback 0 2. PHY Loopback -95 3. MMC Counters 0 4. EEE -95 5. Hash Filter MC 0 6. Perfect Filter UC 0 7. MC Filter 0 8. UC Filter 0 9. Flow Control -95 10. RSS -95 11. VLAN Filtering -95 12. VLAN Filtering (perf) -95 13. Double VLAN Filter -95 14. Double VLAN Filter (perf) -95 15. Flexible RX Parser -95 16. SA Insertion (desc) -95 17. SA Replacement (desc) -95 18. SA Insertion (reg) -95 19. SA Replacement (reg) -95 20. VLAN TX Insertion -95 21. SVLAN TX Insertion -95 22. L3 DA Filtering -95 23. L3 SA Filtering -95 24. L4 DA TCP Filtering -95 25. L4 SA TCP Filtering -95 26. L4 DA UDP Filtering -95 27. L4 SA UDP Filtering -95 28. ARP Offload -95 29. Jumbo Frame 0 30. Multichannel Jumbo -95 31. Split Header -95 Description: 1) Fixes the unaligned accesses that caused CPU halt in Synopsys AXS101 boards. 2) Fixes the VLAN tests when filtering failed to work. 3) Fixes the VLAN Perfect tests when filtering is not available in HW. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-15 23:11:18 +01:00
Jose Abreu	4eee13f14d	net: stmmac: selftests: Guard VLAN Perfect test against non supported HW When HW does not support perfect filtering the feature will not be enabled in the net_device. Add a check for this to prevent failures. Fixes: `1b2250a04c` ("net: stmmac: selftests: Add tests for VLAN Perfect Filtering") Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-15 23:11:18 +01:00
Jose Abreu	d39b68e5a7	net: stmmac: selftests: Mark as fail when received VLAN ID != expected When the VLAN ID does not match the expected one it means filter failed in HW. Fix it. Fixes: `94e1838200` ("net: stmmac: selftests: Add selftest for VLAN TX Offload") Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-15 23:11:18 +01:00
Jose Abreu	0b9f932edc	net: stmmac: selftests: Make it work in Synopsys AXS101 boards Synopsys AXS101 boards do not support unaligned memory loads or stores. Change the selftests mechanism to explicitly: - Not add extra alignment in TX SKB - Use the unaligned version of ether_addr_equal() Fixes: `091810dbde` ("net: stmmac: Introduce selftests support") Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-15 23:11:18 +01:00
Colin Ian King	ddf4203905	net/wan/fsl_ucc_hdlc: fix out of bounds write on array utdm_info Array utdm_info is declared as an array of MAX_HDLC_NUM (4) elements however up to UCC_MAX_NUM (8) elements are potentially being written to it. Currently we have an array out-of-bounds write error on the last 4 elements. Fix this by making utdm_info UCC_MAX_NUM elements in size. Addresses-Coverity: ("Out-of-bounds write") Fixes: `c19b6d246a` ("drivers/net: support hdlc function for QE-UCC") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-15 23:05:54 +01:00
Wayne Lin	5a64967a2f	drm/dp_mst: Have DP_Tx send one msg at a time [Why] Noticed this while testing MST with the 4 ports MST hub from StarTech.com. Sometimes can't light up monitors normally and get the error message as 'sideband msg build failed'. Look into aux transactions, found out that source sometimes will send out another down request before receiving the down reply of the previous down request. On the other hand, in drm_dp_get_one_sb_msg(), current code doesn't handle the interleaved replies case. Hence, source can't build up message completely and can't light up monitors. [How] For good compatibility, enforce source to send out one down request at a time. Add a flag, is_waiting_for_dwn_reply, to determine if the source can send out a down request immediately or not. - Check the flag before calling process_single_down_tx_qlock to send out a msg - Set the flag when successfully send out a down request - Clear the flag when successfully build up a down reply - Clear the flag when find errors during sending out a down request - Clear the flag when find errors during building up a down reply - Clear the flag when timeout occurs during waiting for a down reply - Use drm_dp_mst_kick_tx() to try to send another down request in queue at the end of drm_dp_mst_wait_tx_reply() (attempt to send out messages in queue when errors occur) Cc: Lyude Paul <lyude@redhat.com> Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200113093649.11755-1-Wayne.Lin@amd.com	2020-01-15 17:01:21 -05:00
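The one-request-at-a-time rule in 5a64967a2f can be pictured as a small queue guarded by a flag. The sketch below is a toy model, not the DRM code: only the flag name `is_waiting_for_dwn_reply` comes from the commit message; the class, its methods, and the assumption that every send succeeds are hypothetical simplifications.

```python
from collections import deque


class DownRequestQueue:
    """Toy model: at most one down request is in flight at a time."""

    def __init__(self):
        self.pending = deque()
        self.sent = []
        self.is_waiting_for_dwn_reply = False

    def queue(self, msg):
        self.pending.append(msg)
        self.kick_tx()

    def kick_tx(self):
        # Check the flag before sending anything.
        if self.is_waiting_for_dwn_reply or not self.pending:
            return
        self.sent.append(self.pending.popleft())
        # Set the flag once a request has gone out (sends always succeed here).
        self.is_waiting_for_dwn_reply = True

    def reply_done(self):
        """A reply was built, an error occurred, or the wait timed out."""
        self.is_waiting_for_dwn_reply = False
        self.kick_tx()  # try the next queued request


q = DownRequestQueue()
q.queue("req-a")
q.queue("req-b")      # held back while req-a is outstanding
print(q.sent)
q.reply_done()        # req-a finished, req-b may go out now
print(q.sent)
```

Clearing the flag on every exit path (reply, error, timeout) is what keeps the queue from stalling, which mirrors the list of "Clear the flag when ..." cases in the commit message.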
David S. Miller	5a40420e04	Here is a batman-adv bugfix: - Fix DAT candidate selection on little endian systems, by Sven Eckelmann -----BEGIN PGP SIGNATURE----- iQJKBAABCgA0FiEE1ilQI7G+y+fdhnrfoSvjmEKSnqEFAl4dzQUWHHN3QHNpbW9u d3VuZGVybGljaC5kZQAKCRChK+OYQpKeoQFuD/44PUv3jHW9S8aIu5FobWHlswOg e1p9g98QdO7dezkkQa64tuFMR6mgVefOeK2+LHCZQ3Svd6xeGaxEGNo6vdzrkyrj iRl/5nY8hRZ3jIdD7kOG11yX/ra+3BSWMWS4yu0OAawYFMYxakW6qUR7OUaqL9xv Qr/XHu7S/tpy4MGUDabu11KlJGyZBfyXQxEKmSjjfWVITqVKNeoZONFqm2lmLOHs RppvTjzdTf3HUPp6C9rC1b6phN7G8XLRXqXrgY4LzTDmwHbOmPvJQ1t+kh9dfZuX DYsv89dl4iVIcSUY9fyUA9YxSHRyqiXRDDG0DPIy0AqRrHy0HgdGTlwCkS1idl1H okiuO5rUzpDFGG5F8BRKY2t29JN8BO+/YVZpa5bl9ZgQQpnuBu+kkbceYCHY0lat h2y3AMsJUcWEbFsa/ulEmHw8aj38rE4lgd7lhLir04BFVZHVB+faH78MQ7xB7+Xl lPl4ptf3l3mwWW/tXbuWa6bXjCg5UBAOr4fEyhrg8fM0rst8LG6trWZWlWgZykLe 3HSnaR00tH8DiMPf/Zl7tAX/dKCmzCo6h7mXUAfvw1PktOvWC1JhtjchJASOqnoP peD45G5W3DGgxtmfcl6J33BmMllbxeT1N6sQRApliYvcrrMZ9gVzr2sHFxHftFXl kyCpbAlfAZ6ceGYCuQ== =6zKP -----END PGP SIGNATURE----- Merge tag 'batadv-net-for-davem-20200114' of git://git.open-mesh.org/linux-merge Simon Wunderlich says: ==================== Here is a batman-adv bugfix: - Fix DAT candidate selection on little endian systems, by Sven Eckelmann ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-15 22:44:23 +01:00
Daniel Borkmann	0af2ffc93a	bpf: Fix incorrect verifier simulation of ARSH under ALU32 Anatoly has been fuzzing with kBdysch harness and reported a hang in one of the outcomes: 0: R1=ctx(id=0,off=0,imm=0) R10=fp0 0: (85) call bpf_get_socket_cookie#46 1: R0_w=invP(id=0) R10=fp0 1: (57) r0 &= 808464432 2: R0_w=invP(id=0,umax_value=808464432,var_off=(0x0; 0x30303030)) R10=fp0 2: (14) w0 -= 810299440 3: R0_w=invP(id=0,umax_value=4294967295,var_off=(0xcf800000; 0x3077fff0)) R10=fp0 3: (c4) w0 s>>= 1 4: R0_w=invP(id=0,umin_value=1740636160,umax_value=2147221496,var_off=(0x67c00000; 0x183bfff8)) R10=fp0 4: (76) if w0 s>= 0x30303030 goto pc+216 221: R0_w=invP(id=0,umin_value=1740636160,umax_value=2147221496,var_off=(0x67c00000; 0x183bfff8)) R10=fp0 221: (95) exit processed 6 insns (limit 1000000) [...] Taking a closer look, the program was xlated as follows: # ./bpftool p d x i 12 0: (85) call bpf_get_socket_cookie#7800896 1: (bf) r6 = r0 2: (57) r6 &= 808464432 3: (14) w6 -= 810299440 4: (c4) w6 s>>= 1 5: (76) if w6 s>= 0x30303030 goto pc+216 6: (05) goto pc-1 7: (05) goto pc-1 8: (05) goto pc-1 [...] 220: (05) goto pc-1 221: (05) goto pc-1 222: (95) exit Meaning, the visible effect is very similar to `f54c7898ed` ("bpf: Fix precision tracking for unbounded scalars"), that is, the fall-through branch in the instruction 5 is considered to be never taken given the conclusion from the min/max bounds tracking in w6, and therefore the dead-code sanitation rewrites it as goto pc-1. However, real-life input disagrees with verification analysis since a soft-lockup was observed. The bug sits in the analysis of the ARSH. The definition is that we shift the target register value right by K bits through shifting in copies of its sign bit. In adjust_scalar_min_max_vals(), we do first coerce the register into 32 bit mode, same happens after simulating the operation. 
However, for the case of simulating the actual ARSH, we don't take the mode into account and act as if it's always 64 bit, but location of sign bit is different: dst_reg->smin_value >>= umin_val; dst_reg->smax_value >>= umin_val; dst_reg->var_off = tnum_arshift(dst_reg->var_off, umin_val); Consider an unknown R0 where bpf_get_socket_cookie() (or others) would for example return 0xffff. With the above ARSH simulation, we'd see the following results: [...] 1: R1=ctx(id=0,off=0,imm=0) R2_w=invP65535 R10=fp0 1: (85) call bpf_get_socket_cookie#46 2: R0_w=invP(id=0) R10=fp0 2: (57) r0 &= 808464432 -> R0_runtime = 0x3030 3: R0_w=invP(id=0,umax_value=808464432,var_off=(0x0; 0x30303030)) R10=fp0 3: (14) w0 -= 810299440 -> R0_runtime = 0xcfb40000 4: R0_w=invP(id=0,umax_value=4294967295,var_off=(0xcf800000; 0x3077fff0)) R10=fp0 (0xffffffff) 4: (c4) w0 s>>= 1 -> R0_runtime = 0xe7da0000 5: R0_w=invP(id=0,umin_value=1740636160,umax_value=2147221496,var_off=(0x67c00000; 0x183bfff8)) R10=fp0 (0x67c00000) (0x7ffbfff8) [...] In insn 3, we have a runtime value of 0xcfb40000, which is '1100 1111 1011 0100 0000 0000 0000 0000', the result after the shift has 0xe7da0000 that is '1110 0111 1101 1010 0000 0000 0000 0000', where the sign bit is correctly retained in 32 bit mode. In insn4, the umax was 0xffffffff, and changed into 0x7ffbfff8 after the shift, that is, '0111 1111 1111 1011 1111 1111 1111 1000' and means here that the simulation didn't retain the sign bit. With above logic, the updates happen on the 64 bit min/max bounds and given we coerced the register, the sign bits of the bounds are cleared as well, meaning, we need to force the simulation into s32 space for 32 bit alu mode. Verification after the fix below. 
We're first analyzing the fall-through branch on 32 bit signed >= test eventually leading to rejection of the program in this specific case: 0: R1=ctx(id=0,off=0,imm=0) R10=fp0 0: (b7) r2 = 808464432 1: R1=ctx(id=0,off=0,imm=0) R2_w=invP808464432 R10=fp0 1: (85) call bpf_get_socket_cookie#46 2: R0_w=invP(id=0) R10=fp0 2: (bf) r6 = r0 3: R0_w=invP(id=0) R6_w=invP(id=0) R10=fp0 3: (57) r6 &= 808464432 4: R0_w=invP(id=0) R6_w=invP(id=0,umax_value=808464432,var_off=(0x0; 0x30303030)) R10=fp0 4: (14) w6 -= 810299440 5: R0_w=invP(id=0) R6_w=invP(id=0,umax_value=4294967295,var_off=(0xcf800000; 0x3077fff0)) R10=fp0 5: (c4) w6 s>>= 1 6: R0_w=invP(id=0) R6_w=invP(id=0,umin_value=3888119808,umax_value=4294705144,var_off=(0xe7c00000; 0x183bfff8)) R10=fp0 (0x67c00000) (0xfffbfff8) 6: (76) if w6 s>= 0x30303030 goto pc+216 7: R0_w=invP(id=0) R6_w=invP(id=0,umin_value=3888119808,umax_value=4294705144,var_off=(0xe7c00000; 0x183bfff8)) R10=fp0 7: (30) r0 = (u8 )skb[808464432] BPF_LD_[ABS\|IND] uses reserved fields processed 8 insns (limit 1000000) [...] Fixes: `9cbe1f5a32` ("bpf/verifier: improve register value range tracking with ARSH") Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200115204733.16648-1-daniel@iogearbox.net	2020-01-15 13:39:59 -08:00
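The difference between a 32 bit and a 64 bit arithmetic right shift described in 0af2ffc93a can be reproduced with the runtime values quoted in the commit message. This is a sketch of the arithmetic only, not of the verifier: `arsh32` and `arsh64_on_zero_extended` are hypothetical helper names, and Python integers stand in for registers.

```python
MASK32 = 0xFFFFFFFF


def arsh32(value, shift):
    """Arithmetic shift right with the sign bit at bit 31."""
    value &= MASK32
    if value & 0x80000000:
        value -= 1 << 32          # reinterpret as a negative s32
    return (value >> shift) & MASK32


def arsh64_on_zero_extended(value, shift):
    """Same shift, but treating the zero-extended value as 64 bit.

    Bit 63 is clear after zero extension, so no sign bits are shifted in.
    """
    value &= MASK32
    return value >> shift


# Runtime value after "w0 -= 810299440" when r0 was 0x3030.
r0 = (0x3030 - 810299440) & MASK32
print(hex(r0))
print(hex(arsh32(r0, 1)))                   # sign bit retained
print(hex(arsh64_on_zero_extended(r0, 1)))  # sign bit lost
```

The 32 bit variant keeps bit 31 set, matching the runtime result 0xe7da0000 quoted above, while the 64 bit treatment of the coerced value clears it, which is the mismatch between simulation and execution that the fix addresses.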
Mohammed Gamal	536dc5df28	hv_netvsc: Fix memory leak when removing rndis device kmemleak detects the following memory leak when hot removing a network device: unreferenced object 0xffff888083f63600 (size 256): comm "kworker/0:1", pid 12, jiffies 4294831717 (age 1113.676s) hex dump (first 32 bytes): 00 40 c7 33 80 88 ff ff 00 00 00 00 10 00 00 00 .@.3............ 00 00 00 00 ad 4e ad de ff ff ff ff 00 00 00 00 .....N.......... backtrace: [<00000000d4a8f5be>] rndis_filter_device_add+0x117/0x11c0 [hv_netvsc] [<000000009c02d75b>] netvsc_probe+0x5e7/0xbf0 [hv_netvsc] [<00000000ddafce23>] vmbus_probe+0x74/0x170 [hv_vmbus] [<00000000046e64f1>] really_probe+0x22f/0xb50 [<000000005cc35eb7>] driver_probe_device+0x25e/0x370 [<0000000043c642b2>] bus_for_each_drv+0x11f/0x1b0 [<000000005e3d09f0>] __device_attach+0x1c6/0x2f0 [<00000000a72c362f>] bus_probe_device+0x1a6/0x260 [<0000000008478399>] device_add+0x10a3/0x18e0 [<00000000cf07b48c>] vmbus_device_register+0xe7/0x1e0 [hv_vmbus] [<00000000d46cf032>] vmbus_add_channel_work+0x8ab/0x1770 [hv_vmbus] [<000000002c94bb64>] process_one_work+0x919/0x17d0 [<0000000096de6781>] worker_thread+0x87/0xb40 [<00000000fbe7397e>] kthread+0x333/0x3f0 [<000000004f844269>] ret_from_fork+0x3a/0x50 rndis_filter_device_add() allocates an instance of struct rndis_device which never gets deallocated as rndis_filter_device_remove() sets net_device->extension which points to the rndis_device struct to NULL, leaving the rndis_device dangling. Since net_device->extension is eventually freed in free_netvsc_device(), we refrain from setting it to NULL inside rndis_filter_device_remove() Signed-off-by: Mohammed Gamal <mgamal@redhat.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-15 22:37:45 +01:00
Pengcheng Yang	e176b1ba47	tcp: fix marked lost packets not being retransmitted When the packet pointed to by retransmit_skb_hint is unlinked by ACK, retransmit_skb_hint will be set to NULL in tcp_clean_rtx_queue(). If packet loss is detected at this time, retransmit_skb_hint will be set to point to the current packet loss in tcp_verify_retransmit_hint(), then the packets that were previously marked lost but not retransmitted due to the restriction of cwnd will be skipped and cannot be retransmitted. To fix this, when retransmit_skb_hint is NULL, retransmit_skb_hint can be reset only after all marked lost packets are retransmitted (retrans_out >= lost_out), otherwise we need to traverse from tcp_rtx_queue_head in tcp_xmit_retransmit_queue(). Packetdrill to demonstrate: // Disable RACK and set max_reordering to keep things simple 0 `sysctl -q net.ipv4.tcp_recovery=0` +0 `sysctl -q net.ipv4.tcp_max_reordering=3` // Establish a connection +0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 +0 bind(3, ..., ...) = 0 +0 listen(3, 1) = 0 +.1 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7> +0 > S. 0:0(0) ack 1 <...> +.01 < . 1:1(0) ack 1 win 257 +0 accept(3, ..., ...) = 4 // Send 8 data segments +0 write(4, ..., 8000) = 8000 +0 > P. 1:8001(8000) ack 1 // Enter recovery and 1:3001 is marked lost +.01 < . 1:1(0) ack 1 win 257 <sack 3001:4001,nop,nop> +0 < . 1:1(0) ack 1 win 257 <sack 5001:6001 3001:4001,nop,nop> +0 < . 1:1(0) ack 1 win 257 <sack 5001:7001 3001:4001,nop,nop> // Retransmit 1:1001, now retransmit_skb_hint points to 1001:2001 +0 > . 1:1001(1000) ack 1 // 1001:2001 was ACKed causing retransmit_skb_hint to be set to NULL +.01 < . 1:1(0) ack 2001 win 257 <sack 5001:8001 3001:4001,nop,nop> // Now retransmit_skb_hint points to 4001:5001 which is now marked lost // BUG: 2001:3001 was not retransmitted +0 > . 
2001:3001(1000) ack 1 Signed-off-by: Pengcheng Yang <yangpc@wangsu.com> Acked-by: Neal Cardwell <ncardwell@google.com> Tested-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-15 22:34:31 +01:00
Takashi Iwai	60adcfde92	ALSA: seq: Fix racy access for queue timer in proc read snd_seq_info_timer_read() reads the information of the timer assigned for each queue, but it's done in a racy way which may lead to UAF as spotted by syzkaller. This patch applies the missing q->timer_mutex lock while accessing the timer object as well as a slight code change to adapt the standard coding style. Reported-by: syzbot+2b2ef983f973e5c40943@syzkaller.appspotmail.com Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20200115203733.26530-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>	2020-01-15 21:38:18 +01:00
Jari Ruusu	f5ae2ea634	Fix built-in early-load Intel microcode alignment Intel Software Developer's Manual, volume 3, chapter 9.11.6 says: "Note that the microcode update must be aligned on a 16-byte boundary and the size of the microcode update must be 1-KByte granular" When early-load Intel microcode is loaded from initramfs, userspace tool 'iucode_tool' has already 16-byte aligned those microcode bits in that initramfs image. Image that was created something like this: iucode_tool --write-earlyfw=FOO.cpio microcode-files... However, when early-load Intel microcode is loaded from built-in firmware BLOB using CONFIG_EXTRA_FIRMWARE= kernel config option, that 16-byte alignment is not guaranteed. Fix this by forcing all built-in firmware BLOBs to 16-byte alignment. [ If we end up having other firmware with much bigger alignment requirements, we might need to introduce some method for the firmware to specify it, this is the minimal "just increase the alignment a bit to account for this one special case" patch - Linus ] Signed-off-by: Jari Ruusu <jari.ruusu@gmail.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2020-01-15 11:50:37 -08:00
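The two constraints quoted from the Intel manual in f5ae2ea634 (16-byte aligned start, 1-KByte granular size) reduce to simple arithmetic. The sketch below only illustrates that arithmetic; the helper names and the example addresses are hypothetical and say nothing about how the kernel build actually places firmware blobs.

```python
ALIGNMENT = 16      # required start alignment in bytes
GRANULE = 1024      # required size granularity in bytes


def is_aligned(addr, alignment=ALIGNMENT):
    """True if addr sits on an alignment-byte boundary."""
    return addr % alignment == 0


def align_up(addr, alignment=ALIGNMENT):
    """Round addr up to the next alignment-byte boundary (power of two)."""
    return (addr + alignment - 1) & ~(alignment - 1)


def is_valid_update(addr, size):
    """Check both constraints quoted in the commit message."""
    return is_aligned(addr) and size % GRANULE == 0


blob_addr = 0x1008  # hypothetical, 8-byte aligned only
print(is_valid_update(blob_addr, 2048))
print(hex(align_up(blob_addr)))
print(is_valid_update(align_up(blob_addr), 2048))
```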
Linus Torvalds	a4feff2264	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2 Pull arch/nios2 fixlet from Ley Foon Tan: "Update my nios2 maintainer email" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2: MAINTAINERS: Update Ley Foon Tan's email address	2020-01-15 11:33:53 -08:00
Krzysztof Kozlowski	e64175776d	i2c: iop3xx: Fix memory leak in probe error path When handling devm_gpiod_get_optional() errors, free the memory already allocated. This fixes Smatch warnings: drivers/i2c/busses/i2c-iop3xx.c:437 iop3xx_i2c_probe() warn: possible memory leak of 'new_adapter' drivers/i2c/busses/i2c-iop3xx.c:442 iop3xx_i2c_probe() warn: possible memory leak of 'new_adapter' Fixes: `fdb7e884ad` ("i2c: iop: Use GPIO descriptors") Reported-by: kbuild test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>	2020-01-15 20:31:27 +01:00
Linus Torvalds	51d6981751	Merge tag 'platform-drivers-x86-v5.5-3' of git://git.infradead.org/linux-platform-drivers-x86 Pull x86 platform driver fixes from Andy Shevchenko: - Fix keyboard brightness control for ASUS laptops - Better handling parameters of GPD pocket fan module to avoid thermal shock - Add IDs to PMC platform driver to support Intel Comet Lake - Fix potential dead lock in Mellanox TM FIFO driver and ABI documentation * tag 'platform-drivers-x86-v5.5-3' of git://git.infradead.org/linux-platform-drivers-x86: Documentation/ABI: Add missed attribute for mlxreg-io sysfs interfaces Documentation/ABI: Fix documentation inconsistency for mlxreg-io sysfs interfaces platform/x86: asus-wmi: Fix keyboard brightness cannot be set to 0 platform/x86: intel_pmc_core: update Comet Lake platform driver platform/x86: GPD pocket fan: Allow somewhat lower/higher temperature limits platform/x86: GPD pocket fan: Use default values when wrong modparams are given platform/mellanox: fix potential deadlock in the tmfifo driver platform/x86: intel-ips: Use the correct style for SPDX License Identifier	2020-01-15 11:30:50 -08:00
Linus Torvalds	0174cb6ce9	Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fix from Herbert Xu: "This fixes a build problem for the hisilicon driver" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: hisilicon/sec2 - Use atomics instead of __sync	2020-01-15 10:21:34 -08:00
Linus Torvalds	84bf39461e	Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs fixes from Al Viro: "Fixes for mountpoint_last() bugs (by converting to use of lookup_last()) and an autofs regression fix from this cycle (caused by follow_managed() breakage introduced in barrier fixes series)" * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fix autofs regression caused by follow_managed() changes reimplement path_mountpoint() with less magic	2020-01-15 09:58:14 -08:00
Dmitry Osipenko	24a49678f5	i2c: tegra: Properly disable runtime PM on driver's probe error One of the recent Tegra I2C commits made a change that resumes runtime PM during the driver's probe, but it failed to put the RPM in the error case. Note that it's not correct to use pm_runtime_status_suspended because it breaks RPM refcounting. Fixes: `8ebf15e9c8` ("i2c: tegra: Move suspend handling to NOIRQ phase") Cc: <stable@vger.kernel.org> # v5.4+ Tested-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>	2020-01-15 18:17:15 +01:00
Dmitry Osipenko	9f42de8d4e	i2c: tegra: Fix suspending in active runtime PM state I noticed that sometimes the I2C clock is kept enabled during suspend-resume. This happens because runtime PM defers dynamic suspension, so runtime PM may still be in the active state when the system enters suspend. In particular, the I2C controller that is used for the CPU's DVFS is often kept ON during suspend because CPU voltage scaling happens quite often. Fixes: `8ebf15e9c8` ("i2c: tegra: Move suspend handling to NOIRQ phase") Cc: <stable@vger.kernel.org> # v5.4+ Tested-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>	2020-01-15 18:16:51 +01:00
Damien Le Moal	16c731fed6	null_blk: Fix zone write handling null_zone_write() only allows writing empty and implicitly opened zones. Writing to closed and explicitly opened zones must also be allowed and the zone condition must be transitioned to implicit open if the zone is not explicitly opened already. Fixes: `da644b2cc1` ("null_blk: add zone open, close, and finish support") Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2020-01-15 08:18:39 -07:00
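The write rule described in the commit above can be sketched as a small state transition function. This is a hedged illustration, not the null_blk code: the enum and `zone_write_sketch()` are hypothetical names, rejecting writes to a full zone is an assumption not stated in the message, and write-pointer tracking is left out.

```c
#include <assert.h>
#include <stddef.h>

enum zone_cond_sketch {
	ZONE_EMPTY_SKETCH,
	ZONE_IMP_OPEN_SKETCH,
	ZONE_EXP_OPEN_SKETCH,
	ZONE_CLOSED_SKETCH,
	ZONE_FULL_SKETCH,
};

/* Returns 0 if the write is allowed and updates the zone condition, -1 otherwise. */
static int zone_write_sketch(enum zone_cond_sketch *cond)
{
	assert(cond != NULL);

	switch (*cond) {
	case ZONE_FULL_SKETCH:
		/* Assumption: a full zone cannot be written. */
		return -1;
	case ZONE_EMPTY_SKETCH:
	case ZONE_IMP_OPEN_SKETCH:
	case ZONE_CLOSED_SKETCH:
		/* Not explicitly opened: the write implicitly opens the zone. */
		*cond = ZONE_IMP_OPEN_SKETCH;
		return 0;
	case ZONE_EXP_OPEN_SKETCH:
		/* Explicitly opened zones stay explicitly opened. */
		return 0;
	}
	return -1;
}
```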
Ian Abbott	9fea3a40f6	staging: comedi: ni_routes: allow partial routing information This patch fixes a regression on setting up asynchronous commands to use external trigger sources when board-specific routing information is missing. `ni_find_device_routes()` (called via `ni_assign_device_routes()`) finds the table of register values for the device family and the set of valid routes for the specific board. If both are found, `tables->route_values` is set to point to the table of register values for the device family and `tables->valid_routes` is set to point to the list of valid routes for the specific board. If either is not found, both `tables->route_values` and `tables->valid_routes` are left set at their initial null values (initialized by `ni_assign_device_routes()`) and the function returns `-ENODATA`. Returning an error results in some routing functionality being disabled. Unfortunately, leaving `table->route_values` set to `NULL` also breaks the setting up of asynchronous commands that are configured to use external trigger sources. Calls to `ni_check_trigger_arg()` or `ni_check_trigger_arg_roffs()` while checking the asynchronous command set-up would result in a null pointer dereference if `table->route_values` is `NULL`. The null pointer dereference is fixed in another patch, but it now results in failure to set up the asynchronous command. That is a regression from the behavior prior to commit `347e244884` ("staging: comedi: tio: implement global tio/ctr routing") and commit `56d0b826d3` ("staging: comedi: ni_mio_common: implement new routing for TRIG_EXT"). Change `ni_find_device_routes()` to set `tables->route_values` and/or `tables->valid_routes` to valid information even if the other one can only be set to `NULL` due to missing information. The function will still return an error in that case. This should result in `tables->valid_routes` being valid for all currently supported device families even if the board-specific routing information is missing. 
That should be enough to fix the regression on setting up asynchronous commands to use external triggers for boards with missing routing information. Fixes: `347e244884` ("staging: comedi: tio: implement global tio/ctr routing") Fixes: `56d0b826d3` ("staging: comedi: ni_mio_common: implement new routing for TRIG_EXT"). Cc: <stable@vger.kernel.org> # 4.20+ Cc: Spencer E. Olson <olsonse@umich.edu> Signed-off-by: Ian Abbott <abbotti@mev.co.uk> Link: https://lore.kernel.org/r/20200114182532.132058-3-abbotti@mev.co.uk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-01-15 13:30:09 +01:00
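The behavior change described in the commit above, filling in whichever table is available while still reporting an error, might look roughly like this. It is a simplified sketch: `struct ni_tables_sketch` and `find_device_routes_sketch()` are hypothetical, and the table lookups are reduced to two already-resolved pointers passed in by the caller.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct ni_tables_sketch {
	const int *route_values;	/* per device family */
	const int *valid_routes;	/* per specific board */
};

/*
 * family_values / board_routes are the lookup results; either may be NULL
 * when the corresponding information is missing.
 */
static int find_device_routes_sketch(const int *family_values,
				     const int *board_routes,
				     struct ni_tables_sketch *tables)
{
	assert(tables != NULL);

	/* Store whatever was found, even if the other half is missing. */
	tables->route_values = family_values;
	tables->valid_routes = board_routes;

	/* Still report incomplete routing information to the caller. */
	if (!family_values || !board_routes)
		return -ENODATA;
	return 0;
}
```

With this shape, code that only needs the device family table can keep working after an `-ENODATA` return, which is the regression fix the message describes.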
Ian Abbott	01e20b664f	staging: comedi: ni_routes: fix null dereference in ni_find_route_source() In `ni_find_route_source()`, `tables->route_values` gets dereferenced. However it is possible that `tables->route_values` is `NULL`, leading to a null pointer dereference. `tables->route_values` will be `NULL` if the call to `ni_assign_device_routes()` during board initialization returned an error due to missing device family routing information or missing board-specific routing information. For example, there is currently no board-specific routing information provided for the PCIe-6251 board and several other boards, so those are affected by this bug. The bug is triggered when `ni_find_route_source()` is called via `ni_check_trigger_arg()` or `ni_check_trigger_arg_roffs()` when checking the arguments for setting up asynchronous commands. Fix it by returning `-EINVAL` if `tables->route_values` is `NULL`. Even with this fix, setting up asynchronous commands to use external trigger sources for boards with missing routing information will still fail gracefully. Since `ni_find_route_source()` only depends on the device family routing information, it would be better if that was made available even if the board-specific routing information is missing. That will be addressed by another patch. Fixes: `4bb90c87ab` ("staging: comedi: add interface to ni routing table information") Cc: <stable@vger.kernel.org> # 4.20+ Cc: Spencer E. Olson <olsonse@umich.edu> Signed-off-by: Ian Abbott <abbotti@mev.co.uk> Link: https://lore.kernel.org/r/20200114182532.132058-2-abbotti@mev.co.uk Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-01-15 13:30:08 +01:00
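The guard added by the commit above amounts to checking the table pointer before using it. The following is a minimal sketch with hypothetical names (`struct route_tables_sketch`, `find_route_source_sketch()`); the flat array lookup is an assumption made for illustration and does not mirror the real routing table layout.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct route_tables_sketch {
	const unsigned char *route_values;	/* NULL when routing info is missing */
	size_t n_values;
};

/* Returns the stored value for 'dest', or -EINVAL if it cannot be looked up. */
static int find_route_source_sketch(const struct route_tables_sketch *tables,
				    size_t dest)
{
	assert(tables != NULL);

	/* The fix: fail cleanly instead of dereferencing a NULL table. */
	if (!tables->route_values)
		return -EINVAL;
	if (dest >= tables->n_values)
		return -EINVAL;
	return tables->route_values[dest];
}
```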
David S. Miller	8b792f84c6	Merge branch 'mlxsw-Various-fixes' Ido Schimmel says: ==================== mlxsw: Various fixes This patch set contains various fixes for mlxsw. Patch #1 splits the init() callback between Spectrum-2 and Spectrum-3 in order to avoid enforcing the same firmware version for both ASICs, as this can't possibly work. Without this patch the driver cannot boot with the Spectrum-3 ASIC. Patches #2-#3 fix a long standing race condition that was recently exposed while testing the driver on an emulator, which is very slow compared to the actual hardware. The problem is explained in detail in the commit messages. Patch #4 fixes a selftest. Patch #5 prevents offloaded qdiscs from presenting a non-zero backlog to the user when the netdev is down. This is done by clearing the cached backlog in the driver when the netdev goes down. Patch #6 fixes qdisc statistics (backlog and tail drops) to also take into account the multicast traffic classes. v2: * Patches #2-#3: use skb_cow_head() instead of skb_unshare() as suggested by Jakub. Remove unnecessary check regarding headroom * Patches #5-#6: new ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-15 04:16:30 -08:00
Petr Machata	85005b82e5	mlxsw: spectrum_qdisc: Include MC TCs in Qdisc counters mlxsw configures Spectrum in such a way that BUM traffic is passed not through its nominal traffic class TC, but through its MC counterpart TC+8. However, when collecting statistics, Qdiscs only look at the nominal TC and ignore the MC TC. Add two helpers to compute the value for logical TC from the constituents, one for backlog, the other for tail drops. Use them throughout instead of going through the xstats pointer directly. Counters for TX bytes and packets are deduced from packet priority counters, and therefore already include BUM traffic. wred_drop counter is irrelevant on MC TCs, because RED is not enabled on them. Fixes: `7b81953066` ("mlxsw: spectrum: Configure MC-aware mode on mlxsw ports") Signed-off-by: Petr Machata <petrm@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-15 04:16:30 -08:00
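The two helpers mentioned in the commit above combine the nominal TC with its MC counterpart at TC+8. A hedged sketch of that arithmetic follows; the struct layout, the helper names and the count of eight nominal traffic classes are assumptions for illustration, only the +8 offset comes from the message.

```c
#include <assert.h>
#include <stdint.h>

#define NOMINAL_TC_COUNT_SKETCH 8	/* assumption */
#define MC_TC_OFFSET_SKETCH     8	/* MC counterpart of TC is TC+8 */

struct xstats_sketch {
	uint64_t backlog[NOMINAL_TC_COUNT_SKETCH + MC_TC_OFFSET_SKETCH];
	uint64_t tail_drop[NOMINAL_TC_COUNT_SKETCH + MC_TC_OFFSET_SKETCH];
};

/* Backlog of the logical TC: nominal TC plus its MC counterpart. */
static uint64_t tc_backlog_sketch(const struct xstats_sketch *x, int tc)
{
	assert(tc >= 0 && tc < NOMINAL_TC_COUNT_SKETCH);
	return x->backlog[tc] + x->backlog[tc + MC_TC_OFFSET_SKETCH];
}

/* Tail drops of the logical TC, computed the same way. */
static uint64_t tc_tail_drop_sketch(const struct xstats_sketch *x, int tc)
{
	assert(tc >= 0 && tc < NOMINAL_TC_COUNT_SKETCH);
	return x->tail_drop[tc] + x->tail_drop[tc + MC_TC_OFFSET_SKETCH];
}
```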
Petr Machata	ca7609ff36	mlxsw: spectrum: Wipe xstats.backlog of down ports Per-port counter cache used by Qdiscs is updated periodically, unless the port is down. The fact that the cache is not updated for down ports is no problem for most counters, which are relative in nature. However, backlog is absolute in nature, and if there is a non-zero value in the cache around the time that the port goes down, that value just stays there. This value then leaks to offloaded Qdiscs that report non-zero backlog even if there (obviously) is no traffic. The HW does not keep backlog of a downed port, so do likewise: as the port goes down, wipe the backlog value from xstats. Fixes: `075ab8adaf` ("mlxsw: spectrum: Collect tclass related stats periodically") Signed-off-by: Petr Machata <petrm@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-15 04:16:30 -08:00
Petr Machata	fef6d67049	selftests: mlxsw: qos_mc_aware: Fix mausezahn invocation Mausezahn does not recognize "own" as a keyword on source IP address. As a result, the MC stream is not running at all, and therefore no UC degradation can be observed even in principle. Fix the invocation, and tighten the test: due to the minimum shaper configured at the MC TCs, we always expect about 20% degradation. Fail the test if it is lower. Fixes: `573363a68f` ("selftests: mlxsw: Add qos_lib.sh") Signed-off-by: Petr Machata <petrm@mellanox.com> Reported-by: Amit Cohen <amitc@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-15 04:16:30 -08:00
Ido Schimmel	63963d0f9d	mlxsw: switchx2: Do not modify cloned SKBs during xmit The driver needs to prepend a Tx header to each packet it is transmitting. The header includes information such as the egress port and traffic class. The addition of the header requires the driver to modify the SKB's header and therefore it must not be shared. Otherwise, we risk hitting various race conditions. For example, when a packet is flooded (cloned) by the bridge driver to two switch ports swp1 and swp2: t0 - mlxsw_sp_port_xmit() is called for swp1. Tx header is prepended with swp1's port number t1 - mlxsw_sp_port_xmit() is called for swp2. Tx header is prepended with swp2's port number, overwriting swp1's port number t2 - The device processes data buffer from t0. Packet is transmitted via swp2 t3 - The device processes data buffer from t1. Packet is transmitted via swp2 Usually, the device is fast enough and transmits the packet before its Tx header is overwritten, but this is not the case in emulated environments. Fix this by making sure the SKB's header is writable by calling skb_cow_head(). Since the function ensures we have headroom to push the Tx header, the check further in the function can be removed. v2: * Use skb_cow_head() instead of skb_unshare() as suggested by Jakub * Remove unnecessary check regarding headroom Fixes: `31557f0f97` ("mlxsw: Introduce Mellanox SwitchX-2 ASIC support") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: Shalom Toledo <shalomt@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-15 04:16:30 -08:00
Ido Schimmel	2da51ce75d	mlxsw: spectrum: Do not modify cloned SKBs during xmit The driver needs to prepend a Tx header to each packet it is transmitting. The header includes information such as the egress port and traffic class. The addition of the header requires the driver to modify the SKB's header and therefore it must not be shared. Otherwise, we risk hitting various race conditions. For example, when a packet is flooded (cloned) by the bridge driver to two switch ports swp1 and swp2: t0 - mlxsw_sp_port_xmit() is called for swp1. Tx header is prepended with swp1's port number t1 - mlxsw_sp_port_xmit() is called for swp2. Tx header is prepended with swp2's port number, overwriting swp1's port number t2 - The device processes data buffer from t0. Packet is transmitted via swp2 t3 - The device processes data buffer from t1. Packet is transmitted via swp2 Usually, the device is fast enough and transmits the packet before its Tx header is overwritten, but this is not the case in emulated environments. Fix this by making sure the SKB's header is writable by calling skb_cow_head(). Since the function ensures we have headroom to push the Tx header, the check further in the function can be removed. v2: * Use skb_cow_head() instead of skb_unshare() as suggested by Jakub * Remove unnecessary check regarding headroom Fixes: `56ade8fe3f` ("mlxsw: spectrum: Add initial support for Spectrum ASIC") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: Shalom Toledo <shalomt@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-15 04:16:30 -08:00
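The race in the two "Do not modify cloned SKBs" commits above comes from two clones writing into one shared header area. The idea behind the fix, copy the header before writing if it is shared, can be modeled in userspace. This is only an analogy to `skb_cow_head()`: every type and function below is hypothetical and none of it is kernel API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define HDR_LEN_SKETCH 16

struct shared_hdr_sketch {
	int refs;				/* number of packets pointing here */
	unsigned char data[HDR_LEN_SKETCH];
};

struct pkt_sketch { struct shared_hdr_sketch *hdr; };

static struct pkt_sketch pkt_new(void)
{
	struct pkt_sketch p;

	p.hdr = calloc(1, sizeof(*p.hdr));
	if (p.hdr)
		p.hdr->refs = 1;
	return p;
}

/* A clone shares the header storage with the original. */
static struct pkt_sketch pkt_clone(struct pkt_sketch *p)
{
	assert(p->hdr != NULL);
	p->hdr->refs++;
	return *p;
}

/* Give p a private header if the current one is shared. */
static int pkt_cow_head(struct pkt_sketch *p)
{
	struct shared_hdr_sketch *copy;

	if (p->hdr->refs == 1)
		return 0;
	copy = malloc(sizeof(*copy));
	if (!copy)
		return -1;
	memcpy(copy, p->hdr, sizeof(*copy));
	copy->refs = 1;
	p->hdr->refs--;
	p->hdr = copy;
	return 0;
}

/* Writes the egress port into the header, only after making it private. */
static int pkt_xmit(struct pkt_sketch *p, unsigned char port)
{
	if (pkt_cow_head(p))
		return -1;
	p->hdr->data[0] = port;
	return 0;
}

static void pkt_free(struct pkt_sketch *p)
{
	if (--p->hdr->refs == 0)
		free(p->hdr);
	p->hdr = NULL;
}
```

Without the `pkt_cow_head()` call in `pkt_xmit()`, the second transmit would overwrite the port number written by the first, which is the t0/t1 sequence in the commit messages.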
Ido Schimmel	d58c35ca52	mlxsw: spectrum: Do not enforce same firmware version for multiple ASICs In commit `a72afb6879` ("mlxsw: Enforce firmware version for Spectrum-2") I added a required firmware version for Spectrum-2, but missed the fact that mlxsw_sp2_init() is used by both Spectrum-2 and Spectrum-3. This means that the same firmware version will be used for both, which is wrong. Fix this by creating a new init() callback for Spectrum-3. Fixes: `a72afb6879` ("mlxsw: Enforce firmware version for Spectrum-2") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Tested-by: Shalom Toledo <shalomt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-15 04:16:30 -08:00
Keiya Nobuta	9c06ac4c83	usb: core: hub: Improved device recognition on remote wakeup If hub_activate() is called before D+ has stabilized after remote wakeup, the following situation might occur: [Timing diagram of the D+, Hub and Host lines: the Host detects remote wakeup, then issues GetPortStatus (1), ClearPortFeature (2) and the Interrupt Transfer (3).] - D+ goes high, Host starts running by remote wakeup - D+ is not stable, goes low - Host requests GetPortStatus at (1) and gets the following hub status: - Current Connect Status bit is 0 - Connect Status Change bit is 1 - D+ stabilizes, goes high - Host requests ClearPortFeature and thus Connect Status Change bit is cleared at (2) - After waiting 100 ms, Host starts the Interrupt Transfer at (3) - Since the Connect Status Change bit is 0, Hub returns NAK. In this case, port_event() is not called in hub_event() and Host cannot recognize the device. To solve this issue, flag change_bits even if only the Connect Status Change bit is 1 in the first GetPortStatus response. This issue occurs rarely because it happens only if D+ changes during the very short time between GetPortStatus and ClearPortFeature. However, it is fatal if it occurs in an embedded system. Signed-off-by: Keiya Nobuta <nobuta.keiya@fujitsu.com> Cc: stable <stable@vger.kernel.org> Acked-by: Alan Stern <stern@rowland.harvard.edu> Link: https://lore.kernel.org/r/20200109051448.28150-1-nobuta.keiya@fujitsu.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-01-15 13:14:28 +01:00
David S. Miller	eb507906fe	Merge tag 'mac80211-for-net-2020-01-15' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== A few fixes: * -O3 enablement fallout, thanks to Arnd who ran this * fixes for a few leaks, thanks to Felix * channel 12 regulatory fix for custom regdomains * check for a crash reported by syzbot (NULL function is called on drivers that don't have it) * fix TKIP replay protection after setup with some APs (from Jouni) * restrict obtaining some mesh data to avoid WARN_ONs * fix deadlocks with auto-disconnect (socket owner) * fix radar detection events with multiple devices ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-01-15 04:12:00 -08:00
Chuansheng Liu	978370956d	x86/mce/therm_throt: Do not access uninitialized therm_work It is relatively easy to trigger the following boot splat on an Ice Lake client platform. The call stack is like: kernel BUG at kernel/timer/timer.c:1152! Call Trace: __queue_delayed_work queue_delayed_work_on therm_throt_process intel_thermal_interrupt ... The reason is that a CPU's thermal interrupt is enabled prior to executing its hotplug onlining callback which will initialize the throttling workqueues. Such a race can lead to therm_throt_process() accessing an uninitialized therm_work, leading to the above BUG at a very early bootup stage. Therefore, unmask the thermal interrupt vector only after having setup the workqueues completely. [ bp: Heavily massage commit message and correct comment formatting. ] Fixes: `f6656208f0` ("x86/mce/therm_throt: Optimize notifications of thermal throttle") Signed-off-by: Chuansheng Liu <chuansheng.liu@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20200107004116.59353-1-chuansheng.liu@intel.com	2020-01-15 11:31:33 +01:00
Kevin Hao	a564ac35d6	Revert "gpio: thunderx: Switch to GPIOLIB_IRQCHIP" This reverts commit `a7fc89f9d5` because it contains bugs that have no simple fix. Revert it so that the thunderx gpio at least works on the stable kernel. The switch to GPIOLIB_IRQCHIP for thunderx gpio will be redone in follow-up patches. Fixes: `a7fc89f9d5` ("gpio: thunderx: Switch to GPIOLIB_IRQCHIP") Signed-off-by: Kevin Hao <haokexin@gmail.com> Link: https://lore.kernel.org/r/20200114082821.14015-2-haokexin@gmail.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>	2020-01-15 11:17:21 +01:00
Eric Dumazet	de95a991bb	tick/sched: Annotate lockless access to last_jiffies_update syzbot (KCSAN) reported a data-race in tick_do_update_jiffies64(): BUG: KCSAN: data-race in tick_do_update_jiffies64 / tick_do_update_jiffies64 write to 0xffffffff8603d008 of 8 bytes by interrupt on cpu 1: tick_do_update_jiffies64+0x100/0x250 kernel/time/tick-sched.c:73 tick_sched_do_timer+0xd4/0xe0 kernel/time/tick-sched.c:138 tick_sched_timer+0x43/0xe0 kernel/time/tick-sched.c:1292 __run_hrtimer kernel/time/hrtimer.c:1514 [inline] __hrtimer_run_queues+0x274/0x5f0 kernel/time/hrtimer.c:1576 hrtimer_interrupt+0x22a/0x480 kernel/time/hrtimer.c:1638 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1110 [inline] smp_apic_timer_interrupt+0xdc/0x280 arch/x86/kernel/apic/apic.c:1135 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:830 arch_local_irq_restore arch/x86/include/asm/paravirt.h:756 [inline] kcsan_setup_watchpoint+0x1d4/0x460 kernel/kcsan/core.c:436 check_access kernel/kcsan/core.c:466 [inline] __tsan_read1 kernel/kcsan/core.c:593 [inline] __tsan_read1+0xc2/0x100 kernel/kcsan/core.c:593 kallsyms_expand_symbol.constprop.0+0x70/0x160 kernel/kallsyms.c:79 kallsyms_lookup_name+0x7f/0x120 kernel/kallsyms.c:170 insert_report_filterlist kernel/kcsan/debugfs.c:155 [inline] debugfs_write+0x14b/0x2d0 kernel/kcsan/debugfs.c:256 full_proxy_write+0xbd/0x100 fs/debugfs/file.c:225 __vfs_write+0x67/0xc0 fs/read_write.c:494 vfs_write fs/read_write.c:558 [inline] vfs_write+0x18a/0x390 fs/read_write.c:542 ksys_write+0xd5/0x1b0 fs/read_write.c:611 __do_sys_write fs/read_write.c:623 [inline] __se_sys_write fs/read_write.c:620 [inline] __x64_sys_write+0x4c/0x60 fs/read_write.c:620 do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x44/0xa9 read to 0xffffffff8603d008 of 8 bytes by task 0 on cpu 0: tick_do_update_jiffies64+0x2b/0x250 kernel/time/tick-sched.c:62 tick_nohz_update_jiffies kernel/time/tick-sched.c:505 [inline] tick_nohz_irq_enter 
kernel/time/tick-sched.c:1257 [inline] tick_irq_enter+0x139/0x1c0 kernel/time/tick-sched.c:1274 irq_enter+0x4f/0x60 kernel/softirq.c:354 entering_irq arch/x86/include/asm/apic.h:517 [inline] entering_ack_irq arch/x86/include/asm/apic.h:523 [inline] smp_apic_timer_interrupt+0x55/0x280 arch/x86/kernel/apic/apic.c:1133 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:830 native_safe_halt+0xe/0x10 arch/x86/include/asm/irqflags.h:60 arch_cpu_idle+0xa/0x10 arch/x86/kernel/process.c:571 default_idle_call+0x1e/0x40 kernel/sched/idle.c:94 cpuidle_idle_call kernel/sched/idle.c:154 [inline] do_idle+0x1af/0x280 kernel/sched/idle.c:263 cpu_startup_entry+0x1b/0x20 kernel/sched/idle.c:355 rest_init+0xec/0xf6 init/main.c:452 arch_call_rest_init+0x17/0x37 start_kernel+0x838/0x85e init/main.c:786 x86_64_start_reservations+0x29/0x2b arch/x86/kernel/head64.c:490 x86_64_start_kernel+0x72/0x76 arch/x86/kernel/head64.c:471 secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:241 Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.0-rc7+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Use READ_ONCE() and WRITE_ONCE() to annotate this expected race. Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20191205045619.204946-1-edumazet@google.com	2020-01-15 10:54:12 +01:00
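The annotation named at the end of the commit above can be approximated in userspace with volatile accesses. This is a sketch, not the kernel's definitions: the `_SKETCH` macros and the two functions are hypothetical, `__typeof__` is a GCC/Clang extension, and the macros only force a single access by the compiler without adding locking or ordering.

```c
#include <assert.h>

/* Userspace approximations of the kernel macros (assumption, simplified). */
#define READ_ONCE_SKETCH(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE_SKETCH(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

/* Stand-in for a timestamp that is read locklessly and written elsewhere. */
static long long last_update_sketch;

/* Lockless check: is at least one period due since the last update? */
static int update_is_due(long long now, long long period)
{
	assert(period > 0);
	return now - READ_ONCE_SKETCH(last_update_sketch) >= period;
}

/* Writer side: publish the new timestamp with a single marked store. */
static void record_update(long long now)
{
	WRITE_ONCE_SKETCH(last_update_sketch, now);
}
```

Marking both sides documents that the concurrent access is intentional, which is what lets a tool such as KCSAN treat it as an expected race rather than a bug.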
Felix Fietkau	81c044fc3b	cfg80211: fix page refcount issue in A-MSDU decap The fragments attached to a skb can be part of a compound page. In that case, page_ref_inc will increment the refcount for the wrong page. Fix this by using get_page instead, which calls page_ref_inc on the compound head and also checks for overflow. Fixes: `2b67f944f8` ("cfg80211: reuse existing page fragments in A-MSDU rx") Cc: stable@vger.kernel.org Signed-off-by: Felix Fietkau <nbd@nbd.name> Link: https://lore.kernel.org/r/20200113182107.20461-1-nbd@nbd.name Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2020-01-15 09:53:35 +01:00
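The refcount issue in the commit above is that a tail page of a compound page must not be counted on its own. A toy model of the difference follows; `struct page_sketch` and both helpers are hypothetical, and the overflow check that the message attributes to get_page is omitted.

```c
#include <assert.h>
#include <stddef.h>

struct page_sketch {
	int refcount;
	struct page_sketch *head;	/* NULL for a head or standalone page */
};

/* Resolve a page to the page that actually carries the reference count. */
static struct page_sketch *compound_head_sketch(struct page_sketch *p)
{
	assert(p != NULL);
	return p->head ? p->head : p;
}

/* Buggy variant: bumps the count of whichever page it is handed. */
static void page_ref_inc_sketch(struct page_sketch *p)
{
	p->refcount++;
}

/* Fixed variant: always take the reference on the compound head. */
static void get_page_sketch(struct page_sketch *p)
{
	compound_head_sketch(p)->refcount++;
}
```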
Johannes Berg	24953de0a5	cfg80211: check for set_wiphy_params Check if set_wiphy_params is assigned and return an error if not, some drivers (e.g. virt_wifi where syzbot reported it) don't have it. Reported-by: syzbot+e8a797964a4180eb57d5@syzkaller.appspotmail.com Reported-by: syzbot+34b582cf32c1db008f8e@syzkaller.appspotmail.com Signed-off-by: Johannes Berg <johannes.berg@intel.com> Link: https://lore.kernel.org/r/20200113125358.ac07f276efff.Ibd85ee1b12e47b9efb00a2adc5cd3fac50da791a@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2020-01-15 09:53:24 +01:00
Felix Fietkau	df16737d43	cfg80211: fix memory leak in cfg80211_cqm_rssi_update The per-tid statistics need to be released after the call to rdev_get_station. Cc: stable@vger.kernel.org Fixes: `8689c051a2` ("cfg80211: dynamically allocate per-tid stats for station info") Signed-off-by: Felix Fietkau <nbd@nbd.name> Link: https://lore.kernel.org/r/20200108170630.33680-2-nbd@nbd.name Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2020-01-15 09:53:12 +01:00
Felix Fietkau	2a279b3416	cfg80211: fix memory leak in nl80211_probe_mesh_link The per-tid statistics need to be released after the call to rdev_get_station. Cc: stable@vger.kernel.org Fixes: `5ab92e7fe4` ("cfg80211: add support to probe unexercised mesh link") Signed-off-by: Felix Fietkau <nbd@nbd.name> Link: https://lore.kernel.org/r/20200108170630.33680-1-nbd@nbd.name Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2020-01-15 09:53:02 +01:00
Markus Theil	5a128a088a	cfg80211: fix deadlocks in autodisconnect work Use methods which do not try to acquire the wdev lock themselves. Cc: stable@vger.kernel.org Fixes: `37b1c00468` ("cfg80211: Support all iftypes in autodisconnect_wk") Signed-off-by: Markus Theil <markus.theil@tu-ilmenau.de> Link: https://lore.kernel.org/r/20200108115536.2262-1-markus.theil@tu-ilmenau.de Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2020-01-15 09:52:49 +01:00
Arnd Bergmann	e16119655c	wireless: wext: avoid gcc -O3 warning After the introduction of CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3, the wext code produces a bogus warning: In function 'iw_handler_get_iwstats', inlined from 'ioctl_standard_call' at net/wireless/wext-core.c:1015:9, inlined from 'wireless_process_ioctl' at net/wireless/wext-core.c:935:10, inlined from 'wext_ioctl_dispatch.part.8' at net/wireless/wext-core.c:986:8, inlined from 'wext_handle_ioctl': net/wireless/wext-core.c:671:3: error: argument 1 null where non-null expected [-Werror=nonnull] memcpy(extra, stats, sizeof(struct iw_statistics)); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In file included from arch/x86/include/asm/string.h:5, net/wireless/wext-core.c: In function 'wext_handle_ioctl': arch/x86/include/asm/string_64.h:14:14: note: in a call to function 'memcpy' declared here The problem is that ioctl_standard_call() sometimes calls the handler with a NULL argument that would cause a problem for iw_handler_get_iwstats. However, iw_handler_get_iwstats never actually gets called that way. Marking that function as noinline avoids the warning and leads to slightly smaller object code as well. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20200107200741.3588770-1-arnd@arndb.de Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2020-01-15 09:52:20 +01:00
Jouni Malinen	6f60126521	mac80211: Fix TKIP replay protection immediately after key setup TKIP replay protection was skipped for the very first frame received after a new key is configured. While this is potentially needed to avoid dropping a frame in some cases, this does leave a window for replay attacks with group-addressed frames at the station side. Any earlier frame sent by the AP using the same key would be accepted as a valid frame and the internal RSC would then be updated to the TSC from that frame. This would allow multiple previously transmitted group-addressed frames to be replayed until the next valid new group-addressed frame from the AP is received by the station. Fix this by limiting the no-replay-protection exception to apply only for the case where TSC=0, i.e., when this is for the very first frame protected using the new key, and the local RSC had not been set to a higher value when configuring the key (which may happen with GTK). Signed-off-by: Jouni Malinen <j@w1.fi> Link: https://lore.kernel.org/r/20200107153545.10934-1-j@w1.fi Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2020-01-15 09:52:12 +01:00
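The narrowed exception described in the commit above can be written as a small predicate. This is a hedged sketch of the rule as the message states it, not the mac80211 code: `tkip_frame_accepted()` and its parameters are hypothetical, and the 48-bit TSC and RSC are carried in 64-bit integers.

```c
#include <assert.h>
#include <stdint.h>

#define TKIP_TSC_MAX_SKETCH 0xFFFFFFFFFFFFULL	/* the TSC is a 48-bit counter */

/*
 * tsc:             sequence counter carried by the received frame
 * rsc:             highest counter accepted so far (or set at key setup)
 * key_used_before: non-zero once a frame has been accepted with this key
 * Returns 1 if the frame passes replay protection, 0 if it is a replay.
 */
static int tkip_frame_accepted(uint64_t tsc, uint64_t rsc, int key_used_before)
{
	assert(tsc <= TKIP_TSC_MAX_SKETCH && rsc <= TKIP_TSC_MAX_SKETCH);

	/* Normal rule: the counter must strictly increase. */
	if (tsc > rsc)
		return 1;

	/* Narrow exception: very first frame, TSC=0, and RSC still at 0. */
	if (!key_used_before && tsc == 0 && rsc == 0)
		return 1;

	return 0;
}
```

Under the older behavior described in the message, the first frame after key setup would have been accepted regardless of its counter; here a first frame with `tsc == rsc == 5` is rejected.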

888677 Commits