linux

Author	SHA1	Message	Date
Lorenz Bauer	9b2b09717e	bpf: sockmap: Check value of unused args to BPF_PROG_ATTACH Using BPF_PROG_ATTACH on a sockmap program currently understands no flags or replace_bpf_fd, but accepts any value. Return EINVAL instead. Fixes: `604326b41a` ("bpf, sockmap: convert to generic sk_msg interface") Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200629095630.7933-4-lmb@cloudflare.com	2020-06-30 10:46:39 -07:00
Lorenz Bauer	4ac2add659	bpf: flow_dissector: Check value of unused flags to BPF_PROG_DETACH Using BPF_PROG_DETACH on a flow dissector program supports neither attach_flags nor attach_bpf_fd. Yet no value is enforced for them. Enforce that attach_flags are zero, and require the current program to be passed via attach_bpf_fd. This allows us to remove the check for CAP_SYS_ADMIN, since userspace can now no longer remove arbitrary flow dissector programs. Fixes: `b27f7bb590` ("flow_dissector: Move out netns_bpf prog callbacks") Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200629095630.7933-3-lmb@cloudflare.com	2020-06-30 10:46:38 -07:00
Lorenz Bauer	1b514239e8	bpf: flow_dissector: Check value of unused flags to BPF_PROG_ATTACH Using BPF_PROG_ATTACH on a flow dissector program supports neither target_fd, attach_flags or replace_bpf_fd but accepts any value. Enforce that all of them are zero. This is fine for replace_bpf_fd since its presence is indicated by BPF_F_REPLACE. It's more problematic for target_fd, since zero is a valid fd. Should we want to use the flag later on we'd have to add an exception for fd 0. The alternative is to force a value like -1. This requires more changes to tests. There is also precedent for using 0, since bpf_iter uses this for target_fd as well. Fixes: `b27f7bb590` ("flow_dissector: Move out netns_bpf prog callbacks") Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200629095630.7933-2-lmb@cloudflare.com	2020-06-30 10:46:38 -07:00
Alexei Starovoitov	951f38cf08	Merge branch 'bpf-multi-prog-prep' Jakub Sitnicki says: ==================== This patch set prepares ground for link-based multi-prog attachment for future netns attach types, with BPF_SK_LOOKUP attach type in mind [0]. Two changes are needed in order to attach and run a series of BPF programs: 1) an bpf_prog_array of programs to run (patch #2), and 2) a list of attached links to keep track of attachments (patch #3). Nothing changes for BPF flow_dissector. Just as before only one program can be attached to netns. In v3 I've simplified patch #2 that introduces bpf_prog_array to take advantage of the fact that it will hold at most one program for now. In particular, I'm no longer using bpf_prog_array_copy. It turned out to be less suitable for link operations than I thought as it fails to append the same BPF program. bpf_prog_array_replace_item is also gone, because we know we always want to replace the first element in prog_array. Naturally the code that handles bpf_prog_array will need change once more when there is a program type that allows multi-prog attachment. But I feel it will be better to do it gradually and present it together with tests that actually exercise multi-prog code paths. [0] https://lore.kernel.org/bpf/20200511185218.1422406-1-jakub@cloudflare.com/ v2 -> v3: - Don't check if run_array is null in link update callback. (Martin) - Allow updating the link with the same BPF program. (Andrii) - Add patch #4 with a test for the above case. - Kill bpf_prog_array_replace_item. Access the run_array directly. - Switch from bpf_prog_array_copy() to bpf_prog_array_alloc(1, ...). - Replace rcu_deref_protected & RCU_INIT_POINTER with rcu_replace_pointer. - Drop Andrii's Ack from patch #2. Code changed. v1 -> v2: - Show with a (void) cast that bpf_prog_array_replace_item() return value is ignored on purpose. (Andrii) - Explain why bpf-cgroup cannot replace programs in bpf_prog_array based on bpf_prog pointer comparison in patch #2 description. (Andrii) ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2020-06-30 10:45:11 -07:00
Jakub Sitnicki	6ebb85c83a	selftests/bpf: Test updating flow_dissector link with same program This case, while not particularly useful, is worth covering because we expect the operation to succeed as opposed when re-attaching the same program directly with PROG_ATTACH. While at it, update the tests summary that fell out of sync when tests extended to cover links. Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200625141357.910330-5-jakub@cloudflare.com	2020-06-30 10:45:08 -07:00
Jakub Sitnicki	ab53cad90e	bpf, netns: Keep a list of attached bpf_link's To support multi-prog link-based attachments for new netns attach types, we need to keep track of more than one bpf_link per attach type. Hence, convert net->bpf.links into a list, that currently can be either empty or have just one item. Instead of reusing bpf_prog_list from bpf-cgroup, we link together bpf_netns_link's themselves. This makes list management simpler as we don't have to allocate, initialize, and later release list elements. We can do this because multi-prog attachment will be available only for bpf_link, and we don't need to build a list of programs attached directly and indirectly via links. No functional changes intended. Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200625141357.910330-4-jakub@cloudflare.com	2020-06-30 10:45:08 -07:00
Jakub Sitnicki	695c12147a	bpf, netns: Keep attached programs in bpf_prog_array Prepare for having multi-prog attachments for new netns attach types by storing programs to run in a bpf_prog_array, which is well suited for iterating over programs and running them in sequence. After this change bpf(PROG_QUERY) may block to allocate memory in bpf_prog_array_copy_to_user() for collected program IDs. This forces a change in how we protect access to the attached program in the query callback. Because bpf_prog_array_copy_to_user() can sleep, we switch from an RCU read lock to holding a mutex that serializes updaters. Because we allow only one BPF flow_dissector program to be attached to netns at all times, the bpf_prog_array pointed by net->bpf.run_array is always either detached (null) or one element long. No functional changes intended. Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200625141357.910330-3-jakub@cloudflare.com	2020-06-30 10:45:08 -07:00
Jakub Sitnicki	3b7016996c	flow_dissector: Pull BPF program assignment up to bpf-netns Prepare for using bpf_prog_array to store attached programs by moving out code that updates the attached program out of flow dissector. Managing bpf_prog_array is more involved than updating a single bpf_prog pointer. This will let us do it all from one place, bpf/net_namespace.c, in the subsequent patch. No functional change intended. Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/bpf/20200625141357.910330-2-jakub@cloudflare.com	2020-06-30 10:45:07 -07:00
Sumeet Pawnikar	0318e8374e	ACPI: fan: Fix Tiger Lake ACPI device ID Tiger Lake's new unique ACPI device ID for Fan is not valid because of missing 'C' in the ID. Use correct fan device ID. Fixes: `c248dfe7e0` ("ACPI: fan: Add Tiger Lake ACPI device ID") Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com> Cc: 5.6+ <stable@vger.kernel.org> # 5.6+ [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2020-06-30 19:32:45 +02:00
Eric Dumazet	c4e8fa9074	netfilter: ipset: call ip_set_free() instead of kfree() Whenever ip_set_alloc() is used, allocated memory can either use kmalloc() or vmalloc(). We should call kvfree() or ip_set_free() invalid opcode: 0000 [#1] PREEMPT SMP KASAN CPU: 0 PID: 21935 Comm: syz-executor.3 Not tainted 5.8.0-rc2-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__phys_addr+0xa7/0x110 arch/x86/mm/physaddr.c:28 Code: 1d 7a 09 4c 89 e3 31 ff 48 d3 eb 48 89 de e8 d0 58 3f 00 48 85 db 75 0d e8 26 5c 3f 00 4c 89 e0 5b 5d 41 5c c3 e8 19 5c 3f 00 <0f> 0b e8 12 5c 3f 00 48 c7 c0 10 10 a8 89 48 ba 00 00 00 00 00 fc RSP: 0000:ffffc900018572c0 EFLAGS: 00010046 RAX: 0000000000040000 RBX: 0000000000000001 RCX: ffffc9000fac3000 RDX: 0000000000040000 RSI: ffffffff8133f437 RDI: 0000000000000007 RBP: ffffc90098aff000 R08: 0000000000000000 R09: ffff8880ae636cdb R10: 0000000000000000 R11: 0000000000000000 R12: 0000408018aff000 R13: 0000000000080000 R14: 000000000000001d R15: ffffc900018573d8 FS: 00007fc540c66700(0000) GS:ffff8880ae600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fc9dcd67200 CR3: 0000000059411000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: virt_to_head_page include/linux/mm.h:841 [inline] virt_to_cache mm/slab.h:474 [inline] kfree+0x77/0x2c0 mm/slab.c:3749 hash_net_create+0xbb2/0xd70 net/netfilter/ipset/ip_set_hash_gen.h:1536 ip_set_create+0x6a2/0x13c0 net/netfilter/ipset/ip_set_core.c:1128 nfnetlink_rcv_msg+0xbe8/0xea0 net/netfilter/nfnetlink.c:230 netlink_rcv_skb+0x15a/0x430 net/netlink/af_netlink.c:2469 nfnetlink_rcv+0x1ac/0x420 net/netfilter/nfnetlink.c:564 netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline] netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1329 netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1918 sock_sendmsg_nosec net/socket.c:652 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:672 ____sys_sendmsg+0x6e8/0x810 net/socket.c:2352 ___sys_sendmsg+0xf3/0x170 net/socket.c:2406 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2439 do_syscall_64+0x60/0xe0 arch/x86/entry/common.c:359 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x45cb19 Code: Bad RIP value. RSP: 002b:00007fc540c65c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00000000004fed80 RCX: 000000000045cb19 RDX: 0000000000000000 RSI: 0000000020001080 RDI: 0000000000000003 RBP: 000000000078bf00 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff R13: 000000000000095e R14: 00000000004cc295 R15: 00007fc540c666d4 Fixes: `f66ee0410b` ("netfilter: ipset: Fix "INFO: rcu detected stall in hash_xxx" reports") Fixes: `03c8b234e6` ("netfilter: ipset: Generalize extensions support") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2020-06-30 19:09:56 +02:00
Julian Anastasov	857ca89711	ipvs: register hooks only with services Keep the IPVS hooks registered in Netfilter only while there are configured virtual services. This saves CPU cycles while IPVS is loaded but not used. Signed-off-by: Julian Anastasov <ja@ssi.bg> Reviewed-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2020-06-30 18:37:39 +02:00
Stefano Brivio	d61d2e902a	netfilter: nft_set_pipapo: Drop useless assignment of scratch map index on insert In nft_pipapo_insert(), we need to reallocate scratch maps that will be used for matching by lookup functions, if they have never been allocated or if the bucket size changes as a result of the insertion. As pipapo_realloc_scratch() provides a pair of fresh, zeroed out maps, there's no need to select a particular one after reallocation. Other than being useless, the existing assignment was also troubled by the fact that the index was set only on the CPU performing the actual insertion, as spotted by Florian. Simply drop the assignment. Reported-by: Florian Westphal <fw@strlen.de> Fixes: `3c4287f620` ("nf_tables: Add set type for arbitrary concatenation of ranges") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2020-06-30 18:25:07 +02:00
Laura Garcia Liebana	f53b9b0bdc	netfilter: introduce support for reject at prerouting stage REJECT statement can be only used in INPUT, FORWARD and OUTPUT chains. This patch adds support of REJECT, both icmp and tcp reset, at PREROUTING stage. The need for this patch comes from the requirement of some forwarding devices to reject traffic before the natting and routing decisions. The main use case is to be able to send a graceful termination to legitimate clients that, under any circumstances, the NATed endpoints are not available. This option allows clients to decide either to perform a reconnection or manage the error in their side, instead of just dropping the connection and let them die due to timeout. It is supported ipv4, ipv6 and inet families for nft infrastructure. Signed-off-by: Laura Garcia Liebana <nevola@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2020-06-30 18:21:02 +02:00
Fabio Estevam	341404415e	dt-bindings: thermal: k3: Fix the reg property Adjust the reg property to fix the following warning seen with 'make dt_binding_check': Documentation/devicetree/bindings/thermal/ti,am654-thermal.example.dt.yaml: example-0: thermal@42050000:reg:0: [0, 1107623936, 0, 604] is too long Signed-off-by: Fabio Estevam <festevam@gmail.com> Link: https://lore.kernel.org/r/20200630122527.28640-1-festevam@gmail.com Signed-off-by: Rob Herring <robh@kernel.org>	2020-06-30 09:01:40 -06:00
Fabio Estevam	34b9610609	dt-bindings: thermal: Remove soc unit address Remove the soc unit address to fix the following warnings seen with 'make dt_binding_check': Documentation/devicetree/bindings/thermal/thermal-sensor.example.dts:22.20-49.11: Warning (unit_address_vs_reg): /example-0/soc@0: node has a unit name, but no reg or ranges property Documentation/devicetree/bindings/thermal/thermal-zones.example.dts:23.20-50.11: Warning (unit_address_vs_reg): /example-0/soc@0: node has a unit name, but no reg or ranges property Signed-off-by: Fabio Estevam <festevam@gmail.com> Link: https://lore.kernel.org/r/20200630121804.27887-1-festevam@gmail.com [robh: also fix thermal-zones.yaml example] Signed-off-by: Rob Herring <robh@kernel.org>	2020-06-30 09:00:24 -06:00
Fabio Estevam	0b3f3ad3fe	dt-bindings: display: arm: versatile: Pass the sysreg unit name Pass the sysreg unit name to fix the following warning seen with 'make dt_binding_check': Warning (unit_address_vs_reg): /example-0/sysreg: node has a reg or ranges property, but no unit name Signed-off-by: Fabio Estevam <festevam@gmail.com> Link: https://lore.kernel.org/r/20200629215500.18037-1-festevam@gmail.com Signed-off-by: Rob Herring <robh@kernel.org>	2020-06-30 08:42:26 -06:00
Fabio Estevam	dd075b664c	dt-bindings: usb: aspeed: Remove the leading zeroes Remove the leading zeroes to fix the following warning seen with 'make dt_binding_check': Documentation/devicetree/bindings/usb/aspeed,usb-vhub.example.dts:37.33-42.23: Warning (unit_address_format): /example-0/usb-vhub@1e6a0000/vhub-strings/string@0409: unit name should not have leading 0s Reviewed-by: Tao Ren <rentao.bupt@gmail.com> Signed-off-by: Fabio Estevam <festevam@gmail.com> Link: https://lore.kernel.org/r/20200629214027.16768-1-festevam@gmail.com Signed-off-by: Rob Herring <robh@kernel.org>	2020-06-30 08:42:26 -06:00
Masahiro Yamada	dee9c0b575	dt-bindings: copy process-schema-examples.yaml to process-schema.yaml There are two processed schema files: - processed-schema-examples.yaml Used for 'make dt_binding_check'. This is always a full schema. - processed-schema.yaml Used for 'make dtbs_check'. This may be a full schema, or a smaller subset if DT_SCHEMA_FILES is given by a user. If DT_SCHEMA_FILES is not specified, they are the same. You can copy the former to the latter instead of running dt-mk-schema twice. This saves the cpu time a lot when you do 'make dt_binding_check dtbs_check' because building the full schema takes a couple of seconds. If DT_SCHEMA_FILES is specified, processed-schema.yaml is generated based on the specified yaml files. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Link: https://lore.kernel.org/r/20200625170434.635114-4-masahiroy@kernel.org Signed-off-by: Rob Herring <robh@kernel.org>	2020-06-30 08:42:26 -06:00
Masahiro Yamada	ce810eeb65	dt-bindings: do not build processed-schema.yaml for 'make dt_binding_check' Currently, processed-schema.yaml is always built, but it is actually used only for 'make dtbs_check'. 'make dt_binding_check' uses processed-schema-example.yaml instead. Build processed-schema.yaml only for 'make dtbs_check'. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Link: https://lore.kernel.org/r/20200625170434.635114-3-masahiroy@kernel.org Signed-off-by: Rob Herring <robh@kernel.org>	2020-06-30 08:42:26 -06:00
Masahiro Yamada	fa714cf58c	dt-bindings: fix error in 'make clean' after 'make dt_binding_check' We are having more and more schema files. Commit `8b6b80218b` ("dt-bindings: Fix command line length limit calling dt-mk-schema") fixed the 'Argument list too long' error of the schema checks, but the same error happens while cleaning too. 'make clean' after 'make dt_binding_check' fails as follows: $ make dt_binding_check [ snip ] $ make clean make[2]: execvp: /bin/sh: Argument list too long make[2]: * [scripts/Makefile.clean:52: __clean] Error 127 make[1]: * [scripts/Makefile.clean:66: Documentation/devicetree/bindings] Error 2 make: *** [Makefile:1763: _clean_Documentation] Error 2 'make dt_binding_check' generates so many .example.dts, .dt.yaml files, which are passed to the 'rm' command when you run 'make clean'. I added a small hack to use the 'find' command to clean up most of the build artifacts before they are processed by scripts/Makefile.clean Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Link: https://lore.kernel.org/r/20200625170434.635114-2-masahiroy@kernel.org Signed-off-by: Rob Herring <robh@kernel.org>	2020-06-30 08:42:26 -06:00
Kangmin Park	35b9c0fdb9	dt-bindings: mailbox: zynqmp_ipi: fix unit address Fix unit address to match the first address specified in the reg property of the node in example. Signed-off-by: Kangmin Park <l4stpr0gr4m@gmail.com> Link: https://lore.kernel.org/r/20200625135158.5861-1-l4stpr0gr4m@gmail.com Signed-off-by: Rob Herring <robh@kernel.org>	2020-06-30 08:42:26 -06:00
Masahiro Yamada	0fb24d1e5a	dt-bindings: bus: uniphier-system-bus: fix warning in example Since commit `e69f5dc623` ("dt-bindings: serial: Convert 8250 to json-schema"), the schema for "ns16550a" is checked. 'make dt_binding_check' emits the following warning: uart@5,00200000: $nodename:0: 'uart@5,00200000' does not match '^serial(@[0-9a-f,]+)*$' Rename the node to follow the pattern defined in Documentation/devicetree/bindings/serial/serial.yaml While I was here, I removed leading zeros from unit names. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Link: https://lore.kernel.org/r/20200623113242.779241-1-yamada.masahiro@socionext.com Signed-off-by: Rob Herring <robh@kernel.org>	2020-06-30 08:42:26 -06:00
Rob Herring	3eb619b2f7	scripts/dtc: Update to upstream version v1.6.0-11-g9d7888cbf19c Sync with upstream dtc primarily to pickup the I2C bus check fixes. The interrupt_provider check is noisy, so turn it off for now. This adds the following commits from upstream: 9d7888cbf19c dtc: Consider one-character strings as strings 8259d59f59de checks: Improve i2c reg property checking fdabcf2980a4 checks: Remove warning for I2C_OWN_SLAVE_ADDRESS 2478b1652c8d libfdt: add extern "C" for C++ f68bfc2668b2 libfdt: trivial typo fix 7be250b4d059 libfdt: Correct condition for reordering blocks 81e0919a3e21 checks: Add interrupt provider test 85e5d839847a Makefile: when building libfdt only, do not add unneeded deps b28464a550c5 Fix some potential unaligned accesses in dtc Signed-off-by: Rob Herring <robh@kernel.org>	2020-06-30 08:42:26 -06:00
Andrii Nakryiko	517bbe1994	bpf: Enforce BPF ringbuf size to be the power of 2 BPF ringbuf assumes the size to be a multiple of page size and the power of 2 value. The latter is important to avoid division while calculating position inside the ring buffer and using (N-1) mask instead. This patch fixes omission to enforce power-of-2 size rule. Fixes: `457f44363a` ("bpf: Implement BPF ring buffer and verifier support for it") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200630061500.1804799-1-andriin@fb.com	2020-06-30 16:31:55 +02:00
Christoph Hellwig	7e0245753f	xsk: Use dma_need_sync instead of reimplenting it Use the dma_need_sync helper instead of (not always entirely correctly) poking into the dma-mapping internals. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200629130359.2690853-5-hch@lst.de	2020-06-30 15:44:03 +02:00
Christoph Hellwig	53937ff7bc	xsk: Remove a double pool->dev assignment in xp_dma_map ->dev is already assigned at the top of the function, remove the duplicate one at the end. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200629130359.2690853-4-hch@lst.de	2020-06-30 15:44:03 +02:00
Christoph Hellwig	91d5b70273	xsk: Replace the cheap_dma flag with a dma_need_sync flag Invert the polarity and better name the flag so that the use case is properly documented. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200629130359.2690853-3-hch@lst.de	2020-06-30 15:44:03 +02:00
Christoph Hellwig	3aa9162500	dma-mapping: Add a new dma_need_sync API Add a new API to check if calls to dma_sync_single_for_{device,cpu} are required for a given DMA streaming mapping. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200629130359.2690853-2-hch@lst.de	2020-06-30 15:44:03 +02:00
Sean Christopherson	009bce1df0	x86/split_lock: Don't write MSR_TEST_CTRL on CPUs that aren't whitelisted Choo! Choo! All aboard the Split Lock Express, with direct service to Wreckage! Skip split_lock_verify_msr() if the CPU isn't whitelisted as a possible SLD-enabled CPU model to avoid writing MSR_TEST_CTRL. MSR_TEST_CTRL exists, and is writable, on many generations of CPUs. Writing the MSR, even with '0', can result in bizarre, undocumented behavior. This fixes a crash on Haswell when resuming from suspend with a live KVM guest. Because APs use the standard SMP boot flow for resume, they will go through split_lock_init() and the subsequent RDMSR/WRMSR sequence, which runs even when sld_state==sld_off to ensure SLD is disabled. On Haswell (at least, my Haswell), writing MSR_TEST_CTRL with '0' will succeed and _may_ take the SMT _sibling_ out of VMX root mode. When KVM has an active guest, KVM performs VMXON as part of CPU onlining (see kvm_starting_cpu()). Because SMP boot is serialized, the resulting flow is effectively: on_each_ap_cpu() { WRMSR(MSR_TEST_CTRL, 0) VMXON } As a result, the WRMSR can disable VMX on a different CPU that has already done VMXON. This ultimately results in a #UD on VMPTRLD when KVM regains control and attempt run its vCPUs. The above voodoo was confirmed by reworking KVM's VMXON flow to write MSR_TEST_CTRL prior to VMXON, and to serialize the sequence as above. Further verification of the insanity was done by redoing VMXON on all APs after the initial WRMSR->VMXON sequence. The additional VMXON, which should VM-Fail, occasionally succeeded, and also eliminated the unexpected #UD on VMPTRLD. The damage done by writing MSR_TEST_CTRL doesn't appear to be limited to VMX, e.g. after suspend with an active KVM guest, subsequent reboots almost always hang (even when fudging VMXON), a #UD on a random Jcc was observed, suspend/resume stability is qualitatively poor, and so on and so forth. kernel BUG at arch/x86/kvm/x86.c:386! CPU: 1 PID: 2592 Comm: CPU 6/KVM Tainted: G D Hardware name: ASUS Q87M-E/Q87M-E, BIOS 1102 03/03/2014 RIP: 0010:kvm_spurious_fault+0xf/0x20 Call Trace: vmx_vcpu_load_vmcs+0x1fb/0x2b0 vmx_vcpu_load+0x3e/0x160 kvm_arch_vcpu_load+0x48/0x260 finish_task_switch+0x140/0x260 __schedule+0x460/0x720 _cond_resched+0x2d/0x40 kvm_arch_vcpu_ioctl_run+0x82e/0x1ca0 kvm_vcpu_ioctl+0x363/0x5c0 ksys_ioctl+0x88/0xa0 __x64_sys_ioctl+0x16/0x20 do_syscall_64+0x4c/0x170 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: `dbaba47085` ("x86/split_lock: Rework the initialization flow of split lock detection") Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20200605192605.7439-1-sean.j.christopherson@intel.com	2020-06-30 14:09:31 +02:00
Paolo Bonzini	5ecad245de	KVM: x86: bit 8 of non-leaf PDPEs is not reserved Bit 8 would be the "global" bit, which does not quite make sense for non-leaf page table entries. Intel ignores it; AMD ignores it in PDEs and PDPEs, but reserves it in PML4Es. Probably, earlier versions of the AMD manual documented it as reserved in PDPEs as well, and that behavior made it into KVM as well as kvm-unit-tests; fix it. Cc: stable@vger.kernel.org Reported-by: Nadav Amit <namit@vmware.com> Fixes: `a0c0feb579` ("KVM: x86: reserve bit 8 of non-leaf PDPEs and PML4Es in 64-bit mode on AMD", 2014-09-03) Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-06-30 07:07:20 -04:00
Andreas Gruenbacher	34244d711d	gfs2: Don't sleep during glock hash walk In flush_delete_work, instead of flushing each individual pending delayed work item, cancel and re-queue them for immediate execution. The waiting isn't needed here because we're already waiting for all queued work items to complete in gfs2_flush_delete_work. This makes the code more efficient, but more importantly, it avoids sleeping during a rhashtable walk, inside rcu_read_lock(). Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2020-06-30 13:04:45 +02:00
Bob Peterson	58e08e8d83	gfs2: fix trans slab error when withdraw occurs inside log_flush Log flush operations (gfs2_log_flush()) can target a specific transaction. But if the function encounters errors (e.g. io errors) and withdraws, the transaction was only freed it if was queued to one of the ail lists. If the withdraw occurred before the transaction was queued to the ail1 list, function ail_drain never freed it. The result was: BUG gfs2_trans: Objects remaining in gfs2_trans on __kmem_cache_shutdown() This patch makes log_flush() add the targeted transaction to the ail1 list so that function ail_drain() will find and free it properly. Cc: stable@vger.kernel.org # v5.7+ Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2020-06-30 13:04:45 +02:00
Andreas Gruenbacher	5902f4dd6e	gfs2: Don't return NULL from gfs2_inode_lookup Callers expect gfs2_inode_lookup to return an inode pointer or ERR_PTR(error). Commit `b66648ad6d` caused it to return NULL instead of ERR_PTR(-ESTALE) in some cases. Fix that. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Fixes: `b66648ad6d` ("gfs2: Move inode generation number check into gfs2_inode_lookup") Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>	2020-06-30 13:04:45 +02:00
Thomas Gleixner	98817a84ff	irqchip fixes for Linux 5.8, take #1 - Fix atomicity of affinity update in the GIC driver - Don't sleep in atomic when waiting for a GICv4.1 RD to respond - Fix a couple of typos in user-visible messages -----BEGIN PGP SIGNATURE----- iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAl73IWsPHG1hekBrZXJu ZWwub3JnAAoJECPQ0LrRPXpDyjEP/11ubgu1FihxeOqNarFJfldWlScIR4d1PO4m J4OCFgqWhmbxcVpecs13Gdzbtj8eIqKXAmcrdCiGH6/WxGpcg4N2TD94z9/JXlam yenkyz1tdISTEaj0h6YvrPnBWK8ZUAgJbfz8z7pm2LtpPoEtJ8FrnWKZW5xfqgQL qapaDM1lcpQKvnBXg+vIFyasbb8UBNiDHwkhJdWRwarzX8Mu0qFpUaKxsSFtVZHc 19JsEU90W8uN7pWkTr7rnYG+yVdbm/iku+kUeyycxI30HROU13fdObFrD1a1nKgM xUx6tjvc3neOpeCMsh9WjM9HumjogI0wuONhBWiYXcn/fhAzZqC7zmCnueIS7ID7 Av3O/9SCIMEcC8Z0brlGBYum+4e7PbmTiKMOiiUUej+1cBedXZHWG2BtDopPJ8Fw M5oAEOL9byFsXBbpgMjHnC5yk9vzOBK68rsgVnVSw3T1+QXzbhieGieNDC0UwzKr oxtdzq4VJ7qoXUImTzU51uh1B4GzjCtLFX8EiYL/vXkglrsyregv10MsBKSjLMFK PrwZwqzfJS+zAQJhz4WXo1G7c7h8s6TLAHhEUlWsWiMH51lkDVTlyhPBxsQpfHhP R9db+ZBhSrC5OcIk+1fHCUW3xMCULZ0GgzqfyL8CfIlVv3BaRwS/4vjGNM9uZlBx 2XXiVLfG =x7P8 -----END PGP SIGNATURE----- Merge tag 'irqchip-fixes-5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/urgent Pull irqchip fixes from Marc Zyngier: - Fix atomicity of affinity update in the GIC driver - Don't sleep in atomic when waiting for a GICv4.1 RD to respond - Fix a couple of typos in user-visible messages	2020-06-30 12:07:51 +02:00
Chen-Yu Tsai	bda8eaa6de	drm: sun4i: hdmi: Remove extra HPD polling The HPD sense mechanism in Allwinner's old HDMI encoder hardware is more or less an input-only GPIO. Other GPIO-based HPD implementations directly return the current state, instead of polling for a specific state and returning the other if that times out. Remove the I/O polling from sun4i_hdmi_connector_detect() and directly return a known state based on the current reading. This also gets rid of excessive CPU usage by kworker as reported on Stack Exchange [1] and Armbian forums [2]. [1] https://superuser.com/questions/1515001/debian-10-buster-on-cubietruck-with-bug-in-sun4i-drm-hdmi [2] https://forum.armbian.com/topic/14282-headless-systems-and-sun4i_drm_hdmi-a10a20/ Fixes: `9c5681011a` ("drm/sun4i: Add HDMI support") Signed-off-by: Chen-Yu Tsai <wens@csie.org> Signed-off-by: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20200629060032.24134-1-wens@kernel.org	2020-06-30 10:01:48 +02:00
Sasha Neftin	f637471d33	igc: Remove checking media type during MAC initialization i225 device support only copper mode. There is no point to check media type in the igc_config_fc_after_link_up() method. Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-06-29 18:22:03 -07:00
Sasha Neftin	2b374e3738	igc: Remove unneeded check for copper media type PHY of the i225 device support only copper mode. There is no point to check media type in the igc_power_up_link() method. Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-06-29 18:21:56 -07:00
Sasha Neftin	a0beb3c1b1	igc: Refactor the igc_power_down_link() Currently the implementation of igc_power_down_link() method was just calling igc_power_down_phy_copper_base() method. We can just call igc_power_down_phy_copper_base() method directly. Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-06-29 18:21:53 -07:00
Sasha Neftin	725fa16d36	igc: Remove TCP segmentation TX fail counter TCP segmentation TX context fail counter is not applicable for i225 devices. This patch comes to clean up this counter. Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Aaron Brown<aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-06-29 18:21:48 -07:00
Sasha Neftin	900d1e8b34	igc: Add LPI counters Add EEE TX LPI and EEE RX LPI counters. A EEE TX LPI event occurs when the transmitter enters EEE (IEEE 802.3az) LPI state. A EEE RX LPI event occurs when the receiver detect link partner entry into EEE(IEEE 802.3az) LPI state. Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-06-29 18:21:41 -07:00
Andre Guedes	1cbedabffd	igc: Fix Rx timestamp disabling When Rx timestamping is enabled, we set the timestamp bit in SRRCTL register for each queue, but we don't clear it when disabling. This patch fixes igc_ptp_disable_rx_timestamp() accordingly. Also, this patch gets rid of igc_ptp_enable_tstamp_rxqueue() and igc_ptp_enable_tstamp_all_rxqueues() and move their logic into igc_ptp_enable_rx_timestamp() to keep the enable and disable helpers symmetric. Signed-off-by: Andre Guedes <andre.guedes@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-06-29 18:21:38 -07:00
Andre Guedes	3df7fd799b	igc: Refactor igc_ptp_set_timestamp_mode() Current igc_ptp_set_timestamp_mode() logic is a bit tangled since it handles many different hardware configurations in one single place, making it harder to follow. This patch untangles that code by breaking it into helper functions. Quick note about the hw->mac.type check which was removed in this refactoring: this check it not really needed since igc_i225 is the only type supported by the IGC driver. Signed-off-by: Andre Guedes <andre.guedes@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-06-29 18:21:35 -07:00
Andre Guedes	3b44d4c10c	igc: Remove UDP filter setup in PTP code As implemented in igc_ethtool_get_ts_info(), igc only supports HWTSTAMP_ FILTER_ALL so any HWTSTAMP_FILTER_* option the user may set falls back to HWTSTAMP_FILTER_ALL. HWTSTAMP_FILTER_ALL is implemented via Rx Time Sync Control (TSYNCRXCTL) configuration which timestamps all incoming packets. Configuring a UDP filter, in addition to TSYNCRXCTL, doesn't add much so this patch removes that code. It also takes this opportunity to remove some non-applicable comments. Signed-off-by: Andre Guedes <andre.guedes@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-06-29 18:21:29 -07:00
Andre Guedes	1801f8d929	igc: Check __IGC_PTP_TX_IN_PROGRESS instead of ptp_tx_skb The __IGC_PTP_TX_IN_PROGRESS flag indicates we have a pending Tx timestamp. In some places, instead of checking that flag, we check adapter->ptp_tx_skb. This patch fixes those places to use the flag. Quick note about igc_ptp_tx_hwtstamp() change: when that function is called, adapter->ptp_tx_skb is expected to be valid always so we WARN_ON_ONCE() in case it is not. Quick note about igc_ptp_suspend() change: when suspending, we don't really need to check if there is a pending timestamp. We can simply clear it unconditionally. Signed-off-by: Andre Guedes <andre.guedes@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-06-29 18:21:24 -07:00
Andre Guedes	29b821fe97	igc: Remove duplicate code in Tx timestamp handling The functions igc_ptp_tx_hang() and igc_ptp_tx_work() have duplicate code which handles Tx timestamp timeouts. This patch does a trivial refactoring by moving that code to its own function and reusing it. Signed-off-by: Andre Guedes <andre.guedes@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-06-29 18:21:19 -07:00
Andre Guedes	3a66abe903	igc: Clean up Rx timestamping logic Differently from I210, I225 doesn't report Rx timestamps via the TS bit Rx descriptor + RXSTMPL/RXSTMPH registers mechanism. Rx timestamps are reported in the packet buffer only, which is implemented by igc_ptp_rx_ pktstamp(). So this patch removes igc_ptp_rx_rgtstamp() and all code related to it, copied from igb driver. Signed-off-by: Andre Guedes <andre.guedes@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-06-29 18:21:16 -07:00
Sasha Neftin	707abf0695	igc: Add initial LTR support The LTR message on the PCIe inform the requested latency on which the PCIe must become active to the downstream PCIe port of the system. This patch provide recommended LTR parameters by i225 specification. Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2020-06-29 18:21:09 -07:00
David S. Miller	2dac017dbd	Merge branch 'Add-ethtool-extended-link-state' Ido Schimmel says: ==================== Add ethtool extended link state Amit says: Currently, device drivers can only indicate to user space if the network link is up or down, without additional information. This patch set provides an infrastructure that allows these drivers to expose more information to user space about the link state. The information can save users' time when trying to understand why a link is not operationally up, for example. The above is achieved by extending the existing ethtool LINKSTATE_GET command with attributes that carry the extended state. For example, no link due to missing cable: $ ethtool ethX ... Link detected: no (No cable) Beside the general extended state, drivers can pass additional information about the link state using the sub-state field. For example: $ ethtool ethX ... Link detected: no (Autoneg, No partner detected) In the future the infrastructure can be extended - for example - to allow PHY drivers to report whether a downshift to a lower speed occurred. Something like: $ ethtool ethX ... Link detected: yes (downshifted) Patch set overview: Patches #1-#3 move mlxsw ethtool code to a separate file Patches #4-#5 add the ethtool infrastructure for extended link state Patches #6-#7 add support of extended link state in the mlxsw driver Patches #8-#10 add test cases Changes since v1: * In documentation, show ETHTOOL_LINK_EXT_STATE_* and ETHTOOL_LINK_EXT_SUBSTATE_* constants instead of user-space strings * Add `_CI_` to cable_issue substates to be consistent with other substates * Keep the commit messages within 75 columns * Use u8 variable for __link_ext_substate * Document the meaning of -ENODATA in get_link_ext_state() callback description * Do not zero data->link_ext_state_provided after getting an error * Use `ret` variable for error value Changes since RFC: * Move documentation patch before ethtool patch * Add nla_total_size() instead of sizeof() directly * Return an error code from linkstate_get_ext_state() * Remove SHORTED_CABLE, add CABLE_TEST_FAILURE instead * Check if the interface is administratively up before setting ext_state * Document all sub-states ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-29 17:45:02 -07:00
Amit Cohen	7d10bcce98	selftests: forwarding: Add tests for ethtool extended state Add tests to check ethtool report about extended state. The tests configure several states and verify that the correct extended state is reported by ethtool. Check extended state with substate (Autoneg) and extended state without substate (No cable). Signed-off-by: Amit Cohen <amitc@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-29 17:45:02 -07:00
Amit Cohen	0433045c27	selftests: forwarding: forwarding.config.sample: Add port with no cable connected Add NETIF_NO_CABLE port to tests topology. The port can also be declared as an environment variable and tests can be run like that: NETIF_NO_CABLE=eth9 ./test.sh eth{1..8} The NETIF_NO_CABLE port will be used by ethtool_extended_state test. Signed-off-by: Amit Cohen <amitc@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2020-06-29 17:45:02 -07:00

... 15 16 17 18 19 ...

935158 Commits