linux

mirror of https://github.com/torvalds/linux.git synced 2024-11-15 00:21:59 +00:00

Author	SHA1	Message	Date
Linus Torvalds	8967605e7d	block-5.18-2022-05-06 -----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmJ10nMQHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgpmkYD/9vfbQhyxrDH/GfKua18QmfQARphw6Ihegc NSVjJ38dohNiLekNKgTmquWtVl/s6g/roa0+zzUF66eML/6SnKRoVZHh6cJ5Vopk lpj6Pzb6COu+Vo7lWysDHCT4g2iYr6MaAgKKFrsVG6PkQ1vSxx2w6nQwInDHY2hG 4YL+NM8ID5SHzmfr9XYpAkDu6XoU1rVpq+JnZPzejWHUUwxMYMcNmLjZio/ysE6G 0NrW4LXmu1gJxv4+9betVjNin5CU7LLhxdigSTs/nGlqi9I9lq02tWLhUw7swTD3 xWReoxsKWzRzLB2Nb5lsERrv37XKSghkLuQkoa+gtr7wTfvUCZKePBrZ/BE0VzYf bnXaFA0gOa/H4P2AB68ZB8WaNhxtYZsW2PXDei4ramUFlPkyKstwAKxF2ViiBHKC VrR2aQ82VUm2b3iY1QnbxVAuiPLGT/t4RFJyaiFbA5dzEyd3ofMUTTz7m62K4khQ HlQLwl4jN+vVmDsvNr5zb1N1xZaChiyDJPGqofGAWbAWjdBsEuTt644mnn341/3r VluH+Eswkvi6SMn990QbI3TmqWWmyzUc7DafrK2V3mKlUu9swch34fFV5arU0xSJ 7bJ+vL75U5A8NWCOOLML2BPX9L4gVIpaKg0PHQ+Rhlc0/STQdaXa0TgrUVNB8YMS O4ubYf/raA== =zQyX -----END PGP SIGNATURE----- Merge tag 'block-5.18-2022-05-06' of git://git.kernel.dk/linux-block Pull block fixes from Jens Axboe: "A single revert for a change that isn't needed in 5.18, and a small series for s390/dasd" * tag 'block-5.18-2022-05-06' of git://git.kernel.dk/linux-block: s390/dasd: Use kzalloc instead of kmalloc/memset s390/dasd: Fix read inconsistency for ESE DASD devices s390/dasd: Fix read for ESE with blksize < 4k s390/dasd: prevent double format of tracks for ESE devices s390/dasd: fix data corruption for ESE devices Revert "block: release rq qos structures for queue without disk"	2022-05-07 10:47:51 -07:00
Linus Torvalds	b366bd7d96	io_uring-5.18-2022-05-06 -----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmJ10okQHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgprhNEADcc6I72dvyd5RVh0bAfWIVlk4QUStGG5rf AP+DdefmQBZaipjdkZ3MRUJ5brr7BD2Ioizo6vShDq5uymfWVS5EP3vKYkLSrBO1 7cmPtNG1qEwnwNEVtDDKKrZwFe5qM/3ayrlBcGQG3SNSLhYe44+Plc3TmrFNzCSi bTHphi06+iBujBG/o3G1mlo307Y9GJboxGEEVUkzyvUZ2wB+NLMAJ04fpSrMPGIa be/Y2PLcM4FOlM8HnUEcjfjQUuygBW7iarcGZyckkfKJOMkRcHwMvl9Z3Dyi4byH srBn2KQINlsLOVbn4q0U76dgqHlJ1lpARNHuDOEBp5DbcF2XX/ZrEaBmLFD1LNpb 0U+uZ9H82lBj2bH5iaz4se2/oGbhAlo77eGGshSMzlLE9BVd62JBmdIqYJhNlgxi CKM2WZZJAJSS4ftiSyJIkZ0ty1iM3vgvj0sbLe7gIZYpu9MhIUVICESXTim2VCd4 lAJtKJTIL2ad//WLgfTRWzSUOVCzQIpOMezIkz4lJWFZ+xOh8WBl5l7wXvVb21Ld JHOXcq/OFZJfEIRE2k0qWzQPjqOohwASDi5hI3ceNM2ZjjgmDtZmcdgajT9u4KFl q9dLW6e5DdMAIjAOne/1mVnq38xj4jhrV3rtIAcjU4ESXUACTThThbLUcRNzEz+4 Hpo1FNnltQ== =1wM2 -----END PGP SIGNATURE----- Merge tag 'io_uring-5.18-2022-05-06' of git://git.kernel.dk/linux-block Pull io_uring fix from Jens Axboe: "Just a single file assignment fix this week" * tag 'io_uring-5.18-2022-05-06' of git://git.kernel.dk/linux-block: io_uring: assign non-fixed early for async work	2022-05-07 10:41:41 -07:00
Javier Martinez Canillas	1b5853dfab	fbdev: efifb: Fix a use-after-free due early fb_info cleanup Commit `d258d00fb9` ("fbdev: efifb: Cleanup fb_info in .fb_destroy rather than .remove") attempted to fix a use-after-free error due driver freeing the fb_info in the .remove handler instead of doing it in .fb_destroy. But ironically that change introduced yet another use-after-free since the fb_info was still used after the free. This should fix for good by freeing the fb_info at the end of the handler. Fixes: `d258d00fb9` ("fbdev: efifb: Cleanup fb_info in .fb_destroy rather than .remove") Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reported-by: Andrzej Hajda <andrzej.hajda@intel.com> Signed-off-by: Javier Martinez Canillas <javierm@redhat.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Thomas Zimmermann <tzimemrmann@suse.de> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220506132225.588379-1-javierm@redhat.com	2022-05-07 09:05:48 -07:00
Kees Cook	1c7ab9cd98	net: chelsio: cxgb4: Avoid potential negative array offset Using min_t(int, ...) as a potential array index implies to the compiler that negative offsets should be allowed. This is not the case, though. Replace "int" with "unsigned int". Fixes the following warning exposed under future CONFIG_FORTIFY_SOURCE improvements: In file included from include/linux/string.h:253, from include/linux/bitmap.h:11, from include/linux/cpumask.h:12, from include/linux/smp.h:13, from include/linux/lockdep.h:14, from include/linux/rcupdate.h:29, from include/linux/rculist.h:11, from include/linux/pid.h:5, from include/linux/sched.h:14, from include/linux/delay.h:23, from drivers/net/ethernet/chelsio/cxgb4/t4_hw.c:35: drivers/net/ethernet/chelsio/cxgb4/t4_hw.c: In function 't4_get_raw_vpd_params': include/linux/fortify-string.h:46:33: warning: '__builtin_memcpy' pointer overflow between offset 29 and size [2147483648, 4294967295] [-Warray-bounds] 46 \| #define __underlying_memcpy __builtin_memcpy \| ^ include/linux/fortify-string.h:388:9: note: in expansion of macro '__underlying_memcpy' 388 \| __underlying_##op(p, q, __fortify_size); \ \| ^~~~~~~~~~~~~ include/linux/fortify-string.h:433:26: note: in expansion of macro '__fortify_memcpy_chk' 433 \| #define memcpy(p, q, s) __fortify_memcpy_chk(p, q, s, \ \| ^~~~~~~~~~~~~~~~~~~~ drivers/net/ethernet/chelsio/cxgb4/t4_hw.c:2796:9: note: in expansion of macro 'memcpy' 2796 \| memcpy(p->id, vpd + id, min_t(int, id_len, ID_LEN)); \| ^~~~~~ include/linux/fortify-string.h:46:33: warning: '__builtin_memcpy' pointer overflow between offset 0 and size [2147483648, 4294967295] [-Warray-bounds] 46 \| #define __underlying_memcpy __builtin_memcpy \| ^ include/linux/fortify-string.h:388:9: note: in expansion of macro '__underlying_memcpy' 388 \| __underlying_##op(p, q, __fortify_size); \ \| ^~~~~~~~~~~~~ include/linux/fortify-string.h:433:26: note: in expansion of macro '__fortify_memcpy_chk' 433 \| #define memcpy(p, q, s) __fortify_memcpy_chk(p, q, s, \ \| ^~~~~~~~~~~~~~~~~~~~ drivers/net/ethernet/chelsio/cxgb4/t4_hw.c:2798:9: note: in expansion of macro 'memcpy' 2798 \| memcpy(p->sn, vpd + sn, min_t(int, sn_len, SERNUM_LEN)); \| ^~~~~~ Additionally remove needless cast from u8[] to char * in last strim() call. Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/lkml/202205031926.FVP7epJM-lkp@intel.com Fixes: `fc9279298e` ("cxgb4: Search VPD with pci_vpd_find_ro_info_keyword()") Fixes: `24c521f81c` ("cxgb4: Use pci_vpd_find_id_string() to find VPD ID string") Cc: Raju Rangoju <rajur@chelsio.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20220505233101.1224230-1-keescook@chromium.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-05-06 15:41:27 -07:00
Eric Dumazet	d5076fe404	netlink: do not reset transport header in netlink_recvmsg() netlink_recvmsg() does not need to change transport header. If transport header was needed, it should have been reset by the producer (netlink_dump()), not the consumer(s). The following trace probably happened when multiple threads were using MSG_PEEK. BUG: KCSAN: data-race in netlink_recvmsg / netlink_recvmsg write to 0xffff88811e9f15b2 of 2 bytes by task 32012 on cpu 1: skb_reset_transport_header include/linux/skbuff.h:2760 [inline] netlink_recvmsg+0x1de/0x790 net/netlink/af_netlink.c:1978 sock_recvmsg_nosec net/socket.c:948 [inline] sock_recvmsg net/socket.c:966 [inline] __sys_recvfrom+0x204/0x2c0 net/socket.c:2097 __do_sys_recvfrom net/socket.c:2115 [inline] __se_sys_recvfrom net/socket.c:2111 [inline] __x64_sys_recvfrom+0x74/0x90 net/socket.c:2111 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae write to 0xffff88811e9f15b2 of 2 bytes by task 32005 on cpu 0: skb_reset_transport_header include/linux/skbuff.h:2760 [inline] netlink_recvmsg+0x1de/0x790 net/netlink/af_netlink.c:1978 ____sys_recvmsg+0x162/0x2f0 ___sys_recvmsg net/socket.c:2674 [inline] __sys_recvmsg+0x209/0x3f0 net/socket.c:2704 __do_sys_recvmsg net/socket.c:2714 [inline] __se_sys_recvmsg net/socket.c:2711 [inline] __x64_sys_recvmsg+0x42/0x50 net/socket.c:2711 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x2b/0x70 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae value changed: 0xffff -> 0x0000 Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 32005 Comm: syz-executor.4 Not tainted 5.18.0-rc1-syzkaller-00328-ge1f700ebd6be-dirty #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Link: https://lore.kernel.org/r/20220505161946.2867638-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-05-06 15:37:36 -07:00
Christophe JAILLET	ab244be47a	drm/nouveau: Fix a potential theorical leak in nouveau_get_backlight_name() If successful ida_simple_get() calls are not undone when needed, some additional memory may be allocated and wasted. Here, an ID between 0 and MAX_INT is required. If this ID is >=100, it is not taken into account and is wasted. It should be released. Instead of calling ida_simple_remove(), take advantage of the 'max' parameter to require the ID not to be too big. Should it be too big, it is not allocated and don't need to be freed. While at it, use ida_alloc_xxx()/ida_free() instead to ida_simple_get()/ida_simple_remove(). The latter is deprecated and more verbose. Fixes: `db1a0ae214` ("drm/nouveau/bl: Assign different names to interfaces") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Lyude Paul <lyude@redhat.com> [Fixed formatting warning from checkpatch] Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/9ba85bca59df6813dc029e743a836451d5173221.1644386541.git.christophe.jaillet@wanadoo.fr	2022-05-06 17:35:00 -04:00
Linus Torvalds	4b97bac075	for-5.18-rc5-tag -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmJ1X+YACgkQxWXV+ddt WDs75g/8DsrDBe/Sbn/bDLRw30D0StC5xRhaVJLNcAhSfIyyoBo0EKifd2nXdNSn rdc3pvRToPh2X1YQ/5FmtxuYW12N4pVOfWtXKdoFFXabMVJetDBSnS8KNzxBd9Ys OkiKj2qb8H+1uNwWVLxfbFBNWWPyCQLe/2DDmePxOAszs4YUPvwxffyA8oclyHb5 wW0qOJAC76Lz6y5Wo/GJrtAdlvWwb3t8+IDUKgGPT1BEIOE/fna+MhvEwqvCdY9F PPU5sWIXkAv3addgrcBHVX5HdAzRC0jAv2lHsttfu4dEOmTzw8dCTUh/IzSRa3fj IVy7AIGR+ZR++d4tOhAEZBe1KAqntY3UVmXY19cKTHOLLWFjv4XkKw8KJzsDiHUt rczedAFdgRRo9QIgwSsAU9Zi8DSG56BSenxsqFzqiL5BDDX1bUFXCegNYR42GfB/ 8E89eYkBCXxP6XeA1+44EOalCdqLKuQOyEijTUxn0UDGdqHHB1Gd/IUQDU5fpCqo kX6gdgMhNmjGJ/zETfOdFrYZaHYJUiiIRc6z8SCgM3JIPBgVw0FAa99Hs8Pl+eJn idmpfnHn6Xvmq46FISVRWgolkuj7VzTEOM65rNgOY3889Vk9Qt2qjI/DvYPGDs9Y 9PQI1FY+2E+ZqbNY8dZpXDii+6y36aAmGR1B10x3545+C36CA/0= =S4KK -----END PGP SIGNATURE----- Merge tag 'for-5.18-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "Regression fixes in zone activation: - move a loop invariant out of the loop to avoid checking space status - properly handle unlimited activation Other fixes: - for subpage, force the free space v2 mount to avoid a warning and make it easy to switch a filesystem on different page size systems - export sysfs status of exclusive operation 'balance paused', so the user space tools can recognize it and allow adding a device with paused balance - fix assertion failure when logging directory key range item" * tag 'for-5.18-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: sysfs: export the balance paused state of exclusive operation btrfs: fix assertion failure when logging directory key range item btrfs: zoned: activate block group properly on unlimited active zone device btrfs: zoned: move non-changing condition check out of the loop btrfs: force v2 space cache usage for subpage mount	2022-05-06 14:32:16 -07:00
Robin Murphy	87fd2b091f	drm/nouveau/tegra: Stop using iommu_present() Even if some IOMMU has registered itself on the platform "bus", that doesn't necessarily mean it provides translation for the device we care about. Replace iommu_present() with a more appropriate check. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Lyude Paul <lyude@redhat.com> [added cc for stable] Signed-off-by: Lyude Paul <lyude@redhat.com> Cc: stable@vger.kernel.org # v5.0+ Link: https://patchwork.freedesktop.org/patch/msgid/70d40ea441da3663c2824d54102b471e9a621f8a.1649168494.git.robin.murphy@arm.com	2022-05-06 17:23:17 -04:00
Linus Torvalds	adcffc1716	NFS client bugfixes for Linux 5.18 Highlights include: Stable fixes: - Fix a socket leak when setting up an AF_LOCAL RPC client - Ensure that knfsd connects to the gss-proxy daemon on setup Bugfixes: - Fix a refcount leak when migrating a task off an offlined transport - Don't gratuitously invalidate inode attributes on delegation return - Don't leak sockets in xs_local_connect() - Ensure timely close of disconnected AF_LOCAL sockets -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEESQctxSBg8JpV8KqEZwvnipYKAPIFAmJ1f3wACgkQZwvnipYK APKKIRAAmVUswcfRQ9wSz5wW6DCFU9hdsN9JD4pAcPvWAYGo8fqmn3I3qe/iaBCf rrJF38SfQVygtthmAY4CBBwOiVxm2fvqanML2lta+ZUU15MqoH2px3kMYemRyulJ 9/2yP25AUSgkmwdEmm69hIXJkEJa3dsjg+LajQZ5X01DgKSfpObS5s9t/upM9kve Wqz5QRr+aJnZuuYYJWxNmXZ4XQEkzHccg3aSswB6bEsEGNXKo8NnWryrSMnWTW1y rQCb0e+gxpoVFgV3ngP1r9xT2l2ISbJIIhTPoj5hSjSVlFvQlIEyHtGA2vuIEZH9 hPJAnaSc7Xb+QER6XfZkTxjW+jtMl5OmMKkWUcUmHiYv2KIM8dUAd3ANnbDBCUvw C5bGF907Qjqs5d2VdfsbisT9ikyn+xw6SFxcr9HYyH2T3dIsC1A8P9uUvn/afwUQ EPfQIsIEDeufo6O8KLfF+gCO9kbk9rdaP8Bv3B2H94aRs1yYde9bJpa7QABncGbA otWehkX/AbrIa4Zjp1ELzcVJxlIl+/AtxzCdGY2me1Ds388U/RKsyDWwXuGynLP6 98ycdtHWVyoJ48L5kZowuj8/3tEB998En5hh0HSuAd0DYkAuGxaSGb+iuwKi/M0H +D1wZxef49r2ggQkEOsllTEjJKSHcq1+vCVASZ8ITEbcVUSiO90= =LSoH -----END PGP SIGNATURE----- Merge tag 'nfs-for-5.18-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs Pull NFS client fixes from Trond Myklebust: "Highlights include: Stable fixes: - Fix a socket leak when setting up an AF_LOCAL RPC client - Ensure that knfsd connects to the gss-proxy daemon on setup Bugfixes: - Fix a refcount leak when migrating a task off an offlined transport - Don't gratuitously invalidate inode attributes on delegation return - Don't leak sockets in xs_local_connect() - Ensure timely close of disconnected AF_LOCAL sockets" * tag 'nfs-for-5.18-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: Revert "SUNRPC: attempt AF_LOCAL connect on setup" SUNRPC: Ensure gss-proxy connects on setup SUNRPC: Ensure timely close of disconnected AF_LOCAL sockets SUNRPC: Don't leak sockets in xs_local_connect() NFSv4: Don't invalidate inode attributes on delegation return SUNRPC release the transport of a relocated task with an assigned transport	2022-05-06 13:19:11 -07:00
Lokesh Dhoundiyal	9e6c6d17d1	ipv4: drop dst in multicast routing path kmemleak reports the following when routing multicast traffic over an ipsec tunnel. Kmemleak output: unreferenced object 0x8000000044bebb00 (size 256): comm "softirq", pid 0, jiffies 4294985356 (age 126.810s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 80 00 00 00 05 13 74 80 ..............t. 80 00 00 00 04 9b bf f9 00 00 00 00 00 00 00 00 ................ backtrace: [<00000000f83947e0>] __kmalloc+0x1e8/0x300 [<00000000b7ed8dca>] metadata_dst_alloc+0x24/0x58 [<0000000081d32c20>] __ipgre_rcv+0x100/0x2b8 [<00000000824f6cf1>] gre_rcv+0x178/0x540 [<00000000ccd4e162>] gre_rcv+0x7c/0xd8 [<00000000c024b148>] ip_protocol_deliver_rcu+0x124/0x350 [<000000006a483377>] ip_local_deliver_finish+0x54/0x68 [<00000000d9271b3a>] ip_local_deliver+0x128/0x168 [<00000000bd4968ae>] xfrm_trans_reinject+0xb8/0xf8 [<0000000071672a19>] tasklet_action_common.isra.16+0xc4/0x1b0 [<0000000062e9c336>] __do_softirq+0x1fc/0x3e0 [<00000000013d7914>] irq_exit+0xc4/0xe0 [<00000000a4d73e90>] plat_irq_dispatch+0x7c/0x108 [<000000000751eb8e>] handle_int+0x16c/0x178 [<000000001668023b>] _raw_spin_unlock_irqrestore+0x1c/0x28 The metadata dst is leaked when ip_route_input_mc() updates the dst for the skb. Commit `f38a9eb1f7` ("dst: Metadata destinations") correctly handled dropping the dst in ip_route_input_slow() but missed the multicast case which is handled by ip_route_input_mc(). Drop the dst in ip_route_input_mc() avoiding the leak. Fixes: `f38a9eb1f7` ("dst: Metadata destinations") Signed-off-by: Lokesh Dhoundiyal <lokesh.dhoundiyal@alliedtelesis.co.nz> Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20220505020017.3111846-1-chris.packham@alliedtelesis.co.nz Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-05-06 12:46:38 -07:00
Linus Torvalds	bce58da1f3	x86: * Account for family 17h event renumberings in AMD PMU emulation * Remove CPUID leaf 0xA on AMD processors * Fix lockdep issue with locking all vCPUs * Fix loss of A/D bits in SPTEs * Fix syzkaller issue with invalid guest state -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmJ1Vf4UHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroNaUQgAgygZ2KsejlJCYGtEkAsjcpdzmPVL 8j42nWB673/PLZ6GrDXcFnRwQaBIT+0YrES5VHTkTI996d2T/yHII2L4G3DQtUGm 6L3qYqrjJlX2WjbYGvYzkJ6m4EzcstUfPYNO2Qzfvbl2y/wz64HlAhNdymwMX2UU GPUVoo3EHeobJdZVKFMe7eI6r/uY1/uPdsKqNjnlWI73op+tc7mMRN5+SlQDgQvR kmzw+Nk0J+PERQO+D+fm1vUdXDQ8hiI7LtTBIUX7rf47IqVlHNHC8frC94PX3W3E l2sVS+LzRQRqCgFgQ2ay2gYkl078VL8z4A6vWpcWSmaToEYE7VcAnHqb0Q== =6gt2 -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm fixes from Paolo Bonzini: "x86: - Account for family 17h event renumberings in AMD PMU emulation - Remove CPUID leaf 0xA on AMD processors - Fix lockdep issue with locking all vCPUs - Fix loss of A/D bits in SPTEs - Fix syzkaller issue with invalid guest state" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: VMX: Exit to userspace if vCPU has injected exception and invalid state KVM: SEV: Mark nested locking of vcpu->lock kvm: x86/cpuid: Only provide CPUID leaf 0xA if host has architectural PMU KVM: x86/svm: Account for family 17h event renumberings in amd_pmc_perf_hw_id KVM: x86/mmu: Use atomic XCHG to write TDP MMU SPTEs with volatile bits KVM: x86/mmu: Move shadow-present check out of spte_has_volatile_bits() KVM: x86/mmu: Don't treat fully writable SPTEs as volatile (modulo A/D)	2022-05-06 11:42:58 -07:00
Linus Torvalds	497fe3bb19	RISC-V Fix for 5.18-rc6 * A fix to relocate the DTB early in boot, in cases where the bootloader doesn't put the DTB in a region that will end up mapped by the kernel. This manifests as a crash early in boot on a handful of configurations. -----BEGIN PGP SIGNATURE----- iQJHBAABCAAxFiEEAM520YNJYN/OiG3470yhUCzLq0EFAmJ1TjQTHHBhbG1lckBk YWJiZWx0LmNvbQAKCRDvTKFQLMurQQLmD/4477/Ax+QlHXJ4stM1m4VaXybzJ1qb gvjsV/xnOwwLcQU6683B6mm0LQ+LGXfJHwveYwMQs1dsoWEeOjmwpadO5qjfNv55 dbJh/9QtS4BYWHBySvumU2hv3Wyn4kO+7f/bLoxgIR0CupMBslTdP1oo8jFQeHBt bL/wWUU/MVEg9FAB0t2cSkaJVcxvCkyRk5MYJUok2eHedTpwOwtIGTnSpv00EA8z V1aDN3Y8MJuqRVSsbN/hGCKB1WrKkP5K7qxnvp5tH+68G3t30zc5NvVtrH+VZNiT YTod4kf5Y3JlekboNz17O/WDP3BjpP5QBga9K7m64de9vSZjiEtM4Ze4i3A/C1xe Z/LaizIy1H+1ehA3tWPQH6MwLFVUx4XNVWPQfDhAGtAyXTA5Epl028V9mvvUClfg l63f9cWEEGsy2DVg9kU9MfzNdro5iJARL/pYUFQCyoEiUQIkl7E1Eh9XXqQwIozK 3rhvRL9DbELYTK65xKUXwBZcnuCBgUPNQvf7/AZ02qPMQzmy+NlWQLKYQvEYCH5U IbYV0LCwpYAi2tdxgkzv7wkI86m5NLtRRKeD/eZLMEozlkkxVtq1fPeVXTDA2d8N RdsF16H9CW9Bdf+COyhR4xn2IpxqSV/aVOsjYA0F87D5GUJk+AQp4w7Pm2NdH2fb yw8HwqrnD8+66A== =JBfr -----END PGP SIGNATURE----- Merge tag 'riscv-for-linus-5.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fix from Palmer Dabbelt: - A fix to relocate the DTB early in boot, in cases where the bootloader doesn't put the DTB in a region that will end up mapped by the kernel. This manifests as a crash early in boot on a handful of configurations. * tag 'riscv-for-linus-5.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: RISC-V: relocate DTB if it's outside memory region	2022-05-06 11:30:59 -07:00
Michal Michalik	a11b6c1a38	ice: fix PTP stale Tx timestamps cleanup Read stale PTP Tx timestamps from PHY on cleanup. After running out of Tx timestamps request handlers, hardware (HW) stops reporting finished requests. Function ice_ptp_tx_tstamp_cleanup() used to only clean up stale handlers in driver and was leaving the hardware registers not read. Not reading stale PTP Tx timestamps prevents next interrupts from arriving and makes timestamping unusable. Fixes: `ea9b847cda` ("ice: enable transmit timestamps for E810 devices") Signed-off-by: Michal Michalik <michal.michalik@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-05-06 10:09:24 -07:00
Anatolii Gerasymenko	6096dae926	ice: clear stale Tx queue settings before configuring The iAVF driver uses 3 virtchnl op codes to communicate with the PF regarding the VF Tx queues: * VIRTCHNL_OP_CONFIG_VSI_QUEUES configures the hardware and firmware logic for the Tx queues * VIRTCHNL_OP_ENABLE_QUEUES configures the queue interrupts * VIRTCHNL_OP_DISABLE_QUEUES disables the queue interrupts and Tx rings. There is a bug in the iAVF driver due to the race condition between VF reset request and shutdown being executed in parallel. This leads to a break in logic and VIRTCHNL_OP_DISABLE_QUEUES is not being sent. If this occurs, the PF driver never cleans up the Tx queues. This results in leaving behind stale Tx queue settings in the hardware and firmware. The most obvious outcome is that upon the next VIRTCHNL_OP_CONFIG_VSI_QUEUES, the PF will fail to program the Tx scheduler node due to a lack of space. We need to protect ICE driver against such situation. To fix this, make sure we clear existing stale settings out when handling VIRTCHNL_OP_CONFIG_VSI_QUEUES. This ensures we remove the previous settings. Calling ice_vf_vsi_dis_single_txq should be safe as it will do nothing if the queue is not configured. The function already handles the case when the Tx queue is not currently configured and exits with a 0 return in that case. Fixes: `7ad15440ac` ("ice: Refactor VIRTCHNL_OP_CONFIG_VSI_QUEUES handling") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Anatolii Gerasymenko <anatolii.gerasymenko@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-05-06 10:09:24 -07:00
Ivan Vecera	486b9eee57	ice: Fix race during aux device (un)plugging Function ice_plug_aux_dev() assigns pf->adev field too early prior aux device initialization and on other side ice_unplug_aux_dev() starts aux device deinit and at the end assigns NULL to pf->adev. This is wrong because pf->adev should always be non-NULL only when aux device is fully initialized and ready. This wrong order causes a crash when ice_send_event_to_aux() call occurs because that function depends on non-NULL value of pf->adev and does not assume that aux device is half-initialized or half-destroyed. After order correction the race window is tiny but it is still there, as Leon mentioned and manipulation with pf->adev needs to be protected by mutex. Fix (un-)plugging functions so pf->adev field is set after aux device init and prior aux device destroy and protect pf->adev assignment by new mutex. This mutex is also held during ice_send_event_to_aux() call to ensure that aux device is valid during that call. Note that device lock used ice_send_event_to_aux() needs to be kept to avoid race with aux drv unload. Reproducer: cycle=1 while :;do echo "#### Cycle: $cycle" ip link set ens7f0 mtu 9000 ip link add bond0 type bond mode 1 miimon 100 ip link set bond0 up ifenslave bond0 ens7f0 ip link set bond0 mtu 9000 ethtool -L ens7f0 combined 1 ip link del bond0 ip link set ens7f0 mtu 1500 sleep 1 let cycle++ done In short when the device is added/removed to/from bond the aux device is unplugged/plugged. When MTU of the device is changed an event is sent to aux device asynchronously. This can race with (un)plugging operation and because pf->adev is set too early (plug) or too late (unplug) the function ice_send_event_to_aux() can touch uninitialized or destroyed fields. In the case of crash below pf->adev->dev.mutex. Crash: [ 53.372066] bond0: (slave ens7f0): making interface the new active one [ 53.378622] bond0: (slave ens7f0): Enslaving as an active interface with an u p link [ 53.386294] IPv6: ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready [ 53.549104] bond0: (slave ens7f1): Enslaving as a backup interface with an up link [ 54.118906] ice 0000:ca:00.0 ens7f0: Number of in use tx queues changed inval idating tc mappings. Priority traffic classification disabled! [ 54.233374] ice 0000:ca:00.1 ens7f1: Number of in use tx queues changed inval idating tc mappings. Priority traffic classification disabled! [ 54.248204] bond0: (slave ens7f0): Releasing backup interface [ 54.253955] bond0: (slave ens7f1): making interface the new active one [ 54.274875] bond0: (slave ens7f1): Releasing backup interface [ 54.289153] bond0 (unregistering): Released all slaves [ 55.383179] MII link monitoring set to 100 ms [ 55.398696] bond0: (slave ens7f0): making interface the new active one [ 55.405241] BUG: kernel NULL pointer dereference, address: 0000000000000080 [ 55.405289] bond0: (slave ens7f0): Enslaving as an active interface with an u p link [ 55.412198] #PF: supervisor write access in kernel mode [ 55.412200] #PF: error_code(0x0002) - not-present page [ 55.412201] PGD 25d2ad067 P4D 0 [ 55.412204] Oops: 0002 [#1] PREEMPT SMP NOPTI [ 55.412207] CPU: 0 PID: 403 Comm: kworker/0:2 Kdump: loaded Tainted: G S 5.17.0-13579-g57f2d6540f03 #1 [ 55.429094] bond0: (slave ens7f1): Enslaving as a backup interface with an up link [ 55.430224] Hardware name: Dell Inc. PowerEdge R750/06V45N, BIOS 1.4.4 10/07/ 2021 [ 55.430226] Workqueue: ice ice_service_task [ice] [ 55.468169] RIP: 0010:mutex_unlock+0x10/0x20 [ 55.472439] Code: 0f b1 13 74 96 eb e0 4c 89 ee eb d8 e8 79 54 ff ff 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 65 48 8b 04 25 40 ef 01 00 31 d2 <f0> 48 0f b1 17 75 01 c3 e9 e3 fe ff ff 0f 1f 00 0f 1f 44 00 00 48 [ 55.491186] RSP: 0018:ff4454230d7d7e28 EFLAGS: 00010246 [ 55.496413] RAX: ff1a79b208b08000 RBX: ff1a79b2182e8880 RCX: 0000000000000001 [ 55.503545] RDX: 0000000000000000 RSI: ff4454230d7d7db0 RDI: 0000000000000080 [ 55.510678] RBP: ff1a79d1c7e48b68 R08: ff4454230d7d7db0 R09: 0000000000000041 [ 55.517812] R10: 00000000000000a5 R11: 00000000000006e6 R12: ff1a79d1c7e48bc0 [ 55.524945] R13: 0000000000000000 R14: ff1a79d0ffc305c0 R15: 0000000000000000 [ 55.532076] FS: 0000000000000000(0000) GS:ff1a79d0ffc00000(0000) knlGS:0000000000000000 [ 55.540163] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 55.545908] CR2: 0000000000000080 CR3: 00000003487ae003 CR4: 0000000000771ef0 [ 55.553041] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 55.560173] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 55.567305] PKRU: 55555554 [ 55.570018] Call Trace: [ 55.572474] <TASK> [ 55.574579] ice_service_task+0xaab/0xef0 [ice] [ 55.579130] process_one_work+0x1c5/0x390 [ 55.583141] ? process_one_work+0x390/0x390 [ 55.587326] worker_thread+0x30/0x360 [ 55.590994] ? process_one_work+0x390/0x390 [ 55.595180] kthread+0xe6/0x110 [ 55.598325] ? kthread_complete_and_exit+0x20/0x20 [ 55.603116] ret_from_fork+0x1f/0x30 [ 55.606698] </TASK> Fixes: `f9f5301e7e` ("ice: Register auxiliary device to provide RDMA") Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2022-05-06 10:09:24 -07:00
Sean Christopherson	053d2290c0	KVM: VMX: Exit to userspace if vCPU has injected exception and invalid state Exit to userspace with an emulation error if KVM encounters an injected exception with invalid guest state, in addition to the existing check of bailing if there's a pending exception (KVM doesn't support emulating exceptions except when emulating real mode via vm86). In theory, KVM should never get to such a situation as KVM is supposed to exit to userspace before injecting an exception with invalid guest state. But in practice, userspace can intervene and manually inject an exception and/or stuff registers to force invalid guest state while a previously injected exception is awaiting reinjection. Fixes: `fc4fad79fc` ("KVM: VMX: Reject KVM_RUN if emulation is required with pending exception") Reported-by: syzbot+cfafed3bb76d3e37581b@syzkaller.appspotmail.com Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220502221850.131873-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-06 13:08:06 -04:00
Peter Gonda	0c2c7c0692	KVM: SEV: Mark nested locking of vcpu->lock svm_vm_migrate_from() uses sev_lock_vcpus_for_migration() to lock all source and target vcpu->locks. Unfortunately there is an 8 subclass limit, so a new subclass cannot be used for each vCPU. Instead maintain ownership of the first vcpu's mutex.dep_map using a role specific subclass: source vs target. Release the other vcpu's mutex.dep_maps. Fixes: `b56639318b` ("KVM: SEV: Add support for SEV intra host migration") Reported-by: John Sperbeck<jsperbeck@google.com> Suggested-by: David Rientjes <rientjes@google.com> Suggested-by: Sean Christopherson <seanjc@google.com> Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Cc: Hillf Danton <hdanton@sina.com> Cc: kvm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Peter Gonda <pgonda@google.com> Message-Id: <20220502165807.529624-1-pgonda@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-06 13:08:04 -04:00
Linus Torvalds	4df22ca85d	v5.18 second rc pull request A few recent regressions in rxe's multicast code, and some old driver bugs: - Error case unwind bug in rxe for rkeys - Dot not call netdev functions under a spinlock in rxe multicast code - Use the proper BH lock type in rxe multicast code - Fix idrma deadlock and crash - Add a missing flush to drain irdma QPs when in error - Fix high userspace latency in irdma during destroy due to synchronize_rcu() - Rare race in siw MPA processing -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQRRRCHOFoQz/8F5bUaFwuHvBreFYQUCYnUqeQAKCRCFwuHvBreF Yf58AQCNUQZlmEiuBid6WxggXPW/MM5sxJdOqZeX+Ddbmm7swAEAidtoVBILozLC ltd8+P8qNdccqOZDatgqYYSpXUfHIA4= =Idcg -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma Pull rdma fixes from Jason Gunthorpe: "A few recent regressions in rxe's multicast code, and some old driver bugs: - Error case unwind bug in rxe for rkeys - Dot not call netdev functions under a spinlock in rxe multicast code - Use the proper BH lock type in rxe multicast code - Fix idrma deadlock and crash - Add a missing flush to drain irdma QPs when in error - Fix high userspace latency in irdma during destroy due to synchronize_rcu() - Rare race in siw MPA processing" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: RDMA/rxe: Change mcg_lock to a _bh lock RDMA/rxe: Do not call dev_mc_add/del() under a spinlock RDMA/siw: Fix a condition race issue in MPA request processing RDMA/irdma: Fix possible crash due to NULL netdev in notifier RDMA/irdma: Reduce iWARP QP destroy time RDMA/irdma: Flush iWARP QP if modified to ERR from RTR state RDMA/rxe: Recheck the MR in when generating a READ reply RDMA/irdma: Fix deadlock in irdma_cleanup_cm_core() RDMA/rxe: Fix "Replace mr by rkey in responder resources"	2022-05-06 09:50:25 -07:00
Linus Torvalds	64267926e0	MMC core: - Fix initialization for eMMC's HS200/HS400 mode MMC host: - sdhci-msm: Reset GCC_SDCC_BCR register to prevent timeout issues - sunxi-mmc: Fix DMA descriptors allocated above 32 bits -----BEGIN PGP SIGNATURE----- iQJLBAABCgA1FiEEugLDXPmKSktSkQsV/iaEJXNYjCkFAmJ03rAXHHVsZi5oYW5z c29uQGxpbmFyby5vcmcACgkQ/iaEJXNYjClX4g/+Ns6zYVW+5SIXGyNlN/vItwUo QaRdCJwUWuDsvXUpEczzpUMsLuzVuFnyaThnxqBwN6aSCtp79rMcyNH6Rjorq1d/ HVnL36kw4n5LLIC4yD/q8KraTX1xS64qUEf4hy9XhjzMl61pGAmtYID7Z20UJBAs N6CNRrILgu6A+Hzp7Ezj01CEGYQnHFKvcXMz1NHZX/KpsdiSQBXdcg9uFgNK3Z1A pxAnoiJVPa67Ksa2pqTh8UHqWTfoMRo2MoF+JomUtbvyxJpUCfB7+wAa895VEqAP +QAvwfBm4K90WLIp7rh2QBDwYav2pugc/dvE0kfO7AiWCczTS/PjH1OTThDH9WR9 HzSdbLeW49NcdAuP6X8YvrqTTA0NP3xzw5T+531gbutlGZIkuimnlUGhaHghk5/p tQfizA1QwBBAKLM7kXlkM9Nm512zgnBtdG3yApgJVyLvTO39eU/n4rpNkSJ8dlOM 36WaQbC8DzwPo8bEoQNHfD7R76JjcdxwVPGcpWgTkYWSMkBQ+mAU/u9N9vmKM3+q XtDpsVa6DMqaLn1QgFSjxadgwl7dkQ3XJkLfjHk3O8u+UOs5zPh7rbMfv6onO8C6 bo0YvDZSrYeqwUNPnaoZXiKj1vzPGVRszbvSmpyBYR6XyoVDGue4HL883Ewn/d3w c9anAC4yp1I7AXPQJTE= =G+HD -----END PGP SIGNATURE----- Merge tag 'mmc-v5.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc Pull mmc fixes from Ulf Hansson: "MMC core: - Fix initialization for eMMC's HS200/HS400 mode MMC host: - sdhci-msm: Reset GCC_SDCC_BCR register to prevent timeout issues - sunxi-mmc: Fix DMA descriptors allocated above 32 bits" * tag 'mmc-v5.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: sdhci-msm: Reset GCC_SDCC_BCR register for SDHC mmc: sunxi-mmc: Fix DMA descriptors allocated above 32 bits mmc: core: Set HS clock speed before sending HS CMD13	2022-05-06 09:45:44 -07:00
Linus Torvalds	5fa576d7f0	drm fixes for 5.18-rc6 fbdev: - hotunplugging fix amdgpu: - Fix a xen dom0 regression on APUs - Fix a potential array overflow if a receiver were to send an erroneous audio channel count msm: - lockdep fix. it6505: - kconfig fix -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmJ0nIoACgkQDHTzWXnE hr5BVxAAlTPqazcayV1T2cTEWeTjPD39D3t+WMP+hNy7aEgRa0LBHYdciRWx4gLx 4ddu0VutlCtlUC2bnhQNPfJ7hbcStCvcsu5tRaIk9JwhcngwjAWFCc0sWDQwU2OK 7OILBAFMd5t4fA+YxfySykI1pIQUGFB9QMGWakPV1xPKo2oyyJbPxpGcQcZXjp5Z myKnfM2LaG+7gyCrgizh9218vei7XSVbwcKHW2rHA5002kthb0K1rH+Q7sSuRbiW ozZgC7RrdGwpu/iz3oGDef8b9AJv3JQgL87FE9LqetAV9F7t3tc9y8WzM50PJh4u Q/AKj9xRo/NxDwBKvn/YeL/7UgQyA+Ob0sQr/ObqMXcowvDWP0gCTxySkLEzS9tv WuE6w3qLJ9GHHiMI8BUYtKGuFC5/E4sLqXhL2hUiFq7yQXWokL/ocAp7RYBh6ls9 2uBWWCEIv9MJ/vV23kMAg8sdqk4nvAVUtucp0z3Ee7w0Vt3v0YMtOhxXoD42Qg/w eg/gMo39BgEKJsvFv5OQcRzGi5Sm9VKm9uAQN97dMEoe3tJPs7udCMehKwVebGBa z6M1ch+qoIydFdVU1mPBqsmtHSPPlpfmRYVx6TY656RE/eanlVB76qUDBr5J2Ulw yMjLPdyqRfTUwXatAqF9eBv9RRmb3xdX9iKkhEsAFf8jXKgJFbs= =Ak9r -----END PGP SIGNATURE----- Merge tag 'drm-fixes-2022-05-06' of git://anongit.freedesktop.org/drm/drm Pull drm fixes from Dave Airlie: "A pretty quiet week, one fbdev, msm, kconfig, and two amdgpu fixes, about what I'd expect for rc6. fbdev: - hotunplugging fix amdgpu: - Fix a xen dom0 regression on APUs - Fix a potential array overflow if a receiver were to send an erroneous audio channel count msm: - lockdep fix. it6505: - kconfig fix" * tag 'drm-fixes-2022-05-06' of git://anongit.freedesktop.org/drm/drm: drm/amd/display: Avoid reading audio pattern past AUDIO_CHANNELS_COUNT drm/amdgpu: do not use passthrough mode in Xen dom0 drm/bridge: ite-it6505: add missing Kconfig option select fbdev: Make fb_release() return -ENODEV if fbdev was unregistered drm/msm/dp: remove fail safe mode related code	2022-05-06 09:33:28 -07:00
Puyou Lu	dba7857985	gpio: pca953x: fix irq_stat not updated when irq is disabled (irq_mask not set) When one port's input state get inverted (eg. from low to hight) after pca953x_irq_setup but before setting irq_mask (by some other driver such as "gpio-keys"), the next inversion of this port (eg. from hight to low) will not be triggered any more (because irq_stat is not updated at the first time). Issue should be fixed after this commit. Fixes: `89ea8bbe9c` ("gpio: pca953x.c: add interrupt handling capability") Signed-off-by: Puyou Lu <puyou.lu@gmail.com> Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>	2022-05-06 16:59:29 +02:00
Eric Yang	9b9bd3f640	drm/amd/display: undo clearing of z10 related function pointers [Why] Z10 and S0i3 have some shared path. Previous code clean up , incorrectly removed these pointers, which breaks s0i3 restore [How] Do not clear the function pointers based on Z10 disable. Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com> Acked-by: Pavle Kotarac <Pavle.Kotarac@amd.com> Signed-off-by: Eric Yang <Eric.Yang2@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-05-06 10:55:01 -04:00
Richard Gong	aa482ddca8	drm/amdgpu: vi: disable ASPM on Intel Alder Lake based systems Active State Power Management (ASPM) feature is enabled since kernel 5.14. There are some AMD Volcanic Islands (VI) GFX cards, such as the WX3200 and RX640, that do not work with ASPM-enabled Intel Alder Lake based systems. Using these GFX cards as video/display output, Intel Alder Lake based systems will freeze after suspend/resume. The issue was originally reported on one system (Dell Precision 3660 with BIOS version 0.14.81), but was later confirmed to affect at least 4 pre-production Alder Lake based systems. Add an extra check to disable ASPM on Intel Alder Lake based systems with the problematic AMD Volcanic Islands GFX cards. Fixes: `0064b0ce85` ("drm/amd/pm: enable ASPM by default") Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1885 Signed-off-by: Richard Gong <richard.gong@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2022-05-06 10:52:29 -04:00
Maximilian Luz	44acfc22c7	platform/surface: aggregator: Fix initialization order when compiling as builtin module When building the Surface Aggregator Module (SAM) core, registry, and other SAM client drivers as builtin modules (=y), proper initialization order is not guaranteed. Due to this, client driver registration (triggered by device registration in the registry) races against bus initialization in the core. If any attempt is made at registering the device driver before the bus has been initialized (i.e. if bus initialization fails this race) driver registration will fail with a message similar to: Driver surface_battery was unable to register with bus_type surface_aggregator because the bus was not initialized Switch from module_init() to subsys_initcall() to resolve this issue. Note that the serdev subsystem uses postcore_initcall() so we are still able to safely register the serdev device driver for the core. Fixes: `c167b9c7e3` ("platform/surface: Add Surface Aggregator subsystem") Reported-by: Blaž Hrastnik <blaz@mxxn.io> Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com> Link: https://lore.kernel.org/r/20220429195738.535751-1-luzmaximilian@gmail.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>	2022-05-06 13:05:57 +02:00
Maximilian Luz	ed13d4ac57	platform/surface: gpe: Add support for Surface Pro 8 The new Surface Pro 8 uses GPEs for lid events as well. Add an entry for that so that the lid can be used to wake the device. Note that this is a device with a keyboard type-cover, where this acts as the "lid". Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com> Link: https://lore.kernel.org/r/20220429180049.1282447-1-luzmaximilian@gmail.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>	2022-05-06 13:05:57 +02:00
Prarit Bhargava	2cdfa0c20d	platform/x86/intel: Fix 'rmmod pmt_telemetry' panic 'rmmod pmt_telemetry' panics with: BUG: kernel NULL pointer dereference, address: 0000000000000040 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 4 PID: 1697 Comm: rmmod Tainted: G S W -------- --- 5.18.0-rc4 #3 Hardware name: Intel Corporation Alder Lake Client Platform/AlderLake-P DDR5 RVP, BIOS ADLPFWI1.R00.3056.B00.2201310233 01/31/2022 RIP: 0010:device_del+0x1b/0x3d0 Code: e8 1a d9 e9 ff e9 58 ff ff ff 48 8b 08 eb dc 0f 1f 44 00 00 41 56 41 55 41 54 55 48 8d af 80 00 00 00 53 48 89 fb 48 83 ec 18 <4c> 8b 67 40 48 89 ef 65 48 8b 04 25 28 00 00 00 48 89 44 24 10 31 RSP: 0018:ffffb520415cfd60 EFLAGS: 00010286 RAX: 0000000000000070 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000000 RBP: 0000000000000080 R08: ffffffffffffffff R09: ffffb520415cfd78 R10: 0000000000000002 R11: ffffb520415cfd78 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 FS: 00007f7e198e5740(0000) GS:ffff905c9f700000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000040 CR3: 000000010782a005 CR4: 0000000000770ee0 PKRU: 55555554 Call Trace: <TASK> ? __xa_erase+0x53/0xb0 device_unregister+0x13/0x50 intel_pmt_dev_destroy+0x34/0x60 [pmt_class] pmt_telem_remove+0x40/0x50 [pmt_telemetry] auxiliary_bus_remove+0x18/0x30 device_release_driver_internal+0xc1/0x150 driver_detach+0x44/0x90 bus_remove_driver+0x74/0xd0 auxiliary_driver_unregister+0x12/0x20 pmt_telem_exit+0xc/0xe4a [pmt_telemetry] __x64_sys_delete_module+0x13a/0x250 ? syscall_trace_enter.isra.19+0x11e/0x1a0 do_syscall_64+0x58/0x80 ? syscall_exit_to_user_mode+0x12/0x30 ? do_syscall_64+0x67/0x80 ? syscall_exit_to_user_mode+0x12/0x30 ? do_syscall_64+0x67/0x80 ? syscall_exit_to_user_mode+0x12/0x30 ? do_syscall_64+0x67/0x80 ? exc_page_fault+0x64/0x140 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f7e1803a05b Code: 73 01 c3 48 8b 0d 2d 4e 38 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d fd 4d 38 00 f7 d8 64 89 01 48 The probe function, pmt_telem_probe(), adds an entry for devices even if they have not been initialized. This results in the array of initialized devices containing both initialized and uninitialized entries. This causes a panic in the remove function, pmt_telem_remove() which expects the array to only contain initialized entries. Only use an entry when a device is initialized. Cc: "David E. Box" <david.e.box@linux.intel.com> Cc: Hans de Goede <hdegoede@redhat.com> Cc: Mark Gross <markgross@kernel.org> Cc: platform-driver-x86@vger.kernel.org Signed-off-by: David Arcari <darcari@redhat.com> Signed-off-by: Prarit Bhargava <prarit@redhat.com> Reviewed-by: David E. Box <david.e.box@linux.intel.com> Link: https://lore.kernel.org/r/20220429122322.2550003-1-prarit@redhat.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>	2022-05-06 13:05:57 +02:00
Mark Pearson	aa2fef6f40	platform/x86: thinkpad_acpi: Correct dual fan probe There was an issue with the dual fan probe whereby the probe was failing as it assuming that second_fan support was not available. Corrected the logic so the probe works correctly. Cleaned up so quirks only used if 2nd fan not detected. Tested on X1 Carbon 10 (2 fans), X1 Carbon 9 (2 fans) and T490 (1 fan) Signed-off-by: Mark Pearson <markpearson@lenovo.com> Link: https://lore.kernel.org/r/20220502191200.63470-1-markpearson@lenovo.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>	2022-05-06 13:05:57 +02:00
Mario Limonciello	455cd867b8	platform/x86: thinkpad_acpi: Add a s2idle resume quirk for a number of laptops Lenovo laptops that contain NVME SSDs across a variety of generations have trouble resuming from suspend to idle when the IOMMU translation layer is active for the NVME storage device. This generally manifests as a large resume delay or page faults. These delays and page faults occur as a result of a Lenovo BIOS specific SMI that runs during the D3->D0 transition on NVME devices. This SMI occurs because of a flag that is set during resume by Lenovo firmware: ``` OperationRegion (PM80, SystemMemory, 0xFED80380, 0x10) Field (PM80, AnyAcc, NoLock, Preserve) { SI3R, 1 } Method (_ON, 0, NotSerialized) // _ON_: Power On { TPST (0x60D0) If ((DAS3 == 0x00)) { If (SI3R) { TPST (0x60E0) M020 (NBRI, 0x00, 0x00, 0x04, (NCMD \| 0x06)) M020 (NBRI, 0x00, 0x00, 0x10, NBAR) APMC = HDSI /* \HDSI */ SLPS = 0x01 SI3R = 0x00 TPST (0x60E1) } D0NV = 0x01 } } ``` Create a quirk that will run early in the resume process to prevent this SMI from running. As any of these machines are fixed, they can be peeled back from this quirk or narrowed down to individual firmware versions. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1910 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1689 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Tested-by: Mark Pearson <markpearson@lenvo.com> Link: https://lore.kernel.org/r/20220429030501.1909-3-mario.limonciello@amd.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>	2022-05-06 13:05:57 +02:00
Mario Limonciello	c25d7f32e3	platform/x86: thinkpad_acpi: Convert btusb DMI list to quirks DMI matching in thinkpad_acpi happens local to a function meaning quirks can only match that function. Future changes to thinkpad_acpi may need to quirk other code, so change this to use a quirk infrastructure. Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Tested-by: Mark Pearson <markpearson@lenvo.com> Link: https://lore.kernel.org/r/20220429030501.1909-2-mario.limonciello@amd.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>	2022-05-06 13:05:57 +02:00
Darren Hart	29ad05fd67	Documentation/process: Add embargoed HW contact for Ampere Computing Add Darren Hart as Ampere Computing's ambassador for the embargoed hardware issues process. Signed-off-by: Darren Hart <darren@os.amperecomputing.com> Link: https://lore.kernel.org/r/2e36a8e925bc958928b4afa189b2f876c392831b.1650995848.git.darren@os.amperecomputing.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-05-06 10:00:25 +02:00
Darren Hart	8bf6e0e3c7	Documentation/process: Make groups alphabetical and use tabs consistently The list appears to be grouped by type (silicon, software, cloud) and mostly alphabetical within each group, with a few exceptions. Before adding to it, cleanup the list to be alphabetical within the groups, and use tabs consistently throughout the list. Signed-off-by: Darren Hart <darren@os.amperecomputing.com> Link: https://lore.kernel.org/r/ec574b5d55584a3adda9bd31b7695193636ff136.1650995848.git.darren@os.amperecomputing.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-05-06 10:00:23 +02:00
Thiébaud Weksteen	581dd69830	firmware_loader: use kernel credentials when reading firmware Device drivers may decide to not load firmware when probed to avoid slowing down the boot process should the firmware filesystem not be available yet. In this case, the firmware loading request may be done when a device file associated with the driver is first accessed. The credentials of the userspace process accessing the device file may be used to validate access to the firmware files requested by the driver. Ensure that the kernel assumes the responsibility of reading the firmware. This was observed on Android for a graphic driver loading their firmware when the device file (e.g. /dev/mali0) was first opened by userspace (i.e. surfaceflinger). The security context of surfaceflinger was used to validate the access to the firmware file (e.g. /vendor/firmware/mali.bin). Previously, Android configurations were not setting up the firmware_class.path command line argument and were relying on the userspace fallback mechanism. In this case, the security context of the userspace daemon (i.e. ueventd) was consistently used to read firmware files. More Android devices are now found to set firmware_class.path which gives the kernel the opportunity to read the firmware directly (via kernel_read_file_from_path_initns). In this scenario, the current process credentials were used, even if unrelated to the loading of the firmware file. Signed-off-by: Thiébaud Weksteen <tweek@google.com> Cc: <stable@vger.kernel.org> # 5.10 Reviewed-by: Paul Moore <paul@paul-moore.com> Acked-by: Luis Chamberlain <mcgrof@kernel.org> Link: https://lore.kernel.org/r/20220502004952.3970800-1-tweek@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-05-06 10:00:03 +02:00
Javier Martinez Canillas	b3c9a924aa	fbdev: vesafb: Cleanup fb_info in .fb_destroy rather than .remove The driver is calling framebuffer_release() in its .remove callback, but this will cause the struct fb_info to be freed too early. Since it could be that a reference is still hold to it if user-space opened the fbdev. This would lead to a use-after-free error if the framebuffer device was unregistered but later a user-space process tries to close the fbdev fd. To prevent this, move the framebuffer_release() call to fb_ops.fb_destroy instead of doing it in the driver's .remove callback. Strictly speaking, the code flow in the driver is still wrong because all the hardware cleanupd (i.e: iounmap) should be done in .remove while the software cleanup (i.e: releasing the framebuffer) should be done in the .fb_destroy handler. But this at least makes to match the behavior before commit `27599aacba` ("fbdev: Hot-unplug firmware fb devices on forced removal"). Fixes: `27599aacba` ("fbdev: Hot-unplug firmware fb devices on forced removal") Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Javier Martinez Canillas <javierm@redhat.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20220505220631.366371-1-javierm@redhat.com	2022-05-06 09:25:49 +02:00
Javier Martinez Canillas	d258d00fb9	fbdev: efifb: Cleanup fb_info in .fb_destroy rather than .remove The driver is calling framebuffer_release() in its .remove callback, but this will cause the struct fb_info to be freed too early. Since it could be that a reference is still hold to it if user-space opened the fbdev. This would lead to a use-after-free error if the framebuffer device was unregistered but later a user-space process tries to close the fbdev fd. To prevent this, move the framebuffer_release() call to fb_ops.fb_destroy instead of doing it in the driver's .remove callback. Strictly speaking, the code flow in the driver is still wrong because all the hardware cleanupd (i.e: iounmap) should be done in .remove while the software cleanup (i.e: releasing the framebuffer) should be done in the .fb_destroy handler. But this at least makes to match the behavior before commit `27599aacba` ("fbdev: Hot-unplug firmware fb devices on forced removal"). Fixes: `27599aacba` ("fbdev: Hot-unplug firmware fb devices on forced removal") Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Javier Martinez Canillas <javierm@redhat.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20220505220540.366218-1-javierm@redhat.com	2022-05-06 09:21:22 +02:00
Javier Martinez Canillas	666b90b3ce	fbdev: simplefb: Cleanup fb_info in .fb_destroy rather than .remove The driver is calling framebuffer_release() in its .remove callback, but this will cause the struct fb_info to be freed too early. Since it could be that a reference is still hold to it if user-space opened the fbdev. This would lead to a use-after-free error if the framebuffer device was unregistered but later a user-space process tries to close the fbdev fd. To prevent this, move the framebuffer_release() call to fb_ops.fb_destroy instead of doing it in the driver's .remove callback. Strictly speaking, the code flow in the driver is still wrong because all the hardware cleanupd (i.e: iounmap) should be done in .remove while the software cleanup (i.e: releasing the framebuffer) should be done in the .fb_destroy handler. But this at least makes to match the behavior before commit `27599aacba` ("fbdev: Hot-unplug firmware fb devices on forced removal"). Fixes: `27599aacba` ("fbdev: Hot-unplug firmware fb devices on forced removal") Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Javier Martinez Canillas <javierm@redhat.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20220505220456.366090-1-javierm@redhat.com	2022-05-06 09:21:21 +02:00
Daniel Vetter	89bfd4017e	fbdev: Prevent possible use-after-free in fb_release() Most fbdev drivers have issues with the fb_info lifetime, because call to framebuffer_release() from their driver's .remove callback, rather than doing from fbops.fb_destroy callback. Doing that will destroy the fb_info too early, while references to it may still exist, leading to a use-after-free error. To prevent this, check the fb_info reference counter when attempting to kfree the data structure in framebuffer_release(). That will leak it but at least will prevent the mentioned error. Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Signed-off-by: Javier Martinez Canillas <javierm@redhat.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20220505220413.365977-1-javierm@redhat.com	2022-05-06 09:21:20 +02:00
Javier Martinez Canillas	135332f34b	Revert "fbdev: Make fb_release() return -ENODEV if fbdev was unregistered" This reverts commit `aafa025c76`. That commit attempted to fix a NULL pointer dereference, caused by the struct fb_info associated with a framebuffer device to not longer be valid when the file descriptor was closed. The issue was exposed by commit `27599aacba` ("fbdev: Hot-unplug firmware fb devices on forced removal"), which added a new path that goes through the struct device removal instead of directly unregistering the fb. Most fbdev drivers have issues with the fb_info lifetime, because call to framebuffer_release() from their driver's .remove callback, rather than doing from fbops.fb_destroy callback. This meant that due to this switch, the fb_info was now destroyed too early, while references still existed, while before it was simply leaked. The patch we're reverting here reinstated that leak, hence "fixed" the regression. But the proper solution is to fix the drivers to not release the fb_info too soon. Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Javier Martinez Canillas <javierm@redhat.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20220504115917.758787-1-javierm@redhat.com	2022-05-06 09:19:02 +02:00
Kajol Jain	348c713441	powerpc/papr_scm: Fix buffer overflow issue with CONFIG_FORTIFY_SOURCE With CONFIG_FORTIFY_SOURCE enabled, string functions will also perform dynamic checks for string size which can panic the kernel, like incase of overflow detection. In papr_scm, papr_scm_pmu_check_events function uses stat->stat_id with string operations, to populate the nvdimm_events_map array. Since stat_id variable is not NULL terminated, the kernel panics with CONFIG_FORTIFY_SOURCE enabled at boot time. Below are the logs of kernel panic: detected buffer overflow in __fortify_strlen ------------[ cut here ]------------ kernel BUG at lib/string_helpers.c:980! Oops: Exception in kernel mode, sig: 5 [#1] NIP [c00000000077dad0] fortify_panic+0x28/0x38 LR [c00000000077dacc] fortify_panic+0x24/0x38 Call Trace: [c0000022d77836e0] [c00000000077dacc] fortify_panic+0x24/0x38 (unreliable) [c00800000deb2660] papr_scm_pmu_check_events.constprop.0+0x118/0x220 [papr_scm] [c00800000deb2cb0] papr_scm_probe+0x288/0x62c [papr_scm] [c0000000009b46a8] platform_probe+0x98/0x150 Fix this issue by using kmemdup_nul() to copy the content of stat->stat_id directly to the nvdimm_events_map array. mpe: stat->stat_id comes from the hypervisor, not userspace, so there is no security exposure. Fixes: `4c08d4bbc0` ("powerpc/papr_scm: Add perf interface support") Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220505153451.35503-1-kjain@linux.ibm.com	2022-05-06 12:44:03 +10:00
Jakub Kicinski	c88d390851	Merge branch 'ocelot-vcap-fixes' Vladimir Oltean says: ==================== Ocelot VCAP fixes Changes in v2: fix the NPDs and UAFs caused by filter->trap_list in a more robust way that actually does not introduce bugs of its own (1/5) This series fixes issues found while running tools/testing/selftests/net/forwarding/tc_actions.sh on the ocelot switch: - NULL pointer dereference when failing to offload a filter - NULL pointer dereference after deleting a trap - filters still having effect after being deleted - dropped packets still being seen by software - statistics counters showing double the amount of hits - statistics counters showing inexistent hits - invalid configurations not rejected ==================== Link: https://lore.kernel.org/r/20220504235503.4161890-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-05-05 19:15:22 -07:00
Vladimir Oltean	93a8417088	net: mscc: ocelot: avoid corrupting hardware counters when moving VCAP filters Given the following order of operations: (1) we add filter A using tc-flower (2) we send a packet that matches it (3) we read the filter's statistics to find a hit count of 1 (4) we add a second filter B with a higher preference than A, and A moves one position to the right to make room in the TCAM for it (5) we send another packet, and this matches the second filter B (6) we read the filter statistics again. When this happens, the hit count of filter A is 2 and of filter B is 1, despite a single packet having matched each filter. Furthermore, in an alternate history, reading the filter stats a second time between steps (3) and (4) makes the hit count of filter A remain at 1 after step (6), as expected. The reason why this happens has to do with the filter->stats.pkts field, which is written to hardware through the call path below: vcap_entry_set / \| \ / \| \ / \| \ / \| \ es0_entry_set is1_entry_set is2_entry_set \ \| / \ \| / \ \| / vcap_data_set(data.counter, ...) The primary role of filter->stats.pkts is to transport the filter hit counters from the last readout all the way from vcap_entry_get() -> ocelot_vcap_filter_stats_update() -> ocelot_cls_flower_stats(). The reason why vcap_entry_set() writes it to hardware is so that the counters (saturating and having a limited bit width) are cleared after each user space readout. The writing of filter->stats.pkts to hardware during the TCAM entry movement procedure is an unintentional consequence of the code design, because the hit count isn't up to date at this point. So at step (4), when filter A is moved by ocelot_vcap_filter_add() to make room for filter B, the hardware hit count is 0 (no packet matched on it in the meantime), but filter->stats.pkts is 1, because the last readout saw the earlier packet. The movement procedure programs the old hit count back to hardware, so this creates the impression to user space that more packets have been matched than they really were. The bug can be seen when running the gact_drop_and_ok_test() from the tc_actions.sh selftest. Fix the issue by reading back the hit count to tmp->stats.pkts before migrating the VCAP filter. Sure, this is a best-effort technique, since the packets that hit the rule between vcap_entry_get() and vcap_entry_set() won't be counted, but at least it allows the counters to be reliably used for selftests where the traffic is under control. The vcap_entry_get() name is a bit unintuitive, but it only reads back the counter portion of the TCAM entry, not the entire entry. The index from which we retrieve the counter is also a bit unintuitive (i - 1 during add, i + 1 during del), but this is the way in which TCAM entry movement works. The "entry index" isn't a stored integer for a TCAM filter, instead it is dynamically computed by ocelot_vcap_block_get_filter_index() based on the entry's position in the &block->rules list. That position (as well as block->count) is automatically updated by ocelot_vcap_filter_add_to_block() on add, and by ocelot_vcap_block_remove_filter() on del. So "i" is the new filter index, and "i - 1" or "i + 1" respectively are the old addresses of that TCAM entry (we only support installing/deleting one filter at a time). Fixes: `b596229448` ("net: mscc: ocelot: Add support for tcam") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-05-05 19:15:17 -07:00
Vladimir Oltean	477d2b9162	net: mscc: ocelot: restrict tc-trap actions to VCAP IS2 lookup 0 Once the CPU port was added to the destination port mask of a packet, it can never be cleared, so even packets marked as dropped by the MASK_MODE of a VCAP IS2 filter will still reach it. This is why we need the OCELOT_POLICER_DISCARD to "kill dropped packets dead" and make software stop seeing them. We disallow policer rules from being put on any other chain than the one for the first lookup, but we don't do this for "drop" rules, although we should. This change is merely ascertaining that the rules dont't (completely) work and letting the user know. The blamed commit is the one that introduced the multi-chain architecture in ocelot. Prior to that, we should have always offloaded the filters to VCAP IS2 lookup 0, where they did work. Fixes: `1397a2eb52` ("net: mscc: ocelot: create TCAM skeleton from tc filter chains") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-05-05 19:15:16 -07:00
Vladimir Oltean	6741e11880	net: mscc: ocelot: fix VCAP IS2 filters matching on both lookups The VCAP IS2 TCAM is looked up twice per packet, and each filter can be configured to only match during the first, second lookup, or both, or none. The blamed commit wrote the code for making VCAP IS2 filters match only on the given lookup. But right below that code, there was another line that explicitly made the lookup a "don't care", and this is overwriting the lookup we've selected. So the code had no effect. Some of the more noticeable effects of having filters match on both lookups: - in "tc -s filter show dev swp0 ingress", we see each packet matching a VCAP IS2 filter counted twice. This throws off scripts such as tools/testing/selftests/net/forwarding/tc_actions.sh and makes them fail. - a "tc-drop" action offloaded to VCAP IS2 needs a policer as well, because once the CPU port becomes a member of the destination port mask of a packet, nothing removes it, not even a PERMIT/DENY mask mode with a port mask of 0. But VCAP IS2 rules with the POLICE_ENA bit in the action vector can only appear in the first lookup. What happens when a filter matches both lookups is that the action vector is combined, and this makes the POLICE_ENA bit ineffective, since the last lookup in which it has appeared is the second one. In other words, "tc-drop" actions do not drop packets for the CPU port, dropped packets are still seen by software unless there was an FDB entry that directed those packets to some other place different from the CPU. The last bit used to work, because in the initial commit `b596229448` ("net: mscc: ocelot: Add support for tcam"), we were writing the FIRST field of the VCAP IS2 half key with a 1, not with a "don't care". The change to "don't care" was made inadvertently by me in commit `c1c3993edb` ("net: mscc: ocelot: generalize existing code for VCAP"), which I just realized, and which needs a separate fix from this one, for "stable" kernels that lack the commit blamed below. Fixes: `226e9cd82a` ("net: mscc: ocelot: only install TCAM entries into a specific lookup and PAG") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-05-05 19:15:16 -07:00
Vladimir Oltean	16bbebd356	net: mscc: ocelot: fix last VCAP IS1/IS2 filter persisting in hardware when deleted ocelot_vcap_filter_del() works by moving the next filters over the current one, and then deleting the last filter by calling vcap_entry_set() with a del_filter which was specially created by memsetting its memory to zeroes. vcap_entry_set() then programs this to the TCAM and action RAM via the cache registers. The problem is that vcap_entry_set() is a dispatch function which looks at del_filter->block_id. But since del_filter is zeroized memory, the block_id is 0, or otherwise said, VCAP_ES0. So practically, what we do is delete the entry at the same TCAM index from VCAP ES0 instead of IS1 or IS2. The code was not always like this. vcap_entry_set() used to simply be is2_entry_set(), and then, the logic used to work. Restore the functionality by populating the block_id of the del_filter based on the VCAP block of the filter that we're deleting. This makes vcap_entry_set() know what to do. Fixes: `1397a2eb52` ("net: mscc: ocelot: create TCAM skeleton from tc filter chains") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-05-05 19:15:15 -07:00
Vladimir Oltean	e1846cff2f	net: mscc: ocelot: mark traps with a bool instead of keeping them in a list Since the blamed commit, VCAP filters can appear on more than one list. If their action is "trap", they are chained on ocelot->traps via filter->trap_list. This is in addition to their normal placement on the VCAP block->rules list head. Therefore, when we free a VCAP filter, we must remove it from all lists it is a member of, including ocelot->traps. There are at least 2 bugs which are direct consequences of this design decision. First is the incorrect usage of list_empty(), meant to denote whether "filter" is chained into ocelot->traps via filter->trap_list. This does not do the correct thing, because list_empty() checks whether "head->next == head", but in our case, head->next == head->prev == NULL. So we dereference NULL pointers and die when we call list_del(). Second is the fact that not all places that should remove the filter from ocelot->traps do so. One example is ocelot_vcap_block_remove_filter(), which is where we have the main kfree(filter). By keeping freed filters in ocelot->traps we end up in a use-after-free in felix_update_trapping_destinations(). Attempting to fix all the buggy patterns is a whack-a-mole game which makes the driver unmaintainable. Actually this is what the previous patch version attempted to do: https://patchwork.kernel.org/project/netdevbpf/patch/20220503115728.834457-3-vladimir.oltean@nxp.com/ but it introduced another set of bugs, because there are other places in which create VCAP filters, not just ocelot_vcap_filter_create(): - ocelot_trap_add() - felix_tag_8021q_vlan_add_rx() - felix_tag_8021q_vlan_add_tx() Relying on the convention that all those code paths must call INIT_LIST_HEAD(&filter->trap_list) is not going to scale. So let's do what should have been done in the first place and keep a bool in struct ocelot_vcap_filter which denotes whether we are looking at a trapping rule or not. Iterating now happens over the main VCAP IS2 block->rules. The advantage is that we no longer risk having stale references to a freed filter, since it is only present in that list. Fixes: `e42bd4ed09` ("net: mscc: ocelot: keep traps in a list") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-05-05 19:15:14 -07:00
Haowen Bai	f1c8781ac9	s390/dasd: Use kzalloc instead of kmalloc/memset Use kzalloc rather than duplicating its implementation, which makes code simple and easy to understand. Signed-off-by: Haowen Bai <baihaowen@meizu.com> Reviewed-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Stefan Haberland <sth@linux.ibm.com> Link: https://lore.kernel.org/r/20220505141733.1989450-6-sth@linux.ibm.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-05-05 20:08:27 -06:00
Jan Höppner	b9c10f68e2	s390/dasd: Fix read inconsistency for ESE DASD devices Read requests that return with NRF error are partially completed in dasd_eckd_ese_read(). The function keeps track of the amount of processed bytes and the driver will eventually return this information back to the block layer for further processing via __dasd_cleanup_cqr() when the request is in the final stage of processing (from the driver's perspective). For this, blk_update_request() is used which requires the number of bytes to complete the request. As per documentation the nr_bytes parameter is described as follows: "number of bytes to complete for @req". This was mistakenly interpreted as "number of bytes _left_ for @req" leading to new requests with incorrect data length. The consequence are inconsistent and completely wrong read requests as data from random memory areas are read back. Fix this by correctly specifying the amount of bytes that should be used to complete the request. Fixes: `5e6bdd37c5` ("s390/dasd: fix data corruption for thin provisioned devices") Cc: stable@vger.kernel.org # 5.3+ Signed-off-by: Jan Höppner <hoeppner@linux.ibm.com> Reviewed-by: Stefan Haberland <sth@linux.ibm.com> Link: https://lore.kernel.org/r/20220505141733.1989450-5-sth@linux.ibm.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-05-05 20:08:27 -06:00
Jan Höppner	cd68c48ea1	s390/dasd: Fix read for ESE with blksize < 4k When reading unformatted tracks on ESE devices, the corresponding memory areas are simply set to zero for each segment. This is done incorrectly for blocksizes < 4096. There are two problems. First, the increment of dst is done using the counter of the loop (off), which is increased by blksize every iteration. This leads to a much bigger increment for dst as actually intended. Second, the increment of dst is done before the memory area is set to 0, skipping a significant amount of bytes of memory. This leads to illegal overwriting of memory and ultimately to a kernel panic. This is not a problem with 4k blocksize because blk_queue_max_segment_size is set to PAGE_SIZE, always resulting in a single iteration for the inner segment loop (bv.bv_len == blksize). The incorrectly used 'off' value to increment dst is 0 and the correct memory area is used. In order to fix this for blksize < 4k, increment dst correctly using the blksize and only do it at the end of the loop. Fixes: `5e2b17e712` ("s390/dasd: Add dynamic formatting support for ESE volumes") Cc: stable@vger.kernel.org # v5.3+ Signed-off-by: Jan Höppner <hoeppner@linux.ibm.com> Reviewed-by: Stefan Haberland <sth@linux.ibm.com> Link: https://lore.kernel.org/r/20220505141733.1989450-4-sth@linux.ibm.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-05-05 20:08:27 -06:00
Stefan Haberland	71f3871657	s390/dasd: prevent double format of tracks for ESE devices For ESE devices we get an error for write operations on an unformatted track. Afterwards the track will be formatted and the IO operation restarted. When using alias devices a track might be accessed by multiple requests simultaneously and there is a race window that a track gets formatted twice resulting in data loss. Prevent this by remembering the amount of formatted tracks when starting a request and comparing this number before actually formatting a track on the fly. If the number has changed there is a chance that the current track was finally formatted in between. As a result do not format the track and restart the current IO to check. The number of formatted tracks does not match the overall number of formatted tracks on the device and it might wrap around but this is no problem. It is only needed to recognize that a track has been formatted at all in between. Fixes: `5e2b17e712` ("s390/dasd: Add dynamic formatting support for ESE volumes") Cc: stable@vger.kernel.org # 5.3+ Signed-off-by: Stefan Haberland <sth@linux.ibm.com> Reviewed-by: Jan Hoeppner <hoeppner@linux.ibm.com> Link: https://lore.kernel.org/r/20220505141733.1989450-3-sth@linux.ibm.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-05-05 20:08:27 -06:00
Stefan Haberland	5b53a405e4	s390/dasd: fix data corruption for ESE devices For ESE devices we get an error when accessing an unformatted track. The handling of this error will return zero data for read requests and format the track on demand before writing to it. To do this the code needs to distinguish between read and write requests. This is done with data from the blocklayer request. A pointer to the blocklayer request is stored in the CQR. If there is an error on the device an ERP request is built to do error recovery. While the ERP request is mostly a copy of the original CQR the pointer to the blocklayer request is not copied to not accidentally pass it back to the blocklayer without cleanup. This leads to the error that during ESE handling after an ERP request was built it is not possible to determine the IO direction. This leads to the formatting of a track for read requests which might in turn lead to data corruption. Fixes: `5e2b17e712` ("s390/dasd: Add dynamic formatting support for ESE volumes") Cc: stable@vger.kernel.org # 5.3+ Signed-off-by: Stefan Haberland <sth@linux.ibm.com> Reviewed-by: Jan Hoeppner <hoeppner@linux.ibm.com> Link: https://lore.kernel.org/r/20220505141733.1989450-2-sth@linux.ibm.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-05-05 20:08:27 -06:00
Jonathan Toppins	4e707344e1	MAINTAINERS: add missing files for bonding definition The bonding entry did not include additional include files that have been added nor did it reference the documentation. Add these references for completeness. Signed-off-by: Jonathan Toppins <jtoppins@redhat.com> Link: https://lore.kernel.org/r/903ed2906b93628b38a2015664a20d2802042863.1651690748.git.jtoppins@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-05-05 18:41:18 -07:00

... 2 3 4 5 6 ...

1090517 Commits