Commit Graph

1264402 Commits

Author SHA1 Message Date
Zhu Lingshan
ae1374b7f7 vDPA: report virtio-block read-only info to user space
This commit report read-only information of
virtio-blk devices to user space.

Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Message-Id: <20240218185606.13509-10-lingshan.zhu@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19 02:45:51 -04:00
Zhu Lingshan
6bdc7846e6 vDPA: report virtio-block write zeroes configuration to user space
This commits reports write zeroes configuration of
virtio-block devices to user space, includes:
1)maximum write zeroes sectors size
2)maximum write zeroes segment number

Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Message-Id: <20240218185606.13509-9-lingshan.zhu@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19 02:45:51 -04:00
Zhu Lingshan
65848f46e1 vDPA: report virtio-block discarding configuration to user space
This commit reports virtio-blk discarding configuration
to user space,includes:
1) the maximum discard sectors
2) maximum number of discard segments for the block driver to use
3) the alignment for splitting a discarding request

Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Message-Id: <20240218185606.13509-8-lingshan.zhu@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19 02:45:51 -04:00
Zhu Lingshan
c9d989b4ab vDPA: report virtio-block topology info to user space
This commit allows vDPA reporting topology information of
virtio-blk devices to user space, includes:
1) the number of logical blocks per physical block
2) offset of first aligned logical block
3) suggested minimum I/O size in blocks
4) optimal (suggested maximum) I/O size in blocks

Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Message-Id: <20240218185606.13509-7-lingshan.zhu@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19 02:45:51 -04:00
Zhu Lingshan
54fb04b02e vDPA: report virtio-block MQ info to user space
This commits allows vDPA reporting virtio-block multi-queue
configuration to user sapce.

Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Message-Id: <20240218185606.13509-6-lingshan.zhu@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19 02:45:51 -04:00
Zhu Lingshan
81f64e1d91 vDPA: report virtio-block max segments in a request to user space
This commit allows vDPA reporting the maximum number of
segments in a request of virtio-block devices to
user space.

Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Message-Id: <20240218185606.13509-5-lingshan.zhu@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19 02:45:51 -04:00
Zhu Lingshan
3a1d33fb7e vDPA: report virtio-block block-size to user space
This commit allows reporting the block size of a
virtio-block device to user space.

Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Message-Id: <20240218185606.13509-4-lingshan.zhu@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19 02:45:51 -04:00
Zhu Lingshan
330b8aea69 vDPA: report virtio-block max segment size to user space
This commit allows reporting the max size of any
single segment of virtio-block devices to user space.

Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Message-Id: <20240218185606.13509-3-lingshan.zhu@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19 02:45:51 -04:00
Zhu Lingshan
c2475a9a78 vDPA: report virtio-block capacity to user space
This commit allows userspace to query capacity of
a virtio-block device.

Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Message-Id: <20240218185606.13509-2-lingshan.zhu@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19 02:45:51 -04:00
Ricardo B. Marliere
2b666ee296 virtio: make virtio_bus const
Now that the driver core can properly handle constant struct bus_type,
move the virtio_bus variable to be a constant structure as well,
placing it into read-only memory which can not be modified at runtime.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net>
Message-Id: <20240204-bus_cleanup-virtio-v1-1-3bcb2212aaa0@marliere.net>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Jason Wang <jasowang@redhat.com>
2024-03-19 02:45:50 -04:00
Ricardo B. Marliere
8169ed6220 vdpa: make vdpa_bus const
Now that the driver core can properly handle constant struct bus_type,
move the vdpa_bus variable to be a constant structure as well,
placing it into read-only memory which can not be modified at runtime.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net>
Message-Id: <20240204-bus_cleanup-vdpa-v1-1-1745eccb0a5c@marliere.net>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-03-19 02:45:50 -04:00
Zhu Lingshan
56d61ae558 vDPA/ifcvf: implement vdpa_config_ops.get_vq_num_min
IFCVF HW supports operation with vq size less than the max size,
as the spec required.

This commit implements vdpa_config_ops.get_vq_num_min to report
the minimal size of the virtqueues, which gives vDPA framework
a chance to reduce the vring size.

We need at least one descriptor to be functional, but it is better
no less than 64 to meet ceratin performance requirements.
Actually the framework would allocate at least a PAGE for the vq.

Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Message-Id: <20240202163905.8834-11-lingshan.zhu@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19 02:45:50 -04:00
Zhu Lingshan
cd21470674 vDPA/ifcvf: get_max_vq_size to return max size
Since we already implemented vdpa_config_ops.get_vq_size,
so get_max_vq_size can return the acutal max size of the
virtqueues other than the max allowed safe size.

Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Message-Id: <20240202163905.8834-10-lingshan.zhu@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19 02:45:50 -04:00
Zhu Lingshan
273ae08fa0 virtio_vdpa: create vqs with the actual size
The size of a virtqueue is a per vq configuration,
this commit allows virtio_vdpa to create
virtqueues with the actual size of a specific
vq size that supported by the backend device.

Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Message-Id: <20240202163905.8834-9-lingshan.zhu@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19 02:45:50 -04:00
Zhu Lingshan
47e62e6d62 vduse: implement vdpa_config_ops.get_vq_size for vduse
This commit implements get_vq_size for vdpa_config_ops. This
new interface is used to report per vq size.

Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Message-Id: <20240202163905.8834-8-lingshan.zhu@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19 02:45:50 -04:00
Zhu Lingshan
f6fa2f7ea0 vdpa_sim: implement vdpa_config_ops.get_vq_size for vDPA simulator
This commit implements vdpa_config_ops.get_vq_size for vDPA
simulator, this new interface can help report per vq size.

Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Message-Id: <20240202163905.8834-7-lingshan.zhu@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19 02:45:50 -04:00
Zhu Lingshan
1da13e6470 eni_vdpa: implement vdpa_config_ops.get_vq_size
This commit implements get_vq_size which report
per vq size in vdpa_config_ops

Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Message-Id: <20240202163905.8834-6-lingshan.zhu@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19 02:45:50 -04:00
Zhu Lingshan
a97f9c8fcc vp_vdpa: implement vdpa_config_ops.get_vq_size
This commit implements vdpa_config_ops.get_vq_size in
vp_vdpa, which reports per virtqueue size.

Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Message-Id: <20240202163905.8834-5-lingshan.zhu@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19 02:45:50 -04:00
Zhu Lingshan
36503e5e06 vDPA/ifcvf: implement vdpa_config_ops.get_vq_size
This commit implements vdpa_ops.get_vq_size to report
the size of a specific virtqueue.

Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Message-Id: <20240202163905.8834-4-lingshan.zhu@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19 02:45:50 -04:00
Zhu Lingshan
0a926fc972 vDPA: introduce get_vq_size to vdpa_config_ops
This commit introduces a new interface get_vq_size to
vDPA config ops, this new interface intends to report
the size of a specific virtqueue

Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Message-Id: <20240202163905.8834-3-lingshan.zhu@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19 02:45:50 -04:00
Zhu Lingshan
1496c47065 vhost-vdpa: uapi to support reporting per vq size
The size of a virtqueue is a per vq configuration.
This commit introduce a new ioctl uAPI to support this flexibility.

Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
Message-Id: <20240202163905.8834-2-lingshan.zhu@intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19 02:45:49 -04:00
Shannon Nelson
ba6faaa68d vdpa/pds: fixes for VF vdpa flr-aer handling
This addresses a couple of things found while testing the FLR and AER
handling with the VFs.
 - release irqs before calling vp_modern_remove()
 - make sure we have a valid struct pointer before using it to release irqs
 - make sure the FW is alive before trying to add a new device

Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Message-Id: <20240220011050.30913-1-shannon.nelson@amd.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19 02:45:49 -04:00
Maxime Coquelin
d7b4e3287c vduse: implement DMA sync callbacks
Since commit 295525e29a ("virtio_net: merge dma
operations when filling mergeable buffers"), VDUSE device
require support for DMA's .sync_single_for_cpu() operation
as the memory is non-coherent between the device and CPU
because of the use of a bounce buffer.

This patch implements both .sync_single_for_cpu() and
.sync_single_for_device() callbacks, and also skip bounce
buffer copies during DMA map and unmap operations if the
DMA_ATTR_SKIP_CPU_SYNC attribute is set to avoid extra
copies of the same buffer.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Message-Id: <20240219170606.587290-1-maxime.coquelin@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19 02:45:49 -04:00
Jonah Palmer
749a401683 vdpa/mlx5: Allow CVQ size changes
The MLX driver was not updating its control virtqueue size at set_vq_num
and instead always initialized to MLX5_CVQ_MAX_ENT (16) at
setup_cvq_vring.

Qemu would try to set the size to 64 by default, however, because the
CVQ size always was initialized to 16, an error would be thrown when
sending >16 control messages (as used-ring entry 17 is initialized to 0).
For example, starting a guest with x-svq=on and then executing the
following command would produce the error below:

 # for i in {1..20}; do ifconfig eth0 hw ether XX:xx:XX:xx:XX:XX; done

 qemu-system-x86_64: Insufficient written data (0)
 [  435.331223] virtio_net virtio0: Failed to set mac address by vq command.
 SIOCSIFHWADDR: Invalid argument

Acked-by: Dragos Tatulea <dtatulea@nvidia.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
Message-Id: <20240216142502.78095-1-jonah.palmer@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Fixes: 5262912ef3 ("vdpa/mlx5: Add support for control VQ and MAC setting")
2024-03-19 02:45:49 -04:00
Steve Sistare
c4e8b5aed7 vdpa: skip suspend/resume ops if not DRIVER_OK
If a vdpa device is not in state DRIVER_OK, then there is no driver state
to preserve, so no need to call the suspend and resume driver ops.

Suggested-by: Eugenio Perez Martin <eperezma@redhat.com>"
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Message-Id: <1707834358-165470-1-git-send-email-steven.sistare@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
2024-03-19 02:45:49 -04:00
David Hildenbrand
310227f428 virtio: reenable config if freezing device failed
Currently, we don't reenable the config if freezing the device failed.

For example, virtio-mem currently doesn't support suspend+resume, and
trying to freeze the device will always fail. Afterwards, the device
will no longer respond to resize requests, because it won't get notified
about config changes.

Let's fix this by re-enabling the config if freezing fails.

Fixes: 22b7050a02 ("virtio: defer config changed notifications")
Cc: <stable@kernel.org>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20240213135425.795001-1-david@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19 02:45:49 -04:00
Steve Sistare
9588e7fc51 vdpa_sim: reset must not run
vdpasim_do_reset sets running to true, which is wrong, as it allows
vdpasim_kick_vq to post work requests before the device has been
configured.  To fix, do not set running until VIRTIO_CONFIG_S_DRIVER_OK
is set.

Fixes: 0c89e2a3a9 ("vdpa_sim: Implement suspend vdpa op")
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <1707517807-137331-1-git-send-email-steven.sistare@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19 02:45:49 -04:00
Suzuki K Poulose
ec6ecb844d virtio: uapi: Drop __packed attribute in linux/virtio_pci.h
Commit 92792ac752 ("virtio-pci: Introduce admin command sending function")
added "__packed" structures to UAPI header linux/virtio_pci.h. This triggers
build failures in the consumer userspace applications without proper "definition"
of __packed (e.g., kvmtool build fails).

Moreover, the structures are already packed well, and doesn't need explicit
packing, similar to the rest of the structures in all virtio_* headers. Remove
the __packed attribute.

Fixes: 92792ac752 ("virtio-pci: Introduce admin command sending function")
Cc: Feng Liu <feliu@nvidia.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Yishai Hadas <yishaih@nvidia.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Message-Id: <20240125232039.913606-1-suzuki.poulose@arm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19 02:45:49 -04:00
Andrew Melnychenko
f6baca2d32 vhost: Added pad cleanup if vnet_hdr is not present.
When the Qemu launched with vhost but without tap vnet_hdr,
vhost tries to copy vnet_hdr from socket iter with size 0
to the page that may contain some trash.
That trash can be interpreted as unpredictable values for
vnet_hdr.
That leads to dropping some packets and in some cases to
stalling vhost routine when the vhost_net tries to process
packets and fails in a loop.

Qemu options:
  -netdev tap,vhost=on,vnet_hdr=off,...

Signed-off-by: Andrew Melnychenko <andrew@daynix.com>
Message-Id: <20240115194840.1183077-1-andrew@daynix.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-03-19 02:45:49 -04:00
Dave Airlie
02ac437111 Short summary of fixes pull:
probe-helper:
 - never return negative values from .get_modes() plus driver fixes
 
 nouveau:
 - clear bo resource bus after eviction
 - documentation fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAmXytIgACgkQaA3BHVML
 eiPuXAgAl8nqkA6c5SlAbXsamAHorjznl816Y5LeL4YgRWZpzrkTkxtpS6aP/4r4
 s9RbQzzxd6rmV56kTjQywPZRAVuUkZ3220vcfFie2JFOrysnfz7xEpCkOS1KT9uO
 JparsT3HbWHqlA2kQ5iHQWdgq6fBWTftoVF5vRmCkJsj9NoDtchCMBHPc0xfTSa0
 +NAvkX9F3ktXYB8vibpotKC0mn8chUul3EtnAINjUJVKjbjvPlJ048FVJTuI1yoe
 RDiih3k0A20KF2kpldAlQdhOCFgCy62O8/jYhZFjMonB25y7fHdmyVgSQP2UhvYS
 vIkX+B87c38VD3EJANNLOllJj8+Xog==
 =+tQp
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-next-fixes-2024-03-14' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next

Short summary of fixes pull:

probe-helper:
- never return negative values from .get_modes() plus driver fixes

nouveau:
- clear bo resource bus after eviction
- documentation fixes

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240314082833.GA8761@localhost.localdomain
2024-03-19 14:36:15 +10:00
Kent Overstreet
2e92d26b25 bcachefs: Fix lost wakeup on journal shutdown
We need to check for journal shutdown first in __journal_res_get() -
after the journal is shutdown, j->watermark won't be changing anymore.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-18 23:35:42 -04:00
Kent Overstreet
c502b5b878 bcachefs; Fix deadlock in bch2_btree_update_start()
BCH_TRANS_COMMIT_journal_reclaim with watermark != BCH_WATERMARK_reclaim
means nonblocking, and we need the journal_res_get() in
btree_update_start() to respect that.

In a future refactoring we'll be deleting
BCH_TRANS_COMMIT_journal_reclaim and replacing it with an explicit
BCH_TRANS_COMMIT_nonblocking.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-18 23:35:42 -04:00
Namjae Jeon
def30e72d8 ksmbd: remove module version
ksmbd module version marking is not needed. Since there is a
Linux kernel version, there is no point in increasing it anymore.

Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-18 21:21:38 -05:00
Namjae Jeon
c6cd2e8d2d ksmbd: fix potencial out-of-bounds when buffer offset is invalid
I found potencial out-of-bounds when buffer offset fields of a few requests
is invalid. This patch set the minimum value of buffer offset field to
->Buffer offset to validate buffer length.

Cc: stable@vger.kernel.org
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-18 21:21:33 -05:00
Dave Airlie
341f708158 Driver changes:
- Invalidate userptr VMA on page pin fault, allowing userspace
   to free userptr while still having bindings
 - Fail early on sysfs file creation error
 - Skip VMA pinning on xe_exec with num_batch_buffer == 0
 -----BEGIN PGP SIGNATURE-----
 
 iQJNBAABCAA3FiEE6rM8lpABPHM5FqyDm6KlpjDL6lMFAmXzxSgZHGx1Y2FzLmRl
 bWFyY2hpQGludGVsLmNvbQAKCRCboqWmMMvqUyGDEACPEFGiM+A6xFaKF3xh6MrQ
 wSYGGwTuQriACOxF1MSjdyDcI4n8jHM04negYPyZRdtausG3O37nZbHlHlUXmizY
 zw0aI8KAy2ZwkD7l0BwuFhAycs47+6ht5B6/sI05VMuN42VNclNCj3DOJ3v45rCL
 GIecBbNm9z+jy70B0r6snczCJ5dMZASbfJkgUTJTyvtRpJbJZRaOj3unn+fzuD6+
 8ZmPO/+MN7ALt7Zr57Ndhsz8kZtyDUpch/xgVrU2VgLSj3xloNnI6vyLM6jmFaka
 hdfPgd2U4e+kK8iQEB84AozGeKnsE7i7YkSO5bTnpIwMZMVrWUQll0TAiT/wp8sW
 TGVjXB83TzqXi8WvNkWZ6bcTXGFjpS8mXkTVp14moBQaJpXXakr1+uFxYZGavidg
 u5Ka6BN7f01utLUVBzL8kpvG5m8jd1p2BTLXsoAwxdfBMwVUctr+OkH5UVbc9Bc6
 q6LNGR25rFL4psSTxWVbfE4vnsHQONQIAlSe9OIS1llLuBuhbQm9gEm92XkKXqDw
 MaYHt8RivfuDsgpDDWADtGe8RJxwTRnpDWz/UArP+VCJy0vEjHi2IR8Y2Okw7KFg
 qjQiWjHZyYGcSFAfqk3hs4h1XPkpaJ99feNcwSkPHXS+ib/ypUilvEF9XJnJtMBG
 Ht7xHB0p26WR0v2SsfjXRQ==
 =HsE+
 -----END PGP SIGNATURE-----

Merge tag 'drm-xe-next-fixes-2024-03-14' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next

Driver changes:

- Invalidate userptr VMA on page pin fault, allowing userspace
  to free userptr while still having bindings
- Fail early on sysfs file creation error
- Skip VMA pinning on xe_exec with num_batch_buffer == 0

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/c4epi2j6anpc77z73zbgibxg7bxsmmkb522aa7tyei6oa6uunn@3oad4cgomd5a
2024-03-19 12:16:13 +10:00
Arthur Grillo
807f96abdf drm: Fix drm_fixp2int_round() making it add 0.5
As well noted by Pekka[1], the rounding of drm_fixp2int_round is wrong.
To round a number, you need to add 0.5 to the number and floor that,
drm_fixp2int_round() is adding 0.0000076. Make it add 0.5.

[1]: https://lore.kernel.org/all/20240301135327.22efe0dd.pekka.paalanen@collabora.com/

Fixes: 8b25320887 ("drm: Add fixed-point helper to get rounded integer values")
Suggested-by: Pekka Paalanen <pekka.paalanen@collabora.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net>
Signed-off-by: Melissa Wen <melissa.srw@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240316-drm_fixed-v2-1-c1bc2665b5ed@riseup.net
2024-03-18 23:28:28 -01:00
Linus Torvalds
b3603fcb79 dlm for 6.9
- Fix mistaken variable assignment that caused a refcounting problem.
 - Revert a recent change that began using atomic counters where they
   were not needed (for lkb wait_count.)
 - Add comments around forced state reset for waiting lock operations
   during recovery.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEcGkeEvkvjdvlR90nOBtzx/yAaaoFAmX4raoACgkQOBtzx/yA
 aapyThAAtLcTZXOa9MuZDvLtaQKX4c2MDlqiAhdL0YOYnz3+DAveA8HF1FRbVwL0
 74lA1O/GX0t2TdCrLiq75u+N/Sm2ACtbZEr8z6VeEoxxtOwCVbGKjA0CwDgvhdSe
 hUv5beO4mlguc16l4+u88z1Ta6GylXmWHRL6l2q4dPKmO4qVX6wn9JUT4JHJSQy/
 ACJ3+Lu7ndREBzCmqb4cR4TcHAhBynYmV7IIE3LQprgkCKiX2A3boeOIk+lEhUn5
 aqmwNNF2WDjJ1D5QVKbXu07MraD71rnyZBDuHzjprP01OhgXfUHLIcgdi7GzK8aN
 KnQ9S5hQWHzTiWA/kYgrUq/S5124plm2pMRyh1WDG6g3dhBxh7XsOHUxtgbLaurJ
 LmMxdQgH0lhJ3f+LSm3w8e3m45KxTeCYC2NUVg/icjOGUjAsVx1xMDXzMxoABoWO
 GGVED4i4CesjOyijMuRO9G/0MRb/lIyZkfoZgtHgL20yphmtv0B5XIIz062N28Wf
 PqmsYUz4ESYkxR4u/5VPBey5aYYdhugnOSERC6yH4QQJXyRgGWQn/CSuRrEmJJS2
 CurprPKx99XJZjZE7RJNlvpUrSBcD9Y7R6I3vo6RyrUCNwPJ0Y+Qvydvc9FoMN3R
 tn7fJe7tDfEEsukhGkwp90vK3MLbW5iKv7IaAxyALdSW12A23WM=
 =6RCz
 -----END PGP SIGNATURE-----

Merge tag 'dlm-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm

Pull dlm updates from David Teigland:

 - Fix mistaken variable assignment that caused a refcounting problem

 - Revert a recent change that began using atomic counters where they
   were not needed (for lkb wait_count)

 - Add comments around forced state reset for waiting lock operations
   during recovery

* tag 'dlm-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
  dlm: add comments about forced waiters reset
  dlm: revert atomic_t lkb_wait_count
  dlm: fix user space lkb refcounting
2024-03-18 15:39:48 -07:00
Linus Torvalds
6207b37eb5 RDMA v6.9
Very small update this cycle:
 
 - Minor code improvements in fi, rxe, ipoib, mana, cxgb4, mlx5, irdma,
   rxe, rtrs, mana
 
 - Simplify the hns hem mechanism
 
 - Fix EFA's MSI-X allocation in resource constrained configurations
 
 - Fix a KASN splat in srpt
 
 - Narrow hns's congestion control selection to QPs granularity and allow
   userspace to select it
 
 - Solve a parallel module loading race between the CM module and a driver
   module
 
 - Flexible array cleanup
 
 - Dump hns's SCC Conext to 'rdma res' for debugging
 
 - Make mana build page lists for HW objects that require a 0 offset
   correctly
 
 - Stuck CM ID debugging
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRRRCHOFoQz/8F5bUaFwuHvBreFYQUCZfgzdQAKCRCFwuHvBreF
 YbS7AQDLy6uJ/1dgrZQ4efcyQDs6H93LG4jWZKoA7F9Oho+MFQEAsQM/UL4nj18O
 T6vHl30N0Ee0aOCqET7HBbnFGKEADAE=
 =KxUj
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
 "Very small update this cycle:

   - Minor code improvements in fi, rxe, ipoib, mana, cxgb4, mlx5,
     irdma, rxe, rtrs, mana

   - Simplify the hns hem mechanism

   - Fix EFA's MSI-X allocation in resource constrained configurations

   - Fix a KASN splat in srpt

   - Narrow hns's congestion control selection to QPs granularity and
     allow userspace to select it

   - Solve a parallel module loading race between the CM module and a
     driver module

   - Flexible array cleanup

   - Dump hns's SCC Conext to 'rdma res' for debugging

   - Make mana build page lists for HW objects that require a 0 offset
     correctly

   - Stuck CM ID debugging"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (29 commits)
  RDMA/cm: add timeout to cm_destroy_id wait
  RDMA/mana_ib: Use virtual address in dma regions for MRs
  RDMA/mana_ib: Fix bug in creation of dma regions
  RDMA/hns: Append SCC context to the raw dump of QPC
  RDMA/uverbs: Avoid -Wflex-array-member-not-at-end warnings
  RDMA/hns: Support userspace configuring congestion control algorithm with QP granularity
  RDMA/rtrs-clt: Check strnlen return len in sysfs mpath_policy_store()
  RDMA/uverbs: Remove flexible arrays from struct *_filter
  RDMA/device: Fix a race between mad_client and cm_client init
  RDMA/hns: Fix mis-modifying default congestion control algorithm
  RDMA/rxe: Remove unused 'iova' parameter from rxe_mr_init_user
  RDMA/srpt: Do not register event handler until srpt device is fully setup
  RDMA/irdma: Remove duplicate assignment
  RDMA/efa: Limit EQs to available MSI-X vectors
  RDMA/mlx5: Delete unused mlx5_ib_copy_pas prototype
  RDMA/cxgb4: Delete unused c4iw_ep_redirect prototype
  RDMA/mana_ib: Introduce mana_ib_install_cq_cb helper function
  RDMA/mana_ib: Introduce mana_ib_get_netdev helper function
  RDMA/mana_ib: Introduce mdev_to_gc helper function
  RDMA/hns: Simplify 'struct hns_roce_hem' allocation
  ...
2024-03-18 15:34:03 -07:00
Linus Torvalds
65b64246f2 ktest updates for v6.9:
- Allow variables to contain variables. This makes the shell commands
   have a bit more flexibility to reuse existing variables.
 
 - Have make_warnings_file in build-only mode require limited variables
 
   The make_warnings_file test will create a file with all existing
   warnings (which can be used to compare against in builds with
   new commits). Add it to the build-only list that doesn't require
   other variables (like how to reset a machine), as the make_warnings_file
   makes the most sense on build only tests.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZfhlQRQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qoLnAP0XUeKMKV9JN1ayPUdQoN0stsseVLmt
 W+O0lowXVj3JWwD/d8mTVFVQHJ7zcmJQ3LJ/+daUmULjYX8daWGmVWYSyAg=
 =PMaK
 -----END PGP SIGNATURE-----

Merge tag 'ktest-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest

Pull ktest updates from Steven Rostedt:

 - Allow variables to contain variables. This makes the shell commands
   have a bit more flexibility to reuse existing variables.

 - Have make_warnings_file in build-only mode require limited variables

   The make_warnings_file test will create a file with all existing
   warnings (which can be used to compare against in builds with new
   commits). Add it to the build-only list that doesn't require other
   variables (like how to reset a machine), as the make_warnings_file
   makes the most sense on build only tests.

* tag 'ktest-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest:
  ktest: force $buildonly = 1 for 'make_warnings_file' test type
  ktest.pl: Process variables within variables
2024-03-18 15:27:03 -07:00
Linus Torvalds
ad584d73a2 Tracing updates for 6.9:
Main user visible change:
 
 - User events can now have "multi formats"
 
   The current user events have a single format. If another event is created
   with a different format, it will fail to be created. That is, once an
   event name is used, it cannot be used again with a different format. This
   can cause issues if a library is using an event and updates its format.
   An application using the older format will prevent an application using
   the new library from registering its event.
 
   A task could also DOS another application if it knows the event names, and
   it creates events with different formats.
 
   The multi-format event is in a different name space from the single
   format. Both the event name and its format are the unique identifier.
   This will allow two different applications to use the same user event name
   but with different payloads.
 
 - Added support to have ftrace_dump_on_oops dump out instances and
   not just the main top level tracing buffer.
 
 Other changes:
 
 - Add eventfs_root_inode
 
   Only the root inode has a dentry that is static (never goes away) and
   stores it upon creation. There's no reason that the thousands of other
   eventfs inodes should have a pointer that never gets set in its
   descriptor. Create a eventfs_root_inode desciptor that has a eventfs_inode
   descriptor and a dentry pointer, and only the root inode will use this.
 
 - Added WARN_ON()s in eventfs
 
   There's some conditionals remaining in eventfs that should never be hit,
   but instead of removing them, add WARN_ON() around them to make sure that
   they are never hit.
 
 - Have saved_cmdlines allocation also include the map_cmdline_to_pid array
 
   The saved_cmdlines structure allocates a large amount of data to hold its
   mappings. Within it, it has three arrays. Two are already apart of it:
   map_pid_to_cmdline[] and saved_cmdlines[]. More memory can be saved by
   also including the map_cmdline_to_pid[] array as well.
 
 - Restructure __string() and __assign_str() macros used in TRACE_EVENT().
 
   Dynamic strings in TRACE_EVENT() are declared with:
 
       __string(name, source)
 
   And assigned with:
 
      __assign_str(name, source)
 
   In the tracepoint callback of the event, the __string() is used to get the
   size needed to allocate on the ring buffer and __assign_str() is used to
   copy the string into the ring buffer. There's a helper structure that is
   created in the TRACE_EVENT() macro logic that will hold the string length
   and its position in the ring buffer which is created by __string().
 
   There are several trace events that have a function to create the string
   to save. This function is executed twice. Once for __string() and again
   for __assign_str(). There's no reason for this. The helper structure could
   also save the string it used in __string() and simply copy that into
   __assign_str() (it also already has its length).
 
   By using the structure to store the source string for the assignment, it
   means that the second argument to __assign_str() is no longer needed.
 
   It will be removed in the next merge window, but for now add a warning if
   the source string given to __string() is different than the source string
   given to __assign_str(), as the source to __assign_str() isn't even used
   and will be going away.
 
 - Added checks to make sure that the source of __string() is also the
   source of __assign_str() so that it can be safely removed in the next
   merge window.
 
   Included fixes that the above check found.
 
 - Other minor clean ups and fixes
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZfhbUBQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qrhJAP9bfnYO7tfNGZVNPmTT7Fz0z4zCU1Pb
 P8M+24yiFTeFWwD/aIPlMFZONVkTdFAlLdffl6kJOKxZ7vW4XzUjfNWb6wo=
 =z/D6
 -----END PGP SIGNATURE-----

Merge tag 'trace-v6.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing updates from Steven Rostedt:
 "Main user visible change:

   - User events can now have "multi formats"

     The current user events have a single format. If another event is
     created with a different format, it will fail to be created. That
     is, once an event name is used, it cannot be used again with a
     different format. This can cause issues if a library is using an
     event and updates its format. An application using the older format
     will prevent an application using the new library from registering
     its event.

     A task could also DOS another application if it knows the event
     names, and it creates events with different formats.

     The multi-format event is in a different name space from the single
     format. Both the event name and its format are the unique
     identifier. This will allow two different applications to use the
     same user event name but with different payloads.

   - Added support to have ftrace_dump_on_oops dump out instances and
     not just the main top level tracing buffer.

  Other changes:

   - Add eventfs_root_inode

     Only the root inode has a dentry that is static (never goes away)
     and stores it upon creation. There's no reason that the thousands
     of other eventfs inodes should have a pointer that never gets set
     in its descriptor. Create a eventfs_root_inode desciptor that has a
     eventfs_inode descriptor and a dentry pointer, and only the root
     inode will use this.

   - Added WARN_ON()s in eventfs

     There's some conditionals remaining in eventfs that should never be
     hit, but instead of removing them, add WARN_ON() around them to
     make sure that they are never hit.

   - Have saved_cmdlines allocation also include the map_cmdline_to_pid
     array

     The saved_cmdlines structure allocates a large amount of data to
     hold its mappings. Within it, it has three arrays. Two are already
     apart of it: map_pid_to_cmdline[] and saved_cmdlines[]. More memory
     can be saved by also including the map_cmdline_to_pid[] array as
     well.

   - Restructure __string() and __assign_str() macros used in
     TRACE_EVENT()

     Dynamic strings in TRACE_EVENT() are declared with:

         __string(name, source)

     And assigned with:

        __assign_str(name, source)

     In the tracepoint callback of the event, the __string() is used to
     get the size needed to allocate on the ring buffer and
     __assign_str() is used to copy the string into the ring buffer.
     There's a helper structure that is created in the TRACE_EVENT()
     macro logic that will hold the string length and its position in
     the ring buffer which is created by __string().

     There are several trace events that have a function to create the
     string to save. This function is executed twice. Once for
     __string() and again for __assign_str(). There's no reason for
     this. The helper structure could also save the string it used in
     __string() and simply copy that into __assign_str() (it also
     already has its length).

     By using the structure to store the source string for the
     assignment, it means that the second argument to __assign_str() is
     no longer needed.

     It will be removed in the next merge window, but for now add a
     warning if the source string given to __string() is different than
     the source string given to __assign_str(), as the source to
     __assign_str() isn't even used and will be going away.

   - Added checks to make sure that the source of __string() is also the
     source of __assign_str() so that it can be safely removed in the
     next merge window.

     Included fixes that the above check found.

   - Other minor clean ups and fixes"

* tag 'trace-v6.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (34 commits)
  tracing: Add __string_src() helper to help compilers not to get confused
  tracing: Use strcmp() in __assign_str() WARN_ON() check
  tracepoints: Use WARN() and not WARN_ON() for warnings
  tracing: Use div64_u64() instead of do_div()
  tracing: Support to dump instance traces by ftrace_dump_on_oops
  tracing: Remove second parameter to __assign_rel_str()
  tracing: Add warning if string in __assign_str() does not match __string()
  tracing: Add __string_len() example
  tracing: Remove __assign_str_len()
  ftrace: Fix most kernel-doc warnings
  tracing: Decrement the snapshot if the snapshot trigger fails to register
  tracing: Fix snapshot counter going between two tracers that use it
  tracing: Use EVENT_NULL_STR macro instead of open coding "(null)"
  tracing: Use ? : shortcut in trace macros
  tracing: Do not calculate strlen() twice for __string() fields
  tracing: Rework __assign_str() and __string() to not duplicate getting the string
  cxl/trace: Properly initialize cxl_poison region name
  net: hns3: tracing: fix hclgevf trace event strings
  drm/i915: Add missing ; to __assign_str() macros in tracepoint code
  NFSD: Fix nfsd_clid_class use of __string_len() macro
  ...
2024-03-18 15:11:44 -07:00
Michael Kelley
f2580a907e x86/hyperv: Use Hyper-V entropy to seed guest random number generator
A Hyper-V host provides its guest VMs with entropy in a custom ACPI
table named "OEM0".  The entropy bits are updated each time Hyper-V
boots the VM, and are suitable for seeding the Linux guest random
number generator (rng). See a brief description of OEM0 in [1].

Generation 2 VMs on Hyper-V use UEFI to boot. Existing EFI code in
Linux seeds the rng with entropy bits from the EFI_RNG_PROTOCOL.
Via this path, the rng is seeded very early during boot with good
entropy. The ACPI OEM0 table provided in such VMs is an additional
source of entropy.

Generation 1 VMs on Hyper-V boot from BIOS. For these VMs, Linux
doesn't currently get any entropy from the Hyper-V host. While this
is not fundamentally broken because Linux can generate its own entropy,
using the Hyper-V host provided entropy would get the rng off to a
better start and would do so earlier in the boot process.

Improve the rng seeding for Generation 1 VMs by having Hyper-V specific
code in Linux take advantage of the OEM0 table to seed the rng. For
Generation 2 VMs, use the OEM0 table to provide additional entropy
beyond the EFI_RNG_PROTOCOL. Because the OEM0 table is custom to
Hyper-V, parse it directly in the Hyper-V code in the Linux kernel
and use add_bootloader_randomness() to add it to the rng. Once the
entropy bits are read from OEM0, zero them out in the table so
they don't appear in /sys/firmware/acpi/tables/OEM0 in the running
VM. The zero'ing is done out of an abundance of caution to avoid
potential security risks to the rng. Also set the OEM0 data length
to zero so a kexec or other subsequent use of the table won't try
to use the zero'ed bits.

[1] https://download.microsoft.com/download/1/c/9/1c9813b8-089c-4fef-b2ad-ad80e79403ba/Whitepaper%20-%20The%20Windows%2010%20random%20number%20generation%20infrastructure.pdf

Signed-off-by: Michael Kelley <mhklinux@outlook.com>
Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
Link: https://lore.kernel.org/r/20240318155408.216851-1-mhklinux@outlook.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Message-ID: <20240318155408.216851-1-mhklinux@outlook.com>
2024-03-18 22:01:52 +00:00
Linus Torvalds
2cb5c86839 sysctl changes for v6.9-rc1
I'm sending you the sysctl pull request after following Luis' suggestion to
 become a maintainer. If you see that something is missing, get back to me with
 how to improve and I'll include your feedback in the following PRs.
 
 Here is a summary of the changes included in this PR:
 * New shared repo for sysctl maintenance
 * check-sysctl-docs adjustment for API changes by Thomas Weißschuh
 
 This is a non-functional PR. Additional testing is required for the rest of the
 pending changes. Future kernel pull requests will include the removal of the
 empty elements (sentinels) from sysctl arrays in the kernel/, net/, mm/ and
 security/ dirs. After that, the superfluous check for procname == NULL will be
 removed. And the push to avoid bloating the kernel as these arrays move out of
 kernel/sysctl.c will be completed.
 
 Even though Thomas' changes went into sysctl-next after v6.8-rc5 (3 weeks in
 linux-next), I include them as they contained no functional changes and
 therefore have little chance of resulting in an error/regression. Finally the
 new shared repo is now picked up by linux-next and is the source for upcoming
 sysctl changes.
 -----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEErkcJVyXmMSXOyyeQupfNUreWQU8FAmXzVSUACgkQupfNUreW
 QU938wv9F8giyaHfGAOOytq6zsMxEYt96t7YP8gAIApPrLIorfFPc/hP4fZhthwX
 G0KRuA2LLmBL8wq22otzwDx0I5p3zu1ZOEXX594MX2ac4iGRFTsGbZo4G/caiaDu
 tUEjxMKC4EChGd04Zh8QW93SFK2bQLJYm59ST4JnXynpFZ4B3B7y1AMTshMKdmGu
 KozaCt/IBi27Wsp8Bwlx39KL+wWtmluYtM4ErxTjUp2hXyDr5aQiNztD0yeOMrLN
 rIh3H7WYFbFVm3HY4ZgkVfRgKgKZBjI6+5lYu8C3BAgp+ltDkDY7rJu5ux2b5q1r
 Z9yQ4rg+pnsEjvIpq4trccbyPZX5hrgE9zUN7lJSKr2bqPTKAnJfN0FAQ4rNgHzO
 EFSHJQd26XuWoQIhwR07d8PDXnfKUH1f8mgN/LWFEXr4iQ1VBGBlYwbvrMkjyoVt
 Qb/bLUKomCEPzQ6qKrSDAqmcm4A8dl3jbMnjFT7zAfjrcMy8gsWY1sX/0FYR/KYs
 gPWmf0GW
 =mUbc
 -----END PGP SIGNATURE-----

Merge tag 'sysctl-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl

Pull sysctl updates from Joel Granados:
 "No functional changes - additional testing is required for the rest of
  the pending changes.

   - New shared repo for sysctl maintenance

   - check-sysctl-docs adjustment for API changes by Thomas Weißschuh"

* tag 'sysctl-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl:
  scripts: check-sysctl-docs: handle per-namespace sysctls
  ipc: remove linebreaks from arguments of __register_sysctl_table
  scripts: check-sysctl-docs: adapt to new API
  MAINTAINERS: Update sysctl tree location
2024-03-18 14:59:13 -07:00
Purna Pavan Chandra Aekkaladevi
eac03d81cd x86/hyperv: Cosmetic changes for hv_spinlock.c
Fix issues reported by checkpatch.pl script for hv_spinlock.c file.
- Place __initdata after variable name
- Add missing blank line after enum declaration

No functional changes intended.

Signed-off-by: Purna Pavan Chandra Aekkaladevi <paekkaladevi@linux.microsoft.com>
Reviewed-by: Saurabh Sengar <ssengar@linux.microsoft.com>
Link: https://lore.kernel.org/r/1710763751-14137-1-git-send-email-paekkaladevi@linux.microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Message-ID: <1710763751-14137-1-git-send-email-paekkaladevi@linux.microsoft.com>
2024-03-18 21:56:19 +00:00
Adam Butcher
cf6d79a0f5
spi: spi-imx: fix off-by-one in mx51 CPU mode burst length
c712c05e46 ("spi: imx: fix the burst length at DMA mode and CPU mode")
corrects three cases of setting the ECSPI burst length but erroneously
leaves the in-range CPU case one bit to big (in that field a value of
0 means 1 bit).  The effect was that transmissions that should have been
8-bit bytes appeared as 9-bit causing failed communication with SPI
devices.

Link: https://lore.kernel.org/all/20240201105451.507005-1-carlos.song@nxp.com/
Link: https://lore.kernel.org/all/20240204091912.36488-1-carlos.song@nxp.com/
Fixes: c712c05e46 ("spi: imx: fix the burst length at DMA mode and CPU mode")
Signed-off-by: Adam Butcher <adam@jessamine.co.uk>
Link: https://msgid.link/r/20240318175119.3334-1-adam@jessamine.co.uk
Signed-off-by: Mark Brown <broonie@kernel.org>
2024-03-18 21:06:54 +00:00
Chengming Zhou
a8922f7967 ceph: remove SLAB_MEM_SPREAD flag usage
The SLAB_MEM_SPREAD flag used to be implemented in SLAB, which was
removed as of v6.8-rc1, so it became a dead flag since the commit
16a1d96835 ("mm/slab: remove mm/slab.c and slab_def.h"). And the
series [1] went on to mark it obsolete to avoid confusion for users.
Here we can just remove all its users, which has no functional change.

[1] https://lore.kernel.org/all/20240223-slab-cleanup-flags-v2-1-02f1753e8303@suse.cz/

Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Reviewed-by: Xiubo Li <xiubli@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2024-03-18 22:03:29 +01:00
Xiubo Li
09927e7ef1 ceph: break the check delayed cap loop every 5s
In some cases this may take a long time and will block renewing
the caps to MDS.

[ idryomov: massage comment ]

Link: https://tracker.ceph.com/issues/50223#note-21
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2024-03-18 22:03:29 +01:00
Linus Torvalds
bf3a69c686 One fix, one cleanup...
Fix:
 Julia Lawall pointed out a null pointer dereference.
 
 Cleanup:
 Vlastimil Babka sent me a patch to remove some SLAB related code.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEIGSFVdO6eop9nER2z0QOqevODb4FAmXSUcsACgkQz0QOqevO
 Db5VmxAAiQlsxX2Ki3q00rMgaXFS4yTwSPsGsTrC5eQlyJfq4xEGMTWJGjjRS56k
 L2FGioP7OmIlo2VrzCc9Ms4ve/NyQjXpaoDMnsEUWSUfd8OHSJkBrpeVUWWfcHHk
 zLEmxNb2AETcJupAgoJOWOoSb59ggCKpCqmLezYoSZmlnb9qg6lhFbnWtkVC6q+p
 AREOfByoLIrJUtVh4Bmexo4nO5w3F84cfAV2WAmLMnXKjGnyFLGkqQvy8yXW0sA1
 hsZW+VjmoRaCG78M6OX+Sl3ok0V8i2AcOkguWPj+dkCPb8XLkxhGwjnqLziJ54Z5
 aFrtSzeZiNQOqy7b6cj6+x2KcWE5FhphAKjX/psEZrZNa0e6ZvNfby3yJ4TzNWaN
 eajtOtcq+Ec9IruWXt/WCsm0zYwW1HumUhga5QCHjQRRjOt36ua4QC02iCx2sYuX
 SBnsBCgQo1xxAta3uOMj2sG38lUwYoH0U5wlPsqrGh1nsbGbc49Ok7BYX/wWF8os
 CYnT5t2KR9yUvblV+dH9XTj2EwqgINMRYBW7uBjZqY9gq2v/RKrtQCjnedAAA+yx
 B6UUob/naV5VpXfhwpXiw2oJrQy/kqQuwOEcgY/a6cwmENd9RuwGXBiBI6hrAOzl
 ftxiUcByW8/hS13G04qJ7pGTACs4njteMvvg+Y68nWUmPWMZTio=
 =/gAC
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-6.9-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux

Pull orangefs updates from Mike Marshall:
 "One fix, one cleanup...

  Fix: Julia Lawall pointed out a null pointer dereference.

  Cleanup: Vlastimil Babka sent me a patch to remove some SLAB related
  code"

* tag 'for-linus-6.9-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux:
  Julia Lawall reported this null pointer dereference, this should fix it.
  fs/orangefs: remove ORANGEFS_CACHE_CREATE_FLAGS
2024-03-18 12:15:19 -07:00
Linus Torvalds
c5d9ab85eb f2fs update for 6.9-rc1
In this round, there are a number of updates on mainly two areas: Zoned block
 device support and Per-file compression. For example, we've found several issues
 to support Zoned block device especially having large sections regarding to GC
 and file pinning used for Android devices. In compression side, we've fixed many
 corner race conditions that had broken the design assumption.
 
 Enhancement:
  - Support file pinning for Zoned block device having large section
  - Enhance the data recovery after sudden power cut on Zoned block device
  - Add more error injection cases to easily detect the kernel panics
  - add a proc entry show the entire disk layout
  - Improve various error paths paniced by BUG_ON in block allocation and GC
  - support SEEK_DATA and SEEK_HOLE for compression files
 
 Bug fix:
  - fix to avoid use-after-free issue in f2fs_filemap_fault
  - fix some race conditions to break the atomic write design assumption
  - fix to truncate meta inode pages forcely
  - resolve various per-file compression issues wrt the space management and
    compression policies
  - fix some swap-related bugs
 
 In addition, we removed deprecated codes such as io_bits and heap_allocation,
 and also fixed minor error handling routines with neat debugging messages.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE00UqedjCtOrGVvQiQBSofoJIUNIFAmX4gS0ACgkQQBSofoJI
 UNLmgBAAg4mvbWjmJ5VbXs4zGLOgLRJYcY1sZRO5Ufg4LhWzoGRxL1Dru+TELw0t
 1Ck2EQvP91XZ5weA5AZOfWbxcijy4+8L3P8L7ohOShudfACci0wQsx6IaUUWWylC
 ILA4+DkovpZrlu6th12Gj9QAM6TN9gdy3V1VLT5O/KmE1x6Pekwp2hQoIvVJRH5L
 I3KxOf5fTe3oWLvEN6m7yCz/8qGqz8+w0ae90UG0fqi0wVEuZJ99zsVPnuhu6uBo
 riFm2A6ra0I/JqoPyqn2QM6ApItM867ULo9EoyQVgq56Q1w31ENOJXsU9N7N4Wxt
 olgujH1SijkWk9ni57iKtMhR68e3Rs+pVsuNFmJuOPq0HASoggB66QRrVvCgM9JG
 z3D//CB2ONtX2XiKJMiTcX9VqIqrMw6L1eVxEZu0P96C3CS70MoBU69mdSR9Og2S
 5nQXja3yzFhdk3thp6+wAJ3I04ZQkf3qoHZB+0chU2Xl1pV+5NIkBgBsSw8g/TY3
 EIHMfK+TX0SBSNCvkUDEJ+Z8ZRID6tcbAquTSsBr6wxB+F9mq7onEvI8O7xwyH9W
 DU8xhymOE2QUoluNtyW7ww6HK913ripXIenI9LaYJnuj0XeDAcMIoPsgR7AGU5UG
 hshvirFdUdWRMTfXxNNUrvhOWI0qurQSVx+VV6Qb62DGqR5ofOw=
 =Qpvy
 -----END PGP SIGNATURE-----

Merge tag 'f2fs-for-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs update from Jaegeuk Kim:
 "In this round, there are a number of updates on mainly two areas:
  Zoned block device support and Per-file compression. For example,
  we've found several issues to support Zoned block device especially
  having large sections regarding to GC and file pinning used for
  Android devices. In compression side, we've fixed many corner race
  conditions that had broken the design assumption.

  Enhancements:
   - Support file pinning for Zoned block device having large section
   - Enhance the data recovery after sudden power cut on Zoned block
     device
   - Add more error injection cases to easily detect the kernel panics
   - add a proc entry show the entire disk layout
   - Improve various error paths paniced by BUG_ON in block allocation
     and GC
   - support SEEK_DATA and SEEK_HOLE for compression files

  Bug fixes:
   - avoid use-after-free issue in f2fs_filemap_fault
   - fix some race conditions to break the atomic write design
     assumption
   - fix to truncate meta inode pages forcely
   - resolve various per-file compression issues wrt the space
     management and compression policies
   - fix some swap-related bugs

  In addition, we removed deprecated codes such as io_bits and
  heap_allocation, and also fixed minor error handling routines with
  neat debugging messages"

* tag 'f2fs-for-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (60 commits)
  f2fs: fix to avoid use-after-free issue in f2fs_filemap_fault
  f2fs: truncate page cache before clearing flags when aborting atomic write
  f2fs: mark inode dirty for FI_ATOMIC_COMMITTED flag
  f2fs: prevent atomic write on pinned file
  f2fs: fix to handle error paths of {new,change}_curseg()
  f2fs: unify the error handling of f2fs_is_valid_blkaddr
  f2fs: zone: fix to remove pow2 check condition for zoned block device
  f2fs: fix to truncate meta inode pages forcely
  f2fs: compress: fix reserve_cblocks counting error when out of space
  f2fs: compress: relocate some judgments in f2fs_reserve_compress_blocks
  f2fs: add a proc entry show disk layout
  f2fs: introduce SEGS_TO_BLKS/BLKS_TO_SEGS for cleanup
  f2fs: fix to check return value of f2fs_gc_range
  f2fs: fix to check return value __allocate_new_segment
  f2fs: fix to do sanity check in update_sit_entry
  f2fs: fix to reset fields for unloaded curseg
  f2fs: clean up new_curseg()
  f2fs: relocate f2fs_precache_extents() in f2fs_swap_activate()
  f2fs: fix blkofs_end correctly in f2fs_migrate_blocks()
  f2fs: ro: don't start discard thread for readonly image
  ...
2024-03-18 11:26:00 -07:00
Anand Jain
d565fffa68 btrfs: do not skip re-registration for the mounted device
There are reports that since version 6.7 update-grub fails to find the
device of the root on systems without initrd and on a single device.

This looks like the device name changed in the output of
/proc/self/mountinfo:

6.5-rc5 working

  18 1 0:16 / / rw,noatime - btrfs /dev/sda8 ...

6.7 not working:

  17 1 0:15 / / rw,noatime - btrfs /dev/root ...

and "update-grub" shows this error:

  /usr/sbin/grub-probe: error: cannot find a device for / (is /dev mounted?)

This looks like it's related to the device name, but grub-probe
recognizes the "/dev/root" path and tries to find the underlying device.
However there's a special case for some filesystems, for btrfs in
particular.

The generic root device detection heuristic is not done and it all
relies on reading the device infos by a btrfs specific ioctl. This ioctl
returns the device name as it was saved at the time of device scan (in
this case it's /dev/root).

The change in 6.7 for temp_fsid to allow several single device
filesystem to exist with the same fsid (and transparently generate a new
UUID at mount time) was to skip caching/registering such devices.

This also skipped mounted device. One step of scanning is to check if
the device name hasn't changed, and if yes then update the cached value.

This broke the grub-probe as it always read the device /dev/root and
couldn't find it in the system. A temporary workaround is to create a
symlink but this does not survive reboot.

The right fix is to allow updating the device path of a mounted
filesystem even if this is a single device one.

In the fix, check if the device's major:minor number matches with the
cached device. If they do, then we can allow the scan to happen so that
device_list_add() can take care of updating the device path. The file
descriptor remains unchanged.

This does not affect the temp_fsid feature, the UUID of the mounted
filesystem remains the same and the matching is based on device major:minor
which is unique per mounted filesystem.

This covers the path when the device (that exists for all mounted
devices) name changes, updating /dev/root to /dev/sdx. Any other single
device with filesystem and is not mounted is still skipped.

Note that if a system is booted and initial mount is done on the
/dev/root device, this will be the cached name of the device. Only after
the command "btrfs device scan" it will change as it triggers the
rename.

The fix was verified by users whose systems were affected.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=218353
Link: https://lore.kernel.org/lkml/CAKLYgeJ1tUuqLcsquwuFqjDXPSJpEiokrWK2gisPKDZLs8Y2TQ@mail.gmail.com/
Fixes: bc27d6f0aa ("btrfs: scan but don't register device on single device filesystem")
CC: stable@vger.kernel.org # 6.7+
Tested-by: Alex Romosan <aromosan@gmail.com>
Tested-by: CHECK_1234543212345@protonmail.com
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-18 19:16:50 +01:00
Linus Torvalds
0d7ca657df overlayfs fixes for 6.9-rc1
-----BEGIN PGP SIGNATURE-----
 
 iQIyBAABCAAdFiEE9zuTYTs0RXF+Ke33EVvVyTe/1WoFAmX4AWYACgkQEVvVyTe/
 1WpUXA/4r0qYfDU3M1NYtBhcRcXe44WvfdMxUPgc5IqffHof058FWC6cRhRnKB9U
 R1Wb09xIguX2hWsseyq1MWBuoZUq3nWUNRfTkld0uUX1I97ggV70YqRY2SSUWqCq
 eLMYwR6e5Ar1/noMkHYMLupJknTJsyOOFTXZrAvl5aZDCKvAQUQoVWY772B71/4j
 odH7wpgVX4Ah6U5Zk3anEdn5TVcvl4XXW4U1o4kJzUjIgxI9ZyrBpN6EmoSgpoQj
 zRfbQ6OrT2FNAPc/y3Dwn9fAoUZT2F3A/R4dVddBAPyLY96Qj+p0fv+y+0RY9efV
 dcf0bvoj4xXT+9KDVVp2rWszmFnr0niu+dE6bCkzW/lC2Uz/NIghjKcgwOGIjqpb
 9o8CzqGOF6B07A7SoVtcQlZR+7iUn+SUPEIk8JpvDkDLCLBOqd6rc58MnlMwJwrs
 OsMKH8IQKb6bGO0wfxmeyWj8637k9dtc0cF16hHhCxlRoextU3guNw0uBlta29QQ
 J+QCkFR6F2xQH2UbKEpOTsQo+2x385mEByXZdSUGkCM7ABb3cictrIMXyWJMNoWq
 un5tnJ5d/E3N+FKc5r9UWGHguwdYKiucvx7NQC6lK0y3b8aZMyVioiCVYX+lPKHB
 vdSp6F5vBrL6PHVsD9oDjiqLl8gQk7eIc4ne6zs/T9M5h8yHSg==
 =cMqg
 -----END PGP SIGNATURE-----

Merge tag 'ovl-fixes-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs

Pull overlayfs fixes from Amir Goldstein:
 "Only minor fixes:

   - Fix uncalled for WARN_ON from v6.8-rc1

   - Fix the overlayfs MAINTAINERS entry"

* tag 'ovl-fixes-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs:
  ovl: relax WARN_ON in ovl_verify_area()
  MAINTAINERS: update overlayfs git tree
2024-03-18 11:15:58 -07:00