We must add hugetlb_free_vmemmap=on (or "off") to the boot cmdline and
reboot the server to enable or disable the feature of optimizing vmemmap
pages associated with HugeTLB pages. However, rebooting usually takes a
long time. So add a sysctl to enable or disable the feature at runtime
without rebooting. Why we need this? There are 3 use cases.
1) The feature of minimizing overhead of struct page associated with
each HugeTLB is disabled by default without passing
"hugetlb_free_vmemmap=on" to the boot cmdline. When we (ByteDance)
deliver the servers to the users who want to enable this feature, they
have to configure the grub (change boot cmdline) and reboot the
servers, whereas rebooting usually takes a long time (we have thousands
of servers). It's a very bad experience for the users. So we need a
approach to enable this feature after rebooting. This is a use case in
our practical environment.
2) Some use cases are that HugeTLB pages are allocated 'on the fly'
instead of being pulled from the HugeTLB pool, those workloads would be
affected with this feature enabled. Those workloads could be
identified by the characteristics of they never explicitly allocating
huge pages with 'nr_hugepages' but only set 'nr_overcommit_hugepages'
and then let the pages be allocated from the buddy allocator at fault
time. We can confirm it is a real use case from the commit
099730d674. For those workloads, the page fault time could be ~2x
slower than before. We suspect those users want to disable this
feature if the system has enabled this before and they don't think the
memory savings benefit is enough to make up for the performance drop.
3) If the workload which wants vmemmap pages to be optimized and the
workload which wants to set 'nr_overcommit_hugepages' and does not want
the extera overhead at fault time when the overcommitted pages be
allocated from the buddy allocator are deployed in the same server.
The user could enable this feature and set 'nr_hugepages' and
'nr_overcommit_hugepages', then disable the feature. In this case, the
overcommited HugeTLB pages will not encounter the extra overhead at
fault time.
Link: https://lkml.kernel.org/r/20220512041142.39501-5-songmuchun@bytedance.com
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Iurii Zaikin <yzaikin@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Xiongchun Duan <duanxiongchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
A semicolon was missing, and the almost-alphabetical-but-not ordering
was confusing, so regroup these by category instead.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
This adds the new nf_conntrack_events=2 mode and makes it the
default.
This leverages the earlier flag in struct net to allow to avoid
the event extension as long as no event listener is active in
the namespace.
This avoids, for most cases, allocation of ct->ext area.
A followup patch will take further advantage of this by avoiding
calls down into the event framework if the extension isn't present.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Arm SMMU updates for 5.19
- Add new Qualcomm device-tree compatible strings
- Add new Nvidia device-tree compatible string for Tegra234
- Fix UAF in SMMUv3 shared virtual addressing code
- Force identity-mapped domains for users of ye olde SMMU legacy binding
- Minor cleanups
These two fields need documentation as they have dual meaning. It is also
confusing since pic_num is a derived value from frame_num, so this should
help application developers. If we ever need to make a V2 of this API, I
would suggest to remove pic_num entirely.
Signed-off-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Reviewed-by: Sebastian Fricke <sebastian.fricke@collabora.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
The x86 Chromebooks have the ChromeOS ACPI device. This driver attaches
to the ChromeOS ACPI device and exports the values reported by ACPI in a
sysfs directory. This data isn't present in ACPI tables when read
through ACPI tools, hence a driver is needed to do it. The driver gets
data from firmware using the ACPI component of the kernel. The ACPI values
are presented in string form (numbers as decimal values) or binary
blobs, and can be accessed as the contents of the appropriate read only
files in the standard ACPI device's sysfs directory tree. This data is
consumed by the ChromeOS user space.
Reviewed-by: Guenter Roeck <groeck@chromium.org>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Co-developed-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: Tzung-Bi Shih <tzungbi@kernel.org>
Link: https://lore.kernel.org/r/Yn4OKYrtV35Dv+nd@debian-BULLSEYE-live-builder-AMD64
New peripherals supported on rk356x: sfc, usb3, sata and
the video-decoder on rk3328.
RK3399 received some improvements and nodes for the memory controller.
Additional peripherals for PineNote, Gru and BananaPi-R2-Pro.
New boards are the Firefly Station M2, Pine64 SoQuartz SOM and
Quartz64 model B as well as the Radxa Rock3 model A.
* tag 'v5.19-rockchip-dts64-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip: (32 commits)
arm64: dts: rockchip: enable otg/drd operation of usb_host0_xhci in rk356x
arm64: dts: rockchip: rename HDMI ref clock to 'ref' on rk3399
arm64: dts: rockchip: add dts for Firefly Station M2 rk3566
arm64: dts: rockchip: add SoQuartz CM4IO dts
arm64: dts: rockchip: add Pine64 Quartz64-B device tree
dt-bindings: arm: rockchip: Add Firefly Station M2
dt-bindings: arm: rockchip: Add Pine64 SoQuartz SoM
dt-bindings: arm: rockchip: Add Pine64 Quartz64 Model B
arm64: dts: rockchip: enable usb hub on the radxa rock3 model a
arm64: dts: rockchip: add usb3 support to the radxa rock3 model a
arm64: dts: rockchip: add rk356x sfc support
arm64: dts: rockchip: Add USB and TCPC to rk3566-pinenote
arm64: dts: rockchip: Add accelerometer to rk3566-pinenote
arm64: dts: rockchip: add an input enable pinconf to rk3399
arm64: dts: rockchip: Add vdec support for RK3328
arm64: dts: rockchip: Rename vdec_mmu node for RK3328
arm64: dts: rockchip: Enable dmc and dfi nodes on gru
arm64: dts: rockchip: Add dfi and dmc nodes to rk3399
arm64: dts: rockchip: add clocks property to cru nodes rk3399
arm64: dts: rockchip: use generic node name for pmucru on rk3399
...
Link: https://lore.kernel.org/r/7748558.DvuYhMxLoT@phil
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Document the RK3328 compatible for rockchip-vdec. The driver shares
the same base functionality as the RK3399 hardware so make sure that
the RK3399 compatible is also included in the device tree.
Signed-off-by: Christopher Obbard <chris.obbard@collabora.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
I tried to debug streaming in libcamera, where I stumbled upon this.
I asked around on IRC where I was told that this statement in the
documentation is wrong, so I'm submitting a removal.
Signed-off-by: Dorota Czaplejewicz <dorota.czaplejewicz@puri.sm>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Describe the coding tree unit as replacement for the macroblock in the
HEVC codec. Highlight a key difference of the HEVC codec to predecessors
like AVC(H.264) to give a better overview of the differences between the
coding standards.
[hverkuil: replaced the 'corresponds to' symbol with the full text for clarity]
Signed-off-by: Sebastian Fricke <sebastian.fricke@collabora.com>
Acked-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
update documentation to correctly state the interrupt-cells to be 2.
Cc: stable@vger.kernel.org
Fixes: 4fd9bbc6e0 ("drivers/gpio: Altera soft IP GPIO driver devicetree binding")
Signed-off-by: Dinh Nguyen <dinguyen@kernel.org>
Add documentation for In-Field Scan (IFS). This documentation
describes the basics of IFS, the loading IFS image, chunk
authentication, running scan and how to check result via sysfs.
The CORE_CAPABILITIES MSR enumerates whether IFS is supported.
The full github location for distributing the IFS images is
still being decided. So just a placeholder included for now
in the documentation.
Future CPUs will support more than one type of test. Plan for
that now by using a "_0" suffix on the ABI directory names.
Additional test types will use "_1", etc.
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20220506225410.1652287-13-tony.luck@intel.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Add KRYO4XX gold/big cores to the list of CPUs that need the
repeat TLBI workaround. Apply this to the affected
KRYO4XX cores (rcpe to rfpe).
The variant and revision bits are implementation defined and are
different from the their Cortex CPU counterparts on which they are
based on, i.e., (r0p0 to r3p0) is equivalent to (rcpe to rfpe).
Signed-off-by: Shreyas K K <quic_shrekk@quicinc.com>
Reviewed-by: Sai Prakash Ranjan <quic_saipraka@quicinc.com>
Link: https://lore.kernel.org/r/20220512110134.12179-1-quic_shrekk@quicinc.com
Signed-off-by: Will Deacon <will@kernel.org>
MediaTek Cache Coherent Interconnect (CCI) uses software devfreq module
for scaling clock frequency and adjust voltage.
The phandle could be linked between CPU and MediaTek CCI for some
MediaTek SoCs, like MT8183 and MT8186.
The reason we need the link status between cpufreq and MediaTek cci is
cpufreq and mediatek cci could share the same regulator in some MediaTek
SoCs. Therefore, to prevent the issue of high frequency and low voltage,
we need to use this to make sure mediatek cci is ready.
Signed-off-by: Rex-BC Chen <rex-bc.chen@mediatek.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Currently both expedited and regular grace period stall warnings use
a single timeout value that with units of seconds. However, recent
Android use cases problem require a sub-100-millisecond expedited RCU CPU
stall warning. Given that expedited RCU grace periods normally complete
in far less than a single millisecond, especially for small systems,
this is not unreasonable.
Therefore introduce the CONFIG_RCU_EXP_CPU_STALL_TIMEOUT kernel
configuration that defaults to 20 msec on Android and remains the same
as that of the non-expedited stall warnings otherwise. It also can be
changed in run-time via: /sys/.../parameters/rcu_exp_cpu_stall_timeout.
[ paulmck: Default of zero to use CONFIG_RCU_STALL_TIMEOUT. ]
Signed-off-by: Uladzislau Rezki <uladzislau.rezki@sony.com>
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Looks like all the changes to this driver had been tree-wide
refactoring since git era begun. The driver is using virt_to_bus()
we should make it use more modern DMA APIs but since it's unlikely
to be getting any use these days delete it instead. We can always
revert to bring it back.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>