- A series ("kbuild: enable more warnings by default") from Arnd

Bergmann which enables a number of additional build-time warnings.  We
   fixed all the fallout which we could find, there may still be a few
   stragglers.
 
 - Samuel Holland has developed the series "Unified cross-architecture
   kernel-mode FPU API".  This does a lot of consolidation of
   per-architecture kernel-mode FPU usage and enables the use of newer AMD
   GPUs on RISC-V.
 
 - Tao Su has fixed some selftests build warnings in the series
   "Selftests: Fix compilation warnings due to missing _GNU_SOURCE
   definition".
 
 - This pull also includes a nilfs2 fixup from Ryusuke Konishi.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZk6OSAAKCRDdBJ7gKXxA
 jpTGAP9hQaZ+g7CO38hKQAtEI8rwcZJtvUAP84pZEGMjYMGLxQD/S8z1o7UHx61j
 DUbnunbOkU/UcPx3Fs/gp4KcJARMEgs=
 =EPi9
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2024-05-22-17-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull more non-mm updates from Andrew Morton:

 - A series ("kbuild: enable more warnings by default") from Arnd
   Bergmann which enables a number of additional build-time warnings. We
   fixed all the fallout which we could find, there may still be a few
   stragglers.

 - Samuel Holland has developed the series "Unified cross-architecture
   kernel-mode FPU API". This does a lot of consolidation of
   per-architecture kernel-mode FPU usage and enables the use of newer
   AMD GPUs on RISC-V.

 - Tao Su has fixed some selftests build warnings in the series
   "Selftests: Fix compilation warnings due to missing _GNU_SOURCE
   definition".

 - This pull also includes a nilfs2 fixup from Ryusuke Konishi.

* tag 'mm-nonmm-stable-2024-05-22-17-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (23 commits)
  nilfs2: make block erasure safe in nilfs_finish_roll_forward()
  selftests/harness: use 1024 in place of LINE_MAX
  Revert "selftests/harness: remove use of LINE_MAX"
  selftests/fpu: allow building on other architectures
  selftests/fpu: move FP code to a separate translation unit
  drm/amd/display: use ARCH_HAS_KERNEL_FPU_SUPPORT
  drm/amd/display: only use hard-float, not altivec on powerpc
  riscv: add support for kernel-mode FPU
  x86: implement ARCH_HAS_KERNEL_FPU_SUPPORT
  powerpc: implement ARCH_HAS_KERNEL_FPU_SUPPORT
  LoongArch: implement ARCH_HAS_KERNEL_FPU_SUPPORT
  lib/raid6: use CC_FLAGS_FPU for NEON CFLAGS
  arm64: crypto: use CC_FLAGS_FPU for NEON CFLAGS
  arm64: implement ARCH_HAS_KERNEL_FPU_SUPPORT
  ARM: crypto: use CC_FLAGS_FPU for NEON CFLAGS
  ARM: implement ARCH_HAS_KERNEL_FPU_SUPPORT
  arch: add ARCH_HAS_KERNEL_FPU_SUPPORT
  x86/fpu: fix asm/fpu/types.h include guard
  kbuild: enable -Wcast-function-type-strict unconditionally
  kbuild: enable -Wformat-truncation on clang
  ...
This commit is contained in:
Linus Torvalds 2024-05-22 18:59:29 -07:00
commit c760b3725e
41 changed files with 365 additions and 220 deletions

View File

@ -0,0 +1,78 @@
.. SPDX-License-Identifier: GPL-2.0+
Floating-point API
==================
Kernel code is normally prohibited from using floating-point (FP) registers or
instructions, including the C float and double data types. This rule reduces
system call overhead, because the kernel does not need to save and restore the
userspace floating-point register state.
However, occasionally drivers or library functions may need to include FP code.
This is supported by isolating the functions containing FP code to a separate
translation unit (a separate source file), and saving/restoring the FP register
state around calls to those functions. This creates "critical sections" of
floating-point usage.
The reason for this isolation is to prevent the compiler from generating code
touching the FP registers outside these critical sections. Compilers sometimes
use FP registers to optimize inlined ``memcpy`` or variable assignment, as
floating-point registers may be wider than general-purpose registers.
Usability of floating-point code within the kernel is architecture-specific.
Additionally, because a single kernel may be configured to support platforms
both with and without a floating-point unit, FPU availability must be checked
both at build time and at run time.
Several architectures implement the generic kernel floating-point API from
``linux/fpu.h``, as described below. Some other architectures implement their
own unique APIs, which are documented separately.
Build-time API
--------------
Floating-point code may be built if the option ``ARCH_HAS_KERNEL_FPU_SUPPORT``
is enabled. For C code, such code must be placed in a separate file, and that
file must have its compilation flags adjusted using the following pattern::
CFLAGS_foo.o += $(CC_FLAGS_FPU)
CFLAGS_REMOVE_foo.o += $(CC_FLAGS_NO_FPU)
Architectures are expected to define one or both of these variables in their
top-level Makefile as needed. For example::
CC_FLAGS_FPU := -mhard-float
or::
CC_FLAGS_NO_FPU := -msoft-float
Normal kernel code is assumed to use the equivalent of ``CC_FLAGS_NO_FPU``.
Runtime API
-----------
The runtime API is provided in ``linux/fpu.h``. This header cannot be included
from files implementing FP code (those with their compilation flags adjusted as
above). Instead, it must be included when defining the FP critical sections.
.. c:function:: bool kernel_fpu_available( void )
This function reports if floating-point code can be used on this CPU or
platform. The value returned by this function is not expected to change
at runtime, so it only needs to be called once, not before every
critical section.
.. c:function:: void kernel_fpu_begin( void )
void kernel_fpu_end( void )
These functions create a floating-point critical section. It is only
valid to call ``kernel_fpu_begin()`` after a previous call to
``kernel_fpu_available()`` returned ``true``. These functions are only
guaranteed to be callable from (preemptible or non-preemptible) process
context.
Preemption may be disabled inside critical sections, so their size
should be minimized. They are *not* required to be reentrant. If the
caller expects to nest critical sections, it must implement its own
reference counting.

View File

@ -48,6 +48,7 @@ Library functionality that is used throughout the kernel.
errseq
wrappers/atomic_t
wrappers/atomic_bitops
floating-point
Low level entry and exit
========================

View File

@ -970,6 +970,11 @@ KBUILD_CFLAGS += $(CC_FLAGS_CFI)
export CC_FLAGS_CFI
endif
# Architectures can define flags to add/remove for floating-point support
CC_FLAGS_FPU += -D_LINUX_FPU_COMPILATION_UNIT
export CC_FLAGS_FPU
export CC_FLAGS_NO_FPU
ifneq ($(CONFIG_FUNCTION_ALIGNMENT),0)
# Set the minimal function alignment. Use the newer GCC option
# -fmin-function-alignment if it is available, or fall back to -falign-funtions.

View File

@ -1594,6 +1594,12 @@ config ARCH_HAS_NONLEAF_PMD_YOUNG
address translations. Page table walkers that clear the accessed bit
may use this capability to reduce their search space.
config ARCH_HAS_KERNEL_FPU_SUPPORT
bool
help
Architectures that select this option can run floating-point code in
the kernel, as described in Documentation/core-api/floating-point.rst.
source "kernel/gcov/Kconfig"
source "scripts/gcc-plugins/Kconfig"

View File

@ -130,6 +130,13 @@ endif
# Accept old syntax despite ".syntax unified"
AFLAGS_NOWARN :=$(call as-option,-Wa$(comma)-mno-warn-deprecated,-Wa$(comma)-W)
# The GCC option -ffreestanding is required in order to compile code containing
# ARM/NEON intrinsics in a non C99-compliant environment (such as the kernel)
CC_FLAGS_FPU := -ffreestanding
# Enable <arm_neon.h>
CC_FLAGS_FPU += -isystem $(shell $(CC) -print-file-name=include)
CC_FLAGS_FPU += -march=armv7-a -mfloat-abi=softfp -mfpu=neon
ifeq ($(CONFIG_THUMB2_KERNEL),y)
CFLAGS_ISA :=-Wa,-mimplicit-it=always $(AFLAGS_NOWARN)
AFLAGS_ISA :=$(CFLAGS_ISA) -Wa$(comma)-mthumb

View File

@ -0,0 +1,15 @@
/* SPDX-License-Identifier: GPL-2.0-only */
/*
* Copyright (C) 2023 SiFive
*/
#ifndef __ASM_FPU_H
#define __ASM_FPU_H
#include <asm/neon.h>
#define kernel_fpu_available() cpu_has_neon()
#define kernel_fpu_begin() kernel_neon_begin()
#define kernel_fpu_end() kernel_neon_end()
#endif /* ! __ASM_FPU_H */

View File

@ -40,8 +40,7 @@ $(obj)/csumpartialcopy.o: $(obj)/csumpartialcopygeneric.S
$(obj)/csumpartialcopyuser.o: $(obj)/csumpartialcopygeneric.S
ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
NEON_FLAGS := -march=armv7-a -mfloat-abi=softfp -mfpu=neon
CFLAGS_xor-neon.o += $(NEON_FLAGS)
CFLAGS_xor-neon.o += $(CC_FLAGS_FPU)
obj-$(CONFIG_XOR_BLOCKS) += xor-neon.o
endif

View File

@ -30,6 +30,7 @@ config ARM64
select ARCH_HAS_GCOV_PROFILE_ALL
select ARCH_HAS_GIGANTIC_PAGE
select ARCH_HAS_KCOV
select ARCH_HAS_KERNEL_FPU_SUPPORT if KERNEL_MODE_NEON
select ARCH_HAS_KEEPINITRD
select ARCH_HAS_MEMBARRIER_SYNC_CORE
select ARCH_HAS_NMI_SAFE_THIS_CPU_OPS

View File

@ -36,7 +36,14 @@ ifeq ($(CONFIG_BROKEN_GAS_INST),y)
$(warning Detected assembler with broken .inst; disassembly will be unreliable)
endif
KBUILD_CFLAGS += -mgeneral-regs-only \
# The GCC option -ffreestanding is required in order to compile code containing
# ARM/NEON intrinsics in a non C99-compliant environment (such as the kernel)
CC_FLAGS_FPU := -ffreestanding
# Enable <arm_neon.h>
CC_FLAGS_FPU += -isystem $(shell $(CC) -print-file-name=include)
CC_FLAGS_NO_FPU := -mgeneral-regs-only
KBUILD_CFLAGS += $(CC_FLAGS_NO_FPU) \
$(compat_vdso) $(cc_has_k_constraint)
KBUILD_CFLAGS += $(call cc-disable-warning, psabi)
KBUILD_AFLAGS += $(compat_vdso)

View File

@ -0,0 +1,15 @@
/* SPDX-License-Identifier: GPL-2.0-only */
/*
* Copyright (C) 2023 SiFive
*/
#ifndef __ASM_FPU_H
#define __ASM_FPU_H
#include <asm/neon.h>
#define kernel_fpu_available() cpu_has_neon()
#define kernel_fpu_begin() kernel_neon_begin()
#define kernel_fpu_end() kernel_neon_end()
#endif /* ! __ASM_FPU_H */

View File

@ -7,10 +7,8 @@ lib-y := clear_user.o delay.o copy_from_user.o \
ifeq ($(CONFIG_KERNEL_MODE_NEON), y)
obj-$(CONFIG_XOR_BLOCKS) += xor-neon.o
CFLAGS_REMOVE_xor-neon.o += -mgeneral-regs-only
CFLAGS_xor-neon.o += -ffreestanding
# Enable <arm_neon.h>
CFLAGS_xor-neon.o += -isystem $(shell $(CC) -print-file-name=include)
CFLAGS_xor-neon.o += $(CC_FLAGS_FPU)
CFLAGS_REMOVE_xor-neon.o += $(CC_FLAGS_NO_FPU)
endif
lib-$(CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE) += uaccess_flushcache.o

View File

@ -19,6 +19,7 @@ config LOONGARCH
select ARCH_HAS_FAST_MULTIPLIER
select ARCH_HAS_FORTIFY_SOURCE
select ARCH_HAS_KCOV
select ARCH_HAS_KERNEL_FPU_SUPPORT if CPU_HAS_FPU
select ARCH_HAS_NMI_SAFE_THIS_CPU_OPS
select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
select ARCH_HAS_PTE_SPECIAL

View File

@ -26,6 +26,9 @@ endif
32bit-emul = elf32loongarch
64bit-emul = elf64loongarch
CC_FLAGS_FPU := -mfpu=64
CC_FLAGS_NO_FPU := -msoft-float
ifdef CONFIG_UNWINDER_ORC
orc_hash_h := arch/$(SRCARCH)/include/generated/asm/orc_hash.h
orc_hash_sh := $(srctree)/scripts/orc_hash.sh
@ -59,7 +62,7 @@ ld-emul = $(64bit-emul)
cflags-y += -mabi=lp64s
endif
cflags-y += -pipe -msoft-float
cflags-y += -pipe $(CC_FLAGS_NO_FPU)
LDFLAGS_vmlinux += -static -n -nostdlib
# When the assembler supports explicit relocation hint, we must use it.

View File

@ -21,6 +21,7 @@
struct sigcontext;
#define kernel_fpu_available() cpu_has_fpu
extern void kernel_fpu_begin(void);
extern void kernel_fpu_end(void);

View File

@ -137,6 +137,7 @@ config PPC
select ARCH_HAS_GCOV_PROFILE_ALL
select ARCH_HAS_HUGEPD if HUGETLB_PAGE
select ARCH_HAS_KCOV
select ARCH_HAS_KERNEL_FPU_SUPPORT if PPC_FPU
select ARCH_HAS_MEMBARRIER_CALLBACKS
select ARCH_HAS_MEMBARRIER_SYNC_CORE
select ARCH_HAS_MEMREMAP_COMPAT_ALIGN if PPC_64S_HASH_MMU

View File

@ -149,6 +149,9 @@ CFLAGS-$(CONFIG_PPC32) += $(call cc-option, $(MULTIPLEWORD))
CFLAGS-$(CONFIG_PPC32) += $(call cc-option,-mno-readonly-in-sdata)
CC_FLAGS_FPU := $(call cc-option,-mhard-float)
CC_FLAGS_NO_FPU := $(call cc-option,-msoft-float)
ifdef CONFIG_FUNCTION_TRACER
ifdef CONFIG_ARCH_USING_PATCHABLE_FUNCTION_ENTRY
KBUILD_CPPFLAGS += -DCC_USING_PATCHABLE_FUNCTION_ENTRY
@ -170,7 +173,7 @@ asinstr := $(call as-instr,lis 9$(comma)foo@high,-DHAVE_AS_ATHIGH=1)
KBUILD_CPPFLAGS += -I $(srctree)/arch/powerpc $(asinstr)
KBUILD_AFLAGS += $(AFLAGS-y)
KBUILD_CFLAGS += $(call cc-option,-msoft-float)
KBUILD_CFLAGS += $(CC_FLAGS_NO_FPU)
KBUILD_CFLAGS += $(CFLAGS-y)
CPP = $(CC) -E $(KBUILD_CFLAGS)

View File

@ -0,0 +1,28 @@
/* SPDX-License-Identifier: GPL-2.0-only */
/*
* Copyright (C) 2023 SiFive
*/
#ifndef _ASM_POWERPC_FPU_H
#define _ASM_POWERPC_FPU_H
#include <linux/preempt.h>
#include <asm/cpu_has_feature.h>
#include <asm/switch_to.h>
#define kernel_fpu_available() (!cpu_has_feature(CPU_FTR_FPU_UNAVAILABLE))
static inline void kernel_fpu_begin(void)
{
preempt_disable();
enable_kernel_fp();
}
static inline void kernel_fpu_end(void)
{
disable_kernel_fp();
preempt_enable();
}
#endif /* ! _ASM_POWERPC_FPU_H */

View File

@ -28,6 +28,7 @@ config RISCV
select ARCH_HAS_GCOV_PROFILE_ALL
select ARCH_HAS_GIGANTIC_PAGE
select ARCH_HAS_KCOV
select ARCH_HAS_KERNEL_FPU_SUPPORT if 64BIT && FPU
select ARCH_HAS_MEMBARRIER_CALLBACKS
select ARCH_HAS_MEMBARRIER_SYNC_CORE
select ARCH_HAS_MMIOWB

View File

@ -91,6 +91,9 @@ KBUILD_CFLAGS += -march=$(shell echo $(riscv-march-y) | sed -E 's/(rv32ima|rv64i
KBUILD_AFLAGS += -march=$(riscv-march-y)
# For C code built with floating-point support, exclude V but keep F and D.
CC_FLAGS_FPU := -march=$(shell echo $(riscv-march-y) | sed -E 's/(rv32ima|rv64ima)([^v_]*)v?/\1\2/')
KBUILD_CFLAGS += -mno-save-restore
KBUILD_CFLAGS += -DCONFIG_PAGE_OFFSET=$(CONFIG_PAGE_OFFSET)

View File

@ -0,0 +1,16 @@
/* SPDX-License-Identifier: GPL-2.0-only */
/*
* Copyright (C) 2023 SiFive
*/
#ifndef _ASM_RISCV_FPU_H
#define _ASM_RISCV_FPU_H
#include <asm/switch_to.h>
#define kernel_fpu_available() has_fpu()
void kernel_fpu_begin(void);
void kernel_fpu_end(void);
#endif /* ! _ASM_RISCV_FPU_H */

View File

@ -67,6 +67,7 @@ obj-$(CONFIG_RISCV_MISALIGNED) += unaligned_access_speed.o
obj-$(CONFIG_RISCV_PROBE_UNALIGNED_ACCESS) += copy-unaligned.o
obj-$(CONFIG_FPU) += fpu.o
obj-$(CONFIG_FPU) += kernel_mode_fpu.o
obj-$(CONFIG_RISCV_ISA_V) += vector.o
obj-$(CONFIG_RISCV_ISA_V) += kernel_mode_vector.o
obj-$(CONFIG_SMP) += smpboot.o

View File

@ -0,0 +1,28 @@
// SPDX-License-Identifier: GPL-2.0-only
/*
* Copyright (C) 2023 SiFive
*/
#include <linux/export.h>
#include <linux/preempt.h>
#include <asm/csr.h>
#include <asm/fpu.h>
#include <asm/processor.h>
#include <asm/switch_to.h>
void kernel_fpu_begin(void)
{
preempt_disable();
fstate_save(current, task_pt_regs(current));
csr_set(CSR_SSTATUS, SR_FS);
}
EXPORT_SYMBOL_GPL(kernel_fpu_begin);
void kernel_fpu_end(void)
{
csr_clear(CSR_SSTATUS, SR_FS);
fstate_restore(current, task_pt_regs(current));
preempt_enable();
}
EXPORT_SYMBOL_GPL(kernel_fpu_end);

View File

@ -85,6 +85,7 @@ config X86
select ARCH_HAS_FORTIFY_SOURCE
select ARCH_HAS_GCOV_PROFILE_ALL
select ARCH_HAS_KCOV if X86_64
select ARCH_HAS_KERNEL_FPU_SUPPORT
select ARCH_HAS_MEM_ENCRYPT
select ARCH_HAS_MEMBARRIER_SYNC_CORE
select ARCH_HAS_NMI_SAFE_THIS_CPU_OPS

View File

@ -74,6 +74,26 @@ KBUILD_CFLAGS += -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -mno-avx
KBUILD_RUSTFLAGS += --target=$(objtree)/scripts/target.json
KBUILD_RUSTFLAGS += -Ctarget-feature=-sse,-sse2,-sse3,-ssse3,-sse4.1,-sse4.2,-avx,-avx2
#
# CFLAGS for compiling floating point code inside the kernel.
#
CC_FLAGS_FPU := -msse -msse2
ifdef CONFIG_CC_IS_GCC
# Stack alignment mismatch, proceed with caution.
# GCC < 7.1 cannot compile code using `double` and -mpreferred-stack-boundary=3
# (8B stack alignment).
# See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53383
#
# The "-msse" in the first argument is there so that the
# -mpreferred-stack-boundary=3 build error:
#
# -mpreferred-stack-boundary=3 is not between 4 and 12
#
# can be triggered. Otherwise gcc doesn't complain.
CC_FLAGS_FPU += -mhard-float
CC_FLAGS_FPU += $(call cc-option,-msse -mpreferred-stack-boundary=3,-mpreferred-stack-boundary=4)
endif
ifeq ($(CONFIG_X86_KERNEL_IBT),y)
#
# Kernel IBT has S_CET.NOTRACK_EN=0, as such the compilers must not generate

View File

@ -0,0 +1,13 @@
/* SPDX-License-Identifier: GPL-2.0-only */
/*
* Copyright (C) 2023 SiFive
*/
#ifndef _ASM_X86_FPU_H
#define _ASM_X86_FPU_H
#include <asm/fpu/api.h>
#define kernel_fpu_available() true
#endif /* ! _ASM_X86_FPU_H */

View File

@ -2,8 +2,8 @@
/*
* FPU data structures:
*/
#ifndef _ASM_X86_FPU_H
#define _ASM_X86_FPU_H
#ifndef _ASM_X86_FPU_TYPES_H
#define _ASM_X86_FPU_TYPES_H
#include <asm/page_types.h>
@ -596,4 +596,4 @@ struct fpu_state_config {
/* FPU state configuration information */
extern struct fpu_state_config fpu_kernel_cfg, fpu_user_cfg;
#endif /* _ASM_X86_FPU_H */
#endif /* _ASM_X86_FPU_TYPES_H */

View File

@ -8,7 +8,7 @@ config DRM_AMD_DC
depends on BROKEN || !CC_IS_CLANG || ARM64 || RISCV || SPARC64 || X86_64
select SND_HDA_COMPONENT if SND_HDA_CORE
# !CC_IS_CLANG: https://github.com/ClangBuiltLinux/linux/issues/1752
select DRM_AMD_DC_FP if (X86 || LOONGARCH || (PPC64 && ALTIVEC) || (ARM64 && KERNEL_MODE_NEON && !CC_IS_CLANG))
select DRM_AMD_DC_FP if ARCH_HAS_KERNEL_FPU_SUPPORT && (!ARM64 || !CC_IS_CLANG)
help
Choose this option if you want to use the new display engine
support for AMDGPU. This adds required support for Vega and

View File

@ -26,16 +26,7 @@
#include "dc_trace.h"
#if defined(CONFIG_X86)
#include <asm/fpu/api.h>
#elif defined(CONFIG_PPC64)
#include <asm/switch_to.h>
#include <asm/cputable.h>
#elif defined(CONFIG_ARM64)
#include <asm/neon.h>
#elif defined(CONFIG_LOONGARCH)
#include <asm/fpu.h>
#endif
#include <linux/fpu.h>
/**
* DOC: DC FPU manipulation overview
@ -87,20 +78,9 @@ void dc_fpu_begin(const char *function_name, const int line)
WARN_ON_ONCE(!in_task());
preempt_disable();
depth = __this_cpu_inc_return(fpu_recursion_depth);
if (depth == 1) {
#if defined(CONFIG_X86) || defined(CONFIG_LOONGARCH)
BUG_ON(!kernel_fpu_available());
kernel_fpu_begin();
#elif defined(CONFIG_PPC64)
if (cpu_has_feature(CPU_FTR_VSX_COMP))
enable_kernel_vsx();
else if (cpu_has_feature(CPU_FTR_ALTIVEC_COMP))
enable_kernel_altivec();
else if (!cpu_has_feature(CPU_FTR_FPU_UNAVAILABLE))
enable_kernel_fp();
#elif defined(CONFIG_ARM64)
kernel_neon_begin();
#endif
}
TRACE_DCN_FPU(true, function_name, line, depth);
@ -122,18 +102,7 @@ void dc_fpu_end(const char *function_name, const int line)
depth = __this_cpu_dec_return(fpu_recursion_depth);
if (depth == 0) {
#if defined(CONFIG_X86) || defined(CONFIG_LOONGARCH)
kernel_fpu_end();
#elif defined(CONFIG_PPC64)
if (cpu_has_feature(CPU_FTR_VSX_COMP))
disable_kernel_vsx();
else if (cpu_has_feature(CPU_FTR_ALTIVEC_COMP))
disable_kernel_altivec();
else if (!cpu_has_feature(CPU_FTR_FPU_UNAVAILABLE))
disable_kernel_fp();
#elif defined(CONFIG_ARM64)
kernel_neon_end();
#endif
} else {
WARN_ON_ONCE(depth < 0);
}

View File

@ -25,40 +25,8 @@
# It provides the general basic services required by other DAL
# subcomponents.
ifdef CONFIG_X86
dml_ccflags-$(CONFIG_CC_IS_GCC) := -mhard-float
dml_ccflags := $(dml_ccflags-y) -msse
endif
ifdef CONFIG_PPC64
dml_ccflags := -mhard-float -maltivec
endif
ifdef CONFIG_ARM64
dml_rcflags := -mgeneral-regs-only
endif
ifdef CONFIG_LOONGARCH
dml_ccflags := -mfpu=64
dml_rcflags := -msoft-float
endif
ifdef CONFIG_CC_IS_GCC
ifneq ($(call gcc-min-version, 70100),y)
IS_OLD_GCC = 1
endif
endif
ifdef CONFIG_X86
ifdef IS_OLD_GCC
# Stack alignment mismatch, proceed with caution.
# GCC < 7.1 cannot compile code using `double` and -mpreferred-stack-boundary=3
# (8B stack alignment).
dml_ccflags += -mpreferred-stack-boundary=4
else
dml_ccflags += -msse2
endif
endif
dml_ccflags := $(CC_FLAGS_FPU)
dml_rcflags := $(CC_FLAGS_NO_FPU)
ifneq ($(CONFIG_FRAME_WARN),0)
ifeq ($(filter y,$(CONFIG_KASAN)$(CONFIG_KCSAN)),y)

View File

@ -24,40 +24,8 @@
#
# Makefile for dml2.
ifdef CONFIG_X86
dml2_ccflags-$(CONFIG_CC_IS_GCC) := -mhard-float
dml2_ccflags := $(dml2_ccflags-y) -msse
endif
ifdef CONFIG_PPC64
dml2_ccflags := -mhard-float -maltivec
endif
ifdef CONFIG_ARM64
dml2_rcflags := -mgeneral-regs-only
endif
ifdef CONFIG_LOONGARCH
dml2_ccflags := -mfpu=64
dml2_rcflags := -msoft-float
endif
ifdef CONFIG_CC_IS_GCC
ifeq ($(call cc-ifversion, -lt, 0701, y), y)
IS_OLD_GCC = 1
endif
endif
ifdef CONFIG_X86
ifdef IS_OLD_GCC
# Stack alignment mismatch, proceed with caution.
# GCC < 7.1 cannot compile code using `double` and -mpreferred-stack-boundary=3
# (8B stack alignment).
dml2_ccflags += -mpreferred-stack-boundary=4
else
dml2_ccflags += -msse2
endif
endif
dml2_ccflags := $(CC_FLAGS_FPU)
dml2_rcflags := $(CC_FLAGS_NO_FPU)
ifneq ($(CONFIG_FRAME_WARN),0)
ifeq ($(filter y,$(CONFIG_KASAN)$(CONFIG_KCSAN)),y)

View File

@ -702,8 +702,12 @@ static void nilfs_finish_roll_forward(struct the_nilfs *nilfs,
if (WARN_ON(!bh))
return; /* should never happen */
lock_buffer(bh);
memset(bh->b_data, 0, bh->b_size);
set_buffer_uptodate(bh);
set_buffer_dirty(bh);
unlock_buffer(bh);
err = sync_dirty_buffer(bh);
if (unlikely(err))
nilfs_warn(nilfs->ns_sb,

12
include/linux/fpu.h Normal file
View File

@ -0,0 +1,12 @@
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef _LINUX_FPU_H
#define _LINUX_FPU_H
#ifdef _LINUX_FPU_COMPILATION_UNIT
#error FP code must be compiled separately. See Documentation/core-api/floating-point.rst.
#endif
#include <asm/fpu.h>
#endif

View File

@ -2925,7 +2925,7 @@ config TEST_FREE_PAGES
config TEST_FPU
tristate "Test floating point operations in kernel space"
depends on X86 && !KCOV_INSTRUMENT_ALL
depends on ARCH_HAS_KERNEL_FPU_SUPPORT && !KCOV_INSTRUMENT_ALL
help
Enable this option to add /sys/kernel/debug/selftest_helpers/test_fpu
which will trigger a sequence of floating point operations. This is used

View File

@ -110,30 +110,10 @@ CFLAGS_test_fprobe.o += $(CC_FLAGS_FTRACE)
obj-$(CONFIG_FPROBE_SANITY_TEST) += test_fprobe.o
obj-$(CONFIG_TEST_OBJPOOL) += test_objpool.o
#
# CFLAGS for compiling floating point code inside the kernel. x86/Makefile turns
# off the generation of FPU/SSE* instructions for kernel proper but FPU_FLAGS
# get appended last to CFLAGS and thus override those previous compiler options.
#
FPU_CFLAGS := -msse -msse2
ifdef CONFIG_CC_IS_GCC
# Stack alignment mismatch, proceed with caution.
# GCC < 7.1 cannot compile code using `double` and -mpreferred-stack-boundary=3
# (8B stack alignment).
# See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53383
#
# The "-msse" in the first argument is there so that the
# -mpreferred-stack-boundary=3 build error:
#
# -mpreferred-stack-boundary=3 is not between 4 and 12
#
# can be triggered. Otherwise gcc doesn't complain.
FPU_CFLAGS += -mhard-float
FPU_CFLAGS += $(call cc-option,-msse -mpreferred-stack-boundary=3,-mpreferred-stack-boundary=4)
endif
obj-$(CONFIG_TEST_FPU) += test_fpu.o
CFLAGS_test_fpu.o += $(FPU_CFLAGS)
test_fpu-y := test_fpu_glue.o test_fpu_impl.o
CFLAGS_test_fpu_impl.o += $(CC_FLAGS_FPU)
CFLAGS_REMOVE_test_fpu_impl.o += $(CC_FLAGS_NO_FPU)
# Some KUnit files (hooks.o) need to be built-in even when KUnit is a module,
# so we can't just use obj-$(CONFIG_KUNIT).

View File

@ -33,25 +33,6 @@ CFLAGS_REMOVE_vpermxor8.o += -msoft-float
endif
endif
# The GCC option -ffreestanding is required in order to compile code containing
# ARM/NEON intrinsics in a non C99-compliant environment (such as the kernel)
ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
NEON_FLAGS := -ffreestanding
# Enable <arm_neon.h>
NEON_FLAGS += -isystem $(shell $(CC) -print-file-name=include)
ifeq ($(ARCH),arm)
NEON_FLAGS += -march=armv7-a -mfloat-abi=softfp -mfpu=neon
endif
CFLAGS_recov_neon_inner.o += $(NEON_FLAGS)
ifeq ($(ARCH),arm64)
CFLAGS_REMOVE_recov_neon_inner.o += -mgeneral-regs-only
CFLAGS_REMOVE_neon1.o += -mgeneral-regs-only
CFLAGS_REMOVE_neon2.o += -mgeneral-regs-only
CFLAGS_REMOVE_neon4.o += -mgeneral-regs-only
CFLAGS_REMOVE_neon8.o += -mgeneral-regs-only
endif
endif
quiet_cmd_unroll = UNROLL $@
cmd_unroll = $(AWK) -v N=$* -f $(src)/unroll.awk < $< > $@
@ -75,10 +56,16 @@ targets += vpermxor1.c vpermxor2.c vpermxor4.c vpermxor8.c
$(obj)/vpermxor%.c: $(src)/vpermxor.uc $(src)/unroll.awk FORCE
$(call if_changed,unroll)
CFLAGS_neon1.o += $(NEON_FLAGS)
CFLAGS_neon2.o += $(NEON_FLAGS)
CFLAGS_neon4.o += $(NEON_FLAGS)
CFLAGS_neon8.o += $(NEON_FLAGS)
CFLAGS_neon1.o += $(CC_FLAGS_FPU)
CFLAGS_neon2.o += $(CC_FLAGS_FPU)
CFLAGS_neon4.o += $(CC_FLAGS_FPU)
CFLAGS_neon8.o += $(CC_FLAGS_FPU)
CFLAGS_recov_neon_inner.o += $(CC_FLAGS_FPU)
CFLAGS_REMOVE_neon1.o += $(CC_FLAGS_NO_FPU)
CFLAGS_REMOVE_neon2.o += $(CC_FLAGS_NO_FPU)
CFLAGS_REMOVE_neon4.o += $(CC_FLAGS_NO_FPU)
CFLAGS_REMOVE_neon8.o += $(CC_FLAGS_NO_FPU)
CFLAGS_REMOVE_recov_neon_inner.o += $(CC_FLAGS_NO_FPU)
targets += neon1.c neon2.c neon4.c neon8.c
$(obj)/neon%.c: $(src)/neon.uc $(src)/unroll.awk FORCE
$(call if_changed,unroll)

8
lib/test_fpu.h Normal file
View File

@ -0,0 +1,8 @@
/* SPDX-License-Identifier: GPL-2.0+ */
#ifndef _LIB_TEST_FPU_H
#define _LIB_TEST_FPU_H
int test_fpu(void);
#endif

View File

@ -17,39 +17,9 @@
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/debugfs.h>
#include <asm/fpu/api.h>
#include <linux/fpu.h>
static int test_fpu(void)
{
/*
* This sequence of operations tests that rounding mode is
* to nearest and that denormal numbers are supported.
* Volatile variables are used to avoid compiler optimizing
* the calculations away.
*/
volatile double a, b, c, d, e, f, g;
a = 4.0;
b = 1e-15;
c = 1e-310;
/* Sets precision flag */
d = a + b;
/* Result depends on rounding mode */
e = a + b / 2;
/* Denormal and very large values */
f = b / c;
/* Depends on denormal support */
g = a + c * f;
if (d > a && e > a && g > a)
return 0;
else
return -EINVAL;
}
#include "test_fpu.h"
static int test_fpu_get(void *data, u64 *val)
{
@ -68,6 +38,9 @@ static struct dentry *selftest_dir;
static int __init test_fpu_init(void)
{
if (!kernel_fpu_available())
return -EINVAL;
selftest_dir = debugfs_create_dir("selftest_helpers", NULL);
if (!selftest_dir)
return -ENOMEM;

37
lib/test_fpu_impl.c Normal file
View File

@ -0,0 +1,37 @@
// SPDX-License-Identifier: GPL-2.0+
#include <linux/errno.h>
#include "test_fpu.h"
int test_fpu(void)
{
/*
* This sequence of operations tests that rounding mode is
* to nearest and that denormal numbers are supported.
* Volatile variables are used to avoid compiler optimizing
* the calculations away.
*/
volatile double a, b, c, d, e, f, g;
a = 4.0;
b = 1e-15;
c = 1e-310;
/* Sets precision flag */
d = a + b;
/* Result depends on rounding mode */
e = a + b / 2;
/* Denormal and very large values */
f = b / c;
/* Depends on denormal support */
g = a + c * f;
if (d > a && e > a && g > a)
return 0;
else
return -EINVAL;
}

View File

@ -37,11 +37,6 @@ else
KBUILD_CFLAGS += -Wno-main
endif
# These warnings generated too much noise in a regular build.
# Use make W=1 to enable them (see scripts/Makefile.extrawarn)
KBUILD_CFLAGS += $(call cc-disable-warning, unused-but-set-variable)
KBUILD_CFLAGS += $(call cc-disable-warning, unused-const-variable)
# These result in bogus false positives
KBUILD_CFLAGS += $(call cc-disable-warning, dangling-pointer)
@ -82,22 +77,17 @@ KBUILD_CFLAGS += $(call cc-option,-Werror=designated-init)
# Warn if there is an enum types mismatch
KBUILD_CFLAGS += $(call cc-option,-Wenum-conversion)
KBUILD_CFLAGS += -Wextra
KBUILD_CFLAGS += -Wunused
#
# W=1 - warnings which may be relevant and do not occur too often
#
ifneq ($(findstring 1, $(KBUILD_EXTRA_WARN)),)
KBUILD_CFLAGS += -Wextra -Wunused -Wno-unused-parameter
KBUILD_CFLAGS += $(call cc-option, -Wrestrict)
KBUILD_CFLAGS += -Wmissing-format-attribute
KBUILD_CFLAGS += -Wold-style-definition
KBUILD_CFLAGS += -Wmissing-include-dirs
KBUILD_CFLAGS += $(call cc-option, -Wunused-but-set-variable)
KBUILD_CFLAGS += $(call cc-option, -Wunused-const-variable)
KBUILD_CFLAGS += $(call cc-option, -Wpacked-not-aligned)
KBUILD_CFLAGS += $(call cc-option, -Wformat-overflow)
KBUILD_CFLAGS += $(call cc-option, -Wformat-truncation)
KBUILD_CFLAGS += $(call cc-option, -Wstringop-truncation)
KBUILD_CPPFLAGS += -Wundef
KBUILD_CPPFLAGS += -DKBUILD_EXTRA_WARN1
@ -108,10 +98,16 @@ else
# Suppress them by using -Wno... except for W=1.
KBUILD_CFLAGS += $(call cc-disable-warning, unused-but-set-variable)
KBUILD_CFLAGS += $(call cc-disable-warning, unused-const-variable)
KBUILD_CFLAGS += $(call cc-disable-warning, restrict)
KBUILD_CFLAGS += $(call cc-disable-warning, packed-not-aligned)
KBUILD_CFLAGS += $(call cc-disable-warning, format-overflow)
ifdef CONFIG_CC_IS_GCC
KBUILD_CFLAGS += $(call cc-disable-warning, format-truncation)
else
# Clang checks for overflow/truncation with '%p', while GCC does not:
# https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111219
KBUILD_CFLAGS += $(call cc-disable-warning, format-overflow-non-kprintf)
KBUILD_CFLAGS += $(call cc-disable-warning, format-truncation-non-kprintf)
endif
KBUILD_CFLAGS += $(call cc-disable-warning, stringop-truncation)
KBUILD_CFLAGS += -Wno-override-init # alias for -Wno-initializer-overrides in clang
@ -133,7 +129,6 @@ endif
KBUILD_CFLAGS += $(call cc-disable-warning, pointer-to-enum-cast)
KBUILD_CFLAGS += -Wno-tautological-constant-out-of-range-compare
KBUILD_CFLAGS += $(call cc-disable-warning, unaligned-access)
KBUILD_CFLAGS += $(call cc-disable-warning, cast-function-type-strict)
KBUILD_CFLAGS += -Wno-enum-compare-conditional
KBUILD_CFLAGS += -Wno-enum-enum-conversion
endif
@ -148,9 +143,6 @@ ifneq ($(findstring 2, $(KBUILD_EXTRA_WARN)),)
KBUILD_CFLAGS += -Wdisabled-optimization
KBUILD_CFLAGS += -Wshadow
KBUILD_CFLAGS += $(call cc-option, -Wlogical-op)
KBUILD_CFLAGS += -Wmissing-field-initializers
KBUILD_CFLAGS += -Wtype-limits
KBUILD_CFLAGS += $(call cc-option, -Wmaybe-uninitialized)
KBUILD_CFLAGS += $(call cc-option, -Wunused-macros)
KBUILD_CPPFLAGS += -DKBUILD_EXTRA_WARN2
@ -190,6 +182,7 @@ else
# The following turn off the warnings enabled by -Wextra
KBUILD_CFLAGS += -Wno-sign-compare
KBUILD_CFLAGS += -Wno-unused-parameter
endif

View File

@ -1216,7 +1216,7 @@ void __run_test(struct __fixture_metadata *f,
struct __test_metadata *t)
{
struct __test_xfail *xfail;
char *test_name;
char test_name[1024];
const char *diagnostic;
/* reset test struct */
@ -1227,12 +1227,8 @@ void __run_test(struct __fixture_metadata *f,
memset(t->env, 0, sizeof(t->env));
memset(t->results->reason, 0, sizeof(t->results->reason));
if (asprintf(&test_name, "%s%s%s.%s", f->name,
variant->name[0] ? "." : "", variant->name, t->name) == -1) {
ksft_print_msg("ERROR ALLOCATING MEMORY\n");
t->exit_code = KSFT_FAIL;
_exit(t->exit_code);
}
snprintf(test_name, sizeof(test_name), "%s%s%s.%s",
f->name, variant->name[0] ? "." : "", variant->name, t->name);
ksft_print_msg(" RUN %s ...\n", test_name);
@ -1270,7 +1266,6 @@ void __run_test(struct __fixture_metadata *f,
ksft_test_result_code(t->exit_code, test_name,
diagnostic ? "%s" : NULL, diagnostic);
free(test_name);
}
static int test_harness_run(int argc, char **argv)

View File

@ -7,7 +7,6 @@
#include <linux/mman.h>
#include <linux/prctl.h>
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <sys/auxv.h>