kmemcheck: rip it out
Fix up makefiles, remove references, and git rm kmemcheck.

Link: http://lkml.kernel.org/r/20171007030159.22241-4-alexander.levin@verizon.com
Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Vegard Nossum <vegardno@ifi.uio.no>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Tim Hansen <devtimhansen@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
parent d8be75663c
commit 4675ff05de
@@ -1864,13 +1864,6 @@
|
||||
Built with CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF=y,
|
||||
the default is off.
|
||||
|
||||
kmemcheck= [X86] Boot-time kmemcheck enable/disable/one-shot mode
|
||||
Valid arguments: 0, 1, 2
|
||||
kmemcheck=0 (disabled)
|
||||
kmemcheck=1 (enabled)
|
||||
kmemcheck=2 (one-shot mode)
|
||||
Default: 2 (one-shot mode)
|
||||
|
||||
kvm.ignore_msrs=[KVM] Ignore guest accesses to unhandled MSRs.
|
||||
Default is 0 (don't ignore, but inject #GP)
|
||||
|
||||
|
@@ -21,7 +21,6 @@ whole; patches welcome!
|
||||
kasan
|
||||
ubsan
|
||||
kmemleak
|
||||
kmemcheck
|
||||
gdb-kernel-debugging
|
||||
kgdb
|
||||
kselftest
|
||||
|
@@ -1,733 +0,0 @@
|
||||
Getting started with kmemcheck
|
||||
==============================
|
||||
|
||||
Vegard Nossum <vegardno@ifi.uio.no>
|
||||
|
||||
|
||||
Introduction
|
||||
------------
|
||||
|
||||
kmemcheck is a debugging feature for the Linux kernel. More specifically, it
|
||||
is a dynamic checker that detects and warns about some uses of uninitialized
|
||||
memory.
|
||||
|
||||
Userspace programmers might be familiar with Valgrind's memcheck. The main
|
||||
difference between memcheck and kmemcheck is that memcheck works for userspace
|
||||
programs only, and kmemcheck works for the kernel only. The implementations
|
||||
are of course vastly different. Because of this, kmemcheck is not as accurate
|
||||
as memcheck, but it turns out to be good enough in practice to discover real
|
||||
programmer errors that the compiler is not able to find through static
|
||||
analysis.
|
||||
|
||||
Enabling kmemcheck on a kernel will probably slow it down to the extent that
|
||||
the machine will not be usable for normal workloads such as an
|
||||
interactive desktop. kmemcheck will also cause the kernel to use about twice
|
||||
as much memory as normal. For this reason, kmemcheck is strictly a debugging
|
||||
feature.
|
||||
|
||||
|
||||
Downloading
|
||||
-----------
|
||||
|
||||
As of version 2.6.31-rc1, kmemcheck is included in the mainline kernel.
|
||||
|
||||
|
||||
Configuring and compiling
|
||||
-------------------------
|
||||
|
||||
kmemcheck only works for the x86 (both 32- and 64-bit) platform. A number of
|
||||
configuration variables must have specific settings in order for the kmemcheck
|
||||
menu to even appear in "menuconfig". These are:
|
||||
|
||||
- ``CONFIG_CC_OPTIMIZE_FOR_SIZE=n``
|
||||
This option is located under "General setup" / "Optimize for size".
|
||||
|
||||
If size optimization is enabled, gcc will use certain optimizations that usually lead to
|
||||
false positive warnings from kmemcheck. An example of this is a 16-bit
|
||||
field in a struct, where gcc may load 32 bits, then discard the upper
|
||||
16 bits. kmemcheck sees only the 32-bit load, and may trigger a
|
||||
warning for the upper 16 bits (if they're uninitialized).
|
||||
|
||||
- ``CONFIG_SLAB=y`` or ``CONFIG_SLUB=y``
|
||||
This option is located under "General setup" / "Choose SLAB
|
||||
allocator".
|
||||
|
||||
- ``CONFIG_FUNCTION_TRACER=n``
|
||||
This option is located under "Kernel hacking" / "Tracers" / "Kernel
|
||||
Function Tracer".
|
||||
|
||||
When function tracing is compiled in, gcc emits a call to another
|
||||
function at the beginning of every function. This means that when the
|
||||
page fault handler is called, the ftrace framework will be called
|
||||
before kmemcheck has had a chance to handle the fault. If ftrace then
|
||||
modifies memory that was tracked by kmemcheck, the result is an
|
||||
endless recursive page fault.
|
||||
|
||||
- ``CONFIG_DEBUG_PAGEALLOC=n``
|
||||
This option is located under "Kernel hacking" / "Memory Debugging"
|
||||
/ "Debug page memory allocations".
|
||||
|
||||
In addition, I highly recommend turning on ``CONFIG_DEBUG_INFO=y``. This is also
|
||||
located under "Kernel hacking". With this, you will be able to get line number
|
||||
information from the kmemcheck warnings, which is extremely valuable in
|
||||
debugging a problem. This option is not mandatory, however, because it slows
|
||||
down the compilation process and produces a much bigger kernel image.
|
||||
|
||||
Now the kmemcheck menu should be visible (under "Kernel hacking" / "Memory
|
||||
Debugging" / "kmemcheck: trap use of uninitialized memory"). Here follows
|
||||
a description of the kmemcheck configuration variables:
|
||||
|
||||
- ``CONFIG_KMEMCHECK``
|
||||
This must be enabled in order to use kmemcheck at all...
|
||||
|
||||
- ``CONFIG_KMEMCHECK_``[``DISABLED`` | ``ENABLED`` | ``ONESHOT``]``_BY_DEFAULT``
|
||||
This option controls the status of kmemcheck at boot-time. "Enabled"
|
||||
will enable kmemcheck right from the start, "disabled" will boot the
|
||||
kernel as normal (but with the kmemcheck code compiled in, so it can
|
||||
be enabled at run-time after the kernel has booted), and "one-shot" is
|
||||
a special mode which will turn kmemcheck off automatically after
|
||||
detecting the first use of uninitialized memory.
|
||||
|
||||
If you are using kmemcheck to actively debug a problem, then you
|
||||
probably want to choose "enabled" here.
|
||||
|
||||
The one-shot mode is mostly useful in automated test setups because it
|
||||
can prevent floods of warnings and increase the chances of the machine
|
||||
surviving in case something is really wrong. In other cases, the one-shot
mode could actually be counter-productive because it would turn itself off
at the very first error -- in the case of a false positive too -- and this
would get in the way of debugging the specific problem you were
interested in.
|
||||
|
||||
If you would like to use your kernel as normal, but with a chance to
|
||||
enable kmemcheck in case of some problem, it might be a good idea to
|
||||
choose "disabled" here. When kmemcheck is disabled, most of the run-time
overhead is not incurred, and the kernel will be almost as fast as normal.
|
||||
|
||||
- ``CONFIG_KMEMCHECK_QUEUE_SIZE``
|
||||
Select the maximum number of error reports to store in an internal
|
||||
(fixed-size) buffer. Since errors can occur virtually anywhere and in
|
||||
any context, we need a temporary storage area which is guaranteed not
|
||||
to generate any other page faults when accessed. The queue will be
|
||||
emptied as soon as a tasklet may be scheduled. If the queue is full,
|
||||
new error reports will be lost.
|
||||
|
||||
The default value of 64 is probably fine. If some code produces more
|
||||
than 64 errors within an irqs-off section, then the code is likely to
|
||||
produce many, many more, too, and these additional reports seldom give
|
||||
any more information (the first report is usually the most valuable
|
||||
anyway).
|
||||
|
||||
This number might have to be adjusted if you are not using a serial
|
||||
console or similar to capture the kernel log. If you are using the
|
||||
"dmesg" command to save the log, then getting a lot of kmemcheck
|
||||
warnings might overflow the kernel log itself, and the earlier reports
|
||||
will get lost in that way instead. Try setting this to 10 or so on
|
||||
such a setup.
|
||||
|
||||
- ``CONFIG_KMEMCHECK_SHADOW_COPY_SHIFT``
|
||||
Select the number of shadow bytes to save along with each entry of the
|
||||
error-report queue. These bytes indicate what parts of an allocation
|
||||
are initialized, uninitialized, etc. and will be displayed when an
|
||||
error is detected to help the debugging of a particular problem.
|
||||
|
||||
The number entered here is actually the base-2 logarithm of the number of
|
||||
bytes that will be saved. So if you pick for example 5 here, kmemcheck
|
||||
will save 2^5 = 32 bytes.
|
||||
|
||||
The default value should be fine for debugging most problems. It also
|
||||
fits nicely within 80 columns.
|
||||
|
||||
- ``CONFIG_KMEMCHECK_PARTIAL_OK``
|
||||
This option (when enabled) works around certain GCC optimizations that
|
||||
produce 32-bit reads from 16-bit variables where the upper 16 bits are
|
||||
thrown away afterwards.
|
||||
|
||||
The default value (enabled) is recommended. This may of course hide
|
||||
some real errors, but disabling it would probably produce a lot of
|
||||
false positives.
|
||||
|
||||
- ``CONFIG_KMEMCHECK_BITOPS_OK``
|
||||
This option silences warnings that would be generated for bit-field
|
||||
accesses where not all the bits are initialized at the same time. This
|
||||
may also hide some real bugs.
|
||||
|
||||
This option is probably obsolete, or it should be replaced with
|
||||
the kmemcheck-/bitfield-annotations for the code in question. The
|
||||
default value is therefore fine.
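The ``CONFIG_KMEMCHECK_SHADOW_COPY_SHIFT`` arithmetic described above can be sketched in a few lines of Python. This is only an illustration, not kmemcheck code: the function names are made up for this example, and the line width assumes that each saved byte is printed as two characters, as in the reports shown in this document.

```python
def shadow_copy_bytes(shift: int) -> int:
    """Number of shadow bytes saved per error report for a given shift."""
    return 1 << shift  # same as 2 ** shift


def dump_line_width(shift: int) -> int:
    """Width in characters of one dump line, assuming two characters per byte."""
    return 2 * shadow_copy_bytes(shift)


if __name__ == "__main__":
    for shift in (4, 5, 6):
        print(shift, shadow_copy_bytes(shift), dump_line_width(shift))
```

Under these assumptions a shift of 5 gives 32 bytes and a 64-character line, which stays below 80 columns, while a shift of 6 would already need 128 characters per line.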
|
||||
|
||||
Now compile the kernel as usual.
|
||||
|
||||
|
||||
How to use
|
||||
----------
|
||||
|
||||
Booting
|
||||
~~~~~~~
|
||||
|
||||
First some information about the command-line options. There is only one
|
||||
option specific to kmemcheck, and this is called "kmemcheck". It can be used
|
||||
to override the default mode as chosen by the ``CONFIG_KMEMCHECK_*_BY_DEFAULT``
|
||||
option. Its possible settings are:
|
||||
|
||||
- ``kmemcheck=0`` (disabled)
|
||||
- ``kmemcheck=1`` (enabled)
|
||||
- ``kmemcheck=2`` (one-shot mode)
|
||||
|
||||
If SLUB debugging has been enabled in the kernel, it may take precedence over
|
||||
kmemcheck in such a way that the slab caches which are under SLUB debugging
|
||||
will not be tracked by kmemcheck. In order to ensure that this doesn't happen
|
||||
(even though it shouldn't by default), use SLUB's boot option ``slub_debug``,
|
||||
like this: ``slub_debug=-``
|
||||
|
||||
In fact, this option may also be used for fine-grained control over SLUB vs.
|
||||
kmemcheck. For example, if the command line includes
|
||||
``kmemcheck=1 slub_debug=,dentry``, then SLUB debugging will be used only
|
||||
for the "dentry" slab cache, while kmemcheck tracks all the other
|
||||
caches. This is advanced usage, however, and is not generally recommended.
|
||||
|
||||
|
||||
Run-time enable/disable
|
||||
~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
When the kernel has booted, it is possible to enable or disable kmemcheck at
|
||||
run-time. WARNING: This feature is still experimental and may cause false
|
||||
positive warnings to appear. Therefore, try not to use this. If you find that
|
||||
it doesn't work properly (e.g. you see an unreasonable number of warnings), I
|
||||
will be happy to take bug reports.
|
||||
|
||||
Use the file ``/proc/sys/kernel/kmemcheck`` for this purpose, e.g.::
|
||||
|
||||
$ echo 0 > /proc/sys/kernel/kmemcheck # disables kmemcheck
|
||||
|
||||
The numbers are the same as for the ``kmemcheck=`` command-line option.
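For test scripts it can be convenient to name the three modes instead of passing bare numbers around. The following Python sketch is one possible way to do that; ``KmemcheckMode``, ``boot_param()`` and ``set_runtime_mode()`` are hypothetical helper names, and only the numeric values and the ``/proc/sys/kernel/kmemcheck`` path come from the text above.

```python
from enum import IntEnum

SYSCTL_PATH = "/proc/sys/kernel/kmemcheck"  # path given in the text above


class KmemcheckMode(IntEnum):
    """The three modes accepted by kmemcheck= and by the sysctl file."""
    DISABLED = 0
    ENABLED = 1
    ONESHOT = 2


def boot_param(mode: KmemcheckMode) -> str:
    """Format the kernel command-line option for the given mode."""
    return f"kmemcheck={int(mode)}"


def set_runtime_mode(mode: KmemcheckMode, path: str = SYSCTL_PATH) -> None:
    """Write the mode number to the sysctl file.

    On a real system this needs root and a kernel built with kmemcheck;
    the path parameter exists so the function can be tried on a scratch file.
    """
    with open(path, "w") as f:
        f.write(f"{int(mode)}\n")
```

Calling ``boot_param(KmemcheckMode.ONESHOT)`` yields the string ``kmemcheck=2``, and ``set_runtime_mode(KmemcheckMode.DISABLED)`` has the same effect as the ``echo 0`` command shown above.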
|
||||
|
||||
|
||||
Debugging
|
||||
~~~~~~~~~
|
||||
|
||||
A typical report will look something like this::
|
||||
|
||||
WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (ffff88003e4a2024)
|
||||
80000000000000000000000000000000000000000088ffff0000000000000000
|
||||
i i i i u u u u i i i i i i i i u u u u u u u u u u u u u u u u
|
||||
^
|
||||
|
||||
Pid: 1856, comm: ntpdate Not tainted 2.6.29-rc5 #264 945P-A
|
||||
RIP: 0010:[<ffffffff8104ede8>] [<ffffffff8104ede8>] __dequeue_signal+0xc8/0x190
|
||||
RSP: 0018:ffff88003cdf7d98 EFLAGS: 00210002
|
||||
RAX: 0000000000000030 RBX: ffff88003d4ea968 RCX: 0000000000000009
|
||||
RDX: ffff88003e5d6018 RSI: ffff88003e5d6024 RDI: ffff88003cdf7e84
|
||||
RBP: ffff88003cdf7db8 R08: ffff88003e5d6000 R09: 0000000000000000
|
||||
R10: 0000000000000080 R11: 0000000000000000 R12: 000000000000000e
|
||||
R13: ffff88003cdf7e78 R14: ffff88003d530710 R15: ffff88003d5a98c8
|
||||
FS: 0000000000000000(0000) GS:ffff880001982000(0063) knlGS:00000
|
||||
CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
|
||||
CR2: ffff88003f806ea0 CR3: 000000003c036000 CR4: 00000000000006a0
|
||||
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
|
||||
DR3: 0000000000000000 DR6: 00000000ffff4ff0 DR7: 0000000000000400
|
||||
[<ffffffff8104f04e>] dequeue_signal+0x8e/0x170
|
||||
[<ffffffff81050bd8>] get_signal_to_deliver+0x98/0x390
|
||||
[<ffffffff8100b87d>] do_notify_resume+0xad/0x7d0
|
||||
[<ffffffff8100c7b5>] int_signal+0x12/0x17
|
||||
[<ffffffffffffffff>] 0xffffffffffffffff
|
||||
|
||||
The single most valuable piece of information in this report is the RIP (or
EIP on 32-bit) value. This will help us pinpoint exactly which instruction
caused the warning.
|
||||
|
||||
If your kernel was compiled with ``CONFIG_DEBUG_INFO=y``, then all we have to do
|
||||
is give this address to the addr2line program, like this::
|
||||
|
||||
$ addr2line -e vmlinux -i ffffffff8104ede8
|
||||
arch/x86/include/asm/string_64.h:12
|
||||
include/asm-generic/siginfo.h:287
|
||||
kernel/signal.c:380
|
||||
kernel/signal.c:410
|
||||
|
||||
The "``-e vmlinux``" tells addr2line which file to look in. **IMPORTANT:**
|
||||
This must be the vmlinux of the kernel that produced the warning in the
|
||||
first place! If not, the line number information will almost certainly be
|
||||
wrong.
|
||||
|
||||
The "``-i``" tells addr2line to also print the line numbers of inlined
|
||||
functions. In this case, the flag was very important, because otherwise,
|
||||
it would only have printed the first line, which is just a call to
|
||||
``memcpy()``, which could be called from a thousand places in the kernel, and
|
||||
is therefore not very useful. These inlined functions would not show up in
|
||||
the stack trace above, simply because the kernel doesn't load the extra
|
||||
debugging information. This technique can of course be used with ordinary
|
||||
kernel oopses as well.
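When many reports have to be processed, the two manual steps above (reading the RIP value and building the addr2line command line) can be scripted. The Python sketch below is only an illustration: ``extract_ip()`` and ``addr2line_command()`` are made-up helper names, and the regular expression assumes the ``RIP: 0010:[<address>]`` layout of the report shown above (with ``EIP`` assumed to follow the same layout on 32-bit). Only the ``-e`` and ``-i`` options used in this document are passed to addr2line.

```python
import re
from typing import List

# Matches e.g. "RIP: 0010:[<ffffffff8104ede8>]" and captures the address.
RIP_RE = re.compile(r"[RE]IP: [0-9a-f]{4}:\[<([0-9a-f]+)>\]")


def extract_ip(report: str) -> str:
    """Return the faulting instruction address from a kmemcheck report."""
    match = RIP_RE.search(report)
    if match is None:
        raise ValueError("no RIP/EIP line found in report")
    return match.group(1)


def addr2line_command(vmlinux: str, *addresses: str) -> List[str]:
    """Build the argument list for addr2line, including inlined callers (-i)."""
    return ["addr2line", "-e", vmlinux, "-i", *addresses]
```

The resulting list can be handed to ``subprocess.run()``. As noted above, the ``vmlinux`` passed in must be the image of the kernel that produced the report.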
|
||||
|
||||
In this case, it's the caller of ``memcpy()`` that is interesting, and it can be
|
||||
found in ``include/asm-generic/siginfo.h``, line 287::
|
||||
|
||||
281 static inline void copy_siginfo(struct siginfo *to, struct siginfo *from)
|
||||
282 {
|
||||
283 if (from->si_code < 0)
|
||||
284 memcpy(to, from, sizeof(*to));
|
||||
285 else
|
||||
286 /* _sigchld is currently the largest know union member */
|
||||
287 memcpy(to, from, __ARCH_SI_PREAMBLE_SIZE + sizeof(from->_sifields._sigchld));
|
||||
288 }
|
||||
|
||||
Since this was a read (kmemcheck usually warns about reads only, though it can
|
||||
warn about writes to unallocated or freed memory as well), it was probably the
|
||||
"from" argument which contained some uninitialized bytes. Following the chain
|
||||
of calls, we move upwards to see where "from" was allocated or initialized,
|
||||
``kernel/signal.c``, line 380::
|
||||
|
||||
359 static void collect_signal(int sig, struct sigpending *list, siginfo_t *info)
|
||||
360 {
|
||||
...
|
||||
367 list_for_each_entry(q, &list->list, list) {
|
||||
368 if (q->info.si_signo == sig) {
|
||||
369 if (first)
|
||||
370 goto still_pending;
|
||||
371 first = q;
|
||||
...
|
||||
377 if (first) {
|
||||
378 still_pending:
|
||||
379 list_del_init(&first->list);
|
||||
380 copy_siginfo(info, &first->info);
|
||||
381 __sigqueue_free(first);
|
||||
...
|
||||
392 }
|
||||
393 }
|
||||
|
||||
Here, it is ``&first->info`` that is being passed on to ``copy_siginfo()``. The
|
||||
variable ``first`` was found on a list -- passed in as the second argument to
|
||||
``collect_signal()``. We continue our journey through the stack, to figure out
|
||||
where the item on "list" was allocated or initialized. We move to line 410::
|
||||
|
||||
395 static int __dequeue_signal(struct sigpending *pending, sigset_t *mask,
|
||||
396 siginfo_t *info)
|
||||
397 {
|
||||
...
|
||||
410 collect_signal(sig, pending, info);
|
||||
...
|
||||
414 }
|
||||
|
||||
Now we need to follow the ``pending`` pointer, since that is being passed on to
|
||||
``collect_signal()`` as ``list``. At this point, we've run out of lines from the
|
||||
"addr2line" output. Not to worry, we just paste the next addresses from the
|
||||
kmemcheck stack dump, i.e.::
|
||||
|
||||
[<ffffffff8104f04e>] dequeue_signal+0x8e/0x170
|
||||
[<ffffffff81050bd8>] get_signal_to_deliver+0x98/0x390
|
||||
[<ffffffff8100b87d>] do_notify_resume+0xad/0x7d0
|
||||
[<ffffffff8100c7b5>] int_signal+0x12/0x17
|
||||
|
||||
$ addr2line -e vmlinux -i ffffffff8104f04e ffffffff81050bd8 \
|
||||
ffffffff8100b87d ffffffff8100c7b5
|
||||
kernel/signal.c:446
|
||||
kernel/signal.c:1806
|
||||
arch/x86/kernel/signal.c:805
|
||||
arch/x86/kernel/signal.c:871
|
||||
arch/x86/kernel/entry_64.S:694
|
||||
|
||||
Remember that since these addresses were found on the stack and not as the
|
||||
RIP value, they actually point to the _next_ instruction (they are return
|
||||
addresses). This becomes obvious when we look at the code for line 446::
|
||||
|
||||
422 int dequeue_signal(struct task_struct *tsk, sigset_t *mask, siginfo_t *info)
|
||||
423 {
|
||||
...
|
||||
431 signr = __dequeue_signal(&tsk->signal->shared_pending,
|
||||
432 mask, info);
|
||||
433 /*
|
||||
434 * itimer signal ?
|
||||
435 *
|
||||
436 * itimers are process shared and we restart periodic
|
||||
437 * itimers in the signal delivery path to prevent DoS
|
||||
438 * attacks in the high resolution timer case. This is
|
||||
439 * compliant with the old way of self restarting
|
||||
440 * itimers, as the SIGALRM is a legacy signal and only
|
||||
441 * queued once. Changing the restart behaviour to
|
||||
442 * restart the timer in the signal dequeue path is
|
||||
443 * reducing the timer noise on heavy loaded !highres
|
||||
444 * systems too.
|
||||
445 */
|
||||
446 if (unlikely(signr == SIGALRM)) {
|
||||
...
|
||||
489 }
|
||||
|
||||
So instead of looking at 446, we should be looking at 431, which is the line
|
||||
that executes just before 446. Here we see that what we are looking for is
|
||||
``&tsk->signal->shared_pending``.
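A common way to avoid this off-by-one-statement effect is to subtract one byte from each return address before looking it up, so that the address falls inside the call instruction rather than after it. This convention is not part of the walkthrough above; it is mentioned here as an assumption borrowed from how debuggers and unwinders usually treat caller frames. The Python sketch below (with the hypothetical helper ``call_site_addresses()``) applies it to the stack trace lines of a report.

```python
import re
from typing import List

# Matches stack trace lines such as "[<ffffffff8104f04e>] dequeue_signal+0x8e/0x170".
FRAME_RE = re.compile(r"\[<([0-9a-f]+)>\]\s+(\S+)")


def call_site_addresses(trace: str) -> List[str]:
    """Return each frame's return address minus one byte, as hex strings.

    The result points into the call instruction itself, so a line-number
    lookup is attributed to the call rather than to the statement after it.
    """
    adjusted = []
    for address, _symbol in FRAME_RE.findall(trace):
        adjusted.append(format(int(address, 16) - 1, "x"))
    return adjusted
```

The adjusted addresses can then be passed to addr2line in place of the raw ones. The RIP value itself must not be adjusted, since it already points at the faulting instruction.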
|
||||
|
||||
Our next task is now to figure out which function puts items on this
``shared_pending`` list. A crude but efficient tool is ``git grep``::
|
||||
|
||||
$ git grep -n 'shared_pending' kernel/
|
||||
...
|
||||
kernel/signal.c:828: pending = group ? &t->signal->shared_pending : &t->pending;
|
||||
kernel/signal.c:1339: pending = group ? &t->signal->shared_pending : &t->pending;
|
||||
...
|
||||
|
||||
There were more results, but none of them were related to list operations,
|
||||
and these were the only assignments. We inspect the line numbers more closely
|
||||
and find that this is indeed where items are being added to the list::
|
||||
|
||||
816 static int send_signal(int sig, struct siginfo *info, struct task_struct *t,
|
||||
817 int group)
|
||||
818 {
|
||||
...
|
||||
828 pending = group ? &t->signal->shared_pending : &t->pending;
|
||||
...
|
||||
851 q = __sigqueue_alloc(t, GFP_ATOMIC, (sig < SIGRTMIN &&
|
||||
852 (is_si_special(info) ||
|
||||
853 info->si_code >= 0)));
|
||||
854 if (q) {
|
||||
855 list_add_tail(&q->list, &pending->list);
|
||||
...
|
||||
890 }
|
||||
|
||||
and::
|
||||
|
||||
1309 int send_sigqueue(struct sigqueue *q, struct task_struct *t, int group)
|
||||
1310 {
|
||||
....
|
||||
1339 pending = group ? &t->signal->shared_pending : &t->pending;
|
||||
1340 list_add_tail(&q->list, &pending->list);
|
||||
....
|
||||
1347 }
|
||||
|
||||
In the first case, the list element we are looking for, ``q``, is being
|
||||
returned from the function ``__sigqueue_alloc()``, which looks like an
|
||||
allocation function. Let's take a look at it::
|
||||
|
||||
187 static struct sigqueue *__sigqueue_alloc(struct task_struct *t, gfp_t flags,
|
||||
188 int override_rlimit)
|
||||
189 {
|
||||
190 struct sigqueue *q = NULL;
|
||||
191 struct user_struct *user;
|
||||
192
|
||||
193 /*
|
||||
194 * We won't get problems with the target's UID changing under us
|
||||
195 * because changing it requires RCU be used, and if t != current, the
|
||||
196 * caller must be holding the RCU readlock (by way of a spinlock) and
|
||||
197 * we use RCU protection here
|
||||
198 */
|
||||
199 user = get_uid(__task_cred(t)->user);
|
||||
200 atomic_inc(&user->sigpending);
|
||||
201 if (override_rlimit ||
|
||||
202 atomic_read(&user->sigpending) <=
|
||||
203 t->signal->rlim[RLIMIT_SIGPENDING].rlim_cur)
|
||||
204 q = kmem_cache_alloc(sigqueue_cachep, flags);
|
||||
205 if (unlikely(q == NULL)) {
|
||||
206 atomic_dec(&user->sigpending);
|
||||
207 free_uid(user);
|
||||
208 } else {
|
||||
209 INIT_LIST_HEAD(&q->list);
|
||||
210 q->flags = 0;
|
||||
211 q->user = user;
|
||||
212 }
|
||||
213
|
||||
214 return q;
|
||||
215 }
|
||||
|
||||
We see that this function initializes ``q->list``, ``q->flags``, and
|
||||
``q->user``. It seems that now is the time to look at the definition of
|
||||
``struct sigqueue``, e.g.::
|
||||
|
||||
14 struct sigqueue {
|
||||
15 struct list_head list;
|
||||
16 int flags;
|
||||
17 siginfo_t info;
|
||||
18 struct user_struct *user;
|
||||
19 };
|
||||
|
||||
And, you might remember, it was a ``memcpy()`` on ``&first->info`` that
|
||||
caused the warning, so this makes perfect sense. It also seems reasonable
|
||||
to assume that it is the caller of ``__sigqueue_alloc()`` that has the
|
||||
responsibility of filling out (initializing) this member.
|
||||
|
||||
But just which fields of the struct were uninitialized? Let's look at
|
||||
kmemcheck's report again::
|
||||
|
||||
WARNING: kmemcheck: Caught 32-bit read from uninitialized memory (ffff88003e4a2024)
|
||||
80000000000000000000000000000000000000000088ffff0000000000000000
 i i i i u u u u i i i i i i i i u u u u u u u u u u u u u u u u
         ^
|
||||
|
||||
These first two lines are the memory dump of the memory object itself, and
|
||||
the shadow bytemap, respectively. The memory object itself is in this case
|
||||
``&first->info``. Just beware that the start of this dump is NOT the start
|
||||
of the object itself! The position of the caret (^) corresponds with the
|
||||
address of the read (ffff88003e4a2024).
|
||||
|
||||
The shadow bytemap dump legend is as follows:
|
||||
|
||||
- i: initialized
|
||||
- u: uninitialized
|
||||
- a: unallocated (memory has been allocated by the slab layer, but has not
|
||||
yet been handed off to anybody)
|
||||
- f: freed (memory has been allocated by the slab layer, but has been freed
|
||||
by the previous owner)
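The legend above translates directly into a small decoder. The Python sketch below is an illustration only (``parse_shadow()`` and ``runs()`` are made-up names); it turns a shadow line from a report into per-byte states and lists the byte ranges, relative to the start of the dump, that carry a given state.

```python
from typing import List, Tuple

LEGEND = {
    "i": "initialized",
    "u": "uninitialized",
    "a": "unallocated",
    "f": "freed",
}


def parse_shadow(line: str) -> List[str]:
    """Map each flag of a shadow line to its meaning, one entry per byte."""
    return [LEGEND[flag] for flag in line.split()]


def runs(states: List[str], wanted: str) -> List[Tuple[int, int]]:
    """Half-open (start, end) byte ranges of the dump that have state `wanted`."""
    found = []
    start = None
    for index, state in enumerate(states):
        if state == wanted and start is None:
            start = index
        elif state != wanted and start is not None:
            found.append((start, index))
            start = None
    if start is not None:
        found.append((start, len(states)))
    return found
```

For the shadow line of the report above, the uninitialized ranges are bytes 4 to 7 and bytes 16 to 31 of the dump. These offsets are relative to the dump, not to the start of the object.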
|
||||
|
||||
In order to figure out where (relative to the start of the object) the
|
||||
uninitialized memory was located, we have to look at the disassembly. For
|
||||
that, we'll need the RIP address again::
|
||||
|
||||
RIP: 0010:[<ffffffff8104ede8>] [<ffffffff8104ede8>] __dequeue_signal+0xc8/0x190
|
||||
|
||||
$ objdump -d --no-show-raw-insn vmlinux | grep -C 8 ffffffff8104ede8:
|
||||
ffffffff8104edc8: mov %r8,0x8(%r8)
|
||||
ffffffff8104edcc: test %r10d,%r10d
|
||||
ffffffff8104edcf: js ffffffff8104ee88 <__dequeue_signal+0x168>
|
||||
ffffffff8104edd5: mov %rax,%rdx
|
||||
ffffffff8104edd8: mov $0xc,%ecx
|
||||
ffffffff8104eddd: mov %r13,%rdi
|
||||
ffffffff8104ede0: mov $0x30,%eax
|
||||
ffffffff8104ede5: mov %rdx,%rsi
|
||||
ffffffff8104ede8: rep movsl %ds:(%rsi),%es:(%rdi)
|
||||
ffffffff8104edea: test $0x2,%al
|
||||
ffffffff8104edec: je ffffffff8104edf0 <__dequeue_signal+0xd0>
|
||||
ffffffff8104edee: movsw %ds:(%rsi),%es:(%rdi)
|
||||
ffffffff8104edf0: test $0x1,%al
|
||||
ffffffff8104edf2: je ffffffff8104edf5 <__dequeue_signal+0xd5>
|
||||
ffffffff8104edf4: movsb %ds:(%rsi),%es:(%rdi)
|
||||
ffffffff8104edf5: mov %r8,%rdi
|
||||
ffffffff8104edf8: callq ffffffff8104de60 <__sigqueue_free>
|
||||
|
||||
As expected, it's the "``rep movsl``" instruction from the ``memcpy()``
|
||||
that causes the warning. We know that ``REP MOVSL`` uses the register
|
||||
``RCX`` to count the number of remaining iterations. By taking a look at the
|
||||
register dump again (from the kmemcheck report), we can figure out how many
|
||||
bytes were left to copy::
|
||||
|
||||
RAX: 0000000000000030 RBX: ffff88003d4ea968 RCX: 0000000000000009
|
||||
|
||||
By looking at the disassembly, we also see that ``%ecx`` is being loaded
|
||||
with the value ``$0xc`` just before (ffffffff8104edd8), so we are very
|
||||
lucky. Keep in mind that this is the number of iterations, not bytes. And
|
||||
since this is a "long" operation, we need to multiply by 4 to get the
|
||||
number of bytes. So this means that the uninitialized value was encountered
|
||||
at 4 * (0xc - 0x9) = 12 bytes from the start of the object.
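The same calculation, written out as a Python sketch (``rep_movs_offset()`` is a made-up helper name). The second half is a cross-check that is not part of the original walkthrough: it assumes, from the ``mov %rax,%rdx`` and ``mov %rdx,%rsi`` instructions in the disassembly, that ``RDX`` still holds the start of the source buffer while ``RSI`` has advanced with the copy.

```python
def rep_movs_offset(initial_count: int, remaining_count: int, element_size: int) -> int:
    """Byte offset that a `rep movs` had reached when the fault was taken."""
    if remaining_count > initial_count:
        raise ValueError("remaining count cannot exceed the initial count")
    return (initial_count - remaining_count) * element_size


# Values from the report: %ecx was loaded with 0xc, RCX was 9 at the time
# of the fault, and movsl copies 4 bytes per iteration.
offset = rep_movs_offset(0xC, 0x9, 4)

# Cross-check (an inference from the disassembly, see the note above):
# RDX = start of the source, RSI = current source position.
rdx = 0xFFFF88003E5D6018
rsi = 0xFFFF88003E5D6024
distance = rsi - rdx
```

Both ``offset`` and ``distance`` come out as 12, so the register dump agrees with the iteration count.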
|
||||
|
||||
We can now try to figure out which field of the "``struct siginfo``" was
not initialized. This is the beginning of the struct::
|
||||
|
||||
40 typedef struct siginfo {
|
||||
41 int si_signo;
|
||||
42 int si_errno;
|
||||
43 int si_code;
|
||||
44
|
||||
45 union {
|
||||
..
|
||||
92 } _sifields;
|
||||
93 } siginfo_t;
|
||||
|
||||
On 64-bit, an int is 4 bytes long, so the three int members end at offset 12
and it must be the union member that has not been initialized. We can verify
this using gdb::
|
||||
|
||||
$ gdb vmlinux
|
||||
...
|
||||
(gdb) p &((struct siginfo *) 0)->_sifields
|
||||
$1 = (union {...} *) 0x10
|
||||
|
||||
Actually, it seems that the union member is located at offset 0x10 -- which
|
||||
means that gcc has inserted 4 bytes of padding between the members ``si_code``
|
||||
and ``_sifields``. We can now get a fuller picture of the memory dump::
|
||||
|
||||
        _----------------------------=> si_code
       /        _--------------------=> (padding)
       |       /        _------------=> _sifields(._kill._pid)
       |       |       /        _----=> _sifields(._kill._uid)
       |       |       |       /
-------|-------|-------|-------|
80000000000000000000000000000000000000000088ffff0000000000000000
 i i i i u u u u i i i i i i i i u u u u u u u u u u u u u u u u
|
||||
|
||||
This allows us to realize another important fact: ``si_code`` contains the
|
||||
value 0x80. Remember that x86 is little endian, so the first 4 bytes
|
||||
"80000000" are really the number 0x00000080. With a bit of research, we
|
||||
find that this is actually the constant ``SI_KERNEL`` defined in
|
||||
``include/asm-generic/siginfo.h``::
|
||||
|
||||
144 #define SI_KERNEL 0x80 /* sent by the kernel from somewhere */
|
||||
|
||||
This macro is used in exactly one place in the x86 kernel: in ``send_signal()``
|
||||
in ``kernel/signal.c``::
|
||||
|
||||
816 static int send_signal(int sig, struct siginfo *info, struct task_struct *t,
|
||||
817 int group)
|
||||
818 {
|
||||
...
|
||||
828 pending = group ? &t->signal->shared_pending : &t->pending;
|
||||
...
|
||||
851 q = __sigqueue_alloc(t, GFP_ATOMIC, (sig < SIGRTMIN &&
|
||||
852 (is_si_special(info) ||
|
||||
853 info->si_code >= 0)));
|
||||
854 if (q) {
|
||||
855 list_add_tail(&q->list, &pending->list);
|
||||
856 switch ((unsigned long) info) {
|
||||
...
|
||||
865 case (unsigned long) SEND_SIG_PRIV:
|
||||
866 q->info.si_signo = sig;
|
||||
867 q->info.si_errno = 0;
|
||||
868 q->info.si_code = SI_KERNEL;
|
||||
869 q->info.si_pid = 0;
|
||||
870 q->info.si_uid = 0;
|
||||
871 break;
|
||||
...
|
||||
890 }
|
||||
|
||||
Not only does this match with the ``.si_code`` member, it also matches the place
|
||||
we found earlier when looking for where siginfo_t objects are enqueued on the
|
||||
``shared_pending`` list.
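The byte-order step used above (reading the dump bytes "80000000" as the number 0x00000080) can be checked with a short Python sketch. ``le32()`` is a made-up helper name; the value of ``SI_KERNEL`` is the one quoted from ``include/asm-generic/siginfo.h``.

```python
SI_KERNEL = 0x80  # value quoted from include/asm-generic/siginfo.h above


def le32(hex_digits: str) -> int:
    """Decode 8 hex digits from a memory dump as a 32-bit little-endian integer."""
    raw = bytes.fromhex(hex_digits)
    if len(raw) != 4:
        raise ValueError("expected exactly 4 bytes")
    return int.from_bytes(raw, byteorder="little")


# The first four bytes of the dump, which the walkthrough identifies as si_code.
si_code = le32("80000000")
```

Here ``si_code`` equals ``SI_KERNEL``, which is what ties the report to the ``SEND_SIG_PRIV`` case in ``send_signal()``.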
|
||||
|
||||
So to sum up: It seems that it is the padding introduced by the compiler
|
||||
between two struct fields that is uninitialized, and this gets reported when
|
||||
we do a ``memcpy()`` on the struct. This means that we have identified a false
|
||||
positive warning.
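The padding itself follows from ordinary alignment rules and can be reproduced outside the kernel. The Python sketch below uses ``ctypes`` with a deliberately simplified stand-in for ``struct siginfo``: the class and field names are made up, and the union is reduced to an int array plus one pointer member, on the assumption that a pointer-sized member is what gives the real union its 8-byte alignment on 64-bit.

```python
import ctypes


def align_up(offset: int, alignment: int) -> int:
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


class SiFields(ctypes.Union):
    # Simplified stand-in for the real union: the pointer member forces
    # pointer alignment, which is 8 bytes on a typical 64-bit platform.
    _fields_ = [("pad", ctypes.c_int * 4), ("ptr", ctypes.c_void_p)]


class SigInfo(ctypes.Structure):
    _fields_ = [
        ("si_signo", ctypes.c_int),
        ("si_errno", ctypes.c_int),
        ("si_code", ctypes.c_int),
        ("sifields", SiFields),
    ]


end_of_si_code = SigInfo.si_code.offset + SigInfo.si_code.size
union_offset = SigInfo.sifields.offset
padding = union_offset - end_of_si_code
```

On a 64-bit platform this places the union at offset 16 with 4 bytes of padding after ``si_code``, matching the gdb output above. On a 32-bit platform the padding would be expected to disappear.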
|
||||
|
||||
Normally, kmemcheck will not report uninitialized accesses in ``memcpy()`` calls
|
||||
when both the source and destination addresses are tracked. (Instead, we copy
|
||||
the shadow bytemap as well). In this case, the destination address clearly
|
||||
was not tracked. We can dig a little deeper into the stack trace from above::
|
||||
|
||||
arch/x86/kernel/signal.c:805
|
||||
arch/x86/kernel/signal.c:871
|
||||
arch/x86/kernel/entry_64.S:694
|
||||
|
||||
And we clearly see that the destination siginfo object is located on the
|
||||
stack::
|
||||
|
||||
782 static void do_signal(struct pt_regs *regs)
|
||||
783 {
|
||||
784 struct k_sigaction ka;
|
||||
785 siginfo_t info;
|
||||
...
|
||||
804 signr = get_signal_to_deliver(&info, &ka, regs, NULL);
|
||||
...
|
||||
854 }
|
||||
|
||||
And this ``&info`` is what eventually gets passed to ``copy_siginfo()`` as the
|
||||
destination argument.
|
||||
|
||||
Now, even though we didn't find an actual error here, the example is still a
|
||||
good one, because it shows how one would go about finding out what the report
|
||||
was all about.
|
||||
|
||||
|
||||
Annotating false positives
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
There are a few different ways to make annotations in the source code that
|
||||
will keep kmemcheck from checking and reporting certain allocations. Here
|
||||
they are:
|
||||
|
||||
- ``__GFP_NOTRACK_FALSE_POSITIVE``
|
||||
This flag can be passed to ``kmalloc()`` or ``kmem_cache_alloc()``
|
||||
(therefore also to other functions that end up calling one of
|
||||
these) to indicate that the allocation should not be tracked
|
||||
because it would lead to a false positive report. This is a "big
|
||||
hammer" way of silencing kmemcheck; after all, even if the false
|
||||
positive pertains to a particular field in a struct, for example, we
|
||||
will now lose the ability to find (real) errors in other parts of
|
||||
the same struct.
|
||||
|
||||
Example::
|
||||
|
||||
/* No warnings will ever trigger on accessing any part of x */
|
||||
x = kmalloc(sizeof *x, GFP_KERNEL | __GFP_NOTRACK_FALSE_POSITIVE);
|
||||
|
||||
- ``kmemcheck_bitfield_begin(name)``/``kmemcheck_bitfield_end(name)`` and
|
||||
``kmemcheck_annotate_bitfield(ptr, name)``
|
||||
The first two of these three macros can be used inside struct
|
||||
definitions to signal, respectively, the beginning and end of a
|
||||
bitfield. Additionally, this will assign the bitfield a name, which
|
||||
is given as an argument to the macros.
|
||||
|
||||
Having used these markers, one can later use
|
||||
``kmemcheck_annotate_bitfield()`` at the point of allocation, to indicate
which parts of the allocation are part of a bitfield.
|
||||
|
||||
Example::
|
||||
|
||||
struct foo {
|
||||
int x;
|
||||
|
||||
kmemcheck_bitfield_begin(flags);
|
||||
int flag_a:1;
|
||||
int flag_b:1;
|
||||
kmemcheck_bitfield_end(flags);
|
||||
|
||||
int y;
|
||||
};
|
||||
|
||||
struct foo *x = kmalloc(sizeof *x, GFP_KERNEL);
|
||||
|
||||
/* No warnings will trigger on accessing the bitfield of x */
|
||||
kmemcheck_annotate_bitfield(x, flags);
|
||||
|
||||
Note that ``kmemcheck_annotate_bitfield()`` can be used even before the
|
||||
return value of ``kmalloc()`` is checked -- in other words, passing NULL
|
||||
as the first argument is legal (and will do nothing).
|
||||
|
||||
|
||||
Reporting errors
|
||||
----------------
|
||||
|
||||
As we have seen, kmemcheck will produce false positive reports. Therefore, it
|
||||
is not very wise to blindly post kmemcheck warnings to mailing lists and
|
||||
maintainers. Instead, I encourage maintainers and developers to find errors
|
||||
in their own code. If you get a warning, you can try to work around it, try
|
||||
to figure out if it's a real error or not, or simply ignore it. Most
|
||||
developers know their own code and will quickly and efficiently determine the
|
||||
root cause of a kmemcheck report. This is therefore also the most efficient
|
||||
way to work with kmemcheck.
|
||||
|
||||
That said, we (the kmemcheck maintainers) will always be on the lookout for
|
||||
false positives that we can annotate and silence. So whatever you find,
|
||||
please drop us a note privately! Kernel configs and steps to reproduce (if
|
||||
available) are of course a great help too.
|
||||
|
||||
Happy hacking!
|
||||
|
||||
|
||||
Technical description
|
||||
---------------------
|
||||
|
||||
kmemcheck works by marking memory pages non-present. This means that whenever
|
||||
somebody attempts to access the page, a page fault is generated. The page
|
||||
fault handler notices that the page was in fact only hidden, and so it calls
|
||||
on the kmemcheck code to make further investigations.
|
||||
|
||||
When the investigations are completed, kmemcheck "shows" the page by marking
|
||||
it present (as it would be under normal circumstances). This way, the
|
||||
interrupted code can continue as usual.
|
||||
|
||||
But after the instruction has been executed, we should hide the page again, so
|
||||
that we can catch the next access too! Now kmemcheck makes use of a debugging
|
||||
feature of the processor, namely single-stepping. When the processor has
|
||||
finished the one instruction that generated the memory access, a debug
|
||||
exception is raised. From here, we simply hide the page again and continue
|
||||
execution, this time with the single-stepping feature turned off.
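
The show/single-step/hide cycle described above can be modelled in plain userspace C. The following is only a minimal sketch under stated assumptions: ``sim_page``, ``sim_cpu``, ``sim_fault()``, ``sim_trap()`` and ``sim_load()`` are hypothetical names invented for this illustration and are not kernel interfaces, and the "processor" is just a struct holding a flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct sim_page {
	bool tracked;	/* page is managed by the checker */
	bool present;	/* models the "present" bit of the page table entry */
};

struct sim_cpu {
	bool single_step;	/* models the processor's single-step flag */
	unsigned int faults;	/* page faults taken on hidden pages */
	unsigned int traps;	/* debug exceptions taken after single-stepping */
};

/* Page fault path: if the page was only hidden, show it and arm single-stepping. */
static bool sim_fault(struct sim_cpu *cpu, struct sim_page *page)
{
	if (!page->tracked)
		return false;	/* a genuine fault, not ours to handle */

	page->present = true;
	cpu->single_step = true;
	cpu->faults++;
	return true;
}

/* Debug exception path: hide the page again and stop single-stepping. */
static void sim_trap(struct sim_cpu *cpu, struct sim_page *page)
{
	assert(cpu->single_step);

	page->present = false;
	cpu->single_step = false;
	cpu->traps++;
}

/* Model one load instruction; returns false if the fault was genuine. */
static bool sim_load(struct sim_cpu *cpu, struct sim_page *page,
		     const int *src, int *dst)
{
	if (!page->present && !sim_fault(cpu, page))
		return false;

	*dst = *src;	/* the interrupted instruction now completes */

	if (cpu->single_step)
		sim_trap(cpu, page);
	return true;
}
```

In this model, every call to ``sim_load()`` on a tracked page takes exactly one fault and one trap and leaves the page hidden again, which is why each access to tracked memory is so expensive.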
|
||||
|
||||
kmemcheck requires some assistance from the memory allocator in order to work.
|
||||
The memory allocator needs to
|
||||
|
||||
1. Tell kmemcheck about newly allocated pages and pages that are about to
|
||||
be freed. This allows kmemcheck to set up and tear down the shadow memory
|
||||
for the pages in question. The shadow memory stores the status of each
|
||||
byte in the allocation proper, e.g. whether it is initialized or
|
||||
uninitialized.
|
||||
|
||||
2. Tell kmemcheck which parts of memory should be marked uninitialized.
|
||||
There are actually a few more states, such as "not yet allocated" and
|
||||
"recently freed".
|
||||
|
||||
If a slab cache is set up using the SLAB_NOTRACK flag, it will never return
|
||||
memory that can take page faults because of kmemcheck.
|
||||
|
||||
If a slab cache is NOT set up using the SLAB_NOTRACK flag, callers can still
|
||||
request memory with the __GFP_NOTRACK or __GFP_NOTRACK_FALSE_POSITIVE flags.
|
||||
This does not prevent the page faults from occurring, but it marks the
|
||||
object in question as being initialized so that no warnings will ever be
|
||||
produced for this object.
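
To make the role of the shadow memory concrete, here is a small userspace sketch of per-byte state tracking. It is an illustration under assumptions, not the kernel implementation: ``tracked_obj``, ``obj_alloc()``, ``obj_free()``, ``obj_write()`` and ``obj_check_read()`` are hypothetical names, the object size is fixed at 16 bytes, and the ``notrack`` argument merely stands in for the effect of the ``__GFP_NOTRACK`` flags described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* The four per-byte states mentioned in the text. */
enum shadow_state {
	SHADOW_UNALLOCATED,
	SHADOW_UNINITIALIZED,
	SHADOW_INITIALIZED,
	SHADOW_FREED,
};

#define OBJ_SIZE 16

/* Hypothetical object with one shadow byte per data byte. */
struct tracked_obj {
	unsigned char data[OBJ_SIZE];
	unsigned char shadow[OBJ_SIZE];
};

/* Allocation: untracked requests start out "initialized", so they never warn. */
static void obj_alloc(struct tracked_obj *o, bool notrack)
{
	memset(o->shadow,
	       notrack ? SHADOW_INITIALIZED : SHADOW_UNINITIALIZED, OBJ_SIZE);
}

/* Freeing marks every byte so that later reads can be told apart. */
static void obj_free(struct tracked_obj *o)
{
	memset(o->shadow, SHADOW_FREED, OBJ_SIZE);
}

/* A write stores the data and marks exactly the written bytes initialized. */
static void obj_write(struct tracked_obj *o, size_t off,
		      const void *src, size_t len)
{
	assert(off <= OBJ_SIZE && len <= OBJ_SIZE - off);
	memcpy(o->data + off, src, len);
	memset(o->shadow + off, SHADOW_INITIALIZED, len);
}

/* A read is clean only if every byte it touches is initialized. */
static enum shadow_state obj_check_read(const struct tracked_obj *o,
					size_t off, size_t len)
{
	size_t i;

	assert(off <= OBJ_SIZE && len <= OBJ_SIZE - off);
	for (i = 0; i < len; i++)
		if (o->shadow[off + i] != SHADOW_INITIALIZED)
			return (enum shadow_state)o->shadow[off + i];
	return SHADOW_INITIALIZED;
}
```

A caller would report a warning whenever ``obj_check_read()`` returns anything other than ``SHADOW_INITIALIZED``. Note that this sketch treats a read as bad if any byte is bad; the real checker's policy for partially initialized reads may differ.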
|
||||
|
||||
Currently, the SLAB and SLUB allocators are supported by kmemcheck.
|
10
MAINTAINERS
@ -7688,16 +7688,6 @@ F: include/linux/kdb.h
|
||||
F: include/linux/kgdb.h
|
||||
F: kernel/debug/
|
||||
|
||||
KMEMCHECK
|
||||
M: Vegard Nossum <vegardno@ifi.uio.no>
|
||||
M: Pekka Enberg <penberg@kernel.org>
|
||||
S: Maintained
|
||||
F: Documentation/dev-tools/kmemcheck.rst
|
||||
F: arch/x86/include/asm/kmemcheck.h
|
||||
F: arch/x86/mm/kmemcheck/
|
||||
F: include/linux/kmemcheck.h
|
||||
F: mm/kmemcheck.c
|
||||
|
||||
KMEMLEAK
|
||||
M: Catalin Marinas <catalin.marinas@arm.com>
|
||||
S: Maintained
|
||||
|
@ -112,7 +112,6 @@ config X86
|
||||
select HAVE_ARCH_JUMP_LABEL
|
||||
select HAVE_ARCH_KASAN if X86_64 && SPARSEMEM_VMEMMAP
|
||||
select HAVE_ARCH_KGDB
|
||||
select HAVE_ARCH_KMEMCHECK
|
||||
select HAVE_ARCH_MMAP_RND_BITS if MMU
|
||||
select HAVE_ARCH_MMAP_RND_COMPAT_BITS if MMU && COMPAT
|
||||
select HAVE_ARCH_COMPAT_MMAP_BASES if MMU && COMPAT
|
||||
@ -1430,7 +1429,7 @@ config ARCH_DMA_ADDR_T_64BIT
|
||||
|
||||
config X86_DIRECT_GBPAGES
|
||||
def_bool y
|
||||
depends on X86_64 && !DEBUG_PAGEALLOC && !KMEMCHECK
|
||||
depends on X86_64 && !DEBUG_PAGEALLOC
|
||||
---help---
|
||||
Certain kernel features effectively disable kernel
|
||||
linear 1 GB mappings (even if the CPU otherwise
|
||||
|
@ -1,43 +1 @@
|
||||
/* SPDX-License-Identifier: GPL-2.0 */
|
||||
#ifndef ASM_X86_KMEMCHECK_H
|
||||
#define ASM_X86_KMEMCHECK_H
|
||||
|
||||
#include <linux/types.h>
|
||||
#include <asm/ptrace.h>
|
||||
|
||||
#ifdef CONFIG_KMEMCHECK
|
||||
bool kmemcheck_active(struct pt_regs *regs);
|
||||
|
||||
void kmemcheck_show(struct pt_regs *regs);
|
||||
void kmemcheck_hide(struct pt_regs *regs);
|
||||
|
||||
bool kmemcheck_fault(struct pt_regs *regs,
|
||||
unsigned long address, unsigned long error_code);
|
||||
bool kmemcheck_trap(struct pt_regs *regs);
|
||||
#else
|
||||
static inline bool kmemcheck_active(struct pt_regs *regs)
|
||||
{
|
||||
return false;
|
||||
}
|
||||
|
||||
static inline void kmemcheck_show(struct pt_regs *regs)
|
||||
{
|
||||
}
|
||||
|
||||
static inline void kmemcheck_hide(struct pt_regs *regs)
|
||||
{
|
||||
}
|
||||
|
||||
static inline bool kmemcheck_fault(struct pt_regs *regs,
|
||||
unsigned long address, unsigned long error_code)
|
||||
{
|
||||
return false;
|
||||
}
|
||||
|
||||
static inline bool kmemcheck_trap(struct pt_regs *regs)
|
||||
{
|
||||
return false;
|
||||
}
|
||||
#endif /* CONFIG_KMEMCHECK */
|
||||
|
||||
#endif
|
||||
|
@ -179,8 +179,6 @@ static inline void *__memcpy3d(void *to, const void *from, size_t len)
|
||||
* No 3D Now!
|
||||
*/
|
||||
|
||||
#ifndef CONFIG_KMEMCHECK
|
||||
|
||||
#if (__GNUC__ >= 4)
|
||||
#define memcpy(t, f, n) __builtin_memcpy(t, f, n)
|
||||
#else
|
||||
@ -189,13 +187,6 @@ static inline void *__memcpy3d(void *to, const void *from, size_t len)
|
||||
? __constant_memcpy((t), (f), (n)) \
|
||||
: __memcpy((t), (f), (n)))
|
||||
#endif
|
||||
#else
|
||||
/*
|
||||
* kmemcheck becomes very happy if we use the REP instructions unconditionally,
|
||||
* because it means that we know both memory operands in advance.
|
||||
*/
|
||||
#define memcpy(t, f, n) __memcpy((t), (f), (n))
|
||||
#endif
|
||||
|
||||
#endif
|
||||
#endif /* !CONFIG_FORTIFY_SOURCE */
|
||||
|
@ -33,7 +33,6 @@ extern void *memcpy(void *to, const void *from, size_t len);
|
||||
extern void *__memcpy(void *to, const void *from, size_t len);
|
||||
|
||||
#ifndef CONFIG_FORTIFY_SOURCE
|
||||
#ifndef CONFIG_KMEMCHECK
|
||||
#if (__GNUC__ == 4 && __GNUC_MINOR__ < 3) || __GNUC__ < 4
|
||||
#define memcpy(dst, src, len) \
|
||||
({ \
|
||||
@ -46,13 +45,6 @@ extern void *__memcpy(void *to, const void *from, size_t len);
|
||||
__ret; \
|
||||
})
|
||||
#endif
|
||||
#else
|
||||
/*
|
||||
* kmemcheck becomes very happy if we use the REP instructions unconditionally,
|
||||
* because it means that we know both memory operands in advance.
|
||||
*/
|
||||
#define memcpy(dst, src, len) __inline_memcpy((dst), (src), (len))
|
||||
#endif
|
||||
#endif /* !CONFIG_FORTIFY_SOURCE */
|
||||
|
||||
#define __HAVE_ARCH_MEMSET
|
||||
|
@ -187,21 +187,6 @@ static void early_init_intel(struct cpuinfo_x86 *c)
|
||||
if (c->x86 == 6 && c->x86_model < 15)
|
||||
clear_cpu_cap(c, X86_FEATURE_PAT);
|
||||
|
||||
#ifdef CONFIG_KMEMCHECK
|
||||
/*
|
||||
* P4s have a "fast strings" feature which causes single-
|
||||
* stepping REP instructions to only generate a #DB on
|
||||
* cache-line boundaries.
|
||||
*
|
||||
* Ingo Molnar reported a Pentium D (model 6) and a Xeon
|
||||
* (model 2) with the same problem.
|
||||
*/
|
||||
if (c->x86 == 15)
|
||||
if (msr_clear_bit(MSR_IA32_MISC_ENABLE,
|
||||
MSR_IA32_MISC_ENABLE_FAST_STRING_BIT) > 0)
|
||||
pr_info("kmemcheck: Disabling fast string operations\n");
|
||||
#endif
|
||||
|
||||
/*
|
||||
* If fast string is not enabled in IA32_MISC_ENABLE for any reason,
|
||||
* clear the fast string and enhanced fast string CPU capabilities.
|
||||
|
@ -29,8 +29,6 @@ obj-$(CONFIG_X86_PTDUMP) += debug_pagetables.o
|
||||
|
||||
obj-$(CONFIG_HIGHMEM) += highmem_32.o
|
||||
|
||||
obj-$(CONFIG_KMEMCHECK) += kmemcheck/
|
||||
|
||||
KASAN_SANITIZE_kasan_init_$(BITS).o := n
|
||||
obj-$(CONFIG_KASAN) += kasan_init_$(BITS).o
|
||||
|
||||
|
@ -163,12 +163,11 @@ static int page_size_mask;
|
||||
static void __init probe_page_size_mask(void)
|
||||
{
|
||||
/*
|
||||
* For CONFIG_KMEMCHECK or pagealloc debugging, identity mapping will
|
||||
* use small pages.
|
||||
* For pagealloc debugging, identity mapping will use small pages.
|
||||
* This will simplify cpa(), which otherwise needs to support splitting
|
||||
* large pages into small in interrupt context, etc.
|
||||
*/
|
||||
if (boot_cpu_has(X86_FEATURE_PSE) && !debug_pagealloc_enabled() && !IS_ENABLED(CONFIG_KMEMCHECK))
|
||||
if (boot_cpu_has(X86_FEATURE_PSE) && !debug_pagealloc_enabled())
|
||||
page_size_mask |= 1 << PG_LEVEL_2M;
|
||||
else
|
||||
direct_gbpages = 0;
|
||||
|
@ -1 +0,0 @@
|
||||
obj-y := error.o kmemcheck.o opcode.o pte.o selftest.o shadow.o
|
@ -1,228 +1 @@
|
||||
// SPDX-License-Identifier: GPL-2.0
|
||||
#include <linux/interrupt.h>
|
||||
#include <linux/kdebug.h>
|
||||
#include <linux/kmemcheck.h>
|
||||
#include <linux/kernel.h>
|
||||
#include <linux/types.h>
|
||||
#include <linux/ptrace.h>
|
||||
#include <linux/stacktrace.h>
|
||||
#include <linux/string.h>
|
||||
|
||||
#include "error.h"
|
||||
#include "shadow.h"
|
||||
|
||||
enum kmemcheck_error_type {
|
||||
KMEMCHECK_ERROR_INVALID_ACCESS,
|
||||
KMEMCHECK_ERROR_BUG,
|
||||
};
|
||||
|
||||
#define SHADOW_COPY_SIZE (1 << CONFIG_KMEMCHECK_SHADOW_COPY_SHIFT)
|
||||
|
||||
struct kmemcheck_error {
|
||||
enum kmemcheck_error_type type;
|
||||
|
||||
union {
|
||||
/* KMEMCHECK_ERROR_INVALID_ACCESS */
|
||||
struct {
|
||||
/* Kind of access that caused the error */
|
||||
enum kmemcheck_shadow state;
|
||||
/* Address and size of the erroneous read */
|
||||
unsigned long address;
|
||||
unsigned int size;
|
||||
};
|
||||
};
|
||||
|
||||
struct pt_regs regs;
|
||||
struct stack_trace trace;
|
||||
unsigned long trace_entries[32];
|
||||
|
||||
/* We compress it to a char. */
|
||||
unsigned char shadow_copy[SHADOW_COPY_SIZE];
|
||||
unsigned char memory_copy[SHADOW_COPY_SIZE];
|
||||
};
|
||||
|
||||
/*
|
||||
* Create a ring queue of errors to output. We can't call printk() directly
|
||||
* from the kmemcheck traps, since this may call the console drivers and
|
||||
* result in a recursive fault.
|
||||
*/
|
||||
static struct kmemcheck_error error_fifo[CONFIG_KMEMCHECK_QUEUE_SIZE];
|
||||
static unsigned int error_count;
|
||||
static unsigned int error_rd;
|
||||
static unsigned int error_wr;
|
||||
static unsigned int error_missed_count;
|
||||
|
||||
static struct kmemcheck_error *error_next_wr(void)
|
||||
{
|
||||
struct kmemcheck_error *e;
|
||||
|
||||
if (error_count == ARRAY_SIZE(error_fifo)) {
|
||||
++error_missed_count;
|
||||
return NULL;
|
||||
}
|
||||
|
||||
e = &error_fifo[error_wr];
|
||||
if (++error_wr == ARRAY_SIZE(error_fifo))
|
||||
error_wr = 0;
|
||||
++error_count;
|
||||
return e;
|
||||
}
|
||||
|
||||
static struct kmemcheck_error *error_next_rd(void)
|
||||
{
|
||||
struct kmemcheck_error *e;
|
||||
|
||||
if (error_count == 0)
|
||||
return NULL;
|
||||
|
||||
e = &error_fifo[error_rd];
|
||||
if (++error_rd == ARRAY_SIZE(error_fifo))
|
||||
error_rd = 0;
|
||||
--error_count;
|
||||
return e;
|
||||
}
|
||||
|
||||
void kmemcheck_error_recall(void)
|
||||
{
|
||||
static const char *desc[] = {
|
||||
[KMEMCHECK_SHADOW_UNALLOCATED] = "unallocated",
|
||||
[KMEMCHECK_SHADOW_UNINITIALIZED] = "uninitialized",
|
||||
[KMEMCHECK_SHADOW_INITIALIZED] = "initialized",
|
||||
[KMEMCHECK_SHADOW_FREED] = "freed",
|
||||
};
|
||||
|
||||
static const char short_desc[] = {
|
||||
[KMEMCHECK_SHADOW_UNALLOCATED] = 'a',
|
||||
[KMEMCHECK_SHADOW_UNINITIALIZED] = 'u',
|
||||
[KMEMCHECK_SHADOW_INITIALIZED] = 'i',
|
||||
[KMEMCHECK_SHADOW_FREED] = 'f',
|
||||
};
|
||||
|
||||
struct kmemcheck_error *e;
|
||||
unsigned int i;
|
||||
|
||||
e = error_next_rd();
|
||||
if (!e)
|
||||
return;
|
||||
|
||||
switch (e->type) {
|
||||
case KMEMCHECK_ERROR_INVALID_ACCESS:
|
||||
printk(KERN_WARNING "WARNING: kmemcheck: Caught %d-bit read from %s memory (%p)\n",
|
||||
8 * e->size, e->state < ARRAY_SIZE(desc) ?
|
||||
desc[e->state] : "(invalid shadow state)",
|
||||
(void *) e->address);
|
||||
|
||||
printk(KERN_WARNING);
|
||||
for (i = 0; i < SHADOW_COPY_SIZE; ++i)
|
||||
printk(KERN_CONT "%02x", e->memory_copy[i]);
|
||||
printk(KERN_CONT "\n");
|
||||
|
||||
printk(KERN_WARNING);
|
||||
for (i = 0; i < SHADOW_COPY_SIZE; ++i) {
|
||||
if (e->shadow_copy[i] < ARRAY_SIZE(short_desc))
|
||||
printk(KERN_CONT " %c", short_desc[e->shadow_copy[i]]);
|
||||
else
|
||||
printk(KERN_CONT " ?");
|
||||
}
|
||||
printk(KERN_CONT "\n");
|
||||
printk(KERN_WARNING "%*c\n", 2 + 2
|
||||
* (int) (e->address & (SHADOW_COPY_SIZE - 1)), '^');
|
||||
break;
|
||||
case KMEMCHECK_ERROR_BUG:
|
||||
printk(KERN_EMERG "ERROR: kmemcheck: Fatal error\n");
|
||||
break;
|
||||
}
|
||||
|
||||
__show_regs(&e->regs, 1);
|
||||
print_stack_trace(&e->trace, 0);
|
||||
}
|
||||
|
||||
static void do_wakeup(unsigned long data)
|
||||
{
|
||||
while (error_count > 0)
|
||||
kmemcheck_error_recall();
|
||||
|
||||
if (error_missed_count > 0) {
|
||||
printk(KERN_WARNING "kmemcheck: Lost %d error reports because "
|
||||
"the queue was too small\n", error_missed_count);
|
||||
error_missed_count = 0;
|
||||
}
|
||||
}
|
||||
|
||||
static DECLARE_TASKLET(kmemcheck_tasklet, &do_wakeup, 0);
|
||||
|
||||
/*
|
||||
* Save the context of an error report.
|
||||
*/
|
||||
void kmemcheck_error_save(enum kmemcheck_shadow state,
|
||||
unsigned long address, unsigned int size, struct pt_regs *regs)
|
||||
{
|
||||
static unsigned long prev_ip;
|
||||
|
||||
struct kmemcheck_error *e;
|
||||
void *shadow_copy;
|
||||
void *memory_copy;
|
||||
|
||||
/* Don't report several adjacent errors from the same EIP. */
|
||||
if (regs->ip == prev_ip)
|
||||
return;
|
||||
prev_ip = regs->ip;
|
||||
|
||||
e = error_next_wr();
|
||||
if (!e)
|
||||
return;
|
||||
|
||||
e->type = KMEMCHECK_ERROR_INVALID_ACCESS;
|
||||
|
||||
e->state = state;
|
||||
e->address = address;
|
||||
e->size = size;
|
||||
|
||||
/* Save regs */
|
||||
memcpy(&e->regs, regs, sizeof(*regs));
|
||||
|
||||
/* Save stack trace */
|
||||
e->trace.nr_entries = 0;
|
||||
e->trace.entries = e->trace_entries;
|
||||
e->trace.max_entries = ARRAY_SIZE(e->trace_entries);
|
||||
e->trace.skip = 0;
|
||||
save_stack_trace_regs(regs, &e->trace);
|
||||
|
||||
/* Round address down to a SHADOW_COPY_SIZE boundary */
|
||||
shadow_copy = kmemcheck_shadow_lookup(address
|
||||
& ~(SHADOW_COPY_SIZE - 1));
|
||||
BUG_ON(!shadow_copy);
|
||||
|
||||
memcpy(e->shadow_copy, shadow_copy, SHADOW_COPY_SIZE);
|
||||
|
||||
kmemcheck_show_addr(address);
|
||||
memory_copy = (void *) (address & ~(SHADOW_COPY_SIZE - 1));
|
||||
memcpy(e->memory_copy, memory_copy, SHADOW_COPY_SIZE);
|
||||
kmemcheck_hide_addr(address);
|
||||
|
||||
tasklet_hi_schedule_first(&kmemcheck_tasklet);
|
||||
}
|
||||
|
||||
/*
|
||||
* Save the context of a kmemcheck bug.
|
||||
*/
|
||||
void kmemcheck_error_save_bug(struct pt_regs *regs)
|
||||
{
|
||||
struct kmemcheck_error *e;
|
||||
|
||||
e = error_next_wr();
|
||||
if (!e)
|
||||
return;
|
||||
|
||||
e->type = KMEMCHECK_ERROR_BUG;
|
||||
|
||||
memcpy(&e->regs, regs, sizeof(*regs));
|
||||
|
||||
e->trace.nr_entries = 0;
|
||||
e->trace.entries = e->trace_entries;
|
||||
e->trace.max_entries = ARRAY_SIZE(e->trace_entries);
|
||||
e->trace.skip = 1;
|
||||
save_stack_trace(&e->trace);
|
||||
|
||||
tasklet_hi_schedule_first(&kmemcheck_tasklet);
|
||||
}
|
||||
|
@ -1,16 +1 @@
|
||||
/* SPDX-License-Identifier: GPL-2.0 */
|
||||
#ifndef ARCH__X86__MM__KMEMCHECK__ERROR_H
|
||||
#define ARCH__X86__MM__KMEMCHECK__ERROR_H
|
||||
|
||||
#include <linux/ptrace.h>
|
||||
|
||||
#include "shadow.h"
|
||||
|
||||
void kmemcheck_error_save(enum kmemcheck_shadow state,
|
||||
unsigned long address, unsigned int size, struct pt_regs *regs);
|
||||
|
||||
void kmemcheck_error_save_bug(struct pt_regs *regs);
|
||||
|
||||
void kmemcheck_error_recall(void);
|
||||
|
||||
#endif
|
||||
|
@ -1,658 +0,0 @@
|
||||
/**
|
||||
* kmemcheck - a heavyweight memory checker for the linux kernel
|
||||
* Copyright (C) 2007, 2008 Vegard Nossum <vegardno@ifi.uio.no>
|
||||
* (With a lot of help from Ingo Molnar and Pekka Enberg.)
|
||||
*
|
||||
* This program is free software; you can redistribute it and/or modify
|
||||
* it under the terms of the GNU General Public License (version 2) as
|
||||
* published by the Free Software Foundation.
|
||||
*/
|
||||
|
||||
#include <linux/init.h>
|
||||
#include <linux/interrupt.h>
|
||||
#include <linux/kallsyms.h>
|
||||
#include <linux/kernel.h>
|
||||
#include <linux/kmemcheck.h>
|
||||
#include <linux/mm.h>
|
||||
#include <linux/page-flags.h>
|
||||
#include <linux/percpu.h>
|
||||
#include <linux/ptrace.h>
|
||||
#include <linux/string.h>
|
||||
#include <linux/types.h>
|
||||
|
||||
#include <asm/cacheflush.h>
|
||||
#include <asm/kmemcheck.h>
|
||||
#include <asm/pgtable.h>
|
||||
#include <asm/tlbflush.h>
|
||||
|
||||
#include "error.h"
|
||||
#include "opcode.h"
|
||||
#include "pte.h"
|
||||
#include "selftest.h"
|
||||
#include "shadow.h"
|
||||
|
||||
|
||||
#ifdef CONFIG_KMEMCHECK_DISABLED_BY_DEFAULT
|
||||
# define KMEMCHECK_ENABLED 0
|
||||
#endif
|
||||
|
||||
#ifdef CONFIG_KMEMCHECK_ENABLED_BY_DEFAULT
|
||||
# define KMEMCHECK_ENABLED 1
|
||||
#endif
|
||||
|
||||
#ifdef CONFIG_KMEMCHECK_ONESHOT_BY_DEFAULT
|
||||
# define KMEMCHECK_ENABLED 2
|
||||
#endif
|
||||
|
||||
int kmemcheck_enabled = KMEMCHECK_ENABLED;
|
||||
|
||||
int __init kmemcheck_init(void)
|
||||
{
|
||||
#ifdef CONFIG_SMP
|
||||
/*
|
||||
* Limit SMP to use a single CPU. We rely on the fact that this code
|
||||
* runs before SMP is set up.
|
||||
*/
|
||||
if (setup_max_cpus > 1) {
|
||||
printk(KERN_INFO
|
||||
"kmemcheck: Limiting number of CPUs to 1.\n");
|
||||
setup_max_cpus = 1;
|
||||
}
|
||||
#endif
|
||||
|
||||
if (!kmemcheck_selftest()) {
|
||||
printk(KERN_INFO "kmemcheck: self-tests failed; disabling\n");
|
||||
kmemcheck_enabled = 0;
|
||||
return -EINVAL;
|
||||
}
|
||||
|
||||
printk(KERN_INFO "kmemcheck: Initialized\n");
|
||||
return 0;
|
||||
}
|
||||
|
||||
early_initcall(kmemcheck_init);
|
||||
|
||||
/*
|
||||
* We need to parse the kmemcheck= option before any memory is allocated.
|
||||
*/
|
||||
static int __init param_kmemcheck(char *str)
|
||||
{
|
||||
int val;
|
||||
int ret;
|
||||
|
||||
if (!str)
|
||||
return -EINVAL;
|
||||
|
||||
ret = kstrtoint(str, 0, &val);
|
||||
if (ret)
|
||||
return ret;
|
||||
kmemcheck_enabled = val;
|
||||
return 0;
|
||||
}
|
||||
|
||||
early_param("kmemcheck", param_kmemcheck);
|
||||
|
||||
int kmemcheck_show_addr(unsigned long address)
|
||||
{
|
||||
pte_t *pte;
|
||||
|
||||
pte = kmemcheck_pte_lookup(address);
|
||||
if (!pte)
|
||||
return 0;
|
||||
|
||||
set_pte(pte, __pte(pte_val(*pte) | _PAGE_PRESENT));
|
||||
__flush_tlb_one(address);
|
||||
return 1;
|
||||
}
|
||||
|
||||
int kmemcheck_hide_addr(unsigned long address)
|
||||
{
|
||||
pte_t *pte;
|
||||
|
||||
pte = kmemcheck_pte_lookup(address);
|
||||
if (!pte)
|
||||
return 0;
|
||||
|
||||
set_pte(pte, __pte(pte_val(*pte) & ~_PAGE_PRESENT));
|
||||
__flush_tlb_one(address);
|
||||
return 1;
|
||||
}
|
||||
|
||||
struct kmemcheck_context {
|
||||
bool busy;
|
||||
int balance;
|
||||
|
||||
/*
|
||||
* There can be at most two memory operands to an instruction, but
|
||||
* each address can cross a page boundary -- so we may need up to
|
||||
* four addresses that must be hidden/revealed for each fault.
|
||||
*/
|
||||
unsigned long addr[4];
|
||||
unsigned long n_addrs;
|
||||
unsigned long flags;
|
||||
|
||||
/* Data size of the instruction that caused a fault. */
|
||||
unsigned int size;
|
||||
};
|
||||
|
||||
static DEFINE_PER_CPU(struct kmemcheck_context, kmemcheck_context);
|
||||
|
||||
bool kmemcheck_active(struct pt_regs *regs)
|
||||
{
|
||||
struct kmemcheck_context *data = this_cpu_ptr(&kmemcheck_context);
|
||||
|
||||
return data->balance > 0;
|
||||
}
|
||||
|
||||
/* Save an address that needs to be shown/hidden */
|
||||
static void kmemcheck_save_addr(unsigned long addr)
|
||||
{
|
||||
struct kmemcheck_context *data = this_cpu_ptr(&kmemcheck_context);
|
||||
|
||||
BUG_ON(data->n_addrs >= ARRAY_SIZE(data->addr));
|
||||
data->addr[data->n_addrs++] = addr;
|
||||
}
|
||||
|
||||
static unsigned int kmemcheck_show_all(void)
|
||||
{
|
||||
struct kmemcheck_context *data = this_cpu_ptr(&kmemcheck_context);
|
||||
unsigned int i;
|
||||
unsigned int n;
|
||||
|
||||
n = 0;
|
||||
for (i = 0; i < data->n_addrs; ++i)
|
||||
n += kmemcheck_show_addr(data->addr[i]);
|
||||
|
||||
return n;
|
||||
}
|
||||
|
||||
static unsigned int kmemcheck_hide_all(void)
|
||||
{
|
||||
struct kmemcheck_context *data = this_cpu_ptr(&kmemcheck_context);
|
||||
unsigned int i;
|
||||
unsigned int n;
|
||||
|
||||
n = 0;
|
||||
for (i = 0; i < data->n_addrs; ++i)
|
||||
n += kmemcheck_hide_addr(data->addr[i]);
|
||||
|
||||
return n;
|
||||
}
|
||||
|
||||
/*
|
||||
* Called from the #PF handler.
|
||||
*/
|
||||
void kmemcheck_show(struct pt_regs *regs)
|
||||
{
|
||||
struct kmemcheck_context *data = this_cpu_ptr(&kmemcheck_context);
|
||||
|
||||
BUG_ON(!irqs_disabled());
|
||||
|
||||
if (unlikely(data->balance != 0)) {
|
||||
kmemcheck_show_all();
|
||||
kmemcheck_error_save_bug(regs);
|
||||
data->balance = 0;
|
||||
return;
|
||||
}
|
||||
|
||||
/*
|
||||
* None of the addresses actually belonged to kmemcheck. Note that
|
||||
* this is not an error.
|
||||
*/
|
||||
if (kmemcheck_show_all() == 0)
|
||||
return;
|
||||
|
||||
++data->balance;
|
||||
|
||||
/*
|
||||
* The IF needs to be cleared as well, so that the faulting
|
||||
* instruction can run "uninterrupted". Otherwise, we might take
|
||||
* an interrupt and start executing that before we've had a chance
|
||||
* to hide the page again.
|
||||
*
|
||||
* NOTE: In the rare case of multiple faults, we must not override
|
||||
* the original flags:
|
||||
*/
|
||||
if (!(regs->flags & X86_EFLAGS_TF))
|
||||
data->flags = regs->flags;
|
||||
|
||||
regs->flags |= X86_EFLAGS_TF;
|
||||
regs->flags &= ~X86_EFLAGS_IF;
|
||||
}
|
||||
|
||||
/*
|
||||
* Called from the #DB handler.
|
||||
*/
|
||||
void kmemcheck_hide(struct pt_regs *regs)
|
||||
{
|
||||
struct kmemcheck_context *data = this_cpu_ptr(&kmemcheck_context);
|
||||
int n;
|
||||
|
||||
BUG_ON(!irqs_disabled());
|
||||
|
||||
if (unlikely(data->balance != 1)) {
|
||||
kmemcheck_show_all();
|
||||
kmemcheck_error_save_bug(regs);
|
||||
data->n_addrs = 0;
|
||||
data->balance = 0;
|
||||
|
||||
if (!(data->flags & X86_EFLAGS_TF))
|
||||
regs->flags &= ~X86_EFLAGS_TF;
|
||||
if (data->flags & X86_EFLAGS_IF)
|
||||
regs->flags |= X86_EFLAGS_IF;
|
||||
return;
|
||||
}
|
||||
|
||||
if (kmemcheck_enabled)
|
||||
n = kmemcheck_hide_all();
|
||||
else
|
||||
n = kmemcheck_show_all();
|
||||
|
||||
if (n == 0)
|
||||
return;
|
||||
|
||||
--data->balance;
|
||||
|
||||
data->n_addrs = 0;
|
||||
|
||||
if (!(data->flags & X86_EFLAGS_TF))
|
||||
regs->flags &= ~X86_EFLAGS_TF;
|
||||
if (data->flags & X86_EFLAGS_IF)
|
||||
regs->flags |= X86_EFLAGS_IF;
|
||||
}
|
||||
|
||||
void kmemcheck_show_pages(struct page *p, unsigned int n)
|
||||
{
|
||||
unsigned int i;
|
||||
|
||||
for (i = 0; i < n; ++i) {
|
||||
unsigned long address;
|
||||
pte_t *pte;
|
||||
unsigned int level;
|
||||
|
||||
address = (unsigned long) page_address(&p[i]);
|
||||
pte = lookup_address(address, &level);
|
||||
BUG_ON(!pte);
|
||||
BUG_ON(level != PG_LEVEL_4K);
|
||||
|
||||
set_pte(pte, __pte(pte_val(*pte) | _PAGE_PRESENT));
|
||||
set_pte(pte, __pte(pte_val(*pte) & ~_PAGE_HIDDEN));
|
||||
__flush_tlb_one(address);
|
||||
}
|
||||
}
|
||||
|
||||
bool kmemcheck_page_is_tracked(struct page *p)
|
||||
{
|
||||
/* This will also check the "hidden" flag of the PTE. */
|
||||
return kmemcheck_pte_lookup((unsigned long) page_address(p));
|
||||
}
|
||||
|
||||
void kmemcheck_hide_pages(struct page *p, unsigned int n)
|
||||
{
|
||||
unsigned int i;
|
||||
|
||||
for (i = 0; i < n; ++i) {
|
||||
unsigned long address;
|
||||
pte_t *pte;
|
||||
unsigned int level;
|
||||
|
||||
address = (unsigned long) page_address(&p[i]);
|
||||
pte = lookup_address(address, &level);
|
||||
BUG_ON(!pte);
|
||||
BUG_ON(level != PG_LEVEL_4K);
|
||||
|
||||
set_pte(pte, __pte(pte_val(*pte) & ~_PAGE_PRESENT));
|
||||
set_pte(pte, __pte(pte_val(*pte) | _PAGE_HIDDEN));
|
||||
__flush_tlb_one(address);
|
||||
}
|
||||
}
|
||||
|
||||
/* Access may NOT cross page boundary */
|
||||
static void kmemcheck_read_strict(struct pt_regs *regs,
|
||||
unsigned long addr, unsigned int size)
|
||||
{
|
||||
void *shadow;
|
||||
enum kmemcheck_shadow status;
|
||||
|
||||
shadow = kmemcheck_shadow_lookup(addr);
|
||||
if (!shadow)
|
||||
return;
|
||||
|
||||
kmemcheck_save_addr(addr);
|
||||
status = kmemcheck_shadow_test(shadow, size);
|
||||
if (status == KMEMCHECK_SHADOW_INITIALIZED)
|
||||
return;
|
||||
|
||||
if (kmemcheck_enabled)
|
||||
kmemcheck_error_save(status, addr, size, regs);
|
||||
|
||||
if (kmemcheck_enabled == 2)
|
||||
kmemcheck_enabled = 0;
|
||||
|
||||
/* Don't warn about it again. */
|
||||
kmemcheck_shadow_set(shadow, size);
|
||||
}
|
||||
|
||||
bool kmemcheck_is_obj_initialized(unsigned long addr, size_t size)
|
||||
{
|
||||
enum kmemcheck_shadow status;
|
||||
void *shadow;
|
||||
|
||||
shadow = kmemcheck_shadow_lookup(addr);
|
||||
if (!shadow)
|
||||
return true;
|
||||
|
||||
status = kmemcheck_shadow_test_all(shadow, size);
|
||||
|
||||
return status == KMEMCHECK_SHADOW_INITIALIZED;
|
||||
}
|
||||
|
||||
/* Access may cross page boundary */
|
||||
static void kmemcheck_read(struct pt_regs *regs,
|
||||
unsigned long addr, unsigned int size)
|
||||
{
|
||||
unsigned long page = addr & PAGE_MASK;
|
||||
unsigned long next_addr = addr + size - 1;
|
||||
unsigned long next_page = next_addr & PAGE_MASK;
|
||||
|
||||
if (likely(page == next_page)) {
|
||||
kmemcheck_read_strict(regs, addr, size);
|
||||
return;
|
||||
}
|
||||
|
||||
/*
|
||||
* What we do is basically to split the access across the
|
||||
* two pages and handle each part separately. Yes, this means
|
||||
* that we may now see reads that are 3 + 5 bytes, for
|
||||
* example (and if both are uninitialized, there will be two
|
||||
* reports), but it makes the code a lot simpler.
|
||||
*/
|
||||
kmemcheck_read_strict(regs, addr, next_page - addr);
|
||||
kmemcheck_read_strict(regs, next_page, next_addr - next_page + 1);
|
||||
}
|
||||
|
||||
static void kmemcheck_write_strict(struct pt_regs *regs,
|
||||
unsigned long addr, unsigned int size)
|
||||
{
|
||||
void *shadow;
|
||||
|
||||
shadow = kmemcheck_shadow_lookup(addr);
|
||||
if (!shadow)
|
||||
return;
|
||||
|
||||
kmemcheck_save_addr(addr);
|
||||
kmemcheck_shadow_set(shadow, size);
|
||||
}
|
||||
|
||||
static void kmemcheck_write(struct pt_regs *regs,
|
||||
unsigned long addr, unsigned int size)
|
||||
{
|
||||
unsigned long page = addr & PAGE_MASK;
|
||||
unsigned long next_addr = addr + size - 1;
|
||||
unsigned long next_page = next_addr & PAGE_MASK;
|
||||
|
||||
if (likely(page == next_page)) {
|
||||
kmemcheck_write_strict(regs, addr, size);
|
||||
return;
|
||||
}
|
||||
|
||||
/* See comment in kmemcheck_read(). */
|
||||
kmemcheck_write_strict(regs, addr, next_page - addr);
|
||||
kmemcheck_write_strict(regs, next_page, next_addr - next_page + 1);
|
||||
}
|
||||
|
||||
/*
|
||||
* Copying is hard. We have two addresses, each of which may be split across
|
||||
* a page (and each page will have different shadow addresses).
|
||||
*/
|
||||
static void kmemcheck_copy(struct pt_regs *regs,
|
||||
unsigned long src_addr, unsigned long dst_addr, unsigned int size)
|
||||
{
|
||||
uint8_t shadow[8];
|
||||
enum kmemcheck_shadow status;
|
||||
|
||||
unsigned long page;
|
||||
unsigned long next_addr;
|
||||
unsigned long next_page;
|
||||
|
||||
uint8_t *x;
|
||||
unsigned int i;
|
||||
unsigned int n;
|
||||
|
||||
BUG_ON(size > sizeof(shadow));
|
||||
|
||||
page = src_addr & PAGE_MASK;
|
||||
next_addr = src_addr + size - 1;
|
||||
next_page = next_addr & PAGE_MASK;
|
||||
|
||||
if (likely(page == next_page)) {
|
||||
/* Same page */
|
||||
x = kmemcheck_shadow_lookup(src_addr);
|
||||
if (x) {
|
||||
kmemcheck_save_addr(src_addr);
|
||||
for (i = 0; i < size; ++i)
|
||||
shadow[i] = x[i];
|
||||
} else {
|
||||
for (i = 0; i < size; ++i)
|
||||
shadow[i] = KMEMCHECK_SHADOW_INITIALIZED;
|
||||
}
|
||||
} else {
|
||||
n = next_page - src_addr;
|
||||
BUG_ON(n > sizeof(shadow));
|
||||
|
||||
/* First page */
|
||||
x = kmemcheck_shadow_lookup(src_addr);
|
||||
if (x) {
|
||||
kmemcheck_save_addr(src_addr);
|
||||
for (i = 0; i < n; ++i)
|
||||
shadow[i] = x[i];
|
||||
} else {
|
||||
/* Not tracked */
|
||||
for (i = 0; i < n; ++i)
|
||||
shadow[i] = KMEMCHECK_SHADOW_INITIALIZED;
|
||||
}
|
||||
|
||||
/* Second page */
|
||||
x = kmemcheck_shadow_lookup(next_page);
|
||||
if (x) {
|
||||
kmemcheck_save_addr(next_page);
|
||||
for (i = n; i < size; ++i)
|
||||
shadow[i] = x[i - n];
|
||||
} else {
|
||||
/* Not tracked */
|
||||
for (i = n; i < size; ++i)
|
||||
shadow[i] = KMEMCHECK_SHADOW_INITIALIZED;
|
||||
}
|
||||
}
|
||||
|
||||
page = dst_addr & PAGE_MASK;
|
||||
next_addr = dst_addr + size - 1;
|
||||
next_page = next_addr & PAGE_MASK;
|
||||
|
||||
if (likely(page == next_page)) {
|
||||
/* Same page */
|
||||
x = kmemcheck_shadow_lookup(dst_addr);
|
||||
if (x) {
|
||||
kmemcheck_save_addr(dst_addr);
|
||||
for (i = 0; i < size; ++i) {
|
||||
x[i] = shadow[i];
|
||||
shadow[i] = KMEMCHECK_SHADOW_INITIALIZED;
|
||||
}
|
||||
}
|
||||
} else {
|
||||
n = next_page - dst_addr;
|
||||
BUG_ON(n > sizeof(shadow));
|
||||
|
||||
/* First page */
|
||||
x = kmemcheck_shadow_lookup(dst_addr);
|
||||
if (x) {
|
||||
kmemcheck_save_addr(dst_addr);
|
||||
for (i = 0; i < n; ++i) {
|
||||
x[i] = shadow[i];
|
||||
shadow[i] = KMEMCHECK_SHADOW_INITIALIZED;
|
||||
}
|
||||
}
|
||||
|
||||
/* Second page */
|
||||
x = kmemcheck_shadow_lookup(next_page);
|
||||
if (x) {
|
||||
kmemcheck_save_addr(next_page);
|
||||
for (i = n; i < size; ++i) {
|
||||
x[i - n] = shadow[i];
|
||||
shadow[i] = KMEMCHECK_SHADOW_INITIALIZED;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
status = kmemcheck_shadow_test(shadow, size);
|
||||
if (status == KMEMCHECK_SHADOW_INITIALIZED)
|
||||
return;
|
||||
|
||||
if (kmemcheck_enabled)
|
||||
kmemcheck_error_save(status, src_addr, size, regs);
|
||||
|
||||
if (kmemcheck_enabled == 2)
|
||||
kmemcheck_enabled = 0;
|
||||
}
|
||||
|
||||
enum kmemcheck_method {
|
||||
KMEMCHECK_READ,
|
||||
KMEMCHECK_WRITE,
|
||||
};
|
||||
|
||||
static void kmemcheck_access(struct pt_regs *regs,
|
||||
unsigned long fallback_address, enum kmemcheck_method fallback_method)
|
||||
{
|
||||
const uint8_t *insn;
|
||||
const uint8_t *insn_primary;
|
||||
unsigned int size;
|
||||
|
||||
struct kmemcheck_context *data = this_cpu_ptr(&kmemcheck_context);
|
||||
|
||||
/* Recursive fault -- ouch. */
|
||||
if (data->busy) {
|
||||
kmemcheck_show_addr(fallback_address);
|
||||
kmemcheck_error_save_bug(regs);
|
||||
return;
|
||||
}
|
||||
|
||||
data->busy = true;
|
||||
|
||||
insn = (const uint8_t *) regs->ip;
|
||||
insn_primary = kmemcheck_opcode_get_primary(insn);
|
||||
|
||||
kmemcheck_opcode_decode(insn, &size);
|
||||
|
||||
switch (insn_primary[0]) {
|
||||
#ifdef CONFIG_KMEMCHECK_BITOPS_OK
|
||||
/* AND, OR, XOR */
|
||||
/*
|
||||
* Unfortunately, these instructions have to be excluded from
|
||||
* our regular checking since they access only some (and not
|
||||
* all) bits. This clears out "bogus" bitfield-access warnings.
|
||||
*/
|
||||
case 0x80:
|
||||
case 0x81:
|
||||
case 0x82:
|
||||
case 0x83:
|
||||
switch ((insn_primary[1] >> 3) & 7) {
|
||||
/* OR */
|
||||
case 1:
|
||||
/* AND */
|
||||
case 4:
|
||||
/* XOR */
|
||||
case 6:
|
||||
kmemcheck_write(regs, fallback_address, size);
|
||||
goto out;
|
||||
|
||||
/* ADD */
|
||||
case 0:
|
||||
/* ADC */
|
||||
case 2:
|
||||
/* SBB */
|
||||
case 3:
|
||||
/* SUB */
|
||||
case 5:
|
||||
/* CMP */
|
||||
case 7:
|
||||
break;
|
||||
}
|
||||
break;
|
||||
#endif
|
||||
|
||||
/* MOVS, MOVSB, MOVSW, MOVSD */
|
||||
case 0xa4:
|
||||
case 0xa5:
|
||||
/*
|
||||
* These instructions are special because they take two
|
||||
* addresses, but we only get one page fault.
|
||||
*/
|
||||
kmemcheck_copy(regs, regs->si, regs->di, size);
|
||||
goto out;
|
||||
|
||||
/* CMPS, CMPSB, CMPSW, CMPSD */
|
||||
case 0xa6:
|
||||
case 0xa7:
|
||||
kmemcheck_read(regs, regs->si, size);
|
||||
kmemcheck_read(regs, regs->di, size);
|
||||
goto out;
|
||||
}
|
||||
|
||||
/*
|
||||
* If the opcode isn't special in any way, we use the data from the
|
||||
* page fault handler to determine the address and type of memory
|
||||
* access.
|
||||
*/
|
||||
switch (fallback_method) {
|
||||
case KMEMCHECK_READ:
|
||||
kmemcheck_read(regs, fallback_address, size);
|
||||
goto out;
|
||||
case KMEMCHECK_WRITE:
|
||||
kmemcheck_write(regs, fallback_address, size);
|
||||
goto out;
|
||||
}
|
||||
|
||||
out:
|
||||
data->busy = false;
|
||||
}
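The inner switch in kmemcheck_access() keys on bits 5:3 of the byte that follows the primary opcode (the ModRM byte), which selects the operation for the 0x80-0x83 opcode group. A minimal standalone sketch of that extraction follows; the enum and function names are hypothetical and only illustrate the bit manipulation.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical names; the values follow the case labels in the switch above. */
enum group1_op {
	G1_ADD, G1_OR, G1_ADC, G1_SBB, G1_AND, G1_SUB, G1_XOR, G1_CMP
};

/* Extract the 3-bit "reg" field (bits 5:3) of a ModRM byte. */
static enum group1_op group1_op_from_modrm(uint8_t modrm)
{
	return (enum group1_op)((modrm >> 3) & 7);
}

/* True for the operations that the BITOPS_OK path treats as plain writes. */
static bool group1_is_bitop(uint8_t modrm)
{
	switch (group1_op_from_modrm(modrm)) {
	case G1_OR:
	case G1_AND:
	case G1_XOR:
		return true;
	default:
		return false;
	}
}
```

For instance, a ModRM byte of 0x08 carries reg = 1 and would be classified as OR, whereas 0x38 carries reg = 7 (CMP) and falls through to the regular read/write handling.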
|
||||
|
||||
bool kmemcheck_fault(struct pt_regs *regs, unsigned long address,
|
||||
unsigned long error_code)
|
||||
{
|
||||
pte_t *pte;
|
||||
|
||||
/*
|
||||
* XXX: Is it safe to assume that memory accesses from virtual 86
|
||||
* mode or non-kernel code segments will _never_ access kernel
|
||||
* memory (e.g. tracked pages)? For now, we need this to avoid
|
||||
* invoking kmemcheck for PnP BIOS calls.
|
||||
*/
|
||||
if (regs->flags & X86_VM_MASK)
|
||||
return false;
|
||||
if (regs->cs != __KERNEL_CS)
|
||||
return false;
|
||||
|
||||
pte = kmemcheck_pte_lookup(address);
|
||||
if (!pte)
|
||||
return false;
|
||||
|
||||
WARN_ON_ONCE(in_nmi());
|
||||
|
||||
if (error_code & 2)
|
||||
kmemcheck_access(regs, address, KMEMCHECK_WRITE);
|
||||
else
|
||||
kmemcheck_access(regs, address, KMEMCHECK_READ);
|
||||
|
||||
kmemcheck_show(regs);
|
||||
return true;
|
||||
}
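kmemcheck_fault() picks the access type from bit 1 of the x86 page-fault error code, which is set for writes and clear for reads. A small sketch of that decoding, with hypothetical macro and enum names (`PF_*`, `access_kind`) chosen for illustration:

```c
/* Hypothetical names for the low bits of the x86 page-fault error code. */
#define PF_PROT  (1UL << 0)	/* fault on a present page */
#define PF_WRITE (1UL << 1)	/* access was a write */
#define PF_USER  (1UL << 2)	/* access came from user mode */

enum access_kind { ACCESS_READ, ACCESS_WRITE };

/* Only the write bit matters for choosing the fallback method. */
static enum access_kind access_from_error_code(unsigned long error_code)
{
	return (error_code & PF_WRITE) ? ACCESS_WRITE : ACCESS_READ;
}
```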
|
||||
|
||||
bool kmemcheck_trap(struct pt_regs *regs)
|
||||
{
|
||||
if (!kmemcheck_active(regs))
|
||||
return false;
|
||||
|
||||
/* We're done. */
|
||||
kmemcheck_hide(regs);
|
||||
return true;
|
||||
}
|
@ -1,107 +1 @@
|
||||
// SPDX-License-Identifier: GPL-2.0
|
||||
#include <linux/types.h>
|
||||
|
||||
#include "opcode.h"
|
||||
|
||||
static bool opcode_is_prefix(uint8_t b)
|
||||
{
|
||||
return
|
||||
/* Group 1 */
|
||||
b == 0xf0 || b == 0xf2 || b == 0xf3
|
||||
/* Group 2 */
|
||||
|| b == 0x2e || b == 0x36 || b == 0x3e || b == 0x26
|
||||
|| b == 0x64 || b == 0x65
|
||||
/* Group 3 */
|
||||
|| b == 0x66
|
||||
/* Group 4 */
|
||||
|| b == 0x67;
|
||||
}
|
||||
|
||||
#ifdef CONFIG_X86_64
|
||||
static bool opcode_is_rex_prefix(uint8_t b)
|
||||
{
|
||||
return (b & 0xf0) == 0x40;
|
||||
}
|
||||
#else
|
||||
static bool opcode_is_rex_prefix(uint8_t b)
|
||||
{
|
||||
return false;
|
||||
}
|
||||
#endif
|
||||
|
||||
#define REX_W (1 << 3)
|
||||
|
||||
/*
|
||||
* This is a VERY crude opcode decoder. We only need to find the size of the
|
||||
* load/store that caused our #PF and this should work for all the opcodes
|
||||
* that we care about. Moreover, the ones who invented this instruction set
|
||||
* should be shot.
|
||||
*/
|
||||
void kmemcheck_opcode_decode(const uint8_t *op, unsigned int *size)
|
||||
{
|
||||
/* Default operand size */
|
||||
int operand_size_override = 4;
|
||||
|
||||
/* prefixes */
|
||||
for (; opcode_is_prefix(*op); ++op) {
|
||||
if (*op == 0x66)
|
||||
operand_size_override = 2;
|
||||
}
|
||||
|
||||
/* REX prefix */
|
||||
if (opcode_is_rex_prefix(*op)) {
|
||||
uint8_t rex = *op;
|
||||
|
||||
++op;
|
||||
if (rex & REX_W) {
|
||||
switch (*op) {
|
||||
case 0x63:
|
||||
*size = 4;
|
||||
return;
|
||||
case 0x0f:
|
||||
++op;
|
||||
|
||||
switch (*op) {
|
||||
case 0xb6:
|
||||
case 0xbe:
|
||||
*size = 1;
|
||||
return;
|
||||
case 0xb7:
|
||||
case 0xbf:
|
||||
*size = 2;
|
||||
return;
|
||||
}
|
||||
|
||||
break;
|
||||
}
|
||||
|
||||
*size = 8;
|
||||
return;
|
||||
}
|
||||
}
|
||||
|
||||
/* escape opcode */
|
||||
if (*op == 0x0f) {
|
||||
++op;
|
||||
|
||||
/*
|
||||
* This is move with zero-extend and sign-extend, respectively;
|
||||
* we don't have to think about 0xb6/0xbe, because this is
|
||||
* already handled in the conditional below.
|
||||
*/
|
||||
if (*op == 0xb7 || *op == 0xbf)
|
||||
operand_size_override = 2;
|
||||
}
|
||||
|
||||
*size = (*op & 1) ? operand_size_override : 1;
|
||||
}
|
||||
|
||||
const uint8_t *kmemcheck_opcode_get_primary(const uint8_t *op)
|
||||
{
|
||||
/* skip prefixes */
|
||||
while (opcode_is_prefix(*op))
|
||||
++op;
|
||||
if (opcode_is_rex_prefix(*op))
|
||||
++op;
|
||||
return op;
|
||||
}
|
||||
|
@ -1,10 +1 @@
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef ARCH__X86__MM__KMEMCHECK__OPCODE_H
#define ARCH__X86__MM__KMEMCHECK__OPCODE_H

#include <linux/types.h>

void kmemcheck_opcode_decode(const uint8_t *op, unsigned int *size);
const uint8_t *kmemcheck_opcode_get_primary(const uint8_t *op);

#endif
@ -1,23 +1 @@
// SPDX-License-Identifier: GPL-2.0
#include <linux/mm.h>

#include <asm/pgtable.h>

#include "pte.h"

pte_t *kmemcheck_pte_lookup(unsigned long address)
{
	pte_t *pte;
	unsigned int level;

	pte = lookup_address(address, &level);
	if (!pte)
		return NULL;
	if (level != PG_LEVEL_4K)
		return NULL;
	if (!pte_hidden(*pte))
		return NULL;

	return pte;
}
@ -1,11 +1 @@
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef ARCH__X86__MM__KMEMCHECK__PTE_H
#define ARCH__X86__MM__KMEMCHECK__PTE_H

#include <linux/mm.h>

#include <asm/pgtable.h>

pte_t *kmemcheck_pte_lookup(unsigned long address);

#endif
@ -1,71 +1 @@
|
||||
// SPDX-License-Identifier: GPL-2.0
|
||||
#include <linux/bug.h>
|
||||
#include <linux/kernel.h>
|
||||
|
||||
#include "opcode.h"
|
||||
#include "selftest.h"
|
||||
|
||||
struct selftest_opcode {
|
||||
unsigned int expected_size;
|
||||
const uint8_t *insn;
|
||||
const char *desc;
|
||||
};
|
||||
|
||||
static const struct selftest_opcode selftest_opcodes[] = {
|
||||
/* REP MOVS */
|
||||
{1, "\xf3\xa4", "rep movsb <mem8>, <mem8>"},
|
||||
{4, "\xf3\xa5", "rep movsl <mem32>, <mem32>"},
|
||||
|
||||
/* MOVZX / MOVZXD */
|
||||
{1, "\x66\x0f\xb6\x51\xf8", "movzwq <mem8>, <reg16>"},
|
||||
{1, "\x0f\xb6\x51\xf8", "movzwq <mem8>, <reg32>"},
|
||||
|
||||
/* MOVSX / MOVSXD */
|
||||
{1, "\x66\x0f\xbe\x51\xf8", "movswq <mem8>, <reg16>"},
|
||||
{1, "\x0f\xbe\x51\xf8", "movswq <mem8>, <reg32>"},
|
||||
|
||||
#ifdef CONFIG_X86_64
|
||||
/* MOVZX / MOVZXD */
|
||||
{1, "\x49\x0f\xb6\x51\xf8", "movzbq <mem8>, <reg64>"},
|
||||
{2, "\x49\x0f\xb7\x51\xf8", "movzbq <mem16>, <reg64>"},
|
||||
|
||||
/* MOVSX / MOVSXD */
|
||||
{1, "\x49\x0f\xbe\x51\xf8", "movsbq <mem8>, <reg64>"},
|
||||
{2, "\x49\x0f\xbf\x51\xf8", "movsbq <mem16>, <reg64>"},
|
||||
{4, "\x49\x63\x51\xf8", "movslq <mem32>, <reg64>"},
|
||||
#endif
|
||||
};
|
||||
|
||||
static bool selftest_opcode_one(const struct selftest_opcode *op)
|
||||
{
|
||||
unsigned size;
|
||||
|
||||
kmemcheck_opcode_decode(op->insn, &size);
|
||||
|
||||
if (size == op->expected_size)
|
||||
return true;
|
||||
|
||||
printk(KERN_WARNING "kmemcheck: opcode %s: expected size %d, got %d\n",
|
||||
op->desc, op->expected_size, size);
|
||||
return false;
|
||||
}
|
||||
|
||||
static bool selftest_opcodes_all(void)
|
||||
{
|
||||
bool pass = true;
|
||||
unsigned int i;
|
||||
|
||||
for (i = 0; i < ARRAY_SIZE(selftest_opcodes); ++i)
|
||||
pass = pass && selftest_opcode_one(&selftest_opcodes[i]);
|
||||
|
||||
return pass;
|
||||
}
|
||||
|
||||
bool kmemcheck_selftest(void)
|
||||
{
|
||||
bool pass = true;
|
||||
|
||||
pass = pass && selftest_opcodes_all();
|
||||
|
||||
return pass;
|
||||
}
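The selftest vectors above can also be checked outside the kernel. The following is a compact user-space port of the size decoder, written as a sketch: it assumes 64-bit mode (REX prefixes are always recognised), returns the size instead of writing through a pointer, and uses the hypothetical names `is_legacy_prefix` and `access_size`.

```c
#include <stdint.h>
#include <stdbool.h>

/* Legacy prefixes, grouped as in opcode_is_prefix(). */
static bool is_legacy_prefix(uint8_t b)
{
	switch (b) {
	case 0xf0: case 0xf2: case 0xf3:		/* group 1 */
	case 0x2e: case 0x36: case 0x3e: case 0x26:	/* group 2 */
	case 0x64: case 0x65:
	case 0x66:					/* group 3 */
	case 0x67:					/* group 4 */
		return true;
	default:
		return false;
	}
}

/* Size in bytes of the memory access made by the instruction at op. */
static unsigned int access_size(const uint8_t *op)
{
	unsigned int operand_size = 4;	/* default operand size */

	for (; is_legacy_prefix(*op); ++op)
		if (*op == 0x66)
			operand_size = 2;

	if ((*op & 0xf0) == 0x40) {	/* REX prefix */
		uint8_t rex = *op++;

		if (rex & 0x08) {	/* REX.W */
			if (*op == 0x63)
				return 4;	/* MOVSXD reads 32 bits */
			if (*op == 0x0f) {
				++op;
				if (*op == 0xb6 || *op == 0xbe)
					return 1;
				if (*op == 0xb7 || *op == 0xbf)
					return 2;
			}
			return 8;
		}
	}

	if (*op == 0x0f) {		/* escape opcode */
		++op;
		if (*op == 0xb7 || *op == 0xbf)
			operand_size = 2;
	}

	/* Bit 0 of the opcode distinguishes byte from full-size operands. */
	return (*op & 1) ? operand_size : 1;
}
```

Passing the byte strings from `selftest_opcodes[]` to `access_size()` should yield the `expected_size` listed next to each entry, which makes the table usable as a regression test for the port.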
|
||||
|
@ -1,7 +1 @@
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef ARCH_X86_MM_KMEMCHECK_SELFTEST_H
#define ARCH_X86_MM_KMEMCHECK_SELFTEST_H

bool kmemcheck_selftest(void);

#endif
@ -1,173 +0,0 @@
|
||||
#include <linux/kmemcheck.h>
|
||||
#include <linux/export.h>
|
||||
#include <linux/mm.h>
|
||||
|
||||
#include <asm/page.h>
|
||||
#include <asm/pgtable.h>
|
||||
|
||||
#include "pte.h"
|
||||
#include "shadow.h"
|
||||
|
||||
/*
|
||||
* Return the shadow address for the given address. Returns NULL if the
|
||||
* address is not tracked.
|
||||
*
|
||||
* We need to be extremely careful not to follow any invalid pointers,
|
||||
* because this function can be called for *any* possible address.
|
||||
*/
|
||||
void *kmemcheck_shadow_lookup(unsigned long address)
|
||||
{
|
||||
pte_t *pte;
|
||||
struct page *page;
|
||||
|
||||
if (!virt_addr_valid(address))
|
||||
return NULL;
|
||||
|
||||
pte = kmemcheck_pte_lookup(address);
|
||||
if (!pte)
|
||||
return NULL;
|
||||
|
||||
page = virt_to_page(address);
|
||||
if (!page->shadow)
|
||||
return NULL;
|
||||
return page->shadow + (address & (PAGE_SIZE - 1));
|
||||
}
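The last line of kmemcheck_shadow_lookup() maps an address to its shadow byte by reusing the offset within the page. A user-space sketch of that arithmetic, assuming a 4 KiB page and using hypothetical names (`SKETCH_PAGE_SIZE`, `page_offset_of`, `shadow_byte_for`):

```c
#include <stdint.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumption: 4 KiB pages */

/* Offset of an address within its page. */
static size_t page_offset_of(uintptr_t address)
{
	return address & (SKETCH_PAGE_SIZE - 1);
}

/*
 * Shadow byte for an address, given the shadow page that backs the
 * data page; NULL when the page has no shadow (i.e. is not tracked).
 */
static uint8_t *shadow_byte_for(uint8_t *shadow_page, uintptr_t address)
{
	return shadow_page ? shadow_page + page_offset_of(address) : NULL;
}
```

Because shadow and data pages have the same size, one shadow byte describes exactly one data byte at the same offset.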
|
||||
|
||||
static void mark_shadow(void *address, unsigned int n,
|
||||
enum kmemcheck_shadow status)
|
||||
{
|
||||
unsigned long addr = (unsigned long) address;
|
||||
unsigned long last_addr = addr + n - 1;
|
||||
unsigned long page = addr & PAGE_MASK;
|
||||
unsigned long last_page = last_addr & PAGE_MASK;
|
||||
unsigned int first_n;
|
||||
void *shadow;
|
||||
|
||||
/* If the memory range crosses a page boundary, stop there. */
|
||||
if (page == last_page)
|
||||
first_n = n;
|
||||
else
|
||||
first_n = page + PAGE_SIZE - addr;
|
||||
|
||||
shadow = kmemcheck_shadow_lookup(addr);
|
||||
if (shadow)
|
||||
memset(shadow, status, first_n);
|
||||
|
||||
addr += first_n;
|
||||
n -= first_n;
|
||||
|
||||
/* Do full-page memset()s. */
|
||||
while (n >= PAGE_SIZE) {
|
||||
shadow = kmemcheck_shadow_lookup(addr);
|
||||
if (shadow)
|
||||
memset(shadow, status, PAGE_SIZE);
|
||||
|
||||
addr += PAGE_SIZE;
|
||||
n -= PAGE_SIZE;
|
||||
}
|
||||
|
||||
/* Do the remaining page, if any. */
|
||||
if (n > 0) {
|
||||
shadow = kmemcheck_shadow_lookup(addr);
|
||||
if (shadow)
|
||||
memset(shadow, status, n);
|
||||
}
|
||||
}
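mark_shadow() walks a byte range in three steps: the part up to the first page boundary, then whole pages, then the remainder. The sketch below computes only those three lengths, assuming 4 KiB pages; `range_split` and `split_range` are hypothetical names, and the early return for `n == 0` is an addition of this sketch rather than something the removed function does.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumption: 4 KiB pages */
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

struct range_split {
	size_t first_n;		/* bytes up to the first page boundary (or all of n) */
	size_t full_pages;	/* whole pages after that */
	size_t tail_n;		/* bytes left in the final partial page */
};

static struct range_split split_range(uintptr_t addr, size_t n)
{
	struct range_split s = { 0, 0, 0 };
	uintptr_t page, last_page;

	if (n == 0)
		return s;

	page = addr & SKETCH_PAGE_MASK;
	last_page = (addr + n - 1) & SKETCH_PAGE_MASK;

	/* If the range crosses a page boundary, stop the first chunk there. */
	s.first_n = (page == last_page) ? n : page + SKETCH_PAGE_SIZE - addr;
	n -= s.first_n;

	s.full_pages = n / SKETCH_PAGE_SIZE;
	s.tail_n = n % SKETCH_PAGE_SIZE;
	return s;
}
```

Splitting this way matters because each page has its own shadow page, so a single memset() across a boundary would not be valid.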
|
||||
|
||||
void kmemcheck_mark_unallocated(void *address, unsigned int n)
|
||||
{
|
||||
mark_shadow(address, n, KMEMCHECK_SHADOW_UNALLOCATED);
|
||||
}
|
||||
|
||||
void kmemcheck_mark_uninitialized(void *address, unsigned int n)
|
||||
{
|
||||
mark_shadow(address, n, KMEMCHECK_SHADOW_UNINITIALIZED);
|
||||
}
|
||||
|
||||
/*
|
||||
* Fill the shadow memory of the given address such that the memory at that
|
||||
* address is marked as being initialized.
|
||||
*/
|
||||
void kmemcheck_mark_initialized(void *address, unsigned int n)
|
||||
{
|
||||
mark_shadow(address, n, KMEMCHECK_SHADOW_INITIALIZED);
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(kmemcheck_mark_initialized);
|
||||
|
||||
void kmemcheck_mark_freed(void *address, unsigned int n)
|
||||
{
|
||||
mark_shadow(address, n, KMEMCHECK_SHADOW_FREED);
|
||||
}
|
||||
|
||||
void kmemcheck_mark_unallocated_pages(struct page *p, unsigned int n)
|
||||
{
|
||||
unsigned int i;
|
||||
|
||||
for (i = 0; i < n; ++i)
|
||||
kmemcheck_mark_unallocated(page_address(&p[i]), PAGE_SIZE);
|
||||
}
|
||||
|
||||
void kmemcheck_mark_uninitialized_pages(struct page *p, unsigned int n)
|
||||
{
|
||||
unsigned int i;
|
||||
|
||||
for (i = 0; i < n; ++i)
|
||||
kmemcheck_mark_uninitialized(page_address(&p[i]), PAGE_SIZE);
|
||||
}
|
||||
|
||||
void kmemcheck_mark_initialized_pages(struct page *p, unsigned int n)
|
||||
{
|
||||
unsigned int i;
|
||||
|
||||
for (i = 0; i < n; ++i)
|
||||
kmemcheck_mark_initialized(page_address(&p[i]), PAGE_SIZE);
|
||||
}
|
||||
|
||||
enum kmemcheck_shadow kmemcheck_shadow_test(void *shadow, unsigned int size)
|
||||
{
|
||||
#ifdef CONFIG_KMEMCHECK_PARTIAL_OK
|
||||
uint8_t *x;
|
||||
unsigned int i;
|
||||
|
||||
x = shadow;
|
||||
|
||||
/*
|
||||
* Make sure _some_ bytes are initialized. Gcc frequently generates
|
||||
* code to access neighboring bytes.
|
||||
*/
|
||||
for (i = 0; i < size; ++i) {
|
||||
if (x[i] == KMEMCHECK_SHADOW_INITIALIZED)
|
||||
return x[i];
|
||||
}
|
||||
|
||||
return x[0];
|
||||
#else
|
||||
return kmemcheck_shadow_test_all(shadow, size);
|
||||
#endif
|
||||
}
|
||||
|
||||
enum kmemcheck_shadow kmemcheck_shadow_test_all(void *shadow, unsigned int size)
|
||||
{
|
||||
uint8_t *x;
|
||||
unsigned int i;
|
||||
|
||||
x = shadow;
|
||||
|
||||
/* All bytes must be initialized. */
|
||||
for (i = 0; i < size; ++i) {
|
||||
if (x[i] != KMEMCHECK_SHADOW_INITIALIZED)
|
||||
return x[i];
|
||||
}
|
||||
|
||||
return x[0];
|
||||
}
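The two test functions above differ only in how many initialized bytes they require. The following user-space sketch restates both policies side by side on plain byte arrays; the `SH_*` enum and the function names are hypothetical stand-ins for the kernel ones.

```c
#include <stdint.h>
#include <stddef.h>

/* Same ordering as enum kmemcheck_shadow. */
enum shadow_state { SH_UNALLOCATED, SH_UNINITIALIZED, SH_INITIALIZED, SH_FREED };

/* Strict policy: report the first byte that is not initialized. */
static enum shadow_state shadow_test_all(const uint8_t *x, size_t size)
{
	size_t i;

	for (i = 0; i < size; ++i)
		if (x[i] != SH_INITIALIZED)
			return (enum shadow_state)x[i];
	return (enum shadow_state)x[0];
}

/* Lenient policy: a single initialized byte lets the whole access pass. */
static enum shadow_state shadow_test_partial(const uint8_t *x, size_t size)
{
	size_t i;

	for (i = 0; i < size; ++i)
		if (x[i] == SH_INITIALIZED)
			return SH_INITIALIZED;
	return (enum shadow_state)x[0];
}
```

A 4-byte read that covers three initialized bytes and one uninitialized byte passes under the lenient policy but is reported under the strict one, which is the trade-off CONFIG_KMEMCHECK_PARTIAL_OK selects.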
|
||||
|
||||
void kmemcheck_shadow_set(void *shadow, unsigned int size)
|
||||
{
|
||||
uint8_t *x;
|
||||
unsigned int i;
|
||||
|
||||
x = shadow;
|
||||
for (i = 0; i < size; ++i)
|
||||
x[i] = KMEMCHECK_SHADOW_INITIALIZED;
|
||||
}
|
@ -1,19 +1 @@
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef ARCH__X86__MM__KMEMCHECK__SHADOW_H
#define ARCH__X86__MM__KMEMCHECK__SHADOW_H

enum kmemcheck_shadow {
	KMEMCHECK_SHADOW_UNALLOCATED,
	KMEMCHECK_SHADOW_UNINITIALIZED,
	KMEMCHECK_SHADOW_INITIALIZED,
	KMEMCHECK_SHADOW_FREED,
};

void *kmemcheck_shadow_lookup(unsigned long address);

enum kmemcheck_shadow kmemcheck_shadow_test(void *shadow, unsigned int size);
enum kmemcheck_shadow kmemcheck_shadow_test_all(void *shadow,
						unsigned int size);
void kmemcheck_shadow_set(void *shadow, unsigned int size);

#endif
@ -594,21 +594,6 @@ static inline void tasklet_hi_schedule(struct tasklet_struct *t)
|
||||
__tasklet_hi_schedule(t);
|
||||
}
|
||||
|
||||
extern void __tasklet_hi_schedule_first(struct tasklet_struct *t);
|
||||
|
||||
/*
|
||||
* This version avoids touching any other tasklets. Needed for kmemcheck
|
||||
* in order not to take any page faults while enqueueing this tasklet;
|
||||
* consider VERY carefully whether you really need this or
|
||||
* tasklet_hi_schedule()...
|
||||
*/
|
||||
static inline void tasklet_hi_schedule_first(struct tasklet_struct *t)
|
||||
{
|
||||
if (!test_and_set_bit(TASKLET_STATE_SCHED, &t->state))
|
||||
__tasklet_hi_schedule_first(t);
|
||||
}
|
||||
|
||||
|
||||
static inline void tasklet_disable_nosync(struct tasklet_struct *t)
|
||||
{
|
||||
atomic_inc(&t->count);
|
||||
|
@ -1,172 +1 @@
|
||||
/* SPDX-License-Identifier: GPL-2.0 */
|
||||
#ifndef LINUX_KMEMCHECK_H
|
||||
#define LINUX_KMEMCHECK_H
|
||||
|
||||
#include <linux/mm_types.h>
|
||||
#include <linux/types.h>
|
||||
|
||||
#ifdef CONFIG_KMEMCHECK
|
||||
extern int kmemcheck_enabled;
|
||||
|
||||
/* The slab-related functions. */
|
||||
void kmemcheck_alloc_shadow(struct page *page, int order, gfp_t flags, int node);
|
||||
void kmemcheck_free_shadow(struct page *page, int order);
|
||||
void kmemcheck_slab_alloc(struct kmem_cache *s, gfp_t gfpflags, void *object,
|
||||
size_t size);
|
||||
void kmemcheck_slab_free(struct kmem_cache *s, void *object, size_t size);
|
||||
|
||||
void kmemcheck_pagealloc_alloc(struct page *p, unsigned int order,
|
||||
gfp_t gfpflags);
|
||||
|
||||
void kmemcheck_show_pages(struct page *p, unsigned int n);
|
||||
void kmemcheck_hide_pages(struct page *p, unsigned int n);
|
||||
|
||||
bool kmemcheck_page_is_tracked(struct page *p);
|
||||
|
||||
void kmemcheck_mark_unallocated(void *address, unsigned int n);
|
||||
void kmemcheck_mark_uninitialized(void *address, unsigned int n);
|
||||
void kmemcheck_mark_initialized(void *address, unsigned int n);
|
||||
void kmemcheck_mark_freed(void *address, unsigned int n);
|
||||
|
||||
void kmemcheck_mark_unallocated_pages(struct page *p, unsigned int n);
|
||||
void kmemcheck_mark_uninitialized_pages(struct page *p, unsigned int n);
|
||||
void kmemcheck_mark_initialized_pages(struct page *p, unsigned int n);
|
||||
|
||||
int kmemcheck_show_addr(unsigned long address);
|
||||
int kmemcheck_hide_addr(unsigned long address);
|
||||
|
||||
bool kmemcheck_is_obj_initialized(unsigned long addr, size_t size);
|
||||
|
||||
/*
|
||||
* Bitfield annotations
|
||||
*
|
||||
* How to use: If you have a struct using bitfields, for example
|
||||
*
|
||||
* struct a {
|
||||
* int x:8, y:8;
|
||||
* };
|
||||
*
|
||||
* then this should be rewritten as
|
||||
*
|
||||
* struct a {
|
||||
* kmemcheck_bitfield_begin(flags);
|
||||
* int x:8, y:8;
|
||||
* kmemcheck_bitfield_end(flags);
|
||||
* };
|
||||
*
|
||||
* Now the "flags_begin" and "flags_end" members may be used to refer to the
|
||||
* beginning and end, respectively, of the bitfield (and things like
|
||||
* &x.flags_begin is allowed). As soon as the struct is allocated, the bit-
|
||||
* fields should be annotated:
|
||||
*
|
||||
* struct a *a = kmalloc(sizeof(struct a), GFP_KERNEL);
|
||||
* kmemcheck_annotate_bitfield(a, flags);
|
||||
*/
|
||||
#define kmemcheck_bitfield_begin(name) \
|
||||
int name##_begin[0];
|
||||
|
||||
#define kmemcheck_bitfield_end(name) \
|
||||
int name##_end[0];
|
||||
|
||||
#define kmemcheck_annotate_bitfield(ptr, name) \
|
||||
do { \
|
||||
int _n; \
|
||||
\
|
||||
if (!ptr) \
|
||||
break; \
|
||||
\
|
||||
_n = (long) &((ptr)->name##_end) \
|
||||
- (long) &((ptr)->name##_begin); \
|
||||
BUILD_BUG_ON(_n < 0); \
|
||||
\
|
||||
kmemcheck_mark_initialized(&((ptr)->name##_begin), _n); \
|
||||
} while (0)
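The bitfield annotation relies on zero-length array members placed before and after the bit-fields, so that the distance between the two markers equals the storage the bit-fields occupy. A user-space sketch of the trick follows. It assumes a compiler that accepts zero-length arrays (a GNU C extension supported by GCC and Clang), and the names `bitfield_begin`, `bitfield_end` and `flags_span` are hypothetical.

```c
#include <stddef.h>

/* Zero-sized marker members; they take no space but have an address. */
#define bitfield_begin(name) int name##_begin[0];
#define bitfield_end(name)   int name##_end[0];

struct a {
	bitfield_begin(flags)
	int x:8, y:8;
	bitfield_end(flags)
	int other;
};

/* Bytes between the two markers, i.e. the storage used by x and y. */
static size_t flags_span(void)
{
	return offsetof(struct a, flags_end) - offsetof(struct a, flags_begin);
}
```

On a typical ABI with 4-byte `int`, the span should come out as `sizeof(int)`, and that is the length the real macro hands to kmemcheck_mark_initialized().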
|
||||
|
||||
#define kmemcheck_annotate_variable(var) \
|
||||
do { \
|
||||
kmemcheck_mark_initialized(&(var), sizeof(var)); \
|
||||
} while (0) \
|
||||
|
||||
#else
|
||||
#define kmemcheck_enabled 0
|
||||
|
||||
static inline void
|
||||
kmemcheck_alloc_shadow(struct page *page, int order, gfp_t flags, int node)
|
||||
{
|
||||
}
|
||||
|
||||
static inline void
|
||||
kmemcheck_free_shadow(struct page *page, int order)
|
||||
{
|
||||
}
|
||||
|
||||
static inline void
|
||||
kmemcheck_slab_alloc(struct kmem_cache *s, gfp_t gfpflags, void *object,
|
||||
size_t size)
|
||||
{
|
||||
}
|
||||
|
||||
static inline void kmemcheck_slab_free(struct kmem_cache *s, void *object,
|
||||
size_t size)
|
||||
{
|
||||
}
|
||||
|
||||
static inline void kmemcheck_pagealloc_alloc(struct page *p,
|
||||
unsigned int order, gfp_t gfpflags)
|
||||
{
|
||||
}
|
||||
|
||||
static inline bool kmemcheck_page_is_tracked(struct page *p)
|
||||
{
|
||||
return false;
|
||||
}
|
||||
|
||||
static inline void kmemcheck_mark_unallocated(void *address, unsigned int n)
|
||||
{
|
||||
}
|
||||
|
||||
static inline void kmemcheck_mark_uninitialized(void *address, unsigned int n)
|
||||
{
|
||||
}
|
||||
|
||||
static inline void kmemcheck_mark_initialized(void *address, unsigned int n)
|
||||
{
|
||||
}
|
||||
|
||||
static inline void kmemcheck_mark_freed(void *address, unsigned int n)
|
||||
{
|
||||
}
|
||||
|
||||
static inline void kmemcheck_mark_unallocated_pages(struct page *p,
|
||||
unsigned int n)
|
||||
{
|
||||
}
|
||||
|
||||
static inline void kmemcheck_mark_uninitialized_pages(struct page *p,
|
||||
unsigned int n)
|
||||
{
|
||||
}
|
||||
|
||||
static inline void kmemcheck_mark_initialized_pages(struct page *p,
|
||||
unsigned int n)
|
||||
{
|
||||
}
|
||||
|
||||
static inline bool kmemcheck_is_obj_initialized(unsigned long addr, size_t size)
|
||||
{
|
||||
return true;
|
||||
}
|
||||
|
||||
#define kmemcheck_bitfield_begin(name)
|
||||
#define kmemcheck_bitfield_end(name)
|
||||
#define kmemcheck_annotate_bitfield(ptr, name) \
|
||||
do { \
|
||||
} while (0)
|
||||
|
||||
#define kmemcheck_annotate_variable(var) \
|
||||
do { \
|
||||
} while (0)
|
||||
|
||||
#endif /* CONFIG_KMEMCHECK */
|
||||
|
||||
#endif /* LINUX_KMEMCHECK_H */
|
||||
|
@ -486,16 +486,6 @@ void __tasklet_hi_schedule(struct tasklet_struct *t)
|
||||
}
|
||||
EXPORT_SYMBOL(__tasklet_hi_schedule);
|
||||
|
||||
void __tasklet_hi_schedule_first(struct tasklet_struct *t)
|
||||
{
|
||||
lockdep_assert_irqs_disabled();
|
||||
|
||||
t->next = __this_cpu_read(tasklet_hi_vec.head);
|
||||
__this_cpu_write(tasklet_hi_vec.head, t);
|
||||
__raise_softirq_irqoff(HI_SOFTIRQ);
|
||||
}
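__tasklet_hi_schedule_first() links the tasklet in front of the current list head instead of appending it at the tail. The difference can be sketched with a plain singly linked queue. The types and function names below are hypothetical, and the tail fix-up for an empty queue in `push_head()` is an addition of this sketch: the removed function only updates the head.

```c
#include <stddef.h>

struct node {
	struct node *next;
	int id;
};

struct queue {
	struct node *head;
	struct node **tail;	/* points at the last next pointer */
};

static void queue_init(struct queue *q)
{
	q->head = NULL;
	q->tail = &q->head;
}

/* Regular scheduling: append at the tail. */
static void push_tail(struct queue *q, struct node *n)
{
	n->next = NULL;
	*q->tail = n;
	q->tail = &n->next;
}

/* "Schedule first": link in front of the current head. */
static void push_head(struct queue *q, struct node *n)
{
	n->next = q->head;
	if (!q->head)
		q->tail = &n->next;	/* keep tail valid when the queue was empty */
	q->head = n;
}
```

Inserting at the head only reads and writes the head pointer and the new node, which fits the stated goal of not touching any other tasklets.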
|
||||
EXPORT_SYMBOL(__tasklet_hi_schedule_first);
|
||||
|
||||
static __latent_entropy void tasklet_action(struct softirq_action *a)
|
||||
{
|
||||
struct tasklet_struct *list;
|
||||
|
@ -30,7 +30,6 @@
|
||||
#include <linux/proc_fs.h>
|
||||
#include <linux/security.h>
|
||||
#include <linux/ctype.h>
|
||||
#include <linux/kmemcheck.h>
|
||||
#include <linux/kmemleak.h>
|
||||
#include <linux/fs.h>
|
||||
#include <linux/init.h>
|
||||
@ -1173,15 +1172,6 @@ static struct ctl_table kern_table[] = {
|
||||
.extra1 = &zero,
|
||||
.extra2 = &one_thousand,
|
||||
},
|
||||
#endif
|
||||
#ifdef CONFIG_KMEMCHECK
|
||||
{
|
||||
.procname = "kmemcheck",
|
||||
.data = &kmemcheck_enabled,
|
||||
.maxlen = sizeof(int),
|
||||
.mode = 0644,
|
||||
.proc_handler = proc_dointvec,
|
||||
},
|
||||
#endif
|
||||
{
|
||||
.procname = "panic_on_warn",
|
||||
|
@ -504,7 +504,7 @@ config DEBUG_OBJECTS_ENABLE_DEFAULT
|
||||
|
||||
config DEBUG_SLAB
|
||||
bool "Debug slab memory allocations"
|
||||
-	depends on DEBUG_KERNEL && SLAB && !KMEMCHECK
+	depends on DEBUG_KERNEL && SLAB
|
||||
help
|
||||
Say Y here to have the kernel do limited verification on memory
|
||||
allocation as well as poisoning memory on free to catch use of freed
|
||||
@ -516,7 +516,7 @@ config DEBUG_SLAB_LEAK
|
||||
|
||||
config SLUB_DEBUG_ON
|
||||
bool "SLUB debugging on by default"
|
||||
-	depends on SLUB && SLUB_DEBUG && !KMEMCHECK
+	depends on SLUB && SLUB_DEBUG
|
||||
default n
|
||||
help
|
||||
Boot with debugging on by default. SLUB boots by default with
|
||||
@ -730,8 +730,6 @@ config DEBUG_STACKOVERFLOW
|
||||
|
||||
If in doubt, say "N".
|
||||
|
||||
source "lib/Kconfig.kmemcheck"
|
||||
|
||||
source "lib/Kconfig.kasan"
|
||||
|
||||
endmenu # "Memory Debugging"
|
||||
|
@ -1,94 +0,0 @@
|
||||
config HAVE_ARCH_KMEMCHECK
|
||||
bool
|
||||
|
||||
if HAVE_ARCH_KMEMCHECK
|
||||
|
||||
menuconfig KMEMCHECK
|
||||
bool "kmemcheck: trap use of uninitialized memory"
|
||||
depends on DEBUG_KERNEL
|
||||
depends on !X86_USE_3DNOW
|
||||
depends on SLUB || SLAB
|
||||
depends on !CC_OPTIMIZE_FOR_SIZE
|
||||
depends on !FUNCTION_TRACER
|
||||
select FRAME_POINTER
|
||||
select STACKTRACE
|
||||
default n
|
||||
help
|
||||
This option enables tracing of dynamically allocated kernel memory
|
||||
to see if memory is used before it has been given an initial value.
|
||||
Be aware that this requires half of your memory for bookkeeping and
will insert extra code at *every* read and write to tracked memory,
thus slowing down the kernel code (but user code is unaffected).
|
||||
|
||||
The kernel may be started with kmemcheck=0 or kmemcheck=1 to disable
|
||||
or enable kmemcheck at boot-time. If the kernel is started with
|
||||
kmemcheck=0, the large memory and CPU overhead is not incurred.
|
||||
|
||||
choice
|
||||
prompt "kmemcheck: default mode at boot"
|
||||
depends on KMEMCHECK
|
||||
default KMEMCHECK_ONESHOT_BY_DEFAULT
|
||||
help
|
||||
This option controls the default behaviour of kmemcheck when the
|
||||
kernel boots and no kmemcheck= parameter is given.
|
||||
|
||||
config KMEMCHECK_DISABLED_BY_DEFAULT
|
||||
bool "disabled"
|
||||
depends on KMEMCHECK
|
||||
|
||||
config KMEMCHECK_ENABLED_BY_DEFAULT
|
||||
bool "enabled"
|
||||
depends on KMEMCHECK
|
||||
|
||||
config KMEMCHECK_ONESHOT_BY_DEFAULT
|
||||
bool "one-shot"
|
||||
depends on KMEMCHECK
|
||||
help
|
||||
In one-shot mode, only the first error detected is reported before
|
||||
kmemcheck is disabled.
|
||||
|
||||
endchoice
|
||||
|
||||
config KMEMCHECK_QUEUE_SIZE
|
||||
int "kmemcheck: error queue size"
|
||||
depends on KMEMCHECK
|
||||
default 64
|
||||
help
|
||||
Select the maximum number of errors to store in the queue. Since
errors can occur virtually anywhere and in any context, we need a
temporary storage area which is guaranteed not to generate any
other faults. The queue will be emptied as soon as a tasklet may
be scheduled. If the queue is full, new error reports will be
lost.
|
||||
|
||||
config KMEMCHECK_SHADOW_COPY_SHIFT
|
||||
int "kmemcheck: shadow copy size (5 => 32 bytes, 6 => 64 bytes)"
|
||||
depends on KMEMCHECK
|
||||
range 2 8
|
||||
default 5
|
||||
help
|
||||
Select the number of shadow bytes to save along with each entry of
|
||||
the queue. These bytes indicate what parts of an allocation are
|
||||
initialized, uninitialized, etc. and will be displayed when an
|
||||
error is detected to help the debugging of a particular problem.
|
||||
|
||||
config KMEMCHECK_PARTIAL_OK
|
||||
bool "kmemcheck: allow partially uninitialized memory"
|
||||
depends on KMEMCHECK
|
||||
default y
|
||||
help
|
||||
This option works around certain GCC optimizations that produce
|
||||
32-bit reads from 16-bit variables where the upper 16 bits are
|
||||
thrown away afterwards. This may of course also hide some real
|
||||
bugs.
|
||||
|
||||
config KMEMCHECK_BITOPS_OK
|
||||
bool "kmemcheck: allow bit-field manipulation"
|
||||
depends on KMEMCHECK
|
||||
default n
|
||||
help
|
||||
This option silences warnings that would be generated for bit-field
|
||||
accesses where not all the bits are initialized at the same time.
|
||||
This may also hide some real bugs.
|
||||
|
||||
endif
|
@ -11,7 +11,6 @@ config DEBUG_PAGEALLOC
|
||||
bool "Debug page memory allocations"
|
||||
depends on DEBUG_KERNEL
|
||||
depends on !HIBERNATION || ARCH_SUPPORTS_DEBUG_PAGEALLOC && !PPC && !SPARC
|
||||
depends on !KMEMCHECK
|
||||
select PAGE_EXTENSION
|
||||
select PAGE_POISONING if !ARCH_SUPPORTS_DEBUG_PAGEALLOC
|
||||
---help---
|
||||
|
@ -17,7 +17,6 @@ KCOV_INSTRUMENT_slub.o := n
|
||||
KCOV_INSTRUMENT_page_alloc.o := n
|
||||
KCOV_INSTRUMENT_debug-pagealloc.o := n
|
||||
KCOV_INSTRUMENT_kmemleak.o := n
|
||||
KCOV_INSTRUMENT_kmemcheck.o := n
|
||||
KCOV_INSTRUMENT_memcontrol.o := n
|
||||
KCOV_INSTRUMENT_mmzone.o := n
|
||||
KCOV_INSTRUMENT_vmstat.o := n
|
||||
@ -70,7 +69,6 @@ obj-$(CONFIG_KSM) += ksm.o
|
||||
obj-$(CONFIG_PAGE_POISONING) += page_poison.o
|
||||
obj-$(CONFIG_SLAB) += slab.o
|
||||
obj-$(CONFIG_SLUB) += slub.o
|
||||
obj-$(CONFIG_KMEMCHECK) += kmemcheck.o
|
||||
obj-$(CONFIG_KASAN) += kasan/
|
||||
obj-$(CONFIG_FAILSLAB) += failslab.o
|
||||
obj-$(CONFIG_MEMORY_HOTPLUG) += memory_hotplug.o
|
||||
|
125
mm/kmemcheck.c
@ -1,126 +1 @@
|
||||
// SPDX-License-Identifier: GPL-2.0
|
||||
#include <linux/gfp.h>
|
||||
#include <linux/mm_types.h>
|
||||
#include <linux/mm.h>
|
||||
#include <linux/slab.h>
|
||||
#include "slab.h"
|
||||
#include <linux/kmemcheck.h>
|
||||
|
||||
void kmemcheck_alloc_shadow(struct page *page, int order, gfp_t flags, int node)
|
||||
{
|
||||
struct page *shadow;
|
||||
int pages;
|
||||
int i;
|
||||
|
||||
pages = 1 << order;
|
||||
|
||||
/*
|
||||
* With kmemcheck enabled, we need to allocate a memory area for the
|
||||
* shadow bits as well.
|
||||
*/
|
||||
shadow = alloc_pages_node(node, flags, order);
|
||||
if (!shadow) {
|
||||
if (printk_ratelimit())
|
||||
pr_err("kmemcheck: failed to allocate shadow bitmap\n");
|
||||
return;
|
||||
}
|
||||
|
||||
for(i = 0; i < pages; ++i)
|
||||
page[i].shadow = page_address(&shadow[i]);
|
||||
|
||||
/*
|
||||
* Mark it as non-present for the MMU so that our accesses to
|
||||
* this memory will trigger a page fault and let us analyze
|
||||
* the memory accesses.
|
||||
*/
|
||||
kmemcheck_hide_pages(page, pages);
|
||||
}
|
||||
|
||||
void kmemcheck_free_shadow(struct page *page, int order)
|
||||
{
|
||||
struct page *shadow;
|
||||
int pages;
|
||||
int i;
|
||||
|
||||
if (!kmemcheck_page_is_tracked(page))
|
||||
return;
|
||||
|
||||
pages = 1 << order;
|
||||
|
||||
kmemcheck_show_pages(page, pages);
|
||||
|
||||
shadow = virt_to_page(page[0].shadow);
|
||||
|
||||
for(i = 0; i < pages; ++i)
|
||||
page[i].shadow = NULL;
|
||||
|
||||
__free_pages(shadow, order);
|
||||
}
|
||||
|
||||
void kmemcheck_slab_alloc(struct kmem_cache *s, gfp_t gfpflags, void *object,
|
||||
size_t size)
|
||||
{
|
||||
if (unlikely(!object)) /* Skip object if allocation failed */
|
||||
return;
|
||||
|
||||
/*
|
||||
* Has already been memset(), which initializes the shadow for us
|
||||
* as well.
|
||||
*/
|
||||
if (gfpflags & __GFP_ZERO)
|
||||
return;
|
||||
|
||||
/* No need to initialize the shadow of a non-tracked slab. */
|
||||
if (s->flags & SLAB_NOTRACK)
|
||||
return;
|
||||
|
||||
if (!kmemcheck_enabled || gfpflags & __GFP_NOTRACK) {
|
||||
/*
|
||||
* Allow notracked objects to be allocated from
|
||||
* tracked caches. Note however that these objects
|
||||
* will still get page faults on access, they just
|
||||
* won't ever be flagged as uninitialized. If page
|
||||
* faults are not acceptable, the slab cache itself
|
||||
* should be marked NOTRACK.
|
||||
*/
|
||||
kmemcheck_mark_initialized(object, size);
|
||||
} else if (!s->ctor) {
|
||||
/*
|
||||
* New objects should be marked uninitialized before
* they're returned to the caller.
|
||||
*/
|
||||
kmemcheck_mark_uninitialized(object, size);
|
||||
}
|
||||
}
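kmemcheck_slab_alloc() is essentially a decision table over the allocation flags and cache properties. The sketch below restates that table as a pure function so the order of the checks is easier to follow; the struct, enum and function names are hypothetical, and the booleans stand in for the kernel flag tests noted in the comments.

```c
#include <stdbool.h>

enum slab_action { LEAVE_AS_IS, MARK_INITIALIZED, MARK_UNINITIALIZED };

struct alloc_info {
	bool object_ok;		/* allocation succeeded (object != NULL) */
	bool zeroed;		/* __GFP_ZERO was requested */
	bool cache_notrack;	/* cache has SLAB_NOTRACK set */
	bool checker_on;	/* kmemcheck_enabled is non-zero */
	bool gfp_notrack;	/* __GFP_NOTRACK was requested */
	bool has_ctor;		/* cache has a constructor */
};

static enum slab_action slab_alloc_action(const struct alloc_info *a)
{
	/* Failed, already zeroed, or untracked cache: nothing to do. */
	if (!a->object_ok || a->zeroed || a->cache_notrack)
		return LEAVE_AS_IS;

	/* Not tracked for this allocation: never flag it as uninitialized. */
	if (!a->checker_on || a->gfp_notrack)
		return MARK_INITIALIZED;

	/* Constructed objects keep whatever state the constructor left. */
	return a->has_ctor ? LEAVE_AS_IS : MARK_UNINITIALIZED;
}
```

Only an ordinary tracked allocation without a constructor ends up marked uninitialized, which is the case the checker is meant to catch.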
|
||||
|
||||
void kmemcheck_slab_free(struct kmem_cache *s, void *object, size_t size)
|
||||
{
|
||||
/* TODO: RCU freeing is unsupported for now; hide false positives. */
|
||||
if (!s->ctor && !(s->flags & SLAB_TYPESAFE_BY_RCU))
|
||||
kmemcheck_mark_freed(object, size);
|
||||
}
|
||||
|
||||
void kmemcheck_pagealloc_alloc(struct page *page, unsigned int order,
|
||||
gfp_t gfpflags)
|
||||
{
|
||||
int pages;
|
||||
|
||||
if (gfpflags & (__GFP_HIGHMEM | __GFP_NOTRACK))
|
||||
return;
|
||||
|
||||
pages = 1 << order;
|
||||
|
||||
/*
|
||||
* NOTE: We choose to track GFP_ZERO pages too; in fact, they
|
||||
* can become uninitialized by copying uninitialized memory
|
||||
* into them.
|
||||
*/
|
||||
|
||||
/* XXX: Can use zone->node for node? */
|
||||
kmemcheck_alloc_shadow(page, order, gfpflags, -1);
|
||||
|
||||
if (gfpflags & __GFP_ZERO)
|
||||
kmemcheck_mark_initialized_pages(page, pages);
|
||||
else
|
||||
kmemcheck_mark_uninitialized_pages(page, pages);
|
||||
}
|
||||
|
@ -1371,7 +1371,7 @@ static inline void *slab_free_hook(struct kmem_cache *s, void *x)
|
||||
* So in order to make the debug calls that expect irqs to be
|
||||
* disabled we need to disable interrupts temporarily.
|
||||
*/
|
||||
-#if defined(CONFIG_KMEMCHECK) || defined(CONFIG_LOCKDEP)
+#ifdef CONFIG_LOCKDEP
|
||||
{
|
||||
unsigned long flags;
|
||||
|
||||
@ -1399,8 +1399,7 @@ static inline void slab_free_freelist_hook(struct kmem_cache *s,
|
||||
* Compiler cannot detect this function can be removed if slab_free_hook()
|
||||
* evaluates to nothing. Thus, catch all relevant config debug options here.
|
||||
*/
|
||||
-#if defined(CONFIG_KMEMCHECK) || \
-	defined(CONFIG_LOCKDEP) || \
+#if defined(CONFIG_LOCKDEP) || \
|
||||
defined(CONFIG_DEBUG_KMEMLEAK) || \
|
||||
defined(CONFIG_DEBUG_OBJECTS_FREE) || \
|
||||
defined(CONFIG_KASAN)
|
||||
|
@ -2182,8 +2182,6 @@ sub dump_struct($$) {
|
||||
# strip comments:
|
||||
$members =~ s/\/\*.*?\*\///gos;
|
||||
$nested =~ s/\/\*.*?\*\///gos;
|
||||
# strip kmemcheck_bitfield_{begin,end}.*;
|
||||
$members =~ s/kmemcheck_bitfield_.*?;//gos;
|
||||
# strip attributes
|
||||
$members =~ s/__attribute__\s*\(\([a-z,_\*\s\(\)]*\)\)//i;
|
||||
$members =~ s/__aligned\s*\([^;]*\)//gos;
|
||||
|
@ -1,9 +1 @@
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef _LIBLOCKDEP_LINUX_KMEMCHECK_H_
#define _LIBLOCKDEP_LINUX_KMEMCHECK_H_

static inline void kmemcheck_mark_initialized(void *address, unsigned int n)
{
}

#endif