LN Koike: Linux Kernel: memory corruption

Sunday, July 19, 2015

Linux Kernel: memory corruption - debug tricks

Note: This post is meant to help you debug a memory corruption if you are building your own kernel from source.

The symptoms:

If you see one of the following errors in your in your dmesg log:

* BUG: unable to handle kernel paging request at 7f45402d
* invalid opcode: 0000 [#1] SMP
* general protection fault: 0000 [#1] SMP
* BUG: unable to handle kernel NULL pointer dereference at 0000000000000258

Then you probably have a memory corruption somewhere.

In my case, I didn't get these errors at first. I was adding objects to a list by calling list_add_tail(my_list, my_obj) and verifying that list_empty(my_list) was returning false as expected (and it was). But latter in the code when I called list_empty(my_list) again, it was returning true, and in nowhere in my code I was removing objects from the list.

Weird behaviors that doesn't make sense probably are due to memory corruption. When I started simplifying the code to isolate the problem, I start getting the errors above in the dmesg log.

Where it is happening:

When you have an error, you get a log message in dmesg like:

We will be looking at 2 infos there:
1) The IP (instruction pointer) address in the 2nd line
2) The call trace at line 23rd

The IP

In the 2nd line of this log, we have the instruction pointer address, it is the address the computer was executing that generated the error.

In this example, the IP is ffffffff817c69f0, then we can find where in the code this address corresponds to using the addr2line tool:

Where the path_to_your_kernel_tree is the path to the kernel code where you downloaded it (where you do make && make install)
The vmlinux file is the uncompressed version of the Linux image (it is created when you do make)

Note: if it doesn't work, check if your make menuconfig if your kernel is compiled with DEBUG flag

But this technique won't work if the corresponding address is not part of the kernel core code (if it was an error caused by some module, as we will see in the call trace section).

The call trace

In the log above, line 23rd, it means that the function devres_remove was called by devres_destroy, which was called by devm_kfree, which was called by vmc_cap_destroy and so on.

Now suppose that the vmc_cap_destroy function call devm_kfree in many places and you want to know exactly which one has triggered the bug. Lets do this in 2 steps:

1) Get the offset in the code section of the function using nm:
nm is a tool to get the offset of a symbol in a section:

In this example, the offset of the vmc_cap_destroy function is 0x680

2) Get the file and the line using addr2line:
In line 27th we have: vmc_cap_destroy+0x40, where the 0x40 is the offset inside the vmc_cap_destroy function where the code execution would return if the call to devm_kfree hadn't triggered the bug.

So lets add the function offset we found in the previous step with the offset inside the function: 0x680 + 0x40 = 0x6c0

And now we use the addr2line with the compile module.ko to find out where is it:

Then the buggy devm_kfree call is the first one just before the line 396 in file vmc-capture.c

Enable debug prints:

Besides the techniques above, you can enable the debug level prints in the dmesg log:

sudo sh -c "echo 8 > /proc/sys/kernel/printk"

If this doesn't work, check if the DYNAMIC_DEBUG flag is enabled in your menuconfig, if so, then check the next section about Dynamic debug.

In the case of a module that I was testing (the vivid module), I needed to change the vivid_debug parameter:

We can do this when we start the module:

sudo modprobe vivid vivid_debug=8

Or changing the parameter while its already running:

sudo sh -c "echo -n 0 > sys/module/vivid/parameters/vivid_debug"

Dynamic debug:

If your kernel is compiled with DYNAMIC_DEBUG flag, then changing the printk level probably won't enable the debug prints in the dmesg log.

Lets say the module we are working is called media, lets add it into the kernel:

sudo modprobe media

And now we will enable all the debug prints in the media module by sending 'module media +p' to the dynamic_debug/control file:

If you look at the dynamic_debug/control file content (as we did above) you can see the "=p" which means that the prints in those lines are enabled.

To disable the prints, we send 'module media -p' to the dynamic_debug/control file:

If you look at the dynamic_debug/control file content again you can see the "=_" which means that the prints in those lines are disabled.

Enabling a specific print

Instead of enabling all the debug prints of the entire module, we can enable all the prints on a specific file, or we can specify a file and a line:

$ sudo sh -c "echo -n 'file media-entity.c +p' > /sys/kernel/debug/dynamic_debug/control"

$ sudo sh -c "echo -n 'file media-entity.c line 301 +p' > /sys/kernel/debug/dynamic_debug/control"

Read dmesg log continuously

Instead of calling dmesg every time to check the log, you can do:

$ tail -f /var/log/kern.log

and the log will be printed continuously.

13 comments:

A human beingJuly 20, 2015 at 6:44:00 PM GMT-3
Thanks Helen for a very informative article! I've used a couple of the tools named but I'm really happy that I got to know new debugging techniques and tools too! I have two questions though:

Why did you decide that the problem is memory corruption?
You introduced a problem a problem that occurred to you and then provided the techniques that you followed to handle the issue. What happened to your problem at the end? I know that you solved it but I would like to know details. =D
ReplyDelete
Replies
Tomeu VizosoJune 13, 2016 at 6:02:00 AM GMT-3
CONFIG_SLUB_DEBUG_ON=y can help in some instances of memory corruption.
ReplyDelete
Replies
HuzaifaPC.NetAugust 11, 2019 at 6:08:00 AM GMT-3
This is so amazing post very interesting.Keep sharing dude.Well done.
Download Latest Version HTTP Debugger Pro Crack
ReplyDelete
Replies
svrtechnologiesDecember 17, 2019 at 8:48:00 AM GMT-3
Thanks for Sharing such an useful & informative stuff...

mule training
ReplyDelete
Replies
Jack sparrowJanuary 7, 2020 at 8:45:00 AM GMT-3

Thank you for sharing such a great information.Its really nice and informative.hope more posts from you. I also want to share some information recently i have gone through and i had find the one of the best mulesoft tutorial videos

ReplyDelete
Replies
AnonymousMay 4, 2022 at 9:25:00 PM GMT-3
mmorpg oyunlar
instagram takipçi satin al
Tiktok Jeton Hilesi
tiktok jeton hilesi
antalya saç ekimi
referans kimliği nedir
İNSTAGRAM TAKİPÇİ SATIN AL
Mt2 pvp serverler
instagram takipçi satın al
ReplyDelete
Replies
AnonymousMay 17, 2022 at 10:03:00 PM GMT-3
tül perde modelleri
sms onay
mobil ödeme bozdurma
NFT NASIL ALİNİR
ANKARA EVDEN EVE NAKLİYAT
trafik sigortası
dedektor
Kurma Website
Aşk romanları
ReplyDelete
Replies
AnonymousMay 26, 2022 at 11:40:00 AM GMT-3
Smm Panel
Smm Panel
İS İLANLARİ
İnstagram Takipçi Satın Al
hirdavatciburada.com
beyazesyateknikservisi.com.tr
servis
Tiktok Jeton Hilesi
ReplyDelete
Replies
sportsbetDecember 20, 2022 at 11:24:00 AM GMT-3
Good content. You write beautiful things.
hacklink
hacklink
vbet
sportsbet
mrbahis
taksi
vbet
mrbahis
korsan taksi
ReplyDelete
Replies
onurJuly 24, 2023 at 9:34:00 PM GMT-3
hatay
kars
mardin
samsun
urfa
D7U6Q
ReplyDelete
Replies
halisAugust 2, 2023 at 7:03:00 PM GMT-3
bitlis
edirne
hatay
ağrı
urfa

ZZ8O
ReplyDelete
Replies
İlaydaAugust 31, 2023 at 10:12:00 PM GMT-3
https://saglamproxy.com
metin2 proxy
proxy satın al
knight online proxy
mobil proxy satın al
ULGH
ReplyDelete
Replies

Add comment

Pages