Linux kernel live debugging: how is it done and what tools are used?

不知归路 2020-11-28 01:58

What are the most common (and, why not, the uncommon) methods and tools used for live debugging of the Linux kernel? I know that Linus, for example, is against this kind of debugging fo

11 answers
  • 2020-11-28 02:10

    KGDB + QEMU step-by-step

    KGDB is a kernel subsystem that allows you to step debug the kernel itself from a host GDB.

    My QEMU + Buildroot example is a good way to get a taste of it without real hardware: https://github.com/cirosantilli/linux-kernel-module-cheat/tree/1969cd6f8d30dace81d9848c6bacbb8bad9dacd8#kgdb

    Pros and cons vs other methods:

    • advantages vs QEMU:
      • you often don't have software emulation for your device, as hardware vendors don't like to release accurate software models for their devices
      • real hardware is much faster than QEMU
    • advantage vs JTAG: no need for extra JTAG hardware, easier to set up
    • disadvantages vs QEMU and JTAG: less visibility and more intrusive. KGDB relies on certain parts of the kernel working in order to communicate with the host, so e.g. it breaks down on a panic, and you can't view the boot sequence.

    The main steps are:

    1. Compile the kernel with:

      CONFIG_DEBUG_KERNEL=y
      CONFIG_DEBUG_INFO=y
      
      CONFIG_CONSOLE_POLL=y
      CONFIG_KDB_CONTINUE_CATASTROPHIC=0
      CONFIG_KDB_DEFAULT_ENABLE=0x1
      CONFIG_KDB_KEYBOARD=y
      CONFIG_KGDB=y
      CONFIG_KGDB_KDB=y
      CONFIG_KGDB_LOW_LEVEL_TRAP=y
      CONFIG_KGDB_SERIAL_CONSOLE=y
      CONFIG_KGDB_TESTS=y
      CONFIG_KGDB_TESTS_ON_BOOT=n
      CONFIG_MAGIC_SYSRQ=y
      CONFIG_MAGIC_SYSRQ_DEFAULT_ENABLE=0x1
      CONFIG_SERIAL_KGDB_NMI=n
      

      Most of those are not mandatory, but this is what I've tested.

    2. Add to your QEMU command:

      -append 'kgdbwait kgdboc=ttyS0,115200' \
      -serial tcp::1234,server,nowait
      
    3. Run GDB from the root of the Linux kernel source tree with:

      gdb -ex 'file vmlinux' -ex 'target remote localhost:1234'
      
    4. In GDB:

      (gdb) c
      

      and the boot should finish.

    5. In QEMU:

      echo g > /proc/sysrq-trigger
      

      And GDB should break.

    6. Now we are done; you can use GDB as usual:

      b sys_write
      c
      

    Tested in Ubuntu 14.04.
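
    The configuration from step 1 can be checked before booting. The following is a minimal sketch, not part of the tested setup: the function name missing_config_options is hypothetical, and it assumes it is pointed at the .config in the root of the kernel build tree. It prints every requested option line that is absent from the file.

```shell
#!/bin/sh
# Print each wanted "CONFIG_FOO=y" line that is NOT present in a kernel
# .config file. The function name is illustrative, not a kernel tool.
#   $1     path to the .config file
#   $2...  option lines to look for, e.g. CONFIG_KGDB=y
missing_config_options() {
    config_file=$1
    shift
    for want in "$@"; do
        # -q quiet, -x match the whole line, -F treat the pattern literally
        grep -qxF -- "$want" "$config_file" || printf '%s\n' "$want"
    done
}
```

    For example, missing_config_options .config CONFIG_KGDB=y CONFIG_KGDB_SERIAL_CONSOLE=y CONFIG_DEBUG_INFO=y CONFIG_MAGIC_SYSRQ=y prints nothing when all four options are set. Disabled options usually appear in .config as "# CONFIG_... is not set", so this exact-match check only suits the =y entries.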

    KGDB + Raspberry Pi

    The exact same setup as above almost worked on a Raspberry Pi 2, Raspbian Jessie 2016-05-27.

    You just have to learn to do the QEMU steps on the Pi, which are easily Googlable:

    • add the configuration options and recompile the kernel as explained at https://www.raspberrypi.org/documentation/linux/kernel/building.md (the default kernel build unfortunately lacks some options, notably debug symbols, so the recompile is needed).

    • edit cmdline.txt of the boot partition and add:

      kgdbwait kgdboc=ttyAMA0,115200
      
    • connect gdb to the serial with:

      arm-linux-gnueabihf-gdb -ex 'file vmlinux' -ex 'target remote /dev/ttyUSB0'
      

      If you are not familiar with serial connections, check out this video: https://www.youtube.com/watch?v=da5Q7xL_OTo All you need is a cheap USB-to-serial adapter. Make sure you can get a shell through the serial port, to confirm that it works, before trying out KGDB.

    • do:

      echo g | sudo tee /proc/sysrq-trigger
      

      from inside an SSH session, since the serial is already taken by GDB.

    With this setup, I was able to put a breakpoint in sys_write, pause program execution, list source and continue.

    However, sometimes when I ran next in sys_write, GDB just hung and printed this error message several times:

    Ignoring packet error, continuing...
    

    so I'm not sure if something is wrong with my setup, or if this is expected because of what some background process is doing in the more complex Raspbian image.

    I've also been told to try and disable multiprocessing with the Linux boot options, but I haven't tried it yet.
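
    Since cmdline.txt must stay a single line, the edit can be scripted rather than done by hand. A minimal sketch, assuming the parameters from the list above; the function name with_kgdb_params is hypothetical, and the /boot/cmdline.txt path in the usage note is an assumption about where the boot partition is mounted:

```shell
#!/bin/sh
# Print a kernel command line with the KGDB parameters appended, unless a
# kgdboc= parameter is already there. The function name is illustrative.
#   $1  current command line (one line, e.g. the content of cmdline.txt)
#   $2  parameters to append (defaults to the ones used in this answer)
with_kgdb_params() {
    cmdline=$1
    params=${2:-kgdbwait kgdboc=ttyAMA0,115200}
    # Pad with spaces so the pattern only matches a whole parameter.
    case " $cmdline " in
        *" kgdboc="*) printf '%s\n' "$cmdline" ;;
        *)            printf '%s %s\n' "$cmdline" "$params" ;;
    esac
}
```

    with_kgdb_params "$(cat /boot/cmdline.txt)" prints the new line for review before it is written back. Extra parameters can be passed as the second argument, for instance to also try a single-CPU boot with maxcpus=1, which is untested in this setup.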

  • 2020-11-28 02:14

    As someone who writes a lot of kernel code, I have to say I have never used kgdb, and only rarely use kprobes and the like.

    It is still often the best approach to throw in some strategic printks. In more recent kernels trace_printk is a good way to do that without spamming dmesg.
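
    trace_printk output goes to the ftrace ring buffer rather than to dmesg, so it is read back through tracefs. A minimal sketch for locating that file, assuming the two usual mount points (/sys/kernel/tracing, or /sys/kernel/debug/tracing via debugfs); the function name find_trace_file and its root-prefix argument are hypothetical:

```shell
#!/bin/sh
# Print the path of the ftrace "trace" file, where trace_printk() output
# ends up. Returns non-zero if neither usual location exists.
#   $1  optional root prefix, only there so the function can be tried
#       against a fake directory tree (an illustrative convenience)
find_trace_file() {
    root=${1:-}
    for dir in /sys/kernel/tracing /sys/kernel/debug/tracing; do
        if [ -e "$root$dir/trace" ]; then
            printf '%s\n' "$root$dir/trace"
            return 0
        fi
    done
    return 1
}
```

    As root, cat "$(find_trace_file)" then shows the buffered trace_printk lines.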

  • 2020-11-28 02:18

    You guys are wrong: kgdb still works well with the latest kernels. You just need to take care of the kernel configuration options for split images, randomization and optimization.
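
    On the randomization point: with KASLR enabled, the addresses in vmlinux do not match the running kernel, so breakpoints set from GDB can miss. One common way around that is booting the target with nokaslr. A minimal sketch of a check for it, with a hypothetical function name, assuming the command line is read from /proc/cmdline on the target:

```shell
#!/bin/sh
# Succeed (exit status 0) if the given kernel command line contains the
# "nokaslr" parameter. The function name is illustrative.
#   $1  kernel command line, e.g. the content of /proc/cmdline
kaslr_disabled() {
    # Pad with spaces so only a whole "nokaslr" word matches.
    case " $1 " in
        *" nokaslr "*) return 0 ;;
        *)             return 1 ;;
    esac
}
```

    On the target, kaslr_disabled "$(cat /proc/cmdline)" || echo "KASLR may be active" gives a quick warning before attaching GDB.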

    kgdb over a serial port is useless because no computer today has a DB9 serial port on the motherboard, and USB serial ports don't support polling mode.

    The new game is kgdboe (KGDB over Ethernet). The following log trace is from the host machine; vmlinux is from the target machine:

    root@Thinkpad-T510:~/KGDBOE# gdb vmlinux
    Reading symbols from vmlinux...done.
    (gdb) target remote udp:192.168.1.22:31337
    1077    kernel/debug/debug_core.c: No such file or directory.
    (gdb) l oom_kill_process 
    828 mm/oom_kill.c: No such file or directory.
    (gdb) l oom_kill_process 
    828 in mm/oom_kill.c
    (gdb) break oom_kill_process
    Breakpoint 1 at 0xffffffff8119e0c0: file mm/oom_kill.c, line 833.
    (gdb) c
    Continuing.
    [New Thread 1779]
    [New Thread 1782]
    [New Thread 1777]
    [New Thread 1778]
    [New Thread 1780]
    [New Thread 1781]
    [Switching to Thread 1779]
    
    Thread 388 hit Breakpoint 1, oom_kill_process (oc=0xffffc90000d93ce8, message=0xffffffff82098fbc "Out of memory")
    at mm/oom_kill.c:833
    833 in mm/oom_kill.c
    (gdb) s
    834 in mm/oom_kill.c
    (gdb) 
    

    On the target machine, the following forces an out-of-memory condition so that the breakpoint is hit and captured by the host machine:

    #swapoff -a
    #stress -m 4 --vm-bytes=500m
    
  • 2020-11-28 02:19

    Actually the joke is that Linux has had an in-kernel debugger since 2.2.12, xmon, but only for the powerpc architecture (actually it was ppc back then).

    It's not a source level debugger, and it's almost entirely undocumented, but still.

    http://lxr.linux.no/linux-old+v2.2.12/arch/ppc/xmon/xmon.c#L119

  • 2020-11-28 02:21

    kgdb and gdb are almost useless for debugging the kernel because the code is so optimised that it bears no relation to the original source, and many variables are optimised out. Stepping through the source is impossible and examining variables is impossible, so the whole exercise is almost pointless.

    Actually it is worse than useless: it gives you false information, so detached is the code you are looking at from the actual running code.

    And no, you can't turn off optimisations in the kernel; it doesn't compile.

    I have to say, coming from a Windows kernel environment, the lack of a decent debugger is annoying, given that there is junk code out there to maintain.

  • 2020-11-28 02:22

    According to the wiki, kgdb was merged into the kernel in 2.6.26, which is within the last few years. kgdb is a remote debugger, so you activate it in your kernel and then attach gdb to it somehow. I say somehow because there seem to be lots of options (see connecting gdb). Given that kgdb is now in the source tree, I'd say going forward this is what you want to be using.

    So it looks like Linus gave in. However, I would emphasize his argument: you should know what you're doing and know the system well. This is kernel land. If anything goes wrong, you don't get a segfault; you get anything from some obscure problem later on to the whole system coming down. Here be dragons. Proceed with care, you have been warned.
