Watch a variable (memory address) change in Linux kernel, and print stack trace when it changes?

野性不改 2020-12-03 09:24

I would like to somehow "watch" a variable (or a memory address, rather) in the Linux kernel (a kernel module/driver, to be exact); and find out what changed it - basicall

2 answers
  • 2020-12-03 09:40

    You need hardware support for this. The CPU needs to sense when a certain memory address gets written to and call some code - an interrupt or exception handler. In my experience, I've seen this on the PowerPC platform but not on x86. It's called a hardware watchpoint.

    Theoretically, if you run in an emulator, you could simulate this behaviour, but I am completely unfamiliar with the currently existing emulators.

    EDIT: I've dug a little more, and it seems there is a general-purpose hw breakpoint interface in Linux, and that x86 does have debug registers for this (DR0-DR3 hold the watched addresses; DR7 is the control register). Look at the functions in 'include/linux/hw_breakpoint.h'. It looks like ptrace and/or perf use these interfaces. Good luck debugging it!
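
    As a rough user-space illustration of that interface: the perf_event_open syscall accepts a PERF_TYPE_BREAKPOINT event, which ends up programming the same debug registers. The sketch below only shows how such a request could be described; the helper names fill_write_watch and open_write_watch are made up for this example, and whether the open succeeds depends on the machine (perf_event_paranoid setting, virtualization, free debug registers).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/perf_event.h>
#include <linux/hw_breakpoint.h>

/* Hypothetical helper: describe a 4-byte write watchpoint at addr. */
void fill_write_watch(struct perf_event_attr *attr, const volatile void *addr)
{
    assert(addr != NULL);
    memset(attr, 0, sizeof(*attr));
    attr->type = PERF_TYPE_BREAKPOINT;   /* hardware breakpoint event */
    attr->size = sizeof(*attr);
    attr->bp_type = HW_BREAKPOINT_W;     /* trigger on writes only */
    attr->bp_addr = (uintptr_t)addr;     /* address to watch */
    attr->bp_len = HW_BREAKPOINT_LEN_4;  /* watch 4 bytes */
    attr->exclude_kernel = 1;            /* user-space accesses only */
    attr->exclude_hv = 1;
}

/* Hypothetical helper: open a counting event on the calling thread.
 * Returns a file descriptor, or -1 if the kernel refuses the request. */
int open_write_watch(const volatile void *addr)
{
    struct perf_event_attr attr;

    fill_write_watch(&attr, addr);
    /* pid = 0 (this thread), cpu = -1 (any CPU), no group, no flags */
    return (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0);
}
```

    If open_write_watch returns a valid descriptor, reading 8 bytes from it gives the number of times the watchpoint has fired so far; a return value of -1 just means this route is not available on that system.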

  • 2020-12-03 09:56

    Many thanks for the replies by @CosminRatiu and Eugene; thanks to those, I found:

    • debugging - Linux kernel hardware break points - Stack Overflow
    • Hardware Breakpoint (or watchpoint) - The Linux Kernel Archives

    ... with which I could develop the example I'm posting here, the testhrarr.c kernel module/driver and the Makefile (below). It demonstrates that hardware watchpoint tracing can be achieved in two ways: either using the perf program, which can probe the driver unchanged; or by adding some hardware breakpoint code to the driver (in the example, guarded by the HWDEBUG_STACK preprocessor define).

    Essentially, debugging the contents of standard atomic variable types like int (like the runcount variable) is straightforward, as long as they are defined as global variables in the kernel module, so that they end up visible as kernel symbols. Because of that, the code below adds testhrarr_ as a prefix to the variables (so as to avoid naming conflicts). However, debugging the contents of arrays may be a bit trickier, due to the need for dereferencing - so that is what this post demonstrates: debugging the first byte of the testhrarr_arr array. It was done on:

    $ echo `cat /etc/lsb-release` 
    DISTRIB_ID=Ubuntu DISTRIB_RELEASE=11.04 DISTRIB_CODENAME=natty DISTRIB_DESCRIPTION="Ubuntu 11.04"
    $ uname -a
    Linux mypc 2.6.38-16-generic #67-Ubuntu SMP Thu Sep 6 18:00:43 UTC 2012 i686 i686 i386 GNU/Linux
    $ cat /proc/cpuinfo | grep "model name"
    model name  : Intel(R) Atom(TM) CPU N450   @ 1.66GHz
    model name  : Intel(R) Atom(TM) CPU N450   @ 1.66GHz
    

    The testhrarr module basically allocates memory for a small array upon module initialization, sets up a timer function, and exposes a /proc/testhrarr_proc file (using the newer proc_create interface). Then, attempting to read from the /proc/testhrarr_proc file (say, using cat) will trigger the timer function, which will modify the testhrarr_arr array values, and dump messages to /var/log/syslog. We expect that testhrarr_arr[0] will change three times during the operation; once in testhrarr_startup, and twice in testhrarr_timer_function (due to wrapping).
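
    To see where the "three times" comes from, the writes can be replayed in plain user-space C. This is only a model of the update rule used in the driver (arr[runcount % ARRSIZE] += runcount, for ten runs), not kernel code; the function name simulate_run is made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>

#define ARRSIZE 5
#define MAXRUNS (2 * ARRSIZE)

/* Replays the write in testhrarr_startup plus the writes done by the
 * timer function, and returns how many of them hit arr[0]. */
int simulate_run(int arr[ARRSIZE])
{
    int writes_to_first = 0;
    int runcount;

    arr[0] = 0;                 /* the write in testhrarr_startup */
    writes_to_first++;

    for (runcount = 0; runcount < MAXRUNS; runcount++) {
        int idx = runcount % ARRSIZE;

        arr[idx] += runcount;   /* the write in the timer function */
        if (idx == 0)
            writes_to_first++;  /* happens for runcount 0 and 5 */
    }
    return writes_to_first;
}
```

    Starting from a zeroed array, this gives three writes to arr[0] and a final content of 5, 7, 9, 11, 13, which matches the array dump in the syslog output.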

    using perf

    After building the module with make, you can load it with:

    sudo insmod ./testhrarr.ko
    

    At that point, /var/log/syslog would contain:

    kernel: [40277.199913] Init testhrarr: 0 ; HZ: 250 ; 1/HZ (ms): 4 ; hrres: 0.000000001
    kernel: [40277.199930]  Addresses: _runcount 0xf84be22c ; _arr 0xf84be2a0 ; _arr[0] 0xed182a80 (0xed182a80) ; _timer_function 0xf84bc1c3 ; my_hrtimer 0xf84be260; my_hrt.f 0xf84be27c
    kernel: [40277.220329] HW Breakpoint for testhrarr_arr write installed (0xf84be2a0)
    

    Note that just passing testhrarr_arr as the symbol for the hardware watchpoint watches the address of that pointer variable (0xf84be2a0), not the address of the first element of the array (0xed182a80)! Because of this, the hardware breakpoint is not going to trigger during the run - so the behavior will be the same as if the hardware breakpoint code were not present at all (which can be achieved by undefining HWDEBUG_STACK)!
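
    The difference between the two addresses is plain C pointer semantics, and can be shown in user space. A minimal sketch, where demo_arr stands in for testhrarr_arr (all names here are made up for the illustration):

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct watch_candidates {
    uintptr_t pointer_variable; /* &arr: where the pointer itself is stored */
    uintptr_t first_element;    /* &arr[0]: where the array data lives */
};

int *demo_arr; /* stands in for testhrarr_arr */

/* Allocates the array and records both addresses; returns 0 on success. */
int collect_candidates(struct watch_candidates *out)
{
    demo_arr = calloc(5, sizeof(*demo_arr));
    if (demo_arr == NULL)
        return -1;
    out->pointer_variable = (uintptr_t)&demo_arr;
    out->first_element = (uintptr_t)&demo_arr[0];
    return 0;
}
```

    A write like demo_arr[0] = 0 only touches first_element; the location at pointer_variable changes only when the pointer itself is reassigned, which is why a watchpoint placed there stays silent while the array contents change.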

    So, even without a hardware breakpoint set through kernel module code, we can still use perf to observe a change at a memory address - in perf, we specify both the address we want to watch (here the address of the first element of testhrarr_arr, 0xed182a80), and the process which should be run: here we run bash, so we can execute a cat /proc/testhrarr_proc which will trigger the kernel module timer, followed by a sleep 0.5 which will allow the timer to complete. The -a parameter is also needed, otherwise some events may be missed:

    $ sudo perf record -a --call-graph --event=mem:0xed182a80:w bash -c 'cat /proc/testhrarr_proc ; sleep 0.5'
    testhrarr proc: startup
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 0.485 MB perf.data (~21172 samples) ]
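
    Since the address of the array changes on every module load, the event specification has to be rebuilt each time from the address printed in the syslog. A small sketch of that formatting step, following the mem:ADDRESS:ACCESS form used in the command above (format_mem_event is a made-up helper name):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Writes a perf breakpoint event spec such as "mem:0xed182a80:w" into buf.
 * access is a combination of r, w and x. Returns 0 on success, or -1 if
 * the buffer is too small. */
int format_mem_event(char *buf, size_t len, unsigned long addr,
                     const char *access)
{
    int n = snprintf(buf, len, "mem:0x%lx:%s", addr, access);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

    For the address from the syslog above and write access, this produces the same string that was passed to --event.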
    

    At this point, /var/log/syslog would also contain something like:

    
    [40822.114964]  testhrarr_timer_function: testhrarr_runcount 0 
    [40822.114980]  testhrarr jiffies 10130528 ; ret: 1 ; ktnsec: 40822114975062
    [40822.118956]  testhrarr_timer_function: testhrarr_runcount 1 
    [40822.118977]  testhrarr jiffies 10130529 ; ret: 1 ; ktnsec: 40822118973195
    [40822.122940]  testhrarr_timer_function: testhrarr_runcount 2 
    [40822.122956]  testhrarr jiffies 10130530 ; ret: 1 ; ktnsec: 40822122951143
    [40822.126962]  testhrarr_timer_function: testhrarr_runcount 3 
    [40822.126978]  testhrarr jiffies 10130531 ; ret: 1 ; ktnsec: 40822126973583
    [40822.130941]  testhrarr_timer_function: testhrarr_runcount 4 
    [40822.130961]  testhrarr jiffies 10130532 ; ret: 1 ; ktnsec: 40822130955167
    [40822.134940]  testhrarr_timer_function: testhrarr_runcount 5 
    [40822.134962]  testhrarr jiffies 10130533 ; ret: 1 ; ktnsec: 40822134958888
    [40822.138936]  testhrarr_timer_function: testhrarr_runcount 6 
    [40822.138958]  testhrarr jiffies 10130534 ; ret: 1 ; ktnsec: 40822138955693
    [40822.142940]  testhrarr_timer_function: testhrarr_runcount 7 
    [40822.142962]  testhrarr jiffies 10130535 ; ret: 1 ; ktnsec: 40822142959345
    [40822.146936]  testhrarr_timer_function: testhrarr_runcount 8 
    [40822.146957]  testhrarr jiffies 10130536 ; ret: 1 ; ktnsec: 40822146954479
    [40822.150949]  testhrarr_timer_function: testhrarr_runcount 9 
    [40822.150970]  testhrarr jiffies 10130537 ; ret: 1 ; ktnsec: 40822150963438
    [40822.154974]  testhrarr_timer_function: testhrarr_runcount 10 
    [40822.154988] testhrarr [ 5, 7, 9, 11, 13, ]
    

    To read the capture of perf (a file called perf.data) we can use:

    $ sudo perf report --call-graph flat --stdio
    No kallsyms or vmlinux with build-id 5031df4d8668bcc45a7bdb4023909c6f8e2d3d34 was found
    [testhrarr] with build id 5031df4d8668bcc45a7bdb4023909c6f8e2d3d34 not found, continuing without symbols
    Failed to open /bin/cat, continuing without symbols
    Failed to open /usr/lib/libpixman-1.so.0.20.2, continuing without symbols
    Failed to open /usr/lib/xorg/modules/drivers/intel_drv.so, continuing without symbols
    Failed to open /usr/bin/Xorg, continuing without symbols
    # Events: 5  unknown
    #
    # Overhead  Command  Shared Object                                Symbol
    # ........  .......  .............  ....................................
    #
        87.50%     Xorg  [testhrarr]    [k] testhrarr_timer_function
                87.50%
                    testhrarr_timer_function
                    __run_hrtimer
                    hrtimer_interrupt
                    smp_apic_timer_interrupt
                    apic_timer_interrupt
                    0x30185d
                    0x2ed701
                    0x2ed8cc
                    0x2edba0
                    0x9d0386
                    0x8126fc8
                    0x81217a1
                    0x811bdd3
                    0x8070aa7
                    0x806281c
                    __libc_start_main
                    0x8062411
    
         6.25%      cat  [testhrarr]    [k] testhrarr_timer_function
                 6.25%
                    testhrarr_timer_function
                    testhrarr_proc_show
                    seq_read
                    proc_reg_read
                    vfs_read
                    sys_read
                    syscall_call
                    0xaa2416
                    0x8049f4d
                    __libc_start_main
                    0x8049081
    
         3.12%  swapper  [testhrarr]    [k] testhrarr_timer_function
                 3.12%
                    testhrarr_timer_function
                    __run_hrtimer
                    hrtimer_interrupt
                    smp_apic_timer_interrupt
                    apic_timer_interrupt
                    cpuidle_idle_call
                    cpu_idle
                    start_secondary
    
         3.12%      cat  [testhrarr]    [k] 0x356   
                 3.12%
                    0xf84bc356
                    0xf84bc3a7
                    seq_read
                    proc_reg_read
                    vfs_read
                    sys_read
                    syscall_call
                    0xaa2416
                    0x8049f4d
                    __libc_start_main
                    0x8049081
    
    
    
    #
    # (For a higher level overview, try: perf report --sort comm,dso)
    #
    

    So, since we're building the kernel module with debugging on (-g in the Makefile), it is not a problem for perf to find this module's symbols, even if the live kernel is not a debug kernel. So it correctly interprets testhrarr_timer_function as the setter most of the time, although it doesn't report testhrarr_startup (but it does report testhrarr_proc_show which calls it). There are also references to 0xf84bc3a7 and 0xf84bc356 which it couldn't resolve; however, note that the module is loaded at 0xf84bc000:

    $ sudo cat /proc/modules | grep testhr
    testhrarr 13433 0 - Live 0xf84bc000
    

    ... and that entry also starts with ...[k] 0x356; and if we look in the objdump of the kernel module:

    $ objdump -S testhrarr.ko | less
    ...
    00000323 <testhrarr_startup>:
    
    static void testhrarr_startup(void)
    {
    ...
        testhrarr_arr[0] = 0; //just the first element
     34b:   a1 80 00 00 00          mov    0x80,%eax
     350:   c7 00 00 00 00 00       movl   $0x0,(%eax)
        hrtimer_start(&my_hrtimer, ktime_period_ns, HRTIMER_MODE_REL);
     356:   c7 04 24 01 00 00 00    movl   $0x1,(%esp)                     **********
     35d:   8b 15 1c 00 00 00       mov    0x1c,%edx
    ...
    00000375 <testhrarr_proc_show>:
    
    
    static int testhrarr_proc_show(struct seq_file *m, void *v) {
    ...
        seq_printf(m, "testhrarr proc: startup\n");
     38f:   c7 44 24 04 79 00 00    movl   $0x79,0x4(%esp)
     396:   00 
     397:   8b 45 fc                mov    -0x4(%ebp),%eax
     39a:   89 04 24                mov    %eax,(%esp)
     39d:   e8 fc ff ff ff          call   39e 
        testhrarr_startup();
     3a2:   e8 7c ff ff ff          call   323 <testhrarr_startup>
     3a7:   eb 1c                   jmp    3c5   **********
      } else {
        seq_printf(m, "testhrarr proc: (is running, %d)\n", testhrarr_runcount);
     3a9:   a1 0c 00 00 00          mov    0xc,%eax
    ...
    

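    ... so 0xf84bc356 is the instruction right after the write to testhrarr_arr[0] in testhrarr_startup (where the hrtimer_start call is being set up), which fits, since an x86 data breakpoint is reported after the writing instruction has completed; and 0xf84bc3a7 -> 3a7 is the return address in its caller, testhrarr_proc_show; which thankfully makes sense. (Note that with different versions of the driver, I've seen testhrarr_startup resolved by name while testhrarr_timer_function was shown only as raw addresses; I'm not sure what this is due to.)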
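
    The address arithmetic used here (runtime address minus module load address gives the offset shown by objdump) can be written down as a small helper. This is just a sketch; module_offset is a made-up name, and the base and size values are the ones reported in /proc/modules above.

```c
#include <assert.h>
#include <stdint.h>

/* Converts a runtime address into an offset inside a loaded module.
 * base and size are the load address and size from /proc/modules.
 * Returns 0 and stores the offset, or -1 if addr is outside the module. */
int module_offset(uint64_t addr, uint64_t base, uint64_t size,
                  uint64_t *offset)
{
    if (addr < base || addr - base >= size)
        return -1;
    *offset = addr - base;
    return 0;
}
```

    With base 0xf84bc000 and size 13433, the unresolved addresses 0xf84bc356 and 0xf84bc3a7 map to offsets 0x356 and 0x3a7, while the array address 0xed182a80 is rejected, since it lies in allocated memory outside the module image.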

    One problem with perf, though, is that perf report gives me a statistical "Overhead" for these functions (as far as I can tell, the share of recorded samples attributed to each entry) - but what I really want is a sequential log of stack traces. I'm not sure whether perf can be set up for that - but it can definitely be done with kernel module code for hardware breakpoints.

    using kernel module HW breakpoint

    The code inside the HWDEBUG_STACK guard implements the HW breakpoint setup and handling. As noted, the default symbol in ksym_name (if nothing is specified) is testhrarr_arr, which doesn't trigger the hardware breakpoint at all. The symbol can be specified through the ksym parameter on the command line during insmod; here we can note that:

    $ sudo rmmod testhrarr    # remove module if still loaded
    $ sudo insmod ./testhrarr.ko ksym=testhrarr_arr[0]
    

    ... results in a HW Breakpoint for testhrarr_arr[0] write installed (0x (null)) message in /var/log/syslog - which means we cannot use symbols with bracket notation for array access; thankfully, a null pointer here simply means that the HW breakpoint will again not fire; it doesn't crash the OS completely :)
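
    The reason for the null address is that the lookup works on symbol names only: as far as I can tell, kallsyms_lookup_name compares the whole string against the symbol table and returns 0 when nothing matches, it does not evaluate C expressions. A toy user-space model of that behavior (struct sym and lookup_symbol are made up for this sketch, and the table is filled by hand):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One entry of a hand-made symbol table (toy stand-in for kallsyms). */
struct sym {
    const char *name;
    uintptr_t addr;
};

/* Exact-name lookup; returns 0 when the name is not in the table. */
uintptr_t lookup_symbol(const struct sym *table, size_t n, const char *name)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (strcmp(table[i].name, name) == 0)
            return table[i].addr;
    }
    return 0;
}
```

    With a table that contains testhrarr_arr at 0xf84be2a0, looking up "testhrarr_arr" returns that address, while "testhrarr_arr[0]" returns 0, because no symbol carries that literal name.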

    There is, however, a global variable made to refer to the first element of the testhrarr_arr array, called testhrarr_arr_first - note how this global variable is specially handled in the code, and needs to be dereferenced, so that the correct address is obtained. So we do:

    $ sudo rmmod testhrarr    # remove module if still loaded
    $ sudo insmod ./testhrarr.ko ksym=testhrarr_arr_first
    

    ... and the syslog informs:

    kernel: [43910.509726] Init testhrarr: 0 ; HZ: 250 ; 1/HZ (ms): 4 ; hrres: 0.000000001
    kernel: [43910.509765]  Addresses: _runcount 0xf84be22c ; _arr 0xf84be2a0 ; _arr[0] 0xedf6c5c0 (0xedf6c5c0) ; _timer_function 0xf84bc1c3 ; my_hrtimer 0xf84be260; my_hrt.f 0xf84be27c
    kernel: [43910.538535] HW Breakpoint for testhrarr_arr_first write installed (0xedf6c5c0)
    

    ... and we can see that the HW breakpoint is set at 0xedf6c5c0, which is the address of testhrarr_arr[0]. Now if we trigger the driver via the /proc file:

    $ cat /proc/testhrarr_proc 
    testhrarr proc: startup
    

    ... we obtain in syslog:

    kernel: [44069.735695] testhrarr_arr_first value is changed
    [44069.735711] Pid: 29320, comm: cat Not tainted 2.6.38-16-generic #67-Ubuntu
    [44069.735719] Call Trace:
    [44069.735737]  [] ? sample_hbp_handler+0x2d/0x3b [testhrarr]
    [44069.735755]  [] ? __perf_event_overflow+0x90/0x240
    [44069.735768]  [] ? proc_alloc_inode+0x23/0x90
    [44069.735778]  [] ? proc_alloc_inode+0x23/0x90
    [44069.735790]  [] ? perf_swevent_event+0x136/0x140
    [44069.735801]  [] ? perf_bp_event+0x70/0x80
    [44069.735812]  [] ? prep_new_page+0x110/0x1a0
    [44069.735824]  [] ? get_page_from_freelist+0x12e/0x320
    [44069.735836]  [] ? seq_open+0x3d/0xa0
    [44069.735848]  [] ? hw_breakpoint_handler.clone.0+0x102/0x130
    [44069.735861]  [] ? hw_breakpoint_exceptions_notify+0x22/0x30
    [44069.735872]  [] ? notifier_call_chain+0x45/0x60
    [44069.735883]  [] ? atomic_notifier_call_chain+0x22/0x30
    [44069.735894]  [] ? notify_die+0x2d/0x30
    [44069.735904]  [] ? do_debug+0x88/0x180
    [44069.735915]  [] ? debug_stack_correct+0x30/0x38
    [44069.735928]  [] ? testhrarr_startup+0x33/0x52 [testhrarr]
    [44069.735940]  [] ? testhrarr_proc_show+0x32/0x57 [testhrarr]
    [44069.735952]  [] ? seq_read+0x145/0x390
    [44069.735963]  [] ? seq_read+0x0/0x390
    [44069.735973]  [] ? proc_reg_read+0x64/0xa0
    [44069.735985]  [] ? vfs_read+0x9f/0x160
    [44069.735995]  [] ? proc_reg_read+0x0/0xa0
    [44069.736003]  [] ? sys_read+0x42/0x70
    [44069.736013]  [] ? syscall_call+0x7/0xb
    [44069.736019] Dump stack from sample_hbp_handler
    [44069.740132]  testhrarr_timer_function: testhrarr_runcount 0 
    [44069.740146]  testhrarr jiffies 10942435 ; ret: 1 ; ktnsec: 44069740142485
    [44069.740159] testhrarr_arr_first value is changed
    [44069.740169] Pid: 4302, comm: gnome-terminal Not tainted 2.6.38-16-generic #67-Ubuntu
    [44069.740176] Call Trace:
    [44069.740195]  [] ? sample_hbp_handler+0x2d/0x3b [testhrarr]
    [44069.740213]  [] ? __perf_event_overflow+0x90/0x240
    [44069.740227]  [] ? perf_swevent_event+0x136/0x140
    [44069.740239]  [] ? perf_bp_event+0x70/0x80
    [44069.740253]  [] ? sched_clock_local+0xd3/0x1c0
    [44069.740267]  [] ? format_decode+0x323/0x380
    [44069.740280]  [] ? hw_breakpoint_handler.clone.0+0x102/0x130
    [44069.740292]  [] ? hw_breakpoint_exceptions_notify+0x22/0x30
    [44069.740302]  [] ? notifier_call_chain+0x45/0x60
    [44069.740313]  [] ? atomic_notifier_call_chain+0x22/0x30
    [44069.740324]  [] ? notify_die+0x2d/0x30
    [44069.740335]  [] ? do_debug+0x88/0x180
    [44069.740345]  [] ? debug_stack_correct+0x30/0x38
    [44069.740364]  [] ? init_intel_cacheinfo+0x103/0x394
    [44069.740379]  [] ? testhrarr_timer_function+0xed/0x160 [testhrarr]
    [44069.740391]  [] ? __run_hrtimer+0x6f/0x190
    [44069.740404]  [] ? testhrarr_timer_function+0x0/0x160 [testhrarr]
    [44069.740416]  [] ? hrtimer_interrupt+0x108/0x240
    [44069.740430]  [] ? smp_apic_timer_interrupt+0x56/0x8a
    [44069.740441]  [] ? apic_timer_interrupt+0x31/0x38
    [44069.740453]  [] ? _raw_spin_unlock_irqrestore+0x15/0x20
    [44069.740465]  [] ? try_to_del_timer_sync+0x67/0xb0
    [44069.740476]  [] ? del_timer_sync+0x29/0x50
    [44069.740486]  [] ? flush_delayed_work+0x13/0x40
    [44069.740500]  [] ? tty_flush_to_ldisc+0x12/0x20
    [44069.740510]  [] ? n_tty_poll+0x4f/0x190
    [44069.740523]  [] ? tty_poll+0x6d/0x90
    [44069.740531]  [] ? n_tty_poll+0x0/0x190
    [44069.740542]  [] ? do_poll.clone.3+0xd0/0x210
    [44069.740553]  [] ? do_sys_poll+0x134/0x1e0
    [44069.740563]  [] ? __pollwait+0x0/0xd0
    [44069.740572]  [] ? pollwake+0x0/0x60
    ...
    [44069.740742]  [] ? pollwake+0x0/0x60
    [44069.740757]  [] ? rw_verify_area+0x6c/0x130
    [44069.740770]  [] ? ktime_get_ts+0xf8/0x120
    [44069.740781]  [] ? poll_select_set_timeout+0x64/0x70
    [44069.740793]  [] ? sys_poll+0x5a/0xd0
    [44069.740804]  [] ? syscall_call+0x7/0xb
    [44069.740815]  [] ? init_intel_cacheinfo+0x23/0x394
    [44069.740822] Dump stack from sample_hbp_handler
    [44069.744130]  testhrarr_timer_function: testhrarr_runcount 1 
    [44069.744143]  testhrarr jiffies 10942436 ; ret: 1 ; ktnsec: 44069744140055
    [44069.748132]  testhrarr_timer_function: testhrarr_runcount 2 
    [44069.748145]  testhrarr jiffies 10942437 ; ret: 1 ; ktnsec: 44069748141271
    [44069.752131]  testhrarr_timer_function: testhrarr_runcount 3 
    [44069.752145]  testhrarr jiffies 10942438 ; ret: 1 ; ktnsec: 44069752141164
    [44069.756131]  testhrarr_timer_function: testhrarr_runcount 4 
    [44069.756141]  testhrarr jiffies 10942439 ; ret: 1 ; ktnsec: 44069756138318
    [44069.760130]  testhrarr_timer_function: testhrarr_runcount 5 
    [44069.760141]  testhrarr jiffies 10942440 ; ret: 1 ; ktnsec: 44069760138469
    [44069.760154] testhrarr_arr_first value is changed
    [44069.760164] Pid: 4302, comm: gnome-terminal Not tainted 2.6.38-16-generic #67-Ubuntu
    [44069.760170] Call Trace:
    [44069.760187]  [] ? sample_hbp_handler+0x2d/0x3b [testhrarr]
    [44069.760202]  [] ? __perf_event_overflow+0x90/0x240
    [44069.760213]  [] ? perf_swevent_event+0x136/0x140
    [44069.760224]  [] ? perf_bp_event+0x70/0x80
    [44069.760235]  [] ? sched_clock_local+0xd3/0x1c0
    [44069.760247]  [] ? format_decode+0x323/0x380
    [44069.760258]  [] ? hw_breakpoint_handler.clone.0+0x102/0x130
    [44069.760269]  [] ? hw_breakpoint_exceptions_notify+0x22/0x30
    [44069.760279]  [] ? notifier_call_chain+0x45/0x60
    [44069.760289]  [] ? atomic_notifier_call_chain+0x22/0x30
    [44069.760299]  [] ? notify_die+0x2d/0x30
    [44069.760308]  [] ? do_debug+0x88/0x180
    [44069.760318]  [] ? debug_stack_correct+0x30/0x38
    [44069.760334]  [] ? init_intel_cacheinfo+0x103/0x394
    [44069.760345]  [] ? testhrarr_timer_function+0xed/0x160 [testhrarr]
    [44069.760356]  [] ? __run_hrtimer+0x6f/0x190
    [44069.760366]  [] ? send_to_group.clone.1+0xf8/0x150
    [44069.760376]  [] ? testhrarr_timer_function+0x0/0x160 [testhrarr]
    [44069.760387]  [] ? hrtimer_interrupt+0x108/0x240
    [44069.760396]  [] ? fsnotify+0x1a5/0x290
    [44069.760407]  [] ? smp_apic_timer_interrupt+0x56/0x8a
    [44069.760416]  [] ? apic_timer_interrupt+0x31/0x38
    [44069.760428]  [] ? mem_cgroup_resize_limit+0x108/0x1c0
    [44069.760437]  [] ? fput+0x0/0x30
    [44069.760446]  [] ? sys_write+0x67/0x70
    [44069.760455]  [] ? syscall_call+0x7/0xb
    [44069.760464]  [] ? init_intel_cacheinfo+0x23/0x394
    [44069.760470] Dump stack from sample_hbp_handler
    [44069.764134]  testhrarr_timer_function: testhrarr_runcount 6 
    [44069.764147]  testhrarr jiffies 10942441 ; ret: 1 ; ktnsec: 44069764144141
    [44069.768133]  testhrarr_timer_function: testhrarr_runcount 7 
    [44069.768146]  testhrarr jiffies 10942442 ; ret: 1 ; ktnsec: 44069768142976
    [44069.772134]  testhrarr_timer_function: testhrarr_runcount 8 
    [44069.772148]  testhrarr jiffies 10942443 ; ret: 1 ; ktnsec: 44069772144121
    [44069.776132]  testhrarr_timer_function: testhrarr_runcount 9 
    [44069.776145]  testhrarr jiffies 10942444 ; ret: 1 ; ktnsec: 44069776141971
    [44069.780133]  testhrarr_timer_function: testhrarr_runcount 10 
    [44069.780141] testhrarr [ 5, 7, 9, 11, 13, ]

    ... we get a stack trace exactly three times - once during testhrarr_startup, and twice in testhrarr_timer_function: once for runcount==0 and once for runcount==5, as expected.

    Well, hope this helps someone,
    Cheers!


    Makefile

    CONFIG_MODULE_FORCE_UNLOAD=y
    
    # debug build:
    # "CFLAGS was changed ... Fix it to use EXTRA_CFLAGS."
    override EXTRA_CFLAGS+=-g -O0 
    
    obj-m += testhrarr.o
    #testhrarr-objs  := testhrarr.o
    
    # note: the recipe lines below must be indented with a TAB character
    all:
        @echo EXTRA_CFLAGS = $(EXTRA_CFLAGS)
        $(MAKE) -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules
    
    clean:
        $(MAKE) -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean
    

    testhrarr.c

    /*
     * [http://www.tldp.org/LDP/lkmpg/2.6/html/lkmpg.html#AEN189 The Linux Kernel Module Programming Guide]
     * https://stackoverflow.com/questions/16920238/reliability-of-linux-kernel-add-timer-at-resolution-of-one-jiffy/17055867#17055867
     * https://stackoverflow.com/questions/8516021/proc-create-example-for-kernel-module/18924359#18924359
     * http://lxr.free-electrons.com/source/samples/hw_breakpoint/data_breakpoint.c
     */
    
    
    #include <linux/module.h>   /* Needed by all modules */
    #include <linux/kernel.h>   /* Needed for KERN_INFO */
    #include <linux/init.h>     /* Needed for the macros */
    #include <linux/jiffies.h>
    #include <linux/time.h>
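    #include <linux/proc_fs.h>  /* /proc entry */
    #include <linux/seq_file.h> /* /proc entry */
    #include <linux/slab.h>     /* kcalloc, kfree */
    #include <linux/kallsyms.h> /* kallsyms_lookup_name */
    #define ARRSIZE 5
    #define MAXRUNS (2*ARRSIZE)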
    
    #include <linux/hrtimer.h>
    
    #define HWDEBUG_STACK 1
    
    #if (HWDEBUG_STACK == 1)
    #include <linux/perf_event.h>
    #include <linux/hw_breakpoint.h>
    
    struct perf_event * __percpu *sample_hbp;
    static char ksym_name[KSYM_NAME_LEN] = "testhrarr_arr";
    module_param_string(ksym, ksym_name, KSYM_NAME_LEN, S_IRUGO);
    MODULE_PARM_DESC(ksym, "Kernel symbol to monitor; this module will report any"
          " write operations on the kernel symbol");
    #endif
    
    static volatile int testhrarr_runcount = 0;
    static volatile int testhrarr_isRunning = 0;
    
    static unsigned long period_ms;
    static unsigned long period_ns;
    static ktime_t ktime_period_ns;
    static struct hrtimer my_hrtimer;
    
    static int* testhrarr_arr;
    static int* testhrarr_arr_first;
    
    static enum hrtimer_restart testhrarr_timer_function(struct hrtimer *timer)
    {
      unsigned long tjnow;
      ktime_t kt_now;
      int ret_overrun;
    
      printk(KERN_INFO
        " %s: testhrarr_runcount %d \n",
        __func__, testhrarr_runcount);
    
      if (testhrarr_runcount < MAXRUNS) {
        tjnow = jiffies;
        kt_now = hrtimer_cb_get_time(&my_hrtimer);
        ret_overrun = hrtimer_forward(&my_hrtimer, kt_now, ktime_period_ns);
        printk(KERN_INFO
          " testhrarr jiffies %lu ; ret: %d ; ktnsec: %lld\n",
          tjnow, ret_overrun, ktime_to_ns(kt_now));
        testhrarr_arr[(testhrarr_runcount % ARRSIZE)] += testhrarr_runcount;
        testhrarr_runcount++;
        return HRTIMER_RESTART;
      }
      else {
        int i;
        testhrarr_isRunning = 0;
        // do not use KERN_DEBUG etc, if printk buffering until newline is desired!
        printk("testhrarr_arr [ ");
        for(i=0; i<ARRSIZE; i++) {
          printk("%d, ", testhrarr_arr[i]);
        }
        printk("]\n");
        return HRTIMER_NORESTART;
      }
    }
    
    static void testhrarr_startup(void)
    {
      if (testhrarr_isRunning == 0) {
        testhrarr_isRunning = 1;
        testhrarr_runcount = 0;
        testhrarr_arr[0] = 0; //just the first element
        hrtimer_start(&my_hrtimer, ktime_period_ns, HRTIMER_MODE_REL);
      }
    }
    
    
    static int testhrarr_proc_show(struct seq_file *m, void *v) {
      if (testhrarr_isRunning == 0) {
        seq_printf(m, "testhrarr proc: startup\n");
        testhrarr_startup();
      } else {
        seq_printf(m, "testhrarr proc: (is running, %d)\n", testhrarr_runcount);
      }
      return 0;
    }
    
    static int testhrarr_proc_open(struct inode *inode, struct  file *file) {
      return single_open(file, testhrarr_proc_show, NULL);
    }
    
    static const struct file_operations testhrarr_proc_fops = {
      .owner = THIS_MODULE,
      .open = testhrarr_proc_open,
      .read = seq_read,
      .llseek = seq_lseek,
      .release = single_release,
    };
    
    
    #if (HWDEBUG_STACK == 1)
    static void sample_hbp_handler(struct perf_event *bp,
                 struct perf_sample_data *data,
                 struct pt_regs *regs)
    {
      printk(KERN_INFO "%s value is changed\n", ksym_name);
      dump_stack();
      printk(KERN_INFO "Dump stack from sample_hbp_handler\n");
    }
    #endif
    
    static int __init testhrarr_init(void)
    {
      struct timespec tp_hr_res;
      #if (HWDEBUG_STACK == 1)
      struct perf_event_attr attr;
      #endif
    
      period_ms = 1000/HZ;
      hrtimer_get_res(CLOCK_MONOTONIC, &tp_hr_res);
      printk(KERN_INFO
        "Init testhrarr: %d ; HZ: %d ; 1/HZ (ms): %ld ; hrres: %lld.%.9ld\n",
                   testhrarr_runcount,      HZ,        period_ms, (long long)tp_hr_res.tv_sec, tp_hr_res.tv_nsec );
    
      // module init runs in process context, so GFP_KERNEL is fine here
      testhrarr_arr = kcalloc(ARRSIZE, sizeof(int), GFP_KERNEL);
      if (!testhrarr_arr)
        return -ENOMEM;
      testhrarr_arr_first = &testhrarr_arr[0];
    
      hrtimer_init(&my_hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
      my_hrtimer.function = &testhrarr_timer_function;
      period_ns = period_ms*( (unsigned long)1E6L );
      ktime_period_ns = ktime_set(0,period_ns);
    
      printk(KERN_INFO
        " Addresses: _runcount 0x%p ; _arr 0x%p ; _arr[0] 0x%p (0x%p) ; _timer_function 0x%p ; my_hrtimer 0x%p; my_hrt.f 0x%p\n",
        &testhrarr_runcount, &testhrarr_arr, &(testhrarr_arr[0]), testhrarr_arr_first, &testhrarr_timer_function, &my_hrtimer, &my_hrtimer.function);
    
    
      proc_create("testhrarr_proc", 0, NULL, &testhrarr_proc_fops);
    
    
      #if (HWDEBUG_STACK == 1)
      hw_breakpoint_init(&attr);
      if (strcmp(ksym_name, "testhrarr_arr_first") == 0) {
        // just for testhrarr_arr_first - the symbol holds a pointer, so read
        // the pointer-sized value stored at the symbol address to get the
        // "real" address it points to
        attr.bp_addr = *((unsigned long *)kallsyms_lookup_name(ksym_name));
      } else {
        // the usual - address is kallsyms_lookup_name result
        attr.bp_addr = kallsyms_lookup_name(ksym_name);
      }
      attr.bp_len = HW_BREAKPOINT_LEN_1;
      attr.bp_type = HW_BREAKPOINT_W ; //| HW_BREAKPOINT_R;
    
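      sample_hbp = register_wide_hw_breakpoint(&attr, (perf_overflow_handler_t)sample_hbp_handler);
      if (IS_ERR((void __force *)sample_hbp)) {
        int ret = PTR_ERR((void __force *)sample_hbp);
        printk(KERN_INFO "Breakpoint registration failed\n");
        // undo what was set up above: the exit handler is not called
        // when init fails
        remove_proc_entry("testhrarr_proc", NULL);
        hrtimer_cancel(&my_hrtimer);
        kfree(testhrarr_arr);
        return ret;
      }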
    
      // explicit cast needed to show 64-bit bp_addr as 32-bit address
      // https://stackoverflow.com/questions/11796909/how-to-resolve-cast-to-pointer-from-integer-of-different-size-warning-in-c-co/11797103#11797103
      printk(KERN_INFO "HW Breakpoint for %s write installed (0x%p)\n", ksym_name, (void*)(uintptr_t)attr.bp_addr);
      #endif
    
      return 0;
    }
    
    static void __exit testhrarr_exit(void)
    {
      int ret_cancel;
      // remove the /proc entry first, so that no new read can restart the timer
      remove_proc_entry("testhrarr_proc", NULL);
      // hrtimer_cancel() waits for a running callback to finish, and
      // returns 1 if the timer was still active
      ret_cancel = hrtimer_cancel(&my_hrtimer);
      if (ret_cancel != 0) {
        printk(KERN_INFO " testhrarr active hrtimer cancelled: %d (%d)\n", ret_cancel, testhrarr_runcount);
      }
      #if (HWDEBUG_STACK == 1)
      unregister_wide_hw_breakpoint(sample_hbp);
      printk(KERN_INFO "HW Breakpoint for %s write uninstalled\n", ksym_name);
      #endif
      // free the array only now, when neither the timer callback nor the
      // breakpoint refers to it any more (freeing it first was a use-after-free)
      kfree(testhrarr_arr);
      printk(KERN_INFO "Exit testhrarr\n");
    }
    
    module_init(testhrarr_init);
    module_exit(testhrarr_exit);
    
    MODULE_LICENSE("GPL");
    