Strange behavior of printk in linux kernel module

Submitted by 谁说我不能喝 on 2019-12-12 18:04:01

Question


I am writing a Linux kernel module and seeing strange behavior in it. Here is my code:

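#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/kthread.h>

int data = 0;

/* kthread entry points must have the signature int fn(void *). */
int threadfn1(void *unused)
{
    int j;
    for( j = 0; j < 10; j++ )
        printk(KERN_INFO "I AM THREAD 1 %d\n",j);
    data++;   /* unsynchronized update of the shared counter */
    return 0;
}

int threadfn2(void *unused)
{
    int j;
    for( j = 0; j < 10; j++ )
        printk(KERN_INFO "I AM THREAD 2 %d\n",j);
    data++;   /* unsynchronized update of the shared counter */
    return 0;
}

static int __init abc_init(void)
{
        struct task_struct *t1 = kthread_run(threadfn1, NULL, "thread1");
        struct task_struct *t2 = kthread_run(threadfn2, NULL, "thread2");
        while( 1 )
        {
            printk("debug\n"); // runs ok
            if( data >= 2 )
            {
                kthread_stop(t1);
                kthread_stop(t2);
                break;
            }
        }
        printk(KERN_INFO "HELLO WORLD\n");
        return 0;
}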

Basically, I am trying to wait for the threads to finish and then print something. The code above achieves that, but only while printk("debug\n"); is left in. As soon as I comment out printk("debug\n"); and load the module with insmod, the module hangs, and it seems to get lost in recursion. I don't understand why printk affects my code so much.

Any help would be appreciated.

regards.


Answer 1:


With the call to printk() removed, the compiler optimises the loop into while (1);. Nothing inside the loop can modify data as far as the compiler can tell, so it tests the value once and never again. When the call to printk() is present, the compiler cannot be sure that printk() leaves data unchanged, so it checks the value each time through the loop.

You can insert a compiler barrier into the loop, which forces the compiler to reload data on each iteration, e.g.:

while (1) {
        if (data >= 2) {
                kthread_stop(t1);
                kthread_stop(t2);
                break;
        }

        barrier();
}



Answer 2:


You're not synchronizing access to the data variable. As a result, the compiler generates an infinite loop. Here is why:

  while( 1 )
        {
            if( data >= 2 )
            {
                kthread_stop(t1);
                kthread_stop(t2);
                break;
            }
        }

The compiler can detect that the value of data never changes within the while loop. Therefore it can completely move the check out of the loop and you'll end up with a simple

 while (1) {} 

If you insert printk, the compiler has to assume that the global variable data may change (after all, the compiler has no idea what printk does in detail), so your code starts to work again (in an undefined-behavior kind of way).

How to fix this:

Use proper thread synchronization primitives. If you wrap every access to data in a code section protected by a mutex, the code will work. You could also replace the variable data with a counting semaphore.
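As a rough illustration of the mutex suggestion, here is a user-space sketch that uses POSIX threads instead of kernel code, so it can be compiled and run outside a module. In a module the analogous pieces would be DEFINE_MUTEX(), mutex_lock() and mutex_unlock() from <linux/mutex.h>. The names data_lock, mutex_worker and run_mutex_demo are made up for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>

#define NUM_WORKERS 2

static pthread_mutex_t data_lock = PTHREAD_MUTEX_INITIALIZER;
static int data;  /* only read or written while holding data_lock */

/* Stand-in for threadfn1/threadfn2: do the work, then report completion. */
static void *mutex_worker(void *unused)
{
    (void)unused;
    pthread_mutex_lock(&data_lock);
    data++;
    pthread_mutex_unlock(&data_lock);
    return NULL;
}

/* Hypothetical helper: starts the workers, waits for all of them,
 * and returns the final value of the counter. */
int run_mutex_demo(void)
{
    pthread_t threads[NUM_WORKERS];
    int seen = 0;

    pthread_mutex_lock(&data_lock);
    data = 0;
    pthread_mutex_unlock(&data_lock);

    for (int i = 0; i < NUM_WORKERS; i++) {
        int rc = pthread_create(&threads[i], NULL, mutex_worker, NULL);
        assert(rc == 0);
        (void)rc;
    }

    /* The lock/unlock calls are opaque to the compiler, so data is
     * re-read on every pass instead of being hoisted out of the loop. */
    while (seen < NUM_WORKERS) {
        pthread_mutex_lock(&data_lock);
        seen = data;
        pthread_mutex_unlock(&data_lock);
        sched_yield();  /* let the workers run instead of spinning hard */
    }

    for (int i = 0; i < NUM_WORKERS; i++)
        pthread_join(threads[i], NULL);

    return seen;
}
```

The lock also makes each data++ safe against the other worker's increment, which the unsynchronized version does not guarantee.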

Edit:

This link explains how locking in the linux-kernel works:

http://www.linuxgrill.com/anonymous/fire/netfilter/kernel-hacking-HOWTO-5.html
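The counting-semaphore alternative mentioned above could look roughly like the following. This is again a user-space sketch with POSIX semaphores, not kernel code; in a module the corresponding calls would be sema_init(), up() and down() from <linux/semaphore.h>. The names done, sem_worker and run_semaphore_demo are invented for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

#define NUM_WORKERS 2

static sem_t done;  /* incremented once by each worker that has finished */

/* Stand-in for threadfn1/threadfn2. */
static void *sem_worker(void *unused)
{
    (void)unused;
    /* ... the thread's real work would go here ... */
    sem_post(&done);  /* signal completion */
    return NULL;
}

/* Hypothetical helper: returns how many workers were waited for. */
int run_semaphore_demo(void)
{
    pthread_t threads[NUM_WORKERS];
    int finished = 0;
    int rc;

    rc = sem_init(&done, 0, 0);  /* process-private, initial count 0 */
    assert(rc == 0);

    for (int i = 0; i < NUM_WORKERS; i++) {
        rc = pthread_create(&threads[i], NULL, sem_worker, NULL);
        assert(rc == 0);
    }

    /* Sleep until each worker has posted; there is no shared counter
     * to poll and therefore no busy loop for the compiler to mangle. */
    for (int i = 0; i < NUM_WORKERS; i++) {
        do {
            rc = sem_wait(&done);
        } while (rc == -1 && errno == EINTR);  /* retry if interrupted */
        assert(rc == 0);
        finished++;
    }

    for (int i = 0; i < NUM_WORKERS; i++)
        pthread_join(threads[i], NULL);

    sem_destroy(&done);
    (void)rc;
    return finished;
}
```

Compared with polling a variable, the waiting thread sleeps until it is woken, so it does not burn CPU while the workers run.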




Answer 3:


Maybe data should be declared volatile? It could be that the compiler is not going to memory to get data in the loop.




Answer 4:


Nils Pipenbrinck's answer is spot on. I'll just add some pointers.

Rusty's Unreliable Guide to Kernel Locking (every kernel hacker should read this one).
Goodbye semaphores?, The mutex API (lwn.net articles on the new mutex API introduced in early 2006; before that, the Linux kernel used semaphores as mutexes).

Also, since your shared data is a simple counter, you can just use the atomic API (basically, declare your counter as atomic_t and access it using atomic_* functions).
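To make the atomic-counter idea concrete, here is a user-space analogue written with C11 <stdatomic.h> and POSIX threads. It is only a sketch: in the kernel the counter would be an atomic_t, updated with atomic_inc() and read with atomic_read(). The names atomic_worker and run_atomic_demo are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

#define NUM_WORKERS 2

static atomic_int data;  /* shared completion counter */

/* Stand-in for threadfn1/threadfn2. */
static void *atomic_worker(void *unused)
{
    (void)unused;
    /* ... the thread's real work would go here ... */
    atomic_fetch_add(&data, 1);  /* indivisible read-modify-write */
    return NULL;
}

/* Hypothetical helper: returns the counter once every worker is done. */
int run_atomic_demo(void)
{
    pthread_t threads[NUM_WORKERS];

    atomic_store(&data, 0);

    for (int i = 0; i < NUM_WORKERS; i++) {
        int rc = pthread_create(&threads[i], NULL, atomic_worker, NULL);
        assert(rc == 0);
        (void)rc;
    }

    /* An atomic load cannot be hoisted out of the loop, so the exit
     * condition is re-evaluated on every iteration. */
    while (atomic_load(&data) < NUM_WORKERS)
        sched_yield();

    for (int i = 0; i < NUM_WORKERS; i++)
        pthread_join(threads[i], NULL);

    return atomic_load(&data);
}
```

Besides keeping the loop check alive, the atomic increment rules out a lost update when both workers bump the counter at the same moment.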




Answer 5:


Volatile might not always be a "bad idea". One needs to separate the cases where volatile is needed from the cases where a mutual exclusion mechanism is needed. Using, or misusing, one mechanism in place of the other is suboptimal. In the above case, I would suggest that the optimal solution needs both mechanisms: a mutex to provide mutual exclusion, and volatile to tell the compiler that the "info" must be read fresh from hardware. Otherwise, at some optimization levels (-O2, -O3), the compiler might leave out code that is needed.



Source: https://stackoverflow.com/questions/4113176/strange-behavior-of-printk-in-linux-kernel-module
