What happens when a mov instruction causes a page fault with interrupts disabled on x86?

前端未结


I recently encountered an issue in a custom Linux kernel (2.6.31.5, x86) driver where copy_to_user would periodically not copy any bytes to user space. It would return the coun

2 Answers

醉话见心

2021-02-13 03:54

Page faults are not maskable interrupts. In fact, they are technically not interrupts at all but exceptions, although I agree the difference is mostly semantic.

Your copy_to_user failed when you called it in atomic context with interrupts disabled because the code has an explicit check for this.

See http://lxr.free-electrons.com/source/arch/x86/lib/usercopy_32.c#L575

悲哀的现实

2021-02-13 03:55
I've found the answer. My #2 suggestion was correct and the mechanism was right in front of my face. The page fault does happen, but the fixup_exception mechanism is used to provide an exception/continue mechanism. The following section adds entries to the exception handler table:
```
    ".section __ex_table,\"a\"\n"               \
    "   .align 4\n"                 \
    "   .long 4b,5b\n"                  \
    "   .long 0b,3b\n"                  \
    "   .long 1b,6b\n"                  \
    ".previous"                     \
```
This says: if an exception is raised while the instruction pointer (IP) equals the first address of an entry, the fault handler sets the IP to the second address of that entry and execution continues from there.

So if the exception happens at "4:", jump to "5:". If the exception happens at "0:" then jump to "3:" and if the exception happens at "1:" jump to "6:".
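The lookup described above can be sketched in plain user-space C. This is only a rough model under stated assumptions, not the kernel's code: the names `ex_entry`, `demo_table`, and `find_fixup` are hypothetical, the numeric addresses are made up to stand in for the labels `4:`/`5:`, `0:`/`3:`, and `1:`/`6:`, and a simple linear scan is used where the kernel has its own search routine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one __ex_table entry: a pair of addresses. */
struct ex_entry {
    uintptr_t insn;   /* address of the instruction that may fault */
    uintptr_t fixup;  /* address to resume at if it does fault */
};

/* Made-up addresses standing in for the assembler labels. */
static const struct ex_entry demo_table[] = {
    { 0x1040, 0x1050 },  /* .long 4b,5b */
    { 0x1000, 0x1030 },  /* .long 0b,3b */
    { 0x1010, 0x1060 },  /* .long 1b,6b */
};

/*
 * Return the fixup address for a faulting instruction pointer,
 * or 0 if the table has no entry for it.
 */
static uintptr_t find_fixup(const struct ex_entry *table, size_t n,
                            uintptr_t ip)
{
    assert(table != NULL);
    for (size_t i = 0; i < n; i++) {
        if (table[i].insn == ip)
            return table[i].fixup;
    }
    return 0;
}
```

With this model, `find_fixup(demo_table, 3, 0x1040)` yields `0x1050`, mirroring "an exception at `4:` resumes at `5:`". An address with no entry yields 0, which corresponds to a fault that has no registered fixup.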

The missing piece is in do_page_fault() in arch/x86/mm/fault.c:
```
/*
 * If we're in an interrupt, have no user context or are running
 * in an atomic region then we must not take the fault:
 */
if (unlikely(in_atomic() || !mm)) {
    bad_area_nosemaphore(regs, error_code, address);
    return;
}
```
in_atomic() returned true because we were holding a write_lock_bh() lock! bad_area_nosemaphore() eventually does the fixup.

If a page fault did occur (which was unlikely, because of the working set concept), the copy would fail and jump out of the __copy_user macro with the number of uncopied bytes set to size, because preemption was disabled.
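To make the overall behaviour concrete, here is a small user-space model of it. Everything in it is an assumption made for illustration: `model_preempt_count`, `model_lock_bh`, `model_unlock_bh`, `model_in_atomic`, and `model_copy_to_user` are hypothetical stand-ins, the `resident` flag simply says whether the destination page is already mapped, and the copy is treated as all-or-nothing, which is a simplification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's preempt count. */
static int model_preempt_count;

/* Taking the modelled lock makes the context atomic. */
static void model_lock_bh(void)
{
    model_preempt_count++;
}

static void model_unlock_bh(void)
{
    assert(model_preempt_count > 0);
    model_preempt_count--;
}

static bool model_in_atomic(void)
{
    return model_preempt_count != 0;
}

/*
 * Model of a user copy: returns the number of bytes NOT copied.
 * A non-resident page would fault; in atomic context the fault
 * cannot be serviced, so the fixup path reports n bytes uncopied.
 */
static size_t model_copy_to_user(char *dst, const char *src, size_t n,
                                 bool resident)
{
    if (!resident && model_in_atomic())
        return n;            /* fixup path: nothing copied */
    memcpy(dst, src, n);     /* page resident, or fault serviced */
    return 0;
}
```

In this model a copy to a resident page succeeds even while the lock is held, which matches the observation that the failure only showed up periodically. The same copy to a non-resident page returns the full size while the lock is held and succeeds once it is released.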