Windows 10 specific crash on call LeaveCriticalSection

假如想象 提交于 2019-12-06 12:44:55

问题


I stucked into a problem with threads syncronization and critical sections on Windows 10.

Application will crash in this case:

  • Application has two threads.
  • Thread 1 calls EnterCriticalSection with object m_CS
  • Thread 2 then attempts to enter the same critical section
  • Thread 1 terminates Thread 2 using TerminateThread
  • Thread 1 calls LeaveCriticalSection

In previous Windows versions which I was able to test (7, 8, 8.1) this works properly. Thread 2 terminates, and Thread 1 leaves the critical section without exception.

On Windows 10, when Thread 1 leaves the critical section, application crashes with Access Violation. It only happens when another thread was terminated while waiting on EnterCriticalThread.

Looking at the stack trace it looks this (latest frame at the top):

RtlpWakeByAddress
RtlpUnWaitCriticalSection
RtlLeaveCriticalSection

I spent so much time on debugging this issue. In my case m_CS is totally fine when LeaveCriticalSection was called. I debugged and spent some time to analyze disassembled code of ntdll.dll functions. Seems like object corrupts somewhere during execution of RtlpUnWaitCriticalSection and then passed to RtlpWakeByAddress when crash occurs. Basicly ntdll.dll was able to modify CRITICAL_SECTION object's properties such as lock count in RtlLeaveCriticalSection.

From the web I didn't find any answer on this or statement what changed in Windows 10. Only thread on reddit and ~1800 crash reports for Mozilla Firefox with same call stack in the last month. I contacted with author of post on reddit and he was not able to fix this thus far.

So anybody dealed with this issue and may be have a fix for it or advices? As a solution right now I only see to rethink usage of WinAPI TerminateThread and try to avoid it as much as possible. Another way probably to do a code refactoring and think on application's architecture.

Any response appreciated. Thanks in advance


回答1:


Implementation of CRITICAL_SECTION very volatile from version to version. when in last Windows version thread begin wait on CRITICAL_SECTION he call WaitOnAddress function. ok, really it ntdll internal implementation - RtlpWaitOnAddress, but this not change gist. this function internal call RtlpAddWaitBlockToWaitList - and here the key point - WaitBlock is allocated on thread stack and pointer to this wait block is added to List. then when owner of CRITICAL_SECTION leave he call WakeByAddressSingle (really it internal implementation RtlpWakeByAddress) and this function pop the first WaitBlock from List, extract thread Id from it and call NtAlertThreadByThreadId(new api from win 8.1) - for awaken some thread waited in EnterCriticalSection. but when you terminated thread, waited in EnterCriticalSection - his stack is deallocated. so address of WaitBlock block become invalid. so thread which called RtlpWakeByAddress (as part of LeaveCriticalSection) got access violation when try read thread Id from WaitBlock (died thread stack). conclusion - if you call TerminatedThread - process already become in unstable state, bug can be at any time and any point. so - not call this function, especially from self process.




回答2:


Thread 1 terminates Thread 2 using TerminateThread

Don't do that. It may look like it works on other windows versions, but there's no way for you to know for sure what side-effects are occurring and hiding from you.

From https://msdn.microsoft.com/en-us/library/windows/desktop/ms686717(v=vs.85).aspx

TerminateThread is a dangerous function that should only be used in the most extreme cases. You should call TerminateThread only if you know exactly what the target thread is doing, and you control all of the code that the target thread could possibly be running at the time of the termination. For example, TerminateThread can result in the following problems:

  • If the target thread owns a critical section, the critical section will not be released.
  • If the target thread is allocating memory from the heap, the heap lock will not be released.
  • If the target thread is executing certain kernel32 calls when it is terminated, the kernel32 state for the thread's process could be inconsistent.
  • If the target thread is manipulating the global state of a shared DLL, the state of the DLL could be destroyed, affecting other users of the DLL.

What you should do is communicate with thread 2 and let thread 2 shut itself down correctly and safely.




回答3:


I would change the code of thread 2 to use TryEnterCriticalSection

if(!TryEnterCriticalSection(&m_CS)) {
    return 0;    // Terminate thread
}
//code
LeaveCriticalSection(&m_CS);

This has the advantage that thread 2 is not waiting on the critical section, and it can terminate itself properly. It is generally not advisable to use TerminateThread, as already mentioned by others in the comments.



来源:https://stackoverflow.com/questions/39635817/windows-10-specific-crash-on-call-leavecriticalsection

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!