In which cases is the dynamic CRT not already initialized on call to user supplied DllMain?

Posted by 会有一股神秘感。 on 2019-12-03 15:59:55

Is there any other situation when linking /MD and not supplying a custom /ENTRYPOINT, where the dynamic CRT ought to not be fully initialized?

first some notation:

  • X statically imports (depends on) Y and Z : X[Y, Z]
  • X entry point : X_DllMain
  • X_DllMain calls LoadLibrary(Y) : X<Y>

when we use /MD, the crt lives in separate DLL(s). "initialized" in this context means that the entry point(s) of the crt DLL(s) have already been called. so the question can be made more general and clear:

given X[Y], is Y_DllMain called before X_DllMain ?

in the general case, no, because there can be a circular dependency: Y[X] or Y[Z[X]].

the best-known example is user32[gdi32] and gdi32[user32] (or, in win10, gdi32[gdi32full[user32]]). so which must be called first, user32_DllMain or gdi32_DllMain ? however, it is obvious that no crt DLL depends on our custom DLL, so let's exclude the circular dependency case.

when the loader loads module X, it loads all of its dependency modules that are not already in memory (and their dependencies; this is a recursive process). then the loader builds a call graph and begins calling module entry points. obviously, if A[B], the loader always tries to call B_DllMain before A_DllMain (except for a circular dependency, where the order of calls is undefined). but which modules will be in the call graph ? all of X's dependency modules ? of course not. some of these modules can already be in memory (loaded) when we begin loading X, so their entry points have already been called with DLL_PROCESS_ATTACH and must not be called a second time. this strategy is used in xp, vista and win7:

when we load X:

  1. load or locate in memory all of its dependency modules
  2. call the entry points of newly loaded (after X) modules only.
  3. if A[B] - call B_DllMain before A_DllMain

example: loading X[Y[W[Z]], Z]

//++begin load X
Z_DllMain
W_DllMain
Y_DllMain
X_DllMain
// --end load X
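
the three rules above can be sketched as a post-order walk over the import graph. this is only a toy model, not the real loader: ImportGraph, visit and initOrder are names invented for the illustration, and it assumes none of the dependencies is in memory before the load starts.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical model: module name -> names of its static imports.
using ImportGraph = std::map<std::string, std::vector<std::string>>;

// Post-order walk: every dependency is appended before the module itself.
static void visit(const ImportGraph& g, const std::string& name,
                  std::set<std::string>& seen, std::vector<std::string>& order)
{
    if (!seen.insert(name).second)
        return; // already queued (this also keeps circular imports from looping)
    auto it = g.find(name);
    if (it != g.end())
        for (const std::string& dep : it->second)
            visit(g, dep, seen, order);
    order.push_back(name);
}

// Order in which entry points would be called when `root` is loaded and
// none of its dependencies was in memory before (an assumption of the model).
std::vector<std::string> initOrder(const ImportGraph& g, const std::string& root)
{
    std::set<std::string> seen;
    std::vector<std::string> order;
    visit(g, root, seen, order);
    return order;
}
```

for the graph X[Y[W[Z]], Z], initOrder(g, "X") gives Z, W, Y, X, the same order as the trace above.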

but this scenario does not take into account the following case: a module can already be in memory while its entry point has not yet been called. how can this happen ? it can happen when some module's entry point calls LoadLibrary.

example - loading X[Y<W[Z]>, Z]

//++begin load X
Y_DllMain
  //++begin load W
  W_DllMain
  //--end load W
Z_DllMain
X_DllMain
// --end load X

so W_DllMain will be called before Z_DllMain, despite W[Z]. this is exactly why calling LoadLibrary from a DLL entry point is not recommended.


but Dynamic-Link Library Best Practices says about this:

This can cause a deadlock or a crash.

the words about a deadlock are not true: no deadlock is possible here. where ? how ? we already hold the loader lock inside the DLL entry point, and this lock can be acquired recursively. a crash really can happen (before win8).

or another false statement:

Call ExitThread. Exiting a thread during DLL detach can cause the loader lock to be acquired again, causing a deadlock or a crash.

  • can cause the loader lock to be acquired again - not "can" but always
  • causing a deadlock - false - we already hold this lock
  • a crash - no crash will happen either, one more false statement

what really happens is that the thread exits without releasing the loader lock, which stays busy forever. as a result, any new thread creation or exit, any new DLL load or unload, or just an ExitProcess call will hang when it tries to acquire the loader lock. so there really will be a deadlock, not during the ExitThread call but later.
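
to make the two points concrete (the recursive acquire is harmless, the orphaned lock is not), here is a minimal sketch with a toy recursive lock. everything in it is an assumption made for illustration: ToyLoaderLock and otherThreadCanLoadAfter are invented names, threads are modelled as plain integer ids, and tryAcquire returns false where a real thread would block forever.

```cpp
// Toy stand-in for the loader lock: recursive, owned by one "thread" at a time.
// Threads are modelled as integer ids so the sketch stays deterministic.
class ToyLoaderLock {
public:
    // Returns true if `thread` now holds the lock (possibly once more).
    bool tryAcquire(int thread)
    {
        if (depth_ != 0 && owner_ != thread)
            return false; // held by another thread: a real caller would wait here
        owner_ = thread;
        ++depth_;
        return true;
    }

    void release(int thread)
    {
        if (depth_ != 0 && owner_ == thread)
            --depth_;
    }

private:
    int owner_ = 0;
    int depth_ = 0;
};

// Models a thread running an entry point during detach. Returns whether some
// other thread can take the lock afterwards (i.e. load a DLL, exit, ...).
bool otherThreadCanLoadAfter(bool callsExitThread)
{
    ToyLoaderLock lock;
    const int worker = 1, other = 2;

    lock.tryAcquire(worker);     // the loader takes the lock before the entry point
    if (callsExitThread) {
        lock.tryAcquire(worker); // thread-exit path takes it again: no deadlock
        // the thread is gone now; nothing ever releases its acquisitions
    } else {
        lock.release(worker);    // normal return: the loader releases the lock
    }
    return lock.tryAcquire(other);
}
```

in this model otherThreadCanLoadAfter(false) is true and otherThreadCanLoadAfter(true) is false: the exiting thread itself never gets stuck, everybody who comes after it does.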

and of course an interesting note: windows itself calls LoadLibrary from DllMain. user32.dll always calls LoadLibrary for imm32.dll from its entry point (still true on win10)


but beginning with win8 (or win8.1) the loader became smarter about handling dependency modules. rule 2 is now changed:

2. call the entry points of newly loaded (after X) modules, or of modules that are in memory but not yet initialized.

so in modern windows (8+), loading X[Y<W[Z]>, Z] gives:

//++begin load X
Y_DllMain
  //++begin load W
  Z_DllMain
  W_DllMain
  //--end load W
X_DllMain
// -- end load X

the Z initialization is moved into the call graph of the W load. as a result, everything is now correct.
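
the difference between the old and the new rule 2 can be reproduced with a small simulation. this is a sketch under stated assumptions, not the real loader algorithm: ToyModule, ToyLoader and runExample are invented names, and the initPending flag stands for "also initialize modules that are mapped but not yet initialized" (the win8+ behaviour described above).

```cpp
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Toy model only; all names here are made up for the illustration.
struct ToyModule {
    std::vector<std::string> imports;     // static imports:          X[...]
    std::vector<std::string> loadsFromEp; // LoadLibrary in DllMain:  X<...>
};

class ToyLoader {
public:
    ToyLoader(std::map<std::string, ToyModule> modules, bool initPending)
        : modules_(std::move(modules)), initPending_(initPending) {}

    // Models one LoadLibrary call (or the initial process load).
    void load(const std::string& root)
    {
        std::set<std::string> seen;
        std::vector<std::string> queue;
        collect(root, seen, queue);
        for (const std::string& name : queue) {
            if (!initialized_.insert(name).second)
                continue; // a nested load already ran this entry point
            calls.push_back(name + "_DllMain");
            for (const std::string& target : modules_[name].loadsFromEp)
                load(target); // LoadLibrary called from inside the entry point
        }
    }

    std::vector<std::string> calls; // entry points in the order they ran

private:
    // Builds the init queue for one load, dependencies first.
    void collect(const std::string& name, std::set<std::string>& seen,
                 std::vector<std::string>& queue)
    {
        if (!seen.insert(name).second)
            return;
        const bool newlyMapped = mapped_.insert(name).second;
        // old rule 2: skip everything that was already in memory
        // new rule 2: skip only what has already been initialized
        const bool skip = initPending_ ? initialized_.count(name) != 0
                                       : !newlyMapped;
        if (skip)
            return;
        for (const std::string& dep : modules_[name].imports)
            collect(dep, seen, queue);
        queue.push_back(name);
    }

    std::map<std::string, ToyModule> modules_;
    bool initPending_;
    std::set<std::string> mapped_;
    std::set<std::string> initialized_;
};

// Runs the example from the text: X[Y<W[Z]>, Z].
std::vector<std::string> runExample(bool initPending)
{
    std::map<std::string, ToyModule> m;
    m["X"].imports = {"Y", "Z"};
    m["Y"].loadsFromEp = {"W"};
    m["W"].imports = {"Z"};
    m["Z"]; // no imports, loads nothing
    ToyLoader loader(std::move(m), initPending);
    loader.load("X");
    return loader.calls;
}
```

runExample(false) yields Y_DllMain, W_DllMain, Z_DllMain, X_DllMain (the xp/vista/win7 trace), while runExample(true) yields Y_DllMain, Z_DllMain, W_DllMain, X_DllMain (the win8+ trace).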

to test this we can build the following solution: test.exe[ kernel32, D1< D2[kernel32, msvcrt] >, msvcrt ]

  • D2 imports from kernel32 and msvcrt only and exports SomeFunc
  • D1 imports only from kernel32, calls LoadLibraryW(L"D2") from its entry point, and then calls D2.SomeFunc
  • test.exe imports from kernel32, D1 and msvcrt

(exactly in this order ! this is critically important - D1 must come before msvcrt in the import table, so D1 must be placed before msvcrt on the linker command line)

as a result, the D1 entry point will be called before that of msvcrt. this is normal - D1 does not depend on msvcrt. but when D1 loads D2 from its entry point, things become interesting

code for D2.dll ( /NODEFAULTLIB kernel32.lib msvcrt.lib )

#include <Windows.h>

extern "C"
{
    __declspec(dllimport) int __cdecl sprintf(PSTR buf, PCSTR format, ...);
}

BOOLEAN WINAPI MyEp( HMODULE , DWORD ul_reason_for_call, PVOID )
{
    if (ul_reason_for_call == DLL_PROCESS_ATTACH)
    {
        OutputDebugStringA("D2.DllMain\n");
    }

    return TRUE;
}

INT_PTR WINAPI SomeFunc()
{
    __pragma(message(__FUNCDNAME__))
    char buf[32];
    // this is only for link to msvcrt.dll
    sprintf(buf, "D2.SomeFunc\n");
    OutputDebugStringA(buf);
    return 0;
}

#ifdef _WIN64
#define FuncName "?SomeFunc@@YA_JXZ"
#else
#define FuncName "?SomeFunc@@YGHXZ"
#endif

__pragma(comment(linker, "/export:" FuncName ",@1,NONAME,PRIVATE"))

code for D1.dll ( /NODEFAULTLIB kernel32.lib )

#include <Windows.h>

#pragma warning(disable : 4706)

BOOLEAN WINAPI MyEp( HMODULE hmod, DWORD ul_reason_for_call, PVOID )
{
    if (ul_reason_for_call == DLL_PROCESS_ATTACH)
    {
        OutputDebugStringA("D1.DllMain\n");
        if (hmod = LoadLibraryW(L"D2"))
        {
            if (FARPROC fp = GetProcAddress(hmod, (PCSTR)1))
            {
                fp();
            }
        }
    }

    return TRUE;
}

INT_PTR WINAPI SomeFunc()
{
    __pragma(message(__FUNCDNAME__))
    OutputDebugStringA("D1.SomeFunc\n");
    return 0;
}

#ifdef _WIN64
#define FuncName "?SomeFunc@@YA_JXZ"
#else
#define FuncName "?SomeFunc@@YGHXZ"
#endif

__pragma(comment(linker, "/export:" FuncName ",@1,NONAME"))

code for exe ( /NODEFAULTLIB kernel32.lib D1.lib msvcrt.lib )

#include <Windows.h>

extern "C"
{
    __declspec(dllimport) int __cdecl sprintf(PSTR buf, PCSTR format, ...);
}

__declspec(dllimport) INT_PTR WINAPI SomeFunc();

void ep()
{
    char buf[32];
    // this is only for link to msvcrt.dll
    sprintf(buf, "exe entry\n");
    OutputDebugStringA(buf);
    ExitProcess((UINT)SomeFunc());
}

output for xp:

LDR: D1.dll loaded - Calling init routine
D1.DllMain
Load: D2.dll
LDR: D2.dll loaded - Calling init routine
D2.DllMain
D2.SomeFunc
LDR: msvcrt.dll loaded - Calling init routine
exe entry
D1.SomeFunc

for win7:

LdrpRunInitializeRoutines - INFO: Calling init routine for DLL "D1.dll"
D1.DllMain
Load: D2.dll
LdrpRunInitializeRoutines - INFO: Calling init routine for DLL "D2.DLL"
D2.DllMain
D2.SomeFunc
LdrpRunInitializeRoutines - "msvcrt.dll"
exe entry
D1.SomeFunc

in both cases the call flow is the same - D2.DllMain is called before the msvcrt entry point, despite D2[msvcrt]

but on win8.1 and win10 the call flow is different:

LdrpInitializeNode - INFO: Calling init routine for DLL "D1.dll"
D1.DllMain
LdrpInitializeNode - INFO: Calling init routine for DLL "msvcrt.dll"
LdrpInitializeNode - INFO: Calling init routine for DLL "D2.DLL"
D2.DllMain
D2.SomeFunc
exe entry
D1.SomeFunc

the D2 entry point is called after msvcrt initialization.

so what is the conclusion?

if, when module X[Y] is loaded, Y is not already sitting in memory uninitialized, Y_DllMain will be called before X_DllMain. in other words, this holds if nobody calls LoadLibrary(X) (or LoadLibrary(Z[X]) ) from a DLL entry point. so if your DLL is loaded in the "normal" way (not by a LoadLibrary call from DllMain, and not injected from a driver on some dll load event) - you can be sure that the crt entry point has already been called (the crt is initialized)

moreover - if you run on win8.1+ and X[Y] is loaded - Y_DllMain will always be called before X_DllMain.


now about a custom /ENTRYPOINT in your dll.

even if you use the crt in separate DLLs, a small piece of crt code is statically linked into your module: DllMainCRTStartup, which calls your function DllMain (which is not an entry point) by name. so with the dynamic crt we really have 2 crt parts: the main part in separate DLLs, which will be initialized before your DLL entry point is called (except in the special case described above on win7, vista, xp), and a small static part (code inside your module). when this static part gets called depends entirely on you. this part, DllMainCRTStartup, does some internal initialization, initializes the global objects in your code (initterm) and calls DllMain; after DllMain returns (on dll detach) it calls the destructors for globals.

if you set a custom entry point in your DLL, then at that point the crt in separate DLLs is already initialized, but your static crt part is not (nor are your global objects). from this custom entry point you will need to call DllMainCRTStartup
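
the split between the two crt parts can be pictured with a tiny portable model. it is deliberately not Windows code and calls no real crt routine: ToyDll and its members are invented names, where crtStartup only plays the role of DllMainCRTStartup and userDllMain the role of your DllMain.

```cpp
// Toy model of the statically linked crt part inside one DLL.
struct ToyDll {
    bool globalsConstructed = false; // set by the static crt part (initterm step)
    bool dllMainSawGlobals = false;  // what the user's DllMain observed

    // Stands in for the user's DllMain: it just records the state of globals.
    void userDllMain() { dllMainSawGlobals = globalsConstructed; }

    // Plays the role of the static startup routine:
    // construct globals first, then call the user's DllMain.
    void crtStartup()
    {
        globalsConstructed = true;
        userDllMain();
    }

    // Custom entry point that jumps straight to user code: globals are skipped.
    void customEntryDirect() { userDllMain(); }

    // Custom entry point that does its own work (which must not rely on
    // globals) and then forwards to the startup routine.
    void customEntryForwarding() { crtStartup(); }
};
```

with customEntryDirect the user code runs while globalsConstructed is still false; with customEntryForwarding it runs only after the static part has done its work, which is the arrangement recommended above.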
