zoukankan      html  css  js  c++  java
  • Mixed DLL Loading analysis

    这个问题还是关于ASP.NET. 问题背景:

     

    客户的ASP.NET发生了hang, CPU0,客户端得不到任何反应.问题发生的时候抓取了hang dump

     

    打开dump,一如既往地用!threads命令检查发现一共有20个活动的CLR thread,其中没有标记为GC的线程. 接下来用~* e !clrstack命令查看对应的CLR Callstack, 发现:

     

    0:017> ~* e !clrstack

    Thread 0

    Not a managed thread.

    Thread 1

    Not a managed thread.

    Thread 2

    Not a managed thread.

    Thread 3

    Not a managed thread.

    Thread 4

    ESP         EIP      

    Thread 5

    ESP         EIP      

    Thread 6

    ESP         EIP      

    Thread 7

    Not a managed thread.

    Thread 8

    Not a managed thread.

    Thread 9

    Not a managed thread.

    Thread 10

    ESP         EIP      

    0x01a9ec48  0x7c82ed54 [FRAME: GCFrame]

    0x01a9f7e4  0x7c82ed54 [FRAME: ECallMethodFrame] [DEFAULT] Class System.Reflection.Assembly System.Reflection.Assembly.nLoad(Class System.Reflection.AssemblyName,String,Boolean,Class System.Security.Policy.Evidence,Boolean,Class System.Reflection.Assembly,ByRef ValueClass System.Threading.StackCrawlMark)

    0x01a9f808  0x799afb84 [DEFAULT] Class System.Reflection.Assembly System.Reflection.Assembly.InternalLoad(Class System.Reflection.AssemblyName,Boolean,Class System.Security.Policy.Evidence,ByRef ValueClass System.Threading.StackCrawlMark)

    0x01a9f830  0x799ea867 [DEFAULT] Class System.Reflection.Assembly System.Reflection.Assembly.Load(Class System.Reflection.AssemblyName)

    0x01a9f838  0x0cdd2bef [DEFAULT] Class System.Collections.Hashtable System.Web.UI.Util.GetReferencedAssembliesHashtable(Class System.Reflection.Assembly)

    0x01a9f84c  0x0c119870 [DEFAULT] [hasThis] Boolean System.Web.Compilation.PreservedAssemblyEntry.LoadDataFromFileInternal(Boolean)

    0x01a9f89c  0x0c1194a2 [DEFAULT] [hasThis] Boolean System.Web.Compilation.PreservedAssemblyEntry.LoadDataFromFile(Boolean)

    0x01a9f8c4  0x0c118a20 [DEFAULT] Class System.Web.Compilation.PreservedAssemblyEntry System.Web.Compilation.PreservedAssemblyEntry.GetPreservedAssemblyEntry(Class System.Web.HttpContext,String,Boolean)

    0x01a9f8dc  0x0c118764 [DEFAULT] [hasThis] Class System.Web.UI.ParserCacheItem System.Web.UI.TemplateParser.GetParserCacheItemFromPreservedCompilation()

    0x01a9f8f4  0x0c11740b [DEFAULT] [hasThis] Class System.Web.UI.ParserCacheItem System.Web.UI.TemplateParser.GetParserCacheItemInternal(Boolean)

    0x01a9f91c  0x0c1186ae [DEFAULT] [hasThis] Class System.Web.UI.ParserCacheItem System.Web.UI.TemplateParser.GetParserCacheItemWithNewConfigPath()

    0x01a9f94c  0x0c1172dc [DEFAULT] [hasThis] Class System.Web.UI.ParserCacheItem System.Web.UI.TemplateParser.GetParserCacheItem()

    0x01a9f97c  0x0c115d84 [DEFAULT] Class System.Type System.Web.UI.ApplicationFileParser.GetCompiledApplicationType(String,Class System.Web.HttpContext,ByRef Class System.Web.UI.ApplicationFileParser)

    0x01a9f994  0x0c115c4c [DEFAULT] [hasThis] Void System.Web.HttpApplicationFactory.CompileApplication(Class System.Web.HttpContext)

    0x01a9f9a4  0x0c115b4f [DEFAULT] [hasThis] Void System.Web.HttpApplicationFactory.Init(Class System.Web.HttpContext)

    0x01a9f9e4  0x0c115953 [DEFAULT] Class System.Web.IHttpHandler System.Web.HttpApplicationFactory.GetApplicationInstance(Class System.Web.HttpContext)

    0x01a9fa10  0x0be76a97 [DEFAULT] [hasThis] Void System.Web.HttpRuntime.ProcessRequestInternal(Class System.Web.HttpWorkerRequest)

    0x01a9fa4c  0x0be76690 [DEFAULT] Void System.Web.HttpRuntime.ProcessRequest(Class System.Web.HttpWorkerRequest)

    0x01a9fa58  0x0be7320d [DEFAULT] [hasThis] I4 System.Web.Hosting.ISAPIRuntime.ProcessRequest(I,I4)

    0x01a9fb20  0x79217188 [FRAME: ContextTransitionFrame]

    0x01a9fc00  0x79217188 [FRAME: ComMethodFrame]

    Thread 11

    Not a managed thread.

    Thread 12

    Not a managed thread.

    Thread 13

    Not a managed thread.

    Thread 14

    ESP         EIP      

    Thread 15

    Not a managed thread.

    Thread 16

    ESP         EIP      

    Thread 17

    ESP         EIP      

    0x0d1aeaf4  0x7c82ed54 [FRAME: GCFrame]

    0x0d1af690  0x7c82ed54 [FRAME: ECallMethodFrame] [DEFAULT] Class System.Reflection.Assembly System.Reflection.Assembly.nLoad(Class System.Reflection.AssemblyName,String,Boolean,Class System.Security.Policy.Evidence,Boolean,Class System.Reflection.Assembly,ByRef ValueClass System.Threading.StackCrawlMark)

    0x0d1af6b4  0x799afb84 [DEFAULT] Class System.Reflection.Assembly System.Reflection.Assembly.InternalLoad(Class System.Reflection.AssemblyName,Boolean,Class System.Security.Policy.Evidence,ByRef ValueClass System.Threading.StackCrawlMark)

    0x0d1af6dc  0x799afa15 [DEFAULT] Class System.Reflection.Assembly System.Reflection.Assembly.InternalLoad(String,Class System.Security.Policy.Evidence,ByRef ValueClass System.Threading.StackCrawlMark)

    0x0d1af6ec  0x799c2014 [DEFAULT] Class System.Reflection.Assembly System.Reflection.Assembly.Load(String)

    0x0d1af6f4  0x0c11ab17 [DEFAULT] Class System.Collections.Hashtable System.Web.UI.CompilationConfiguration.LoadAssemblies(Class System.Collections.Hashtable)

    0x0d1af758  0x0c11a8df [DEFAULT] [hasThis] Void System.Web.UI.TemplateParser.AppendConfigAssemblies()

    0x0d1af784  0x0c119d01 [DEFAULT] [hasThis] Void System.Web.UI.TemplateParser.PrepareParse()

    0x0d1af794  0x0c119b35 [DEFAULT] [hasThis] Void System.Web.UI.TemplateParser.Parse()

    0x0d1af7c4  0x0c119a03 [DEFAULT] [hasThis] Class System.Web.UI.ParserCacheItem System.Web.UI.TemplateParser.GetParserCacheItemThroughCompilation()

    0x0d1af7f4  0x0c117434 [DEFAULT] [hasThis] Class System.Web.UI.ParserCacheItem System.Web.UI.TemplateParser.GetParserCacheItemInternal(Boolean)

    0x0d1af81c  0x0c1186ae [DEFAULT] [hasThis] Class System.Web.UI.ParserCacheItem System.Web.UI.TemplateParser.GetParserCacheItemWithNewConfigPath()

    0x0d1af84c  0x0c1172dc [DEFAULT] [hasThis] Class System.Web.UI.ParserCacheItem System.Web.UI.TemplateParser.GetParserCacheItem()

    0x0d1af87c  0x0c115d84 [DEFAULT] Class System.Type System.Web.UI.ApplicationFileParser.GetCompiledApplicationType(String,Class System.Web.HttpContext,ByRef Class System.Web.UI.ApplicationFileParser)

    0x0d1af894  0x0c115c4c [DEFAULT] [hasThis] Void System.Web.HttpApplicationFactory.CompileApplication(Class System.Web.HttpContext)

    0x0d1af8a4  0x0c115b4f [DEFAULT] [hasThis] Void System.Web.HttpApplicationFactory.Init(Class System.Web.HttpContext)

    0x0d1af8e4  0x0c115953 [DEFAULT] Class System.Web.IHttpHandler System.Web.HttpApplicationFactory.GetApplicationInstance(Class System.Web.HttpContext)

    0x0d1af910  0x0be76a97 [DEFAULT] [hasThis] Void System.Web.HttpRuntime.ProcessRequestInternal(Class System.Web.HttpWorkerRequest)

    0x0d1af94c  0x0be76690 [DEFAULT] Void System.Web.HttpRuntime.ProcessRequest(Class System.Web.HttpWorkerRequest)

    0x0d1af958  0x0be7320d [DEFAULT] [hasThis] I4 System.Web.Hosting.ISAPIRuntime.ProcessRequest(I,I4)

    0x0d1afa20  0x79217188 [FRAME: ContextTransitionFrame]

    0x0d1afb00  0x79217188 [FRAME: ComMethodFrame]

    Thread 18

    ESP         EIP      

    0x0ec5ce80  0x7c82ed54 [FRAME: GCFrame]

    0x0ec5d378  0x7c82ed54 [FRAME: GCFrame]

    0x0ec5eac8  0x7c82ed54 [FRAME: GCFrame]

    0x0ec5f664  0x7c82ed54 [FRAME: ECallMethodFrame] [DEFAULT] Class System.Reflection.Assembly System.Reflection.Assembly.nLoad(Class System.Reflection.AssemblyName,String,Boolean,Class System.Security.Policy.Evidence,Boolean,Class System.Reflection.Assembly,ByRef ValueClass System.Threading.StackCrawlMark)

    0x0ec5f688  0x799afb84 [DEFAULT] Class System.Reflection.Assembly System.Reflection.Assembly.InternalLoad(Class System.Reflection.AssemblyName,Boolean,Class System.Security.Policy.Evidence,ByRef ValueClass System.Threading.StackCrawlMark)

    0x0ec5f6b0  0x799ea867 [DEFAULT] Class System.Reflection.Assembly System.Reflection.Assembly.Load(Class System.Reflection.AssemblyName)

    0x0ec5f6b8  0x0cdd2bef [DEFAULT] Class System.Collections.Hashtable System.Web.UI.Util.GetReferencedAssembliesHashtable(Class System.Reflection.Assembly)

    0x0ec5f6cc  0x0c119870 [DEFAULT] [hasThis] Boolean System.Web.Compilation.PreservedAssemblyEntry.LoadDataFromFileInternal(Boolean)

    0x0ec5f71c  0x0c1194a2 [DEFAULT] [hasThis] Boolean System.Web.Compilation.PreservedAssemblyEntry.LoadDataFromFile(Boolean)

    0x0ec5f744  0x0c118a20 [DEFAULT] Class System.Web.Compilation.PreservedAssemblyEntry System.Web.Compilation.PreservedAssemblyEntry.GetPreservedAssemblyEntry(Class System.Web.HttpContext,String,Boolean)

    0x0ec5f75c  0x0c118764 [DEFAULT] [hasThis] Class System.Web.UI.ParserCacheItem System.Web.UI.TemplateParser.GetParserCacheItemFromPreservedCompilation()

    0x0ec5f774  0x0c11740b [DEFAULT] [hasThis] Class System.Web.UI.ParserCacheItem System.Web.UI.TemplateParser.GetParserCacheItemInternal(Boolean)

    0x0ec5f79c  0x0c1186ae [DEFAULT] [hasThis] Class System.Web.UI.ParserCacheItem System.Web.UI.TemplateParser.GetParserCacheItemWithNewConfigPath()

    0x0ec5f7cc  0x0c1172dc [DEFAULT] [hasThis] Class System.Web.UI.ParserCacheItem System.Web.UI.TemplateParser.GetParserCacheItem()

    0x0ec5f7fc  0x0c115d84 [DEFAULT] Class System.Type System.Web.UI.ApplicationFileParser.GetCompiledApplicationType(String,Class System.Web.HttpContext,ByRef Class System.Web.UI.ApplicationFileParser)

    0x0ec5f814  0x0c115c4c [DEFAULT] [hasThis] Void System.Web.HttpApplicationFactory.CompileApplication(Class System.Web.HttpContext)

    0x0ec5f824  0x0c115b4f [DEFAULT] [hasThis] Void System.Web.HttpApplicationFactory.Init(Class System.Web.HttpContext)

    0x0ec5f864  0x0c115953 [DEFAULT] Class System.Web.IHttpHandler System.Web.HttpApplicationFactory.GetApplicationInstance(Class System.Web.HttpContext)

    0x0ec5f890  0x0be76a97 [DEFAULT] [hasThis] Void System.Web.HttpRuntime.ProcessRequestInternal(Class System.Web.HttpWorkerRequest)

    0x0ec5f8cc  0x0be76690 [DEFAULT] Void System.Web.HttpRuntime.ProcessRequest(Class System.Web.HttpWorkerRequest)

    0x0ec5f8d8  0x0be7320d [DEFAULT] [hasThis] I4 System.Web.Hosting.ISAPIRuntime.ProcessRequest(I,I4)

    0x0ec5f9a0  0x79217188 [FRAME: ContextTransitionFrame]

    0x0ec5fa80  0x79217188 [FRAME: ComMethodFrame]

    Thread 19

    ESP         EIP      

    0x0ecaee18  0x7c82ed54 [FRAME: ECallMethodFrame] [DEFAULT] String System.Security.Util.Config._GetMachineDirectory()

    0x0ecaee28  0x799a0a9f [DEFAULT] Void System.Security.Util.Config.GetFileLocales()

    0x0ecaee34  0x799a07b7 [DEFAULT] [hasThis] Void System.Security.PolicyManager.InitData()

    0x0ecaee50  0x799a072b [DEFAULT] Boolean System.Security.SecurityManager.InitPolicy()

    0x0ecaee78  0x799a046b [DEFAULT] Class System.Security.PermissionSet System.Security.SecurityManager.ResolvePolicy(Class System.Security.Policy.Evidence,Class System.Security.PermissionSet,Class System.Security.PermissionSet,Class System.Security.PermissionSet,ByRef Class System.Security.PermissionSet,Boolean)

    0x0ecaeeb8  0x799a02b1 [DEFAULT] Class System.Security.PermissionSet System.Security.SecurityManager.ResolvePolicy(Class System.Security.Policy.Evidence,Class System.Security.PermissionSet,Class System.Security.PermissionSet,Class System.Security.PermissionSet,ByRef Class System.Security.PermissionSet,ByRef I4,Boolean)

    0x0ecaf180  0x791b3208 [FRAME: GCFrame]

    0x0ecaf694  0x791b3208 [FRAME: ECallMethodFrame] [DEFAULT] [hasThis] Void System.AppDomain.SetupDomainSecurity(String,Class System.Security.Policy.Evidence,Class System.Security.Policy.Evidence,I)

    0x0ecaf6b0  0x799cd45b [DEFAULT] Void System.AppDomain.InternalRemotelySetupRemoteDomainHelper(String,Class System.AppDomainSetup,I,SZArray Char,SZArray Char,SZArray UI1)

    0x0ecaf6d8  0x799cd2c3 [DEFAULT] Void System.AppDomain.InternalRemotelySetupRemoteDomain(I4,I4,String,Class System.AppDomainSetup,I,SZArray Char,SZArray Char,SZArray UI1)

    0x0ecaf6e4  0x799ccfda [FRAME: ContextTransitionFrame]

    0x0ecaf738  0x799ccfda [DEFAULT] Void System.AppDomain.RemotelySetupRemoteDomain(Class System.AppDomain,String,Class System.AppDomainSetup,Class System.Security.Policy.Evidence,Class System.Security.Policy.Evidence,I)

    0x0ecaf774  0x799c5a9a [DEFAULT] Class System.AppDomain System.AppDomain.CreateDomain(String,Class System.Security.Policy.Evidence,Class System.AppDomainSetup)

    0x0ecaf790  0x0be70507 [DEFAULT] [hasThis] Object System.Web.Hosting.AppDomainFactory.Create(String,String,String,String,String,I4)

    0x0ecaf96c  0x79217188 [FRAME: ComMethodFrame]

    Thread 20

    Not a managed thread.

    Thread 21

    Not a managed thread.

    Thread 22

    Thread 30

    ESP         EIP      

    Thread 31

    Not a managed thread.

    Thread 32

    ESP         EIP      

    Thread 33

    ESP         EIP      

    Thread 34

    ESP         EIP      

    Thread 35

    Not a managed thread.

    Thread 36

    Not a managed thread.

    Thread 37

    Thread 50

    ESP         EIP      

    Thread 51

    Not a managed thread.

    Thread 64

    Not a managed thread.

     

     

    受到篇幅限制,中间省略了部分Not a managed thread和空callstack的结果.从上面的输出来看,虽然CLR thread20,但大多数都是idle.发生blocking,看得到callstackCLR thread只有四个,其中三个等在Assembly.Load上面.似乎在加载Assembly的时候出问题了.

     

    在前面看到callstack停留在SqlCommand.Execute的时候,很容易理解对应的thread在等待SQL操作,大多数情况是在等待SQL服务器的执行结果,但是也看到等待GC的例子。

     

    但这里很难理解什么直接的理由可以导致Assembly.Load发生堵塞.由于最终堵塞线程的必然是API调用,比如EnterCriticalSection, WaitForSingleObject,所以检查对应的unmanaged callstack是进一步弄清楚问题的下一步,切换到第一个活动的CLR 线程检查:

     

    0:010> kb100

    ChildEBP RetAddr  Args to Child             

    01a9db1c 7c822124 7c83970f 00000154 00000000 ntdll!KiFastSystemCallRet

    01a9db20 7c83970f 00000154 00000000 00000000 ntdll!NtWaitForSingleObject+0xc

    01a9db5c 7c839620 00000000 00000004 00000001 ntdll!RtlpWaitOnCriticalSection+0x19c

    01a9db7c 7c832ad0 7c889d94 00000000 00000000 ntdll!RtlEnterCriticalSection+0xa8

    01a9dbb0 7c833ea9 00000001 00000000 01a9dbec ntdll!LdrLockLoaderLock+0xe4

    01a9de20 77e41c90 1c68d7b8 01a9de6c 01a9de4c ntdll!LdrLoadDll+0xc9

    01a9de88 791e0610 368ec3b8 00000000 00000009 kernel32!LoadLibraryExW+0x1b2

    01a9dea4 791ea056 368ec3b8 00000000 00000009 mscorsvr!WszLoadLibraryEx+0x5f

    01a9decc 791e7e58 368ec50c 36922588 00129000 mscorsvr!CorMap::BaseAddress+0x8b

    01a9dee8 791e7ec4 36922378 368ec50c 00000000 mscorsvr!PEFile::Setup+0x45

    01a9df04 791ea014 368ec50c 01a9e018 00000000 mscorsvr!PEFile::Create+0x38

    01a9df1c 791ea117 368ec50c 368ec50c 01a9e018 mscorsvr!PEFile::CreateImageFile+0x3a

    01a9df90 791ea151 00000001 00000000 3691f3b8 mscorsvr!PEFile::VerifyModule+0x76

    01a9dfd4 791e88f6 01a9e248 00000000 26000000 mscorsvr!PEFile::Create+0x11a

    01a9e01c 791e94d8 01a9e248 00000000 26000000 mscorsvr!SystemDomain::LoadFile+0x184

    01a9e46c 791e8084 1d2312d8 3685d0f0 01a9f56c mscorsvr!AssemblySpec::GetAssemblyFromFusion+0x630

    01a9e6f4 791e7f4b 00000000 01a9e7ac 00000001 mscorsvr!AssemblySpec::LowLevelLoadManifestFile+0x17d

    01a9e714 791e2c95 01a9f53c 01a9e7a4 01a9e7ac mscorsvr!AppDomain::BindAssemblySpec+0x50

    01a9ec24 791eb38b 01a9eca0 01a9eca8 01a9f7bc mscorsvr!AssemblySpec::LoadAssembly+0x98

    01a9f7a4 01bab9ac 00000000 01a9f830 00000000 mscorsvr!AssemblyNative::Load+0x407

    WARNING: Frame IP not in any known module. Following frames may be wrong.

    01a9f820 799ea867 01a9f830 00000000 00000001 0x1bab9ac

    01a9f824 01a9f830 00000000 00000001 0cdd2bef mscorlib_79990000+0x5a867

    01a9f828 00000000 00000001 0cdd2bef 01a9f894 0x1a9f830

     

    callstack上看到,CLR最后调用到LoadLibrary API上,在试图获取DLL Loader Lock的时候堵塞了。

     

    DLL Loader LockOS维护的一个锁,用来确保系统加载DLL时候的顺序和同步。该锁是通过CriticalSection来实现的。在一个DLLDLLMain里面不允许调用LoadLibrary加载其它的DLL,也跟该锁相关。详细的信息可以搜索网上的资料。

     

    这里最后堵塞在CriticalSection上,根据第二章的介绍,!cs命令可以派上用途了。看看谁占用了这个CriticalSection没有释放:

     

    0:010> !cs 7c889d94

    -----------------------------------------

    Critical section   = 0x7c889d94 (ntdll!LdrpLoaderLock+0x0)

    DebugInfo          = 0x7c889dc0

    LOCKED

    LockCount          = 0x12

    WaiterWoken        = No

    OwningThread       = 0x00000948

    RecursionCount     = 0x2

    LockSemaphore      = 0x154

    SpinCount          = 0x00000000

    0:010> ~~[0x00000948]

      18  Id: cc4.948 Suspend: 1 Teb: 7ffa3000 Unfrozen

          Start: mscorsvr!ThreadpoolMgr::intermediateThreadProc (791fda8b)

          Priority: 0  Priority class: 32  Affinity: 3

     

    于是切换到线程18检查:

     

    0:010> ~18s

    eax=449db1b4 ebx=00000000 ecx=791b4ef3 edx=00000001 esi=793ef47c edi=00000588

    eip=7c82ed54 esp=0ec5c384 ebp=0ec5c3c0 iopl=0         nv up ei pl zr na pe nc

    cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000246

    ntdll!KiFastSystemCallRet:

    7c82ed54 c3              ret

    0:018> kb100

    ChildEBP RetAddr  Args to Child             

    0ec5c380 7c822124 7c83970f 00000588 00000000 ntdll!KiFastSystemCallRet

    0ec5c384 7c83970f 00000588 00000000 00000000 ntdll!NtWaitForSingleObject+0xc

    0ec5c3c0 7c839620 00000000 00000004 0ec5c65c ntdll!RtlpWaitOnCriticalSection+0x19c

    0ec5c3e0 791b2f0c 793ef47c 0ec5c618 791e78ab ntdll!RtlEnterCriticalSection+0xa8

    0ec5c3ec 791e78ab 793ef47c 791e7c89 0c067cb8 mscorsvr!EE_EnterCriticalSection+0xc

    0ec5c3f4 791e7c89 0c067cb8 00000000 00000000 mscorsvr!CorMap::Enter+0x13

    0ec5c618 791e7d32 0000009c 00000003 0ec5c65c mscorsvr!CorMap::OpenFile+0x73

    0ec5c648 791e88f6 0ec5c8bc 00000000 26000000 mscorsvr!PEFile::Create+0xc3

    0ec5c690 791e94d8 0ec5c8bc 00000000 26000000 mscorsvr!SystemDomain::LoadFile+0x184

    0ec5cae0 791e8084 1d2130b8 368f3390 0ec5cf2c mscorsvr!AssemblySpec::GetAssemblyFromFusion+0x630

    0ec5cd68 791e7f4b 00000000 0ec5cee0 00000001 mscorsvr!AssemblySpec::LowLevelLoadManifestFile+0x17d

    0ec5cd88 79208071 0ec5cefc 0ec5cef8 0ec5cee0 mscorsvr!AppDomain::BindAssemblySpec+0x50

    0ec5d19c 79207d79 00000003 00000024 00000000 mscorsvr!Assembly::ComputeBindingDependenciesClosure+0x25a

    0ec5d1c4 791e31ca 369291b8 369210e0 0ec5d324 mscorsvr!BaseDomain::CreateShareableAssemblyNoLock+0x7e

    0ec5d340 79258d85 36928d80 369210e0 0ec5d3cc mscorsvr!BaseDomain::LoadAssembly+0xacd

    0ec5d5e0 79019ad7 00000000 00000001 00000000 mscorsvr!ExecuteDLL+0x281

    0ec5d5fc 7901a789 44940000 00000001 00000000 mscoree!CorDllMainWorker+0x6c

    0ec5d638 7c82257a 44940000 00000000 00000000 mscoree!_CorDllMain+0x106

    0ec5d658 7c8358fb 449594e2 44940000 00000001 ntdll!LdrpCallInitRoutine+0x14

    0ec5d760 7c835bcb 00000000 00000000 00000000 ntdll!LdrpRunInitializeRoutines+0x367

    0ec5d9f4 7c833ee5 00000000 1c626830 0ec5dcbc ntdll!LdrpLoadDll+0x3cd

    0ec5dc70 77e41c90 1c626830 0ec5dcbc 0ec5dc9c ntdll!LdrLoadDll+0x198

    0ec5dcd8 791e0610 36821d50 00000000 00000008 kernel32!LoadLibraryExW+0x1b2

    0ec5dcf4 791ea056 36821d50 00000000 00000008 mscorsvr!WszLoadLibraryEx+0x5f

    0ec5dd1c 791e7e58 36821ed4 368f7740 368f7530 mscorsvr!CorMap::BaseAddress+0x8b

    0ec5dd38 7922c8ba 368f7530 36821ed4 00000000 mscorsvr!PEFile::Setup+0x45

    0ec5ddac 791ea151 00000000 36922378 369210e0 mscorsvr!PEFile::VerifyModule+0x195

    0ec5ddf0 791e88f6 0ec5e0c8 00000000 26000000 mscorsvr!PEFile::Create+0x11a

    0ec5de38 791e94d8 0ec5e0c8 00000000 26000000 mscorsvr!SystemDomain::LoadFile+0x184

    0ec5e2ec 791e8084 1d2130b8 368cc008 0ec5f3ec mscorsvr!AssemblySpec::GetAssemblyFromFusion+0x630

    0ec5e574 791e7f4b 00000000 0ec5e62c 00000001 mscorsvr!AssemblySpec::LowLevelLoadManifestFile+0x17d

    0ec5e594 791e2c95 0ec5f3bc 0ec5e624 0ec5e62c mscorsvr!AppDomain::BindAssemblySpec+0x50

    0ec5eaa4 791eb38b 0ec5eb20 0ec5eb28 0ec5f63c mscorsvr!AssemblySpec::LoadAssembly+0x98

    0ec5f624 01bab9ac 00000000 0ec5f6b0 00000000 mscorsvr!AssemblyNative::Load+0x407

    WARNING: Frame IP not in any known module. Following frames may be wrong.

    0ec5f6a0 799ea867 0ec5f6b0 00000000 00000001 0x1bab9ac

    0ec5f6a4 0ec5f6b0 00000000 00000001 0cdd2bef mscorlib_79990000+0x5a867

    0ec5f6a8 00000000 00000001 0cdd2bef 0ec5f714 0xec5f6b0

     

    这个callstack好长。理清楚逻辑,Assembly.Load调用被分发到CLR Runtime,CLR Runtime调用API LoadLibaray加载DLL,在加载结束以前,系统会用DLL_PROCESS_ATTACH作为dwReason调用当前进程里面所有的DLLEntry Point (DLLMain). 所以mscoree!_CorDllMain被调用。Mscoree!_CorDllMain再次调入CLR Runtime做一些DLL/Assembly相关的操作,操作的过程中再次等待到某一个CriticalSectoin

     

    那再次检查对应的CriticalSection被谁占用:

     

    0:018> !cs 793ef47c

    -----------------------------------------

    Critical section   = 0x793ef47c (mscorsvr!CorMap::m_pCorMapCrst+0x0)

    DebugInfo          = 0x000ddfd0

    LOCKED

    LockCount          = 0x1

    WaiterWoken        = No

    OwningThread       = 0x00000c6c

    RecursionCount     = 0x1

    LockSemaphore      = 0x588

    SpinCount          = 0x00000000

    0:018> ~~[0x00000c6c]

      17  Id: cc4.c6c Suspend: 1 Teb: 7ffa4000 Unfrozen

          Start: mscorsvr!ThreadpoolMgr::intermediateThreadProc (791fda8b)

          Priority: 0  Priority class: 32  Affinity: 3

     

    Thread 17?! 原来thread 10在等thread 17, thread 17thread 18发生deadlock在互相等待。下一步需要弄清的就是thread 17thread 18CriticalSection到底有什么依赖关系。

     

    前面已经看到thread 17在等待loader lock. thread 18callstack上,LoadLibrary的调用还没有退出,所以loader lock没有释放,thread 17等待thread 18的原因很清楚。那thread 18为何需要等待thread 17呢?检查thread 18EnterCriticalSectioncaller, mscorsvr!CorMap::Enter。从名字上看,该CriticalSection应该是用来同步CorMap资源的,而CorMap应该是CLR的一个内部实现类。回头看thread 17, thread 17中的LoadLibrarycallermscorsvr!CorMap::BaseAddress!原来CorMap的相关函数在thread 17中的调用也没有结束,所以thread 18应该是在等待thread 17没有释放的CorMap

     

    原因看明白了,解决方法呢?整理一下思路,这里无论是managed callstack还是unmanaged callstack,所有的参与函数都是CLR RuntimeOS API,中间没有涉及到任何客户自己的函数。说明问题在很大程度上跟客户的具体应用没有关系,很有可能是CLR或者OSbug

     

    关于CLRloader lock的冲突,最臭名昭著的就是前面介绍过的mixed DLL loading:

     

    Mixed DLL Loading Problem

    http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dv_vstechart/html/vcconMixedDLLLoadingProblem.asp

     

    为了看看是不是足够幸运地遇上这个问题,接下来尝试去找到引发问题的DLL的名字,看看是不是用managed C++写的。检查LoadLibaray 的第一个参数可以看到:

     

    0:018> du 36821d50

    36821d50  "c:\windows\microsoft.net\framewo"

    36821d90  "rk\v1.1.4322\temporary asp.net f"

    36821dd0  "iles\root\c01b9164\6d04ba4\assem"

    36821e10  "bly\dl2\ce86c938\007079bd_9d34c6"

    36821e50  "01\Company.product.usage"

    36821e90  "name.Class1.dll"

    0:018> du 368ec3b8

    368ec3b8  "c:\windows\microsoft.net\framewo"

    368ec3f8  "rk\v1.1.4322\temporary asp.net f"

    368ec438  "iles\root\6882f6e1\6ac01cc1\asse"

    368ec478  "mbly\dl2\a3ad3432\5b25229b_ffa0c"

    368ec4b8  "601\Company.Class2.dll"

     

    上面的输出提供了两个信息:

     

    1.      DLL的具体名字

    2.      DLL的路径可以看到,该DLL是从ASP.NET临时文件架加载的。所以该DLL应该是被部署到了ASP.NETbin目录下

     

    dump中用lmf命令找出这两个DLL,用reflector检查后发现下面的信息:

     

    1.      这两个AssemblyStrongName

    2.      这两个Assembly是用managed C++开发的,因为里面包含了下面的类型信息,这些类型信息是MC++编译器生成的

     

    [StructLayout(LayoutKind.Explicit, Size=520, Pack=1), DebugInfoInPDB]
    internal struct $ArrayType$0x47914e9e
    {

    }

     

    所以,这里不幸遭遇了mixed dll loading的问题。在CLR 1上面,该问题是没有完美解决方案的。所以最显而易见的方法是升级到CLR 2.0. 但是,为了挑战一下自己,看看能不能在现有的基础上尽量做点什么。

     

    首先,从callstack上看到,问题根源在于两个thread互相等待。其实,只要DLL Loader早一点释放,或者CorMap lock早一点释放,问题都不会发生。也就是说,问题并非想重现就可以找到恰当的时机重现的。客户的程序肯定是执行了相同的code path很多次,终于有一次遇上了死锁的条件。换句话说,要避免这个问题,只需要尽可能地减少code path执行的频率就可以了。导致问题的code path用来加载DLL,为何需要频繁地加载Assembly呢?由于Assembly不能单独卸载,必须跟appDomain一起卸载,所以接下来可以用!dumpdomain看看appDomainAssenbly的情况:

     

    0:018> !dumpdomain -stat

        Domain      Num Assemblies      Size Assemblies     Name

    0x793f16a0                  1         2,138,112      System Domain

    0x793f2b70                328       288,662,016      Shared Domain

    0x000cf478                  2         2,490,368      DefaultDomain

    0x001334b0                 77        20,086,784      /LM/W3SVC/262227220/Root-1-127973823709142110

    0x1d2312d8                 27        16,484,864      /LM/W3SVC/1064537246/Root-36-127973891453478142

    0x368594b8                 11         7,123,456      /LM/W3SVC/182767242/Root-37-127973891487384175

    0x36915188                  0                 0 Domain39

    Total 41 Domains

     

    哇,这么多appDomain。再去掉-stat参数看看每个appDomain记载的信息,发现上面两个Assembly在超过30appDomain中都有加载。由此看来,这两个MC++ Assembly中包含的应该是一些公用的组件,由于在多个应用中都会用到,所以分别部署到了各个Virtual Directorybin目录下。有超过30Virtual Directory被分配到了同一个app pool, 每一个Virtual Directory对应一个appDomain, 同一个w3wp.exe进程中频繁的appDomian加载和卸载触发了这个问题。所以可以采取下面的方法来尽可能地避免问题的发生:

     

    1.      检查是否有防毒程度频繁扫描web.config文件导致appDomain的重启。尽量避免对Virtual Directory的修改来避免appDomain的重启

    2.      把多个Virtual Directory合并,以便共同引用同一套MC++ Assembly,减少MC++ Assembly的加载

    3.      拆分Virtual Directory到多个不同的app pool.由于Loader lockCorMap lock使用CriticalSection,是进程级别的锁,所以分割到多个进程可以减少触发问题的频率

    4.      既然这两个AssemblyStrong Named的,应该放到GAC中,而不是bin目录下,详细参考:
    ASP.NET: Strong named assemblies should not be stored in the bin directory
    http://blogs.msdn.com/tess/archive/2006/04/13/575361.aspx

  • 相关阅读:
    台湾9大知名开源社区
    使用SignalR打造消息总线
    ENode 2.0
    Wireshark基本介绍和学习TCP三次握手 专题
    linux tomcat 的安装
    linux 之静默安装oracle
    什么是全栈呢(转)
    Android开发自学笔记(基于Android Studio1.3.1)—1.环境搭建(转)
    hdu 4919 Exclusive or
    D
  • 原文地址:https://www.cnblogs.com/lixiong/p/510198.html
Copyright © 2011-2022 走看看