zoukankan      html  css  js  c++  java
  • 捉虫记(四)线程安全导致的HighCpu

    一个朋友QQ群里说网站启动后会cpu很高,想要帮忙看一下dump.

    1.打开windbg加载dump文件后第一个命令lmf,这个命令显示加载的dll以及路径,这样子可以找个dll来帮忙加载sos,(额,我记不住load那么长的路径啊)

    0:000> lmf

    start end module name

    ……

    00007ffb`a3750000 00007ffb`a375a000 version C:WindowsSystem32version.dll

    00007ffb`a37e0000 00007ffb`a444e000 System_ni C:WindowsassemblyNativeImages_v4.0.30319_64System10e3367d9097070044e05c2825a4a1e9System.ni.dll

    00007ffb`a4450000 00007ffb`a457f000 clrjit C:WindowsMicrosoft.NETFramework64v4.0.30319clrjit.dll

    00007ffb`a4580000 00007ffb`a5b0d000 mscorlib_ni C:WindowsassemblyNativeImages_v4.0.30319_64mscorliba20cafac04a2e9b3bcb5ec4d674775e5mscorlib.ni.dll

    ……

    2.还是loadby比load好用吧(嘿嘿)

    0:000> .loadby sos clrjit

    我们要看看cpu情况咋样

    0:000> !tp

    CPU utilization: 100%

    Worker Thread: Total: 8 Running: 7 Idle: 1 MaxLimit: 32767 MinLimit: 4

    Work Request in Queue: 0

    --------------------------------------

    Number of Timers: 0

    --------------------------------------

    Completion Port Thread:Total: 1 Free: 1 MaxFree: 8 CurrentLimit: 1 MaxLimit: 1000 MinLimit: 4

    嗯,情况足够残。。

    3.我们看看那个线程占用的高吧

    0:000> !runaway

    User Mode Time

    Thread Time

    21:32e4 0 days 0:00:43.671

    20:2d1c 0 days 0:00:43.187

    22:1110 0 days 0:00:42.828

    25:257c 0 days 0:00:42.625

    19:337c 0 days 0:00:36.984

    26:32c8 0 days 0:00:35.906

    30:2f78 0 days 0:00:35.796

    7:30c0 0 days 0:00:00.140

    11:2758 0 days 0:00:00.062

    13:2644 0 days 0:00:00.031

    12:2770 0 days 0:00:00.031

    。。。

    4.线程太少,好像不是GC啥的关系,都是用户态用的。

    0:000> !threads

    ThreadCount: 18

    UnstartedThread: 0

    BackgroundThread: 17

    PendingThread: 0

    DeadThread: 1

    Hosted Runtime: no Lock

    ID OSID ThreadOBJ State GC Mode GC Alloc Context Domain Count Apt Exception

    7 1 30c0 000000df603e3170 28220 Preemptive 0000000000000000:0000000000000000 000000df603d9ff0 0 Ukn

    14 2 26c4 000000df6040e790 2b220 Preemptive 000000E060F2C908:000000E060F2DC90 000000df603d9ff0 0 MTA (Finalizer)

    16 3 2cac 000000df5fc17e40 102a220 Preemptive 0000000000000000:0000000000000000 000000df603d9ff0 0 MTA (Threadpool Worker)

    17 4 33c0 000000df6047c7d0 21220 Preemptive 0000000000000000:0000000000000000 000000df603d9ff0 0 Ukn

    18 6 3174 000000df6051b420 1020220 Preemptive 0000000000000000:0000000000000000 000000df603d9ff0 0 Ukn (Threadpool Worker)

    19 7 337c 000000df60549fa0 1029220 Cooperative 000000E260B93590:000000E260B947D0 000000df6047b7d0 1 MTA (Threadpool Worker)

    20 8 2d1c 000000df6054b370 1029220 Cooperative 000000E060FA2DD8:000000E060FA3C90 000000df6047b7d0 0 MTA (Threadpool Worker)

    21 9 32e4 000000df60560a10 1029220 Cooperative 000000E260B5F130:000000E260B607D0 000000df6047b7d0 0 MTA (Threadpool Worker)

    22 10 1110 000000df60561690 1029220 Cooperative 000000DF60B5AB48:000000DF60B5B178 000000df6047b7d0 0 MTA (Threadpool Worker)

    25 11 257c 000000df5fc28520 1029220 Cooperative 000000E160AE4988:000000E160AE6190 000000df6047b7d0 0 MTA (Threadpool Worker)

    26 12 32c8 000000df5fc29c90 1029220 Cooperative 000000E1610D4570:000000E1610D4A60 000000df6047b7d0 0 MTA (Threadpool Worker)

    XXXX 13 0 000000df5fc2a460 8039820 Preemptive 0000000000000000:0000000000000000 000000df603d9ff0 0 Ukn (Threadpool Completion Port)

    28 15 1a64 000000df5fc2b400 202b220 Preemptive 0000000000000000:0000000000000000 000000df6047b7d0 1 MTA

    29 16 2fbc 000000df5fc2bbd0 8029220 Preemptive 000000DF60F077D0:000000DF60F09178 000000df603d9ff0 0 MTA (Threadpool Completion Port)

    30 17 2f78 000000df5fc28cf0 1029220 Cooperative 000000DF6167E630:000000DF6167F8B0 000000df6047b7d0 0 MTA (Threadpool Worker)

    31 5 312c 000000df5fc294c0 1029220 Preemptive 000000E1610E0D88:000000E1610E2A60 000000df603d9ff0 0 MTA (Threadpool Worker)

    2 14 2fb8 000000df5fc2ac30 20220 Preemptive 000000E061ADF4B8:000000E061AE1358 000000df603d9ff0 0 Ukn

    3 18 2c88 000000df604c6f60 20220 Preemptive 000000DF6167FA30:000000DF616818B0 000000df603d9ff0 0 Ukn

    0:000> .time

    Debug session time: Wed Sep 24 15:11:53.000 2014 (UTC + 8:00)

    System Uptime: 0 days 5:43:30.998

    Process Uptime: 0 days 0:01:16.000

    Kernel time: 0 days 0:00:00.000

    User time: 0 days 0:04:42.000

    5.随便找几个线程看看吧

    0:000> ~21s

    mscorlib_ni+0x5535e0:

    00007ffb`a4ad35e0 488b4b10 mov rcx,qword ptr [rbx+10h] ds:000000e0`607f0e28=000000e060fa2d78

    0:021> !clrstack

    OS Thread Id: 0x32e4 (21)

    Child SP IP Call Site

    000000e3ab54da90 00007ffba4ad35e0 System.Collections.Generic.Dictionary`2[[System.__Canon, mscorlib],[System.__Canon, mscorlib]].Insert(System.__Canon, System.__Canon, Boolean)

    000000e3ab54db20 00007ffb46ec5de9 *** ERROR: Module load completed but symbols could not be loaded for System.Web.Optimization.dll

    System.Web.Optimization.BundleCollection.Add(System.Web.Optimization.Bundle)

    /*********这里是客户代码*************/

    000000e3ab54e400 00007ffb46a55aca Rabbit.Web.Environment.Impl.WebHost.ActivateShell(Rabbit.Kernel.Environment.ShellBuilders.ShellContext)

    000000e3ab54e490 00007ffb46a44fd3 Rabbit.Web.Environment.Impl.WebHost.<CreateAndActivateShells>b__f(Rabbit.Kernel.Environment.Configuration.ShellSettings)

    000000e3ab54e520 00007ffba530df6d System.Threading.Tasks.Parallel+<>c__DisplayClassf`1[[System.__Canon, mscorlib]].<ForWorker>b__c()

    000000e3ab54e630 00007ffba52f0452 System.Threading.Tasks.Task.InnerInvokeWithArg(System.Threading.Tasks.Task)

    000000e3ab54e660 00007ffba54a8911 System.Threading.Tasks.Task+<>c__DisplayClass11.<ExecuteSelfReplicating>b__10(System.Object)

    000000e3ab54e700 00007ffba4ab30ee System.Threading.Tasks.Task.Execute()

    000000e3ab54e770 00007ffba4a18355 System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)

    000000e3ab54e8d0 00007ffba4a180c9 System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)

    000000e3ab54e900 00007ffba4ab33c5 System.Threading.Tasks.Task.ExecuteWithThreadLocal(System.Threading.Tasks.Task ByRef)

    000000e3ab54e9e0 00007ffba4ab2a65 System.Threading.Tasks.Task.ExecuteEntry(Boolean)

    000000e3ab54ea20 00007ffba4a9a22a System.Threading.ThreadPoolWorkQueue.Dispatch()

    000000e3ab54ef78 00007ffba5c40453 [DebuggerU2MCatchHandlerFrame: 000000e3ab54ef78]

    000000e3ab54f118 00007ffba5c40453 [ContextTransitionFrame: 000000e3ab54f118]

    000000e3ab54f338 00007ffba5c40453 [DebuggerU2MCatchHandlerFrame: 000000e3ab54f338]

    0:021> ~20s

    mscorlib_ni+0x5535f4:

    00007ffb`a4ad35f4 488d1476 lea rdx,[rsi+rsi*2]

    0:020> !clrstack

    OS Thread Id: 0x2d1c (20)

    Child SP IP Call Site

    000000e3ab40dbe0 00007ffba4ad35f4 System.Collections.Generic.Dictionary`2[[System.__Canon, mscorlib],[System.__Canon, mscorlib]].Insert(System.__Canon, System.__Canon, Boolean)

    000000e3ab40dc70 00007ffb46ec5de9 System.Web.Optimization.BundleCollection.Add(System.Web.Optimization.Bundle)

    /*********这里是客户代码*************/

    000000e3ab40e5e0 00007ffb46a44fd3 Rabbit.Web.Environment.Impl.WebHost.<CreateAndActivateShells>b__f(Rabbit.Kernel.Environment.Configuration.ShellSettings)

    000000e3ab40e670 00007ffba530df6d System.Threading.Tasks.Parallel+<>c__DisplayClassf`1[[System.__Canon, mscorlib]].<ForWorker>b__c()

    000000e3ab40e780 00007ffba52f0452 System.Threading.Tasks.Task.InnerInvokeWithArg(System.Threading.Tasks.Task)

    000000e3ab40e7b0 00007ffba54a8911 System.Threading.Tasks.Task+<>c__DisplayClass11.<ExecuteSelfReplicating>b__10(System.Object)

    000000e3ab40e850 00007ffba4ab30ee System.Threading.Tasks.Task.Execute()

    000000e3ab40e8c0 00007ffba4a18355 System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)

    000000e3ab40ea20 00007ffba4a180c9 System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)

    000000e3ab40ea50 00007ffba4ab33c5 System.Threading.Tasks.Task.ExecuteWithThreadLocal(System.Threading.Tasks.Task ByRef)

    000000e3ab40eb30 00007ffba4ab2a65 System.Threading.Tasks.Task.ExecuteEntry(Boolean)

    000000e3ab40eb70 00007ffba4a9a22a System.Threading.ThreadPoolWorkQueue.Dispatch()

    000000e3ab40f0c8 00007ffba5c40453 [DebuggerU2MCatchHandlerFrame: 000000e3ab40f0c8]

    000000e3ab40f268 00007ffba5c40453 [ContextTransitionFrame: 000000e3ab40f268]

    000000e3ab40f488 00007ffba5c40453 [DebuggerU2MCatchHandlerFrame: 000000e3ab40f488]

    后面随便在搞几个占用时间比较长的看看,都是这样子。

    实际上我看到第一个,心里就咯噔一下,原因就在下面

    http://blogs.msdn.com/b/tess/archive/2009/12/21/high-cpu-in-net-app-using-a-static-generic-dictionary.aspx

    实际上这个问题恰巧吴贞操同学也遇到过 http://www.cnblogs.com/wlflovenet/p/debugCpu100.html

    整理一下思路:查看高CPU的问题

    1. 首先看一下线程池和线程运行时间,(我们一般抓两个dump来对比看)
    2. 确定不是GC啥的因为的高CPU就直接看栈(其实因为GC分配内存啥的也能看得到,详见上上篇捉虫记(二)GC导致的hang
    3. 看看有没有死循环,有没有大量计算Hash,如果是asp.net,看看有没有静态容器(Dict or List)的线程安全问题。
  • 相关阅读:
    EntityFramework优缺点
    领导者与管理者的区别
    七个对我最好的职业建议(精简版)
    The best career advice I’ve received
    Difference between Stored Procedure and Function in SQL Server
    2015年上半年一次通过 信息系统项目管理师
    Difference between WCF and Web API and WCF REST and Web Service
    What’s the difference between data mining and data warehousing?
    What is the difference between a Clustered and Non Clustered Index?
    用new创建函数的过程发生了什么
  • 原文地址:https://www.cnblogs.com/StevenChennet/p/3991475.html
Copyright © 2011-2022 走看看