zoukankan      html  css  js  c++  java
  • 【性能诊断】十一、性能问题综合分析(案例2,windbg、wireshark)

    【问题描述】:

          前段时间有一项目反馈,常用的审批功能有时的响应较慢,多个管理员功能不定期的出现客户端无响应的状况,并且管理员功能一旦出现卡死,也会影响到普通的业务用户致使很多用户无法操作。该系统为混合场景,管理员功能使用的CS客户端,业务用户使用的是WEB。

    【环境信息】:

          应用服务器:Windows Server 2008 R2 IIS7  .NET 4.0

          数据库服务器:AIX Oracle 10.2.0.4

    【处理过程】:

          经过跟踪发现两个情况:

    1. 确实有部分待审单据响应很慢,可以反复重现,貌似跟单据有关(慢的单据反复打开都会很慢,其他单据则一直没有类似问题),感觉很奇怪;
    2. 管理员功能发生了客户端不响应的情况后,查看对应数据库端的性能视图,没有SQL阻塞信息,也没有明显的执行时间很长、效率很低的SQL;

          在出现系统挂起时,抓取应用服务器的dump文件后,发现存在明显的线程阻塞:

    1. 105号线程阻塞了其他4个线程{(9-1)/2},而105号线程在等待数据库的返回,OracleLob,这是个特别的数据类型,需要引起警惕了;
    2. 根据线程堆栈:OperateDataBaseManager.GetPacekage(System.String),我们可以很快定位到程序代码及对应的SQL脚本;
    3. 根据windbg还可以看到,其他线程都在MetadataCache.GetObject()方法上处于被阻塞状态,查看对应的源代码,发现该方法使用了静态对象作为同步锁对象;

    下图是线程阻塞对应windbg的log日志

    Microsoft (R) Windows Debugger Version 6.12.0002.633 AMD64
    Copyright (c) Microsoft Corporation. All rights reserved.
    
    
    Loading Dump File [D:dumpsw3wp.DMP]
    User Mini Dump File with Full Memory: Only application data is available
    
    Symbol search path is: SRV*c:symbols*http://msdl.microsoft.com/download/symbols
    Executable search path is: 
    Windows 7 Version 7601 (Service Pack 1) MP (8 procs) Free x64
    Product: Server, suite: Enterprise TerminalServer SingleUserTS
    Machine Name:
    Debug session time: Tue Apr 14 14:22:10.000 2015 (UTC + 8:00)
    System Uptime: 19 days 20:26:50.903
    Process Uptime: 0 days 5:33:15.000
    ................................................................
    ................................................................
    ................................................................
    ................................................................
    ................................................................
    ................................................................
    ................................................................
    ......................................
    ntdll!NtWaitForSingleObject+0xa:
    00000000`7711135a c3              ret
    0:000> .load D:dumpssos.dll
    0:000> !tp
    CPU utilization: 0%
    Worker Thread: Total: 29 Running: 6 Idle: 23 MaxLimit: 800 MinLimit: 8
    Work Request in Queue: 0
    --------------------------------------
    Number of Timers: 43
    --------------------------------------
    Completion Port Thread:Total: 1 Free: 1 MaxFree: 16 CurrentLimit: 0 MaxLimit: 800 MinLimit: 8
    0:000> !syncblk
    Index         SyncBlock MonitorHeld Recursion Owning Thread Info          SyncBlock Owner
       47 0000000005532488            9         1 000000001385b1e0 93a0 105   000000017ffefbe0 System.Object
    -----------------------------
    Total           6125
    CCW             4
    RCW             11
    ComClassFactory 0
    Free            4333
    0:000> ~105s
    ntdll!NtWaitForSingleObject+0xa:
    00000000`7711135a c3              ret
    0:105> !clrstack
    OS Thread Id: 0x93a0 (105)
    Child SP         IP               Call Site
    000000000fbac038 000000007711135a [NDirectMethodFrameStandalone: 000000000fbac038] System.Data.Common.UnsafeNativeMethods.OCILobRead(System.Data.OracleClient.OciHandle, System.Data.OracleClient.OciHandle, System.Data.OracleClient.OciHandle, UInt32 ByRef, UInt32, IntPtr, UInt32, IntPtr, IntPtr, UInt16, CHARSETFORM)
    000000000fbabfb0 000007fef3455640 DomainNeutralILStubClass.IL_STUB_PInvoke(System.Data.OracleClient.OciHandle, System.Data.OracleClient.OciHandle, System.Data.OracleClient.OciHandle, UInt32 ByRef, UInt32, IntPtr, UInt32, IntPtr, IntPtr, UInt16, CHARSETFORM)*** WARNING: Unable to verify checksum for System.Data.OracleClient.ni.dll
    
    000000000fbac0f0 000007fef344ea9b System.Data.OracleClient.TracedNativeMethods.OCILobRead(System.Data.OracleClient.OciHandle, System.Data.OracleClient.OciHandle, System.Data.OracleClient.OciHandle, Int32 ByRef, UInt32, IntPtr, UInt32, UInt16, CHARSETFORM)
    000000000fbac1d0 000007fef347c63d System.Data.OracleClient.OracleLob.Read(Byte[], Int32, Int32)
    000000000fbac2a0 000007fef347d483 System.Data.OracleClient.OracleLob.get_Value()
    000000000fbac320 000007fef34721b6 System.Data.OracleClient.OracleColumn.GetValue(System.Data.OracleClient.NativeBuffer_RowBuffer)
    000000000fbac3d0 000007fef346efd2 System.Data.OracleClient.OracleDataReader.GetValues(System.Object[])
    000000000fbac420 000007fedd16faec System.Data.ProviderBase.SchemaMapping.LoadDataRow()*** WARNING: Unable to verify checksum for System.Data.ni.dll
    
    000000000fbac460 000007fedd1674fd System.Data.Common.DataAdapter.FillLoadDataRow(System.Data.ProviderBase.SchemaMapping)
    000000000fbac4f0 000007fedd1673d4 System.Data.Common.DataAdapter.FillFromReader(System.Data.DataSet, System.Data.DataTable, System.String, System.Data.ProviderBase.DataReaderContainer, Int32, Int32, System.Data.DataColumn, System.Object)
    000000000fbac5b0 000007fedd167271 System.Data.Common.DataAdapter.Fill(System.Data.DataSet, System.String, System.Data.IDataReader, Int32, Int32)
    000000000fbac670 000007fedd167ef6 System.Data.Common.DbDataAdapter.FillInternal(System.Data.DataSet, System.Data.DataTable[], Int32, Int32, System.String, System.Data.IDbCommand, System.Data.CommandBehavior)
    000000000fbac720 000007fedd167cc4 System.Data.Common.DbDataAdapter.Fill(System.Data.DataSet, Int32, Int32, System.String, System.Data.IDbCommand, System.Data.CommandBehavior)
    000000000fbac7d0 000007fedd167ac2 System.Data.Common.DbDataAdapter.Fill(System.Data.DataSet)
    000000000fbac880 000007ff002393ee xxxxxx.Platform.Core.DataAccess.Database.GetDataSet(System.String, System.String, System.Data.IDbDataParameter[], Boolean)*** ERROR: Module load completed but symbols could not be loaded for xxxxxx.Platform.Core.DataAccess.dll
    
    000000000fbac950 000007ff002391cd xxxxxx.Platform.Core.DataAccess.Database.GetDataSet(System.String, System.Data.IDbDataParameter[], Boolean)
    000000000fbac990 000007ff0023906a xxxxxx.Platform.Core.DataAccess.Database.ExecuteDataSet(System.String, System.Data.IDbDataParameter[])
    000000000fbac9f0 000007ff00238810 xxxxxx.Platform.Core.DataAccess.Database.ExecuteDataSet(System.String, System.Object[])
    000000000fbaca60 000007ff01566749 xxxxxx.Platform.Workflow.Server.Designer.OperateDataBaseManager.GetPacekage(System.String)*** WARNING: Unable to verify checksum for xxxxxx.Platform.Workflow.Server.dll
    *** ERROR: Module load completed but symbols could not be loaded for xxxxxx.Platform.Workflow.Server.dll
    
    000000000fbacab0 000007ff01566548 xxxxxx.Platform.Workflow.Server.Designer.OperateDataBaseManager.Load(System.String)
    000000000fbacb00 000007ff0024bbfe xxxxxx.Platform.Resource.ResourceManager.MetadataService.GetCheckedinVersionContent(System.String, System.Data.DataRow, System.String ByRef)*** ERROR: Module load completed but symbols could not be loaded for xxxxxx.Platform.Resource.MetadataManager.dll
    
    000000000fbacb60 000007ff0024b842 xxxxxx.Platform.Resource.ResourceManager.MetadataService.GetMetadataContentString(System.String, System.String ByRef, System.String)
    000000000fbacbe0 000007ff0024b53b xxxxxx.Platform.Resource.ResourceManager.MetadataService.GetMetadataContentFromDB(System.String)
    000000000fbacc40 000007ff00243d1a xxxxxx.Platform.Resource.ResourceManager.MetadataCache.GetObject(System.String, xxxxxx.Platform.AppFramework.Service.xxxState)
    000000000fbaccb0 000007ff00fb036b xxxxxx.Platform.Resource.MetadataManager.MetadataRestFul.TryGetNewVersionContent(System.IO.BinaryReader, System.IO.BinaryWriter)
    000000000fbad418 000007fef98d10b4 [DebuggerU2MCatchHandlerFrame: 000000000fbad418] 
    000000000fbad500 000007fef98d10b4 [CustomGCFrame: 000000000fbad500] 
    000000000fbad4c8 000007fef98d10b4 [GCFrame: 000000000fbad4c8] 
    000000000fbad458 000007fef98d10b4 [GCFrame: 000000000fbad458] 
    000000000fbad828 000007fef98d10b4 [HelperMethodFrame_PROTECTOBJ: 000000000fbad828] System.RuntimeMethodHandle._InvokeMethodFast(System.IRuntimeMethodInfo, System.Object, System.Object[], System.SignatureStruct ByRef, System.Reflection.MethodAttributes, System.RuntimeType)
    000000000fbad970 000007fef895587f System.Reflection.RuntimeMethodInfo.Invoke(System.Object, System.Reflection.BindingFlags, System.Reflection.Binder, System.Object[], System.Globalization.CultureInfo, Boolean)*** WARNING: Unable to verify checksum for mscorlib.ni.dll
    
    000000000fbadab0 000007fef89a0426 System.Reflection.RuntimeMethodInfo.Invoke(System.Object, System.Reflection.BindingFlags, System.Reflection.Binder, System.Object[], System.Globalization.CultureInfo)
    000000000fbadb00 000007ff00b4f3b3 xxxxxx.Platform.AppFramework.Service.xxxRestfulContext.InternalInvoke(System.String, System.Reflection.MethodInfo, System.Object, System.Object[], Int32[] ByRef, System.Object[] ByRef)*** ERROR: Module load completed but symbols could not be loaded for xxxxxx.Platform.AppFramework.RestfulService.dll
    
    000000000fbadb60 000007ff00b4ebf0 xxxxxx.Platform.AppFramework.Service.xxxRestfulContext.Invoke(System.String, System.String, System.String, Boolean, System.Object[], Int32[] ByRef, System.Object[] ByRef)
    000000000fbadc40 000007ff00c28ae8 xxxxxx.Platform.AppFramework.Service.xxxRestfulContext.Invoke(System.String, System.String, System.String, Boolean, System.Object[])
    000000000fbadca0 000007ff00c2876b xxxxxx.Platform.AppFramework.RESTFulWebService.xxxHttpWebHandler.InvokeStream(System.IO.BinaryReader, System.Web.HttpContext, Boolean)*** ERROR: Module load completed but symbols could not be loaded for xxxxxx.Platform.AppFramework.RESTFulWebService.dll
    
    000000000fbadd80 000007ff00b4de0c xxxxxx.Platform.AppFramework.RESTFulWebService.xxxHttpWebHandler.ProcessRequest(System.Web.HttpContext)
    000000000fbade70 000007fef1eb5c25 System.Web.HttpApplication+CallHandlerExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute()*** WARNING: Unable to verify checksum for System.Web.ni.dll
    
    000000000fbadf30 000007fef1e8337a System.Web.HttpApplication.ExecuteStep(IExecutionStep, Boolean ByRef)
    000000000fbadfe0 000007fef1e97030 System.Web.HttpApplication+ApplicationStepManager.ResumeSteps(System.Exception)
    000000000fbae0a0 000007fef1e82879 System.Web.HttpApplication.System.Web.IHttpAsyncHandler.BeginProcessRequest(System.Web.HttpContext, System.AsyncCallback, System.Object)
    000000000fbae100 000007fef1e8777c System.Web.HttpRuntime.ProcessRequestInternal(System.Web.HttpWorkerRequest)
    000000000fbae180 000007fef1e86477 System.Web.Hosting.ISAPIRuntime.ProcessRequest(IntPtr, Int32)
    000000000fbae2b0 000007fef25ed6dd DomainNeutralILStubClass.IL_STUB_COMtoCLR(Int64, Int32, Int32 ByRef)
    000000000fbae5e8 000007fef993e93e [GCFrame: 000000000fbae5e8] 
    000000000fbae628 000007fef993e93e [ContextTransitionFrame: 000000000fbae628] 
    000000000fbae668 000007fef993e93e [GCFrame: 000000000fbae668] 
    000000000fbae850 000007fef993e93e [ComMethodFrame: 000000000fbae850]
    0:105> ~* e!clrstack
    OS Thread Id: 0x2aec (0)
    Unable to walk the managed stack. The current thread is likely not a 
    managed thread. You can run !threads to get a list of managed threads in
    the process
    OS Thread Id: 0x15f0 (106)
    Child SP         IP               Call Site

    000000000ed5df70 00000000771118ca [GCFrame: 000000000ed5df70]
    000000000ed5e1c8 00000000771118ca [GCFrame: 000000000ed5e1c8]
    000000000ed5e178 00000000771118ca [HelperMethodFrame: 000000000ed5e178] System.Threading.Monitor.Enter(System.Object)
    000000000ed5e2d0 000007ff00243ce8 xxxxxx.Platform.Resource.ResourceManager.MetadataCache.GetObject(System.String, xxxxxx.Platform.AppFramework.Service.GSPState)
    000000000ed5e340 000007ff00d40c8d xxxxxx.Platform.Resource.ResourceManager.DefaultValueManager.IsFormMobileForm(System.String)
    000000000ed5e380 000007ff00d40ac0 xxxxxx.Platform.Resource.ResourceManager.DefaultValueManager.GetDefaultValueXmlExactly(System.String, System.String, System.String)
    000000000ed5e3d0 000007ff00d409df xxxxxx.Platform.Engine.DataAccess.Common.DAEUtils.GetDefaultValueXml(System.String, System.String, System.String)
    000000000ed5e420 000007ff00d40583 xxxxxx.Platform.Engine.DataAccess.GSPDataAccessEngine.NewObjectWithDefault(xxxxxx.Platform.Resource.Metadata.DataModel.GspDataModel, System.String, System.String)
    000000000ed5e500 000007ff0091495d xxxxxx.Platform.BizComponent.JSProxy.BizComponentRESTFulService.InvokeInternal(System.String, System.Object[], System.Collections.Generic.Dictionary`2<Int32,System.Object> ByRef, Boolean ByRef)
    000000000ed5e590 000007ff0023fe16 xxxxxx.Platform.BizComponent.JSProxy.BizComponentRESTFulService.Get(System.Collections.Generic.Dictionary`2<System.String,System.String>)
    000000000ed5e610 000007ff002276d8 xxxxxx.Platform.WebFramework.RestFulService.RESTFulHandlerForWeb.ProcessRequest(System.Web.HttpContext)
    000000000ed5e840 000007fef1eb5c25 System.Web.HttpApplication+CallHandlerExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute()
    000000000ed5e900 000007fef1e8337a System.Web.HttpApplication.ExecuteStep(IExecutionStep, Boolean ByRef)
    000000000ed5e9b0 000007fef1e97030 System.Web.HttpApplication+ApplicationStepManager.ResumeSteps(System.Exception)
    000000000ed5ea70 000007fef1e82879 System.Web.HttpApplication.System.Web.IHttpAsyncHandler.BeginProcessRequest(System.Web.HttpContext, System.AsyncCallback, System.Object)
    000000000ed5ead0 000007fef1e8777c System.Web.HttpRuntime.ProcessRequestInternal(System.Web.HttpWorkerRequest)
    000000000ed5eb50 000007fef1e86477 System.Web.Hosting.ISAPIRuntime.ProcessRequest(IntPtr, Int32)
    000000000ed5ec80 000007fef25ed6dd DomainNeutralILStubClass.IL_STUB_COMtoCLR(Int64, Int32, Int32 ByRef)
    000000000ed5efb8 000007fef993e93e [GCFrame: 000000000ed5efb8]
    000000000ed5eff8 000007fef993e93e [ContextTransitionFrame: 000000000ed5eff8]
    000000000ed5f038 000007fef993e93e [GCFrame: 000000000ed5f038]
    000000000ed5f220 000007fef993e93e [ComMethodFrame: 000000000ed5f220]

    下图为阻塞源头105号线程(被数据库返回阻塞),执行SQL对应的程序代码:

    image

    下图为线程阻塞的同步程序代码:

    image

    至此问题应该比较清楚了,执行SQL时间长的线程阻塞了其他线程,而线程同步锁的粒度太大(仅为了做数据缓存时,防止插入重复值,结果锁定了所有的元数据对象)。但是,该SQL脚本真的有问题吗?

    【网络分析】:

          将该SQL放到plsql中执行,客户端响应时间很长,经过对不同单据进行测试后发现,当LOB类型的数据超过16K以后问题就会出现,但该SQL在测试环境并没有问题。前面已经跟踪过数据库服务器,没有发现什么问题,因此目标开始转移到应用服务器到数据库服务器的网络环境上。

          直接使用ADO.NET和JDBC分别直接访问数据库,均发现响应延迟问题。与客户方IT人员沟通了解到,从应用服务器到数据库经过了一个思科的防火墙。我们就分别在应用服务器、应用服务器端的交换机、数据库服务器端的交换机、数据库服务器,4个点进行网络抓包。经过对比发现,经过防火墙前后两个交换机的数据包明显有问题:有非常明显的包乱序、包数量不一致、window size递减的情况;并且所有经过防火墙的网络包,window size的值均从原来的64k篡改为16k;

    如下图:

    image

    把这个情况反馈给思科的代理后,他们调整防火墙设置:好像是改为按流量控制,不是特别懂。

    至此,问题彻底解决!

  • 相关阅读:
    HDU 5791 Two (DP)
    POJ 1088 滑雪 (DPor记忆化搜索)
    LightOJ 1011
    POJ 1787 Charlie's Change (多重背包 带结果组成)
    HDU 5550 Game Rooms (ccpc2015 K)(dp)
    HDU 5542 The Battle of Chibi (ccpc 南阳 C)(DP 树状数组 离散化)
    HDU 5543 Pick The Sticks (01背包)
    HDU 5546 Ancient Go (ccpc2015南阳G)
    NB-IoT的DRX、eDRX、PSM三个模式 (转载,描述的简单易懂)
    MQTT 嵌入式端通讯协议解析(转)
  • 原文地址:https://www.cnblogs.com/zhaoguan_wang/p/4598315.html
Copyright © 2011-2022 走看看