zoukankan      html  css  js  c++  java
  • 【性能诊断】十一、性能问题综合分析(案例2,windbg、wireshark)

    【问题描述】:

          前段时间有一项目反馈,常用的审批功能有时的响应较慢,多个管理员功能不定期的出现客户端无响应的状况,并且管理员功能一旦出现卡死,也会影响到普通的业务用户致使很多用户无法操作。该系统为混合场景,管理员功能使用的CS客户端,业务用户使用的是WEB。

    【环境信息】:

          应用服务器:Windows Server 2008 R2 IIS7  .NET 4.0

          数据库服务器:AIX Oracle 10.2.0.4

    【处理过程】:

          经过跟踪发现两个情况:

    1. 确实有部分待审单据响应很慢,可以反复重现,貌似跟单据有关(慢的单据反复打开都会很慢,其他单据则一直没有类似问题),感觉很奇怪;
    2. 管理员功能发生了客户端不响应的情况后,查看对应数据库端的性能视图,没有SQL阻塞信息,也没有明显的执行时间很长、效率很低的SQL;

          在出现系统挂起时,抓取应用服务器的dump文件后,发现存在明显的线程阻塞:

    1. 105号线程阻塞了其他4个线程{(9-1)/2},而105号线程在等待数据库的返回,OracleLob,这是个特别的数据类型,需要引起警惕了;
    2. 根据线程堆栈:OperateDataBaseManager.GetPacekage(System.String),我们可以很快定位到程序代码及对应的SQL脚本;
    3. 根据windbg还可以看到,其他线程都在MetadataCache.GetObject()方法上处于被阻塞状态,查看对应的源代码,发现该方法使用了静态对象作为同步锁对象;

    下图是线程阻塞对应windbg的log日志

    Microsoft (R) Windows Debugger Version 6.12.0002.633 AMD64
    Copyright (c) Microsoft Corporation. All rights reserved.
    
    
    Loading Dump File [D:dumpsw3wp.DMP]
    User Mini Dump File with Full Memory: Only application data is available
    
    Symbol search path is: SRV*c:symbols*http://msdl.microsoft.com/download/symbols
    Executable search path is: 
    Windows 7 Version 7601 (Service Pack 1) MP (8 procs) Free x64
    Product: Server, suite: Enterprise TerminalServer SingleUserTS
    Machine Name:
    Debug session time: Tue Apr 14 14:22:10.000 2015 (UTC + 8:00)
    System Uptime: 19 days 20:26:50.903
    Process Uptime: 0 days 5:33:15.000
    ................................................................
    ................................................................
    ................................................................
    ................................................................
    ................................................................
    ................................................................
    ................................................................
    ......................................
    ntdll!NtWaitForSingleObject+0xa:
    00000000`7711135a c3              ret
    0:000> .load D:dumpssos.dll
    0:000> !tp
    CPU utilization: 0%
    Worker Thread: Total: 29 Running: 6 Idle: 23 MaxLimit: 800 MinLimit: 8
    Work Request in Queue: 0
    --------------------------------------
    Number of Timers: 43
    --------------------------------------
    Completion Port Thread:Total: 1 Free: 1 MaxFree: 16 CurrentLimit: 0 MaxLimit: 800 MinLimit: 8
    0:000> !syncblk
    Index         SyncBlock MonitorHeld Recursion Owning Thread Info          SyncBlock Owner
       47 0000000005532488            9         1 000000001385b1e0 93a0 105   000000017ffefbe0 System.Object
    -----------------------------
    Total           6125
    CCW             4
    RCW             11
    ComClassFactory 0
    Free            4333
    0:000> ~105s
    ntdll!NtWaitForSingleObject+0xa:
    00000000`7711135a c3              ret
    0:105> !clrstack
    OS Thread Id: 0x93a0 (105)
    Child SP         IP               Call Site
    000000000fbac038 000000007711135a [NDirectMethodFrameStandalone: 000000000fbac038] System.Data.Common.UnsafeNativeMethods.OCILobRead(System.Data.OracleClient.OciHandle, System.Data.OracleClient.OciHandle, System.Data.OracleClient.OciHandle, UInt32 ByRef, UInt32, IntPtr, UInt32, IntPtr, IntPtr, UInt16, CHARSETFORM)
    000000000fbabfb0 000007fef3455640 DomainNeutralILStubClass.IL_STUB_PInvoke(System.Data.OracleClient.OciHandle, System.Data.OracleClient.OciHandle, System.Data.OracleClient.OciHandle, UInt32 ByRef, UInt32, IntPtr, UInt32, IntPtr, IntPtr, UInt16, CHARSETFORM)*** WARNING: Unable to verify checksum for System.Data.OracleClient.ni.dll
    
    000000000fbac0f0 000007fef344ea9b System.Data.OracleClient.TracedNativeMethods.OCILobRead(System.Data.OracleClient.OciHandle, System.Data.OracleClient.OciHandle, System.Data.OracleClient.OciHandle, Int32 ByRef, UInt32, IntPtr, UInt32, UInt16, CHARSETFORM)
    000000000fbac1d0 000007fef347c63d System.Data.OracleClient.OracleLob.Read(Byte[], Int32, Int32)
    000000000fbac2a0 000007fef347d483 System.Data.OracleClient.OracleLob.get_Value()
    000000000fbac320 000007fef34721b6 System.Data.OracleClient.OracleColumn.GetValue(System.Data.OracleClient.NativeBuffer_RowBuffer)
    000000000fbac3d0 000007fef346efd2 System.Data.OracleClient.OracleDataReader.GetValues(System.Object[])
    000000000fbac420 000007fedd16faec System.Data.ProviderBase.SchemaMapping.LoadDataRow()*** WARNING: Unable to verify checksum for System.Data.ni.dll
    
    000000000fbac460 000007fedd1674fd System.Data.Common.DataAdapter.FillLoadDataRow(System.Data.ProviderBase.SchemaMapping)
    000000000fbac4f0 000007fedd1673d4 System.Data.Common.DataAdapter.FillFromReader(System.Data.DataSet, System.Data.DataTable, System.String, System.Data.ProviderBase.DataReaderContainer, Int32, Int32, System.Data.DataColumn, System.Object)
    000000000fbac5b0 000007fedd167271 System.Data.Common.DataAdapter.Fill(System.Data.DataSet, System.String, System.Data.IDataReader, Int32, Int32)
    000000000fbac670 000007fedd167ef6 System.Data.Common.DbDataAdapter.FillInternal(System.Data.DataSet, System.Data.DataTable[], Int32, Int32, System.String, System.Data.IDbCommand, System.Data.CommandBehavior)
    000000000fbac720 000007fedd167cc4 System.Data.Common.DbDataAdapter.Fill(System.Data.DataSet, Int32, Int32, System.String, System.Data.IDbCommand, System.Data.CommandBehavior)
    000000000fbac7d0 000007fedd167ac2 System.Data.Common.DbDataAdapter.Fill(System.Data.DataSet)
    000000000fbac880 000007ff002393ee xxxxxx.Platform.Core.DataAccess.Database.GetDataSet(System.String, System.String, System.Data.IDbDataParameter[], Boolean)*** ERROR: Module load completed but symbols could not be loaded for xxxxxx.Platform.Core.DataAccess.dll
    
    000000000fbac950 000007ff002391cd xxxxxx.Platform.Core.DataAccess.Database.GetDataSet(System.String, System.Data.IDbDataParameter[], Boolean)
    000000000fbac990 000007ff0023906a xxxxxx.Platform.Core.DataAccess.Database.ExecuteDataSet(System.String, System.Data.IDbDataParameter[])
    000000000fbac9f0 000007ff00238810 xxxxxx.Platform.Core.DataAccess.Database.ExecuteDataSet(System.String, System.Object[])
    000000000fbaca60 000007ff01566749 xxxxxx.Platform.Workflow.Server.Designer.OperateDataBaseManager.GetPacekage(System.String)*** WARNING: Unable to verify checksum for xxxxxx.Platform.Workflow.Server.dll
    *** ERROR: Module load completed but symbols could not be loaded for xxxxxx.Platform.Workflow.Server.dll
    
    000000000fbacab0 000007ff01566548 xxxxxx.Platform.Workflow.Server.Designer.OperateDataBaseManager.Load(System.String)
    000000000fbacb00 000007ff0024bbfe xxxxxx.Platform.Resource.ResourceManager.MetadataService.GetCheckedinVersionContent(System.String, System.Data.DataRow, System.String ByRef)*** ERROR: Module load completed but symbols could not be loaded for xxxxxx.Platform.Resource.MetadataManager.dll
    
    000000000fbacb60 000007ff0024b842 xxxxxx.Platform.Resource.ResourceManager.MetadataService.GetMetadataContentString(System.String, System.String ByRef, System.String)
    000000000fbacbe0 000007ff0024b53b xxxxxx.Platform.Resource.ResourceManager.MetadataService.GetMetadataContentFromDB(System.String)
    000000000fbacc40 000007ff00243d1a xxxxxx.Platform.Resource.ResourceManager.MetadataCache.GetObject(System.String, xxxxxx.Platform.AppFramework.Service.xxxState)
    000000000fbaccb0 000007ff00fb036b xxxxxx.Platform.Resource.MetadataManager.MetadataRestFul.TryGetNewVersionContent(System.IO.BinaryReader, System.IO.BinaryWriter)
    000000000fbad418 000007fef98d10b4 [DebuggerU2MCatchHandlerFrame: 000000000fbad418] 
    000000000fbad500 000007fef98d10b4 [CustomGCFrame: 000000000fbad500] 
    000000000fbad4c8 000007fef98d10b4 [GCFrame: 000000000fbad4c8] 
    000000000fbad458 000007fef98d10b4 [GCFrame: 000000000fbad458] 
    000000000fbad828 000007fef98d10b4 [HelperMethodFrame_PROTECTOBJ: 000000000fbad828] System.RuntimeMethodHandle._InvokeMethodFast(System.IRuntimeMethodInfo, System.Object, System.Object[], System.SignatureStruct ByRef, System.Reflection.MethodAttributes, System.RuntimeType)
    000000000fbad970 000007fef895587f System.Reflection.RuntimeMethodInfo.Invoke(System.Object, System.Reflection.BindingFlags, System.Reflection.Binder, System.Object[], System.Globalization.CultureInfo, Boolean)*** WARNING: Unable to verify checksum for mscorlib.ni.dll
    
    000000000fbadab0 000007fef89a0426 System.Reflection.RuntimeMethodInfo.Invoke(System.Object, System.Reflection.BindingFlags, System.Reflection.Binder, System.Object[], System.Globalization.CultureInfo)
    000000000fbadb00 000007ff00b4f3b3 xxxxxx.Platform.AppFramework.Service.xxxRestfulContext.InternalInvoke(System.String, System.Reflection.MethodInfo, System.Object, System.Object[], Int32[] ByRef, System.Object[] ByRef)*** ERROR: Module load completed but symbols could not be loaded for xxxxxx.Platform.AppFramework.RestfulService.dll
    
    000000000fbadb60 000007ff00b4ebf0 xxxxxx.Platform.AppFramework.Service.xxxRestfulContext.Invoke(System.String, System.String, System.String, Boolean, System.Object[], Int32[] ByRef, System.Object[] ByRef)
    000000000fbadc40 000007ff00c28ae8 xxxxxx.Platform.AppFramework.Service.xxxRestfulContext.Invoke(System.String, System.String, System.String, Boolean, System.Object[])
    000000000fbadca0 000007ff00c2876b xxxxxx.Platform.AppFramework.RESTFulWebService.xxxHttpWebHandler.InvokeStream(System.IO.BinaryReader, System.Web.HttpContext, Boolean)*** ERROR: Module load completed but symbols could not be loaded for xxxxxx.Platform.AppFramework.RESTFulWebService.dll
    
    000000000fbadd80 000007ff00b4de0c xxxxxx.Platform.AppFramework.RESTFulWebService.xxxHttpWebHandler.ProcessRequest(System.Web.HttpContext)
    000000000fbade70 000007fef1eb5c25 System.Web.HttpApplication+CallHandlerExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute()*** WARNING: Unable to verify checksum for System.Web.ni.dll
    
    000000000fbadf30 000007fef1e8337a System.Web.HttpApplication.ExecuteStep(IExecutionStep, Boolean ByRef)
    000000000fbadfe0 000007fef1e97030 System.Web.HttpApplication+ApplicationStepManager.ResumeSteps(System.Exception)
    000000000fbae0a0 000007fef1e82879 System.Web.HttpApplication.System.Web.IHttpAsyncHandler.BeginProcessRequest(System.Web.HttpContext, System.AsyncCallback, System.Object)
    000000000fbae100 000007fef1e8777c System.Web.HttpRuntime.ProcessRequestInternal(System.Web.HttpWorkerRequest)
    000000000fbae180 000007fef1e86477 System.Web.Hosting.ISAPIRuntime.ProcessRequest(IntPtr, Int32)
    000000000fbae2b0 000007fef25ed6dd DomainNeutralILStubClass.IL_STUB_COMtoCLR(Int64, Int32, Int32 ByRef)
    000000000fbae5e8 000007fef993e93e [GCFrame: 000000000fbae5e8] 
    000000000fbae628 000007fef993e93e [ContextTransitionFrame: 000000000fbae628] 
    000000000fbae668 000007fef993e93e [GCFrame: 000000000fbae668] 
    000000000fbae850 000007fef993e93e [ComMethodFrame: 000000000fbae850]
    0:105> ~* e!clrstack
    OS Thread Id: 0x2aec (0)
    Unable to walk the managed stack. The current thread is likely not a 
    managed thread. You can run !threads to get a list of managed threads in
    the process
    OS Thread Id: 0x15f0 (106)
    Child SP         IP               Call Site

    000000000ed5df70 00000000771118ca [GCFrame: 000000000ed5df70]
    000000000ed5e1c8 00000000771118ca [GCFrame: 000000000ed5e1c8]
    000000000ed5e178 00000000771118ca [HelperMethodFrame: 000000000ed5e178] System.Threading.Monitor.Enter(System.Object)
    000000000ed5e2d0 000007ff00243ce8 xxxxxx.Platform.Resource.ResourceManager.MetadataCache.GetObject(System.String, xxxxxx.Platform.AppFramework.Service.GSPState)
    000000000ed5e340 000007ff00d40c8d xxxxxx.Platform.Resource.ResourceManager.DefaultValueManager.IsFormMobileForm(System.String)
    000000000ed5e380 000007ff00d40ac0 xxxxxx.Platform.Resource.ResourceManager.DefaultValueManager.GetDefaultValueXmlExactly(System.String, System.String, System.String)
    000000000ed5e3d0 000007ff00d409df xxxxxx.Platform.Engine.DataAccess.Common.DAEUtils.GetDefaultValueXml(System.String, System.String, System.String)
    000000000ed5e420 000007ff00d40583 xxxxxx.Platform.Engine.DataAccess.GSPDataAccessEngine.NewObjectWithDefault(xxxxxx.Platform.Resource.Metadata.DataModel.GspDataModel, System.String, System.String)
    000000000ed5e500 000007ff0091495d xxxxxx.Platform.BizComponent.JSProxy.BizComponentRESTFulService.InvokeInternal(System.String, System.Object[], System.Collections.Generic.Dictionary`2<Int32,System.Object> ByRef, Boolean ByRef)
    000000000ed5e590 000007ff0023fe16 xxxxxx.Platform.BizComponent.JSProxy.BizComponentRESTFulService.Get(System.Collections.Generic.Dictionary`2<System.String,System.String>)
    000000000ed5e610 000007ff002276d8 xxxxxx.Platform.WebFramework.RestFulService.RESTFulHandlerForWeb.ProcessRequest(System.Web.HttpContext)
    000000000ed5e840 000007fef1eb5c25 System.Web.HttpApplication+CallHandlerExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute()
    000000000ed5e900 000007fef1e8337a System.Web.HttpApplication.ExecuteStep(IExecutionStep, Boolean ByRef)
    000000000ed5e9b0 000007fef1e97030 System.Web.HttpApplication+ApplicationStepManager.ResumeSteps(System.Exception)
    000000000ed5ea70 000007fef1e82879 System.Web.HttpApplication.System.Web.IHttpAsyncHandler.BeginProcessRequest(System.Web.HttpContext, System.AsyncCallback, System.Object)
    000000000ed5ead0 000007fef1e8777c System.Web.HttpRuntime.ProcessRequestInternal(System.Web.HttpWorkerRequest)
    000000000ed5eb50 000007fef1e86477 System.Web.Hosting.ISAPIRuntime.ProcessRequest(IntPtr, Int32)
    000000000ed5ec80 000007fef25ed6dd DomainNeutralILStubClass.IL_STUB_COMtoCLR(Int64, Int32, Int32 ByRef)
    000000000ed5efb8 000007fef993e93e [GCFrame: 000000000ed5efb8]
    000000000ed5eff8 000007fef993e93e [ContextTransitionFrame: 000000000ed5eff8]
    000000000ed5f038 000007fef993e93e [GCFrame: 000000000ed5f038]
    000000000ed5f220 000007fef993e93e [ComMethodFrame: 000000000ed5f220]

    下图为阻塞源头105号线程(被数据库返回阻塞),执行SQL对应的程序代码:

    image

    下图为线程阻塞的同步程序代码:

    image

    至此问题应该比较清楚了,执行SQL时间长的线程阻塞了其他线程,而线程同步锁的粒度太大(仅为了做数据缓存时,防止插入重复值,结果锁定了所有的元数据对象)。但是,该SQL脚本真的有问题吗?

    【网络分析】:

          将该SQL放到plsql中执行,客户端响应时间很长,经过对不同单据进行测试后发现,当LOB类型的数据超过16K以后问题就会出现,但该SQL在测试环境并没有问题。前面已经跟踪过数据库服务器,没有发现什么问题,因此目标开始转移到应用服务器到数据库服务器的网络环境上。

          直接使用ADO.NET和JDBC分别直接访问数据库,均发现响应延迟问题。与客户方IT人员沟通了解到,从应用服务器到数据库经过了一个思科的防火墙。我们就分别在应用服务器、应用服务器端的交换机、数据库服务器端的交换机、数据库服务器,4个点进行网络抓包。经过对比发现,经过防火墙前后两个交换机的数据包明显有问题:有非常明显的包乱序、包数量不一致、window size递减的情况;并且所有经过防火墙的网络包,window size的值均从原来的64k篡改为16k;

    如下图:

    image

    把这个情况反馈给思科的代理后,他们调整防火墙设置:好像是改为按流量控制,不是特别懂。

    至此,问题彻底解决!

  • 相关阅读:
    mysql常用操作语句
    开启端口命令
    在linux终端远程登陆linux服务器
    CentOS个人目录下中文路径转英文路径
    安装MariaDB和简单配置
    centos 7.1 apache 源码编译安装
    PHP页面跳转
    一亿个不重复的随机数算法
    HTML中添加背景
    SQL语法
  • 原文地址:https://www.cnblogs.com/zhaoguan_wang/p/4598315.html
Copyright © 2011-2022 走看看