zoukankan      html  css  js  c++  java
  • How to Analyze "Deadlocked Schedulers" Dumps?---WINDBG

    https://blogs.msdn.microsoft.com/karthick_pk/2010/06/22/how-to-analyze-deadlocked-schedulers-dumps/

    How to Analyze "Deadlocked Schedulers" Dumps?

    Newer version of this post is available in http://mssqlwiki.com/2010/06/15/how-to-analyze-deadlocked-schedulers-dumps/

    Do you see "Deadlocked Schedulers" errors similar to one below and 
    stuck?

    From SQLServer Errorlog

    **Dump thread – spid = 0, PSS = 0x0000000000000000, EC = 
    0x0000000000000000

    ***Stack Dump being sent to C:Program FilesMicrosoft SQL 
    ServerMSSQL.1MSSQLLOGSQLDump0001.txt

    * BEGIN STACK DUMP:

    * Deadlocked Schedulers

    * Short Stack Dump

    Stack Signature for the dump is 0x00000000000003D0

    New queries assigned to process on Node 0 have not been picked  up by a worker thread in the last 60 seconds. Blocking or long-running queries can 
    contribute to this condition, and may degrade client response time.  Use the "max worker threads" configuration option to increase number  of allowable 
    threads, or optimize current running queries.  SQL Process Utilization: 0%. System Idle: 69%.

    Cause

    We get Deadlocked Schedulers error when Scheduler Monitor detects Threads(workers) are not Progressing on schedulers.

    Some of common causes are

    1.Most of the tasks are waiting on a single resource and SQL Server could not spawn new thread to take new work request and there is no Idle thread to process the new work Request . In systems with multiple nodes (Numa)  If all the threads which belong to schedulers of single node is exhausted (or)  Schedulers not progressing on single node can cause deadlocked scheduler condition.

    2.   Excessive blocking, Very long running Queries executed by all workers, All the threads waiting on some resource.

    Steps to analyze "Deadlocked Schedulers" Dumps.

    To analyze the dump download and Install Windows Debugger from This  link

    Step 1:

    Open Windbg .  Choose File menu –> select Open crash dump –>Select the Dump file (SQLDump000#.mdmp)

    Step 2:

    on command window type 
    .sympath srv*c:Websymbols*http://msdl.microsoft.com/download/symbols;

    Step 3:

    Type .reload /f and hit enter. This will force debugger to immediately load all the symbols.

    Step 4:

    Verify if symbols are loaded for  SQL Server by using the debugger command lmvm

    0:002> lmvm sqlservr 
    start             end                 module name 
    00000000`01000000 00000000`03679000   sqlservr T (pdb symbols)          c:websymbolssqlservr.pdb21E4AC6E96294A529C9D99826B5A7C032sqlservr.pdb 
        Loaded symbol image file: sqlservr.exe 
        Image path: C:Program FilesMicrosoft SQL ServerMSSQL.1MSSQLBinnsqlservr.exe 
        Image name: sqlservr.exe 
        Timestamp:        Wed Oct 07 21:15:52 2009 (4ACD6778) 
        CheckSum:         025FEB5E 
        ImageSize:        02679000 
        File version:     2005.90.4266.0 
        Product version:  9.0.4266.0 
        File flags:       0 (Mask 3F) 
        File OS:          40000 NT Base 
        File type:        1.0 App 
        File date:        00000000.00000000 
        Translations:     0000.04b0 0000.04e4 0409.04b0 0409.04e4

    Step 5:

    Type  ~*kL 20   and look at the stack of all the threads  to find what majority of threads are doing.

    1. If it is blocking issue and If most of the threads are  waiting to 
    acquire a lock you will find the most of the stack similar to one below. (We try 
    to acquire lock and go to wait, since someone is holding a lock)

    ntdll!ZwSignalAndWaitForSingleObject

    kernel32!SignalObjectAndWait

    sqlservr!SOS_Scheduler::SwitchContext

    sqlservr!SOS_Scheduler::Suspend

    sqlservr!SOS_Event::Wait

    sqlservr!LockOwner::Sleep

    sqlservr!lck_lockInternal

    sqlservr!GetLock

    2. If most of threads are stuck while trying to write profiler events 
    to the destination you might find stack similar to one below

    ntdll!ZwSignalAndWaitForSingleObject

    kernel32!SignalObjectAndWait

    sqlservr!SOS_Scheduler::SwitchContext

    sqlservr!SOS_Task::Sleep

    sqlservr!CTraceRowsetIoProvider::GetFreeBuffers

    sqlservr!CTraceWriteRequest::InitForRowsetTrace

    sqlservr!CTraceRowsetIoProvider::InitializeWriteRequest

    sqlservr!CTrace::WriteRecord

    sqlservr!CTraceController::ProduceRecord

    sqlservr!CTraceData::TracePreBatchEvent

    sqlservr!CSQLSource::Execute

    sqlservr!process_request

    sqlservr!process_commands

    sqlservr!SOS_Task::Param::Execute

    sqlservr!SOS_Scheduler::RunTask

    sqlservr!SOS_Scheduler::ProcessTasks

    3. If your stack’s  are like one below refer http://support.microsoft.com/default.aspx?scid=kb;EN-US;974205

    sqlservr!SpinlockBase::Sleep sqlservr!SpinlockBase::SpinToAcquire 
    sqlservr!TSyncHashTable_EntryAccessorsqlservr!CQSIndexStatsMgr::AddNewMissingIndex 
    sqlservr!CIdxSuggestion::Register 
    sqlservr!COptExpr::PqteConvert sqlservr!CPhyOp_Top::PqteConvert sqlservr!COptExpr::PqteConvert 
    sqlservr!COptExpr::PqteConvertTree sqlservr!COptContext::PcxteOptimizeQuery 
    sqlservr!CQuery::Optimize sqlservr!CQuery::PqoBuild sqlservr!CStmtQuery::InitQuery 
    sqlservr!CStmtSelect::Init
     

    4. If you see many stacks like the one below it could be BPOOL memory

    pressure (or) Lazy writer  waiting on I/O

    sqlservr!BPool::Steal

    sqlservr!SQLSinglePageAllocator::AllocatePages

    sqlservr!MemoryNode::AllocatePagesInternal

    sqlservr!MemoryClerkInternal::AllocatePages

    sqlservr!IMemObj::PbGetNewPages

    sqlservr!CSlotPageMgr::PbAllocate

    5. If you see many stacks like the one below it should be because of

    excessive parallelism

    sqlservr!CQScanXProducerNew::Open

    sqlservr!FnProducerOpen

    sqlservr!FnProducerThread

    sqlservr!SubprocEntrypoint

    6. If you see many stacks like the one below (Many threads waiting to

    flush log) it should be because of disk bottleneck’s. Check if you see "I/O

    requests taking longer than 15 seconds" messages in Errorlog before Deadlocked

    Schedulers Dumps. Refer http://mssqlwiki.com/sqlwiki/sql-performance/io-bottlenecks/

    for troubleshooting I/O issues.

    sqlservr!SOS_Event::Wait

    sqlservr!SQLServerLogMgr::WaitLCFlush

    sqlservr!SQLServerLogMgr::LogFlush

    sqlservr!SQLServerLogMgr::WaitLogFlush

    sqlservr!XdesRMFull::Commit

    Related blogs:

    http://mssqlwiki.com/sqlwiki/sql-performance/basics-of-sql-server-memory-architecture/

    http://mssqlwiki.com/sqlwiki/sql-performance/windows-2008-and-windows-2008-r2-known-issues-related-to-working-set-memory/

    http://mssqlwiki.com/2012/05/18/sql-server-performance-degraded-in-32-bit-sql-server-after-i-adding-additional-ram/

    http://mssqlwiki.com/sqlwiki/sql-performance/troubleshooting-sql-server-memory/

    http://mssqlwiki.com/sqlwiki/sql-performance/io-bottlenecks/

    If you liked this post, do like us on FaceBook at https://www.facebook.com/mssqlwiki and join our FaceBook grouphttps://www.facebook.com/mssqlwiki#!/groups/454762937884205/

    Regards

    Karthick P.K

  • 相关阅读:
    模块
    python运算符与表达式
    subShell与代码块
    参数展开与特殊字符
    变量和参数
    shelll Test
    Cypher查询语言--Neo4j 入门 (一)
    那些争议最大的编程观点
    大型网站架构不得不考虑的10个问题
    大型网站架构演变
  • 原文地址:https://www.cnblogs.com/zengkefu/p/6943640.html
Copyright © 2011-2022 走看看