你可以使用下面的语句来使用sys.dm_os_wait_stats这个DMV得到线程的等待信息(线程在等什么, 等了多久)的统计数值.
WITH [Waits] AS (SELECT [wait_type], [wait_time_ms] / 1000.0 AS [WaitS], ([wait_time_ms] - [signal_wait_time_ms]) / 1000.0 AS [ResourceS], [signal_wait_time_ms] / 1000.0 AS [SignalS], [waiting_tasks_count] AS [WaitCount], 100.0 * [wait_time_ms] / SUM ([wait_time_ms]) OVER() AS [Percentage], ROW_NUMBER() OVER(ORDER BY [wait_time_ms] DESC) AS [RowNum] FROM sys.dm_os_wait_stats WHERE [wait_type] NOT IN ( N'BROKER_EVENTHANDLER', N'BROKER_RECEIVE_WAITFOR', N'BROKER_TASK_STOP', N'BROKER_TO_FLUSH', N'BROKER_TRANSMITTER', N'CHECKPOINT_QUEUE', N'CHKPT', N'CLR_AUTO_EVENT', N'CLR_MANUAL_EVENT', N'CLR_SEMAPHORE', N'DBMIRROR_DBM_EVENT', N'DBMIRROR_EVENTS_QUEUE', N'DBMIRROR_WORKER_QUEUE', N'DBMIRRORING_CMD', N'DIRTY_PAGE_POLL', N'DISPATCHER_QUEUE_SEMAPHORE', N'EXECSYNC', N'FSAGENT', N'FT_IFTS_SCHEDULER_IDLE_WAIT', N'FT_IFTSHC_MUTEX', N'HADR_CLUSAPI_CALL', N'HADR_FILESTREAM_IOMGR_IOCOMPLETION', N'HADR_LOGCAPTURE_WAIT', N'HADR_NOTIFICATION_DEQUEUE', N'HADR_TIMER_TASK', N'HADR_WORK_QUEUE', N'KSOURCE_WAKEUP', N'LAZYWRITER_SLEEP', N'LOGMGR_QUEUE', N'ONDEMAND_TASK_QUEUE', N'PWAIT_ALL_COMPONENTS_INITIALIZED', N'QDS_PERSIST_TASK_MAIN_LOOP_SLEEP', N'QDS_CLEANUP_STALE_QUERIES_TASK_MAIN_LOOP_SLEEP', N'REQUEST_FOR_DEADLOCK_SEARCH', N'RESOURCE_QUEUE', N'SERVER_IDLE_CHECK', N'SLEEP_BPOOL_FLUSH', N'SLEEP_DBSTARTUP', N'SLEEP_DCOMSTARTUP', N'SLEEP_MASTERDBREADY', N'SLEEP_MASTERMDREADY', N'SLEEP_MASTERUPGRADED', N'SLEEP_MSDBSTARTUP', N'SLEEP_SYSTEMTASK', N'SLEEP_TASK', N'SLEEP_TEMPDBSTARTUP', N'SNI_HTTP_ACCEPT', N'SP_SERVER_DIAGNOSTICS_SLEEP', N'SQLTRACE_BUFFER_FLUSH', N'SQLTRACE_INCREMENTAL_FLUSH_SLEEP', N'SQLTRACE_WAIT_ENTRIES', N'WAIT_FOR_RESULTS', N'WAITFOR', N'WAITFOR_TASKSHUTDOWN', N'WAIT_XTP_HOST_WAIT', N'WAIT_XTP_OFFLINE_CKPT_NEW_LOG', N'WAIT_XTP_CKPT_CLOSE', N'XE_DISPATCHER_JOIN', N'XE_DISPATCHER_WAIT', N'XE_TIMER_EVENT') AND [waiting_tasks_count] > 0 ) SELECT MAX ([W1].[wait_type]) AS [WaitType], CAST (MAX ([W1].[WaitS]) AS DECIMAL (16,2)) AS [Wait_S], CAST (MAX ([W1].[ResourceS]) AS DECIMAL (16,2)) AS [Resource_S], CAST (MAX ([W1].[SignalS]) AS DECIMAL (16,2)) AS [Signal_S], MAX ([W1].[WaitCount]) AS [WaitCount], CAST (MAX ([W1].[Percentage]) AS DECIMAL (5,2)) AS [Percentage], CAST ((MAX ([W1].[WaitS]) / MAX ([W1].[WaitCount])) AS DECIMAL (16,4)) AS [AvgWait_S], CAST ((MAX ([W1].[ResourceS]) / MAX ([W1].[WaitCount])) AS DECIMAL (16,4)) AS [AvgRes_S], CAST ((MAX ([W1].[SignalS]) / MAX ([W1].[WaitCount])) AS DECIMAL (16,4)) AS [AvgSig_S] FROM [Waits] AS [W1] INNER JOIN [Waits] AS [W2] ON [W2].[RowNum] <= [W1].[RowNum] GROUP BY [W1].[RowNum] HAVING SUM ([W2].[Percentage]) - MAX ([W1].[Percentage]) < 95; -- percentage threshold GO
一些说明:
SQL Server会永久地跟踪为什么执行的线程会处于等待状态. 你可以从SQL Server中得到这些信息, 然后缩小性能问题的调查范围.
有些人一开始调查性能问题就会看几次当下的线程在等待着什么, 然后尝试去弄清楚为什么. 而事实上是, '等待'这回事儿是always发生的, SQL的scheduleing系统就是这么工作的. 下面让我们来看一下scheduleing大致的工作流程吧.
正在使用CPU的线程(处于RUNNING状态)会在它需要某种资源的时候停下来. 该线程会被移到一个叫做SUSPENDED的线程的无序列表中. 同时, 在RUNNABLE队列(先进先出)中等待CPU的下一个线程会变为RUNNING状态. 如果在SUSPENDED列表中的一个线程得到通知说它想要的资源可用了, 它就会转到RUNNABLE队列中的底部. 就这样, 线程们像钟表一样的从RUNNING到SUSPENDED, 在到RUNNABLE, 再到RUNNING的循环, 直到任务完成.
SQL Server跟踪从离开RUNNING状态到再回到RUNNING状态的时间(叫做wait time), 和花在RUNNABLE队列中的时间(叫做signal wait time). 我们需要得到在SUSPENDED队列中的时间, 方法就是从整个的wait time中减掉signal wait time.
下面节选了一些常见的等待类型, 并作出解释. 更多等待类型的信息请看原文.
- PAGEIOLATCH_XX: 线程正在等待一个数据page从磁盘中读到buffer pool中.
- LCK_M_X: 线程正在等待被在某资源上赋予一个排他锁.
- CXPACKET: 并行线程中某一个线程在执行时, 整个query都会被block, 这时该数值会被累加. 具体见原文.
- WRITELOG: 日志管理系统在等待log被flush到磁盘上.
信息来源
=========================
Wait statistics, or please tell me where it hurts
http://www.sqlskills.com/blogs/paul/wait-statistics-or-please-tell-me-where-it-hurts/