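  • SQL Server Automated Operations Series: Performance Counter Monitoring Script (PowerShell)

    Requirements

    In production you often need to check performance counter values automatically and raise an early warning, for example by email, when a value becomes abnormal. This post shows how to implement that kind of monitoring with PowerShell.

    Counters to monitor

    From experience, a DBA generally needs to monitor the following system performance counters: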

      CPU:
     
        \Processor(_Total)\% Processor Time
        \Processor(_Total)\% Privileged Time
     
        \SQLServer:SQL Statistics\Batch Requests/sec
        \SQLServer:SQL Statistics\SQL Compilations/sec
        \SQLServer:SQL Statistics\SQL Re-Compilations/sec
        \System\Processor Queue Length
        \System\Context Switches/sec
     
      Memory:
     
        \Memory\Available Bytes
        \Memory\Pages/sec
        \Memory\Page Faults/sec
        \Memory\Pages Input/sec
        \Memory\Pages Output/sec
        \Process(sqlservr)\Private Bytes
        \SQLServer:Buffer Manager\Buffer cache hit ratio
        \SQLServer:Buffer Manager\Page life expectancy
        \SQLServer:Buffer Manager\Lazy writes/sec
        \SQLServer:Memory Manager\Memory Grants Pending
        \SQLServer:Memory Manager\Target Server Memory (KB)
        \SQLServer:Memory Manager\Total Server Memory (KB)
     
      Disk:
     
        \PhysicalDisk(_Total)\% Disk Time
        \PhysicalDisk(_Total)\Current Disk Queue Length
        \PhysicalDisk(_Total)\Avg. Disk Queue Length
        \PhysicalDisk(_Total)\Disk Transfers/sec
        \PhysicalDisk(_Total)\Disk Bytes/sec
        \PhysicalDisk(_Total)\Avg. Disk sec/Read
        \PhysicalDisk(_Total)\Avg. Disk sec/Write
     
      SQL Server:
     
        \SQLServer:Access Methods\FreeSpace Scans/sec
        \SQLServer:Access Methods\Full Scans/sec
        \SQLServer:Access Methods\Table Lock Escalations/sec
        \SQLServer:Access Methods\Worktables Created/sec
        \SQLServer:General Statistics\Processes blocked
        \SQLServer:General Statistics\User Connections
        \SQLServer:Latches\Total Latch Wait Time (ms)
        \SQLServer:Locks(_Total)\Lock Timeouts (timeout > 0)/sec
        \SQLServer:Locks(_Total)\Lock Wait Time (ms)
        \SQLServer:Locks(_Total)\Number of Deadlocks/sec
        \SQLServer:SQL Statistics\Batch Requests/sec
        \SQLServer:SQL Statistics\SQL Re-Compilations/sec

    For what these counters mean, see my previous post: "Which counters should you monitor in SQL Server".
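    As a quick way to see what a few of these counters currently report, the sketch below samples them once with Get-Counter on the local machine. The choice of three paths is arbitrary, the SQLServer: object name assumes a default instance (named instances expose MSSQL$<instance>: objects instead), and the guard is there only so the snippet does nothing on systems without the cmdlet.

    ```powershell
    # An arbitrary subset of the counters listed above (assumes a default SQL Server instance).
    $paths = @(
        '\Processor(_Total)\% Processor Time',
        '\Memory\Available Bytes',
        '\SQLServer:Buffer Manager\Page life expectancy'
    )

    # Get-Counter is a Windows-only cmdlet; skip quietly where it is unavailable.
    if (Get-Command Get-Counter -ErrorAction SilentlyContinue)
    {
        # One sample per counter; counters that do not exist on this machine are skipped.
        $sample = Get-Counter -Counter $paths -ErrorAction SilentlyContinue
        foreach ($s in $sample.CounterSamples)
        {
            '{0} = {1:N2}' -f $s.Path, $s.CookedValue
        }
    }
    ```

    Each output line pairs the full counter path with its cooked (already scaled) value, which is the same value the monitoring script compares against its thresholds.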

    Monitoring script

    # Connection and mail settings. Replace the placeholder credentials before use.
    $server = "(local)"
    $uid = "sa"
    $db = "master"
    # Named $password rather than $pwd, because $PWD is PowerShell's automatic current-directory variable.
    $password = "password"
    $mailprfname = "SendEmail"
    $recipients = "787449667@qq.com"
    $subject = "Database performance counter alert!"
    $computernamexml = "f:\computername.xml"
    $alter_cpuxml = "f:\alter_cpu.xml"

    # Returns the trimmed computer names listed in the XML file.
    function GetServerName($xmlpath)
    {
        $xml = [xml] (Get-Content $xmlpath)
        $return = New-Object Collections.Generic.List[string]
        # @() keeps the loop working whether the file lists one computer or several.
        foreach ($name in @($xml.computernames.computername))
        {
            $return.Add(([string]$name).Trim())
        }
        $return
    }

    # Returns the <Counter> elements: the counter path as text plus the alter/operator attributes.
    function GetAlterCounter($xmlpath)
    {
        $xml = [xml] (Get-Content $xmlpath)
        $xml.counters.Counter
    }

    # Sends the alert text through Database Mail (msdb.dbo.sp_send_dbmail).
    function CreateAlter($message)
    {
        $SqlConnection = New-Object System.Data.SqlClient.SqlConnection
        $SqlConnection.ConnectionString = "Server = $server; Database = $db;User Id = $uid; Password = $password"
        $cc = $SqlConnection.CreateCommand()
        # Passing values as parameters keeps quotes in the message from breaking the statement.
        $cc.CommandType = [System.Data.CommandType]::StoredProcedure
        $cc.CommandText = "msdb.dbo.sp_send_dbmail"
        [void]$cc.Parameters.AddWithValue("@profile_name", $mailprfname)
        [void]$cc.Parameters.AddWithValue("@recipients", $recipients)
        [void]$cc.Parameters.AddWithValue("@body", $message)
        [void]$cc.Parameters.AddWithValue("@subject", $subject)
        try
        {
            $SqlConnection.Open()
            $cc.ExecuteNonQuery() | Out-Null
        }
        finally
        {
            $SqlConnection.Close()
        }
    }

    $names = GetServerName $computernamexml
    # @() guarantees an array even when only one counter is configured.
    $pfcounters = @(GetAlterCounter $alter_cpuxml)
    foreach ($cp in $names)
    {
        # Build the full remote path for each counter: \\computer\object(instance)\counter
        $p = New-Object Collections.Generic.List[string]
        foreach ($pfc in $pfcounters)
        {
            $p.Add("\\" + $cp + $pfc.get_InnerText().Trim())
        }
        $report = ""
        $count = Get-Counter $p
        # Samples are matched to the configured counters by position,
        # which assumes one sample per counter, returned in the requested order.
        for ($i = 0; $i -lt $count.CounterSamples.Count; $i++)
        {
            $v = $count.CounterSamples[$i].CookedValue
            $pfc = $pfcounters[$i]
            $lg = ""
            # operator names the healthy relation: "lt" expects value < alter, "gt" expects value > alter.
            if ($pfc.operator -eq "lt" -and $v -ge [double]$pfc.alter)
            {
                $lg = "Greater Than"
            }
            elseif ($pfc.operator -eq "gt" -and $v -le [double]$pfc.alter)
            {
                $lg = "Less Than"
            }
            if ($lg -ne "")
            {
                $path = "\\" + $cp + $pfc.get_InnerText().Trim()
                $item = "{0}:{1};{2} Threshold:{3}" -f $path, $v.ToString(), $lg, $pfc.alter.Trim()
                $report += $item + "`n"
            }
        }
        if ($report -ne "")
        {
            # Raise the alert; each report line carries the counter, its current value and the threshold.
            CreateAlter $report
        }
    }

    The script relies on two configuration files, referenced by $computernamexml and $alter_cpuxml. Their contents are as follows.

    computername.xml:

    <computernames>
            <computername>
                    wuxuelei-pc
            </computername>
    </computernames>

    alter_cpu.xml:

    <Counters>
            <Counter alter="10" operator="gt">\Processor(_Total)\% Processor Time</Counter>
            <Counter alter="10" operator="gt">\Processor(_Total)\% Privileged Time</Counter>
            <Counter alter="10" operator="gt">\SQLServer:SQL Statistics\Batch Requests/sec</Counter>
            <Counter alter="10" operator="gt">\SQLServer:SQL Statistics\SQL Compilations/sec</Counter>
            <Counter alter="10" operator="gt">\SQLServer:SQL Statistics\SQL Re-Compilations/sec</Counter>
            <Counter alter="10" operator="lt">\System\Processor Queue Length</Counter>
            <Counter alter="10" operator="lt">\System\Context Switches/sec</Counter>
    </Counters>

    Here alter is the threshold, and operator names the relation a healthy value is expected to satisfy. For the first entry (operator gt, alter 10), the script raises a warning when the counter value is less than or equal to the threshold, that is, when the value fails to exceed 10.
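    The threshold rule can be isolated into a small function to make the lt/gt semantics easier to check. The sketch below mirrors the comparison in the script; the function name Test-CounterAlert, the inlined XML, and the readings are hypothetical and exist only for illustration.

    ```powershell
    # Hypothetical helper: returns a description when the value violates the configured relation, else $null.
    # "lt" means a healthy value stays below the threshold; "gt" means it stays above.
    function Test-CounterAlert([double]$Value, [double]$Threshold, [string]$Operator)
    {
        if ($Operator -eq 'lt' -and $Value -ge $Threshold) { return 'Greater Than' }
        if ($Operator -eq 'gt' -and $Value -le $Threshold) { return 'Less Than' }
        return $null
    }

    # Same shape as alter_cpu.xml, inlined so the example needs no file.
    $config = [xml]('<Counters>' +
        '<Counter alter="10" operator="gt">\Processor(_Total)\% Processor Time</Counter>' +
        '<Counter alter="10" operator="lt">\System\Processor Queue Length</Counter>' +
        '</Counters>')

    # Made-up readings, one per configured counter, in the same order.
    $readings = @(4.5, 3)

    $alerts = @()
    $i = 0
    foreach ($c in $config.Counters.Counter)
    {
        $state = Test-CounterAlert -Value $readings[$i] -Threshold ([double]$c.alter) -Operator $c.operator
        if ($state)
        {
            $alerts += '{0}: {1}; {2} Threshold: {3}' -f $c.get_InnerText().Trim(), $readings[$i], $state, $c.alter
        }
        $i++
    }
    $alerts
    ```

    With these readings only the first counter produces an alert: 4.5 is not above the gt threshold of 10, while a queue length of 3 stays below its lt threshold.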

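    This kind of user-defined configuration gives you a flexible, adjustable standard for automated monitoring:

    1. You can check disk free space, for example

    2. You can detect peak-load conditions

    3. You can periodically adjust the thresholds used in production based on historical values, which is the so-called operating baseline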
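    Point 3 could be sketched as follows: derive a threshold from recent history as the mean plus a multiple of the standard deviation, then write it into a Counter element. The mean-plus-k-sigma rule, the function name Get-BaselineThreshold, and the sample history are assumptions made for illustration, not the author's method.

    ```powershell
    # Hypothetical helper: baseline threshold = mean + Sigma * population standard deviation.
    function Get-BaselineThreshold([double[]]$History, [double]$Sigma = 2)
    {
        $mean = ($History | Measure-Object -Average).Average
        $sumSq = 0.0
        foreach ($h in $History) { $sumSq += [Math]::Pow($h - $mean, 2) }
        $stdDev = [Math]::Sqrt($sumSq / $History.Count)
        $mean + $Sigma * $stdDev
    }

    # Made-up CPU readings (% Processor Time) from earlier runs.
    $history = 20, 22, 18, 20
    $threshold = Get-BaselineThreshold -History $history

    # Write the rounded value into an element shaped like those in alter_cpu.xml.
    $config = [xml]'<Counters><Counter alter="10" operator="lt">\Processor(_Total)\% Processor Time</Counter></Counters>'
    $rounded = [Math]::Round($threshold, 2)
    $text = $rounded.ToString([Globalization.CultureInfo]::InvariantCulture)
    $config.Counters.Counter.SetAttribute('alter', $text)
    # $config.Save('f:\alter_cpu.xml') would persist the change; omitted here on purpose.
    $config.OuterXml
    ```

    Formatting the number with the invariant culture keeps the decimal separator as a dot; the monitoring script's [double] cast also parses with the invariant culture, so the value round-trips regardless of regional settings.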

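    Ways to run the alert check

    1. Configure a SQL Agent Job

    2. Use a Windows scheduled task

    Either approach can be adapted as needed, and both are fairly simple to set up; if you have not used them before, a quick search will cover the details. If you would rather not touch the production system itself, you can add a dedicated server for automated operations checks and monitor remotely from there.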
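    For option 2, one possible way to register the script with Windows Task Scheduler is the ScheduledTasks module (available from Windows 8 / Windows Server 2012). This is a sketch only: the wrapper name Register-CounterMonitorTask, the task name, and the five-minute interval are assumptions, and creating a task normally requires an elevated session.

    ```powershell
    # Hypothetical wrapper: registers a task that runs the monitoring script at a fixed interval.
    function Register-CounterMonitorTask([string]$ScriptPath, [int]$Minutes = 5)
    {
        # Run the script in a fresh PowerShell process without loading a profile.
        $arguments = "-NoProfile -File `"$ScriptPath`""
        $action = New-ScheduledTaskAction -Execute 'powershell.exe' -Argument $arguments

        # Start now and repeat every $Minutes minutes for one year.
        $interval = New-TimeSpan -Minutes $Minutes
        $duration = New-TimeSpan -Days 365
        $trigger = New-ScheduledTaskTrigger -Once -At (Get-Date) -RepetitionInterval $interval -RepetitionDuration $duration

        Register-ScheduledTask -TaskName 'SqlCounterMonitor' -Action $action -Trigger $trigger
    }
    ```

    Defining the function has no side effects; calling it with the path of the saved monitoring script (for example a hypothetical F:\DBAlter\monitor.ps1) creates the task. A SQL Agent Job with a PowerShell or CmdExec step would achieve the same schedule from inside SQL Server.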

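    Later posts will look at PowerShell remoting, including capturing a screenshot automatically at the moment of an incident and sending it by email, so that the DBA has first-hand evidence for diagnosing the problem.

    The result looks like this (screenshot not preserved in this copy).

    The above only demonstrates the approach; adapt and extend it to suit your own needs.

    Script download: http://files.cnblogs.com/zhijianliutang/DBALter.zip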

  • Original article: https://www.cnblogs.com/zhijianliutang/p/4170488.html