  • Email alert when the CPU is overloaded

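    First, a quick look at how to use the top command. top provides a dynamic, real-time view of a running system. It can display a system summary as well as a list of the current processes or threads.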

    $ top -h
      procps-ng 3.3.12
    Usage:
      top -hv | -bcHiOSs -d secs -n max -u|U user -p pid(s) -o field -w [cols]
    

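    -hv   Help/Version: either flag prints the version and usage information, then exits

    The following command-line options override the defaults:

    -b    Batch-mode: non-interactive output, suitable for sending to other programs or a file
    -c    Command-line/Program-name: toggles between each task's full command line and its program name
    -H    Threads-mode: show individual threads. Without this option, top shows the sum of all threads in each process. In interactive mode, press "H" to toggle it
    -i    Idle-process toggle: when this toggle is Off, tasks that have not used any CPU since the last update are not displayed
    -O    Output-field-names: print the available field names, one per line, then exit
    -S    Cumulative-time: each process is listed with the CPU time that it and its dead children have used
    -s    Secure-mode: start top with secure mode forced
    
    -d    Delay-time: the delay between updates, in seconds
    -n    Number of iterations: how many times top refreshes before it exits
    -w    Output width: limits the number of columns in the output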
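To make the options above concrete, here is a minimal Python sketch that assembles a batch-mode top command and runs it once. The function names `build_top_command` and `run_top` and their parameters are hypothetical, chosen only for illustration. `run_top` assumes a procps-ng `top` is available on PATH.

```python
import subprocess

def build_top_command(iterations=1, hide_idle=True, full_cmdline=False, delay=None):
    """Assemble a non-interactive top invocation from a few of its options."""
    cmd = ["top", "-b"]                 # -b: batch mode, plain text on stdout
    if hide_idle:
        cmd.append("-i")                # -i: skip tasks that used no CPU since the last update
    if full_cmdline:
        cmd.append("-c")                # -c: full command line instead of program name
    if delay is not None:
        cmd += ["-d", str(delay)]       # -d: seconds between refreshes
    cmd += ["-n", str(iterations)]      # -n: number of refreshes before top exits
    return cmd

def run_top(**options):
    """Run top and return its output as text (needs top on PATH)."""
    result = subprocess.run(build_top_command(**options),
                            capture_output=True, text=True, check=True)
    return result.stdout
```

With the defaults, `build_top_command()` produces the equivalent of `top -b -i -n 1`.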
    

    The defaults are as follows (items marked with * can be overridden on the command line):

               Global-defaults
                  A - Alt display      Off (full-screen)
                * d - Delay time       1.5 seconds
                * H - Threads mode     Off (summarize as tasks)
                  I - Irix mode        On  (no, `solaris' smp)
                * p - PID monitoring   Off (show all processes)
                * s - Secure mode      Off (unsecured)
                  B - Bold enable      On  (yes, bold globally)
               Summary-Area-defaults
                  l - Load Avg/Uptime  On  (thus program name)
                  t - Task/Cpu states  On  (1+1 lines, see `1')
                  m - Mem/Swap usage   On  (2 lines worth)
                  1 - Single Cpu       Off (thus multiple cpus)
               Task-Area-defaults
                  b - Bold hilite      Off (use `reverse')
                * c - Command line     Off (name, not cmdline)
                * i - Idle tasks       On  (show all tasks)
                  J - Num align right  On  (not left justify)
                  j - Str align right  Off (not right justify)
                  R - Reverse sort     On  (pids high-to-low)
                * S - Cumulative time  Off (no, dead children)
                * u - User filter      Off (show euid only)
                * U - User filter      Off (show any uid)
                  V - Forest view      On  (show as branches)
                  x - Column hilite    Off (no, sort field)
                  y - Row hilite       On  (yes, running tasks)
                  z - color/mono       On  (show colors)
    

    To monitor CPU usage, we can watch the output of top -bi -n 1.
    Below is the output of the command watch top -bi -n 1:

    Every 2.0s: top -bi -n 1                                                                                                                                   MyServer: Fri Oct 18 08:45:14 2019
    
    top - 08:45:14 up 36 days,  1:50,  5 users,  load average: 0.07, 0.05, 0.01
    Tasks: 146 total,   1 running, 144 sleeping,   1 stopped,   0 zombie
    %Cpu(s):  0.1 us,  0.0 sy,  0.0 ni, 99.8 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
    KiB Mem :  2062096 total,   350188 free,   316304 used,  1395604 buff/cache
    KiB Swap:   524284 total,   523764 free,      520 used.  1550992 avail Mem
    
      PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
    

    When I start a thread that spins in a busy loop:

    Every 2.0s: top -bi -n 1                                                                                                                                   MyServer: Fri Oct 18 08:45:55 2019
    
    top - 08:45:55 up 36 days,  1:51,  5 users,  load average: 0.12, 0.06, 0.01
    Tasks: 148 total,   1 running, 146 sleeping,   1 stopped,   0 zombie
    %Cpu(s):  0.1 us,  0.0 sy,  0.0 ni, 99.8 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
    KiB Mem :  2062096 total,   339368 free,   327092 used,  1395636 buff/cache
    KiB Swap:   524284 total,   523764 free,      520 used.  1540204 avail Mem
    
      PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
     5595 d         20   0 3100664  33628  24520 S 100.0  1.6   0:04.71 java
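The load in the output above came from a java process. For readers who want to reproduce something similar, the following is a rough Python stand-in, not the original test program. The names `spin` and `spin_in_thread` are hypothetical, and the loop stops after a deadline so it cannot run away.

```python
import threading
import time

def spin(seconds, result):
    """Burn CPU in a tight loop for roughly `seconds`, then record the iteration count."""
    deadline = time.monotonic() + seconds
    count = 0
    while time.monotonic() < deadline:
        count += 1              # no sleep: the loop keeps one core busy
    result.append(count)

def spin_in_thread(seconds):
    """Start one spinning thread, wait for it, and return how many iterations it ran."""
    result = []
    worker = threading.Thread(target=spin, args=(seconds, result))
    worker.start()
    worker.join()
    return result[0]
```

Calling `spin_in_thread(30)` in one terminal while running `top -bi -n 1` in another should show the Python process near the top of the task list.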
    

    Of course, top -cbi -n 1 shows the full command line:

      PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
     4584 root      20   0 3142032 158940  27872 S   0.3  7.7   1:00.08 java -cp .:bin:SpringDependent/emcat/ref/tomcat-annotations-api-9.0.26.jar:SpringDependent/emcat/ref/tomcat-embed-core-+
    

    Use a regular expression to match the %CPU and %MEM columns:

    ^.*\s+(\d+\.\d+)\s+(\d+\.\d+)\s+.*$
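As an illustration of how the pattern above might be applied, here is a small Python sketch. It assumes the pattern is written with explicit backslash escapes and that the two captured groups are the %CPU and %MEM columns of a task row. The helper names `parse_row` and `total_usage` are hypothetical. Summing across rows is also an assumption, suggested by a value such as `Mem: 3.0999999999999996` in the program's log.

```python
import re

# Assumed form of the pattern: group 1 is %CPU, group 2 is %MEM.
ROW = re.compile(r"^.*\s+(\d+\.\d+)\s+(\d+\.\d+)\s+.*$")

def parse_row(line):
    """Return (cpu, mem) for a task row, or None for header and summary lines."""
    match = ROW.match(line)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))

def total_usage(top_output):
    """Sum %CPU and %MEM over every task row in one top snapshot."""
    cpu = mem = 0.0
    for line in top_output.splitlines():
        parsed = parse_row(line)
        if parsed is not None:
            cpu += parsed[0]
            mem += parsed[1]
    return cpu, mem
```

The summary lines (`top - ...`, `Tasks:`, `%Cpu(s):`, `KiB Mem`) do not contain two whitespace-separated decimal numbers in a row, so they fall through as `None` and only task rows contribute.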
    

    With that, the monitor can be implemented in code. Project repository: https://github.com/develon2015/CPUWarning

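    Sample 174 CPU:100.0       Mem: 1.7
    Sample 175 CPU:100.0       Mem: 1.7
    Average CPU usage: 100.1840909090909 %
    CPU overloaded (100.0%); checking the time of the last warning to decide whether to send an alert email this time
    Sending email -- (Sat Oct 19 00:53:28 EDT 2019)
    Alert email sent to develon@qq.com : CPU overload warning -> Server CPU is severely overloaded (100.1840909090909%); administrator, please handle it immediately.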
    top - 00:53:27 up 36 days, 17:58,  5 users,  load average: 0.97, 0.39, 0.15
    Tasks: 148 total,   1 running, 147 sleeping,   0 stopped,   0 zombie
    %Cpu(s):  0.1 us,  0.0 sy,  0.0 ni, 99.8 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
    KiB Mem :  2062096 total,   134600 free,   362592 used,  1564904 buff/cache
    KiB Swap:   524284 total,   523508 free,      776 used.  1515476 avail Mem
    
      PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
    13211 d         20   0 3100664  34784  25480 S 100.0  1.7   2:02.85 java
    
    FROM CPUWarning.
    Sample 0   CPU:106.7       Mem: 1.7
    Sample 1   CPU:106.7       Mem: 3.0999999999999996
    Sample 2   CPU:93.8        Mem: 1.7
    Sample 3   CPU:93.8        Mem: 1.7
    Sample 4   CPU:106.7       Mem: 1.7
    Sample 5   CPU:100.0       Mem: 1.7
    Sample 6   CPU:100.0       Mem: 1.7
    Sample 7   CPU:100.0       Mem: 1.7
    Sample 8   CPU:106.7       Mem: 1.7
    
    ...
    
    Sample 158 CPU:6.7 Mem: 2.9
    Sample 159 CPU:0.0 Mem: 0.0
    Sample 160 CPU:0.0 Mem: 0.0
    Sample 161 CPU:0.0 Mem: 0.0
    Sample 162 CPU:0.0 Mem: 0.0
    Sample 163 CPU:0.0 Mem: 0.0
    Average CPU usage: 36.94268292682926 %
    Alert cleared -- (Sat Oct 19 00:55:28 EDT 2019)
    Currently in a safe state (CPU 0.0 %) -- (Sat Oct 19 00:55:30 EDT 2019)
     top - 00:55:30 up 36 days, 18:00,  5 users,  load average: 0.35, 0.40, 0.18
    Tasks: 146 total,   1 running, 145 sleeping,   0 stopped,   0 zombie
    %Cpu(s):  0.1 us,  0.0 sy,  0.0 ni, 99.8 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
    KiB Mem :  2062096 total,   134324 free,   362808 used,  1564964 buff/cache
    KiB Swap:   524284 total,   523508 free,      776 used.  1515260 avail Mem
    
      PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
    
    Currently in a safe state (CPU 0.0 %) -- (Sat Oct 19 00:55:33 EDT 2019)
     top - 00:55:32 up 36 days, 18:00,  5 users,  load average: 0.32, 0.39, 0.18
    Tasks: 146 total,   1 running, 145 sleeping,   0 stopped,   0 zombie
    %Cpu(s):  0.1 us,  0.0 sy,  0.0 ni, 99.8 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
    KiB Mem :  2062096 total,   134324 free,   362808 used,  1564964 buff/cache
    KiB Swap:   524284 total,   523508 free,      776 used.  1515260 avail Mem
    
      PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
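The log above suggests a loop that averages a window of samples, alerts when the average is too high, suppresses repeat alerts for a while, and reports when the load drops again. The following Python sketch shows one way such logic could look. It is not taken from the CPUWarning project: the class name `OverloadMonitor`, the threshold, the window size, and the cooldown are all assumptions made for illustration.

```python
from collections import deque

class OverloadMonitor:
    """Track recent CPU samples and decide when to alert (hypothetical design)."""

    def __init__(self, threshold=90.0, window=60, cooldown=600.0):
        self.threshold = threshold            # average %CPU that counts as overload (assumed)
        self.samples = deque(maxlen=window)   # only the most recent `window` samples are kept
        self.cooldown = cooldown              # minimum seconds between two alert emails (assumed)
        self.last_alert = None                # time of the last alert, or None
        self.alerting = False                 # True while the average stays above the threshold

    def add_sample(self, cpu_percent, now):
        """Record one sample taken at time `now`; return "alert", "clear", or None."""
        self.samples.append(cpu_percent)
        if len(self.samples) < self.samples.maxlen:
            return None                       # window not full yet
        average = sum(self.samples) / len(self.samples)
        if average >= self.threshold:
            self.alerting = True
            if self.last_alert is None or now - self.last_alert >= self.cooldown:
                self.last_alert = now
                return "alert"
            return None                       # overloaded, but still inside the cooldown
        if self.alerting:
            self.alerting = False
            return "clear"
        return None
```

Passing the timestamp in as `now` rather than reading the clock inside the class keeps the decision logic easy to test with fixed values.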
    

    https://github.com/develon2015/CPUWarning
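For the email step itself, a minimal sketch using Python's standard library might look like the following. The function names, the example addresses, and the SMTP host and port are placeholders, not details from the project, and a real mail server would usually also require authentication or TLS.

```python
import smtplib
from email.message import EmailMessage

def build_alert(sender, recipient, average_cpu):
    """Compose the alert message; the wording mirrors the log, the addresses are up to the caller."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "CPU overload warning"
    msg.set_content(
        f"Server CPU is severely overloaded ({average_cpu:.1f}%). "
        "Please investigate immediately."
    )
    return msg

def send_alert(msg, host="localhost", port=25):
    """Hand the message to an SMTP server; host and port are placeholder values."""
    with smtplib.SMTP(host, port) as smtp:
        smtp.send_message(msg)
```

Keeping message construction separate from delivery means the message can be inspected without a mail server.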


  • Original article: https://www.cnblogs.com/develon/p/11700546.html