  • [Original] Uncle's Troubleshooting Notes (6): High iowait and High Load on the Dubbo Monitor Server

    1. Problem

    The server hosting the Dubbo monitor was in an abnormal state: both iowait and load stayed high, as the monitoring charts show:

    iowait chart:

    load chart:

    2. Analysis

    The iotop command shows the system's current disk I/O and how much of it each process is using.

     

    Its output identifies the PID of the process generating the I/O.

    Running

    cat /proc/${pid}/io

    shows the detailed read and write counters of that process.
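    The counters in /proc/${pid}/io are cumulative, so a rate comes from sampling the file twice and dividing the difference by the interval. A minimal sketch of that idea follows. The class name ProcIoSample and the one-second interval are illustrative assumptions, not part of the original investigation; read_bytes and write_bytes are the standard Linux fields for bytes actually read from or written to the storage layer.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: samples /proc/<pid>/io twice and prints the I/O rate.
public class ProcIoSample {

    /** Parses the "key: value" lines of /proc/<pid>/io into a map. */
    public static Map<String, Long> parse(String content) {
        Map<String, Long> counters = new LinkedHashMap<String, Long>();
        for (String line : content.split("\n")) {
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // skip blank or malformed lines
            }
            String key = line.substring(0, colon).trim();
            long value = Long.parseLong(line.substring(colon + 1).trim());
            counters.put(key, value);
        }
        return counters;
    }

    /** Bytes per second for one counter between two samples. */
    public static double rate(Map<String, Long> before, Map<String, Long> after,
                              String key, double seconds) {
        return (after.get(key) - before.get(key)) / seconds;
    }

    public static void main(String[] args) throws Exception {
        // Defaults to this JVM's own counters when no PID is given.
        String pid = args.length > 0 ? args[0] : "self";
        Path io = Paths.get("/proc", pid, "io");
        if (!Files.isReadable(io)) {
            System.out.println("cannot read " + io + " (Linux only; may require the owning user or root)");
            return;
        }
        double seconds = 1.0; // assumed sampling interval
        Map<String, Long> before = parse(new String(Files.readAllBytes(io), StandardCharsets.UTF_8));
        Thread.sleep((long) (seconds * 1000));
        Map<String, Long> after = parse(new String(Files.readAllBytes(io), StandardCharsets.UTF_8));
        System.out.printf("read:  %.1f KB/s%n", rate(before, after, "read_bytes", seconds) / 1024);
        System.out.printf("write: %.1f KB/s%n", rate(before, after, "write_bytes", seconds) / 1024);
    }
}
```

    Run it as `java ProcIoSample <pid>` on the affected host; the figures should be roughly comparable to what iotop reports for the same process.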

    Running

    ps aux|grep ${pid}

    shows the full command line of that process.

    These commands pointed to the Dubbo monitor process. A thread dump taken with jstack showed that, apart from the RUNNABLE threads sitting in

    java.lang.Thread.State: RUNNABLE

           at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)

    two threads looked suspicious:

    "DubboMonitorTimer-thread-1" daemon prio=10 tid=0x00007f53b0593000 nid=0x363e runnable [0x00007f53ac36d000]

       java.lang.Thread.State: RUNNABLE

           at java.io.FileInputStream.readBytes(Native Method)

           at java.io.FileInputStream.read(FileInputStream.java:272)

           at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:283)

           at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:325)

           at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177)

           - locked <0x0000000779d506f8> (a java.io.FileReader)

           at java.io.InputStreamReader.read(InputStreamReader.java:184)

           at java.io.BufferedReader.fill(BufferedReader.java:154)

           at java.io.BufferedReader.readLine(BufferedReader.java:317)

           - locked <0x0000000779d506f8> (a java.io.FileReader)

           at java.io.BufferedReader.readLine(BufferedReader.java:382)

           at com.alibaba.dubbo.monitor.simple.SimpleMonitorService.appendData(SimpleMonitorService.java:322)

           at com.alibaba.dubbo.monitor.simple.SimpleMonitorService.draw(SimpleMonitorService.java:263)

           at com.alibaba.dubbo.monitor.simple.SimpleMonitorService.access$300(SimpleMonitorService.java:64)

           at com.alibaba.dubbo.monitor.simple.SimpleMonitorService$2.run(SimpleMonitorService.java:137)

           at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)

           at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)

           at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)

           at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)

           at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)

           at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

           at java.lang.Thread.run(Thread.java:744)

       Locked ownable synchronizers:

           - <0x0000000788104268> (a java.util.concurrent.ThreadPoolExecutor$Worker)

    "DubboMonitorAsyncWriteLogThread" daemon prio=10 tid=0x00007f53b05ba000 nid=0x363d runnable [0x00007f53ac3af000]

       java.lang.Thread.State: RUNNABLE

           at java.io.FileOutputStream.writeBytes(Native Method)

           at java.io.FileOutputStream.write(FileOutputStream.java:345)

           at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)

           at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:291)

           at sun.nio.cs.StreamEncoder.implFlush(StreamEncoder.java:295)

           at sun.nio.cs.StreamEncoder.flush(StreamEncoder.java:141)

           - locked <0x000000077a028918> (a java.io.FileWriter)

           at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:229)

           at com.alibaba.dubbo.monitor.simple.SimpleMonitorService.write(SimpleMonitorService.java:213)

           at com.alibaba.dubbo.monitor.simple.SimpleMonitorService.access$100(SimpleMonitorService.java:64)

           at com.alibaba.dubbo.monitor.simple.SimpleMonitorService$1.run(SimpleMonitorService.java:120)

           at java.lang.Thread.run(Thread.java:744)

       Locked ownable synchronizers:

           - None

    One of these threads is reading and the other is writing. The Dubbo monitor source code shows:

     

    The monitor runs two threads: one continuously writes the statistics log, and the other performs a draw operation every 300s.

    The monitor stores each day's logs in its own directory. Checking the size of one day's log directory:

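    One day of logs is only a few tens of MB, whereas iotop reported reads of 1520 K/s and writes of 3.43 M/s. The log-writing thread is therefore ruled out, and the problem should lie in draw. The draw source code shows that it keeps reading the statistics logs while rendering charts.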

     

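    draw walks the log directory tree, and its appendData method reads the log files. Because the monitor had been running for a long time, there were a great many per-day directories, and every draw traversed all of them and read their files. The historical data never changes, yet it was re-read and re-drawn on every cycle. This is most likely where the problem lies.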
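    The reasoning that rules out the log-writing thread can be checked with quick arithmetic: spreading one day's log volume over 24 hours gives the average rate that thread needs. The sketch below assumes 50 MB as a stand-in for "a few tens of MB" (the exact size is not given in the original), and the class name LogRateEstimate is hypothetical.

```java
// Hypothetical back-of-the-envelope check, not part of the monitor itself.
public class LogRateEstimate {

    /** Average throughput in KB/s needed to write the given number of MB in 24 hours. */
    public static double averageKBPerSecond(double megabytesPerDay) {
        double secondsPerDay = 24 * 60 * 60; // 86400
        return megabytesPerDay * 1024 / secondsPerDay;
    }

    public static void main(String[] args) {
        double assumedDailyLogMB = 50;           // assumption: "a few tens of MB" taken as 50 MB
        double observedWriteKBps = 3.43 * 1024;  // iotop reported 3.43 M/s of writes
        double logKBps = averageKBPerSecond(assumedDailyLogMB);
        System.out.printf("log writer average: %.2f KB/s%n", logKBps);
        System.out.printf("share of observed write rate: %.4f%%%n",
                100 * logKBps / observedWriteKBps);
    }
}
```

    Under that assumption the log writer averages well under 1 KB/s, a tiny fraction of the traffic iotop reported, which is why attention shifts to draw.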

    3. Fix

    The code was modified so that draw only processes the current day's logs. After the change, the server's iowait chart:

     

    load chart:

     

    Problem solved.
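    The idea behind the fix, restricting draw to the current day's directory, can be sketched as below. This is not the actual patch to SimpleMonitorService: the class TodayOnlyDirs, its method names, and the yyyyMMdd directory naming pattern are all assumptions made for illustration.

```java
import java.io.File;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

// Hypothetical sketch: pick only today's per-day log directory instead of all of them.
public class TodayOnlyDirs {

    /** Directory name for a given date; the yyyyMMdd pattern is an assumption. */
    public static String dayName(Date date) {
        // SimpleDateFormat is not thread-safe, so create one per call.
        return new SimpleDateFormat("yyyyMMdd").format(date);
    }

    /** Returns only the per-day subdirectories of rootDir whose name matches the given day. */
    public static List<File> selectDirsToDraw(File rootDir, Date today) {
        List<File> selected = new ArrayList<File>();
        File[] dateDirs = rootDir.listFiles();
        if (dateDirs == null) {
            return selected; // root is missing or is not a directory
        }
        String wanted = dayName(today);
        for (File dateDir : dateDirs) {
            // Historical directories are skipped: their data no longer changes.
            if (dateDir.isDirectory() && wanted.equals(dateDir.getName())) {
                selected.add(dateDir);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        File root = new File(args.length > 0 ? args[0] : ".");
        for (File dir : selectDirsToDraw(root, new Date())) {
            System.out.println(dir.getPath());
        }
    }
}
```

    With a filter like this, the cost of each draw cycle depends on one day's data rather than growing with every day the monitor has been running.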

  • Original article: https://www.cnblogs.com/barneywill/p/9923739.html