zoukankan      html  css  js  c++  java
  • 记录java程序一次CPU占用90%问题排查过程

    线上问题出现有个服务多台机器CPU飙升到90%

    1、查看系统CPU占用

    top -H -p pid

    top - 11:51:57 up 110 days, 14:01, 0 users, load average: 7.08, 7.64, 8.15
    Threads: 586 total, 11 running, 575 sleeping, 0 stopped, 0 zombie
    %Cpu(s): 82.5 us, 7.0 sy, 0.0 ni, 8.0 id, 1.5 wa, 0.0 hi, 1.0 si, 0.0 st
    KiB Mem : 8174716 total, 157652 free, 6103332 used, 1913732 buff/cache
    KiB Swap: 0 total, 0 free, 0 used. 1741096 avail Mem

    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    31810 * 20 0 8485932 4.591g 0 R 53.9 58.9 117:41.64 java
    32411 * 20 0 8485932 4.591g 0 R 44.7 58.9 85:29.14 java
    31626 * 20 0 8485932 4.591g 0 S 21.7 58.9 0:53.47 java
    31716 * 20 0 8485932 4.591g 0 R 19.1 58.9 461:14.88 java
    6757 * 20 0 8485932 4.591g 0 R 15.5 58.9 27:29.25 java
    7217 * 20 0 8485932 4.591g 0 R 14.8 58.9 27:17.37 java
    7216 * 20 0 8485932 4.591g 0 R 12.8 58.9 27:21.51 java
    5295 * 20 0 8485932 4.591g 0 R 11.8 58.9 28:12.26 java
    31627 * 20 0 8485932 4.591g 0 S 11.5 58.9 15:56.04 java

    ....

    2、查看对应的具体线程占用资源情况,获取16进制线程id

    printf “%x
    ” pid

    printf '%x ' 32426
    7eaa

    3、导出jstack日志

    "logback-6" #528 daemon prio=5 os_prio=0 tid=0x00007fda40005000 nid=0x7eab waiting on condition [0x00007fda3c3df000]
       java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x0000000765aea438> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1081)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
    
       Locked ownable synchronizers:
        - None
    
    "logback-5" #527 daemon prio=5 os_prio=0 tid=0x00007fda40004800 nid=0x7eaa waiting on condition [0x00007fda1db8d000]
       java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x0000000765aea438> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1081)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
    
       Locked ownable synchronizers:
        - None

    4、使用第2步的id查找jstack日志,logback-5即线程名

    "logback-5" #527 daemon prio=5 os_prio=0 tid=0x00007fda40004800 nid=0x7eaa waiting on condition [0x00007fda1db8d000]
       java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x0000000765aea438> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1081)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
    
       Locked ownable synchronizers:
        - None

    根据线程名查找代码

    5、由于该服务日志量很大,且打印的都是trace日志,而且打印的是异步日志

    ch.qos.logback.core.AsyncAppenderBase#put
    
    private void put(E eventObject) {
            if (neverBlock) {
                blockingQueue.offer(eventObject);
            } else {
                putUninterruptibly(eventObject);
            }
        }

    综上所述排查顺序,则缩小打印日志量应该就可以减少日志打印的cpu占用情况,而且应该也能减少CMS 老年代垃圾回收时间。准备关闭trace日志,只使用公司的全链路日志,优化效果应该非常明显。

  • 相关阅读:
    Axure流程图
    Axure工作区间
    Axure简介
    Java知识导航总图
    SQL筛选出同一学科的时间最新的记录
    从高版本JDK换成低版本JDK报错Unsupported major.minor version 52.0
    java.lang.IllegalArgumentException: No converter found for return value of type
    httpClient创建对象、设置超时
    sql 查出一张表中重复的所有记录数据
    java List分批处理
  • 原文地址:https://www.cnblogs.com/beaconSky/p/14024361.html
Copyright © 2011-2022 走看看