  • Diagnosing a ScheduledExecutorService that stops running after a while

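    While checking the Sentinel console in production today, I noticed that 3 of the 5 nodes of one Dubbo application had lost contact with the console.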

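    The application logs on the disconnected nodes showed that the service had not crashed: the logs for each Dubbo interface were still being written normally.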

    Running ping/telnet from the application nodes against the Sentinel console node confirmed that its IP and port were reachable.

    Next I checked the sentinel-record logs on the application nodes: grep "Heartbeat" sentinel-record.log.2019-01-1*

    sentinel-record.log.2019-01-14.0:2019-01-14 16:50:43 [Sentinel InitExecutor] Found init func: com.alibaba.csp.sentinel.transport.init.HeartbeatSenderInitFunc
    sentinel-record.log.2019-01-14.0:2019-01-14 16:50:43 [SimpleHttpHeartbeatSender] Default console address list retrieved: [/xxx:xxx]
    sentinel-record.log.2019-01-14.0:2019-01-14 16:50:43 [HeartbeatSenderInit] HeartbeatSender started: com.alibaba.csp.sentinel.transport.heartbeat.SimpleHttpHeartbeatSender
    sentinel-record.log.2019-01-14.0:2019-01-14 16:50:43 [Sentinel InitExecutor] Initialized: com.alibaba.csp.sentinel.transport.init.HeartbeatSenderInitFunc with order 2147483647
    

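    The only heartbeat entries are from January 14. The application had been changed that day and the nodes were restarted after the build and release; the log shows that heartbeat initialization completed normally.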

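    I then used jmc to inspect the threads that Sentinel uses to send its periodic heartbeat on each node.
    Disconnected node:

    Healthy node:

    On the disconnected nodes, all of these threads were in the WAITING state, whereas on the healthy nodes one thread was in TIMED_WAITING.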
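
    The WAITING versus TIMED_WAITING difference is what a plain JDK scheduled pool shows when its queue is empty versus when a periodic task is still pending. The sketch below is only an illustration under stated assumptions, not Sentinel's code: the class WorkerStateDemo, the thread name demo-heartbeat, the single-thread pool and the 60-second period are all made up for the demo. It runs a periodic task once and then reports the state the idle worker settles into, either with a healthy task or with one that lets an exception escape.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

class WorkerStateDemo {

    /** Schedules one periodic task and returns the state the idle worker settles into after its first run. */
    static Thread.State idleStateAfterFirstRun(boolean taskThrows) {
        AtomicReference<Thread> worker = new AtomicReference<>();
        // Hypothetical single-thread pool; the factory only captures the worker thread for inspection.
        ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "demo-heartbeat");
            t.setDaemon(true);
            worker.set(t);
            return t;
        });
        CountDownLatch firstRun = new CountDownLatch(1);
        try {
            pool.scheduleAtFixedRate(() -> {
                firstRun.countDown();
                if (taskThrows) {
                    // Stand-in for any Throwable that escapes the task body.
                    throw new IllegalStateException("simulated failure");
                }
            }, 0, 60, TimeUnit.SECONDS);

            firstRun.await();
            // Poll (for at most 5 s) until the worker has parked on the pool's internal queue.
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
            Thread.State state = worker.get().getState();
            while (state != Thread.State.WAITING && state != Thread.State.TIMED_WAITING
                    && System.nanoTime() < deadline) {
                Thread.sleep(10);
                state = worker.get().getState();
            }
            return state;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("healthy task: " + idleStateAfterFirstRun(false));
        System.out.println("failed task:  " + idleStateAfterFirstRun(true));
    }
}
```

    With a healthy task the worker should park with a timeout until the next run is due (TIMED_WAITING); once the task has failed it is not re-queued, so the worker should park indefinitely on an empty queue (WAITING). That matches the jmc observation, but it only shows the states are consistent with a dropped task; it does not prove what dropped it.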

    The heartbeat-sending code in the HeartbeatSenderInitFunc class:

    private void scheduleHeartbeatTask(/*@NonNull*/ final HeartbeatSender sender, /*@Valid*/ long interval) {
        pool.scheduleAtFixedRate(new Runnable() {
            @Override
            public void run() {
                try {
                    sender.sendHeartbeat();
                } catch (Throwable e) {
                    RecordLog.warn("[HeartbeatSender] Send heartbeat error", e);
                }
            }
        }, 5000, interval, TimeUnit.MILLISECONDS);
        RecordLog.info("[HeartbeatSenderInit] HeartbeatSender started: "
            + sender.getClass().getCanonicalName());
    }
    

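    Inside the task, sender.sendHeartbeat() is wrapped in a catch for Throwable that logs any exception;
    yet no such exception message appears anywhere in the logs.
    From this I inferred that the scheduled task's thread had silently stopped doing its work.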
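
    One mechanism that would fit this inference, offered as a hypothesis rather than a confirmed root cause: scheduleAtFixedRate is documented to suppress all subsequent executions once any execution of the task ends with an exception, and a catch (Throwable) block does not help if the handler itself fails (for example, if logging cannot allocate memory). The sketch below simulates that path with made-up names (SuppressedTaskDemo, a 20 ms period, and an IllegalStateException standing in for the real error).

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class SuppressedTaskDemo {

    /** Runs a periodic task whose catch block fails on the third run; returns how many times it ran. */
    static int runUntilSuppressed() {
        ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger runs = new AtomicInteger();
        try {
            ScheduledFuture<?> future = pool.scheduleAtFixedRate(() -> {
                int n = runs.incrementAndGet();
                try {
                    if (n == 3) {
                        throw new RuntimeException("simulated send failure");
                    }
                } catch (Throwable e) {
                    // Simulates the handler itself failing, so nothing gets logged
                    // and the exception escapes run().
                    throw new IllegalStateException("simulated failure inside the catch block");
                }
            }, 0, 20, TimeUnit.MILLISECONDS);

            try {
                future.get(); // returns only when the periodic task fails or is cancelled
            } catch (ExecutionException e) {
                System.out.println("Task ended with: " + e.getCause());
            }
            Thread.sleep(200); // ten more periods pass, but the task is never run again
            return runs.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("runs = " + runUntilSuppressed());
    }
}
```

    The pool itself stays alive and reports nothing; the only trace of the failure is on the ScheduledFuture, which the HeartbeatSenderInitFunc code shown above does not keep.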

    Searching around suggested a possible cause: an out-of-memory condition in the application can kill the thread.

    So I searched the last two weeks of application logs for OutOfMemoryError:

    [ ERROR] [2019-01-14 10:25:29] [6beacd73653e7f50/6beacd73653e7f50]  [DubboServerHandler-xxx:xxx-thread-397] com.alibaba.dubbo.rpc.filter.ExceptionFilter [91] -  [DUBBO] Got unchecked and undeclared exception which called by xxx. service: com.winxuan.services.shopps.service.ShopItemService, method: getShopItemInfoId, exception: java.lang.OutOfMemoryError: GC overhead limit exceeded, dubbo version: 2.6.0, current host: xxx
    

    Summary:

    If a Java application hits an OutOfMemoryError, its ScheduledExecutorService may stop running its scheduled tasks.
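
    One possible safeguard, sketched here under assumptions and not taken from Sentinel: keep the ScheduledFuture returned by scheduleAtFixedRate and let a second periodic check reschedule the task when isDone() reports that it has ended. The class WatchedPeriodicTask, its methods and the timings are hypothetical.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class WatchedPeriodicTask {
    private final ScheduledExecutorService pool = Executors.newScheduledThreadPool(2);
    private final Runnable task;
    private final long periodMillis;
    private final AtomicInteger restarts = new AtomicInteger();
    private volatile ScheduledFuture<?> current;

    WatchedPeriodicTask(Runnable task, long periodMillis) {
        this.task = task;
        this.periodMillis = periodMillis;
    }

    void start() {
        current = schedule();
        // Watchdog: checks once per period whether the task's future has ended.
        pool.scheduleWithFixedDelay(this::restartIfDead, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
    }

    private ScheduledFuture<?> schedule() {
        return pool.scheduleAtFixedRate(task, 0, periodMillis, TimeUnit.MILLISECONDS);
    }

    private void restartIfDead() {
        try {
            if (current.isDone()) { // a periodic future is done only after failure or cancellation
                restarts.incrementAndGet();
                current = schedule();
            }
        } catch (Throwable t) {
            // Swallowed so the watchdog keeps running; it can still be lost the same way the task was.
        }
    }

    int restartCount() {
        return restarts.get();
    }

    void stop() {
        pool.shutdownNow();
    }

    /** Demo: a task that fails on its 2nd run is rescheduled and keeps counting. Returns {runs, restarts}. */
    static int[] demo() {
        AtomicInteger runs = new AtomicInteger();
        WatchedPeriodicTask watched = new WatchedPeriodicTask(() -> {
            if (runs.incrementAndGet() == 2) {
                throw new IllegalStateException("simulated failure");
            }
        }, 20);
        watched.start();
        try {
            Thread.sleep(500);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        watched.stop();
        return new int[] {runs.get(), watched.restartCount()};
    }
}
```

    This narrows the window rather than closing it: the watchdog runs in the same JVM and can be suppressed by the same kind of error, so alerting on missing heartbeats and fixing the underlying memory problem remain necessary.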


    Reference:

    ScheduledExecutorService is broken https://community.oracle.com/thread/1144316

  • Original article: https://www.cnblogs.com/cdfive2018/p/10283315.html