  • HBase fails when writing a large volume of data to the datanodes

    1. HBase regionserver error log
    2020-04-07 15:53:36,604 WARN  [hadoop01:16020-0.append-pool2-t1] wal.FSHLog: Append sequenceId=3897, requesting roll of WAL
    java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[110.221.140.165:50010, 110.221.140.163:50010], original=[110.221.140.165:50010, 10.221.140.163:50010]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
            at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:969)
            at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:1035)
            at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1184)
            at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:933)
            at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:487)
    2020-04-07 15:53:36,620 ERROR [MemStoreFlusher.3] regionserver.MemStoreFlusher: Cache flush failed for region hbase:meta,,1
    org.apache.hadoop.hbase.regionserver.wal.DamagedWALException: Append sequenceId=3897, requesting roll of WAL
            at org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.append(FSHLog.java:1971)
            at org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:1815)
            at org.apache.hadoop.hbase.regionserver.wal.FSHLog$RingBufferEventHandler.onEvent(FSHLog.java:1725)
            at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:128)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
            at java.lang.Thread.run(Thread.java:748)
    2. Analysis

    The datanode cluster has only four nodes, a typical small cluster. When HBase writes data to the datanodes and datanodes in the write pipeline fail, the client kicks the bad datanodes out of the pipeline and, under the DEFAULT policy, tries to replace them with healthy ones. On a cluster this small there is soon no healthy datanode left to add, so the replacement fails, the replication requirement can no longer be satisfied, and the regionserver crashes.

    3. Solution

    Two client-side parameters are related to this error:

    dfs.client.block.write.replace-datanode-on-failure.enable=true

    dfs.client.block.write.replace-datanode-on-failure.policy=DEFAULT

    The policy property takes effect only when dfs.client.block.write.replace-datanode-on-failure.enable is set to true. Its possible values are:

    ALWAYS: whenever an existing datanode is removed from the pipeline, always add a new datanode.

    NEVER: never add a new datanode.

    DEFAULT: let r be the replication factor and n the number of datanodes remaining in the pipeline. A new datanode is added only if r >= 3 and either (1) floor(r/2) >= n, or (2) r > n and the block is hflushed/appended.
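
    To make the DEFAULT rule concrete, here is a minimal Java sketch of the decision as paraphrased from the property description above; it is not the actual Hadoop ReplaceDatanodeOnFailure implementation:

    public class DefaultPolicySketch {
        // r: replication factor of the file being written
        // n: number of datanodes still alive in the write pipeline
        static boolean shouldAddReplacement(int r, int n,
                                            boolean isAppend, boolean isHflushed) {
            if (n == 0 || n >= r) {
                return false;  // empty pipeline, or pipeline already has enough nodes
            }
            // Replace only when r >= 3 and either more than half of the pipeline
            // is gone, or the block has been appended to / hflushed.
            return r >= 3 && (n <= r / 2 || isAppend || isHflushed);
        }
    }

    The WAL is hflushed on every append, so with r = 3 a single pipeline failure (n = 2) is already enough to trigger a replacement attempt; on a four-node cluster there may be no healthy spare datanode left, which yields the exception shown in the log above.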

    Reference:

    https://blog.csdn.net/wangweislk/article/details/78890163
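
    In practice, a common mitigation on a cluster this small is to keep the failure check enabled but relax the policy so the client keeps writing on the surviving pipeline nodes instead of aborting. Below is a hedged example of setting this from the Java client side; the same keys can equally be placed as <property> entries in the client's hbase-site.xml or hdfs-site.xml. Note that NEVER trades some in-flight redundancy for write availability:

    import org.apache.hadoop.conf.Configuration;

    public class SmallClusterClientConf {
        // Hedged example: relax the client-side replacement policy for a
        // small cluster.
        public static Configuration create() {
            Configuration conf = new Configuration();
            conf.setBoolean(
                "dfs.client.block.write.replace-datanode-on-failure.enable", true);
            // NEVER: keep writing on the remaining pipeline nodes rather than
            // failing when no replacement datanode is available.
            conf.set(
                "dfs.client.block.write.replace-datanode-on-failure.policy", "NEVER");
            return conf;
        }
    }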

  • Original post: https://www.cnblogs.com/yjt1993/p/12654137.html