  • Fixing "NameNode is not formatted" on Hadoop in a CDH environment

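    Background

    To upgrade the JournalNode (JN) hosts, the JN roles had to be moved to other machines. There are three JNs, and I migrated one of them.
    I ran the role migration from the HDFS page, selecting the current host and the target host. It warned that the whole cluster had to be restarted (first confirm that nobody is using it). After the restart, an error prevented the master NameNode of the HA pair from starting.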

    Error message

    Bootstrap Standby NameNode
    Failed to bootstrap Standby NameNode NameNode (cluster-master): STARTUP_MSG:   build = http://github.com/cloudera/hadoop -r 91e45acfc3e208d656c3ec1c1a0abe4a8de6ad4c; compiled by 'jenkins' on 2016-01-26T00:19Z
    STARTUP_MSG:   java = 1.7.0_67
    ************************************************************/
    19/01/15 11:11:47 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
    19/01/15 11:11:47 INFO namenode.NameNode: createNameNode [-bootstrapStandby, -nonInteractive]
    Running in non-interactive mode, and data appears to exist in Storage Directory /data1/dfs/nn. Not formatting.
    19/01/15 11:11:49 INFO util.ExitUtil: Exiting with status 5
    19/01/15 11:11:49 INFO namenode.NameNode: SHUTDOWN_MSG: 
    /************************************************************
    SHUTDOWN_MSG: Shutting down NameNode at cluster-master.gyyx.cn/10.12.50.49
    ************************************************************/

    Checking the log

    2019-01-15 13:24:55,058 INFO org.apache.hadoop.util.GSet: Computing capacity for map NameNodeRetryCache
    2019-01-15 13:24:55,058 INFO org.apache.hadoop.util.GSet: VM type       = 64-bit
    2019-01-15 13:24:55,058 INFO org.apache.hadoop.util.GSet: 0.029999999329447746% max memory 4.9 GB = 1.5 MB
    2019-01-15 13:24:55,058 INFO org.apache.hadoop.util.GSet: capacity      = 2^18 = 262144 entries
    2019-01-15 13:24:55,063 INFO org.apache.hadoop.hdfs.server.namenode.NNConf: ACLs enabled? false
    2019-01-15 13:24:55,063 INFO org.apache.hadoop.hdfs.server.namenode.NNConf: XAttrs enabled? true
    2019-01-15 13:24:55,063 INFO org.apache.hadoop.hdfs.server.namenode.NNConf: Maximum size of an xattr: 16384
    2019-01-15 13:24:55,080 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /data1/dfs/nn/in_use.lock acquired by nodename 15050@cluster-master.gyyx.cn
    2019-01-15 13:24:55,083 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Encountered exception loading fsimage
    java.io.IOException: NameNode is not formatted.
    	at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:212)
    	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1061)
    	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:765)
    	at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:609)
    	at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:666)
    	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:838)
    	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:817)
    	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1538)
    	at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1606)
    2019-01-15 13:24:55,096 INFO org.mortbay.log: Stopped HttpServer2$SelectChannelConnectorWithSafeStartup@cluster-master.gyyx.cn:50070
    2019-01-15 13:24:55,196 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NameNode metrics system...
    2019-01-15 13:24:55,197 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system stopped.
    2019-01-15 13:24:55,198 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system shutdown complete.
    2019-01-15 13:24:55,198 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode.
    java.io.IOException: NameNode is not formatted.
    	at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:212)
    	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1061)
    	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:765)
    	at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:609)
    	at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:666)
    	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:838)
    	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:817)
    	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1538)
    	at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1606)
    2019-01-15 13:24:55,202 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
    2019-01-15 13:24:55,205 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG: 

    The key lines are:

    2019-01-15 13:24:55,083 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Encountered exception loading fsimage
    java.io.IOException: NameNode is not formatted.
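
When the NameNode log is long, the two key lines can be pulled out with grep instead of being read by eye. The helper below is a minimal sketch; the function name `show_fsimage_errors` and the log path passed to it are hypothetical and not from the original post.

```shell
#!/usr/bin/env bash
# Print every "Encountered exception loading fsimage" warning together with
# the line that follows it (the exception message itself).
# The function name is made up for this example.
show_fsimage_errors() {
    local log_file="$1"   # path to a NameNode log file; depends on your installation
    grep -A 1 "Encountered exception loading fsimage" "$log_file"
}
```

It would be called as `show_fsimage_errors /path/to/namenode.log`, with the real log location substituted.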
    

    Every result on Baidu and Google said to format the NameNode:

    hadoop namenode -format

    But this is a production environment. Can I really format it that casually?

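    Approach

    According to the message, the fsimage could not be loaded,
    so I looked for the location of the fsimage, which is also where the edits files are stored.
    Under /data1/dfs/nn there was only a root-owned current.bak, which means the system had renamed the current directory.
    Because my NameNodes run in HA, the current directory can be copied over from the other NameNode. (Renaming current.bak back is not an option, because the data has changed since then.)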
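
The directory check described above can be written as a small helper. This is only a sketch and rests on one assumption: a formatted name directory contains a `current/VERSION` file, so a directory holding only `current.bak` is reported as not formatted. The function name `check_nn_dir` is made up for this example.

```shell
#!/usr/bin/env bash
# Report the state of a NameNode storage directory.
# Assumption: a formatted name directory contains current/VERSION.
check_nn_dir() {
    local nn_dir="$1"   # e.g. /data1/dfs/nn in the original post
    if [ ! -d "$nn_dir" ]; then
        echo "missing"
    elif [ -f "$nn_dir/current/VERSION" ]; then
        echo "formatted"
    else
        echo "not-formatted"
    fi
}
```

Running `check_nn_dir /data1/dfs/nn` on both NameNodes, together with `ls -l /data1/dfs/nn`, shows which host still has a usable current directory and who owns it.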

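    Procedure

    1. Notify the owner of each team that the Hadoop cluster needs repair, and have them pause queries and other operations.
    2. Shut down the whole cluster and confirm that all services have stopped.
    3. Copy the current data from the healthy NN to the failed NN:
    scp -r  -P63008 root@YOUR_NAMENODE:/data1/dfs/nn/current/* .
    4. Fix the ownership:
    chown -R hdfs.hdfs current
    5. Delete the temporary files under /tmp.
    6. Restart the cluster.
    7. Check that the Hadoop logs and the Cloudera Manager status are normal.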
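
Steps 3 and 4 above could be wrapped in a script along the following lines. This is a sketch under stated assumptions, not the exact commands from the original post: it copies the `current` directory as a whole rather than `current/*`, it refuses to run if a `current` directory already exists, and the names `run`, `restore_nn_current` and `DRY_RUN` are made up for this example. It is meant to be run as root on the failed NameNode while the cluster is stopped.

```shell
#!/usr/bin/env bash
# Sketch of steps 3-4: copy current/ from the healthy NameNode, then fix ownership.
# Set DRY_RUN=1 to print the commands instead of executing them.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

restore_nn_current() {
    local healthy_nn="$1"   # hostname of the healthy NameNode
    local ssh_port="$2"     # SSH port (63008 in the original post)
    local nn_dir="$3"       # name directory (/data1/dfs/nn in the original post)

    # Do not overwrite an existing current/ directory.
    if [ -e "$nn_dir/current" ]; then
        echo "refusing: $nn_dir/current already exists" >&2
        return 1
    fi

    # Copy the whole current/ directory into the local name directory.
    run scp -r -P "$ssh_port" "root@${healthy_nn}:${nn_dir}/current" "$nn_dir/" || return 1

    # The NameNode runs as the hdfs user, so the copied files must belong to it.
    run chown -R hdfs:hdfs "$nn_dir/current"
}
```

A dry run such as `DRY_RUN=1 restore_nn_current YOUR_NAMENODE 63008 /data1/dfs/nn` prints the two commands so they can be reviewed before anything is changed on a production host.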

    Problem solved.

  • Original article: https://www.cnblogs.com/zhangrui153169/p/13714070.html