zoukankan      html  css  js  c++  java
  • namenode 和datanode无法启动,错误:FSNamesystem initialization failed. datanode.DataNode: Incompatible namespaceIDs

    问题一:

    namenode无法启动,查看日志,错误信息如下:

    org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.

    java.io.IOException: NameNode is not formatted.
        at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:317)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:292)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:201)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:279)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:956)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:965)
    2013-01-19 00:34:55,813 INFO org.apache.hadoop.ipc.Server: Stopping server on 9000
    2013-01-19 00:34:55,814 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.io.IOException: NameNode is not formatted.
        at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:317)
        at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:87)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:311)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:292)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:201)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:279)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:956)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:965)

    问题的原因:原来是修改了配置文件中的tmp目录后没有对hdfs做初始化,导致启动hadoop时报namenode没有初始化的错误。

    解决办法:删除hadoop/tmp目录下的dfs和mapred文件夹,然后格式化hadoop

    rm -rf dfs/

    rm -rf mapred/

    bin/./hadoop namenode -format 

    ok,但是有时候会引起问题二。

    问题二:

    datanode无法启动

    查看日志如下错误

    ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Incompatible namespaceIDs in /home/admin/joe.wangh/hadoop/data/dfs.data.dir: namenode namespaceID = 898136669; datanode namespaceID = 2127444065 
            at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:233)
            at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:148)
            at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:288)
            at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:206)
            at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1239)
            at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1194)
            at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1202)
            at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1324)

    错误提示namespaceIDs不一致。

    问题产生原因:每次namenode format会重新创建一个namenodeId,而tmp/dfs/data下包含了上次format下的id,namenode format清空了namenode下的数据,但是没有清空datanode下的数据,所以造成namenode节点上的namespaceID与datanode节点上的namespaceID不一致。启动失败。

    Workaround 1: Start from scratch

    I can testify that the following steps solve this error, but the side effects won't make you happy (me neither). The crude workaround I have found is to:

    1.     stop the cluster

    2.     delete the data directory on the problematic datanode: the directory is specified by dfs.data.dir in conf/hdfs-site.xml; if you followed this tutorial, the relevant directory is /usr/local/hadoop-datastore/hadoop-hadoop/dfs/data

    3.     reformat the namenode (NOTE: all HDFS data is lost during this process!)

    4.     restart the cluster

    When deleting all the HDFS data and starting from scratch does not sound like a good idea (it might be ok during the initial setup/testing), you might give the second approach a try.

    Workaround 2: Updating namespaceID of problematic datanodes

    Big thanks to Jared Stehler for the following suggestion. I have not tested it myself yet, but feel free to try it out and send me your feedback. This workaround is "minimally invasive" as you only have to edit one file on the problematic datanodes:

    1.     stop the datanode

    2.     edit the value of namespaceID in <dfs.data.dir>/current/VERSION to match the value of the current namenode

    3.     restart the datanode

    If you followed the instructions in my tutorials, the full path of the relevant file is /usr/local/hadoop-datastore/hadoop-hadoop/dfs/data/current/VERSION (background: dfs.data.dir is by default set to ${hadoop.tmp.dir}/dfs/data, and we set hadoop.tmp.dir to /usr/local/hadoop-datastore/hadoop-hadoop).

    If you wonder how the contents of VERSION look like, here's one of mine:

    #contents of <dfs.data.dir>/current/VERSION

    namespaceID=393514426

    storageID=DS-1706792599-10.10.10.1-50010-1204306713481

    cTime=1215607609074

    storageType=DATA_NODE

    layoutVersion=-13

    我们采用方法一:

    (1)停掉集群服务

      (2)在出问题的datanode节点上删除data目录,data目录即是在hdfs-site.xml文件中配置的dfs.data.dir目录,本机器上那个是/var/lib/hadoop-0.20/cache/hdfs/dfs/data/ (注:我们当时在所有的datanode和namenode节点上均执行了该步骤。以防删掉后不成功,可以先把data目录保存一个副本).

      (3)格式化namenode.

      (4)重新启动集群。

      问题解决。

    这种方法带来的一个副作用即是,hdfs上的所有数据丢失。如果hdfs上存放有重要数据的时候,不建议采用该方法,可以尝试提供的网址中的第二种方法。

  • 相关阅读:
    转ihone程序内发邮件,发短信,打开链接等
    plist 文件的读写
    转object c语法速成
    转iphone项目之间的引用。
    object c求nsstring 长度和去掉前后空格的方法
    object c runtime中类类型和消息支持检查
    转NSDictionary类使用
    设置UITableview 浮动的 header
    NSString 类型plist转为NSDictionary
    ObjectiveC Unicode 转换成中文
  • 原文地址:https://www.cnblogs.com/catWang/p/4056826.html
Copyright © 2011-2022 走看看