原因是因为你的hadoop.tmp.dir在/tmp目录下,而linux系统的/tmp文件夹内容能够是定时清理的,所以会导致你看hadoop使用不了了,就反复的格式化namenode会导致上述问题,也有可能是datanode长期没正常启动导致;
找了一下资料,有三个解决方案:
解決方法一:删除 datanode 的所有资料,主要指的是tmp目录和data目录,适用没存放过任何资料的HDFS;
解決方法二:修改 datanode 的 namespaceID
编辑每台 datanode 的 hadoop.tmp.dir/hadoop/hadoop-root/dfs/data/current/VERSION 把ID改为和namenode一致,重启datanode,数据会丢失;
解決方法三:修改 namenode 的 namespaceID(网上找到的)
编辑 namenode 的 hadoop.tmp.dir/hadoop/hadoop-root/dfs/name/current/VERSION 把ID改为和datanode一直,重启namenode,我测试了一下,第三种方法不行,我初步断定namespaceID生成的时候,里面可能有时间的随机数,我在测试中改了namenode的namespaceID,让namende和datanode一直,但是重启后他会自动的核对,他重新的修改回来,没办法,我只好采用了第二种方案,然后我仔细看了namenoe启动的日志,发现 有日志块注册的信息,注册完后,namenode发现datanode上有不属于自己的data,就发送了delete的命令
2012-10-19 16:57:20,980 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.registerDatanode: node registration from 192.168.80.84:50010 storage DS-584796903-192.168.80.84-50010-1350015221338
2012-10-19 16:57:21,142 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/192.168.80.84:50010
2012-10-19 16:57:21,618 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.registerDatanode: node registration from 192.168.80.83:50010 storage DS-942449248-192.168.80.83-50010-1350015230758
2012-10-19 16:57:21,618 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/192.168.80.83:50010
2012-10-19 16:57:21,866 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.processReport: block blk_-8214438839875239556_1105 on 192.168.80.84:50010 size 67108864 does not belong to any file.
2012-10-19 16:57:21,882 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addToInvalidates: blk_-8214438839875239556 is added to invalidSet of 192.168.80.84:50010
2012-10-19 16:57:21,882 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.processReport: block blk_-4821437377619945111_1112 on 192.168.80.84:50010 size 4 does not belong to any file.
2012-10-19 16:57:21,882 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.addToInvalidates: blk_-4821437377619945111 is added to invalidSet of 192.168.80.84:500102012-10-19 16:57:21,618 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /default-rack/192.168.80.83:50010