1. Download hadoop-lzo
wget 'https://github.com/cloudera/hadoop-lzo/tarball/0.4.14' -O hadoop-lzo-0.4.14.tar.gz --no-check-certificate
2. Build hadoop-lzo
tar zxf hadoop-lzo-0.4.14.tar.gz
cd ./cloudera-hadoop-lzo-8aa0605/
export JAVA_HOME=${JAVA_HOME}
export C_INCLUDE_PATH=${HADOOP_HOME}/lzo-2.06/include
export LIBRARY_PATH=${HADOOP_HOME}/lzo-2.06/lib
ant compile-native
ant jar
cp -af /usr/local/hadoop-0.20.2-cdh3u3/cloudera-hadoop-lzo-8aa0605/build/hadoop-lzo-0.4.14.jar /usr/local/hadoop/lib
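Besides the jar, `ant compile-native` also produces native LZO libraries under build/native that each node needs next to the jar. A dry-run sketch of copying them; the Linux-amd64-64 platform directory name is an assumption for 64-bit Linux, so check what the build actually produced:

```shell
# Dry run (echo): copy the native hadoop-lzo libraries alongside the jar.
# The platform subdirectory name is an assumption; verify it under build/native.
NATIVE=build/native/Linux-amd64-64/lib
DEST=/usr/local/hadoop/lib/native/Linux-amd64-64
echo cp -af "$NATIVE"/libgplcompression* "$DEST"/
```

Remove the leading `echo` to execute the copy for real.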
3. Configure LZO
core-site.xml
<!-- compression config -->
<property>
  <name>io.compression.codecs</name>
  <value>com.hadoop.compression.lzo.LzopCodec,org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec</value>
  <description>A list of the compression codec classes that can be used for compression/decompression.</description>
</property>
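hadoop-lzo setups usually also register the codec implementation class in core-site.xml, which jobs need when they reference com.hadoop.compression.lzo.LzoCodec directly. A sketch of the standard hadoop-lzo property; verify it against the version you built:

```xml
<property>
  <name>io.compression.codec.lzo.class</name>
  <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>
```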
4. Configure Hadoop Snappy
mapred-site.xml
<!-- compression -->
<property>
  <name>mapred.compress.map.output</name>
  <value>true</value>
</property>

<property>
  <name>mapred.map.output.compression.codec</name>
  <value>org.apache.hadoop.io.compress.SnappyCodec</value>
</property>

<property>
  <name>mapred.output.compression.type</name>
  <value>BLOCK</value>
</property>
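Before restarting, it can help to confirm the properties actually landed in the config file. A minimal sketch, run here against a sample file so it is self-contained; on a real node, point CONF at ${HADOOP_HOME}/conf/mapred-site.xml instead:

```shell
# Sanity check: grep the config for the map-output compression switch.
# A temp sample file stands in for ${HADOOP_HOME}/conf/mapred-site.xml here.
CONF=$(mktemp)
cat > "$CONF" <<'EOF'
<property>
  <name>mapred.compress.map.output</name>
  <value>true</value>
</property>
EOF
if grep -q '<name>mapred.compress.map.output</name>' "$CONF"; then
  STATUS=configured
else
  STATUS=missing
fi
echo "map output compression: $STATUS"
rm -f "$CONF"
```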
5. Restart Hadoop
stop-all.sh
start-all.sh
The datanodes fail to start with the error below.
Cause: Hadoop was rebuilt on the master namenode, so the master and slave nodes now carry different build versions.
(In the future, avoid rebuilding on the master node; syncing the Hadoop build version to every node afterwards is a lot of work.)
2012-06-05 13:47:23,006 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = datanode113.hadoop/172.16.51.113
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.20.2-cdh3u3
STARTUP_MSG:   build = file:///usr/local/hadoop -r Unknown; compiled by 'root' on Thu Apr 19 12:48:49 CST 2012
************************************************************/
2012-06-05 13:47:23,260 INFO org.apache.hadoop.security.UserGroupInformation: JAAS Configuration already set up for Hadoop, not re-installing.
2012-06-05 13:47:23,315 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Incompatible build versions: namenode BV = 318bc781117fa276ae81a3d111f5eeba0020634f; datanode BV = Unknown
2012-06-05 13:47:23,421 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Incompatible build versions: namenode BV = 318bc781117fa276ae81a3d111f5eeba0020634f; datanode BV = Unknown
        at org.apache.hadoop.hdfs.server.datanode.DataNode.handshake(DataNode.java:608)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:387)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:305)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1627)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1567)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1585)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1711)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1728)

2012-06-05 13:47:23,422 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at datanode113.hadoop/172.16.51.113
************************************************************/
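The mismatch above can be confirmed by comparing `hadoop version` output (which includes the build revision) on each node. A dry-run sketch; the host names are assumptions:

```shell
# Dry run (echo): print the per-node command; remove echo to execute over ssh.
for host in namenode.hadoop datanode113.hadoop; do
  echo ssh "$host" hadoop version
done
```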
Fix:
Back up the current Hadoop install on each slave node, then scp the master node's new build to every slave.
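The fix above can be sketched as follows (dry run; the host list and paths are assumptions, adjust to the real cluster):

```shell
# Dry run (echo): back up each slave's install, then push the master's build.
HADOOP_DIR=/usr/local/hadoop
BACKUP="$HADOOP_DIR.bak.$(date +%Y%m%d)"
for host in datanode113.hadoop; do
  echo ssh "$host" mv "$HADOOP_DIR" "$BACKUP"
  echo scp -r "$HADOOP_DIR" "$host:/usr/local/"
done
```

Remove the leading `echo`s to execute; stop the datanode daemons first so no process holds the old install.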
Reference:
https://www.opensciencegrid.org/bin/view/Documentation/Release3/HadoopDebug