The first time you use it, you need to run hdfs namenode -format.
One-click start and stop for Hadoop
Create a new text file and rename it to
start-hadoop.cmd
with the following content:
@echo off
cd /d %HADOOP_HOME%
cd sbin
start start-dfs.cmd
start start-yarn.cmd
Double-clicking it starts Hadoop's DFS and YARN directly.
Here is the second script,
stop-hadoop.cmd
@echo off
cd /d %HADOOP_HOME%\sbin
start stop-dfs.cmd
start stop-yarn.cmd
Double-clicking this stops Hadoop.
Discovered today that in Hadoop 2.4.1, a reducer's keyout/valueout do not support NullWritable.
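One workaround, if you hit this, is to declare the value type as Text and write an empty value instead of NullWritable. The sketch below is mine, not from the original post; the class and field names are illustrative, assuming a word-count-style reduce:

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Illustrative reducer that avoids NullWritable as valueout by emitting an empty Text.
public class SumReducer extends Reducer<Text, IntWritable, Text, Text> {
    private static final Text EMPTY = new Text("");

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable v : values) {
            sum += v.get(); // accumulate the counts for this key
        }
        // Fold the result into the key and emit an empty value.
        context.write(new Text(key.toString() + "\t" + sum), EMPTY);
    }
}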
Problem:
log4j:WARN No appenders could be found for logger (org.apache.hadoop.util.Shell).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Solution: add a log4j.properties file on the classpath with the following content:
log4j.rootLogger=INFO, stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d %p [%c] - %m%n
log4j.appender.logfile=org.apache.log4j.FileAppender
log4j.appender.logfile.File=target/spring.log
log4j.appender.logfile.layout=org.apache.log4j.PatternLayout
log4j.appender.logfile.layout.ConversionPattern=%d %p [%c] - %m%n
With this in place, you get log output under Eclipse.
Reference: http://blog.csdn.net/hipercomer/article/details/27063577
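A quick way to confirm the configuration took effect (an illustrative snippet of mine, not from the post): any log4j 1.2 Logger should now write to both the console and target/spring.log.

import org.apache.log4j.Logger;

// Illustrative check: with the properties file above on the classpath,
// this message shows up on stdout and in target/spring.log.
public class LogCheck {
    private static final Logger LOG = Logger.getLogger(LogCheck.class);

    public static void main(String[] args) {
        LOG.info("log4j is configured");
    }
}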
Problem: java.io.IOException: Mkdirs failed to create D:/hadoop-2.4.1/hadooptmp
Solution: set hadoop.tmp.dir (in core-site.xml) as follows:
<property>
  <name>hadoop.tmp.dir</name>
  <value>d:/hadoop-2.4.1/tmp/hadoop-${user.name}</value>
</property>
Changing it to this works. Apparently hadoop.tmp.dir cannot be written in the file:///d:/hadoop-2.4.1 style.
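To verify the override is actually being picked up, here is a small illustrative check (not from the original post) that prints the value the client resolves:

import org.apache.hadoop.conf.Configuration;

// Illustrative check: prints the hadoop.tmp.dir the client resolves,
// confirming the core-site.xml override is loaded from the classpath.
public class PrintTmpDir {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        System.out.println("hadoop.tmp.dir = " + conf.get("hadoop.tmp.dir"));
    }
}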
New problem: the program runs in Eclipse, but fails when run from cmd with these errors:
14/11/27 19:37:42 INFO ipc.Server: Socket Reader #1 for port 9000: readAndProcess from client 127.0.0.1 threw exception [java.io.IOException: An existing connection was forcibly closed by the remote host.]
java.io.IOException: An existing connection was forcibly closed by the remote host.

14/11/27 19:37:33 ERROR datanode.DataNode: xxxxx:50010:DataXceiver error processing READ_BLOCK operation src: /127.0.0.1:3349 dst: /127.0.0.1:50010
java.io.IOException: An existing connection was forcibly closed by the remote host.

14/11/27 19:20:19 INFO mapreduce.Job: Task Id : attempt_1417085699849_0001_r_000000_1, Status : FAILED
Error: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#1
Remains to be solved...
Update 2014/11/27 21:58: solved. The working configuration files follow.
hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///d:/hadoop-2.4.1/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///d:/hadoop-2.4.1/dfs/data</value>
  </property>
  <property>
    <name>dfs.datanode.socket.write.timeout</name>
    <value>6000000</value>
  </property>
  <property>
    <name>dfs.socket.timeout</name>
    <value>6000000</value>
  </property>
  <property>
    <name>dfs.datanode.max.transfer.threads</name>
    <value>8192</value>
  </property>
</configuration>
mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
  <property>
    <name>mapreduce.job.user.name</name>
    <value>%USERNAME%</value>
  </property>
  <property>
    <name>yarn.apps.stagingDir</name>
    <value>/user/%USERNAME%/staging</value>
  </property>
  <property>
    <name>mapreduce.jobtracker.address</name>
    <value>local</value>
  </property>
</configuration>
yarn-site.xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.server.resourcemanager.address</name>
    <value>0.0.0.0:8020</value>
  </property>
  <property>
    <name>yarn.server.resourcemanager.application.expiry.interval</name>
    <value>60000</value>
  </property>
  <property>
    <name>yarn.server.nodemanager.address</name>
    <value>0.0.0.0:45454</value>
  </property>
  <property>
    <name>yarn.server.nodemanager.remote-app-log-dir</name>
    <value>D:/hadoop-2.4.1/logs/userlogs/applogs</value>
  </property>
  <property>
    <name>yarn.nodemanager.log-dirs</name>
    <value>D:/hadoop-2.4.1/logs/userlogs/yarnlogs</value>
  </property>
  <property>
    <name>yarn.server.mapreduce-appmanager.attempt-listener.bindAddress</name>
    <value>0.0.0.0</value>
  </property>
  <property>
    <name>yarn.server.mapreduce-appmanager.client-service.bindAddress</name>
    <value>0.0.0.0</value>
  </property>
  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>-1</value>
  </property>
  <property>
    <name>yarn.application.classpath</name>
    <value>
      %HADOOP_CONF_DIR%,
      %HADOOP_HOME%\etc\hadoop,
      %HADOOP_HOME%\share\hadoop\common\*,
      %HADOOP_HOME%\share\hadoop\common\lib\*,
      %HADOOP_HOME%\share\hadoop\hdfs\*,
      %HADOOP_HOME%\share\hadoop\hdfs\lib\*,
      %HADOOP_HOME%\share\hadoop\mapreduce\*,
      %HADOOP_HOME%\share\hadoop\mapreduce\lib\*,
      %HADOOP_HOME%\share\hadoop\yarn\*,
      %HADOOP_HOME%\share\hadoop\yarn\lib\*
    </value>
  </property>
</configuration>
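To tie the configuration together, here is a minimal driver sketch of mine (not from the original post) that submits a job against this setup. It uses the built-in TokenCounterMapper and IntSumReducer that ship with Hadoop; the input and output paths are passed as arguments and are hypothetical.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.map.TokenCounterMapper;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.reduce.IntSumReducer;

// Illustrative word-count driver exercising the *-site.xml files above.
public class WordCountDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration(); // loads core/hdfs/mapred/yarn-site.xml from the classpath
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCountDriver.class);
        job.setMapperClass(TokenCounterMapper.class); // built-in: emits (word, 1)
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);     // built-in: sums the counts
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Run it with something like hadoop jar wordcount.jar WordCountDriver /input /output (the jar name and HDFS paths here are placeholders).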