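Environment: CentOS 6.5 + Eclipse 4.4.2 + Hadoop 2.7.0
1. Download the Eclipse Hadoop plugin hadoop-eclipse-plugin-2.7.0.jar, copy it into the plugins directory under the Eclipse installation directory, and restart Eclipse.
2. In Eclipse, choose Window --> Show View --> Other --> MapReduce Tools --> Map/Reduce Locations.
3. After the previous step, a Map/Reduce Locations tab appears in the lower-right area of Eclipse. Right-click inside it --> New Hadoop location, and fill in the settings for your cluster (the original post shows them in a screenshot).
4. To debug Map tasks, edit hadoop/etc/hadoop/mapred-site.xml and append the following inside <configuration></configuration>: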
<property>
  <name>mapred.map.child.java.opts</name>
  <value>-Xmx1024m -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=8883</value>
</property>
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>1</value>
</property>
The complete mapred-site.xml after appending the Map settings:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapred.map.child.java.opts</name>
    <value>-Xmx1024m -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=8883</value>
  </property>
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>1</value>
  </property>
</configuration>
The complete mapred-site.xml once the Reduce debug settings from step 5 are appended as well:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapred.map.child.java.opts</name>
    <value>-Xmx1024m -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=8883</value>
  </property>
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>1</value>
  </property>
  <property>
    <name>mapred.reduce.child.java.opts</name>
    <value>-Xmx1024m -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=8884</value>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>1</value>
  </property>
</configuration>
Run: hadoop jar xxx.jar words.txt /wordsout
Once the job is submitted, it stalls at map 0% reduce 0% because the map task JVM is suspended, waiting for a remote debugger to attach.
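Before attaching, it can help to confirm that the suspended map task JVM is actually listening on the debug port. The following is a minimal sketch of such a check, run on the node where the task is running. `port_is_listening` is a hypothetical helper name, and the sketch assumes `netstat -tln` prints the usual Linux layout, with the local address in the fourth column and the state in the last.

```shell
# Hypothetical helper: succeeds if TCP port $1 is in LISTEN state,
# judging from `netstat -tln` output supplied on stdin.
port_is_listening() {
  awk -v suffix=":$1" '
    # Column 4 is the local address (e.g. 0.0.0.0:8883); the last column is the state.
    $NF == "LISTEN" && $4 ~ (suffix "$") { found = 1 }
    END { exit !found }
  '
}

# Usage on the node running the map task (8883 is the port configured above):
#   netstat -tln | port_is_listening 8883 && echo "map task is waiting for a debugger"
```

If the check fails, the task container has most likely not started yet; waiting a few seconds and retrying is usually enough.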
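Then in Eclipse, open Debug As --> Remote Java Application. Under Connection Properties, set Host to 127.0.0.1 (replace it with the actual IP of the node running the task) and Port to 8883 (the port configured above), then click Apply and Debug.
You can now step into the map function.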
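Editing mapred-site.xml affects every job submitted on the machine. If the job driver parses Hadoop's generic options (for example by running through ToolRunner), the same JVM options can instead be passed for a single run with -D. The sketch below rests on a few assumptions: `jdwp_opts` is a hypothetical helper, `mapreduce.map.java.opts` is the Hadoop 2.x name for the map task JVM options, and `-agentlib:jdwp` is the newer spelling of `-Xdebug -Xrunjdwp`. Whether xxx.jar accepts -D this way depends on how its driver is written. The submission command is left commented out, so the snippet runs without a Hadoop installation.

```shell
# Hypothetical helper: build the map-task JVM options for debug port $1.
jdwp_opts() {
  printf '%s' "-Xmx1024m -agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=$1"
}

MAP_OPTS="$(jdwp_opts 8883)"
printf '%s\n' "$MAP_OPTS"

# Per-job submission; assumes the driver in xxx.jar handles generic options such as -D:
#   hadoop jar xxx.jar -D "mapreduce.map.java.opts=$MAP_OPTS" words.txt /wordsout
```

The quotes around the -D argument matter because the option string contains a space.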
5. Debug Reduce tasks
Edit hadoop/etc/hadoop/mapred-site.xml and append the following inside <configuration></configuration>:
<property>
  <name>mapred.reduce.child.java.opts</name>
  <value>-Xmx1024m -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=8884</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>1</value>
</property>
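Run: hadoop jar xxx.jar words.txt /wordsout
Once the job is submitted, it stalls at map 100% reduce 0% because the reduce task JVM is suspended, waiting for a remote debugger to attach.
Then in Eclipse, open Debug As --> Remote Java Application. Under Connection Properties, set Host to 127.0.0.1 (replace it with the actual IP of the node running the task) and Port to 8884 (the port configured above), then click Apply and Debug.
You can now step into the reduce function.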
This article assumes a basic pseudo-distributed Hadoop deployment.
Reference: http://blog.csdn.net/gjt19910817/article/details/30384685