192.168.81.132 -> hadoop1 (NameNode)
192.168.81.130 -> hadoop2 (DataNode 1)
192.168.81.129 -> hadoop3 (DataNode 2)
192.168.81.131 -> hadoop4 (DataNode 3)
I. Create the account
1. Create the user on all nodes
useradd hadoop
passwd hadoop
2. Create directories on all nodes
mkdir -p /home/hadoop/source
mkdir -p /home/hadoop/tools
3. Create directories on the slave nodes
mkdir -p /hadoop/hdfs
mkdir -p /hadoop/tmp
mkdir -p /hadoop/log
chmod -R 777 /hadoop
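II. Change the hostname
Do this on all nodes.
1. Edit /etc/sysconfig/network (vim /etc/sysconfig/network)
and set HOSTNAME=hadoopx, where x is the node number (hadoop1 to hadoop4)
2. Edit /etc/hosts (vim /etc/hosts) and add: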
192.168.81.132 hadoop1
192.168.81.130 hadoop2
192.168.81.129 hadoop3
192.168.81.131 hadoop4
3. Run hostname hadoopx
4. Log in again for the new hostname to take effect
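The /etc/hosts edit in step 2 has to be repeated on every node, so it can help to script it in a way that is safe to rerun. The following is only a sketch and not part of the original procedure: the function name add_host_entry is hypothetical, and the demonstration writes to a scratch file rather than to /etc/hosts.

```shell
#!/bin/sh
# add_host_entry FILE IP NAME
# Appends "IP NAME" to FILE unless that exact line is already present.
# (Hypothetical helper, not part of Hadoop.)
add_host_entry() {
    file=$1
    ip=$2
    name=$3
    grep -qxF "$ip $name" "$file" 2>/dev/null || printf '%s %s\n' "$ip" "$name" >> "$file"
}

# Demonstration against a scratch file.
# For real use, run as root and set HOSTS_FILE=/etc/hosts instead.
HOSTS_FILE=$(mktemp)
add_host_entry "$HOSTS_FILE" 192.168.81.132 hadoop1
add_host_entry "$HOSTS_FILE" 192.168.81.130 hadoop2
add_host_entry "$HOSTS_FILE" 192.168.81.129 hadoop3
add_host_entry "$HOSTS_FILE" 192.168.81.131 hadoop4
add_host_entry "$HOSTS_FILE" 192.168.81.131 hadoop4   # repeated call adds nothing
cat "$HOSTS_FILE"
```

Because the helper checks for the exact line before appending, running it a second time on the same node leaves the file unchanged.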
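III. Passwordless SSH login
Note: for a non-root user, passwordless login additionally requires chmod 700 ~/.ssh and chmod 600 ~/.ssh/authorized_keys. Without these permissions, passwordless login does not work for non-root users.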
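The permission fix from the note can be wrapped in a small function so it is applied the same way on every node. This is a minimal sketch: the name fix_ssh_perms is hypothetical, and the demonstration runs against a scratch directory instead of a real home directory.

```shell
#!/bin/sh
# fix_ssh_perms HOME_DIR
# Creates HOME_DIR/.ssh and authorized_keys if needed, then applies the
# permissions from the note above (700 on the directory, 600 on the file).
fix_ssh_perms() {
    ssh_dir=$1/.ssh
    mkdir -p "$ssh_dir"
    touch "$ssh_dir/authorized_keys"
    chmod 700 "$ssh_dir"
    chmod 600 "$ssh_dir/authorized_keys"
}

# Demonstration on a scratch directory.
# For real use, run as the hadoop user: fix_ssh_perms "$HOME"
demo_home=$(mktemp -d)
fix_ssh_perms "$demo_home"
ls -ld "$demo_home/.ssh" "$demo_home/.ssh/authorized_keys"
```

The original text does not list the key setup itself. With OpenSSH, the usual approach is to run ssh-keygen -t rsa as the hadoop user on hadoop1 and then ssh-copy-id hadoop@hadoop2 (and likewise for the other nodes); treat that as an assumption about the intended setup.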
IV. Install the JDK (omitted)
V. Configure environment variables
1. /etc/profile
export HADOOP_HOME=/home/hadoop/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_MAPRED_HOME=${HADOOP_HOME}
export HADOOP_COMMON_HOME=${HADOOP_HOME}
export HADOOP_HDFS_HOME=${HADOOP_HOME}
export YARN_HOME=${HADOOP_HOME}
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export HDFS_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export YARN_CONF_DIR=${HADOOP_HOME}/etc/hadoop
2. hadoop-env.sh
Append at the end of the file: export JAVA_HOME=/usr/java/jdk1.6.0_27
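After editing /etc/profile, a quick check that every variable is actually visible in the current shell can catch typos early. The sketch below is an illustration only: check_hadoop_env is a hypothetical helper, and it uses the spelling HADOOP_MAPRED_HOME, which is the name Hadoop's own scripts read.

```shell
#!/bin/sh
# check_hadoop_env
# Prints "not set: NAME" for each expected variable that is unset or empty.
# Returns 0 if all are set, 1 otherwise. (Hypothetical helper.)
check_hadoop_env() {
    missing=0
    for var in HADOOP_HOME HADOOP_MAPRED_HOME HADOOP_COMMON_HOME \
               HADOOP_HDFS_HOME YARN_HOME HADOOP_CONF_DIR \
               HDFS_CONF_DIR YARN_CONF_DIR; do
        eval "value=\${$var:-}"
        if [ -z "$value" ]; then
            echo "not set: $var"
            missing=1
        fi
    done
    return $missing
}

# Typical use after "source /etc/profile":
check_hadoop_env || echo "some variables are missing; re-check /etc/profile"
```

JAVA_HOME is left out of the list on purpose, because this guide sets it in hadoop-env.sh rather than in /etc/profile.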
VI. Install Hadoop 2.3
1. core-site.xml
Add the following properties inside the configuration element:
<property>
<name>hadoop.tmp.dir</name>
<value>/hadoop/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://192.168.81.132:9000</value>
</property>
Add the httpfs options:
<property>
<name>hadoop.proxyuser.root.hosts</name>
<value>192.168.1.201</value>
</property>
<property>
<name>hadoop.proxyuser.root.groups</name>
<value>*</value>
</property>
2. Add the slave nodes to the slaves file, one per line:
192.168.81.129
192.168.81.130
192.168.81.131
3. Configure hdfs-site.xml
Add the following properties to etc/hadoop/hdfs-site.xml:
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/hadoop/hdfs/name</value>
<final>true</final>
</property>
<property>
<name>dfs.federation.nameservice.id</name>
<value>ns1</value>
</property>
<property>
<name>dfs.namenode.backup.address.ns1</name>
<value>192.168.81.132:50100</value>
</property>
<property>
<name>dfs.namenode.backup.http-address.ns1</name>
<value>192.168.81.132:50105</value>
</property>
<property>
<name>dfs.federation.nameservices</name>
<value>ns1</value>
</property>
<property>
<name>dfs.namenode.rpc-address.ns1</name>
<value>192.168.81.132:9000</value>
</property>
<property>
<name>dfs.namenode.rpc-address.ns2</name>
<value>192.168.81.132:9000</value>
</property>
<property>
<name>dfs.namenode.http-address.ns1</name>
<value>192.168.81.132:23001</value>
</property>
<property>
<name>dfs.namenode.http-address.ns2</name>
<value>192.168.81.132:13001</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/hadoop/hdfs/data</value>
<final>true</final>
</property>
<property>
<name>dfs.namenode.secondary.http-address.ns1</name>
<value>192.168.81.132:23002</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address.ns2</name>
<value>192.168.81.132:23002</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address.ns1</name>
<value>192.168.81.132:23003</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address.ns2</name>
<value>192.168.81.132:23003</value>
</property>
4. Configure yarn-site.xml by adding the following properties:
<property> <name>yarn.resourcemanager.address</name> <value>192.168.81.132:18040</value> </property>
<property> <name>yarn.resourcemanager.scheduler.address</name> <value>192.168.81.132:18030</value> </property>
<property> <name>yarn.resourcemanager.webapp.address</name> <value>192.168.81.132:18088</value> </property>
<property> <name>yarn.resourcemanager.resource-tracker.address</name> <value>192.168.81.132:18025</value> </property>
<property> <name>yarn.resourcemanager.admin.address</name> <value>192.168.81.132:18141</value> </property>
<property> <name>yarn.nodemanager.aux-services</name> <value>mapreduce_shuffle</value> </property>
5. Configure httpfs-site.xml (omitted)
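With this many hand-edited properties, a mistyped or repeated name is easy to overlook. The two helpers below are a rough sketch for spot-checking a configuration file from the shell. Both names (get_prop, duplicate_props) are hypothetical, they rely on simple text matching rather than a real XML parser, and they assume each name and value sits on a single line as in the fragments above.

```shell
#!/bin/sh
# get_prop FILE NAME
# Prints the first <value> that follows <name>NAME</name> in FILE.
get_prop() {
    awk -v key="$2" '
        index($0, "<name>" key "</name>") { found = 1 }
        found && /<value>/ { gsub(/.*<value>|<\/value>.*/, ""); print; exit }
    ' "$1"
}

# duplicate_props FILE
# Prints every property name that is defined more than once in FILE.
duplicate_props() {
    grep -o '<name>[^<]*</name>' "$1" | sed 's/<[^>]*>//g' | sort | uniq -d
}

# Demonstration on a scratch file holding a small fragment.
conf=$(mktemp)
cat > "$conf" <<'EOF'
<configuration>
<property>
<name>dfs.namenode.secondary.http-address.ns1</name>
<value>192.168.81.132:23002</value>
</property>
<property> <name>yarn.resourcemanager.address</name> <value>192.168.81.132:18040</value> </property>
<property>
<name>dfs.namenode.secondary.http-address.ns1</name>
<value>192.168.81.132:23003</value>
</property>
</configuration>
EOF
get_prop "$conf" yarn.resourcemanager.address
duplicate_props "$conf"
```

On a real node the same calls can be pointed at $HADOOP_CONF_DIR/hdfs-site.xml or yarn-site.xml. Note that get_prop reports only the first definition it finds, so a name listed by duplicate_props deserves a manual look.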
VII. Configure the DataNodes
1. Copy hadoop-2.3 to each DataNode
2. Copy /etc/profile to each DataNode and apply it (source /etc/profile)
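The two copy steps can be driven from the slaves file. The sketch below is a dry run that only prints the commands, so they can be reviewed before anything is copied. It rests on several assumptions not stated in the original: the Hadoop tree lives at /home/hadoop/hadoop (the HADOOP_HOME used earlier), it is copied as the hadoop user, /etc/profile is copied as root, and print_copy_commands is a hypothetical name.

```shell
#!/bin/sh
# print_copy_commands SLAVES_FILE
# Dry run: for each non-empty line (one host per line) in SLAVES_FILE,
# prints the scp commands for the Hadoop tree and for /etc/profile.
print_copy_commands() {
    while read -r host; do
        [ -n "$host" ] || continue
        echo "scp -r /home/hadoop/hadoop hadoop@$host:/home/hadoop/"
        echo "scp /etc/profile root@$host:/etc/profile"
    done < "$1"
}

# Demonstration with the three slave addresses from this guide.
slaves=$(mktemp)
printf '%s\n' 192.168.81.129 192.168.81.130 192.168.81.131 > "$slaves"
print_copy_commands "$slaves"
```

Once the printed commands look right, piping the output to sh runs them; each node still needs source /etc/profile (or a fresh login) afterwards.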
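VIII. Start and verify
1. Format HDFS: hadoop namenode -format -clusterid clustername
2. Start HDFS: start-dfs.sh
3. Start the resource manager (YARN): start-yarn.sh
4. Start httpfs: httpfs.sh start
5. Verify the processes on the NameNode:
NameNode Bootstrap SecondaryNameNode ResourceManager
6. Verify the processes on each DataNode: DataNode NodeManager
7. Test HDFS reads and writes, and run a job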
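The process checks in steps 5 and 6 are usually done by reading the output of jps, the JDK tool that lists running Java processes. The sketch below automates the comparison. The function name missing_daemons is hypothetical, and the sample input is made up for illustration rather than captured from a real cluster.

```shell
#!/bin/sh
# missing_daemons "NAME1 NAME2 ..."
# Reads jps-style output ("PID NAME" per line) on stdin and prints
# "missing: NAME" for each expected name that does not appear.
missing_daemons() {
    running=$(awk '{ print $2 }')
    for name in $1; do
        printf '%s\n' "$running" | grep -qx "$name" || echo "missing: $name"
    done
}

# Demonstration with made-up input; the PIDs are placeholders.
sample='101 NameNode
102 SecondaryNameNode
103 Jps'
printf '%s\n' "$sample" | missing_daemons "NameNode Bootstrap SecondaryNameNode ResourceManager"
```

On a live node the call would be jps | missing_daemons "DataNode NodeManager" for a DataNode, or the four names from step 5 for the NameNode. No output means every expected process was found.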
I worked through and verified every step of this procedure myself. If you run into problems, email: bobsoft@foxmail.com