Installing Hadoop on Ubuntu (Cluster Mode)

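With a single-node Hadoop installation in place, we can move on to installing Hadoop in cluster mode. Converting a machine that already runs single-node Hadoop to cluster mode is straightforward. We start by setting up the network.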

Network

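First, we need several machines on the same network (this article uses the 192.168.0.0/24 subnet). For convenience, give the machines host names by adding the following entries to /etc/hosts on each of them.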

192.168.0.1    master
192.168.0.2    slave
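
One way to add these entries without creating duplicates on repeated runs is sketched below. The helper name add_host_entry and the HOSTS_FILE variable are illustrative assumptions, not part of Hadoop or Ubuntu. The sketch writes to a scratch file by default so it can be tried safely before pointing it at /etc/hosts, which requires root.

```shell
#!/bin/sh
# Hypothetical helper: append "IP<TAB>NAME" to a hosts file unless the name
# is already mapped there. HOSTS_FILE defaults to a scratch file; set it to
# /etc/hosts (and run as root) to apply the change for real.
HOSTS_FILE="${HOSTS_FILE:-$(mktemp)}"

add_host_entry() {
    ip="$1"
    name="$2"
    # Skip if any non-comment line already lists this host name as a field.
    if awk -v n="$name" '$1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 } END { exit !found }' "$HOSTS_FILE"; then
        return 0
    fi
    printf '%s\t%s\n' "$ip" "$name" >> "$HOSTS_FILE"
}

add_host_entry 192.168.0.1 master
add_host_entry 192.168.0.2 slave
add_host_entry 192.168.0.2 slave   # repeated call leaves the file unchanged

cat "$HOSTS_FILE"
```

Running the same script on both machines keeps their host tables identical, which matters because the master and the slave must each be able to resolve both names.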

Configuring SSH

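Hadoop needs the master machine to be able to log in to the slave machine, which requires setting up SSH keys. As in the single-node setup, copy the ~/.ssh/authorized_keys file from the master into the ~/.ssh/ directory on the slave. Once that is done, check that you can log in to the slave from the master as the account that runs Hadoop; if you can, this step is complete.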
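
The key exchange and the login check can be sketched as follows. The account name hduser and the key path ~/.ssh/id_rsa.pub are assumptions; substitute whatever your single-node setup used. The sketch uses ssh-copy-id, which appends the public key to ~/.ssh/authorized_keys on the remote machine instead of overwriting that file, so it is a variation on the copy step rather than the exact procedure described here. The functions are only defined, not called, because they need a reachable slave.

```shell
#!/bin/sh
# Assumed names: adjust to your environment.
HADOOP_USER="${HADOOP_USER:-hduser}"        # account that runs Hadoop (assumption)
PUBKEY="${PUBKEY:-$HOME/.ssh/id_rsa.pub}"   # master's public key (assumption)

# Copy the master's public key to one host. Run this on the master.
push_key() {
    ssh-copy-id -i "$PUBKEY" "$HADOOP_USER@$1"
}

# Succeed only if a password-less login works.
# BatchMode=yes makes ssh fail instead of prompting for a password.
can_login() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$HADOOP_USER@$1" true
}

# Usage on the master (not executed here):
#   push_key slave
#   can_login slave && echo "slave: ok"
```

Checking with BatchMode=yes is useful because Hadoop's start scripts run ssh non-interactively, so a login that still asks for a password would stall them.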

Installing Hadoop

Configuration on the master machine

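Despite its name, HADOOP_HOME/conf/masters lists the machines on which Hadoop starts the secondary namenode in a multi-node cluster. The namenode and jobtracker themselves run on the machine where HADOOP_HOME/bin/start-all.sh is invoked, which is the master in this setup. On the master machine, add the following line to HADOOP_HOME/conf/masters.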

master

HADOOP_HOME/conf/slaves lists the machines on which the datanode and tasktracker run in a multi-node cluster. Add the following entries to this file.

master
slave
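
Hadoop 1.x reads these two lists from conf/masters and conf/slaves, each a plain text file with one host name per line. A minimal sketch that writes them is shown below. CONF_DIR is an illustrative variable that defaults to a scratch directory so the snippet is safe to try; on a real master it would point at HADOOP_HOME/conf.

```shell
#!/bin/sh
# CONF_DIR is an assumption for illustration; use HADOOP_HOME/conf for real.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

# One host name per line.
printf '%s\n' master > "$CONF_DIR/masters"
printf '%s\n' master slave > "$CONF_DIR/slaves"

echo "masters:"; cat "$CONF_DIR/masters"
echo "slaves:";  cat "$CONF_DIR/slaves"
```

Listing master in the slaves file as well means the master machine also stores blocks and runs tasks, which is why a two-machine cluster ends up with two datanodes.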

Configuration required on all machines

Add the following property to HADOOP_HOME/conf/core-site.xml:

<property>
  <name>fs.default.name</name>
  <value>hdfs://master:54310</value>
  <description>The name of the default file system.  A URI whose
  scheme and authority determine the FileSystem implementation.  The
  uri's scheme determines the config property (fs.SCHEME.impl) naming
  the FileSystem implementation class.  The uri's authority is used to
  determine the host, port, etc. for a filesystem.</description>
</property>
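
In a stock Hadoop 1.x layout, each site file wraps its properties in a single <configuration> root element, so the property belongs inside that element rather than at the top level of the file. A sketch that writes a complete core-site.xml is shown below. CONF_DIR is an illustrative variable that defaults to a scratch directory, and the description text is shortened for brevity.

```shell
#!/bin/sh
# CONF_DIR is an assumption for illustration; use HADOOP_HOME/conf for real.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

# Quoted 'EOF' prevents the shell from expanding anything inside the XML.
cat > "$CONF_DIR/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:54310</value>
    <description>URI of the default file system (the namenode).</description>
  </property>
</configuration>
EOF

cat "$CONF_DIR/core-site.xml"
```

The same wrapper applies to mapred-site.xml and hdfs-site.xml; only the property inside changes.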

Add the following property to HADOOP_HOME/conf/mapred-site.xml:

<property>
  <name>mapred.job.tracker</name>
  <value>master:54311</value>
  <description>The host and port that the MapReduce job tracker runs
  at.  If "local", then jobs are run in-process as a single map
  and reduce task.
  </description>
</property>

Add the following property to HADOOP_HOME/conf/hdfs-site.xml:

<property>
  <name>dfs.replication</name>
  <value>2</value>
  <description>Default block replication.
  The actual number of replications can be specified when the file is created.
  The default is used if replication is not specified in create time.
  </description>
</property>
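
A replication factor of 2 fits this cluster because two machines run a datanode. HDFS cannot place more replicas of a block than there are datanodes, so a larger value would leave blocks under-replicated. A small sanity check along these lines is sketched below. The function name check_replication is hypothetical, and it simply counts the non-blank lines of a slaves-style file.

```shell
#!/bin/sh
# Hypothetical check: succeed if the replication factor does not exceed
# the number of hosts listed in a slaves-style file (one host per line).
check_replication() {
    slaves_file="$1"
    replication="$2"
    # Count lines containing at least one non-whitespace character.
    datanodes=$(grep -c '[^[:space:]]' "$slaves_file" || true)
    [ "$replication" -le "$datanodes" ]
}

# Demo with the two-node layout used in this article.
demo=$(mktemp)
printf '%s\n' master slave > "$demo"

if check_replication "$demo" 2; then echo "replication 2: ok"; fi
if ! check_replication "$demo" 3; then echo "replication 3: too high"; fi
```

On a real installation the first argument would be the slaves file under HADOOP_HOME/conf, and the second the value configured for dfs.replication.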

Next, format the HDFS file system by running the following on the master:

$ /usr/local/hadoop/bin/hadoop namenode -format

Then run /usr/local/hadoop/bin/start-all.sh on the master. If it completes successfully, the multi-node Hadoop installation is done.
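
A common way to confirm that the daemons came up is to run jps (shipped with the JDK) on each machine and look for the expected process names. The sketch below checks jps-style output for the daemons this layout should produce. The function name missing_daemons is hypothetical, and the sample input is hand-written for illustration rather than captured from a real cluster.

```shell
#!/bin/sh
# Hypothetical helper: read jps-style output ("PID Name" per line) on stdin
# and print every expected daemon name that does not appear in it.
missing_daemons() {
    output=$(cat)
    for daemon in "$@"; do
        # Exact match on the second field, so NameNode does not match
        # SecondaryNameNode.
        printf '%s\n' "$output" \
            | awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }' \
            || echo "$daemon"
    done
}

# Illustrative input only (made-up PIDs), with TaskTracker left out on purpose.
sample='101 NameNode
102 SecondaryNameNode
103 JobTracker
104 DataNode
105 Jps'

printf '%s\n' "$sample" \
    | missing_daemons NameNode SecondaryNameNode JobTracker DataNode TaskTracker

# On a real master: jps | missing_daemons NameNode SecondaryNameNode JobTracker DataNode TaskTracker
# On a real slave:  jps | missing_daemons DataNode TaskTracker
```

An empty result means every expected daemon is running. If a name is printed, the log files of that daemon are the natural next place to look.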