  • Linux ->> Installing Hadoop 1.2.1 on Ubuntu 14.04 LTS (pseudo-distributed mode)

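    Hadoop can run in three modes: standalone, pseudo-distributed, and fully distributed.

    Whichever mode you choose, a JDK must be installed first. I covered that in an earlier post, "Installing JDK 1.8 on Ubuntu 14.04 LTS", so I won't repeat it here.

    The next prerequisite is SSH. It is set up so that you can log in to the data node servers without a password; in a cluster, typing a password on every login to a data node is not practical. I covered this in an earlier post as well, "Configuring passwordless SSH login on Ubuntu 14.04 LTS", so I won't repeat it here either.

    Pseudo-distributed installation:

    First download Hadoop 1.2.1 to the local machine, then extract it into a directory under the user's home directory.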

    jerry@ubuntu:~/Downloads$ mkdir -p ~/hadoop_1.2.1
    jerry@ubuntu:~/Downloads$ tar zxf hadoop-1.2.1.tar.gz -C ~/hadoop_1.2.1
    jerry@ubuntu:~/Downloads$ cd ~/hadoop_1.2.1/
    jerry@ubuntu:~/hadoop_1.2.1$ ls
    hadoop-1.2.1
    jerry@ubuntu:~/hadoop_1.2.1$ cd hadoop-1.2.1/
    jerry@ubuntu:~/hadoop_1.2.1/hadoop-1.2.1$ ls
    bin          hadoop-ant-1.2.1.jar          ivy          sbin
    build.xml    hadoop-client-1.2.1.jar       ivy.xml      share
    c++          hadoop-core-1.2.1.jar         lib          src
    CHANGES.txt  hadoop-examples-1.2.1.jar     libexec      webapps
    conf         hadoop-minicluster-1.2.1.jar  LICENSE.txt
    contrib      hadoop-test-1.2.1.jar         NOTICE.txt
    docs         hadoop-tools-1.2.1.jar        README.txt
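
    The extraction step can also be wrapped in a small helper. This is a minimal sketch, not part of the original session: the function name extract_into is hypothetical, and it assumes a gzip-compressed tarball such as hadoop-1.2.1.tar.gz. It creates the target directory first, because tar -C fails when that directory does not exist.

```shell
# extract_into is a hypothetical helper name, not a Hadoop command.
# Usage: extract_into <tarball> <target-dir>
extract_into() {
    tarball="$1"
    target="$2"
    # tar -C needs an existing directory, so create it first.
    mkdir -p "$target" || return 1
    # z = gunzip, x = extract, f = read from the named file.
    tar zxf "$tarball" -C "$target"
}
```

    For the layout used in this post, the call would be extract_into ~/Downloads/hadoop-1.2.1.tar.gz ~/hadoop_1.2.1.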

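    Next, edit several of Hadoop's configuration files, all of which are in XML format.

    The first is core-site.xml. It sets the address and port of the Hadoop distributed file system, and the Hadoop temporary directory (which defaults to /tmp/hadoop-${user.name}).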

    jerry@ubuntu:~/hadoop_1.2.1/hadoop-1.2.1/conf$ cat core-site.xml 
    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    
    <!-- Put site-specific property overrides in this file. -->
    
    <configuration>
        <property>
            <name>fs.default.name</name>
            <value>hdfs://localhost:9000</value>
        </property>
        <property>
            <name>hadoop.tmp.dir</name>
            <value>/hadoop/hadooptmp</value>
        </property>
    </configuration>
    jerry@ubuntu:~/hadoop_1.2.1/hadoop-1.2.1/conf$ 
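
    To double-check what a site file actually sets, a property can be read back from the command line. The sketch below is only an illustration: get_property is a hypothetical helper, and it assumes the layout shown above, with <name> and <value> on separate lines. It is not a general XML parser.

```shell
# get_property is a hypothetical helper, not part of Hadoop.
# Usage: get_property <site-file> <property-name>
# Assumes <name> and <value> sit on separate lines, as in the file above.
get_property() {
    awk -v key="$2" '
        # Remember that we have seen the requested <name> line.
        index($0, "<name>" key "</name>") { found = 1; next }
        # Print the first <value> that follows it, then stop.
        found && /<value>/ {
            sub(/.*<value>/, "")
            sub(/<\/value>.*/, "")
            print
            exit
        }
    ' "$1"
}
```

    For example, get_property core-site.xml fs.default.name would print the HDFS address configured above.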

    Edit Hadoop's environment configuration file, hadoop-env.sh, to tell Hadoop the home directory of the installed JDK.

    jerry@ubuntu:~/hadoop_1.2.1/hadoop-1.2.1$ cd conf/
    jerry@ubuntu:~/hadoop_1.2.1/hadoop-1.2.1/conf$ ls
    capacity-scheduler.xml      hadoop-policy.xml      slaves
    configuration.xsl           hdfs-site.xml          ssl-client.xml.example
    core-site.xml               log4j.properties       ssl-server.xml.example
    fair-scheduler.xml          mapred-queue-acls.xml  taskcontroller.cfg
    hadoop-env.sh               mapred-site.xml        task-log4j.properties
    hadoop-metrics2.properties  masters
    jerry@ubuntu:~/hadoop_1.2.1/hadoop-1.2.1/conf$ sudo vim hadoop-env.sh
    [sudo] password for jerry: 
    jerry@ubuntu:~/hadoop_1.2.1/hadoop-1.2.1/conf$ tail -n 1 hadoop-env.sh 
    export JAVA_HOME=/usr/lib/jvm/jdk
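
    The same change can be made without opening an editor. The following is a rough sketch under a few assumptions: set_java_home is a hypothetical helper, and the JDK path must not contain the characters | or &, since the path is substituted into a sed expression. It replaces an existing uncommented export JAVA_HOME= line, or appends one if none exists.

```shell
# set_java_home is a hypothetical helper, not part of Hadoop.
# Usage: set_java_home <path-to-hadoop-env.sh> <jdk-home>
# Assumption: <jdk-home> contains no '|' or '&' characters.
set_java_home() {
    env_file="$1"
    java_home="$2"
    if grep -q '^export JAVA_HOME=' "$env_file"; then
        # Rewrite the existing line via a temporary file, then copy the
        # result back so the original file keeps its permissions.
        sed "s|^export JAVA_HOME=.*|export JAVA_HOME=$java_home|" "$env_file" > "$env_file.tmp" \
            && cat "$env_file.tmp" > "$env_file" \
            && rm -f "$env_file.tmp"
    else
        # No active setting yet: append one at the end of the file.
        printf 'export JAVA_HOME=%s\n' "$java_home" >> "$env_file"
    fi
}
```

    With the path used in this post, the call would be set_java_home hadoop-env.sh /usr/lib/jvm/jdk.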

    Next is hdfs-site.xml. Set the HDFS replication factor to 1, and set the NameNode's storage directory and the DataNode's storage directory.

    jerry@ubuntu:~/hadoop_1.2.1/hadoop-1.2.1/conf$ cat hdfs-site.xml 
    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    
    <!-- Put site-specific property overrides in this file. -->
    
    <configuration>
        <property>
            <name>dfs.replication</name>
            <value>1</value>
        </property>
        <property>
            <name>dfs.name.dir</name>
            <value>/hadoop/hdfs/name</value>
        </property>
        <property>
            <name>dfs.data.dir</name>
            <value>/hadoop/hdfs/data</value>
        </property>
    </configuration>
    jerry@ubuntu:~/hadoop_1.2.1/hadoop-1.2.1/conf$ 
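
    Site files like this one all share the same shape, so they can be generated from name=value pairs. This is a minimal sketch rather than a Hadoop tool: write_site_xml is a hypothetical name, and the values are assumed to contain no characters that need XML escaping (such as & or <).

```shell
# write_site_xml is a hypothetical helper, not part of Hadoop.
# Usage: write_site_xml <output-file> name=value [name=value ...]
# Assumption: names and values need no XML escaping.
write_site_xml() {
    out="$1"
    shift
    {
        printf '<?xml version="1.0"?>\n'
        printf '<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>\n\n'
        printf '<configuration>\n'
        for pair in "$@"; do
            printf '    <property>\n'
            # Everything before the first '=' is the property name...
            printf '        <name>%s</name>\n' "${pair%%=*}"
            # ...and everything after it is the value.
            printf '        <value>%s</value>\n' "${pair#*=}"
            printf '    </property>\n'
        done
        printf '</configuration>\n'
    } > "$out"
}
```

    The file above would correspond to write_site_xml hdfs-site.xml dfs.replication=1 dfs.name.dir=/hadoop/hdfs/name dfs.data.dir=/hadoop/hdfs/data.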

    Finally, configure the address and port of the MapReduce JobTracker.

    jerry@ubuntu:~/hadoop_1.2.1/hadoop-1.2.1/conf$ cat mapred-site.xml
    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

    <!-- Put site-specific property overrides in this file. -->

    <configuration>
        <property>
            <name>mapred.job.tracker</name>
            <value>localhost:9001</value>
        </property>
    </configuration>
    jerry@ubuntu:~/hadoop_1.2.1/hadoop-1.2.1/conf$

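    Configure the masters and slaves files. In Hadoop 1.x, masters lists the hosts that run the SecondaryNameNode, and slaves lists the hosts that run a DataNode and TaskTracker. Because this is a pseudo-distributed setup, the name node and the data node are actually the same machine.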

    jerry@ubuntu:~/hadoop_1.2.1/hadoop-1.2.1/conf$ cat masters 
    localhost
    192.168.2.100
    
    jerry@ubuntu:~/hadoop_1.2.1/hadoop-1.2.1/conf$ cat slaves 
    localhost
    192.168.2.100
    jerry@ubuntu:~/hadoop_1.2.1/hadoop-1.2.1/conf$ 
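
    Each non-empty line in these files is treated as a separate host, so on a single machine it helps to see exactly which entries will be contacted. The sketch below uses a hypothetical helper, list_hosts, that prints the effective entries of such a file with comments and blank lines removed. It only reads the file and does not contact any host.

```shell
# list_hosts is a hypothetical helper, not part of Hadoop.
# Usage: list_hosts <masters-or-slaves-file>
list_hosts() {
    # Drop comments, trim surrounding whitespace, and skip empty lines.
    sed -e 's/#.*$//' \
        -e 's/^[[:space:]]*//' \
        -e 's/[[:space:]]*$//' \
        -e '/^$/d' "$1"
}
```

    For the slaves file shown above it prints two entries, localhost and 192.168.2.100, even though in this setup both refer to the same machine.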

    Edit the /etc/hosts file to map host names to IP addresses.

    jerry@ubuntu:~/hadoop_1.2.1/hadoop-1.2.1/conf$ cat /etc/hosts
    127.0.0.1    localhost
    127.0.1.1    ubuntu
    
    # The following lines are desirable for IPv6 capable hosts
    ::1     ip6-localhost ip6-loopback
    fe00::0 ip6-localnet
    ff00::0 ip6-mcastprefix
    ff02::1 ip6-allnodes
    ff02::2 ip6-allrouters
    192.168.2.100 master
    192.168.2.100 slave
    jerry@ubuntu:~/hadoop_1.2.1/hadoop-1.2.1/conf$ 
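
    A quick way to confirm a mapping is to look the name up in the file itself. This is an illustrative sketch only: lookup_host is a hypothetical helper that reads a hosts-format file directly, so it shows what the file says, not what the system resolver would finally return.

```shell
# lookup_host is a hypothetical helper, not a system command.
# Usage: lookup_host <hosts-file> <host-name>
lookup_host() {
    awk -v name="$2" '
        # Strip comments; awk re-splits the fields after $0 changes.
        { sub(/#.*/, "") }
        {
            # Field 1 is the address, the remaining fields are names.
            for (i = 2; i <= NF; i++)
                if ($i == name) { print $1; exit }
        }
    ' "$1"
}
```

    With the file above, lookup_host /etc/hosts master and lookup_host /etc/hosts slave would both print 192.168.2.100.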

    Create the directories referenced in the configuration files above: hadoop.tmp.dir from core-site.xml, and dfs.name.dir and dfs.data.dir from hdfs-site.xml.

    jerry@ubuntu:~/hadoop_1.2.1/hadoop-1.2.1/conf$ mkdir -p /hadoop/hadooptmp
    jerry@ubuntu:~/hadoop_1.2.1/hadoop-1.2.1/conf$ mkdir -p /hadoop/hdfs/name
    jerry@ubuntu:~/hadoop_1.2.1/hadoop-1.2.1/conf$ mkdir -p /hadoop/hdfs/data
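
    Because /hadoop sits directly under /, a regular user usually cannot create it without sudo, and the Hadoop daemons then need write access to it. The sketch below uses a hypothetical helper, ensure_dirs, that creates each directory and reports any that the current user cannot write to. It does not change ownership or permissions.

```shell
# ensure_dirs is a hypothetical helper, not part of Hadoop.
# Usage: ensure_dirs <dir> [<dir> ...]
ensure_dirs() {
    for d in "$@"; do
        # Create the directory and any missing parents.
        mkdir -p "$d" || return 1
        # Fail early if the current user cannot write to it.
        if [ ! -w "$d" ]; then
            echo "not writable: $d" >&2
            return 1
        fi
    done
}
```

    For this post the call would be ensure_dirs /hadoop/hadooptmp /hadoop/hdfs/name /hadoop/hdfs/data.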

    Format HDFS.

    jerry@ubuntu:~/hadoop_1.2.1/hadoop-1.2.1/bin$ ./hadoop namenode -format
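
    Formatting is meant to be done once. Re-running it on a name directory that is already in use gives the NameNode a new namespace ID, which is a common reason for a DataNode that then refuses to start. A small guard can help avoid that. This is a sketch under an assumption: namenode_is_formatted is a hypothetical helper, and it relies on a formatted name directory containing current/VERSION, which is the layout Hadoop 1.x normally produces.

```shell
# namenode_is_formatted is a hypothetical helper, not part of Hadoop.
# Usage: namenode_is_formatted <dfs.name.dir>
# Assumption: a formatted name directory contains current/VERSION.
namenode_is_formatted() {
    [ -f "$1/current/VERSION" ]
}

# Example guard (commented out so that nothing is formatted by accident):
# namenode_is_formatted /hadoop/hdfs/name || ./hadoop namenode -format
```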

    Start all the Hadoop services, including the JobTracker, TaskTracker, and NameNode.

    jerry@ubuntu:~/hadoop_1.2.1/hadoop-1.2.1/bin$ ./start-all.sh 
    starting namenode, logging to /home/jerry/hadoop_1.2.1/hadoop-1.2.1/libexec/../logs/hadoop-jerry-namenode-ubuntu.out
    192.168.68.130: starting datanode, logging to /home/jerry/hadoop_1.2.1/hadoop-1.2.1/libexec/../logs/hadoop-jerry-datanode-ubuntu.out
    localhost: starting datanode, logging to /home/jerry/hadoop_1.2.1/hadoop-1.2.1/libexec/../logs/hadoop-jerry-datanode-ubuntu.out
    localhost: ulimit -a for user jerry
    localhost: core file size          (blocks, -c) 0
    localhost: data seg size           (kbytes, -d) unlimited
    localhost: scheduling priority             (-e) 0
    localhost: file size               (blocks, -f) unlimited
    localhost: pending signals                 (-i) 7855
    localhost: max locked memory       (kbytes, -l) 64
    localhost: max memory size         (kbytes, -m) unlimited
    localhost: open files                      (-n) 1024
    localhost: pipe size            (512 bytes, -p) 8
    localhost: starting secondarynamenode, logging to /home/jerry/hadoop_1.2.1/hadoop-1.2.1/libexec/../logs/hadoop-jerry-secondarynamenode-ubuntu.out
    192.168.68.130: secondarynamenode running as process 10689. Stop it first.
    starting jobtracker, logging to /home/jerry/hadoop_1.2.1/hadoop-1.2.1/libexec/../logs/hadoop-jerry-jobtracker-ubuntu.out
    192.168.68.130: starting tasktracker, logging to /home/jerry/hadoop_1.2.1/hadoop-1.2.1/libexec/../logs/hadoop-jerry-tasktracker-ubuntu.out
    localhost: starting tasktracker, logging to /home/jerry/hadoop_1.2.1/hadoop-1.2.1/libexec/../logs/hadoop-jerry-tasktracker-ubuntu.out
    localhost: ulimit -a for user jerry
    localhost: core file size          (blocks, -c) 0
    localhost: data seg size           (kbytes, -d) unlimited
    localhost: scheduling priority             (-e) 0
    localhost: file size               (blocks, -f) unlimited
    localhost: pending signals                 (-i) 7855
    localhost: max locked memory       (kbytes, -l) 64
    localhost: max memory size         (kbytes, -m) unlimited
    localhost: open files                      (-n) 1024
    localhost: pipe size            (512 bytes, -p) 8
    jerry@ubuntu:~/hadoop_1.2.1/hadoop-1.2.1/bin$ 

    Check whether the Hadoop services started successfully.

    jerry@ubuntu:~/hadoop_1.2.1/hadoop-1.2.1/conf$ jps
    3472 JobTracker
    3604 TaskTracker
    3084 NameNode
    5550 Jps
    3247 DataNode
    3391 SecondaryNameNode
    jerry@ubuntu:~/hadoop_1.2.1/hadoop-1.2.1/conf$
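
    The same check can be scripted. The following sketch uses a hypothetical helper, missing_daemons, that reads jps output on standard input and prints the name of every expected daemon that is absent. The list of five daemons is taken from the jps output above.

```shell
# missing_daemons is a hypothetical helper, not part of Hadoop or the JDK.
# Usage: jps | missing_daemons
missing_daemons() {
    jps_output=$(cat)
    for d in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
        # Compare the second column exactly, so that NameNode does not
        # match SecondaryNameNode. Print the daemon name when it is absent.
        printf '%s\n' "$jps_output" \
            | awk -v name="$d" '$2 == name { ok = 1 } END { exit !ok }' \
            || printf '%s\n' "$d"
    done
}
```

    Empty output means all five daemons are running.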

    Check the status of the Hadoop cluster.

    jerry@ubuntu:~/hadoop_1.2.1/hadoop-1.2.1/bin$ ./hadoop dfsadmin -report
    Configured Capacity: 41083600896 (38.26 GB)
    Present Capacity: 32723169280 (30.48 GB)
    DFS Remaining: 32723128320 (30.48 GB)
    DFS Used: 40960 (40 KB)
    DFS Used%: 0%
    Under replicated blocks: 0
    Blocks with corrupt replicas: 0
    Missing blocks: 0
    
    -------------------------------------------------
    Datanodes available: 1 (1 total, 0 dead)
    
    Name: 127.0.0.1:50010
    Decommission Status : Normal
    Configured Capacity: 41083600896 (38.26 GB)
    DFS Used: 40960 (40 KB)
    Non DFS Used: 8360431616 (7.79 GB)
    DFS Remaining: 32723128320(30.48 GB)
    DFS Used%: 0%
    DFS Remaining%: 79.65%
    Last contact: Sat Dec 26 12:22:07 PST 2015
    
    
    jerry@ubuntu:~/hadoop_1.2.1/hadoop-1.2.1/bin$ 
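
    For a scripted health check, the number of live DataNodes can be pulled out of the report. This is a minimal sketch that assumes the "Datanodes available:" line has the format shown above; datanodes_available is a hypothetical helper that reads the report on standard input.

```shell
# datanodes_available is a hypothetical helper, not part of Hadoop.
# Usage: ./hadoop dfsadmin -report | datanodes_available
datanodes_available() {
    # Print only the number that follows "Datanodes available:".
    sed -n 's/^[[:space:]]*Datanodes available: \([0-9][0-9]*\).*/\1/p'
}
```

    In a pseudo-distributed setup the expected result is 1, matching the report above.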

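    I ran into quite a few problems along the way, so here are some useful links:

    Installing Hadoop in pseudo-distributed mode

    Summary of Hadoop configuration and runtime errors

    Solutions to problems you may encounter while configuring a Hadoop environment

    Hadoop DataNode fails to start

    Adding and removing DataNodes and TaskTrackers in Hadoop

    Hadoop DataNode won't start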

  • Original post: https://www.cnblogs.com/jenrrychen/p/5043203.html