  • Hadoop Installation and Configuration

    Hadoop Cluster Installation

    1. Configure the servers

    One master node, master (192.168.15.128), and two slave nodes, slaver1 (192.168.15.129) and slaver2 (192.168.15.130).

    Set the hostname of the master node (192.168.15.128):

    vi /etc/sysconfig/network

    Add the following:

    NETWORKING=yes

    HOSTNAME=master

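    On CentOS 7, the hostname is stored in /etc/hostname instead.

    Set the hostnames of the two slave nodes (192.168.15.129 and 192.168.15.130).

    On slaver1:

    vi /etc/sysconfig/network

    Add the following: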

    NETWORKING=yes

    HOSTNAME=slaver1

    On slaver2:

    vi /etc/sysconfig/network

    Add the following:

    NETWORKING=yes

    HOSTNAME=slaver2

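    Configure hosts

    Open the hosts file on the master node, comment out its first two lines (the entries for the current host), and add an entry for every host in the Hadoop cluster.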

    vi /etc/hosts

    192.168.15.128   master

    192.168.15.129   slaver1

    192.168.15.130   slaver2

    After saving, copy the master node's hosts file to the two slave nodes:

    scp /etc/hosts root@192.168.15.129:/etc/

    scp /etc/hosts root@192.168.15.130:/etc/
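
A quick check that a node's hosts file really lists every cluster member can catch typos before they show up as connection errors later. The sketch below is only an illustration: `missing_hosts` is a hypothetical helper, not part of the original post, and it prints each expected hostname that has no uncommented entry in the given file.

```shell
# Hypothetical helper: print each expected hostname that has no
# uncommented entry in the given hosts file.
# Usage: missing_hosts /etc/hosts master slaver1 slaver2
missing_hosts() {
    hosts_file=$1
    shift
    for name in "$@"; do
        # Look for the name as a whole field after the IP address,
        # ignoring lines that are commented out.
        if ! awk -v n="$name" '
                /^[[:space:]]*#/ { next }
                { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
                END { exit found ? 0 : 1 }' "$hosts_file"; then
            echo "$name"
        fi
    done
}
```

Running `missing_hosts /etc/hosts master slaver1 slaver2` on each node should print nothing once the file has been copied everywhere.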

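    Then, on each node, run /bin/hostname followed by that node's hostname (this can be skipped if you reboot the server instead).

    For example, run /bin/hostname master on master so that the new name takes effect.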
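
To confirm that the new name is active on a node, the current hostname can be compared against the expected one. This is a small sketch; `hostname_is` is a hypothetical helper introduced here, and the `hostnamectl` remark is an addition that the original post does not cover.

```shell
# Hypothetical helper: succeed if this machine's hostname equals the
# expected name. uname -n reports the node name, the same value that
# /bin/hostname prints.
hostname_is() {
    [ "$(uname -n)" = "$1" ]
}

# Example (run on the master):
#   hostname_is master || echo "hostname is still $(uname -n)"
#
# On CentOS 7, `hostnamectl set-hostname master` is the usual way to
# change the name persistently (an addition, not from the original post).
```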

    2. Configure passwordless SSH access

    Generate the public/private key pair

    Run on every node:

    ssh-keygen -t rsa

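    Press Enter at every prompt until key generation finishes.

    Afterwards, the /root/.ssh/ directory on each node contains two files: id_rsa and id_rsa.pub.

    The former is the private key and the latter is the public key.

    On the master node, from the /root/.ssh/ directory, run:

    cp id_rsa.pub authorized_keys

    Copy the slave nodes' public keys to the master node and add them to authorized_keys

    To copy the two slave nodes' public keys to the master node, run the first command below on slaver1 and the second on slaver2: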

    scp ~/.ssh/id_rsa.pub root@master:~/.ssh/id_rsa_slaver1.pub

    scp ~/.ssh/id_rsa.pub root@master:~/.ssh/id_rsa_slaver2.pub

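    Then, on the master node, merge the two copied public keys into the authorized_keys file.

    On the master node, from the /root/.ssh/ directory, run:

    cat id_rsa_slaver1.pub >> authorized_keys

    cat id_rsa_slaver2.pub >> authorized_keys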

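    Finally, test whether the configuration works (logins from master to the slave nodes only become passwordless once authorized_keys has been copied to them, as described at the end of this section).

    On master, run each of the following: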

    ssh slaver1

    ssh slaver2

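    If you land in a shell on each slave node without being asked for a password, that login works. The setup is complete when every slave node can likewise log in to the master node and to the other slave node without a password.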
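
The manual logins can also be checked in one pass. The following is a minimal sketch, not the original author's method: `check_passwordless` is a hypothetical helper, and the `SSH_BIN` variable is an assumption added so the ssh client can be swapped out. It relies on the OpenSSH option `BatchMode=yes`, which makes ssh fail instead of prompting for a password.

```shell
# Hypothetical helper: try a non-interactive login to each host and
# report the ones that still ask for a password.
# SSH_BIN is an assumption of this sketch; it defaults to plain ssh.
check_passwordless() {
    failed=0
    for host in "$@"; do
        # BatchMode=yes: never prompt; ConnectTimeout=5: give up after 5 s.
        if "${SSH_BIN:-ssh}" -o BatchMode=yes -o ConnectTimeout=5 "$host" true </dev/null; then
            echo "$host: ok"
        else
            echo "$host: FAILED"
            failed=1
        fi
    done
    return "$failed"
}
```

For example, `check_passwordless master slaver1 slaver2` run on each node should print `ok` for all three hosts.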

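    There are several ways to carry out this configuration; the end goal is that /root/.ssh/authorized_keys on every node contains the public keys generated by all nodes.

    Replace the authorized_keys file on each slave node with the one from the master node

    On the master node, from the /root/.ssh/ directory, use scp to copy authorized_keys to the same location on each slave node: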

    scp authorized_keys root@slaver1:/root/.ssh/

    scp authorized_keys root@slaver2:/root/.ssh/
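
Appending keys with `cat >>` adds a duplicate line every time it is repeated, and sshd with its default StrictModes setting ignores key files that are writable by group or others. A minimal sketch that handles both points is shown below; `merge_pubkeys` is a hypothetical helper, not something from the original post.

```shell
# Hypothetical helper: append public keys to an authorized_keys file,
# skipping keys that are already present, then tighten permissions.
# Usage: merge_pubkeys ~/.ssh/authorized_keys ~/.ssh/id_rsa_slaver1.pub ...
merge_pubkeys() {
    auth_file=$1
    shift
    touch "$auth_file"
    for pub in "$@"; do
        while IFS= read -r key || [ -n "$key" ]; do
            if [ -z "$key" ]; then
                continue
            fi
            # -F fixed string, -x whole line, -q quiet
            if ! grep -qxF -- "$key" "$auth_file"; then
                printf '%s\n' "$key" >> "$auth_file"
            fi
        done < "$pub"
    done
    # Only the owner may read or write the key file.
    chmod 600 "$auth_file"
}
```

On the master this could be called as `merge_pubkeys ~/.ssh/authorized_keys ~/.ssh/id_rsa_slaver1.pub ~/.ssh/id_rsa_slaver2.pub`. The `ssh-copy-id` tool that ships with OpenSSH is another way to reach the same end state.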

    3. Install the JDK

    Remove the existing JDK

    List the JDK packages already installed on the system:

    rpm -qa | grep jdk

    Uninstall each package that is listed, for example:

    rpm -e --nodeps java-1.6.0-openjdk-javadoc-1.6.0.0-1.66.1.13.0.el6.x86_64
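
When several JDK packages are installed, removing them one at a time is tedious. The sketch below shows one way to filter the package list so it can be reviewed first and then piped to rpm; `filter_jdk_packages` is a hypothetical helper, and `xargs -r` is the GNU xargs flag that skips the command when the input is empty.

```shell
# Hypothetical helper: read package names on stdin and print only the
# ones whose name contains "jdk" (case-insensitive).
filter_jdk_packages() {
    grep -i 'jdk' || true   # an empty result is not an error here
}

# Dry run, to review what would be removed:
#   rpm -qa | filter_jdk_packages
# Remove them with the same flags as the command above:
#   rpm -qa | filter_jdk_packages | xargs -r rpm -e --nodeps
```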

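    Install the JDK (on all three machines)

    Install it in the same location on each machine: /opt/java/jdk1.7.0_72

    Download the JDK

    Extract the JDK (run this from /opt/java so that it unpacks into /opt/java/jdk1.7.0_72):

    tar -zxvf /opt/java/jdk-7u72-linux-x64.gz

    Configure the environment variables by editing the profile file: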

    vi /etc/profile

    Append the following to the end of the profile file:

    export JAVA_HOME=/opt/java/jdk1.7.0_72

    export JRE_HOME=$JAVA_HOME/jre

    export PATH=$JAVA_HOME/bin:$PATH

    export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib

    After saving, apply the changes:

    source /etc/profile

    Check that the installation succeeded: java -version
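
If `java -version` reports an unexpected version or is not found, it helps to check the directory that JAVA_HOME points at. A minimal sketch follows; `check_java_home` is a hypothetical helper that only looks for the `bin/java` and `bin/javac` executables a JDK provides.

```shell
# Hypothetical helper: verify that a directory looks like a usable JDK.
check_java_home() {
    jdk_dir=$1
    if [ ! -x "$jdk_dir/bin/java" ]; then
        echo "no executable java under $jdk_dir/bin" >&2
        return 1
    fi
    if [ ! -x "$jdk_dir/bin/javac" ]; then
        echo "java found but no javac under $jdk_dir/bin (JRE only?)" >&2
        return 1
    fi
    echo "JDK layout looks fine: $jdk_dir"
}
```

After `source /etc/profile`, running `check_java_home "$JAVA_HOME"` on each of the three machines should print the success line.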

    4. Install Hadoop

    Install Hadoop on the master host

    The install location is up to you; this guide installs under /usr.

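    Download the Hadoop package and extract it under /usr (the command below reads the archive from /opt/hadoop; adjust the path to wherever you saved it):

    tar -zxvf /opt/hadoop/hadoop-2.6.4.tar.gz -C /usr

    This creates the hadoop-2.6.4 directory under /usr.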

    Configure the environment variables:

    vi /etc/profile

    Append at the end:

    export HADOOP_HOME=/usr/hadoop-2.6.4

    export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

    After saving, apply the updated profile:

    source /etc/profile
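
A stray space or typo in the HADOOP_HOME line leaves the hadoop commands off the PATH without any error message. The sketch below checks the PATH directly; `path_contains` and `hadoop_path_ok` are hypothetical helpers introduced for illustration.

```shell
# Hypothetical helper: succeed if the directory is an entry in $PATH.
path_contains() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Hypothetical helper: both bin and sbin of the Hadoop install must be
# on the PATH for the commands used later in this guide.
hadoop_path_ok() {
    hadoop_home=$1
    path_contains "$hadoop_home/bin" && path_contains "$hadoop_home/sbin"
}
```

After sourcing the profile, `hadoop_path_ok "$HADOOP_HOME"` should succeed, and `hadoop version` should then print the 2.6.4 release information.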

    5. Configure Hadoop

    Edit the Hadoop configuration files

    The files live in /usr/hadoop-2.6.4/etc/hadoop. The following need to be modified:

    hadoop-env.sh

    yarn-env.sh

    core-site.xml

    hdfs-site.xml

    mapred-site.xml

    yarn-site.xml

    slaves

    Of these, hadoop-env.sh and yarn-env.sh both need the JDK environment variable added:

    hadoop-env.sh

    # The java implementation to use.

    export JAVA_HOME=/opt/java/jdk1.7.0_72

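    (The export JAVA_HOME line above is the newly added content, shown in red in the original post; the other lines are already in the file.)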

    # The jsvc implementation to use. Jsvc is required to run secure datanodes

    # that bind to privileged ports to provide authentication of data transfer

    # protocol.  Jsvc is not required if SASL is configured for authentication of

    # data transfer protocol using non-privileged ports.

    #export JSVC_HOME=${JSVC_HOME}

    yarn-env.sh

    # User for YARN daemons

    export HADOOP_YARN_USER=${HADOOP_YARN_USER:-yarn}

    # resolve links - $0 may be a softlink

    export YARN_CONF_DIR="${YARN_CONF_DIR:-$HADOOP_YARN_HOME/conf}"

    # some Java parameters

    export JAVA_HOME=/opt/java/jdk1.7.0_72

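    (The export JAVA_HOME line above is the newly added content, shown in red in the original post; the other lines are already in the file.)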
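
The same edit to both env scripts can be scripted instead of made by hand in vi. This is a rough sketch under stated assumptions: `set_java_home` is a hypothetical helper, it rewrites an existing uncommented `export JAVA_HOME=` line or appends one if none exists, and it assumes the JDK path contains no `|` or `&` characters, which would confuse the sed substitution.

```shell
# Hypothetical helper: point an env script (hadoop-env.sh, yarn-env.sh)
# at a JDK directory.
set_java_home() {
    env_file=$1
    jdk_dir=$2
    if grep -q '^[[:space:]]*export JAVA_HOME=' "$env_file"; then
        tmp_file=$(mktemp)
        # "|" is the sed delimiter because the path contains "/".
        sed "s|^[[:space:]]*export JAVA_HOME=.*|export JAVA_HOME=$jdk_dir|" "$env_file" > "$tmp_file"
        cat "$tmp_file" > "$env_file"   # overwrite in place, keeping the file mode
        rm -f "$tmp_file"
    else
        printf 'export JAVA_HOME=%s\n' "$jdk_dir" >> "$env_file"
    fi
}
```

For example: `set_java_home /usr/hadoop-2.6.4/etc/hadoop/hadoop-env.sh /opt/java/jdk1.7.0_72`, and the same call for yarn-env.sh.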

    core-site.xml

    <configuration>

            <property>

                    <name>fs.defaultFS</name>

                    <value>hdfs://master:9000</value>

            </property>

            <property>

                    <name>io.file.buffer.size</name>

                    <value>131072</value>

            </property>

            <property>

                    <name>hadoop.tmp.dir</name>

                    <value>file:/usr/temp</value>

            </property>

            <property>

                    <name>hadoop.proxyuser.root.hosts</name>

                    <value>*</value>

            </property>

            <property>

                    <name>hadoop.proxyuser.root.groups</name>

                    <value>*</value>

            </property>

    </configuration>

    hdfs-site.xml

    <configuration>

            <property>

                    <name>dfs.namenode.secondary.http-address</name>

                    <value>master:9001</value>

            </property>

            <property>

                    <name>dfs.namenode.name.dir</name>

                    <value>file:/usr/dfs/name</value>

            </property>

            <property>

                    <name>dfs.datanode.data.dir</name>

                    <value>file:/usr/dfs/data</value>

            </property>

            <property>

                    <name>dfs.replication</name>

                    <value>2</value>

            </property>

            <property>

                    <name>dfs.webhdfs.enabled</name>

                    <value>true</value>

            </property>

            <property>

                    <name>dfs.permissions</name>

                    <value>false</value>

            </property>

            <property>

                    <name>dfs.web.ugi</name>

                    <value> supergroup</value>

            </property>

    </configuration>
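
The directories named in core-site.xml and hdfs-site.xml (/usr/temp, /usr/dfs/name, /usr/dfs/data) can be created up front on every node, which is one way to surface permission problems before the daemons start. A minimal sketch; `make_dfs_dirs` is a hypothetical helper, and its optional prefix argument exists only so the sketch can be tried in a scratch directory.

```shell
# Hypothetical helper: create the directories referenced by
# hadoop.tmp.dir, dfs.namenode.name.dir and dfs.datanode.data.dir.
# Pass no argument on a real node; pass a scratch directory to try it out.
make_dfs_dirs() {
    prefix=${1:-}
    mkdir -p "$prefix/usr/temp" "$prefix/usr/dfs/name" "$prefix/usr/dfs/data"
}
```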

    mapred-site.xml

    <configuration>

            <property>

                    <name>mapreduce.framework.name</name>

                    <value>yarn</value>

            </property>

            <property>

                    <name>mapreduce.jobhistory.address</name>

                    <value>master:10020</value>

            </property>

            <property>

                    <name>mapreduce.jobhistory.webapp.address</name>

                    <value>master:19888</value>

            </property>

    </configuration>

    yarn-site.xml

    <configuration>

            <property>

                    <name>yarn.nodemanager.aux-services</name>

                    <value>mapreduce_shuffle</value>

            </property>

            <property>

                    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>

                    <value>org.apache.hadoop.mapred.ShuffleHandler</value>

            </property>

            <property>

                    <name>yarn.resourcemanager.address</name>

                    <value>master:8032</value>

            </property>

            <property>

                    <name>yarn.resourcemanager.scheduler.address</name>

                    <value>master:8030</value>

            </property>

            <property>

                    <name>yarn.resourcemanager.resource-tracker.address</name>

                    <value>master:8031</value>

            </property>

            <property>

                    <name>yarn.resourcemanager.admin.address</name>

                    <value>master:8033</value>

            </property>

            <property>

                    <name>yarn.resourcemanager.webapp.address</name>

                    <value>master:8088</value>

            </property>

    </configuration>
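
With this many properties spread over four files, a quick way to read a value back helps confirm that an edit landed where intended. The sketch below is line-oriented rather than a real XML parser: `get_prop` is a hypothetical helper, and it assumes each `<name>` and `<value>` element sits on a single line, as in the listings above.

```shell
# Hypothetical helper: print the <value> of a named property from a
# Hadoop *-site.xml file.
# Usage: get_prop FILE PROPERTY_NAME
get_prop() {
    awk -v name="$2" '
        # Remember that the wanted <name> element has been seen.
        index($0, "<name>" name "</name>") { found = 1 }
        # The first <value> at or after that line belongs to it.
        found && /<value>/ {
            sub(/.*<value>/, "")
            sub(/<\/value>.*/, "")
            print
            exit
        }' "$1"
}
```

For example, `get_prop /usr/hadoop-2.6.4/etc/hadoop/core-site.xml fs.defaultFS` should print `hdfs://master:9000`. Once Hadoop is on the PATH, `hdfs getconf -confKey fs.defaultFS` reports the value Hadoop itself resolves.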

    slaves

    master

    slaver1

    slaver2

    Copy the Hadoop installation to the slave nodes

    On the master node, run:

    scp -r /usr/hadoop-2.6.4 root@slaver1:/usr

    scp -r /usr/hadoop-2.6.4 root@slaver2:/usr

    Copy the profile to the slave nodes

    On the master node, run:

    scp /etc/profile root@slaver1:/etc/

    scp /etc/profile root@slaver2:/etc/

    On each of the two slave nodes, apply the new profile:

    source /etc/profile
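
As the cluster grows, the copy commands can be generated from the slaves file instead of typed per host. The sketch below only prints the commands (a dry run) so they can be reviewed first; `print_sync_commands` is a hypothetical helper, and the paths are the ones used in this guide.

```shell
# Hypothetical helper: print (not run) the scp commands that push the
# Hadoop directory and /etc/profile to every host in a slaves file,
# skipping blank lines, comment lines, and the local host name.
# Usage: print_sync_commands SLAVES_FILE LOCAL_HOSTNAME
print_sync_commands() {
    slaves_file=$1
    local_host=$2
    while IFS= read -r host || [ -n "$host" ]; do
        case "$host" in
            ''|'#'*|"$local_host") continue ;;
        esac
        echo "scp -r /usr/hadoop-2.6.4 root@$host:/usr"
        echo "scp /etc/profile root@$host:/etc/"
    done < "$slaves_file"
}
```

On the master, `print_sync_commands /usr/hadoop-2.6.4/etc/hadoop/slaves master` lists the four commands shown above; piping that output to `sh` would execute them.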

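    Format the NameNode on the master node

    On the master node, change into the Hadoop directory (/usr/hadoop-2.6.4).

    Then run:

    ./bin/hadoop namenode -format

    In newer versions the hadoop command is deprecated for this purpose; use the hdfs command instead:

    ./bin/hdfs namenode -format

    The message "successfully formatted" indicates that formatting succeeded.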
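
Formatting is meant to be done once. Re-formatting a NameNode that is already in use wipes the HDFS metadata and assigns a new cluster ID, after which DataNodes holding the old ID refuse to register. A small guard, sketched below, checks for the `current/VERSION` file that a formatted name directory contains; `namenode_is_formatted` is a hypothetical helper.

```shell
# Hypothetical helper: succeed if the given dfs.namenode.name.dir
# already holds a formatted namespace.
namenode_is_formatted() {
    [ -f "$1/current/VERSION" ]
}

# Example, using the name directory from hdfs-site.xml above:
#   if namenode_is_formatted /usr/dfs/name; then
#       echo "already formatted, skipping"
#   else
#       ./bin/hdfs namenode -format
#   fi
```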

    Start Hadoop

    On the master node, from the Hadoop directory, run:

    ./sbin/start-all.sh

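    jps on the master node should list these processes (because master also appears in the slaves file above, DataNode and NodeManager will show up on it as well):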

    NameNode

    SecondaryNameNode

    ResourceManager

    jps on each slave node should list:

    DataNode

    NodeManager

    If so, the Hadoop cluster is configured successfully.
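
Reading jps output by eye on three machines is easy to get wrong, so the check can be scripted. The sketch below is an illustration only: `missing_daemons` is a hypothetical helper that reads `jps` output on stdin and prints every expected daemon that is absent. It compares the second column exactly, so `NameNode` is not satisfied by a `SecondaryNameNode` line.

```shell
# Hypothetical helper: read `jps` output on stdin and print each
# expected daemon name that is not running.
# Usage: jps | missing_daemons NameNode SecondaryNameNode ResourceManager
missing_daemons() {
    jps_output=$(cat)
    for daemon in "$@"; do
        if ! printf '%s\n' "$jps_output" |
                awk -v d="$daemon" '$2 == d { ok = 1 } END { exit ok ? 0 : 1 }'; then
            echo "$daemon"
        fi
    done
}

# On the master:
#   jps | missing_daemons NameNode SecondaryNameNode ResourceManager
# For a slave, from the master (full path, since a non-interactive ssh
# session may not have sourced /etc/profile):
#   ssh slaver1 /opt/java/jdk1.7.0_72/bin/jps | missing_daemons DataNode NodeManager
```

Empty output means every expected daemon is up.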

  • Original article: https://www.cnblogs.com/ljk-007/p/9188104.html