  • Hadoop installation

    In Hadoop's configuration files, hostnames must not contain underscores; if you need a word separator, use a hyphen.
    For example:
    change hadoop_master to hadoop-master
    With an underscore in the hostname you get the error: dfs.namenode.servicerpc-address or dfs.namenode.rpc-address is not configured, Does not contain a valid host:port authority
    

      

    Check whether the JDK bundled with CentOS is already installed:

    yum list installed | grep java

    Install or update Java:

    yum -y install java-1.7.0-openjdk*

    Set JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.99-2.6.5.0.el7_2.x86_64:

    vi ~/.bash_profile

    export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.99-2.6.5.0.el7_2.x86_64
    export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
    export PATH=$PATH:$JAVA_HOME/bin

    (Note that the no-underscore rule above applies to hostnames only; shell variable names are the opposite case and must use underscores, not hyphens.)

    export HADOOP_PREFIX=/home/hadoop/hadoop-2.7.2
    export HADOOP_HOME=/home/hadoop/hadoop-2.7.2
    export HADOOP_INSTALL=$HADOOP_HOME
    export HADOOP_MAPRED_HOME=$HADOOP_HOME
    export HADOOP_COMMON_HOME=$HADOOP_HOME
    export HADOOP_HDFS_HOME=$HADOOP_HOME
    export YARN_HOME=$HADOOP_HOME
    export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
    export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
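
    To apply the profile to the current shell and confirm the Java side is in place (a quick sanity check; the hadoop binaries themselves are unpacked in a later step):

    source ~/.bash_profile
    echo $JAVA_HOME    # should print the JDK path set above
    java -version      # should report OpenJDK 1.7.0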

      

    yum install ssh

    yum install rsync

     useradd hadoop

    passwd hadoop

    Change the hostname

    Step 1:

    Edit the HOSTNAME entry in /etc/sysconfig/network:

    vi /etc/sysconfig/network
    HOSTNAME=localhost.localdomain  # change localhost.localdomain to the new name, e.g. hadoop-master

    In the HOSTNAME value, the part before the first dot is the hostname and the part after it is the domain; with no dot, the whole value is the hostname.

    On CentOS 7, edit /etc/hostname instead:
    vi /etc/hostname
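
    Alternatively, on CentOS 7 hostnamectl writes /etc/hostname for you (shown here with the master's name as an example):

    hostnamectl set-hostname hadoop-master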
    

      

    This change is permanent and takes effect after a reboot.

    Step 2:
    Edit the /etc/hosts file:

    vi /etc/hosts

    127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4 
    ::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
    10.20.77.172 hadoop-master
    10.20.77.173 hadoop-slave1
    10.20.77.174 hadoop-slave2
    10.20.77.175 hadoop-slave3

    shutdown -r now    # finally, reboot the server

    Configure passwordless SSH login. You can create the dedicated hadoop user for this; cd into its home directory and run the following commands (ssh must already be installed):

    ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
    cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
    chmod 0600 ~/.ssh/authorized_keys

    A quick explanation: the first command generates an SSH key pair. The -t option selects the key algorithm (rsa or dsa); -P sets the passphrase, and '' (the empty string) means no passphrase.

    The second command appends the generated public key to the authorized_keys file.

    Now run ssh localhost; answer the host-key prompt and you can log in to the local machine without a password. Likewise, copying authorized_keys to the same directory on the other hosts with scp enables passwordless login to them.

    scp /home/hadoop/.ssh/authorized_keys hadoop@hadoop-slave1:/home/hadoop/
    
    On hadoop-slave1:
    cd /home/hadoop/
    cat authorized_keys >> .ssh/authorized_keys 
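
    To repeat this for every slave in one go, a minimal sketch (assuming the hadoop user exists on each node and the hostnames from /etc/hosts above; each scp/ssh prompts for the hadoop password once, after which logins are passwordless):

    for host in hadoop-slave1 hadoop-slave2 hadoop-slave3; do
        scp ~/.ssh/authorized_keys hadoop@$host:/home/hadoop/
        ssh hadoop@$host 'mkdir -p ~/.ssh && cat ~/authorized_keys >> ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys'
    done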
    

      

    su - hadoop

    cd /home/hadoop

    mkdir tmp

    mkdir hdfs

    mkdir hdfs/data

    mkdir hdfs/name

    wget http://apache.fayea.com/hadoop/common/stable/hadoop-2.7.2.tar.gz 

    Extract it to /home/hadoop/hadoop-2.7.2/:
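
    For example, assuming the tarball landed in /home/hadoop:

    cd /home/hadoop
    tar -xzf hadoop-2.7.2.tar.gz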

    Edit /home/hadoop/hadoop-2.7.2/etc/hadoop/core-site.xml.

    The complete file is as follows:


    <configuration>
        <property>
            <name>fs.defaultFS</name>
            <value>hdfs://hadoop-master:9000</value>
        </property>
        <property>
            <name>hadoop.tmp.dir</name>
            <value>file:/home/hadoop/tmp</value>
        </property>
        <property>
            <name>io.file.buffer.size</name>
            <value>131702</value>
        </property>
    </configuration>

      This is Hadoop's core configuration file, and only these properties need to be set here: fs.defaultFS names the HDFS filesystem, served on port 9000 of the master; hadoop.tmp.dir sets the root of Hadoop's temporary directory; io.file.buffer.size sets the read/write buffer size. The tmp path does not exist by default, which is why it was created with mkdir earlier.
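
    Once the hadoop binaries are on the PATH, one way to confirm this file is being picked up (an optional sanity check):

    hdfs getconf -confKey fs.defaultFS    # should print hdfs://hadoop-master:9000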

    Edit hdfs-site.xml:

    <configuration>
        <property>
            <name>dfs.namenode.name.dir</name>
            <value>file:/home/hadoop/hdfs/name</value>
        </property>
        <property>
            <name>dfs.datanode.data.dir</name>
            <value>file:/home/hadoop/hdfs/data</value>
        </property>
        <property>
            <name>dfs.replication</name>
            <value>3</value>
        </property>
        <property>
            <name>dfs.http.address</name>
            <value>hadoop-master:50070</value>
        </property>
        <property>
            <name>dfs.namenode.secondary.http-address</name>
            <value>hadoop-master:9001</value>
        </property>
        <property>
            <name>dfs.webhdfs.enabled</name>
            <value>true</value>
        </property>
    </configuration>
    

      

    cp mapred-site.xml.template mapred-site.xml

    vi mapred-site.xml

    <configuration>
        <property>
            <name>mapreduce.framework.name</name>
            <value>yarn</value>
        </property>
        <property>
            <name>mapreduce.jobhistory.address</name>
            <value>hadoop-master:10020</value>
        </property>
        <property>
            <name>mapreduce.jobhistory.webapp.address</name>
            <value>hadoop-master:19888</value>
        </property>
    </configuration>
    

      

    Edit yarn-site.xml:

    <configuration>
        <property>
            <name>yarn.nodemanager.aux-services</name>
            <value>mapreduce_shuffle</value>
        </property>
        <property>
            <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
            <value>org.apache.hadoop.mapred.ShuffleHandler</value>
        </property>
        <property>
            <name>yarn.resourcemanager.address</name>
            <value>hadoop-master:8032</value>
        </property>
        <property>
            <name>yarn.resourcemanager.scheduler.address</name>
            <value>hadoop-master:8030</value>
        </property>
        <property>
            <name>yarn.resourcemanager.resource-tracker.address</name>
            <value>hadoop-master:8031</value>
        </property>
        <property>
            <name>yarn.resourcemanager.admin.address</name>
            <value>hadoop-master:8033</value>
        </property>
        <property>
            <name>yarn.resourcemanager.webapp.address</name>
            <value>hadoop-master:8088</value>
        </property>
        <property>
            <name>yarn.nodemanager.resource.memory-mb</name>
            <value>768</value>
        </property>
    </configuration>

    yarn.nodemanager.resource.memory-mb caps the memory a NodeManager offers to containers; 768 MB is a low value suited to small test VMs.
    

      vi slaves

    hadoop-slave1
    hadoop-slave2
    hadoop-slave3
    

      

    Configuration on the other machines

    yum install ssh
    yum install rsync
    
    
    vi /etc/hosts
    10.20.77.172 hadoop-master
    10.20.77.173 hadoop-slave1
    10.20.77.174 hadoop-slave2
    10.20.77.175 hadoop-slave3
    
    
    useradd hadoop
    passwd hadoop
    
    Then copy hadoop over from the master machine:
    scp -r /home/hadoop/* hadoop@hadoop-slave1:/home/hadoop/
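
    The same copy repeated for each slave, e.g. with a loop (a sketch assuming identical users and paths on every node; note that * does not match dotfiles, so ~/.bash_profile has to be copied or recreated separately):

    for host in hadoop-slave1 hadoop-slave2 hadoop-slave3; do
        scp -r /home/hadoop/* hadoop@$host:/home/hadoop/
    done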
    

      

    Format the namenode (run the following from /home/hadoop/hadoop-2.7.2):

    ./bin/hdfs namenode -format

    Start HDFS: ./sbin/start-dfs.sh

    The processes now running on the master are: NameNode, SecondaryNameNode.

    On the slaves: DataNode.

    Start YARN: ./sbin/start-yarn.sh

    The processes now running on the master are: NameNode, SecondaryNameNode, ResourceManager.

    On the slaves: DataNode, NodeManager.

    mr-jobhistory-daemon.sh start historyserver

    There is now one more process: JobHistoryServer.

    Check the startup results:
    View cluster status: ./bin/hdfs dfsadmin -report
    View file blocks: ./bin/hdfs fsck / -files -blocks
    View HDFS: http://10.20.77.172:50070
    View the RM: http://10.20.77.172:8088
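
    jps, shipped with the JDK, is the quickest check of which daemons are up on each node; given the layout above you should see roughly:

    jps
    # expected on hadoop-master:
    #   NameNode, SecondaryNameNode, ResourceManager, JobHistoryServer (and Jps itself)
    # expected on each slave:
    #   DataNode, NodeManager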

     Admin ports

    1. HDFS web UI: 50070

    2. YARN web UI: 8088

    3. HistoryServer web UI: 19888

    4. Zookeeper service port: 2181

    5. MySQL service port: 3306

    6. HiveServer: 10000

    7. Kafka service port: 9092

    8. Azkaban web UI: 8443

    9. HBase web UI: 16010 (60010 in versions before HBase 1.0)

    10. Spark web UI: 8080

    11. Spark master URL port: 7077
    Daemon                       Web Interface           Notes
    NameNode                     http://nn_host:port/    Default HTTP port is 50070.
    ResourceManager              http://rm_host:port/    Default HTTP port is 8088.
    MapReduce JobHistory Server  http://jhs_host:port/   Default HTTP port is 19888.
    

      

    References:

    http://jingyan.baidu.com/article/27fa73269c02fe46f9271f45.html

    http://www.powerxing.com/install-hadoop/

    http://www.powerxing.com/install-hadoop-cluster/

    http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/ClusterSetup.html
