  • Hadoop 2.6.5 Cluster Installation

    Installing a Hadoop 2.6.5 cluster:
    1. Cluster layout:
    JacK6: NameNode, ResourceManager
    JacK7: SecondaryNameNode, DataNode, NodeManager
    JacK8: DataNode, NodeManager
    2. Configure passwordless SSH login
    1. Disable SELinux
    su root
    setenforce 0
    vi /etc/selinux/config
    SELINUX=disabled
    2. Set up passwordless SSH: run the following on JacK6, JacK7, and JacK8 (each node also needs passwordless login to itself; pssh is worth exploring for doing this in parallel). A loop-based sketch follows the commands below.
    ssh-keygen -t rsa -P ''
    ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@JacK7
    ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@JacK8
    ssh JacK7
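    A minimal sketch that wraps the same key distribution in a loop, assuming the JacK6/JacK7/JacK8 hostnames and the hadoop user used throughout this guide:
    # Run as the hadoop user on each node; hostnames must resolve (e.g. via /etc/hosts)
    ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
    for host in JacK6 JacK7 JacK8; do
        ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@$host   # includes the node itself
    done
    ssh JacK7 hostname   # should print JacK7 with no password prompt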
    3. System configuration:
    1. Disable the firewall
    service iptables stop
    service iptables status
    chkconfig iptables off
    2. Disable transparent huge pages (THP)
    Check: cat /sys/kernel/mm/redhat_transparent_hugepage/defrag
    [always] madvise never means THP is enabled
    Disable: echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag
    echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
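    These echo commands do not survive a reboot. A common workaround, added here as a sketch rather than something from the original setup, is to re-apply them at boot via /etc/rc.local:
    # As root: append the THP settings so they are re-applied at boot
    cat >> /etc/rc.local <<'EOF'
    echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag
    echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
    EOF
    chmod +x /etc/rc.local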
    3. Tune swappiness
    The Linux kernel parameter vm.swappiness ranges from 0 to 100 and controls how eagerly the system swaps physical memory out to swap space.
    For example, on a system with 64 GB of total memory and vm.swappiness=60, swapping can begin once about 64 * 0.4 = 25.6 GB of memory is in use,
    which inevitably hurts performance. Cloudera therefore recommends setting this value to somewhere between 1 and 10.
    Check: cat /proc/sys/vm/swappiness
    Change:
    Temporary: sysctl -w vm.swappiness=10
    Permanent:
    echo "vm.swappiness=10" >> /etc/sysctl.conf
    4. Raise the maximum number of open files and processes (the two extra files at the end still need further study)
    Check: ulimit -a
    Raise the open-file limit: vi /etc/security/limits.conf
    * soft nofile 65535
    * hard nofile 65535
    * soft nproc 65535
    * hard nproc 65535
    hadoop soft nofile 10240
    hadoop hard nofile 10240
    hadoop soft nproc 10240
    hadoop hard nproc 10240
    These limits take effect after logging back in (or rebooting). The other two files:
    Append to the end of /etc/security/limits.d/90-nproc.conf:
    * soft nproc 204800
    * hard nproc 204800
    Append to the end of /etc/security/limits.d/def.conf:
    * soft nofile 204800
    * hard nofile 204800
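    To confirm the new limits, log out and back in as the hadoop user, then (a quick check, not part of the original):
    ulimit -n   # max open files; expect 10240 for the hadoop user
    ulimit -u   # max user processes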
    5. Disable IPv6: revisit later
    vi /etc/sysconfig/network
    6. Disable file access-time updates: revisit later
    4. Set up a local yum repository: to be built later
    5. NTP configuration: later
    6. Install Java (a minimal sketch follows below)
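    The Java step is left empty above; a minimal sketch, assuming a JDK 8 tarball (the exact file name is hypothetical) and the /usr/software/java_1.8 path that the config files below reference:
    # As root, on every node:
    mkdir -p /usr/software
    tar -xvf /data/tar/jdk-8u-linux-x64.tar.gz -C /usr/software/   # tarball name is an assumption
    mv /usr/software/jdk1.8.0_* /usr/software/java_1.8
    # As the hadoop user, add to ~/.bash_profile:
    export JAVA_HOME=/usr/software/java_1.8
    export PATH=$JAVA_HOME/bin:$PATH
    java -version   # verify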
    7. Install Hadoop
    1. mkdir /data/hadoop/Hadoop_2.6.5
    tar -xvf /data/tar/hadoop-2.6.5.tar.gz -C /data/hadoop/Hadoop_2.6.5/ --strip-components=1
    tar -xvf hadoop-native-64-2.6.0.tar -C /data/hadoop/Hadoop_2.6.5/lib/native
    (--strip-components=1 drops the tarball's top-level hadoop-2.6.5 folder so that HADOOP_HOME below points directly at the install)
    vi ~/.bash_profile
    #Hadoop_2.6.5
    export HADOOP_HOME=/data/hadoop/Hadoop_2.6.5
    export HADOOP_PREFIX=$HADOOP_HOME
    export HADOOP_MAPRED_HOME=${HADOOP_PREFIX}
    export HADOOP_COMMON_HOME=${HADOOP_PREFIX}
    export HADOOP_HDFS_HOME=${HADOOP_PREFIX}
    export YARN_HOME=${HADOOP_PREFIX}
    # Native Path
    export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_PREFIX}/lib/native
    export HADOOP_OPTS="-Djava.library.path=$HADOOP_PREFIX/lib/native"
    export PATH=$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin:$JAVA_HOME/bin:$PATH
    scp ~/.bash_profile JacK7:~/
    scp ~/.bash_profile JacK8:~/
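    After copying, reload the profile on each node and confirm the variables resolve (a quick check, not in the original):
    source ~/.bash_profile
    echo $HADOOP_HOME   # should print /data/hadoop/Hadoop_2.6.5
    hadoop version      # should report Hadoop 2.6.5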
    2. Edit the configuration files:
    cd /data/hadoop/Hadoop_2.6.5/etc/hadoop
    1. vi hadoop-env.sh
    # Set JAVA_HOME explicitly
    export JAVA_HOME=/usr/software/java_1.8
    # Set the log directory explicitly; the default is the logs folder under the install directory
    export HADOOP_LOG_DIR=/data/tmp_data/hadoop_data/logs
    2. vi yarn-env.sh
    export JAVA_HOME=/usr/software/java_1.8
    #if [ "$JAVA_HOME" != "" ]; then
    # #echo "run java in $JAVA_HOME"
    # JAVA_HOME=$JAVA_HOME
    #fi
    #
    #if [ "$JAVA_HOME" = "" ]; then
    # echo "Error: JAVA_HOME is not set."
    # exit 1
    #fi
    3. vi slaves (edit the slaves file on the NameNode and SecondaryNameNode hosts)
    JacK7
    JacK8
    4. vi core-site.xml (configure the core-site file)
    <configuration>
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://JacK6:9000</value>
      </property>
      <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/data/tmp_data/hadoop_data/tmp</value>
        <description>A base for other temporary directories.</description>
      </property>
    </configuration>
    5. vi hdfs-site.xml (configure the SecondaryNameNode, replication, and storage directories)
    <configuration>
      <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>JacK7:50090</value>
      </property>
      <property>
        <name>dfs.replication</name>
        <value>2</value>
      </property>
      <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/data/tmp_data/hadoop_data/name</value>
      </property>
      <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/data/tmp_data/hadoop_data/hdfs</value>
      </property>
    </configuration>
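    The tmp, name, data, and log directories referenced in these files must exist before HDFS is formatted; the original leaves this implicit, so as a sketch:
    # Run as the hadoop user on every node:
    mkdir -p /data/tmp_data/hadoop_data/{tmp,name,hdfs,logs}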
    6. cp mapred-site.xml.template mapred-site.xml
    vi mapred-site.xml
    <configuration>
      <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
      </property>
      <property>
        <name>mapreduce.jobhistory.address</name>
        <value>JacK6:10020</value>
      </property>
      <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>JacK6:19888</value>
      </property>
    </configuration>
    7. vi yarn-site.xml
    <configuration>
      <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>JacK6</value>
      </property>
      <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
      </property>
    </configuration>
    8. Copy to the other nodes:
    scp -r Hadoop_2.6.5/ JacK7:/data/hadoop/
    scp -r Hadoop_2.6.5/ JacK8:/data/hadoop/
    9. Start/stop test:
    1. hdfs namenode -format   (format HDFS)
    Formatting the NameNode is only required once, on the master node before the first start; later starts skip this step:
    2. start-dfs.sh   (run on the master node to start all daemons; check each node with jps)
    start-yarn.sh
    mr-jobhistory-daemon.sh start historyserver
    3. hdfs dfsadmin -report   (on the master node, check whether the cluster's DataNodes are up)
    4. stop-yarn.sh
    stop-dfs.sh
    mr-jobhistory-daemon.sh stop historyserver
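    Once everything is up, a small example job makes a good end-to-end smoke test (a sketch; the jar path matches a stock 2.6.5 layout, and the expected daemons per node follow the layout in step 1):
    # jps should show: JacK6 NameNode/ResourceManager/JobHistoryServer,
    # JacK7 SecondaryNameNode/DataNode/NodeManager, JacK8 DataNode/NodeManager
    hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.5.jar pi 2 10
    # Web UIs: NameNode http://JacK6:50070, ResourceManager http://JacK6:8088, JobHistory http://JacK6:19888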