  • Hadoop 2.7.3 Fully Distributed Cluster Maintenance: Dynamically Adding a DataNode

    Existing environment

    http://www.cnblogs.com/ilifeilong/p/7406944.html

    IP             Host                JDK        Linux               Hadoop        Role
    172.16.101.55  sht-sgmhadoopnn-01  1.8.0_111  CentOS release 6.5  hadoop-2.7.3  NameNode, SecondaryNameNode, ResourceManager
    172.16.101.58  sht-sgmhadoopdn-01  1.8.0_111  CentOS release 6.5  hadoop-2.7.3  DataNode, NodeManager
    172.16.101.59  sht-sgmhadoopdn-02  1.8.0_111  CentOS release 6.5  hadoop-2.7.3  DataNode, NodeManager
    172.16.101.60  sht-sgmhadoopdn-03  1.8.0_111  CentOS release 6.5  hadoop-2.7.3  DataNode, NodeManager
    172.16.101.66  sht-sgmhadoopdn-04  1.8.0_111  CentOS release 6.5  hadoop-2.7.3  DataNode, NodeManager

    We now plan to add one DataNode, sht-sgmhadoopdn-04, to the cluster (the last row of the table above).

    1. Configure the system environment

    Set the hostname, passwordless SSH trust, environment variables, and so on.
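
    The preparation above can be sketched roughly as follows. This is a minimal sketch, not the author's exact procedure: the helper name add_host_entry is hypothetical, and the commented commands assume the hduser account and the /usr/local/hadoop-2.7.3 install path that appear later in this article.

```shell
#!/usr/bin/env bash
# Sketch: prepare name resolution for the new node (helper name is hypothetical).

# add_host_entry <ip> <hostname> [hosts_file]
# Appends "ip hostname" unless the hostname is already listed in the file.
add_host_entry() {
    local ip="$1" host="$2" file="${3:-/etc/hosts}"
    # Look for the hostname as an exact field after the IP column.
    if awk -v h="$host" '{ for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
                         END { exit !found }' "$file" 2>/dev/null; then
        return 0
    fi
    printf '%s %s\n' "$ip" "$host" >> "$file"
}

# Typical follow-up steps, shown as comments because they need real hosts:
#   add_host_entry 172.16.101.66 sht-sgmhadoopdn-04     # as root, on every node
#   ssh-copy-id hduser@sht-sgmhadoopdn-04               # on the NameNode, as hduser
#   export HADOOP_HOME=/usr/local/hadoop-2.7.3          # in hduser's profile on the new node
#   export PATH="$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin"
```

    Calling the helper twice with the same hostname leaves a single entry, so it is safe to re-run on nodes that were already configured.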

    2. On the NameNode, edit the slaves file and add the new node

    $ cat slaves 
    sht-sgmhadoopdn-01
    sht-sgmhadoopdn-02
    sht-sgmhadoopdn-03
    sht-sgmhadoopdn-04
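
    If the file is edited by script rather than by hand, an idempotent append avoids duplicate entries. The sketch below is one possible way to do it; add_slave is a hypothetical helper, and the path in the usage comment assumes the install location used in this article. In Hadoop 2.x the slaves file is read by the cluster start/stop scripts, so a duplicate line would make them contact the same host twice.

```shell
#!/usr/bin/env bash
# add_slave <hostname> <slaves_file>
# Appends the hostname only if no line in the file is exactly that hostname.
add_slave() {
    local host="$1" file="$2"
    # -x: whole-line match, -F: fixed string, so "dn-0" does not match "dn-04".
    grep -qxF -- "$host" "$file" 2>/dev/null || printf '%s\n' "$host" >> "$file"
}

# Example (assumed path):
#   add_slave sht-sgmhadoopdn-04 /usr/local/hadoop-2.7.3/etc/hadoop/slaves
```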

    3. On the NameNode, copy hadoop-2.7.3 to the new node, then on the new node delete the files in the data and logs directories

    $ hostname 
    sht-sgmhadoopnn-01
    $ rsync -az --progress /usr/local/hadoop-2.7.3/* hduser@sht-sgmhadoopdn-04:/usr/local/hadoop-2.7.3/
    
    $ hostname 
    sht-sgmhadoopdn-04
    $ rm -rf /usr/local/hadoop-2.7.3/logs/*
    $ rm -rf /usr/local/hadoop-2.7.3/data/*
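
    The two rm -rf commands work, but a glob after a path is risky if the path is ever built from an empty variable, and it skips dotfiles. A slightly more defensive variant might look like the sketch below; clean_node_dirs is a hypothetical helper, and it assumes (as in this article) that logs and data sit directly under the Hadoop directory.

```shell
#!/usr/bin/env bash
# clean_node_dirs <hadoop_home>
# Empties <hadoop_home>/logs and <hadoop_home>/data but keeps the directories.
clean_node_dirs() {
    # Abort with a usage message if no directory is given, so we never touch "/".
    local home="${1:?usage: clean_node_dirs <hadoop_home>}"
    local sub
    for sub in logs data; do
        # -mindepth 1 keeps the directory itself; -delete also removes dotfiles.
        [ -d "$home/$sub" ] && find "$home/$sub" -mindepth 1 -delete
    done
    return 0
}

# Example, run on the new node:
#   clean_node_dirs /usr/local/hadoop-2.7.3
```

    An alternative is to skip the copy of those files in the first place, for example by passing --exclude='/logs/*' --exclude='/data/*' to rsync with a trailing slash on the source directory instead of the * glob.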

    4. Start the DataNode process on the new node

    $ hadoop-daemon.sh start datanode
    starting datanode, logging to /usr/local/hadoop-2.7.3/logs/hadoop-hduser-datanode-sht-sgmhadoopdn-04.out
    $ jps
    31875 Jps
    31821 DataNode

    5. On the NameNode, check the current cluster status and confirm that the new node has joined

    5.1 From the command line

    $ hdfs dfsadmin -report
    Configured Capacity: 303324561408 (282.49 GB)
    Present Capacity: 83729309696 (77.98 GB)
    DFS Remaining: 83081265152 (77.38 GB)
    DFS Used: 648044544 (618.02 MB)
    DFS Used%: 0.77%
    Under replicated blocks: 0
    Blocks with corrupt replicas: 0
    Missing blocks: 0
    Missing blocks (with replication factor 1): 0
    
    -------------------------------------------------
    Live datanodes (4):
    
    Name: 172.16.101.66:50010 (sht-sgmhadoopdn-04)
    Hostname: sht-sgmhadoopdn-04
    Decommission Status : Normal
    Configured Capacity: 75831140352 (70.62 GB)
    DFS Used: 24576 (24 KB)
    Non DFS Used: 35573932032 (33.13 GB)
    DFS Remaining: 40257183744 (37.49 GB)
    DFS Used%: 0.00%
    DFS Remaining%: 53.09%
    Configured Cache Capacity: 0 (0 B)
    Cache Used: 0 (0 B)
    Cache Remaining: 0 (0 B)
    Cache Used%: 100.00%
    Cache Remaining%: 0.00%
    Xceivers: 1
    Last contact: Fri Sep 01 22:50:16 CST 2017
    
    
    Name: 172.16.101.60:50010 (sht-sgmhadoopdn-03)
    Hostname: sht-sgmhadoopdn-03
    Decommission Status : Normal
    Configured Capacity: 75831140352 (70.62 GB)
    DFS Used: 216006656 (206 MB)
    Non DFS Used: 61714608128 (57.48 GB)
    DFS Remaining: 13900525568 (12.95 GB)
    DFS Used%: 0.28%
    DFS Remaining%: 18.33%
    Configured Cache Capacity: 0 (0 B)
    Cache Used: 0 (0 B)
    Cache Remaining: 0 (0 B)
    Cache Used%: 100.00%
    Cache Remaining%: 0.00%
    Xceivers: 1
    Last contact: Fri Sep 01 22:50:15 CST 2017
    
    
    Name: 172.16.101.59:50010 (sht-sgmhadoopdn-02)
    Hostname: sht-sgmhadoopdn-02
    Decommission Status : Normal
    Configured Capacity: 75831140352 (70.62 GB)
    DFS Used: 216006656 (206 MB)
    Non DFS Used: 62057410560 (57.80 GB)
    DFS Remaining: 13557723136 (12.63 GB)
    DFS Used%: 0.28%
    DFS Remaining%: 17.88%
    Configured Cache Capacity: 0 (0 B)
    Cache Used: 0 (0 B)
    Cache Remaining: 0 (0 B)
    Cache Used%: 100.00%
    Cache Remaining%: 0.00%
    Xceivers: 1
    Last contact: Fri Sep 01 22:50:14 CST 2017
    
    
    Name: 172.16.101.58:50010 (sht-sgmhadoopdn-01)
    Hostname: sht-sgmhadoopdn-01
    Decommission Status : Normal
    Configured Capacity: 75831140352 (70.62 GB)
    DFS Used: 216006656 (206 MB)
    Non DFS Used: 60249300992 (56.11 GB)
    DFS Remaining: 15365832704 (14.31 GB)
    DFS Used%: 0.28%
    DFS Remaining%: 20.26%
    Configured Cache Capacity: 0 (0 B)
    Cache Used: 0 (0 B)
    Cache Remaining: 0 (0 B)
    Cache Used%: 100.00%
    Cache Remaining%: 0.00%
    Xceivers: 1
    Last contact: Fri Sep 01 22:50:15 CST 2017
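
    Reading the full report by eye gets tedious as the cluster grows. As a rough sketch, the check can be scripted by scanning the report for the new hostname inside the "Live datanodes" section; report_has_live_node is a hypothetical helper and relies only on the "Live datanodes", "Dead datanodes" and "Hostname:" lines that the 2.7.3 report prints.

```shell
#!/usr/bin/env bash
# report_has_live_node <hostname>
# Reads `hdfs dfsadmin -report` output on stdin; succeeds if the host is live.
report_has_live_node() {
    awk -v h="$1" '
        $1 == "Live" && $2 == "datanodes" { live = 1 }   # entering the live section
        $1 == "Dead" && $2 == "datanodes" { live = 0 }   # entering the dead section
        live && $1 == "Hostname:" && $2 == h { found = 1 }
        END { exit !found }
    '
}

# Example, run on the NameNode:
#   hdfs dfsadmin -report | report_has_live_node sht-sgmhadoopdn-04 && echo "joined"
```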

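    5.2 From the web UI

    Open the NameNode web UI (port 50070 by default in Hadoop 2.x) and check that the new node is listed on the Datanodes page.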

    6. On the NameNode, configure HDFS balancing

    $ hdfs dfsadmin -setBalancerBandwidth 67108864
    Balancer bandwidth is set to 67108864
    $ start-balancer.sh -threshold 5
    starting balancer, logging to /usr/local/hadoop-2.7.3/logs/hadoop-hduser-balancer-sht-sgmhadoopnn-01.out
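
    The two numbers above are easier to adjust once their units are clear: -setBalancerBandwidth takes bytes per second for each DataNode, so 67108864 is 64 MiB/s, and -threshold 5 asks the balancer to bring every DataNode's utilization to within 5 percentage points of the cluster average. The sketch below just derives the value and prints the commands; mib_to_bytes is a hypothetical helper.

```shell
#!/usr/bin/env bash
# mib_to_bytes <MiB>: convert MiB to the byte count -setBalancerBandwidth expects.
mib_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

bandwidth=$(mib_to_bytes 64)   # 64 MiB/s per DataNode
threshold=5                    # allowed deviation from average utilization, in percent

# Print the commands instead of running them, so the sketch is safe to try anywhere.
echo "hdfs dfsadmin -setBalancerBandwidth $bandwidth"
echo "start-balancer.sh -threshold $threshold"
```

    The bandwidth set this way is not persistent: it applies to the running DataNodes and is lost when they restart.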

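    7. Check the HDFS load on each node (when the nodes hold little data, the change may be too small to notice; uploading a large file is a simple way to test)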
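
    One way to compare nodes before and after balancing is to pull just the per-node "DFS Used%" figures out of the report. The sketch below does that; dfs_used_by_node is a hypothetical helper that relies on the "Hostname:" and "DFS Used%:" lines shown in step 5.1. In that report every node is below 1% used, so with a 5% threshold the balancer will probably find nothing to move, which is why a large test upload helps.

```shell
#!/usr/bin/env bash
# dfs_used_by_node
# Reads `hdfs dfsadmin -report` output on stdin and prints "<hostname> <DFS Used%>"
# for each DataNode. The cluster-wide "DFS Used%" line is skipped because no
# hostname has been seen yet when it appears.
dfs_used_by_node() {
    awk '
        $1 == "Hostname:" { host = $2 }
        $1 == "DFS" && $2 == "Used%:" && host != "" { print host, $3; host = "" }
    '
}

# Example, run on the NameNode:
#   hdfs dfs -put <large_local_file> <hdfs_dir>     # optional: create data to balance
#   hdfs dfsadmin -report | dfs_used_by_node
```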

    8. Start the NodeManager process on the new node

    $ yarn-daemon.sh start nodemanager
    starting nodemanager, logging to /usr/local/hadoop-2.7.3/logs/yarn-hduser-nodemanager-sht-sgmhadoopdn-04.out
    $ jps
    32562 NodeManager
    32599 Jps
    31821 DataNode
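
    The jps check above can also be scripted, which is handy when adding several nodes. A minimal sketch, assuming the "<pid> <name>" layout that jps prints here; has_daemons is a hypothetical helper.

```shell
#!/usr/bin/env bash
# has_daemons <name>...
# Reads `jps` output on stdin; succeeds only if every named JVM process is listed.
has_daemons() {
    local listing name
    listing=$(cat)
    for name in "$@"; do
        # The process name is the second field of each jps line.
        printf '%s\n' "$listing" |
            awk -v n="$name" '$2 == n { ok = 1 } END { exit !ok }' || {
                echo "missing: $name" >&2
                return 1
            }
    done
}

# Example, run on the new node:
#   jps | has_daemons DataNode NodeManager && echo "both daemons are running"
```

    On the ResourceManager side, yarn node -list should then show the new node as well.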

     

  • Original article: https://www.cnblogs.com/ilifeilong/p/7465397.html