  • Hadoop: Adding and Removing Data Nodes (datanode)

    Prerequisites:

    Install the JDK and other prerequisites on the new machine; ideally make its environment identical to the existing nodes, and adjust the examples below accordingly.

    Goal:

    Add a new data node to an existing Hadoop cluster.

    1. Create the directory and user

    mkdir -p /app/hadoop

    groupadd hadoop

    useradd licz -g hadoop -d /app/hadoop

    chown licz:hadoop /app/hadoop

    passwd licz

    Note: if you run into the following problem when switching to the new user

    [root@dbserver22 ~]# su - licz
    -bash-3.2$ 

    Fix:

    cp -a /etc/skel/. /app/hadoop 
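
    If the copy above is run as root, the dotfiles placed in /app/hadoop end up owned by root; re-applying ownership (a small extra step, not in the original write-up) keeps the licz account usable:

    # Re-apply ownership of the copied skeleton files (run as root).
    chown -R licz:hadoop /app/hadoop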

    2. Set environment variables

    [licz@server123 ~]$ vi .bash_profile

     

    PATH=$PATH:$HOME/bin

    export LANG=zh_CN

    export PATH

    unset USERNAME

     

    export HADOOP_HOME=/app/hadoop/hadoop-1.2.1

    export JAVA_HOME=/usr/java/jdk1.6.0_18

    export HIVE_HOME=/app/hadoop/hive-0.11.0

     

    export PIG_HOME=/app/hadoop/pig-0.12.0

    export PIG_CLASSPATH=/app/hadoop/pig-0.12.0/conf

     

    PATH=$JAVA_HOME/bin:$PATH:$HOME/bin:$HADOOP_HOME/bin:$PIG_HOME/bin:$HIVE_HOME/bin

     

    export PATH

     

    export HADOOP_HOME_WARN_SUPPRESS=1
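
    The new variables only take effect in a fresh login shell; an optional way to apply and sanity-check them in the current session is:

    [licz@server123 ~]$ source ~/.bash_profile
    [licz@server123 ~]$ echo $HADOOP_HOME
    /app/hadoop/hadoop-1.2.1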

    3. Edit the hosts file and add the new server

    [root@server123 ~]# vi /etc/hosts

    10.1.32.91             nticket1

    10.1.32.93             nticket2

    10.1.32.95             nticket3

    10.1.5.123             server123

    Likewise, add the new server123 entry on every other node; one way to do that in a single pass is sketched below.
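
    A possible loop for appending the new entry on the existing nodes (it assumes root can ssh to those hosts; adjust for your environment):

    # Append the new node's /etc/hosts entry on each existing node.
    for h in nticket1 nticket2 nticket3; do
        ssh root@$h "echo '10.1.5.123             server123' >> /etc/hosts"
    done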

    4. Set up passwordless SSH

    The steps are:

    -> generate a key pair on the new node

    -> append the existing cluster's authorized keys to the new node's authorized_keys

    -> copy (overwrite) the merged authorized_keys from the new node back to the nodes of the original cluster

    -- First, to guard against mistakes, back up the original cluster's key file before starting

    [licz@nticket1 .ssh]$ cp authorized_keys authorized_keys.bak

     

    [licz@server123 ~]$ ssh-keygen -t rsa

    [licz@server123 ~]$ cat ~/.ssh/id_rsa.pub >>~/.ssh/authorized_keys

     

    [licz@server123 ~]$ ssh nticket1 cat ~/.ssh/authorized_keys >> ~/.ssh/authorized_keys

     

    [licz@server123 ~]$ scp ~/.ssh/authorized_keys nticket1:~/.ssh/authorized_keys

    [licz@server123 ~]$ ssh nticket1 date

    Wed Feb 12 11:31:08 CST 2014

    [licz@nticket1 .ssh]$ ssh server123 date

    Wed Feb 12 11:25:57 CST 2014

    -- Likewise, copy (overwrite) the merged key file to the remaining nodes of the original cluster

    [licz@server123 ~]$ scp ~/.ssh/authorized_keys nticket2:~/.ssh/authorized_keys

    [licz@server123 ~]$ scp ~/.ssh/authorized_keys nticket3:~/.ssh/authorized_keys
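
    A quick convenience check (not in the original steps) that passwordless SSH now works from the new node to every existing node:

    # Each command should print the remote hostname without asking for a password.
    [licz@server123 ~]$ for h in nticket1 nticket2 nticket3; do ssh $h hostname; done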

    5. Update the Hadoop configuration

    -- Edit the slaves file on each node of the cluster (a sketch for pushing the updated file to the other nodes follows the listing):

    [licz@nticket1 conf]$ vi slaves

    nticket2

    nticket3

    server123
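
    A sketch for distributing the updated slaves file from nticket1 to the other existing nodes (it assumes the same $HADOOP_HOME layout on every host; the new node receives a full copy of conf in step 6):

    # Push the edited slaves file to the remaining nodes.
    [licz@nticket1 conf]$ for h in nticket2 nticket3; do scp slaves $h:$HADOOP_HOME/conf/; done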

    6. Install Hadoop on the new node

    -- Copy the Hadoop installation from an existing node to the new node

    [licz@nticket2 ~]$ scp -r hadoop-1.2.1/ server123:/app/hadoop
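
    An optional sanity check that the copy landed where the new node's environment variables expect it:

    [licz@nticket2 ~]$ ssh server123 ls /app/hadoop/hadoop-1.2.1/bin/hadoop
    /app/hadoop/hadoop-1.2.1/bin/hadoop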

        

    7. Start the DataNode and TaskTracker on the new node

    [licz@server123 ~]$ hadoop-daemon.sh start datanode

    starting datanode, logging to /app/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-licz-datanode-server123.out

    [licz@server123 ~]$ hadoop-daemon.sh start tasktracker

    starting tasktracker, logging to /app/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-licz-tasktracker-server123.out

     

    -- Verify that the daemons started

    [licz@server123 ~]$ jps

    18356 DataNode

    18517 TaskTracker

    18780 Jps

     

    8. Rebalance the blocks

    -- In hdfs-site.xml, raise the bandwidth available to the balancer; the default is only 1 MB/s:

    <property>
        <name>dfs.balance.bandwidthPerSec</name>
        <value>10485760</value>
        <description>
            Specifies the maximum bandwidth that each datanode can utilize
            for the balancing purpose in term of the number of bytes per second.
        </description>
    </property>

    Run the following command:

    [licz@server123 conf]$ start-balancer.sh -threshold 5

    starting balancer, logging to /app/hadoop/hadoop-1.2.1/libexec/../logs/hadoop-licz-balancer-server123.out

    Time Stamp               Iteration#  Bytes Already Moved  Bytes Left To Move  Bytes Being Moved
    2014-2-20 17:55:14                0                 0 KB            14.12 GB           14.06 GB

    -- Verify

    [licz@server123 ~]$ hadoop dfs -ls /user/hive

    Found 1 items

    drwxr-xr-x   - licz supergroup          0 2014-02-10 11:25 /user/hive/warehouse

     

    [licz@nticket1 ~]$ hadoop dfsadmin -report
    Configured Capacity: 2588293705728 (2.35 TB)
    Present Capacity: 2027166097408 (1.84 TB)
    DFS Remaining: 2026681536512 (1.84 TB)
    DFS Used: 484560896 (462.11 MB)
    DFS Used%: 0.02%
    Under replicated blocks: 9
    Blocks with corrupt replicas: 0
    Missing blocks: 0


    -------------------------------------------------
    Datanodes available: 3 (3 total, 0 dead)


    Name: 10.1.32.95:50010
    Decommission Status : Normal
    Configured Capacity: 1041225043968 (969.72 GB)
    DFS Used: 242110464 (230.89 MB)
    Non DFS Used: 102109831168 (95.1 GB)
    DFS Remaining: 938873102336(874.39 GB)
    DFS Used%: 0.02%
    DFS Remaining%: 90.17%
    Last contact: Fri Feb 14 09:49:02 CST 2014




    Name: 10.1.32.93:50010
    Decommission Status : Normal
    Configured Capacity: 1041225043968 (969.72 GB)
    DFS Used: 242143232 (230.93 MB)
    Non DFS Used: 57774628864 (53.81 GB)
    DFS Remaining: 983208271872(915.68 GB)
    DFS Used%: 0.02%
    DFS Remaining%: 94.43%
    Last contact: Fri Feb 14 09:49:02 CST 2014




    Name: 10.1.5.123:50010
    Decommission Status : Normal
    Configured Capacity: 505843617792 (471.1 GB)
    DFS Used: 307200 (300 KB)
    Non DFS Used: 401243148288 (373.69 GB)
    DFS Remaining: 104600162304(97.42 GB)
    DFS Used%: 0%
    DFS Remaining%: 20.68%
    Last contact: Fri Feb 14 09:49:03 CST 2014

    Reference: http://blog.csdn.net/lichangzai/article/details/19118711

    Adding a node


    1. Edit the hosts file
       Same as for an ordinary datanode: add the namenode's IP


    2. Edit conf/slaves on the namenode
       Add the new node's IP or hostname


    3. Start the services on the new node

    [root@slave-004 hadoop]# ./bin/hadoop-daemon.sh start datanode
    [root@slave-004 hadoop]# ./bin/hadoop-daemon.sh start tasktracker

    4. Rebalance the blocks

    [root@slave-004 hadoop]# ./bin/start-balancer.sh


    1) Without rebalancing, the cluster stores most new data on the new node, which hurts MapReduce efficiency.
    2) Set the balance threshold; the default is 10%. A lower value gives a more even distribution across nodes but takes longer.

    [root@slave-004 hadoop]# ./bin/start-balancer.sh -threshold 5

    3) Set the balancer bandwidth; the default is only 1 MB/s

    <property>
      <name>dfs.balance.bandwidthPerSec</name>
      <value>1048576</value>
      <description>
        Specifies the maximum amount of bandwidth that each datanode
        can utilize for the balancing purpose in term of
        the number of bytes per second.
      </description>
    </property>



    Note:
    1. Make sure the firewall on each slave is turned off (a sketch for checking this follows);
    2. Make sure the new slave's IP has been added to /etc/hosts on the master and all other slaves, and, conversely, that the master's and the other slaves' IPs have been added to the new slave's /etc/hosts.
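
    On the RHEL/CentOS-era systems used in this walkthrough, the firewall can be checked and disabled roughly as follows (commands vary by distribution, so treat this as a sketch):

    # Check whether iptables is running, stop it, and keep it off across reboots.
    [root@slave-004 hadoop]# service iptables status
    [root@slave-004 hadoop]# service iptables stop
    [root@slave-004 hadoop]# chkconfig iptables off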

    Removing a node

     

    1. Cluster configuration
       Edit the conf/hdfs-site.xml file

    <property>
      <name>dfs.hosts.exclude</name>
      <value>/data/soft/hadoop/conf/excludes</value>
      <description>Names a file that contains a list of hosts that are
      not permitted to connect to the namenode. The full pathname of the
      file must be specified. If the value is empty, no hosts are
      excluded.</description>
    </property>


    2. Decide which machines to decommission
    The file referenced by dfs.hosts.exclude lists the machines to take offline, one per line; this prevents them from connecting to the NameNode. For example:

    slave-003  
    slave-004
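
    One way to create that excludes file with the contents shown above (the path matches the dfs.hosts.exclude value from step 1):

    # Write the list of hosts to decommission, one per line.
    [root@master hadoop]# printf 'slave-003\nslave-004\n' > /data/soft/hadoop/conf/excludes
    [root@master hadoop]# cat /data/soft/hadoop/conf/excludes
    slave-003
    slave-004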


    3. Force the configuration to be reloaded

    [root@master hadoop]# ./bin/hadoop dfsadmin  -refreshNodes  

    The NameNode will then move the blocks off the excluded nodes in the background.


    4. Shut down the nodes
    Once the operation above has finished, the machines being decommissioned can be shut down safely.

    [root@master hadoop]# ./bin/hadoop dfsadmin -report

    shows the nodes currently connected to the cluster.

    While decommissioning is still running, the report shows:
    Decommission Status : Decommission in progress

    Once it has finished, it shows:
    Decommission Status : Decommissioned
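
    A convenient way to watch just the decommission progress is to filter the report (re-run until every excluded host reads "Decommissioned"):

    # Prints one status line per datanode.
    [root@master hadoop]# ./bin/hadoop dfsadmin -report | grep "Decommission Status"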


    5. Edit the excludes file again
    Once the machines have been decommissioned, they can be removed from the excludes file.
    If you log in to a decommissioned machine, you will find that the DataNode process is gone but the TaskTracker is still running; it has to be dealt with by hand, as shown below.
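
    A sketch of the manual cleanup, using the same daemon script that started the service:

    # Run on the decommissioned machine to stop the leftover TaskTracker, then confirm with jps.
    [root@slave-004 hadoop]# ./bin/hadoop-daemon.sh stop tasktracker
    [root@slave-004 hadoop]# jps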

    Reference: http://www.cnblogs.com/rilley/archive/2012/02/13/2349858.html

  • Original article: https://www.cnblogs.com/xd502djj/p/4422982.html