    Installing Hadoop in Test Cluster Mode

    1. Cluster Architecture

     

    Three CentOS virtual machines, server1, server2, and server3, are installed in VMware. server1 serves as the Hadoop cluster's NameNode and JobTracker; server2 and server3 serve as DataNodes and TaskTrackers. For simplicity, DNS and NFS are also installed on server1.

     

    2. Installing DNS

     

    Install bind with yum:

    [root@server1 admin]#  yum install bind*

     

    After the installation completes, verify:

    [root@server1 admin]#  rpm -qa | grep  '^bind'

    bind-dyndb-ldap-1.1.0-0.9.b1.el6_3.1.x86_64

    bind-chroot-9.8.2-0.10.rc1.el6_3.6.x86_64

    bind-libs-9.8.2-0.10.rc1.el6_3.6.x86_64

    bind-sdb-9.8.2-0.10.rc1.el6_3.6.x86_64

    bind-utils-9.8.2-0.10.rc1.el6_3.6.x86_64

    bind-devel-9.8.2-0.10.rc1.el6_3.6.x86_64

    bind-9.8.2-0.10.rc1.el6_3.6.x86_64

     

    All required packages are present.

     

    Edit the configuration files

    In /etc/named.conf, change 127.0.0.1 and localhost to any:

    [root@server1 etc]# vim named.conf

     

    options {

            listen-on port 53 { any; };

            listen-on-v6 port 53 { ::1; };

            directory       "/var/named";

            dump-file       "/var/named/data/cache_dump.db";

            statistics-file "/var/named/data/named_stats.txt";

            memstatistics-file "/var/named/data/named_mem_stats.txt";

            allow-query     { any; };

            recursion yes;

     

            dnssec-enable yes;

            dnssec-validation yes;

            dnssec-lookaside auto;

     

            

            bindkeys-file "/etc/named.iscdlv.key";

            managed-keys-directory "/var/named/dynamic";

    };

     

    Add the following to /etc/named.rfc1912.zones:

    zone "myhadoop.com" IN {

            type master;

            file "myhadoop.com.zone";       

            allow-update { none; };

    };

    zone "1.168.192.in-addr.arpa" IN {

            type master;

            file "1.168.192.in-addr.zone";

            allow-update { none; };

    };

     

    In the directory /var/named, create the files myhadoop.com.zone and 1.168.192.in-addr.zone.

     

    Edit myhadoop.com.zone:

     

    $TTL 86400

    @     IN SOA     server1.myhadoop.com. chizk.root.myhadoop.com. (

                                                  0       ; serial (d.adams)

                                                   1D    ; refresh

                                                   1H    ; retry

                                                   1W   ; expire

                                                   3H )  ; minimum

    @     IN NS  server1.myhadoop.com.

    server1.myhadoop.com.       IN A 192.168.1.201

    server2.myhadoop.com.   IN A 192.168.1.202

    server3.myhadoop.com.   IN A 192.168.1.203

     

    Edit 1.168.192.in-addr.zone:

    $TTL 86400

    @     IN SOA  server1.myhadoop.com. chizk.root.myhadoop.com. (

                                            0       ; serial

                                            1D      ; refresh

                                            1H      ; retry

                                            1W      ; expire

                                            3H )    ; minimum

    @       IN NS  server1.myhadoop.com.

    201     IN PTR server1.myhadoop.com.   

    202     IN PTR server2.myhadoop.com.   

    203     IN PTR server3.myhadoop.com.
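To keep the forward and reverse zones consistent, both record sets can be generated from a single host list. A minimal sketch, assuming a hypothetical hosts.txt listing host names and final octets (both file names and the awk output format are illustrative, not from the original setup):

```shell
# Hypothetical helper: derive matching A and PTR records from one host list
# so the forward and reverse zones cannot drift apart.
cat > hosts.txt <<'EOF'
server1 201
server2 202
server3 203
EOF

# Forward zone: A records mapping names to 192.168.1.x addresses.
awk '{printf "%s.myhadoop.com. IN A 192.168.1.%s\n", $1, $2}' hosts.txt > a-records.txt

# Reverse zone: PTR records mapping the final octet back to the name.
awk '{printf "%s IN PTR %s.myhadoop.com.\n", $2, $1}' hosts.txt > ptr-records.txt

cat a-records.txt ptr-records.txt
```

The generated lines can then be pasted into the two zone files.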

    Change the owner of these two files:

     

    [root@server1 named]# chown root.named myhadoop.com.zone

    [root@server1 named]# chown root.named 1.168.192.in-addr.zone

     

    Add the following line to /etc/resolv.conf:

    nameserver 192.168.1.201

     

    Modify /etc/resolv.conf on server2 and server3 in the same way.

     

    Start the DNS service:

    [root@server1 named]# service named start

    Starting named:                                            [  OK  ]

     

    Enable it at boot:

    [root@server1 admin]# chkconfig named on

     

    Test DNS lookups:

    [root@server1 admin]# nslookup server1.myhadoop.com

    Server:               192.168.1.201

    Address:  192.168.1.201#53

     

    Name:      server1.myhadoop.com

    Address: 192.168.1.201

     

    [root@server1 admin]# nslookup server2.myhadoop.com

    Server:               192.168.1.201

    Address:  192.168.1.201#53

     

    Name:      server2.myhadoop.com

    Address: 192.168.1.202

     

    [root@server1 admin]# nslookup server3.myhadoop.com

    Server:               192.168.1.201

    Address:  192.168.1.201#53

     

    Name:      server3.myhadoop.com

    Address: 192.168.1.203

     

    All lookups succeed; the same tests from server2 and server3 also succeed.

     

    3. Installing NFS

     

    Check whether the NFS and rpcbind packages are installed:

     

    [root@server1 admin]# rpm -qa | grep nfs

    nfs4-acl-tools-0.3.3-5.el6.x86_64

    nfs-utils-1.2.2-7.el6.x86_64

    nfs-utils-lib-1.1.5-1.el6.x86_64

    [root@server1 admin]# rpm -qa | grep rpcbind

    rpcbind-0.2.0-8.el6.x86_64

     

    They are all installed; if not, install them with yum.

     

    Add the following to /etc/exports:

    /home/admin *(sync,rw)
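Exporting to * with rw allows any host to mount the share read-write. A tighter alternative (an assumption here, based on the 192.168.1.0/24 addresses used in this cluster, not from the original setup) limits the export to the cluster subnet:

```
/home/admin 192.168.1.0/24(sync,rw)
```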

     

    Start NFS:

    [root@server1 admin]# service nfs start

    Starting NFS services:                                     [  OK  ]

    Starting NFS quotas:                                       [  OK  ]

    Starting NFS daemon:                                       [  OK  ]

    Starting NFS mountd:                                       [  OK  ]

     

    Enable it at boot:

    [root@server1 admin]# chkconfig nfs on

     

    Start rpcbind:

    [root@server1 admin]# service rpcbind start

    Starting rpcbind:                                          [  OK  ]

     

    Enable it at boot:

    [root@server1 admin]# chkconfig rpcbind on

     

    List the exported mount points:

    [root@server1 admin]# showmount -e localhost

    Export list for localhost:

    /home/admin *

     

    Change the permissions of /home/admin; for convenience, set them to 777:

    [root@server1 home]# chmod 777 /home/admin

     

    On server2, mount server1's /home/admin:

     

    [root@server2 home]# mount server1.myhadoop.com:/home/admin/ /home/admin_share/

     

    Test access:

    [root@server2 home]# cd admin_share/

    [root@server2 admin_share]# cat test.txt

    aaaa,111

    bbbb,222

    cccc,333

    dddd,444

    Access succeeds.

     

    Edit /etc/fstab on server2 to mount the share automatically, appending the following line:

    server1.myhadoop.com:/home/admin   /home/admin_share   nfs  defaults 1 1
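The six fstab fields are source, mount point, filesystem type, mount options, dump flag, and fsck order. For an NFS mount, a commonly used variant (an assumption on my part, not from the original setup) sets the last two fields to 0, since dump and boot-time fsck do not apply to NFS:

```
server1.myhadoop.com:/home/admin   /home/admin_share   nfs  defaults 0 0
```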

     

    Likewise, mount server1's /home/admin on server3 and test it.

     

     

    4. Sharing Key Files

     

    Generate SSH login keys for the admin user on server1, server2, and server3:

    [admin@server1 ~]$ ssh-keygen -t rsa

    Generating public/private rsa key pair.

    Enter file in which to save the key (/home/admin/.ssh/id_rsa):

    /home/admin/.ssh/id_rsa already exists.

    Overwrite (y/n)? y

    Enter passphrase (empty for no passphrase):

    Enter same passphrase again:

    Your identification has been saved in /home/admin/.ssh/id_rsa.

    Your public key has been saved in /home/admin/.ssh/id_rsa.pub.

    The key fingerprint is:

    46:56:64:8f:83:13:e0:f3:17:cb:b9:7d:d5:fc:9f:52 admin@server1

    The key's randomart image is:

    +--[ RSA 2048]----+

    |      ....+      |

    |     .   = o     |

    |      o = + .    |

    |       = o =   ..|

    |        S =     +|

    |       . . o   E.|

    |          . . o .|

    |             o  o|

    |              ...|

    +-----------------+

    [admin@server2 ~]$ ssh-keygen -t rsa

    [admin@server3 ~]$ ssh-keygen -t rsa

     

    On server1, copy id_rsa.pub to authorized_keys:

    [admin@server1 ~]$ cp .ssh/id_rsa.pub .ssh/authorized_keys

     

    On server2 and server3, create local symlinks to the shared key file:

    [admin@server2 ~]$ ln -s /home/admin_share/.ssh/authorized_keys ~/.ssh/authorized_keys

    [admin@server3 ~]$ ln -s /home/admin_share/.ssh/authorized_keys ~/.ssh/authorized_keys

     

    Append server2's and server3's public keys to authorized_keys:

    [admin@server2 ~]$ cat .ssh/id_rsa.pub >> .ssh/authorized_keys

    [admin@server3 ~]$ cat .ssh/id_rsa.pub >> .ssh/authorized_keys
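Note that sshd is strict about key-file permissions: if ~/.ssh or authorized_keys is writable by anyone but the owner, public-key login is typically refused without an obvious error. A minimal sketch of the expected modes, demonstrated on a stand-in directory (demo/ is hypothetical; on the servers the path would be /home/admin/.ssh):

```shell
# sshd generally requires ~/.ssh to be 700 and authorized_keys to be 600.
# demo/ stands in for the admin home directory on the servers.
mkdir -p demo/.ssh
touch demo/.ssh/authorized_keys
chmod 700 demo/.ssh
chmod 600 demo/.ssh/authorized_keys
stat -c '%a %n' demo/.ssh demo/.ssh/authorized_keys
```

With the shared authorized_keys on NFS, the modes on server1's copy are what matter, since the symlinks on server2 and server3 only point at it.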

     

    Test the configuration:

    [admin@server1 ~]$ ssh server1.myhadoop.com

    The authenticity of host 'server1.myhadoop.com (192.168.1.201)' can't be established.

    RSA key fingerprint is a9:f3:7f:55:56:3a:a7:d7:9e:23:1e:86:a5:eb:90:dc.

    Are you sure you want to continue connecting (yes/no)? yes

    Warning: Permanently added 'server1.myhadoop.com,192.168.1.201' (RSA) to the list of known hosts.

    Last login: Sun Jan 27 10:02:12 2013 from server1

     

    Test the other machines in the same way; all logins succeed.

     

     

    5. Installing Hadoop

    On server1, configure Hadoop's core-site.xml as follows:

     

    <?xml version="1.0"?>

    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

     

    <!-- Put site-specific property overrides in this file. -->

     

    <configuration>

       <property>

          <name>fs.default.name</name>

          <value>hdfs://server1.myhadoop.com:9000</value>

       </property>

    </configuration>

     

    Configure mapred-site.xml:

     

    <?xml version="1.0"?>

    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

     

    <!-- Put site-specific property overrides in this file. -->

     

    <configuration>

      <property>

        <name>mapred.job.tracker</name>

        <value>server1.myhadoop.com:9001</value>

      </property>

     <property>

         <name>mapred.tasktracker.map.tasks.maximum</name>

         <value>50</value>

     </property>

    <property>

         <name>mapred.tasktracker.reduce.tasks.maximum</name>

         <value>50</value>

     </property>

    </configuration>

     

    Configure the masters file:

    server1.myhadoop.com

     

    Configure the slaves file:

    server2.myhadoop.com

    server3.myhadoop.com

     

    Create a text file serverlist.txt containing the domain names of all machines that Hadoop should be distributed to, in this case server2 and server3:

    [admin@server1 ~]$ cat serverlist.txt

    server2.myhadoop.com

    server3.myhadoop.com

     

    Generate a shell script that distributes Hadoop:

    [admin@server1 ~]$ cat serverlist.txt | awk '{print "scp -rp /home/admin/hadoop-0.20.2/ admin@"$1":/home/admin/"}' > distributeHadoop.sh

     

    Its contents are:

    [admin@server1 ~]$ cat ./distributeHadoop.sh

    scp -rp /home/admin/hadoop-0.20.2/ admin@server2.myhadoop.com:/home/admin/

    scp -rp /home/admin/hadoop-0.20.2/ admin@server3.myhadoop.com:/home/admin/
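The same distribution can also be done with a plain loop instead of a generated script; a sketch, shown here as a dry run with echo so nothing is actually copied (the commands.txt file name is illustrative):

```shell
# Recreate the host list, then print the scp command for each host.
# Drop the leading echo to perform the real copy.
cat > serverlist.txt <<'EOF'
server2.myhadoop.com
server3.myhadoop.com
EOF

while read -r host; do
    echo scp -rp /home/admin/hadoop-0.20.2/ "admin@${host}:/home/admin/"
done < serverlist.txt > commands.txt

cat commands.txt
```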

     

    Run the script (the redirected output is not executable, so make it so first):

    [admin@server1 ~]$ chmod +x distributeHadoop.sh
    [admin@server1 ~]$ ./distributeHadoop.sh

     

    Check server2 and server3; the copy succeeded.

     

    Format the NameNode:

    [admin@server1 logs]$ hadoop namenode -format

     

    Start Hadoop:

    [admin@server1 ~]$ ./hadoop-0.20.2/bin/start-all.sh

     

    starting namenode, logging to /home/admin/hadoop-0.20.2/bin/../logs/hadoop-admin-namenode-server1.out

    server2.myhadoop.com: starting datanode, logging to /home/admin/hadoop-0.20.2/bin/../logs/hadoop-admin-datanode-server2.out

    server3.myhadoop.com: starting datanode, logging to /home/admin/hadoop-0.20.2/bin/../logs/hadoop-admin-datanode-server3.out

    server1.myhadoop.com: starting secondarynamenode, logging to /home/admin/hadoop-0.20.2/bin/../logs/hadoop-admin-secondarynamenode-server1.out

    starting jobtracker, logging to /home/admin/hadoop-0.20.2/bin/../logs/hadoop-admin-jobtracker-server1.out

    server2.myhadoop.com: starting tasktracker, logging to /home/admin/hadoop-0.20.2/bin/../logs/hadoop-admin-tasktracker-server2.out

    server3.myhadoop.com: starting tasktracker, logging to /home/admin/hadoop-0.20.2/bin/../logs/hadoop-admin-tasktracker-server3.out

     

     

    Check server1, server2, and server3; all daemons started successfully:

    [admin@server1 logs]$ jps

    6481 NameNode

    6612 SecondaryNameNode

    6681 JobTracker

    6749 Jps

     

    [admin@server2 logs]$ jps

    14869 TaskTracker

    14917 Jps

    14795 DataNode

     

    [admin@server3 logs]$ jps

    16354 TaskTracker

    16396 Jps

    16280 DataNode

    Original source: https://www.cnblogs.com/leeeee/p/7276611.html