Installing a Hadoop Cluster on CentOS 7

    Prepare three virtual machines with the IPs 192.168.220.10 (master), 192.168.220.11 (slave1), and 192.168.220.12 (slave2).

    Download jdk-6u45-linux-x64.bin and hadoop-1.2.1-bin.tar.gz and place them in /usr/local/src/.

    Install the JDK (on every virtual machine)

    1. cd into /usr/local/src/ and run ./jdk-6u45-linux-x64.bin
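    The downloaded installer may not carry the execute bit, so the full step looks like this (a minimal sketch, run as root):

    cd /usr/local/src/
    chmod +x jdk-6u45-linux-x64.bin
    ./jdk-6u45-linux-x64.bin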

    2. Edit ~/.bashrc and append the following three lines:

    export JAVA_HOME=/usr/local/src/jdk1.6.0_45
    export CLASSPATH=.:$CLASSPATH:$JAVA_HOME/lib
    export PATH=$PATH:$JAVA_HOME/bin

    3. Make the variables take effect by running source ~/.bashrc
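    A quick sanity check (JDK 6u45 should report version 1.6.0_45):

    java -version
    # expected first line: java version "1.6.0_45"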

    Installing Hadoop

    Install Hadoop on the 192.168.220.10 machine first:

    1. cd into /usr/local/src/ and extract the tarball with tar -zxf hadoop-1.2.1-bin.tar.gz

    2. Edit the configuration files under /usr/local/src/hadoop-1.2.1/conf/

    The masters file (despite its name, this lists the host that runs the SecondaryNameNode):

    master

    The slaves file (the hosts that run the DataNode and TaskTracker daemons):

    slave1
    slave2

    core-site.xml (fs.default.name is the NameNode address that HDFS clients and the DataNodes connect to):

    <configuration>
        <property>
            <name>hadoop.tmp.dir</name>
            <value>/usr/local/src/hadoop-1.2.1/tmp</value>
        </property>
        <property>
            <name>fs.default.name</name>
            <value>hdfs://192.168.220.10:9000</value>
        </property>
    </configuration>
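    The tmp directory referenced by hadoop.tmp.dir does not exist in a fresh extraction; creating it up front is a harmless precaution (the format step later can normally create it on its own):

    mkdir -p /usr/local/src/hadoop-1.2.1/tmp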
    

      

    mapred-site.xml (mapred.job.tracker takes the JobTracker address as plain host:port; a URL scheme is not part of the expected format):

    <configuration>
        <property>
            <name>mapred.job.tracker</name>
            <value>192.168.220.10:9001</value>
        </property>
    </configuration>

    hdfs-site.xml (note that this cluster has only two DataNodes, so a replication factor of 3 can never be fully satisfied; 2 would match the cluster size, while 3 still works but leaves blocks marked under-replicated):

    <configuration>
        <property>
            <name>dfs.replication</name>
            <value>3</value>
        </property>
    </configuration>

    In hadoop-env.sh, append one line at the end:

    export JAVA_HOME=/usr/local/src/jdk1.6.0_45

    3. Copy the /usr/local/src/hadoop-1.2.1 directory to the 192.168.220.11 and 192.168.220.12 machines, for example with scp as sketched below.
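    One way to do the copy (a sketch; scp will prompt for passwords here, since passwordless SSH is only set up later):

    scp -r /usr/local/src/hadoop-1.2.1 root@192.168.220.11:/usr/local/src/
    scp -r /usr/local/src/hadoop-1.2.1 root@192.168.220.12:/usr/local/src/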

    Configure the hostnames

    Set the hostname of 192.168.220.10 to master:

    1. Run hostname master (this takes effect immediately but does not survive a reboot)

    2. Edit /etc/hostname so that it contains:

    master
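    On CentOS 7, both steps can be done with a single command, which sets the running hostname and updates /etc/hostname at the same time:

    hostnamectl set-hostname master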
    

      

    Set the hostname of 192.168.220.11 to slave1 and of 192.168.220.12 to slave2 in the same way.

    Configure the hosts file

    Append the following lines to the end of /etc/hosts on all three machines:

    192.168.220.10    master
    192.168.220.11    slave1
    192.168.220.12    slave2

    Disable the firewall and SELinux

    Run the following on every machine:

    systemctl stop firewalld.service
    setenforce 0
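    These commands only last until the next reboot. To make the change permanent (assuming leaving the firewall and SELinux off is acceptable in your environment):

    systemctl disable firewalld.service
    # and set SELINUX=disabled in /etc/selinux/config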

    Configure SSH

    1. On 192.168.220.10, run ssh-keygen. This creates a .ssh directory under ~ containing the files id_rsa and id_rsa.pub.

    2. Inside ~/.ssh, copy id_rsa.pub to authorized_keys:

    cp id_rsa.pub authorized_keys

    3. Run ssh-keygen on 192.168.220.11 and 192.168.220.12 as well.

    4. Append the id_rsa.pub contents from 192.168.220.11 and 192.168.220.12 to the authorized_keys file on 192.168.220.10, so that it looks like this:

    ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC9mGRhFOdcoHw9GUnKQmqThNKpsyah93Dtq/d8RICGWIHDRJ3GXd0sEcb743ejwbuCMmtlhheXcU0FuyA6Cm0jvMyvDfaPKArtxl6KT7Z93uC0VDCXDRomueux81HAIVjc7ZqlXwVeYs1LITxEeJykKlFOXvK7JexWhWGdMMADwxbFMbaNsZ9EwRxcFLFtNg65FQ+u8CIV9KR3D02kemwLCsP+xiRcgs+wirQPm5JM+2cJoLsVQBz3Hk335IsEhc1Xb9Cralo8Tt8gh/ho8K/1pVjvyW1b0LkP9HGNdwVYD9wkWdEJRkryLXBEXpjk4xu+riF+N4rOzJD root@master
    ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDn79fdfR/NjzPVD3NPj1vBBfQdVOrv7jeb4UJCOsd7xioPRiz8gOQnOmhu5C+GchbyGA+tg5pXwnNJTOO2wn32U4lOPndW0okN/wqyN4vgq/taJi7JgY/8rneBiGaIIdNIy/pAGlMwb53Qn766adetMhsxYMD2l4uxmbVVjzCRb8QP5EsAYTmmFOODzJsPm70uF3j1Q8zGavYg0wFSYR/yECQns4DBSuBJNxdGY6PskBXqurahwi5yaR3vWV1Ix4wtB6BYuQomEnGdzOSfrBMZ/yc5tXo0xmEfY7wFkize6z9Pm2E3oDoMR18YkwT1Cz6fHikVILA9cldtL root@slave1
    ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCydYCvASzCZggks4hMqOcYSGLO2eAvocWezNOMwspTfpJ105Jumb/vf5h6cRZeckq56IvhSV6t6mytk4pZoZjjZPSmWvCwLtMRMPShNbA3BYtj5V3WRKV8ZcMrNdD//U7iHHoJm57vI/m+XO42YSYjPw7JDkb8Ij9b6zgI3fyvbSSYeXb451PlyJLHdxIzRMAaZDSbAML9e7EO8VJB9Wf9bXpow4+VeP33it3kgMNUlHQtyqduSwYGxVVtGsUTJkxnuRsbWeeA1/pp8MNFKUgBTMALTVHByglgZqwGcbblJxsG832PIZNRECIFqorm6odftjnT4DR7/0yR root@slave2

    5. Copy the authorized_keys file from 192.168.220.10 to the 192.168.220.11 and 192.168.220.12 machines (see the sketch below).
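    For example (a sketch; the ~/.ssh directories on the slaves already exist thanks to step 3):

    scp ~/.ssh/authorized_keys root@slave1:~/.ssh/
    scp ~/.ssh/authorized_keys root@slave2:~/.ssh/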

    Once this is done, the three machines can SSH to one another without a password.
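    A quick check from master; each command should print the remote hostname without asking for a password:

    ssh slave1 hostname
    ssh slave2 hostname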

    That's it; the Hadoop cluster is now configured. Let's try it out.

    Format the NameNode on 192.168.220.10

    cd into /usr/local/src/hadoop-1.2.1/bin and run:

    ./hadoop namenode -format

    The output should look like this, ending with a "successfully formatted" message:

    19/08/04 15:15:21 INFO namenode.NameNode: STARTUP_MSG: 
    /************************************************************
    STARTUP_MSG: Starting NameNode
    STARTUP_MSG:   host = master/192.168.220.10
    STARTUP_MSG:   args = [-format]
    STARTUP_MSG:   version = 1.2.1
    STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152; compiled by 'mattf' on Mon Jul 22 15:23:09 PDT 2013
    STARTUP_MSG:   java = 1.6.0_45
    ************************************************************/
    19/08/04 15:15:21 INFO util.GSet: Computing capacity for map BlocksMap
    19/08/04 15:15:21 INFO util.GSet: VM type       = 64-bit
    19/08/04 15:15:21 INFO util.GSet: 2.0% max memory = 1013645312
    19/08/04 15:15:21 INFO util.GSet: capacity      = 2^21 = 2097152 entries
    19/08/04 15:15:21 INFO util.GSet: recommended=2097152, actual=2097152
    19/08/04 15:15:22 INFO namenode.FSNamesystem: fsOwner=root
    19/08/04 15:15:22 INFO namenode.FSNamesystem: supergroup=supergroup
    19/08/04 15:15:22 INFO namenode.FSNamesystem: isPermissionEnabled=true
    19/08/04 15:15:22 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
    19/08/04 15:15:22 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
    19/08/04 15:15:22 INFO namenode.FSEditLog: dfs.namenode.edits.toleration.length = 0
    19/08/04 15:15:22 INFO namenode.NameNode: Caching file names occuring more than 10 times 
    19/08/04 15:15:23 INFO common.Storage: Image file /usr/local/src/hadoop-1.2.1/tmp/dfs/name/current/fsimage of size 110 bytes saved in 0 seconds.
    19/08/04 15:15:23 INFO namenode.FSEditLog: closing edit log: position=4, editlog=/usr/local/src/hadoop-1.2.1/tmp/dfs/name/current/edits
    19/08/04 15:15:23 INFO namenode.FSEditLog: close success: truncate to 4, editlog=/usr/local/src/hadoop-1.2.1/tmp/dfs/name/current/edits
    19/08/04 15:15:23 INFO common.Storage: Storage directory /usr/local/src/hadoop-1.2.1/tmp/dfs/name has been successfully formatted.
    19/08/04 15:15:23 INFO namenode.NameNode: SHUTDOWN_MSG: 
    /************************************************************
    SHUTDOWN_MSG: Shutting down NameNode at master/192.168.220.10
    ************************************************************/

    Start the cluster with ./start-all.sh (still in the bin directory), then check the running processes with jps:

    [root@master bin]# jps
    19905 JobTracker
    19650 NameNode
    19821 SecondaryNameNode
    20202 Jps

    Check the processes on 192.168.220.11:

    9289 DataNode
    9493 Jps
    9391 TaskTracker

    And on 192.168.220.12:

    6823 DataNode
    6923 TaskTracker
    7057 Jps

    A quick test:

    Run ./hadoop fs -ls /, which shows:

    drwxr-xr-x   - root supergroup          0 2019-08-04 15:15 /usr

    Upload a file: ./hadoop fs -put /root/w.txt /

    Read it back with ./hadoop fs -cat /w.txt, which prints:

    ddd

    Success.
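    To exercise MapReduce as well, the examples jar that ships in the Hadoop root directory can run a word count over the uploaded file (a sketch; the output file name can vary between releases):

    ./hadoop jar ../hadoop-examples-1.2.1.jar wordcount /w.txt /wordcount-out
    ./hadoop fs -cat /wordcount-out/part-r-00000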

    Note:

    Problem: uploading a file fails with the following error:

    [root@master bin]# ./hadoop fs -put /root/a.txt /
    19/08/04 15:21:07 WARN hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /a.txt could only be replicated to 0 nodes, instead of 1
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1920)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:783)
        at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1432)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1428)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1426)
    
        at org.apache.hadoop.ipc.Client.call(Client.java:1113)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
        at com.sun.proxy.$Proxy1.addBlock(Unknown Source)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:85)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:62)
        at com.sun.proxy.$Proxy1.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:3720)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3580)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2600(DFSClient.java:2783)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:3023)
    
    19/08/04 15:21:07 WARN hdfs.DFSClient: Error Recovery for null bad datanode[0] nodes == null
    19/08/04 15:21:07 WARN hdfs.DFSClient: Could not get block locations. Source file "/a.txt" - Aborting...
    put: java.io.IOException: File /a.txt could only be replicated to 0 nodes, instead of 1
    19/08/04 15:21:07 ERROR hdfs.DFSClient: Failed to close file /a.txt
    org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /a.txt could only be replicated to 0 nodes, instead of 1
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1920)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:783)
        at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1432)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1428)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1426)
    
        at org.apache.hadoop.ipc.Client.call(Client.java:1113)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
        at com.sun.proxy.$Proxy1.addBlock(Unknown Source)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:85)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:62)
        at com.sun.proxy.$Proxy1.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:3720)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3580)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2600(DFSClient.java:2783)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:3023)

    Solution: this error means the NameNode could not find a single live DataNode to place the block on. Here the cause was firewalld still running on the slave nodes and blocking DataNode traffic; stop the firewall on every machine (see the commands above) and retry the upload.
