  • HDFS deployment: environment preparation and pseudo-distributed setup

    Hadoop:
    Broad sense: the ecosystem built around the Apache Hadoop software (Hive, ZooKeeper, Spark, HBase)
    Narrow sense: the Apache Hadoop software itself

    hadoop.apache.org
    hive.apache.org
    spark.apache.org

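    Hadoop versions (as of these notes):
    1.x: not used in production
    2.x: mainstream
    3.x: no company dares to run it in production yet, because
    a. early adopters have to work through the pitfalls themselves
    b. many companies deploy their big-data environments on CDH 5.x (www.cloudera.com)
    Question: is 2.6.0-cdh5.7.0 the same as Apache Hadoop 2.6.0? (It is built on 2.6.0, with Cloudera's patches on top.)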

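    Hadoop components:
    HDFS: storage, a distributed file system
    MapReduce: computation (job1, job2, ...), coded in Java, but rarely hand-written in production (hard to develop, verbose, slow)
    YARN: resource (CPU, memory) and job scheduling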

    Versions used in class: 2.6.0-cdh5.7.0
    2.6.0-cdh5.14.0

    apache hadoop: hadoop.apache.org
    cdh hadoop: http://archive.cloudera.com/cdh5/cdh/5/

    wget http://archive.cloudera.com/cdh5/cdh/5/hadoop-2.6.0-cdh5.7.0.tar.gz
    Alternatively, upload the tarball with the rz command
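The tarball name encodes both the upstream Apache version and the CDH release. A minimal sketch of splitting that string and rebuilding the download URL follows. The helper names are hypothetical, and the URL builder assumes other releases follow the same archive layout as the one URL shown above, which these notes do not confirm.

```shell
# Split a CDH Hadoop version string such as "2.6.0-cdh5.7.0" into its parts.
apache_version() { printf '%s\n' "${1%%-cdh*}"; }   # upstream Apache Hadoop base version
cdh_release()    { printf '%s\n' "${1#*-cdh}"; }    # Cloudera release number

# Build the tarball URL (hypothetical helper; assumes the archive layout used above).
cdh_tarball_url() {
    printf 'http://archive.cloudera.com/cdh5/cdh/5/hadoop-%s.tar.gz\n' "$1"
}
```

For example, `wget "$(cdh_tarball_url 2.6.0-cdh5.7.0)"` would fetch the same file as the wget line above.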

    For learning, pseudo-distributed mode is enough: a single server will do.


    Ubuntu Linux:
    $ sudo apt-get install ssh
    $ sudo apt-get install rsync

    CentOS (uses yum; the SSH packages are named openssh-server and openssh-clients):

    $ sudo yum install openssh-server openssh-clients
    $ sudo yum install rsync

    ----------------------------

    Different users manage different software:
    Linux: the root user
    MySQL: the mysqladmin user
    Hadoop: the hadoop user

    1.Create the user and upload the Hadoop tarball
    useradd hadoop
    su - hadoop
    [hadoop@hadoop002 ~]$ mkdir app
    [hadoop@hadoop002 ~]$ cd app/

    [hadoop@hadoop002 app]$ wget http://archive.cloudera.com/cdh5/cdh/5/hadoop-2.6.0-cdh5.7.0.tar.gz

    [hadoop@hadoop002 app]$ rz    # or upload the tarball with rz instead


    2.Deploy the JDK
    CDH environment:
    mkdir /usr/java          # JDK installation directory
    mkdir /usr/share/java    # a CDH deployment expects the MySQL JDBC jar here

    Upload jdk-8u45-linux-x64.gz with rz

    Extract
    [root@hadoop002 java]# tar -xzvf jdk-8u45-linux-x64.gz
    [root@hadoop002 java]# ll
    total 319156
    drwxr-xr-x 8 uucp 143 4096 Apr 11 2015 jdk1.8.0_45
    -rw-r--r-- 1 root root 153530841 Jul 8 2015 jdk-7u80-linux-x64.tar.gz
    -rw-r--r-- 1 root root 173271626 Sep 19 11:49 jdk-8u45-linux-x64.gz

    Fix the ownership
    [root@hadoop002 java]# chown -R root:root jdk1.8.0_45
    [root@hadoop002 java]# ll
    total 319156
    drwxr-xr-x 8 root root 4096 Apr 11 2015 jdk1.8.0_45
    -rw-r--r-- 1 root root 153530841 Jul 8 2015 jdk-7u80-linux-x64.tar.gz
    -rw-r--r-- 1 root root 173271626 Sep 19 11:49 jdk-8u45-linux-x64.gz
    [root@hadoop002 java]# vi /etc/profile

    #env
    export JAVA_HOME=/usr/java/jdk1.8.0_45
    export JRE_HOME=$JAVA_HOME/jre
    export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
    export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH


    [root@hadoop002 java]# source /etc/profile
    [root@hadoop002 java]# which java
    /usr/java/jdk1.8.0_45/bin/java
    [root@hadoop002 java]#
    [root@hadoop002 java]#
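Beyond `which java`, it can help to confirm that JAVA_HOME itself points at a usable JDK, since Hadoop's scripts read that variable rather than PATH. A minimal sketch, with a hypothetical function name:

```shell
# Succeeds only if the given directory looks like a JDK root with a runnable java binary.
check_java_home() {
    dir=$1
    if [ -z "$dir" ]; then
        echo "JAVA_HOME is empty" >&2
        return 1
    fi
    if [ ! -x "$dir/bin/java" ]; then
        echo "no executable java under $dir/bin" >&2
        return 1
    fi
    echo "ok: $dir/bin/java"
}
```

Typical use after `source /etc/profile` would be `check_java_home "$JAVA_HOME"`.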


    Login shell (interpreter): /bin/bash allows interactive login,
    /sbin/nologin refuses it

    CDH:
    Component   User
    hdfs        hdfs
    yarn        yarn
    zookeeper   zookeeper
    hbase       hbase

    su - zookeeper fails: This account is currently not available.
    What to do in production: change /sbin/nologin to /bin/bash

    vim /etc/passwd

    zookeeper:x:515:515::/home/zookeeper:/sbin/nologin

    Change it to

    zookeeper:x:515:515::/home/zookeeper:/bin/bash
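The login shell is the seventh colon-separated field of the passwd entry, so it can be inspected without opening an editor. A minimal sketch with a hypothetical helper name; the commented commands show the standard `usermod -s` and `su -s` alternatives to hand-editing /etc/passwd and are not executed here.

```shell
# Print the login shell (7th field) of a user from a passwd-format file.
login_shell_of() {
    user=$1
    passwd_file=${2:-/etc/passwd}
    awk -F: -v u="$user" '$1 == u { print $7 }' "$passwd_file"
}

# Changing the shell without hand-editing /etc/passwd (run as root):
#   usermod -s /bin/bash zookeeper
# Opening a one-off shell while leaving /sbin/nologin in place (run as root):
#   su -s /bin/bash - zookeeper
```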

    3.Extract Hadoop
    [hadoop@hadoop002 app]$ tar -xzvf hadoop-2.6.0-cdh5.7.0.tar.gz
    [hadoop@hadoop002 app]$ cd hadoop-2.6.0-cdh5.7.0
    [hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ ll
    total 76
    drwxr-xr-x 2 hadoop hadoop 4096 Mar 24 2016 bin      (executable scripts)
    drwxr-xr-x 2 hadoop hadoop 4096 Mar 24 2016 bin-mapreduce1
    drwxr-xr-x 3 hadoop hadoop 4096 Mar 24 2016 cloudera
    drwxr-xr-x 6 hadoop hadoop 4096 Mar 24 2016 etc      (configuration directory, conf)
    drwxr-xr-x 5 hadoop hadoop 4096 Mar 24 2016 examples
    drwxr-xr-x 3 hadoop hadoop 4096 Mar 24 2016 examples-mapreduce1
    drwxr-xr-x 2 hadoop hadoop 4096 Mar 24 2016 include
    drwxr-xr-x 3 hadoop hadoop 4096 Mar 24 2016 lib      (jar directory)
    drwxr-xr-x 2 hadoop hadoop 4096 Mar 24 2016 libexec
    drwxr-xr-x 3 hadoop hadoop 4096 Mar 24 2016 sbin     (start and stop scripts for the Hadoop daemons)
    drwxr-xr-x 4 hadoop hadoop 4096 Mar 24 2016 share
    drwxr-xr-x 17 hadoop hadoop 4096 Mar 24 2016 src
    [hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$


    4.Configuration
    Use the following:

    etc/hadoop/core-site.xml:

    <configuration>
    <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
    </property>
    </configuration>


    etc/hadoop/hdfs-site.xml:
    <configuration>
    <property>
    <name>dfs.replication</name>
    <value>1</value>
    </property>
    </configuration>
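The two files above can also be generated from a script, which is handy when rebuilding a test machine. A minimal sketch, assuming it is acceptable to overwrite any existing core-site.xml and hdfs-site.xml in the target directory; the function name and its arguments are hypothetical.

```shell
# Write the two minimal pseudo-distributed config files into a directory.
# Existing files with the same names are overwritten.
write_pseudo_conf() {
    conf_dir=$1
    fs_uri=${2:-hdfs://localhost:9000}
    mkdir -p "$conf_dir" || return 1
    cat > "$conf_dir/core-site.xml" <<EOF
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>$fs_uri</value>
  </property>
</configuration>
EOF
    cat > "$conf_dir/hdfs-site.xml" <<EOF
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
EOF
}
```

From the Hadoop directory this would be called as `write_pseudo_conf etc/hadoop`.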

    5.Configure passwordless SSH trust for localhost
    [hadoop@hadoop002 ~]$ ssh-keygen
    Generating public/private rsa key pair.
    Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):
    Created directory '/home/hadoop/.ssh'.
    Enter passphrase (empty for no passphrase):
    Enter same passphrase again:
    Your identification has been saved in /home/hadoop/.ssh/id_rsa.
    Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
    The key fingerprint is:
    ba:48:3d:ff:af:4d:da:74:67:31:d6:98:ad:a0:b3:76 hadoop@hadoop002
    The key's randomart image is:
    +--[ RSA 2048]----+
    | |
    | |
    | |
    | +.|
    | S . o+o|
    | . . . ...o|
    | . + o o o o|
    | . . + .OE. o |
    | . . .o=++ |
    +-----------------+
    [hadoop@hadoop002 ~]$ cd .ssh
    [hadoop@hadoop002 .ssh]$ ll
    total 8
    -rw------- 1 hadoop hadoop 1675 Feb 13 22:36 id_rsa      (private key)
    -rw-r--r-- 1 hadoop hadoop 398 Feb 13 22:36 id_rsa.pub   (public key)
    [hadoop@hadoop002 .ssh]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
    [hadoop@hadoop002 .ssh]$

    [hadoop@hadoop002 .ssh]$ ll
    total 12
    -rw-rw-r-- 1 hadoop hadoop 398 Feb 13 22:37 authorized_keys
    -rw------- 1 hadoop hadoop 1675 Feb 13 22:36 id_rsa
    -rw-r--r-- 1 hadoop hadoop 398 Feb 13 22:36 id_rsa.pub
    -rw-r--r-- 1 hadoop hadoop 0 Feb 13 22:39 known_hosts

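    At this point ssh localhost date still asks for a password, yet the hadoop user has no password set.
    So how do we get passwordless trust working without setting a password?

    Fix the permissions (sshd ignores authorized_keys while it is group-writable, as the -rw-rw-r-- mode above shows):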

    chmod 700 -R ~/.ssh
    chmod 600 ~/.ssh/authorized_keys 
    [hadoop@hadoop002 .ssh]$ chmod 600 authorized_keys
    [hadoop@hadoop002 .ssh]$ ssh localhost date
    The authenticity of host 'localhost (127.0.0.1)' can't be established.
    RSA key fingerprint is b1:94:33:ec:95:89:bf:06:3b:ef:30:2f:d7:8e:d2:4c.
    Are you sure you want to continue connecting (yes/no)? yes
    Warning: Permanently added 'localhost' (RSA) to the list of known hosts.
    Wed Feb 13 22:41:17 CST 2019
    [hadoop@hadoop002 .ssh]$
    [hadoop@hadoop002 .ssh]$
    [hadoop@hadoop002 .ssh]$ ssh localhost date
    #### ssh logs in to localhost, runs the given command there, and returns its output
    Wed Feb 13 22:41:22 CST 2019
    [hadoop@hadoop002 .ssh]$
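The permission fix can be wrapped in a small function that also reports the resulting modes. A minimal sketch, assuming GNU coreutils `stat` (the `-c` form used on Linux); the function name is hypothetical. Unlike `chmod 700 -R`, it changes only the directory and authorized_keys.

```shell
# Tighten permissions on an SSH directory so sshd will accept its authorized_keys.
fix_ssh_perms() {
    ssh_dir=${1:-$HOME/.ssh}
    chmod 700 "$ssh_dir" || return 1
    # stat -c is the GNU coreutils form; %a is the octal mode, %n the file name.
    stat -c '%a %n' "$ssh_dir"
    if [ -f "$ssh_dir/authorized_keys" ]; then
        chmod 600 "$ssh_dir/authorized_keys" || return 1
        stat -c '%a %n' "$ssh_dir/authorized_keys"
    fi
}
```

Running `fix_ssh_perms` with no argument targets `~/.ssh` of the current user.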


    6.Format the NameNode
    [hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ bin/hdfs namenode -format


    7.Start HDFS
    [hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ sbin/start-dfs.sh
    19/02/13 22:47:42 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

    Starting namenodes on [localhost]
    localhost: starting namenode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-namenode-hadoop002.out
    localhost: starting datanode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-datanode-hadoop002.out

    # Note: an error may appear at this point

    [hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$ sbin/start-dfs.sh
    19/03/23 19:52:08 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    Stopping namenodes on [hadoop001]
    hadoop001: Error: JAVA_HOME is not set and could not be found.
    localhost: Error: JAVA_HOME is not set and could not be found.

    Fix: set the JDK home directory in /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/etc/hadoop/hadoop-env.sh

    [hadoop@hadoop001 hadoop]$ vi hadoop-env.sh

    # Change export JAVA_HOME=${JAVA_HOME} to the line below, and add HADOOP_HOME
    export JAVA_HOME=/usr/java/jdk1.8.0_45
    export HADOOP_HOME=/home/hadoop/app/hadoop-2.6.0-cdh5.7.0

    [hadoop@hadoop001 hadoop]$ source hadoop-env.sh    # important: do not forget to source it

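    Starting again still fails, now with a different error (likely because sourcing hadoop-env.sh exported its default HADOOP_CONF_DIR of /etc/hadoop into the current shell):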

    Error: Cannot find configuration directory: /etc/hadoop
    Error: Cannot find configuration directory: /etc/hadoop

    Fix

    Again in hadoop-env.sh, change the default value of HADOOP_CONF_DIR to the directory that contains hadoop-env.sh

    [xuziyu@hadoop001 hadoop]$ pwd
    /home/xuziyu/app/hadoop-2.6.0-cdh5.7.0/etc/hadoop  -- the directory that contains hadoop-env.sh

    [hadoop@hadoop001 hadoop]$ vi hadoop-env.sh

    export HADOOP_CONF_DIR=/home/xuziyu/app/hadoop-2.6.0-cdh5.7.0/etc/hadoop

    [hadoop@hadoop001 hadoop]$ source hadoop-env.sh    # important: do not forget to source it

    ## Error resolved
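The JAVA_HOME edit in hadoop-env.sh can be scripted instead of done in vi. A minimal sketch, assuming GNU sed (`-i` for in-place editing) and a JDK path that contains neither `|` nor `&`; the function name is hypothetical.

```shell
# Replace the "export JAVA_HOME=..." line in a hadoop-env.sh with a fixed path.
set_java_home() {
    env_file=$1
    java_home=$2
    [ -f "$env_file" ] || { echo "not found: $env_file" >&2; return 1; }
    # '|' is the sed delimiter because the path contains slashes.
    sed -i "s|^export JAVA_HOME=.*|export JAVA_HOME=$java_home|" "$env_file"
    # Show the resulting line so the change can be verified.
    grep '^export JAVA_HOME=' "$env_file"
}
```

For the layout in these notes the call would be `set_java_home etc/hadoop/hadoop-env.sh /usr/java/jdk1.8.0_45`.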

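    The SSH trust relationship is configured for localhost.
    Why configure SSH at all?
    Because the Hadoop start scripts use SSH to launch the daemons on each host, even when that host is the local machine.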

    Starting secondary namenodes [0.0.0.0]
    The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
    RSA key fingerprint is b1:94:33:ec:95:89:bf:06:3b:ef:30:2f:d7:8e:d2:4c.
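    Are you sure you want to continue connecting (yes/no)? yes
    ## We answer yes here because known_hosts has no entry for 0.0.0.0 yet. No password is requested afterwards because our public key (id_rsa.pub)
    ## was appended to the trusted-keys file authorized_keys, so key authentication succeeds.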
    0.0.0.0: Warning: Permanently added '0.0.0.0' (RSA) to the list of known hosts.
    0.0.0.0: starting secondarynamenode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-secondarynamenode-hadoop002.out
    19/02/13 22:49:47 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    [hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$

    ssh localhost date
    [hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ sbin/stop-dfs.sh
    [hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ sbin/start-dfs.sh
    19/02/13 22:57:23 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    Starting namenodes on [localhost]
    localhost: starting namenode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-namenode-hadoop002.out
    localhost: starting datanode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-datanode-hadoop002.out
    Starting secondary namenodes [0.0.0.0]
    0.0.0.0: starting secondarynamenode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-secondarynamenode-hadoop002.out
    19/02/13 22:57:38 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    [hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$
    [hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ jps
    15059 Jps
    14948 SecondaryNameNode    (secondary name node, the "number two")
    14783 DataNode             (data node, the worker)
    14655 NameNode             (name node, the master; handles reads and writes)
    [hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$
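The jps listing above can be checked mechanically, which is convenient in a start-up script. A minimal sketch with a hypothetical function name: it reads jps output on standard input and prints every expected HDFS daemon that is absent.

```shell
# Read jps output on stdin and print the HDFS daemons that are not running.
missing_daemons() {
    jps_output=$(cat)
    for name in NameNode DataNode SecondaryNameNode; do
        # Compare the second column exactly, so NameNode does not match SecondaryNameNode.
        if ! printf '%s\n' "$jps_output" |
            awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
            printf '%s\n' "$name"
        fi
    done
}
```

On a live machine it would be used as `jps | missing_daemons`; empty output means all three daemons are up.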
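    ### If jps shows that the DataNode did not start, open a new terminal,
    log in as root, and delete hadoop-hadoop under /tmp (or, more drastically, delete everything under /tmp). This also removes the HDFS data stored there.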
    [root@hadoop001 tmp]# ll
    total 28
    srwxr-xr-x 1 root root 0 Jan 25 00:40 Aegis-<Guid(5A2C30A2-A87D-490A-9281-6765EDAD7CBA)>
    drwxrwxr-x 3 hadoop hadoop 4096 Feb 17 12:00 hadoop-hadoop
    drwxr-xr-x 2 hadoop hadoop 4096 Feb 17 14:46 hsperfdata_hadoop
    drwxrwxr-x 4 hadoop hadoop 4096 Feb 17 12:21 Jetty_0_0_0_0_50070_hdfs____w2cu08
    drwxrwxr-x 4 hadoop hadoop 4096 Feb 17 12:21 Jetty_0_0_0_0_50090_secondary____y6aanv
    drwxrwxr-x 4 hadoop hadoop 4096 Feb 17 12:21 Jetty_localhost_43515_datanode____8tekv5
    drwx------ 3 root root 4096 Dec 19 10:45 systemd-private-eb1b20a234e9490a9501420766a0ad26-httpd.service-1R84I0
    drwx------ 3 root root 4096 Nov 21 14:11 systemd-private-eb1b20a234e9490a9501420766a0ad26-ntpd.service-ALvtHb
    [root@hadoop001 tmp]#
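    Then format again:
    [hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ bin/hdfs namenode -format
    Then start again:
    [hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ sbin/start-dfs.sh
    Now check the processes with jps again. (This fix applies to a second installation on a machine that was set up before; the failure is caused by leftover state.)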
    [hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$ jps
    24032 NameNode
    24424 Jps
    24154 DataNode
    24315 SecondaryNameNode
    [hadoop@hadoop001 hadoop-2.6.0-cdh5.7.0]$
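One common cause of a DataNode that refuses to start after a re-format is a clusterID mismatch: formatting writes a new clusterID for the NameNode while the old DataNode directory keeps the previous one. That diagnosis is an assumption here, since the notes do not show the DataNode log. A minimal sketch for comparing the two IDs, assuming the default storage location under /tmp/hadoop-<user>; the function names are hypothetical.

```shell
# Print the clusterID recorded in a NameNode or DataNode VERSION file.
cluster_id_of() {
    grep '^clusterID=' "$1" | cut -d= -f2
}

# Succeed only if both VERSION files exist and carry the same clusterID.
# The base path assumes the default hadoop.tmp.dir of /tmp/hadoop-<user>.
same_cluster_id() {
    base=${1:-/tmp/hadoop-$(id -un)}
    nn=$(cluster_id_of "$base/dfs/name/current/VERSION" 2>/dev/null)
    dn=$(cluster_id_of "$base/dfs/data/current/VERSION" 2>/dev/null)
    [ -n "$nn" ] && [ "$nn" = "$dn" ]
}
```

If `same_cluster_id` fails on a test machine, removing the stale storage directory and re-formatting, as described above, brings the two back in sync at the cost of the stored data.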

    Open the NameNode web UI at http://ip:50070


    8.Environment variables
    [hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ cat ~/.bash_profile
    # .bash_profile

    # Get the aliases and functions
    if [ -f ~/.bashrc ]; then
    . ~/.bashrc
    fi

    # User specific environment and startup programs
    export HADOOP_PREFIX=/home/hadoop/app/hadoop-2.6.0-cdh5.7.0
    export PATH=$HADOOP_PREFIX/bin:$PATH
    [hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$
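Because .bash_profile may be sourced more than once in a session, a plain `PATH=...:$PATH` line keeps adding duplicate entries. A minimal sketch of an idempotent alternative; the function name is hypothetical, and adding sbin as well is a suggestion rather than something the notes configure.

```shell
# Prepend a directory to PATH only if it is not already there.
path_prepend() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present, nothing to do
        *) PATH="$1:$PATH" ;;
    esac
    export PATH
}
```

In .bash_profile this could read `path_prepend "$HADOOP_PREFIX/bin"`, optionally followed by `path_prepend "$HADOOP_PREFIX/sbin"` so that start-dfs.sh is also found.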

    9.Commands: hdfs dfs commands closely resemble their Linux counterparts

    [hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$
    [hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$
    [hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ bin/hdfs dfs -ls /
    19/02/13 23:08:35 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    [hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$
    [hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$
    [hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ bin/hdfs dfs -ls /
    19/02/13 23:11:22 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    [hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$
    [hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$
    [hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$
    [hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ ls /
    bin dev home lib64 media opt root sbin srv tmp var
    boot etc lib lost+found mnt proc run selinux sys usr
    [hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$
    [hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$
    [hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ bin/hdfs dfs -mkdir /ruozedata
    19/02/13 23:11:51 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    [hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$
    [hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$
    [hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ bin/hdfs dfs -ls /
    19/02/13 23:11:56 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    Found 1 items
    drwxr-xr-x - hadoop supergroup 0 2019-02-13 23:11 /ruozedata
    [hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$
    [hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$
    [hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$
    [hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ ls /
    bin dev home lib64 media opt root sbin srv tmp var
    boot etc lib lost+found mnt proc run selinux sys usr
    [hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$
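The same pattern extends to writing and reading files. A minimal sketch of a round-trip check, assuming a running cluster and `hdfs` on PATH; the function name and the file name hello.txt are hypothetical, while /ruozedata is the directory created above.

```shell
# Round-trip a small file through HDFS: create a directory, upload, read back.
hdfs_smoke_test() {
    target_dir=${1:-/ruozedata}
    local_file=$(mktemp) || return 1
    echo "hello hdfs" > "$local_file"
    # -mkdir -p tolerates an existing directory; -put -f overwrites an existing file.
    hdfs dfs -mkdir -p "$target_dir" &&
        hdfs dfs -put -f "$local_file" "$target_dir/hello.txt" &&
        hdfs dfs -cat "$target_dir/hello.txt"
    status=$?
    rm -f "$local_file"
    return $status
}
```

Calling `hdfs_smoke_test` should print the file's content if the NameNode and DataNode are both healthy.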



    http://blog.itpub.net/30089851/viewspace-1992210/
    http://blog.itpub.net/30089851/viewspace-2127102/

  • Original article: https://www.cnblogs.com/xuziyu/p/10391690.html