
CDH Hadoop / Hive / Impala Single-Node Setup: A Detailed, Fully Offline Installation

Environment and installation package download links

CentOS 6.5

Version: CDH 5.14.0 (cdh5.14.0-src.tar.gz)

Hadoop package download:

    http://archive.cloudera.com/cdh5/cdh/5/hadoop-2.6.0-cdh5.14.0.tar.gz

Hive package download:

    http://archive.cloudera.com/cdh5/cdh/5/hive-1.1.0-cdh5.14.0.tar.gz

Impala package downloads:

    http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/5.14.0/RPMS/x86_64/impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64.rpm

    http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/5.14.0/RPMS/x86_64/impala-catalog-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64.rpm

    http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/5.14.0/RPMS/x86_64/impala-debuginfo-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64.rpm

    http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/5.14.0/RPMS/x86_64/impala-server-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64.rpm

    http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/5.14.0/RPMS/x86_64/impala-shell-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64.rpm

    http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/5.14.0/RPMS/x86_64/impala-state-store-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64.rpm  

    http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/5.14.0/RPMS/x86_64/impala-udf-devel-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64.rpm

Reference documentation:

    https://docs.cloudera.com/documentation/enterprise/5-12-x/topics/cdh_qs_cdh5_pseudo.html#topic_3

    https://docs.cloudera.com/documentation/index.html

1. Install the JDK

Upload the JDK RPM package to the server, then install it:

    rpm -ivh jdk-8u221-linux-x64.rpm

By default the JDK installs under /usr/java; here the path is /usr/java/jdk1.8.0_221-amd64.

Configure the environment variables:

    vim /etc/profile

# append the following

    JAVA_HOME=/usr/java/jdk1.8.0_221-amd64

    JRE_HOME=/usr/java/jdk1.8.0_221-amd64/jre

    PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin

    CLASSPATH=:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib

    export JAVA_HOME JRE_HOME PATH CLASSPATH

    [root@sunny conf]# java -version

    java version "1.8.0_221"

    Java(TM) SE Runtime Environment (build 1.8.0_221-b11)

    Java HotSpot(TM) 64-Bit Server VM (build 25.221-b11, mixed mode)

2. Install Hadoop

2.1 Choose /data/CDH as the Hadoop install directory and extract the tarball

    cd /data/CDH

    tar -zvxf /data/install_pkg_CDH/hadoop-2.6.0-cdh5.14.0.tar.gz

The main Hadoop configuration files are under:

    /data/CDH/hadoop-2.6.0-cdh5.14.0/etc/hadoop/

     

2.2 Configure Hadoop environment variables

    vi /etc/profile

# append at the end:

    JAVA_HOME=/usr/java/jdk1.8.0_221-amd64

    JRE_HOME=/usr/java/jdk1.8.0_221-amd64/jre

    PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin

    CLASSPATH=:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib

    export JAVA_HOME JRE_HOME PATH CLASSPATH

    export HADOOP_HOME=/data/CDH/hadoop-2.6.0-cdh5.14.0

    export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_HOME}/lib/native

export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"

    export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH

source /etc/profile  # apply the changes
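As a quick sanity check (optional), confirm the variables are picked up:

hadoop version   # should report Hadoop 2.6.0-cdh5.14.0
echo $HADOOP_HOME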

2.3 Create a data directory under the Hadoop install directory (it will be referenced in core-site.xml)

    cd /data/CDH/hadoop-2.6.0-cdh5.14.0

    mkdir data

2.4 Configure core-site.xml

    vi   /data/CDH/hadoop-2.6.0-cdh5.14.0/etc/hadoop/core-site.xml

# add the following properties:

    <property>

        <name>fs.defaultFS</name>

        <!-- use your own server IP here; the port is the default -->

        <value>hdfs://172.20.71.66:9000</value>

    </property>

    <property>

        <name>hadoop.tmp.dir</name>

        <!-- your custom Hadoop working/data directory -->

        <value>/data/CDH/hadoop-2.6.0-cdh5.14.0/data</value>

    </property>

    <property>

        <name>hadoop.native.lib</name>

        <value>false</value>

        <description>Should native hadoop libraries, if present, be used.

        </description>

    </property>

2.5 Configure hdfs-site.xml

    vi      /data/CDH/hadoop-2.6.0-cdh5.14.0/etc/hadoop/hdfs-site.xml

# add the following properties:

    <property>

             <name>dfs.replication</name>

             <value>1</value>

    </property>

    <property>

        <name>dfs.namenode.secondary.http-address</name>

        <value>172.20.71.66:50090</value>

    </property>

    <property>

        <name>dfs.permissions.enabled</name>

        <value>false</value>

    </property>

2.6 Configure mapred-site.xml

    cd     /data/CDH/hadoop-2.6.0-cdh5.14.0/etc/hadoop

cp mapred-site.xml.template  ./mapred-site.xml  # copy the template to create mapred-site.xml

    vi      /data/CDH/hadoop-2.6.0-cdh5.14.0/etc/hadoop/mapred-site.xml

# add the following property:

    <property>

        <name>mapreduce.framework.name</name>

        <value>yarn</value>

    </property>

2.7 Configure yarn-site.xml

    vi      /data/CDH/hadoop-2.6.0-cdh5.14.0/etc/hadoop/yarn-site.xml 

# add the following properties:

    <property>

        <name>yarn.resourcemanager.hostname</name>

        <value>172.20.71.66</value>

    </property>

    <property>

        <name>yarn.nodemanager.aux-services</name>

        <value>mapreduce_shuffle</value>

    </property>

2.8 Configure hadoop-env.sh

    vi      /data/CDH/hadoop-2.6.0-cdh5.14.0/etc/hadoop/hadoop-env.sh

    export JAVA_HOME=${JAVA_HOME}

Change it to:

    export JAVA_HOME=/usr/java/jdk1.8.0_221-amd64

2.9 Start Hadoop

Once everything is configured, switch to the sbin directory:

    cd /data/CDH/hadoop-2.6.0-cdh5.14.0/sbin/ 

hadoop namenode -format  # format the HDFS filesystem

Possible error: the hostname does not match /etc/hosts.

Run hostname to check the host name; suppose it is testuser.

Edit /etc/hosts:

Change

127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4

::1      localhost localhost.localdomain localhost6 localhost6.localdomain6

to (adding the host name testuser):

127.0.0.1 localhost localhost.localdomain testuser  localhost4.localdomain4

::1      localhost localhost.localdomain localhost6 localhost6.localdomain6

Then format again: hadoop namenode -format

After it succeeds, start all the daemons:

    ./start-all.sh

You will be prompted for the server password several times; just enter it when asked.
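Optionally, you can set up passwordless SSH to the local machine beforehand so that start-all.sh does not prompt at all; a minimal sketch, assuming root and the default key location:

ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa        # key pair with an empty passphrase
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
ssh localhost                                   # should now log in without a password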

Then run jps to check that the services are up.

If the five daemon processes listed below appear, the startup succeeded!
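For reference, on a single-node setup the five Hadoop daemons shown by jps are (PIDs will differ):

NameNode
DataNode
SecondaryNameNode
ResourceManager
NodeManager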

     

2.10 Access the Hadoop web UI

http://172.20.71.66:50070   # replace 172.20.71.66 with your server IP
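You can also run a quick HDFS smoke test from the command line (optional), assuming the environment variables from step 2.2 are in effect:

hdfs dfs -mkdir -p /tmp/smoke_test   # create a test directory in HDFS
hdfs dfs -ls /                       # it should show up here
hdfs dfs -rm -r /tmp/smoke_test      # clean up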

3. Install Hive

3.1 Choose /data/CDH as the Hive install directory and extract the tarball

    cd /data/CDH

    tar -zxvf /data/install_pkg_CDH/hive-1.1.0-cdh5.14.0.tar.gz

3.2 Configure environment variables

    vim /etc/profile

# append the following

    export HIVE_HOME=/data/CDH/hive-1.1.0-cdh5.14.0/

    export PATH=$HIVE_HOME/bin:$PATH

    export HIVE_CONF_DIR=$HIVE_HOME/conf

    export HIVE_AUX_JARS_PATH=$HIVE_HOME/lib

source /etc/profile

3.3 Configure hive-env.sh

    cd /data/CDH/hive-1.1.0-cdh5.14.0/conf

    cp  hive-env.sh.template hive-env.sh

vi hive-env.sh

# append the following at the end of the file

    export JAVA_HOME=/usr/java/jdk1.8.0_221-amd64

    export HADOOP_HOME=/data/CDH/hadoop-2.6.0-cdh5.14.0

    export HIVE_HOME=/data/CDH/hive-1.1.0-cdh5.14.0

    export HIVE_CONF_DIR=/data/CDH/hive-1.1.0-cdh5.14.0/conf

    export HIVE_AUX_JARS_PATH=/data/CDH/hive-1.1.0-cdh5.14.0/lib

3.4 Configure hive-site.xml

    cd /data/CDH/hive-1.1.0-cdh5.14.0/conf

    vi hive-site.xml

# add the following

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>

    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

    <configuration>

      <property>

        <name>javax.jdo.option.ConnectionURL</name>

        <value>jdbc:derby:;databaseName=metastore_db;create=true</value>

      </property>

        <property>

        <name>javax.jdo.option.ConnectionDriverName</name>

        <value>org.apache.derby.jdbc.EmbeddedDriver</value>

      </property>

        <property>

        <name>javax.jdo.option.ConnectionUserName</name>

        <value>APP</value>

      </property>

        <property>

        <name>javax.jdo.option.ConnectionPassword</name>

        <value>mine</value>

      </property>

    </configuration>

3.5 Create HDFS directories (this seems to be optional; you can skip it)

    cd /data/CDH/hive-1.1.0-cdh5.14.0

    mkdir hive_hdfs

    cd hive_hdfs

hdfs dfs -mkdir -p warehouse  # this assumes Hadoop is already installed and running

    hdfs dfs -mkdir -p tmp

    hdfs dfs -mkdir -p log

    hdfs dfs -chmod -R 777 warehouse

    hdfs dfs -chmod -R 777 tmp

    hdfs dfs -chmod -R 777 log

3.6 Initialize Hive (best run from the bin directory of the Hive install)

    cd  /data/CDH/hive-1.1.0-cdh5.14.0/bin

schematool -dbType derby -initSchema

# If you need to rerun schematool -dbType derby -initSchema, manually delete or rename the metastore_db directory first.
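For example, to keep the old metastore around instead of deleting it (the metastore_db directory is created in whatever working directory schematool or hive was run from):

mv metastore_db metastore_db.bak_$(date +%Y%m%d)   # rename the old embedded Derby metastore
schematool -dbType derby -initSchema               # then re-initialize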

     

3.7 Start Hive

hive --service metastore &    # can be run from any directory

hive --service hiveserver2 &    # can be run from any directory
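Once hiveserver2 is up, you can also connect through Beeline; a minimal sketch, assuming the default HiveServer2 port 10000 and no authentication configured:

beeline -u jdbc:hive2://172.20.71.66:10000 -n root   # then run HQL at the prompt, e.g. show databases;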

3.8 Use Hive

hive  # can be run from any directory

3.9 Troubleshooting the Hive installation

    3.9.1 java.lang.ClassNotFoundException: org.htrace.Trace

Download the htrace-core JAR from the Maven repository and copy it to $HIVE_HOME/lib:

    <!-- https://mvnrepository.com/artifact/org.htrace/htrace-core -->

    <dependency>

        <groupId>org.htrace</groupId>

        <artifactId>htrace-core</artifactId>

        <version>3.0</version>

    </dependency>

    cp /data/install_pkg_apache/htrace-core-3.0.jar /data/CDH/hive-1.1.0-cdh5.14.0/lib

    cd /data/CDH/hive-1.1.0-cdh5.14.0/bin

    3.9.2 [ERROR] Terminal initialization failed; falling back to unsupported java.lang.IncompatibleClassChangeError: Found class jline.Terminal, but interface was expected

Fix: download jline-2.12.jar from the Maven repository and upload it to the server.

    cd /data/CDH/hadoop-2.6.0-cdh5.14.0/share/hadoop/yarn/lib

    mv jline-2.11.jar jline-2.11.jar_bak

    cd /data/CDH/hive-1.1.0-cdh5.14.0/lib

    cp jline-2.12.jar /data/CDH/hadoop-2.6.0-cdh5.14.0/share/hadoop/yarn/lib

Restart Hadoop:

    cd /data/CDH/hadoop-2.6.0-cdh5.14.0/sbin

    sh stop-all.sh

    sh start-all.sh

Run again: schematool -dbType derby -initSchema

The error is still there!

Additional fix:

    vi /etc/profile

    export HADOOP_USER_CLASSPATH_FIRST=true

source /etc/profile, then run schematool -dbType derby -initSchema again; this time it succeeds.

Then start Hive:

    ./hive --service metastore &  

    ./hive --service hiveserver2 &

    3.9.3 java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient

Reference: https://www.58jb.com/html/89.html

In Hive versions before 2.0.0, two services (metastore and hiveserver2) must be started on the Hive node before you can log in to the Hive command line. (Configure hive-site.xml and then start them.)

    hive --service metastore &

    hive --service hiveserver2 &

3.10 Basic Hive operations test

hive  # enter the Hive CLI

# run some HQL statements

    show databases;

    create database test;

    use test;

create table user_test(name string) row format delimited fields terminated by '\001';

    insert into table user_test values('zs');

    select * from user_test;

4. Install Impala

4.1 Expand the disk (skip this if your server already has enough space)

A small detour: this is my own CentOS VM and it does not have enough disk space, so I need to add some.

Reference: http://blog.itpub.net/117319/viewspace-2125079/

On Windows, open a cmd command prompt:

The VDI file is at C:\Users\kingdee\VirtualBox VMs\centos6\centos6.vdi

VBoxManage.exe modifymedium C:\Users\kingdee\VirtualBox VMs\centos6\centos6.vdi --resize 20480  # this fails because the file path contains a space

    C:

cd \Users\kingdee\VirtualBox VMs\centos6

F:\VirtualBox\VBoxManage.exe modifymedium centos6.vdi --resize 20480  # run from the directory containing the .vdi, this works

On CentOS:

    fdisk -l

    fdisk /dev/sda

    n

    p

    3

Press Enter (accept the default)

+10G (the size to allocate)

w (write the changes)

     

Check and refresh the partition information:

    cat /proc/partitions

reboot  # reboot so the kernel picks up the new partition

    mkfs -t ext3 /dev/sda3

After formatting, mount the new partition:

mount /dev/sda3 /home  # to unmount: umount /home

Impala will be installed under /home:

cd /home

mkdir -p data/CDH/impala_rpm  # create the nested directories in one go

cd /home/data/CDH/impala_rpm

4.2 Install Impala (note: all the RPM packages were placed under /home/data/CDH/impala_rpm_install_pkg)

    cd    /home/data/CDH/impala_rpm

    rpm -ivh ../impala_rpm_install_pkg/impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64.rpm

This fails with the following errors:

    warning: ../impala_rpm_install_pkg/impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64.rpm: Header V4 DSA/SHA1 Signature, key ID e8f86acd: NOKEY

    error: Failed dependencies:

            bigtop-utils >= 0.7 is needed by impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64

            hadoop is needed by impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64

            hadoop-hdfs is needed by impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64

            hadoop-yarn is needed by impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64

            hadoop-mapreduce is needed by impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64

            hbase is needed by impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64

            hive >= 0.12.0+cdh5.1.0 is needed by impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64

            zookeeper is needed by impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64

            hadoop-libhdfs is needed by impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64

            avro-libs is needed by impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64

            parquet is needed by impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64

            sentry >= 1.3.0+cdh5.1.0 is needed by impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64

            sentry is needed by impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64

            /lib/lsb/init-functions is needed by impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64

            libhdfs.so.0.0.0()(64bit) is needed by impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64

Hmm... so bigtop-utils and sentry need to be installed first.

4.3 Install bigtop-utils and sentry

Download page for the bigtop-utils and sentry packages:

    http://archive.cloudera.com/cdh5/redhat/7/x86_64/cdh/5.14.0/RPMS/noarch/

Or download them directly from these two links:

    http://archive.cloudera.com/cdh5/redhat/7/x86_64/cdh/5.14.0/RPMS/noarch/bigtop-utils-0.7.0+cdh5.14.0+0-1.cdh5.14.0.p0.47.el7.noarch.rpm

    http://archive.cloudera.com/cdh5/redhat/7/x86_64/cdh/5.14.0/RPMS/noarch/sentry-1.5.1+cdh5.14.0+432-1.cdh5.14.0.p0.47.el7.noarch.rpm

Run the following commands:

    sudo rpm -ivh  ../impala_rpm_install_pkg/bigtop-utils-0.7.0+cdh5.14.0+0-1.cdh5.14.0.p0.47.el7.noarch.rpm

sudo rpm -ivh  ../impala_rpm_install_pkg/sentry-1.5.1+cdh5.14.0+432-1.cdh5.14.0.p0.47.el7.noarch.rpm --nodeps  # --nodeps skips the dependency checks

    sudo rpm -ivh  ../impala_rpm_install_pkg/impala-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64.rpm --nodeps

    sudo rpm -ivh  ../impala_rpm_install_pkg/impala-server-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64.rpm

    sudo rpm -ivh  ../impala_rpm_install_pkg/impala-state-store-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64.rpm

    sudo rpm -ivh  ../impala_rpm_install_pkg/impala-catalog-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64.rpm

    sudo rpm -ivh  ../impala_rpm_install_pkg/impala-shell-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64.rpm

When you get to impala-shell, python-setuptools has to be installed first.

4.4 Install python-setuptools

python-setuptools download link:

          http://mirror.centos.org/altarch/7/os/aarch64/Packages/python-setuptools-0.9.8-7.el7.noarch.rpm

Run:

    sudo rpm -ivh  ../impala_rpm_install_pkg/python-setuptools-0.9.8-7.el7.noarch.rpm --nodeps

Then retry the impala-shell install:

    sudo rpm -ivh  ../impala_rpm_install_pkg/impala-shell-2.11.0+cdh5.14.0+0-1.cdh5.14.0.p0.50.el6.x86_64.rpm

Done.

4.5 Configure bigtop-utils

    vi /etc/default/bigtop-utils

    export JAVA_HOME=/usr/java/jdk1.8.0_221-amd64

4.6 Configure /etc/impala/conf

cd  /etc/impala/conf  # copy the Hadoop and Hive config files into Impala's config directory

    cp /data/CDH/hadoop-2.6.0-cdh5.14.0/etc/hadoop/core-site.xml ./

    cp /data/CDH/hadoop-2.6.0-cdh5.14.0/etc/hadoop/hdfs-site.xml ./

    cp /data/CDH/hive-1.1.0-cdh5.14.0/conf/hive-site.xml ./

4.7 Configure hdfs-site.xml (the copy in /etc/impala/conf)

    vi hdfs-site.xml

# add the following:

     <property>

             <name>dfs.datanode.hdfs-blocks-metadata.enabled</name>

             <value>true</value>

    </property>

    <property>

             <name>dfs.client.file-block-storage-locations.timeout.millis</name>

             <value>10000</value>

    </property>

4.8 Sort out the JAR files

4.8.1 First, a quick primer on Linux symbolic links

Create a symlink:

ln -s [source file or directory] [link name]

Example: create a link named test in the current directory that points to /var/www/test:

ln -s /var/www/test test

Delete a symlink:

Deleting a symlink works exactly like deleting an ordinary file, with rm:

rm -rf <link name>  (note: do not append a trailing "/"; with the slash, rm operates on the contents of the directory the link points to rather than on the link itself)

Example: delete test:

rm -rf test

Modify a symlink:

ln -snf [new source file or directory] [link name]

This repoints the existing link to the new target.

Example: first create a symlink:

ln -s /var/www/test /var/test

then repoint it to a new path:

ln -snf /var/www/test1 /var/test
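To check where a link currently points, ls -l and readlink are handy:

ls -l /var/test          # shows /var/test -> /var/www/test1
readlink -f /var/test    # prints the fully resolved target path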

4.8.2 Fix the symlinks for the Impala JARs

(You can copy the commands below, but remember to replace the version numbers; my packages are version 2.6.0-cdh5.14.0, so substitute whatever versions you installed.)

Hadoop-related JARs:

    sudo rm -rf /usr/lib/impala/lib/hadoop-*.jar

    sudo ln -s $HADOOP_HOME/share/hadoop/common/lib/hadoop-annotations-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-annotations.jar

    sudo ln -s $HADOOP_HOME/share/hadoop/common/lib/hadoop-auth-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-auth.jar

    sudo ln -s $HADOOP_HOME/share/hadoop/tools/lib/hadoop-aws-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-aws.jar

    sudo ln -s $HADOOP_HOME/share/hadoop/common/hadoop-common-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-common.jar

    sudo ln -s $HADOOP_HOME/share/hadoop/hdfs/hadoop-hdfs-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-hdfs.jar

    sudo ln -s $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-common-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-mapreduce-client-common.jar

    sudo ln -s $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-mapreduce-client-core.jar

    sudo ln -s $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-mapreduce-client-jobclient.jar

    sudo ln -s $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-shuffle-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-mapreduce-client-shuffle.jar

    sudo ln -s $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-api-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-yarn-api.jar

    sudo ln -s $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-client-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-yarn-client.jar

    sudo ln -s $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-yarn-common.jar

    sudo ln -s $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-server-applicationhistoryservice-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-yarn-server-applicationhistoryservice.jar

    sudo ln -s $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-server-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-yarn-server-common.jar

    sudo ln -s $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-server-nodemanager-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-yarn-server-nodemanager.jar

    sudo ln -s $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-server-resourcemanager-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-yarn-server-resourcemanager.jar

    sudo ln -s $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-server-web-proxy-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-yarn-server-web-proxy.jar

    sudo ln -s $HADOOP_HOME/share/hadoop/mapreduce1/hadoop-core-2.6.0-mr1-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-core.jar

    sudo ln -s $HADOOP_HOME/share/hadoop/tools/lib/hadoop-azure-datalake-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-azure-datalake.jar

# re-create these two symlinks pointing at the correct source JARs:

    sudo ln -snf $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-common-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-yarn-common.jar

    sudo ln -snf $HADOOP_HOME/share/hadoop/yarn/hadoop-yarn-server-common-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-yarn-server-common.jar

    sudo ln -snf $HADOOP_HOME/share/hadoop/common/lib/avro-1.7.6-cdh5.14.0.jar /usr/lib/impala/lib/avro.jar

    sudo ln -snf $HADOOP_HOME/share/hadoop/common/lib/zookeeper-3.4.5-cdh5.14.0.jar /usr/lib/impala/lib/zookeeper.jar

Hive-related JARs:

    sudo rm -rf /usr/lib/impala/lib/hive-*.jar

    sudo ln -s $HIVE_HOME/lib/hive-ant-1.1.0-cdh5.14.0.jar /usr/lib/impala/lib/hive-ant.jar

    sudo ln -s $HIVE_HOME/lib/hive-beeline-1.1.0-cdh5.14.0.jar /usr/lib/impala/lib/hive-beeline.jar

    sudo ln -s $HIVE_HOME/lib/hive-common-1.1.0-cdh5.14.0.jar /usr/lib/impala/lib/hive-common.jar

    sudo ln -s $HIVE_HOME/lib/hive-exec-1.1.0-cdh5.14.0.jar /usr/lib/impala/lib/hive-exec.jar

    sudo ln -s $HIVE_HOME/lib/hive-hbase-handler-1.1.0-cdh5.14.0.jar /usr/lib/impala/lib/hive-hbase-handler.jar

    sudo ln -s $HIVE_HOME/lib/hive-metastore-1.1.0-cdh5.14.0.jar /usr/lib/impala/lib/hive-metastore.jar

    sudo ln -s $HIVE_HOME/lib/hive-serde-1.1.0-cdh5.14.0.jar /usr/lib/impala/lib/hive-serde.jar

    sudo ln -s $HIVE_HOME/lib/hive-service-1.1.0-cdh5.14.0.jar /usr/lib/impala/lib/hive-service.jar

    sudo ln -s $HIVE_HOME/lib/hive-shims-common-1.1.0-cdh5.14.0.jar /usr/lib/impala/lib/hive-shims-common.jar

    sudo ln -s $HIVE_HOME/lib/hive-shims-1.1.0-cdh5.14.0.jar /usr/lib/impala/lib/hive-shims.jar

    sudo ln -s $HIVE_HOME/lib/hive-shims-scheduler-1.1.0-cdh5.14.0.jar /usr/lib/impala/lib/hive-shims-scheduler.jar

    sudo ln -snf $HIVE_HOME/lib/parquet-hadoop-bundle-1.5.0-cdh5.14.0.jar /usr/lib/impala/lib/parquet-hadoop-bundle.jar

Other JARs:

    sudo ln -snf $HIVE_HOME/lib/hbase-annotations-1.2.0-cdh5.14.0.jar /usr/lib/impala/lib/hbase-annotations.jar

    sudo ln -snf $HIVE_HOME/lib/hbase-common-1.2.0-cdh5.14.0.jar /usr/lib/impala/lib/hbase-common.jar

    sudo ln -snf $HIVE_HOME/lib/hbase-protocol-1.2.0-cdh5.14.0.jar /usr/lib/impala/lib/hbase-protocol.jar

    sudo ln -snf /data/CDH/impala-2.11.0-cdh5.14.0/thirdparty/hbase-1.2.0-cdh5.14.0/lib/hbase-client-1.2.0-cdh5.14.0.jar /usr/lib/impala/lib/hbase-client.jar

    sudo ln -snf /data/CDH/impala-2.11.0-cdh5.14.0/thirdparty/hadoop-2.6.0-cdh5.14.0/lib/native/libhadoop.so /usr/lib/impala/lib/libhadoop.so

    sudo ln -snf /data/CDH/impala-2.11.0-cdh5.14.0/thirdparty/hadoop-2.6.0-cdh5.14.0/lib/native/libhadoop.so.1.0.0 /usr/lib/impala/lib/libhadoop.so.1.0.0

    sudo ln -snf /data/CDH/impala-2.11.0-cdh5.14.0/thirdparty/hadoop-2.6.0-cdh5.14.0/lib/native/libhdfs.so /usr/lib/impala/lib/libhdfs.so

    sudo ln -snf /data/CDH/impala-2.11.0-cdh5.14.0/thirdparty/hadoop-2.6.0-cdh5.14.0/lib/native/libhdfs.so.0.0.0 /usr/lib/impala/lib/libhdfs.so.0.0.0

Additional notes

If anything was missed and some JAR symlinks still do not work, search the whole server for the JAR; if it does not exist, download it from Maven, put it in some directory, and symlink to it.

For example, the hadoop-core.jar symlink turned out to still be broken.

find / -name "hadoop-core*"  # locate the JAR with this command, then ln -s <source> <link> as needed
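A quick way to list every symlink under /usr/lib/impala/lib whose target is missing (a small sketch; with -L, find follows links, so anything still reported as type l is broken):

find -L /usr/lib/impala/lib -type l   # any output here is a broken symlink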

4.9 Start Impala

    service impala-state-store start

    service impala-catalog start

    service impala-server start

Running service impala-state-store start failed:

    [root@sunny conf]# service impala-state-store start

    /etc/init.d/impala-state-store: line 34: /lib/lsb/init-functions: No such file or directory

    /etc/init.d/impala-state-store: line 104: log_success_msg: command not found

redhat-lsb needs to be installed.

Download page: http://rpmfind.net/linux/rpm2html/search.php?query=redhat-lsb-core(x86-64)

Choose redhat-lsb-core-4.1-27.el7.centos.1.x86_64.rpm.

     

After downloading, run:

    rpm -ivh redhat-lsb-core-4.1-27.el7.centos.1.x86_64.rpm

Continue starting Impala:

    service impala-state-store start

    service impala-catalog start

service impala-server start  # start impalad on every DataNode host

Once all the services are up, you can start using Impala.

4.10 Use Impala

Enter the Impala shell:

impala-shell  # can be run from any directory; basic usage is the same as Hive
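impala-shell can also run a statement non-interactively with -q, which is handy for a quick check; for example, assuming the test.user_test table from section 3.10 exists and the metadata has been refreshed:

impala-shell -i 172.20.71.66 -q "show databases"
impala-shell -i 172.20.71.66 -q "select * from test.user_test"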

impalad web UI:

    http://172.20.71.66:25000/

statestored web UI:

    http://172.20.71.66:25010/

catalogd web UI:

    http://172.20.71.66:25020

Impala log files: /var/log/impala

Impala configuration files: /etc/default/impala

    /etc/impala/conf  (core-site.xml, hdfs-site.xml, hive-site.xml)

4.11 Troubleshooting the Impala installation

    4.11.1 FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient

Fix:

Reference: http://blog.csdn.net/l1028386804/article/details/51566262

Go to the conf directory of your Hive install and open hive-site.xml (if you never modified it, the template is hive-default.xml.template).

    <property>

      <name>hive.metastore.schema.verification</name>

      <value>true</value>

       <description>

       Enforce metastore schema version consistency.

       True: Verify that version information stored in metastore matches with one from Hive jars.  Also disable automatic

             schema migration attempt. Users are required to manully migrate schema after Hive upgrade which ensures

             proper metastore schema migration. (Default)

       False: Warn if the version information stored in metastore doesn't match with one from in Hive jars.

       </description>

    </property>

Change it to:

    <property>

      <name>hive.metastore.schema.verification</name>

      <value>false</value>

       <description>

       Enforce metastore schema version consistency.

       True: Verify that version information stored in metastore matches with one from Hive jars.  Also disable automatic

             schema migration attempt. Users are required to manully migrate schema after Hive upgrade which ensures

             proper metastore schema migration. (Default)

       False: Warn if the version information stored in metastore doesn't match with one from in Hive jars.

       </description>

    </property>

Then restart Hive:

    hive --service metastore &  

    hive --service hiveserver2 &

    4.11.2 Invalid short-circuit reads configuration:- Impala cannot read or execute the parent directory of dfs.domain.socket.path

Fix: create the directory that dfs.domain.socket.path in /etc/impala/conf/hdfs-site.xml refers to:

     

    cd /var/run

    mkdir hdfs-sockets

    cd hdfs-sockets

    mkdir dn
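The directories above have to match what the Impala copy of hdfs-site.xml declares; a sketch of the corresponding short-circuit read properties, assuming the /var/run/hdfs-sockets/dn path created here:

<property>
    <name>dfs.client.read.shortcircuit</name>
    <value>true</value>
</property>
<property>
    <name>dfs.domain.socket.path</name>
    <value>/var/run/hdfs-sockets/dn</value>
</property>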

4.11.3 Impala INSERT fails: NoClassDefFoundError: org/apache/hadoop/fs/adl/AdlFileSystem

    [localhost.localdomain:21000] > insert into user_test values('zs');

    Query: insert into user_test values('zs')

    Query submitted at: 2020-12-17 02:39:24 (Coordinator: http://sunny:25000)

    ERROR: AnalysisException: Failed to load metadata for table: test.user_test. Running 'invalidate metadata test.user_test' may resolve this problem.

    CAUSED BY: NoClassDefFoundError: org/apache/hadoop/fs/adl/AdlFileSystem

    CAUSED BY: TableLoadingException: Failed to load metadata for table: test.user_test. Running 'invalidate metadata test.user_test' may resolve this problem.

    CAUSED BY: NoClassDefFoundError: org/apache/hadoop/fs/adl/AdlFileSystem

Fix:

Impala does not reference hadoop-azure-datalake.jar; add the missing symlink:

    sudo ln -s /data/CDH/hadoop-2.6.0-cdh5.14.0/share/hadoop/tools/lib/hadoop-azure-datalake-2.6.0-cdh5.14.0.jar /usr/lib/impala/lib/hadoop-azure-datalake.jar

Hadoop core-site.xml configuration reference: http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/core-default.xml

That wraps up the Hadoop, Hive, and Impala installation!

-- ===================== Update 2020-12-28 =====================

    ERROR: AnalysisException: Unable to INSERT into target table (Test.user_test) because Impala does not have WRITE access to at least one HDFS path: hdfs://172.20.71.66:9000/user/hive/warehouse/test.db/user_test
Fix:
    vi /etc/selinux/config
    SELINUX=disabled

    vi /data/CDH/hive-1.1.0-cdh5.14.0/conf/hive-site.xml
add:
    <property>
    <name>hive.metastore.warehouse.dir</name>
<!-- set the warehouse location here -->
    <value>/user/hive/warehouse</value>
    <description>location of default database for the warehouse</description>
    </property>

    hadoop fs -chmod 777 /user/hive/warehouse/test.db/user_test

    hive --service metastore &
    hive --service hiveserver2 &

    service impala-state-store restart
    service impala-catalog restart
    service impala-server restart

    impala-shell
    invalidate metadata


    impala-shell --auth_creds_ok_in_clear -l -i 172.20.71.66 -u APP
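After the permission change and metadata refresh, the INSERT from 4.11.3 should go through; a quick re-check inside impala-shell:

use test;
insert into user_test values('zs');
select * from user_test;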

    -- ============================================================================
