zoukankan      html  css  js  c++  java
  • problems_hive

    problems_hive

    1 hive执行如下截图中的语句时卡住了


    原因:yarn未启动,hive底层是要提交mapreduce到yarn上才能计算结果的。

    RCA: 之前启动yarn时,未执行jps查看是否已经启动。其实未启动成功,具体原因未知:

    [root@node01 ~]# start-yarn.sh  
    starting yarn daemons resourcemanager running as process 3220. Stop it first.  
    

    solution: 重新启动dfs、yarn、jobhistory。

    2 执行jar包报错:udfFull.jar中没有主清单属性

    在windows系统的cmd命令行窗口中执行: java -jar udfFull.jar {"movie":"1287","rate":"5","timeStamp":"978302039","uid":"1"}
    报错1:udfFull.jar中没有主清单属性
    原因:jar包中的META-INF文件夹下的MANIFEST.MF文件缺少定义jar接口类。
    说白了就是没有指定class类。这里说明一下MANIFEST.MF就是一个清单文件,通俗点将就相当于WINDOWS中ini配置文件。用来配置程序的一些信息。
    解决方法:在MANIFEST.MF文件中添加一个入口类,添加这一行:Main-Class: TransferJsonUDF (TransferJsonUDF是含有main方法的入口类)

    继续执行,报另一个错误:找不到或无法加载主类 TransferJsonUDF
    原因:入口类配置得不对。
    解决方法:入口类配置修改为: Main-Class: cn.itcast.hive.udf.TransferJsonUDF (即在入口类前面添加其类路径 cn.itcast.hive.udf.)

    再次运行,成功:
    D:> java -jar udfFull.jar {"movie":"1287","rate":"5","timeStamp":"978302039","uid":"1"}
    1287 5 978302039 1

    另外,linux中执行该命令,如果后面带参数,则需要在参数两边加上单引号''. 否则执行可能有异常,可以这么写:
    java -jar udfFull.jar '{"movie":"914","rate":"3","timeStamp":"978301968","uid":"1"}'

    3 beeline连接hive数据库报错

    beeline> !connect jdbc:hive2://cdh03:10000 root root org.apache.hive.jdbc.HiveDriver 
    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/usr/develop/apache-hive-2.1.1-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
    Connecting to jdbc:hive2://cdh03:10000
    19/12/16 20:53:02 [main]: WARN jdbc.HiveConnection: Failed to connect to cdh03:10000
    Error: Could not open client transport with JDBC Uri: jdbc:hive2://cdh03:10000: Failed to open new session: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): <font color=red>User: root is not allowed to impersonate root (state=08S01,code=0)</font>  
    

    RCA: hadoop未设置超级代理。
    hadoop引入了一个安全伪装机制,使得hadoop 不允许上层系统直接将实际用户传递到hadoop层,而是将实际用户传递给一个超级代理,由此代理在hadoop上执行操作,避免任意客户端随意操作hadoop,
    solution:
    在hadoop的配置文件core-site.xml中添加如下属性:

    <property>
            <name>hadoop.proxyuser.root.hosts</name>
            <value>*</value>
    </property>
    <property>
            <name>hadoop.proxyuser.root.groups</name>
            <value>*</value>
    </property>
    

    就将上面配置hadoop.proxyuser.xxx.hosts和hadoop.proxyuser.xxx.groups中的xxx设置为root(即你的错误日志中显示的User:xxx为什么就设置为什么)。
    “*”表示可通过超级代理“xxx”操作hadoop的用户、用户组和主机。重启hdfs。
    参考链接:https://blog.csdn.net/qq_16633405/article/details/82190440
    注意:如果要使用beeline连接hive的语法格式为:
    !connect jdbc:hive2://cdh01:10000 username password org.apache.hive.jdbc.HiveDriver
    使用beeline有2个前提条件:

    1. mysql中创建好hive的元数据的数据库
    2. 事先启动hiveserver2服务:
      • CDH版本启动命令:service hive-server2 start
      • apache版本启动命令:nohup hive --service hiveserver2 2>&1 &

    4 运行hive命令报错

    [root@cdh01 conf]# hive
    Cannot find hadoop installation: $HADOOP_HOME or $HADOOP_PREFIX must be set or hadoop must be in the path
    原因:未设置环境变量HADOOP_HOME.
    解决方法:
    1.vim /etc/hive/conf/hive-env.sh
    在该文件中添加语句:export HADOOP_HOME=/usr/lib/hadoop
    注意:CDH版本的hadoop的工作目录是/usr/lib/hadoop

    5 hive中创建和hbase的整合的表失败报错:

    执行语句:
    create table course.hbase_score(id int,cname string,score int) stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' with serdeproperties("hbase.columns.mapping" = "cf:name,cf:score") tblproperties("hbase.table.name" = "hbase_score");
    报错语句:
    FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. org.apache.hadoop.hbase.client.HBaseAdmin.(Lorg/apache/hadoop/conf/Configuration;)V
    排查步骤:
    1.运行 hive -hiveconf hive.root.logger=DEBUG,console,以便查看详细错误日志。然后再次执行建表语句,报如下详细的错误:

    ERROR exec.DDLTask: java.lang.NoSuchMethodError: org.apache.hadoop.hbase.client.HBaseAdmin.<init>(Lorg/apache/hadoop/conf/Configuration;)V
    	at org.apache.hadoop.hive.hbase.HBaseStorageHandler.getHBaseAdmin(HBaseStorageHandler.java:115)
    	at org.apache.hadoop.hive.hbase.HBaseStorageHandler.preCreateTable(HBaseStorageHandler.java:195)
    	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:727)
    	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:720)
    	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    	at java.lang.reflect.Method.invoke(Method.java:498)
    	at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:105)
    	at com.sun.proxy.$Proxy18.createTable(Unknown Source)
    	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    	at java.lang.reflect.Method.invoke(Method.java:498)
    	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2126)
    	at com.sun.proxy.$Proxy18.createTable(Unknown Source)
    	at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:784)
    	at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4202)
    	at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:311)
    	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:214)
    	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:99)
    	at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2054)
    	at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1750)
    	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1503)
    	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1287)
    	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1277)
    	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:226)
    	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:175)
    	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:389)
    	at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:781)
    	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:699)
    	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:634)
    	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    	at java.lang.reflect.Method.invoke(Method.java:498)
    	at org.apache.hadoop.util.RunJar.run(RunJar.java:226)
    	at org.apache.hadoop.util.RunJar.main(RunJar.java:141)  
    

    怀疑是hive和hbase版本不兼容,hive是CDH版本,hbase是apache开源版本,所以在cdh03上重新安装了一个apache开源版本的hive。解决。
    小结:cdh01中是CDH版本的hive,cdh03中是apache开源版本的hive。

    6 hive执行sql报错

    desc:执行sql语句,join两个子查询
    errorlog:

    FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
    Caused by: java.lang.IllegalStateException: Invalid input path hdfs://node1:8020/user/hive/warehouse/weblog.db/ods_click_stream_visit/datestr=20181101/visit
    

    action&solution:
    检查发现日志中说无效的路径,其实是存在的。
    检查发现hive本地模式开启了,尝试关闭本地模式 set hive.exec.mode.local.auto=false;
    再次尝试,执行成功。

    7 创建hive自定义函数报错

    desc:
    执行下面语句报错: add jar /develop/weblog/0210udf-1.0.jar;
    create temporary function ua_parse as 'com.udf.UAParseUDF';
    errorlog:

    FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.FunctionTask. Invalid signature file digest for Manifest main attributes
    

    solution:
    方法1. 删除jar包中的META-INF/.RSA META-INF/.DSA META-INF/*.SF 文件:

    zip -d 0210udf-1.0.jar META-INF/*.RSA META-INF/*.DSA META-INF/*.SF
    

    方法2. 打包时排除以上的需要被删除的文件:

    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <version>2.4.3</version>
      <executions>
        <execution>
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
          <configuration>
            <minimizeJar>true</minimizeJar>
            <filters>
              <filter>
                <artifact>*:*</artifact>
                <excludes>
                  <exclude>META-INF/*.SF</exclude>
                  <exclude>META-INF/*.DSA</exclude>
                  <exclude>META-INF/*.RSA</exclude>
                </excludes>
              </filter>
            </filters>
          </configuration>
        </execution>
      </executions>
    </plugin>
    

    8 beeline连接hive报错

    errorlog:

    java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): 
    User root is not allowed to impersonate anonymous 错误。
    

    solution: 修改hadoop 配置文件 .../etc/hadoop/core-site.xml,加入如下配置项

    <property>
        <name>hadoop.proxyuser.root.hosts</name>
        <value>*</value>
    </property>
    <property>
        <name>hadoop.proxyuser.root.groups</name>
        <value>*</value>
    </property>
    

    9

  • 相关阅读:
    可视化工具Grafana:简介及安装
    数据采集工具Telegraf:简介及安装
    怒怼某些自媒体培训机构,吃相不要太难看了!!!
    时序数据库InfluxDB:简介及安装
    jmeter(二十五)linux环境运行jmeter并生成报告
    Linux:CentOS7.4新建用户并授权
    服务端监控工具:Nmon使用方法
    Locust:简介和基本用法
    Quant Finance Master’s Guide 2020
    数据科学入门前需要知道的10件事
  • 原文地址:https://www.cnblogs.com/mediocreWorld/p/15145713.html
Copyright © 2011-2022 走看看