zoukankan      html  css  js  c++  java
  • Spark-SQL连接Hive

    第一步修个Hive的配置文件hive-site.xml

      添加如下属性,取消本地元数据服务:

    <property>  
      <name>hive.metastore.local</name>  
      <value>false</value>  
    </property>

      修改Hive元数据服务地址和端口:

    <property>
      <name>hive.metastore.uris</name>
      <value>thrift://192.168.10.10:9083</value>
      <description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
    </property>

      然后把配置文件hive-site.xml拷贝到Spark的conf目录下

     

    第二步:对于Hive元数据库使用Mysql的把mysql-connector-java-5.1.41-bin.jar拷贝到Spark的jar目录下

      到这里已经能够在Scala终端下查询Hive数据库了

      但是某人一开始的要求是用Spark-SQL查询Hive呀

      于是启动Spark-SQL,启了一天了都是报下面的错误

    Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
        at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
        at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:114)
        at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:738)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
    Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
        at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1523)
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:86)
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132)
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
        at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3005)
        at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3024)
        at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:503)
        ... 11 more
    Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1521)
        ... 17 more
    Caused by: MetaException(message:Version information not found in metastore. )
        at org.apache.hadoop.hive.metastore.ObjectStore.checkSchema(ObjectStore.java:6664)
        at org.apache.hadoop.hive.metastore.ObjectStore.verifySchema(ObjectStore.java:6645)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:114)
        at com.sun.proxy.$Proxy6.verifySchema(Unknown Source)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:572)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:620)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:461)
        at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:66)
        at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:72)
        at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:5762)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:199)
        at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
        ... 22 more

    一开始我查这个bug都是用第一行的报错信息查,都没成功,后面搜了下最后一个报错信息

    message:Version information not found in metastore

    终于找到问题解决方法了,把hive-site.xml中的hive.metastore.schema.verification的值改为false

    <property>
      <name>hive.metastore.schema.verification</name>
      <value>false</value>
      <description>
          Enforce metastore schema version consistency.
          True: Verify that version information stored in is compatible with one from Hive jars.  Also disable automatic
                schema migration attempt. Users are required to manually migrate schema after Hive upgrade which ensures
                proper metastore schema migration. (Default)
          False: Warn if the version information stored in metastore doesn't match with one from in Hive jars.
      </description>
    </property>

    原因应该是Hive的jar包和存储元数据信息版本不一致,这里设置不验证就可以了。


     参考博客:http://www.cnblogs.com/rocky-AGE-24/p/7345417.html

         http://blog.csdn.net/jyl1798/article/details/41087533

           http://dblab.xmu.edu.cn/blog/1086-2/

           http://blog.csdn.net/youngqj/article/details/19987727

  • 相关阅读:
    python之路day08--文件的操作
    python之路day07-集合set的增删查、列表如何排重(效率最高的方法)、深浅copy
    python之路day06-python2/3小区别,小数据池的概念,编码的进阶str转为bytes类型,编码和解码
    python之路day05--字典的增删改查,嵌套
    python之路day04--列表的增删改查,嵌套、元组的嵌套、range、for循环嵌套
    python之路day03--数据类型分析,转换,索引切片,str常用操作方法
    python之路day02--格式化输出、初始编码、运算符
    python之路day01--变量
    线程、进程、协程 异步io
    Ubuntu学习(转载)
  • 原文地址:https://www.cnblogs.com/vincent-vg/p/7587958.html
Copyright © 2011-2022 走看看