  • Spark SQL data warehouse error: Metastore contains multiple versions

    Spark is 2.1.0, Hadoop is 2.7.1, and the metadata is stored in MySQL. The exception was:

    Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
        at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:346)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
    Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
        at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1412)
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:62)
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:72)
        at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2453)
        at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2465)
        at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:340)
        ... 7 more
    Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1410)
        ... 12 more
    Caused by: MetaException(message:Metastore contains multiple versions)
        at org.apache.hadoop.hive.metastore.ObjectStore.getMSchemaVersion(ObjectStore.java:6368)
        at org.apache.hadoop.hive.metastore.ObjectStore.getMetaStoreSchemaVersion(ObjectStore.java:6330)
        at org.apache.hadoop.hive.metastore.ObjectStore.checkSchema(ObjectStore.java:6289)
        at org.apache.hadoop.hive.metastore.ObjectStore.verifySchema(ObjectStore.java:6277)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:108)
        at com.sun.proxy.$Proxy9.verifySchema(Unknown Source)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:476)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:523)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:397)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.<init>(HiveMetaStore.java:356)
        at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:54)
        at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:59)
        at org.apache.hadoop.hive.metastore.HiveMetaStore.newHMSHandler(HiveMetaStore.java:4944)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:171)
        ... 17 more

    The error says the Hive metastore contains multiple versions. Checking the VERSION table in Hive's metastore database showed one extra row:

    select * from VERSION;
    
    1    1.1.0    Set by MetaStore hadoop@10.252.97.244
    2    1.1.0    Set by MetaStore hadoop@10.252.97.244 #this is the extra row
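If the extra row has to be cleaned up by hand, the selection can be sketched in Java as below. This is a minimal sketch and not part of the original investigation. It assumes the VERSION columns are VER_ID, SCHEMA_VERSION and VERSION_COMMENT, and that the rows differ only by id, as in the output above. The names VersionRowCleanup, surplusIds and deleteStatement are hypothetical. The code only builds a statement for review; back up the metastore database before running anything like it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Hive: decides which VERSION rows are surplus.
final class VersionRowCleanup {

    // One row of the VERSION table (assumed columns: VER_ID, SCHEMA_VERSION, VERSION_COMMENT).
    static final class VersionRow {
        final long verId;
        final String schemaVersion;
        final String comment;

        VersionRow(long verId, String schemaVersion, String comment) {
            this.verId = verId;
            this.schemaVersion = schemaVersion;
            this.comment = comment;
        }
    }

    // Keeps the row with the smallest VER_ID and returns the ids of all others, ascending.
    static List<Long> surplusIds(List<VersionRow> rows) {
        List<Long> ids = new ArrayList<>();
        for (VersionRow row : rows) {
            ids.add(row.verId);
        }
        Collections.sort(ids);
        if (ids.isEmpty()) {
            return ids;
        }
        return new ArrayList<>(ids.subList(1, ids.size()));
    }

    // Builds a DELETE statement to review by hand; returns "" when nothing is surplus.
    static String deleteStatement(List<Long> surplus) {
        if (surplus.isEmpty()) {
            return "";
        }
        StringBuilder sql = new StringBuilder("DELETE FROM VERSION WHERE VER_ID IN (");
        for (int i = 0; i < surplus.size(); i++) {
            if (i > 0) {
                sql.append(", ");
            }
            sql.append(surplus.get(i));
        }
        return sql.append(");").toString();
    }

    public static void main(String[] args) {
        List<VersionRow> rows = new ArrayList<>();
        rows.add(new VersionRow(1, "1.1.0", "Set by MetaStore hadoop@10.252.97.244"));
        rows.add(new VersionRow(2, "1.1.0", "Set by MetaStore hadoop@10.252.97.244"));
        System.out.println(deleteStatement(surplusIds(rows)));
    }
}
```

Keeping the lowest id is an arbitrary choice for identical rows. If the SCHEMA_VERSION values differ, decide by hand which row matches the installed schema.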

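    Troubleshooting

    Background research

    Searching turned up other reports of this problem, for example HIVE-9543. The fix most often suggested is:

    Set datanucleus.autoCreateSchema=false
    The official documentation describes this setting as follows: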
        Default Value: true
        Added In: Hive 0.7.0
        Removed In: Hive 2.0.0 with HIVE-6113, replaced by datanucleus.schema.autoCreateAll
        Creates necessary schema on a startup if one does not exist. Set this to false, after creating it once.
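    #In other words, this parameter is used when the Hive metastore schema is first initialized; after that it can be set to false

    After setting this parameter to false I kept watching, and the error still came back.

    Checking the logs and the error

    #The Hive runtime log contains the following line whenever the duplicate version appears
    Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version
    #That is: no version information was found in the metastore, and because hive.metastore.schema.verification is not enabled, the schema version is recorded, i.e. a row is inserted into the VERSION table
    #Combined with the earlier exception, the failing class is: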
    Caused by: MetaException(message:Metastore contains multiple versions)
        at org.apache.hadoop.hive.metastore.ObjectStore.getMSchemaVersion(ObjectStore.java:6368)

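    The exception arises because starting a Hive command checks the VERSION table in the Hive metadata. If the schema version cannot be read (possible causes: a metastore database fault, a network fault, or a burst of jobs in a short period, any of which can make the lookup come back empty), the code looks at hive.metastore.schema.verification. When it is true, a MetaException is thrown immediately. When it is false, a warning is logged and a version row is inserted, which is how multiple version rows accumulate and break later jobs. The code below is from the ObjectStore class in the hive-metastore package.

    Reading the source code

    The relevant code is as follows:

    ##The synchronized method checkSchema()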
    private synchronized void checkSchema() throws MetaException {
        // recheck if it got verified by another thread while we were waiting
        if (isSchemaVerified.get()) {
          return;
        }
    
        // read the Hive configuration, i.e. the value of hive.metastore.schema.verification
        boolean strictValidation =
          HiveConf.getBoolVar(getConf(), HiveConf.ConfVars.METASTORE_SCHEMA_VERIFICATION);
        // read the schema version stored in metastore db
        String schemaVer = getMetaStoreSchemaVersion();
        if (schemaVer == null) {
          // version not found: when strictValidation is true, throw immediately
          if (strictValidation) {
            throw new MetaException("Version information not found in metastore. ");
          } else {
            // otherwise insert the version row, which is what the log line above reports
            LOG.warn("Version information not found in metastore. "
                + HiveConf.ConfVars.METASTORE_SCHEMA_VERIFICATION.toString() +
                " is not enabled so recording the schema version " +
                MetaStoreSchemaInfo.getHiveSchemaVersion());
            setMetaStoreSchemaVersion(MetaStoreSchemaInfo.getHiveSchemaVersion(),
              "Set by MetaStore " + USER + "@" + HOSTNAME);
          }
        }
        // ... (remainder of checkSchema() omitted in this excerpt)
    }

    ##The setMetaStoreSchemaVersion method is as follows
    public void setMetaStoreSchemaVersion(String schemaVersion, String comment) throws MetaException {
        MVersionTable mSchemaVer;
        boolean commited = false;
        // this setting controls whether version information is recorded
        boolean recordVersion =
          HiveConf.getBoolVar(getConf(), HiveConf.ConfVars.METASTORE_SCHEMA_VERIFICATION_RECORD_VERSION);
        // if it is false, return without recording; otherwise the version row is written
        if (!recordVersion) {
          LOG.warn("setMetaStoreSchemaVersion called but recording version is disabled: " +
            "version = " + schemaVersion + ", comment = " + comment);
          return;
        }
    
        try {
          mSchemaVer = getMSchemaVersion();
        } catch (NoSuchObjectException e) {
          // if the version doesn't exist, then create it
          mSchemaVer = new MVersionTable();
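        }
        // ... (remainder of setMetaStoreSchemaVersion() omitted in this excerpt)
    }

    ##Looking up METASTORE_SCHEMA_VERIFICATION_RECORD_VERSION in HiveConf shows that hive.metastore.schema.verification.record.version defaults to true, so recording the version is allowed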
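The flow in the two excerpts can be replayed with a small in-memory model. This is a sketch under stated assumptions, not Hive code: SchemaCheckModel, its fields and the readMisses switch are hypothetical. The switch stands in for a lookup that fails to see the existing row, and the model assumes the second lookup inside setMetaStoreSchemaVersion misses as well. Under those assumptions, with verification off and recording on, a single missed lookup leaves two rows behind, and the next lookup fails.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory model of the version check; it does not use any Hive classes.
final class SchemaCheckModel {
    final List<String> versionRows = new ArrayList<>(); // stands in for the VERSION table
    boolean strictValidation = false; // models hive.metastore.schema.verification
    boolean recordVersion = true;     // models hive.metastore.schema.verification.record.version
    boolean readMisses = false;       // assumption: simulates a lookup that finds no row

    // Returns the stored version, null if none is visible, and fails on more than one row.
    String readVersion() {
        if (readMisses || versionRows.isEmpty()) {
            return null;
        }
        if (versionRows.size() > 1) {
            throw new IllegalStateException("Metastore contains multiple versions");
        }
        return versionRows.get(0);
    }

    // Mirrors the branch structure of the excerpts: throw, skip, or insert.
    void checkSchema(String hiveVersion) {
        if (readVersion() != null) {
            return;
        }
        if (strictValidation) {
            throw new IllegalStateException("Version information not found in metastore.");
        }
        if (!recordVersion) {
            return; // recording disabled: nothing is written
        }
        versionRows.add(hiveVersion); // assumed: the second lookup also missed the row
    }

    public static void main(String[] args) {
        SchemaCheckModel model = new SchemaCheckModel();
        model.versionRows.add("1.1.0");
        model.readMisses = true; // one lookup misses the existing row
        model.checkSchema("1.1.0");
        System.out.println("rows after missed lookup: " + model.versionRows.size());
        model.readMisses = false;
        try {
            model.checkSchema("1.1.0");
        } catch (IllegalStateException e) {
            System.out.println("next start fails: " + e.getMessage());
        }
    }
}
```

Setting recordVersion to false in the model leaves the table at one row after the same missed lookup, which matches the early return in setMetaStoreSchemaVersion.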

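    Solution

    The source code makes the options clear. There are several ways to stop the extra row from being written; I chose to set hive.metastore.schema.verification.record.version to false.
    The other switch in this code path is hive.metastore.schema.verification: per the code above, setting it to true also blocks the insert, but a failed version lookup then throws a MetaException instead of being tolerated.

    Open questions

      1. Reports online suggest the multiple-versions problem is triggered by concurrency or network issues. Why the code fails to read the schema version from the metastore in the first place is still unexplained and needs further digging in the source.
      2. Several parameters can each block the version-recording step in this code path. Which of them is free of side effects still needs investigation.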
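On the concurrency angle in the first question, note that checkSchema() is synchronized, and a Java synchronized method only serializes callers on the same object inside one JVM. Separate processes would each run their own check-then-insert against the shared database. The sketch below is a hypothetical illustration of that pattern and not Hive code: VersionInsertRace is an invented name, two threads stand in for two processes, and a CyclicBarrier forces the unlucky interleaving. It does not explain why a lookup would miss a row that already exists.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CyclicBarrier;

// Hypothetical illustration of a check-then-insert race; no Hive classes involved.
final class VersionInsertRace {

    // Runs two clients against one shared "table" and returns the rows they leave behind.
    static List<String> run() {
        final List<String> versionTable = new CopyOnWriteArrayList<>(); // shared database stand-in
        final CyclicBarrier bothHaveRead = new CyclicBarrier(2);

        Runnable client = new Runnable() {
            @Override
            public void run() {
                boolean versionMissing = versionTable.isEmpty(); // step 1: check
                try {
                    bothHaveRead.await(); // force both checks to finish before any insert
                } catch (Exception e) {
                    throw new IllegalStateException(e);
                }
                if (versionMissing) {
                    versionTable.add("1.1.0"); // step 2: insert based on the stale check
                }
            }
        };

        Thread first = new Thread(client);
        Thread second = new Thread(client);
        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return versionTable;
    }

    public static void main(String[] args) {
        System.out.println("rows after race: " + run().size());
    }
}
```

Both clients see an empty table before either writes, so each inserts a row. Whether this interleaving is what happened in the incident above remains an open question.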
  • Original post: https://www.cnblogs.com/itboys/p/10134288.html