zoukankan      html  css  js  c++  java
  • 记一次hive版本升级

      之前的项目中hive版本用的是 apache-hive-1.2.2-bin,最近需要将hive版本升级到apache-hive-2.3.4-bin, 本来想着元数据在mysql中,直接部署好hive启动就好了,

    结果提示元数据未初始化,查找一下资料,发现不同版本hive的元数据信息有所不同,而官方也给了升级方法,这里记录一下。

      首先,找到我们要升级的hive,找到apache-hive-2.3.4-bin/scripts/metastore/upgrade/mysql  这个目录

     这个目录下有一个README,里面讲了如何升级

    Hive MetaStore Upgrade HowTo
    ============================
    
    This document describes how to upgrade the schema of a MySQL backed
    Hive MetaStore instance from one release version of Hive to another
    release version of Hive. For example, by following the steps listed
    below it is possible to upgrade a Hive 0.5.0 MetaStore schema to a
    Hive 0.7.0 MetaStore schema. Before attempting this project we
    strongly recommend that you read through all of the steps in this
    document and familiarize yourself with the required tools.
    
    MetaStore Upgrade Steps
    =======================
    
    1) Shutdown your MetaStore instance and restrict access to the
       MetaStore's MySQL database. It is very important that no one else
       accesses or modifies the contents of database while you are
       performing the schema upgrade.
    
    2) Create a backup of your MySQL metastore database. This will allow
       you to revert any changes made during the upgrade process if
       something goes wrong. The mysqldump utility is the easiest way to
       create a backup of a MySQL database:
    
       % mysqldump --opt <metastore_db_name> > metastore_backup.sql
    
       Note that you may need also need to specify a hostname and username
       using the --host and --user command line switches.
    
    3) Dump your metastore database schema to a file. We use the mysqldump
       utility again, but this time with a command line option that
       specifies we are only interested in dumping the DDL statements
       required to create the schema:
    
       % mysqldump --skip-add-drop-table --no-data <metastore_db_name> > my-schema-x.y.z.mysql.sql
    
    4) The schema upgrade scripts assume that the schema you are upgrading
       closely matches the official schema for your particular version of
       Hive. The files in this directory with names like
       "hive-schema-x.y.z.mysql.sql" contain dumps of the official schemas
       corresponding to each of the released versions of Hive. You can
       determine differences between your schema and the official schema
       by diffing the contents of the official dump with the schema dump
       you created in the previous step. Some differences are acceptable
       and will not interfere with the upgrade process, but others need to
       be resolved manually or the upgrade scripts will fail to complete.
    
      * Missing Tables: Hive's default configuration causes the MetaStore
        to create schema elements only when they are needed. Some tables
        may be missing from your MetaStore schema if you have not created
        the corresponding Hive catalog objects, e.g. the PARTITIONS table
        will probably not exist if you have not created any table
        partitions in your MetaStore. You MUST create these missing tables
        before running the upgrade scripts. The easiest way to do this is
        by executing the official schema DDL script against your
        schema. Each of the CREATE TABLE statements in the schema script
        include an IF NOT EXISTS clause, so tables which already exist in
        your schema will be ignored, and those which don't exist will get
        created.
    
      * Extra Tables: Your schema may include a table named NUCLEUS_TABLES
        or a table named SEQUENCE_TABLE. These tables are managed
        by the DataNucleus ORM layer and will be created automatically if
        they don't exist. No action on your part is required.
    
      * Reversed Column Constraint Names in the Same Table: Tables with
        multiple constraints may have the names of the constraints
        reversed. For example, the PARTITIONS table contains two foreign
        key constraints named PARTITIONS_FK1 and PARTITIONS_FK2 which
        reference SDS.SD_ID and TBLS.TBL_ID respectively. However, in your
        schema you may find that PARTITIONS_FK1 references TBLS.TBL_ID and
        PARTITIONS_FK2 references SDS.SD_ID. Either version is acceptable
        -- the only requirement is that these constraints actually exist.
    
      * Differences in Column/Constraint Names: Your schema may contain
        tables with columns named "IDX" or unique keys named
        "UNIQUE<tab_name>". If you find either of these in your schema you
        will need to change the names to "INTEGER_IDX" and
        "UNIQUE_<tab_name>" before running the upgrade scripts. For more
        background on this issue please refer to HIVE-1435.
    
    5) You are now ready to run the schema upgrade scripts. If you are
       upgrading from Hive 0.5.0 to Hive 0.6.0 you need to run the
       upgrade-0.5.0-to-0.6.0.mysql.sql script, but if you are upgrading
       from 0.5.0 to 0.7.0 you will need to run the 0.5.0 to 0.6.0 upgrade
       script followed by the 0.6.0 to 0.7.0 upgrade script.
    
       % mysql --verbose
       mysql> use <metastore_db_name>;
       Database changed
       mysql> source upgrade-0.5.0-to-0.6.0.mysql.sql
       mysql> source upgrade-0.6.0-to-0.7.0.mysql.sql
    
       These scripts should run to completion without any errors. If you
       do encounter errors you need to analyze the cause and attempt to
       trace it back to one of the preceding steps.
    
    6) The final step of the upgrade process is validating your freshly
       upgraded schema against the official schema for your particular
       version of Hive. This is accomplished by repeating steps (3) and
       (4), but this time comparing against the official version of the
       upgraded schema, e.g. if you upgraded the schema to Hive 0.7.0 then
       you will want to compare your schema dump against the contents of
       hive-schema-0.7.0.mysql.sql
    

      直接说主要步骤:

           一,最好先停止我们的hive相关服务,禁止访问hive的元数据 ,防止升级过程中其他应用使用修改数据。

           二,备份元数据

           三,开始升级,这里升级需要说明一下,hive提供的升级脚本是不能跨版本运行的,比如直接从1.2.2升级到2.3.4,所以我们需要根据它提供的脚本(脚本在scripts/metastore/upgrade/mysql),一个大版本,一个大版本升级

    如我这次  需要按照如下顺序执行

    upgrade-1.2.0-to-2.0.0.mysql.sql
    upgrade-2.0.0-to-2.1.0.mysql.sql
    upgrade-2.1.0-to-2.2.0.mysql.sql
    upgrade-2.2.0-to-2.3.0.mysql.sql

         执行完后,再打开我们的apache-hive-2.3.4-bin就可以了。

  • 相关阅读:
    java 数据结构容器之HashSet
    java LinkedList模拟堆栈
    Java实现windows7标准型计算器
    Java迭代器注意事项
    (动态规划求解)聪明的kk
    三张图讲清楚Java中的方法的参数传递机制
    Java对象在内存中的表示
    两张内存变化图揭秘二维数组的初始化
    四张內存变化图揭秘引用类型数组的初始化
    三张內存变化图揭秘基本类型数组的初始化
  • 原文地址:https://www.cnblogs.com/lly001/p/12198698.html
Copyright © 2011-2022 走看看