zoukankan      html  css  js  c++  java
  • Hive Bug修复:ORC表中array数据类型长度超过1024报异常

    目前HVIE里查询如下语句报错:

    select * from dw.ticket_user_mtime limit 10;

    错误如下:

    17/07/06 16:45:38 [main]: DEBUG impl.RecordReaderImpl: merge = [{data range [22733, 19927580), size: 19904847 type: array-backed}]
    Failed with exception java.io.IOException:java.lang.ArrayIndexOutOfBoundsException: 1024
    17/07/06 16:45:38 [main]: ERROR CliDriver: Failed with exception java.io.IOException:java.lang.ArrayIndexOutOfBoundsException: 1024
    java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 1024
    at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:517)
    at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:424)
    at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:144)
    at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1885)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:252)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399)
    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
    Caused by: java.lang.ArrayIndexOutOfBoundsException: 1024
    at org.apache.orc.impl.RunLengthIntegerReaderV2.nextVector(RunLengthIntegerReaderV2.java:369)
    at org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.commonReadByteArrays(TreeReaderFactory.java:1231)
    at org.apache.orc.impl.TreeReaderFactory$BytesColumnVectorUtil.readOrcByteArrays(TreeReaderFactory.java:1268)
    at org.apache.orc.impl.TreeReaderFactory$StringDirectTreeReader.nextVector(TreeReaderFactory.java:1368)
    at org.apache.orc.impl.TreeReaderFactory$StringTreeReader.nextVector(TreeReaderFactory.java:1212)
    at org.apache.orc.impl.TreeReaderFactory$ListTreeReader.nextVector(TreeReaderFactory.java:1902)
    at org.apache.orc.impl.TreeReaderFactory$StructTreeReader.nextBatch(TreeReaderFactory.java:1737)
    at org.apache.orc.impl.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1045)
    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.ensureBatch(RecordReaderImpl.java:77)
    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.hasNext(RecordReaderImpl.java:89)
    at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:230)
    at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:205)
    at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:484)
    ... 15 more

    hive 补丁位置https://issues.apache.org/jira/browse/HIVE-14483

    该BUG触发的条件是,orc表中,array数据类型,并且当array这个字段中数组的元素超过1024个。

    可以通过打补丁的方式修复,重新编译hive的源码得到 hive-exec-2.1.0.jar 以及hive-orc-2.1.0.jar,

    将这两个jar包更新到所有线上的hive客户端的lib,然后需要重启hive相关的服务(轮流重启hive metasotre,然后轮流重启hiveserver2)。

  • 相关阅读:
    jQuery EasyUI API 中文文档 数字框(NumberBox)
    jQuery EasyUI API 中文文档 数值微调器(NumberSpinner)
    jQuery EasyUI API 中文文档 日期时间框(DateTimeBox)
    jQuery EasyUI API 中文文档 微调器(Spinner)
    jQuery EasyUI API 中文文档 树表格(TreeGrid)
    jQuery EasyUI API 中文文档 树(Tree)
    jQuery EasyUI API 中文文档 属性表格(PropertyGrid)
    EntityFramework 数据操作
    jQuery EasyUI API 中文文档 对话框(Dialog)
    jQuery EasyUI API 中文文档 组合表格(ComboGrid)
  • 原文地址:https://www.cnblogs.com/honeybee/p/8668771.html
Copyright © 2011-2022 走看看