  • Hive job fails with "Error running child : java.lang.OutOfMemoryError: Java heap space"

    The full error log is as follows:

    2018-05-11 15:16:49,429 FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.OutOfMemoryError: Java heap space
    	at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
    	at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
    	at org.apache.hadoop.hive.ql.io.orc.OutStream.getNewInputBuffer(OutStream.java:107)
    	at org.apache.hadoop.hive.ql.io.orc.OutStream.write(OutStream.java:128)
    	at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerWriterV2.writeDeltaValues(RunLengthIntegerWriterV2.java:238)
    	at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerWriterV2.writeValues(RunLengthIntegerWriterV2.java:186)
    	at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerWriterV2.write(RunLengthIntegerWriterV2.java:788)
    	at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StringTreeWriter$1.visit(WriterImpl.java:1179)
    	at org.apache.hadoop.hive.ql.io.orc.StringRedBlackTree.recurse(StringRedBlackTree.java:152)
    	at org.apache.hadoop.hive.ql.io.orc.StringRedBlackTree.recurse(StringRedBlackTree.java:150)
    	at org.apache.hadoop.hive.ql.io.orc.StringRedBlackTree.recurse(StringRedBlackTree.java:153)
    	at org.apache.hadoop.hive.ql.io.orc.StringRedBlackTree.visit(StringRedBlackTree.java:163)
    	at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StringTreeWriter.flushDictionary(WriterImpl.java:1173)
    	at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StringTreeWriter.writeStripe(WriterImpl.java:1125)
    	at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1617)
    	at org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1997)
    	at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:2289)
    	at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:106)
    	at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:185)
    	at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:958)
    	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598)
    	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
    	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
    	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
    	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
    	at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:287)
    	at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:453)
    	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
    	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
    	at java.security.AccessController.doPrivileged(Native Method)
    	at javax.security.auth.Subject.doAs(Subject.java:422)
    	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
    

      

    After spending nearly a whole day on this, I finally found the solution here:

    https://community.hortonworks.com/questions/37603/i-am-getting-outofmemory-while-inserting-the-data.html

    My Hive table uses the ORCFile storage format, and this format appears to run into memory limits when writing to many partitions.

    As suggested in that thread, I ran set orc.compress.size = 8192; and found that once the partitions were made smaller, the script ran successfully. However, this still did not completely solve my problem.
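
    The tuning from the thread can be sketched as the Hive session settings below. The first line is the change the post actually describes; the second is an additional, commonly documented ORC writer knob that I did not try here, included only as a hint (an assumption, not part of the original fix):

    -- From the linked thread: shrink the ORC compression chunk size
    -- (default 262144 bytes); the ORC writer keeps one such buffer per
    -- column stream per open file, so smaller chunks mean less heap used.
    SET orc.compress.size = 8192;

    -- Assumed extra knob (not tested in this post): the fraction of the
    -- JVM heap that all ORC writers in a task may share; default is 0.5.
    SET hive.exec.orc.memory.pool = 0.5;

    With dynamic partitioning, every open partition holds its own set of these buffers, which is why writing many partitions at once inflates heap usage.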

    Since my time was limited and I could not dig into the root cause, I simply changed the table's storage format to text, which resolved the problem.
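
    The workaround can be sketched as follows. The table names are hypothetical (the post does not give the real DDL); the idea is just to recreate the table as TEXTFILE and copy the data over, since the plain-text writer buffers far less memory per open file than the ORC writer:

    -- Hypothetical names; a sketch of the workaround, not the exact DDL used.
    CREATE TABLE my_table_text LIKE my_table;
    ALTER TABLE my_table_text SET FILEFORMAT TEXTFILE;
    INSERT OVERWRITE TABLE my_table_text SELECT * FROM my_table;
    -- then point the failing job at my_table_text (or rename the tables)

    The trade-off is losing ORC's compression and columnar read performance, which is why this is a stopgap rather than a fix.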

    Finish the business task first; I will come back and solve this properly when I have time.

  • Original post: https://www.cnblogs.com/hark0623/p/9027324.html