  • Hive optimization: automatically merging small output files

    1. First, define in hive-site.xml what counts as a small file. The value is an average output file size in bytes.

    <property>
      <name>hive.merge.smallfiles.avgsize</name>
      <value>536870912</value>
      <description>When the average output file size of a job is less than this number, Hive will start an additional map-reduce job to merge the output files into bigger files.  This is only done for map-only jobs if hive.merge.mapfiles is true, and for map-reduce jobs if hive.merge.mapredfiles is true.</description>
    </property>
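
The value 536870912 is a byte count, which works out to 512 MiB. As a rough illustration of the comparison the description refers to, here is a minimal sketch. It is not Hive's actual code, and the function name `average_below_threshold` is hypothetical:

```python
# Threshold copied from the hive-site.xml entry above, in bytes.
AVG_SIZE_THRESHOLD = 536870912


def average_below_threshold(file_sizes, threshold=AVG_SIZE_THRESHOLD):
    """Return True when the average output file size is under the threshold.

    file_sizes is a list of output file sizes in bytes (illustrative input,
    not something Hive exposes in this form).
    """
    if not file_sizes:
        # No output files, so there is nothing to merge.
        return False
    return sum(file_sizes) / len(file_sizes) < threshold


# The configured value is exactly 512 * 1024 * 1024 bytes.
print(AVG_SIZE_THRESHOLD == 512 * 1024 * 1024)
```

With this threshold, a job that writes one hundred 10 MiB files has an average well below 512 MiB, so it would qualify for the extra merge job, provided the matching switch below is enabled.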

    2. Enable merging of small output files for map-only jobs.


    <property>
      <name>hive.merge.mapfiles</name>
      <value>true</value>
      <description>Merge small files at the end of a map-only job</description>
    </property>
    

    3. Enable merging of small output files for map-reduce jobs (jobs that include a reduce phase).

    <property>
      <name>hive.merge.mapredfiles</name>
      <value>true</value>
      <description>Merge small files at the end of a map-reduce job</description>
    </property>
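
Taken together, the three settings act as one threshold plus two switches, one per job type. The sketch below models that decision as described in the property descriptions above. It is an illustration under those stated assumptions rather than Hive's implementation, and `should_merge` and its parameter names are hypothetical:

```python
def should_merge(avg_file_size, is_map_only,
                 merge_mapfiles=True,
                 merge_mapredfiles=True,
                 avgsize_threshold=536870912):
    """Decide whether an extra merge job would be started.

    avg_file_size     -- average output file size of the job, in bytes
    is_map_only       -- True for a map-only job, False if it has a reduce phase
    merge_mapfiles    -- models hive.merge.mapfiles
    merge_mapredfiles -- models hive.merge.mapredfiles
    avgsize_threshold -- models hive.merge.smallfiles.avgsize
    """
    # Pick the switch that applies to this kind of job.
    enabled = merge_mapfiles if is_map_only else merge_mapredfiles
    # Merge only when the switch is on and the average size is too small.
    return enabled and avg_file_size < avgsize_threshold


# A map-reduce job averaging 20 MiB per file, with both switches on.
print(should_merge(20 * 1024 * 1024, is_map_only=False))
```

Setting only one of the two switches to true limits merging to that job type, while the threshold applies to both.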




  • Original article: https://www.cnblogs.com/bhlsheji/p/5305777.html