  • 2.28 Common MapReduce Optimizations in Practice

    I. What to optimize

    • Number of reduce tasks
    • Compression of map task output
    • Shuffle phase parameters
    • Virtual CPU cores allocated to map and reduce tasks

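    II. Number of reduce tasks

    By default a job runs with a single reduce task (mapreduce.job.reduces defaults to 1).

    More reduce tasks is not always better. In practice, test with different values and adjust until you find the best count for the job. The count is set in the driver as follows: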

    job.setNumReduceTasks(2);
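
To see what changing the reduce task count does to the data, the sketch below reproduces in plain Java the formula used by Hadoop's default HashPartitioner to assign a key to a reduce task. It is only an illustration: the class name, the method names and the sample keys are hypothetical, and it hashes java.lang.String keys, whereas a real job would usually hash Writable keys such as Text, which produce different hash values.

```java
// Plain-Java sketch (no Hadoop dependency) of hash partitioning across reduce tasks.
final class ReducePartitionSketch {

    // Same formula as Hadoop's default HashPartitioner:
    // clear the sign bit, then take the remainder by the number of reduce tasks.
    static int partitionFor(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Count how many of the given keys land on each reduce task.
    static int[] keysPerReducer(String[] keys, int numReduceTasks) {
        int[] counts = new int[numReduceTasks];
        for (String key : keys) {
            counts[partitionFor(key, numReduceTasks)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] keys = {"a", "b", "c", "d"}; // hypothetical sample keys
        int[] counts = keysPerReducer(keys, 2);
        System.out.println("reducer 0: " + counts[0] + " keys, reducer 1: " + counts[1] + " keys");
    }
}
```

With one reduce task every key goes to partition 0. With two, the keys are split by hash, so how evenly the work is spread depends on the key distribution. That is one reason the best count has to be found by testing.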

     

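    III. Compression of map task output

    This was covered in the previous section.

    IV. Shuffle phase parameters

    For details, see mapred-default.xml.

    The following parameters can be tuned:

    mapreduce.task.io.sort.factor: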

    <property>
      <name>mapreduce.task.io.sort.factor</name>
      <value>10</value>
      <description>The number of streams to merge at once while sorting
      files.  This determines the number of open file handles.</description>
    </property>

    mapreduce.task.io.sort.mb:

    <property>
      <name>mapreduce.task.io.sort.mb</name>
      <value>100</value>
      <description>The total amount of buffer memory to use while sorting 
      files, in megabytes.  By default, gives each merge stream 1MB, which
      should minimize seeks.</description>
    </property>

    mapreduce.map.sort.spill.percent:

    <property>
      <name>mapreduce.map.sort.spill.percent</name>
      <value>0.80</value>
      <description>The soft limit in the serialization buffer. Once reached, a
      thread will begin to spill the contents to disk in the background. Note that
      collection will not block if this threshold is exceeded while a spill is
      already in progress, so spills may be larger than this threshold when it is
      set to less than .5</description>
    </property>
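
The sketch below works through how the three defaults above interact. It is a simplified model and not Hadoop's actual implementation: the class and method names are hypothetical, the spill estimate ignores the per-record metadata that also occupies the sort buffer, and the merge model assumes every pass merges up to mapreduce.task.io.sort.factor files into one.

```java
// Back-of-the-envelope model of the map-side sort buffer and spill merging.
final class ShuffleTuningSketch {

    // Buffer fill level, in bytes, at which a background spill starts:
    // mapreduce.task.io.sort.mb * mapreduce.map.sort.spill.percent.
    static long spillThresholdBytes(int sortMb, double spillPercent) {
        long bufferBytes = sortMb * 1024L * 1024L;
        return Math.round(bufferBytes * spillPercent);
    }

    // Rough number of spill files for one map task (ceiling division).
    static long estimatedSpills(long mapOutputBytes, long thresholdBytes) {
        return (mapOutputBytes + thresholdBytes - 1) / thresholdBytes;
    }

    // Simplified merge model: each pass merges up to sortFactor files into one,
    // and passes repeat until a single file remains.
    static int mergePasses(long files, int sortFactor) {
        if (sortFactor < 2) {
            throw new IllegalArgumentException("sortFactor must be at least 2");
        }
        int passes = 0;
        while (files > 1) {
            files = (files + sortFactor - 1) / sortFactor;
            passes++;
        }
        return passes;
    }

    public static void main(String[] args) {
        long threshold = spillThresholdBytes(100, 0.80);         // defaults from mapred-default.xml
        long spills = estimatedSpills(1024L * 1024L * 1024L, threshold); // hypothetical 1 GiB map output
        System.out.println("spill threshold (bytes): " + threshold);
        System.out.println("estimated spill files:   " + spills);
        System.out.println("merge passes, factor 10: " + mergePasses(spills, 10));
        System.out.println("merge passes, factor 20: " + mergePasses(spills, 20));
    }
}
```

Under these assumptions, the defaults give a spill threshold of 80 MB, a 1 GiB map output produces about 13 spill files, and merging them takes two passes with a factor of 10 but only one with a factor of 20. A larger sort buffer or a larger merge factor reduces disk passes at the cost of more memory or more open file handles.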

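    V. Virtual CPU cores allocated to map and reduce tasks

    By default, each map task and each reduce task is allocated one virtual CPU core. This can also be adjusted in practice.

    1. map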

    mapreduce.map.cpu.vcores:

    <property>
      <name>mapreduce.map.cpu.vcores</name>
      <value>1</value>
      <description>
          The number of virtual cores required for each map task.
      </description>
    </property>

    2. reduce

    mapreduce.reduce.cpu.vcores:

    <property>
      <name>mapreduce.reduce.cpu.vcores</name>
      <value>1</value>
      <description>
          The number of virtual cores required for each reduce task.
      </description>
    </property>
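
As a rough illustration of the trade-off, the sketch below estimates how many tasks could run at once on a single node for a given vcores setting. It rests on assumptions not stated in the original: the node is taken to offer 8 vcores to containers (a hypothetical figure), the scheduler is assumed to count CPU when placing containers, and memory limits are ignored. The class and method names are hypothetical.

```java
// Rough CPU-only capacity estimate for one node.
final class VcoreCapacitySketch {

    // Tasks that fit on a node if only vcores were the limiting resource.
    static int concurrentTasks(int nodeVcores, int vcoresPerTask) {
        if (vcoresPerTask < 1) {
            throw new IllegalArgumentException("vcoresPerTask must be at least 1");
        }
        return nodeVcores / vcoresPerTask; // integer division: partial slots are unusable
    }

    public static void main(String[] args) {
        int nodeVcores = 8; // hypothetical node capacity
        for (int perTask = 1; perTask <= 3; perTask++) {
            System.out.println(perTask + " vcore(s) per task -> "
                    + concurrentTasks(nodeVcores, perTask) + " concurrent tasks");
        }
    }
}
```

Raising the vcores per task gives each task more CPU but lowers how many tasks run in parallel, so the setting is worth changing only when tasks are actually CPU-bound.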
  • Original article: https://www.cnblogs.com/weiyiming007/p/10717143.html