  • 2.28 Common MapReduce Optimizations in Practice

    I. What to Tune

    • Reduce Task Number
    • Map task output compression
    • Shuffle phase parameters
    • Virtual CPUs allocated to map and reduce tasks

    II. Reduce Task Number

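    By default, a job runs with a single reduce task.

    More reduce tasks is not always better. In practice, test and adjust until you find the number that works best for the job, then set it in the driver, for example: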

    job.setNumReduceTasks(2);
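
    To see why the reduce task count matters, the sketch below shows how map output keys are spread across reduce tasks. It is plain Java with no Hadoop dependency and assumes the job uses Hadoop's default hash partitioning, which the original post does not state. The class name, method names, and sample keys are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java sketch of hash partitioning; not part of the Hadoop API.
class ReducePartitionSketch {

    // Same formula as Hadoop's default HashPartitioner:
    // clear the sign bit, then take the remainder by the reduce task count.
    static int partitionFor(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Counts how many of the given keys land on each reduce task.
    static int[] keysPerReducer(List<String> keys, int numReduceTasks) {
        int[] counts = new int[numReduceTasks];
        for (String key : keys) {
            counts[partitionFor(key, numReduceTasks)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical sample keys, for illustration only.
        List<String> keys = Arrays.asList("a", "b", "c", "d");
        for (int n : new int[] {1, 2}) {
            System.out.println(n + " reduce task(s): "
                    + Arrays.toString(keysPerReducer(keys, n)));
        }
    }
}
```

    With one reduce task every key goes to the same task. With more tasks the keys are divided by hash, so the benefit depends on how evenly the real keys distribute, which is one reason the number has to be found by testing.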

     

    III. Map Task Output Compression

    This was covered in the previous section.

    IV. Shuffle Phase Parameters

    For details, see mapred-default.xml.

    The following parameters can be tuned:

    mapreduce.task.io.sort.factor:

    <property>
      <name>mapreduce.task.io.sort.factor</name>
      <value>10</value>
      <description>The number of streams to merge at once while sorting
      files.  This determines the number of open file handles.</description>
    </property>

    mapreduce.task.io.sort.mb:

    <property>
      <name>mapreduce.task.io.sort.mb</name>
      <value>100</value>
      <description>The total amount of buffer memory to use while sorting 
      files, in megabytes.  By default, gives each merge stream 1MB, which
      should minimize seeks.</description>
    </property>

    mapreduce.map.sort.spill.percent:

    <property>
      <name>mapreduce.map.sort.spill.percent</name>
      <value>0.80</value>
      <description>The soft limit in the serialization buffer. Once reached, a
      thread will begin to spill the contents to disk in the background. Note that
      collection will not block if this threshold is exceeded while a spill is
      already in progress, so spills may be larger than this threshold when it is
      set to less than .5</description>
    </property>
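
    As a rough illustration of how mapreduce.task.io.sort.mb and mapreduce.map.sort.spill.percent interact, the sketch below computes the buffer fill level at which a background spill starts. It is a simplification: it treats the whole buffer as record data, whereas the real buffer also holds per-record accounting data. The class and method names are hypothetical.

```java
// Plain-Java arithmetic sketch; not part of the Hadoop API.
class SpillThresholdSketch {

    // Approximate number of buffered bytes at which a spill to disk begins.
    static long spillThresholdBytes(int sortMb, double spillPercent) {
        long bufferBytes = (long) sortMb * 1024L * 1024L;
        return Math.round(bufferBytes * spillPercent);
    }

    public static void main(String[] args) {
        int sortMb = 100;           // default of mapreduce.task.io.sort.mb
        double spillPercent = 0.80; // default of mapreduce.map.sort.spill.percent
        long threshold = spillThresholdBytes(sortMb, spillPercent);
        System.out.println("Spill starts at about "
                + threshold / (1024 * 1024) + " MB of buffered output");
    }
}
```

    With the defaults shown above, a spill begins once roughly 80 MB of the 100 MB buffer is in use. Raising the buffer size or the percentage means fewer, larger spill files for the later merge, at the cost of more task memory.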

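    V. Virtual CPUs Allocated to Map and Reduce Tasks

    Each map task and each reduce task is allocated one virtual CPU (vcore) by default. This can also be adjusted in practice.

    1. map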

    mapreduce.map.cpu.vcores:

    <property>
      <name>mapreduce.map.cpu.vcores</name>
      <value>1</value>
      <description>
          The number of virtual cores required for each map task.
      </description>
    </property>

    2. reduce

    mapreduce.reduce.cpu.vcores:

    <property>
      <name>mapreduce.reduce.cpu.vcores</name>
      <value>1</value>
      <description>
          The number of virtual cores required for each reduce task.
      </description>
    </property>
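
    The trade-off behind these two settings can be sketched with simple arithmetic: the more vcores each task requests, the fewer tasks fit on a node at once. The example below is only an upper bound by CPU. It assumes the scheduler takes vcores into account and ignores memory limits. The node capacity value is an assumption (in YARN it would come from yarn.nodemanager.resource.cpu-vcores, which the original post does not mention), and the class and method names are hypothetical.

```java
// Plain-Java arithmetic sketch; not part of the Hadoop or YARN API.
class VcoreCapacitySketch {

    // Upper bound on concurrently running tasks per node, by CPU alone.
    static int maxTasksByCpu(int nodeVcores, int vcoresPerTask) {
        if (vcoresPerTask <= 0) {
            throw new IllegalArgumentException("vcoresPerTask must be positive");
        }
        return nodeVcores / vcoresPerTask;
    }

    public static void main(String[] args) {
        int nodeVcores = 8; // assumed node capacity, for illustration only
        for (int perTask : new int[] {1, 2, 4}) {
            System.out.println(perTask + " vcore(s) per task -> at most "
                    + maxTasksByCpu(nodeVcores, perTask) + " tasks per node");
        }
    }
}
```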
  • Original article: https://www.cnblogs.com/weiyiming007/p/10717143.html