  • 2.28 Common MapReduce Optimizations in Practice

    I. What to Optimize

    • Reduce Task Number
    • Map Task output compression
    • Shuffle Phase parameters
    • Virtual CPUs allocated to map and reduce

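    II. Reduce Task Number

    By default, a job runs with a single Reduce Task.

    More Reduce Tasks are not always better. In practice the count should be tested and tuned until it reaches the best value for the job, for example: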

    job.setNumReduceTasks(2);
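
    Why the count matters: each map output key is routed to exactly one Reduce Task. The sketch below is an illustration, not Hadoop code. It reproduces the formula used by Hadoop's default HashPartitioner with plain Java strings. The class name ReducePartitionSketch and the sample keys are made up for the example, and real jobs hash Writable keys such as Text, so actual assignments will differ.

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: same formula as Hadoop's default HashPartitioner,
// (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks, with no Hadoop dependency.
final class ReducePartitionSketch {

    // Returns the index of the Reduce Task that would receive the given key.
    static int partitionFor(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Counts how many of the given keys land on each Reduce Task.
    static Map<Integer, Integer> distribution(String[] keys, int numReduceTasks) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (int r = 0; r < numReduceTasks; r++) {
            counts.put(r, 0);
        }
        for (String key : keys) {
            counts.merge(partitionFor(key, numReduceTasks), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical sample keys, chosen only for the demonstration.
        String[] keys = {"apple", "banana", "cherry", "date", "elderberry", "fig"};
        for (int n : new int[] {1, 2, 4}) {
            System.out.println(n + " reduce task(s): " + distribution(keys, n));
        }
    }
}
```

    With one Reduce Task every key lands in partition 0, so the reduce side gets no parallelism. With too many, some tasks may receive few or no keys and only add scheduling overhead, which is why the value has to be measured per job.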

     

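    III. Map Task Output Compression

    This was covered in the previous section.

    IV. Shuffle Phase Parameters

    For details, see mapred-default.xml.

    The following parameters can be tuned: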

    mapreduce.task.io.sort.factor:

    <property>
      <name>mapreduce.task.io.sort.factor</name>
      <value>10</value>
      <description>The number of streams to merge at once while sorting
      files.  This determines the number of open file handles.</description>
    </property>
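
    To get a feel for what the merge factor changes, the sketch below counts merge passes under a deliberately simplified model: every pass merges groups of up to sort.factor files into one file each, until a single file remains. The class MergePassSketch and the file counts are assumptions for illustration. Hadoop's real merger sizes its passes differently, so treat the numbers as a rough guide only.

```java
// Simplified model of multi-pass merging; not Hadoop's actual merge algorithm.
final class MergePassSketch {

    // Number of passes needed to reduce spillFiles to one file when each
    // pass merges groups of at most sortFactor files.
    static int mergePasses(int spillFiles, int sortFactor) {
        if (spillFiles < 1 || sortFactor < 2) {
            throw new IllegalArgumentException("need spillFiles >= 1 and sortFactor >= 2");
        }
        int passes = 0;
        int remaining = spillFiles;
        while (remaining > 1) {
            // Ceiling division: each group of up to sortFactor files becomes one file.
            remaining = (remaining + sortFactor - 1) / sortFactor;
            passes++;
        }
        return passes;
    }

    public static void main(String[] args) {
        int spillFiles = 50; // hypothetical number of files to merge
        System.out.println("factor 10:  " + mergePasses(spillFiles, 10) + " passes");
        System.out.println("factor 100: " + mergePasses(spillFiles, 100) + " passes");
    }
}
```

    In this model a larger factor means fewer passes over the data, at the cost of more file handles open at once, which matches the trade-off named in the property description.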

    mapreduce.task.io.sort.mb:

    <property>
      <name>mapreduce.task.io.sort.mb</name>
      <value>100</value>
      <description>The total amount of buffer memory to use while sorting 
      files, in megabytes.  By default, gives each merge stream 1MB, which
      should minimize seeks.</description>
    </property>

    mapreduce.map.sort.spill.percent:

    <property>
      <name>mapreduce.map.sort.spill.percent</name>
      <value>0.80</value>
      <description>The soft limit in the serialization buffer. Once reached, a
      thread will begin to spill the contents to disk in the background. Note that
      collection will not block if this threshold is exceeded while a spill is
      already in progress, so spills may be larger than this threshold when it is
      set to less than .5</description>
    </property>
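
    The two settings above interact: a spill starts once roughly sort.mb × spill.percent of the buffer is in use. The sketch below works through that arithmetic for the defaults shown (100 MB and 0.80). The class SpillThresholdSketch and the 1 GiB map output size are hypothetical, and the estimate ignores the per-record metadata that shares the buffer, so real spill counts will differ.

```java
// Back-of-the-envelope arithmetic for the map-side sort buffer; not Hadoop code.
final class SpillThresholdSketch {

    // Bytes of buffer usage at which a background spill would start.
    static long spillThresholdBytes(int sortMb, double spillPercent) {
        long bufferBytes = (long) sortMb * 1024L * 1024L;
        return (long) (bufferBytes * spillPercent);
    }

    // Rough number of spills for a given amount of map output, assuming each
    // spill writes exactly one threshold's worth of data.
    static long estimatedSpills(long mapOutputBytes, int sortMb, double spillPercent) {
        long threshold = spillThresholdBytes(sortMb, spillPercent);
        return (mapOutputBytes + threshold - 1) / threshold; // ceiling division
    }

    public static void main(String[] args) {
        long mapOutput = 1024L * 1024L * 1024L; // hypothetical 1 GiB of map output
        System.out.println("threshold at 100 MB / 0.80: " + spillThresholdBytes(100, 0.80) + " bytes");
        System.out.println("spills with 100 MB buffer:  " + estimatedSpills(mapOutput, 100, 0.80));
        System.out.println("spills with 200 MB buffer:  " + estimatedSpills(mapOutput, 200, 0.80));
    }
}
```

    Fewer spills mean fewer files to merge afterwards, so raising sort.mb can help when a map task produces much more output than the buffer holds, provided the task's heap is large enough to hold the bigger buffer.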

    V. Virtual CPUs Allocated to map and reduce

    Each map and reduce task is given one virtual CPU by default; this can be adjusted in practice.

    1. map

    mapreduce.map.cpu.vcores:

    <property>
      <name>mapreduce.map.cpu.vcores</name>
      <value>1</value>
      <description>
          The number of virtual cores required for each map task.
      </description>
    </property>

    2. reduce

    mapreduce.reduce.cpu.vcores:

    <property>
      <name>mapreduce.reduce.cpu.vcores</name>
      <value>1</value>
      <description>
          The number of virtual cores required for each reduce task.
      </description>
    </property>
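
    Raising the vcores per task reduces how many tasks can run on a node at once. The sketch below shows that arithmetic only. It assumes the scheduler actually counts CPU when placing containers, which depends on cluster configuration. The class VcoreConcurrencySketch and the figure of 8 vcores per node are assumptions for illustration.

```java
// Simple capacity arithmetic; assumes CPU is the only limiting resource.
final class VcoreConcurrencySketch {

    // Maximum number of tasks that fit on one node when each task
    // requests taskVcores out of nodeVcores available.
    static int maxConcurrentTasks(int nodeVcores, int taskVcores) {
        if (nodeVcores < 0 || taskVcores < 1) {
            throw new IllegalArgumentException("need nodeVcores >= 0 and taskVcores >= 1");
        }
        return nodeVcores / taskVcores; // integer division: leftover vcores stay unused
    }

    public static void main(String[] args) {
        int nodeVcores = 8; // hypothetical node capacity
        for (int taskVcores : new int[] {1, 2, 3}) {
            System.out.println(taskVcores + " vcore(s) per task: "
                    + maxConcurrentTasks(nodeVcores, taskVcores) + " concurrent tasks");
        }
    }
}
```

    More vcores per task is therefore a trade between per-task CPU share and overall parallelism, and like the Reduce Task count it is best settled by measuring.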
  • Original article: https://www.cnblogs.com/weiyiming007/p/10717143.html