  • Notes on a ZK HA problem when connecting the Hadoop 2.8.4 RM to ZooKeeper

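    Background:

    Our company connected the production Hadoop ResourceManager (RM) to ZooKeeper (ZK) for high availability. However, ZK limits the data it will handle for a znode to about 1 MB by default, so when the RM stores a large amount of state, production jobs can crash.

    The plan:

    1. Change the ZK configuration to raise the default size limit.

    2. Change the path structure under which the RM stores its data in ZK, splitting it so that it can support more data.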
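To get a feel for how quickly one parent znode reaches the limit, here is a rough sketch. It assumes a child-name list is serialized as a 4-byte count followed by a 4-byte length plus the UTF-8 bytes of each name, and it ignores response headers, so treat the numbers as an approximation only. `ChildListEstimate` and its methods are hypothetical helpers, not part of ZooKeeper or Hadoop; `0xfffff` is the documented default for `jute.maxbuffer`.

```java
import java.nio.charset.StandardCharsets;

public class ChildListEstimate {
    // Documented ZooKeeper default for jute.maxbuffer: 0xfffff bytes, just under 1 MB.
    static final int DEFAULT_MAX_BUFFER = 0xfffff;

    // Assumed wire cost of one child name: 4-byte length prefix + UTF-8 bytes.
    static int bytesPerName(String sampleName) {
        return 4 + sampleName.getBytes(StandardCharsets.UTF_8).length;
    }

    // Approximate size of a child list: 4-byte element count + all names.
    static long estimateBytes(String sampleName, long childCount) {
        return 4 + (long) bytesPerName(sampleName) * childCount;
    }

    // Largest child count whose estimated list still fits in maxBuffer.
    static long maxChildren(String sampleName, int maxBuffer) {
        return (maxBuffer - 4) / bytesPerName(sampleName);
    }

    public static void main(String[] args) {
        // Example application ID in the usual YARN format.
        String appId = "application_1352994193343_0001";
        System.out.println("children that fit at default limit: "
                + maxChildren(appId, DEFAULT_MAX_BUFFER));
        System.out.println("children that fit at 10240000:      "
                + maxChildren(appId, 10240000));
    }
}
```

Under these assumptions a flat list of application znodes runs out of room at a few tens of thousands of entries, which is why raising the limit alone only postpones the problem and the second step restructures the paths.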

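    Problem 1: Change the ZK configuration to raise the default size limit

    This mainly comes down to changing one configuration parameter.

    On every ZK node, change the configuration (here the limit is raised to roughly 10 MB):

    vi zkServer.sh

    Add the following setting (the original post marked the exact position in a screenshot):  ZOO_USER_CFG="-Djute.maxbuffer=10240000" 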

     

     

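    Edit zkCli.sh as well (otherwise the command-line client cannot read data larger than 1 MB).

    Even then, a client in our own code still cannot read data above the limit. It needs the same Java system property set, as follows:

    System.setProperty("jute.maxbuffer",String.valueOf(10240000));
    The YARN configuration must be changed in the same way, otherwise none of the above helps.
    In yarn-env.sh,
    add one line:
    YARN_RESOURCEMANAGER_OPTS="$YARN_RESOURCEMANAGER_OPTS -Djute.maxbuffer=10240000"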
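One detail worth checking in client code is when the property is set. The sketch below assumes the client library caches `jute.maxbuffer` once in a static field when its class is first loaded (ZooKeeper 3.4.x reads it via `Integer.getInteger` in a static initializer, as far as I know). `CachedLimit` is a hypothetical stand-in for such a library class, not real ZooKeeper code.

```java
public class MaxBufferLoadOrder {
    // Hypothetical stand-in for a library class that reads the limit once,
    // at class initialization, and never looks at the property again.
    static class CachedLimit {
        static final int MAX_BUFFER = Integer.getInteger("jute.maxbuffer", 0xfffff);
    }

    // Sets the property first, then touches the class so that its
    // static initializer sees the new value.
    static int configureAndRead(int limit) {
        System.setProperty("jute.maxbuffer", String.valueOf(limit));
        return CachedLimit.MAX_BUFFER; // first access triggers initialization
    }

    public static void main(String[] args) {
        System.out.println("limit seen by the class: " + configureAndRead(10240000));

        // Changing the property afterwards has no effect on the cached value.
        System.setProperty("jute.maxbuffer", "2048");
        System.out.println("limit after a late change: " + CachedLimit.MAX_BUFFER);
    }
}
```

If that assumption holds for your ZooKeeper version, call `System.setProperty` before any ZooKeeper class is touched, or pass `-Djute.maxbuffer=10240000` on the JVM command line so the order cannot go wrong.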



    Problem 2: Optimize the storage structure in ZK

    YARN stores its state in ZK as follows:
    ROOT_DIR_PATH
          |--- VERSION_INFO
          |--- EPOCH_NODE
          |--- RM_ZK_FENCING_LOCK
          |--- RM_APP_ROOT
          |     |----- (#ApplicationId1)
          |     |        |----- (#ApplicationAttemptIds)
          |     |
          |     |----- (#ApplicationId2)
          |     |       |----- (#ApplicationAttemptIds)
          |     ....
          |
          |--- RM_DT_SECRET_MANAGER_ROOT
          |      |----- RM_DT_SEQUENTIAL_NUMBER_ZNODE_NAME
          |      |----- RM_DELEGATION_TOKENS_ROOT_ZNODE_NAME
          |      |       |----- Token_1
          |      |       |----- Token_2
          |      |       ....
          |      |
          |      |----- RM_DT_MASTER_KEYS_ROOT_ZNODE_NAME
          |      |       |----- Key_1
          |      |       |----- Key_2
          |      ....
          |--- AMRMTOKEN_SECRET_MANAGER_ROOT
                 |----- currentMasterKey
                 |----- nextMasterKey

    Updated to:

     * The znode structure is as follows:
     * ROOT_DIR_PATH
     * |--- VERSION_INFO
     * |--- EPOCH_NODE
     * |--- RM_ZK_FENCING_LOCK
     * |--- RM_APP_ROOT
     * |     |----- HIERARCHIES
     * |     |        |----- 1
     * |     |        |      |----- (#ApplicationId barring last character)
     * |     |        |      |       |----- (#Last character of ApplicationId)
     * |     |        |      |       |       |----- (#ApplicationAttemptIds)
     * |     |        |      ....
     * |     |        |
     * |     |        |----- 2
     * |     |        |      |----- (#ApplicationId barring last 2 characters)
     * |     |        |      |       |----- (#Last 2 characters of ApplicationId)
     * |     |        |      |       |       |----- (#ApplicationAttemptIds)
     * |     |        |      ....
     * |     |        |
     * |     |        |----- 3
     * |     |        |      |----- (#ApplicationId barring last 3 characters)
     * |     |        |      |       |----- (#Last 3 characters of ApplicationId)
     * |     |        |      |       |       |----- (#ApplicationAttemptIds)
     * |     |        |      ....
     * |     |        |
     * |     |        |----- 4
     * |     |        |      |----- (#ApplicationId barring last 4 characters)
     * |     |        |      |       |----- (#Last 4 characters of ApplicationId)
     * |     |        |      |       |       |----- (#ApplicationAttemptIds)
     * |     |        |      ....
     * |     |        |
     * |     |----- (#ApplicationId1)
     * |     |        |----- (#ApplicationAttemptIds)
     * |     |
     * |     |----- (#ApplicationId2)
     * |     |       |----- (#ApplicationAttemptIds)
     * |     ....
     * |
     * |--- RM_DT_SECRET_MANAGER_ROOT
     *        |----- RM_DT_SEQUENTIAL_NUMBER_ZNODE_NAME
     *        |----- RM_DELEGATION_TOKENS_ROOT_ZNODE_NAME
     *        |       |----- 1
     *        |       |      |----- (#TokenId barring last character)
     *        |       |      |       |----- (#Last character of TokenId)
     *        |       |      ....
     *        |       |----- 2
     *        |       |      |----- (#TokenId barring last 2 characters)
     *        |       |      |       |----- (#Last 2 characters of TokenId)
     *        |       |      ....
     *        |       |----- 3
     *        |       |      |----- (#TokenId barring last 3 characters)
     *        |       |      |       |----- (#Last 3 characters of TokenId)
     *        |       |      ....
     *        |       |----- 4
     *        |       |      |----- (#TokenId barring last 4 characters)
     *        |       |      |       |----- (#Last 4 characters of TokenId)
     *        |       |      ....
     *        |       |----- Token_1
     *        |       |----- Token_2
     *        |       ....
     *        |
     *        |----- RM_DT_MASTER_KEYS_ROOT_ZNODE_NAME
     *        |      |----- Key_1
     *        |      |----- Key_2
     *                ....
     * |--- AMRMTOKEN_SECRET_MANAGER_ROOT
     *        |----- currentMasterKey
     *        |----- nextMasterKey
     *
     * |-- RESERVATION_SYSTEM_ROOT
     *        |------PLAN_1
     *        |      |------ RESERVATION_1
     *        |      |------ RESERVATION_2
     *        |      ....
     *        |------PLAN_2
     *        ....

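    Add a new property to yarn-site.xml (the value 0 shown here is the default and means no split; set it to 1 through 4 to enable splitting):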

    <property>
    
        <description>Index at which last section of application id (with each section
          separated by _ in application id) will be split so that application znode
          stored in zookeeper RM state store will be stored as two different znodes
          (parent-child). Split is done from the end.
          For instance, with no split, appid znode will be of the form
          application_1352994193343_0001. If the value of this config is 1, the
          appid znode will be broken into two parts application_1352994193343_000
          and 1 respectively with former being the parent node.
          application_1352994193343_0002 will then be stored as 2 under the parent
          node application_1352994193343_000. This config can take values from 0 to 4.
          0 means there will be no split. If configuration value is outside this
          range, it will be treated as config value of 0(i.e. no split). A value
          larger than 0 (up to 4) should be configured if you are storing a large number
          of apps in ZK based RM state store and state store operations are failing due to
          LenError in Zookeeper.</description>
        <name>yarn.resourcemanager.zk-appid-node.split-index</name>
        <value>0</value>
      </property>
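The split described in the property can be sketched as below. This is a minimal illustration of the rule in the description, not Hadoop's actual implementation: `AppIdSplit`, `split`, and `relativePath` are hypothetical names, and the `HIERARCHIES` path layout is taken from the updated tree shown earlier. Paths are relative to RM_APP_ROOT.

```java
public class AppIdSplit {
    // Directory name taken from the updated znode tree.
    static final String HIERARCHIES = "HIERARCHIES";

    // Returns {parent, child} when splitIndex is 1..4, or {appId} when no
    // split applies (0, or any value outside the 0..4 range).
    static String[] split(String appId, int splitIndex) {
        if (splitIndex < 1 || splitIndex > 4) {
            return new String[] {appId};
        }
        int cut = appId.length() - splitIndex; // split is counted from the end
        return new String[] {appId.substring(0, cut), appId.substring(cut)};
    }

    // Builds the znode path of an application below RM_APP_ROOT.
    static String relativePath(String appId, int splitIndex) {
        String[] parts = split(appId, splitIndex);
        if (parts.length == 1) {
            return appId; // unsplit apps sit directly under RM_APP_ROOT
        }
        return HIERARCHIES + "/" + splitIndex + "/" + parts[0] + "/" + parts[1];
    }

    public static void main(String[] args) {
        String appId = "application_1352994193343_0001";
        for (int index = 0; index <= 4; index++) {
            System.out.println(index + " -> " + relativePath(appId, index));
        }
    }
}
```

With a split index of 2, for example, up to 100 applications share one parent znode, so both the number of parents and the number of children per parent stay well below what a flat layout would produce.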

      

    Reference: https://cloud.tencent.com/developer/article/1491079

    Reference: https://issues.apache.org/jira/browse/YARN-2368

    Reference: https://issues.apache.org/jira/browse/YARN-2962






  • Original article: https://www.cnblogs.com/songchaolin/p/11836515.html