  • [Original] Uncle's Experience Notes (21): Viewing each application's real-time memory and CPU usage in YARN

    On the application detail page in YARN

    http://resourcemanager/cluster/app/$applicationId

    or with the application command

    yarn application -status $applicationId

    you can only see resource × time totals accumulated since the application started, for example:

    Aggregate Resource Allocation : 3962853 MB-seconds, 1466 vcore-seconds
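
    To make the unit concrete: MB-seconds is allocated memory multiplied by the time it was held, so dividing by elapsed seconds gives a time-averaged allocation, not the current one. The following is a minimal sketch of that arithmetic; the class name and the elapsed time of 1466 seconds are hypothetical, chosen only for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AggregateAllocation {
    // Matches the numeric part of the status line quoted above.
    private static final Pattern LINE =
        Pattern.compile("(\\d+) MB-seconds, (\\d+) vcore-seconds");

    /** Returns {memorySeconds, vcoreSeconds} parsed from the status line. */
    static long[] parse(String statusLine) {
        Matcher m = LINE.matcher(statusLine);
        if (!m.find()) {
            throw new IllegalArgumentException("no aggregate allocation in: " + statusLine);
        }
        return new long[] {Long.parseLong(m.group(1)), Long.parseLong(m.group(2))};
    }

    /** Time-averaged allocation: resource-seconds divided by elapsed seconds. */
    static double average(long resourceSeconds, long elapsedSeconds) {
        return (double) resourceSeconds / elapsedSeconds;
    }

    public static void main(String[] args) {
        long[] agg = parse(
            "Aggregate Resource Allocation : 3962853 MB-seconds, 1466 vcore-seconds");
        long elapsedSeconds = 1466; // hypothetical run time, not from the original post
        System.out.printf("average MB: %.1f, average vcores: %.1f%n",
            average(agg[0], elapsedSeconds), average(agg[1], elapsedSeconds));
    }
}
```

    An average like this hides peaks and says nothing about what the application holds right now, which is why the rest of this post looks for the instantaneous figure.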

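    The application's current, real-time resource usage (how much memory and how many cores it holds right now) does not appear anywhere. Tracing through the YARN code shows that this statistic does in fact exist: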

    org.apache.hadoop.yarn.api.records.ApplicationResourceUsageReport

      public static ApplicationResourceUsageReport newInstance(
          int numUsedContainers, int numReservedContainers, Resource usedResources,
          Resource reservedResources, Resource neededResources, long memorySeconds,
          long vcoreSeconds) {
        ApplicationResourceUsageReport report =
            Records.newRecord(ApplicationResourceUsageReport.class);
        report.setNumUsedContainers(numUsedContainers);
        report.setNumReservedContainers(numReservedContainers);
        report.setUsedResources(usedResources);
        report.setReservedResources(reservedResources);
        report.setNeededResources(neededResources);
        report.setMemorySeconds(memorySeconds);
        report.setVcoreSeconds(vcoreSeconds);
        return report;
      }

    Here usedResources is the current real-time resource usage, covering both memory and CPU. This figure is returned by a method on the YarnScheduler interface:

    org.apache.hadoop.yarn.server.resourcemanager.scheduler.YarnScheduler

      /**
       * Get a resource usage report from a given app attempt ID.
       * @param appAttemptId the id of the application attempt
       * @return resource usage report for this given attempt
       */
      @LimitedPrivate("yarn")
      @Evolving
      ApplicationResourceUsageReport getAppResourceUsageReport(
          ApplicationAttemptId appAttemptId);

    getAppResourceUsageReport is called by RMAppAttemptImpl.getApplicationResourceUsageReport:

    org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl

      @Override
      public ApplicationResourceUsageReport getApplicationResourceUsageReport() {
        this.readLock.lock();
        try {
          ApplicationResourceUsageReport report =
              scheduler.getAppResourceUsageReport(this.getAppAttemptId());
          if (report == null) {
            report = RMServerUtils.DUMMY_APPLICATION_RESOURCE_USAGE_REPORT;
          }
          AggregateAppResourceUsage resUsage =
              this.attemptMetrics.getAggregateAppResourceUsage();
          report.setMemorySeconds(resUsage.getMemorySeconds());
          report.setVcoreSeconds(resUsage.getVcoreSeconds());
          return report;
        } finally {
          this.readLock.unlock();
        }
      }

    RMAppAttemptImpl.getApplicationResourceUsageReport is called from two places:

    First caller

    org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl

      public ApplicationReport createAndGetApplicationReport(String clientUserName,
          boolean allowAccess) {
    ...
              appUsageReport = currentAttempt.getApplicationResourceUsageReport();
    ...

    RMAppImpl.createAndGetApplicationReport is called by ClientRMService.getApplications and ClientRMService.getApplicationReport, which correspond to the following commands respectively:

    yarn application -list
    yarn application -status $applicationId

    Neither command shows usedResources in its output; perhaps the authors considered this real-time usage figure less important.

    For details, see:
    org.apache.hadoop.yarn.server.resourcemanager.ClientRMService

    Second caller

    org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.AppInfo

      public AppInfo(RMApp app, Boolean hasAccess, String schemePrefix) {
    ...
              ApplicationResourceUsageReport resourceReport = attempt
                  .getApplicationResourceUsageReport();
              if (resourceReport != null) {
                Resource usedResources = resourceReport.getUsedResources();
                allocatedMB = usedResources.getMemory();
                allocatedVCores = usedResources.getVirtualCores();
                runningContainers = resourceReport.getNumUsedContainers();
              }
    ...

    This constructor is called from RMWebServices.getApp and RMWebServices.getApps. These are web service endpoints, with the following URLs respectively:

    http://resourcemanager/ws/v1/cluster/apps/$applicationId
    http://resourcemanager/ws/v1/cluster/apps?state=RUNNING
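
    As a rough sketch, the two URLs above could be built and requested from Java 11+ with the standard HttpClient. The class and method names here are hypothetical, and the base address is whatever your ResourceManager web UI uses; the post writes it as http://resourcemanager. Requesting XML through the Accept header is an assumption made to match the XML output shown in this post.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

class RmRestUrls {
    /** URL of the single-application endpoint for the given application id. */
    static URI appUri(String rmBase, String applicationId) {
        return URI.create(rmBase + "/ws/v1/cluster/apps/" + applicationId);
    }

    /** URL of the list endpoint, filtered to running applications. */
    static URI runningAppsUri(String rmBase) {
        return URI.create(rmBase + "/ws/v1/cluster/apps?state=RUNNING");
    }

    /** Sends a GET request asking for XML and returns the response body. */
    static String fetchXml(URI uri) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(uri)
            .header("Accept", "application/xml")
            .GET()
            .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("unexpected HTTP status " + response.statusCode());
        }
        return response.body();
    }
}
```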

    The responses from both endpoints include the real-time resource usage, for example:

    <allocatedMB>56320</allocatedMB>
    <allocatedVCores>21</allocatedVCores>

    These are the real-time memory usage and the real-time CPU usage, respectively.

    For details, see:
    org.apache.hadoop.yarn.server.resourcemanager.webapp.RMWebServices
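
    The allocatedMB and allocatedVCores fields can be pulled out of a single-application XML response with the JDK's built-in DOM parser. This is a minimal sketch: the class name is hypothetical, and the sample document in the usage line is trimmed to just the two fields quoted in this post, whereas a real response carries many more elements.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

class AppUsageXml {
    /** Returns {allocatedMB, allocatedVCores} read from an app XML document. */
    static long[] parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Reject DOCTYPE declarations so external entities cannot be loaded.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return new long[] {
                Long.parseLong(text(doc, "allocatedMB")),
                Long.parseLong(text(doc, "allocatedVCores"))
            };
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot read usage from response", e);
        }
    }

    /** Text of the first element with the given tag name. */
    private static String text(Document doc, String tag) {
        NodeList nodes = doc.getElementsByTagName(tag);
        if (nodes.getLength() == 0) {
            throw new IllegalArgumentException("missing <" + tag + ">");
        }
        return nodes.item(0).getTextContent().trim();
    }
}
```

    For example, AppUsageXml.parse("<app><allocatedMB>56320</allocatedMB><allocatedVCores>21</allocatedVCores></app>") yields the two values as a long array. For the list endpoint, which returns several applications, this sketch only reads the first one.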

    If you find that a Spark application uses more memory than you allocated to it, see: https://www.cnblogs.com/barneywill/p/10102353.html

  • Original article: https://www.cnblogs.com/barneywill/p/10251010.html