  • Spark run modes: cluster and client

    When you run SparkSubmit --class [mainClass], SparkSubmit calls a childMainClass, which depends on the deploy mode:

    1. client mode, childMainClass = mainClass

    2. standalone cluster mode, childMainClass = org.apache.spark.deploy.Client

    3. yarn cluster mode, childMainClass = org.apache.spark.deploy.yarn.Client

    The childMainClass is a wrapper around mainClass. SparkSubmit invokes the childMainClass; in cluster mode, the childMainClass talks to the cluster and launches a process on one worker to run the mainClass.
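    The selection rule above can be sketched as a small function. This is a simplified model in Python, not Spark's actual source: the name `select_child_main_class` and the `use_rest` flag are hypothetical, and it assumes the legacy master strings `yarn-client` and `yarn-cluster` are shorthand for `--master yarn` with the matching deploy mode. The `use_rest` flag models the REST submission path that shows up in the standalone cluster example.

```python
def select_child_main_class(main_class, master, deploy_mode="client", use_rest=False):
    """Return the class SparkSubmit would invoke (simplified, hypothetical model)."""
    # Legacy shorthand: the deploy mode is encoded in the master string.
    if master in ("yarn-client", "yarn-cluster"):
        master, deploy_mode = "yarn", master.split("-", 1)[1]

    # Client mode: the user's main class is run directly.
    if deploy_mode == "client":
        return main_class

    # Cluster mode: a wrapper class submits the application to the cluster.
    if master == "yarn":
        return "org.apache.spark.deploy.yarn.Client"
    if master.startswith("spark://"):
        if use_rest:
            return "org.apache.spark.deploy.rest.RestSubmissionClient"
        return "org.apache.spark.deploy.Client"

    raise ValueError(f"master not covered by this sketch: {master}")


if __name__ == "__main__":
    app = "org.apache.spark.examples.JavaWordCount"
    print(select_child_main_class(app, "yarn"))
    print(select_child_main_class(app, "yarn-cluster"))
    print(select_child_main_class(app, "spark://aa01:7077", "cluster", use_rest=True))
```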
     
    P.S. Use "spark-submit -v" to print debug information.
     
    YARN client: spark-submit -v --class "org.apache.spark.examples.JavaWordCount" --master yarn JavaWordCount.jar
    childMainClass: org.apache.spark.examples.JavaWordCount
    YARN cluster: spark-submit -v --class "org.apache.spark.examples.JavaWordCount" --master yarn-cluster JavaWordCount.jar
    childMainClass: org.apache.spark.deploy.yarn.Client
     
    Standalone client: spark-submit -v --class "org.apache.spark.examples.JavaWordCount" --master spark://aa01:7077 JavaWordCount.jar
    childMainClass: org.apache.spark.examples.JavaWordCount
    Standalone cluster: spark-submit -v --class "org.apache.spark.examples.JavaWordCount" --master spark://aa01:7077 --deploy-mode cluster JavaWordCount.jar
    childMainClass: org.apache.spark.deploy.rest.RestSubmissionClient (when REST submission is used; otherwise org.apache.spark.deploy.Client)
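    The four commands above differ only in the --master and --deploy-mode arguments. As a rough illustration, a script could assemble them like this; the helper name `build_submit_args` is hypothetical, and only flags that appear in the commands above are used.

```python
import shlex


def build_submit_args(main_class, jar, master, deploy_mode=None, verbose=True):
    """Assemble a spark-submit argument list (hypothetical helper, for illustration)."""
    args = ["spark-submit"]
    if verbose:
        args.append("-v")  # print debug information, as noted above
    args += ["--class", main_class, "--master", master]
    if deploy_mode is not None:
        args += ["--deploy-mode", deploy_mode]
    args.append(jar)
    return args


if __name__ == "__main__":
    cmd = build_submit_args(
        "org.apache.spark.examples.JavaWordCount",
        "JavaWordCount.jar",
        "spark://aa01:7077",
        deploy_mode="cluster",
    )
    # Render the list as a single shell command line.
    print(shlex.join(cmd))
```

    The resulting list can be passed to `subprocess.run` on a machine where spark-submit is on the PATH.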
     
    Taking standalone Spark as an example: in client mode, the main class runs in the driver application, which may reside outside the cluster.
    In cluster mode, SparkSubmit registers the driver with the cluster, and a driver process is launched on one worker to run the main class.
     
    There are also two deploy modes that can be used to launch Spark applications on YARN. In yarn-cluster mode, the Spark driver runs inside an application master process which is managed by YARN on the cluster, and the client can go away after initiating the application. In yarn-client mode, the driver runs in the client process, and the application master is only used for requesting resources from YARN.
     
    Cluster deploy mode is not applicable to Spark shells.
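    To summarize where the driver ends up for each combination described above, here is a small Python sketch. The function `driver_location` and its return strings are hypothetical labels for illustration; the real decision is made inside SparkSubmit and the cluster manager. It again assumes `yarn-client` and `yarn-cluster` are shorthand for `--master yarn` plus a deploy mode.

```python
def driver_location(master, deploy_mode="client", is_shell=False):
    """Describe where the driver runs (simplified, hypothetical model)."""
    # Legacy shorthand: the deploy mode is encoded in the master string.
    if master in ("yarn-client", "yarn-cluster"):
        master, deploy_mode = "yarn", master.split("-", 1)[1]

    if deploy_mode == "client":
        # The driver lives in the submitting process, possibly outside the cluster.
        return "client process"

    # Shells are interactive, so their driver must stay with the user.
    if is_shell:
        raise ValueError("Cluster deploy mode is not applicable to Spark shells.")

    if master == "yarn":
        return "YARN application master"
    if master.startswith("spark://"):
        return "driver process on a worker"
    raise ValueError(f"master not covered by this sketch: {master}")


if __name__ == "__main__":
    for master, mode in [("yarn", "client"), ("yarn-cluster", "client"),
                         ("spark://aa01:7077", "cluster")]:
        print(master, mode, "->", driver_location(master, mode))
```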
     
  • Original article: https://www.cnblogs.com/mustone/p/4962056.html