  • spark-submit (Spark version 2.3.2)

    spark-submit official documentation: http://spark.apache.org/docs/latest/submitting-applications.html

    Spark properties official documentation: http://spark.apache.org/docs/latest/configuration.html

    Launching Applications with spark-submit

    ./bin/spark-submit \
      --class <main-class> \
      --master <master-url> \
      --deploy-mode <deploy-mode> \
      --conf <key>=<value> \
      ... # other options
      <application-jar> \
      [application-arguments]
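
    For example, the SparkPi example that ships with the Spark distribution can be submitted to a local master like this (the jar path is an assumption based on a stock Spark 2.3.2 download; adjust it to your installation):

    ./bin/spark-submit \
      --class org.apache.spark.examples.SparkPi \
      --master local[8] \
      examples/jars/spark-examples_2.11-2.3.2.jar \
      100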
    The Spark shell and the spark-submit tool support two ways to load configuration dynamically. The first is command-line options, such as --master, shown above; spark-submit can accept any Spark property via the --conf flag, but uses dedicated flags for the properties that play a part in launching the application. The second is reading options from conf/spark-defaults.conf, which spark-submit loads automatically when no --properties-file is given (a sketch of such a file appears after the option list below). Running ./bin/spark-submit --help prints the complete list of these options:
    Usage: spark-submit [options] <app jar | python file | R file> [app arguments]
    Usage: spark-submit --kill [submission ID] --master [spark://...]
    Usage: spark-submit --status [submission ID] --master [spark://...]
    Usage: spark-submit run-example [options] example-class [example args]
    Some of the commonly used options are:
    Options:
      --master MASTER_URL         spark://host:port, mesos://host:port, yarn,
                                  k8s://https://host:port, or local (Default: local[*]).
      --deploy-mode DEPLOY_MODE   Whether to launch the driver program locally ("client") or
                                  on one of the worker machines inside the cluster ("cluster")
                                  (Default: client).
      --class CLASS_NAME          Your application's main class (for Java / Scala apps).
      --name NAME                 A name of your application.
      --jars JARS                 Comma-separated list of jars to include on the driver
                                  and executor classpaths.
      --packages                  Comma-separated list of maven coordinates of jars to include
                                  on the driver and executor classpaths. Will search the local
                                  maven repo, then maven central and any additional remote
                                  repositories given by --repositories. The format for the
                                  coordinates should be groupId:artifactId:version.
      --exclude-packages          Comma-separated list of groupId:artifactId, to exclude while
                                  resolving the dependencies provided in --packages to avoid
                                  dependency conflicts.
      --repositories              Comma-separated list of additional remote repositories to
                                  search for the maven coordinates given with --packages.
      --py-files PY_FILES         Comma-separated list of .zip, .egg, or .py files to place
                                  on the PYTHONPATH for Python apps.
      --files FILES               Comma-separated list of files to be placed in the working
                                  directory of each executor. File paths of these files
                                  in executors can be accessed via SparkFiles.get(fileName).

      --conf PROP=VALUE           Arbitrary Spark configuration property.
      --properties-file FILE      Path to a file from which to load extra properties. If not
                                  specified, this will look for conf/spark-defaults.conf.
      --driver-memory MEM         Memory for driver (e.g. 1000M, 2G) (Default: 1024M).
      --driver-java-options       Extra Java options to pass to the driver.
      --driver-library-path       Extra library path entries to pass to the driver.
      --driver-class-path         Extra class path entries to pass to the driver. Note that
                                  jars added with --jars are automatically included in the
                                  classpath.

      --executor-memory MEM       Memory per executor (e.g. 1000M, 2G) (Default: 1G).

      --proxy-user NAME           User to impersonate when submitting the application.
                                  This argument does not work with --principal / --keytab.

      --help, -h                  Show this help message and exit.
      --verbose, -v               Print additional debug output.
      --version,                  Print the version of current Spark.

     Cluster deploy mode only:
      --driver-cores NUM          Number of cores used by the driver, only in cluster mode
                                  (Default: 1).

     Spark standalone or Mesos with cluster deploy mode only:
      --supervise                 If given, restarts the driver on failure.
      --kill SUBMISSION_ID        If given, kills the driver specified.
      --status SUBMISSION_ID      If given, requests the status of the driver specified.

     Spark standalone and Mesos only:
      --total-executor-cores NUM  Total cores for all executors.

     Spark standalone and YARN only:
      --executor-cores NUM        Number of cores per executor. (Default: 1 in YARN mode,
                                  or all available cores on the worker in standalone mode)

     YARN-only:
      --queue QUEUE_NAME          The YARN queue to submit to (Default: "default").
      --num-executors NUM         Number of executors to launch (Default: 2).
                                  If dynamic allocation is enabled, the initial number of
                                  executors will be at least NUM.
      --archives ARCHIVES         Comma separated list of archives to be extracted into the
                                  working directory of each executor.
      --principal PRINCIPAL       Principal to be used to login to KDC, while running on
                                  secure HDFS.
      --keytab KEYTAB             The full path to the file that contains the keytab for the
                                  principal specified above. This keytab will be copied to
                                  the node running the Application Master via the Secure
                                  Distributed Cache, for renewing the login tickets and the
                                  delegation tokens periodically.
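
    To make the second configuration method concrete, here is a minimal sketch of a conf/spark-defaults.conf. The property names are standard Spark properties; the values are placeholders to adapt to your cluster:

    spark.master                     yarn
    spark.executor.memory            4g
    spark.serializer                 org.apache.spark.serializer.KryoSerializer
    spark.eventLog.enabled           true
    spark.eventLog.dir               hdfs:///spark-logs

    Note the precedence: properties set on a SparkConf in application code win over flags passed to spark-submit, which in turn win over entries in spark-defaults.conf.

    And here is a sketch of a cluster-mode submission to YARN combining several of the options listed above (the class name, jar, and arguments are hypothetical placeholders):

    ./bin/spark-submit \
      --class com.example.MyApp \
      --master yarn \
      --deploy-mode cluster \
      --queue default \
      --num-executors 4 \
      --executor-memory 2G \
      --executor-cores 2 \
      --conf spark.yarn.maxAppAttempts=2 \
      my-app.jar arg1 arg2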
  • Original post: https://www.cnblogs.com/jqbai/p/10783805.html