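Submitting a Spark program to a cluster works the same way as submitting a MapReduce program: first package the finished Spark program into a jar, then submit it from the command line with spark-submit.
Step 1: Package the program
Steps for packaging with IntelliJ IDEA:
Step 2: Submit the job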
./spark-submit --class com.jz.bigdata.DecisionTree --master spark://master:7077 --executor-memory 2g --num-executors 5 /bigdata/DecisionTree.jar
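The submit command is easier to review when its parts are named. As a minimal sketch, the same submission could be assembled by a small shell function that prints the command line instead of running it. The variable names and the function build_submit_cmd are hypothetical, and the host name master is assumed to resolve to the standalone master on your network.

```shell
#!/bin/sh
# Hypothetical settings; adjust them to your own cluster and jar.
MAIN_CLASS="com.jz.bigdata.DecisionTree"
MASTER_URL="spark://master:7077"   # standalone master, in the form spark://HOST:PORT
EXECUTOR_MEMORY="2g"
APP_JAR="/bigdata/DecisionTree.jar"

# Print the spark-submit command line without running it (a dry run).
build_submit_cmd() {
  echo "./spark-submit" \
    "--class $MAIN_CLASS" \
    "--master $MASTER_URL" \
    "--executor-memory $EXECUTOR_MEMORY" \
    "$APP_JAR"
}

build_submit_cmd
```

To submit for real, run the printed line from Spark's bin directory. The sketch leaves out --num-executors because spark-submit documents that option for YARN; on a standalone master, --total-executor-cores is the usual way to cap the resources a job takes.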
Appendix:
Examples from the official documentation of submitting Spark programs with spark-submit:
# Run application locally on 8 cores
./bin/spark-submit --class org.apache.spark.examples.SparkPi --master local[8] /path/to/examples.jar 100

# Run on a Spark standalone cluster
./bin/spark-submit --class org.apache.spark.examples.SparkPi --master spark://207.184.161.138:7077 --executor-memory 20G --total-executor-cores 100 /path/to/examples.jar 1000

# Run on a YARN cluster
# (--master can also be `yarn-client` for client mode)
export HADOOP_CONF_DIR=XXX
./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster --executor-memory 20G --num-executors 50 /path/to/examples.jar 1000

# Run a Python application on a cluster
./bin/spark-submit --master spark://207.184.161.138:7077 examples/src/main/python/pi.py 1000
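
The examples above use three forms of the --master value: local[N], spark://HOST:PORT, and yarn-cluster or yarn-client. As a rough sketch, a shell function could classify a master string before submission, which catches typos such as a missing // after spark:. The function name classify_master is hypothetical, and the patterns cover only the forms shown in these examples, not every master URL that Spark accepts.

```shell
#!/bin/sh
# Classify a --master value by the forms used in the examples above.
# Hypothetical helper: it only checks the shape of the string,
# not whether the master is actually reachable.
classify_master() {
  case "$1" in
    local|local\[*\])        echo "local" ;;
    spark://*:*)             echo "standalone" ;;
    yarn-client|yarn-cluster) echo "yarn" ;;
    *)                       echo "unknown" ;;
  esac
}

classify_master "local[8]"
classify_master "spark://207.184.161.138:7077"
classify_master "yarn-cluster"
classify_master "spark:master:7077"
```

A wrapper script could refuse to call spark-submit when the result is unknown, so that a malformed URL fails early with a clear message.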