  • A spark-submit submission-mode test demo

    A small demo to walk through the flow of submitting a Spark program.

    The Maven pom file:

    <properties>
            <maven.compiler.source>1.7</maven.compiler.source>
            <maven.compiler.target>1.7</maven.compiler.target>
            <encoding>UTF-8</encoding>
            <spark.version>1.6.1</spark.version>
      </properties>
    
      <dependencies>
            <dependency>
                <groupId>org.apache.spark</groupId>
                <artifactId>spark-core_2.10</artifactId>
                <version>${spark.version}</version>
            </dependency>
    
            <dependency>
                <groupId>redis.clients</groupId>
                <artifactId>jedis</artifactId>
                <version>2.7.1</version>
            </dependency>
    
      </dependencies>
      
       <build>
            <plugins>
                <plugin>
                    <groupId>org.apache.maven.plugins</groupId>
                    <artifactId>maven-compiler-plugin</artifactId>
                    <configuration>
                        <source>1.7</source>
                        <target>1.7</target>
                    </configuration>
                </plugin>
            
                <plugin>
                    <groupId>org.apache.maven.plugins</groupId>
                    <artifactId>maven-shade-plugin</artifactId>
                    <version>2.4.3</version>
                    <executions>
                        <execution>
                            <phase>package</phase>
                            <goals>
                                <goal>shade</goal>
                            </goals>
                            <configuration>
                                <filters>
                                    <filter>
                                        <artifact>*:*</artifact>
                                        <excludes>
                                            <exclude>META-INF/*.SF</exclude>
                                            <exclude>META-INF/*.DSA</exclude>
                                            <exclude>META-INF/*.RSA</exclude>
                                        </excludes>
                                    </filter>
                                </filters>
                            </configuration>
                        </execution>
                    </executions>
                </plugin>
            </plugins>
        </build>
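
    With the shade plugin bound to the package phase, a plain Maven build is enough to produce the fat jar submitted below (a minimal sketch; the exact jar name depends on this project's artifactId and version, which are not shown in the post):

    mvn clean package
    # the shaded jar is written to target/<artifactId>-<version>.jar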

    Write a Monte Carlo π estimator: sample points uniformly in the square [-1,1]×[-1,1]; the fraction that lands inside the unit circle tends to π/4, so π ≈ 4·count/n.

    import java.util.ArrayList;
    import java.util.List;
    
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.api.java.function.Function;
    import org.apache.spark.api.java.function.Function2;
    
    import redis.clients.jedis.Jedis;
    
    /** 
     * Computes an approximation to pi
     * Usage: JavaSparkPi [slices]
     */
    public final class JavaSparkPi {
    
      public static void main(String[] args) throws Exception {
        SparkConf sparkConf = new SparkConf().setAppName("JavaSparkPi")/*.setMaster("local[2]")*/;
        JavaSparkContext jsc = new JavaSparkContext(sparkConf);
        
        // Redis client on the driver; used only on the driver to publish the result
        Jedis jedis = new Jedis("192.168.49.151", 19000);
        int slices = (args.length == 1) ? Integer.parseInt(args[0]) : 2;
        int n = 100000 * slices;
        List<Integer> l = new ArrayList<Integer>(n);
        for (int i = 0; i < n; i++) {
          l.add(i);
        }
    
        JavaRDD<Integer> dataSet = jsc.parallelize(l, slices);
    
        // Sample a uniform point in [-1,1]^2 per element: 1 if it lands inside the unit circle, else 0
        int count = dataSet.map(new Function<Integer, Integer>() {
          @Override
          public Integer call(Integer integer) {
            double x = Math.random() * 2 - 1;
            double y = Math.random() * 2 - 1;
            return (x * x + y * y < 1) ? 1 : 0;
          }
        }).reduce(new Function2<Integer, Integer, Integer>() { // sum the hits
          @Override
          public Integer call(Integer integer, Integer integer2) {
            return integer + integer2;
          }
        });
    
        jedis.set("Pi", String.valueOf(4.0 * count / n));
        System.out.println("Pi is roughly " + 4.0 * count / n);
        
        jsc.stop();
      }
    }

    Precondition: .setMaster("local[2]") is left commented out rather than hard-coded, so the master is chosen entirely by spark-submit.
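
    After a successful run, the value written by jedis.set("Pi", ...) can be read back from any machine that can reach the Redis instance; a quick check, assuming the same host and port as in the code:

    redis-cli -h 192.168.49.151 -p 19000 get Pi
    # prints the stored estimate; the exact value varies per run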


    Local mode test: # Run application locally on 8 cores

    spark-submit \
    --master local[8] \
    --class com.spark.test.JavaSparkPi \
    --executor-memory 4g \
    --executor-cores 4 \
    /home/dinpay/test/Spark-SubmitTest.jar 100

    Result: the job runs locally, launching 8 tasks at once; the submission never appears on the standalone Web UI at port 8080, because no cluster is involved.

    -------------------------------------

    spark-submit \
    --master local[8] \
    --class com.spark.test.JavaSparkPi \
    --executor-memory 8G \
    --total-executor-cores 8 \
    hdfs://192.168.46.163:9000/home/test/Spark-SubmitTest.jar 100

    Fails with: java.lang.ClassNotFoundException: com.spark.test.JavaSparkPi. With a local master the driver runs on the submitting machine, and a jar given as an hdfs:// path is apparently not placed on the driver's classpath here, so the main class cannot be loaded. One workaround is sketched below.
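
    One possible workaround (a sketch, not part of the original test run; it assumes the submitting machine has HDFS access) is to pull the jar to the local filesystem first and submit it from there:

    hdfs dfs -get hdfs://192.168.46.163:9000/home/test/Spark-SubmitTest.jar /home/dinpay/test/
    spark-submit --master local[8] --class com.spark.test.JavaSparkPi \
        /home/dinpay/test/Spark-SubmitTest.jar 100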

    ------------------------------------

    spark-submit \
    --master local[8] \
    --deploy-mode cluster \
    --supervise \
    --class com.spark.test.JavaSparkPi \
    --executor-memory 8G \
    --total-executor-cores 8 \
    /home/dinpay/test/Spark-SubmitTest.jar 100

    Fails with: Error: Cluster deploy mode is not compatible with master "local". Cluster deploy mode needs a real cluster manager (standalone, YARN, or Mesos) as the master.


    ====================================================================


    Standalone cluster, client deploy mode # Run on a Spark standalone cluster in client deploy mode

    spark-submit \
    --master spark://hadoop-namenode-02:7077 \
    --class com.spark.test.JavaSparkPi \
    --executor-memory 8g \
    --total-executor-cores 8 \
    /home/dinpay/test/Spark-SubmitTest.jar 100

    Result: the job runs successfully.

    -------------------------------------------
    spark-submit \
    --master spark://hadoop-namenode-02:7077 \
    --class com.spark.test.JavaSparkPi \
    --executor-memory 4g \
    --executor-cores 4 \
    hdfs://192.168.46.163:9000/home/test/Spark-SubmitTest.jar 100

    Fails again with: java.lang.ClassNotFoundException: com.spark.test.JavaSparkPi. Same cause as in local mode: the application jar cannot be an hdfs:// path when the driver runs on the submitting machine.

    =======================================================================

    Standalone cluster, cluster deploy mode # Run on a Spark standalone cluster in cluster deploy mode with supervise

    spark-submit \
    --master spark://hadoop-namenode-02:7077 \
    --class com.spark.test.JavaSparkPi \
    --deploy-mode cluster \
    --supervise \
    --executor-memory 4g \
    --executor-cores 4 \
    /home/dinpay/test/Spark-SubmitTest.jar 100

    Fails with: java.io.FileNotFoundException: /home/dinpay/test/Spark-SubmitTest.jar (No such file or directory). In cluster deploy mode the driver is launched on a worker node, so a plain filesystem path must exist on that node; putting the jar on HDFS, as in the next command, avoids the problem.

    -------------------------------------------

    spark-submit \
    --master spark://hadoop-namenode-02:7077 \
    --class com.spark.test.JavaSparkPi \
    --deploy-mode cluster \
    --supervise \
    --driver-memory 4g \
    --driver-cores 4 \
    --executor-memory 2g \
    --total-executor-cores 4 \
    hdfs://192.168.46.163:9000/home/test/Spark-SubmitTest.jar 100

    Result: the job runs successfully.
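
    Since the driver is now managed by the standalone master, it can be polled or killed through spark-submit against the REST endpoint on port 6066 (a sketch; driver-20170524111127-0001 is a hypothetical submission ID, the real one is printed when the job is submitted):

    spark-submit --master spark://hadoop-namenode-02:6066 --status driver-20170524111127-0001
    spark-submit --master spark://hadoop-namenode-02:6066 --kill driver-20170524111127-0001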

    =============================================

    If the code hard-codes .setMaster("local[2]"), a cluster-mode submission still launches a driver, but no corresponding application runs in parallel on the cluster. (This submission goes to the master's REST endpoint on port 6066 rather than the legacy port 7077.)

    spark-submit \
    --master spark://hadoop-namenode-02:6066 \
    --deploy-mode cluster \
    --class com.dinpay.bdp.rcp.service.Window12HzStat \
    --driver-memory 2g \
    --driver-cores 2 \
    --executor-memory 1g \
    --total-executor-cores 2 \
    hdfs://192.168.46.163:9000/home/dinpay/RCP-HZ-TASK-0.0.1-SNAPSHOT.jar

    Again: with .setMaster("local[2]") pinned in the code, the submission still runs in local mode; a worker is chosen and the whole job runs locally on that node. This is why the demo above leaves setMaster commented out and lets spark-submit decide.
