  • Running Spark on Mesos -- Learning the Spark Distributed Computing System (Part 5)

    For Mesos cluster deployment, see the previous post in this series.

    The differences between running on Mesos and Spark standalone mode:

    1) Standalone

    You start the Spark master yourself.

    You start the Spark slaves (i.e. the worker processes) yourself.

    2) Running on Mesos

    Start the Mesos master.

    Start the Mesos slaves.

    Start Spark's dispatcher: ./sbin/start-mesos-dispatcher.sh -m mesos://127.0.0.1:5050

    Configure the location of a Spark binary package (what Mesos calls the executor) so that Mesos can download and run it.

    The execution flow on Mesos:

    1) spark-submit submits the job to the spark-mesos-dispatcher.

    2) The spark-mesos-dispatcher submits the driver to the Mesos master and gets back a task (driver) ID.

    3) The Mesos master assigns the task to a slave, which runs it.

    4) The spark-mesos-dispatcher queries the task status by that ID (a quick way to do this yourself is sketched below).
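
    The dispatcher exposes the same REST submission interface that Spark's standalone cluster mode uses, so a submitted driver can be checked by its ID. A rough sketch; the driver ID is only an example, and I am assuming the /v1/submissions endpoints that spark-submit itself talks to:

    # ask the dispatcher's REST server (port 7077 here) for the status of a submitted driver
    curl http://10.230.136.197:7077/v1/submissions/status/driver-20151105154326-0002

    # spark-submit should be able to do the same
    ./bin/spark-submit --master mesos://10.230.136.197:7077 --status driver-20151105154326-0002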

    Preliminary configuration:

    1) spark-env.sh

    Configure the Mesos native library and a Spark binary package for the executors to run (I lazily used the package from the official site; the URI can also be an hdfs or http location, and a sketch of hosting it yourself follows the config below).

    # cd to the Spark installation directory
    vim conf/spark-env.sh
    
    # Options read by executors and drivers running inside the cluster
    # - SPARK_LOCAL_IP, to set the IP address Spark binds to on this node
    # - SPARK_PUBLIC_DNS, to set the public DNS name of the driver program
    # - SPARK_CLASSPATH, default classpath entries to append
    # - SPARK_LOCAL_DIRS, storage directories to use on this node for shuffle and RDD data
    # - MESOS_NATIVE_JAVA_LIBRARY, to point to your libmesos.so if you use Mesos
    MESOS_NATIVE_JAVA_LIBRARY=/usr/local/lib/libmesos.so
    SPARK_EXECUTOR_URI=http://d3kbcqa49mib13.cloudfront.net/spark-1.5.1-bin-hadoop2.6.tgz
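
    If you would rather not pull the package from the public CloudFront URL on every launch, the same URI can point at your own HDFS or any plain HTTP server. A sketch, assuming a namenode at hdfs://10.230.136.197:9000 (that address and the /spark path are my assumptions, not part of the setup above):

    # upload the Spark package to HDFS
    hdfs dfs -mkdir -p /spark
    hdfs dfs -put spark-1.5.1-bin-hadoop2.6.tgz /spark/
    # then point spark-env.sh at it instead:
    # SPARK_EXECUTOR_URI=hdfs://10.230.136.197:9000/spark/spark-1.5.1-bin-hadoop2.6.tgz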

    2)spark-defaults.conf

    # Default system properties included when running spark-submit.
    # This is useful for setting default environmental settings.
    
    # Example:
    # spark.master                     spark://master:7077
    # spark.eventLog.enabled           true
    # spark.eventLog.dir               hdfs://namenode:8021/directory
    # spark.serializer                 org.apache.spark.serializer.KryoSerializer
    # spark.driver.memory              5g
    # spark.executor.extraJavaOptions  -XX:+PrintGCDetails -Dkey=value -Dnumbers="one two three"
    
    spark.executor.uri              http://d3kbcqa49mib13.cloudfront.net/spark-1.5.1-bin-hadoop2.6.tgz
    spark.master                    mesos://10.230.136.197:5050

    3) Modify the test program WordCount.java

    For this test example you can refer to the earlier post, Submitting Jobs to Spark.

    /**
     * Illustrates a wordcount in Java
     */
    package com.oreilly.learningsparkexamples.mini.java;
    
    import java.util.Arrays;
    import java.util.List;
    import java.lang.Iterable;
    
    import scala.Tuple2;
    
    import org.apache.commons.lang.StringUtils;
    
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.api.java.function.FlatMapFunction;
    import org.apache.spark.api.java.function.Function2;
    import org.apache.spark.api.java.function.PairFunction;
    
    
    public class WordCount {
      public static void main(String[] args) throws Exception {
        String inputFile = args[0];
        String outputFile = args[1];
        // Create a Java Spark Context.
        SparkConf conf = new SparkConf()
            .setMaster("mesos://10.230.136.197:5050")
            .setAppName("wordCount")
            .set("spark.executor.uri", "http://d3kbcqa49mib13.cloudfront.net/spark-1.5.1-bin-hadoop2.6.tgz");
        JavaSparkContext sc = new JavaSparkContext(conf);
        // Load our input data.
        JavaRDD<String> input = sc.textFile(inputFile);
        // Split up into words.
        JavaRDD<String> words = input.flatMap(
          new FlatMapFunction<String, String>() {
            public Iterable<String> call(String x) {
              return Arrays.asList(x.split(" "));
            }});
        // Transform into word and count.
        JavaPairRDD<String, Integer> counts = words.mapToPair(
          new PairFunction<String, String, Integer>(){
            public Tuple2<String, Integer> call(String x){
          return new Tuple2<String, Integer>(x, 1);
            }}).reduceByKey(new Function2<Integer, Integer, Integer>(){
                public Integer call(Integer x, Integer y){ return x + y;}});
        // Save the word count back out to a text file, causing evaluation.
        counts.saveAsTextFile(outputFile);
      }
    }

    Go into examples/mini-complete-example and rebuild the jar.
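
    The mini example is a Maven project (assuming the layout of the Learning Spark example repository; adjust if your build is different):

    cd examples/mini-complete-example
    mvn clean package
    # produces target/learning-spark-mini-example-0.0.1.jar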

    4) Start the spark-mesos-dispatcher

    ./sbin/start-mesos-dispatcher.sh -m mesos://10.230.136.197:5050
    Spark Command: /app/otter/jdk1.7.0_80/bin/java -cp /home/qingpingzhang/dev/spark-1.5.1-bin-hadoop2.6/sbin/../conf/:/home/qingpingzhang/dev/spark-1.5.1-bin-hadoop2.6/lib/spark-assembly-1.5.1-hadoop2.6.0.jar:/home/qingpingzhang/dev/spark-1.5.1-bin-hadoop2.6/lib/datanucleus-core-3.2.10.jar:/home/qingpingzhang/dev/spark-1.5.1-bin-hadoop2.6/lib/datanucleus-rdbms-3.2.9.jar:/home/qingpingzhang/dev/spark-1.5.1-bin-hadoop2.6/lib/datanucleus-api-jdo-3.2.6.jar -Xms1g -Xmx1g -XX:MaxPermSize=256m org.apache.spark.deploy.mesos.MesosClusterDispatcher --host vg-log-analysis-prod --port 7077 -m mesos://10.230.136.197:5050
    ========================================
    Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
    15/11/05 11:31:36 INFO MesosClusterDispatcher: Registered signal handlers for [TERM, HUP, INT]
    15/11/05 11:31:36 WARN Utils: Your hostname, vg-log-analysis-prod resolves to a loopback address: 127.0.0.1; using 10.230.136.197 instead (on interface eth0)
    15/11/05 11:31:36 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
    15/11/05 11:31:36 INFO MesosClusterDispatcher: Recovery mode in Mesos dispatcher set to: NONE
    15/11/05 11:31:36 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    15/11/05 11:31:37 INFO SecurityManager: Changing view acls to: qingpingzhang
    15/11/05 11:31:37 INFO SecurityManager: Changing modify acls to: qingpingzhang
    15/11/05 11:31:37 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(qingpingzhang); users with modify permissions: Set(qingpingzhang)
    15/11/05 11:31:37 INFO SecurityManager: Changing view acls to: qingpingzhang
    15/11/05 11:31:37 INFO SecurityManager: Changing modify acls to: qingpingzhang
    15/11/05 11:31:37 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(qingpingzhang); users with modify permissions: Set(qingpingzhang)
    15/11/05 11:31:37 INFO Utils: Successfully started service on port 8081.
    15/11/05 11:31:37 INFO MesosClusterUI: Started MesosClusterUI at http://10.230.136.197:8081
    WARNING: Logging before InitGoogleLogging() is written to STDERR
    W1105 03:31:37.927594  3374 sched.cpp:1487]
    **************************************************
    Scheduler driver bound to loopback interface! Cannot communicate with remote master(s). You might want to set 'LIBPROCESS_IP' environment variable to use a routable IP address.
    **************************************************
    I1105 03:31:37.931098  3408 sched.cpp:164] Version: 0.24.0
    I1105 03:31:37.939507  3406 sched.cpp:262] New master detected at master@10.230.136.197:5050
    I1105 03:31:37.940353  3406 sched.cpp:272] No credentials provided. Attempting to register without authentication
    I1105 03:31:37.943528  3406 sched.cpp:640] Framework registered with 20151105-021937-16777343-5050-32543-0001
    15/11/05 11:31:37 INFO MesosClusterScheduler: Registered as framework ID 20151105-021937-16777343-5050-32543-0001
    15/11/05 11:31:37 INFO Utils: Successfully started service on port 7077.
    15/11/05 11:31:37 INFO MesosRestServer: Started REST server for submitting applications on port 7077

    5) Submit the job

    ./bin/spark-submit  --master mesos://10.230.136.197:7077 --deploy-mode cluster --class com.oreilly.learningsparkexamples.mini.java.WordCount  /home/qingpingzhang/dev/spark-1.5.1-bin-hadoop2.6/examples/mini-complete-example/target/learning-spark-mini-example-0.0.1.jar  /home/qingpingzhang/dev/spark-1.5.1-bin-hadoop2.6/README.md /home/qingpingzhang/dev/spark-1.5.1-bin-hadoop2.6/wordcounts.txt

    In the mesos-master output we can then see it receive the task, hand it to a slave, and finally update the task status:

    I1105 05:08:33.312283  7490 master.cpp:2094] Received SUBSCRIBE call for framework 'Spark Cluster' at scheduler-fae58488-7d56-4661-a078-938d12871930@10.230.136.197:57392
    I1105 05:08:33.312641  7490 master.cpp:2164] Subscribing framework Spark Cluster with checkpointing enabled and capabilities [  ]
    I1105 05:08:33.313761  7486 hierarchical.hpp:391] Added framework 20151105-050710-3314083338-5050-7469-0000
    I1105 05:08:33.315335  7490 master.cpp:4613] Sending 1 offers to framework 20151105-050710-3314083338-5050-7469-0000 (Spark Cluster) at scheduler-fae58488-7d56-4661-a078-938d12871930@10.230.136.197:57392
    I1105 05:08:33.426009  7489 master.cpp:2739] Processing ACCEPT call for offers: [ 20151105-050710-3314083338-5050-7469-O0 ] on slave 20151105-045733-3314083338-5050-7152-S0 at slave(1)@10.29.23.28:5051 (ip-10-29-23-28.ec2.internal) for framework 20151105-050710-3314083338-5050-7469-0000 (Spark Cluster) at scheduler-fae58488-7d56-4661-a078-938d12871930@10.230.136.197:57392
    I1105 05:08:33.427104  7486 hierarchical.hpp:814] Recovered cpus(*):8; mem(*):5986; disk(*):196338; ports(*):[31000-32000] (total: cpus(*):8; mem(*):5986; disk(*):196338; ports(*):[31000-32000], allocated: ) on slave 20151105-045733-3314083338-5050-7152-S0 from framework 20151105-050710-3314083338-5050-7469-0000
    I1105 05:08:39.177790  7484 master.cpp:4613] Sending 1 offers to framework 20151105-050710-3314083338-5050-7469-0000 (Spark Cluster) at scheduler-fae58488-7d56-4661-a078-938d12871930@10.230.136.197:57392
    I1105 05:08:39.181149  7489 master.cpp:2739] Processing ACCEPT call for offers: [ 20151105-050710-3314083338-5050-7469-O1 ] on slave 20151105-045733-3314083338-5050-7152-S0 at slave(1)@10.29.23.28:5051 (ip-10-29-23-28.ec2.internal) for framework 20151105-050710-3314083338-5050-7469-0000 (Spark Cluster) at scheduler-fae58488-7d56-4661-a078-938d12871930@10.230.136.197:57392
    I1105 05:08:39.181699  7485 hierarchical.hpp:814] Recovered cpus(*):8; mem(*):5986; disk(*):196338; ports(*):[31000-32000] (total: cpus(*):8; mem(*):5986; disk(*):196338; ports(*):[31000-32000], allocated: ) on slave 20151105-045733-3314083338-5050-7152-S0 from framework 20151105-050710-3314083338-5050-7469-0000
    I1105 05:08:44.183100  7486 master.cpp:4613] Sending 1 offers to framework 20151105-050710-3314083338-5050-7469-0000 (Spark Cluster) at scheduler-fae58488-7d56-4661-a078-938d12871930@10.230.136.197:57392
    I1105 05:08:44.186468  7484 master.cpp:2739] Processing ACCEPT call for offers: [ 20151105-050710-3314083338-5050-7469-O2 ] on slave 20151105-045733-3314083338-5050-7152-S0 at slave(1)@10.29.23.28:5051 (ip-10-29-23-28.ec2.internal) for framework 20151105-050710-3314083338-5050-7469-0000 (Spark Cluster) at scheduler-fae58488-7d56-4661-a078-938d12871930@10.230.136.197:57392
    I1105 05:08:44.187100  7485 hierarchical.hpp:814] Recovered cpus(*):8; mem(*):5986; disk(*):196338; ports(*):[31000-32000] (total: cpus(*):8; mem(*):5986; disk(*):196338; ports(*):[31000-32000], allocated: ) on slave 20151105-045733-3314083338-5050-7152-S0 from framework 20151105-050710-3314083338-5050-7469-0000
    
    I1105 05:12:18.668609  7489 master.cpp:4069] Status update TASK_FAILED (UUID: 8d30c637-b885-487b-b174-47232cc0e49f) for task driver-20151105131213-0002 of framework 20151105-050710-3314083338-5050-7469-0000 from slave 20151105-045733-3314083338-5050-7152-S0 at slave(1)@10.29.23.28:5051 (ip-10-29-23-28.ec2.internal)
    I1105 05:12:18.668689  7489 master.cpp:4108] Forwarding status update TASK_FAILED (UUID: 8d30c637-b885-487b-b174-47232cc0e49f) for task driver-20151105131213-0002 of framework 20151105-050710-3314083338-5050-7469-0000
    I1105 05:12:18.669001  7489 master.cpp:5576] Updating the latest state of task driver-20151105131213-0002 of framework 20151105-050710-3314083338-5050-7469-0000 to TASK_FAILED
    I1105 05:12:18.669373  7483 hierarchical.hpp:814] Recovered cpus(*):1; mem(*):1024 (total: cpus(*):8; mem(*):5986; disk(*):196338; ports(*):[31000-32000], allocated: ) on slave 20151105-045733-3314083338-5050-7152-S0 from framework 20151105-050710-3314083338-5050-7469-0000
    I1105 05:12:18.670912  7489 master.cpp:5644] Removing task driver-20151105131213-0002 with resources cpus(*):1; mem(*):1024 of framework 20151105-050710-3314083338-5050-7469-0000 on slave 20151105-045733-3314083338-5050-7152-S0 at slave(1)@10.29.23.28:5051 (ip-10-29-23-28.ec2.internal)

    In the mesos-slave output, it receives the task, fails to fetch the files the task needs, and the task ends up failing:

    I1105 05:11:31.363765 17084 slave.cpp:1270] Got assigned task driver-20151105131130-0001 for framework 20151105-050710-3314083338-5050-7469-0000
    I1105 05:11:31.365025 17084 slave.cpp:1386] Launching task driver-20151105131130-0001 for framework 20151105-050710-3314083338-5050-7469-0000
    I1105 05:11:31.376075 17084 slave.cpp:4852] Launching executor driver-20151105131130-0001 of framework 20151105-050710-3314083338-5050-7469-0000 with resources cpus(*):0.1; mem(*):32 in work directory '/tmp/mesos/slaves/20151105-045733-3314083338-5050-7152-S0/frameworks/20151105-050710-3314083338-5050-7469-0000/executors/driver-20151105131130-0001/runs/461ceb14-9247-4a59-b9d4-4b0a7947e353'
    I1105 05:11:31.376448 17085 containerizer.cpp:640] Starting container '461ceb14-9247-4a59-b9d4-4b0a7947e353' for executor 'driver-20151105131130-0001' of framework '20151105-050710-3314083338-5050-7469-0000'
    I1105 05:11:31.376878 17084 slave.cpp:1604] Queuing task 'driver-20151105131130-0001' for executor driver-20151105131130-0001 of framework '20151105-050710-3314083338-5050-7469-0000
    I1105 05:11:31.379096 17083 linux_launcher.cpp:352] Cloning child process with flags =
    I1105 05:11:31.382968 17083 containerizer.cpp:873] Checkpointing executor's forked pid 17098 to '/tmp/mesos/meta/slaves/20151105-045733-3314083338-5050-7152-S0/frameworks/20151105-050710-3314083338-5050-7469-0000/executors/driver-20151105131130-0001/runs/461ceb14-9247-4a59-b9d4-4b0a7947e353/pids/forked.pid'
    E1105 05:11:31.483093 17078 fetcher.cpp:515] Failed to run mesos-fetcher: Failed to fetch all URIs for container '461ceb14-9247-4a59-b9d4-4b0a7947e353' with exit status: 256
    E1105 05:11:31.483355 17079 slave.cpp:3342] Container '461ceb14-9247-4a59-b9d4-4b0a7947e353' for executor 'driver-20151105131130-0001' of framework '20151105-050710-3314083338-5050-7469-0000' failed to start: Failed to fetch all URIs for container '461ceb14-9247-4a59-b9d4-4b0a7947e353' with exit status: 256
    I1105 05:11:31.483444 17079 containerizer.cpp:1097] Destroying container '461ceb14-9247-4a59-b9d4-4b0a7947e353'
    I1105 05:11:31.485548 17084 cgroups.cpp:2433] Freezing cgroup /sys/fs/cgroup/freezer/mesos/461ceb14-9247-4a59-b9d4-4b0a7947e353
    I1105 05:11:31.487112 17080 cgroups.cpp:1415] Successfully froze cgroup /sys/fs/cgroup/freezer/mesos/461ceb14-9247-4a59-b9d4-4b0a7947e353 after 1.48992ms
    I1105 05:11:31.488673 17082 cgroups.cpp:2450] Thawing cgroup /sys/fs/cgroup/freezer/mesos/461ceb14-9247-4a59-b9d4-4b0a7947e353
    I1105 05:11:31.490102 17082 cgroups.cpp:1444] Successfullly thawed cgroup /sys/fs/cgroup/freezer/mesos/461ceb14-9247-4a59-b9d4-4b0a7947e353 after 1.363968ms
    I1105 05:11:31.583328 17082 containerizer.cpp:1284] Executor for container '461ceb14-9247-4a59-b9d4-4b0a7947e353' has exited
    I1105 05:11:31.583977 17081 slave.cpp:3440] Executor 'driver-20151105131130-0001' of framework 20151105-050710-3314083338-5050-7469-0000 exited with status 1
    I1105 05:11:31.585134 17081 slave.cpp:2717] Handling status update TASK_FAILED (UUID: 0c429164-a159-4e72-9872-c5fbd2ef9004) for task driver-20151105131130-0001 of framework 20151105-050710-3314083338-5050-7469-0000 from @0.0.0.0:0
    W1105 05:11:31.585384 17084 containerizer.cpp:988] Ignoring update for unknown container: 461ceb14-9247-4a59-b9d4-4b0a7947e353
    I1105 05:11:31.585605 17084 status_update_manager.cpp:322] Received status update TASK_FAILED (UUID: 0c429164-a159-4e72-9872-c5fbd2ef9004) for task driver-20151105131130-0001 of framework 20151105-050710-3314083338-5050-7469-0000
    I1105 05:11:31.585911 17084 status_update_manager.cpp:826] Checkpointing UPDATE for status update TASK_FAILED (UUID: 0c429164-a159-4e72-9872-c5fbd2ef9004) for task driver-20151105131130-0001 of framework 20151105-050710-3314083338-5050-7469-0000
    I1105 05:11:31.596305 17081 slave.cpp:3016] Forwarding the update TASK_FAILED (UUID: 0c429164-a159-4e72-9872-c5fbd2ef9004) for task driver-20151105131130-0001 of framework 20151105-050710-3314083338-5050-7469-0000 to master@10.230.136.197:5050
    I1105 05:11:31.611620 17083 status_update_manager.cpp:394] Received status update acknowledgement (UUID: 0c429164-a159-4e72-9872-c5fbd2ef9004) for task driver-20151105131130-0001 of framework 20151105-050710-3314083338-5050-7469-0000
    I1105 05:11:31.611702 17083 status_update_manager.cpp:826] Checkpointing ACK for status update TASK_FAILED (UUID: 0c429164-a159-4e72-9872-c5fbd2ef9004) for task driver-20151105131130-0001 of framework 20151105-050710-3314083338-5050-7469-0000
    I1105 05:11:31.616345 17083 slave.cpp:3544] Cleaning up executor 'driver-20151105131130-0001' of framework 20151105-050710-3314083338-5050-7469-0000

    So how do we solve this?

    Check the run logs on the Mesos slave (by default under /tmp/mesos/slaves/). It turns out this is not like standalone mode at all: in standalone mode the Spark master starts an HTTP server and serves the jar to the Spark workers to download and run, but in Mesos mode the jar has to sit somewhere the slave can fetch it from, such as an hdfs or http path. That feels rather inconvenient (a workaround is sketched after the log below).

    /tmp/mesos/slaves/20151105-045733-3314083338-5050-7152-S0/frameworks/20151105-070418-3314083338-5050-12075-0000/executors/driver-20151105151012-0001/runs/bed1c620-e849-4076-b130-c95c47133599$ cat stderr
    I1105 07:10:17.387609 17897 fetcher.cpp:414] Fetcher Info: {"cache_directory":"/tmp/mesos/fetch/slaves/20151105-045733-3314083338-5050-7152-S0/qingpingzhang","items":[{"action":"BYPASS_CACHE","uri":{"extract":true,"value":"/home/qingpingzhang/dev/spark-1.5.1-bin-hadoop2.6/examples/mini-complete-example/target/learning-spark-mini-example-0.0.1.jar"}},{"action":"BYPASS_CACHE","uri":{"extract":true,"value":"http://d3kbcqa49mib13.cloudfront.net/spark-1.5.1-bin-hadoop2.6.tgz"}}],"sandbox_directory":"/tmp/mesos/slaves/20151105-045733-3314083338-5050-7152-S0/frameworks/20151105-070418-3314083338-5050-12075-0000/executors/driver-20151105151012-0001/runs/bed1c620-e849-4076-b130-c95c47133599","user":"qingpingzhang"}
    I1105 07:10:17.390316 17897 fetcher.cpp:369] Fetching URI '/home/qingpingzhang/dev/spark-1.5.1-bin-hadoop2.6/examples/mini-complete-example/target/learning-spark-mini-example-0.0.1.jar'
    I1105 07:10:17.390344 17897 fetcher.cpp:243] Fetching directly into the sandbox directory
    I1105 07:10:17.390384 17897 fetcher.cpp:180] Fetching URI '/home/qingpingzhang/dev/spark-1.5.1-bin-hadoop2.6/examples/mini-complete-example/target/learning-spark-mini-example-0.0.1.jar'
    I1105 07:10:17.390418 17897 fetcher.cpp:160] Copying resource with command:cp '/home/qingpingzhang/dev/spark-1.5.1-bin-hadoop2.6/examples/mini-complete-example/target/learning-spark-mini-example-0.0.1.jar' '/tmp/mesos/slaves/20151105-045733-3314083338-5050-7152-S0/frameworks/20151105-070418-3314083338-5050-12075-0000/executors/driver-20151105151012-0001/runs/bed1c620-e849-4076-b130-c95c47133599/learning-spark-mini-example-0.0.1.jar'
    cp: cannot stat ‘/home/qingpingzhang/dev/spark-1.5.1-bin-hadoop2.6/examples/mini-complete-example/target/learning-spark-mini-example-0.0.1.jar’: No such file or directory
    Failed to fetch '/home/qingpingzhang/dev/spark-1.5.1-bin-hadoop2.6/examples/mini-complete-example/target/learning-spark-mini-example-0.0.1.jar': Failed to copy with command 'cp '/home/qingpingzhang/dev/spark-1.5.1-bin-hadoop2.6/examples/mini-complete-example/target/learning-spark-mini-example-0.0.1.jar' '/tmp/mesos/slaves/20151105-045733-3314083338-5050-7152-S0/frameworks/20151105-070418-3314083338-5050-12075-0000/executors/driver-20151105151012-0001/runs/bed1c620-e849-4076-b130-c95c47133599/learning-spark-mini-example-0.0.1.jar'', exit status: 256
    Failed to synchronize with slave (it's probably exited)
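
    Rather than copying files onto every slave by hand, the cleaner fix is to host the jar at a URI the Mesos fetcher can actually download, for example on HDFS or behind any plain HTTP server. A sketch of both options; the port, paths and namenode address are assumptions:

    # option A: serve the build directory over HTTP (Python 2's built-in server)
    cd examples/mini-complete-example/target
    python -m SimpleHTTPServer 8000

    # option B: put the jar on HDFS
    hdfs dfs -put learning-spark-mini-example-0.0.1.jar hdfs://10.230.136.197:9000/jars/

    # the jar argument to spark-submit can then be the matching http:// or hdfs:// URI;
    # the input and output paths still have to be readable/writable where the driver runs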

    Fine. To get the test through, for now I simply copy the jar and the README file over to the Mesos slave machine:

    ./bin/spark-submit  --master mesos://10.230.136.197:7077 --deploy-mode cluster --class com.oreilly.learningsparkexamples.mini.java.WordCount  /tmp/learning-spark-mini-example-0.0.1.jar  /tmp/README.md /tmp/wordcounts.txt

    Sure enough, the job now runs successfully. The mesos-slave output is as follows:

    I1105 07:43:26.748515 17244 slave.cpp:1270] Got assigned task driver-20151105154326-0002 for framework 20151105-070418-3314083338-5050-12075-0000
    I1105 07:43:26.749575 17244 gc.cpp:84] Unscheduling '/tmp/mesos/slaves/20151105-045733-3314083338-5050-7152-S0/frameworks/20151105-070418-3314083338-5050-12075-0000' from gc
    I1105 07:43:26.749703 17247 gc.cpp:84] Unscheduling '/tmp/mesos/meta/slaves/20151105-045733-3314083338-5050-7152-S0/frameworks/20151105-070418-3314083338-5050-12075-0000' from gc
    I1105 07:43:26.749825 17246 slave.cpp:1386] Launching task driver-20151105154326-0002 for framework 20151105-070418-3314083338-5050-12075-0000
    I1105 07:43:26.760673 17246 slave.cpp:4852] Launching executor driver-20151105154326-0002 of framework 20151105-070418-3314083338-5050-12075-0000 with resources cpus(*):0.1; mem(*):32 in work directory '/tmp/mesos/slaves/20151105-045733-3314083338-5050-7152-S0/frameworks/20151105-070418-3314083338-5050-12075-0000/executors/driver-20151105154326-0002/runs/910a4983-2732-41dd-b014-66827c044c16'
    I1105 07:43:26.760967 17244 containerizer.cpp:640] Starting container '910a4983-2732-41dd-b014-66827c044c16' for executor 'driver-20151105154326-0002' of framework '20151105-070418-3314083338-5050-12075-0000'
    I1105 07:43:26.761265 17246 slave.cpp:1604] Queuing task 'driver-20151105154326-0002' for executor driver-20151105154326-0002 of framework '20151105-070418-3314083338-5050-12075-0000
    I1105 07:43:26.763134 17249 linux_launcher.cpp:352] Cloning child process with flags =
    I1105 07:43:26.766726 17249 containerizer.cpp:873] Checkpointing executor's forked pid 18129 to '/tmp/mesos/meta/slaves/20151105-045733-3314083338-5050-7152-S0/frameworks/20151105-070418-3314083338-5050-12075-0000/executors/driver-20151105154326-0002/runs/910a4983-2732-41dd-b014-66827c044c16/pids/forked.pid'
    I1105 07:43:33.153153 17244 slave.cpp:2379] Got registration for executor 'driver-20151105154326-0002' of framework 20151105-070418-3314083338-5050-12075-0000 from executor(1)@10.29.23.28:54580
    I1105 07:43:33.154284 17246 slave.cpp:1760] Sending queued task 'driver-20151105154326-0002' to executor 'driver-20151105154326-0002' of framework 20151105-070418-3314083338-5050-12075-0000
    I1105 07:43:33.160464 17243 slave.cpp:2717] Handling status update TASK_RUNNING (UUID: eb83c08d-19fc-4d20-ab17-6f69ee8d90b2) for task driver-20151105154326-0002 of framework 20151105-070418-3314083338-5050-12075-0000 from executor(1)@10.29.23.28:54580
    I1105 07:43:33.160643 17242 status_update_manager.cpp:322] Received status update TASK_RUNNING (UUID: eb83c08d-19fc-4d20-ab17-6f69ee8d90b2) for task driver-20151105154326-0002 of framework 20151105-070418-3314083338-5050-12075-0000
    I1105 07:43:33.160940 17242 status_update_manager.cpp:826] Checkpointing UPDATE for status update TASK_RUNNING (UUID: eb83c08d-19fc-4d20-ab17-6f69ee8d90b2) for task driver-20151105154326-0002 of framework 20151105-070418-3314083338-5050-12075-0000
    I1105 07:43:33.168092 17243 slave.cpp:3016] Forwarding the update TASK_RUNNING (UUID: eb83c08d-19fc-4d20-ab17-6f69ee8d90b2) for task driver-20151105154326-0002 of framework 20151105-070418-3314083338-5050-12075-0000 to master@10.230.136.197:5050
    I1105 07:43:33.168218 17243 slave.cpp:2946] Sending acknowledgement for status update TASK_RUNNING (UUID: eb83c08d-19fc-4d20-ab17-6f69ee8d90b2) for task driver-20151105154326-0002 of framework 20151105-070418-3314083338-5050-12075-0000 to executor(1)@10.29.23.28:54580
    I1105 07:43:33.171906 17247 status_update_manager.cpp:394] Received status update acknowledgement (UUID: eb83c08d-19fc-4d20-ab17-6f69ee8d90b2) for task driver-20151105154326-0002 of framework 20151105-070418-3314083338-5050-12075-0000
    I1105 07:43:33.172025 17247 status_update_manager.cpp:826] Checkpointing ACK for status update TASK_RUNNING (UUID: eb83c08d-19fc-4d20-ab17-6f69ee8d90b2) for task driver-20151105154326-0002 of framework 20151105-070418-3314083338-5050-12075-0000
    I1105 07:43:36.454344 17249 slave.cpp:3926] Current disk usage 1.10%. Max allowed age: 6.223128215149259days
    I1105 07:43:39.174698 17247 slave.cpp:1270] Got assigned task 0 for framework 20151105-070418-3314083338-5050-12075-0001
    I1105 07:43:39.175014 17247 slave.cpp:1386] Launching task 0 for framework 20151105-070418-3314083338-5050-12075-0001
    I1105 07:43:39.185343 17247 slave.cpp:4852] Launching executor 20151105-045733-3314083338-5050-7152-S0 of framework 20151105-070418-3314083338-5050-12075-0001 with resources cpus(*):1; mem(*):1408 in work directory '/tmp/mesos/slaves/20151105-045733-3314083338-5050-7152-S0/frameworks/20151105-070418-3314083338-5050-12075-0001/executors/20151105-045733-3314083338-5050-7152-S0/runs/8b233e11-54a6-41b6-a8ad-f82660d640b5'
    I1105 07:43:39.185643 17247 slave.cpp:1604] Queuing task '0' for executor 20151105-045733-3314083338-5050-7152-S0 of framework '20151105-070418-3314083338-5050-12075-0001
    I1105 07:43:39.185694 17246 containerizer.cpp:640] Starting container '8b233e11-54a6-41b6-a8ad-f82660d640b5' for executor '20151105-045733-3314083338-5050-7152-S0' of framework '20151105-070418-3314083338-5050-12075-0001'
    I1105 07:43:39.185786 17247 slave.cpp:1270] Got assigned task 1 for framework 20151105-070418-3314083338-5050-12075-0001
    I1105 07:43:39.185931 17247 slave.cpp:1386] Launching task 1 for framework 20151105-070418-3314083338-5050-12075-0001
    I1105 07:43:39.185968 17247 slave.cpp:1604] Queuing task '1' for executor 20151105-045733-3314083338-5050-7152-S0 of framework '20151105-070418-3314083338-5050-12075-0001
    I1105 07:43:39.187925 17245 linux_launcher.cpp:352] Cloning child process with flags =
    I1105 07:43:46.388809 17249 slave.cpp:2379] Got registration for executor '20151105-045733-3314083338-5050-7152-S0' of framework 20151105-070418-3314083338-5050-12075-0001 from executor(1)@10.29.23.28:36458
    I1105 07:43:46.389571 17244 slave.cpp:1760] Sending queued task '0' to executor '20151105-045733-3314083338-5050-7152-S0' of framework 20151105-070418-3314083338-5050-12075-0001
    I1105 07:43:46.389883 17244 slave.cpp:1760] Sending queued task '1' to executor '20151105-045733-3314083338-5050-7152-S0' of framework 20151105-070418-3314083338-5050-12075-0001
    I1105 07:43:49.534858 17246 slave.cpp:2717] Handling status update TASK_RUNNING (UUID: b6d77a8f-ef6f-414b-9bdc-0fa1c2a96841) for task 1 of framework 20151105-070418-3314083338-5050-12075-0001 from executor(1)@10.29.23.28:36458
    I1105 07:43:49.535087 17246 slave.cpp:2717] Handling status update TASK_RUNNING (UUID: 6633c11b-c403-45c6-82c5-f495fcc9ae70) for task 0 of framework 20151105-070418-3314083338-5050-12075-0001 from executor(1)@10.29.23.28:36458
    # ...... more log output omitted ......
    I1105 07:43:53.012852 17245 slave.cpp:2946] Sending acknowledgement for status update TASK_FINISHED (UUID: 72c9805d-ea9d-49c1-ac2a-5c3940aa77f5) for task 3 of framework 20151105-070418-3314083338-5050-12075-0001 to executor(1)@10.29.23.28:36458
    I1105 07:43:53.016926 17244 status_update_manager.cpp:394] Received status update acknowledgement (UUID: 5f175f09-2a35-46dd-9ac9-fbe648d9780a) for task 2 of framework 20151105-070418-3314083338-5050-12075-0001
    I1105 07:43:53.017357 17247 status_update_manager.cpp:394] Received status update acknowledgement (UUID: 72c9805d-ea9d-49c1-ac2a-5c3940aa77f5) for task 3 of framework 20151105-070418-3314083338-5050-12075-0001
    I1105 07:43:53.146410 17246 slave.cpp:1980] Asked to shut down framework 20151105-070418-3314083338-5050-12075-0001 by master@10.230.136.197:5050
    I1105 07:43:53.146461 17246 slave.cpp:2005] Shutting down framework 20151105-070418-3314083338-5050-12075-0001
    I1105 07:43:53.146515 17246 slave.cpp:3751] Shutting down executor '20151105-045733-3314083338-5050-7152-S0' of framework 20151105-070418-3314083338-5050-12075-0001
    I1105 07:43:53.637172 17247 slave.cpp:2717] Handling status update TASK_FINISHED (UUID: 9fad65ef-d651-4d2c-91a7-bac8cf4b55a4) for task driver-20151105154326-0002 of framework 20151105-070418-3314083338-5050-12075-0000 from executor(1)@10.29.23.28:54580
    I1105 07:43:53.637706 17246 status_update_manager.cpp:322] Received status update TASK_FINISHED (UUID: 9fad65ef-d651-4d2c-91a7-bac8cf4b55a4) for task driver-20151105154326-0002 of framework 20151105-070418-3314083338-5050-12075-0000
    I1105 07:43:53.637755 17246 status_update_manager.cpp:826] Checkpointing UPDATE for status update TASK_FINISHED (UUID: 9fad65ef-d651-4d2c-91a7-bac8cf4b55a4) for task driver-20151105154326-0002 of framework 20151105-070418-3314083338-5050-12075-0000
    I1105 07:43:53.643517 17242 slave.cpp:3016] Forwarding the update TASK_FINISHED (UUID: 9fad65ef-d651-4d2c-91a7-bac8cf4b55a4) for task driver-20151105154326-0002 of framework 20151105-070418-3314083338-5050-12075-0000 to master@10.230.136.197:5050
    I1105 07:43:53.643635 17242 slave.cpp:2946] Sending acknowledgement for status update TASK_FINISHED (UUID: 9fad65ef-d651-4d2c-91a7-bac8cf4b55a4) for task driver-20151105154326-0002 of framework 20151105-070418-3314083338-5050-12075-0000 to executor(1)@10.29.23.28:54580
    I1105 07:43:53.647647 17246 status_update_manager.cpp:394] Received status update acknowledgement (UUID: 9fad65ef-d651-4d2c-91a7-bac8cf4b55a4) for task driver-20151105154326-0002 of framework 20151105-070418-3314083338-5050-12075-0000
    I1105 07:43:53.647703 17246 status_update_manager.cpp:826] Checkpointing ACK for status update TASK_FINISHED (UUID: 9fad65ef-d651-4d2c-91a7-bac8cf4b55a4) for task driver-20151105154326-0002 of framework 20151105-070418-3314083338-5050-12075-0000
    I1105 07:43:54.689678 17248 containerizer.cpp:1284] Executor for container '910a4983-2732-41dd-b014-66827c044c16' has exited
    I1105 07:43:54.689708 17248 containerizer.cpp:1097] Destroying container '910a4983-2732-41dd-b014-66827c044c16'
    I1105 07:43:54.691368 17248 cgroups.cpp:2433] Freezing cgroup /sys/fs/cgroup/freezer/mesos/910a4983-2732-41dd-b014-66827c044c16
    I1105 07:43:54.693023 17245 cgroups.cpp:1415] Successfully froze cgroup /sys/fs/cgroup/freezer/mesos/910a4983-2732-41dd-b014-66827c044c16 after 1.624064ms
    I1105 07:43:54.694628 17249 cgroups.cpp:2450] Thawing cgroup /sys/fs/cgroup/freezer/mesos/910a4983-2732-41dd-b014-66827c044c16
    I1105 07:43:54.695976 17249 cgroups.cpp:1444] Successfullly thawed cgroup /sys/fs/cgroup/freezer/mesos/910a4983-2732-41dd-b014-66827c044c16 after 1312us
    I1105 07:43:54.697335 17246 slave.cpp:3440] Executor 'driver-20151105154326-0002' of framework 20151105-070418-3314083338-5050-12075-0000 exited with status 0
    I1105 07:43:54.697371 17246 slave.cpp:3544] Cleaning up executor 'driver-20151105154326-0002' of framework 20151105-070418-3314083338-5050-12075-0000
    I1105 07:43:54.697621 17245 gc.cpp:56] Scheduling '/tmp/mesos/slaves/20151105-045733-3314083338-5050-7152-S0/frameworks/20151105-070418-3314083338-5050-12075-0000/executors/driver-20151105154326-0002/runs/910a4983-2732-41dd-b014-66827c044c16' for gc 6.99999192621333days in the future
    I1105 07:43:54.697669 17245 gc.cpp:56] Scheduling '/tmp/mesos/slaves/20151105-045733-3314083338-5050-7152-S0/frameworks/20151105-070418-3314083338-5050-12075-0000/executors/driver-20151105154326-0002' for gc 6.99999192552296days in the future
    I1105 07:43:54.697700 17245 gc.cpp:56] Scheduling '/tmp/mesos/meta/slaves/20151105-045733-3314083338-5050-7152-S0/frameworks/20151105-070418-3314083338-5050-12075-0000/executors/driver-20151105154326-0002/runs/910a4983-2732-41dd-b014-66827c044c16' for gc 6.99999192509333days in the future
    I1105 07:43:54.697713 17246 slave.cpp:3633] Cleaning up framework 20151105-070418-3314083338-5050-12075-0000
    I1105 07:43:54.697726 17245 gc.cpp:56] Scheduling '/tmp/mesos/meta/slaves/20151105-045733-3314083338-5050-7152-S0/frameworks/20151105-070418-3314083338-5050-12075-0000/executors/driver-20151105154326-0002' for gc 6.99999192474667days in the future
    I1105 07:43:54.697756 17245 status_update_manager.cpp:284] Closing status update streams for framework 20151105-070418-3314083338-5050-12075-0000
    I1105 07:43:54.697813 17244 gc.cpp:56] Scheduling '/tmp/mesos/slaves/20151105-045733-3314083338-5050-7152-S0/frameworks/20151105-070418-3314083338-5050-12075-0000' for gc 6.99999192391407days in the future
    I1105 07:43:54.697856 17244 gc.cpp:56] Scheduling '/tmp/mesos/meta/slaves/20151105-045733-3314083338-5050-7152-S0/frameworks/20151105-070418-3314083338-5050-12075-0000' for gc 6.99999192340148days in the future
    I1105 07:43:58.147456 17248 slave.cpp:3820] Killing executor '20151105-045733-3314083338-5050-7152-S0' of framework 20151105-070418-3314083338-5050-12075-0001
    I1105 07:43:58.147546 17246 containerizer.cpp:1097] Destroying container '8b233e11-54a6-41b6-a8ad-f82660d640b5'
    I1105 07:43:58.149194 17246 cgroups.cpp:2433] Freezing cgroup /sys/fs/cgroup/freezer/mesos/8b233e11-54a6-41b6-a8ad-f82660d640b5
    I1105 07:43:58.200407 17245 containerizer.cpp:1284] Executor for container '8b233e11-54a6-41b6-a8ad-f82660d640b5' has exited

    And as hoped, the word-count results show up under /tmp/wordcounts.txt/, listed and spot-checked below:

    ll /tmp/wordcounts.txt/
    total 28
    drwxr-xr-x  2 qingpingzhang qingpingzhang 4096 Nov  5 07:43 ./
    drwxrwxrwt 14 root          root          4096 Nov  5 07:43 ../
    -rw-r--r--  1 qingpingzhang qingpingzhang 1970 Nov  5 07:43 part-00000
    -rw-r--r--  1 qingpingzhang qingpingzhang   24 Nov  5 07:43 .part-00000.crc
    -rw-r--r--  1 qingpingzhang qingpingzhang 1682 Nov  5 07:43 part-00001
    -rw-r--r--  1 qingpingzhang qingpingzhang   24 Nov  5 07:43 .part-00001.crc
    -rw-r--r--  1 qingpingzhang qingpingzhang    0 Nov  5 07:43 _SUCCESS
    -rw-r--r--  1 qingpingzhang qingpingzhang    8 Nov  5 07:43 ._SUCCESS.crc
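
    A quick spot-check of the output; each line is a (word,count) tuple written by saveAsTextFile:

    head -3 /tmp/wordcounts.txt/part-00000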

    With that, running Spark on top of Mesos is working end to end.

    I have not included screenshots of the Mesos and Spark web UIs here.

    To summarize:

    1) Set up the Mesos cluster.

    2) Install Spark and start its mesos-dispatcher (pointed at the Mesos master).

    3) Submit jobs with spark-submit (the jar has to be placed somewhere downloadable, e.g. hdfs or an http server).

    -----------------

    The main benefit of Mesos is that it can launch and allocate tasks dynamically, making the best possible use of machine resources.

    Since our log analysis runs on just two machines, we will stick with Spark standalone mode for now.

  • Original post: https://www.cnblogs.com/zhangqingping/p/4939264.html