zoukankan      html  css  js  c++  java
  • <Spark><Spark Streaming><作业分析><JobHistory>

    Intro

    • 这篇是对一个Spark (Streaming)作业的log进行分析。用来加深对Spark application运行过程,优化空间的各种理解。

    Here to Start

    • 从我这个初学者写得一个Spark Streaming程序开始...
    • package com.wttttt.spark
      
      /**
        * Created with IntelliJ IDEA.
        * Description: 
        * Author: wttttt
        * Github: https://github.com/wttttt-wang/hadoop_inaction
        * Date: 2017-05-19
        * Time: 09:56
        */
      import java.util.regex.Pattern
      
      import org.apache.spark.SparkConf
      import org.apache.spark.streaming.{Milliseconds, StreamingContext}
      import org.slf4j.LoggerFactory
      
      import scala.collection.mutable
      
      object LocalTest {
        val logger = LoggerFactory.getLogger("LocalTest")
        def main(args: Array[String]) {
      
          val batchInterval = Milliseconds(10000)
          val slideInterval = Milliseconds(5000)
      
          val conf = new SparkConf()
            .setMaster("local[2]")
            .setAppName("LocalTest")
          // WARN StreamingContext: spark.master should be set as local[n], n > 1 in local mode if you have receivers to get data,
          // otherwise Spark jobs will not get resources to process the received data.
          val sc = new StreamingContext(conf, Milliseconds(5000))
          sc.checkpoint("flumeCheckpoint/")
      
          val stream = sc.socketTextStream("localhost", 9998)
      
          val counts = stream.mapPartitions{ events =>
            val pattern = Pattern.compile("\?Input=[^\s]*\s")
            val map = new mutable.HashMap[String, Int]()
            logger.info("Handling events, events is empty: " + events.isEmpty)
            while (events.hasNext){   // par is an Iterator!!!
            val line = events.next()
              val m = pattern.matcher(line)
              if (m.find()) {
                val words = line.substring(m.start(), m.end()).split("=")(1).toLowerCase()
                logger.info(s"Processing words $words")
                map.put(words, map.getOrElse(words, 0) + 1)
              }
            }
            map.iterator
          }
      
          val window = counts.reduceByKeyAndWindow(_+_, _-_, batchInterval, slideInterval)
          // window.print()
      
          // transform和它的变体trnasformWith运行在DStream上任意的RDD-to-RDD函数;
          // 可以用来使用那些不包含在DStrema API中RDD操作
          val sorted = window.transform(rdd =>{
            val sortRdd = rdd.map(t => (t._2, t._1)).sortByKey(false).map(t => (t._2, t._1))
            val more = sortRdd.take(2)
            more.foreach(println)
            sortRdd
          })
      
          sorted.print()
      
          sc.start()
          sc.awaitTermination()
        }
      }
      
    • 看运行log (摘取的一部分)
    • 17/05/20 10:20:13 INFO Utils: Successfully started service 'sparkDriver' on port 52300.
      
      
      
      17/05/20 10:20:17 INFO ReducedWindowedDStream: Checkpoint interval automatically set to 10000 ms
      
      17/05/20 10:20:17 INFO SocketInputDStream: Slide time = 5000 ms
      
      
      
      17/05/20 10:20:17 INFO ReceiverTracker: Receiver 0 started
      
      17/05/20 10:20:17 INFO DAGScheduler: Got job 0 (start at LocalTest.scala:66) with 1 output partitions
      
      
      
      17/05/20 10:20:17 INFO JobScheduler: Started JobScheduler
      
      17/05/20 10:20:17 INFO StreamingContext: StreamingContext started
      
      
      
      17/05/20 10:20:20 INFO SparkContext: Starting job: sortByKey at LocalTest.scala:58
      
      17/05/20 10:20:20 INFO DAGScheduler: Registering RDD 2 (mapPartitions at LocalTest.scala:36)
      
      17/05/20 10:20:20 INFO DAGScheduler: Registering RDD 4 (reduceByKeyAndWindow at LocalTest.scala:52)
      
      17/05/20 10:20:20 INFO DAGScheduler: Got job 1 (sortByKey at LocalTest.scala:58) with 2 output partitions
      
      17/05/20 10:20:20 INFO DAGScheduler: Final stage: ResultStage 3 (sortByKey at LocalTest.scala:58)
      
      17/05/20 10:20:20 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 1, ShuffleMapStage 2)
      
      17/05/20 10:20:20 INFO DAGScheduler: Missing parents: List(ShuffleMapStage 2)
      
      17/05/20 10:20:20 INFO DAGScheduler: Submitting ShuffleMapStage 2 (ParallelCollectionRDD[4] at reduceByKeyAndWindow at LocalTest.scala:52), which has no missing parents
      
      
      
      17/05/20 10:20:20 INFO DAGScheduler: Submitting 2 missing tasks from ShuffleMapStage 2 (ParallelCollectionRDD[4] at reduceByKeyAndWindow at LocalTest.scala:52)
      
      
      
      17/05/20 10:20:20 INFO DAGScheduler: Job 1 finished: sortByKey at LocalTest.scala:58, took 0.314005 s
      
      17/05/20 10:20:20 INFO SparkContext: Starting job: take at LocalTest.scala:59
      
      
      
      17/05/20 10:20:20 INFO DAGScheduler: Registering RDD 7 (map at LocalTest.scala:58)
      
      17/05/20 10:20:20 INFO DAGScheduler: Got job 2 (take at LocalTest.scala:59) with 1 output partitions
      
      17/05/20 10:20:20 INFO DAGScheduler: Final stage: ResultStage 7 (take at LocalTest.scala:59)
      
      17/05/20 10:20:20 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 6)
      
      17/05/20 10:20:20 INFO DAGScheduler: Missing parents: List(ShuffleMapStage 6)
      
      17/05/20 10:20:20 INFO DAGScheduler: Submitting ShuffleMapStage 6 (MapPartitionsRDD[7] at map at LocalTest.scala:58), which has no missing parents
      
      
      
      17/05/20 10:20:20 INFO DAGScheduler: Submitting 2 missing tasks from ShuffleMapStage 6 (MapPartitionsRDD[7] at map at LocalTest.scala:58)
      
      17/05/20 10:20:20 INFO TaskSchedulerImpl: Adding task set 6.0 with 2 tasks
      
      17/05/20 10:20:20 INFO TaskSetManager: Starting task 0.0 in stage 6.0 (TID 5, localhost, executor driver, partition 0, PROCESS_LOCAL, 6109 bytes)
      
      17/05/20 10:20:20 INFO Executor: Running task 0.0 in stage 6.0 (TID 5)
      
      17/05/20 10:20:20 INFO BlockManager: Found block rdd60 locally
      
      17/05/20 10:20:20 INFO Executor: Finished task 0.0 in stage 6.0 (TID 5). 1062 bytes result sent to driver
      
      17/05/20 10:20:20 INFO TaskSetManager: Starting task 1.0 in stage 6.0 (TID 6, localhost, executor driver, partition 1, PROCESS_LOCAL, 6109 bytes)
      
      17/05/20 10:20:20 INFO Executor: Running task 1.0 in stage 6.0 (TID 6)
      
      17/05/20 10:20:20 INFO TaskSetManager: Finished task 0.0 in stage 6.0 (TID 5) in 14 ms on localhost (executor driver) (1/2)
      
      17/05/20 10:20:20 INFO BlockManager: Found block rdd61 locally
      
      17/05/20 10:20:20 INFO Executor: Finished task 1.0 in stage 6.0 (TID 6). 1062 bytes result sent to driver
      
      17/05/20 10:20:20 INFO TaskSetManager: Finished task 1.0 in stage 6.0 (TID 6) in 9 ms on localhost (executor driver) (2/2)
      
      17/05/20 10:20:20 INFO TaskSchedulerImpl: Removed TaskSet 6.0, whose tasks have all completed, from pool 
      
      17/05/20 10:20:20 INFO DAGScheduler: ShuffleMapStage 6 (map at LocalTest.scala:58) finished in 0.024 s
      
      17/05/20 10:20:20 INFO DAGScheduler: looking for newly runnable stages
      
      17/05/20 10:20:20 INFO DAGScheduler: running: Set(ResultStage 0)
      
      17/05/20 10:20:20 INFO DAGScheduler: waiting: Set(ResultStage 7)
      
      17/05/20 10:20:20 INFO DAGScheduler: failed: Set()
      
      17/05/20 10:20:20 INFO DAGScheduler: Submitting ResultStage 7 (MapPartitionsRDD[11] at map at LocalTest.scala:58), which has no missing parents
      
      17/05/20 10:20:20 INFO MemoryStore: Block broadcast_4 stored as values in memory (estimated size 4.0 KB, free 912.2 MB)
      
      17/05/20 10:20:20 INFO MemoryStore: Block broadcast4piece0 stored as bytes in memory (estimated size 2.4 KB, free 912.2 MB)
      
      17/05/20 10:20:20 INFO BlockManagerInfo: Added broadcast4piece0 in memory on 192.168.6.90:52302 (size: 2.4 KB, free: 912.3 MB)
      
      17/05/20 10:20:20 INFO SparkContext: Created broadcast 4 from broadcast at DAGScheduler.scala:996
      
      17/05/20 10:20:20 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 7 (MapPartitionsRDD[11] at map at LocalTest.scala:58)
      
      17/05/20 10:20:20 INFO TaskSchedulerImpl: Adding task set 7.0 with 1 tasks
      
      17/05/20 10:20:20 INFO TaskSetManager: Starting task 0.0 in stage 7.0 (TID 7, localhost, executor driver, partition 0, PROCESS_LOCAL, 5903 bytes)
      
      17/05/20 10:20:20 INFO Executor: Running task 0.0 in stage 7.0 (TID 7)
      
      17/05/20 10:20:20 INFO ShuffleBlockFetcherIterator: Getting 0 non-empty blocks out of 2 blocks
      
      17/05/20 10:20:20 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
      
      17/05/20 10:20:20 INFO Executor: Finished task 0.0 in stage 7.0 (TID 7). 1718 bytes result sent to driver
      
      17/05/20 10:20:20 INFO TaskSetManager: Finished task 0.0 in stage 7.0 (TID 7) in 146 ms on localhost (executor driver) (1/1)
      
      17/05/20 10:20:20 INFO TaskSchedulerImpl: Removed TaskSet 7.0, whose tasks have all completed, from pool 
      
      
      
      17/05/20 10:20:20 INFO DAGScheduler: Job 2 finished: take at LocalTest.scala:59, took 0.198460 s
      
      
      
      17/05/20 10:20:20 INFO SparkContext: Starting job: print at LocalTest.scala:64
      
      
      
      17/05/20 10:20:20 INFO DAGScheduler: Got job 3 (print at LocalTest.scala:64) with 1 output partitions
      
      17/05/20 10:20:20 INFO DAGScheduler: Final stage: ResultStage 11 (print at LocalTest.scala:64)
      
      17/05/20 10:20:20 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 10)
      
      17/05/20 10:20:20 INFO CheckpointWriter: Submitted checkpoint of time 1495246820000 ms to writer queue
      
      17/05/20 10:20:20 INFO DAGScheduler: Missing parents: List()
      
      17/05/20 10:20:20 INFO DAGScheduler: Submitting ResultStage 11 (MapPartitionsRDD[11] at map at LocalTest.scala:58), which has no missing parents
      
      17/05/20 10:20:20 INFO CheckpointWriter: Saving checkpoint for time 1495246820000 ms to file 'file:/Users/wttttt/programming/flume/RealTimeLog/flumeCheckpoint/checkpoint-1495246820000'
      
      17/05/20 10:20:20 INFO MemoryStore: Block broadcast_5 stored as values in memory (estimated size 4.0 KB, free 912.2 MB)
      
      17/05/20 10:20:20 INFO MemoryStore: Block broadcast5piece0 stored as bytes in memory (estimated size 2.4 KB, free 912.2 MB)
      
      17/05/20 10:20:20 INFO BlockManagerInfo: Added broadcast5piece0 in memory on 192.168.6.90:52302 (size: 2.4 KB, free: 912.3 MB)
      
      17/05/20 10:20:20 INFO SparkContext: Created broadcast 5 from broadcast at DAGScheduler.scala:996
      
      17/05/20 10:20:20 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 11 (MapPartitionsRDD[11] at map at LocalTest.scala:58)
      
      17/05/20 10:20:20 INFO TaskSchedulerImpl: Adding task set 11.0 with 1 tasks
      
      17/05/20 10:20:20 INFO TaskSetManager: Starting task 0.0 in stage 11.0 (TID 8, localhost, executor driver, partition 0, PROCESS_LOCAL, 6672 bytes)
      
      17/05/20 10:20:20 INFO Executor: Running task 0.0 in stage 11.0 (TID 8)
      
      17/05/20 10:20:20 INFO ShuffleBlockFetcherIterator: Getting 0 non-empty blocks out of 2 blocks
      
      17/05/20 10:20:20 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
      
      17/05/20 10:20:20 INFO Executor: Finished task 0.0 in stage 11.0 (TID 8). 1718 bytes result sent to driver
      
      17/05/20 10:20:20 INFO TaskSetManager: Finished task 0.0 in stage 11.0 (TID 8) in 39 ms on localhost (executor driver) (1/1)
      
      17/05/20 10:20:20 INFO TaskSchedulerImpl: Removed TaskSet 11.0, whose tasks have all completed, from pool 
      
      17/05/20 10:20:20 INFO DAGScheduler: ResultStage 11 (print at LocalTest.scala:64) finished in 0.041 s
      
      17/05/20 10:20:20 INFO DAGScheduler: Job 3 finished: print at LocalTest.scala:64, took 0.053225 s
      
      
      
      Time: 1495246820000 ms
      
      17/05/20 10:20:20 INFO JobScheduler: Finished job streaming job 1495246820000 ms.0 from job set of time 1495246820000 ms
      
      17/05/20 10:20:20 INFO JobScheduler: Total delay: 0.954 s for time 1495246820000 ms (execution: 0.064 s)
      
      17/05/20 10:20:21 INFO CheckpointWriter: Deleting file:/Users/wttttt/programming/flume/RealTimeLog/flumeCheckpoint/checkpoint-1495246110000.bk
      
      17/05/20 10:20:21 INFO JobGenerator: Checkpointing graph for time 1495246820000 ms
      
      17/05/20 10:20:21 INFO DStreamGraph: Updating checkpoint data for time 1495246820000 ms
      
      17/05/20 10:20:21 INFO CheckpointWriter: Checkpoint for time 1495246820000 ms saved to file 'file:/Users/wttttt/programming/flume/RealTimeLog/flumeCheckpoint/checkpoint-1495246820000', took 3942 bytes and 125 ms
      
      17/05/20 10:20:21 INFO DStreamGraph: Updated checkpoint data for time 1495246820000 ms
      
      17/05/20 10:20:21 INFO CheckpointWriter: Submitted checkpoint of time 1495246820000 ms to writer queue
      
      17/05/20 10:20:21 INFO CheckpointWriter: Saving checkpoint for time 1495246820000 ms to file 'file:/Users/wttttt/programming/flume/RealTimeLog/flumeCheckpoint/checkpoint-1495246820000'
      
      17/05/20 10:20:21 INFO BlockManagerInfo: Removed broadcast4piece0 on 192.168.6.90:52302 in memory (size: 2.4 KB, free: 912.3 MB)
      
      17/05/20 10:20:21 INFO BlockManagerInfo: Removed broadcast1piece0 on 192.168.6.90:52302 in memory (size: 1266.0 B, free: 912.3 MB)
      
      17/05/20 10:20:21 INFO BlockManagerInfo: Removed broadcast2piece0 on 192.168.6.90:52302 in memory (size: 2.7 KB, free: 912.3 MB)
      
      17/05/20 10:20:21 INFO BlockManagerInfo: Removed broadcast3piece0 on 192.168.6.90:52302 in memory (size: 2.9 KB, free: 912.3 MB)
      
      17/05/20 10:20:21 INFO BlockManagerInfo: Removed broadcast5piece0 on 192.168.6.90:52302 in memory (size: 2.4 KB, free: 912.3 MB)
      
      17/05/20 10:20:21 INFO CheckpointWriter: Deleting file:/Users/wttttt/programming/flume/RealTimeLog/flumeCheckpoint/checkpoint-1495246110000
      
      17/05/20 10:20:21 INFO CheckpointWriter: Checkpoint for time 1495246820000 ms saved to file 'file:/Users/wttttt/programming/flume/RealTimeLog/flumeCheckpoint/checkpoint-1495246820000', took 3938 bytes and 48 ms
      
      17/05/20 10:20:21 INFO DStreamGraph: Clearing checkpoint data for time 1495246820000 ms
      
      17/05/20 10:20:21 INFO DStreamGraph: Cleared checkpoint data for time 1495246820000 ms
      
      17/05/20 10:20:21 INFO ReceivedBlockTracker: Deleting batches: 
      
      17/05/20 10:20:21 INFO FileBasedWriteAheadLog_ReceivedBlockTracker: Attempting to clear 2 old log files in file:/Users/wttttt/programming/flume/RealTimeLog/flumeCheckpoint/receivedBlockMetadata older than 1495246790000: file:/Users/wttttt/programming/flume/RealTimeLog/flumeCheckpoint/receivedBlockMetadata/log-1495246055009-1495246115009
      
      file:/Users/wttttt/programming/flume/RealTimeLog/flumeCheckpoint/receivedBlockMetadata/log-1495246115167-1495246175167
      
      17/05/20 10:20:21 INFO InputInfoTracker: remove old batch metadata: 
      
      17/05/20 10:20:21 INFO FileBasedWriteAheadLog_ReceivedBlockTracker: Cleared log files in file:/Users/wttttt/programming/flume/RealTimeLog/flumeCheckpoint/receivedBlockMetadata older than 1495246790000
      
      17/05/20 10:20:21 INFO FileBasedWriteAheadLog_ReceivedBlockTracker: Cleared log files in file:/Users/wttttt/programming/flume/RealTimeLog/flumeCheckpoint/receivedBlockMetadata older than 1495246790000
      
      17/05/20 10:20:21 INFO MemoryStore: Block input-0-1495246821600 stored as bytes in memory (estimated size 284.0 B, free 912.2 MB)
      
      17/05/20 10:20:21 INFO BlockManagerInfo: Added input-0-1495246821600 in memory on 192.168.6.90:52302 (size: 284.0 B, free: 912.3 MB)
      
      17/05/20 10:20:21 WARN RandomBlockReplicationPolicy: Expecting 1 replicas with only 0 peer/s.
      
      17/05/20 10:20:21 WARN BlockManager: Block input-0-1495246821600 replicated to only 0 peer(s) instead of 1 peers
      
      17/05/20 10:20:21 INFO BlockGenerator: Pushed block input-0-1495246821600
      
      17/05/20 10:20:24 INFO MemoryStore: Block input-0-1495246824600 stored as bytes in memory (estimated size 284.0 B, free 912.2 MB)
      
      17/05/20 10:20:24 INFO BlockManagerInfo: Added input-0-1495246824600 in memory on 192.168.6.90:52302 (size: 284.0 B, free: 912.3 MB)
      
      17/05/20 10:20:24 WARN RandomBlockReplicationPolicy: Expecting 1 replicas with only 0 peer/s.
      
      17/05/20 10:20:24 WARN BlockManager: Block input-0-1495246824600 replicated to only 0 peer(s) instead of 1 peers
      
      17/05/20 10:20:24 INFO BlockGenerator: Pushed block input-0-1495246824600
      
      17/05/20 10:20:25 INFO ShuffledDStream: Slicing from 1495246815000 ms to 1495246815000 ms (aligned to 1495246815000 ms and 1495246815000 ms)
      
      17/05/20 10:20:25 INFO ShuffledDStream: Time 1495246815000 ms is invalid as zeroTime is 1495246815000 ms , slideDuration is 5000 ms and difference is 0 ms
      
      17/05/20 10:20:25 INFO ShuffledDStream: Slicing from 1495246825000 ms to 1495246825000 ms (aligned to 1495246825000 ms and 1495246825000 ms)
      
      17/05/20 10:20:25 INFO ReducedWindowedDStream: Marking RDD 16 for time 1495246825000 ms for checkpointing
      
      17/05/20 10:20:25 INFO SparkContext: Starting job: sortByKey at LocalTest.scala:58
      
      17/05/20 10:20:25 INFO DAGScheduler: Registering RDD 13 (mapPartitions at LocalTest.scala:36)
      
      17/05/20 10:20:25 INFO DAGScheduler: Got job 4 (sortByKey at LocalTest.scala:58) with 2 output partitions
      
      17/05/20 10:20:25 INFO DAGScheduler: Final stage: ResultStage 15 (sortByKey at LocalTest.scala:58)
      
      17/05/20 10:20:25 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 12, ShuffleMapStage 13, ShuffleMapStage 14)
      
      17/05/20 10:20:25 INFO DAGScheduler: Missing parents: List(ShuffleMapStage 13)
      
      17/05/20 10:20:25 INFO DAGScheduler: Submitting ShuffleMapStage 13 (MapPartitionsRDD[13] at mapPartitions at LocalTest.scala:36), which has no missing parents
      
      17/05/20 10:20:25 INFO MemoryStore: Block broadcast_6 stored as values in memory (estimated size 2.6 KB, free 912.2 MB)
      
      17/05/20 10:20:25 INFO MemoryStore: Block broadcast6piece0 stored as bytes in memory (estimated size 1667.0 B, free 912.2 MB)
      
      17/05/20 10:20:25 INFO BlockManagerInfo: Added broadcast6piece0 in memory on 192.168.6.90:52302 (size: 1667.0 B, free: 912.3 MB)
      
      17/05/20 10:20:25 INFO SparkContext: Created broadcast 6 from broadcast at DAGScheduler.scala:996
      
      17/05/20 10:20:25 INFO DAGScheduler: Submitting 2 missing tasks from ShuffleMapStage 13 (MapPartitionsRDD[13] at mapPartitions at LocalTest.scala:36)
      
      17/05/20 10:20:25 INFO TaskSchedulerImpl: Adding task set 13.0 with 2 tasks
      
      17/05/20 10:20:25 INFO TaskSetManager: Starting task 0.0 in stage 13.0 (TID 9, localhost, executor driver, partition 0, ANY, 6020 bytes)
      
      17/05/20 10:20:25 INFO Executor: Running task 0.0 in stage 13.0 (TID 9)
      
      17/05/20 10:20:25 INFO BlockManager: Found block input-0-1495246821600 locally
      
      17/05/20 10:20:25 INFO LocalTest: Handling events, events is empty: false
      
      17/05/20 10:20:25 INFO LocalTest: Processing words test1 
      
      17/05/20 10:20:25 INFO LocalTest: Processing words test2 
      
      17/05/20 10:20:25 INFO LocalTest: Processing words test2 
      
      17/05/20 10:20:25 INFO LocalTest: Processing words test3 
      
      17/05/20 10:20:25 INFO LocalTest: Processing words test3 
      
      17/05/20 10:20:25 INFO LocalTest: Processing words test3 
      
      17/05/20 10:20:25 INFO LocalTest: Processing words test3 
      
      17/05/20 10:20:25 INFO Executor: Finished task 0.0 in stage 13.0 (TID 9). 1498 bytes result sent to driver
      
      17/05/20 10:20:25 INFO TaskSetManager: Starting task 1.0 in stage 13.0 (TID 10, localhost, executor driver, partition 1, ANY, 6020 bytes)
      
      17/05/20 10:20:25 INFO TaskSetManager: Finished task 0.0 in stage 13.0 (TID 9) in 28 ms on localhost (executor driver) (1/2)
      
      17/05/20 10:20:25 INFO Executor: Running task 1.0 in stage 13.0 (TID 10)
      
      17/05/20 10:20:25 INFO BlockManager: Found block input-0-1495246824600 locally
      
      17/05/20 10:20:25 INFO LocalTest: Handling events, events is empty: false
      
      17/05/20 10:20:25 INFO LocalTest: Processing words test1 
      
      17/05/20 10:20:25 INFO LocalTest: Processing words test2 
      
      17/05/20 10:20:25 INFO LocalTest: Processing words test2 
      
      17/05/20 10:20:25 INFO LocalTest: Processing words test3 
      
      17/05/20 10:20:25 INFO LocalTest: Processing words test3 
      
      17/05/20 10:20:25 INFO LocalTest: Processing words test3 
      
      17/05/20 10:20:25 INFO LocalTest: Processing words test3 
      
      17/05/20 10:20:25 INFO Executor: Finished task 1.0 in stage 13.0 (TID 10). 1498 bytes result sent to driver
      
      17/05/20 10:20:25 INFO TaskSetManager: Finished task 1.0 in stage 13.0 (TID 10) in 22 ms on localhost (executor driver) (2/2)
      
      17/05/20 10:20:25 INFO TaskSchedulerImpl: Removed TaskSet 13.0, whose tasks have all completed, from pool 
      
      17/05/20 10:20:25 INFO DAGScheduler: ShuffleMapStage 13 (mapPartitions at LocalTest.scala:36) finished in 0.048 s
      
      17/05/20 10:20:25 INFO DAGScheduler: looking for newly runnable stages
      
      17/05/20 10:20:25 INFO DAGScheduler: running: Set(ResultStage 0)
      
      17/05/20 10:20:25 INFO DAGScheduler: waiting: Set(ResultStage 15)
      
      17/05/20 10:20:25 INFO DAGScheduler: failed: Set()
      
      17/05/20 10:20:25 INFO DAGScheduler: Submitting ResultStage 15 (MapPartitionsRDD[19] at sortByKey at LocalTest.scala:58), which has no missing parents
      
      17/05/20 10:20:25 INFO MemoryStore: Block broadcast_7 stored as values in memory (estimated size 6.4 KB, free 912.2 MB)
      
      17/05/20 10:20:25 INFO MemoryStore: Block broadcast7piece0 stored as bytes in memory (estimated size 3.6 KB, free 912.2 MB)
      
      17/05/20 10:20:25 INFO BlockManagerInfo: Added broadcast7piece0 in memory on 192.168.6.90:52302 (size: 3.6 KB, free: 912.3 MB)
      
      17/05/20 10:20:25 INFO SparkContext: Created broadcast 7 from broadcast at DAGScheduler.scala:996
      
      17/05/20 10:20:25 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 15 (MapPartitionsRDD[19] at sortByKey at LocalTest.scala:58)
      
      17/05/20 10:20:25 INFO TaskSchedulerImpl: Adding task set 15.0 with 2 tasks
      
      17/05/20 10:20:25 INFO TaskSetManager: Starting task 0.0 in stage 15.0 (TID 11, localhost, executor driver, partition 0, PROCESS_LOCAL, 6185 bytes)
      
      17/05/20 10:20:25 INFO Executor: Running task 0.0 in stage 15.0 (TID 11)
      
      17/05/20 10:20:25 INFO BlockManager: Found block rdd60 locally
      
      17/05/20 10:20:25 INFO ShuffleBlockFetcherIterator: Getting 2 non-empty blocks out of 2 blocks
      
      17/05/20 10:20:25 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
      
      17/05/20 10:20:25 INFO MemoryStore: Block rdd140 stored as bytes in memory (estimated size 155.0 B, free 912.2 MB)
      
      17/05/20 10:20:25 INFO BlockManagerInfo: Added rdd140 in memory on 192.168.6.90:52302 (size: 155.0 B, free: 912.3 MB)
      
      17/05/20 10:20:25 INFO MemoryStore: Block rdd160 stored as bytes in memory (estimated size 155.0 B, free 912.2 MB)
      
      17/05/20 10:20:25 INFO BlockManagerInfo: Added rdd160 in memory on 192.168.6.90:52302 (size: 155.0 B, free: 912.3 MB)
      
      17/05/20 10:20:25 INFO Executor: Finished task 0.0 in stage 15.0 (TID 11). 2799 bytes result sent to driver
      
      17/05/20 10:20:25 INFO TaskSetManager: Starting task 1.0 in stage 15.0 (TID 12, localhost, executor driver, partition 1, PROCESS_LOCAL, 6185 bytes)
      
      17/05/20 10:20:25 INFO TaskSetManager: Finished task 0.0 in stage 15.0 (TID 11) in 31 ms on localhost (executor driver) (1/2)
      
      17/05/20 10:20:25 INFO Executor: Running task 1.0 in stage 15.0 (TID 12)
      
      17/05/20 10:20:25 INFO BlockManager: Found block rdd61 locally
      
      17/05/20 10:20:25 INFO ShuffleBlockFetcherIterator: Getting 2 non-empty blocks out of 2 blocks
      
      17/05/20 10:20:25 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
      
      17/05/20 10:20:25 INFO MemoryStore: Block rdd141 stored as bytes in memory (estimated size 180.0 B, free 912.2 MB)
      
      17/05/20 10:20:25 INFO BlockManagerInfo: Added rdd141 in memory on 192.168.6.90:52302 (size: 180.0 B, free: 912.3 MB)
      
      17/05/20 10:20:25 INFO MemoryStore: Block rdd161 stored as bytes in memory (estimated size 180.0 B, free 912.2 MB)
      
      17/05/20 10:20:25 INFO BlockManagerInfo: Added rdd161 in memory on 192.168.6.90:52302 (size: 180.0 B, free: 912.3 MB)
      
      17/05/20 10:20:25 INFO Executor: Finished task 1.0 in stage 15.0 (TID 12). 2803 bytes result sent to driver
      
      17/05/20 10:20:25 INFO TaskSetManager: Finished task 1.0 in stage 15.0 (TID 12) in 39 ms on localhost (executor driver) (2/2)
      
      17/05/20 10:20:25 INFO TaskSchedulerImpl: Removed TaskSet 15.0, whose tasks have all completed, from pool 
      
      17/05/20 10:20:25 INFO DAGScheduler: ResultStage 15 (sortByKey at LocalTest.scala:58) finished in 0.069 s
      
      17/05/20 10:20:25 INFO DAGScheduler: Job 4 finished: sortByKey at LocalTest.scala:58, took 0.156155 s
      
      17/05/20 10:20:25 INFO MemoryStore: Block broadcast_8 stored as values in memory (estimated size 127.1 KB, free 912.1 MB)
      
      17/05/20 10:20:25 INFO MemoryStore: Block broadcast8piece0 stored as bytes in memory (estimated size 14.3 KB, free 912.1 MB)
      
      17/05/20 10:20:25 INFO BlockManagerInfo: Added broadcast8piece0 in memory on 192.168.6.90:52302 (size: 14.3 KB, free: 912.3 MB)
      
      17/05/20 10:20:25 INFO SparkContext: Created broadcast 8 from sortByKey at LocalTest.scala:58
      
      17/05/20 10:20:25 INFO SparkContext: Starting job: sortByKey at LocalTest.scala:58
      
      17/05/20 10:20:25 INFO MapOutputTrackerMaster: Size of output statuses for shuffle 1 is 83 bytes
      
      17/05/20 10:20:25 INFO MapOutputTrackerMaster: Size of output statuses for shuffle 3 is 157 bytes
      
      17/05/20 10:20:25 INFO MapOutputTrackerMaster: Size of output statuses for shuffle 0 is 156 bytes
      
      17/05/20 10:20:25 INFO DAGScheduler: Got job 5 (sortByKey at LocalTest.scala:58) with 2 output partitions
      
      17/05/20 10:20:25 INFO DAGScheduler: Final stage: ResultStage 19 (sortByKey at LocalTest.scala:58)
      
      17/05/20 10:20:25 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 16, ShuffleMapStage 17, ShuffleMapStage 18)
      
      17/05/20 10:20:25 INFO DAGScheduler: Missing parents: List()
      
      17/05/20 10:20:25 INFO DAGScheduler: Submitting ResultStage 19 (MapPartitionsRDD[16] at reduceByKeyAndWindow at LocalTest.scala:52), which has no missing parents
      
      17/05/20 10:20:25 INFO MemoryStore: Block broadcast_9 stored as values in memory (estimated size 6.5 KB, free 912.1 MB)
      
      17/05/20 10:20:25 INFO MemoryStore: Block broadcast9piece0 stored as bytes in memory (estimated size 3.6 KB, free 912.1 MB)
      
      17/05/20 10:20:25 INFO BlockManagerInfo: Added broadcast9piece0 in memory on 192.168.6.90:52302 (size: 3.6 KB, free: 912.3 MB)
      
      17/05/20 10:20:25 INFO SparkContext: Created broadcast 9 from broadcast at DAGScheduler.scala:996
      
      17/05/20 10:20:25 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 19 (MapPartitionsRDD[16] at reduceByKeyAndWindow at LocalTest.scala:52)
      
      17/05/20 10:20:25 INFO TaskSchedulerImpl: Adding task set 19.0 with 2 tasks
      
      17/05/20 10:20:25 INFO TaskSetManager: Starting task 0.0 in stage 19.0 (TID 13, localhost, executor driver, partition 0, PROCESS_LOCAL, 6122 bytes)
      
      17/05/20 10:20:25 INFO Executor: Running task 0.0 in stage 19.0 (TID 13)
      
      17/05/20 10:20:25 INFO BlockManager: Found block rdd160 locally
      
      17/05/20 10:20:25 INFO Executor: Finished task 0.0 in stage 19.0 (TID 13). 1085 bytes result sent to driver
      
      17/05/20 10:20:25 INFO TaskSetManager: Starting task 1.0 in stage 19.0 (TID 14, localhost, executor driver, partition 1, PROCESS_LOCAL, 6122 bytes)
      
      17/05/20 10:20:25 INFO TaskSetManager: Finished task 0.0 in stage 19.0 (TID 13) in 30 ms on localhost (executor driver) (1/2)
      
      17/05/20 10:20:25 INFO Executor: Running task 1.0 in stage 19.0 (TID 14)
      
      17/05/20 10:20:25 INFO BlockManager: Found block rdd161 locally
      
      17/05/20 10:20:25 INFO Executor: Finished task 1.0 in stage 19.0 (TID 14). 1172 bytes result sent to driver
      
      17/05/20 10:20:25 INFO TaskSetManager: Finished task 1.0 in stage 19.0 (TID 14) in 85 ms on localhost (executor driver) (2/2)
      
      17/05/20 10:20:25 INFO TaskSchedulerImpl: Removed TaskSet 19.0, whose tasks have all completed, from pool 
      
      17/05/20 10:20:25 INFO DAGScheduler: ResultStage 19 (sortByKey at LocalTest.scala:58) finished in 0.115 s
      
      17/05/20 10:20:25 INFO DAGScheduler: Job 5 finished: sortByKey at LocalTest.scala:58, took 0.129602 s
      
      17/05/20 10:20:25 INFO MemoryStore: Block broadcast_10 stored as values in memory (estimated size 127.1 KB, free 912.0 MB)
      
      17/05/20 10:20:25 INFO MemoryStore: Block broadcast10piece0 stored as bytes in memory (estimated size 14.3 KB, free 911.9 MB)
      
      17/05/20 10:20:25 INFO BlockManagerInfo: Added broadcast10piece0 in memory on 192.168.6.90:52302 (size: 14.3 KB, free: 912.2 MB)
      
      17/05/20 10:20:25 INFO SparkContext: Created broadcast 10 from sortByKey at LocalTest.scala:58
      
      17/05/20 10:20:25 INFO ReliableRDDCheckpointData: Done checkpointing RDD 16 to file:/Users/wttttt/programming/flume/RealTimeLog/flumeCheckpoint/874437fe-0c21-4a9d-8793-8ae914f3f38b/rdd-16, new parent is RDD 20
      
      17/05/20 10:20:25 INFO SparkContext: Starting job: take at LocalTest.scala:59
      
      17/05/20 10:20:25 INFO DAGScheduler: Registering RDD 17 (map at LocalTest.scala:58)
      
      17/05/20 10:20:25 INFO DAGScheduler: Got job 6 (take at LocalTest.scala:59) with 1 output partitions
      
      17/05/20 10:20:25 INFO DAGScheduler: Final stage: ResultStage 21 (take at LocalTest.scala:59)
      
      17/05/20 10:20:25 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 20)
      
      17/05/20 10:20:25 INFO DAGScheduler: Missing parents: List(ShuffleMapStage 20)
      
      17/05/20 10:20:25 INFO DAGScheduler: Submitting ShuffleMapStage 20 (MapPartitionsRDD[17] at map at LocalTest.scala:58), which has no missing parents
      
      17/05/20 10:20:25 INFO MemoryStore: Block broadcast_11 stored as values in memory (estimated size 6.1 KB, free 911.9 MB)
      
      17/05/20 10:20:25 INFO MemoryStore: Block broadcast11piece0 stored as bytes in memory (estimated size 3.5 KB, free 911.9 MB)
      
      17/05/20 10:20:25 INFO BlockManagerInfo: Added broadcast11piece0 in memory on 192.168.6.90:52302 (size: 3.5 KB, free: 912.2 MB)
      
      17/05/20 10:20:25 INFO SparkContext: Created broadcast 11 from broadcast at DAGScheduler.scala:996
      
      17/05/20 10:20:25 INFO DAGScheduler: Submitting 2 missing tasks from ShuffleMapStage 20 (MapPartitionsRDD[17] at map at LocalTest.scala:58)
      
      17/05/20 10:20:25 INFO TaskSchedulerImpl: Adding task set 20.0 with 2 tasks
      
      17/05/20 10:20:25 INFO TaskSetManager: Starting task 0.0 in stage 20.0 (TID 15, localhost, executor driver, partition 0, PROCESS_LOCAL, 6169 bytes)
      
      17/05/20 10:20:25 INFO Executor: Running task 0.0 in stage 20.0 (TID 15)
      
      17/05/20 10:20:25 INFO BlockManager: Found block rdd160 locally
      
      17/05/20 10:20:25 INFO Executor: Finished task 0.0 in stage 20.0 (TID 15). 1413 bytes result sent to driver
      
      17/05/20 10:20:25 INFO TaskSetManager: Starting task 1.0 in stage 20.0 (TID 16, localhost, executor driver, partition 1, PROCESS_LOCAL, 6169 bytes)
      
      17/05/20 10:20:25 INFO TaskSetManager: Finished task 0.0 in stage 20.0 (TID 15) in 44 ms on localhost (executor driver) (1/2)
      
      17/05/20 10:20:25 INFO Executor: Running task 1.0 in stage 20.0 (TID 16)
      
      17/05/20 10:20:25 INFO BlockManager: Found block rdd161 locally
      
      17/05/20 10:20:25 INFO Executor: Finished task 1.0 in stage 20.0 (TID 16). 1413 bytes result sent to driver
      
      17/05/20 10:20:25 INFO TaskSetManager: Finished task 1.0 in stage 20.0 (TID 16) in 58 ms on localhost (executor driver) (2/2)
      
      17/05/20 10:20:25 INFO TaskSchedulerImpl: Removed TaskSet 20.0, whose tasks have all completed, from pool 
      
      17/05/20 10:20:25 INFO DAGScheduler: ShuffleMapStage 20 (map at LocalTest.scala:58) finished in 0.101 s
      
      17/05/20 10:20:25 INFO DAGScheduler: looking for newly runnable stages
      
      17/05/20 10:20:25 INFO DAGScheduler: running: Set(ResultStage 0)
      
      17/05/20 10:20:25 INFO DAGScheduler: waiting: Set(ResultStage 21)
      
      17/05/20 10:20:25 INFO DAGScheduler: failed: Set()
      
      17/05/20 10:20:25 INFO DAGScheduler: Submitting ResultStage 21 (MapPartitionsRDD[22] at map at LocalTest.scala:58), which has no missing parents
      
      17/05/20 10:20:25 INFO MemoryStore: Block broadcast_12 stored as values in memory (estimated size 4.0 KB, free 911.9 MB)
      
      17/05/20 10:20:25 INFO MemoryStore: Block broadcast12piece0 stored as bytes in memory (estimated size 2.4 KB, free 911.9 MB)
      
      17/05/20 10:20:25 INFO BlockManagerInfo: Added broadcast12piece0 in memory on 192.168.6.90:52302 (size: 2.4 KB, free: 912.2 MB)
      
      17/05/20 10:20:25 INFO SparkContext: Created broadcast 12 from broadcast at DAGScheduler.scala:996
      
      17/05/20 10:20:25 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 21 (MapPartitionsRDD[22] at map at LocalTest.scala:58)
      
      17/05/20 10:20:25 INFO TaskSchedulerImpl: Adding task set 21.0 with 1 tasks
      
      17/05/20 10:20:25 INFO TaskSetManager: Starting task 0.0 in stage 21.0 (TID 17, localhost, executor driver, partition 0, ANY, 5903 bytes)
      
      17/05/20 10:20:25 INFO Executor: Running task 0.0 in stage 21.0 (TID 17)
      
      17/05/20 10:20:25 INFO ShuffleBlockFetcherIterator: Getting 1 non-empty blocks out of 2 blocks
      
      17/05/20 10:20:25 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
      
      17/05/20 10:20:25 INFO Executor: Finished task 0.0 in stage 21.0 (TID 17). 1869 bytes result sent to driver
      
      17/05/20 10:20:25 INFO TaskSetManager: Finished task 0.0 in stage 21.0 (TID 17) in 28 ms on localhost (executor driver) (1/1)
      
      17/05/20 10:20:25 INFO TaskSchedulerImpl: Removed TaskSet 21.0, whose tasks have all completed, from pool 
      
      17/05/20 10:20:25 INFO DAGScheduler: ResultStage 21 (take at LocalTest.scala:59) finished in 0.029 s
      
      17/05/20 10:20:25 INFO DAGScheduler: Job 6 finished: take at LocalTest.scala:59, took 0.150449 s
      
      17/05/20 10:20:25 INFO SparkContext: Starting job: take at LocalTest.scala:59
      
      17/05/20 10:20:25 INFO MapOutputTrackerMaster: Size of output statuses for shuffle 4 is 160 bytes
      
      17/05/20 10:20:25 INFO DAGScheduler: Got job 7 (take at LocalTest.scala:59) with 1 output partitions
      
      17/05/20 10:20:25 INFO DAGScheduler: Final stage: ResultStage 23 (take at LocalTest.scala:59)
      
      17/05/20 10:20:25 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 22)
      
      17/05/20 10:20:25 INFO DAGScheduler: Missing parents: List()
      
      17/05/20 10:20:25 INFO DAGScheduler: Submitting ResultStage 23 (MapPartitionsRDD[22] at map at LocalTest.scala:58), which has no missing parents
      
      17/05/20 10:20:25 INFO MemoryStore: Block broadcast_13 stored as values in memory (estimated size 4.0 KB, free 911.9 MB)
      
      17/05/20 10:20:25 INFO MemoryStore: Block broadcast13piece0 stored as bytes in memory (estimated size 2.4 KB, free 911.9 MB)
      
      17/05/20 10:20:25 INFO BlockManagerInfo: Added broadcast13piece0 in memory on 192.168.6.90:52302 (size: 2.4 KB, free: 912.2 MB)
      
      17/05/20 10:20:25 INFO SparkContext: Created broadcast 13 from broadcast at DAGScheduler.scala:996
      
      17/05/20 10:20:25 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 23 (MapPartitionsRDD[22] at map at LocalTest.scala:58)
      
      17/05/20 10:20:25 INFO TaskSchedulerImpl: Adding task set 23.0 with 1 tasks
      
      17/05/20 10:20:25 INFO TaskSetManager: Starting task 0.0 in stage 23.0 (TID 18, localhost, executor driver, partition 1, ANY, 5903 bytes)
      
      17/05/20 10:20:25 INFO Executor: Running task 0.0 in stage 23.0 (TID 18)
      
      17/05/20 10:20:25 INFO ShuffleBlockFetcherIterator: Getting 2 non-empty blocks out of 2 blocks
      
      17/05/20 10:20:25 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
      
      17/05/20 10:20:25 INFO Executor: Finished task 0.0 in stage 23.0 (TID 18). 1869 bytes result sent to driver
      
      17/05/20 10:20:25 INFO TaskSetManager: Finished task 0.0 in stage 23.0 (TID 18) in 75 ms on localhost (executor driver) (1/1)
      
      17/05/20 10:20:25 INFO TaskSchedulerImpl: Removed TaskSet 23.0, whose tasks have all completed, from pool 
      
      17/05/20 10:20:25 INFO DAGScheduler: ResultStage 23 (take at LocalTest.scala:59) finished in 0.077 s
      
      17/05/20 10:20:25 INFO DAGScheduler: Job 7 finished: take at LocalTest.scala:59, took 0.087657 s
      
      17/05/20 10:20:25 INFO JobScheduler: Added jobs for time 1495246825000 ms
      
      17/05/20 10:20:25 INFO JobGenerator: Checkpointing graph for time 1495246825000 ms
      
      17/05/20 10:20:25 INFO DStreamGraph: Updating checkpoint data for time 1495246825000 ms
      
      17/05/20 10:20:25 INFO JobScheduler: Starting job streaming job 1495246825000 ms.0 from job set of time 1495246825000 ms
      
      (test3 ,8)
      
      (test2 ,4)
      
      17/05/20 10:20:25 INFO DStreamGraph: Updated checkpoint data for time 1495246825000 ms
      
      17/05/20 10:20:25 INFO SparkContext: Starting job: print at LocalTest.scala:64
      
      17/05/20 10:20:25 INFO CheckpointWriter: Saving checkpoint for time 1495246825000 ms to file 'file:/Users/wttttt/programming/flume/RealTimeLog/flumeCheckpoint/checkpoint-1495246825000'
      
      17/05/20 10:20:25 INFO CheckpointWriter: Submitted checkpoint of time 1495246825000 ms to writer queue
      
      17/05/20 10:20:25 INFO DAGScheduler: Got job 8 (print at LocalTest.scala:64) with 1 output partitions
      
      17/05/20 10:20:25 INFO DAGScheduler: Final stage: ResultStage 25 (print at LocalTest.scala:64)
      
      17/05/20 10:20:25 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 24)
      
      17/05/20 10:20:25 INFO DAGScheduler: Missing parents: List()
      
      17/05/20 10:20:25 INFO DAGScheduler: Submitting ResultStage 25 (MapPartitionsRDD[22] at map at LocalTest.scala:58), which has no missing parents
      
      17/05/20 10:20:25 INFO MemoryStore: Block broadcast_14 stored as values in memory (estimated size 4.0 KB, free 911.9 MB)
      
      17/05/20 10:20:25 INFO MemoryStore: Block broadcast14piece0 stored as bytes in memory (estimated size 2.4 KB, free 911.9 MB)
      
      17/05/20 10:20:25 INFO BlockManagerInfo: Added broadcast14piece0 in memory on 192.168.6.90:52302 (size: 2.4 KB, free: 912.2 MB)
      
      17/05/20 10:20:25 INFO SparkContext: Created broadcast 14 from broadcast at DAGScheduler.scala:996
      
      17/05/20 10:20:25 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 25 (MapPartitionsRDD[22] at map at LocalTest.scala:58)
      
      17/05/20 10:20:25 INFO TaskSchedulerImpl: Adding task set 25.0 with 1 tasks
      
      17/05/20 10:20:25 INFO TaskSetManager: Starting task 0.0 in stage 25.0 (TID 19, localhost, executor driver, partition 0, ANY, 6672 bytes)
      
      17/05/20 10:20:25 INFO Executor: Running task 0.0 in stage 25.0 (TID 19)
      
      17/05/20 10:20:25 INFO ShuffleBlockFetcherIterator: Getting 1 non-empty blocks out of 2 blocks
      
      17/05/20 10:20:25 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
      
      17/05/20 10:20:25 INFO CheckpointWriter: Deleting file:/Users/wttttt/programming/flume/RealTimeLog/flumeCheckpoint/checkpoint-1495246115000.bk
      
      17/05/20 10:20:25 INFO CheckpointWriter: Checkpoint for time 1495246825000 ms saved to file 'file:/Users/wttttt/programming/flume/RealTimeLog/flumeCheckpoint/checkpoint-1495246825000', took 4010 bytes and 37 ms
      
      17/05/20 10:20:25 INFO Executor: Finished task 0.0 in stage 25.0 (TID 19). 1869 bytes result sent to driver
      
      17/05/20 10:20:25 INFO TaskSetManager: Finished task 0.0 in stage 25.0 (TID 19) in 37 ms on localhost (executor driver) (1/1)
      
      17/05/20 10:20:25 INFO TaskSchedulerImpl: Removed TaskSet 25.0, whose tasks have all completed, from pool 
      
      17/05/20 10:20:25 INFO DAGScheduler: ResultStage 25 (print at LocalTest.scala:64) finished in 0.039 s
      
      17/05/20 10:20:25 INFO DAGScheduler: Job 8 finished: print at LocalTest.scala:64, took 0.046366 s
      
      17/05/20 10:20:25 INFO SparkContext: Starting job: print at LocalTest.scala:64
      
      17/05/20 10:20:25 INFO DAGScheduler: Got job 9 (print at LocalTest.scala:64) with 1 output partitions
      
      17/05/20 10:20:25 INFO DAGScheduler: Final stage: ResultStage 27 (print at LocalTest.scala:64)
      
      17/05/20 10:20:25 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 26)
      
      17/05/20 10:20:25 INFO DAGScheduler: Missing parents: List()
      
      17/05/20 10:20:25 INFO DAGScheduler: Submitting ResultStage 27 (MapPartitionsRDD[22] at map at LocalTest.scala:58), which has no missing parents
      
      17/05/20 10:20:25 INFO MemoryStore: Block broadcast_15 stored as values in memory (estimated size 4.0 KB, free 911.9 MB)
      
      17/05/20 10:20:25 INFO MemoryStore: Block broadcast15piece0 stored as bytes in memory (estimated size 2.4 KB, free 911.9 MB)
      
      17/05/20 10:20:25 INFO BlockManagerInfo: Added broadcast15piece0 in memory on 192.168.6.90:52302 (size: 2.4 KB, free: 912.2 MB)
      
      17/05/20 10:20:25 INFO SparkContext: Created broadcast 15 from broadcast at DAGScheduler.scala:996
      
      17/05/20 10:20:25 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 27 (MapPartitionsRDD[22] at map at LocalTest.scala:58)
      
      17/05/20 10:20:25 INFO TaskSchedulerImpl: Adding task set 27.0 with 1 tasks
      
      17/05/20 10:20:25 INFO TaskSetManager: Starting task 0.0 in stage 27.0 (TID 20, localhost, executor driver, partition 1, ANY, 6672 bytes)
      
      17/05/20 10:20:25 INFO Executor: Running task 0.0 in stage 27.0 (TID 20)
      
      17/05/20 10:20:25 INFO ShuffleBlockFetcherIterator: Getting 2 non-empty blocks out of 2 blocks
      
      17/05/20 10:20:25 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
      
      
      
      Time: 1495246825000 ms
      
      (test3 ,8)
      
      (test2 ,4)
      
      (test1 ,2)
      

    配置JobHistory

    •  spark-defaults.conf: 
    spark.eventLog.enabled  true
    spark.eventLog.dir      hdfs://host99:9000/sparkLogHistory
    spark.eventLog.compress true
    
    •  eventLog.dir要手动创建: hadoop fs -mkdir /sparkLogHistory
    • spark-env.sh
    export SPARK_HISTORY_OPTS="-Dspark.history.ui.port=7777 -Dspark.history.retainedApplications=3 -Dspark.history.fs.logDirectory=hdfs://host99:9000/sparkLogHistory"
    

       默认只保留最近三次的jobHistory,端口号是7777

    • 运行一个job试试
    满地都是六便士,她却抬头看见了月亮。
  • 相关阅读:
    php记录代码执行时间
    java中针对同一变量的不同函数的互斥操作
    Linux下mysql新建账号及权限设置
    Linux下重启apache
    Mysql数据导入
    ubuntu安装phpcurl与phptidy扩展
    Linux服务器间文件传输
    Flash本地传递大数据,图片数据,localconnection 超出大小,超出限制 bitmapdata 拂晓风起
    [Java][JavaScript]字符串数组与字符串之间的互转(join/split)(转) 拂晓风起
    java poi读取excel公式,返回计算值(转) 拂晓风起
  • 原文地址:https://www.cnblogs.com/wttttt/p/6881760.html
Copyright © 2011-2022 走看看