  • Spark metrics on wordcount example

    I read the Metrics section on the Spark website. I want to try it on the wordcount example, but I can't make it work.

    spark/conf/metrics.properties :

    # Enable CsvSink for all instances
    *.sink.csv.class=org.apache.spark.metrics.sink.CsvSink
    
    # Polling period for CsvSink
    *.sink.csv.period=1
    
    *.sink.csv.unit=seconds
    
    # Polling directory for CsvSink
    *.sink.csv.directory=/home/spark/Documents/test/
    
    # Worker instance overlap polling period
    worker.sink.csv.period=1
    
    worker.sink.csv.unit=seconds
    
    # Enable jvm source for instance master, worker, driver and executor
    master.source.jvm.class=org.apache.spark.metrics.source.JvmSource
    
    worker.source.jvm.class=org.apache.spark.metrics.source.JvmSource
    
    driver.source.jvm.class=org.apache.spark.metrics.source.JvmSource
    
    executor.source.jvm.class=org.apache.spark.metrics.source.JvmSource
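Since Spark silently ignores malformed property lines, a quick sanity check on the file can rule out typos before blaming the sink itself. The sketch below (standard-library Python only; the keys are copied from the fragment above, and the parser is a simplified stand-in for Java's properties format) just confirms the expected keys and values are present:

```python
# Minimal parser for Spark's java-properties-style metrics config.
# Simplified stand-in: ignores escapes and ':' separators that the
# real java.util.Properties format also supports.

METRICS_CONF = """
*.sink.csv.class=org.apache.spark.metrics.sink.CsvSink
*.sink.csv.period=1
*.sink.csv.unit=seconds
*.sink.csv.directory=/home/spark/Documents/test/
driver.source.jvm.class=org.apache.spark.metrics.source.JvmSource
"""

def parse_metrics_conf(text):
    """Parse key=value lines into a dict, skipping blanks and comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props

props = parse_metrics_conf(METRICS_CONF)
assert props["*.sink.csv.class"].endswith("CsvSink")
assert props["*.sink.csv.period"].isdigit()
assert props["*.sink.csv.unit"] == "seconds"
```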
    

      I run my app in local mode, as in the documentation:

    $SPARK_HOME/bin/spark-submit   --class "SimpleApp"   --master local[4]   target/scala-2.10/simple-project_2.10-1.0.jar
    

      

    I checked /home/spark/Documents/test/ and it is empty.

    What did I miss?

    Shell (running again with spark.metrics.conf pointing at the file explicitly):

    $SPARK_HOME/bin/spark-submit   --class "SimpleApp"   --master local[4]  --conf   spark.metrics.conf=/home/spark/development/spark/conf/metrics.properties  target/scala-2.10/simple-project_2.10-1.0.jar
    Spark assembly has been built with Hive, including Datanucleus jars on classpath
    Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
    INFO SparkContext: Running Spark version 1.3.0
    WARN Utils: Your hostname, cv-local resolves to a loopback address: 127.0.1.1; using 192.168.1.64 instead (on interface eth0)
    WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
    INFO SecurityManager: Changing view acls to: spark
    INFO SecurityManager: Changing modify acls to: spark
    INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(spark); users with modify permissions: Set(spark)
    INFO Slf4jLogger: Slf4jLogger started
    INFO Remoting: Starting remoting
    INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@cv-local.local:35895]
    INFO Utils: Successfully started service 'sparkDriver' on port 35895.
    INFO SparkEnv: Registering MapOutputTracker
    INFO SparkEnv: Registering BlockManagerMaster
    INFO DiskBlockManager: Created local directory at /tmp/spark-447d56c9-cfe5-4f9d-9e0a-6bb476ddede6/blockmgr-4eaa04f4-b4b2-4b05-ba0e-fd1aeb92b289
    INFO MemoryStore: MemoryStore started with capacity 265.4 MB
    INFO HttpFileServer: HTTP File server directory is /tmp/spark-fae11cd2-937e-4be3-a273-be8b4c4847df/httpd-ca163445-6fff-45e4-9c69-35edcea83b68
    INFO HttpServer: Starting HTTP Server
    INFO Utils: Successfully started service 'HTTP file server' on port 52828.
    INFO SparkEnv: Registering OutputCommitCoordinator
    INFO Utils: Successfully started service 'SparkUI' on port 4040.
    INFO SparkUI: Started SparkUI at http://cv-local.local:4040
    INFO SparkContext: Added JAR file:/home/spark/workspace/IdeaProjects/wordcount/target/scala-2.10/simple-project_2.10-1.0.jar at http://192.168.1.64:52828/jars/simple-project_2.10-1.0.jar with timestamp 1444049152348
    INFO Executor: Starting executor ID <driver> on host localhost
    INFO AkkaUtils: Connecting to HeartbeatReceiver: akka.tcp://sparkDriver@cv-local.local:35895/user/HeartbeatReceiver
    INFO NettyBlockTransferService: Server created on 60320
    INFO BlockManagerMaster: Trying to register BlockManager
    INFO BlockManagerMasterActor: Registering block manager localhost:60320 with 265.4 MB RAM, BlockManagerId(<driver>, localhost, 60320)
    INFO BlockManagerMaster: Registered BlockManager
    INFO MemoryStore: ensureFreeSpace(34046) called with curMem=0, maxMem=278302556
    INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 33.2 KB, free 265.4 MB)
    INFO MemoryStore: ensureFreeSpace(5221) called with curMem=34046, maxMem=278302556
    INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 5.1 KB, free 265.4 MB)
    INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:60320 (size: 5.1 KB, free: 265.4 MB)
    INFO BlockManagerMaster: Updated info of block broadcast_0_piece0
    INFO SparkContext: Created broadcast 0 from textFile at SimpleApp.scala:11
    WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    WARN LoadSnappy: Snappy native library not loaded
    INFO FileInputFormat: Total input paths to process : 1
    INFO SparkContext: Starting job: count at SimpleApp.scala:12
    INFO DAGScheduler: Got job 0 (count at SimpleApp.scala:12) with 2 output partitions (allowLocal=false)
    INFO DAGScheduler: Final stage: Stage 0(count at SimpleApp.scala:12)
    INFO DAGScheduler: Parents of final stage: List()
    INFO DAGScheduler: Missing parents: List()
    INFO DAGScheduler: Submitting Stage 0 (MapPartitionsRDD[2] at filter at SimpleApp.scala:12), which has no missing parents
    INFO MemoryStore: ensureFreeSpace(2848) called with curMem=39267, maxMem=278302556
    INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 2.8 KB, free 265.4 MB)
    INFO MemoryStore: ensureFreeSpace(2056) called with curMem=42115, maxMem=278302556
    INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 2.0 KB, free 265.4 MB)
    INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on localhost:60320 (size: 2.0 KB, free: 265.4 MB)
    INFO BlockManagerMaster: Updated info of block broadcast_1_piece0
    INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:839
    INFO DAGScheduler: Submitting 2 missing tasks from Stage 0 (MapPartitionsRDD[2] at filter at SimpleApp.scala:12)
    INFO TaskSchedulerImpl: Adding task set 0.0 with 2 tasks
    INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, PROCESS_LOCAL, 1391 bytes)
    INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, localhost, PROCESS_LOCAL, 1391 bytes)
    INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
    INFO Executor: Running task 1.0 in stage 0.0 (TID 1)
    INFO Executor: Fetching http://192.168.1.64:52828/jars/simple-project_2.10-1.0.jar with timestamp 1444049152348
    INFO Utils: Fetching http://192.168.1.64:52828/jars/simple-project_2.10-1.0.jar to /tmp/spark-cab5a940-e2a4-4caf-8549-71e1518271f1/userFiles-c73172c2-7af6-4861-a945-b183edbbafa1/fetchFileTemp4229868141058449157.tmp
    INFO Executor: Adding file:/tmp/spark-cab5a940-e2a4-4caf-8549-71e1518271f1/userFiles-c73172c2-7af6-4861-a945-b183edbbafa1/simple-project_2.10-1.0.jar to class loader
    INFO CacheManager: Partition rdd_1_1 not found, computing it
    INFO CacheManager: Partition rdd_1_0 not found, computing it
    INFO HadoopRDD: Input split: file:/home/spark/development/spark/conf/metrics.properties:2659+2659
    INFO HadoopRDD: Input split: file:/home/spark/development/spark/conf/metrics.properties:0+2659
    INFO MemoryStore: ensureFreeSpace(7840) called with curMem=44171, maxMem=278302556
    INFO MemoryStore: Block rdd_1_0 stored as values in memory (estimated size 7.7 KB, free 265.4 MB)
    INFO BlockManagerInfo: Added rdd_1_0 in memory on localhost:60320 (size: 7.7 KB, free: 265.4 MB)
    INFO BlockManagerMaster: Updated info of block rdd_1_0
    INFO MemoryStore: ensureFreeSpace(8648) called with curMem=52011, maxMem=278302556
    INFO MemoryStore: Block rdd_1_1 stored as values in memory (estimated size 8.4 KB, free 265.4 MB)
    INFO BlockManagerInfo: Added rdd_1_1 in memory on localhost:60320 (size: 8.4 KB, free: 265.4 MB)
    INFO BlockManagerMaster: Updated info of block rdd_1_1
    INFO Executor: Finished task 1.0 in stage 0.0 (TID 1). 2399 bytes result sent to driver
    INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 2399 bytes result sent to driver
    INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 139 ms on localhost (1/2)
    INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 133 ms on localhost (2/2)
    INFO DAGScheduler: Stage 0 (count at SimpleApp.scala:12) finished in 0.151 s
    INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool 
    INFO DAGScheduler: Job 0 finished: count at SimpleApp.scala:12, took 0.225939 s
    INFO SparkContext: Starting job: count at SimpleApp.scala:13
    INFO DAGScheduler: Got job 1 (count at SimpleApp.scala:13) with 2 output partitions (allowLocal=false)
    INFO DAGScheduler: Final stage: Stage 1(count at SimpleApp.scala:13)
    INFO DAGScheduler: Parents of final stage: List()
    INFO DAGScheduler: Missing parents: List()
    INFO DAGScheduler: Submitting Stage 1 (MapPartitionsRDD[3] at filter at SimpleApp.scala:13), which has no missing parents
    INFO MemoryStore: ensureFreeSpace(2848) called with curMem=60659, maxMem=278302556
    INFO MemoryStore: Block broadcast_2 stored as values in memory (estimated size 2.8 KB, free 265.3 MB)
    INFO MemoryStore: ensureFreeSpace(2056) called with curMem=63507, maxMem=278302556
    INFO MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 2.0 KB, free 265.3 MB)
    INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on localhost:60320 (size: 2.0 KB, free: 265.4 MB)
    INFO BlockManagerMaster: Updated info of block broadcast_2_piece0
    INFO SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:839
    INFO DAGScheduler: Submitting 2 missing tasks from Stage 1 (MapPartitionsRDD[3] at filter at SimpleApp.scala:13)
    INFO TaskSchedulerImpl: Adding task set 1.0 with 2 tasks
    INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 2, localhost, PROCESS_LOCAL, 1391 bytes)
    INFO TaskSetManager: Starting task 1.0 in stage 1.0 (TID 3, localhost, PROCESS_LOCAL, 1391 bytes)
    INFO Executor: Running task 0.0 in stage 1.0 (TID 2)
    INFO Executor: Running task 1.0 in stage 1.0 (TID 3)
    INFO BlockManager: Found block rdd_1_0 locally
    INFO Executor: Finished task 0.0 in stage 1.0 (TID 2). 1830 bytes result sent to driver
    INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 2) in 9 ms on localhost (1/2)
    INFO BlockManager: Found block rdd_1_1 locally
    INFO Executor: Finished task 1.0 in stage 1.0 (TID 3). 1830 bytes result sent to driver
    INFO TaskSetManager: Finished task 1.0 in stage 1.0 (TID 3) in 10 ms on localhost (2/2)
    INFO DAGScheduler: Stage 1 (count at SimpleApp.scala:13) finished in 0.011 s
    INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool 
    INFO DAGScheduler: Job 1 finished: count at SimpleApp.scala:13, took 0.024084 s
    Lines with a: 5, Lines with b: 12
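When CsvSink is actually active, the polling directory fills with one CSV file per metric, named per instance and metric (the filename below is a hypothetical example of that pattern, not taken from this run, which produced nothing). A small stdlib check like this can confirm whether anything was written:

```python
import csv
import glob
import os
import tempfile

def list_metric_csvs(directory):
    """Return CSV files in the metrics directory, newest first."""
    paths = glob.glob(os.path.join(directory, "*.csv"))
    return sorted(paths, key=os.path.getmtime, reverse=True)

# Simulate a directory containing one metric file, since the real run
# above left /home/spark/Documents/test/ empty. The filename mimics
# CsvSink's <instance>.<metric>.csv naming and is hypothetical.
with tempfile.TemporaryDirectory() as d:
    sample = os.path.join(d, "driver.jvm.heap.used.csv")
    with open(sample, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["t", "value"])
        writer.writerow(["1444049152", "33554432"])
    found = list_metric_csvs(d)
    assert len(found) == 1
```

An empty result from such a check against the configured directory is consistent with the metrics config never being loaded at all.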
    

      

  • Original post: https://www.cnblogs.com/felixzh/p/5882268.html