  • Spark: Cache and Checkpoint

    1. Cache operations

    scala> val rdd1 = sc.textFile("hdfs://192.168.146.111:9000/logs")
    rdd1: org.apache.spark.rdd.RDD[String] = hdfs://192.168.146.111:9000/logs MapPartitionsRDD[38] at textFile at <console>:24
    
    scala> rdd1.count
    res13: Long = 40155                                                             
    
    scala> rdd1.count
    res14: Long = 40155

    scala> val rdd2 = sc.textFile("hdfs://192.168.146.111:9000/logs")
    rdd2: org.apache.spark.rdd.RDD[String] = hdfs://192.168.146.111:9000/logs MapPartitionsRDD[40] at textFile at <console>:24
    
    scala> val rdd2Cache = rdd2.cache
    rdd2Cache: rdd2.type = hdfs://192.168.146.111:9000/logs MapPartitionsRDD[40] at textFile at <console>:24
    
    scala> rdd2Cache.count
    res15: Long = 40155
    
    scala> rdd2Cache.count
    res16: Long = 40155
    
    scala> rdd2Cache.count
    res17: Long = 40155
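The transcripts above show only the counts, not how long each action took, so the effect of `cache` is not visible in them. A minimal sketch of how the difference could be observed in spark-shell follows. It assumes the same HDFS path as above; `timed` is a hypothetical helper written for this illustration and is not part of Spark, and `logs` is an illustrative name.

```scala
// Meant to be pasted into spark-shell, where `sc` (the SparkContext) is predefined.
import org.apache.spark.storage.StorageLevel

// Hypothetical helper (not a Spark API): run a block and print the elapsed time.
def timed[T](label: String)(body: => T): T = {
  val start = System.nanoTime()
  val result = body
  println(f"$label: ${(System.nanoTime() - start) / 1e6}%.1f ms")
  result
}

val logs = sc.textFile("hdfs://192.168.146.111:9000/logs")

// cache() is lazy: it only marks the RDD for storage and returns the same RDD.
// For an RDD it is shorthand for persist(StorageLevel.MEMORY_ONLY).
logs.cache()
val levelWhileCached = logs.getStorageLevel

// The first action reads from HDFS and fills the cache as a side effect.
timed("first count") { logs.count() }

// Later actions can reuse the cached partitions instead of re-reading HDFS.
timed("second count") { logs.count() }

// Release the cached partitions once they are no longer needed.
logs.unpersist()
```

With `MEMORY_ONLY`, partitions that do not fit in memory are not stored and are recomputed on demand, so the speed-up depends on how much of the data actually fits in executor memory.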

    2. The checkpoint mechanism

    scala> sc.setCheckpointDir("hdfs://192.168.146.111:9000/chechdir")
    
    scala> val rddc = rdd1.filter(_.contains("bigdata"))
    rddc: org.apache.spark.rdd.RDD[String] = MapPartitionsRDD[41] at filter at <console>:26
    
    scala> rddc.checkpoint
    
    scala> rddc.count
    res21: Long = 7155 
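Like `cache`, `checkpoint` is lazy: it only marks the RDD, and the checkpoint files are written when the first action runs. The sketch below shows one way to confirm that from spark-shell. It reuses the paths from the transcript above; the name `filtered` is illustrative. Persisting the RDD before checkpointing follows the recommendation in the Spark API documentation, since otherwise the RDD is computed a second time just to write the checkpoint.

```scala
// Meant to be pasted into spark-shell, where `sc` (the SparkContext) is predefined.
sc.setCheckpointDir("hdfs://192.168.146.111:9000/chechdir")

val filtered = sc
  .textFile("hdfs://192.168.146.111:9000/logs")
  .filter(_.contains("bigdata"))

// Keep the computed partitions in memory so that writing the checkpoint
// does not recompute the RDD from the source files.
filtered.cache()

// Must be called before any action has run on this RDD; nothing is written yet.
filtered.checkpoint()
val checkpointedBeforeAction = filtered.isCheckpointed

// The action runs the job; the checkpoint files are written once it finishes.
filtered.count()
val checkpointedAfterAction = filtered.isCheckpointed

// Where the data was written, and the lineage as Spark now sees it.
println(filtered.getCheckpointFile)
println(filtered.toDebugString)
```

After the checkpoint is written, the RDD's lineage is truncated: its parent becomes the checkpoint data rather than the original `textFile` and `filter` chain, which is what makes recovery possible without replaying the whole computation.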

  • Original article: https://www.cnblogs.com/areyouready/p/10293756.html