  • The difference between cache and persist in RDD

    Reading the RDD.scala source makes the difference between cache and persist clear:

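    def persist(newLevel: StorageLevel): this.type = {
      // A storage level, once assigned, cannot be changed to a different one.
      if (storageLevel != StorageLevel.NONE && newLevel != storageLevel) {
        throw new UnsupportedOperationException(
          "Cannot change storage level of an RDD after it was already assigned a level")
      }
      // Register this RDD with the SparkContext as persistent; nothing is computed here.
      sc.persistRDD(this)
      // Register with the cleaner (if one is configured) so the RDD's cached data can be cleaned up.
      sc.cleaner.foreach(_.registerRDDForCleanup(this))
      storageLevel = newLevel
      this
    }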

    /** Persist this RDD with the default storage level (`MEMORY_ONLY`). */
    def persist(): this.type = persist(StorageLevel.MEMORY_ONLY)

    /** Persist this RDD with the default storage level (`MEMORY_ONLY`). */
    def cache(): this.type = persist()

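    From this we can see:

    1) RDD's cache() method simply calls persist(), so cache() and the no-argument persist() both use the MEMORY_ONLY storage level;

    2) persist(newLevel) lets you set the StorageLevel by hand to match the storage level a project needs;

    3) neither cache nor persist is an action: they only mark the RDD for persistence, and the data is actually stored when the first action computes it;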
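
    The three points above can be checked with a small sketch. It assumes a Spark dependency on the classpath; the local master, the application name, the object name CacheVsPersist and the sample data are illustrative choices, not part of the original post.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

object CacheVsPersist {
  def main(args: Array[String]): Unit = {
    // Assumption: a local two-thread master, used only for this illustration.
    val conf = new SparkConf().setMaster("local[2]").setAppName("cache-vs-persist")
    val sc = new SparkContext(conf)
    try {
      // 1) cache() delegates to persist(), so the level is MEMORY_ONLY.
      val cached = sc.parallelize(1 to 100).map(_ * 2).cache()
      assert(cached.getStorageLevel == StorageLevel.MEMORY_ONLY)

      // 2) persist(newLevel) sets an explicit storage level.
      val persisted = sc.parallelize(1 to 100).map(_ * 2)
        .persist(StorageLevel.MEMORY_AND_DISK)
      assert(persisted.getStorageLevel == StorageLevel.MEMORY_AND_DISK)

      // 3) Neither call is an action: nothing has been computed so far.
      //    The first action computes the partitions and stores them.
      assert(cached.count() == 100L)

      // Assigning a different level afterwards is rejected by persist.
      val rejected =
        try { cached.persist(StorageLevel.DISK_ONLY); false }
        catch { case _: UnsupportedOperationException => true }
      assert(rejected)
    } finally {
      sc.stop()
    }
  }
}
```

    Calling persist again with the same level is allowed; only a change to a different level throws, which matches the check at the top of persist(newLevel).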

  • Original article: https://www.cnblogs.com/luogankun/p/3801062.html