  • RDD has no reduceByKey method

    When writing Spark code, people often find that an RDD has no reduceByKey method. This happens on Spark 1.2 and earlier, because RDD itself does not define reduceByKey; the RDD has to be implicitly converted to PairRDDFunctions before the method becomes available, which requires adding import org.apache.spark.SparkContext._.

    From Spark 1.3 onward, these implicit conversions live in the RDD companion object, so the compiler finds them automatically and no explicit import is needed. The relevant excerpt from the Spark source is shown below, followed by a short usage sketch.

    /**
     * Defines implicit functions that provide extra functionalities on RDDs of specific types.
     * For example, [[RDD.rddToPairRDDFunctions]] converts an RDD into a [[PairRDDFunctions]] for
     * key-value-pair RDDs, and enabling extra functionalities such as [[PairRDDFunctions.reduceByKey]].
     */
    object RDD {
      // The following implicit functions were in SparkContext before 1.3 and users had to
      // `import SparkContext._` to enable them. Now we move them here to make the compiler find
      // them automatically. However, we still keep the old functions in SparkContext for backward
      // compatibility and forward to the following functions directly.
      implicit def rddToPairRDDFunctions[K, V](rdd: RDD[(K, V)])
        (implicit kt: ClassTag[K], vt: ClassTag[V], ord: Ordering[K] = null): PairRDDFunctions[K, V] = {
        new PairRDDFunctions(rdd)
      }

      // ... (remaining implicit conversions omitted)
    }
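
    With this conversion in scope, reduceByKey can be called directly on an RDD of pairs. A minimal sketch in local mode (the object name and app name below are made up; on Spark 1.2 and earlier the commented-out import is required, on 1.3+ it is not):

    import org.apache.spark.{SparkConf, SparkContext}
    // Needed only on Spark 1.2 and earlier to bring rddToPairRDDFunctions into scope:
    // import org.apache.spark.SparkContext._

    object ReduceByKeyDemo {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("reduceByKey-demo").setMaster("local[*]")
        val sc   = new SparkContext(conf)

        // RDD[(String, Int)]: reduceByKey is not defined on RDD itself;
        // the implicit conversion wraps the RDD in PairRDDFunctions.
        val pairs  = sc.parallelize(Seq(("a", 1), ("b", 1), ("a", 1)))
        val counts = pairs.reduceByKey(_ + _)

        counts.collect().foreach(println)   // e.g. (a,2), (b,1)
        sc.stop()
      }
    }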

    As for what an implicit conversion is: simply put, Scala quietly swaps one type for another behind the scenes, letting a stand-in class do the things the original type cannot do itself.
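
    As a self-contained illustration of that mechanism (the class and method names below are made up for this sketch, not part of Spark), this is roughly what the compiler does when it sees a call the original type does not support:

    import scala.language.implicitConversions

    // A wrapper class that supplies a method Int does not have.
    class RichCounter(val self: Int) {
      def times(body: => Unit): Unit = (1 to self).foreach(_ => body)
    }

    object ImplicitSketch {
      // When the compiler sees `3.times {...}` and Int has no `times` method,
      // it looks for an in-scope implicit Int => RichCounter and inserts it,
      // effectively rewriting the call as intToRichCounter(3).times {...}.
      implicit def intToRichCounter(i: Int): RichCounter = new RichCounter(i)

      def main(args: Array[String]): Unit = {
        3.times { println("hello") }
      }
    }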

  • Original article: https://www.cnblogs.com/luckuan/p/4479551.html