  • Spark 2.1: usage example of rdd.combineByKeyWithClassTag

    Spark version used for testing:

    Spark context Web UI available at http://192.168.1.1:32735
    Spark context available as 'sc' (master = local[*], app id = local-1380172893828).
    Spark session available as 'spark'.
    Welcome to
          ____              __
         / __/__  ___ _____/ /__
        _ / _ / _ `/ __/  '_/
       /___/ .__/\_,_/_/ /_/\_   version 2.1.0
          /_/
             
    Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_72)
    Type in expressions to have them evaluated.
    Type :help for more information.

    Note: Spark 1.5 does not provide the rdd.combineByKeyWithClassTag operator, but it does provide rdd.combineByKey (which is still available in Spark 2.1).
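    For readers on an older version, here is a minimal sketch of the same three-function pattern written with rdd.combineByKey. It assumes Spark is on the classpath and runs in local mode; the object name CombineByKeySketch, the helper sumAndCount, and the sample data are hypothetical and not from the original post. It computes a per-key (sum, count) rather than a list so that the result does not depend on how the data is partitioned.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object CombineByKeySketch {
  // Hypothetical helper (not from the original post):
  // computes (sum, count) per key with combineByKey.
  def sumAndCount(sc: SparkContext, data: Seq[(Int, Double)]): Map[Int, (Double, Int)] =
    sc.makeRDD(data)
      .combineByKey(
        // createCombiner: first value seen for a key in a partition
        (v: Double) => (v, 1),
        // mergeValue: fold another value from the same partition into the combiner
        (acc: (Double, Int), v: Double) => (acc._1 + v, acc._2 + 1),
        // mergeCombiners: merge combiners produced by different partitions
        (a: (Double, Int), b: (Double, Int)) => (a._1 + b._1, a._2 + b._2))
      .collect()
      .toMap

  def main(args: Array[String]): Unit = {
    // Assumption: local mode, as in the spark-shell session of this post
    val conf = new SparkConf().setAppName("CombineByKeySketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val data = Seq((1, 2.0), (1, 4.0), (2, 1.0)) // made-up sample data
      sumAndCount(sc, data).toSeq.sortBy(_._1).foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```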

    Usage example:

    scala> case class FModel(cgridid: Int, angle: Double, drsrp: Double, distance: Double)
    defined class FModel
    
    scala>  val sample_rdd=sc.makeRDD(
         |       Array(
         |         (1,FModel(1,2.0,2.1,2.2)),
         |         (1,FModel(2,2.2,2.11,23.2)),
         |         (2,FModel(1,2.0,2.1,2.2)),
         |         (1,FModel(3,2.0,42.1,22.2)),
         |         (2,FModel(2,2.2,2.11,23.2)),
         |         (3,FModel(3,2.0,42.1,22.2))
         |       )
         |     )
    sample_rdd: org.apache.spark.rdd.RDD[(Int, FModel)] = ParallelCollectionRDD[0] at makeRDD at <console>:26
    
    scala> val combinByKeyRDD = sample_rdd.combineByKeyWithClassTag(
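         |       (x: FModel) => (List(x), 1),  // createCombiner: first value for a key
         |       (acc: (List[FModel], Int), x: FModel) => (x :: acc._1, acc._2 + 1),  // mergeValue: add a value within a partition
         |       (left: (List[FModel], Int), right: (List[FModel], Int)) => (left._1 ::: right._1, left._2 + right._2))  // mergeCombiners: merge across partitions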
    combinByKeyRDD: org.apache.spark.rdd.RDD[(Int, (List[FModel], Int))] = ShuffledRDD[1] at combineByKeyWithClassTag at <console>:28
    
    scala> combinByKeyRDD.foreach(println)
    (3,(List(FModel(3,2.0,42.1,22.2)),1))
    (2,(List(FModel(1,2.0,2.1,2.2), FModel(2,2.2,2.11,23.2)),2))
    (1,(List(FModel(1,2.0,2.1,2.2), FModel(2,2.2,2.11,23.2), FModel(3,2.0,42.1,22.2)),3))
                                                                                    
    scala> 
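    To make the roles of the three functions in the example easier to follow, here is a small plain-Scala model that needs no Spark. It is only a sketch of the semantics, not Spark's implementation: the object CombineByKeyModel, the helper combineLocally, and the string sample data are hypothetical, and partitions are simulated as nested Seqs.

```scala
object CombineByKeyModel {
  // Hypothetical helper: applies createCombiner / mergeValue / mergeCombiners
  // to local collections that stand in for RDD partitions.
  def combineLocally[K, V, C](
      partitions: Seq[Seq[(K, V)]],
      createCombiner: V => C,
      mergeValue: (C, V) => C,
      mergeCombiners: (C, C) => C): Map[K, C] = {
    // Step 1: inside each partition, build one combiner per key
    val perPartition: Seq[Map[K, C]] = partitions.map { part =>
      part.foldLeft(Map.empty[K, C]) { case (acc, (k, v)) =>
        acc.get(k) match {
          case Some(c) => acc.updated(k, mergeValue(c, v))   // key already seen here
          case None    => acc.updated(k, createCombiner(v))  // first value for this key
        }
      }
    }
    // Step 2: merge the per-partition combiners key by key
    perPartition.flatMap(_.toSeq).foldLeft(Map.empty[K, C]) { case (acc, (k, c)) =>
      acc.get(k) match {
        case Some(existing) => acc.updated(k, mergeCombiners(existing, c))
        case None           => acc.updated(k, c)
      }
    }
  }

  def main(args: Array[String]): Unit = {
    // Two simulated partitions with made-up values
    val partitions = Seq(
      Seq((1, "a"), (1, "b"), (2, "c")),
      Seq((1, "d"), (2, "e"), (3, "f")))
    val result = combineLocally[Int, String, (List[String], Int)](
      partitions,
      v => (List(v), 1),
      (c, v) => (v :: c._1, c._2 + 1),
      (c1, c2) => (c1._1 ::: c2._1, c1._2 + c2._2))
    result.toSeq.sortBy(_._1).foreach(println)
  }
}
```

    Because mergeValue prepends with ::, values that land in the same partition end up in reverse order within the list, so the element order in the final result depends on how the data was partitioned.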
  • Original article: https://www.cnblogs.com/yy3b2007com/p/8506552.html