  • Spark accumulators

    [Figure: accumulator schematic]

    Creating an accumulator:

    sc.longAccumulator("name")             // unnamed: sc.longAccumulator
    sc.collectionAccumulator[T]("name")    // unnamed: sc.collectionAccumulator[T]
    sc.doubleAccumulator("name")           // unnamed: sc.doubleAccumulator
    
    

    Adding to an accumulator:

    l.add(1L)

    Reading the accumulator's value (on the driver):

    l.value
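
    The three steps above (create, add, read) can be sketched together as follows. This is only a minimal sketch: it assumes a throwaway local SparkSession, and the object name AccumulatorBasics and the accumulator names are made up for illustration.

    ```scala
    import org.apache.spark.sql.SparkSession

    // Hypothetical example object; not part of the original post.
    object AccumulatorBasics {
      def run(): (Long, Double, Int) = {
        // Assumption: a local session built only for this example.
        val spark = SparkSession.builder()
          .master("local[2]")
          .appName("accumulator-basics")
          .getOrCreate()
        val sc = spark.sparkContext

        // Create one accumulator of each built-in kind.
        val longAcc = sc.longAccumulator("longAcc")
        val doubleAcc = sc.doubleAccumulator("doubleAcc")
        val listAcc = sc.collectionAccumulator[String]("listAcc")

        // Add to each of them (here directly on the driver).
        longAcc.add(1L)
        doubleAcc.add(0.5)
        listAcc.add("a")

        // value returns boxed Java types, hence the type ascriptions.
        val n: Long = longAcc.value
        val d: Double = doubleAcc.value
        val size: Int = listAcc.value.size()

        spark.stop()
        (n, d, size)
      }

      def main(args: Array[String]): Unit = println(run())
    }
    ```

    In real jobs the add calls normally sit inside a function passed to an RDD operation, as in the demos that follow.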

    Demo

    Long accumulator

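    // Assumes an existing SparkSession named `spark`, as in spark-shell.
    spark.sparkContext.setLogLevel("ERROR")

    val data = spark.sparkContext.parallelize(List(" ", " ", " ", " "))
    // An unnamed accumulator also works: spark.sparkContext.longAccumulator
    val l = spark.sparkContext.longAccumulator("test")
    data.map { x =>
      l.add(3L) // runs once per element
      x
    }.count() // count() is only here to trigger execution
    println(l.value)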

    The map function runs once for each of the 4 elements of data and adds 3 each time, so the output is 12.

    Double accumulator
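
    The original post leaves this section empty. A minimal sketch of what a double accumulator demo might look like, mirroring the long accumulator demo; the object name, the sample data, and the local SparkSession are assumptions made for illustration.

    ```scala
    import org.apache.spark.sql.SparkSession

    // Hypothetical example object; not part of the original post.
    object DoubleAccumulatorDemo {
      def run(): Double = {
        // Assumption: a local session built only for this example.
        val spark = SparkSession.builder()
          .master("local[2]")
          .appName("double-accumulator-demo")
          .getOrCreate()
        val sc = spark.sparkContext
        sc.setLogLevel("ERROR")

        val data = sc.parallelize(List(1.5, 2.5, 4.0))
        val total = sc.doubleAccumulator("total")
        data.map { x =>
          total.add(x) // add each element to the running sum
          x
        }.count() // count() only triggers execution

        val result: Double = total.value
        spark.stop()
        result
      }

      def main(args: Array[String]): Unit = println(run()) // prints 8.0
    }
    ```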

    Collection accumulator
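
    This section is also empty in the original. As a hedged sketch, a collection accumulator can gather individual elements (here, strings that are not numeric) instead of a sum. The object name, the sample data, and the local SparkSession are assumptions for illustration; value returns a java.util.List, and the order of its elements is not guaranteed across partitions.

    ```scala
    import org.apache.spark.sql.SparkSession

    // Hypothetical example object; not part of the original post.
    object CollectionAccumulatorDemo {
      def run(): java.util.List[String] = {
        // Assumption: a local session built only for this example.
        val spark = SparkSession.builder()
          .master("local[2]")
          .appName("collection-accumulator-demo")
          .getOrCreate()
        val sc = spark.sparkContext
        sc.setLogLevel("ERROR")

        val data = sc.parallelize(List("1", "x", "3", "y"))
        val bad = sc.collectionAccumulator[String]("bad")
        data.map { s =>
          // Record every element that is not made up of digits only.
          if (!s.forall(_.isDigit)) bad.add(s)
          s
        }.count() // count() only triggers execution

        val result = bad.value // contains "x" and "y", in no guaranteed order
        spark.stop()
        result
      }

      def main(args: Array[String]): Unit = println(run())
    }
    ```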

    The double-counting problem:

    val data = spark.sparkContext.parallelize(List(" ", " ", " ", " "))
    val l = spark.sparkContext.longAccumulator("test")
    val res = data.map { x =>
      l.add(3L)
      x
    }
    res.count() // first action: runs the map
    println(l.value) // 12
    res.collect() // second action: runs the map again
    println(l.value) // 24

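    Two actions are called in a row, so the accumulator is incremented twice: the second println shows 24 rather than 12. Transformations such as map are lazy, so the add inside map only runs when an action triggers a job, and it runs again every time a later action recomputes the same RDD.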
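
    The behaviour described above can be checked by reading the accumulator before any action and after each one. A minimal sketch, assuming a local SparkSession and no task retries; the object name is made up for illustration.

    ```scala
    import org.apache.spark.sql.SparkSession

    // Hypothetical example object; not part of the original post.
    object LazyAccumulatorDemo {
      def run(): (Long, Long, Long) = {
        // Assumption: a local session built only for this example.
        val spark = SparkSession.builder()
          .master("local[2]")
          .appName("lazy-accumulator-demo")
          .getOrCreate()
        val sc = spark.sparkContext
        sc.setLogLevel("ERROR")

        val data = sc.parallelize(List(" ", " ", " ", " "))
        val acc = sc.longAccumulator("test")
        val res = data.map { x =>
          acc.add(3L)
          x
        }

        val beforeAction: Long = acc.value // map is lazy: nothing has run yet
        res.count()
        val afterCount: Long = acc.value // the map ran once over 4 elements
        res.collect()
        val afterCollect: Long = acc.value // the map ran a second time

        spark.stop()
        (beforeAction, afterCount, afterCollect)
      }

      def main(args: Array[String]): Unit = println(run()) // prints (0,12,24)
    }
    ```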

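    The fix is to call cache() right after the transformation that updates the accumulator, so that the second action reads the cached partitions instead of re-running the map (this holds as long as the cached partitions are not evicted):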

    spark.sparkContext.setLogLevel("ERROR")

    val data = spark.sparkContext.parallelize(List(" ", " ", " ", " "))
    val l = spark.sparkContext.longAccumulator("test")
    val res = data.map { x =>
      l.add(3L)
      x
    }.cache() // keep the mapped partitions so later actions reuse them
    res.count() // first action: runs the map and fills the cache
    println(l.value) // 12
    res.collect() // second action: served from the cache, the map is not re-run
    println(l.value) // still 12
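
    Caching is not the only way to avoid double counting. The Spark programming guide states that accumulator updates made inside an action are applied only once per task, so another option, not covered in the original post, is to do the add inside foreach. A hedged sketch, assuming a local SparkSession; the object name is made up for illustration.

    ```scala
    import org.apache.spark.sql.SparkSession

    // Hypothetical example object; not part of the original post.
    object ForeachAccumulatorDemo {
      def run(): (Long, Long) = {
        // Assumption: a local session built only for this example.
        val spark = SparkSession.builder()
          .master("local[2]")
          .appName("foreach-accumulator-demo")
          .getOrCreate()
        val sc = spark.sparkContext
        sc.setLogLevel("ERROR")

        val data = sc.parallelize(List(" ", " ", " ", " "))
        val acc = sc.longAccumulator("test")

        // The update lives in the action itself, not in the lineage of `data`.
        data.foreach(_ => acc.add(3L))
        val afterForeach: Long = acc.value

        // Later actions on `data` do not touch the accumulator.
        data.count()
        data.collect()
        val afterMoreActions: Long = acc.value

        spark.stop()
        (afterForeach, afterMoreActions)
      }

      def main(args: Array[String]): Unit = println(run()) // prints (12,12)
    }
    ```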
  • Original article: https://www.cnblogs.com/students/p/14263733.html