  • Spark Shared Variables: Broadcast Variables and Accumulators

    Shared Variables

    Spark provides two limited types of shared variables for two common usage patterns: broadcast variables and accumulators.

     Broadcast variables allow the programmer to keep a read-only variable cached on each machine rather than shipping a copy of it with tasks. 

    Broadcast variables are created from a variable v by calling SparkContext.broadcast(v). The broadcast variable is a wrapper around v, and its value can be accessed by calling the value method.    

     val broadcastVar = sc.broadcast(Array(123))
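
    As a minimal sketch of how a broadcast variable is typically read inside a task, the example below broadcasts a small lookup table and resolves codes against it through the value method. The object name BroadcastSketch, the helper resolveNames, and the countryNames table are hypothetical names chosen for illustration; only SparkContext.broadcast and value come from the text above. It assumes a local Spark session; in spark-shell, sc already exists and the session setup can be skipped.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.sql.SparkSession

object BroadcastSketch {
  // Hypothetical helper: maps each code to a name using a broadcast lookup table.
  def resolveNames(sc: SparkContext, codes: Seq[String]): Array[String] = {
    // Illustrative data; a real table would usually be much larger.
    val countryNames = Map("cn" -> "China", "us" -> "United States")

    // Wrap the table in a broadcast variable so it is cached once per machine
    // instead of being shipped with every task.
    val lookup = sc.broadcast(countryNames)

    // Tasks read the cached copy through .value.
    sc.parallelize(codes)
      .map(code => lookup.value.getOrElse(code, "unknown"))
      .collect()
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only (an assumption, not from the article).
    val spark = SparkSession.builder()
      .appName("BroadcastSketch")
      .master("local[*]")
      .getOrCreate()

    println(resolveNames(spark.sparkContext, Seq("cn", "us", "xx")).mkString(", "))

    spark.stop()
  }
}
```

    The lambda only captures the broadcast wrapper, not the map itself, which is the point of the technique: the task closure stays small while the data is fetched once per executor.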

    Accumulators are variables that are only “added” to through an associative and commutative operation and can therefore be efficiently supported in parallel. They can be used to implement counters (as in MapReduce) or sums. Spark natively supports accumulators of numeric types, and programmers can add support for new types.

    scala> val accnum=sc.longAccumulator("ggg")
    accnum: org.apache.spark.util.LongAccumulator = LongAccumulator(id: 264, name: Some(ggg), value: 0)

    scala> sc.parallelize(Array(1,2,3,4,5)).foreach(x=>accnum.add(x))

    scala> accnum
    res14: org.apache.spark.util.LongAccumulator = LongAccumulator(id: 264, name: Some(ggg), value: 15)
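
    The session above uses a built-in numeric accumulator. For the "new types" case, one possible approach is to subclass AccumulatorV2 and register the instance with the SparkContext. The sketch below is an assumption-laden illustration, not the article's code: the class DistinctStringsAccumulator and the object CustomAccumulatorSketch are hypothetical names, and set union is chosen because it is associative and commutative, as accumulators require.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.util.AccumulatorV2

// Hypothetical accumulator that collects the distinct strings added to it.
class DistinctStringsAccumulator extends AccumulatorV2[String, Set[String]] {
  private var items: Set[String] = Set.empty

  // True when nothing has been added yet.
  override def isZero: Boolean = items.isEmpty

  // Returns a new accumulator holding the same contents.
  override def copy(): AccumulatorV2[String, Set[String]] = {
    val acc = new DistinctStringsAccumulator
    acc.items = items
    acc
  }

  override def reset(): Unit = items = Set.empty

  // Called inside tasks for each input element.
  override def add(v: String): Unit = items += v

  // Called on the driver to combine per-task results; union is order-independent.
  override def merge(other: AccumulatorV2[String, Set[String]]): Unit =
    items ++= other.value

  override def value: Set[String] = items
}

object CustomAccumulatorSketch {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only (an assumption, not from the article).
    val spark = SparkSession.builder()
      .appName("CustomAccumulatorSketch")
      .master("local[*]")
      .getOrCreate()
    val sc = spark.sparkContext

    val seen = new DistinctStringsAccumulator
    sc.register(seen, "distinctWords") // the accumulator must be registered before use

    sc.parallelize(Seq("a", "b", "a", "c")).foreach(w => seen.add(w))
    println(seen.value)

    spark.stop()
  }
}
```

    As with accnum above, the result is read on the driver after the action has finished; tasks only add to the accumulator.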

     In short: accumulators are used to aggregate information, while broadcast variables are used to distribute large objects efficiently.

  • Original article: https://www.cnblogs.com/playforever/p/9408109.html