zoukankan      html  css  js  c++  java
  • 2.1.4、SparkEnv中创建BroadcastManager

    Broadcast是分布式的数据共享,由BroadcastManager负责管理其创建或销毁。Broadcast一般用于处理共享的配置文件、通用Dataset、常用数据结构

    通过SparkContext.broadcast广播一个Broadcast, 实际调用的是SparkEnv的BroadManager来创建

      /**
       * Broadcast a read-only variable to the cluster, returning a
       * [[org.apache.spark.broadcast.Broadcast]] object for reading it in distributed functions.
       * The variable will be sent to each cluster only once.
       */
      def broadcast[T: ClassTag](value: T): Broadcast[T] = {
        assertNotStopped()
        require(!classOf[RDD[_]].isAssignableFrom(classTag[T].runtimeClass),
          "Can not directly broadcast RDDs; instead, call collect() and broadcast the result.")
        //使用SparkEnv.broadcastManager创建Broadcast
        val bc = env.broadcastManager.newBroadcast[T](value, isLocal)
        val callSite = getCallSite
        logInfo("Created broadcast " + bc.id + " from " + callSite.shortForm)
        cleaner.foreach(_.registerBroadcastForCleanup(bc))
        bc
      }
    View Code

    在SparkEnv中创建BroadcastManager, 

    // 此处只是声明, 只有调用initialize, 才会生效
    val broadcastManager = new BroadcastManager(isDriver, conf, securityManager)

    initialize() 

      // Called by SparkContext or Executor before using Broadcast
      private def initialize() {
        synchronized {
          if (!initialized) {
            broadcastFactory = new TorrentBroadcastFactory
            broadcastFactory.initialize(isDriver, conf, securityManager)
            initialized = true
          }
        }
      }

     BoradcastManager操作BradCast实际是代理BroadcastFactory, 此处使用工长模式

      def stop() {
        broadcastFactory.stop()
      }
    
      private val nextBroadcastId = new AtomicLong(0)
    
      def newBroadcast[T: ClassTag](value_ : T, isLocal: Boolean): Broadcast[T] = {
        broadcastFactory.newBroadcast[T](value_, isLocal, nextBroadcastId.getAndIncrement())
      }
    
      def unbroadcast(id: Long, removeFromDriver: Boolean, blocking: Boolean) {
        broadcastFactory.unbroadcast(id, removeFromDriver, blocking)
      }
    View Code
  • 相关阅读:
    Json schema前奏 关于JSON
    笔试题:能被1~10同时整除的最小整数是2520,问能被1~20同时整除的最小整数是多少?
    CentOS7 安装 Docker、最佳Docker学习文档
    2019年4399暑期实习算法题2,迷宫路径条数
    2019vivo秋招提前批笔试题第3题
    python内存机制与垃圾回收、调优手段
    N皇后问题的python实现
    一行代码判断一个数是否是2的整数次方
    在O(1)的时间内删除链表节点
    打印从1到n位数的最大值
  • 原文地址:https://www.cnblogs.com/chengbao/p/10624951.html
Copyright © 2011-2022 走看看