zoukankan      html  css  js  c++  java
  • Spark Structured Streaming框架(5)之进程管理

      Structured Streaming提供一些API来管理Streaming对象。用户可以通过这些API来手动管理已经启动的Streaming,保证在系统中的Streaming有序执行。

    1. StreamingQuery

      在调用DataStreamWriter方法的start启动Streaming后,会返回一个StreamingQuery对象。所以用户就可以通过这个对象来管理Streaming。

    如下所示:

    val query = df.writeStream.format("console").start() // get the query object

    query.id // get the unique identifier of the running query that persists across restarts from checkpoint data

    query.runId // get the unique id of this run of the query, which will be generated at every start/restart

    query.name // get the name of the auto-generated or user-specified name

    query.explain() // print detailed explanations of the query

    query.stop() // stop the query

    query.awaitTermination() // block until query is terminated, with stop() or with error

    query.exception // the exception if the query has been terminated with error

    query.recentProgress // an array of the most recent progress updates for this query

    query.lastProgress // the most recent progress update of this streaming query

    2. StreamingQueryManager

      Structured Streaming提供了另外一个管理Streaming的接口是:StreamingQueryManager。用户可以通过SparkSession对象的streams方法获得。

    如下所示:

    val spark: SparkSession = ...

    val streamManager = spark.streams()

    streamManager.active // get the list of currently active streaming queries

    streamManager.get(id) // get a query object by its unique id

    streamManager.awaitAnyTermination() // block until any one of them terminates

    3. 参考文献

    [2]. Kafka Integration Guide.

  • 相关阅读:
    浅谈 倍增/ST表
    Meetings S 题解
    排序模板
    Huffman 树
    2020.7.31 模拟赛 题解
    浅谈 最短路
    【lcez校内第三次考T1】【题解】folder
    【题解】 P2613 【模板】有理数取余
    【题解】P5535 【XR-3】小道消息
    【笔记】积性函数 与 欧拉函数
  • 原文地址:https://www.cnblogs.com/huliangwen/p/7470766.html
Copyright © 2011-2022 走看看