  • Spark2 Dataset: collect_set and collect_list

    collect_set removes duplicate elements; collect_list keeps them.
    select gender,
           concat_ws(',', collect_set(children)),
           concat_ws(',', collect_list(children))
      from Affairs
     group by gender

    // Create a temporary view
    data.createOrReplaceTempView("Affairs")
    
    val df3= spark.sql("select gender,concat_ws(',',collect_set(children)),concat_ws(',',collect_list(children)) from Affairs group by gender")
    df3: org.apache.spark.sql.DataFrame = [gender: string, concat_ws(,, collect_set(children)): string ... 1 more field]
    
    df3.show  // collect_set removes duplicate elements; collect_list keeps them
    +------+-----------------------------------+------------------------------------+
    |gender|concat_ws(,, collect_set(children))|concat_ws(,, collect_list(children))|
    +------+-----------------------------------+------------------------------------+
    |female|                             no,yes|                    no,yes,no,no,yes|
    |  male|                             no,yes|                    no,yes,no,yes,no|
    +------+-----------------------------------+------------------------------------+
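
The same aggregation can also be written with the DataFrame API instead of a SQL string. The following is a minimal sketch, not the original author's code: the object name `CollectSetVsList`, the helper `summarize`, the column aliases, and the sample rows are all assumptions made for illustration, since the post does not show the contents of `data`. Note that Spark does not guarantee the order of elements returned by collect_set or collect_list after a shuffle, so the element order in the output may differ between runs.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{collect_list, collect_set, concat_ws}

object CollectSetVsList {
  // Hypothetical sample rows (gender, children); the real `data` is not shown in the post.
  val rows: Seq[(String, String)] = Seq(
    ("female", "no"), ("female", "yes"), ("female", "no"),
    ("male", "yes"), ("male", "yes")
  )

  // Group by gender and join the collected values into comma-separated strings.
  // collect_set drops duplicates within each group; collect_list keeps every value.
  def summarize(data: DataFrame): DataFrame =
    data.groupBy("gender").agg(
      concat_ws(",", collect_set("children")).as("children_set"),
      concat_ws(",", collect_list("children")).as("children_list")
    )

  def main(args: Array[String]): Unit = {
    // Local session for the demo only; in spark-shell, `spark` already exists.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("collect-set-vs-list")
      .getOrCreate()
    import spark.implicits._

    val data = rows.toDF("gender", "children")
    summarize(data).show(false)

    spark.stop()
  }
}
```

Aliasing the aggregated columns with `.as(...)` avoids the long generated names such as `concat_ws(,, collect_set(children))` seen in the output above.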
    
  • Original article: https://www.cnblogs.com/wwxbi/p/6102380.html