zoukankan      html  css  js  c++  java
  • spark dataFrame withColumn

    说明:withColumn用于在原有DF新增一列

    1. 初始化sqlContext

    val sqlContext = new org.apache.spark.sql.SQLContext(sc) 

    2.导入sqlContext隐式转换

    import sqlContext.implicits._ 

    3.  创建DataFrames

    val df = sqlContext.read.json("file:///usr/local/spark-2.3.0/examples/src/main/resources/people.json")

    4. 显示内容

    df.show()   

    | age|   name| 
    +----+-------+
    |null|Michael|
    |  30|   Andy|
    |  19| Justin|

    5. 为原有df新加一列

    df.withColumn("id2", monotonically_increasing_id()+1) 

    6. 显示添加列后的内容

     res6.show() 

    +----+-------+---+
    | age|   name|id2|
    +----+-------+---+
    |null|Michael|  1|
    |  30|   Andy|  2|
    |  19| Justin|  3|
    +----+-------+---+

    完成的过程如下:

    scala> val sqlContext = new org.apache.spark.sql.SQLContext(sc) 
    warning: there was one deprecation warning; re-run with -deprecation for details
    sqlContext: org.apache.spark.sql.SQLContext = org.apache.spark.sql.SQLContext@2513155a
    scala> import sqlContext.implicits._
    import sqlContext.implicits._
    scala> val df = sqlContext.read.json("file:///usr/local/spark-2.3.0/examples/src/main/resources/people.json")
    2018-06-25 18:55:30 WARN  ObjectStore:6666 - Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
    2018-06-25 18:55:30 WARN  ObjectStore:568 - Failed to get database default, returning NoSuchObjectException
    2018-06-25 18:55:32 WARN  ObjectStore:568 - Failed to get database global_temp, returning NoSuchObjectException
    df: org.apache.spark.sql.DataFrame = [age: bigint, name: string]
    scala> df.show()
    +----+-------+
    | age|   name|
    +----+-------+
    |null|Michael|
    |  30|   Andy|
    |  19| Justin|
    +----+-------+
    scala> df.withColumn("id2", monotonically_increasing_id()+1)
    res6: org.apache.spark.sql.DataFrame = [age: bigint, name: string ... 1 more field]
    scala> res6.show()
    +----+-------+---+
    | age|   name|id2|
    +----+-------+---+
    |null|Michael|  1|
    |  30|   Andy|  2|
    |  19| Justin|  3|
    +----+-------+---+
  • 相关阅读:
    网络存储系统NSS基准性能测试(7.10)
    Design and Model Analysis of the ECommerce Development Platform for 3Tiered Web Applications
    网络存储系统NSS基准性能测试(7.24)
    Android雁翎刀之ImageView之哈哈镜
    Android雁翎刀之ImageView之舞动乾坤
    Android雁翎刀之ImageView之异步下载
    网络存储系统NSS基准性能测试(7.17)
    freeradius源码安装及相关问题解释
    解决linux乱码显示的问题
    在ubtunu使用aptget安装和配置freeradius
  • 原文地址:https://www.cnblogs.com/abcdwxc/p/9225855.html
Copyright © 2011-2022 走看看