zoukankan      html  css  js  c++  java
  • spark dataFrame withColumn

    说明:withColumn用于在原有DF新增一列

    1. 初始化sqlContext

    val sqlContext = new org.apache.spark.sql.SQLContext(sc) 

    2.导入sqlContext隐式转换

    import sqlContext.implicits._ 

    3.  创建DataFrames

    val df = sqlContext.read.json("file:///usr/local/spark-2.3.0/examples/src/main/resources/people.json")

    4. 显示内容

    df.show()   

    | age|   name| 
    +----+-------+
    |null|Michael|
    |  30|   Andy|
    |  19| Justin|

    5. 为原有df新加一列

    df.withColumn("id2", monotonically_increasing_id()+1) 

    6. 显示添加列后的内容

     res6.show() 

    +----+-------+---+
    | age|   name|id2|
    +----+-------+---+
    |null|Michael|  1|
    |  30|   Andy|  2|
    |  19| Justin|  3|
    +----+-------+---+

    完成的过程如下:

    scala> val sqlContext = new org.apache.spark.sql.SQLContext(sc) 
    warning: there was one deprecation warning; re-run with -deprecation for details
    sqlContext: org.apache.spark.sql.SQLContext = org.apache.spark.sql.SQLContext@2513155a
    scala> import sqlContext.implicits._
    import sqlContext.implicits._
    scala> val df = sqlContext.read.json("file:///usr/local/spark-2.3.0/examples/src/main/resources/people.json")
    2018-06-25 18:55:30 WARN  ObjectStore:6666 - Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
    2018-06-25 18:55:30 WARN  ObjectStore:568 - Failed to get database default, returning NoSuchObjectException
    2018-06-25 18:55:32 WARN  ObjectStore:568 - Failed to get database global_temp, returning NoSuchObjectException
    df: org.apache.spark.sql.DataFrame = [age: bigint, name: string]
    scala> df.show()
    +----+-------+
    | age|   name|
    +----+-------+
    |null|Michael|
    |  30|   Andy|
    |  19| Justin|
    +----+-------+
    scala> df.withColumn("id2", monotonically_increasing_id()+1)
    res6: org.apache.spark.sql.DataFrame = [age: bigint, name: string ... 1 more field]
    scala> res6.show()
    +----+-------+---+
    | age|   name|id2|
    +----+-------+---+
    |null|Michael|  1|
    |  30|   Andy|  2|
    |  19| Justin|  3|
    +----+-------+---+
  • 相关阅读:
    如何查找本地的ip
    jQuery解析AJAX返回的html数据时碰到的问题与解决
    angularjs之ng-bind和ng-model
    nodejs配置及cmd常用操作
    ID属性值为小数
    DOM对象
    js跨域问题
    加载图片失败,怎样替换为默认图片
    常用前端 网址
    echart字符云之添加点击事件
  • 原文地址:https://www.cnblogs.com/abcdwxc/p/9225855.html
Copyright © 2011-2022 走看看