  • Spark: Casting all DataFrame columns to double

    1. Casting a single column

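    import org.apache.spark.sql.types._
    // Sample data: every value is a string (`spark` is the session provided by spark-shell)
    val data = Array(("1", "2", "3", "4", "5"), ("6", "7", "8", "9", "10"))
    val df = spark.createDataFrame(data).toDF("col1", "col2", "col3", "col4", "col5")
    
    import org.apache.spark.sql.functions._
    // Cast col1 to double; select returns only the columns passed to it
    df.select(col("col1").cast(DoubleType)).show()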
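
    Note that select returns only the column passed to it. As a minimal sketch (the local SparkSession setup and the names onlyCol1, col1Replaced and viaString are assumptions for illustration), withColumn can be used instead when the other columns should be kept, and cast also accepts the type name as a string:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.DoubleType

// Assumption: a local session for illustration; in spark-shell `spark` already exists.
val spark = SparkSession.builder().appName("CastSingleColumn").master("local[1]").getOrCreate()

val data = Seq(("1", "2", "3", "4", "5"), ("6", "7", "8", "9", "10"))
val df = spark.createDataFrame(data).toDF("col1", "col2", "col3", "col4", "col5")

// select keeps only the column that was passed in.
val onlyCol1 = df.select(col("col1").cast(DoubleType))

// withColumn replaces col1 and leaves col2..col5 untouched (still strings).
val col1Replaced = df.withColumn("col1", col("col1").cast(DoubleType))

// cast also accepts the type name as a string.
val viaString = df.withColumn("col1", col("col1").cast("double"))

// Print the schema to check which columns changed type.
col1Replaced.printSchema()
```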
    

    2. Casting every column in a loop

    val colNames = df.columns
    
    // Reassign df1 on each iteration, replacing one column with its double-typed version
    var df1 = df
    for (colName <- colNames) {
      df1 = df1.withColumn(colName, col(colName).cast(DoubleType))
    }
    df1.show()
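
    The same loop can be written without a var by folding over the column names. This is only a sketch of an alternative style, not the original author's method; the session setup and the name allDouble are assumptions. The Spark API documentation for withColumn notes that calling it many times in a loop can produce large query plans, so for wide tables a single select over all columns may be preferable.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.DoubleType

// Assumption: a local session for illustration; in spark-shell `spark` already exists.
val spark = SparkSession.builder().appName("CastWithFold").master("local[1]").getOrCreate()

val data = Seq(("1", "2", "3", "4", "5"), ("6", "7", "8", "9", "10"))
val df = spark.createDataFrame(data).toDF("col1", "col2", "col3", "col4", "col5")

// Start from df and apply one withColumn per column name.
// Each step returns a new DataFrame, so no mutable variable is needed.
val allDouble = df.columns.foldLeft(df) { (acc, name) =>
  acc.withColumn(name, col(name).cast(DoubleType))
}

allDouble.printSchema()
```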
    

    3. Using : _* (varargs expansion)

    // Build one cast expression per column, then expand the array into select's varargs
    val cols = colNames.map(f => col(f).cast(DoubleType))
    df.select(cols: _*).show()
    
    +----+----+----+----+----+
    |col1|col2|col3|col4|col5|
    +----+----+----+----+----+
    | 1.0| 2.0| 3.0| 4.0| 5.0|
    | 6.0| 7.0| 8.0| 9.0|10.0|
    +----+----+----+----+----+
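
    The : _* annotation is plain Scala rather than anything Spark-specific: it passes a sequence to a method that declares a repeated (varargs) parameter. A small sketch with a hypothetical method joinNames, which is not part of any library:

```scala
// Hypothetical varargs method: `String*` accepts zero or more String arguments.
def joinNames(names: String*): String = names.mkString(",")

val colNames = Array("col1", "col2", "col3")

// Passing the arguments one by one.
val direct = joinNames("col1", "col2", "col3")

// Passing a whole array: `: _*` expands it into the varargs parameter.
val expanded = joinNames(colNames: _*)

println(expanded)
```

    Without the annotation, joinNames(colNames) does not compile, because an Array[String] is not a String.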
    
    

    Selecting several specific columns, and casting only those columns:

    val name = "col1,col3,col5"
    // Select only the listed columns, types unchanged
    df.select(name.split(",").map(c => col(c)): _*).show()
    // Select the listed columns and cast each one to double
    df.select(name.split(",").map(c => col(c).cast(DoubleType)): _*).show()
    
    +----+----+----+
    |col1|col3|col5|
    +----+----+----+
    |   1|   3|   5|
    |   6|   8|  10|
    +----+----+----+
    
    +----+----+----+
    |col1|col3|col5|
    +----+----+----+
    | 1.0| 3.0| 5.0|
    | 6.0| 8.0|10.0|
    +----+----+----+
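
    Both calls above drop the columns that are not listed. If the goal is to cast the listed columns and keep the rest unchanged, one possible sketch is to map over all columns and cast conditionally. The names targets, projected and mixed, and the session setup, are assumptions for illustration:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.{DoubleType, StringType}

// Assumption: a local session for illustration; in spark-shell `spark` already exists.
val spark = SparkSession.builder().appName("CastSelected").master("local[1]").getOrCreate()

val data = Seq(("1", "2", "3", "4", "5"), ("6", "7", "8", "9", "10"))
val df = spark.createDataFrame(data).toDF("col1", "col2", "col3", "col4", "col5")

// Columns to cast, parsed from a comma-separated string as in the text above.
val targets = "col1,col3,col5".split(",").toSet

// One expression per column: cast if listed, otherwise pass through unchanged.
// The explicit alias keeps the original column name on the cast expression.
val projected = df.columns.map { name =>
  if (targets.contains(name)) col(name).cast(DoubleType).as(name) else col(name)
}

val mixed = df.select(projected: _*)
mixed.printSchema()
```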
    
    

    Complete code for the above:

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.types._
    
    object ChangeAllColDatatypes {
    
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("ChangeAllColDatatypes").master("local").getOrCreate()
        val data = Array(("1", "2", "3", "4", "5"), ("6", "7", "8", "9", "10"))
        val df = spark.createDataFrame(data).toDF("col1", "col2", "col3", "col4", "col5")
    
        import org.apache.spark.sql.functions._
        df.select(col("col1").cast(DoubleType)).show()
    
        val colNames = df.columns
    
        var df1 = df
        for (colName <- colNames) {
          df1 = df1.withColumn(colName, col(colName).cast(DoubleType))
        }
        df1.show()
    
        val cols = colNames.map(f => col(f).cast(DoubleType))
        df.select(cols: _*).show()
        val name = "col1,col3,col5"
        df.select(name.split(",").map(c => col(c)): _*).show()
        df.select(name.split(",").map(c => col(c).cast(DoubleType)): _*).show()
    
      }
    }

    Original source of the above: 董可伦

  • Original article: https://www.cnblogs.com/aixing/p/13327350.html