  • Spark study notes: converting an RDD to a DataFrame

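    1. Basic approach: map each record to a tuple, then name the columns with toDF.

    For example: rdd.map(para => (para(0).trim, para(1).trim.toInt)).toDF("name", "age")

    Note: toDF is only available on an RDD after the implicit conversions have been imported.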

     import spark.implicits._  // implicit conversions that provide toDF
     // data is assumed to be an RDD[String] of comma-separated lines with at least 15 fields
     val df1 = data.map(x => x.split(","))
       .map(x => (x(0).trim, x(1).trim, x(2).trim, x(3).trim, x(4).trim, x(5).trim, x(6).trim, x(7).trim,
         x(8).trim.toLong, x(9).trim, x(10).trim, x(11).trim, x(12).trim, x(13).trim, x(14).trim))
       .toDF("xxid", "province_id", "xid", "test", "number", "time", "a", "b",
         "sales", "selection", "add", "game", "begin", "end", "draw")
     // Register the DataFrame as a temporary view so it can be queried with SQL
     df1.createOrReplaceTempView("bjlot")
     spark.sql("select sum(sales) as a from bjlot").createOrReplaceTempView("tmp")
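    The tuple approach can be tried end to end with a much smaller record. The following is a minimal sketch, not the original job: the object name TupleToDF, the helper toPeopleDF, the local[*] master and the two sample lines are all made up for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object TupleToDF {
  // Hypothetical helper: parses "name, age" lines into a two-column DataFrame.
  def toPeopleDF(spark: SparkSession, lines: Seq[String]): DataFrame = {
    import spark.implicits._ // brings toDF into scope for RDDs of tuples
    spark.sparkContext.parallelize(lines)
      .map(_.split(","))
      .map(p => (p(0).trim, p(1).trim.toInt)) // tuple element types become column types
      .toDF("name", "age")                    // column names are supplied explicitly
  }

  def main(args: Array[String]): Unit = {
    // Local session for experimentation only; a real cluster is configured differently.
    val spark = SparkSession.builder()
      .appName("rdd-to-df-tuple")
      .master("local[*]")
      .getOrCreate()

    val df = toPeopleDF(spark, Seq("Alice, 30", "Bob, 25")) // made-up sample data
    df.printSchema()

    // Same pattern as above: register a view, then query it with SQL.
    df.createOrReplaceTempView("people")
    spark.sql("select sum(age) as total_age from people").show()

    spark.stop()
  }
}
```

    With tuples, the column types are inferred from the tuple elements, so any conversion such as toInt or toLong has to happen inside the map.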

    2. Let Spark infer the schema by reflection from a case class, for example:

    // The schema (column names and types) is derived from the case class fields by reflection.
    // The case class must be defined before it is used, outside the method that calls toDF.
    case class bd(shop:String,province:Int,loc:Int,no:Int,ticket_no:String,sale_time:String,chances:Int,multple:Int,sales:Long,selection:String,add:String,game:String,begn:String,end:String,draw_date:String)

    import spark.implicits._

    // Every field is trimmed before conversion; toInt and toLong fail on surrounding whitespace.
    val df2 = data.map(x => x.split(",").map(_.trim))
      .map(x => bd(x(0), x(1).toInt, x(2).toInt, x(3).toInt, x(4), x(5), x(6).toInt, x(7).toInt,
        x(8).toLong, x(9), x(10), x(11), x(12), x(13), x(14)))
      .toDF()
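    A smaller, self-contained version of the case class approach may make the difference from the tuple approach easier to see. This is only a sketch under assumptions: the Person case class, the CaseClassToDF object, the local[*] master and the sample lines are hypothetical and not part of the original notes.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Defined at top level so that Spark can derive an encoder for it.
case class Person(name: String, age: Int)

object CaseClassToDF {
  // Hypothetical helper: parses "name, age" lines into Person records.
  def toPeopleDF(spark: SparkSession, lines: Seq[String]): DataFrame = {
    import spark.implicits._ // provides toDF and the encoder for Person
    spark.sparkContext.parallelize(lines)
      .map(_.split(",").map(_.trim))
      .map(p => Person(p(0), p(1).toInt))
      .toDF() // no column names given: they are taken from the case class fields
  }

  def main(args: Array[String]): Unit = {
    // Local session for experimentation only.
    val spark = SparkSession.builder()
      .appName("rdd-to-df-case-class")
      .master("local[*]")
      .getOrCreate()

    val df = toPeopleDF(spark, Seq("Alice, 30", "Bob, 25")) // made-up sample data
    df.printSchema()
    df.show()

    spark.stop()
  }
}
```

    Here toDF() takes no arguments because both the column names and the column types come from the fields of Person, which keeps the schema in one place.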

      

  • Original article: https://www.cnblogs.com/students/p/13446988.html