  • A Quick Guide to SparkSQL External Data Sources: Avro

    Download the source and build:

    git clone https://github.com/databricks/spark-avro.git
    sbt/sbt package

    Maven GAV:

    groupId: com.databricks.spark
    artifactId: spark-avro_2.10
    version: 0.1
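
    For an sbt project, the same coordinates could be declared as below. This is a sketch only: it assumes the artifact is resolvable from a repository configured in your build, which the original post does not confirm. If it is not, the locally built jar from the step above can be used instead.

    ```scala
    // build.sbt fragment; coordinates copied from the Maven GAV above.
    // Assumption: the artifact can be resolved from a repository in your build.
    libraryDependencies += "com.databricks.spark" % "spark-avro_2.10" % "0.1"
    ```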

    Add the jar to the classpath in $SPARK_HOME/conf/spark-env.sh:

    export SPARK_CLASSPATH=/home/spark/software/source/spark_package/spark-avro/target/scala-2.10/spark-avro_2.10-0.1.jar:$SPARK_CLASSPATH

    Download the test data:

    wget https://github.com/databricks/spark-avro/raw/master/src/test/resources/episodes.avro 

    Scala API:

    import org.apache.spark.sql.SQLContext
    val sqlContext = new SQLContext(sc)
    import com.databricks.spark.avro._
    val episodes = sqlContext.avroFile("file:///home/spark/software/data/episodes.avro")
    import sqlContext._
    episodes.select('title).collect()

    SQL:

    CREATE TEMPORARY TABLE episodes
    USING com.databricks.spark.avro
    OPTIONS (path "file:///home/spark/software/data/episodes.avro");
    
    select * from episodes;
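
    The SQL statements above can also be issued from the Scala shell through sqlContext.sql. The following is a minimal sketch, assuming a spark-shell session of the same Spark version as the rest of this post (so that `sc` is predefined), the spark-avro jar on the classpath, and the test file at the path used above.

    ```scala
    // Intended for spark-shell, where `sc` (the SparkContext) already exists.
    import org.apache.spark.sql.SQLContext

    val sqlContext = new SQLContext(sc)

    // Register the Avro file as a temporary table via the data source DDL.
    // The statement is passed without the trailing semicolon.
    sqlContext.sql(
      """CREATE TEMPORARY TABLE episodes
        |USING com.databricks.spark.avro
        |OPTIONS (path "file:///home/spark/software/data/episodes.avro")""".stripMargin)

    // Query the table and print each row on the driver.
    sqlContext.sql("SELECT * FROM episodes").collect().foreach(println)
    ```

    Going through sqlContext.sql keeps the table definition and the query in the same session as the Scala API example, so both styles can be mixed.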
  • Original article: https://www.cnblogs.com/luogankun/p/4181873.html