  • [Original] Experience sharing (55): errors when connecting Spark to Kudu

    spark-2.4.2
    kudu-1.7.0


    Attempts

    1) Manually add the jar to the classpath

    spark-2.4.2-bin-hadoop2.6
    +
    kudu-spark2_2.11-1.7.0-cdh5.16.1.jar

    # bin/spark-shell
    scala> val df = spark.read.options(Map("kudu.master" -> "master:7051", "kudu.table" -> "impala::test.tbl_test")).format("kudu").load
    java.lang.ClassNotFoundException: Failed to find data source: kudu. Please find packages at http://spark.apache.org/third-party-projects.html
      at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:660)
      at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:194)
      at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:167)
      ... 49 elided
    Caused by: java.lang.ClassNotFoundException: kudu.DefaultSource
      at scala.reflect.internal.util.AbstractFileClassLoader.findClass(AbstractFileClassLoader.scala:72)
      at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
      at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
      at org.apache.spark.sql.execution.datasources.DataSource$.$anonfun$lookupDataSource$5(DataSource.scala:634)
      at scala.util.Try$.apply(Try.scala:213)
      at org.apache.spark.sql.execution.datasources.DataSource$.$anonfun$lookupDataSource$4(DataSource.scala:634)
      at scala.util.Failure.orElse(Try.scala:224)
      at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:634)
      ... 51 more
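    The failure above looks expected for kudu-spark 1.7.0: to my knowledge the short data-source name "kudu" was only registered in later kudu-spark releases, so with only the 1.7.0 jar on the classpath the source has to be addressed by its full package name (or through the implicit `.kudu` reader). A hedged sketch, reusing the master and table from above:

```scala
// Sketch for spark-shell with kudu-spark2_2.11-1.7.0 on the classpath,
// where the short name "kudu" is (presumably) not yet registered.
import org.apache.kudu.spark.kudu._  // also provides the implicit .kudu reader

val df = spark.read
  .options(Map(
    "kudu.master" -> "master:7051",             // master from the post
    "kudu.table"  -> "impala::test.tbl_test"))  // table from the post
  .format("org.apache.kudu.spark.kudu")         // full name instead of "kudu"
  .load
```

    Note that this still requires a Scala 2.11 Spark build; on spark-2.4.2, which is built with Scala 2.12, the 2.11 jar fails as shown in step 3 below.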

    2) Use the official approach (with the Kudu version changed to 1.7.0)

    spark-2.4.2-bin-hadoop2.6

    # bin/spark-shell --packages org.apache.kudu:kudu-spark2_2.11:1.7.0

    Same error (ClassNotFoundException: kudu.DefaultSource) as above.

    3) Use the official approach (as documented, unmodified)

    spark-2.4.2-bin-hadoop2.6

    # bin/spark-shell --packages org.apache.kudu:kudu-spark2_2.11:1.9.0
    scala> val df = spark.read.options(Map("kudu.master" -> "master:7051", "kudu.table" -> "impala::test.tbl_test")).format("kudu").load
    java.lang.NoClassDefFoundError: scala/Product$class
      at org.apache.kudu.spark.kudu.Upsert$.<init>(OperationType.scala:41)
      at org.apache.kudu.spark.kudu.Upsert$.<clinit>(OperationType.scala)
      at org.apache.kudu.spark.kudu.DefaultSource$$anonfun$getOperationType$2.apply(DefaultSource.scala:217)
      at org.apache.kudu.spark.kudu.DefaultSource$$anonfun$getOperationType$2.apply(DefaultSource.scala:217)
      at scala.Option.getOrElse(Option.scala:138)
      at org.apache.kudu.spark.kudu.DefaultSource.getOperationType(DefaultSource.scala:217)
      at org.apache.kudu.spark.kudu.DefaultSource.createRelation(DefaultSource.scala:104)
      at org.apache.kudu.spark.kudu.DefaultSource.createRelation(DefaultSource.scala:87)
      at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:318)
      at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:223)
      at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:211)
      at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:167)
      ... 49 elided
    Caused by: java.lang.ClassNotFoundException: scala.Product$class
      at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
      at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
      at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
      ... 61 more

    This looks like a Scala version conflict. The Spark download page has this note:

    Note that, Spark is pre-built with Scala 2.11 except version 2.4.2, which is pre-built with Scala 2.12.
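    The NoClassDefFoundError: scala/Product$class above is the classic symptom of loading a _2.11 artifact on a Scala 2.12 runtime: Scala 2.11 compiled trait method bodies into synthetic Foo$class classes, which the 2.12 compiler no longer emits. A quick way to check which Scala version a runtime was built with (the same expression works inside spark-shell), as a minimal sketch:

```scala
// Plain Scala, no Spark needed: print the version of the current Scala
// runtime. Inside spark-shell this reveals which Scala the distribution
// was built against; the first two components ("2.11" / "2.12") must
// match the artifact suffix (kudu-spark2_2.11 vs kudu-spark2_2.12).
object ScalaVersionCheck extends App {
  val v = scala.util.Properties.versionNumberString // e.g. "2.12.8"
  println(v)
}
```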

    4) Switch kudu-spark to Scala 2.12

    spark-2.4.2-bin-hadoop2.6

    # bin/spark-shell --packages org.apache.kudu:kudu-spark2_2.12:1.9.0

            ::::::::::::::::::::::::::::::::::::::::::::::
    
            ::          UNRESOLVED DEPENDENCIES         ::
    
            ::::::::::::::::::::::::::::::::::::::::::::::
    
            :: org.apache.kudu#kudu-spark2_2.12;1.9.0: not found
    
            ::::::::::::::::::::::::::::::::::::::::::::::

    No kudu-spark2_2.12 artifact exists for 1.9.0, so download Spark 2.4.3 instead (which, per the note above, is pre-built with Scala 2.11).

    5) Use the official approach (continued)

    spark-2.4.3-bin-hadoop2.6

    # bin/spark-shell --packages org.apache.kudu:kudu-spark2_2.11:1.9.0
    scala> val df = spark.read.options(Map("kudu.master" -> "master:7051", "kudu.table" -> "impala::test.tbl_test")).format("kudu").load
    df: org.apache.spark.sql.DataFrame = [order_no: string, id: bigint ... 28 more fields]

    It works now.
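    With the working combination (Spark 2.4.3 built with Scala 2.11 + kudu-spark2_2.11:1.9.0), writes can go through KuduContext as well; a minimal sketch, reusing the shell session, master, and table from above:

```scala
import org.apache.kudu.spark.kudu.KuduContext

// Reuses the spark session and the df loaded above.
val kuduContext = new KuduContext("master:7051", spark.sparkContext)

// Upsert df's rows back into the same Kudu table (illustrative only).
kuduContext.upsertRows(df, "impala::test.tbl_test")
```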

    6) Use the official approach (with the Kudu version changed back to 1.7.0)

    spark-2.4.3-bin-hadoop2.6

    # bin/spark-shell --packages org.apache.kudu:kudu-spark2_2.11:1.7.0

    Same error as in 2).

    So it appears that connecting Spark to Kudu this way only works with a Scala 2.11 Spark build plus kudu-spark2_2.11:1.9.0; kudu-spark 1.7.0 fails either way.

    References:
    https://kudu.apache.org/docs/developing.html
    http://spark.apache.org/downloads.html

  • Original post: https://www.cnblogs.com/barneywill/p/10840608.html