  • Building a fat jar with sbt assembly for spark-submit cluster mode

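    When submitting a job with spark-submit, a jar built with sbt package (which contains only the project's own classes, not its dependencies) ran fine in client mode. In cluster mode, however, it kept failing with: Exception in thread "main" java.lang.ClassNotFoundException. So I decided to use the sbt-assembly plugin to bundle all dependencies into a single jar.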

    My project layout:

      myProject/build.sbt

      myProject/project/assembly.sbt

      myProject/src/main/scala/com/lasclocker/java/SparkGopProcess.java

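    The com/lasclocker/java part of the last path (shown in brown in the original post) is the package name of the Java source file.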

    Contents of build.sbt:

    lazy val root = (project in file(".")).
      settings(
        name := "my-project",
        version := "1.0",
        scalaVersion := "2.11.7",
        mainClass in Compile := Some("com.lasclocker.java.SparkGopProcess")  // the main class name
      )
    
    autoScalaLibrary := false // exclude scala library

    libraryDependencies += "org.apache.spark" %% "spark-core" % "1.4.1" % "provided" // exclude spark library

    // third-party dependency jars; I put them directly in myProject/custom_spark_lib
    unmanagedBase := baseDirectory.value / "custom_spark_lib"

    // META-INF discarding
    mergeStrategy in assembly <<= (mergeStrategy in assembly) { (old) =>
      {
        case PathList("META-INF", xs @ _*) => MergeStrategy.discard
        case x => MergeStrategy.first
      }
    }

    The custom_spark_lib directory contains these jars: guava-10.0.1.jar and hadoopCustomInputFormat.jar.
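
    The META-INF rule in build.sbt uses the older `<<=` operator, which newer sbt releases no longer accept. As a rough sketch, assuming sbt-assembly 0.12.0 or later (where the key is named assemblyMergeStrategy), the same two rules could be written with `:=` like this. I have not run it against this exact project, so treat it as a starting point:

    ```scala
    // build.sbt fragment (sketch): the same merge rules, written with := instead of <<=
    // Assumption: sbt-assembly 0.12.0+ provides the assemblyMergeStrategy key.
    assemblyMergeStrategy in assembly := {
      case PathList("META-INF", xs @ _*) => MergeStrategy.discard // drop everything under META-INF
      case _                             => MergeStrategy.first   // on any other conflict, keep the first copy
    }
    ```

    As in the original rule, MergeStrategy.first silently picks one copy of every conflicting file, which is convenient but can hide real version conflicts between jars.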

    Contents of assembly.sbt:

    addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.12.0")

    From the myProject directory, run:

    sbt clean assembly

    This produces the fat jar target/scala-2.11/my-project-assembly-1.0.jar.
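
    The file name above follows sbt-assembly's default pattern of the project name, then "-assembly-", then the version. If a different name is preferred, a setting along these lines should work, assuming sbt-assembly 0.12.0 or later; the jar name here is hypothetical and only for illustration:

    ```scala
    // build.sbt fragment (sketch): override the default <name>-assembly-<version>.jar file name.
    // "spark-gop-process.jar" is a made-up name, not one used in the original project.
    assemblyJarName in assembly := "spark-gop-process.jar"
    ```

    With this setting, the jar path passed to spark-submit has to be changed to match.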

    Finally, here is my shell script for spark-submit in cluster mode (the IP addresses are masked as xx):

    inPath=/LPR
    outPath=/output
    minPartitionNum=4
    sparkURL=spark://xx.xx.xx.xx:7077
    hdfsFile=hdfs://xx.xx.xx.xx:9000/user/root
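    ldLib=/opt/hadoop/lib # native libraries go here, e.g. the .so files used through JNI

    # ${yourAppClass} stands for the application's main class,
    # e.g. com.lasclocker.java.SparkGopProcess for the build above.
    spark-submit \
     --class ${yourAppClass} \
     --master ${sparkURL} \
     --driver-library-path $ldLib \
     --deploy-mode cluster \
     $hdfsFile/my-project-assembly-1.0.jar $inPath $outPath $minPartitionNum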

    References: sbt-assembly, How to build an Uber JAR (Fat JAR) using SBT within IntelliJ IDEA?

  • Original article: https://www.cnblogs.com/lasclocker/p/4718687.html