Compiling the Spark source and setting up a source-reading environment
(I) Compiling the Spark source
1. Switch Maven's download mirror:
<mirrors>
  <!-- Aliyun repository -->
  <mirror>
    <id>alimaven</id>
    <mirrorOf>central</mirrorOf>
    <name>aliyun maven</name>
    <url>http://maven.aliyun.com/nexus/content/repositories/central/</url>
  </mirror>
  <!-- Central repository 1 -->
  <mirror>
    <id>repo1</id>
    <mirrorOf>central</mirrorOf>
    <name>Human Readable Name for this Mirror.</name>
    <url>http://repo1.maven.org/maven2/</url>
  </mirror>
  <!-- Central repository 2 -->
  <mirror>
    <id>repo2</id>
    <mirrorOf>central</mirrorOf>
    <name>Human Readable Name for this Mirror.</name>
    <url>http://repo2.maven.org/maven2/</url>
  </mirror>
</mirrors>
2. Run the build command
$ export MAVEN_OPTS="-Xmx2g -XX:ReservedCodeCacheSize=512m"
$ mvn -Pyarn -Phadoop-2.7 -Pspark-ganglia-lgpl -Pkinesis-asl -Phive -DskipTests clean package
(This takes a little over an hour, depending on network speed.)
3. Build the deployment package
$ export MAVEN_OPTS="-Xmx2g -XX:ReservedCodeCacheSize=512m"
$ ./dev/make-distribution.sh --name custom-spark --tgz -Psparkr -Phadoop-2.7 -Phive -Phive-thriftserver -Pyarn
(This produces spark-2.1.0-bin-custom-spark.tgz.)
(II) Setting up the reading environment
1. Import the Spark source built in the previous step into IDEA
2. Resolving the exception
Solution: in the built Spark directory, unpack the jar external/flume-sink/target/spark-streaming-flume-sink_2.11-2.1.0-sources.jar.
Then copy the files under target/spark-streaming-flume-sink_2.11-2.1.0-sources/org/apache/spark/streaming/flume/sink in the unpacked output into
external/flume-sink/src/main/scala/org/apache/spark/streaming/flume/sink, and run Rebuild.
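The unpack-and-copy step can also be done from a terminal. The following is only a rough sketch under some assumptions: the helper name restore_flume_sink_sources is made up for illustration, the defaults 2.1.0 and 2.11 are taken from the jar name used in this walkthrough, and the unzip tool is assumed to be installed. Adjust the versions to match your own build.

```shell
#!/bin/sh
# Sketch only: restore_flume_sink_sources is a hypothetical helper name.
# Usage: restore_flume_sink_sources /path/to/spark [spark_version] [scala_version]
restore_flume_sink_sources() {
  spark_root="$1"
  spark_version="${2:-2.1.0}"   # assumption: the Spark version that was built
  scala_version="${3:-2.11}"    # assumption: Scala binary version in the jar name
  module="$spark_root/external/flume-sink"
  name="spark-streaming-flume-sink_${scala_version}-${spark_version}-sources"
  jar="$module/target/$name.jar"
  pkg="org/apache/spark/streaming/flume/sink"

  # Stop early if the build did not produce the sources jar.
  if [ ! -f "$jar" ]; then
    echo "sources jar not found: $jar" >&2
    return 1
  fi

  # Unpack the sources jar into a directory next to it, under target/.
  unzip -o -q "$jar" -d "$module/target/$name" || return 1

  # Copy the unpacked package contents into the module's source tree.
  mkdir -p "$module/src/main/scala/$pkg"
  cp "$module/target/$name/$pkg/"* "$module/src/main/scala/$pkg/"
}
```

Calling restore_flume_sink_sources with the path of the Spark checkout performs the two steps; Rebuild still has to be run in IDEA afterwards. If the jar is missing, the function prints a message and returns a non-zero status without touching the source tree.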
3. The reading environment is now ready. To verify it, run the LocalPi example that ships with Spark.