  • Checking Which Versions of Hadoop and Other Components Are Compatible with Spark

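    When installing other components alongside Spark, such as the JDK, Hadoop, Yarn, Hive, or Kafka, you need to take into account which versions of those components are compatible with your Spark version. This mapping can be found in the pom.xml file in the Spark source code.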

    1. Download the Spark source code

    Open https://github.com/apache/spark and select a version, for example v2.4.0-rc5. Then click the "Clone or download" button and choose "Download ZIP" below it to download the source.
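    The same download can also be scripted. The following is a minimal sketch, not part of the original steps: the helper names `tag_archive_url` and `download_source` are hypothetical, and the URL pattern assumes GitHub's `archive/refs/tags/<tag>.zip` endpoint for tagged source archives.

```python
import urllib.request


def tag_archive_url(tag, owner="apache", repo="spark"):
    """Build the ZIP archive URL for a tag.

    Assumption: GitHub serves tag archives under archive/refs/tags/<tag>.zip.
    """
    return f"https://github.com/{owner}/{repo}/archive/refs/tags/{tag}.zip"


def download_source(tag, dest=None):
    """Download the source ZIP for `tag` and return the local file path."""
    dest = dest or f"spark-{tag}.zip"
    # urlretrieve returns a (filename, headers) tuple.
    path, _headers = urllib.request.urlretrieve(tag_archive_url(tag), dest)
    return path
```

    Calling `download_source("v2.4.0-rc5")` should save the archive in the current directory, after which it can be unpacked as described in the next step.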

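    2. Inspect the pom.xml file
    After extracting the downloaded archive, open the pom.xml file in its root directory and look at the entries inside the properties tag. They list the versions of the other components that this Spark release is built against by default. For example, <hadoop.version>2.6.5</hadoop.version> means the Hadoop version is 2.6.5. Note that these are default values, and build profiles elsewhere in the same pom.xml may override some of them. The properties section looks like this: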

      <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
        <java.version>1.8</java.version>
        <maven.compiler.source>${java.version}</maven.compiler.source>
        <maven.compiler.target>${java.version}</maven.compiler.target>
        <maven.version>3.5.4</maven.version>
        <sbt.project.name>spark</sbt.project.name>
        <slf4j.version>1.7.16</slf4j.version>
        <log4j.version>1.2.17</log4j.version>
        <hadoop.version>2.6.5</hadoop.version>
        <protobuf.version>2.5.0</protobuf.version>
        <yarn.version>${hadoop.version}</yarn.version>
        <flume.version>1.6.0</flume.version>
        <zookeeper.version>3.4.6</zookeeper.version>
        <curator.version>2.6.0</curator.version>
        <hive.group>org.spark-project.hive</hive.group>
        <!-- Version used in Maven Hive dependency -->
        <hive.version>1.2.1.spark2</hive.version>
        <!-- Version used for internal directory structure -->
        <hive.version.short>1.2.1</hive.version.short>
        <derby.version>10.12.1.1</derby.version>
        <parquet.version>1.10.0</parquet.version>
        <orc.version>1.5.2</orc.version>
        <orc.classifier>nohive</orc.classifier>
        <hive.parquet.version>1.6.0</hive.parquet.version>
        <jetty.version>9.3.24.v20180605</jetty.version>
        <javaxservlet.version>3.1.0</javaxservlet.version>
        <chill.version>0.9.3</chill.version>
        <ivy.version>2.4.0</ivy.version>
        <oro.version>2.0.8</oro.version>
        <codahale.metrics.version>3.1.5</codahale.metrics.version>
        <avro.version>1.8.2</avro.version>
        <avro.mapred.classifier>hadoop2</avro.mapred.classifier>
        <aws.kinesis.client.version>1.8.10</aws.kinesis.client.version>
        <!-- Should be consistent with Kinesis client dependency -->
        <aws.java.sdk.version>1.11.271</aws.java.sdk.version>
        <!-- the producer is used in tests -->
        <aws.kinesis.producer.version>0.12.8</aws.kinesis.producer.version>
        <!--  org.apache.httpcomponents/httpclient-->
        <commons.httpclient.version>4.5.6</commons.httpclient.version>
        <commons.httpcore.version>4.4.10</commons.httpcore.version>
        <!--  commons-httpclient/commons-httpclient-->
        <httpclient.classic.version>3.1</httpclient.classic.version>
        <commons.math3.version>3.4.1</commons.math3.version>
        <!-- managed up from 3.2.1 for SPARK-11652 -->
        <commons.collections.version>3.2.2</commons.collections.version>
        <scala.version>2.11.12</scala.version>
        <scala.binary.version>2.11</scala.binary.version>
        <codehaus.jackson.version>1.9.13</codehaus.jackson.version>
        <fasterxml.jackson.version>2.6.7</fasterxml.jackson.version>
        <fasterxml.jackson.databind.version>2.6.7.1</fasterxml.jackson.databind.version>
        <snappy.version>1.1.7.1</snappy.version>
        <netlib.java.version>1.1.2</netlib.java.version>
        <calcite.version>1.2.0-incubating</calcite.version>
        <commons-codec.version>1.10</commons-codec.version>
        <commons-io.version>2.4</commons-io.version>
        <!-- org.apache.commons/commons-lang/-->
        <commons-lang2.version>2.6</commons-lang2.version>
        <!-- org.apache.commons/commons-lang3/-->
        <commons-lang3.version>3.5</commons-lang3.version>
        <datanucleus-core.version>3.2.10</datanucleus-core.version>
        <janino.version>3.0.9</janino.version>
        <jersey.version>2.22.2</jersey.version>
        <joda.version>2.9.3</joda.version>
        <jodd.version>3.5.2</jodd.version>
        <jsr305.version>1.3.9</jsr305.version>
        <libthrift.version>0.9.3</libthrift.version>
        <antlr4.version>4.7</antlr4.version>
        <jpam.version>1.1</jpam.version>
        <selenium.version>2.52.0</selenium.version>
        <!--
        Managed up from older version from Avro; sync with jackson-module-paranamer dependency version
        -->
        <paranamer.version>2.8</paranamer.version>
        <maven-antrun.version>1.8</maven-antrun.version>
        <commons-crypto.version>1.0.0</commons-crypto.version>
        <!--
        If you are changing Arrow version specification, please check ./python/pyspark/sql/utils.py,
        ./python/run-tests.py and ./python/setup.py too.
        -->
        <arrow.version>0.10.0</arrow.version>
    
        <test.java.home>${java.home}</test.java.home>
        <test.exclude.tags></test.exclude.tags>
        <test.include.tags></test.include.tags>
    
        <!-- Package to use when relocating shaded classes. -->
        <spark.shade.packageName>org.spark_project</spark.shade.packageName>
    
        <!-- Modules that copy jars to the build directory should do so under this location. -->
        <jars.target.dir>${project.build.directory}/scala-${scala.binary.version}/jars</jars.target.dir>
    
        <!-- Allow modules to enable / disable certain build plugins easily. -->
        <build.testJarPhase>prepare-package</build.testJarPhase>
        <build.copyDependenciesPhase>none</build.copyDependenciesPhase>
    
        <!--
          Dependency scopes that can be overridden by enabling certain profiles. These profiles are
          declared in the projects that build assemblies.
    
          For other projects the scope should remain as "compile", otherwise they are not available
          during compilation if the dependency is transivite (e.g. "graphx/" depending on "core/" and
          needing Hadoop classes in the classpath to compile).
        -->
        <flume.deps.scope>compile</flume.deps.scope>
        <hadoop.deps.scope>compile</hadoop.deps.scope>
        <hive.deps.scope>compile</hive.deps.scope>
        <orc.deps.scope>compile</orc.deps.scope>
        <parquet.deps.scope>compile</parquet.deps.scope>
        <parquet.test.deps.scope>test</parquet.test.deps.scope>
    
        <!--
          Overridable test home. So that you can call individual pom files directly without
          things breaking.
        -->
        <spark.test.home>${session.executionRootDirectory}</spark.test.home>
    
        <CodeCacheSize>512m</CodeCacheSize>
      </properties>
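    Rather than reading the list by eye, the properties can be extracted with a short script. The sketch below is an illustration under stated assumptions, not part of the original article: the function names are hypothetical, `SAMPLE` is a shortened copy of the listing above, and only simple `${name}` references between properties are resolved (for example `yarn.version`, which points at `hadoop.version`). References to values defined outside the properties block, such as `${java.home}`, are left untouched.

```python
import re
import xml.etree.ElementTree as ET

# Shortened copy of the <properties> block shown above.
SAMPLE = """
<properties>
  <java.version>1.8</java.version>
  <hadoop.version>2.6.5</hadoop.version>
  <yarn.version>${hadoop.version}</yarn.version>
  <hive.version>1.2.1.spark2</hive.version>
  <scala.version>2.11.12</scala.version>
  <test.java.home>${java.home}</test.java.home>
</properties>
"""

_REF = re.compile(r"\$\{([^}]+)\}")


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def read_properties(xml_text):
    """Return the children of <properties> as a dict of name -> raw text.

    Accepts either a full pom.xml (root <project>, usually namespaced)
    or just a <properties> excerpt.
    """
    root = ET.fromstring(xml_text)
    if local_name(root.tag) == "properties":
        props_elem = root
    else:
        props_elem = next(
            (child for child in root if local_name(child.tag) == "properties"),
            None,
        )
    if props_elem is None:
        return {}
    return {local_name(c.tag): (c.text or "").strip() for c in props_elem}


def resolve(props, max_depth=10):
    """Substitute ${name} references that point at other properties."""
    resolved = dict(props)
    for _ in range(max_depth):
        changed = False
        for key, value in resolved.items():
            # Unknown names are kept as the literal ${name} text.
            new_value = _REF.sub(
                lambda m: resolved.get(m.group(1), m.group(0)), value
            )
            if new_value != value:
                resolved[key] = new_value
                changed = True
        if not changed:
            break
    return resolved


versions = resolve(read_properties(SAMPLE))
for name in ("java.version", "hadoop.version", "yarn.version", "hive.version"):
    print(name, "=", versions[name])
```

    To run it against a real checkout, read the file and pass its contents in, for example `read_properties(open("pom.xml", encoding="utf-8").read())`. The script only reads the top-level properties block and does not account for values overridden by build profiles.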

    Done.

  • Original article: https://www.cnblogs.com/liuys635/p/12371793.html