  • Building and installing Spark on CDH5 [spark-1.0.2, hadoop 2.3.0, cdh5.1.0]

    Prerequisite: Hadoop must already be installed. I am using hadoop-2.3.0-cdh5.1.0.

    1. Download the Maven package.

    2. Set the M2_HOME environment variable and add Maven's bin directory to PATH.

    3. export MAVEN_OPTS="-Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=512m"
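    Step 2 might look like the following in ~/.bashrc. The install path /opt/apache-maven is an assumption for illustration; substitute the directory where Maven was actually unpacked.

```shell
# Hypothetical Maven location; adjust to your own install path.
export M2_HOME=/opt/apache-maven
# Put Maven's bin directory at the front of PATH so `mvn` resolves to it.
export PATH="$M2_HOME/bin:$PATH"
```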

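    4. Download the spark-1.0.2.gz archive from the official site and extract it.

    5. Change into the extracted Spark directory.

    6. Run ./make-distribution.sh --hadoop 2.3.0-cdh5.1.0 --with-yarn --tgz

    7. Wait. The build takes a long time.

    8. When it finishes, spark-1.0.2-bin-2.3.0-cdh5.1.0.tgz is generated in the current directory.

    9. Copy it to the installation directory and extract it.

    10. Edit the configuration files under conf.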
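    The archive name in step 8 follows from the Spark version and the value passed to --hadoop. A small sketch that builds the expected name and checks for the file; the variable names are illustrative only.

```shell
SPARK_VERSION=1.0.2
HADOOP_VERSION=2.3.0-cdh5.1.0   # same value as given to --hadoop
DIST_TGZ="spark-${SPARK_VERSION}-bin-${HADOOP_VERSION}.tgz"

# Report whether the build left the expected archive in the current directory.
if [ -f "$DIST_TGZ" ]; then
  echo "build produced $DIST_TGZ"
else
  echo "missing $DIST_TGZ; check the build output" >&2
fi
```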

    cp spark-env.sh.template spark-env.sh

    vim spark-env.sh

    Set the parameters to match your own environment:

    export JAVA_HOME=/home/hadoop/jdk
    export HADOOP_HOME=/home/hadoop/hadoop-2.3.0-cdh5.1.0
    export HADOOP_CONF_DIR=/home/hadoop/hadoop-2.3.0-cdh5.1.0/etc/hadoop
    export SPARK_YARN_APP_NAME=spark-on-yarn
    export SPARK_EXECUTOR_INSTANCES=1
    export SPARK_EXECUTOR_CORES=2
    export SPARK_EXECUTOR_MEMORY=3500m
    export SPARK_DRIVER_MEMORY=3500m
    export SPARK_MASTER_IP=master
    export SPARK_MASTER_PORT=7077
    export SPARK_WORKER_CORES=2
    export SPARK_WORKER_MEMORY=3500m
    export SPARK_WORKER_INSTANCES=1
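    Since these paths differ from machine to machine, it can help to confirm that they exist before starting anything. A minimal sketch; check_dirs is a hypothetical helper, not part of Spark.

```shell
# check_dirs DIR...
# Reports every argument that is not an existing directory.
# Returns 0 when all exist, 1 otherwise.
check_dirs() {
  rc=0
  for d in "$@"; do
    if [ ! -d "$d" ]; then
      echo "missing directory: $d" >&2
      rc=1
    fi
  done
  return "$rc"
}

# Usage, after sourcing spark-env.sh:
#   check_dirs "$JAVA_HOME" "$HADOOP_HOME" "$HADOOP_CONF_DIR"
```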

    11. List the worker hostnames in conf/slaves

    slave01
    slave02
    slave03
    slave04
    slave05

    12. Distribute

    Copy the Spark installation directory to every slave node.
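    One way to do the copy is to loop over the hosts in conf/slaves. The sketch below only prints the commands (a dry run). The use of rsync, the hadoop user, and the path /home/hadoop/spark are assumptions; print_sync_cmds is a hypothetical helper.

```shell
# print_sync_cmds SLAVES_FILE SPARK_DIR
# Prints one rsync command per host listed in SLAVES_FILE, skipping blank
# lines and comment lines. Nothing is copied; pipe the output to `sh` to run it.
print_sync_cmds() {
  slaves_file=$1
  spark_dir=$2
  while IFS= read -r host || [ -n "$host" ]; do
    case "$host" in
      ''|'#'*) continue ;;
    esac
    echo "rsync -a $spark_dir/ hadoop@$host:$spark_dir/"
  done < "$slaves_file"
}

# Usage:
#   print_sync_cmds conf/slaves /home/hadoop/spark
```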

    13. Start the cluster

    sbin/start-all.sh

    14. Run an example

    $SPARK_HOME/bin/spark-submit --class org.apache.spark.examples.SparkPi \
        --master yarn-client \
        --num-executors 3 \
        --driver-memory 4g \
        --executor-memory 2g \
        --executor-cores 1 \
        /home/hadoop/spark/lib/spark-examples-1.0.2-hadoop2.3.0-cdh5.1.0.jar \
        100

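    15. Submitting the example failed

    Opening the application logs from the YARN monitoring UI showed these errors repeated over and over: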

    INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 0 time(s).

    INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 0 time(s).

    INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 0 time(s).

    INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 0 time(s).
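    The telltale detail is the server address 0.0.0.0 rather than a real hostname. A quick way to count such lines in a saved copy of the log; count_wildcard_rm_retries is a hypothetical helper and the log file name in the usage line is made up.

```shell
# count_wildcard_rm_retries LOGFILE
# Prints the number of lines where the IPC client retries a server
# address of 0.0.0.0 (any port).
count_wildcard_rm_retries() {
  grep -c 'Retrying connect to server: 0\.0\.0\.0/0\.0\.0\.0:' "$1"
}

# Usage:
#   count_wildcard_rm_retries container.log
```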

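    16. Tracking down the cause

    Copy the Spark core jar from Spark's lib directory to a local machine and look inside it. It contains a yarn-default.xml file, and opening that file shows: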

      <!-- Resource Manager Configs -->
      <property>
        <description>The hostname of the RM.</description>
        <name>yarn.resourcemanager.hostname</name>
        <value>0.0.0.0</value>
      </property> 

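    This explains the failure. With the default value 0.0.0.0 the process looks for the ResourceManager on its own machine (port 8030 is the ResourceManager's default scheduler port, and its address is derived from this hostname). Unless the process happens to run on the ResourceManager node itself, it can never connect.

    17. Change the setting as follows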

      <!-- Resource Manager Configs -->
      <property>
        <description>The hostname of the RM.</description>
        <name>yarn.resourcemanager.hostname</name>
        <value>master</value>
      </property> 

    18. Repackage the jar and redistribute Spark to every node
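    Steps 16 to 18 could be scripted roughly as below. This is a sketch: set_rm_hostname is a hypothetical helper that assumes the <value> line directly follows the <name> line, as in the excerpt above. The jar commands are left as comments because the jar file name is an assumption; use the actual core jar found in your lib directory.

```shell
# set_rm_hostname FILE HOST
# Rewrites the <value> on the line following the
# yarn.resourcemanager.hostname <name> line, editing FILE in place.
set_rm_hostname() {
  file=$1
  host=$2
  tmp=$(mktemp)
  sed -e '/<name>yarn\.resourcemanager\.hostname<\/name>/{' \
      -e 'n' \
      -e "s|<value>.*</value>|<value>$host</value>|" \
      -e '}' "$file" > "$tmp" && mv "$tmp" "$file"
}

# Hypothetical end-to-end use (jar name assumed):
#   jar xf spark-assembly-1.0.2-hadoop2.3.0-cdh5.1.0.jar yarn-default.xml
#   set_rm_hostname yarn-default.xml master
#   jar uf spark-assembly-1.0.2-hadoop2.3.0-cdh5.1.0.jar yarn-default.xml
```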

  • Original post: https://www.cnblogs.com/ningbj/p/3939888.html