Configuring a Spark cluster on Linux

Environment:

Linux

Spark 1.6.0

Hadoop 2.2.0

I. Install Scala (on every machine)
     
1. Download scala-2.11.0.tgz
     
Place it under /opt and extract it: tar -zxvf scala-2.11.0.tgz (a sketch of the full step follows).
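A minimal sketch of this step, assuming the archive is fetched from the Lightbend download site (the URL is an assumption; use whichever mirror you normally download Scala from):

# download the Scala archive (mirror URL is an assumption) and unpack it under /opt
cd /opt
wget https://downloads.lightbend.com/scala/2.11.0/scala-2.11.0.tgz
tar -zxvf scala-2.11.0.tgz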
     
2. As the hadoop user:
     
    vim /etc/profile
3. Add the Scala path to the profile file:
     
export SCALA_HOME=/opt/scala-2.11.0
export PATH=$PATH:$SCALA_HOME/bin
     
4. Reload the profile so the changes take effect:
source /etc/profile
5. Verify that Scala is installed:
    [hadoop@testhdp01 ~]$ scala -version
Scala code runner version 2.10.1 -- Copyright 2002-2013, LAMP/EPFL
Success.
     
     
II. Install Spark
     
1. Build Spark 1.6.0 (the build failed repeatedly on Linux, so I built it on a Mac instead).
     
Change into the Spark source directory and run the following commands as needed (the first is a plain build against Hadoop 2.2 and YARN, the second packages a distributable tarball with SparkR and Hive support, and the third builds with Hive and the Thrift server):
    build/mvn -Pyarn -Phadoop-2.2 -Dhadoop.version=2.2.0 -DskipTests clean package
    ./make-distribution.sh --name custom-spark --tgz -Psparkr -Phadoop-2.2 -Phive -Phive-thriftserver -Pyarn
    mvn -Pyarn -Phadoop-2.2 -Dhadoop.version=2.2.0 -Phive -Phive-thriftserver -DskipTests clean package
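With the distribution built, unpack it under /opt so that the paths used below exist. A minimal sketch, assuming make-distribution.sh names the tarball after the --name flag (the exact file name may differ on your build) and renaming the unpacked directory to match the rest of this guide:

# assumption: the tarball follows the spark-<version>-bin-<name>.tgz pattern
tar -zxvf spark-1.6.0-bin-custom-spark.tgz -C /opt/
mv /opt/spark-1.6.0-bin-custom-spark /opt/spark-1.6.0-bin-hadoop2.2.0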
     
Building with IDEA:
     
2. Configure Spark
     
    cd /opt/spark-1.6.0-bin-hadoop2.2.0/conf
     
    cp spark-env.sh.template spark-env.sh
     
    cp slaves.template slaves
     
    vim spark-env.sh
     
Add the following:
    export SCALA_HOME=/opt/scala-2.10.1
    export JAVA_HOME=/opt/jdk1.7.0_51
    export SPARK_MASTER_IP=192.168.22.7
    export HADOOP_HOME=/opt/hadoop-2.2.0
    export SPARK_HOME=/opt/spark-1.6.0-bin-hadoop2.2.0
    export SPARK_LIBRARY_PATH=.:$JAVA_HOME/lib:$JAVA_HOME/jre/lib:$HADOOP_HOME/lib/native
    export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop/
    export LD_LIBRARY_PATH=$HADOOP_HOME/lib/native
    export SPARK_JAR=$SPARK_HOME/lib/spark-assembly-1.6.0-hadoop2.2.0.jar

On the Mac the configuration is as follows; add these lines at the top of the file:

    #jdk
    export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.7.0_79.jdk/Contents/Home
    export PATH=$PATH:$JAVA_HOME/bin
    
    #scala
    export SCALA_HOME=/usr/local/Cellar/scala-2.10.4
    export PATH=$PATH:$SCALA_HOME/bin
    
    #hadoop
    export HADOOP_HOME=/usr/local/Cellar/hadoop/2.7.2/libexec
    export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
    
    #hive
    export HIVE_HOME=/usr/local/Cellar/hive/2.0.1/libexec
    export SPARK_CLASSPATH=$HIVE_HOME/lib/mysql-connector-java-5.1.28.jar:$SPARK_CLASSPATH
    
    #spark
    export SPARK_HOME=/usr/local/Cellar/spark-1.3.1-bin-hadoop2.6
    export PATH=$PATH:$SPARK_HOME/bin

3. Configure Spark to support Hive

    vim spark-env.sh
    export HIVE_HOME=/opt/apache-hive-0.13.0
    export SPARK_CLASSPATH=$HIVE_HOME/lib/mysql-connector-java-5.1.26.jar:$SPARK_CLASSPATH
Copy hive-site.xml from the Hive conf directory into $SPARK_HOME/conf:
    cp /opt/apache-hive-0.13.0/conf/hive-site.xml conf/
Create a hive.sh file under /etc/profile.d with the following environment variable settings:
    #!/bin/bash
    export HIVE_HOME=/opt/apache-hive-0.13.0
    export PATH=$HIVE_HOME/bin:$PATH
Make the environment variables take effect:
    source /etc/profile.d/hive.sh
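A quick way to check that Spark now sees the Hive metastore (a hedged sanity check, assuming the distribution was built with -Phive so the spark-sql script is available):

# list Hive tables through Spark SQL; succeeds only if hive-site.xml and the JDBC driver are picked up
$SPARK_HOME/bin/spark-sql -e "SHOW TABLES;"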
4. Configure the cluster
Go into Spark's conf directory and edit the slaves file:
vim slaves
Remove localhost and add the worker node hostnames:
    testhdp02
    testhdp03
     
Configure the Spark environment variables system-wide (all three worker nodes need this):
    sudo su - root
    sudo vim /etc/profile
export SPARK_HOME=/opt/spark-1.6.0-bin-hadoop2.2.0
    export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin
     
5. Package the configured Spark directory and send it to the worker nodes (a sketch follows).
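A minimal sketch of this step, assuming passwordless SSH from the master to the workers, that the worker hostnames match the slaves file, and that the hadoop user can write to /opt:

# on the master: pack the configured Spark directory
cd /opt
tar -zcf spark-1.6.0-bin-hadoop2.2.0.tar.gz spark-1.6.0-bin-hadoop2.2.0

# copy it to each worker and unpack it in the same location
for host in testhdp02 testhdp03; do
  scp spark-1.6.0-bin-hadoop2.2.0.tar.gz hadoop@$host:/opt/
  ssh hadoop@$host "cd /opt && tar -zxf spark-1.6.0-bin-hadoop2.2.0.tar.gz"
done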
     
     
     
III. Troubleshooting
     
    bin/spark-shell
Run:
    val textFile = sc.textFile("README.md")
    textFile.count()
     
The following error appears:
     
    Caused by: java.lang.reflect.InvocationTargetException
            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            at java.lang.reflect.Method.invoke(Method.java:606)
            at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
            ... 61 more
    Caused by: java.lang.IllegalArgumentException: Compression codec com.hadoop.compression.lzo.LzoCodec not found.
            at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:135)
            at org.apache.hadoop.io.compress.CompressionCodecFactory.<init>(CompressionCodecFactory.java:175)
            at org.apache.hadoop.mapred.TextInputFormat.configure(TextInputFormat.java:45)
            ... 66 more
    Caused by: java.lang.ClassNotFoundException: Class com.hadoop.compression.lzo.LzoCodec not found
            at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1626)
            at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:128)
            ... 68 more
     
Solution:
Modify the spark-env.sh file:
     
    export SCALA_HOME=/opt/scala-2.10.1
    export JAVA_HOME=/opt/jdk1.7.0_51
    export SPARK_MASTER_IP=192.168.22.7
    export HADOOP_HOME=/opt/hadoop-2.2.0
    export SPARK_HOME=/opt/spark-1.6.0-bin-hadoop2.2.0
    export SPARK_LIBRARY_PATH=.:$JAVA_HOME/lib:$JAVA_HOME/jre/lib:$HADOOP_HOME/lib/native
    export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop/
    export LD_LIBRARY_PATH=$HADOOP_HOME/lib/native
    export SPARK_JAR=$SPARK_HOME/lib/spark-assembly-1.6.0-hadoop2.2.0.jar
    export SPARK_CLASSPATH=$SPARK_CLASSPATH:$HADOOP_HOME/share/hadoop/yarn/*:$HADOOP_HOME/share/hadoop/yarn/lib/*:$HADOOP_HOME/share/hadoop/common/*:$HADOOP_HOME/share/hadoop/common/lib/*:$HADOOP_HOME/share/hadoop/hdfs/*:$HADOOP_HOME/share/hadoop/hdfs/lib/*:$HADOOP_HOME/share/hadoop/mapreduce/*:$HADOOP_HOME/share/hadoop/mapreduce/lib/*:$HADOOP_HOME/share/hadoop/tools/lib/*:$SPARK_HOME/lib/*
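After restarting spark-shell, the same example can be re-run to confirm the codec is now resolved (a hedged re-check that pipes the earlier snippet in non-interactively; it assumes README.md is present in the Spark directory):

# re-run the earlier test; count() should now return without the LzoCodec error
echo 'sc.textFile("README.md").count()' | bin/spark-shell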

     
     
     
     
     
     
     
     
     
     
     