  • Spark 1.5.2 Installation: Standalone and YARN

    Spark Standalone

    1. Download the scala-2.10.6 package, extract it to your install directory, and add the environment variables:

    #SCALA VARIABLES START
    export SCALA_HOME=/usr/local/scala-2.10.6
    export PATH=$PATH:$SCALA_HOME/bin
    #SCALA VARIABLES END
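
    A download-and-extract sketch, assuming the variables go in /etc/profile; the mirror URL is illustrative and may need adjusting:

    # Fetch and unpack Scala 2.10.6 (mirror URL is an example)
    wget https://downloads.lightbend.com/scala/2.10.6/scala-2.10.6.tgz
    tar -zxf scala-2.10.6.tgz -C /usr/local
    # Reload the profile and confirm the version
    source /etc/profile
    scala -version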
    

    2. Download the Spark-1.5.2 package, extract it to your install directory, and add the environment variables:

    #SPARK VARIABLES START
    export SPARK_HOME=/usr/local/spark-1.5.2
    export PATH=$PATH:$SPARK_HOME/bin
    #SPARK VARIABLES END
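
    The matching check for Spark, assuming the pre-built hadoop2.6 binary package was downloaded (a sketch; adjust the tarball name to the build you actually chose):

    # Unpack the Spark build and rename it to match SPARK_HOME
    tar -zxf spark-1.5.2-bin-hadoop2.6.tgz -C /usr/local
    mv /usr/local/spark-1.5.2-bin-hadoop2.6 /usr/local/spark-1.5.2
    source /etc/profile
    spark-submit --version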
    

    3. Edit the conf/spark-env.sh file:

    export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_66
    export SCALA_HOME=/usr/local/scala-2.10.6
    export HADOOP_HOME=/usr/local/hadoop-2.6.0
    export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
    SPARK_MASTER_IP=10.9.2.100
    SPARK_LOCAL_DIR="/usr/local/spark-1.5.2/tmp"
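
    spark-env.sh ships only as a template, so create it before editing, and make sure the scratch directory referenced above exists (a sketch following Spark's standard convention):

    cd /usr/local/spark-1.5.2
    cp conf/spark-env.sh.template conf/spark-env.sh
    # SPARK_LOCAL_DIR must exist and be writable by the user running Spark
    mkdir -p /usr/local/spark-1.5.2/tmp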
    

    4. Start the cluster (start each daemon by hand like this when the machines' SSH port has been changed, since the sbin/start-all.sh helper reaches workers over SSH)
    Start the master node: sbin/start-master.sh
    Start a worker node: sbin/start-slave.sh spark://10.9.2.100:7077
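
    To confirm that both daemons came up, a quick check (jps ships with the JDK; the web UI ports are Spark's defaults):

    jps    # should list a Master process on 10.9.2.100 and a Worker on each slave
    # Master web UI: http://10.9.2.100:8080, worker web UI: port 8081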
    5. Verify

    # Run locally with two threads (run-example in Spark 1.x reads the master from the MASTER environment variable)
    MASTER=local[2] ./bin/run-example SparkPi 10
    
    # Run on the Spark Standalone cluster
    ./bin/spark-submit \
      --class org.apache.spark.examples.SparkPi \
      --master spark://10.9.2.100:7077 \
      lib/spark-examples-1.5.2-hadoop2.6.0.jar \
      100
    
    # Run in yarn-cluster mode on Spark on YARN (no need to start the master and slaves; a running YARN environment is required)
    ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster lib/spark-examples*.jar 10
    

    Running bin/spark-shell with no arguments starts it in local mode.
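
    To point the shell at the standalone cluster instead, pass the master URL; the commented line is a minimal smoke test to type at the scala> prompt:

    ./bin/spark-shell --master spark://10.9.2.100:7077
    # scala> sc.parallelize(1 to 1000).reduce(_ + _)   // expect 500500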
    6. Troubleshooting:

    15/11/30 16:20:00 ERROR util.SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[sparkWorker-akka.actor.default-dispatcher-6,5,main]
    java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@4a890723 rejected from java.util.concurrent.ThreadPoolExecutor@64992284[Running, pool size = 1, active threads = 0, queued tasks = 0, completed tasks = 1]
            at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2047)
            at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823)
            at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369)
            at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:112)
            at org.apache.spark.deploy.worker.Worker$$anonfun$org$apache$spark$deploy$worker$Worker$$tryRegisterAllMasters$1.apply(Worker.scala:211)
            at org.apache.spark.deploy.worker.Worker$$anonfun$org$apache$spark$deploy$worker$Worker$$tryRegisterAllMasters$1.apply(Worker.scala:210)
            at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
            at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
            at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
            at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
            at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
            at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
            at org.apache.spark.deploy.worker.Worker.org$apache$spark$deploy$worker$Worker$$tryRegisterAllMasters(Worker.scala:210)
            at org.apache.spark.deploy.worker.Worker$$anonfun$org$apache$spark$deploy$worker$Worker$$reregisterWithMaster$1.apply$mcV$sp(Worker.scala:288)
            at org.apache.spark.util.Utils$.tryOrExit(Utils.scala:1119)
            at org.apache.spark.deploy.worker.Worker.org$apache$spark$deploy$worker$Worker$$reregisterWithMaster(Worker.scala:234)
            at org.apache.spark.deploy.worker.Worker$$anonfun$receive$1.applyOrElse(Worker.scala:521)
            at org.apache.spark.deploy.worker.Worker$$anonfun$receive$1.applyOrElse(Worker.scala:521)

    The worker's launch command from the same log (truncated at the start):
    sr/local/spark-1.5.2/lib/datanucleus-rdbms-3.2.9.jar:/usr/local/spark-1.5.2/lib/datanucleus-api-jdo-3.2.6.jar:/usr/local/spark-1.5.2/lib/datanucleus-core-3.2.10.jar:/usr/local/hadoop-2.6.0/etc/hadoop/ -Xms1g -Xmx1g org.apache.spark.deploy.worker.Worker --webui-port 8081 10.9.2.100:7077
    

    Fix:

    In spark-env.sh, change SPARK_MASTER_IP=master to
    SPARK_MASTER_IP=10.9.2.100
    (with the unresolvable hostname master, the worker's registration attempts kept failing and eventually crashed its dispatcher)

    Spark on YARN

    Spark can be deployed on demand here: it does not have to be installed on every node of the cluster, and there is no need to start Spark's own master and slaves services, because once a Spark application is submitted to YARN, YARN takes over scheduling the cluster's resources.
    Just follow steps 1-3 above, except drop the SPARK_MASTER_IP=10.9.2.100 entry from step 3. A submission sketch follows below.
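
    A sketch of the two YARN deploy modes, reusing the example jar from step 5: in yarn-client mode the driver runs on the submitting machine and SparkPi's output prints to the terminal, while in yarn-cluster mode the driver runs inside a YARN ApplicationMaster and the output lands in the YARN logs.

    # Driver stays local; result prints to this terminal
    ./bin/spark-submit --class org.apache.spark.examples.SparkPi \
      --master yarn-client lib/spark-examples-1.5.2-hadoop2.6.0.jar 10

    # Driver runs in the cluster; fetch the result from the YARN logs
    ./bin/spark-submit --class org.apache.spark.examples.SparkPi \
      --master yarn-cluster lib/spark-examples-1.5.2-hadoop2.6.0.jar 10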
