  • Deploying Spark standalone HA with ZooKeeper

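    A Spark master rarely goes down, but it did happen to me once. An earlier post on Spark standalone touched on standalone HA; this one walks through the deployment steps in detail, and they turn out to be fairly simple.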

    I. Machines

    ZooKeeper ensemble

    zk1:2181
    zk2:2181
    zk3:2181
    

    Spark masters

    spark-m1
    spark-m2
    

    Spark workers

    Several
    

    II. Steps

    1. On spark-m1
    Edit conf/spark-env.sh

    vi conf/spark-env.sh
    export SPARK_MASTER_IP=spark-m1
    export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=zk1:2181,zk2:2181,zk3:2181 -Dspark.deploy.zookeeper.dir=/spark"   
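
    Since the two masters differ only in SPARK_MASTER_IP, the two lines can be generated rather than typed twice. The following is a minimal sketch, not part of the original procedure: the function names build_ha_opts and render_spark_env are hypothetical, and the ZooKeeper addresses and the /spark directory are the ones used in this article. On Spark 2.x and later the variable is named SPARK_MASTER_HOST, so adjust accordingly.

```shell
#!/bin/sh
# Build the SPARK_DAEMON_JAVA_OPTS value for ZooKeeper recovery mode.
# $1: comma-separated ZooKeeper host:port list
# $2: znode directory used by Spark (defaults to /spark, as in the article)
build_ha_opts() {
    zk_url=$1
    zk_dir=${2:-/spark}
    printf '%s %s %s\n' \
        "-Dspark.deploy.recoveryMode=ZOOKEEPER" \
        "-Dspark.deploy.zookeeper.url=${zk_url}" \
        "-Dspark.deploy.zookeeper.dir=${zk_dir}"
}

# Print the two lines to append to conf/spark-env.sh on one master.
# $1: hostname of that master (spark-m1 or spark-m2 in the article)
render_spark_env() {
    master_host=$1
    opts=$(build_ha_opts "zk1:2181,zk2:2181,zk3:2181" "/spark")
    printf 'export SPARK_MASTER_IP=%s\n' "$master_host"
    printf 'export SPARK_DAEMON_JAVA_OPTS="%s"\n' "$opts"
}

# Show what would be written for spark-m1
render_spark_env spark-m1
```

    Redirecting the output with >> conf/spark-env.sh on each master would apply it; printing first makes it easy to review the lines before touching the file.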

    Start the master and the workers

    ./sbin/start-master.sh
    ./sbin/start-slaves.sh

    2. On spark-m2

    Edit conf/spark-env.sh

    vi conf/spark-env.sh
    export SPARK_MASTER_IP=spark-m2
    export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=zk1:2181,zk2:2181,zk3:2181 -Dspark.deploy.zookeeper.dir=/spark"   

    Start the master and the workers

    ./sbin/start-master.sh
    ./sbin/start-slaves.sh

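    III. Verification

    The web UI of spark-m1 shows that master's status (it should be ALIVE, as the current leader)

    The web UI of spark-m2 shows it in the STANDBY state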
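
    The same check can be scripted instead of opening two browser tabs. This is a rough sketch under assumptions that should be verified against your Spark version: that the master web UI listens on port 8080 and serves a JSON summary at /json containing a "status" field. The function names are hypothetical, and the sample JSON is hand-written for the demonstration rather than captured from a real master.

```shell
#!/bin/sh
# Extract the "status" field from JSON text read on stdin.
# Prints a value such as ALIVE or STANDBY, or nothing if the field is absent.
parse_master_status() {
    tr -d '\n' | sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([A-Z_]*\)".*/\1/p'
}

# Query one master over HTTP.
# $1: master hostname; $2: web UI port (8080 is an assumption)
# The /json path is also an assumption about the master web UI.
master_status() {
    curl -s --max-time 5 "http://$1:${2:-8080}/json" | parse_master_status
}

# Offline demonstration with a hand-written sample instead of a live master
sample='{"url":"spark://spark-m2:7077","status":"STANDBY"}'
printf '%s\n' "$sample" | parse_master_status
```

    Against a running cluster, master_status spark-m1 and master_status spark-m2 would be expected to print one ALIVE and one STANDBY.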

    When submitting an application, list both masters in --master:

    --master spark://spark-m1:7077,spark-m2:7077
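
    If the list of masters lives in a deployment script, the URL can be assembled from it. A small sketch, with build_master_url as a hypothetical helper name; 7077 is the port used throughout this article.

```shell
#!/bin/sh
# Join master hostnames into a single spark:// URL.
# $1: master RPC port; remaining arguments: master hostnames
build_master_url() {
    port=$1
    shift
    url="spark://"
    sep=""
    for host in "$@"; do
        url="${url}${sep}${host}:${port}"
        sep=","
    done
    printf '%s\n' "$url"
}

# Prints spark://spark-m1:7077,spark-m2:7077
build_master_url 7077 spark-m1 spark-m2
```

    The result can be passed straight through, for example spark-shell --master "$(build_master_url 7077 spark-m1 spark-m2)".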
    

    Testing with spark-shell

    Start spark-shell on spark-m1

    spark-shell --master spark://spark-m1:7077,spark-m2:7077
    

    Once connected, stop the master on spark-m1 (the script lives in sbin, not bin)

    ./sbin/stop-master.sh
    

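    The spark-shell session does not disconnect; it fails over to the master on spark-m2 and keeps running. The switch takes roughly one minute, during which the workers re-register with spark-m2, and spark-m2 changes to the ALIVE state.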
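
    Because the switch is not instantaneous, a test script has to wait for it rather than check once. The sketch below polls an arbitrary status command until it prints ALIVE or a timeout expires. The name wait_until_alive and the 120-second and 5-second values are illustrative choices, not Spark settings, and fake_probe stands in for a real status check so the example runs without a cluster.

```shell
#!/bin/sh
# Poll a status command until it prints ALIVE or the timeout expires.
# $1: timeout in seconds; $2: polling interval in seconds (must be > 0)
# remaining arguments: the command to run on each poll
wait_until_alive() {
    timeout=$1
    interval=$2
    shift 2
    waited=0
    while [ "$waited" -le "$timeout" ]; do
        if [ "$("$@" 2>/dev/null)" = "ALIVE" ]; then
            echo "ALIVE after ${waited}s"
            return 0
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    echo "still not ALIVE after ${timeout}s" >&2
    return 1
}

# Stand-in for a real status probe, so the sketch runs without a cluster
fake_probe() { echo "ALIVE"; }
wait_until_alive 120 5 fake_probe
```

    In a real test, the probe would be whatever command reports the state of the spark-m2 master, and the timeout should comfortably exceed the roughly one minute observed above.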

    The master log on spark-m2 shows:

    15/08/17 14:45:35 INFO ZooKeeperLeaderElectionAgent: We have gained leadership
    15/08/17 14:45:36 INFO Master: I have been elected leader! New state: RECOVERING
    15/08/17 14:45:36 INFO Master: Trying to recover worker:...
    15/08/17 14:45:36 INFO Master: Trying to recover worker: ...
    15/08/17 14:45:36 INFO Master: Trying to recover worker: ...
    ......
    15/08/17 14:45:36 INFO Master: Worker has been re-registered: worker-...
    15/08/17 14:45:36 INFO Master: Worker has been re-registered: worker-...
    15/08/17 14:45:36 INFO Master: Worker has been re-registered: worker-...
    ...
    15/08/17 14:45:36 INFO Master: Recovery complete - resuming operations!
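
    With many workers the log gets long, so a quick count of the key messages helps confirm that recovery finished and how many workers came back. A minimal sketch that matches only the message texts shown in the log above; summarize_failover is a hypothetical name, and the sample input is a shortened copy of those same lines.

```shell
#!/bin/sh
# Summarise a standalone master log after a failover.
# Reads log text on stdin and prints three counters on one line.
summarize_failover() {
    awk '
        /We have gained leadership/     { elected++ }
        /Worker has been re-registered/ { workers++ }
        /Recovery complete/             { complete++ }
        END {
            printf "elected=%d reregistered_workers=%d recovery_complete=%d\n",
                   elected, workers, complete
        }
    '
}

# Shortened copy of the log lines from the article, used as sample input
sample_log='15/08/17 14:45:35 INFO ZooKeeperLeaderElectionAgent: We have gained leadership
15/08/17 14:45:36 INFO Master: I have been elected leader! New state: RECOVERING
15/08/17 14:45:36 INFO Master: Worker has been re-registered: worker-...
15/08/17 14:45:36 INFO Master: Recovery complete - resuming operations!'
printf '%s\n' "$sample_log" | summarize_failover
```

    On a real master, feed it the master log file on stdin; the re-registered count should match the number of workers in the cluster.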

    Deployment complete.

  • Original article: https://www.cnblogs.com/zhangyunlin/p/6168174.html