  • Spark standalone HA deployment with ZooKeeper

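    A Spark master rarely goes down, but it did happen to me once. I touched on standalone HA in an earlier article about Spark standalone; here is the deployment procedure in detail. It is actually fairly simple.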

    I. Machines

    ZooKeeper cluster

    zk1:2181
    zk2:2181
    zk3:2181
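
    Before starting the masters it can help to confirm that every ZooKeeper server is reachable. The sketch below is not part of the original procedure: `zk_ok` and `check_zk_cluster` are hypothetical helper names, and it assumes `nc` is installed and that the ZooKeeper `ruok` four-letter command is enabled (on ZooKeeper 3.5 and later it has to be whitelisted via `4lw.commands.whitelist`).

```shell
# Hypothetical helper: succeed if one ZooKeeper server (host:port)
# answers "imok" to the "ruok" four-letter command.
zk_ok() {
  local host="${1%%:*}" port="${1##*:}"
  [ "$(echo ruok | nc -w 2 "$host" "$port" 2>/dev/null)" = "imok" ]
}

# Hypothetical helper: check every server given as an argument,
# print one line per server, and return non-zero if any check failed.
check_zk_cluster() {
  local rc=0 server
  for server in "$@"; do
    if zk_ok "$server"; then
      echo "$server ok"
    else
      echo "$server FAILED"
      rc=1
    fi
  done
  return "$rc"
}

# Usage:
#   check_zk_cluster zk1:2181 zk2:2181 zk3:2181
```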
    

    Spark masters

    spark-m1
    spark-m2
    

    Spark workers

    Several
    

    II. Steps

    1. Log in to spark-m1
    Edit conf/spark-env.sh (from Spark 2.0 on, SPARK_MASTER_IP is named SPARK_MASTER_HOST):

    vi conf/spark-env.sh
    export SPARK_MASTER_IP=spark-m1
    export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=zk1:2181,zk2:2181,zk3:2181 -Dspark.deploy.zookeeper.dir=/spark"   
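
    Both masters need exactly the same recovery options, so the string can be generated rather than typed twice. This is a minimal sketch, not the author's method: `build_ha_opts` is a hypothetical helper name, while the three `spark.deploy.*` properties are the ones used in this article.

```shell
# Hypothetical helper: build the HA options from a ZooKeeper connection
# string and a znode directory (defaults to /spark, as in this article).
build_ha_opts() {
  local zk_url="$1"
  local zk_dir="${2:-/spark}"
  printf '%s %s %s' \
    "-Dspark.deploy.recoveryMode=ZOOKEEPER" \
    "-Dspark.deploy.zookeeper.url=${zk_url}" \
    "-Dspark.deploy.zookeeper.dir=${zk_dir}"
}

# In conf/spark-env.sh on each master:
export SPARK_DAEMON_JAVA_OPTS="$(build_ha_opts zk1:2181,zk2:2181,zk3:2181 /spark)"
```

    Keeping the ZooKeeper URL and directory in one place makes it harder for the two masters to end up pointing at different znodes, in which case they would not see each other during leader election.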

    Start the master and the slaves:

    ./sbin/start-master.sh
    ./sbin/start-slaves.sh

    2. Log in to spark-m2

    Edit conf/spark-env.sh:

    vi conf/spark-env.sh
    export SPARK_MASTER_IP=spark-m2
    export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=zk1:2181,zk2:2181,zk3:2181 -Dspark.deploy.zookeeper.dir=/spark"   

    Start the master and the slaves:

    ./sbin/start-master.sh
    ./sbin/start-slaves.sh

    III. Verification

    The web UI on spark-m1 shows the master's status (ALIVE).

    The web UI on spark-m2 shows that master in the STANDBY state.
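
    The same check can be scripted instead of opening a browser. The sketch below assumes the master web UI listens on the default port 8080 and serves a JSON view at `/json/` that contains a `status` field; `master_status` is a hypothetical helper name, and `curl` and `sed` are assumed to be available.

```shell
# Hypothetical helper: print the "status" field from a master's JSON page.
# Arguments: host, optional web UI port (assumed default: 8080).
master_status() {
  curl -s "http://$1:${2:-8080}/json/" |
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([A-Z_]*\)".*/\1/p' |
    head -n 1
}

# Usage:
#   master_status spark-m1
#   master_status spark-m2
```

    If the setup is healthy, one master should report ALIVE and the other STANDBY; an empty result means the page could not be fetched or parsed.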

    When submitting an application, list both masters in the master URL:

    --master spark://spark-m1:7077,spark-m2:7077
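
    The URL is a single `spark://` prefix followed by a comma-separated list of `host:port` pairs. A small sketch that assembles it from a host list; `master_url` is a hypothetical helper name, and the port falls back to 7077 (the port used in this article) when `SPARK_MASTER_PORT` is unset.

```shell
# Hypothetical helper: join master hosts into one spark:// URL.
master_url() {
  local port="${SPARK_MASTER_PORT:-7077}"
  local url="" host
  for host in "$@"; do
    # Prepend a comma only when url already holds an earlier host.
    url="${url:+$url,}${host}:${port}"
  done
  printf 'spark://%s\n' "$url"
}

# Usage:
#   spark-shell --master "$(master_url spark-m1 spark-m2)"
```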
    

    Testing with spark-shell

    Start spark-shell on spark-m1:

    spark-shell --master spark://spark-m1:7077,spark-m2:7077
    

    Once it has connected, stop the master on spark-m1:

    ./sbin/stop-master.sh
    

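    spark-shell does not disconnect. It switches to the master on spark-m2 and keeps running (the switch takes about 1 minute, during which the workers re-register with spark-m2), and spark-m2 changes to the ALIVE state.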
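
    To measure how long the switch takes, the standby can be polled until it reports ALIVE. This is a rough sketch under assumptions: `wait_until` and `is_alive` are hypothetical helper names, and `is_alive` assumes the master web UI on port 8080 serves JSON at `/json/` with a `status` field.

```shell
# Hypothetical helper: retry a command until it succeeds or the timeout
# (in seconds) expires. Prints the seconds waited; returns 1 on timeout.
wait_until() {
  local timeout="$1" interval="$2" waited=0
  shift 2
  until "$@"; do
    [ "$waited" -ge "$timeout" ] && return 1
    sleep "$interval"
    waited=$((waited + interval))
  done
  echo "$waited"
}

# Assumed check: succeed when the master's JSON page reports ALIVE.
is_alive() {
  curl -s "http://$1:8080/json/" |
    grep -q '"status"[[:space:]]*:[[:space:]]*"ALIVE"'
}

# Usage (right after stopping the master on spark-m1):
#   wait_until 120 5 is_alive spark-m2
```

    The printed value is only accurate to the polling interval, so a 5-second interval gives a figure that may be up to 5 seconds too high.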

    The master log on spark-m2 shows:

    15/08/17 14:45:35 INFO ZooKeeperLeaderElectionAgent: We have gained leadership
    15/08/17 14:45:36 INFO Master: I have been elected leader! New state: RECOVERING
    15/08/17 14:45:36 INFO Master: Trying to recover worker:...
    15/08/17 14:45:36 INFO Master: Trying to recover worker: ...
    15/08/17 14:45:36 INFO Master: Trying to recover worker: ...
    ......
    15/08/17 14:45:36 INFO Master: Worker has been re-registered: worker-...
    15/08/17 14:45:36 INFO Master: Worker has been re-registered: worker-...
    15/08/17 14:45:36 INFO Master: Worker has been re-registered: worker-...
    ...
    15/08/17 14:45:36 INFO Master: Recovery complete - resuming operations!
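
    With many workers it is easier to count these lines than to read them. A minimal sketch that summarises a master log using the message texts shown above; `failover_summary` is a hypothetical helper name and the log path is whatever file the master writes to (by default under `logs/` in the Spark directory).

```shell
# Hypothetical helper: summarise a failover from a master log file.
failover_summary() {
  local log="$1"
  # grep -c prints 0 when nothing matches, so the counts are always numbers.
  echo "recovering=$(grep -c 'Trying to recover worker' "$log")"
  echo "reregistered=$(grep -c 'Worker has been re-registered' "$log")"
  if grep -q 'Recovery complete' "$log"; then
    echo "complete=yes"
  else
    echo "complete=no"
  fi
}

# Usage:
#   failover_summary /path/to/master.log
```

    If `recovering` is larger than `reregistered`, some workers known to the old master did not come back to the new one.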

    Deployment complete.

  • Original article: https://www.cnblogs.com/zhangyunlin/p/6168174.html