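  • Building a swarm with Docker 1.12 on CentOS 7 (Part 1)

    A multi-host container networking setup with Docker 1.12 on CentOS 7

    My virtual machines are 192.168.2.108-116, nine in total.

    192.168.2.200 is the registry machine.

    On the registry machine, run docker swarm init to initialize the swarm: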

    [root@localhost ~]# docker swarm init
    Swarm initialized: current node (ado6uyaldy5ovvi7fwkvvuoh4) is now a manager.

    To add a worker to this swarm, run the following command:

    docker swarm join
    --token SWMTKN-1-5i9rc8jlypt8ngy137asbi5qhwenuze9ez1o19f40jxftnq4nj-2mnw1bzecz6tqz94bia3ok5rp
    192.168.2.200:2377

    To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.

    [root@localhost ~]# netstat -lan|grep 2377
    tcp 0 0 192.168.2.200:49598 192.168.2.200:2377 ESTABLISHED
    tcp 0 0 127.0.0.1:42714 127.0.0.1:2377 ESTABLISHED
    tcp6 0 0 :::2377 :::* LISTEN
    tcp6 0 0 127.0.0.1:2377 127.0.0.1:42714 ESTABLISHED
    tcp6 0 0 192.168.2.200:2377 192.168.2.200:49598 ESTABLISHED
    [root@localhost ~]# docker node ls
    ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS
    ado6uyaldy5ovvi7fwkvvuoh4 * localhost.localdomain Ready Active Leader

    The basic swarm cluster now exists on the registry machine.

    On 108, run:

    docker swarm join \
    --token SWMTKN-1-5i9rc8jlypt8ngy137asbi5qhwenuze9ez1o19f40jxftnq4nj-2mnw1bzecz6tqz94bia3ok5rp \
    192.168.2.200:2377

    After a long wait it fails with:

    Error response from daemon: Timeout was reached before node was joined. The attempt to join the swarm will continue in the background. Use the "docker info" command to see the current swarm status of your node.

    I suspected a firewall problem, so I opened the swarm management port.

    On 200:

    firewall-cmd --permanent --zone=public --add-port=2377/tcp

    firewall-cmd --reload
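Port 2377/tcp only carries cluster management traffic. Docker's swarm documentation also lists 7946/tcp and 7946/udp (communication among nodes) and 4789/udp (overlay network traffic), which matter once containers talk across hosts. The sketch below prints the matching firewall-cmd lines for review instead of running them. It assumes the public zone used above, and the helper name swarm_firewall_cmds is made up for illustration.

```shell
#!/bin/sh
# Ports used by swarm mode: 2377/tcp (cluster management),
# 7946/tcp and 7946/udp (node communication), 4789/udp (overlay traffic).
SWARM_PORTS="2377/tcp 7946/tcp 7946/udp 4789/udp"

# Hypothetical helper: prints the commands rather than executing them,
# so the list can be checked before it is applied.
swarm_firewall_cmds() {
    for port in $SWARM_PORTS; do
        echo "firewall-cmd --permanent --zone=public --add-port=$port"
    done
    echo "firewall-cmd --reload"
}

swarm_firewall_cmds
```

To apply the printed commands, pipe them to a root shell, for example swarm_firewall_cmds | sh.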

    I suspected a clock problem:

    Install the NTP time synchronization service on every node:

    yum -y install ntp

    systemctl enable ntpd

    systemctl start ntpd

    ntpdate -u cn.pool.ntp.org

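    I suspected a hostname problem, so on each node I set the hostname from the last octet of its IP (192.168.2.108 becomes ip108.sfimc.com):

    hostnamectl set-hostname ip<last octet of the IP>.sfimc.com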
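The naming rule can be sketched as a small shell function. The helper name hostname_for_ip is hypothetical; the sfimc.com suffix and the ip prefix come from the hostnames shown later in this post.

```shell
#!/bin/sh
# Hypothetical helper: build the hostname from the last octet of an IPv4 address.
hostname_for_ip() {
    last_octet="${1##*.}"    # strip everything up to and including the final dot
    echo "ip${last_octet}.sfimc.com"
}

hostname_for_ip 192.168.2.108
```

On a node it could then be used as hostnamectl set-hostname "$(hostname_for_ip 192.168.2.108)", with that node's own address filled in.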

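    On 108:

    Because this node had already tried to join, joining again reports an error, so it has to leave first. Be careful: docker swarm leave is a dangerous command on a node that is in service.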

    [root@localhost ~]# docker swarm leave
    Node left the swarm.
    [root@localhost ~]# docker swarm join --token SWMTKN-1-5i9rc8jlypt8ngy137asbi5qhwenuze9ez1o19f40jxftnq4nj-2mnw1bzecz6tqz94bia3ok5rp 192.168.2.200:2377
    This node joined a swarm as a worker.

    Success!

    Then run the same command on machines 109-115:

    docker swarm join --token SWMTKN-1-5i9rc8jlypt8ngy137asbi5qhwenuze9ez1o19f40jxftnq4nj-2mnw1bzecz6tqz94bia3ok5rp 192.168.2.200:2377

    This adds the remaining worker nodes to the cluster; they should all succeed.
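Repeating the join by hand on seven machines is tedious. Assuming root SSH access from one machine to every worker (something the original setup does not mention), the commands can be generated in a loop. The helper name print_worker_joins and the WORKER_TOKEN placeholder are made up for illustration; the real token is the one printed by docker swarm join-token worker on the manager.

```shell
#!/bin/sh
MANAGER_ADDR="192.168.2.200:2377"

# Hypothetical helper: print one ssh command per worker in the range.
# $1 = worker join token, $2 = first last-octet, $3 = final last-octet.
print_worker_joins() {
    token="$1"
    i="$2"
    while [ "$i" -le "$3" ]; do
        echo "ssh root@192.168.2.$i docker swarm join --token $token $MANAGER_ADDR"
        i=$((i + 1))
    done
}

# WORKER_TOKEN stands in for the real token value.
print_worker_joins "WORKER_TOKEN" 109 115
```

Printing the commands first makes it easy to check the address range before anything touches the nodes.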

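    Then on 200, docker swarm join-token manager prints the join command for manager nodes. The command is the same as for workers; only the token differs. It took me a long time to find this because my English is poor, haha.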

    docker swarm join --token SWMTKN-1-5i9rc8jlypt8ngy137asbi5qhwenuze9ez1o19f40jxftnq4nj-3uyi9txwapgnfe2gxbs6nkv6f 192.168.2.200:2377

    On 116, run:

    docker swarm join --token SWMTKN-1-5i9rc8jlypt8ngy137asbi5qhwenuze9ez1o19f40jxftnq4nj-3uyi9txwapgnfe2gxbs6nkv6f 192.168.2.200:2377

    This node joined a swarm as a manager.

    Now list the nodes on 200:

    [root@ip200 ~]# docker node ls
    ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS
    4zgqf8j8q9zi5nw2czdx59a7g ip108.sfimc.com Ready Active
    ado6uyaldy5ovvi7fwkvvuoh4 * ip200.sfimc.com Ready Active Leader
    e55zsu88tf1ggj824ezutmjww ip116.sfimc.com Ready Active Reachable

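    116 is now a Reachable manager: a standby that can be elected leader if the current leader fails or is shut down, provided a majority of managers is still up. With only two managers, losing either one loses that majority, which is exactly what happens further down.

    At this point the swarm cluster's nodes are essentially set up.

    There is one serious problem, though: a manager node leaving the swarm.

    Starting on 116: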

    [root@ip116 ~]# docker swarm leave
    Error response from daemon: You are attempting to leave the swarm on a node that is participating as a manager. Removing this node leaves 1 managers out of 2. Without a Raft quorum your swarm will be inaccessible. The only way to restore a swarm that has lost consensus is to reinitialize it with `--force-new-cluster`. Use `--force` to suppress this message.
    [root@ip116 ~]# docker swarm leave
    Error response from daemon: You are attempting to leave the swarm on a node that is participating as a manager. Removing this node leaves 1 managers out of 2. Without a Raft quorum your swarm will be inaccessible. The only way to restore a swarm that has lost consensus is to reinitialize it with `--force-new-cluster`. Use `--force` to suppress this message.
    [root@ip116 ~]# docker swarm leave --force
    Node left the swarm.
    [root@ip116 ~]# ^C
    [root@ip116 ~]# docker swarm leave --force
    Error response from daemon: This node is not part of a swarm

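    I was trying to take the manager node 116 out of the cluster and replace it with the newly built 201 and 202. With my poor English, I tried the --force flag on 116, and this was the result.

    On 200: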

    [root@ip200 ~]# docker node ls
    Error response from daemon: rpc error: code = 2 desc = raft: no elected cluster leader

    [root@ip200 ~]# docker node update e55    (note: e55 is the first three characters of the node ID 116 used to have in the cluster)
    Error response from daemon: rpc error: code = 4 desc = context deadline exceeded
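These errors follow from Raft majority arithmetic: N managers need floor(N/2) + 1 of them to agree before anything can change. With two managers the quorum is two, so once 116 was gone, 200 alone could no longer elect a leader. A minimal sketch of the arithmetic (the function names are made up for illustration):

```shell
#!/bin/sh
# Majority needed among N managers: floor(N/2) + 1.
quorum() {
    echo $(( $1 / 2 + 1 ))
}

# Managers that can be lost while a majority remains: floor((N-1)/2).
tolerated_failures() {
    echo $(( ($1 - 1) / 2 ))
}

for n in 1 2 3 5; do
    echo "managers=$n quorum=$(quorum "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

Two managers tolerate no failures, the same as one, which is why odd manager counts such as three or five are the usual recommendation.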

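    I searched online and found no fix at the time.

    So I had to run docker swarm leave --force on 200 as well. Note that this leaves the cluster without a single manager; in effect the cluster is destroyed.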
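In hindsight, the error message printed on 116 names a less destructive recovery path: running docker swarm init --force-new-cluster on the surviving manager, which is meant to rebuild a single-manager cluster from that node's existing state. I did not try it here, so treat that as an untested suggestion. To avoid the situation in the first place, a manager can be demoted to a worker before it leaves. The sketch below only prints that sequence; the helper name print_manager_retire_plan is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the steps for retiring a manager without
# losing quorum. $1 = hostname (or node ID) of the manager to retire.
print_manager_retire_plan() {
    node="$1"
    echo "# on a healthy manager: turn $node into a worker first"
    echo "docker node demote $node"
    echo "# on $node itself: a worker can leave without --force"
    echo "docker swarm leave"
    echo "# back on the manager: remove the stale entry once it shows Down"
    echo "docker node rm $node"
}

print_manager_retire_plan ip116.sfimc.com
```

When the goal is to replace a manager, as it was here with 201 and 202, joining the new managers before demoting the old one keeps a majority available throughout.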

    Setting it up again:

    On 200:

    [root@ip200 ~]# docker swarm init
    Swarm initialized: current node (c0fqga97cqoghgn5h8rqn39yc) is now a manager.

    To add a worker to this swarm, run the following command:

    docker swarm join
    --token SWMTKN-1-2djciyagpvqpvs0r770pu5fgcith6yc1uhsev2g1e0riprt1qy-3789ph5zmo6tx703c77zno6kf
    192.168.2.200:2377

    To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.

    [root@ip200 ~]# docker swarm join-token manager
    To add a manager to this swarm, run the following command:

    docker swarm join
    --token SWMTKN-1-2djciyagpvqpvs0r770pu5fgcith6yc1uhsev2g1e0riprt1qy-aookkrj5i8ggkfn8r2er5hrt3
    192.168.2.200:2377

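    This yields new join tokens for both node types.

    Run the manager join command on 201 and 202, and the worker join command on 110-116. Nodes that belonged to the previous cluster must run docker swarm leave first, and manager nodes must have port 2377 open. All of this is shown above, so I won't repeat it.

    Handling 202 went slightly wrong. After it joined the cluster, the 202 console printed nothing, and 200 and 201 showed it as a worker node. I powered 202 off, at which point node ls showed it as Down, and docker node rm 202 on 200 removed it from the cluster successfully.

    But when I tried to join it again, 202 misbehaved. It still seemed to believe it was a manager node of some cluster: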

    docker swarm leave --force
    Error response from daemon: context deadline exceeded

    Even a forced leave reported this error, so I deleted the swarm state files by hand:

    [root@ip202 ~]# cd /var/lib/docker/swarm/
    [root@ip202 swarm]# ls
    certificates docker-state.json raft state.json worker
    [root@ip202 swarm]# rm -rf *

    Restart the docker service:

    [root@ip202 swarm]# service docker restart
    Redirecting to /bin/systemctl restart docker.service
    [root@ip202 swarm]# docker swarm leave
    Error response from daemon: This node is not part of a swarm

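    The whole reset as one line (stop the service, delete the swarm state, restart, confirm). The && makes sure rm only runs if the cd succeeded:

    systemctl stop docker.service; cd /var/lib/docker/swarm/ && rm -rf ./*; service docker restart; docker swarm leave --force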

    Redirecting to /bin/systemctl restart docker.service
    Error response from daemon: This node is not part of a swarm

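    Finally back to normal. The last resort, it seems, is to delete the swarm state files and restart the service, though only when nothing else works.

    Join 202 to the cluster again as a manager node.

    Finally, on any one of 200, 201, or 202, run: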

    docker node ls

    [root@ip200 swarm]# docker node ls
    ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS
    08kaohdh5jl17toyiqe08rxuv ip110.sfimc.com Ready Active
    20tv0r7vw3xtxyez8mx7o5g5d * ip200.sfimc.com Ready Active Leader
    2hslpef7idwka474cok6kach8 ip111.sfimc.com Ready Active
    2rkf3xqjpq89tjoqj073hht14 ip114.sfimc.com Ready Active
    52ued4l33njka5ua4wjoi8epk ip112.sfimc.com Ready Active
    5kclmdzx3mxafqi3achr3h47a ip202.sfimc.com Ready Active Reachable
    5osxnk5asjxi3d3vzdsy9lvbv ip201.sfimc.com Ready Active Reachable
    6qrxcpsiqpmec4regf5cmyszc ip116.sfimc.com Ready Active
    ai77no9444jka1t8srjwk8bzk ip108.sfimc.com Ready Active
    bduvkn5xczeo9ax2ydyvzmvbo ip115.sfimc.com Ready Active
    da0e74wzdxgi83f7m89r51jil ip113.sfimc.com Ready Active
    dlfy5dwho3b6k0db3q9za5pov ip109.sfimc.com Ready Active

    At last, every node has joined.

    Management operations can now be run from any of the manager nodes.

  • Original article: https://www.cnblogs.com/sfissw/p/6083454.html