  • etcd installation and the pitfalls encountered

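    First, follow https://www.cnblogs.com/lyzw/p/6016789.html for the installation.

      Virtual machine: VMware® Workstation 12 Pro

      OS: CentOS Linux release 7.2.1511 (Core) 3.10.0-327.el7.x86_64

    Since I am just starting to learn k8s, this installation takes the simplest route throughout: whatever can be installed with yum is installed with yum.

    1. ETCD installation

    Official ETCD documentation: https://github.com/coreos/etcd/blob/master/Documentation/docs.md

    1.1 Check the ETCD version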

    [root@localhost ~]# yum list|grep etcd
    etcd.x86_64                                2.3.7-4.el7                 @extras  
    [root@localhost ~]# 

    1.2 Install ETCD

    yum install etcd

    1.3 Modify the ETCD configuration

    After installation, the etcd.service file is generated automatically (under /usr/lib/systemd/system/). Modify it as follows:

    [Unit]
    Description=Etcd Server
    After=network.target
    After=network-online.target
    Wants=network-online.target
    
    [Service]
    Type=notify
    WorkingDirectory=/var/lib/etcd/
    EnvironmentFile=-/etc/etcd/etcd.conf
    User=etcd
    # set GOMAXPROCS to number of processors
    ExecStart=/bin/bash -c "GOMAXPROCS=$(nproc) /usr/bin/etcd \
    --name=\"${ETCD_NAME}\" \
    --data-dir=\"${ETCD_DATA_DIR}\" \
    --listen-peer-urls=\"${ETCD_LISTEN_PEER_URLS}\" \
    --advertise-client-urls=\"${ETCD_ADVERTISE_CLIENT_URLS}\" \
    --initial-cluster-token=\"${ETCD_INITIAL_CLUSTER_TOKEN}\" \
    --initial-cluster=\"${ETCD_INITIAL_CLUSTER}\" \
    --initial-cluster-state=\"${ETCD_INITIAL_CLUSTER_STATE}\" \
    --listen-client-urls=\"${ETCD_LISTEN_CLIENT_URLS}\""
    Restart=on-failure
    LimitNOFILE=65536
    
    [Install]
    WantedBy=multi-user.target

    Then edit its configuration file (/etc/etcd/etcd.conf, as referenced by EnvironmentFile above):

    ETCD_NAME=zwetcd_2
    ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
    ETCD_LISTEN_PEER_URLS="http://192.168.37.131:2380"
    ETCD_LISTEN_CLIENT_URLS="http://192.168.37.131:2379,http://127.0.0.1:2379"
    #[cluster]
    ETCD_INITIAL_ADVERTISE_PEER_URLS="http://192.168.37.131:2380"
    # if you use different ETCD_NAME (e.g. test), set ETCD_INITIAL_CLUSTER value for this name, i.e. "test=http://..."
    ETCD_INITIAL_CLUSTER="zwetcd_2=http://192.168.37.131:2380,zwetcd_1=http://192.168.37.130:2380"
    ETCD_INITIAL_CLUSTER_STATE="new"
    ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
    ETCD_ADVERTISE_CLIENT_URLS="http://192.168.37.131:2379"
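
    The comment in the file notes that ETCD_INITIAL_CLUSTER needs an entry for this node's ETCD_NAME. A minimal sketch of that check follows. The function name check_initial_cluster is hypothetical, and the sketch assumes the file contains only plain KEY="value" lines that bash can source.

```shell
#!/usr/bin/env bash
# Hypothetical helper: verify that ETCD_NAME has an entry in ETCD_INITIAL_CLUSTER.
# Assumes the config file uses plain KEY="value" lines that bash can source.
check_initial_cluster() {
    local conf="$1"
    # Source in a subshell so the variables do not leak into the caller.
    (
        ETCD_NAME="" ETCD_INITIAL_CLUSTER=""
        . "$conf" || exit 2
        # Split the member list on commas and compare the part before "=".
        IFS=',' read -ra members <<< "$ETCD_INITIAL_CLUSTER"
        for m in "${members[@]}"; do
            [ "${m%%=*}" = "$ETCD_NAME" ] && exit 0
        done
        echo "ETCD_NAME=$ETCD_NAME not found in ETCD_INITIAL_CLUSTER" >&2
        exit 1
    )
}
```

    Called as check_initial_cluster /etc/etcd/etcd.conf, it returns 0 when the names match and prints a message otherwise.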

    If firewalld is used as the firewall, the ports must be opened:

    firewall-cmd --zone=public --add-port=2379/tcp --permanent
    firewall-cmd --zone=public --add-port=2380/tcp --permanent
    firewall-cmd --reload
    firewall-cmd --list-all
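
    To confirm the two ports really are open after the reload, firewall-cmd can be queried per port. The sketch below is only an illustration: the function name check_etcd_ports and the FIREWALL_CMD override (useful for a dry run) are hypothetical, and it assumes the public zone used above.

```shell
# Hypothetical helper: report which etcd ports are not open in the public zone.
# FIREWALL_CMD can be overridden for testing; it defaults to firewall-cmd.
check_etcd_ports() {
    local fw="${FIREWALL_CMD:-firewall-cmd}" port rc=0
    for port in 2379/tcp 2380/tcp; do
        # --query-port exits 0 when the port is open in the zone.
        if ! "$fw" --zone=public --query-port="$port" >/dev/null; then
            echo "port $port is not open in zone public" >&2
            rc=1
        fi
    done
    return "$rc"
}
```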

      Problems:

    1. Local connection error

    [root@localhost system]# etcdctl ls /
    Error: client: etcd cluster is unavailable or misconfigured
    error #0: dial tcp 127.0.0.1:2379: getsockopt: connection refused
    error #1: dial tcp 127.0.0.1:4001: getsockopt: connection refused

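    If you see the error above, ETCD_LISTEN_CLIENT_URLS is missing http://127.0.0.1:2379. It seemed odd to me that the loopback address is still needed when a specific IP is already configured, but the error output shows why: etcdctl tries 127.0.0.1:2379 and 127.0.0.1:4001 by default, so etcd must also listen on the loopback address (or etcdctl must be pointed at the configured IP).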
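
    This condition can be checked before starting etcd. The following is a minimal sketch, not part of the original setup: listens_on_loopback is a hypothetical name, and the sketch treats 0.0.0.0 as covering the loopback address as well.

```shell
# Hypothetical helper: succeed if ETCD_LISTEN_CLIENT_URLS includes a URL
# that a local etcdctl can reach via 127.0.0.1.
listens_on_loopback() {
    local conf="$1"
    (
        ETCD_LISTEN_CLIENT_URLS=""
        . "$conf" || exit 2
        case "$ETCD_LISTEN_CLIENT_URLS" in
            *"://127.0.0.1:"*|*"://localhost:"*|*"://0.0.0.0:"*) exit 0 ;;
            *) exit 1 ;;
        esac
    )
}
```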

    2. Docker installation

    2.1 Check the docker version

    yum list |grep docker

    [root@localhost ~]# yum list|grep docker
    docker.x86_64                              1.10.3-46.el7.centos.14     @extras 
    docker-common.x86_64                       1.10.3-46.el7.centos.14     @extras 
    docker-selinux.x86_64                      1.10.3-46.el7.centos.14     @extras 
    cockpit-docker.x86_64                      0.114-2.el7.centos          extras  
    docker-devel.x86_64                        1.3.2-4.el7.centos          extras  
    docker-distribution.x86_64                 2.4.1-2.el7                 extras  
    docker-forward-journald.x86_64             1.10.3-44.el7.centos        extras  
    docker-latest.x86_64                       1.12.1-2.el7.centos         extras  
    docker-latest-logrotate.x86_64             1.12.1-2.el7.centos         extras  
    docker-latest-v1.10-migrator.x86_64        1.12.1-2.el7.centos         extras  
    docker-logrotate.x86_64                    1.10.3-46.el7.centos.14     extras  
    docker-lvm-plugin.x86_64                   1.10.3-46.el7.centos.14     extras  
    docker-novolume-plugin.x86_64              1.10.3-46.el7.centos.14     extras  
    docker-python.x86_64                       1.4.0-115.el7               extras  
    docker-registry.noarch                     0.6.8-8.el7                 extras  
    docker-registry.x86_64                     0.9.1-7.el7                 extras  
    docker-unit-test.x86_64                    1.10.3-46.el7.centos.14     extras  
    docker-v1.10-migrator.x86_64               1.10.3-46.el7.centos.14     extras  
    python-docker-py.noarch                    1.7.2-1.el7                 extras  
    [root@localhost ~]#

    2.2 Install docker

    yum install docker -y

    2.3 Check the docker installation

    [root@localhost ~]# docker version
    Client:
     Version:         1.10.3
     API version:     1.22
     Package version: docker-common-1.10.3-46.el7.centos.14.x86_64
     Go version:      go1.6.3
     Git commit:      cb079f6-unsupported
     Built:           Fri Sep 16 13:24:25 2016
     OS/Arch:         linux/amd64
    Cannot connect to the Docker daemon. Is the docker daemon running on this host?

    3. flannel

    3.1 Check the flannel version

    [root@localhost etcd]# yum list |grep flannel
    flannel.x86_64                             0.5.3-9.el7                 @extras  

    3.2 Install flannel

    yum install flannel 

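    3.3 Modify the service configuration

    Look at the flannel service file (a yum install generates it automatically; if you downloaded the binary instead, you have to create it by hand; systemctl uses it when managing the service). The flannel service configuration is: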

    [root@localhost etcd]# more /usr/lib/systemd/system/flanneld.service
    [Unit]
    Description=Flanneld overlay address etcd agent
    After=network.target
    After=network-online.target
    Wants=network-online.target
    After=etcd.service
    Before=docker.service
    
    [Service]
    Type=notify
    EnvironmentFile=/etc/sysconfig/flanneld
    EnvironmentFile=-/etc/sysconfig/docker-network
    ExecStart=/usr/bin/flanneld -etcd-endpoints=${FLANNEL_ETCD} -etcd-prefix=${FLANNEL_ETCD_KEY} $FLANNEL_OPTIONS
    ExecStartPost=/usr/libexec/flannel/mk-docker-opts.sh -k DOCKER_NETWORK_OPTIONS -d /run/flannel/docker
    Restart=on-failure
    
    [Install]
    WantedBy=multi-user.target
    RequiredBy=docker.service

    All of the parameters are set in /etc/sysconfig/flanneld. Edit that file; the initial contents are:

    # Flanneld configuration options  
    
    # etcd url location.  Point this to the server where etcd runs
    FLANNEL_ETCD="http://127.0.0.1:2379"
    
    # etcd config key.  This is the configuration key that flannel queries
    # For address range assignment
    FLANNEL_ETCD_KEY="/atomic.io/network"
    
    # Any additional options that you want to pass
    FLANNEL_OPTIONS=""

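    Where:

      FLANNEL_ETCD: the address of ETCD

      FLANNEL_ETCD_KEY: the key under which the network parameters are stored in etcd

      FLANNEL_OPTIONS: startup options for flannel; here I added the network interface to use

    Based on the etcd configuration from the earlier steps, change the file to: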

    # Flanneld configuration options  
    
    # etcd url location.  Point this to the server where etcd runs
    FLANNEL_ETCD="http://192.168.37.130:2379"
    
    # etcd config key.  This is the configuration key that flannel queries
    # For address range assignment
    FLANNEL_ETCD_KEY="/flannel/network"
    
    # Any additional options that you want to pass
    FLANNEL_OPTIONS="--iface=eno16777736"

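    3.4 Start FLANNEL

    You can start flannel with service flanneld start or systemctl start flanneld (the unit installed above is flanneld.service).

    3.5 Modify the docker network

     Because docker has to use the flanneld network, docker's service file needs to be modified: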

    [Unit]
    Description=Docker Application Container Engine
    Documentation=http://docs.docker.com
    After=network.target rhel-push-plugin.socket
    Wants=docker-storage-setup.service
    
    [Service]
    Type=notify
    NotifyAccess=all
    #import flannel configuration 
    EnvironmentFile=-/etc/sysconfig/flanneld
    EnvironmentFile=-/run/flannel/subnet.env
    EnvironmentFile=-/etc/sysconfig/docker
    EnvironmentFile=-/etc/sysconfig/docker-storage
    EnvironmentFile=-/etc/sysconfig/docker-network
    Environment=GOTRACEBACK=crash
    ExecStart=/usr/bin/docker-current daemon \
              --exec-opt native.cgroupdriver=systemd \
              $OPTIONS \
              $DOCKER_STORAGE_OPTIONS \
              $DOCKER_NETWORK_OPTIONS \
              $ADD_REGISTRY \
              $BLOCK_REGISTRY \
              $INSECURE_REGISTRY \
              --bip=${FLANNEL_SUBNET}
    LimitNOFILE=1048576
    LimitNPROC=1048576
    LimitCORE=infinity
    TimeoutStartSec=0
    MountFlags=slave
    Restart=on-abnormal
    
    [Install]
    WantedBy=multi-user.target

    Add these environment files before the ExecStart line:

    EnvironmentFile=-/etc/sysconfig/flanneld

    EnvironmentFile=-/run/flannel/subnet.env

    and add the argument --bip=${FLANNEL_SUBNET} to the start command.
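
    FLANNEL_SUBNET comes from /run/flannel/subnet.env, which flanneld writes when it starts. The sketch below shows how the value turns into the docker argument; docker_bip_arg is a hypothetical helper, and it assumes the file holds plain KEY=value lines such as FLANNEL_SUBNET=172.17.75.1/24 (an illustrative value taken from the docker0 address shown later in this post).

```shell
# Hypothetical helper: print the --bip argument docker would receive,
# given a flannel subnet file (normally /run/flannel/subnet.env).
docker_bip_arg() {
    local env_file="$1"
    (
        FLANNEL_SUBNET=""
        . "$env_file" || exit 2
        # If flanneld has not started yet, the variable is empty and docker
        # would be launched with an empty --bip value.
        [ -n "$FLANNEL_SUBNET" ] || { echo "FLANNEL_SUBNET is not set" >&2; exit 1; }
        echo "--bip=${FLANNEL_SUBNET}"
    )
}
```

    This is also why flanneld has to be running before docker is restarted.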

    Restart docker:

    systemctl daemon-reload
    systemctl restart docker

    3.6 Problems

    1. Failed to retrieve network config: 104: Not a directory (/flannel/network/config)

    Cause: during the first setup, the etcd prefix key in the flannel configuration file (FLANNEL_ETCD_KEY) was set to /flannel/network/config; it should actually be /flannel/network.
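
    The relationship between the prefix and the key can be sketched as follows: flanneld looks up its network definition at the prefix plus /config, so the prefix itself must not end in /config. Both function names are hypothetical and only illustrate the string handling.

```shell
# Hypothetical helpers illustrating the prefix/key relationship.
# flanneld reads its network JSON from "<etcd-prefix>/config".
flannel_config_key() {
    local prefix="${1%/}"          # drop one trailing slash, if any
    echo "${prefix}/config"
}

# Strip a mistakenly appended "/config" from a prefix value.
fix_flannel_prefix() {
    local prefix="${1%/}"
    echo "${prefix%/config}"
}
```

    With the etcdctl v2 commands used in this post, the network definition would then be stored under that key with something like etcdctl set /flannel/network/config '{"Network":"172.17.0.0/16"}'. The 172.17.0.0/16 range here is an assumption inferred from the flannel0 address shown further down, not a value given in the original configuration.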

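    Note: the configuration above must be applied on every machine in the cluster. Once that is done, the services installed above should be started in this order: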

    systemctl start etcd

    systemctl start flanneld

    systemctl start docker
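
    The start order can be wrapped in a small script that stops at the first failure. This is only a sketch: start_stack and the SYSTEMCTL override are hypothetical, and the flannel unit name flanneld is taken from the flanneld.service file shown earlier.

```shell
# Hypothetical helper: start the services in dependency order and stop at
# the first failure. SYSTEMCTL can be overridden (e.g. with "echo") for a dry run.
start_stack() {
    local ctl="${SYSTEMCTL:-systemctl}"
    local svc
    for svc in etcd flanneld docker; do
        "$ctl" start "$svc" || { echo "failed to start $svc" >&2; return 1; }
    done
}
```

    Running SYSTEMCTL=echo start_stack prints the three start commands without executing them.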

    Check after configuration:

    Use ip a to check the state of the network:

    [root@localhost system]# ip a
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN 
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
        inet6 ::1/128 scope host 
           valid_lft forever preferred_lft forever
    2: eno16777736: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
        link/ether 00:0c:29:79:cf:e3 brd ff:ff:ff:ff:ff:ff
        inet 192.168.37.130/24 brd 192.168.37.255 scope global dynamic eno16777736
           valid_lft 1554sec preferred_lft 1554sec
        inet6 fe80::20c:29ff:fe79:cfe3/64 scope link 
           valid_lft forever preferred_lft forever
    9: flannel0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1472 qdisc pfifo_fast state UNKNOWN qlen 500
        link/none 
        inet 172.17.75.0/16 scope global flannel0
           valid_lft forever preferred_lft forever
    10: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN 
        link/ether 02:42:e3:f0:0d:05 brd ff:ff:ff:ff:ff:ff
        inet 172.17.75.1/24 scope global docker0
           valid_lft forever preferred_lft forever

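    If flannel0 and docker0 are in the same network (here docker0's 172.17.75.1/24 lies inside flannel0's 172.17.75.0/16), the network configuration succeeded.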
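
    The same comparison can be done mechanically. The sketch below tests whether an address falls inside a network given in addr/prefix form; ip_to_int and in_cidr are hypothetical helper names, and the sketch handles IPv4 only.

```shell
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
    local a b c d
    IFS=. read -r a b c d <<< "$1"
    echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Succeed if address $1 lies inside the network given as addr/prefix in $2.
in_cidr() {
    local addr="$1" net="${2%/*}" bits="${2#*/}"
    # Build the netmask for the prefix length, kept to 32 bits.
    local mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
    [ $(( $(ip_to_int "$addr") & mask )) -eq $(( $(ip_to_int "$net") & mask )) ]
}
```

    For the output above, in_cidr 172.17.75.1 172.17.75.0/16 succeeds.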

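    The first install of a 3-machine cluster went without problems. Some time later I needed to add a new node and ran into a lot of pitfalls.

    For example, those described in http://www.mamicode.com/info-detail-2194387.html: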

            Error when systemd starts the etcd service: Failed at step CHDIR spawning /usr/bin/etcd: No such file or directory
    
          Fix: the working directory set in etcd.service, WorkingDirectory=/var/lib/etcd/, must exist; otherwise this error is reported.
    
            Error when systemd starts the etcd service: cannot assign requested address
    
          Fix: bind to the Alibaba Cloud private IP
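
    The CHDIR case is easy to check before starting the service. A minimal sketch, assuming the unit file has a plain WorkingDirectory=/path line without any systemd prefix characters; unit_workdir and check_workdir are hypothetical names.

```shell
# Hypothetical helper: print the WorkingDirectory= value from a unit file.
unit_workdir() {
    sed -n 's/^WorkingDirectory=//p' "$1" | head -n 1
}

# Succeed only if that directory exists; otherwise report what is missing.
check_workdir() {
    local dir
    dir="$(unit_workdir "$1")"
    [ -n "$dir" ] || return 0      # the unit sets no WorkingDirectory
    [ -d "$dir" ] || { echo "missing working directory: $dir" >&2; return 1; }
}
```

    For example: check_workdir /usr/lib/systemd/system/etcd.service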

    And from https://blog.csdn.net/u010087956/article/details/53670468:

    After backing up and restoring the data of an etcd managed by systemd, the service fails to start and reports:
    
    error listing data dir: /var/lib/etcd/default.etcd
    
    
    But running the start command directly works:
    
    /usr/bin/etcd --debug  --name=default --data-dir=/var/lib/etcd/default.etcd --listen-client-urls http://0.0.0.0:2379 --advertise-client-urls http://0.0.0.0:2380
    
    
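    The cause is that permissions were overlooked when restoring the directory. systemd runs etcd as the etcd user by default, so the ownership of the default.etcd directory has to be changed: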
    
    chown etcd:etcd -R /var/lib/etcd/default.etcd
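
    To see beforehand whether a restored directory has this problem, the entries with the wrong owner can be listed. This is a sketch only; not_owned_by is a hypothetical name.

```shell
# Hypothetical helper: list entries under a directory that are not owned
# by the given user (name or numeric uid). Empty output means all is well.
not_owned_by() {
    local user="$1" dir="$2"
    find "$dir" ! -user "$user" -print
}
```

    For example, not_owned_by etcd /var/lib/etcd/default.etcd prints nothing once the chown above has been applied.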
    
    
    References
    
    etcd can’t start due to status=1/FAILURE or status=200/CHDIR · Issue #3331 · coreos/etcd · GitHub 
  • Original article: https://www.cnblogs.com/devilwind/p/8880677.html