zoukankan      html  css  js  c++  java
  • postgresql 高可用 etcd + patroni 之二 patroni

    os: centos 7.4
    postgresql: 9.6.9
    etcd: 3.2.18
    patroni: 1.4.4

    patroni + etcd 是在一个postgrsql 开源大会上 亚信的一个哥们讲解的高可用方案。
    依然是基于 postgreql stream replication。

    ip规划
    192.168.56.101 node1 master
    192.168.56.102 node2 slave
    192.168.56.103 node3 slave

    安装postgresql并配置好stream

    node1、node2、node3 节点上注意设置如下几个参数

    synchronous_commit = on
    full_page_writes = on
    wal_log_hints = on
    synchronous_standby_names = '*'
    max_replication_slots = 10
    

    主要是为了使用 pg_rewind,尽量不用 synchronous 方式复制数据,性能影响太大。

    node1上创建复制槽,至关重要,patroni 用到了这个玩意

    postgres=# create user replicator replication login encrypted password '1qaz2wsx';
    
    postgres=# select * from pg_create_physical_replication_slot('pgsql96_node1');
    postgres=# select * from pg_create_physical_replication_slot('pgsql96_node2');
    postgres=# select * from pg_create_physical_replication_slot('pgsql96_node3');
    

    node2、node3 配置stream replication

    $ /usr/pgsql-9.6/bin/pg_ctl stop -m fast -D /var/lib/pgsql/9.6/main
    
    $ cd /var/lib/pgsql/9.6/main
    $ rm -rf ./*
    $ /usr/pgsql-9.6/bin/pg_basebackup -h 192.168.56.101 -D /var/lib/pgsql/9.6/main -U replicator -v -P -R
    
    $ vi recovery.conf
    recovery_target_timeline = 'latest'
    standby_mode = 'on'
    primary_conninfo = 'host=192.168.56.101 port=5432 user=replicator password=1qaz2wsx'
    primary_slot_name = 'pgsql96_node1'
    trigger_file = '/tmp/postgresql.trigger.5432'
    
    $ /usr/pgsql-9.6/bin/pg_ctl start -D /var/lib/pgsql/9.6/main -o "-c config_file=/etc/postgresql/9.6/main/postgresql.conf"
    

    注意 recovery.conf 的 primary_slot_name 在不同节点值会不同。

    添加复制功能条目 pg_hba.conf

    $ vi pg_hba.conf
    
    # Database administrative login by Unix domain socket
    local   all             postgres                                peer
    
    # TYPE  DATABASE        USER            ADDRESS                 METHOD
    
    # "local" is for Unix domain socket connections only
    local   all             all                                     peer
    
    # IPv4 local connections:
    host    all             postgres        127.0.0.1/32            trust
    host    all             all             127.0.0.1/32            md5
    host    all             all             192.168.56.0/24         md5
    
    # IPv6 local connections:
    host    all             all             ::1/128                 md5
    
    # Allow replication connections from localhost, by a user with the
    # replication privilege.
    local   replication     replicator                              peer
    host    replication     replicator      127.0.0.1/32            md5
    host    replication     replicator      ::1/128                 md5
    
    host	replication	    replicator	    192.168.56.101/32	         md5
    host	replication	    replicator	    192.168.56.102/32	         md5
    host	replication	    replicator	    192.168.56.103/32	         md5
    
    $ psql -c "select pg_reload_conf();"
    

    查看复制状态

    postgres=#  select client_addr,
           pg_xlog_location_diff(sent_location, write_location) as write_delay,
           pg_xlog_location_diff(sent_location, flush_location) as flush_delay,
           pg_xlog_location_diff(sent_location, replay_location) as replay_delay 
     from pg_stat_replication;
    
      client_addr   | write_delay | flush_delay | replay_delay 
    ----------------+-------------+-------------+--------------
     192.168.56.102 |           0 |           0 |            0
     192.168.56.103 |           0 |           0 |            0
    (2 row)
    
    

    安装etcd

    参考上一篇blog安装好etcd

    下载、安装 patroni

    用户也可以参考 https://www.linode.com/docs/databases/postgresql/create-a-highly-available-postgresql-cluster-using-patroni-and-haproxy/

    个人觉得上篇文章中 etcd 做成单点不太合适,当然作为参考完全没有问题。

    # cd /tmp
    # curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
    # python get-pip.py
    # pip install patroni[dependencies]
    
    

    patroni的一些依赖

    urllib3>=1.19.1,!=1.21
    boto
    psycopg2>=2.5.4
    PyYAML
    requests
    six>=1.7
    kazoo>=1.3.1
    python-etcd>=0.4.3,<0.5
    python-consul>=0.7.0
    click>=4.1
    prettytable>=0.7
    tzlocal
    python-dateutil
    psutil
    cdiff
    kubernetes>=2.0.0,<=6.0.0,!=4.0.*,!=5.0.*
    

    patroni 的配置

    # which patroni
    /usr/bin/patroni
    
    # patroni --help
    /usr/lib64/python2.7/site-packages/psycopg2/__init__.py:144: UserWarning: The psycopg2 wheel package will be renamed from release 2.8; in order to keep installing from binary please use "pip install psycopg2-binary" instead. For details see: <http://initd.org/psycopg/docs/install.html#binary-install-from-pypi>.
      """)
    Usage: /usr/bin/patroni config.yml
    	Patroni may also read the configuration from the PATRONI_CONFIGURATION environment variable
    
    

    错误提示 please use “pip install psycopg2-binary” instead

    # pip install psycopg2-binary
    

    patroni 配置文件

    # mkdir -p /usr/patroni/conf
    # cd /usr/patroni/conf/
    
    # vi patroni_postgresql.yml
    
    scope: pgsql96
    namespace: /pgsql/
    name: pgsql96_node1
    
    restapi:
      listen: 192.168.56.101:8008
      connect_address: 192.168.56.101:8008
    
    etcd:
      host: 192.168.56.101:2379
    
    bootstrap:
      # this section will be written into Etcd:/<namespace>/<scope>/config after initializing new cluster
      # and all other cluster members will use it as a `global configuration`
      dcs:
        ttl: 30
        loop_wait: 10
        retry_timeout: 10
        maximum_lag_on_failover: 1048576
        master_start_timeout: 300
    #    synchronous_mode: false
        postgresql:
          use_pg_rewind: true
          use_slots: true
          parameters:
            listen_addresses: "*"
            port: 5432
            wal_level: logical
            hot_standby: "on"
            wal_keep_segments: 1000
            max_wal_senders: 10
            max_replication_slots: 10
            wal_log_hints: "on"
    #        archive_mode: "on"
    #        archive_timeout: 1800s
    #        archive_command: gzip < %p > /data/backup/pgwalarchive/%f.gz
    #      recovery_conf:
    #        restore_command: gunzip < /data/backup/pgwalarchive/%f.gz > %p
    
    postgresql:
      listen: 0.0.0.0:5432
      connect_address: 192.168.56.101:5432
      data_dir: /var/lib/pgsql/9.6/data
      bin_dir: /usr/pgsql-9.6/bin
    #  config_dir:
      authentication:
        replication:
          username: replicator
          password: 1qaz2wsx
        superuser:
          username: postgres
          password: 1qaz2wsx
    
    #watchdog:
    #  mode: automatic # Allowed values: off, automatic, required
    #  device: /dev/watchdog
    #  safety_margin: 5
    
    tags:
        nofailover: false
        noloadbalance: false
        clonefrom: false
        nosync: false
    

    上面的配置和后面的输出信息有细微差异,是因为当时实验完成后又对这个配置文件做了好几次修正,方便大家直接copy使用。

    手动启动 patroni

    参数将按以下顺序应用(运行时被赋予最高优先级):

    1、从文件加载参数postgresql.base.conf(或从自定义conf文件(如果已设置)
    2、从文件加载参数postgresql.conf
    3、从文件加载参数postgresql.auto.conf
    4、运行时参数使用-o-name=value

    node1、node2、node3 三个节点依次启动

    $ patroni /usr/patroni/conf/patroni_postgresql.yml
    
    

    node1 的日志如下

    2018-07-11 18:17:22,402 INFO: Lock owner: pg96_101; I am pg96_101
    2018-07-11 18:17:22,430 INFO: no action.  i am the leader with the lock
    2018-07-11 18:17:32,403 INFO: Lock owner: pg96_101; I am pg96_101
    2018-07-11 18:17:32,432 INFO: no action.  i am the leader with the lock
    

    node2 的日志如下

    2018-07-11 18:17:22,421 INFO: Lock owner: pg96_101; I am pg96_102
    2018-07-11 18:17:22,421 INFO: does not have lock
    2018-07-11 18:17:22,435 INFO: no action.  i am a secondary and i am following a leader
    2018-07-11 18:17:32,426 INFO: Lock owner: pg96_101; I am pg96_102
    2018-07-11 18:17:32,426 INFO: does not have lock
    2018-07-11 18:17:32,436 INFO: no action.  i am a secondary and i am following a leader
    

    node3 的日志如下

    2018-07-11 18:17:22,409 INFO: Lock owner: pg96_101; I am pg96_103
    2018-07-11 18:17:22,410 INFO: does not have lock
    2018-07-11 18:17:22,423 INFO: no action.  i am a secondary and i am following a leader
    2018-07-11 18:17:32,415 INFO: Lock owner: pg96_101; I am pg96_103
    2018-07-11 18:17:32,415 INFO: does not have lock
    2018-07-11 18:17:32,425 INFO: no action.  i am a secondary and i am following a leader
    

    查看集群状态

    查看 patroni 集群状态

    $ patronictl -c /usr/patroni/conf/patroni_postgresql.yml list pg96
    +---------+----------+----------------+--------+---------+-----------+
    | Cluster |  Member  |      Host      |  Role  |  State  | Lag in MB |
    +---------+----------+----------------+--------+---------+-----------+
    |   pg96  | pg96_101 | 192.168.56.101 | Leader | running |       0.0 |
    |   pg96  | pg96_102 | 192.168.56.102 |        | running |       0.0 |
    |   pg96  | pg96_103 | 192.168.56.103 |        | running |       0.0 |
    +---------+----------+----------------+--------+---------+-----------+
    
    $ patronictl -c /usr/patroni/conf/patroni_postgresql.yml show-config pg96
    loop_wait: 10
    maximum_lag_on_failover: 1048576
    postgresql:
      parameters:
        listen_addresses: '*'
        port: 5432
      use_pg_rewind: true
    retry_timeout: 10
    ttl: 30
    
    

    查看 etcd 的 信息

    $ etcdctl ls /pg96/pg96/
    /pg96/pg96/members
    /pg96/pg96/initialize
    /pg96/pg96/leader
    /pg96/pg96/config
    /pg96/pg96/optime
    
    $ etcdctl get /pg96/pg96/members/pg96_101
    {"conn_url":"postgres://192.168.56.101:5432/postgres","api_url":"http://127.0.0.1:8008/patroni","timeline":1,"state":"running","role":"master","xlog_location":50378640}
    $ etcdctl get /pg96/pg96/members/pg96_102
    {"conn_url":"postgres://192.168.56.102:5432/postgres","api_url":"http://127.0.0.1:8008/patroni","timeline":1,"state":"running","role":"replica","xlog_location":50378640}
    $ etcdctl get /pg96/pg96/members/pg96_103
    {"conn_url":"postgres://192.168.56.103:5432/postgres","api_url":"http://127.0.0.1:8008/patroni","timeline":1,"state":"running","role":"replica","xlog_location":50378640}
    
    $ etcdctl get /pg96/pg96/initialize
    6576484813966394513
    
    $ etcdctl get /pg96/pg96/leader
    pg96_101
    
    $ etcdctl get /pg96/pg96/config
    {"ttl":30,"maximum_lag_on_failover":1048576,"retry_timeout":10,"postgresql":{"use_pg_rewind":true,"parameters":{"listen_addresses":"*","port":5432}},"loop_wait":10}
    
    
    $ etcdctl get /pg96/pg96/optime/leader
    50378640
    
    

    connction

    using jdbc:

    jdbc:postgresql://node1,node2,node3/postgres?targetServerType=master
    
    

    libpq starting from postgresql 10:

    postgresql://node1:port,node2:port,node3:port/?target_session_attrs=read-write
    
    

    配置随OS启动

    # vi /etc/rc.local
    su - postgres -c "/usr/bin/patroni /usr/patroni/conf/patroni_postgresql.yml >> /var/log/postgresql/patroni.log  2>&1 &"
    
    

    或者配置成 patroni.service

    # vi /etc/systemd/system/patroni.service
    
    [Unit]
    Description=Runners to orchestrate a high-availability PostgreSQL
    After=syslog.target network.target etcd.target
    
    [Service]
    Type=simple
    
    User=postgres
    Group=postgres
    
    ExecStart=/usr/bin/patroni /usr/patroni/conf/patroni_postgresql.yml
    
    KillMode=process
    
    TimeoutSec=30
    
    Restart=no
    
    [Install]
    WantedBy=multi-user.targ
    
    # systemctl status patroni
    # systemctl start patroni
    # systemctl enable patroni
    
    # systemctl status postgresql
    # systemctl disable postgresql
    
    # systemctl status etcd
    # systemctl enable etcd
    
    

    禁止 postgresql 的自启动,通过 patroni 来管理 postgresql。

    总结:
    个人感觉 etcd + patroni 还是相当不错的,会继续对patroni 研究下。

    参考:
    https://github.com/zalando/patroni
    https://patroni.readthedocs.io/en/latest/
    https://pypi.org/project/patroni/

    https://github.com/zalando/patroni/blob/master/docs/replication_modes.rst
    https://postgresconf.org/system/events/document/000/000/228/Patroni_tutorial_4x3-2.pdf

  • 相关阅读:
    Django -- 模板系统
    CSRF_TOKEN
    MySQL的sql_mode模式说明及设置
    程序员必备的600单词
    前端 -- jQuery
    datatime模块
    github高效搜索
    Mac上Homebrew常用命令总结
    对比System.currentTimeMillis()、new Date().getTime()、System.nanoTime()
    MACOS安装使用kafka
  • 原文地址:https://www.cnblogs.com/ctypyb2002/p/9792939.html
Copyright © 2011-2022 走看看