  【ClickHouse】7: Installing multiple ClickHouse instances

    Background:

    ClickHouse is installed on three CentOS 7 servers:

    HostName                IP             Installed packages                   Instance 1 port  Instance 2 port
    centf8118.sharding1.db  192.168.81.18  clickhouse-server,clickhouse-client  9000             9002
    centf8119.sharding2.db  192.168.81.19  clickhouse-server,clickhouse-client  9000             9002
    centf8120.sharding3.db  192.168.81.20  clickhouse-server,clickhouse-client  9000             9002

     

     

    Multiple instances are installed so that a 3-shard, 2-replica cluster can be tested with only three servers. The final deployment is shown below:

             Replica 1           Replica 2
    Shard 1  192.168.81.18:9000  192.168.81.19:9002
    Shard 2  192.168.81.19:9000  192.168.81.20:9002
    Shard 3  192.168.81.20:9000  192.168.81.18:9002

     

    1: Install ClickHouse

    See the earlier post 【ClickHouse】1: ClickHouse installation (CentOS 7).

    2: Add a second ClickHouse instance service

    2.1: Copy /etc/clickhouse-server/config.xml under a new name

    [root@centf8118 clickhouse-server]# cp /etc/clickhouse-server/config.xml /etc/clickhouse-server/config9002.xml

    2.2: Edit /etc/clickhouse-server/config9002.xml and change the following settings so the two services do not conflict

    config9002.xml for the second instance, original values:
    <log>/var/log/clickhouse-server/clickhouse-server.log</log>
    <errorlog>/var/log/clickhouse-server/clickhouse-server.err.log</errorlog>
    <http_port>8123</http_port>
    <tcp_port>9000</tcp_port>
    <mysql_port>9004</mysql_port>
    <interserver_http_port>9009</interserver_http_port>
    <path>/data/clickhouse/</path>
    <tmp_path>/data/clickhouse/tmp/</tmp_path>
    <user_files_path>/data/clickhouse/user_files/</user_files_path>
    <access_control_path>/data/clickhouse/access/</access_control_path>
    <include_from>/etc/clickhouse-server/metrika.xml</include_from>   <!-- cluster configuration file -->
    
    
    
    config9002.xml for the second instance, adjusted values:
    <log>/var/log/clickhouse-server/clickhouse-server-9002.log</log>
    <errorlog>/var/log/clickhouse-server/clickhouse-server-9002.err.log</errorlog>
    <http_port>8124</http_port>
    <tcp_port>9002</tcp_port>
    <mysql_port>9005</mysql_port>
    <interserver_http_port>9010</interserver_http_port>
    <path>/data/clickhouse9002/</path>
    <tmp_path>/data/clickhouse9002/tmp/</tmp_path>
    <user_files_path>/data/clickhouse9002/user_files/</user_files_path>
    <access_control_path>/data/clickhouse9002/access/</access_control_path>
    <include_from>/etc/clickhouse-server/metrika9002.xml</include_from>
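
    A quick sanity check (just a sketch): after editing, only the lines listed above should differ between the two files.

    [root@centf8118 clickhouse-server]# diff /etc/clickhouse-server/config.xml /etc/clickhouse-server/config9002.xml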

    2.3: Create the corresponding directories

    [root@centf8118 data]# mkdir -p /data/clickhouse9002
    [root@centf8118 data]# chown -R clickhouse:clickhouse /data/clickhouse9002

     PS: Be sure to change the owner and group of the directory to clickhouse.

    2.4: Add an init script for the new instance

    [root@centf8118 init.d]# cp /etc/init.d/clickhouse-server /etc/init.d/clickhouse-server9002
    [root@centf8118 init.d]# vim /etc/init.d/clickhouse-server9002 
    The changes are as follows:
    After adjustment:
    CLICKHOUSE_CONFIG=$CLICKHOUSE_CONFDIR/config9002.xml
    CLICKHOUSE_PIDFILE="$CLICKHOUSE_PIDDIR/$PROGRAM-9002.pid"
    
    Before adjustment:
    CLICKHOUSE_CONFIG=$CLICKHOUSE_CONFDIR/config.xml
    CLICKHOUSE_PIDFILE="$CLICKHOUSE_PIDDIR/$PROGRAM.pid"

    2.5: After completing the steps above on centf81.18, repeat exactly the same operations on the other two servers (a scripted version of sections 2.1-2.4 is sketched below).
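
    The following is a minimal sketch of sections 2.1-2.4 as one script. It assumes the stock config.xml contains exactly the values shown above; review the generated files before starting anything.

    #!/bin/bash
    # Sketch: prepare the second instance (port 9002) on one host. Run as root on each of the three servers.
    set -e

    # 2.1 + 2.2: copy the config and switch ports, log files, data paths and the metrika file
    cp /etc/clickhouse-server/config.xml /etc/clickhouse-server/config9002.xml
    sed -i \
        -e 's#clickhouse-server\.log#clickhouse-server-9002.log#' \
        -e 's#clickhouse-server\.err\.log#clickhouse-server-9002.err.log#' \
        -e 's#<http_port>8123</http_port>#<http_port>8124</http_port>#' \
        -e 's#<tcp_port>9000</tcp_port>#<tcp_port>9002</tcp_port>#' \
        -e 's#<mysql_port>9004</mysql_port>#<mysql_port>9005</mysql_port>#' \
        -e 's#<interserver_http_port>9009</interserver_http_port>#<interserver_http_port>9010</interserver_http_port>#' \
        -e 's#/data/clickhouse/#/data/clickhouse9002/#g' \
        -e 's#metrika\.xml#metrika9002.xml#' \
        /etc/clickhouse-server/config9002.xml

    # 2.3: data directory for the second instance, owned by the clickhouse user
    mkdir -p /data/clickhouse9002
    chown -R clickhouse:clickhouse /data/clickhouse9002

    # 2.4: init script pointing at the new config and a separate pid file
    cp /etc/init.d/clickhouse-server /etc/init.d/clickhouse-server9002
    sed -i \
        -e 's#config\.xml#config9002.xml#' \
        -e 's#\$PROGRAM\.pid#$PROGRAM-9002.pid#' \
        /etc/init.d/clickhouse-server9002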

    3: Cluster configuration (3 shards, 2 replicas)

    3.1: The parts common to all six metrika*.xml files:

    <yandex>
        <!-- cluster configuration -->
        <clickhouse_remote_servers>
            <!-- 3 shards, 2 replicas -->
            <xinchen_3shards_2replicas>
                <shard>
                    <weight>1</weight>
                    <internal_replication>true</internal_replication>
                    <replica>
                        <host>192.168.81.18</host>
                        <port>9000</port>
                    </replica>
                    <replica>
                        <host>192.168.81.19</host>
                        <port>9002</port>
                    </replica>
                </shard>
                <shard>
                    <weight>1</weight>
                    <internal_replication>true</internal_replication>
                    <replica>
                        <host>192.168.81.19</host>
                        <port>9000</port>
                    </replica>
                    <replica>
                        <host>192.168.81.20</host>
                        <port>9002</port>
                    </replica>
                </shard>
                <shard>
                    <weight>1</weight>
                    <internal_replication>true</internal_replication>
                    <replica>
                        <host>192.168.81.20</host>
                        <port>9000</port>
                    </replica>
                    <replica>
                        <host>192.168.81.18</host>
                        <port>9002</port>
                    </replica>
                </shard>
            </xinchen_3shards_2replicas> 
        </clickhouse_remote_servers>
        
        <!-- zookeeper configuration -->
        <zookeeper-servers>
            <node index="1">
                <host>192.168.81.18</host>
                <port>4181</port>
            </node>
            <node index="2">
                <host>192.168.81.19</host>
                <port>4181</port>
            </node>
            <node index="3">
                <host>192.168.81.20</host>
                <port>4181</port>
            </node>
        </zookeeper-servers>
        
        <!-- macros configuration -->
        <macros>
            <!-- <replica>192.168.81.18</replica> -->
            <layer>01</layer>
            <shard>01</shard>
            <replica>cluster01-01-1</replica>
        </macros>
        
        <networks>
            <ip>::/0</ip>
        </networks>
        
        <clickhouse_compression>
            <case>
                <min_part_size>10000000000</min_part_size>
                <min_part_size_ratio>0.01</min_part_size_ratio>
                <method>lz4</method>
            </case>
        </clickhouse_compression>
    </yandex>
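
    Before starting the services, it is worth confirming that ZooKeeper answers on port 4181 from every node. A small sketch, assuming nc is installed and the ZooKeeper "ruok" four-letter command is allowed:

    for h in 192.168.81.18 192.168.81.19 192.168.81.20; do
        echo -n "$h: "
        echo ruok | nc $h 4181; echo    # prints "imok" when that ZooKeeper node is healthy
    done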

    3.2: The parts that differ between the metrika*.xml files:

    centf81.18 instance 1 (port 9000), metrika.xml:
    <macros>
        <!-- <replica>centf81.18</replica> -->
        <layer>01</layer>
        <shard>01</shard>
        <replica>cluster01-01-1</replica>
    </macros>
    
    
    centf81.18 instance 2 (port 9002), metrika9002.xml:
    <macros>
        <!-- <replica>centf81.18</replica> -->
        <layer>01</layer>
        <shard>03</shard>
        <replica>cluster01-03-2</replica>
    </macros>
    
    centf81.19 instance 1 (port 9000), metrika.xml:
    <macros>
        <!-- <replica>centf81.19</replica> -->
        <layer>01</layer>
        <shard>02</shard>
        <replica>cluster01-02-1</replica>
    </macros>
    
    
    centf81.19 instance 2 (port 9002), metrika9002.xml:
    <macros>
        <!-- <replica>centf81.19</replica> -->
        <layer>01</layer>
        <shard>01</shard>
        <replica>cluster01-01-2</replica>
    </macros>
    
    
    
    
    
    centf81.20 instance 1 (port 9000), metrika.xml:
    <macros>
        <!-- <replica>centf81.20</replica> -->
        <layer>01</layer>
        <shard>03</shard>
        <replica>cluster01-03-1</replica>
    </macros>
    
    
    centf81.20 instance 2 (port 9002), metrika9002.xml:
    <macros>
        <!-- <replica>centf81.20</replica> -->
        <layer>01</layer>
        <shard>02</shard>
        <replica>cluster01-02-2</replica>
    </macros>
    
    Note: the pattern should be clear from the above; if not, refer back to the two tables at the beginning.
    layer is the two-level sharding identifier (01 here); shard is the shard number; replica is the replica identifier.
    The naming scheme is cluster{layer}-{shard}-{replica}: for example, cluster01-02-1 means replica 1 of shard 02 in cluster01, which is both easy to read and uniquely identifies the replica.

      A word of warning: if you have been following the earlier posts in this series, drop the replicated tables you created there before doing this step, otherwise the service will fail to start.

      The reason is that the replica name in macros has changed: it used to be 01/02/03 and is now of the form cluster01-01-1, so previously created replicated tables can no longer find their replica data.

        <macros>
            <!-- <replica>192.168.81.18</replica> -->
            <layer>01</layer>
            <shard>01</shard>
            <replica>cluster01-01-1</replica>
        </macros>
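
    Once the services are running with the new configuration (section 3.3), each instance's effective macros can be verified through the system.macros table. A sketch, run from any machine with clickhouse-client installed:

    for h in 192.168.81.18 192.168.81.19 192.168.81.20; do
        for p in 9000 9002; do
            echo "== $h:$p =="
            clickhouse-client --host $h --port $p --query "SELECT * FROM system.macros"
        done
    done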

    3.3: Start the high-availability ClickHouse cluster

    [root@centf8118 clickhouse-server]# /etc/init.d/clickhouse-server start
    [root@centf8118 clickhouse-server]# /etc/init.d/clickhouse-server9002 start

    Run the two commands above on all three nodes (a quick listening-port check is sketched below).
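
    A minimal check that both instances came up on every node, assuming passwordless root ssh and that ss is available:

    for h in 192.168.81.18 192.168.81.19 192.168.81.20; do
        echo "== $h =="
        ssh root@$h "ss -lntp | grep -E ':(8123|8124|9000|9002|9009|9010) '"
    done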

    3.4: Log in and check the cluster information

    centf8119.sharding2.db :) select * from system.clusters;
    
    SELECT *
    FROM system.clusters
    
    ┌─cluster────────────────────┬─shard_num─┬─shard_weight─┬─replica_num─┬─host_name─────┬─host_address──┬─port─┬─is_local─┬─user────┬─default_database─┬─errors_count─┬─estimated_recovery_time─┐
    │ xinchen_3shards_2replicas  │         1 │            1 │           1 │ 192.168.81.18 │ 192.168.81.18 │ 9000 │        0 │ default │                  │            0 │                       0 │
    │ xinchen_3shards_2replicas  │         1 │            1 │           2 │ 192.168.81.19 │ 192.168.81.19 │ 9002 │        0 │ default │                  │            0 │                       0 │
    │ xinchen_3shards_2replicas  │         2 │            1 │           1 │ 192.168.81.19 │ 192.168.81.19 │ 9000 │        1 │ default │                  │            0 │                       0 │
    │ xinchen_3shards_2replicas  │         2 │            1 │           2 │ 192.168.81.20 │ 192.168.81.20 │ 9002 │        0 │ default │                  │            0 │                       0 │
    │ xinchen_3shards_2replicas  │         3 │            1 │           1 │ 192.168.81.20 │ 192.168.81.20 │ 9000 │        0 │ default │                  │            0 │                       0 │
    │ xinchen_3shards_2replicas  │         3 │            1 │           2 │ 192.168.81.18 │ 192.168.81.18 │ 9002 │        0 │ default │                  │            0 │                       0 │
    └────────────────────────────┴───────────┴──────────────┴─────────────┴───────────────┴───────────────┴──────┴──────────┴─────────┴──────────────────┴──────────────┴─────────────────────────┘

    Log in to every instance and check the cluster information to verify that the configuration is correct. If everything matches the plan above, the cluster has been set up successfully. (A scripted version of this check is sketched after the list of connections below.)

    clickhouse-client --host 192.168.81.18 --port 9000
    clickhouse-client --host 192.168.81.18 --port 9002
    clickhouse-client --host 192.168.81.19 --port 9000
    clickhouse-client --host 192.168.81.19 --port 9002
    clickhouse-client --host 192.168.81.20 --port 9000
    clickhouse-client --host 192.168.81.20 --port 9002
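
    The same verification as a loop (a sketch): connect to every instance and run the system.clusters query:

    for h in 192.168.81.18 192.168.81.19 192.168.81.20; do
        for p in 9000 9002; do
            echo "== $h:$p =="
            clickhouse-client --host $h --port $p --query \
            "SELECT shard_num, replica_num, host_address, port, is_local
             FROM system.clusters WHERE cluster = 'xinchen_3shards_2replicas'"
        done
    done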

    4: Verifying cluster high availability

    4.1: How high availability works

            ZooKeeper + ReplicatedMergeTree (replicated tables) + Distributed (distributed tables)

    4.2: First, create the ReplicatedMergeTree tables.

     The table has to be created on all six instances across the three nodes. The SQL is as follows (a loop that runs it everywhere is sketched right after the statement):

    CREATE TABLE test_clusters_ha
    (
        dt Date,
        path String 
    )
    ENGINE = ReplicatedMergeTree('/clickhouse/tables/{layer}-{shard}/test_clusters_ha','{replica}',dt, dt, 8192);
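
    A sketch of running the same DDL against all six instances from one shell (the statement above, with IF NOT EXISTS added):

    DDL="CREATE TABLE IF NOT EXISTS test_clusters_ha (dt Date, path String)
    ENGINE = ReplicatedMergeTree('/clickhouse/tables/{layer}-{shard}/test_clusters_ha', '{replica}', dt, dt, 8192)"

    for h in 192.168.81.18 192.168.81.19 192.168.81.20; do
        for p in 9000 9002; do
            clickhouse-client --host $h --port $p --query "$DDL"
        done
    done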

    Explanation:

    ENGINE = ReplicatedMergeTree('/clickhouse/tables/{layer}-{shard}/test_clusters_ha','{replica}',dt, dt, 8192);

    The first argument is the path of this table in ZooKeeper.

    The second argument is the replica name of this table in ZooKeeper.

    {layer}, {shard}, and {replica} are taken from the macros section of each instance's metrika*.xml.
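
    For reference, the legacy engine arguments (dt, dt, 8192) correspond to monthly partitioning on dt, ORDER BY dt, and index_granularity = 8192. A sketch of an equivalent definition in the current syntax (not the form used in this post):

    clickhouse-client --host 192.168.81.18 --port 9000 --query "
    CREATE TABLE IF NOT EXISTS test_clusters_ha
    (
        dt   Date,
        path String
    )
    ENGINE = ReplicatedMergeTree('/clickhouse/tables/{layer}-{shard}/test_clusters_ha', '{replica}')
    PARTITION BY toYYYYMM(dt)
    ORDER BY dt
    SETTINGS index_granularity = 8192"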

    4.3: Create the distributed table.

    This only needs to be created on one node; assume it is created on the 9000 instance of sharding1.

    CREATE TABLE test_clusters_ha_all AS test_clusters_ha ENGINE = Distributed(xinchen_3shards_2replicas, default, test_clusters_ha, rand());
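
    As a variant (a sketch, assuming distributed DDL is enabled, which the stock config.xml provides via its <distributed_ddl> section), the distributed table could instead be created on every instance in one statement with ON CLUSTER:

    clickhouse-client --host 192.168.81.18 --port 9000 --query "
    CREATE TABLE IF NOT EXISTS test_clusters_ha_all ON CLUSTER xinchen_3shards_2replicas
    AS test_clusters_ha
    ENGINE = Distributed(xinchen_3shards_2replicas, default, test_clusters_ha, rand())"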

    4.4: Insert and inspect data

    insert into test_clusters_ha_all values('2020-09-01','path1');
    insert into test_clusters_ha_all values('2020-09-02','path2');
    insert into test_clusters_ha_all values('2020-09-03','path3');
    insert into test_clusters_ha_all values('2020-09-04','path4');
    insert into test_clusters_ha_all values('2020-09-05','path5');
    insert into test_clusters_ha_all values('2020-09-06','path6');
    insert into test_clusters_ha_all values('2020-09-07','path7');
    insert into test_clusters_ha_all values('2020-09-08','path8');
    insert into test_clusters_ha_all values('2020-09-09','path9');

    The query results are omitted here. There are 9 rows in total, spread across the shards on the three 9000 instances; the corresponding 9002 instances hold the replica of each shard. A per-instance check is sketched below.
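
    A sketch of the per-instance check; each local table should hold one shard's rows, and the two replicas of a shard should report the same count:

    for h in 192.168.81.18 192.168.81.19 192.168.81.20; do
        for p in 9000 9002; do
            echo -n "$h:$p local rows: "
            clickhouse-client --host $h --port $p --query "SELECT count() FROM test_clusters_ha"
        done
    done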

    4.5: Verify query consistency for existing data when a node goes down

    a. Stop both instance services on sharding2 to simulate the sharding2 node going down.

    [root@centf8119 ~]# service clickhouse-server stop
    Stop clickhouse-server service: DONE
    [root@centf8119 ~]# service clickhouse-server9002 stop
    Stop clickhouse-server service: DONE

    b. First check the total row count through the distributed table.

    centf8118.sharding1.db :) select count(*) from test_clusters_ha_all;
    
    SELECT count(*)
    FROM test_clusters_ha_all
    
    ┌─count()─┐
    │       9 │
    └─────────┘
    
    1 rows in set. Elapsed: 0.010 sec. 

    So data consistency is preserved when a single node is down. The same result holds if only sharding3 goes down instead.

    c. What happens if sharding2 and sharding3 are down at the same time?

    [root@centf8120 ~]# service clickhouse-server stop
    Stop clickhouse-server service: DONE
    [root@centf8120 ~]# service clickhouse-server9002 stop
    Stop clickhouse-server service: DONE

    d. With both sharding3 instances also stopped, query the distributed table again:

    centf8118.sharding1.db :) select count(*) from test_clusters_ha_all;
    
    SELECT count(*)
    FROM test_clusters_ha_all
    
    ↗ Progress: 2.00 rows, 8.21 KB (17.20 rows/s., 70.58 KB/s.) 
    Received exception from server (version 20.6.4):
    Code: 279. DB::Exception: Received from 192.168.81.18:9000. DB::Exception: All connection tries failed. Log: 
    
    Code: 32, e.displayText() = DB::Exception: Attempt to read after eof (version 20.6.4.44 (official build))
    Code: 210, e.displayText() = DB::NetException: Connection refused (192.168.81.19:9000) (version 20.6.4.44 (official build))
    Code: 210, e.displayText() = DB::NetException: Connection refused (192.168.81.20:9002) (version 20.6.4.44 (official build))
    Code: 210, e.displayText() = DB::NetException: Connection refused (192.168.81.19:9000) (version 20.6.4.44 (official build))
    Code: 210, e.displayText() = DB::NetException: Connection refused (192.168.81.20:9002) (version 20.6.4.44 (official build))
    Code: 210, e.displayText() = DB::NetException: Connection refused (192.168.81.19:9000) (version 20.6.4.44 (official build))
    
    : While executing Remote. 
    
    0 rows in set. Elapsed: 0.119 sec. 

    The query now fails outright: with both sharding2 and sharding3 down, shard 2 has lost both of its replicas (192.168.81.19:9000 and 192.168.81.20:9002), which is exactly what the connection-refused errors show. In this layout, sharding1 must stay up (it also hosts the distributed table), and only one of sharding2/sharding3 may be down at any time.

    e. Then start the two sharding3 instances again and check that the distributed table can be queried normally. (It can; the data is returned.)

    f. Insert 9 new rows through the distributed table on sharding1, then check whether the local tables on sharding1 and sharding3 received new data. (They did.)

      sharding1:9000, querying the distributed table: 18 rows in total

      Local tables (sharding1:9000 + sharding3:9000 + sharding3:9002) = 18 rows = the distributed table's total

    g. Then restart the sharding2:9000 service and check whether the local table on sharding2:9000 matches the local table on sharding3:9002. (It does.)

    h. Finally restart the sharding2:9002 service and check whether the local table on sharding2:9002 matches the local table on sharding1:9000. (It does.)
