  • ClickHouse High Availability Cluster

    I. Cluster Overview

    ZooKeeper should be deployed on separate machines so that a failed ClickHouse node cannot take ZooKeeper down with it.
    0. High-availability principle: ZooKeeper + ReplicatedMergeTree (replicated tables) + Distributed (distributed tables)
    1. Prerequisites: the firewall is disabled (or the required ports are opened) on every node; passwordless SSH (mutual trust) is set up between all nodes; the hosts file and hostnames are configured correctly and consistently across the cluster, because ZooKeeper returns hostnames and replicated tables will fail if they are misconfigured or missing.
    ClickHouse test nodes (4): 10.0.0.236 cdhserver1 (clickhouse1), 10.0.0.237 cdhserver2 (clickhouse2), 10.0.0.238 cdhserver3 (clickhouse3), 10.0.0.239 cdhserver4 (clickhouse4)
    ZooKeeper test nodes (3): 10.0.0.237 cdhserver2 (zookeeper), 10.0.0.238 cdhserver3 (zookeeper), 10.0.0.239 cdhserver4 (zookeeper)
    Layout: the ClickHouse instances on the 4 nodes back each other up.
    cdhserver1: instance 1, ports: tcp 9006, http 8123, replication port 9006, role: main node
    cdhserver2: instance 2, ports: tcp 9006, http 8123, replication port 9006, role: shard 1, replica 1
    cdhserver3: instance 1, ports: tcp 9006, http 8123, replication port 9006, role: shard 2, replica 1
    cdhserver4: instance 2, ports: tcp 9006, http 8123, replication port 9006, role: shard 3, replica 1

    II. Environment Preparation
    1) Host configuration (adjust to your own environment)

    10.0.0.236 cdhserver1  CentOS 7.1  32 GB RAM  200 GB disk
    10.0.0.237 cdhserver2  CentOS 7.1  32 GB RAM  200 GB disk
    10.0.0.238 cdhserver3  CentOS 7.1  32 GB RAM  200 GB disk
    10.0.0.239 cdhserver4  CentOS 7.1  32 GB RAM  200 GB disk

    2) hosts file and hostnames
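    For reference, a minimal /etc/hosts matching the addresses above would contain the following entries on every node (adjust to your own environment), and the hostname itself can be set on each node with, e.g., hostnamectl set-hostname cdhserver1:
    10.0.0.236 cdhserver1
    10.0.0.237 cdhserver2
    10.0.0.238 cdhserver3
    10.0.0.239 cdhserver4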

    3) Disable the firewall or open the required ports on all nodes
    # 1. Disable the firewall
    service iptables stop
    chkconfig iptables off
    chkconfig ip6tables off
    # 2. Disable SELinux
    Set SELINUX=disabled in /etc/selinux/config and reboot:
    [root@cdhserver1 ~]# vim /etc/selinux/config
    SELINUX=disabled
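    Note: the commands above are the CentOS 6 style; on CentOS 7.x the firewall is normally managed by firewalld, so the equivalent steps (a sketch, assuming firewalld is in use) would be:
    systemctl stop firewalld
    systemctl disable firewalld
    # disable SELinux enforcement for the current session without a reboot
    setenforce 0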

    4) Raise the open-file limit on all nodes
    Append the following to the end of both /etc/security/limits.conf and /etc/security/limits.d/90-nproc.conf:
    [root@cdhserver1 software]# vim /etc/security/limits.conf
    Append at the end of the file:
    * soft nofile 65536
    * hard nofile 65536
    * soft nproc 131072
    * hard nproc 131072
    [root@cdhserver1 software]# vim /etc/security/limits.d/90-nproc.conf
    Append at the end of the file:
    * soft nofile 65536
    * hard nofile 65536
    * soft nproc 131072
    * hard nproc 131072
    After rebooting the server, check the result with ulimit -n or ulimit -a:
    [root@cdhserver1 ~]# ulimit -n
    65536

    5) Set up passwordless SSH (mutual trust) between all nodes
    1. On each node, generate an RSA key pair with ssh-keygen:
    ssh-keygen -q -t rsa -N "" -f ~/.ssh/id_rsa
    2. Collect all of the public keys into one authorized_keys file. On cdhserver1, run the following commands one line at a time:
    ssh cdhserver1 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
    ssh cdhserver2 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
    ssh cdhserver3 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
    ssh cdhserver4 cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
    3. Set the permissions of the authorized_keys file; on cdhserver1 run the following:
    chmod 600 ~/.ssh/authorized_keys
    4. Distribute the authorized_keys file to all servers, running each line one at a time:
    scp ~/.ssh/authorized_keys cdhserver1:~/.ssh/
    scp ~/.ssh/authorized_keys cdhserver2:~/.ssh/
    scp ~/.ssh/authorized_keys cdhserver3:~/.ssh/
    scp ~/.ssh/authorized_keys cdhserver4:~/.ssh/
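    Passwordless login can then be verified from cdhserver1 with a quick loop; each command should print the remote hostname without prompting for a password:
    for h in cdhserver1 cdhserver2 cdhserver3 cdhserver4; do ssh $h hostname; done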

    III. Distributed Cluster Installation
    1. Package preparation

    Notes:
    1) Installation packages can be obtained from https://packagecloud.io/Altinity/clickhouse.
    2) They can also be downloaded from CSDN: https://download.csdn.net/download/u013205003/10968848
    2. Install ClickHouse on all nodes (CentOS 7.1 as the example)
    1) Perform the following installation and configuration on each server
    2) Install ClickHouse
    Install libicu (copy libicu-4.2.1-14.el6.x86_64.rpm into /usr/local/icu/ first):
    mkdir -p /usr/local/icu/
    cd /usr/local/icu/
    rpm -ivh libicu-4.2.1-14.el6.x86_64.rpm
    Install ClickHouse:
    rpm -ivh clickhouse-server-common-18.14.12-1.el6.x86_64.rpm
    rpm -ivh clickhouse-compressor-1.1.54336-3.el6.x86_64.rpm
    rpm -ivh clickhouse-common-static-18.14.12-1.el6.x86_64.rpm
    rpm -ivh clickhouse-server-18.14.12-1.el6.x86_64.rpm
    rpm -ivh clickhouse-client-18.14.12-1.el6.x86_64.rpm
    rpm -ivh clickhouse-test-18.14.12-1.el6.x86_64.rpm
    rpm -ivh clickhouse-debuginfo-18.14.12-1.el6.x86_64.rpm
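    To confirm the packages installed cleanly, the installed package list and the server binary version can be checked, for example:
    rpm -qa | grep clickhouse
    clickhouse-server --version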

    3. Configuration file changes
    1) Edit config.xml on all four nodes
    [root@cdhserver1 ~]# vim /etc/clickhouse-server/config.xml
    <http_port>8123</http_port>
    <tcp_port>9006</tcp_port>
    <listen_host>::</listen_host>
    <!-- <listen_host>::1</listen_host> -->
    <!-- <listen_host>127.0.0.1</listen_host> -->

    <!-- <max_table_size_to_drop>0</max_table_size_to_drop> -->
    <!-- <max_partition_size_to_drop>0</max_partition_size_to_drop> -->
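    Depending on the package version, config.xml may also need an <include_from> element so that the metrika.xml created in step 3) below is actually picked up; without it, ClickHouse looks for /etc/metrika.xml by default. A sketch of the extra line:
    <include_from>/etc/clickhouse-server/metrika.xml</include_from>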

    2) Edit users.xml on all four nodes to add a user

    [root@cdhserver1 ~]# vim /etc/clickhouse-server/users.xml

    Edit /etc/clickhouse-server/users.xml
    and add the following above the <!-- Example of user with readonly access. --> comment:

    <ck>
        <password_sha256_hex>8d969eef6ecad3c29a3a629280e686cf0c3f5d5a86aff3ca12020c923adc6c92</password_sha256_hex>
        <networks incl="networks" replace="replace">
        <ip>::/0</ip>
        </networks>
        <profile>default</profile>
        <quota>default</quota>
    </ck>
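    The password_sha256_hex value above is simply the SHA-256 hash of the plain-text password 123456 used in metrika.xml below; it can be generated with:
    echo -n "123456" | sha256sum
    # 8d969eef6ecad3c29a3a629280e686cf0c3f5d5a86aff3ca12020c923adc6c92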

    3) On all four nodes, create a metrika.xml file in /etc/clickhouse-server with the cluster sharding configuration
    [root@cdhserver1 ~]# vim /etc/clickhouse-server/metrika.xml
    Add the following content:

    <?xml version="1.0"?>
    <yandex>
    <!-- ClickHouse cluster nodes -->
    <clickhouse_remote_servers>
        <idc_cluster>
            <!-- shard 1 -->
            <shard>
                <weight>1</weight>
                <replica>
                    <host>cdhserver2</host>
                    <port>9006</port>
                    <user>ck</user>
                    <password>123456</password>
                   <compression>true</compression>
                </replica>
            </shard>
            <!-- shard 2 -->
            <shard>
                <weight>1</weight>
                <replica>
                    <host>cdhserver3</host>
                    <port>9006</port>
                    <user>ck</user>
                    <password>123456</password>
                    <compression>true</compression>
                </replica>
            </shard>
            <!-- shard 3 -->
            <shard>
                <weight>1</weight>
                <replica>
                    <host>cdhserver4</host>
                    <port>9006</port>
                    <user>ck</user>
                    <password>123456</password>
                    <compression>true</compression>
                </replica>
            </shard>
        </idc_cluster>
        <idc_replicate_cluster>
            <!-- shard 1 -->
            <shard>
                <weight>1</weight>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>cdhserver2</host>
                    <port>9006</port>
                    <user>ck</user>
                    <password>123456</password>
                </replica>
                <replica>
                    <host>hadoop2</host>
                    <port>9006</port>
                    <user>ck</user>
                    <password>123456</password>
                </replica>
            </shard>
            <!-- shard 2 -->
            <shard>
                <weight>1</weight>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>cdhserver3</host>
                    <port>9006</port>
                    <user>ck</user>
                    <password>123456</password>
                </replica>
                <replica>
                    <host>hadoop3</host>
                    <port>9006</port>
                    <user>ck</user>
                    <password>123456</password>
                </replica>
            </shard>
            <!-- shard 3 -->
            <shard>
                <weight>1</weight>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>cdhserver4</host>
                    <port>9006</port>
                    <user>ck</user>
                    <password>123456</password>
                </replica>
                <replica>
                    <host>hadoop4</host>
                    <port>9006</port>
                    <user>ck</user>
                    <password>123456</password>
                </replica>
            </shard>
        </idc_replicate_cluster>
    </clickhouse_remote_servers>
    
    <!-- ZooKeeper configuration -->
    <zookeeper-servers>
        <node index="1">
            <host>cdhserver2</host>
            <port>2181</port>
        </node>
        <node index="2">
            <host>cdhserver3</host>
            <port>2181</port>
        </node>
        <node index="3">
            <host>cdhserver4</host>
            <port>2181</port>
        </node>
    </zookeeper-servers>
    
    <!-- shard and replica (macros) configuration -->
    <macros>
        <replica>cdhserver1</replica>
    </macros>
    
    <!-- allow remote access -->
    <networks>
        <ip>::/0</ip>
    </networks>
    
    <!-- compression settings -->
    <clickhouse_compression>
        <case>
            <min_part_size>10000000000</min_part_size>
            <min_part_size_ratio>0.01</min_part_size_ratio>
            <method>lz4</method> <!-- lz4 is faster than zstd but compresses less well, so it uses more disk -->
        </case>
    </clickhouse_compression>
    </yandex>

    Note: the parts highlighted above must be adjusted for each node. The replica hosts (the hadoop2/hadoop3/hadoop4 entries in idc_replicate_cluster are placeholders for the second replica of each shard and should be replaced with hosts from your own environment) and, above all, the <macros> block differ from node to node:

    cdhserver2 (shard 1, replica 1):

    <macros>
        <layer>01</layer>
        <shard>01</shard>
        <replica>cdhserver2</replica>
    </macros>

    cdhserver3 (shard 2, replica 1):

    <macros>
        <layer>01</layer>
        <shard>02</shard>
        <replica>cdhserver3</replica>
    </macros>  

    cdhserver4 (shard 3, replica 1):

    <macros>
        <layer>01</layer>
        <shard>03</shard>
        <replica>cdhserver4</replica>
    </macros>
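    One convenient way to roll the file out is to copy it from cdhserver1 to the other nodes and then edit only the <macros> block on each of them, for example:
    for h in cdhserver2 cdhserver3 cdhserver4; do scp /etc/clickhouse-server/metrika.xml $h:/etc/clickhouse-server/; done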
    

    4. Startup
    The service script is installed at /etc/init.d/clickhouse-server.

    Start in the foreground:
    [root@cdhserver1 software]# clickhouse-server --config-file=/etc/clickhouse-server/config.xml
    Start in the background:
    [root@cdhserver1 software]# nohup clickhouse-server --config-file=/etc/clickhouse-server/config.xml >/dev/null 2>&1 &

    Running a second instance (copy and modify)
    Go to the configuration directory and make a copy of the existing config file, here config1.xml, then adjust it:
    in config1.xml change the ports and set the pid file to clickhouse-server-1.pid.
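    Once an instance is up, you can connect with clickhouse-client and check that the cluster layout and macros from metrika.xml were loaded; a quick sanity check, using the ck user and tcp port configured above:
    clickhouse-client -h cdhserver2 --port 9006 -u ck --password 123456
    select * from system.clusters;
    select * from system.macros;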

    IV. Testing: Create the Local Replicated Table and the Distributed Table
    1. After all instances are configured as above, start each of them, then run the following on every instance to create the database:
    CREATE DATABASE monchickey;
    2. Then create the corresponding replicated table on each instance. Here we create a simple test table; note that the Distributed table must reference a cluster name defined in metrika.xml (idc_cluster in this setup), and the {shard}/{replica} macro expansion is illustrated right after the SQL:

    drop table if exists monchickey.t_co_dwfxdata_0809_30000000_new;
    CREATE TABLE monchickey.t_co_dwfxdata_0809_30000000_new (
        CRETIME Date,
        YEAR String,
        MONTH String,
        COMPNAME String,
        CBYS String,
        YWHD String,
        DYDJ String,
        ZCLX String,
        AMOUNT Float64,
        YHLB String,
        YSLX String
    ) ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/t_co_dwfxdata_0809_30000000_new', '{replica}', CRETIME, (COMPNAME), 8192);

    drop table if exists monchickey.t_co_dwfxdata_0809_30000000_all;
    CREATE TABLE monchickey.t_co_dwfxdata_0809_30000000_all AS monchickey.t_co_dwfxdata_0809_30000000_new
        ENGINE = Distributed(idc_cluster, monchickey, t_co_dwfxdata_0809_30000000_new, rand());

    select count(*) from monchickey.t_co_dwfxdata_0809_30000000_new;
    select count(*) from  monchickey.t_co_dwfxdata_0809_30000000_all;
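    To make the {shard}/{replica} substitution concrete: with the macros configured for cdhserver2 (layer 01, shard 01, replica cdhserver2), the engine clause above expands to the following, so each shard registers its own ZooKeeper path and each replica its own name:
    ENGINE = ReplicatedMergeTree('/clickhouse/tables/01/t_co_dwfxdata_0809_30000000_new', 'cdhserver2', CRETIME, (COMPNAME), 8192)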

    3. Sample data

    Insert the following test rows, spread across the nodes as shown:

    cdhserver1 / clickhouse1

    insert into monchickey.t_co_dwfxdata_0809_30000000_all VALUES ('2016-01-01','2016年','1月','单位8','其他长期职工福利','检修-输电运检','500kV及以上','架空线路',9.4441398224E8,'','工');
    insert into monchickey.t_co_dwfxdata_0809_30000000_all VALUES ('2016-01-01','2016年','1月','单位8','其他短期薪酬','检修-输电运检','500kV及以上','架空线路',2.0894114651E8,'','工');

    cdhserver2 /clickhouse2

    insert into monchickey.t_co_dwfxdata_0809_30000000_all VALUES ('2016-01-01','2016年','1月','单位8','其他长期职工福利','检修-输电运检','500kV及以上','架空线路',9.4441398224E8,'','工');
    insert into monchickey.t_co_dwfxdata_0809_30000000_all VALUES ('2016-01-01','2016年','1月','单位8','临时用工薪酬','检修-输电运检','500kV及以上','架空线路',8.5247987777E8,'','工');

    cdhserver3 / clickhouse3

    insert into monchickey.t_co_dwfxdata_0809_30000000_all VALUES ('2016-01-01','2016年','1月','单位8','劳务派遣费','检修-输电运检','500kV及以上','架空线路',3.59378772E8,'','工');
    insert into monchickey.t_co_dwfxdata_0809_30000000_all VALUES ('2016-01-01','2016年','1月','单位8','委托服务费','检修-输电运检','500kV及以上','架空线路',4.7638581405E8,'','费');
    insert into monchickey.t_co_dwfxdata_0809_30000000_all VALUES ('2016-01-01','2016年','1月','单位8','财产保险费','检修-输电运检','500kV及以上','架空线路',9.527716281E8,'','费');
    Query the data from any machine:

    a. All 7 rows can be retrieved
    b. The total count is 7
    select * from monchickey.t_co_dwfxdata_0809_30000000_all;      -- returns the full 7 rows
    select count(*) from monchickey.t_co_dwfxdata_0809_30000000_all; -- total is 7
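    To see how the 7 rows were actually spread across the shards, you can count the local replicated table on each node; a sketch run from any machine with the ck user configured above (the three counts should add up to 7):
    for h in cdhserver2 cdhserver3 cdhserver4; do
        clickhouse-client -h $h --port 9006 -u ck --password 123456 \
            -q "select hostName(), count(*) from monchickey.t_co_dwfxdata_0809_30000000_new"
    done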

    V. Cluster Table Operations

    1. Add a column to a table

    alter TABLE idc.web_initial ON CLUSTER idc_cluster add COLUMN tracerouteip String AFTER jitter;

    2. Change a column's type

    alter TABLE idc.web_initial ON CLUSTER idc_cluster modify column tracerouteip UInt16;

    3. Drop a column

    alter TABLE idc.web_initial ON CLUSTER idc_cluster drop column tracerouteip;

    4. Drop the same table on multiple nodes of a cluster

    drop table <table_name> on cluster <cluster_name>;

    drop TABLE if exists idc.web_initial on CLUSTER idc_cluster ;

    5. Truncate a table's data across the cluster

    truncate table lmmbase.user_label_uid on cluster crm_4shards_1replicas;

    VI. Tuning

    1. max_table_size_to_drop
    By default, ClickHouse refuses to drop a partition or table larger than 50 GB. This can be changed permanently in the server configuration file, or bypassed temporarily for a single drop without restarting the service.
    Permanent configuration
    sudo vim /etc/clickhouse-server/config.xml
    Uncomment the following two lines:
    <max_table_size_to_drop>0</max_table_size_to_drop>
    <max_partition_size_to_drop>0</max_partition_size_to_drop>
    0 means no limit; alternatively, set them to the maximum size you want to allow.
    Temporary override
    Create a flag file under the ClickHouse data directory (here /home/username/clickhouse; on a default install this is typically /var/lib/clickhouse):
    sudo touch '/home/username/clickhouse/flags/force_drop_table' && sudo chmod 666 '/home/username/clickhouse/flags/force_drop_table'
    Once the file exists, the drop partition/table statement can be executed.
    2. max_memory_usage
    This setting is in /etc/clickhouse-server/users.xml and is the maximum amount of memory a single query may use; a query that exceeds it fails. If resources allow, set it as high as is reasonable.
    <max_memory_usage>25000000000</max_memory_usage>
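    In users.xml this setting sits inside a profile (the default profile in this setup), roughly like the following sketch:
    <profiles>
        <default>
            <max_memory_usage>25000000000</max_memory_usage>
        </default>
    </profiles>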

    Reference: https://blog.csdn.net/u013205003/article/details/87861613?utm_medium=distribute.pc_aggpage_search_result.none-task-blog-2~all~first_rank_v2~rank_v25-3-87861613.nonecase&utm_term=%E6%90%AD%E5%BB%BAclickhouse%E9%AB%98%E5%8F%AF%E7%94%A8%E9%9B%86%E7%BE%A4%E6%90%AD%E5%BB%BA
