  • Installing and Deploying ClickHouse on CentOS 7 (Standalone and Cluster)

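    1.1 What is ClickHouse

    ClickHouse is a column-oriented database management system (DBMS) that the Russian company Yandex open-sourced in 2016. It is designed for online analytical processing (OLAP) and can generate analytical reports in real time from SQL queries.

    1.2 What is columnar storage

    Take the following table as an example: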

    Id Name Age
    1 张三 18
    2 李四 22
    3 王五 34

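    With row-oriented storage, the data is laid out on disk as:

    1 张三 18 2 李四 22 3 王五 34

    The advantage is that all attributes of one person can be fetched with a single disk seek followed by a sequential read. To get everyone's age, however, the reader must either seek repeatedly or scan the whole table, and most of the data it passes over is not needed.

    With column-oriented storage, the data is laid out on disk as:

    1 2 3 张三 李四 王五 18 22 34

    Now, to get everyone's age, only the age column has to be read.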
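
    The difference between the two layouts can be sketched in a few lines of shell. This is only an illustration using the three sample records from the table above; the variable and function names are made up for the example, and real storage engines work on disk blocks rather than strings.

```shell
# Row-oriented layout: the fields of each record sit next to each other.
row_store="1 张三 18 2 李四 22 3 王五 34"

# Column-oriented layout: each column is stored contiguously.
col_id="1 2 3"
col_name="张三 李四 王五"
col_age="18 22 34"

# Reading ages from the row layout means walking all nine fields
# and keeping every third one.
ages_from_rows() {
    i=0
    out=""
    for field in $row_store; do
        i=$((i + 1))
        if [ $((i % 3)) -eq 0 ]; then
            out="${out:+$out }$field"
        fi
    done
    echo "$out"
}

# Reading ages from the column layout touches the age column only.
ages_from_columns() {
    echo "$col_age"
}

ages_from_rows      # prints: 18 22 34
ages_from_columns   # prints: 18 22 34
```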

    1. Installing ClickHouse on CentOS 7

    1.1 Pre-installation steps

    1.1.1 Raise the open-file limit on CentOS

    Append the following to the end of both /etc/security/limits.conf and /etc/security/limits.d/20-nproc.conf:

    [root@master ~]# vim /etc/security/limits.conf
    
    Append at the end of the file:
    
    *       soft    nofile  65536
    
    *       hard    nofile  65536
    
    *       soft    nproc   131072
    
    *       hard    nproc   131072
    
    [root@master ~]# vim /etc/security/limits.d/20-nproc.conf
    
    Append at the end of the file:
    *       soft    nofile  65536
    
    *       hard    nofile  65536
    
    *       soft    nproc   131072
    
    *       hard    nproc   131072
    
    

    The new limits take effect after the server is rebooted. Check the result with ulimit -n or ulimit -a.
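
    Editing both files by hand on several machines is easy to get wrong, so the step above could be scripted roughly as follows. This is a minimal sketch: append_limits is a hypothetical helper (not part of any package), and the demo runs against a temporary file rather than the real files under /etc/security.

```shell
# append_limits FILE: add the four limit lines to FILE, skipping any
# line that is already present so the function can be run repeatedly.
append_limits() {
    target="$1"
    while read -r kind item value; do
        line="*       $kind    $item  $value"
        # -x matches the whole line, -F treats "*" literally.
        grep -qxF -- "$line" "$target" 2>/dev/null \
            || printf '%s\n' "$line" >> "$target"
    done <<'EOF'
soft nofile 65536
hard nofile 65536
soft nproc 131072
hard nproc 131072
EOF
}

# Demo on a scratch file; the second call adds nothing.
demo_limits="$(mktemp)"
append_limits "$demo_limits"
append_limits "$demo_limits"
cat "$demo_limits"

# On a real host (as root) the calls would be:
#   append_limits /etc/security/limits.conf
#   append_limits /etc/security/limits.d/20-nproc.conf
```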

    1.1.2 Disable SELinux on CentOS

    Set SELINUX=disabled in /etc/selinux/config, then reboot.

    [root@master ~]# vim /etc/selinux/config
    
    # This file controls the state of SELinux on the system.
    # SELINUX= can take one of these three values:
    #     enforcing - SELinux security policy is enforced.
    #     permissive - SELinux prints warnings instead of enforcing.
    #     disabled - No SELinux policy is loaded.
    SELINUX=disabled
    # SELINUXTYPE= can take one of three values:
    #     targeted - Targeted processes are protected,
    #     minimum - Modification of targeted policy. Only selected processes are protected.
    #     mls - Multi Level Security protection.
    SELINUXTYPE=targeted
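
    The same edit can be made without opening an editor. The sketch below assumes the file contains an uncommented SELINUX= line, as in the listing above; disable_selinux_in is a hypothetical helper name, and the demo works on a temporary copy instead of /etc/selinux/config.

```shell
# disable_selinux_in FILE: rewrite the active SELINUX= line to "disabled".
# Comment lines and SELINUXTYPE= are left untouched because the pattern
# requires "=" directly after "SELINUX" at the start of the line.
disable_selinux_in() {
    config="$1"
    tmp="$(mktemp)"
    sed 's/^SELINUX=.*/SELINUX=disabled/' "$config" > "$tmp" \
        && cat "$tmp" > "$config"
    rm -f "$tmp"
}

# Demo on a scratch file that mimics the relevant lines.
demo_selinux="$(mktemp)"
printf '# SELINUX= can take one of these three values:\nSELINUX=enforcing\nSELINUXTYPE=targeted\n' > "$demo_selinux"
disable_selinux_in "$demo_selinux"
cat "$demo_selinux"

# On a real host (as root): disable_selinux_in /etc/selinux/config
```

    The change in the file only applies after a reboot. Until then, running setenforce 0 switches a running system to permissive mode, and getenforce shows the current mode.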
    
    

    1.1.3 Turn off the firewall

    1. At the command line, run "systemctl status firewalld.service".

    2. If the output shows "active (running)", the firewall is currently on.

    3. Run systemctl stop firewalld.service to stop the firewall.

    4. Run systemctl status firewalld.service again; "inactive (dead)" in the output means the firewall is stopped.

    5. Run "systemctl disable firewalld.service" so the firewall no longer starts at boot, which turns it off permanently.
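
    The five steps above can be collected into one small script. This is a sketch under assumptions: the run wrapper and the DRY_RUN variable are inventions of this example, and by default the script only prints the commands so it can be tried safely on any machine.

```shell
# Set DRY_RUN=0 to really execute the commands (needs root on a systemd host).
DRY_RUN="${DRY_RUN:-1}"

# run CMD...: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

disable_firewalld() {
    run systemctl stop firewalld.service      # stop the running service
    run systemctl disable firewalld.service   # do not start it at boot
    run systemctl status firewalld.service    # should report "inactive (dead)"
}

disable_firewalld
```

    Turning the firewall off is the simplest option for a test cluster. On a production host, opening only the ports that this guide uses (9000, 9009 and 2181) may be preferable.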
    

    1.2 Standalone installation

    1 URLs

    Official site: https://clickhouse.yandex/

    Download: http://repo.red-soft.biz/repos/clickhouse/

    2 Standalone mode


    2.1 Upload the 4 files to the Linux host

    [root@master clickhouse_softs]# ll
    total 34636
    -rw-r--r-- 1 root root     6376 Mar 10 14:32 clickhouse-client-19.7.3.9-1.el7.x86_64.rpm
    -rw-r--r-- 1 root root 25990492 Mar 10 14:32 clickhouse-common-static-19.7.3.9-1.el7.x86_64.rpm
    -rw-r--r-- 1 root root  9453456 Mar 10 14:32 clickhouse-server-19.7.3.9-1.el7.x86_64.rpm
    -rw-r--r-- 1 root root     9944 Mar 10 14:32 clickhouse-server-common-19.7.3.9-1.el7.x86_64.rpm
    

    2.2 Install the 4 rpm files

    [root@master clickhouse_softs]# rpm -ivh *.rpm
    Preparing...                          ################################# [100%]
    Updating / installing...
       1:clickhouse-server-common-19.7.3.9################################# [ 25%]
       2:clickhouse-common-static-19.7.3.9################################# [ 50%]
       3:clickhouse-server-19.7.3.9-1.el7 ################################# [ 75%]
    Create user clickhouse.clickhouse with datadir /var/lib/clickhouse
       4:clickhouse-client-19.7.3.9-1.el7 ################################# [100%]
    Create user clickhouse.clickhouse with datadir /var/lib/clickhouse
    
    
    

    2.3 Start the ClickHouse server

    [root@master clickhouse_softs]# service clickhouse-server start
    Start clickhouse-server service: Path to data directory in /etc/clickhouse-server/config.xml: /var/lib/clickhouse/DONE
    

    2.4 Connect to the server with the client

    [root@master clickhouse_softs]# clickhouse-client
    ClickHouse client version 19.7.3.1.
    Connecting to localhost:9000 as user default.
    Connected to ClickHouse server version 19.7.3 revision 54419.
    
    master :) 
    master :) 
    master :) quit
    Bye.
    

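    1.3 Distributed cluster installation

    1.3.1 Complete the same pre-installation steps on the other machines

    1.3.2 Upload the 4 rpm packages to the other machines

    1.3.3 Install the packages on each machine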

    [root@node01 clickhouse_softs]# ll
    total 0
    [root@node01 clickhouse_softs]# rpm -ivh *.rpm
    Preparing...                          ################################# [100%]
    Updating / installing...
       1:clickhouse-server-common-19.7.3.9################################# [ 25%]
       2:clickhouse-common-static-19.7.3.9################################# [ 50%]
       3:clickhouse-server-19.7.3.9-1.el7 ################################# [ 75%]
    Create user clickhouse.clickhouse with datadir /var/lib/clickhouse
       4:clickhouse-client-19.7.3.9-1.el7 ################################# [100%]
    Create user clickhouse.clickhouse with datadir /var/lib/clickhouse
    
    
    [root@node02 clickhouse_softs]# rpm -ivh *.rpm
    Preparing...                          ################################# [100%]
    Updating / installing...
       1:clickhouse-server-common-19.7.3.9################################# [ 25%]
       2:clickhouse-common-static-19.7.3.9################################# [ 50%]
       3:clickhouse-server-19.7.3.9-1.el7 ################################# [ 75%]
    Create user clickhouse.clickhouse with datadir /var/lib/clickhouse
       4:clickhouse-client-19.7.3.9-1.el7 ################################# [100%]
    Create user clickhouse.clickhouse with datadir /var/lib/clickhouse
    
    

    1.3.4 On every machine, edit config.xml and uncomment the <listen_host>::</listen_host> line

    [root@node02 clickhouse_softs]# vim /etc/clickhouse-server/config.xml
        <!-- Port for communication between replicas. Used for data exchange. -->
        <interserver_http_port>9009</interserver_http_port>
    
        <!-- Hostname that is used by other replicas to request this server.
             If not specified, than it is determined analoguous to 'hostname -f' command.
             This setting could be used to switch replication to another network interface.
          -->
        <!--
        <interserver_http_host>example.yandex.ru</interserver_http_host>
        -->
    
        <!-- Listen specified host. use :: (wildcard IPv6 address), if you want to accept connections both with IPv4 and IPv6 from everywhere. -->
        <listen_host>::</listen_host>
        <!-- Same for hosts with disabled ipv6: -->
        <!-- <listen_host>0.0.0.0</listen_host> -->
    
        <!-- Default values - try listen localhost on ipv4 and ipv6: -->
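
    Because the same change has to be made on every machine, it could be scripted instead of edited by hand. The sketch below assumes the shipped config.xml contains the line in the commented form <!-- <listen_host>::</listen_host> -->; check the file first, since the exact spacing may differ between versions. enable_listen_all is a hypothetical helper, and the demo runs on a scratch file.

```shell
# enable_listen_all FILE: remove the comment markers around the
# wildcard listen_host entry, leaving every other line unchanged.
enable_listen_all() {
    config="$1"
    tmp="$(mktemp)"
    sed 's|<!-- *<listen_host>::</listen_host> *-->|<listen_host>::</listen_host>|' \
        "$config" > "$tmp" && cat "$tmp" > "$config"
    rm -f "$tmp"
}

# Demo on a scratch file with the two listen_host variants.
demo_config="$(mktemp)"
cat > "$demo_config" <<'EOF'
    <!-- <listen_host>::</listen_host> -->
    <!-- <listen_host>0.0.0.0</listen_host> -->
EOF
enable_listen_all "$demo_config"
cat "$demo_config"

# On a real host (as root): enable_listen_all /etc/clickhouse-server/config.xml
```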
    
    

    1.3.5 Create metrika.xml on every machine

    [root@node02 clickhouse_softs]# vim /etc/metrika.xml
    

    The file contents are as follows:

    <yandex>
        <clickhouse_remote_servers>
            <clickhouse_cluster>
                <shard>
                    <internal_replication>true</internal_replication>
                    <replica>
                        <host>master</host> <!-- hostname or IP of the first server -->
                        <port>9000</port>
                    </replica>
                </shard>
                <shard>
                    <internal_replication>true</internal_replication>
                    <replica>
                        <host>node01</host> <!-- hostname or IP of the second server -->
                        <port>9000</port>
                    </replica>
                </shard>
                <shard>
                    <internal_replication>true</internal_replication>
                    <replica>
                        <host>node02</host> <!-- hostname or IP of the third server -->
                        <port>9000</port>
                    </replica>
                </shard>
            </clickhouse_cluster>
        </clickhouse_remote_servers>
         
        <zookeeper-servers>
            <node index="1">
                <host>master</host> <!-- hostname or IP of the first server -->
                <port>2181</port>
            </node>
            <node index="2">
                <host>node01</host> <!-- hostname or IP of the second server -->
                <port>2181</port>
            </node>
            <node index="3">
                <host>node02</host> <!-- hostname or IP of the third server -->
                <port>2181</port>
            </node>
        </zookeeper-servers>
    
        <macros>
            <replica>node02</replica> <!-- use this machine's own address here; change it to the matching address on the other two machines -->
        </macros>
    
        <networks>
            <ip>::/0</ip>
        </networks>
    
        <clickhouse_compression>
            <case>
                <min_part_size>10000000000</min_part_size>
                <min_part_size_ratio>0.01</min_part_size_ratio>
                <method>lz4</method>
            </case>
        </clickhouse_compression>
    </yandex>
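
    Since the file is identical on all three machines except for the <replica> value under <macros>, it can be generated from a template. The following is a sketch only: write_metrika is a hypothetical helper, the host names master, node01 and node02 are the ones used in this article, and the demo writes into a temporary directory rather than /etc.

```shell
# write_metrika OUT REPLICA: write the cluster definition shown above to OUT,
# with REPLICA as the value of the <replica> macro.
write_metrika() {
    out="$1"
    replica="$2"
    {
        printf '<yandex>\n    <clickhouse_remote_servers>\n        <clickhouse_cluster>\n'
        for host in master node01 node02; do
            cat <<EOF
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>$host</host>
                    <port>9000</port>
                </replica>
            </shard>
EOF
        done
        printf '        </clickhouse_cluster>\n    </clickhouse_remote_servers>\n\n    <zookeeper-servers>\n'
        index=0
        for host in master node01 node02; do
            index=$((index + 1))
            cat <<EOF
        <node index="$index">
            <host>$host</host>
            <port>2181</port>
        </node>
EOF
        done
        cat <<EOF
    </zookeeper-servers>

    <macros>
        <replica>$replica</replica>
    </macros>

    <networks>
        <ip>::/0</ip>
    </networks>

    <clickhouse_compression>
        <case>
            <min_part_size>10000000000</min_part_size>
            <min_part_size_ratio>0.01</min_part_size_ratio>
            <method>lz4</method>
        </case>
    </clickhouse_compression>
</yandex>
EOF
    } > "$out"
}

# Demo: generate one file per host in a scratch directory.
demo_dir="$(mktemp -d)"
for h in master node01 node02; do
    write_metrika "$demo_dir/metrika-$h.xml" "$h"
done
grep '<replica>node' "$demo_dir/metrika-node01.xml"
```

    Each generated file would then be copied to /etc/metrika.xml on the host it was generated for.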
    

    1.3.6 Start ZooKeeper on every machine first

    [root@node02 soft]# zkServer.sh start
    JMX enabled by default
    Using config: /usr/local/soft/zookeeper-3.4.6/bin/../conf/zoo.cfg
    Starting zookeeper ... STARTED
    
    

    1.3.7 Start ClickHouse on every machine

    [root@node01 soft]# service clickhouse-server start
    Start clickhouse-server service: Path to data directory in /etc/clickhouse-server/config.xml: /var/lib/clickhouse/DONE
    

    1.3.8 Verify: open the client

    [root@master soft]# clickhouse-client -m
    ClickHouse client version 19.7.3.1.
    Connecting to localhost:9000 as user default.
    Connected to ClickHouse server version 19.7.3 revision 54419.
    
    master :)
    

    1.3.9 Verify that the cluster was created: run select * from system.clusters;

    master :) select * from system.clusters;
    
    SELECT *
    FROM system.clusters 
    
    ┌─cluster───────────────────────────┬─shard_num─┬─shard_weight─┬─replica_num─┬─host_name─┬─host_address───┬─port─┬─is_local─┬─user────┬─default_database─┐
    │ clickhouse_cluster                │         1 │            1 │           1 │ master    │ 192.168.41.105 │ 9000 │        1 │ default │                  │
    │ clickhouse_cluster                │         2 │            1 │           1 │ node01    │ 192.168.41.106 │ 9000 │        0 │ default │                  │
    │ clickhouse_cluster                │         3 │            1 │           1 │ node02    │ 192.168.41.107 │ 9000 │        0 │ default │                  │
    │ test_cluster_two_shards_localhost │         1 │            1 │           1 │ localhost │ ::1            │ 9000 │        1 │ default │                  │
    │ test_cluster_two_shards_localhost │         2 │            1 │           1 │ localhost │ ::1            │ 9000 │        1 │ default │                  │
    │ test_shard_localhost              │         1 │            1 │           1 │ localhost │ ::1            │ 9000 │        1 │ default │                  │
    │ test_shard_localhost_secure       │         1 │            1 │           1 │ localhost │ ::1            │ 9440 │        0 │ default │                  │
    │ test_unavailable_shard            │         1 │            1 │           1 │ localhost │ ::1            │ 9000 │        1 │ default │                  │
    │ test_unavailable_shard            │         2 │            1 │           1 │ localhost │ ::1            │    1 │        0 │ default │                  │
    └───────────────────────────────────┴───────────┴──────────────┴─────────────┴───────────┴────────────────┴──────┴──────────┴─────────┴──────────────────┘
    
    9 rows in set. Elapsed: 0.001 sec.
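
    The same check can be automated from the shell. The sketch below counts how many hosts belong to a given cluster; count_cluster_hosts is a hypothetical helper, and the demo feeds it rows copied from the table above instead of querying a live server.

```shell
# count_cluster_hosts CLUSTER: read tab-separated "cluster<TAB>host_name"
# rows on stdin and print how many rows belong to CLUSTER.
count_cluster_hosts() {
    awk -F '\t' -v c="$1" '$1 == c { n++ } END { print n + 0 }'
}

# Demo with rows taken from the query output above.
printf 'clickhouse_cluster\tmaster\nclickhouse_cluster\tnode01\nclickhouse_cluster\tnode02\ntest_shard_localhost\tlocalhost\n' \
    | count_cluster_hosts clickhouse_cluster   # prints: 3

# Against a live node the input would come from the client, for example:
#   clickhouse-client --query "SELECT cluster, host_name FROM system.clusters FORMAT TabSeparated" \
#       | count_cluster_hosts clickhouse_cluster
```

    A result of 3 for clickhouse_cluster matches the three shards defined in metrika.xml.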
    
  • Original article: https://www.cnblogs.com/wyh-study/p/14512136.html