  • A First Look at ClickHouse

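    I recently came across a post on the Percona blog, "Column Store Database Benchmarks: MariaDB ColumnStore vs. Clickhouse vs. Apache Spark", in which ClickHouse clearly outperforms both MariaDB ColumnStore and Spark. That got me very interested in ClickHouse, so I decided to start learning it. There is no Chinese documentation for ClickHouse yet, which makes learning it somewhat hard going. Another Percona post that introduces ClickHouse is also worth a look: "ClickHouse: New Open Source Columnar Database". So is this Chinese-language article: 彪悍开源的分析数据库-ClickHouse ("ClickHouse, a Powerful Open-Source Analytical Database").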

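    So what exactly is ClickHouse?

    1. An open-source, column-oriented database management system

    2. Scales linearly

    3. Simple and easy to use

    4. Highly reliable

    5. Fault tolerant (supports asynchronous multi-master replication and can be deployed across multiple data centers; downtime of a single node or of an entire data center does not affect the system's read and write availability)

    Key features and use cases of ClickHouse:

    For more detail, see the official documentation.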

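    At the moment ClickHouse supports Ubuntu well and CentOS less so. On Ubuntu there are deb packages that can be installed directly, while on CentOS you have to build it yourself. I struggled for a long time without getting it to compile and eventually gave up. Then I happened to see someone mention RPM packages on the Google group: someone has set up a yum repository, so it can be installed directly, which was a relief. If you want to build it yourself, see https://github.com/yandex/ClickHouse/blob/master/doc/build.md. The steps below install it with yum.

    1. Add the yum repository

    yum-config-manager --add-repo http://repo.red-soft.biz/repos/clickhouse/repo/clickhouse-el6.repo

    2. Install:

    yum install clickhouse-server clickhouse-client clickhouse-server-common clickhouse-compressor

    3. Add the clickhouse user

    useradd clickhouse

    4. Start ClickHouse

    /etc/init.d/clickhouse-server start

    5. Log in and test: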

    [root@db_server_yayun_01 ~]# clickhouse-client 
    ClickHouse client version 1.1.54198.
    Connecting to localhost:9000.
    Connected to ClickHouse server version 1.1.54198.
    
    :) select 1
    
    SELECT 1
    
    ┌─1─┐
    │ 1 │
    └───┘
    → Progress: 1.00 rows, 1.00 B (64.77 rows/s., 64.77 B/s.) 
    1 rows in set. Elapsed: 0.016 sec. 
    
    :) select now();
    
    SELECT now()
    
    ┌───────────────now()─┐
    │ 2017-03-31 15:14:18 │
    └─────────────────────┘
    ↘ Progress: 1.00 rows, 1.00 B (216.22 rows/s., 216.22 B/s.) 
    1 rows in set. Elapsed: 0.005 sec. 
    
    :) 
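
    Before opening the client, it can help to confirm that something is listening on the native port. The sketch below is an addition to the original post: it assumes the server is on localhost and uses port 9000, the port shown in the "Connecting to localhost:9000" line above. The function name port_is_open is made up for this example. An open port only shows that a process accepts connections there, not that it is a healthy ClickHouse server.

```python
import socket


def port_is_open(host: str = "localhost", port: int = 9000, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("ClickHouse native port reachable:", port_is_open())
```

    If this prints False right after starting the service, the logs are the next place to look.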

    If the server fails to start, check the logs. The default log directory is

    /var/log/clickhouse-server
    [root@db_server_yayun_01 clickhouse-server]# ll
    total 16
    -rw-rw-rw-. 1 clickhouse clickhouse  383 Mar 31 13:33 clickhouse-server.err.log
    -rw-rw-rw-. 1 clickhouse clickhouse 7733 Mar 31 15:14 clickhouse-server.log
    -rw-rw-rw-. 1 clickhouse clickhouse  138 Mar 31 13:33 stderr
    -rw-rw-rw-. 1 clickhouse clickhouse    0 Mar 31 13:33 stdout
    [root@db_server_yayun_01 clickhouse-server]# 
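
    When startup fails, the last few lines of clickhouse-server.err.log are usually the most useful. As a rough sketch (not part of the original post), the helper below prints the tail of that file. The path is taken from the directory listing above, and the helper name tail and the default of 20 lines are arbitrary choices for illustration.

```python
from collections import deque
from pathlib import Path
from typing import List

# Path taken from the listing above; adjust it if your install logs elsewhere.
ERR_LOG = Path("/var/log/clickhouse-server/clickhouse-server.err.log")


def tail(path: Path, n: int = 20) -> List[str]:
    """Return the last n lines of a text file, without trailing newlines."""
    with path.open("r", encoding="utf-8", errors="replace") as fh:
        # deque with maxlen keeps only the final n lines while streaming the file.
        return [line.rstrip("\n") for line in deque(fh, maxlen=n)]


if __name__ == "__main__":
    if ERR_LOG.exists():
        for line in tail(ERR_LOG):
            print(line)
    else:
        print(f"{ERR_LOG} not found")
```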

    Here is a quick look at basic clickhouse-client usage:

    Interactive mode:

    clickhouse-client
    clickhouse-client --host=... --port=... --user=... --password=...
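
    If you launch the client from scripts, building the argument list in code avoids shell-quoting mistakes. This is a minimal sketch under my own assumptions: it uses only the four options shown above (--host, --port, --user, --password), the function name client_argv is hypothetical, and the host and user values in the demo are placeholders.

```python
import shlex
from typing import List, Optional


def client_argv(host: Optional[str] = None, port: Optional[int] = None,
                user: Optional[str] = None, password: Optional[str] = None) -> List[str]:
    """Build a clickhouse-client argument list, skipping options left as None."""
    argv = ["clickhouse-client"]
    options = (("host", host), ("port", port), ("user", user), ("password", password))
    for name, value in options:
        if value is not None:
            argv.append(f"--{name}={value}")
    return argv


if __name__ == "__main__":
    # "db1.example.com" and "default" are placeholder values for illustration.
    argv = client_argv(host="db1.example.com", port=9000, user="default")
    # shlex.quote is only used to print a copy-pasteable command line.
    print(" ".join(shlex.quote(a) for a in argv))
```

    The returned list can be passed directly to subprocess.run. Keep in mind that a password given on the command line is visible to other users in the process list.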

    Enable multi-line queries:

    clickhouse-client -m
    clickhouse-client --multiline

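    Multi-line mode needs to be enabled when you create a table with a statement that spans several lines; otherwise the client reports an error. For example, to create the following table: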

    CREATE TABLE `ontime` (
      `Year` UInt16,
      `Quarter` UInt8,
      `Month` UInt8,
      `DayofMonth` UInt8,
      `DayOfWeek` UInt8,
      `FlightDate` Date,
      `UniqueCarrier` FixedString(7),
      `AirlineID` Int32,
      `Carrier` FixedString(2),
      `TailNum` String,
      `FlightNum` String,
      `OriginAirportID` Int32,
      `OriginAirportSeqID` Int32,
      `OriginCityMarketID` Int32,
      `Origin` FixedString(5),
      `OriginCityName` String,
      `OriginState` FixedString(2),
      `OriginStateFips` String,
      `OriginStateName` String,
      `OriginWac` Int32,
      `DestAirportID` Int32,
      `DestAirportSeqID` Int32,
      `DestCityMarketID` Int32,
      `Dest` FixedString(5),
      `DestCityName` String,
      `DestState` FixedString(2),
      `DestStateFips` String,
      `DestStateName` String,
      `DestWac` Int32,
      `CRSDepTime` Int32,
      `DepTime` Int32,
      `DepDelay` Int32,
      `DepDelayMinutes` Int32,
      `DepDel15` Int32,
      `DepartureDelayGroups` String,
      `DepTimeBlk` String,
      `TaxiOut` Int32,
      `WheelsOff` Int32,
      `WheelsOn` Int32,
      `TaxiIn` Int32,
      `CRSArrTime` Int32,
      `ArrTime` Int32,
      `ArrDelay` Int32,
      `ArrDelayMinutes` Int32,
      `ArrDel15` Int32,
      `ArrivalDelayGroups` Int32,
      `ArrTimeBlk` String,
      `Cancelled` UInt8,
      `CancellationCode` FixedString(1),
      `Diverted` UInt8,
      `CRSElapsedTime` Int32,
      `ActualElapsedTime` Int32,
      `AirTime` Int32,
      `Flights` Int32,
      `Distance` Int32,
      `DistanceGroup` UInt8,
      `CarrierDelay` Int32,
      `WeatherDelay` Int32,
      `NASDelay` Int32,
      `SecurityDelay` Int32,
      `LateAircraftDelay` Int32,
      `FirstDepTime` String,
      `TotalAddGTime` String,
      `LongestAddGTime` String,
      `DivAirportLandings` String,
      `DivReachedDest` String,
      `DivActualElapsedTime` String,
      `DivArrDelay` String,
      `DivDistance` String,
      `Div1Airport` String,
      `Div1AirportID` Int32,
      `Div1AirportSeqID` Int32,
      `Div1WheelsOn` String,
      `Div1TotalGTime` String,
      `Div1LongestGTime` String,
      `Div1WheelsOff` String,
      `Div1TailNum` String,
      `Div2Airport` String,
      `Div2AirportID` Int32,
      `Div2AirportSeqID` Int32,
      `Div2WheelsOn` String,
      `Div2TotalGTime` String,
      `Div2LongestGTime` String,
      `Div2WheelsOff` String,
      `Div2TailNum` String,
      `Div3Airport` String,
      `Div3AirportID` Int32,
      `Div3AirportSeqID` Int32,
      `Div3WheelsOn` String,
      `Div3TotalGTime` String,
      `Div3LongestGTime` String,
      `Div3WheelsOff` String,
      `Div3TailNum` String,
      `Div4Airport` String,
      `Div4AirportID` Int32,
      `Div4AirportSeqID` Int32,
      `Div4WheelsOn` String,
      `Div4TotalGTime` String,
      `Div4LongestGTime` String,
      `Div4WheelsOff` String,
      `Div4TailNum` String,
      `Div5Airport` String,
      `Div5AirportID` Int32,
      `Div5AirportSeqID` Int32,
      `Div5WheelsOn` String,
      `Div5TotalGTime` String,
      `Div5LongestGTime` String,
      `Div5WheelsOff` String,
      `Div5TailNum` String
    ) ENGINE = MergeTree(FlightDate, (Year, FlightDate), 8192)
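
    A workaround when -m was not passed is to collapse the statement onto one line before pasting it. The sketch below is my own addition, not something from the original post. It assumes the statement contains no `--` line comments and no string literals with embedded newlines, since joining lines would change their meaning. The table name ontime_demo is a shortened, hypothetical stand-in for the table above.

```python
def flatten_sql(statement: str) -> str:
    """Collapse a multi-line SQL statement into a single line.

    Assumes there are no `--` line comments and no string literals
    that contain newlines; both would break when lines are joined.
    """
    stripped = (line.strip() for line in statement.splitlines())
    return " ".join(part for part in stripped if part)


if __name__ == "__main__":
    # Hypothetical two-column version of the table, for illustration only.
    ddl = """
    CREATE TABLE `ontime_demo` (
      `Year` UInt16,
      `FlightDate` Date
    ) ENGINE = MergeTree(FlightDate, (Year, FlightDate), 8192)
    """
    print(flatten_sql(ddl))
```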

    Run queries in batch mode:

    clickhouse-client --query='SELECT 1'
    echo 'SELECT 1' | clickhouse-client
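
    Batch mode is convenient for calling from a script. The sketch below runs one query with --query and parses the result. It relies on the client printing TabSeparated output in batch mode, which is the documented default in current ClickHouse releases; I have not checked the 1.1.54198 build used here, so treat that as an assumption. The names run_query, parse_tsv and unescape_field are made up for this example.

```python
import subprocess
from typing import List

# Backslash escapes that TabSeparated uses for the characters handled here.
_UNESCAPE = {"t": "\t", "n": "\n", "\\": "\\", "r": "\r",
             "0": "\0", "'": "'", "b": "\b", "f": "\f"}


def unescape_field(field: str) -> str:
    """Undo backslash escapes in one TabSeparated field.

    Unknown sequences such as the NULL marker \\N are kept unchanged.
    """
    out = []
    chars = iter(field)
    for ch in chars:
        if ch == "\\":
            nxt = next(chars, "")
            out.append(_UNESCAPE.get(nxt, "\\" + nxt))
        else:
            out.append(ch)
    return "".join(out)


def parse_tsv(text: str) -> List[List[str]]:
    """Split TabSeparated output into rows of unescaped string fields.

    Empty lines are skipped, so a row made of one empty string is dropped.
    """
    return [[unescape_field(f) for f in line.split("\t")]
            for line in text.split("\n") if line]


def run_query(query: str) -> List[List[str]]:
    """Run one query through clickhouse-client in batch mode and parse stdout."""
    result = subprocess.run(["clickhouse-client", f"--query={query}"],
                            check=True, capture_output=True, text=True)
    return parse_tsv(result.stdout)
```

    Calling run_query("SELECT 1") mirrors the first command above. With check=True, a non-zero exit code from the client raises subprocess.CalledProcessError instead of returning partial output.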

    Insert data from a file in a given format:

    clickhouse-client --query='INSERT INTO table VALUES' < data.txt
    clickhouse-client --query='INSERT INTO table FORMAT TabSeparated' < data.tsv
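
    The data.tsv file for the second command has to follow the TabSeparated rules: one row per line, fields separated by tabs, and tab, newline and backslash inside a value written as backslash escapes. Here is a small sketch that writes such a file. The helper names and the sample rows are hypothetical, and writing None as \N follows the NULL convention in the current ClickHouse documentation, which I assume rather than know to apply to this older build.

```python
from typing import Iterable, Sequence


def escape_field(value: object) -> str:
    """Render one value as a TabSeparated field; None becomes the NULL marker."""
    if value is None:
        return "\\N"
    text = str(value)
    # Backslash is escaped first so the escapes added afterwards are not doubled.
    return (text.replace("\\", "\\\\")
                .replace("\t", "\\t")
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def write_tsv(path: str, rows: Iterable[Sequence[object]]) -> None:
    """Write rows to path, one line per row, fields separated by tabs."""
    with open(path, "w", encoding="utf-8", newline="\n") as fh:
        for row in rows:
            fh.write("\t".join(escape_field(v) for v in row) + "\n")


if __name__ == "__main__":
    # Hypothetical two-column sample; match the column order of your own table.
    write_tsv("data.tsv", [(2017, "2017-03-31"), (2017, "2017-04-01")])
```

    Python's csv module is deliberately not used here, because its quoting rules differ from the backslash escaping that TabSeparated expects.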

    References:

    https://github.com/redsoftbiz/clickhouse-rpm

    https://clickhouse.yandex/

  • Original post: https://www.cnblogs.com/gomysql/p/6651513.html