  • Installing and testing ClickHouse on CentOS 7

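    1. System requirements

    ClickHouse can run on any Linux, FreeBSD, or Mac OS X system with an x86_64, AArch64, or PowerPC64LE CPU architecture.

    Pre-built binaries are typically compiled for x86_64 and take advantage of the SSE 4.2 instruction set, so unless stated otherwise, a CPU that supports SSE 4.2 is an additional system requirement. The following command checks whether the current CPU supports it: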

    $ grep -q sse4_2 /proc/cpuinfo && echo "SSE 4.2 supported" || echo "SSE 4.2 not supported"
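
    The same check can be wrapped in a small function so that it can be reused in a provisioning script. This is only a sketch: the function names has_cpu_flag and report_sse42 are hypothetical, and the optional file argument exists purely so the check can be tried against a saved copy of /proc/cpuinfo.

```shell
# Report whether a CPU flag is listed in a cpuinfo-style file.
# Usage: has_cpu_flag FLAG [FILE]   (FILE defaults to /proc/cpuinfo)
has_cpu_flag() {
    local flag="$1"
    local file="${2:-/proc/cpuinfo}"
    # -w matches the flag as a whole word; -q suppresses output.
    grep -qw -- "$flag" "$file"
}

# Print a one-line verdict for SSE 4.2, like the one-liner above.
# Usage: report_sse42 [FILE]
report_sse42() {
    if has_cpu_flag sse4_2 "${1:-/proc/cpuinfo}"; then
        echo "SSE 4.2 supported"
    else
        echo "SSE 4.2 not supported"
    fi
}
```

    Calling report_sse42 with no argument inspects the running machine.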
    

    2. Installation and startup

    First, add the official repository:

    sudo yum install yum-utils
    sudo rpm --import https://repo.clickhouse.tech/CLICKHOUSE-KEY.GPG
    sudo yum-config-manager --add-repo https://repo.clickhouse.tech/rpm/stable/x86_64
    

    To get the most recent version, replace stable with testing (recommended only for test environments).
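
    Switching channels only changes one path segment of the repository URL. A minimal sketch of that substitution, using the URL from the commands above; the function name clickhouse_repo_url is hypothetical:

```shell
# Print the yum repository URL for a release channel.
# Usage: clickhouse_repo_url [stable|testing]   (defaults to stable)
clickhouse_repo_url() {
    local channel="${1:-stable}"
    case "$channel" in
        stable|testing)
            echo "https://repo.clickhouse.tech/rpm/${channel}/x86_64"
            ;;
        *)
            # Only the two channels named in this article are accepted.
            echo "unknown channel: $channel" >&2
            return 1
            ;;
    esac
}
```

    For example: sudo yum-config-manager --add-repo "$(clickhouse_repo_url testing)"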

    Then run this command to actually install the packages:

    sudo yum install clickhouse-server clickhouse-client
    

    The following command starts the service in the background:

    sudo service clickhouse-server start
    

    Logs are written to the /var/log/clickhouse-server/ directory.

    If the service does not start, check the configuration file /etc/clickhouse-server/config.xml.
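
    When the service fails to start, the error log is usually the quickest place to look. A minimal sketch, assuming the stock log file name clickhouse-server.err.log (this article only names the directory); the function name show_recent_errors is hypothetical:

```shell
# Print the last N lines of the server error log.
# Usage: show_recent_errors [LOG_DIR] [N]
show_recent_errors() {
    local log_dir="${1:-/var/log/clickhouse-server}"
    local lines="${2:-20}"
    # Assumption: the default error log file name from the stock config.
    local err_log="${log_dir}/clickhouse-server.err.log"
    if [ ! -r "$err_log" ]; then
        echo "cannot read $err_log" >&2
        return 1
    fi
    tail -n "$lines" "$err_log"
}
```

    The log directory may only be readable by root or the clickhouse user, in which case the function has to be run from a root shell.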

    You can connect to the service with the command-line client:

    clickhouse-client
    

    By default it connects to localhost:9000 as the 'default' user with no password.
    The client can also connect to a remote server, for example:

    clickhouse-client --host=example.com
    

    Check that the system works:

    milovidov@hostname:~/work/metrica/src/src/Client$ ./clickhouse-client
    ClickHouse client version 0.0.18749.
    Connecting to localhost:9000.
    Connected to ClickHouse server version 0.0.18749.
    
    :) SELECT 1
    
    SELECT 1
    
    ┌─1─┐
    │ 1 │
    └───┘
    
    1 rows in set. Elapsed: 0.003 sec.
    
    :)
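
    The interactive check above can also be scripted. The sketch below sends the same SELECT 1 through the --query option and compares the answer; the function name smoke_test is hypothetical, and the optional argument only exists so that a different client command can be substituted.

```shell
# Return success only if the server answers "SELECT 1" with "1".
# Usage: smoke_test [CLIENT_COMMAND]   (defaults to clickhouse-client)
smoke_test() {
    local client="${1:-clickhouse-client}"
    local result
    # A connection failure makes the client exit non-zero.
    result="$("$client" --query "SELECT 1" 2>/dev/null)" || return 1
    [ "$result" = "1" ]
}
```

    For example: smoke_test && echo "server is up"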
    

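    3. Importing the sample dataset

    Now it is time to fill our ClickHouse server with some sample data. In this tutorial we use anonymized data from Yandex.Metrica, the first service that ran ClickHouse in production before it became open source (more on that in the history section). There are several ways to import the Yandex.Metrica dataset; for this tutorial we use the most realistic one.

    Download and extract the table data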

    curl https://clickhouse-datasets.s3.yandex.net/hits/tsv/hits_v1.tsv.xz | unxz --threads=`nproc` > hits_v1.tsv
    curl https://clickhouse-datasets.s3.yandex.net/visits/tsv/visits_v1.tsv.xz | unxz --threads=`nproc` > visits_v1.tsv
    

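    The extracted files are about 10 GB in size.

    If the download from the overseas server is slow, the data files are also available from a network drive:

    Link: https://pan.baidu.com/s/1LFzoWq-IdVONJra1lHN-PA  Extraction code: 4zez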
    

    Create the tables

    Like most database management systems, ClickHouse logically groups tables into "databases". There is a default database, but we will create a new one named tutorial:

    clickhouse-client --query "CREATE DATABASE IF NOT EXISTS tutorial"
    

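    Compared with databases, the syntax for creating tables is much more complicated (see the reference documentation). In general, a CREATE TABLE statement must specify three key things:

    The name of the table to create.
    The table schema, i.e. the list of columns and their data types.
    The table engine and its settings, which determine all the details of how queries to this table are physically executed.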
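
    These three parts are easier to spot in a deliberately small statement than in the full tutorial tables. The sketch below prints one; the table tutorial.demo_events and its three columns are hypothetical and are not part of the dataset.

```shell
# Print a deliberately small CREATE TABLE statement.
# Assumption: table and column names are made up for illustration only.
minimal_table_ddl() {
    cat <<'SQL'
CREATE TABLE IF NOT EXISTS tutorial.demo_events   -- 1. table name
(
    `EventDate` Date,                             -- 2. schema: columns and types
    `UserID`    UInt64,
    `URL`       String
)
ENGINE = MergeTree()                              -- 3. engine and its settings
PARTITION BY toYYYYMM(EventDate)
ORDER BY (EventDate, UserID)
SQL
}
```

    Once the tutorial database exists, the statement can be piped straight into the client: minimal_table_ddl | clickhouse-client
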
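
    Yandex.Metrica is a web analytics service, and the sample dataset does not cover its full functionality, so there are only two tables to create:

    hits is a table with every action performed by all users on all websites covered by the service.
    visits is a table that contains pre-built sessions instead of individual actions.

    Open the client in multi-line SQL mode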

    clickhouse-client -m
    

    Let's look at and execute the actual CREATE TABLE queries for these tables:

    CREATE TABLE tutorial.hits_v1
    (
        `WatchID` UInt64,
        `JavaEnable` UInt8,
        `Title` String,
        `GoodEvent` Int16,
        `EventTime` DateTime,
        `EventDate` Date,
        `CounterID` UInt32,
        `ClientIP` UInt32,
        `ClientIP6` FixedString(16),
        `RegionID` UInt32,
        `UserID` UInt64,
        `CounterClass` Int8,
        `OS` UInt8,
        `UserAgent` UInt8,
        `URL` String,
        `Referer` String,
        `URLDomain` String,
        `RefererDomain` String,
        `Refresh` UInt8,
        `IsRobot` UInt8,
        `RefererCategories` Array(UInt16),
        `URLCategories` Array(UInt16),
        `URLRegions` Array(UInt32),
        `RefererRegions` Array(UInt32),
        `ResolutionWidth` UInt16,
        `ResolutionHeight` UInt16,
        `ResolutionDepth` UInt8,
        `FlashMajor` UInt8,
        `FlashMinor` UInt8,
        `FlashMinor2` String,
        `NetMajor` UInt8,
        `NetMinor` UInt8,
        `UserAgentMajor` UInt16,
        `UserAgentMinor` FixedString(2),
        `CookieEnable` UInt8,
        `JavascriptEnable` UInt8,
        `IsMobile` UInt8,
        `MobilePhone` UInt8,
        `MobilePhoneModel` String,
        `Params` String,
        `IPNetworkID` UInt32,
        `TraficSourceID` Int8,
        `SearchEngineID` UInt16,
        `SearchPhrase` String,
        `AdvEngineID` UInt8,
        `IsArtifical` UInt8,
        `WindowClientWidth` UInt16,
        `WindowClientHeight` UInt16,
        `ClientTimeZone` Int16,
        `ClientEventTime` DateTime,
        `SilverlightVersion1` UInt8,
        `SilverlightVersion2` UInt8,
        `SilverlightVersion3` UInt32,
        `SilverlightVersion4` UInt16,
        `PageCharset` String,
        `CodeVersion` UInt32,
        `IsLink` UInt8,
        `IsDownload` UInt8,
        `IsNotBounce` UInt8,
        `FUniqID` UInt64,
        `HID` UInt32,
        `IsOldCounter` UInt8,
        `IsEvent` UInt8,
        `IsParameter` UInt8,
        `DontCountHits` UInt8,
        `WithHash` UInt8,
        `HitColor` FixedString(1),
        `UTCEventTime` DateTime,
        `Age` UInt8,
        `Sex` UInt8,
        `Income` UInt8,
        `Interests` UInt16,
        `Robotness` UInt8,
        `GeneralInterests` Array(UInt16),
        `RemoteIP` UInt32,
        `RemoteIP6` FixedString(16),
        `WindowName` Int32,
        `OpenerName` Int32,
        `HistoryLength` Int16,
        `BrowserLanguage` FixedString(2),
        `BrowserCountry` FixedString(2),
        `SocialNetwork` String,
        `SocialAction` String,
        `HTTPError` UInt16,
        `SendTiming` Int32,
        `DNSTiming` Int32,
        `ConnectTiming` Int32,
        `ResponseStartTiming` Int32,
        `ResponseEndTiming` Int32,
        `FetchTiming` Int32,
        `RedirectTiming` Int32,
        `DOMInteractiveTiming` Int32,
        `DOMContentLoadedTiming` Int32,
        `DOMCompleteTiming` Int32,
        `LoadEventStartTiming` Int32,
        `LoadEventEndTiming` Int32,
        `NSToDOMContentLoadedTiming` Int32,
        `FirstPaintTiming` Int32,
        `RedirectCount` Int8,
        `SocialSourceNetworkID` UInt8,
        `SocialSourcePage` String,
        `ParamPrice` Int64,
        `ParamOrderID` String,
        `ParamCurrency` FixedString(3),
        `ParamCurrencyID` UInt16,
        `GoalsReached` Array(UInt32),
        `OpenstatServiceName` String,
        `OpenstatCampaignID` String,
        `OpenstatAdID` String,
        `OpenstatSourceID` String,
        `UTMSource` String,
        `UTMMedium` String,
        `UTMCampaign` String,
        `UTMContent` String,
        `UTMTerm` String,
        `FromTag` String,
        `HasGCLID` UInt8,
        `RefererHash` UInt64,
        `URLHash` UInt64,
        `CLID` UInt32,
        `YCLID` UInt64,
        `ShareService` String,
        `ShareURL` String,
        `ShareTitle` String,
        `ParsedParams` Nested(
            Key1 String,
            Key2 String,
            Key3 String,
            Key4 String,
            Key5 String,
            ValueDouble Float64),
        `IslandID` FixedString(16),
        `RequestNum` UInt32,
        `RequestTry` UInt8
    )
    ENGINE = MergeTree()
    PARTITION BY toYYYYMM(EventDate)
    ORDER BY (CounterID, EventDate, intHash32(UserID))
    SAMPLE BY intHash32(UserID)
    SETTINGS index_granularity = 8192;
    
    
    CREATE TABLE tutorial.visits_v1
    (
        `CounterID` UInt32,
        `StartDate` Date,
        `Sign` Int8,
        `IsNew` UInt8,
        `VisitID` UInt64,
        `UserID` UInt64,
        `StartTime` DateTime,
        `Duration` UInt32,
        `UTCStartTime` DateTime,
        `PageViews` Int32,
        `Hits` Int32,
        `IsBounce` UInt8,
        `Referer` String,
        `StartURL` String,
        `RefererDomain` String,
        `StartURLDomain` String,
        `EndURL` String,
        `LinkURL` String,
        `IsDownload` UInt8,
        `TraficSourceID` Int8,
        `SearchEngineID` UInt16,
        `SearchPhrase` String,
        `AdvEngineID` UInt8,
        `PlaceID` Int32,
        `RefererCategories` Array(UInt16),
        `URLCategories` Array(UInt16),
        `URLRegions` Array(UInt32),
        `RefererRegions` Array(UInt32),
        `IsYandex` UInt8,
        `GoalReachesDepth` Int32,
        `GoalReachesURL` Int32,
        `GoalReachesAny` Int32,
        `SocialSourceNetworkID` UInt8,
        `SocialSourcePage` String,
        `MobilePhoneModel` String,
        `ClientEventTime` DateTime,
        `RegionID` UInt32,
        `ClientIP` UInt32,
        `ClientIP6` FixedString(16),
        `RemoteIP` UInt32,
        `RemoteIP6` FixedString(16),
        `IPNetworkID` UInt32,
        `SilverlightVersion3` UInt32,
        `CodeVersion` UInt32,
        `ResolutionWidth` UInt16,
        `ResolutionHeight` UInt16,
        `UserAgentMajor` UInt16,
        `UserAgentMinor` UInt16,
        `WindowClientWidth` UInt16,
        `WindowClientHeight` UInt16,
        `SilverlightVersion2` UInt8,
        `SilverlightVersion4` UInt16,
        `FlashVersion3` UInt16,
        `FlashVersion4` UInt16,
        `ClientTimeZone` Int16,
        `OS` UInt8,
        `UserAgent` UInt8,
        `ResolutionDepth` UInt8,
        `FlashMajor` UInt8,
        `FlashMinor` UInt8,
        `NetMajor` UInt8,
        `NetMinor` UInt8,
        `MobilePhone` UInt8,
        `SilverlightVersion1` UInt8,
        `Age` UInt8,
        `Sex` UInt8,
        `Income` UInt8,
        `JavaEnable` UInt8,
        `CookieEnable` UInt8,
        `JavascriptEnable` UInt8,
        `IsMobile` UInt8,
        `BrowserLanguage` UInt16,
        `BrowserCountry` UInt16,
        `Interests` UInt16,
        `Robotness` UInt8,
        `GeneralInterests` Array(UInt16),
        `Params` Array(String),
        `Goals` Nested(
            ID UInt32,
            Serial UInt32,
            EventTime DateTime,
            Price Int64,
            OrderID String,
            CurrencyID UInt32),
        `WatchIDs` Array(UInt64),
        `ParamSumPrice` Int64,
        `ParamCurrency` FixedString(3),
        `ParamCurrencyID` UInt16,
        `ClickLogID` UInt64,
        `ClickEventID` Int32,
        `ClickGoodEvent` Int32,
        `ClickEventTime` DateTime,
        `ClickPriorityID` Int32,
        `ClickPhraseID` Int32,
        `ClickPageID` Int32,
        `ClickPlaceID` Int32,
        `ClickTypeID` Int32,
        `ClickResourceID` Int32,
        `ClickCost` UInt32,
        `ClickClientIP` UInt32,
        `ClickDomainID` UInt32,
        `ClickURL` String,
        `ClickAttempt` UInt8,
        `ClickOrderID` UInt32,
        `ClickBannerID` UInt32,
        `ClickMarketCategoryID` UInt32,
        `ClickMarketPP` UInt32,
        `ClickMarketCategoryName` String,
        `ClickMarketPPName` String,
        `ClickAWAPSCampaignName` String,
        `ClickPageName` String,
        `ClickTargetType` UInt16,
        `ClickTargetPhraseID` UInt64,
        `ClickContextType` UInt8,
        `ClickSelectType` Int8,
        `ClickOptions` String,
        `ClickGroupBannerID` Int32,
        `OpenstatServiceName` String,
        `OpenstatCampaignID` String,
        `OpenstatAdID` String,
        `OpenstatSourceID` String,
        `UTMSource` String,
        `UTMMedium` String,
        `UTMCampaign` String,
        `UTMContent` String,
        `UTMTerm` String,
        `FromTag` String,
        `HasGCLID` UInt8,
        `FirstVisit` DateTime,
        `PredLastVisit` Date,
        `LastVisit` Date,
        `TotalVisits` UInt32,
        `TraficSource` Nested(
            ID Int8,
            SearchEngineID UInt16,
            AdvEngineID UInt8,
            PlaceID UInt16,
            SocialSourceNetworkID UInt8,
            Domain String,
            SearchPhrase String,
            SocialSourcePage String),
        `Attendance` FixedString(16),
        `CLID` UInt32,
        `YCLID` UInt64,
        `NormalizedRefererHash` UInt64,
        `SearchPhraseHash` UInt64,
        `RefererDomainHash` UInt64,
        `NormalizedStartURLHash` UInt64,
        `StartURLDomainHash` UInt64,
        `NormalizedEndURLHash` UInt64,
        `TopLevelDomain` UInt64,
        `URLScheme` UInt64,
        `OpenstatServiceNameHash` UInt64,
        `OpenstatCampaignIDHash` UInt64,
        `OpenstatAdIDHash` UInt64,
        `OpenstatSourceIDHash` UInt64,
        `UTMSourceHash` UInt64,
        `UTMMediumHash` UInt64,
        `UTMCampaignHash` UInt64,
        `UTMContentHash` UInt64,
        `UTMTermHash` UInt64,
        `FromHash` UInt64,
        `WebVisorEnabled` UInt8,
        `WebVisorActivity` UInt32,
        `ParsedParams` Nested(
            Key1 String,
            Key2 String,
            Key3 String,
            Key4 String,
            Key5 String,
            ValueDouble Float64),
        `Market` Nested(
            Type UInt8,
            GoalID UInt32,
            OrderID String,
            OrderPrice Int64,
            PP UInt32,
            DirectPlaceID UInt32,
            DirectOrderID UInt32,
            DirectBannerID UInt32,
            GoodID String,
            GoodName String,
            GoodQuantity Int32,
            GoodPrice Int64),
        `IslandID` FixedString(16)
    )
    ENGINE = CollapsingMergeTree(Sign)
    PARTITION BY toYYYYMM(StartDate)
    ORDER BY (CounterID, StartDate, intHash32(UserID), VisitID)
    SAMPLE BY intHash32(UserID)
    SETTINGS index_granularity = 8192;
    

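    You can execute these queries in the interactive mode of clickhouse-client (just launch it in a terminal without specifying a query in advance), or try one of the alternative interfaces if you prefer.

    As we can see, hits_v1 uses the basic MergeTree engine, while visits_v1 uses the Collapsing variant.

    Import the data

    Data is imported into ClickHouse with an INSERT INTO query, as in many other SQL databases. However, the data is usually supplied in one of the supported serialization formats rather than in a VALUES clause (which is also supported).

    The files we downloaded earlier are tab-separated, so here is how to import them with the console client: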

    clickhouse-client --query "INSERT INTO tutorial.hits_v1 FORMAT TSV" --max_insert_block_size=100000 < hits_v1.tsv
    clickhouse-client --query "INSERT INTO tutorial.visits_v1 FORMAT TSV" --max_insert_block_size=100000 < visits_v1.tsv
    

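    ClickHouse has many settings to tune, and one way to specify them in the console client is through arguments, as we can see with --max_insert_block_size. The easiest way to find out which settings are available, what they mean, and what their defaults are is to query the system.settings table: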

    SELECT name, value, changed, description
    FROM system.settings
    WHERE name LIKE '%max_insert_b%'
    FORMAT TSV
    
    max_insert_block_size    1048576    0    "The maximum block size for insertion, if we control the creation of blocks for insertion."
    

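    You can also OPTIMIZE the tables after the import. Tables configured with a MergeTree-family engine always merge data parts in the background to optimize data storage (or at least check whether it makes sense). These queries force the table engine to optimize storage right now instead of some time later: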

    clickhouse-client --query "OPTIMIZE TABLE tutorial.hits_v1 FINAL"
    clickhouse-client --query "OPTIMIZE TABLE tutorial.visits_v1 FINAL"
    

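    These queries start an I/O- and CPU-intensive operation, so if the table is receiving new data continuously, it is better to leave it alone and let merges run in the background.

    Query test

    Now we can check whether the table import succeeded: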

    clickhouse-client --query "SELECT COUNT(*) FROM tutorial.hits_v1"
    clickhouse-client --query "SELECT COUNT(*) FROM tutorial.visits_v1"
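
    A count on its own says little unless it is compared with the source file. The sketch below compares the number of lines in a TSV file with the table's row count; the function name check_row_count is hypothetical. It assumes one row per line, and for visits_v1 the numbers may legitimately differ after OPTIMIZE ... FINAL, because the collapsing engine can merge rows away.

```shell
# Compare the number of lines in a TSV file with the row count of a table.
# Usage: check_row_count TABLE TSV_FILE [CLIENT_COMMAND]
check_row_count() {
    local table="$1"
    local file="$2"
    local client="${3:-clickhouse-client}"
    local expected actual
    [ -r "$file" ] || { echo "cannot read $file" >&2; return 1; }
    # wc may pad its output with spaces on some systems, so strip them.
    expected="$(wc -l < "$file" | tr -d ' ')"
    actual="$("$client" --query "SELECT count() FROM $table")" || return 1
    echo "file: $expected rows, table: $actual rows"
    [ "$expected" = "$actual" ]
}
```

    For example: check_row_count tutorial.hits_v1 hits_v1.tsv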
    

    Example queries

    SELECT
        StartURL AS URL,
        AVG(Duration) AS AvgDuration
    FROM tutorial.visits_v1
    WHERE StartDate BETWEEN '2014-03-23' AND '2014-03-30'
    GROUP BY URL
    ORDER BY AvgDuration DESC
    LIMIT 10
    
    
    SELECT
        sum(Sign) AS visits,
        sumIf(Sign, has(Goals.ID, 1105530)) AS goal_visits,
        (100. * goal_visits) / visits AS goal_percent
    FROM tutorial.visits_v1
    WHERE (CounterID = 912887) AND (toYYYYMM(StartDate) = 201403) AND (domain(StartURL) = 'yandex.ru')
    

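    4. Windows client management software: DBeaver

    Download link
    https://dbeaver.io/download/

    Mirror on a network drive for users in China: same link as above

    Modify the config and add a user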

    vi /etc/clickhouse-server/config.xml
    # Change the listen host so the server accepts remote connections
    <listen_host>0.0.0.0</listen_host>

    vi /etc/clickhouse-server/users.xml
    # Add a root user with the password root
    
    <users>
        <!-- If user name was not specified, 'default' user is used. -->
        <default>
            <password></password>
            <networks incl="networks" replace="replace">
                <ip>::/0</ip>
            </networks>
            <profile>default</profile>
            <quota>default</quota>
        </default>
        <root>
            <password>root</password>
            <networks incl="networks" replace="replace">
                <ip>::/0</ip>
            </networks>
            <profile>default</profile>
            <quota>default</quota>
        </root>
    </users>
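
    The configuration above stores the password in plain text. ClickHouse's users.xml also accepts a SHA-256 digest in a password_sha256_hex element in place of the password element. A minimal sketch for producing that digest, assuming sha256sum from coreutils is available; the function name password_sha256_hex is hypothetical:

```shell
# Print the SHA-256 hex digest of a password.
# Usage: password_sha256_hex PASSWORD
password_sha256_hex() {
    # printf '%s' avoids hashing a trailing newline along with the password.
    printf '%s' "$1" | sha256sum | cut -d' ' -f1
}
```

    The printed 64-character digest would then go inside a password_sha256_hex element for the user instead of the plain password element.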
    

    Create a new ClickHouse connection in DBeaver

    Query test

  • Original article: https://www.cnblogs.com/sentangle/p/13282430.html