  • Druid deployment

    Quickstart: single-machine test

    http://druid.io/docs/0.10.1/tutorials/quickstart.html

    (1)Getting started

    Download and install Druid:

    curl -O http://static.druid.io/artifacts/releases/druid-0.10.1-bin.tar.gz
    tar -xzf druid-0.10.1-bin.tar.gz
    cd druid-0.10.1
    

    Main directories:

    • LICENSE - the license files.
    • bin/ - scripts useful for this quickstart.
    • conf/* - template configurations for a clustered setup.
    • conf-quickstart/* - configurations for this quickstart.
    • extensions/* - all Druid extensions.
    • hadoop-dependencies/* - Druid Hadoop dependencies.
    • lib/* - all included software packages for core Druid.
    • quickstart/* - files useful for this quickstart.

    (2)Start up Zookeeper

    Start ZooKeeper:

    curl http://www.gtlib.gatech.edu/pub/apache/zookeeper/zookeeper-3.4.6/zookeeper-3.4.6.tar.gz -o zookeeper-3.4.6.tar.gz

    tar -xzf zookeeper-3.4.6.tar.gz
    cd zookeeper-3.4.6
    cp conf/zoo_sample.cfg conf/zoo.cfg
    ./bin/zkServer.sh start
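
    Before moving on, you can confirm ZooKeeper is up (zkServer.sh status ships with the standard ZooKeeper distribution):

    ./bin/zkServer.sh status
    # For this single-node setup the output should report: Mode: standalone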
    

    (3)Start up Druid services

    Start Druid. Once ZooKeeper is running, return to the druid-0.10.1 directory and run:

     bin/init
    

    This creates directories such as log and var. Next, run each of the following processes in a separate terminal window:

    java `cat conf-quickstart/druid/historical/jvm.config | xargs` -cp "conf-quickstart/druid/_common:conf-quickstart/druid/historical:lib/*" io.druid.cli.Main server historical
    java `cat conf-quickstart/druid/broker/jvm.config | xargs` -cp "conf-quickstart/druid/_common:conf-quickstart/druid/broker:lib/*" io.druid.cli.Main server broker
    java `cat conf-quickstart/druid/coordinator/jvm.config | xargs` -cp "conf-quickstart/druid/_common:conf-quickstart/druid/coordinator:lib/*" io.druid.cli.Main server coordinator
    java `cat conf-quickstart/druid/overlord/jvm.config | xargs` -cp "conf-quickstart/druid/_common:conf-quickstart/druid/overlord:lib/*" io.druid.cli.Main server overlord
    java `cat conf-quickstart/druid/middleManager/jvm.config | xargs` -cp "conf-quickstart/druid/_common:conf-quickstart/druid/middleManager:lib/*" io.druid.cli.Main server middleManager
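
    The five commands differ only in the service name, so as a convenience you can also launch them from one shell. This is just a sketch wrapping the same commands above in a bash loop (the sv-*.log file names are arbitrary, and bin/init must have been run first so that var/ exists):

    for svc in historical broker coordinator overlord middleManager; do
      java `cat conf-quickstart/druid/$svc/jvm.config | xargs` \
        -cp "conf-quickstart/druid/_common:conf-quickstart/druid/$svc:lib/*" \
        io.druid.cli.Main server $svc > var/sv-$svc.log 2>&1 &
    done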
    

      

    Use CTRL-C to shut a process down when you need to (not needed here).

    If you need to restart from scratch, delete the var directory and rerun bin/init, as shown below.
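
    A clean restart, run from the druid-0.10.1 directory after stopping the five Druid processes:

    rm -rf var
    bin/init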

    Ingest data

    From the druid-0.10.1 directory, run:

    curl -X 'POST' -H 'Content-Type:application/json' -d @quickstart/wikiticker-index.json localhost:8090/druid/indexer/v1/task

    This returns something like:
    {"task":"index_hadoop_wikiticker_2017-11-26T12:57:40.055Z"}

    ingestion task console: http://localhost:8090/console.html
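
    Besides the console, the overlord exposes a task status endpoint you can poll; a minimal sketch, substituting the task id returned by the submit call above:

    TASK_ID="index_hadoop_wikiticker_2017-11-26T12:57:40.055Z"   # id from the submit response
    curl http://localhost:8090/druid/indexer/v1/task/$TASK_ID/status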

     

    coordinator console http://localhost:8081/#/.

     
     

    (4)Query data

    Run:

    curl -L -H'Content-Type: application/json' -XPOST --data-binary @quickstart/wikiticker-top-pages.json http://localhost:8082/druid/v2/?pretty
    

    It returns:


    [ {
      "timestamp" : "2015-09-12T00:46:58.771Z",
      "result" : [ {
        "edits" : 33,
        "page" : "Wikipedia:Vandalismusmeldung"
      }, {
        "edits" : 28,
        "page" : "User:Cyde/List of candidates for speedy deletion/Subpage"
      }, {
        "edits" : 27,
        "page" : "Jeremy Corbyn"
      }, {
        "edits" : 21,
        "page" : "Wikipedia:Administrators' noticeboard/Incidents"
      }, {
        "edits" : 20,
        "page" : "Flavia Pennetta"
      }, {
        "edits" : 18,
        "page" : "Total Drama Presents: The Ridonculous Race"
      }, {
        "edits" : 18,
        "page" : "User talk:Dudeperson176123"
      }, {
        "edits" : 18,
        "page" : "Wikipédia:Le Bistro/12 septembre 2015"
      }, {
        "edits" : 17,
        "page" : "Wikipedia:In the news/Candidates"
      }, {
        "edits" : 17,
        "page" : "Wikipedia:Requests for page protection"
      }, {
        "edits" : 16,
        "page" : "Utente:Giulio Mainardi/Sandbox"
      }, {
        "edits" : 16,
        "page" : "Wikipedia:Administrator intervention against vandalism"
      }, {
        "edits" : 15,
        "page" : "Anthony Martial"
      }, {
        "edits" : 13,
        "page" : "Template talk:Connected contributor"
      }, {
        "edits" : 12,
        "page" : "Chronologie de la Lorraine"
      }, {
        "edits" : 12,
        "page" : "Wikipedia:Files for deletion/2015 September 12"
      }, {
        "edits" : 12,
        "page" : "Гомосексуальный образ жизни"
      }, {
        "edits" : 11,
        "page" : "Constructive vote of no confidence"
      }, {
        "edits" : 11,
        "page" : "Homo naledi"
      }, {
        "edits" : 11,
        "page" : "Kim Davis (county clerk)"
      }, {
        "edits" : 11,
        "page" : "Vorlage:Revert-Statistik"
      }, {
        "edits" : 11,
        "page" : "Конституция Японской империи"
      }, {
        "edits" : 10,
        "page" : "The Naked Brothers Band (TV series)"
      }, {
        "edits" : 10,
        "page" : "User talk:Buster40004"
      }, {
        "edits" : 10,
        "page" : "User:Valmir144/sandbox"
      } ]
    } ]
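
    For reference, quickstart/wikiticker-top-pages.json is a Druid topN query. Its shape is roughly as follows (a sketch, not a verbatim copy — check the file in your download):

    {
      "queryType" : "topN",
      "dataSource" : "wikiticker",
      "intervals" : ["2015-09-12/2015-09-13"],
      "granularity" : "all",
      "dimension" : "page",
      "metric" : "edits",
      "threshold" : 25,
      "aggregations" : [
        {"type" : "longSum", "name" : "edits", "fieldName" : "count"}
      ]
    }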

    ================================

    Data loading methods

    Loading Data

    http://druid.io/docs/0.10.1/tutorials/ingestion.html
    Two forms: streaming (real-time) and file-based (batch).
    【1】HDFS files
    http://druid.io/docs/0.10.1/ingestion/batch-ingestion.html
    【2】Kafka, Storm, Spark Streaming
    via the Tranquility client: http://druid.io/docs/0.10.1/ingestion/stream-ingestion.html#stream-push

    File loading: a quick introduction

    Files-based:
    【1】Load files from local disk: http://druid.io/docs/0.10.1/tutorials/tutorial-batch.html
    Streams-based:
    【2】Push data over HTTP: http://druid.io/docs/0.10.1/tutorials/tutorial-streams.html
    【3】Kafka-based tutorial: http://druid.io/docs/0.10.1/tutorials/tutorial-kafka.html

    Example 1: Load files from local disk

    Loading from Files - load your own batch data
    【1】Download and start the single-machine setup as above:
    http://druid.io/docs/0.10.1/tutorials/quickstart.html
    【2】Write an ingestion spec
    See quickstart/wikiticker-index.json in the download for reference.
    Key points:
    (1)Identify the dataset: the dataSource in dataSchema
    (2)Identify the dataset location: paths in inputSpec; separate multiple files with commas
    (3)Identify the timestamp: the column in timestampSpec
    (4)Identify the dimensions: dimensions in dimensionsSpec
    (5)Identify the metrics: metricsSpec
    (6)Identify the time ranges: intervals in granularitySpec
    If the data has no natural timestamp, you can tag every row with a fixed one such as "2000-01-01T00:00:00.000Z" (see the sketch after this list)
    Supported file formats are TSV, CSV, and JSON; nested JSON is not supported
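
    As for tagging timestamp-less rows, a minimal sketch, assuming jq is installed and a hypothetical input file rows.json with one JSON object per line:

    # Add a constant "time" field to every row.
    jq -c '. + {"time": "2000-01-01T00:00:00.000Z"}' rows.json > rows-with-time.json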
    JSON data looks like this (the contents of pageviews.json):
    {"time": "2015-09-01T00:00:00Z", "url": "/foo/bar", "user": "alice", "latencyMs": 32}
    {"time": "2015-09-01T01:00:00Z", "url": "/", "user": "bob", "latencyMs": 11}
    {"time": "2015-09-01T01:30:00Z", "url": "/foo/bar", "user": "bob", "latencyMs": 45}
    The main requirement is that every record sits on a single line, with no embedded newline characters.
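
    To follow along, you can create the file with a quoted heredoc, which keeps each record on its own line:

    cat > pageviews.json <<'EOF'
    {"time": "2015-09-01T00:00:00Z", "url": "/foo/bar", "user": "alice", "latencyMs": 32}
    {"time": "2015-09-01T01:00:00Z", "url": "/", "user": "bob", "latencyMs": 11}
    {"time": "2015-09-01T01:30:00Z", "url": "/foo/bar", "user": "bob", "latencyMs": 45}
    EOF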
    Then write the ingestion spec, my-index-task.json. The fragments above fit together into a complete task along the lines of the 0.10.1 batch tutorial (verify against quickstart/wikiticker-index.json in your download):
    {
      "type" : "index_hadoop",
      "spec" : {
        "dataSchema" : {
          "dataSource" : "pageviews",
          "parser" : {
            "type" : "hadoopyString",
            "parseSpec" : {
              "format" : "json",
              "timestampSpec" : {
                "format" : "auto",
                "column" : "time"
              },
              "dimensionsSpec" : {
                "dimensions" : ["url", "user"]
              }
            }
          },
          "metricsSpec" : [
            {"name": "views", "type": "count"},
            {"name": "latencyMs", "type": "doubleSum", "fieldName": "latencyMs"}
          ],
          "granularitySpec" : {
            "type" : "uniform",
            "segmentGranularity" : "day",
            "queryGranularity" : "none",
            "intervals" : ["2015-09-01/2015-09-02"]
          }
        },
        "ioConfig" : {
          "type" : "hadoop",
          "inputSpec" : {
            "type" : "static",
            "paths" : "pageviews.json"
          }
        }
      }
    }
    【3】Make sure the indexing task can read pageviews.json:
    (1)If running locally (with no Hadoop connection configured), place pageviews.json in the Druid root directory.
    (2)If connecting to Hadoop, change paths in inputSpec accordingly.
    【4】Submit the task:
    curl -X 'POST' -H 'Content-Type:application/json' -d @my-index-task.json OVERLORD_IP:8090/druid/indexer/v1/task
    If running locally, use:
    curl -X 'POST' -H 'Content-Type:application/json' -d @my-index-task.json localhost:8090/druid/indexer/v1/task
    Track indexing progress in the overlord console at http://OVERLORD_IP:8090/console.html
    【5】Wait for the data
    The data becomes available one to two minutes after indexing finishes; check the Coordinator console at http://localhost:8081/#/.
    【6】Query the data
    http://druid.io/docs/0.10.1/querying/querying.html
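
    As a concrete starting point, a minimal timeseries query against the new pageviews datasource (a sketch; the file name pageviews-query.json is arbitrary, and the aggregators reference the views and latencyMs metrics defined in the spec above):

    cat > pageviews-query.json <<'EOF'
    {
      "queryType": "timeseries",
      "dataSource": "pageviews",
      "granularity": "day",
      "intervals": ["2015-09-01/2015-09-02"],
      "aggregations": [
        {"type": "longSum", "name": "views", "fieldName": "views"},
        {"type": "doubleSum", "name": "latencyMs", "fieldName": "latencyMs"}
      ]
    }
    EOF
    curl -L -H 'Content-Type: application/json' -XPOST --data-binary @pageviews-query.json http://localhost:8082/druid/v2/?pretty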

    Example 2: Consume Kafka data

    Tutorial: Load from Kafka
    【1】Download and start Kafka:
    curl -O http://www.us.apache.org/dist/kafka/0.9.0.0/kafka_2.11-0.9.0.0.tgz
    tar -xzf kafka_2.11-0.9.0.0.tgz
    cd kafka_2.11-0.9.0.0
    Start the Kafka broker:
    ./bin/kafka-server-start.sh config/server.properties
    Create a Kafka topic named metrics:
    ./bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic metrics
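    Optionally list topics to confirm it was created (standard Kafka tooling):
    ./bin/kafka-topics.sh --list --zookeeper localhost:2181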
    【2】Send sample data
    In the Druid directory, generate test data with bin/generate-example-metrics.
    Start a Kafka console producer:
    ./bin/kafka-console-producer.sh --broker-list localhost:9092 --topic metrics
    Paste the generated data into the producer's terminal, or pipe it in directly as sketched below.
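
    A sketch of the piped alternative, assuming bin/generate-example-metrics writes its events to stdout and that Kafka was unpacked alongside the Druid directory:

    # Run from druid-0.10.1; adjust the Kafka path to match your layout.
    bin/generate-example-metrics | ../kafka_2.11-0.9.0.0/bin/kafka-console-producer.sh --broker-list localhost:9092 --topic metrics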
    【3】Query the data

    Reposted from http://blog.csdn.net/hjw199089/article/details/78572034


    Author: 大诗兄_zl
    Link: https://www.jianshu.com/p/03d32119dfdc
    Source: 简书 (Jianshu)
    Copyright remains with the author; for any form of reproduction, contact the author for authorization and cite the source.