  • druid

    Real-time analytical database

    Druid | Interactive Analytics at Scale http://druid.io/

    Druid is primarily used to store, query, and analyze large event streams. Examples of event streams include user generated data such as clickstreams, application generated data such as performance metrics, and machine generated data such as network flows and server metrics. Druid is optimized for sub-second queries to slice-and-dice, drill down, search, filter, and aggregate this data. Druid is commonly used to power interactive applications where performance, concurrency, and uptime are important.

    Druid was initially created to power a scalable, visual, multi-tenant application where users could not only rapidly slice and dice data to create ad-hoc reports, but also interactively explore data to quickly determine the root cause of patterns and anomalies. Druid is designed from the ground up for sub-second queries, which are critical in interactive applications as usability studies have shown that humans get distracted and lose their train of thought if responses take longer than a second.

    Design

    Druid’s core design combines ideas from OLAP/analytic databases, timeseries databases, and search systems to create a unified system for operational analytics. Core design ideas include:

    Column-oriented storage

    Druid stores and compresses each column individually, and only needs to read the ones needed for a particular query, which supports fast scans, rankings, and groupBys.
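
    A minimal sketch of the column-oriented idea, not Druid's actual storage format: each column lives in its own list, and an aggregation touches only the columns it names. The class `ColumnTable` and the sample columns are hypothetical, chosen purely for illustration.

```python
class ColumnTable:
    """Toy column store: one Python list per column (illustrative only)."""

    def __init__(self, columns):
        lengths = {len(values) for values in columns.values()}
        if len(lengths) > 1:
            raise ValueError("all columns must have the same number of rows")
        self.columns = columns
        self.columns_read = set()  # records which columns a query touched

    def read(self, name):
        # Reading is per column, so unrelated columns are never loaded.
        self.columns_read.add(name)
        return self.columns[name]

    def sum_by(self, group_col, value_col):
        # A groupBy-style aggregation that scans exactly two columns.
        totals = {}
        for key, value in zip(self.read(group_col), self.read(value_col)):
            totals[key] = totals.get(key, 0) + value
        return totals


table = ColumnTable({
    "country": ["US", "DE", "US", "FR"],
    "page": ["/a", "/b", "/a", "/c"],
    "bytes": [100, 250, 50, 75],
})
totals = table.sum_by("country", "bytes")
print(totals)                      # per-country byte totals
print(sorted(table.columns_read))  # the "page" column was never read
```

    In a row-oriented layout the same query would have to walk every field of every row, including "page".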

    Native search indexes

    Druid creates inverted indexes for string values for fast search and filter.
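
    The idea of an inverted index can be sketched as a map from each distinct string value to the set of row numbers containing it, so that a filter becomes a set intersection instead of a full scan. This is an illustration under simplifying assumptions: plain Python sets stand in for whatever compressed index structure a real system uses, and the helper `build_inverted_index` and the sample columns are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(values):
    """Map each distinct value to the set of row ids where it occurs."""
    index = defaultdict(set)
    for row_id, value in enumerate(values):
        index[value].add(row_id)
    return index


country = ["US", "DE", "US", "FR", "US"]
device = ["ios", "ios", "android", "ios", "ios"]

country_index = build_inverted_index(country)
device_index = build_inverted_index(device)

# WHERE country = 'US' AND device = 'ios' becomes a set intersection.
matching_rows = sorted(country_index["US"] & device_index["ios"])
print(matching_rows)
```

    An OR filter would use set union in the same way.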

    Streaming and batch ingest

    Out-of-the-box connectors for Apache Kafka, HDFS, AWS S3, stream processors, and more.

    Flexible schemas

    Druid gracefully handles evolving schemas and nested data.

    Time-optimized partitioning

    Druid partitions data by time, so time-based queries are significantly faster than in traditional databases.
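
    Why time partitioning helps can be shown with a toy store that buckets events by hour and skips every bucket that cannot overlap the queried interval. This is a sketch of the general technique, not Druid's implementation; the class `TimePartitionedStore`, the hourly granularity, and the sample events are assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def hour_bucket(ts):
    """Truncate a timestamp to the start of its hour."""
    return ts.replace(minute=0, second=0, microsecond=0)


class TimePartitionedStore:
    def __init__(self):
        self.partitions = defaultdict(list)  # bucket start -> [(ts, event)]

    def add(self, ts, event):
        self.partitions[hour_bucket(ts)].append((ts, event))

    def query(self, start, end):
        """Return (events with start <= ts < end, number of partitions scanned)."""
        results, scanned = [], 0
        for bucket, rows in self.partitions.items():
            # Prune partitions that lie entirely outside [start, end).
            if bucket + timedelta(hours=1) <= start or bucket >= end:
                continue
            scanned += 1
            results.extend(event for ts, event in rows if start <= ts < end)
        return results, scanned


def utc(hour, minute):
    return datetime(2024, 1, 1, hour, minute, tzinfo=timezone.utc)


store = TimePartitionedStore()
store.add(utc(10, 5), "a")
store.add(utc(10, 40), "b")
store.add(utc(11, 15), "c")
store.add(utc(13, 30), "d")

events, scanned = store.query(utc(10, 30), utc(11, 30))
print(events, scanned)
```

    The 13:00 partition is skipped without looking at any of its rows; with many partitions and a narrow query interval, most of the data is never touched.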

    SQL support

    In addition to its native JSON-based language, Druid speaks SQL over either HTTP or JDBC.
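
    A sketch of what SQL over HTTP can look like from a client, using only the Python standard library. It assumes the SQL endpoint is reachable at `/druid/v2/sql` and accepts a JSON body with a `query` field; the address `http://localhost:8888` and the datasource name `clickstream` are placeholders, so check them against your own deployment. Only the request is built here; `run_sql` would send it and needs a running cluster.

```python
import json
import urllib.request


def build_sql_request(base_url, sql):
    """Build (but do not send) a POST request carrying a SQL query as JSON."""
    body = json.dumps({"query": sql}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/druid/v2/sql",  # assumed endpoint path
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sql(base_url, sql):
    """Send the query and parse the JSON response (requires a live server)."""
    with urllib.request.urlopen(build_sql_request(base_url, sql)) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical datasource name; "__time" is the timestamp column.
sql = (
    'SELECT COUNT(*) AS events FROM "clickstream" '
    "WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '1' HOUR"
)
request = build_sql_request("http://localhost:8888", sql)
print(request.get_method(), request.full_url)
```

    Calling `run_sql("http://localhost:8888", sql)` against a running cluster would return the parsed result rows.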

    Horizontally scalable

    Druid has been used in production to ingest millions of events/sec, retain years of data, and provide sub-second queries.

    Easy to operate

    Scale up or down by just adding or removing servers, and Druid automatically rebalances. Fault-tolerant architecture routes around server failures.

    To learn more, read our Technology page.

    Use cases

    Druid is proven in production at the world’s leading companies, with the largest installations having more than a thousand servers, ingesting over 10 million events per second, and supporting thousands of concurrent queries per second. Druid is used to:

    Analyze performance

    Create interactive dashboards with full drill down capabilities. Analyze performance of digital products, track mobile app usage, or monitor site reliability.

    Diagnose problems

    Find the root cause of issues. Troubleshoot netflow bottlenecks, analyze security threats, or diagnose software crashes.

    Find commonalities

    Find common attributes among events. Identify shared components in defective products, or determine patterns in top performing products.

    Increase efficiency

    Improve product engagement. Optimize ad-spend in digital marketing campaigns or increase user engagement in online products.

    To learn more, read our Use Cases page.

  • Original article: https://www.cnblogs.com/rsapaper/p/9857791.html