  • Data Warehouse

    Knowledge Discovery Process

    OLTP & OLAP

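    Online transaction processing (OLTP) systems cover most of an organization's day-to-day operations: purchasing, inventory, banking, manufacturing, payroll, registration, accounting.
    Online analytical processing (OLAP) systems organize and present data in different formats to meet the varied needs of different users, and serve data analysis and decision making.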
     
    Distinct features (OLTP vs. OLAP):
     User and system orientation: customer vs. market
     Data contents: current, detailed vs. historical, consolidated
     View: current, local vs. evolutionary, integrated
     Access patterns: update vs. read-only but complex queries
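
    The difference in access patterns can be illustrated with a minimal sketch using Python's standard sqlite3 module. The orders table, its columns, and the sample rows are hypothetical and chosen only for illustration; a real warehouse would keep the analytical data in a separate store rather than in the operational table.

```python
import sqlite3

# Hypothetical operational table, invented for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, customer TEXT, region TEXT, year INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, region, year, amount) VALUES (?, ?, ?, ?)",
    [
        ("alice", "north", 2013, 120.0),
        ("bob", "north", 2014, 80.0),
        ("carol", "south", 2014, 200.0),
    ],
)

# OLTP-style access: a short transaction that updates one current, detailed record.
with conn:
    conn.execute("UPDATE orders SET amount = ? WHERE id = ?", (95.0, 2))

# OLAP-style access: a read-only query that consolidates many records.
olap_rows = conn.execute(
    "SELECT region, year, SUM(amount) FROM orders "
    "GROUP BY region, year ORDER BY region, year"
).fetchall()

for row in olap_rows:
    print(row)
```

    The update touches a single row by key, while the aggregate query scans the whole table and returns consolidated totals per region and year.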

    Data Warehouse

    A DBMS is tuned for OLTP: access methods, indexing, concurrency control, recovery
    A warehouse is tuned for OLAP: complex OLAP queries, multidimensional view, consolidation
     
    Data Warehouse:
     
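    A data warehouse integrates the business data scattered across different information islands on the enterprise network and stores it in a single, integrated relational database. This integrated information makes data easier for users to access and lets decision makers analyze historical data over a period of time to study how things are trending.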
     
    “A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile collection of data in support of management’s decision-making process.” (W. H. Inmon)
     
    Data stored in a data warehouse has been processed through extraction, cleaning, transformation, loading (sort, summarize, ...) and refresh.
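
    The processing steps above can be sketched as a tiny pipeline in plain Python. The record layout, the function names, and the sample data are all assumptions made for illustration; they are not taken from the course material, and real ETL tools do far more than this.

```python
from collections import defaultdict

# Hypothetical raw records from source systems (invented sample data).
raw_sales = [
    {"store": " Beijing ", "date": "2014-11-01", "amount": "120.5"},
    {"store": "beijing", "date": "2014-11-01", "amount": "79.5"},
    {"store": "Shanghai", "date": "2014-11-02", "amount": None},  # dirty row
    {"store": "Shanghai", "date": "2014-11-02", "amount": "300"},
]


def extract(source):
    """Extraction: copy records out of the source system."""
    return list(source)


def clean(rows):
    """Cleaning: drop rows with a missing amount and normalize store names."""
    return [
        {**row, "store": row["store"].strip().lower()}
        for row in rows
        if row["amount"] is not None
    ]


def transform(rows):
    """Transformation: convert types and derive the month from the date."""
    return [
        {"store": r["store"], "month": r["date"][:7], "amount": float(r["amount"])}
        for r in rows
    ]


def load(rows, warehouse):
    """Loading: summarize amounts per (store, month) into the warehouse."""
    for r in rows:
        warehouse[(r["store"], r["month"])] += r["amount"]
    return warehouse


warehouse = defaultdict(float)
load(transform(clean(extract(raw_sales))), warehouse)
print(dict(warehouse))
```

    In this sketch, a refresh would amount to calling load again with newly extracted rows, so that the summaries in the warehouse pick up changes from the sources.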
     
     
     
    Data warehouse model: dimensions and measures. You locate data by its dimensions and view it through its measures.
    Conceptual models: star schema, snowflake schema (a refinement of the star schema), fact constellation (a collection of stars)
    Example of Star Schema:
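
    One way to make the star schema concrete is a minimal sketch with Python's standard sqlite3 module: a central fact table holding the measures, surrounded by dimension tables that it references by key. The table names, columns, and rows below are hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables: one row per member, with descriptive attributes.
    CREATE TABLE time_dim (
        time_key INTEGER PRIMARY KEY, day INTEGER, month INTEGER,
        quarter TEXT, year INTEGER);
    CREATE TABLE item_dim (
        item_key INTEGER PRIMARY KEY, item_name TEXT, brand TEXT, category TEXT);
    CREATE TABLE location_dim (
        location_key INTEGER PRIMARY KEY, city TEXT, country TEXT);

    -- Fact table: foreign keys to every dimension plus the measures.
    CREATE TABLE sales_fact (
        time_key INTEGER REFERENCES time_dim(time_key),
        item_key INTEGER REFERENCES item_dim(item_key),
        location_key INTEGER REFERENCES location_dim(location_key),
        units_sold INTEGER,
        dollars_sold REAL);

    INSERT INTO time_dim VALUES (1, 1, 11, 'Q4', 2014), (2, 2, 1, 'Q1', 2015);
    INSERT INTO item_dim VALUES
        (1, 'laptop', 'acme', 'computer'), (2, 'phone', 'acme', 'mobile');
    INSERT INTO location_dim VALUES (1, 'Beijing', 'China'), (2, 'Shanghai', 'China');
    INSERT INTO sales_fact VALUES
        (1, 1, 1, 2, 2000.0), (1, 2, 2, 5, 1500.0), (2, 1, 2, 1, 1000.0);
    """
)

# Locate data by dimensions (year, city) and view it through a measure (dollars_sold).
star_rows = conn.execute(
    """
    SELECT t.year, l.city, SUM(f.dollars_sold)
    FROM sales_fact AS f
    JOIN time_dim AS t ON f.time_key = t.time_key
    JOIN location_dim AS l ON f.location_key = l.location_key
    GROUP BY t.year, l.city
    ORDER BY t.year, l.city
    """
).fetchall()

for row in star_rows:
    print(row)
```

    A snowflake schema would further normalize a dimension, for example by moving country out of location_dim into its own table.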
     
    Typical OLAP Operations:
    Roll up: summarize data by climbing up a hierarchy or by dimension reduction; rolling a dimension up to "all" removes that dimension
    Drill down: the reverse of roll up, from a higher-level summary to a lower-level summary or detailed data
    Slice and dice: project and select
    Pivot (rotate): reorient the cube for visualization, e.g. turn a 3D cube into a series of 2D planes
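
    The operations above can be sketched on a small in-memory cube in plain Python. The cube contents, the dimension names, the city-to-country hierarchy, and all function names are hypothetical; an OLAP engine would implement these operations very differently. Drill down has no function of its own here: it corresponds to going back to a view that keeps more dimensions, which requires the detailed data to still be available.

```python
from collections import defaultdict

# Hypothetical cube: (quarter, city, item) -> units sold.
DIMS = ("quarter", "city", "item")
cube = {
    ("Q1", "Beijing", "phone"): 10,
    ("Q1", "Beijing", "laptop"): 4,
    ("Q1", "Shanghai", "phone"): 7,
    ("Q2", "Beijing", "phone"): 12,
    ("Q2", "Shanghai", "laptop"): 5,
}
CITY_TO_COUNTRY = {"Beijing": "China", "Shanghai": "China"}


def roll_up(cube, keep):
    """Roll up by dimension reduction: sum over every dimension not in keep."""
    idx = [DIMS.index(d) for d in keep]
    out = defaultdict(int)
    for key, value in cube.items():
        out[tuple(key[i] for i in idx)] += value
    return dict(out)


def climb(cube, dim, parent_of):
    """Roll up by climbing a hierarchy: replace each member of dim by its parent."""
    i = DIMS.index(dim)
    out = defaultdict(int)
    for key, value in cube.items():
        out[key[:i] + (parent_of[key[i]],) + key[i + 1:]] += value
    return dict(out)


def slice_(cube, dim, member):
    """Slice: select a single member on one dimension."""
    i = DIMS.index(dim)
    return {k: v for k, v in cube.items() if k[i] == member}


def dice(cube, **allowed):
    """Dice: select a sub-cube by restricting two or more dimensions."""
    return {
        k: v
        for k, v in cube.items()
        if all(k[DIMS.index(d)] in members for d, members in allowed.items())
    }


def pivot(view_2d):
    """Pivot: swap the two axes of a 2D view."""
    return {(b, a): v for (a, b), v in view_2d.items()}


by_quarter = roll_up(cube, ["quarter"])            # city and item removed
grand_total = roll_up(cube, [])                    # every dimension rolled up to "all"
by_country = climb(cube, "city", CITY_TO_COUNTRY)  # city -> country
q1_only = slice_(cube, "quarter", "Q1")
beijing_h1 = dice(cube, quarter={"Q1", "Q2"}, city={"Beijing"})
city_by_quarter = pivot(roll_up(cube, ["quarter", "city"]))
```

    Going from by_quarter back to roll_up(cube, ["quarter", "city"]) is the drill-down direction: the same totals, broken out by one more dimension.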
     

    References

    Slides from the Data Mining course, University of Chinese Academy of Sciences
  • Original article: https://www.cnblogs.com/shine-lee/p/4126972.html