  • Data Warehouse

    Knowledge Discovery Process

    OLTP & OLAP

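    Online transaction processing (OLTP) systems cover most of an organization's day-to-day operations: purchasing, inventory, banking, manufacturing, payroll, registration, accounting.
    Online analytical processing (OLAP) systems organize and present data in different formats to meet the varied needs of different users, serving data analysis and decision making.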
     
    Distinct features (OLTP vs. OLAP):
     User and system orientation: customer vs. market
     Data contents: current, detailed vs. historical, consolidated
     View: current, local vs. evolutionary, integrated
     Access patterns: update vs. read-only but complex queries
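The contrast in access patterns can be made concrete with a small sketch. It assumes a hypothetical `orders` table in an in-memory SQLite database; the table, its columns and the sample rows are illustrative and not from the course material.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, customer TEXT, region TEXT, year INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?, ?)",
    [
        (1, "alice", "north", 2013, 120.0),
        (2, "bob", "south", 2013, 80.0),
        (3, "alice", "north", 2014, 200.0),
        (4, "carol", "south", 2014, 50.0),
    ],
)

# OLTP-style access: a short transaction that updates one current, detailed record.
with conn:
    conn.execute("UPDATE orders SET amount = ? WHERE id = ?", (90.0, 2))

# OLAP-style access: a read-only query that consolidates historical data
# along two dimensions (region and year).
olap_result = conn.execute(
    "SELECT region, year, SUM(amount) FROM orders "
    "GROUP BY region, year ORDER BY region, year"
).fetchall()
print(olap_result)
```

The update touches a single row, while the analytical query scans the whole table and returns consolidated totals, which is why the two workloads are usually tuned separately.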

    Data Warehouse

    DBMS: tuned for OLTP (access methods, indexing, concurrency control, recovery)
    Warehouse: tuned for OLAP (complex OLAP queries, multidimensional view, consolidation)
     
    Data Warehouse:
     
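    A data warehouse integrates business data scattered across separate information islands on the enterprise network and stores it in a single integrated relational database. With this integrated information, users can access data conveniently, and decision makers can analyze historical data over a period of time to study how things are trending.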
     
    “A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile collection of data in support of management’s decision-making process.” (W. H. Inmon)
     
    Data stored in a data warehouse has been processed through extraction, cleaning, transformation, loading (sort, summarize, ...) and refresh.
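The processing steps can be sketched as a small pipeline. This is only an illustration under stated assumptions: the two source systems (`raw_crm`, `raw_shop`), their field names and the monthly summary are hypothetical, and a real warehouse would use dedicated ETL tooling.

```python
from collections import defaultdict

# Hypothetical raw records from two source systems with inconsistent formats.
raw_crm = [
    {"cust": " Alice ", "amt": "120.5", "date": "2014-11-01"},
    {"cust": "bob", "amt": "n/a", "date": "2014-11-02"},
]
raw_shop = [{"customer": "ALICE", "amount": 30, "day": "2014-11-03"}]


def extract():
    """Extraction: pull records from each source into one common layout."""
    for r in raw_crm:
        yield {"customer": r["cust"], "amount": r["amt"], "date": r["date"]}
    for r in raw_shop:
        yield {"customer": r["customer"], "amount": r["amount"], "date": r["day"]}


def clean(records):
    """Cleaning: drop records whose amount cannot be parsed as a number."""
    for r in records:
        try:
            amount = float(r["amount"])
        except (TypeError, ValueError):
            continue
        yield {**r, "amount": amount}


def transform(records):
    """Transformation: normalize customer names and derive the month."""
    for r in records:
        yield {
            "customer": r["customer"].strip().lower(),
            "month": r["date"][:7],
            "amount": r["amount"],
        }


def load(records, warehouse):
    """Loading: summarize amounts per (customer, month) into the warehouse."""
    for r in records:
        warehouse[(r["customer"], r["month"])] += r["amount"]


warehouse = defaultdict(float)
load(transform(clean(extract())), warehouse)
print(dict(warehouse))
```

Refresh would correspond to calling `load` again with newly extracted records, so the summarized totals pick up changes from the sources.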
     
     
     
    Data warehouse model: dimensions and measures. Dimensions locate a piece of data; measures are the values you read at that location.
    Conceptual models: star schema, snowflake schema (a refinement of the star schema), fact constellation (a collection of stars)
    Example of Star Schema:
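As a rough sketch, a star schema puts one fact table in the center, holding the measures and a foreign key to each dimension table. The table and column names below (`fact_sales`, `dim_time`, `dim_item`, `dim_location`) and the sample rows are assumptions chosen for illustration, not taken from the course slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables: one per dimension, each with a surrogate key.
    CREATE TABLE dim_time (time_key INTEGER PRIMARY KEY, year INTEGER, quarter TEXT);
    CREATE TABLE dim_item (item_key INTEGER PRIMARY KEY, name TEXT, category TEXT);
    CREATE TABLE dim_location (location_key INTEGER PRIMARY KEY, city TEXT, country TEXT);

    -- Fact table: foreign keys to every dimension plus the measures.
    CREATE TABLE fact_sales (
        time_key INTEGER REFERENCES dim_time(time_key),
        item_key INTEGER REFERENCES dim_item(item_key),
        location_key INTEGER REFERENCES dim_location(location_key),
        units_sold INTEGER,
        dollars_sold REAL
    );
    """
)
conn.executemany(
    "INSERT INTO dim_time VALUES (?, ?, ?)", [(1, 2014, "Q1"), (2, 2014, "Q2")]
)
conn.executemany(
    "INSERT INTO dim_item VALUES (?, ?, ?)",
    [(1, "laptop", "computer"), (2, "phone", "mobile")],
)
conn.executemany(
    "INSERT INTO dim_location VALUES (?, ?, ?)",
    [(1, "Beijing", "China"), (2, "Shanghai", "China")],
)
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?, ?, ?)",
    [(1, 1, 1, 2, 2000.0), (1, 2, 2, 5, 1500.0), (2, 1, 2, 1, 1000.0)],
)

# Locate data by dimensions (quarter, city) and read the measure (dollars_sold).
sales_by_quarter_city = conn.execute(
    """
    SELECT t.quarter, l.city, SUM(f.dollars_sold)
    FROM fact_sales f
    JOIN dim_time t ON f.time_key = t.time_key
    JOIN dim_location l ON f.location_key = l.location_key
    GROUP BY t.quarter, l.city
    ORDER BY t.quarter, l.city
    """
).fetchall()
print(sales_by_quarter_city)
```

A snowflake schema would further normalize a dimension, for example by moving `country` out of `dim_location` into its own table.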
     
    Typical OLAP operations:
    Roll up: summarize data by climbing up a hierarchy or by dimension reduction; rolling a dimension up to "all" removes that dimension
    Drill down: the reverse of roll up, from a higher-level summary to a lower-level summary or to detailed data
    Slice and dice: project and select
    Pivot (rotate): reorient the cube for visualization, e.g. turn a 3D cube into a series of 2D planes
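The operations can be illustrated on a tiny cube held in a Python dictionary. This is a minimal sketch, assuming three hypothetical dimensions (quarter, city, item) and a single measure (units sold); the helper names are invented for illustration, and real OLAP engines implement these operations very differently.

```python
from collections import defaultdict

DIMS = ("quarter", "city", "item")

# Detailed cube: one cell per (quarter, city, item) holding units sold.
cube = {
    ("Q1", "Beijing", "laptop"): 2,
    ("Q1", "Shanghai", "laptop"): 3,
    ("Q1", "Shanghai", "phone"): 5,
    ("Q2", "Beijing", "phone"): 4,
}
CITY_TO_COUNTRY = {"Beijing": "China", "Shanghai": "China"}


def roll_up_drop(cells, dim):
    """Roll up by dimension reduction: aggregate one dimension away."""
    i = DIMS.index(dim)
    out = defaultdict(int)
    for key, value in cells.items():
        out[key[:i] + key[i + 1:]] += value
    return dict(out)


def roll_up_climb(cells, dim, hierarchy):
    """Roll up by climbing a hierarchy, e.g. city -> country."""
    i = DIMS.index(dim)
    out = defaultdict(int)
    for key, value in cells.items():
        out[key[:i] + (hierarchy[key[i]],) + key[i + 1:]] += value
    return dict(out)


def slice_cube(cells, dim, value):
    """Slice: select on a single dimension."""
    i = DIMS.index(dim)
    return {k: v for k, v in cells.items() if k[i] == value}


def dice_cube(cells, **allowed):
    """Dice: select on two or more dimensions at once."""
    return {
        k: v
        for k, v in cells.items()
        if all(k[DIMS.index(d)] in values for d, values in allowed.items())
    }


def pivot(cells, order):
    """Pivot: reorder the axes without changing any cell value."""
    idx = [DIMS.index(d) for d in order]
    return {tuple(k[i] for i in idx): v for k, v in cells.items()}


by_quarter_city = roll_up_drop(cube, "item")
by_country = roll_up_climb(cube, "city", CITY_TO_COUNTRY)
q1_only = slice_cube(cube, "quarter", "Q1")
q1_shanghai = dice_cube(cube, quarter={"Q1"}, city={"Shanghai"})
rotated = pivot(cube, ("item", "city", "quarter"))
```

Drill down has no function of its own here: it amounts to going back from a summary such as `by_country` to the detailed `cube`, which is only possible because the detailed cells are still stored.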
     

    Reference

    University of Chinese Academy of Sciences, "Data Mining" course slides
  • Original article: https://www.cnblogs.com/shine-lee/p/4126972.html