  • Data Management Technology(1) -- Introduction

    1.Database concepts

    (1)Data & Information

    Information

    • Is any kind of event that affects the state of a dynamic system
    • Is the message (utterance or expression) being conveyed
    • Is an ordered sequence of symbols that can be interpreted as a message

    As sensory input: often viewed as a type of input to an organism or system

    Can be recorded and transmitted

    Data

    Definition

    • Values of qualitative or quantitative variables, belonging to a set of items
    • Used to record information
    • Is the carrier of Information

    Data type and Data value

    • Data Type: the way values of the data can be stored in computer system
    • Data Value: records the meaning of information

    Data & Information

    Data on its own carries no meaning

    Data must be interpreted to take on a meaning, so that the information can be revealed

    Data is the lower level of abstraction; is the carrier of information

    Information is the interpretation of data
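
    The point that data only becomes information once it is interpreted can be illustrated with a small sketch. The byte values below are made up for illustration; the same four bytes yield different information depending on the interpretation applied.

```python
import struct

# The same four bytes of raw data; on their own they carry no meaning.
raw = bytes([0x42, 0x4F, 0x42, 0x21])

# Interpretation 1: ASCII text.
as_text = raw.decode("ascii")

# Interpretation 2: a big-endian unsigned 32-bit integer.
as_int = struct.unpack(">I", raw)[0]

# Interpretation 3: a big-endian 32-bit floating-point number.
as_float = struct.unpack(">f", raw)[0]

print(as_text, as_int, as_float)
```

    The data type chosen for the interpretation (text, integer, float) determines which information is revealed.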

    (2)Data Processing VS Data Management

    Main categories of Data Manipulation

    Data Management: Database

    Data Processing: Computer program

    Data Transmission: Computer Network

    Tasks of Data Management

    Data Storage: Organize the data and store it on a storage device such as a hard disk

    Data Maintenance: Insert new values, delete invalid data, or modify old data

    Data Query & Statistics: Retrieve information from the stored data

    Application Requirements

    store the data for a long period of time

    • large amounts (100s of GB)
    • protect against crashes
    • protect against unauthorized use

    allow users to query/update:

    allow several (100s, 1000s) users to access the data simultaneously

    allow administrators to change the schema

    Trying Without a DBMS

    Storing data: file system is limited

    • file size is limited to less than 4 GB (on 32-bit machines)
    • when the system crashes we may lose data
    • password-based authorization insufficient

    Query/update:

    • need to write a new C++/Java program for every new query
    • need to worry about performance

    Concurrency: limited protection

    • need to worry about interfering with other users
    • need to offer different views to different users (e.g. registrar, students, professors)

    Schema change:

    • entails changing file formats
    • need to rewrite virtually all applications
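
    The query/update and schema-change problems above can be sketched as follows. This is only an illustration: the CSV layout, the column names, and both functions are hypothetical, and a string stands in for a file on disk.

```python
import csv
import io

# Hypothetical flat file standing in for a file on disk.
STUDENT_FILE = """name,dept,gpa
Alice,CS,3.9
Bob,Math,3.1
Carol,CS,3.4
"""

def students_in_dept(file_text, dept):
    """Query 1: names of students in a department (hand-written scan)."""
    reader = csv.DictReader(io.StringIO(file_text))
    return [row["name"] for row in reader if row["dept"] == dept]

def average_gpa(file_text):
    """Query 2: a different question needs another hand-written function."""
    reader = csv.DictReader(io.StringIO(file_text))
    gpas = [float(row["gpa"]) for row in reader]
    return sum(gpas) / len(gpas)

print(students_in_dept(STUDENT_FILE, "CS"))
print(average_gpa(STUDENT_FILE))
```

    Both functions hard-code the file format, so renaming or reordering a column means revisiting every function that reads the file.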

    Schema Versus Data

    schema: describes how data is to be structured

    defined at set-up time

    rarely changes

    also called “metadata”

    data is the actual “instance” of the database; it changes rapidly

    the distinction is analogous to types vs. variables in programming languages
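
    A minimal sketch of the schema/data distinction, assuming Python's built-in sqlite3 module as the DBMS. The student table and its columns are hypothetical. The schema is defined once and stays the same while the rows come and go.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema: defined once at set-up time (table and column names are hypothetical).
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, gpa REAL)")

def schema_columns(conn):
    # PRAGMA table_info returns one row per column; field 1 is the column name.
    return [row[1] for row in conn.execute("PRAGMA table_info(student)")]

before = schema_columns(conn)

# Data: the instance changes frequently as rows are inserted and deleted.
conn.execute("INSERT INTO student (name, gpa) VALUES ('Alice', 3.9)")
conn.execute("INSERT INTO student (name, gpa) VALUES ('Bob', 3.1)")
conn.execute("DELETE FROM student WHERE name = 'Bob'")

after = schema_columns(conn)
row_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]

print(before == after, row_count)
```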

    DBMS

    Data Definition Language – DDL

    • Easy to define schema

    Data Manipulation Language - DML

    • query language

    Storage management

    • Retrieve data from disk automatically for you

    Transaction Management

    • concurrency control
    • recovery

    Automate a lot of boring/mundane operations on data

    • so that we don’t have to program over and over
    • so that we can write complex data manipulations in just a few lines and concentrate on application logic

    Make execution very fast

    • so that it scales up to very large data sets

    Make concurrent access/modification possible

    • so that many users can use the data at the same time
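
    As a rough sketch of what transaction management provides, the example below uses Python's built-in sqlite3 module as a stand-in DBMS. The account table and the transfer function are hypothetical. A transfer either happens completely or is undone completely, without the application writing any undo logic itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.execute("INSERT INTO account VALUES ('alice', 100)")
conn.execute("INSERT INTO account VALUES ('bob', 50)")
conn.commit()

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM account WHERE name = ?", (src,)
            ).fetchone()
            if balance < amount:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except ValueError:
        return False

ok = transfer(conn, "alice", "bob", 30)       # succeeds
failed = transfer(conn, "alice", "bob", 500)  # credit to bob is rolled back
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(ok, failed, balances)
```

    In the failed transfer the credit to the destination account has already been applied when the error occurs; the rollback restores the previous state.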

    Building an Application with a DBMS

    Requirements modeling (conceptual, pictures)

    • Decide what entities should be part of the application and how they should be linked.

    Schema design and implementation

    • Decide on a set of tables, attributes.
    • Define the tables in the database system.
    • Populate database (insert tuples).

    Write application programs using the DBMS

    • way easier now that the data management is taken care of.
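
    The three steps above can be sketched end to end, again assuming sqlite3 as the DBMS. The entities (student, course), the linking table (enrollment), and the helper courses_of are hypothetical choices made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema design and implementation: two entities plus a table linking them.
conn.executescript("""
    CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE course  (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE enrollment (
        student_id INTEGER REFERENCES student(id),
        course_id  INTEGER REFERENCES course(id),
        PRIMARY KEY (student_id, course_id)
    );
""")

# Populate the database (insert tuples).
conn.executemany("INSERT INTO student VALUES (?, ?)", [(1, "Alice"), (2, "Bob")])
conn.executemany("INSERT INTO course VALUES (?, ?)", [(10, "Databases"), (20, "Networks")])
conn.executemany("INSERT INTO enrollment VALUES (?, ?)", [(1, 10), (1, 20), (2, 10)])
conn.commit()

# Application program: the query itself is a few declarative lines.
def courses_of(conn, student_name):
    rows = conn.execute(
        """SELECT c.title
           FROM student s
           JOIN enrollment e ON e.student_id = s.id
           JOIN course c     ON c.id = e.course_id
           WHERE s.name = ?
           ORDER BY c.title""",
        (student_name,),
    )
    return [title for (title,) in rows]

print(courses_of(conn, "Alice"))
```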

    Querying a Database

    Database Technology

    Started in the 1960s

    Is an important branch of CS

    • Programming language
    • OS
    • DB
    • Network

    Is the main component of computer infrastructure

    Main Functions of DBMS

    Data Definition

    • Provides Data Definition Language to define schema of database

    Data Manipulation

    • Provides Data Manipulation Language to manipulate data in database: RETRIEVE, INSERT, DELETE, MODIFY
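
    In SQL, the four operations named above correspond to SELECT, INSERT, DELETE, and UPDATE. A minimal sketch, assuming sqlite3 and a hypothetical book table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition (DDL): declare the schema.
conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, price REAL)")

# INSERT
conn.execute("INSERT INTO book (id, title, price) VALUES (1, 'Intro to DB', 40.0)")
conn.execute("INSERT INTO book (id, title, price) VALUES (2, 'Old Manual', 5.0)")

# MODIFY (UPDATE in SQL)
conn.execute("UPDATE book SET price = 35.0 WHERE id = 1")

# DELETE
conn.execute("DELETE FROM book WHERE id = 2")

# RETRIEVE (SELECT in SQL)
remaining = conn.execute("SELECT id, title, price FROM book").fetchall()
print(remaining)
```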

    Database operation

    • Security
    • Integrity
    • Concurrency
    • recovery

    Toolsets

    • Data loader
    • Monitor
    • Performance tuning tools

    (3)Database

    • Efficient, convenient, and safe multi-user storage of massive amounts of well-organized persistent data

    (4)Database Management System

    • A software system that manages databases
    • Bought, installed, and set up for a particular application

    (5)Database System

    DBS: an information system that is based on a database

    Consists of the database, the DBMS, applications, and users

    Hardware < OS < DBMS < App development tools < Database system

    2.Development of DB tech

    Stages

    (1)Manual processing

    • Data stored on punched cards
    • Data managed by hand
    • No data sharing

    (2)File system

    • Data stored in files
    • Use the OS I/O interface to access data
    • Measures taken to accelerate data access
    • Rudimentary data-program independence

    Drawbacks of file systems

    Program-Data Dependence

    • All programs maintain metadata for each file they use

    Data Redundancy (Duplication of data)

    • Different systems/programs have separate copies of the same data
    • Multiple file formats, duplication of information in different files
      • Requires extra space and effort, and results in loss of data and metadata integrity

    Limited Data Sharing

    • No centralized control of data
      • Each application has its own private files, and users have little chance to share data outside their own applications

    Lengthy Development Times

    • For each new application programmers must design their own file formats & descriptions from scratch

    Excessive Program Maintenance

    • Consumes about 80% of the information systems budget

    Difficulty in accessing data

    • Need to write a new program to carry out each new task

    Integrity problems

    • Integrity constraints
    • Hard to add new constraints or change existing ones

    (3)Database

    Main advantages compared to file systems:

    Data sharing

    Less data redundancy

    Data-program independence

    Convenient program interface

    Efficient data access

    Data integrity and data security

    Concurrency management

    Database advantages

    Structured data storage

    • Organize the data and link it together through its inherent relationships
    • Automatically manage the relationships between data

    Data sharing

    • One copy of data for many applications
    • Allow many users to access data simultaneously

    Less data redundancy

    • Shared data (correspondences between data items become relationships between tables)

    Data integrity

    • Automatically check the input values of certain data items against the data integrity rules
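
    A minimal sketch of such automatic checks, assuming sqlite3 as the DBMS. The employee table, its rules, and the helper try_insert are hypothetical. The rules are declared once in the schema, and the DBMS rejects any input that violates them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Integrity rules are part of the schema, not of each application.
conn.execute("""
    CREATE TABLE employee (
        id    INTEGER PRIMARY KEY,
        email TEXT NOT NULL UNIQUE,
        age   INTEGER CHECK (age BETWEEN 16 AND 100)
    )
""")

def try_insert(conn, email, age):
    """Attempt an insert and report whether the DBMS accepted it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO employee (email, age) VALUES (?, ?)", (email, age)
            )
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"

results = [
    try_insert(conn, "a@example.com", 30),  # valid row
    try_insert(conn, "a@example.com", 41),  # duplicate email violates UNIQUE
    try_insert(conn, "b@example.com", 7),   # age violates the CHECK rule
    try_insert(conn, None, 25),             # missing email violates NOT NULL
]
stored = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(results, stored)
```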

    Concurrency

    • Isolate concurrent accesses
    • Prevent dirty use of data
      • That is, modifications that may jeopardize the integrity of the data
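
    Isolation can be sketched with two connections to the same database, assuming sqlite3 and a hypothetical seat table stored in a temporary file. The second connection plays the role of another user and does not see the first user's change until it is committed.

```python
import os
import sqlite3
import tempfile

# A file-backed database so that two separate connections can share it.
db_path = os.path.join(tempfile.mkdtemp(), "demo.db")

writer = sqlite3.connect(db_path)
reader = sqlite3.connect(db_path)

writer.execute("CREATE TABLE seat (no INTEGER PRIMARY KEY, holder TEXT)")
writer.commit()

def count_seats(conn):
    # fetchall() reads the full result, so no read stays open afterwards.
    return conn.execute("SELECT COUNT(*) FROM seat").fetchall()[0][0]

# The writer changes data but has not committed yet.
writer.execute("INSERT INTO seat VALUES (1, 'alice')")

# The reader does not see the uncommitted ("dirty") row.
seen_before_commit = count_seats(reader)

writer.commit()

# After the commit the change becomes visible to other users.
seen_after_commit = count_seats(reader)

print(seen_before_commit, seen_after_commit)

writer.close()
reader.close()
```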

    3.Database Architecture

    (1)Database Architecture (physical architecture)

    The architecture of a database system is greatly influenced by the underlying computer system on which the database is running:

    • Centralized
    • Client-server
    • Parallel (multi-processor)
    • Distributed

    (2)Database Users

    Users are differentiated by the way they expect to interact with the system

    • Application programmers – interact with system through DML calls
    • Sophisticated users – form requests in a database query language
    • Specialized users – write specialized database applications that do not fit into the traditional data processing framework
    • Naive users – invoke one of the permanent application programs that have been written previously. Examples: people accessing a database over the web, bank tellers, clerical staff

    (3)Database Administrator

    Coordinates all the activities of the database system; the database administrator has a good understanding of the enterprise’s information resources and needs.

    Database administrator’s duties include:

    • Schema definition
    • Storage structure and access method definition
    • Schema and physical organization modification
    • Granting user authority to access the database
    • Specifying integrity constraints
    • Acting as liaison with users
    • Monitoring performance and responding to changes in requirements
  • Original article: https://www.cnblogs.com/wojiaobuzhidao/p/11071371.html