  • Indexing and Hashing

    DATABASE SYSTEM CONCEPTS, SIXTH EDITION
    11.1 Basic Concepts
    An index for a file in a database system works in much the same way as the index
    in this textbook. If we want to learn about a particular topic (specified by a word
    or a phrase) in this textbook, we can search for the topic in the index at the back
    of the book, find the pages where it occurs, and then read the pages to find the
    information for which we are looking. The words in the index are in sorted order,
    making it easy to find the word we want. Moreover, the index is much smaller
    than the book, further reducing the effort needed.
    Database-system indices play the same role as book indices in libraries. For
    example, to retrieve a student record given an ID, the database system would look
    up an index to find on which disk block the corresponding record resides, and
    then fetch the disk block, to get the appropriate student record.
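
    The lookup described above can be sketched in a few lines. This is only an
    illustration: the in-memory blocks list stands in for disk blocks, and the
    student IDs, names, and the fetch_student helper are all hypothetical.

```python
# Hypothetical in-memory stand-in for a file stored in disk blocks.
# Each block holds a few (student_id, name) records; the data is made up.
blocks = [
    [("10101", "Ada"), ("12121", "Ben")],   # block 0
    [("15151", "Cy"), ("22222", "Dee")],    # block 1
    [("32343", "Eli"), ("33456", "Fay")],   # block 2
]

# The index maps each search-key value (the ID) to the block holding its record.
index = {
    student_id: block_no
    for block_no, block in enumerate(blocks)
    for student_id, _ in block
}


def fetch_student(student_id):
    """Look up the index, fetch one block, then find the record inside it."""
    block_no = index.get(student_id)
    if block_no is None:
        return None                          # no such ID in the file
    block = blocks[block_no]                 # stands in for a single disk read
    for record in block:
        if record[0] == student_id:
            return record
    return None
```

    Only one block is read per lookup here, instead of scanning every block in
    the file.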
    Keeping a sorted list of students’ IDs would not work well on very large
    databases with thousands of students, since the index would itself be very big;
    further, even though keeping the index sorted reduces the search time, finding a
    student can still be rather time-consuming. Instead, more sophisticated indexing
    techniques may be used. We shall discuss several of these techniques in this
    chapter.
    There are two basic kinds of indices:


    Ordered indices. Based on a sorted ordering of the values.


    Hash indices. Based on a uniform distribution of values across a range of
    buckets. The bucket to which a value is assigned is determined by a function,
    called a hash function.
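
    As a rough sketch of the difference between the two kinds, the toy classes
    below keep integer keys either in sorted order or in hash buckets. The class
    names, the bucket count, and the modulo hash function are assumptions made
    for illustration, not structures defined in the text.

```python
from bisect import bisect_left, bisect_right

NUM_BUCKETS = 4  # assumed bucket count, chosen only for illustration


def hash_function(key):
    """Toy hash function: maps an integer key to a bucket number."""
    return key % NUM_BUCKETS


class OrderedIndex:
    """Keeps the keys in sorted order and searches them by binary search."""

    def __init__(self, keys):
        self.keys = sorted(keys)

    def contains(self, key):
        i = bisect_left(self.keys, key)
        return i < len(self.keys) and self.keys[i] == key

    def range_query(self, low, high):
        """All keys k with low <= k <= high, located by two binary searches."""
        start = bisect_left(self.keys, low)
        end = bisect_right(self.keys, high)
        return self.keys[start:end]


class HashIndex:
    """Distributes the keys across buckets chosen by hash_function."""

    def __init__(self, keys):
        self.buckets = [[] for _ in range(NUM_BUCKETS)]
        for key in keys:
            self.buckets[hash_function(key)].append(key)

    def contains(self, key):
        # Only the one bucket selected by the hash function is examined.
        return key in self.buckets[hash_function(key)]

    def range_query(self, low, high):
        # Hashing scatters neighbouring keys, so every bucket must be scanned.
        return sorted(k for b in self.buckets for k in b if low <= k <= high)
```

    Both structures answer a lookup for a single value by examining a small
    part of the data, but only the ordered one can answer a range query without
    visiting every bucket.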


    We shall consider several techniques for both ordered indexing and hashing.
    No one technique is the best. Rather, each technique is best suited to particular
    database applications. Each technique must be evaluated on the basis of these
    factors:



    Access types: The types of access that are supported efficiently. Access types
    can include finding records with a specified attribute value and finding
    records whose attribute values fall in a specified range.

    Access time: The time it takes to find a particular data item, or set of items,
    using the technique in question.

    Insertion time: The time it takes to insert a new data item. This value includes
    the time it takes to find the correct place to insert the new data item, as well
    as the time it takes to update the index structure.

    Deletion time: The time it takes to delete a data item. This value includes
    the time it takes to find the item to be deleted, as well as the time it takes to
    update the index structure.

    Space overhead: The additional space occupied by an index structure. Provided
    that the amount of additional space is moderate, it is usually worthwhile
    to sacrifice the space to achieve improved performance.

    We often want to have more than one index for a file. For example, we may
    wish to search for a book by author, by subject, or by title.
    An attribute or set of attributes used to look up records in a file is called a
    search key. Note that this definition of key differs from that used in primary key,
    candidate key, and superkey. This duplicate meaning for key is (unfortunately) well
    established in practice. Using our notion of a search key, we see that if there are
    several indices on a file, there are several search keys.
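
    To make the idea of several search keys on one file concrete, here is a
    minimal sketch that builds one index per search key over the same set of
    book records. The records, the build_index and lookup helpers, and the use
    of list positions as record pointers are all hypothetical.

```python
from collections import defaultdict

# Hypothetical book records; titles, authors, and subjects are made up.
books = [
    {"title": "Storage Basics", "author": "Lee", "subject": "databases"},
    {"title": "Query Tuning", "author": "Lee", "subject": "databases"},
    {"title": "Graph Walks", "author": "Kim", "subject": "algorithms"},
]


def build_index(records, search_key):
    """Map each value of the search key to the positions of matching records."""
    index = defaultdict(list)
    for position, record in enumerate(records):
        index[record[search_key]].append(position)
    return index


# One index per search key, all built over the same file.
indices = {key: build_index(books, key) for key in ("author", "subject", "title")}


def lookup(search_key, value):
    """Return every record whose search-key attribute equals value."""
    return [books[p] for p in indices[search_key].get(value, [])]
```

    Note that the author search key maps "Lee" to two records: unlike a primary
    key, a search key does not have to identify a record uniquely.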

  • Original source: https://www.cnblogs.com/rsapaper/p/6236956.html