  • MySQL Programming CS 155P Notes (3): Design Structures

    MySQL programming is more than just writing stored procedures and functions.  It's about designing the entire table structure to best fit the application and its use cases.

    When structuring your tables, it makes the most sense to group together data that is usually read together, and to split off data (columns) that is mostly used in combination with other data or is infrequently used.

    In most simple applications, it may be possible to build one or just a few tables to hold all the necessary information.  However, you may be creating performance issues down the road both in the database and in the application.

    Let's say we're building a "users" table for a website that will mail people things as they score certain points.  We'll need the following data:

    • Username
    • Email Address
    • Mailing Address (street, city, state, zipcode)
    • Password
    • Points Earned

    This could all be put into one table with 9 columns (the eight fields above plus a numeric key).  You'd most likely have an assortment of variable-width columns and some fixed-width columns.  It's always best to lay out your tables with the fixed-width columns first and the variable-width columns last.  It's also good practice to have a numeric key field first.

    So, does it make sense to put all this in a single table?

    When a user logs in, what information do we need?
    We ONLY need to compare their email (or username; which one are you supporting as a login?  maybe both?) and their password.  If those match, you might then want to retrieve their current points balance (and userId).

    The address fields take up most of each row, and the DB has to read the entire row from disk in order to extract the columns we want.  So even though we can write a query that returns just the columns we care about (so we're not wasting application memory or cycles storing and sorting through data the application doesn't need), the DB still has to sift through a lot of data we don't need.

    Imagine if you're trying to sum the total number of points outstanding, or compute the average points.  You're reading a ton of data from the disk in order to do something that should be fast and simple.

    If we move the address into its own table, with userId as the common key, we save ourselves a lot of this effort.  Then, on the rare occasion when we do need to ship something, we just retrieve one row from the other table to get the shipping address.  We've now improved the I/O performance of our user table queries by a significant margin.
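
    As a rough illustration of that split, here is a minimal sketch.  It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs on its own; the column types, sample rows, and password placeholders are assumptions made up for the example, not part of the course material.

```python
import sqlite3

# SQLite (in memory) stands in for MySQL here; types are simplified.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    userId       INTEGER PRIMARY KEY,
    points       INTEGER NOT NULL DEFAULT 0,
    userName     TEXT NOT NULL UNIQUE,
    password     TEXT NOT NULL,
    emailAddress TEXT NOT NULL
);
-- The bulky address columns live in their own table, keyed by userId.
CREATE TABLE addresses (
    userId  INTEGER NOT NULL REFERENCES users(userId),
    street  TEXT,
    city    TEXT,
    state   TEXT,
    zipcode TEXT
);
""")

# Hypothetical sample data.
conn.executemany(
    "INSERT INTO users (userId, points, userName, password, emailAddress) "
    "VALUES (?, ?, ?, ?, ?)",
    [(1, 120, "alice", "pw-hash-1", "alice@example.com"),
     (2, 80, "bob", "pw-hash-2", "bob@example.com")],
)
conn.execute(
    "INSERT INTO addresses VALUES (1, '1 Main St', 'Springfield', 'IL', '62701')"
)

# Frequent query: aggregates only touch the narrow users table.
total_points, avg_points = conn.execute(
    "SELECT SUM(points), AVG(points) FROM users"
).fetchone()

# Rare query: fetch the shipping address only when something ships.
shipping = conn.execute(
    "SELECT street, city, state, zipcode FROM addresses WHERE userId = ?", (1,)
).fetchone()

print(total_points, avg_points, shipping)
```

    The sketch only shows the shape of the two queries; any actual I/O gain depends on the storage engine and on row sizes in the real database.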

    If we then decide we want to support multiple addresses per user, we can now do this easily.  If we had kept the address info in the users table, we would have a very difficult time supporting multiple addresses.

    N:1 Relationships

    Now that we're talking about multi (N) to 1 relationships, it's a good time to talk about other table design considerations.

    In this case I've chosen to use the userId as a field in the address table to associate the address with the user.  However, this may not make the most sense.  It also means that I MUST associate addresses with users.  What if I allow people to gift their prizes later on?  I might have a whole collection of addresses that are associated with people who aren't users, and I might store their info in a "friends" table or whatever seems appropriate at the time.

    Instead, it might make sense to create a "linking table".

    In this case I might have the following tables:

    • users
      • userId (unsigned int) (primary key)
      • userName (char 40) (unique index)
      • password (char 20)
      • points (unsigned int) (default value 0)
      • emailAddress (varchar 80)
    • addresses
      • addressId (unsigned int) (primary key)
      • addressNickname (char 10)
      • city (char 80)
      • state  (char 2)
      • zipcode (int 9)
      • street1 (varchar 80)
      • street2 (varchar 80)
    • user_address  (I use _ to indicate a linking table.. it's a personal convention)
      • userId (unsigned int) (index)
      • addressId (unsigned int) (unique)
        • primary key (userId,addressId)
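
    The layout above could be sketched roughly as follows.  This is an illustration under assumptions: SQLite (through Python's sqlite3) stands in for MySQL, so the fixed-width char and unsigned int types are reduced to TEXT and INTEGER, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    userId       INTEGER PRIMARY KEY,
    userName     TEXT NOT NULL UNIQUE,
    password     TEXT NOT NULL,
    points       INTEGER NOT NULL DEFAULT 0,
    emailAddress TEXT
);
CREATE TABLE addresses (
    addressId       INTEGER PRIMARY KEY,
    addressNickname TEXT,
    city            TEXT,
    state           TEXT,
    zipcode         TEXT,
    street1         TEXT,
    street2         TEXT
);
-- Linking table: UNIQUE on addressId means an address has at most one user.
-- The composite primary key starts with userId, so it also serves
-- lookups by userId.
CREATE TABLE user_address (
    userId    INTEGER NOT NULL,
    addressId INTEGER NOT NULL UNIQUE,
    PRIMARY KEY (userId, addressId)
);
""")

# Hypothetical sample data: one user with two addresses.
conn.execute("INSERT INTO users (userId, userName, password) VALUES (1, 'alice', 'pw-hash')")
conn.execute("INSERT INTO users (userId, userName, password) VALUES (2, 'bob', 'pw-hash')")
conn.executemany(
    "INSERT INTO addresses (addressId, addressNickname, city) VALUES (?, ?, ?)",
    [(10, "home", "Springfield"), (11, "work", "Chicago")],
)
conn.executemany("INSERT INTO user_address VALUES (?, ?)", [(1, 10), (1, 11)])

# points was not supplied, so the default value applies.
default_points = conn.execute(
    "SELECT points FROM users WHERE userId = 1"
).fetchone()[0]

# Linking address 10 to a second user violates the UNIQUE constraint.
try:
    conn.execute("INSERT INTO user_address VALUES (2, 10)")
    shared_allowed = True
except sqlite3.IntegrityError:
    shared_allowed = False

print(default_points, shared_allowed)
```

    Note that the addresses table no longer carries a userId at all, so an address row can exist without any user attached to it.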

    In this case, if I wanted to get all of a user's addresses for a selection list, I'd issue the following (pardon my lazy use of *):

    SELECT * FROM `users` u LEFT JOIN `user_address` ua USING(`userId`) LEFT JOIN `addresses` a USING(`addressId`) WHERE u.`userId`=43234;

    Hopefully I'll have several rows of addresses.  One disadvantage of the join is that I get duplicate data in my rows: I get all the user data repeated on every row when all I really wanted were the addresses, which isn't ideal.  (USING collapses the shared userId and addressId columns into one each; a join written with ON would return those columns twice as well.)  Let's try to clean this up:

    SELECT a.* FROM `addresses` a, `user_address` ua WHERE a.`addressId` = ua.`addressId` AND ua.`userId` = 43234;

    Similarly, if we happen to have an address record and want to find out the user, we can run the same kind of query but look for users by addressId instead.  (I haven't tested these queries; note that user_address has to be listed in the FROM clause for the query above to be valid.)
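
    Both lookup directions can be tried out with a small sketch like the one below.  SQLite via Python's sqlite3 stands in for MySQL, the tables are cut down to the columns the queries need, and the sample rows (including user 7, "bob") are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (userId INTEGER PRIMARY KEY, userName TEXT);
CREATE TABLE addresses (addressId INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE user_address (
    userId    INTEGER NOT NULL,
    addressId INTEGER NOT NULL,
    PRIMARY KEY (userId, addressId)
);
INSERT INTO users VALUES (43234, 'alice'), (7, 'bob');
INSERT INTO addresses VALUES (1, 'Springfield'), (2, 'Chicago'), (3, 'Boston');
INSERT INTO user_address VALUES (43234, 1), (43234, 2), (7, 3);
""")

# All addresses for one user: only address columns come back.
user_addresses = conn.execute(
    """SELECT a.addressId, a.city
       FROM addresses a
       JOIN user_address ua ON ua.addressId = a.addressId
       WHERE ua.userId = ?
       ORDER BY a.addressId""",
    (43234,),
).fetchall()

# The reverse direction: which user owns a given address?
address_owner = conn.execute(
    """SELECT u.userId, u.userName
       FROM users u
       JOIN user_address ua ON ua.userId = u.userId
       WHERE ua.addressId = ?""",
    (3,),
).fetchall()

print(user_addresses)
print(address_owner)
```

    Naming the columns explicitly instead of using * keeps the user data out of the address list entirely.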

    N:N relationships

    We may have lots of users who have the same shipping address.  We could store the same shipping address data multiple times (once for each user), which our current linking table design requires because addressId is unique there.

    However, we might prefer to limit such redundancy.  In this case, assuming we know an address is a duplicate, we can associate that address with multiple users as well.

    user_address  (I use _ to indicate a linking table.. it's a personal convention)

    • userId (unsigned int) (index)
    • addressId (unsigned int) (index)
      • primary key (userId,addressId)

    By changing addressId's unique key to a plain index, I can have multiple references to the same address.  Since my primary key stays the same, I still ensure that any specific (userId, addressId) pair appears at most once.
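
    A minimal sketch of that relaxed linking table, again with SQLite (Python's sqlite3) standing in for MySQL and with made-up ids:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_address (
    userId    INTEGER NOT NULL,
    addressId INTEGER NOT NULL,          -- no longer UNIQUE
    PRIMARY KEY (userId, addressId)
);
-- Plain (non-unique) index for lookups by address.
CREATE INDEX idx_user_address_addressId ON user_address (addressId);
""")

# Users 1 and 2 share address 10; user 1 also has address 11.
conn.executemany(
    "INSERT INTO user_address VALUES (?, ?)", [(1, 10), (2, 10), (1, 11)]
)

# The composite primary key still rejects a repeated (userId, addressId) pair.
try:
    conn.execute("INSERT INTO user_address VALUES (1, 10)")
    duplicate_pair_allowed = True
except sqlite3.IntegrityError:
    duplicate_pair_allowed = False

users_at_address_10 = [
    row[0]
    for row in conn.execute(
        "SELECT userId FROM user_address WHERE addressId = 10 ORDER BY userId"
    )
]

print(duplicate_pair_allowed, users_at_address_10)
```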

    Summary

    There are a lot of ways to model the same data.  Each has benefits and costs.  It's important to understand the specific use cases to determine which data model works best.  In your life as a database programmer or architect you may find that you have the exact same data in 2 different environments and the models for those are completely different.

    It's important to understand which queries will be most frequent, what volume of data is read that isn't needed (for example, selecting only a few columns from a table with many columns), and what impact these things will have on database caches as well as on the application using the data.

    In addition, the response time requirements of an application may either limit or focus your design.

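    Indexes come in three kinds: the primary key, the unique index, and the non-unique index.

    The primary key is very common; a table usually has one, and can have only one.  Values in a primary key column cannot repeat and cannot be NULL.  Because of these restrictions, it is well suited to columns such as codes, identifiers, or national ID numbers.

    A unique index is also called a "no-duplicates index".  Values in a column with a unique index cannot repeat, but the column can store NULL.  This kind of index suits something like the email account column in an employee table: an employee does not necessarily have an email account, so NULL is allowed, but no two employees may have the same email account.

    Both of the indexes above prevent duplicate data from being stored, and they also make querying and maintaining data more efficient.  A non-unique index is only there to make querying and maintaining data more efficient.  Values in a column with a non-unique index can repeat, and the column can store NULL.

    If you want to create an ordinary index (one that allows duplicates), or an index that covers several columns, the index definition must come after all of the column definitions, for example index(id, email).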
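
    The three kinds of index can be compared in a small sketch.  The employees table and its rows are hypothetical, and SQLite (Python's sqlite3) stands in for MySQL, which brings one visible difference: MySQL lets you write index(id, email) inside CREATE TABLE, while SQLite needs a separate CREATE INDEX statement.  The NULL rule for primary keys is not exercised here because SQLite treats an INTEGER PRIMARY KEY specially.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (
    id    INTEGER PRIMARY KEY,   -- primary key: no duplicates
    email TEXT UNIQUE,           -- unique index: no duplicates, NULL allowed
    dept  TEXT
);
-- Non-unique indexes: duplicates allowed, only there to speed up lookups.
CREATE INDEX idx_employees_dept ON employees (dept);
CREATE INDEX idx_employees_id_email ON employees (id, email);
""")

conn.execute("INSERT INTO employees VALUES (1, 'a@example.com', 'sales')")
# Two employees without an email: several NULLs pass the unique index.
conn.execute("INSERT INTO employees VALUES (2, NULL, 'sales')")
conn.execute("INSERT INTO employees VALUES (3, NULL, 'sales')")


def rejected(sql):
    """Return True if the statement fails with a constraint violation."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


dup_primary_key_rejected = rejected(
    "INSERT INTO employees VALUES (1, 'b@example.com', 'ops')"
)
dup_email_rejected = rejected(
    "INSERT INTO employees VALUES (4, 'a@example.com', 'ops')"
)
row_count = conn.execute("SELECT COUNT(*) FROM employees").fetchone()[0]

print(dup_primary_key_rejected, dup_email_rejected, row_count)
```

    The repeated 'sales' value in dept is accepted because that index is non-unique.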

  • Original source: https://www.cnblogs.com/ecwork/p/8520282.html