zoukankan      html  css  js  c++  java
  • 表关联键上创建索引的重要性

         很久没有写SQL相关的文章了,主要是现在技术部分工比以前明确了。网站部门并不自己写SQL查询数据,数据有其它部门提供服务。但并不是所有情况都是这样,有些项目由于之前没有管理,所以只能自己完成。在这次写的一个SQL查询中,体会到了在做join联接时,关联的键需要创建索引的重要性。
        
          说明:
             1:free_room,freeroom这两个表数据量都不大,小于5000行。
             2:room_type_num,这个表有十多万数据,room_type_id没有创建索引。
        

          查询语句:
           

    代码
    FROM      free_room f WITH ( NOLOCK )
                                LEFT JOIN freeroom fr WITH ( NOLOCK ) ON f.id 
    = fr.free_room_id
                                INNER JOIN room_type_num r WITH ( NOLOCK ) ON r.room_type_id 
    = f.room_type_id

         

          执行以上结果,发现在关联room_type_num表时,系统选择了hash join,最优的情况绝不应该是hash join,因为不太符合hash join的要求。 我之前的文章简单说明了loop join以及hash join的概念,先贴出来看下:


          第一种算法:NESTED LOOP:

                          定义: 对于被连接的数据子集较小的情况,嵌套循环连接是个较好的选择。在嵌套循环中,内表被外表驱动,外表返回的每一行都要在内表中检索找到与它匹配的行,因此整个查询返回的结果集不能太大(大于1 万不适合),要把返回子集较小表的作为外表。
                                                       
          第二种算法:HASH JOIN :


                         定义: 散列连接是做大数据集连接时的常用方式,优化器使用两个表中较小的表(或数据源)利用连接键在内存中建立散列表,然后扫描较大的表并探测散列表,找出与散列表匹配的行。这种方式适用于较小的表完全可以放于内存中的情况,这样总成本就是访问两个表的成本之和。但是在表很大的情况下并不能完全放入内存,这时优化器会将它分割成若干不同的分区,不能放入内存的部分就把该分区写入磁盘的临时段,此时要有较大的临时段从而尽量提高I/O 的性能。


           dba建议在room_type_id创建索引,然后在SQL脚本中把inner join 改写成 inner loop join room_type_num。修改后的执行效率比较如下:

            问题:我在测试后,发现room_type_id创建索引后,SQL查询优化器就不再采用hash join room_type_num了,自动选择了最优的loop join。在SQL脚本中人为的写优化提示,并不需要特别指出,只要我们索引创建到位,优化器也会配合我们选择最优的执行计划进行查询,否则只能程序员自己走下后门了。

     

            图一为没有创建索引前的执行计划:

                       

            

    SQL Server parse and compile time:

       CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

       CPU time = 0 ms,  elapsed time = 1 ms.

    SQL Server parse and compile time:

       CPU time = 15 ms, elapsed time = 189 ms.

    SQL Server Execution Times:

       CPU time = 0 ms,  elapsed time = 1 ms.

    SQL Server Execution Times:

       CPU time = 0 ms,  elapsed time = 1 ms.

    SQL Server Execution Times:

       CPU time = 0 ms,  elapsed time = 1 ms.

    DBCC execution completed. If DBCC printed error messages, contact your system administrator.

    SQL Server Execution Times:

       CPU time = 94 ms,  elapsed time = 101 ms.

    DBCC execution completed. If DBCC printed error messages, contact your system administrator.

    SQL Server Execution Times:

       CPU time = 16 ms,  elapsed time = 6 ms.

    SQL Server Execution Times:

       CPU time = 0 ms,  elapsed time = 1 ms.

    SQL Server Execution Times:

       CPU time = 0 ms,  elapsed time = 1 ms.

    SQL Server Execution Times:

       CPU time = 0 ms,  elapsed time = 1 ms.

    SQL Server Execution Times:

       CPU time = 0 ms,  elapsed time = 1 ms.

    (0 row(s) affected)

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'free_room'. Scan count 2, logical reads 16, physical reads 1, read-ahead reads 48, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'free_room_use'. Scan count 1, logical reads 8, physical reads 5, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    (20 row(s) affected)

    (1 row(s) affected)

    SQL Server Execution Times:

       CPU time = 15 ms,  elapsed time = 100 ms.

    SQL Server parse and compile time:

       CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

       CPU time = 0 ms,  elapsed time = 1 ms.

           图二为增加索引后的执行计划:

                       

    SQL Server parse and compile time:

       CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

       CPU time = 0 ms,  elapsed time = 1 ms.

    SQL Server parse and compile time:

       CPU time = 16 ms, elapsed time = 45 ms.

    SQL Server Execution Times:

       CPU time = 0 ms,  elapsed time = 1 ms.

    SQL Server Execution Times:

       CPU time = 0 ms,  elapsed time = 1 ms.

    SQL Server Execution Times:

       CPU time = 0 ms,  elapsed time = 1 ms.

    DBCC execution completed. If DBCC printed error messages, contact your system administrator.

    SQL Server Execution Times:

       CPU time = 109 ms,  elapsed time = 102 ms.

    DBCC execution completed. If DBCC printed error messages, contact your system administrator.

    SQL Server Execution Times:

       CPU time = 0 ms,  elapsed time = 1 ms.

    SQL Server Execution Times:

       CPU time = 0 ms,  elapsed time = 1 ms.

    SQL Server Execution Times:

       CPU time = 0 ms,  elapsed time = 1 ms.

    SQL Server Execution Times:

       CPU time = 0 ms,  elapsed time = 1 ms.

    SQL Server Execution Times:

       CPU time = 0 ms,  elapsed time = 1 ms.

    (0 row(s) affected)

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'free_room'. Scan count 2, logical reads 16, physical reads 1, read-ahead reads 48, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    Table 'free_room_use'. Scan count 1, logical reads 8, physical reads 5, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    (20 row(s) affected)

    (1 row(s) affected)

    SQL Server Execution Times:

      CPU time = 0 ms,  elapsed time = 27 ms.

    SQL Server parse and compile time:

       CPU time = 0 ms, elapsed time = 1 ms.

    SQL Server Execution Times:

       CPU time = 0 ms,  elapsed time = 1 ms.

           总结:在关联表的关联键上创建合适的索引会影响到SQL优化器对执行计划的选择。数据量的大小以及数据分布情况都会影响执行计划,所以对于SQL优化来说,不能一味套用概念,要实实在在的看到效果才算优化成功。

  • 相关阅读:
    Fedora/centos Linux如何设置网卡自动获取网络地址
    Django virtualenv Apache2 mod_wsgi
    virtualbox在装centos5.6时自协把virtualbox addon 装上了
    Spring中的事务属性介绍以及声明式事务管理
    Fedora Linux如何设置网卡自动获取网络地址
    Linux vi/vim 编辑命令总结
    about reviewboard stack information
    Installing Python 2.7.2 on Centos 5.2
    Linux Deepin 下载
    Can't open file for writing
  • 原文地址:https://www.cnblogs.com/ASPNET2008/p/1750911.html
Copyright © 2011-2022 走看看