SQL Query Optimization
While maintaining a project, I found a page that took 35 seconds to load just 4,000-odd rows. If the data ever grows to 40,000 rows, I figure it would take several minutes. If I were the user, I couldn't stand that, so with some spare time on my hands I decided to optimize it. Here we go.

First, the query:
```sql
SELECT Pub_AidBasicInformation.AidBasicInfoId,
       Pub_AidBasicInformation.UserName,
       Pub_AidBasicInformation.District,
       Pub_AidBasicInformation.Street,
       Pub_AidBasicInformation.Community,
       Pub_AidBasicInformation.DisCard,
       Pub_Application.CreateOn AS AppCreateOn,
       Pub_User.UserName AS DepartmentUserName,
       Pub_Consult1.ConsultId,
       Pub_Consult1.CaseId,
       Clinicaltb.Clinical,
       AidNametb.AidName,
       Pub_Application.IsUseTraining,
       Pub_Application.ApplicationId,
       tab.num
FROM Pub_Consult1
INNER JOIN Pub_Application ON Pub_Consult1.ApplicationId = Pub_Application.ApplicationId
INNER JOIN Pub_AidBasicInformation ON Pub_Application.AidBasicInfoId = Pub_AidBasicInformation.AidBasicInfoId
INNER JOIN (SELECT ConsultId, dbo.f_GetClinical(ConsultId) AS Clinical
            FROM Pub_Consult1) Clinicaltb ON Clinicaltb.ConsultId = Pub_Consult1.ConsultId
LEFT JOIN (SELECT DISTINCT ApplicationId, SUM(TraniningNumber) AS num
           FROM dbo.Review_Aid_UseTraining_Record
           WHERE AidReferralId IS NULL
           GROUP BY ApplicationId) tab ON tab.ApplicationId = Pub_Consult1.ApplicationId
INNER JOIN (SELECT ConsultId, dbo.f_GetAidNamebyConsult1(ConsultId) AS AidName
            FROM Pub_Consult1) AidNametb ON AidNametb.ConsultId = Pub_Consult1.ConsultId
LEFT OUTER JOIN Pub_User ON Pub_Application.ReviewUserId = Pub_User.UserId
WHERE Pub_Consult1.Directory = 0
ORDER BY Pub_Application.CreateOn DESC
```
Here's the proof after running it:
![](http://dl2.iteye.com/upload/attachment/0110/3330/2ed1a68e-05c4-3438-bfe9-5aee178f2352.jpg)
With it that slow, the only option was to look at the execution plan:
![](http://dl2.iteye.com/upload/attachment/0110/3332/c1e240a2-c0f2-369c-a055-8371b040973b.png)
This is a screenshot of the plan generated while the query runs its three functions. At a glance the operators are expensive, and the cost is all concentrated in clustered index scans. Hovering the mouse over each clustered index scan box shows the following details:
![](http://dl2.iteye.com/upload/attachment/0110/3334/76f52962-60aa-3563-98a4-44d3bdbeea40.jpg)
![](http://dl2.iteye.com/upload/attachment/0110/3336/e575f534-eba8-3c86-a562-d653cf49abc9.jpg)
These screenshots show the estimated I/O cost, operator cost, estimated number of rows, the objects being accessed, and the query predicates — all useful evidence for optimization. The 1st and 3rd screenshots show high I/O cost, and the 2nd shows a high estimated row count. Given that, plus the other details, the first thing to try is creating indexes; if that fails, rewrite the query.

Let's first see what the Database Engine Tuning Advisor can offer. It can sometimes provide useful recommendations: creating statistics, indexes, partitions, and so on.

Open SQL Server Profiler and save the query just executed as a trace (.trc) file, then open the Database Engine Tuning Advisor and proceed as shown below:
![Click to view full-size image](http://dl2.iteye.com/upload/attachment/0110/3338/739c51a4-1702-3520-ba6c-0461c5f4b283.jpg)
The recommendation report it finally generated:
![Click to view full-size image](http://dl2.iteye.com/upload/attachment/0110/3340/7f50eaf1-2b07-38d1-9a45-1122011de933.jpg)
Here you can click through the recommendations — partitions, index creation, and so on. Based on the suggestions I created the following indexes:
```sql
CREATE NONCLUSTERED INDEX index1 ON [dbo].[Pub_AidBasicInformation]
(
    [AidBasicInfoId] ASC
)

CREATE NONCLUSTERED INDEX index1 ON [dbo].[Pub_Application]
(
    [ApplicationId] ASC, [ReviewUserId] ASC, [AidBasicInfoId] ASC, [CreateOn] ASC
)

CREATE NONCLUSTERED INDEX index1 ON [dbo].[Pub_Consult1]
(
    [Directory] ASC, [ApplicationId] ASC
)

CREATE NONCLUSTERED INDEX idnex1 ON [dbo].[Review_Aid_UseTraining_Record]
(
    [AidReferralId] ASC, [ApplicationId] ASC
)
```
With the indexes in place, I ran the query again, expecting an improvement. No such luck: still 30-odd seconds, barely any better. The Tuning Advisor misfires sometimes; I'm showing it here because it's one valid approach to the problem, and it's often helpful — just not this time. So I had no choice but to open up the functions and take a close look. Combining that with the detailed plan screenshots above, I dropped the earlier indexes and created these instead:
```sql
CREATE NONCLUSTERED INDEX index1 ON dbo.Report_AdapterAssessment_Aid
(
    AdapterAssessmentId ASC, ProductDirAId ASC
)

CREATE NONCLUSTERED INDEX index1 ON dbo.Report_AdapterAssessment
(
    ConsultId ASC
)
```

Run the query again:
![](http://dl2.iteye.com/upload/attachment/0110/3342/ada1a66c-1c9b-354c-b000-4431bc2005d7.jpg)
Done — only 3.5 seconds now, roughly a 10x speedup. Looks like it worked this time.

Let's check whether the execution plan changed; a picture tells the story:
![Click to view full-size image](http://dl2.iteye.com/upload/attachment/0110/3348/4da0ca15-278b-3007-81ed-b2885589d45a.jpg)
In the plan above the index scans are gone; there are only index seeks, clustered index seeks, and key lookups, and both the operator cost and the I/O cost have dropped sharply. An index scan or clustered index scan is much like a table scan: it reads the table essentially row by row and is slow, whereas index seeks, clustered index seeks, and key lookups are all quite fast. The goal of query tuning is to turn as many of those scans as possible into seeks.

Is that good enough? Thinking it over, 3.5 seconds for 4,000-odd rows is still a bit slow; it should be able to go faster. So I decided to rewrite the query. Looking at it, the only things left to optimize were those three functions.

To see the function's cost in isolation, I first dropped the indexes. Here is what the function f_GetAidNamebyConsult1 has to do, extracted as the subquery in the main query that uses it:
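Beyond eyeballing plan shapes, SQL Server can report per-statement I/O and timing, which makes scan-vs-seek comparisons concrete. A minimal sketch (the query here is just a placeholder standing in for the one under test):

```sql
-- Report logical reads per table and CPU/elapsed time for each
-- statement in this session.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Run the query under test; the Messages tab then shows the numbers,
-- so you can compare the scan version against the seek version after
-- adding an index.
SELECT ConsultId
FROM Pub_Consult1
WHERE Directory = 0;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```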
```sql
SELECT Pub_Consult1.ConsultId, AidName
FROM (SELECT ConsultId, dbo.f_GetAidNamebyConsult1(ConsultId) AS AidName
      FROM Pub_Consult1) AidNametb
INNER JOIN Pub_Consult1 ON AidNametb.ConsultId = Pub_Consult1.ConsultId
```

Which produces the result below:
![](http://dl2.iteye.com/upload/attachment/0110/3354/6a691268-8f32-3409-b891-c9b1f4b3bf96.jpg)
Who knew this little bit of data would take 46 seconds — this function really is the culprit.

I won't paste the function's code here, but note that it nests yet another function inside it: the function was slow to begin with, and on top of that a subquery inside it calls a function too. In fact, when all you need is to query a few columns across several related tables and merge one column's values into a single row, there's no need for a function or stored procedure: a subquery plus FOR XML PATH will do. I replaced the function with the following query:
```sql
WITH cte1 AS
(
    SELECT A.AdapterAssessmentId,
           CASE WHEN B.AidName IS NULL THEN A.AidName ELSE B.AidName END AS AidName
    FROM Report_AdapterAssessment_Aid AS A
    LEFT JOIN Pub_ProductDir AS B ON A.ProductDirAId = B.ProductDirAId
),
cte2 AS
(
    -- group by AdapterAssessmentId and concatenate the AidName values
    SELECT AdapterAssessmentId,
           (SELECT AidName + ',' FROM cte1
            WHERE AdapterAssessmentId = tb.AdapterAssessmentId
            FOR XML PATH('')) AS AidName
    FROM cte1 AS tb
    GROUP BY AdapterAssessmentId
),
cte3 AS
(
    SELECT ConsultId, LEFT(AidName, LEN(AidName) - 1) AS AidName
    FROM
    (
        SELECT Pub_Consult1.ConsultId, cte2.AidName
        FROM Pub_Consult1, Report_AdapterAssessment, cte2
        WHERE Pub_Consult1.ConsultId = Report_AdapterAssessment.ConsultId
          AND Report_AdapterAssessment.AdapterAssessmentId = cte2.AdapterAssessmentId
          AND Report_AdapterAssessment.AssessTuiJian IS NULL
    ) AS tb
)
```

With no indexes at all, this query returns in under a second. Then I rewrote the main query:
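The core of the rewrite is the FOR XML PATH string aggregation in cte2. The same pattern in isolation, as a minimal self-contained sketch using a throwaway table variable:

```sql
-- Minimal demonstration of string aggregation with FOR XML PATH:
-- collapse each group's Name values into one comma-separated string.
DECLARE @t TABLE (GroupId INT, Name NVARCHAR(20));
INSERT INTO @t VALUES (1, N'a'), (1, N'b'), (2, N'c');

SELECT GroupId,
       LEFT(Names, LEN(Names) - 1) AS Names   -- trim the trailing comma
FROM (
    SELECT DISTINCT GroupId,
           (SELECT Name + ',' FROM @t
            WHERE GroupId = tb.GroupId
            FOR XML PATH('')) AS Names          -- 'a,b,' for group 1
    FROM @t AS tb
) AS g;
```

The correlated subquery runs once per group instead of once per row and stays entirely set-based, which is why it beats a scalar function called for every row.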
```sql
SELECT DISTINCT Pub_AidBasicInformation.AidBasicInfoId,
       Pub_AidBasicInformation.UserName,
       Pub_AidBasicInformation.District,
       Pub_AidBasicInformation.Street,
       Pub_AidBasicInformation.Community,
       Pub_AidBasicInformation.DisCard,
       Pub_Application.CreateOn AS AppCreateOn,
       Pub_User.UserName AS DepartmentUserName,
       Pub_Consult1.ConsultId,
       Pub_Consult1.CaseId,
       Clinicaltb.Clinical,
       cte3.AidName,
       Pub_Application.IsUseTraining,
       Pub_Application.ApplicationId,
       tab.num
FROM Pub_Consult1
INNER JOIN Pub_Application ON Pub_Consult1.ApplicationId = Pub_Application.ApplicationId
INNER JOIN Pub_AidBasicInformation ON Pub_Application.AidBasicInfoId = Pub_AidBasicInformation.AidBasicInfoId
INNER JOIN (SELECT ConsultId, dbo.f_GetClinical(ConsultId) AS Clinical
            FROM Pub_Consult1) Clinicaltb ON Clinicaltb.ConsultId = Pub_Consult1.ConsultId
LEFT JOIN (SELECT DISTINCT ApplicationId, SUM(TraniningNumber) AS num
           FROM dbo.Review_Aid_UseTraining_Record
           WHERE AidReferralId IS NULL
           GROUP BY ApplicationId) tab ON tab.ApplicationId = Pub_Consult1.ApplicationId
LEFT JOIN cte3 ON cte3.ConsultId = Pub_Consult1.ConsultId
LEFT OUTER JOIN Pub_User ON Pub_Application.ReviewUserId = Pub_User.UserId
WHERE Pub_Consult1.Directory = 0
ORDER BY Pub_Application.CreateOn DESC
```

That about wraps it up. With no indexes, the rewritten query takes 8 seconds — still 27 seconds faster than the function version without indexes.
![](http://dl2.iteye.com/upload/attachment/0110/3368/2584cb6b-afaf-38c7-8f37-b980cba05487.jpg)
Put the indexes back in and it takes only 1.6 seconds — 1.9 seconds faster than the indexed version that kept the function instead of the subquery + FOR XML PATH rewrite.
![](http://dl2.iteye.com/upload/attachment/0110/3373/65543834-5aa2-3a62-93a0-9f8186014e7a.jpg)
There's one more place in the query that still uses a function, and optimizing that would probably squeeze out a bit more — but time is limited and this post is already long enough, so I'll stop here.

Finally, a summary. Query optimization basically comes down to the following approaches:
1: Add or rebuild indexes. Typically you index foreign keys, join columns, sort columns, and columns used in filters; the Database Engine Tuning Advisor's suggestions can also guide index creation. Sometimes after you create an index, the query still executes as an index scan or clustered index scan rather than an index seek. This usually means your selected columns and WHERE-clause columns are not all covered by the index. The fix is to create more indexes, or to INCLUDE the relevant columns when creating the index, so that the index covers both the selected columns and the WHERE-clause columns.
2: Rewrite the query — but first make sure you understand the original query and the business logic behind it.

3: Table partitioning, worth considering for large data volumes.

4: Better server hardware.
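Expanding on point 1, here is what a covering index looks like in practice. A sketch only — the column choices are illustrative, modeled on the tables used above:

```sql
-- Key columns: used for seeking (WHERE / JOIN / ORDER BY predicates).
-- INCLUDE columns: stored in the index leaf pages so the query is
-- fully covered and no key lookup back to the clustered index is
-- needed.
CREATE NONCLUSTERED INDEX IX_Pub_Consult1_Directory
ON dbo.Pub_Consult1 (Directory ASC, ApplicationId ASC)
INCLUDE (ConsultId, CaseId);
```

With this shape, a query filtering on Directory and joining on ApplicationId that selects only ConsultId and CaseId can be satisfied by an index seek alone.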