zoukankan      html  css  js  c++  java
  • 大数据量分页存储过程效率测试附代码

    在项目中,我们经常遇到或用到分页,那么在大数据量(百万级以上)下,哪种分页算法效率最优呢?我们不妨用事实说话。 
      测试环境 
      硬件:CPU 酷睿双核T5750 内存:2G 
      软件:Windows server 2003 + sql server 2005 
      OK,我们首先创建一数据库:data_Test,并在此数据库中创建一表:tb_TestTable 
      

    create database data_Test --创建数据库
    data_Test 
    GO
    use data_Test
    GO
    create table tb_TestTable --创建表
    (id int identity(1,1) primary key,
     userName nvarchar(20) not null,
     userPWD nvarchar(20) not null,
     userEmail nvarchar(40) null)
    GO

      然后我们在数据表中插入2000000条数据: 
     

     --插入数据 
     set identity_insert tb_TestTable on 
     declare @count int 
     set @count=1 
     while @count<=2000000 
     begin 
     insert into tb_TestTable(id,userName,userPWD,userEmail) values(@count,'admin','admin888','lli0077@yahoo.com.cn') 
     set @count=@count+1 
     end 
     set identity_insert tb_TestTable off 

      我首先写了五个常用存储过程: 
      1,利用select top 和select not in进行分页,具体代码如下: 
     

    create procedure proc_paged_with_notin --利用select top and select not in 
    ( 
     @pageIndex int, --页索引 
     @pageSize int --每页记录数 
    ) 
    as 
    begin 
    set nocount on; 
    declare @timediff datetime --耗时 
    declare @sql nvarchar(500) 
    select @timediff=Getdate() 
    set @sql='select top '+str(@pageSize)+' * from tb_TestTable where(ID not in(select top '+str(@pageSize*@pageIndex)+' id from tb_TestTable order by ID ASC)) order by ID' 
    execute(@sql) --因select top后不支技直接接参数,所以写成了字符串@sql 
    select datediff(ms,@timediff,GetDate()) as 耗时 
    set nocount off; 
    end 

      2,利用select top 和 select max(列键) 

    create procedure proc_paged_with_selectMax --利用select top and select max(列) 
    ( 
     @pageIndex int, --页索引 
     @pageSize int --页记录数 
    ) 
    as 
    begin 
    set nocount on; 
     declare @timediff datetime 
     declare @sql nvarchar(500) 
     select @timediff=Getdate() 
     set @sql='select top '+str(@pageSize)+' * From tb_TestTable where(ID>(select max(id) From (select top '+str(@pageSize*@pageIndex)+' id From tb_TestTable order by ID) as TempTable)) order by ID' 
     execute(@sql) 
     select datediff(ms,@timediff,GetDate()) as 耗时 
     set nocount off; 
    end

     
      3,利用select top和中间变量--此方法因网上有人说效果最佳,所以贴出来一同测试 
     

    create procedure proc_paged_with_Midvar --利用ID>最大ID值和中间变量 
    ( 
     @pageIndex int, 
     @pageSize int 
    ) 
    as 
     declare @count int 
     declare @ID int 
     declare @timediff datetime 
     declare @sql nvarchar(500) 
    begin 
    set nocount on; 
     select @count=0,@ID=0,@timediff=getdate() 
     select @count=@count+1,@ID=case when @count<=@pageSize*@pageIndex then ID else @ID end from tb_testTable order by id 
     set @sql='select top '+str(@pageSize)+' * from tb_testTable where ID>'+str(@ID) 
     execute(@sql) 
     select datediff(ms,@timediff,getdate()) as 耗时 
    set nocount off; 
    end  

      4,利用Row_number() 此方法为sql server 2005中新的方法,利用Row_number()给数据行加上索引 

    create procedure proc_paged_with_Rownumber --利用SQL 2005中的Row_number() 
    ( 
     @pageIndex int, 
     @pageSize int 
    ) 
    as 
     declare @timediff datetime 
    begin 
    set nocount on; 
     select @timediff=getdate() 
     select * from (select *,Row_number() over(order by ID asc) as IDRank from tb_testTable) as IDWithRowNumber where IDRank>@pageSize*@pageIndex and IDRank<@pageSize*(@pageIndex+1) 
     select datediff(ms,@timediff,getdate()) as 耗时 
    set nocount off; 
    end  


      5,利用临时表及Row_number 

    create procedure proc_CTE --利用临时表及Row_number 
    ( 
     @pageIndex int, --页索引 
     @pageSize int --页记录数 
    ) 
    as 
     set nocount on; 
     declare @ctestr nvarchar(400) 
     declare @strSql nvarchar(400) 
     declare @datediff datetime 
    begin 
     select @datediff=GetDate() 
     set @ctestr='with Table_CTE as 
      (select ceiling((Row_number() over(order by ID ASC))/'+str(@pageSize)+') as page_num,* from tb_TestTable)'; 
       set @strSql=@ctestr+' select * From Table_CTE where page_num='+str(@pageIndex) 
      end 
    begin 
      execute sp_executesql @strSql 
      select datediff(ms,@datediff,GetDate()) 
      set nocount off; 
    end  


      OK,至此,存储过程创建完毕,我们分别在每页10条数据的情况下在第2页,第1000页,第10000页,第100000页,第199999页进行测试,耗时单位:ms 每页测试5次取其平均值 
      存过 第2页耗时 第1000页耗时 第10000页耗时 第100000页耗时 第199999页耗时 效率排行 
      1用not in 0ms 16ms 47ms 475ms 953ms 3 
      2用select max 5ms 16ms 35ms 325ms 623ms 1 
      3中间变量 966ms 970ms 960ms 945ms 933ms 5 
      4row_number 0ms 0ms 34ms 365ms 710ms 2 
      4临时表 780ms 796ms 798ms 780ms 805ms 4


      测试结果显示:select max >row_number>not in>临时表>中间变量 
      于是我对效率最高的select max方法用2分法进行了扩展,代码取自互联网,我修改了ASC排序时取不到值的BUG,测试结果: 
      2分法 156ms 156ms 180ms 470ms 156ms 1* 
      从测试结果来看,使用2分法确实可以提高效率并使效率更为稳定,我又增加了第159999页的测试,用时仅296ms,效果相当的不错! 
      下面是2分法使用select max的代码,已相当完善。 

    --/*-----存储过程 分页处理 孙伟 2005-03-28创建 -------*/ 
    --/*-----存储过程 分页处理 浪尘 2008-9-1修改----------*/ 
    --/*----- 对数据进行了2分处理使查询前半部分数据与查询后半部分数据性能相同 -------*/ 
     
    alter PROCEDURE proc_paged_2part_selectMax 
    ( 
    @tblName nvarchar(200), ----要显示的表或多个表的连接 
    @fldName nvarchar(500) = '*', ----要显示的字段列表 
    @pageSize int = 10, ----每页显示的记录个数 
    @page int = 1, ----要显示那一页的记录 
    @fldSort nvarchar(200) = null, ----排序字段列表或条件 
    @Sort bit = 0, ----排序方法,0为升序,1为降序(如果是多字段排列Sort指代最后一个排序字段的排列顺序(最后一个排序字段不加排序标记)--程序传参如:' SortA Asc,SortB Desc,SortC ') 
    @strCondition nvarchar(1000) = null, ----查询条件,不需where 
    @ID nvarchar(150), ----主表的主键 
    @Dist bit = 0, ----是否添加查询字段的 DISTINCT 默认0不添加/1添加 
    @pageCount int = 1 output, ----查询结果分页后的总页数 
    @Counts int = 1 output ----查询到的记录数 
    ) 
    AS 
    SET NOCOUNT ON 
    Declare @sqlTmp nvarchar(1000) ----存放动态生成的SQL语句 
    Declare @strTmp nvarchar(1000) ----存放取得查询结果总数的查询语句 
    Declare @strID nvarchar(1000) ----存放取得查询开头或结尾ID的查询语句 
     
    Declare @strSortType nvarchar(10) ----数据排序规则A 
    Declare @strFSortType nvarchar(10) ----数据排序规则B 
     
    Declare @SqlSelect nvarchar(50) ----对含有DISTINCT的查询进行SQL构造 
    Declare @SqlCounts nvarchar(50) ----对含有DISTINCT的总数查询进行SQL构造 
     
    declare @timediff datetime --耗时测试时间差 
    select @timediff=getdate() 
     
    if @Dist = 0 
    begin 
     set @SqlSelect = 'select ' 
     set @SqlCounts = 'Count(*)' 
    end 
    else 
    begin 
     set @SqlSelect = 'select distinct ' 
     set @SqlCounts = 'Count(DISTINCT '+@ID+')' 
    end 
     
     
    if @Sort=0 
    begin 
     set @strFSortType=' ASC ' 
     set @strSortType=' DESC ' 
    end 
    else 
    begin 
     set @strFSortType=' DESC ' 
     set @strSortType=' ASC ' 
    end 
     
     
     
    --------生成查询语句-------- 
    --此处@strTmp为取得查询结果数量的语句 
    if @strCondition is null or @strCondition='' --没有设置显示条件 
    begin 
     set @sqlTmp = @fldName + ' From ' + @tblName 
     set @strTmp = @SqlSelect+' @Counts='+@SqlCounts+' FROM '+@tblName 
     set @strID = ' From ' + @tblName 
    end 
    else 
    begin 
     set @sqlTmp = + @fldName + 'From ' + @tblName + ' where (1>0) ' + @strCondition 
     set @strTmp = @SqlSelect+' @Counts='+@SqlCounts+' FROM '+@tblName + ' where (1>0) ' + @strCondition 
     set @strID = ' From ' + @tblName + ' where (1>0) ' + @strCondition 
    end 
     
    ----取得查询结果总数量----- 
    exec sp_executesql @strTmp,N'@Counts int out ',@Counts out 
    declare @tmpCounts int 
    if @Counts = 0 
     set @tmpCounts = 1 
    else 
     set @tmpCounts = @Counts 
     
     --取得分页总数 
     set @pageCount=(@tmpCounts+@pageSize-1)/@pageSize 
     
     /**//**//**//**当前页大于总页数 取最后一页**/ 
     if @page>@pageCount 
     set @page=@pageCount 
     
     --/*-----数据分页2分处理-------*/ 
     declare @pageIndex int --总数/页大小 
     declare @lastcount int --总数%页大小 
     
     set @pageIndex = @tmpCounts/@pageSize 
     set @lastcount = @tmpCounts%@pageSize 
     if @lastcount > 0 
     set @pageIndex = @pageIndex + 1 
     else 
     set @lastcount = @pagesize 
     
     --//***显示分页 
     if @strCondition is null or @strCondition='' --没有设置显示条件 
     begin 
     if @pageIndex<2 or @page<=@pageIndex / 2 + @pageIndex % 2 --前半部分数据处理 
     begin 
     if @page=1 
     set @strTmp=@SqlSelect+' top '+ CAST(@pageSize as VARCHAR(4))+' '+ @fldName+' from '+@tblName 
     +' order by '+ @fldSort +' '+ @strFSortType 
     else 
     begin 
     if @Sort=1 
     begin 
     set @strTmp=@SqlSelect+' top '+ CAST(@pageSize as VARCHAR(4))+' '+ @fldName+' from '+@tblName 
     +' where '+@ID+' <(select min('+ @ID +') from ('+ @SqlSelect+' top '+ CAST(@pageSize*(@page-1) as Varchar(20)) +' '+ @ID +' from '+@tblName 
     +' order by '+ @fldSort +' '+ @strFSortType+') AS TBMinID)' 
     +' order by '+ @fldSort +' '+ @strFSortType 
     end 
     else 
     begin 
     set @strTmp=@SqlSelect+' top '+ CAST(@pageSize as VARCHAR(4))+' '+ @fldName+' from '+@tblName 
     +' where '+@ID+' >(select max('+ @ID +') from ('+ @SqlSelect+' top '+ CAST(@pageSize*(@page-1) as Varchar(20)) +' '+ @ID +' from '+@tblName 
     +' order by '+ @fldSort +' '+ @strFSortType+') AS TBMinID)' 
     +' order by '+ @fldSort +' '+ @strFSortType 
     end 
     end 
     end 
     else 
     begin 
     set @page = @pageIndex-@page+1 --后半部分数据处理 
     if @page <= 1 --最后一页数据显示 
     set @strTmp=@SqlSelect+' * from ('+@SqlSelect+' top '+ CAST(@lastcount as VARCHAR(4))+' '+ @fldName+' from '+@tblName 
     +' order by '+ @fldSort +' '+ @strSortType+') AS TempTB'+' order by '+ @fldSort +' '+ @strFSortType 
     else 
     if @Sort=1 
     begin 
     set @strTmp=@SqlSelect+' * from ('+@SqlSelect+' top '+ CAST(@pageSize as VARCHAR(4))+' '+ @fldName+' from '+@tblName 
     +' where '+@ID+' >(select max('+ @ID +') from('+ @SqlSelect+' top '+ CAST(@pageSize*(@page-2)+@lastcount as Varchar(20)) +' '+ @ID +' from '+@tblName 
     +' order by '+ @fldSort +' '+ @strSortType+') AS TBMaxID)' 
     +' order by '+ @fldSort +' '+ @strSortType+') AS TempTB'+' order by '+ @fldSort +' '+ @strFSortType 
     end 
     else 
     begin 
     set @strTmp=@SqlSelect+' * from ('+@SqlSelect+' top '+ CAST(@pageSize as VARCHAR(4))+' '+ @fldName+' from '+@tblName 
     +' where '+@ID+' <(select min('+ @ID +') from('+ @SqlSelect+' top '+ CAST(@pageSize*(@page-2)+@lastcount as Varchar(20)) +' '+ @ID +' from '+@tblName 
     +' order by '+ @fldSort +' '+ @strSortType+') AS TBMaxID)' 
     +' order by '+ @fldSort +' '+ @strSortType+') AS TempTB'+' order by '+ @fldSort +' '+ @strFSortType 
     end 
     end 
     end 
     
     else --有查询条件 
     begin 
     if @pageIndex<2 or @page<=@pageIndex / 2 + @pageIndex % 2 --前半部分数据处理 
     begin 
     if @page=1 
     set @strTmp=@SqlSelect+' top '+ CAST(@pageSize as VARCHAR(4))+' '+ @fldName+' from '+@tblName 
     +' where 1=1 ' + @strCondition + ' order by '+ @fldSort +' '+ @strFSortType 
     else if(@Sort=1) 
     begin 
     set @strTmp=@SqlSelect+' top '+ CAST(@pageSize as VARCHAR(4))+' '+ @fldName+' from '+@tblName 
     +' where '+@ID+' <(select min('+ @ID +') from ('+ @SqlSelect+' top '+ CAST(@pageSize*(@page-1) as Varchar(20)) +' '+ @ID +' from '+@tblName 
     +' where (1=1) ' + @strCondition +' order by '+ @fldSort +' '+ @strFSortType+') AS TBMinID)' 
     +' '+ @strCondition +' order by '+ @fldSort +' '+ @strFSortType 
     end 
     else 
     begin 
     set @strTmp=@SqlSelect+' top '+ CAST(@pageSize as VARCHAR(4))+' '+ @fldName+' from '+@tblName 
     +' where '+@ID+' >(select max('+ @ID +') from ('+ @SqlSelect+' top '+ CAST(@pageSize*(@page-1) as Varchar(20)) +' '+ @ID +' from '+@tblName 
     +' where (1=1) ' + @strCondition +' order by '+ @fldSort +' '+ @strFSortType+') AS TBMinID)' 
     +' '+ @strCondition +' order by '+ @fldSort +' '+ @strFSortType 
     end 
     end 
     else 
     begin 
     set @page = @pageIndex-@page+1 --后半部分数据处理 
     if @page <= 1 --最后一页数据显示 
     set @strTmp=@SqlSelect+' * from ('+@SqlSelect+' top '+ CAST(@lastcount as VARCHAR(4))+' '+ @fldName+' from '+@tblName 
     +' where (1=1) '+ @strCondition +' order by '+ @fldSort +' '+ @strSortType+') AS TempTB'+' order by '+ @fldSort +' '+ @strFSortType 
     else if(@Sort=1) 
     set @strTmp=@SqlSelect+' * from ('+@SqlSelect+' top '+ CAST(@pageSize as VARCHAR(4))+' '+ @fldName+' from '+@tblName 
     +' where '+@ID+' >(select max('+ @ID +') from('+ @SqlSelect+' top '+ CAST(@pageSize*(@page-2)+@lastcount as Varchar(20)) +' '+ @ID +' from '+@tblName 
     +' where (1=1) '+ @strCondition +' order by '+ @fldSort +' '+ @strSortType+') AS TBMaxID)' 
     +' '+ @strCondition+' order by '+ @fldSort +' '+ @strSortType+') AS TempTB'+' order by '+ @fldSort +' '+ @strFSortType 
     else 
     set @strTmp=@SqlSelect+' * from ('+@SqlSelect+' top '+ CAST(@pageSize as VARCHAR(4))+' '+ @fldName+' from '+@tblName 
     +' where '+@ID+' <(select min('+ @ID +') from('+ @SqlSelect+' top '+ CAST(@pageSize*(@page-2)+@lastcount as Varchar(20)) +' '+ @ID +' from '+@tblName 
     +' where (1=1) '+ @strCondition +' order by '+ @fldSort +' '+ @strSortType+') AS TBMaxID)' 
     +' '+ @strCondition+' order by '+ @fldSort +' '+ @strSortType+') AS TempTB'+' order by '+ @fldSort +' '+ @strFSortType 
     end 
     end 
     
    ------返回查询结果----- 
    exec sp_executesql @strTmp 
    select datediff(ms,@timediff,getdate()) as 耗时 
    --print @strTmp 
    SET NOCOUNT OFF 
    GO 

      执行示例:exec proc_paged_2part_selectMax 'tb_testTable','ID,userName,userPWD,userEmail',10,100000,'ID',0,null,'ID',0 
      这种测试只在单机进行,并且没有在实际开发WEB项目中分页测试,测试项也比较单一,所以不够全面系统,但从其效率相比上,我们可以在数据库分页算法上进行有效的控制。

  • 相关阅读:
    EBS SQL > Form & Report
    oracle sql 优化分析点
    MRP 物料需求计划
    MRPII 制造资源计划
    Barcode128 应用实务
    Oracle SQL语句优化技术分析
    APPSQLAP10710 Online accounting could not be created. AP Invoice 无法创建会计分录
    Oracle数据完整性和锁机制
    ORACLE Responsibility Menu Reference to Other User
    EBS 常用 SQL
  • 原文地址:https://www.cnblogs.com/computer/p/1780722.html
Copyright © 2011-2022 走看看