  • MySQL: row numbers, per-group row numbers, and top-N records per group in SQL

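    select * from (
                      SELECT (@rowNum := @rowNum + 1)                      as rowNum       -- overall row number
                           , a.col1
                           -- Per-group row number: reset to 1 when col1 differs from the previous row's col1,
                           -- otherwise increment.
                           , case
                                 when @groupItem != a.col1 then @groupRowNum := 1
                                 else @groupRowNum := @groupRowNum + 1 end as groupRowNum
                           -- Remember the current group key so the next row can be compared against it.
                           -- The logic relies on this being evaluated after groupRowNum.
                           , case
                                 when @groupItem != a.col1 then @groupItem := a.col1
                                 else round(@groupItem, 0) end             as groupItem
                           , col2
                           , num
                      FROM (
                               SELECT  col1 ,  col2, COUNT(*) as num
                               FROM tb_test
                               GROUP BY col1, col2
                           ) a
                               inner join (select @rowNum := 0 as rowNum) t1   -- initialize the overall row number
                               inner join (select @groupRowNum := 0) t2        -- initialize the per-group row number
                               inner join (select @groupItem := -1) t3         -- initialize the group key; -1 must not occur in col1
                      where 1 = 1
                      order by a.col1, num desc  -- sort order that defines the per-group row number
                        limit 100000  -- ORDER BY alone in a subquery may be ignored, so add a LIMIT (at least the number of grouped rows)
                  ) x
    where 1=1
    and groupRowNum <=3  -- top-N filter per group (here N = 3)
    ;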
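
    On MySQL 8.0 or later, window functions can express the same numbering without user variables. This matters because the MySQL manual documents the evaluation order of user-variable expressions within a statement as undefined. The following is a minimal sketch, not the original author's method: the table definition and sample rows are hypothetical (the post does not show the schema of tb_test), and col2 is added to each ORDER BY only as a tie-breaker so that the numbering is deterministic.

    ```sql
    -- Hypothetical sample table; run in a scratch schema.
    CREATE TABLE tb_test (
        id   INT AUTO_INCREMENT PRIMARY KEY,
        col1 INT NOT NULL,
        col2 VARCHAR(20) NOT NULL
    );

    INSERT INTO tb_test (col1, col2) VALUES
        (1, 'a'), (1, 'a'), (1, 'a'),
        (1, 'b'), (1, 'b'),
        (1, 'c'),
        (1, 'd'),
        (2, 'x'), (2, 'x'),
        (2, 'y');

    -- Requires MySQL 8.0+ (window functions).
    SELECT rowNum, col1, groupRowNum, col2, num
    FROM (
        SELECT ROW_NUMBER() OVER (ORDER BY a.col1, a.num DESC, a.col2) AS rowNum,        -- overall row number
               a.col1,
               ROW_NUMBER() OVER (PARTITION BY a.col1
                                  ORDER BY a.num DESC, a.col2)         AS groupRowNum,   -- restarts at 1 for each col1
               a.col2,
               a.num
        FROM (
            SELECT col1, col2, COUNT(*) AS num
            FROM tb_test
            GROUP BY col1, col2
        ) a
    ) x
    WHERE groupRowNum <= 3      -- top 3 per col1
    ORDER BY col1, groupRowNum;
    ```

    With the sample rows above, the grouped counts are (1, a, 3), (1, b, 2), (1, c, 1), (1, d, 1), (2, x, 2), (2, y, 1), so the filter keeps a, b, c for col1 = 1 and x, y for col1 = 2. As in the user-variable version, rowNum is assigned before the top-N filter, so it can have gaps in the final result.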



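    Note:

    On MySQL 5.7 and later, it is best to avoid ORDER BY inside a subquery.

    Official explanation, from section 8.2.2.1 of the MySQL 5.7 manual:

    Subqueries are optimized using semi-join strategies ("The optimizer uses semi-join strategies to improve subquery execution").

    For a subquery to be optimized as a semi-join, it must satisfy certain criteria ("In MySQL, a subquery must satisfy these criteria to be handled as a semi-join").

    One of those criteria is that it must not combine ORDER BY with LIMIT ("It must not have ORDER BY with LIMIT.").

    1. If a subquery has both ORDER BY and LIMIT, the ORDER BY is not ignored.
       Queries written this way are very slow (I do not know exactly why), so it is better to put the ORDER BY in the parent query.
    2. If a subquery has only ORDER BY, the ORDER BY is ignored.

    For these reasons, the approach above is only suitable for ad hoc offline data analysis.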
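
    As a small illustration of the advice to keep ORDER BY in the parent query: SQL only guarantees row order for the outermost ORDER BY, so a sketch like the following sorts the grouped counts reliably. It assumes the same tb_test columns (col1, col2) used in the query above.

    ```sql
    -- The derived table only aggregates; sorting happens in the outermost query.
    SELECT a.col1, a.col2, a.num
    FROM (
        SELECT col1, col2, COUNT(*) AS num
        FROM tb_test
        GROUP BY col1, col2
    ) a
    ORDER BY a.col1, a.num DESC;
    ```

    This does not help the user-variable numbering, which depends on rows arriving already sorted inside the subquery. It applies only when the sort is needed for the final output.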

    Top 1 

    select b.col1, max(b.col2) as col2 , num
    from (
             SELECT  col1, col2, COUNT(1) as num
             FROM tb_test
             GROUP BY col1, col2
         ) b
    where 1=1
      and not exists(
                  select 1
                  from (
                                    SELECT col1, col2, COUNT(1) as num
                                    FROM tb_test
                                    GROUP BY col1, col2
                       ) c
                  where 1 = 1
                    and b.col1 = c.col1
                    and b.num < c.num
              )
    group by b.col1, num
    order by col1
    ;
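
    The query above repeats the same aggregate subquery twice, and the two copies have to be kept in sync. On MySQL 8.0 or later a common table expression can name it once. The sketch below is a rewrite under that assumption (the CTE name grouped is made up for illustration) and is intended to return the same rows.

    ```sql
    -- Requires MySQL 8.0+ (common table expressions). "grouped" is a hypothetical name.
    WITH grouped AS (
        SELECT col1, col2, COUNT(*) AS num
        FROM tb_test
        GROUP BY col1, col2
    )
    SELECT b.col1, MAX(b.col2) AS col2, b.num
    FROM grouped b
    WHERE NOT EXISTS (
              -- no other (col1, col2) pair in the same col1 group has a larger count
              SELECT 1
              FROM grouped c
              WHERE c.col1 = b.col1
                AND c.num > b.num
          )
    GROUP BY b.col1, b.num
    ORDER BY b.col1;
    ```

    When several col2 values tie for the largest count within a col1 group, MAX(b.col2) collapses them into one row, as in the original query.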

    Top N by num

    Note: to get the Top N, one more GROUP BY is needed.

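    select
           a.col1, a.col2, a.num
         , count(b.col1) as higher_cnt   -- number of rows in the same col1 group with a larger num
    from (
             SELECT  col1,  col2, COUNT(1) as num
             FROM tb_test
             where 1=1
             GROUP BY col1, col2
         ) a
        left join (
             SELECT col1,  col2, COUNT(1) as num
             FROM tb_test
             where 1=1
             GROUP BY col1, col2
        ) b
        on a.col1 = b.col1   -- compare rows within the same col1 group
        and a.num < b.num
        where 1=1
        group by a.col1, a.col2, a.num
        having count(b.col1) < 2   -- keep rows with fewer than N larger counts (here N = 2)
        order by  a.col1, a.num desc
    ;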
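
    Counting how many rows in the same group have a strictly larger num is what RANK() computes, minus one, so on MySQL 8.0 or later the self-join can be sketched with a window function instead. This is an alternative formulation, not the author's query, and the alias rnk is illustrative.

    ```sql
    -- Requires MySQL 8.0+ (window functions).
    SELECT col1, col2, num
    FROM (
        SELECT a.col1, a.col2, a.num,
               -- RANK() = 1 + number of rows in the partition with a larger num
               RANK() OVER (PARTITION BY a.col1 ORDER BY a.num DESC) AS rnk
        FROM (
            SELECT col1, col2, COUNT(*) AS num
            FROM tb_test
            GROUP BY col1, col2
        ) a
    ) r
    WHERE rnk <= 2          -- top 2 by rank within each col1
    ORDER BY col1, num DESC;
    ```

    Because RANK() gives tied rows the same rank, more than two rows per group can come back when counts tie, just as with the join-and-count approach. ROW_NUMBER() in place of RANK() would return at most N rows per group.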
  • Original article: https://www.cnblogs.com/wuyifu/p/14848274.html