朋友提出的问题是这样的:
以下表记录,如保根据比例随机查询单条?
id showPercent(展示比例/机率)
---- -------
1 20 (表示20%)
2 30
3 50
呵呵,思索+尝试了一番,应该是可以解决这个问题的
create table #tmp
(
weight int
);
insert into #tmp (weight) select 10;
insert into #tmp (weight) select 20;
insert into #tmp (weight) select 30;
insert into #tmp (weight) select 40;
create table #tmp2
(
row_no int identity(1,1),
x int,
cnt int default 0
);
DECLARE @i int
SET @i=0
WHILE @i<10000
BEGIN
SET @i = @i+1;
with CTE as
(
SELECT t.RowId ,t.weight,t.weight AS cnt FROM
(
SELECT ROW_NUMBER() OVER(ORDER BY weight) AS RowId,* FROM #tmp
) AS t WHERE RowId =1
union ALL
SELECT t2.RowId,t2.weight,t2.weight+c.cnt AS cnt FROM
(
SELECT ROW_NUMBER() OVER(ORDER BY weight) AS RowId,* FROM #tmp
) AS t2 inner join CTE c on c.RowId+1 = t2.RowId
)
--SELECT max(cnt)+1 from CTE
INSERT INTO #tmp2
select top 1 weight,cnt
from cte
where cnt > cast(RAND()*100 as bigint)%(SELECT max(cnt) from CTE)
order by cnt;
END
drop table #tmp
SELECT COUNT(1)AS 数量,cnt FROM #tmp2 GROUP BY cnt
drop table #tmp2
-----解决方案的思路是这样的:
比如有一个 概率列表 分别为 10%,20%,30%,40%
那么,根据他们计算的概率权重,应该是 10,30,60,100
也就是说,我获得一个100以内的随机数,数值区间分布表如下
0-10 对应概率10%
10-30 对应概率 20%
30-60 对应概率 30%
60-100 对应概率 40%
所以我们只要获得一个随机数,% 100求模,获得随机数的区间,即可根据该数值获得指定概率的记录
再延伸一下,即使概率为和不为100,依然可以根据该公式计算对应的概率,
也就是说,概率列表可以是 10 10 10 10 30 加起来为 70,第一个就是 1/7,最后一个为 3/7
依次类推,可解决这个问题
上面那个公式包含了整个完整算法
实例代码,包含了一个CTE递归合并写法,不难理解
cast(RAND()*100 as bigint) 表示生成一个100以内的数值,如果和大于一百,可以把这个放大到 10000 或者更高,其实这个就是随机值的获取咯
运行1w次结果
出现次数 权重
1010 10
1989 30
2979 60
4022 100