1. 问题描述:
有一张表叫domains_report, 有 渠道号, 日期,登陆数,注册数, 交易笔数, 交易金额, 退款笔数,退款金额 这几个字段, 字段是分散在4张表中(每个表都有日期字段. 可以限定统计区间). 需求是 将统计的数据查询出来, 然后插入到domains_report表里面去.
2. 业务思路:
1.先在数据库编写生成统计数据的查询sql. 2.将写好的查询sql放到java程序里面 3.查询sql , 得到list(近30天的数据) 结果. 先delete表domains_report表里近30天的数据, 遍历list, 把数据 insert到domains_report表里去. 4.设置java程序,每天凌晨启动.
补充:
- 对于第3点, 为什么要先删除domains_report近30天的数据, 再插入domains_report近30天的数据?
因为, 不删除直接插入的话, 会报主键唯一错误. 因为日期report_date是唯一键.
并且, delete和 insert 要放在一个事物里面. 要么都一起成功,要么一起失败.
- 在把查询sql放到数据库里面去的时候, 最好把一些参数, 都写在配置文件里面 . 以及定时任务设置运行的时间, 也放在配置文件里面
3. 在统计查询的sql过程中, 很快就发现2大问题:
1. 发现4个表一起查询, 会有重复数据 2. 从8月1号, 到8月30号, 如果没有数据, 那天什么都不显示. 如这位网友遇到的问题一样: https://blog.csdn.net/jie11447416/article/details/50887888
为了解决这2个问题, 参考网友的做法,可以解决. 建立一张日历表, 查询的时候, 关联日历表即可. 但是网友是mysql数据库, 我是orcal, 于是函数和方法找了替代.
a . 建立一张日期表 calendar
create table CALENDAR ( datelist VARCHAR2(1024) not null ) ; alter table CALENDAR add constraint PK_CALENDAR_ITEM primary key (DATELIST);
b. 生成日期表的数据, 把数据导入到 calendar日期表里面去
SELECT A.DATAS FROM (SELECT TO_CHAR(TO_DATE('20190101', 'yyyyMMdd') + ROWNUM - 1, 'yyyyMMdd') AS DATAS FROM DUAL CONNECT BY ROWNUM <= TRUNC(TO_DATE('20500902', 'yyyyMMdd') - TO_DATE('20190101', 'yyyyMMdd')) + 1) A
c. 在查询的语句中, 要关联 calendar日期表,就解决这个问题啦. 下面这个是查询数据的完整sql
select a1.domains_id, a1.report_date, a3.login_count, a2.regin_count, a1.trans_count, a1.trans_sum, a4.refund_count, a4.refund_sum from (select t2.report_date, nvl(t1.trans_count, 0) trans_count, nvl(t1.trans_sum, 0) trans_sum, nvl(t1.domains_id, '04103000000001000') domains_id from (select to_char(to_date(t.trans_time, 'yyyy-MM-dd hh24:mi:ss'), 'yyyyMMdd') trans_time, count(t.trans_nbr) trans_count, sum(t.trans_amount) trans_sum, t.domains_id from trans_item t where t.domains_id = '04103000000001000' and t.stats in (8, 9) group by to_char(to_date(t.trans_time, 'yyyy-MM-dd hh24:mi:ss'), 'yyyyMMdd'), domains_id) t1 right join (SELECT datelist as report_date, null, null, null FROM calendar t where SYSDATE - 30 <= trunc(to_date(t.datelist, 'yyyyMMdd')) and trunc(to_date(t.datelist, 'yyyyMMdd')) <= sysdate) t2 on t1.trans_time = t2.report_date) a1, (select t2.dday, nvl(t1.regin_count, 0) regin_count from (select count(1) regin_count, to_char(to_date(t.create_time, 'yyyy-MM-dd hh24:mi:ss'), 'yyyyMMdd') create_time from domains_users t where t.domains_id = '04103000000001000' group by to_char(to_date(t.create_time, 'yyyy-MM-dd hh24:mi:ss'), 'yyyyMMdd')) t1 right join (SELECT datelist as dday, null, null FROM calendar t where SYSDATE - 30 <= trunc(to_date(t.datelist, 'yyyyMMdd')) and trunc(to_date(t.datelist, 'yyyyMMdd')) <= sysdate) t2 on t1.create_time = t2.dday) a2, (select t2.dday, nvl(t1.login_count, 0) login_count from (select count(1) login_count, to_char(to_date(t.login_time, 'yyyy-MM-dd hh24:mi:ss'), 'yyyyMMdd') login_time from login_log t where t.domains_id = '04103000000001000' group by to_char(to_date(t.login_time, 'yyyy-MM-dd hh24:mi:ss'), 'yyyyMMdd')) t1 right join (SELECT datelist as dday, null FROM calendar t where SYSDATE - 30 <= trunc(to_date(t.datelist, 'yyyyMMdd')) and trunc(to_date(t.datelist, 'yyyyMMdd')) <= sysdate) t2 on t1.login_time = t2.dday) a3, (select t2.dday, nvl(t1.refund_count, 0) refund_count, nvl(t1.refund_sum, 0) refund_sum from (select nvl(count(1), '0') refund_count, sum(ti.refund_amount) refund_sum, to_char(to_date(ti.refund_time, 'yyyy-MM-dd hh24:mi:ss'), 'yyyyMMdd') refund_time from refund_item ti, trans_item tt where ti.trans_nbr = tt.trans_nbr and tt.domains_id = '04103000000001000' group by to_char(to_date(ti.refund_time, 'yyyy-MM-dd hh24:mi:ss'), 'yyyyMMdd')) t1 right join (SELECT datelist as dday, null FROM calendar t where SYSDATE - 30 <= trunc(to_date(t.datelist, 'yyyyMMdd')) and trunc(to_date(t.datelist, 'yyyyMMdd')) <= sysdate) t2 on t1.refund_time = t2.dday) a4 where a1.report_date = a2.dday and a2.dday = a3.dday and a3.dday = a4.dday
运行出来的效果图如下:
补充:
删除近30天统计数据的sql , 这个时间 30代表30天, 也要放在配置文件里面.
delete from domains_report t where t.report_date in ( SELECT * FROM calendar t where SYSDATE - 30 <= trunc(to_date(t.datelist, 'yyyyMMdd')) and trunc(to_date(t.datelist, 'yyyyMMdd')) <= sysdate)
.
~~~~~~~~~~~~~~~~~静下心来想想, 在写这个定时程序中, 细节怎么实现. 技术怎么实现. 多试试. 一定可以的. 加油!~~~~~~~~~~~~~~~~~