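-- Log inspection shows that URL checks against ordertest.com frequently return <Response [403]> (Forbidden); the database was then queried for more detail:
Daily statistics
Daily row counts for ordertest.com in the temporary error table test_error_temp: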
SELECT COUNT(1),FROM_UNIXTIME(create_time,'%Y%m%d') AS d FROM test_error_temp WHERE url LIKE '%ordertest.com%' GROUP BY d ORDER BY d DESC ;
COUNT(1) d
897 20171219
2686 20171218
2871 20171217
964 20171216
654 20171215
836 20171214
32 20171213
6 20171212
9 20171211
17 20171210
41 20171209
55 20171208
44 20171207
78 20171206
46 20171205
48 20171204
26 20171203
81 20171202
21 20171201
12 20171130
18 20171129
Daily row counts in test_error_temp for ordertest.com (t_url = 0) compared with all other domains (t_url = 1):
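SELECT COUNT(1),t_url,FROM_UNIXTIME(create_time,'%Y%m%d') AS d
FROM (
    SELECT create_time,
        -- t_url = 0: ordertest.com rows; t_url = 1: every other domain
        CASE url LIKE '%ordertest.com%'
            WHEN TRUE THEN 0
            ELSE 1
        END AS t_url
    FROM test_error_temp
)
AS tmp
GROUP BY d, t_url
ORDER BY d DESC, t_url -- t_url added so the 0/1 order within a day is deterministic
;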
COUNT(1) t_url d
897 0 20171219
676 1 20171219
2686 0 20171218
751 1 20171218
2871 0 20171217
1102 1 20171217
964 0 20171216
1598 1 20171216
654 0 20171215
1939 1 20171215
836 0 20171214
2116 1 20171214
32 0 20171213
2129 1 20171213
6 0 20171212
164 1 20171212
9 0 20171211
447 1 20171211
17 0 20171210
1723 1 20171210
41 0 20171209
2076 1 20171209
55 0 20171208
3568 1 20171208
44 0 20171207
2028 1 20171207
78 0 20171206
2963 1 20171206
46 0 20171205
1713 1 20171205
48 0 20171204
1963 1 20171204
26 0 20171203
684 1 20171203
81 0 20171202
1947 1 20171202
21 0 20171201
989 1 20171201
12 0 20171130
538 1 20171130
18 0 20171129
432 1 20171129
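Assessment: ordertest.com appears to have been blocking all crawler requests since the 14th (20171214). I propose deleting this domain's rows dated the 14th or later from the temporary error table test_error_temp; please handle this domain's data in test_error as you see fit.
-- Pre-check: confirm the operation is safe and targets the intended rows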
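The jump is easier to see as a share of each day's rows: from the counts above, ordertest.com accounts for 32 of 2161 rows (about 1.5%) on 20171213 and 836 of 2952 rows (about 28.3%) on 20171214. A minimal sketch of a query that reports this share directly, assuming MySQL's behaviour of evaluating a boolean expression to 1 or 0 (the column aliases ordertest_rows, total_rows and ordertest_pct are illustrative names, not from the original):

```sql
-- Per-day share of ordertest.com rows in test_error_temp.
-- In MySQL, (url LIKE '...') evaluates to 1 or 0, so SUM() counts the matches.
SELECT FROM_UNIXTIME(create_time,'%Y%m%d') AS d,
       SUM(url LIKE '%ordertest.com%') AS ordertest_rows,
       COUNT(*) AS total_rows,
       ROUND(100 * SUM(url LIKE '%ordertest.com%') / COUNT(*), 1) AS ordertest_pct
FROM test_error_temp
GROUP BY d
ORDER BY d DESC;
```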
-- SELECT t.*,FROM_UNIXTIME(create_time,'%Y%m%d') AS d FROM test_error_temp t WHERE url LIKE '%ordertest.com%' AND FROM_UNIXTIME(create_time,'%Y%m%d') >= '20171214';
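Besides listing the rows, it may help to count them first, so the number can be compared with the rows-affected figure reported by each DELETE. A minimal sketch that reuses the same predicates as the statements in this script (the alias rows_to_delete is illustrative):

```sql
-- Rows in test_error_temp that the DELETE would match.
SELECT COUNT(*) AS rows_to_delete
FROM test_error_temp
WHERE url LIKE '%ordertest.com%'
  AND FROM_UNIXTIME(create_time,'%Y%m%d') >= '20171214';

-- Rows in test_error that the DELETE would match (only payoff_status = 0).
SELECT COUNT(*) AS rows_to_delete
FROM test_error
WHERE url LIKE '%ordertest.com%'
  AND FROM_UNIXTIME(create_time,'%Y%m%d') >= '20171214'
  AND payoff_status = 0;
```

If new error rows are still being written for this domain, the count can drift slightly between the pre-check and the actual DELETE.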
-- Execute
-- DELETE FROM test_error_temp WHERE url LIKE '%ordertest.com%' AND FROM_UNIXTIME(create_time,'%Y%m%d') >= '20171214'; -- 8909
-- DELETE FROM test_error WHERE url LIKE '%ordertest.com%' AND FROM_UNIXTIME(create_time,'%Y%m%d') >= '20171214' AND payoff_status=0; -- 35
-- Post-check: verify the result of the execution
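One possible post-check, sketched under the assumption that the two DELETE statements were run with the predicates shown in this script, is to re-run the same filters as counts (the alias remaining_rows is illustrative). Both counts should be 0 unless new rows for this domain were inserted after the deletion:

```sql
-- Remaining ordertest.com rows dated 20171214 or later in test_error_temp.
SELECT COUNT(*) AS remaining_rows
FROM test_error_temp
WHERE url LIKE '%ordertest.com%'
  AND FROM_UNIXTIME(create_time,'%Y%m%d') >= '20171214';

-- Remaining matching rows in test_error (payoff_status = 0 only;
-- rows with any other payoff_status were intentionally left in place).
SELECT COUNT(*) AS remaining_rows
FROM test_error
WHERE url LIKE '%ordertest.com%'
  AND FROM_UNIXTIME(create_time,'%Y%m%d') >= '20171214'
  AND payoff_status = 0;
```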