  • Analyzing nginx access logs with a Python script

    The log format is as follows:

    223.74.135.248  [11/May/2017:11:19:47 +0800] "POST /login/getValidateCode HTTP/1.1" 404 14227 "http://www.example.com/login/getValidateCode" "Mozilla/4.0 (compatible; MSIE 9.0; Windows NT 6.1)"
    

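    The fields are, in order: client IP, access time, request method, request URI, HTTP protocol, response status code, response body size, referer, and client browser (user agent).

    Every field except the HTTP protocol is captured by the match and stored in a database for later analysis.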
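
    As a quick check of the field split described above, here is a minimal sketch that parses the sample line with named groups. The names `LOG_PATTERN` and `parse_line` are hypothetical, and the pattern assumes exactly the nine-field layout of the sample line (no request-time field), so it may need adjusting for a different `log_format`.

```python
import re
from datetime import datetime

# Hypothetical pattern for the nine-field sample line shown above.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+)\s+'
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<uri>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<bytes_sent>\d+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of fields, or None if the line does not match."""
    m = LOG_PATTERN.match(line)
    if m is None:
        return None
    fields = m.groupdict()
    fields.pop("protocol")  # the HTTP protocol is not stored
    fields["status"] = int(fields["status"])
    fields["bytes_sent"] = int(fields["bytes_sent"])
    # The timestamp looks like 11/May/2017:11:19:47 +0800
    fields["time"] = datetime.strptime(fields["time"], "%d/%b/%Y:%H:%M:%S %z")
    return fields


sample = ('223.74.135.248  [11/May/2017:11:19:47 +0800] '
          '"POST /login/getValidateCode HTTP/1.1" 404 14227 '
          '"http://www.example.com/login/getValidateCode" '
          '"Mozilla/4.0 (compatible; MSIE 9.0; Windows NT 6.1)"')

record = parse_line(sample)
print(record["ip"], record["method"], record["uri"], record["status"])
```

    Character classes such as `[^"]*` keep each group inside its own pair of quotes, which a greedy `(.*)` does not guarantee.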

    #!/usr/bin/env python3
    # -*- coding:utf-8 -*-
    import re
    import datetime
    import json
    from urllib.request import urlopen
    import MySQLdb as mdb
    
    # Yesterday's log file, named access_YYYY-MM-DD.log
    log = "/root/access_" + (datetime.datetime.now() - datetime.timedelta(days=1)).strftime('%Y-%m-%d')  + ".log"
    # The literal brackets around the timestamp must be escaped. Note that the
    # pattern expects a request-time field between the body size and the referer
    # (group 8), which the sample line above does not show.
    pattern = re.compile(r'(.*)  \[(.*)\] "(.*) (/.*) (.*)" (.*) (.*) (.*) "(.*)" "(.*)"', re.I)
    line = open(log,'r')
    con = mdb.connect('localhost','','','database',charset="utf8")
    cur = con.cursor()
    
    try:
        for i in line:
            matchObj = pattern.match(i)
            if matchObj is not None:
                ip = matchObj.group(1)
                # Look up the location of the IP (one HTTP request per log line)
                API = "http://ip.taobao.com/service/getIpInfo.php?ip=" + ip
                jsondata = json.loads(urlopen(API).read())
                address = jsondata['data']['country'] + jsondata['data']['region'] + jsondata['data']['city'] + jsondata['data']['isp']
                access_time = matchObj.group(2)
                method = matchObj.group(3)
                request = matchObj.group(4)
                # group(5) is the HTTP protocol, which is not stored
                status = int(matchObj.group(6))
                bytesSent = int(matchObj.group(7))
                request_time = float(matchObj.group(8))
                refer = matchObj.group(9)
                agent = matchObj.group(10)
                # Parameterized query: the driver escapes quotes in the referer and agent
                cur.execute('insert into nginx_access_log values(%s,%s,%s,%s,%s,%s,%s,%s,%s,%s)',
                            (ip,address,access_time,method,request,status,bytesSent,request_time,refer,agent))
        # Commit once at the end so the inserted rows are actually saved
        con.commit()
    finally:
        line.close()
        cur.close()
        con.close()
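
    The post does not show the table definition or any of the later analysis, so the following is only a sketch of what the storage step might look like. It uses an in-memory SQLite database as a stand-in for MySQL so that it runs without a server. The column names are hypothetical and follow the order of the insert statement. The second row and the `0.0` request time are made up for illustration. SQLite uses `?` as its placeholder, whereas MySQLdb uses `%s`.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL database; column names are assumed.
con = sqlite3.connect(":memory:")
con.execute(
    """CREATE TABLE nginx_access_log (
           ip TEXT, address TEXT, access_time TEXT, method TEXT, request TEXT,
           status INTEGER, bytes_sent INTEGER, request_time REAL,
           referer TEXT, agent TEXT)"""
)

rows = [
    # Taken from the sample line; address and request_time are placeholders.
    ("223.74.135.248", "", "11/May/2017:11:19:47 +0800", "POST",
     "/login/getValidateCode", 404, 14227, 0.0,
     "http://www.example.com/login/getValidateCode",
     "Mozilla/4.0 (compatible; MSIE 9.0; Windows NT 6.1)"),
    # Made-up row whose agent contains a double quote, which would break
    # a query assembled with string formatting.
    ("192.0.2.10", "", "11/May/2017:11:20:01 +0800", "GET",
     "/index.html", 200, 512, 0.0, "-", 'Example "Agent"/1.0'),
]

# Placeholders let the driver handle quoting of every value.
con.executemany(
    "INSERT INTO nginx_access_log VALUES (?,?,?,?,?,?,?,?,?,?)", rows
)
con.commit()

# One possible follow-up analysis: number of requests per status code.
status_counts = con.execute(
    "SELECT status, COUNT(*) FROM nginx_access_log "
    "GROUP BY status ORDER BY status"
).fetchall()
print(status_counts)
con.close()
```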
    
  • Original article: https://www.cnblogs.com/t-road/p/6868049.html