zoukankan      html  css  js  c++  java
  • 详解NGINX如何统计网站的PV、UV、独立IP

    Nginx: PV、UV、独立IP

    做网站的都知道,平常经常要查询下网站PV、UV等网站的访问数据,当然如果网站做了CDN的话,nginx本地的日志就没什么意义了,下面就对nginx网站的日志访问数据做下统计;

    概念:

    • UV(Unique Visitor):独立访客,将每个独立上网电脑(以cookie为依据)视为一位访客,一天之内(00:00-24:00),访问您网站的访客数量。一天之内相同cookie的访问只被计算1次
    • PV(Page View):访问量,即页面浏览量或者点击量,用户每次对网站的访问均被记录1次。用户对同一页面的多次访问,访问量值累计
    • 统计独立IP:00:00-24:00内相同IP地址只被计算一次,做网站优化的朋友最关心这个

    先声明下环境,此次运行的nginx版本1.7,后端Tomcat运行的是动态交互程序(需进行用户认证,如果是静态页面则抓不到cache值,$http_cookie是空值),就是这样;

    nginx日志文件配置

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    http {
      include    mime.types;
      default_type application/octet-stream;
      log_format main '$remote_addr - [$time_local] "$request" '
                ' - $status "User_Cookie:$guid" ';
     #User_Cookie为日志显示字符,$guid为变量,具体内容在下面定义,也可在日志格式里写入$http_cookie 显示完整的cookie内容<br>
      sendfile    on;
      keepalive_timeout 65;
        upstream backserver {
        ip_hash;
        server 1.1.2.2:8080;
        server 1.1.2.3:8080;
    }
    server {
        listen    80;
        server_name localhost;
        #if ( $http_cookie ~* "(.*)$") 匹配所有内容
        if ( $http_cookie ~* "CSID=([A-Z0-9]*)"){
            set $guid $1;
        #只匹配CSID字符信息,此处为正则表达式<br>
        access_log logs/host.access.log main;
         location ~* ^(.*)$ {
           #limit_req zone=allips burst=1 nodelay;
      
           proxy_pass http://backserver;
           proxy_set_header Host $host;
           proxy_set_header X-Real-IP $remote_addr;
           proxy_set_header REMOTE-HOST $remote_addr;
           proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
           client_max_body_size 8m;
           }
        error_page  500 502 503 504 /50x.html;
        location = /50x.html {
          root  html;
        }
    }

    注:$http_cookie这个里面的值是一个一个cookie的值,中间以“;”分隔

    日志输出格式

    192.168.40.2 - [02/Nov/2016:15:44:35 +0800]  "GET /wcm/app/main/refresh.jsp?r=1478072325778 HTTP/1.1"  - 200 "User_Cookie:7F00000122A5597C46607B1C0A7EC016"
    192.168.40.2 - [02/Nov/2016:15:44:35 +0800]  "GET /webpic/W0201611/W020161102/W020161102566715167404.jpg HTTP/1.1"  - 200 "User_Cookie:7F00000122A5597C46607B1C0A7EC016"
    119.255.31.109 - [02/Nov/2016:15:44:36 +0800]  "GET /wcm/app/main/refresh.jsp?r=1478072510132 HTTP/1.1"  - 200 "User_Cookie:7F000001237921BE9237838AEC65704D"
    119.255.31.109 - [02/Nov/2016:15:44:36 +0800]  "GET /wcm/app/message/message_query_service.jsp?READFLAG=0&MSGTYPES=1%2C2%2C3 HTTP/1.1"  - 200 "User_Cookie:7F000001237921BE9237838AEC65704D"
    192.168.40.2 - [02/Nov/2016:15:44:37 +0800]  "GET /wcm/app/message/message_query_service.jsp?READFLAG=0&MSGTYPES=1%2C2%2C3 HTTP/1.1"  - 200 "User_Cookie:7F00000123D3BF2345115EAAC21F71E0"
    192.168.40.2 - [02/Nov/2016:15:44:37 +0800]  "GET /wcm/app/message/message_query_service.jsp?READFLAG=0&MSGTYPES=1%2C2%2C3 HTTP/1.1"  - 200 "User_Cookie:7F00000123EF73896DF98EDA9950944E"
    192.168.40.2 - [02/Nov/2016:15:44:37 +0800]  "GET /wcm/app/message/message_query_service.jsp?READFLAG=0&MSGTYPES=1%2C2%2C3 HTTP/1.1"  - 200 "User_Cookie:7F00000123FE0F9C397E1A8F0C4F044B"
    192.168.40.2 - [02/Nov/2016:15:44:37 +0800]  "GET /wcm/app/main/refresh.jsp?r=1478072511427 HTTP/1.1"  - 200 "User_Cookie:7F00000123A465B7EA1DE0AF0AE671B7"
    119.255.31.109 - [02/Nov/2016:15:44:38 +0800]  "GET /wcm/app/message/message_query_service.jsp?READFLAG=0&MSGTYPES=1%2C2%2C3 HTTP/1.1"  - 200 "User_Cookie:7F00000123D89B11302DF80AE773C900" 

    PV统计

    可统计单个链接地址访问量:

    1
    [root@localhost logs]# grep index.shtml host.access.log | wc -l

    总PV量:

    1
    [root@localhost logs]# awk '{print $6}' host.access.log | wc -l

    独立IP

    1
    [root@localhost logs]# awk '{print $1}' host.access.log | sort -r |uniq -c | wc -l

    UV统计

    1
    [root@localhost logs]# awk '{print $10}' host.access.log | sort -r |uniq -c |wc -l
    学习时的痛苦是暂时的 未学到的痛苦是终生的
  • 相关阅读:
    AcWing 1135. 新年好 图论 枚举
    uva 10196 将军 模拟
    LeetCode 120. 三角形最小路径和 dp
    LeetCode 350. 两个数组的交集 II 哈希
    LeetCode 174. 地下城游戏 dp
    LeetCode 面试题 16.11.. 跳水板 模拟
    LeetCode 112. 路径总和 递归 树的遍历
    AcWing 1129. 热浪 spfa
    Thymeleaf Javascript 取值
    Thymeleaf Javascript 取值
  • 原文地址:https://www.cnblogs.com/grimm/p/15161547.html
Copyright © 2011-2022 走看看