  • Varnish purges: a study of cache-purging techniques [repost]

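    Purging the cache in Varnish is complicated: both the purge mechanisms themselves and the syntax used to drive them take some effort to understand. It took me quite a while to work through them, and I am glad that I can now explain them to others.
    1. Varnish has two ways of purging the cache. The first acts only on the single variant of an object that the request hits, so a request that hits the uncompressed variant cannot purge the compressed one. This method is forced expiry: it sets the TTL of the object you want to purge to 0, which forces it to expire. The VCL looks like this: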

    acl purge {
            "localhost";
            "192.0.2.14";
    }
     
    sub vcl_recv {
            if (req.request == "PURGE") {
                    if (!client.ip ~ purge) {
                            error 405 "Not allowed.";
                    }
                    lookup;
            }
    }
     
    sub vcl_hit {
            if (req.request == "PURGE") {
                    set obj.ttl = 0s;
                    error 200 "Purged.";
            }
    }
     
    sub vcl_miss {
            if (req.request == "PURGE") {
                    error 404 "Not in cache.";
            }
    }
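
    Because forced expiry only touches the variant that the PURGE request itself hits, a client that wants every variant gone has to send one PURGE per variant. The following is a minimal sketch in Python, assuming the cached variants differ by Accept-Encoding and that the VCL above is loaded. build_purge_request and send_purge are hypothetical helper names, not part of Varnish.

```python
import socket


def build_purge_request(host, path, accept_encoding=None):
    """Return the raw bytes of an HTTP PURGE request for one variant."""
    lines = ["PURGE %s HTTP/1.1" % path, "Host: %s" % host]
    if accept_encoding:
        # Assumption: the cached variants differ by Accept-Encoding.
        lines.append("Accept-Encoding: %s" % accept_encoding)
    lines.append("Connection: close")
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def send_purge(ip, request, port=80, timeout=10):
    """Send one prepared request and return the status line of the reply."""
    with socket.create_connection((ip, port), timeout=timeout) as conn:
        conn.sendall(request)
        return conn.makefile("rb").readline().decode("ascii", "replace").strip()


# One request per variant: the uncompressed one first, then the gzip one.
variant_requests = [
    build_purge_request("www.example.com", "/"),
    build_purge_request("www.example.com", "/", accept_encoding="gzip"),
]
```

    Each entry of variant_requests would then be passed to send_purge together with the address of a Varnish server.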

    2. The other way is purge_url, expressed in the VCL below with purge():

    acl purge {
            "localhost";
            "192.0.2.14";
    }
     
    sub vcl_recv {
            if (req.request == "PURGE") {
                    if (!client.ip ~ purge) {
                            error 405 "Not allowed.";
                    }
                    purge("req.url == " req.url);
            }
    }


    With the VCL above in place, PURGE is carried out over HTTP. That means you now send a request such as:

    PURGE / HTTP/1.0
    Host: www.example.com

    to Varnish on port 80. This form of PURGE does not support regular expressions, however. If you need them, configure the VCL like this:

    acl purge {
            "localhost";
            "192.0.2.14";
    }
     
    sub vcl_recv {
            if (req.request == "PURGE") {
                    if (!client.ip ~ purge) {
                            error 405 "Not allowed.";
                    }
                    purge("req.url ~ " req.url);
            }
    }
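
    With req.url ~, the URL of the PURGE request is used as a regular expression, so an unanchored value can match far more than one page. The sketch below illustrates this with Python's re module as a stand-in for Varnish's regex engine (an assumption: the two agree on patterns this simple). The URL list and the helper exact_url_pattern are made up for the illustration.

```python
import re


def exact_url_pattern(path):
    """Escape a literal URL path and anchor it so it matches only itself."""
    return "^" + re.escape(path) + "$"


cached_urls = ["/test.html", "/old/test.html", "/testXhtml", "/test.html.bak"]

# The raw path is unanchored and "." matches any character.
loose = [u for u in cached_urls if re.search("/test.html", u)]

# An escaped, anchored pattern matches exactly one URL.
strict = [u for u in cached_urls if re.search(exact_url_pattern("/test.html"), u)]

print(loose)
print(strict)
```

    Here loose contains all four URLs while strict contains only /test.html.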

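    3. Besides configuring VCL to accept PURGE as in point 2, we can also purge by sending flexible purge commands to Varnish's management port.
    3.1 First, let us look at the management port's help output (Varnish 2.1):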

    [root@varnish4 varnish]# telnet 192.168.1.185 3500
    Trying 192.168.1.185...
    Connected to 192.168.1.185 (192.168.1.185).
    Escape character is '^]'.
    200 154     
    -----------------------------
    Varnish HTTP accelerator CLI.
    -----------------------------
    Type 'help' for command list.
    Type 'quit' to close CLI session.
     
    help
    200 377     
    help [command]
    ping [timestamp]
    auth response
    quit
    banner
    status
    start
    stop
    stats
    vcl.load <configname> <filename>
    vcl.inline <configname> <quoted_VCLstring>
    vcl.use <configname>
    vcl.discard <configname>
    vcl.list
    vcl.show <configname>
    param.show [-l] [<param>]
    param.set <param> <value>
    purge.url <regexp>
    purge <field> <operator> <arg> [&& <field> <oper> <arg>]...
    purge.list

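    3.2 Three commands in the help output relate to purging. purge.list shows the list of purges; the two that actually perform a purge are purge.url and purge.
    3.2.1 The purge.url command only purges by URL, for example to purge http://blog.izhoufeng.com/test.html: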

    [root@varnish2 varnish]# telnet 192.168.1.185 3500
    Trying 192.168.1.185...
    Connected to varnish1 (192.168.1.185).
    Escape character is '^]'.
    200 154     
    -----------------------------
    Varnish HTTP accelerator CLI.
    -----------------------------
    Type 'help' for command list.
    Type 'quit' to close CLI session.
     
    purge.url test.html
    200 0

    Instead of an interactive CLI session, you can also use varnishadm:

    /usr/local/varnish-2.1/bin/varnishadm -T 192.168.1.185:3500 purge.url ^test.html$
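
    Each reply in the telnet sessions above starts with a line such as 200 154 or 200 0. For scripting against the management port, a minimal sketch follows. It assumes that the first number is a status code and the second the length of the reply body; this reading is inferred from the transcripts above, and parse_cli_status and purge_accepted are hypothetical names.

```python
def parse_cli_status(line):
    """Split a CLI status line such as '200 154     ' into (status, length)."""
    fields = line.split()
    if len(fields) != 2:
        raise ValueError("unexpected CLI status line: %r" % line)
    return int(fields[0]), int(fields[1])


def purge_accepted(line):
    """True when the CLI answered a command with status 200."""
    status, _length = parse_cli_status(line)
    return status == 200


print(parse_cli_status("200 154     "))
print(purge_accepted("200 0"))
```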

    3.2.2 The purge command is much more flexible. Some examples:
    Purge http://izhoufeng.com/somedirectory/ and every page under that directory:

    purge req.http.host == izhoufeng.com && req.url ~ ^/somedirectory/.*$
    or
    purge req.url ~ ^/somedirectory/ && req.http.host == izhoufeng.com

    Purge every object that carries "Cache-Control: max-age=3600":

    purge obj.http.Cache-Control ~ max-age=3600
    or
    purge obj.http.Cache-Control ~ max-age ?= ?3600[^0-9]
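
    The two Cache-Control expressions are not equivalent. The sketch below compares them on three made-up header values, using Python's re module as a stand-in for Varnish's regex engine (an assumption). The first pattern also matches max-age=36000. The second tolerates spaces around the equals sign and rejects longer numbers, but it requires some character after 3600, so a header that ends in 3600 is not matched.

```python
import re

headers = [
    "max-age=3600",
    "max-age=36000",
    "public, max-age = 3600, must-revalidate",
]

simple = "max-age=3600"
strict = "max-age ?= ?3600[^0-9]"


def matches(pattern, value):
    """Unanchored regex match, like the ~ operator in a purge expression."""
    return re.search(pattern, value) is not None


# header value -> (matched by simple, matched by strict)
results = {value: (matches(simple, value), matches(strict, value)) for value in headers}

for value, outcome in results.items():
    print(value, outcome)
```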

    4. Purging in bulk calls for a programmatic interface.
    4.1 Via the HTTP PURGE interface:

    <?php
    // Purge one URL from a Varnish cache. $ip is the address of the Varnish server,
    // $host is the domain name of the site, $url is the URL path without the domain.
    function varnish_purge($ip, $host, $url)
    {
        $errstr = '';
        $errno = 0;
        $fp = fsockopen($ip, 80, $errno, $errstr, 10);
        if (!$fp)
        {
            return false;
        }
        // Every header line ends with CRLF; an empty line ends the request.
        $out = "PURGE {$url} HTTP/1.1\r\n";
        $out .= "Host: {$host}\r\n";
        $out .= "Connection: close\r\n\r\n";
        fputs($fp, $out);
        $status = fgets($fp, 4096);   // first line of the reply (not checked here)
        fclose($fp);
        return true;
    }

    // Usage: assume 192.168.1.185 (varnish1) and 192.168.1.186 (varnish2) are the internal
    // IP addresses of two Varnish cache servers, and that
    // http://blog.izhoufeng.com/housing1d/08041110_2372147.htm is the URL to purge.
    varnish_purge("varnish1", "blog.izhoufeng.com", "/housing1d/08041110_2372147.htm");
    varnish_purge("varnish2", "blog.izhoufeng.com", "/housing1d/08041110_2372147.htm");
    ?>
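
    The PHP function above reads the first line of the reply but never looks at it. With the forced-expiry VCL from point 1, that line says what happened: 200 when the object was purged, 404 when it was not in the cache, and 405 when the client is not in the ACL. A sketch of interpreting it follows; interpret_purge_status is a hypothetical helper, and the mapping only holds for that VCL.

```python
# Status codes returned by the forced-expiry VCL from point 1.
PURGE_OUTCOMES = {
    200: "purged",
    404: "not in cache",
    405: "not allowed",
}


def interpret_purge_status(status_line):
    """Map an HTTP status line such as 'HTTP/1.1 200 Purged.' to an outcome."""
    parts = status_line.split(None, 2)
    if len(parts) < 2 or not parts[1].isdigit():
        return "unparseable reply"
    return PURGE_OUTCOMES.get(int(parts[1]), "unexpected status %s" % parts[1])


print(interpret_purge_status("HTTP/1.1 404 Not in cache."))
```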

    4.2 Via Varnish's management port.
    For the interface below, first create a table and insert the URLs that need to be purged into it.

    mysql> show create table dirty_url;
    +-----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | Table     | Create Table                                                                                                                                                                                                                                                                                                           |
    +-----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    | dirty_url | CREATE TABLE `dirty_url` (
      `id` int(11) NOT NULL AUTO_INCREMENT,
      `url` varchar(600) NOT NULL,
      `is_done` tinyint(2) NOT NULL DEFAULT '0',
      `ip` varchar(15) NOT NULL,
      `time` datetime NOT NULL,
      PRIMARY KEY (`id`),
      KEY `is_done` (`is_done`)
    ) ENGINE=InnoDB AUTO_INCREMENT=20627685 DEFAULT CHARSET=utf8 |
    +-----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    1 row in set (0.00 sec)
     
    mysql> select * from dirty_url limit 10;
    +----------+---------------------------------------------------------------+---------+-----------------+---------------------+
    | id       | url                                                           | is_done | ip              | time                |
    +----------+---------------------------------------------------------------+---------+-----------------+---------------------+
    | 20544199 | http://blog.izhoufeng.com/xianzhilipin/10040810_3231622.htm         |       0 | 192.168.1.186  | 2010-04-08 10:43:11 |
    | 20544200 | http://blog.izhoufeng.com/yingyouyunfu/10040810_3231623.htm         |       0 | 192.168.1.186  | 2010-04-08 10:43:11 |
    | 20544201 | http://blog.izhoufeng.com/ershoubijibendiannao/10040810_3231624.htm |       0 | 192.168.1.186  | 2010-04-08 10:43:11 |
    | 20544202 | http://blog.izhoufeng.com/zhaoshangjiameng/10032610_2222213.htm     |       0 | 192.168.1.186 | 2010-04-08 10:43:11 |
    | 20544203 | http://blog.izhoufeng.com/fushixiaobaxuemao/10040810_3231625.htm    |       0 | 192.168.1.186  | 2010-04-08 10:43:11 |
    | 20544204 | http://blog.izhoufeng.com/wupinjiaohuan/10040810_3231626.htm        |       0 | 192.168.1.186  | 2010-04-08 10:43:11 |
    | 20544205 | http://blog.izhoufeng.com/fushixiaobaxuemao/10040810_3231627.htm    |       0 | 192.168.1.186  | 2010-04-08 10:43:11 |
    | 20544206 | http://blog.izhoufeng.com/wupinjiaohuan/10040810_3231628.htm        |       0 | 192.168.1.186  | 2010-04-08 10:43:11 |
    | 20544207 | http://blog.izhoufeng.com/qitawupin/10040810_3231629.htm            |       0 | 192.168.1.186  | 2010-04-08 10:43:11 |
    | 20544208 | http://blog.izhoufeng.com/shoujitongxun/10040810_3231630.htm        |       0 | 192.168.1.186  | 2010-04-08 10:43:11 |
    +----------+---------------------------------------------------------------+---------+-----------------+---------------------+

    <?php
    //error_reporting(E_ALL);
    //ini_set("log_errors", "1");
    //ini_set("display_errors", "0");
    set_time_limit(0);
     
    $adminHost[0] = "192.168.1.185"; // IP address to connect to
    $adminHost[1] = "192.168.1.186"; // IP address to connect to
    $adminHost[2] = "192.168.1.187"; // IP address to connect to
    $adminPort = "3500"; // Port to connect to
     
    // Send one CLI command to a Varnish management port and return the raw reply.
    // Note: the 2.1 CLI prints a banner on connect (see 3.1), so the first chunk
    // read here may be that banner rather than the reply to the command.
    function pollServer($command,$adminHost,$adminPort) {
            $socket = socket_create(AF_INET, SOCK_STREAM, getprotobyname("tcp"));
            if (!$socket) {
                    die("Unable to open socket to " . $adminHost . ":" . $adminPort . "\n");
            }
            $timeout = array("sec" => 5, "usec" => 0);
            if (!socket_set_option($socket, SOL_SOCKET, SO_RCVTIMEO, $timeout) || !socket_set_option($socket, SOL_SOCKET, SO_SNDTIMEO, $timeout)) {
                    die("Unable to set socket timeout");
            }
            if (@socket_connect($socket, $adminHost, $adminPort)) {
                    $data = "";
                    socket_write($socket, $command . "\n");
                    socket_recv($socket, $buffer, 65536, 0);
                    $data .= $buffer;
                    socket_close($socket);
                    return $data;
            }
            else {
                    return "Unable to connect: " . socket_strerror(socket_last_error()) . "\n";
            }
    }
    // Fetch a URL through Varnish so that the page is loaded back into the cache.
    function socketHttpGet($url,$hostIp)
    {
            $urlInfo = parse_url($url);
            $urlInfo["path"] = (empty($urlInfo["path"]) ? "/" : $urlInfo["path"]);
            $urlInfo["port"] = (!isset($urlInfo["port"]) ? 80 : $urlInfo["port"]);
     
            $urlInfo["request"] =  $urlInfo["path"] .
            (empty($urlInfo["query"]) ? "" : "?" . $urlInfo["query"]) .
            (empty($urlInfo["fragment"]) ? "" : "#" . $urlInfo["fragment"]);
     
            $fsock = fsockopen($hostIp, $urlInfo["port"], $errno, $errstr, 10);
     
            if (false == $fsock) {
                    return false;
            }
     
            /* build the GET request; every header line ends with CRLF */
            $in = "GET " . $urlInfo["request"] . " HTTP/1.0\r\n";
            $in .= "Host: " . $urlInfo["host"] . "\r\n";
            $in .= "Accept: */*\r\n";
            $in .= "Accept-Language: zh-CN\r\n";
            $in .= "Accept-Encoding: gzip, deflate\r\n";
            $in .= "User-Agent: Mozilla/4.0 (Auto add the Web to Varnish)\r\n";
            $in .= "Cache-Control: no-cache\r\n";
            $in .= "Connection: close\r\n\r\n";
     
            //stream_set_timeout($fsock, 10);
            if (!fwrite($fsock, $in, strlen($in))) {
                    fclose($fsock);
                    return false;
            }
            unset($in);
     
            // read the whole response
            $out = "";
            while ($buff = fgets($fsock, 2048)) {
                    $out .= $buff;
            }
     
            // close the socket
            fclose($fsock);
     
            // return only the response headers
            $pos = strpos($out, "\r\n\r\n");
            if ($pos === false) {
                    return $out;
            }
            $head = substr($out, 0, $pos);
            return $head;
    }
    // Main loop: purge the URLs queued in MySQL.
    // mysql_* is the legacy extension the original was written for (removed in PHP 7).
    $conn = mysql_connect("192.168.1.186", "username", "password");
    mysql_select_db("queue", $conn);
    mysql_query("set names utf8");
    while (true)
    {
            $query = "select url,id from dirty_url where is_done =0 order by id asc limit 200";
            $results = mysql_query($query);
            while ($arr = mysql_fetch_array($results))
            {
                    $url = parse_url($arr['url']);
                    $purge_url = "purge.url " . $url['path'];
                    $stats = 0;
                    foreach ($adminHost as $value)
                    {
                            $result = pollServer($purge_url, $value, $adminPort);
                            $status = explode(" ", $result);
                            if ($status[0] == "200")
                            {
                                    //socketHttpGet($arr['url'],$value);
                                    $stats++;
                            }
                    }
                    // Mark the row as done only when every server accepted the purge.
                    if ($stats == count($adminHost))
                    {
                            $ups = "update dirty_url set is_done =1 where id=" . (int)$arr['id'];
                            mysql_query($ups);
                    }
            }
            sleep(1);
    }
    ?>
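
    The loop above passes only the path to purge.url, so the purge is not limited to one host and the path is used as an unanchored regular expression. A sketch of building a host-scoped, anchored expression for the purge command instead is shown below. purge_command_for is a hypothetical helper, re.escape stands in for whatever escaping the Varnish regex engine needs, and how the CLI treats backslashes is not verified here.

```python
import re
from urllib.parse import urlsplit


def purge_command_for(url):
    """Build a CLI purge command limited to the URL's host and exact path."""
    parts = urlsplit(url)
    path = parts.path or "/"   # like the PHP loop, the query string is ignored
    return "purge req.http.host == %s && req.url ~ ^%s$" % (
        parts.hostname,
        re.escape(path),
    )


command = purge_command_for("http://blog.izhoufeng.com/qitawupin/10040810_3231629.htm")
print(command)
```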

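    An interface like this does the job, but on a large site with more than 100,000 purges a day it needs to be multithreaded. The figure below shows the purge program I developed; I hope it gives you some ideas.
    [Figure missing from this copy: screenshot of the author's purge program]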
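    The author's program is not reproduced in this copy. As one possible way to parallelize the work, the sketch below sends a command to all servers from a thread pool. The transport is passed in as a function so the example runs without a Varnish server; purge_everywhere and fake_send are hypothetical names, and the non-200 code returned by fake_send is arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor


def purge_everywhere(command, servers, send, max_workers=8):
    """Send one CLI command to every server in parallel.

    send(server, command) must return the CLI status code as an int.
    Returns True only if every server answered 200.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        statuses = list(pool.map(lambda server: send(server, command), servers))
    return all(status == 200 for status in statuses)


def fake_send(server, command):
    """Stand-in transport for the example: one server refuses the command."""
    return 200 if server != "192.168.1.187" else 400


servers = ["192.168.1.185", "192.168.1.186", "192.168.1.187"]
print(purge_everywhere("purge.url ^/test.html$", servers[:2], fake_send))
print(purge_everywhere("purge.url ^/test.html$", servers, fake_send))
```

    In a real deployment, send would open a connection to the management port, write the command and parse the status line of the reply.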

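    5. The purge methods all work on the same principle, as the output of purge.list shows.
    5.1 When purging is controlled by VCL as in point 2 and PURGE is sent over HTTP, the list looks like this: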

      384   req.url == /fang5/10040721_93041.htm
        0   req.url == /jiajiao/10040721_25874.htm
        0   req.url == /fang5/10040721_530212.htm
        0   req.url == /jiaoyou9/10040721_3814.htm
        0   req.url == /fang1/10040721_152079.htm
        0   req.url == /fang5/10040721_284698.htm
        0   req.url == /fang1/10040721_625739.htm
        0   req.url == /fang5/10040721_388442.htm
        0   req.url == /fang1/10040721_450056.htm
        0   req.url == /fang5/10040721_704267.htm
        0   req.url == /fang5/10040721_704266.htm
        0   req.url == /fang1/10040721_625738.htm
        0   req.url == /fang5/10040721_704265.htm
        0   req.url == /fang1/10040721_71558.htm
        0   req.url == /fang5/10040721_226345.htm
        0   req.url == /fang5/10040721_121378.htm
        0   req.url == /fang1/10040721_818489.htm

    5.2 With the purge.url command, the list shows:

    0x2aaaaec44640 1270695397.844543     0  req.url ~ ^test.html$

    5.3 With the purge command, the list shows:

    0x2aaaaecc9fa0 1270698757.617076     0  req.http.host ~ sh.izhoufeng.com && req.url ~ ^/jzjiuba/10040810_1173704.htm$
    0x2aaaaecc9f40 1270698757.616768     0  req.http.host ~ bj.izhoufeng.com && req.url ~ ^/fang5/10040711_6414363.htm$
    0x2aaaaecc9e80 1270698757.547097     0  req.http.host ~ sh.izhoufeng.com && req.url ~ ^/zpshichangyingxiao/09112600_1315775.htm$
    0x2aaaaecc9e20 1270698755.967497     0  req.http.host ~ sh.izhoufeng.com && req.url ~ ^/jzjiuba/10040810_1173702.htm$
    0x2aaaaecc9dc0 1270698755.665087     0  req.http.host ~ bj.izhoufeng.com && req.url ~ ^/zpsiji/10040811_3258734.htm$
    0x2aaaaecc9d60 1270698755.591958     0  req.http.host ~ bj.izhoufeng.com && req.url ~ ^/zpbaomu/10040811_3258735.htm$
    0x2aaaaecc9ca0 1270698755.356326     0  req.http.host ~ bj.izhoufeng.com && req.url ~ ^/zpyinyuebiaoyanzhuchi/10040811_3258736.htm$
    0x2aaaaecc9c40 1270698755.291577     0  req.http.host ~ bj.izhoufeng.com && req.url ~ ^/zpwenan/10040811_3258737.htm$
    0x2aaaaecc9be0 1270698755.290569     0  req.http.host ~ sh.izhoufeng.com && req.url ~ ^/lvshi/10040809_2410859.htm$
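
    For monitoring, the lines shown in 5.2 and 5.3 can be split into fields. The sketch below assumes the columns are an address, a creation timestamp, a count and the ban expression; the column meanings are a guess from the output above, and BanEntry and parse_purge_list_line are hypothetical names. The two-column format shown in 5.1 is not handled.

```python
from typing import NamedTuple


class BanEntry(NamedTuple):
    address: str
    created: float
    count: int        # assumed meaning: objects still referring to this ban
    expression: str


def parse_purge_list_line(line):
    """Parse one 'address timestamp count expression' line of purge.list output."""
    address, created, count, expression = line.split(None, 3)
    return BanEntry(address, float(created), int(count), expression.strip())


entry = parse_purge_list_line(
    "0x2aaaaec44640 1270695397.844543     0  req.url ~ ^test.html$"
)
print(entry.expression)
```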

    6. Varnish implements purging through a ban list.
    When we run a command such as:

    purge req.url ~ .png

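    the ban is added to the front of the list of active bans. Every further purge is added in the same way. Varnish cannot walk through the hundreds of millions of objects it may have cached to work out which ones are affected. Instead, when Varnish looks up an object in the cache, it compares the object against the bans in the purge list.
    If there are bans the object has not yet been tested against, the object is compared with them. When one matches, the object is marked as unusable, and unless another suitable object can be found, the cache hit is turned into a cache miss, which causes the object to be fetched from the backend.
    Newly created objects are not subject to older bans: when an object is inserted into the cache, it is marked as already checked against all bans currently in the list.
    Once every object has been checked against a ban, the ban is removed from the purge list and its memory is freed.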
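
    The mechanism described above can be sketched as a toy model in Python. It is an illustration of lazy ban checking only, not Varnish's actual data structures; the Cache class and its methods are hypothetical, and bans are modelled as plain Python predicates on the URL.

```python
class Cache:
    """Toy model of lazy ban checking."""

    def __init__(self):
        self.bans = []      # predicates, oldest first
        self.objects = {}   # url -> number of bans already checked

    def insert(self, url):
        # A new object counts as checked against every existing ban.
        self.objects[url] = len(self.bans)

    def ban(self, predicate):
        # Adding a ban is cheap: no cached object is touched here.
        self.bans.append(predicate)

    def lookup(self, url):
        """Return 'hit' or 'miss', testing only bans newer than the object."""
        if url not in self.objects:
            return "miss"
        checked = self.objects[url]
        for predicate in self.bans[checked:]:
            if predicate(url):
                del self.objects[url]   # banned: the hit becomes a miss
                return "miss"
        self.objects[url] = len(self.bans)
        return "hit"

    def expire_bans(self):
        """Drop the oldest bans that every cached object has been checked against."""
        done = min(self.objects.values(), default=len(self.bans))
        self.bans = self.bans[done:]
        for url in self.objects:
            self.objects[url] -= done


cache = Cache()
cache.insert("/logo.png")
cache.insert("/index.html")
cache.ban(lambda url: url.endswith(".png"))   # roughly "purge req.url ~ .png"
cache.insert("/new.png")                      # inserted after the ban

print(cache.lookup("/logo.png"))
print(cache.lookup("/new.png"))
print(cache.lookup("/index.html"))
```

    In this model /logo.png turns into a miss, while /new.png and /index.html stay hits; once all remaining objects have been checked, expire_bans drops the ban.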
    7. The documents I consulted:

    http://varnish-cache.org/wiki/Purging

    http://kristian.blog.linpro.no/2010/02/02/varnish-purges/

    http://varnish-cache.org/wiki/VCLExamplePurging

    Source: http://blog.izhoufeng.com/posts/98.html

  • Original address: https://www.cnblogs.com/merryfreespace/p/3537852.html