  • Snoopy.class.php User Manual

    Snoopy - the PHP net client v1.2.4

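    Snoopy is a PHP class that simulates a web browser: it can fetch web page content and submit forms.
    Features of Snoopy:
    1. Fetches the content of a web page: fetch
    2. Fetches the text of a web page (HTML tags stripped): fetchtext
    3. Fetches the links and forms of a web page: fetchlinks, fetchform
    4. Supports proxy hosts
    5. Supports basic username/password authentication
    6. Supports setting user_agent, referer, cookies, and header content
    7. Follows browser redirects, with a configurable redirect depth
    8. Expands links in a page into fully qualified URLs (default)
    9. Submits data and retrieves the response
    10. Follows HTML frames
    11. Passes cookies along on redirects
    It requires PHP 4 or later. Because Snoopy is a plain PHP class, it needs no extensions, which makes it a good choice when the server does not support curl.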

    Synopsis:

        include "Snoopy.class.php";
        $snoopy = new Snoopy;
        
        $snoopy->fetchtext("http://www.php.net/");
        print $snoopy->results;
        
        $snoopy->fetchlinks("http://www.phpbuilder.com/");
        print $snoopy->results;
        
        $submit_url = "http://lnk.ispi.net/texis/scripts/msearch/netsearch.html";
        
        $submit_vars["q"] = "amiga";
        $submit_vars["submit"] = "Search!";
        $submit_vars["searchhost"] = "Altavista";
            
        $snoopy->submit($submit_url,$submit_vars);
        print $snoopy->results;
        
        $snoopy->maxframes=5;
        $snoopy->fetch("http://www.ispi.net/");
        echo "<PRE>\n";
        echo htmlentities($snoopy->results[0]); 
        echo htmlentities($snoopy->results[1]); 
        echo htmlentities($snoopy->results[2]); 
        echo "</PRE>\n";
    
        $snoopy->fetchform("http://www.altavista.com");
        print $snoopy->results; 

    Class methods:

        fetch($URI)
        -----------
        
        This is the method used for fetching the contents of a web page.
        $URI is the fully qualified URL of the page to fetch.
        The results of the fetch are stored in $this->results.
        If you are fetching frames, then $this->results
        contains each frame fetched in an array.
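
Because `$this->results` can be either a single string or an array of frames, calling code may want to normalize it before looping. The sketch below is one way to do that; `results_as_list()` is a hypothetical helper, not part of Snoopy, and the sample input is a stand-in for `$snoopy->results`. It uses return types, so it assumes PHP 7 or later.

```php
<?php
// Hypothetical helper (not part of Snoopy): normalize a results value that
// is a string for a plain page and an array when frames were followed.
function results_as_list($results): array
{
    if (is_array($results)) {
        return array_values($results);
    }
    return [(string) $results];
}

// Stand-in value; in real use pass $snoopy->results after a successful fetch().
foreach (results_as_list("<html>single page</html>") as $i => $html) {
    echo "document $i: " . htmlspecialchars($html) . "\n";
}
```

With this in place the same loop handles framed and unframed pages.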
            
        fetchtext($URI)
        ---------------    
        
        This behaves exactly like fetch() except that it only returns
        the text from the page, stripping out html tags and other
        irrelevant data.        
    
        fetchform($URI)
        ---------------    
        
        This behaves exactly like fetch() except that it only returns
        the form elements from the page, stripping out html tags and other
        irrelevant data.        
    
        fetchlinks($URI)
        ----------------
    
        This behaves exactly like fetch() except that it only returns
        the links from the page. By default, relative links are
        converted to their fully qualified URL form.
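
To make "converted to their fully qualified URL form" concrete, here is a simplified sketch of link expansion. It is not Snoopy's actual code: `expand_link()` is a hypothetical name, the example.com URLs are placeholders, and the sketch only covers absolute URLs, root-relative paths and document-relative paths (no `../` segments or protocol-relative `//host` links).

```php
<?php
// Simplified illustration of relative-link expansion; NOT Snoopy's own logic.
function expand_link(string $link, string $baseUrl): string
{
    // Already fully qualified (scheme://...): leave untouched.
    if (preg_match('#^[a-z][a-z0-9+.-]*://#i', $link)) {
        return $link;
    }
    $base = parse_url($baseUrl);
    $root = $base['scheme'] . '://' . $base['host']
          . (isset($base['port']) ? ':' . $base['port'] : '');
    // Root-relative link: append to scheme and host.
    if ($link !== '' && $link[0] === '/') {
        return $root . $link;
    }
    // Document-relative link: resolve against the base document's directory.
    $path = $base['path'] ?? '/';
    $dir  = substr($path, 0, strrpos($path, '/') + 1);
    return $root . $dir . $link;
}

echo expand_link('guide.html', 'http://www.example.com/docs/index.html'), "\n";
echo expand_link('/about', 'http://www.example.com/docs/index.html'), "\n";
```

Setting `$expandlinks` to false makes Snoopy return the links as they appear in the page instead.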
    
        submit($URI,$formvars)
        ----------------------
        
        This submits a form to the specified $URI. $formvars is an
        array of the form variables to pass.
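
`$formvars` is an ordinary associative array of field names and values. The sketch below shows what such an array looks like once URL-encoded as a form body. It uses PHP's built-in `http_build_query()` purely for illustration; Snoopy does its own encoding internally, and that is assumed rather than shown here.

```php
<?php
// Form variables as they would be passed to submit().
$submit_vars = [
    "q"          => "amiga",
    "submit"     => "Search!",
    "searchhost" => "Altavista",
];

// Illustration only: encode the array as application/x-www-form-urlencoded.
$body = http_build_query($submit_vars);
echo $body, "\n"; // q=amiga&submit=Search%21&searchhost=Altavista
```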
            
            
        submittext($URI,$formvars)
        --------------------------
    
        This behaves exactly like submit() except that it only returns
        the text from the page, stripping out html tags and other
        irrelevant data.        
    
        submitlinks($URI,$formvars)
        ---------------------------
    
        This behaves exactly like submit() except that it only returns
        the links from the page. By default, relative links are
        converted to their fully qualified URL form.

    Class variables: (default value in parentheses)

        $host            the host to connect to
        $port            the port to connect to
        $proxy_host        the proxy host to use, if any
        $proxy_port        the proxy port to use, if any
        $agent            the user agent to masquerade as (Snoopy v0.1)
        $referer        referer information to pass, if any
        $cookies        cookies to pass if any
        $rawheaders        other header info to pass, if any
        $maxredirs        maximum redirects to allow. 0=none allowed. (5)
        $offsiteok        whether or not to allow redirects off-site. (true)
        $expandlinks    whether or not to expand links to fully qualified URLs (true)
        $user            authentication username, if any
        $pass            authentication password, if any
        $accept            http accept types (image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */*)
        $error            where errors are sent, if any
        $response_code    response code returned from server
        $headers        headers returned from server
        $maxlength        max return data length
        $read_timeout    timeout on read operations (requires PHP 4 Beta 4+)
                        set to 0 to disallow timeouts
        $timed_out        true if a read operation timed out (requires PHP 4 Beta 4+)
        $maxframes        number of frames we will follow
        $status            http status of fetch
        $temp_dir        temp directory that the webserver can write to. (/tmp)
        $curl_path        system path to cURL binary, set to false if none
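
Several of these variables report the outcome of a request ($error, $timed_out, $response_code). One possible way to turn them into a single status message is sketched below. `describe_outcome()` is a hypothetical helper, the stdClass object is a stand-in for a Snoopy instance, and the sample `response_code` value is an assumption about its format. The `object` type hint assumes PHP 7.2 or later.

```php
<?php
// Hypothetical helper: summarize a request from the documented properties.
// Works on any object exposing timed_out, error and response_code.
function describe_outcome(object $client): string
{
    if (!empty($client->timed_out)) {
        return "timed out";
    }
    if (!empty($client->error)) {
        return "error: " . $client->error;
    }
    return "ok: " . trim((string) $client->response_code);
}

// Stand-in object; in real use pass the Snoopy instance after fetch().
$fake = new stdClass;
$fake->timed_out = false;
$fake->error = "";
$fake->response_code = "HTTP/1.1 200 OK\r\n"; // assumed format
echo describe_outcome($fake), "\n";
```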

    Examples:

        Example:     fetch a web page and display the return headers and
                    the contents of the page (html-escaped):
        
        include "Snoopy.class.php";
        $snoopy = new Snoopy;
        
        $snoopy->user = "joe";
        $snoopy->pass = "bloe";
        
        if($snoopy->fetch("http://www.slashdot.org/"))
        {
            echo "response code: ".$snoopy->response_code."<br>\n";
            // each() was removed in PHP 8; foreach works on all versions
            foreach($snoopy->headers as $key => $val)
                echo $key.": ".$val."<br>\n";
            echo "<p>\n";
            
            echo "<PRE>".htmlspecialchars($snoopy->results)."</PRE>\n";
        }
        else
            echo "error fetching document: ".$snoopy->error."\n";
    
    
    
        Example:    submit a form and print out the result headers
                    and html-escaped page:
    
        include "Snoopy.class.php";
        $snoopy = new Snoopy;
        
        $submit_url = "http://lnk.ispi.net/texis/scripts/msearch/netsearch.html";
        
        $submit_vars["q"] = "amiga";
        $submit_vars["submit"] = "Search!";
        $submit_vars["searchhost"] = "Altavista";
    
            
        if($snoopy->submit($submit_url,$submit_vars))
        {
            // each() was removed in PHP 8; foreach works on all versions
            foreach($snoopy->headers as $key => $val)
                echo $key.": ".$val."<br>\n";
            echo "<p>\n";
            
            echo "<PRE>".htmlspecialchars($snoopy->results)."</PRE>\n";
        }
        else
            echo "error fetching document: ".$snoopy->error."\n";
    
    
    
        Example:    showing functionality of all the variables:
        
    
        include "Snoopy.class.php";
        $snoopy = new Snoopy;
    
        $snoopy->proxy_host = "my.proxy.host";
        $snoopy->proxy_port = "8080";
        
        $snoopy->agent = "(compatible; MSIE 4.01; MSN 2.5; AOL 4.0; Windows 98)";
        $snoopy->referer = "http://www.microsnot.com/";
        
        $snoopy->cookies["SessionID"] = "238472834723489";
        $snoopy->cookies["favoriteColor"] = "RED";
        
        $snoopy->rawheaders["Pragma"] = "no-cache";
        
        $snoopy->maxredirs = 2;
        $snoopy->offsiteok = false;
        $snoopy->expandlinks = false;
        
        $snoopy->user = "joe";
        $snoopy->pass = "bloe";
        
        if($snoopy->fetchtext("http://www.phpbuilder.com"))
        {
            // each() was removed in PHP 8; foreach works on all versions
            foreach($snoopy->headers as $key => $val)
                echo $key.": ".$val."<br>\n";
            echo "<p>\n";
            
            echo "<PRE>".htmlspecialchars($snoopy->results)."</PRE>\n";
        }
        else
            echo "error fetching document: ".$snoopy->error."\n";
    
    
        Example:     fetch framed content and display the results:
        
        include "Snoopy.class.php";
        $snoopy = new Snoopy;
        
        $snoopy->maxframes = 5;
        
        if($snoopy->fetch("http://www.ispi.net/"))
        {
            echo "<PRE>".htmlspecialchars($snoopy->results[0])."</PRE>\n";
            echo "<PRE>".htmlspecialchars($snoopy->results[1])."</PRE>\n";
            echo "<PRE>".htmlspecialchars($snoopy->results[2])."</PRE>\n";
        }
        else
            echo "error fetching document: ".$snoopy->error."\n";
  • Original article: https://www.cnblogs.com/nobcool/p/3404059.html