zoukankan      html  css  js  c++  java
  • php curl抓取页面及来路

    作者:知乎用户
    链接:https://www.zhihu.com/question/38960915/answer/79147687
    来源:知乎
    著作权归作者所有。商业转载请联系作者获得授权,非商业转载请注明出处。
    
    <?php
    
        $url = "https://www.zhihu.com/";
    
        $ch = curl_init();
        // 设置浏览器的特定header
        curl_setopt($ch, CURLOPT_HTTPHEADER, array(
            "Host: www.zhihu.com",
            "Connection: keep-alive",
            "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
            "Upgrade-Insecure-Requests: 1",
            "DNT:1",
            "Accept-Language: zh-CN,zh;q=0.8,en-GB;q=0.6,en;q=0.4,en-US;q=0.2",
            'Cookie:_za=4540d427-eee1-435a-a533-66ecd8676d7d; __utma=51854390.3169871.1440319332.1441339521.1442067491.5; __utmz=51854390.1442067491.5.5.utmcsr=baidu|utmccn=(organic)|utmcmd=organic; __utmv=51854390.100-1|2=registration_date=20140525=1^3=entry_date=20140525=1; q_c1=efa8c4ccdba04f63a0ba88845f485836|1451394239000|1440047640000; _xsrf=20c250b28098f92459cac05a3944d48d; cap_id="ZWQ5OGIzN2JiZWNmNGRlNGE3YTE1MTE0YTA5YjY1NjE=|1451394239|0efd13fc965c43c0fb6a7a2523b5dac4d1dac7e3"; z_c0="QUFCQXRLa3ZBQUFYQUFBQVlRSlZUY29ScWxZN0k3T1BHaFdqb1JNVlVZekNnZ0trU0xXdEdnPT0=|1451394250|02ed77acc81edbf2340fd0ce1b13618862b3674e"; unlock_ticket="QUFCQXRLa3ZBQUFYQUFBQVlRSlZUZEtMZ2xiM21FNDRmdzdsX1NnOVdieUp3M1VtY0RsaUVBPT0=|1451394250|8cf44cefb523b2973eca01f0918ef97fc03a49qa"',
            
            ));
        curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 10.0; WOW64; rv:41.0) Gecko/20100101 Firefox/41.0');
        // 在HTTP请求头中"Referer: "的内容。
        curl_setopt($ch, CURLOPT_REFERER,"https://www.baidu.com/s?word=%E7%9F%A5%E4%B9%8E&tn=sitehao123&ie=utf-8&ssl_sample=normal&f=3&rsp=0");
        curl_setopt($ch, CURLOPT_ENCODING, "gzip, deflate, sdch");
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
        curl_setopt($ch, CURLOPT_URL, $url);
        curl_setopt($ch, CURLOPT_TIMEOUT,120);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);//302redirect
        // 针对https的设置
        curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
        curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2);
        $html = curl_exec($ch);
        curl_close($ch);
        if($html === false) {
            echo 'Curl error: ' . curl_error($ch) . "<br>
    
    ";
        } else {
            echo $html;
        }
  • 相关阅读:
    2. linux下如何上传和下载文件
    (六)使用Docker镜像(下)
    (五)使用Docker镜像(上)
    1. chmod命令
    阿里P7/P8学习路线图——技术封神之路
    问题二:pip install python-igraph 报错,C core of igraph 没有安装。
    问题一:【Hive】explain command throw ClassCastException in 2.3.4
    (四)docker创建私人仓库
    P5024 保卫王国
    jzoj5980. 【WC2019模拟12.27】字符串游戏
  • 原文地址:https://www.cnblogs.com/741570hh/p/7423370.html
Copyright © 2011-2022 走看看