  • Using wget on Linux


    wget is a command-line utility for retrieving files from the web over HTTP, HTTPS, and FTP. When you use wget to download a file from an HTTP URL, it sends an HTTP request to the destination web server.

    To view the default HTTP request header sent by wget, use the "-d" (debug) option.

    $ wget -d http://www.google.com/
    ---request begin---
    GET / HTTP/1.0
    User-Agent: Wget/1.12 (linux-gnu)
    Accept: */*
    Host: www.google.com
    Connection: Keep-Alive
    
    ---request end---
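
    The debug output contains much more than the request header. Here is a minimal sketch for printing only the request block, assuming the "---request begin---" and "---request end---" markers shown above. The name show_request is a hypothetical helper, not part of wget.

    ```shell
    # Print only the lines between the request markers (input is read from stdin).
    # show_request is a hypothetical helper name, not a wget feature.
    show_request() {
        sed -n '/---request begin---/,/---request end---/p'
    }

    # Usage (needs network access). wget writes its debug log to stderr,
    # so stderr is redirected into the pipe; -O /dev/null discards the page body.
    #   wget -d -O /dev/null http://www.google.com/ 2>&1 | show_request
    ```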
    

    Sometimes you may want to customize the default HTTP request header used by wget. For example, you may want to change the "User-Agent" field, because some sites rely on the "User-Agent" string to block robots such as wget from retrieving their content. You may want to add an "Accept-Encoding" field to test the encoding schemes supported by your web server. In other cases, you may need to set the "Host" field explicitly to reach a web server that uses name-based virtual hosting.

    wget allows you to send an HTTP request with custom HTTP headers. To supply a custom header, use the "--header" option. You can use "--header" as many times as you want in a single run.

    $ wget -d --header="User-Agent: Mozilla/5.0 (Windows NT 6.0) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.97 Safari/537.11" --header="Referer: http://xmodulo.com/" --header="Accept-Encoding: compress, gzip" http://www.google.com/
    ---request begin---
    GET / HTTP/1.0
    User-Agent: Mozilla/5.0 (Windows NT 6.0) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.97 Safari/537.11
    Accept: */*
    Host: www.google.com
    Connection: Keep-Alive
    Referer: http://xmodulo.com/
    Accept-Encoding: compress, gzip
    
    ---request end---
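
    As an illustration of the "Host" case, the following sketch requests a name-based virtual host by server address. The address 192.0.2.10, the name www.example.com, the helper fetch_vhost, and the WGET variable are all assumptions made for this example. WGET exists only so that the command can be previewed with echo instead of being run.

    ```shell
    # fetch_vhost is a hypothetical helper.
    #   $1 = server address, $2 = virtual host name to send in the Host field
    # WGET is an assumed override variable: set WGET=echo to preview the arguments.
    fetch_vhost() {
        ${WGET:-wget} -d -O /dev/null --header="Host: $2" "http://$1/"
    }

    # Preview without touching the network:
    #   WGET=echo fetch_vhost 192.0.2.10 www.example.com
    # Real request (needs a reachable server at that address):
    #   fetch_vhost 192.0.2.10 www.example.com
    ```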
    

    If you would like to permanently set the default HTTP request headers used by wget, you can put them in the ~/.wgetrc configuration file. You can specify as many header fields as you want in ~/.wgetrc, one "header = ..." line per field.

    $ vi ~/.wgetrc
    header = User-Agent: Mozilla/5.0 (Windows NT 6.0) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.97 Safari/537.11
    header = Referer: http://xmodulo.com/
    header = Accept-Encoding: compress, gzip
    

    Once you have configured ~/.wgetrc, you no longer need to pass the "--header" option to wget for those fields.
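
    The same settings can be tried out without touching ~/.wgetrc. The sketch below writes them to a scratch file and relies on the WGETRC environment variable, which the wget manual describes as an alternative location for the user configuration file. The variable name rc_file is an assumption of this example.

    ```shell
    # Write two header settings to a scratch file instead of ~/.wgetrc.
    rc_file=$(mktemp)
    printf '%s\n' \
        'header = Referer: http://xmodulo.com/' \
        'header = Accept-Encoding: compress, gzip' > "$rc_file"

    # Use the scratch file for a single run (needs network access):
    #   WGETRC="$rc_file" wget -d -O /dev/null http://www.google.com/
    # Per the wget manual, an empty header value clears all previously
    # defined user headers, so this run should ignore the file's headers:
    #   WGETRC="$rc_file" wget -d --header="" -O /dev/null http://www.google.com/
    ```

    Remove the scratch file with rm -f "$rc_file" when you are done.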

    curl is another command-line tool with functionality similar to wget. It also allows you to set custom HTTP headers, using its "-H" ("--header") option.
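
    Here is a minimal sketch of the curl equivalent, assuming curl's "-H" option (one per header) and "-v" for printing the request that was sent. The helper name fetch_with_curl and the CURL variable are hypothetical. CURL exists only so that the command can be previewed with echo.

    ```shell
    # fetch_with_curl is a hypothetical helper.
    #   $1 = URL; every further argument is one header line, e.g. "Referer: http://xmodulo.com/"
    # CURL is an assumed override variable: set CURL=echo to preview the arguments.
    fetch_with_curl() {
        url=$1
        shift
        # Turn each header argument into a "-H <header>" pair.
        for h in "$@"; do
            set -- "$@" -H "$h"
            shift
        done
        # -v prints the request header to stderr; -o /dev/null discards the body.
        ${CURL:-curl} -sS -v -o /dev/null "$@" "$url"
    }

    # Preview without touching the network:
    #   CURL=echo fetch_with_curl http://www.google.com/ "Referer: http://xmodulo.com/"
    ```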

  • Original article: https://www.cnblogs.com/mrcharles/p/11879832.html