  • Using wget on Linux


    wget is a Linux command-line utility for retrieving files from the web over HTTP, HTTPS, and FTP. When you use wget to download a file from an HTTP URL, it sends an HTTP request to the destination web server.

    To view the default HTTP request headers sent by wget, use the "-d" (debug) option.

    $ wget -d http://www.google.com/
    ---request begin---
    GET / HTTP/1.0
    User-Agent: Wget/1.12 (linux-gnu)
    Accept: */*
    Host: www.google.com
    Connection: Keep-Alive
    
    ---request end---
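
    The debug log contains much more than the request itself. The following is a minimal sketch of how the request block could be isolated, assuming wget writes its debug log to stderr and that the begin/end markers appear as shown above. The helper name show_request and the sample log fragment are illustrative assumptions, not part of wget.

```shell
#!/bin/sh
# Print only the lines between the request markers found in wget's debug log.
# show_request is a hypothetical helper name, not a wget feature.
show_request() {
    sed -n '/---request begin---/,/---request end---/p'
}

# Typical use (needs network access, so it is left commented out):
#   wget -d -O /dev/null http://www.google.com/ 2>&1 | show_request

# Offline demonstration on an illustrative log fragment.
sample_log='(other debug lines)
---request begin---
GET / HTTP/1.0
Host: www.google.com

---request end---
(other debug lines)'

request=$(printf '%s\n' "$sample_log" | show_request)
printf '%s\n' "$request"
```

    The "-O /dev/null" in the commented command discards the downloaded page, so only the log is of interest.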
    

    Sometimes you may want to customize the default HTTP request headers used by wget. For example, you may want to change the "User-Agent" field, because some sites rely on the "User-Agent" string to block robots such as wget from retrieving their content. You may want to add an "Accept-Encoding" field to test the encoding schemes supported by your web server. In other cases, you may need to set the "Host" field explicitly to reach a web server that uses name-based virtual hosting.

    wget allows you to send an HTTP request with custom HTTP headers. To supply a custom header, use the "--header" option. You can use the "--header" option as many times as you want in a single run.

    $ wget -d --header="User-Agent: Mozilla/5.0 (Windows NT 6.0) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.97 Safari/537.11" --header="Referer: http://xmodulo.com/" --header="Accept-Encoding: compress, gzip" http://www.google.com/
    ---request begin---
    GET / HTTP/1.0
    User-Agent: Mozilla/5.0 (Windows NT 6.0) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.97 Safari/537.11
    Accept: */*
    Host: www.google.com
    Connection: Keep-Alive
    Referer: http://xmodulo.com/
    Accept-Encoding: compress, gzip
    
    ---request end---
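
    When the list of headers grows, typing every "--header" option by hand becomes error-prone. As a rough sketch, the options could be assembled from a plain list, one header per line. It is written for a POSIX shell and reuses the header values from the example above. The variable name header_count is an illustrative assumption.

```shell
#!/bin/sh
# Collect one --header option per header line, then pass them all to wget.
set --   # start with an empty argument list
while IFS= read -r h; do
    set -- "$@" "--header=$h"
done <<'EOF'
User-Agent: Mozilla/5.0 (Windows NT 6.0) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.97 Safari/537.11
Referer: http://xmodulo.com/
Accept-Encoding: compress, gzip
EOF

header_count=$#
printf '%s\n' "$@"   # show the options that were built

# Real invocation (needs network access, so it is left commented out):
#   wget -d "$@" http://www.google.com/
```

    The same list could carry a "Host: ..." line for the name-based virtual hosting case mentioned earlier, for example when the URL points at a server's IP address.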
    

    If you would like to permanently set the default HTTP request headers used by wget, you can put them in the ~/.wgetrc configuration file. You can specify as many header fields as you want in ~/.wgetrc, one "header = ..." line per field.

    $ vi ~/.wgetrc
    header = User-Agent: Mozilla/5.0 (Windows NT 6.0) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.97 Safari/537.11
    header = Referer: http://xmodulo.com/
    header = Accept-Encoding: compress, gzip
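
    To try such settings without touching the real ~/.wgetrc, one option is to write them to a scratch file and point wget at it. The sketch below assumes that mktemp is available and relies on the WGETRC environment variable, which wget reads in place of ~/.wgetrc when it is set. The variable names rc_file and header_lines are illustrative.

```shell
#!/bin/sh
# Write the header settings to a scratch file instead of ~/.wgetrc.
rc_file=$(mktemp)

cat > "$rc_file" <<'EOF'
header = User-Agent: Mozilla/5.0 (Windows NT 6.0) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.97 Safari/537.11
header = Referer: http://xmodulo.com/
header = Accept-Encoding: compress, gzip
EOF

# Count the settings that were written, as a quick sanity check.
header_lines=$(grep -c '^header = ' "$rc_file")
echo "wrote $header_lines header settings to $rc_file"

# Real invocation (needs network access, so it is left commented out):
#   WGETRC="$rc_file" wget -d http://www.google.com/
```

    Remove the scratch file with rm once the experiment is finished.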
    

    Once ~/.wgetrc is configured, you no longer need to pass the "--header" option to wget.

    curl is another command-line tool with functionality similar to wget. It also allows you to set custom HTTP headers, using its "-H" ("--header") option.
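
    For comparison, here is a hedged sketch of the same request made with curl, whose "-H" option adds a request header and whose "-v" option prints the outgoing request. The function name fetch_with_curl and the CURL override variable are assumptions introduced here so the command can be previewed without network access. They are not part of curl.

```shell
#!/bin/sh
# Send the same three custom headers with curl instead of wget.
# fetch_with_curl and the CURL variable are hypothetical names;
# setting CURL=echo prints the command's arguments instead of running curl.
fetch_with_curl() {
    ${CURL:-curl} -v -o /dev/null \
        -H "User-Agent: Mozilla/5.0 (Windows NT 6.0) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.97 Safari/537.11" \
        -H "Referer: http://xmodulo.com/" \
        -H "Accept-Encoding: compress, gzip" \
        "$1"
}

# Dry run: show the arguments that would be passed to curl.
CURL=echo fetch_with_curl http://www.google.com/

# Real invocation (needs network access, so it is left commented out):
#   fetch_with_curl http://www.google.com/
```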

  • Original article: https://www.cnblogs.com/mrcharles/p/11879832.html