  • URL Parsing

    urllib.parse.urlparse(urlstring, scheme='', allow_fragments=True)

    Parse a URL into six components, returning a 6-tuple. This corresponds to the general structure of a URL: scheme://netloc/path;parameters?query#fragment. Each tuple item is a string, possibly empty. The components are not broken up into smaller parts (for example, the network location is a single string), and % escapes are not expanded. The delimiters shown above are not part of the result, except for a leading slash in the path component, which is retained if present.
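
    As a minimal sketch of the six components described above, the snippet below parses one URL and reads each field by name. The URL is a made-up example (www.example.com is a placeholder host), chosen only so that every component is non-empty.

```python
from urllib.parse import urlparse

# Hypothetical URL, chosen so that all six components are present.
url = "http://www.example.com:80/%7Euser/index.html;type=a?lang=en#top"
parts = urlparse(url)

print(parts.scheme)    # http
print(parts.netloc)    # www.example.com:80  (host and port stay in one string)
print(parts.path)      # /%7Euser/index.html (leading slash kept, %7E not expanded)
print(parts.params)    # type=a
print(parts.query)     # lang=en
print(parts.fragment)  # top
```

    The result also behaves like a plain 6-tuple, so it can be unpacked or indexed positionally.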

    
    

    Following the syntax specifications in RFC 1808, urlparse recognizes a netloc only if it is properly introduced by ‘//’. Otherwise the input is presumed to be a relative URL and thus to start with a path component.
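
    The effect of the '//' rule can be seen by parsing the same text with and without the leading slashes. This is a small sketch using a placeholder host name, not an excerpt from the original article.

```python
from urllib.parse import urlparse

# With '//' the host is recognized as the network location.
with_slashes = urlparse("//www.example.com/%7Euser/index.html")
print(with_slashes.netloc)  # www.example.com
print(with_slashes.path)    # /%7Euser/index.html

# Without '//' the whole string is treated as a relative path.
without_slashes = urlparse("www.example.com/%7Euser/index.html")
print(without_slashes.netloc)  # (empty string)
print(without_slashes.path)    # www.example.com/%7Euser/index.html
```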

    
    

    The scheme argument gives the default addressing scheme, which is used only if the URL does not specify one.

    If the allow_fragments argument is false, fragment identifiers are not recognized. Instead, they are parsed as part of the path, parameters or query component, and fragment is set to the empty string in the return value.
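
    The two optional arguments can be illustrated with a short sketch. The URLs are placeholders; the variable names are chosen here for illustration only.

```python
from urllib.parse import urlparse

# The default scheme is applied only when the URL has none of its own.
defaulted = urlparse("//www.example.com/index.html", scheme="https")
explicit = urlparse("http://www.example.com/index.html", scheme="https")
print(defaulted.scheme)  # https
print(explicit.scheme)   # http

# With allow_fragments=False the '#top' text stays in the preceding component.
no_frag = urlparse("http://www.example.com/index.html?lang=en#top",
                   allow_fragments=False)
print(no_frag.query)     # lang=en#top
print(no_frag.fragment)  # (empty string)
```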

    urllib.parse.parse_qs(qs, keep_blank_values=False, strict_parsing=False, encoding='utf-8', errors='replace')

    Parse a query string given as a string argument (data of type application/x-www-form-urlencoded). Data are returned as a dictionary. The dictionary keys are the unique query variable names and the values are lists of values for each name.
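
    A brief sketch of the dictionary form, using an invented query string. It also shows keep_blank_values, which controls whether a name with an empty value is kept.

```python
from urllib.parse import parse_qs

# Hypothetical query string with a repeated name and a blank value.
qs = "tag=a&tag=b&name=Ann%20Lee&empty="

default_result = parse_qs(qs)
print(default_result)  # {'tag': ['a', 'b'], 'name': ['Ann Lee']}

with_blanks = parse_qs(qs, keep_blank_values=True)
print(with_blanks)     # {'tag': ['a', 'b'], 'name': ['Ann Lee'], 'empty': ['']}
```

    Every value is a list, even for names that appear only once.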

    urllib.parse.parse_qsl(qs, keep_blank_values=False, strict_parsing=False, encoding='utf-8', errors='replace')

    Parse a query string given as a string argument (data of type application/x-www-form-urlencoded). Data are returned as a list of name, value pairs.
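
    For comparison, here is a sketch of the list form on an invented query string. Repeated names stay as separate pairs, in the order they appear.

```python
from urllib.parse import parse_qsl

# Hypothetical query string; '+' decodes to a space in form-encoded data.
pairs = parse_qsl("name=Ann+Lee&tag=a&tag=b")
print(pairs)  # [('name', 'Ann Lee'), ('tag', 'a'), ('tag', 'b')]
```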

    urllib.parse.urlunparse(parts)

    Construct a URL from a tuple as returned by urlparse(). The parts argument can be any six-item iterable.
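
    As a sketch of the reverse direction, the snippet below builds a URL from a six-item list and then round-trips a parsed URL. The component values are placeholders.

```python
from urllib.parse import urlparse, urlunparse

# Any six-item iterable works; a list is used here.
# Order: scheme, netloc, path, params, query, fragment.
parts = ["http", "www.example.com", "/index.html", "type=a", "lang=en", "top"]
rebuilt = urlunparse(parts)
print(rebuilt)  # http://www.example.com/index.html;type=a?lang=en#top

# Parsing and unparsing this URL gives back the same string.
original = "http://www.example.com/index.html;type=a?lang=en#top"
round_trip = urlunparse(urlparse(original))
print(round_trip == original)  # True
```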

    urllib.parse.urlsplit(urlstring, scheme='', allow_fragments=True)

    This is similar to urlparse(), but does not split the params from the URL.
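
    The difference from urlparse() shows up only when the path carries ';parameters'. A minimal sketch on a made-up URL:

```python
from urllib.parse import urlparse, urlsplit

url = "http://www.example.com/index.html;type=a?lang=en#top"

split_result = urlsplit(url)
print(len(split_result))   # 5
print(split_result.path)   # /index.html;type=a  (params left inside the path)

parse_result = urlparse(url)
print(len(parse_result))   # 6
print(parse_result.path)   # /index.html
print(parse_result.params) # type=a
```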

    urllib.parse.urlunsplit(parts)

    Combine the elements of a tuple as returned by urlsplit() into a complete URL as a string. The parts argument can be any five-item iterable.
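
    A short sketch of urlunsplit() with placeholder values, followed by a round trip through urlsplit():

```python
from urllib.parse import urlsplit, urlunsplit

# Order: scheme, netloc, path, query, fragment.
rebuilt = urlunsplit(("https", "www.example.com", "/a/b", "x=1", "top"))
print(rebuilt)  # https://www.example.com/a/b?x=1#top

# Splitting and recombining this URL gives back the same string.
round_trip = urlunsplit(urlsplit(rebuilt))
print(round_trip == rebuilt)  # True
```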

    urllib.parse.urldefrag(url)

    If url contains a fragment identifier, return a modified version of url with no fragment identifier, and the fragment identifier as a separate string. If there is no fragment identifier in url, return url unmodified and an empty string.
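
    Both cases described above can be sketched as follows, using a placeholder URL. The result can be unpacked as a (url, fragment) pair.

```python
from urllib.parse import urldefrag

# URL with a fragment: the fragment is split off.
base, fragment = urldefrag("http://www.example.com/index.html#top")
print(base)      # http://www.example.com/index.html
print(fragment)  # top

# URL without a fragment: returned unchanged, with an empty fragment.
plain_base, plain_fragment = urldefrag("http://www.example.com/index.html")
print(plain_base)      # http://www.example.com/index.html
print(plain_fragment)  # (empty string)
```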

  • Original article: https://www.cnblogs.com/tekkaman/p/5767943.html