zoukankan      html  css  js  c++  java
  • URL Parsing

    URL Parsing

    urllib.parse.urlparse(urlstring, scheme='', allow_fragments=True)

    Parse a URL into six components, returning a 6-tuple. This corresponds to the general structure of a URL: scheme://netloc/path;parameters?query#fragment. Each tuple item is a string, possibly empty. The components are not broken up in smaller parts (for example, the network location is a single string), and % escapes are not expanded. The delimiters as shown above are not part of the result, except for a leading slash in the path component, which is retained if present. For example:

    
    

    Following the syntax specifications in RFC 1808, urlparse recognizes a netloc only if it is properly introduced by ‘//’. Otherwise the input is presumed to be a relative URL and thus to start with a path component.

    
    

    The scheme argument gives the default addressing scheme.

    If the allow_fragments argument is false, fragment identifiers are not recognized. Instead, they are parsed as part of the path, parameters or query component, and fragment is set to the empty string in the return value.

    urllib.parse.parse_qs(qs, keep_blank_values=False, strict_parsing=False, encoding='utf-8', errors='replace')

    Parse a query string given as a string argument (data of type application/x-www-form-urlencoded). Data are returned as a dictionary. The dictionary keys are the unique query variable names and the values are lists of values for each name.

    urllib.parse.parse_qsl(qs, keep_blank_values=False, strict_parsing=False, encoding='utf-8', errors='replace')

    Parse a query string given as a string argument (data of type application/x-www-form-urlencoded). Data are returned as a list of name, value pairs.

    urllib.parse.urlunparse(parts)

    Construct a URL from a tuple as returned by urlparse(). The parts argument can be any six-item iterable.

    urllib.parse.urlsplit(urlstring, scheme='', allow_fragments=True)

    This is similar to urlparse(), but does not split the params from the URL.

    urllib.parse.urlunsplit(parts)

    Combine the elements of a tuple as returned by urlsplit() into a complete URL as a string. The parts argument can be any five-item iterable.

    urllib.parse.urldefrag(url)

    If url contains a fragment identifier, return a modified version of url with no fragment identifier, and the fragment identifier as a separate string. If there is no fragment identifier in url, return url unmodified and an empty string.

  • 相关阅读:
    如何防止源码被盗
    C# WebBrowser 获得选中部分的html源码
    特殊字符和空格
    MySQL性能优化
    mysql5.7新特性探究
    【九】MongoDB管理之安全性
    【八】MongoDB管理之分片集群实践
    【七】MongoDB管理之分片集群介绍
    【六】MongoDB管理之副本集
    【五】MongoDB管理之生产环境说明
  • 原文地址:https://www.cnblogs.com/tekkaman/p/5767943.html
Copyright © 2011-2022 走看看