  • php urlencode vs java URLEncoder.encode

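    Conclusion: PHP's urlencode percent-encodes the "*" character, which Java's URLEncoder.encode leaves unencoded; for the same character encoding, the two otherwise produce identical output.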

    php urlencode 

      PHP with phpversion() >= 5.3 is compliant with RFC 3986, while phpversion() <= 5.2.7RC1 is not.

      The encoding is modeled on RFC 3986.

      

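    Returns a string in which all non-alphanumeric characters except -_. have been replaced with a percent sign (%) followed by two hex digits, and spaces are encoded as plus signs (+).
    This is the same encoding used for posted data from a WWW form, that is, the same as the application/x-www-form-urlencoded media type.
    For historical reasons, it differs from the RFC 3986 encoding (see rawurlencode()) in that spaces are encoded as plus signs (+).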
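
To make the difference between form encoding and the RFC 3986 style concrete, here is a minimal Java sketch that approximates what rawurlencode() is described as doing. The class name `Rfc3986Sketch` and its `encode` method are hypothetical helpers, not part of the JDK or PHP, and the sketch assumes UTF-8 input and Java 10 or later (for the `Charset` overload of `URLEncoder.encode`).

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not a JDK or PHP API.
final class Rfc3986Sketch {
    // Form-encode first, then undo the three places where
    // application/x-www-form-urlencoded output differs from RFC 3986.
    static String encode(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8) // Java 10+ overload
                .replace("+", "%20")   // a literal "+" in the output always stands for a space
                .replace("*", "%2A")   // "*" is not unreserved in RFC 3986
                .replace("%7E", "~");  // "~" is unreserved in RFC 3986
    }

    public static void main(String[] args) {
        System.out.println(encode("a b*c~d+e")); // prints a%20b%2Ac~d%2Be
    }
}
```

The replacements are safe because every "%" in the output of `URLEncoder.encode` starts a percent-encoded byte, and an input "+" has already become "%2B" before the first `replace` runs.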

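    PHP's urlencode does not follow RFC 3986 completely: the tilde (~) is unreserved in the standard and does not need to be encoded, but urlencode encodes it anyway.

    So the final list of characters left unencoded is [-], [_], [.], exactly as its documentation describes.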

    java URLEncoder.encode

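      The encoding is modeled on RFC 2396.

      However, because Internet Explorer escaped every mark character except "-", "_", ".", "*", Java adopted the same list as IE,

      so the final list of characters left unencoded is [-], [_], [.], [*]. The comment in the URLEncoder source, quoted below, explains the choice.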
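
Since the two unencoded lists differ only in "*", PHP-style output can be approximated from Java with a single extra replacement. This is a sketch under assumptions: `PhpUrlencodeSketch` and `phpUrlencode` are hypothetical names, the PHP side is assumed to be handling UTF-8 bytes, and the `Charset` overload requires Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not a JDK or PHP API.
final class PhpUrlencodeSketch {
    // URLEncoder leaves "*" alone while PHP's urlencode escapes it,
    // so escaping "*" afterwards closes the only gap described above.
    static String phpUrlencode(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8).replace("*", "%2A");
    }

    public static void main(String[] args) {
        String input = "a*b c~d";
        System.out.println(URLEncoder.encode(input, StandardCharsets.UTF_8)); // prints a*b+c%7Ed
        System.out.println(phpUrlencode(input));                              // prints a%2Ab+c%7Ed
    }
}
```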

      

    The list of characters that are not encoded has been
    determined as follows:
    
    RFC 2396 states:
    -----
    Data characters that are allowed in a URI but do not have a
    reserved purpose are called unreserved.  These include upper
    and lower case letters, decimal digits, and a limited set of
    punctuation marks and symbols.
    
    unreserved  = alphanum | mark
    
    mark        = "-" | "_" | "." | "!" | "~" | "*" | "'" | "(" | ")"
    
    Unreserved characters can be escaped without changing the
    semantics of the URI, but this should not be done unless the
    URI is being used in a context that does not allow the
    unescaped character to appear.
    -----
    
    It appears that both Netscape and Internet Explorer escape
    all special characters from this list with the exception
    of "-", "_", ".", "*". While it is not clear why they are
    escaping the other characters, perhaps it is safest to
    assume that there might be contexts in which the others
    are unsafe if not escaped. Therefore, we will use the same
    list. It is also noteworthy that this is consistent with
    O'Reilly's "HTML: The Definitive Guide" (page 164).
    
    As a last note, Internet Explorer does not encode the "@"
    character which is clearly not unreserved according to the
    RFC. We are being consistent with the RFC in this matter,
    as is Netscape.
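
The list in the comment above can be checked empirically. The sketch below encodes each RFC 2396 mark character on its own and collects the ones that come back unchanged; `MarkCheck` and `unescapedMarks` are hypothetical names, and Java 10 or later is assumed for the `Charset` overload.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not a JDK API.
final class MarkCheck {
    // The "mark" production from RFC 2396, in the order the RFC lists it.
    static final String RFC2396_MARKS = "-_.!~*'()";

    // Returns the marks that URLEncoder leaves unescaped.
    static String unescapedMarks() {
        StringBuilder kept = new StringBuilder();
        for (char c : RFC2396_MARKS.toCharArray()) {
            String one = String.valueOf(c);
            if (URLEncoder.encode(one, StandardCharsets.UTF_8).equals(one)) {
                kept.append(c);
            }
        }
        return kept.toString();
    }

    public static void main(String[] args) {
        System.out.println(unescapedMarks());                                // prints -_.*
        System.out.println(URLEncoder.encode("@", StandardCharsets.UTF_8)); // prints %40
    }
}
```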

    History of related RFCs:

    RFC 1738 section 2.2
    only alphanumerics, the special characters "$-_.+!*'(),", and
    reserved characters used for their reserved purposes may be used
    unencoded within a URL.

    RFC 2396 section 2.3
    unreserved = alphanum | mark
    mark = "-" | "_" | "." | "!" | "~" | "*" | "'" | "(" | ")"

    RFC 2732 section 3
    (3) Add "[" and "]" to the set of 'reserved' characters:

    RFC 3986 section 2.3
    unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"

    RFC 3987 section 2.2
    unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
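
Comparing the two grammars quoted above shows which punctuation lost its unreserved status between RFC 2396 and RFC 3986. The sketch below computes that difference from the quoted productions; `UnreservedHistory` and `droppedMarks` are hypothetical names introduced only for this illustration.

```java
// Hypothetical helper for illustration; the two constants are copied
// from the RFC 2396 and RFC 3986 productions quoted above.
final class UnreservedHistory {
    static final String RFC2396_MARKS = "-_.!~*'()";
    static final String RFC3986_MARKS = "-._~";

    // Marks that were unreserved in RFC 2396 but are no longer in RFC 3986.
    static String droppedMarks() {
        StringBuilder dropped = new StringBuilder();
        for (char c : RFC2396_MARKS.toCharArray()) {
            if (RFC3986_MARKS.indexOf(c) < 0) {
                dropped.append(c);
            }
        }
        return dropped.toString();
    }

    public static void main(String[] args) {
        System.out.println(droppedMarks()); // prints !*'()
    }
}
```

In RFC 3986 those five characters belong to the sub-delims group of reserved characters, which is why "*" is the one place where Java's RFC 2396-era list and an RFC 3986-style list disagree.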

  • Original article: https://www.cnblogs.com/siqi/p/10070926.html