  • hexcode of é î Latin-1 Supplement

    hexcode of é

    https://www.codetable.net/hex/e9

    Symbol Name: Latin Small Letter E With Acute
    Html Entity: &eacute;
    Hex Code: &#xE9;
    Decimal Code: &#233;
    Unicode Group: Latin-1 Supplement

    http://www.unicode.org/charts/index.html   When searching on this page, enter 00E9 with the letters in uppercase. The result shows that the character belongs to Latin-1 Supplement.

    Latin-1 Supplement  https://www.unicode.org/charts/PDF/U0080.pdf
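The values listed above can be checked directly in C#. This is a minimal sketch; the class name `CodePointInfo` and its helper methods are hypothetical names chosen for illustration. The character is written as the escape `\u00E9` so the result does not depend on how the source file itself is saved.

```csharp
using System;
using System.Text;

public static class CodePointInfo
{
    // Formats the code point of a single BMP character as "U+XXXX".
    public static string ToUPlus(char c) => "U+" + ((int)c).ToString("X4");

    // Hex dump of the bytes the given encoding produces for the text, e.g. "C3 A9".
    public static string HexBytes(string s, Encoding enc) =>
        BitConverter.ToString(enc.GetBytes(s)).Replace("-", " ");
}
```

`CodePointInfo.ToUPlus('\u00E9')` gives `U+00E9` (decimal 233). `CodePointInfo.HexBytes("\u00E9", Encoding.UTF8)` gives `C3 A9`, while the same call with `Encoding.GetEncoding("ISO-8859-1")` gives the single byte `E9`. That one-byte versus two-byte difference is what the rest of this note is about.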

     

    Character encoding for French Accents

    If intérêt shows up as intÃ©rÃªt, you likely (i.e. short of corruption due to double encoding) have UTF-8 encoded text being displayed as if it were ISO-8859-1.

    Make sure the headers are correctly formed and present the content as being UTF-8 encoded.
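The effect described above is easy to reproduce. The sketch below encodes text as UTF-8 and then deliberately decodes the bytes as ISO-8859-1, which is what a client does when the response lacks a header such as `Content-Type: text/html; charset=utf-8`. The class and method names are hypothetical.

```csharp
using System;
using System.Text;

public static class MojibakeDemo
{
    // Encodes text as UTF-8, then (wrongly) decodes those bytes as ISO-8859-1.
    public static string MisreadUtf8AsLatin1(string text)
    {
        Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);
        // Every byte becomes one character, so each two-byte sequence turns into two characters.
        return latin1.GetString(utf8Bytes);
    }
}
```

Calling `MojibakeDemo.MisreadUtf8AsLatin1("int\u00E9r\u00EAt")` returns `intÃ©rÃªt`: é is the bytes C3 A9, read as Ã and ©, and ê is C3 AA, read as Ã and ª.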

    Double encoded UTF-8 strings in C#

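    Given a garbled string, it can be repaired with the following steps:

    1. Encode the string with UTF-8 to get a UTF-8 byte array.

    2. Convert the UTF-8 byte array to an ISO-8859-1 byte array.

    3. Decode the ISO-8859-1 byte array with UTF-8 to get the corresponding string.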

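     [Test]
            public void Test20210409003()
            {
                string correctFormat = "125,chaînes"; // the correctly decoded text
    
                // Garbled input: the UTF-8 bytes of "î" (C3 AE) were decoded as ISO-8859-1
                var garbled = "125,chaÃ®nes";
                Encoding iso = Encoding.GetEncoding("ISO-8859-1");
    
                Encoding utf8 = Encoding.UTF8;
    
                // Step 1: encode the garbled string as UTF-8
                byte[] utfBytes = utf8.GetBytes(garbled);
    
                // Step 2: convert to ISO-8859-1, which restores the original UTF-8 bytes
                byte[] isoBytes = Encoding.Convert(utf8, iso, utfBytes);
    
                // Step 3: decode those bytes as UTF-8
                var result = utf8.GetString(isoBytes);
                Console.WriteLine(result);
                Assert.That(result, Is.EqualTo(correctFormat));
            }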
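Applying that repair blindly would corrupt strings that were never garbled, so a guarded version can be useful. The following is a sketch under stated assumptions, not a general-purpose detector: `MojibakeRepair.TryRepair` is a hypothetical helper that only undoes a single round of "UTF-8 bytes decoded as ISO-8859-1", and it relies on a strict `UTF8Encoding` that throws on invalid byte sequences.

```csharp
using System;
using System.Text;

public static class MojibakeRepair
{
    // Tries to undo one round of "UTF-8 bytes decoded as ISO-8859-1".
    // Returns false, leaving the text untouched, when the input cannot be such a string.
    public static bool TryRepair(string suspect, out string repaired)
    {
        repaired = suspect;

        // Characters above U+00FF cannot come from an ISO-8859-1 decode.
        foreach (char c in suspect)
        {
            if (c > '\u00FF') return false;
        }

        // Map each character back to the byte it was decoded from.
        byte[] raw = Encoding.GetEncoding("ISO-8859-1").GetBytes(suspect);

        // A strict decoder throws on invalid UTF-8 instead of emitting U+FFFD.
        var strictUtf8 = new UTF8Encoding(false, true);
        try
        {
            repaired = strictUtf8.GetString(raw);
            return true;
        }
        catch (DecoderFallbackException)
        {
            return false;
        }
    }
}
```

With the garbled input `"125,cha\u00C3\u00AEnes"` the helper returns `true` and yields `125,chaînes`. With the already correct `"125,cha\u00EEnes"` it returns `false`, because the lone byte EE is not valid UTF-8. Pure ASCII input is reported as repairable but comes back unchanged.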

    C# Convert string from UTF-8 to ISO-8859-1 (Latin1) H

    Use Encoding.Convert to adjust the byte array before attempting to decode it into your destination encoding.

    Encoding iso = Encoding.GetEncoding("ISO-8859-1");
    Encoding utf8 = Encoding.UTF8;
    byte[] utfBytes = utf8.GetBytes(Message);
    byte[] isoBytes = Encoding.Convert(utf8, iso, utfBytes);
    string msg = iso.GetString(isoBytes);
    

    Latin-1 Supplement (Unicode block)

    The Latin-1 Supplement (also called C1 Controls and Latin-1 Supplement) is the second Unicode block in the Unicode standard. It encodes the upper range of ISO 8859-1: 80 (U+0080) - FF (U+00FF). Controls C1 (0080–009F) are not graphic. This block ranges from U+0080 to U+00FF, contains 128 characters and includes the C1 controls, Latin-1 punctuation and symbols, 30 pairs of majuscule and minuscule accented Latin characters and 2 mathematical operators.
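The block boundaries quoted above translate into a simple range check. This is an illustrative sketch; `Latin1Supplement` and its members are hypothetical names, and the ranges are the ones given in the paragraph (U+0080 to U+00FF, with the C1 controls at U+0080 to U+009F).

```csharp
using System;

public static class Latin1Supplement
{
    public const int First = 0x0080; // start of the block (also start of the C1 controls)
    public const int Last = 0x00FF;  // end of the block

    // Number of code points in the block.
    public static int Size => Last - First + 1;

    // True when the character lies inside the Latin-1 Supplement block.
    public static bool Contains(char c) => c >= First && c <= Last;

    // True for the non-graphic C1 control range U+0080..U+009F.
    public static bool IsC1Control(char c) => c >= 0x0080 && c <= 0x009F;
}
```

`Latin1Supplement.Size` is 128, `Latin1Supplement.Contains('\u00E9')` is `true` for é, and a plain ASCII `'e'` is outside the block.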

    The C1 controls and Latin-1 Supplement block has been included in its present form, with the same character repertoire since version 1.0 of the Unicode Standard.[3] Its block name in Unicode 1.0 was simply Latin1.

    Character Encoding Issue UTF-8 and ISO-8859-1

    The answer would be you have wrong data in the database. What probably happened is that you did a conversion ISO-8859-1 -> UTF-8 on data that's already in UTF-8. Therefore, doing a conversion UTF-8 -> ISO-8859-1 gives you the original UTF-8 data back.

    Make sure you're not calling utf8_encode (which does an ISO-8859-1 -> UTF-8 conversion) on UTF-8 data!  This is a double-encoding problem: a string that was already encoded as UTF-8 was put through another ISO-8859-1 to UTF-8 conversion.

    Since every UTF-8 string is also a valid ISO-8859-1 string (well, not quite, but it's commonly extended so that that's the case), you have no errors on the ISO-8859-1 -> UTF-8 conversion over UTF-8 data.
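At the byte level the double encoding looks like this. The sketch uses `Encoding.Convert` as a rough C# stand-in for PHP's `utf8_encode`; the class `DoubleEncoding` and its method names are hypothetical.

```csharp
using System;
using System.Text;

public static class DoubleEncoding
{
    // Treats every input byte as an ISO-8859-1 character and re-encodes it as UTF-8,
    // which is roughly what utf8_encode does.
    public static byte[] Latin1ToUtf8(byte[] bytes) =>
        Encoding.Convert(Encoding.GetEncoding("ISO-8859-1"), Encoding.UTF8, bytes);

    // The reverse conversion, which undoes one round of double encoding.
    public static byte[] Utf8ToLatin1(byte[] bytes) =>
        Encoding.Convert(Encoding.UTF8, Encoding.GetEncoding("ISO-8859-1"), bytes);

    // Hex dump such as "C3 A9".
    public static string Hex(byte[] bytes) => BitConverter.ToString(bytes).Replace("-", " ");
}
```

Starting from `Encoding.UTF8.GetBytes("\u00E9")`, which is `C3 A9`, `Latin1ToUtf8` produces `C3 83 C2 A9`: each of the two bytes was treated as a separate character and encoded again. `Utf8ToLatin1` on that result gives back `C3 A9`, the original UTF-8 data, as the answer describes.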

     î read with the wrong encoding

    850
    ibm850
    OEM Multilingual Latin 1; Western European (DOS)

    1252
    windows-1252
    ANSI Latin 1; Western European (Windows)

    28591
    iso-8859-1
    ISO 8859-1 Latin 1; Western European (ISO)

      [Test]
            public void Test20210414001()
            {
                Console.WriteLine(Encoding.Default.EncodingName);
                Console.WriteLine(Encoding.Default.CodePage);
    
                string str = "î";
                var array = Encoding.UTF8.GetBytes(str);
                var encoding2 = Encoding.GetEncoding(850);
                var str2 = encoding2.GetString(array);
                Console.WriteLine(str2);
    
                var encoding3 = Encoding.GetEncoding(1252);
                var str3 = encoding3.GetString(array);
                Console.WriteLine(str3);
    
                var encoding4 = Encoding.GetEncoding(28591);
                var str4 = encoding4.GetString(array);
                Console.WriteLine(str4);
            }

    Code page 850 decodes it as ├«
    Code page 1252 decodes it as Ã®

    Code page 28591 decodes it as Ã®
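On .NET Core and .NET 5 or later, code pages 850 and 1252 are not available until the code pages provider is registered, so the comparison above may throw there as written. The sketch below wraps the lookup in a hypothetical helper, `CodePageDecode.DecodeUtf8BytesAs`, and assumes `CodePagesEncodingProvider` is available (it ships with recent .NET versions; on .NET Framework it comes from the System.Text.Encoding.CodePages package).

```csharp
using System;
using System.Text;

public static class CodePageDecode
{
    // Encodes text as UTF-8, then decodes the bytes with the given Windows/OEM code page.
    public static string DecodeUtf8BytesAs(string text, int codePage)
    {
        // Makes code pages such as 850 and 1252 available; calling this repeatedly is harmless.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);
        return Encoding.GetEncoding(codePage).GetString(utf8Bytes);
    }
}
```

For î (UTF-8 bytes C3 AE), `DecodeUtf8BytesAs("\u00EE", 850)` gives `├«`, while code pages 1252 and 28591 both give `Ã®`, since they agree on the bytes C3 and AE.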


  • Original article: https://www.cnblogs.com/chucklu/p/14637784.html