  • Reposted: Byte sizes of Chinese and English characters under ASCII, UTF-8, and Unicode encodings

     Original source: http://www.tracefact.net/CSharp-Programming/Network-Programming-Part2.aspx

    Byte sizes of Chinese and English characters under ASCII, UTF-8, and Unicode encodings

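    • ASCII cannot represent Chinese characters.

    • UTF-8 is a variable-length encoding. For ASCII characters it is the more compact choice: each character takes only 1 byte, with the same byte values and length as ASCII. Unicode (in .NET, Encoding.Unicode means UTF-16 little-endian) takes 2 bytes per ASCII character, and the second byte is zero.

    • UTF-8 needs 3 bytes for each of the Chinese characters tested here, while Unicode (UTF-16) needs only 2. Characters outside the Basic Multilingual Plane, such as rare CJK extension characters, take 4 bytes in both encodings.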
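The byte counts listed above can be checked without building the byte arrays, using `Encoding.GetByteCount`. The following is only a minimal sketch: the class name `EncodingSizes` and its helper methods are made up for illustration, and U+20000 is just one sample character from outside the Basic Multilingual Plane.

```csharp
using System;
using System.Text;

// Hypothetical helper class, not part of the original post.
public static class EncodingSizes
{
    // Bytes needed to encode the text as UTF-8.
    public static int Utf8Bytes(string text) => Encoding.UTF8.GetByteCount(text);

    // Bytes needed to encode the text as UTF-16 little-endian (Encoding.Unicode in .NET).
    public static int Utf16Bytes(string text) => Encoding.Unicode.GetByteCount(text);

    public static void Main()
    {
        // "\U00020000" is a CJK Extension B character outside the Basic Multilingual Plane;
        // in a .NET string it is stored as a surrogate pair (two chars).
        string[] samples = { "b", "乙", "\U00020000" };

        foreach (string s in samples)
        {
            // Print the code point rather than the glyph, since consoles may not render it.
            Console.WriteLine("U+{0:X4}: UTF8={1}, Unicode={2}",
                char.ConvertToUtf32(s, 0), Utf8Bytes(s), Utf16Bytes(s));
        }
    }
}
```

For "b" this reports 1 and 2 bytes, for "乙" 3 and 2 bytes, and for U+20000 4 bytes in both encodings.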

    Code example:

    // Requires: using System; using System.Text;
    private static void ShowCode() {
        string[] strArray = { "b", "abcd", "乙", "甲乙丙丁" };
        byte[] buffer;
        string mode, back;

        foreach (string str in strArray) {

            // Encode and decode each string with ASCII, UTF-8, and Unicode in turn
            for (int i = 0; i <= 2; i++) {
                if (i == 0) {
                    // ASCII replaces any character it cannot represent with '?' (63)
                    buffer = Encoding.ASCII.GetBytes(str);
                    back = Encoding.ASCII.GetString(buffer, 0, buffer.Length);
                    mode = "ASCII";
                } else if (i == 1) {
                    buffer = Encoding.UTF8.GetBytes(str);
                    back = Encoding.UTF8.GetString(buffer, 0, buffer.Length);
                    mode = "UTF8";
                } else {
                    // Encoding.Unicode is UTF-16 little-endian in .NET
                    buffer = Encoding.Unicode.GetBytes(str);
                    back = Encoding.Unicode.GetString(buffer, 0, buffer.Length);
                    mode = "Unicode";
                }

                Console.WriteLine("Mode: {0}, String: {1}, Buffer.Length: {2}",
                    mode, str, buffer.Length);

                // Print every byte as a decimal value on a single line
                Console.Write("Buffer: ");
                for (int j = 0; j <= buffer.Length - 1; j++) {
                    Console.Write(buffer[j] + " ");
                }

                // Print the round-tripped string, followed by a blank line
                Console.WriteLine("\nRetrieved: {0}\n", back);
            }
        }
    }

    Output:

    Mode: ASCII, String: b, Buffer.Length: 1
    Buffer: 98
    Retrieved: b

    Mode: UTF8, String: b, Buffer.Length: 1
    Buffer: 98
    Retrieved: b

    Mode: Unicode, String: b, Buffer.Length: 2
    Buffer: 98 0
    Retrieved: b

    Mode: ASCII, String: abcd, Buffer.Length: 4
    Buffer: 97 98 99 100
    Retrieved: abcd

    Mode: UTF8, String: abcd, Buffer.Length: 4
    Buffer: 97 98 99 100
    Retrieved: abcd

    Mode: Unicode, String: abcd, Buffer.Length: 8
    Buffer: 97 0 98 0 99 0 100 0
    Retrieved: abcd

    Mode: ASCII, String: 乙, Buffer.Length: 1
    Buffer: 63
    Retrieved: ?

    Mode: UTF8, String: 乙, Buffer.Length: 3
    Buffer: 228 185 153
    Retrieved: 乙

    Mode: Unicode, String: 乙, Buffer.Length: 2
    Buffer: 89 78
    Retrieved: 乙

    Mode: ASCII, String: 甲乙丙丁, Buffer.Length: 4
    Buffer: 63 63 63 63
    Retrieved: ????

    Mode: UTF8, String: 甲乙丙丁, Buffer.Length: 12
    Buffer: 231 148 178 228 185 153 228 184 153 228 184 129
    Retrieved: 甲乙丙丁

    Mode: Unicode, String: 甲乙丙丁, Buffer.Length: 8
    Buffer: 50 117 89 78 25 78 1 78
    Retrieved: 甲乙丙丁
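In the output, ASCII quietly turns every Chinese character into byte 63 ('?'), so the data loss is easy to miss. One possible way to surface it, sketched here as an assumption rather than anything the original post does, is to request an ASCII encoding with an exception fallback. The class name `StrictAscii` and the method `TryEncode` are invented for this example.

```csharp
using System;
using System.Text;

// Hypothetical helper class, not part of the original post.
public static class StrictAscii
{
    // Returns true and the encoded bytes when every character fits in ASCII.
    // Returns false (with an empty array) instead of silently substituting '?'.
    public static bool TryEncode(string text, out byte[] bytes)
    {
        // An ASCII encoding that throws on unrepresentable characters
        // instead of using the default '?' replacement.
        Encoding strict = Encoding.GetEncoding(
            "us-ascii",
            EncoderFallback.ExceptionFallback,
            DecoderFallback.ExceptionFallback);
        try
        {
            bytes = strict.GetBytes(text);
            return true;
        }
        catch (EncoderFallbackException)
        {
            bytes = new byte[0];
            return false;
        }
    }

    public static void Main()
    {
        foreach (string s in new[] { "abcd", "甲乙丙丁" })
        {
            byte[] bytes;
            bool ok = TryEncode(s, out bytes);
            Console.WriteLine("{0}: {1}", s,
                ok ? "fits in ASCII, " + bytes.Length + " bytes" : "not representable in ASCII");
        }
    }
}
```

With the same test strings as above, "abcd" encodes to 4 bytes, while "甲乙丙丁" is reported as not representable rather than coming back as "????".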

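    Conclusions:

    1 ASCII cannot represent Chinese characters (which everyone seems to know already).
    2 UTF-8 is a variable-length encoding. For ASCII characters it is the more compact choice: each character takes only 1 byte, with the same byte values and length as ASCII. Unicode (UTF-16 little-endian in .NET) takes 2 bytes per ASCII character, and the second byte is zero.
    3 UTF-8 needs 3 bytes for each of the Chinese characters tested here, while Unicode (UTF-16) needs only 2. Characters outside the Basic Multilingual Plane take 4 bytes in both encodings.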
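The "second byte is zero" observation depends on byte order: `Encoding.Unicode` writes the low byte first (little-endian), while `Encoding.BigEndianUnicode` writes the high byte first. Below is a small sketch of the difference; the class name `Utf16ByteOrder` and the helper `Dump` are illustrative names only.

```csharp
using System;
using System.Text;

// Hypothetical helper class, not part of the original post.
public static class Utf16ByteOrder
{
    // Encodes the text and returns its bytes as space-separated decimal values.
    public static string Dump(Encoding encoding, string text)
    {
        return string.Join(" ", encoding.GetBytes(text));
    }

    public static void Main()
    {
        foreach (string s in new[] { "b", "乙" })
        {
            Console.WriteLine("{0}: little-endian = {1}, big-endian = {2}",
                s,
                Dump(Encoding.Unicode, s),           // low byte first
                Dump(Encoding.BigEndianUnicode, s)); // high byte first
        }
    }
}
```

For "b" the little-endian bytes are 98 0 and the big-endian bytes are 0 98. For "乙" (U+4E59) they are 89 78 and 78 89 respectively.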
  • Original post: https://www.cnblogs.com/JiYF/p/6604776.html