Reading and Writing Unicode Strings (UTF-8, UTF-16, …)

Writing a UTF-16 string:

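    using System.IO;
    using System.Text;

    class TestDataGenerator
    {
        // Writes record_length copies of '的' to FileName as raw UTF-16 (little-endian) bytes, with no BOM.
        public static void CreateNewTestDataFile(string FileName, int record_length)
        {
            using (FileStream fs = File.Create(FileName))
            {
                StringBuilder sb = new StringBuilder();
                for (int i = 0; i < record_length; i++)
                {
                    sb.Append('的');
                }
                // Encoding.Unicode is UTF-16LE: each '的' (U+7684) becomes the two bytes 84 76.
                byte[] content = Encoding.Unicode.GetBytes(sb.ToString());
                fs.Write(content, 0, content.Length);
            }
        }
    }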

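The call to Encoding.Unicode.GetBytes is what encodes the string as a UTF-16 byte stream. In .NET, Encoding.Unicode means little-endian UTF-16.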

Call the function to generate a test file containing 100 characters:
```
TestDataGenerator.CreateNewTestDataFile("test.txt", 100);
```
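The resulting file is 200 bytes. UTF-16 stores every character in the Basic Multilingual Plane, ASCII letters and Chinese characters alike, as a single 2-byte code unit. Only characters outside the BMP take 4 bytes (a surrogate pair). GetBytes does not emit a byte order mark, so the file holds exactly 100 × 2 bytes.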
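
To make the per-character sizes concrete, the following minimal sketch prints how many bytes three sample strings occupy in each encoding. The class name ByteCountDemo and the choice of samples are illustrative and not part of the original post.

```csharp
using System;
using System.Text;

class ByteCountDemo
{
    static void Main()
    {
        // Samples: an ASCII letter, the Chinese character '的' (U+7684),
        // and an emoji outside the Basic Multilingual Plane (U+1F600).
        string[] samples = { "a", "\u7684", "\U0001F600" };

        foreach (string s in samples)
        {
            Console.WriteLine("{0} UTF-16 bytes, {1} UTF-8 bytes",
                Encoding.Unicode.GetByteCount(s),
                Encoding.UTF8.GetByteCount(s));
        }
        // Prints:
        // 2 UTF-16 bytes, 1 UTF-8 bytes
        // 2 UTF-16 bytes, 3 UTF-8 bytes
        // 4 UTF-16 bytes, 4 UTF-8 bytes
    }
}
```

Note that the ASCII letter also takes 2 bytes in UTF-16, which is why a mostly-ASCII file is larger in UTF-16 than in UTF-8, while a mostly-Chinese file is smaller.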

Reading a Unicode string:

            // Read the whole file; a fixed 100-byte buffer would hold only 50 of the 100 UTF-16 characters.
            byte[] blob = File.ReadAllBytes("test.txt");
            string strUtf16 = Encoding.Unicode.GetString(blob);  // correct: matches the encoding used to write
            string strUtf8 = Encoding.UTF8.GetString(blob);      // wrong on purpose: shows the garbled result

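In the Watch window, strUtf16 shows the original text while strUtf8 is garbled. Decoding does not convert anything: it only reinterprets the same bytes under a different set of rules. The file holds 2-byte UTF-16 code units, whereas UTF-8 would have stored this character in 3 bytes (as it does for most Chinese characters), so the bytes on disk do not form valid UTF-8 sequences.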
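
The garbling can be traced byte by byte. The sketch below is illustrative (the class name MismatchDemo is not from the original post) and assumes the default replacement behavior of Encoding.UTF8, which substitutes U+FFFD for bytes it cannot decode.

```csharp
using System;
using System.Text;

class MismatchDemo
{
    static void Main()
    {
        // '的' is U+7684; in little-endian UTF-16 that is the byte pair 84 76.
        byte[] utf16 = Encoding.Unicode.GetBytes("\u7684");
        Console.WriteLine(BitConverter.ToString(utf16)); // 84-76

        // Reinterpret the same two bytes as UTF-8.
        string wrong = Encoding.UTF8.GetString(utf16);

        // 0x84 is a continuation byte and cannot start a UTF-8 sequence,
        // so it is replaced by U+FFFD; 0x76 is plain ASCII 'v'.
        foreach (char c in wrong)
        {
            Console.WriteLine("U+{0:X4}", (int)c); // U+FFFD, then U+0076
        }
    }
}
```

Applied to the whole test file, the same thing happens 100 times over, which is why strUtf8 shows a replacement character followed by "v" rather than the original text.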

Now try storing the characters as UTF-8 instead:

            byte[] content = Encoding.UTF8.GetBytes(sb.ToString());
            fs.Write(content, 0, content.Length);

After regenerating the file, its properties show a size of 300 bytes: 3 bytes for each of the 100 characters.

The reverse mismatch fails in the same way: decoding UTF-8 bytes as UTF-16 on read leaves strUtf16 garbled.
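
The practical rule is to decode with the same encoding that was used to write. Here is a minimal sketch of that round trip, assuming a throwaway temporary file; the RoundTripDemo class and its RoundTrip helper are hypothetical names introduced for illustration.

```csharp
using System;
using System.IO;
using System.Text;

class RoundTripDemo
{
    // Writes text to a temporary file with the given encoding,
    // then reads it back and decodes it with the same encoding.
    public static string RoundTrip(string text, Encoding encoding)
    {
        string path = Path.GetTempFileName(); // throwaway file for the demo
        try
        {
            File.WriteAllBytes(path, encoding.GetBytes(text));
            return encoding.GetString(File.ReadAllBytes(path));
        }
        finally
        {
            File.Delete(path);
        }
    }

    static void Main()
    {
        string original = new string('\u7684', 100); // 100 copies of '的'

        // Matching encodings recover the text exactly.
        Console.WriteLine(RoundTrip(original, Encoding.UTF8) == original);    // True
        Console.WriteLine(RoundTrip(original, Encoding.Unicode) == original); // True

        // Mismatch: 300 UTF-8 bytes reinterpreted as 150 UTF-16 code units.
        string garbled = Encoding.Unicode.GetString(Encoding.UTF8.GetBytes(original));
        Console.WriteLine(garbled.Length);      // 150
        Console.WriteLine(garbled == original); // False
    }
}
```

Because the raw bytes carry no record of their encoding here (GetBytes writes no byte order mark), the reader has to know the encoding out of band.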

Original article: https://www.cnblogs.com/k330/p/2453336.html