  • Saving files in UTF-8 format (Writing UTF-8 files in C++)

    The words are all simple, so I won't translate it.

    Original article: http://mariusbancila.ro/blog/2008/10/20/writing-utf-8-files-in-c/

    Let’s say you need to write an XML file with this content:

    <?xml version="1.0" encoding="UTF-8"?>
    <root description="this is a naïve example">
    </root>

    How do we write that in C++?

    At first glance, you could be tempted to write it like this:

    #include <fstream>
    #include <string>

    int main()
    {
        std::ofstream testFile;

        testFile.open("demo.xml", std::ios::out | std::ios::binary);

        std::string text =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
            "<root description=\"this is a naïve example\">\n</root>";

        testFile << text;

        testFile.close();

        return 0;
    }

    When you open the file in IE, for instance, surprise! It's not rendered correctly.
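    The article does not say why the rendering fails. One plausible explanation, assuming the source file was saved in a single-byte code page such as Windows-1252, is that the ï lands in the file as the lone byte 0xEF, which is not a complete UTF-8 sequence even though the XML declaration promises UTF-8. The sketch below is a simplified structural check that shows the difference; the name looks_like_utf8 is hypothetical, and the check ignores overlong forms and surrogates.

```cpp
#include <cstddef>
#include <string>

// Simplified structural UTF-8 check (hypothetical helper, not from the article).
// It verifies only that every lead byte is followed by the right number of
// continuation bytes (10xxxxxx). Overlong forms and surrogates are not rejected.
bool looks_like_utf8(const std::string& s)
{
    std::size_t i = 0;
    while (i < s.size()) {
        unsigned char b = static_cast<unsigned char>(s[i]);
        std::size_t extra = 0;
        if (b < 0x80)                extra = 0;   // 0xxxxxxx: ASCII
        else if ((b & 0xE0) == 0xC0) extra = 1;   // 110xxxxx: 2-byte sequence
        else if ((b & 0xF0) == 0xE0) extra = 2;   // 1110xxxx: 3-byte sequence
        else if ((b & 0xF8) == 0xF0) extra = 3;   // 11110xxx: 4-byte sequence
        else return false;                        // stray continuation or invalid lead byte

        // The sequence must not run past the end of the string.
        if (extra > s.size() - i - 1) return false;

        // Every following byte must be a continuation byte.
        for (std::size_t k = 1; k <= extra; ++k) {
            unsigned char c = static_cast<unsigned char>(s[i + k]);
            if ((c & 0xC0) != 0x80) return false;
        }
        i += extra + 1;
    }
    return true;
}
```

    Under that assumption, looks_like_utf8 returns false for the bytes "na\xEF" "ve" (0xEF announces a 3-byte sequence, but 'v' is not a continuation byte) and true for "na\xC3\xAF" "ve", where 0xC3 0xAF is the UTF-8 encoding of ï.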

    So you could be tempted to say "let's switch to wstring and wofstream".

    int main()
    {
        std::wofstream testFile;

        testFile.open("demo.xml", std::ios::out | std::ios::binary);

        std::wstring text =
            L"<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
            L"<root description=\"this is a naïve example\">\n</root>";

        testFile << text;

        testFile.close();

        return 0;
    }

    And when you run it and open the file again, nothing has changed. So, where is the problem? The problem is that neither ofstream nor wofstream writes the text in UTF-8. If you want the file to really be in UTF-8 format, you have to encode the output buffer in UTF-8 yourself. To do that we can use WideCharToMultiByte(). This Windows API maps a wide character string to a new character string (which is not necessarily from a multibyte character set). The first argument indicates the code page; for UTF-8 we need to specify CP_UTF8.
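    To make concrete what "encoding in UTF-8" means, here is a minimal portable sketch that does the encoding by hand. It takes Unicode code points in a std::u32string (which needs C++11) rather than a Windows wchar_t buffer. This is not the approach the article uses; the names append_utf8 and encode_utf8 are hypothetical, and the sketch does not reject surrogate values.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Append the UTF-8 encoding of one Unicode code point to `out`.
// Hypothetical helper for illustration; surrogates (U+D800..U+DFFF) are not
// rejected, and values above U+10FFFF are silently skipped.
void append_utf8(std::string& out, std::uint32_t cp)
{
    if (cp < 0x80) {                       // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp <= 0x10FFFF) {           // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Encode a whole string of code points as UTF-8 bytes.
std::string encode_utf8(const std::u32string& text)
{
    std::string out;
    for (std::size_t i = 0; i < text.size(); ++i) {
        append_utf8(out, static_cast<std::uint32_t>(text[i]));
    }
    return out;
}
```

    With this sketch, the code point U+00EF (ï) becomes the two bytes 0xC3 0xAF, so encode_utf8(U"na\u00EFve") yields the six bytes "na", 0xC3, 0xAF, "ve".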

    The following helper functions encode a std::wstring into a UTF-8 stream, wrapped into a std::string.

    #include <windows.h>
    #include <string>

    std::string to_utf8(const wchar_t* buffer, int len)
    {
        // First call: ask how many bytes the UTF-8 output needs.
        int nChars = ::WideCharToMultiByte(
            CP_UTF8, 0, buffer, len, NULL, 0, NULL, NULL);
        if (nChars == 0) return "";

        std::string newbuffer;
        newbuffer.resize(nChars);

        // Second call: convert into the buffer sized above.
        ::WideCharToMultiByte(
            CP_UTF8, 0, buffer, len,
            &newbuffer[0], nChars, NULL, NULL);

        return newbuffer;
    }

    std::string to_utf8(const std::wstring& str)
    {
        return to_utf8(str.c_str(), (int)str.size());
    }

    With that in hand, all you have to do is make the following changes:

    int main()
    {
        std::ofstream testFile;

        testFile.open("demo.xml", std::ios::out | std::ios::binary);

        std::wstring text =
            L"<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
            L"<root description=\"this is a naïve example\">\n</root>";

        std::string outtext = to_utf8(text);

        testFile << outtext;

        testFile.close();

        return 0;
    }

    And now when you open the file, you get what you wanted in the first place.
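    If you would rather not rely on a browser to check the result, one option is to look at the raw bytes of the file. The following is a small sketch with hypothetical helper names (read_bytes, to_hex); in a correctly encoded file the ï should show up as the two bytes C3 AF.

```cpp
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>

// Read a whole file as raw bytes (hypothetical helper, not from the article).
// Returns an empty string if the file cannot be opened.
std::string read_bytes(const std::string& path)
{
    std::ifstream in(path.c_str(), std::ios::in | std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Format bytes as upper-case hex pairs separated by single spaces.
std::string to_hex(const std::string& bytes)
{
    static const char digits[] = "0123456789ABCDEF";
    std::string out;
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        unsigned char b = static_cast<unsigned char>(bytes[i]);
        if (i != 0) out += ' ';
        out += digits[b >> 4];      // high nibble
        out += digits[b & 0x0F];    // low nibble
    }
    return out;
}
```

    For example, to_hex(read_bytes("demo.xml")) returns the whole file as hex pairs, and to_hex applied to the two-byte string "\xC3\xAF" gives "C3 AF".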

    And that is all!

  • Source: https://www.cnblogs.com/lebronjames/p/2944007.html