zoukankan      html  css  js  c++  java
  • C++ string在unicode下

    VS2008编译环境下
    string 不管是在unicode还是多字节字符集下。都是单字节,数字字母占一个字节,汉字占2个字节。如果想用宽字符 请用std::wstring,这个和THCAR的效果相同。当然也可以用微软的CString更方便些。

    I have written before about How to use Unicode with Python, but I've never figured out how to use Unicode in Standard C before. I managed to find an extremely helpful UTF-8 and Unicode FAQ which answers most of the questions, particularly the section beginning with C Support for Unicode and UTF-8. Additionally, Tim Bray's Characters vs. Bytes provides a very readable overview of Unicode encodings.

    The good news is that if you use wchar_t* strings and the family of functions related to them such as wprintf, wcslen, and wcslcat, you are dealing with Unicode values. In the C++ world, you can use std::wstring to provide a friendly interface. My only complaint is that these are 32-bit (4 byte) characters, so they are memory hogs for all languages. The reason for this choice is that it guarantees each possible character can be represented by one value. Contrary to popular belief, it is possible for a Unicode character to require multiple 16-bit values. While these characters are rare, it is dangerous to hard code that belief into your software, particularly as computers spread through more of the world, and as people create new characters.

  • 相关阅读:
    洛谷 U140360 购物清单
    洛谷 U140359 批量处理
    洛谷 U140358 操作系统
    洛谷U140357 Seaway连续
    洛谷 U141394 智
    洛谷 U141387 金
    CF1327F AND Segments
    刷题心得—连续位运算题目的小技巧
    CF743C Vladik and fractions
    洛谷 P6327 区间加区间sin和
  • 原文地址:https://www.cnblogs.com/yuzhould/p/4454995.html
Copyright © 2011-2022 走看看