§What is UTF-8
The name is derived from Universal Coded Character Set Transformation Format, 8-bit.
§How many bytes
UTF-8 uses one byte for every ASCII character, each of which has the same code value in UTF-8 as in ASCII, and two to four bytes for all other characters.
8 bits = 1 byte (values 0x00 to 0xFF)
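
As a quick check of the byte counts above, here is a minimal Python sketch. The helper name `utf8_len` and the four sample characters are illustrative choices, not part of the original notes.

```python
def utf8_len(ch: str) -> int:
    """Return how many bytes one character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


# Sample characters (illustrative): ASCII letter, accented Latin letter,
# CJK ideograph, and an emoji outside the BMP.
samples = ["A", "\u00e9", "\u4e2d", "\U0001F600"]

for ch in samples:
    encoded = ch.encode("utf-8")
    hex_bytes = " ".join(f"{b:02X}" for b in encoded)
    print(f"U+{ord(ch):04X} -> {hex_bytes} ({len(encoded)} byte(s))")
```

Each byte printed is a value between 00 and FF, and the ASCII letter produces the same single byte it would have in plain ASCII.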
§Why the byte counts differ
The encoding is variable-length and uses 8-bit code units.
All code points in the Basic Multilingual Plane (BMP, U+0000 to U+FFFF) are accessed as a single code unit in UTF-16 encoding and can be encoded in one, two or three bytes in UTF-8.
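
To make the variable-length rule concrete, the following is a rough sketch of a hand-written encoder. The function names `encode_utf8` and `utf16_units` are hypothetical, and the range boundaries follow the standard UTF-8 bit layout rather than anything stated in these notes; in practice Python's built-in `str.encode` does this work.

```python
def encode_utf8(cp: int) -> bytes:
    """Encode one Unicode code point into UTF-8 by hand (illustrative)."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError(f"not an encodable code point: {cp:#x}")
    if cp < 0x80:  # 1 byte: 0xxxxxxx (the ASCII range)
        return bytes([cp])
    if cp < 0x800:  # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:  # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx (rest of the BMP)
        return bytes([
            0xE0 | (cp >> 12),
            0x80 | ((cp >> 6) & 0x3F),
            0x80 | (cp & 0x3F),
        ])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx (beyond the BMP)
    return bytes([
        0xF0 | (cp >> 18),
        0x80 | ((cp >> 12) & 0x3F),
        0x80 | ((cp >> 6) & 0x3F),
        0x80 | (cp & 0x3F),
    ])


def utf16_units(cp: int) -> int:
    """Count the 16-bit code units Python's codec uses for a code point."""
    return len(chr(cp).encode("utf-16-be")) // 2


# Compare the hand-written encoder with the built-in codec.
for cp in (0x41, 0xE9, 0x4E2D, 0x1F600):
    assert encode_utf8(cp) == chr(cp).encode("utf-8")
    print(f"U+{cp:04X}: {len(encode_utf8(cp))} UTF-8 byte(s), "
          f"{utf16_units(cp)} UTF-16 code unit(s)")
```

The first three sample code points lie in the BMP, so each takes a single UTF-16 code unit and at most three UTF-8 bytes; the last one lies outside the BMP and needs four UTF-8 bytes.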
§Advantages
UTF-8 is the dominant character encoding for the World Wide Web.
The Internet Mail Consortium (IMC) recommends that all e-mail programs be able to display and create mail using UTF-8,[5] and the W3C recommends UTF-8 as the default encoding in XML and HTML.