Character Encoding and Transcoding
Note: the figure is taken from http://www.cnblogs.com/luotianshuai/p/5735051.html
Python 2: decoding and encoding
    # Python 2
    # -*- coding: utf-8 -*-

    s = "你好"
    s_to_unicode = s.decode("utf-8")
    # Decode with the source codec first, then encode with the target codec
    s_to_gbk = s.decode("utf-8").encode("gbk")
    print(s_to_gbk)
    print(s_to_unicode)

    gbk_to_utf8 = s_to_gbk.decode("gbk").encode("utf-8")
    print(gbk_to_utf8)

    s1 = u"你好"  # the u prefix makes this a unicode string
    print(s1)

    # Print the system default encoding
    import sys
    print(sys.getdefaultencoding())
Python 3: decoding and encoding
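    # -*- encoding: utf-8 -*-
    # Python 3: str holds Unicode text by default

    # s is a Unicode string. Changing the file's encoding declaration does not
    # change how the string's content is stored.
    s = "你好"
    s_gbk = s.encode("gbk")
    print(s_gbk)       # GBK bytes
    print(s.encode())  # UTF-8 bytes (encode() defaults to UTF-8)
    gbk_to_utf8 = s_gbk.decode("gbk").encode("utf-8")
    print("utf8", gbk_to_utf8)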
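The decode-then-encode pattern used here can be wrapped in a small helper. The following is a minimal Python 3 sketch; the function name `transcode` and its parameters are hypothetical, chosen only for illustration, and are not part of any library.

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Decode bytes using the `source` codec, then re-encode them as `target`.

    Hypothetical helper for illustration only.
    """
    return data.decode(source).encode(target)


text = "你好"
gbk_bytes = text.encode("gbk")
utf8_bytes = transcode(gbk_bytes, "gbk", "utf-8")

print(gbk_bytes)   # b'\xc4\xe3\xba\xc3'
print(utf8_bytes)  # b'\xe4\xbd\xa0\xe5\xa5\xbd'
```

The same two characters take 4 bytes in GBK and 6 bytes in UTF-8, which is why bytes must always be decoded with the codec that produced them.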
    s = "你好"
    # encode("gb2312") produces bytes; the final decode("gb2312") turns them back into a str
    print(s.encode("utf-8").decode("utf-8").encode("gb2312").decode("gb2312"))
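A round trip like this only works when each decode uses the codec that produced the bytes. The sketch below, in Python 3, shows what may happen otherwise; the variable names are illustrative assumptions, not part of the original example.

```python
text = "你好"
gbk_bytes = text.encode("gbk")

# Decoding with the codec that produced the bytes recovers the text.
round_trip = gbk_bytes.decode("gbk")

# Decoding the same bytes as UTF-8 raises, because they are not valid UTF-8.
try:
    gbk_bytes.decode("utf-8")
    wrong_codec_failed = False
except UnicodeDecodeError:
    wrong_codec_failed = True

# errors="replace" substitutes U+FFFD for undecodable bytes instead of raising.
lossy = gbk_bytes.decode("utf-8", errors="replace")

print(round_trip == text)  # True
print(wrong_codec_failed)  # True
print("\ufffd" in lossy)   # True
```

Using `errors="replace"` avoids the exception but loses the original characters, so it is better suited to diagnostics than to real transcoding.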