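I often open Chinese-language text files in Python, and I often forget to specify the encoding, which leads to the same error every time. Recording it here: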
UnicodeDecodeError: 'gbk' codec can't decode byte 0xad in position 5: illegal multibyte sequence
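The code that raises the error is below. No encoding is passed to open(), so Python falls back to the locale's default encoding (GBK here, as the error message shows), and the file's bytes are not valid GBK: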
    filename = '有中文内容的.txt'
    with open(filename, 'r') as file_object:  # no encoding given, so the locale default is used
        lines = file_object.readlines()
        print(lines)
The fix is to pass the file's actual encoding explicitly:
    filename = '有中文内容的.txt'
    with open(filename, 'r', encoding='utf-8') as file_object:  # decode the file as UTF-8
        lines = file_object.readlines()
        print(lines)
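To see the mismatch in isolation, here is a minimal sketch that writes a small UTF-8 file and then reads it back with the wrong and the right codec. The sample text and the temporary file name are assumptions made up for this demonstration; they are not from the original file. Depending on the bytes involved, decoding with the wrong codec either raises UnicodeDecodeError or silently produces garbled text, so the sketch handles both cases.

```python
import os
import tempfile

# Hypothetical sample text and file name, used only for this demonstration.
text = "你好,世界\n"
path = os.path.join(tempfile.mkdtemp(), "sample_utf8.txt")

# Write the file as UTF-8.
with open(path, "w", encoding="utf-8") as f:
    f.write(text)

# Reading with a mismatched codec either raises UnicodeDecodeError
# or returns garbled text; both outcomes are handled here.
try:
    with open(path, "r", encoding="gbk") as f:
        wrong = f.read()
except UnicodeDecodeError as exc:
    wrong = None
    print("decode failed:", exc)

# Reading with the matching codec returns the original text.
with open(path, "r", encoding="utf-8") as f:
    right = f.read()

print(right == text)
```

Passing encoding explicitly also makes the script behave the same on machines whose locale default is not GBK.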
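A side note: the code above opens the file with a context manager (the with statement), so the file is closed automatically and there is no need to call close(). If you call open() directly, remember to close the file yourself, which releases the file handle and its buffers.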
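The two styles can be compared side by side. This is a minimal sketch, assuming a throwaway file created just for the demonstration (the file name and its contents are made up). With a bare open(), wrapping the work in try/finally makes sure close() runs even if reading raises an exception; the with statement does the same thing with less code.

```python
import os
import tempfile

# Hypothetical demo file, created only for this example.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("第一行\n第二行\n")

# Bare open(): try/finally guarantees close() even if reading fails.
file_object = open(path, "r", encoding="utf-8")
try:
    manual_lines = [line.strip() for line in file_object]
finally:
    file_object.close()

# Context manager: the file is closed when the block exits.
with open(path, "r", encoding="utf-8") as managed:
    managed_lines = [line.strip() for line in managed]

# Both file objects report that they are closed.
print(file_object.closed, managed.closed)
```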
The code that raises the error:
    filename = open('有中文字体或者是gbk编码的文档.txt','r')
    for line in filename:      # read line by line
        print(line.strip())    # strip surrounding whitespace, including the newline
    filename.close()           # close the file
The fix:
    filename = open('毛概.txt','r',encoding='utf-8')  # add the encoding
    for line in filename:      # read line by line
        print(line.strip())    # strip surrounding whitespace, including the newline
    filename.close()           # close the file
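When it is not known in advance whether a file is UTF-8 or GBK, one possible approach is to try each encoding in turn. The helper below is a hypothetical sketch, not part of the original notes: read_text and its encodings parameter are names invented for illustration. It tries UTF-8 first because UTF-8 decoding is strict, whereas GBK can sometimes decode the wrong bytes without raising an error.

```python
import os
import tempfile


def read_text(path, encodings=("utf-8", "gbk")):
    """Hypothetical helper: return (text, encoding) for the first codec that decodes."""
    if not encodings:
        raise ValueError("at least one encoding is required")
    last_error = None
    for encoding in encodings:
        try:
            with open(path, "r", encoding=encoding) as file_object:
                return file_object.read(), encoding
        except UnicodeDecodeError as exc:
            last_error = exc  # remember the failure and try the next codec
    raise last_error


# Demonstration with two made-up files holding the same text in different encodings.
sample = "中文内容\n"
folder = tempfile.mkdtemp()
gbk_path = os.path.join(folder, "gbk.txt")
utf8_path = os.path.join(folder, "utf8.txt")
with open(gbk_path, "w", encoding="gbk") as f:
    f.write(sample)
with open(utf8_path, "w", encoding="utf-8") as f:
    f.write(sample)

gbk_result = read_text(gbk_path)
utf8_result = read_text(utf8_path)
print(gbk_result[1], utf8_result[1])
```

This is only a heuristic: a successful decode does not prove the guess is right, so knowing the real encoding and passing it explicitly remains the safer choice.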