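Background:
In a data warehouse project, files exported from Linux are UTF-8 encoded, while Windows (on Simplified Chinese systems) defaults to GBK. Transferring files that contain Chinese characters between the two systems produces garbled text. To save time, the Python script here converts a file from one encoding to another.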
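As a small illustration of why the text gets garbled, the sketch below encodes a sample string as UTF-8 and then decodes the same bytes as GBK. The sample string and variable names are made up for this example; `errors="replace"` is used only so the wrong decode cannot raise an exception.

```python
# Illustrative sample text (not from the original project).
text = "数据仓库"

# Bytes as a Linux export would store them.
utf8_bytes = text.encode("utf-8")

# What a GBK-assuming program sees when it reads those bytes.
garbled = utf8_bytes.decode("gbk", errors="replace")

# Decoding with the correct encoding recovers the text.
restored = utf8_bytes.decode("utf-8")

print(len(utf8_bytes), len(text.encode("gbk")))  # byte lengths differ between the encodings
print(garbled != text, restored == text)
```

The bytes on disk never change; only the encoding used to interpret them does, which is why re-encoding the file fixes the problem.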
Language:
Python3
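Script:
import codecs


def ReadFile(filePath, encoding):
    # Read the whole file as text, decoding it with the given encoding.
    with codecs.open(filePath, "r", encoding) as f:
        return f.read()


def WriteFile(filePath, data, encoding):
    # Write the text back out, encoding it with the given encoding.
    with codecs.open(filePath, "w", encoding) as f:
        f.write(data)


def Encode_Convert(src, dst):
    # src and dst are [path, encoding] pairs.
    content = ReadFile(filePath=src[0], encoding=src[1])
    WriteFile(filePath=dst[0], data=content, encoding=dst[1])


# Each prompt expects "path,encoding", for example: data.csv,utf-8
src = [part.strip() for part in input("input src: ").split(",")]
dst = [part.strip() for part in input("input dst: ").split(",")]
Encode_Convert(src, dst)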
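For use without the interactive prompts, the same conversion can be sketched as a plain function built on the built-in `open`. This is a minimal variant under stated assumptions, not the original script: `convert_file`, `demo_round_trip`, and the file names are hypothetical, and `newline=""` is passed so line endings are left untouched, as they are with `codecs.open`.

```python
import os
import tempfile


def convert_file(src_path, src_encoding, dst_path, dst_encoding):
    """Re-encode a text file (hypothetical helper for illustration)."""
    # newline="" disables newline translation on both read and write.
    with open(src_path, "r", encoding=src_encoding, newline="") as f:
        content = f.read()
    with open(dst_path, "w", encoding=dst_encoding, newline="") as f:
        f.write(content)


def demo_round_trip():
    """Write a UTF-8 file, convert it to GBK, and check the result."""
    sample = "编号,名称\n1,数据仓库\n"  # made-up sample data
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "linux_export.csv")
        dst = os.path.join(tmp, "windows_copy.csv")
        with open(src, "w", encoding="utf-8", newline="") as f:
            f.write(sample)
        convert_file(src, "utf-8", dst, "gbk")
        with open(dst, "rb") as f:
            raw = f.read()
    # The converted bytes should decode as GBK back to the same text.
    return raw.decode("gbk") == sample


print(demo_round_trip())
```

Note that writing will raise `UnicodeEncodeError` if the source contains characters the target encoding cannot represent; whether to pass `errors="replace"` in that case is a choice left to the caller.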