Processing txt Files with Python

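A while ago I worked on an automatic translation project at my company that involved processing a large number of copy strings. Doing that by hand was pretty much out of the question (I'm lazy), so I automated it with a Python script. I'm recording it here.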

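import linecache

def outputfile(i, j, n):
	# Pair line l of zh.txt (msgid) with line l+n of tw.txt (msgstr)
	# for l in i..j, and write each pair to 1.txt.
	# zh = file_zh.read().decode('utf-8').encode('gbk', 'ignore')
	# 'r+' opens an existing file without truncating it, so 1.txt must already exist.
	file_new = open('1.txt', 'r+')
	for l in range(i, j + 1):
		# line = linecache.getline('tw.txt', l).decode('utf-8').encode('gbk', 'ignore')
		# strip() removes the trailing newline and surrounding whitespace.
		line1 = linecache.getline('zh.txt', l).strip()
		line2 = linecache.getline('tw.txt', l + n).strip()
		# Skip blank source lines.
		if len(line1) == 0:
			continue
		file_new.write('msgid  "' + line1 + '"\n')
		file_new.write('msgstr "' + line2 + '"\n\n')
	file_new.close()

outputfile(1, 25, 3)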

Generated output:

# 这些是测试字段

msgid  "Hello world!"
msgstr "世界你好!"

msgid  "Python is a good Language."
msgstr "Python 是门好语言."

msgid  "中国"
msgstr "中國"

msgid  "测试啊"
msgstr "測試呢!!!"

msgid  "Hello %(useraaa)s!"
msgstr "你好 %(useraaa)s!"

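The code is very simple: it reads text line by line from two different files and writes it line by line to a new file in a fixed format. Note that the txt files need to be saved as UTF-8; otherwise they have to be transcoded first, which is a bit of a hassle.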
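As a rough illustration of the same idea with the encoding made explicit, here is a minimal Python 3 sketch. The function name write_po_pairs, its parameters, and the helper line_at are hypothetical and not part of the original script; the encoding argument is an assumption about how non-UTF-8 input could be handled without a separate transcoding step.

```python
def line_at(lines, k):
    """Return the stripped 1-based line k, or '' if k is out of range."""
    if 1 <= k <= len(lines):
        return lines[k - 1].strip()
    return ""


def write_po_pairs(src_path, dst_path, out_path, start, end, offset=0,
                   encoding="utf-8"):
    """Pair line k of src_path with line k+offset of dst_path for
    k in start..end and write msgid/msgstr entries to out_path.
    Returns the number of entries written."""
    # Read both inputs with an explicit encoding (hypothetical parameter);
    # pass e.g. encoding="gbk" if the files were not saved as UTF-8.
    with open(src_path, encoding=encoding) as f:
        src_lines = [line.rstrip("\n") for line in f]
    with open(dst_path, encoding=encoding) as f:
        dst_lines = [line.rstrip("\n") for line in f]

    count = 0
    # The output is always written as UTF-8; "w" creates or truncates the file.
    with open(out_path, "w", encoding="utf-8") as out:
        for k in range(start, end + 1):
            msgid = line_at(src_lines, k)
            msgstr = line_at(dst_lines, k + offset)
            if not msgid:
                continue  # skip blank source lines
            out.write('msgid  "' + msgid + '"\n')
            out.write('msgstr "' + msgstr + '"\n\n')
            count += 1
    return count
```

Calling write_po_pairs('zh.txt', 'tw.txt', '1.txt', 1, 25, offset=3) would roughly correspond to outputfile(1,25,3) above, except that the output file is created or truncated rather than opened in 'r+' mode. Like the original, this sketch does not escape double quotes or backslashes inside the strings.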


Original article: https://www.cnblogs.com/mingtan/p/6428194.html