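This code counts the frequency of every segmented word across three files and writes the results to another file, sorted by frequency from highest to lowest.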
with open('/home/xbwang/Desktop/output_measures.txt','r') as f:
    with open('/home/xbwang/Desktop/output_measures2.txt','r') as f1:
        with open('/home/xbwang/Desktop/output_output_bk.txt','r') as f2:
            word_frequency = {}
            for line in f:
                line = line.strip().split(' ')
                for word in line:
                    if word in word_frequency:  # check whether word is already a key of the dict
                        word_value = word_frequency.get(word) + 1
                        word_frequency[word] = word_value  # word_frequency.update({word: word_value}) also works here
                    else:
                        word_frequency.update({word: 1})  # word_frequency[word] = 1 also works here
            for line in f1:
                line = line.strip().split(' ')
                for word in line:
                    if word in word_frequency:
                        word_value = word_frequency.get(word) + 1
                        word_frequency[word] = word_value
                    else:
                        word_frequency.update({word: 1})
            for line in f2:
                line = line.strip().split(' ')
                for word in line:
                    if word in word_frequency:
                        word_value = word_frequency.get(word) + 1
                        word_frequency[word] = word_value
                    else:
                        word_frequency.update({word: 1})
            # Note: sorted() returns a list, not a dict; each element is a (key, value) tuple such as (a, 1)
            word_frequency = sorted(word_frequency.items(), key=lambda A: A[1], reverse=True)
            with open('/home/xbwang/Desktop/word_frequency.bk','a') as f3:
                for word in word_frequency:
                    #print word
                    #f3.write(word)
                    word0 = word[0]  # the word itself
                    word1 = word[1]  # its count
                    #print word0
                    #print word1
                    #f3.write(word0+' '+str(word_frequency.get(word))+' ')
                    # Before sorting was added, this could be written directly as
                    # f3.write(word+' '+str(word_frequency.get(word))+' '),
                    # which is the typical way to print key/value pairs out of a dict
                    f3.write(word0+' '+str(word1)+' ')
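The three counting loops above are identical, so the same idea can be sketched more compactly with collections.Counter from the standard library. This is only a possible rewrite, not the original author's method. The function names count_words and write_frequencies are made up for illustration, and the sketch makes three assumptions that differ from the code above: it splits on any whitespace with split() (so empty strings from blank lines or repeated spaces are not counted), it opens the output file in 'w' mode instead of 'a', and it writes one "word count" pair per line.

```python
from collections import Counter


def count_words(paths):
    """Count whitespace-separated tokens across all files in `paths` (hypothetical helper)."""
    counts = Counter()
    for path in paths:
        with open(path, 'r') as f:
            for line in f:
                # Assumption: split() with no argument is used instead of
                # strip().split(' '), so empty strings are never counted.
                counts.update(line.split())
    return counts


def write_frequencies(counts, out_path):
    """Write one 'word count' pair per line, most frequent first (hypothetical helper)."""
    # Assumption: 'w' overwrites the file; the original code appends with 'a'.
    with open(out_path, 'w') as out:
        # most_common() returns a list of (word, count) tuples sorted by
        # count in descending order, like the sorted(...) call in the original.
        for word, freq in counts.most_common():
            out.write('%s %d\n' % (word, freq))
```

To reproduce the original run, one would pass the three input paths from the code above as a list to count_words and then pass the result, together with the output path, to write_frequencies.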