Counting word frequencies in a text

Method 1:

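# Count the words with a plain dictionary
with open('art.txt', 'r') as file01:   # the with block closes the file automatically
    # Read the file, remove the punctuation marks , . ; and split on whitespace
    words = file01.read().replace(',', '').replace('.', '').replace(';', '').split()
counts = {}
for word in words:  # Build the dictionary: each word is a key, its number of occurrences is the value
    if word in counts:
        counts[word] = counts[word] + 1
    else:
        counts[word] = 1

# Sort the (word, count) pairs by count, highest first
ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
for word, n in ranked[:5]:   # Print the top 5
    print('单词 %s 出现了 %d 次' % (word, n))   # The message reads "word X appeared N times"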
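
The chain of replace() calls above removes only commas, periods, and semicolons, and it treats "The" and "the" as different words. As a rough sketch of one alternative (not part of the original post), the text could be lowercased and split with a regular expression. The helper name tokenize and the sample sentence are made up for illustration, and the pattern assumes English text in which a word is a run of letters, digits, and apostrophes.

```python
import re

def tokenize(text):
    """Lowercase the text and return every run of letters, digits, and apostrophes."""
    # Assumption: anything outside [a-z0-9'] (spaces, punctuation) separates words.
    return re.findall(r"[a-z0-9']+", text.lower())

sample = "The cat; the dog. The end, of THE story!"
tokens = tokenize(sample)
print(tokens)
# ['the', 'cat', 'the', 'dog', 'the', 'end', 'of', 'the', 'story']
```

The resulting list can be fed to the same counting loop in place of the list produced by split().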

Method 2:

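# Count the words in a list with collections.Counter
from collections import Counter
with open('art.txt', 'r') as file01:   # the with block closes the file automatically
    # Read the file, remove the punctuation marks , . ; and split on whitespace
    words = file01.read().replace(',', '').replace('.', '').replace(';', '').split()
counts = Counter(words)        # Count how often each word occurs
top5 = counts.most_common(5)   # Take the 5 most frequent words, highest count first
for word, n in top5:
    print('单词 %s 出现了 %d 次' % (word, n))   # The message reads "word X appeared N times"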
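
Both methods read the whole file into memory with read(). For larger files, one possible variation is to feed a Counter one line at a time through Counter.update(). The sketch below is only an illustration: the function name top_words, its parameters, and the UTF-8 encoding are assumptions rather than anything from the original post, and the demo writes a throwaway temporary file so that it runs without art.txt.

```python
import os
import tempfile
from collections import Counter

def top_words(path, n=5, encoding='utf-8'):
    """Return the n most common whitespace-separated words in the file at path."""
    counts = Counter()
    with open(path, 'r', encoding=encoding) as f:
        for line in f:
            # Same punctuation handling as above: drop , . ; and split on whitespace
            cleaned = line.replace(',', '').replace('.', '').replace(';', '')
            counts.update(cleaned.split())
    return counts.most_common(n)

# Demo on a temporary file (hypothetical content, not the original art.txt)
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False, encoding='utf-8') as tmp:
    tmp.write("the cat, the dog.\nthe end; of the story\n")
demo_top = top_words(tmp.name, n=2)
os.remove(tmp.name)
print(demo_top)
# [('the', 4), ('cat', 1)]
```

With the file from this post, the call would be top_words('art.txt').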

Output (each line reads "word X appeared N times"):

单词 the 出现了 6 次
单词 of 出现了 5 次
单词 in 出现了 3 次
单词 to 出现了 3 次
单词 something 出现了 3 次
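
In the output above, in, to, and something are tied at 3 occurrences. sorted() is stable and, in Python 3.7 and later, Counter.most_common() lists equal counts in the order first encountered, so the order of tied words depends on where each word first appears in the text. If a ranking that does not depend on the text order is wanted, one option is to break ties alphabetically. The function name top_words_sorted and the sample words below are made up for illustration.

```python
from collections import Counter

def top_words_sorted(words, n=5):
    """Return the top n (word, count) pairs, breaking ties alphabetically."""
    counts = Counter(words)
    # Sort key: count descending (negated), then the word itself ascending
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))[:n]

sample_words = "to in something in to something the the the the".split()
ranking = top_words_sorted(sample_words, 3)
print(ranking)
# [('the', 4), ('in', 2), ('something', 2)]
```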

Original article: https://www.cnblogs.com/jacky-zhao/p/8244117.html