  • A complete English word-frequency counting example using file input (9.27)

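    1. Read in the string to be analysed

    2. Split the text and extract the words

    3. Count the words in a dictionary

    4. Exclude grammatical (function) words

    5. Sort

    6. Output the TOP(20)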
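
    The six steps above can also be sketched compactly with the standard library. This is only an illustrative variant, not the implementation used in this post: the function name top_words, the STOP_WORDS set and the sample sentence are assumptions made for the example, and collections.Counter replaces the hand-written dictionary and sort.

```python
import re
from collections import Counter

# Hypothetical stop-word set for illustration; extend it as needed.
STOP_WORDS = {'a', 'the', 'and', 'is', 'as', 'you', 'me', 'do'}

def top_words(text, n=20, stop_words=STOP_WORDS):
    """Return the n most frequent non-stop words as (word, count) pairs."""
    # Steps 1-2: lower-case the text and extract runs of letters/apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    # Steps 3-4: count every word that is not in the stop-word set.
    counts = Counter(w for w in words if w not in stop_words)
    # Steps 5-6: most_common sorts by descending count and keeps the top n.
    return counts.most_common(n)

# Sample sentence made up for this sketch.
sample = "God is a girl, wherever you are. God is a girl, whatever you say."
for word, count in top_words(sample, n=3):
    print(word, count)
```

    Using a regular expression for step 2 avoids having to list every punctuation mark to replace, at the cost of treating apostrophes as part of a word (so "she's" stays one token).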

    The text to analyse is as follows:

    girl='''Remembering me, Discover and see All over the world, She's known as a girl To those who a free, The mind shall be key Forgotten as the past 'Cause history will last
    
    God is a girl, Wherever you are, Do you believe it, can you recieve it? God is a girl, Whatever you say, Do you believe it, can you recieve it? God is a girl.'''
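
    Because the implementation opens daili.txt, the text has to be saved in that file first. A minimal sketch of that preparation step, assuming the file lives in the current working directory and using only a shortened excerpt of the lyrics (the helper name save_text is hypothetical):

```python
def save_text(path, text):
    """Write text to path as UTF-8 so it can be read back with open()."""
    with open(path, 'w', encoding='utf-8') as fo:
        fo.write(text)

# Shortened excerpt for illustration; use the full lyrics in practice.
excerpt = "Remembering me, Discover and see All over the world, She's known as a girl"
save_text('daili.txt', excerpt)
```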

    The implementation is as follows:

    # Read the text to analyse from a file; daili.txt is assumed to
    # contain the lyrics shown above. The with block closes the file.
    with open('daili.txt','r') as fo:
        girl=fo.read()

    # Grammatical words to exclude from the count
    exc={'','a','the','and','is','as','you','me','do'}

    # Lower-case the text and replace punctuation with spaces;
    # '.' is included so that 'girl.' and 'girl' count as the same word
    girl=girl.lower()
    for i in ',?.':
        girl=girl.replace(i,' ')
    # split() with no argument splits on any whitespace, including newlines
    words=girl.split()
    print('Lyrics:\n',words)

    # Unique words with the excluded words removed
    keys=set(words)
    keys=keys-exc
    print('Final words:\n',keys)

    # Count only the remaining words (avoid shadowing the built-in dict)
    counts={}
    for i in keys:
        counts[i]=words.count(i)
    print('Word count result:\n',counts)

    # Sort by count, highest first
    dai=list(counts.items())
    dai.sort(key=lambda x:x[1],reverse=True)
    print('Sorted result:\n')

    # Slicing is safe even if there are fewer than 20 distinct words
    for item in dai[:20]:
        print(item)

    The program output is as follows:

  • Original article: https://www.cnblogs.com/laidaili/p/7595295.html