  • Comprehensive exercise: English word frequency count

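    1. Preprocessing for the word frequency count
    2. Download the lyrics of an English song or an English article
    3. Replace all separators such as , . ? ! ’ : with spaces
    4. Convert all uppercase letters to lowercase
    5. Generate the word list
    6. Generate the word frequency counts
    7. Sort
    8. Exclude grammatical words: pronouns, articles, conjunctions
    9. Output the 10 most frequent words (TOP 10)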
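
    Steps 3 to 5 above can be tried on a short sample string first. The following is a minimal sketch, not the post's own code: the helper name `to_words` and the sample sentence are assumptions made for illustration, and it uses `str.translate` instead of chained `replace` calls.

```python
import string

def to_words(text):
    """Replace punctuation with spaces, lowercase, and split into words."""
    # Map every ASCII punctuation character, plus the curly apostrophe, to a space.
    separators = string.punctuation + "\u2019"
    table = str.maketrans(separators, " " * len(separators))
    return text.translate(table).lower().split()

# Hypothetical sample line, used only to show the effect of the three steps.
sample = "Trouble is a friend, yeah trouble is a friend of mine!"
print(to_words(sample))
# ['trouble', 'is', 'a', 'friend', 'yeah', 'trouble', 'is', 'a', 'friend', 'of', 'mine']
```

    Building one translation table means the text is scanned once, however many separator characters there are.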
    song='''
    Trouble will find you no matter where you go, oh oh
    
    No matter if you;re fast, no matter if you;re slow, oh oh
    
    The eye of the storm wanna cry in the morn, oh oh
    
    You;re fine for a while but you start to lose control
    
    He;s there in the dark, he;s there in my heart
    
    He waits in the wings, he;s gotta play a part
    
    Trouble is a friend, yeah trouble is a friend of mine
    
    Ahh..
    
    Trouble is a friend, but trouble is a foe, oh oh
    
    And no matter what I feed him he always seems to grow, oh oh
    
    He sees what I see and he knows what I know, oh oh
    
    So don;t forget as you ease on down my road
    
    He;s there in the dark, he;s there in my heart
    
    He waits in the wings, he;s gotta play a part
    
    Trouble is a friend, yeah trouble is a friend of mine
    
    So don;t be alarmed if he takes you by the arm
    
    I roll down the window, I;m a sucker for his charm
    
    Trouble is a friend, yeah trouble is a friend of mine
    
    Ahh..
    
    How I hate the way he makes me feel
    
    And how I try to make him leave
    
    I try, oh oh I try
    
    He;s there in the dark, he;s there in my heart
    
    He waits in the wings, he;s gotta play a part
    
    Trouble is a friend, yeah trouble is a friend of mine
    
    So don;t be alarmed if he takes you by the arm
    
    I roll down the window, I;m a sucker for his charm
    
    Trouble is a friend, yeah trouble is a friend of mine
    '''
    
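    # Replace separators with spaces (the lyrics above use ';' where an apostrophe would be)
    for sep in ";,.?!:":
        song = song.replace(sep, " ")
    # Lowercase the text, then split on whitespace to get the word list
    words = song.lower().split()
    # Count each word in a single pass (avoids shadowing the built-in dict)
    counts = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    # Drop grammatical words; pop() with a default does not raise if a word is absent
    stop_words = {'of','be','m','if','and','will','re','but','what','so','how'}
    for word in stop_words:
        counts.pop(word, None)
    # Sort by frequency, highest first, and print the top 10
    ranked = sorted(counts.items(), key=lambda e: e[1], reverse=True)
    for item in ranked[:10]:
        print(item)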
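
    The counting, filtering, and top-N steps can also be written with `collections.Counter` from the standard library. This is a sketch of an alternative, not the approach used in the listing above. The function name `top_words` and the small word list and stop-word set are assumptions chosen for illustration.

```python
from collections import Counter

def top_words(words, stop_words, n=10):
    """Return the n most frequent words that are not in stop_words."""
    # Filter while counting, so excluded words never enter the tally.
    counts = Counter(w for w in words if w not in stop_words)
    # most_common() sorts by count, highest first.
    return counts.most_common(n)

# Hypothetical input: an already lowercased and split word list.
words = "trouble is a friend yeah trouble is a friend of mine".split()
for word, freq in top_words(words, stop_words={"of", "a", "is"}, n=3):
    print(word, freq)
# trouble 2
# friend 2
# yeah 1
```

    Filtering before counting avoids deleting keys afterwards, so a stop word that never occurs in the text causes no error.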

    Screenshot of the run:

  • Original article: https://www.cnblogs.com/lzs741788135/p/8650193.html