  • A complete English word-frequency counting example using a file

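    1. Read in the string to be analyzed

    2. Split the text and extract the words

    3. Count the words with a dictionary

    4. Exclude grammatical (function) words

    5. Sort

    6. Output the TOP (20)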

    lyric=open('lyric.txt','w')
    lyric.write('''your butt is mine
     
    I Gonna tell you right
     
    Just show your face
     
    In broad daylight
     
    I'm telling you
     
    On how I feel
     
    Gonna Hurt Your Mind
     
    Don't shoot to kill
     
    Shamone
     
    Shamone
     
    Lay it on me
     
    All right
     
    I'm giving you
     
    On count of three
     
    To show your stuff
     
    Or let it be
     
    I'm telling you
     
    Just watch your mouth
    I know your game
     
    What you're about
     
    Well they say the sky's the limit
     
    And to me that's really true
     
    But my friend you have seen nothin'
     
    Just wait till I get through
     
    Because I'm bad,I'm bad
     
    shamone
     
    (Bad,bad,really,really bad)
     
    You know I'm bad,I'm bad
     
    (Bad,bad,really,really bad)
     
    You know it
     
    You know I'm bad,I'm bad
     
    Come on,you know
     
    (Bad,bad,really,really bad)
     
    And the whole world
     
    Has to answer right now
     
    Just to tell you once again
     
    Who's bad
     
    The word is out
     
    You're doin' wrong
     
    Gonna lock you up
     
    Before too long
     
    Your lyin' eyes
     
    Gonna tell you right
     
    So listen up
     
    Don't make a fight
     
    Your talk is cheap
     
    You're not a man
     
    Your throwin' stones
     
    To hide your hands
     
    Well they say the sky's the limit
     
    And to me that's really true
     
    But my friend you have seen nothin'
     
    Just wait till I get through
     
    Because I'm bad,I'm bad
     
    shamone
     
    (Bad,bad,really,really bad)
     
    You know I'm bad,I'm bad
     
    (Bad,bad,really,really bad)
     
    You know it
     
    You know I'm bad,I'm bad
     
    Come on,you know
     
    (Bad,bad,really,really bad)
     
    And the whole world
     
    Has to answer right now
     
    Just to tell you once again
     
    Who's bad
     
    We could change the world tomorrow
     
    This could be a better place
     
    If you don't like what I'm sayin'
     
    Then won't you slap my face
     
    Because I'm bad''')
    lyric.close()
    comment=open('lyric.txt','r')
    bad=comment.read()
    comment.close()
    
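    # Normalize: lowercase, then turn punctuation and newlines into spaces
    bad=bad.lower()
    for i in ",.?!()":
        bad=bad.replace(i,' ')
    bad=bad.replace('\n',' ')
    # Splitting on a single space leaves empty strings; they are skipped later
    words=bad.split(' ')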
    s=set(words)
    
    # Drop function words; discard() does not raise if a word is absent
    delete={"the","a","it","to","on","and"}
    for i in delete:
        s.discard(i)
        
    # Count how often each remaining word occurs
    dic={}
    for i in s:
        if i in (" ",""):
            continue
        dic[i]=words.count(i)
    
    # Sort by count, descending, and print at most 20 entries
    lis=list(dic.items())
    lis.sort(key=lambda x:x[1],reverse=True)
    for i in lis[:20]:
        print(i)
    

    Run result: (output image missing)
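
As a more compact alternative to the script above, the same six steps can be sketched with `re` and `collections.Counter` from the standard library. This is only a sketch, not the author's method: the function name `top_words`, the constant `STOP_WORDS`, and the sample sentence are made up for illustration, and the regular expression tokenizes slightly differently from the original (it drops every character that is not a letter or an apostrophe, rather than only `,.?!()`).

```python
import re
from collections import Counter

# Stop-word set copied from the script above; the constant name is hypothetical.
STOP_WORDS = {"the", "a", "it", "to", "on", "and"}

def top_words(text, stop_words=STOP_WORDS, n=20):
    """Return up to n (word, count) pairs, most frequent first."""
    # Lowercase, then keep runs of letters and apostrophes so that
    # contractions such as "don't" stay in one piece.
    words = re.findall(r"[a-z']+", text.lower())
    # Counter does the counting in one pass; stop words are filtered out first.
    counts = Counter(w for w in words if w not in stop_words)
    # most_common(n) sorts by count, descending, and never fails on short input.
    return counts.most_common(n)

if __name__ == "__main__":
    # Made-up sample text, used only to show the call.
    sample = "The cat and the hat. The cat sat on a mat!"
    for word, count in top_words(sample, n=3):
        print(word, count)
```

To apply it to the file from the script, read the text with `with open('lyric.txt') as f: text = f.read()` and pass `text` to `top_words`. Counting in a single pass also avoids calling `words.count(i)` once per distinct word.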

  • Original article: https://www.cnblogs.com/mavenlon/p/7595133.html