  • A complete English word-frequency counting example using a file

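    1. Read in the string to be analyzed

    2. Split the text and extract the words

    3. Count the words in a dictionary

    4. Exclude grammatical (function) words

    5. Sort

    6. Output the top 20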
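
The six steps can be sketched as one small function working on an in-memory string. This is only a minimal illustration, not the listing that follows: the helper name `top_words`, the sample sentence, and the use of a regular expression to extract words (instead of replacing punctuation with spaces) are assumptions made for the example.

```python
import re

# Stop words to exclude; this mirrors the set used in the article's listing.
STOP_WORDS = {"the", "a", "it", "to", "on", "and"}

def top_words(text, stop_words=STOP_WORDS, n=20):
    """Return up to n (word, count) pairs, most frequent first."""
    # Step 2: lowercase, then extract runs of letters and apostrophes as words
    words = re.findall(r"[a-z']+", text.lower())
    # Steps 3 and 4: count into a dict, skipping grammatical words
    counts = {}
    for word in words:
        if word in stop_words:
            continue
        counts[word] = counts.get(word, 0) + 1
    # Steps 5 and 6: sort by count (descending) and keep the first n
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)[:n]

# Step 1: the string to analyze (a made-up sample sentence)
sample = "The cat sat on the mat. The cat ran, and the dog sat!"
for word, count in top_words(sample, n=3):
    print(word, count)
```

Because Python's sort is stable, words with equal counts keep the order in which they first appeared in the text.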

    lyric=open('lyric.txt','w')
    lyric.write('''your butt is mine
     
    I Gonna tell you right
     
    Just show your face
     
    In broad daylight
     
    I'm telling you
     
    On how I feel
     
    Gonna Hurt Your Mind
     
    Don't shoot to kill
     
    Shamone
     
    Shamone
     
    Lay it on me
     
    All right
     
    I'm giving you
     
    On count of three
     
    To show your stuff
     
    Or let it be
     
    I'm telling you
     
    Just watch your mouth
    I know your game
     
    What you're about
     
    Well they say the sky's the limit
     
    And to me that's really true
     
    But my friend you have seen nothin'
     
    Just wait till I get through
     
    Because I'm bad,I'm bad
     
    shamone
     
    (Bad,bad,really,really bad)
     
    You know I'm bad,I'm bad
     
    (Bad,bad,really,really bad)
     
    You know it
     
    You know I'm bad,I'm bad
     
    Come on,you know
     
    (Bad,bad,really,really bad)
     
    And the whole world
     
    Has to answer right now
     
    Just to tell you once again
     
    Who's bad
     
    The word is out
     
    You're doin' wrong
     
    Gonna lock you up
     
    Before too long
     
    Your lyin' eyes
     
    Gonna tell you right
     
    So listen up
     
    Don't make a fight
     
    Your talk is cheap
     
    You're not a man
     
    Your throwin' stones
     
    To hide your hands
     
    Well they say the sky's the limit
     
    And to me that's really true
     
    But my friend you have seen nothin'
     
    Just wait till I get through
     
    Because I'm bad,I'm bad
     
    shamone
     
    (Bad,bad,really,really bad)
     
    You know I'm bad,I'm bad
     
    (Bad,bad,really,really bad)
     
    You know it
     
    You know I'm bad,I'm bad
     
    Come on,you know
     
    (Bad,bad,really,really bad)
     
    And the whole world
     
    Has to answer right now
     
    Just to tell you once again
     
    Who's bad
     
    We could change the world tomorrow
     
    This could be a better place
     
    If you don't like what I'm sayin'
     
    Then won't you slap my face
     
    Because I'm bad''')
    lyric.close()
    comment=open('lyric.txt','r')
    bad=comment.read()
    comment.close()
    
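    # Normalize case, then replace punctuation and newlines with spaces
    bad=bad.lower()
    for i in ",.?!()":
        bad=bad.replace(i,' ')
    bad=bad.replace('\n',' ')
    # split() with no argument also drops the empty strings left by repeated spaces
    words=bad.split()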
    s=set(words)
    
    # Grammatical words to exclude; discard() does not fail if a word is absent
    delete={"the","a","it","to","on","and"}
    for i in delete:
        s.discard(i)
        
    # Count how often each remaining word occurs in the full word list
    dic={}
    for i in s:
        if(i==" "):
            continue
        if(i==""):
            continue
        dic[i]=words.count(i)
    
    # Sort by count in descending order and print at most the top 20
    lis=list(dic.items())
    lis.sort(key=lambda x:x[1],reverse=True)
    for i in lis[:20]:
        print(i)
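
The counting loop above calls `words.count(i)` once per distinct word, and each call rescans the whole list. As a possible alternative, the standard library's `collections.Counter` counts everything in a single pass and provides `most_common` for the top-N step. The helper name `count_words` and the sample words below are hypothetical, chosen only for illustration.

```python
from collections import Counter

def count_words(words, stop_words):
    """Count words in one pass, ignoring stop words and empty strings."""
    return Counter(w for w in words if w and w not in stop_words)

# Made-up sample data standing in for the word list built from the file
words = "red blue red green red blue it".split()
counts = count_words(words, {"it"})

# most_common(n) returns the n most frequent (word, count) pairs
print(counts.most_common(2))
```

With the listing's variables, `count_words(words, delete).most_common(20)` would take the place of the set, dictionary, and sort steps.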
    

    Run:

  • Original article: https://www.cnblogs.com/mavenlon/p/7595133.html