  • 9.26 A complete English word-frequency counting example, implemented with a file

    # Write the string to be analyzed to a file
    fo=open('song.txt','w')
    fo.write('''You gotta go and get angry at all of my honesty
    You know I try but I don’t do too well with apologies
    I hope I don’t run out of time, could someone call a referee?
    Cause I just need one more shot at forgiveness
    I know you know that I made those mistakes maybe once or twice
    By once or twice I mean maybe a couple a hundred times
    So let me, oh let me redeem, oh redeem, oh myself tonight
    Cause I just need one more shot at second chances
    Yeah, is it too late now to say sorry?
    Cause I’m missing more than just your body
    Is it too late now to say sorry?
    Yeah I know that I let you down
    Is it too late to say I’m sorry now?
    I’m sorry, yeah
    Sorry, yeah
    Sorry
    Yeah I know that I let you down
    Is it too late to say sorry now?
    I’ll take every single piece of the blame if you want me to
    But you know that there is no innocent one in this game for two
    I’ll go, I’ll go and then you go, you go out and spill the truth
    Can we both say the words and forget this?
    Is it too late now to say sorry?
    Cause I’m missing more than just your body
    Is it too late now to say sorry?
    Yeah I know that I let you down
    Is it too late to say I’m sorry now?
    I’m not just trying to get you back on me
    Cause I’m missing more than just your body
    Is it too late now to say sorry?
    Yeah I know that I let you down
    Is it too late to say sorry now?
    I’m sorry, yeah
    Sorry, oh
    Sorry
    Yeah I know that I let you down
    Is it too late to say sorry now?
    I’m sorry, yeah
    Sorry, oh
    Sorry
    Yeah I know that I let you down
    Is it too late to say sorry now?''')
    fo.close()
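
The open/write/close sequence above can also be written with a `with` block, which closes the file even if an error occurs partway through. This is a minimal sketch, assuming UTF-8 is an acceptable encoding for the text (the post does not specify one); the helper names `save_text` and `load_text`, the file name `demo.txt`, and the sample string are hypothetical.

```python
import os
import tempfile

def save_text(path, text):
    """Write text to path; the with block closes the file automatically."""
    # Assumption: UTF-8, so characters such as the curly apostrophe ’ survive
    with open(path, 'w', encoding='utf-8') as f:
        f.write(text)

def load_text(path):
    """Read the whole file back as one string."""
    with open(path, 'r', encoding='utf-8') as f:
        return f.read()

if __name__ == '__main__':
    # Hypothetical file in the temporary directory, so nothing is overwritten
    path = os.path.join(tempfile.gettempdir(), 'demo.txt')
    save_text(path, 'It’s a test line\nand a second line')
    print(load_text(path))
```

Using the same explicit encoding for both writing and reading avoids depending on the platform default.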

    Split the text into words

    Build a count dictionary

    Exclude grammatical (function) words

    Sort

    Print the TOP 20

    The code:

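    # Read the text back from the file
    with open('song.txt', 'r') as fo:
        sorry = fo.read()
    sorry = sorry.lower()  # convert everything to lowercase
    # Function words to exclude from the count
    exc = {'to', 'it', 'and', 'that', 'of', 'or', 'you', 'i'}
    for i in ',?':
        sorry = sorry.replace(i, ' ')  # replace every , and ? with a space
    word = sorry.split()  # split on any whitespace, including newlines

    # Word-count dictionary
    dt = {}  # start with an empty dictionary
    keys = set(word)  # the distinct words become the keys
    keys = keys - exc  # drop the excluded words
    #print(keys)
    for i in keys:
        dt[i] = word.count(i)  # store how often each word occurs
    #print(dt)

    wc = list(dt.items())  # convert the dictionary to a list of (word, count) pairs
    #print(wc)

    wc.sort(key=lambda x: x[1], reverse=True)  # sort by count, highest first
    #print(wc)

    for i in range(min(20, len(wc))):  # print the top 20 (or fewer, if there are fewer)
        print(wc[i])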

    Result:
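
The splitting, counting, filtering, and sorting steps can also be sketched with the standard library's `collections.Counter` and a regular expression. This is an alternative sketch, not the approach the post itself uses. The function name `top_words`, the `STOP_WORDS` constant, and the sample sentence are hypothetical, and the regex assumes that a word is a run of letters that may contain straight or curly apostrophes.

```python
import re
from collections import Counter

# Hypothetical stop-word set, mirroring the exc set used in the post
STOP_WORDS = {'to', 'it', 'and', 'that', 'of', 'or', 'you', 'i'}

def top_words(text, n=20, stop_words=STOP_WORDS):
    """Return up to n (word, count) pairs, most frequent first."""
    # Assumption: a word is a run of letters, possibly with ' or ’ inside
    words = re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower())
    # Count every word that is not in the stop-word set
    counts = Counter(w for w in words if w not in stop_words)
    return counts.most_common(n)

if __name__ == '__main__':
    # Hypothetical sample text; in the post the text would come from song.txt
    sample = "the cat and the dog saw the cat"
    for pair in top_words(sample, n=2):
        print(pair)
```

`Counter` counts all words in a single pass, whereas calling `word.count(i)` for every key rescans the whole list each time. `most_common(n)` also returns fewer than `n` pairs when there are fewer distinct words, so no index check is needed.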

  • Original article: https://www.cnblogs.com/liminghui3/p/7595152.html