  • A complete English word-frequency counting example using a file

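    1. Read in the string to be analyzed
    2. Split it to extract the words
    3. Count the words in a dictionary
    4. Exclude grammatical (function) words
    5. Sort
    6. Output the TOP(20)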
    >>> fo=open('test.txt','w')
    >>> fo.write('''Twinkle Twinkle Little Star
      (Declan's Prayer) - Declan Galbraith
    
      Twinkle twinkle little star,
      How I wonder what you are,
      Up above the world so high,
      Like a diamond in the sky,
      Star light,
      Star bright,
      The first star I see tonight,
      I wish I may, I wish I might,
      Have the wish I wish tonight,
    
      Twinkle twinkle little star,
      How I wonder what you are,
      I have so many wishes to make,
      But most of all is what I state,
      So just wonder,
      That I've been dreaming of,
      I wish that I can have owe her enough,
      I wish I may, I wish I might,
      Have the dream I dream tonight,
    
      Ooo baby
    
      Twinkle twinkle little star,
      How I wonder what you are,
      I want a girl who'll be all mine,
      And wants to say that I'm her guy,
      Someone's sweet that's for sure,
      I want to be the one shes looking for,
      I wish I may, I wish I might,
      Have the girl I wish tonight,
    
      Ooo baby
    
      Twinkle twinkle little star,
      How I wonder what you are,
      Up above the world so high,
      Like a diamond in the sky,
      Star light,
      Star bright,
      The first star I see tonight,
      I wish I may, I wish I might,
      Have the wish I wish tonight.''')
    1138
    >>> fo.close()
    >>> fr=open('test.txt','r')
    >>> fr.read()
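    # Step 1: read the text to analyze back from the file
    with open('test.txt','r') as fo:
        song=fo.read()
    # Step 4 (prepared early): grammatical words to leave out of the count
    exc={'the','in','to','a','of','and','on','what','that'}
    song=song.lower()
    # Step 2: strip punctuation, then split into words.
    # split() with no argument splits on any whitespace (spaces, '\n', '\t',
    # '\u3000'), so words on adjacent lines are not glued together.
    for i in '''.,-'()"''':
        song=song.replace(i,'')
    words=song.split()
    # Step 3: count each distinct word that is not in the excluded set
    dic={}
    keys=set(words)
    keys=keys-exc
    for w in keys:
        dic[w]=words.count(w)

    # Step 5: sort by count, highest first
    wc = list(dic.items())
    wc.sort(key=lambda x:x[1],reverse=True)
    print(wc)
    # Step 6: print the top 20 (or fewer, if there are fewer distinct words)
    for w in wc[:20]:
        print(w)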
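
The six steps can also be written more compactly with the standard library. The following is only a sketch, not the original author's method: the function name `top_words` and the constant `STOP_WORDS` are made up for illustration, and the regular expression keeps apostrophes inside words (so "i've" stays one token), which differs from the script above, where apostrophes are deleted.

```python
import re
from collections import Counter

# Hypothetical constant: the same excluded words as in the script above.
STOP_WORDS = {'the', 'in', 'to', 'a', 'of', 'and', 'on', 'what', 'that'}

def top_words(text, stop_words=STOP_WORDS, n=20):
    """Return up to n (word, count) pairs, most frequent first."""
    # Steps 1-2: lowercase the text and extract words; a word is a run of
    # letters, optionally joined by apostrophes (assumption of this sketch).
    words = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    # Steps 3-4: count every word that is not in the excluded set.
    counts = Counter(w for w in words if w not in stop_words)
    # Steps 5-6: most_common sorts by count and keeps the first n entries.
    return counts.most_common(n)

if __name__ == '__main__':
    sample = "Twinkle twinkle little star, how I wonder what you are"
    for word, count in top_words(sample, n=3):
        print(word, count)
```

To run it on the file written earlier, pass the file contents, for example `top_words(open('test.txt').read())`. `Counter` counts all words in one pass, whereas calling `words.count(w)` for each distinct word rescans the whole list every time.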

     

  • Original article: https://www.cnblogs.com/lintingting/p/7595150.html