  • Word frequency count

    # Read the whole input text; the with-block closes the file automatically
    with open('ghost.txt','r') as f:
        news=f.read()
    
    # Punctuation to strip, and common words to leave out of the count
    sep=''':;""'',.?!-'''
    exclude={'the','me','you','my','of','to','in','will','for','and','from'}
    
    for c in sep:
        news=news.replace(c,'')
    
    wordList=news.lower().split()
    wordDict={}
    
    
    # Alternative: count in a single pass (this version does not skip the excluded words)
    # for i in wordList:
    #     wordDict[i]=wordDict.get(i,0)+1
    
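    # Count each distinct word, ignoring the excluded ones
    wordSet=set(wordList) - exclude
    for i in wordSet:
        wordDict[i]=wordList.count(i)
    
    # Sort the (word, count) pairs by count, highest first
    dictList=list(wordDict.items())
    dictList.sort(key=lambda x:x[1],reverse=True)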
    
    # Append the 20 most frequent words (fewer if the text has fewer distinct words)
    with open('result.txt','a') as f:
        for word,count in dictList[:20]:
            f.write(word+' '+str(count)+'\n')

    PS. This is the result after analysis.
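
For comparison, the same counting step can be sketched with `collections.Counter` from the standard library. This is only one possible variant, not the original author's method: the function name `top_words`, its parameter `n`, and the constants `SEP` and `EXCLUDE` are made up for this illustration, with the punctuation and excluded words copied from the script above. It takes the text as a string instead of reading `ghost.txt`, so it can be tried without any files.

```python
from collections import Counter

# Punctuation and excluded words copied from the script above.
SEP = ''':;""'',.?!-'''
EXCLUDE = {'the','me','you','my','of','to','in','will','for','and','from'}

def top_words(text, n=20):
    """Return up to n (word, count) pairs, most frequent first (hypothetical helper)."""
    # Strip punctuation the same way the original script does
    for c in SEP:
        text = text.replace(c, '')
    # Lowercase, split on whitespace, and drop the excluded words
    words = [w for w in text.lower().split() if w not in EXCLUDE]
    # Counter tallies in one pass; most_common sorts by count, highest first
    return Counter(words).most_common(n)

if __name__ == '__main__':
    sample = "The ghost, the ghost! You and me saw the ghost in my house."
    for word, count in top_words(sample, 3):
        print(word, count)
```

Counting in a single pass avoids calling `wordList.count(i)` once per distinct word, which matters on longer texts. Words with equal counts may come out in a different order than in the set-based version.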

  • Original post: https://www.cnblogs.com/polvem/p/8658540.html