  • English Word Frequency Count

    # Inline sample text, for testing without news.txt:
    # news='''A special variant of the Code Completion
    #      feature invoked by pressing Ctrl twice
    #      allows you to complete the name of any class no matter
    #      if it was imported in the current file or not. If the class
    #      is not imported yet, the import statement is generated automatically.'''
    # Read the whole text; the with-block closes the file automatically.
    with open('news.txt','r') as f:
        news=f.read()
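    # Punctuation to treat as word separators, and stop words to leave out of the count.
    sep=''',.?'":!'''
    exclude={'the','and','a','not'}
    for c in sep:
        news=news.replace(c,' ')
    
    # Normalize case and split on whitespace.
    wordList=news.lower().split()
    wordDict={}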
    # Alternative: count every word in one pass, then drop the excluded words.
    # for w in wordList:
    #     wordDict[w]=wordDict.get(w,0)+1
    # for w in exclude:
    #     wordDict.pop(w,None)
    wordSet=set(wordList)-exclude
    for w in wordSet:
        wordDict[w]=wordList.count(w)
    
    dictList=list(wordDict.items())
    dictList.sort(key=lambda x:x[1],reverse=True)
    # for w in wordDict:
    #     print(w,wordDict[w])
    #print(dictList)
    
    # Print the 20 most frequent words (fewer if the text has fewer distinct words).
    for item in dictList[:20]:
        print(item)
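
The counting and sorting steps above could also be done with `collections.Counter` from the standard library, which counts in a single pass instead of calling `wordList.count(w)` once per distinct word. The following is only a minimal sketch: the function name `top_words` and its parameters are hypothetical and not part of the original post, while the separator characters and the excluded words are copied from it.

```python
from collections import Counter

# Separator characters and stop words taken from the script above.
SEP = ''',.?'":!'''
EXCLUDE = frozenset({'the', 'and', 'a', 'not'})

def top_words(text, exclude=EXCLUDE, n=20):
    """Return up to n (word, count) pairs, most frequent first.

    Hypothetical helper for illustration; not from the original post.
    """
    # Replace each separator with a space so it splits words apart.
    for c in SEP:
        text = text.replace(c, ' ')
    # Lowercase, split on whitespace, and skip the excluded words.
    words = [w for w in text.lower().split() if w not in exclude]
    # Counter tallies in one pass; most_common sorts by count, descending.
    return Counter(words).most_common(n)

if __name__ == '__main__':
    sample = "The cat and the dog. The dog is not a cat! Dog?"
    for pair in top_words(sample, n=2):
        print(pair)
```

Because `most_common(n)` simply returns fewer pairs when there are fewer than `n` distinct words, this version does not need a separate length check before printing.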

  • Original article: https://www.cnblogs.com/Runka/p/8660044.html