  • Word frequency count

    # -*- coding: UTF-8 -*-
    
    str = '''Gotta Have You  (The Weepies)
    Gray, quiet and tired and mean
    Picking at a worried seam
    I try to make you mad at me over the phone
    Red eyes and fire and signs
    I'm taken by a nursery rhyme
    I want to make a ray of sunshine and never leave home
    No amount of coffee
    No amount of crying
    No amount of whiskey
    No amount of wine
    No, nothing else will do
    I've gotta have you
    I've gotta have you
    The road gets cold
    There's no spring in the middle this year
    I'm the new chicken clucking open hearts and ears
    Oh, such a prima Donna
    Sorry for myself
    But green, it is also summer
    And I won't be warm till I'm lying in your arms
    No amount of coffee
    No amount of crying
    No amount of whiskey
    No amount of wine
    No, nothing else will do
    I've gotta have you
    I've gotta have you
    I see it all through a telescope:
    Guitar, suitcase, and a warm coat
    Lying in the back of the blue boat
    Humming a tune...
    No amount of coffee
    No amount of crying
    No amount of whiskey
    No wine
    No, nothing else will do
    I've gotta have you
    I've gotta have you
    No amount of coffee
    No amount of crying
    No amount of whiskey
    No amount of wine
    No, nothing else will do
    I've gotta have you ~~
    gotta have you
    I've gotta have you'''
    
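    # Punctuation to strip. The apostrophe is included, so "I've" splits into "I" and "ve".
    n = '''~,.:'"?!;)('''
    # Common words left out of the result.
    exclude = {'the','a','of','to','for','in','and','no','i','you','ve','have'}
    # Note: the name str shadows the built-in type; it is kept to match the variable defined above.
    for t in n:
        str = str.replace(t, ' ')
    
    wordList = str.lower().split()
    
    # Count each occurrence exactly once.
    wordDict = {}
    for w in wordList:
        wordDict[w] = wordDict.get(w, 0) + 1
    
    # pop() with a default avoids a KeyError if an excluded word never occurs.
    for w in exclude:
        wordDict.pop(w, None)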
    
    dictList = list(wordDict.items())
    dictList.sort(key=lambda x: x[1], reverse=True)
    
    
    for w in dictList:
        print(w)

    # Read a text file and print its contents; the with block closes the file automatically.
    with open('C:UsersAdministratorPycharmProjectsdso.txt', 'r') as f:
        text = f.read()
    
    print(text)
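
The counting steps above (strip punctuation, lowercase, split, count, drop excluded words, sort by frequency) can also be sketched with the standard library's `collections.Counter`. This is only a minimal sketch under some assumptions: the helper name `top_words`, its parameters, and the `sample` string are illustrative and do not come from the original post, and a regular expression stands in for the character-by-character `replace` loop.

```python
import re
from collections import Counter


# Hypothetical helper: the name and parameters are illustrative only.
def top_words(text, exclude=frozenset(), n=10):
    """Return the n most frequent words in text, skipping words in exclude."""
    # Lowercase, then extract runs of ASCII letters. Punctuation and
    # apostrophes act as separators, so "I've" yields "i" and "ve".
    words = re.findall(r"[a-z]+", text.lower())
    # Count only the words that are not excluded.
    counts = Counter(w for w in words if w not in exclude)
    # most_common() returns (word, count) pairs sorted by descending count.
    return counts.most_common(n)


# Illustrative input, shortened from the lyrics used above.
sample = "No amount of coffee, no amount of crying. I've gotta have you!"
print(top_words(sample, exclude={'no', 'of', 'i', 've', 'have', 'you'}, n=3))
```

The same helper could be applied to the `text` read from a file, for example `top_words(text, exclude, 20)`. Filtering before counting means no `pop` calls are needed afterwards.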
  • Original article: https://www.cnblogs.com/0056a/p/8649901.html