  • Word frequency count

    # -*- coding: UTF-8 -*-
    
    # Named `lyrics` rather than `str` so the built-in str type is not shadowed.
    lyrics = '''Gotta Have You  (The Weepies)
    Gray, quiet and tired and mean
    Picking at a worried seam
    I try to make you mad at me over the phone
    Red eyes and fire and signs
    I'm taken by a nursery rhyme
    I want to make a ray of sunshine and never leave home
    No amount of coffee
    No amount of crying
    No amount of whiskey
    No amount of wine
    No, nothing else will do
    I've gotta have you
    I've gotta have you
    The road gets cold
    There's no spring in the middle this year
    I'm the new chicken clucking open hearts and ears
    Oh, such a prima Donna
    Sorry for myself
    But green, it is also summer
    And I won't be warm till I'm lying in your arms
    No amount of coffee
    No amount of crying
    No amount of whiskey
    No amount of wine
    No, nothing else will do
    I've gotta have you
    I've gotta have you
    I see it all through a telescope:
    Guitar, suitcase, and a warm coat
    Lying in the back of the blue boat
    Humming a tune...
    No amount of coffee
    No amount of crying
    No amount of whiskey
    No wine
    No, nothing else will do
    I've gotta have you
    I've gotta have you
    No amount of coffee
    No amount of crying
    No amount of whiskey
    No amount of wine
    No, nothing else will do
    I've gotta have you ~~
    gotta have you
    I've gotta have you'''

    # Punctuation to replace with spaces before splitting into words.
    # ':' is included so that "telescope:" is counted as "telescope".
    punctuation = '''~,.:'"?!;)('''
    # Stop words left out of the final ranking.
    exclude = {'the','a','of','to','for','in','and','no','i','you','ve','have'}
    for t in punctuation:
        lyrics = lyrics.replace(t, ' ')

    wordList = lyrics.lower().split()
    
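    # Count each word exactly once per occurrence. dict.get supplies 0 for a
    # word not seen yet. (Looping over set(wordList) and calling
    # wordList.count(w) gives the same totals; running both doubles every count.)
    wordDict = {}
    for w in wordList:
        wordDict[w] = wordDict.get(w, 0) + 1

    # Drop the stop words; the None default avoids a KeyError if one is absent.
    for w in exclude:
        wordDict.pop(w, None)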
    
    # Sort the (word, count) pairs by count, most frequent first.
    dictList = list(wordDict.items())
    dictList.sort(key=lambda x: x[1], reverse=True)

    for w in dictList:
        print(w)

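    # Read a text file. The raw string keeps the backslashes in the Windows
    # path literal, and the with-block closes the file automatically.
    with open(r'C:\Users\Administrator\PycharmProjects\dso.txt', 'r') as f:
        text = f.read()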
    
    print(text)
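
The counting steps in the script above can also be written with the standard library's `collections.Counter`. The following is only a sketch of that alternative, not the original author's method. The helper name `word_frequencies` and the `sample` string are made up for illustration, and `STOP_WORDS` copies the `exclude` set from the script. It also tokenizes with a regular expression, which is a swapped-in technique: only runs of ASCII letters are kept, so digits and accented characters would be dropped.

```python
import re
from collections import Counter

# Assumption: same stop words as the `exclude` set in the script above.
STOP_WORDS = {'the', 'a', 'of', 'to', 'for', 'in', 'and', 'no', 'i', 'you', 've', 'have'}


def word_frequencies(text, stop_words=STOP_WORDS):
    """Return a Counter of the lowercase words in `text`, skipping stop words.

    Hypothetical helper, named here only for illustration.
    """
    # [a-z]+ keeps runs of ASCII letters, so "I've" becomes "i" and "ve",
    # the same effect as replacing the apostrophe with a space.
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in stop_words)


# Small made-up sample taken from a few lines of the lyrics.
sample = "No amount of coffee\nNo amount of crying\nI've gotta have you"

# most_common(n) returns the n highest counts, already sorted in descending order.
for word, count in word_frequencies(sample).most_common(3):
    print(word, count)
```

The same function could be applied to the contents of a file by passing in the string returned by `f.read()`.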
  • Original article: https://www.cnblogs.com/0056a/p/8649901.html