  • English word frequency count

    song = '''
    Can't believe its over
    That you're leaving
    Weren't we meant to be?
    Should've sensed the danger
    Read the warnings
    Right there in front of me
    Just stop
    Lets start it over
    Couldn't I get one more try?
    All:
    Maybe tomorrow you'll say that you're mine
    You'll realize, I could change
    I'm gonna show you I'm in it for life
    I'll get you back someday
    Maybe tomorrow
    Shane:
    I forgot to be there
    I was selfish
    I can see that now
    Mark:
    I should've got to known you
    Should've held you
    When your tears fell down
    Just stop
    Don't make me beg you
    Tell me that you'll stay the night
    All:
    Maybe tomorrow you'll say that you're mine
    You'll realize, I could change
    I'm gonna show you I'm in it for life
    I'll get you back someday
    I will find a way
    Nicky:
    Wait a minute
    Just hear me out
    This time I promise, I'll put you first
    Shane:
    Turn around now
    Your heart can't let you walk away
    I'll do what it takes
    All:
    Maybe tomorrow you'll say that you're mine
    You'll realize (realize), I could change (I can change)
    I'm gonna show you I'm in it for life
    I'll get you back someday
    Maybe tomorrow
    Kian:
    There's so much I wanna say now
    I just wanna make a life with you (don't walk away)
    There's so much I wanna do now
    I just wanna make love to you
    Shane:
    Maybe tomorrow
    '''
    
    UnusefulWords = ['on', 'was', 'I', 'i', 'at']  # stop words to exclude from the count
    UnusefulSymbol = [".", "'", "(", ")"]  # punctuation to replace with spaces
    
    NewWords = song
    for i in range(len(UnusefulSymbol)):
        NewWords = NewWords.replace(UnusefulSymbol[i], ' ')  # replace punctuation in the text with spaces
    NewWords = NewWords.lower()  # convert everything to lowercase
    
    WordsList = NewWords.split()  # split the string into individual words
    
    Count = {}
    
    for i in WordsList:
        Count[i] = WordsList.count(i)  # record each word and its number of occurrences in a dict
    
    for i in UnusefulWords:
        if Count.get(i) is not None:
            Count.pop(i)  # drop stop words from the tally
    
    CountWords = sorted(Count.items(), key=lambda x: x[1], reverse=True)
    
    for i in range(min(10, len(CountWords))):
        print(CountWords[i])  # print the 10 most frequent words
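
The same steps (normalize, split, count, drop stop words, take the top entries) can also be sketched with the standard library's `collections.Counter`. This is only an alternative under stated assumptions, not the original post's method: the helper name `top_words`, its parameters, and the `sample` string are made up for illustration, and unlike the listing above it keeps contractions such as "you'll" as single tokens.

```python
import re
from collections import Counter


def top_words(text, stop_words=(), n=10):
    """Return the n most common words in text, ignoring stop_words.

    Hypothetical helper for illustration; not part of the original post.
    """
    # Lowercase first, then keep runs of letters and apostrophes so that
    # contractions such as "you'll" stay together as one token.
    tokens = re.findall(r"[a-z']+", text.lower())
    # Strip stray leading/trailing apostrophes; empty tokens are skipped below.
    tokens = [t.strip("'") for t in tokens]
    stop = {w.lower() for w in stop_words}
    counts = Counter(t for t in tokens if t and t not in stop)
    return counts.most_common(n)


# Assumed sample input, taken from one line of the lyrics.
sample = "Maybe tomorrow you'll say that you're mine. Maybe tomorrow!"
print(top_words(sample, stop_words=["that"], n=2))
```

Calling `top_words(song, UnusefulWords)` would apply it to the full lyrics. Because `Counter` tallies whole tokens in a single pass, it avoids both the substring matches that `str.count` produces on the raw string and the repeated list scans of `WordsList.count`.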
  • Original article: https://www.cnblogs.com/abcdcd/p/8653634.html