  • Comprehensive exercise: English word frequency count

    Download the lyrics of an English song, or an English article.

    sing  = '''
    i'm just a little bit caught in the middle
    life is a maze and love is a riddle
    i don't know where to go
    can't do it alone
    i've tried but i don't know why
    slow it down make it stop
    or else my heart is going to pop
    cause its to much yea its alot
    to be something i'm not
    i'm a fool out of love
    cause i just can't get enough
    i'm just a little bit caught in the middle
    life is a maze and love is a riddle
    i don't know where to go
    can't do it alone
    i've tride but i don't know why
    i'm just a little girl lost in the moment
    i'm so scared but i don't show it
    i can't figure it out
    it's bringing me down
    i know i've got to let it go
    and just enjoy the show
    the sun is hot in the sky
    just like a giant spot light
    the people follow the signs
    and sicronise in time
    it's just no body knows
    they got to take it to the show
    i'm just a little bit caught in the middle
    life is a maze and love is a riddle
    i don't know where to go
    can't do it alone
    i've tried but i don't know why
    i'm just a little girl lost in the moment
    i'm so scared but i don't show it
    i can't figure it out
    it's bringing me down
    i know i've got to let it go
    and just enjoy the show
    just engoy the show
    i'm just a little bit caught in the middle
    life is a maze and love is a riddle
    i don't know where to go
    can't do it alone
    i've tride but i don't know why
    i'm just a little girl lost in the moment
    i'm so scared but i don't show it
    i can't figure it out
    it's bringing me down
    i know i've got to let it go
    and just enjoy the show
    just enjoy the show
    just enjoy the show
    i want my money back
    i want my money back
    i want my money back
    just enjoy the show
    i want my money back
    i want my money back
    i want my money back
    just enjoy the show 
    '''

    1. Replace all separators such as , . ? ! ' : with spaces

    # Replace every separator, and each line break, with a space
    newSing = sing
    for sep in ",.?!':\n":
        newSing = newSing.replace(sep, " ")
    print(newSing)
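
    As a possible alternative to chaining replace() calls, the same substitution can be done in one pass with a translation table. This is only a sketch: the names sample, separators, table and cleaned are made up for illustration, and the short sample string stands in for the full lyrics.

```python
# Hypothetical sample text; the exercise itself uses the full lyrics in `sing`.
sample = "i can't figure it out.\nit's bringing me down!"

# Build a table that maps each separator (and the newline) to a space.
separators = ",.?!':\n"
table = str.maketrans(separators, " " * len(separators))

# translate() applies every mapping in a single pass over the string.
cleaned = sample.translate(table)
print(cleaned)
```

    Note that, like the original code, this turns "can't" into the two tokens "can" and "t".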


    2. Convert all lowercase letters to uppercase

    newSmall = newSing.upper()
    print(newSmall)


    3. Build the word list

    # split() with no argument splits on any run of whitespace and drops empty strings
    listWord = newSing.split()
    print(listWord)
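
    Because step 1 replaces apostrophes with spaces, contractions such as "i'm" end up as two tokens. If keeping them whole is preferred, one option (a different technique from the one used above, shown only as a sketch) is to extract words with a regular expression. The names sample and words are illustrative.

```python
import re

# Hypothetical one-line sample; the exercise uses the full lyrics.
sample = "i'm so scared but i don't show it"

# Match runs of letters, optionally followed by an apostrophe and more letters,
# so that contractions stay together as a single token.
words = re.findall(r"[a-z]+(?:'[a-z]+)?", sample.lower())
print(words)
```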


    4. Count word frequencies

    DicWord = {}
    for word in listWord:
        # Membership can be tested on the dict directly; .keys() is not needed
        if word in DicWord:
            DicWord[word] += 1
        else:
            DicWord[word] = 1
    print(DicWord)
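
    The counting loop above can also be written with collections.Counter from the standard library. A minimal sketch, assuming a short hypothetical word list in place of the real listWord:

```python
from collections import Counter

# Hypothetical word list standing in for the `listWord` built in step 3.
listWord = "just enjoy the show just enjoy the show i want my money back".split()

# Counter is a dict subclass that tallies how often each item occurs.
counts = Counter(listWord)
print(counts)

# Looking up a word that never occurred returns 0 instead of raising KeyError.
print(counts["missing"])
```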


    5. Sort the words

    Dec = sorted(DicWord.keys())
    print(Dec)
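
    sorted(DicWord.keys()) returns only the words, in alphabetical order, without their counts. If the counts should be kept alongside, sorting the (word, count) pairs is one option. The small dictionary and the names by_word and by_count below are illustrative assumptions.

```python
# Hypothetical counts standing in for the real `DicWord`.
DicWord = {"show": 3, "money": 2, "enjoy": 3}

# Alphabetical order, keeping each word's count next to it.
by_word = sorted(DicWord.items())
print(by_word)

# Highest count first; ties are broken alphabetically by word.
by_count = sorted(DicWord.items(), key=lambda kv: (-kv[1], kv[0]))
print(by_count)
```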


    6. Exclude grammatical words: pronouns, articles, conjunctions

    vocabulary = ["a", "so", "the", "they", "is", "in", "to", "of", "i"]
    for word in vocabulary:
        # pop() with a default avoids a KeyError if the word is not in the text
        DicWord.pop(word, None)
    print(DicWord)
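
    Deleting entries changes the dictionary in place. A non-destructive variant builds a new, filtered dictionary instead, so the full counts remain available. This is a sketch with made-up sample counts; stop_words and filtered are illustrative names.

```python
# Hypothetical counts standing in for the real `DicWord`.
DicWord = {"i": 5, "show": 3, "the": 2, "money": 2}

# A set gives fast membership tests for the words to exclude.
stop_words = {"a", "so", "the", "they", "is", "in", "to", "of", "i"}

# Keep only the entries whose word is not in the exclusion set.
filtered = {word: count for word, count in DicWord.items() if word not in stop_words}
print(filtered)
```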


    7. Print the 10 most frequent words

    NewDicWord = sorted(DicWord.items(), key=lambda item: item[1], reverse=True)
    print(NewDicWord)

    # Slicing is safe even if fewer than 10 distinct words remain
    for item in NewDicWord[:10]:
        print(item)
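
    Steps 1 to 7 can be combined into one small function. The sketch below is not the code from this exercise: top_words and its parameters are hypothetical names, and it relies on Counter.most_common from the standard library for the final ranking.

```python
from collections import Counter


def top_words(text, stop_words, n=10):
    """Return the n most frequent words in text, ignoring stop_words."""
    # Map each separator to a space; split() below also handles line breaks.
    table = str.maketrans({ch: " " for ch in ",.?!':"})
    words = text.lower().translate(table).split()
    # Count only the words that are not in the exclusion set.
    counts = Counter(word for word in words if word not in stop_words)
    return counts.most_common(n)


# Hypothetical short sample; pass the full lyrics in practice.
sample = "just enjoy the show\njust enjoy the show\njust enjoy\njust"
result = top_words(sample, {"the"}, n=2)
print(result)
```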

    I did quite a lot here, and much of it came from looking things up online; in the end it also helped solidify my fundamentals.

  • Original article: https://www.cnblogs.com/qazwsx833/p/8619743.html