  • Character appearance counts and word cloud for 《三国演义》 (Romance of the Three Kingdoms)

    [Source of the text]:

    I found the text of 《三国演义》 through Baidu, downloaded it, and saved it locally as a txt file of my own. Note: use encoding=utf-8.
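    A downloaded text is not always UTF-8 already, so the note about encoding can be made concrete. The following is a minimal sketch, not part of the original post: `read_text_utf8` and `save_as_utf8` are hypothetical helper names, and the `gb18030` fallback is an assumption about how such a file might have been saved.

```python
from pathlib import Path


def read_text_utf8(path, fallback="gb18030"):
    """Return the file's text, preferring UTF-8.

    The fallback codec is an assumption (a common encoding for Chinese
    text files); it is not mentioned in the original post.
    """
    raw = Path(path).read_bytes()
    try:
        # "utf-8-sig" also strips a leading byte-order mark if one is present.
        return raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        return raw.decode(fallback)


def save_as_utf8(text, path):
    """Write the text back to disk as UTF-8 so later scripts can rely on it."""
    Path(path).write_text(text, encoding="utf-8")
```

    Running `save_as_utf8(read_text_utf8("download.txt"), "threekingdom.txt")` once (where `download.txt` is a placeholder name) would leave a file that can then be opened with `encoding="utf-8"`.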

    [Source code]:

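    import jieba

    # Frequent words that are not character names and should not be ranked.
    excludes = {"来到","人马","领兵","将军","却说","荆州","二人","不可","不能","如此"}

    # Read the novel as UTF-8 text rather than raw bytes.
    with open("threekingdom.txt", "r", encoding="utf-8") as f:
        txt = f.read()

    words = jieba.lcut(txt)
    counts = {}
    for word in words:
        if len(word) == 1:
            # Skip single-character tokens.
            continue
        # Merge the different names of one character into a single key.
        elif word == "诸葛亮" or word == "孔明曰":
            rword = "孔明"
        elif word == "关公" or word == "云长":
            rword = "关羽"
        elif word == "玄德" or word == "玄德曰":
            rword = "刘备"
        elif word == "孟德" or word == "丞相":
            rword = "曹操"
        else:
            rword = word
        counts[rword] = counts.get(rword, 0) + 1

    # pop() avoids a KeyError when an excluded word never occurs in the text.
    for word in excludes:
        counts.pop(word, None)

    # Sort by count in descending order and print the top 55 entries.
    items = list(counts.items())
    items.sort(key=lambda x: x[1], reverse=True)
    for word, count in items[:55]:
        print("{0:<10}{1:>5}".format(word, count))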
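    As an alternative to the if/elif chain, the same counting can be sketched with a lookup table and `collections.Counter`. This is only an illustration under stated assumptions: `ALIASES` and `count_names` are hypothetical names, the alias table is copied from the script above, and the function takes an already segmented word list so that it runs without jieba.

```python
from collections import Counter

# Alias table copied from the if/elif chain: variant -> canonical name.
ALIASES = {
    "诸葛亮": "孔明", "孔明曰": "孔明",
    "关公": "关羽", "云长": "关羽",
    "玄德": "刘备", "玄德曰": "刘备",
    "孟德": "曹操", "丞相": "曹操",
}


def count_names(words, excludes=(), top_n=None):
    """Count multi-character words after merging aliases.

    Returns (word, count) pairs sorted by count, highest first.
    """
    counts = Counter(ALIASES.get(w, w) for w in words if len(w) > 1)
    for w in excludes:
        # pop() with a default does nothing if the word is absent.
        counts.pop(w, None)
    return counts.most_common(top_n)
```

    With the variables of the script above, the call would be `count_names(words, excludes, 55)`. Adding another alias then means adding one dictionary entry instead of another elif branch.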

    The output of the counting script is shown below:

    Making the word cloud:

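    import jieba
    import wordcloud

    # Read the novel as UTF-8 text.
    with open("threekingdom.txt", "r", encoding="utf-8") as f:
        t = f.read()

    # WordCloud splits its input on whitespace, so join the jieba tokens with spaces.
    ls = jieba.lcut(t)
    txt = " ".join(ls)

    # font_path must point to a font that contains Chinese glyphs.
    w = wordcloud.WordCloud(
        font_path="NotoSerifCJK-Bold.ttc",
        width=1000, height=700, background_color="white",
    )

    w.generate(txt)
    w.to_file("gr.png")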
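    The script above builds the cloud from every token in the novel, so it does not use the merged character counts from the first script. If the goal is a cloud sized by those counts, one possible approach is `WordCloud.generate_from_frequencies`. The sketch below is an assumption about how the two steps could be connected, not the original author's method; `to_frequencies` and `render_cloud` are hypothetical helpers.

```python
def to_frequencies(pairs, keep=None):
    """Turn (word, count) pairs into a dict, optionally keeping only chosen words."""
    return {w: c for w, c in pairs if keep is None or w in keep}


def render_cloud(frequencies, out_path, font_path="NotoSerifCJK-Bold.ttc"):
    """Draw a word cloud in which each word is sized by its count."""
    # Imported here so to_frequencies() can be used without wordcloud installed.
    import wordcloud

    w = wordcloud.WordCloud(
        font_path=font_path,
        width=1000, height=700, background_color="white",
    )
    w.generate_from_frequencies(frequencies)
    w.to_file(out_path)
```

    For example, `render_cloud(to_frequencies(items[:55]), "names.png")` would draw the top 55 entries of the sorted `items` list from the counting script; `names.png` is a placeholder file name.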

    The resulting word cloud (gr.png) looks like this:

    Now a few words about the problems I ran into along the way:

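    The biggest problem at the start was installing the various libraries. It took an enormous amount of effort, and after several days I still had not figured it out. Only after asking my classmates were some of the problems solved. (Special thanks to my classmates 李拓 and 柴易晨!)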
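    A quick way to see which libraries are still missing before running the scripts is sketched below. It is a minimal check under the assumption that the two third-party packages are importable as `jieba` and `wordcloud`, as in the code above; `missing_packages` is a hypothetical helper name.

```python
import importlib.util

# Third-party libraries imported by the two scripts in this post.
REQUIRED = ["jieba", "wordcloud"]


def missing_packages(names):
    """Return the top-level module names the current interpreter cannot find."""
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing:", ", ".join(missing))
        # Installing through "python -m pip" targets the interpreter in use.
        print("Try: python -m pip install " + " ".join(missing))
    else:
        print("All required libraries are importable.")
```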

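    If there are other shortcomings, I would be grateful for corrections from the instructor and my classmates. Thank you all!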

  • Original post: https://www.cnblogs.com/jxt123/p/12674328.html