  • Counting character appearances in Romance of the Three Kingdoms (《三国演义》) and building a word cloud

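    [Source of the text]:

    I found the text of Romance of the Three Kingdoms through a Baidu search, downloaded it, and saved it into a txt file that I created locally. Note: save the file with encoding=utf-8.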
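Since everything downstream depends on the file really being UTF-8, a quick check before segmenting can save some confusion. The following is a minimal sketch, not part of the original script; the helper name `read_utf8` is hypothetical. A failed decode is a strong hint that the file was saved in another encoding, although a successful decode does not prove the opposite.

```python
from pathlib import Path


def read_utf8(path):
    """Return the file's text, or raise a clear error if it is not valid UTF-8.

    `read_utf8` is a hypothetical helper name used only for this sketch.
    """
    data = Path(path).read_bytes()
    try:
        # "utf-8-sig" decodes UTF-8 and also drops a leading BOM if present.
        return data.decode("utf-8-sig")
    except UnicodeDecodeError as exc:
        raise ValueError(
            f"{path} is not valid UTF-8; re-save it with encoding=utf-8"
        ) from exc


# Usage (assumes the file name used in this post):
# txt = read_utf8("threekingdom.txt")
```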

    [Source code]:

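    import jieba

    # Frequent words that are not character names; dropped from the ranking
    excludes = {"来到","人马","领兵","将军","却说","荆州","二人","不可","不能","如此"}
    # Read the file as text; it was saved as UTF-8
    with open("threekingdom.txt", "r", encoding="utf-8") as f:
        txt = f.read()
    words = jieba.lcut(txt)
    counts = {}
    for word in words:
        if len(word) == 1:
            # Skip single characters and punctuation
            continue
        # Merge different names for the same person into one key
        elif word == "诸葛亮" or word == "孔明曰":
            rword = "孔明"
        elif word == "关公" or word == "云长":
            rword = "关羽"
        elif word == "玄德" or word == "玄德曰":
            rword = "刘备"
        elif word == "孟德" or word == "丞相":
            rword = "曹操"
        else:
            rword = word
        counts[rword] = counts.get(rword, 0) + 1
    for word in excludes:
        counts.pop(word, None)  # no KeyError if the word never appeared
    items = list(counts.items())
    items.sort(key=lambda x: x[1], reverse=True)
    # Slicing avoids an IndexError when there are fewer than 55 entries
    for word, count in items[:55]:
        print("{0:<10}{1:>5}".format(word, count))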

    The output is shown below:
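As a variation on the counting script above, the chain of `elif` branches can be expressed as a lookup table, with `collections.Counter` doing the tallying and sorting. This is only a sketch of one possible restructuring: `ALIASES` and `count_names` are hypothetical names, and the function takes an already segmented word list (for example the result of `jieba.lcut`) so that it can be tried without the novel at hand.

```python
from collections import Counter

# Alias table copied from the elif chain in the script above.
ALIASES = {
    "诸葛亮": "孔明", "孔明曰": "孔明",
    "关公": "关羽", "云长": "关羽",
    "玄德": "刘备", "玄德曰": "刘备",
    "孟德": "曹操", "丞相": "曹操",
}


def count_names(words, excludes=(), top_n=None):
    """Count multi-character words, merging aliases and dropping excludes.

    Returns a list of (word, count) pairs, most frequent first.
    """
    counts = Counter()
    for word in words:
        if len(word) == 1:
            continue  # single characters and punctuation
        counts[ALIASES.get(word, word)] += 1
    for word in excludes:
        counts.pop(word, None)  # ignore words that never appeared
    return counts.most_common(top_n)


# Usage with a tiny hand-made word list:
sample = ["孔明", "诸葛亮", "曰", "玄德", "来到", "孔明曰"]
for word, count in count_names(sample, excludes={"来到"}):
    print("{0:<10}{1:>5}".format(word, count))
```

Adding another alias then means adding one entry to the table rather than another branch.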

    Building the word cloud:

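    import jieba
    import wordcloud

    # Read the file as text; it was saved as UTF-8
    with open("threekingdom.txt", "r", encoding="utf-8") as f:
        t = f.read()
    ls = jieba.lcut(t)
    # WordCloud splits on whitespace, so join the segmented words with spaces
    txt = " ".join(ls)
    # font_path must point to a font that contains Chinese glyphs
    w = wordcloud.WordCloud(font_path="NotoSerifCJK-Bold.ttc",
                            width=1000, height=700, background_color="white")

    w.generate(txt)
    w.to_file("gr.png")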

    The result looks like this:
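The word cloud script above feeds the whole segmented text to `generate`, so the alias merging and the exclusion list from the counting script play no part in the picture. One possible alternative, sketched here under that assumption, is to pass the counts to `WordCloud.generate_from_frequencies`. The helper names `to_frequencies` and `render_cloud` are hypothetical; the font file and image size are the ones used in this post.

```python
def to_frequencies(pairs, min_count=1):
    """Turn (word, count) pairs into the word -> count dict wordcloud accepts."""
    return {word: count for word, count in pairs if count >= min_count}


def render_cloud(frequencies, font_path, out_path):
    """Draw a word cloud from a frequency dict and save it as an image."""
    # Imported here so to_frequencies can be used without wordcloud installed.
    import wordcloud

    w = wordcloud.WordCloud(font_path=font_path, width=1000, height=700,
                            background_color="white")
    w.generate_from_frequencies(frequencies)
    w.to_file(out_path)


# Usage, with `items` being the sorted (word, count) list from the counting script:
# render_cloud(to_frequencies(items[:55]), "NotoSerifCJK-Bold.ttc", "gr_names.png")
```

With this approach the size of each name in the image follows the merged counts rather than the raw token frequencies.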

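    Next, a few words about the problems I ran into along the way:

    The biggest problem at the start was installing the various libraries. I put a huge amount of effort into it and still had not figured it out after several days. Only after asking my classmates did some of the problems finally get resolved. (Special thanks to Li Tuo (李拓) and Chai Yichen (柴易晨)!)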
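For installation trouble of this kind, it can help to first confirm which libraries the current interpreter can actually see. The following is a small sketch using only the standard library; `missing_packages` and `install_hint` are hypothetical helper names, and the suggested command assumes the PyPI package names match the import names, which is the case for jieba and wordcloud.

```python
import importlib.util
import sys


def missing_packages(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def install_hint(names):
    """Build a pip command for whichever of the given packages are missing."""
    missing = missing_packages(names)
    if not missing:
        return "all packages are importable"
    # Using sys.executable ties pip to the interpreter that runs this script.
    return f"{sys.executable} -m pip install " + " ".join(missing)


print(install_hint(["jieba", "wordcloud"]))
```

Running pip through the same interpreter avoids the common situation where a package is installed for one Python installation while the script runs under another.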

    If there are other shortcomings, I would welcome corrections from the instructor and my classmates. Thank you all!

  • Original post: https://www.cnblogs.com/jxt123/p/12674328.html