  • Python tools: wordcloud

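    Generating word clouds

    Install the wordcloud module (the -i option points pip at the Tsinghua PyPI mirror; omit it to use the default index)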

    pip install -i https://pypi.tuna.tsinghua.edu.cn/simple/ wordcloud

    Build a word cloud from a single repeated word

    import numpy as np
    from wordcloud import WordCloud
    
    text = "square"
    x, y = np.ogrid[:300, :300]
    
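    # Circular mask: pixels outside the radius-130 circle become 255 (white), which wordcloud treats as masked out
    mask = (x - 150) ** 2 + (y - 150) ** 2 > 130 ** 2
    mask = 255 * mask.astype(int)
    
    # repeat=True keeps reusing the word until the available space is filled
    wc = WordCloud(background_color="white", repeat=True, mask=mask)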
    wc.generate(text)
    wc.to_file('wc.png')

    Generate a word cloud from a single sentence

    from wordcloud import WordCloud
    wc = WordCloud()    # create the word cloud object
    wc.generate('This is not the end. It is not even the beginning of the end. But it is, perhaps, the end of the beginning.')    # generate the word cloud
    wc.to_file('wc.png')    # save the word cloud

    Generate a word cloud from a txt file

    import os
    
    from os import path
    from wordcloud import WordCloud
    import matplotlib.pyplot as plt
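    # Use the script's directory, or the current working directory in an interactive session
    d = path.dirname(__file__) if "__file__" in locals() else os.getcwd()
    with open(path.join(d, 'test.txt')) as f:
        text = f.read()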
    
    wordcloud = WordCloud(max_font_size=40).generate(text)
    plt.figure()
    plt.imshow(wordcloud, interpolation="bilinear")
    plt.axis("off")
    plt.show()

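    Generating a word cloud file takes three steps:

       1. Configure the object's parameters

       2. Load the text for the word cloud

       3. Output the word cloud file (unless specified otherwise, the default image size is 400 * 200)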

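    wordcloud computes word frequencies in the following steps:

    1. Split: separate the words on whitespace

    2. Count: tally how often each word occurs and filter the result (for example, dropping stop words)

    3. Font: assign font sizes according to the counts

    4. Layout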
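
    The first three steps above can be sketched with the standard library alone. This is a simplified illustration, not wordcloud's own implementation: the helper names `word_frequencies` and `font_sizes`, the tiny stop-word set, and the linear size scaling are all assumptions made for the example, and the layout step is left out.

```python
import string
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not wordcloud's built-in set)
STOPWORDS = {"the", "is", "it", "of", "not", "but", "this", "even"}

def word_frequencies(text, stopwords=STOPWORDS):
    # 1. Split: break on whitespace and strip surrounding punctuation
    words = (w.strip(string.punctuation) for w in text.lower().split())
    # 2. Count: tally occurrences, filtering empty strings and stop words
    return Counter(w for w in words if w and w not in stopwords)

def font_sizes(counts, min_size=10, max_size=40):
    # 3. Font: scale each word linearly between min_size and max_size
    top = max(counts.values())
    return {w: round(min_size + (max_size - min_size) * c / top)
            for w, c in counts.items()}

text = ("This is not the end. It is not even the beginning of the end. "
        "But it is, perhaps, the end of the beginning.")
counts = word_frequencies(text)
sizes = font_sizes(counts)
print(counts.most_common())
print(sizes)
```

    With the real library, the count-and-filter stage is available through `WordCloud.process_text`, and a ready-made frequency dictionary can be passed to `WordCloud.generate_from_frequencies`.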

    Common parameters

     Example:

    import os
    
    from os import path
    from wordcloud import WordCloud
    import matplotlib.pyplot as plt
    
    d = path.dirname(__file__) if "__file__" in locals() else os.getcwd()
    text = open(path.join(d, 'test.txt')).read()
    text = text.lower()
    # white background, 800 x 660 pixel canvas
    wordcloud = WordCloud(background_color="white", width=800, height=660).generate(text)
    
    plt.imshow(wordcloud)
    plt.axis("off")
    plt.show()
    wordcloud.to_file('test.png')    # save the word cloud to a file
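
    As a rough sketch of the constructor parameters that tend to be adjusted most often: every key below is a `WordCloud` keyword argument, but the chosen values and the names `COMMON_OPTIONS` and `make_wordcloud` are illustrative assumptions rather than recommendations from the library.

```python
# Illustrative values only; the keys are WordCloud constructor arguments.
COMMON_OPTIONS = {
    "width": 800,                 # canvas width in pixels (library default: 400)
    "height": 660,                # canvas height in pixels (library default: 200)
    "background_color": "white",  # library default: "black"
    "max_words": 100,             # upper bound on words drawn (library default: 200)
    "min_font_size": 4,           # smallest font size used
    "max_font_size": 40,          # largest font size used (default: derived from the image height)
    "stopwords": {"the", "is", "it", "of"},  # words to drop; None falls back to the built-in list
    "random_state": 42,           # fixed seed so the layout is reproducible
}

def make_wordcloud(text, **overrides):
    """Hypothetical helper: build a WordCloud from COMMON_OPTIONS plus overrides."""
    # Imported lazily so the option table can be inspected without the package installed
    from wordcloud import WordCloud
    options = {**COMMON_OPTIONS, **overrides}
    return WordCloud(**options).generate(text)
```

    A call such as `make_wordcloud(text, max_words=50).to_file('test.png')` would then override a single option and save the result. Text in non-Latin scripts additionally needs `font_path` set to a font that contains the required glyphs.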

     

     Getting test.txt

    Link: https://pan.baidu.com/s/1zfuK9-W5tyq1P8ftlQJuJQ
    Extraction code: iet4

    Further reading: http://amueller.github.io/word_cloud/

        https://github.com/amueller/word_cloud

  • Original article: https://www.cnblogs.com/baby123/p/13024713.html