  • A Python-based tag cloud

    step1: Install jieba and pytagcloud

    pip install jieba

    apt-get install python-pygame

    pip install simplejson

    pip install pytagcloud
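
A quick way to confirm that the installs above succeeded is to try importing each module. This is a minimal sketch; the helper name `missing_modules` is hypothetical and not part of any of these packages.

```python
import importlib


def missing_modules(names):
    """Return the module names from `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Import names of the packages installed in step1.
    print(missing_modules(["jieba", "pygame", "simplejson", "pytagcloud"]))
```

An empty list means every module imported cleanly.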

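    step2: Download a Chinese font file, for example simhei.ttf

    • Locate the font directory of the pytagcloud package (/usr/local/lib/python2.7/dist-packages/pytagcloud/fonts)
    • Copy the font file into it: cp simhei.ttf /usr/local/lib/python2.7/dist-packages/pytagcloud/fonts
    • Edit fonts.json (vim fonts.json) and add a SimHei entry at the top of the list, as shown below: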
    [
        {
            "name": "SimHei",
            "ttf": "simhei.ttf",
            "web": "none"
        },
        {
            "name": "Nobile",
            "ttf": "nobile.ttf",
            "web": "http://fonts.googleapis.com/css?family=Nobile"
        },
        {
            "name": "Old Standard TT",
            "ttf": "OldStandard-Regular.ttf",
            "web": "http://fonts.googleapis.com/css?family=Old+Standard+TT"
        },
        ...
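
The same fonts.json edit can be scripted instead of done by hand in vim. The sketch below assumes the file is a JSON list of objects with "name", "ttf" and "web" keys, as in the excerpt above; the function name `register_font` is hypothetical.

```python
import json


def register_font(fonts_json_path, name, ttf, web="none"):
    """Prepend a font entry to a pytagcloud-style fonts.json if it is absent.

    Returns True when the entry was added, False when it already existed.
    """
    with open(fonts_json_path, "r") as fp:
        fonts = json.load(fp)

    # Skip the write if a font with this name is already registered.
    if any(entry.get("name") == name for entry in fonts):
        return False

    fonts.insert(0, {"name": name, "ttf": ttf, "web": web})
    with open(fonts_json_path, "w") as fp:
        json.dump(fonts, fp, indent=4)
    return True
```

For the setup in this post the call would be register_font("/usr/local/lib/python2.7/dist-packages/pytagcloud/fonts/fonts.json", "SimHei", "simhei.ttf"), run with write permission on that directory.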

    step3: Crawl the text and save it to a file (step4 reads it from sent.txt)
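
The post does not show how the text was crawled, so the following is only one possible sketch using the standard library: fetch a page, strip the markup, and write the visible text to sent.txt. The names `TextExtractor`, `html_to_text` and `crawl_to_file` are hypothetical, and the page is assumed to be UTF-8 encoded.

```python
import io

try:
    from urllib.request import urlopen      # Python 3
    from html.parser import HTMLParser
except ImportError:
    from urllib2 import urlopen             # Python 2
    from HTMLParser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.chunks = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html):
    """Return the text nodes of an HTML string, one per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


def crawl_to_file(url, out_path="sent.txt", encoding="utf-8"):
    """Download `url` and save its visible text to `out_path` as UTF-8."""
    html = urlopen(url).read().decode(encoding, "ignore")
    with io.open(out_path, "w", encoding="utf-8") as fp:
        fp.write(html_to_text(html))
```

Calling crawl_to_file with the URL of the page to analyze produces the sent.txt file that step4 opens.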

    step4: Generate the tag cloud

    # -*- coding: utf-8 -*-
    from operator import itemgetter
    import webbrowser

    import jieba.analyse
    from pytagcloud import create_tag_image, make_tags

    # Read the crawled text as bytes; jieba decodes UTF-8 (or GBK) itself.
    with open('sent.txt', 'rb') as fp:
        content = fp.read()

    # Top 100 keywords as (word, TF-IDF weight) pairs.
    top = jieba.analyse.extract_tags(content, topK=100, withWeight=True)

    # The weights are fractional floats, so int(weight) alone would truncate
    # them (often to 0). Scale them up before converting to integer counts.
    tagcloud = {}
    for word, weight in top:
        tagcloud[word] = int(weight * 1000)
    print(tagcloud)

    # Largest weight first; make_tags maps each count to a font size.
    swd = sorted(tagcloud.items(), key=itemgetter(1), reverse=True)
    tags = make_tags(swd, minsize=20, maxsize=60)

    # fontname must match the "name" field added to fonts.json in step2.
    create_tag_image(tags, 'cloud_large.png', background=(0, 0, 0, 255),
                     size=(900, 600), fontname='SimHei')

    webbrowser.open('cloud_large.png')
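
An alternative to TF-IDF weights is to size the tags by raw word frequency, counting the tokens that jieba.cut yields. The sketch below is a plain-Python illustration of that idea; the helper `count_tokens` and its `min_len` filter (which drops single-character tokens such as punctuation and particles) are assumptions, not part of jieba or pytagcloud.

```python
from collections import Counter


def count_tokens(tokens, top_k=100, min_len=2):
    """Return the top_k (token, count) pairs, most frequent first.

    Tokens shorter than min_len after stripping whitespace are ignored.
    """
    counts = Counter(
        tok.strip() for tok in tokens if len(tok.strip()) >= min_len
    )
    return counts.most_common(top_k)
```

The result of count_tokens(jieba.cut(content)) is a list of (word, count) pairs, the same shape that make_tags receives in the script above.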

  • Original article: https://www.cnblogs.com/hee0624/p/5340396.html