  • Profiling WeChat friends' signatures with itchat and wordcloud

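    While fetching the friend list, I noticed that the returned JSON also contains each friend's personal signature. On a whim, I grabbed everyone's signature to look at the high-frequency words, and made a word cloud as well.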

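    # coding:utf-8
    import itchat
    
    # Log in first
    itchat.login()
    
    # Fetch the friend list
    friends = itchat.get_friends(update=True)[0:]
    for i in friends:
        # Read the personal signature
        signature = i["Signature"]
        # Print inside the loop so every signature is shown, not just the last
        print(signature)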
    

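    This fetches every signature first.
    After printing them, you will find many fragments such as span, class, emoji, and emoji1f3c3, because the signatures contain emoji. These fragments need to be filtered out, so use a regular expression together with replace to strip them: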

    import re
    
    # Regex that removes leftover emoji codes, e.g. 1f3c3 from emoji1f3c3
    rep = re.compile(r"1f\d.+")
    for i in friends:
        # Read the personal signature and drop the tag words
        signature = i["Signature"].strip().replace("span", "").replace("class", "").replace("emoji", "")
        signature = rep.sub("", signature)
        print(signature)
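
The chained replace calls also delete the words span, class, and emoji when a friend really wrote them, and the regex drops everything after the first emoji code. A narrower alternative is to remove the whole emoji tag in one step. The sketch below assumes the tags look like `<span class="emoji emoji1f3c3"></span>`, which is inferred from the fragments seen in the printed output; `clean_signature` and `EMOJI_SPAN` are hypothetical names, not part of itchat.

```python
import re

# Matches an emoji tag such as <span class="emoji emoji1f3c3"></span>.
# The tag shape is an assumption based on the fragments in the printed signatures.
EMOJI_SPAN = re.compile(r'<span\s+class="emoji[^"]*">\s*</span>')


def clean_signature(raw):
    """Return the signature with emoji span tags and outer whitespace removed."""
    return EMOJI_SPAN.sub("", raw).strip()


if __name__ == "__main__":
    # Hypothetical sample signature, not real data
    sample = 'Keep running<span class="emoji emoji1f3c3"></span> every day'
    print(clean_signature(sample))
```

If your signatures use a different tag shape, adjust the pattern accordingly; text before and after the emoji is kept either way.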
    

    Next, use jieba to segment the text and then build the word cloud. First install the jieba and wordcloud libraries:

    pip install jieba
    pip install wordcloud
    

    The code:

    # coding:utf-8
    import re
    
    import itchat
    import jieba
    import matplotlib.pyplot as plt
    from wordcloud import WordCloud
    
    itchat.login()
    friends = itchat.get_friends(update=True)[0:]
    
    # Strip the emoji tag words and leftover codes such as 1f3c3
    rep = re.compile(r"1f\d.+")
    tList = []
    for i in friends:
        signature = i["Signature"].replace(" ", "").replace("span", "").replace("class", "").replace("emoji", "")
        signature = rep.sub("", signature)
        tList.append(signature)
    
    # Join all signatures into one string
    text = "".join(tList)
    
    # Segment with jieba (full mode) and join the tokens with spaces
    wordlist_jieba = jieba.cut(text, cut_all=True)
    wl_space_split = " ".join(wordlist_jieba)
    
    # Build the word cloud.
    # font_path must point to a font file on your machine. This path is for a Mac;
    # on Windows the fonts are under Windows/Fonts.
    my_wordcloud = WordCloud(background_color="white", max_words=2000,
                             max_font_size=40, random_state=42,
                             font_path='/Users/sebastian/Library/Fonts/Arial Unicode.ttf').generate(wl_space_split)
    
    plt.imshow(my_wordcloud)
    plt.axis("off")
    plt.show()
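
The word cloud shows the frequent words visually, but the code never lists them. A minimal sketch of counting the high-frequency words with `collections.Counter` follows. `top_words` is a hypothetical helper, and the sample tokens are made up to stand in for the jieba output. Note that `jieba.cut` returns a generator, which the `" ".join(...)` call consumes, so wrap it in `list(...)` first if you want to both count the tokens and build the cloud.

```python
from collections import Counter


def top_words(tokens, n=10, min_length=2):
    """Return the n most common tokens that have at least min_length characters."""
    stripped = (t.strip() for t in tokens)
    counts = Counter(t for t in stripped if len(t) >= min_length)
    return counts.most_common(n)


if __name__ == "__main__":
    # Hypothetical tokens standing in for list(jieba.cut(text))
    sample_tokens = ["努力", "生活", "的", "努力", " ", "奋斗", "生活", "努力"]
    for word, count in top_words(sample_tokens, n=3):
        print(word, count)
```

The `min_length=2` default skips single characters and whitespace, which are mostly particles and noise in segmented Chinese text; lower it if single-character words matter to you.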


    Modify the word cloud part slightly so that the cloud takes its shape and colors from an image:

    # Word cloud shaped and colored by an image
    import matplotlib.pyplot as plt
    from wordcloud import WordCloud, ImageColorGenerator
    import os
    import numpy as np
    import PIL.Image as Image
    
    
    d = os.path.dirname(os.path.abspath(__file__))
    alice_coloring = np.array(Image.open(os.path.join(d, "wechat.jpg")))
    my_wordcloud = WordCloud(background_color="white", max_words=2000, mask=alice_coloring,
                             max_font_size=40, random_state=42,
                             font_path='/Users/sebastian/Library/Fonts/Arial Unicode.ttf'
                             ).generate(wl_space_split)
    
    # Recolor the words with the colors of the mask image, then display once
    image_colors = ImageColorGenerator(alice_coloring)
    plt.imshow(my_wordcloud.recolor(color_func=image_colors))
    plt.axis("off")
    plt.show()
    
    # Save the image and send it to your phone via the file helper
    output_path = os.path.join(d, "wechat_cloud.png")
    my_wordcloud.to_file(output_path)
    itchat.send_image(output_path, 'filehelper')
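
Both word cloud snippets hard-code a font path that only exists on the author's Mac. One way to make the script portable is to try a few candidate paths and use the first that exists. The sketch below uses only the standard library; `first_existing_font` and the entries in `FONT_CANDIDATES` are hypothetical and should be adjusted to fonts that are actually installed and contain Chinese glyphs.

```python
import os


def first_existing_font(candidates):
    """Return the first path in candidates that exists as a file, or None."""
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


# Hypothetical candidate list; replace with font files present on your system.
FONT_CANDIDATES = [
    os.path.expanduser("~/Library/Fonts/Arial Unicode.ttf"),  # macOS, per-user
    "/Library/Fonts/Arial Unicode.ttf",                        # macOS, system-wide
    r"C:\Windows\Fonts\msyh.ttc",                              # Windows (assumed name)
]

if __name__ == "__main__":
    font_path = first_existing_font(FONT_CANDIDATES)
    print(font_path)
```

The result can then be passed as `font_path=font_path` to `WordCloud`. If it is `None`, stop and point the list at a suitable font rather than continuing with a font that may lack Chinese characters.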


    
    


  • Original article: https://www.cnblogs.com/ameile/p/7999014.html