  • Python: frequency analysis of Sina Weibo post entities (Word, Screen Name)

    CODE:

    #!/usr/bin/python 
    # -*- coding: utf-8 -*-
    
    '''
    Created on 2014-7-9
    @author: guaguastd
    @name: weiboFrequencyAnalysis.py
    '''
    
    if __name__ == '__main__':
        
        # Log in and get an API client for the Sina Weibo API
        from sinaWeiboLogin import sinaWeiboLogin
        sinaWeiboApi = sinaWeiboLogin()
        
        # extractWeiboEntities returns the status texts, screen names, and words
        from sinaWeibo import extractWeiboEntities
        
        # publicTimeline fetches statuses from the public timeline
        from sinaWeiboStatuses import publicTimeline
        
        # weiboFrequencyAnalysis prints a frequency table for one entity type
        from sinaWeiboFrequency import weiboFrequencyAnalysis
        
        # Fetch the 5 newest public posts and split them into entities
        weiboNum = 5
        statuses = publicTimeline(sinaWeiboApi, weiboNum)
        status_texts, screen_names, words = extractWeiboEntities(statuses)
    
        for label, data in (('Word', words),
                            ('Screen Name', screen_names)):
            weiboFrequencyAnalysis(label, data, weiboNum)

    RESULT:

    +------------------------------------------+-------+
    | Word                                     | Count |
    +------------------------------------------+-------+
    | http://t.cn/8snKY0S                      |     1 |
    | [围观]CANNCI千姿百袋2014新款牛皮菱格女包 |     1 |
    | 时尚潮流单肩包                           |     1 |
    | 浪漫RI系「喜欢请赞                       |     1 |
    | ✲✲✲✲✲✲                             |     1 |
    +------------------------------------------+-------+
    +--------------------+-------+
    | Screen Name        | Count |
    +--------------------+-------+
    | 马傻强             |     1 |
    | 手机用户2360148561 |     1 |
    | 潮流爆款搭V        |     1 |
    | star爱上泡面猫     |     1 |
    | 美容潮搭健康       |     1 |
    +--------------------+-------+
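
The helper modules imported above (sinaWeibo, sinaWeiboFrequency) are not listed in this post, so the following is only a rough sketch of what the extraction and counting steps might look like, written for Python 3 with the standard library only. The names extract_entities, display_width and frequency_table are hypothetical stand-ins, not the author's functions. The sketch assumes each status is a dict with a 'text' key and a nested 'user' dict holding 'screen_name', that words are obtained by splitting on whitespace (which would match the RESULT table, where an unspaced Chinese phrase appears as a single "word"), and that the last argument limits how many rows are shown. The sample statuses are made up for illustration.

```python
# -*- coding: utf-8 -*-
from collections import Counter
import unicodedata


def extract_entities(statuses):
    """Hypothetical stand-in for extractWeiboEntities.

    Assumes each status is a dict with a 'text' key and a nested
    'user' dict that holds 'screen_name'.
    """
    status_texts = [s['text'] for s in statuses]
    screen_names = [s['user']['screen_name'] for s in statuses]
    # Whitespace tokenization: Chinese text without spaces stays one token.
    words = [w for text in status_texts for w in text.split()]
    return status_texts, screen_names, words


def display_width(text):
    """Terminal columns used by text; wide (CJK) characters count as 2."""
    return sum(2 if unicodedata.east_asian_width(ch) in ('W', 'F') else 1
               for ch in text)


def frequency_table(label, data, top_n):
    """Hypothetical stand-in for weiboFrequencyAnalysis.

    Counts the items in data and returns an ASCII table (as a string)
    with the top_n most frequent items.
    """
    rows = Counter(data).most_common(top_n)
    item_w = max([display_width(label)] +
                 [display_width(item) for item, _ in rows])
    count_w = max([len('Count')] + [len(str(c)) for _, c in rows])
    rule = '+' + '-' * (item_w + 2) + '+' + '-' * (count_w + 2) + '+'

    def line(item, count):
        # Pad by display width so CJK rows line up with ASCII rows.
        pad = ' ' * (item_w - display_width(item))
        return '| ' + item + pad + ' | ' + str(count).rjust(count_w) + ' |'

    lines = [rule, line(label, 'Count'), rule]
    lines += [line(item, count) for item, count in rows]
    lines.append(rule)
    return '\n'.join(lines)


# Made-up sample data, only to exercise the sketch.
sample_statuses = [
    {'text': 'hello weibo http://t.cn/8snKY0S',
     'user': {'screen_name': 'alice'}},
    {'text': 'hello 时尚潮流单肩包',
     'user': {'screen_name': 'bob'}},
]

texts, names, words = extract_entities(sample_statuses)
for label, data in (('Word', words), ('Screen Name', names)):
    print(frequency_table(label, data, 5))
```

With only five posts from the public timeline, almost every token occurs once, which is why every count in the RESULT tables is 1. Fetching more statuses would be needed before the counts say much.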
    


  • Original article: https://www.cnblogs.com/bhlsheji/p/5356677.html