  • Python: Frequency analysis of Twitter tweet elements (Word, Screen Name, Hashtag)

    CODE:

    #!/usr/bin/python 
    # -*- coding: utf-8 -*-
    
    '''
    Created on 2014-7-2
    @author: guaguastd
    @name: tweet_frequency_analysis.py
    '''
    
    if __name__ == '__main__':
        
        # import frequency
        from frequency import frequency_analysis
        
        # import search
        from search import search_for_tweet
            
        # import login, see http://blog.csdn.net/guaguastd/article/details/31706155 
        from login import twitter_login
    
        # get the twitter access api
        twitter_api = twitter_login()
        
        # import tweet
        from tweet import extract_tweet_entities
    
        while True:
            # Python 2: raw_input returns the typed line as a string
            query = raw_input('\nInput the query (eg. #MentionSomeoneImportantForYou, exit to quit): ')
            
            if query == 'exit':
                print 'Successfully exit!'
                break
            
            statuses = search_for_tweet(twitter_api, query)
            status_texts, screen_names, hashtags, words = extract_tweet_entities(statuses)
    
            for label, data in (('Word', words),
                                ('Screen Name', screen_names),
                                ('Hashtag', hashtags)):
                frequency_analysis(label, data, 10)
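
The `frequency` module imported by the script is not shown in this post. As a rough idea of what `frequency_analysis(label, data, 10)` might do, here is a minimal sketch that counts items with `collections.Counter` and prints an ASCII table similar to the one in RESULT. The helper names `top_counts` and `render_table` are hypothetical, and the original module may well use a table library instead of manual formatting.

```python
from collections import Counter


def top_counts(items, top_n=10):
    """Return the top_n most frequent items as (item, count) pairs."""
    return Counter(items).most_common(top_n)


def render_table(label, rows):
    """Render (item, count) pairs as an ASCII table (hypothetical helper)."""
    # Each column is as wide as its header or its longest cell.
    item_width = max([len(label)] + [len(str(item)) for item, _ in rows])
    count_width = max([len('Count')] + [len(str(count)) for _, count in rows])
    rule = '+-{0}-+-{1}-+'.format('-' * item_width, '-' * count_width)
    lines = [rule,
             '| {0} | {1} |'.format(label.ljust(item_width),
                                    'Count'.ljust(count_width)),
             rule]
    for item, count in rows:
        # Items are left-aligned, counts right-aligned.
        lines.append('| {0} | {1} |'.format(str(item).ljust(item_width),
                                            str(count).rjust(count_width)))
    lines.append(rule)
    return '\n'.join(lines)


def frequency_analysis(label, data, top_n=10):
    """Assumed stand-in for frequency.frequency_analysis: print the table."""
    print(render_table(label, top_counts(data, top_n)))
```

Calling `frequency_analysis('Word', words, 10)` with this sketch prints one table per label, which is how the loop in the script uses it.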

    RESULT:

    Input the query (eg. #MentionSomeoneImportantForYou, exit to quit): #MentionSomeoneImportantForYou
    Length of statuses 96
    +--------------------------------+-------+
    | Word                           | Count |
    +--------------------------------+-------+
    | #MentionSomeoneImportantForYou |    84 |
    | RT                             |    49 |
    | @paynashton                    |    13 |
    | #mentionsomeoneimportantforyou |    12 |
    | @gellystyles                   |    11 |
    | @cuddlingxbrooks               |     9 |
    | @sickhorandiva                 |     9 |
    | @cuddlingxbrooks:              |     8 |
    | so                             |     8 |
    | @fratboyliamx                  |     7 |
    +--------------------------------+-------+
    +-----------------+-------+
    | Screen Name     | Count |
    +-----------------+-------+
    | paynashton      |    18 |
    | cuddlingxbrooks |    17 |
    | gellystyles     |    15 |
    | sickhorandiva   |    13 |
    | SwaggyOnFire1   |     9 |
    | TichaaAlves     |     7 |
    | wtvpottorff     |     7 |
    | idkdallasbae    |     7 |
    | ElenaBomerC     |     7 |
    | cuddings        |     7 |
    +-----------------+-------+
    +-----------------------------------+-------+
    | Hashtag                           | Count |
    +-----------------------------------+-------+
    | MentionSomeoneImportantForYou     |    84 |
    | mentionsomeoneimportantforyou     |    12 |
    | MentionSomeoneBeautiful           |     1 |
    | mentionyourinternetbestfriend     |     1 |
    | MentionSomeoneYouLoveAndCareAbout |     1 |
    | BAMsingleOutTmrw                  |     1 |
    +-----------------------------------+-------+
    
    Input the query (eg. #MentionSomeoneImportantForYou, exit to quit): exit
    Successfully exit!
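
The `tweet` module is also not shown. The Word table contains tokens such as `@cuddlingxbrooks:` with trailing punctuation, which suggests the words come from a plain whitespace split of the tweet text. The sketch below is one way `extract_tweet_entities` could be written under that reading. It assumes each status is a dict shaped like a Twitter REST API v1.1 tweet object (`text`, plus `entities` with `user_mentions` and `hashtags`), and that the Screen Name table counts mentioned users rather than authors. Both are assumptions, not something the post confirms.

```python
def extract_tweet_entities(statuses):
    """Collect texts, mentioned screen names, hashtags, and words.

    Assumes v1.1-style tweet dicts; a status without 'entities'
    simply contributes no screen names or hashtags.
    """
    status_texts = [status['text'] for status in statuses]

    screen_names = [mention['screen_name']
                    for status in statuses
                    for mention in status.get('entities', {}).get('user_mentions', [])]

    hashtags = [tag['text']
                for status in statuses
                for tag in status.get('entities', {}).get('hashtags', [])]

    # Plain whitespace split: punctuation stays attached to the token,
    # so '@user:' and '@user' are counted as different words.
    words = [word for text in status_texts for word in text.split()]

    return status_texts, screen_names, hashtags, words
```

With this approach, tokens that differ only in case or punctuation are counted separately, as in the tables above. Lowercasing hashtags before counting would merge rows such as `MentionSomeoneImportantForYou` and `mentionsomeoneimportantforyou`.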

  • Original article: https://www.cnblogs.com/lxjshuju/p/6813889.html