  • 【402】Twitter Data Collection

    Reference: Three ways to check whether a file exists in Python

    Reference: Running one Python file from another Python file

    Reference: How can I make a time delay in Python?

    Reference: Twilio SMS Python Quickstart


    1. Collecting real-time data for a region

    File name: AUS.py

    #Import the necessary classes from the tweepy library (StreamListener is part of Tweepy 3.x; Tweepy 4.0 removed it)
    from tweepy.streaming import StreamListener
    from tweepy import OAuthHandler
    from tweepy import Stream
     
    #Variables that contain the user credentials to access the Twitter API
    access_token = "*****"
    access_token_secret = "*****"
    consumer_key = "*****"
    consumer_secret = "*****"
    
    #This is a basic listener that just prints received tweets to stdout.
    class StdOutListener(StreamListener):
    
        def on_data(self, data):
            print(data)
            return True
    
        def on_error(self, status):
            print(status)
    
            
    if __name__ == '__main__':
        
        #This handles Twitter authentication and the connection to the Twitter Streaming API
        l = StdOutListener()
        auth = OAuthHandler(consumer_key, consumer_secret)
        auth.set_access_token(access_token, access_token_secret)
        stream = Stream(auth, l)
        
        #This line filters the Twitter stream by location: the bounding box is [west longitude, south latitude, east longitude, north latitude], here roughly covering Australia
        stream.filter(locations=[112, -44, 154, -9])
    

    Run python AUS.py > 2019-06-07.txt in cmd to save the data as it arrives.

    The redirection sends everything the script prints into the text file, so whatever print() outputs is stored directly.
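
As an alternative to shell redirection, the script could write to a file itself by passing the `file` argument to print(). The sketch below is only an illustration: `save_records` is a hypothetical helper that is not part of the original scripts, and an in-memory buffer stands in for a real file.

```python
import io


def save_records(records, out):
    """Write each record on its own line to the file-like object `out`."""
    for record in records:
        # print() appends a newline, so each record ends up on one line
        print(record, file=out)


# Usage sketch: an in-memory buffer stands in for open('2019-06-07.txt', 'a')
buffer = io.StringIO()
save_records(['{"id": 1}', '{"id": 2}'], buffer)
```

In a listener, the same idea would mean calling print(data, file=out) inside on_data instead of a bare print(data).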

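    2. Automatic SMS alerts

    The program may crash once a certain amount of data has been stored, so Twilio is added to send an SMS automatically and report a crash as soon as it happens. The implementation is as follows:

    File name: AUS_SMS.py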

    #Import the necessary methods from tweepy library
    from tweepy.streaming import StreamListener
    from tweepy import OAuthHandler
    from tweepy import Stream
    from twilio.rest import Client 
    import time
     
    #Variables that contain the user credentials to access the Twitter API
    access_token = "*****"
    access_token_secret = "*****"
    consumer_key = "*****"
    consumer_secret = "*****"
    
    #This is a basic listener that just prints received tweets to stdout.
    class StdOutListener(StreamListener):
    
        def on_data(self, data):
            print(data)
            return True
    
        def on_error(self, status):
            print(status)
    
    def textMessage(message):       
        account = '*****'
        token = '*****'
        myNumber='+*****'
        twilioNumber='+*****'
     
        client = Client(account, token)
        message = client.messages.create(to=myNumber, from_=twilioNumber, body=message)
            
    if __name__ == '__main__':
        try:
            #This handles Twitter authentication and the connection to the Twitter Streaming API
            l = StdOutListener()
            auth = OAuthHandler(consumer_key, consumer_secret)
            auth.set_access_token(access_token, access_token_secret)
            stream = Stream(auth, l)
            
            #This line filters the Twitter stream by location: the bounding box is [west longitude, south latitude, east longitude, north latitude], here roughly covering Australia
            stream.filter(locations=[112, -44, 154, -9])
        except Exception:
            #Report any crash by SMS, including the time it happened
            textMessage("\n(*≧▽≦*)\n [HELP] Program crashed!!!\nTime: "+time.asctime())
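
The crash-alert pattern used above can be sketched independently of Tweepy and Twilio. In this minimal example, `run_with_alert` and `failing_task` are hypothetical names introduced for illustration, and a plain list stands in for the SMS sender so that no network call is made.

```python
import time


def run_with_alert(task, notify):
    """Run task(); if it raises, pass a short crash report to notify()."""
    try:
        task()
        return True
    except Exception as exc:
        # Include the exception and the time, as the SMS body above does
        notify("[HELP] Program crashed: %r\nTime: %s" % (exc, time.asctime()))
        return False


def failing_task():
    # Stand-in for a stream that dies unexpectedly
    raise RuntimeError("stream disconnected")


# A list's append method plays the role of textMessage here
sent = []
ok = run_with_alert(failing_task, sent.append)
```

Passing the notifier as an argument keeps the alert logic testable; in AUS_SMS.py the notifier would presumably be textMessage.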
    

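    3. Running indefinitely

    A Python file can be run directly from another Python file, and wrapping that call in an infinite loop keeps the data collection going indefinitely.

    File name: main.py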

    import os
    import time
    
    while True:    
        year = str(time.localtime().tm_year)
        mon = str(time.localtime().tm_mon)
        day = str(time.localtime().tm_mday)
        filename = year + '-' + mon.zfill(2) + '-' + day.zfill(2)
        i = 0
        while os.path.exists(os.path.join(os.getcwd(), filename + '.txt')):
            i += 1
            filename = year + '-' + mon.zfill(2) + '-' + day.zfill(2) + '-' + str(i)
            time.sleep(1)
        os.system("python AUS_SMS.py > " + filename + '.txt')
    

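    Files are named after the current date. If a file for the same day already exists, -1 is appended to the name, then -2, and so on.

    os.system() runs the Python file the same way typing the command in cmd would.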
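
The naming rule can be sketched as a small pure function. `next_filename` is a hypothetical helper, not part of main.py, and it takes the set of existing names as an argument instead of checking the disk, which makes the rule easy to try out.

```python
import datetime


def next_filename(day, existing):
    """Return the first unused name for `day`: base.txt, base-1.txt, base-2.txt, ..."""
    base = day.strftime('%Y-%m-%d')
    name = base + '.txt'
    i = 0
    while name in existing:
        i += 1
        name = '%s-%d.txt' % (base, i)
    return name


day = datetime.date(2019, 6, 7)
first = next_filename(day, set())
third = next_filename(day, {'2019-06-07.txt', '2019-06-07-1.txt'})
```

Here `first` is the plain date name, and `third` gets the suffix -2 because the plain name and the -1 name are already taken.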
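
A possible alternative to os.system() with shell redirection is subprocess.run() with the output file passed as stdout, which avoids building a command string by hand. This is a sketch under assumptions: `run_to_file` is a hypothetical helper, and a one-line inline script stands in for AUS_SMS.py.

```python
import os
import subprocess
import sys
import tempfile


def run_to_file(args, out_path):
    """Run a command and write its standard output to out_path."""
    with open(out_path, 'w') as out:
        # The child process writes straight into the open file
        return subprocess.run(args, stdout=out).returncode


# Usage sketch: run the current interpreter on a tiny inline script
out_path = os.path.join(tempfile.mkdtemp(), 'demo.txt')
code = run_to_file([sys.executable, '-c', 'print("hello")'], out_path)
```

The return code also tells the calling loop whether the child exited normally, which os.system() reports in a platform-dependent form.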

  • Original article: https://www.cnblogs.com/alex-bn-lee/p/10987978.html