  • Twitter search API

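    The Twitter crawler works much like the one for Sina Weibo. Before using the Twitter API you need a Twitter account, and you must create an application in the Twitter developer portal (https://apps.twitter.com/app/new).

    Once the application is created, you can see its details, including the Consumer key and Consumer secret. Then generate an access token, which gives you an access token and an access token secret. Save these four values.

    Next, you can crawl Twitter with the help of a Twitter API library. There are many Python wrappers for the Twitter API; this post mainly covers tweepy and python-twitter.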
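
Both libraries need the four saved values at run time. One way to avoid hard-coding them in a script is to read them from environment variables. The sketch below assumes that approach; the environment variable names are hypothetical, and the dictionary keys are chosen to match the keyword arguments of python-twitter's twitter.Api.

```python
import os

# Hypothetical environment variable names (an assumption of this sketch),
# mapped to the keyword arguments accepted by python-twitter's twitter.Api.
ENV_TO_KWARG = {
    "TWITTER_CONSUMER_KEY": "consumer_key",
    "TWITTER_CONSUMER_SECRET": "consumer_secret",
    "TWITTER_ACCESS_TOKEN": "access_token_key",
    "TWITTER_ACCESS_TOKEN_SECRET": "access_token_secret",
}


def load_credentials(env=None):
    """Return the four credentials as a dict, or raise if any is missing."""
    env = os.environ if env is None else env
    missing = [name for name in ENV_TO_KWARG if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(sorted(missing)))
    return {kwarg: env[name] for name, kwarg in ENV_TO_KWARG.items()}
```

With python-twitter installed, the result can be passed straight through as twitter.Api(**load_credentials()).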

    1. python-twitter

    Installation: in a cmd window, run the pip command: pip install python-twitter

    After it installs successfully, you can run the following code:

    import optparse
    import sys

    import twitter

    # The four values saved from the application page; replace the placeholders.
    CONSUMER_KEY = "YOUR_CONSUMER_KEY"
    CONSUMER_SECRET = "YOUR_CONSUMER_SECRET"
    ACCESS_TOKEN_KEY = "YOUR_ACCESS_TOKEN"
    ACCESS_TOKEN_SECRET = "YOUR_ACCESS_TOKEN_SECRET"

    def make_api():
        """
        Build an authenticated client from the four saved values.
        """
        return twitter.Api(consumer_key=CONSUMER_KEY,
                           consumer_secret=CONSUMER_SECRET,
                           access_token_key=ACCESS_TOKEN_KEY,
                           access_token_secret=ACCESS_TOKEN_SECRET)

    def print_safe(string):
        """
        Format a string for safe printing on a cp437 (Windows cmd) console:
        characters outside cp437 become XML character references.
        """
        return string.encode('cp437', 'xmlcharrefreplace').decode('cp437')

    def print_tweet(tweet):
        """
        Format and print `tweet`.
        """
        # Attribute access replaces the older GetUser()/GetText() getters.
        print("@" + print_safe(tweet.user.screen_name) +
              ": " + print_safe(tweet.text))

    def search(search_term):
        """
        Print recent tweets containing `search_term`.
        """
        api = make_api()
        tweets = api.GetSearch(term=search_term)
        for tweet in tweets:
            print_tweet(tweet)

    def trending_topics():
        """
        Print the currently trending topics.
        """
        api = make_api()
        topics = api.GetTrendsCurrent()
        for topic in topics:
            print(print_safe(topic.name))

    def user_tweets(username):
        """
        Print recent tweets by `username`.
        """
        api = make_api()
        timeline = api.GetUserTimeline(screen_name=username)
        for tweet in timeline:
            print_tweet(tweet)

    def trending_tweets():
        """
        Print tweets for all the trending topics.
        """
        api = make_api()

        topics = api.GetTrendsCurrent()
        tweets = []
        # To add some variety, let's round-robin through the trending
        # topics, displaying a tweet from each until one runs out of tweets.
        for topic in topics:
            tweets.append((topic, api.GetSearch(term=topic.name)))

        # `while tweets` (not `while True`) so an empty topic list cannot
        # loop forever.
        while tweets:
            for topic, topic_tweets in tweets:
                if topic_tweets:
                    print_tweet(topic_tweets.pop())
                else:
                    return

    def main(args):
        parser = optparse.OptionParser("""Usage: %prog [-s <search term> | -t | -u <username> | -w]""")

        parser.add_option("-s", "--search",
                          type="string",
                          action="store",
                          dest="search_term",
                          default=None,
                          help="Display tweets containing a particular string.")
        parser.add_option("-t", "--trending-topics",
                          action="store_true",
                          dest="trending_topics",
                          default=False,
                          help="Display the trending topics.")
        parser.add_option("-u", "--user",
                          type="string",
                          action="store",
                          dest="username",
                          default=None,
                          help="Display tweets for a particular public user.")
        parser.add_option("-w", "--trending-tweets",
                          action="store_true",
                          dest="trending_tweets",
                          default=False,
                          help="Display the tweets from trending topics.")

        (opts, args) = parser.parse_args(args)

        if opts.search_term:
            search(opts.search_term)
        elif opts.trending_topics:
            trending_topics()
        elif opts.username:
            user_tweets(opts.username)
        elif opts.trending_tweets:
            trending_tweets()

    if __name__ == "__main__":
        main(sys.argv[1:])
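
The round-robin loop in trending_tweets is easier to follow on plain lists, with no network access. The sketch below mirrors that loop under the same rule: take one item from each topic in turn, and stop as soon as any topic runs out. The function name round_robin and the sample data are made up for illustration.

```python
def round_robin(groups):
    """Yield one item from each group in turn; stop when any group runs dry."""
    queues = [list(group) for group in groups]  # copy so callers' lists survive
    if not queues:
        return
    while True:
        for queue in queues:
            if not queue:
                return
            # pop() takes from the end, as the script above does.
            yield queue.pop()


if __name__ == "__main__":
    sample = [["a1", "a2"], ["b1", "b2", "b3"]]
    print(list(round_robin(sample)))
```

Because iteration stops at the first empty list, leftover tweets from longer topics are never printed; that matches the behaviour of the script's loop rather than a full interleave.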
      
    

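    PS: Sometimes the program fails with an error about a missing module. One possible cause is that python-twitter was already installed on the system, but in an older version. In that case, uninstall it first and then reinstall it. The uninstall command is: pip uninstall python-twitter.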
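
To see whether an older copy is the culprit, it can help to check which version is installed before uninstalling. A minimal sketch, assuming Python 3.8 or later (for importlib.metadata); the helper name installed_version is made up for illustration.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a pip distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "python-twitter" is the name used with pip, not the import name (twitter).
    print("python-twitter:", installed_version("python-twitter"))
```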

    2. tweepy

    First, download tweepy: https://github.com/tweepy/tweepy
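
Once tweepy is installed, a search with it might look like the sketch below. This is an illustration under assumptions, not a tested recipe from the original post: it assumes the OAuth 1 user-context flow with the four saved values, and it allows for the search method being named search_tweets in tweepy 4.x and search in 3.x. The helper names make_api and search are made up for this example.

```python
def make_api(consumer_key, consumer_secret, access_token, access_token_secret):
    """Build an authenticated tweepy client from the four saved values."""
    # Imported here so that search() below can be tried without tweepy.
    import tweepy

    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token, access_token_secret)
    return tweepy.API(auth)


def search(api, term, count=10):
    """Return recent tweets containing `term` as "@user: text" strings."""
    # tweepy 4.x names the method search_tweets; 3.x called it search.
    method = getattr(api, "search_tweets", None) or api.search
    return ["@%s: %s" % (status.user.screen_name, status.text)
            for status in method(q=term, count=count)]
```

A call would then be search(make_api(...), "python"), with the four saved values passed to make_api in the order shown.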

  • Original post: https://www.cnblogs.com/tec-vegetables/p/4533582.html