  • Pipeline storage code

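    When the last callback in the spider returns an item, Scrapy calls the pipeline's

    process_item(self, item, spider)
    method and passes in the item and the spider.
    This is where the data can be written to persistent storage.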
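The contract described above can be illustrated with a minimal sketch that needs no database. The class name `JsonLinesPipeline` and the file name `items.jl` are assumptions made for this example and are not part of the original project; the only thing Scrapy requires is the `process_item(self, item, spider)` method.

```python
import json


class JsonLinesPipeline:
    """Hypothetical pipeline: appends each item to a JSON Lines file."""

    def __init__(self, path='items.jl'):
        # Assumed output path, chosen only for this illustration.
        self.path = path

    def process_item(self, item, spider):
        # Scrapy passes in the scraped item and the spider that produced it.
        with open(self.path, 'a', encoding='utf-8') as f:
            f.write(json.dumps(dict(item), ensure_ascii=False) + '\n')
        # Return the item so that any later pipeline can keep processing it.
        return item
```

Returning the item at the end matters: a pipeline that returns nothing hands `None` to the next pipeline in the chain.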
    My pipeline code:


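    # -*- coding: utf-8 -*-
    # See: http://doc.scrapy.org/en/latest/topics/item-pipeline.html

    import MySQLdb
    import MySQLdb.cursors
    from twisted.enterprise import adbapi


    class MyPipeline(object):  # this class name must match the ITEM_PIPELINES entry in settings.py

        def __init__(self, dbpool):
            self.dbpool = dbpool

        @classmethod
        def from_settings(cls, settings):
            # Build the database connection pool from the project settings.
            dbargs = dict(
                host=settings['MYSQL_HOST'],
                db=settings['MYSQL_DBNAME'],
                port=settings['MYSQL_PORT'],
                user=settings['MYSQL_USER'],
                passwd=settings['MYSQL_PASSWD'],
                charset='utf8',
                cursorclass=MySQLdb.cursors.DictCursor,
                use_unicode=True,
            )
            dbpool = adbapi.ConnectionPool('MySQLdb', **dbargs)
            return cls(dbpool)

        def process_item(self, item, spider):
            # Scrapy calls this for every item the spider returns.
            # The query runs asynchronously in a pool thread, so report failures through an errback.
            d = self.dbpool.runInteraction(self._do_upinsert, item, spider)
            d.addErrback(self._handle_error, item, spider)
            return item

        def _do_upinsert(self, tx, item, spider):
            # tx is the adbapi Transaction; it forwards execute() to the database cursor.
            # Iterating an item yields field names, so look up each value before testing it.
            valid = all(item[field] for field in item)
            if valid:
                # Execute the SQL ('sql' is a placeholder for the real statement).
                result = tx.execute('sql')
                if result:
                    print('added a record')
                else:
                    print('failed insert into table')

        def _handle_error(self, failure, item, spider):
            # Without this errback, a database error would pass silently.
            print('database error: %s' % failure)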
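The SQL statement in `_do_upinsert` is left as a placeholder in the original. As a rough sketch of what an "insert or update" step could look like, the example below uses the standard-library `sqlite3` module instead of MySQLdb so that it runs without a database server. The table `pages`, its columns `url` and `title`, and the helper `do_upsert` are all hypothetical. With MySQL the equivalent statement would typically be `INSERT ... ON DUPLICATE KEY UPDATE`, and MySQLdb uses `%s` placeholders rather than `?`.

```python
import sqlite3


def do_upsert(cursor, item):
    """Insert the item, or overwrite the existing row with the same url."""
    # Skip items that have any empty field, mirroring the validity check in _do_upinsert.
    if not all(item[field] for field in item):
        return False
    # Placeholders let the driver escape the values instead of string formatting.
    cursor.execute(
        'INSERT OR REPLACE INTO pages (url, title) VALUES (?, ?)',
        (item['url'], item['title']),
    )
    return True


conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE pages (url TEXT PRIMARY KEY, title TEXT)')
cur = conn.cursor()
do_upsert(cur, {'url': 'http://example.com', 'title': 'first'})
do_upsert(cur, {'url': 'http://example.com', 'title': 'second'})  # same key: replaces the row
conn.commit()
rows = conn.execute('SELECT url, title FROM pages').fetchall()
```

Passing the values as parameters rather than building the SQL string by hand also protects against injection from scraped content.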
    Git repository for the code: will be uploaded in a few days.
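For the pipeline to run, it has to be registered in the project settings, and the `MYSQL_*` keys read by `from_settings` have to be defined. The fragment below is a sketch of such a `settings.py`; the module path `myproject.pipelines` and every value shown are placeholders, not taken from the original project.

```python
# settings.py (sketch)

# Register the pipeline; the number (0-1000) sets the order, lower values run first.
ITEM_PIPELINES = {
    'myproject.pipelines.MyPipeline': 300,  # hypothetical module path
}

# Keys read by MyPipeline.from_settings; all values here are placeholders.
MYSQL_HOST = 'localhost'
MYSQL_DBNAME = 'scrapy_db'
MYSQL_PORT = 3306  # MySQLdb expects an integer port
MYSQL_USER = 'scrapy'
MYSQL_PASSWD = 'change-me'
```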



  • Original article: https://www.cnblogs.com/seablog/p/6993975.html