  • Scrapy pipeline methods


    # -*- coding: utf-8 -*-
    
    # Define your item pipelines here
    #
    # Don't forget to add your pipeline to the ITEM_PIPELINES setting
    # See: https://doc.scrapy.org/en/latest/topics/item-pipeline.html
    # from pymongo import MongoClient
    
    # client = MongoClient()
    # collection = client['SpiderAnything']['hr']
    import pymysql
    
    
    class SpiderSuningBookPipeline(object):
        def process_item(self, item, spider):
            if spider.name == 'book_info':
                # collection.insert(dict(item))
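                # Parameterized query: pymysql escapes the values itself, so
                # quotes in scraped text cannot break the statement.
                sql = """
                    insert into book(title, author, download_text, new)
                    values (%s, %s, %s, %s)"""
                params = (
                    item['title'],
                    item['author'],
                    item['download_text'],
                    item['new'],
                )
                print(sql, params)
                self.cursor.execute(sql, params)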
            elif spider.name == 'dangdang':
                print(item)
    
            return item
    
        def open_spider(self, spider):
            # Connect to the database
            self.connect = pymysql.connect(
                host='127.0.0.1',
                port=3306,
                db='study',
                user='root',
                passwd='123456',
                charset='utf8',
                use_unicode=True,
            )
    
            # Use a cursor to run insert/delete/select/update statements
            self.cursor = self.connect.cursor()
            self.connect.autocommit(True)
    
        def close_spider(self, spider):
            self.cursor.close()
            self.connect.close()
  • Original article: https://www.cnblogs.com/tangpg/p/10691114.html