  • Extracting the full text of an article directly with Scrapy

    Example:

    import scrapy
    from firstscrapy.items import CnblogsMysqlItem  # project item class; not used in this example
    
    
    class CnblogsSpider(scrapy.Spider):
        name = 'cnblogs'
        allowed_domains = ['www.cnblogs.com']
        # start_urls = ['http://www.cnblogs.com/']
        start_urls = ['http://www.cnblogs.com/lifei01/p/13440458.html']
    
        def parse(self, response):
            # Narrow the selection to the main content container
            article = response.css('#main')
            # Print the post title
            print(article.css('#cb_post_title_url span::text').extract_first())
            # Collect every text node under the post body, at any nesting depth
            article_body = response.xpath('.//div[@id="cnblogs_post_body"]//text()').extract()
            for line in article_body:
                print(line.strip())
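
    The spider above prints each text node as it goes, so whitespace-only nodes show up as blank lines. If a single cleaned string is more useful (for example, to store in an item field), the list returned by `.extract()` could be post-processed roughly as follows. This is only a sketch: `join_text_nodes` is a hypothetical helper name, not part of Scrapy or of the original post, and the sample list stands in for real `article_body` output.

```python
def join_text_nodes(nodes):
    """Collapse a list of raw text nodes into one newline-separated string.

    `nodes` is assumed to be a list of strings, such as the one returned by
    response.xpath('...//text()').extract() in the spider above.
    This helper is hypothetical and not part of Scrapy.
    """
    # Strip surrounding whitespace from every node.
    cleaned = (node.strip() for node in nodes)
    # Drop nodes that are empty after stripping, then join the rest.
    return "\n".join(line for line in cleaned if line)


if __name__ == "__main__":
    # Made-up sample standing in for article_body.
    sample = ["\n  ", "First paragraph.", "\n", "  Second paragraph.  ", ""]
    print(join_text_nodes(sample))
```

    Inside `parse`, this would be called as `join_text_nodes(article_body)` in place of the `for` loop.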
  • Original article: https://www.cnblogs.com/baicai37/p/13443587.html