  • Python, week 2 (day 9): My Python growth diary. Master Python data mining in one month! (16) - The Scrapy framework

    The Scrapy framework

    Parsing the response

    >>> response.css('title::text').extract()
    ['Quotes to Scrape']

    There are two things to note here:
      (1) We've added ::text to the CSS query, which means we want to select only the text nodes directly inside the <title> element. Without ::text, we'd get the full title element, including its tags: ['<title>Quotes to Scrape</title>'].
      (2) The result of calling .extract() is a list, because we're dealing with an instance of SelectorList.
    When you know you just want the first result, as in this case, you can do:
    >>> response.css('title::text').extract_first()
    'Quotes to Scrape'
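
The two notes above can be sketched outside of Scrapy. The block below is only a stand-in built on the standard library's xml.etree.ElementTree, not Scrapy's selector implementation: the `page` string is an assumed, simplified version of the tutorial page, and `first_or_default` is a hypothetical helper that mirrors how extract_first() returns the first item, or a default (None unless specified) when nothing matched.

```python
import xml.etree.ElementTree as ET

# Assumed, simplified stand-in for the page used in the session above.
page = "<html><head><title>Quotes to Scrape</title></head><body></body></html>"
root = ET.fromstring(page)

# Every <title> element in the document, as a list (compare: SelectorList).
titles = root.findall(".//title")

# Comparable to 'title::text' followed by extract(): only the text, one item per match.
texts = [el.text for el in titles]

# Comparable to 'title' followed by extract(): the whole element, tags included.
elements = [ET.tostring(el, encoding="unicode") for el in titles]


def first_or_default(items, default=None):
    """Hypothetical helper: first item of a list, or `default` if the list is empty."""
    return items[0] if items else default


print(texts)                                  # ['Quotes to Scrape']
print(elements)                               # ['<title>Quotes to Scrape</title>']
print(first_or_default(texts))                # Quotes to Scrape
print(first_or_default([], default="n/a"))    # n/a
```

The practical difference is in the empty case: indexing an empty result list with [0] raises IndexError, while a first-or-default lookup quietly returns None (or the default you pass).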

    Besides the extract() and extract_first() methods, you can also use the re() method to extract using regular expressions:
    >>> response.css('title::text').re(r'Quotes.*')
    ['Quotes to Scrape']
    >>> response.css('title::text').re(r'Q\w+')
    ['Quotes']
    >>> response.css('title::text').re(r'(\w+) to (\w+)')
    ['Quotes', 'Scrape']
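
To see what these patterns do on their own, here is a minimal sketch using Python's standard re module on the title string. It is not Scrapy's re() implementation: `re_flat` is a hypothetical helper that only reproduces the flat-list shape of the results shown above. It also shows why the backslash in \w matters: without it, the pattern looks for a literal letter w.

```python
import re

title = "Quotes to Scrape"


def re_flat(pattern, text):
    """Hypothetical helper: return all matches as one flat list of strings."""
    flat = []
    for match in re.finditer(pattern, text):
        groups = match.groups()
        # With capture groups, keep the groups; otherwise keep the whole match.
        flat.extend(groups if groups else [match.group(0)])
    return flat


whole = re_flat(r"Quotes.*", title)           # everything from 'Quotes' onward
with_backslash = re_flat(r"Q\w+", title)      # 'Q' plus one or more word characters
without_backslash = re_flat(r"Qw+", title)    # 'Q' plus one or more literal 'w'
groups = re_flat(r"(\w+) to (\w+)", title)    # the two words around ' to '

print(whole)              # ['Quotes to Scrape']
print(with_backslash)     # ['Quotes']
print(without_backslash)  # []
print(groups)             # ['Quotes', 'Scrape']
```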

     
  • Original article: https://www.cnblogs.com/yugengde/p/7270696.html