  • webmagic: initializing startRequests

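    The Spider class initializes startRequests in three places: the constructor, startUrls(...), and startRequest(...). Any of these can serve as an extension point.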

        /**
         * create a spider with pageProcessor.
         *
         * @param pageProcessor pageProcessor
         */
        public Spider(PageProcessor pageProcessor) {
            this.pageProcessor = pageProcessor;
            this.site = pageProcessor.getSite();
            this.startRequests = pageProcessor.getSite().getStartRequests();
        }
    
        /**
         * Set startUrls of Spider.<br>
         * Prior to startUrls of Site.
         *
         * @param startUrls startUrls
         * @return this
         */
        public Spider startUrls(List<String> startUrls) {
            checkIfRunning();
            this.startRequests = UrlUtils.convertToRequests(startUrls);
            return this;
        }
    
        /**
         * Set startRequests of Spider.<br>
         * Prior to startRequests of Site.
         *
         * @param startRequests startRequests
         * @return this
         */
        public Spider startRequest(List<Request> startRequests) {
            checkIfRunning();
            this.startRequests = startRequests;
            return this;
        }
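
    The precedence implied by the code above can be sketched as follows. This is a minimal illustration, not webmagic itself: Request, Site, and MiniSpider are simplified, hypothetical stand-ins that copy only the three assignments shown, and the checkIfRunning() guard is left out. The URLs are placeholders.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StartRequestsSketch {

    // Hypothetical stand-in for webmagic's Request: only wraps a URL.
    static class Request {
        private final String url;
        Request(String url) { this.url = url; }
        String getUrl() { return url; }
    }

    // Hypothetical stand-in for Site: only holds the start requests.
    static class Site {
        private final List<Request> startRequests = new ArrayList<>();
        Site addStartUrl(String url) {
            startRequests.add(new Request(url));
            return this;
        }
        List<Request> getStartRequests() { return startRequests; }
    }

    // Mirrors the three assignments to startRequests shown above.
    static class MiniSpider {
        private List<Request> startRequests;

        // 1. Constructor: default to the start requests defined on the Site.
        MiniSpider(Site site) {
            this.startRequests = site.getStartRequests();
        }

        // 2. startUrls: replace the defaults with requests built from plain URLs.
        MiniSpider startUrls(List<String> startUrls) {
            List<Request> converted = new ArrayList<>();
            for (String url : startUrls) {
                converted.add(new Request(url));
            }
            this.startRequests = converted;
            return this;
        }

        // 3. startRequest: replace the defaults with ready-made requests.
        MiniSpider startRequest(List<Request> startRequests) {
            this.startRequests = startRequests;
            return this;
        }

        // Helper for inspection only; not part of the code quoted above.
        List<String> startUrlStrings() {
            List<String> urls = new ArrayList<>();
            for (Request request : startRequests) {
                urls.add(request.getUrl());
            }
            return urls;
        }
    }

    public static void main(String[] args) {
        Site site = new Site().addStartUrl("https://example.com/from-site");

        // Only the constructor ran, so the Site's start requests are used.
        System.out.println(new MiniSpider(site).startUrlStrings());

        // A later call replaces whatever the constructor picked up from the Site.
        MiniSpider spider = new MiniSpider(site)
                .startUrls(Arrays.asList("https://example.com/from-startUrls"));
        System.out.println(spider.startUrlStrings());

        // The most recent assignment wins.
        spider.startRequest(Arrays.asList(new Request("https://example.com/from-startRequest")));
        System.out.println(spider.startUrlStrings());
    }
}
```

    Each setter assigns a new list rather than appending, so whichever of the three runs last decides where the crawl starts, and the Site's own list is left unchanged.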
  • Original article: https://www.cnblogs.com/guazi/p/6676189.html