  • failed with: java.lang.NullPointerException

    failed with: java.lang.NullPointerException
    
    The properties shown in the XML below need to be set in Nutch's configuration file, 'conf/nutch-site.xml'; otherwise Nutch reports the error above.
    
    The filter rules in crawl-urlfilter.txt must also be set to match the domains listed in urls/url.txt.
    
    
    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    
    <!-- Put site-specific property overrides in this file. -->
    
    <configuration>
    <property>
    <name>http.agent.name</name>
    <value>MySearch</value>
    <description>My Search Engine</description>
    </property>
    
    <property>
    <name>http.agent.description</name>
    <value></value>
    <description>Further description of our bot- this text is used in
    the User-Agent header. It appears in parenthesis after the agent name.
    </description>
    </property>
    
    <property>
    <name>http.agent.url</name>
    <value></value>
    <description>A URL to advertise in the User-Agent header. This will
    appear in parenthesis after the agent name. Custom dictates that this
    should be a URL of a page explaining the purpose and behavior of this
    crawler.
    </description>
    </property>
    
    <property>
    <name>http.agent.email</name>
    <value></value>
    <description>An email address to advertise in the HTTP 'From' request
    header and User-Agent header. A good practice is to mangle this
    address (e.g. 'info at example dot com') to avoid spamming.
    </description>
    </property>
    
    </configuration>
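
    For the crawl-urlfilter.txt step mentioned earlier, a minimal sketch of a matching rule might look like the following. It assumes urls/url.txt contains a seed such as http://www.example.com/ (example.com is a placeholder, not taken from the original post) and follows the pattern style of the stock filter file shipped with Nutch; adjust the domain to whatever your seed list actually contains.

    ```
    # accept URLs on hosts under example.com (placeholder domain; replace with your own)
    +^http://([a-z0-9]*\.)*example.com/

    # skip everything else
    -.
    ```

    Rules are evaluated top to bottom and the first matching pattern decides, so the accept line has to come before the final catch-all reject line.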
  • Original article: https://www.cnblogs.com/i80386/p/3972350.html