  • failed with: java.lang.NullPointerException

    failed with: java.lang.NullPointerException
    
    The properties below need to be set in Nutch's configuration file, conf/nutch-site.xml; otherwise Nutch reports the error above.
    
    crawl-urlfilter.txt must also be configured to match the domains listed in urls/url.txt.
    
    
    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    
    <!-- Put site-specific property overrides in this file. -->
    
    <configuration>
      <property>
        <name>http.agent.name</name>
        <value>MySearch</value>
        <description>My Search Engine</description>
      </property>

      <property>
        <name>http.agent.description</name>
        <value></value>
        <description>Further description of our bot; this text is used in
        the User-Agent header. It appears in parentheses after the agent name.
        </description>
      </property>

      <property>
        <name>http.agent.url</name>
        <value></value>
        <description>A URL to advertise in the User-Agent header. This will
        appear in parentheses after the agent name. Custom dictates that this
        should be a URL of a page explaining the purpose and behavior of this
        crawler.
        </description>
      </property>

      <property>
        <name>http.agent.email</name>
        <value></value>
        <description>An email address to advertise in the HTTP 'From' request
        header and User-Agent header. A good practice is to mangle this
        address (e.g. 'info at example dot com') to avoid spamming.
        </description>
      </property>

    </configuration>
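
    The crawl-urlfilter.txt change mentioned earlier is not shown in this post. As a rough sketch only, assume urls/url.txt contains the single seed http://www.example.com/ (example.com is a placeholder domain, not taken from the original). The accept rule in conf/crawl-urlfilter.txt, which stock Nutch 1.x ships with the placeholder MY.DOMAIN.NAME, would then look something like this, with the other rules in the file left as they are:

    ```
    # accept hosts in example.com (placeholder domain; substitute the one from urls/url.txt)
    +^http://([a-z0-9]*\.)*example\.com/
    ```

    Each line is a regular expression prefixed with + (accept) or - (reject), so a seed whose host does not match any + rule is filtered out before it is fetched.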
  • Original article: https://www.cnblogs.com/i80386/p/3972350.html