  • Lucene 4.6.1 java.lang.IllegalStateException: TokenStream contract violation

    Old code throws this exception when run against a newer version of Lucene. The full output is:

    Exception in thread "main" java.lang.IllegalStateException: TokenStream contract violation: reset()/close() call missing, reset() called multiple times, or subclass does not call super.reset(). Please see Javadocs of TokenStream class for more information about the correct consuming workflow.
    at org.apache.lucene.analysis.Tokenizer$1.read(Tokenizer.java:110)
    at java.io.Reader.read(Reader.java:140)
    at org.wltea.analyzer.core.AnalyzeContext.fillBuffer(AnalyzeContext.java:124)
    at org.wltea.analyzer.core.IKSegmenter.next(IKSegmenter.java:122)
    at org.wltea.analyzer.lucene.IKTokenizer.incrementToken(IKTokenizer.java:78)
    at com.hankcs.train.IKHelper.parse(IKHelper.java:36)
    at com.hankcs.train.AnalysisAdjuster.handleFile(AnalysisAdjuster.java:44)
    at com.hankcs.train.AnalysisAdjuster.main(AnalysisAdjuster.java:37)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:120)

    Process finished with exit code 1
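
    The top frame, Tokenizer$1.read, is an anonymous Reader inside Tokenizer. Judging by the trace and the message, the tokenizer appears to hold a placeholder reader that throws on every read until reset() swaps in the real input. The sketch below models that idea with plain java.io. MiniTokenizer, readChar and the message text are made up for illustration and are not Lucene's actual code.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class GuardReaderSketch {
    // Placeholder reader: every read attempt fails until a real reader is swapped in.
    static final Reader ILLEGAL_STATE_READER = new Reader() {
        @Override
        public int read(char[] cbuf, int off, int len) {
            throw new IllegalStateException("contract violation: reset() call missing");
        }

        @Override
        public void close() {}
    };

    // Hypothetical stand-in for a tokenizer's input handling (not Lucene's class).
    static class MiniTokenizer {
        private Reader input = ILLEGAL_STATE_READER;   // where reads actually go
        private Reader pending = ILLEGAL_STATE_READER; // real input, parked until reset()

        void setReader(Reader reader) {
            pending = reader;
        }

        // Activates the parked reader; a second reset() without setReader() re-arms the guard.
        void reset() {
            input = pending;
            pending = ILLEGAL_STATE_READER;
        }

        int readChar() {
            try {
                return input.read(); // Reader.read() delegates to read(char[], int, int)
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }

        // Re-arms the guard; a real tokenizer would also close the underlying reader here.
        void close() {
            input = ILLEGAL_STATE_READER;
            pending = ILLEGAL_STATE_READER;
        }
    }

    public static void main(String[] args) {
        MiniTokenizer tokenizer = new MiniTokenizer();
        tokenizer.setReader(new StringReader("abc"));
        try {
            tokenizer.readChar(); // same mistake as reading tokens without reset()
        } catch (IllegalStateException e) {
            System.out.println("before reset(): " + e.getMessage());
        }
        tokenizer.reset();
        System.out.println("after reset(): " + (char) tokenizer.readChar());
        tokenizer.close();
    }
}
```

    In this model a read fails both before reset() and after close(), which matches the "reset()/close() call missing" wording of the message.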

    Old code:

    // Fails on Lucene 4.6.1: incrementToken() is called without a prior reset().
    IKAnalyzer ss = new IKAnalyzer();
    StringReader reader = new StringReader(str);
    try
    {
        TokenStream tokenStream = ss.tokenStream("", reader);
        while (tokenStream.incrementToken())
        {
            CharTermAttribute termAttribute = tokenStream.getAttribute(CharTermAttribute.class);
            System.out.println(termAttribute.toString());
        }
    }
    catch (IOException e)
    {
        e.printStackTrace();
    }

    According to the new API documentation, a consumer of the TokenStream API must follow this workflow:

    The workflow of the new TokenStream API is as follows:

    1. Instantiation of TokenStream/TokenFilters which add/get attributes to/from the AttributeSource.

    2. The consumer calls reset().

    3. The consumer retrieves attributes from the stream and stores local references to all attributes it wants to access.

    4. The consumer calls incrementToken() until it returns false consuming the attributes after each call.

    5. The consumer calls end() so that any end-of-stream operations can be performed.

    6. The consumer calls close() to release any resource when finished using the TokenStream.
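
    The six steps can be sketched without Lucene on the classpath. MiniTokenStream below is a hypothetical stand-in that splits on whitespace, with a StringBuilder playing the role of CharTermAttribute. Only the order of calls in consume() is meant to carry over to real TokenStream code.

```java
import java.util.ArrayList;
import java.util.List;

public class WorkflowSketch {
    // Hypothetical stand-in for a TokenStream: splits on whitespace and insists on reset() first.
    static class MiniTokenStream {
        final StringBuilder termAtt = new StringBuilder(); // plays the role of CharTermAttribute
        private final String[] words;
        private int pos = -1; // -1 means reset() has not been called yet, or the stream is closed

        MiniTokenStream(String text) {
            String trimmed = text.trim();
            words = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
        }

        void reset() {
            pos = 0;
        }

        boolean incrementToken() {
            if (pos < 0) {
                throw new IllegalStateException("reset() call missing");
            }
            if (pos >= words.length) {
                return false;
            }
            termAtt.setLength(0); // the same attribute object is overwritten for each token
            termAtt.append(words[pos++]);
            return true;
        }

        void end() {
            // end-of-stream bookkeeping would go here
        }

        void close() {
            pos = -1;
        }
    }

    // Step 1 (instantiation) is done by the caller; steps 2 to 6 happen here.
    static List<String> consume(MiniTokenStream stream) {
        List<String> out = new ArrayList<>();
        try {
            stream.reset();                       // step 2
            StringBuilder term = stream.termAtt;  // step 3: keep a local reference
            while (stream.incrementToken()) {     // step 4
                out.add(term.toString());         // copy the value, the attribute is reused
            }
            stream.end();                         // step 5
        } finally {
            stream.close();                       // step 6, runs even if an exception was thrown
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(consume(new MiniTokenStream("to be or not")));
    }
}
```

    Putting close() in a finally block is a design choice of this sketch: it returns the stream to its closed state even if consumption fails halfway.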

    So the code must call reset() once before the first call to incrementToken():

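    IKAnalyzer ss = new IKAnalyzer();
    StringReader reader = new StringReader(str);
    // try-with-resources calls close() when the block exits (step 6)
    try (TokenStream tokenStream = ss.tokenStream("", reader))
    {
        // Step 3: look up the attribute once; its contents change on every incrementToken()
        CharTermAttribute termAttribute = tokenStream.getAttribute(CharTermAttribute.class);
        tokenStream.reset(); // the fix: step 2, exactly once before incrementToken()
        while (tokenStream.incrementToken())
        {
            System.out.println(termAttribute.toString());
        }
        tokenStream.end(); // step 5
    }
    catch (IOException e)
    {
        e.printStackTrace();
    }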

    When reposting, please credit: 码农场 » Lucene 4.6.1 java.lang.IllegalStateException: TokenStream contract violation

  • Original article: https://www.cnblogs.com/coder-zhang/p/3864992.html