  • [Problem and Solution] NLTK was unable to find the megam file! (1)

    While working through the section "Training Classifier-Based Chunkers", I ran into a problem after testing the code.

    class ConsecutiveNPChunkTagger(nltk.TaggerI):
        def __init__(self, train_sents):
            train_set = []
            for tagged_sent in train_sents:
                untagged_sent = nltk.tag.untag(tagged_sent)
                history = []
                for i, (word, tag) in enumerate(tagged_sent):
                    featureset = npchunk_features(untagged_sent, i, history) 
                    train_set.append( (featureset, tag) )
                    history.append(tag)
            self.classifier = nltk.MaxentClassifier.train(train_set, algorithm='megam', trace=0)
        def tag(self, sentence):
            history = []
            for i, word in enumerate(sentence):
                featureset = npchunk_features(sentence,i, history)
                tag = self.classifier.classify(featureset)
                history.append(tag)
            return zip(sentence, history)
    class ConsecutiveNPChunker(nltk.ChunkParserI):
        def __init__(self, train_sents):
            tagged_sents = [[((w,t),c) for (w,t,c) in
                             nltk.chunk.tree2conlltags(sent)] for sent in train_sents]
            self.tagger = ConsecutiveNPChunkTagger(tagged_sents)
        def parse(self, sentence):
            tagged_sents = self.tagger.tag(sentence)
            conlltags = [(w,t,c) for ((w,t),c) in tagged_sents]
            return nltk.chunk.conlltags2tree(conlltags)

    def npchunk_features(sentence, i, history):
        word, pos = sentence[i]
        return {"pos": pos}

    chunker = ConsecutiveNPChunker(train_sents)
    print(chunker.evaluate(test_sents))
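    The chunker above converts between two tuple layouts: (word, pos, chunk) triples on the CoNLL side and ((word, pos), chunk) pairs on the tagger side. A minimal sketch of just that reshaping follows, without NLTK. The helper names and the sample sentence are made up for illustration and do not come from the book.

```python
def to_tagger_format(conll_triples):
    """Turn (word, pos, chunk) triples into ((word, pos), chunk) pairs."""
    return [((w, t), c) for (w, t, c) in conll_triples]


def to_conll_format(tagged_pairs):
    """Inverse step: flatten ((word, pos), chunk) pairs back into triples."""
    return [(w, t, c) for ((w, t), c) in tagged_pairs]


# Made-up example sentence with IOB chunk tags (illustrative data only).
triples = [("the", "DT", "B-NP"), ("cat", "NN", "I-NP"), ("sat", "VBD", "O")]

pairs = to_tagger_format(triples)
print(pairs[0])                           # (('the', 'DT'), 'B-NP')
print(to_conll_format(pairs) == triples)  # True
```

    With this layout the tagger sees the (word, pos) pair as its "word" and the chunk label as its "tag", which is why npchunk_features unpacks sentence[i] into word and pos.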

    The code above is what the book provides. The problem is that executing

    chunker = ConsecutiveNPChunker(train_sents)

    did not run as expected and instead produced an error:
    Traceback (most recent call last):
      File "<pyshell#119>", line 1, in <module>
        chunker = ConsecutiveNPChunker(train_sents)
      File "<pyshell#118>", line 5, in __init__
        self.tagger = ConsecutiveNPChunkTagger(tagged_sents)
      File "<pyshell#116>", line 11, in __init__
        self.classifier = nltk.MaxentClassifier.train(train_set, algorithm='megam', trace=0)
      File "D:\SpecialSoftware\Python25\Lib\site-packages\nltk\classify\maxent.py", line 319, in train
        gaussian_prior_sigma, **cutoffs)
      File "D:\SpecialSoftware\Python25\Lib\site-packages\nltk\classify\maxent.py", line 1522, in train_maxent_classifier_with_megam
        stdout = call_megam(options)
      File "D:\SpecialSoftware\Python25\Lib\site-packages\nltk\classify\megam.py", line 163, in call_megam
        config_megam()
      File "D:\SpecialSoftware\Python25\Lib\site-packages\nltk\classify\megam.py", line 59, in config_megam
        url='http://www.cs.utah.edu/~hal/megam/')
      File "D:\SpecialSoftware\Python25\Lib\site-packages\nltk\internals.py", line 528, in find_binary
        url, verbose)
      File "D:\SpecialSoftware\Python25\Lib\site-packages\nltk\internals.py", line 512, in find_file
        raise LookupError('\n\n%s\n%s\n%s' % (div, msg, div))
    LookupError: 
    
    ===========================================================================
    NLTK was unable to find the megam file!
    Use software specific configuration paramaters or set the MEGAM environment variable.
    
      For more information, on megam, see:
        <http://www.cs.utah.edu/~hal/megam/>
    ===========================================================================

    Although the error message gives a hint, it is not a complete solution.
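    The hint in the error message (set the MEGAM environment variable) can be checked before training is attempted. The sketch below is a hypothetical helper, not part of NLTK: it looks at MEGAM first and then at the PATH. The executable names it tries are assumptions and may differ from the names NLTK itself searches for.

```python
import os
import shutil


def find_megam(env=None):
    """Return a path to a megam executable, or None if none is found.

    Hypothetical helper, not an NLTK function. It checks the MEGAM
    environment variable named in the error message, then the PATH.
    """
    env = os.environ if env is None else env
    candidate = env.get("MEGAM")
    if candidate:
        # MEGAM may name the executable itself ...
        if os.path.isfile(candidate):
            return candidate
        # ... or the directory that holds it (file names are assumptions).
        for name in ("megam", "megam.exe", "megam.opt"):
            path = os.path.join(candidate, name)
            if os.path.isfile(path):
                return path
    # Fall back to searching the PATH.
    for name in ("megam", "megam.opt"):
        found = shutil.which(name, path=env.get("PATH"))
        if found:
            return found
    return None
```

    If find_megam() returns None, the LookupError shown above is to be expected. If it returns a path, that path can presumably be handed to nltk.classify.megam.config_megam, the function visible in the traceback, before training is retried.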

    A Google search turned up some leads.

    My operating system is Windows 8.

    The official NLTK site gives a hint:

    https://sites.google.com/site/naturallanguagetoolkit/download

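    Megam is an optional package that can be installed later, when it is actually needed. The download address is: MegaM: http://hal3.name/megam/megam_src.tgz. Downloading it directly did not work for me; the download succeeded with Xunlei (Thunder).

    After opening the archive, however, I found that it contains only source files. It does include a README file with usage hints, and it turns out that one more thing has to be installed.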
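    The inspection step described above can also be done from Python. The following is a minimal sketch using only the standard tarfile module. The function names are made up for illustration, and the archive is assumed to have already been downloaded to a local path.

```python
import tarfile


def list_sources(archive_path):
    """Return the names of all regular files inside a .tgz archive."""
    with tarfile.open(archive_path, "r:gz") as archive:
        return [m.name for m in archive.getmembers() if m.isfile()]


def read_readme(archive_path):
    """Return the text of the first file whose name starts with README, or None."""
    with tarfile.open(archive_path, "r:gz") as archive:
        for member in archive.getmembers():
            base_name = member.name.rsplit("/", 1)[-1]
            if member.isfile() and base_name.upper().startswith("README"):
                data = archive.extractfile(member).read()
                return data.decode("utf-8", errors="replace")
    return None
```

    For example, list_sources("megam_src.tgz") would list the files in the downloaded archive, assuming it was saved under that name in the current directory.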

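    The README says: ocaml (http://caml.inria.fr)

    The compiler for the source files has to be downloaded from that site, so I downloaded the version that matches my system. Judging from the instructions, the installation takes some figuring out. During installation it was flagged as a virus, but I chose to trust it anyway, since otherwise there was no way to continue.

    [OCaml is installing right now. I chose the full installation, which may not actually be necessary. Once it finishes, I will continue working out how to solve this problem.]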

     
  • Original article: https://www.cnblogs.com/createMoMo/p/3079290.html