  • Food Log with Speech Recognition and NLP

    1. Word segmentation

    For Chinese text, jieba is a popular word segmenter.
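As a minimal sketch of this step, the helper below wraps a segmenter behind one function. The name `segment` and its `cut` parameter are hypothetical; `jieba.lcut` is the jieba call that returns a list of tokens. The example sentence is the one used in the evaluation further down.

```python
def segment(sentence, cut=None):
    """Split a sentence into a list of non-blank tokens.

    `cut` is any callable mapping a string to an iterable of tokens.
    When omitted, jieba.lcut is used (third-party: pip install jieba).
    """
    if cut is None:
        import jieba  # imported lazily so another segmenter can be swapped in
        cut = jieba.lcut
    return [tok for tok in cut(sentence) if tok.strip()]


# Usage (requires jieba; the exact split depends on jieba's dictionary):
# tokens = segment("我今天晚上吃了两盘回锅肉")
```

Passing the segmenter in as a callable keeps the rest of the pipeline independent of jieba, which may help if a custom food dictionary or another tool is tried later.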

    2. Named Entity Recognition

    1. Train your own model

          

    See the Stanford NER FAQ entry "How can I train my own NER model?":

    https://nlp.stanford.edu/software/crf-faq.html#a
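The training command reads its settings from `chinese.meal.fpp.prop`, which the post does not show. The sketch below rebuilds a plausible version of that file from the flags the trainer echoes in its log; the key order and the helper names `render_props` and `write_props` are assumptions, and the real file may have contained comments or other formatting.

```python
from pathlib import Path

# Flags as echoed in the training log (order here is arbitrary).
PROPS = {
    "trainFile": "chinese.meal.fpp.tsv",
    "serializeTo": "ner-model.ser.gz",
    "map": "word=0,answer=1",  # column 0 is the token, column 1 the label
    "useClassFeature": "true",
    "useWord": "true",
    "useNGrams": "true",
    "noMidNGrams": "true",
    "maxNGramLeng": "6",
    "usePrev": "true",
    "useNext": "true",
    "useSequences": "true",
    "usePrevSequences": "true",
    "maxLeft": "1",
    "useTypeSeqs": "true",
    "useTypeSeqs2": "true",
    "useTypeySequences": "true",
    "wordShape": "chris2useLC",
    "useDisjunctive": "true",
}


def render_props(props):
    """Render a dict as key=value lines, one property per line."""
    return "".join(f"{key}={value}\n" for key, value in props.items())


def write_props(path, props=PROPS):
    """Write the properties file; every value here is plain ASCII."""
    Path(path).write_text(render_props(props), encoding="ascii")


# Usage: write_props("chinese.meal.fpp.prop")
```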

    C:my_studyMLNLPstanford-ner-2018-02-27>java -cp stanford-ner.jar edu.stanford.nlp.ie.crf.CRFClassifier -prop chinese.meal.fpp.prop
    Invoked on Thu Mar 22 16:34:06 CST 2018 with arguments: -prop chinese.meal.fpp.prop
    usePrevSequences=true
    useClassFeature=true
    useTypeSeqs2=true
    useSequences=true
    wordShape=chris2useLC
    useTypeySequences=true
    useDisjunctive=true
    noMidNGrams=true
    serializeTo=ner-model.ser.gz
    maxNGramLeng=6
    useNGrams=true
    usePrev=true
    useNext=true
    maxLeft=1
    trainFile=chinese.meal.fpp.tsv
    map=word=0,answer=1
    useWord=true
    useTypeSeqs=true
    numFeatures = 564
    Time to convert docs to feature indices: 0.0 seconds
    numClasses: 5 [0=O,1=TIME,2=QUANTITY,3=UNIT,4=FOOD]
    numDocuments: 1
    numDatums: 56
    numFeatures: 564
    Time to convert docs to data/labels: 0.0 seconds
    numWeights: 6460
    QNMinimizer called on double function of 6460 variables, using M = 25.
                   An explanation of the output:
    Iter           The number of iterations
    evals          The number of function evaluations
    SCALING        <D> Diagonal scaling was used; <I> Scaled Identity
    LINESEARCH     [## M steplength]  Minpack linesearch
                       1-Function value was too high
                       2-Value ok, gradient positive, positive curvature
                       3-Value ok, gradient negative, positive curvature
                       4-Value ok, gradient negative, negative curvature
                   [.. B]  Backtracking
    VALUE          The current function value
    TIME           Total elapsed time
    |GNORM|        The current norm of the gradient
    {RELNORM}      The ratio of the current to initial gradient norms
    AVEIMPROVE     The average improvement / current value
    EVALSCORE      The last available eval score
    
    Iter ## evals ## <SCALING> [LINESEARCH] VALUE TIME |GNORM| {RELNORM} AVEIMPROVE EVALSCORE
    
    Iter 1 evals 1 <D> [M 1.000E-1] 9.068E2 0.04s |4.550E1| {4.995E-1} 0.000E0 -
    Iter 2 evals 2 <D> [M 1.000E0] 6.222E2 0.05s |3.525E1| {3.870E-1} 2.287E-1 -
    Iter 3 evals 3 <D> [M 1.000E0] 2.386E2 0.07s |5.406E1| {5.935E-1} 9.334E-1 -
    Iter 4 evals 4 <D> [M 1.000E0] 9.082E1 0.08s |1.571E1| {1.724E-1} 2.246E0 -
    Iter 5 evals 5 <D> [M 1.000E0] 7.031E1 0.10s |1.181E1| {1.297E-1} 2.379E0 -
    Iter 6 evals 6 <D> [M 1.000E0] 5.308E1 0.11s |1.025E1| {1.125E-1} 2.681E0 -
    Iter 7 evals 7 <D> [1M 2.740E-1] 2.988E1 0.14s |7.586E0| {8.328E-2} 4.193E0 -
    Iter 8 evals 9 <D> [1M 1.292E-1] 2.234E1 0.16s |6.471E0| {7.105E-2} 4.949E0 -
    Iter 9 evals 11 <D> [1M 1.801E-1] 1.615E1 0.18s |5.573E0| {6.118E-2} 6.127E0 -
    Iter 10 evals 13 <D> [1M 1.815E-1] 1.218E1 0.24s |4.477E0| {4.915E-2} 7.346E0 -
    Iter 11 evals 15 <D> [1M 3.119E-1] 8.873E0 0.30s |4.694E0| {5.154E-2} 6.912E0 -
    Iter 12 evals 17 <D> [1M 4.760E-1] 6.621E0 0.31s |2.092E0| {2.296E-2} 3.504E0 -
    Iter 13 evals 19 <D> [M 1.000E0] 6.093E0 0.32s |1.906E0| {2.092E-2} 1.390E0 -
    Iter 14 evals 20 <D> [M 1.000E0] 5.844E0 0.33s |9.067E-1| {9.955E-3} 1.103E0 -
    Iter 15 evals 21 <D> [M 1.000E0] 5.721E0 0.33s |5.774E-1| {6.339E-3} 8.279E-1 -
    Iter 16 evals 22 <D> [M 1.000E0] 5.660E0 0.34s |3.535E-1| {3.881E-3} 4.279E-1 -
    Iter 17 evals 23 <D> [M 1.000E0] 5.640E0 0.35s |1.946E-1| {2.137E-3} 2.961E-1 -
    Iter 18 evals 24 <D> [M 1.000E0] 5.632E0 0.36s |7.832E-2| {8.599E-4} 1.868E-1 -
    Iter 19 evals 25 <D> [M 1.000E0] 5.631E0 0.38s |3.559E-2| {3.907E-4} 1.163E-1 -
    Iter 20 evals 26 <D> [M 1.000E0] 5.631E0 0.39s |2.149E-2| {2.359E-4} 5.758E-2 -
    Iter 21 evals 27 <D> [M 1.000E0] 5.631E0 0.41s |1.027E-2| {1.128E-4} 1.758E-2 -
    Iter 22 evals 28 <D> [M 1.000E0] 5.631E0 0.42s |3.631E-3| {3.986E-5} 8.218E-3 -
    Iter 23 evals 29 <D> [M 1.000E0] 5.631E0 0.44s |1.629E-3| {1.789E-5} 3.791E-3 -
    Iter 24 evals 30 <D> [M 1.000E0] 5.631E0 0.45s |9.548E-4| {1.048E-5} 1.596E-3 -
    Iter 25 evals 31 <D> [M 1.000E0] 5.631E0 0.45s |5.724E-4| {6.284E-6} 5.196E-4 -
    Iter 26 evals 32 <D> [M 1.000E0] 5.631E0 0.47s |1.578E-4| {1.732E-6} 1.686E-4 -
    QNMinimizer terminated due to average improvement: | newest_val - previous_val | / |newestVal| < TOL
    Total time spent in optimization: 0.49s
    CRFClassifier training ... done [0.6 sec].
    Serializing classifier to ner-model.ser.gz... done.
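With `map=word=0,answer=1`, the trainer expects one token per line followed by a tab and its label. The sketch below shows one way to produce such a file from segmented, hand-labeled sentences. The helper name `to_tsv` is hypothetical, the sample is the sentence from the evaluation run rather than the author's actual training data, and separating sentences with a blank line and writing UTF-8 are assumptions about what the reader accepts.

```python
from pathlib import Path

# One labeled sentence as (token, label) pairs; labels are the five
# classes reported in the log: O, TIME, QUANTITY, UNIT, FOOD.
SAMPLE = [
    ("我", "O"), ("今天", "O"), ("晚上", "TIME"), ("吃", "O"), ("了", "O"),
    ("两", "QUANTITY"), ("盘", "UNIT"), ("回锅肉", "FOOD"),
]


def to_tsv(sentences):
    """Render sentences as token<TAB>label lines, blank line between sentences."""
    blocks = [
        "".join(f"{token}\t{label}\n" for token, label in sentence)
        for sentence in sentences
    ]
    return "\n".join(blocks)


def write_tsv(path, sentences):
    """Write the training or test file (UTF-8 assumed)."""
    Path(path).write_text(to_tsv(sentences), encoding="utf-8")


# Usage: write_tsv("chinese.meal.fpp.test.tsv", [SAMPLE])
```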

    2. Evaluate the trained model to see how well it performs.

    C:my_studyMLNLPstanford-ner-2018-02-27>java -cp stanford-ner.jar edu.stanford.nlp.ie.crf.CRFClassifier -loadClassifier ner-model.ser.gz -testFile chinese.meal.fpp.test.tsv
    Invoked on Thu Mar 22 16:30:48 CST 2018 with arguments: -loadClassifier ner-model.ser.gz -testFile chinese.meal.fpp.test.tsv
    testFile=chinese.meal.fpp.test.tsv
    loadClassifier=ner-model.ser.gz
    Loading classifier from ner-model.ser.gz ... done [0.1 sec].
    我      O       O
    今天    O       O
    晚上    TIME    TIME
    吃      O       O
    了      O       O
    两      QUANTITY        QUANTITY
    盘      UNIT    UNIT
    回锅肉  FOOD    FOOD
    
    CRFClassifier tagged 8 words in 1 documents at 88.89 words per second.
             Entity P       R       F1      TP      FP      FN
               FOOD 1.0000  1.0000  1.0000  1       0       0
           QUANTITY 1.0000  1.0000  1.0000  1       0       0
               TIME 1.0000  1.0000  1.0000  1       0       0
               UNIT 1.0000  1.0000  1.0000  1       0       0
             Totals 1.0000  1.0000  1.0000  4       0       0

    Not bad, at least on this single 8-word test sentence!
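To make the Totals row of the table concrete, here is a rough sketch that recomputes entity-level precision, recall and F1 from the three-column output (token, gold label, predicted label). It assumes an entity is a maximal run of identical non-`O` labels; Stanford NER's own scorer may differ in details, and the function names are hypothetical.

```python
from itertools import groupby

# The tagged lines printed by the -testFile run: token, gold, predicted.
SAMPLE_OUTPUT = """\
我      O       O
今天    O       O
晚上    TIME    TIME
吃      O       O
了      O       O
两      QUANTITY        QUANTITY
盘      UNIT    UNIT
回锅肉  FOOD    FOOD
"""


def read_columns(lines):
    """Collect gold and predicted labels from whitespace-separated lines."""
    gold, pred = [], []
    for line in lines:
        parts = line.split()
        if len(parts) == 3:  # skip blank lines and status messages
            gold.append(parts[1])
            pred.append(parts[2])
    return gold, pred


def entities(labels, background="O"):
    """Return the set of (start, end, label) spans for non-background runs."""
    spans, pos = set(), 0
    for label, run in groupby(labels):
        length = len(list(run))
        if label != background:
            spans.add((pos, pos + length, label))
        pos += length
    return spans


def score(gold, pred):
    """Return (precision, recall, f1, tp, fp, fn) over entity spans."""
    g, p = entities(gold), entities(pred)
    tp, fp, fn = len(g & p), len(p - g), len(g - p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1, tp, fp, fn


gold, pred = read_columns(SAMPLE_OUTPUT.splitlines())
print(score(gold, pred))
```

On the sample above this yields four true positives and no errors, matching the Totals row; per-entity rows could be obtained by filtering the spans by label before counting.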

    Ref:

    1. Stanford NLP NER: https://nlp.stanford.edu/software/CRF-NER.html

    Please credit the source when reposting: http://www.cnblogs.com/mashuai-191/
  • Original article: https://www.cnblogs.com/mashuai-191/p/8621413.html