  • IntroductionToNLP

    from pyhanlp import HanLP
    
    def main():
        # To keep you from getting bored while waiting, turn on debug mode so it has something to say :-)
        HanLP.Config.enableDebug()
        print(HanLP.segment("王国维和服务员"))
    
    if __name__ == '__main__':
        main()

    Downloading the data package takes a long time, so I went to bed and will check again tomorrow.

    The result the next day:

    Downloading https://file.hankcs.com/hanlp/data-for-1.7.5.zip to C:\Users\Administrator\AppData\Local\Programs\Python\Python38\Lib\site-packages\pyhanlp\static\data-for-1.7.8.zip
    100.00%, 637 MB, 404 KB/s, 0 min 0 s remaining
    Extracting data.zip...
    Jun 17, 2020 1:28:41 AM com.hankcs.hanlp.dictionary.CustomDictionary loadMainDictionary
    INFO: Loading custom dictionary: C:/Users/Administrator/AppData/Local/Programs/Python/Python38/Lib/site-packages/pyhanlp/static/data/dictionary/custom/CustomDictionary.txt
    Jun 17, 2020 1:28:42 AM com.hankcs.hanlp.dictionary.CustomDictionary isDicNeedUpdate
    INFO: Cleared the custom dictionary cache file!
    Jun 17, 2020 1:28:42 AM com.hankcs.hanlp.dictionary.CustomDictionary loadMainDictionary
    INFO: Loading custom dictionary C:/Users/Administrator/AppData/Local/Programs/Python/Python38/Lib/site-packages/pyhanlp/static/data/dictionary/custom/CustomDictionary.txt with default part of speech [n]...
    Jun 17, 2020 1:28:42 AM com.hankcs.hanlp.dictionary.CustomDictionary loadMainDictionary
    INFO: Loading custom dictionary C:/Users/Administrator/AppData/Local/Programs/Python/Python38/Lib/site-packages/pyhanlp/static/data/dictionary/custom/现代汉语补充词库.txt with default part of speech [n]...
    Jun 17, 2020 1:28:42 AM com.hankcs.hanlp.dictionary.CustomDictionary loadMainDictionary
    INFO: Loading custom dictionary C:\Users\Administrator\AppData\Local\Programs\Python\Python38\Lib\site-packages\pyhanlp\static\data\dictionary\custom\全国地名大全.txt with default part of speech [ns]...
    Jun 17, 2020 1:28:42 AM com.hankcs.hanlp.dictionary.CustomDictionary loadMainDictionary
    INFO: Loading custom dictionary C:/Users/Administrator/AppData/Local/Programs/Python/Python38/Lib/site-packages/pyhanlp/static/data/dictionary/custom/人名词典.txt with default part of speech [n]...
    Jun 17, 2020 1:28:42 AM com.hankcs.hanlp.dictionary.CustomDictionary loadMainDictionary
    INFO: Loading custom dictionary C:/Users/Administrator/AppData/Local/Programs/Python/Python38/Lib/site-packages/pyhanlp/static/data/dictionary/custom/机构名词典.txt with default part of speech [n]...
    Jun 17, 2020 1:28:42 AM com.hankcs.hanlp.dictionary.CustomDictionary loadMainDictionary
    INFO: Loading custom dictionary C:\Users\Administrator\AppData\Local\Programs\Python\Python38\Lib\site-packages\pyhanlp\static\data\dictionary\custom\上海地名.txt with default part of speech [ns]...
    Jun 17, 2020 1:28:42 AM com.hankcs.hanlp.dictionary.CustomDictionary loadMainDictionary
    INFO: Loading custom dictionary C:\Users\Administrator\AppData\Local\Programs\Python\Python38\Lib\site-packages\pyhanlp\static\data\dictionary\person\nrf.txt with default part of speech [nrf]...
    Jun 17, 2020 1:28:42 AM com.hankcs.hanlp.dictionary.CustomDictionary loadMainDictionary
    INFO: Building DoubleArrayTrie...
    Jun 17, 2020 1:28:44 AM com.hankcs.hanlp.dictionary.CustomDictionary loadMainDictionary
    INFO: Caching the dictionary as a dat file...
    Jun 17, 2020 1:28:44 AM com.hankcs.hanlp.dictionary.CustomDictionary <clinit>
    INFO: Custom dictionary loaded: 381940 entries in 2795 ms
    Jun 17, 2020 1:28:44 AM com.hankcs.hanlp.dictionary.CoreDictionary load
    INFO: Loading core dictionary: C:/Users/Administrator/AppData/Local/Programs/Python/Python38/Lib/site-packages/pyhanlp/static/data/dictionary/CoreNatureDictionary.txt
    Jun 17, 2020 1:28:44 AM com.hankcs.hanlp.dictionary.CoreDictionary <clinit>
    INFO: C:/Users/Administrator/AppData/Local/Programs/Python/Python38/Lib/site-packages/pyhanlp/static/data/dictionary/CoreNatureDictionary.txt loaded: 153091 entries in 60 ms
    Coarse word lattice:
    0:[ ]
    1:[王, 王国]
    2:[国]
    3:[维, 维和]
    4:[和, 和服]
    5:[服, 服务, 服务员]
    6:[务]
    7:[员]
    8:[ ]
    
    Jun 17, 2020 1:28:44 AM com.hankcs.hanlp.dictionary.CoreBiGramTableDictionary <clinit>
    INFO: Loading bigram dictionary C:/Users/Administrator/AppData/Local/Programs/Python/Python38/Lib/site-packages/pyhanlp/static/data/dictionary/CoreNatureDictionary.ngram.txt.table
    Jun 17, 2020 1:28:44 AM com.hankcs.hanlp.dictionary.CoreBiGramTableDictionary <clinit>
    INFO: C:/Users/Administrator/AppData/Local/Programs/Python/Python38/Lib/site-packages/pyhanlp/static/data/dictionary/CoreNatureDictionary.ngram.txt.table loaded in 94 ms
    Coarse segmentation result [王国/n, 维和/vn, 服务员/nnt]
    Jun 17, 2020 1:28:45 AM com.hankcs.hanlp.dictionary.nr.PersonDictionary <clinit>
    INFO: C:/Users/Administrator/AppData/Local/Programs/Python/Python38/Lib/site-packages/pyhanlp/static/data/dictionary/person/nr.txt loaded in 171 ms
    Person-name role observation: [  K 1 A 1 ][王国 X 232 L 3 ][维和 L 2 V 1 Z 1 ][服务员 K 14 ][  K 1 A 1 ]
    Person-name role tagging: [ /K ,王国/X ,维和/V ,服务员/K , /K]
    Recognized person name: 王国维 XD
    Refined word lattice:
    0:[ ]
    1:[王国, 王国维]
    2:[]
    3:[维和]
    4:[和, 和服]
    5:[服务员]
    6:[]
    7:[]
    8:[ ]
    
    [王国维/nr, 和/cc, 服务员/nnt]
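
The coarse word lattice in the debug output lists, for each character position, every dictionary word that starts there. The block below is a minimal sketch of that idea in plain Python, not HanLP's actual implementation (HanLP looks words up in a DoubleArrayTrie built from its core dictionary). The function name `build_word_lattice` and the five-word `TOY_DICTIONARY` are assumptions made up for illustration, and keeping every single character as a fallback is a simplification.

```python
def build_word_lattice(sentence, dictionary):
    """For each start offset, collect every dictionary word that begins there.

    The single character at each offset is always kept as a fallback
    (a simplification for this sketch), so no position is left empty.
    """
    max_len = max(len(word) for word in dictionary)
    lattice = []
    for start in range(len(sentence)):
        words = [sentence[start]]  # single-character fallback
        # Try every longer substring up to the longest dictionary word.
        last_end = min(start + max_len, len(sentence))
        for end in range(start + 2, last_end + 1):
            candidate = sentence[start:end]
            if candidate in dictionary:
                words.append(candidate)
        lattice.append(words)
    return lattice


# Hypothetical toy dictionary, chosen only to cover the example sentence.
TOY_DICTIONARY = {"王国", "维和", "和服", "服务", "服务员"}

lattice = build_word_lattice("王国维和服务员", TOY_DICTIONARY)
for row, words in enumerate(lattice, start=1):
    print(f"{row}:[{', '.join(words)}]")
```

With this toy dictionary the printed rows 1 to 7 have the same shape as the coarse lattice in the log. Choosing the best path through the lattice (the bigram step) and the person-name recognition that later merges 王国 and 维 into 王国维 are not covered by this sketch.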
  • Original article: https://www.cnblogs.com/hbuwyg/p/13150123.html