  • Installing NLTK

    1. NLTK: Natural Language Toolkit

     Download: http://www.nltk.org

    pip install nltk
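
     To confirm that pip installed the package into the interpreter you actually run, a quick check like the one below can help. It is only a minimal sketch: the helper name is_installed is hypothetical, and it relies solely on the standard library's importlib, so it also runs when NLTK is missing.

```python
import importlib.util
import sys


def is_installed(module_name):
    """Return True if the current interpreter can locate module_name."""
    return importlib.util.find_spec(module_name) is not None


# Show which interpreter is running and whether it can see nltk.
print(sys.executable)
print("nltk installed:", is_installed("nltk"))
```

     If this reports False even though pip install succeeded, pip probably belongs to a different Python installation than the one running the script.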
    

    2. Usage

    import nltk
    nltk.download()  # opens the NLTK downloader; fetch the data packages you need
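
     Calling nltk.download() without arguments opens an interactive downloader. On a server it may be more convenient to request specific packages by name. The sketch below assumes the tokenizers and tagger used in this post need the packages "punkt" and "averaged_perceptron_tagger"; that is an assumption, since newer NLTK releases may ask for differently named packages, and the LookupError that NLTK raises names the one it wants. The helper names fetch_resources and fetch_with_nltk are hypothetical.

```python
# Assumed package names for the tokenizers and POS tagger used in this post.
# If NLTK raises LookupError, its message names the package it actually needs.
RESOURCES = ["punkt", "averaged_perceptron_tagger"]


def fetch_resources(download, names):
    """Call download(name) for each name and record whether it succeeded."""
    return {name: bool(download(name)) for name in names}


def fetch_with_nltk(names=RESOURCES):
    """Download the named packages non-interactively via nltk.download."""
    import nltk  # imported lazily so fetch_resources works without NLTK

    return fetch_resources(lambda name: nltk.download(name, quiet=True), names)
```

     Calling fetch_with_nltk() needs network access and returns a dict mapping each package name to True or False.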

    import nltk

    text = 'Hello, Tom! How are you recently?'

    sens = nltk.sent_tokenize(text)  # split the text into sentences
    sens  # ['Hello, Tom!', 'How are you recently?']

    words = []
    for sen in sens:
        words.append(nltk.word_tokenize(sen))  # split each sentence into word tokens

    words  # [['Hello', ',', 'Tom', '!'], ['How', 'are', 'you', 'recently', '?']]

    tags = []

    for tokens in words:
        tags.append(nltk.pos_tag(tokens))  # tag each token with its part of speech
    tags  # [[('Hello', 'NNP'), (',', ','), ('Tom', 'NNP'), ('!', '.')], [('How', 'WRB'), ('are', 'VBP'), ('you', 'PRP'), ('recently', 'RB'), ('?', '.')]]
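
     The tags in that output are Penn Treebank part-of-speech labels. As a rough illustration of working with the result, the sketch below hard-codes the output shown above (so it runs without NLTK), attaches a plain-English description to each tag, and counts how often each tag occurs. The names PENN_TAGS, describe, and tag_counts are hypothetical, and the table covers only the tags that appear in this example.

```python
from collections import Counter

# Output copied from the `tags` variable shown above.
tags = [
    [('Hello', 'NNP'), (',', ','), ('Tom', 'NNP'), ('!', '.')],
    [('How', 'WRB'), ('are', 'VBP'), ('you', 'PRP'), ('recently', 'RB'), ('?', '.')],
]

# Descriptions for the Penn Treebank tags that occur in this example only.
PENN_TAGS = {
    'NNP': 'proper noun, singular',
    'WRB': 'wh-adverb',
    'VBP': 'verb, non-3rd person singular present',
    'PRP': 'personal pronoun',
    'RB': 'adverb',
    ',': 'comma',
    '.': 'sentence-final punctuation',
}


def describe(tagged_sentences):
    """Attach a description to every (word, tag) pair."""
    return [
        [(word, tag, PENN_TAGS.get(tag, 'unknown')) for word, tag in sentence]
        for sentence in tagged_sentences
    ]


def tag_counts(tagged_sentences):
    """Count how many times each tag occurs across all sentences."""
    return Counter(tag for sentence in tagged_sentences for _, tag in sentence)


for sentence in describe(tags):
    for word, tag, meaning in sentence:
        print(f"{word}\t{tag}\t{meaning}")
print(tag_counts(tags))
```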
    

     

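    3. Installation succeeds, but the import fails

     NLTK is installed, but import nltk fails with: No module named '_sqlite3'

     Background: the Linux system ships with Python 2, where NLTK installed and worked. I then built Python 3 myself, and import nltk fails under that interpreter. The _sqlite3 extension is compiled only when the SQLite development headers are present at build time, so a Python 3 built without them has no sqlite3 support.

     Fix: run sudo apt-get install sqlite*, then rebuild and reinstall Python 3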

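    # Step 1: install the SQLite packages
    # (on Debian/Ubuntu the headers the build needs are in libsqlite3-dev)
    sudo apt-get install 'sqlite*'

    # Step 2: rebuild Python 3 from its source directory
    ./configure --prefix=/python3_path
    make && make install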
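
     After rebuilding, it is worth checking that the new interpreter really has the extension before retrying import nltk. The snippet below is a minimal sketch to run with the rebuilt python3; the function name sqlite_status is hypothetical.

```python
import importlib


def sqlite_status():
    """Return the SQLite library version if _sqlite3 is available, else None."""
    try:
        importlib.import_module("_sqlite3")
    except ImportError:
        return None
    import sqlite3

    return sqlite3.sqlite_version


version = sqlite_status()
if version:
    print("sqlite3 is available, SQLite version", version)
else:
    print("_sqlite3 is still missing; check the build and rebuild Python")
```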
    

     

     

     

     

  • Original article: https://www.cnblogs.com/always-fight/p/9779997.html