zoukankan      html  css  js  c++  java
  • python nltk 学习笔记(1)

    from nltk.book import *

    >>> type(text1)

    <class 'nltk.text.Text'>

    http://nltk.googlecode.com/svn/trunk/doc/api/nltk.text.Text-class.html


    text1.concordance("monstrous")
    text1.similar("monstrous")

    sorted(set(text3))

    >>> f = FreqDist(text1)

    >>> f

    <FreqDist with 19317 samples and 260819 outcomes>

    http://nltk.googlecode.com/svn/trunk/doc/api/nltk.probability.FreqDist-class.html

    >>> v = f.keys()

    >>> type(v)

    <type 'list'>

    >>> V = set(text1)

    >>> len(V) == len(v)

    True

     

    >>> long_words = [w for w in V if len(w) > 15]

    >>> sorted(long_words)

    FunctionMeaning
    s.startswith(t) test if s starts with t
    s.endswith(t) test if s ends with t
    in s test if t is contained inside s
    s.islower() test if all cased characters in s are lowercase
    s.isupper() test if all cased characters in s are uppercase
    s.isalpha() test if all characters in s are alphabetic
    s.isalnum() test if all characters in s are alphanumeric
    s.isdigit() test if all characters in s are digits
    s.istitle() test if s is titlecased (all words in s have have initial capitals)

     

     

    bigrams(text1)

    >>> [w.upper() for w in s]  //cap all elements

    >>> len(set([word.lower() for word in text1]))

     

     

     

  • 相关阅读:
    redis安装
    redis的使用场景和基本数据类型
    (传输层)tcp协议
    async/await
    Promise对象
    对称加密与非对称加密
    Js遍历数组总结
    HTTPS加密传输过程
    HTML节点操作
    Js的new运算符
  • 原文地址:https://www.cnblogs.com/wintor12/p/3622282.html
Copyright © 2011-2022 走看看