  • Python NLTK study notes (1)

    from nltk.book import *

    >>> type(text1)

    <class 'nltk.text.Text'>

    http://nltk.googlecode.com/svn/trunk/doc/api/nltk.text.Text-class.html


    text1.concordance("monstrous")
    text1.similar("monstrous")
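
To make the idea behind a concordance concrete, here is a minimal pure-Python sketch. It is not NLTK's implementation: the function name `simple_concordance`, the `width` parameter, and the sample tokens are all assumptions made for illustration. It simply lists every occurrence of a word together with a few tokens of context on each side.

```python
def simple_concordance(tokens, target, width=2):
    """Return each occurrence of `target` with `width` tokens of context per side.

    Hypothetical helper for illustration only; matching is case-insensitive.
    """
    target = target.lower()
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == target:
            left = tokens[max(0, i - width):i]      # up to `width` tokens before
            right = tokens[i + 1:i + 1 + width]     # up to `width` tokens after
            lines.append(" ".join(left + [tok] + right))
    return lines


# Made-up sample tokens standing in for text1.
sample_tokens = "the most monstrous size of a monstrous whale".split()
concordance_lines = simple_concordance(sample_tokens, "monstrous", width=2)
for line in concordance_lines:
    print(line)
```

With a real `Text` object the tokens would come from the corpus instead of a hand-written string.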

    sorted(set(text3))

    >>> f = FreqDist(text1)

    >>> f

    <FreqDist with 19317 samples and 260819 outcomes>

    http://nltk.googlecode.com/svn/trunk/doc/api/nltk.probability.FreqDist-class.html

    >>> v = f.keys()  # a list under Python 2 / NLTK 2; a dict_keys view under Python 3 / NLTK 3

    >>> type(v)

    <type 'list'>

    >>> V = set(text1)

    >>> len(V) == len(v)

    True
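
The check above works because a frequency distribution has exactly one key per distinct token. The sketch below illustrates this with `collections.Counter` from the standard library as a stand-in for `FreqDist`; the sample tokens are made up, and the variable names are assumptions rather than anything from the NLTK book.

```python
from collections import Counter

# Made-up sample tokens standing in for text1.
tokens = ["the", "whale", "and", "the", "sea", "and", "the", "ship"]

freq = Counter(tokens)                   # maps each distinct token to its count
distinct_from_freq = list(freq.keys())   # one key per distinct token
vocabulary = set(tokens)                 # the same distinct tokens, as a set

num_samples = len(freq)                  # number of distinct tokens
num_outcomes = sum(freq.values())        # total number of tokens counted

print(num_samples, num_outcomes)
print(len(vocabulary) == len(distinct_from_freq))
print(freq.most_common(2))
```

The "samples" and "outcomes" figures in the `FreqDist` repr correspond to `num_samples` and `num_outcomes` here.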

     

    >>> long_words = [w for w in V if len(w) > 15]

    >>> sorted(long_words)

    Function          Meaning
    s.startswith(t)   test if s starts with t
    s.endswith(t)     test if s ends with t
    t in s            test if t is contained inside s
    s.islower()       test if all cased characters in s are lowercase
    s.isupper()       test if all cased characters in s are uppercase
    s.isalpha()       test if all characters in s are alphabetic
    s.isalnum()       test if all characters in s are alphanumeric
    s.isdigit()       test if all characters in s are digits
    s.istitle()       test if s is titlecased (all words in s have initial capitals)
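
The tests in the table are ordinary Python string methods, so they can be tried without NLTK. The following sketch applies each one as a filter over a small made-up word list; the list and the variable names are assumptions chosen for illustration.

```python
# Made-up sample words; with NLTK these would come from set(text1).
words = ["Moby", "WHALE", "sea", "1851", "ch42", "unseen", "whale-ship"]

starts_un = [w for w in words if w.startswith("un")]
ends_ship = [w for w in words if w.endswith("ship")]
contains_ea = [w for w in words if "ea" in w]
lower = [w for w in words if w.islower()]      # needs at least one cased character
upper = [w for w in words if w.isupper()]
alpha = [w for w in words if w.isalpha()]      # the hyphen and digits disqualify a word
alnum = [w for w in words if w.isalnum()]
digits = [w for w in words if w.isdigit()]
title = [w for w in words if w.istitle()]

for name, result in [("startswith", starts_un), ("endswith", ends_ship),
                     ("in", contains_ea), ("islower", lower), ("isupper", upper),
                     ("isalpha", alpha), ("isalnum", alnum),
                     ("isdigit", digits), ("istitle", title)]:
    print(name, result)
```

Note that `"1851".islower()` is false because the string has no cased characters, while `"ch42".islower()` is true.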

     

     

    bigrams(text1)
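
A bigram is just a pair of adjacent tokens. As a rough sketch of the idea (not NLTK's own code), the pairs can be built with `zip`; the helper name `adjacent_pairs` and the sample sentence are assumptions made for illustration.

```python
def adjacent_pairs(tokens):
    """Return the list of (token, next_token) pairs. Hypothetical helper."""
    tokens = list(tokens)
    # zip stops at the shorter sequence, so the last token has no partner.
    return list(zip(tokens, tokens[1:]))


sample = ["more", "is", "said", "than", "done"]
pairs = adjacent_pairs(sample)
print(pairs)
```

A sequence of n tokens yields n - 1 pairs, and an empty or single-token sequence yields none.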

    >>> [w.upper() for w in s]  # uppercase every element of s (s is a list of words)

    >>> len(set([word.lower() for word in text1]))
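
Lowercasing before building the set merges forms such as "The" and "the", so the vocabulary count drops. A small sketch with made-up tokens (the variable names are assumptions) shows the effect, plus an optional step that also discards punctuation with `isalpha()`.

```python
# Made-up sample tokens standing in for text1.
tokens = ["The", "whale", "the", "Whale", "THE", ",", "sea"]

upper_tokens = [w.upper() for w in tokens]          # uppercase every element

raw_vocab = set(tokens)                              # case-sensitive vocabulary
folded_vocab = set(w.lower() for w in tokens)        # case-insensitive vocabulary
alpha_vocab = set(w.lower() for w in tokens if w.isalpha())  # also drops punctuation

print(len(raw_vocab), len(folded_vocab), len(alpha_vocab))
```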

     

     

     

  • Original post: https://www.cnblogs.com/wintor12/p/3622282.html