Please credit the source when reposting: “一块努力的牛皮糖”: http://www.cnblogs.com/yuxc/
2.6 Summary
• A text corpus is a large, structured collection of texts. NLTK comes with many corpora, e.g., the Brown Corpus, available as nltk.corpus.brown.
• Some text corpora are categorized, e.g., by genre or topic; sometimes the categories of a corpus overlap.
• A conditional frequency distribution is a collection of frequency distributions, each one for a different condition. It can be used to count word frequencies given a context or a genre.
• Python programs more than a few lines long should be entered using a text editor, saved to a file with a .py extension, and accessed using an import statement.
• Python functions permit you to associate a name with a particular block of code, and reuse that code as often as necessary.
• Some functions, known as “methods,” are associated with an object. To call one, we give the object name, then a period, then the method name, like this: x.funct(y), e.g., word.isalpha().
• To find out about some variable v, type help(v) in the Python interactive interpreter to read the help entry for this kind of object.
• WordNet is a semantically oriented dictionary of English, consisting of synonym sets (synsets) organized into a network.
• Some functions are not available by default, but must be accessed using Python’s import statement.
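
Several of the points above (import statements, named functions, method calls, and conditional frequency distributions) can be seen together in one small sketch. It uses only the Python standard library and is not NLTK's own implementation; the function name conditional_freq_dist and the two-genre toy corpus are made up for illustration.

```python
# Counter and defaultdict are not built in; they must be imported.
from collections import Counter, defaultdict


def conditional_freq_dist(pairs):
    """Build one frequency distribution (a Counter) per condition.

    `pairs` is an iterable of (condition, word) tuples.
    This is a hypothetical helper, not part of NLTK.
    """
    cfd = defaultdict(Counter)
    for condition, word in pairs:
        # word.isalpha() is a method call: object name, period, method name.
        if word.isalpha():
            cfd[condition][word.lower()] += 1
    return cfd


# Made-up sentences standing in for a corpus categorized by genre.
toy_corpus = {
    "news": "The jury said the election was fair .".split(),
    "romance": "She said she could not stay , could she ?".split(),
}

# Each word is paired with its genre, which acts as the condition.
pairs = [(genre, word) for genre, words in toy_corpus.items() for word in words]
cfd = conditional_freq_dist(pairs)

print(cfd["news"]["the"])      # count of "the" in the news condition
print(cfd["romance"]["she"])   # count of "she" in the romance condition
```

With NLTK installed and the Brown Corpus downloaded, the same kind of pairs could presumably be drawn from nltk.corpus.brown by genre and passed to nltk.ConditionalFreqDist instead of this hand-rolled helper.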