  • python nltk study notes (5): Learning to Classify Text

    >>> def gender_features(word):
    ...     return {'last_letter': word[-1]}
    >>> gender_features('Shrek')
    {'last_letter': 'k'}

    >>> import random
    >>> import nltk
    >>> from nltk.corpus import names
    >>> # Label every name; a separate variable avoids shadowing the `names` corpus reader.
    >>> labeled_names = ([(name, 'male') for name in names.words('male.txt')] +
    ...                  [(name, 'female') for name in names.words('female.txt')])
    >>> random.shuffle(labeled_names)
    >>> featuresets = [(gender_features(n), g) for (n, g) in labeled_names]
    >>> # Hold out the first 500 names for testing; train on the rest.
    >>> train_set, test_set = featuresets[500:], featuresets[:500]
    >>> classifier = nltk.NaiveBayesClassifier.train(train_set)
    >>> classifier.classify(gender_features('Neo'))
    'male'
    >>> classifier.classify(gender_features('Trinity'))
    'female'
    >>> print(nltk.classify.accuracy(classifier, test_set))
    0.758
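
To make the session above less of a black box, here is a minimal sketch of the naive Bayes idea applied to the single `last_letter` feature. It is not NLTK's implementation: the toy name list, the `train` and `classify` helpers, and the add-one smoothing over an assumed 26-letter alphabet are all hypothetical choices made for illustration, and NLTK's own probability estimator may differ.

```python
import math
from collections import Counter, defaultdict


def gender_features(word):
    # Same single feature as in the notes: the final letter of the name.
    return {'last_letter': word[-1]}


# Hypothetical toy data, NOT taken from the NLTK names corpus.
TOY_NAMES = [
    ('Anna', 'female'), ('Maria', 'female'), ('Julia', 'female'),
    ('Sophie', 'female'), ('Mark', 'male'), ('Erik', 'male'),
    ('Jack', 'male'), ('Peter', 'male'),
]


def train(labeled_names):
    # Count how often each label occurs and, per label,
    # how often each last letter occurs.
    label_counts = Counter()
    letter_counts = defaultdict(Counter)
    for name, label in labeled_names:
        label_counts[label] += 1
        letter_counts[label][gender_features(name)['last_letter']] += 1
    return label_counts, letter_counts


def classify(model, word, alphabet_size=26):
    # Score each label with log P(label) + log P(last_letter | label),
    # using add-one smoothing (an assumption of this sketch).
    label_counts, letter_counts = model
    total = sum(label_counts.values())
    letter = gender_features(word)['last_letter']
    scores = {}
    for label, count in label_counts.items():
        prior = math.log(count / total)
        likelihood = math.log(
            (letter_counts[label][letter] + 1) / (count + alphabet_size)
        )
        scores[label] = prior + likelihood
    return max(scores, key=scores.get)


model = train(TOY_NAMES)
print(classify(model, 'Olivia'))  # female
print(classify(model, 'Frank'))   # male
```

With only one feature, the classifier reduces to comparing, for each label, how common that label is and how often names with that label end in the given letter.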

  • Original post: https://www.cnblogs.com/wintor12/p/3622822.html