Reference: Stanford PoS Tagger: tagging from Python
Reference: NLTK Part of Speech Tagging Tutorial

    # POS tagging from NLTK. Note: nltk.pos_tag uses NLTK's default tagger,
    # not the Stanford POS Tagger.
    # Requires the tokenizer and tagger data, e.g. nltk.download("punkt") and
    # nltk.download("averaged_perceptron_tagger"); resource names vary by NLTK version.
    import nltk

    text_tok = nltk.word_tokenize("Just a small snippet of text.")
    # print(text_tok)

    pos_tagged = nltk.pos_tag(text_tok)

    # print the list of tuples: (word, word_class)
    print(pos_tagged)

    # loop over the tuples in the pos_tagged list and
    # print the word and its POS tag with an underscore as the delimiter
    for word, word_class in pos_tagged:
        print(word + "_" + word_class)

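The loop above writes each token as `word_tag`. Here is a minimal sketch of building that format as a single string and reading it back. It uses a hand-written sample list rather than real tagger output, and the helper names `join_tagged` and `split_tagged` are hypothetical, not part of NLTK.

```python
def join_tagged(pairs, sep="_"):
    """Turn [(word, tag), ...] into 'word_tag word_tag ...'."""
    return " ".join(word + sep + tag for word, tag in pairs)


def split_tagged(text, sep="_"):
    """Inverse of join_tagged.

    Splits on the LAST separator so that words which themselves
    contain the separator (e.g. 'snake_case') stay intact.
    """
    return [tuple(token.rsplit(sep, 1)) for token in text.split()]


# Hand-written sample in the shape nltk.pos_tag returns; not real tagger output.
sample = [("a", "DT"), ("small", "JJ"), ("snippet", "NN")]

line = join_tagged(sample)
print(line)                # a_DT small_JJ snippet_NN
print(split_tagged(line))  # [('a', 'DT'), ('small', 'JJ'), ('snippet', 'NN')]
```

Splitting on the last underscore is a design choice: tags in the list below never contain an underscore, while words occasionally do.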
Explanation of the POS tag list:
- CC coordinating conjunction
- CD cardinal number
- DT determiner
- EX existential there (like: "there is" ... think of it like "there exists")
- FW foreign word
- IN preposition/subordinating conjunction
- JJ adjective 'big'
- JJR adjective, comparative 'bigger'
- JJS adjective, superlative 'biggest'
- LS list marker 1)
- MD modal could, will
- NN noun, singular 'desk'
- NNS noun plural 'desks'
- NNP proper noun, singular 'Harrison'
- NNPS proper noun, plural 'Americans'
- PDT predeterminer 'all the kids'
- POS possessive ending parent's
- PRP personal pronoun I, he, she
- PRP$ possessive pronoun my, his, her
- RB adverb very, silently
- RBR adverb, comparative better
- RBS adverb, superlative best
- RP particle give up
- TO to, as in go 'to' the store
- UH interjection errrrrrrrm
- VB verb, base form take
- VBD verb, past tense took
- VBG verb, gerund/present participle taking
- VBN verb, past participle taken
- VBP verb, non-3rd person sing. present take
- VBZ verb, 3rd person sing. present takes
- WDT wh-determiner which
- WP wh-pronoun who, what
- WP$ possessive wh-pronoun whose
- WRB wh-adverb where, when
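
To use the tag list in code, one option is a small lookup table. The sketch below copies only a handful of the tags above. `describe` and `coarse_class` are hypothetical helpers, and grouping tags by prefix is a convenience assumed here, not something the tagset itself defines.

```python
# Subset of the tag list above; extend with the remaining tags as needed.
TAG_DESCRIPTIONS = {
    "DT": "determiner",
    "JJ": "adjective",
    "NN": "noun, singular",
    "NNS": "noun plural",
    "VB": "verb, base form",
    "VBD": "verb, past tense",
    "RB": "adverb",
}

# Prefix -> coarse word class (an illustrative grouping).
COARSE_PREFIXES = (
    ("NN", "noun"),
    ("VB", "verb"),
    ("JJ", "adjective"),
    ("RB", "adverb"),
)


def describe(tag):
    """Return the description for a tag, or a placeholder if it is not in the table."""
    return TAG_DESCRIPTIONS.get(tag, "unknown tag")


def coarse_class(tag):
    """Collapse fine-grained tags by prefix, e.g. NNS/NNP/NNPS -> 'noun'."""
    for prefix, name in COARSE_PREFIXES:
        if tag.startswith(prefix):
            return name
    return "other"


# Hand-written (word, tag) pairs in the shape nltk.pos_tag returns.
for word, tag in [("small", "JJ"), ("desks", "NNS"), ("took", "VBD")]:
    print(word, tag, describe(tag), coarse_class(tag), sep=" | ")
```

If the tagset help data is installed, NLTK's `nltk.help.upenn_tagset()` prints the full descriptions, so a hand-kept table like this is mainly useful when you need the descriptions as data rather than printed text.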