What does NN VBD IN DT NNS RB means in NLTK?
Solution 1
The tags that you see are not a result of the chunks but the POS tagging that happens before chunking. It's the Penn Treebank tagset, see https://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html
>>> from nltk import word_tokenize, pos_tag, ne_chunk
>>> sent = "This is a Foo Bar sentence."
# POS tag.
>>> nltk.pos_tag(word_tokenize(sent))
[('This', 'DT'), ('is', 'VBZ'), ('a', 'DT'), ('Foo', 'NNP'), ('Bar', 'NNP'), ('sentence', 'NN'), ('.', '.')]
>>> tagged_sent = nltk.pos_tag(word_tokenize(sent))
# Chunk.
>>> ne_chunk(tagged_sent)
Tree('S', [('This', 'DT'), ('is', 'VBZ'), ('a', 'DT'), Tree('ORGANIZATION', [('Foo', 'NNP'), ('Bar', 'NNP')]), ('sentence', 'NN'), ('.', '.')])
To get the chunks look for subtrees within the chunked outputs. From the above output, the Tree('ORGANIZATION', [('Foo', 'NNP'), ('Bar', 'NNP')])
indicates the chunk.
This tutorial site is pretty helpful to explain the chunking process in NLTK: http://www.eecis.udel.edu/~trnka/CISC889-11S/lectures/dongqing-chunking.pdf.
For official documentation, see http://www.nltk.org/howto/chunk.html
Solution 2
Even though the above links have all kinds. But hope this is still helpful for someone, added a few that are missed on other links.
CC: Coordinating conjunction
CD: Cardinal number
DT: Determiner
EX: Existential there
FW: Foreign word
IN: Preposition or subordinating conjunction
JJ: Adjective
VP: Verb Phrase
JJR: Adjective, comparative
JJS: Adjective, superlative
LS: List item marker
MD: Modal
NN: Noun, singular or mass
NNS: Noun, plural
PP: Preposition Phrase
NNP: Proper noun, singular Phrase
NNPS: Proper noun, plural
PDT: Pre determiner
POS: Possessive ending
PRP: Personal pronoun Phrase
PRP: Possessive pronoun Phrase
RB: Adverb
RBR: Adverb, comparative
RBS: Adverb, superlative
RP: Particle
S: Simple declarative clause
SBAR: Clause introduced by a (possibly empty) subordinating conjunction
SBARQ: Direct question introduced by a wh-word or a wh-phrase.
SINV: Inverted declarative sentence, i.e. one in which the subject follows the tensed verb or modal.
SQ: Inverted yes/no question, or main clause of a wh-question, following the wh-phrase in SBARQ.
SYM: Symbol
VBD: Verb, past tense
VBG: Verb, gerund or present participle
VBN: Verb, past participle
VBP: Verb, non-3rd person singular present
VBZ: Verb, 3rd person singular present
WDT: Wh-determiner
WP: Wh-pronoun
WP: Possessive wh-pronoun
WRB: Wh-adverb
Solution 3
As told by Alvas above, these tags are part-of-speech which tells whether a word/phrase is Noun phrase,Adverb,determiner,verb etc...
Here are the POS Tag details you can refer.
Chunking recovers the phrased from the Part of speech tags
You can refer this link for reading for about chunking.
Knows Not Much
Updated on July 14, 2022Comments
-
Knows Not Much almost 2 years
when I chunk text, I get lots of codes in the output like
NN, VBD, IN, DT, NNS, RB
. Is there a list documented somewhere which tells me the meaning of these? I have tried googlingnltk chunk code
nltk chunk grammar
nltk chunk tokens
.But I am not able to find any documentation which explains what these codes mean.