Tagging a single word with the nltk pos tagger tags each letter instead of the word

Solution 1

nltk.tag.pos_tag expects a sequence of tokens and tags each element separately. A bare string is itself iterable, so its characters are treated as the tokens. Therefore, wrap your word in an iterable such as a list:

>>> nltk.tag.pos_tag(['going'])
[('going', 'VBG')]
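
The same idea can be wrapped in a small helper. This is only a sketch: tag_single_word and fake_tagger are hypothetical names that are not part of NLTK, and the tagger is passed in as a parameter so the snippet runs even without NLTK or its model data installed.

```python
def tag_single_word(word, tagger):
    """Tag one word by wrapping it in a one-element list.

    `tagger` is any callable shaped like nltk.pos_tag: it takes a list
    of tokens and returns a list of (token, tag) pairs.
    """
    if not isinstance(word, str):
        raise TypeError("expected a single word as a string")
    # [word] holds exactly one token, so the tagger sees the whole word
    # instead of iterating over its characters.
    return tagger([word])[0]


# Stand-in tagger (an assumption for illustration, not an NLTK function).
def fake_tagger(tokens):
    return [(token, "TAG") for token in tokens]


print(tag_single_word("going", fake_tagger))  # ('going', 'TAG')
```

With NLTK and its tagger data available, passing nltk.pos_tag as the tagger should return the first pair of the output shown above.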

Solution 2

>>> word = 'going'
>>> word = nltk.word_tokenize(word)
>>> l1 = nltk.pos_tag(word)
>>> l1
[('going', 'VBG')]

Solution 3

Return the POS tag of one word

>>> nltk.pos_tag(["going"])
[('going', 'VBG')]
Author by

jksnw

Updated on June 16, 2022

Comments

  • jksnw
    jksnw almost 2 years

    I'm trying to tag a single word with the nltk pos tagger:

    import nltk
    word = "going"
    pos = nltk.pos_tag(word)
    print(pos)
    

    But the output is this:

    [('g', 'NN'), ('o', 'VBD'), ('i', 'PRP'), ('n', 'VBP'), ('g', 'JJ')]
    

    It's tagging each letter rather than just the one word.

    What can I do to make it tag the word?

  • jksnw
    jksnw about 9 years
    I know it's meant to work on a list, but can it work on a single word?
  • Alaa M.
    Alaa M. over 6 years
    Note that this tags the sentence as a whole (I know the OP asked about one word, but this might be confusing).
  • Mazdak
    Mazdak over 6 years
    @AlaaM. What do you mean by tagging a sentence as a whole? POS tagging aims to tag words based on their initial character and their position in the sentence. That's why the tag is composed of multiple characters.
  • Alaa M.
    Alaa M. over 6 years
    I'm just saying that if you have more than one word, use nltk.tag.pos_tag('a sentence'.split()) and not nltk.tag.pos_tag(['a sentence']), because the latter would produce a single tag.
  • Mazdak
    Mazdak over 6 years
    @AlaaM. Definitely, that's why I linked to the documentation. I also updated the answer, since it was from a long while ago and full of confusion ;)).
  • JoeF
    JoeF over 5 years
    This is just a technical (and probably overly pedantic) clarification. The problem is that pos_tag accepts any iterable, not just lists. It iterates over the items in that iterable (characters in the case of strings, items in the case of lists) and attempts to tag those items. I'm sure you are aware of this, but I thought I would just provide more clarification for those who are wondering why the output is the way it is.
  • Jonathan
    Jonathan over 2 years
    This solution has already been provided in this answer.
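
The point made in the comments about iterables can be checked without NLTK at all. The sketch below only lists the items that any function iterating over its argument would receive; items_seen is a hypothetical helper written for illustration, not an NLTK call.

```python
def items_seen(tokens):
    """Return the items a tagger would get when it iterates over `tokens`."""
    return list(tokens)


# A bare string yields its characters, which is why each letter gets a tag.
print(items_seen("going"))               # ['g', 'o', 'i', 'n', 'g']

# A one-element list yields the word itself.
print(items_seen(["going"]))             # ['going']

# A whole sentence inside a list is still a single item ...
print(items_seen(["a sentence"]))        # ['a sentence']

# ... while split() produces one item per word.
print(items_seen("a sentence".split()))  # ['a', 'sentence']
```

The first two calls mirror the question and Solution 1; the last two mirror the difference between passing ['a sentence'] and 'a sentence'.split().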