Python NLP Intent Identification

22,299

Solution 1

You can do intent identification with DeepPavlov, which supports multi-label classification. More information is available at http://docs.deeppavlov.ai/en/master/components/classifiers.html and there is a demo page at https://demo.ipavlov.ai

Solution 2

You can use spaCy to train a custom parser for chat intent semantics.

spaCy's parser component can be trained to predict any type of tree structure over your input text. You can also predict trees over whole documents or chat logs, with connections between the sentence roots used to annotate discourse structure.

For example, the input "show me the best hotel in berlin" is parsed into (token, label, head) triples:

('show', 'ROOT', 'show')
('best', 'QUALITY', 'hotel') --> hotel with QUALITY best
('hotel', 'PLACE', 'show') --> show PLACE hotel
('berlin', 'LOCATION', 'hotel') --> hotel with LOCATION berlin

To train the model you need data in this format:

# training data: texts, heads and dependency labels
# for no relation, we simply chose an arbitrary dependency label, e.g. '-'
TRAIN_DATA = [
    ("find a cafe with great wifi", {
        'heads': [0, 2, 0, 5, 5, 2],  # index of token head
        'deps': ['ROOT', '-', 'PLACE', '-', 'QUALITY', 'ATTRIBUTE']
    }),
    ("find a hotel near the beach", {
        'heads': [0, 2, 0, 5, 5, 2],
        'deps': ['ROOT', '-', 'PLACE', 'QUALITY', '-', 'ATTRIBUTE']
    })]
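
To make the annotation format easier to check, here is a minimal sketch that expands one training example into (token, label, head) triples. It assumes plain whitespace tokenization, which happens to match these short examples but is not necessarily how spaCy tokenizes; `annotation_to_triples` is a hypothetical helper, not part of spaCy.

```python
def annotation_to_triples(text, annotation, skip_label="-"):
    """Expand one (text, {'heads', 'deps'}) example into (token, label, head) triples."""
    tokens = text.split()  # assumption: whitespace tokens line up with the annotation
    heads, deps = annotation["heads"], annotation["deps"]
    if not (len(tokens) == len(heads) == len(deps)):
        raise ValueError("heads and deps need exactly one entry per token")
    triples = []
    for token, head, dep in zip(tokens, heads, deps):
        if dep == skip_label:  # '-' marks "no relation", so leave it out
            continue
        triples.append((token, dep, tokens[head]))
    return triples


example = ("find a cafe with great wifi", {
    "heads": [0, 2, 0, 5, 5, 2],
    "deps": ["ROOT", "-", "PLACE", "-", "QUALITY", "ATTRIBUTE"],
})
print(annotation_to_triples(*example))
```

Running a check like this over every example before training is a cheap way to catch off-by-one errors in the head indices.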

TEST_DATA:
input : show me the best hotel in berlin
output: [
      ('show', 'ROOT', 'show'),
      ('best', 'QUALITY', 'hotel'),
      ('hotel', 'PLACE', 'show'),
      ('berlin', 'LOCATION', 'hotel')
    ]
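
Once you have triples like the output above, one possible way to use them is to collapse them into an action plus slots. This is only a sketch: the `triples_to_frame` helper and the frame layout are my own assumptions, not something spaCy produces for you.

```python
def triples_to_frame(triples):
    """Collapse (token, label, head) triples into an action and labelled slots."""
    frame = {"action": None, "slots": {}}
    for token, label, head in triples:
        if label == "ROOT":
            frame["action"] = token  # the root token is treated as the requested action
        else:
            # a label may occur more than once, so collect tokens in a list
            frame["slots"].setdefault(label, []).append(token)
    return frame


parsed = [
    ("show", "ROOT", "show"),
    ("best", "QUALITY", "hotel"),
    ("hotel", "PLACE", "show"),
    ("berlin", "LOCATION", "hotel"),
]
print(triples_to_frame(parsed))
```

The head of each token is dropped here for brevity; keep it if you need to know which PLACE a QUALITY or LOCATION belongs to.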

For more details, see https://spacy.io/usage/examples#intent-parser

Solution 3

For general background and a list of excellent examples of question-answering systems, see the SQuAD leaderboard: https://rajpurkar.github.io/SQuAD-explorer/ This problem can get really complicated depending on the complexity and range of your domain. For example, more advanced approaches apply first-order and propositional logic and complex neural nets. One of the more impressive solutions I've seen is bidirectional attention flow: https://github.com/allenai/bi-att-flow (demo: http://beta.moxel.ai/models/strin/bi-att-flow/latest).

In practice, I have found that if your corpus has many domain-specific terms, you will need to build your own dictionary. In your example, "NLP" and "Natural Language Processing" are the same entity, so you need to include that mapping in a dictionary.

Basically, consider yourself really lucky if you can get away with a purely statistical approach like cosine distance. You'll likely need to combine it with a lexicon-based approach as well. All the NLP projects I have done have had domain-specific terminology and "slang", so I have combined statistical and lexicon-based methods, especially for extracting features like topics, intents, and entities.
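
As a rough sketch of what combining the two can look like for the question's FAQ case: normalize the text with a small hand-built lexicon first, then pick the stored question with the highest bag-of-words cosine similarity. The `LEXICON` entries, the `FAQ` store, and the 0.5 threshold are all made-up assumptions for illustration; a real system would need a much larger dictionary and probably better features than raw word counts.

```python
import math
import re
from collections import Counter

# Hypothetical lexicon: maps domain phrases and paraphrases to a canonical form.
LEXICON = {
    "natural language processing": "nlp",
    "meaning of": "what is",
    "definition of": "what is",
}

# Hypothetical FAQ store: known question -> answer.
FAQ = {
    "what is nlp": "NLP stands for Natural Language Processing",
}


def normalize(text):
    """Lowercase, strip punctuation, and apply the lexicon replacements."""
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    text = " ".join(text.split())
    for phrase, canonical in LEXICON.items():
        text = text.replace(phrase, canonical)  # naive substring replace, fine for a sketch
    return text


def cosine(a, b):
    """Cosine similarity between the word-count vectors of two strings."""
    va, vb = Counter(a.split()), Counter(b.split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0


def answer(question, threshold=0.5):
    """Return the answer of the closest known question, or None if nothing is close enough."""
    query = normalize(question)
    best = max(FAQ, key=lambda known: cosine(query, normalize(known)))
    if cosine(query, normalize(best)) >= threshold:
        return FAQ[best]
    return None


for q in ["What is NLP", "meaning of NLP", "Definition of NLP",
          "What is Natural Language Processing"]:
    print(q, "->", answer(q))
```

The lexicon handles the paraphrases and the abbreviation, and the similarity threshold keeps unrelated questions from matching the only FAQ entry.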

Author by

Yogesh

Updated on July 09, 2022

Comments

  • Yogesh, almost 2 years ago

    I am a novice in Python and NLP, and my problem is how to find the intent of a given question. For example, I have sets of questions and answers like this:

    question: What is NLP; answer: NLP stands for Natural Language Processing

    I ran a basic POS tagger on the question above and got the entity [NLP]. I also did string matching using this algo.

    Basically, I faced the following issues:

    1. If the user asks "what is NLP", it returns the exact answer
    2. If the user asks "meaning of NLP", it fails
    3. If the user asks "Definition of NLP", it fails
    4. If the user asks "What is Natural Language Processing", it fails

    So how should I identify the user intent of a given question? In my case, string matching or pattern matching does not work.