What is Rasa NLU and what can I do with it?
Hello Juste,
I'm trying to build a bot for entity extraction and intent detection. I'm only interested in NLU: giving a query/question as the input/request and getting the intent and its parameters (entities) as the output/response.
E.g. let's say my intent name is DetectItem. Examples will be:
- I have a Mango
- I have a Chair
- Do you need Coffee
Here, I'll map Mango, Chair and Coffee to an entity called @Item. Now, my @Item entity will be like below:
- Fruit - Mango, Banana, Pineapple, Apple
- Furniture - Chair, Table, Cupboard
- Drink - Coffee, Tea
Please note that there is only one entity here, called @Item, which will have the values Fruit, Furniture and Drink. Mango, Banana etc. would be synonyms for Fruit. Similarly, Chair and Table would be synonyms for Furniture, and so on.
Now, with the above setup, if I ask "Where is my tea", the response should be: Intent - DetectItem, Entity - Item, EntityValue - Tea.
I can do similar things in Dialogflow. If you want, I can share an export of the Dialogflow agent. Could you please tell me whether I can do the same with Rasa NLU?
Also, please note that the example I gave above is pretty general, but in my case the values would be completely different (custom), and they won't be common or well-known values. That means general dictionaries won't help, and I have my own dictionary as I mentioned.
Could you please suggest which model would fit best for the above use case?
So far, I have tried to set up mitie + sklearn, as that might fit well based on what we researched (not sure though). We are using RASA-UI to generate the agent, intents and entities. However, the agent test data that is fed to training doesn't contain the entity synonyms, and hence it doesn't work the way it should. Please help us to understand this better.
Many thanks in advance.
Hey @greg3108,
Welcome to the Rasa Community Forum. I am glad to see you are looking into Rasa NLU for your project.
You can achieve the result you are looking for by using synonyms in Rasa. For this to work you should make sure you do two important things:
- Include the synonyms component ner_synonyms in your pipeline so that your model knows it should recognise and extract them. For example, your pipeline could look like this:
language: "en"
pipeline:
- name: "nlp_spacy"
- name: "tokenizer_spacy"
- name: "intent_entity_featurizer_regex"
- name: "intent_featurizer_spacy"
- name: "ner_crf"
- name: "ner_synonyms"
- name: "intent_classifier_sklearn"
- Make sure you include entities and their synonyms in your training dataset. If you are using Rasa UI for it, one way to do this is to highlight the entity as you usually would, assign it the entity name "Item", and then change its value to the name of the group you would like that entity to be a synonym of. For example, based on the picture below, the input is "Where is my tea". I highlighted "tea" as an entity and assigned it the name "Item". Then, instead of leaving the original value "tea", I changed it to "Tea" so that my model would know that when it sees the entity "tea", it should extract it with the value "Tea" instead.
You can also define synonyms inside the json file of your training data. For example:
{
"text": "Where is my tea",
"intent": "DetectItem",
"entities": [
{
"start": 12,
"end": 15,
"value": "Tea",
"entity": "Item"
}
]
}
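The "start" and "end" fields in an example like the one above are character offsets into "text", with "end" exclusive. Counting them by hand is error-prone, so here is a minimal Python sketch that computes them. The helper name make_example is hypothetical and not part of Rasa; it only builds a dictionary in the same shape as the example above.

```python
def make_example(text, intent, entity, surface, value):
    """Build one training example, computing the entity's character offsets.

    `surface` is the text as it appears in the sentence; `value` is what the
    entity should be reported as (for example the synonym target).
    """
    start = text.index(surface)  # raises ValueError if the surface form is absent
    return {
        "text": text,
        "intent": intent,
        "entities": [
            {
                "start": start,
                "end": start + len(surface),  # exclusive end offset
                "value": value,
                "entity": entity,
            }
        ],
    }


example = make_example("Where is my tea", "DetectItem", "Item", "tea", "Tea")
print(example["entities"][0])
```

Note that text.index only finds the first occurrence, so a sentence that mentions the same word twice would need the offsets passed in explicitly.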
There is one more way to define synonyms: include an "entity_synonyms" array inside your training data json file. For example:
{
"rasa_nlu_data": {
"entity_synonyms": [
{
"value": "Tea",
"synonyms": ["tea", "matte", "herbal drink"]
}
]
}
}
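If the dictionary of values already exists elsewhere (as in the original question), the "entity_synonyms" array can be generated rather than typed by hand. The following is a rough Python sketch under that assumption. The item_groups dictionary is taken from the question; the function name build_entity_synonyms is hypothetical, and listing a lowercase variant of every form is only a precaution, not something Rasa is known to require.

```python
import json

# Canonical value -> surface forms, copied from the question above.
item_groups = {
    "Fruit": ["Mango", "Banana", "Pineapple", "Apple"],
    "Furniture": ["Chair", "Table", "Cupboard"],
    "Drink": ["Coffee", "Tea"],
}


def build_entity_synonyms(groups):
    """Turn {value: [forms]} into a list of {"value", "synonyms"} entries."""
    entries = []
    for value, forms in groups.items():
        synonyms = []
        for form in forms:
            # Keep the form as written and add a lowercase variant (assumption).
            for variant in (form, form.lower()):
                if variant not in synonyms:
                    synonyms.append(variant)
        entries.append({"value": value, "synonyms": synonyms})
    return entries


training_data = {
    "rasa_nlu_data": {"entity_synonyms": build_entity_synonyms(item_groups)}
}
print(json.dumps(training_data, indent=2))
```

Regenerating the file from the dictionary whenever the values change keeps the two in sync. The synonyms only rename entities that were already extracted, so annotated examples are still needed for the extractor itself.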
You can read more about it all here: NLU Training Data
I hope it makes things a bit more clear. Give it a go and let me know how it works!
Many thanks for your quick reply. However, my scenario is a little different. My entity is not just a list of values. It also has synonyms. Below are screenshots from Dialogflow to demonstrate what I'm looking for with RASA.
[Screenshots: Entity, Intent, Test Call, API Call]
Please let me know if you need more information. I totally agree with your first point. However, I'm not sure whether RASA will support the above scenario or whether I'm missing something.
Thanks again in advance.
Hey Greg,
Have you looked at the Chatito tool, which generates more training data of the kind you're looking for? https://github.com/rodrigopivi/Chatito
Yeah, I have checked this. Thanks again. So is that the only solution as of today? The reason I didn't go for that solution is that:
- My entity values may get updated
- Generating examples for every possible combination of entity values leads to n to the power m (where m is the number of entities in an intent and n is the number of values in a given entity) times the number of examples for the given intent. That makes for a very large number of example statements. It doesn't sound like the right way to do it, as it defeats the purpose of using Natural Language Understanding and the power of the algorithms.
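To put a number on the second concern: for one sentence template, the count of fully expanded sentences is the product of the value counts of its entity slots, and that is multiplied by the number of templates. A small Python sketch with purely hypothetical sizes (3 entity slots, 9 values each, 5 templates) shows how quickly it grows; the function name expanded_count is made up for illustration.

```python
from itertools import product
from math import prod


def expanded_count(values_per_slot, n_templates):
    """Sentences needed if every template is filled with every value combination."""
    return n_templates * prod(values_per_slot)


# Hypothetical sizes: 3 entity slots per sentence, 9 values each, 5 templates.
sizes = [9, 9, 9]
n_templates = 5
full = expanded_count(sizes, n_templates)

# Cross-check the formula by enumerating the combinations explicitly.
enumerated = n_templates * sum(1 for _ in product(*(range(n) for n in sizes)))

print(full, enumerated)
```

Adding a fourth slot with 9 values multiplies the total by 9 again, which is why exhaustive generation is not a practical way to cover a large custom dictionary.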
It would be great to know if RASA supports something like what I attached in screenshots above.
Thanks again.
Moreover, this screen from RASA UI confuses me even more. It looks similar to what is there in Dialogflow. However, when the training data is generated, there is nothing in it about these entities and synonyms.
I understand what you mean. You don't need to provide an example for every possible sentence structure for every entity, just a few so the model is aware of it. Could you also try this PR for your use case? https://github.com/RasaHQ/rasa_nlu/pull/822 That should directly extract the entity values you list there. Or you could try adding regexes to your training data file for your entities.
As for the RASA UI, this isn't something we built ourselves but a project by one of our contributors, so I'm not too sure about the specifics of it. But I would assume it deals with synonyms in the Rasa sense, i.e. mapping an entity to a particular slot value.
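For the regex suggestion above, one possible approach is to build a single alternation pattern from the value list and place it in the training data as a "regex_features" entry with "name" and "pattern" keys. The Python sketch below assumes that format and uses the item values from the original question. The helper values_to_pattern is hypothetical, and whether Rasa matches the pattern case-insensitively is not verified here, so the check at the end uses Python's re module directly with IGNORECASE.

```python
import re

# Values from the question's @Item dictionary.
item_values = [
    "Mango", "Banana", "Pineapple", "Apple",
    "Chair", "Table", "Cupboard",
    "Coffee", "Tea",
]


def values_to_pattern(values):
    """Build one word-bounded alternation pattern from a list of values."""
    # Longest first, so longer phrases win over values that are their prefixes.
    ordered = sorted(values, key=len, reverse=True)
    return r"\b(" + "|".join(re.escape(v) for v in ordered) + r")\b"


regex_feature = {"name": "item", "pattern": values_to_pattern(item_values)}

# Sanity check in plain Python (case-insensitive here by choice).
match = re.search(regex_feature["pattern"], "Where is my tea", flags=re.IGNORECASE)
print(regex_feature["pattern"])
print(match.group(0), match.span())
```

A regex used this way acts as an extra hint to the model, so a few annotated examples per sentence shape are still needed.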
best tool ever to generate data
Thanks @akelad for the quick reply. We took a look at the thread you suggested about using the Phrase Matcher, and it looks close to what we are looking for. However, there isn't any clear documentation. Can you please help us understand the difference between entity_phrases and entity_synonyms? The link below has an example json that uses this.
Thanks again.
So the entity_phrases are entities that should be extracted from the text, and synonyms are values that entities can be mapped to. So if you extract, say, entities for cuisine like chinese, japanese, thai, but only actually want to search for a broader cuisine like asian in your custom action, you can specify synonyms for this, as in the entity_synonyms you linked.
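The two roles described above can be illustrated with a small, Rasa-independent Python sketch: first extract the listed phrases from the text, then map each extracted value to its broader synonym target. This is not the PR's implementation or its data format. The flat entity_phrases list, the "cuisine" entity name, and both function names are assumptions made for illustration.

```python
import re

# Step 1 input: phrases that should be extracted (simplified, assumed format).
entity_phrases = ["chinese", "japanese", "thai"]

# Step 2 input: mapping of extracted values to a broader value.
entity_synonyms = [{"value": "asian", "synonyms": ["chinese", "japanese", "thai"]}]


def extract_phrases(text, phrases, entity="cuisine"):
    """Find each listed phrase in the text and report it with its offsets."""
    found = []
    for phrase in phrases:
        pattern = r"\b" + re.escape(phrase) + r"\b"
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            found.append(
                {"start": m.start(), "end": m.end(), "value": m.group(0), "entity": entity}
            )
    return sorted(found, key=lambda e: e["start"])


def apply_synonyms(entities, synonym_entries):
    """Replace each entity value with its synonym target, if one is defined."""
    lookup = {
        s.lower(): entry["value"]
        for entry in synonym_entries
        for s in entry["synonyms"]
    }
    return [dict(e, value=lookup.get(e["value"].lower(), e["value"])) for e in entities]


raw = extract_phrases("I want thai or Japanese food", entity_phrases)
mapped = apply_synonyms(raw, entity_synonyms)
print([e["value"] for e in raw])
print([e["value"] for e in mapped])
```

The offsets in the mapped entities still point at the original words, so a custom action can search on the broader value while the matched text remains traceable.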
Is there any way to solve context-sensitive intent classification?