Reusing common phrasings for intent detection across different projects?

pgdev · March 8, 2020, 12:27pm

What is the most robust and easy way of reusing common language knowledge when providing data for intent detection? I think most bot authors need to handle the many similar ways of phrasing certain questions, and from what I see the current best practice is just copy-pasting or generating many variations for each project.

For example, ‘expressing interest in’ could be a generic intent, to be tuned to each bot author’s domain, but right now each project needs to list ‘could you tell me more about’, ‘I would like to hear about’, ‘I am interested in…’ and countless other ways of saying essentially the same thing. The same goes for greetings and chitchat, which are currently provided as copy-pasted data.

From what I understand, even if modern language models can tell that such phrases are very similar, there’s no easy or practical way of leveraging that in intent detection at the moment.

amn41 · March 9, 2020, 7:04am

Yep, you can do this with Training Data Importers.

pgdev · March 9, 2020, 8:11am

Thank you for the quick answer! My question was quite long-winded and probably did not get my main concern across: is it possible to have such public canned resources ready to use? Could something like a ChitChat bot or Greeting bot be an abstraction that evolves over time without the bot author doing anything, either by being updated regularly with new phrasings (and why not, new languages) or by taking advantage of language models that work with phrase similarities?

And my second question was about generic expressions of, say, interest, which can be phrased in many different ways and are common regardless of the object of interest. So you could ‘import’ an intent like ‘express interest’ and use it for your domain. The intent data would be updated like the greetings and chitchat domains above, including a growing set of phrasings like ‘I would like to know more about X’, ‘Could you tell me about X’, etc., and for a new bot one would just say what X is supposed to be. Adding an express_interest intent for cats would bring in all the language knowledge that the generic express_interest intent is built from, but fine-tuned for cats. This is similar to how slots work now for intents, but I think it would be one level of abstraction above. I realize that if this worked it would not be very Rasa-specific.
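
To make the ‘import an intent and say what X is’ idea concrete, here is a rough sketch in plain Python. Nothing in it is an existing Rasa feature: the template list, `expand_intent`, and `to_nlu_markdown` are all hypothetical names made up for illustration, and the output is only assumed to follow the Markdown layout used for NLU training data (`## intent:name` followed by `- example` lines). The shared templates would live in one place and grow over time; each project would only supply its own topic phrases.

```python
# Hypothetical shared phrasing library. In the idea described above this list
# would be maintained centrally and updated independently of any one bot.
EXPRESS_INTEREST_TEMPLATES = [
    "I would like to know more about {topic}",
    "Could you tell me about {topic}",
    "could you tell me more about {topic}",
    "I would like to hear about {topic}",
    "I am interested in {topic}",
]


def expand_intent(base_intent, topic_key, topic_phrases, templates):
    """Combine generic templates with domain phrases.

    Returns the domain-specific intent name and its list of examples,
    one example per (template, phrase) pair.
    """
    intent_name = f"{base_intent}_{topic_key}"
    examples = [
        template.format(topic=phrase)
        for template in templates
        for phrase in topic_phrases
    ]
    return intent_name, examples


def to_nlu_markdown(intent_name, examples):
    """Render examples in a Markdown-style NLU layout (assumed format)."""
    lines = [f"## intent:{intent_name}"]
    lines.extend(f"- {example}" for example in examples)
    return "\n".join(lines)


if __name__ == "__main__":
    # The bot author only says what X is: here, cats.
    name, examples = expand_intent(
        "express_interest", "cats", ["cats", "kittens"],
        EXPRESS_INTEREST_TEMPLATES,
    )
    print(to_nlu_markdown(name, examples))
```

This is still just generated copy-paste, of course. It covers the ‘updated regularly with new phrasings’ half of the question, not the half about letting a language model recognise similar phrasings on its own.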
