Because of the very specific domain we are covering, intents can lie extremely close together. Even when a very ‘straightforward’ question is asked, from which the intent should be easy to detect (and the entities are detected correctly), the NLU confidence threshold is not reached, so the chatbot asks the user to rephrase the question.
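To make the symptom concrete, here is a minimal sketch of the behaviour as I understand it. It is not the framework's actual code: the intent names, the confidence values, the fallback name and the 0.7 threshold are all made up for illustration.

```python
from typing import Dict

# Hypothetical fallback intent that triggers the "please rephrase" response.
FALLBACK_INTENT = "ask_rephrase"


def pick_intent(confidences: Dict[str, float], threshold: float = 0.7) -> str:
    """Return the top-ranked intent, or the fallback if it is below the threshold."""
    top_intent, top_conf = max(confidences.items(), key=lambda kv: kv[1])
    return top_intent if top_conf >= threshold else FALLBACK_INTENT


# Two closely related intents split the confidence, so neither reaches 0.7.
close_intents = {"ask_policy_price": 0.48, "ask_policy_coverage": 0.45, "greet": 0.07}

# A well-separated intent clears the threshold without trouble.
clear_intent = {"ask_policy_price": 0.91, "ask_policy_coverage": 0.06, "greet": 0.03}

print(pick_intent(close_intents))
print(pick_intent(clear_intent))
```

In the first case the correct intent is ranked first, but the message is still rejected because its neighbour takes almost half of the confidence.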
To create stories we use ‘chatette’, a module that generates stories with a similar structure by swapping some words for their synonyms. We then sample from the stories this module can generate, so that there are enough sample stories for chatbot training without a large increase in processing time.
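The sampling step looks roughly like the sketch below. It assumes the generated output has already been loaded into a dict that maps each intent to a list of generated sentences; the function name `sample_per_intent`, the cap of 2 and the example data are hypothetical, and no chatette API is called here.

```python
import random
from typing import Dict, List


def sample_per_intent(
    generated: Dict[str, List[str]], max_per_intent: int, seed: int = 0
) -> Dict[str, List[str]]:
    """Keep at most max_per_intent unique generated examples for each intent."""
    rng = random.Random(seed)  # fixed seed so training data is reproducible
    sampled = {}
    for intent, examples in generated.items():
        unique = sorted(set(examples))  # drop duplicates, keep a stable order
        k = min(max_per_intent, len(unique))
        sampled[intent] = rng.sample(unique, k)
    return sampled


# Made-up generated data: synonym variations of the same sentence structure.
generated = {
    "ask_policy_price": [
        "what does the policy cost",
        "what is the price of the policy",
        "how much is the policy",
        "what does the policy cost",  # duplicate produced by the generator
    ],
    "ask_policy_coverage": ["what does the policy cover"],
}

training_sample = sample_per_intent(generated, max_per_intent=2)
print({intent: len(examples) for intent, examples in training_sample.items()})
```

Capping per intent, rather than sampling from the whole pool, keeps the classes balanced while still bounding the training time.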
Is there a good way to handle intents that cover very closely related but distinct topics (for example, assigning more weight to the core words that represent an intent when they are detected in a story)?