Because we cover a very specific domain, intents can lie extremely close together. Even when a very 'straightforward' question is asked, from which the intent should be easy to detect (and the entities are detected correctly), the NLU confidence threshold is not reached, and the chatbot asks the user to rephrase the question.
To create training stories we use Chatette, a module that generates stories with a similar structure, swapping some words for their synonyms. We then sample from the stories this module can generate, so that enough samples are available for chatbot training without inflating processing time too much.
Is there a way to cope optimally with intents covering closely related but distinct topics (for example, by assigning more weight to the core words that represent an intent when they are detected in a story)?
This is a good question! You can use [regex features](Training Data Format) to achieve what you want to some extent.
Probably the better approach is to do a hyperparameter search over the parameters of your pipeline. Say, for example, you're using the tensorflow embedding pipeline. Split it up into its components:
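For reference, the tensorflow embedding pipeline breaks down into roughly these components in `config.yml` (component names as used in older Rasa NLU releases; check against your version):

```yaml
pipeline:
- name: "tokenizer_whitespace"
- name: "intent_featurizer_count_vectors"
- name: "intent_classifier_tensorflow_embedding"
```

Each component then exposes its own hyperparameters (e.g. the featurizer's n-gram range, the classifier's number of epochs and embedding dimension) that can be tuned independently.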
And use a library like hyperopt to optimize those parameters with respect to your loss function. It might make sense to penalize confusion between closely related intents more heavily than other errors to achieve what you want.
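A minimal sketch of that weighted objective. To keep the example self-contained it uses a plain grid search over made-up evaluation results; in practice you would hand the `loss` function to hyperopt's `fmin` with an `hp.choice` search space instead. All names and counts here are illustrative, not from a real run:

```python
from itertools import product

# Illustrative held-out evaluation results per hyperparameter setting:
# (epochs, embed_dim) -> (close-intent confusions, other errors).
RESULTS = {
    (100, 10): (8, 4),
    (100, 20): (5, 6),
    (300, 10): (3, 9),
    (300, 20): (2, 7),
}

def loss(epochs, embed_dim, close_penalty=3.0):
    """Weighted loss: a confusion between closely related intents
    counts `close_penalty` times as much as any other error."""
    close, other = RESULTS[(epochs, embed_dim)]
    return close_penalty * close + other

# Plain grid search over the candidate settings; hyperopt's fmin
# would explore the same space more cleverly.
best = min(product([100, 300], [10, 20]), key=lambda p: loss(*p))
print(best)  # → (300, 20): the setting with the fewest weighted errors
```

The key design choice is the `close_penalty` factor: raising it steers the search toward settings that separate the near-identical intents, even at the cost of a few extra errors elsewhere.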
Another possibility is to use [multi intents](Choosing a Rasa NLU Pipeline) to construct multi-intents like main_topic+subtopic_1, main_topic+subtopic_2, etc.
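As a sketch, in the (older) Markdown training-data format such multi-intent examples could look like the following, with the intent names above used as placeholders; the classifier then needs intent tokenization enabled (`intent_tokenization_flag: true` with `intent_split_symbol: "+"`):

```md
## intent:main_topic+subtopic_1
- tell me about subtopic one of the main topic

## intent:main_topic+subtopic_2
- how does subtopic two relate to the main topic?
```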
Many thanks for the answer! These are definitely some interesting suggestions to try out. I'll keep you posted about the most effective technique for solving the issue I am facing!