Honestly, this might seem like a vague question, but it’s because I am exhausted from trying to fix this and don’t know what else to try. I can explain further if I get a response and it’s required.
What causes a situation where adding (one) more intent completely destroys the performance of the entire bot? To the point where intents that were previously classified correctly suddenly start being predicted as nlu_fallback…
What causes this and what are some suggested ways to fix this?
@Dustyposa. I’m not sure I understand your response. Are you sure you were trying to respond to my post? I don’t see the relation between your response and my question.
This is not a conversation issue, but rather a misclassification of intents by the trained model.
Let’s say I have 3 intents and I train; the bot works fine when I interact with it. But after I add one more intent (making it 4) and retrain, the model’s performance becomes rubbish when I interact with it. It even starts missing the 3 intents that were previously correct.
The same happened with me. When you provide more examples to your intents, there is a chance that your bot gets confused, because the new examples can overlap with examples under other intents…
You can create a confusion graph, which will also give you an errors JSON; from that you can check and correct the examples…
You can also try changing the classifier. I changed from DIETClassifier to another classifier and the results were good.
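For reference, this is roughly what swapping the classifier in config.yml could look like. It is only a minimal sketch assuming a standard Rasa 2.x pipeline; the component choice here (SklearnIntentClassifier plus CRFEntityExtractor) is just one possible alternative, not necessarily what was used in that project:

```yaml
# config.yml -- illustrative sketch only, not a drop-in replacement
language: en
pipeline:
  - name: SpacyNLP                 # provides the dense word vectors SklearnIntentClassifier needs
  - name: SpacyTokenizer
  - name: SpacyFeaturizer
  - name: RegexFeaturizer
  - name: CRFEntityExtractor       # takes over entity extraction from DIET
  - name: SklearnIntentClassifier  # alternative to DIETClassifier for intent classification
  - name: FallbackClassifier
    threshold: 0.3
```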
Thank you @ermarkar. Your suggestions make sense.
A couple of questions:
You can create a confusion graph, which will also give you an errors JSON; from that you can check and correct the examples…
Can you please elaborate on this or share a link to a helpful resource? Also, using this graph, how do you “correct the examples”?
You can also try changing the classifier. I changed from DIETClassifier to another classifier and the results were good.
This is also smart, but how did you handle entity extraction, since DIET also extracts entities?
Based on your requirements, you can choose the classifier and configuration from this link.
And to test and generate the confusion graph and error JSON:
Split the data using the split command (it will split it 80/20), then train the model on the new training split, and then test using
rasa test
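For concreteness, a sketch of those steps with the standard Rasa CLI (default output paths shown; exact file names can vary between Rasa versions):

```bash
# split the NLU data into training and test sets (80/20 by default)
rasa data split nlu

# train a model on the training split
rasa train nlu --nlu train_test_split/training_data.yml

# evaluate on the held-out split; this writes a results/ folder containing
# the intent confusion matrix image and an errors JSON with misclassified examples
rasa test nlu --nlu train_test_split/test_data.yml
```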
And in one of my projects, Rasa was not working properly even after spending many days refactoring; in that case, for the new intents, I switched to an ELMo embedding model to predict the extra intents.
@ermarkar Thank you.
I actually knew about the confusion graph. All my values are along the diagonal, which means my model is not confusing one intent with another; what is happening is that it is predicting/confusing intents with nlu_fallback.
In that case, I’m not sure the confusion graph is still useful. However, I will try changing the classifier and see.
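For context, nlu_fallback is normally produced by Rasa’s FallbackClassifier when the top intent’s confidence drops below its threshold, so that threshold (and the classifier’s confidence) is where the behaviour comes from. A purely illustrative config.yml excerpt of that component (values are the documented defaults, not my actual config):

```yaml
# excerpt from config.yml -- threshold values are illustrative, not a recommendation
pipeline:
  - name: DIETClassifier
    epochs: 100
  - name: FallbackClassifier
    threshold: 0.3           # top intents below this confidence are replaced with nlu_fallback
    ambiguity_threshold: 0.1 # also falls back when the top two intents score this close together
```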
Hi @laboratory. Firstly, can you please post the config of your NLU pipeline?
Secondly, have you created a train/test split of your complete data? If yes, you can check the approximate confidence values of the predictions by running rasa test nlu -u <path to test data>. The confidence values are visible in intent_confidence_histogram.png. You are particularly looking for bars that are green but predicted with low confidence.
As others have pointed out, when you add a new intent there is a possibility that you add examples under it which are semantically very similar to the examples under some other intent. That will cause confusion in the model’s predictions, and confidence values will start to drop. If you can post some examples of the new and old intents, we could try to spot overlapping intents.
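As an illustration of the kind of overlap to look for (the intent names and examples below are made up, not taken from your data):

```yaml
# nlu.yml -- hypothetical overlapping intents
nlu:
  - intent: check_balance
    examples: |
      - what is my account balance
      - how much money do I have
  # the examples below are phrased very close to check_balance,
  # which can pull the classifier's confidence down for both intents
  - intent: check_spending
    examples: |
      - how much money have I spent
      - how much do I have left this month
```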