NLU classifies unseen utterances to a random intent with too high confidence

This is a general downside of classification models: the output probability/confidence values always sum up to one.
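As a tiny illustration (a minimal numpy sketch, not Rasa's actual pipeline), a softmax over raw model scores is forced to sum to one, so there is no slot left over for "I don't recognise this at all":

```python
import numpy as np

def softmax(logits):
    # Turn raw scores into "probabilities" that always sum to one.
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

# Even for a near-meaningless set of scores, the output sums to 1.0,
# so at least one class always receives a sizeable share of "confidence".
print(softmax(np.array([0.1, 0.2, 0.15])))  # ~[0.32, 0.35, 0.33]
```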

There’s a longer story about what I think is going wrong here (blogpost, pydata talk), but I’ll try to give the short version here.

Let’s say there are three classes that I’d like to classify. Let’s take this artificial example:

I could train an algorithm on this dataset and it might produce a prediction like this:

[image: the model's predicted regions for the three classes, colored by confidence]

The strong colors indicate regions where the model has high confidence (>0.8) and the weaker colors indicate less certain regions. It’s less certain in those regions because the two classes overlap there.
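You can reproduce a picture like this yourself. Here’s a minimal sketch with scikit-learn (my own toy setup, not the Rasa pipeline itself) that trains on three artificial blobs and inspects the confidence values:

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression

# Three well-separated artificial clusters, one per "class".
X, y = make_blobs(n_samples=300, centers=3, cluster_std=1.0, random_state=42)
clf = LogisticRegression().fit(X, y)

# Near the centre of a blob the model is typically very confident ...
centre_0 = X[y == 0].mean(axis=0)
print(clf.predict_proba([centre_0]).round(2))

# ... while a point halfway between two blobs sits near the decision
# boundary and gets a much flatter distribution.
midpoint = (X[y == 0].mean(axis=0) + X[y == 1].mean(axis=0)) / 2
print(clf.predict_proba([midpoint]).round(2))
```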

Now here’s the issue: let’s zoom out a bit.

[image: the same predictions, zoomed out to regions far away from the training data]

Notice how the algorithm still assigns a strong red color in regions where it has seen no data whatsoever.

This is a general phenomenon in classification algorithms. The algorithm will look for examples that are the most similar, even if the distance from the training examples is huge. There’s no outlier detection happening when it computes a confidence score. The number is a proxy for confidence, but it is not 100% the same thing. My colleague Rachael also made a YouTube video about how you might interpret this confidence value.
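Continuing the same toy sketch (again scikit-learn, not Rasa): a point that is nowhere near the training data still gets a near-certain prediction, because nothing in the probability calculation checks whether the input resembles anything seen during training.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression

X, y = make_blobs(n_samples=300, centers=3, cluster_std=1.0, random_state=42)
clf = LogisticRegression().fit(X, y)

# A query point far away from everything the model has ever seen.
far_away = np.array([[1000.0, 1000.0]])

# One class still ends up with a probability close to 1.0, even though
# the model has no business being confident about this point at all.
print(clf.predict_proba(far_away).round(3))
```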

I can’t be 100% sure that there isn’t also something odd happening in your training data. It could be that some of your classes cover a lot of ground linguistically while others are very narrow in meaning; that could also cause overlap. But the main thing to take away is that this confidence score is more like “artificial confidence” than actual confidence, much like how “artificial intelligence” is much more “artificial” than actual intelligence.
