NLU classifies unseen utterances to a random intent with too high confidence

This is a general downside of classification models: the output probability/confidence values always sum up to one.
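As a tiny illustration (a minimal numpy sketch, not Rasa's actual pipeline), a softmax over raw model scores is forced to sum to one, so there is no slot left over for "I don't recognise this at all":

```python
import numpy as np

def softmax(logits):
    # Turn raw scores into "probabilities" that always sum to one.
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

# Even for a near-meaningless set of scores, the output sums to 1.0,
# so at least one class always receives a sizeable share of "confidence".
print(softmax(np.array([0.1, 0.2, 0.15])))  # ~[0.32, 0.35, 0.33]
```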

There’s a longer story about what I think is going wrong here (blogpost, pydata talk), but I’ll try to give the short version here.

Let’s say there are three classes that I’d like to classify. Let’s take this artificial example:

I could train an algorithm on this dataset and it might produce a prediction like this:

[image: the model's predicted regions for the three classes, colored by confidence]

The strong colors indicate regions where the model has high confidence (>0.8) and the weaker colors indicate less certain regions. It’s less certain in those regions because the two classes overlap there.
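You can reproduce a picture like this yourself. Here’s a minimal sketch with scikit-learn (my own toy setup, not the Rasa pipeline itself) that trains on three artificial blobs and inspects the confidence values:

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression

# Three well-separated artificial clusters, one per "class".
X, y = make_blobs(n_samples=300, centers=3, cluster_std=1.0, random_state=42)
clf = LogisticRegression().fit(X, y)

# Near the centre of a blob the model is typically very confident ...
centre_0 = X[y == 0].mean(axis=0)
print(clf.predict_proba([centre_0]).round(2))

# ... while a point halfway between two blobs sits near the decision
# boundary and gets a much flatter distribution.
midpoint = (X[y == 0].mean(axis=0) + X[y == 1].mean(axis=0)) / 2
print(clf.predict_proba([midpoint]).round(2))
```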

Now here’s the issue: let’s zoom out a bit.

[image: the same predictions, zoomed out to regions far away from the training data]

Notice how the algorithm still assigns a strong red color in regions where it has seen no data whatsoever.

This is a general phenomenon in classification algorithms. The algorithm will look for examples that are the most similar, even if the distance from the training examples is huge. There’s no outlier detection happening when it computes a confidence score. The number is a proxy for confidence, but it is not 100% the same thing. My colleague Rachael also made a YouTube video about how you might interpret this confidence value.
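Continuing the same toy sketch (again scikit-learn, not Rasa): a point that is nowhere near the training data still gets a near-certain prediction, because nothing in the probability calculation checks whether the input resembles anything seen during training.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression

X, y = make_blobs(n_samples=300, centers=3, cluster_std=1.0, random_state=42)
clf = LogisticRegression().fit(X, y)

# A query point far away from everything the model has ever seen.
far_away = np.array([[1000.0, 1000.0]])

# One class still ends up with a probability close to 1.0, even though
# the model has no business being confident about this point at all.
print(clf.predict_proba(far_away).round(3))
```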

I can’t be 100% sure that there isn’t also something odd happening in your training data. It could be that some of your classes cover a lot of ground linguistically while others are very narrow in meaning; that could also cause overlap. But the main thing to take away is that this confidence score is more like “artificial confidence” than actual confidence, much like how “artificial intelligence” is much more “artificial” than actual intelligence.
