DIETClassifier overfitting to individual words

I have an intent need_registration with 40 examples (e.g., “do I need to be registered in service_x to pay with it?”, “do I need to have an account in service_x to use it?”), where 12 of those 40 examples contain the word “need”. The other intents have many more examples (~600).

Using the DIETClassifier with word and character n-gram sparse features, and training for 40 epochs, the single word “need” is classified as intent need_registration with high confidence (0.94). If I train for only 20 epochs, the confidence is still relatively high (0.62), but below 0.7.

I am trying to understand whether this is to be expected, either in special cases like this one or in general, or whether it points to a problem. Either way, I would appreciate strategies to prevent it.

My intuition right now tells me this could be related to the use of balanced batches when training the DIETClassifier. Since the problematic intent has many fewer examples, the same examples may be sampled too often. Could this be the reason? How exactly are batches balanced?
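To make my intuition concrete, here is roughly what I imagine a balanced batching strategy doing (a sketch of the general idea only, not Rasa's actual implementation — the function and data names are mine):

```python
import random

def balanced_batches(examples_by_intent, batch_size, n_batches):
    """Illustrative sketch of balanced batching (NOT DIETClassifier's
    actual algorithm): each batch draws the same number of examples per
    intent, sampling with replacement, so examples from small intents
    recur far more often per epoch than those from large intents."""
    per_intent = max(1, batch_size // len(examples_by_intent))
    batches = []
    for _ in range(n_batches):
        batch = []
        for intent, examples in examples_by_intent.items():
            batch.extend(random.choices(examples, k=per_intent))
        batches.append(batch)
    return batches

# Hypothetical data mirroring my situation: 40 vs 600 examples.
data = {
    "need_registration": [f"nr_{i}" for i in range(40)],
    "inform": [f"inf_{i}" for i in range(600)],
}
batches = balanced_batches(data, batch_size=64, n_batches=10)
```

Under this scheme both intents contribute equally to every batch, so each need_registration example would be seen roughly 15 times as often as each inform example — which is why I suspect the minority intent could be over-fitted.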

I think my intuition is wrong, though. I have checked other intents and single words, and I find cases where a word appears almost exactly the same number of times in examples of two different intents, yet the DIETClassifier classifies the word as one of those intents with high confidence and gives very low confidence to the other.

A single word being classified as the wrong intent is not that problematic in itself, since I could check the number of words in the input text and distrust any prediction other than the inform intent for one-word messages. But I need to know whether the case I describe is to be expected, or whether it is a symptom of overfitting or of another issue with the classifier. Thanks!
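The workaround I have in mind would be a small post-processing guard like this (a sketch; the inform intent and the token threshold are specific to my own domain):

```python
def accept_prediction(text, predicted_intent, fallback_intent="inform",
                      min_tokens=2):
    """Guard against one-word artifacts: for messages shorter than
    min_tokens, trust only the fallback intent and override anything
    else the classifier predicts."""
    if len(text.split()) < min_tokens and predicted_intent != fallback_intent:
        return fallback_intent
    return predicted_intent

accept_prediction("need", "need_registration")          # -> "inform"
accept_prediction("do I need an account", "need_registration")  # unchanged
```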

The behavior you describe is typical for such classifiers. If a word doesn't appear in many different classes, the classifier has no way of knowing that the word should not be attributed to the one class it does appear in. In order to “teach” the classifier that a word is not an indicator of a particular intent, that word should be present in the examples of other intents as well.
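A quick way to find such words in your training data is to flag tokens that occur several times but only ever inside one intent's examples. A rough sketch (assuming your examples are grouped per intent in a dict; the function name and threshold are illustrative):

```python
from collections import Counter, defaultdict

def words_exclusive_to_one_intent(examples_by_intent, min_count=3):
    """Return {word: intent} for words that appear at least min_count
    times overall but only within a single intent's examples -- prime
    candidates for the classifier to latch onto, like "need" above."""
    intents_per_word = defaultdict(set)
    counts = Counter()
    for intent, examples in examples_by_intent.items():
        for text in examples:
            for word in text.lower().split():
                intents_per_word[word].add(intent)
                counts[word] += 1
    return {w: next(iter(intents_per_word[w]))
            for w in counts
            if counts[w] >= min_count and len(intents_per_word[w]) == 1}
```

Any word this flags is a candidate for adding counter-examples: write (or collect) examples for other intents that use the same word, so its presence alone stops being a strong signal.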