Classifier matching on single word with extremely high confidence

MikesUsingRasa · July 30, 2022, 2:16pm

A test user purposely entered the nonsense phrase “soap on a rope.” I would have preferred that it be classified as out of scope; instead, it was incorrectly classified as “intent_A” with 99% confidence!

intent_A had the word “rope” in one training example, and “rope” does not appear in any other intent. “soap” does not appear in any training examples, and “on” and “a” appear in many training examples.

So, presumably it was the “rope” token that caused the classifier to classify the intent as intent_A. But why 99% confidence?

I have reproduced this with other nonsense phrases: whenever a single word appears in only one intent's training examples, that intent is predicted with 95% or higher confidence.

Any suggestions on what I should do?

My pipeline:

pipeline:
  - name: WhitespaceTokenizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: char_wb
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    epochs: 100
    constrain_similarities: true
  - name: ResponseSelector
    epochs: 100
    constrain_similarities: true
  - name: FallbackClassifier
    threshold: 0.5
    ambiguity_threshold: 0.1
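
To make the symptom concrete, here is a toy sketch in plain Python. It is not Rasa's code. It assumes the reported confidences come from a softmax over per-intent similarity scores, and that the fallback fires when the top confidence is below `threshold` or the gap between the top two intents is below `ambiguity_threshold`. The intent names other than intent_A and all the scores are made up for illustration.

```python
import math


def softmax(scores):
    """Turn raw per-intent scores into confidences that sum to 1."""
    highest = max(scores.values())
    # Subtract the maximum before exponentiating for numerical stability.
    exps = {intent: math.exp(s - highest) for intent, s in scores.items()}
    total = sum(exps.values())
    return {intent: e / total for intent, e in exps.items()}


def should_fall_back(confidences, threshold=0.5, ambiguity_threshold=0.1):
    """Assumed fallback rule: low top confidence, or top two too close."""
    ranked = sorted(confidences.values(), reverse=True)
    top = ranked[0]
    second = ranked[1] if len(ranked) > 1 else 0.0
    return top < threshold or (top - second) < ambiguity_threshold


# Hypothetical raw scores for "soap on a rope": only intent_A has seen
# "rope", so it gets a moderately higher score than every other intent.
scores = {"intent_A": 6.0, "intent_B": 0.5, "intent_C": 0.2, "out_of_scope": 0.0}

confidences = softmax(scores)
for intent, conf in sorted(confidences.items(), key=lambda kv: -kv[1]):
    print(f"{intent}: {conf:.3f}")
print("fallback triggered:", should_fall_back(confidences))
```

Under these assumptions the softmax only says how intent_A compares with the other intents, not how well the message matches intent_A in absolute terms. One distinctive token is enough to push the top confidence far above both thresholds, so the fallback never fires.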

aaronlikesrasa · June 20, 2023, 3:13pm

Hey @MikesUsingRasa, did you ever find a solution?
