How to decrease false positives in Rasa 2.8.6

Hello,

I’m using Rasa 2.8.6 to train a French model.

I have the configuration below:

```yaml
language: fr
pipeline:
  - name: SpacyNLP
    model: fr_core_news_lg
    case_sensitive: false
  - name: SpacyTokenizer
  - name: SpacyFeaturizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: char_wb
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    entity_recognition: False
    epochs: 200
  - name: CRFEntityExtractor
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100
  - name: FallbackClassifier
    threshold: 0.7
policies:
  - name: RulePolicy
    core_fallback_threshold: 0.3
    core_fallback_action_name: 'action_default_fallback'
    enable_fallback_prediction: True
```

My problem is that the bot predicts intents for OOV (out-of-vocabulary) or out-of-scope messages with high confidence (0.8–0.9). This makes my bot unstable. To sum up, there are several false-positive intent predictions.

Is there a way to improve the bot’s quality, please?

Try reducing the fallback threshold.
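
For reference, that threshold is the `FallbackClassifier` one in your pipeline. A minimal sketch of where the change goes (the 0.5 value is only an example; you would tune it on your own test data), together with a rule that reacts to the resulting `nlu_fallback` intent (the `utter_please_rephrase` response name is hypothetical):

```yaml
# config.yml -- adjust the NLU fallback threshold
pipeline:
  # ... your other components unchanged ...
  - name: FallbackClassifier
    threshold: 0.5   # example value; messages below this confidence become nlu_fallback

# rules.yml -- react to the fallback intent
rules:
  - rule: Ask the user to rephrase on low NLU confidence
    steps:
      - intent: nlu_fallback
      - action: utter_please_rephrase   # hypothetical response defined in your domain
```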

Also, read about TensorBoard in Rasa to clearly see results for DIET, ResponseSelector, and TED, and optimize your pipeline.
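
If it helps, enabling it is just two extra parameters on the trainable components; a minimal sketch for `DIETClassifier` (the log directory path is an arbitrary example, and the same two parameters work for `ResponseSelector` and `TEDPolicy`):

```yaml
- name: DIETClassifier
  entity_recognition: False
  epochs: 200
  tensorboard_log_directory: ./tensorboard   # example path for the event files
  tensorboard_log_level: epoch               # log once per epoch ("minibatch" is finer)
```

After training, point TensorBoard at that directory (`tensorboard --logdir ./tensorboard`) and compare the training curves against the validation curves.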

I have an NLU problem. Here is an example:


So I wrote an out-of-scope message like “kkkkkk”.

It is predicted as “bye” with confidence 0.8.


Another message, like “mou”, is predicted as “vacation_how_many_days” with confidence 0.7, and so on.

Add “kkkk” as an out-of-scope message in the NLU.

Yes, there is technically an infinite number of messages that need to fall under the out-of-scope intent. But don’t forget that your bot is still in development. A lot of out-of-scope examples need to be added, and each time, your bot will get better at detecting real intents.
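
A minimal sketch of what that looks like in Rasa 2.x NLU training data (the intent name `out_of_scope` and these examples are placeholders; collect the messages your users actually send):

```yaml
# nlu.yml -- give the out-of-scope intent its own training examples
nlu:
  - intent: out_of_scope
    examples: |
      - kkkkkk
      - mou
      - sdfgh
      - blah blah blah
```

You would also map `out_of_scope` to a response via a rule or story so the bot answers it gracefully.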

Also, TensorBoard really helps a lot. Aim for high test/validation accuracy, not training accuracy.

@Asmacats in general, if you’re seeing low accuracy and low confidence (i.e. less than 0.9 as a very rough guide) it usually means you either need more training data or to reduce your number of intents/the degree of overlap between them. This video talks about it a bit more: Conversational AI with Rasa: Training Data and Rules - YouTube


@rctatman my problem is that the bot associates OOV words with intents with very high confidence.

I have added

```yaml
- name: CountVectorsFeaturizer
  OOV_token: oov
```

but the problem is still the same.

Any ideas, please?

I believe that’s expected behaviour; tokens that are OOV for another model (e.g. spaCy) can still be learned and given weights by the CountVectorsFeaturizer: spaCy and OOV-Tokens

The high confidence is probably due to specific tokens being strongly associated with specific intents.
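
One thing worth double-checking (this is how `OOV_token` is documented to behave, though the examples below are placeholders): setting the token in the pipeline only substitutes unseen words at prediction time, so the literal token also needs to appear in your training data for the model to learn what to predict for it. Roughly:

```yaml
# nlu.yml -- include the literal OOV token in examples so the model
# learns an association for words it has never seen before
nlu:
  - intent: out_of_scope
    examples: |
      - oov
      - oov oov oov
      - what is oov
```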


I know this thread is old, but I thought someone might still find these tips useful. I wrote a blog post discussing basic principles for improving your intent detection model: You might be training your chatbot wrong | Everything Chatbots Blog