Confidence too high for unrelated messages

Hi all,

I’m building a bot to respond to users’ questions, and I’ve run into an issue.

Rasa gives me a high confidence score for messages that are completely unrelated to my intents’ examples.

I have medical-related intents, but a message like “I like coffee” gets even higher confidence than messages that are actually related. Random-character messages like “laj jfias jjlas fe” also get high confidence.

Could anyone give me a hint on how to fix this? Where should I look for the problem?

This is my config:

language: "en"

pipeline:
- name: "nlp_spacy"
- name: "tokenizer_spacy"
- name: "intent_entity_featurizer_regex"
- name: "intent_featurizer_spacy"
- name: "ner_crf"
- name: "ner_synonyms"
- name: "intent_classifier_sklearn"

Hi @kumek,

What does your training data look like? Could you post the data file here?
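
It would also help to see the full intent ranking for the problem messages, not just the top intent. Below is a minimal sketch for summarizing a parse result, assuming it has the usual Rasa NLU shape (an `intent` dict plus an `intent_ranking` list). `summarize_parse` is a hypothetical helper name, and the sample result and intent names are made up for illustration, not real output.

```python
def summarize_parse(parse_result):
    """Return (top_intent, confidence, margin_over_runner_up) for one parse result.

    Assumes the dict shape returned by Rasa NLU's parse step:
    {"text": ..., "intent": {...}, "intent_ranking": [{"name", "confidence"}, ...]}
    """
    ranking = sorted(
        parse_result.get("intent_ranking", []),
        key=lambda item: item["confidence"],
        reverse=True,
    )
    if not ranking:
        # Fall back to the single top intent if no ranking is present.
        intent = parse_result.get("intent") or {}
        return intent.get("name"), intent.get("confidence", 0.0), None

    top = ranking[0]
    runner_up_conf = ranking[1]["confidence"] if len(ranking) > 1 else 0.0
    return top["name"], top["confidence"], top["confidence"] - runner_up_conf


# Illustrative input only: intent names and numbers are invented.
sample = {
    "text": "I like coffee",
    "intent": {"name": "ask_symptoms", "confidence": 0.81},
    "intent_ranking": [
        {"name": "ask_symptoms", "confidence": 0.81},
        {"name": "ask_medication", "confidence": 0.12},
        {"name": "book_appointment", "confidence": 0.07},
    ],
}

name, confidence, margin = summarize_parse(sample)
print(name, confidence, margin)
```

Running this over the unrelated and random-character messages should show whether one intent is consistently absorbing them. That in turn hints at whether the training data is unbalanced or is missing examples of unrelated input.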