Getting high intent confidence for untrained bot utterances

Below are my config and NLU files. Issue: for any untrained utterance, the model returns an intent with high confidence (0.99). Examples of untrained utterances: “cricket station”, “Michael is on leave”, “asdasfasf”. Please let me know whether anything in the config or NLU data file should be changed.

Config file:

language: en

pipeline:
  - name: addons.CustomTokenizer.CustomTokenizer
  - name: RegexFeaturizer
    case_sensitive: true
    use_word_boundaries: false
  - name: CountVectorsFeaturizer
    stop_words:
      - a
      - and
      - any
      - are
      - aren't
      - because
      - being
      - by
      - can't
      - cannot
      - could
      - couldn't
      - does
      - doesn't
      - don't
      - during
      - from
      - further
      - if
      - in
      - into
      - itself
      - let's
      - more
      - of
      - or
      - other
      - ought
      - over
      - shan't
      - some
      - such
      - than
      - that
      - that's
      - them
      - themselves
      - this
      - those
      - through
      - under
      - until
      - up
      - very
      - where
      - where's
      - which
      - while
    analyzer: word
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    epochs: 100
  - name: addons.FuzzyMatch.EntityTypoFixer
    score_cutoff: 0.9
  - name: FallbackClassifier
    threshold: 0.4
    ambiguity_threshold: 0.1
  - name: EntitySynonymMapper

policies:
  - name: TEDPolicy
    max_history: 1
    epochs: 150
    batch_size: 20
    max_training_samples: 300
  - name: MemoizationPolicy
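
For reference, the decision rule that the FallbackClassifier entry in this config is documented to apply (threshold 0.4, ambiguity_threshold 0.1) can be sketched in plain Python. This is only an illustration of the documented behaviour, not Rasa's own implementation; the function name should_fall_back and the sample rankings are made up for the example.

```python
from typing import Dict, List

# Values copied from the FallbackClassifier entry in the config.
THRESHOLD = 0.4
AMBIGUITY_THRESHOLD = 0.1


def should_fall_back(
    intent_ranking: List[Dict],
    threshold: float = THRESHOLD,
    ambiguity_threshold: float = AMBIGUITY_THRESHOLD,
) -> bool:
    """Hypothetical helper: True if nlu_fallback would be predicted.

    intent_ranking is a list of {"name": ..., "confidence": ...} dicts.
    """
    ranked = sorted(intent_ranking, key=lambda i: i["confidence"], reverse=True)
    if not ranked:
        # Assumption for this sketch: nothing to rank means fall back.
        return True
    top = ranked[0]["confidence"]
    # Rule 1: the best intent is below the confidence threshold.
    if top < threshold:
        return True
    # Rule 2: the two best intents are too close to each other.
    if len(ranked) > 1 and top - ranked[1]["confidence"] < ambiguity_threshold:
        return True
    return False


# Illustrative rankings (invented numbers, not real model output).
confident = [
    {"name": "greetings_hello", "confidence": 0.99},
    {"name": "hours_min", "confidence": 0.005},
]
low = [
    {"name": "greetings_hello", "confidence": 0.35},
    {"name": "hours_min", "confidence": 0.20},
]
ambiguous = [
    {"name": "hours_min", "confidence": 0.50},
    {"name": "month_date", "confidence": 0.45},
]

print(should_fall_back(confident))  # False
print(should_fall_back(low))        # True
print(should_fall_back(ambiguous))  # True
```

With a top confidence of 0.99 neither rule fires, so the fallback can only trigger once the intent classifier itself reports a low or ambiguous confidence.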

NLU file:

version: "2.0"

nlu:
  - intent: greetings_how_r_u
    examples: |
      - how are you getting on
      - how is your day going
      - how do you do
      - how are you

  - intent: greetings_hello
    examples: |
      - hi
      - hi there
      - hello bot
      - hey bot

  - intent: hours_min
    examples: |
      - what is the time now
      - What time is it?
      - what is the time

  - intent: month_date
    examples: |
      - what date is it
      - what is the date today
      - what is the current date


@simpleng Hello, can you please explain your use case in more detail, and why you want to remove the stop_words? I assume you know what stop words are used for and what happens when they are removed from training? Just checking :)

@simpleng Please also format the code so it is easier to read.

@nik202: No, I don’t want to remove stop words. My issue is this: when I train with the given config and NLU files and then call the NLU parse API with random untrained utterances, it returns an intent with high confidence, which is wrong. It should return either the nlu_fallback intent or an intent with low confidence.
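
A minimal sketch of the check described here, assuming a Rasa server running with the HTTP API enabled and reachable at http://localhost:5005 (the address and all helper names are assumptions for the example). It posts each utterance to the /model/parse endpoint and reports the top intent and its confidence.

```python
import json
from urllib import request

RASA_URL = "http://localhost:5005/model/parse"  # assumed local server address


def build_request(text: str, url: str = RASA_URL) -> request.Request:
    """Build the POST request for the parse endpoint (hypothetical helper)."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse(text: str, url: str = RASA_URL) -> dict:
    """Send one utterance to the server and return the decoded JSON reply."""
    with request.urlopen(build_request(text, url)) as resp:
        return json.loads(resp.read().decode("utf-8"))


def top_intent(parse_result: dict) -> tuple:
    """Extract (intent name, confidence) from a parse result."""
    intent = parse_result.get("intent") or {}
    return intent.get("name"), intent.get("confidence")


def report(utterances, parse_fn=parse):
    """Print and return (text, intent, confidence) for each utterance."""
    rows = []
    for text in utterances:
        name, confidence = top_intent(parse_fn(text))
        rows.append((text, name, confidence))
        print(f"{text!r}: intent={name} confidence={confidence}")
    return rows


# Needs a running server with a trained model:
# report(["cricket station", "Michael is on leave", "asdasfasf"])
```

The parse_fn parameter exists only so the reporting logic can be tried without a server, by passing in a function that returns a canned response.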

@simpleng Please share your Rasa version (the output of rasa --version).

Rasa version: 2.8.3

@simpleng Can you please run rasa --version in your environment and share the output?

rasa --version

Rasa Version : 2.7.1
Minimum Compatible Version: 2.6.0
Rasa SDK Version : 2.8.2
Rasa X Version : None
Python Version : 3.7.11
Operating System : Linux-5.4.0-1058-azure-x86_64-with-glibc2.17
Python Path : /opt/venv/bin/python

@simpleng Did you ever find a solution to this problem? I am facing the same issue: my fallback action isn’t triggered because some intent always gets a confidence score of 0.9 or higher.