Rasa classifies random input as intents with high probability

Hello, I have trained my model using pipeline:

    pipeline:
      - name: DucklingHTTPExtractor
        url: "http://localhost:8000"
        dimensions: ["duration"]
      - name: WhitespaceTokenizer
      - name: RegexFeaturizer
      - name: LexicalSyntacticFeaturizer
      - name: CountVectorsFeaturizer
      - name: CountVectorsFeaturizer
        analyzer: char_wb
        min_ngram: 1
        max_ngram: 4
      - name: DIETClassifier
        epochs: 100
      - name: EntitySynonymMapper
      - name: ResponseSelector
        epochs: 100
      - name: FallbackClassifier
        threshold: 0.3
        ambiguity_threshold: 0.1

However, random words like "asdf", "qwerty" and even single letters are classified as intents ("greet" and "neutral", respectively). I could increase the FallbackClassifier threshold, but some of these examples get confidence over 0.9. I'm using Rasa 2.8. How can I deal with this?
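For context, raising the threshold is done in config.yml as below. The 0.7 is only an illustrative value you would tune against a test set, and (as noted above) it will not catch nonsense inputs that still score above it:

```yaml
- name: FallbackClassifier
  threshold: 0.7            # predictions below this become nlu_fallback
  # ambiguity_threshold (optional) additionally triggers fallback when
  # the top two intents score within 0.1 of each other
  ambiguity_threshold: 0.1
```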


@Jakub Hi! Can you please share an example, file, or screenshot?

@Jakub Why are you using ambiguity_threshold: 0.1 inside FallbackClassifier? Can you share the link where you got this idea?

@Jakub Please share your complete config.yml, or update the one above.

Hi! Same on my side. No matter what kind of input, there is always an intent recognized, e.g. "123456", "dfjjaspfj" or something like "i want pizza".

I'm looking forward to some help. Thanks!

Using different versions with Docker:

  • rasa/rasa:main-spacy-de
  • rasa/rasa:2.8.1-spacy-de

config.yml:

    language: de
    pipeline:

    # # No configuration for the NLU pipeline was provided. The following default pipeline was used to train your model.
    # # If you'd like to customize it, uncomment and adjust the pipeline.
    # # See https://rasa.com/docs/rasa/tuning-your-model for more information.
    - name: SpacyNLP
      model: de_core_news_sm
    - name: SpacyTokenizer
    - name: SpacyFeaturizer
    - name: RegexFeaturizer
    - name: LexicalSyntacticFeaturizer
    - name: CountVectorsFeaturizer
    - name: CountVectorsFeaturizer
      analyzer: char_wb
      min_ngram: 1
      max_ngram: 4
    - name: DIETClassifier
      epochs: 100
      constrain_similarities: true
    - name: EntitySynonymMapper
    - name: FallbackClassifier
      threshold: 0.3
      ambiguity_threshold: 0.1

nlu.yml:

    version: "2.0"
    nlu:
    - intent: greet
      examples: |
        - hey
        - hallo
        - hi
        - hallo du
        - guten morgen
        - guten abend
        - morgen
        - guten tag
    - intent: goodbye
      examples: |
        - tschüss
        - wiederhören
        - ciao
        - bis dann
        - bye



    {
        "text": "12345",
        "intent": {
            "id": 8761605359927853060,
            "name": "goodbye",
            "confidence": 0.9780691862106323
        },
        "entities": [],
        "intent_ranking": [
            {
                "id": 8761605359927853060,
                "name": "goodbye",
                "confidence": 0.9780691862106323
            },
            {
                "id": -3994034142759590561,
                "name": "greet",
                "confidence": 0.02193082869052887
            }
        ]
    }

@thorty So for inputs like "I want pizza" or "I need drinks" that are not examples in your nlu.yml, it's giving you the goodbye output?

No, they are not in my nlu.yml as examples. In this case I'm expecting a result like intent: none, but what I'm getting is something like intent: goodbye with a confidence of over 0.9.

My nlu.yml is exactly like in the post above, and so is the pipeline. The result above is an example with this config.

thank you @nik202 !

@thorty Why this? Any significance? @thorty Link, please; I guess it's used for two-stage fallback, not for FallbackClassifier.

@nik202 No. It's from the docs, or from the example in the init pipeline. Should I use another threshold? I will try this out later.

@thorty All your inputs will be given an intent. If you have very high confidence on a lot of incorrect classifications, it's often because your training data is not a good representation of what your assistant is seeing in production.

You should create an out-of-scope intent if you want to be able to capture things your assistant can't do. This page goes over how to do it: Fallback and Human Handoff
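Following the pattern on that docs page, a minimal sketch might look like this (the intent examples and the utter_out_of_scope response name are placeholders you'd define in your own data and domain):

```yaml
# nlu.yml -- train an explicit out_of_scope intent on off-topic inputs
- intent: out_of_scope
  examples: |
    - asdf
    - 123456
    - I want to order a pizza from the moon

# rules.yml -- answer it with a dedicated response
rules:
- rule: respond to out-of-scope messages
  steps:
  - intent: out_of_scope
  - action: utter_out_of_scope
```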


@nik202: Thanks, but this does not change anything.

@rctatman: Thanks for your explanation and the link to the right step in the documentation. It helps a lot in understanding Rasa.

In my case I do not really know the utterances from users, so I will start with a small set of examples and train the NLU after going live with real-world examples. To handle the untrained utterances the right way, it would be better to get no intent rather than the wrong intent, especially when utterances are nonsense like in my example. Is there any way to achieve this?

@thorty ok

@Jakub Any progress on your error?

@nik202 Sorry for the late response. I removed ambiguity_threshold: 0.1 from my config file (the one I sent in the first message is my whole config.yml; I have default policies), but there's only a slight difference (a few more inputs are classified as nlu_fallback). I've uploaded examples and the neutral intent below.

(screenshots attached: example1, example2, example3, example4)
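As a side note, for the inputs that do land on nlu_fallback, Rasa 2.x lets you handle that intent with a rule, e.g. (utter_please_rephrase is a placeholder response you would define in the domain):

```yaml
# rules.yml -- react to low-confidence NLU predictions
rules:
- rule: ask the user to rephrase on low NLU confidence
  steps:
  - intent: nlu_fallback
  - action: utter_please_rephrase
```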

Short Update from my side:

When I also use the KeywordIntentClassifier in my pipeline (added as `- name: KeywordIntentClassifier`), the NLU responds as follows:

    "intent": {
        "name": null,
        "confidence": 0.0
    }

But I think that makes the DIETClassifier, with all its advantages, useless?!

Short Update:

As @rctatman explained: with more training data I get better confidence values for my utterances, also for "nonsense" input like "askjfnqiurz".
