NLU Performance

I'm working on a model using the TensorFlow classifier with 10 intents. When I evaluated the NLU, the confusion matrix was really bad and all the metrics were low (on both train and test). I don't know what I did wrong. Could you please help me find out?

Hi boutaina,

This can be due to a lot of reasons. Are your intents really similar in terms of the words that are used? Do you have at least 10 examples of each intent? What does your config.yml look like?

If the train scores are bad as well then I’d argue that you’re currently not overfitting but underfitting.

Yes, exactly, it is underfitting. My intents are different from one another, and yes, I have at least 10 examples per intent. Here is a look at my config file:

language: en


pipeline:
  - name: "WhitespaceTokenizer_arab"
  - name: "ner_synonyms"
  - name: "CountVectorsFeaturizer"
    "OOV_token": "oov"
  - name: "EmbeddingIntentClassifier"
    batch_strategy: sequence

policies:
  - name: KerasPolicy
    max_history: 5
    epochs: 100
    batch_size: 50
  - name: "MemoizationPolicy"
    max_history: 5
  - name: "FallbackPolicy"
    nlu_threshold: 0.4
    core_threshold: 0.3
  - name: MappingPolicy

The really strange thing is that I used the same data in another Rasa version with a different config file and it works really well, so now I really don't know why it's not working here. Can you please help me figure this out?

You can convert code into pretty html by using ```

like this

This makes your config.yml file render just a bit nicer.

Looking at your config, I might recommend adding a CountVectorsFeaturizer that also takes n-grams into account. Have you used these before? Also, what Rasa version are you using? The EmbeddingIntentClassifier is now deprecated in favor of the DIETClassifier.
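As a rough sketch of what that suggestion could look like, the NLU part of the config might add a second featurizer that works on character n-grams. The tokenizer name and the `OOV_token` setting are copied from the config posted above. The n-gram range (1 to 4), the `epochs` value, and the swap to `DIETClassifier` are assumptions for illustration, not tuned settings, and other components from the posted pipeline are left out for brevity.

```yaml
language: en

pipeline:
  # tokenizer kept exactly as in the posted config
  - name: "WhitespaceTokenizer_arab"
  # word-level bag-of-words features, as in the posted config
  - name: "CountVectorsFeaturizer"
    OOV_token: "oov"
  # additional featurizer: character n-grams inside word boundaries
  # (min_ngram / max_ngram values are assumptions, adjust as needed)
  - name: "CountVectorsFeaturizer"
    analyzer: "char_wb"
    min_ngram: 1
    max_ngram: 4
  # assumed swap: DIETClassifier instead of EmbeddingIntentClassifier;
  # the epochs value is illustrative only
  - name: "DIETClassifier"
    epochs: 100
```

Character n-grams can help when intents share few whole words with the training examples (for instance with typos or rich morphology), because partial word matches still produce overlapping features. Whether that helps here would need to be checked by re-running the NLU evaluation.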

Thank you for replying. First, I'm using version 1.8.1, and the EmbeddingIntentClassifier works fine with other data, so I don't think that's the problem. I also noticed something while trying many solutions: when I train with only 8 intents it works fine (it overfits a little, but the results are good), and as soon as I add one or two more intents the accuracy drops to 0.1 in the last epoch, so I'm really lost here. I will take your recommendation about the featurizer into consideration and try it.