Inconsistency between results/intent_errors.json and rasa shell nlu

Hi,

I trained a Rasa NLU model:

    rasa train nlu --nlu train_test_split/training_data.yml --domain domains --config all_configs/config_exp.yaml

and then tested the trained model against the same dataset to measure performance on the training set:

    rasa test nlu --nlu train_test_split/training_data.yml --config all_configs/config_exp.yaml --out results/

I find that intent_errors.json produces different results from running the same model on the command line with rasa shell nlu.

E.g., in results/intent_errors.json:

  {
    "text": "see you back",
    "intent": "nlu_fallback",
    "intent_prediction": {
      "name": "goodbye",
      "confidence": 0.03028026781976223
    }
  }

On command line:

    Next message:
    see you back
    {
      "text": "see you back",
      "intent": {
        "id": 8927892910707661214,
        "name": "nlu_fallback",
        "confidence": 0.9697197675704956
      },
      "entities": [],
      "intent_ranking": [
        {
          "id": 8927892910707661214,
          "name": "nlu_fallback",
          "confidence": 0.9697197675704956
        },
        {
          "id": -3812479436477113040,
          "name": "goodbye",
          "confidence": 0.03028026781976223
        },

On the command line I get nlu_fallback as the highest-confidence intent, but in results/intent_errors.json “goodbye” is recorded as the prediction. Let me know if I am missing something.

Thanks!

@Ironman42 How many previously trained models do you have? Delete them all, re-train, and compare.

I did try that. I deleted model/* before training any new model. But the issue still exists!

@Ironman42 Why are you splitting the data? Every time you split, it will pick different data and give you different output. @Ironman42 Check this: Using NLU Only
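
(Side note: if a non-deterministic split were the cause, it could be ruled out by pinning the seed; rasa data split nlu accepts a --random-seed flag so the same split is reproduced on every run. The paths and values below are illustrative, not from the original post.)

    # Illustrative: create a reproducible 80/20 split of the NLU data.
    rasa data split nlu -u data/nlu.yml --training-fraction 0.8 --random-seed 42 --out train_test_split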

I get your point, but I don’t split the data every single time. train_test_split is just the folder name where the data is saved; I split it once and haven’t changed it since. The main issue I am facing is that the results generated by rasa test nlu are different from what I get when I run rasa shell nlu.

@Ironman42 Is your bot working and replying as you need, or is it also misbehaving?

nvm, I resolved it.

When evaluating a model, Rasa (rasa/test.py at c33711ccaed438a7d4e1391e49989aaa30039197 · RasaHQ/rasa · GitHub) removes the “nlu_fallback” intent and chooses the next-best intent as the prediction. So if you have your own “nlu_fallback” intent, it automatically gets ignored during evaluation.
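
A minimal sketch of that behaviour (an illustration of the logic described above, not Rasa’s actual source): the fallback intent is dropped from the ranking before the report is written, so the next-best intent becomes the recorded prediction.

    # Sketch only: shows how dropping "nlu_fallback" from the ranking turns
    # "goodbye" into the recorded prediction. Not Rasa's actual code.
    FALLBACK_INTENT = "nlu_fallback"

    def evaluated_prediction(parse_result: dict) -> dict:
        """Return the top intent after removing the fallback intent."""
        ranking = parse_result.get("intent_ranking", [])
        non_fallback = [i for i in ranking if i["name"] != FALLBACK_INTENT]
        return non_fallback[0] if non_fallback else parse_result["intent"]

    # The shell output from the original post:
    parse_result = {
        "text": "see you back",
        "intent": {"name": "nlu_fallback", "confidence": 0.9697197675704956},
        "intent_ranking": [
            {"name": "nlu_fallback", "confidence": 0.9697197675704956},
            {"name": "goodbye", "confidence": 0.03028026781976223},
        ],
    }

    print(evaluated_prediction(parse_result))
    # -> {'name': 'goodbye', 'confidence': 0.03028026781976223}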

Solution: rename your “nlu_fallback” intent
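
For example, in the NLU training data the custom intent can be given a name that does not collide with the reserved “nlu_fallback” that FallbackClassifier assigns to low-confidence predictions. The new name and example below are placeholders:

    # Illustrative NLU data in Rasa 2.x YAML format; "out_of_scope" is a
    # placeholder name chosen to avoid the reserved "nlu_fallback".
    version: "2.0"
    nlu:
    - intent: out_of_scope   # renamed from "nlu_fallback"
      examples: |
        - see you back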

@Ironman42 Close this topic thread as a solution for others. Congrats.