Why do I get different confidence scores for the same question when I only change the order of the training phrases?

First training data order:

intent: welcome

  • hello
  • hi
  • hey
  • how are you

Second training data order:

intent: welcome

  • how are you
  • hello
  • hey
  • hi

When I parse the same question, “Hello”, I get different confidence scores from the two models.

Once trained, a neural network is deterministic. Training one, however, is typically a stochastic process: the layer weights are usually initialised randomly, and the batches of data that produce the gradient signal are usually shuffled randomly as well. This could explain part of what you’re experiencing, but I also want to double-check: did you add examples to any of the other intents? That would certainly influence the confidence scores too.
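To make the batching point concrete, here is a minimal sketch in plain Python. It is not Rasa's training code; the `batches` helper and the batch size are hypothetical, and it simply assumes the trainer shuffles example positions with a seeded random generator. Under that assumption a fixed seed pins down the permutation of positions, not which phrase sits at each position, so reordering the phrases still changes what the model sees in each batch.

```python
import random


def batches(examples, seed, batch_size):
    """Shuffle a copy of `examples` with a seeded generator and split it into batches.

    Hypothetical helper for illustration only, not part of Rasa.
    """
    rng = random.Random(seed)   # fixed seed -> same permutation of positions every run
    shuffled = list(examples)   # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    return [shuffled[i:i + batch_size] for i in range(0, len(shuffled), batch_size)]


# The two orderings from the question.
order_a = ["hello", "hi", "hey", "how are you"]
order_b = ["how are you", "hello", "hey", "hi"]

batches_a = batches(order_a, seed=1, batch_size=2)
batches_b = batches(order_b, seed=1, batch_size=2)
same_run_twice = batches(order_a, seed=1, batch_size=2)

print("order A:", batches_a)
print("order B:", batches_b)
print("same data, same seed, identical batches:", batches_a == same_run_twice)
print("reordered data, same seed, identical batches:", batches_a == batches_b)
```

Running the same ordering twice with the same seed gives identical batches, while the reordered data gives a different batch sequence. Different batches mean different gradient steps, which can end in slightly different weights and therefore slightly different confidence scores.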

@koaning Thank you so much for replying. I didn’t change the other intents, only this one. And as I said, I didn’t add new data; I only changed the order of the training phrases of intent: welcome, as in the example.

I always have random_seed: 1 in the pipeline:

language: pt
pipeline:
  - name: tokenizer_whitespace
  - name: ner_crf
    features: [["low"], ["bias", "low", "prefix5", "prefix2", "suffix5", "suffix3", "suffix2", "digit", "pattern"], ["low"]]
  - name: ner_synonyms
  - name: intent_featurizer_count_vectors
    lowercase: true
    OOV_token: None
  - name: intent_classifier_tensorflow_embedding
    random_seed: 1
  - name: "ner_duckling_http"
    url: "http://rasa_duckling:8000"
    locale: "pt_PT"
    timezone: "UTC"
    dimensions: ["amount-of-money", "distance", "duration", "email", "phone-number", "quantity", "temperature", "time", "url", "volume", "number"]

Your explanation seems to make sense and it could explain the pipeline behaviour. Thank you so much!

What version of Rasa are you using here? You can confirm with:

rasa --version

I’m working with a legacy code base that runs on version 0.14.6.