I want to reproduce the results of rasa-nlu so that I get the same set of error messages in each execution of my model. There are a couple of lines in the "intent_classifier_tensorflow_embedding" component which involve randomness, and I set a seed for them, but I still get a different result each time I run my model. Do you have any solution for this problem? I would appreciate it if you could help me figure this out.
I’m not sure what you mean, could you post an example?
My data includes 2500 messages. I train my model on the fixed training data, but I get a different number of misclassifications on "intent" and "entities" in each run. For example, the number of misclassifications in one run is 52, and in a second run with the same data and everything else the same, I got 63. I wonder why the number of misclassifications is not fixed.
As a first attempt to find the reason, I tried to find and set a seed for each random operation. The "intent_classifier_tensorflow_embedding" component has a couple of such lines ("permutation" and "choice"), and I set a seed for them, but I still get a different result each time I run my model. It would be great if you could mention a possible reason for this problem.
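Roughly, what I tried looks like the sketch below. It only covers NumPy and graph-level TensorFlow seeds, so the exact lines inside the component (and any remaining sources of randomness) may differ:

```python
# Sketch of the seeding I tried; assumes the randomness comes from NumPy
# (permutation/choice) and TensorFlow. Other sources (Python's `random`,
# hash seeds, GPU non-determinism) are not covered here.
import random
import numpy as np
import tensorflow as tf

SEED = 42
random.seed(SEED)          # Python-level randomness
np.random.seed(SEED)       # covers np.random.permutation / np.random.choice
tf.set_random_seed(SEED)   # graph-level seed (TF 1.x; tf.random.set_seed in TF 2.x)
```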
My final goal is to add a new component and check its effect on reducing the misclassifications, but in the meantime, since I get different results for the same data, I cannot rely on my results.
The reason it’s not fixed is that an ML model’s predictions will differ slightly every time you train it. But 52 vs. 63 out of 2500 isn’t a huge inconsistency. I’d suggest looking at the misclassifications and seeing whether you can improve your training data.
@akelad I am facing the same problem in Rasa Core. The issue is not the number of misclassifications but being able to reproduce the results.
The Rasa docs mention that "In order to get reproducible training results for the same inputs you can set the random_seed attribute of the KerasPolicy to any integer." But it does not work because of internal shuffling of the training data on every training run.
Also, your point that “an ML model’s predictions will differ slightly every time you train it” is not correct. Given the same training data, the same initial weights, and the same training config, the final weights will always be the same.
In my case of training a Rasa Core model, the training config is the same and the initial weights are the same (by setting random_seed), but the training data (shuffled_X, shuffled_Y) changes due to the preprocessing steps. I am investigating this further. If you have any insights, kindly share. Thanks!
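As far as I can tell, a seeded shuffle would make the data order reproducible. Here is an illustrative sketch only; shuffled_X / shuffled_Y just mirror the variable names I see, this is not the actual Rasa Core preprocessing code:

```python
# Illustration only: a seeded shuffle yields the same data order on every run.
# X / Y stand in for the featurized training data.
import numpy as np

X = np.arange(10).reshape(5, 2)
Y = np.arange(5)

def shuffled_copy(X, Y, seed=None):
    rng = np.random.RandomState(seed)   # seeded generator -> deterministic order
    idx = rng.permutation(len(X))
    return X[idx], Y[idx]

shuffled_X, shuffled_Y = shuffled_copy(X, Y, seed=42)  # identical across runs
```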
Hey @ankeshp, welcome to the community!
Well, it is correct if you use a different random seed each time. I think you submitted a PR for Rasa Core to solve the issues you’re talking about, right? The original poster here was talking about NLU though, so this would really belong in a new post in the forum.