Comparing Policies - guide not clear

Hi,

I need to evaluate and compare the performance of my policies. I’m trying to follow this guide, but I find it quite confusing.

Especially in the beginning of a project, you do not have a lot of real conversations to use to train your bot, so you don’t just want to throw some away to use as a test set.

Here it says that I shouldn’t “throw away” my data by using it as a test set.

Once you are happy with it, you can then train your final configuration on your full data set.

But one line below, it says that I can then train the model on the “full data set”. So do I have to split my data after all? And if so, how should I split it?

For each policy configuration provided, Rasa Core will be trained multiple times with 0, 5, 25, 50, 70 and 95% of your training stories excluded from the training data.

What’s the point of doing that? Why would I want to train my model on just a portion of the training set?
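To make the quoted procedure concrete, here is a minimal sketch in plain Python of what excluding a percentage of the training stories could look like. It is not Rasa's implementation: the `subsample` helper, the `stories` list, and the fixed seed are hypothetical, and only the percentages come from the guide.

```python
import random

# Exclusion percentages quoted in the guide.
EXCLUSION_PERCENTAGES = [0, 5, 25, 50, 70, 95]


def subsample(stories, exclusion_percentage, seed=0):
    """Return a random subset of stories with the given percentage left out."""
    rng = random.Random(seed)
    keep = round(len(stories) * (100 - exclusion_percentage) / 100)
    return rng.sample(stories, keep)


# Hypothetical stand-in for 100 training stories.
stories = [f"story_{i}" for i in range(100)]

for pct in EXCLUSION_PERCENTAGES:
    subset = subsample(stories, pct)
    # A real comparison would train one model per subset and score each
    # on the same held-out stories.
    print(f"{pct:>2}% excluded -> train on {len(subset)} stories")
```

One plausible reading of the guide is that scoring each of these models on the same held-out stories shows how quickly a policy configuration improves as it sees more data.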

Also, I didn’t really understand how the stories are evaluated. If I have one story that is 30 utterances long and another that is 2 utterances long, and my model makes one incorrect prediction in each, are the two stories scored differently?
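The difference this question is getting at can be shown with a small sketch. Both scoring functions below are hypothetical illustrations of two common ways to count errors, not Rasa's evaluation code, so which one matches the guide's numbers is still the open question.

```python
def action_accuracy(stories):
    """Fraction of individual predictions that are correct, pooled over all stories."""
    total = sum(len(story) for story in stories)
    correct = sum(sum(story) for story in stories)
    return correct / total


def story_accuracy(stories):
    """Fraction of stories in which every single prediction is correct."""
    return sum(all(story) for story in stories) / len(stories)


# Each story is a list of booleans: True where the prediction was correct.
long_story = [True] * 29 + [False]  # 30 steps, one mistake
short_story = [True, False]         # 2 steps, one mistake
stories = [long_story, short_story]

print(action_accuracy(stories))  # 30 correct predictions out of 32
print(story_accuracy(stories))   # 0 of 2 stories fully correct
```

Under the pooled per-prediction metric, the single mistake in the long story costs little. Under the per-story metric, both stories count as failures regardless of their length.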

Thank you, Tiziano

@tiziano Splitting your complete data into a training set and a test set is a well-known practice in machine learning. If you train a model on your entire dataset, it may overfit, that is, memorize the data points as they are, and then fail to handle novel data points it encounters in the real world. For Rasa Core, these new data points are the real-world conversations it encounters once deployed.
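As a rough illustration of such a split, here is a minimal sketch in plain Python. The `split_stories` helper, the 20% test fraction, and the seed are assumptions made for the example and are not part of Rasa.

```python
import random


def split_stories(stories, test_fraction=0.2, seed=42):
    """Shuffle the stories and hold out a fraction of them as a test set."""
    shuffled = list(stories)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    # The first n_test shuffled stories become the test set, the rest train.
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical stand-in for 50 collected stories.
all_stories = [f"story_{i}" for i in range(50)]
train, test = split_stories(all_stories)
print(len(train), len(test))
```

The model is then trained only on `train` and scored on `test`, so the score reflects conversations it has not seen.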

I’m well aware of that. I was just pointing out the inconsistency in the guide.

Thanks for bringing it up. Would you be up for contributing to the documentation to make it more consistent? Please feel free to open a PR with your proposed changes. Thanks.

If you are referring to my last question, that is a different matter. I wasn’t asking about splitting the data into a training set and a test set, but about taking the training set (after the split) and using only a portion of it to train the model (with exclusion percentages of 0, 5, 25, …). Why would you want to do that?

I’m not familiar with the exact inner workings of Rasa, so I don’t think I know enough to propose changes to the docs. Also, I’m not sure how to open a PR.