Evaluate based on Test/Train Split

Has anyone considered an NLU evaluation option that splits off a fraction of the training data to use as a test set, so you don't have to maintain a separate test dataset? You could then generate a confusion matrix and errors.json just like the normal evaluation does.
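For illustration, here is a minimal sketch of the idea, not an actual implementation: hold out a stratified fraction of the labelled examples, predict intents on the held-out portion, and derive a confusion matrix plus an errors list in the spirit of errors.json. The `train_nlu_model` and `predict_intent` helpers are hypothetical stand-ins for whatever training/inference calls you use.

```python
# Sketch of the proposed hold-out evaluation flow. `train_nlu_model` and
# `predict_intent` are hypothetical placeholders, not real library APIs.
import json

from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix


def evaluate_with_holdout(texts, intents, test_fraction=0.2):
    # Stratified split so every intent keeps roughly the same proportion
    # in the training and test portions.
    train_x, test_x, train_y, test_y = train_test_split(
        texts, intents, test_size=test_fraction, stratify=intents, random_state=42
    )

    model = train_nlu_model(train_x, train_y)                # hypothetical
    predicted = [predict_intent(model, t) for t in test_x]   # hypothetical

    # Confusion matrix over all intents that appear in the held-out data.
    labels = sorted(set(test_y) | set(predicted))
    matrix = confusion_matrix(test_y, predicted, labels=labels)

    # Collect misclassified examples, similar in spirit to errors.json.
    errors = [
        {"text": text, "intent": actual, "predicted": pred}
        for text, actual, pred in zip(test_x, test_y, predicted)
        if actual != pred
    ]
    with open("errors.json", "w") as f:
        json.dump(errors, f, indent=2)

    return labels, matrix, errors
```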


If you use the cross-validation evaluation, there is a stratified train-test split, and it can run the training n times to give you an average over all runs.
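In plain sklearn terms, that behaves roughly like the sketch below (a stratified k-fold loop whose per-fold scores are averaged). This is only an analogue of the cross-validation evaluation, and the `score_fold` helper is a hypothetical stand-in for "train on this fold, then report an intent metric on the held-out part":

```python
# Rough sklearn analogue of a stratified cross-validation evaluation.
# `score_fold` is a hypothetical placeholder for training on one fold and
# returning an intent metric (e.g. weighted F1) on its held-out portion.
import numpy as np
from sklearn.model_selection import StratifiedKFold


def cross_validate(texts, intents, folds=5, seed=42):
    texts = np.array(texts)
    intents = np.array(intents)
    skf = StratifiedKFold(n_splits=folds, shuffle=True, random_state=seed)

    scores = []
    for train_idx, test_idx in skf.split(texts, intents):
        score = score_fold(
            texts[train_idx], intents[train_idx],   # training portion
            texts[test_idx], intents[test_idx],     # held-out portion
        )  # hypothetical
        scores.append(score)

    # Average over all folds, as the cross-validation evaluation reports.
    return float(np.mean(scores)), float(np.std(scores))
```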

Yes, but it doesn't provide the errors.json or the confusion matrix, and both of those seem very useful for producing a better chatbot.

You could use the same code to generate your stratified dataset from your training data. sklearn's train_test_split() might help: https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html
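For example, a stratified split on the intent labels could look like this (the data here is a made-up toy example; the key point is passing the intent labels to `stratify` so every intent shows up in both halves):

```python
from sklearn.model_selection import train_test_split

# Toy training data: utterances with their intent labels (illustrative only).
texts = [
    "hi", "hello there", "good morning", "hey",
    "bye", "see you later", "goodbye", "talk to you soon",
    "book a table", "reserve a table for two",
    "i need a reservation", "table for four please",
]
intents = (["greet"] * 4) + (["goodbye"] * 4) + (["book_table"] * 4)

# Stratify on the intent labels so each intent is represented in both splits.
train_texts, test_texts, train_intents, test_intents = train_test_split(
    texts, intents, test_size=0.25, stratify=intents, random_state=42
)

print(len(train_texts), "training examples,", len(test_texts), "held out for evaluation")
```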

Yes, I saw that code and called it in the PR I submitted.