How the test/train split works in Rasa

@AminaDerouiche

  1. An 80:20 ratio is the commonly recommended starting point for splitting data into training and test sets.

Rasa sets the default to 0.8, as the CLI help mentions: --training-fraction TRAINING_FRACTION Percentage of the data which should be in the training data. (default: 0.8)

Reference 1: Testing Your Assistant

Reference 2: https://rasa.com/docs/rasa/command-line-interface#rasa-data-split

So, if you have enough data, you don't need to worry about changing the ratio: 80:20 is the standard split, and it still works with the large amounts of data that deep learning requires.
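To make the 80:20 idea concrete, here is a minimal sketch in plain Python. It is not Rasa's own code: the function name split_examples, the (text, intent) tuple format, and the choice to split each intent separately are all assumptions made for illustration, and Rasa's internal behaviour may differ.

```python
import random
from collections import defaultdict


def split_examples(examples, training_fraction=0.8, seed=42):
    """Split (text, intent) pairs into train and test lists.

    Hypothetical helper for illustration only, not part of Rasa.
    Each intent is split separately so that both sets contain
    every intent in roughly the same proportion.
    """
    rng = random.Random(seed)

    # Group the examples by their intent label.
    by_intent = defaultdict(list)
    for text, intent in examples:
        by_intent[intent].append((text, intent))

    train, test = [], []
    for items in by_intent.values():
        rng.shuffle(items)
        cut = round(len(items) * training_fraction)
        train.extend(items[:cut])
        test.extend(items[cut:])
    return train, test


# Made-up data: 10 examples for each of two intents.
examples = [(f"greet example {i}", "greet") for i in range(10)]
examples += [(f"bye example {i}", "goodbye") for i in range(10)]

train, test = split_examples(examples, training_fraction=0.8)
print(len(train), len(test))  # 16 4
```

With 20 examples and a fraction of 0.8, 16 land in the training set and 4 in the test set. For real projects, the rasa data split command from Reference 2 does the splitting for you.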

If you want to investigate further how this works with the TensorFlow pipeline, I would suggest contacting a Rasa core developer, but you can also read this link step by step: https://aspiresoftware.in/blog/rasa-nlu-intent-classification-using-different-pipeline/ Hope it helps :slight_smile:

  2. As far as I know, Rasa does not implement re-sampling, though I may be wrong. If you want to learn about sampling in the context of TensorFlow, please follow this detailed link: Sampling Methods Within TensorFlow Input Functions | Datatonic

  3. A cross-validation test specifies a number (k) of folds that should be used to evaluate the model. By default, Rasa sets the number of folds to 5. For further reading, please see this detailed blog post by Karen White: Write Tests! How to Make Automated Testing Part of Your Rasa Dev Workflow
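To show what k folds means in practice, here is a small sketch of k-fold index generation in plain Python with k = 5. The function name k_fold_indices and the round-robin way of assigning examples to folds are assumptions for illustration; this is not how Rasa implements cross-validation internally.

```python
def k_fold_indices(n_examples, k=5):
    """Yield (train_indices, test_indices) for each of the k folds.

    Hypothetical helper for illustration only, not part of Rasa.
    Examples are assigned to folds round-robin; each fold is used
    once as the test set while the other k - 1 folds form the
    training set.
    """
    indices = list(range(n_examples))
    folds = [indices[i::k] for i in range(k)]

    for test_fold_no in range(k):
        test_idx = folds[test_fold_no]
        train_idx = [
            idx
            for fold_no, fold in enumerate(folds)
            if fold_no != test_fold_no
            for idx in fold
        ]
        yield train_idx, test_idx


# 20 examples and 5 folds: every fold tests on 4 and trains on 16.
splits = list(k_fold_indices(20, k=5))
for fold_no, (train_idx, test_idx) in enumerate(splits, start=1):
    print(f"fold {fold_no}: train={len(train_idx)} test={len(test_idx)}")
```

Every example is used for testing exactly once and for training in the other four folds, so the final score is averaged over all of the data instead of depending on a single 80:20 split.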

I hope this helps. I saw your related questions on YouTube today. Happy learning :slight_smile: If you have any further questions, please let me know!
