How to split train test data using Python

How do I split train and test data using Python?

Hi and welcome to the forum :bouquet: @SATEESH302 Why do you need that? Do you need it for a chatbot? Can you elaborate more?

Welcome to the forum!

You can use rasa data split as mentioned in the docs.

@SATEESH302

  1. An 80:20 ratio is the commonly recommended starting point for splitting data into training and test sets.

Rasa sets the default to 0.8, as the docs mention: --training-fraction TRAINING_FRACTION Percentage of the data which should be in the training data. (default: 0.8)
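To make the 0.8 fraction concrete, here is a minimal pure-Python sketch of an 80:20 split done per intent. It is not Rasa's implementation; the function name `split_examples`, the `seed` parameter, and the sample intents are made up for illustration.

```python
import random

def split_examples(examples_by_intent, training_fraction=0.8, seed=42):
    """Split each intent's examples into a train part and a test part.

    Illustrative helper only (hypothetical name, not part of Rasa).
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = {}, {}
    for intent, examples in examples_by_intent.items():
        shuffled = list(examples)  # copy so the input is not modified
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * training_fraction)
        train[intent] = shuffled[:cut]
        test[intent] = shuffled[cut:]
    return train, test

# Made-up sample data: five examples per intent
data = {
    "greet": ["hi", "hello", "hey", "good morning", "hi there"],
    "goodbye": ["bye", "see you", "goodbye", "later", "take care"],
}
train, test = split_examples(data)
```

With five examples per intent and a fraction of 0.8, each intent ends up with four training examples and one test example.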

Reference 1: Testing Your Assistant

Reference 2: https://rasa.com/docs/rasa/command-line-interface#rasa-data-split

So if you have enough data, you don't need to worry about changing the ratio: it is a standard choice and works with the large amounts of data that deep learning requires.

If you want to investigate further how it works with the TensorFlow pipeline, you can read this link step by step: https://aspiresoftware.in/blog/rasa-nlu-intent-classification-using-different-pipeline/ Hope it helps :slight_smile:

I want to generate that test data using a Python script.

I want to generate train and test data using a Python script, so I need the code in Python.

Are you asking for Rasa or in general?

Rasa only

Well you can run the above command from Python:

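import subprocess
# Run the Rasa CLI from Python; the subcommand for NLU data is "rasa data split nlu"
subprocess.run(["rasa", "data", "split", "nlu"], check=True)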
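If you also want to change the ratio from Python, something like the sketch below might work. It assumes the `rasa` executable is on your PATH and uses only the `--training-fraction` option quoted earlier in this thread; the helper names `build_split_command` and `run_split` are made up for illustration.

```python
import subprocess

def build_split_command(training_fraction=0.8):
    """Build the argument list for the Rasa CLI (hypothetical helper)."""
    return [
        "rasa", "data", "split", "nlu",
        "--training-fraction", str(training_fraction),
    ]

def run_split(training_fraction=0.8):
    """Run the split; raises CalledProcessError if the CLI exits with an error."""
    # Assumes `rasa` is installed and this is run from the project directory
    subprocess.run(build_split_command(training_fraction), check=True)

# Usage (uncomment inside a Rasa project):
# run_split(0.7)
```

Passing the arguments as a list avoids going through a shell, so the same script should work on Windows, Linux and macOS.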

I know that the rasa data split nlu command works in the terminal, but I want to run the script using Python.