Manage and maintain large amount of training data

gerasimos · October 22, 2020, 11:36am

Hello,

I am building a shiny new bot, which currently has around 500 story examples, and the number grows every day. Because the project is agile, the flows change often: user turns get inserted into or deleted from already-written stories, and some examples become invalid.

Is there an elegant way to manage and maintain such changes? We currently review all the examples whenever a new change is introduced. Rasa X and Postgres offer some help in searching for affected stories. Is there another approach?

Thank you, Gerasimos

stephens · October 22, 2020, 10:35pm

Do you really need that many stories? If possible, split the stories up into fragments to reduce the number.

It’s common for larger bots to create multiple story files around skills or functionality. As you’ve probably noticed, you can do a simple string search in Rasa X to find stories based on intent or utterance.
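
If the stories live in several files on disk, the same kind of search can be scripted outside Rasa X. Below is a minimal sketch, not an official Rasa tool. It assumes the stories are written in the Rasa 2.x YAML format (`- story: <name>` followed by `- intent: <name>` steps) and it scans the text line by line rather than parsing YAML. The function names, the sample stories, and the `data/stories` path are made up for illustration.

```python
import re
from pathlib import Path
from typing import Dict, List

# Assumption: each story starts with a line like "- story: some name".
STORY_HEADER = re.compile(r"^\s*-\s*story:\s*(.+?)\s*$")


def stories_mentioning(text: str, intent: str) -> List[str]:
    """Return the names of stories that contain a step for the given intent."""
    intent_line = re.compile(r"^\s*-\s*intent:\s*" + re.escape(intent) + r"\s*$")
    hits: List[str] = []
    current = None  # name of the story we are currently inside
    for line in text.splitlines():
        header = STORY_HEADER.match(line)
        if header:
            current = header.group(1)
        elif current is not None and intent_line.match(line):
            if current not in hits:
                hits.append(current)
    return hits


def scan_directory(root: str, intent: str) -> Dict[str, List[str]]:
    """Map each *.yml file under root to the affected story names in it."""
    results: Dict[str, List[str]] = {}
    for path in sorted(Path(root).rglob("*.yml")):
        hits = stories_mentioning(path.read_text(encoding="utf-8"), intent)
        if hits:
            results[str(path)] = hits
    return results


# Hypothetical sample data, only to show the expected file layout.
SAMPLE = """\
stories:
- story: order pizza happy path
  steps:
  - intent: greet
  - action: utter_greet
  - intent: order_pizza
  - action: pizza_form
- story: say goodbye
  steps:
  - intent: goodbye
  - action: utter_goodbye
"""

if __name__ == "__main__":
    print(stories_mentioning(SAMPLE, "order_pizza"))
    # For a real project, point it at the story folder (hypothetical path):
    # print(scan_directory("data/stories", "order_pizza"))
```

With one file per skill, the result of `scan_directory` shows directly which files and which stories need a review after an intent or flow changes.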

Greg
