Manage and maintain large amounts of training data


I am building a shiny new bot, which currently has around 500 story examples, and the number is growing day by day. Due to the agile nature of the project, the flows change frequently: user turns are inserted into or deleted from already-written stories, some examples become invalid, and entire flows get restructured.

Is there a clean way to manage and maintain such changes? We are currently reviewing all the examples after each new change is introduced. Rasa X and Postgres offer some help in searching for affected stories. Is there another approach?

Thank you, Gerasimos

Do you really need that many stories? If possible, split the stories up into fragments to reduce the number.

It’s common for larger bots to create multiple story files around skills or functionality. As you’ve probably noticed, you can do a simple string search in Rasa X to find stories based on intent or utterance.
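Splitting long stories into smaller fragments can be done with checkpoints, so that a shared sub-flow is written once and reused. A minimal sketch in Rasa's Markdown story format (the intent, action, and checkpoint names here are illustrative, not from the original post):

```md
<!-- Shared greeting fragment, written once -->
## greet user
* greet
  - utter_greet
> checkpoint_greeted

<!-- Flows that continue after the greeting reuse the checkpoint -->
## ask about products
> checkpoint_greeted
* ask_product
  - utter_product_info

## ask about pricing
> checkpoint_greeted
* ask_pricing
  - utter_pricing_info
```

With this structure, a change to the greeting flow only touches one fragment instead of every story that starts with a greeting. Note that the Rasa docs advise using checkpoints sparingly, since heavy use can slow down training and make stories harder to follow.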