Moving training data from prod to test environment


I have a Rasa X setup with Docker Compose and I want to use two servers, one for development and one for production. The conversation data ends up on the production server, but I need it on my development server to train new models. The production server has no repository connected for safety reasons, so I need to move the data directly from server to server. Is there a common practice for setting this up, so that I can move the data to my development server?

Hi @JanT. Can you explain in a bit more detail why “the production server has no repository connected for safety reasons”?


Sorry for the late reply.

I’m building a chatbot for a larger company with a relatively big IT infrastructure. The current plan is to have one Live Server that only holds the currently trained model and nothing else. This live server also isn’t supposed to be connected to the central repository server, so that anybody who gains access to the chatbot server doesn’t also gain access to the repository. Instead, there is a second development server behind a DMZ that is connected to the repo. This is just to be absolutely safe in every possible case and was a requirement from our IT Security Officer. It looks something like this:

Custom Frontend -> Live Server (only contains model) | DMZ | Prod Server (with training data, etc.) -> Git Server

All the data that comes from actually using the bot is obviously going to land on the Live server. But since the training data is needed on the prod server, I somehow have to move the data over or do it by hand, which I’m hoping I can avoid.

Maybe I’m going about this all wrong. I couldn’t find much information on how to handle such a scenario.

@JanT That makes sense. I think the easiest way would be to use a Rasa Open Source deployment to serve your model in production and then automatically forward all new incoming messages directly from Rasa Open Source to Rasa X.
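
To give a rough idea of what "forwarding incoming messages" involves on the data side, here is a minimal Python sketch. It is not Rasa's built-in forwarding mechanism; it only shows how a list of conversation events could be reduced to the user messages worth sending on for training. The event shape (`event`, `text`, `parse_data.intent.name`) is an assumption modelled on Rasa tracker events, and the function names `user_messages` and `to_json_lines` as well as the sample data are made up for illustration.

```python
import json
from typing import Any, Dict, Iterable, List


def user_messages(events: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep only user-message events, reduced to text plus predicted intent.

    Assumes each event is a dict with an "event" type field, which is an
    assumption about the event format, not a guaranteed schema.
    """
    messages = []
    for event in events:
        if event.get("event") != "user":
            continue  # skip bot utterances, actions, slot events, etc.
        intent = (event.get("parse_data") or {}).get("intent") or {}
        messages.append({"text": event.get("text"), "intent": intent.get("name")})
    return messages


def to_json_lines(messages: Iterable[Dict[str, Any]]) -> str:
    """Serialize one message per line so the result is easy to ship elsewhere."""
    return "\n".join(json.dumps(m, sort_keys=True) for m in messages)


if __name__ == "__main__":
    # Illustrative events only, not real conversation data.
    sample_events = [
        {"event": "action", "name": "action_listen"},
        {"event": "user", "text": "hello", "parse_data": {"intent": {"name": "greet"}}},
        {"event": "bot", "text": "Hi!"},
    ]
    print(to_json_lines(user_messages(sample_events)))
```

In a real deployment the forwarding itself would be handled by the Rasa Open Source to Rasa X connection rather than by a script like this; the sketch is only meant to show which part of the conversation data matters for retraining.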