Zero downtime model replace?

mbukovy · February 13, 2023, 3:40pm

Hey guys, simple question - is it possible to do a zero-downtime model replacement through the API?

When I start a server with rasa run --enable-api and then load a new model via the API (PUT /model), Rasa doesn’t serve traffic while the model is being replaced: the endpoint /webhooks/callback/webhook doesn’t respond until the replacement has finished. Is there any chance of some sort of “atomic swap” of the models?
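For reference, the kind of request described above could be built roughly like this. It is only a sketch: the field names model_file and remote_storage follow my reading of the Rasa HTTP API docs and should be checked against your Rasa version, and the host, port and model name are hypothetical placeholders. The request is only constructed here, not sent.

```python
import json
import urllib.request


def build_model_replace_request(base_url, model_file, remote_storage=None):
    """Build (but do not send) a PUT /model request for a Rasa server.

    Assumption: the JSON fields "model_file" and "remote_storage" match
    the Rasa HTTP API for your version; verify before relying on this.
    """
    payload = {"model_file": model_file}
    if remote_storage is not None:
        payload["remote_storage"] = remote_storage
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/model",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical server address and model name, for illustration only.
req = build_model_replace_request(
    "http://localhost:5005", "new-model.tar.gz", remote_storage="aws"
)
# Sending it would be: urllib.request.urlopen(req)
```

While this call is in flight, the server is busy loading the model, which is the window where the webhook stops answering.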

nonola · February 13, 2023, 5:06pm

Is the model in the same storage as rasa?

mbukovy · February 13, 2023, 10:13pm

No, it’s being loaded from AWS S3.

rasa_learner · February 14, 2023, 4:03pm

Hi @mbukovy, I suggest these steps:

1. Keep the current server running
2. Start another server with the new model
3. Switch the traffic to the new server endpoint
4. If everything is good, kill the old server endpoint
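The steps above can be sketched as a small orchestration helper. This is a minimal illustration, not a Rasa feature: probe_new, switch_traffic and stop_old are hypothetical callables you would supply yourself (for example a health check against the new instance, a load balancer update, and an ECS task stop).

```python
import time
from typing import Callable


def wait_until_ready(probe: Callable[[], bool], attempts: int = 30,
                     delay_s: float = 2.0, sleep=time.sleep) -> bool:
    """Poll probe() until it returns True or the attempts run out."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay_s)  # wait before the next poll
    return False


def blue_green_swap(probe_new: Callable[[], bool],
                    switch_traffic: Callable[[], None],
                    stop_old: Callable[[], None],
                    attempts: int = 30, delay_s: float = 2.0,
                    sleep=time.sleep) -> bool:
    """Switch traffic only after the new server reports ready.

    The old server keeps serving until the switch, so a slow or failed
    start of the new server causes no downtime.
    """
    if not wait_until_ready(probe_new, attempts, delay_s, sleep):
        return False  # new server never became ready; leave the old one alone
    switch_traffic()
    stop_old()
    return True
```

The old server is only stopped after traffic has moved, so a failed start of the new instance leaves the existing setup untouched.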

mbukovy · February 15, 2023, 10:28pm

That’s our current setup. I’m trying to find a way to speed up the deployment. Starting a new Rasa instance in ECS is quite slow.

lumpidu · March 3, 2023, 6:48pm

There is no way around it. Rasa is slow to start up, even on beefier servers. The model always downloads additional model dependencies. You could reduce the startup time if you manage to cache the downloaded dependency models between the servers, but I haven’t tried that myself.

If you use multiple Rasa instances per bot, the problems multiply, and it would probably make sense to use K8s for that.
