Run_in_thread for model-loading

Hi,

We have an issue: our Rasa server stops responding while it is loading a new model.

We found that adding “@run_in_thread” to the /model API in server.py (as given below) makes the endpoint non-blocking, which serves our purpose.

How can we override this API gracefully, without touching the Rasa code? Please give us some hints if possible.

@app.put("/model")
**@run_in_thread**
@requires_auth(app, auth_token)
async def load_model(request: Request) -> HTTPResponse:
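
For context, a decorator of this kind generally works by handing the blocking work to a worker thread through the event loop's executor, so the loop stays free to answer other requests. Below is a minimal standard-library sketch of that idea. It is not Rasa's actual implementation: `run_in_thread_sketch`, `slow_load`, and `demo` are made-up names, and for simplicity it wraps a plain (non-async) function rather than an async handler.

```python
import asyncio
import functools
import time


def run_in_thread_sketch(func):
    """Hypothetical decorator: run a blocking function on a worker thread."""

    @functools.wraps(func)
    async def wrapper(*args, **kwargs):
        loop = asyncio.get_running_loop()
        # Passing None selects the loop's default ThreadPoolExecutor.
        return await loop.run_in_executor(
            None, functools.partial(func, *args, **kwargs)
        )

    return wrapper


@run_in_thread_sketch
def slow_load(path):
    # Stand-in for a blocking model load (assumption: loading blocks the caller).
    time.sleep(0.2)
    return f"loaded {path}"


async def demo():
    ticks = 0

    async def heartbeat():
        # Counts how often the event loop gets to run while the load is in progress.
        nonlocal ticks
        while True:
            ticks += 1
            await asyncio.sleep(0.01)

    beat = asyncio.create_task(heartbeat())
    result = await slow_load("model.tar.gz")
    beat.cancel()
    return result, ticks
```

Running `asyncio.run(demo())` returns the load result along with a tick count greater than one, which indicates the loop kept running during the load. If `time.sleep` were called directly inside a coroutine instead, the heartbeat would not advance until the load finished.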

Best Regards, tejaswini

I don’t know that you can resolve the issue with that approach. I normally see users bringing up multiple instances and using a load balancer as a proxy to switch from an instance with the old model to another instance with the new model.
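
To make the switch-over idea concrete, here is a rough sketch of the logic only. In practice the load balancer or proxy itself does this; `ActiveInstanceRouter`, the `is_ready` probe, and the URLs are all hypothetical names chosen for illustration, not part of Rasa or any particular load balancer.

```python
from typing import Callable


class ActiveInstanceRouter:
    """Hypothetical stand-in for the switch a load balancer performs."""

    def __init__(self, active_url: str, is_ready: Callable[[str], bool]):
        self.active_url = active_url
        # Readiness probe is injected; a real setup would query the instance.
        self._is_ready = is_ready

    def promote(self, candidate_url: str) -> bool:
        """Point traffic at the candidate only once it reports ready."""
        if not self._is_ready(candidate_url):
            return False  # keep serving from the instance with the old model
        self.active_url = candidate_url
        return True


# Fake readiness probe: only the "green" instance has finished loading.
ready = {"http://green:5005"}
router = ActiveInstanceRouter("http://blue:5005", lambda url: url in ready)
```

Calling `router.promote("http://green:5005")` moves traffic to the new instance, while promoting an instance that is not ready leaves the old one active, so requests are never sent to a server that is still loading its model.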