Hi, we have a Rasa chatbot deployed in a Kubernetes container. It loads its model from a model server.
The issue is that when Rasa detects a new model, it calls _load_and_set_updated_model (in agent.py), and while that runs it stops responding to API calls. The Kubernetes liveness probe gets no response from http://localhost:8080/status, concludes the container is down, and kills it.
We would like to write a custom probe script that sends the ‘good health’ response while Rasa is busy loading the model. However, we cannot override agent.load_model() to set some indicator that Rasa is loading the model.
Can you suggest how we can tell when Rasa is loading a model?
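To make the idea concrete, here is a rough sketch of the kind of probe script we have in mind. It is only a fallback, not a real answer to the question: since we cannot see when Rasa is loading, it simply treats a silent /status endpoint as healthy for a grace period after the last successful check. The state file path, the grace period, and the function names are our own placeholders; only the /status URL comes from our setup.

```python
import http.client
import time
import urllib.request
from pathlib import Path

STATUS_URL = "http://localhost:8080/status"  # endpoint our liveness probe calls
STATE_FILE = Path("/tmp/rasa_last_ok")       # hypothetical: stores time of last good check
GRACE_SECONDS = 300                          # hypothetical: longest model load we tolerate


def status_ok(url=STATUS_URL, timeout=2.0):
    """Return True if the status endpoint answers with HTTP 200 in time."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Timeouts, refused connections and HTTP errors all count as "no answer".
        return False


def is_healthy(check=status_ok, state_file=STATE_FILE, grace=GRACE_SECONDS, now=None):
    """Healthy if the endpoint answers, or if it answered within the grace period."""
    now = time.time() if now is None else now
    if check():
        # Remember when the server last answered.
        state_file.write_text(str(now))
        return True
    try:
        last_ok = float(state_file.read_text())
    except (OSError, ValueError):
        # No successful check recorded yet, so we cannot assume a model load.
        return False
    # Assumption: silence shortly after a good check means a model is loading.
    return now - last_ok <= grace
```

The script would be run as an exec probe and exit with sys.exit(0 if is_healthy() else 1). The obvious weakness is that it cannot tell a real hang from a model load during the grace period, which is why we would prefer a proper signal from Rasa.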
Thank you in advance,
teja