I have installed RASA using the helm chart in Kubernetes, running on minikube. I have also connected RASA-X to a GitHub repository to train my model. Everything is working fine and I am able to chat with the bot at the URL (http://(minikube IP):8000). Now I want to scale up my deployment, but I don’t know which deployment (rasa-worker, rasa-production, or rasa-x) I should scale up. Also, after scaling up, how can I verify that the Nginx load balancer is working? Is there a curl command that shows which pod is serving my rasa-x chatbot, and that returns a different pod name on each call to show that load balancing is happening correctly? I’m new to RASA, so any help is appreciated.
rasa-production is the deployment you’d want to scale up, since it’s the one handling incoming bot traffic. To verify nginx is working, you can use the /health endpoint to retrieve the status of the active services (see the Rasa X Documentation).
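In case it helps, here is a minimal Python sketch of those two steps: scaling the deployment with `kubectl scale`, then querying the health endpoint. Several things here are assumptions and not taken from this thread: the deployment name and namespace you pass in (the helm chart usually prefixes the release name, so check `kubectl get deployments`), and the `/api/health` path, which you should confirm against the Rasa X documentation for your version. Note that a health check only tells you the services are up; it does not show which pod answered.

```python
import json
import subprocess
import urllib.request


def scale_command(deployment, replicas, namespace):
    """Build the kubectl command that sets a deployment's replica count."""
    return ["kubectl", "scale", "deployment", deployment,
            f"--replicas={replicas}", "-n", namespace]


def health_url(base_url):
    """Return the health URL. The '/api/health' path is an assumption."""
    return base_url.rstrip("/") + "/api/health"


def scale_and_check(deployment, replicas, namespace, base_url):
    """Scale the deployment, then return the parsed health response."""
    subprocess.run(scale_command(deployment, replicas, namespace), check=True)
    with urllib.request.urlopen(health_url(base_url), timeout=10) as resp:
        return json.load(resp)
```

You would call it with your own values, for example `scale_and_check("<your rasa-production deployment>", 2, "<your namespace>", "http://(minikube IP):8000")`. New pods may take a while to become ready, so the health response right after scaling might not reflect both replicas yet.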
Hi @b-quachtran, can we do something similar to this, where each request hits a different pod, which shows that load balancing is happening? Is there a similar way to check which pod responds when we type a message in the rasa-x UI chatbot after scaling rasa-production up to 2 or more replicas?
I found it. The “rasa-production” pod is the one responsible for handling the traffic and responding to messages, so when you talk to the bot, you can watch the logs of this production pod to see in real time how each message is recognised and answered. Once you scale rasa-production up to 2 replicas, open the logs of both pods in separate terminals and watch them while sending messages to the bot. You’ll see in the logs that some messages are handled by one pod and some by the other.
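If watching two terminals gets tedious, the same check can be roughly automated. The sketch below fetches recent logs from every rasa-production pod and counts how many lines contain a marker string, so you can see how the messages were split between pods. The label selector, the namespace, and the marker text are all assumptions: look up the real labels with `kubectl get pods --show-labels`, and pick a marker that actually appears in your logs when a message is processed (with debug logging, a line containing "Received user message" is one candidate).

```python
import subprocess
from collections import Counter


def list_pods(selector, namespace):
    """Return pod names (as 'pod/<name>') matching a label selector."""
    out = subprocess.run(
        ["kubectl", "get", "pods", "-n", namespace, "-l", selector, "-o", "name"],
        capture_output=True, text=True, check=True,
    ).stdout
    return out.split()


def fetch_logs(pod, namespace, since="5m"):
    """Return the log text a pod produced during the last `since` period."""
    return subprocess.run(
        ["kubectl", "logs", pod, "-n", namespace, f"--since={since}"],
        capture_output=True, text=True, check=True,
    ).stdout


def count_hits(logs_by_pod, marker):
    """Count, per pod, the log lines that contain the marker string."""
    return Counter({
        pod: sum(marker in line for line in text.splitlines())
        for pod, text in logs_by_pod.items()
    })


def tally(selector, namespace, marker):
    """Fetch logs for all matching pods and count marker lines per pod."""
    logs = {pod: fetch_logs(pod, namespace) for pod in list_pods(selector, namespace)}
    return count_hits(logs, marker)
```

Send a handful of messages to the bot, then call `tally(...)` with your own selector, namespace, and marker. If both pods show non-zero counts, traffic is reaching both replicas; an exact even split is not guaranteed.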