I have installed Rasa on Kubernetes (an EKS cluster) using the Helm charts. The documentation says that Kubernetes supports autoscaling, so when the chatbot comes under load, more instances should be created automatically. Deploying with the Helm chart creates several deployments, and the one responsible for running more instances of the chatbot is the "rasa-production" deployment (this I learned from another post on this forum).
- Now I want to understand all of the deployments that the Helm chart creates. Is there any resource or video playlist available for this, apart from the documentation on installing Rasa with Helm charts?
- How can I verify whether scaling is actually happening for the chatbot on Kubernetes? There must be some way to check (see the first sketch below).
- If I scale the rasa-production deployment, it creates replicas, which is fine, but how do I know whether the newly created pod is actually being used to serve requests? I would expect the load to be balanced across the pods (see the second sketch below).
- How can I force it to use another pod? In other words, how can I increase the load on one pod so that requests spill over to another pod running the chatbot (see the third sketch below)?
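
Regarding the second question, this is roughly what I was planning to run to check the replica counts. I am assuming the chart was installed into a namespace called `rasa` and that autoscaling, if enabled, goes through a HorizontalPodAutoscaler; please correct me if this is the wrong way to check:

```
# List what the chart created (the "rasa" namespace is just my assumption)
kubectl get deployments -n rasa
kubectl get pods -n rasa -o wide

# If a HorizontalPodAutoscaler is configured, this shows current vs. desired replicas
kubectl get hpa -n rasa

# Watch the pod count change while the bot is under load
kubectl get pods -n rasa --watch
```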
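
Regarding the third question, my idea was to look at which pod IPs sit behind the service and then tail the logs of each replica to see whether both of them actually receive messages. The service name `rasa-production` is my assumption, based on the deployment name:

```
# Pod IPs currently registered behind the service (service name is my assumption)
kubectl get endpoints rasa-production -n rasa

# Tail each replica's logs separately and watch which one handles incoming requests
kubectl logs -f rasa-production-<pod-id-1> -n rasa
kubectl logs -f rasa-production-<pod-id-2> -n rasa
```

My understanding is that requests are only spread across replicas when they go through the Kubernetes Service (or the ingress in front of it), not when I talk to a single pod directly. Is that right?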
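
And for the last question, this is the kind of load test I had in mind: firing many concurrent messages at Rasa's REST channel so that CPU usage rises and, if an HPA is configured, new replicas get created. The `/webhooks/rest/webhook` path is the standard REST channel endpoint; the host, port 5005, and the `hey` load-testing tool are my assumptions:

```
# Replace <rasa-host> with however the chatbot is exposed (ingress / LoadBalancer).
# Port 5005 is Rasa's default HTTP port; "hey" is just one possible load generator.
hey -n 5000 -c 100 -m POST \
  -H "Content-Type: application/json" \
  -d '{"sender": "load-test-user", "message": "hello"}' \
  http://<rasa-host>:5005/webhooks/rest/webhook
```

Would this be enough to trigger scaling, or is there a recommended way to load-test the chart?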