Rasa X scaling issue on Kubernetes

I have installed Rasa on Kubernetes using the Helm charts, on an EKS cluster. Kubernetes supports autoscaling, so when there is load on the chatbot, scaling should happen automatically. The Helm chart creates several deployments, and the one responsible for running more instances of the chatbot is the “rasa-production” deployment (this I learned from another post on this forum).

  1. I want to understand all of the deployments the Helm chart creates. Is there any resource or video playlist for that, apart from the documentation on installing Rasa with Helm?
  2. How can I verify whether scaling is actually happening for the chatbot on Kubernetes? There must be some way.
  3. If I scale the rasa-production deployment it creates replicas, which is fine, but how do I know whether the newly created pods are actually serving requests? They should be used to balance the load.
  4. How can I force it to use another pod? How can I increase the load on one pod so that traffic spills over to another?
  1. There is a Udemy course that explains this.
  2. You can check the number of pods to see whether scaling happened. You can also use a Horizontal Pod Autoscaler (HPA) to scale pods automatically based on the load on them.
  3. See this thread: How to verify that load balancer is dividing work correctly
  4. You can use Selenium to send automated messages, e.g. around 100 messages in 60 seconds, to increase the load on the rasa-production pods. Any other tool that automates sending messages works just as well.
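
Points 2 and 4 can be sketched from the command line without Selenium. This is a minimal, untested sketch: the `rasa-production` deployment name comes from the post, `/webhooks/rest/webhook` is Rasa's REST channel endpoint, but the label selector, HPA thresholds, and `RASA_URL` are placeholders you would need to adapt to your own chart and cluster:

```shell
# Create an HPA for the rasa-production deployment
# (thresholds are placeholders -- tune them for your cluster)
kubectl autoscale deployment rasa-production --cpu-percent=70 --min=1 --max=5

# Watch replica count and HPA status while load is applied
kubectl get hpa rasa-production --watch
kubectl get pods -l app=rasa -o wide   # label selector may differ per chart version

# See which pods are actually handling requests by tailing per-pod logs
kubectl logs deploy/rasa-production --all-containers --prefix --tail=20

# Simple load generator: ~100 messages in ~60 seconds via Rasa's REST channel
# (replace RASA_URL with the external address of your Rasa service)
RASA_URL="http://<your-rasa-host>/webhooks/rest/webhook"
for i in $(seq 1 100); do
  curl -s -X POST "$RASA_URL" \
    -H "Content-Type: application/json" \
    -d "{\"sender\": \"loadtest-$i\", \"message\": \"hello\"}" > /dev/null &
  sleep 0.6
done
wait
```

Note that a CPU-based HPA only works if metrics-server is running in the cluster and the deployment's containers declare CPU resource requests; without those, `kubectl get hpa` will show `<unknown>` for the current utilization.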