hey @singh-l, this is a common and recommended approach. There are a few components to this. I recommend reviewing our architecture diagram for Rasa Open Source as well.
Typically, I’ve seen this done by pinning a user to a particular machine/region once they start a conversation there. This lets you run a full, independent copy of the stack in each region, so one can fail without affecting the other (vs. sharing a tracker store between the two). A con of this approach is that your conversations are stored in two separate places, but it’s not a huge deal because each conversation is contained entirely in a single DB.
So, you’ll have one tracker store per region that holds the conversation context between the assistant and the user. Your load balancer is then responsible for the stickiness, i.e. you need to route on something stable like the user’s IP and make sure they are always sent to the same region.
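To make the stickiness idea concrete, here is a rough sketch in Python of mapping a stable user key to a region. In practice your load balancer would do this for you with whatever sticky/hash-based routing it offers; the region names and the `pick_region` helper are made up for illustration and are not part of Rasa.

```python
import hashlib

# Hypothetical region names; replace with your own deployment identifiers.
REGIONS = ["region-a", "region-b"]


def pick_region(user_key: str, regions=REGIONS) -> str:
    """Map a stable user key (e.g. client IP or sender ID) to one region.

    The same key always yields the same region as long as the region
    list itself does not change.
    """
    if not regions:
        raise ValueError("at least one region is required")
    # Use a cryptographic hash so the result is stable across processes
    # (Python's built-in hash() is randomized per process for strings).
    digest = hashlib.sha256(user_key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(regions)
    return regions[index]


if __name__ == "__main__":
    # The same IP is routed to the same region on every call.
    print(pick_region("203.0.113.7"))
    print(pick_region("203.0.113.7"))
```

One caveat with this simple modulo scheme: adding or removing a region changes the mapping for many users, so their next message could land in a region that has no history for them.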
If one region goes down, its users get routed to the other region, and their conversations have to start over there because that region’s tracker store has no history for them. You could get around this by sharing one tracker store between both regions, but that makes the tracker DB a single point of failure. It’s a trade-off, so it’s up to you.
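Here is a small Python sketch of that failover behaviour, assuming per-region tracker stores. The `route` function, the region names, and the `healthy` set are all hypothetical; a real setup would rely on your load balancer’s health checks rather than application code like this.

```python
import hashlib


def route(user_key: str, regions: list, healthy: set) -> tuple:
    """Return (region, failed_over) for a user.

    The user's preferred region is derived from a hash of their key.
    If that region is unhealthy, the next healthy region in the list
    is used and failed_over is True, meaning the conversation starts
    fresh because the new region's tracker store has no history.
    """
    if not regions:
        raise ValueError("at least one region is required")
    digest = hashlib.sha256(user_key.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(regions)
    for offset in range(len(regions)):
        candidate = regions[(start + offset) % len(regions)]
        if candidate in healthy:
            return candidate, offset > 0
    raise RuntimeError("no healthy region available")


if __name__ == "__main__":
    regions = ["region-a", "region-b"]  # hypothetical names
    # Both regions up: the user stays in their preferred region.
    print(route("203.0.113.7", regions, {"region-a", "region-b"}))
    # Only one region up: everyone lands there.
    print(route("203.0.113.7", regions, {"region-b"}))
```

The `failed_over` flag is the point where you would decide how to handle the lost context, for example by greeting the user again instead of continuing mid-flow.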