I am trying to figure out the architecture for a healthcare personal assistant bot (NOT an FAQ/listings bot) with Rasa as the NLU backend. The actual chat channel is not important – it will be either Telegram/FB Messenger or a simple custom mobile app serving as the UI/front end.
Where I am getting lost is how to adapt the model/intents/slots to different people, i.e., how to build a bot that genuinely learns: one that can pick up different domain/actions/stories files updated in real time based on the conversations the bot is having with its users. In other words, real-time learning and adaptation.
Is this even possible with a single instance of Rasa, or would I have to add some sort of routing middleware layer and keep an individual model for each user?
Interesting idea. To do this, it sounds like you would constantly be training new models, which could be very expensive in terms of CPU/GPU requirements. Depending on the pipeline and the size of the bot, training could take 1-3 minutes (for a VERY simple bot and pipeline) or up to an hour.
Regarding separate models for each user, this would be more manageable with a front-end router (nginx?) that routes users to separate Rasa instances. People already do this type of routing for different languages, and you could do the same per user.
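Whether you do this in nginx or in a thin middleware service, the routing logic reduces to a user-to-instance lookup. A minimal Python sketch of that lookup, assuming one Rasa container per user exposed on the standard REST channel (the instance names and the shared fallback are assumptions, not anything Rasa provides):

```python
# Hypothetical routing table: each user id maps to the base URL of
# that user's dedicated Rasa instance (hostnames are assumptions).
RASA_INSTANCES = {
    "user-42": "http://rasa-user-42:5005",
    "user-43": "http://rasa-user-43:5005",
}
DEFAULT_INSTANCE = "http://rasa-shared:5005"  # fallback shared model

def route(user_id: str) -> str:
    """Return the Rasa REST webhook URL for this user's instance."""
    base = RASA_INSTANCES.get(user_id, DEFAULT_INSTANCE)
    # /webhooks/rest/webhook is Rasa's built-in REST channel endpoint
    return f"{base}/webhooks/rest/webhook"
```

The front end (or the channel connector) would then POST each user's messages to `route(user_id)` instead of a single fixed URL.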
A single Rasa instance can be associated with only one model, but you can use the replaceModel API endpoint to point the instance at an updated version of the model.
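Concretely, Rasa's HTTP API exposes `PUT /model` for this: you send the path to a new model archive and the running server hot-swaps it without a restart. A small sketch that only builds the request (sending it is left as the commented line, so no server is needed to try the helper):

```python
def model_replace_request(rasa_url: str, model_path: str):
    """Build (url, payload) for Rasa's PUT /model replacement endpoint.

    The payload's "model_file" key is the path, on the server, to the
    newly trained model archive (e.g. a .tar.gz produced by `rasa train`).
    """
    return f"{rasa_url}/model", {"model_file": model_path}

url, payload = model_replace_request(
    "http://localhost:5005", "/app/models/retrained.tar.gz"
)
# import requests
# requests.put(url, json=payload)  # point the running instance at the new model
```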
Thanks Greg! Yeah, that’s roughly the approach I was considering if multiple models per Rasa instance weren’t possible – a middleware layer for routing, instance management, spinning Docker containers up and down, etc. It gets quite expensive, to be honest. I wonder how the folks at Replika do it, since they claim the model becomes a ‘replica’ of a user, which implies it is learning on the go, not just filling slots.
I think your hint at replaceModel makes the most sense: model training can be automated and batched at predetermined times rather than every time something new is learned, and common patterns can be inferred across users.
hi @stephens
I have built a chatbot for my firm.
The UI is our company messenger app.
I have one instance of the Rasa model and one Rasa action server.
My users number about 100 now.
I have one heavyweight API which takes around 2-3 minutes to fetch a result.
When someone calls this API, the custom action runs, and in the meantime, if another user needs to run any custom action, that request gets queued. The action server appears to process requests sequentially, which is undesirable because it is unavailable to other users while the slow call runs.
Desired result - I want parallel processing for my users.
Can you please help me achieve this?
Thanks
You could create an async action so that the bot doesn’t have to block while waiting for the response, and then notify the conversation via an external event once the response arrives. In your case, when the slow API call finishes, you’d POST to the trigger_intent endpoint to fire the external intent. Something like this:
import requests
# get the conversation id from the tracker inside the custom action
conversation_id = tracker.sender_id
response = requests.post(
    f'http://localhost:5005/conversations/{conversation_id}/trigger_intent',
    json={"name": "EXTERNAL_dry_plant", "entities": {"plant": "Orchid"}},
)
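Stripped of Rasa specifics, the pattern is: the action schedules the slow call as a background task and returns immediately, and the task notifies the conversation when it finishes. A self-contained asyncio sketch of that pattern, where the slow API is simulated with a short sleep and `notify` stands in for the trigger_intent POST (names are illustrative, not Rasa SDK API):

```python
import asyncio

async def slow_api_call(user: str) -> str:
    # stands in for the 2-3 minute API (simulated with a short sleep)
    await asyncio.sleep(0.1)
    return f"result for {user}"

async def notify(user: str, result: str, outbox: dict) -> None:
    # in the real bot, this is where you'd POST to
    # /conversations/{conversation_id}/trigger_intent
    outbox[user] = result

def fire_and_forget(user: str, outbox: dict) -> asyncio.Task:
    # schedule the slow work; the caller (the custom action) returns at once
    async def job():
        await notify(user, await slow_api_call(user), outbox)
    return asyncio.create_task(job())

async def main() -> dict:
    outbox: dict = {}
    # two users hit the action server at the same time; their slow
    # calls run concurrently instead of queueing behind one another
    tasks = [fire_and_forget(u, outbox) for u in ("alice", "bob")]
    await asyncio.gather(*tasks)
    return outbox
```

Running `asyncio.run(main())` completes both users' calls in roughly one sleep interval rather than two, which is exactly the parallelism you're after.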