Hi Adrian - I’m currently experimenting with RASA and planning to use it behind the Alexa NLP. (Mainly planning to use RASA Core to implement our “chat” functionality via Stories as it’ll allow us to evolve them without having to re-release the Alexa Skill. May also make use of RASA NLU, or perhaps spaCy or NLTK, to do some entity recognition as it offers more than Alexa currently exposes in this area.)
Anyway, as a result I’ve been experimenting with both the RASA Agent and the RASA HTTP Server while I decide on my overall Solution Architecture. In both cases I’ve been able to hold multiple concurrent, independent conversations using a single instance of the Agent or the Server, by setting the Sender-Id field so that each input is assigned to a specific conversation. It seems to work OK, but I haven’t tested it exhaustively yet.
I still have to investigate the scalability of each approach: what limits the number of parallel, in-flight conversations I can hold (probably memory?), and how to persist and restore conversations across User visits so that idle ones don’t have to stay in memory.
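To make the persist/restore idea concrete, here is a rough sketch of what I have in mind. It does not use any RASA API: `ConversationStore`, `max_in_memory` and the JSON-file layout are all hypothetical names of my own, and a conversation is reduced to a plain list of messages keyed by sender id. The point is only the pattern: keep the most recently active conversations in memory, write the least recently used one to disk when the cap is exceeded, and reload it when that sender comes back.

```python
import json
import tempfile
from collections import OrderedDict
from pathlib import Path


class ConversationStore:
    """Hypothetical per-sender store; illustrative only, not part of RASA."""

    def __init__(self, directory, max_in_memory=2):
        # Assumes max_in_memory >= 1.
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)
        self.max_in_memory = max_in_memory
        # sender_id -> list of messages, ordered least to most recently used
        self._live = OrderedDict()

    def _path(self, sender_id):
        # Assumes sender ids are safe to use as file names.
        return self.directory / f"{sender_id}.json"

    def get(self, sender_id):
        """Return the conversation for sender_id, restoring it if needed."""
        if sender_id in self._live:
            self._live.move_to_end(sender_id)
        else:
            path = self._path(sender_id)
            events = json.loads(path.read_text()) if path.exists() else []
            self._live[sender_id] = events
            self._evict()
        return self._live[sender_id]

    def _evict(self):
        """Persist least recently used conversations beyond the cap."""
        while len(self._live) > self.max_in_memory:
            old_id, events = self._live.popitem(last=False)
            self._path(old_id).write_text(json.dumps(events))

    def append(self, sender_id, message):
        self.get(sender_id).append(message)

    def in_memory(self):
        return list(self._live)


store = ConversationStore(tempfile.mkdtemp(), max_in_memory=2)
store.append("alice", "hello")
store.append("bob", "hi")
store.append("carol", "hey")           # cap exceeded: alice goes to disk
store.append("alice", "still there?")  # alice restored, bob goes to disk
print(store.in_memory())               # ['carol', 'alice']
print(store.get("alice"))              # ['hello', 'still there?']
```

With something like this, memory use is bounded by the cap rather than by the total number of Users ever seen, at the cost of a disk read when a User returns. In a real deployment the files would presumably be swapped for a database or cache, and the list of messages for whatever conversation state the Agent actually keeps.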
Finally, I need to decide between using the HTTP Server, which I’ll probably call via AWS API Gateway, and writing my own Agent capable of handling multiple clients, which, again, I’ll probably put behind API-GW. Overall, I’m leaning toward the latter as I like the Agent functionality, but it could go either way.