The issue of RASA removing multi-model serving has been addressed in a few threads in the past (in my opinion this was a terrible decision). However, I am facing the dilemma of having to produce 120 distinct models (NLU only) for patient triage, and it doesn’t matter what clever architecture I apply: the cost of spinning up a new instance per model will wipe out the cost savings (dev and infrastructure) of using an open source platform like RASA.
I took it upon myself to dig into RASA’s code, and eventually I traced NLU classification down to the class RasaNLUInterpreter (the hierarchy goes deeper into more interpreter classes, but I have neither the time nor the energy to follow it further).
So, whether I can serve multiple models from one process hinges on whether RasaNLUInterpreter is thread safe. My question is: can I run multiple instances of this class in the same process, each loaded with a different trained model, and have each one correctly apply its own model? Or do the classes share global state that will stop them from being used in this manner?
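To make the setup concrete, here is a minimal sketch of what I have in mind. ModelRegistry, FakeInterpreter and parse_concurrently are hypothetical names of my own, not part of RASA; the fake class stands in for a loaded model so the sketch runs on its own. It only shows the shape of the idea (one interpreter object per model, all in one process, called from several threads) and says nothing about whether the real interpreter classes share state.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List, Tuple


class ModelRegistry:
    """Keeps one interpreter object per model name inside a single process."""

    def __init__(self, loader: Callable[[str], Any]) -> None:
        self._loader = loader              # builds an interpreter for a model name
        self._lock = threading.Lock()      # guards the cache, not the interpreters
        self._interpreters: Dict[str, Any] = {}

    def get(self, model_name: str) -> Any:
        # Load each model at most once, even if several threads ask at the same time.
        with self._lock:
            if model_name not in self._interpreters:
                self._interpreters[model_name] = self._loader(model_name)
            return self._interpreters[model_name]

    def parse(self, model_name: str, text: str) -> Dict[str, Any]:
        # The parse call itself runs outside the lock, so it is only safe
        # if the interpreter objects do not share mutable state.
        return self.get(model_name).parse(text)


class FakeInterpreter:
    """Hypothetical stand-in for a loaded NLU model (assumption, not RASA code)."""

    def __init__(self, model_name: str) -> None:
        self.model_name = model_name

    def parse(self, text: str) -> Dict[str, Any]:
        # Tag the result with the model that produced it so mix-ups are visible.
        return {"text": text, "model": self.model_name}


def parse_concurrently(
    registry: ModelRegistry,
    requests: List[Tuple[str, str]],
    workers: int = 8,
) -> List[Dict[str, Any]]:
    """Parse (model_name, text) pairs on a thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda r: registry.parse(r[0], r[1]), requests))


if __name__ == "__main__":
    registry = ModelRegistry(FakeInterpreter)
    requests = [(f"clinic_{i % 3}", f"message {i}") for i in range(12)]
    for (name, _), result in zip(requests, parse_concurrently(registry, requests)):
        # Every result should come from the model it was addressed to.
        assert result["model"] == name
```

If the loader were swapped for one that builds a real interpreter per trained model, the same check (does every result come from the model it was sent to?) would be a first experiment for the question above, though passing it would not prove thread safety.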
This is really important for us as we are building prototypes and evaluating technology at the moment.