Rasa NLU: batching multiple queries in a single HTTP request

Hi Rasa team,

This is my first message on the forum, so I first wanted to thank you for the awesome products you are building.

My team is working on bringing various types of ML-based products to production. We started using Rasa NLU in the context of chatbots and it worked out really well. We are now considering using Rasa NLU as a server for non-chatbot applications, for example serving NER predictions from our own custom components. The value we see in Rasa NLU is that it gives us a solid framework to build on: component specifications, pipelines, CLI, HTTP serving… We also think that in the future we could reuse some of the custom components developed for the NER server in our chatbot NLU server, and vice versa.

In the case of the non-chatbot application, our current in-house server can handle the parsing of multiple queries (multiple sentences) in a single HTTP request, e.g.

"queries": ["This is my first sentence with entity 1.", "This is my second sentence with entity 2"]
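To make the request shape concrete, here is a minimal sketch of how a client might build such a body and how a server might validate it. The `queries` key is the one our in-house server uses; the function names `build_batch_body` and `read_batch_body` are made up for illustration and are not part of Rasa NLU.

```python
import json


def build_batch_body(queries):
    """Serialize a list of sentences into one JSON request body."""
    return json.dumps({"queries": list(queries)})


def read_batch_body(body):
    """Decode a batch body and check that it holds a list of strings."""
    data = json.loads(body)
    queries = data.get("queries") if isinstance(data, dict) else None
    if not isinstance(queries, list) or not all(isinstance(q, str) for q in queries):
        raise ValueError('expected {"queries": [<str>, ...]}')
    return queries


# Round trip: what the client sends is what the server reads back.
body = build_batch_body([
    "This is my first sentence with entity 1.",
    "This is my second sentence with entity 2",
])
sentences = read_batch_body(body)
```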

Batching queries this way has several advantages when query volumes are high and parsing is requested for a large number of queries at the same time (which I understand is not a common case for chatbot applications, where single queries arrive at different times). For example, it saves us the overhead of repeated HTTP headers and request round trips.

My understanding is that the current Rasa NLU API does not allow multiple queries in a single HTTP request. Is there a way to achieve this nonetheless? If not, do you think such a capability would fit with the current Rasa NLU HTTP server implementation (which, if I understand correctly, uses a multi-threaded Twisted server)?
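In case it helps the discussion, below is a rough sketch of the behaviour I have in mind, written against a generic single-query parser rather than any real Rasa NLU API. `parse_batch`, `parse_one`, and `stub_parse` are hypothetical names; in practice `parse_one` would be whatever parses a single text with the loaded model.

```python
def parse_batch(queries, parse_one):
    """Parse each query with a single-query parser, preserving input order."""
    return [parse_one(text) for text in queries]


def stub_parse(text):
    # Stand-in for a real model call (an assumption for this sketch);
    # it only echoes the text back with an empty entity list.
    return {"text": text, "entities": []}


results = parse_batch(
    [
        "This is my first sentence with entity 1.",
        "This is my second sentence with entity 2",
    ],
    stub_parse,
)
```

The response would then be a single list with one result per query, in the same order as the request.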

Thanks again for the awesome work. Cheers,