Rasa Summer Release - Boosting Performance for Voice Assistants
Hey everyone, hope you’re all doing great! We’re excited to share some updates in Rasa Pro 3.9, aimed at making CALM voice assistants more efficient and high-performing. Here’s a rundown of what’s new:
Added a gRPC action server that uses the gRPC protocol to handle double the number of requests in the same amount of time
Optimized model loading of bot config files
We’ve added multi-step prompting. This breaks the dialogue understanding module into multiple shorter steps, so smaller and cheaper LLMs (e.g. GPT-3.5) can come close to the functional performance of larger, stronger, but more expensive models (e.g. GPT-4)
You can now create custom information retrievers to improve search accuracy. This allows you to connect to any information retrieval source for more precise results.
We improved slot extraction by allowing CALM assistants to use either an LLM or a lightweight NLU extractor, such as Duckling. This reduces the number of LLM calls and lowers the operational costs and latency per user interaction.
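To picture why picking the extractor per slot cuts down on LLM calls, here is a rough Python sketch. It is not Rasa code: `SLOT_EXTRACTORS`, `rule_based_number`, `fill_slot` and `fake_llm` are all made-up names, and the regex is only a toy stand-in for a rule-based extractor like Duckling.

```python
import re
from typing import Callable, Dict, List, Optional

# Hypothetical per-slot setting: which extractor fills which slot.
SLOT_EXTRACTORS: Dict[str, str] = {
    "num_people": "nlu",
    "special_request": "llm",
}


def rule_based_number(text: str) -> Optional[str]:
    """Toy stand-in for a rule-based extractor such as Duckling."""
    match = re.search(r"\d+", text)
    return match.group() if match else None


def fill_slot(slot: str, text: str, llm: Callable[[str], str]) -> Optional[str]:
    """Fill one slot, calling the LLM only if the slot is configured for it."""
    if SLOT_EXTRACTORS[slot] == "nlu":
        return rule_based_number(text)
    return llm(f"Extract '{slot}' from: {text}")


llm_calls: List[str] = []


def fake_llm(prompt: str) -> str:
    """Fake LLM that records each call so the calls can be counted."""
    llm_calls.append(prompt)
    return "window seat"


message = "Table for 4, window seat please"
people = fill_slot("num_people", message, fake_llm)
request = fill_slot("special_request", message, fake_llm)
```

In this toy run two slots get filled but the (fake) LLM is only called once, since the number comes from the rule-based path.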
I’m particularly excited about the multi-step prompting feature; it’s great to see how it can optimize costs while maintaining functional performance with smaller LLMs.
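For anyone wondering what "multiple shorter steps" could look like, here is a minimal sketch of the general idea, assuming a two-step split: first pick a flow, then ask about the slots of that flow only. This is my own illustration, not how Rasa implements it. `FLOWS`, `understand` and `scripted_llm` are hypothetical names, and the LLM is replaced by a scripted function so the example runs offline.

```python
from typing import Callable, Dict, List

# Hypothetical flows and the slots each one needs.
FLOWS: Dict[str, List[str]] = {
    "transfer_money": ["recipient", "amount"],
    "check_balance": [],
}


def understand(message: str, llm: Callable[[str], str]) -> Dict[str, object]:
    """Run dialogue understanding as several short prompts instead of one long one."""
    # Step 1: a short prompt that only asks which flow applies.
    flow_prompt = f"Pick one flow from {sorted(FLOWS)} for: {message}\nFlow:"
    flow = llm(flow_prompt).strip()
    if flow not in FLOWS:
        return {"flow": None, "slots": {}}

    # Step 2: one short prompt per slot of the chosen flow.
    slots: Dict[str, str] = {}
    for slot in FLOWS[flow]:
        slot_prompt = f"Message: {message}\nValue of '{slot}' (or 'none'):"
        value = llm(slot_prompt).strip()
        if value.lower() != "none":
            slots[slot] = value
    return {"flow": flow, "slots": slots}


def scripted_llm(prompt: str) -> str:
    """Scripted stand-in for a small LLM, so the sketch needs no API key."""
    if prompt.endswith("Flow:"):
        return "transfer_money"
    if "'recipient'" in prompt:
        return "Alice"
    return "none"


result = understand("send money to Alice", scripted_llm)
```

Each prompt here asks one narrow question, which is the kind of task a smaller model tends to handle more reliably than a single long prompt covering everything.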
Custom information retrievers also open up a lot of possibilities for improving search accuracy—I’m curious to see how others are planning to implement these features. Overall, these enhancements seem like they will significantly.
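As a starting point for that discussion, here is a rough sketch of what a pluggable retriever could look like. To be clear, this is not Rasa's actual interface: `Retriever`, `Document` and `KeywordRetriever` are hypothetical names, and the keyword-overlap scoring is just a placeholder for whatever search backend you would really connect.

```python
from dataclasses import dataclass
from typing import List, Protocol


@dataclass
class Document:
    text: str
    score: float


class Retriever(Protocol):
    """Hypothetical interface: anything with a matching search() can be plugged in."""

    def search(self, query: str, top_k: int) -> List[Document]: ...


class KeywordRetriever:
    """Toy retriever that ranks texts by word overlap with the query."""

    def __init__(self, corpus: List[str]) -> None:
        self.corpus = corpus

    def search(self, query: str, top_k: int = 2) -> List[Document]:
        query_words = set(query.lower().split())
        scored: List[Document] = []
        for text in self.corpus:
            overlap = len(query_words & set(text.lower().split()))
            if overlap:
                # Score is the fraction of query words found in the text.
                scored.append(Document(text, overlap / len(query_words)))
        scored.sort(key=lambda doc: doc.score, reverse=True)
        return scored[:top_k]


retriever: Retriever = KeywordRetriever([
    "reset your password in settings",
    "opening hours are 9 to 5",
    "password rules require 8 characters",
])
results = retriever.search("how do I reset my password", top_k=2)
```

Swapping in a vector store or an internal search API would then only mean writing another class with the same `search` method.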