I’m currently working on a project with a Rasa custom action that sends user questions to a FastAPI backend. The backend forwards each question to the OpenAI API to generate an answer in natural language (using the gpt-3.5-turbo model), which is then sent back to the Rasa server.
I’ve noticed that the OpenAI API has an option to send responses as a stream (using `stream=True`), which results in the model’s response arriving incrementally, piece by piece, similar to the experience with ChatGPT. I’m interested in implementing this streaming response in my chatbot to create a more dynamic and interactive user experience.
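For context, here is roughly what consuming that stream looks like on my backend. This is a simplified sketch: the `fake_chunks` list stands in for the iterator that `openai.ChatCompletion.create(..., stream=True)` returns, using the documented delta-chunk shape.

```python
# Sketch of consuming an OpenAI streaming response.
# `fake_chunks` simulates the iterator returned by
# openai.ChatCompletion.create(model="gpt-3.5-turbo", ..., stream=True);
# each chunk carries a partial "delta" of the assistant message.
fake_chunks = [
    {"choices": [{"delta": {"role": "assistant"}}]},
    {"choices": [{"delta": {"content": "Hello"}}]},
    {"choices": [{"delta": {"content": ", world"}}]},
    {"choices": [{"delta": {}}]},  # final chunk carries an empty delta
]

def collect_stream(chunks):
    """Yield only the text pieces from a stream of delta chunks."""
    for chunk in chunks:
        piece = chunk["choices"][0]["delta"].get("content")
        if piece:
            yield piece

answer = "".join(collect_stream(fake_chunks))
print(answer)  # -> Hello, world
```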
My goal is to have this stream sent to the Rasa server and then from Rasa to my chatbot UI (built with React), so it will print the answer character by character as it is received.
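One way I could relay the stream from FastAPI toward the UI is Server-Sent Events (just an assumption on my part, not something from the Rasa docs). A sketch of the formatting half, with the chunk source simulated — in a real app I would wrap `sse_events(...)` in FastAPI's `StreamingResponse` with `media_type="text/event-stream"` and read it in React with the `EventSource` API:

```python
# Sketch: format streamed text pieces as Server-Sent Events (SSE).
# In FastAPI this generator would be returned as
#   StreamingResponse(sse_events(pieces), media_type="text/event-stream")
# The `pieces` iterable here simulates text deltas coming from OpenAI.
def sse_events(pieces):
    for piece in pieces:
        # Each SSE event is a "data: <payload>" line followed by a blank line.
        yield f"data: {piece}\n\n"
    yield "data: [DONE]\n\n"  # sentinel so the client knows to stop reading

events = list(sse_events(["Hel", "lo"]))
print("".join(events))
```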
I’ve seen the `stream_response` method in the Rasa documentation, but I’m not sure whether it applies to my use case or how I would go about implementing it.
Does anyone have any ideas or suggestions on how to implement this kind of streaming response in Rasa? Is it even possible to do this with the current capabilities of Rasa?
Any guidance or advice would be greatly appreciated. Thank you!
You can receive the streaming response from OpenAI, convert it into chunks, and send those chunks to Rasa step by step using `stream_response`. Handle the backend processing efficiently so you don’t overwhelm your Rasa server.
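To keep the load manageable, one approach is to batch the small pieces into larger chunks before forwarding them. A minimal sketch of that idea — all names here are illustrative, and `dispatch` is just a placeholder for however you actually forward a chunk (for example `dispatcher.utter_message` inside a Rasa custom action):

```python
# Sketch: batch a token/character stream into larger chunks before
# forwarding, so each hop (backend -> Rasa -> UI) handles a handful of
# messages instead of one per token. `dispatch` is a placeholder for
# whatever forwarding call you use on the Rasa side.
def batch_chunks(stream, min_size=20):
    """Group small stream pieces into chunks of at least min_size chars."""
    buffer = ""
    for piece in stream:
        buffer += piece
        if len(buffer) >= min_size:
            yield buffer
            buffer = ""
    if buffer:  # flush whatever remains when the stream ends
        yield buffer

def relay(stream, dispatch, min_size=20):
    """Forward a stream chunk by chunk via the given dispatch callable."""
    for chunk in batch_chunks(stream, min_size):
        dispatch(chunk)

sent = []
relay(["Hello ", "there, ", "how ", "are ", "you?"], sent.append, min_size=12)
print(sent)  # -> ['Hello there, ', 'how are you?']
```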
Any updates on this? I can’t find any good resources on implementing streaming with Rasa.
I’m not exactly sure what the `stream_response` option is or how to set it to true.