Implementing Streaming Responses in Rasa for a ChatGPT-like Experience

Hello Rasa Community,

I’m currently working on a project where a Rasa custom action sends user questions to a FastAPI backend. The backend forwards each question to the OpenAI API (using the gpt-3.5-turbo model) to generate a natural-language answer, which is then sent back to the Rasa server.

I’ve noticed that the OpenAI API has an option to send responses as a stream (using stream=True), which results in the model’s response arriving incrementally, token by token, similar to the experience with ChatGPT. I’m interested in implementing this streaming response in my chatbot to create a more dynamic and interactive user experience.
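
For context, here is roughly what consuming that stream looks like with the pre-1.0 openai Python package (the prompt here is only a placeholder):

    import os

    import openai

    openai.api_key = os.environ["OPENAI_API_KEY"]

    # With stream=True the API yields many small chunks as the model
    # generates, instead of one final response object.
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "Explain streaming briefly."}],
        stream=True,
    )

    for chunk in response:
        # Each chunk carries a "delta"; the text piece (roughly one
        # token, not one character) may be missing in the first and
        # last chunks, hence the .get() with a default.
        piece = chunk["choices"][0]["delta"].get("content", "")
        print(piece, end="", flush=True)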

My goal is to have this stream sent to the Rasa server and then from Rasa to my chatbot UI (built with React), so it will print the answer character by character as it is received.
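
On the backend side, I believe something like FastAPI’s StreamingResponse can relay the pieces to the caller as they arrive (the /ask endpoint name here is made up):

    import openai
    from fastapi import FastAPI
    from fastapi.responses import StreamingResponse

    app = FastAPI()

    @app.get("/ask")
    async def ask(question: str):
        # Hypothetical endpoint: pull pieces from the OpenAI stream
        # and forward each one to the caller as soon as it arrives.
        def token_stream():
            response = openai.ChatCompletion.create(
                model="gpt-3.5-turbo",
                messages=[{"role": "user", "content": question}],
                stream=True,
            )
            for chunk in response:
                piece = chunk["choices"][0]["delta"].get("content", "")
                if piece:
                    yield piece

        return StreamingResponse(token_stream(), media_type="text/plain")

What I can’t figure out is the Rasa hop in the middle: the custom action calls this endpoint, but I don’t see how to forward the pieces onward to the UI as they come in.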

I’ve seen the stream_response method in the Rasa documentation, but I’m not sure if this is applicable to my use case or how I would go about implementing it.

Does anyone have any ideas or suggestions on how to implement this kind of streaming response in Rasa? Is it even possible to do this with the current capabilities of Rasa?

Any guidance or advice would be greatly appreciated. Thank you!

You can receive the streaming response from OpenAI, convert it into chunks, and send those chunks step by step to Rasa using stream_response. Handle the backend processing efficiently to avoid overwhelming your Rasa server.
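
I don’t have a stream_response snippet handy, but here is a rough sketch of the chunking idea with the standard dispatcher (ActionStreamAnswer and get_openai_stream are placeholders you would adapt; note the dispatcher only queues messages, and Rasa delivers everything queued once the action returns):

    from typing import Any, Dict, List, Text

    from rasa_sdk import Action, Tracker
    from rasa_sdk.executor import CollectingDispatcher


    class ActionStreamAnswer(Action):
        # Hypothetical action name; adapt to your own actions.py.

        def name(self) -> Text:
            return "action_stream_answer"

        def run(
            self,
            dispatcher: CollectingDispatcher,
            tracker: Tracker,
            domain: Dict[Text, Any],
        ) -> List[Dict[Text, Any]]:
            buffer = ""
            # get_openai_stream is a placeholder for however you read
            # text pieces from your backend's stream.
            for piece in get_openai_stream(tracker.latest_message["text"]):
                buffer += piece
                # Flush on sentence boundaries so Rasa sends a handful
                # of messages instead of one per token. The dispatcher
                # only queues them; all queued messages are delivered
                # together once run() returns.
                if buffer.endswith((".", "!", "?")):
                    dispatcher.utter_message(text=buffer)
                    buffer = ""
            if buffer:
                dispatcher.utter_message(text=buffer)
            return []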

I’m facing the same issue. Please share a solution if you found one.

Any solution to this?
How do you use stream_response in Rasa? Does it go in actions.py? Thanks!

Can you please provide some details? Any sample code, or a bit more clarity on what exactly needs to be changed and where?

Any updates on this? I can’t find any good resource on implementing streaming with Rasa, and I’m not exactly sure what the ‘stream_response’ option is or how to set it to true.

I have the same issue

I am able to get the response from OpenAI via stream in chunks, but I am stuck on how to send it on to my frontend from Rasa.

    for chunk in explanation:
        # utter_message only queues the text; Rasa delivers all queued
        # messages together once the action returns, so this cannot
        # stream chunks to the frontend in real time.
        dispatcher.utter_message(text=chunk)

The above is how I am sending it at the moment, but it is not working. Any help would be really appreciated. Thank you!
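
For now the only thing that works for me is accumulating the stream and sending it as a single message, which loses the typing effect but at least delivers the answer:

    # Sketch assuming `explanation` yields plain text pieces from the
    # OpenAI stream: accumulate everything and send one message, since
    # per-chunk utter_message just queues one tiny message per chunk
    # and Rasa delivers them all at once after the action returns.
    full_text = "".join(piece for piece in explanation if piece)
    dispatcher.utter_message(text=full_text)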