Rasa with LLM fallback

I have implemented Rasa with an LLM working as the fallback (through custom actions). It works fine on its own. However, if simultaneous requests that route to the LLM fallback are made to the chatbot, response generation fails, and any further requests to the LLM fallback also fail after that first failed iteration. Meanwhile, requests to Rasa that do not require the fallback action still succeed. Is there any way to put the requests to the LLM fallback into a pipeline so they are handled sequentially or asynchronously?
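For illustration, the fallback action is shaped roughly like this (simplified sketch, not my exact code; the endpoint URL, payload, and the blocking `requests` call are placeholders):

```python
from typing import Any, Dict, List, Text

import requests  # synchronous HTTP client

from rasa_sdk import Action, Tracker
from rasa_sdk.executor import CollectingDispatcher


class ActionLLMFallback(Action):
    def name(self) -> Text:
        return "action_llm_fallback"

    def run(
        self,
        dispatcher: CollectingDispatcher,
        tracker: Tracker,
        domain: Dict[Text, Any],
    ) -> List[Dict[Text, Any]]:
        # Blocking call: while this waits for the LLM, the action
        # server cannot make progress on other requests.
        response = requests.post(
            "http://localhost:8000/generate",  # placeholder LLM endpoint
            json={"prompt": tracker.latest_message.get("text", "")},
            timeout=60,
        )
        dispatcher.utter_message(text=response.json()["completion"])
        return []
```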

Yes, I think you could use webhooks and websockets (given that you also have some front-end app). I'm pretty sure your issue has nothing to do with the LLM itself: a blocking call inside a custom action stalls the action server's event loop, which would explain why concurrent fallback requests fail while non-fallback requests keep working.
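The more direct fix is probably inside the action server itself: make the action `async` and serialize access to the LLM. Here is a rough sketch of one way to do that with an `asyncio.Lock`; `aiohttp`, the endpoint URL, and the `completion` response field are assumptions standing in for whatever your LLM backend actually exposes:

```python
import asyncio
from typing import Any, Dict, List, Text

import aiohttp  # non-blocking HTTP client

from rasa_sdk import Action, Tracker
from rasa_sdk.executor import CollectingDispatcher

# One lock shared across requests: concurrent fallback calls queue up
# here and reach the LLM strictly one at a time (Python 3.10+, where
# the lock binds to the running event loop lazily).
LLM_LOCK = asyncio.Lock()


class ActionLLMFallback(Action):
    def name(self) -> Text:
        return "action_llm_fallback"

    async def run(
        self,
        dispatcher: CollectingDispatcher,
        tracker: Tracker,
        domain: Dict[Text, Any],
    ) -> List[Dict[Text, Any]]:
        prompt = tracker.latest_message.get("text", "")
        # Waiting on the lock does not block the event loop, so Rasa
        # keeps serving non-fallback requests while fallbacks queue.
        async with LLM_LOCK:
            async with aiohttp.ClientSession() as session:
                async with session.post(
                    "http://localhost:8000/generate",  # placeholder endpoint
                    json={"prompt": prompt},
                    timeout=aiohttp.ClientTimeout(total=60),
                ) as resp:
                    payload = await resp.json()
        dispatcher.utter_message(text=payload.get("completion", ""))
        return []
```

If your LLM backend can handle a few generations in parallel, swap the lock for an `asyncio.Semaphore(n)` to allow up to `n` concurrent calls instead of strictly one.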

Hopefully you’ve solved this problem by now :smiley: