Is monkey patching required in the rasa_core.run script to handle concurrent requests?

I use the rasa_core.run script to start rasa-core, and I run it in a Windows environment. I observed a problem with concurrent requests in Rasa Core: the requests were being queued instead of being executed in parallel.

The requests started to be processed concurrently after I added monkey patching to the rasa_core.run script, as shown below:

from gevent import monkey

monkey.patch_all()

In a basic sanity test it seems to work fine. However, I wanted to confirm whether monkey patching is always required and whether it has any side effects.

Note: As an alternative, if I use “cheroot” (from “cherrypy”) as the WSGI server, then there are no issues serving concurrent requests and monkey patching is not required. It seems to be a problem with gevent.
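
One plausible explanation for the difference: cheroot serves requests from a pool of worker threads, so a blocking handler does not hold up other requests, while gevent relies on cooperative greenlets that only yield when blocking calls are gevent-aware (which is what patch_all arranges). The sketch below reproduces only the symptom, using nothing but the Python standard library. It does not use gevent, cheroot, or rasa_core; slow_app, ThreadedWSGIServer, QuietHandler, and time_parallel_requests are hypothetical names made up for this illustration, and the 0.2 second delay is an arbitrary stand-in for a blocking endpoint.

```python
import http.client
import threading
import time
from socketserver import ThreadingMixIn
from wsgiref.simple_server import WSGIRequestHandler, WSGIServer, make_server

DELAY_SECONDS = 0.2  # arbitrary stand-in for a slow, blocking request handler


def slow_app(environ, start_response):
    """Hypothetical WSGI app that blocks for DELAY_SECONDS on every request."""
    time.sleep(DELAY_SECONDS)
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]


class ThreadedWSGIServer(ThreadingMixIn, WSGIServer):
    """Handles each request in its own thread instead of one at a time."""
    daemon_threads = True


class QuietHandler(WSGIRequestHandler):
    """Suppresses the per-request log lines that wsgiref writes to stderr."""

    def log_message(self, format, *args):
        pass


def time_parallel_requests(server_class, n_requests=3):
    """Send n_requests at once and return the total wall-clock time in seconds."""
    # Port 0 asks the OS for any free port.
    server = make_server("127.0.0.1", 0, slow_app,
                         server_class=server_class, handler_class=QuietHandler)
    port = server.server_port
    threading.Thread(target=server.serve_forever, daemon=True).start()

    def fetch():
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=10)
        conn.request("GET", "/")
        conn.getresponse().read()
        conn.close()

    clients = [threading.Thread(target=fetch) for _ in range(n_requests)]
    start = time.perf_counter()
    for client in clients:
        client.start()
    for client in clients:
        client.join()
    elapsed = time.perf_counter() - start

    server.shutdown()
    server.server_close()
    return elapsed


if __name__ == "__main__":
    print("one request at a time:", round(time_parallel_requests(WSGIServer), 2), "s")
    print("thread per request:  ", round(time_parallel_requests(ThreadedWSGIServer), 2), "s")
```

With the single-threaded server the total time grows with the number of requests, which matches the queuing described above; with one thread per request it stays close to a single delay.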

Hi @ksoneji, great to see you’re achieving concurrency with monkey patching. This is not something we’ve tested or intended, so I wouldn’t be able to comment on the implications.

We’re currently working on porting the Flask-based server to Sanic, which will make all endpoints asynchronous. We expect to release this in version 0.14. Here’s the PR I’m referring to: https://github.com/RasaHQ/rasa_core/pull/1498