This is my first time posting here. I’m a big fan of Rasa and the options it gives those of us looking to create meaningful and powerful chatbots. Something I’ve been curious about, though, is whether Rasa can use Dask to help train Rasa models.
I came across the rasa.engine.runner.dask documentation located here:
I would just like some more elaboration on how you can use Rasa with Dask, and on which parts of Dask Rasa is using. For example, is Rasa using the Dask graph and executor to train Rasa models on a Dask cluster or on your local machine?
Also, what examples currently exist for using Rasa with Dask?
Thanks again and much appreciated!
Hi Brian, I am interested in this subject too and want to use Dask on Ray. Looking at the code, I noticed that Dask is used, but I’m stuck with Ray for my bachelor thesis.
Anyway, Rasa already uses Dask for the training graph, at least in versions 3+, but it calls dask.get() on line 101 of dask.py, which is the “single-threaded”, synchronous scheduler. Apparently TensorFlow already runs tasks in parallel locally, but that does not scale well beyond one machine.
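To make the scheduler difference concrete, here is a minimal sketch that runs the same small task graph with dask.get (synchronous) and dask.threaded.get (thread pool). The graph, the inc and add helpers, and the seen_threads list are made up for illustration and are not taken from Rasa's code; only dask.get and dask.threaded.get are real Dask entry points.

```python
import threading

import dask
import dask.threaded

# Hypothetical helpers: each records the thread it runs on.
seen_threads = []


def inc(x):
    seen_threads.append(threading.current_thread().name)
    return x + 1


def add(x, y):
    seen_threads.append(threading.current_thread().name)
    return x + y


# A tiny hand-written Dask graph: "a" and "b" are independent,
# "total" depends on both.
graph = {
    "a": (inc, 1),
    "b": (inc, 2),
    "total": (add, "a", "b"),
}

# Synchronous scheduler: every task runs in the calling thread.
sync_result = dask.get(graph, "total")
sync_threads = set(seen_threads)

# Threaded scheduler: tasks are handed to a thread pool, so the
# independent tasks "a" and "b" are allowed to overlap.
seen_threads.clear()
threaded_result = dask.threaded.get(graph, "total")
threaded_threads = set(seen_threads)

print(sync_result, sync_threads)
print(threaded_result, threaded_threads)
```

Both calls return the same value; only where the tasks run differs. Whether switching schedulers is safe for Rasa's actual graph nodes is a separate question that this sketch does not answer.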
Hi @toza-mimoza ,
I’m very interested in this as well. My main use case is making sure that, if we use what’s available here, the graph could be passed to a Dask cluster in a local or remote setting. I’m also interested in RAPIDS for NLP on a GPU. But if any form of this works with the Dask executor, that would be a step in the right direction. It would be odd to build Dask integration into Rasa and leave that ability out.
Hi @bparbhu ,
I suppose it would be possible to use Dask both locally and remotely, although I am not familiar with Dask. Let’s stir up the community and developers to help us.
Here is my suggestion/question for DaskGraphRunner regarding the synchronous/threaded scheduler: [DaskGraphRunner] dask.threaded.get instead of dask.get · Issue #10754 · RasaHQ/rasa · GitHub.
Here is my post regarding Dask on Ray integration: Dask on Ray for DaskGraphRunner: Serialization of GraphNode class.
Hope our questions get answered.
@bparbhu It’s been days, but I have managed to run Rasa’s Dask graph on a Ray cluster.
Here are my observations:
To be clear, I had to disable the cache since it cannot be serialized (there is an SQLAlchemy object in a TrainingHook somewhere in the code). It’s not important for my use case (bachelor thesis/research), but that’s already a big disadvantage for Rasa 3.x+.
I did not change the configuration or add any stories or intents; it was a plain rasa init chatbot.
On Azure I have 3 VMs with 2 vCPUs each, which makes 6 in total for the whole cluster. At any time the Dask graph uses up to 5 vCPUs, with 3-4 being most common, I suppose because at most 4 graph nodes can be run in parallel. That matches local testing on a single-node cluster on my computer, where I do not see any benefit from parallelizing beyond 4 threads.
So the best bet is to use threaded Dask rather than a cluster.
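The idea that the graph's shape caps the useful number of workers can be sketched in plain Python. The dependency graph below is hypothetical (the node names are invented and are not Rasa's real graph nodes), and counting the widest dependency level is only a rough estimate that assumes every task takes about the same time.

```python
from collections import Counter

# Hypothetical training graph: node -> list of nodes it depends on.
deps = {
    "load_data": [],
    "tokenize": ["load_data"],
    "featurize_a": ["tokenize"],
    "featurize_b": ["tokenize"],
    "featurize_c": ["tokenize"],
    "featurize_d": ["tokenize"],
    "train_nlu": ["featurize_a", "featurize_b", "featurize_c", "featurize_d"],
}


def max_parallel_width(graph):
    """Return the size of the widest dependency level in the graph."""
    levels = {}

    def level(node):
        # A node's level is one more than its deepest dependency;
        # nodes without dependencies sit at level 0.
        if node not in levels:
            levels[node] = 1 + max((level(d) for d in graph[node]), default=-1)
        return levels[node]

    counts = Counter(level(node) for node in graph)
    return max(counts.values())


print(max_parallel_width(deps))
```

For this made-up graph only the four featurize nodes can run side by side, so extra workers beyond four would sit idle regardless of how large the cluster is.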