Rasa assistant trains faster on CPU than on GPU

Hello! My assistant model trains faster on the CPU than on the GPU. I have turned off CUDA support in the OS (removed the cudatoolkit and cunn modules). What could be the problem, and how can I fix it? Please help me.

OS: Linux Manjaro KDE · GPU: GeForce 750 Ti · CPU: AMD FX-6300 (six cores) · Rasa: 2.8 · Python: 3.8 · TensorFlow: 2.6

| Run | NLU train | Core initialize | Core train | Total |
|---|---|---|---|---|
| CPU 1st train | 00:03:14 | 00:01:00 | 00:02:24 | 00:06:38 |
| CPU 2nd train | 00:03:17 | 00:01:00 | 00:02:22 | 00:06:39 |
| GPU 1st train | 00:01:30 | 00:01:00 | 00:17:31 | 00:20:01 |
| GPU 2nd train | 00:01:23 | 00:01:00 | 00:16:55 | 00:19:18 |

Hey @faupotron. My first guess is that it’s caused by TensorFlow. By default, TensorFlow reserves all of the available GPU memory for its running processes. You can try changing this behavior by setting the environment variable TF_FORCE_GPU_ALLOW_GROWTH to true.

Just a quick question on top of that - how many training examples do you have for your NLU model? If your project is on the smaller side, a GPU might not even be necessary.

Thanks, @Juste. I have about 350 NLU examples in total, covering all stories, and about 20 intents used in the stories.

How many examples do we need per intent or story?

@Juste, where can I set this variable?

@Juste, how can I set this variable in the config.yml of my assistant?

Great. For an assistant this size I doubt you really need a GPU, unless you plan to scale the application further.

The variable should be set in your environment, not in config.yml. Are you training your assistant locally? If so, you can set the environment variable by executing the following in your terminal:
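A minimal sketch of what that terminal command might look like (the original code block appears to be missing from the post; this assumes a Bash-like shell and that training is run from the same session):

```shell
# Let TensorFlow allocate GPU memory incrementally instead of
# reserving all of it up front
export TF_FORCE_GPU_ALLOW_GROWTH=true

# Then train as usual in the same shell session, e.g.:
# rasa train
```

The variable must be exported before TensorFlow is initialized, so set it in the same shell (or in your shell profile) before running `rasa train`.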


@Juste Ok. When I train my assistant, I can see in the process manager that the GPU is in use and under load.

The question is: why does my model train faster on the CPU than on the GPU?

As an experiment, I downloaded and installed the “Rasa-demo” model from GitHub. It trains faster on the GPU. I didn’t change any settings.

I think the problem is with the model settings or something else…

It is most likely due to the fact that your assistant is smaller than rasa-demo. As I mentioned previously, for smaller applications GPUs are completely unnecessary, and in some cases you might see little to no improvement in training times compared to using only the CPU. This is because when you use a GPU for small models, your system incurs a lot of overhead invoking GPU kernels and copying data back and forth.

Ok, thanks. I’ll keep an eye on this as my assistant gets bigger.

This was addressed with the addition of a use_gpu flag to TEDPolicy.
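A hedged sketch of how that flag might be used, assuming it is exposed as a `TEDPolicy` parameter in `config.yml` (the exact parameter name and the Rasa version that introduced it should be verified against the Rasa changelog and docs for your version):

```yaml
# config.yml (sketch -- assumes a Rasa version where TEDPolicy
# accepts a use_gpu parameter; check your version's documentation)
policies:
  - name: TEDPolicy
    epochs: 100
    use_gpu: false   # train TED on the CPU even when a GPU is available
```

With this, NLU components could still use the GPU while the Core (TED) training, which showed the slowdown in the timings above, stays on the CPU.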