I am trying to train a model in a GPU instance.
I can see a process spun up for rasa once the training starts; however, the GPU utilization remains at 0 until the training is completed.
Also, there is no improvement in the training time compared to CPU.
Here is some information about the setup.
NVIDIA-SMI 470.141.03
Driver Version: 470.141.03
CUDA Version: 11.4
RASA 1.10.23
Is there a way to troubleshoot why my GPU is not being utilized?
One thing I have done in the past is make sure the libraries load and the GPU is detected inside Python in your environment. One way to do this is to open a Python prompt on the command line (e.g. type python) and enter a few lines like the ones in the Rasa source code: import TensorFlow and use its library functions to tell you how many GPUs it can see. (Search online to see how.)
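As a rough sketch of that check (the helper name `visible_gpus` is made up for illustration; it assumes TensorFlow 2.x is what your Rasa environment installs), something like the following can be pasted into the Python prompt of the same environment you train in:

```python
def visible_gpus():
    """Return the GPUs TensorFlow can see, or None if TensorFlow cannot be imported."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    try:
        # Available in TensorFlow 2.1 and later.
        return tf.config.list_physical_devices("GPU")
    except AttributeError:
        # Older 2.x releases only expose the experimental variant.
        return tf.config.experimental.list_physical_devices("GPU")


gpus = visible_gpus()
if gpus is None:
    print("TensorFlow is not importable in this environment")
elif not gpus:
    print("TensorFlow imported, but it sees 0 GPUs")
else:
    print(f"TensorFlow sees {len(gpus)} GPU(s): {gpus}")
```

If this reports 0 GPUs, Rasa will most likely train on the CPU no matter what nvidia-smi shows. In that case, the log lines TensorFlow prints while importing often name the GPU library it failed to load.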
In the specific GPU problems I’ve had in the past, the cause has been either a version incompatibility between my versions of Python, TensorFlow, the GPU libraries, etc., or my inability to install (and keep installed) the GPU libraries on my VM / compute instance.
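To compare versions, a small report like the one below may help. It is only a sketch: `version_report` is a hypothetical helper, and it assumes both packages expose `__version__`. Note that the "CUDA Version" shown by nvidia-smi is the highest version the driver supports, not necessarily the toolkit that is installed, so it is worth checking which CUDA and cuDNN versions your TensorFlow build expects.

```python
import platform


def version_report():
    """Collect the versions that most often clash in a GPU setup."""
    report = {"python": platform.python_version()}

    for name in ("tensorflow", "rasa"):
        try:
            module = __import__(name)
            report[name] = getattr(module, "__version__", "unknown")
        except Exception as exc:  # a broken install can fail in many ways
            report[name] = f"import failed: {exc}"

    try:
        import tensorflow as tf
        # False means this TensorFlow build has no GPU (CUDA) support at all.
        report["tf_built_with_cuda"] = tf.test.is_built_with_cuda()
    except Exception:
        report["tf_built_with_cuda"] = None

    return report


for key, value in version_report().items():
    print(f"{key}: {value}")
```

If `tf_built_with_cuda` comes back False, the installed TensorFlow package is a CPU-only build, and no amount of driver work will make it use the GPU.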