Should we install a GPU for training?

Hi @akelad, we are currently training NLU and Core on the CPU, but our NLU data has grown and, with the EmbeddingIntentClassifier, training now takes more than two hours, which is unbearable. In the future we are also planning to add a sentiment analyzer, which will increase the training time even further. My question: there used to be a bottleneck in that Rasa's models were not optimised for GPUs. Is that bottleneck gone now, and can we harness the power of GPUs for training our NLU and Core models? Thanks.

We have a couple of machines here with GPU and the speed increase has been significant.

We use Rasa 1.4.6. Here's a rough comparison of training times:

  • 8 x i7 cores - around 6 hours
  • 8 x i7 cores + RTX 1050 (4 GB) - estimates show around 45 minutes, but it runs out of memory at around 50% of the EmbeddingIntentClassifier training
  • 8 x i7 cores + V100 (cloud based) - around 25 minutes

Setting up the GPU was a bit of a pain, because I couldn't get the latest NVIDIA drivers to work with TensorFlow. We use TensorFlow 1.15.0, NVIDIA driver 418.87.01, and CUDA toolkit 10.1.
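If it helps anyone else setting this up: before kicking off a long training run, it's worth confirming that TensorFlow actually sees the GPU. A minimal sketch, assuming TensorFlow 1.15 is installed in the same Python environment Rasa runs in:

# Sanity check: list the devices TensorFlow can see and confirm a GPU
# shows up before running `rasa train`.
import tensorflow as tf
from tensorflow.python.client import device_lib

print("TensorFlow version:", tf.__version__)
print("GPU available:", tf.test.is_gpu_available())  # TF 1.x API

for device in device_lib.list_local_devices():
    print(device.device_type, device.name)

If no GPU device shows up in that list, the driver / CUDA toolkit combination is usually the culprit rather than Rasa itself.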


@akelad what do you suggest? Thanks

@samscudder wow, how much data do you have? With that amount of data a GPU does help. Generally, if you don't have that much data, a GPU won't help much, since Rasa's models are quite shallow. As for running out of memory: as of version 1.6.0 some of our featurizers now use sparse features (see more info here), so you should consider upgrading.
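To make the memory argument concrete (a rough sketch, not Rasa code, and the sizes are hypothetical): a bag-of-ngrams CountVectors matrix is mostly zeros, so storing it densely wastes the bulk of the memory the EmbeddingIntentClassifier otherwise has to hold.

import numpy as np
from scipy import sparse

# Hypothetical sizes: ~2700 NLU examples, 10000 char-ngram features,
# with roughly 0.1% of entries non-zero, as is typical for bag-of-ngrams data.
np.random.seed(0)
dense = np.zeros((2700, 10000), dtype=np.float32)
idx = np.random.randint(0, dense.size, size=dense.size // 1000)
dense.flat[idx] = 1.0

csr = sparse.csr_matrix(dense)
print("dense  MB:", dense.nbytes / 1e6)
print("sparse MB:", (csr.data.nbytes + csr.indices.nbytes + csr.indptr.nbytes) / 1e6)

The same feature matrix drops from roughly a hundred megabytes to well under one, which is why upgrading can make out-of-memory failures on small cards much less likely.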

@noman you can definitely try a GPU and see if that speeds up your training time. Which version of Rasa are you on?

As for your sentiment analyzer, that will depend on how it's implemented.

Actually, I don't think we have that much data… We have two datasets. The smaller one has 273 intents, 356 actions, 551 stories, and 2694 NLU examples. The larger one (with the numbers I mentioned above) has 663 intents, 665 actions, 660 stories, and 5542 NLU examples.

The chatbots are in Brazilian Portuguese.

We are unable to upgrade at the moment, as accuracy drops from around 0.90 to 0.40 in versions > 1.6.0. Can't wait to get it sorted out, though… in the tests I ran, the training time was incredibly quick.

@samscudder that is quite a bit of data though :smiley: especially with that number of intents. It seems like you have fewer than 10 examples per intent on average, though? That's generally not advisable.

Hm, the accuracy drop happens for versions > 1.6.0? That's worrying - could you tell me what pipeline you're using for this?

@akelad There is a bug open for this (Loss of confidence in Rasa > 1.6.0 nlu (compared to 1.4.6) · Issue #5004 · RasaHQ/rasa · GitHub)

Here’s my pipeline:

# Configuration for Rasa NLU.
# https://rasa.com/docs/rasa/nlu/components/
language: pt

pipeline:
- name: "SpacyNLP"
- name: "SpacyTokenizer"
- name: "RegexFeaturizer"
- name: "CRFEntityExtractor"
- name: "CountVectorsFeaturizer"
  strip_accents: "unicode"
- name: "CountVectorsFeaturizer"
  strip_accents: "unicode"
  analyzer: "char_wb"
  min_ngram: 1
  max_ngram: 4
- name: "components.lemmatization.lemma.CustomLemmatization"
  use_cls_token: false
- name: "EmbeddingIntentClassifier"
  epochs: 200
  random_seed: 2614

policies:
  - name: FormPolicy
  - name: AugmentedMemoizationPolicy
    max_history: 1
  - name: KerasPolicy
    random_seed: 2614
    batch_size: 32
    epochs: 350
  - name: MappingPolicy
  - name: FallbackPolicy
    nlu_threshold: 0.6
    core_threshold: 0.6
    fallback_action_name: action_padrao

Our threshold in 1.4.6 was 0.7, but we reduced it to 0.6.

In versions > 1.6.0, though, I'm getting 0.4-0.5 per intent.

@akelad right now we are on 1.3.3; we'll upgrade to the latest stable version soon. So you think we should try a GPU for intent classification, with the EmbeddingIntentClassifier in the pipeline, and evaluate the performance, right? But my question is: how can I reduce the training time from hours to minutes?

The GPU would potentially speed up training, not improve accuracy. As for getting the training time from hours down to minutes, I think upgrading to the newer Rasa version should help, and potentially using a GPU as well.

Thanks @akelad and @samscudder

Hi @akelad, I have one concern regarding the GPU: is it really required to have a GPU or not?

It is definitely not required.

Hi @akelad, I have spent an hour training the rasa-demo project (called Sara), which feels too slow. Do you think that speed is normal? I would be thankful if you could give me a suggestion. My Rasa version is the newest.

Asking another query related to GPUs here.

I have a GTX 1660 Ti GPU, and Rasa model training is noticeably faster on the GPU compared to the CPU. However, when I looked at the GPU usage with the command nvidia-smi, I saw only 24% of the GPU utilised at most.

I was just wondering whether this is normal, or whether there are any hyperparameters I can tune to maximise the usage of the available GPU.

Hi @samscudder, how did you utilize the GPU with Rasa? I can't configure the conda environment.

Thanks in advance