How to increase parallel jobs while training?

rajesh2211 · July 3, 2019, 6:22am

I want to manually increase the number of parallel jobs during training, because Rasa NLU takes a long time to train.

The default parallel_jobs is 1; I want to increase it to 6.

Thanks in advance

Ghostvv · July 5, 2019, 1:01pm

Which pipeline do you use?

rajesh2211 · July 5, 2019, 1:40pm

I am using the spacy_sklearn pipeline.

Ghostvv · July 5, 2019, 2:01pm

I saw you started another thread about this, "How to reduce the training time?". Let’s keep the conversation there.

shayan09 · November 21, 2019, 4:30am

Is there a solution for this yet?

Ghostvv · November 21, 2019, 9:54am

You’d need to customize the sklearn classifier to modify the num_threads parameter.

shayan09 · November 21, 2019, 4:19pm

language: en

pipeline:
  - name: "nlp_spacy"                  
  - name: "tokenizer_spacy"             
  - name: "ner_crf"                     
  - name: "intent_featurizer_spacy"     
  - name: "intent_classifier_sklearn"   
    num_threads: 8

policies:
  - name: MemoizationPolicy
  - name: KerasPolicy

This is my config.yml and I still get:

Fitting 2 folds for each of 6 candidates, totalling 12 fits
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done  12 out of  12 | elapsed:    0.2s finished
127.0.0.1 - - [21/Nov/2019 10:11:37] "POST /train HTTP/1.1" 200 -
Fitting 2 folds for each of 6 candidates, totalling 12 fits
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done  12 out of  12 | elapsed:    0.0s finished
127.0.0.1 - - [21/Nov/2019 10:12:02] "POST /train HTTP/1.1" 200 -
Fitting 2 folds for each of 6 candidates, totalling 12 fits
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done  12 out of  12 | elapsed:    0.1s finished
127.0.0.1 - - [21/Nov/2019 10:12:27] "POST /train HTTP/1.1" 200 -
Fitting 2 folds for each of 6 candidates, totalling 12 fits
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done  12 out of  12 | elapsed:    0.0s finished
127.0.0.1 - - [21/Nov/2019 10:12:52] "POST /train HTTP/1.1" 200 -
Fitting 2 folds for each of 6 candidates, totalling 12 fits
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done  12 out of  12 | elapsed:    0.0s finished

For some reason it is still using the default value from here:

num_threads = kwargs.get("num_threads", 1)

Any idea why this is happening?

Ghostvv · November 21, 2019, 11:17pm

Because the num_threads parameter is not taken from the config.
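
To illustrate the point, here is a minimal self-contained sketch of the mechanism. FakeClassifier and FakeTrainer are hypothetical stand-ins, not Rasa's actual classes; the only line taken from this thread is the kwargs.get("num_threads", 1) lookup. The sketch assumes the trainer forwards its keyword arguments to each component's train method, which is what that lookup suggests.

```python
class FakeClassifier:
    """Hypothetical stand-in for the sklearn intent classifier."""

    def __init__(self, component_config=None):
        # Settings from config.yml would end up here, e.g. {"num_threads": 8}.
        self.component_config = component_config or {}
        self.used_threads = None

    def train(self, training_data, **kwargs):
        # Same lookup as quoted in the thread: only kwargs are consulted,
        # so a value stored in component_config is never read.
        num_threads = kwargs.get("num_threads", 1)
        self.used_threads = num_threads


class FakeTrainer:
    """Hypothetical stand-in for a trainer that forwards kwargs."""

    def __init__(self, pipeline):
        self.pipeline = pipeline

    def train(self, training_data, **kwargs):
        # Every component receives the same keyword arguments.
        for component in self.pipeline:
            component.train(training_data, **kwargs)


clf = FakeClassifier({"num_threads": 8})
trainer = FakeTrainer([clf])

# Value present only in the component config: the default is used.
trainer.train(["hello"])
config_only_threads = clf.used_threads

# Value passed as a keyword argument: it reaches the classifier.
trainer.train(["hello"], num_threads=8)
kwarg_threads = clf.used_threads
```

In this sketch the config entry alone leaves the classifier on its default, which matches the n_jobs=1 lines in the log above.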

shayan09 · November 22, 2019, 2:47pm

This is how I am training the models in Python.

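# Imports assume the standalone rasa_nlu package; in Rasa 1.x these modules live under rasa.nlu.
from rasa_nlu import config
from rasa_nlu.model import Interpreter, Trainer
from rasa_nlu.training_data import load_data

# model_name, model_path and interpreter_dict are defined elsewhere in the application.
training_data = load_data('./intents.md')
trainer = Trainer(config.load('./config.yml'))
trainer.train(training_data)
model_directory = trainer.persist('./models/', fixed_model_name=model_name)
model_path[model_name] = model_directory
interpreter_dict[model_name] = Interpreter.load(model_directory)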

I know that Rasa used to support num_threads from the CLI, but I think it no longer does. Could you please help me set num_threads in this use case?
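
One thing that might work here, given the kwargs.get("num_threads", 1) lookup quoted earlier in the thread, is to pass num_threads as a keyword argument to trainer.train. The sketch below assumes that Trainer.train forwards extra keyword arguments to each pipeline component's train method; that should be checked against the installed Rasa version. train_with_threads is a hypothetical helper name, not part of Rasa.

```python
def train_with_threads(trainer, training_data, num_threads=8):
    """Train with an explicit thread count passed as a keyword argument.

    Assumption: trainer.train(data, **kwargs) hands its kwargs on to every
    component's train(), where the sklearn classifier reads "num_threads".
    """
    return trainer.train(training_data, num_threads=num_threads)
```

In the snippet above, this would replace trainer.train(training_data) with train_with_threads(trainer, training_data, num_threads=8). If the argument takes effect, the [Parallel(n_jobs=1)] line in the training log should change accordingly.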
