I want to manually increase the number of parallel jobs during training, because Rasa NLU is taking some time to train.
The default parallel_jobs is 1; I want to increase it to 6.
Thanks in advance.
Which pipeline do you use?
I am using the spacy_sklearn pipeline.
I saw you started another thread about it, "How to reduce the training time?". Let’s keep the conversation there.
Is there a solution for this yet?
You’d need to customize the sklearn classifier to modify the num_threads parameter.
language: en
pipeline:
  - name: "nlp_spacy"
  - name: "tokenizer_spacy"
  - name: "ner_crf"
  - name: "intent_featurizer_spacy"
  - name: "intent_classifier_sklearn"
    num_threads: 8
policies:
  - name: MemoizationPolicy
  - name: KerasPolicy
This is my config.yml and I still get:
Fitting 2 folds for each of 6 candidates, totalling 12 fits
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 12 out of 12 | elapsed: 0.2s finished
127.0.0.1 - - [21/Nov/2019 10:11:37] "POST /train HTTP/1.1" 200 -
Fitting 2 folds for each of 6 candidates, totalling 12 fits
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 12 out of 12 | elapsed: 0.0s finished
127.0.0.1 - - [21/Nov/2019 10:12:02] "POST /train HTTP/1.1" 200 -
Fitting 2 folds for each of 6 candidates, totalling 12 fits
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 12 out of 12 | elapsed: 0.1s finished
127.0.0.1 - - [21/Nov/2019 10:12:27] "POST /train HTTP/1.1" 200 -
Fitting 2 folds for each of 6 candidates, totalling 12 fits
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 12 out of 12 | elapsed: 0.0s finished
127.0.0.1 - - [21/Nov/2019 10:12:52] "POST /train HTTP/1.1" 200 -
Fitting 2 folds for each of 6 candidates, totalling 12 fits
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 12 out of 12 | elapsed: 0.0s finished
For some reason it is still using the default value from here:
num_threads = kwargs.get("num_threads", 1)
Any idea why this is happening?
Because the num_threads parameter is not taken from the config.
This is how I am training the models in Python.
training_data = load_data('./intents.md')
trainer = Trainer(config.load('./config.yml'))
trainer.train(training_data)
model_directory = trainer.persist('./models/', fixed_model_name=model_name)
model_path[model_name] = model_directory
interpreter_dict[model_name] = Interpreter.load(model_directory)
I know that Rasa used to support num_threads from the CLI, but I think it no longer does. Could you please help me set num_threads in this use case?
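To illustrate why the value in config.yml has no effect, here is a minimal, self-contained sketch of the lookup quoted above, kwargs.get("num_threads", 1). ToyClassifier, ToyTrainer and run are hypothetical stand-ins, not Rasa classes. The sketch assumes that the trainer forwards the keyword arguments of its train call to each component, and that values from the config file are stored separately on the component.

```python
from typing import Any, Dict, List, Optional


class ToyClassifier:
    """Hypothetical stand-in for a component that reads num_threads from train() kwargs."""

    def __init__(self, component_config: Dict[str, Any]) -> None:
        # Assumption: values from config.yml are stored here and never reach train().
        self.component_config = component_config
        self.used_threads: Optional[int] = None

    def train(self, training_data: List[str], **kwargs: Any) -> None:
        # Same lookup as the line quoted in the thread: falls back to 1
        # unless the caller of train() passes num_threads explicitly.
        self.used_threads = kwargs.get("num_threads", 1)


class ToyTrainer:
    """Hypothetical trainer that forwards train() kwargs to every component."""

    def __init__(self, pipeline: List[ToyClassifier]) -> None:
        self.pipeline = pipeline

    def train(self, training_data: List[str], **kwargs: Any) -> None:
        for component in self.pipeline:
            component.train(training_data, **kwargs)


def run(config_threads: int, **train_kwargs: Any) -> Optional[int]:
    """Train one toy component and report how many threads it would use."""
    classifier = ToyClassifier({"num_threads": config_threads})
    ToyTrainer([classifier]).train(["hello"], **train_kwargs)
    return classifier.used_threads


print(run(8))                 # num_threads set only in the config
print(run(8, num_threads=8))  # num_threads passed to train()
```

If the real Trainer forwards keyword arguments in the same way, passing num_threads directly to the train call, rather than setting it in config.yml, would be the thing to try. That is an assumption about the installed version and should be checked against its source.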