Cannot train Rasa NLU with spaCy models

Hi everyone, I am new to rasa and I am encountering a problem when using spacy models to train rasa nlu. Let me explain.

I am able to use the “supervised_embeddings” configuration: my config.yml file is

language: "it"
pipeline: "supervised_embeddings"

and I am able to train rasa nlu using the command

$ rasa train nlu --out rasa_dir/model/ --nlu rasa_dir/data/

But I am not able to do the same thing using the pretrained_embeddings_spacy configuration! I installed spaCy and the Italian model with the commands

$ pip install rasa[spacy]
$ python -m spacy download it_core_news_sm
$ python -m spacy link it_core_news_sm it

then I changed my config.yml file to:

language: "it"
pipeline: "pretrained_embeddings_spacy"

but when I rerun the command

$ rasa train nlu --out rasa_dir/model/ --nlu rasa_dir/data/

training gets stuck, and this is the last log output I see:

2019-06-12 12:41:47 INFO rasa.nlu.model - Starting to train component SpacyNLP
2019-06-12 12:42:16 INFO rasa.nlu.model - Finished training component.
2019-06-12 12:42:16 INFO rasa.nlu.model - Starting to train component SpacyTokenizer
2019-06-12 12:42:16 INFO rasa.nlu.model - Finished training component.
2019-06-12 12:42:16 INFO rasa.nlu.model - Starting to train component SpacyFeaturizer
2019-06-12 12:42:16 INFO rasa.nlu.model - Finished training component.
2019-06-12 12:42:16 INFO rasa.nlu.model - Starting to train component RegexFeaturizer
2019-06-12 12:42:16 INFO rasa.nlu.model - Finished training component.
2019-06-12 12:42:16 INFO rasa.nlu.model - Starting to train component CRFEntityExtractor
2019-06-12 12:45:12 INFO rasa.nlu.model - Finished training component.
2019-06-12 12:45:12 INFO rasa.nlu.model - Starting to train component EntitySynonymMapper
2019-06-12 12:45:12 INFO rasa.nlu.model - Finished training component.
2019-06-12 12:45:12 INFO rasa.nlu.model - Starting to train component SklearnIntentClassifier
Fitting 5 folds for each of 6 candidates, totalling 30 fits
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.

and no model is produced.
The same thing happens when I try to use the en_core_web_md model, i.e. I run the commands

python -m spacy download en_core_web_md
python -m spacy link en_core_web_md en

and change config.yml to

language: "en"
pipeline: "pretrained_embeddings_spacy"

Do you have any idea what I am doing wrong?

Thank you

Hi @fwole, welcome to the forum! Could you tell me how many training examples you have? Sometimes training the sklearn classifier takes a little longer.
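
For context, the last lines of the log ("Fitting 5 folds for each of 6 candidates, totalling 30 fits") are what scikit-learn's grid search prints before it starts fitting. Here is a rough sketch of where those numbers come from. The fold count is taken from the log; the grid values below are an assumption for illustration and may not match what your SklearnIntentClassifier is actually configured with.

```python
from itertools import product

# Assumed hyperparameter grid: illustrative values only, not taken from the thread.
param_grid = {"C": [1, 2, 5, 10, 20, 100], "kernel": ["linear"]}

# Number of cross-validation folds, as reported in the log line.
cv_folds = 5

# Every combination of grid values is one candidate model.
candidates = list(product(*param_grid.values()))

# Each candidate is fitted once per fold.
total_fits = len(candidates) * cv_folds

print(
    f"Fitting {cv_folds} folds for each of {len(candidates)} candidates, "
    f"totalling {total_fits} fits"
)
```

So the classifier step repeats training 30 times, and with n_jobs=1 those fits run one after another, which is why the number of training examples matters for how long this step takes.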

Hi, I’m experiencing exactly the same problem. It’s definitely not the number of training examples. I left this running overnight on both Ubuntu 18.04 and Windows 10 and nothing happened. I also tried en_core_web_sm as well as en_core_web_md, with the same result.

Does anyone know a solution to this issue?

> spacy==2.2.1
> rasa==1.4.3
> rasa-nlu==0.14.3
> rasa-sdk==1.4.0
> rasa-x==0.20.0
> scikit-learn==0.20.4
> sklearn-crfsuite==0.3.6
> tensor2tensor==1.14.1
> tensorboard==1.14.0
> tensorflow==1.14.0
> tensorflow-datasets==1.3.0
> tensorflow-estimator==1.14.0
> tensorflow-gan==2.0.0
> tensorflow-hub==0.7.0
> tensorflow-metadata==0.15.0
> tensorflow-probability==0.7.0

Hello @fwole, try downloading the spaCy model outside your virtual environment and then train. That should work.
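
If it helps, here is a minimal sketch for checking which Python interpreter you are running and whether it can load the linked model at all. The helper name check_spacy_model is made up for this example, and "it" and "en" are simply the link names used earlier in the thread. It only tells you whether the model loads in the current environment; it does not by itself explain the hang.

```python
import importlib.util
import sys


def check_spacy_model(name: str) -> str:
    """Report whether spaCy and the given model load from this interpreter.

    Hypothetical helper for illustration; not part of Rasa or spaCy.
    """
    # First check that spaCy is importable from the active environment.
    if importlib.util.find_spec("spacy") is None:
        return f"spaCy is not installed for {sys.executable}"

    import spacy

    try:
        # spacy.load raises OSError when the model or link cannot be found.
        spacy.load(name)
    except OSError as err:
        return f"spaCy is installed but could not load '{name}': {err}"
    return f"'{name}' loads fine from {sys.executable}"


if __name__ == "__main__":
    # Link names from the thread; replace them with your own.
    for model in ("it", "en"):
        print(check_spacy_model(model))
```

Run it once inside the virtual environment and once outside. If the two outputs differ, the model was installed for a different interpreter than the one running rasa train.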