Spacy in Rasa X

Hi everyone,

I updated my Rasa pipeline to use spaCy. This works perfectly fine on my local machine, but in Rasa X I get the following error in the production container.

ERROR    rasa.engine.graph  - Error initializing graph component for node provide_SpacyNLP0.
ERROR    rasa.server  - Traceback (most recent call last):
  File "/opt/venv/lib/python3.8/site-packages/rasa/nlu/utils/spacy_utils.py", line 92, in load_model
    language = spacy.load(spacy_model_name, disable=["parser"])
  File "/opt/venv/lib/python3.8/site-packages/spacy/__init__.py", line 51, in load
    return util.load_model(
  File "/opt/venv/lib/python3.8/site-packages/spacy/util.py", line 427, in load_model
    raise IOError(Errors.E050.format(name=name))
OSError: [E050] Can't find model 'de_core_news_md'. It doesn't seem to be a Python package or a valid path to a data directory.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/venv/lib/python3.8/site-packages/rasa/server.py", line 1057, in train
    training_result = train(**training_payload)
  File "/opt/venv/lib/python3.8/site-packages/rasa/model_training.py", line 160, in train
    return _train_graph(
  File "/opt/venv/lib/python3.8/site-packages/rasa/model_training.py", line 234, in _train_graph
    trainer.train(
  File "/opt/venv/lib/python3.8/site-packages/rasa/engine/training/graph_trainer.py", line 105, in train
    graph_runner.run(inputs={PLACEHOLDER_IMPORTER: importer})
  File "/opt/venv/lib/python3.8/site-packages/rasa/engine/runner/dask.py", line 101, in run
    dask_result = dask.get(run_graph, run_targets)
  File "/opt/venv/lib/python3.8/site-packages/dask/local.py", line 553, in get_sync
    return get_async(
  File "/opt/venv/lib/python3.8/site-packages/dask/local.py", line 496, in get_async
    for key, res_info, failed in queue_get(queue).result():
  File "/usr/lib/python3.8/concurrent/futures/_base.py", line 437, in result
    return self.__get_result()
  File "/usr/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
    raise self._exception
  File "/opt/venv/lib/python3.8/site-packages/dask/local.py", line 538, in submit
    fut.set_result(fn(*args, **kwargs))
  File "/opt/venv/lib/python3.8/site-packages/dask/local.py", line 234, in batch_execute_tasks
    return [execute_task(*a) for a in it]
  File "/opt/venv/lib/python3.8/site-packages/dask/local.py", line 234, in <listcomp>
    return [execute_task(*a) for a in it]
  File "/opt/venv/lib/python3.8/site-packages/dask/local.py", line 225, in execute_task
    result = pack_exception(e, dumps)
  File "/opt/venv/lib/python3.8/site-packages/dask/local.py", line 220, in execute_task
    result = _execute_task(task, data)
  File "/opt/venv/lib/python3.8/site-packages/dask/core.py", line 119, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "/opt/venv/lib/python3.8/site-packages/rasa/engine/graph.py", line 451, in __call__
    self._load_component(**constructor_kwargs)
  File "/opt/venv/lib/python3.8/site-packages/rasa/engine/graph.py", line 390, in _load_component
    self._component: GraphComponent = constructor(  # type: ignore[no-redef]
  File "/opt/venv/lib/python3.8/site-packages/rasa/nlu/utils/spacy_utils.py", line 120, in create
    model = cls.load_model(spacy_model_name)
  File "/opt/venv/lib/python3.8/site-packages/rasa/nlu/utils/spacy_utils.py", line 95, in load_model
    raise InvalidModelError(
rasa.nlu.model.InvalidModelError: Please confirm that de_core_news_md is an available spaCy model. You need to download one upfront. For example:
python -m spacy download en_core_web_md
More information can be found on https://rasa.com/docs/rasa/components#spacynlp

ERROR    rasa.server  - An unexpected error occurred during training. Error: Please confirm that de_core_news_md is an available spaCy model. You need to download one upfront. For example:
python -m spacy download en_core_web_md
More information can be found on https://rasa.com/docs/rasa/components#spacynlp

This error message is very well written and explains how to install a spaCy model. But I don’t know how to do this in Rasa X. I added the following lines to my values.yaml file.

rasa:
   # ...
   tag: "3.1.0-spacy-de"

What German language model is actually installed in this container? I am trying to use de_core_news_md.

Thank you in advance,

Sören

I found the solution. The problem was simply a gap in the documentation: only the de_core_news_sm model is installed in the container, see rasa/Dockerfile.pretrained_embeddings_spacy_de at 2b943f9c7b4c6235feb035658a80a63b6a4736b0 · RasaHQ/rasa · GitHub

I think it would be great to add this to the documentation.
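
For anyone who wants to check which spaCy model packages are present in a container before training, here is a minimal sketch. It relies only on the fact that spaCy models are installed as ordinary Python packages, so it uses importlib from the standard library rather than any Rasa or spaCy API. The candidate list and the helper names (is_model_installed, installed_models) are my own assumptions for illustration; adjust them to your pipeline. It has to run with the same Python interpreter that Rasa uses inside the container.

```python
import importlib.util

# Candidate model names are assumptions for illustration; edit as needed.
CANDIDATE_MODELS = ["de_core_news_sm", "de_core_news_md", "de_core_news_lg"]


def is_model_installed(name: str) -> bool:
    """Return True if a top-level Python package with this name can be found."""
    return importlib.util.find_spec(name) is not None


def installed_models(candidates):
    """Return the subset of candidate names that are installed, in order."""
    return [name for name in candidates if is_model_installed(name)]


if __name__ == "__main__":
    # Print one status line per candidate model.
    for name in CANDIDATE_MODELS:
        status = "installed" if is_model_installed(name) else "missing"
        print(f"{name}: {status}")
```

If de_core_news_md shows up as missing, the pipeline has to reference a model that is actually in the image, or the image has to be extended with the model you want.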

Ok, I celebrated too early.

I can train a new model and it gets displayed in the list of models, but it fails to activate. Unfortunately I have no conclusive error message; the container just crashes. It sounds very similar to this post: Rasa X training fails when using SpaCy model

Does anyone have an idea what I can do?