Can't run cross-validation with pre-trained spaCy pipeline due to thinc dependency

Hi. I don’t know if you still maintain older versions, but I’m using Rasa 1.6.0 at the moment (I will migrate later). I’m trying to run cross-validation on a pipeline that uses pre-trained spaCy embeddings, but it fails because it needs the thinc.types module, which can’t be found. Details are below.

 ✗ rasa test nlu --model ./models --nlu ./train_test_split/test_data.md --cross-validation
2021-02-25 01:16:23 INFO     rasa.cli.test  - Test model using cross validation.
2021-02-25 01:16:27 INFO     absl  - Entry Point [tensor2tensor.envs.tic_tac_toe_env:TicTacToeEnv] registered with id [T2TEnv-TicTacToeEnv-v0]
2021-02-25 01:16:36 INFO     rasa.nlu.utils.spacy_utils  - Trying to load spacy model with name 'en'
2021-02-25 01:16:37 INFO     pytorch_transformers.modeling_bert  - Better speed can be achieved with apex installed from https://www.github.com/nvidia/apex .
2021-02-25 01:16:37 INFO     pytorch_transformers.modeling_xlnet  - Better speed can be achieved with apex installed from https://www.github.com/nvidia/apex .
Traceback (most recent call last):
  File "/Users/mervenoyan/.pyenv/versions/repo/bin/rasa", line 11, in <module>
    sys.exit(main())
  File "/Users/mervenoyan/.pyenv/versions/3.6.5/envs/repo/lib/python3.6/site-packages/rasa/__main__.py", line 76, in main
    cmdline_arguments.func(cmdline_arguments)
  File "/Users/mervenoyan/.pyenv/versions/3.6.5/envs/repo/lib/python3.6/site-packages/rasa/cli/test.py", line 143, in test_nlu
    perform_nlu_cross_validation(config, nlu_data, output, vars(args))
  File "/Users/mervenoyan/.pyenv/versions/3.6.5/envs/repo/lib/python3.6/site-packages/rasa/test.py", line 206, in perform_nlu_cross_validation
    data, folds, nlu_config, output, **kwargs
  File "/Users/mervenoyan/.pyenv/versions/3.6.5/envs/repo/lib/python3.6/site-packages/rasa/nlu/test.py", line 1244, in cross_validate
    trainer = Trainer(nlu_config)
  File "/Users/mervenoyan/.pyenv/versions/3.6.5/envs/repo/lib/python3.6/site-packages/rasa/nlu/model.py", line 147, in __init__
    self.pipeline = self._build_pipeline(cfg, component_builder)
  File "/Users/mervenoyan/.pyenv/versions/3.6.5/envs/repo/lib/python3.6/site-packages/rasa/nlu/model.py", line 159, in _build_pipeline
    component = component_builder.create_component(component_cfg, cfg)
  File "/Users/mervenoyan/.pyenv/versions/3.6.5/envs/repo/lib/python3.6/site-packages/rasa/nlu/components.py", line 482, in create_component
    component = registry.create_component_by_config(component_config, cfg)
  File "/Users/mervenoyan/.pyenv/versions/3.6.5/envs/repo/lib/python3.6/site-packages/rasa/nlu/registry.py", line 226, in create_component_by_config
    return component_class.create(component_config, config)
  File "/Users/mervenoyan/.pyenv/versions/3.6.5/envs/repo/lib/python3.6/site-packages/rasa/nlu/utils/spacy_utils.py", line 81, in create
    nlp = cls.load_model(spacy_model_name)
  File "/Users/mervenoyan/.pyenv/versions/3.6.5/envs/repo/lib/python3.6/site-packages/rasa/nlu/utils/spacy_utils.py", line 51, in load_model
    return spacy.load(spacy_model_name, disable=["parser"])
  File "/Users/mervenoyan/.pyenv/versions/3.6.5/envs/repo/lib/python3.6/site-packages/spacy/__init__.py", line 30, in load
    return util.load_model(name, **overrides)
  File "/Users/mervenoyan/.pyenv/versions/3.6.5/envs/repo/lib/python3.6/site-packages/spacy/util.py", line 162, in load_model
    return load_model_from_link(name, **overrides)
  File "/Users/mervenoyan/.pyenv/versions/3.6.5/envs/repo/lib/python3.6/site-packages/spacy/util.py", line 179, in load_model_from_link
    return cls.load(**overrides)
  File "/Users/mervenoyan/.pyenv/versions/3.6.5/envs/repo/lib/python3.6/site-packages/spacy/data/en/__init__.py", line 12, in load
    return load_model_from_init_py(__file__, **overrides)
  File "/Users/mervenoyan/.pyenv/versions/3.6.5/envs/repo/lib/python3.6/site-packages/spacy/util.py", line 228, in load_model_from_init_py
    return load_model_from_path(data_path, meta, **overrides)
  File "/Users/mervenoyan/.pyenv/versions/3.6.5/envs/repo/lib/python3.6/site-packages/spacy/util.py", line 197, in load_model_from_path
    nlp = cls(meta=meta, **overrides)
  File "/Users/mervenoyan/.pyenv/versions/3.6.5/envs/repo/lib/python3.6/site-packages/spacy/language.py", line 158, in __init__
    user_factories = util.registry.factories.get_all()
  File "/Users/mervenoyan/.pyenv/versions/3.6.5/envs/repo/lib/python3.6/site-packages/catalogue.py", line 112, in get_all
    result.update(self.get_entry_points())
  File "/Users/mervenoyan/.pyenv/versions/3.6.5/envs/repo/lib/python3.6/site-packages/catalogue.py", line 127, in get_entry_points
    result[entry_point.name] = entry_point.load()
  File "/Users/mervenoyan/.pyenv/versions/3.6.5/envs/repo/lib/python3.6/site-packages/importlib_metadata/__init__.py", line 100, in load
    module = import_module(match.group('module'))
  File "/Users/mervenoyan/.pyenv/versions/3.6.5/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 994, in _gcd_import
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 941, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 994, in _gcd_import
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 678, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/Users/mervenoyan/.pyenv/versions/3.6.5/envs/repo/lib/python3.6/site-packages/spacy_transformers/__init__.py", line 1, in <module>
    from . import architectures
  File "/Users/mervenoyan/.pyenv/versions/3.6.5/envs/repo/lib/python3.6/site-packages/spacy_transformers/architectures.py", line 3, in <module>
    from thinc.types import Ragged, Floats2d
ModuleNotFoundError: No module named 'thinc.types'

Current setup:

  • Python 3.6.5
  • Rasa 1.6.0
  • spaCy 2.2.3
  • thinc 7.3.1 (installed as a spaCy dependency, I assume)

What I’ve tried:

  • I looked through the thinc 7.3.1 release and couldn’t find a types module in it.
  • Tried thinc 7.1.0: the error went away, but other problems appeared. I had to install spaCy 2.1.3, which caused another dependency conflict, and I ended up back where I started.
  • Tried thinc 7.3.0, which didn’t solve the problem.

It’s odd, because I remember running cross-validation with this pipeline before, so it may be a dependency issue. I’d understand if you don’t maintain older versions, but I’d appreciate any help. (You can also tell me to post this on Explosion’s forum instead; I wasn’t sure where to ask.)

A few things to check:

  • Did training work without errors? That would be a bit strange.
  • Did you try manually removing spaCy and then installing it again? You might want to try installing spaCy 2.2.2.
  • Could you share your config.yml? Would I be correct in assuming you’re only using spaCy as a featurizer here?
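
One more thing worth checking: the traceback ends with spacy_transformers importing thinc.types. As far as I know, that module only exists in thinc 8.x, so a spacy-transformers release built for a newer spaCy may be installed next to spaCy 2.2.3. Below is a minimal Python sketch for listing what is actually installed. The package list and the helper name installed_versions are my own choices for illustration, not something Rasa provides.

```python
# Sketch: report installed versions of the packages that appear in the traceback.
try:
    from importlib import metadata  # Python 3.8+
except ImportError:
    # Python 3.6/3.7: relies on the importlib_metadata backport being installed
    import importlib_metadata as metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Assumed list of relevant packages; extend it as needed.
PACKAGES = ["rasa", "spacy", "thinc", "spacy-transformers"]

for name, version in installed_versions(PACKAGES).items():
    print(f"{name}: {version or 'not installed'}")
```

If spacy-transformers shows up with a version that expects thinc 8, that mismatch would explain the error. This is an assumption based on the traceback, not something I have confirmed.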

With Python versioning issues I always apply the “burn it with fire” approach: I drop the entire virtualenv and start anew. In your case, that means a fresh environment with a clean python -m pip install rasa==1.6.0.
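
The same reset can be sketched in Python with the standard-library venv module. This assumes a plain venv rather than the pyenv virtualenv shown in the traceback; the .venv path and the helper names pip_install_command and rebuild_env are hypothetical.

```python
import subprocess
import sys
import venv
from pathlib import Path


def pip_install_command(env_dir, requirement):
    """Build the pip command that installs `requirement` inside the venv at env_dir."""
    # The interpreter lives in Scripts/ on Windows and in bin/ elsewhere.
    relative = "Scripts/python.exe" if sys.platform == "win32" else "bin/python"
    python = Path(env_dir) / relative
    return [str(python), "-m", "pip", "install", requirement]


def rebuild_env(env_dir, requirement="rasa==1.6.0"):
    """Wipe and recreate the venv at env_dir, then install one pinned requirement."""
    # clear=True deletes the existing contents of env_dir before recreating it.
    venv.create(env_dir, clear=True, with_pip=True)
    subprocess.run(pip_install_command(env_dir, requirement), check=True)


# Example call (destructive, so it is left commented out):
# rebuild_env(".venv")
```

Calling rebuild_env(".venv") removes everything in that directory and reinstalls from scratch, so only point it at an environment you are happy to lose.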

Thank you so much. I removed the venv and started a fresh one and it worked!

“When in doubt, restart with fire.”

– Vincent D. Warmerdam