Multiple spacy models in one pipeline

jamesmf · September 17, 2019, 12:56pm

Hi there,

I’d like to have two spacy models in my pipeline, but the current implementation doesn’t seem well-suited to that. The SpacyNLP object will set "spacy_doc", and were I to have a second SpacyNLP object with a different model, I believe I’d overwrite it.

It does seem possible to accomplish everything I’d want the spacy_doc for before including the second SpacyNLP component. For instance, if I used:

- name: 'SpacyNLP'
  model: 'en_core_web_md'
- name: 'SpacyTokenizer'     #is this necessary anymore? 
- name: 'SpacyFeaturizer'
- name: 'SpacyNLP'
  model: 'my_other_model'
- name: 'SpacyTokenizer'
- name: 'SpacyFeaturizer'

My best guess is that the above pipeline would work: anything relying on "tokens" would get the tokens from the second model (my_other_model), but the document vectors from both models would still be featurized successfully.
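To make that guess concrete, here is a toy sketch in plain Python, with no Rasa or spaCy imports. Every name in it (ToyMessage, toy_nlp, toy_tokenizer, toy_featurizer, the fake vectors) is hypothetical and only mimics the behaviour described above. It assumes each component writes into a shared per-message store, and that the featurizer appends to existing features rather than replacing them; I haven't verified either against the real components.

```python
class ToyMessage:
    """Hypothetical stand-in for an NLU message: a bag of named attributes."""

    def __init__(self, text):
        self.text = text
        self.data = {}

    def set(self, key, value):
        # A later write under the same key replaces the earlier value.
        self.data[key] = value

    def get(self, key, default=None):
        return self.data.get(key, default)


def toy_nlp(model_name, fake_vector):
    """Mimics a SpacyNLP-like step that stores a 'doc' under one shared key."""
    def run(message):
        message.set("spacy_doc", {
            "model": model_name,
            "words": message.text.split(),
            "vector": list(fake_vector),  # made-up document vector
        })
    return run


def toy_tokenizer(message):
    """Derives tokens from whichever doc is currently stored."""
    doc = message.get("spacy_doc")
    message.set("tokens", [(doc["model"], word) for word in doc["words"]])


def toy_featurizer(message):
    """Assumption: appends the current doc vector to any existing features."""
    doc = message.get("spacy_doc")
    existing = message.get("text_features", [])
    message.set("text_features", existing + doc["vector"])


# Same ordering as the config above: featurize before the doc is overwritten.
pipeline = [
    toy_nlp("en_core_web_md", [1.0, 2.0]), toy_tokenizer, toy_featurizer,
    toy_nlp("my_other_model", [9.0]), toy_tokenizer, toy_featurizer,
]

message = ToyMessage("hello there world")
for component in pipeline:
    component(message)

print(message.get("spacy_doc")["model"])  # only the second model's doc survives
print(message.get("tokens"))              # tokens come from the second model
print(message.get("text_features"))       # vectors from both models, concatenated
```

In this toy version the first doc and its tokens are gone by the end, but the features computed from it remain, because the featurizer ran before the second SpacyNLP step replaced the doc.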

Other than being a memory-glutton, is there anything else wrong with that? Is there any appetite for supporting a cleaner interface for that?

Ghostvv · September 18, 2019, 8:09am

It seems to me that it should work for the intent classification features, but I’m not sure about the CRF. Did you try it?
