Once you have a trained model, how do you pass a message through all of the steps in a pipeline up to a specific component? I’ve tried pulling that component out of `Interpreter.pipeline`, saving it as a separate object, and from there running both its `process` and `partially_process` methods, but nothing seems to work.
Background
I’ve trained an NLU model and, using the `Interpreter` object, I’d like to debug the individual components of the pipeline. By debug, I mean I would like to run a message through all of the steps before a given component and then through the component I’m interested in.
Example
As a simple example, if I use the basic pre-trained Spacy pipeline, I would like to pull out the second component, the `SpacyTokenizer`, to see how my message is tokenized. This way I can inspect what is actually being passed to the `CRFEntityExtractor`.
When I run:
```python
interpret = Interpreter.load(os.path.join(model_dir, 'model_name'))
tokenizer = interpret.pipeline[1]
tokenizer.process('Hello, how are you?')
```
It doesn’t convert the text into a spaCy `Doc`, but instead keeps it as a `str`.
I’ve also tried:
```python
tokenizer.prepare_partial_process(tokenizer.partial_processing_pipeline, tokenizer.partial_processing_context)
tokenizer.partially_process('Hello, how are you?')
```
But that still doesn’t work.
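To make the behavior I’m after concrete, here is a toy sketch in plain Python. The classes below (`Message`, `LowercaseNormalizer`, `WhitespaceTokenizer`) are stand-ins I made up for illustration, not the real Rasa components; the point is just the `process_up_to` helper that runs a message through every component up to a chosen index:

```python
class Message:
    """Minimal message container, loosely modeled on an NLU message."""
    def __init__(self, text):
        self.text = text
        self.data = {}

class LowercaseNormalizer:
    """Stand-in component: lowercases the message text in place."""
    def process(self, message):
        message.text = message.text.lower()

class WhitespaceTokenizer:
    """Stand-in component: splits the text into tokens."""
    def process(self, message):
        message.data['tokens'] = message.text.split()

def process_up_to(pipeline, message, stop_index):
    """Run `message` through pipeline[0..stop_index] inclusive and return it."""
    for component in pipeline[:stop_index + 1]:
        component.process(message)
    return message

pipeline = [LowercaseNormalizer(), WhitespaceTokenizer()]
msg = process_up_to(pipeline, Message('Hello, how are you?'), 1)
print(msg.data['tokens'])  # tokens produced by the second component
```

That is, I want each component to mutate a shared message object, and I want to be able to stop after any component and inspect the state it left behind.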
I have other use cases for pulling out the components from a pipeline. For instance, I do post-processing on a specific intent to match extracted text against a predefined list. I want to pull the featurizer out of my pipeline so I can map both the extracted entities and the predefined list into an embedding space; that way I can use a better string-matching algorithm based on vector similarity.
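The matching step I have in mind looks roughly like the sketch below. The embedding vectors here are made up for illustration; in the real pipeline they would come from the featurizer component:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def best_match(entity_vec, candidates):
    """Return the predefined-list item whose embedding is closest to the entity's.

    `candidates` maps each list item to its embedding vector; in practice
    those vectors would be produced by the pipeline's featurizer.
    """
    return max(candidates, key=lambda name: cosine_similarity(entity_vec, candidates[name]))

# Made-up vectors, for illustration only.
predefined = {
    'new york': [0.9, 0.1, 0.0],
    'los angeles': [0.1, 0.8, 0.2],
}
extracted = [0.85, 0.15, 0.05]  # e.g. the featurized form of "NYC"
print(best_match(extracted, predefined))  # -> 'new york'
```

So the only missing piece is being able to call the featurizer on its own to produce those vectors.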
Any and all help will be much appreciated.
Thank you!