NLU Pipeline Debugging: how to get and use a component instance from a Rasa NLU pipeline

Once you have a trained model, how do you pass a message through all of the steps in a pipeline up until a specific one?

I’ve tried pulling the component out of Interpreter.pipeline, saving it as a separate object, and calling both process and partially_process on it, but nothing seems to work.

Background

I’ve trained an NLU model and, using the Interpreter object, I’d like to debug the individual components of the pipeline. By debug, I mean I would like to run a message through all of the steps before the component I’m interested in, and then through that component itself.

Example

As a simple example, if I use the basic pre-trained spaCy pipeline, I would like to pull out the second component, the spaCy tokenizer, to see how my message is tokenized. This way I can inspect what is actually being passed to the CRF entity extractor.

When I run:

interpret = Interpreter.load(os.path.join(model_dir, 'model_name'))

tokenizer = interpret.pipeline[1]

tokenizer.process('Hello, how are you?')

It doesn’t convert the text into a Spacy Doc but instead keeps it as a str.

I’ve also tried:

tokenizer.prepare_partial_process(tokenizer.partial_processing_pipeline, tokenizer.partial_processing_context)

tokenizer.partially_process('Hello, how are you?')

But that still doesn’t work.
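For what it's worth, here is a minimal sketch of the behaviour I'm after, written against stand-in classes rather than Rasa itself. It assumes (and I have not confirmed this for every version) that each component exposes process(message, **context) and mutates a message object in place rather than accepting a raw str. The Message, WhitespaceTokenizer, TokenCounter and run_until names are all hypothetical.

```python
from typing import Any, Dict, List


class Message:
    """Stand-in for an NLU message: raw text plus a dict of attributes."""

    def __init__(self, text: str) -> None:
        self.text = text
        self.data: Dict[str, Any] = {}

    def get(self, key: str, default: Any = None) -> Any:
        return self.data.get(key, default)

    def set(self, key: str, value: Any) -> None:
        self.data[key] = value


class WhitespaceTokenizer:
    """Hypothetical component: stores whitespace-split tokens on the message."""

    def process(self, message: Message, **kwargs: Any) -> None:
        message.set("tokens", message.text.split())


class TokenCounter:
    """Hypothetical component that depends on the tokenizer's output."""

    def process(self, message: Message, **kwargs: Any) -> None:
        message.set("n_tokens", len(message.get("tokens", [])))


def run_until(pipeline: List[Any], context: Dict[str, Any],
              text: str, stop_index: int) -> Message:
    """Run `text` through pipeline[0..stop_index] inclusive and return the message."""
    message = Message(text)
    for component in pipeline[: stop_index + 1]:
        component.process(message, **context)
    return message


pipeline = [WhitespaceTokenizer(), TokenCounter()]
partial = run_until(pipeline, {}, "Hello, how are you?", stop_index=0)
print(partial.data)  # only the tokenizer has run, so only 'tokens' is set
```

If the assumption holds for a real model, the idea would be to pass the loaded interpreter's pipeline and context in place of the stand-ins, and to build the library's own message class instead of the Message stub above.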

I have other use cases for pulling components out of a pipeline. For instance, I do post-processing on a specific intent to match text against a predefined list. I want to pull out the featurizer I use in my pipeline so that I can embed both the entities I extract and the entries of my predefined list in the same vector space. That way I can match strings by vector similarity, which should work better than plain string comparison.
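To make the matching idea concrete, here is a rough sketch. The embed function is a toy stand-in (character-bigram counts) for whatever featurizer would come out of the pipeline, and embed, cosine and best_match are hypothetical names, not part of any library.

```python
import math
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy stand-in for a featurizer: counts of character bigrams."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + 2] for i in range(len(padded) - 1))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_match(entity: str, candidates: List[str]) -> str:
    """Return the candidate whose vector is closest to the entity's vector."""
    vec = embed(entity)
    return max(candidates, key=lambda c: cosine(vec, embed(c)))


candidates = ["checking account", "savings account", "credit card"]
match = best_match("chekcing acount", candidates)
print(match)
```

Swapping embed for the real featurizer is exactly the part I can't figure out, since it needs the message to have gone through the earlier pipeline steps first.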

Any and all help will be much appreciated.

Thank you!


I am facing the same issue. I want to see what kind of features are generated when RegexFeaturizer is executed, and how the message gets updated as it moves through the pipeline. I need to put a breakpoint in a method of the RegexFeaturizer class and observe its behaviour. Please suggest how this can be done.


I also want to check the input and output for each step in the pipeline. Please guide.
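Something like the following is what I have in mind: record what each step adds to or changes on the message. This is only a sketch with stand-in classes, on the assumption that components mutate a shared message via process(message, **context). Message, LowercaseStep, TokenizeStep and trace_pipeline are hypothetical names.

```python
import copy
from typing import Any, Dict, List, Tuple


class Message:
    """Stand-in message: raw text plus a dict of attributes set by components."""

    def __init__(self, text: str) -> None:
        self.text = text
        self.data: Dict[str, Any] = {}


class LowercaseStep:
    """Hypothetical component: stores a lowercased copy of the text."""

    def process(self, message: Message, **kwargs: Any) -> None:
        message.data["lowered"] = message.text.lower()


class TokenizeStep:
    """Hypothetical component: tokenizes the lowercased text."""

    def process(self, message: Message, **kwargs: Any) -> None:
        message.data["tokens"] = message.data["lowered"].split()


def trace_pipeline(pipeline: List[Any], message: Message,
                   context: Dict[str, Any]) -> List[Tuple[str, Dict[str, Any]]]:
    """Run every component and record which attributes each one added or changed."""
    steps = []
    for component in pipeline:
        before = copy.deepcopy(message.data)  # snapshot of the step's input
        component.process(message, **context)
        changed = {k: v for k, v in message.data.items()
                   if k not in before or before[k] != v}
        steps.append((type(component).__name__, changed))
    return steps


steps = trace_pipeline([LowercaseStep(), TokenizeStep()], Message("Hello You"), {})
for name, changed in steps:
    print(name, changed)
```

Placing a breakpoint() call just before component.process in the loop would drop into the debugger at each step. Note that deep-copying may be slow or unsupported for heavy attributes such as parsed documents, so the snapshot line may need adjusting for a real pipeline.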


Ditto all the above. We need a better way to debug a pipeline, e.g. to see which features are being applied to each token. In my case, the lookup table features don’t seem to be working, and I need to see whether they are being applied internally to all the tokens I expect.