Replacing the NLU Pipeline with a custom Interpreter in Rasa 2.0

slapierre · November 5, 2020, 6:25pm

Hello,

I am seeking your advice on how to replace the NLU pipeline in Rasa 2.0 and on an “interpreter injection” concern I noticed during the training phase.

First, let’s discuss the discrepancy between rasa train core and rasa shell.

The startup sequences of rasa train core and rasa shell differ. When running rasa shell, the create_interpreter factory function is called before the Agent is created; the factory uses the EndpointConfig (from endpoints.yml) to instantiate our custom interpreter and passes that interpreter to the Agent. When running rasa train core, the creation of the interpreter is delegated to the Agent. Because the EndpointConfig parameter is not forwarded to the Agent’s constructor, the Agent calls the factory without it and gets a RegexInterpreter.
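
To make the difference concrete, here is a toy model of the two startup paths as I understand them. It is only a sketch: the classes below are simplified stand-ins I wrote for illustration (they are not Rasa's actual implementations), CustomInterpreter and the two *_startup functions are hypothetical names, and the endpoint is reduced to a plain dict.

```python
from typing import Optional


class RegexInterpreter:
    """Simplified stand-in for the default interpreter (not Rasa's class)."""


class CustomInterpreter:
    """Hypothetical interpreter built from the endpoints.yml settings."""

    def __init__(self, url: str) -> None:
        self.url = url


def create_interpreter(endpoint: Optional[dict] = None):
    """Toy factory: falls back to RegexInterpreter when no endpoint is given."""
    if endpoint is None:
        return RegexInterpreter()
    return CustomInterpreter(endpoint["url"])


class Agent:
    """Toy agent: builds its own interpreter when none is injected."""

    def __init__(self, interpreter=None) -> None:
        if interpreter is None:
            interpreter = create_interpreter()
        self.interpreter = interpreter


def shell_startup(endpoint: dict) -> Agent:
    # "rasa shell" path: the interpreter is created first, then injected.
    return Agent(interpreter=create_interpreter(endpoint))


def train_core_startup(endpoint: dict) -> Agent:
    # "rasa train core" path: the endpoint config never reaches the Agent.
    return Agent()


endpoint = {"type": "http", "url": "http://my.nlu.server:5000/nlu"}
print(type(shell_startup(endpoint).interpreter).__name__)
print(type(train_core_startup(endpoint).interpreter).__name__)
```

In this model the first call ends up with the custom interpreter and the second with the RegexInterpreter fallback, which is the behavior I am describing.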

My concern is that the RegexInterpreter.featurize_message method would be called during policy training instead of our own featurization function. For now, RegexInterpreter.featurize_message “does nothing”, so there is no mismatch in the featurization yet.

Please let me know what you think. From my perspective, the startup sequence of rasa train needs to be updated so that a custom NLU (or a RasaNLUHttpInterpreter) can be instantiated. That would ensure that the core featurization introduced in New core featurization #6296 uses the right interpreter instead of RegexInterpreter’s featurization function.

I found out that a major refactoring is in progress (Refactor Agent / Processor / TrackerStore #5257), so perhaps the “interpreter injection” concern I am raising will be addressed in or around that issue. For the moment, I can experiment with my custom interpreter knowing that no featurization is happening.

Now let’s discuss the best way to implement a custom NLU in Rasa 2.0.

I would like to confirm that the 1.0 mechanics described in the Legacy Docs are still supported and that this is still the recommended approach. I am referring to these articles:

I am able to instantiate a RasaNLUHttpInterpreter with the sample endpoints.yml below during the evaluation phase (rasa shell), therefore I believe that the legacy documentation is still valid.

# endpoints.yml
nlu:
  type: http
  url: http://my.nlu.server:5000/nlu
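
For context, this is roughly what I expect an HTTP interpreter to do with that url. It is a minimal sketch using only the standard library, not Rasa's code: build_parse_request is a hypothetical helper, and both the /model/parse path and the {"text": ...} payload are my assumptions about what the remote NLU server accepts. The request is only built here, not sent.

```python
import json
import urllib.request


def build_parse_request(base_url: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a POST request asking the NLU server to parse text."""
    # ASSUMPTION: the server exposes its parse endpoint at <url>/model/parse
    # and accepts a JSON body of the form {"text": "..."}.
    url = f"{base_url.rstrip('/')}/model/parse"
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_parse_request("http://my.nlu.server:5000/nlu", "hello")
print(request.get_method(), request.full_url)
```

Sending it would be a matter of passing the request to urllib.request.urlopen and decoding the JSON response.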

I have identified two other approaches:

Keep using the Rasa NLU pipeline and replace the components with a single, custom component that extends rasa.nlu.components.Component
- Pros:
  - Ability to add a few more components if needed
  - Great articles and code samples explain how to create a custom NLU component:
    - Enhancing Rasa NLU models with Custom Components - Rasa Blog - 2019-02-19
    - Building Rasa NLU custom component for lemmatization with spaCy - Medium - 2019-04-19
    - How to make a Custom Printer Component in Rasa NLU - Rasa Blog - 2020-06-29
    - Rasa NLU Examples: Docs | GitHub
- Cons:
  - Well… it’s a hack: using the existing NLU pipeline to call another NLU is a convoluted solution that will cause headaches in the future, because we have no control over what will happen to RegexInterpreter.featurize_message (at the moment all it does is pass)
  - No control over the concrete interpreter class that is instantiated; it is going to be a RegexInterpreter
Replace the Rasa NLU by a custom interpreter that extends rasa.shared.nlu.interpreter.NaturalLanguageInterpreter (or RegexInterpreter)
- Pros:
  - Full control over the behavior of the parse and featurize_message methods
  - Function invocation instead of a REST call
- Cons:
  - Not much documentation available online on how to proceed (not a big deal)
  - As mentioned above, rasa train core does not load my interpreter

The second approach is very similar to using a RasaNLUHttpInterpreter, except that I can skip the REST call to the server. Its caveat is that, right now, I can’t instantiate it during the training phase.
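
As an illustration of the second approach, here is a minimal sketch of the shape such an interpreter could take. KeywordInterpreter and its keyword table are hypothetical. In a real project the class would extend rasa.shared.nlu.interpreter.NaturalLanguageInterpreter; I kept it standalone so that it runs without Rasa installed, and the parse signature and the returned dictionary layout are my assumptions about what Core expects.

```python
import asyncio
from typing import Any, Dict, Optional


class KeywordInterpreter:
    """Hypothetical in-process interpreter (standalone stand-in, see note above)."""

    # ASSUMPTION: a toy keyword table standing in for a real NLU model.
    KEYWORDS = {"hello": "greet", "bye": "goodbye"}

    async def parse(
        self,
        text: str,
        message_id: Optional[str] = None,
        tracker: Any = None,
        metadata: Optional[Dict[str, Any]] = None,
    ) -> Dict[str, Any]:
        """Return a parse result: the text, one intent, and no entities."""
        name, confidence = "out_of_scope", 0.0
        lowered = text.lower()
        for keyword, intent_name in self.KEYWORDS.items():
            if keyword in lowered:
                name, confidence = intent_name, 1.0
                break
        return {
            "text": text,
            "intent": {"name": name, "confidence": confidence},
            "entities": [],
        }

    def featurize_message(self, message: Any) -> Optional[Any]:
        # Produces no features, like the "does nothing" behavior of
        # RegexInterpreter.featurize_message described earlier in this post.
        return None


result = asyncio.run(KeywordInterpreter().parse("Hello there"))
print(result["intent"]["name"])
```

The point of the sketch is that both parse and featurize_message are under my control and are plain function calls, with no REST round trip.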

Please let me know what you recommend.

Thanks for the help! Simon
