Custom Pipeline

Hello

I have a question about adding a custom step to the NLU pipeline.

  1. When I add a custom step to the NLU pipeline, is it run during training as well as during parsing?

  2. Also, if I want to add a spellcheck, can I create a custom step that runs only at parse time? Spellcheck does not need to run during training, since the training data is already correct.

Thanks in advance.

  1. Yes. A custom component in rasa-nlu should inherit from the Component class, which has both train and process methods:
    def train(self, training_data, cfg, **kwargs):
        # type: (TrainingData, RasaNLUModelConfig, **Any) -> None
        """Train this component.

        This is the components chance to train itself provided
        with the training data. The component can rely on
        any context attribute to be present, that gets created
        by a call to :meth:`components.Component.pipeline_init`
        of ANY component and
        on any context attributes created by a call to
        :meth:`components.Component.train`
        of components previous to this one."""
        pass

    def process(self, message, **kwargs):
        # type: (Message, **Any) -> None
        """Process an incoming message.

        This is the components chance to process an incoming
        message. The component can rely on
        any context attribute to be present, that gets created
        by a call to :meth:`components.Component.pipeline_init`
        of ANY component and
        on any context attributes created by a call to
        :meth:`components.Component.process`
        of components previous to this one."""
        pass
  2. Yes, you can implement only the process method in the custom component and leave the train method empty, like this:
    def train(self, training_data, cfg, **kwargs):
        pass

During training, nothing happens in this component. When an utterance is parsed, the process method is called, and that is where your spell-check logic goes.
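To make that concrete, here is a minimal sketch of a parse-only spell-check component. It assumes the train/process interface quoted above. The Component and Message classes below are stand-ins so the snippet runs without rasa-nlu installed, and SpellChecker, its name and the CORRECTIONS table are hypothetical. In a real project you would inherit from the rasa-nlu Component class and call a proper spell-check library instead of the toy lookup.

```python
import re


class Component(object):
    """Stand-in for the rasa-nlu Component base class (assumption)."""

    def train(self, training_data, cfg, **kwargs):
        pass

    def process(self, message, **kwargs):
        pass


class Message(object):
    """Stand-in for the rasa-nlu message object; only .text is modelled."""

    def __init__(self, text):
        self.text = text


class SpellChecker(Component):
    """Hypothetical parse-only component that fixes known typos in message.text."""

    name = "spell_checker"

    # Toy correction table; a real component would use a spell-check library.
    CORRECTIONS = {"helo": "hello", "wether": "weather"}

    def train(self, training_data, cfg, **kwargs):
        # Training data is assumed to be clean, so there is nothing to do here.
        pass

    def process(self, message, **kwargs):
        def fix(match):
            word = match.group(0)
            # Look the word up case-insensitively; keep it unchanged if unknown.
            return self.CORRECTIONS.get(word.lower(), word)

        # Replace each alphabetic word with its correction, if one exists.
        message.text = re.sub(r"[A-Za-z]+", fix, message.text)


msg = Message("helo, how is the wether?")
SpellChecker().process(msg)
print(msg.text)  # hello, how is the weather?
```

Because train is a no-op, the component only changes anything at parse time.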


yuga04, thanks for the prompt answer.

Further questions:

  1. When I spellcheck messages, should I override message.text?
  2. Likewise for the training data: when I lemmatize it, should I directly override message.text for each example under intent_samples?
  3. Pipeline steps are executed sequentially, so will the next step (e.g. the intent count vectoriser or the TensorFlow intent embedding) build its vectors from the new text?
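
To make questions 1 and 3 concrete, here is a toy sketch of the behaviour I would expect. It is not checked against the rasa-nlu source: Message, Lowercaser and TokenCounter are all hypothetical stand-ins, and whether a real featurizer reads message.text or some other attribute set by an earlier component is an assumption that may depend on the version.

```python
class Message(object):
    """Hypothetical stand-in for a pipeline message."""

    def __init__(self, text):
        self.text = text
        self.data = {}


class Lowercaser(object):
    """Hypothetical preprocessing step that overwrites message.text."""

    def process(self, message, **kwargs):
        message.text = message.text.lower()


class TokenCounter(object):
    """Hypothetical stand-in for a featurizer; it reads message.text as it finds it."""

    def process(self, message, **kwargs):
        counts = {}
        for token in message.text.split():
            counts[token] = counts.get(token, 0) + 1
        message.data["token_counts"] = counts


# Components run in list order, so TokenCounter sees the text
# that Lowercaser has already rewritten.
pipeline = [Lowercaser(), TokenCounter()]
msg = Message("Book book a Table")
for component in pipeline:
    component.process(msg)

print(msg.data["token_counts"])  # {'book': 2, 'a': 1, 'table': 1}
```

If the real pipeline works the same way, overriding message.text in an early step would be enough for later steps to pick up the corrected text.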

Hi Nik, did you find a solution for your first question? I want to create a component that does spell checking, but I'm not really sure how to implement it. Thank you.