Custom NLU component trained outside the deployed Rasa container is not being picked up

This is related to Enhancing Rasa NLU models with Custom Components, but I figured I would start a new thread just in case.

In our setup, we deploy Rasa as a container and then side-load new models via a model server. The environment in which we train new models is different from the package actually deployed, since the two require different things. For simplicity: we deploy package1 with Rasa in it, but we train models using package2, since package2 also contains extra code for our model server (we host the test results of the model on a website).

I recently created a custom component that uses some regex at the start of my pipeline to account for common spelling errors. I trained the component in package2, uploaded it to the model server, and the model was picked up and side-loaded. However, because the config file entry for the custom component needs to specify a path that is specific to package2, the model deployed in package1 can’t find it.

For instance, using the SentimentAnalyzer from the custom components doc: if I put it in a sub-directory, I need to load that component by its full path:

package2.components.sentiment.SentimentAnalyzer

However, when it gets side-loaded into package1, I get an error that it can’t find package2.

Since the component I built only uses regex, the re module is available in both package1 and package2. Is there a way to cache my custom component so that when the model is trained, the class gets stored in the model and the deployed package doesn’t need to know about it?

Hey @jhamburg. To better understand your question and setup: are you running Rasa in one container and training your custom component in a separate one?

@Juste, yes, that is exactly what I’m doing. Hopefully the below clarifies.

Package 1 only has the endpoints.yml and credentials.yml files, as those are the only files needed to launch the server when you point to a model server. Package 1 is deployed to the production container.

Package 2 is a separate Python environment/package. This package is used to train the model and deploy it to our custom model server. It also includes extra files to host the model server.

File System:

  • Package 1
    • endpoints.yml
    • credentials.yml
    • channels
      • custom channel 1
      • custom channel 2
  • Package 2
    • data
      • nlu.md
      • stories.md
    • config.yml
    • domain.yml
    • components
      • custom NLU component 1
      • custom NLU component 2

In the config.yml file, if I want to load a custom component, I need to use the full path:

components.custom_comp1.CustomComp1
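
For reference, the relevant config.yml entry looks roughly like this (a minimal sketch; the other pipeline components here are placeholders, not our actual pipeline):

language: en
pipeline:
  - name: components.custom_comp1.CustomComp1   # full module path, only resolvable inside package2
  - name: WhitespaceTokenizer
  - name: CountVectorsFeaturizer
  - name: EmbeddingIntentClassifier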

In my case, the custom component processes the message, substituting text based on regex matches to clean it:

import re

from rasa.nlu.components import Component  # Rasa 1.x import path


class CustomComp1(Component):
    """Cleans the message based on regex matches."""

    name = "CustomComp1"

    def __init__(self, component_config=None):
        super().__init__(component_config)

    def process(self, message, **kwargs):
        """Clean the message of symbols."""
        message.text = re.sub(r"&", "and", message.text)
        message.text = re.sub(r"""([!.\-'",])""", "", message.text)

Based on the custom component above, I’m not importing anything beyond re and Rasa itself, so I assumed that the component should just work.

However, since I need to refer to the component by its complete path, components.custom_comp1.CustomComp1, the component fails to load when the model server side-loads the model, and model processing fails.

Is there a way to cache NLU components as part of the pipeline so that I don’t need to add each component to the deployed container’s file system too?

If not, that in a way defeats the purpose of having the model server, which is so that we do not need to re-deploy our main container each time.

Hopefully that provides more clarity!

Hi @jhamburg,

Is there a way to cache NLU components as part of the pipeline so that I don’t need to add each component to the deployed container’s file system too?

The way to do this would be to create a custom rasa/rasa image where you copy the code over, and use that image. You could reference the new image in a docker-compose.override.yml, and then just re-pull the image if it is updated, instead of having to copy the custom code over to your server each time.
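
A minimal sketch of what that could look like (the image tag, registry name, and paths are placeholders, not from your setup):

# Dockerfile -- extend the stock image with the custom component code
FROM rasa/rasa:1.10.0
COPY ./components /app/components

# docker-compose.override.yml -- point the rasa service at the new image
services:
  rasa:
    image: your-registry/rasa-custom:latest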

Otherwise, you need to mount the custom code into the Rasa containers:
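
For example, with a volume mount along these lines (the paths are placeholders):

# docker-compose.override.yml -- mount the custom code instead of baking it in
services:
  rasa:
    volumes:
      - ./components:/app/components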

@erohmensing, thank you for your reply, and my apologies for taking so long to get back to you. I’ve been busy with other things given the current uncertain times, and I hope you and your family are safe and healthy!

The workaround you describe doesn’t solve the underlying issue that I’m having. Essentially, as a user, I expect the model file generated by rasa train to be self-contained: it should work in any container as long as that container has the necessary packages installed.

However, this is not the case, because the container you deploy must include all of the component code that was used to build the model zip file.

I understand that a custom component can specify how to persist itself if needed, but I also assumed that the entire pipeline would be persisted as a whole. That way, the environment where I train my model can have the custom components, but the deployed container doesn’t need them if it is pulling the model from a remote server.
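
To illustrate what I mean: as far as I understand the Rasa 1.x Component API, persist() only writes the component’s data into the model archive, while the class itself is re-imported by its module path at load time. A hypothetical sketch (this RegexCleaner and its patterns are illustrative, not our actual component):

import json
import os
import re

from rasa.nlu.components import Component


class RegexCleaner(Component):
    """Hypothetical cleaner that persists its regex patterns with the model."""

    name = "RegexCleaner"
    defaults = {"patterns": [[r"&", "and"]]}

    def __init__(self, component_config=None):
        super().__init__(component_config)
        self.patterns = self.component_config.get("patterns", [])

    def process(self, message, **kwargs):
        for pattern, replacement in self.patterns:
            message.text = re.sub(pattern, replacement, message.text)

    def persist(self, file_name, model_dir):
        # Only this data file is written into the model archive --
        # the class definition itself is never stored there.
        data_file = file_name + ".json"
        with open(os.path.join(model_dir, data_file), "w") as f:
            json.dump(self.patterns, f)
        return {"file": data_file}

    @classmethod
    def load(cls, meta, model_dir=None, model_metadata=None, cached_component=None, **kwargs):
        # Rasa re-imports the class by its module path (stored in the model
        # metadata) before calling load() -- that import is exactly what
        # fails when the deployed container lacks the module.
        component = cls(meta)
        with open(os.path.join(model_dir, meta["file"])) as f:
            component.patterns = json.load(f)
        return component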

I hope I’m explaining this well. A metaphor: it’s as if you have a zip file containing a pre-trained ensemble model, but you can’t actually use it unless you already have all of the code for each model inside it.

@jhamburg, did you find a solution?