Registering component name in a custom component

Hi, thank you for developing such a wonderful framework!

I am using some custom components in my pipeline, but I’m having trouble loading the configuration for them when loading the model. Specifically, Component.load loads configuration from the Metadata object based on the ‘name’ field, but in my metadata.json file, the custom components don’t have a ‘name’ field, only ‘class’. I do specify the ‘name’ attribute in my custom component class. What am I doing wrong? Please let me know if you need a code example.

Sometimes writing a question down is enough to spark a solution right away… I used the ‘persist’ method return value (usually used to provide the filename of the persisted component file) to provide the ‘name’ field, although I feel like this should really be done somewhere else by default.

def persist(self, model_dir):
    # Nothing to persist, but use this return value to register the name
    return {'name': CustomComponent.name}

Actually, the above is not a solution. The problem remains at model specification and training time, where I have to specify my pipeline by putting the “path.to.CustomComponent” as the “name”. When the component is being created, the configuration context along with the cls.name is used to try to extract the configuration for CustomComponent (line 241 in components.py), but the only way for this to work is for cls.name to be “path.to.CustomComponent”.

Is this the best practice for creating custom components, to set cls.name = “path.to.CustomComponent”? Is registering the component in registry.py necessary to be able to use a shorter name? And if so, is there any way to do that without forking the rasa_nlu library and modifying it directly?

Yes, when you have a custom component you should list it with its path. (Sorry I wrote this reply a while ago but apparently didn’t hit the send button)

Follow-up question: does one need to register a custom component? If so, how do you register one?

It appears that all components are listed in rasa_nlu/registry.py and that this file needs to be modified in order to register a custom component. Is there a better way to register a custom component without having to fork rasa_nlu?

Thanks, Alexis

You can simply give the class path to your component in the config file

yourcomponent.intent_classifier.CustomClassifier

It’s not working for me.

Project structure:

  • a
    • __init__.py
    • b.py

class B is defined inside module b, so I added to config:

pipeline:
- name: "a.b.B"

I also tried “a.b” and “a.B” and it still can’t pick it up.

I tried setting name = “B” in class B but that also didn’t work.

What am I doing wrong?

I think your name in the class should also be the full path

a.b.B

The name in config and the name of the component in the component class should be the full class path with dot notation

Hi @souvikg10, are you suggesting we don’t need to modify the registry.py file in order to create a custom component?

When I specify the full path (or a partial path) in the config file, I get the following error:

Failed to find component class for ‘custom.my_component.MyComponent’. Unknown component name. Check your configured pipeline and make sure the mentioned component is not misspelled. If you are creating your own component, make sure it is either listed as part of the component_classes in rasa_nlu.registry.py or is a proper name of a class in a module.

Thanks to @znat, I have noticed the following.

In rasa_nlu.utils there is a function called class_from_module_path, which is called if the component is not found in the list of components in registry.py.

def class_from_module_path(module_path):
    """Given the module name and path of a class, tries to retrieve the class.

    The loaded class can be used to instantiate new objects. """
    import importlib

    # load the module, will raise ImportError if module cannot be loaded
    if "." in module_path:
        module_name, _, class_name = module_path.rpartition('.')
        m = importlib.import_module(module_name)
        # get the class, will raise AttributeError if class cannot be found
        return getattr(m, class_name)
    else:
        return globals()[module_path]

So you could try calling it in your main file:

from rasa_nlu import utils 
utils.class_from_module_path('a.b.B')

and it should work. If this works but you still get the “Failed to find…” error, then the name of your component is probably not written correctly in the config file. If you use a .yml config file, it should look like the following:

pipeline:
- name: "a.b.B"
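To check a dotted path independently of the config file, a small diagnostic along these lines may help. It is only a sketch: `check_component_path` is a hypothetical helper, not part of rasa_nlu, and it simply repeats the rpartition / import_module / getattr steps of class_from_module_path while reporting which step fails. Standard-library paths stand in for a real component path here.

```python
import importlib


def check_component_path(path):
    """Report whether a dotted class path resolves, and if not, why.

    Hypothetical helper (not part of rasa_nlu); it repeats the
    rpartition / import_module / getattr steps of class_from_module_path.
    """
    module_name, _, class_name = path.rpartition(".")
    if not module_name:
        return "no module part in '{}'; expected something like 'a.b.B'".format(path)
    try:
        module = importlib.import_module(module_name)
    except ImportError as e:
        # Covers both a module that is not on the path and a module
        # whose own imports fail.
        return "module '{}' could not be imported: {}".format(module_name, e)
    if not hasattr(module, class_name):
        return "module '{}' has no attribute '{}'".format(module_name, class_name)
    return "ok"


# Standard-library paths stand in for a real component path such as 'a.b.B'.
print(check_component_path("collections.OrderedDict"))
print(check_component_path("collections.NoSuchClass"))
print(check_component_path("no_such_package_xyz.B"))
print(check_component_path("B"))
```

Running it with your own path (for example 'a.b.B') from the directory where you start training shows whether the problem is the module import or the class name.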

Hope this helps!

P.S. I am using rasa_nlu==0.13.7

If I remember correctly, we had to write the name of the component in the class itself:

from rasa_nlu.components import Component


class SklearnIntentClassifier(Component):
    """Intent classifier using the sklearn framework"""

    name = "classpath.intent_classifier_sklearn"

    provides = ["intent", "intent_ranking"]

    requires = ["text_features"]

and put the same path in the config file.

@souvikg10 are you wrapping NLU in your own Python app then? Or are you still able to use the built-in server and endpoints?

It is wrapped in custom Python code and exposed as an API.

Indeed, yes. If I remember correctly what was done (I don’t work on the project anymore, and the implementation was done by data scientists), you have to import your class and pass the name of the component in the config when you train your model. When you run your model, the class path is already part of the config, and it should be the same when you load the model with

interpreter.load(model)

The class should be imported so that the interpreter recognizes it.

I think that is what registry.py is doing. To avoid forking rasa, we imported the class at training time.
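As a rough sketch of what such a wrapper could look like: the function below assumes the rasa_nlu 0.13-era Python API (`config.load`, `load_data`, `Trainer`) and uses the hypothetical package `a.b` from earlier in this thread. The function name is made up for illustration, and this is not necessarily how the project described above did it.

```python
def train_with_custom_component(config_path, data_path, model_dir):
    """Train from a wrapper script instead of `python -m rasa_nlu.train`.

    Assumes the rasa_nlu 0.13-era Python API and that the custom package
    (the hypothetical `a.b` from this thread) is on the Python path.
    """
    # Imported here so the file can be loaded and inspected without rasa_nlu.
    import importlib

    from rasa_nlu import config
    from rasa_nlu.model import Trainer
    from rasa_nlu.training_data import load_data

    # Import the custom module up front: any real problem (for example a
    # missing dependency) surfaces here with its own traceback.
    importlib.import_module("a.b")

    trainer = Trainer(config.load(config_path))
    trainer.train(load_data(data_path))
    # persist() returns the directory the trained model was written to.
    return trainer.persist(model_dir)
```

It could be called as `train_with_custom_component("nlu_config.yml", "training_data/", "./")`. The config file still refers to the component by its full dotted path.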

That sounds reasonable, but where is that main file being called? I thought we do the training by running python -m rasa_nlu.train --config nlu_config.yml --data training_data/ --path ./, which is a built-in rasa module. Does that bring us back to forking the repo?

OK, I think I figured out what was happening here. Following the steps you mentioned, the path was actually picked up, but because the exception caught in registry.py is too generic, I was getting the “module cannot be loaded” error for unrelated errors in my own code.

To be more specific, line 111 of rasa_nlu/registry.py says:

except Exception:

This catches pretty much any problem that happens while loading the file (in my case, for example, I was importing a module that was not properly installed), so the resulting error message has nothing to do with the actual cause. I really think this part of the code should catch a more specific exception.
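To illustrate the effect, here is a self-contained sketch. It is not the actual registry.py code: `load_broad` and `load_chained` are hypothetical stand-ins, and the throwaway module `broken_component` simulates a component file whose own import fails because of a missing dependency.

```python
import importlib
import os
import sys
import tempfile


def load_broad(path):
    """Mimics a catch-all handler: every failure becomes 'unknown component'."""
    module_name, _, class_name = path.rpartition(".")
    try:
        return getattr(importlib.import_module(module_name), class_name)
    except Exception:
        raise Exception("Failed to find component class for '{}'.".format(path))


def load_chained(path):
    """Hypothetical alternative: keep the underlying error in the message."""
    module_name, _, class_name = path.rpartition(".")
    try:
        return getattr(importlib.import_module(module_name), class_name)
    except (ImportError, AttributeError) as e:
        raise Exception(
            "Failed to load component '{}': {}: {}".format(path, type(e).__name__, e)
        )


# Build a throwaway module whose own import fails (a missing dependency).
tmp_dir = tempfile.mkdtemp()
with open(os.path.join(tmp_dir, "broken_component.py"), "w") as f:
    f.write("import dependency_that_is_not_installed\n\nclass B(object):\n    pass\n")
sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

messages = {}
for loader in (load_broad, load_chained):
    try:
        loader("broken_component.B")
    except Exception as e:
        messages[loader.__name__] = str(e)
        print("{}: {}".format(loader.__name__, e))
```

The first message only says the component could not be found, while the second names the missing dependency, which is the information I actually needed.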

You can create an issue on GitHub for that.

Glad it worked out

yep, thanks for the help!

I have noticed this as well; the error handling needs to be improved. Also, the pytest usage should be updated, since it still uses the py.test syntax.