Oh yeah, it’s totally possible to write your own models. In fact, there are plenty of examples over at rasa-nlu-examples, including some custom classifiers and featurizers. Note though that right now we’re transitioning these components to Rasa 3.x. The latest release for Rasa 2.x is found here.
A few caveats though.
Huggingface featurizers are already supported natively via the LanguageModelFeaturizer.
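For example, a minimal pipeline entry from the docs looks something like this (the bert model_name with the rasa/LaBSE weights; the surrounding tokenizer and classifier are just a typical setup, not required to be exactly these):

```yml
pipeline:
  # A tokenizer needs to run before the featurizer.
  - name: WhitespaceTokenizer
  # Huggingface featurizer: pick an architecture via model_name
  # and a checkpoint via model_weights.
  - name: LanguageModelFeaturizer
    model_name: "bert"
    model_weights: "rasa/LaBSE"
  # Downstream intent/entity model that consumes the features.
  - name: DIETClassifier
    epochs: 100
```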
Usually, you should hold off on custom components. Typically the most pressing concern when you’re building an assistant is the data you’re learning from. The DIET architecture is pretty good at picking up patterns across many languages, and I wouldn’t worry too much about an optimal pipeline unless you have a large, representative dataset.
@koaning If I want to attach xlm-roberta-base to the pipeline via LanguageModelFeaturizer, is that possible? If so, can you please explain a bit about how I can do that? I am sorry, but in the documentation I was only able to find bert, gpt, gpt2, xlnet, distilbert, and roberta based models, which is why I had to ask. (If I want to add the xlm-roberta-base model, what should the “model_name” and “model_weights” be? There are no defaults given for xlm-roberta-base in the Rasa documentation.)
… and thank you very much for all the info. That helps a lot.
The idea is that a BERT-style Huggingface model can be used in Rasa, but you’ll need to give it appropriate weights. Am I understanding correctly that xlm-roberta-base refers to a non-roberta model?
It’d help if you could share the config.yml file that you tried to run.
My question is: is it possible to attach the xlm-roberta-base model in the same way? If I want to add it to the pipeline via LanguageModelFeaturizer, how do I specify model_name and model_weights? That’s where I’m stuck, because I couldn’t find those parameters in the documentation for xlm-roberta based models.
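For reference, I imagine the pipeline entry would look something like the sketch below, but the model_name value is just a guess on my part, since xlm-roberta isn’t listed among the documented options (which is exactly what I’m asking about):

```yml
pipeline:
  - name: WhitespaceTokenizer
  # Guessed entry: "roberta" is the closest documented model_name;
  # whether it can actually load the xlm-roberta-base checkpoint is my open question.
  - name: LanguageModelFeaturizer
    model_name: "roberta"
    model_weights: "xlm-roberta-base"
  - name: DIETClassifier
    epochs: 100
```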