Support for Language Models inside Rasa

With Rasa Open Source 1.8, we added support for leveraging language models like BERT, GPT-2, etc. These models can now be used as featurizers inside your NLU pipeline for intent classification, entity recognition, and response selection. The following snippet shows how to configure your pipeline to use the BERT model as an example:

pipeline:
   - name: HFTransformersNLP
     model_name: "bert"
   - name: LanguageModelTokenizer
   - name: LanguageModelFeaturizer
   - name: DIETClassifier

HFTransformersNLP is a utility component that relies on HuggingFace’s Transformers library for the core implementation of the selected language model. LanguageModelTokenizer and LanguageModelFeaturizer construct the tokens and features, respectively, to be used by the downstream NLU models.

You can load different variants of the same language model using the model_weights parameter, depending on the size of the model and the language of your training corpus. For example, there are Chinese (bert-base-chinese) and Japanese (bert-base-japanese) variants of the BERT model, which you can load if your training data is in Chinese or Japanese respectively. A full list of the available variants of these language models can be found in the official documentation of the Transformers library.
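For instance, if your training data is in Chinese, the pipeline above could be pointed at the bert-base-chinese weights (a sketch; check the Transformers model list for the exact identifier of the variant you need):

pipeline:
   - name: HFTransformersNLP
     model_name: "bert"
     model_weights: "bert-base-chinese"
   - name: LanguageModelTokenizer
   - name: LanguageModelFeaturizer
   - name: DIETClassifier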

Please note that the current implementation uses these language models strictly as featurizers, which means their weights are not fine-tuned along with the training of downstream NLU components such as DIETClassifier.

As always, you can still use multiple featurizers in your pipeline, for example:

pipeline:
   - name: HFTransformersNLP
     model_name: "bert"
   - name: LanguageModelTokenizer
   - name: LanguageModelFeaturizer
   - name: CountVectorsFeaturizer
   - name: DIETClassifier

We would love to hear everyone’s feedback on how it performs on your internal datasets, especially when used in combination with the newly introduced DIETClassifier.


I would also love to hear if the video on benchmarking helped navigate the settings. I would also love to hear if it didn’t!


Awesome 🙂

Any plans to add XLM-RoBERTa (see XLM-RoBERTa in the transformers 2.5.1 documentation) to the available language models?


Is there a way to load an HF Transformers-compatible model saved in PyTorch format? Unfortunately, there is no RuBERT model in TF 2.0 format.

When I try to load the PyTorch model, I get this error:

OSError: Error no file named ['pytorch_model.bin', 'tf_model.h5'] found in directory /opt/rubert/conversational_cased_L-12_H-768_A-12_pt/ or `from_pt` set to False

pytorch_model.bin does exist, so I think the issue is from_pt being set to False.

@ezhvsalate If we set from_pt to True, that would require PyTorch in the backend to load the model. We don’t support that yet.

Maybe it would be possible to add an optional parameter indicating whether the loaded model was saved as a PyTorch checkpoint? The docs could then note that setting this parameter to True requires installing PyTorch. I tried it locally and it works: the RuBERT model was loaded. If this is OK, I’ll create a pull request.

@ezhvsalate You can also convert the PyTorch checkpoint into a compatible TensorFlow checkpoint using this script and then load the model: transformers/convert_pytorch_checkpoint_to_tf2.py at master · huggingface/transformers · GitHub


How can I load different variants of the same language model using the model_weights parameter?

Can you show how to do this in the pipeline?

@dakshvar22

For example, if the variant you want to use is bert-base-uncased, then your pipeline would look something like this:

pipeline:
   - name: HFTransformersNLP
     model_name: "bert"
     model_weights: "bert-base-uncased"
   - name: LanguageModelTokenizer
   - name: LanguageModelFeaturizer
   - name: DIETClassifier

If you want to load the model weights from a HuggingFace-compatible model checkpoint stored locally, you can pass its path as the value of the model_weights parameter.


@dakshvar22 For loading a local model, what parameter should I use?

Use the path to the directory containing the model checkpoint:

pipeline:
   - name: HFTransformersNLP
     model_name: "bert"
     model_weights: "path/to/your/model"
   - name: LanguageModelTokenizer
   - name: LanguageModelFeaturizer
   - name: DIETClassifier