Hey, the folks at PolyAI recently open-sourced a new sentence encoding model called ConveRT. It is pretrained on a large conversational dataset and hence claims better conversational representations than traditional large language models like BERT. The idea resonates very well with what we also believe and have been working on internally at Rasa. Since they released it as a TFHub model, we built a quick featurizer on top of it to extract representations and use them with the downstream intent classification models that already exist inside Rasa.
In our internal tests, the model does give a significant boost to intent classification accuracy on multiple datasets. We would love the community to try it out and share evaluation numbers on their test sets.
We have released the featurizer as part of Rasa 1.5.0. You can try it by installing Rasa with pip:
pip install rasa==1.5.0
The featurizer uses an optional dependency, tensorflow_text. Install it with:
pip install --no-deps tensorflow_text==1.15.1
Now you are ready to use the ConveRTFeaturizer. In the project directory, we recommend using a config along these lines:
language: en
pipeline:
  - name: WhitespaceTokenizer
  - name: ConveRTFeaturizer
  - name: EmbeddingIntentClassifier
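If you prefer to generate this config from a script (for example, to compare several pipelines), here is a minimal sketch in plain Python. The helper names `render_config` and `write_config` are made up for illustration and are not part of Rasa; the component names and their order are taken from the config above.

```python
from pathlib import Path

# Component order from the post: tokenize, extract ConveRT features, classify.
PIPELINE = ["WhitespaceTokenizer", "ConveRTFeaturizer", "EmbeddingIntentClassifier"]


def render_config(language="en", pipeline=PIPELINE):
    """Return the YAML text for a pipeline config (hypothetical helper)."""
    lines = [f"language: {language}", "pipeline:"]
    lines += [f"  - name: {name}" for name in pipeline]
    return "\n".join(lines) + "\n"


def write_config(path="config.yml"):
    """Write the rendered config to disk and return its path (hypothetical helper)."""
    target = Path(path)
    target.write_text(render_config(), encoding="utf-8")
    return target
```

Calling `write_config()` in the project directory would create a `config.yml` with the same three components as above.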
This uses the ConveRT model purely as a feature extractor; we do not fine-tune it along with our intent classifier. Please note that this model can only be used with English-language datasets. Feel free to change the config params of EmbeddingIntentClassifier according to your dataset. It would be great if you could share the evaluation metrics on your test dataset here.
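If you want to report comparable numbers, accuracy and macro-averaged F1 over intents are a reasonable pair to share. Below is a rough, standalone sketch of how those two could be computed from gold and predicted intent labels. The function name `intent_metrics` and the toy labels are assumptions for illustration only; this is not how Rasa computes its own evaluation reports.

```python
from collections import Counter


def intent_metrics(gold, predicted):
    """Compute accuracy and macro-averaged F1 for intent labels (illustrative helper)."""
    if not gold or len(gold) != len(predicted):
        raise ValueError("gold and predicted must be non-empty and of equal length")

    # Accuracy: fraction of examples whose predicted intent matches the gold intent.
    accuracy = sum(g == p for g, p in zip(gold, predicted)) / len(gold)

    # Per-intent true positives, false positives and false negatives.
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1
            fn[g] += 1

    # Macro F1: unweighted mean of per-intent F1 = 2*TP / (2*TP + FP + FN).
    f1_scores = []
    for label in set(gold) | set(predicted):
        denom = 2 * tp[label] + fp[label] + fn[label]
        f1_scores.append(2 * tp[label] / denom if denom else 0.0)

    return {"accuracy": accuracy, "macro_f1": sum(f1_scores) / len(f1_scores)}


if __name__ == "__main__":
    # Toy labels, made up purely to show the call.
    gold = ["greet", "greet", "bye", "bye"]
    predicted = ["greet", "bye", "bye", "bye"]
    print(intent_metrics(gold, predicted))
```

Macro averaging weights every intent equally, so it highlights weak performance on rare intents that plain accuracy can hide.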