Clarification on Model Weights

I am currently using a pipeline something like this:

- name: HFTransformersNLP
  model_name: "bert"
  model_weights: "rasa/LaBSE"
  cache_dir: /tmp
- name: LanguageModelFeaturizer
  model_name: "bert"
  model_weights: "rasa/LaBSE"
  cache_dir: /tmp
  alias: LMF
- name: "LanguageModelTokenizer"
  "intent_tokenization_flag": False
  "intent_split_symbol": "_"
- name: RegexFeaturizer
- name: CountVectorsFeaturizer
  alias: CVF
  analyzer: char_wb
  min_ngram: 1
  max_ngram: 4
  "use_shared_vocab": True
- name: DIETClassifier
  batch_strategy: balanced
  intent_split_symbol: +
  intent_tokenization_flag: True
  epochs: 300
  batch_size: 50
- name: CRFEntityExtractor
- name: EntitySynonymMapper
- name: ResponseSelector
  featurizers: {CVF, LMF}
  epochs: 300
  retrieval_intent: faq
- name: ResponseSelector
  featurizers: {CVF, LMF}
  epochs: 300
  retrieval_intent: chitchat
- name: FallbackClassifier
  threshold: 0.4
  ambiguity_threshold: 0.1

Is it mandatory to use:

model_weights: "rasa/LaBSE"

or can I cherry-pick from:

for example:

model_weights: "bert-large-uncased"

and which one would be better to use?



Hi @mfkarch!

You’re not constrained to model_weights: "rasa/LaBSE"; that’s just the default value for bert. You can specify whichever model_weights you’d like to use in your configuration. The options are listed here.
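As a rough sketch of what that swap could look like, here are the two components from the pipeline above that take model_weights, with the value changed to the "bert-large-uncased" weights mentioned in the question. This assumes the rest of the pipeline stays as posted and that both entries should point at the same weights; it is an illustration, not a recommended setup.

```yaml
# Sketch only: the same two components as in the question,
# with model_weights switched from the default "rasa/LaBSE".
- name: HFTransformersNLP
  model_name: "bert"                  # architecture family stays "bert"
  model_weights: "bert-large-uncased" # weights named in the question
  cache_dir: /tmp
- name: LanguageModelFeaturizer
  model_name: "bert"
  model_weights: "bert-large-uncased" # assumed to match the entry above
  cache_dir: /tmp
  alias: LMF
```

The model_name and model_weights values need to belong together, so BERT weights go with model_name: "bert".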

As for which one is better to use: that depends on your data! "rasa/LaBSE", for example, provides language-agnostic embeddings, which may help your model generalize. You can try out a couple of different options and see how they affect your model’s performance, but it’s worth noting that other factors will probably have a greater impact on that performance. Check out one of our Algorithm Whiteboard videos on the topic here!


Thanks! I’ll surely look into other model weights as well. Since you are here, could you answer this as well? It’s kind of important and I am stuck there, as I’m not at all familiar with the deployment stuff.