Support for Language Models inside Rasa

@koaning, adding BERT-based models works just fine. I’ve tried it with the following config:

language: si

pipeline:
  - name: "HFTransformersNLP"
    model_name: "roberta"
    model_weights: "keshan/SinhalaBERTo"
    cache_dir: "hf_lm_weights/bert_si"
  - name: "LanguageModelTokenizer"
  - name: "LanguageModelFeaturizer"
  - name: "LexicalSyntacticFeaturizer"
  - name: "CountVectorsFeaturizer"
  - name: "CountVectorsFeaturizer"
    analyzer: "char_wb"
    min_ngram: 1
    max_ngram: 4
  - name: "CountVectorsFeaturizer"
    analyzer: "char"
    min_ngram: 3
    max_ngram: 5
  - name: "DIETClassifier"
    entity_recognition: true
    epochs: 300
  - name: "EntitySynonymMapper"
  - name: "ResponseSelector"
    epochs: 300
    retrieval_intent: faq

policies:
  - name: RulePolicy
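
For comparison, here is a minimal sketch of the same language-model setup with the weights passed to LanguageModelFeaturizer directly. It assumes Rasa 2.1 or later, where, as far as I understand, HFTransformersNLP and LanguageModelTokenizer are deprecated and a regular tokenizer such as WhitespaceTokenizer is placed before the featurizer. I have not tested this variant; the parameter values are simply copied from the config above.

```yaml
language: si

pipeline:
  # Assumption: Rasa 2.1+, where a regular tokenizer precedes the featurizer
  - name: "WhitespaceTokenizer"
  # Assumption: the featurizer accepts the same three parameters as HFTransformersNLP
  - name: "LanguageModelFeaturizer"
    model_name: "roberta"
    model_weights: "keshan/SinhalaBERTo"
    cache_dir: "hf_lm_weights/bert_si"
```

The remaining components (CountVectorsFeaturizer, DIETClassifier, and so on) would stay as they are.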

My question is: is it possible to attach the xlm-roberta-base model in the same way? If I want to add it to the pipeline via LanguageModelFeaturizer, how should I specify model_name and model_weights? That’s where I’m stuck, because I couldn’t find those parameters in the documentation for XLM-RoBERTa-based models.