@koaning, adding BERT-based models works just fine. I've tried it with the following config:
```yml
language: si

pipeline:
  - name: "HFTransformersNLP"
    model_name: "roberta"
    model_weights: "keshan/SinhalaBERTo"
    cache_dir: "hf_lm_weights/bert_si"
  - name: "LanguageModelTokenizer"
  - name: "LanguageModelFeaturizer"
  - name: "LexicalSyntacticFeaturizer"
  - name: "CountVectorsFeaturizer"
  - name: "CountVectorsFeaturizer"
    analyzer: "char_wb"
    min_ngram: 1
    max_ngram: 4
  - name: "CountVectorsFeaturizer"
    analyzer: "char"
    min_ngram: 3
    max_ngram: 5
  - name: "DIETClassifier"
    entity_recognition: true
    epochs: 300
  - name: "EntitySynonymMapper"
  - name: "ResponseSelector"
    epochs: 300
    retrieval_intent: faq

policies:
  - name: RulePolicy
```
My question is: is it possible to attach an xlm-roberta-base model in the same way? If I want to add it to the pipeline via `LanguageModelFeaturizer`, how do I specify `model_name` and `model_weights`? That's where I'm stuck, because I couldn't find those parameters in the documentation for xlm-roberta-based models.
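For reference, this is roughly what I would guess the config should look like. The `model_name: "xlm_roberta"` value and the weights string below are only my assumption based on how the `roberta` entry works above; I haven't found them confirmed anywhere in the docs, which is exactly what I'm asking about:

```yml
# Sketch only -- I'm not sure "xlm_roberta" is an accepted model_name.
pipeline:
  - name: "LanguageModelFeaturizer"
    model_name: "xlm_roberta"          # assumption, may not be supported
    model_weights: "xlm-roberta-base"  # Hugging Face checkpoint I'd like to use
    cache_dir: "hf_lm_weights/xlmr_si"
```

If that guess is wrong, what would the correct values be?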