Hi @ganbaa_elmer, I think the error you’re seeing comes from the way the model name and weights are mapped to the corresponding Hugging Face classes. I tested this with Rasa 3.0.2 and the config sketched below, and I get an error as well. If you’re using a different Rasa version, the concrete reason might be different though.
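For reference, the relevant part of the pipeline config I tested looked roughly like this (a sketch reconstructed from your model id, using the standard LanguageModelFeaturizer options):

```yaml
pipeline:
  - name: LanguageModelFeaturizer
    model_name: bert
    model_weights: tugstugi/bert-base-mongolian-uncased
```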
According to here, if you specify model: bert, Rasa tries to initialize a BertTokenizer from the given weights (in your case tugstugi/bert-base-mongolian-uncased). However, you can check which tokenizer this model actually uses directly in HF transformers:
```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("tugstugi/bert-base-mongolian-uncased")
print(type(tok))
```
The printed type is not a BertTokenizer, so there seems to be a mismatch between the tokenizer the model actually uses and the one Rasa is trying to load. Since the mapping from model name to tokenizer class is hard-coded, I think it is currently only possible to use Bert models that also use the BertTokenizer. This is not transparent from the documentation and hard to see on the HF model hub. I would suggest opening a ticket to improve the documentation on that.
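If you want to check up front whether a given set of weights is compatible, a quick sketch using only transformers (the model id is the one from your config) is to verify that AutoTokenizer resolves to a BertTokenizer:

```python
from transformers import AutoTokenizer, BertTokenizer, BertTokenizerFast

weights = "tugstugi/bert-base-mongolian-uncased"
tok = AutoTokenizer.from_pretrained(weights)

# Rasa's hard-coded mapping expects a BertTokenizer for model: bert,
# so any other tokenizer class will lead to the mismatch described above.
print(isinstance(tok, (BertTokenizer, BertTokenizerFast)))
```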
As an alternative, if you’re looking for dense embeddings in Mongolian, you could also try using the BytePairFeaturizer from rasa-nlu-examples, which has a Mongolian model of dense sub-word embeddings. See here for installation and usage instructions.
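A config sketch based on the rasa-nlu-examples documentation (the vs/dim values here are just an example; pick a vocabulary size and embedding dimension that are actually available for Mongolian):

```yaml
pipeline:
  - name: rasa_nlu_examples.featurizers.dense.BytePairFeaturizer
    lang: mn
    vs: 10000
    dim: 100
```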
This should be independent of Rasa 2 vs. Rasa 3, since the way the Hugging Face models are integrated did not change afaik. The ones you listed unfortunately did not work for me: they either use a tokenizer other than the standard mapped one, or they don’t have a pretrained TensorFlow model available (which is what Rasa uses), only PyTorch.
However, have you already tried the default multilingual Bert model and tested it for your use case?
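In case you haven’t, that would just mean swapping the weights for the public multilingual ones, which do use the standard BertTokenizer and ship TensorFlow weights:

```yaml
pipeline:
  - name: LanguageModelFeaturizer
    model_name: bert
    model_weights: bert-base-multilingual-cased
```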
Hi, I wasn’t able to reproduce the exact same error, but got a different one, which most likely happens because the model has no TensorFlow weights included, just PyTorch (as can be seen in the tags on the HF model page). There is already a ticket open here on improving the documentation of LanguageModelFeaturizer, so that it becomes clearer which HF models can be used and how.
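You can verify the missing TensorFlow weights yourself with a quick check (a sketch; loading the TF class fails when the repo only ships PyTorch weights and no tf_model.h5):

```python
from transformers import TFAutoModel

# Raises an OSError if the repository contains no TensorFlow weights
# (tf_model.h5), which is what Rasa needs in order to load the model.
model = TFAutoModel.from_pretrained("tugstugi/bert-base-mongolian-uncased")
```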
In the meantime, so that you can keep making progress, you could try the BytePairFeaturizer mentioned above in your pipeline, which also provides dense subword embeddings for Mongolian.
I don’t work at Rasa, I’m just a volunteer moderator on the forum.
If you’re sure this is an issue from Rasa and not from your side, you can open an issue on GitHub. If you do so, please post the link to that issue here so that people can follow it in the future.