Can't load German BERT models from Hugging Face

Hi Rasa community,

I’m using Rasa to build a German-language bot and want to try out BERT in LanguageModelFeaturizer. Of the models listed at https://huggingface.co/transformers/pretrained_models.html, “bert-base-german-cased” works well.

However, “bert-base-german-dbmdz-cased”, “bert-base-german-dbmdz-uncased” and “distilbert-base-german-cased” do not work and raise OSErrors such as:

OSError: Can’t load weights for ‘distilbert-base-german-cased’. Make sure that:

  • ‘distilbert-base-german-cased’ is a correct model identifier listed on ‘https://huggingface.co/models’
  • or ‘distilbert-base-german-cased’ is the correct path to a directory containing a file named one of tf_model.h5, pytorch_model.bin.

OSError: Couldn’t reach server at ‘https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-german-dbmdz-cased-config.json’ to download configuration file or configuration file is not a valid JSON file.

Is something wrong here? Has anyone else faced the same issue? Or are those models still not supported by Rasa?

Besides, I saw some XLM models that support German, like “xlm-mlm-ende-1024”, but XLM is not listed in the documentation of LanguageModelFeaturizer. So is it still not supported?

Thanks in advance:)

It seems that the weights for these models are only compatible with PyTorch. See https://huggingface.co/dbmdz/bert-base-german-cased:

Currently only PyTorch-Transformers compatible weights are available. If you need access to TensorFlow checkpoints, please raise an issue!

So unfortunately you cannot use those models in our pipeline, as we are using TensorFlow.
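If you want to check this yourself before putting a model into the pipeline, here is a rough sketch that looks at which weight files a model repository contains. It only checks for the two file names mentioned in the error message above (tf_model.h5 and pytorch_model.bin). The helper names are made up for illustration, and it assumes the separate `huggingface_hub` package is installed and that its `list_repo_files` function can reach the model hub; it is not part of Rasa.

```python
# Map the weight file names from the OSError message to their framework.
WEIGHT_FILES = {"tf_model.h5": "tensorflow", "pytorch_model.bin": "pytorch"}


def frameworks_from_files(filenames):
    """Return the set of frameworks whose weight file appears in `filenames`."""
    present = set(filenames)
    return {fw for name, fw in WEIGHT_FILES.items() if name in present}


def list_model_files(model_id):
    """Fetch the file list of a model repo (hypothetical helper).

    Assumption: the optional `huggingface_hub` package is installed and the
    machine has network access. Imported lazily so the pure helper above
    works without it.
    """
    from huggingface_hub import list_repo_files

    return list_repo_files(model_id)


# Example (needs network access):
# files = list_model_files("dbmdz/bert-base-german-cased")
# print(frameworks_from_files(files))
```

If "tensorflow" is missing from the result, the model will most likely fail to load in a TensorFlow-based pipeline with an error like the one above.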

Besides, I saw some XLM models that support German, like “xlm-mlm-ende-1024”, but XLM is not listed in the documentation of LanguageModelFeaturizer. So is it still not supported?

Yes, this is still not supported.

Is it still not supported? I would like to use xlm-mlm-ende-1024 as well, but I couldn’t figure out how to set up the pipeline correctly.

Thanks!