Using the recently released Google BERT model MuRIL

I want to use the recently released MuRIL, Google's pre-trained BERT model for Indian languages. How can I do so?


Hi Mani,

There are a couple of posts that will walk you through the process: this blog post by @koaning and this forum post by @dakshvar22.


I have seen these posts; they talk about using BERT through the HuggingFace implementation. But the recently released MuRIL has not been added to the HuggingFace model repo yet. Is there any way to use it in Rasa?

Thank you.

Hi ManiNuthi, did you find a solution for this? I am trying to use MuRIL through the HuggingFace implementation as well, but I am unable to do so.

@ManiNuthi It looks like the architecture of the model does not differ from traditional BERT. That means you can convert your model [1, 2] from the TF Hub format to the HuggingFace format and then use it inside Rasa.
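Once converted, the local weights can be referenced from the NLU pipeline. A minimal sketch of the relevant `config.yml` section, assuming a Rasa 1.x pipeline with the `HFTransformersNLP` component and a local directory containing the converted HuggingFace files (`config.json`, `vocab.txt`, `tf_model.h5`) — the directory path is an assumption:

```yaml
pipeline:
  - name: HFTransformersNLP
    model_name: "bert"
    # Local directory with the converted MuRIL weights (path is an example)
    model_weights: "/home/mani/Downloads/muril-cased"
  - name: LanguageModelTokenizer
  - name: LanguageModelFeaturizer
```

Pointing `model_weights` at a local directory instead of a HuggingFace hub name is what lets you use a model that is not published on the hub yet.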

Thank you for the response. I gave the path to the model and faced this error:

```
2021-01-06 20:29:53 INFO transformers.modeling_tf_utils - loading weights file /home/mani/Downloads/muril-cased/tf_model.h5
2021-01-06 20:29:53.551466: E tensorflow/stream_executor/cuda/] failed call to cuInit: UNKNOWN ERROR (303)
Traceback (most recent call last):
  File "/home/mani/.local/bin/rasa", line 8, in
    sys.exit(main())
  File "/home/mani/.local/lib/python3.6/site-packages/rasa/", line 92, in main
    cmdline_arguments.func(cmdline_arguments)
  File "/home/mani/.local/lib/python3.6/site-packages/rasa/cli/", line 76, in train
    additional_arguments=extract_additional_arguments(args),
  File "/home/mani/.local/lib/python3.6/site-packages/rasa/", line 50, in train
    additional_arguments=additional_arguments,
  File "uvloop/loop.pyx", line 1456, in uvloop.loop.Loop.run_until_complete
  File "/home/mani/.local/lib/python3.6/site-packages/rasa/", line 101, in train_async
    additional_arguments,
  File "/home/mani/.local/lib/python3.6/site-packages/rasa/", line 188, in _train_async_internal
    additional_arguments=additional_arguments,
  File "/home/mani/.local/lib/python3.6/site-packages/rasa/", line 245, in _do_training
    persist_nlu_training_data=persist_nlu_training_data,
  File "/home/mani/.local/lib/python3.6/site-packages/rasa/", line 482, in _train_nlu_with_validated_data
    persist_nlu_training_data=persist_nlu_training_data,
  File "/home/mani/.local/lib/python3.6/site-packages/rasa/nlu/", line 75, in train
    trainer = Trainer(nlu_config, component_builder)
  File "/home/mani/.local/lib/python3.6/site-packages/rasa/nlu/", line 145, in init
    self.pipeline = self._build_pipeline(cfg, component_builder)
  File "/home/mani/.local/lib/python3.6/site-packages/rasa/nlu/", line 157, in _build_pipeline
    component = component_builder.create_component(component_cfg, cfg)
  File "/home/mani/.local/lib/python3.6/site-packages/rasa/nlu/", line 781, in create_component
    component = registry.create_component_by_config(component_config, cfg)
  File "/home/mani/.local/lib/python3.6/site-packages/rasa/nlu/", line 246, in create_component_by_config
    return component_class.create(component_config, config)
  File "/home/mani/.local/lib/python3.6/site-packages/rasa/nlu/", line 489, in create
    return cls(component_config)
  File "/home/mani/.local/lib/python3.6/site-packages/rasa/nlu/utils/hugging_face/", line 47, in init
    self._load_model()
  File "/home/mani/.local/lib/python3.6/site-packages/rasa/nlu/utils/hugging_face/", line 84, in _load_model
    self.model_weights, cache_dir=self.cache_dir
  File "/home/mani/.local/lib/python3.6/site-packages/transformers/", line 401, in from_pretrained
    model.load_weights(resolved_archive_file, by_name=True)
  File "/home/mani/.local/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/", line 234, in load_weights
    return super(Model, self).load_weights(filepath, by_name, skip_mismatch)
  File "/home/mani/.local/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/", line 1220, in load_weights
    f, self.layers, skip_mismatch=skip_mismatch)
  File "/home/mani/.local/lib/python3.6/site-packages/tensorflow_core/python/keras/saving/", line 761, in load_weights_from_hdf5_group_by_name
    str(len(weight_values)) + ' element(s).')
ValueError: Layer #0 (named "bert") expects 199 weight(s), but the saved weights have 197 element(s).
```
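A mismatch of exactly two weights (199 expected vs. 197 saved) often means the converted checkpoint is missing a pair of variables that the Keras `TFBertModel` builds, e.g. the pooler's kernel and bias — though that is a guess without inspecting the file. One way to check what the converted `tf_model.h5` actually contains is to list its stored weight names with `h5py` (a diagnostic sketch, not Rasa code; the path in the commented example is an assumption):

```python
import h5py


def list_saved_weights(h5_path):
    """Return the weight names stored in a Keras .h5 weights file.

    Keras groups saved weights by layer; each layer group carries a
    'weight_names' attribute listing its variable names.
    """
    names = []
    with h5py.File(h5_path, "r") as f:
        for layer in f.attrs.get("layer_names", []):
            layer = layer.decode() if isinstance(layer, bytes) else layer
            group = f[layer]
            for w in group.attrs.get("weight_names", []):
                names.append(w.decode() if isinstance(w, bytes) else w)
    return names


# Example usage (path is an assumption):
# for name in list_saved_weights("/home/mani/Downloads/muril-cased/tf_model.h5"):
#     print(name)
```

Comparing that list against the variables of a freshly built `TFBertModel` should show exactly which two weights (e.g. `bert/pooler/dense/kernel:0` and `bert/pooler/dense/bias:0`) are absent from the conversion.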

Is the model architecture exactly the same as traditional BERT?