Problem when using transformer in NLU pipeline

Hi, I am trying to add an Arabic BERT model to my NLU pipeline, but the debug messages state that Rasa does not use one of the layers of the pre-trained model during training:

2021-11-07 16:10:15 INFO transformers.modeling_tf_utils - Layers from pretrained model not used in TFBertModel: ['mlm___cls']

Is this an error, or do you have any idea how to solve it? My config and the full training log follow.

version: "2.0"
language: ar
pipeline:
  - name: WhitespaceTokenizer
  - name: LanguageModelFeaturizer
    model_name: "bert"
    model_weights: "asafaya/bert-base-arabic"
  - name: DIETClassifier
    epochs: 200
  - name: FallbackClassifier
    threshold: 0.7
  - name: DucklingEntityExtractor
    url: http://localhost:8000
    dimensions:
    - time
    - number
  - name: EntitySynonymMapper
policies:
- name: AugmentedMemoizationPolicy
- name: TEDPolicy
  epochs: 40
- name: RulePolicy
  core_fallback_threshold: 0.4
  core_fallback_action_name: "action_default_fallback"
  enable_fallback_prediction: True
(financial_env) dell@dell-lin:~/Desktop/financial_chatbots/financial-demo-Arabic$ rasa train
2021-11-07 16:09:56 INFO     rasa.model  - Data (domain) for Core model section changed.
2021-11-07 16:09:56 INFO     rasa.model  - Data (nlu-config) for NLU model section changed.
Training NLU model...
2021-11-07 16:09:59 INFO     transformers.file_utils  - TensorFlow version 2.6.0 available.
2021-11-07 16:10:00 INFO     transformers.tokenization_utils  - Model name 'asafaya/bert-base-arabic' not found in model shortcut name list (bert-base-uncased, bert-large-uncased, bert-base-cased, bert-large-cased, bert-base-multilingual-uncased, bert-base-multilingual-cased, bert-base-chinese, bert-base-german-cased, bert-large-uncased-whole-word-masking, bert-large-cased-whole-word-masking, bert-large-uncased-whole-word-masking-finetuned-squad, bert-large-cased-whole-word-masking-finetuned-squad, bert-base-cased-finetuned-mrpc, bert-base-german-dbmdz-cased, bert-base-german-dbmdz-uncased, TurkuNLP/bert-base-finnish-cased-v1, TurkuNLP/bert-base-finnish-uncased-v1, wietsedv/bert-base-dutch-cased). Assuming 'asafaya/bert-base-arabic' is a path, a model identifier, or url to a directory containing tokenizer files.
2021-11-07 16:10:03 INFO     transformers.tokenization_utils  - loading file https://s3.amazonaws.com/models.huggingface.co/bert/asafaya/bert-base-arabic/vocab.txt from cache at /home/dell/.cache/torch/transformers/1f0caadd43445032cdd25fb630a643ba7e6d9c0549d891c60566b13c0124f700.70499f9363142415275a4f221a36f914d2f3b073fb026c85672f1b5a5611f1b6
2021-11-07 16:10:03 INFO     transformers.tokenization_utils  - loading file https://s3.amazonaws.com/models.huggingface.co/bert/asafaya/bert-base-arabic/added_tokens.json from cache at None
2021-11-07 16:10:03 INFO     transformers.tokenization_utils  - loading file https://s3.amazonaws.com/models.huggingface.co/bert/asafaya/bert-base-arabic/special_tokens_map.json from cache at /home/dell/.cache/torch/transformers/a5073fc31f2e0d31a383002819d229fd5925660fdbc13b0c9a9e654a7c44d9db.275045728fbf41c11d3dae08b8742c054377e18d92cc7b72b6351152a99b64e4
2021-11-07 16:10:03 INFO     transformers.tokenization_utils  - loading file https://s3.amazonaws.com/models.huggingface.co/bert/asafaya/bert-base-arabic/tokenizer_config.json from cache at /home/dell/.cache/torch/transformers/22ef2bf36f103972615fc423be82f54a3cabed445a3815b65959e502e3db8df2.73a933aa27255ce576c445dcdb8155b6edb6e4c43cceb14b4b81f9e699a818b7
2021-11-07 16:10:04 INFO     transformers.configuration_utils  - loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/asafaya/bert-base-arabic/config.json from cache at /home/dell/.cache/torch/transformers/667afd39ed2586647499009bacd41114de05559229bdd1bf2001b9c22df3fa40.36d80aeef08d09f22ce3f578b0cb84fb7182c1355da222078b6579a8fb4b4d77
2021-11-07 16:10:04 INFO     transformers.configuration_utils  - Model config BertConfig {
  "architectures": [
    "BertModel"
  ],
  "attention_probs_dropout_prob": 0.1,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "output_past": true,
  "pad_token_id": 0,
  "type_vocab_size": 2,
  "vocab_size": 32000
}

2021-11-07 16:10:08 INFO     transformers.modeling_tf_utils  - loading weights file https://cdn.huggingface.co/asafaya/bert-base-arabic/tf_model.h5 from cache at /home/dell/.cache/torch/transformers/31441b079fd49383f7df4bcbdf3d05831ddb410df20ab836ba659fa0eb9de2f7.a1dcb45cbaa9cacb3045ac20f9d3af709f5b53f7d286b605af4df7ff410e950c.h5
2021-11-07 16:10:15 INFO     transformers.modeling_tf_utils  - Layers from pretrained model not used in TFBertModel: ['mlm___cls']
2021-11-07 16:10:15 INFO     rasa.nlu.components  - Added 'LanguageModelFeaturizer' to component cache. Key 'LanguageModelFeaturizer-bert-0a5bd334e9527259f7125139499738a3'.
/home/dell/Desktop/financial_chatbots/financial_env/lib/python3.8/site-packages/rasa/utils/train_utils.py:641: UserWarning: constrain_similarities is set to `False`. It is recommended to set it to `True` when using cross-entropy loss. It will be set to `True` by default, Rasa Open Source 3.0.0 onwards.
  rasa.shared.utils.io.raise_warning(
2021-11-07 16:10:15 INFO     rasa.shared.nlu.training_data.training_data  - Training data stats:
2021-11-07 16:10:15 INFO     rasa.shared.nlu.training_data.training_data  - Number of intent examples: 560 (20 distinct intents)

2021-11-07 16:10:15 INFO     rasa.shared.nlu.training_data.training_data  -   Found intents: 'affirm', 'inform', 'pay_cc', 'thankyou', 'check_earnings', 'trigger_handoff', 'search_transactions', 'check_recipients', 'handoff', 'check_balance', 'ask_transfer_charge', 'check_human', 'human_handoff', 'nlu_fallback', 'transfer_money', 'out_of_scope', 'deny', 'goodbye', 'help', 'greet'
2021-11-07 16:10:15 INFO     rasa.shared.nlu.training_data.training_data  - Number of response examples: 0 (0 distinct responses)
2021-11-07 16:10:15 INFO     rasa.shared.nlu.training_data.training_data  - Number of entity examples: 123 (5 distinct entities)
2021-11-07 16:10:15 INFO     rasa.shared.nlu.training_data.training_data  -   Found entity types: 'vendor_name', 'account_type', 'amount-of-money', 'PERSON', 'credit_card'
2021-11-07 16:10:15 INFO     rasa.nlu.model  - Starting to train component WhitespaceTokenizer
2021-11-07 16:10:15 INFO     rasa.nlu.model  - Finished training component.
2021-11-07 16:10:15 INFO     rasa.nlu.model  - Starting to train component RegexFeaturizer
2021-11-07 16:10:15 INFO     rasa.nlu.model  - Finished training component.
2021-11-07 16:10:15 INFO     rasa.nlu.model  - Starting to train component LexicalSyntacticFeaturizer
2021-11-07 16:10:16 INFO     rasa.nlu.model  - Finished training component.
2021-11-07 16:10:16 INFO     rasa.nlu.model  - Starting to train component CountVectorsFeaturizer
2021-11-07 16:10:16 INFO     rasa.nlu.featurizers.sparse_featurizer.count_vectors_featurizer  - 632 vocabulary items were created for text attribute.
2021-11-07 16:10:16 INFO     rasa.nlu.model  - Finished training component.
2021-11-07 16:10:16 INFO     rasa.nlu.model  - Starting to train component CountVectorsFeaturizer
2021-11-07 16:10:16 INFO     rasa.nlu.featurizers.sparse_featurizer.count_vectors_featurizer  - 4202 vocabulary items were created for text attribute.
2021-11-07 16:10:16 INFO     rasa.nlu.model  - Finished training component.
2021-11-07 16:10:16 INFO     rasa.nlu.model  - Starting to train component LanguageModelFeaturizer
/home/dell/Desktop/financial_chatbots/financial_env/lib/python3.8/site-packages/rasa/nlu/featurizers/dense_featurizer/lm_featurizer.py:521: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
  return np.array(nonpadded_sequence_embeddings)
2021-11-07 16:10:27 INFO     rasa.nlu.model  - Finished training component.
2021-11-07 16:10:27 INFO     rasa.nlu.model  - Starting to train component DIETClassifier
/home/dell/Desktop/financial_chatbots/financial_env/lib/python3.8/site-packages/rasa/utils/tensorflow/model_data_utils.py:395: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
  np.array([v[0] for v in values]), number_of_dimensions=3
/home/dell/Desktop/financial_chatbots/financial_env/lib/python3.8/site-packages/rasa/utils/tensorflow/model_data.py:750: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
  np.concatenate(np.array(f)),
Epochs:   0%|                                                                                                                                                                             | 0/100 [00:00<?, ?it/s]/home/dell/Desktop/financial_chatbots/financial_env/lib/python3.8/site-packages/tensorflow/python/framework/indexed_slices.py:447: UserWarning: Converting sparse IndexedSlices(IndexedSlices(indices=Tensor("gradients/cond_grad/gradients/cond/GatherV2_grad/Reshape_1:0", shape=(None,), dtype=int32), values=Tensor("gradients/cond_grad/gradients/cond/GatherV2_grad/Reshape:0", shape=(None,), dtype=float32), dense_shape=Tensor("gradients/cond_grad/gradients/cond/GatherV2_grad/Cast:0", shape=(1,), dtype=int32))) to a dense Tensor of unknown shape. This may consume a large amount of memory.
  warnings.warn(
Epochs: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 100/100 [02:14<00:00,  1.35s/it, t_loss=1.35, i_acc=0.999, e_f1=0.983]
2021-11-07 16:12:42 INFO     rasa.nlu.model  - Finished training component.
2021-11-07 16:12:42 INFO     rasa.nlu.model  - Starting to train component FallbackClassifier
2021-11-07 16:12:42 INFO     rasa.nlu.model  - Finished training component.
2021-11-07 16:12:42 INFO     rasa.nlu.model  - Starting to train component DucklingEntityExtractor
2021-11-07 16:12:42 INFO     rasa.nlu.model  - Finished training component.
2021-11-07 16:12:42 INFO     rasa.nlu.model  - Starting to train component EntitySynonymMapper
2021-11-07 16:12:42 INFO     rasa.nlu.model  - Finished training component.
2021-11-07 16:12:42 INFO     rasa.nlu.model  - Successfully saved model into '/tmp/tmp0v5mfan2/nlu'
NLU model training completed.
2021-11-07 16:12:44 INFO     transformers.tokenization_utils  - Model name 'asafaya/bert-base-arabic' not found in model shortcut name list (bert-base-uncased, bert-large-uncased, bert-base-cased, bert-large-cased, bert-base-multilingual-uncased, bert-base-multilingual-cased, bert-base-chinese, bert-base-german-cased, bert-large-uncased-whole-word-masking, bert-large-cased-whole-word-masking, bert-large-uncased-whole-word-masking-finetuned-squad, bert-large-cased-whole-word-masking-finetuned-squad, bert-base-cased-finetuned-mrpc, bert-base-german-dbmdz-cased, bert-base-german-dbmdz-uncased, TurkuNLP/bert-base-finnish-cased-v1, TurkuNLP/bert-base-finnish-uncased-v1, wietsedv/bert-base-dutch-cased). Assuming 'asafaya/bert-base-arabic' is a path, a model identifier, or url to a directory containing tokenizer files.
2021-11-07 16:12:47 INFO     transformers.tokenization_utils  - loading file https://s3.amazonaws.com/models.huggingface.co/bert/asafaya/bert-base-arabic/vocab.txt from cache at /home/dell/.cache/torch/transformers/1f0caadd43445032cdd25fb630a643ba7e6d9c0549d891c60566b13c0124f700.70499f9363142415275a4f221a36f914d2f3b073fb026c85672f1b5a5611f1b6
2021-11-07 16:12:47 INFO     transformers.tokenization_utils  - loading file https://s3.amazonaws.com/models.huggingface.co/bert/asafaya/bert-base-arabic/added_tokens.json from cache at None
2021-11-07 16:12:47 INFO     transformers.tokenization_utils  - loading file https://s3.amazonaws.com/models.huggingface.co/bert/asafaya/bert-base-arabic/special_tokens_map.json from cache at /home/dell/.cache/torch/transformers/a5073fc31f2e0d31a383002819d229fd5925660fdbc13b0c9a9e654a7c44d9db.275045728fbf41c11d3dae08b8742c054377e18d92cc7b72b6351152a99b64e4
2021-11-07 16:12:47 INFO     transformers.tokenization_utils  - loading file https://s3.amazonaws.com/models.huggingface.co/bert/asafaya/bert-base-arabic/tokenizer_config.json from cache at /home/dell/.cache/torch/transformers/22ef2bf36f103972615fc423be82f54a3cabed445a3815b65959e502e3db8df2.73a933aa27255ce576c445dcdb8155b6edb6e4c43cceb14b4b81f9e699a818b7
2021-11-07 16:12:48 INFO     transformers.configuration_utils  - loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/asafaya/bert-base-arabic/config.json from cache at /home/dell/.cache/torch/transformers/667afd39ed2586647499009bacd41114de05559229bdd1bf2001b9c22df3fa40.36d80aeef08d09f22ce3f578b0cb84fb7182c1355da222078b6579a8fb4b4d77
2021-11-07 16:12:48 INFO     transformers.configuration_utils  - Model config BertConfig {
  "architectures": [
    "BertModel"
  ],
  "attention_probs_dropout_prob": 0.1,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "output_past": true,
  "pad_token_id": 0,
  "type_vocab_size": 2,
  "vocab_size": 32000
}

2021-11-07 16:12:51 INFO     transformers.modeling_tf_utils  - loading weights file https://cdn.huggingface.co/asafaya/bert-base-arabic/tf_model.h5 from cache at /home/dell/.cache/torch/transformers/31441b079fd49383f7df4bcbdf3d05831ddb410df20ab836ba659fa0eb9de2f7.a1dcb45cbaa9cacb3045ac20f9d3af709f5b53f7d286b605af4df7ff410e950c.h5
2021-11-07 16:13:09 INFO     transformers.modeling_tf_utils  - Layers from pretrained model not used in TFBertModel: ['mlm___cls']
2021-11-07 16:13:09 INFO     rasa.nlu.components  - Added 'LanguageModelFeaturizer' to component cache. Key 'LanguageModelFeaturizer-bert-0a5bd334e9527259f7125139499738a3'.
Training Core model...
Processed story blocks: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 45/45 [00:00<00:00, 1351.29it/s, # trackers=1]
Processed story blocks: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 45/45 [00:01<00:00, 40.96it/s, # trackers=34]
Processed story blocks: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 45/45 [00:01<00:00, 25.89it/s, # trackers=50]
Processed story blocks: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 45/45 [00:01<00:00, 22.61it/s, # trackers=50]
Processed rules: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 22/22 [00:00<00:00, 2086.67it/s, # trackers=1]
Processed trackers: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 34/34 [00:00<00:00, 1524.09it/s, # action=132]
Processed actions: 132it [00:00, 3526.76it/s, # examples=132]
Processed trackers: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 534/534 [00:02<00:00, 256.03it/s, # action=4526]
Epochs:   0%|                                                                                                                                                                              | 0/40 [00:00<?, ?it/s]/home/dell/Desktop/financial_chatbots/financial_env/lib/python3.8/site-packages/tensorflow/python/framework/indexed_slices.py:447: UserWarning: Converting sparse IndexedSlices(IndexedSlices(indices=Tensor("gradients/cond_grad/Identity_1:0", shape=(None,), dtype=int64), values=Tensor("gradients/cond_grad/Identity:0", shape=(None,), dtype=float32), dense_shape=Tensor("gradients/cond_grad/Identity_2:0", shape=(1,), dtype=int32))) to a dense Tensor of unknown shape. This may consume a large amount of memory.
  warnings.warn(
/home/dell/Desktop/financial_chatbots/financial_env/lib/python3.8/site-packages/tensorflow/python/framework/indexed_slices.py:447: UserWarning: Converting sparse IndexedSlices(IndexedSlices(indices=Tensor("gradients/cond_1_grad/Identity_1:0", shape=(None,), dtype=int64), values=Tensor("gradients/cond_1_grad/Identity:0", shape=(None,), dtype=float32), dense_shape=Tensor("gradients/cond_1_grad/Identity_2:0", shape=(1,), dtype=int32))) to a dense Tensor of unknown shape. This may consume a large amount of memory.
  warnings.warn(
/home/dell/Desktop/financial_chatbots/financial_env/lib/python3.8/site-packages/tensorflow/python/framework/indexed_slices.py:447: UserWarning: Converting sparse IndexedSlices(IndexedSlices(indices=Tensor("gradients/cond_2_grad/Identity_1:0", shape=(None,), dtype=int64), values=Tensor("gradients/cond_2_grad/Identity:0", shape=(None,), dtype=float32), dense_shape=Tensor("gradients/cond_2_grad/Identity_2:0", shape=(1,), dtype=int32))) to a dense Tensor of unknown shape. This may consume a large amount of memory.
  warnings.warn(
/home/dell/Desktop/financial_chatbots/financial_env/lib/python3.8/site-packages/tensorflow/python/framework/indexed_slices.py:447: UserWarning: Converting sparse IndexedSlices(IndexedSlices(indices=Tensor("gradients/cond_3_grad/Identity_1:0", shape=(None,), dtype=int64), values=Tensor("gradients/cond_3_grad/Identity:0", shape=(None,), dtype=float32), dense_shape=Tensor("gradients/cond_3_grad/Identity_2:0", shape=(1,), dtype=int32))) to a dense Tensor of unknown shape. This may consume a large amount of memory.
  warnings.warn(
Epochs: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 40/40 [05:19<00:00,  7.99s/it, t_loss=3.01, loss=2.7, acc=0.956]
Processed trackers: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 21/21 [00:00<00:00, 3389.14it/s, # action=51]
Processed actions: 51it [00:00, 20731.68it/s, # examples=45]
Processed trackers: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 34/34 [00:00<00:00, 1418.12it/s, # action=161]
Processed trackers: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 21/21 [00:00<00:00, 1660.08it/s]
Processed trackers: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 55/55 [00:00<00:00, 562.47it/s]
2021-11-07 16:19:54 INFO     rasa.core.agent  - Persisted model to '/tmp/tmp0v5mfan2/core'
Core model training completed.
Your Rasa model is trained and saved at '/home/dell/Desktop/financial_chatbots/financial-demo-Arabic/models/20211107-161956.tar.gz'.

@Pain, I am not an expert in Arabic pipelines, but I guess you have already seen this; if not, please take a look.

For example, let’s say you’ve found this Arabic model and you’re interested in using it. It’s a model based on the BERT architecture, so the configuration for Rasa would be:

- name: LanguageModelFeaturizer
  model_name: bert
  model_weights: asafaya/bert-base-arabic

From here, Rasa will download the model on your behalf automatically. There are many BERT models that Rasa supports via this route. The main thing to keep in mind is that BERT models tend to require a lot of computing resources to run. As a best practice, we recommend properly benchmarking the pipeline to make sure the accuracy gain is worth the compute cost of these models. It’s certainly possible that adding BERT to a pipeline makes performance worse due to overfitting.
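
One practical way to benchmark is to run rasa test nlu --cross-validation once with and once without the LanguageModelFeaturizer, writing the results to separate output directories, and then compare the intent reports. A minimal sketch, assuming the default intent_report.json output format and hypothetical results_bert/ and results_sparse/ directories:

    import json

    def weighted_f1(report_path: str) -> float:
        # intent_report.json follows sklearn's classification_report layout,
        # so the weighted-average F1 sits under "weighted avg".
        with open(report_path) as f:
            report = json.load(f)
        return report["weighted avg"]["f1-score"]

    print("with BERT:   ", weighted_f1("results_bert/intent_report.json"))
    print("without BERT:", weighted_f1("results_sparse/intent_report.json"))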

Ref Link: Non-English Tools for Rasa NLU | The Rasa Blog | Rasa

OR

While using Hugging Face: How to Use BERT in Rasa NLU | The Rasa Blog | Rasa

OR

While using spaCy: https://github.com/hashirabdulbasheer/rasa-financial-assistant-arabic-demo/blob/main/config.yml

I hope this will help you further.


Thanks a lot. I have seen all of these, but my question is about that info in the debug output.

Let me ask a more general question: how can I be sure that the NLU pipeline uses the BERT model I provided and not the default one?

Your logs list this line 🙂

2021-11-07 16:10:08 INFO     transformers.modeling_tf_utils  - loading weights file https://cdn.huggingface.co/asafaya/bert-base-arabic/tf_model.h5 from cache at /home/dell/.cache/torch/transformers/31441b079fd49383f7df4bcbdf3d05831ddb410df20ab836ba659fa0eb9de2f7.a1dcb45cbaa9cacb3045ac20f9d3af709f5b53f7d286b605af4df7ff410e950c.h5

So you can confirm that it’s loading in the asafaya/bert-base-arabic weights.
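
As for the 'mlm___cls' message itself, it is expected: that layer is the masked-language-modelling head of the checkpoint, and TFBertModel (the bare encoder Rasa uses for featurization) deliberately drops it because only the hidden states are needed. You can reproduce the message outside Rasa; a minimal sketch, assuming a reasonably recent transformers release (the exact wording of the message varies between versions):

    from transformers import BertTokenizer, TFBertModel

    # Loading the bare encoder discards the pretraining head, which is what
    # triggers the "layers not used" message in your log.
    tokenizer = BertTokenizer.from_pretrained("asafaya/bert-base-arabic")
    model = TFBertModel.from_pretrained("asafaya/bert-base-arabic")

    inputs = tokenizer("أريد تحويل الأموال", return_tensors="tf")
    outputs = model(inputs)
    print(outputs[0].shape)  # (1, sequence_length, 768), matching hidden_size in the config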


Perfect, thanks.

So, is this an indication that the NLU pipeline works with asafaya/bert-base-arabic?

I mean, how can I be 100% sure that the NLU pipeline uses this language model?

Let’s turn the question around. If it’s in the configuration file, it will be used. This is confirmed by the logs.
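
If you want extra reassurance, you can also compare tokenizers directly: a pipeline that had silently fallen back to an English checkpoint would shred Arabic text into [UNK] or character-level fragments. A hedged sketch (bert-base-uncased is used here purely as an English contrast, not as a claim about what Rasa would fall back to):

    from transformers import BertTokenizer

    arabic = BertTokenizer.from_pretrained("asafaya/bert-base-arabic")
    english = BertTokenizer.from_pretrained("bert-base-uncased")  # contrast only

    text = "أريد تحويل الأموال"
    print(arabic.tokenize(text))   # meaningful Arabic word pieces
    print(english.tokenize(text))  # likely [UNK]s or coarse fragments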

What is making you doubt that this language model is used?

Ahh, perfect, just as I expected.

What makes me doubt it are the results of end-to-end testing. I am testing with different approaches, and the results seem the same to me.

So now I am testing the NLU pipeline separately, using cross-validation, to figure out the problem.

How much data do you have?

If you don’t have a lot of it, it doesn’t surprise me too much that Hugging Face models don’t contribute much to the overall performance. Instead, you’ll likely need to add more training data.

You may enjoy this talk, which explains the “don’t-need-heavy-models” phenomenon a bit more.

I just translated the data for the financial-demo chatbot from English to Arabic.

Of course I will watch it.