Hi, I am trying to add an Arabic BERT model to the NLU pipeline, but I see that Rasa doesn't use one layer of the pre-trained model during training, as stated in the debugging messages:
2021-11-07 16:10:15 INFO transformers.modeling_tf_utils - Layers from pretrained model not used in TFBertModel: ['mlm___cls']
Is this an error, and do you have any idea how to solve it?
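In case it helps with debugging, here is a minimal sketch of how this message could be pulled out of a long training log together with its log level. It only uses the Python standard library. The function name `find_unused_layers` and the regular expression are my own assumptions based on the log format shown in this post; they are not part of Rasa or transformers.

```python
import ast
import re

# Assumed line format, taken from the log in this post:
# <date> <time> <LEVEL> <logger> - Layers from pretrained model not used in <Model>: [...]
UNUSED_LAYERS = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"(?P<logger>\S+)\s+-\s+"
    r"Layers from pretrained model not used in (?P<model>\w+): (?P<layers>\[.*\])\s*$"
)


def find_unused_layers(log_text):
    """Return one dict per 'Layers from pretrained model not used' line."""
    hits = []
    for line in log_text.splitlines():
        match = UNUSED_LAYERS.match(line.strip())
        if match:
            hits.append(
                {
                    "timestamp": match["timestamp"],
                    "level": match["level"],
                    "model": match["model"],
                    # The layer list is printed as a Python literal,
                    # so ast.literal_eval can turn it back into a list.
                    "layers": ast.literal_eval(match["layers"]),
                }
            )
    return hits


sample = """\
2021-11-07 16:10:15 INFO rasa.nlu.model - Finished training component.
2021-11-07 16:10:15 INFO transformers.modeling_tf_utils - Layers from pretrained model not used in TFBertModel: ['mlm___cls']
"""

for hit in find_unused_layers(sample):
    print(hit["timestamp"], hit["level"], hit["model"], hit["layers"])
```

Saving the output of `rasa train` to a file and passing its contents to `find_unused_layers` would show every occurrence of the message and the level it was logged at (INFO in the run here).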
version: "2.0"
language: ar
pipeline:
  - name: WhitespaceTokenizer
  - name: LanguageModelFeaturizer
    model_name: "bert"
    model_weights: "asafaya/bert-base-arabic"
  - name: DIETClassifier
    epochs: 200
  - name: FallbackClassifier
    threshold: 0.7
  - name: DucklingEntityExtractor
    url: http://localhost:8000
    dimensions:
      - time
      - number
  - name: EntitySynonymMapper
policies:
  - name: AugmentedMemoizationPolicy
  - name: TEDPolicy
    epochs: 40
  - name: RulePolicy
    core_fallback_threshold: 0.4
    core_fallback_action_name: "action_default_fallback"
    enable_fallback_prediction: True
(financial_env) dell@dell-lin:~/Desktop/financial_chatbots/financial-demo-Arabic$ rasa train
2021-11-07 16:09:56 INFO rasa.model - Data (domain) for Core model section changed.
2021-11-07 16:09:56 INFO rasa.model - Data (nlu-config) for NLU model section changed.
Training NLU model...
2021-11-07 16:09:59 INFO transformers.file_utils - TensorFlow version 2.6.0 available.
2021-11-07 16:10:00 INFO transformers.tokenization_utils - Model name 'asafaya/bert-base-arabic' not found in model shortcut name list (bert-base-uncased, bert-large-uncased, bert-base-cased, bert-large-cased, bert-base-multilingual-uncased, bert-base-multilingual-cased, bert-base-chinese, bert-base-german-cased, bert-large-uncased-whole-word-masking, bert-large-cased-whole-word-masking, bert-large-uncased-whole-word-masking-finetuned-squad, bert-large-cased-whole-word-masking-finetuned-squad, bert-base-cased-finetuned-mrpc, bert-base-german-dbmdz-cased, bert-base-german-dbmdz-uncased, TurkuNLP/bert-base-finnish-cased-v1, TurkuNLP/bert-base-finnish-uncased-v1, wietsedv/bert-base-dutch-cased). Assuming 'asafaya/bert-base-arabic' is a path, a model identifier, or url to a directory containing tokenizer files.
2021-11-07 16:10:03 INFO transformers.tokenization_utils - loading file https://s3.amazonaws.com/models.huggingface.co/bert/asafaya/bert-base-arabic/vocab.txt from cache at /home/dell/.cache/torch/transformers/1f0caadd43445032cdd25fb630a643ba7e6d9c0549d891c60566b13c0124f700.70499f9363142415275a4f221a36f914d2f3b073fb026c85672f1b5a5611f1b6
2021-11-07 16:10:03 INFO transformers.tokenization_utils - loading file https://s3.amazonaws.com/models.huggingface.co/bert/asafaya/bert-base-arabic/added_tokens.json from cache at None
2021-11-07 16:10:03 INFO transformers.tokenization_utils - loading file https://s3.amazonaws.com/models.huggingface.co/bert/asafaya/bert-base-arabic/special_tokens_map.json from cache at /home/dell/.cache/torch/transformers/a5073fc31f2e0d31a383002819d229fd5925660fdbc13b0c9a9e654a7c44d9db.275045728fbf41c11d3dae08b8742c054377e18d92cc7b72b6351152a99b64e4
2021-11-07 16:10:03 INFO transformers.tokenization_utils - loading file https://s3.amazonaws.com/models.huggingface.co/bert/asafaya/bert-base-arabic/tokenizer_config.json from cache at /home/dell/.cache/torch/transformers/22ef2bf36f103972615fc423be82f54a3cabed445a3815b65959e502e3db8df2.73a933aa27255ce576c445dcdb8155b6edb6e4c43cceb14b4b81f9e699a818b7
2021-11-07 16:10:04 INFO transformers.configuration_utils - loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/asafaya/bert-base-arabic/config.json from cache at /home/dell/.cache/torch/transformers/667afd39ed2586647499009bacd41114de05559229bdd1bf2001b9c22df3fa40.36d80aeef08d09f22ce3f578b0cb84fb7182c1355da222078b6579a8fb4b4d77
2021-11-07 16:10:04 INFO transformers.configuration_utils - Model config BertConfig {
  "architectures": [
    "BertModel"
  ],
  "attention_probs_dropout_prob": 0.1,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "output_past": true,
  "pad_token_id": 0,
  "type_vocab_size": 2,
  "vocab_size": 32000
}
2021-11-07 16:10:08 INFO transformers.modeling_tf_utils - loading weights file https://cdn.huggingface.co/asafaya/bert-base-arabic/tf_model.h5 from cache at /home/dell/.cache/torch/transformers/31441b079fd49383f7df4bcbdf3d05831ddb410df20ab836ba659fa0eb9de2f7.a1dcb45cbaa9cacb3045ac20f9d3af709f5b53f7d286b605af4df7ff410e950c.h5
2021-11-07 16:10:15 INFO transformers.modeling_tf_utils - Layers from pretrained model not used in TFBertModel: ['mlm___cls']
2021-11-07 16:10:15 INFO rasa.nlu.components - Added 'LanguageModelFeaturizer' to component cache. Key 'LanguageModelFeaturizer-bert-0a5bd334e9527259f7125139499738a3'.
/home/dell/Desktop/financial_chatbots/financial_env/lib/python3.8/site-packages/rasa/utils/train_utils.py:641: UserWarning: constrain_similarities is set to `False`. It is recommended to set it to `True` when using cross-entropy loss. It will be set to `True` by default, Rasa Open Source 3.0.0 onwards.
rasa.shared.utils.io.raise_warning(
2021-11-07 16:10:15 INFO rasa.shared.nlu.training_data.training_data - Training data stats:
2021-11-07 16:10:15 INFO rasa.shared.nlu.training_data.training_data - Number of intent examples: 560 (20 distinct intents)
2021-11-07 16:10:15 INFO rasa.shared.nlu.training_data.training_data - Found intents: 'affirm', 'inform', 'pay_cc', 'thankyou', 'check_earnings', 'trigger_handoff', 'search_transactions', 'check_recipients', 'handoff', 'check_balance', 'ask_transfer_charge', 'check_human', 'human_handoff', 'nlu_fallback', 'transfer_money', 'out_of_scope', 'deny', 'goodbye', 'help', 'greet'
2021-11-07 16:10:15 INFO rasa.shared.nlu.training_data.training_data - Number of response examples: 0 (0 distinct responses)
2021-11-07 16:10:15 INFO rasa.shared.nlu.training_data.training_data - Number of entity examples: 123 (5 distinct entities)
2021-11-07 16:10:15 INFO rasa.shared.nlu.training_data.training_data - Found entity types: 'vendor_name', 'account_type', 'amount-of-money', 'PERSON', 'credit_card'
2021-11-07 16:10:15 INFO rasa.nlu.model - Starting to train component WhitespaceTokenizer
2021-11-07 16:10:15 INFO rasa.nlu.model - Finished training component.
2021-11-07 16:10:15 INFO rasa.nlu.model - Starting to train component RegexFeaturizer
2021-11-07 16:10:15 INFO rasa.nlu.model - Finished training component.
2021-11-07 16:10:15 INFO rasa.nlu.model - Starting to train component LexicalSyntacticFeaturizer
2021-11-07 16:10:16 INFO rasa.nlu.model - Finished training component.
2021-11-07 16:10:16 INFO rasa.nlu.model - Starting to train component CountVectorsFeaturizer
2021-11-07 16:10:16 INFO rasa.nlu.featurizers.sparse_featurizer.count_vectors_featurizer - 632 vocabulary items were created for text attribute.
2021-11-07 16:10:16 INFO rasa.nlu.model - Finished training component.
2021-11-07 16:10:16 INFO rasa.nlu.model - Starting to train component CountVectorsFeaturizer
2021-11-07 16:10:16 INFO rasa.nlu.featurizers.sparse_featurizer.count_vectors_featurizer - 4202 vocabulary items were created for text attribute.
2021-11-07 16:10:16 INFO rasa.nlu.model - Finished training component.
2021-11-07 16:10:16 INFO rasa.nlu.model - Starting to train component LanguageModelFeaturizer
/home/dell/Desktop/financial_chatbots/financial_env/lib/python3.8/site-packages/rasa/nlu/featurizers/dense_featurizer/lm_featurizer.py:521: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
return np.array(nonpadded_sequence_embeddings)
2021-11-07 16:10:27 INFO rasa.nlu.model - Finished training component.
2021-11-07 16:10:27 INFO rasa.nlu.model - Starting to train component DIETClassifier
/home/dell/Desktop/financial_chatbots/financial_env/lib/python3.8/site-packages/rasa/utils/tensorflow/model_data_utils.py:395: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
np.array([v[0] for v in values]), number_of_dimensions=3
/home/dell/Desktop/financial_chatbots/financial_env/lib/python3.8/site-packages/rasa/utils/tensorflow/model_data.py:750: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
np.concatenate(np.array(f)),
Epochs: 0%| | 0/100 [00:00<?, ?it/s]/home/dell/Desktop/financial_chatbots/financial_env/lib/python3.8/site-packages/tensorflow/python/framework/indexed_slices.py:447: UserWarning: Converting sparse IndexedSlices(IndexedSlices(indices=Tensor("gradients/cond_grad/gradients/cond/GatherV2_grad/Reshape_1:0", shape=(None,), dtype=int32), values=Tensor("gradients/cond_grad/gradients/cond/GatherV2_grad/Reshape:0", shape=(None,), dtype=float32), dense_shape=Tensor("gradients/cond_grad/gradients/cond/GatherV2_grad/Cast:0", shape=(1,), dtype=int32))) to a dense Tensor of unknown shape. This may consume a large amount of memory.
warnings.warn(
Epochs: 100%|████████████████████████████████████████| 100/100 [02:14<00:00, 1.35s/it, t_loss=1.35, i_acc=0.999, e_f1=0.983]
2021-11-07 16:12:42 INFO rasa.nlu.model - Finished training component.
2021-11-07 16:12:42 INFO rasa.nlu.model - Starting to train component FallbackClassifier
2021-11-07 16:12:42 INFO rasa.nlu.model - Finished training component.
2021-11-07 16:12:42 INFO rasa.nlu.model - Starting to train component DucklingEntityExtractor
2021-11-07 16:12:42 INFO rasa.nlu.model - Finished training component.
2021-11-07 16:12:42 INFO rasa.nlu.model - Starting to train component EntitySynonymMapper
2021-11-07 16:12:42 INFO rasa.nlu.model - Finished training component.
2021-11-07 16:12:42 INFO rasa.nlu.model - Successfully saved model into '/tmp/tmp0v5mfan2/nlu'
NLU model training completed.
2021-11-07 16:12:44 INFO transformers.tokenization_utils - Model name 'asafaya/bert-base-arabic' not found in model shortcut name list (bert-base-uncased, bert-large-uncased, bert-base-cased, bert-large-cased, bert-base-multilingual-uncased, bert-base-multilingual-cased, bert-base-chinese, bert-base-german-cased, bert-large-uncased-whole-word-masking, bert-large-cased-whole-word-masking, bert-large-uncased-whole-word-masking-finetuned-squad, bert-large-cased-whole-word-masking-finetuned-squad, bert-base-cased-finetuned-mrpc, bert-base-german-dbmdz-cased, bert-base-german-dbmdz-uncased, TurkuNLP/bert-base-finnish-cased-v1, TurkuNLP/bert-base-finnish-uncased-v1, wietsedv/bert-base-dutch-cased). Assuming 'asafaya/bert-base-arabic' is a path, a model identifier, or url to a directory containing tokenizer files.
2021-11-07 16:12:47 INFO transformers.tokenization_utils - loading file https://s3.amazonaws.com/models.huggingface.co/bert/asafaya/bert-base-arabic/vocab.txt from cache at /home/dell/.cache/torch/transformers/1f0caadd43445032cdd25fb630a643ba7e6d9c0549d891c60566b13c0124f700.70499f9363142415275a4f221a36f914d2f3b073fb026c85672f1b5a5611f1b6
2021-11-07 16:12:47 INFO transformers.tokenization_utils - loading file https://s3.amazonaws.com/models.huggingface.co/bert/asafaya/bert-base-arabic/added_tokens.json from cache at None
2021-11-07 16:12:47 INFO transformers.tokenization_utils - loading file https://s3.amazonaws.com/models.huggingface.co/bert/asafaya/bert-base-arabic/special_tokens_map.json from cache at /home/dell/.cache/torch/transformers/a5073fc31f2e0d31a383002819d229fd5925660fdbc13b0c9a9e654a7c44d9db.275045728fbf41c11d3dae08b8742c054377e18d92cc7b72b6351152a99b64e4
2021-11-07 16:12:47 INFO transformers.tokenization_utils - loading file https://s3.amazonaws.com/models.huggingface.co/bert/asafaya/bert-base-arabic/tokenizer_config.json from cache at /home/dell/.cache/torch/transformers/22ef2bf36f103972615fc423be82f54a3cabed445a3815b65959e502e3db8df2.73a933aa27255ce576c445dcdb8155b6edb6e4c43cceb14b4b81f9e699a818b7
2021-11-07 16:12:48 INFO transformers.configuration_utils - loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/asafaya/bert-base-arabic/config.json from cache at /home/dell/.cache/torch/transformers/667afd39ed2586647499009bacd41114de05559229bdd1bf2001b9c22df3fa40.36d80aeef08d09f22ce3f578b0cb84fb7182c1355da222078b6579a8fb4b4d77
2021-11-07 16:12:48 INFO transformers.configuration_utils - Model config BertConfig {
  "architectures": [
    "BertModel"
  ],
  "attention_probs_dropout_prob": 0.1,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "output_past": true,
  "pad_token_id": 0,
  "type_vocab_size": 2,
  "vocab_size": 32000
}
2021-11-07 16:12:51 INFO transformers.modeling_tf_utils - loading weights file https://cdn.huggingface.co/asafaya/bert-base-arabic/tf_model.h5 from cache at /home/dell/.cache/torch/transformers/31441b079fd49383f7df4bcbdf3d05831ddb410df20ab836ba659fa0eb9de2f7.a1dcb45cbaa9cacb3045ac20f9d3af709f5b53f7d286b605af4df7ff410e950c.h5
2021-11-07 16:13:09 INFO transformers.modeling_tf_utils - Layers from pretrained model not used in TFBertModel: ['mlm___cls']
2021-11-07 16:13:09 INFO rasa.nlu.components - Added 'LanguageModelFeaturizer' to component cache. Key 'LanguageModelFeaturizer-bert-0a5bd334e9527259f7125139499738a3'.
Training Core model...
Processed story blocks: 100%|████████████████████████████████████████| 45/45 [00:00<00:00, 1351.29it/s, # trackers=1]
Processed story blocks: 100%|████████████████████████████████████████| 45/45 [00:01<00:00, 40.96it/s, # trackers=34]
Processed story blocks: 100%|████████████████████████████████████████| 45/45 [00:01<00:00, 25.89it/s, # trackers=50]
Processed story blocks: 100%|████████████████████████████████████████| 45/45 [00:01<00:00, 22.61it/s, # trackers=50]
Processed rules: 100%|████████████████████████████████████████| 22/22 [00:00<00:00, 2086.67it/s, # trackers=1]
Processed trackers: 100%|████████████████████████████████████████| 34/34 [00:00<00:00, 1524.09it/s, # action=132]
Processed actions: 132it [00:00, 3526.76it/s, # examples=132]
Processed trackers: 100%|████████████████████████████████████████| 534/534 [00:02<00:00, 256.03it/s, # action=4526]
Epochs: 0%| | 0/40 [00:00<?, ?it/s]/home/dell/Desktop/financial_chatbots/financial_env/lib/python3.8/site-packages/tensorflow/python/framework/indexed_slices.py:447: UserWarning: Converting sparse IndexedSlices(IndexedSlices(indices=Tensor("gradients/cond_grad/Identity_1:0", shape=(None,), dtype=int64), values=Tensor("gradients/cond_grad/Identity:0", shape=(None,), dtype=float32), dense_shape=Tensor("gradients/cond_grad/Identity_2:0", shape=(1,), dtype=int32))) to a dense Tensor of unknown shape. This may consume a large amount of memory.
warnings.warn(
/home/dell/Desktop/financial_chatbots/financial_env/lib/python3.8/site-packages/tensorflow/python/framework/indexed_slices.py:447: UserWarning: Converting sparse IndexedSlices(IndexedSlices(indices=Tensor("gradients/cond_1_grad/Identity_1:0", shape=(None,), dtype=int64), values=Tensor("gradients/cond_1_grad/Identity:0", shape=(None,), dtype=float32), dense_shape=Tensor("gradients/cond_1_grad/Identity_2:0", shape=(1,), dtype=int32))) to a dense Tensor of unknown shape. This may consume a large amount of memory.
warnings.warn(
/home/dell/Desktop/financial_chatbots/financial_env/lib/python3.8/site-packages/tensorflow/python/framework/indexed_slices.py:447: UserWarning: Converting sparse IndexedSlices(IndexedSlices(indices=Tensor("gradients/cond_2_grad/Identity_1:0", shape=(None,), dtype=int64), values=Tensor("gradients/cond_2_grad/Identity:0", shape=(None,), dtype=float32), dense_shape=Tensor("gradients/cond_2_grad/Identity_2:0", shape=(1,), dtype=int32))) to a dense Tensor of unknown shape. This may consume a large amount of memory.
warnings.warn(
/home/dell/Desktop/financial_chatbots/financial_env/lib/python3.8/site-packages/tensorflow/python/framework/indexed_slices.py:447: UserWarning: Converting sparse IndexedSlices(IndexedSlices(indices=Tensor("gradients/cond_3_grad/Identity_1:0", shape=(None,), dtype=int64), values=Tensor("gradients/cond_3_grad/Identity:0", shape=(None,), dtype=float32), dense_shape=Tensor("gradients/cond_3_grad/Identity_2:0", shape=(1,), dtype=int32))) to a dense Tensor of unknown shape. This may consume a large amount of memory.
warnings.warn(
Epochs: 100%|████████████████████████████████████████| 40/40 [05:19<00:00, 7.99s/it, t_loss=3.01, loss=2.7, acc=0.956]
Processed trackers: 100%|████████████████████████████████████████| 21/21 [00:00<00:00, 3389.14it/s, # action=51]
Processed actions: 51it [00:00, 20731.68it/s, # examples=45]
Processed trackers: 100%|████████████████████████████████████████| 34/34 [00:00<00:00, 1418.12it/s, # action=161]
Processed trackers: 100%|████████████████████████████████████████| 21/21 [00:00<00:00, 1660.08it/s]
Processed trackers: 100%|████████████████████████████████████████| 55/55 [00:00<00:00, 562.47it/s]
2021-11-07 16:19:54 INFO rasa.core.agent - Persisted model to '/tmp/tmp0v5mfan2/core'
Core model training completed.
Your Rasa model is trained and saved at '/home/dell/Desktop/financial_chatbots/financial-demo-Arabic/models/20211107-161956.tar.gz'.
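To compare the components that a run actually trained against the pipeline in the config, the "Starting to train component" lines in the log above can be collected in order. This is only a rough sketch using the standard library; the helper name `trained_components` is hypothetical, and the pattern assumes the log wording shown in this post.

```python
import re

# Assumed wording, copied from the rasa.nlu.model log lines in this post.
START = re.compile(r"Starting to train component (?P<name>\w+)\s*$")


def trained_components(log_text):
    """Return component names in the order their training started."""
    names = []
    for line in log_text.splitlines():
        match = START.search(line)
        if match:
            names.append(match["name"])
    return names


sample = """\
2021-11-07 16:10:15 INFO rasa.nlu.model - Starting to train component WhitespaceTokenizer
2021-11-07 16:10:15 INFO rasa.nlu.model - Finished training component.
2021-11-07 16:10:15 INFO rasa.nlu.model - Starting to train component RegexFeaturizer
2021-11-07 16:10:15 INFO rasa.nlu.model - Finished training component.
"""

print(trained_components(sample))
```

Running it over the full log gives the list of trained NLU components, which can then be checked by eye against the `pipeline` section of the config.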