Cannot use JiebaTokenizer with BERT and DIETClassifier

I cannot use JiebaTokenizer with BERT. Here is my config, any thoughts?

version: "2.0"
language: zh
pipeline:
  - name: HFTransformersNLP
    model_name: bert
    model_weights: bert-base-chinese
  - name: LanguageModelTokenizer
  - name: LanguageModelFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: char_wb
    min_ngram: 1
    max_ngram: 6
  - name: DIETClassifier
    epochs: 300
    constrain_similarities: true
    entity_recognition: false
    evaluate_on_number_of_examples: 5000
    evaluate_every_number_of_epochs: 5
    tensorboard_log_directory: "./tensorboard"
    tensorboard_log_level: "epoch"
    ranking_length: 5
    number_of_negative_examples: 20
policies:
  - name: MemoizationPolicy
  - name: TEDPolicy
    max_history: 5
    epochs: 100
  - name: RulePolicy

Here is the output:

2022-02-03 00:34:07 INFO     transformers.modeling_tf_utils  - loading weights file https://cdn.huggingface.co/bert-base-chinese-tf_model.h5 from cache at /root/.cache/torch/transformers/86a460b592673bcac3fe5d858ecf519e4890b4f6eddd1a46a077bd672dee6fe5.e6b974f59b54219496a89fd32be7afb020374df0976a796e5ccd3a1733d31537.h5
2022-02-03 00:34:12 INFO     transformers.modeling_tf_utils  - Layers from pretrained model not used in TFBertModel: ['nsp___cls', 'mlm___cls']
2022-02-03 00:36:21 INFO     rasa.engine.training.hooks  - Restored component 'CountVectorsFeaturizer' from cache.
2022-02-03 00:38:11 INFO     rasa.engine.training.hooks  - Restored component 'CountVectorsFeaturizer' from cache.
2022-02-03 00:40:17 INFO     rasa.engine.training.hooks  - Starting to train component 'DIETClassifier'.
Epochs:   0% 0/300 [00:00<?, ?it/s]Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/rasa/engine/graph.py", line 458, in __call__
    output = self._fn(self._component, **run_kwargs)
  File "/usr/local/lib/python3.7/dist-packages/rasa/nlu/classifiers/diet_classifier.py", line 919, in train
    shuffle=False,  # we use custom shuffle inside data generator
  File "/usr/local/lib/python3.7/dist-packages/rasa/utils/tensorflow/temp_keras_modules.py", line 181, in fit
    tmp_logs = train_function(iterator)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/def_function.py", line 885, in __call__
    result = self._call(*args, **kwds)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/def_function.py", line 917, in _call
    return self._stateless_fn(*args, **kwds)  # pylint: disable=not-callable
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/function.py", line 3040, in __call__
    filtered_flat_args, captured_inputs=graph_function.captured_inputs)  # pylint: disable=protected-access
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/function.py", line 1964, in _call_flat
    ctx, args, cancellation_manager=cancellation_manager))
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/function.py", line 596, in call
    ctx=ctx)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/execute.py", line 60, in quick_execute
    inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.InvalidArgumentError:  ConcatOp : Dimensions of inputs should match: shape[0] = [64,33,128] vs. shape[1] = [64,32,768]
	 [[node rasa_sequence_layer_text/rasa_feature_combining_layer_text/concatenate_sparse_dense_features_text_sequence/concat (defined at /lib/python3.7/dist-packages/rasa/utils/tensorflow/rasa_layers.py:339) ]] [Op:__inference_train_function_719741]

Errors may have originated from an input operation.
Input Source operations connected to node rasa_sequence_layer_text/rasa_feature_combining_layer_text/concatenate_sparse_dense_features_text_sequence/concat:
 rasa_sequence_layer_text/rasa_feature_combining_layer_text/concatenate_sparse_dense_features_text_sequence/dropout/dropout/Mul_1 (defined at /lib/python3.7/dist-packages/rasa/utils/tensorflow/rasa_layers.py:309)	
 IteratorGetNext (defined at /lib/python3.7/dist-packages/rasa/utils/tensorflow/temp_keras_modules.py:181)

Function call stack:
train_function


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/bin/rasa", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.7/dist-packages/rasa/__main__.py", line 121, in main
    cmdline_arguments.func(cmdline_arguments)
  File "/usr/local/lib/python3.7/dist-packages/rasa/cli/train.py", line 59, in <lambda>
    train_parser.set_defaults(func=lambda args: run_training(args, can_exit=True))
  File "/usr/local/lib/python3.7/dist-packages/rasa/cli/train.py", line 103, in run_training
    finetuning_epoch_fraction=args.epoch_fraction,
  File "/usr/local/lib/python3.7/dist-packages/rasa/api.py", line 117, in train
    finetuning_epoch_fraction=finetuning_epoch_fraction,
  File "/usr/local/lib/python3.7/dist-packages/rasa/model_training.py", line 171, in train
    **(nlu_additional_arguments or {}),
  File "/usr/local/lib/python3.7/dist-packages/rasa/model_training.py", line 232, in _train_graph
    is_finetuning=is_finetuning,
  File "/usr/local/lib/python3.7/dist-packages/rasa/engine/training/graph_trainer.py", line 105, in train
    graph_runner.run(inputs={PLACEHOLDER_IMPORTER: importer})
  File "/usr/local/lib/python3.7/dist-packages/rasa/engine/runner/dask.py", line 101, in run
    dask_result = dask.get(run_graph, run_targets)
  File "/usr/local/lib/python3.7/dist-packages/dask/local.py", line 558, in get_sync
    **kwargs,
  File "/usr/local/lib/python3.7/dist-packages/dask/local.py", line 496, in get_async
    for key, res_info, failed in queue_get(queue).result():
  File "/usr/lib/python3.7/concurrent/futures/_base.py", line 428, in result
    return self.__get_result()
  File "/usr/lib/python3.7/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.7/dist-packages/dask/local.py", line 538, in submit
    fut.set_result(fn(*args, **kwargs))
  File "/usr/local/lib/python3.7/dist-packages/dask/local.py", line 234, in batch_execute_tasks
    return [execute_task(*a) for a in it]
  File "/usr/local/lib/python3.7/dist-packages/dask/local.py", line 234, in <listcomp>
    return [execute_task(*a) for a in it]
  File "/usr/local/lib/python3.7/dist-packages/dask/local.py", line 225, in execute_task
    result = pack_exception(e, dumps)
  File "/usr/local/lib/python3.7/dist-packages/dask/local.py", line 220, in execute_task
    result = _execute_task(task, data)
  File "/usr/local/lib/python3.7/dist-packages/dask/core.py", line 119, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "/usr/local/lib/python3.7/dist-packages/rasa/engine/graph.py", line 467, in __call__
    ) from e
rasa.engine.exceptions.GraphComponentException: Error running graph component for node train_DIETClassifier4.

Which version of Rasa are you using?

Your error suggests Rasa 3.0 or above, while your config looks like one from Rasa 2.0; in Rasa 3.0 the HFTransformersNLP and LanguageModelTokenizer components are gone and LanguageModelFeaturizer loads the model on its own. Can you try this config on Rasa 3.0? Please lint the YAML into the correct format:

recipe: default.v1
language: zh
pipeline:
  - name: "JiebaTokenizer"
  dictionary_path: "path/to/custom/dictionary/dir"
  # Flag to check whether to split intents
  "intent_tokenization_flag": False
  # Symbol on which intent should be split
  "intent_split_symbol": "_"
  # Regular expression to detect tokens
  "token_pattern": None
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: char_wb
    min_ngram: 1
    max_ngram: 6
  - name: LanguageModelFeaturizer
    # Name of the language model to use
    model_name: "bert"
    # Pre-Trained weights to be loaded
    model_weights: "bert-base-chinese"

    # An optional path to a directory from which
    # to load pre-trained model weights.
    # If the requested model is not found in the
    # directory, it will be downloaded and
    # cached in this directory for future use.
    # The default value of `cache_dir` can be
    # set using the environment variable
    # `TRANSFORMERS_CACHE`, as per the
    # Transformers library.
    cache_dir: null
  - name: DIETClassifier
    epochs: 300
    constrain_similarities: true
    entity_recognition: false
    evaluate_on_number_of_examples: 5000
    evaluate_every_number_of_epochs: 5
    tensorboard_log_directory: "./tensorboard"
    tensorboard_log_level: "epoch"
    ranking_length: 5
    number_of_negative_examples: 20
policies:
  - name: MemoizationPolicy
  - name: TEDPolicy
    max_history: 5
    epochs: 100
  - name: RulePolicy

Sorry, I pasted the wrong one. Let me re-paste the config:

version: "2.0"
language: zh
pipeline:
  - name: HFTransformersNLP
    model_name: bert
    model_weights: bert-base-chinese
  - name: JiebaTokenizer
  - name: LanguageModelFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: char_wb
    min_ngram: 1
    max_ngram: 6
  - name: DIETClassifier
    epochs: 300
    constrain_similarities: true
    entity_recognition: false
    evaluate_on_number_of_examples: 5000
    evaluate_every_number_of_epochs: 5
    tensorboard_log_directory: "./tensorboard"
    tensorboard_log_level: "epoch"
    ranking_length: 5
    number_of_negative_examples: 20
policies:
  - name: MemoizationPolicy
  - name: TEDPolicy
    max_history: 5
    epochs: 100
  - name: RulePolicy

The logs:

2022-02-04 01:24:51 INFO     root  - Generating grammar tables from /usr/lib/python3.7/lib2to3/Grammar.txt
2022-02-04 01:24:51 INFO     root  - Generating grammar tables from /usr/lib/python3.7/lib2to3/PatternGrammar.txt
No stories present. Just a Rasa NLU model will be trained.
Training NLU model...
2022-02-04 01:25:31 INFO     numexpr.utils  - NumExpr defaulting to 4 threads.
2022-02-04 01:25:31 INFO     transformers.file_utils  - PyTorch version 1.10.0+cu111 available.
2022-02-04 01:25:31 INFO     transformers.file_utils  - TensorFlow version 2.6.3 available.
2022-02-04 01:25:31 INFO     transformers.tokenization_utils  - loading file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-chinese-vocab.txt from cache at /root/.cache/torch/transformers/8a0c070123c1f794c42a29c6904beb7c1b8715741e235bee04aca2c7636fc83f.9b42061518a39ca00b8b52059fd2bede8daa613f8a8671500e518a8c29de8c00
2022-02-04 01:25:31 INFO     transformers.configuration_utils  - loading configuration file https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-chinese-config.json from cache at /root/.cache/torch/transformers/8a3b1cfe5da58286e12a0f5d7d182b8d6eca88c08e26c332ee3817548cf7e60a.f12a4f986e43d8b328f5b067a641064d67b91597567a06c7b122d1ca7dfd9741
2022-02-04 01:25:31 INFO     transformers.configuration_utils  - Model config BertConfig {
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "directionality": "bidi",
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "pooler_fc_size": 768,
  "pooler_num_attention_heads": 12,
  "pooler_num_fc_layers": 3,
  "pooler_size_per_head": 128,
  "pooler_type": "first_token_transform",
  "type_vocab_size": 2,
  "vocab_size": 21128
}

2022-02-04 01:25:31 INFO     transformers.modeling_tf_utils  - loading weights file https://cdn.huggingface.co/bert-base-chinese-tf_model.h5 from cache at /root/.cache/torch/transformers/86a460b592673bcac3fe5d858ecf519e4890b4f6eddd1a46a077bd672dee6fe5.e6b974f59b54219496a89fd32be7afb020374df0976a796e5ccd3a1733d31537.h5
2022-02-04 01:25:34 INFO     transformers.modeling_tf_utils  - Layers from pretrained model not used in TFBertModel: ['mlm___cls', 'nsp___cls']
2022-02-04 01:25:34 INFO     rasa.nlu.components  - Added 'HFTransformersNLP' to component cache. Key 'HFTransformersNLP-bert-68d7c530c1c4708f5657e4ae28219570'.
2022-02-04 01:25:34 INFO     rasa.nlu.components  - Added 'LanguageModelFeaturizer' to component cache. Key 'LanguageModelFeaturizer-None-99914b932bd37a50b983c5e7c90ae93b'.
2022-02-04 01:25:54 INFO     rasa.shared.nlu.training_data.training_data  - Training data stats:
2022-02-04 01:25:54 INFO     rasa.shared.nlu.training_data.training_data  - Number of intent examples: 30212 (803 distinct intents)

2022-02-04 01:25:54 INFO     rasa.shared.nlu.training_data.training_data  -   Found intents: 'id3105', 'id1177', 'id4009', 'id4777', 'id3801', 'id4782', 'id2726', 'id2765', 'id1931', 'id1205', 'id710', 'id174', 'id1297', 'id4382', 'id654', 'id4992', 'id4006', 'id1500', 'id4808', 'id4939', 'id176', 'id713', 'id175', 'id2004', 'id3855', 'id390', 'id3106', 'id505', 'id3800', 'id4632', 'id256', 'id1577', 'id3110', 'id136', 'id2764', 'id1370', 'id2606', 'id4491', 'id3798', 'id2058', 'id3680', 'id2397', 'id5018', 'id547', 'id1581', 'id2176', 'id2192', 'id1916', 'id3103', 'id3795', 'id5597', 'id3797', 'id1970', 'id4758', 'id5129', 'id618', 'id294', 'id4276', 'id2195', 'id913', 'id5411', 'id583', 'id1090', 'id1388', 'id2571', 'id247', 'id60', 'id5051', 'id1840', 'id4098', 'id1726', 'id1651', 'id5656', 'id5637', 'id4625', 'id5598', 'id293', 'id4934', 'id830', 'id4926', 'id4850', 'id2194', 'id4819', 'id4858', 'id548', 'id1797', 'id686', 'id2495', 'id1499', 'id486', 'id1368', 'id368', 'id587', 'id1609', 'id3803', 'id5532', 'id985', 'id393', 'id958', 'id3107', 'id4835', 'id2563', 'id4387', 'id1230', 'id2617', 'id5491', 'id1338', 'id2730', 'id5529', 'id5638', 'id5242', 'id2007', 'id5204', 'id2179', 'id715', 'id105', 'id914', 'id178', 'id1794', 'id5436', 'id5627', 'id439', 'id4484', 'id586', 'id1617', 'id959', 'id194', 'id2494', 'id3678', 'id2187', 'id893', 'id2761', 'id242', 'id173', 'id2402', 'id4861', 'id4271', 'id190', 'id4832', 'id189', 'id453', 'id3220', 'id3849', 'id584', 'id1648', 'id5492', 'id2651', 'id5705', 'id4981', 'id5563', 'id2405', 'id1695', 'id5324', 'id2180', 'id4064', 'id5564', 'id484', 'id2569', 'id5316', 'id2178', 'id4154', 'id4778', 'id2614', 'id4631', 'id1415', 'id5655', 'id3907', 'id890', 'id3221', 'id4134', 'id4862', 'id1936', 'id4912', 'id413', 'id620', 'id4915', 'id957', 'id255', 'id4384', 'id1288', 'id1544', 'id4278', 'id320', 'id3854', 'id211', 'id5408', 'id1398', 'id5021', 'id241', 'id2724', 'id4913', 'id2565', 'id137', 'id4066', 'id853', 'id5281', 'id177', 'id4390', 'id2407', 'id5569', 'id1340', 'id5639', 'goodbye', 'id1618', 'id2616', 'id5050', 'id3417', 'id1294', 'id4980', 'id4993', 'id100', 'id370', 'id5278', 'id4904', 'id4821', 'id2644', 'id142', 'id2406', 'id145', 'id894', 'id2654', 'id5133', 'id1838', 'id5530', 'id4002', 'id916', 'id4839', 'id5413', 'id4158', 'id1342', 'id5449', 'id149', 'id5314', 'id5435', 'id2199', 'id2490', 'id1087', 'id4274', 'id1918', 'id99', 'id2308', 'id4878', 'id411', 'id3799', 'id4157', 'id1831', 'id1013', 'id1929', 'id5565', 'id1508', 'id4099', 'id3108', 'id58', 'id192', 'id1934', 'id1289', 'id895', 'id5059', 'id319', 'id2008', 'id4458', 'id4496', 'id4628', 'id1968', 'id5640', 'id2197', 'id2729', 'id4910', 'id315', 'id984', 'id4156', 'id144', 'id5450', 'id2645', 'id552', 'id5525', 'id5125', 'id5526', 'id1015', 'id1178', 'id246', 'id2573', 'id4490', 'id1699', 'id4385', 'id4063', 'id1842', 'id4779', 'id389', 'id3415', 'id4004', 'id260', 'id516', 'id2647', 'id4498', 'id2409', 'id243', 'id1836', 'id1339', 'id180', 'id655', 'id1510', 'id143', 'id2191', 'id917', 'id4776', 'id2735', 'id5544', 'id1115', 'id789', 'id829', 'id2760', 'id5229', 'id3219', 'id5240', 'id4759', 'id4936', 'id5022', 'id2398', 'id5126', 'id2653', 'id3682', 'id1612', 'id5131', 'id3113', 'id2487', 'id1971', 'id244', 'id5157', 'id182', 'id1393', 'id3114', 'id711', 'id2307', 'id1545', 'id4070', 'id1231', 'id4990', 'id2618', 'id1969', 'id5399', 'id683', 'id3104', 'id5407', 'id1549', 'id1839', 'id4994', 'id1372', 'id1497', 'id4800', 'id515', 'id2400', 'id5528', 'id1047', 'id140', 
'id1389', 'id193', 'id4456', 'id4281', 'id4859', 'id4979', 'id4067', 'id793', 'id4279', 'id1919', 'id4132', 'id4806', 'id653', 'id5546', 'id1610', 'id181', 'id1172', 'id2403', 'id2486', 'id2650', 'id3848', 'id4888', 'id4130', 'id1391', 'id3681', 'id854', 'id4995', 'id5241', 'id372', 'id1119', 'id2733', 'id682', 'id3905', 'id1925', 'id3115', 'id1833', 'id4462', 'id1290', 'id506', 'id2763', 'id148', 'id4386', 'id2303', 'id290', 'id1206', 'id1504', 'id1725', 'id2184', 'id63', 'id2313', 'id3215', 'id3906', 'id4133', 'id1089', 'id4905', 'id2174', 'id2488', 'id915', 'id1232', 'id4633', 'id4774', 'id412', 'id1292', 'id104', 'id2401', 'id4863', 'id4991', 'id5613', 'id317', 'id4976', 'id5038', 'id5158', 'id4282', 'id4153', 'id1798', 'id5279', 'id4823', 'id1930', 'id62', 'id726', 'id2621', 'id685', 'id134', 'id1345', 'id1171', 'id4275', 'id1417', 'id1173', 'id4757', 'id1698', 'id1552', 'id1622', 'id1175', 'id4003', 'id5657', 'id2567', 'id1580', 'id1014', 'id106', 'id2311', 'id4833', 'id3112', 'id3851', 'id2306', 'id892', 'id101', 'id254', 'id1926', 'id1337', 'id5020', 'id5201', 'id2646', 'id827', 'id2183', 'id1207', 'id2491', 'id3850', 'id2315', 'id2655', 'id191', 'id1291', 'id2190', 'id616', 'id5493', 'id410', 'id5614', 'id259', 'id2731', 'id253', 'id5134', 'id4272', 'id3853', 'id1935', 'id150', 'id4996', 'id5547', 'id5410', 'id1343', 'id5405', 'id1834', 'id1170', 'id1088', 'id2605', 'id1696', 'id3909', 'id551', 'id4383', 'id1547', 'id1553', 'id1390', 'id1917', 'id1697', 'id4069', 'id1837', 'id4820', 'id1579', 'id2186', 'id1694', 'id2619', 'id2196', 'id717', 'id5226', 'id5406', 'id2185', 'id1966', 'id4010', 'id1578', 'id3805', 'id4269', 'id5102', 'id5280', 'id2762', 'id1117', 'id5058', 'id5203', 'id5615', 'id3413', 'id141', 'id4784', 'id2059', 'id1649', 'id4455', 'id1796', 'id4959', 'id3685', 'id4013', 'id4822', 'id5128', 'id2570', 'id1546', 'id4131', 'id1620', 'id983', 'id1169', 'id1046', 'id4104', 'id2652', 'id5132', 'id712', 'id1176', 'id2566', 'id5154', 'id4005', 'id292', 'id3856', 'id3796', 'id179', 'id64', 'id1921', 'id4781', 'id4803', 'id5136', 'id4494', 'id4941', 'id1924', 'id4873', 'id956', 'id2734', 'id4012', 'id2314', 'id3414', 'id2725', 'id2316', 'id1646', 'id183', 'id1608', 'id2613', 'id5040', 'id5039', 'id1799', 'id4065', 'id1650', 'id2310', 'id2492', 'id2188', 'id4836', 'id5130', 'id4100', 'id2057', 'id5451', 'id2198', 'id139', 'id1045', 'id1932', 'id245', 'id1511', 'id1923', 'id4773', 'id1295', 'id2399', 'id4911', 'id2648', 'id5135', 'id4283', 'id1841', 'id409', 'id3116', 'id4978', 'id1209', 'id1554', 'id107', 'id4389', 'id4273', 'id4007', 'id4101', 'id5545', 'id2189', 'id5404', 'id4493', 'id4772', 'id3214', 'id1371', 'id5313', 'id3802', 'id1418', 'id3217', 'id1067', 'id617', 'id4277', 'id1293', 'id4280', 'id4834', 'id456', 'id982', 'id258', 'id147', 'id4847', 'id2408', 'id2404', 'id5437', 'id4849', 'id3908', 'id5409', 'id5276', 'id1208', 'id454', 'id3804', 'id5561', 'id4933', 'id392', 'id1503', 'id195', 'id4775', 'id257', 'id2493', 'id5243', 'id1922', 'id2727', 'id1920', 'id504', 'id2489', 'id3904', 'id4903', 'id4940', 'id3419', 'id1550', 'id2604', 'id65', 'id4880', 'id102', 'id4391', 'id2225', 'id4129', 'id5052', 'id4879', 'id452', 'id1652', 'id1793', 'id5570', 'id2005', 'id1933', 'id5127', 'id2177', 'id1068', 'id4627', 'id1728', 'id291', 'id5202', 'id4001', 'id981', 'id3910', 'id1548', 'id4457', 'id5205', 'id5490', 'id289', 'id1700', 'id656', 'id4459', 'id3111', 'id1501', 'id2620', 'id1017', 'id1341', 'id2305', 'id4925', 'id3418', 'id2304', 'id5527', 'id4975', 'id391', 'id135', 
'id5159', 'id4388', 'id1832', 'id4874', 'id5228', 'id4805', 'id4801', 'id1727', 'id3683', 'id3684', 'id4626', 'id2312', 'id4783', 'id4871', 'id2564', 'greet', 'id1795', 'id318', 'id1419', 'id1724', 'id1228', 'id4155', 'id1065', 'id5531', 'id103', 'id2622', 'id146', 'id1703', 'id1701', 'id4860', 'id5019', 'id1507', 'id414', 'id1346', 'id5412', 'id4495', 'id4927', 'id2181', 'id4935', 'id5227', 'id657', 'id4838', 'id1044', 'id2607', 'id4807', 'id2572', 'id316', 'id4152', 'id1296', 'id3852', 'id1611', 'id2182', 'id1421', 'id2728', 'id1505', 'id2175', 'id1116', 'id2562', 'id4887', 'id3679', 'id4497', 'id2732', 'id4848', 'id1042', 'id3857', 'id3109', 'id4008', 'id4851', 'id4872', 'id4804', 'id4914', 'id1835', 'id2615', 'id1369', 'id59', 'id4270', 'id5599', 'id5094', 'id1967', 'id4011', 'id4982', 'id1647', 'id4068', 'id1344', 'id261', 'id2006'
2022-02-04 01:25:54 INFO     rasa.shared.nlu.training_data.training_data  - Number of response examples: 0 (0 distinct responses)
2022-02-04 01:25:54 INFO     rasa.shared.nlu.training_data.training_data  - Number of entity examples: 0 (0 distinct entities)
2022-02-04 01:25:56 INFO     rasa.nlu.model  - Starting to train component HFTransformersNLP
/usr/local/lib/python3.7/dist-packages/rasa/nlu/utils/hugging_face/hf_transformers.py:444: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
  return np.array(nonpadded_sequence_embeddings)
2022-02-04 01:27:23 INFO     rasa.nlu.model  - Finished training component.
2022-02-04 01:27:23 INFO     rasa.nlu.model  - Starting to train component JiebaTokenizer
Building prefix dict from the default dictionary ...
Dumping model to file cache /tmp/jieba.cache
Loading model cost 0.868 seconds.
Prefix dict has been built successfully.
2022-02-04 01:27:31 INFO     rasa.nlu.model  - Finished training component.
2022-02-04 01:27:31 INFO     rasa.nlu.model  - Starting to train component LanguageModelFeaturizer
2022-02-04 01:27:31 INFO     rasa.nlu.model  - Finished training component.
2022-02-04 01:27:31 INFO     rasa.nlu.model  - Starting to train component CountVectorsFeaturizer
2022-02-04 01:27:33 INFO     rasa.nlu.featurizers.sparse_featurizer.count_vectors_featurizer  - 5762 vocabulary items were created for text attribute.
2022-02-04 01:27:52 INFO     rasa.nlu.model  - Finished training component.
2022-02-04 01:27:52 INFO     rasa.nlu.model  - Starting to train component CountVectorsFeaturizer
2022-02-04 01:27:55 INFO     rasa.nlu.featurizers.sparse_featurizer.count_vectors_featurizer  - 28390 vocabulary items were created for text attribute.
2022-02-04 01:28:15 INFO     rasa.nlu.model  - Finished training component.
2022-02-04 01:28:15 INFO     rasa.nlu.model  - Starting to train component DIETClassifier
/usr/local/lib/python3.7/dist-packages/rasa/utils/tensorflow/model_data_utils.py:395: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
  np.array([v[0] for v in values]), number_of_dimensions=3
Epochs:   0% 0/300 [00:00<?, ?it/s]Traceback (most recent call last):
  File "/usr/local/bin/rasa", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.7/dist-packages/rasa/__main__.py", line 118, in main
    cmdline_arguments.func(cmdline_arguments)
  File "/usr/local/lib/python3.7/dist-packages/rasa/cli/train.py", line 59, in <lambda>
    train_parser.set_defaults(func=lambda args: run_training(args, can_exit=True))
  File "/usr/local/lib/python3.7/dist-packages/rasa/cli/train.py", line 103, in run_training
    finetuning_epoch_fraction=args.epoch_fraction,
  File "/usr/local/lib/python3.7/dist-packages/rasa/api.py", line 124, in train
    loop,
  File "/usr/local/lib/python3.7/dist-packages/rasa/utils/common.py", line 296, in run_in_loop
    result = loop.run_until_complete(f)
  File "uvloop/loop.pyx", line 1456, in uvloop.loop.Loop.run_until_complete
  File "/usr/local/lib/python3.7/dist-packages/rasa/model_training.py", line 119, in train_async
    finetuning_epoch_fraction=finetuning_epoch_fraction,
  File "/usr/local/lib/python3.7/dist-packages/rasa/model_training.py", line 251, in _train_async_internal
    finetuning_epoch_fraction=finetuning_epoch_fraction,
  File "/usr/local/lib/python3.7/dist-packages/rasa/model_training.py", line 765, in _train_nlu_with_validated_data
    **additional_arguments,
  File "/usr/local/lib/python3.7/dist-packages/rasa/nlu/train.py", line 111, in train
    interpreter = trainer.train(training_data, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/rasa/nlu/model.py", line 221, in train
    component.train(working_data, self.config, **context)
  File "/usr/local/lib/python3.7/dist-packages/rasa/nlu/classifiers/diet_classifier.py", line 887, in train
    shuffle=False,  # we use custom shuffle inside data generator
  File "/usr/local/lib/python3.7/dist-packages/rasa/utils/tensorflow/temp_keras_modules.py", line 190, in fit
    tmp_logs = train_function(iterator)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/def_function.py", line 885, in __call__
    result = self._call(*args, **kwds)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/def_function.py", line 950, in _call
    return self._stateless_fn(*args, **kwds)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/function.py", line 3040, in __call__
    filtered_flat_args, captured_inputs=graph_function.captured_inputs)  # pylint: disable=protected-access
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/function.py", line 1964, in _call_flat
    ctx, args, cancellation_manager=cancellation_manager))
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/function.py", line 596, in call
    ctx=ctx)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/execute.py", line 60, in quick_execute
    inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.InvalidArgumentError:  ConcatOp : Dimensions of inputs should match: shape[0] = [64,30,128] vs. shape[1] = [64,3,768]
	 [[node rasa_sequence_layer_text/rasa_feature_combining_layer_text/concatenate_sparse_dense_features_text_sequence/concat (defined at /lib/python3.7/dist-packages/rasa/utils/tensorflow/rasa_layers.py:338) ]] [Op:__inference_train_function_719741]

Errors may have originated from an input operation.
Input Source operations connected to node rasa_sequence_layer_text/rasa_feature_combining_layer_text/concatenate_sparse_dense_features_text_sequence/concat:
 rasa_sequence_layer_text/rasa_feature_combining_layer_text/concatenate_sparse_dense_features_text_sequence/dropout/dropout/Mul_1 (defined at /lib/python3.7/dist-packages/rasa/utils/tensorflow/rasa_layers.py:308)	
 IteratorGetNext (defined at /lib/python3.7/dist-packages/rasa/utils/tensorflow/temp_keras_modules.py:190)

Function call stack:
train_function

@mayuanyang1 - can you try the config I pasted above, which is more relevant for the Rasa 3.0 components?

I got this error:

Downloading: 100% 478M/478M [00:10<00:00, 46.1MB/s]
2022-02-04 22:32:32 INFO     transformers.modeling_tf_utils  - loading weights file https://cdn.huggingface.co/bert-base-chinese-tf_model.h5 from cache at /root/.cache/torch/transformers/86a460b592673bcac3fe5d858ecf519e4890b4f6eddd1a46a077bd672dee6fe5.e6b974f59b54219496a89fd32be7afb020374df0976a796e5ccd3a1733d31537.h5
2022-02-04 22:32:36 INFO     transformers.modeling_tf_utils  - Layers from pretrained model not used in TFBertModel: ['nsp___cls', 'mlm___cls']
2022-02-04 22:35:39 INFO     rasa.engine.training.hooks  - Starting to train component 'DIETClassifier'.
Epochs:   0% 0/300 [00:00<?, ?it/s]Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/rasa/engine/graph.py", line 458, in __call__
    output = self._fn(self._component, **run_kwargs)
  File "/usr/local/lib/python3.7/dist-packages/rasa/nlu/classifiers/diet_classifier.py", line 919, in train
    shuffle=False,  # we use custom shuffle inside data generator
  File "/usr/local/lib/python3.7/dist-packages/rasa/utils/tensorflow/temp_keras_modules.py", line 181, in fit
    tmp_logs = train_function(iterator)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/def_function.py", line 885, in __call__
    result = self._call(*args, **kwds)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/def_function.py", line 917, in _call
    return self._stateless_fn(*args, **kwds)  # pylint: disable=not-callable
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/function.py", line 3040, in __call__
    filtered_flat_args, captured_inputs=graph_function.captured_inputs)  # pylint: disable=protected-access
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/function.py", line 1964, in _call_flat
    ctx, args, cancellation_manager=cancellation_manager))
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/function.py", line 596, in call
    ctx=ctx)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/execute.py", line 60, in quick_execute
    inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.InvalidArgumentError:  ConcatOp : Dimensions of inputs should match: shape[0] = [64,25,128] vs. shape[1] = [64,24,768]
	 [[node rasa_sequence_layer_text/rasa_feature_combining_layer_text/concatenate_sparse_dense_features_text_sequence/concat (defined at /lib/python3.7/dist-packages/rasa/utils/tensorflow/rasa_layers.py:339) ]] [Op:__inference_train_function_721236]

Errors may have originated from an input operation.
Input Source operations connected to node rasa_sequence_layer_text/rasa_feature_combining_layer_text/concatenate_sparse_dense_features_text_sequence/concat:
 IteratorGetNext (defined at /lib/python3.7/dist-packages/rasa/utils/tensorflow/temp_keras_modules.py:181)	
 rasa_sequence_layer_text/rasa_feature_combining_layer_text/concatenate_sparse_dense_features_text_sequence/dropout/dropout/Mul_1 (defined at /lib/python3.7/dist-packages/rasa/utils/tensorflow/rasa_layers.py:309)

Function call stack:
train_function


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/bin/rasa", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.7/dist-packages/rasa/__main__.py", line 121, in main
    cmdline_arguments.func(cmdline_arguments)
  File "/usr/local/lib/python3.7/dist-packages/rasa/cli/train.py", line 59, in <lambda>
    train_parser.set_defaults(func=lambda args: run_training(args, can_exit=True))
  File "/usr/local/lib/python3.7/dist-packages/rasa/cli/train.py", line 103, in run_training
    finetuning_epoch_fraction=args.epoch_fraction,
  File "/usr/local/lib/python3.7/dist-packages/rasa/api.py", line 117, in train
    finetuning_epoch_fraction=finetuning_epoch_fraction,
  File "/usr/local/lib/python3.7/dist-packages/rasa/model_training.py", line 171, in train
    **(nlu_additional_arguments or {}),
  File "/usr/local/lib/python3.7/dist-packages/rasa/model_training.py", line 232, in _train_graph
    is_finetuning=is_finetuning,
  File "/usr/local/lib/python3.7/dist-packages/rasa/engine/training/graph_trainer.py", line 105, in train
    graph_runner.run(inputs={PLACEHOLDER_IMPORTER: importer})
  File "/usr/local/lib/python3.7/dist-packages/rasa/engine/runner/dask.py", line 101, in run
    dask_result = dask.get(run_graph, run_targets)
  File "/usr/local/lib/python3.7/dist-packages/dask/local.py", line 558, in get_sync
    **kwargs,
  File "/usr/local/lib/python3.7/dist-packages/dask/local.py", line 496, in get_async
    for key, res_info, failed in queue_get(queue).result():
  File "/usr/lib/python3.7/concurrent/futures/_base.py", line 428, in result
    return self.__get_result()
  File "/usr/lib/python3.7/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.7/dist-packages/dask/local.py", line 538, in submit
    fut.set_result(fn(*args, **kwargs))
  File "/usr/local/lib/python3.7/dist-packages/dask/local.py", line 234, in batch_execute_tasks
    return [execute_task(*a) for a in it]
  File "/usr/local/lib/python3.7/dist-packages/dask/local.py", line 234, in <listcomp>
    return [execute_task(*a) for a in it]
  File "/usr/local/lib/python3.7/dist-packages/dask/local.py", line 225, in execute_task
    result = pack_exception(e, dumps)
  File "/usr/local/lib/python3.7/dist-packages/dask/local.py", line 220, in execute_task
    result = _execute_task(task, data)
  File "/usr/local/lib/python3.7/dist-packages/dask/core.py", line 119, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "/usr/local/lib/python3.7/dist-packages/rasa/engine/graph.py", line 467, in __call__
    ) from e
rasa.engine.exceptions.GraphComponentException: Error running graph component for node train_DIETClassifier4.

Could you try it without the sparse features from the CountVectorsFeaturizers? Unfortunately I don't have Chinese data to test it.

Aha, it works after removing the CountVectorsFeaturizer. Is that expected? The working pipeline is sketched below.
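
For reference, a sketch of the pipeline that trained successfully, assuming it is simply the Rasa 3.0 config from above with both CountVectorsFeaturizer entries dropped:

recipe: default.v1
language: zh
pipeline:
  - name: JiebaTokenizer
  - name: LanguageModelFeaturizer
    model_name: "bert"
    model_weights: "bert-base-chinese"
  - name: DIETClassifier
    epochs: 300
    constrain_similarities: true
    entity_recognition: false
    # remaining DIETClassifier options unchanged from the config above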

I suppose that is probably because the shapes of the bert-base-chinese embeddings and the count-vectorizer features don't match. Note that the mismatch is in the sequence-length dimension (e.g. [64,25,128] vs. [64,24,768]): the sparse and dense featurizers end up with different numbers of token positions, so DIET cannot concatenate them token by token. In my English BERT case, this wasn't a problem.

Could you try reversing the order?

recipe: default.v1
language: zh
pipeline:
  - name: "JiebaTokenizer"
  dictionary_path: "path/to/custom/dictionary/dir"
  # Flag to check whether to split intents
  "intent_tokenization_flag": False
  # Symbol on which intent should be split
  "intent_split_symbol": "_"
  # Regular expression to detect tokens
  "token_pattern": None
  - name: LanguageModelFeaturizer
    # Name of the language model to use
    model_name: "bert"
    # Pre-Trained weights to be loaded
    model_weights: "bert-base-chinese"

    # An optional path to a directory from which
    # to load pre-trained model weights.
    # If the requested model is not found in the
    # directory, it will be downloaded and
    # cached in this directory for future use.
    # The default value of `cache_dir` can be
    # set using the environment variable
    # `TRANSFORMERS_CACHE`, as per the
    # Transformers library.
    cache_dir: null
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: char_wb
    min_ngram: 1
    max_ngram: 6
  - name: DIETClassifier
    epochs: 300
    constrain_similarities: true
    entity_recognition: false
    evaluate_on_number_of_examples: 5000
    evaluate_every_number_of_epochs: 5
    tensorboard_log_directory: "./tensorboard"
    tensorboard_log_level: "epoch"
    ranking_length: 5
    number_of_negative_examples: 20
policies:
  - name: MemoizationPolicy
  - name: TEDPolicy
    max_history: 5
    epochs: 100
  - name: RulePolicy

Similar problem after changing the order:

  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/execute.py", line 60, in quick_execute
    inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.InvalidArgumentError:  ConcatOp : Dimensions of inputs should match: shape[0] = [2061,35,128] vs. shape[1] = [2061,34,768]
	 [[node rasa_sequence_layer_text/rasa_feature_combining_layer_text/concatenate_sparse_dense_features_text_sequence/concat (defined at /lib/python3.7/dist-packages/rasa/utils/tensorflow/rasa_layers.py:338) ]] [Op:__inference_train_function_734691]

Yeah, likely the tensors from bert-base-chinese don't have the same sequence length as those from the count vectorizer, so merging them to train on all the features together isn't possible. You can do without the count vectorizer.

Try a few different configs to validate what works for your data.

You can try simply using bert-base-chinese with the above config, or vanilla DIET with Jieba and CountVectorsFeaturizer and without bert-base-chinese (sketched below). Sometimes that works quite well. I never saw a clear case where one always works better than the other.
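
A minimal sketch of that vanilla setup, assuming the same Jieba, count-vectorizer, and DIETClassifier settings as in the configs above and no transformer components at all:

recipe: default.v1
language: zh
pipeline:
  - name: JiebaTokenizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: char_wb
    min_ngram: 1
    max_ngram: 6
  - name: DIETClassifier
    epochs: 300
    constrain_similarities: true
    entity_recognition: false
    # remaining DIETClassifier options unchanged from the configs above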

Thanks for that!