I need to train a Chinese-language model with Rasa 3, but my pipeline cannot load bert-base-chinese, so training fails

moyao313 · September 22, 2023, 8:09am

(rasa) [zkcl@zyd-8c16g-0006 rasa]$ /data/apps/Python3/bin/python3 -m rasa train
/data/apps/Python3/lib/python3.9/site-packages/rasa/core/tracker_store.py:1042: MovedIn20Warning: Deprecated API features detected! These feature(s) are not compatible with SQLAlchemy 2.0. To prevent incompatible upgrades prior to updating applications, ensure requirements files are pinned to "sqlalchemy<2.0". Set environment variable SQLALCHEMY_WARN_20=1 to show all deprecation warnings. Set environment variable SQLALCHEMY_SILENCE_UBER_WARNING=1 to silence this message. (Background on SQLAlchemy 2.0 at: Error Messages — SQLAlchemy 2.0 Documentation)
  Base: DeclarativeMeta = declarative_base()
/data/apps/Python3/lib/python3.9/site-packages/rasa/shared/utils/validation.py:134: DeprecationWarning: pkg_resources is deprecated as an API. See Package Discovery and Resource Access using pkg_resources - setuptools 68.2.2.post20230912 documentation
  import pkg_resources
/data/apps/Python3/lib/python3.9/site-packages/pkg_resources/__init__.py:2871: DeprecationWarning: Deprecated call to pkg_resources.declare_namespace('mpl_toolkits'). Implementing implicit namespace packages (as specified in PEP 420) is preferred to pkg_resources.declare_namespace. See Keywords - setuptools 68.2.2.post20230912 documentation
  declare_namespace(pkg)
/data/apps/Python3/lib/python3.9/site-packages/pkg_resources/__init__.py:2871: DeprecationWarning: Deprecated call to pkg_resources.declare_namespace('ruamel'). Implementing implicit namespace packages (as specified in PEP 420) is preferred to pkg_resources.declare_namespace. See Keywords - setuptools 68.2.2.post20230912 documentation
  declare_namespace(pkg)
/data/apps/Python3/lib/python3.9/site-packages/pkg_resources/__init__.py:2871: DeprecationWarning: Deprecated call to pkg_resources.declare_namespace('ruamel.yaml'). Implementing implicit namespace packages (as specified in PEP 420) is preferred to pkg_resources.declare_namespace. See Keywords - setuptools 68.2.2.post20230912 documentation
  declare_namespace(pkg)
2023-09-22 15:54:11 INFO rasa.cli.train - Started validating domain and training data…
2023-09-22 15:54:13 INFO rasa.validator - Validating intents…
2023-09-22 15:54:13 INFO rasa.validator - Validating uniqueness of intents and stories…
2023-09-22 15:54:13 INFO rasa.validator - Validating utterances…
2023-09-22 15:54:13 INFO rasa.validator - Story structure validation…
Processed story blocks: 100%|█████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 2623.08it/s, # trackers=1]
2023-09-22 15:54:13 INFO rasa.core.training.story_conflict - Considering all preceding turns for conflict analysis.
2023-09-22 15:54:13 INFO rasa.validator - No story structure conflicts found.
2023-09-22 15:54:19 INFO rasa.engine.training.hooks - Restored component 'JiebaTokenizer' from cache.
Building prefix dict from the default dictionary …
Loading model from cache /tmp/jieba.cache
Loading model cost 0.636 seconds.
Prefix dict has been built successfully.
2023-09-22 15:54:19 INFO rasa.engine.training.hooks - Starting to train component 'RegexFeaturizer'.
2023-09-22 15:54:19 INFO rasa.engine.training.hooks - Finished training component 'RegexFeaturizer'.
2023-09-22 15:54:19 INFO rasa.engine.training.hooks - Starting to train component 'LexicalSyntacticFeaturizer'.
2023-09-22 15:54:19 INFO rasa.engine.training.hooks - Finished training component 'LexicalSyntacticFeaturizer'.
Traceback (most recent call last):
  File "/data/apps/Python3/lib/python3.9/site-packages/rasa/engine/graph.py", line 394, in _load_component
    self._component: GraphComponent = constructor(  # type: ignore[no-redef]
  File "/data/apps/Python3/lib/python3.9/site-packages/rasa/engine/graph.py", line 221, in load
    return cls.create(config, model_storage, resource, execution_context)
  File "/data/apps/Python3/lib/python3.9/site-packages/rasa/nlu/featurizers/dense_featurizer/lm_featurizer.py", line 100, in create
    return cls(config, execution_context)
  File "/data/apps/Python3/lib/python3.9/site-packages/rasa/nlu/featurizers/dense_featurizer/lm_featurizer.py", line 67, in __init__
    self._load_model_instance()
  File "/data/apps/Python3/lib/python3.9/site-packages/rasa/nlu/featurizers/dense_featurizer/lm_featurizer.py", line 152, in _load_model_instance
    self.tokenizer = model_tokenizer_dict[self.model_name].from_pretrained(
  File "/data/apps/Python3/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 1838, in from_pretrained
    raise EnvironmentError(
OSError: Can't load tokenizer for 'bert-base-chinese'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'bert-base-chinese' is the correct path to a directory containing all relevant files for a BertTokenizer tokenizer.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/data/apps/Python3/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/data/apps/Python3/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/data/apps/Python3/lib/python3.9/site-packages/rasa/__main__.py", line 151, in <module>
    main()
  File "/data/apps/Python3/lib/python3.9/site-packages/rasa/__main__.py", line 133, in main
    cmdline_arguments.func(cmdline_arguments)
  File "/data/apps/Python3/lib/python3.9/site-packages/rasa/cli/train.py", line 61, in <lambda>
    train_parser.set_defaults(func=lambda args: run_training(args, can_exit=True))
  File "/data/apps/Python3/lib/python3.9/site-packages/rasa/cli/train.py", line 101, in run_training
    training_result = train_all(
  File "/data/apps/Python3/lib/python3.9/site-packages/rasa/api.py", line 105, in train
    return train(
  File "/data/apps/Python3/lib/python3.9/site-packages/rasa/model_training.py", line 207, in train
    return _train_graph(
  File "/data/apps/Python3/lib/python3.9/site-packages/rasa/model_training.py", line 286, in _train_graph
    trainer.train(
  File "/data/apps/Python3/lib/python3.9/site-packages/rasa/engine/training/graph_trainer.py", line 105, in train
    graph_runner.run(inputs={PLACEHOLDER_IMPORTER: importer})
  File "/data/apps/Python3/lib/python3.9/site-packages/rasa/engine/runner/dask.py", line 101, in run
    dask_result = dask.get(run_graph, run_targets)
  File "/data/apps/Python3/lib/python3.9/site-packages/dask/local.py", line 557, in get_sync
    return get_async(
  File "/data/apps/Python3/lib/python3.9/site-packages/dask/local.py", line 500, in get_async
    for key, res_info, failed in queue_get(queue).result():
  File "/data/apps/Python3/lib/python3.9/concurrent/futures/_base.py", line 439, in result
    return self.__get_result()
  File "/data/apps/Python3/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
    raise self._exception
  File "/data/apps/Python3/lib/python3.9/site-packages/dask/local.py", line 542, in submit
    fut.set_result(fn(*args, **kwargs))
  File "/data/apps/Python3/lib/python3.9/site-packages/dask/local.py", line 238, in batch_execute_tasks
    return [execute_task(*a) for a in it]
  File "/data/apps/Python3/lib/python3.9/site-packages/dask/local.py", line 238, in <listcomp>
    return [execute_task(*a) for a in it]
  File "/data/apps/Python3/lib/python3.9/site-packages/dask/local.py", line 229, in execute_task
    result = pack_exception(e, dumps)
  File "/data/apps/Python3/lib/python3.9/site-packages/dask/local.py", line 224, in execute_task
    result = _execute_task(task, data)
  File "/data/apps/Python3/lib/python3.9/site-packages/dask/core.py", line 119, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "/data/apps/Python3/lib/python3.9/site-packages/rasa/engine/graph.py", line 474, in __call__
    self._load_component(**constructor_kwargs)
  File "/data/apps/Python3/lib/python3.9/site-packages/rasa/engine/graph.py", line 407, in _load_component
    raise GraphComponentException(
rasa.engine.exceptions.GraphComponentException: Error initializing graph component for node run_LanguageModelFeaturizer3.
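The OSError above names two possible causes: a local directory called bert-base-chinese that shadows the Hub model name, or a path that lacks the tokenizer files. A minimal sketch for checking the first case from the directory where `rasa train` is run follows. The function name `diagnose_local_dir` and the list in `EXPECTED_FILES` are assumptions made for illustration, not part of Rasa or transformers.

```python
from pathlib import Path

# Assumption: files a local BERT checkpoint directory would normally contain.
# vocab.txt is the vocabulary file BertTokenizer reads; config.json is the model config.
EXPECTED_FILES = ("vocab.txt", "config.json")


def diagnose_local_dir(model_weights, cwd="."):
    """Report whether a local directory shadows the model name (hypothetical helper).

    Returns a dict with:
      shadowed: True if <cwd>/<model_weights> exists as a directory
      missing:  expected files that are absent from that directory
    """
    candidate = Path(cwd) / model_weights
    if not candidate.is_dir():
        # No local directory, so the name should be resolved against the Hub.
        return {"shadowed": False, "missing": []}
    missing = [name for name in EXPECTED_FILES if not (candidate / name).is_file()]
    return {"shadowed": True, "missing": missing}


# Example call from the project directory:
# print(diagnose_local_dir("bert-base-chinese"))
```

If `shadowed` is False, the name is probably being resolved against huggingface.co, so a network or proxy problem on the training machine becomes the more likely suspect.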

This is my config.yml:

recipe: default.v1

assistant_id: 20230922-140123-matching-saddle

language: zh

pipeline:
  - name: JiebaTokenizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: LanguageModelFeaturizer
    model_name: bert
    model_weights: bert-base-chinese
    cache_dir: null
  - name: DIETClassifier
    epochs: 1
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100
  - name: FallbackClassifier
    threshold: 0.5
    ambiguity_threshold: 0.3

policies:
  - name: MemoizationPolicy
    max_history: 5
  - name: TEDPolicy
    epochs: 50
  - name: RulePolicy
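To separate a Rasa problem from a download problem, the tokenizer named in the LanguageModelFeaturizer entry can be loaded directly with transformers, outside of `rasa train`. The sketch below assumes transformers is importable in the same environment (the traceback suggests it is) and uses `BertTokenizer.from_pretrained`, the class named in the error message. The function `can_load_tokenizer` and its `loader` parameter are hypothetical, added only so the check can be exercised without network access.

```python
def can_load_tokenizer(model_weights, cache_dir=None, loader=None):
    """Try to load a BERT tokenizer outside Rasa and return (ok, message).

    loader is a hypothetical hook for testing; by default the real
    transformers call is used.
    """
    if loader is None:
        # Imported lazily so the function itself has no hard dependency.
        from transformers import BertTokenizer

        loader = BertTokenizer.from_pretrained
    try:
        loader(model_weights, cache_dir=cache_dir)
    except OSError as exc:
        # from_pretrained raised OSError in the traceback, so that is what is caught here.
        return False, str(exc)
    return True, "tokenizer loaded"


# Example call with the values from the config:
# ok, message = can_load_tokenizer("bert-base-chinese", cache_dir=None)
# print(ok, message)
```

If this call fails with the same OSError, the cause lies in the environment (network access to huggingface.co, or a shadowing local directory) rather than in the Rasa pipeline definition.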

I need help!
