The problem occurs after I load my own YAML files, which contain about 15 MB of training phrases and responses. After that, the process gets stuck on these messages:
2022-04-25 05:05:29 DEBUG h5py._conv - Creating converter from 7 to 5
2022-04-25 05:05:29 DEBUG h5py._conv - Creating converter from 5 to 7
2022-04-25 05:05:29 DEBUG h5py._conv - Creating converter from 7 to 5
2022-04-25 05:05:29 DEBUG h5py._conv - Creating converter from 5 to 7
/Users/xxx/PycharmProjects/xxx/venv/lib/python3.8/site-packages/flatbuffers/compat.py:19: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
import imp
/Users/xxx/PycharmProjects/xxx/venv/lib/python3.8/site-packages/sklearn/utils/multiclass.py:14: DeprecationWarning: Please use `spmatrix` from the `scipy.sparse` namespace, the `scipy.sparse.base` namespace is deprecated.
from scipy.sparse.base import spmatrix
2022-04-25 05:05:30 DEBUG rasa.shared.nlu.training_data.loading - Training data format of 'data/nlu.yml' is 'rasa_yml'.
2022-04-25 05:05:30 DEBUG rasa.shared.nlu.training_data.loading - Training data format of 'data/rules.yml' is 'unk'.
2022-04-25 05:05:31 DEBUG rasa.shared.nlu.training_data.loading - Training data format of 'data/stories.yml' is 'unk'.
CPU usage is at 100% on one core, which is normal for Python, and the process's memory grows very slowly, by about 400-500 MB per hour. Training never starts; I waited for 2 hours.
Any ideas what the problem is? Is there any way to speed up data preparation?
UPD: I'm running training with this command:
rasa train -vv --num-threads=16
UPD2: number of intents and stories to train: 107,931
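A minimal sketch for checking how a total like this is spread across intents, assuming data/nlu.yml uses the standard Rasa YAML layout (`- intent:` entries, each followed by an `examples: |` block). The function name `count_examples_per_intent` is hypothetical and not part of Rasa. It scans lines rather than parsing YAML, so it only handles that layout.

```python
from collections import Counter
from typing import Dict, Optional


def count_examples_per_intent(nlu_yaml_text: str) -> Dict[str, int]:
    """Count example lines under each `- intent:` entry (hypothetical helper)."""
    counts: Counter = Counter()
    current_intent: Optional[str] = None
    intent_indent = 0
    for line in nlu_yaml_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        indent = len(line) - len(line.lstrip())
        if stripped.startswith("- intent:"):
            current_intent = stripped[len("- intent:"):].strip()
            intent_indent = indent
            counts[current_intent] += 0  # register intents with no examples
        elif indent <= intent_indent and stripped.startswith("- "):
            # A sibling entry (e.g. synonym, regex, lookup) ends the intent block.
            current_intent = None
        elif (
            current_intent is not None
            and indent > intent_indent
            and stripped.startswith("- ")
        ):
            counts[current_intent] += 1
    return dict(counts)


if __name__ == "__main__":
    # Path taken from the log output above; adjust if the file lives elsewhere.
    with open("data/nlu.yml", encoding="utf-8") as handle:
        per_intent = count_examples_per_intent(handle.read())
    print("intents:", len(per_intent))
    print("examples:", sum(per_intent.values()))
```

Running it prints the number of intents and the total number of examples, which helps tell whether the size comes from many intents or from many examples per intent.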
UPD3: My hardware:
CPU: Core i9 9880H
RAM: 16 GB
SSD: M.2 1 TB
GPU: Radeon Pro 5500M 4 GB
Thanks in advance for your answer.