Rasa 2 reads the training data many times?

Hi all,

I am trying to migrate to Rasa 2, and running "rasa train" seems to take ages. Why is the same data read over and over?

Starting Training...
2020-09-30 16:31:18 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/nlu.yml' is 'rasa_yml'.
2020-09-30 16:31:22 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/responses.yml' is 'rasa_yml'.
2020-09-30 16:31:22 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/stories.yml' is 'unk'.
2020-09-30 16:31:22 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/airline.el.yml' is 'rasa_yml'.
2020-09-30 16:31:22 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/airline.yml' is 'rasa_yml'.
2020-09-30 16:31:22 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/city.forms.el.yml' is 'rasa_yml'.
2020-09-30 16:31:22 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/city.yml' is 'rasa_yml'.
2020-09-30 16:31:22 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/flight_statuses.json' is 'unk'.
2020-09-30 16:31:22 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/flying.yml' is 'rasa_yml'.
2020-09-30 16:31:22 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/time.yml' is 'rasa_yml'.
2020-09-30 16:31:27 DEBUG    pykwalify.compat  - Using yaml library: /home/pepper/.local/lib/python3.8/site-packages/ruamel/yaml/__init__.py
2020-09-30 16:31:28 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/airline.el.yml' is 'rasa_yml'.
2020-09-30 16:31:28 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/airline.yml' is 'rasa_yml'.
2020-09-30 16:31:28 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/city.forms.el.yml' is 'rasa_yml'.
2020-09-30 16:31:28 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/city.yml' is 'rasa_yml'.
2020-09-30 16:31:28 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/flying.yml' is 'rasa_yml'.
2020-09-30 16:31:28 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/time.yml' is 'rasa_yml'.
2020-09-30 16:31:28 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/nlu.yml' is 'rasa_yml'.
2020-09-30 16:31:41 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/responses.yml' is 'rasa_yml'.
2020-09-30 16:32:07 DEBUG    rasa.shared.importers.importer  - Added 23 training data examples from the story training data.
2020-09-30 16:32:07 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/airline.el.yml' is 'rasa_yml'.
2020-09-30 16:32:07 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/airline.yml' is 'rasa_yml'.
2020-09-30 16:32:07 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/city.forms.el.yml' is 'rasa_yml'.
2020-09-30 16:32:07 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/city.yml' is 'rasa_yml'.
2020-09-30 16:32:07 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/flying.yml' is 'rasa_yml'.
2020-09-30 16:32:07 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/time.yml' is 'rasa_yml'.
2020-09-30 16:32:07 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/nlu.yml' is 'rasa_yml'.
2020-09-30 16:32:24 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/responses.yml' is 'rasa_yml'.
2020-09-30 16:33:04 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/airline.el.yml' is 'rasa_yml'.
2020-09-30 16:33:04 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/airline.yml' is 'rasa_yml'.
2020-09-30 16:33:04 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/city.forms.el.yml' is 'rasa_yml'.
2020-09-30 16:33:04 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/city.yml' is 'rasa_yml'.
2020-09-30 16:33:04 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/flying.yml' is 'rasa_yml'.
2020-09-30 16:33:04 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/time.yml' is 'rasa_yml'.
2020-09-30 16:33:05 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/nlu.yml' is 'rasa_yml'.
2020-09-30 16:33:27 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/responses.yml' is 'rasa_yml'.
2020-09-30 16:34:12 DEBUG    rasa.shared.importers.importer  - Added 23 training data examples from the story training data.
2020-09-30 16:34:12 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/airline.el.yml' is 'rasa_yml'.
2020-09-30 16:34:12 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/airline.yml' is 'rasa_yml'.
2020-09-30 16:34:12 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/city.forms.el.yml' is 'rasa_yml'.
2020-09-30 16:34:12 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/city.yml' is 'rasa_yml'.
2020-09-30 16:34:12 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/flying.yml' is 'rasa_yml'.
2020-09-30 16:34:12 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/time.yml' is 'rasa_yml'.
2020-09-30 16:34:12 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/nlu.yml' is 'rasa_yml'.
2020-09-30 16:34:39 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/responses.yml' is 'rasa_yml'.
2020-09-30 16:35:39 DEBUG    rasa.shared.importers.importer  - Added 23 training data examples from the story training data.
2020-09-30 16:35:39 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/airline.el.yml' is 'rasa_yml'.
2020-09-30 16:35:39 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/airline.yml' is 'rasa_yml'.
2020-09-30 16:35:39 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/city.forms.el.yml' is 'rasa_yml'.
2020-09-30 16:35:39 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/city.yml' is 'rasa_yml'.
2020-09-30 16:35:39 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/flying.yml' is 'rasa_yml'.
2020-09-30 16:35:39 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/time.yml' is 'rasa_yml'.
2020-09-30 16:35:39 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/nlu.yml' is 'rasa_yml'.
2020-09-30 16:36:10 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/responses.yml' is 'rasa_yml'.
2020-09-30 16:37:20 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/airline.el.yml' is 'rasa_yml'.
2020-09-30 16:37:20 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/airline.yml' is 'rasa_yml'.
2020-09-30 16:37:20 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/city.forms.el.yml' is 'rasa_yml'.
2020-09-30 16:37:20 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/city.yml' is 'rasa_yml'.
2020-09-30 16:37:20 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/flying.yml' is 'rasa_yml'.
2020-09-30 16:37:20 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/time.yml' is 'rasa_yml'.
2020-09-30 16:37:20 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/nlu.yml' is 'rasa_yml'.
2020-09-30 16:37:56 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/responses.yml' is 'rasa_yml'.
2020-09-30 16:39:10 DEBUG    urllib3.connectionpool  - Starting new HTTPS connection (1): api.segment.io:443
2020-09-30 16:39:10 DEBUG    urllib3.connectionpool  - https://api.segment.io:443 "POST /v1/track HTTP/1.1" 200 21
Training NLU model...
2020-09-30 16:39:10 DEBUG    rasa.shared.importers.importer  - Added 23 training data examples from the story training data.
2020-09-30 16:39:10 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/airline.el.yml' is 'rasa_yml'.
2020-09-30 16:39:11 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/airline.yml' is 'rasa_yml'.
2020-09-30 16:39:11 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/city.forms.el.yml' is 'rasa_yml'.
2020-09-30 16:39:11 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/city.yml' is 'rasa_yml'.
2020-09-30 16:39:11 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/flying.yml' is 'rasa_yml'.
2020-09-30 16:39:11 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/time.yml' is 'rasa_yml'.
2020-09-30 16:39:11 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/nlu.yml' is 'rasa_yml'.
2020-09-30 16:39:51 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/responses.yml' is 'rasa_yml'.
2020-09-30 16:41:21 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/airline.el.yml' is 'rasa_yml'.
2020-09-30 16:41:21 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/airline.yml' is 'rasa_yml'.
2020-09-30 16:41:21 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/city.forms.el.yml' is 'rasa_yml'.
2020-09-30 16:41:21 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/city.yml' is 'rasa_yml'.
2020-09-30 16:41:21 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/flying.yml' is 'rasa_yml'.
2020-09-30 16:41:21 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/lookup/time.yml' is 'rasa_yml'.
2020-09-30 16:41:22 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/nlu.yml' is 'rasa_yml'.
2020-09-30 16:42:07 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/responses.yml' is 'rasa_yml'.
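
To put a number on the repetition, a small script along these lines could count how often each file shows up in the debug output. This is only a sketch: the function name count_loads is made up for illustration, and the regular expression assumes the log lines look exactly like the ones above.

```python
import re
from collections import Counter

# Matches the "Training data format of '<path>' is '<format>'" debug lines
# shown in the log above. The pattern is an assumption based on that output.
LOAD_LINE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\s+DEBUG\s+"
    r"rasa\.shared\.nlu\.training_data\.loading\s+-\s+"
    r"Training data format of '(?P<path>[^']+)' is '(?P<fmt>[^']+)'"
)


def count_loads(log_text):
    """Return a Counter mapping each file path to the number of load lines."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LOAD_LINE.match(line.strip())
        if match:
            counts[match.group("path")] += 1
    return counts


# A few lines copied from the log above; in practice, read the saved
# debug output from a file instead.
sample = "\n".join([
    "2020-09-30 16:31:18 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/nlu.yml' is 'rasa_yml'.",
    "2020-09-30 16:31:22 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/responses.yml' is 'rasa_yml'.",
    "2020-09-30 16:32:07 DEBUG    rasa.shared.importers.importer  - Added 23 training data examples from the story training data.",
    "2020-09-30 16:31:28 DEBUG    rasa.shared.nlu.training_data.loading  - Training data format of 'data/nlu.yml' is 'rasa_yml'.",
])

for path, n in count_loads(sample).most_common():
    print(f"{n:3d}  {path}")
```

Running it over the full log gives one count per file, which makes it easier to report exactly how many passes over the data happen before training starts.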

Hey @petasis

Indeed, the training data can be read several times during training, but probably not that many times. Which alpha or release candidate are you using?

rasa --version reports:

rasa --version
Rasa Version     : 2.0.0rc2
Rasa SDK Version : 2.0.0rc1
Rasa X Version   : None
Python Version   : 3.8.5 (default, Aug 12 2020, 00:00:00) 
Operating System : Linux-5.8.12-200.fc32.x86_64-x86_64-with-glibc2.2.5
Python Path      : /usr/bin/python3