Rasa X training error : concurrent.futures._base.CancelledError

Hello,

I got a problem with rasa-x on cluster Openshift version 0.23.6 (rasa version 1.61)

I used the Integrated Version Control pointing to the Rasa Demo App on my GitHub account.

I installed the model as described on my rasa-X Web page.

When I launched the training the worker service initiated a training, but in the middle I saw the hereafter message.

Any idea ?

Processed actions: 0it [00:00, ?it/s, # examples=13]

Processed actions: 0it [00:00, ?it/s, # examples=14]

Processed actions: 14it [00:00, 7811.66it/s, # examples=14]

2020-02-18 10:10:14 DEBUG rasa.core.policies.memoization - Memorized 14 unique examples. 2020-02-18 10:10:15 INFO rasa.core.agent - Persisted model to ‘/tmp/tmp_9x0bufq/core’ 2020-02-18 10:10:15 DEBUG rasa.server - Traceback (most recent call last): File “/build/lib/python3.6/site-packages/rasa/server.py”, line 725, in train force_training=rjs.get(“force”, False), File “/usr/local/lib/python3.6/asyncio/coroutines.py”, line 129, in throw return self.gen.throw(type, value, traceback) File “/build/lib/python3.6/site-packages/rasa/train.py”, line 97, in train_async kwargs, File “/usr/local/lib/python3.6/asyncio/coroutines.py”, line 129, in throw return self.gen.throw(type, value, traceback) File “/build/lib/python3.6/site-packages/rasa/train.py”, line 184, in _train_async_internal kwargs=kwargs, File “/usr/local/lib/python3.6/asyncio/coroutines.py”, line 129, in throw return self.gen.throw(type, value, traceback) File “/build/lib/python3.6/site-packages/rasa/train.py”, line 241, in _do_training persist_nlu_training_data=persist_nlu_training_data, File “/usr/local/lib/python3.6/asyncio/coroutines.py”, line 129, in throw return self.gen.throw(type, value, traceback) File “/build/lib/python3.6/site-packages/rasa/train.py”, line 463, in _train_nlu_with_validated_data config = await file_importer.get_config() File “/usr/local/lib/python3.6/asyncio/coroutines.py”, line 129, in throw return self.gen.throw(type, value, traceback) File “/build/lib/python3.6/site-packages/rasa/importers/importer.py”, line 236, in get_config configs = await asyncio.gather(*configs) concurrent.futures._base.CancelledError

2020-02-18 10:10:17 DEBUG rasa.core.agent - Requesting model from server http://rasa-x:5002/api/projects/default/models/tags/production … 2020-02-18 10:10:17 DEBUG rasa.core.agent - Model server returned 204 status code, indicating that no new model is available. Current fingerprint: 05a2825a9feac63d7568f3c2c7f7f2c5 2020-02-18 10:10:17 DEBUG rasa.core.agent - No new model found at URL http://rasa-x:5002/api/projects/default/models/tags/production 2020-02-18 10:10:27 DEBUG rasa.core.agent - Requesting model from server http://rasa-x:5002/api/projects/default/models/tags/production … 2020-02-18 10:10:27 DEBUG rasa.core.agent - Model server returned 204 status code, indicating that no new model is available. Current fingerprint: 05a2825a9feac63d7568f3c2c7f7f2c5

Hey @lepoulle it seems like training may be timing out. Can you show me the logs from the beginning of the training? And ideally format them with ```

Where are the log files on rasa-x service worker ?

On the stdout stream I got the following lines

2020-02-20 13:41:57 DEBUG    rasa.core.agent  - No new model found at URL http://rasa-x:5002/api/projects/default/models/tags/production 
2020-02-20 13:42:07 DEBUG    rasa.core.agent  - Requesting model from server http://rasa-x:5002/api/projects/default/models/tags/production
 ...
2020-02-20 13:42:07 DEBUG    rasa.core.agent  - Model server returned 204 status code, indicating that no new model is available. Current fingerprint: 05a2825a9feac63d7568f3c2c7f7f2c5
2020-02-20 13:42:07 DEBUG    rasa.core.agent  - No new model found at URL http://rasa-x:5002/api/projects/default/models/tags/production 
2020-02-20 13:42:15 DEBUG    rasa.nlu.training_data.loading  - Training data format of '/tmp/tmpgiaps1za/nlu.md' is 'md'.
2020-02-20 13:42:15 DEBUG    rasa.nlu.training_data.loading  - Training data format of '/tmp/tmpgiaps1za/stories.md' is 'unk'.
2020-02-20 13:42:15 DEBUG    rasa.nlu.training_data.loading  - Training data format of '/tmp/tmpgiaps1za/nlu.md' is 'md'.
2020-02-20 13:42:15 DEBUG    rasa.nlu.training_data.loading  - Training data format of '/tmp/tmpgiaps1za/nlu.md' is 'md'.
2020-02-20 13:42:15 DEBUG    rasa.core.nlg.generator  - Instantiated NLG to 'TemplatedNaturalLanguageGenerator'.
2020-02-20 13:42:15 DEBUG    rasa.core.training.generator  - Generated trackers will be deduplicated based on their unique last 5 states.
2020-02-20 13:42:15 DEBUG    rasa.core.training.generator  - Number of augmentation rounds is 3
2020-02-20 13:42:15 DEBUG    rasa.core.training.generator  - Starting data generation round 0 ... (with 1 trackers)

Processed Story Blocks:   0%|          | 0/4 [00:00<?, ?it/s
Processed Story Blocks:   0%|          | 0/4 [00:00<?, ?it/s, # trackers=1
Processed Story Blocks:   0%|          | 0/4 [00:00<?, ?it/s, # trackers=1
Processed Story Blocks:   0%|          | 0/4 [00:00<?, ?it/s, # trackers=1
Processed Story Blocks:   0%|          | 0/4 [00:00<?, ?it/s, # trackers=1
Processed Story Blocks: 100%|██████████| 4/4 [00:00<00:00, 4287.56it/s, # trackers=1
2020-02-20 13:42:15 DEBUG    rasa.core.training.generator  - Finished phase (4 training samples found).
2020-02-20 13:42:15 DEBUG    rasa.core.training.generator  - Data generation rounds finished.
2020-02-20 13:42:15 DEBUG    rasa.core.training.generator  - Found 0 unused checkpoints
2020-02-20 13:42:15 DEBUG    rasa.core.training.generator  - Starting augmentation round 0 ... (with 4 trackers)

Processed Story Blocks:   0%|          | 0/4 [00:00<?, ?it/s
Processed Story Blocks:   0%|          | 0/4 [00:00<?, ?it/s, # trackers=4
Processed Story Blocks:   0%|          | 0/4 [00:00<?, ?it/s, # trackers=4
Processed Story Blocks:   0%|          | 0/4 [00:00<?, ?it/s, # trackers=4
Processed Story Blocks:   0%|          | 0/4 [00:00<?, ?it/s, # trackers=4
Processed Story Blocks: 100%|██████████| 4/4 [00:00<00:00, 1568.84it/s, # trackers=4
2020-02-20 13:42:15 DEBUG    rasa.core.training.generator  - Finished phase (20 training samples found).
2020-02-20 13:42:15 DEBUG    rasa.core.training.generator  - Starting augmentation round 1 ... (with 20 trackers)

Processed Story Blocks:   0%|          | 0/4 [00:00<?, ?it/s
Processed Story Blocks:   0%|          | 0/4 [00:00<?, ?it/s, # trackers=12
Processed Story Blocks:   0%|          | 0/4 [00:00<?, ?it/s, # trackers=12
Processed Story Blocks:   0%|          | 0/4 [00:00<?, ?it/s, # trackers=12
Processed Story Blocks:   0%|          | 0/4 [00:00<?, ?it/s, # trackers=12
Processed Story Blocks: 100%|██████████| 4/4 [00:00<00:00, 67.27it/s, # trackers=12
2020-02-20 13:42:15 DEBUG    rasa.core.training.generator  - Finished phase (68 training samples found).
2020-02-20 13:42:15 DEBUG    rasa.core.training.generator  - Starting augmentation round 2 ... (with 50 trackers)

Processed Story Blocks:   0%|          | 0/4 [00:00<?, ?it/s
Processed Story Blocks:   0%|          | 0/4 [00:00<?, ?it/s, # trackers=12
Processed Story Blocks:   0%|          | 0/4 [00:00<?, ?it/s, # trackers=12
Processed Story Blocks:   0%|          | 0/4 [00:00<?, ?it/s, # trackers=12
Processed Story Blocks:   0%|          | 0/4 [00:00<?, ?it/s, # trackers=12
Processed Story Blocks: 100%|██████████| 4/4 [00:00<00:00, 408.55it/s, # trackers=12
2020-02-20 13:42:15 DEBUG    rasa.core.training.generator  - Finished phase (125 training samples found).
2020-02-20 13:42:15 DEBUG    rasa.core.training.generator  - Found 125 training trackers.
2020-02-20 13:42:15 DEBUG    rasa.core.training.generator  - Subsampled to 121 augmented training trackers.
2020-02-20 13:42:15 DEBUG    rasa.core.training.generator  - There are 4 original trackers.
2020-02-20 13:42:15 DEBUG    rasa.core.agent  - Agent trainer got kwargs: {}
2020-02-20 13:42:15 DEBUG    rasa.core.featurizers  - Creating states and action examples from collected trackers (by MaxHistoryTrackerFeaturizer(BinarySingleStateFeaturizer))...

Processed trackers:   0%|          | 0/125 [00:00<?, ?it/s
Processed trackers:   0%|          | 0/125 [00:00<?, ?it/s, # actions=1
Processed trackers:   0%|          | 0/125 [00:00<?, ?it/s, # actions=2
...
Processed trackers:  92%|█████████▏| 115/125 [00:00<00:00, 593.13it/s, # actions=62
Processed trackers: 100%|██████████| 125/125 [00:00<00:00, 450.09it/s, # actions=62
2020-02-20 13:42:16 DEBUG    rasa.core.featurizers  - Created 62 action examples.
2020-02-20 13:42:16 INFO     rasa.core.policies.keras_policy  - Fitting model with 62 total samples and a validation split of 0.1
2020-02-20 13:42:16 DEBUG    rasa.core.policies.policy  - Parameters ignored by `model.fit(...)`: {}
Core model training completed.
Training Core model...
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
masking (Masking)            (None, 5, 20)             0         
_________________________________________________________________
lstm (LSTM)                  (None, 32)                6784      
_________________________________________________________________
dense (Dense)                (None, 14)                462       
_________________________________________________________________
activation (Activation)      (None, 14)                0         
=================================================================
Total params: 7,246
Trainable params: 7,246
Non-trainable params: 0
_________________________________________________________________
Train on 62 samples
Epoch 1/300
62/62 - 1s - loss: 2.6317 - acc: 0.1613
Epoch 2/300
62/62 - 0s - loss: 2.5885 - acc: 0.3226
...
62/62 - 0s - loss: 0.0805 - acc: 1.0000
Epoch 300/300
62/62 - 0s - loss: 0.0739 - acc: 1.0000
2020-02-20 13:42:53 INFO     rasa.core.policies.keras_policy  - Done fitting keras policy model
2020-02-20 13:42:53 DEBUG    rasa.core.featurizers  - Creating states and action examples from collected trackers (by MaxHistoryTrackerFeaturizer(NoneType))...

Processed trackers:   0%|          | 0/4 [00:00<?, ?it/s
Processed trackers:   0%|          | 0/4 [00:00<?, ?it/s, # actions=1
Processed trackers:   0%|          | 0/4 [00:00<?, ?it/s, # actions=2
Processed trackers:   0%|          | 0/4 [00:00<?, ?it/s, # actions=3
Processed trackers:   0%|          | 0/4 [00:00<?, ?it/s, # actions=4
Processed trackers:   0%|          | 0/4 [00:00<?, ?it/s, # actions=5
Processed trackers:   0%|          | 0/4 [00:00<?, ?it/s, # actions=6
Processed trackers:   0%|          | 0/4 [00:00<?, ?it/s, # actions=7
Processed trackers:   0%|          | 0/4 [00:00<?, ?it/s, # actions=8
Processed trackers:   0%|          | 0/4 [00:00<?, ?it/s, # actions=8
Processed trackers:   0%|          | 0/4 [00:00<?, ?it/s, # actions=8
Processed trackers:   0%|          | 0/4 [00:00<?, ?it/s, # actions=8
Processed trackers:   0%|          | 0/4 [00:00<?, ?it/s, # actions=8
Processed trackers:   0%|          | 0/4 [00:00<?, ?it/s, # actions=8
Processed trackers:   0%|          | 0/4 [00:00<?, ?it/s, # actions=8
Processed trackers:   0%|          | 0/4 [00:00<?, ?it/s, # actions=9
Processed trackers:   0%|          | 0/4 [00:00<?, ?it/s, # actions=10
Processed trackers:   0%|          | 0/4 [00:00<?, ?it/s, # actions=10

Proce]ssed trackers:   0%|          | 0/4 [00:00<?, ?it/s, # actions=11
Processed trackers:   0%|          | 0/4 [00:00<?, ?it/s, # actions=12
Processed trackers:   0%|          | 0/4 [00:00<?, ?it/s, # actions=12
Processed trackers:   0%|          | 0/4 [00:00<?, ?it/s, # actions=12
Processed trackers:   0%|          | 0/4 [00:00<?, ?it/s, # actions=12
Processed trackers:   0%|          | 0/4 [00:00<?, ?it/s, # actions=13
Processed trackers:   0%|          | 0/4 [00:00<?, ?it/s, # actions=14
Processed trackers: 100%|██████████| 4/4 [00:00<00:00, 2579.92it/s, # actions=14
2020-02-20 13:42:53 DEBUG    rasa.core.featurizers  - Created 14 action examples.

Processed actions: 0it [00:00, ?it/s
Processed actions: 0it [00:00, ?it/s, # examples=1
Processed actions: 0it [00:00, ?it/s, # examples=2
Processed actions: 0it [00:00, ?it/s, # examples=3
Processed actions: 0it [00:00, ?it/s, # examples=4
Processed actions: 0it [00:00, ?it/s, # examples=5
Processed actions: 0it [00:00, ?it/s, # examples=6
Processed actions: 0it [00:00, ?it/s, # examples=7
Processed actions: 0it [00:00, ?it/s, # examples=8
Processed actions: 0it [00:00, ?it/s, # examples=9
Processed actions: 0it [00:00, ?it/s, # examples=10
Processed actions: 0it [00:00, ?it/s, # examples=11
Processed actions: 0it [00:00, ?it/s, # examples=12
Processed actions: 0it [00:00, ?it/s, # examples=13
Processed actions: 0it [00:00, ?it/s, # examples=14
Processed actions: 14it [00:00, 8953.99it/s, # examples=14
2020-02-20 13:42:53 DEBUG    rasa.core.policies.memoization  - Memorized 14 unique examples.
2020-02-20 13:42:54 INFO     rasa.core.agent  - Persisted model to '/tmp/tmp6s59mxh0/core'
2020-02-20 13:42:54 DEBUG    rasa.server  - Traceback (most recent call last):
  File "/build/lib/python3.6/site-packages/rasa/server.py", line 725, in train
    force_training=rjs.get("force", False),
  File "/usr/local/lib/python3.6/asyncio/coroutines.py", line 129, in throw
    return self.gen.throw(type, value, traceback)
  File "/build/lib/python3.6/site-packages/rasa/train.py", line 97, in train_async
    kwargs,
  File "/usr/local/lib/python3.6/asyncio/coroutines.py", line 129, in throw
    return self.gen.throw(type, value, traceback)
  File "/build/lib/python3.6/site-packages/rasa/train.py", line 184, in _train_async_internal
    kwargs=kwargs,
  File "/usr/local/lib/python3.6/asyncio/coroutines.py", line 129, in throw
    return self.gen.throw(type, value, traceback)
  File "/build/lib/python3.6/site-packages/rasa/train.py", line 241, in _do_training
    persist_nlu_training_data=persist_nlu_training_data,
  File "/usr/local/lib/python3.6/asyncio/coroutines.py", line 129, in throw
    return self.gen.throw(type, value, traceback)
  File "/build/lib/python3.6/site-packages/rasa/train.py", line 463, in _train_nlu_with_validated_data
    config = await file_importer.get_config()
  File "/usr/local/lib/python3.6/asyncio/coroutines.py", line 129, in throw
    return self.gen.throw(type, value, traceback)
  File "/build/lib/python3.6/site-packages/rasa/importers/importer.py", line 236, in get_config
    configs = await asyncio.gather(*configs)
concurrent.futures._base.CancelledError

2020-02-20 13:42:57 DEBUG    rasa.core.agent  - Requesting model from server http://rasa-x:5002/api/projects/default/models/tags/production
2020-02-20 13:42:57 DEBUG    rasa.core.agent  - Model server returned 204 status code, indicating that no new model is available. Current fingerprint: 05a2825a9feac63d7568f3c2c7f7f2c5
2020-02-20 13:42:57 DEBUG    rasa.core.agent  - No new model found at URL http://rasa-x:5002/api/projects/default/models/tags/production 
2020-02-20 13:43:07 DEBUG    rasa.core.agent  - Requesting model from server http://rasa-x:5002/api/projects/default/models/tags/production
2020-02-20 13:43:07 DEBUG    rasa.core.agent  - Model server returned 204 status code, indicating that no new model is available. Current fingerprint: 05a2825a9feac63d7568f3c2c7f7f2c5
2020-02-20 13:43:07 DEBUG    rasa.core.agent  - No new model found at URL http://rasa-x:5002/api/projects/default/models/tags/production sa-x:5002/api/projects/default/models/tags/production```

@lepoulle could you please format your post surrounding the whole logs with ```, and not just individual lines? It’s quite hard to read.

Sorry !! I put the ``` in the original post.

Sorry for getting back to you so late… not sure if you’ve since resolved this issue. It seems like the request gets cancelled when you try to get the config from rasa-x. The whole thing only takes 1 minute - seems strange that it would timeout after that