Has anyone noticed faster training times going from core 0.13.8 to 0.14.5?

I’ve been working with some training data that typically takes about 2 hours to train in total, but after upgrading to 0.14.5 I’ve noticed that it samples far fewer examples than before.

This is the output I get from running a training session:

Processed Story Blocks: 100%|██████████████████████████████████████████████████████████████████████████████████████████████| 468/468 [00:00<00:00, 1187.92it/s, # trackers=1]
Processed Story Blocks: 100%|███████████████████████████████████████████████████████████████████████████████████████████████| 468/468 [00:15<00:00, 30.65it/s, # trackers=50]
Processed Story Blocks: 100%|███████████████████████████████████████████████████████████████████████████████████████████████| 468/468 [00:15<00:00, 29.87it/s, # trackers=50]
Processed Story Blocks: 100%|███████████████████████████████████████████████████████████████████████████████████████████████| 468/468 [00:16<00:00, 27.76it/s, # trackers=50]
2019-06-10 19:24:05.647213: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-06-10 19:24:05.670416: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3200030000 Hz
2019-06-10 19:24:05.670805: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x555e28852470 executing computations on platform Host. Devices:
2019-06-10 19:24:05.670839: I tensorflow/compiler/xla/service/service.cc:158]   StreamExecutor device (0): <undefined>, <undefined>
WARNING:tensorflow:From /usr/local/lib/python3.6/site-packages/tensorflow/python/ops/resource_variable_ops.py:435: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
2019-06-10 19:24:05 WARNING  tensorflow  - From /usr/local/lib/python3.6/site-packages/tensorflow/python/ops/resource_variable_ops.py:435: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
WARNING:tensorflow:From /usr/local/lib/python3.6/site-packages/tensorflow/python/keras/backend.py:4010: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
2019-06-10 19:24:05 WARNING  tensorflow  - From /usr/local/lib/python3.6/site-packages/tensorflow/python/keras/backend.py:4010: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
masking (Masking)            (None, 10, 819)           0         
_________________________________________________________________
lstm (LSTM)                  (None, 32)                109056    
_________________________________________________________________
dense (Dense)                (None, 820)               27060     
_________________________________________________________________
activation (Activation)      (None, 820)               0         
=================================================================
Total params: 136,116
Trainable params: 136,116
Non-trainable params: 0
_________________________________________________________________
2019-06-10 19:24:06 INFO     rasa_core.policies.keras_policy  - Fitting model with 3863 total samples and a validation split of 0.1
WARNING:tensorflow:From /usr/local/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
2019-06-10 19:24:06 WARNING  tensorflow  - From /usr/local/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
Epoch 1/150
3863/3863 [==============================] - 2s 584us/sample - loss: 5.2763 - acc: 0.3546
Epoch 2/150
3863/3863 [==============================] - 2s 492us/sample - loss: 4.4344 - acc: 0.3642
Epoch 3/150
3863/3863 [==============================] - 2s 459us/sample - loss: 4.3978 - acc: 0.3642
Epoch 4/150
3863/3863 [==============================] - 2s 490us/sample - loss: 4.3695 - acc: 0.3642
Epoch 5/150
3863/3863 [==============================] - 2s 498us/sample - loss: 4.3194 - acc: 0.3642
... and so on

I can see the summary shows 136,116 trainable parameters, but the policy only trains on 3,863 samples. Any idea why? If I remember correctly, the number of samples and the number of trainable parameters used to be roughly one-to-one, which is why training took so long. I haven’t changed anything in my policies or my training call; both are below, followed by a rough parameter-count check.

policies.yml:

policies:
  - name: "KerasPolicy"
    epochs: 150
    featurizer:
    - name: MaxHistoryTrackerFeaturizer
      max_history: 10
      state_featurizer:
        - name: LabelTokenizerSingleStateFeaturizer
  - name: "MemoizationPolicy"
    max_history: 10
  - name: "MappingPolicy"

training call:

rasa-core train -c config/policies.yml --stories stories --out model
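
For what it’s worth, the 136,116 figure seems to come straight from the layer shapes in the model summary rather than from the number of samples. Here’s a rough sanity check, assuming the standard Keras formulas for LSTM and Dense parameter counts (this is just my own arithmetic, not anything taken from the Rasa source):

# Parameter count implied by the model summary above
input_dim, units, n_actions = 819, 32, 820

lstm_params = 4 * ((input_dim + units) * units + units)  # 109,056
dense_params = (units + 1) * n_actions                   # 27,060
print(lstm_params + dense_params)                        # 136,116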

If anyone has encountered this before, please let me know!

Yes, we changed the augmentation logic. We now perform additional subsampling of augmented stories to prevent their number from growing exponentially.
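
If you want to get closer to the old behaviour, you can reduce or disable augmentation when loading the training data. A minimal sketch using the rasa_core Python API (the paths and the augmentation_factor value are placeholders, and the policies are left at their defaults for brevity):

from rasa_core.agent import Agent
from rasa_core.policies.keras_policy import KerasPolicy
from rasa_core.policies.memoization import MemoizationPolicy

# Placeholder paths -- point these at your own domain, stories and model dirs.
agent = Agent("domain.yml", policies=[KerasPolicy(), MemoizationPolicy()])

# augmentation_factor=0 switches story augmentation off entirely;
# a small positive value keeps a limited number of augmented stories.
training_data = agent.load_data("stories/", augmentation_factor=0)
agent.train(training_data)
agent.persist("model/")

I believe the command-line trainer also exposes an --augmentation option for the same knob, but check --help for your version.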
