How to know which epoch's model is saved in Rasa's models folder after using checkpoint_model: True

rasa_version: 2.8.3
python_version: 3.8.0

I set checkpoint_model: True to save the best-performing model during training and trained the model using rasa train. The trained model was saved in the default --out directory, "models". How can I tell which epoch's model was saved in the models folder? Is it really the best-performing model?

Is there any way to find out which epoch's model is saved when using checkpoint_model: True?

Below is the config.yml I used:

language: en

pipeline:
  - name: WhitespaceTokenizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: char_wb
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    epochs: 300
    evaluate_on_number_of_examples: 40
    evaluate_every_number_of_epochs: 5
    tensorboard_log_directory: "tensorboard3"
    tensorboard_log_level: "epoch"
    checkpoint_model: True
    constrain_similarities: True
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100
    constrain_similarities: true
  - name: FallbackClassifier
    threshold: 0.3
    ambiguity_threshold: 0.1

# Configuration for Rasa Core.
policies:
  - name: MemoizationPolicy
  - name: TEDPolicy
    max_history: 5
    epochs: 300
    evaluate_on_number_of_examples: 40
    evaluate_every_number_of_epochs: 5
    tensorboard_log_directory: "tensorboard3"
    tensorboard_log_level: "epoch"
    #checkpoint_model: True
    #constrain_similarities: True
  - name: RulePolicy

Hi @SowmyaBalam, the training logs should tell you which epoch it stopped at. Do you see it there? If not, can you post your training logs?

@SowmyaBalam Heya! No worries, here is a quick solution.

Step 1. Go to your models folder, where each trained model is saved under a name containing its date and time.

Step 2. Unzip that model archive.

Step 3. Go to the NLU folder.

Step 4. Open the metadata.json file there; I think you will see everything in it.

Note: I only have one standard configuration, so I can only see 100 epochs. If you vary the epochs, I guess that will be reflected in the folder. Please experiment with your own use case. I guess you can also see the checkpoint file. Do let me know 🙂 Good luck!
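The steps above can be sketched in Python so the archive does not have to be unpacked by hand. This is a minimal sketch under stated assumptions: the trained model is a gzip-compressed tar archive containing nlu/metadata.json, and that file has a top-level "pipeline" list whose entries carry each component's settings. The function names read_nlu_metadata and summarize_pipeline are made up for illustration. Note that this shows the configured values (such as epochs); it may not reveal which epoch's checkpoint was actually kept.

```python
import json
import tarfile


def read_nlu_metadata(archive_path):
    """Return the parsed nlu/metadata.json from a model archive.

    Assumes a .tar.gz archive with an nlu/metadata.json member.
    """
    with tarfile.open(archive_path, "r:gz") as archive:
        for member in archive.getmembers():
            # Match the file whether or not the member name starts with "./".
            if member.name.replace("\\", "/").endswith("nlu/metadata.json"):
                with archive.extractfile(member) as handle:
                    return json.load(handle)
    raise FileNotFoundError("nlu/metadata.json not found in " + str(archive_path))


def summarize_pipeline(metadata):
    """List (name, epochs, checkpoint_model) for each pipeline component.

    Assumes a top-level "pipeline" list; missing keys come back as None.
    """
    return [
        (c.get("name"), c.get("epochs"), c.get("checkpoint_model"))
        for c in metadata.get("pipeline", [])
    ]
```

To try it, pass the path of one archive from your models folder to read_nlu_metadata and print the result of summarize_pipeline on the returned dictionary.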

Hi! I have the same problem. I can see in the training logs which epoch was stored:

2022-04-08 12:32:14 INFO rasa.utils.tensorflow.models - The model of epoch 10 (out of 40 in total) will be stored!

But I can't see this in the checkpoint file (metadata.json). Could you tell me exactly where you found this information?

Thank you in advance!
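Since the stored epoch shows up in the training log, one way to capture it is to save the output of rasa train to a file and search it for that message. The sketch below is only an illustration: the pattern is copied from the log line quoted in this thread, other Rasa versions may word the message differently, and find_stored_epochs is a hypothetical helper name.

```python
import re

# Pattern based on the log line quoted in this thread; the exact wording
# is an assumption for any other Rasa version.
STORED_EPOCH = re.compile(
    r"The model of epoch (\d+) \(out of (\d+) in total\) will be stored"
)


def find_stored_epochs(log_text):
    """Return a list of (stored_epoch, total_epochs) pairs found in the log."""
    return [
        (int(match.group(1)), int(match.group(2)))
        for match in STORED_EPOCH.finditer(log_text)
    ]
```

If more than one component is trained with checkpointing, the list may contain one pair per component, in the order the messages were logged. An empty list means the message was not found in the text you passed in.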

What if the training log message about which epoch is best for the model doesn't appear? That's my problem right now.

2022-10-24 14:12:00 INFO rasa.nlu.model - Starting to train component DIETClassifier

D:\miniconda3\envs\cuda11-2\lib\site-packages\rasa\utils\tensorflow\model_data.py:750: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray

np.concatenate(np.array(f)),

Epochs: 0%| | 0/1000 [00:00<?, ?it/s]

2022-10-24 14:12:02.403448: E tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1666] function cupti_interface_->Subscribe( &subscriber_, (CUpti_CallbackFunc)ApiCallback, this)failed with error CUPTI could not be loaded or symbol could not be found.

2022-10-24 14:12:02.403638: E tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1757] function cupti_interface_->Finalize()failed with error CUPTI could not be loaded or symbol could not be found.

2022-10-24 14:12:13.329456: E tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1666] function cupti_interface_->Subscribe( &subscriber_, (CUpti_CallbackFunc)ApiCallback, this)failed with error CUPTI could not be loaded or symbol could not be found.

2022-10-24 14:12:13.589573: E tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1757] function cupti_interface_->Finalize()failed with error CUPTI could not be loaded or symbol could not be found.

Epochs: 100%|█| 1000/1000 [10:48<00:00, 1.54it/s, t_loss=1.24, i_acc=1, val_t_loss=2.55, val_i_acc=0.9

2022-10-24 14:22:51 INFO rasa.nlu.model - Finished training component.

These are my training logs for DIET.