rasa.train and rasa.model_training.train_nlu run indefinitely in a Jupyter notebook

Hi.

I’m working on a Jupyter notebook to train, test and compare many Rasa models as fast as Python allows, and to produce some custom reports that I think can help my team choose between pipelines.

However, I always get stuck after training the first model. With either rasa.train or rasa.model_training.train_nlu, the notebook cell just keeps running indefinitely after saving the trained model and never returns anything.

I’m using Rasa 2.8.26 and Python 3.8.12, and I train models with:

import rasa

model_path = rasa.train(
    domain=domain_file,
    config=f"config/{config_file}",
    training_files=[training_files],
    output='models/',
    persist_nlu_training_data=True,
    fixed_model_name=config_file[:-4],  # config file name minus its last 4 characters
)

or

import rasa.model_training

model_path = rasa.model_training.train_nlu(
    domain=domain_file,
    config=f"config/{config_file}",
    nlu_data=training_files,
    output='models/',
    persist_nlu_training_data=True,
    fixed_model_name=config_file[:-4],  # config file name minus its last 4 characters
    force_training=True,
)

I don’t have any stories available right now and just want to train and compare NLU models. At the beginning of training I get the message “No stories present. Just a Rasa NLU model will be trained.”

At the end of rasa.train I receive the message “NLU model training completed. Your Rasa model is trained and saved at [...]”, but the cell keeps running indefinitely.

So my questions are:

  1. How should I train models in Jupyter notebooks?
  2. Should rasa.train or rasa.model_training.train_nlu run indefinitely, or am I not using them properly?
  3. And, on a side note, how do I run “async” functions like compare_nlu_models in Jupyter notebooks?

Thanks and keep up the amazing work!

@edheinen

  1. how should I train models in jupyter notebooks?

I am not sure why you are using a Jupyter notebook (JN). I’d recommend using Rasa Open Source on its own, or Rasa X, to run the bot. You can train the model and save it inside the project. If you expect training to be faster in Colab or JN, I don’t think it will be.

  2. Should rasa.train or rasa.model_training.train_nlu run indefinitely or am I not using it properly?

I guess both are the same.

  3. And, on a side note, how to run “async” functions like compare_nlu_models in jupyter notebooks?

Refer to this link: Testing Your Assistant. Everything is covered there in detail.

I hope this gives you more clarity. Good luck!

It’s not the training that I hope to run faster in JN but the testing. The CLI’s rasa test takes ~1h to test a single model, while I can load the model and make predictions for the same dataset in 2 minutes with pandas. That’s also why I asked about the async functions: I plan to classify the whole dataset by loading the model as a Rasa interpreter, applying it to a pandas DataFrame with the examples, and then using Rasa methods to produce the classification reports.
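To make the plan concrete, here is a minimal sketch of the reporting step using only the standard library. It is not Rasa’s own evaluation code. The `parse` argument is a hypothetical callable that maps a text to an intent name; with a loaded Rasa 2.x interpreter it might wrap something like `interpreter.parse(text)["intent"]["name"]`, but that wiring is an assumption and is not shown here. The DataFrame columns would be passed in as plain lists.

```python
from collections import Counter


def predict_all(texts, parse):
    """Apply `parse` (hypothetical: text -> intent name) to every example."""
    return [parse(text) for text in texts]


def intent_report(true_intents, predicted_intents):
    """Return per-intent precision, recall, F1 and support as a dict."""
    if len(true_intents) != len(predicted_intents):
        raise ValueError("true and predicted lists must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for true, pred in zip(true_intents, predicted_intents):
        if true == pred:
            tp[true] += 1
        else:
            fp[pred] += 1  # predicted this intent, but it was wrong
            fn[true] += 1  # missed the true intent

    report = {}
    for label in sorted(set(true_intents) | set(predicted_intents)):
        predicted_count = tp[label] + fp[label]
        support = tp[label] + fn[label]
        precision = tp[label] / predicted_count if predicted_count else 0.0
        recall = tp[label] / support if support else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "support": support,
        }
    return report
```

The resulting dict of dicts can be turned into a table with `pandas.DataFrame(report).T` if a DataFrame view is preferred.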

Thanks for the answer. I’ll train models using !rasa train […] and try to test them in JN.

edit1: I see the link you sent also uses the CLI. What I wanted to learn is how to run the async functions from JN and use them to produce the classification reports from a pandas DataFrame with the model’s predictions. I also wanted to train many models programmatically, loading all configs in a folder and training them one by one through the night, which is why I was looking for a non-CLI way of training and testing.
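For the overnight loop, one possible sketch is to drive the CLI from Python, so that each training runs in its own process. This still relies on the CLI rather than the Python API. The function names, the `*.yml` glob and the folder layout are assumptions for illustration; the flags are the ones I understand `rasa train nlu` to accept in Rasa 2.x, so check `rasa train nlu --help` before relying on them.

```python
import subprocess
from pathlib import Path


def build_train_commands(config_dir, nlu_data, out_dir="models/"):
    """Build one `rasa train nlu` command per *.yml config in `config_dir`."""
    commands = []
    for config_path in sorted(Path(config_dir).glob("*.yml")):
        commands.append([
            "rasa", "train", "nlu",
            "--config", str(config_path),
            "--nlu", str(nlu_data),
            "--out", str(out_dir),
            # name the model after the config file, without its extension
            "--fixed-model-name", config_path.stem,
        ])
    return commands


def train_all(commands):
    """Run the commands one by one; return (model_name, return_code) pairs."""
    results = []
    for cmd in commands:
        completed = subprocess.run(cmd)  # blocks until this training exits
        results.append((cmd[-1], completed.returncode))
    return results
```

Calling `train_all(build_train_commands("config", "data/nlu.yml"))` (both paths hypothetical) would then train the configs sequentially, and a non-zero return code marks a run that failed.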

edit2: I just found out I can pass variables as part of the shell commands in JN, like:

config_file = ...
!rasa train --config {config_file}

The async functions and JN tests are still unanswered though.
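On the async part, a general note that is not specific to Rasa: a notebook kernel already runs an asyncio event loop, so `asyncio.run(...)` raises an error inside a cell, while recent IPython versions allow `await some_coroutine()` directly at the top level of a cell. When a plain blocking call is needed instead, a helper like the sketch below may work. `run_coroutine_sync` and `slow_task` are hypothetical names, and `slow_task` only stands in for an async function such as compare_nlu_models. I have not verified that this avoids the hang described above.

```python
import asyncio
import threading


def run_coroutine_sync(coro):
    """Run `coro` to completion and return its result.

    Uses asyncio.run when no event loop is running (plain script); otherwise
    (e.g. inside a notebook cell) runs it on a fresh loop in a worker thread.
    """
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        return asyncio.run(coro)  # no loop running in this thread

    outcome = {}

    def _worker():
        try:
            outcome["value"] = asyncio.run(coro)
        except BaseException as exc:  # re-raised in the calling thread below
            outcome["error"] = exc

    thread = threading.Thread(target=_worker)
    thread.start()
    thread.join()  # block the caller until the coroutine has finished
    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]


async def slow_task(n):
    """Hypothetical stand-in for an async function you want to call."""
    await asyncio.sleep(0.01)
    return n * 2
```

In a cell, `result = run_coroutine_sync(slow_task(21))` would block until the coroutine finishes, whereas `result = await slow_task(21)` is the shorter route when top-level await is available.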