Training Rasa using Rasa NLU

I’m following the code sample here to train the model via Python.

from rasa_nlu.training_data import load_data
from rasa_nlu.model import Trainer
from rasa_nlu import config

training_data = load_data('data/examples/rasa/demo-rasa.json')
trainer = Trainer(config.load("sample_configs/config_pretrained_embeddings_spacy.yml"))
trainer.train(training_data)  # Fit the pipeline on the training data before persisting
model_directory = trainer.persist('./projects/default/')  # Returns the directory the model is stored in

Now I have two issues.

  1. My training data isn’t in JSON format; it is in YAML format, like so:

version: "2.0"

nlu:
- intent: greet
  examples: |
    - hey
    - hello
    - hi
    - hello there
    - good morning
    - good evening
    - moin
    - hey there
    - let's go
    - hey dude
    - goodmorning
    - goodevening
    - good afternoon

How do I load the training data from a YAML file? When I try, I get this error: ValueError: Unknown data format for file

  2. config.load("sample_configs/config_pretrained_embeddings_spacy.yml") throws this error:
    return yaml.load(read_file(filename, "utf-8"))
TypeError: load() missing 1 required positional argument: 'Loader'

I fear that you’re referring to a rather old 0.x version of Rasa that isn’t supported anymore. In modern versions of Rasa, training is triggered from the command line via:

rasa train

Is there a reason why you’re using such an old version?

What if we want to write our own endpoints for training NLU and Core separately?

Technically, they are trained separately in 2.x. You can confirm by running:

rasa train --help
usage: rasa train [-h] [-v] [-vv] [--quiet] [--data DATA [DATA ...]]
                  [-c CONFIG] [-d DOMAIN] [--out OUT] [--dry-run]
                  [--augmentation AUGMENTATION] [--debug-plots]
                  [--num-threads NUM_THREADS]
                  [--fixed-model-name FIXED_MODEL_NAME] [--persist-nlu-data]
                  [--force] [--finetune [FINETUNE]]
                  [--epoch-fraction EPOCH_FRACTION]
                  {core,nlu} ...

positional arguments:
    core                Trains a Rasa Core model using your stories.
    nlu                 Trains a Rasa NLU model using your NLU data.

That means that you can run:

rasa train nlu

to run only the NLU part of the pipeline. Or, if you only want the Core policies:

rasa train core
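
If the goal is to trigger these commands from your own service rather than from a shell, one option is to call the CLI as a subprocess. The sketch below is only an illustration: `build_train_command` and `run_training` are hypothetical helper names, the paths are placeholders, and it assumes that the `-c`, `--out` and `--fixed-model-name` options from the help output above are also accepted after the `nlu` and `core` subcommands.

```python
import subprocess
from typing import List, Optional


def build_train_command(
    part: str,
    config: str,
    out: str = "models",
    fixed_model_name: Optional[str] = None,
) -> List[str]:
    """Build the argument list for `rasa train nlu` or `rasa train core`."""
    if part not in ("nlu", "core"):
        raise ValueError(f"part must be 'nlu' or 'core', got {part!r}")
    # Assumption: both subcommands accept -c, --out and --fixed-model-name.
    command = ["rasa", "train", part, "-c", config, "--out", out]
    if fixed_model_name is not None:
        command += ["--fixed-model-name", fixed_model_name]
    return command


def run_training(command: List[str]) -> int:
    """Run the command and return its exit code (requires Rasa to be installed)."""
    return subprocess.run(command, check=False).returncode


# Placeholder paths and model name, for illustration only.
nlu_command = build_train_command("nlu", "config.yml", fixed_model_name="nlu_only")
core_command = build_train_command("core", "config.yml")
print(" ".join(nlu_command))
```

Neither subcommand starts the Rasa server; the process exits once training has finished.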

Actually, I want to write my own server using FastAPI, and for that I need the actual code. I do not want to start the Rasa server, so I think the above commands will only work if I run the Rasa HTTP API.

While you could write a lot of code to wrap around our NLU objects, you can also run Rasa as a service via:

rasa run --enable-api

I even wrote a blog post that explains the details on how to use Rasa’s NLU pipeline as a general NLU service in Docker.

rasa run will open all the endpoints; my custom API will only have a training endpoint that trains the models, along with other custom endpoints. I have seen the blog post; it also trains models using rasa train, which is not what I need.

What I am doing is this: I have installed Rasa 2.8.8, imported train from rasa.api, and am trying to train by passing paths. It trained the first time, but after that it gives me this error: RuntimeError: This event loop is already running

Below is the code to train the models.

from rasa.api import train

domain = 'sample_bot/domain.yml'
config = 'sample_bot/config.yml'
training_files = 'sample_bot/data'
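fixed_model_name = 'self_model'
dry_run = False
force_training = True

# Train using the settings above
train(domain=domain, config=config, training_files=training_files,
      fixed_model_name=fixed_model_name, dry_run=dry_run,
      force_training=force_training)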


Just to double-check, are you running that from a Python script or from Jupyter?

Yes :confused:

It’s Jupyter.

Could you try running it in a normal Python script? If memory serves, Jupyter itself runs inside an async event loop via Tornado, which might be confusing the async code inside of Rasa.
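
The error message itself comes from asyncio rather than from Rasa. As a minimal, Rasa-free sketch of the situation: `notebook_cell` below stands in for code that executes while a loop is already running (as in Jupyter), and `work` is a hypothetical placeholder for an async training routine.

```python
import asyncio


async def work() -> str:
    """Hypothetical placeholder for an async training routine."""
    return "trained"


async def notebook_cell() -> str:
    # We are inside a running loop here, much like a cell in Jupyter.
    loop = asyncio.get_running_loop()
    coro = work()
    try:
        # Blocking on the loop that is already running is not allowed.
        loop.run_until_complete(coro)
    except RuntimeError as err:
        coro.close()  # the coroutine was never scheduled, so close it cleanly
        return str(err)
    return "no error"


message = asyncio.run(notebook_cell())
print(message)
```

In a plain script no loop is running when the blocking call is made, which would be consistent with the script working while the notebook does not.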

Working perfectly fine in a normal Python script. Thanks.

Any workaround for Jupyter?

Not without getting hacky as far as I know.
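
For reference, one such hack is to run the blocking call in a worker thread that owns its own event loop, so it never touches the loop Jupyter is running. The sketch below uses a stand-in `blocking_train` function instead of Rasa, and `call_in_fresh_loop_thread` is a hypothetical helper; whether Rasa's `train` behaves well when called this way is an untested assumption.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


async def fake_training() -> str:
    """Hypothetical async routine standing in for the real training work."""
    await asyncio.sleep(0)
    return "self_model.tar.gz"


def blocking_train() -> str:
    """Stand-in for a library call that drives the current thread's loop itself."""
    loop = asyncio.get_event_loop()
    return loop.run_until_complete(fake_training())


def call_in_fresh_loop_thread(func):
    """Run func in a worker thread that has its own, otherwise idle, event loop."""

    def target():
        loop = asyncio.new_event_loop()
        asyncio.set_event_loop(loop)
        try:
            return func()
        finally:
            asyncio.set_event_loop(None)
            loop.close()

    with ThreadPoolExecutor(max_workers=1) as pool:
        # .result() blocks the caller until the worker thread has finished.
        return pool.submit(target).result()


async def notebook_cell() -> str:
    # A loop is already running here, as it would be inside Jupyter.
    return call_in_fresh_loop_thread(blocking_train)


result = asyncio.run(notebook_cell())
print(result)
```

The cell stays blocked until the worker thread returns, so this trades the error for a notebook that is unresponsive during training.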

OK, thanks. You helped me solve the issue :slight_smile: