ValueError: Unknown data format for file data/stories.md

I am getting an error when I try to train Rasa NLU on the rasa master branch using the following command.

rasa train nlu -c config.yml -o models --verbose

My data/nlu.md file content is given below:

intent:greet

  • hey
  • hello there
  • hi
  • good morning
  • good evening
  • hey there
  • let’s go
  • hey dude
  • goodmorning
  • goodevening
  • good afternoon
  • Hi
  • Hello

My data/stories.md file content is given below:

greet

  • greet
    • utter_greet

Error log:

2019-04-20 07:20:16 INFO rasa.nlu.utils.spacy_utils - Trying to load spacy model with name 'en'
2019-04-20 07:20:32 INFO rasa.nlu.components - Added 'SpacyNLP' to component cache. Key 'SpacyNLP-en'.
2019-04-20 07:20:32 INFO rasa.nlu.training_data.loading - Training data format of data/nlu.md is md
2019-04-20 07:20:32 INFO rasa.nlu.training_data.training_data - Training data stats:
  - intent examples: 13 (1 distinct intents)
  - Found intents: 'greet'
  - entity examples: 0 (0 distinct entities)
  - found entities:

Traceback (most recent call last):
  File "/home/ubuntu/virtual_env/bin/rasa", line 11, in <module>
    load_entry_point('rasa', 'console_scripts', 'rasa')()
  File "/home/ubuntu/projects/rasa/rasa/main.py", line 64, in main
    cmdline_arguments.func(cmdline_arguments)
  File "/home/ubuntu/projects/rasa/rasa/cli/train.py", line 164, in train_nlu
    return train_nlu(config, nlu_data, output, train_path)
  File "/home/ubuntu/projects/rasa/rasa/train.py", line 174, in train_nlu
    config, nlu_data, _train_path, project="", fixed_model_name="nlu"
  File "/home/ubuntu/projects/rasa/rasa/nlu/train.py", line 159, in train
    training_data = load_data(data, nlu_config.language)
  File "/home/ubuntu/projects/rasa/rasa/nlu/training_data/loading.py", line 55, in load_data
    data_sets = [_load(f, language) for f in files]
  File "/home/ubuntu/projects/rasa/rasa/nlu/training_data/loading.py", line 55, in <listcomp>
    data_sets = [_load(f, language) for f in files]
  File "/home/ubuntu/projects/rasa/rasa/nlu/training_data/loading.py", line 114, in _load
    raise ValueError("Unknown data format for file {}".format(filename))
ValueError: Unknown data format for file data/stories.md

Thank you

Hey there @anoopmohan! We’re still working out the kinks in the new CLI, and this is one we’re already aware of. If you’re trying to train a full Core/NLU model (my guess is yes because you provided stories too), you should do it with rasa train as this will train both core and nlu. I believe for this the default path of -d data should be fine, so you don’t have to pass an argument. If you specifically only want to train an nlu model, then use rasa train nlu, but add the path to your specific nlu file, so -d data/nlu.md.
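The two options described above can be sketched as a small shell helper. This is only an illustration: the `train_cmd` function name is hypothetical, the flags are the ones quoted in this thread, and the commands are echoed rather than executed so the snippet can be tried without Rasa installed.

```shell
# Sketch: pick the training command based on what should be trained.
# train_cmd is a hypothetical helper; it only prints the command.
train_cmd() {
  case "$1" in
    # Full Core + NLU model: the default data directory (data) is used.
    full) echo "rasa train --verbose" ;;
    # NLU only: point -d at the NLU file so stories.md is not loaded as NLU data.
    nlu)  echo "rasa train nlu -d data/nlu.md --verbose" ;;
    *)    echo "usage: train_cmd full|nlu" >&2; return 1 ;;
  esac
}

train_cmd full
train_cmd nlu
```

To actually run one of them, the printed command can be copied into the terminal from the project root.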

Thank you very much @erohmensing for the quick response. Just to get more clarity on this: I have my nlu.md and stories.md files stored in the data directory. So, if I need to train both NLU and Core, I believe I should use the following command.

rasa train -c config.yml -o models --verbose

Please correct me if I am wrong. Thank you,

Yep that should work I believe! And those config and outfile locations are default, so rasa train --verbose should do it. Keep in mind both your nlu config and your policies should be in the config.yml with the new setup.
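As a rough illustration of a combined config, the snippet below writes an example file containing both an NLU pipeline and Core policies. Everything in it is an assumption apart from the `en` language and the `SpacyNLP` component, which appear in the log above: the remaining component and policy names may differ between Rasa versions, and `config.example.yml` is a hypothetical file name chosen so that an existing config.yml is not overwritten.

```shell
# Sketch: write an example combined config (hypothetical file name).
cat > config.example.yml <<'EOF'
# NLU part: language and pipeline components (names are assumptions)
language: en
pipeline:
  - name: SpacyNLP
  - name: SpacyTokenizer
  - name: SpacyFeaturizer
  - name: SklearnIntentClassifier

# Core part: dialogue policies (names are assumptions)
policies:
  - name: MemoizationPolicy
  - name: KerasPolicy
EOF
```

If the names match the installed version, the file could then be passed with -c config.example.yml.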

Great! Thank you for your support @erohmensing

Thank you @erohmensing for your help. Yes, that works without any issues.
