Split: domain.yml

Are there plans to let us split domain.yml the same way we can split intents and stories across multiple .md files? What would this be good for?

  • the file will be less cluttered
  • it will be easy to switch large portions of training data on or off
  • you can develop the bot incrementally
  • it will be easy to import data from other people
  • name collisions: you should think about allowing hierarchical IDs

    e.g. ## intent: banter.greet ; ## intent: greet.greet

    or automatically prepend a prefix based on the directory
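
To make the "prepend based on the directory" idea concrete, here is a minimal sketch of what such a preprocessing step could look like. It is not something Rasa does; `prefix_intents` and the `data/banter/nlu.md` layout are hypothetical names used only for illustration, and it assumes the Markdown NLU format with `## intent: name` headers.

```python
import re
from pathlib import PurePosixPath

# Matches Markdown NLU headers such as "## intent: greet".
INTENT_HEADER = re.compile(r"^(##\s*intent:\s*)(\S+)(.*)$")


def prefix_intents(markdown: str, file_path: str) -> str:
    """Prepend the parent directory name to every intent header (hypothetical helper)."""
    # "data/banter/nlu.md" -> "banter"; a bare "nlu.md" yields an empty prefix.
    prefix = PurePosixPath(file_path).parent.name
    out = []
    for line in markdown.splitlines():
        match = INTENT_HEADER.match(line)
        if match and prefix:
            line = f"{match.group(1)}{prefix}.{match.group(2)}{match.group(3)}"
        out.append(line)
    return "\n".join(out)


print(prefix_intents("## intent: greet\n- hello", "data/banter/nlu.md"))
```

Stories and the domain would need the same renaming for the prefixed intents to line up, which this sketch does not attempt.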

Directory structure of the data dir:

+base - domain.yaml, nlu.md, stories.md
+prod - domain.yaml, *.md's
+new - .... test data
+banter - ...
+greet
+ ...

rasa train --data=base,prod,new

Hi @sten, I hacked together a way to do this. It is not ideal, but works pretty well, for me at least.

In a similar fashion, you can create a data/domain directory and split your files up. Then run my script right before you train, and it will merge all of those files into a single domain.yml file.

The script is here: Rasa domain Assembler (GitHub)

I just added this into a script I run whenever I train.

Hope that helps until it is a feature of Rasa.
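
For anyone reading later, here is a rough sketch of the merge idea described above. It is not the linked script, just one way it could be done, assuming the fragments live in data/domain/*.yml and that PyYAML is installed. The names `merge_domains` and `assemble` are made up for this example.

```python
from pathlib import Path


def merge_domains(parts):
    """Merge parsed domain fragments (dicts) into one dict.

    Lists are concatenated without adding duplicates, nested dicts are
    merged recursively, and for any other value the later fragment wins.
    """
    merged = {}
    for part in parts:
        for key, value in part.items():
            current = merged.get(key)
            if isinstance(current, list) and isinstance(value, list):
                merged[key] = current + [v for v in value if v not in current]
            elif isinstance(current, dict) and isinstance(value, dict):
                merged[key] = merge_domains([current, value])
            else:
                merged[key] = value
    return merged


def assemble(domain_dir="data/domain", output="domain.yml"):
    """Read every *.yml fragment in domain_dir and write one merged file."""
    import yaml  # PyYAML (third-party); assumed to be installed

    files = sorted(Path(domain_dir).glob("*.yml"))
    # An empty file parses to None, so fall back to an empty dict.
    parts = [yaml.safe_load(f.read_text()) or {} for f in files]
    Path(output).write_text(yaml.safe_dump(merge_domains(parts), sort_keys=False))
```

Calling `assemble()` right before training would regenerate domain.yml from the fragments. Files are merged in sorted filename order, so when two fragments set the same scalar key, the later file wins.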

thanks …

hiyapyco - cool…

I searched for a YAML and Markdown library and found none that lets you manipulate the structure in Python …

It also seems that the Markdown parser Rasa uses doesn’t support multi-line entries like this: markdown-multiline

I also just figured out that I can use the C/C++ preprocessor:

cpp -P included.inc > output.file

Interesting. I haven’t had any problems with multi-line entries, at least not yet, and I have quite a few.

@jonathanpwheat you might be interested in the experimental MultiProjectImporter: Training Data Importers