Splitting up domain.yml?

jonathanpwheat · September 18, 2019, 8:42pm

I’ve been able to split up my large nlu.md and stories.md files using the method explained here - How do you seperate stories into different story files?

Is there a similar way to do that with the domain.yml file?

oliver5881 · September 19, 2019, 7:40am

Have a look at Training Data Importers

jonathanpwheat · September 20, 2019, 6:22pm

Thanks, I read that, and it seems like overkill to set up multiple bots when I really only want to divide the “topic areas” in my domain.yml file into different files, mostly for manageability and extensibility.

It sounds like that takes all the files and assembles them at build / training time, which gave me an idea. Maybe I’ll split them up and just write a tool to concatenate them before I run the train command.
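A minimal sketch of that concatenation idea, using only the standard library. The function name and the header comment written above each piece are illustrative, not from the thread. Plain concatenation only yields valid YAML when every source file uses different top-level keys; files that repeat a key (for example two files that both start with `responses:`) need a real merge instead. The target file should also live outside the source directory so it is not picked up on the next run.

```python
from pathlib import Path


def concatenate_yaml(source_dir, target_file):
    """Join every .yml file in source_dir into one file, in sorted name order.

    Assumes each source file uses distinct top-level keys; no merging is done.
    Returns the number of files that were joined.
    """
    parts = []
    for path in sorted(Path(source_dir).glob("*.yml")):
        text = path.read_text(encoding="utf-8").rstrip("\n")
        # Label each piece with a YAML comment so the output stays traceable.
        parts.append(f"# --- {path.name} ---\n{text}\n")
    Path(target_file).write_text("\n".join(parts), encoding="utf-8")
    return len(parts)
```

Calling `concatenate_yaml("data/domain", "domain.yml")` before the train command would follow the layout described later in this thread.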

jonathanpwheat · September 26, 2019, 4:08pm

As a response to this need of mine, I’ve written a script that merges any .yml file in data/domain and creates a domain.yml file in the project root.

I’ve posted a gist here in case this helps anyone else out -

https://gist.github.com/jwheat/9a2611b738127f45d32254117fe3b959

build-domain.py

import os
import glob
import hiyapyco

path = 'data/domain'
yaml_list = []

for filename in glob.glob(os.path.join(path, '*.yml')):
    with open(filename) as fp:
        yaml_file = fp.read()

This file has been truncated; see the gist linked above for the full script.
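The rest of the script is not shown here, so the following is not the gist's code. It is a hedged sketch of what merging already-parsed domain files could look like, assuming each file has been loaded into a Python dict by a YAML library (the gist imports hiyapyco for that part). The merge rules (dicts merged key by key, lists joined without duplicates, later scalar values winning) are one reasonable choice, not necessarily what the gist does.

```python
from functools import reduce


def deep_merge(base, extra):
    """Return a new structure that combines base and extra.

    - dicts are merged key by key, recursively
    - lists are joined, skipping items already present in base
    - for anything else, the value from extra wins
    """
    if isinstance(base, dict) and isinstance(extra, dict):
        merged = dict(base)
        for key, value in extra.items():
            merged[key] = deep_merge(base[key], value) if key in base else value
        return merged
    if isinstance(base, list) and isinstance(extra, list):
        return base + [item for item in extra if item not in base]
    return extra


def merge_all(documents):
    """Merge a sequence of parsed documents, left to right."""
    return reduce(deep_merge, documents, {})
```

The merged dict would still need to be written back out as YAML by whichever library did the parsing.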

wale · June 10, 2020, 10:15pm

This is very interesting and very useful, Jonathan! I am also trying out ways to split my domain file, as it is getting unwieldy and too bulky for my taste. I like your approach. Do you know how I can embed this into my Docker workflow?

wale · June 10, 2020, 10:17pm

You know what? Never mind. I will just do it locally. Before I push the code to git, I will run the script.

jonathanpwheat · June 12, 2020, 6:10pm

Ooo good question.

You actually shouldn’t need to, because you use my script only when you need to (re)train, so you can run it before you fire up your container(s).

Once you have your containers up, if you “ssh” into them you’d have to stop rasa, run my script, retrain, and run rasa again.

If you launch your containers and your workflow then trains and runs rasa server automatically in your container - off the top of my head, in your Dockerfile, you can probably use the COPY command to copy the script to your container’s project directory, then a RUN command to run python build-domain.py, and then a CMD instruction to launch rasa server itself (RUN executes while the image is being built, so it can’t be used to start the server).

Let me know if you try this, I’m curious.
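One way to express the sequence described above (regenerate the domain, retrain, start the server) is a small Python launcher that a container could use as its start command. This is a sketch under assumptions: the function names are hypothetical, the script is assumed to sit in the working directory as build-domain.py, and `rasa train` and `rasa run` are used with their defaults. Whether this fits a given image depends on how that image normally starts Rasa.

```python
import subprocess
import sys


def startup_commands(script="build-domain.py"):
    """Commands to run in order: rebuild domain.yml, train, then serve."""
    return [
        [sys.executable, script],  # regenerate domain.yml from the split files
        ["rasa", "train"],         # retrain with the fresh domain
        ["rasa", "run"],           # start the Rasa server (blocks until it exits)
    ]


def run_all(commands, runner=subprocess.run):
    """Run each command in turn, stopping at the first failure."""
    for command in commands:
        runner(command, check=True)
```

Calling `run_all(startup_commands())` performs the three steps; the `runner` parameter exists only so the sequence can be checked without invoking Rasa.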

wale · June 12, 2020, 9:04pm

Yeah sounds sensible. Thanks…I will definitely try this out!

ChrisRahme · March 2, 2021, 5:13pm

Hey @jonathanpwheat. Sorry to revive this old topic.

Thanks a lot for your tool. It’s a long-awaited feature in Rasa and as you said, making a custom Data Importer is overkill.

I don’t know why they made splitting NLU, stories, and rules possible, but not for the domain.

But I have a problem. My domain file contains multiple sets of characters (Latin, Arabic, and Armenian). So at first, it wasn’t even able to read the files.

Then I added encoding='utf-8' to the open() functions.

Now it reads and writes the files but it’s messing up all the Arabic and Armenian letters, the French letters with accents, and the single quotes (').

Do you have any solution?
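The thread never identifies the cause of the garbling. A first step could be to rule out plain file I/O, as in the sketch below: text written and read with the same explicit encoding survives intact, while UTF-8 bytes decoded with a different codec come out mangled. The helper names and sample string are illustrative. If file I/O checks out, the YAML serializer is the next suspect; PyYAML's `yaml.dump`, for instance, escapes non-ASCII characters unless `allow_unicode=True` is passed, though whether that applies to this script is an assumption.

```python
import tempfile
from pathlib import Path

SAMPLE = "greeting: 'مرحبا / Բարև / café'\n"


def roundtrip(text, encoding="utf-8"):
    """Write text to a temporary file and read it back with the same encoding."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "sample.yml"
        path.write_text(text, encoding=encoding)
        return path.read_text(encoding=encoding)


def misread(text):
    """Simulate a step that decodes UTF-8 bytes with the wrong codec."""
    return text.encode("utf-8").decode("latin-1")
```

If `roundtrip(SAMPLE) == SAMPLE` holds but the generated domain.yml is still garbled, the problem lies somewhere other than the `open()` calls.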

jonathanpwheat · March 2, 2021, 7:20pm

Hi @ChrisRahme. If you’re using Rasa 2.x, you can split your domain files (maybe they listened? LOL).

Just create a data/domain directory and put all your domain-related files there. You can even create subdirectories under it. I would hope the out-of-the-box functionality would then deal with your other character sets properly. Give it a whirl.

I did a write up of it here - Organize your Rasa 2.0 training data like a boss - DEV Community

BUT you have to add the --domain flag when you train and validate your data.

Validate your data:

rasa data validate --domain data

Train your model:

rasa train --domain data

ChrisRahme · March 2, 2021, 9:25pm

Wow how did I not know that!

Anyway, thank you for your reply and the great article!

I placed the domain folder directly in the root of the chatbot, and I train with rasa train --domain . from there. But this is only doable on a local installation. Any idea how to do that on a server (using quick-install) via Rasa X?

I understand I can upload the model file instead of training on the interface, but it would be nice if there was a way to do that from the interface too.

jonathanpwheat · March 2, 2021, 9:30pm

They didn’t really document that feature at all. Since all the files are yml, you can literally (can’t remember if I mentioned this or not) split out each key into its own file. So you can have a forms.yml file that only has the forms: key and its data in it. It’s pretty cool.
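As a rough illustration of the one-file-per-key idea, the sketch below splits the text of an existing domain file at its top-level keys. It is text-based and deliberately simple: the regular expression and function name are assumptions for illustration, anything before the first key is dropped, and comments between sections stay with the section above them. The result should be reviewed before training on it.

```python
import re

# A top-level key starts in column 0, e.g. "forms:" or 'version: "2.0"'.
TOP_LEVEL_KEY = re.compile(r"^([A-Za-z_][\w-]*):")


def split_top_level_keys(text):
    """Map each top-level key to the block of lines that belongs to it."""
    sections = {}
    current = None
    for line in text.splitlines():
        match = TOP_LEVEL_KEY.match(line)
        if match:
            current = match.group(1)
            sections[current] = []
        if current is not None:
            sections[current].append(line)
    return {key: "\n".join(lines).rstrip() + "\n" for key, lines in sections.items()}
```

Each value could then be written to a file named after its key, for example the `forms` block to data/domain/forms.yml.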

As far as Rasa X goes, that’s a great question. I really don’t know the answer to that. The train function is just a button on the screen. I wonder if there is any way to set up parameters for that command. Maybe in a Dockerfile or something? I haven’t really worked with Rasa X much.

I’d definitely post that as a question here in a new thread. I’m anxious to know the answer too.

ChrisRahme · March 2, 2021, 9:45pm

Thank you so much!

Yeah the Rasa Docs are awesome except that they don’t document this kind of stuff.

This is exactly why I want to split the domain. One file per key. Maybe later also split the responses into multiple files.

I will try some stuff and research myself, and post a question about it if I don’t find anything. I already have a similar question about the Action Server anyway.

If I find a solution, I’ll post it here in this reply.

Edit: I asked a question.

Edit: Nobody replied, so I decided to go back to a single domain file, but I left the folder there just in case (so the folder was on GitHub too). I deployed the chatbot on the server and went to Rasa X’s “Domain” tab, only to see that it had automatically detected all the domain files in the folder as well as the original one. So that’s solved.

sahibpreetsingh12 · May 18, 2021, 4:21pm

Thank you so much @jonathanpwheat for build-domain.py

jonathanpwheat · May 18, 2021, 4:25pm

I’m glad it helped!

sahibpreetsingh12 · May 20, 2021, 6:24am

Hey @jonathanpwheat, can you help me? I have Rasa installed on a server. How can I connect my Rasa chatbot to the website?

jonathanpwheat · May 20, 2021, 5:36pm

If you can, could you copy / paste your message into a new thread and mention me? I don’t want to go off on a tangent from the main topic here. I’m happy to help and have some suggestions for you.
