I’ve been able to split up my large nlu.md and stories.md files using the method explained here - How do you seperate stories into different story files?
Is there a similar way to do that with the domain.yml file?
Have a look at Training Data Importers
Thanks, I read that, and it seems like overkill to set up multiple bots when I really only want to divide the “topic areas” in my domain.yml file into different files, mostly for manageability and extensibility.
It sounds like that takes all the files and assembles them at build / training time, which gave me an idea. Maybe I’ll split them up and just write a tool to concatenate them before I run the train command.
As a response to this need of mine, I’ve written a script that merges any .yml file in data/domain and creates a domain.yml file in the project root.
I’ve posted a gist here in case this helps anyone else out -
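For readers without access to the gist, here is a minimal sketch of what such a merge could look like. It is not the author's script: the function names merge_domains and build_domain are hypothetical, the merge rules (lists are extended without duplicates, mappings are updated, scalars take the last value) are my own assumption, and the file handling assumes PyYAML 5.1 or later is installed.

```python
from pathlib import Path
from typing import Any, Dict, List


def merge_domains(parts: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Merge parsed domain fragments key by key (hypothetical helper)."""
    merged: Dict[str, Any] = {}
    for part in parts:
        for key, value in part.items():
            if key not in merged:
                # Copy containers so the input fragments are never mutated.
                merged[key] = value.copy() if isinstance(value, (list, dict)) else value
            elif isinstance(merged[key], list) and isinstance(value, list):
                # e.g. intents, entities, actions: append items not seen yet.
                for item in value:
                    if item not in merged[key]:
                        merged[key].append(item)
            elif isinstance(merged[key], dict) and isinstance(value, dict):
                # e.g. responses, slots, forms: later files override same-named entries.
                merged[key].update(value)
            else:
                # Scalars such as version: the last file wins.
                merged[key] = value
    return merged


def build_domain(src_dir: str = "data/domain", out_file: str = "domain.yml") -> None:
    """Read every .yml file in src_dir and write one merged domain file."""
    import yaml  # PyYAML (third-party); imported here so merge_domains works without it

    parts = []
    for path in sorted(Path(src_dir).glob("*.yml")):
        with open(path, encoding="utf-8") as fh:
            parts.append(yaml.safe_load(fh) or {})  # an empty file loads as None
    with open(out_file, "w", encoding="utf-8") as fh:
        # allow_unicode=True writes non-ASCII text as-is instead of escaping it;
        # sort_keys=False (PyYAML 5.1+) keeps the keys in merge order.
        yaml.safe_dump(merge_domains(parts), fh, allow_unicode=True, sort_keys=False)
```

Calling build_domain() from the project root before rasa train would then regenerate domain.yml from the fragments.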
This is very interesting and very useful, Jonathan! I am also trying out ways to split my domain file, as it is getting unwieldy and too bulky for my taste. I like your approach. Do you know how I can embed this into my docker workflow?
You know what? Never mind. I will just do it locally. Before I push the code to git, I will run the script.
Ooo good question.
You actually shouldn’t need to, because you use my script only when you need to (re)train, so you can run it before you fire up your container(s).
Once you have your containers up if you “ssh” into them you’d have to stop rasa, run my script, retrain and run rasa again.
If you launch your containers and your workflow then trains and runs rasa server automatically in your container, then off the top of my head, in your Dockerfile you can probably use the COPY command to copy the script to your container’s project directory, then a RUN command to run python build-domain.py, and then a CMD instruction (rather than another RUN, which executes at image build time) to launch rasa server itself.
Let me know if you try this, I’m curious.
Yeah sounds sensible. Thanks…I will definitely try this out!
Hey @jonathanpwheat. Sorry to revive this old topic.
Thanks a lot for your tool. It’s a long-awaited feature in Rasa and as you said, making a custom Data Importer is overkill.
I don’t know why they made splitting NLU, stories, and rules possible, but not for the domain.
But I have a problem. My domain file contains multiple sets of characters (Latin, Arabic, and Armenian). So at first, it wasn’t even able to read the files.
Then I added encoding='utf-8' to the open() functions.
Now it reads and writes the files, but it’s messing up all the Arabic and Armenian letters, the French letters with accents, and the single quotes (').
Do you have any solution?
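One possible explanation, offered as a guess since the thread does not show how the script writes its output: if the merge parses and re-serializes the YAML, the dumper may escape non-ASCII characters and re-quote strings (PyYAML's dump, for example, escapes them unless allow_unicode=True is passed). A way to sidestep that entirely is to concatenate the files as text, so nothing is re-serialized. The sketch below assumes each fragment holds different top-level keys (one key per file), because plain concatenation would otherwise produce duplicate keys. The function name concat_yaml_text is hypothetical.

```python
from pathlib import Path
from typing import Iterable


def concat_yaml_text(paths: Iterable[Path], out_file: Path) -> None:
    """Join YAML fragments as plain text without parsing them.

    Assumption: every fragment uses different top-level keys, so the
    concatenated result is still a single valid YAML mapping.
    """
    chunks = []
    for path in paths:
        # newline="" disables newline translation, so the text is kept as-is.
        with open(path, encoding="utf-8", newline="") as fh:
            text = fh.read()
        # Make sure a fragment without a final newline does not run into the next one.
        chunks.append(text if text.endswith("\n") else text + "\n")
    with open(out_file, "w", encoding="utf-8", newline="") as fh:
        fh.write("".join(chunks))
```

Because the text is never parsed, accented letters, Arabic and Armenian script, and the original quoting style all come out exactly as they went in.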
Hi @ChrisRahme. If you’re using Rasa 2.x you can split your domain files (maybe they listened? LOL).
Just create a data/domain directory and put all your domain-related files there. You can even create directories under it. I would hope the out-of-the-box functionality then deals with your other character sets properly. Give it a whirl.
I did a write up of it here - Organize your Rasa 2.0 training data like a boss - DEV Community
BUT you have to add the --domain flag when you train and validate your data.
Validate your data:
rasa data validate --domain data
Train your model:
rasa train --domain data
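If you would rather not type both commands every time, a small wrapper can run them in order. This is only a sketch: the function names rasa_commands and validate_then_train are hypothetical, the commands are the two shown above, and it assumes the rasa executable is on your PATH.

```python
import subprocess
from typing import List


def rasa_commands(domain_path: str = "data") -> List[List[str]]:
    """Return the validate and train commands for a split domain."""
    return [
        ["rasa", "data", "validate", "--domain", domain_path],
        ["rasa", "train", "--domain", domain_path],
    ]


def validate_then_train(domain_path: str = "data") -> None:
    """Run validation first; training only starts if validation succeeds."""
    for cmd in rasa_commands(domain_path):
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(cmd, check=True)
```

Calling validate_then_train() from the project root runs both steps against the data directory.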
Wow how did I not know that!
Anyway, thank you for your reply and the great article!
I placed the domain folder directly in the root of the chatbot, and I train with rasa train --domain . (note the trailing dot). That works on a local installation, though. Any idea how to do the same on a server (using quick-install) via Rasa X?
I understand I can upload the model file instead of training in the interface, but it would be nice if there was a way to do that.
They didn’t really document that feature at all. Since all the files are yml, you can literally (can’t remember if I mentioned this or not) split out each key into its own file. So you can have a forms.yml file that only has the forms: key and its data in it. It’s pretty cool.
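To illustrate the one-key-per-file idea, here is a rough sketch that takes an already-parsed domain (a plain dict) and works out which content would go into which file. The function name split_domain_by_key and the shared_keys parameter are hypothetical. By default each file gets only its own key, as described above. Copying a key such as version into every file is an option I added, not something this thread confirms Rasa needs.

```python
from typing import Any, Dict, Tuple


def split_domain_by_key(
    domain: Dict[str, Any],
    shared_keys: Tuple[str, ...] = (),
) -> Dict[str, Dict[str, Any]]:
    """Map '<key>.yml' to the content that file would hold.

    Keys listed in shared_keys are copied into every file instead of
    getting a file of their own (an optional, assumed convenience).
    """
    shared = {k: domain[k] for k in shared_keys if k in domain}
    files: Dict[str, Dict[str, Any]] = {}
    for key, value in domain.items():
        if key in shared:
            continue
        files[f"{key}.yml"] = {**shared, key: value}
    return files


# Example: forms.yml would contain only the forms: key and its data.
example = split_domain_by_key({"intents": ["greet"], "forms": {"order_form": {}}})
```

Each returned mapping can then be written into data/domain with whatever YAML library you already use.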
As far as Rasa X goes, that’s a great question. I really don’t know the answer to that. The train function is just a button on the screen. I wonder if there is any way to set up parameters for that command. Maybe in a Dockerfile or something? I haven’t really worked with Rasa X much.
I’d definitely post that as a question here in a new thread. I’m anxious to know the answer too.
Thank you so much!
Yeah the Rasa Docs are awesome except that they don’t document this kind of stuff.
This is exactly why I want to split the domain. One file per key. Maybe later also split the responses into multiple files.
I will try some stuff and research myself, and post a question about it if I don’t find anything. I already have a similar question about the Action Server anyway.
If I find a solution I’ll post it here in this reply.
Edit: I asked a question.
Edit: Nobody replied, so I decided to go back to a single domain file, but left the folder there just in case (so the folder was on GitHub too). I deployed the chatbot on the server and went to Rasa X’s “Domain” tab, only to see that it had automatically detected all the domain files in the folder as well as the original one. So that’s solved.
Thank you so much @jonathanpwheat for build-domain.py
I’m glad it helped!
Hey @jonathanpwheat, can you help me? I have Rasa installed on a server. How can I connect my Rasa chatbot to my website?
If you can, could you copy / paste your message into a new thread and mention me? I don’t want to tangent off the main topic here. I’m happy to help and have some suggestions for you.