Donate your NLU training data!

Hey Rasa @community,

We have created a new repository that lives in RasaHQ/NLU-training-data with the goal of providing basic training data for developing chatbots.

We are currently testing this initiative, and we will need your help to build this open source dataset - which means it’s now open for contributions!

How do I donate my training data?
In the GitHub README, you will find a guide on how to donate your data. The repository is divided into different categories of intent, and there is also an FAQ section to help you understand where to put your training data.

What about training data that’s not in English?
Right now, we are unable to evaluate the quality of contributions in every language, so during the initial phase we can only accept English training data into the repository.
However, we understand that the Rasa community is a global one, and in the long-term we would like to find a solution for this in collaboration with the community.

Your feedback
We created this based on suggestions from the Rasa community, and we’d love to improve it in a direction that benefits you and other developers. It would therefore be great to have your thoughts on the following:

  • Do you think that the organisation of the repository works well and is intuitive?
  • Do you feel this would be a valuable resource for the community?
3 Likes

Hello @Emma, it’s just what I’m looking for.

I’ll contribute some data, but how about other languages? Maybe change the file names or the folder structure.

Thanks, great initiative!

-Best

1 Like

As mentioned in the README, only English is accepted at the moment.

Hey @davi,

That’s awesome to hear! :star_struck:

Exactly, as @markusgl kindly mentioned, first we would like to test it out in English so that we can evaluate the quality. If we are able to open this up to localised training data in future, we would adjust the repo structure retroactively to specify the language and make this much clearer. :slight_smile:

Despite all of this, it’s great to know that you’re interested in donating localised training data, and letting us know really helps us understand what the community is looking for.

You are right, sorry. Anyway, I just forked the repo; when it’s ready for other languages, I’ll make a PR.

Thanks!

1 Like

Just added a pull request with a BUNCH of new intents (54) for smalltalk and a handful of new intents (5) for mood, plus some additional data for some of the out-of-the-box intents in both of those categories.

Looking forward to seeing what others will contribute!

1 Like

Hey @Emma

I have added 86 different intents for small talk. Please review them, and if you find them useful, do let me know.

2 Likes

Wow @abhishakskilrock - those are fantastic, well done

1 Like

Hey @jonathanpwheat

Thanks for the compliment! By the way, you also did a fantastic job by providing 54 different intents.

1 Like

Thanks, I see some overlap of intents, but you have all the context entities set up, whereas I just have basic data.

I’m glad this is an open source shared repo, because I’ll be implementing your nlu data into the small talk portion of my bot :grinning:

1 Like

Sure @jonathanpwheat, after all, this is the real purpose of open source: one person can also benefit from others’ contributions.

3 Likes

@jonathanpwheat & @abhishakskilrock,

Wow guys! Thank you so much for submitting all of this training data! :heart_eyes: We should be able to review your PRs before the end of this week.

It’s also great to see such a wholesome discussion going on here; we are very fortunate to have this supportive community. :blush:

Hi, I have added my employment bot NLU data. This is my first contribution, so please guide me if I have made any mistakes.

Thanks