Grab the NLU training dataset!

Note: Comments in this thread may be outdated since the starter packs were deprecated with the merged repo. See this post for more information.

You can now initialise a project with training data and all files necessary to train and talk to an AI assistant out-of-the-box with Rasa - just use the command rasa init.

Grab the NLU training dataset and starter-packs!

To make it easier to get started with building your custom assistants, we would like to share a really cool NLU training dataset. You can use it to train your assistant to handle small talk and make conversations with it a lot more fun! Grab it here:

smalltalk.md (27.1 KB)

Use this dataset with your custom assistant or augment the initial project created with rasa init.
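One way to augment an existing project is to merge the small talk examples into the NLU file you already have instead of pasting them at the end, so that intents present in both files do not end up as duplicate sections. Below is a minimal sketch, assuming both files use the Markdown NLU layout (`## intent:<name>` headers followed by `- example` lines). The function names and the `smalltalk_agent_age` intent are made up for illustration and are not part of Rasa or necessarily of smalltalk.md.

```python
def parse_nlu_markdown(text):
    """Map each '## <header>' line to the list of '- example' lines below it."""
    sections = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("## "):
            current = stripped[3:].strip()
            sections.setdefault(current, [])
        elif stripped.startswith("- ") and current is not None:
            example = stripped[2:].strip()
            if example not in sections[current]:
                sections[current].append(example)
    return sections


def merge_nlu_markdown(base_text, extra_text):
    """Merge two NLU Markdown documents, de-duplicating examples per header."""
    merged = parse_nlu_markdown(base_text)
    for header, examples in parse_nlu_markdown(extra_text).items():
        target = merged.setdefault(header, [])
        for example in examples:
            if example not in target:
                target.append(example)
    return merged


def render_nlu_markdown(sections):
    """Render the merged sections back into Markdown text."""
    blocks = []
    for header, examples in sections.items():
        lines = ["## " + header] + ["- " + example for example in examples]
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    base = "## intent:greet\n- hi\n- hello\n"
    # Hypothetical excerpt; the real smalltalk.md may use other intent names.
    extra = "## intent:greet\n- hello\n\n## intent:smalltalk_agent_age\n- what is your age?\n"
    print(render_nlu_markdown(merge_nlu_markdown(base, extra)))
```

This sketch only keeps headers and example lines, so comments or any other lines in the files would be dropped. Read both files into strings, merge, and write the result back to the project's NLU file before retraining.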

Let us know how you are getting on with it and what you have built :slight_smile:

Hi, the link to the starter-packs doesn’t seem to work.

Thanks Andre, we are still configuring some things. It will be up and running in a few minutes :wink:

It is available now :slight_smile:

Does this come with the shown UI too?

Not at the moment :slight_smile: Following the instructions below, you can build an assistant which runs locally in your console. After that, you can connect it to any messaging platform you want.

Thank you for smalltalk.md, really helpful. I have another question.

Suppose I have a dataset of short dialogues/conversations of, say, 2, 3, or 4 lines each. For example:

  • H: What is a computer?
  • B: Computer is a…

  • H: Let’s talk about AI
  • B: I don’t feel like talking right now
  • H: What’s the matter?

  • H: What do you think about the climate change?
  • B: That is a topical question. Need to do some research.
  • H: Let’s meet tomorrow, and discuss. Deal?
  • B: Of course, with pleasure.

Assume that I have 3000 such short conversations. What is a possible approach to training the bot with these conversations? Do we need to convert each conversation into a single story? That would produce a huge stories file with 3000 stories. Is this how we should go about it?

@Juste I’m not able to download smalltalk.md. Looks like the link is broken. Can you please provide the correct link?

Hi. The link for smalltalk doesn’t work again.

@leslyarun @H-Theking The link has been updated. Thanks a lot for pointing this out! :slight_smile:

Hi, I am a beginner and am facing problems when installing Rasa Core. Sometimes I get a Permission Error and sometimes ModuleNotFoundError: No module named '_pywrap_tensorflow_internal'

Hey @lee-van-oetz, thanks for pointing this out. The file was reuploaded and it should not disappear from now on.

I am also looking for an answer to this question. Did you get one already?

@rasalearner I haven’t got any answer, positive or negative, on this forum. But I think Rasa is not able to handle such cases. What I did was design my NN from scratch using Keras and other libraries, with no Rasa at all.

Also, I recommend not relying too much on Rasa. It is not mature, and it cannot address special cases which do not fit concepts such as intents, slots, etc. It also has poor documentation, and many questions on the forum are left unanswered.

Hey @fade2black. Can you elaborate on what you have in mind with special cases? In general, Rasa is mostly suitable for goal-oriented chatbots (though new policies that were recently added to the Rasa Stack allow handling chit-chat as well). By a goal-oriented bot I mean an assistant that is built for a purpose (making a booking, providing info about a specific domain, etc.). To answer your earlier question: if you have a large amount of training data available, you have a few options:

  • Convert at least some of those conversations into stories and use them to build an assistant which you can give to your users. The conversations they have with it generate new data which you can reuse to improve the bot.
  • Instead of converting conversations into stories, use interactive learning, which lets you generate training data in the Rasa stories format by chatting to your bot.
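
For the first option, the conversion itself can be scripted once every turn has a label. Below is a minimal sketch, assuming each human turn has already been tagged with an intent and each bot turn with a response action (by hand or with a trained NLU model), and assuming the Markdown story layout with `* intent` and `- action` lines. The function name and all labels in the example are hypothetical.

```python
def conversation_to_story(name, turns):
    """Render one labelled conversation as a Markdown story.

    `turns` is a list of (speaker, label) pairs, where speaker is "H" for the
    human (label = intent name) or "B" for the bot (label = action name).
    """
    lines = ["## " + name]
    for speaker, label in turns:
        if speaker == "H":
            lines.append("* " + label)
        elif speaker == "B":
            lines.append("  - " + label)
        else:
            raise ValueError("unknown speaker: " + repr(speaker))
    return "\n".join(lines)


def conversations_to_stories(conversations):
    """Join several labelled conversations into the text of one stories file."""
    stories = [
        conversation_to_story("conversation " + str(index), turns)
        for index, turns in enumerate(conversations, start=1)
    ]
    return "\n\n".join(stories) + "\n"


if __name__ == "__main__":
    # Hypothetical labels for the "Let's talk about AI" exchange quoted earlier.
    labelled = [
        [("H", "suggest_topic"), ("B", "utter_decline_chat"), ("H", "ask_whats_wrong")],
    ]
    print(conversations_to_stories(labelled))
```

The labelling step is where the real work is. Many of the 3000 conversations will likely share the same intent and action sequence, so the resulting stories file can often be de-duplicated to far fewer stories.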

Tried with both NLU and Stack. With Stack I get the following response, which is not expected. This is after appending the smalltalk.md file data.

hey there
2018-11-24 08:32:28 WARNING py.warnings - C:\Users\NITIN\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\preprocessing\label.py:151: DeprecationWarning: The truth value of an empty array is ambiguous. Returning False, but in future this will result in an error. Use array.size > 0 to check that an array is not empty. if diff:

Hey there! Tell me your name.
127.0.0.1 - - [2018-11-24 08:32:28] "POST /webhooks/rest/webhook?stream=true&token= HTTP/1.1" 200 197 0.046868

My name is Nitin
2018-11-24 08:32:33 WARNING py.warnings - C:\Users\NITIN\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\preprocessing\label.py:151: DeprecationWarning: The truth value of an empty array is ambiguous. Returning False, but in future this will result in an error. Use array.size > 0 to check that an array is not empty. if diff:

Nice to you meet you None. How can I help?
127.0.0.1 - - [2018-11-24 08:32:33] "POST /webhooks/rest/webhook?stream=true&token= HTTP/1.1" 200 210 0.048142

what is your age?
2018-11-24 08:32:44 WARNING py.warnings - C:\Users\NITIN\AppData\Local\Programs\Python\Python36\lib\site-packages\sklearn\preprocessing\label.py:151: DeprecationWarning: The truth value of an empty array is ambiguous. Returning False, but in future this will result in an error. Use array.size > 0 to check that an array is not empty. if diff:

Chuck Norris knows Victoria’s secret.
127.0.0.1 - - [2018-11-24 08:32:45] "POST /webhooks/rest/webhook?stream=true&token= HTTP/1.1" 200 205 1.490340

Same here - did anybody find out why? I was simply appending the nlu_data.md file and then re-training NLU and Core (also restarting the server).

Also, for @rasalearner: after you append the NLU data, you should add new training stories to the stories.md file of the project so that those new intents are included in a dialogue. Once you do that, the bot will start using the new intents in both the NLU and Core parts of the bot.
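
A quick way to see which of the appended intents still lack a story is to compare the two files. Below is a minimal sketch, assuming the NLU file declares intents with `## intent:<name>` headers and the stories file references them with `* <name>` lines. The function names and the sample intents are hypothetical.

```python
import re

# '## intent:greet' in the NLU Markdown file.
INTENT_HEADER = re.compile(r"^##[ \t]*intent:[ \t]*([\w./-]+)", re.MULTILINE)
# '* greet' or '* inform{"name": "Nitin"}' in the stories file.
STORY_INTENT = re.compile(r"^[ \t]*\*[ \t]*([\w./-]+)", re.MULTILINE)


def intents_in_nlu(nlu_text):
    """Return the set of intent names declared in the NLU data."""
    return set(INTENT_HEADER.findall(nlu_text))


def intents_in_stories(stories_text):
    """Return the set of intent names that appear as user steps in stories."""
    return set(STORY_INTENT.findall(stories_text))


def intents_missing_from_stories(nlu_text, stories_text):
    """List intents that have NLU examples but are not used in any story."""
    return sorted(intents_in_nlu(nlu_text) - intents_in_stories(stories_text))


if __name__ == "__main__":
    nlu = (
        "## intent:greet\n- hi\n\n"
        "## intent:smalltalk_agent_age\n- what is your age?\n"
    )
    stories = "## greet path\n* greet\n  - utter_greet\n"
    print(intents_missing_from_stories(nlu, stories))
```

Every intent this reports needs at least one story (and a matching response in the domain) before Core can react to it. The pattern only captures the first intent on a story line, so lines that combine several intents are only partly covered by this check.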

Thx, but that does not explain why the current stories (e.g. name) do not work anymore („None“ instead of name)?
