I am having a hard time understanding training data in Rasa NLU. Say I want training data in which a user tells the bot which animal they would like to buy. For clarity, I'll use the markdown format.
Suppose the user is responding to the question:
"What kind of animal would you like to buy?"
There are only so many ways of saying you want to buy something, so take the example below:
##intent:inform
- [cat](animal)
- buy [cat](animal)
- I would like to buy a [cat](animal)
Would I need to repeat this for every type of animal I intend to handle, like below?
##intent:inform
- [cat](animal)
- [dog](animal)
- [parrot](animal)
- buy [cat](animal)
- buy [dog](animal)
- buy [parrot](animal)
- I would like to buy a [cat](animal)
- I would like to buy a [dog](animal)
- I would like to buy a [parrot](animal)
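If the repetition does turn out to be necessary, one option might be to generate the combinations rather than type them by hand. Below is a minimal Python sketch that crosses the three phrasings with the three animals from the example above; `build_examples` and `render_intent` are hypothetical helper names of my own, not part of Rasa.

```python
from itertools import product

# Phrasings and entity values taken from the example above.
TEMPLATES = ["{}", "buy {}", "I would like to buy a {}"]
ANIMALS = ["cat", "dog", "parrot"]


def build_examples(templates, values, entity="animal"):
    """Return one annotated training line per (template, value) pair."""
    lines = []
    for template, value in product(templates, values):
        # Wrap the value in the [value](entity) annotation syntax.
        annotated = "[{}]({})".format(value, entity)
        lines.append("- " + template.format(annotated))
    return lines


def render_intent(intent, lines):
    """Prefix the example lines with the intent header used in this post."""
    return "\n".join(["##intent:" + intent] + lines)


if __name__ == "__main__":
    print(render_intent("inform", build_examples(TEMPLATES, ANIMALS)))
```

This only saves typing; it says nothing about whether listing every combination is actually needed for the model to generalise, which is the question here.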
Also, I noticed that in Rasa's restaurant bot, they sometimes repeat the same example over and over again, sometimes up to seven times, like below:
##intent:inform
- [cat](animal)
- [cat](animal)
- [cat](animal)
- [cat](animal)
- [cat](animal)
- buy [cat](animal)
- I would like to buy a [cat](animal)
Why is that necessary? What effect does this have on the model's understanding? How would more occurrences of the same single word in the same position indicate that it is an appropriate response, especially if you had something like the below, where a different value of the same entity is repeated the same number of times?
##intent:inform
- [cat](animal)
- [cat](animal)
- [cat](animal)
- [cat](animal)
- [cat](animal)
- buy [cat](animal)
- I would like to buy a [cat](animal)
- [dog](animal)
- [dog](animal)
- [dog](animal)
- [dog](animal)
- [dog](animal)
- buy [dog](animal)
- I would like to buy a [dog](animal)
I am curious about the above because I see it in the old format in franken_data.json. To get a better understanding, do a quick search for "text": "cheap". You should get 14 identical results.
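The same check could be done programmatically instead of by text search. The sketch below counts how often each example text appears. It assumes the old JSON format nests examples under `rasa_nlu_data` and `common_examples`, each with a `text` field; the function names are hypothetical, and `SAMPLE` is invented stand-in data, not the contents of franken_data.json.

```python
import json
from collections import Counter


def count_example_texts(data):
    """Count how many times each example text occurs in the parsed data."""
    # Assumed layout: {"rasa_nlu_data": {"common_examples": [{"text": ...}, ...]}}
    examples = data.get("rasa_nlu_data", {}).get("common_examples", [])
    return Counter(example["text"] for example in examples)


def count_texts_in_file(path):
    """Load a JSON training file from disk and count its example texts."""
    with open(path, encoding="utf-8") as handle:
        return count_example_texts(json.load(handle))


# Invented stand-in data for illustration only.
SAMPLE = {
    "rasa_nlu_data": {
        "common_examples": [
            {"text": "cheap", "intent": "inform", "entities": []},
            {"text": "cheap", "intent": "inform", "entities": []},
            {"text": "cheap", "intent": "inform", "entities": []},
            {"text": "buy cat", "intent": "inform", "entities": []},
        ]
    }
}

if __name__ == "__main__":
    # Print every text that occurs more than once, most frequent first.
    for text, count in count_example_texts(SAMPLE).most_common():
        if count > 1:
            print(text, count)
```

Calling `count_texts_in_file("franken_data.json")` on the real file should then show which texts are duplicated and how often, provided the layout assumption holds.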
Thank you, any advice is appreciated.