Is it necessary to have an equal number of examples for every intent

vinayver198 · November 2, 2018, 11:31am

souvikg10 · November 2, 2018, 11:52am

It is a question of distribution. An intent classifier is a probabilistic model, so how often each intent appears in the training data affects its predictions.

If you have an imbalanced dataset, predictions can be skewed toward the intent with many examples and away from the one with few. Ideally your dataset should be balanced, but that doesn’t mean exactly 20 examples for every intent. The counts should just stay reasonably close to a common average.
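
A quick way to check this on your own data is to count the examples per intent and compare each count with the mean. Below is a minimal sketch in Python, assuming the training data is already loaded as (text, intent) pairs. The function names and the 0.5x / 2x thresholds are made up for illustration; they are not part of Rasa and not a recommended cutoff.

```python
from collections import Counter
from statistics import mean


def intent_balance(examples):
    """Map each intent to (count, count / mean count).

    `examples` is an iterable of (text, intent) pairs (assumed format).
    """
    counts = Counter(intent for _, intent in examples)
    avg = mean(counts.values())
    return {intent: (n, n / avg) for intent, n in counts.items()}


def flag_imbalanced(examples, low=0.5, high=2.0):
    """Return intents whose count is below low * mean or above high * mean.

    The default thresholds are arbitrary illustration values.
    """
    balance = intent_balance(examples)
    return sorted(
        intent
        for intent, (_, ratio) in balance.items()
        if ratio < low or ratio > high
    )


if __name__ == "__main__":
    data = [("hi", "greet")] * 10 + [("bye", "goodbye")] * 100
    for intent, (count, ratio) in intent_balance(data).items():
        print(f"{intent}: {count} examples, {ratio:.2f}x the mean")
    print("flagged:", flag_imbalanced(data))
```

An intent that gets flagged is not automatically a problem. It is only a prompt to look at whether that intent is too broad or too thin.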

vinayver198 · November 2, 2018, 2:07pm

So if that is the case, suppose I have one question that can be asked in 10 ways and another that can be asked in 100 ways. What should my approach be?

souvikg10 · November 2, 2018, 2:32pm

There usually aren’t that many ways of asking a question, or of replying to a particular question. The NLU problem isn’t just mathematical or technical. A conversation is also psychological and is driven by a purpose, unless you want to build a machine that behaves like a human being.

You should build NLU with a task-driven approach. What is the problem you are trying to solve?

If you are getting such a big difference between what users might ask, then you are not approaching the problem correctly. You shouldn’t have such a big difference. I could imagine one intent having 20 variations and another 40, which wouldn’t be a big problem, but 10 and 100 seems like an oddly large gap.

znat · November 5, 2018, 10:59pm

In my experience, the number of examples matters less than the semantic width (for lack of a better term) they cover. You can have 200 examples generated with Chatito and get poor results, while 50 carefully crafted or gathered examples that span the semantic space will perform much better.
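
One rough way to put a number on this is to measure how different the examples of one intent are from each other. The sketch below is only a crude lexical proxy: it uses the average Jaccard distance between word sets, which is my own stand-in and not a true measure of semantic coverage. The function names are hypothetical, and proper embeddings would capture meaning better than word overlap.

```python
from itertools import combinations


def tokens(text):
    """Lowercase whitespace tokenization; a deliberately simple assumption."""
    return set(text.lower().split())


def mean_pairwise_distance(examples):
    """Average Jaccard distance over all pairs of examples.

    0.0 means every pair uses the same words, 1.0 means no pair shares a word.
    Returns 0.0 when there are fewer than two examples.
    """
    pairs = list(combinations(examples, 2))
    if not pairs:
        return 0.0
    total = 0.0
    for a, b in pairs:
        ta, tb = tokens(a), tokens(b)
        union = ta | tb
        if union:
            total += 1.0 - len(ta & tb) / len(union)
    return total / len(pairs)


if __name__ == "__main__":
    templated = [
        "book a flight to paris",
        "book a flight to rome",
        "book a flight to oslo",
    ]
    varied = [
        "book a flight to paris",
        "i need to fly out tomorrow",
        "get me plane tickets",
    ]
    print("templated:", round(mean_pairwise_distance(templated), 2))
    print("varied:", round(mean_pairwise_distance(varied), 2))
```

Template-generated examples tend to score low here because they only swap one slot, while hand-written or collected examples usually score higher.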
