Hi @HermanH. If you are early into your project, then I would generally recommend adding all messages to your NLU training data, even the ones that are predicted correctly. However, before you merge them into your main branch and deploy the new model, I would recommend running your test stories and an NLU evaluation to make sure your new annotations have not introduced model regressions (for example, as you mentioned, annotating too many messages of one intent could cause a class imbalance).
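To make the class-imbalance concern concrete, here is a minimal sketch of the kind of sanity check you could run over your annotations before merging. It assumes a simple in-memory list of `(text, intent)` pairs rather than Rasa's own training data format, and the `ratio` threshold is an arbitrary illustrative choice:

```python
from collections import Counter

def intent_distribution(annotations):
    """Count training examples per intent.

    `annotations` is a list of (text, intent) pairs -- a hypothetical
    in-memory representation, not Rasa's actual NLU file format.
    """
    return Counter(intent for _, intent in annotations)

def imbalance_warnings(counts, ratio=3.0):
    """Flag intents with fewer than 1/ratio the examples of the largest intent."""
    largest = max(counts.values())
    return sorted(intent for intent, n in counts.items() if n * ratio < largest)

# Toy data: after bulk-annotating, "greet" has far more examples than "goodbye".
data = [
    ("hi there", "greet"),
    ("hello", "greet"),
    ("hey", "greet"),
    ("good morning", "greet"),
    ("goodbye", "goodbye"),
]
print(imbalance_warnings(intent_distribution(data)))  # → ['goodbye']
```

A check like this will not catch every regression (that is what the test stories and NLU evaluation are for), but it is a cheap first look at whether your new annotations have skewed the intent distribution.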
Once you start to receive more messages than you are able to annotate, it makes sense to develop a strategy for annotating the data most likely to need your attention. For example, in Rasa X, you could filter for messages with low NLU confidence, or filter on a specific predicted intent (after cross-validation output points you at one worth reviewing), and annotate those messages first.
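As a sketch of what "annotate low-confidence messages first" means in practice: the snippet below assumes a hypothetical list of parse results, each with a message text, predicted intent, and confidence score (the field names mimic the shape of NLU parse output but are illustrative, as is the `0.7` threshold):

```python
def needs_review(predictions, threshold=0.7):
    """Return message texts whose top-intent confidence is below `threshold`.

    `predictions` is a list of dicts with hypothetical keys
    "text", "intent", and "confidence" -- illustrative only.
    """
    return [p["text"] for p in predictions if p["confidence"] < threshold]

inbox = [
    {"text": "i want to cancel my order", "intent": "cancel", "confidence": 0.92},
    {"text": "hmm not really sure", "intent": "affirm", "confidence": 0.41},
]
print(needs_review(inbox))  # → ['hmm not really sure']
```

The idea is simply triage: messages the model was unsure about are the ones where your annotation effort is most likely to improve the next model.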
- Is the intent sentence already in the NLU-data?
The NLU inbox only contains messages you have received that are not in your training data, so you do not need to worry about that.
- If the sentence is not already in the training data, why should I add it?
You want to grow a training data set that is representative of how your users actually speak with your assistant. The best way to do that is to annotate the messages you receive from them. This will lead to better model performance, a better conversational interface, and ultimately a better user experience.