But while testing the bot (running rasa shell nlu), I find this threshold is too high for some intents. Some of my intents score around 0.6xxx, and a few fall slightly below 0.6.
My questions are:
Is it advisable to reduce the threshold value for my FallbackClassifier?
If yes, what’s the trade-off of reducing the threshold value?
If not, how can I get my intent prediction scores to 0.7 and above?
This is still an open research question. It depends a lot on your data, which makes it hard to give specific advice for your situation. In my experience, it takes a lot of trial and error to get right. That said, there is some general advice that you might find helpful.
We recently introduced a change to our DIET algorithm which should make the “confidence” output more reliable. There’s a full explanation of the technical details on our algorithm whiteboard, but in general we now recommend setting the model_confidence parameter to linear_norm and constrain_similarities to True.
The threshold parameter determines when a fallback is triggered, but only to an extent. We shouldn’t forget that the data you train on influences the pipeline too. There’s something to be said for addressing this issue by adding more training data. If the pipeline has more relevant data to learn from, odds are that it will also be able to quantify its confidence better.
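To make the trade-off a bit more concrete, here is a minimal sketch of the kind of check a fallback threshold performs. It is a simplified illustration and not Rasa's actual code: the function name `should_fall_back`, the intent names and the scores are all made up for the example, and the `ambiguity_threshold` argument is modelled on the FallbackClassifier option of the same name.

```python
from typing import Dict, List


def should_fall_back(
    intent_ranking: List[Dict],
    threshold: float = 0.7,
    ambiguity_threshold: float = 0.1,
) -> bool:
    """Return True if this (hypothetical) ranking would trigger a fallback.

    Simplified sketch: fall back when the top score is below `threshold`,
    or when the top two scores are closer than `ambiguity_threshold`.
    """
    if not intent_ranking:
        return True

    ranked = sorted(intent_ranking, key=lambda i: i["confidence"], reverse=True)
    top_confidence = ranked[0]["confidence"]

    # Case 1: the best guess is not confident enough.
    if top_confidence < threshold:
        return True

    # Case 2: the two best guesses are too close to tell apart.
    if len(ranked) > 1:
        if top_confidence - ranked[1]["confidence"] < ambiguity_threshold:
            return True

    return False


# Hypothetical scores, similar to the "0.6xxx" case from the question.
ranking = [
    {"name": "check_balance", "confidence": 0.62},
    {"name": "transfer_money", "confidence": 0.21},
]

print(should_fall_back(ranking, threshold=0.7))  # fallback is triggered
print(should_fall_back(ranking, threshold=0.6))  # prediction is accepted
```

Lowering the threshold lets a prediction like the 0.62 one through, but it equally lets through wrong predictions that happen to score 0.62, which is why it is worth checking the change against a test set rather than a handful of messages.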
Another way of dealing with “early triggering of fallbacks” is to not send the user a general message (like "could you rephrase?") but to trigger a custom action instead. The custom action could load buttons that represent the top 3 intents, which the user can press. That way, it becomes more of a minor inconvenience for the user. There’s an example of such an action here.
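As a rough idea of what the button-building part of such an action might look like, here is a small sketch in plain Python. The helper `build_intent_buttons`, the `INTENT_TITLES` mapping and the intent names are hypothetical and only there for illustration. It assumes the ranking is a list of dicts with `name` and `confidence` keys, and that the `nlu_fallback` intent should be skipped.

```python
from typing import Dict, List

# Name of the intent that is predicted when the fallback triggers.
FALLBACK_INTENT = "nlu_fallback"

# Hypothetical mapping from intent names to user-friendly button titles.
INTENT_TITLES = {
    "check_balance": "Check my balance",
    "transfer_money": "Transfer money",
    "block_card": "Block my card",
}


def build_intent_buttons(
    intent_ranking: List[Dict],
    intent_titles: Dict[str, str],
    top_n: int = 3,
) -> List[Dict[str, str]]:
    """Turn the highest-ranked intents into button dicts (title + payload)."""
    # Drop the fallback intent itself, it is not a useful suggestion.
    candidates = [i for i in intent_ranking if i["name"] != FALLBACK_INTENT]
    candidates.sort(key=lambda i: i["confidence"], reverse=True)

    buttons = []
    for intent in candidates[:top_n]:
        name = intent["name"]
        buttons.append(
            {
                # Fall back to the raw intent name if no title is configured.
                "title": intent_titles.get(name, name),
                # "/intent_name" payloads are interpreted as that intent.
                "payload": f"/{name}",
            }
        )
    return buttons


# Hypothetical ranking, as it might look after a fallback was triggered.
ranking = [
    {"name": "nlu_fallback", "confidence": 0.7},
    {"name": "check_balance", "confidence": 0.62},
    {"name": "transfer_money", "confidence": 0.21},
    {"name": "block_card", "confidence": 0.09},
    {"name": "greet", "confidence": 0.03},
]

for button in build_intent_buttons(ranking, INTENT_TITLES):
    print(button)
```

Inside a custom action you would typically read the ranking from `tracker.latest_message.get("intent_ranking", [])` and pass the result along with something like `dispatcher.utter_message(text="Did you mean one of these?", buttons=...)`. Treat this as a starting point, since the details depend on your Rasa version and channel.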