I tried it, and so far it performs worse than the sklearn one. I think that’s because the language we use isn’t very specific to the use case: we’re just trying to make our chatbot talk about a lot of topics, which means lots of intents, so maybe I just need an architecture that can handle more intents.
But that’s actually the gist of my question. For the sake of a complete example, let’s say I want to build an FAQ bot that can answer 20 questions. Let’s assume for simplicity that it’s a plain question-answer format, i.e. no form actions or multiple follow-up intents are needed to answer a question. So all I need is a few basic intents (affirm, deny, out_of_scope, greeting, goodbye) plus one intent per question, and the bot has to classify between those intents. But there aren’t that many ways to say “yes” (maybe 100-300 if you add punctuation variants and a bit of creativity), whereas there are definitely thousands, even millions, of ways to ask a question, especially if you start using data generation tools and expect entities to show up in the question.
In this case, keeping a roughly equal number of examples per intent doesn’t sound right to me: even if I use every example I have for the basic intents, the question intents get capped somewhere below 300 examples each, although I have much more data for them (and more variability, so more examples would be better). I was thinking that some kind of hierarchical classification would be useful, i.e. a broadly defined “ask_question” intent that exists only in the back end and triggers a second classifier, which then identifies the actual question. But I also see several problems with that approach, and I don’t know if they can all be solved. Before diving into code, I was hoping that someone with a bit of experience down that road could share some thoughts! It would be really nice to discuss.
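To make the hierarchical idea concrete, here is a minimal sketch of what I have in mind. It assumes plain scikit-learn (TF-IDF plus logistic regression) instead of an actual NLU pipeline, and the class name, the two FAQ intents and the training sentences are all made up for illustration. The first stage sees every question relabelled as “ask_question”; the second stage is trained on the questions only and is consulted only when the first stage predicts “ask_question”.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

ASK_QUESTION = "ask_question"  # back-end-only intent, never returned to the bot


def make_classifier():
    # Character n-grams plus logistic regression; class_weight="balanced"
    # reweights classes so small intents are not drowned out by large ones.
    return make_pipeline(
        TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
        LogisticRegression(class_weight="balanced", max_iter=1000),
    )


class HierarchicalIntentClassifier:
    """Hypothetical two-stage classifier: basic intents first, then questions."""

    def __init__(self, question_intents):
        self.question_intents = set(question_intents)
        self.top = make_classifier()        # basic intents + ask_question
        self.questions = make_classifier()  # one class per FAQ question

    def fit(self, texts, intents):
        # Stage 1: collapse every question intent into ask_question.
        top_labels = [
            ASK_QUESTION if intent in self.question_intents else intent
            for intent in intents
        ]
        self.top.fit(texts, top_labels)

        # Stage 2: train on the question examples only.
        pairs = [(t, i) for t, i in zip(texts, intents) if i in self.question_intents]
        self.questions.fit([t for t, _ in pairs], [i for _, i in pairs])
        return self

    def predict(self, text):
        top_intent = str(self.top.predict([text])[0])
        if top_intent != ASK_QUESTION:
            return top_intent
        # Only messages routed to ask_question reach the second classifier.
        return str(self.questions.predict([text])[0])


# Toy training data, invented for this example.
examples = [
    ("yes", "affirm"), ("yeah sure", "affirm"), ("yep", "affirm"),
    ("no", "deny"), ("nope", "deny"), ("not really", "deny"),
    ("hello", "greeting"), ("hi there", "greeting"), ("good morning", "greeting"),
    ("when are you open?", "faq_opening_hours"),
    ("what are your opening hours?", "faq_opening_hours"),
    ("are you open on sunday?", "faq_opening_hours"),
    ("how much does it cost?", "faq_pricing"),
    ("what is the price?", "faq_pricing"),
    ("is there a free plan?", "faq_pricing"),
]
texts, intents = zip(*examples)

clf = HierarchicalIntentClassifier(question_intents={"faq_opening_hours", "faq_pricing"})
clf.fit(list(texts), list(intents))

prediction = clf.predict("what time do you open?")
print(prediction)
```

One problem is already visible here: a mistake in the first stage cannot be corrected by the second, so the two stages would have to be evaluated together rather than separately.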
P.S.: For German, there’s only one language model, the small one (the one you linked, de_core_news_sm). I’m happy to see that there’s a medium-sized language model for French and Spanish, though. That wasn’t always the case, so it must be recent!