Chatbot incorrectly classifying easy questions

Hey,

I’m working on a bot for internal communication within a company. I have about 20 intents to distinguish, and the bot works quite well for most of them (F-score of 0.900). For each intent I have prepared upwards of 40 example phrases, and more for the ones where I expected difficulty. Some intents are quite similar, and on these the bot makes some mistakes. The problem is that these mistakes seem trivial: the misclassified phrases often contain what amounts to a keyword for the desired intent. For example, a phrase contains word X, which occurs verbatim in the training set, and only in the examples for the desired intent, and yet it gets classified incorrectly. As I’ve said, the bot works quite well on phrases I would consider harder, so it’s hard for me to make sense of this. Is there any recognizable factor that could be behind it?

Hey @ryszardtuora. It’s quite difficult to answer without seeing some of the examples the model is trained on. Would it be possible to share a snippet of your NLU training data?

Ok, here is one: I want to recognize what type of leave the user wants to take. I have 4 types in total, and one of them (“urlop wypoczynkowy na żądanie”, leave on demand) is a “subtype” of another (“urlop wypoczynkowy”, annual leave). I have the phrase “na żądanie” in all the examples for the subtype and in none of the examples for the supertype, and yet I get this in my error log:

{ "text": "Ok, a wzór na żądanie?", "intent": "wniosek_urlop_wypoczynkowy_na_żądanie", "intent_prediction": { "name": "wniosek_delegacja_krajowa", "confidence": 0.10438069749559263 } }

There are also two related intents, “delegacja_krajowa” (domestic business trip) and “delegacja_zagraniczna” (business trip abroad), which usually differ by keywords such as “krajowa”, “na kraj”, “w kraju” (domestic) or “za granicę” (abroad), and yet these two intents sometimes get mixed up despite the presence of such keywords in the query.

Here are four of about 100 examples for “delegacja_zagraniczna”:

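  • Jadę w delegację za granicę, co mam wypełnić? (I’m going on a business trip abroad, what do I have to fill in?)
  • Gdzie są wnioski do wypełnienia na delegację zagraniczną? (Where are the forms to fill in for a foreign business trip?)
  • Daj mi wniosek do wypełnienia w celu wyjazdu na delegację zagraniczną. (Give me the form to fill in for going on a foreign business trip.)
  • Proszę o formularz wniosku na delegację za granicę (Please give me the application form for a business trip abroad)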

And here are four for “delegacja_krajowa”:

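  • Wyjeżdżam w delegację krajową, co muszę wypełnić? (I’m leaving on a domestic business trip, what do I need to fill in?)
  • Gdzie są kwestionariusze do wypełnienia na delegację krajową? (Where are the questionnaires to fill in for a domestic business trip?)
  • Skąd wezmę formularz na delegację krajową? (Where can I get the form for a domestic business trip?)
  • Daj mi wniosek o delegację wewnątrz kraju? (Give me the application for a business trip within the country?)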

Here’s an example of the error:

{ "text": "Zostałem delegowany do wyjazdu krajowego, daj mi wzór wniosku.", "intent": "wniosek_delegacja_krajowa", "intent_prediction": { "name": "wniosek_delegacja_zagraniczna", "confidence": 0.5110035213040968 } }
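
One way to test the assumption that a keyword occurs verbatim in only one intent’s training data is to index the examples by token and look up each token of the misclassified query. The sketch below is plain Python and not part of the bot framework: the helper names (`tokenize`, `token_index`, `exclusive_tokens`, `unseen_tokens`) are hypothetical, the tokenizer is a naive lowercase word split that may differ from the one in the actual pipeline, and the data is only the eight examples quoted above, not the full training set.

```python
import re
from collections import defaultdict

# Only the eight training examples quoted in this thread (the real sets are larger).
TRAINING = {
    "wniosek_delegacja_zagraniczna": [
        "Jadę w delegację za granicę, co mam wypełnić?",
        "Gdzie są wnioski do wypełnienia na delegację zagraniczną?",
        "Daj mi wniosek do wypełnienia w celu wyjazdu na delegację zagraniczną.",
        "Proszę o formularz wniosku na delegację za granicę",
    ],
    "wniosek_delegacja_krajowa": [
        "Wyjeżdżam w delegację krajową, co muszę wypełnić?",
        "Gdzie są kwestionariusze do wypełnienia na delegację krajową?",
        "Skąd wezmę formularz na delegację krajową?",
        "Daj mi wniosek o delegację wewnątrz kraju?",
    ],
}


def tokenize(text):
    """Naive tokenizer (an assumption): lowercase, then split on non-word characters."""
    return re.findall(r"\w+", text.lower())


def token_index(training):
    """Map each token to the set of intents whose examples contain it verbatim."""
    index = defaultdict(set)
    for intent, examples in training.items():
        for example in examples:
            for token in tokenize(example):
                index[token].add(intent)
    return index


def exclusive_tokens(query, index):
    """Return {token: intent} for query tokens that occur in exactly one intent."""
    result = {}
    for token in tokenize(query):
        intents = index.get(token, set())
        if len(intents) == 1:
            result[token] = next(iter(intents))
    return result


def unseen_tokens(query, index):
    """Return query tokens that never occur verbatim in any training example."""
    return [token for token in tokenize(query) if token not in index]


index = token_index(TRAINING)
query = "Zostałem delegowany do wyjazdu krajowego, daj mi wzór wniosku."
exclusive = exclusive_tokens(query, index)
unseen = unseen_tokens(query, index)
print("exclusive to one intent:", exclusive)
print("not in training data:", unseen)
```

On these eight examples, “krajowego” never appears verbatim (the training phrases use the inflected form “krajową”), while “wyjazdu” and “wniosku” appear only under the foreign-trip intent. Whether the same holds for the full set of about 100 examples per intent can only be checked by running this on the real data.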