Bot Cannot Identify All Banglish Intents in FAQ

I am developing a FAQ chatbot where users ask questions in Banglish, which is a mix of Bangla and English written in English (Latin) characters. I have about 100 intents with many examples each. However, the bot fails to identify the correct intent in many cases and gives the wrong reply. What is the best way to handle this issue?

There are several possible causes for your problem:

  • The questions might be too similar => check whether patterns, words, or sentences are shared between intents. I do not know whether a sentence encoder exists for Banglish, but you could also check the cosine similarities between your samples.
  • The model might not be large enough to recognize the differences => increase the dimensionality and the number of layers in your config.yml.
  • Check whether the training samples are balanced in size. For example, do you have 100 intents with 20 questions each, or do some intents contain 40 questions and others only 5?
  • What is the accuracy at the end of training, and how many epochs do you train for?
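
The similarity and balance checks above can be sketched in a few lines of Python. This is only a rough illustration, not part of any chatbot framework: since I do not know of a Banglish sentence encoder, it swaps in character trigram counts as a crude stand-in for embeddings, which at least tolerates spelling variation in romanized text. The intent names, the example sentences, and the 0.6 threshold are all made up for the demo; you would load your own training examples instead.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def char_ngrams(text, n=3):
    """Count overlapping character n-grams of a lowercased, space-padded string."""
    padded = f" {' '.join(text.lower().split())} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors (0.0 if either is empty)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def confusable_pairs(examples, threshold=0.6):
    """Return (score, text_a, text_b) for examples of DIFFERENT intents
    whose similarity is at or above the threshold, most similar first."""
    vectors = [(intent, text, char_ngrams(text)) for intent, text in examples]
    pairs = []
    for (ia, ta, va), (ib, tb, vb) in combinations(vectors, 2):
        if ia != ib:
            score = cosine(va, vb)
            if score >= threshold:
                pairs.append((score, ta, tb))
    return sorted(pairs, reverse=True)


def intent_sizes(examples):
    """Number of training examples per intent, to spot imbalance."""
    return Counter(intent for intent, _ in examples)


# Hypothetical (intent, example) pairs; replace with your own training data.
examples = [
    ("check_balance", "amar balance koto"),
    ("check_balance", "balance check korbo kivabe"),
    ("account_info", "amar account balance koto"),
    ("recharge", "recharge kivabe korbo"),
]

pairs = confusable_pairs(examples, threshold=0.6)
for score, text_a, text_b in pairs:
    print(f"{score:.2f}  {text_a!r} <-> {text_b!r}")

# Intents sorted from fewest to most examples.
for intent, size in sorted(intent_sizes(examples).items(), key=lambda kv: kv[1]):
    print(intent, size)
```

Pairs that score high but belong to different intents are candidates for merging the intents or rewording the examples. Intents with far fewer examples than the rest are candidates for adding more data.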

In addition, it would help to see your entire setup, e.g. your config.yml, so that further suggestions can be made.