You seem to have very little training data. Adding more would help. Make sure the training data includes a few examples that also appear in the lookup table.
Also, as you can see, the confidence for lt_filme is very low (0.37). Since "movie de comédia" is similar to "filme de comédia", the CRF is tagging it as an lt_filme entity.
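If low-confidence extractions like this one are a problem, one option is to drop them after parsing. Here is a minimal sketch, assuming each entity in the parse result carries a `confidence` field. The sample result, the helper name `confident_entities` and the 0.5 threshold are all hypothetical; only the 0.37 score comes from this thread.

```python
# Hypothetical parse result; only the 0.37 score for lt_filme comes from this thread.
parse_result = {
    "text": "movie de comédia",
    "entities": [
        {"entity": "lt_filme", "value": "movie de comédia", "confidence": 0.37},
    ],
}


def confident_entities(result, threshold=0.5):
    """Keep only the entities whose confidence is at least `threshold`."""
    return [
        entity
        for entity in result.get("entities", [])
        if entity.get("confidence", 0.0) >= threshold
    ]


# With the assumed 0.5 threshold, the 0.37 extraction is discarded.
kept = confident_entities(parse_result)
```

The threshold is something you would have to tune on your own data; it hides weak extractions but does not make the model any better at telling the entities apart.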
Hi @srikar_1996, thanks for answering. I already tried with more data but got the same result.
I always include some of the lookup entries in the training phrases.
So you are saying that it does not extract only when the word exactly matches an entry in the lookup file, but that it also extracts similar words, just with lower confidence?
For example, when I parse q="filme de dramático" I get this response from Rasa NLU:
These results do not seem to make sense, because I have these words in the lookup files, yet the extractor sometimes does not extract the word at all, or extracts it with a low confidence score.
Yes, something like that. This happens with my application as well but I do not use a lookup table. I’m not entirely sure if it’s the same case with lookup tables.
You have multiple entities that are very similar, which is probably why the bot extracts them with lower confidence. For example, lt_dramat and lt_comedia have a similar structure.
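To illustrate why a lookup entry is not a guaranteed match: as far as I understand, the lookup table is turned into a regex, and a match becomes just one more feature for the CRF rather than a hard rule. Below is a minimal sketch of that idea in plain Python. It is not the Rasa implementation; the entries in `LOOKUP_FILME` and the helper names are hypothetical.

```python
import re

# Hypothetical lookup entries, standing in for the contents of a lookup file.
LOOKUP_FILME = ["filme de comédia", "filme de drama", "filme de terror"]


def build_lookup_pattern(entries):
    """Compile the lookup entries into one word-bounded, case-insensitive regex."""
    # Longest entries first, so longer phrases win over their prefixes.
    escaped = [re.escape(e) for e in sorted(entries, key=len, reverse=True)]
    return re.compile(r"\b(?:" + "|".join(escaped) + r")\b", re.IGNORECASE)


def lookup_features(text, pattern):
    """Return (token, in_lookup) pairs; in_lookup is True inside a lookup match."""
    spans = [m.span() for m in pattern.finditer(text)]
    features = []
    for tok in re.finditer(r"\S+", text):
        inside = any(
            start <= tok.start() and tok.end() <= end for start, end in spans
        )
        features.append((tok.group(), inside))
    return features


pattern = build_lookup_pattern(LOOKUP_FILME)
# Exact lookup phrase: the three tokens of the phrase get the lookup flag.
exact = lookup_features("quero um filme de comédia", pattern)
# Similar phrase that is not in the lookup: no token gets the flag.
similar = lookup_features("quero um movie de comédia", pattern)
```

In the second sentence no token carries the lookup flag, so the CRF would have to rely on its other features, such as the surrounding words. That would explain why it can still tag a similar phrase, but with lower confidence, and why similar-looking entities compete with each other.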