Hi there,
I’m using Rasa 1.10.24. I have to build an Arabic version of a chatbot we already have in English. My pipeline uses the Whitespace Tokenizer, CountVectorsFeaturizer, and DIET Classifier, with the language set to ar. Because I can’t read Arabic and it is written right to left, I don’t know how entity annotations are supposed to work. I looked at the NLU files in the Rasa Arabic PoCs. I expected the annotations to simply mark the entity value, but that doesn’t seem to be how they work. We are translating the NLU data directly with Google’s translation API. When I train the NLU model, I get warnings like this:
rasa/utils/common.py:387: UserWarning: Misaligned entity annotation in message ‘ما هي العلامات المبكرة لخلل التنسج الصدري’ with intent ‘user_inform_health’. Make sure the start and end values of entities in the training data match the token boundaries (e.g. entities don’t include trailing whitespaces or punctuation).
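To make sure I understand what the warning is checking, here is a minimal Python sketch of my reading of it (this is not Rasa's actual code). It assumes the Rasa 1.x Markdown format `[entity text](entity_name)` and approximates the Whitespace Tokenizer with a plain split on whitespace. The helper names `parse_example` and `misaligned` are made up for illustration.

```python
import re

# Assumed Rasa 1.x Markdown annotation: [entity text](entity_name)
ANNOTATION = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")


def parse_example(annotated):
    """Strip annotations; return (plain_text, [(start, end, entity_name), ...])."""
    plain, spans, cursor = "", [], 0
    for match in ANNOTATION.finditer(annotated):
        plain += annotated[cursor:match.start()]  # text before the annotation
        start = len(plain)
        plain += match.group(1)                   # the entity text itself
        spans.append((start, len(plain), match.group(2)))
        cursor = match.end()
    plain += annotated[cursor:]                   # text after the last annotation
    return plain, spans


def misaligned(annotated):
    """Return entities whose start or end is not a whitespace-token boundary."""
    plain, spans = parse_example(annotated)
    # Approximation: a token is any run of non-whitespace characters.
    tokens = [(m.start(), m.end()) for m in re.finditer(r"\S+", plain)]
    starts = {s for s, _ in tokens}
    ends = {e for _, e in tokens}
    return [(plain[s:e], name) for s, e, name in spans
            if s not in starts or e not in ends]
```

Under these assumptions, `misaligned("show me [chest pain](symptom) info")` returns an empty list, while `misaligned("show me [chest pain ](symptom)info")` flags the entity because the brackets include a trailing space. Offsets are counted in characters of the stored string, so the same check applies to Arabic text even though it is displayed right to left.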
Can someone explain how entity annotations should be written for Arabic text?