The warning may be caused by the sentence “for fivefive people” in nlu.md.
Why are there two entities, “five” and “five”? I want it to be treated as either number or num_people, but not both. Maybe I'm missing some config, but I can't fix it. Any help will be appreciated, thanks.
Can you post your training data with the entity annotations here? I'm having trouble understanding your situation, specifically which entity you are trying to extract and in what format you want to extract it.
Yes, of course. I'm just learning Rasa from the examples/formbot example. I created a new project and added some sentences like the ones in the example, step by step.
This is a screenshot of the training data, which may contain a mistake:
They are not supposed to be recognized like that at all, if I'm not mistaken. Normally the pipeline uses the Whitespace Tokenizer, which splits the sentence on ' '. So only 88 as a whole can be recognized as an entity. I don't see that data in the formbot example on GitHub. Is that your custom data? Maybe you want to write it as [88](num_people) people please? Why do you want to split 88 into 8 and 8?
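To illustrate the tokenization point, here is a minimal sketch using plain `str.split()`. It is only an approximation of whitespace tokenization for illustration, not Rasa's own tokenizer code, and `whitespace_tokens` is a hypothetical helper name.

```python
from typing import List


def whitespace_tokens(text: str) -> List[str]:
    """Split a sentence on whitespace (illustration only, not Rasa's implementation)."""
    return text.split()


tokens = whitespace_tokens("88 people please")
print(tokens)
# "88" stays one token, so it cannot be split into two separate "8" entities.
```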
The data is mine, for testing. I added it with Rasa X, and the original sentence I added was “8 people please”, but maybe something changed when Rasa X marked the entity automatically. Isn't it true that the 8 will be treated as either a num_people entity or a number entity, not both?
Yes, if you have two entities, ‘num_people’ and ‘number’, then the 8 can only be categorized as one of them, and which one it is recognized as depends on the data you provide for the model to train on.
What does ‘number’ represent? Can it be made more specific to a particular context, like number of dishes or number of waiters? I recommend that you either define that entity for a more specific context so the model can learn it, or just get rid of it.
Anyway, I'm pretty sure [8](num_people)[8](number) is an invalid format.
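If it helps to spot this in a larger nlu.md, here is a rough sketch that flags annotations sitting back to back with no text between them. It assumes the `[text](entity)` markdown style shown in this thread; the regex and the `find_adjacent_annotations` helper are my own illustration, not part of Rasa or Rasa X.

```python
import re
from typing import List, Tuple

# Matches one markdown-style annotation such as [8](num_people).
ANNOTATION = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")


def find_adjacent_annotations(example: str) -> List[Tuple[str, str]]:
    """Return label pairs for annotations that touch with no text between them."""
    matches = list(ANNOTATION.finditer(example))
    pairs = []
    for left, right in zip(matches, matches[1:]):
        # Two annotations are adjacent when one ends exactly where the next begins.
        if left.end() == right.start():
            pairs.append((left.group(2), right.group(2)))
    return pairs


print(find_adjacent_annotations("[8](num_people)[8](number) people please"))
print(find_adjacent_annotations("[8](num_people) people please"))
```

Any example that returns a non-empty list is worth re-annotating by hand so that each value carries a single entity label.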