Hi, I just started using entity roles and groups and would like to know the correct way to label them. In the example below, I have data for a user travelling from one location to another, with the roles currentlocation and destination. How should I label a sentence that contains only the currentlocation or only the destination?
E.g. (version 1):
## intent:entitychallenge
- from [UK]{"entity":"Country", "role":"currentlocation"} to [USA]{"entity":"Country", "role":"destination"}
- from [USA]{"entity":"Country", "role":"currentlocation"} to [UK]{"entity":"Country", "role":"destination"}
- to [USA]{"entity":"Country"}
- from [UK]{"entity":"Country"}
Is that correct, or should it be (version 2):
## intent:entitychallenge
- from [UK]{"entity":"Country", "role":"currentlocation"} to [USA]{"entity":"Country", "role":"destination"}
- from [USA]{"entity":"Country", "role":"currentlocation"} to [UK]{"entity":"Country", "role":"destination"}
- to [USA]{"entity":"Country","role":"destination"}
- from [UK]{"entity":"Country","role":"currentlocation"}
After reading the documentation, I think that both of those annotation versions are correct; however, I would go with version 1.
I think you need to consider what the model should learn. I would use the roles whenever there is a pair of annotated entities in the sentence, so that the model learns: whenever there is a currentlocation, there is most likely also a destination, regardless of which entity values appear in which position. If you say “I love the USA”, there is semantically no need to role-label the entity, because the sentence is not about specifying a destination or current location. It is still a country, though.
Maybe we should ask @Tanja for clarification here - is there any kind of best practice?
Both versions are correct. There is no clear answer as to whether to use version 1 or 2.
I agree with @JulianGerhard that you need to think about what the model should actually learn. In what kind of situation would a user just say “to USA”? Does USA in this situation still map to the destination role, or can the phrase also be used in a different context, in which USA is just a country? So, you need to think about the assistant you are building. Do currentlocation and destination always co-occur, or can they also be mentioned on their own? If the user can mention them on their own, I would go with version 2; otherwise, if they always occur together, version 1 might make more sense.
If a country is mentioned in a completely different context, you should not annotate it with any role label. For example, in Julian's sentence “I love the USA”, USA should be annotated with just the Country entity label and no role. I agree with that.
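To see which annotation style a data file actually follows, it can help to tally how often each entity value appears with and without a role. The following is a minimal sketch, not part of Rasa: the `role_counts` helper and the regular expression are assumptions written only for the inline annotation syntax shown in this thread, and the three sample lines are taken from the examples discussed above.

```python
import json
import re
from collections import Counter

# Matches the inline annotation syntax used in this thread:
# [text]{"entity": "...", "role": "..."}
ANNOTATION = re.compile(r'\[([^\]]+)\](\{[^}]*\})')


def role_counts(lines):
    """Count (entity, value, role) triples; role is None when no role is annotated."""
    counts = Counter()
    for line in lines:
        for value, meta in ANNOTATION.findall(line):
            info = json.loads(meta)
            counts[(info["entity"], value, info.get("role"))] += 1
    return counts


# Sample lines based on the examples in this thread.
examples = [
    '- from [UK]{"entity":"Country", "role":"currentlocation"} to [USA]{"entity":"Country", "role":"destination"}',
    '- to [USA]{"entity":"Country"}',
    '- i love the [USA]{"entity":"Country"}',
]

counts = role_counts(examples)
# str() on the role keeps the sort stable when some roles are None.
for (entity, value, role), n in sorted(counts.items(), key=lambda kv: (kv[0][0], kv[0][1], str(kv[0][2]))):
    print(entity, value, role, n)
```

A value that shows up both with a role and without one in near-identical sentences (as with "to USA" here) is a hint that the data mixes version 1 and version 2.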
Thank you for the help @JulianGerhard and @Tanja.
I went with version 2 and updated my NLU data as such:
## intent:entitychallenge
- i want to go from [UK]{"entity":"Country", "role":"currentlocation"} to [USA]{"entity":"Country", "role":"destination"}
- i want to go from [USA]{"entity":"Country", "role":"currentlocation"} to [UK]{"entity":"Country", "role":"destination"}
- i want to go to [USA]{"entity":"Country", "role":"destination"}
- i want to go to [UK]{"entity":"Country", "role":"destination"}
- i want to go from [USA]{"entity":"Country", "role":"currentlocation"}
- i want to go from [UK]{"entity":"Country", "role":"currentlocation"}
The recognition works perfectly for these four sentences:
? Your input -> i want to go from UK to USA
? Is the intent 'entitychallenge' correct for 'i want to go from [UK]{"entity": "Country", "role": "currentlocation"} to [USA]{"entity": "Country", "role": "destination"}' and are all entities labeled correctly? (Y/n)
? Your input -> i want to go from USA to UK
? Is the intent 'entitychallenge' correct for 'i want to go from [USA]{"entity": "Country", "role": "currentlocation"} to [UK]{"entity": "Country", "role": "destination"}' and are all entities labeled correctly? (Y/n)
? Your input -> i want to go to USA
? Is the intent 'entitychallenge' correct for 'i want to go to [USA]{"entity": "Country", "role": "destination"}' and are all entities labeled correctly? (Y/n)
? Your input -> i want to go to UK
? Is the intent 'entitychallenge' correct for 'i want to go to [UK]{"entity": "Country", "role": "destination"}' and are all entities labeled correctly? (Y/n)
However, it does not work for the other two sentences, where the role is predicted as destination instead of currentlocation:
? Your input -> i want to go from USA
? Is the intent 'entitychallenge' correct for 'i want to go from [USA]{"entity": "Country", "role": "destination"}' and are all entities labeled correctly? (Y/n)
? Your input -> i want to go from UK
? Is the intent 'entitychallenge' correct for 'i want to go from [UK]{"entity": "Country", "role": "destination"}' and are all entities labeled correctly? (Y/n)
Could this be because the bot does not have enough data/examples, or are the sentences perhaps too similar?
Yes, it could be that you do not have enough training examples. Try adding a few more examples and see how it goes. However, other users have already reported similar issues, and I want to take a closer look at this soon, as the model might simply be overfitting to USA being the destination. Let me know how it goes. Thanks.
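One way to add more examples without favouring a particular country in a particular role is to generate them from templates. The sketch below is only an illustration under stated assumptions: the extra countries (France, Japan), the template sentences, and the helper names `annotate` and `generate_examples` are all hypothetical, and there is no guarantee that this alone resolves the misprediction described above.

```python
import itertools
import json

# Hypothetical values and templates, chosen only for illustration.
COUNTRIES = ["UK", "USA", "France", "Japan"]
PAIR_TEMPLATES = ["i want to go from {src} to {dst}", "travel from {src} to {dst}"]
FROM_TEMPLATES = ["i want to go from {src}", "i am leaving from {src}"]
TO_TEMPLATES = ["i want to go to {dst}", "take me to {dst}"]


def annotate(value, role):
    """Wrap a value in the inline annotation syntax used in this thread."""
    meta = json.dumps({"entity": "Country", "role": role})
    return f"[{value}]{meta}"


def generate_examples(countries):
    """Build training lines in which every country occurs equally often in both roles."""
    lines = []
    # Every ordered pair of distinct countries, so each one is both source and target.
    for src, dst in itertools.permutations(countries, 2):
        for template in PAIR_TEMPLATES:
            lines.append("- " + template.format(
                src=annotate(src, "currentlocation"),
                dst=annotate(dst, "destination"),
            ))
    # Single-entity sentences, following version 2 (roles annotated on their own).
    for country in countries:
        for template in FROM_TEMPLATES:
            lines.append("- " + template.format(src=annotate(country, "currentlocation")))
        for template in TO_TEMPLATES:
            lines.append("- " + template.format(dst=annotate(country, "destination")))
    return lines


lines = generate_examples(COUNTRIES)
print("## intent:entitychallenge")
print("\n".join(lines))
```

Because the pairs come from all ordered permutations, no country is seen more often as destination than as currentlocation, which is the imbalance the overfitting concern points at.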