I am building a real estate bot and am trying to extract entities at two levels at the same time: the apartment type and the number of specific rooms (bedrooms, bathrooms, etc.).
In our domain, people usually refer to apartments with phrases like:
- I am looking for a 2 bed 2 bath apt
- Looking for a 2x2 apt
- I need a studio or a 1 bedroom
- I need a 2b/2b
- Looking for a room in a shared apartment
My idea was to extract three entities:
- unit_type: studio, full apartment, shared apartment
- bedrooms: 1, 2, 3, etc.
- bathrooms: 1, 1.5, 2, etc.
However, these entities often overlap, so I am not sure how to create the training samples, or whether Rasa can handle overlapping entities at all.
For example, the same sentence could be annotated as:
- I am looking for a (2 bed 2 bath apt)[unit_type:full apartment]
- I am looking for a (2 bed)[bedrooms:2] (2 bath)[bathrooms:2] apt
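To make the overlap concrete, here is a minimal Python sketch that writes both annotations as character spans and checks which ones collide. The dict layout (`start`, `end`, `entity`, `value`) and the helper names `span` and `overlaps` are my own assumptions for illustration, not necessarily the format Rasa expects.

```python
# Illustrative only: the two candidate annotations expressed as character spans.
text = "I am looking for a 2 bed 2 bath apt"

def span(text, phrase, entity, value):
    """Build a span dict for the first occurrence of `phrase` in `text`."""
    start = text.index(phrase)
    return {"start": start, "end": start + len(phrase),
            "entity": entity, "value": value}

# Option 1: one coarse entity covering the whole phrase.
coarse = [span(text, "2 bed 2 bath apt", "unit_type", "full apartment")]

# Option 2: fine-grained entities for the room counts.
fine = [
    span(text, "2 bed", "bedrooms", "2"),
    span(text, "2 bath", "bathrooms", "2"),
]

def overlaps(a, b):
    """True when two half-open [start, end) spans share at least one character."""
    return a["start"] < b["end"] and b["start"] < a["end"]

# Every (coarse, fine) pair whose spans collide.
conflicts = [(c["entity"], f["entity"])
             for c in coarse for f in fine if overlaps(c, f)]
print(conflicts)
```

Both fine-grained spans sit entirely inside the unit_type span, which is exactly the situation I am unsure how to annotate.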
Should I have two different training examples? Can I somehow merge them into one? And is there a better way of handling the numbers themselves?
Thanks a lot for your help! Nicolas