I’m wondering what a good way is to handle something like “I need 2 rooms that can allow 6 people”. Here 2 and 6 will both be “number” entities (extracted, say, by ner_duckling_http), but we need two slots populated, say number_rooms and number_people.
You need ner_crf for this. Duckling can only extract the number but cannot contextualize it.
It is similar to the problem of origin and destination: “I want to fly to New York from LA”. Both are cities, but to give them context you need ner_crf.
Read up on CRFs to understand how to tackle this kind of problem. You will need many variations in your training data.
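As a rough illustration of what "many variations" could look like, here is a minimal sketch that generates annotated examples. It assumes Rasa NLU's Markdown training format, where an entity is marked as [text](entity_name). The entity names rooms and people, the templates, and the helper make_examples are all hypothetical, and a real dataset would need far more varied phrasing than this.

```python
import itertools

# Hypothetical sentence templates; {r} and {p} are replaced by annotated spans.
TEMPLATES = [
    "I need {r} that can allow {p}",
    "book {r} for {p}",
    "we are {p} and want {r}",
]
ROOM_COUNTS = ["2", "two", "3"]
PEOPLE_COUNTS = ["6", "six", "5"]
PEOPLE_WORDS = ["people", "guests", "persons"]


def make_examples():
    """Return one annotated training line per combination of the lists above."""
    examples = []
    for tpl, r, p, word in itertools.product(
        TEMPLATES, ROOM_COUNTS, PEOPLE_COUNTS, PEOPLE_WORDS
    ):
        rooms = f"[{r} rooms](rooms)"        # number + keyword labelled together
        people = f"[{p} {word}](people)"
        examples.append("- " + tpl.format(r=rooms, p=people))
    return examples


if __name__ == "__main__":
    for line in make_examples()[:5]:
        print(line)
```

The point is that the number and its keyword are labelled as one span, so the CRF learns the surrounding context rather than the individual numbers.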
Are you saying that ner_crf can handle any number? (It would not be feasible to include every possible number, including text variations, in the training data.) It was my impression that Duckling is needed for anything dealing with numbers, time, etc.
Yeah, Duckling can handle numbers, but in your case you need more than numbers. The user says “I need 2 rooms that can allow 6 people”. Duckling will give you 2 and 6 as numbers, which doesn’t tell you which one is the room count and which is the people count.
You should train the CRF to extract “2 rooms” and “6 people”, and use those spans to work out the context.
You are not handling numbers with the CRF, but rather context.
2 rooms 5 people
Basically, the CRF has to learn a pattern: a keyword (room, people, persons, guests) preceded by a number.
So, if I understand you correctly, I should train the system to recognize the entities as normal and then use something else to convert either the word “two” or the numeral 2 to 2?
Use the CRF to find “2 persons” or “3 rooms”, and then use a regex to pull out the number. Look into CRFs a bit for this kind of problem; it is not easy, and your dataset should be really well balanced.
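The regex step might look roughly like the sketch below. It assumes the CRF returns entities as dicts with "entity" and "value" keys, and that the entity names are rooms and people; those names, the small WORD_TO_INT table, and the helpers span_to_int and fill_slots are hypothetical and only meant to show the idea.

```python
import re

# Small hypothetical lookup; extend it with the number words you expect.
WORD_TO_INT = {
    "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
}

# Matches either a run of digits or one of the known number words.
NUMBER_RE = re.compile(
    r"\b(\d+|" + "|".join(WORD_TO_INT) + r")\b", re.IGNORECASE
)

# Assumed mapping from CRF entity name to slot name.
SLOT_FOR_ENTITY = {"rooms": "number_rooms", "people": "number_people"}


def span_to_int(span):
    """Return the first number in a span such as '2 rooms' or 'six people'."""
    match = NUMBER_RE.search(span)
    if match is None:
        return None
    token = match.group(1).lower()
    return int(token) if token.isdigit() else WORD_TO_INT[token]


def fill_slots(entities):
    """Turn CRF entity spans into integer slot values."""
    slots = {}
    for ent in entities:
        slot = SLOT_FOR_ENTITY.get(ent["entity"])
        number = span_to_int(ent["value"])
        if slot is not None and number is not None:
            slots[slot] = number
    return slots


if __name__ == "__main__":
    crf_output = [
        {"entity": "rooms", "value": "2 rooms"},
        {"entity": "people", "value": "six people"},
    ]
    print(fill_slots(crf_output))
```

Spans without a recognizable number are simply skipped, so the slot stays empty and the bot can ask a follow-up question.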
Ok, thank you for your suggestions.
I have the same question, but I think it’s a bit more complicated: what if I need similar functionality, but not for English? Currently we are using the TensorFlow pipeline, as it’s language agnostic, plus translating Duckling, but since Duckling just extracts everything it finds in the data, it won’t work in such situations. What approach should be taken here?
@shota, can you explain the problem or give an example of your scenario?