My apologies if this topic has already been discussed, but I wasn’t able to find anything like it. Our problem is the following: in languages like German and Dutch, indefinite articles are recognized as numbers by entity extractors like Duckling. Examples:
- Ich möchte ein Seminar buchen (I would like to book a seminar)
- Ik zou graag een voucher willen kopen (I would like to buy a voucher)
In these examples, the words "ein" and "een" are articles, but Duckling extracts a number entity with value 1 for each of them. In the form that follows these user utterances, we would like to ask for the number of days the seminar will take or the number of rooms that are needed, but the form fills these slots with the number entity that was extracted from the article.
The problem is that we can’t simply skip entity extraction here, because we still want to extract numbers when the user states the number of rooms directly in the utterance, like this:
- Ich möchte ein Seminar buchen und brauche 3 Räume (I would like to book a seminar and I need 3 rooms)
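To make the desired behaviour concrete, here is a minimal sketch of the kind of filtering we have in mind. It is plain Python and not tied to any Rasa or Duckling API: the function name, the article list, and the assumption that each extracted entity is a dict with "entity", "value" and "text" keys are ours, for illustration only. Note that this naive approach would also drop a genuine "ein" that really means "one", which is part of why we are asking for something better.

```python
# Hypothetical sketch: the entity dicts below are an assumed shape
# ("entity", "value", "text"), not a confirmed extractor output format.

# Assumed, incomplete list of German and Dutch indefinite article forms.
ARTICLES = {"ein", "eine", "einen", "einem", "einer", "eines", "een"}


def drop_article_numbers(entities):
    """Return the entities without number entities whose text is an article."""
    kept = []
    for ent in entities:
        text = ent.get("text", "").strip().lower()
        if ent.get("entity") == "number" and text in ARTICLES:
            # Skip: the "number" was most likely triggered by an article.
            continue
        kept.append(ent)
    return kept


# Entities as we assume they might look for
# "Ich möchte ein Seminar buchen und brauche 3 Räume".
entities = [
    {"entity": "number", "value": 1, "text": "ein"},
    {"entity": "number", "value": 3, "text": "3"},
]
filtered = drop_article_numbers(entities)
```

With this input, only the entity for "3" remains in `filtered`, so the slot for the number of rooms would be filled with 3 and not with the 1 coming from the article.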
So our question is: is there a clean way to prevent number extraction for articles, or is there a workaround for this problem?
Thanks in advance!