Articles in German/Dutch are extracted as number entities by Duckling

Hi Everyone,

My apologies if this has already been discussed, but I wasn’t able to find anything on it. Our problem is the following: in languages like German and Dutch, the indefinite article is recognized as a number by entity extractors like Duckling. Examples:

  • Ich möchte ein Seminar buchen (I would like to book a seminar)
  • Ik zou graag een voucher willen kopen (I would like to buy a voucher)

In these examples, the articles ("ein" and "een") cause Duckling to extract a number entity with value 1. In the form that follows these user utterances, we would like to ask for the number of days the seminar will take or the number of rooms needed, but the form fills these slots with the number entity that was extracted from the article.

The problem is that we can’t simply skip entity extraction here, because we still want to extract numbers when the user states the number of rooms directly in the utterance, like this:

  • Ich möchte ein Seminar buchen und brauche 3 Räume (I would like to book a seminar and I need 3 rooms)

So our question is: is there a clean way to prevent number extraction for articles, or is there a workaround for this problem?

Thanks in advance!

You may need a custom action or a validation method to clean up after Duckling.

Hi Greg,

Thanks for your response! Perhaps I’m misunderstanding you, but we already have validations running that validate the extracted entities for the requested slots, such as the number of people.

But the problem here is that we cannot tell whether the extracted number is an actual number or just the article of a noun, which should be ignored.

Perhaps it would help if you could elaborate on your answer with an example?
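
One possible way to tell the two cases apart, as a rough sketch rather than a confirmed solution: look at the text span each number entity covers and drop the entity when that span is an article. The sketch assumes each entity dict carries "entity", "start" and "end" keys (character offsets into the message text). The names ARTICLE_FORMS, drop_article_numbers and fake_number are made up for illustration, and the sample entities are hand-built, not real Duckling output.

```python
from typing import Any, Dict, List

# Assumption: surface forms that get read as the number 1 but are usually
# indefinite articles. Extend or trim this set for your own data.
ARTICLE_FORMS = {
    "ein", "eine", "einen", "einem", "einer", "eines",  # German
    "een",                                               # Dutch
}


def drop_article_numbers(text: str, entities: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return the entities, minus 'number' entities whose covered text is an article."""
    kept = []
    for ent in entities:
        # Recover the words the entity was extracted from via its offsets.
        surface = text[ent["start"]:ent["end"]].strip().lower()
        if ent.get("entity") == "number" and surface in ARTICLE_FORMS:
            continue  # looks like an article, not a quantity
        kept.append(ent)
    return kept


def fake_number(text: str, word: str, value: int) -> Dict[str, Any]:
    """Build a hand-made number entity for the first occurrence of `word`."""
    start = text.index(word)
    return {"entity": "number", "value": value, "start": start, "end": start + len(word)}


text = "Ich möchte ein Seminar buchen und brauche 3 Räume"
entities = [fake_number(text, "ein", 1), fake_number(text, "3", 3)]
print([e["value"] for e in drop_article_numbers(text, entities)])
```

In a form validation method, something like this could be applied to the entities of the latest message before the slot is filled. Note the trade-off: "ein Raum" can also genuinely mean "one room", and this heuristic would discard that too, so the form would simply ask for the number again. Digits such as "1" are not affected.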