Using two CRFEntityExtractors

Hi, I want my bot to extract entities from German address data. The most usual format would be . So one way would be to mark the whole address as a single entity; however, I also want to be able to extract (most) entities independently. If I split it up like in my example, I might lose the advantages of extracting the whole pattern. So my idea was: why not use both and let custom actions make the decisions?

My idea was to use two trained CRFEntityExtractors, which should be no problem (as long as I use different entity names), or is it? Has anyone tried something like this? Is this a good idea?

Do I have to write a lot of custom code or is there an easy way to solve this?

Sounds like you’re talking about a familiar concept: composite entities. We have had a few discussions about how to implement them, and recently started drafting an implementation on the GitHub repo. You could test out this branch if you like, but note that it is not fully complete.

Using two CRFEntityExtractors is highly experimental, but it could work.

I solved it on the technical side by running one of the extractors as an NLU server and making server requests to it from a custom entity extractor. It seems to work as expected so far. I’ll look further into it and check whether it’s helpful for my application.
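
For anyone who wants to try the same setup, here is a rough sketch of the request side only, not the exact code from my project. It assumes the second model is served over HTTP in the Rasa 1.x style (`POST /model/parse` with a `text` field; older Rasa NLU servers use `/parse` with a `q` field instead). The URL, the function names and the `remote_crf` label are made up for illustration, and wiring this into a custom pipeline component is left out.

```python
import json
from urllib import request

# Assumption: address and port of the second NLU server (hypothetical).
NLU_URL = "http://localhost:5005/model/parse"


def fetch_remote_parse(text, url=NLU_URL, timeout=5):
    """POST the message text to the remote NLU server and return its JSON reply."""
    payload = json.dumps({"text": text}).encode("utf-8")
    req = request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def remote_entities(text, parse=fetch_remote_parse, extractor_name="remote_crf"):
    """Return the remote server's entities, tagged with a distinct extractor name.

    `parse` is injectable so the function can be tried without a running server.
    """
    result = parse(text)
    entities = []
    for ent in result.get("entities", []):
        tagged = dict(ent)  # copy, so the server response is not mutated
        tagged["extractor"] = extractor_name
        entities.append(tagged)
    return entities
```

Inside a custom entity extractor, the list returned by `remote_entities` would then be appended to the entities already found by the local pipeline, which is why the two models need different entity names.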

I’m not sure whether these composite entities do the same thing here. What I expect from using the full address as a single entity is that it is far more easily recognized correctly as a full pattern. However, I want to be flexible enough that the user can also enter just individual parts. It might do the job as well, but I expect it to perform worse in cases where the full address is provided.
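
To make the decision step concrete, this is roughly what I have in mind for the custom action: prefer the full-address entity when one was found, otherwise fall back to whatever individual parts were extracted. It is only a sketch; the entity names (`address`, `street`, `house_number`, `zip_code`, `city`) are placeholders and not taken from my training data, and picking the longest match is just one possible tie-breaker.

```python
# Hypothetical entity names; replace them with the ones used in the training data.
FULL_ENTITY = "address"
PART_ENTITIES = ("street", "house_number", "zip_code", "city")


def resolve_address(entities, full_name=FULL_ENTITY, part_names=PART_ENTITIES):
    """Choose between the full-address entity and the individual parts.

    `entities` is a list of dicts with at least "entity" and "value" keys,
    as found in the NLU parse result.
    """
    full = [e for e in entities if e.get("entity") == full_name]
    if full:
        # Several candidates: keep the one with the longest text.
        best = max(full, key=lambda e: len(str(e["value"])))
        return {"full": best["value"]}

    # No full pattern found: collect the parts that were recognized.
    parts = {
        e["entity"]: e["value"]
        for e in entities
        if e.get("entity") in part_names
    }
    return {"parts": parts} if parts else {}
```

The same function could be called from the `run` method of a custom action with the entities of the latest message.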