Extracting pre-defined entities with PhraseMatcher

I saw that phrase-matching NER was addressed in PR #822, but it was never merged; it is said to have been implemented as part of PR #1312.

However, the latter doesn’t include entity_phrases. Instead, it uses the lookup table to generate additional features for the CRF entity extractor.
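To make the "lookup table as features" idea concrete, here is a rough sketch in plain Python. It is not Rasa's actual code: the function names, the `in_lookup` feature key, and the whitespace tokenization are all assumptions for illustration. The point is that a lookup table only yields a per-token flag, and a trained model (the CRF) still has to decide whether to use it.

```python
import re

def build_lookup_pattern(entries):
    # Hypothetical helper: turn lookup entries into one case-insensitive regex.
    # Longest entries go first so multi-word phrases win over their prefixes.
    ordered = sorted(entries, key=len, reverse=True)
    alternatives = "|".join(re.escape(entry) for entry in ordered)
    return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)

def lookup_features(text, entries):
    # Returns one feature dict per whitespace token. The flag is only an
    # input feature; it does not label the token as an entity by itself.
    pattern = build_lookup_pattern(entries)
    match_spans = [m.span() for m in pattern.finditer(text)]
    features = []
    for tok in re.finditer(r"\S+", text):
        hit = any(tok.start() < end and start < tok.end()
                  for start, end in match_spans)
        features.append({"token": tok.group(), "in_lookup": hit})
    return features

features = lookup_features("I want a flat white please",
                           ["flat white", "espresso"])
for f in features:
    print(f)
```

Only the tokens "flat" and "white" get `in_lookup` set to true here. Whether they end up tagged as an entity depends on what the model learned from the training examples.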

Has anyone here tried working with such an extractor? And do you know whether phrase-matching NER is on the roadmap?

Hey, no, the lookup table is a replacement for the phrase matcher. It should work just as well.

Thanks @akelad. According to the lookup table documentation: “For lookup tables to be effective, there must be a few examples of matches in your training data. Otherwise the model will not learn to use the lookup table match features.” This means that lookup tables only add regex-based features to the CRF model. Phrase matching, on the other hand, doesn’t require any training data; it simply matches the listed phrases against the text. So the two extractors will not produce the same results.
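To illustrate what I mean by simple matching, here is a minimal sketch of a dictionary-based extractor in plain Python. It is not the code from PR #822; the function name `match_phrases`, the phrase dictionary format, and the "longest match wins" rule are my own assumptions. It needs no training data: any listed phrase found in the text is returned as an entity directly.

```python
import re

def match_phrases(text, phrases_by_entity):
    # Hypothetical extractor: phrases_by_entity maps an entity name to a
    # list of phrases. Every whole-word, case-insensitive hit is a candidate.
    candidates = []
    for entity, phrases in phrases_by_entity.items():
        for phrase in phrases:
            pattern = re.compile(r"\b" + re.escape(phrase) + r"\b",
                                 re.IGNORECASE)
            for m in pattern.finditer(text):
                candidates.append({"entity": entity, "value": m.group(),
                                   "start": m.start(), "end": m.end()})
    # Prefer longer matches, then earlier ones, and drop anything overlapping
    # a match that was already kept.
    candidates.sort(key=lambda c: (-(c["end"] - c["start"]), c["start"]))
    chosen = []
    for c in candidates:
        if all(c["end"] <= k["start"] or c["start"] >= k["end"]
               for k in chosen):
            chosen.append(c)
    return sorted(chosen, key=lambda c: c["start"])

entities = match_phrases(
    "Book a table at Blue Hill in New York",
    {"restaurant": ["Blue Hill"], "city": ["New York", "York"]},
)
for e in entities:
    print(e)
```

Here "Blue Hill" and "New York" come back as entities without a single training example, and the shorter overlapping "York" is dropped. That is the behavior I don't get from the lookup table features alone.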

Hi @akelad,

Any update on this one? Thanks!

But they do serve the same purpose. We decided against merging the NER phrase matcher and created the lookup tables instead, so please use those.