I am struggling to extract organization names. I tried SpacyEntityExtractor, as recommended in the docs and on this forum, and I also tried training a CRF on a sample. But, as was also mentioned in the discussion, the quality is usually really bad because company names do not follow any common pattern.
But most of our clients send very similar requests that involve a company name, something like ‘Can you give me an address of company XXX’, ‘Registration info of YYY’, ‘Tax id of ZZZ, do you know where to find it?’, etc.
I wondered whether it would be possible to ‘subtract’ these typical parts and feed the rest to NLU to make the NER task simpler. Or how would you do it if the requests that involve a hard-to-recognize entity are very similar?
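
To make the idea concrete, here is a rough sketch of what I mean by ‘subtracting’ the typical parts. It is plain Python with regular expressions, not a Rasa component; the templates and the `extract_org` helper are hypothetical and only cover the three example phrasings above.

```python
import re

# Hypothetical request templates built from the example phrasings.
# The named group "org" captures whatever remains once the fixed part is matched.
TEMPLATES = [
    re.compile(
        r"^can you give me (?:an|the) address of (?:company )?(?P<org>.+?)\??$",
        re.IGNORECASE,
    ),
    re.compile(r"^registration info of (?P<org>.+?)$", re.IGNORECASE),
    re.compile(
        r"^tax id of (?P<org>.+?)(?:, do you know where to find it\??)?$",
        re.IGNORECASE,
    ),
]


def extract_org(text):
    """Return the leftover span if the text fits a known template, else None."""
    cleaned = text.strip()
    for pattern in TEMPLATES:
        match = pattern.match(cleaned)
        if match:
            return match.group("org").strip()
    return None  # no template matched; fall back to the regular NER


print(extract_org("Can you give me an address of company XXX"))
print(extract_org("Tax id of ZZZ, do you know where to find it?"))
```

When a template matches, the leftover span would be treated as the company name; when nothing matches, the message would go through the usual extractor unchanged.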