NER_CRF generalizes very badly

Training NER_CRF is really frustrating, even for a single entity.

I build up the training data sentence by sentence: whenever NER fails on a sentence, I add it and fill it with different entity values. I get the following results:

Adding just one example interferes with other examples, so that those other examples are no longer recognized. Also, testing on examples that appear word for word in the training data sometimes fails.

I feel I have to add every exact sentence structure to the data. NER_CRF does not learn to mix context words, so examples that combine two context words from two separate training examples fail.

Even if I have all sub-parts of a sentence in the training data, testing on an example that combines two of those parts fails…

I use these features:

 ["prefix5", "prefix2", "suffix3",
             "suffix2", "title", "upper"],
            ["bias", "upper", "title", "digit", "pattern"],
            ["prefix5", "prefix2", "suffix3",
             "suffix2", "title", "upper"]],

Any help?

Could you post your training data so we can see what’s going wrong there?

I think these are typical situations that most people run into. I would just like some advice; this should be possible somehow. Every time I read these threads I see the request for training data, but I think there should be some good guidelines from someone with experience in this. I don’t want to make the data public.

What I think is that, for my config, some statistics about context-word occurrences would be important to know; I sketch one way to collect them below.

And, as I stated, even examples from the training data are not recognized when tested on the exact same text. How can that be? Also, learning mixtures of different context words is really difficult, and adding a new example interferes negatively with previous examples…
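Something like this could collect those statistics, assuming the standard rasa_nlu_data JSON layout for the training file:

    import json
    from collections import Counter

    def context_counts(path):
        """Count the words directly before/after each annotated entity,
        to see how varied the entity contexts in the training data are."""
        with open(path) as f:
            examples = json.load(f)["rasa_nlu_data"]["common_examples"]
        before, after = Counter(), Counter()
        for ex in examples:
            text = ex["text"]
            for ent in ex.get("entities", []):
                left = text[:ent["start"]].split()
                right = text[ent["end"]:].split()
                if left:
                    before[left[-1].lower()] += 1
                if right:
                    after[right[0].lower()] += 1
        return before, after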

Well, there aren’t really any general guidelines for this… It depends on what your training data looks like, what kinds of entities there are, how much of it there is, etc.

If you don’t want to share your training data here, we’d be happy for you to send it to hi@rasa.com so we can take a look (confidentially of course)

@akelad thanks, I will think about it. But really, the issues above are common, and of course they come from overfitting. It is just really hard to find out how much variation the CRF needs to generalize well! From what I read, many people have these troubles. In my case, almost always both context words have to coincide with the trained example (the config above uses only context properties). Also, entities at the end of sentences are not recognized well, even if the same sentence is in the training data…

I gave that some thought. I feel the only solution is to build a more elaborate evaluation system that does something like backward fitting (removing one example at a time) until the best training data is found…
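Roughly what I have in mind, as a sketch: train() and evaluate() are hypothetical stand-ins for whatever training/evaluation harness is used, and since every candidate removal retrains the model, this costs O(n²) training runs:

    def backward_fit(examples, dev_set, train, evaluate):
        """Greedily drop the training example whose removal most improves
        the dev-set score, and repeat until no removal helps."""
        best_score = evaluate(train(examples), dev_set)
        improved = True
        while improved and len(examples) > 1:
            improved = False
            for i in range(len(examples)):
                candidate = examples[:i] + examples[i + 1:]
                score = evaluate(train(candidate), dev_set)
                if score > best_score:
                    best_score, examples, improved = score, candidate, True
                    break  # accept the removal and rescan
        return examples, best_score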

If this is something you want to work on implementing, then we’d love it if you shared it with us

Is it possible to access the learned feature weights of the CRF? I think that would be great for seeing where the overfitting happens and which examples to include.

Feel free to look at the code, I think you should be able to.
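For instance, ner_crf is built on sklearn-crfsuite, and a fitted sklearn_crfsuite.CRF exposes its learned weights directly via state_features_ and transition_features_. A minimal sketch, assuming you can get hold of the fitted CRF instance inside ner_crf (the attribute path may vary between versions):

    def print_top_weights(crf, n=20):
        """crf: a fitted sklearn_crfsuite.CRF instance.
        state_features_ maps (feature_name, label) -> learned weight."""
        by_magnitude = sorted(crf.state_features_.items(),
                              key=lambda kv: abs(kv[1]), reverse=True)
        for (feature, label), weight in by_magnitude[:n]:
            print(f"{weight:+.3f}  {label:<12}  {feature}")
        # transition_features_ maps (label_from, label_to) -> weight
        for (src, dst), weight in sorted(crf.transition_features_.items()):
            print(f"{weight:+.3f}  {src} -> {dst}")

Features with unusually large weights that are tied to a single surface form (e.g. one specific prefix) are a decent hint at where the model has overfit to individual training sentences.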