NER_CRF generalizes very badly

Training NER_CRF is really frustrating, even for a single entity.

I build the training data one sentence at a time: whenever NER fails on a sentence, I add it and fill it with different entity values. I get the following results:

Adding just one example interferes with other examples, so that they are no longer recognized. Also, even testing on examples that appear verbatim in the training data sometimes fails.

I feel I have to add every exact sentence structure to the data. NER_CRF does not learn to combine context words, for example when a test sentence contains two context words that appear in two separate training examples.

Even if I have all the subparts of a sentence in the training data, testing on an example that contains two of those parts fails…

These are the features I use:

    [["prefix5", "prefix2", "suffix3",
      "suffix2", "title", "upper"],
     ["bias", "upper", "title", "digit", "pattern"],
     ["prefix5", "prefix2", "suffix3",
      "suffix2", "title", "upper"]],

Any help?
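For readers following along: a three-list config like the one above is read as features of the previous token, the current token, and the next token. The sketch below is a rough, hypothetical re-implementation of that windowing, not Rasa's actual code. The feature functions and key names are assumptions for illustration, and the "pattern" feature is skipped because it depends on separately configured regexes.

```python
# Three feature lists: previous token, current token, next token.
FEATURES = [
    ["prefix5", "prefix2", "suffix3", "suffix2", "title", "upper"],
    ["bias", "upper", "title", "digit", "pattern"],
    ["prefix5", "prefix2", "suffix3", "suffix2", "title", "upper"],
]

# Hypothetical stand-ins for the feature functions (illustration only).
FEATURE_FUNCS = {
    "bias": lambda w: "bias",
    "prefix5": lambda w: w[:5],
    "prefix2": lambda w: w[:2],
    "suffix3": lambda w: w[-3:],
    "suffix2": lambda w: w[-2:],
    "title": lambda w: w.istitle(),
    "upper": lambda w: w.isupper(),
    "digit": lambda w: w.isdigit(),
}


def token_features(tokens, i):
    """Build the feature dict the CRF would see for tokens[i]."""
    feats = {}
    for label, offset, names in zip(("-1", "0", "+1"), (-1, 0, 1), FEATURES):
        j = i + offset
        if j < 0:
            feats["BOS"] = True  # no previous token at sentence start
            continue
        if j >= len(tokens):
            feats["EOS"] = True  # no next token at sentence end
            continue
        for name in names:
            if name == "pattern":
                continue  # regex features omitted in this sketch
            feats[f"{label}:{name}"] = FEATURE_FUNCS[name](tokens[j])
    return feats


tokens = "How much do i have to pay to RASA".split()
rasa_feats = token_features(tokens, len(tokens) - 1)
```

With this config the current token contributes only shape features (bias, upper, title, digit), so the word before and the word after carry almost all the lexical evidence. For a sentence-final entity such as RASA, the "+1" slot collapses to a single end-of-sentence flag, which leaves only the preceding word as context.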

Could you post your training data so we can see what’s going wrong there?

I think these are typical situations that most people run into. I would just like some advice. This should be possible somehow. Every time I read about this, I see the request for training data. But I think someone with experience in this should be able to give some good guidelines. I don’t want to make the data public.

What I think is that, for my config, some statistics about context occurrences would be important to know?

And I raised the issue that even examples from the training data are not recognized when tested on the exact same sentences. How can that be? Also, learning mixtures of different context words is really difficult, and adding a new example interferes negatively with previous examples…

Well, there aren’t really any general guidelines for this… It depends on what your training data looks like, what kind of entities there are, how much of it there is, etc.

If you don’t want to share your training data here, we’d be happy for you to send it to hi@rasa.com so we can take a look (confidentially, of course).

@akelad thanks, I’ll think about it. But really, the issues above are common, and of course they come from overfitting. It is just really hard to find out how much variation the CRF needs to generalize well! From what I read, many people have these troubles. My problem is that both context words almost have to coincide with the trained example (the config above uses only context properties). Also, entities at the end of sentences are not recognized well, even if the same example is in the training data…

I gave this some thought. I feel the only solution is to build a more elaborate evaluation system that does something like backward fitting (removing each example in turn) until the best training data is found…
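A minimal sketch of that backward-fitting idea, under stated assumptions: `score` is a hypothetical callback that trains the extractor on a subset of examples and returns a metric (for example entity F1) on a fixed held-out set. Nothing here is part of Rasa. The toy scorer at the bottom only stands in for real training so the loop can be run.

```python
from typing import Callable, List, Sequence, Tuple


def backward_eliminate(
    examples: Sequence[str],
    score: Callable[[List[str]], float],
) -> Tuple[List[str], float]:
    """Greedily drop one example at a time while the score improves."""
    kept = list(examples)
    best = score(kept)
    improved = True
    while improved and len(kept) > 1:
        improved = False
        for i in range(len(kept)):
            candidate = kept[:i] + kept[i + 1:]
            s = score(candidate)
            if s > best:
                # Removing example i helped: keep the smaller set and rescan.
                kept, best, improved = candidate, s, True
                break
    return kept, best


def toy_score(subset: List[str]) -> float:
    """Stand-in for 'train and evaluate': rewards 'good', penalizes 'bad'."""
    good = sum(1 for e in subset if e.startswith("good"))
    bad = sum(1 for e in subset if e.startswith("bad"))
    return good - bad


kept, best = backward_eliminate(["good a", "bad b", "good c", "bad d"], toy_score)
```

Each pass retrains once per remaining example, so the cost grows roughly quadratically with the size of the training set. With a real CRF this is only practical for small data sets or with a cheaper proxy score.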

If this is something you want to work on implementing, then we’d love it if you shared it with us

Is it possible to access the weights of the learned features for the CRF? I think that would be great for seeing where the overfitting is and which examples to include.

Feel free to look at the code; I think you should be able to.
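As a pointer for anyone trying this: ner_crf is built on sklearn-crfsuite, whose fitted `CRF` object exposes the learned weights as `state_features_` (a dict of `{(feature, label): weight}`) and `transition_features_`. How to reach that object from a trained Rasa model depends on the version; it appears to be stored on the extractor as `ent_tagger`, but treat that attribute name as an assumption and check the source. The helper below is a sketch that works on any dict of that shape. The toy weights and feature names are invented for illustration.

```python
from typing import Dict, List, Tuple


def top_state_features(
    state_features: Dict[Tuple[str, str], float],
    label: str,
    n: int = 5,
) -> List[Tuple[str, float]]:
    """Return the n features with the largest absolute weight for one label."""
    rows = [(attr, w) for (attr, lbl), w in state_features.items() if lbl == label]
    rows.sort(key=lambda r: abs(r[1]), reverse=True)
    return rows[:n]


# Invented weights in the {(feature, label): weight} shape.
toy_weights = {
    ("-1:prefix5:to", "B-company"): 2.1,
    ("0:upper", "B-company"): 0.4,
    ("-1:prefix5:pay", "B-company"): -1.3,
    ("0:bias", "O"): 0.9,
}

strongest = top_state_features(toy_weights, "B-company", n=2)
```

With a real model one would pass the fitted CRF's `state_features_` instead of `toy_weights`. A handful of very large weights on specific context words is one sign that the model has memorized individual sentences.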

Hi, I am facing the same issue. I have two entities. One entity (say entity A) is in the middle of a sentence in my training data (nlu.md). Entity A is recognised with 100% accuracy during conversations (even new, unseen values). But entity B is at the end of sentences in the training data (for example: "How much do i have to pay to RASA", where RASA is the entity). Even entity values that are in the training data are not recognised during testing (conversation). Please help. You were facing a similar issue; how did you solve it? @akelad, can you please help me?

@akelad I also faced the same problem but didn’t get any answer from the Rasa community or from any other forum. Here is my question on Stack Overflow: "How to handle dynamic entities in rasa?" Rasa cannot handle dynamic entities. If you have around 5 to 10 entities, Rasa will learn the pattern. ner_crf works on one word before and one word after, so it will learn the context of your training data, but it won’t give you the results you want. I solved this issue by running a post-processing script on my training data that fills the entity values with random characters (the values are not fixed, just randomized) and balances the training data according to my requirements. But you can’t depend on crf_entity_extractor, so please add some enhancement for this.
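One possible reading of that post-processing step, as a hedged sketch rather than the poster's actual script: take examples in the Rasa markdown format `[value](entity)` and add copies in which each entity value is replaced by random characters of the same shape (same length, same upper/lower/digit pattern). The function names and the shape-preserving choice are assumptions made for illustration.

```python
import random
import re
import string

# Matches Rasa markdown annotations of the form [value](entity).
ENTITY_RE = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")


def randomize_entities(example: str, rng: random.Random) -> str:
    """Replace every annotated value with random characters of the same shape."""
    def repl(match):
        value, entity = match.group(1), match.group(2)
        chars = []
        for ch in value:
            if ch.isupper():
                chars.append(rng.choice(string.ascii_uppercase))
            elif ch.islower():
                chars.append(rng.choice(string.ascii_lowercase))
            elif ch.isdigit():
                chars.append(rng.choice(string.digits))
            else:
                chars.append(ch)  # keep spaces and punctuation as they are
        return f"[{''.join(chars)}]({entity})"
    return ENTITY_RE.sub(repl, example)


def augment(examples, copies, seed=0):
    """Return the original examples plus `copies` randomized variants of each."""
    rng = random.Random(seed)
    out = list(examples)
    for ex in examples:
        out.extend(randomize_entities(ex, rng) for _ in range(copies))
    return out


augmented = augment(["How much do i have to pay to [RASA](company)"], copies=3)
```

Keeping the shape of the value means features such as "upper", "title" and "digit" still fire on the randomized values, while the model can no longer memorize specific strings and has to rely on context.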