Using 5 words instead of 3 with ner_crf

(Einar Bui Magnusson) #1

Has anyone tried using 5 words (2 preceding / 2 following) instead of 3 (1 preceding / 1 following) with the CRF model in rasa_nlu? I tried it briefly with a fairly small training set, and the confidence dropped drastically. I guess it’s a matter of increasing the amount of training data, as there are more parameters to train. Does anyone have a feeling for how much training data is needed for a 5-word model to work better than a 3-word model?

(Neil Stoker) #2

I haven’t tried that, but it sounds interesting. More data seems sensible. On the confidence, is that not to be expected? The probability of any specific n-gram decreases as n grows. Isn’t the key thing whether the relative confidence is helpful within the CRF model output?

(Einar Bui Magnusson) #3

You’re right, the absolute value of the confidence may not mean much, as long as the correct entity has the highest confidence. I will be trying 5-word CRF models as I have different entities that can appear in identical 3-word contexts. Will try to remember to follow up here if I have any success.

(Ishan Khatri) #4

@einar.bui How did you have the CRF model use 5 words instead of 3? Was it a change to the configuration in the pipeline, as follows? If so, I’d love to give this a try on our data set and see what happens. I have 200+ examples for some of my entities and I’m curious whether I’ll get a better model with a 5-word window.

- name: "ner_crf"
  features: [[first],[second],[third],[fourth],[fifth]]

(Einar Bui Magnusson) #5

@ikhatri, yes, you just add extra arrays in the features array as you’ve shown. It has to be an odd number of arrays so that there is a middle one for the current word.
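
For anyone reading later, here is a minimal sketch of why the count has to be odd: each inner array is applied to one position in a window centred on the current word. The helper `window_offsets` is hypothetical (it is not part of rasa_nlu), the even-length check is my own addition, and the feature names in the lists are only illustrative picks, so check them against the ner_crf documentation for your version.

```python
def window_offsets(features):
    """Return the token offset each inner feature list applies to.

    Offset 0 is the current word, negative offsets are preceding words,
    positive offsets are following words.
    """
    if len(features) % 2 == 0:
        # With an even count there is no list left over for the middle word.
        raise ValueError("need an odd number of feature lists")
    half_span = len(features) // 2
    return list(range(-half_span, half_span + 1))


# Illustrative 5-word setup: 2 preceding words, the current word, 2 following.
five_word = [
    ["low", "title"],                            # word at offset -2
    ["low", "title", "upper"],                   # word at offset -1
    ["bias", "low", "suffix3", "title", "upper", "digit"],  # current word
    ["low", "title", "upper"],                   # word at offset +1
    ["low", "title"],                            # word at offset +2
]

for offset, feats in zip(window_offsets(five_word), five_word):
    print(offset, feats)
```

The same five inner lists, written in YAML under `features:` for `ner_crf`, would be the 5-word equivalent of the usual three-list setup.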