Ner_crf

Hey, I am facing a little issue with ner_crf: it detects "new" as an entity in both of these sentences: "I am looking for a new website" and "I am not looking at new website". It must recognize "new" in the second sentence.

Here is my pipeline:

- name: "nlp_spacy"
  model: "en"
- name: "tokenizer_spacy"
- name: "ner_crf"
  BILOU_flag: true
  features:
    # features for word before token
    - ["low", "title", "upper", "pos", "pos2"]
    # features of token itself
    - ["bias", "low", "upper", "title", "digit", "pos", "pos2", "pattern"]
    # features for word after the token we want to tag
    - ["low", "title", "upper", "pos", "pos2"]
  max_iterations: 50
  L1_c: 1
  L2_c: 1e-3
- name: "ner_synonyms"
- name: "intent_featurizer_count_vectors"
- name: "intent_classifier_tensorflow_embedding"

And when I add "word2" and "word3" to the CRF features in the pipeline, I face this error

We made some changes to the features in ner_crf, this is the list of available ones now:

The standard ones used are listed here: http://rasa.com/docs/nlu/components/#ner-crf

I am using the latest version of rasa_nlu and I have also updated the dictionary, but all in vain; I am still facing the same issue.

Updated the dictionary where? Can you post your config file in a legible format please?

language: "en"

pipeline:
- name: "nlp_spacy"
  model: "en"
- name: "tokenizer_spacy"
- name: "ner_spacy"
- name: "ner_duckling"
- name: "ner_duckling_http"
  dimensions:
  - "NUMBER"
- name: "ner_crf"
  BILOU_flag: true
  features:
    # features for word before token
    - ["low", "title", "upper", "pos", "pos2"]
    # features of token itself
    - ["bias", "low", "upper", "word3", "word2", "title", "digit", "pos", "pos2", "pattern"]
    # features for word after the token we want to tag
    - ["low", "title", "upper", "pos", "pos2"]
  max_iterations: 50
  L1_c: 1
  L2_c: 1e-3
- name: "ner_synonyms"
- name: "intent_featurizer_count_vectors"
- name: "intent_classifier_tensorflow_embedding"

Yes, so you need to remove word3, word2 and any other features that are no longer part of the available features in the new version. The standard ones are those I linked previously: http://rasa.com/docs/nlu/components/#ner-crf
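As a rough sketch of that cleanup, the snippet below filters a ner_crf-style features list against a set of allowed names. Note the assumptions: `ALLOWED` only contains the feature names that appear in the config posted in this thread (the linked docs have the full list), and `drop_unsupported` is a hypothetical helper, not part of rasa_nlu.

```python
# Feature names that appear in the config earlier in this thread.
# Assumption: this is NOT the full list; check the ner_crf docs for that.
ALLOWED = {"low", "title", "upper", "pos", "pos2", "bias", "digit", "pattern"}


def drop_unsupported(features, allowed=ALLOWED):
    """Return (cleaned, removed) for a ner_crf-style features list.

    `features` is a list of three lists: word before, token itself, word after.
    """
    cleaned, removed = [], []
    for group in features:
        # Keep supported names in their original order.
        cleaned.append([name for name in group if name in allowed])
        # Remember what was dropped so it can be reported.
        removed.extend(name for name in group if name not in allowed)
    return cleaned, removed


features = [
    ["low", "title", "upper", "pos", "pos2"],
    ["bias", "low", "upper", "word3", "word2", "title", "digit", "pos", "pos2", "pattern"],
    ["low", "title", "upper", "pos", "pos2"],
]

cleaned, removed = drop_unsupported(features)
print(removed)     # ['word3', 'word2']
print(cleaned[1])  # ['bias', 'low', 'upper', 'title', 'digit', 'pos', 'pos2', 'pattern']
```

The cleaned lists can then be pasted back under `features:` in the config.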

Can you tell me why it is not updated in git?

Looks like we forgot to update that. Feel free to create a PR to fix it.

Ok thanks

Can you tell me where I can find help? I would like to know what these features mean, like "pos", "title" etc., as mentioned in the spacy config: http://rasa.com/docs/nlu/components/#ner-crf


@akelad can you help me find out what the features mean?

https://eli5.readthedocs.io/en/latest/tutorials/sklearn_crfsuite.html#feature-extraction

You have information about the features here
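To give a rough idea, here is a minimal sketch of what features with these names typically compute in sklearn-crfsuite style extractors. This is an assumption based on the usual meaning of the names, not the actual ner_crf code: `token_features` is a hypothetical helper, the POS tag is passed in by hand (in the pipeline it would come from spaCy), and the regex-based "pattern" feature is left out.

```python
def token_features(word, pos_tag):
    """Illustrative per-token features; the keys mirror the config names."""
    return {
        "bias": "bias",           # constant feature, the same for every token
        "low": word.lower(),      # lowercased token text
        "title": word.istitle(),  # starts with a capital, e.g. "Berlin"
        "upper": word.isupper(),  # all caps, e.g. "NLU"
        "digit": word.isdigit(),  # only digits, e.g. "2018"
        "pos": pos_tag,           # full part-of-speech tag, e.g. "NNP"
        "pos2": pos_tag[:2],      # first two characters of the tag, e.g. "NN"
    }


print(token_features("New", "JJ"))
```

In the config, the first and third lists apply the same kind of features to the neighbouring words, so the CRF sees a small window around each token.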


More information on crf features:

https://hk.saowen.com/a/c8fe0764b2e5d63ca38cc9867746b739c94eb2bec0b2f6b86d24fa2b01023a11
