Rasa_NLU ner_crf classification issue

Hi, I’m currently building a chatbot with Rasa-NLU, using ner_crf as the entity extractor in the pipeline.

I have around half a million training sentences with only 12 different entities. Extraction works well, but recognition is not that accurate…

I’m trying to find why…

The pipeline is as follows:

language: "fr"

pipeline:
- name: "components.preprocess.PrepareString"
- name: "nlp_spacy"
- name: "tokenizer_spacy"
- name: "ner_crf"
  features: [["low"], ["bias", "suffix3"], ["upper", "pos", "pos2"]]
- name: "ner_synonyms"
- name: "intent_featurizer_count_vectors"
- name: "intent_classifier_tensorflow_embedding"
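
For reference, the Rasa NLU documentation describes the ner_crf features setting as a [before, word, after] list: the first inner list applies to the previous token, the second to the current token, and the third to the next token. A minimal Python sketch of that layout follows. `window_features` and the toy feature functions are hypothetical names for illustration, not Rasa's code, and the POS features are left out because they need spaCy.

```python
# Hypothetical sketch of a [before, word, after] feature window.
# This is NOT Rasa's implementation; names and key format are assumptions.
FEATURES = [["low"], ["bias", "suffix3"], ["upper"]]

# Toy per-token feature functions (POS features omitted: they need spaCy).
FEATURE_FUNCS = {
    "low": lambda tok: tok.lower(),
    "bias": lambda tok: "bias",
    "suffix3": lambda tok: tok[-3:],
    "upper": lambda tok: tok.isupper(),
}


def window_features(tokens, i, config=FEATURES):
    """Collect features for tokens[i] from its left neighbour, itself, and its right neighbour."""
    out = {}
    for offset, names in zip((-1, 0, 1), config):
        j = i + offset
        # Skip neighbours that fall outside the sentence.
        if 0 <= j < len(tokens):
            for name in names:
                out[f"{offset}:{name}"] = FEATURE_FUNCS[name](tokens[j])
    return out


tokens = ["Je", "vais", "à", "PARIS"]
print(window_features(tokens, 2))
```

Read this way, the configuration above gives the current token only bias and suffix3, so its own lowercase form, casing, and POS tag are not among its features; those come only from the neighbouring tokens.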

I believe it may be due to my not understanding the ner_crf features… Could someone explain what the different features are for?

For example:

  • low
  • title
  • suffix5
  • suffix3
  • suffix2
  • suffix1
  • pos
  • pos2
  • prefix5
  • prefix2
  • bias
  • upper
  • digit
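
Most of these names can be read as simple per-token functions. The sketch below is an assumption based on the names themselves, not a copy of Rasa's source. `token_features` is a hypothetical helper, and the POS tag is passed in by hand where the real pipeline would take it from spaCy.

```python
# Hypothetical reading of the ner_crf feature names as per-token functions.
# Assumption: this mirrors what the names suggest, not Rasa's actual code.
def token_features(word, pos_tag):
    return {
        "low": word.lower(),        # lowercased token text
        "title": word.istitle(),    # True if the token looks like "Paris"
        "prefix5": word[:5],        # first 5 characters
        "prefix2": word[:2],        # first 2 characters
        "suffix5": word[-5:],       # last 5 characters
        "suffix3": word[-3:],       # last 3 characters
        "suffix2": word[-2:],       # last 2 characters
        "suffix1": word[-1:],       # last character
        "pos": pos_tag,             # part-of-speech tag (from spaCy in the real pipeline)
        "pos2": pos_tag[:2],        # first 2 characters of the POS tag
        "bias": "bias",             # constant feature, the same for every token
        "upper": word.isupper(),    # True if the token is all uppercase
        "digit": word.isdigit(),    # True if the token consists only of digits
    }


feats = token_features("Paris", "PROPN")
for name, value in feats.items():
    print(name, "=", value)
```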

what do you mean by?