How much training data do you have, i.e. how many examples per entity? It often helps just to add a few more examples to the training data.
Also, if you want to extract numbers that follow a certain pattern, I recommend using either Duckling or the RegexEntityExtractor. Both are meant for entities that follow a fixed pattern.
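If you go the Duckling route, the pipeline entry could look roughly like the sketch below. It assumes a Duckling server is already running and reachable at the URL shown (the address is a placeholder, not something from your setup) and that the `phone-number` dimension is the one you want:

```yaml
# Sketch only: the url is an assumed local Duckling server, adjust as needed
- name: DucklingEntityExtractor
  url: http://localhost:8000
  dimensions: ["phone-number"]
```

The entities Duckling returns are named after the dimension, so the extracted entity would be `phone-number` rather than a custom name.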
Also, it might be a good idea to switch to DIETClassifier for entity extraction instead of CRFEntityExtractor, as it is usually a bit more powerful.
So maybe you can try the following config and use the DIETClassifier to extract civilite and the RegexEntityExtractor to extract the phone number, for example:
- name: WhitespaceTokenizer
- name: LexicalSyntacticFeaturizer
- name: CountVectorsFeaturizer
- name: CountVectorsFeaturizer
  analyzer: char_wb
  min_ngram: 1
  max_ngram: 4
- name: DIETClassifier
  epochs: 100
- name: RegexEntityExtractor
- name: EntitySynonymMapper
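Note that the RegexEntityExtractor only finds what you define as a regex (or lookup table) in your training data, and the name of the regex has to match the name of the entity. A minimal sketch of such an entry, assuming the entity is called `phone_number` and the numbers are plain 10-digit strings (both are my assumptions, so adapt the name and the pattern to your data):

```yaml
nlu:
# The regex name must match the entity name (assumed here: phone_number)
- regex: phone_number
  examples: |
    - \d{10}
```

The entity also needs to be listed under `entities` in your domain file.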