naveenjr
(Naveenjr)
February 21, 2022, 9:51am
1
Hi,
I was wondering whether there is a way to rely on a regex alone to recognize some entities in my training data.
For example, I know that entity A is always a 5- or 6-digit number,
but I don't know a regex for entity B, or its pattern can't easily be expressed as a regex.
Is there any priority among the NER components, e.g. DIETClassifier > synonyms > regex?
ChrisRahme
(Chris Rahmé)
February 21, 2022, 7:21pm
2
Entities A and B are completely independent. You can use Regex for A but not for B.
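As a rough sketch of what that could look like in the NLU training data, assuming entity A is named `entity_a` (a placeholder, not taken from your data) and really is a 5- or 6-digit number: RegexEntityExtractor ties a regex to an entity by name, so the regex name has to match the entity name. Entity B gets no regex and is left to DIETClassifier.

```yaml
nlu:
# "entity_a" is a placeholder; the regex name must equal the entity name
# so that RegexEntityExtractor can map matches to that entity.
- regex: entity_a
  examples: |
    - \b\d{5,6}\b
```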
naveenjr
(Naveenjr)
February 22, 2022, 5:00am
3
This is my configuration file:
- name: WhitespaceTokenizer
- name: RegexFeaturizer
- name: RegexEntityExtractor
- name: LexicalSyntacticFeaturizer
- name: CountVectorsFeaturizer
- name: CountVectorsFeaturizer
  analyzer: char_wb
  min_ngram: 2
  max_ngram: 4
- name: DIETClassifier
  epochs: 70
- name: EntitySynonymMapper
- name: ResponseSelector
  epochs: 100
How can I use regex for one entity but not for the other here? During classification, both DIETClassifier and RegexEntityExtractor are triggered at the same time.
ChrisRahme
(Chris Rahmé)
February 22, 2022, 4:33pm
4
You don’t need to worry about that… Just use Regex for entity A and treat entity B as normal.
naveenjr
(Naveenjr)
February 23, 2022, 3:58am
5
@ChrisRahme How many examples do I need for entity A in nlu.yml if it is extracted purely by regex? For some entities I get a result from the regex extractor as well as from DIET, which ends up filling both of my slot values at the same time.
ChrisRahme
(Chris Rahmé)
February 23, 2022, 6:40pm
6
You just need to give at least 2 examples of the entity, as per the docs:
[…] you do need at least two annotated examples of the entity so that the NLU model can register it as an entity at training time.
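For illustration only, two annotated examples could look roughly like this. The intent name `inform_entity_a`, the entity name `entity_a`, and the sentences are all made up, so swap in your own:

```yaml
nlu:
- intent: inform_entity_a   # hypothetical intent name
  examples: |
    - my code is [12345](entity_a)
    - the number is [678901](entity_a)
```

The annotations are there so the model registers the entity at training time, as the quote says.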