Hi Rasa Community,
I use a lookup table full of uppercase words for RegexEntityExtractor. The NLU Training Data documentation says that lookup tables are case insensitive. However, after cross-validation it turns out that my RegexEntityExtractor is not able to identify those entities when they are written in lowercase. It seems that the lookup table is case sensitive? I've defined a couple of training examples for the entities in the lookup table and set case_sensitive to False for both RegexFeaturizer and RegexEntityExtractor.
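To make clear what behaviour I expected, here is a minimal sketch in plain Python. It is not Rasa's actual implementation: the lookup entries, the example sentence, and the build_pattern helper are all made up for illustration, and I am only assuming that a lookup table ends up as one regex alternation.

```python
import re

# Hypothetical lookup table entries (uppercase, like the ones in my data).
lookup_entries = ["BERLIN", "HAMBURG", "MÜNCHEN"]


def build_pattern(entries, case_sensitive):
    """Join the entries into one alternation, optionally ignoring case."""
    flags = 0 if case_sensitive else re.IGNORECASE
    alternation = "|".join(re.escape(entry) for entry in entries)
    return re.compile(r"\b(" + alternation + r")\b", flags)


# Hypothetical user message with the entity written in lowercase.
text = "ich fahre nach berlin"

sensitive = build_pattern(lookup_entries, case_sensitive=True)
insensitive = build_pattern(lookup_entries, case_sensitive=False)

sensitive_matches = [m.group(0) for m in sensitive.finditer(text)]
insensitive_matches = [m.group(0) for m in insensitive.finditer(text)]

print("case sensitive:  ", sensitive_matches)
print("case insensitive:", insensitive_matches)
```

With case_sensitive set to False I expected the second result, where the lowercase word is found, but what I observe in Rasa looks like the first one.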
This problem has confused me for a while. Can someone help me solve it?
This is what my pipeline looks like:
language: de
pipeline:
  - name: WhitespaceTokenizer
  - name: RegexFeaturizer
    case_sensitive: False
  - name: RegexEntityExtractor
    case_sensitive: False
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: char_wb
    min_ngram: 1
    max_ngram: 4
  - name: "DucklingEntityExtractor"
    url: "http://localhost:8000"
    dimensions: ["time", "number", "amount-of-money"]
    locale: "de_DE"
    timezone: "Europe/Berlin"
  - name: "CRFEntityExtractor"
  - name: DIETClassifier
    epochs: 100
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100
  - name: FallbackClassifier
    threshold: 0.75
    ambiguity_threshold: 0.1
Thanks and regards