Not able to extract an entity when it contains special characters

I am developing a bot that needs to handle entity values containing special characters, but it is not extracting those values. Below are my config, NLU data, and regex patterns. Could you please check them and help me resolve this?

config.yml

recipe: default.v1
language: en

pipeline:
  - name: WhitespaceTokenizer
    intent_tokenization_flag: true
    case_sensitive: false
    intent_split_symbol: +
  - name: RegexFeaturizer
  - name: RegexEntityExtractor
    pattern: regex_patterns.yml  # Define regex patterns for special characters
  - name: CountVectorsFeaturizer
  - name: DIETClassifier
  - name: CRFEntityExtractor
  - name: EntitySynonymMapper
  - name: DucklingEntityExtractor
  - name: CountVectorsFeaturizer
    analyzer: char_wb
    min_ngram: 1
    max_ngram: 4
  - name: FallbackClassifier
    threshold: 0.98

policies:
  - name: MemoizationPolicy
    max_history: 50
    epochs: 100
  - name: TEDPolicy
    max_history: 50
    epochs: 100
  - name: RulePolicy

assistant_id: 20230419-003217-calm-contract

nlu.yml

regex_patterns.yml

version: "2.0"

extractions:
  - name: "casetype"
    type: "regex"
    pattern: "[.*?]"  # This pattern captures anything within square brackets
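A note on the pattern above: in Python's `re` syntax, unescaped square brackets delimit a character class, so `[.*?]` matches a single `.`, `*` or `?` character rather than text enclosed in brackets. The short check below is only a sketch, assuming the pattern is evaluated with Python `re` semantics; the sample message is made up for illustration.

```python
import re

text = "please open case [civil-42] for me"  # hypothetical user message

# As written, [.*?] is a character class: it matches ONE character
# that is a literal '.', '*' or '?'.
as_written = re.compile(r"[.*?]")

# Escaping the brackets makes them literal; the lazy group then captures
# whatever sits between them.
escaped = re.compile(r"\[(.*?)\]")

print(as_written.findall(text))     # []  (no '.', '*' or '?' in the text)
print(as_written.findall("what?"))  # ['?']
print(escaped.findall(text))        # ['civil-42']
```

If the brackets themselves should be part of the extracted value, dropping the capture group (`\[.*?\]`) would return `[civil-42]` instead.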

domain.yml

slots:
  casetype:
    type: text
    mappings:
      - type: from_entity
        entity: casetype

entities:
  - casetype

Output of my console

Let me know if you need anything more.

You need to provide examples first. Check out any of the entity extraction videos on our YouTube channel.

Hi Sonam, I’ve already used entity extraction in several of my projects. This time I am stuck on an entity that contains special characters: the entity value always comes back as None. Please take the following images as examples and guide me on how to resolve this.

domain.yml

image

Config.yml

nlu.yml

Hi,

I think you need to switch from the WhitespaceTokenizer to a different tokenizer, or write a custom one. You can look into a similar discussion here.
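To make the tokenizer suggestion concrete, here is a minimal sketch in plain Python of one possible tokenization rule that keeps a bracketed span together as a single token and records character offsets. It is not a Rasa component: the function name and the regex are hypothetical, the sample message is made up, and wiring such a rule into the pipeline as a custom tokenizer is not shown here.

```python
import re
from typing import List, Tuple

# Hypothetical rule, not part of Rasa: a bracketed span such as
# "[civil 42]" becomes one token; everything else splits on whitespace.
TOKEN_PATTERN = re.compile(r"\[[^\]]*\]|\S+")


def tokenize_keep_brackets(text: str) -> List[Tuple[str, int, int]]:
    """Return (token, start, end) triples with character offsets."""
    return [(m.group(), m.start(), m.end()) for m in TOKEN_PATTERN.finditer(text)]


tokens = tokenize_keep_brackets("open case [civil 42] now")
for token in tokens:
    print(token)
# ('open', 0, 4)
# ('case', 5, 9)
# ('[civil 42]', 10, 20)
# ('now', 21, 24)
```

The offsets matter because token-based extractors generally need the annotated entity span to line up with token boundaries. Whether that is the actual cause here would need to be confirmed against the training examples.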