You can make the WhitespaceTokenizer case-insensitive by adding the option
case_sensitive: False to it in your pipeline, e.g.:
language: en
pipeline:
- name: "WhitespaceTokenizer"
  case_sensitive: False
- name: "RegexFeaturizer"
- name: "CRFEntityExtractor"
- name: "EntitySynonymMapper"
- name: "CountVectorsFeaturizer"
- name: "CountVectorsFeaturizer"
  analyzer: "char_wb"
  min_ngram: 1
  max_ngram: 4
- name: "EmbeddingIntentClassifier"
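
To get a feel for what the option changes, here is a minimal sketch in plain Python. It assumes that a case-insensitive tokenizer simply lowercases the text before splitting on whitespace; `whitespace_tokenize` is a hypothetical helper written for illustration, not Rasa's actual code.

```python
from typing import List


def whitespace_tokenize(text: str, case_sensitive: bool = True) -> List[str]:
    """Hypothetical stand-in for a whitespace tokenizer (not Rasa's implementation)."""
    # Assumption: with case_sensitive=False the text is lowercased first,
    # so "Berlin" and "berlin" end up as the same token.
    if not case_sensitive:
        text = text.lower()
    # str.split() with no arguments splits on any run of whitespace.
    return text.split()


print(whitespace_tokenize("Book a Flight to Berlin"))
# ['Book', 'a', 'Flight', 'to', 'Berlin']
print(whitespace_tokenize("Book a Flight to Berlin", case_sensitive=False))
# ['book', 'a', 'flight', 'to', 'berlin']
```

With the option off, later components in the pipeline see the same tokens regardless of how the user capitalized the message.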