Hi all, I have a question about my training data. I need to build an NLU model with multiple intents, and since I have a large amount of training data I prefer this config:

```yaml
language: "en"
pipeline:
- name: "WhitespaceTokenizer"
- name: "RegexFeaturizer"
- name: "deepPavlov.DeepPavlov"
- name: "CRFEntityExtractor"
  features: [
    ["low", "title", "upper"],
    ["bias", "low", "prefix5", "prefix2", "suffix5", "suffix3", "suffix2",
     "upper", "title", "digit", "pattern"],
    ["low", "title", "upper"]
  ]
- name: "EntitySynonymMapper"
- name: "CountVectorsFeaturizer"
- name: "EmbeddingIntentClassifier"
- name: "DucklingHTTPExtractor"
  url: "http://rasa-support"
  timezone: "UTC"
  dimensions:
  - time
  - number
  - amount-of-money
  - distance
  - ordinal

policies:
- name: MemoizationPolicy
- name: KerasPolicy
- name: MappingPolicy
```
My question is this: I have multiple intents, and a few of them look very similar, differing only in a few keywords. Should I annotate those keywords as entities (filling slots) so the NLU can use them in recognition, or should I leave them as plain text? Which approach will give better accuracy and intent matching? Thanks in advance.
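To make the two options concrete, here is a minimal sketch in Rasa 1.x markdown training-data format. The intent names (`check_savings_balance`, `check_checking_balance`, `check_balance`) and the `account_type` entity are made-up examples for illustration, not from my actual data:

```md
<!-- Option A: keywords left as plain text, one intent per keyword.
     The intent classifier must separate near-identical sentences. -->
## intent:check_savings_balance
- what is the balance of my savings account
- show my savings balance

## intent:check_checking_balance
- what is the balance of my checking account
- show my checking balance

<!-- Option B: one merged intent, with the keyword annotated as an
     entity so the entity extractor handles the distinction. -->
## intent:check_balance
- what is the balance of my [savings](account_type) account
- what is the balance of my [checking](account_type) account
- show my [savings](account_type) balance
```

In Option B the differing keyword would be picked up by `CRFEntityExtractor` and could be stored in a slot, whereas in Option A the intent classifier has to separate near-identical sentences on its own. I am unsure which setup works better in practice.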