Problem: If I include the CountVectorsFeaturizer with both the word and char_wb analyzers, the training loss shoots up from 0.7 to 5.5, while accuracy stays the same. I am not sure why this is happening. I tried changing the dense_dimension parameter, but it did not help.
If I remove the sparse featurizers completely and use only dense features, the loss stays below 1.
I have 35 intents, e.g. affirm, deny.
Is this happening because of my data?
I might have introduced slight ambiguity while creating the intents.
E.g. in the deny intent I have a few examples like "yes I am not interested", "yes no na dont this".
Or can this be solved with hyperparameter settings?
This is my config.
language: en
pipeline:
- name: WhitespaceTokenizer
- name: LexicalSyntacticFeaturizer
- name: CountVectorsFeaturizer
- name: CountVectorsFeaturizer
  analyzer: char_wb
  min_ngram: 1
  max_ngram: 4
#- name: gloveFer.GLoVeFeaturizer
#  path: glove_100d.kv
##### this is a custom component to use the paraphrase model weights.
- name: customlanguageFR.CustomLanguageModelFeaturizer
  model_name: sentence-transformers/paraphrase-MiniLM-L6-v2
  base_model: bert
- name: RegexEntityExtractor
- name: DIETClassifier
  epochs: 100
  random_seed: 307
  # embedding_dimension: 120
  constrain_similarities: True
  # connection_density: 0.7
  # scale_loss: true
  # dense_dimension:
  #   text: 256
  # hidden_layers_sizes:
  #   text: [256]
- name: EntitySynonymMapper
- name: ResponseSelector
  epochs: 100
  retrieval_intent: faq
  scale_loss: False
- name: ResponseSelector
  epochs: 100
  retrieval_intent: chitchat
  scale_loss: False
- name: ResponseSelector
  epochs: 100
  retrieval_intent: inform
  scale_loss: False
- name: FallbackClassifier
  threshold: 0.67
  ambiguity_threshold: 0.1
policies:
- name: RulePolicy
- name: MemoizationPolicy
  max_history: 3
- name: TEDPolicy
  max_history: 4
  epochs: 300
  embedding_dimension: 120
  number_of_transformer_layers: 4
  connection_density: 0.6
  drop_rate: 0.25
  constrain_similarities: True
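For clarity, the dense_dimension change I tried (now commented out in the DIETClassifier block above) looked roughly like this; the value 256 was just a guess on my part:

- name: DIETClassifier
  epochs: 100
  random_seed: 307
  constrain_similarities: True
  dense_dimension:
    text: 256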
Has anyone else experienced this?
Any mistakes you spot or tips on my config would be much appreciated.