Problem: If I include the CountVectorsFeaturizer with both the `word` and `char_wb` analyzers, the loss shoots up from 0.7 to 5.5, while the accuracy stays the same. I am not sure why this is happening. I tried changing the `dense_dimension` parameter, but it did not help.
If I remove the sparse featurizers completely and use only dense features, the loss stays below 1.
I have 35 intents, e.g. `affirm`, `deny`.
Is this happening because of the data? I might have introduced slight ambiguity while creating the intents. For example, the `deny` intent contains a few examples like "yes I am not interested" and "yes no na dont this". Or can this be solved with hyperparameter settings?
This is my config:
```yaml
language: en
pipeline:
  - name: WhitespaceTokenizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: char_wb
    min_ngram: 1
    max_ngram: 4
  #- name: gloveFer.GLoVeFeaturizer
  #  path: glove_100d.kv
  # custom component to use the paraphrase model weights
  - name: customlanguageFR.CustomLanguageModelFeaturizer
    model_name: sentence-transformers/paraphrase-MiniLM-L6-v2
    base_model: bert
  - name: RegexEntityExtractor
  - name: DIETClassifier
    epochs: 100
    random_seed: 307
    # embedding_dimension: 120
    constrain_similarities: True
    # connection_density: 0.7
    # scale_loss: true
    # dense_dimension:
    #   text: 256
    # hidden_layers_sizes:
    #   text:
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100
    retrieval_intent: faq
    scale_loss: False
  - name: ResponseSelector
    epochs: 100
    retrieval_intent: chitchat
    scale_loss: False
  - name: ResponseSelector
    epochs: 100
    retrieval_intent: inform
    scale_loss: False
  - name: FallbackClassifier
    threshold: 0.67
    ambiguity_threshold: 0.1

policies:
  - name: RulePolicy
  - name: MemoizationPolicy
    max_history: 3
  - name: TEDPolicy
    max_history: 4
    epochs: 300
    embedding_dimension: 120
    number_of_transformer_layers: 4
    connection_density: 0.6
    drop_rate: 0.25
    constrain_similarities: True
```
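For comparison, the dense-only setup where the loss stays below 1 is essentially the same pipeline with the sparse featurizers (LexicalSyntacticFeaturizer and both CountVectorsFeaturizers) removed. This is a sketch of that variant, not my exact file:

```yaml
language: en
pipeline:
  - name: WhitespaceTokenizer
  # only the dense featurizer remains
  - name: customlanguageFR.CustomLanguageModelFeaturizer
    model_name: sentence-transformers/paraphrase-MiniLM-L6-v2
    base_model: bert
  - name: RegexEntityExtractor
  - name: DIETClassifier
    epochs: 100
    random_seed: 307
    constrain_similarities: True
```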
Has anyone else experienced this? Any mistakes you spot, or tips on my config, would be much appreciated.