What changed in the DIETClassifier implementation or defaults? Significant drop in performance for rare intents

Hello there!

During my migration to Rasa 2.x, I realized that rare intents are now classified very poorly. My data is quite imbalanced: some intents have a dozen samples while others (like the ones that use the ResponseSelector) have a few hundred. This didn’t cause major problems for most intents in Rasa 1.9.

Even testing on the training data yields terrible results for intents with low support. For example, I have an intent happy containing a dozen examples such as "you're great!" and "I love you", with a support of only 12 samples. In Rasa 1.9, we had a training recall of ~83%, while in Rasa 2.1 I get a recall of 40%.

I am using the same pipeline as before, using Spacy tokenizers/featurizers and the DIETClassifier. From my understanding of the documentation, the DIETClassifier uses a balanced batching which should handle an imbalanced dataset. I have copy-pasted my NLU pipeline at the end of this message.
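(For context, the balanced batching mentioned above is controlled by the classifier's batch strategy; the sketch below shows how it could be set explicitly, assuming the Rasa 2.x option name batch_strategy and its "balanced" value apply here:)

```yaml
- name: "DIETClassifier"
  epochs: 50
  # "balanced" batching oversamples under-represented intents within each
  # batch; it is the documented default, shown here only for illustration.
  batch_strategy: "balanced"
```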

The only difference I can see is in the settings for the RegexFeaturizer and the SpacyFeaturizer, which no longer have the return_sequence: true option. A quick glance at the code showed that at least the SpacyFeaturizer returns both sequence and sentence features. The spaCy versions are also identical (2.1.9).

Any idea of where I could look? What has changed between the two versions?

Thanks a lot for the pointers! Cheers, Nicolas

config.yml (Rasa 2.x)

language: "en"
pipeline:
  - name: "DucklingEntityExtractor"
    url: "http://duckling.alpaca.casa"
    dimensions: ["time", "duration", "amount-of-money", "number", "email", "phone-number", "ordinal", "url"]
    timezone: "America/New_York"
  - name: "SpacyNLP"
    case_sensitive: true
  - name: "SpacyTokenizer"
  - name: "SpacyEntityExtractor"
    dimensions: ["PERSON", "MONEY"]
  - name: "RegexFeaturizer"
  - name: "SpacyFeaturizer"
  - name: LexicalSyntacticFeaturizer
  - name: "DIETClassifier"
    epochs: 50
    entity_recognition: true
    use_masked_language_model: false
  - ... # response selectors

config.yml (Rasa 1.9)

language: "en"
pipeline:
  - name: "DucklingHTTPExtractor"
    url: "http://duckling.alpaca.casa"
    dimensions: ["time", "duration", "amount-of-money", "number", "email", "phone-number", "ordinal", "url"]
    timezone: "America/New_York"
  - name: "SpacyNLP"
    case_sensitive: true
  - name: "SpacyTokenizer"
  - name: "SpacyEntityExtractor"
    dimensions: ["PERSON", "MONEY"]
  - name: "RegexFeaturizer"
    return_sequence: True  # <-- option not available anymore
  - name: "SpacyFeaturizer"
    return_sequence: True  # <-- option not available anymore
  - name: LexicalSyntacticFeaturizer
  - name: "DIETClassifier"
    epochs: 50
    entity_recognition: true
    use_masked_language_model: false

Maybe @Tanja would know? It might actually be a change between 1.9 and 1.10; I’ll test that.

I just tested with Rasa 1.10 and I have the same issue as with 2.1.

Any recommendations of a configuration to get results as I had back in 1.9?

Thanks! Nicolas

Oh, never mind. I just increased the number of epochs and that was sufficient. Still curious to know what changed conceptually! :slight_smile:
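(For anyone landing here later: the fix was simply raising epochs in the DIETClassifier config. The thread doesn't state the exact number, so the value below is purely illustrative:)

```yaml
- name: "DIETClassifier"
  epochs: 200   # raised from 50; exact value not given in the thread
  entity_recognition: true
  use_masked_language_model: false
```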

@nbeuchat Most likely this change here led to the performance change you are seeing. We did not see any performance drop when we tested this change back then; however, the datasets we used for testing were not very imbalanced, and we always used the same number of epochs. Can you maybe try setting scale_loss = True for the DIETClassifier and check if you get the same performance as before?
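(In config.yml, the suggested setting would look like this; a minimal sketch based on the pipeline posted above:)

```yaml
- name: "DIETClassifier"
  epochs: 50
  entity_recognition: true
  use_masked_language_model: false
  scale_loss: true   # scales the loss for hard examples, as suggested above
```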


Hi Tanja! Thanks a lot for your response, and my apologies for forgetting to give feedback after testing your suggestion. Setting scale_loss = True seems to give the same performance for the same number of epochs as I originally had (I only ran one quick test).