Edunston
(Eric)
November 22, 2023, 9:53am
1
Hi all,
I have much love for the Rasa community, but DIET hasn't been state of the art (SOTA) for a long time now. Is the community working on something new?
According to the Massive Text Embedding Benchmark (MTEB), SGPT-5.8B-nli and ST5-Large perform far better at classification tasks than LaBSE (which is the base model available for DIET).
https://paperswithcode.com/paper/mteb-massive-text-embedding-benchmark/review/?hl=72261
stephens
(Greg Stephens)
November 22, 2023, 6:09pm
2
Yes, you should try some alternatives for intent classification and entity recognition. One simple change to improve performance is to disable entity extraction in DIET and use the CRF entity extractor instead:
- name: CRFEntityExtractor
- name: DIETClassifier
  epochs: 100
  entity_recognition: false
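For context, these two entries belong under the `pipeline:` key of `config.yml`, after a tokenizer and whatever featurizers DIET should use. A minimal sketch of a complete file follows. The `recipe` and `language` values, the tokenizer, and the featurizers are assumptions taken from a typical Rasa 3.x setup, not part of the suggestion above:

```yaml
# Sketch only: everything except the last two components is an assumption.
recipe: default.v1
language: en

pipeline:
  # Assumed tokenizer and sparse featurizers (typical Rasa 3.x defaults)
  - name: WhitespaceTokenizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: char_wb
    min_ngram: 1
    max_ngram: 4
  # Entities are handled by the CRF extractor...
  - name: CRFEntityExtractor
  # ...so DIET is left with intent classification only
  - name: DIETClassifier
    epochs: 100
    entity_recognition: false
```

With `entity_recognition: false`, DIET trains for intent classification only and entity extraction comes from `CRFEntityExtractor`.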
Another alternative is to replace DIET with your choice of featurizer and use the logistic regression classifier:
- name: LanguageModelFeaturizer
  model_name: "bert"
  model_weights: "sentence-transformers/all-MiniLM-L6-v2"
- name: CRFEntityExtractor
- name: LogisticRegressionClassifier
  max_iter: 100
  solver: lbfgs
  tol: 0.0001
  random_state: 42
  ranking_length: 10
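As with any pipeline fragment, the featurizer needs a tokenizer in front of it. A minimal sketch of the full `config.yml` for this variant follows. The `recipe`, `language`, and `WhitespaceTokenizer` entries are assumptions added for completeness, and the component options are copied unchanged from the snippet above:

```yaml
# Sketch only: recipe, language and tokenizer are assumptions.
recipe: default.v1
language: en

pipeline:
  # Assumed tokenizer; the featurizer and CRF extractor both work on tokens
  - name: WhitespaceTokenizer
  # Dense features from a pretrained transformer
  - name: LanguageModelFeaturizer
    model_name: "bert"
    model_weights: "sentence-transformers/all-MiniLM-L6-v2"
  # Entity extraction
  - name: CRFEntityExtractor
  # Intent classification in place of DIET
  - name: LogisticRegressionClassifier
    max_iter: 100
    solver: lbfgs
    tol: 0.0001
    random_state: 42
    ranking_length: 10
```

To see whether either variant actually beats the current setup on your own data, `rasa test nlu` accepts more than one file after `--config` and compares them. The file names are up to you.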
Edunston
(Eric)
November 22, 2023, 6:25pm
3
Thanks Stephens,
will try both and give you some feedback.