DIETClassifier with sparse input features only

Assuming we’re not using subwords, the mental picture looks something like this:

Note that the sparse representation of the entire utterance can be interpreted as the sum of the sparse representations of the separate tokens.
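To make that concrete, here’s a minimal numpy sketch (not Rasa’s actual featurizer; the three-word vocabulary is made up) of token one-hots summing into a sentence vector:

```python
import numpy as np

vocab = ["hello", "there", "world"]  # hypothetical 3-word vocabulary

def one_hot(token):
    # Sparse (one-hot) representation of a single token.
    vec = np.zeros(len(vocab))
    vec[vocab.index(token)] = 1.0
    return vec

tokens = ["hello", "world"]
sentence_vec = sum(one_hot(t) for t in tokens)
print(sentence_vec)  # [1. 0. 1.] -- the sum of the token one-hots
```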

Let’s zoom in on a sparse encoding followed by a single embedding layer.

[image: sparse encoding followed by a single embedding layer]

When an input is ‘1’, the weights connecting it to the feedforward layer contribute to the output. Otherwise, each of its weights is multiplied by zero, which always yields zero.

[image: the weights that matter for a single ‘1’ input]
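Here’s a small numpy sketch of that effect, assuming a hypothetical 3-dimensional input and a 2-dimensional embedding; with a one-hot input, the matrix product simply selects one row of the weight matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 2))    # made-up weights: 3-dim input -> 2-dim embedding

x = np.array([0.0, 1.0, 0.0])  # one-hot vector for the second token
print(x @ W)                   # identical to W[1]: all other weights drop out
print(W[1])
```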

If we now feed in the sparse representation of a whole sentence, more weights come into play, and the resulting output embedding differs accordingly.

So by merit of linear algebra, the dense representation of the summed sparse vectors can also be interpreted as the sum of the dense representations of the individual tokens. Note that in these diagrams I’m only looking at the first embedding layer that is applied. I’m also ignoring any activations that could theoretically be in there.
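Since we’re ignoring activations, the layer is purely linear, and this claim can be checked in a couple of lines of numpy (again just a sketch with made-up weights):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 2))     # made-up embedding weights

x1 = np.array([1.0, 0.0, 0.0])  # one-hot for token 1
x2 = np.array([0.0, 0.0, 1.0])  # one-hot for token 2

# Embedding the summed sparse vectors equals summing the token embeddings.
print(np.allclose((x1 + x2) @ W, x1 @ W + x2 @ W))  # True
```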
