Word Embeddings in Rasa NLU

Hello,

I would like to know at which step word embeddings are created. In other words, are word embeddings created by text featurizers (CountVectorsFeaturizer, for example) or by intent classifiers (EmbeddingIntentClassifier, for example)? I know that CountVectorsFeaturizer transforms tokens into vectors, and that EmbeddingIntentClassifier is an ANN with 2 hidden layers that learns the coefficients used for text classification. But a word embedding is a dense matrix that represents the similarity between terms and (to my knowledge) is used by the classifier. I hope you might be able to give me some insights on this.

Thanks!

Hi Yasmine, there is no simple answer, but I’ll try to give you some useful pointers.

With the EmbeddingIntentClassifier, word embeddings are initialised and later trained as part of the classifier itself. The situation is similar in our more recent classifier, DIET (see this nice video on the architecture of DIET). However, one could argue that these are not true word embeddings: the classifiers accept inputs of all kinds from various featurisers (not one-hot encodings of words) and don’t train a true embedding matrix. Ultimately, the classifiers focus on training good sentence embeddings.
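If it helps to see the idea in code, here is a toy numpy sketch (my own illustration, not Rasa’s actual implementation) of how a classifier can learn embeddings jointly with the classification task: the featurised message and each intent are mapped into a shared embedding space, and intents are scored by similarity.

```python
import numpy as np

# Toy illustration (not Rasa's actual code): the classifier maps the
# featurised message and each intent into a shared embedding space,
# then scores intents by similarity. W_text and W_intent play the role
# of embedding layers trained jointly with the classifier.
rng = np.random.default_rng(0)

n_features, embed_dim, n_intents = 100, 20, 3
W_text = rng.normal(size=(n_features, embed_dim))    # learned, not pre-set
W_intent = rng.normal(size=(n_intents, embed_dim))   # one embedding per intent

message_features = rng.random(n_features)            # output of the featurisers
sentence_embedding = message_features @ W_text       # dense sentence embedding

# Dot-product similarity between the sentence and every intent embedding;
# the predicted intent is the most similar one.
similarities = W_intent @ sentence_embedding
predicted_intent = int(np.argmax(similarities))
print(similarities, predicted_intent)
```

During training, both W_text and W_intent would be updated so that each message lands close to its correct intent in that shared space.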

If you want to see some true word embeddings, look at the featurisers. For instance, the ConveRTFeaturizer and SpacyFeaturizer both use pre-trained embeddings. You can also leverage other common embeddings such as fastText; see the nlu-examples repo.
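For example, you can inspect spaCy’s pre-trained word vectors directly. This assumes you have installed a spaCy model that ships with vectors (e.g. en_core_web_md); as far as I know these are the vectors the SpacyFeaturizer passes on to the classifier.

```python
import spacy

# Requires a spaCy model that ships with word vectors, e.g.:
#   python -m spacy download en_core_web_md
nlp = spacy.load("en_core_web_md")

doc = nlp("word embeddings are dense vectors")
for token in doc:
    # token.vector is the pre-trained word embedding for this token
    print(token.text, token.vector.shape)

# Pre-trained embeddings encode similarity between terms:
print(nlp("cat")[0].similarity(nlp("dog")[0]))
```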

Does this help? Feel free to ask more :)

Hi Sam, thank you for your reply!

So in Rasa, the framework doesn’t really create an independent word embedding with its own parameters. Instead, the featurizers transform tokens into sparse vectors using a bag of words, and then the output is used by the classifier to maximize the similarity between the sentences. Is that correct? I have another question, if you don’t mind, concerning the pretrained models: is there a HuggingFace model supported by Rasa that can be used for French data?

Thanks!

Hey Yasmine :)

First, regarding your intuitions: you are right, though I should point out that there are also many dense featurisers which transform tokens and messages into dense vectors using methods other than bag-of-words. (The very stable CountVectorsFeaturizer still produces sparse bag-of-words/characters/n-grams features, which tend to perform well.)
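To the best of my knowledge, CountVectorsFeaturizer is built around scikit-learn’s CountVectorizer, so you can get a feel for the sparse features it produces with a few lines of scikit-learn:

```python
from sklearn.feature_extraction.text import CountVectorizer

messages = ["book me a flight", "book a table", "what is the weather"]

# Word-level bag of words (roughly what CountVectorsFeaturizer does by default)
word_vec = CountVectorizer(analyzer="word")
X_word = word_vec.fit_transform(messages)      # a sparse matrix
print(X_word.shape, type(X_word))

# Character n-grams inside word boundaries (analyzer="char_wb"),
# the variant often added as a second CountVectorsFeaturizer
char_vec = CountVectorizer(analyzer="char_wb", ngram_range=(1, 4))
X_char = char_vec.fit_transform(messages)
print(X_char.shape)
```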

Regarding this bit from your message, “the output is used by the classifier to maximize the similarity between the sentences”: could you be more specific? Do you mean NLU classifiers (perhaps specifically the DIETClassifier)?

For French, you could use the multilingual version of BERT (or DistilBERT) as a featuriser; see the list of all available models. As the classifier, you would then most likely use DIET. By the way, I recommend using the CountVectorsFeaturizer alongside any dense featuriser; it usually only helps (see also the recommended non-English pipeline for more details).
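As a concrete starting point, here is a sketch of what that config.yml could look like. The option names are as I recall them from the Rasa docs, so do double-check the exact component options and model weights against the list linked above for your Rasa version.

```python
# Sketch of a French NLU pipeline for config.yml (component options as I
# recall them from the Rasa docs; "bert-base-multilingual-cased" is one of
# the multilingual weights listed there -- please verify for your version).
config = """\
language: fr

pipeline:
  - name: WhitespaceTokenizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: char_wb
    min_ngram: 1
    max_ngram: 4
  - name: LanguageModelFeaturizer
    model_name: bert
    model_weights: bert-base-multilingual-cased
  - name: DIETClassifier
    epochs: 100
"""

with open("config.yml", "w") as f:
    f.write(config)
```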

Hey Sam,

I’m sorry for the late reply, I didn’t see the response. Yes, I meant NLU classifiers, which classify the sentence by maximizing the similarity with the correct intent and minimizing the similarities with the incorrect intents. Thank you again.
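For future readers, here is a toy sketch of that objective (my own illustration, not Rasa’s code): a softmax over dot-product similarities means training raises the similarity with the correct intent while lowering the similarities with the incorrect ones.

```python
import numpy as np

# Toy version of the similarity-based objective (not Rasa's code):
# cross-entropy over dot-product similarities pushes up the similarity
# with the correct intent and pushes down all the incorrect ones.
def intent_loss(sentence_emb, intent_embs, correct_idx):
    sims = intent_embs @ sentence_emb                 # one score per intent
    log_probs = sims - np.log(np.sum(np.exp(sims)))   # log-softmax
    return -log_probs[correct_idx]                    # cross-entropy loss

rng = np.random.default_rng(1)
sentence_emb = rng.normal(size=20)
intent_embs = rng.normal(size=(3, 20))   # 3 intents, 20-dim embeddings
print(intent_loss(sentence_emb, intent_embs, correct_idx=0))
```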
