Bag-of-words understanding in supervised embedding pipeline

Ghada · September 22, 2020, 12:57pm

Hello,

I would like to better understand what goes into a bag-of-words representation. Let’s say I have a food intent:

## intent:ask_about_food
- I want food
- I want to eat
- I'd like to eat

Then I have a user input: “I wanna eat”.

As I understand it, given the user input, the CountVectorsFeaturizer counts how many times each distinct word from the training data appears in the user message, meaning it will compare “I wanna eat” with the other 3 sentences. These 3 features are defined as sparse features.
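To make the counting step concrete, here is a minimal pure-Python sketch of a bag-of-words featurizer. It is not Rasa's implementation: the `tokenize` and `bag_of_words` helpers are hypothetical, and lowercasing plus whitespace splitting is an assumed tokenization. It does illustrate that a bag-of-words vector has one count per distinct vocabulary word seen in training, rather than one value per training sentence.

```python
from collections import Counter

# Training examples for the ask_about_food intent, taken from the post.
training_examples = ["I want food", "I want to eat", "I'd like to eat"]


def tokenize(text):
    # Assumption: lowercase and split on whitespace (a simplification).
    return text.lower().split()


# The vocabulary is every distinct token seen in the training data.
vocabulary = sorted({tok for ex in training_examples for tok in tokenize(ex)})


def bag_of_words(text):
    # Count how often each vocabulary word occurs in the text.
    # Words outside the vocabulary (here "wanna") are simply ignored.
    counts = Counter(tokenize(text))
    return [counts[tok] for tok in vocabulary]


user_vector = bag_of_words("I wanna eat")

print(vocabulary)   # ['eat', 'food', 'i', "i'd", 'like', 'to', 'want']
print(user_vector)  # [1, 0, 1, 0, 0, 0, 0]
```

Most entries of such a vector are zero once the vocabulary grows, which is why these are called sparse features.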

Now, for the EmbeddingIntentClassifier, it is mentioned that it embeds user inputs and intent labels into the same vector space.

Are the user inputs here the sparse features that were calculated, or are we going back to the raw user message “I wanna eat”? And how are the intent labels relevant here?
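One way to picture "embedding into the same space" is the sketch below. It rests on several assumptions: the classifier's input is the message's feature vector rather than the raw text, each intent label is represented as a one-hot vector, and both are mapped by separate linear layers into one shared space where they are compared by dot product. The weights here are random and untrained, `greet` is a hypothetical second intent, and `EMBED_DIM` is an arbitrary choice. In a trained model the weights would be learned so that a message lands closest to its correct intent.

```python
import random

random.seed(0)

EMBED_DIM = 4  # hypothetical size of the shared embedding space


def random_matrix(rows, cols):
    # Stand-in for learned weights; a real model would train these.
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


def embed(vector, weights):
    # Linear projection: (n,) vector times (n x d) matrix gives a (d,) vector.
    return [
        sum(v * row[j] for v, row in zip(vector, weights))
        for j in range(len(weights[0]))
    ]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# Bag-of-words counts for "I wanna eat" over an assumed 7-word vocabulary.
message_features = [1, 0, 1, 0, 0, 0, 0]

# One-hot features for each intent label ("greet" is made up for contrast).
intents = ["ask_about_food", "greet"]
label_features = {"ask_about_food": [1, 0], "greet": [0, 1]}

message_weights = random_matrix(len(message_features), EMBED_DIM)
label_weights = random_matrix(len(intents), EMBED_DIM)

# Both the message and the labels end up as vectors of length EMBED_DIM.
message_embedding = embed(message_features, message_weights)
similarities = {
    name: dot(message_embedding, embed(label_features[name], label_weights))
    for name in intents
}

# The predicted intent is the label whose embedding is most similar.
predicted = max(similarities, key=similarities.get)
print(similarities, predicted)
```

Under these assumptions, the intent labels matter because they are the targets in the shared space: training pulls each message embedding toward the embedding of its own label and pushes it away from the others.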

I’d appreciate it if someone could give an explanation with an example, please.

Thank you!
