Bag-of-words understanding in supervised embedding pipeline

Hello,

I would like to better understand what is considered in a bag-of-words representation. Let’s say I have an intent about food:

## intent:ask_about_food
- I want food
- I want to eat
- I'd like to eat

Then I have a user input: “I wanna eat”.

Given the user input, the CountVectorsFeaturizer counts how many times the distinct words of the training data appear in the user message, meaning it will compare “I wanna eat” with the other 3 sentences. These 3 features are defined as sparse features.
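
To make my understanding concrete, here is a minimal Python sketch of how I picture the counting step. It is not Rasa's actual implementation: the whitespace tokenization, the lowercasing, and the helper names `tokenize`, `build_vocabulary` and `featurize` are my own assumptions for illustration only.

```python
from collections import Counter

training_examples = ["I want food", "I want to eat", "I'd like to eat"]


def tokenize(text):
    # Assumption: lowercase and split on whitespace; the real tokenizer may differ.
    return text.lower().split()


def build_vocabulary(examples):
    # Collect the distinct words seen in the training data, in a fixed (sorted) order.
    return sorted({token for example in examples for token in tokenize(example)})


def featurize(text, vocabulary):
    # Count how often each vocabulary word occurs in the text.
    # Words that are not in the vocabulary (here "wanna") are simply ignored.
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocabulary]


vocabulary = build_vocabulary(training_examples)
user_vector = featurize("I wanna eat", vocabulary)

print(vocabulary)   # ['eat', 'food', 'i', "i'd", 'like', 'to', 'want']
print(user_vector)  # [1, 0, 1, 0, 0, 0, 0]
```

If this picture is right, the message would become one count per vocabulary word (mostly zeros, hence "sparse") rather than one feature per training sentence, but I am not sure that this is what the featurizer really does.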

Now, for the EmbeddingIntentClassifier, it is mentioned that it embeds user inputs and intent labels into the same vector space.

Are the user inputs the sparse features that were calculated? Or are we going back to the raw user message “I wanna eat”? And how are the intent labels relevant here?

I’d appreciate it if someone could give an explanation with an example, please.

Thank you!