How to access DIET embedding vectors?

Hey all,

I want to access the dense vector that DIET produces once training has converged, at the point marked with a red circle in the architecture diagram (the diagram, without the mark, comes from the Rasa Whiteboard video: Rasa Algorithm Whiteboard - Diet Architecture 1: How it Works - YouTube).

I want to run some data mining or ML algorithms on the message vectors to obtain some general insights.
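
As a rough illustration of that kind of analysis, here is a minimal sketch that compares message vectors by cosine similarity. It assumes one dense vector per message has already been collected into a 2-D array; `cosine_similarity_matrix` and the toy `message_vectors` are hypothetical names and data for illustration, not part of Rasa.

```python
import numpy as np


def cosine_similarity_matrix(vectors):
    """Pairwise cosine similarity between the rows of `vectors`."""
    X = np.asarray(vectors, dtype=float)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    # Guard against division by zero for all-zero rows.
    X_unit = X / np.clip(norms, 1e-12, None)
    return X_unit @ X_unit.T


# Toy stand-ins for real message vectors (assumption: one row per message).
message_vectors = np.array([
    [1.0, 0.0],
    [2.0, 0.0],
    [0.0, 3.0],
])
similarities = cosine_similarity_matrix(message_vectors)
```

The resulting matrix could then be fed to a clustering or nearest-neighbour step; which algorithm suits the data is left open here.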

Now, I do understand that passing a Message object through an NLU interpreter, as presented in this blog post, can give me some sparse and dense features, but those are probably the ones constructed before the DIETClassifier is called.

Below is what I tried:

featurized_msg = nlu_interpreter.featurize_message(train_data.intent_examples[0])
featurized_msg.get_dense_features('text')[0].features # Seems to offer me the embedding for each token
featurized_msg.get_dense_features('text')[1].features # Seems to offer me the average of the vectors from the previous element as an embedding of the phrase
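
One way to check the "average of the token vectors" guess is to compare the two arrays numerically. The sketch below uses synthetic NumPy arrays as stand-ins for the two `.features` arrays above, assuming shapes `(n_tokens, dim)` and `(1, dim)`; `is_mean_pooled` is a hypothetical helper, not a Rasa function.

```python
import numpy as np


def is_mean_pooled(sequence_features, sentence_features, atol=1e-5):
    """Return True if the sentence vector equals the mean of the token vectors."""
    seq = np.asarray(sequence_features, dtype=float)                # (n_tokens, dim)
    sent = np.asarray(sentence_features, dtype=float).reshape(-1)   # (dim,)
    return bool(np.allclose(seq.mean(axis=0), sent, atol=atol))


# Synthetic stand-ins for the token-level and sentence-level features.
rng = np.random.default_rng(0)
token_vectors = rng.normal(size=(4, 8))
sentence_vector = token_vectors.mean(axis=0, keepdims=True)

pooled = is_mean_pooled(token_vectors, sentence_vector)
not_pooled = is_mean_pooled(token_vectors, sentence_vector + 1.0)
```

With real data, the two `.features` arrays from the snippet above would be passed in place of the synthetic ones.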

But then again, I think those are the vectors I get from my pre-trained spaCy word embeddings. Does anyone know how I can access the word embeddings learned by the DIET model?

Any help on this would be much appreciated!

Cheers

That’s a great question and we’re actually working on a feature that does this. You can find the PR here.

Once the feature is ready, I’ll likely also add it as an easy-to-use component in whatlies so that you can inspect the “DIET” embeddings from Jupyter.


That’s nice! Thank you, Vincent. I really enjoy your Rasa Whiteboard videos.

It’s good to know that this is going to be addressed, as I have barely any experience using TensorFlow directly.

I also didn’t know about the whatlies library; I’m going to keep an eye on it.
