I am highly interested in the fairly recent blog post about Semantic Map Embedding. It really sounds like an idea I would like to try myself. So I have been wondering about the implementation of the embedding, which unfortunately does not seem to be freely available for use with an entirely different dataset, right? I can only use the SemanticMapFeaturizer with an already embedded Wikipedia corpus, but I cannot train my own embedding. Please correct me if I am mistaken.
Looking forward to further comments on this.
Hello @BellaBoga. Welcome to the forum and thank you for your interest!
You’re right, I didn’t publish the code to generate the map. This would require a bit of cleanup and documentation on my side. I’ll do this eventually, but it’s not a priority right now. Who else would be interested in creating their own semantic maps?
Ok, I just did it now. You can find the code you’d need to generate your own semantic map embedding here: https://github.com/RasaHQ/semantic-map-embedding
@BellaBoga Let me know about your experiences with this embedding!
Great, thank you! I will try it out and let you know about my experience. Thanks for sharing.
For anyone else following this thread, here are links to the two blog articles by @j.mosig related to this post:
Sorry for only getting back to you now! There are still some issues with running the code. To my understanding, some files needed for the compilation process are missing: when I try to compile the executable, something is already missing at that step.
I’d be glad about any sort of feedback!
What is the error message, and at what step do you get it? Maybe you don't have g++-10 installed on your system? I think I installed g++-10 like this.
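For reference, on Debian/Ubuntu the compiler can typically be installed like this (a sketch; package names and the install command will differ on other distributions or on macOS):

```shell
# Sketch for Debian/Ubuntu: install GCC 10's C++ compiler.
sudo apt-get update
sudo apt-get install -y g++-10

# Verify that the compiler is on the PATH before rebuilding.
g++-10 --version
```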
Thank you! Yes, that was it. I was able to run the code!
So basically, I can build my own semantic map just like with the example corpus, right? Or, to ask differently: what kind of data is taken as input, and what is the output? How can the output be analysed?
@BellaBoga Inputs are plain text files like these, as well as a vocabulary list. The output is a JSON file of the embedding that you can use with Rasa's SemanticMapFeaturizer from the rasa-nlu-examples repo.
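For anyone else arriving here, wiring the resulting JSON file into a Rasa pipeline might look roughly like this (a sketch only: the component path and the `pretrained_semantic_map` parameter name are my reading of the rasa-nlu-examples docs, so double-check against that repo; `semantic_map.json` is a placeholder filename):

```yaml
# config.yml (sketch) - NLU pipeline using the SemanticMapFeaturizer
# from rasa-nlu-examples with a self-trained embedding file.
pipeline:
  - name: WhitespaceTokenizer
  - name: rasa_nlu_examples.featurizers.sparse.SemanticMapFeaturizer
    # Path to the JSON embedding produced by semantic-map-embedding;
    # the parameter name may differ between versions of rasa-nlu-examples.
    pretrained_semantic_map: "semantic_map.json"
  - name: DIETClassifier
    epochs: 100
```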
To create text files of this format from Wikipedia dumps, you may use the smap branch of the forked version of the WikiExtractor on the Rasa GitHub page.
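In case it helps others, fetching and running that fork might look roughly like this (a sketch: the exact repository URL is an assumption based on "the Rasa GitHub page", the smap branch may add or change flags, and the dump filename is just an example; the `-o` output-directory flag is from the upstream WikiExtractor CLI):

```shell
# Sketch only: repository URL is an assumption; check the Rasa GitHub page.
git clone https://github.com/RasaHQ/wikiextractor.git
cd wikiextractor
git checkout smap

# Run on a Wikipedia XML dump (example filename), writing extracted
# plain-text files to the "extracted/" directory.
python WikiExtractor.py enwiki-latest-pages-articles.xml.bz2 -o extracted/
```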