Deciding whether to use pre-trained word embeddings

Hi guys, I’ve been studying the Rasa framework and chatbots for the last two months. I started my project using spaCy’s pre-trained word embeddings for Portuguese (pt) and got my results with a certain NLU confidence. When I switched to the default Rasa pipeline, changing only the language option to ‘pt’, the confidence increased significantly. Is there any reason for that? I mean, what kinds of things should I consider before choosing between these two approaches? Thanks.

Hi @IRBraga!

Maybe this doc could help you.

Thanks @saurabh-m523! I’ve already read the docs; I’m looking for someone’s opinion to get a better understanding. Maybe something like: if your bot has a wide variety of intents, or your intents are very similar to one another, you should use the standard pipeline, because the pre-trained word vectors could confuse the intent choice… something like that. I really don’t know. :upside_down_face:

Do you have any experience with that, using a language other than English? Thx.

Oh, well, I don’t have experience with languages other than English :slightly_smiling_face:.
