Feeding custom/pretrained embeddings to ner_crf


I’m trying to use custom or pretrained embeddings with ner_crf for entity extraction, but I can’t find a proper tutorial for it yet. I have tried using fastText with spaCy, but I don’t think the embeddings are actually being used by ner_crf (since I’m not using the POS-tag features with ner_crf).

If I had to feed custom embeddings as an additional feature to ner_crf, how should I do it, with or without spaCy? (spaCy doesn’t have support for BERT embeddings yet.)

Hey @gowtham1997. Did you have a chance to look into the blog post written by our contributor Souvik? Build a Rasa NLU Chatbot with spaCy and FastText – strai – Medium



@Juste Yes, I did, and I have followed this GitHub issue to use fastText with Rasa. But going through the code, I see that spaCy is only used when pos_features is enabled for ner_crf (and POS features aren’t included in ner_crf’s default params). I tried enabling pos_features in the config file as well, but didn’t see any improvement. So my questions are:

  1. Assuming the embeddings I want are available in spaCy and I create a package following the instructions, does Rasa only use POS tags as features for ner_crf? (Going through the code, I couldn’t find where the actual word embeddings are used directly, so maybe I’m missing something.)
  2. Let’s say spaCy doesn’t have the embeddings I want to use (or I have custom embeddings for every word or subword). How do I pass them as features to ner_crf?

Hi @gowtham1997, did you manage to figure this out? I’m having the same problem, and I also went through the code without being able to figure out where and how the actual word embeddings are used by ner_crf.

I have a work-in-progress PR to discuss how to pass these kinds of features to ner_crf/CRFEntityExtractor. This would then pair with another new component, like SpacyVectorEntityFeaturizer, that would pass the features along. That way, if any new components for custom NER came along, it would be reusable.

Hi, is your solution now the answer to the original question, and the best way to go if I have the same problem?

It’s not merged into master yet, but yes.

If you have a SpacyFeaturizer whose component config specifies ner_feature_vectors: true, it should work. It will make token.vector available to CRFEntityExtractor for every token in the spacy.Doc.
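In config terms, that would look something like this (assuming the ner_feature_vectors flag from the PR; it isn’t in a released version yet):

```yaml
pipeline:
  - name: "SpacyNLP"
  - name: "SpacyTokenizer"
  - name: "SpacyFeaturizer"
    ner_feature_vectors: true   # pass token.vector on to CRFEntityExtractor
  - name: "CRFEntityExtractor"
```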

@jamesmf @souvikg10 - I read your tutorial too

Just to check, this is how we did things:

  1. Downloaded the vectors for the language I want from here – Word vectors for 157 languages · fastText
  2. Ran this code and saved the model – https://github.com/souvikg10/spacy-fasttext/blob/master/load_fastText.py
  3. Loaded the model with Rasa, made our domain, stories, etc. files for that language and trained on it
  4. Started chatting, using a config like this:
language: "br"
pipeline: "pretrained_embeddings_spacy"

policies:
  - name: KerasPolicy
    epochs: 45
    max_history: 10
  - name: AugmentedMemoizationPolicy
    max_history: 10
  - name: "FallbackPolicy"
    nlu_threshold: 0.2
    core_threshold: 0.1
    fallback_action_name: "action_default_fallback"
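For reference, step 2 above (the linked load_fastText.py) essentially boils down to parsing the downloaded .vec file and attaching the vectors to a spaCy model. Here is a minimal sketch of the parsing part (the function name is just illustrative, not from the script):

```python
def load_vectors(path):
    """Parse a fastText .vec file: a "<vocab_size> <dim>" header line,
    then one "<word> <v1> ... <vdim>" line per word."""
    vectors = {}
    with open(path, encoding="utf-8") as f:
        vocab_size, dim = map(int, f.readline().split())
        for line in f:
            parts = line.rstrip().split(" ")
            vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors

# The resulting vectors can then be attached to a spaCy model, roughly via
# nlp.vocab.set_vector(word, vector), and saved with nlp.to_disk(...),
# which is what the linked script does before the model is used by Rasa.
```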

This worked and we started to chat with the bot, but I know we don’t have any entity extraction yet, which is pretty lame. Can you maybe help me with what I should do next? Should I wait for your solution to land on the master branch, or are there things I need to do beforehand?


You just need to replace pipeline: "pretrained_embeddings_spacy" with individual components. You can pick and choose, but if you want mostly spaCy-based components, you could use:

  - name: 'SpacyNLP'
    model: 'your_model_name_here'
  - name: 'SpacyTokenizer'
  - name: 'SpacyFeaturizer'
    ner_feature_vectors: true      # this is the part that's new functionality
  - name: 'CRFEntityExtractor'
  - name: 'EmbeddingIntentClassifier'

This would use spaCy to tokenize, create features for intents from the .vector attribute on the Doc, and pass the .vector attribute of each token to CRFEntityExtractor as (some of) the features for custom entity extraction.
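To make the last part concrete, here is an illustrative sketch (not Rasa’s actual code; the function and feature names are made up) of how a token’s dense embedding can be mixed into the kind of feature dict that sklearn-crfsuite style CRFs consume alongside classic sparse features:

```python
def token_to_crf_features(text, vector):
    """Combine classic sparse CRF features with dense embedding dims.
    Illustrative only; not how Rasa's CRFEntityExtractor is written."""
    features = {
        "word.lower": text.lower(),      # categorical (sparse) feature
        "word.istitle": text.istitle(),  # boolean feature
    }
    # Each embedding dimension becomes its own numeric feature,
    # so the CRF can weight the vector components individually.
    for i, value in enumerate(vector):
        features[f"vec_{i}"] = float(value)
    return features
```

Calling token_to_crf_features("Paris", token.vector) for each token would yield one feature dict per token, which is the input shape a linear-chain CRF expects.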


Hi @Juste, can Rasa use the idea from this paper (“Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Beyond”) in place of the pretrained embeddings in the DIET classifier architecture, instead of using GloVe, BERT, or ConveRT? An early reply would be appreciated…