Hi all, I’ve been playing around with Rasa (NLU-only) and am wondering how to achieve this –
I'm working on a slot-filling task in a 'low-data' setting (i.e. 20 samples per label). I want to train a custom entity extractor and have decided to use CRFEntityExtractor (because I don't have enough data for DIET).
Think of an utterance such as "I want to depart from New York", with New York labelled as the departure slot. My idea is to use a pre-trained NER model, e.g. from spaCy, to first extract New York as a city. Then I'd combine that with token embeddings, e.g. via the LanguageModelFeaturizer component, using a Transformer model to create contextual embeddings, and use both the entity label and the token embeddings as features to train the CRF tagger.
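To make this concrete, here is roughly what I mean by using the entity label as a feature (a small sketch outside of Rasa, using spaCy's token.ent_iob_ / token.ent_type_ attributes; en_core_web_sm is just a placeholder model):

import spacy

# Any pre-trained spaCy pipeline with an NER component would do here.
nlp = spacy.load("en_core_web_sm")
doc = nlp("I want to depart from New York")

# Per-token NER tags from the pre-trained model, e.g. "B-GPE" for "New"
# and "I-GPE" for "York". These are the labels I would like to feed into
# the CRF as extra features next to the token embeddings.
ner_tags = [
    f"{tok.ent_iob_}-{tok.ent_type_}" if tok.ent_type_ else "O"
    for tok in doc
]
print(list(zip([tok.text for tok in doc], ner_tags)))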
My questions:
How could I take an entity prediction from the pretrained NER model and use it as a feature for the CRF?
Would this be the correct config:
pipeline:
- name: SpacyNLP
  model: en_core_web_trf
  case_sensitive: False
- name: SpacyTokenizer
- name: LanguageModelFeaturizer
  # Name of the language model to use
  # choose from ['bert', 'gpt', 'gpt2', 'xlnet', 'distilbert', 'roberta']
  # or create a new class inheriting from this class to support your model.
  model_name: "bert"
  # Pre-Trained weights to be loaded
  model_weights: "bert-base-uncased"
- name: LexicalSyntacticFeaturizer
  "features": [
    # features for the word preceding the word being evaluated
    ["suffix2", "prefix2", "pos2"],
    # features for the word being evaluated
    ["BOS", "EOS", "pos2"],
    # features for the word following the word being evaluated
    ["suffix2", "prefix2", "pos2"]
  ]
- name: CRFEntityExtractor
That said, what entities are you trying to detect? Cities? If that's the case, it might be more pragmatic to start with a name list or spaCy before considering BERT features.
I am using Rasa NLU to run several benchmarking experiments on various datasets for slot filling, so I don't have a particular entity in mind. I just thought it would be useful to be able to use pre-trained named entities as features for a custom entity extractor.
For example, in "I want to fly to New York from Chicago", New York being a city could be a useful feature for classifying it as a destination (or departure) slot.
I see now that spaCy is probably a good option for this, as it has several pre-trained NER models whose predictions can be used as features. I've had success implementing this with sklearn-crfsuite, but haven't tried it with Rasa. I guess this would have to be some custom logic I add in.
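For reference, this is roughly what that looked like with sklearn-crfsuite (a minimal sketch: the feature function, the toy training sentences and the BIO slot labels are all made up for illustration):

import spacy
import sklearn_crfsuite

nlp = spacy.load("en_core_web_sm")

def sent_features(text):
    """One feature dict per token, including the pre-trained NER tag."""
    doc = nlp(text)
    return [
        {
            "lower": tok.text.lower(),
            "suffix2": tok.text[-2:],
            "ner": f"{tok.ent_iob_}-{tok.ent_type_}" if tok.ent_type_ else "O",
        }
        for tok in doc
    ]

# Tiny made-up training set with BIO slot labels (one label per spaCy token).
texts = [
    "I want to depart from New York",
    "I want to fly to New York from Chicago",
]
labels = [
    ["O", "O", "O", "O", "O", "B-departure", "I-departure"],
    ["O", "O", "O", "O", "O", "B-destination", "I-destination", "O", "B-departure"],
]

X = [sent_features(t) for t in texts]
crf = sklearn_crfsuite.CRF(algorithm="lbfgs", max_iterations=100)
crf.fit(X, labels)
print(crf.predict([sent_features("please fly from Chicago")]))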
The CRF implementation inside of Rasa is based on sklearn-crfsuite.
I'm actually reminded now that, technically (very technically), the DIET algorithm uses a CRF layer implemented in TensorFlow. One interpretation of this layer is that it uses entity information from the neighboring tokens to predict whether the current token is an entity.
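To make that interpretation a bit more concrete, here is a minimal sketch of CRF decoding using the tensorflow_addons CRF ops (not Rasa's actual code; the scores below are made-up numbers). The transition matrix is where information about the neighboring tokens' tags enters the prediction:

import tensorflow as tf
import tensorflow_addons as tfa

# Made-up per-token tag scores ("unary potentials") for one 3-token sentence
# and two tags: 0 = O (not an entity), 1 = CITY.
potentials = tf.constant(
    [[[2.0, 1.0],   # token 1 prefers O
      [1.0, 1.2],   # token 2 on its own slightly prefers CITY
      [2.0, 0.5]]], # token 3 prefers O
    dtype=tf.float32,
)

# Transition scores between tags of neighboring tokens: staying in the same
# tag is rewarded, switching tags is penalized.
transitions = tf.constant(
    [[1.0, -1.0],
     [-1.0, 1.0]],
    dtype=tf.float32,
)

sequence_length = tf.constant([3])

# Viterbi decoding picks the jointly best tag sequence, so token 2's tag is
# influenced by its neighbors: alone it leans towards CITY, but here it gets
# decoded as O.
tags, best_score = tfa.text.crf_decode(potentials, transitions, sequence_length)
print(tags.numpy())  # [[0 0 0]]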
In case you're interested, you may find this YouTube video I made on name detection useful. It tries to highlight how hard it can be to benchmark certain entities properly. Human names, in particular, turned out to be much harder than I thought they would be.
True, DIET does use a CRF head for decoding the sequence. However, as I understand it, it looks at the previously predicted custom entities when decoding a given token; it would not consider information from e.g. a pretrained NER model unless that information is explicitly passed in as a feature, right?