Extending CRFEntityExtractor

rasa-nlu

(HQ) #1

Hello, I’m trying to add features to ner_crf. Looking at the source code, I noticed that features are defined in ‘function_dict’. To add features to this dictionary, I created a new component (called Crf) which inherits from ‘crf_entity_extractor.CRFEntityExtractor’, then referenced this component in the rasa_config.yml file. When I run the pipeline, I notice that rasa_nlu.extractors.crf_entity_extractor.CRFEntityExtractor is being called instead. Can you tell me what I might be doing wrong?

The custom component is:

from rasa_nlu.extractors import crf_entity_extractor

class Crf(crf_entity_extractor.CRFEntityExtractor):
    function_dict = crf_entity_extractor.CRFEntityExtractor.function_dict

Modifications to function_dict are reflected during processing. For example, something like this works:

function_dict['email'] = lambda doc: '@' in doc[0] and '.' in doc[0]

However, if I attempt to override a method from CRFEntityExtractor (e.g. process or _from_text_to_crf) in my custom class (Crf), the pipeline still calls these methods from CRFEntityExtractor. What can I do to fix this?
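A possible explanation for why the dictionary change shows up while the overridden methods do not: assigning the parent's function_dict to the subclass does not copy it, so both classes point at the same dict object. The sketch below uses a stand-in base class (BaseExtractor is hypothetical, not the real rasa_nlu class, and its process method is simplified) to show that a feature added through Crf is visible to the parent, whereas an overridden method only runs if the object really is a Crf instance.

```python
class BaseExtractor:
    # Hypothetical stand-in for CRFEntityExtractor, for illustration only.
    function_dict = {"low": lambda doc: doc[0].lower()}

    def process(self, token):
        # Apply every registered feature function to a one-token "doc".
        return {name: fn([token]) for name, fn in self.function_dict.items()}


class Crf(BaseExtractor):
    # This binds the SAME dict object as the parent; it is not a copy.
    function_dict = BaseExtractor.function_dict

    def process(self, token):
        features = super().process(token)
        features["custom"] = True  # only added when a Crf instance runs
        return features


# Mutating the dict through the subclass also changes the parent's dict.
Crf.function_dict["email"] = lambda doc: "@" in doc[0] and "." in doc[0]

shared = Crf.function_dict is BaseExtractor.function_dict
base_features = BaseExtractor().process("a@b.com")
sub_features = Crf().process("a@b.com")
```

So seeing the 'email' feature at parse time does not prove that Crf is the class being instantiated; the parent class would produce it as well. To keep the parent untouched, the subclass could take a copy instead, e.g. dict(BaseExtractor.function_dict).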


(Oceania) #2

Are you doing predict?


(HQ) #3

Sorry but I’m not following. Can you please elaborate?


(Oceania) #4

Are you doing train or predict when you invoke your new pipeline? If it's predict, Rasa invokes the same pipeline that was used for training. That’s why your new function isn’t being invoked.


(HQ) #5

I first did train, then I loaded the trained model with RasaNLUInterpreter. When I called the parse method, in debugging mode, I noticed that it’s invoking CRFEntityExtractor and not my custom class, Crf.


(HQ) #7

So basically I trained the model on my custom class (referenced in rasa_config.yml) before invoking the predict function.

pipeline:
- name: "nlp_spacy"
- name: "tokenizer_spacy"
- name: "CustomExtractor.Crf"
- name: "intent_featurizer_count_vectors"
- name: "intent_classifier_tensorflow_embedding"

As you can see from the pipeline above, I referenced “CustomExtractor.Crf” before training the model. Yet, after training, the pipeline invokes “rasa_nlu.extractors.crf_entity_extractor.CRFEntityExtractor” when I call model.parse(string).
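One thing that might be worth checking, stated here as an assumption rather than something confirmed in this thread: if the trained model records each component by a name attribute and the loader looks that name up in a registry, a subclass that inherits its parent's name would be loaded back as the parent. The sketch below is a generic, self-contained illustration of that failure mode; StandInExtractor, REGISTRY, save_metadata and load_pipeline are all hypothetical and are not rasa_nlu APIs.

```python
class StandInExtractor:
    # Hypothetical base component with a registry name.
    name = "ner_crf"


class Crf(StandInExtractor):
    # Inherits name = "ner_crf" because it does not define its own.
    pass


class NamedCrf(StandInExtractor):
    # Same idea, but with a distinct (made-up) name.
    name = "custom_crf"


# Hypothetical name-keyed registry of known components.
REGISTRY = {StandInExtractor.name: StandInExtractor}


def save_metadata(pipeline):
    # Assumption: only the component's name is written at training time.
    return [{"name": component.name} for component in pipeline]


def load_pipeline(metadata):
    # Assumption: components are re-created from the registry by name.
    return [REGISTRY[entry["name"]]() for entry in metadata]


# The subclass is used for training, but the parent comes back on load.
loaded = load_pipeline(save_metadata([Crf()]))
loaded_class = type(loaded[0]).__name__

# With its own name and a registry entry, the subclass survives the round trip.
REGISTRY[NamedCrf.name] = NamedCrf
reloaded = load_pipeline(save_metadata([NamedCrf()]))
reloaded_class = type(reloaded[0]).__name__
```

If something like this is happening, inspecting the persisted model's metadata to see what was recorded for the custom component, and printing type(component) for each loaded component, should show whether the custom class made it through the save and load steps.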