Debugging Custom Rasa NLU Component during Training

(Nabito) #1

I’m adding a custom component (a language-specific tokenizer) to use in my pipeline.

I’m using the latest version of Rasa NLU from https://github.com/RasaHQ/rasa_nlu.git. I added my_tokenizer.py and added the new component definition to rasa/nlu/registry.py. The pipeline in nlu_config.yml (content below) can now refer to the component.

[nlu_config.yml]

pipeline:
- name: "my_tokenizer"
- name: "ner_crf"
- name: "ner_synonyms"
- name: "intent_featurizer_count_vectors"
- name: "intent_classifier_tensorflow_embedding"

Question: I get the error below when trying to train the NLU model. How am I supposed to debug what is happening in the pipeline? (Adding print statements to my_tokenizer.py and training with --debug doesn’t seem to work.)

    interpreter = trainer.train(training_data, **kwargs)
  File "~/rasa_nlu/rasa/nlu/model.py", line 194, in train
    updates = component.train(working_data, self.config, **context)
  File "~/rasa_nlu/rasa/nlu/extractors/crf_entity_extractor.py", line 141, in train
    dataset = self._create_dataset(filtered_entity_examples)
  File "~/rasa_nlu/rasa/nlu/extractors/crf_entity_extractor.py", line 151, in _create_dataset
    dataset.append(self._from_json_to_crf(example, entity_offsets))
  File "~/rasa_nlu/rasa/nlu/extractors/crf_entity_extractor.py", line 454, in _from_json_to_crf
    ents = self._bilou_tags_from_offsets(doc_or_tokens, entity_offsets)
  File "~/rasa_nlu/rasa/nlu/extractors/crf_entity_extractor.py", line 485, in _bilou_tags_from_offsets
    starts = {token.offset: i for i, token in enumerate(tokens)}
TypeError: 'NoneType' object is not iterable
make: *** [train-nlu] Error 1

P.S. I know what the error is: it seems my component doesn’t produce proper output for the NER component that follows. I just need a decent way to debug.

(Paula Wesselmann) #2

Hey @nabito,

To print statements from your file, add import logging and then logger = logging.getLogger(__name__) at the top of it. Any statement that should only show in debug mode should then be written as logger.debug("your debug statement"); for general statements use logger.info("your info statement").

I hope that helps with your debugging!

(Nabito) #3

Thank you Paula, it turns out I just forgot to call my custom component properly from the ‘training’ function. But your logging tip gave me a way to output debug messages in the same fashion as the other Rasa modules, that’s awesome!
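For anyone who hits the same TypeError, here is a rough sketch of the idea: the tokenizer has to attach tokens to every example during training as well as to incoming messages, otherwise the entity extractor that follows finds None. Token, Message and MyTokenizer below are simplified stand-ins written for this post, not the real Rasa classes, and the train signature is shortened, so check the actual component interface in your installed version.

```python
import logging

logger = logging.getLogger(__name__)


class Token:
    """Stand-in token: the text plus its character offset in the message."""

    def __init__(self, text, offset):
        self.text = text
        self.offset = offset


class Message:
    """Stand-in for a training example or message with key-value data."""

    def __init__(self, text):
        self.text = text
        self.data = {}

    def set(self, key, value):
        self.data[key] = value

    def get(self, key, default=None):
        return self.data.get(key, default)


class MyTokenizer:
    """Hypothetical whitespace tokenizer (assumption, not the poster's code)."""

    def tokenize(self, text):
        tokens, offset = [], 0
        for word in text.split():
            # Find the word at or after the end of the previous token.
            offset = text.index(word, offset)
            tokens.append(Token(word, offset))
            offset += len(word)
        logger.debug("tokens for %r: %s", text, [t.text for t in tokens])
        return tokens

    def train(self, training_examples):
        # The step that is easy to forget: set tokens on every training example.
        for example in training_examples:
            example.set("tokens", self.tokenize(example.text))

    def process(self, message):
        # Same thing at prediction time.
        message.set("tokens", self.tokenize(message.text))


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    examples = [Message("book a flight")]
    MyTokenizer().train(examples)
    # Mirrors the dict comprehension from the traceback; works once tokens are set.
    print({t.offset: i for i, t in enumerate(examples[0].get("tokens"))})
```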