Many issues when customizing starter pack pipeline

Hi everyone,

I am running into several issues when I try to customize the parameters of the Rasa pipeline. Starting from the Rasa Starter Pack, I first tried the following pipeline (suggested here), which works without any issue:

language: "en"

pipeline:
- name: "tokenizer_whitespace"
- name: "ner_crf"
- name: "intent_featurizer_count_vectors"
- name: "intent_classifier_tensorflow_embedding"

Then I tried to customize this pipeline with the configurations presented here. First I wanted to customize intent_classifier_tensorflow_embedding by copying and pasting its detailed configuration. As I understand it, the detailed configuration is the default configuration, so I replaced - name: "intent_classifier_tensorflow_embedding" with the complete configuration. This causes no issue, since the configuration remains the same.

1. Problems with max_features of intent_featurizer_count_vectors

However, issues appear when I do the same thing with intent_featurizer_count_vectors and ner_crf. When I replace only intent_featurizer_count_vectors with its complete configuration (described here), I first get the following error on the max_features parameter, which is read as the string ‘None’ instead of a “real” None:

ValueError: max_features='None', neither a positive integer nor None

I can avoid this error by deleting the corresponding line, but it still occurs whenever the parameter is written out explicitly.
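A possible explanation, assuming the config file is read by a standard YAML loader: `None` is a Python literal, not a YAML one, so it is loaded as the string "None". YAML spells a null value as `null` or `~`. The fragment below is a minimal sketch of the featurizer entry under that assumption. I have not verified that this Rasa version accepts an explicit null for max_features.

```yaml
pipeline:
- name: "tokenizer_whitespace"
- name: "intent_featurizer_count_vectors"
  # `None` is Python syntax; a YAML loader reads it as the string "None":
  # max_features: None     # loaded as "None" (string), matches the ValueError above
  max_features: null       # YAML null, loaded as Python None (`~` is equivalent)
- name: "intent_classifier_tensorflow_embedding"
```

Leaving the key out entirely, as described above, should have the same effect if the component's default is None.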

2. Problems with features shapes with intent_featurizer_count_vectors

Then, if I delete the max_features line, whatever detailed configuration I set for intent_featurizer_count_vectors, I get the following error from the TensorFlow classifier:

ValueError: Cannot feed value of shape (64,) for Tensor 'a:0', which has shape '(?, 66)'

It seems that the Tensor is not reshaped, which explains the ?.

Therefore, my first question is: How can I solve this problem? How can I customize the “intent_featurizer_count_vectors” when it is followed by the “intent_classifier_tensorflow_embedding”?

3. Problems with ner_crf

Next, I have the same kind of problem with ner_crf. When I use only the default detailed configuration described here, I first get an error because the pos and pos2 features are unavailable unless nlp_spacy is loaded, as described here.

So my second question is: How can ner_crf work without the detailed configuration if nlp_spacy is not in the pipeline? Does it mean that the linked configuration is not the default configuration?

Therefore, I loaded nlp_spacy before the detailed ner_crf configuration. This solved the problem with my own data set, but I just saw that it does not work with the starter pack because of an index out of range error.
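For reference, the ordering described above (nlp_spacy loaded before the detailed ner_crf configuration) can be sketched as follows. The feature lists are shortened and illustrative, not copied from the documentation. Only pos and pos2 are taken from the error described above; the other feature names and the exact layout of the `features` key are assumptions.

```yaml
language: "en"

pipeline:
# nlp_spacy comes first so that part-of-speech information exists before ner_crf runs
- name: "nlp_spacy"
- name: "tokenizer_whitespace"
- name: "ner_crf"
  # Illustrative feature lists for [previous token, current token, next token]
  features: [["low", "title"], ["bias", "low", "pos", "pos2"], ["low", "title"]]
- name: "intent_featurizer_count_vectors"
- name: "intent_classifier_tensorflow_embedding"
```

Removing pos and pos2 from the feature lists should make nlp_spacy unnecessary, since those are the features reported as unavailable without it.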

You can try to reproduce these errors from the starter pack. If you find a solution, it would help both me and the Rasa team fix these issues with the default configurations.

Thanks a lot in advance for your answers! 🙂

@Abir

@akelad Please help out with issues 1 and 2.

Go through this link and see if you made the same mistake. If not, please share the complete nlu_config.