Hi everyone,
I am running into several issues when I try to customize the parameters of the Rasa pipeline. Starting from the Rasa Starter Pack, I first tried the following pipeline (suggested here), which works without issue:
language: "en"
pipeline:
- name: "tokenizer_whitespace"
- name: "ner_crf"
- name: "intent_featurizer_count_vectors"
- name: "intent_classifier_tensorflow_embedding"
Then, I tried to customize this pipeline with the configurations presented here. First, I wanted to customize intent_classifier_tensorflow_embedding by copying and pasting its detailed configuration.
As far as I understand, the detailed configuration is the default configuration, so I replaced
- name: "intent_classifier_tensorflow_embedding"
with the complete configuration, and there is no issue since the configuration remains the same.
1. Problems with max_features of intent_featurizer_count_vectors
However, issues arise when I do the same thing with intent_featurizer_count_vectors and ner_crf. When I replace only intent_featurizer_count_vectors with its complete configuration (described here), I first get the following error for the max_features parameter, which is read as the string ‘None’ instead of a “real” None:
ValueError: max_features='None', neither a positive integer nor None
I can avoid this error by deleting the corresponding line, but the problem remains whenever the parameter is set explicitly.
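To make the failure concrete, here is a minimal Python sketch of the kind of check that seems to produce this message. It is an assumption on my side that the featurizer forwards max_features to a validation like this one. Both check_max_features and normalize_none are hypothetical names for illustration, not part of Rasa. As far as I know, a bare None in a YAML file is read as a plain string, and null is the YAML spelling of a real None.

```python
import numbers


def check_max_features(max_features):
    """Assumed validation: accept None or a positive integer, reject anything else."""
    if max_features is not None:
        if not isinstance(max_features, numbers.Integral) or max_features <= 0:
            raise ValueError(
                "max_features=%r, neither a positive integer nor None" % max_features
            )
    return max_features


def normalize_none(value):
    """Hypothetical helper: turn the strings 'None', 'null' or '~' into a real None."""
    if isinstance(value, str) and value.strip() in ("None", "null", "~"):
        return None
    return value


# The string 'None' fails the check, a real None passes it.
try:
    check_max_features("None")
except ValueError as err:
    print(err)

print(check_max_features(normalize_none("None")))
```

If this reading is right, writing max_features: null in the config file (or leaving the line out) should avoid the error without any extra code.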
2. Problems with feature shapes with intent_featurizer_count_vectors
Then, if I delete the max_features line, whatever detailed configuration I set up for intent_featurizer_count_vectors, I get the following error from the TensorFlow classifier:
ValueError: Cannot feed value of shape (64,) for Tensor 'a:0', which has shape '(?, 66)'
It seems that the Tensor is not reshaped, which explains the ?.
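For anyone trying to read the message: as far as I know, a ? in a TensorFlow shape stands for a dimension of unknown size (typically the batch size), so the error says that a 1-D value with 64 entries was fed where a 2-D value with 66 columns was expected. The following sketch only illustrates that comparison. shape_matches is a hypothetical helper, not a TensorFlow function, and None plays the role of the ?.

```python
def shape_matches(value_shape, placeholder_shape):
    """Return True if value_shape can be fed into placeholder_shape.

    None in placeholder_shape stands for '?', meaning any size is accepted.
    """
    # The number of dimensions must agree before sizes are compared.
    if len(value_shape) != len(placeholder_shape):
        return False
    return all(p is None or v == p for v, p in zip(value_shape, placeholder_shape))


print(shape_matches((64,), (None, 66)))     # the case from the error message
print(shape_matches((64, 66), (None, 66)))  # a batch of 64 rows with 66 features
```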
Therefore, my first question is: How can I solve this problem? How can I customize the “intent_featurizer_count_vectors” when it is followed by the “intent_classifier_tensorflow_embedding”?
3. Problems with ner_crf
Next, I also have the same kind of problem with ner_crf. When I set only the default detailed configuration described here, I first get an error because the features pos and pos2 are unavailable without loading nlp_spacy, as described here.
So my second question is: How can ner_crf work without the detailed configuration if nlp_spacy is not in the pipeline? Does it mean that the configuration behind the hyperlink is not the default configuration?
Therefore, I loaded nlp_spacy before the detailed ner_crf configuration. It solved the problem with my own data set, but I just saw that it does not work with the starter pack because of an index out of range error.
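The ordering constraint above (pos and pos2 need nlp_spacy earlier in the pipeline) can be checked before training with a small sketch like the one below. It assumes the pipeline has been loaded into a list of dicts and that the features entry of ner_crf is a list of lists of feature names. The function crf_needs_spacy is hypothetical and not part of Rasa.

```python
POS_FEATURES = {"pos", "pos2"}


def crf_needs_spacy(pipeline):
    """Return True if an ner_crf entry asks for POS features before nlp_spacy appears."""
    spacy_loaded = False
    for component in pipeline:
        name = component.get("name")
        if name == "nlp_spacy":
            spacy_loaded = True
        elif name == "ner_crf":
            # Flatten the assumed list-of-lists layout into one set of feature names.
            requested = {f for group in component.get("features", []) for f in group}
            if requested & POS_FEATURES and not spacy_loaded:
                return True
    return False


pipeline = [
    {"name": "tokenizer_whitespace"},
    {"name": "ner_crf", "features": [["low", "pos"], ["bias", "pos2"], ["low"]]},
]
print(crf_needs_spacy(pipeline))
```

This only covers the pos and pos2 error. It does not explain the index out of range error with the starter pack.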
You can try to reproduce these errors from the starter pack. If you find a solution, that would help me and also the Rasa team fix these issues with the default configurations.
Thanks a lot in advance for your answers!