Hi everyone, I am new to Rasa and I have run into the following issue at “Starting to train component DIETClassifier” during pipeline training with my configuration:
tensorflow.python.framework.errors_impl.InvalidArgumentError: All dimensions except 2 must match. Input 1 has shape [64 8 768] and doesn’t match input 0 with shape [64 11 128].
I have tried a “language: en” model with the LanguageModelFeaturizer, and it runs perfectly. Does this mean the LanguageModelFeaturizer doesn’t support “language: zh”?
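For context, the pipeline is roughly along these lines (an illustrative sketch rather than my exact config; the model_weights value in particular may differ):

language: zh

pipeline:
  # Jieba segments Chinese text into word-level tokens
  - name: JiebaTokenizer
  # Sparse features aligned to the Jieba tokens
  - name: CountVectorsFeaturizer
  # Dense features from a pre-trained Chinese BERT model
  - name: LanguageModelFeaturizer
    model_name: "bert"
    model_weights: "bert-base-chinese"
  - name: DIETClassifier
    epochs: 100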
Strange, that shouldn’t happen. I’m wondering what is going on here.
Just to confirm, if you remove the LanguageModelFeaturizer component, does the error persist? I’m wondering if there’s a mismatch between the tokeniser and the language model.
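For example, a stripped-down config along these lines (a sketch only; keep your other components as they are) would isolate the featurizer:

language: zh

pipeline:
  - name: JiebaTokenizer
  - name: CountVectorsFeaturizer
  - name: DIETClassifier
    epochs: 100

If that trains cleanly, the problem is most likely in how the language model’s features are aligned with the Jieba tokens.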
Could you confirm the Rasa version here? Also the version of the Hugging Face libraries?
rasa --version
pip freeze | grep -i -E "transformers|huggingface"
Related: I just merged a PR for spaCy. Soon you should also be able to get pre-trained language models for Chinese via the spaCy tooling as well.
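Once that is released, a spaCy-based Chinese setup should look something like this (a sketch; zh_core_web_md is spaCy’s Chinese model and assumes a spaCy version that ships it):

language: zh

pipeline:
  - name: SpacyNLP
    model: "zh_core_web_md"
  - name: SpacyTokenizer
  - name: SpacyFeaturizer
  - name: DIETClassifier
    epochs: 100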
One thing to try out: could you use these settings?
pipeline:
  - name: LanguageModelFeaturizer
    # Name of the language model to use
    model_name: "bert"
    # Pre-Trained weights to be loaded
    model_weights: "rasa/LaBSE"
I’m mentioning this model because it is explicitly mentioned in our docs, and LaBSE is trained on a multilingual corpus. Chinese is one of the languages it was trained on, so this might be an alternative worth trying.
Sorry for the late reply. I have finally tried your settings with LaBSE, but it gives me the same issue as the BERT model when training the DIETClassifier component:
tensorflow.python.framework.errors_impl.InvalidArgumentError: All dimensions except 2 must match. Input 1 has shape [64 13 768] and doesn't match input 0 with shape [64 21 128].
[[node gradient_tape/ConcatOffset_1 (defined at C:\Users\p768l\AppData\Roaming\Python\Python38\site-packages\rasa\utils\tensorflow\models.py:157)
]] [Op:__inference_train_function_54635]
Function call stack:
train_function
For your information: I can train and run the pipeline with both BERT and LaBSE if I replace the DIETClassifier with SklearnIntentClassifier and CRFEntityExtractor.
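For reference, the working variant looks roughly like this (an illustrative sketch, not my full config):

language: zh

pipeline:
  - name: JiebaTokenizer
  - name: LanguageModelFeaturizer
    model_name: "bert"
    model_weights: "rasa/LaBSE"
  - name: SklearnIntentClassifier
  - name: CRFEntityExtractor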
I’m now wondering if this is perhaps a bug that we should investigate. Is it possible for you to send me a minimum viable example of nlu.yml and config.yml that I might be able to run locally? If I can confirm the error I’ll gladly start a GitHub issue for it.
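Even something as small as this would help (hypothetical placeholder data, just to show the scale I mean):

version: "2.0"
nlu:
  - intent: greet
    examples: |
      - 你好
      - 早上好
  - intent: goodbye
    examples: |
      - 再见
      - 拜拜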
The error originates in the DIETClassifier when it concatenates sequence features from two featurizers that disagree on the number of tokens (the shapes are batch size x tokens x embedding dimension, so in the trace above one input has 21 tokens with 128-dimensional features while the other has 13 tokens with 768-dimensional features for the same batch of 64 examples).
I am able to reproduce this with a custom component that uses a tokenizer that is not in the NLU pipeline. A similar issue is likely occurring between the JiebaTokenizer and the tokenizer inside the LanguageModelFeaturizer.
@p768lwy3, did using the spacy tokenizer resolve this issue?
I encountered the same problem when testing with cross-validation. The curious thing was that some models worked and others didn’t. After a long time investigating, I found the problem in my NLU data. We use automated scripts that read our Excel files with the NLU data, write them out in Rasa NLU format, and label our entities in the process. For some reason the scripts sometimes introduce a character between words that looks like a space but is not an ordinary whitespace character (such as a non-breaking space). You will only see it when your IDE is set to show whitespace characters.
Maybe this helps when you are investigating.