CountVectorsFeaturizer fails with analyzer=char_wb

I tried analyzer="word" and analyzer="char", and both work. But when I tried semantic word hashing with:

analyzer=char_wb

It always reports the following error:

File "/rasa/nlu/classifiers/embedding_intent_classifier.py", line 565, in train
  self._train_tf(X, Y, intents_for_X, loss, is_training, train_op)
File "/rasa/nlu/classifiers/embedding_intent_classifier.py", line 458, in _train_tf
  is_training: True,
File "/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 929, in run
  run_metadata_ptr)
File "/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1128, in _run
  str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (64,) for Tensor 'a:0', which has shape '(?, 15449)'

What caused this issue?
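
For context, here is a minimal sketch of what analyzer="char_wb" computes, assuming the CountVectorsFeaturizer delegates to sklearn's CountVectorizer; the texts and n-gram range below are purely illustrative:

```python
from sklearn.feature_extraction.text import CountVectorizer

# Illustrative training texts, not the poster's actual data.
texts = ["book a flight", "cancel my booking"]

# "char_wb" builds character n-grams only from text inside word boundaries;
# n-grams at the edges of words are padded with spaces.
vectorizer = CountVectorizer(analyzer="char_wb", ngram_range=(1, 4))
X = vectorizer.fit_transform(texts)

# Every row has the same width (the size of the character n-gram vocabulary),
# which is what the classifier's fixed-width input placeholder expects,
# e.g. the shape (?, 15449) in the traceback above.
print(X.shape)
```

If every example gets a feature vector of that same width, a batch can be fed into the placeholder; if some examples end up with missing or differently sized features, the batch can collapse into a 1-D object array, which would produce a "Cannot feed value of shape (64,)" error like the one above.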

Hi @twittmin! Can you please provide your config file, your Rasa version, and a minimal NLU training file which reproduces this issue?

@twittmin I am assuming you are currently on the master branch. It looks like some of the data points didn't get featurized correctly by the CountVectorizer. Line 242 of count_vectors_featurizer.py computes the text_features for all data points in your training data, which are then consumed by the EmbeddingIntentClassifier. Could you check the length of the second dimension of X and see if it's consistent across all data points? If not, there is something wrong with the text of those data points.
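
A quick way to run the check described above, assuming the Rasa 1.x Message API where the featurizer stores its output under the "text_features" key; the helper name and the way you obtain the featurized examples are only a sketch, not Rasa's internal training code:

```python
import numpy as np

def check_text_features(training_examples):
    """Group featurized Rasa Messages by the length of their "text_features" vector.

    `training_examples` is assumed to be the list of Message objects *after*
    the CountVectorsFeaturizer has run on them.
    """
    lengths = {}
    for example in training_examples:
        features = example.get("text_features")
        length = None if features is None else np.asarray(features).shape[-1]
        lengths.setdefault(length, []).append(example.text)

    if len(lengths) > 1 or None in lengths:
        print("Inconsistent featurization:")
        for length, texts in lengths.items():
            print(f"  feature length {length}: {len(texts)} examples, e.g. {texts[:3]}")
    else:
        print(f"All examples have feature length {next(iter(lengths))}.")
    return lengths
```

Any group with a missing or different length would point at the training examples whose text the featurizer could not handle, which matches the ragged batch that the classifier then fails to feed into its fixed-width placeholder.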

Hi @amn41 and @dakshvar22, for some reason I don't get this error message any more. I will look into it again if it pops up. Thanks.