SOLVED
I was also getting a similar error. My classifier is not set up for sentiment, but it uses a similar function and a similar pipeline. There were a few pieces to this puzzle.
First, ensure that you require "tokens" within your component. Otherwise you will not be able to reference .get('tokens') during training. If you'd like to ignore tokens entirely, you can use the simple workaround below.
training_data.training_examples[SOME VALUE].get('tokens')
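In Rasa 1.x-era custom components, the tokens requirement was declared via a class attribute. A minimal sketch of the pattern (the class name is illustrative, and a real component would subclass rasa.nlu.components.Component):

```python
# Sketch of a custom NLU component that declares it needs tokens from an
# upstream tokenizer. "SentimentAnalyzer" is an illustrative name; in a real
# pipeline this class would subclass rasa.nlu.components.Component.
class SentimentAnalyzer:
    provides = ["entities"]
    requires = ["tokens"]  # without this, message.get('tokens') returns None
```

With `requires = ["tokens"]` in place, Rasa checks at pipeline-build time that a tokenizer runs before your component.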
OLD:
training_data = training_data.training_examples  # list of Message objects
tokens = [list(map(lambda x: x.text, t.get('tokens'))) for t in training_data]  # HERE: fails if no tokenizer has set 'tokens'
processed_tokens = [self.preprocessing(t) for t in tokens]
labeled_data = [(t, x) for t, x in zip(processed_tokens, labels)]
self.clf = NaiveBayesClassifier.train(labeled_data)
NEW:
training_data = training_data.training_examples
tokens = [t.text.split() for t in training_data]  # HERE: whitespace-split the raw text instead of relying on 'tokens'
processed_tokens = [self.preprocessing(t) for t in tokens]
labeled_data = [(t, x) for t, x in zip(processed_tokens, labels)]
self.clf = NaiveBayesClassifier.train(labeled_data)
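To see the difference between the two versions in isolation, here is a self-contained sketch with a stand-in Message class (the real one comes from Rasa's training data API):

```python
# Stand-in for Rasa's Message class, just enough to show both extraction styles.
class Message:
    def __init__(self, text, data=None):
        self.text = text
        self.data = data or {}

    def get(self, key):
        return self.data.get(key)

examples = [Message("the food was great"), Message("terrible service")]

# OLD style needs a tokenizer to have stored 'tokens' on each message;
# here .get('tokens') returns None, so the map/lambda version would crash.
# NEW style just splits the raw text, so no tokenizer is required:
tokens = [t.text.split() for t in examples]
print(tokens)  # [['the', 'food', 'was', 'great'], ['terrible', 'service']]
```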
Secondly, the reason you are getting the same value every time is that the model is not saving/loading correctly. It turns out that rasa.nlu.utils is the culprit: the original Sentiment Analysis component used utils.json_pickle and utils.json_unpickle to save/load the classifier. I'm not sure why this code suddenly broke, but it did.
SOLUTION:
Use pickle instead of rasa.nlu.utils to save/load. See the code below for persist/load.
def _write_model(self, model_file, classifier):
    with open(model_file, "wb") as save_classifier:
        pickle.dump(classifier, save_classifier)

def persist(
    self,
    file_name: Text,
    model_dir: Text,
) -> Optional[Dict[Text, Any]]:
    """Persist this model into the passed directory."""
    if self.clf:
        model_file_name = os.path.join(model_dir, MODEL_FILE_NAME)
        self._write_model(model_file_name, self.clf)
        # this key must match the one read back via meta.get() in load()
        return {"classifier_model": MODEL_FILE_NAME}
    return None
@classmethod
def load(
    cls,
    meta: Dict[Text, Any],
    model_dir: Text = None,
    model_metadata: Metadata = None,
    cached_component: Optional["YOURCLASSNAME"] = None,
    **kwargs: Any,
) -> "YOURCLASSNAME":
    file_name = meta.get("classifier_model")
    classifier_file = os.path.join(model_dir, file_name)
    if os.path.exists(classifier_file):
        with open(classifier_file, "rb") as classifier_f:
            clf = pickle.load(classifier_f)
        return cls(meta, clf)
    return cls(meta)
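The pickle save/load pattern above can be sanity-checked on its own. This round-trip uses a plain dict as a stand-in for the trained classifier and a temp directory as a stand-in for model_dir:

```python
import os
import pickle
import tempfile

# Stand-in "classifier": any picklable object round-trips the same way
# a trained NaiveBayesClassifier would.
clf = {"positive": 0.8, "negative": 0.2}

model_dir = tempfile.mkdtemp()
model_file = os.path.join(model_dir, "classifier.pkl")

# persist: same pattern as _write_model above
with open(model_file, "wb") as f:
    pickle.dump(clf, f)

# load: same pattern as the load() method above
with open(model_file, "rb") as f:
    restored = pickle.load(f)

print(restored == clf)  # True
```

If the round-trip works here but your component still returns the same value every time, double-check that the dict key returned by persist() is the same key that load() reads from meta.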
Good luck!