I am trying to build a custom training data loader for the DIET Classifier. I think the method _create_model_data is what loads the training data for training, but I’m not sure.
In my case I don’t want to change the training data and I want to use the pipeline in the usual way. I have training data with multiple intents but the DIET classifier should only train on the second intent (and skip the first one).
Does anyone have a hint on where I can change how the intent labels are loaded into the DIET Classifier?
Before diving into technicalities here, is there a reason why you need this “second intent” feature? What goal are you trying to accomplish in your conversation?
Hi,
I want to train two models on the same messages. Each model is trained to detect a different intent, under the assumption that every message has two intents. It is also for research purposes.
I tried to select the first intent via the tokenizer:
    def _tokenize_on_split_symbol(self, text: Text) -> List[Text]:
        # Keep only the first part of the label; wrap it in a list to match List[Text].
        words = (
            [text.split(self.intent_split_symbol)[0]]
            if self.intent_tokenization_flag
            else [text]
        )
        return words
But nothing changes: the DIET Classifier still receives both intents during training, not only the first one selected via [0].
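A possible reason the tokenizer change has no visible effect, stated here as an assumption and not confirmed anywhere in this thread: the tokenizer only produces tokens for the intent, while the label itself may still be read from the message's full intent string. The sketch below uses a plain dict as a stand-in for a Rasa message (the keys intent and intent_tokens and the function name tokenize_first_intent are hypothetical) to show that the two values are independent.

```python
from typing import Dict, List


def tokenize_first_intent(label: str, split_symbol: str = "+") -> List[str]:
    """Return only the first part of a multi-intent label, wrapped in a list."""
    return [label.split(split_symbol)[0]]


# Plain dict standing in for a Rasa message; the key names are hypothetical.
message: Dict[str, object] = {"intent": "intent_order+product_pressurecooker"}
message["intent_tokens"] = tokenize_first_intent(str(message["intent"]))

# The tokens were reduced to one part, but the stored label string is untouched.
print(message["intent_tokens"])
print(message["intent"])
```

If that assumption holds, the label would have to be changed on the message itself, not only in its tokens.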
Right. In that case, Rasa does not natively allow for classifiers that detect two intents. We have the “multi-intent trick” but to my knowledge that’s it. An alternative might be to use end-to-end learning but that’s more like “intent-less” action prediction.
I know! I don’t want to use a classifier that detects two intents. That’s why I need a custom load into the classifier…
In my case I don’t want to change the training data and I want to use the pipeline in the usual way. I have training data with multiple intents but the DIET classifier should only train on the second intent (and skip the first one).
It’s unclear what exactly you need here. Could you explain your use-case a bit more? As in, could you describe the kind of virtual assistant you’re trying to create and what needs to happen? That would help me understand what is broken.
I have training data with multiple intents ( abc+xyz ). I will use two pipelines, one for each intent, so I need a custom training data load into each DIET Classifier.
Simple example: I want to order something to do fast cooking. (intent_order+product_pressurecooker)
The first pipeline will detect main intents, the second will detect products in that case. It’s just an example, but I hope you understand the idea behind that.
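The split described above can be sketched as follows. This is a minimal illustration in plain Python, not Rasa code; the function name split_multi_intent and the "+" separator are assumptions based on the abc+xyz notation used in this thread.

```python
from typing import Optional, Tuple


def split_multi_intent(label: str, split_symbol: str = "+") -> Tuple[str, Optional[str]]:
    """Split 'main+product' at the first separator into (main, product).

    The second element is None when the label has no separator.
    """
    main, _, rest = label.partition(split_symbol)
    return main, (rest or None)


main_intent, product_intent = split_multi_intent("intent_order+product_pressurecooker")
print(main_intent)     # label the first pipeline would train on
print(product_intent)  # label the second pipeline would train on
```

Returning None for a missing second part makes it explicit which messages the second pipeline has no label for.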
What I need:
Rasa classes are highly interrelated, and I am struggling to find where the training data/messages get loaded into the DIET Classifier. I tried it via the tokenizer, but it doesn’t work. I want to split the intents in the training data and use only the first or only the second one.
In your example though … why not have an intent buy and then have the product of interest be an entity? That way, a user can indicate that they’re interested in buying multiple items in a single utterance.
It was just an example, and even here you cannot use entities. “Something to cut vegetables very fast” and “an instrument to find metal in the ground” are not entities in the usual sense.
I am familiar with entities, multi-intent classification and so on.
I described the information I need. Could you please help me out?
Maybe a more specific question: Is it enough to modify the preprocess_train_data method to filter the intents? Or does the training data need to be changed even earlier?
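To make the question concrete, here is what filtering the intents before training would do to the examples. The sketch is not Rasa's API: plain dicts stand in for training examples, and the name relabel_examples, the keys text and intent, and the "+" separator are all assumptions. It keeps one part of each label and drops examples that lack that part.

```python
from typing import Dict, List


def relabel_examples(
    examples: List[Dict[str, str]], index: int, split_symbol: str = "+"
) -> List[Dict[str, str]]:
    """Return copies of the examples labeled with only the index-th intent part."""
    relabeled = []
    for example in examples:
        parts = example["intent"].split(split_symbol)
        if index < len(parts):
            # Copy the example so the original training data stays unchanged.
            relabeled.append({**example, "intent": parts[index]})
    return relabeled


examples = [
    {
        "text": "I want to order something to do fast cooking.",
        "intent": "intent_order+product_pressurecooker",
    },
    {"text": "Hi", "intent": "greet"},
]

# Examples without a second intent part are skipped.
second_only = relabel_examples(examples, index=1)
print(second_only)
```

Because the function works on copies, the same unmodified training data could feed both pipelines, each with a different index.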