Questions about multiple intents and their impact on the classifier

Hey all,

I’ve been wondering about this. I haven’t checked or tested the TensorFlow components of the source code, but maybe someone here could shed some light on it and save me from checking things manually.

I have a modelling question about multiple intents:

How does declaring instances of a multi-intent class in nlu.md differ from declaring a whole new class? Is it just a matter of control/syntax, or does classification performance actually improve? Most importantly: does a multi-intent use the marginal information from the intents that compose it? If I activate the multi-intent functionality, will the NLU label something as A + B even with only A instances and B instances in the training set?

If there are indeed multiple labels (in the classification sense) instead of a new class, how are the validation/test metrics affected when the use of multiple intents is declared?

Any help on this would be very much appreciated.

Just to clarify: by multiple intents, do you mean classifying one utterance into multiple classes, i.e. assigning multiple labels to it? For example, “I need a flight to Berlin and a hotel for three days over there. Can you help me with that?” would be classified as request_flight and request_hotel. Is that correct? Or do you mean something else? Can you maybe give an example?

Sure, no problem.

“I need a flight to Berlin and a hotel for three days over there. Can you help me with that?” would be classified as request_flight and request_hotel. Is that correct?

Exactly. This would be labeled as request_flight + request_hotel. But here is the thing: will the model infer these two intents if they were never shown simultaneously as one instance in the training set?

Consider this: you have 100 sentences in nlu.md labeled as request_flight and another 100 labeled as request_hotel, but no examples of request_flight + request_hotel in the dataset. When you declare the multi-intent option for training, will the model infer joint intents from just the marginal intents? Will it infer A + B even if trained only on A and B but never on A + B? I am also considering the situation where, OK, you declare some A + B instances in your nlu.md, and maybe from that the model would infer some C + D even with only C’s and D’s in the dataset and no C + D instance present in the .md file.

In other words: is the model ‘smart’ enough to learn that there are two different intents when presented only with single-intent instances in its training set? If the answer is ‘no’, then I see no reason for this functionality when you can just declare a whole new class/intent that represents the intents together.
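To make the scenario concrete, this is roughly what I mean in nlu.md (the example sentences are just illustrative):

    ## intent:request_flight
    - I need a flight to Berlin

    ## intent:request_hotel
    - I need a hotel for three days

    ## intent:request_flight+request_hotel
    - I need a flight to Berlin and a hotel for three days over there

The question is whether that last block is needed at all, or whether the classifier could come up with request_flight+request_hotel from the first two alone.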

You still need to declare the whole new intent to represent the combined intent. The performance improvement heavily depends on your dataset, but on our datasets we see that setting intent_tokenization_flag to True and providing an appropriate intent_split_symbol increases classification accuracy.
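For reference, assuming you are using the EmbeddingIntentClassifier in a Rasa 1.x style pipeline, the configuration could look roughly like this (the tokenizer and featurizer here are just placeholders for whatever your pipeline already uses):

    language: en
    pipeline:
      - name: "WhitespaceTokenizer"
      - name: "CountVectorsFeaturizer"
      - name: "EmbeddingIntentClassifier"
        # split intent names on the symbol below, so a combined
        # intent like request_flight+request_hotel can share label
        # tokens with the single intents that compose it
        intent_tokenization_flag: true
        intent_split_symbol: "+"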

The easiest way is to try the two versions and compare performance.

Please keep in mind that, in the end, we do not predict multiple intents for an input. We always assign just one intent to an input.

Hi @bayesianwannabe,

in addition to the other answers, there is a “simple” technical one:

If you define a story like the one in the blog article:

* greet
    - utter_greet
* meetup
    - utter_meetup
* affirm+ask_transport
    - utter_affirm_suggest_transport

this would be considered as one intent, so there has to be a corresponding intent in the domain file and in nlu.md. If you then wanted to use the EmbeddingIntentClassifier, you would most probably get an error saying that there have to be at least two different embedded intents - so there is no automation for combining existing “single” intents.
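Concretely, the combined intent has to be declared like any other one. In the domain file that would look roughly like:

    intents:
      - greet
      - meetup
      - affirm+ask_transport

and in nlu.md it needs its own training examples (this utterance is just an illustration):

    ## intent:affirm+ask_transport
    - yes, and how do I get there?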

If there were such automation, imagine the computational effort that would arise from e.g. cross-combining 100 intents.

Perfect. I guess the answers from the thread, and yours as well, resolve my doubts nicely.

Predicting multiple intents seems convenient in a way, but a multi-label paradigm would imply a huge adaptation of the stories data structure, right? Like, if a new pair of intents came up without a story mapping it to an action, it would need a treatment that the stories file doesn’t allow, I guess.

Thank you for the attention to these questions; knowing these details well will save a lot of time as we move on with the design.

I see! Thank you for the answer! It’s good to know about this point, as it affects the way we are designing things and some of our decisions.

If there were such automation, imagine the computational effort that would arise from e.g. cross-combining 100 intents.

I’m not sure about this last point. I thought for a moment that the TensorFlow architecture adopted would switch to something that provides a multi-label classifier. Training would not necessarily require all combinations; it would be more like the model learning the ‘and’ effect, or simply tagging the instance with all classes/intents that score above some threshold. Even considering modeling the possible pairwise or even n-wise interactions, I wonder whether things like shared parameters, or modeling the interactions in a different way, like factorization machines in recommender systems, would help, for example…

Well, with that discussed, I guess the number of intents in my chatbot is going to increase a lot, lol. Again, thank you for the answers.

Interesting. I will run some tests, but I guess the idea of this parameter is more about how the data is represented than about changing anything in the model/architecture of the intent classifier, right?

Thanks!