Binary evaluation of multiple intents

Dear all,

I am working on something similar to emotion recognition. I am trying to figure out how to handle multiple intents, and none of the material I have found so far seems to answer the question I am posing. The usual problem people run into with multiple intents is a “Yes, what are the opening times?” kind of issue, for which there are many existing solutions, all of which lead to your chatbot picking the single best intent.

My problem is that I want to evaluate a single message for 7 different intents. I am not looking to pick only the best one; I am trying to evaluate, for each of the 7 separately, whether it is present in the message.

For example, my input would look something like:

“I went to the dentist today, and he was late. I got very angry at him, and he apologized. Afterwards I felt bad because I had been too mean to him, and I was sad”.

The output I would want would then look something like:

Happy: 0.1, Sad: 0.9, Angry: 0.9, Afraid: 0.2, etc.
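In machine-learning terms, what I am after is multi-label rather than multi-class classification. Outside of Rasa, a rough sketch of the behaviour I want could look like this (scikit-learn; the toy utterances and labels below are made up):

# Multi-label sketch: one independent binary classifier per emotion.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer

# Toy training data; real data would need far more examples per label.
texts = [
    "I am so happy today",
    "He was late and I got furious",
    "I felt terrible and cried all evening",
    "I was angry at him but later felt sad about it",
]
labels = [["happy"], ["angry"], ["sad"], ["angry", "sad"]]

mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(labels)  # one binary column per emotion

clf = make_pipeline(
    TfidfVectorizer(),
    OneVsRestClassifier(LogisticRegression()),  # one classifier per label
)
clf.fit(texts, Y)

probs = clf.predict_proba(["I got very angry, then I was sad"])[0]
for label, p in zip(mlb.classes_, probs):
    print(f"{label}: {p:.2f}")  # scores do not need to sum to 1

Each intent gets its own binary classifier, so the scores are independent and do not need to sum to 1.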

I have considered the standard approach to dealing with multiple intents, but it doesn’t seem feasible here, because I would have to make a separate story for every possible combination of the 7 intents. That means I would have to create 2^7 = 128 different stories and intents, and also populate each of them with training data.

Right now I am keeping the intents separate, which gives me a ranking of probabilities that adds up to 1. Something like:

Sad: 0.4, Angry: 0.4, Afraid: 0.1, Happy: 0.1, etc.

The problem I have with this is that it assumes the different intents are mutually exclusive, which, as the example makes clear, they unfortunately are not.

As a TL;DR: How do I evaluate multiple non-mutually exclusive intents on a single message?

Thank you for your time!

In Rasa, intents are implemented as a classification problem. That means we assume each intent represents a mutually exclusive class; you can’t have an utterance that is both chitchat and faq at the same time.
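To make the “mutually exclusive” part concrete, here is a general illustration of competing versus independent scores (plain Python; this sketches the general principle, not Rasa’s exact internals):

import math

# Hypothetical raw scores for three intent classes.
logits = {"chitchat": 2.0, "faq": 1.5, "thanks": -1.0}

# Softmax: scores compete and always sum to 1 (multi-class view).
z = sum(math.exp(v) for v in logits.values())
softmax = {k: round(math.exp(v) / z, 3) for k, v in logits.items()}

# Sigmoid: each score stands on its own (multi-label view).
sigmoid = {k: round(1 / (1 + math.exp(-v)), 3) for k, v in logits.items()}

print(softmax)  # {'chitchat': 0.604, 'faq': 0.366, 'thanks': 0.03}
print(sigmoid)  # {'chitchat': 0.881, 'faq': 0.818, 'thanks': 0.269}

With the softmax view, raising one intent’s score necessarily lowers the others, which is exactly the mutual exclusivity assumption being discussed.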

If you want to have a peek at what your pipeline thinks of an utterance, though, you can have a look inside the predictions by running:

rasa shell nlu

This will give you some more info on the NLU predictions. You can also get more info by running:

rasa interactive

But this command will also aid in labeling.

@koaning - On a related note, while labelling training examples, how are we to treat multiple intents? I can think of the two ways below:

  1. Additive - If an intent appears multiple times, we label it multiple times.
     For example, an utterance like:

     Hello, thank you for reaching out.  I was wondering if you would be free tomorrow evening.  Thanks again for your patience.

     could be labelled as hello+thanks+availability+thanks.

  2. Flags - If an intent appears at least once, then that intent is set to true. Else it is not part of the response.

With this approach, the above utterance would simply be classified as hello+thanks+availability. The second thanks intent does not make it more true, so the response only contains the thanks intent once.
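To make the flags rule concrete, here is a tiny sketch (plain Python; the helper name is hypothetical):

# Sketch of the "flags" labelling rule: repeated intents collapse to one.
def to_flag_label(intents):
    seen = []
    for intent in intents:
        if intent not in seen:
            seen.append(intent)  # keep first-appearance order, drop repeats
    return "+".join(seen)

print(to_flag_label(["hello", "thanks", "availability", "thanks"]))
# -> hello+thanks+availability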

If the second approach is how Rasa is dealing with multiple intents, then how do you deal with order? For example, is hello+thanks+availability treated differently to availability+hello+thanks or any other permutation of the three intents identified?


@koaning Any update on this? :slight_smile:

Sorry for the late reply, just got back from my honeymoon.

Rasa does not support multi-intents. Each utterance belongs to one, and only one, intent.

If you add the same utterance to three intents then odds are that you’ll end up confusing the classifier because it only predicts one class for each utterance.

@koaning Thanks for responding! I don’t think I understood your response well. I still see pipeline specifications for multiple intents in the Rasa docs and in the blog.

Ah. Sorry, it seems I also misunderstood you and there’s a feature in Rasa that I wasn’t aware of.

If you have a look at the example in this blogpost:

## intent: meetup
- I am new to the area. What meetups I could join in Berlin? 
- I have just moved to Berlin. Can you suggest any cool meetups for me?

## intent: affirm+ask_transport
- Yes. How do I get there?
- Sounds good. Do you know how I could get there from home?

Then, yes, you could say that we’re combining the intents affirm and ask_transport, but the same blogpost also demonstrates that this is classified as a single class. From the same blogpost:

{  
   'intent':{  
      'name':'affirm+ask_transport',
      'confidence':0.918471097946167
   },
   'entities':[  

   ],
   'intent_ranking':[  
      {  
         'name':'affirm+ask_transport',
         'confidence':0.918471097946167
      },
      {  
         'name':'ask_transport',
         'confidence':0.32662829756736755
      },
      {  
         'name':'thanks+goodbye',
         'confidence':0.004435105249285698
      }
...

I wasn’t aware that you were able to construct examples like affirm+ask_transport, but the output of DIET remains a classification. This does mean that any “combined intent classes” still need to be created as such beforehand and that you’ll still need to write responses for them.

I’ll double check internally about any algorithmic details but back to your issue!

What is the exact use case you have for evaluating a message that contains 7 different intents? You should be able to get this information by using a custom action, since the assigned confidence of each intent is available from there.
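For example, something like this (a minimal rasa-sdk sketch; the action name is hypothetical):

from typing import Any, Dict, List, Text

from rasa_sdk import Action, Tracker
from rasa_sdk.executor import CollectingDispatcher


class ActionReportIntentScores(Action):
    def name(self) -> Text:
        return "action_report_intent_scores"

    def run(
        self,
        dispatcher: CollectingDispatcher,
        tracker: Tracker,
        domain: Dict[Text, Any],
    ) -> List[Dict[Text, Any]]:
        # The confidence of every intent for the latest user message.
        ranking = tracker.latest_message.get("intent_ranking", [])
        scores = {item["name"]: item["confidence"] for item in ranking}
        dispatcher.utter_message(text=f"Intent scores: {scores}")
        return []

The intent_ranking structure the action reads is the same one shown in the JSON output above.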

@koaning So in the training examples, I’ve noticed that punctuation is often missing or badly applied, and there are a lot of run-on sentences. Something like:

what time are you open I was looking to come visit for a consultation thanks

We were earlier looking to build something where we serially pass each small utterance to the bot and process the intents from each of those utterances. But because of the high number of such examples, it becomes difficult to reliably break a single big utterance into multiple sentences. Hence, we were hoping to use the multiple-intent feature of Rasa to catch those multiple intents within a single utterance. If there’s an alternative we should be looking at for processing longer utterances, I’m all ears.
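To illustrate why the splitting is unreliable: a naive punctuation-based splitter (a plain-Python sketch) has nothing to anchor on in text like the example above:

import re

utterance = ("what time are you open I was looking to come visit "
             "for a consultation thanks")

# Split on sentence-ending punctuation followed by whitespace.
parts = re.split(r"(?<=[.!?])\s+", utterance)
print(parts)
# -> the whole utterance comes back as one chunk, since there is no
#    punctuation to split on, so nothing can be routed separately.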

Now, there aren’t many examples where we have 7 different intents (I’ll come back to this in a bit), but we do have multiple cases where an intent “repeats” in a sentence. So in my first example earlier in this thread, the intent “thanks” can appear twice in the same utterance:

Hello, thank you for reaching out.  I was wondering if you would be free tomorrow evening.  Thanks again for your patience.

In over 2 years of production conversation data, these examples are VERY common. The docs and the blog do not explain whether I should be labeling the “thanks” intent twice in the above example. I’ve currently done so in my dataset, and I need to know if I should change my approach before the dataset becomes too big.

For nearly 100% of our use cases, we’re also looking to build business logic where a “hello+thanks+availability” class is identical to “hello+availability+thanks”, so order becomes important when labeling the training and test examples. When computing the F1 score, such a permutation is counted as a misclassification, and since these cases are very common in the dataset, the F1 score tends not to reflect how good the model actually is.
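One way to sidestep this (a sketch; the canon helper is hypothetical) would be to canonicalise both gold and predicted labels before scoring, so permutations of the same intent set compare equal:

from sklearn.metrics import f1_score

def canon(label):
    # Sort and de-duplicate the parts of a multi-intent label.
    return "+".join(sorted(set(label.split("+"))))

gold = ["hello+thanks+availability", "thanks"]
pred = ["availability+hello+thanks", "thanks"]

gold_c = [canon(l) for l in gold]
pred_c = [canon(l) for l in pred]
print(f1_score(gold_c, pred_c, average="micro"))
# -> 1.0 once permutations are normalised away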

For the relatively fewer cases where we have 7 different intents in a single utterance, I’m trying to get the bot to fall back. I wanted to also check with you if this is a good approach.

I can also explain over a call if you need more details.

A fallback might be an option. I’m also wondering if there’s something you can do along the lines of “conversational design”.

I’m a bit surprised by your example. I’ve not seen users who type utterances like:

Hello, thank you for reaching out.  I was wondering if you would be free tomorrow evening.  Thanks again for your patience.

It feels more like an automated message from a customer sales rep, or something that was generated by a script. Just to check: are these texts written by actual users, or are they sourced from another dataset?

They’re actual examples from users in production. Could you say more about what you mean by “conversational design”?

Any update @koaning? :slight_smile:

By conversational design I mean that you can sometimes steer users into a structured conversational flow. For example, you can narrow down the scope a lot by presenting the user with buttons instead of open-ended questions.
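In Rasa you could, for instance, send buttons from a custom action (a rasa-sdk sketch; the action name, titles, and payloads are made up):

from typing import Any, Dict, List, Text

from rasa_sdk import Action, Tracker
from rasa_sdk.executor import CollectingDispatcher


class ActionOfferChoices(Action):
    def name(self) -> Text:
        return "action_offer_choices"

    def run(
        self,
        dispatcher: CollectingDispatcher,
        tracker: Tracker,
        domain: Dict[Text, Any],
    ) -> List[Dict[Text, Any]]:
        # Each button sends a fixed intent payload, so the reply cannot
        # be a long free-text message that mixes several intents.
        dispatcher.utter_message(
            text="What would you like to do?",
            buttons=[
                {"title": "Check opening times", "payload": "/ask_opening_times"},
                {"title": "Check availability", "payload": "/ask_availability"},
            ],
        )
        return []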