I was reading through Rasa NLU in Depth: Intent Classification and it gave me a different idea of what I thought was happening. So my question is (and perhaps the answer differs for the various EntityExtraction choices): to what extent, if any, does the extraction of entities influence the ultimate intent prediction?
The example given in the blog post is one where the hypothetical user has two separate intents, distinguished in one case by a person name and in the other by a date. That is to say, the two intents are identical except for the entity slot present in each. It is quite a common scenario for me as well, except that I don’t have full control over intent creation. The recommendation, as I gather, is to combine the two intents into a single intent and then handle the difference in the core component.
For some of the blackbox tools (I use DialogFlow a lot), it seems that the “NLU” portion does have its intent influenced by which entities were or were not picked out. We commonly see for example that if an expected person is missed, we would end up at a completely different intent.
So is it true that in Rasa the two processes, intent and entity detection, are completely separate? I had hoped to use a SpaCy entity extractor along with the EmbeddingIntentClassifier. Ultimately we would like a custom SpaCy model; I see that is not supported, but would the off-the-shelf SpaCy NER model ever be used to influence which prediction is chosen?
I am sure I have stated my use case around these forums before, but using core is just not possible at the moment, so entities must be handled in the NLU for me.
Hi there @grjasewe, yes, at the moment they are completely separate. We have discussed before whether either intent classification should affect entity extraction (i.e. should Virginia be picked up as a name or a place? Depends on the intent) or if entity extraction should affect intent classification (I found an email, the probability the intent is inform increases).
I see that you don’t/can’t use core. I’m not sure how you are handling dialogue management, but I would recommend handling a case like this there in the same way, branching on the entities that are picked up.
We commonly see for example that if an expected person is missed, we would end up at a completely different intent.
This is a valid concern with any approach where entity extraction affects intent classification. We prefer to keep intent classification separate: the intent will then (hopefully) be picked up correctly, and Core will handle the missing entity, provided you wrote your stories accordingly (i.e. subscribe_newsletter w/ email entity --> action_subscribe_user, subscribe_newsletter w/o email entity --> utter_ask_email).
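For anyone who, like the original poster, is not using Core, the same branching can be sketched in plain Python. This is only an illustration, not how Core works internally: `choose_action` and the `fallback` action name are hypothetical, and the input dict is assumed to follow the general shape of an NLU parse result (an `intent` with a `name`, plus a list of `entities`). The other action names come from the story example above.

```python
from typing import Any, Dict


def choose_action(parsed: Dict[str, Any]) -> str:
    """Pick the next action from the intent first, then refine by entities.

    `parsed` is assumed to look like:
    {"intent": {"name": ...}, "entities": [{"entity": ..., "value": ...}, ...]}
    """
    intent = parsed.get("intent", {}).get("name")
    # Collect the entity types that were extracted from the message.
    entity_types = {e["entity"] for e in parsed.get("entities", [])}

    if intent == "subscribe_newsletter":
        # Same intent either way; the entity only decides the follow-up.
        if "email" in entity_types:
            return "action_subscribe_user"
        return "utter_ask_email"

    # Hypothetical catch-all for intents this sketch does not cover.
    return "fallback"
```

The point of the design is that a missed email entity changes only the follow-up action, not the intent itself.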
Using spacy entity extractor with the embedding intent classifier should be fine, I think – you can also use multiple entity extractors in a given pipeline. Can you point to where you found that custom spacy models are unsupported? I think there might have been a miscommunication there. It takes a little more work, but we’ve successfully set up custom spacy models for languages that spacy doesn’t have a model for (e.g. for Russian there is an open source ru2 model), so I don’t know why you wouldn’t be able to use whatever spacy model you like.
Thanks, that will be very helpful to share with my team and does clarify our thoughts! We are a large team, so even if I would say method X is the correct method to go about doing things, it is not always so easy to get method X applied.
Regarding custom NER models (which would be very useful, because we are leaning towards custom NER models and see that mapping an existing model will not work for us), let me share the link that supports what I said had been reported. I am referring to https://rasa.com/docs/rasa/nlu/components/#spacyentityextractor. On that page, I read the following:
“As of now, this component can only use the spacy builtin entity extraction models and can not be retrained. This extractor does not provide any confidence scores.”
If that is not the case, or if I misunderstand, we would certainly like guidance on how to do this. Our case is a custom English model where we may have a few of our own specialized entities, and where the provided ORG entity is perhaps a superset of what we want (something more like COMPANY instead).
Ah okay, I misunderstood, as this has to do with spacy entities rather than word embeddings. Why did you decide to create your custom entities within the spacy model instead of using the regular spacy model and adding on custom entities in rasa with the CRFEntityExtractor? (I assume because you are also doing this with dialogflow?)
As of now, this component can only use the spacy builtin entity extraction models and can not be retrained.
I’m not 100 percent sure, but if everything is defined in the spacy manner with your specific spacy entities, it might be that this entity extractor will work to extract your spacy entities.
I can try to use both to get those additional entity types if that is the best approach. However as I said the off-the-shelf ORG entity does not completely work for us so we would only want to use PERSON from spacy perhaps. Therefore, it almost becomes a case of just using CRFEntityExtractor alone instead for all our entities (not a bad thing necessarily). We definitely do see those embeddings you mention as potentially useful from spacy hence why we are using spacy to the extent we can in our pipelines we are trying out.
As for DialogFlow, that is what we currently use; however, we are leaning towards something else for our next release, perhaps rasa, which we have been looking at for many months. The benefit is greater control over and customization of the training process, including custom annotators as need be. Going from entities affecting intent prediction to entity and intent prediction being separate is very much a different way of thinking, though, so much so that it might change our system architecture. I don’t see why NER needs to be included in the rasa pipeline at all, since we can have whatever NER tool we like running on a separate server, independent of and in parallel with the rasa flow. For example, some of us are really keen on Stanford NER as our NER tool. I cannot say either way is better, because in this thread we have identified one of the problems with entities affecting intent. It is just that, as in my original message, we have a fair amount of overlap in our intents, and it is not very easy from a process standpoint to eliminate this overlap, so we cope the best we can.
Yeah, I would say probably your best bet would be to use the CRF for your personal entities, and then only choose the entities from spacy that you want to use, as they’ve been made configurable.
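As a rough sketch of that setup, the pipeline could look like the structure below. It is written as Python data only so it can be printed and inspected; the component names and the `dimensions` option are assumed from the Rasa 1.x component docs and should be checked against your installed version. Since JSON is a subset of YAML 1.2, the printed output can usually be pasted into a config file.

```python
import json

# Assumed Rasa 1.x component names; verify against your installed version.
pipeline = [
    {"name": "SpacyNLP"},                    # loads the spaCy language model
    {"name": "SpacyTokenizer"},
    {"name": "CountVectorsFeaturizer"},      # features for the embedding classifier
    {"name": "EmbeddingIntentClassifier"},   # intent prediction, independent of entities
    # Keep only PERSON from spaCy's built-in NER (ORG is deliberately left out).
    {"name": "SpacyEntityExtractor", "dimensions": ["PERSON"]},
    # Trained on your own annotated examples, e.g. a COMPANY entity.
    {"name": "CRFEntityExtractor"},
]

config = {"language": "en", "pipeline": pipeline}
print(json.dumps(config, indent=2))
```

Both extractors contribute entities to the same parse result, and neither feeds into the intent classifier.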
I don’t see why NER needs to be included in the rasa pipeline at all, since we can have whatever NER tool we like running on a separate server, independent of and in parallel with the rasa flow.
You’re absolutely welcome to either add a custom component that handles your NER, or a message preprocessor that processes the messages before they get to the pipeline. All it needs to provide are the entities in our format. We provide it in our pipelines because most people do not host their own NER server.
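To give an idea of what "entities in our format" could mean in practice, here is a minimal sketch that converts spans from an external NER tool into entity dicts with `start`, `end`, `value`, `entity` and `extractor` keys, which is the shape NLU parse results use as far as I know. The `(label, start, end)` span tuples, the `to_rasa_entities` helper and the `external_ner` extractor name are all made up for illustration; a real custom component would also need to plug into the pipeline, which is not shown here.

```python
from typing import Any, Dict, List, Tuple

# (label, start_char, end_char), as a hypothetical external NER server might return.
Span = Tuple[str, int, int]


def to_rasa_entities(
    text: str, spans: List[Span], extractor: str = "external_ner"
) -> List[Dict[str, Any]]:
    """Convert character-offset spans into entity dicts for the parse result."""
    entities = []
    for label, start, end in spans:
        if not (0 <= start < end <= len(text)):
            continue  # skip spans that do not fit inside the message
        entities.append(
            {
                "start": start,
                "end": end,
                "value": text[start:end],
                "entity": label,
                "extractor": extractor,
            }
        )
    return entities


text = "Book a meeting with Ada Lovelace at Acme"
start = text.index("Ada Lovelace")
print(to_rasa_entities(text, [("PERSON", start, start + len("Ada Lovelace"))]))
```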
However, if you’re using our intent recognition, it’s highly probable that the overlapping intents will cause confusion in the classification.
@erohmensing one thing is still not clear for me:
If intent and entity matching are separate (that is clear), do entity example values/synonyms that do not appear in the intent training phrases still affect which intent is matched?
If they do, and an entity had 100 example values, it would mean we should include all of them in the intent training phrases, and in different variations, to make it work really accurately.