I have a bot that uses a lookup table, which is why I need the CRFEntityExtractor to extract my entities.
My intents are quite unbalanced, so I would like to use the batch_strategy: sequence option of DIETClassifier to help with this.
The CRFEntityExtractor detects the entities correctly, but the DIETClassifier does not detect the correct intent (because it doesn’t know the entities from the lookup table).
Is there a way to tell the DIETClassifier which entity extractor to use?
If not, does anyone have an idea how to handle unbalanced intents together with lookup tables?
Just to confirm, are you worried that DIET gets the intents wrong, or the entities? The way DIET handles intents is independent of the entities extracted by the CRFEntityExtractor.
Hi @PaulB, I’m also curious about your request, so here is what I think (please correct me if I’m wrong). From my understanding, each component in the NLU pipeline hands its result to the next one. So if CRFEntityExtractor passes its results on to DIETClassifier, shouldn’t you consider setting entity_recognition to false if you don’t want the DIETClassifier to influence the extracted entities?
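A pipeline along those lines might look like the sketch below. Only CRFEntityExtractor, DIETClassifier, entity_recognition and batch_strategy come from this thread; the tokenizer, the featurizers, the language and the epochs value are assumptions added to make the example complete, so adapt them to your own config.

```yaml
# config.yml (illustrative sketch, not a tested configuration)
language: en                          # assumption: English bot
pipeline:
  - name: WhitespaceTokenizer         # assumption: any tokenizer works here
  - name: RegexFeaturizer             # turns lookup tables and regexes into features
  - name: LexicalSyntacticFeaturizer  # assumption
  - name: CountVectorsFeaturizer      # assumption
  - name: CRFEntityExtractor          # extracts the entities
  - name: DIETClassifier
    entity_recognition: false         # DIET only classifies intents
    batch_strategy: sequence          # value from the question; "balanced" is the other option
    epochs: 100                       # assumption: pick what suits your data
```

With entity_recognition disabled, every entity in the output should come from CRFEntityExtractor, which makes it easier to see which component is responsible for a wrong prediction.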
Hello @pandaxar,
Thank you for your suggestion.
I already tried the pipeline you suggested, but it’s not working. DIET seems to ignore the entities previously extracted by the CRF.
Hello PaulB,
I’m keen to help, as I’ll learn how things work too.
Could you tell us the F1 score for entity extraction with the NLU pipeline you used? (Leave out the Core policies for now.)
Hm, the scores are pretty high, so how come the entities don’t get extracted? Can you enable DIET for entity extraction and share its scores with us again?
(Does the pipeline without CRFEntityExtractor fail to extract the entities in the lookup tables?)
(If so, would switching the positions of the two components solve the issue?)
(edit1: also, did you include the lookup tables in the training data? xd it would be hilarious if you hadn’t)
(edit2: hm, it seems CRFEntityExtractor and DIETClassifier aren’t used together in any Rasa repositories)
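For reference, this is roughly what I mean by having the lookup table in the training data. It is a minimal sketch that assumes the YAML training data format of Rasa 2.x and later; the entity name city, the intent name book_trip and all the examples are made up for illustration.

```yaml
# data/nlu.yml (illustrative sketch; names and examples are hypothetical)
nlu:
  - lookup: city                # the lookup table, named after the entity
    examples: |
      - Paris
      - Lyon
      - Marseille
  - intent: book_trip
    examples: |
      - I want to go to [Paris](city)
      - book me a trip to [Lyon](city)
```

The annotated examples still matter: the lookup table only adds an extra feature, so the extractor needs to see the entity used in some training sentences.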
There was a post on the forum, probably last week, where someone used the CRFEntityExtractor for entity extraction and the DIETClassifier for intent classification, but I couldn’t find it.
Hi there! I haven’t found an answer on how to make an entity extractor extract only specific entities. I’ve spent some time looking and have found neither documentation nor examples that could lead to a solution for version 3.0.
I specifically need to use CRFEntityExtractor for entity1 from lookup tables and Duckling for entity2; the rest are fine with DIET. I have seen official Rasa3.# videos mentioning that extractors can be assigned to specific entities, but the implementation is nowhere to be found.
In Rasa, when both the CRFEntityExtractor and the DIETClassifier are in your pipeline, the components run in the order listed. However, as noted earlier in this thread, the way DIET classifies intents is independent of the entities extracted by the CRFEntityExtractor. You’ve mentioned that the DIETClassifier is not detecting the correct intent because it doesn’t know the entities from the lookup table, so the cause is probably elsewhere, for example in the unbalanced training data.
As of my last knowledge update in September 2021, specifying which entity extractor to use with a particular classifier within the pipeline was not a built-in feature in Rasa. Instead, Rasa usually relies on the order of components in the pipeline.
Here are some suggestions to handle unbalanced intents and lookup tables:
Data Augmentation: If you have limited training data for unbalanced intents, consider augmenting your training data. You can create more examples for underrepresented intents to help the DIETClassifier learn them better.
Balancing Data: Try to balance your training data by adding more examples for the underrepresented intents. This can help the classifier perform better.
Threshold Adjustment: Adjust the confidence threshold for intent recognition. You can set a lower threshold for the DIETClassifier to be more inclusive of intents. However, be cautious with this approach, as it may increase the likelihood of incorrect intent classifications.
Custom Actions: If the lookup table is critical for your bot’s functionality, consider using custom actions to handle entities from the lookup table separately. You can write a custom action that processes the lookup table entities and sets relevant slots or context.
Slot Filling: Use slot filling to capture important information from user messages and make it available to the DIETClassifier. This information can be used to help the classifier determine the intent.
Redefine Training Data: Carefully review and redefine your training data to ensure that it captures the conversational patterns and entity use cases specific to your bot.
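As an illustration of the threshold adjustment suggestion, here is a minimal sketch that assumes a Rasa 2.x or 3.x pipeline with the FallbackClassifier component. The two threshold values are placeholders chosen for the example, not recommendations, and the rest of the pipeline is left out.

```yaml
# config.yml (illustrative fragment; values are placeholders)
pipeline:
  # ... tokenizer, featurizers and intent classifier go here ...
  - name: FallbackClassifier
    threshold: 0.3             # below this confidence, predict nlu_fallback
    ambiguity_threshold: 0.1   # also fall back if the top two intents are this close
```

Lowering threshold lets more low-confidence predictions through, which may help rare intents but also raises the risk of wrong classifications, so it is worth checking against a held-out test set.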