DIETClassifier does not generalize?

Hi all,

I am wondering whether some of you have run into a similar problem with the DIETClassifier. I am currently testing how well generalization works with the standard scenario of flying from A to B, and now I am stuck: there is no generalization happening at all. It looks like the classifier is doing a plain word mapping, because entity extraction only works as long as I use cities that appear in the NLU training data.

Any ideas?

Andre

Hi @AndreD,

From your example, do you mean that the entity extraction cannot correctly identify the location and the destination when you give input to the bot?

Hi Murali,

Exactly. I get good entity matches (>60%) when I use location names that are mentioned in the training data. When I take similar utterances and change only the location name - let’s say Berlin -> Pune - I get no matches at all…

BR Andre

@AndreD, have you tried Entities & Roles?

Declare them like this in stories.md:

- I am from [UK]{"entity":"Country", "role":"currentlocation"} and want to go to [USA]{"entity":"Country", "role":"destination"}

- I am from [USA]{"entity":"Country", "role":"currentlocation"} and want to go to [UK]{"entity":"Country", "role":"destination"}

You can then read these values in a custom action and utter a response from there.
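A minimal sketch of what reading these role-tagged values in a custom action could look like (the action name is just illustrative, and it assumes the slots exist in the domain):

from typing import Any, Dict, List, Text

from rasa_sdk import Action, Tracker
from rasa_sdk.events import SlotSet
from rasa_sdk.executor import CollectingDispatcher


class ActionTellTrip(Action):
    """Illustrative action: reads the role-tagged Country entities from the last message."""

    def name(self) -> Text:
        return "action_tell_trip"

    def run(self,
            dispatcher: CollectingDispatcher,
            tracker: Tracker,
            domain: Dict[Text, Any]) -> List[Dict[Text, Any]]:
        # Each extracted entity is a dict with "entity", "value" and (if annotated) "role".
        entities = tracker.latest_message.get("entities", [])
        current = next((e["value"] for e in entities
                        if e.get("entity") == "Country" and e.get("role") == "currentlocation"), None)
        destination = next((e["value"] for e in entities
                            if e.get("entity") == "Country" and e.get("role") == "destination"), None)

        dispatcher.utter_message(f"You are in {current} and want to go to {destination}.")
        # Assumes "currentlocation" and "destination" slots are declared in the domain.
        return [SlotSet("currentlocation", current), SlotSet("destination", destination)]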

Do you think it makes any difference when I use roles in addition?

Yes, each entity will now also have a role. The entity is a Country in my case, but it has two options: it can be either a currentlocation or a destination. I used a FormAction here, so it does the automatic slot filling for me.


If you want to try a different approach, you can write an if/else condition and assume the user will utter the current location first and the destination second - I have done an example like that before. By doing this, however, you lose the NER capabilities.

Murali, thank you so much for helping me out here :slightly_smiling_face:

But let’s please start from the original question. Before I can dig any deeper into the custom actions, I need to understand the intent/entity extraction problem…

Coming back to your comments above: why did you mention stories.md? I wouldn’t put these utterances into the stories file, but into nlu.md, would I?

Apologies, I declared them in nlu.md but mentioned stories.md.

Yes, it goes in nlu.md. Working late hours on my projects got to me a bit. :sweat_smile:

No problem. How many examples do you have annotated for the Country entity? Looking at the image I see only the two you mentioned before. I am asking because I am wondering whether this is enough training data for the model to generalize …

Can you also show me your pipeline?

This is how mine looks currently …

[image: pipeline configuration]

I have now reproduced your “Country” example one-to-one. I also used the en language model. I do not see any generalization happening either :thinking:

This is my config file.

What does your actions.py look like? I have only two examples in my nlu.md.

I cannot see any major differences that could explain my observations…

Using your Country example, I get the following at runtime:

When I utter the sentence “I am in UK and want to go to the USA”, the extractor delivers two Country entities as expected.

When I utter the sentence “I am in India and want to go to Australia”, no entities are extracted at all, which is not what I expected - do you agree? Maybe I totally misunderstood the concept, but …

I have no code in my actions.py so far, as I first wanted to understand why the extraction isn’t working …
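For reference, I am checking the raw extraction with rasa shell nlu. For the first sentence, the parsed output contains the two entities with their roles, roughly in this shape (shortened and illustrative, exact fields may differ by version):

"entities": [
  {"entity": "Country", "value": "UK", "role": "currentlocation", "extractor": "DIETClassifier"},
  {"entity": "Country", "value": "USA", "role": "destination", "extractor": "DIETClassifier"}
]

For the second sentence, the entities list simply comes back empty.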

Got it.

Now you need to write some code in actions.py to extract those entities and store them in slots, i.e. map each entity role to its slot, and then utter from there or do whatever steps you want.

You can try this example code; it will map the entities to slots based on their roles:

from typing import Any, Text, Dict, List
from rasa_sdk import Tracker
from rasa_sdk.executor import CollectingDispatcher
from rasa_sdk.forms import FormAction


class CountryForm(FormAction):

    def name(self) -> Text:
        return "CountryForm"

    @staticmethod
    def required_slots(tracker: Tracker) -> List[Text]:
        """A list of required slots that the form has to fill."""
        return ["currentlocation", "destination"]

    def slot_mappings(self) -> Dict[Text, Any]:
        # Fill each slot from the Country entity with the matching role.
        return {
            "currentlocation": self.from_entity(entity="Country", role="currentlocation"),
            "destination": self.from_entity(entity="Country", role="destination"),
        }

    def submit(self,
               dispatcher: CollectingDispatcher,
               tracker: Tracker,
               domain: Dict[Text, Any]) -> List[Dict]:
        # Called once all required slots are filled.
        current = tracker.get_slot("currentlocation")
        des = tracker.get_slot("destination")
        print("Current", current)
        print("des", des)

        dispatcher.utter_message("I am from {} and I want to go to {}".format(current, des))
        return []

stories.md

## country test 
* greet
- CountryForm
- form{"name": "CountryForm"}
- form{"name": null}

nlu.md

config.yml

- name: FormPolicy

Do not forget to declare the FormPolicy in your config.yml.
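For completeness, the domain also needs the entity, the two slots and the form registered. Roughly like this (a sketch for the Rasa 1.x domain format; the text slot type is an assumption):

entities:
  - Country

slots:
  currentlocation:
    type: text
  destination:
    type: text

forms:
  - CountryForm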

Hi Murali,

Thanks a lot, I will definitely do so… I will report back on whether it works. It is in any case part of my learning process with the Rasa stack :slight_smile:

But this will not help with the NLU issue as such, since I can only store something that has been extracted in the first place. As mentioned before in the ‘Country’ example:

…When I utter the sentence “I am in UK and want to go to the USA”, the extractor delivers two Country entities as expected.

When I utter the sentence “I am in India and want to go to Australia”, no entities are extracted at all, which is not what I expected - do you agree? Maybe I totally misunderstood the concept, but …

If you have not declared that in your nlu.md, then you cannot extract those entities.

So basically this was my main question… and to be honest, I really have doubts about that :wink:

Why should I use a machine learning mechanism for something that a simple mapping table can solve even better?

Reading the following in the documentation:

“…in order to properly train your model with entities that have roles/groups, make sure to include enough training data examples for every combination of entity and role/group label. Also make sure to have some variations in your training data, so that the model is able to generalize. For example, you should not only have example like fly FROM x TO y , but also include examples like fly TO y FROM x …”

was, to my understanding, an indication that the model is meant to generalize from several examples …
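Following that advice, I would expect the training data to need a lot more variation than two sentences - roughly along these lines (the intent name and countries are just illustrative):

## intent:inform_trip
- I am from [UK]{"entity":"Country", "role":"currentlocation"} and want to go to [USA]{"entity":"Country", "role":"destination"}
- I want to go to [Germany]{"entity":"Country", "role":"destination"} and I am currently in [France]{"entity":"Country", "role":"currentlocation"}
- flying to [Japan]{"entity":"Country", "role":"destination"} from [Brazil]{"entity":"Country", "role":"currentlocation"}
- I live in [Spain]{"entity":"Country", "role":"currentlocation"} but would like to visit [Canada]{"entity":"Country", "role":"destination"}
- from [India]{"entity":"Country", "role":"currentlocation"} to [Australia]{"entity":"Country", "role":"destination"} please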

Try this pipeline. It works pretty well for me:

pipeline:
  - name: ConveRTTokenizer
  - name: ConveRTFeaturizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CRFEntityExtractor
  # added new
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: "char_wb"
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    epochs: 50
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100

Thanks for helping me out. I tried it, but I am running into a runtime error. It seems that the dependencies are not correct, as I got this while training the NLU model:

undefined symbol: _ZN10tensorflow8OpKernel11TraceStringEPNS_15OpKernelContextEb

It may help if you could show me your full package list.
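Something like this would already be enough to compare the relevant versions - the ConveRT components pull in extra TensorFlow packages (e.g. tensorflow_text), and a version mismatch between those and tensorflow itself is a common cause of this kind of undefined-symbol error:

pip freeze | grep -i -E "rasa|tensorflow"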