Entity not identified (mentioned in the training data)

ManiNuthi · August 22, 2020, 9:02pm

I am building a bot for travel services that needs to identify the source and destination in a sentence, e.g. "I want to go to New York from Malibu".

I have added lots of similar examples under the related intent.

My locations are in a lookup file. I added sample locations under the intent “inform” and set the path of the file as a lookup. Even so, the bot fails to identify locations that are in the lookup file, and some of the locations are classified as other intents.

I looked for a solution online and tried adding more samples (>100) to “inform”. With this, several more locations are identified, but the issue persists for a few locations.

There are 20k location names in the lookup file. Is the size the problem, or can I fix this some other way?
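
For context, a lookup table is not a dictionary matcher. As far as I understand it, RegexFeaturizer turns the entries into a case-insensitive regex and passes "this token matched" to the model as one feature among many, so the entity extractor still has to learn from annotated examples that the feature matters. The sketch below imitates that idea with the standard library only. The helper names `build_lookup_pattern` and `token_flags` are made up for illustration and are not Rasa APIs.

```python
import re


def build_lookup_pattern(entries):
    """Compile one case-insensitive pattern from all lookup entries."""
    # Longest entries first, so a longer name wins over a shorter prefix.
    ordered = sorted(
        {e.strip().lower() for e in entries if e.strip()},
        key=len,
        reverse=True,
    )
    alternatives = "|".join(re.escape(e) for e in ordered)
    return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)


def token_flags(text, pattern):
    """Return (token, matched) pairs for whitespace-separated tokens.

    matched is True when the token overlaps a lookup match. This is only
    a feature; it does not label the token as an entity.
    """
    spans = [m.span() for m in pattern.finditer(text)]
    flags = []
    for tok in re.finditer(r"\S+", text):
        hit = any(tok.start() < end and start < tok.end() for start, end in spans)
        flags.append((tok.group(), hit))
    return flags


# Hypothetical three-entry lookup; the real file has about 20k names.
locations = ["New York", "Malibu", "Tirupati"]
pattern = build_lookup_pattern(locations)

for token, matched in token_flags("I want to go to New York from Malibu", pattern):
    print(f"{token:10s} lookup_match={matched}")
```

If this picture is right, the size of the file is less likely to be the problem than how often the match feature co-occurs with a labelled entity in the training examples.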

tyd · August 24, 2020, 9:14am

Hi @ManiNuthi. If you are looking to identify source and destination cities, I would recommend looking at entity roles and groups.

ManiNuthi · August 24, 2020, 4:49pm

This is cool. But the error still persists.

I changed my samples from "I want to travel from [NYC](Source) to [LA](Destination)" to "I want to travel from [NYC]{"entity": "location", "role": "Source"} to [LA]{"entity": "location", "role": "Destination"}".

Slot mapping:
    def slot_mappings(self) -> Dict[Text, Union[Dict, List[Dict]]]:
        return {
            "Source": [self.from_entity(entity="location", role="Source")],
            "Destination": [self.from_entity(entity="location", role="Destination")],
        }
 
To connect the examples with the lookup, I placed samples like these under the intent 'inform':
 [new york](location)
 [brroklyn](location)

I also tried another format:
[new york]{"entity": "location", "role": "Destination"}
[new york]{"entity": "location", "role": "Source"}

But still, the error appears.

This is my NLP pipeline:
language: te
pipeline:
  - name: SpacyNLP
  - name: SpacyTokenizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: "char_wb"
    min_ngram: 1
    max_ngram: 4
  - name: RegexFeaturizer
  - name: EntitySynonymMapper
  - name: DIETClassifier
    epochs: 130

Tanja · August 26, 2020, 9:39am

@ManiNuthi Using entity roles and groups definitely makes sense in your case. Can you please run your bot in debug mode (e.g. using the flag --debug), chat with it a bit and share the logs afterwards here? Thanks. That should help us to figure out what the bot is actually recognising and where the error might be.

ManiNuthi · August 26, 2020, 11:20am

I have attached the logs for a conversation (in Telugu).

The conversation goes like this:

input: hi

bot: how can I help you

input: search train from kadiri to palasa.

logs of shell --debug:

Here are the logs for a sample where it worked. Input: search trains from tirupati to anakapalle.

Tanja · September 1, 2020, 7:49am

@ManiNuthi Sorry for the late reply. Have you already fixed the issue?

Judging by the slots, the slot mapping does not seem to be the issue. The entity extractor is simply not able to identify the entities. This might have several reasons:

(1) Your training data is not consistently annotated. Make sure that all entities are labelled.
(2) You do not have enough examples. However, you already mentioned that you added some more examples, so this might not be the problem. How many entities do you have, and how many examples per entity?
(3) The pipeline is not ideal. You could try playing around with different components and parameter options.
(4) The tokenization does not always work as expected. I am not familiar with the Telugu language, and I would guess that the spaCy tokenizer does a good job, but it might be worth double-checking.

Sorry, I cannot give you any better advice, but it is hard to figure out what exactly is going wrong without taking a closer look at the data.
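
As a rough way to check point (1), a small script could flag training examples in which a lookup entry appears without an annotation. This is only a sketch under assumptions: the function name `unlabelled_mentions`, the short `lookup` list, and the two example sentences are made up for illustration, and the regex only covers the two annotation styles shown earlier in this thread.

```python
import re

# Matches both annotation styles used in this thread:
#   [NYC](location)   and   [NYC]{"entity": "location", "role": "Source"}
ANNOTATION = re.compile(r"\[([^\]]+)\](?:\([^)]*\)|\{[^}]*\})")


def unlabelled_mentions(example, lookup_entries):
    """Return lookup entries that occur in the example outside any annotation."""
    # Blank out annotated spans so that only unlabelled text remains.
    plain = ANNOTATION.sub(" ", example).lower()
    found = []
    for entry in lookup_entries:
        if re.search(r"\b" + re.escape(entry.lower()) + r"\b", plain):
            found.append(entry)
    return found


# Hypothetical data: in practice, read these from the lookup file and the NLU file.
lookup = ["kadiri", "palasa", "tirupati", "anakapalle"]
examples = [
    'search train from [kadiri]{"entity": "location", "role": "Source"} to palasa',
    "search trains from [tirupati](location) to [anakapalle](location)",
]

for ex in examples:
    missing = unlabelled_mentions(ex, lookup)
    if missing:
        print(f"unlabelled {missing} in: {ex}")
```

With a 20k-entry lookup, the per-entry loop is slow but still workable for a one-off check. Any example it reports teaches the model that a lookup match is sometimes not an entity, which is the inconsistency described in (1).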
