Rasa only recognizing names from lookup table

Hi, this is my config.yml

language: "en"
pipeline:
- name: "SpacyNLP"
- name: "SpacyTokenizer"
- name: "SpacyFeaturizer"
- name: "RegexFeaturizer"
- name: "CRFEntityExtractor"
- name: "EntitySynonymMapper"
- name: "SklearnIntentClassifier"
policies:
  - name: MemoizationPolicy
  - name: KerasPolicy
  - name: MappingPolicy

In nlu.md I have a first_name intent:

## intent:first_name
- [michal](first_name)
- [shepard](first_name)
- [brendan](first_name)
- [ansell](first_name)
- [sutherland](first_name)
- [goraud](first_name)

and in the lookup table I have 4k names.

So, here is my problem: if I send a name that is NOT in the lookup table, Rasa does not extract it as an entity, even if it guesses the intent correctly.

Example: ‘Michael’ exists in the lookup table and ‘Michaela’ doesn’t.

michael
{
  "intent": {
    "name": "first_name",
    "confidence": 0.27226685483533475
  },
  "entities": [
    {
      "start": 0,
      "end": 7,
      "value": "michael",
      "entity": "first_name",
      "confidence": 0.7108028369067685,
      "extractor": "CRFEntityExtractor"
    }
  ],

Next message:
michaela
{
  "intent": {
    "name": "first_name",
    "confidence": 0.5452198820814614
  },
  "entities": [],

Hi, names are notoriously hard to extract, since there’s no general pattern. spaCy has a nice pretrained extractor for names, see

or just add ALL possible names to the lookup table. Another possibility is to ask for the name and use self.from_text after self.from_entity in the slot_mappings. This way, if no entity is found, the whole message gets extracted into the name slot. You might cut off common non-name strings like ‘my name is …’ in the validate_{slot} method.
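A rough sketch of that cut-off step, in plain Python and not taken from the Rasa docs: a helper that strips a leading phrase such as "my name is" from the raw message. The name `clean_name` and the phrase list are assumptions made up for illustration; something like it could be called from the validate_{slot} method before the value is stored.

```python
import re

# Hypothetical list of lead-in phrases; extend it for your own users.
_PREFIX = re.compile(
    r"^\s*(?:my\s+name\s+is|i\s+am|i'm|call\s+me)\s+",
    re.IGNORECASE,
)


def clean_name(text: str) -> str:
    """Drop a leading non-name phrase, then trim whitespace and punctuation."""
    without_prefix = _PREFIX.sub("", text)
    return without_prefix.strip(" \t\n.,!?")


if __name__ == "__main__":
    for message in ("My name is Michaela", "I'm Michal.", "michaela"):
        print(repr(message), "->", repr(clean_name(message)))
```

A message that is already just a name passes through unchanged, so the helper should be safe to apply to every answer.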

Thanks for the quick and helpful response, @IgNoRaNt23. In the meantime I changed my pipeline to

language: en
pipeline:
  - name: WhitespaceTokenizer
  - name: CRFEntityExtractor
  - name: EntitySynonymMapper
  - name: CountVectorsFeaturizer
    token_pattern: (?u)\b\w+\b
  - name: EmbeddingIntentClassifier
  - name: DucklingHTTPExtractor
    url: http://localhost:8000
    dimensions:
      - number
policies:
  - name: FallbackPolicy
  - name: MemoizationPolicy
  - name: FormPolicy
  - name: MappingPolicy 

(same as in the Building contextual assistants with Rasa Forms tutorial). Extracting names works fine now, but I came across another problem. I have 2 intents, one for the first name and one for the last name:

## intent:first_name
- [Amery](first_name)
- [Bobbie](first_name)
- [Faulkner](first_name)
- [Leeland](first_name)
- [Regan](first_name)
- [Tammie](first_name)

## intent:last_name
- [Szymanski](last_name)
- [Gicala](last_name)
- [Chrzanowski](last_name)
- [Marzewski](last_name)
- [Obama](last_name)
- [Trump](last_name)

I’m using forms to ask for these 2 values. But often the last name is recognized as a first name, for example:

Bot: give me first_name
Me: X
Bot: give me last_name
Me: Y
Bot: give me last_name
Me: Z
Bot: so you are Y Z

I know I could “solve” this problem by adding more data, but sooner or later it’ll fail (some people have first names as surnames).

Is there any way of raising the probability of the last_name intent when the bot asks for last_name, so it doesn’t get mistaken for first_name?

Well, as I said, it’s hard to control entity extraction but far easier to control the slot mappings. So when you use a FormAction and ask for the surname, you may extract the user’s whole message into the surname slot. Unfortunately, users may not even answer your question, or may add words that don’t belong there.
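To make the idea concrete, here is a minimal plain-Python sketch of the decision logic only; it is not Rasa SDK code. The function `extract_requested_slot` and its arguments are hypothetical: `entities` is assumed to look like the "entities" list in the NLU output shown earlier in the thread. It prefers an entity whose label matches the slot the form just asked for and otherwise falls back to the whole message, ignoring whatever other label the extractor assigned.

```python
from typing import Any, Dict, List, Optional


def extract_requested_slot(
    requested_slot: str,
    text: str,
    entities: List[Dict[str, Any]],
) -> Optional[str]:
    """Pick a value for the slot the form is currently asking about."""
    # First choice: an entity whose label matches the requested slot.
    for entity in entities:
        if entity.get("entity") == requested_slot:
            return entity.get("value")
    # Fallback: use the whole message, whatever other labels were assigned.
    stripped = text.strip()
    return stripped or None


if __name__ == "__main__":
    # "Allen" was labelled first_name, but the form asked for last_name.
    mislabelled = [{"entity": "first_name", "value": "Allen"}]
    print(extract_requested_slot("last_name", "Allen", mislabelled))
```

With this ordering, a surname that the extractor labels as first_name still ends up in the last_name slot, because the label is only trusted when it agrees with the question that was asked.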

Is it even necessary for your use case to split the two? If you’re just playing around to get a feeling for the bot, don’t use name extraction to learn how it works, since it’s probably the hardest entity to extract.

Okay, thanks again for help :slight_smile:

you could try this:

That’s why we developed that solution: to deal with the first name/last name problem. There are other ways of doing it in a form, of course, like filling the requested slot with from_text and then validating the expected last name against a list of last names that also includes the ones that double as first names, e.g. George and Allen as well as Szymanski and Robertson.
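A small sketch of that validation step, assuming a hand-made surname set: `KNOWN_LAST_NAMES` and the standalone function `validate_last_name` are placeholders for illustration, not part of Rasa. The return shape (a dict mapping the slot name to the value, or to None to reject it) mirrors what I understand a form's validate_{slot} method to return, so treat that as an assumption and check it against the docs for your version.

```python
from typing import Dict, Optional, Set

# Hypothetical sample; in practice this would be loaded from a surname list
# that also contains surnames doubling as first names (George, Allen, ...).
KNOWN_LAST_NAMES: Set[str] = {"szymanski", "robertson", "george", "allen"}


def validate_last_name(
    value: str,
    known: Set[str] = KNOWN_LAST_NAMES,
) -> Dict[str, Optional[str]]:
    """Accept the value only if it appears in the surname list."""
    candidate = value.strip()
    if candidate.lower() in known:
        return {"last_name": candidate}
    # None signals that the slot stays empty, so the form can ask again.
    return {"last_name": None}


if __name__ == "__main__":
    print(validate_last_name("George"))
    print(validate_last_name("hello there"))
```

Because the check only runs when the form has asked for the last name, "George" is accepted as a surname here even though an entity extractor would likely label it as a first name.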

Hi @IgNoRaNt23, can you give an example of how to implement self.from_text and self.from_entity in slot_mappings, and also how to use validate_{slot}?

Just look into the docs

there are examples for everything
