Rasa only recognizing names from lookup table

Hi, this is my config.yml

language: "en"
pipeline:
- name: "SpacyNLP"
- name: "SpacyTokenizer"
- name: "SpacyFeaturizer"
- name: "RegexFeaturizer"
- name: "CRFEntityExtractor"
- name: "EntitySynonymMapper"
- name: "SklearnIntentClassifier"
policies:
  - name: MemoizationPolicy
  - name: KerasPolicy
  - name: MappingPolicy

In nlu.md I have a first_name intent:

## intent:first_name
- [michal](first_name)
- [shepard](first_name)
- [brendan](first_name)
- [ansell](first_name)
- [sutherland](first_name)
- [goraud](first_name)

and in the lookup table I have 4k names.

So, here is my problem: if I send a name that is NOT in the lookup table, Rasa does not extract it as an entity, even though it guesses the intent correctly.

Example: 'Michael' exists in the lookup table and 'Michaela' doesn't.

michael
{
  "intent": {
    "name": "first_name",
    "confidence": 0.27226685483533475
  },
  "entities": [
    {
      "start": 0,
      "end": 7,
      "value": "michael",
      "entity": "first_name",
      "confidence": 0.7108028369067685,
      "extractor": "CRFEntityExtractor"
    }
  ],

Next message:
michaela
{
  "intent": {
    "name": "first_name",
    "confidence": 0.5452198820814614
  },
  "entities": [],
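
One possible explanation, sketched under assumptions: in Rasa 1.x the lookup table is, as far as I know, compiled into a single case-insensitive regex with word boundaries, and the RegexFeaturizer hands "pattern matched" to the CRF as a feature. With only a few training examples that all match the table, the CRF may end up relying almost entirely on that feature. The `build_lookup_regex` helper below is hypothetical and only imitates that behaviour; it is not Rasa's own code.

```python
import re


def build_lookup_regex(names):
    """Imitate (not reproduce) a lookup table compiled into one regex.

    Assumption: every entry is matched case-insensitively as a whole word.
    """
    escaped = [re.escape(name) for name in names]
    return re.compile(r"(?i)\b(?:" + "|".join(escaped) + r")\b")


# Tiny stand-in for the 4k-name lookup table.
lookup = build_lookup_regex(["michael", "shepard", "brendan"])

# 'michael' is in the table, so the pattern feature fires for it.
print(bool(lookup.search("michael")))
# 'michaela' is not: the trailing word boundary blocks a partial match.
print(bool(lookup.search("michaela")))
```

If this is what happens, more varied training examples (including names that are not in the table) would give the CRF other features to lean on.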

Hi, names are notoriously hard to extract, since there's no general pattern. spaCy has a nice pretrained extractor for names, see

or just add ALL possible names to the lookup table. Another possibility is to ask for the name and use self.from_text after self.from_entity in the slot_mappings. This way, if no entity is found, the whole message gets extracted to the name slot. You might cut off common non-name strings like 'my name is …' in the validate_{slot} method.
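
A minimal sketch of that idea, assuming the Rasa SDK 1.x `FormAction` API. The form name `name_form`, the prefix list, and the `strip_name_prefix` helper are made up for illustration; treat it as a starting point, not a reference implementation.

```python
import re
from typing import Any, Dict, List

try:
    from rasa_sdk.forms import FormAction  # Rasa SDK 1.x
except ImportError:
    # Fallback so the sketch can be read and run without rasa_sdk installed;
    # a real action server needs the import above to succeed.
    FormAction = object

# Hypothetical lead-in phrases to cut off; extend the list for your users.
NAME_PREFIX = re.compile(r"^\s*(?:my name is|i am|i'm)\s+", re.IGNORECASE)


def strip_name_prefix(text: str) -> str:
    """Remove a lead-in such as 'my name is' plus surrounding whitespace."""
    return NAME_PREFIX.sub("", text).strip()


class NameForm(FormAction):
    def name(self) -> str:
        return "name_form"  # hypothetical form name

    @staticmethod
    def required_slots(tracker: Any) -> List[str]:
        return ["first_name"]

    def slot_mappings(self) -> Dict[str, Any]:
        # Mappings are tried in order: the extracted entity first,
        # then the full text of the user's message as a fallback.
        return {
            "first_name": [
                self.from_entity(entity="first_name"),
                self.from_text(),
            ]
        }

    def validate_first_name(self, value, dispatcher, tracker, domain) -> Dict[str, Any]:
        # Only plain strings are cleaned; anything else is passed through.
        if isinstance(value, str):
            value = strip_name_prefix(value)
        return {"first_name": value}

    def submit(self, dispatcher, tracker, domain) -> List[Dict[str, Any]]:
        return []
```

With this mapping, "michaela" would still fill the slot through `from_text` even when the CRF returns no entity.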


Thanks for the quick and helpful response @IgNoRaNt23. In the meantime I changed my pipeline to

language: en
pipeline:
  - name: WhitespaceTokenizer
  - name: CRFEntityExtractor
  - name: EntitySynonymMapper
  - name: CountVectorsFeaturizer
    token_pattern: (?u)\b\w+\b
  - name: EmbeddingIntentClassifier
  - name: DucklingHTTPExtractor
    url: http://localhost:8000
    dimensions:
      - number
policies:
  - name: FallbackPolicy
  - name: MemoizationPolicy
  - name: FormPolicy
  - name: MappingPolicy 

(same as in the Building contextual assistants with Rasa Forms tutorial). Extracting names works fine now, but I came across another problem. I have 2 intents, one for first name and one for last name:

## intent:first_name
- [Amery](first_name)
- [Bobbie](first_name)
- [Faulkner](first_name)
- [Leeland](first_name)
- [Regan](first_name)
- [Tammie](first_name)

## intent:last_name
- [Szymanski](last_name)
- [Gicala](last_name)
- [Chrzanowski](last_name)
- [Marzewski](last_name)
- [Obama](last_name)
- [Trump](last_name)

I'm using forms to ask for these 2 values. But often the last name is recognized as a first name, for example:

Bot: give me first_name
Me: X
Bot: give me last_name
Me: Y
Bot: give me last_name
Me: Z
Bot: so you are Y Z

I know I could "solve" this problem by adding more data, but sooner or later it'll fail (some people have first names as surnames).

Is there any way of raising the probability of the last_name intent when the bot asks for last_name, so it doesn't get mistaken for first_name?

Well, as I said, it's hard to control entity extraction but far easier to control the slot mappings. So when you use a FormAction and ask for the surname, you may extract the user's whole message into the surname slot. Unfortunately, users may not even answer your question, or may add words that don't belong there.

Is it even necessary for your use case to split both? If you're just playing around to get a feeling for the bot, don't use name extraction to learn how it works, since it's probably the hardest entity to extract.

Okay, thanks again for the help :)

You could try this:

That's why we developed that solution, to deal with the first name/last name problem. There are other ways of doing it in a form, of course, such as mapping the requested slot with from_text and then validating the expected last name against a list of last names that also includes the last names that double as first names, e.g. George and Allen as well as Szymanski and Robertson.
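
A rough sketch of that list check, in plain Python so it can be tried outside Rasa. `KNOWN_LAST_NAMES` and `check_last_name` are hypothetical names, and the four entries stand in for a real surname list.

```python
from typing import Optional

# Hypothetical, tiny stand-in for a real surname list. It deliberately
# contains surnames that are also common first names (George, Allen).
KNOWN_LAST_NAMES = {"szymanski", "robertson", "george", "allen"}


def check_last_name(value: str) -> Optional[str]:
    """Return the trimmed surname if it is in the list, otherwise None."""
    candidate = value.strip()
    if candidate.lower() in KNOWN_LAST_NAMES:
        return candidate
    return None


print(check_last_name(" Allen "))
print(check_last_name("Xyzzy"))
```

Inside a Rasa SDK 1.x form, a `validate_last_name` method could return `{"last_name": check_last_name(value)}`; as far as I know, returning `None` for the slot makes the form ask for it again.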

Hi @IgNoRaNt23. Can you give an example of how to implement self.from_text and self.from_entity in slot_mappings, and also how to use validate_{slot}?

Just look into the docs

there are examples for everything
