Lookup table does not work

Mykhailo-Seniutovych · March 22, 2020, 12:01pm

I want to use a lookup table to extract entities, but I cannot get it working. Only the entities that are present in my training data are extracted, not those that appear only in the lookup table. Consider a simple example I created to illustrate the issue. This is the content of nlu.md (with the lookup table):

## intent:greet
- Hello
- Hi
- Good afternoon

## intent:inform_user_name
- My name is [Mike](user_name)
- I'm [Sarah](user_name)
- Call me [Bob](user_name)

## lookup:user_name
data/lookup_tables/user_names.txt

This is the content of user_names.txt:

Robert
Arnold
Stephanie
Caroline
Mike
Sarah
Bob

The bot has a simple UserNameForm to extract the name of the user:

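from typing import Any, Dict, List, Text

from rasa_sdk import Tracker
from rasa_sdk.events import EventType
from rasa_sdk.executor import CollectingDispatcher
from rasa_sdk.forms import FormAction


class UserNameForm(FormAction):
    def name(self) -> Text:
        return "name_form"

    @staticmethod
    def required_slots(tracker: Tracker) -> List[Text]:
        # The form keeps asking until the user_name slot is filled.
        return ["user_name"]

    def submit(
        self,
        dispatcher: CollectingDispatcher,
        tracker: Tracker,
        domain: Dict[Text, Any],
    ) -> List[EventType]:
        # Nothing to do on submit; the story utters the greeting afterwards.
        return []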

This is my stories.md:

## happy path
* greet
  - name_form
  - form{"name": "name_form"}
  - form{"name": null}
  - utter_greet_user

domain.yml looks like this:

intents:
  - greet

entities:
  - user_name

slots:
  user_name:
    type: unfeaturized
    auto_fill: false

forms:
  - name_form

responses:
  utter_greet_user:
  - text: "Hey {user_name}! How are you?"
  utter_ask_user_name:
  - text: "What is your name?"

session_config:
  session_expiration_time: 60
  carry_over_slots_to_new_session: true

And finally this is config.yml:

language: en

pipeline:
  - name: WhitespaceTokenizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: "char_wb"
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    epochs: 100
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100

policies:
  - name: MemoizationPolicy
  - name: TEDPolicy
    max_history: 5
    epochs: 100
  - name: MappingPolicy
  - name: FormPolicy

If you run it and type “My name is Mike”, the name is extracted without a problem, since Mike is in the training data. But when you type “My name is Arnold”, the name is not extracted, even though Arnold is in the lookup table, because it does not appear in the training data.

Can anyone please tell me what I did wrong? I suspect that I might need to change my configuration, maybe by adding different pipeline components, but I’m not sure.

Tanja · March 24, 2020, 1:44pm

How much training data do you have? It is important that you add some examples that contain names listed in the lookup tables (see Training Data Format).
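To illustrate why those examples matter: as far as I understand it, the RegexFeaturizer only marks tokens that match a lookup table entry, and the entity extractor still has to learn from annotated examples that a marked token tends to be a user_name. The following is a minimal sketch of that marking step, not Rasa's actual implementation; the function name lookup_flags is hypothetical.

```python
import re

# Names as listed in user_names.txt in the question.
lookup_names = ["Robert", "Arnold", "Stephanie", "Caroline", "Mike", "Sarah", "Bob"]

# One case-insensitive pattern with word boundaries around the alternatives.
pattern = re.compile(
    r"\b(?:" + "|".join(re.escape(name) for name in lookup_names) + r")\b",
    re.IGNORECASE,
)


def lookup_flags(text):
    """Return (token, matched) pairs for whitespace-separated tokens.

    Hypothetical helper: the boolean stands in for the extra feature that
    a lookup table contributes for each token.
    """
    return [(token, bool(pattern.search(token))) for token in text.split()]


print(lookup_flags("My name is Arnold"))
```

The flag for "Arnold" is set, but it is only an input feature. With very few annotated examples, the model may not learn to rely on it, which would match the behaviour described in the question.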

If you just have a few training examples, you might also want to try the following config:

pipeline:
  - name: WhitespaceTokenizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: "char_wb"
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    number_of_transformer_layers: 0
    weight_sparsity: 0.0
    epochs: 100
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100

Or you can use the CRFEntityExtractor instead of the DIETClassifier for entity extraction:

pipeline:
  - name: WhitespaceTokenizer
  - name: RegexFeaturizer
  - name: CRFEntityExtractor 
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: "char_wb"
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    entity_recognition: False
    epochs: 100
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100
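Whichever pipeline you pick, the advice about training examples still applies. A small script can generate more annotated examples from the lookup table names. This is only a sketch under assumptions: the templates are taken from the nlu.md in the question, and make_examples, per_template and seed are hypothetical names for illustration.

```python
import random


def make_examples(names, templates, per_template=2, seed=0):
    """Build annotated Markdown NLU lines by filling templates with names."""
    rng = random.Random(seed)  # fixed seed so reruns give the same lines
    lines = []
    for template in templates:
        # Pick a few distinct names for each sentence template.
        for name in rng.sample(names, k=min(per_template, len(names))):
            annotated = f"[{name}](user_name)"
            lines.append("- " + template.format(name=annotated))
    return lines


names = ["Robert", "Arnold", "Stephanie", "Caroline", "Mike", "Sarah", "Bob"]
templates = ["My name is {name}", "I'm {name}", "Call me {name}"]

examples = make_examples(names, templates)
print("## intent:inform_user_name")
print("\n".join(examples))
```

The printed lines can be pasted under the inform_user_name intent in nlu.md. Varying the sentence templates as well as the names should help the model generalise beyond the listed examples.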

lindig · March 31, 2020, 12:04pm

@Tanja

Interesting approach to try DIET without entity extraction and use the CRFEntityExtractor instead. But what happens if we get much better results with the CRFEntityExtractor and it is then removed in Rasa 2.0? Would we have to write a custom component in that case?

Tanja · March 31, 2020, 12:18pm

CRFEntityExtractor is not deprecated and will not be removed in 2.0, so there is nothing to worry about.

MMustafa · May 22, 2020, 6:34am

Hello @Tanja, I tried the suggested variants, but they do not work for me. Can you look at my data? See: Help to understand lookup tables

monicatffee · January 26, 2022, 7:40pm

Hi, I’m trying to do something similar. Did you get it to work? Could you help me? Thanks.
