Lookup table does not work

I want to use a lookup table to extract entities, but I cannot get it working. Only the entities that are present in my training data are extracted, not those that appear only in the lookup table. Here is a simple example I created to illustrate the issue. This is the content of nlu.md (with the lookup table):

## intent:greet
- Hello
- Hi
- Good afternoon

## intent:inform_user_name
- My name is [Mike](user_name)
- I'm [Sarah](user_name)
- Call me [Bob](user_name)

## lookup:user_name
data/lookup_tables/user_names.txt

This is the content of user_names.txt:

Robert
Arnold
Stephanie
Caroline
Mike
Sarah
Bob

The bot has a simple UserNameForm to extract the name of the user:

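from typing import Any, Dict, List, Text

from rasa_sdk import Tracker
from rasa_sdk.events import EventType
from rasa_sdk.executor import CollectingDispatcher
from rasa_sdk.forms import FormAction


class UserNameForm(FormAction):
    def name(self) -> Text:
        # Must match the form name used in stories.md and domain.yml
        return "name_form"

    @staticmethod
    def required_slots(tracker: Tracker) -> List[Text]:
        # The form keeps asking until this slot is filled
        return ["user_name"]

    def submit(
        self,
        dispatcher: CollectingDispatcher,
        tracker: Tracker,
        domain: Dict[Text, Any],
    ) -> List[EventType]:
        # Nothing to do once the slot is filled
        return []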

This is my stories.md:

## happy path
* greet
  - name_form
  - form{"name": "name_form"}
  - form{"name": null}
  - utter_greet_user

domain.yml looks like this:

intents:
  - greet

entities:
  - user_name

slots:
  user_name:
    type: unfeaturized
    auto_fill: false

forms:
  - name_form

responses:
  utter_greet_user:
  - text: "Hey {user_name}! How are you?"
  utter_ask_user_name:
  - text: "What is your name?"

session_config:
  session_expiration_time: 60
  carry_over_slots_to_new_session: true

And finally this is config.yml:

language: en

pipeline:
  - name: WhitespaceTokenizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: "char_wb"
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    epochs: 100
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100

policies:
  - name: MemoizationPolicy
  - name: TEDPolicy
    max_history: 5
    epochs: 100
  - name: MappingPolicy
  - name: FormPolicy

If you run it and type “My name is Mike”, the name is extracted without a problem because Mike is in the training data. But when you type “My name is Arnold”, the name is not extracted, because Arnold is not in the training data, even though it is in the lookup table.

Can anyone please tell me what I did wrong? I suspect that I might need to change my configuration, maybe add some different pipeline components, but I’m not sure.

How much training data do you have? It is important that you add some examples that contain names listed in the lookup tables (see Training Data Format).
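
One way to get such examples without writing them all by hand is to generate a few annotated lines from the lookup file. The following is only a sketch: the function `make_examples` and the phrasing templates are made up for illustration, and the names are copied inline from user_names.txt instead of being read from disk.

```python
import random
from typing import List


def make_examples(
    names: List[str], templates: List[str], entity: str, k: int, seed: int = 0
) -> List[str]:
    """Return Rasa Markdown NLU lines with up to k sampled names annotated as `entity`."""
    rng = random.Random(seed)  # fixed seed so the output is reproducible
    sampled = rng.sample(names, min(k, len(names)))
    return [
        "- " + rng.choice(templates).format(name=f"[{name}]({entity})")
        for name in sampled
    ]


# Entries copied from user_names.txt in the question.
names = ["Robert", "Arnold", "Stephanie", "Caroline", "Mike", "Sarah", "Bob"]
# Hypothetical phrasings; extend them with what your users actually say.
templates = ["My name is {name}", "I'm {name}", "Call me {name}"]

# Print lines that can be pasted under "## intent:inform_user_name" in nlu.md.
for line in make_examples(names, templates, "user_name", k=4):
    print(line)
```

Sampling only part of the table is deliberate: the model should learn that the lookup feature signals a name, not memorize every entry.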

If you just have a few training examples, you might also want to try the following config:

pipeline:
  - name: WhitespaceTokenizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: "char_wb"
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    number_of_transformer_layers: 0
    weight_sparsity: 0.0
    epochs: 100
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100

Or you can use the CRFEntityExtractor instead of the DIETClassifier for entity extraction:

pipeline:
  - name: WhitespaceTokenizer
  - name: RegexFeaturizer
  - name: CRFEntityExtractor
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: "char_wb"
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    entity_recognition: False
    epochs: 100
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100
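
Both configs keep the RegexFeaturizer because that is where the lookup table enters the pipeline. As far as I understand it, the table is turned into one case-insensitive pattern, and a match is only an extra feature on the token, not an extracted entity. That is why the extractor still needs training examples to learn what the feature means. The sketch below illustrates the idea in plain Python; `lookup_pattern` and `token_flags` are made-up names, not Rasa's API.

```python
import re
from typing import List, Tuple


def lookup_pattern(entries: List[str]) -> "re.Pattern":
    """Build one case-insensitive alternation from the lookup entries."""
    # Word boundaries keep "Bob" from matching inside "Bobby".
    alternatives = [r"\b" + re.escape(entry) + r"\b" for entry in entries]
    return re.compile("(" + "|".join(alternatives) + ")", re.IGNORECASE)


def token_flags(text: str, pattern: "re.Pattern") -> List[Tuple[str, bool]]:
    """Return (token, matched) pairs for whitespace-separated tokens."""
    spans = [m.span() for m in pattern.finditer(text)]
    flags = []
    for tok in re.finditer(r"\S+", text):
        # A token is flagged if it overlaps any lookup match.
        hit = any(tok.start() < end and start < tok.end() for start, end in spans)
        flags.append((tok.group(), hit))
    return flags


# Entries copied from user_names.txt in the question.
pattern = lookup_pattern(
    ["Robert", "Arnold", "Stephanie", "Caroline", "Mike", "Sarah", "Bob"]
)

print(token_flags("My name is Arnold", pattern))
# [('My', False), ('name', False), ('is', False), ('Arnold', True)]
```

The flag on "Arnold" is all the model receives from the lookup table. Whether that token ends up labelled as user_name depends on what the entity extractor learned during training.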

@Tanja

Interesting approach to try DIET without entity extraction and use the CRFEntityExtractor instead. But what happens if we get much better results with the CRFEntityExtractor and you remove it in Rasa 2.0? Would we then have to write a custom component?

CRFEntityExtractor is not deprecated and will not be removed in 2.0. So nothing to worry about :slight_smile:


Hello @Tanja, I tried these variants, but it does not work for me (see "Help to understand lookup tables"). Can you look at my data?

Hi, I’m trying to do something similar. Did you get it to work? Could you help me? Thanks :smiley: