Lookup Tables not being detected

Hi,

I am developing a Rasa bot and I am struggling to use lookup tables in it.

I have gone through the official docs (Training Data Format), some related forum topics (e.g. Rasa 2 Lookup Table file syntax), and the Rasa Blog guide “Entity extraction with the new lookup table feature in Rasa NLU”. However, my agent still relies only on the annotated examples and doesn’t detect any of the other values listed in the lookup tables.

I have created a lookup table for different APIs, located in data/lookup/api.yml (I’ve also tried with api.txt).

version: "2.0"
nlu:
  - lookup: api
    examples: |
      - authorization
      - users
      - backups
      - prechecks
      - bundles
      - personalities
      - manifests
      - releases
      - upgradables
      - version aliases
      - upgrades

Then, I am giving some examples in the nlu.yml:

      - List all endpoints for [authorization](api)
      - List the [bundles](api) endpoints
      - Give me all endpoints for [bundles](api)
      - Show me the [upgrades](api) APIs
      - Could you please find all APIs for [manifests](api)

My expectation is that after referring to the “api” lookup in the examples, the bot will automatically start identifying the rest of the objects from the “api” lookup table.
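For reference, the behaviour expected here can be sketched in plain Python. This is only a rough illustration of matching lookup entries with a regular expression, not Rasa's actual implementation; the helper names `build_lookup_pattern` and `extract_entities` are made up for this example, and case-insensitive matching is an assumption.

```python
import re

# Entries copied from the "api" lookup table above.
API_LOOKUP = [
    "authorization", "users", "backups", "prechecks", "bundles",
    "personalities", "manifests", "releases", "upgradables",
    "version aliases", "upgrades",
]


def build_lookup_pattern(entries):
    """Compile all lookup entries into one alternation with word boundaries.

    Hypothetical helper: a simplified stand-in for what a regex-based
    entity extractor might do with a lookup table.
    """
    # Try longer entries first so multi-word values are not shadowed.
    ordered = sorted(entries, key=len, reverse=True)
    alternatives = "|".join(r"\b" + re.escape(entry) + r"\b" for entry in ordered)
    return re.compile("(" + alternatives + ")", re.IGNORECASE)


def extract_entities(text, entity_name, entries):
    """Return one dict per lookup entry found in the text."""
    pattern = build_lookup_pattern(entries)
    return [
        {
            "entity": entity_name,
            "value": match.group(0),
            "start": match.start(),
            "end": match.end(),
        }
        for match in pattern.finditer(text)
    ]


# Neither "version aliases" nor "users" appears in the annotated examples.
found = extract_entities("Show me the version aliases and users APIs", "api", API_LOOKUP)
for entity in found:
    print(entity)
```

In this sketch the values are found purely by string matching, which is why entries that never appear in the training examples can still be picked up.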

Also, here is the pipeline of my config.yml:

pipeline:
  - name: WhitespaceTokenizer
  - name: RegexEntityExtractor
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: char_wb
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    epochs: 100

I have tried setting the use_lookup_tables property of the RegexEntityExtractor to true, and also tried using CRFEntityExtractor and EntitySynonymMapper. However, none of these attempts was successful: the bot only detected the entities that appear in the examples.

Can you please give some advice on that? I guess that I might be missing something very obvious, but I can’t find what it is.

Welcome to the forum! :)

That’s weird. Can you also try with the following pipeline?

pipeline:
  - name: WhitespaceTokenizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: char_wb
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    epochs: 100
  - name: RegexEntityExtractor
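
After retraining with this pipeline, one way to check whether lookup values outside the training examples are picked up is to send a test message to a locally running Rasa server and see which extractor reported each entity. The sketch below assumes the server was started with `rasa run --enable-api` on the default port 5005. The `SAMPLE_RESPONSE` dict is a hand-written illustration of the response shape, not real output, and the function names are hypothetical.

```python
import json
from urllib import request


def parse_message(text, url="http://localhost:5005/model/parse"):
    """POST a message to the NLU parse endpoint and return the decoded JSON.

    Assumes a Rasa server is running locally with the HTTP API enabled.
    """
    payload = json.dumps({"text": text}).encode("utf-8")
    req = request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with request.urlopen(req) as resp:
        return json.load(resp)


def entities_by_extractor(parse_result):
    """Group extracted entity values by the component that produced them."""
    grouped = {}
    for entity in parse_result.get("entities", []):
        extractor = entity.get("extractor", "unknown")
        grouped.setdefault(extractor, []).append(entity.get("value"))
    return grouped


# Hand-written sample for illustration only; a real response contains more fields.
SAMPLE_RESPONSE = {
    "text": "Show me the users APIs",
    "entities": [
        {
            "entity": "api",
            "start": 12,
            "end": 17,
            "value": "users",
            "extractor": "RegexEntityExtractor",
        }
    ],
}

grouped = entities_by_extractor(SAMPLE_RESPONSE)
print(grouped)
```

With a live server, `entities_by_extractor(parse_message("Show me the users APIs"))` would be the equivalent call; "users" is a useful test value because it is in the lookup table but not in the annotated examples.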

It worked like magic with a single modification in the pipeline as @ChrisRahme proposed - huge thanks for the quick and useful response!

Awesome! :)

Can you please mark the answer as the solution so that the thread shows as solved?

Done, thanks again :)
