Lookup tables

Hi there!

I'm having trouble getting my lookup tables to work properly. Here’s the deal:

Even though I uploaded my lookup tables to the /data folder (image) and added them to my NLU file (image), they are not being read by the NLU component. I also added some examples from my lookup tables to my training data, but that didn’t help.

I think the trouble could be in the config.yml file, because I am using the supervised_embeddings pipeline. However, since my chatbot is in Italian, I need that kind of pipeline. Moreover, if I try to add anything to the pipeline, I get an error message when I train the model. Can somebody help?

Thx, Andrea

I came to the conclusion that you need to include CRFEntityExtractor in the pipeline for it to work with lookup tables. For instance, DIETClassifier would ignore them. Maybe someone from @Rasa can shed some light.
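To illustrate why the extractor matters, here is a rough sketch of the idea as I understand it: the entries of a lookup table are combined into one case-insensitive regex, and each token gets a "matched a lookup entry" flag that an extractor such as CRFEntityExtractor can use as a feature. This is not Rasa's actual code. The function names (lookup_to_pattern, token_pattern_flags) and the city list are made up for illustration.

```python
import re


def lookup_to_pattern(entries):
    """Build one case-insensitive regex matching any lookup entry as a whole word."""
    # Longest entries first, so multi-word entries win over shorter ones.
    cleaned = [e.strip() for e in entries if e.strip()]
    escaped = [re.escape(e) for e in sorted(cleaned, key=len, reverse=True)]
    return re.compile(r"\b(" + "|".join(escaped) + r")\b", re.IGNORECASE)


def token_pattern_flags(text, pattern):
    """Return (token, matched) pairs; matched is True if the token overlaps a lookup match."""
    spans = [m.span() for m in pattern.finditer(text)]
    flags = []
    for tok in re.finditer(r"\S+", text):
        hit = any(tok.start() < end and start < tok.end() for start, end in spans)
        flags.append((tok.group(), hit))
    return flags


# Hypothetical lookup table contents (one entry per line in a .txt file).
cities = ["Roma", "Milano", "Reggio Emilia"]
pattern = lookup_to_pattern(cities)
flags = token_pattern_flags("voglio andare a reggio emilia domani", pattern)
print(flags)
```

The flag is only a hint. The extractor still has to see annotated training examples where flagged tokens are labelled as entities, otherwise it has no reason to rely on the feature.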

Do I have to replace supervised_embeddings with CRFEntityExtractor? This is what the config file looks like right now:

image

If I change it like this:

image

training fails because:

:thinking:

Sorry for the late reply. I’m not exactly sure how CRFEntityExtractor interacts with supervised_embeddings. If it helps you, this is my config:

# language code; the spaCy model itself is set under SpacyNLP below
language: es

pipeline:
  - name: SpacyNLP
    model: es_core_news_sm
  - name: SpacyTokenizer
  - name: SpacyFeaturizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: char_wb
    min_ngram: 2
    max_ngram: 4
  - name: CRFEntityExtractor
  - name: DucklingHTTPExtractor
    # url of the running duckling server
    url: http://localhost:8000
    # dimensions to extract
    dimensions: [ time ]
    # allows you to configure the locale, by default the language is used
    locale: es_ES
    # if not set the default timezone of Duckling is going to be used
    # needed to calculate dates from relative expressions like "tomorrow"
    timezone: Europe/Madrid
    # Timeout for receiving response from http url of the running duckling server
    # if not set the default timeout of duckling http url is set to 3 seconds.
    timeout: 3
  - name: DIETClassifier
    entity_recognition: False
    epochs: 200
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100


policies:
  - name: TEDPolicy
    max_history: 10
    epochs: 20
    batch_size:
      - 32
      - 64
  - name: AugmentedMemoizationPolicy
    max_history: 6
  - name: TwoStageFallbackPolicy
    core_threshold: 0.3
    nlu_threshold: 0.8
  - name: FormPolicy
  - name: MappingPolicy

Thx! I’ll try it right now!