Entity not recognized with DIET

I’m having a problem with entity recognition: apparently, the recommended pipeline doesn’t work, whereas a deprecated one does.

Rasa: 1.10.1, Rasa X: 0.28.5. Two look-ups: land.txt and product.txt.

Working as expected: entities ‘product’ and ‘land’ are both recognized and their slots are set.

Pipeline (with deprecation warnings!):

language: nl
pipeline:
  - name: WhitespaceTokenizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: char_wb
    min_ngram: 1
    max_ngram: 4
  - name: CRFEntityExtractor
  - name: EntitySynonymMapper
  - name: EmbeddingIntentClassifier
  - name: ResponseSelector
    epochs: 100
policies:
  - name: AugmentedMemoizationPolicy
    max_history: 3
  - name: TEDPolicy
    max_history: 5
    epochs: 100
  - name: MappingPolicy
  - name: FormPolicy
  - name: FallbackPolicy
    nlu_threshold: 0.4
    core_threshold: 0.3
    fallback_action_name: action_default_fallback

Not working as expected: only the entity ‘product’ is recognized and its corresponding slot filled, but not ‘land’.

Pipeline based on the recommended pipeline:

language: nl
pipeline:
  - name: WhitespaceTokenizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: char_wb
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    epochs: 100
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100
policies:
  - name: AugmentedMemoizationPolicy
    max_history: 3
  - name: TEDPolicy
    max_history: 5
    epochs: 100
  - name: MappingPolicy
  - name: FormPolicy
  - name: FallbackPolicy
    nlu_threshold: 0.4
    core_threshold: 0.3
    fallback_action_name: action_default_fallback

Hey another Dutch person. Cool.

For the future, by the way: you can have code render nicely on this forum via markdown syntax. By putting three backticks (`) before and after a code block, you can have the config.yml file render

like this

Back to your original problem. Are you running this locally on your machine as well, or is everything going via Rasa X? If you run it locally, you might be able to see whether the algorithm converged before the 100 epochs ran out. If it didn’t, that might be a reason why DIET is underperforming.
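To make "converged" a bit more concrete, here is a minimal sketch of one way to judge it from per-epoch loss values. It does not use any Rasa API: the function `has_converged`, its `window` and `tolerance` parameters, and the loss values themselves are all assumptions for illustration, with the numbers standing in for whatever you read off your own training output.

```python
def has_converged(losses, window=10, tolerance=0.01):
    """Return True if the mean loss of the last `window` epochs differs by less
    than `tolerance` (relative) from the mean of the `window` epochs before."""
    if len(losses) < 2 * window:
        return False  # not enough epochs to compare two windows
    previous = sum(losses[-2 * window:-window]) / window
    recent = sum(losses[-window:]) / window
    if previous == 0:
        return recent == 0
    return abs(previous - recent) / abs(previous) < tolerance


# Hypothetical loss curves, not real DIET output.
still_falling = [1.0 / (epoch + 1) for epoch in range(20)]
plateaued = [1.0 / (epoch + 1) for epoch in range(10)] + [0.1] * 20

print("still falling converged:", has_converged(still_falling))
print("plateaued converged:", has_converged(plateaued))
```

If the loss is still dropping noticeably in the last epochs, raising `epochs` for the DIETClassifier is the obvious thing to try.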

That said, I am a bit surprised, since you’re using the EntitySynonymMapper in both scenarios and you’ve got two look-up tables. Considering your use-case though, is there a reason why you’ve not considered using a Form here? There’s a tutorial on how to use those here.

Hi @koaning, thanks for your reply!

Where am I running Rasa?

My organisation is exploring Rasa, so we’re trying out all kinds of Rasa setups, e.g.:

  1. Rasa on OpenShift, developing actions on Linux. Running actions either on a server or in a pod on OpenShift.
  2. Rasa on a Linux server, both Open Source and Rasa X, as well as the Rasa SDK

Why use look-up and not forms?

I also use forms. But for this part of the use-case, I use look-ups to verify whether the user has given a correct value for land or product. I think that’s the easiest way, don’t you?

Why should I consider using a form?

A form can be a very structured way of retrieving information from a user, with very few surprises. Especially if the information you’re retrieving is super important (or perhaps an enumerable instead of a full string), it can be convenient to use a form. How many countries are possible? If there are only 4 options, then a form might be a more convenient API.

All countries of the world :grinning: i.e. almost 200.

If you’re interested, I could share the case I’m working on.

For now, I use the look-up to verify that the country exists. If it does, I use a form to determine whether the country is a member of the EU.
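As a rough illustration of that two-step check (does the country exist, and is it in the EU), here is a minimal plain-Python sketch. It assumes the look-up file has one entry per line; the helper names `load_lookup` and `check_country`, the sample entries, and the tiny `eu_members` set are made up for the example. Keep in mind that a look-up table in Rasa only feeds extra features to the entity extractor, so an explicit check like this would sit in a custom action or in the form validation.

```python
import tempfile
from pathlib import Path


def load_lookup(path):
    """Read a look-up file with one entry per line into a lowercase set."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return {line.strip().lower() for line in lines if line.strip()}


def check_country(value, known_countries, eu_members):
    """Return 'unknown', 'eu' or 'non-eu' for a user-supplied country name."""
    name = value.strip().lower()
    if name not in known_countries:
        return "unknown"
    return "eu" if name in eu_members else "non-eu"


# Stand-in for land.txt with a few illustrative entries.
with tempfile.TemporaryDirectory() as tmp:
    lookup_file = Path(tmp) / "land.txt"
    lookup_file.write_text("Nederland\nBelgië\nNoorwegen\n", encoding="utf-8")
    countries = load_lookup(lookup_file)

eu_members = {"nederland", "belgië"}  # illustrative subset, not a full list

for answer in ["Nederland", "Noorwegen", "Atlantis"]:
    print(answer, "->", check_country(answer, countries, eu_members))
```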

Of course, I could also handle the existence check in the form, e.g. return a "not found" if the value is not in my country table. But:

  1. The look-up in the dialogue doesn’t need any technical / Python / programming skills, so it could be maintained by an SME, completely within Rasa X / OpenShift, i.e. user-friendly.
  2. I want to get to know Rasa, so I want to experience the look-up.

By the way, I use a form to gather all the information from the user on products, countries and costs.

But I like this discussion, as it forces me to reconsider it all over again :+1:

Were you able to confirm if the DIET algorithm converged by the way?

Not yet, also working on other issues.

I’ll come back as soon as I’ve run DIET again.

I managed to get it working! But I really don’t know what made it work :woozy_face:.

And yes, at least now the DIET algorithm converged, and recognition is good.

What I did:

  1. Worked only in a virtual machine, with Rasa Core
  2. Tested with Python scripts (from within PyCharm)
  3. Started with NLU only: both intents and entities
  4. Concentrated first on the look-up
  5. Made a train/test set by splitting nlu_data
  6. Created a lot of test intents from all entries in the look-up
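Steps 5 and 6 could look roughly like the sketch below: every look-up entry is dropped into a few sentence templates, annotated in the Rasa 1.x markdown format `[value](entity)`, and the result is shuffled into a train and a test set. The templates, the intent name `inform_land`, the sample entries and the helper names are all made up for illustration. Rasa also has a `rasa data split nlu` command for the splitting part.

```python
import random


def make_examples(entries, templates, entity):
    """Fill every template with every look-up entry, annotated as [value](entity)."""
    return [
        template.format(value=f"[{entry}]({entity})")
        for entry in entries
        for template in templates
    ]


def split_examples(examples, test_fraction=0.2, seed=42):
    """Shuffle reproducibly and return (train, test) lists."""
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Illustrative entries; in practice these would come from land.txt.
entries = ["Nederland", "België", "Duitsland", "Frankrijk", "Spanje"]
# Hypothetical Dutch templates for a hypothetical intent.
templates = ["ik wil exporteren naar {value}", "mijn land is {value}"]

examples = make_examples(entries, templates, "land")
train, test = split_examples(examples)

print("## intent:inform_land")
for example in train:
    print(f"- {example}")
```

With almost 200 countries, a split by look-up entry rather than by sentence would test more directly whether unseen countries are recognized.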

And here’s my pipeline:

language: nl
pipeline:
  - name: WhitespaceTokenizer
    case_sensitive: false
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: char_wb
    min_ngram: 1
    max_ngram: 6
  - name: DIETClassifier
    epochs: 100
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 50

Thanks for the assistance!