I want to evaluate the dialog policy as it was done in the TED paper, so I'm looking for a way to train different models on varying numbers of dialogs without having to split the data manually.
When I run `rasa train core` with the `--percentages` flag as described here, I still end up with a single model instead of one per percentage value.
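For context, the command I ran was roughly the following (paths and percentage values are just placeholders from my setup):

```
rasa train core -d domain.yml -s data/stories.md -c config.yml \
  --percentages 25 50 75
```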
Is there any built-in support for evaluating Core? (I also haven't found an option to use cross-validation.)
I did not specify different config files as in the example because I only want to use one config, just with varying amounts of data, so I'm not sure whether this is even possible.
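For reference, the example from the docs looks roughly like this, with two configs and a follow-up evaluation run (file names are placeholders; flags as I understand them from the docs):

```
# Trains one model per config x exclusion percentage x run;
# --percentages is the range of training data to exclude in each run
rasa train core -c config_1.yml config_2.yml \
  --out comparison_models --runs 3 --percentages 0 25 50 75

# Evaluates all trained models in that directory against the test stories
rasa test core -m comparison_models --stories stories --evaluate-model-directory
```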
To me this seems like a bug. It's a bit tricky to fix, though, as we need some marker other than the current one (multiple config files) to decide whether the user explicitly wants to do comparison training.
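Just to sketch the idea (purely hypothetical, this flag does not exist in the current CLI): an explicit switch could serve as that marker, so that a single config combined with `--percentages` would still trigger comparison training:

```
# Hypothetical --comparison flag, NOT part of the current CLI
rasa train core -c config.yml --comparison --runs 3 --percentages 0 25 50 75
```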