Evaluating stories against a trained model is not working as expected

Hi there,

I want to evaluate my stories against an already trained model, without evaluating the NLU part.

For this, I am using the HTTP API with e2e=false. My test stories contain only intents that are listed in the domain file (in fact, I am using exactly one of the stories that was used to train the model).
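The request is along these lines (a simplified sketch: the endpoint path and port are the defaults I understand the Rasa Core 0.14 HTTP API to use, and test_stories.md is just a placeholder file name):

```python
import requests

# Sketch: send a stories file to the running Rasa Core server for
# evaluation, with end-to-end (NLU) evaluation explicitly disabled.
# Endpoint, port and file name are illustrative placeholders.
with open("test_stories.md") as f:
    stories = f.read()

response = requests.post(
    "http://localhost:5005/evaluate",
    params={"e2e": "false"},
    data=stories,
)
print(response.json())
```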

My test story looks like:

[story snippet omitted: intents and bot actions only, no user messages]
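For illustration, a story in this format (intent and action names invented) looks like:

```
## example story (intents only)
* greet
  - utter_greet
* request_info
  - utter_provide_info
```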

When I evaluate the story against the trained model, I get low accuracy and the following warnings:

WARNING rasa_core.training.dsl - Found unknown intent 'None' on line 3. Please, make sure that all intents are listed in your domain yaml.

WARNING rasa_core.training.dsl - Found unknown intent 'None' on line 5. Please, make sure that all intents are listed in your domain yaml.

However, if I perform an end-to-end evaluation (e2e=true) using a story that looks like this:

[story snippet omitted: end-to-end format, intents paired with user messages]
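That is, the end-to-end story format in which each intent is paired with the user message, roughly (again with invented names and messages):

```
## example story (end-to-end)
* greet: hello there
  - utter_greet
* request_info: can you give me some information?
  - utter_provide_info
```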

I get an accuracy of 1.0 and no warnings.

It seems like, in e2e=false mode, Rasa is trying to perform the NLU evaluation too.

Could anyone help me with this issue? Thank you in advance.

I am using Rasa Core 0.14.5.

Hi lgonzalez, it looks like a bug. A couple of things you could do to find out more:

  • Remove the NL input from the story you’re evaluating. It could be that Rasa does not recognize this format if e2e=false.
  • Try the same thing on a newer version of Rasa.

Hi Nikolai,

I’ve already tried “e2e=false” without NL input (see the first snippet in my previous post).

The thing is, whether I set e2e=false or e2e=true, if I include the NL input I get an accuracy of 1.0. It therefore seems that the e2e flag is being ignored and the NLU evaluation is attempted in both cases.

If I upgrade to a newer version of Rasa I have to move to 1.x, which means major changes that I cannot take on right now.

Sorry, I misunderstood your post.

Perhaps the best workaround would be to include the NLU part but make every user input an exact copy of one of the NLU training examples. That way, NLU accuracy would be (close to) 100% and the evaluation results would reflect only the Core model.
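For example (a sketch with invented names, assuming "hi" and "I need some information" appear word-for-word in the NLU training data):

```
## workaround story - user texts copied verbatim from the NLU training data
* greet: hi
  - utter_greet
* request_info: I need some information
  - utter_provide_info
```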

Thank you for your comments, Nikolai; that’s actually the workaround I was implementing.

However, I would still like to know whether anyone has managed to solve this.