How do bracket values work in intents?

Hello,

I don’t understand something about intents. Is it useful to declare something like this:

## intent:parking_ask

- Where can I park in [Paris](city)
- Where can I park in [London](city)
- Where can I park in [New York](city)
- Where can I park in [Los Angeles](city)

Can this help the bot train better? I don’t really understand what I should write between the brackets.

Thanks !

Hi @gillesf

there are two perspectives from which this question could be answered:

Basically you want to define an intent “parking_ask”, and for that you need training data in the form of sentences. Currently, the only discriminating criterion in your sentences is the city. Depending on which algorithm you use for training, this might be a little poor. If you only want to describe the intent, think about adding sentences like “Am I able to park in Paris?”, “Does New York offer possibilities to park?”, and many more.
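For illustration, an extended intent block could look like this (the extra sentences are only examples - adapt them to your domain):

## intent:parking_ask
- Where can I park in [Paris](city)
- Am I able to park in [Paris](city)?
- Does [New York](city) offer possibilities to park?
- Is there any parking available in [London](city)?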

Writing those cities in brackets is actually the process of defining them as entities - “city” in this case. This becomes important if you want to create a FormAction which is able to ask for a city if a question about parking possibilities comes up without one - something like:

User: Show me the possibilities to park (missing the entity)
Bot: In which city do you want to park?
User: In San Francisco
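A minimal sketch of such a FormAction (the names ParkingForm, parking_form and the response text here are hypothetical, and the exact API depends on your rasa_sdk version; the domain would also need a forms: entry and an utter_ask_city response):

    from rasa_sdk.forms import FormAction


    class ParkingForm(FormAction):
        """Hypothetical form that collects a 'city' slot before answering."""

        def name(self):
            return "parking_form"

        @staticmethod
        def required_slots(tracker):
            # The bot keeps asking (via utter_ask_city) until this slot is
            # filled, e.g. from a [San Francisco](city) entity annotation.
            return ["city"]

        def submit(self, dispatcher, tracker, domain):
            city = tracker.get_slot("city")
            dispatcher.utter_message("Here is how you can park in {}.".format(city))
            return []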

Which words to put in brackets depends entirely on the definition of your entity.
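In the markdown annotation syntax, the square brackets mark the words as they appear in the sentence and the parentheses name the entity; you can even map a surface form to a canonical value (synonym syntax as in the Rasa 1.x docs):

- Where can I park in [NYC](city:New York)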

Does that answer your question?

Hello,

Yes, I know I need more data in my intents, but I shortened them for this example :slight_smile:

I don’t understand what value(s) I need to write in the brackets:

I would like to know the importance of the value between these brackets. Should I write multiple lines of the same sentence and just change this value? Are these brackets useful for Rasa, or are they just for the developer?

Sorry for my English :sleepy:

Thanks a lot !

To summarize:

Is it useful to write:

....
- Where can I park in [Paris](city)
- Where can I park in [London](city)
- Where can I park in [New York](city)
- Where can I park in [Los Angeles](city)
...

Or can I just write one example, such as:

....
- Where can I park in [Paris](city)
...

@gillesf

Ah okay - got it.

Indeed, every piece of information that you can provide to the learning algorithm is useful. So no, one single example is not enough, even if you use lookup tables. I am not quite sure exactly how many examples are needed; you would need to evaluate this for your particular setup.
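For reference, a lookup table for the city entity in the markdown training-data format looks like this (note that lookup tables only help if the entity also appears annotated in regular training examples):

## lookup:city
- Paris
- London
- New York
- Los Angeles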

Just keep in mind that you might overfit the model if you provide too much data for that single intent.


Perfect, thanks

Hi,

How can I find out whether the model overfits or not?

In case of overfitting, will it give wrong entity classifications?

Hi @Aswinprabhakaran

one possible way is to compare the resulting F1 scores on the train set and the test set, e.g. using cross-validation, documented here:

I’d suggest using the following command to get more verbose output:

rasa test nlu --config pretrained_embeddings_spacy.yml supervised_embeddings.yml \
  --nlu data/nlu.md --runs 3 --percentages 0 25 50 70 90

Keep in mind to use your own files and pipeline configs. If you want, please post the results here so we can analyse them together.
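If you just want a plain cross-validation of a single pipeline, something like the following should also work (assuming Rasa 1.x flag names - check rasa test nlu --help for your version):

rasa test nlu --nlu data/nlu.md --config config.yml --cross-validation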

To answer the question about the consequences of overfitting, it is necessary to understand what overfitting means. A simple explanation is that the model learned “too much” from the training data, such that it can’t extrapolate what it learned to unseen data. So, whatever you are training, try to use an early stopping method to avoid overfitting and then evaluate your result properly.
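As a toy illustration of the train/test comparison (plain scikit-learn with synthetic data, not Rasa-specific):

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import f1_score
    from sklearn.model_selection import train_test_split

    # Synthetic stand-in for your NLU data, just to show the comparison.
    X, y = make_classification(n_samples=500, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

    # A large gap between these two scores is the typical sign of overfitting:
    # near-perfect on data the model has seen, much worse on data it hasn't.
    print("train F1:", f1_score(y_train, model.predict(X_train)))
    print("test F1: ", f1_score(y_test, model.predict(X_test)))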

Regards

Hi, thanks for the reply.

I will look into that.

Meanwhile, when I ran the evaluation on my test data, I got the results below:

2019-06-22 16:40:43 INFO rasa.nlu.test - Intent evaluation results:

2019-06-22 16:40:43 INFO rasa.nlu.test - Intent Evaluation: Only considering those 20557 examples that have a defined intent out of 20557 examples

2019-06-22 16:40:43 INFO rasa.nlu.test - F1-Score: 0.9983095015409214

2019-06-22 16:40:43 INFO rasa.nlu.test - Precision: 0.9982888889376054

2019-06-22 16:40:43 INFO rasa.nlu.test - Accuracy: 0.9983460621686043

2019-06-22 16:40:43 INFO rasa.nlu.test - Classification report:

                 precision    recall  f1-score   support

   actor_search       0.99      0.99      0.99      1352
 actress_search       1.00      1.00      1.00      1352
         affirm       0.60      1.00      0.75         3
  costar_search       1.00      1.00      1.00      1352
director_search       0.99      0.99      0.99      1352
        goodbye       0.50      0.50      0.50         2
          greet       0.00      0.00      0.00         2
   movie_search       1.00      1.00      1.00     11897
producer_search       0.99      1.00      0.99      1352
  rating_search       1.00      1.00      1.00      1893

      micro avg       1.00      1.00      1.00     20557
      macro avg       0.81      0.85      0.82     20557
   weighted avg       1.00      1.00      1.00     20557

2019-06-22 16:40:43 INFO rasa.nlu.test - Model prediction errors saved to errors.json.

2019-06-22 16:40:44 INFO rasa.nlu.test - Confusion matrix, without normalization:

[[ 1339     0     0     0     4     0     0     0     9     0]
 [    1  1351     0     0     0     0     0     0     0     0]
 [    0     0     3     0     0     0     0     0     0     0]
 [    0     0     0  1352     0     0     0     0     0     0]
 [    6     0     0     1  1341     0     0     0     4     0]
 [    0     0     1     0     0     1     0     0     0     0]
 [    0     0     1     0     0     1     0     0     0     0]
 [    0     0     0     0     0     0     0 11897     0     0]
 [    3     0     0     0     3     0     0     0  1346     0]
 [    0     0     0     0     0     0     0     0     0  1893]]

2019-06-22 16:40:46 INFO rasa.nlu.test - Entity evaluation results:

2019-06-22 16:40:53 INFO rasa.nlu.test - Evaluation for entity extractor: CRFEntityExtractor

2019-06-22 16:40:58 INFO rasa.nlu.test - F1-Score: 0.9956100901680592

2019-06-22 16:40:58 INFO rasa.nlu.test - Precision: 0.9955535721798925

2019-06-22 16:40:58 INFO rasa.nlu.test - Accuracy: 0.995691860508227

2019-06-22 16:40:58 INFO rasa.nlu.test - Classification report:

                     precision    recall  f1-score   support

      acting_person       0.96      0.98      0.97     12889
          aggmethod       1.00      1.00      1.00      2226
           composer       0.33      0.28      0.30       512
          condition       1.00      1.00      1.00      3793
            contrib       1.00      1.00      1.00       249
              count       1.00      1.00      1.00      6779
     coworker_actor       1.00      1.00      1.00      1352
   coworker_actress       1.00      1.00      1.00      1352
    coworker_costar       1.00      1.00      1.00      4056
  coworker_director       1.00      1.00      1.00      1352
  coworker_producer       1.00      1.00      1.00      1352
           director       0.99      0.99      0.99     21817
      director_role       1.00      1.00      1.00     21109
          frequency       1.00      1.00      1.00      6760
          no_entity       1.00      1.00      1.00    121696
              order       1.00      1.00      1.00      6492
           producer       0.99      0.99      0.99     21934
      producer_role       1.00      1.00      1.00     20918
 selection_criteria       1.00      1.00      1.00      2172
               time       1.00      1.00      1.00      8740
value_for_condition       1.00      1.00      1.00      2172

          micro avg       1.00      1.00      1.00    269722
          macro avg       0.97      0.96      0.96    269722
       weighted avg       1.00      1.00      1.00    269722

In the above results, the reported accuracy is the overall accuracy across all intents and entities.

What I want is the accuracy for each individual intent and each individual entity. Just as precision, F1 score, and support are reported per intent and per entity, can I get the accuracy for individual values as well?
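For what it’s worth, per-class accuracy is not part of the classification report above, but it can be derived from the confusion matrix. A sketch with scikit-learn (the labels here are placeholders - in practice, collect the true and predicted intents or entities from your evaluation run):

    from sklearn.metrics import confusion_matrix

    y_true = ["greet", "goodbye", "affirm", "greet", "goodbye"]
    y_pred = ["greet", "goodbye", "greet", "greet", "goodbye"]

    labels = sorted(set(y_true))
    cm = confusion_matrix(y_true, y_pred, labels=labels)
    total = cm.sum()

    for i, label in enumerate(labels):
        tp = cm[i, i]
        fp = cm[:, i].sum() - tp
        fn = cm[i, :].sum() - tp
        tn = total - tp - fp - fn
        # Per-class (one-vs-rest) accuracy: correct predictions for this
        # class plus correct rejections of it, over all examples.
        print("{}: {:.3f}".format(label, (tp + tn) / total))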