Extracting date from user

Hi, my use case is that I want the user to provide a start and end date so I can fetch transactions that happened during that date range. I first tried without Duckling, with the following config:

- name: WhitespaceTokenizer
- name: RegexFeaturizer
- name: LexicalSyntacticFeaturizer
- name: CountVectorsFeaturizer
- name: CountVectorsFeaturizer
  analyzer: char_wb
  min_ngram: 1
  max_ngram: 4
- name: DIETClassifier
  epochs: 100
  constrain_similarities: true
- name: EntitySynonymMapper
- name: ResponseSelector
  epochs: 100
  constrain_similarities: true
- name: FallbackClassifier
  threshold: 0.8
  ambiguity_threshold: 0.1

and by annotating the NLU examples for this intent with start_date and end_date roles. An NLU example would look like: Fetch transactions that happened between [20/02/2020]{"entity": "date", "role": "start_date"} and [01/02/2021]{"entity": "date", "role": "end_date"}. I was using forms to collect the start date and end date from the user. Every time the bot asked for the start date, it actually stored the answer in the end date slot.

I also came across Duckling and tried it with the same setup, only changing the config to include Duckling as well. But there is still no difference: the bot stores the start date given by the user in the end date slot.

I don't know if this is the right way of going about it; I just tried out something I learnt from the docs. If someone knows how to handle this situation, it would be a great help if you could share your approach. Thanks in advance!

Check out Entity Roles. The docs provide an example similar to your requirement.

Thanks @Vin. I had already tried this. Still the problem persists.

Try including DucklingEntityExtractor in the pipeline of your config.yml file and retrain.

Yes, I have done this @earroyoh. Problem still exists

Hi, how many training examples did you provide in your nlu.yml? I wonder if DIET might need some more data to learn the fine-grained roles.

Also, if you want to use roles to distinguish start and end date, only DIETClassifier and CRFEntityExtractor will pick those up, according to the docs. Duckling will just extract the dates, not roles.

Could you also provide your domain.yml and rules.yml, containing the form and slot mapping that you use to fill the slots?

Hi @MatthiasLeimeister thanks for taking time to explain.

I think I’m making a mistake in annotating the training data for entities. I’m marking the from and to dates in the training example like this: Get me transaction details from [20/11/2020]{entity: "date", role: "start"} to [21/11/2021]{entity: "date", role: "end"}.

I’m not really sure how Duckling extracts dates. Since it’s at the end of the pipeline, I thought it was necessary to mark dates as entities in the training examples rather than assume that the entire sentence would be used to extract dates. As that was not working as expected, I thought maybe I should mark all the words that might help Duckling recognize which is the start and which is the end date. So I marked the entire date range as one entity, like: Get me transaction details [from 20/11/2020 to 21/11/2021]{entity: "date"}.

domain.yml:

transactions_form:
    required_slots:
      transaction_date_range:
        - type: from_entity
          entity: date_range

stories.yml:

- story: transaction details path
  steps:
    - intent: transaction_details
    - action: transactions_form
    - active_loop: transactions_form
    - active_loop: null
    - action: action_transaction_details

I’m still very confused. Need some insights here. Thanks again

Hi @lis, Duckling uses a fixed pretrained model, meaning it won’t use the annotated dates in your nlu.yml to learn. It will parse the input message and extract the entities it is trained for. The response format can be seen in the docs here. A date would be labelled as a time entity. You can try what it finds for different inputs in this interactive demo.

For your problem of recognizing a date range with start and end date, your first approach to use DIET with annotated roles seems like a good idea. This way you can provide many examples of training data in the format you expect in your application in nlu.yml:

- intent: transaction_details
  examples: |
    - Get me transaction details from [20/11/2020]{"entity": "date", "role": "start"} to [21/11/2021]{"entity": "date", "role": "end"}.
    - Search from [02/08/2020]{"entity": "date", "role": "start"} to [30/10/2021]{"entity": "date", "role": "end"}.
    - Between [15/02/2022]{"entity": "date", "role": "start"} and [19/03/2022]{"entity": "date", "role": "end"}.
    ...

Then your form can pick up the roles for the different slots in domain.yml:

slots:
  start_date:
    type: text
    influence_conversation: false
  end_date:
    type: text
    influence_conversation: false

forms:
  transaction_form:
    required_slots:
      start_date:
        - type: from_entity
          entity: date
          role: start
      end_date:
        - type: from_entity
          entity: date
          role: end

@MatthiasLeimeister thanks again.

I was doing the exact same thing, but every time the bot asks for the start date (and the user enters it), it updates the value in the end date slot. Any inputs here? Maybe adding more training data would help. Right now I have around 20-25 examples under this intent.

Do the entity names match in the training data?

Yes! I missed that while editing here, but in my code they match.

Hey, ah I see. So you mean that the user enters an individual date, not a full sentence with the date range? Does the conversation look like this then:

Bot: What is the start date?
User: 20/11/2021
Bot: What is the end date?
User: 21/12/2022

In this case it would be hard for DIET to detect roles I think, because the NLU classification is just based on the single message, not the context. One reason could be that both dates get detected as end because the end dates are always at the end of the sentence in the training data.

But as a different approach, if it is ok for your application to ask the dates one by one, you could try to define the slots just as from_text, so that no intent and entity extraction would be needed:

utter_ask_start_date:
  - text: "Which start date?"

utter_ask_end_date:
  - text: "Which end date?"

slots:
  start_date:
    type: text
    influence_conversation: false
  end_date:
    type: text
    influence_conversation: false

forms:
  transaction_form:
    required_slots:
      start_date:
        - type: from_text
      end_date:
        - type: from_text

Once the form is activated, it should prompt with the utter_ask_start_date and utter_ask_end_date responses and fill each slot with the full text of the user's reply.
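
One thing to keep in mind with from_text is that the slot receives the whole message, so a reply like "from 22/10/2021 please" is stored verbatim. Below is a minimal sketch of pulling the date out before using it, assuming dates always arrive as DD/MM/YYYY as in the examples in this thread. extract_date_text is a hypothetical helper, not part of Rasa; it could be called from a custom action.

```python
import re
from typing import Optional

# Assumption: users type dates as DD/MM/YYYY, like the examples in this thread.
DATE_PATTERN = re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b")


def extract_date_text(message: str) -> Optional[str]:
    """Return the first DD/MM/YYYY-looking substring of the message, or None."""
    match = DATE_PATTERN.search(message)
    return match.group(0) if match else None
```

This only checks the shape of the text; it does not verify that the day and month are valid calendar values.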

Yay! That’s exactly what I was trying to ask! Thanks a lot for your input @MatthiasLeimeister. So there is no way to handle both date ranges that appear in a single sentence and date ranges that are spread over multiple messages, is there?

Basically what I expect is that user gives both the start and the end dates in the same sentence. If they don’t, I want the bot to ask them to enter those values.

Hey, you’re welcome :slight_smile: In order to support those 2 different paths a user can take, you can specify the slots to be filled in several ways, based on intents. Not sure if there’s an easier way to do this, but you could introduce an intent for date ranges and another one for single dates:

- intent: inform_date_range
  examples: |
    - Get me transaction details from [20/11/2020]{"entity": "date", "role": "start"} to [21/11/2021]{"entity": "date", "role": "end"}.
    - Search from [02/08/2020]{"entity": "date", "role": "start"} to [30/10/2021]{"entity": "date", "role": "end"}.
    - Between [15/02/2022]{"entity": "date", "role": "start"} and [19/03/2022]{"entity": "date", "role": "end"}.

- intent: inform_date
  examples: |
    - [20/11/2020](date)
    - [02/08/2020](date)
    - [15/02/2022](date)

Then you can specify in the form how the slots get filled based on the intent (see here for details of the different settings):

forms:
  transaction_form:
    required_slots:
      start_date:
        - type: from_entity
          entity: date
          role: start
          intent: inform_date_range
        - type: from_text
          intent: inform_date
      end_date:
        - type: from_entity
          entity: date
          role: end
          intent: inform_date_range
        - type: from_text
          intent: inform_date

So if a date range is given, both the start and end slots will be filled based on the detected roles. If instead the form asks for the slots one by one, they will be filled from just the text of the single date. To make this work, the form would have to be activated by both a generic transaction_details intent (as you had before) and the inform_date_range intent:

rules.yml:

- rule: Activate transaction form when user asks for transaction details
  steps:
  - intent: transaction_details
  - action: transaction_form
  - active_loop: transaction_form

- rule: Activate transaction form when user submits a date range
  steps:
  - intent: inform_date_range
  - action: transaction_form
  - active_loop: transaction_form

This way the bot can handle both kinds of conversation:

Bot loaded. Type a message and press enter (use '/stop' to exit):
Your input ->  hello
Hello, how can I help?
Your input ->  i need more transaction details
Which start date?
Your input ->  22/10/2021
Which end date?
Your input ->  24/11/2021
Ok, searching between 22/10/2021 and 24/11/2021. Goodbye
Your input ->  /restart
Your input ->  hello
Hello, how can I help?
Your input ->  i need transaction details between 22/10/2021 and 24/11/2021
Ok, searching between 22/10/2021 and 24/11/2021. Goodbye
Your input ->

@lis Did you try the current Duckling library? The time dimension has a type: interval value that does take a date/time range as input, but of course you would have to write a custom entity extractor to deal with it in Rasa.

Though I am not sure how accurately Duckling handles this date/time range interval type, because ranges are quite a specific case for various dimensions. It was mentioned here as well: https://github.com/facebook/duckling/issues/29
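
As a rough illustration of what handling an interval could look like, here is a sketch that reads the start and end out of one Duckling-style time entity. The sample payload is hand-written for illustration, assuming the interval shape is a value with "type": "interval" plus "from" and "to" sub-objects; it should be checked against what your own Duckling server actually returns. interval_bounds is a hypothetical helper name.

```python
from typing import Optional, Tuple


def interval_bounds(entity: dict) -> Optional[Tuple[Optional[str], Optional[str]]]:
    """Return (start, end) timestamps for an interval time entity, else None."""
    value = entity.get("value") or {}
    if entity.get("dim") != "time" or value.get("type") != "interval":
        return None
    # Either side may be missing for open-ended ranges, hence the defaults.
    start = (value.get("from") or {}).get("value")
    end = (value.get("to") or {}).get("value")
    return start, end


# Hand-written sample, NOT captured from a real Duckling response.
sample_entity = {
    "body": "between 22/10/2021 and 24/11/2021",
    "dim": "time",
    "value": {
        "type": "interval",
        "from": {"value": "2021-10-22T00:00:00.000+00:00", "grain": "day"},
        "to": {"value": "2021-11-24T00:00:00.000+00:00", "grain": "day"},
    },
}

start_date, end_date = interval_bounds(sample_entity)
```

A single date would come back with a different value type, so the function returns None for it and the caller can fall back to asking for the missing slot.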

Thanks @MatthiasLeimeister :slight_smile:

Ohh, interesting! I did not know this before. Thank you! I will check it out :smiley:

@souvikg10 would you recommend using duckling though? I mean in this case we will have to set up another server for just running a date extractor. How good/bad is it to use the existing default pipeline and later parse using something like dateparser in my custom actions? What is your take on this?

I would imagine that for date/time alone it won’t be worth it, but Duckling does a lot more than that. That said, the way Duckling parses date/time from natural language, and its performance, are quite impressive.

If your use case is simple enough that the date is always provided in a fixed format, a simple date parser is enough. If your users express dates in natural language, like today, tomorrow, next month, etc., then Duckling may be a better choice.
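
For the fixed-format case, a minimal sketch of the parsing step that could run inside a custom action. It uses the standard library's datetime.strptime rather than the dateparser package, and assumes both slots hold DD/MM/YYYY text; parse_date_range is a hypothetical helper name.

```python
from datetime import date, datetime
from typing import Tuple

# Assumption: both slots contain dates written as DD/MM/YYYY.
DATE_FORMAT = "%d/%m/%Y"


def parse_date_range(start_text: str, end_text: str) -> Tuple[date, date]:
    """Parse both slot values and check that the range is not reversed."""
    # strptime raises ValueError if the text does not match DATE_FORMAT.
    start = datetime.strptime(start_text.strip(), DATE_FORMAT).date()
    end = datetime.strptime(end_text.strip(), DATE_FORMAT).date()
    if end < start:
        raise ValueError("end date is before start date")
    return start, end
```

A ValueError here is a natural point to reset the slot and ask the user again.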

Thank you!

Hi @lis, this is a very nice discussion, but I am also facing the same problem. I’m new to Rasa, so if you have a repo for this, please share it, or any guidance would be very helpful.