How to resolve multiple, conflicting high-confidence results from Duckling

I’ve got Duckling configured in my pipeline to extract date, time, and date/time range values from user utterances. I’ve observed that for a single utterance, Duckling returns multiple time entities. See the example below -

we're looking for something on the third week of february

What I got -

{
  "intent": {
    "name": "1.1_state_preferences",
    "confidence": 0.529475748538971
  },
  "entities": [
    {
      "start": 26,
      "end": 38,
      "text": "on the third",
      "value": "2020-02-03T00:00:00.000-08:00",
      "confidence": 1.0,
      "additional_info": {
        "values": [
          {
            "value": "2020-02-03T00:00:00.000-08:00",
            "grain": "day",
            "type": "value"
          },
          {
            "value": "2020-03-03T00:00:00.000-08:00",
            "grain": "day",
            "type": "value"
          },
          {
            "value": "2020-04-03T00:00:00.000-07:00",
            "grain": "day",
            "type": "value"
          }
        ],
        "value": "2020-02-03T00:00:00.000-08:00",
        "grain": "day",
        "type": "value"
      },
      "entity": "time",
      "extractor": "DucklingHTTPExtractor"
    },
    {
      "start": 29,
      "end": 55,
      "text": "the third week of february",
      "value": "2020-02-17T00:00:00.000-08:00",
      "confidence": 1.0,
      "additional_info": {
        "values": [
          {
            "value": "2020-02-17T00:00:00.000-08:00",
            "grain": "week",
            "type": "value"
          },
          {
            "value": "2021-02-15T00:00:00.000-08:00",
            "grain": "week",
            "type": "value"
          },
          {
            "value": "2022-02-21T00:00:00.000-08:00",
            "grain": "week",
            "type": "value"
          }
        ],
        "value": "2020-02-17T00:00:00.000-08:00",
        "grain": "week",
        "type": "value"
      },
      "entity": "time",
      "extractor": "DucklingHTTPExtractor"
    }
  ]
}

Why is it giving me two sets of values? It seems to parse “on the third” as the third of February, March and April, but that’s not what I got from duckling.wit.ai. And even if it does extract it, shouldn’t it get a lower confidence, considering that there are more words after this phrase?

How do I resolve this?
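
This doesn't explain why Duckling returns both spans, but one possible client-side workaround is to drop any entity whose character span overlaps a longer one. The sketch below assumes the longer match is the one the user meant, which holds for the example above but may not hold in general. `drop_overlapping_entities` is a hypothetical helper, not part of Rasa or Duckling.

```python
from typing import Any, Dict, List


def drop_overlapping_entities(entities: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep only the longest span among overlapping entities (hypothetical helper)."""
    # Visit the longest spans first so a longer match wins over a shorter one.
    by_length = sorted(entities, key=lambda e: e["end"] - e["start"], reverse=True)
    kept: List[Dict[str, Any]] = []
    for entity in by_length:
        overlaps = any(
            entity["start"] < other["end"] and other["start"] < entity["end"]
            for other in kept
        )
        if not overlaps:
            kept.append(entity)
    # Restore the order in which the entities appear in the text.
    return sorted(kept, key=lambda e: e["start"])


# Spans taken from the NLU output above; other fields omitted for brevity.
entities = [
    {"start": 26, "end": 38, "text": "on the third", "entity": "time"},
    {"start": 29, "end": 55, "text": "the third week of february", "entity": "time"},
]
print([e["text"] for e in drop_overlapping_entities(entities)])
```

In a custom action, the input list could come from `tracker.latest_message["entities"]`, filtered down to the entities extracted by `DucklingHTTPExtractor` if other extractors are in the pipeline.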

Also, in some cases I’m getting the same information at two different levels of additional_info in the NLU JSON. See the following value extracted for “end of the month” -

"additional_info": {
  "values": [
    {
      "to": {
        "value": "2020-02-01T00:00:00.000-08:00",
        "grain": "second"
      },
      "from": {
        "value": "2020-01-27T04:03:34.000-08:00",
        "grain": "second"
      },
      "type": "interval"
    }
  ],
  "to": {
    "value": "2020-02-01T00:00:00.000-08:00",
    "grain": "second"
  },
  "from": {
    "value": "2020-01-27T04:03:34.000-08:00",
    "grain": "second"
  },
  "type": "interval"
}

Which one is more reliable to extract values from so that I can pass this onto a custom action?
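
In every output pasted in this thread, the top-level fields of additional_info are identical to the first entry of values, so reading the top level looks equivalent to reading values[0]; the later entries appear to be further candidates (for example the same week in 2021 and 2022). Under that assumption, a minimal sketch for pulling a start, end and grain out of additional_info could look like this. `read_time_info` is a hypothetical helper name, and the handling of a missing bound is a defensive guess rather than documented behavior.

```python
from typing import Any, Dict, Optional, Tuple


def read_time_info(
    additional_info: Dict[str, Any],
) -> Tuple[Optional[str], Optional[str], Optional[str]]:
    """Return (start, end, grain) from the top level of additional_info."""
    if additional_info.get("type") == "interval":
        # Treat a missing bound as an empty dict so the lookups below are safe.
        start = additional_info.get("from") or {}
        end = additional_info.get("to") or {}
        grain = start.get("grain") or end.get("grain")
        return start.get("value"), end.get("value"), grain
    # A single point in time has no end bound.
    return additional_info.get("value"), None, additional_info.get("grain")


# The "end of the month" interval from above, top-level fields only.
interval_info = {
    "to": {"value": "2020-02-01T00:00:00.000-08:00", "grain": "second"},
    "from": {"value": "2020-01-27T04:03:34.000-08:00", "grain": "second"},
    "type": "interval",
}
print(read_time_info(interval_info))
```

Returning one fixed shape for both cases means the custom action only has to check whether the end is `None`.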

@Tanja - would you be able to help here? Is there anyone from Duckling that I would need to approach? Thanks again!

Bump. Can anyone help?

I’m seeing different results from what you have there. Can you share how you have your config set up currently? When I’m testing these phrases directly against Duckling via curl, I see what I would expect to see.

For example the end of the month for me:

curl -XPOST http://localhost:8000/parse --data 'locale=en_US&text=end of the month' | json_pp

[
   {
      "end" : 16,
      "dim" : "time",
      "value" : {
         "type" : "interval",
         "to" : {
            "value" : "2020-03-01T00:00:00.000-08:00",
            "grain" : "day"
         },
         "from" : {
            "value" : "2020-02-21T00:00:00.000-08:00",
            "grain" : "day"
         },
         "values" : [
            {
               "from" : {
                  "value" : "2020-02-21T00:00:00.000-08:00",
                  "grain" : "day"
               },
               "to" : {
                  "grain" : "day",
                  "value" : "2020-03-01T00:00:00.000-08:00"
               },
               "type" : "interval"
            }
         ]
      },
      "start" : 0,
      "body" : "end of the month",
      "latent" : false
   }
]

This seems more realistic: both bounds of the interval have a grain of day, whereas yours have a grain of second.
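
For anyone who would rather script this check than use curl, the same request can be sketched with Python's standard library. The base URL assumes Duckling is reachable on localhost:8000 as in the curl call above, and the `tz` form field is included on the assumption that you want to mirror the timezone set in the Rasa pipeline. `build_parse_request` and `parse_with_duckling` are hypothetical helper names.

```python
import json
import urllib.parse
import urllib.request


def build_parse_request(
    text: str,
    base_url: str = "http://localhost:8000",  # assumed: Duckling published on this port
    locale: str = "en_US",
    tz: str = "US/Pacific",
) -> urllib.request.Request:
    """Build the form-encoded POST that the curl command above sends."""
    form = {"locale": locale, "text": text, "tz": tz}
    data = urllib.parse.urlencode(form).encode("utf-8")
    return urllib.request.Request(f"{base_url}/parse", data=data, method="POST")


def parse_with_duckling(text: str, **kwargs: str) -> list:
    """Send the request and decode the JSON list Duckling returns."""
    request = build_parse_request(text, **kwargs)
    with urllib.request.urlopen(request, timeout=5) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a Duckling server running, `parse_with_duckling("end of the month")` should return a list shaped like the JSON above; a connection error here means the same thing as a failing curl call, namely that the server is not reachable at that address.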

Hello @btotharye, thanks for your response. This is my compose file -

version: '3.4'
services:
  rasa:
    image: sample-rasa-core:latest
    ports:
      - 5005:5005
    command: ["run", "--enable-api", "--cors", " \"*\"", "--verbose"]

  rasa-duckling:
    image: rasa/duckling:latest

  rasa-spacy:
    image: rasa/rasa:latest-spacy-en

  rasa-action-server:
    image: sample-rasa-action-server:latest
    command: ["start", "--actions", "actions", "-vv"]

This is my pipeline configuration for Duckling in config.yml -

pipeline: 
  - name: "WhitespaceTokenizer"
  - name: "RegexFeaturizer"
  - name: "CRFEntityExtractor"
  - name: "EntitySynonymMapper"
  - name: "CountVectorsFeaturizer"
    analyzer: "char_wb"
    min_ngram: 1
    max_ngram: 4
  - name: "EmbeddingIntentClassifier"
  - name: "DucklingHTTPExtractor"
    url: "http://rasa-duckling:8000"
    dimensions: ["time", "date", "amount-of-money", "number"]
    locale: "en_US"
    timezone: "US/Pacific"

I tried the curl command, but for some reason it’s not working for me. I get a curl (6) or a curl (7) error. What should I enter instead of http://localhost:8000/parse?

I guess you don’t have Duckling exposed in this case because it’s in compose. That’s why it doesn’t work; you would have to expose it to be able to hit it from outside the container network.

I’m just running duckling as a standard container to test it

docker run -p 8000:8000 rasa/duckling

Are you actually using time in your duckling extractions? I think it might be giving you time and date data for the phrases you were mentioning.

“I guess you don’t have Duckling exposed in this case because it’s in compose.”

Do you mean the curl command?

“Are you actually using time in your duckling extractions? I think it might be giving you time and date data for the phrases you were mentioning.”

Yes I am. But the values returned with 1.0 confidence by Duckling are both “time” entities for the third week of february.

No, I meant that because you’re using compose, there typically isn’t a reason to expose Duckling, since only Rasa uses it. To interact with it from your local machine, you would have to publish its port (8000) in the compose file, and from that file it appears you aren’t.

I see. So just to clarify, for the curl commands, I need to expose the port in my compose. I can try that.

As for the bot config, I’m able to get responses from Duckling when I run the bot through the following commands -

docker build -t sample-rasa-core . && docker build -t sample-rasa-action-server -f Dockerfile.actionserver . && docker-compose -f rasa-compose.yml up

docker run -it -v $(pwd):/app --name=rasa-interactive --network=rasa_default sample-rasa-core interactive

Is there still anything wrong with my config for Duckling to send me two high-confidence values? What’s the response you’re getting for the utterance “third week of february”?

Is there a reason you are using time and date? Do you need both of these in your current setup? I would have to test again but I’m wondering if you only had time here if that would change anything.

Part of the bot’s responsibility is to book meeting/appointments. So we would need to get both date and time. I’ll try again with just time and report back. Thanks @btotharye!

@btotharye, looks like it didn’t matter. I again got two time entities with 1.0 confidence from Duckling. This is the response -

NLU model loaded. Type a message and press enter to parse it.
Next message:
we're looking for something on the third week of february
{
  "intent": {
    "name": "1.1_state_preferences",
    "confidence": 0.406886488199234
  },
  "entities": [
    {
      "start": 28,
      "end": 40,
      "text": "on the third",
      "value": "2020-03-03T00:00:00.000-08:00",
      "confidence": 1.0,
      "additional_info": {
        "values": [
          {
            "value": "2020-03-03T00:00:00.000-08:00",
            "grain": "day",
            "type": "value"
          },
          {
            "value": "2020-04-03T00:00:00.000-07:00",
            "grain": "day",
            "type": "value"
          },
          {
            "value": "2020-05-03T00:00:00.000-07:00",
            "grain": "day",
            "type": "value"
          }
        ],
        "value": "2020-03-03T00:00:00.000-08:00",
        "grain": "day",
        "type": "value"
      },
      "entity": "time",
      "extractor": "DucklingHTTPExtractor"
    },
    {
      "start": 31,
      "end": 57,
      "text": "the third week of february",
      "value": "2020-02-17T00:00:00.000-08:00",
      "confidence": 1.0,
      "additional_info": {
        "values": [
          {
            "value": "2020-02-17T00:00:00.000-08:00",
            "grain": "week",
            "type": "value"
          },
          {
            "value": "2021-02-15T00:00:00.000-08:00",
            "grain": "week",
            "type": "value"
          },
          {
            "value": "2022-02-21T00:00:00.000-08:00",
            "grain": "week",
            "type": "value"
          }
        ],
        "value": "2020-02-17T00:00:00.000-08:00",
        "grain": "week",
        "type": "value"
      },
      "entity": "time",
      "extractor": "DucklingHTTPExtractor"
    }
  ],

I’ll try to test this in the next couple of days. I wasn’t seeing this behavior in my initial testing, though.

@btotharye Let me know if you had any luck with this or have any further questions that I can answer for you. Thanks!

Sorry, I’ve been kinda tied up and honestly haven’t had a chance to look at this. I will eventually, since I’m going to be doing the same thing in a new assistant I’m making soon.

Thanks! Could you suggest any other forum where I can ask this question?