How to handle multiple languages with Duckling?

I am working on a healthcare chatbot that handles two languages, Romanian and Italian. Development is being done in English. After some tests with patients, we found that the Duckling component doesn’t parse dates correctly when they are given in Romanian or Italian.

Below is the pipeline:

language: en
pipeline:
  - name: sentiment.SentimentAnalyzer
  - name: WhitespaceTokenizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: char_wb
    min_ngram: 1
    max_ngram: 4
  - name: "DucklingEntityExtractor"
    url: "http://localhost:8000"
    dimensions: ["time", "number"]
  - name: DIETClassifier
    epochs: 100
    constrain_similarities: true
  - name: EntitySynonymMapper
  - name: FallbackClassifier
    threshold: 0.25

For each user, the language is switched automatically by the app. The extractor works well in English but not in Romanian. Below is an example, where the date is split into two number entities instead of being returned as a single time entity:

{
  "text": "11 Ianuarie 2021",
  "intent": {
    "id": -4402847296280810522,
    "name": "inform",
    "confidence": 1.0
  },
  "entities": [
    {
      "start": 0,
      "end": 2,
      "text": "11",
      "value": 11,
      "confidence": 1.0,
      "additional_info": {
        "value": 11,
        "type": "value"
      },
      "entity": "number",
      "extractor": "DucklingEntityExtractor"
    },
    {
      "start": 12,
      "end": 16,
      "text": "2021",
      "value": 2021,
      "confidence": 1.0,
      "additional_info": {
        "value": 2021,
        "type": "value"
      },
      "entity": "number",
      "extractor": "DucklingEntityExtractor"
    }
  ]
}

Any suggestions on how I could make it work?

@EvanMath try using the spaCy pipeline; I hope that helps.

hey @nik202 thank you for your reply.

I tried it, but it didn’t work. I think it might help if I could somehow change the pipeline language dynamically.

@EvanMath try including the locale parameter (Components):

locale: "de_DE"

Hey @rasa_learner ,

I did, but it didn’t work either. I am not sure I did it properly. For multiple languages, should it be something like this?

locale: ["en_GB", "it_IT", "ro_RO"]

I can’t find any related documentation.

The correct format for handling multiple languages with Duckling seems to be one DucklingEntityExtractor entry per locale:

  - name: "DucklingEntityExtractor"
    url: "http://localhost:8000"
    dimensions: ["time", "number"]
    locale: "ro_RO"
  - name: "DucklingEntityExtractor"
    url: "http://localhost:8000"
    dimensions: ["time", "number"]
    locale: "en_US"
  - name: "DucklingEntityExtractor"
    url: "http://localhost:8000"
    dimensions: ["time", "number"]
    locale: "it_IT"

The following, with a list of locales in a single entry, did not work as expected:

  - name: "DucklingEntityExtractor"
    url: "http://localhost:8000"
    dimensions: ["time", "number"]
    locale: ["ro_RO", "en_US", "it_IT"]
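To see what each locale actually returns, it can help to query the Duckling server directly, outside of Rasa. Below is a minimal sketch, assuming the server runs at the same URL as in the pipeline and exposes Duckling's usual POST /parse endpoint with text, locale and dims form fields. The helper names (build_parse_request, parse_with_duckling, compare_locales) are made up for illustration and are not part of Rasa or Duckling.

```python
import json
import urllib.parse
import urllib.request

# Assumption: same Duckling server URL as in the pipeline config.
DUCKLING_URL = "http://localhost:8000"


def build_parse_request(text, locale, dims=("time", "number")):
    """Build the form-encoded body for a Duckling /parse request."""
    fields = {
        "text": text,
        "locale": locale,
        # Duckling expects the dimensions as a JSON-encoded list.
        "dims": json.dumps(list(dims)),
    }
    return urllib.parse.urlencode(fields).encode("utf-8")


def parse_with_duckling(text, locale, dims=("time", "number"), url=DUCKLING_URL):
    """Send one text to Duckling and return the decoded list of matches."""
    request = urllib.request.Request(
        f"{url}/parse", data=build_parse_request(text, locale, dims)
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return json.loads(response.read().decode("utf-8"))


def compare_locales(text, locales, url=DUCKLING_URL):
    """Return {locale: [(dimension, matched text), ...]} for each locale."""
    results = {}
    for locale in locales:
        matches = parse_with_duckling(text, locale, url=url)
        results[locale] = [(m["dim"], m["body"]) for m in matches]
    return results
```

With the server running, calling compare_locales("11 Ianuarie 2021", ["ro_RO", "en_US", "it_IT"]) should show which locales recognise the text as a time and which only pick out the numbers.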

I know it has been a while since you posted on this topic, but I was wondering if you could help me with the Duckling implementation in other languages. Did you provide examples in the NLU data for Duckling to extract, and if not, how did you go about using Duckling for other languages? I am implementing Duckling in my Spanish bot, and when I give it a number, it is extracted as a time.

Hello Dante, there’s a video here that has an example in German. Does that help?

Thanks for the link, but unfortunately it was just a video of the general Duckling setup, which I already have.