How to use duckling in pipeline

I’ve installed and compiled duckling as instructed on their github page, and I can also run their example from bash:

stack exec duckling-example-exe

But when I try to train my NLU model I get an error:

Exception: Not all required packages are installed. To use this pipeline, you need to install the missing dependencies. Please install duckling

What part of the setup did I miss?

config.yml:

pipeline: 
- name: "tokenizer_whitespace"
- name: "intent_entity_featurizer_regex"
- name: "ner_crf"
- name: "ner_synonyms"
- name: "intent_featurizer_count_vectors"
- name: "intent_classifier_tensorflow_embedding"
  intent_tokenization_flag: true
  intent_split_symbol: "+"
- name: "ner_duckling"
  dimensions: ["time"]

The duckling you installed is what the ner_duckling_http component uses, not ner_duckling. ner_duckling is simply the Python wrapper around the Clojure version of duckling, which Facebook has deprecated. Your pipeline should look like this:

pipeline: 
- name: "tokenizer_whitespace"
- name: "intent_entity_featurizer_regex"
- name: "ner_crf"
- name: "ner_synonyms"
- name: "intent_featurizer_count_vectors"
- name: "intent_classifier_tensorflow_embedding"
  intent_tokenization_flag: true
  intent_split_symbol: "+"
- name: "ner_duckling_http"
  url: "http://duckling:8000"
  dimensions: ["time"]

url is the address where your duckling server is running.
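Before training, it can help to check that the server behind url answers at all. The following is a minimal sketch, assuming the server exposes the /parse endpoint described in the duckling README and accepts form-encoded text, locale and dims fields. The function names and the localhost address are only illustrative.

```python
import json
import urllib.parse
import urllib.request


def build_parse_request(base_url, text, locale="en_GB", dimensions=("time",)):
    """Return (url, body) for a POST to the duckling /parse endpoint."""
    url = base_url.rstrip("/") + "/parse"
    fields = {
        "text": text,
        "locale": locale,
        # The dimensions are sent as a JSON-encoded list.
        "dims": json.dumps(list(dimensions)),
    }
    body = urllib.parse.urlencode(fields).encode("utf-8")
    return url, body


def parse_with_duckling(base_url, text, **kwargs):
    """Send the request. This needs a running duckling server at base_url."""
    url, body = build_parse_request(base_url, text, **kwargs)
    request = urllib.request.Request(url, data=body, method="POST")
    with urllib.request.urlopen(request, timeout=5) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Only builds and prints the request; no network access happens here.
    url, body = build_parse_request("http://localhost:8000", "tomorrow at eight")
    print(url)
    print(body.decode("utf-8"))
```

Calling parse_with_duckling("http://localhost:8000", "tomorrow at eight") against a running server should return the parsed entities as a list. If it raises a connection error, the url in the pipeline is probably wrong as well.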

Great, that is working! However, do I need to write my own Haskell code to specify the correct language? There is not much info on their github page, and I have no experience with that language.

Just add locale to the pipeline:

pipeline: 
- name: "tokenizer_whitespace"
- name: "intent_entity_featurizer_regex"
- name: "ner_crf"
- name: "ner_synonyms"
- name: "intent_featurizer_count_vectors"
- name: "intent_classifier_tensorflow_embedding"
  intent_tokenization_flag: true
  intent_split_symbol: "+"
- name: "ner_duckling_http"
  url: "http://duckling:8000"
  locale: "NL_Nothing"
  dimensions: ["time"]

NL is the language. Since NL has no regional variants as such in duckling, the region is Nothing. Similarly, en_US, en_GB, and en_CA are locales.

Excellent. Thank you. Just one note. I had to specify the locale as “da_DK” for Danish, or it would not work.
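The locale values in this thread all have the shape LANG_REGION. As a small sketch, assuming that shape is what the component expects, a hypothetical helper (split_locale is not part of Rasa or duckling) can catch a bare language code before it ends up in config.yml:

```python
def split_locale(locale):
    """Split a locale such as "da_DK" into (language, region).

    Raises ValueError when either part is missing, e.g. for a bare "da".
    """
    language, separator, region = locale.partition("_")
    if not language or not separator or not region:
        raise ValueError(f"expected LANG_REGION, got {locale!r}")
    return language, region


if __name__ == "__main__":
    for value in ("da_DK", "en_GB", "NL_Nothing"):
        print(value, "->", split_locale(value))
```

This only checks the format. Whether a given language and region pair is supported still has to be checked against the duckling repository.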

The settings for ner_duckling_http should be part of our NLU docs soon too btw!

My final pipeline for those interested

pipeline: 
- name: "tokenizer_whitespace"
- name: "intent_entity_featurizer_regex"
- name: "ner_crf"
- name: "ner_synonyms"
- name: "intent_featurizer_count_vectors"
- name: "intent_classifier_tensorflow_embedding"
  intent_tokenization_flag: true
  intent_split_symbol: "+"
- name: "ner_duckling_http"
  url: "http://0.0.0.0:8000"
  locale: "da_DK"
  dimensions: ["time"]

Update: I found the solution in the docs. I updated my pipeline with

pipeline:
- name: "ner_duckling_http"
  timezone: "Europe/Copenhagen"

Original

Addendum.

How do I handle time zones? I believe the normal procedure for duckling is to send the local time zone with the web request, but I can’t do that with rasa. If I’m reading the datetime right, the UTC offset is -7: “2018-08-06T00:00:00.000-07:00”. I need UTC+1.
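To see why the offset matters, the quoted value can be converted to other offsets. This is a minimal sketch using only the Python standard library. The fixed +01:00 offset is an assumption taken from the post; a named zone such as Europe/Copenhagen would also account for daylight saving time.

```python
from datetime import datetime, timedelta, timezone

# Value quoted in the post: midnight at a UTC offset of -07:00.
raw = "2018-08-06T00:00:00.000-07:00"
parsed = datetime.fromisoformat(raw)

# The same instant expressed in UTC and at a fixed +01:00 offset.
as_utc = parsed.astimezone(timezone.utc)
as_plus_one = parsed.astimezone(timezone(timedelta(hours=1)))

print(as_utc.isoformat())       # 2018-08-06T07:00:00+00:00
print(as_plus_one.isoformat())  # 2018-08-06T08:00:00+01:00
```

Midnight at -07:00 is 08:00 at +01:00, so a date resolved in the wrong zone is eight hours away from local midnight.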

What do you mean? You’re specifying the timezone in your config file right?

Sorry, that was a confusing post. I originally made the addendum. After I found the solution I updated the post. I should probably have deleted the original message.

Has anybody used Duckling hosted in the cloud (on AWS or Heroku)? I would like to understand the deployment process so I can use that endpoint as an API (url: “http://duckling:8000”).

Thanks

I think I should clarify the url I suggested, http://duckling:8000. This is actually my Docker environment, where I create a network and name each of my containers.

As far as I am aware, Duckling is not a hosted service. You have to host it yourself.

There are several ways. You can host the Docker image: https://hub.docker.com/r/rasa/duckling/

You can also deploy it manually onto your EC2 server. Follow the installation process here
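A host name like duckling only resolves inside the Docker network that defines it, so the same url can work in one environment and fail in another. As a rough sketch, the helpers below (hypothetical names, standard library only) pull the host and port out of the url and try a plain TCP connection to it:

```python
import socket
from urllib.parse import urlsplit


def host_and_port(url):
    """Return (host, port) for a url, falling back to the scheme's default port."""
    parts = urlsplit(url)
    port = parts.port or (443 if parts.scheme == "https" else 80)
    return parts.hostname, port


def is_reachable(url, timeout=2.0):
    """Return True if a TCP connection to the url's host and port succeeds."""
    host, port = host_and_port(url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False


if __name__ == "__main__":
    print(host_and_port("http://duckling:8000"))
```

Running is_reachable("http://duckling:8000") from the machine that trains the model shows whether that name and port are visible from there. It does not prove that the service listening is duckling.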

I want to use India timeline. What is the locale code?

Did you check their repo? I am sure you will find the locale code for different countries

@akelad When?

They are, aren’t they? See Component Configuration.

I don’t think it’s the right decision to go with duckling. There are plenty of other hosts available in the market, such as Amazon Web Services, Microsoft Azure, and DigitalOcean. You can go with any of them, but I am not sure why you are considering it as an option.

I think you are mixing apples and oranges. Duckling is for entity extraction, not a hosting service.

Can you suggest any available project based on duckling?

Hi @abhishek1,

You can check with this repository.

But please make sure your config.yml is working. If not, change it to this:

language: en

pipeline:
- name: "WhitespaceTokenizer"
- name: "RegexFeaturizer"
- name: "CRFEntityExtractor"
- name: "EntitySynonymMapper"
- name: "CountVectorsFeaturizer"
- name: "EmbeddingIntentClassifier"
  intent_tokenization_flag: true
  intent_split_symbol: "+"
- name: "DucklingHTTPExtractor"
  url: "http://0.0.0.0:8000"
  locale: "en_GB"
  dimensions: ["time"]

policies:
- name: FallbackPolicy
- name: MemoizationPolicy
- name: KerasPolicy
- name: MappingPolicy
