Error while running Duckling Extractor in the pipeline

Hi all,

I am using the Duckling HTTP extractor in my pipeline. I have installed the Duckling server locally and am running it on port 8000 (the shell shows that it is listening on 0.0.0.0:8000).

When I call the parse API from Rasa to extract the time dimension, I get the following error in the Duckling server logs:

"POST /parse HTTP/1.1\ncontent-length: 86\ncontent-type: application/x-www-form-urlencoded; charset=UTF-8\nconnection: keep-alive\naccept: */*\naccept-encoding: gzip, deflate\nuser-agent: python-requests/2.22.0\nhost: localhost:8000\n\nsn=\"localhost:8000\" c=127.0.0.1:59719 s=127.0.0.1:8000 ctx=/ clen=86\nparams: locale: [\"en_EN\"], reftime: [\"1569441253000\"], text: [\"show failed jobs from 1 day\"], tz: [\"Europe/Berlin\"]"
A web handler threw an exception. Details:
TerminateSessionException user error (Text.Regex.PCRE.String died: (21,"invalid UTF-8 string")) 

How do I proceed? Here are the contents of my config.yml file:

language: en
pipeline:
 - name: "SpacyNLP"
 - name: "SpacyTokenizer"
 - name: "SpacyFeaturizer"
 - name: "RegexFeaturizer"
 - name: "CountVectorsFeaturizer"
 - name: "CRFEntityExtractor"
   features: [["low", "title", "upper","pos","pos2"],["bias","low","upper","title","digit","pos","pos2","pattern"],["low", "title", "upper"]]
 - name: "EmbeddingIntentClassifier"
 - name: "EntitySynonymMapper"
 - name: "SklearnIntentClassifier"
 - name: "DucklingHTTPExtractor"
   url: "http://localhost:8000"
   dimensions:
     - time
   timezone: "Europe/Berlin"

policies:
 - name: MemoizationPolicy
 - name: KerasPolicy
 - name: MappingPolicy

Hmm, this is strange. Are you running rasa shell or rasa run (and if so, with which connector)? Also, what is your OS, and what message are you passing to Duckling?
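One way to narrow this down is to replay the request from the log directly against the Duckling server, bypassing Rasa. The following is a minimal sketch using only the Python standard library. The helper names `build_parse_request` and `parse` are hypothetical; the default parameter values are copied from the log above, and the URL assumes the server is listening on localhost:8000 as described.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_parse_request(text, url="http://localhost:8000", locale="en_EN",
                        tz="Europe/Berlin", reftime_ms=1569441253000):
    """Build a form-encoded POST to Duckling's /parse endpoint.

    Defaults mirror the parameters shown in the server log; adjust as needed.
    """
    payload = {
        "text": text,
        "locale": locale,
        "tz": tz,
        "reftime": str(reftime_ms),
    }
    # Encode the form body explicitly as UTF-8 bytes.
    body = urlencode(payload).encode("utf-8")
    headers = {"Content-Type": "application/x-www-form-urlencoded; charset=UTF-8"}
    return Request(url + "/parse", data=body, headers=headers, method="POST")


def parse(text, **kwargs):
    """Send the request and return the decoded JSON response."""
    with urlopen(build_parse_request(text, **kwargs), timeout=5) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (requires a running Duckling server):
# print(parse("show failed jobs from 1 day"))
```

If the direct call fails with the same exception, the problem is likely on the Duckling side rather than in the Rasa pipeline; if it succeeds, comparing the two requests may show what differs.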

Hi, I was able to get Duckling working. However, I ran into a new problem: if a sentence contains two entities, say a name and a time, Duckling identifies only the time entity. The name is not picked up from the sentence. How can I fix this? For example: Show me orders from last 2 days.

Orders: entityType1
last 2 days: time entity

Currently only the time entity is returned. How can I configure the pipeline to pick up entityType1 as well?

Mhm, name isn’t a Duckling entity, so it should be an entity defined in your training data; then it will be picked up by the CRF. If it’s not being picked up correctly, you probably need more training data for that entity.
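To check which extractor produced which entity, the parse result can be grouped by its "extractor" field. This is a rough sketch: `entities_by_extractor` is a hypothetical helper, and `sample` is an illustrative, hand-written result rather than real output (in particular, Duckling's actual time value is a structured object, simplified here to a string).

```python
from collections import defaultdict


def entities_by_extractor(parse_result):
    """Group (entity, value) pairs by the extractor that produced them."""
    grouped = defaultdict(list)
    for ent in parse_result.get("entities", []):
        extractor = ent.get("extractor", "unknown")
        grouped[extractor].append((ent["entity"], ent["value"]))
    return dict(grouped)


# Illustrative parse result, NOT real output; values are placeholders.
sample = {
    "text": "Show me orders from last 2 days",
    "entities": [
        {"entity": "entityType1", "value": "orders",
         "start": 8, "end": 14, "extractor": "CRFEntityExtractor"},
        {"entity": "time", "value": "last 2 days",
         "start": 20, "end": 31, "extractor": "DucklingHTTPExtractor"},
    ],
}

for extractor, found in entities_by_extractor(sample).items():
    print(extractor, found)
```

If nothing appears under CRFEntityExtractor for a message, that points to missing or insufficient annotated examples for the entity rather than to a Duckling setting.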

Got it. It's working now. Thanks for the help, Ella.

Hello @surajhes

I am facing the same problem as you: when I submit a request to Duckling, I get a similar error in the log file:

A web handler threw an exception. Details:
TerminateSessionException user error (Text.Regex.PCRE.String died: (21,"invalid UTF-8 string")) 

Do you mind explaining how you solved this issue?

Thanks