Rasa NLU unable to extract JSON input from text

Hello!

I’m training a chatbot to be able to easily modify and connect with a back-end database. For this, there are a few intents in which the user will be expected to directly input some JSON data with which Rasa will make the API call.

One of the training examples is:

- Create transition for work order [{\"region\":\"AM\",\"subregion\":\"CA\",\"dispatch_id\":\"xxxxx\"}](status_data) for [onSite](status_type)

After training the model and running the NLU shell, the ‘status_type’ entity with value ‘onSite’ is recognized and extracted correctly. However, that’s the only entity extracted; ‘status_data’ is completely ignored.

I don’t know where I’m going wrong or why the JSON part of the text isn’t being extracted. I have tried escaping the double quotes with a backslash as well as leaving them unescaped; neither seems to work. I’ve also tried both DIETClassifier and CRFEntityExtractor. Here is my current config.yml file:

language: "en"
pipeline:
  - name: WhitespaceTokenizer
  - name: RegexFeaturizer
  - name: LexicalSyntacticFeaturizer
  - name: CountVectorsFeaturizer
    analyzer: "char_wb"
    min_ngram: 1
    max_ngram: 4
  - name: DIETClassifier
    epochs: 100
  - name: CRFEntityExtractor
  - name: EntitySynonymMapper
  - name: ResponseSelector
    epochs: 100

I would really appreciate it if someone could guide me on how to make this work. Thank you!

I think that is due to the way we parse the incoming text. It might be that the status_data entity is not recognized correctly in the training data and is therefore not trained correctly. Can you share your training log file? It should say what kinds of entities it found.

It might be easier to use a form for your use case. You can prompt the user with a couple of questions, one of them asking for the JSON data. You can then fill a specific slot directly from the incoming text using the from_text slot mapping.
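Once the raw message text is in a slot, the JSON still has to be pulled out of it before making the API call. Here is a minimal sketch of how that could look in plain Python, for example inside a custom action. The helper name extract_json_payload is hypothetical (it is not part of Rasa), and the sketch assumes the user sends plain, unescaped JSON somewhere in the message:

```python
import json
from typing import Any, Optional


def extract_json_payload(text: str) -> Optional[Any]:
    """Return the first JSON object or array embedded in `text`, or None.

    Hypothetical helper, not a Rasa API: it scans for an opening brace or
    bracket and lets the standard-library decoder try to parse from there.
    """
    decoder = json.JSONDecoder()
    for start, char in enumerate(text):
        if char not in "{[":
            continue
        try:
            # raw_decode parses one JSON value starting at `start` and
            # ignores whatever text follows it.
            payload, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            # Not valid JSON at this position; keep scanning.
            continue
        return payload
    return None


message = (
    'Create transition for work order '
    '{"region":"AM","subregion":"CA","dispatch_id":"xxxxx"} for onSite'
)
status_data = extract_json_payload(message)
print(status_data)
```

This way the NLU model only has to recognize the intent and short entities such as status_type, and the JSON part is handled by a regular parser instead of an entity extractor.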