Extracting entities in end-to-end Rasa training

Hi,

I am experimenting with end-to-end (e2e) training in Rasa Open Source and I am interested in extracting entities from end-to-end stories. I get the following YAML syntax error when training:

> story:
>   steps:
>   - bot: Can I have your phone number?
>   - user: [123-456-7890](phone_number)

Error:
YamlSyntaxException: Failed to read 'data/stories.yml'. while parsing a block mapping
      - user: [123-456-7890](phone_number) 
        ^ (line: 53)
expected <block end>, but found '<scalar>'

How can I solve this error?

Also, I would highly appreciate any recommendations for the config.yml file for e2e with entity extraction. Thanks!

I’m not at all familiar with the End-to-end training data format, but I think the error may be due to the missing hyphen ( - ) in front of the keyword story.

Hyphens, spaces, and new lines are all essential for the file to be read without errors. Note that YAML does not allow tabs for indentation, only spaces.

In this case the keyword story should start with a hyphen and be at the same level as steps, see this example:

image
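
A minimal sketch of that layout, assuming the usual Rasa stories file with a top-level stories key (the excerpt in the question does not show it). The story name is hypothetical. The phone number line is wrapped in single quotes here so that YAML reads it as plain text; whether other parts of the file need changes cannot be told from the excerpt.

```yaml
stories:
- story: ask for phone number   # hypothetical name; note the leading hyphen
  steps:                        # "steps" lines up with "story"
  - bot: Can I have your phone number?
  # Quoted so the leading "[" is not read as the start of a YAML list
  - user: '[123-456-7890](phone_number)'
```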

I hope it helps :wink:

Hi, you can use YAML Checker to check your YAML files.

Also, start your story with a user message (intent).

Line 53 of your file (line 5 in the example here) is not valid.

The following YAML is valid

[image]

If this doesn’t work, then please provide your complete stories so we can check for <block end>

Hi and thank you for your response. I have tried testing my yml file using online tools and it is correct. My problem is with “tagging entities” within end-to-end stories.

I know that if I add words (i.e., string) in front of those numbers and symbols, the yml file will be correct. But I cannot modify the text. I need to keep the format as is…

I think it might be better to re-phrase my issue. I am trying to extract numbers (e.g., money, phone-number, doorplate, etc. that are not accompanied by any strings) as entities. For instance, imagine the following part of my story:

- story: story_with_numbers_as_entities
  steps:
  - user: Hi! I want to know how much downpayment is there for leasing a BMW i8?
  - bot: Can you please tell me your annual income?
  - user: 150,000$

Now, I want to extract the user’s income (i.e., 150,000$) as an entity ({entity: money, role: annual_income}), so my story would look like below:

- story: story_with_numbers_as_entities
  steps:
  - user: Hi! I want to know how much downpayment is there for leasing a BMW i8?
  - bot: Can you please tell me your annual income?
  - user: [150,000$]{"entity": "money", "role": "annual_income"}

This annotation leads to a YAML error, but it is the format that Rasa requires (at least within the NLU file). Adding quotation marks, such as “[150,000$]{“entity”: “money”, “role”: “annual_income”}” or “[150,000$]{entity: money, role: annual_income}”, also leads to YAML and Rasa errors.

I have tried looking for an example where stand-alone numbers are extracted as entities, but I could not find a single one.
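
For reference, the YAML side of this can be sketched as follows. In YAML, an unquoted value that begins with [ is read as an inline list, so the parser does not expect anything after the closing ]. Wrapping the value in double quotes fails in a different way, because the double quotes inside the annotation end the string early. Wrapping the whole value in single quotes avoids both problems and keeps the text unchanged. This only addresses the YAML error. Whether the end-to-end story reader then picks up the annotation as an entity is an assumption that would need to be checked against the Rasa version in use.

```yaml
- story: story_with_numbers_as_entities
  steps:
  - user: Hi! I want to know how much downpayment is there for leasing a BMW i8?
  - bot: Can you please tell me your annual income?
  # Single quotes make the whole annotation one YAML string, and the
  # double quotes inside the braces can stay as they are.
  - user: '[150,000$]{"entity": "money", "role": "annual_income"}'
```

The same quoting applies to the phone number line from the first post.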

Hi Parisa,

As you’ve probably seen, e2e is experimental. I’m not aware of any production bots using this feature. You should switch to standard intents.

For entity extraction, there’s a good blog post on this topic here. Duckling is best for numbers and dates.

You would then use forms or separate intents to extract different numeric values depending on your use case.
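
A rough sketch of what the Duckling part of config.yml could look like. This is only an excerpt, not a full pipeline. It assumes a Duckling server is already running separately at the URL shown (the address is a placeholder) and that the two listed dimensions are the ones of interest. Duckling reports entities under its own dimension names, such as amount-of-money, rather than a custom name like money, and it does not assign roles, so values such as annual_income would still need to be handled with forms or slot mappings. Whether it parses an input like 150,000$ as intended is worth testing.

```yaml
# config.yml (pipeline excerpt only)
pipeline:
  - name: DucklingEntityExtractor
    # Placeholder address of a separately running Duckling server
    url: "http://localhost:8000"
    # Only keep these Duckling dimensions
    dimensions: ["amount-of-money", "phone-number"]
    locale: "en_US"
```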

Greg