How do we extract entities in end-to-end learning?

Please take a look at the docs for how to create stories for e2e training: https://rasa.com/docs/rasa/training-data-format#end-to-end-training

You can mark entities in user text the same way you mark them in the NLU data, using the [text](entity) annotation.

version: "2.0"

stories:
- story: end to end happy path
  steps:
  - user: "hi"
  - bot: "hi!"
  - user: "I'm looking for a restaurant"
  - bot: "how about [Chinese](cuisine) food?"
  - user: "sure"
  - bot: "here's what I found ..."
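
For the user side specifically, here is a minimal sketch. It assumes that `user` steps in an end-to-end story accept the same inline [text](entity) annotation as NLU examples. The story name and the utterances are made up for illustration, so check the linked docs for your Rasa version.

```yaml
version: "2.0"

stories:
- story: end to end with a user entity  # hypothetical story name
  steps:
  - user: "hi"
  - bot: "hi!"
  # The entity is annotated directly in the user text: value "Chinese", entity "cuisine"
  - user: "I'm looking for [Chinese](cuisine) food"
  - bot: "here's what I found ..."
```

Note the straight double quotes: YAML treats curly quotes as ordinary characters, so they would end up inside the message text.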