NLU Training data - migrating from DialogFlow

I'm new to Rasa and am currently trying to migrate a medium-sized project from Dialogflow to Rasa. It has around 700 intents in total. Not all intents have training phrases, because some were used only for state tracking (to push the model into a certain state).

We have around 36 distinct entities. I was not able to train a Rasa model properly using the conversion scripts for Dialogflow, so I wrote my own scripts to generate JSON training data in the format I found in the examples:

{
  "text": "show me a mexican place in the centre",
  "intent": "restaurant_search",
  "entities": [
    { "start": 31, "end": 37, "value": "centre", "entity": "location" },
    { "start": 10, "end": 17, "value": "mexican", "entity": "cuisine" }
  ]
},
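For reference, here is a minimal sketch of how a record in this format could be built so that the "start" and "end" offsets always match the text. The helper name `make_example` is hypothetical (it is not part of Rasa or Dialogflow), and it assumes each entity value occurs exactly once in the phrase, since `str.find` only returns the first occurrence.

```python
import json


def make_example(text, intent, entity_values):
    """Build one training example in the JSON format shown above.

    entity_values maps an entity name to the surface value found in `text`.
    Assumption: each value occurs only once in the phrase.
    """
    entities = []
    for entity, value in entity_values.items():
        start = text.find(value)  # first occurrence only
        if start == -1:
            raise ValueError(f"{value!r} does not occur in {text!r}")
        entities.append(
            {
                "start": start,
                "end": start + len(value),  # end offset is exclusive
                "value": value,
                "entity": entity,
            }
        )
    return {"text": text, "intent": intent, "entities": entities}


example = make_example(
    "show me a mexican place in the centre",
    "restaurant_search",
    {"location": "centre", "cuisine": "mexican"},
)
print(json.dumps(example, indent=2))
```

Computing the offsets from the text, rather than writing them by hand, makes it harder for a generated example to end up with a span that points at the wrong characters.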

Some training phrases contain multiple entities, and because of that I ended up with a huge number of combinations of training phrases and entity values (4560962). The pipeline uses spaCy to process them. I feel that nothing good will come of this approach. Could you point out how exactly I should organise my training data when an intent has many training phrases and many entities (each with many example values)?
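To illustrate where a number like that comes from: filling every entity slot with every value multiplies the value counts together. The sketch below counts the combinations for one phrase and, as one possible alternative, draws a fixed number of random fills without enumerating the full product. The `{entity}` slot syntax, the helper names and the sample values are assumptions made for this illustration, and random sampling is only one option, not a confirmed Rasa recommendation.

```python
import math
import random
import re

SLOT = re.compile(r"\{(\w+)\}")  # hypothetical slot syntax: {entity_name}


def count_combinations(template, values):
    """Number of phrases a full Cartesian expansion of the template would produce."""
    return math.prod(len(values[name]) for name in SLOT.findall(template))


def fill(template, chosen):
    """Fill the slots left to right with `chosen` and record entity offsets."""
    parts, entities, cursor, length = [], [], 0, 0
    for match, value in zip(SLOT.finditer(template), chosen):
        literal = template[cursor:match.start()]
        start = length + len(literal)
        entities.append(
            {"start": start, "end": start + len(value),
             "value": value, "entity": match.group(1)}
        )
        parts += [literal, value]
        length = start + len(value)
        cursor = match.end()
    parts.append(template[cursor:])
    return {"text": "".join(parts), "entities": entities}


def sample_examples(template, values, k, seed=0):
    """Draw up to k distinct fills without materialising the whole product."""
    names = SLOT.findall(template)
    total = count_combinations(template, values)
    rng = random.Random(seed)
    examples = []
    # Each integer below `total` encodes one combination (mixed-radix digits).
    for index in rng.sample(range(total), min(k, total)):
        chosen = []
        for name in names:
            index, digit = divmod(index, len(values[name]))
            chosen.append(values[name][digit])
        examples.append(fill(template, chosen))
    return examples


template = "show me a {cuisine} place in the {location}"
values = {"cuisine": ["mexican", "thai", "italian"], "location": ["centre", "north"]}
print(count_combinations(template, values))
samples = sample_examples(template, values, k=4)
for sample in samples:
    print(sample["text"])
```

With this approach the number of examples per phrase is capped by `k` no matter how many values each entity has, while each entity value still has a chance to appear in context.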

Hey @Kirill. Have you looked into one of our tutorials on how to migrate a DialogFlow assistant to Rasa? It’s a step-by-step tutorial and I hope it will help you move forward with the migration: https://medium.com/rasa-blog/how-to-migrate-your-existing-google-dialogflow-assistant-to-rasa-412cd07f424a

Hi, thanks for the reply. I appreciate your help. Yes, I actually followed this tutorial, but in our case it is not so easy. After 3 weeks of non-stop coding I almost gave up. There are many issues, partly because our backend uses Dialogflow in a non-standard way. I have many problems resolving entities into real values to produce valid training phrases, because we use template mode in Dialogflow and entities can be composite, meaning they can be built from other entities. There are also Dialogflow default entities, such as @sys.any:something, which cannot be mapped automatically. I rely heavily on recursion to resolve the entities. Another part is Rasa Core. Generating stories from our “conversations” and their related intents does not seem to be possible: we have too many “empty” intents without any “user says” text in them. They are mostly used to push the conversation into a certain state. It is hard to explain without showing the code.
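As a rough illustration of the recursive entity resolution described above, here is a minimal sketch. It assumes a simplified in-memory layout (a dict mapping each entity name to a list of entries, where an entry may reference other entities as `@name:alias`), which is not the actual Dialogflow export format. The function names and the `UnresolvableEntity` exception are hypothetical. System entities such as `@sys.any` have no value list, so the sketch raises instead of guessing a value.

```python
import re

# Matches references such as @color:color or @sys.any:something.
REF = re.compile(r"@(\w+(?:\.\w+)*)(?::\w+)?")


class UnresolvableEntity(Exception):
    """Raised for entities with no enumerable values (e.g. @sys.any)."""


def expand_entity(name, entities, stack=()):
    """Return every concrete value of an entity, resolving composites recursively."""
    if name.startswith("sys."):
        raise UnresolvableEntity(name)
    if name in stack:
        raise ValueError(f"cyclic entity reference: {' -> '.join(stack + (name,))}")
    results = []
    for entry in entities[name]:  # KeyError here means an undefined entity
        results.extend(expand_text(entry, entities, stack + (name,)))
    return results


def expand_text(text, entities, stack=()):
    """Expand the first reference in `text`, then recurse on the remainder."""
    match = REF.search(text)
    if match is None:
        return [text]
    head, tail = text[:match.start()], text[match.end():]
    values = expand_entity(match.group(1), entities, stack)
    rests = expand_text(tail, entities, stack)
    return [head + value + rest for value in values for rest in rests]


# Hypothetical entity definitions; "product" is a composite entity.
entities = {
    "color": ["red", "blue"],
    "item": ["shirt", "hat"],
    "product": ["@color:color @item:item", "gift card"],
}
products = expand_entity("product", entities)
print(products)
```

Note that this expands every combination, so for entities with many values it would be worth combining with some cap or sampling step. Entries that reference `@sys.any` need a manual decision, for example hand-written sample values.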