Hi all,
I am trying to use the Rasa NLU pipeline to build a converter from natural-language text queries to SQL. Currently I am following the steps below:
- Create a pipeline in Rasa NLU with the ner_crf entity extractor and the TensorFlow intent classifier, and train it on our domain.
- The training data for the domain is currently generated with the help of Chatito.
- Queries are modelled by classifying them as simple, aggregate, or group-by queries according to their intent.
- Use the NLU parse API to get intent and entity predictions, and use these to generate the final query. The intent can be, say, a find query or a find-order-by query, and entity extraction should give us terms such as table, field, order param, etc.
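To make the last step concrete, here is a minimal sketch of how the parse result is turned into a query. The intent names (`find_query`, `find_order_by_query`), the entity names (`table`, `field`, `order_param`) and the `build_query` helper are illustrative placeholders rather than our exact setup, and the example dict is hand-written; it only mirrors the general shape of a parse response (an intent name plus a list of entity/value pairs).

```python
def build_query(parsed):
    """Build a SQL string from an NLU-parse-style dict (illustrative only)."""
    intent = parsed["intent"]["name"]

    # Collect entity values by entity name, in order of appearance.
    slots = {}
    for ent in parsed["entities"]:
        slots.setdefault(ent["entity"], []).append(ent["value"])

    table = slots["table"][0]
    fields = ", ".join(slots.get("field", ["*"]))
    sql = f"SELECT {fields} FROM {table}"

    # Only the order-by intent adds an ORDER BY clause.
    if intent == "find_order_by_query" and "order_param" in slots:
        sql += " ORDER BY " + ", ".join(slots["order_param"])
    return sql


# Hand-written example, not real model output.
example = {
    "intent": {"name": "find_order_by_query"},
    "entities": [
        {"entity": "table", "value": "apples"},
        {"entity": "field", "value": "name"},
        {"entity": "order_param", "value": "price"},
    ],
}
print(build_query(example))  # SELECT name FROM apples ORDER BY price
```

In practice the extracted table and field names would have to be checked against the actual schema before being placed in a query.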
So far this is working well, but I have some questions about how to train the model for the more complex situations below:
- What if the user adds multiple find columns or group-by columns, e.g. "show me all apples where taste is sweet and sour"? How can we use NLU to identify sweet and sour as where-clause values for the field taste, similar to how a grammar parser would?
- What about queries that have aggregates, e.g. "show me top 10 apples by price"? Here we need to identify a group-by clause and an aggregate. That is still possible through training, I guess, but it becomes complex if the query is "show me top 10 prices of apples having region as Europe".
- Can we do n-gram entity extraction? Say I have trained the model on an entity applename that is categorized as a field; will "apple name" also be categorized as a field?
There could be endless possibilities, plus I will need to train on all the entities and fields we have in our database universe.
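For the "sweet and sour" case, what we are hoping to get from the parse is something like the hand-written entity list below, where several values follow one field. The sketch groups each value under the most recent field and emits an IN clause. The entity names `field` and `value` and the helper `where_clause` are hypothetical, reading "sweet and sour" as IN (that is, OR) is an assumption, and whether the entity extractor can reliably produce such a list is exactly what we are unsure about.

```python
def where_clause(entities):
    """Group each 'value' entity under the most recent 'field' entity."""
    grouped = {}
    current = None
    for ent in entities:
        if ent["entity"] == "field":
            current = ent["value"]
            grouped.setdefault(current, [])
        elif ent["entity"] == "value" and current is not None:
            grouped[current].append(ent["value"])

    parts = []
    for field, values in grouped.items():
        if not values:
            continue  # a field with no values contributes no condition
        # Escape single quotes so the values are safe as SQL string literals.
        quoted = ", ".join("'" + v.replace("'", "''") + "'" for v in values)
        parts.append(f"{field} IN ({quoted})")
    return " AND ".join(parts)


# Hand-written entities for "show me all apples where taste is sweet and sour".
entities = [
    {"entity": "field", "value": "taste"},
    {"entity": "value", "value": "sweet"},
    {"entity": "value", "value": "sour"},
]
print(where_clause(entities))  # taste IN ('sweet', 'sour')
```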
Any suggestions or comments on our approach and how we can improve it would be greatly appreciated.
Thanks, Gaurav