Natural language to SQL

Our Rasa chatbot should have some intents designated to querying and aggregating data from a database (for every table field and operation!).

Rasa doesn’t seem to be the right tool for that since it doesn’t has any ML based SQL generation out of the box. Writing a SQL generation layer manually is very tedious and error prone. We would like to avoid that.

There are many papers on this topic under keywords like seq2SQL, nl2SQL, Text-to-SQL (it seems to be a hot topic currently, last publication is just 5 days old).

We would like to use Rasa as a middleman and only a seq2SQL model in the backend (after Rasa). Rasa should keep the context and the seq2SQL model should provide the data for a given request (based on context).

We are not sure how to go about this. Did anybody in this community or at Rasa worked on a similar problem and likes to share?

What we can share on this so far is a list of papers in the field for which GitHub implementations exist:

I hope someone has some insight.

2 Likes

Hi Spen,

This is a super interesting and ambitious project. In fact, we were working on something similar when we decided to start building Rasa!

Writing a SQL generation layer manually is very tedious and error prone

Yes, but training a neural net to do the same is also difficult :slight_smile: remember that you will have to create a model for the specific schema you are working with. So I would strongly doubt you can take something off the shelf and work with it. Do you have data you can train on?

My advice would be to start simple. Define entities for the things you need (essentially the SQL verbs). Your training data would look something like:

[How many](aggregation) [purchases](item) were made [yesterday](filter:date)?
Show me the [total](aggregation) number of [customers](item) by [country](groupby)

You can then combine those entities to write a SQL query for your DB. Once you have something (basic) that’s working, you can give it to people to test and generate more training data. I suspect you’ll need quite a lot in order to train one of the neural models :slight_smile:

4 Likes

Hi Alan,

thank you for your reply.

The way you recommended is how we are currently going about this. Until now this works great with our self-created training set. The way of starting with a small set and basic functionality to later expand on this with user data is also how we planned it.

The problem that we see with that is that when requests become more complex (for example with multi table conditions … and field selections) building valid SQL queries from these entities can get very complicated. We expect a ton of special cases that all need to be programmed in. Maybe plugging in an additional model is overkill but I am currently not aware of any python library that could bridge the way between entities and valid SQL queries.

How did your company solve this part? Did Rasa suffice on all your requirements in regards to this use case?

1 Like

Hi Sebastian.

I’m afraid I don’t have a simple answer to this :slight_smile: - it’s a hard problem to solve in the general case and definitely an opportunity for you to build something defensible!

Just to share our experience, we found that people never expressed very complicated queries in a single utterance. “Please show me all mexican restaurants within 3 miles with wifi but not in the tenderloin” is something you’re unlikely to see. It’s much more common to get something ambiguous (e.g. “I’m hungry”) and then having a conversation to figure out the details. Depends on your target users of course but that was our experience.

1 Like

Hi guys! I’m working on a similar project and thought RASA will be better for me. I have seen a paid service kueri.me, Which offers to give a natural language interface to your database. I want to start with something basic then maybe go to harder queries. Did you guys find any new advancements in the field?

Hi @Akshit,

we now mainly use Rasa with training data that annotates and abstracts general database concepts like tables, field, … A general select query with table and field as well as aggregations, number of requested entries, … works very well. Defining filters with free text are more hard. Currently we use Rasa Forms for that.

But what I can share is that we switched to ponyORM to talk to the SQL database. Since it works in a pythonic way and we don’t need to define SQL queries from intents and entities anymore it makes everything much more easy (not perfect).

So instead of maping it to a perfect query in SQL form you are doing it in python only! Are you solving for multiple tables as well?

From a chatbot user standpoint - no, an user can only request data for one subject (that is how we handle tables). But these tables can also be views. Yes, that heavily limits the capabilities but as @amn41 stated, most queries are not that complex anyway.

1 Like

I am new to Rasa and currently working on chatbot that can fetch information from MYSQL database. Can you guide me how can i do it with Rasa?

hey @siddharthchauhan, check this out it will help you:

https://github.com/JiteshGaikwad/MoodBot

1 Like

Hi @JiteshGaikwad, Thanks for the reply but i wasn’t looking for this. I need a code which can convert English to SQL and retrieve me data from database.

hi @siddharthchauhan - translating English to SQL is a very difficult problem. If you look at my comment above, you can get an idea of how you can build a limited version of this for a particular database schema.

Hi @amn41,

Can you explain me in detail about your previous post, I am kind a lost and I don’t know what to do. And also i am kind a new to Rasa.

Thanks in advance.

Hi @siddharthchauhan,

I think this masterclass episode 8 explains everything well. It covers an actual example of how you collect information from users to do a query.

In the masterclass, an REST API is called, but that part can be swapped out by an SQL query.

@amn41 - What’s your thoughts on Converting Natural Language to SQL query using seq2sql or SQLNet

Reference:

1 Like

hi @malarsarav - these are good papers! but as I mentioned above, this is a research area. So if you are willing to put in the work, you can build something that others don’t have!

How you are using ponyORM to query to database. Can you please explain in detail that will help me a lot.

similar requirement. currently, we could query a single table using nl/voice. i am planing to transfer our app to Rasa in order not to using too many rules.