Background
Structured Query Language (SQL) is a special-purpose programming language used to query, update, and manage data in relational database systems.
CRUD (sometimes written CURD) stands for the Create, Retrieve, Update, and Delete operations. A lot of application business code is little more than CRUD. With the development of artificial intelligence, machine-generated code has become possible. Some time ago, GitHub and OpenAI teamed up to give programmers a programming tool called GitHub Copilot. Once a programmer writes a comment, GitHub Copilot can complete the rest of the code and suggest improvements.
Today we look at another possibility: we type a question in Chinese, and the computer retrieves the statistics we want directly from a database. In this way, no code needs to be generated by the machine at all, and the computer can directly help end users with common data analysis needs. Let’s look at the final result.
System construction
We first describe how the system is built and then introduce the principle behind it. The construction process is very simple, so if you have time you can try the experiment yourself. It uses the Baidu Sugar visualization service and the cloud database MemfireDB.
- First, log in to MemfireDB and upload the data we have prepared. Here we prepare three tables: StockBars, the original stock information; StockRSVS, the calculated RSV indicator; and StockKDJ, the calculated KDJ indicator. For details on how to upload the data and calculate the indicators, see another article: https://juejin.cn/post/697920…
- Next, go to Baidu Sugar and configure the data source and the data model. The database IP used in the data source configuration is obtained from the MemfireDB console. Here we need to inner-join the tables StockRSVS and StockKDJS, using stock_id and date as the join fields.
The joined data can be previewed by clicking “View Data”.
- Turn on the “intelligent question and answer” option of the data model. The data model then enters a “training” state. Once training is complete, questions are matched to data fields by default using the “physical field name” and “display name” in the data model. If you need more expressions to match a data field, you can configure synonyms for each field.
- Finally, create an “AI exploration page” and tick the data model that has the “intelligent question and answer” option turned on. This completes the effect shown in the video above. Isn’t it simple?
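The inner join configured in the data model step can also be written as plain SQL. The following is a minimal sketch that uses Python's built-in sqlite3 as a stand-in for MemfireDB. Only the table names StockRSVS and StockKDJS and the join fields stock_id and date come from the steps above; the indicator columns (rsv, k, d, j) and the sample rows are assumptions for illustration.

```python
import sqlite3

# In-memory SQLite stands in for the cloud database (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE StockRSVS (stock_id TEXT, date TEXT, rsv REAL);
    CREATE TABLE StockKDJS (stock_id TEXT, date TEXT, k REAL, d REAL, j REAL);
""")

# Hypothetical sample rows; real values would come from the uploaded data.
conn.executemany(
    "INSERT INTO StockRSVS VALUES (?, ?, ?)",
    [("000001", "2021-06-01", 55.0), ("000001", "2021-06-02", 60.0)],
)
conn.executemany(
    "INSERT INTO StockKDJS VALUES (?, ?, ?, ?, ?)",
    [("000001", "2021-06-01", 50.0, 48.0, 54.0)],
)

# Inner join on stock_id and date: only rows present in both tables survive.
joined = conn.execute("""
    SELECT rsv.stock_id, rsv.date, rsv.rsv, kdj.k, kdj.d, kdj.j
    FROM StockRSVS AS rsv
    INNER JOIN StockKDJS AS kdj
        ON rsv.stock_id = kdj.stock_id AND rsv.date = kdj.date
    ORDER BY rsv.date
""").fetchall()

print(joined)
```

In this sketch the 2021-06-02 row has no KDJ counterpart, so the inner join drops it. The “View Data” preview is a convenient place to check for that kind of loss.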
This system combines database technology with NL2SQL, an artificial intelligence technology. The database system provides data access and SQL statement parsing, and Sugar provides the conversion from natural language to SQL.
The technical principle
NL2SQL is a research direction in natural language processing. It automatically converts human natural language into the corresponding SQL (Structured Query Language) statement, runs that statement against the database, and returns the result. For example, we ask: how many Volkswagen models are priced between 100,000 and 200,000? NL2SQL allows the machine to understand such natural language and retrieve the answer from the table.
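To make the Volkswagen example concrete, the sketch below pairs the question with one SQL statement that an NL2SQL system might produce, then runs it. The table car_models, its columns brand, model, and price, the sample rows, and the reading of “between 100,000 and 200,000” as a price range are all assumptions for illustration. The SQL is written by hand here; it is not the output of Sugar or of any model.

```python
import sqlite3

question = "How many Volkswagen models are between 100,000 and 200,000?"

# Hand-written SQL standing in for what an NL2SQL model might output.
generated_sql = """
    SELECT COUNT(*)
    FROM car_models
    WHERE brand = 'Volkswagen' AND price BETWEEN 100000 AND 200000
"""

# Hypothetical table and rows, held in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE car_models (brand TEXT, model TEXT, price INTEGER)")
conn.executemany(
    "INSERT INTO car_models VALUES (?, ?, ?)",
    [
        ("Volkswagen", "Model A", 90000),
        ("Volkswagen", "Model B", 150000),
        ("Volkswagen", "Model C", 180000),
        ("Other", "Model D", 150000),
    ],
)

# The database only executes SQL; turning the question into SQL is the NL2SQL step.
answer = conn.execute(generated_sql).fetchone()[0]
print(question, "->", answer)
```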
NL2SQL gives non-professionals the freedom to query a rich variety of databases without having to learn and master a database programming language: they just ask. Questions are no longer bound by fixed rules, so more content and information can be reached. Previously, programmers had to write a “template” for each kind of lookup. Implementations of NL2SQL use a large number of cutting-edge artificial intelligence models. Pre-trained language models act as the AI's brain, so that the AI can read the user’s language. Graph neural networks let the AI “see” the database, taking in ten rows at a glance and distinguishing the contents of each table more clearly.
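For contrast, the older “template” approach can be sketched as a hand-written pattern that maps one phrasing to one parameterized query. The pattern, the table name car_models, and its columns are hypothetical. The point is only that any question outside the template goes unanswered, which is the limitation NL2SQL models aim to remove.

```python
import re

# One hand-written template: a fixed phrasing mapped to a parameterized query.
TEMPLATE = re.compile(
    r"how many (?P<brand>\w+) models are between (?P<low>\d+) and (?P<high>\d+)\??",
    re.IGNORECASE,
)
SQL = "SELECT COUNT(*) FROM car_models WHERE brand = ? AND price BETWEEN ? AND ?"


def template_to_sql(question):
    """Return (sql, params) if the question fits the template, else None."""
    match = TEMPLATE.fullmatch(question.strip())
    if match is None:
        return None
    return SQL, (match["brand"], int(match["low"]), int(match["high"]))


# Fits the template, so a parameterized query comes back.
matched = template_to_sql("How many Volkswagen models are between 100000 and 200000?")
# Same intent, different wording: the template cannot handle it.
unmatched = template_to_sql("Which Volkswagen models cost under 200000?")

print(matched)
print(unmatched)
```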
Conclusion
Finally, only a year or two ago NL2SQL was still at the academic research stage, with an accuracy of only 60%. Today we see that Baidu has launched a commercial system. Artificial intelligence is developing very fast: it is not only replacing simple, repetitive work, it can now even write code. Friends, are you trembling?