SQL Generation from Natural Language: A Sequence-to-Sequence Model Powered by the Transformers Architecture and Association Rules
- 1 Mohammed First University Oujda, Morocco
- 2 NovyLab Research, France
Using Natural Language (NL) to interact with relational databases allows users from any background to easily query and analyze large amounts of data. This requires a system that understands users' questions and automatically converts them into a structured query language such as SQL. The best-performing Text-to-SQL systems use supervised learning (usually formulated as a classification problem), either approaching the task as a sketch-based slot-filling problem or first converting questions into an Intermediate Logical Form (ILF) and then translating it into the corresponding SQL query. However, unsupervised modeling that directly converts questions into SQL queries has proven more difficult. In this sense, we propose an approach that directly translates NL questions into SQL statements. In this study, we present a Sequence-to-Sequence (Seq2Seq) parsing model for the NL-to-SQL task, powered by the Transformer architecture and exploring two Language Models (LMs): the Text-To-Text Transfer Transformer (T5) and the Multilingual pre-trained Text-To-Text Transformer (mT5). In addition, we adopt a transformation-based learning algorithm to update the aggregation predictions based on association rules. The resulting model achieves a new state of the art on the WikiSQL dataset for weakly supervised SQL generation.
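As a rough illustration of the rule-based correction step, the sketch below applies keyword-triggered association rules to override a model's aggregation prediction, in the spirit of transformation-based learning. The specific rules, function names, and trigger phrases here are illustrative assumptions, not the rules mined in the paper.

```python
# Hypothetical sketch: post-correcting a Seq2Seq model's aggregation
# prediction with simple association rules keyed on question phrases.
# Rules and names are illustrative, not the paper's actual rule set.

# Each rule: (trigger phrase in the question) -> (aggregation to assign)
ASSOCIATION_RULES = [
    ("how many", "COUNT"),
    ("number of", "COUNT"),
    ("total", "SUM"),
    ("average", "AVG"),
    ("highest", "MAX"),
    ("lowest", "MIN"),
]

def correct_aggregation(question: str, predicted_agg: str) -> str:
    """Return the aggregation of the first matching rule,
    or keep the model's prediction if no rule fires."""
    q = question.lower()
    for trigger, agg in ASSOCIATION_RULES:
        if trigger in q:
            return agg
    return predicted_agg  # no rule fired: trust the model

print(correct_aggregation("How many players are from Morocco?", "SUM"))  # -> COUNT
print(correct_aggregation("Show the name of each player", "NONE"))       # -> NONE
```

In a full pipeline, such rules would be mined from training-set co-occurrences between question n-grams and gold aggregation operators, then applied as deterministic transformations on top of the Seq2Seq output.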
Copyright: © 2021 Youssef Mellah, Abdelkader Rhouati, El Hassane Ettifouri, Toumi Bouchentouf and Mohammed Ghaouth Belkasmi. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
- Transformers Architecture
- Multilingual Pre-Trained Text-To-Text Transformer