Research Article
Bravely Say I Don’t Know: Relational Question-Schema Graph for Text-to-SQL Answerability Classification

Published: 25 March 2023

Abstract

The Text-to-SQL task has recently received much attention, and many sophisticated neural models have been proposed that achieve strong results. Most existing work assumes that every input is legal and that the model should generate an SQL query for any input. In real scenarios, however, users may enter arbitrary text that cannot be answered by any SQL query. In this article, we focus on answerability classification for Text-to-SQL systems, which aims to decide whether a question is answerable given the database schema. Existing methods concatenate the question and the database schema into a single sequence and then fine-tune a pre-trained language model on the classification task. Treating the schema as plain sequence text ignores its intrinsic structural relationships, and the attention that captures the correlation between question tokens and schema items is not well designed. To this end, we propose a relational Question-Schema graph framework that effectively models the attention and relations between the question and the schema. In addition, a conditional layer normalization mechanism modulates the pre-trained language model to produce better question representations. Experiments demonstrate that the proposed framework outperforms all existing models by large margins, achieving a new state of the art on the TRIAGESQL benchmark: 88.41% Precision, 78.24% Recall, and 75.98% F1, surpassing the baseline by approximately 4.05%, 6.96%, and 6.01% on those metrics, respectively.
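The conditional layer normalization mentioned above can be illustrated with a minimal NumPy sketch. This shows the general technique (the learned gain and bias of layer normalization are shifted by linear projections of a conditioning vector, here standing in for a schema summary), not the paper's exact formulation; all variable names and shapes are hypothetical.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Standard layer normalization: zero mean, unit variance over the last axis.
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def conditional_layer_norm(x, cond, W_gamma, W_beta, gamma, beta):
    # Gain and bias are shifted by linear projections of the condition
    # vector, so the normalization is modulated per example.
    g = gamma + cond @ W_gamma   # shape (batch, hidden)
    b = beta + cond @ W_beta     # shape (batch, hidden)
    return g * layer_norm(x) + b

# Toy shapes: batch of 2 token representations (hidden=8),
# each conditioned on a 4-dim summary vector.
rng = np.random.default_rng(0)
x = rng.normal(size=(2, 8))
cond = rng.normal(size=(2, 4))
W_gamma = 0.1 * rng.normal(size=(4, 8))
W_beta = 0.1 * rng.normal(size=(4, 8))
gamma, beta = np.ones(8), np.zeros(8)

out = conditional_layer_norm(x, cond, W_gamma, W_beta, gamma, beta)
```

With an all-zero condition vector this reduces to plain layer normalization, which is a useful sanity check when wiring such a module into a pre-trained encoder.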



Published in: ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 22, Issue 4 (April 2023), 682 pages. ISSN: 2375-4699. EISSN: 2375-4702. DOI: 10.1145/3588902.


Publisher: Association for Computing Machinery, New York, NY, United States

Publication History

• Received: 19 September 2021
• Revised: 17 October 2022
• Accepted: 19 December 2022
• Online AM: 4 January 2023
• Published: 25 March 2023
