Powered SQL education: Automating SQL/PLSQL question classification with LLMs and machine learning

Naif Alzriqat; Mohammad Al-Oudat

doi:10.53894/ijirss.v8i2.5467

Engineering


Issue
Vol. 8 No. 2 (2025)


Abstract

Mastering Structured Query Language/Procedural Language (SQL/PLSQL) is considered challenging for both students and industry professionals. This difficulty reflects a significant gap between academic preparation and industrial demands, and it leads both groups to seek solutions on Stack Overflow (SO). This research presents a novel automated framework that classifies SQL/PLSQL questions and sheds light on learning challenges. A new dataset of 10,266 questions was collected from SO posts and labeled with the LLM GPT-4o-mini API into five categories: Data Definition Language (DDL), Data Manipulation Language (DML), Data Query Language (DQL), Data Control Language (DCL), and Transaction Control Language (TCL). After preprocessing, Machine Learning (ML) techniques such as Random Forest and XGBoost were applied. The results show that DQL and DML are the most challenging areas, while DDL and DCL appear less often. Random Forest and XGBoost produced the highest classification accuracy, at 85.57% and 85.13%, respectively. The research concludes that AI-driven analysis identifies the real challenges learners face and suggests that academic curricula strengthen hands-on problem-solving to meet industry needs, which helps bridge the gap between academic and industrial requirements.
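To make the five-category taxonomy concrete, the following is a minimal sketch of a keyword baseline that assigns a question to DDL, DML, DQL, DCL, or TCL. It is not the authors' method: the study labels questions with the GPT-4o-mini API and then trains Random Forest and XGBoost models. The keyword table reflects the conventional grouping of SQL statements, and the names `CATEGORY_KEYWORDS` and `classify_question` are hypothetical, introduced here only for illustration.

```python
import re
from collections import Counter

# Conventional grouping of SQL statements into sublanguages.
# Assumption: this table is illustrative and is not taken from the paper.
CATEGORY_KEYWORDS = {
    "DDL": ("CREATE", "ALTER", "DROP", "TRUNCATE"),
    "DML": ("INSERT", "UPDATE", "DELETE", "MERGE"),
    "DQL": ("SELECT",),
    "DCL": ("GRANT", "REVOKE"),
    "TCL": ("COMMIT", "ROLLBACK", "SAVEPOINT"),
}

# Reverse lookup: keyword -> category.
KEYWORD_TO_CATEGORY = {
    keyword: category
    for category, keywords in CATEGORY_KEYWORDS.items()
    for keyword in keywords
}


def classify_question(text):
    """Return the category whose keywords occur most often in text.

    Returns None when no keyword is found. On a tie, the category
    whose keyword appears first in the text wins.
    """
    words = re.findall(r"[A-Za-z]+", text.upper())
    counts = Counter(
        KEYWORD_TO_CATEGORY[word] for word in words if word in KEYWORD_TO_CATEGORY
    )
    if not counts:
        return None
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    for question in (
        "How do I SELECT rows with a JOIN?",
        "ALTER TABLE fails when I drop a column",
        "GRANT privileges to a user",
    ):
        print(classify_question(question), "-", question)
```

A heuristic like this misfires whenever a keyword appears as an ordinary English word ("update", "create", "drop"), which is one reason a learned classifier over the full question text is a more plausible fit for Stack Overflow posts.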

Authors

Naif Alzriqat

Department of Software Engineering, Faculty of Information Technology, Philadelphia University, Amman, Jordan.

202220940@philadelphia.edu.jo (Primary Contact)

Mohammad Al-Oudat

Department of Computer Science, Faculty of Information Technology, Philadelphia University, Amman, Jordan.

https://orcid.org/0000-0002-0553-7961

Alzriqat, N., & Al-Oudat, M. (2025). Powered SQL education: Automating SQL/PLSQL question classification with LLMs and machine learning. International Journal of Innovative Research and Scientific Studies, 8(2), 1395–1407. https://doi.org/10.53894/ijirss.v8i2.5467


This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

