Machine Learning Approaches for Enhancing Query Optimization in Large Databases
The growing volume and complexity of data in contemporary applications demand more effective query optimization strategies for large-scale databases. Traditional techniques, such as rule-based and cost-based optimization, often struggle with dynamic and complex workloads, which leads to performance inefficiencies. Machine learning (ML) has emerged as a promising way to improve query optimization, using deep learning, reinforcement learning, and predictive analytics to enhance query execution plans, indexing, and workload management. ML-driven optimization offers workload-aware indexing, adaptive tuning, and real-time performance improvements, which makes it especially well suited to distributed and cloud-based database deployments. However, challenges remain: AI-powered optimizers need to be more explainable, security vulnerabilities must be addressed, and training ML models carries high computational costs. To ensure reliable and efficient database management, future research should focus on hybrid optimization frameworks, stronger security measures, and more explainable ML-based decision-making. Addressing these challenges could open the door to smarter, more flexible, and more scalable database systems.