Black-Box Adversarial Attacks Against SQL Injection Detection Model
DOI:
https://doi.org/10.37256/cm.5420245292Keywords:
SQL injection attacks, generative adversarial network, conditional tabular generative adversarial network, adversarial attacks, conventional neural networkAbstract
Structured Query Language (SQL) injection attacks represent a substantial threat to the security of web applications, making the development of effective detection techniques crucial. These techniques have evolved from traditional signature-based techniques to more advanced techniques based on machine learning models. Machine learning detection models are often vulnerable to adversarial examples. Adversarial examples are deliberately crafted inputs designed to deceive models into making incorrect predictions by subtly altering the original dataset in ways that are typically imperceptible to humans. To train and test these machine learning models, datasets comprising both malicious and normal data are indispensable. However, the lack of sufficient and balanced datasets presents a significant challenge, particularly for models intended to detect SQL injection attacks. Most network traffic datasets exhibit a substantial class imbalance, with a disproportionate amount of normal traffic compared to malicious traffic, making it difficult to train effective and reliable detection models. This study addressed the shortcomings of current SQL injection detection techniques and proposed a conditional tabular generative adversarial network adversarial attack method. We evaluated the effectiveness of the generated adversarial SQL injection examples using qualitative and quantitative methods, measuring their ability to evade the detection model. A conventional neural network algorithm detection model was built and tested, and the generated adversarial examples successfully bypassed the detection model at a rate of up to 6%. The evidence demonstrated that the conditional tabular generative adversarial network successfully captures the statistical properties of real data and generates synthetic data that accurately represents the real data. This method is also expected to address the problem of insufficient and imbalanced SQL injection datasets, which could aid in training various machine learning models beyond the one used in our study.
Downloads
Published
How to Cite
Issue
Section
License
Copyright (c) 2024 Maha Alqhtani, et al.
This work is licensed under a Creative Commons Attribution 4.0 International License.