How existing machine learning models for DDoS detection differ in performance and accuracy when applied to synthetic versus real-world network traffic datasets

Authors

  • Abdulqudos Alnahari, Faculty of Artificial Intelligence, University Technology
  • Noor Azurati Ahmad, Faculty of Artificial Intelligence, University Technology

DOI:

https://doi.org/10.11113/oiji2025.13n2.349

Keywords:

DDoS, Cloud, Security, Real-World Dataset, Synthetic Dataset

Abstract

Machine learning-based DDoS detection systems frequently report exceptionally high performance, often exceeding 98–99% accuracy. However, such results are predominantly derived from synthetic, laboratory-generated datasets that fail to capture the complexity, variability, and noise of real operational environments. This phenomenon is not unique to cybersecurity: similar patterns have been observed in applied health technologies such as remote blood pressure monitoring, where machine learning models trained on controlled clinical datasets often show inflated performance but struggle to generalize to real-world home monitoring conditions. This paper empirically demonstrates the same gap for DDoS detection. On two widely adopted, laboratory-created benchmark datasets, multiple machine learning models achieved accuracies close to 99%. When the same learning methods were applied to a real-world dataset constructed from 28 months of unsolicited network traffic, model accuracy declined to approximately 92%.

Published

2025-12-26

How to Cite

Alnahari, A., & Noor Azurati Ahmad. (2025). How existing machine learning models for DDoS detection differ in performance and accuracy when applied to synthetic versus real-world network traffic datasets. Open International Journal of Informatics, 13(2), 105–117. https://doi.org/10.11113/oiji2025.13n2.349