Critical Review of Network Intrusion Detection Benchmark Datasets for Practical IoT Security
DOI: https://doi.org/10.37256/cnc.3220257228

Keywords: benchmark datasets, feature extraction, Internet of Things (IoT) security, Machine Learning (ML), Network Intrusion Detection Systems (NIDS), reproducibility

Abstract
The rapid expansion of Internet of Things (IoT) devices in modern environments introduces significant security vulnerabilities, increasing the risk of cyberattacks and potential physical threats to users. While Network Intrusion Detection Systems based on Machine Learning (ML-based NIDS) hold promise for mitigating these risks, their effectiveness depends heavily on the quality and representativeness of the datasets used for their development and evaluation. This study provides a critical review of publicly available benchmark datasets commonly used to train ML-based NIDS for IoT security, with a particular focus on two pivotal but often overlooked factors: testbed configurations and feature extraction methods. The review reveals critical inconsistencies in dataset design and feature sets, which undermine model generalisability and real-world applicability. To address these issues, the study proposes clear criteria for dataset assessment and practical recommendations for researchers and dataset developers. These results underscore the urgent need for standardisation to enhance reproducibility, enable fair comparison of intrusion detection models, and bridge the gap between academic research and practical IoT security solutions.
License
Copyright (c) 2025 Oluwasegun Apejoye, et al.

This work is licensed under a Creative Commons Attribution 4.0 International License.
