Enhancing Intrusion Detection System Performance Using Reinforcement Learning: A Fairness-Aware Comparative Study on NSL-KDD and CICIDS2017

Yudhi Arta; Suzani Mohamad Samuri; Nesi Syafitri

Authors

Yudhi Arta Department of Software Engineering and Smart Technology, Faculty of Computing and Meta-Technology, Universiti Pendidikan Sultan Idris, Malaysia
Suzani Mohamad Samuri Data Intelligent and Knowledge Management (DILIGENT), Universiti Pendidikan Sultan Idris, Malaysia
Nesi Syafitri Department of Software Engineering and Smart Technology, Faculty of Computing and Meta-Technology, Universiti Pendidikan Sultan Idris, Malaysia

Keywords:

Intrusion Detection System, Reinforcement Learning, Q-Learning, Deep Q-Networks, Proximal Policy Optimization, CICIDS2017, NSL-KDD, MCC

Abstract

Conventional Intrusion Detection Systems (IDS) often fail to generalize in dynamic network environments, where attack patterns evolve and class distributions are imbalanced. This study evaluates and compares three Reinforcement Learning (RL) algorithms for improving IDS adaptability and accuracy under these conditions. It uses a comparative experimental design that implements Q-Learning, Deep Q-Networks (DQN), and Proximal Policy Optimization (PPO). The algorithms were systematically evaluated on the NSL-KDD and CICIDS2017 benchmark datasets, which represent legacy and modern network traffic respectively. A fairness-aware evaluation framework was applied, prioritizing the Matthews Correlation Coefficient (MCC) as the primary metric alongside accuracy so that performance assessment remains robust to skewed class distributions. Experimental results demonstrate that PPO significantly outperforms the value-based algorithms Q-Learning and DQN. On the high-dimensional CICIDS2017 dataset, PPO achieved the highest detection accuracy (96.3%) and MCC (0.913). Confusion matrix analyses confirmed PPO’s capability to simultaneously minimize false positives and false negatives. Conversely, Q-Learning generalized poorly on complex data, while DQN performed better owing to deep value approximation but remained less stable than PPO. These findings imply that policy-gradient methods like PPO are superior for real-world IDS deployments where scalability, adaptability, and low error rates are critical. Theoretically, the results suggest that stochastic policy optimization handles complex, continuous state spaces more effectively than traditional value-estimation approaches. This study contributes a rigorous head-to-head comparative analysis of RL algorithms across multiple standard datasets using fairness-aware metrics. It bridges the research gap left by previous studies, which often evaluated algorithms in isolation or relied on accuracy metrics that can be misleading in imbalanced security contexts.
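
As a minimal illustration of why MCC is prioritized alongside accuracy on skewed data, the sketch below computes both metrics from binary confusion-matrix counts. The counts are hypothetical and chosen only for illustration; they are not results from this study, and the helper names `accuracy` and `mcc` are assumptions rather than the authors' code.

```python
import math


def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def mcc(tp: int, fp: int, fn: int, tn: int) -> float:
    """Matthews Correlation Coefficient for a binary confusion matrix.

    Returns 0.0 when the denominator is zero (a common convention),
    which happens when any row or column of the matrix is empty.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical traffic sample: 1000 flows, of which 50 are attacks.
# Detector A labels every flow as benign and misses all attacks.
detector_a = dict(tp=0, fp=0, fn=50, tn=950)
# Detector B catches most attacks at the cost of some false alarms.
detector_b = dict(tp=40, fp=20, fn=10, tn=930)

for name, counts in (("A", detector_a), ("B", detector_b)):
    print(f"detector {name}: accuracy={accuracy(**counts):.3f}, "
          f"MCC={mcc(**counts):.3f}")
```

In this hypothetical case the two detectors differ by only about two percentage points in accuracy, while MCC separates them clearly: the detector that never flags an attack scores zero.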
