CatBoost-Based Network Intrusion Detection on Imbalanced CIC-IDS-2018 Dataset 


Vol. 46, No. 12, pp. 2191-2197, Dec. 2021
10.7840/kics.2021.46.12.2191


  Abstract

The increasing volume and speed of internet traffic create unprecedented opportunities for malicious attackers. This in turn creates challenges for network intrusion detection systems (NIDSs), whose job is to detect intrusive (i.e., malicious) network traffic. The majority of current solutions exploit flow records, which contain information about a flow (e.g., number of packets, average inter-arrival time). Because a flow record is tabular, most NIDS solutions use tree-based ML models such as Decision Tree and Random Forest. Recently, however, Gradient Boosting Machine methods such as CatBoost have shown superior performance over traditional tree-based solutions on tabular datasets, for example in Kaggle competitions. In this work, we explore the applicability of CatBoost to the network intrusion detection task. Further, we demonstrate the performance gain achieved by addressing data imbalance. Our experimental comparisons show that addressing data imbalance with a simple over-sampling technique can provide a significant performance boost: for CatBoost, accuracy improves from 88.84% to 92.41%. The results also suggest that the CatBoost classifier (92.41%) outperforms Decision Tree and Random Forest (88.34% and 89.88%, respectively) in terms of balanced accuracy on the CIC-IDS-2018 dataset.
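To make the two ideas in the abstract concrete, the following is a minimal sketch of over-sampling and balanced accuracy in plain Python. The abstract does not say which over-sampling technique was used, so random over-sampling with replacement is an assumption here. The function names (`random_oversample`, `balanced_accuracy`) and the toy flow records are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter, defaultdict


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until every class
    has as many rows as the largest class (an assumed technique)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Draw the missing rows with replacement from the same class.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally
    regardless of how many samples it has."""
    total = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return sum(correct[c] / total[c] for c in total) / len(total)


# Toy flow records: (packet count, mean inter-arrival time). Values are made up.
rows = [(10, 0.5), (12, 0.4), (11, 0.6), (9, 0.7), (300, 0.01)]
labels = ["benign", "benign", "benign", "benign", "attack"]

bal_rows, bal_labels = random_oversample(rows, labels)
print(Counter(bal_labels))

# A model that always predicts "benign" gets 4 of 5 samples right,
# yet its recall on the "attack" class is zero.
always_benign = ["benign"] * len(labels)
print(balanced_accuracy(labels, always_benign))
```

In a real pipeline, over-sampling would normally be applied to the training split only, so that evaluation still reflects the original class distribution. The resampled rows would then be passed to a classifier such as CatBoost, which this sketch leaves out.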



  Cite this article

[IEEE Style]

A. Jumabek, S. Yang, Y. Noh, "CatBoost-Based Network Intrusion Detection on Imbalanced CIC-IDS-2018 Dataset," The Journal of Korean Institute of Communications and Information Sciences, vol. 46, no. 12, pp. 2191-2197, 2021. DOI: 10.7840/kics.2021.46.12.2191.

[ACM Style]

Alikhanov Jumabek, SeungSam Yang, and YoungTae Noh. 2021. CatBoost-Based Network Intrusion Detection on Imbalanced CIC-IDS-2018 Dataset. The Journal of Korean Institute of Communications and Information Sciences, 46, 12, (2021), 2191-2197. DOI: 10.7840/kics.2021.46.12.2191.

[KICS Style]

Alikhanov Jumabek, SeungSam Yang, YoungTae Noh, "CatBoost-Based Network Intrusion Detection on Imbalanced CIC-IDS-2018 Dataset," The Journal of Korean Institute of Communications and Information Sciences, vol. 46, no. 12, pp. 2191-2197, 12. 2021. (https://doi.org/10.7840/kics.2021.46.12.2191)