Financial Loss Estimation in Cybersecurity Incidents: A Data Mining Approach Using Decision Tree and Linear Regression Models

Authors

  • Ika Maulita
  • B Herawan Hayadi, Bina Bangsa University

DOI:

https://doi.org/10.63913/jcl.v1i2.9

Keywords:

Cybersecurity, Financial Loss Prediction, Data Mining, Random Forest, Risk Management

Abstract

This study applies data mining techniques to predict financial losses resulting from cybersecurity incidents. Using a dataset of 3,000 reported cyberattacks from 2015 to 2024, the research analyzes both numerical and categorical factors, including the number of affected users, incident resolution time, attack type, vulnerability exploited, and defense mechanisms employed. After exploratory data analysis and preprocessing, the data are modeled with Linear Regression, Decision Tree, and Random Forest regressors. Among these, Random Forest provides feature importance estimates, which indicate that the number of affected users, resolution time, and specific attack characteristics are the most influential predictors of financial loss. Model evaluation shows that the Linear Regression and Random Forest models achieve comparable predictive accuracy, with mean absolute errors of about 24.7 million dollars and R-squared values close to zero. These values indicate that the models explain very little of the variance in financial loss, a limitation attributed to the complexity of cyber incidents. Decision Tree regression underperforms, likely due to overfitting. Visualizations comparing predicted and actual losses support these findings and highlight the need for better handling of extreme loss values. The results underscore the multifaceted nature of cybersecurity risk, where both quantitative impacts and qualitative attack attributes must be considered. This research has practical implications for cybersecurity risk management and policy formulation. By identifying key drivers of financial loss, organizations can prioritize mitigation efforts on the most damaging attack types and vulnerabilities. The study also emphasizes the importance of rapid incident response to minimize financial damage.
For policymakers, the findings provide data-driven evidence to guide the development of more effective cybersecurity regulations and compliance standards. Future work should extend this analysis by incorporating additional data sources and advanced machine learning techniques to enhance prediction accuracy and support proactive defense strategies. Overall, this study contributes to bridging the gap between cybersecurity data analysis and practical financial risk reduction.
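
As a reading aid for the two evaluation metrics named in the abstract, the following minimal sketch computes mean absolute error and R-squared in plain Python. It is not the authors' code: the function names and the five loss values are hypothetical and chosen only for illustration, and the standard definition R-squared = 1 - SS_res / SS_tot is assumed. The sketch shows why an R-squared close to zero means a model does little better than always predicting the average loss.

```python
from statistics import mean


def mean_absolute_error(actual, predicted):
    """Average absolute gap between actual and predicted values."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def r_squared(actual, predicted):
    """1 - SS_res / SS_tot: variance explained relative to a mean-only baseline."""
    baseline = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - baseline) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up losses in millions of dollars, NOT taken from the study's dataset.
toy_losses = [10.0, 30.0, 50.0, 70.0, 90.0]

# A "model" that ignores every feature and always predicts the mean loss.
mean_only = [mean(toy_losses)] * len(toy_losses)

baseline_mae = mean_absolute_error(toy_losses, mean_only)
baseline_r2 = r_squared(toy_losses, mean_only)
print(f"MAE = {baseline_mae:.1f}M, R^2 = {baseline_r2:.2f}")
```

On these toy values the mean-only predictor scores an R-squared of exactly 0, while a perfect predictor would score 1. A fitted model whose R-squared stays near zero is therefore adding little beyond that baseline, whatever its mean absolute error looks like in isolation.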

Published

2025-06-03

How to Cite

Maulita, I., & Hayadi, B. H. (2025). Financial Loss Estimation in Cybersecurity Incidents: A Data Mining Approach Using Decision Tree and Linear Regression Models. Journal of Cyber Law, 1(2), 161–174. https://doi.org/10.63913/jcl.v1i2.9