Predicting Fraud Cases in E-Commerce Transactions Using Random Forest Regression: A Data Mining Approach for Enhancing Cybersecurity and Transaction Integrity
DOI:
https://doi.org/10.63913/jcl.v1i2.6
Keywords:
Fraud Detection, E-Commerce Security, Random Forest Regression, Machine Learning, Cyber Law
Abstract
Fraudulent activities in e-commerce pose significant risks to businesses and consumers alike, resulting in financial losses and eroding trust in online transactions. This study aims to address this issue by developing a predictive model for fraud cases using Random Forest Regression, a robust machine learning technique known for handling nonlinear relationships and high-dimensional data. The dataset comprises daily transaction metrics such as fraud cases, transaction errors per million, transparency rating, security incidents, cyber attacks, audit compliance scores, transaction speeds, and customer trust indices, collected over multiple years. The methodology involves extensive data preprocessing, including temporal feature extraction from date information, and exploratory data analysis to identify key relationships among features. Correlation analysis revealed that transaction errors per million and security incidents are highly correlated with fraud cases, serving as important predictors. The dataset was split into training and testing sets, with the Random Forest model trained on 80% of the data and evaluated on the remaining 20%. Results indicate that the Random Forest model predicts fraud cases with high accuracy, achieving an R-squared score of 0.9832 and low error metrics (MAE of 21.07 and RMSE of 26.26). Feature importance analysis identified transaction errors per million as the most influential variable, confirming its critical role in fraud detection. Despite these promising results, limitations such as potential data imbalance and model interpretability challenges remain and warrant further research. This research contributes to the growing body of knowledge applying machine learning to cybersecurity and fraud detection, demonstrating practical applicability for improving e-commerce transaction security. 
The findings also have implications for cyberlaw, suggesting that advanced predictive tools can enhance regulatory enforcement and help develop more secure online commerce environments. Future work will explore incorporating additional features and alternative algorithms to further improve model robustness and transparency.
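
The preprocessing and evaluation steps described in the abstract (temporal feature extraction from dates, an 80/20 train/test split, and the MAE, RMSE, and R-squared metrics) can be illustrated with a minimal sketch. This is not the authors' code: the column names (`errors_per_million`, `fraud_cases`), the chosen calendar features, the random shuffle before splitting, and the synthetic data are all assumptions made for illustration. The Random Forest itself is not implemented here; a mean-value baseline stands in for the model so that the example needs only the Python standard library.

```python
import math
import random
from datetime import date, timedelta


def temporal_features(d):
    """Derive simple calendar features from a date (assumed feature set)."""
    return {
        "year": d.year,
        "month": d.month,
        "day": d.day,
        "day_of_week": d.weekday(),  # Monday = 0, Sunday = 6
    }


def split_80_20(rows, seed=0):
    """Shuffle a copy of the rows and return (train, test) at an 80/20 ratio."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * 0.8)
    return shuffled[:cut], shuffled[cut:]


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, and R-squared for paired true/predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return {"mae": mae, "rmse": rmse, "r2": r2}


# Synthetic daily records; every value here is made up for illustration.
rng = random.Random(42)
start = date(2020, 1, 1)
rows = []
for i in range(100):
    row = temporal_features(start + timedelta(days=i))
    row["errors_per_million"] = rng.uniform(10, 100)
    row["fraud_cases"] = 5.0 * row["errors_per_million"] + rng.gauss(0, 10)
    rows.append(row)

train, test = split_80_20(rows)

# Stand-in predictor: the mean of the training target. Predictions from a
# fitted Random Forest regressor would replace y_pred in a real pipeline.
baseline = sum(r["fraud_cases"] for r in train) / len(train)
y_true = [r["fraud_cases"] for r in test]
y_pred = [baseline] * len(test)
metrics = regression_metrics(y_true, y_pred)
print(len(train), len(test), metrics)
```

A mean baseline like this yields an R-squared near zero on held-out data, which gives a reference point for reading a score such as the 0.9832 reported above: R-squared measures the share of variance in fraud cases that the model explains beyond simply predicting the average.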