Article

Quantifying Commercial Disparagement by Analyzing Algorithmic Bias in the spambase Dataset with a Random Forest  

Authors
  • Arie Setya Putra
  • Admi Syarif
  • Mahfut Mahfut
  • Sri Ratna Sulistiyanti
  • Muhammad Said Hasibuan

Abstract

Automated decision-making systems, such as spam filters, are ubiquitous but increasingly scrutinized for algorithmic bias. While most scholarship focuses on social discrimination, this research investigates a novel legal claim: algorithmic commercial disparagement. We posit that a machine learning filter trained on a single company's "personalized" data can systematically and unfairly penalize its competitors, creating a data-driven basis for a tortious interference claim. This study provides an empirical model for this legal thesis using the spambase dataset. A Random Forest classifier was trained and achieved a high baseline accuracy of 94.57%, a "veneer of neutrality" that would justify its commercial deployment. However, a feature importance analysis revealed that the model's logic was biased: it learned to associate corporate-specific keywords (e.g., hp, hpl, george) with non-spam emails. To quantify the harm, we simulated "internal" (Set A) and "competitor" (Set B) communications from the legitimate test data. The results demonstrate a significant disparate impact: the False Positive Rate (FPR) for internal emails was 1.31%, while the FPR for competitor emails was 5.53%. The filter is therefore about 4.2 times as likely to wrongfully block a competitor's legitimate communication as an internal one. This study concludes that this foreseeable, quantifiable harm, resulting from the negligent deployment of a biased model, provides an empirical foundation for claims of algorithmic commercial disparagement.
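The disparity figure follows directly from the two reported false positive rates. The sketch below is illustrative only and is not the authors' code: the function names are hypothetical, and only the two FPR values (1.31% and 5.53%) come from the abstract. It assumes the usual definition of FPR as the share of legitimate emails wrongly flagged as spam.

```python
def false_positive_rate(false_positives: int, true_negatives: int) -> float:
    """FPR = FP / (FP + TN), computed over legitimate (non-spam) emails only."""
    total_legitimate = false_positives + true_negatives
    if total_legitimate == 0:
        raise ValueError("group contains no legitimate emails")
    return false_positives / total_legitimate


def disparity_ratio(fpr_competitor: float, fpr_internal: float) -> float:
    """How many times as likely a competitor email is to be wrongly blocked."""
    if fpr_internal == 0:
        raise ValueError("internal FPR is zero; ratio is undefined")
    return fpr_competitor / fpr_internal


# Rates as reported in the abstract (Set A = internal, Set B = competitor).
FPR_INTERNAL = 0.0131
FPR_COMPETITOR = 0.0553

ratio = disparity_ratio(FPR_COMPETITOR, FPR_INTERNAL)
print(f"Disparity ratio: {ratio:.1f}")  # prints "Disparity ratio: 4.2"
```

The `false_positive_rate` helper is included only to show how each per-group rate would be obtained from confusion-matrix counts; the abstract does not report those counts, so none are assumed here.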

Keywords: Algorithmic Bias, Commercial Disparagement, Machine Learning, Spam Filtering, Disparate Impact

How to Cite:

Putra, A. S., Syarif, A., Mahfut, M., Sulistiyanti, S. R. & Hasibuan, M. S. (2025) “Quantifying Commercial Disparagement by Analyzing Algorithmic Bias in the spambase Dataset with a Random Forest”, Journal of Cyber Law 1(4), 330-343. doi: https://doi.org/10.63913/jcl.v1i4.71

Published on
2025-12-14

Peer Reviewed