Detecting Threatening Content in Social Media: A Data Mining Approach Using Random Forest for Classification of Tweets in Cyberlaw Context
DOI: https://doi.org/10.63913/jcl.v1i2.7
Keywords: Threat Detection, Random Forest, Social Media Analysis, Cyberlaw, Machine Learning
Abstract
The rapid growth of social media platforms has increased the prevalence of threatening and harmful content, raising significant challenges for online safety and legal enforcement. This study explores the application of data mining techniques, specifically the Random Forest algorithm, to detect threatening tweets from numerical metadata features: user follower count, retweet and favorite counts, hashtag usage, mentions, and emoticon presence. Using a dataset of 1,000 tweets with balanced classes of threatening and non-threatening posts, the research implements a structured workflow of exploratory data analysis, preprocessing, model training, and evaluation. The Random Forest classifier achieved only modest performance, with an accuracy of approximately 50.5%, precision and recall near 51%, and an F1-score of 51.2%; for a balanced two-class dataset, these figures are close to the 50% expected from random guessing. Feature importance analysis indicated that user engagement metrics (particularly user followers, favorite count, and retweet count) were the most influential features in the model. The results highlight the limitations of omitting direct textual analysis and the inherent difficulty of predicting threats from metadata alone. This research contributes to the Cyberlaw domain by examining how machine learning can aid legal frameworks in automating the detection of online threats, potentially improving efficiency in monitoring social media for harmful content. However, the study emphasizes the need to combine metadata-driven models with natural language processing and human oversight to ensure balanced, accurate, and legally sound interventions. Future work should focus on expanding datasets, integrating textual features, and exploring advanced algorithms to enhance detection accuracy.
Overall, this study provides foundational evidence for the role of data mining in supporting Cyberlaw enforcement, underscoring the importance of technological innovation in addressing the complex issues of online harassment and threats in the digital age.
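To make the reported evaluation figures concrete, the following minimal sketch shows how accuracy, precision, recall, and F1-score are derived from a binary confusion matrix. The counts are hypothetical: they assume a held-out set of 200 tweets and were chosen only so that the results land near the values quoted in the abstract. They are not the study's actual confusion matrix, and the names `ConfusionCounts` and `classification_metrics` are illustrative rather than taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts, with 'threatening' as the positive class."""
    tp: int  # threatening tweets predicted as threatening
    fp: int  # non-threatening tweets predicted as threatening
    fn: int  # threatening tweets predicted as non-threatening
    tn: int  # non-threatening tweets predicted as non-threatening


def classification_metrics(c: ConfusionCounts) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1 from confusion counts."""
    total = c.tp + c.fp + c.fn + c.tn
    accuracy = (c.tp + c.tn) / total
    # Guard against division by zero when a class is never predicted or never present.
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# HYPOTHETICAL counts for an assumed 200-tweet test split (roughly balanced classes).
# These are NOT from the paper; they only approximate the reported percentages.
counts = ConfusionCounts(tp=52, fp=50, fn=49, tn=49)
metrics = classification_metrics(counts)

# On a balanced two-class problem, random guessing yields about 50% accuracy.
chance_baseline = 0.5

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
print(f"accuracy margin over chance: {metrics['accuracy'] - chance_baseline:+.3f}")
```

Comparing the accuracy against the chance baseline in this way shows why figures near 50% on balanced classes indicate that metadata features alone carry little discriminative signal, which is consistent with the abstract's call to add textual features.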