Sentiment Analysis of Negative Comments (Hate Speech) on Twitter with K-means Clustering Algorithm Using RapidMiner
Keywords:
Data Mining, K-Means, RapidMiner, Negative Comments, Twitter
Abstract
This study examines sentiment in negative comments (hate speech) on the social media platform Twitter by applying the K-Means clustering algorithm in RapidMiner. Twitter has become one of the main platforms for openly sharing public opinion, including negative comments that may amount to hate speech. To understand the sentiment patterns in these comments, clustering was carried out on a dataset of 27,325 tweets obtained from Kaggle. The research stages were data collection, preprocessing, and application of the K-Means algorithm with three clusters, which grouped the comments into negative, neutral, and positive categories. The negative cluster was the largest, with 14,032 entries (about 51% of the dataset), followed by 9,924 neutral entries (about 36%) and 3,369 positive entries (about 12%). These findings indicate that the K-Means algorithm is effective in identifying the distribution of hate speech on social media and can provide useful insight for automatically monitoring and mitigating negative content. This study is expected to serve as a foundation for developing more accurate and adaptive sentiment analysis systems that keep pace with the dynamics of digital communication.
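The three-cluster K-Means step described in the abstract can be illustrated with a minimal sketch. The study itself used RapidMiner, so the plain-Python code below is not the authors' workflow; it only shows the assign-and-update loop that K-Means performs. The function name `kmeans`, the two-dimensional feature vectors, and the choice of initial centroids are all assumptions made for illustration.

```python
import math
from collections import Counter


def kmeans(points, centroids, max_iter=100):
    """Plain K-Means: assign each point to its nearest centroid, then recompute centroids."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid by Euclidean distance.
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members;
        # a cluster with no members keeps its previous centroid.
        updated = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(old)
        if updated == centroids:  # centroids stopped moving: converged
            break
        centroids = updated
    return labels, centroids


# Hypothetical features per tweet: (share of negative words, share of positive words).
# These values are invented for illustration and are not taken from the study's dataset.
points = [
    (0.9, 0.0), (0.8, 0.1), (0.7, 0.0),  # mostly negative wording
    (0.1, 0.1), (0.2, 0.2),              # little sentiment-bearing wording
    (0.0, 0.9),                          # mostly positive wording
]
initial = [points[0], points[3], points[5]]  # assumed starting centroids, one per group

labels, centroids = kmeans(points, initial)
sizes = Counter(labels)  # number of tweets per cluster index
print(labels, dict(sizes))
```

K-Means returns only numbered clusters. Naming them negative, neutral, and positive, as the study does, is a separate interpretation step based on inspecting the members of each cluster.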
Data Availability Statement
Yes, the data are available on the website of the Journal of Information Technology and Informatics Engineering.
License
Copyright (c) 2025 Nazwa Alfira, Muhammad Refa Tsalits Ramdhani, Muhammad Ridho Putra Budika, Muhammad Virgi Santoso, Nyla Zahry

This work is licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0).