RFM-Based Customer Segmentation and Product Association Analysis on Olist Brazilian E-Commerce Using FP-Growth
Keywords:
Customer Segmentation, K-Means Clustering, FP-Growth, RFM, e-commerce
Abstract
This study implements a two-stage analytics pipeline on the Olist Brazilian E-Commerce Public Dataset,
integrating RFM-based (Recency, Frequency, Monetary) K-Means Clustering with Market Basket Analysis
using the FP-Growth algorithm within the CRISP-DM framework. A total of 92,424 unique customers from 96,478
orders with delivered status were analyzed. K-Means Clustering with K=4 (Silhouette Score = 0.46) produced four
behaviorally distinct customer segments: Potential Loyalists (47,963 customers, 39.9% of revenue),
At-Risk/Lost (35,455 customers, 29.7%), Loyal Customers (6,272 customers, 24.9%), and Champions (2,734 customers,
5.4%). FP-Growth was run per segment on multi-item transactions only, and the analysis shows that more than
91% of Olist transactions are single-item purchases, a structural characteristic that limits the formation of
association rules. Even so, meaningful rules were found: the Loyal Customers segment yields a lift of 16.05 for
the product pair bebes → cool_stuff, while the At-Risk/Lost and Champions segments show a consistent association
between cama_mesa_banho and casa_conforto (lift 4.12–4.30). Integrating the two methods demonstrates that
K-Means segmentation enriches FP-Growth analysis by enabling contextual, per-segment association analysis,
producing cross-selling recommendations that are better targeted than those from a global analysis.
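To make the reported lift values easier to interpret, the following minimal sketch computes support, confidence, and lift for a single rule after dropping single-item baskets. It is an illustration only: the function name `rule_metrics` and the toy baskets are hypothetical, the counting is brute force rather than FP-Growth, and none of the numbers come from the Olist data.

```python
def rule_metrics(baskets, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    Hypothetical helper for illustration; it counts directly instead of
    building an FP-tree, so it is not an FP-Growth implementation.
    """
    # Keep only multi-item baskets, mirroring the filtering described in the abstract.
    multi = [set(b) for b in baskets if len(set(b)) > 1]
    n = len(multi)
    if n == 0:
        raise ValueError("no multi-item baskets to analyze")

    count_a = sum(1 for b in multi if antecedent in b)
    count_c = sum(1 for b in multi if consequent in b)
    count_ac = sum(1 for b in multi if antecedent in b and consequent in b)
    if count_a == 0 or count_c == 0:
        raise ValueError("antecedent or consequent never occurs")

    support = count_ac / n               # P(A and C)
    confidence = count_ac / count_a      # P(C | A)
    lift = confidence / (count_c / n)    # P(C | A) / P(C)
    return support, confidence, lift


# Toy baskets (assumed data, reusing category names from the abstract as labels only).
baskets = [
    ["bebes", "cool_stuff"],
    ["bebes", "cool_stuff"],
    ["cama_mesa_banho", "casa_conforto"],
    ["cama_mesa_banho", "bebes"],
    ["cool_stuff"],  # single-item basket, excluded from the analysis
    ["casa_conforto", "cama_mesa_banho"],
]

support, confidence, lift = rule_metrics(baskets, "bebes", "cool_stuff")
print(f"support={support:.3f} confidence={confidence:.3f} lift={lift:.3f}")
```

A lift above 1 means the consequent appears with the antecedent more often than its overall frequency would predict. Because the baseline frequency is measured only over the filtered multi-item baskets, lift values computed this way are relative to that subset rather than to all orders.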
Copyright (c) 2026 Cyntiya Olyfiyany, Hayyu Risma Ameilya (Author)

This work is licensed under a Creative Commons Attribution 4.0 International License.