Multivariate Data Analysis for Customer Segmentation Using Principal Component Analysis and K-Means Clustering

Authors

  • Bosker Sinaga, Information Technology, Mahkota Tricom Unggul University, Jl. Perintis Kemerdekaan No. 3A, Medan, Indonesia

Keywords:

Multivariate Data Analysis, Customer Segmentation, Principal Component Analysis, K-Means Clustering, Dimension Reduction

Abstract

This study applies multivariate data analysis to customer segmentation by combining Principal Component Analysis (PCA) with K-Means clustering. High-dimensional customer data is difficult to segment, which complicates targeted marketing decisions. To address this, PCA first reduces the dimensionality of the data while retaining most of its information, and K-Means then segments customers by demographic attributes and shopping behavior. On a dataset of 200 customers, the analysis identified three clusters that differ in age, annual revenue, and shopping score. The first two principal components explain more than 78% of the variance in the data, which makes the clusters easier to visualize and interpret. These findings provide a basis for marketing strategies targeted at each customer segment. In conclusion, the combination of PCA and K-Means is effective at simplifying complex data and producing meaningful customer segments.
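
The pipeline described in the abstract (standardize, reduce with PCA, cluster with K-Means) can be sketched as follows. This is a minimal illustration and not the author's implementation: the data are randomly generated stand-ins for the 200-customer dataset, the helper names (standardize, pca, kmeans) are hypothetical, and the choice of NumPy, SVD-based PCA, and plain Lloyd iterations with random initialization is an assumption. Because the data are synthetic, the explained variance and cluster sizes will not match the figures reported in the study.

```python
import numpy as np


def standardize(X):
    """Scale each column to zero mean and unit variance."""
    return (X - X.mean(axis=0)) / X.std(axis=0)


def pca(X, n_components):
    """Project X onto its top principal components using SVD."""
    Xc = X - X.mean(axis=0)
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    # Squared singular values are proportional to the component variances.
    explained_ratio = (S ** 2) / np.sum(S ** 2)
    scores = Xc @ Vt[:n_components].T
    return scores, explained_ratio[:n_components]


def assign(X, centroids):
    """Return the index of the nearest centroid for every row of X."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)


def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm with randomly chosen initial centroids."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(X, centroids)
        # Recompute each centroid; keep the old one if a cluster is empty.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return assign(X, centroids), centroids


# Synthetic stand-in data (assumption): 200 customers, three attributes.
rng = np.random.default_rng(42)
customers = np.column_stack([
    rng.integers(18, 71, size=200),   # age in years
    rng.normal(60, 25, size=200),     # annual revenue, arbitrary units
    rng.integers(1, 101, size=200),   # shopping score, 1 to 100
]).astype(float)

Z = standardize(customers)
scores, ratio = pca(Z, n_components=2)
labels, centroids = kmeans(scores, k=3)

print("explained variance ratio of first two components:", ratio)
print("cumulative explained variance:", ratio.sum())
print("cluster sizes:", np.bincount(labels, minlength=3))
```

Standardizing before PCA keeps attributes measured on larger scales from dominating the components. Clustering on the two-dimensional scores rather than the raw attributes is what allows the segments to be drawn on a single scatter plot.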

Author Biography

Bosker Sinaga, Information Technology, Mahkota Tricom Unggul University, Jl. Perintis Kemerdekaan No. 3A, Medan, Indonesia

Information Technology

Published

2025-08-13

How to Cite

Sinaga, B. (2025). Multivariate Data Analysis for Customer Segmentation Using Principal Component Analysis and K-Means Clustering. Jurnal Info Sains : Informatika Dan Sains, 15(01), 283–291. Retrieved from https://ejournal.seaninstitute.or.id/index.php/InfoSains/article/view/7192