Comparison and Evaluation of Euclidean Distance and Arccosine Distance in Adaptive K-Means Clustering Algorithm for Penguin Species Clustering

Authors

  • Herlina Br Nainggolan, Universitas Katolik Santo Thomas, Medan, Indonesia
  • Pandi Barita Nauli Simangungsong, Universitas Katolik Santo Thomas, Medan, Indonesia

DOI:

https://doi.org/10.58471/jds.v3i2.6890

Keywords:

Clustering, Arccosine distance, Euclidean distance

Abstract

Clustering is an important unsupervised learning method for grouping data by similarity of characteristics. This study clusters penguin species by weight, height, and wing length using the K-Means algorithm with two distance measures: Euclidean and Arccosine. After cleaning, the dataset contains 342 data points. The evaluation shows that Arccosine distance yields a clustering accuracy of 89.6%, compared with 63.09% for Euclidean distance. This indicates that, on this dataset, Arccosine distance is better suited than Euclidean distance for clustering penguin species.
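
To make the two distance measures concrete, the following minimal Python sketch shows how each could be computed and used in the assignment step of K-Means. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not define Arccosine distance, so it is assumed here to be the angle between two feature vectors (the arccosine of their cosine similarity). The function names and the sample values are hypothetical.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def arccosine_distance(a, b):
    """Angle in radians between two non-zero feature vectors.

    Assumed definition: arccos of the cosine similarity.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.acos(cosine)

def assign_to_nearest(point, centroids, distance):
    """Return the index of the centroid closest to `point` (K-Means assignment step)."""
    return min(range(len(centroids)), key=lambda i: distance(point, centroids[i]))

# Hypothetical two-dimensional values, chosen only to show the metrics disagreeing.
point = (2.0, 2.0)
centroids = [(1.0, 0.0), (10.0, 10.0)]

nearest_euclidean = assign_to_nearest(point, centroids, euclidean_distance)
nearest_arccosine = assign_to_nearest(point, centroids, arccosine_distance)
```

In this toy case the Euclidean assignment picks the first centroid, which is closer in magnitude, while the Arccosine assignment picks the second, which points in the same direction as the sample. Differences of this kind are one way the choice of distance can change the resulting clusters.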

Published

2025-08-28

How to Cite

Herlina Br Nainggolan, & Pandi Barita Nauli Simangungsong. (2025). Comparison and Evaluation of Euclidean Distance and Arccosine Distance in Adaptive K-Means Clustering Algorithm for Penguin Species Clustering. Journal Of Data Science, 3(02), 69–78. https://doi.org/10.58471/jds.v3i2.6890