Fine-Tuning the Gemini 1.5 Flash Large Language Model for User Perception Classification in BSI Mobile Application Reviews

Authors

  • Rio Fidelis, Program Studi Sistem Informasi, Fakultas Sains dan Teknologi, Universitas Prima Indonesia
  • Vicraj Vicraj, Program Studi Sistem Informasi, Fakultas Sains dan Teknologi, Universitas Prima Indonesia
  • Dea Monica Bangun, Program Studi Sistem Informasi, Fakultas Sains dan Teknologi, Universitas Prima Indonesia
  • Nur Mayanti, Program Studi Sistem Informasi, Fakultas Sains dan Teknologi, Universitas Prima Indonesia
  • Evta Indra, Program Studi Sistem Informasi, Fakultas Sains dan Teknologi, Universitas Prima Indonesia

Keywords:

Large Language Model, Fine-Tuning, Gemini 1.5 Flash, Perception Classification, Sentiment Analysis, IndoBERT, Google Cloud Vertex AI

Abstract

The growing volume of user reviews on digital platforms such as the Google Play Store makes it difficult to understand user perceptions automatically, because the text is unstructured, varied, and highly subjective. Manual analysis at this scale is inefficient and prone to bias. To address this issue, this study fine-tunes the Gemini 1.5 Flash large language model (LLM) to classify user perceptions of the BSI Mobile application automatically. Perceptions are grouped into three classes: Very Poor, Fair, and Excellent. A total of 120,000 reviews were collected via web scraping, then cleaned, normalized, labeled automatically with the IndoBERT model, and converted to JSONL format for fine-tuning on the Google Cloud Vertex AI platform. Evaluation shows an accuracy of 63.41% for perception classification and 67.31% for sentiment classification, with F1-scores of 28.82% and 28.75%, respectively. The model was more accurate at identifying positive perceptions, while neutral or ambiguous reviews remained a challenge. A consistency analysis between predicted perceptions and user ratings showed a match rate of 83.81%. This study demonstrates that the fine-tuned Gemini 1.5 Flash is an effective solution for text-based perception classification and holds strong potential for broader application in user opinion analytics systems.
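
The gap between the reported accuracy and F1-scores can be illustrated with a minimal sketch. It assumes the F1-score is macro-averaged over the three perception classes, which the abstract does not state, and it uses hypothetical toy labels rather than the study's data. The helper names `accuracy` and `macro_f1` are illustrative and are not taken from the paper.

```python
# Illustrative only: toy labels, not the study's data.
# Assumption: F1 is macro-averaged across the three perception classes.

CLASSES = ["Very Poor", "Fair", "Excellent"]


def accuracy(y_true, y_pred):
    """Fraction of reviews whose predicted class equals the reference class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, classes=CLASSES):
    """Unweighted mean of per-class F1, so every class counts equally."""
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(classes)


# Hypothetical imbalanced sample: most reviews are "Excellent".
y_true = ["Excellent"] * 65 + ["Very Poor"] * 25 + ["Fair"] * 10
# A hypothetical classifier that always predicts the majority class.
y_pred = ["Excellent"] * 100

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(f"accuracy = {acc:.4f}, macro F1 = {f1:.4f}")
```

In this toy case accuracy is 0.65 while macro F1 is about 0.26, because the two minority classes each contribute an F1 of zero. Under the stated assumption, a similar pattern would be consistent with the model handling positive perceptions better than neutral or ambiguous ones.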

Published

2025-05-22

How to Cite

Fidelis, R., Vicraj, V., Bangun, D. M., Mayanti, N., & Indra, E. (2025). Fine-Tuning the Gemini 1.5 Flash Large Language Model for User Perception Classification in BSI Mobile Application Reviews. Jurnal Ilmiah Multidisiplin Indonesia (JIM-ID), 4(05), 195–208. Retrieved from https://ejournal.seaninstitute.or.id/index.php/esaprom/article/view/6660