Enhancing Javanese Emotion Classification: A Comparative Study of Cross-Lingual, Supervised, and Hybrid Transfer Learning using IndoBERTweet

Galih Setiawan Nurohim; Heribertus Ary Setyadi; Sigit Wahyudi; Paulus Tofan Rapiyanta

doi:10.63158/journalisi.v8i3.1657

Authors

Galih Setiawan Nurohim, Universitas Bina Sarana Informatika, Indonesia
Heribertus Ary Setyadi, Universitas Bina Sarana Informatika, Indonesia
Sigit Wahyudi, Sebelas Maret University, Indonesia
Paulus Tofan Rapiyanta, Universitas Bina Sarana Informatika, Indonesia

Keywords:

Emotion classification, Javanese Ngoko, cross-lingual transfer learning, IndoBERTweet, machine translation

Abstract

This research investigates the efficacy of transfer learning for five-class emotion classification in Javanese Ngoko. A parallel Indonesian–Javanese Ngoko corpus was synthesized by machine-translating 5,400 samples from the PRDECT-ID dataset, with translation quality verified on a preliminary expert validation sample. Using IndoBERTweet as the backbone architecture, three paradigms were evaluated under identical hyperparameters: zero-shot transfer (E1), fully supervised learning (E2), and cross-lingual transfer learning (E3). Empirical results indicate that the cross-lingual transfer setup (E3) achieved the best performance (67.5% accuracy; 0.67 weighted F1) under the evaluated dataset and experimental setting. Per-class analysis showed that positive affect (Happy) remained stable across languages, whereas negative emotions (Sadness, Fear) degraded because of lexical divergence between the two languages. Training dynamics revealed early-onset overfitting, suggesting that model capacity exceeds current dataset density. This work establishes a baseline benchmark for Javanese emotion classification and provides a reproducible machine-translated parallel corpus, emphasizing the need for future validation with native-speaker data to mitigate translation bias.
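The two metrics reported above, accuracy and weighted F1, can be illustrated with a minimal sketch. It assumes the standard support-weighted definition of F1, in which each class's F1 score is weighted by its share of the true labels. The function names and the toy labels are illustrative only and are not taken from the study's code or data.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights proportional to class support."""
    support = Counter(y_true)  # number of true samples per class
    total = len(y_true)
    score = 0.0
    for label in sorted(support):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0  # F1 = 2TP / (2TP + FP + FN)
        score += f1 * support[label] / total
    return score


# Toy example (hypothetical labels, not the paper's data).
y_true = ["Happy", "Happy", "Sadness", "Fear"]
y_pred = ["Happy", "Sadness", "Sadness", "Fear"]
print(accuracy(y_true, y_pred), weighted_f1(y_true, y_pred))
```

Because the weights follow class support, frequent classes dominate the weighted F1, so it tends to sit close to accuracy, while weak per-class scores on rarer emotions contribute less to the aggregate.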


