Optimized K-Means Clustering for Web Server Anomaly Detection Using Elbow Method and Security-Rule Enhancements
Abstract
Anomaly detection in web server environments is essential for identifying early indicators of cyberattacks that surface as abnormal request behavior. Traditional signature-based mechanisms often fail to detect emerging or obfuscated threats, so more adaptive analytical approaches are needed. This study proposes an optimized anomaly detection model that uses K-Means clustering, with the number of clusters selected by the Elbow Method, enhanced with engineered security-rule features. Two datasets were used: a small dataset of 3,399 log entries from one VPS and a large dataset of 223,554 entries collected from three VPS nodes, all sourced from local production servers of the Department of Computer and Business, Politeknik Negeri Cilacap. The preprocessing pipeline includes timestamp normalization, removal of non-informative static resources, numerical feature scaling, and TF-IDF encoding of URL paths. Domain-driven security features (entropy scores, encoded-payload indicators, abnormal status-code ratios, and request-rate deviations) were integrated to improve anomaly separability. Experiments across five model configurations show that combining the larger dataset with rule-based features significantly improves clustering performance, achieving a Silhouette Score of 0.9136 and a Davies–Bouldin Index of 0.4712. The results support the effectiveness of combining security-rule engineering with unsupervised learning for early-warning threat detection in web server environments.
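The four security-rule features named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give their exact definitions, so everything here is an assumption made for illustration: the function names `shannon_entropy` and `security_features`, the regular expression used to flag encoded payloads, the choice of status codes 400 and above as "abnormal", and the z-score form of the request-rate deviation.

```python
import math
import re
from collections import Counter
from urllib.parse import unquote

# Assumed heuristic: percent-encoded bytes or long base64-like runs
# suggest an encoded payload. The paper's actual rule is not stated.
_ENCODED = re.compile(r"%[0-9A-Fa-f]{2}|[A-Za-z0-9+/]{24,}={0,2}")


def shannon_entropy(text: str) -> float:
    """Shannon entropy (bits per character) of a string."""
    if not text:
        return 0.0
    counts = Counter(text)
    n = len(text)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def security_features(path, statuses, rate, mean_rate, std_rate):
    """Build one hypothetical feature row for a client.

    path      -- requested URL path (with query string)
    statuses  -- HTTP status codes observed for this client
    rate      -- this client's requests per time window
    mean_rate, std_rate -- rate statistics over all clients
    """
    # Assumption: any 4xx or 5xx response counts as abnormal.
    error_ratio = (
        sum(1 for s in statuses if s >= 400) / len(statuses) if statuses else 0.0
    )
    # Assumption: deviation expressed as a z-score.
    rate_dev = (rate - mean_rate) / std_rate if std_rate > 0 else 0.0
    return {
        "entropy": shannon_entropy(unquote(path)),
        "encoded_payload": 1 if _ENCODED.search(path) else 0,
        "error_ratio": error_ratio,
        "rate_deviation": rate_dev,
    }


row = security_features("/index.php?q=%3Cscript%3E", [200, 404, 500, 200], 30, 10, 5)
```

Rows like `row` would then be scaled and concatenated with the TF-IDF path vectors before clustering; that step is omitted here because it depends on details the abstract does not state.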
Copyright (c) 2025 Journal of Information Systems and Informatics

This work is licensed under a Creative Commons Attribution 4.0 International License.