K-Means Algorithm Analysis for Election Cluster Prediction

Sri Ngudi Wahyuni - Universitas Amikom Yogyakarta, Sleman, Yogyakarta 55283, Indonesia
Nazmun Khanom - University of Professionals, Mirpur Cantonment, Dhaka, 1216, Bangladesh
Yuli Astuti - Universitas Amikom Yogyakarta, Sleman, Yogyakarta 55283, Indonesia

DOI: http://dx.doi.org/10.30630/joiv.7.1.1107


General elections are a democratic process held in every country with a presidential system of government, including Indonesia, which holds them every five years. In practice, some eligible voters abstain, which leads to wasted budget and missed targets. It is therefore important to identify clusters of election districts and map the number of voters so that the budget for the upcoming election can be allocated accordingly. Such prediction serves as an early warning that helps reduce budgeting risk. According to the latest election data from Margokaton, Yogyakarta, Indonesia, many people voted in 2021, but the number of abstainers was high. Cluster prediction is therefore important for identifying election participation in each area. The K-Means algorithm can also predict areas with many abstainers, supporting early mitigation when drafting the election budget. This study therefore aimed to identify voter patterns in the election using the K-Means algorithm. The data parameters comprised the list of voters, the unused ballot papers, and the total number of abstainers. The study is important because it contributes to reducing the election budget of each area. The data, obtained from the official website of the Indonesian Ministry of Internal Affairs in 2021, were processed using the RapidMiner tool. The results showed non-voters at more than 11% in Cluster 1, 16% in Cluster 2, and 8% in Cluster 3. The cluster evaluation gave a Davies-Bouldin index (DBI) of 2.04, which the study takes as indicating that the K-Means clustering is suitable, since DBI values closer to 0 indicate better clustering. The results suggest that using the DBI to test the cluster optimization of the K-Means algorithm is highly recommended. Based on this prediction, the government needs to give special attention to clusters with many abstainers in order to decrease their number and prevent overbudgeting. These results indicate the need to review the election participant data in 2024. Furthermore, continuous socialization and education about election activities are needed to reduce the number of abstainers and prevent overbudgeting.
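The workflow summarized in the abstract (K-Means clustering on three parameters, followed by Davies-Bouldin evaluation) can be sketched as follows. This is a minimal illustration and not the authors' RapidMiner process: the rows are made-up values for hypothetical polling stations, the farthest-first initialization is an assumption chosen only to make the run deterministic, and no feature scaling is applied.

```python
import math

# Hypothetical rows (NOT the study's data):
# (registered voters, unused ballot papers, abstainers)
points = [
    (300, 40, 35), (310, 42, 33), (295, 38, 36),
    (500, 90, 80), (510, 95, 85), (490, 88, 78),
    (700, 60, 55), (710, 62, 58), (690, 58, 54),
]

def farthest_first_init(points, k):
    """Deterministic start (an assumption): begin with the first point,
    then repeatedly add the point farthest from the chosen centroids."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids

def kmeans(points, k, max_iter=100):
    """Lloyd's K-Means with Euclidean distance; returns (centroids, labels)."""
    centroids = farthest_first_init(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable, so converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, labels

def davies_bouldin(points, labels, centroids):
    """Davies-Bouldin index: mean over clusters of the worst-case ratio
    (scatter_i + scatter_j) / distance(centroid_i, centroid_j). Lower is better."""
    k = len(centroids)
    scatter = []
    for j in range(k):
        members = [p for p, l in zip(points, labels) if l == j]
        # Mean distance of members to their centroid (0.0 for an empty cluster).
        s = sum(math.dist(p, centroids[j]) for p in members) / len(members) if members else 0.0
        scatter.append(s)
    worst = [
        max((scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in range(k) if j != i)
        for i in range(k)
    ]
    return sum(worst) / k

centroids, labels = kmeans(points, k=3)
dbi = davies_bouldin(points, labels, centroids)
print("labels:", labels)
print("DBI:", round(dbi, 3))
```

Because the three parameters are on different scales, a real analysis would likely normalize them before clustering; that step is omitted here to keep the sketch short.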


K-Means algorithm; cluster; prediction; election; Davies-Bouldin index.
