Deep Embedded Clustering for Indonesian Protein, Fat, and Energy Availability Data

Authors

  • Zakha Maisat Eka Darmawan, The Electronic Engineering Polytechnic Institute of Surabaya
  • Oktavia Citra Resmi Rachmawati, Politeknik Internasional Tamansiswa Mojokerto
  • Ashafidz Fauzan Dianta, The Electronic Engineering Polytechnic Institute of Surabaya
  • Kholid Fathoni, The Electronic Engineering Polytechnic Institute of Surabaya
  • Rizky Yuniar Hakkun, The Electronic Engineering Polytechnic Institute of Surabaya
  • Tri Budi Santoso, The Electronic Engineering Polytechnic Institute of Surabaya
  • Kevin Ilham Apriandy, Politeknik Internasional Tamansiswa Mojokerto

DOI:

https://doi.org/10.37385/jaets.v7i2.8996

Keywords:

Deep Learning, Autoencoder, Hierarchical Clustering, Data Analysis, Deep Embedded Clustering

Abstract

Understanding disparities in regional food availability is crucial for food security policy. Most previous studies of Indonesian food availability use conventional clustering methods, which operate directly on the original feature space and may miss complex, nonlinear relationships in nutritional data. This study analyzes patterns of provincial food availability in Indonesia using Deep Embedded Clustering (DEC). It uses per capita indicators of energy, fat, and protein from both plant and animal sources, as well as the 2023 Food Consumption Pattern (FCP) score. DEC integrates representation learning with clustering, which allows the model to capture latent structures and nonlinear relationships that clustering on the raw features may not identify. The analysis first compared K-Means and Hierarchical Clustering by silhouette score to select a method for generating pseudo-labels for the DEC model. Hierarchical Clustering with Ward linkage and Euclidean distance achieved the highest silhouette score (0.3958) and was used for pseudo-label generation. Two DEC configurations were then implemented, and both improved clustering performance, with silhouette scores of 0.7829 (DEC-1) and 0.6385 (DEC-2). The results reveal four distinct clusters of Indonesian provinces, each with different food availability characteristics, ranging from balanced, nutrient-rich regions to provinces with more limited or specific nutritional patterns. The findings show that DEC can capture complex structures in nutritional data and, as measured by silhouette score, produces better-separated clusters than conventional approaches. In practice, the identified clusters give policymakers, nutrition experts, and the food industry a basis for region-specific strategies to improve food security and nutritional balance.
Theoretically, this study contributes to the use of deep learning-based clustering in food availability analysis, particularly in national food security research. Future research may extend this approach by integrating time-series data and spatial analysis to better understand the temporal and regional dynamics of food availability in Indonesia.
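
The two quantities the abstract relies on, the silhouette score and the DEC cluster assignment, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The data are synthetic stand-ins, the pseudo-labels are fixed by hand instead of coming from Ward-linkage clustering, and the names `silhouette_score`, `soft_assign`, and `target_distribution` are hypothetical. The soft assignment and target distribution follow the standard DEC formulation (Student's t kernel with a sharpened target), which is assumed here because the abstract does not describe the two DEC configurations. Initialising centroids from pseudo-label means is likewise an assumption.

```python
import numpy as np


def silhouette_score(X, labels):
    """Mean silhouette coefficient with Euclidean distance (needs >= 2 clusters)."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    # Pairwise Euclidean distances, shape (n, n).
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    clusters = np.unique(labels)
    scores = np.zeros(len(X))
    for i in range(len(X)):
        same = labels == labels[i]
        n_same = same.sum()
        if n_same == 1:
            continue  # singleton clusters score 0 by convention
        # a: mean distance to the other members of the point's own cluster.
        a = dist[i, same].sum() / (n_same - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(dist[i, labels == c].mean() for c in clusters if c != labels[i])
        scores[i] = (b - a) / max(a, b)
    return scores.mean()


def soft_assign(Z, centroids, alpha=1.0):
    """Student's t soft assignment q_ij of embeddings Z to cluster centroids."""
    sq_dist = ((Z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    q = (1.0 + sq_dist / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(axis=1, keepdims=True)


def target_distribution(q):
    """Sharpened target p_ij proportional to q_ij^2 / f_j, with f_j = sum_i q_ij."""
    weight = q ** 2 / q.sum(axis=0)
    return weight / weight.sum(axis=1, keepdims=True)


# Hypothetical stand-in data: two well-separated blobs in three dimensions.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (10, 3)), rng.normal(3.0, 0.3, (10, 3))])
# Assumption: pseudo-labels are given; in the study they come from Ward linkage.
pseudo_labels = np.repeat([0, 1], 10)

score = silhouette_score(X, pseudo_labels)
# Assumption: centroids are initialised as the mean of each pseudo-labelled group.
centroids = np.stack([X[pseudo_labels == k].mean(axis=0) for k in (0, 1)])
q = soft_assign(X, centroids)
p = target_distribution(q)
# KL(P || Q) is the clustering loss that DEC training would minimise.
kl = float((p * np.log(p / q)).sum())
```

In a full DEC model, `Z` would be the encoder output of an autoencoder rather than the raw features, and the encoder weights and centroids would be updated by gradient descent on `kl`.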

Published

2026-06-15

How to Cite

Eka Darmawan, Z. M., Rachmawati, O. C. R., Dianta, A. F., Fathoni, K., Hakkun, R. Y., Santoso, T. B., & Apriandy, K. I. (2026). Deep Embedded Clustering for Indonesian Protein, Fat, and Energy Availability Data. Journal of Applied Engineering and Technological Science (JAETS), 7(2), 1561-1579. https://doi.org/10.37385/jaets.v7i2.8996