CLASSIFICATION OF MATURITY LEVELS IN ARECA FRUIT BASED ON HSV IMAGE USING THE KNN METHOD

Areca nut (Areca catechu) is a palm that grows in Asia, Africa, and the eastern Pacific; in Indonesia it is found on the islands of Java, Sumatra, and Kalimantan. The maturity of areca nut is still classified manually, an approach that is subjective. To address this problem, this study builds a system that classifies the maturity level of areca nut using HSV feature extraction, with the classification stage performed by the K-Nearest Neighbor (KNN) method. The study uses a dataset of 842 images divided into three classes: ripe, unripe, and old fruit. The dataset was split into 683 training data and 159 test data. The test data are then classified with the K-Nearest Neighbor method by computing the nearest distance with k = 1, which yields an accuracy of 87.42%.


Introduction
Areca nut is a palm that grows in Asia, Africa, and the eastern Pacific. The plant reaches a height of 12 to 30 m, has white fibrous roots, and has an upright, unbranched stem about 15 to 20 cm in diameter with clearly visible leaf scars. The stem forms at about 2 years of age, and the plant bears fruit at 5 to 8 years depending on soil conditions; soil with good moisture and a pH of 5 to 8 strongly supports its growth. In Indonesia, areca nut is widely found on the islands of Sumatra, Java, Kalimantan, Sulawesi, and Papua. Indonesia is among the largest areca nut exporting countries, with an export volume of one hundred and ten thousand tons in 2007 (Warnakulasuriya & Chen, 2022).
Exports increased in the following years, and 80% of the demand for areca nut is met by Indonesia. In developed countries such as Belgium, the Netherlands, Japan, Germany, South Korea, and China, areca nut is used as a pharmaceutical raw material; it is also used as an ingredient in betel chewing. Areca nut is also known as a medicinal ingredient, for example for treating toothache in children and adults, dysentery, bloody diarrhea, and scabies. Areca nut seeds contain alkaloids such as arecoline (C8H13NO2), arecolidine, arecaine, guvacoline, guvacine, and isoguvacine, as well as condensed tannins, hydrolyzed tannins, flavans, phenolic compounds, gallic acid, latex, lignin, volatile and non-volatile oils, and salt. This research relates to a frequently discussed topic in the areca nut plantation sector: determining harvest readiness from the age of the fruit, which is impractical because the age differs from one fruit to another. Harvest readiness can also be determined from changes in the color of the skin, the hardness of the skin, fruit drop, and cracking of the skin. In addition, the distribution of areca nut across various regions makes classification by maturity level important. The ripeness of areca nut progresses from raw to ripe to old, so the color of the fruit is an important indicator of both its maturity level and its quality. Classifying areca nut maturity aims to reduce the risk of unripe fruit being selected. Manual classification still has several weaknesses: it is slow, has low accuracy, and is inconsistent, because the determination is made subjectively by areca nut farmers.
Automatic classification of areca nut maturity, by contrast, can be faster and objective, and it can improve both accuracy and efficiency. Fruit maturity can be identified using image processing. Arif Patriot's research on identifying mango ripeness from images using GLCM and LAB values reported an accuracy of 62.5% (Pamungkas, et al., 2019). GLCM is the Gray Level Co-Occurrence Matrix of the image, from which the contrast, correlation, energy, and homogeneity features of the grayscale image are taken. The LAB features are the average and standard deviation of the color image values. The mango maturity classification was performed using KNN (Pamungkas, et al., 2019). Another related study in image pattern recognition is the identification of mango fruit maturity (Barkah, 2020). The images used are of mango fruit classified into 3 classes: 38 ripe mangoes, 12 half-ripe mangoes, and 50 unripe mangoes. During identification, the input image is preprocessed and the standard deviation feature is also extracted. The final stage is classification or identification using Euclidean distance, with an accuracy of 84% (Barkah, 2020).
In 2019, Cinantya Paramita, Eko Hari Rachmawanto, Christy Atika Sari, and De Rosal Ignatius Moses Setiadi carried out a study titled "Classification of limes on the level of ripeness of fruit based on color features using K-Nearest Neighbor". The study used the K-Nearest Neighbor method to classify limes by maturity level from color features, with k = 3 and k = 7 and a Euclidean distance search, and produced an accuracy of 92% (Paramita, et al., 2019).
A study by Abdullah and Pahrianto (2017), titled "Tomato maturity classification system based on color and shape using the Support Vector Machine (SVM) method", used the Support Vector Machine (SVM) method; the tests showed that the system achieved an average accuracy of 82.83% with a standard deviation of 1.52 (Mubarok, 2021). In 2019, Husnul Khotimah, Nur Nafi'ah, and Masruroh conducted a study titled "classification of mango fruit maturity based on HSV image features using the KNN method". The images are converted to HSV (Hue, Saturation, Value), and the features used are the mean, skewness, and kurtosis. On the mango image dataset, k = 1 gave an accuracy of 56%. The mango images were classified as raw, sufficiently ripe, ripe, and very ripe; with k = 2 and training data of 129 mangoes, the accuracy was 80% (Nafiah, 2019).
Previous studies and existing problems show that the maturity of areca nut is still identified in a very simple, traditional way that relies only on the sense of sight, which reduces sorting quality. A system is therefore needed that can identify the maturity of areca nut as early as possible so that no mistakes occur at post-harvest. This motivates the author to design such a system, presented in this study under the title "Classification of betel nut maturity based on HSV images with KNN". It is hoped that this research will help areca nut farmers and also allow areca nut sold to the public to be classified by maturity level, so that the community receives better-quality fruit (Yang, et al., 2021; Thirani, et al., 2022; Pusadan & Abdullah, 2022).

Literature Review
According to research conducted by (Paramita et al., 2019), "Classification of Lime Based on Level of Fruit Maturity Based on Color Features Using the K-Nearest Neighbor Method", selecting limes by human visual judgment is subjective and inconsistent, so its accuracy is low. An automatic method is therefore needed that improves accuracy and gives a consistent assessment when classifying lime maturity levels from color features. The study states that the K-Nearest Neighbor method is a fairly simple classification method with fairly good accuracy: it works on the closest distance between training data and test data, measured with Cityblock Distance and Euclidean Distance. Decision making with the K-Nearest Neighbor method assigns new data to a group based on the distance to its k nearest neighbors among the training data. The k values used in that paper are 1, 3, 5, 7, and 9, and the distances between training and test data are computed with Euclidean Distance and Cityblock Distance.
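The nearest-neighbor decision described above can be illustrated with a minimal Python sketch. It is not the implementation used in the cited study or in this one: the function names, the toy feature vectors, and the tie-breaking rule (ties go to the label seen first among the nearest neighbors) are assumptions made for illustration.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cityblock(a, b):
    """Cityblock (Manhattan) distance between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_features, train_labels, query, k=3, distance=euclidean):
    """Return the majority label among the k training points nearest to query."""
    ranked = sorted(
        zip(train_features, train_labels),
        key=lambda pair: distance(pair[0], query),
    )
    votes = Counter(label for _, label in ranked[:k])
    # most_common keeps first-seen order on ties, so ties favor nearer points.
    return votes.most_common(1)[0][0]


# Hypothetical (hue, saturation, value) features, not data from the study.
train_features = [
    (0.10, 0.80, 0.70),
    (0.12, 0.75, 0.65),
    (0.30, 0.60, 0.50),
    (0.32, 0.55, 0.45),
    (0.08, 0.40, 0.30),
]
train_labels = ["ripe", "ripe", "unripe", "unripe", "old"]

query = (0.11, 0.78, 0.68)
for k in (1, 3):
    print(k, knn_predict(train_features, train_labels, query, k=k))
```

Passing `distance=cityblock` switches the metric without changing the voting logic, which mirrors how the cited study compares the two distances over several k values.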
According to research conducted by (Pariyandani et al., 2019), "Classification of Formalin Fish Imagery using the K-NN Method and GLCM", health is a very important factor in life. Maintaining health requires improving the quality of food, namely by following the "4 healthy 5 perfect" guideline, which includes consuming nutritious, protein-rich food. One such high-protein food is fish. As a maritime country, Indonesia has abundant and diverse marine resources, and for centuries many people have depended on fish for their livelihood. With advances in fishing technology and modern equipment, approximately 60 million tons of fish, worth about 20 billion US dollars, are caught from the sea every year. Growing industrial competition has led to improper practices, such as preserving fish with formalin, a dangerous chemical that is used for preservation and is also often used in the textile industry. Long-term exposure to formalin can cause many diseases, including digestive disorders and cancer. The study uses the K-NN and GLCM methods to distinguish images of fresh fish from images of fish treated with formalin. The materials were formalin liquid as the test substance and fresh fish as the main test material. The research sample consisted of 500 fish images: 250 images of fresh fish and 250 images of formalin-treated fish. Of the 500 samples, 60% were used for training and 40% for testing. Contrast calculations can be used to measure the spatial frequency of the image and the GLCM moments; contrast reflects the differences in gray level between pixels in the image.
According to research conducted by (Wijaya & Ridwan, 2019), "Classification of Types of Apples Using the K-Nearest Neighbors Method", apples are a fruit enjoyed and consumed by the public. Apples have many varieties that can be distinguished by the color and shape of the fruit. Besides being eaten raw, apples can be processed first into products such as candied fruit, apple chips, and drinks. With so many types of apples, ordinary people often find it difficult to tell the types apart. One way to address this is to use computer technology. The study uses Hue Saturation Value (HSV) and Local Binary Pattern (LBP) features to extract the color and shape of the fruit, which then serve as the characteristics of the apples being studied (Shumaila, et al., 2022).
According to research conducted by (Meiriyama, 2018), "HSV Color Feature Fruit Image Classification With SVM Classifier", image classification of fruit objects is a classic problem in image classification that still attracts researchers' interest. In fruit classification, feature selection is one of the factors that affect the success rate and the level of accuracy. The features obtained are used to train an SVM to find the ideal hyperplane with the maximum margin. Extracting the features takes several steps: converting the RGB color model into the HSV color model; forming an HSV histogram with 16 bins for the Hue channel, 4 bins for Saturation, and 4 bins for Value; and normalizing the HSV histogram by dividing each value by the total of the histogram.
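The histogram steps listed above can be sketched in Python with the standard-library `colorsys` module. This is a minimal sketch, not the cited author's code: it assumes a joint histogram of 16 x 4 x 4 = 256 bins (the source does not say whether the channels are binned jointly or separately), 8-bit RGB input, and a hypothetical function name `hsv_histogram`.

```python
import colorsys


def hsv_histogram(rgb_pixels, h_bins=16, s_bins=4, v_bins=4):
    """Build a normalized joint HSV histogram from 8-bit (r, g, b) pixels.

    Assumption: the bins are joint (h_bins * s_bins * v_bins entries).
    """
    hist = [0.0] * (h_bins * s_bins * v_bins)
    for r, g, b in rgb_pixels:
        # colorsys expects and returns values in the range [0, 1].
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # Clamp so that a channel value of exactly 1.0 falls in the last bin.
        hi = min(int(h * h_bins), h_bins - 1)
        si = min(int(s * s_bins), s_bins - 1)
        vi = min(int(v * v_bins), v_bins - 1)
        hist[(hi * s_bins + si) * v_bins + vi] += 1.0
    total = sum(hist)
    # Normalize by the histogram total so the entries sum to 1.
    return [count / total for count in hist] if total else hist


# Four hypothetical pixels: two red, one green, one blue.
pixels = [(255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 0, 0)]
histogram = hsv_histogram(pixels)
print(len(histogram), sum(histogram))
```

Normalizing by the total makes histograms comparable between images with different pixel counts, which matters when the vectors are fed to a distance-based or margin-based classifier.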

Research Methods
In this study, the research material is a public dataset uploaded by the MIHAI MINUT account on the Kaggle website under the title "fruit-262", which contains 262 kinds of fruit. The data studied are those under the Pinang (areca) fruit label, with 1,020 datasets divided into 3 classes. Each class is in jpg format: the label "Mature" contains images of ripe areca nut, the label "Raw" contains images of unripe areca nut, and the label "old" contains images of old areca nut.

A. System Design Flow
The following is the test design, implemented in Matlab, for identifying the maturity level of areca nut using the HSV and KNN methods. The stages of the test design in this study are:
1) The areca nut dataset is assembled.
2) The dataset is processed through several pre-processing stages: cropping, resizing each image to 100x100, and converting to grayscale with Matlab.
3) The dataset is divided manually into 80% training data and 20% test data.
4) The RGB values are converted through segmentation into the HSV color space; Hue, Saturation, and Value are obtained by evaluating the conversion equations.
5) The test data are classified with the KNN algorithm using K values from 1 to 9.
6) The dataset is tested to find the K value with the highest accuracy.
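The source does not print the conversion equations used in step 4. For reference, the commonly used RGB-to-HSV conversion is given below, assuming R, G, and B are normalized to [0, 1]; the symbol Δ is introduced here for illustration only.

```latex
\begin{aligned}
V &= \max(R, G, B), \qquad \Delta = V - \min(R, G, B), \\[4pt]
S &= \begin{cases}
  0, & V = 0, \\
  \Delta / V, & \text{otherwise},
\end{cases} \\[4pt]
H &= \begin{cases}
  0^\circ, & \Delta = 0, \\
  60^\circ \times \left( \dfrac{G - B}{\Delta} \bmod 6 \right), & V = R, \\
  60^\circ \times \left( \dfrac{B - R}{\Delta} + 2 \right), & V = G, \\
  60^\circ \times \left( \dfrac{R - G}{\Delta} + 4 \right), & V = B.
\end{cases}
\end{aligned}
```

Under this definition H lies in [0°, 360°) and S and V lie in [0, 1]; some tools instead rescale H to [0, 1], so the scale should be checked before comparing feature values.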

Results and Discussions
a) Implementation
This chapter describes how the system works and the results obtained. The system successfully detects the maturity level of areca nut by applying HSV color feature extraction and K-NN classification. Matlab is used as the supporting program for this research. The maturity levels of areca nut in this study are raw, ripe, and old.
b) Testing
-Areca Fruit Dataset
The initial stage is image acquisition, which determines the data needed for the image segmentation process. The Pinang (areca) fruit dataset was obtained from a public dataset uploaded by the MIHAI MINUT account on the Kaggle website. The image size is then normalized by resizing each original image to a fixed number of pixels. In this study, each cropped image in the dataset is resized to 100x100 pixels.
-Dataset Sharing
The dataset is divided into a training-data folder and a test-data folder using the split-folders Python library package, run in Google Colab. After import splitfolders, the call splitfolders.ratio(input_folder, output="buahpinang_2", seed=42, ratio=(.8, .2), group_prefix=None) divides the dataset into 80% training data and 20% test data.
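The idea behind a seeded, per-class 80/20 split can be sketched with the Python standard library alone. This is not how the split-folders package is implemented and it will not reproduce the same file assignment; the function name `split_by_class` and the file names are hypothetical, and only the class labels come from the dataset description.

```python
import random


def split_by_class(files_by_class, train_ratio=0.8, seed=42):
    """Shuffle each class with a fixed seed, then cut it into train and test."""
    rng = random.Random(seed)
    train, test = {}, {}
    for label, files in files_by_class.items():
        shuffled = sorted(files)  # sort first so the result is reproducible
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * train_ratio)
        train[label] = shuffled[:cut]
        test[label] = shuffled[cut:]
    return train, test


# Hypothetical file names; the class labels follow the dataset description.
files_by_class = {
    label: [f"{label}_{i:03d}.jpg" for i in range(10)]
    for label in ("Mature", "Raw", "old")
}
train_split, test_split = split_by_class(files_by_class)
for label in files_by_class:
    print(label, len(train_split[label]), len(test_split[label]))
```

Splitting inside each class keeps the class proportions roughly equal in both subsets, and the fixed seed makes the split repeatable between runs.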

-HSV Feature Extraction
After the dataset is divided into 80% training data and 20% test data, HSV (Hue, Saturation, Value) feature extraction is applied to obtain the feature values of each image object. The following are the results for the test dataset and training dataset obtained from feature extraction using HSV (Putra, et al., 2021). To obtain the HSV features, the RGB dataset images go through the grayscale, binary, and HSV stages. The RGB images are first converted to grayscale using Matlab software, with a total of 599 datasets divided into 485 training and 114 test images. The following displays the RGB images that have been converted to grayscale. After the grayscale stage, the dataset goes through a morphological stage that produces a binary image, which is used to remove noise in the image. The next stage is HSV (Hue, Saturation, and Value) feature extraction, which obtains feature information from the training and test image objects for deciding whether the areca nut is ripe, unripe, or old. From this feature extraction, the Hue, Saturation, and Value of each image object are obtained as training data. Below is a table of the training data extraction results obtained through HSV feature extraction. Table 2 shows 15 of the 683 training data. After the training data are extracted, feature extraction is performed on the test data. The training data and test data are then processed with the subsequent method, KNN. The following is the result for the test data that have gone through the feature extraction stage, showing 15 of the 159 test data.
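As a rough illustration of how one Hue, Saturation, and Value triple per image might be obtained, the Python sketch below averages the HSV values of the pixels. It rests on assumptions not confirmed by the source: that the per-image feature is the channel mean, that pixels are 8-bit RGB, and that hue can be averaged arithmetically (which ignores the circular nature of hue). The function name `mean_hsv` is hypothetical, and the study itself used Matlab.

```python
import colorsys


def mean_hsv(rgb_pixels):
    """Return the mean (hue, saturation, value) of 8-bit (r, g, b) pixels.

    Assumption: the image feature is the plain per-channel mean, with all
    three channels scaled to [0, 1] as returned by colorsys.
    """
    h_sum = s_sum = v_sum = 0.0
    count = 0
    for r, g, b in rgb_pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        h_sum += h
        s_sum += s
        v_sum += v
        count += 1
    if count == 0:
        raise ValueError("at least one pixel is required")
    return (h_sum / count, s_sum / count, v_sum / count)


# Hypothetical pixels, for example the foreground left after the binary mask.
foreground = [(255, 0, 0), (255, 0, 0)]
print(mean_hsv(foreground))
```

In practice only the pixels inside the binary mask would be passed in, so that the background does not dilute the color features.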
-KNN Classification
After the HSV features are obtained, classification is carried out with KNN using the nearest distance with k = 3. As an example, the test with k = 3 gave an accuracy of 83.01%. Of the 159 test data (from a total of 842 data), 132 were classified correctly and 27 incorrectly; the test results are as follows:
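The reported accuracy follows directly from the counts above, as the short Python check below shows. The helper name `accuracy_percent` is hypothetical. The count of 139 correct predictions for k = 1 is not stated in the source; it is inferred here as the whole number that matches the reported 87.42% on 159 test data.

```python
def accuracy_percent(correct, total):
    """Accuracy as a percentage: correct predictions over all predictions."""
    return 100.0 * correct / total


# k = 3: 132 of 159 test data classified correctly (counts from the text).
k3_accuracy = accuracy_percent(132, 159)
print(f"{k3_accuracy:.4f}")  # about 83.0189, reported as 83.01% in the text

# k = 1: 139 correct is an inferred count, not a figure given in the source.
k1_accuracy = accuracy_percent(139, 159)
print(f"{k1_accuracy:.2f}")
```

The exact k = 3 value is about 83.0189%, so the 83.01% in the text corresponds to truncating rather than rounding to two decimals.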

Conclusion
After testing, the researcher concludes that applying HSV color feature extraction and K-NN classification to detect the maturity level of areca nut works well as an areca nut maturity classification system. On the test data, the accuracy obtained is 87.42% with the nearest distance K = 1, using 485 areca nuts as training data and 114 areca nuts as test data, each belonging to one of the classes: raw, ripe, and old. Based on the extraction of HSV color features, the areca nut maturity classification system using the K-Nearest Neighbors (KNN) method is therefore feasible to use.