Machine Learning Algorithms Based on Sampling Techniques for Raisin Grains Classification

Achmad Bisri - Universitas Islam Negeri Sultan Maulana Hasanuddin Banten
Mustafa Man - Universiti Malaysia Terengganu


DOI: http://dx.doi.org/10.30630/joiv.7.1.970

Abstract


Raisin grains are among the agricultural commodities that can bring health benefits. Raisin grains need to be classified during production to achieve optimal results. In this study, classification is carried out on two types of grains, namely Kecimen and Besni. However, inaccurate sample data can affect the performance of a model. Two sampling techniques are therefore proposed, namely stratified and shuffled sampling, and the proposed classification models are random forest (RF), gradient boosted trees (GBT), naive Bayes (NB), logistic regression (LR), and neural network (NN). This study aims to identify the performance of classification models based on sampling techniques. The classification models are applied to a seven-feature dataset, and modelling is done with cross-validation. The models were tested with different amounts of test data, and their performance was evaluated in terms of accuracy and AUC. For stratified sampling, the best outcome across all models was found with 40 percent test data, with a mean accuracy of 85.50% and an AUC of 0.921. For shuffled sampling, the best outcome was found with 20 percent test data, with a mean accuracy of 88.11% and an AUC of 0.935. With stratified sampling, not all models reached the excellent category across all data splits, whereas with shuffled sampling all models did. Therefore, models based on shuffled sampling are superior to those based on stratified sampling. According to the significance test, RF shows a significant difference between the sampling techniques.
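
The difference between the two sampling techniques named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the seed, and the balanced label vector of 450 grains per class are assumptions made for illustration only. Stratified sampling draws the test fraction separately from each class, so the class proportions are preserved, while shuffled sampling permutes all records and cuts off the test fraction without regard to class.

```python
import random
from collections import Counter, defaultdict


def shuffled_split(labels, test_fraction, seed=0):
    """Shuffle all indices, then take the first test_fraction as test data."""
    rng = random.Random(seed)
    indices = list(range(len(labels)))
    rng.shuffle(indices)
    n_test = round(len(labels) * test_fraction)
    return indices[n_test:], indices[:n_test]  # (train, test)


def stratified_split(labels, test_fraction, seed=0):
    """Shuffle within each class so both subsets keep the class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    train, test = [], []
    for members in by_class.values():
        rng.shuffle(members)
        n_test = round(len(members) * test_fraction)
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return train, test


def class_counts(labels, indices):
    """Count how many selected records belong to each class."""
    return Counter(labels[i] for i in indices)


# Hypothetical label vector: class sizes are an assumption, not the paper's data.
labels = ["Kecimen"] * 450 + ["Besni"] * 450

for fraction in (0.2, 0.4):
    _, stratified_test = stratified_split(labels, fraction)
    _, shuffled_test = shuffled_split(labels, fraction)
    print(fraction, "stratified:", dict(class_counts(labels, stratified_test)))
    print(fraction, "shuffled:  ", dict(class_counts(labels, shuffled_test)))
```

In this sketch the stratified test sets always contain equal numbers of both classes, while the shuffled test sets may deviate slightly from that balance depending on the seed.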

Keywords


Classification; data mining; machine learning; raisin grains; sampling technique.

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

__________________________________________________________________________
JOIV : International Journal on Informatics Visualization
ISSN 2549-9610  (print) | 2549-9904 (online)
Organized by Department of Information Technology - Politeknik Negeri Padang, and Institute of Visual Informatics - UKM and Soft Computing and Data Mining Centre - UTHM
W : http://joiv.org
E : joiv@pnp.ac.id, hidra@pnp.ac.id, rahmat@pnp.ac.id
