Predicting Diabetes by adopting Classification Approach in Data Mining

Rapinder Kaur - Chandigarh University, Punjab, India.

As the world changes rapidly, lifestyles, perceptions, and resources are being transformed. Advances in technology bring challenges of their own as ideas and innovations multiply. One of the most significant outcomes of these advances is "Big Data", in which a massive amount of information lies hidden. Many techniques and algorithms have evolved to process this data and uncover its insights, one of which is data mining. Data mining is the process of extracting useful knowledge and facts from raw, noisy data. Prediction analysis is a data mining approach that forecasts future outcomes using classification techniques. This research work addresses diabetes prediction using a classification approach. The existing approach applies an SVM classifier for prediction analysis; to increase accuracy, the proposed approach applies a KNN classifier instead. Both the proposed and existing methods are implemented in Python. The simulation results show that KNN achieves higher accuracy and a shorter execution time than the SVM-based approach.
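To make the KNN classification step concrete, the following is a minimal sketch in Python, not the implementation used in this work. It assumes synthetic records with two hypothetical features (glucose and BMI) and an assumed label coding of 1 for diabetic and 0 for non-diabetic. The function names, the value k = 5, and the data generator are illustrative choices only. The SVM baseline is omitted, so the sketch shows only how KNN accuracy and prediction time might be measured.

```python
import math
import random
import time
from collections import Counter


def knn_predict(train_X, train_y, query, k=5):
    """Return the majority label among the k training points nearest to query."""
    # Euclidean distance from the query to every training point.
    distances = [(math.dist(x, query), label) for x, label in zip(train_X, train_y)]
    distances.sort(key=lambda pair: pair[0])
    # Majority vote over the k closest neighbours.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


def make_synthetic_records(n, seed=0):
    """Generate hypothetical (glucose, BMI) records; this is NOT real patient data."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)  # assumed coding: 1 = diabetic, 0 = non-diabetic
        glucose = rng.gauss(150 if label else 105, 20)
        bmi = rng.gauss(34 if label else 27, 4)
        X.append((glucose, bmi))
        y.append(label)
    return X, y


def evaluate(k=5, n=400, test_fraction=0.25, seed=0):
    """Hold out part of the data, classify it with KNN, return (accuracy, seconds)."""
    X, y = make_synthetic_records(n, seed)
    split = int(n * (1 - test_fraction))
    train_X, train_y = X[:split], y[:split]
    test_X, test_y = X[split:], y[split:]

    start = time.perf_counter()
    predictions = [knn_predict(train_X, train_y, q, k) for q in test_X]
    elapsed = time.perf_counter() - start

    correct = sum(p == t for p, t in zip(predictions, test_y))
    return correct / len(test_y), elapsed


if __name__ == "__main__":
    accuracy, seconds = evaluate()
    print(f"KNN accuracy: {accuracy:.3f}, prediction time: {seconds:.4f} s")
```

In practice the features would typically be scaled before computing distances, since an unscaled feature with a large range (here, glucose) dominates the Euclidean distance. The reported time depends on the machine, so it is meaningful only relative to a baseline measured in the same environment.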


Diabetes, SVM, KNN

