A Survey on Data Mining Algorithms and Techniques in Medicine

Kasra Madadipouya


The medical decision support system (MDSS) industry collects a huge amount of data that is neither properly mined nor put to optimal use. This data may contain valuable knowledge awaiting extraction, encapsulated in patterns and regularities hidden in the data. Such knowledge may prove invaluable in future medical decision making.

Available medical decision support systems are based on static data, which may be out of date. A system that can learn the relationships among patient histories, diseases in the population, symptoms, disease pathology, family history, and test results would therefore be useful to physicians and hospitals.

This paper provides an in-depth review of available data mining algorithms and techniques. It also discusses data mining applications in medicine, techniques for evaluating them, and available applications of performance metrics.


Keywords: Data Mining; Classification; Decision Tree; Neural Network; Bayesian Network Classifier; Evaluation Metrics





This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

JOIV : International Journal on Informatics Visualization
Published by Information Technology Department
Politeknik Negeri Padang, Indonesia

© JOIV - ISSN : 2549-9610 | e-ISSN : 2549-9904 

