Data mining techniques for disease risk prediction model: A systematic literature review

Wan Muhamad Taufik Wan Ahmad, Nur Laila Ab Ghani, Sulfeeza Mohd Drus

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

A risk prediction model estimates the occurrence of an event based on related data. Conventional statistical metrics that use primary data generate simple descriptive analyses that often provide insufficient knowledge for decision making. In contrast, data mining techniques, which can find hidden patterns in secondary data held in large databases and generate predictions for a desired output, have become a popular approach to developing risk prediction models. In healthcare in particular, data mining techniques can be applied in disease risk prediction models to provide reliable predictions of the possibility of acquiring a disease based on an individual’s clinical and non-clinical data. Due to the increased use of data mining in healthcare, this study aims to identify the data mining techniques and algorithms that are commonly implemented in studies of various disease risk prediction models, and to determine the accuracy of those algorithms. Accuracy can be evaluated by various methods, but this paper focuses on overall accuracy, measured as the total number of correctly predicted outputs over the total number of predictions. A systematic literature review that searched across five databases found 170 articles, of which 7 were selected in the final process. This review found that most prediction models used classification techniques, with a focus on decision tree, neural network, support vector machine, and Naïve Bayes algorithms, and that heart-related disease is commonly studied. Further research can apply similar algorithms to develop risk prediction models for other types of disease, such as infectious diseases.
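
The overall accuracy measure described in the abstract (correct predictions divided by total predictions) can be illustrated with a minimal sketch. The function name `overall_accuracy` and the sample labels are hypothetical and chosen only for illustration; they are not taken from the reviewed studies.

```python
from typing import Sequence


def overall_accuracy(actual: Sequence[int], predicted: Sequence[int]) -> float:
    """Return the share of predictions that match the actual labels."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one prediction is required")
    # Count the positions where the predicted label equals the actual label.
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


# Hypothetical labels: 1 = disease present, 0 = disease absent.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]
print(overall_accuracy(actual, predicted))  # 6 correct out of 8 -> 0.75
```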

Original language: English
Title of host publication: Recent Trends in Data Science and Soft Computing - Proceedings of the 3rd International Conference of Reliable Information and Communication Technology IRICT 2018
Editors: Fathey Mohammed, Faisal Saeed, Nadhmi Gazem, Abdelsalam Busalim
Publisher: Springer Verlag
Pages: 40-46
Number of pages: 7
ISBN (Print): 9783319990064
DOIs: https://doi.org/10.1007/978-3-319-99007-1_4
Publication status: Published - 01 Jan 2019
Event: 3rd International Conference of Reliable Information and Communication Technology, IRICT 2018 - Kuala Lumpur, Malaysia
Duration: 23 Jun 2018 - 24 Jun 2018

Publication series

Name: Advances in Intelligent Systems and Computing
Volume: 843
ISSN (Print): 2194-5357

Other

Other: 3rd International Conference of Reliable Information and Communication Technology, IRICT 2018
Country: Malaysia
City: Kuala Lumpur
Period: 23/06/18 - 24/06/18

All Science Journal Classification (ASJC) codes

  • Control and Systems Engineering
  • Computer Science (all)

Cite this

Ahmad, W. M. T. W., Ab Ghani, N. L., & Mohd Drus, S. (2019). Data mining techniques for disease risk prediction model: A systematic literature review. In F. Mohammed, F. Saeed, N. Gazem, & A. Busalim (Eds.), Recent Trends in Data Science and Soft Computing - Proceedings of the 3rd International Conference of Reliable Information and Communication Technology IRICT 2018 (pp. 40-46). (Advances in Intelligent Systems and Computing; Vol. 843). Springer Verlag. https://doi.org/10.1007/978-3-319-99007-1_4