A Random Forest Classifier based on Genetic Algorithm for Cardiovascular Diseases Diagnosis (RESEARCH NOTE)

Authors

1 Computer Science & Engineering, Birla Institute of Technology, Mesra, Ranchi

2 Computer Science & Engineering, Birla Institute of Technology, Mesra, Ranchi

Abstract

Machine learning-based classification techniques support decision making in healthcare, especially in disease diagnosis, prognosis and screening. Healthcare datasets are voluminous, and their high dimensionality leads to slower learning and higher computational cost. Feature selection addresses this problem by reducing the dataset to a smaller feature set, and it can improve classification accuracy while using fewer features in the decision-making process. In this paper, Random Forest (RF) is employed for the diagnosis of cardiovascular disease. The first phase of the proposed system applies several feature selection algorithms, namely Principal Component Analysis (PCA), Relief-F, Sequential Forward Floating Search (SFFS), Sequential Backward Floating Search (SBFS) and Genetic Algorithm (GA), to reduce the dimension of the cardiovascular disease dataset. The second phase constructs an RF model on the reduced feature set for cardiovascular disease classification. The results show that the combination of GA and RF delivered the highest classification accuracy, 93.2%, using six features.
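The GA-based feature selection phase can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (ga_select, toy_fitness), the GA settings (population size, tournament selection, single-point crossover, bit-flip mutation, elitism) and the feature count of 13 are all assumptions made for illustration. In a wrapper setup such as the one described, the fitness of a feature mask would typically be the cross-validated accuracy of an RF classifier trained on the selected features; here a toy fitness with a hypothetical set of "informative" features stands in for that so the example stays self-contained.

```python
import random

def ga_select(n_features, fitness, pop_size=20, generations=30,
              crossover_rate=0.8, mutation_rate=0.05, seed=0):
    """Return the best feature mask (list of 0/1) found by a simple GA.

    `fitness` maps a mask to a score; higher is better. All default
    settings are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    # Each chromosome is a bit mask: 1 keeps the feature, 0 drops it.
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        next_pop = [best[:]]  # elitism: carry the best mask forward unchanged
        while len(next_pop) < pop_size:
            # Tournament selection of two parents (tournament size 3).
            p1 = max(rng.sample(pop, 3), key=fitness)
            p2 = max(rng.sample(pop, 3), key=fitness)
            child = p1[:]
            if rng.random() < crossover_rate:
                cut = rng.randint(1, n_features - 1)  # single-point crossover
                child = p1[:cut] + p2[cut:]
            # Bit-flip mutation: each gene flips with probability mutation_rate.
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            next_pop.append(child)
        pop = next_pop
        best = max(pop, key=fitness)
    return best

# Hypothetical "informative" feature indices, used only by the toy fitness.
INFORMATIVE = {0, 2, 7, 8, 11, 12}

def toy_fitness(mask):
    """Stand-in for cross-validated RF accuracy: reward informative
    features and lightly penalize the size of the selected subset."""
    hits = sum(mask[i] for i in INFORMATIVE)
    return hits - 0.1 * sum(mask)

best = ga_select(13, toy_fitness)  # 13 is an arbitrary demo feature count
selected = [i for i, bit in enumerate(best) if bit]
print("selected feature indices:", selected)
```

To connect this sketch to the second phase, toy_fitness would be replaced by a function that trains and evaluates an RF classifier on the columns selected by the mask, and the final model would be fitted on the best mask returned.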

Keywords