Document Type: Original Article
Department of Electrical and Computer Engineering, Nooshirvani University of Technology, Babol, Iran
Genomic data are used in various fields of medicine, including the diagnosis, prediction, and treatment of diseases. Detecting the stage of cancer progression is crucial for treating patients because cancer mortality is higher when the disease is diagnosed at a late stage. Furthermore, the type of treatment varies depending on the cancer stage. This paper presents a Multiple Kernel Learning (MKL) based algorithm to predict the stage of cancer using genomic data. Because genomic data are high-dimensional, the curse of dimensionality may degrade stage prediction. To reduce the dimension, the proposed algorithm first clusters the features. Then, the original data samples are split into smaller subsets with reduced dimensions based on the computed feature clusters. Afterward, a kernel matrix is calculated for each subset. The kernel matrices are weighted and then combined linearly. Finally, a cancer stage prediction model is trained using the combined kernel matrix and a Support Vector Machine (SVM). The proposed algorithm is compared with baseline methods. It achieves higher classification accuracy than the other methods in 13 of 15 cancer groups from the dataset of The Cancer Genome Atlas (TCGA) program.
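The pipeline described above (feature clustering, one kernel per feature subset, a weighted linear combination, and an SVM on the combined kernel) can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm, the kernel type, or the weighting scheme, so the choices here are assumptions made only for illustration: k-means for clustering features, an RBF kernel for each subset, and kernel-target alignment for the weights. The data are synthetic with a binary early/late stage label, and the helper names `cluster_features`, `alignment_weights`, and `combined_kernel` are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC


def cluster_features(X, n_clusters, seed=0):
    """Group the columns of X with k-means (assumed clustering method).

    Returns a list of column-index arrays, one per non-empty cluster.
    """
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(X.T)
    return [np.flatnonzero(labels == c) for c in range(n_clusters) if np.any(labels == c)]


def alignment_weights(kernels, y):
    """Assumed weighting scheme: kernel-target alignment, normalised to sum to 1."""
    ideal = (y[:, None] == y[None, :]).astype(float)  # 1 where two samples share a stage
    scores = np.array([
        np.sum(K * ideal) / (np.linalg.norm(K) * np.linalg.norm(ideal))
        for K in kernels
    ])
    return scores / scores.sum()


def combined_kernel(XA, XB, groups, weights):
    """Weighted linear combination of one RBF kernel per feature subset."""
    return sum(w * rbf_kernel(XA[:, g], XB[:, g]) for w, g in zip(weights, groups))


# Synthetic stand-in for genomic data: 120 samples, 60 features, binary stage label.
rng = np.random.default_rng(0)
n, d = 120, 60
y = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, d))
X[:, :10] += 2.0 * y[:, None]  # only the first 10 features carry stage signal

train, test = np.arange(0, 80), np.arange(80, 120)

# 1) Cluster features, 2) build one kernel per subset, 3) weight and combine.
groups = cluster_features(X[train], n_clusters=4)
train_kernels = [rbf_kernel(X[train][:, g]) for g in groups]
weights = alignment_weights(train_kernels, y[train])
K_train = combined_kernel(X[train], X[train], groups, weights)
K_test = combined_kernel(X[test], X[train], groups, weights)

# 4) Train an SVM on the precomputed combined kernel and score held-out samples.
model = SVC(kernel="precomputed").fit(K_train, y[train])
accuracy = model.score(K_test, y[test])
print(f"feature subsets: {[len(g) for g in groups]}, held-out accuracy: {accuracy:.2f}")
```

Note that the feature clusters and kernel weights are computed from the training samples only and then reused for the test kernel, which keeps held-out labels out of model construction. The test kernel has shape (test samples, training samples), as `SVC` expects for a precomputed kernel.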