Human Disease Prediction using Machine Learning Techniques and Real-life Parameters

Document Type: Original Article

Authors

Department of Computer Science and Engineering, Bharati Vidyapeeth Deemed to be University College of Engineering, Pune, India

Abstract

Human disease prediction means estimating the probability that a patient has a given disease after examining the combination of the patient's symptoms. Monitoring a patient's condition and health information at the initial examination can help doctors treat that condition effectively, and applying this kind of analysis in the medical industry would streamline and expedite the treatment of patients. Previous researchers have primarily relied on machine learning models, mainly Support Vector Machine (SVM), K-nearest neighbors (KNN), and RUSBoost, to detect diseases with the symptoms as parameters. However, the data used to train those models is not transformed, the models depend entirely on the symptoms, and their accuracy is poor. There is therefore a need for a modified model that offers better accuracy and earlier prediction of human disease. The proposed model improves efficacy and accuracy by resolving these issues in the earlier models. It uses a medical dataset from Kaggle and transforms the data by assigning weights based on rarity. The model is then trained on this dataset using a combination of machine learning algorithms: Random Forest, Long Short-Term Memory (LSTM), and SVM. In parallel, the patient's history can be analyzed using the LSTM algorithm, and SVM is then used to conclude the possible disease. The proposed model achieves better accuracy and reliability than state-of-the-art methods, and it can contribute to the automation of the healthcare industry.
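The abstract does not give the formula used to assign weights by rarity, so the following Python sketch is only one plausible reading, not the authors' method. It assumes the weights attach to individual symptoms and uses a hypothetical inverse-frequency rule, 1 + log(N / count), so that symptoms seen in fewer patient records receive larger weights. The function names, the toy records, and the weighting rule are all illustrative assumptions.

```python
import math
from collections import Counter


def rarity_weights(records):
    """Return a weight per symptom: 1 + log(N / count).

    ASSUMPTION: this inverse-frequency rule is illustrative only; the
    source does not state how rarity is converted into a weight.
    """
    n = len(records)
    # Count each symptom at most once per patient record.
    counts = Counter(s for symptoms in records for s in set(symptoms))
    return {s: 1.0 + math.log(n / c) for s, c in counts.items()}


def encode(symptoms, weights, vocabulary):
    """Encode one record as a vector: the weight if present, else 0."""
    present = set(symptoms)
    return [weights[s] if s in present else 0.0 for s in vocabulary]


# Toy records (hypothetical, not taken from the Kaggle dataset).
records = [
    ["fever", "cough"],
    ["fever", "rash"],
    ["fever", "cough", "joint_pain"],
    ["fever"],
]

weights = rarity_weights(records)
vocabulary = sorted(weights)
features = [encode(r, weights, vocabulary) for r in records]
```

Vectors of this kind could then be passed to a classifier such as Random Forest or SVM. Under this rule, a symptom present in every record, such as "fever" in the toy data, keeps the baseline weight of 1, while symptoms that appear in fewer records are weighted more heavily.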

