Abstract:
This study aimed to compare six machine learning algorithms, together with a multivariate analysis, to identify the key clinical, demographic, and laboratory findings that predict mortality in the COVID-19 pandemic. This retrospective study consisted of persons under investigation for COVID-19. Using a dataset taken from a public database (kaggle.com), predictive models of mortality were constructed and compared using six supervised machine learning algorithms: k-nearest neighbors (KNN), naive Bayes, support vector machine (SVM), decision tree, random forest, and logistic regression, each evaluated with 10-fold cross-validation. The performance of the algorithms was assessed using precision, recall, F-measure, accuracy, and the area under the receiver operating characteristic (ROC) curve. The analysis was performed with the Waikato Environment for Knowledge Analysis (WEKA), version 3.8.6. Multivariate analysis using logistic regression was used to predict mortality. A total of 4711 patients were included in the analysis. The top five mortality predictors were MAP (p < 0.001; OR 17.07), stroke (p < 0.001; OR 3.50), age (p < 0.001; OR 3.23), IL-6 (p < 0.001; OR 2.03), and creatinine (p < 0.001; OR 1.81). Logistic regression was the best-performing machine learning algorithm for predicting mortality, with an area under the ROC curve of 0.817. This study identifies important independent clinical variables that predict COVID-19 infection-related mortality. The prediction method is helpful, easily improved, and easily retrained with new data. It can be applied right away and may help front-line doctors make clinical decisions in situations where resources and time are limited.
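As a brief illustration of the evaluation measures named in the abstract, the following minimal Python sketch computes precision, recall, F-measure, and accuracy from the counts of a binary confusion matrix. It is not the study's code (the study used WEKA); the function name `classification_metrics` and the example counts are hypothetical and were chosen only for illustration.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute common binary-classification metrics from confusion-matrix counts.

    tp, fp, fn, tn are the numbers of true positives, false positives,
    false negatives, and true negatives. A ratio whose denominator is
    zero is reported as 0.0.
    """
    def safe_div(num, den):
        return num / den if den else 0.0

    precision = safe_div(tp, tp + fp)   # share of predicted deaths that were deaths
    recall = safe_div(tp, tp + fn)      # share of actual deaths that were predicted
    f_measure = safe_div(2 * precision * recall, precision + recall)
    accuracy = safe_div(tp + tn, tp + fp + fn + tn)
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "accuracy": accuracy,
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only; not taken from the study.
    metrics = classification_metrics(tp=80, fp=20, fn=20, tn=880)
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
```

In a 10-fold cross-validation, such counts would typically be accumulated over the ten held-out folds before the metrics are computed.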