QW5: Machine learning approaches improve the prediction of the major osteoporotic fracture using genomic and phenotypic data of 25,772 postmenopausal women
POSTER PRESENTATION (video):
PRESENTER: Jongyun Jung
AUTHORS: Qing Wu, Jongyun Jung
MENTOR: Qing Wu
This study aimed to develop fracture prediction models using machine learning approaches and genomic data, and to identify the best modeling approach for fracture prediction. Genomic and phenotypic data from the Women's Health Initiative (WHI) (N = 25,772) were analyzed. After comprehensive genotype imputation, a genetic risk score (GRS) was calculated for each participant from 1,103 associated single nucleotide polymorphisms. Data were normalized and split into a training set (80%) and a validation set (20%). Random forest, gradient boosting, artificial neural network, support vector machine, and logistic regression were each used to develop prediction models for major osteoporotic fractures, with GRS, bone density, and other risk factors as predictors. In model training, the synthetic minority oversampling technique was used to account for the low fracture rate, and tenfold cross-validation was employed for hyperparameter optimization. In testing, the area under the curve (AUC) and accuracy were used to assess model performance. The McNemar test was employed to examine the accuracy difference between models. Random forest performed best, with an AUC of 0.75 and an accuracy of 0.72. The machine learning approaches also performed significantly better than logistic regression. These findings suggest that fracture prediction in postmenopausal women can be improved by incorporating genetic profiling and by using machine learning approaches.
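Two of the steps described above, the per-participant GRS and the McNemar comparison of two models' accuracy, can be illustrated with a minimal Python sketch. It is not the authors' code. The abstract does not say whether the GRS was weighted or which form of the McNemar test was used, so the weighted sum of risk-allele dosages and the exact (binomial) two-sided test shown here are assumptions, as are the function names and the toy data.

```python
from math import comb


def genetic_risk_score(dosages, weights):
    """Weighted sum of risk-allele dosages (0, 1, or 2 per SNP).

    Assumption: one effect-size weight per SNP; the abstract does not
    state the weighting scheme that was actually used.
    """
    if len(dosages) != len(weights):
        raise ValueError("one weight per SNP is required")
    return sum(d * w for d, w in zip(dosages, weights))


def mcnemar_exact(y_true, pred_a, pred_b):
    """Two-sided exact McNemar test on paired predictions.

    Returns (b, c, p_value), where b counts cases only model A
    classified correctly and c counts cases only model B classified
    correctly. Cases where both models agree in correctness are ignored.
    """
    b = c = 0
    for y, a, m in zip(y_true, pred_a, pred_b):
        a_ok, m_ok = (a == y), (m == y)
        if a_ok and not m_ok:
            b += 1
        elif m_ok and not a_ok:
            c += 1
    n = b + c
    if n == 0:
        return b, c, 1.0  # no discordant pairs, no evidence of a difference
    # Under H0 each discordant pair favours A or B with probability 1/2.
    tail = sum(comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


if __name__ == "__main__":
    # Toy data, purely illustrative.
    print(genetic_risk_score([0, 1, 2], [0.5, 0.25, 1.0]))
    y_true = [1, 1, 1, 1, 1, 0, 0, 0]
    pred_a = [1, 1, 1, 1, 1, 0, 0, 0]  # hypothetical model A
    pred_b = [0, 0, 0, 0, 0, 0, 0, 0]  # hypothetical model B
    print(mcnemar_exact(y_true, pred_a, pred_b))
```

The test uses only the discordant cases (those that one model classifies correctly and the other does not), which is why it suits comparing models evaluated on the same held-out participants.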