Abstract
Purpose:
To model modifiable risk factors for the progression of age-related macular degeneration (AMD) using logistic regression (LR) and artificial neural network (ANN) models, and to cross-validate the predictive accuracy of these models in a population in South India.
Methods:
Data (N = 3,723) from participants aged ≥40 years in the Andhra Pradesh Eye Disease Study (APEDS) were analyzed. Two sub-population datasets were drawn from this sample, one by random under-sampling (RUS) (n = 213) and one by a combination of RUS and random over-sampling (ROS) (n = 1,420). The modifiable and non-modifiable risk factors elicited in the study were used to derive LR-based risk score models, and model fit was assessed for internal validity using the bootstrap method. ANN models were built for the three datasets (the total sample and the two sub-populations) using a multi-layer feed-forward back-propagation network. The predictive ability of the ANN model was compared with that of the traditional LR model using the area under the receiver operating characteristic curve (AUROC).
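The two resampling schemes named above can be illustrated with a minimal sketch. It is not the study's procedure: the abstract does not report the class ratios or target sizes used, so the function names (`random_under_sample`, `resample_per_class`), the `amd` label key, and the toy data are all assumptions made for illustration.

```python
import random

def random_under_sample(rows, label_key, seed=0):
    """RUS: keep every minority-class row and an equal-sized random
    subset (without replacement) of the majority class."""
    rng = random.Random(seed)
    cases = [r for r in rows if r[label_key] == 1]
    controls = [r for r in rows if r[label_key] == 0]
    minority, majority = sorted((cases, controls), key=len)
    return minority + rng.sample(majority, len(minority))

def resample_per_class(rows, label_key, n_per_class, seed=0):
    """RUS + ROS combined (assumed scheme): bring each class to
    n_per_class rows, under-sampling without replacement when the
    class is larger and over-sampling with replacement when smaller."""
    rng = random.Random(seed)
    out = []
    for label in (0, 1):
        group = [r for r in rows if r[label_key] == label]
        if len(group) >= n_per_class:
            out.extend(rng.sample(group, n_per_class))     # under-sample
        else:
            out.extend(rng.choices(group, k=n_per_class))  # over-sample
    return out

# Hypothetical toy data: 10 cases and 90 controls (not APEDS data).
toy = [{"id": i, "amd": 1 if i < 10 else 0} for i in range(100)]
rus = random_under_sample(toy, "amd")        # 20 rows: 10 cases + 10 controls
combo = resample_per_class(toy, "amd", 30)   # 60 rows: 30 cases + 30 controls
```

Over-sampling with replacement duplicates minority-class rows, so any validation split should be made with that duplication in mind.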
Results:
Both the ANN and LR models identified the modifiable risk factors of heavy smoking (risk scores from 10 to 18), lower intake of antioxidants (risk scores from 5 to 10), and hypertension (risk scores from 2 to 10) as predictors of AMD, in that order of priority. In the total-sample analysis, the ANN model showed significantly lower predictive ability than the LR model (AUROC = 0.66 vs 0.76; p<0.0001). For the sub-population dataset (n = 213), the LR risk score ranged from 0 to 60; a cut-off score of ≥30 had a sensitivity of 79% and a specificity of 69%. In the n = 213 sub-population analysis, the predictive accuracies of the ANN and LR models did not differ significantly (AUROC = 0.76 vs 0.78; p=0.624); in the n = 1,420 sub-population analysis, however, the ANN model outperformed the LR model, with good predictive ability (AUROC = 0.89 vs 0.79; p<0.0001). Both models were stable and yielded consistent predictive accuracies in 30-fold split-sample cross-validation and with the bootstrap method.
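As an illustration of how a cut-off on a risk score translates into sensitivity and specificity, and how an AUROC can be computed from the same scores, here is a minimal sketch. The scores and labels are invented toy values, not APEDS data, and the function names are hypothetical; the AUROC is computed with the pairwise (Mann-Whitney) definition, which may differ from the software the authors used.

```python
def sensitivity_specificity(scores, labels, cutoff):
    """Classify score >= cutoff as positive; return (sensitivity, specificity)."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)

def auroc(scores, labels):
    """Probability that a randomly chosen case scores higher than a
    randomly chosen control (ties count as one half)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if c > k else 0.5 if c == k else 0.0
               for c in cases for k in controls)
    return wins / (len(cases) * len(controls))

# Hypothetical risk scores on a 0-60 scale; label 1 = AMD, 0 = no AMD.
toy_scores = [35, 42, 28, 31, 10, 33, 5, 29]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]

sens, spec = sensitivity_specificity(toy_scores, toy_labels, cutoff=30)
area = auroc(toy_scores, toy_labels)
```

Raising the cut-off generally trades sensitivity for specificity, whereas the AUROC summarizes discrimination across all possible cut-offs.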
Conclusions:
In the larger sub-population analysis (n = 1,420), the ANN model showed better predictive ability than the LR model. Both the LR model and the sensitivity analysis of the ANN model indicated the relative importance of the modifiable risk factors for AMD, which can be prioritized in preventive interventions to reduce their impact on the onset of AMD.
Keywords: 459 clinical (human) or epidemiologic studies: biostatistics/epidemiology methodology •
464 clinical (human) or epidemiologic studies: risk factor assessment •
412 age-related macular degeneration