Abstract
Purpose :
Many diseases, such as age-related macular degeneration (AMD), are classified using human-defined rubrics that are prone to bias. Supervised neural networks are trained on human-generated labels, which require labor-intensive annotation and restrict the networks to the specific tasks they were trained on. Here, we employ unsupervised learning, which organizes fundus images based only on visual similarity, to determine AMD severity and identify ocular features without the confines of human definitions or labels.
Methods :
We trained an unsupervised deep neural network with Non-Parametric Instance Discrimination (NPID) using 100,848 human-graded fundus images from 4,757 participants in the Age-Related Eye Disease Study (AREDS) to grade AMD severity using 2-step, 4-step, and 9-step classification schemes. We compared balanced and unbalanced accuracies of NPID against published supervised networks and ophthalmologists, explored network behavior using hierarchical learning of image subsets and spherical k-means clustering of feature vectors, and then searched for ocular features that can be identified without labels.
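The spherical k-means step mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy 2-D vectors, and the fixed iteration cap are all assumptions made for illustration. The idea is that feature vectors are projected onto the unit sphere, each vector is assigned to the centroid with the highest cosine similarity, and each centroid is re-normalized after averaging its members.

```python
import math
import random

def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n > 0 else list(v)

def dot(a, b):
    """Dot product; equals cosine similarity for unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def spherical_kmeans(vectors, k, iters=20, seed=0):
    """Cluster vectors by cosine similarity. Returns (labels, centroids)."""
    rng = random.Random(seed)
    points = [normalize(v) for v in vectors]
    # Initialize centroids from k distinct input points.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: pick the centroid with the highest cosine similarity.
        new_labels = [
            max(range(k), key=lambda j: dot(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: average the members, then project back onto the sphere.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = normalize([sum(col) for col in zip(*members)])
    return labels, centroids

# Hypothetical toy "feature vectors": two groups pointing in different directions.
toy_vectors = [
    [1.0, 0.1], [1.0, -0.1], [2.0, 0.1],   # roughly along the x-axis
    [0.1, 1.0], [-0.1, 1.0], [0.1, 2.0],   # roughly along the y-axis
]
toy_labels, toy_centroids = spherical_kmeans(toy_vectors, k=2)
```

Because only direction matters, `[2.0, 0.1]` and `[1.0, 0.1]` land in the same cluster despite their different magnitudes, which is the property that distinguishes spherical k-means from the Euclidean variant.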
Results :
Unsupervised NPID demonstrated versatility across different AMD classification schemes without re-training, and achieved balanced accuracies comparable to those of supervised networks and human ophthalmologists, respectively, in classifying advanced AMD (82% vs. 81% and 89%), referable AMD (87% vs. 92% and 96%), and the 4-step AMD severity scale (65% vs. 63% and 67%), despite never directly learning these labels. Drusen area drove network predictions on the 4-step scale, while depigmentation and geographic atrophy (GA) areas correlated with advanced AMD classes. Unsupervised learning identified grader-mislabeled images and revealed that some classes within the more granular 9-step AMD scale are susceptible to misclassification by both ophthalmologists and neural networks. Importantly, unsupervised learning enabled data-driven discovery of AMD features such as GA and of other ocular phenotypes of the choroid (e.g., tessellated or blonde fundi), vitreous (e.g., asteroid hyalosis), and lens (e.g., nuclear cataracts) that were not pre-defined by human labels.
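To make the distinction between balanced and unbalanced accuracy concrete, here is a minimal sketch. It assumes the common definition of balanced accuracy as the mean of per-class recall; the abstract does not spell out the exact formula used, and the labels below are hypothetical toy data, not study results.

```python
from collections import defaultdict

def accuracy(y_true, y_pred):
    """Unbalanced accuracy: fraction of all samples labeled correctly."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class carries equal weight."""
    totals = defaultdict(int)  # number of samples per true class
    hits = defaultdict(int)    # correctly predicted samples per true class
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += int(t == p)
    recalls = [hits[c] / totals[c] for c in totals]
    return sum(recalls) / len(recalls)

# Hypothetical imbalanced example: 8 images of class 0, 2 images of class 1.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 8 + [0, 1]   # one of the two class-1 images is missed

print(accuracy(y_true, y_pred))           # 0.9
print(balanced_accuracy(y_true, y_pred))  # 0.75
```

In this toy case the rare class is detected only half the time, which the unbalanced figure largely hides. This is why balanced accuracy is the more informative measure when severity classes are unevenly represented.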
Conclusions :
Unsupervised learning enables automated AMD severity grading comparable to ophthalmologists and supervised networks, reveals biases of human-defined AMD classification systems, and allows unbiased, data-driven discovery of AMD and non-AMD ocular phenotypes.
This is a 2021 ARVO Annual Meeting abstract.