Abstract
Purpose :
To develop an infrastructure for curating, annotating, and pre-processing a longitudinal ophthalmic imaging dataset that will allow applications of machine and deep learning algorithms to blinding eye diseases.
Methods :
The University of Illinois at Chicago (UIC) Ophthalmic Imaging Database and Atlas was assembled from components developed over the course of a decade: (i) an image management system; (ii) a server that houses raw images; (iii) SQL databases with patient, exam, and device-type tables; (iv) Excel tables of patient demographics and diagnoses; and (v) billing data containing patient diagnosis and severity. Algorithms written in Python and in the data science platform RapidMiner performed the pre-processing steps, including cleaning, handling missing data, and merging the available sources into a unified database. Imaging labels were extracted from our billing system using ICD-9 and ICD-10 codes and from the Excel data tables. Analysis was performed for the top 6 diagnoses: glaucoma, glaucoma suspect, diabetic retinopathy, age-related macular degeneration, macular edema, and epiretinal membrane. Clustering techniques were used to identify and categorize each patient's data into different types of images and reports, including optical coherence tomography, fundus photos, and Humphrey visual fields. Images were then de-identified using two methods: a blurring technique and a selective de-identification technique.
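The label-extraction step described above can be sketched as a prefix lookup from billing codes to diagnosis labels. This is a minimal illustration, not the mapping used to build the database: the prefix table, the function names `label_for_code` and `labels_by_patient`, and the `(patient_id, code)` row shape are all assumptions made for the example, and any real code list would need to be checked against the official ICD-9-CM and ICD-10-CM code sets.

```python
from collections import defaultdict

# Illustrative prefixes only (assumption); not the authors' code list.
ICD_PREFIX_TO_LABEL = {
    "365.0": "glaucoma suspect",                    # ICD-9
    "365": "glaucoma",                              # ICD-9
    "H40.0": "glaucoma suspect",                    # ICD-10
    "H40": "glaucoma",                              # ICD-10
    "362.0": "diabetic retinopathy",                # ICD-9
    "362.56": "epiretinal membrane",                # ICD-9
    "H35.37": "epiretinal membrane",                # ICD-10
    "H35.31": "age-related macular degeneration",   # ICD-10
    "H35.32": "age-related macular degeneration",   # ICD-10
}


def label_for_code(code):
    """Return the label of the longest matching prefix, or None if no prefix matches."""
    code = code.strip().upper()
    matches = [p for p in ICD_PREFIX_TO_LABEL if code.startswith(p)]
    if not matches:
        return None
    # The longest prefix wins, so "H40.0..." maps to suspect rather than glaucoma.
    return ICD_PREFIX_TO_LABEL[max(matches, key=len)]


def labels_by_patient(billing_rows):
    """Collect the set of labels per patient from (patient_id, code) rows."""
    labels = defaultdict(set)
    for patient_id, code in billing_rows:
        label = label_for_code(code)
        if label is not None:
            labels[patient_id].add(label)
    return dict(labels)


# Hypothetical billing rows, for demonstration only.
rows = [("p1", "H40.011"), ("p1", "H40.1131"), ("p2", "362.56"), ("p3", "Z00.00")]
patient_labels = labels_by_patient(rows)
```

Longest-prefix matching is used here so that a more specific entry (glaucoma suspect) takes precedence over the broader category it falls under (glaucoma). Patients whose codes match nothing are simply left unlabeled.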
Results :
A total of 45,000 patients with 3.8 million images/files were included. Of these, 17,528 patients with 2,799,069 images/reports were successfully labeled for the top 6 diagnoses. Patient counts and average follow-up by diagnosis were: glaucoma (6,208, 2.73 years); glaucoma suspect (5,538, 1.42 years); diabetic retinopathy (2,905, 1.77 years); age-related macular degeneration (1,450, 1.73 years); macular edema (718, 0.73 years); and epiretinal membrane (709, 1.22 years).
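As a quick consistency check on the figures above, the per-diagnosis patient counts can be summed and compared with the reported labeled total. The sketch below only re-uses numbers stated in the Results. The variable names are illustrative, and the coverage fraction is approximate because the 45,000 total may be a rounded figure.

```python
# Patients per diagnosis, copied from the Results.
PATIENTS_BY_DIAGNOSIS = {
    "glaucoma": 6208,
    "glaucoma suspect": 5538,
    "diabetic retinopathy": 2905,
    "age-related macular degeneration": 1450,
    "macular edema": 718,
    "epiretinal membrane": 709,
}

TOTAL_PATIENTS = 45_000     # reported total, possibly rounded
REPORTED_LABELED = 17_528   # reported labeled total

# Sum of the six per-diagnosis counts.
labeled_sum = sum(PATIENTS_BY_DIAGNOSIS.values())

# Approximate share of all patients that received one of the six labels.
labeled_fraction = labeled_sum / TOTAL_PATIENTS

print(labeled_sum == REPORTED_LABELED)
print(f"{labeled_fraction:.1%}")
```

The per-diagnosis counts add up exactly to the reported labeled total, which suggests (though the abstract does not state it) that each labeled patient is counted under a single diagnosis.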
Conclusions :
Machine and deep learning methods are intrinsic parts of artificial intelligence, which currently plays a major role in medicine and clinical decision-making. The UIC Ophthalmic Imaging Database and Atlas includes labeled images spanning more than 10 years. This will allow the application of computational algorithms to study disease progression from early diagnosis to late stage.
This is an abstract that was submitted for the 2018 ARVO Annual Meeting, held in Honolulu, Hawaii, April 29 - May 3, 2018.