Abstract
Purpose:
In ophthalmology, the extensive use of ancillary imaging exams in the diagnosis, follow-up, and treatment of disease creates great potential for Artificial Intelligence (AI) algorithms that detect, segment, and recognize patterns in images to support robust clinical decision-making. Because the retinal vascular pattern is unique and stable, retinal scans can act as individual authenticators, which limits data sharing and the development of ophthalmological AI. This project aims to apply a de-identification method to retinal scans and to compare performance on demographic classification (sex) and a downstream task (diabetic retinopathy).
Methods:
The study was conducted using BR-OPHTSET, a dataset of 16,266 retinal scans from 8,524 Brazilian patients. We applied the Snow model, which adds pixel-level noise by arbitrarily re-assigning pixel intensities, and used Differential Privacy (DP) to quantify privacy leakage. In our implementation, re-assigned pixels were set to the dataset-wide average intensity of each RGB channel.
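As a minimal sketch of this pixel re-assignment step, assuming images are handled as NumPy arrays, the illustrative function below (the name dp_snow and its parameters are ours, not from the study) replaces a random fraction p of pixels with the dataset-average RGB intensity:

```python
import numpy as np

def dp_snow(image, channel_means, p=0.5, rng=None):
    """Replace a random fraction p of pixels with the dataset-average RGB intensity.

    image:         H x W x 3 array (uint8 retinal scan)
    channel_means: length-3 array, dataset-wide mean of each RGB channel
    p:             probability that a given pixel is re-assigned (privacy level)
    """
    rng = rng or np.random.default_rng()
    noised = image.copy()
    # Bernoulli(p) mask over pixel positions: True means the pixel is overwritten
    mask = rng.random(image.shape[:2]) < p
    noised[mask] = np.asarray(channel_means, dtype=image.dtype)
    return noised
```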
To evaluate the de-identification method, we trained models to predict patient sex (Inception V3) and diabetic retinopathy (DR; ResNet200d) on the original images and measured the reduction in accuracy (ACC), AUC, and F1 score across DP levels.
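A minimal sketch of how the per-DP-level metrics could be computed, assuming binary labels and predicted probabilities are available (the helper name and threshold are illustrative; scikit-learn supplies the metric functions):

```python
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score

def evaluate(y_true, y_prob, threshold=0.5):
    """Return ACC, AUC, and F1 for one DP level, given labels and predicted probabilities."""
    y_pred = [int(prob >= threshold) for prob in y_prob]
    return {
        "ACC": accuracy_score(y_true, y_pred),
        "AUC": roc_auc_score(y_true, y_prob),
        "F1": f1_score(y_true, y_pred),
    }

# Example: metric reduction at one DP level relative to the original images
# baseline = evaluate(labels, probs_original)
# dp_level = evaluate(labels, probs_dp_snow)
# reduction = {k: baseline[k] - dp_level[k] for k in baseline}
```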
An image region-specific augmentation approach (Gaussian noise, circular rotation, fisheye distortion, shear) was also applied for comparative analysis.
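One possible way to assemble such an augmentation pipeline, sketched here with the albumentations library; the abstract lists the transform types but not their parameters, so the values and probabilities below are placeholders, and OpticalDistortion stands in for the fisheye distortion:

```python
import albumentations as A

# Illustrative pipeline; parameters are assumptions, not the study's settings.
augment = A.Compose([
    A.GaussNoise(var_limit=(10.0, 50.0), p=0.5),    # Gaussian pixel noise
    A.Rotate(limit=180, p=0.5),                     # circular rotation of the fundus
    A.OpticalDistortion(distort_limit=0.5, p=0.5),  # fisheye-like distortion
    A.Affine(shear=(-10, 10), p=0.5),               # shear
])

# augmented = augment(image=fundus_image)["image"]
```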
Results:
The sex identification model achieved 75.12% ACC and 80.66% AUC. On p=0.5 DP-Snow images, performance dropped to 52.15% ACC and 54.15% AUC.
The DR classification model achieved 95.9% ACC and 76% F1. On p=0.5 DP-Snow images, ACC dropped to 95% and F1 to 70.6%.
Among the augmented images, Gaussian noise performed best, with a 5.12% decrease in sex identification; in DR classification, ACC was 96.3% and F1 was 81%.
Conclusions:
We report a modified DP-Snow de-identification method, applied DP to quantify privacy, and compared it with image manipulation techniques. We demonstrate that with DP-Snow at p=0.5, a demographic characteristic (sex) cannot be determined by the AI model, while performance on a downstream task (DR classification) is preserved. Future experiments will evaluate other tasks and demographics and include external validation on patient data from other nationalities. DP-Snow images at p=0.1 to 0.5 remain human gradable, reduce demographic discrimination, and enable downstream tasks.
This abstract was presented at the 2023 ARVO Annual Meeting, held in New Orleans, LA, April 23-27, 2023.