Abstract
Purpose :
To develop and validate a novel artificial intelligence (AI)-powered system to evaluate surgeon proficiency in maintaining eye stability, centration, and adequate focus in cataract surgery, and to assess differences in these metrics between attending and resident cataract surgeons.
Methods :
An automated system was designed to evaluate cataract surgeon performance based on recorded videos. The palpebral fissure, limbus, and Purkinje Image 1 (PI-1) were automatically segmented using a deep learning model (UNet with VGG16 backbone) trained and validated on 5,700 annotated images from 190 cataract surgeries. A total of 352 cataract surgeries (162 attending and 190 resident) were then evaluated on three proposed cataract surgery assessment metrics (CSAMs): 1) LCP1: distance between the limbus centroid and PI-1; 2) LCFC: distance between the limbus centroid and the center of the video frame; and 3) FS: focus level of the recorded video frame. A machine learning (ML)-based ensemble model (combining SVM, Random Forest, and Logistic Regression) for surgery-level classification was trained and validated on this dataset to evaluate the differences in CSAMs between attending and resident cataract surgeons.
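The two distance-based CSAMs can be illustrated with a minimal sketch. It assumes the segmentation step yields a binary limbus mask (rows of 0/1) and a PI-1 location in pixel coordinates; the function names `limbus_centroid` and `frame_metrics` are hypothetical and not taken from the study. FS is omitted because the abstract does not state how focus level is computed.

```python
import math


def limbus_centroid(mask):
    """Return the (x, y) centroid of nonzero pixels in a binary mask.

    `mask` is a list of rows of 0/1 values (an assumed representation of
    the segmented limbus). Returns None if the mask is empty.
    """
    sum_x = sum_y = count = 0
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None
    return (sum_x / count, sum_y / count)


def frame_metrics(mask, pi1_xy):
    """Compute LCP1 and LCFC (in pixels) for a single video frame.

    LCP1: distance between the limbus centroid and PI-1.
    LCFC: distance between the limbus centroid and the frame center.
    Returns None when no limbus pixels were segmented.
    """
    centroid = limbus_centroid(mask)
    if centroid is None:
        return None
    height, width = len(mask), len(mask[0])
    # Assumption: the frame center is the midpoint of the pixel grid.
    frame_center = ((width - 1) / 2, (height - 1) / 2)
    return {
        "LCP1": math.dist(centroid, pi1_xy),
        "LCFC": math.dist(centroid, frame_center),
    }
```

Under these assumptions, smaller LCP1 and LCFC values correspond to a more stable and better-centered eye in the frame.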
Results :
The case-level mean and SD of all three CSAMs (LCP1, LCFC, and FS) were significantly better (lower) for attending cases than for resident cases [LCP1 mean (p = 0.0005), LCP1 SD (p = 0.0005), LCFC mean (p = 0.0005), LCFC SD (p = 0.0005), FS mean (p = 0.0024), and FS SD (p = 0.0005)]. Residents struggled most with eye stability and centration during cortical removal (LCP1 mean and LCP1 SD greater by 19.71% and 31.64%, respectively), viscoelastic removal (LCP1 SD greater by 52.43%), and wound closure (LCP1 mean greater by 22.09%). Residents also struggled to maintain adequate focus throughout surgery, as evidenced by a higher variance in FS mean across all surgical phases (1.04 for residents vs. 0.91 for attendings). Furthermore, the ML-based ensemble model achieved an accuracy of 83.96% and an AUC of 83.19% in classifying the surgeon as attending or resident.
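The case-level summaries and percentage differences reported above could be computed along the following lines. This is a sketch under stated assumptions: the abstract does not say whether the SD is the population or sample form (population SD is used here), nor which group serves as the baseline for "greater by" (the attending value is assumed). The helper names are hypothetical.

```python
from statistics import fmean, pstdev


def case_summary(frame_values):
    """Summarize one surgery's per-frame metric values as mean and SD.

    Assumption: population SD (pstdev); the abstract does not specify.
    """
    return {"mean": fmean(frame_values), "sd": pstdev(frame_values)}


def percent_greater(resident_value, attending_value):
    """Percentage by which the resident value exceeds the attending value.

    Assumption: the attending value is the baseline (denominator).
    """
    return (resident_value - attending_value) / attending_value * 100.0
```

For example, `case_summary` would be applied to the per-frame LCP1 values of each surgery, and `percent_greater` to the group averages of those summaries within a surgical phase.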
Conclusions :
The proposed AI-enabled assessment system and novel CSAMs provide a high level of reliability in assessing a surgeon’s ability to maintain eye stability, centration, and focus during cataract surgery. An ensemble ML model demonstrated high performance in distinguishing surgeon skill level.
This abstract was presented at the 2024 ARVO Annual Meeting, held in Seattle, WA, May 5-9, 2024.