June 2022
Volume 63, Issue 7
Open Access
ARVO Annual Meeting Abstract  |   June 2022
PhacoTrainer: Deep Learning for Cataract Surgical Videos to Track Surgical Tools
Author Affiliations & Notes
  • Hsu-Hang Yeh
    Biomedical Data Science, Stanford University, Stanford, California, United States
  • Anjal M Jain
    Byers Eye Institute, Stanford University School of Medicine, Stanford, California, United States
  • Mariama Jallow
    Georgetown University School of Medicine, Washington, District of Columbia, United States
  • Kostya Sebov
    Biomedical Data Science, Stanford University, Stanford, California, United States
  • Sophia Y Wang
    Byers Eye Institute, Stanford University School of Medicine, Stanford, California, United States
  • Footnotes
Commercial Relationships: Hsu-Hang Yeh, None; Anjal Jain, None; Mariama Jallow, None; Kostya Sebov, None; Sophia Wang, None
  • Footnotes
Support: Stanford McCormick Gabilan Fellowship; Research to Prevent Blindness Career Development Award; Research to Prevent Blindness unrestricted departmental funds; NEI 1K23EY03263501; NEI P30-EY026877
Investigative Ophthalmology & Visual Science June 2022, Vol.63, 225 – F0072.
Abstract

Purpose : Deep learning provides a powerful approach to analyzing surgical videos and objectively assessing surgical skill. We aim to build a model that automatically identifies the locations of cataract surgical tools and eye landmarks, which can then be used to grade surgical performance.

Methods : We sampled 1,156 frames from 9 core steps of 268 cataract surgical videos and annotated the regions of 8 different surgical tools, as well as the pupil border and limbus. We pretrained YOLACT, a real-time object detection and segmentation model, on the CaDIS dataset, a public dataset for semantic segmentation of cataract surgical videos, and then fine-tuned the pretrained model on our dataset. Object detection was evaluated by average precision (AP), calculated by averaging the precision of the predicted bounding boxes along the precision-recall curve, and segmentation was evaluated by intersection-over-union (IoU), calculated as the area of overlap between the predicted and true masks divided by the area of their union. Tooltip positions were estimated by identifying the edge point of the predicted mask closest to the screen center. Pupil centers were estimated by fitting an ellipse to the outer edge of the pupil mask and taking the ellipse center. For further validation, estimated tip positions were compared with ground-truth tip positions in 46,620 frames from 4 phacoemulsification video clips.
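As a rough illustration of the IoU, tooltip, and pupil-center computations described above, the sketch below works from binary NumPy masks and uses OpenCV contour extraction and ellipse fitting; the function names and exact geometric details are assumptions for illustration, not the authors' implementation.

```python
# Illustrative sketch only (not the authors' code): mask IoU, tooltip
# localization, and pupil-center estimation from predicted binary masks.
# Assumes NumPy arrays for masks and OpenCV for contours / ellipse fitting.
import cv2
import numpy as np


def mask_iou(pred_mask, true_mask):
    """Intersection-over-union between a predicted and a ground-truth binary mask."""
    intersection = np.logical_and(pred_mask, true_mask).sum()
    union = np.logical_or(pred_mask, true_mask).sum()
    return intersection / union if union > 0 else 0.0


def estimate_tooltip(tool_mask):
    """Return the mask edge point closest to the screen (frame) center."""
    h, w = tool_mask.shape
    center = np.array([w / 2.0, h / 2.0])
    contours, _ = cv2.findContours(tool_mask.astype(np.uint8),
                                   cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    edge_points = np.vstack([c.reshape(-1, 2) for c in contours])  # (x, y) pairs
    closest = edge_points[np.argmin(np.linalg.norm(edge_points - center, axis=1))]
    return tuple(closest)


def estimate_pupil_center(pupil_mask):
    """Fit an ellipse to the outer edge of the pupil mask and return its center."""
    contours, _ = cv2.findContours(pupil_mask.astype(np.uint8),
                                   cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    outer_edge = max(contours, key=cv2.contourArea)       # largest contour = outer edge
    (cx, cy), _axes, _angle = cv2.fitEllipse(outer_edge)  # requires >= 5 contour points
    return cx, cy
```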

Results : The mean AP and IoU across object classes were 0.78 and 0.82, respectively. Segmentation performed best for the blade, Weck sponge, and phaco instruments, whereas performance was worst for the needle/cannula class of instruments (Table). The average deviation of estimated phaco tip positions from ground-truth positions was 6.13 pixels; examples are shown in the Figure. When predictions within 10 pixels of the true position were considered true positives, the average sensitivity and precision were 81% and 100%, respectively.
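The threshold-based matching used for this validation can be made explicit. Below is a minimal sketch, assuming one ground-truth tip per frame and at most one predicted tip per frame; the 10-pixel radius comes from the abstract, while the function name and the frame-level matching logic are illustrative assumptions rather than the authors' evaluation code.

```python
# Minimal sketch (illustrative, not the authors' evaluation code): frame-level
# sensitivity and precision for tip detection, counting a prediction within
# `radius` pixels of the ground-truth tip as a true positive.
import numpy as np


def tip_detection_metrics(pred_tips, true_tips, radius=10.0):
    """pred_tips: per-frame (x, y) or None; true_tips: per-frame (x, y)."""
    tp = fp = fn = 0
    for pred, true in zip(pred_tips, true_tips):
        if pred is None:
            fn += 1                        # tip present in frame but not detected
            continue
        if np.linalg.norm(np.subtract(pred, true)) <= radius:
            tp += 1                        # detection close enough to the true tip
        else:
            fp += 1                        # detection too far from the true tip
            fn += 1                        # true tip left unmatched
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return sensitivity, precision
```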

Conclusions : We trained a deep learning model to perform real-time surgical instrument and tooltip detection with good accuracy. The model could be used to develop an automated feedback system that rates surgical performance using cataract surgical videos.

This abstract was presented at the 2022 ARVO Annual Meeting, held in Denver, CO, May 1-4, 2022, and virtually.

 

Table. Per-class average precision of the bounding boxes and intersection-over-union of the segmentation masks on the test set

Figure. Examples of predicted segmentation masks, tip positions for cataract surgical tools, and anatomical landmarks
