Chieh-Li Chen, Hiroshi Ishikawa, Gadi Wollstein, Richard Anthony Bilonick, Ian A Sigal, Larry Kagemann, Amanda Woodside, Joel S Schuman; Is Reproducibility a Good Measure of Segmentation Quality in Optical Coherence Tomography (OCT)?. Invest. Ophthalmol. Vis. Sci. 2014;55(13):4788.
© ARVO (1962-2015); The Authors (2016-present)
Measurement reproducibility has been the staple of performance assessment for OCT segmentation software. To achieve high reproducibility across the acceptable signal strength (SS) range, strong smoothing and/or aggressive signal enhancement is often employed, which may decrease the sensitivity to detect localized changes, such as early retinal nerve fiber bundle defects (NFBD). The purpose of this study was to evaluate segmentation algorithm performance in terms of sensitivity for detecting NFBD.
Eighteen eyes from 10 glaucomatous subjects and 8 glaucoma suspects were scanned with Cirrus HD-OCT (Zeiss, Dublin, CA; Optic Disc Cube 200x200 scan pattern). All eyes showed NFBD on the Cirrus native retinal nerve fiber layer (RNFL) thickness deviation maps but were categorized as normal based on global mean RNFL thickness. We tuned our own RNFL segmentation algorithm in two ways: 1) for good RNFL thickness measurement reproducibility (R algorithm) and 2) for high NFBD detection sensitivity (S algorithm). The R algorithm was tuned to be reproducible across the acceptable SS range, similar to the method employed by the native Cirrus software, while the S algorithm was tuned to be sensitive to RNFL thinning at the cost of being more affected by SS (Figure). Sensitivity and specificity for detecting NFBD were calculated for the R and S algorithms and the Cirrus native software. NFBD were defined by manually reading, on the Cirrus native deviation map, the cumulative arc length that showed at least borderline deviation (thinner than the 5th percentile of the normative data): a clock hour was defined as abnormal when the affected arc exceeded half of the clock-hour arc length, and a quadrant was defined as abnormal when the affected arc within it exceeded one clock-hour arc length.
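The clock-hour and quadrant definitions above, and the sensitivity and specificity calculation, can be illustrated with a minimal sketch. This is not the authors' implementation: in the study the reference labels came from manual reading of the Cirrus deviation map. The sketch assumes the flagged region is sampled as 360 one-degree boolean values around the optic disc, that clock hours are 30-degree sectors and quadrants are 90-degree sectors starting at degree 0 (the actual Cirrus sector alignment may differ), and all function names are hypothetical.

```python
from typing import List, Sequence, Tuple

DEG_PER_CLOCK_HOUR = 30  # 12 clock hours around the disc (assumed alignment)
DEG_PER_QUADRANT = 90    # 4 quadrants around the disc (assumed alignment)


def abnormal_sectors(flagged: Sequence[bool], sector_deg: int,
                     min_flagged_deg: float) -> List[bool]:
    """Mark each sector abnormal when its flagged arc exceeds min_flagged_deg.

    `flagged` holds one boolean per degree (assumed sampling): True where the
    deviation map shows at least borderline deviation.
    """
    if len(flagged) != 360:
        raise ValueError("expected 360 one-degree samples")
    result = []
    for start in range(0, 360, sector_deg):
        flagged_deg = sum(1 for f in flagged[start:start + sector_deg] if f)
        result.append(flagged_deg > min_flagged_deg)
    return result


def abnormal_clock_hours(flagged: Sequence[bool]) -> List[bool]:
    # Abnormal when more than half of the clock-hour arc is flagged.
    return abnormal_sectors(flagged, DEG_PER_CLOCK_HOUR, DEG_PER_CLOCK_HOUR / 2)


def abnormal_quadrants(flagged: Sequence[bool]) -> List[bool]:
    # Abnormal when more than one clock-hour arc length is flagged.
    return abnormal_sectors(flagged, DEG_PER_QUADRANT, DEG_PER_CLOCK_HOUR)


def sensitivity_specificity(reference: Sequence[bool],
                            predicted: Sequence[bool]) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have equal length")
    tp = sum(1 for r, p in zip(reference, predicted) if r and p)
    fn = sum(1 for r, p in zip(reference, predicted) if r and not p)
    tn = sum(1 for r, p in zip(reference, predicted) if not r and not p)
    fp = sum(1 for r, p in zip(reference, predicted) if not r and p)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec
```

In this sketch, the reference labels would be the sector flags derived from the Cirrus native deviation map, and the predicted labels would be the sector-level abnormality calls from whichever algorithm is being evaluated.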
Neither algorithm detected an abnormality in global mean RNFL thickness. In quadrant and clock hour measurements, the S algorithm demonstrated the highest sensitivities and the lowest specificities (Table).
Algorithm sensitivity can be improved at the cost of reproducibility across the acceptable SS range. A good balance of sensitivity and reproducibility needs to be achieved in segmentation algorithm development in order to ensure optimal clinical utility.