This study evaluated the progression rate of OS metrics in male patients with XLRP associated with an RPGR mutation. The results of this study demonstrated for the first time that, with the assistance of DLMs, it is possible to analyze volumetric measurements of photoreceptor OSs in a large OCT image dataset of XLRP. This allowed quantitative measurement of longitudinal changes of OS volume, as well as preserved EZ area and OS thickness, to assess disease progression. The linear mixed-effects models revealed that the progression rates of preserved EZ area and OS volume were dependent on their baseline values, with faster declines in eyes with larger baseline values. Relative to their means at baseline, the overall annual decreases for preserved EZ area and OS volume were around 9.5%. A previous study predicted a 13% reduction in EZ area based on a 7% annual decrease in EZ width,²⁰ and this predicted rate of decrease is comparable to the 14% to 16% annual percent decrease of EZ area estimated from Table 2 for the subgroups with middle-level baseline EZ area.
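As a rough check on how a percent change in a 1D measure maps to a 2D or 3D measure, the sketch below assumes that the preserved region shrinks uniformly, so that area scales with the square of width and volume with the cube. This geometric scaling is an assumption made for illustration, not necessarily the method of the cited study, and the function names are hypothetical.

```python
def scaled_percent_decrease(linear_decrease_pct: float, dimension: int) -> float:
    """Percent decrease of a measure that scales as (linear size) ** dimension.

    Assumes uniform shrinkage of the preserved region (illustrative only).
    """
    remaining = (1.0 - linear_decrease_pct / 100.0) ** dimension
    return 100.0 * (1.0 - remaining)


def linear_percent_decrease(measure_decrease_pct: float, dimension: int) -> float:
    """Inverse mapping: percent decrease of the dimension-th root of a measure."""
    remaining = (1.0 - measure_decrease_pct / 100.0) ** (1.0 / dimension)
    return 100.0 * (1.0 - remaining)


if __name__ == "__main__":
    # 7% annual decrease in EZ width -> implied annual decrease in EZ area
    print(round(scaled_percent_decrease(7.0, 2), 1))
    # 9.5% annual decrease in area or volume -> implied decrease in the
    # square root of area and in the cube root of volume
    print(round(linear_percent_decrease(9.5, 2), 1))
    print(round(linear_percent_decrease(9.5, 3), 1))
```

Under this assumption, a 7% decrease in width corresponds to a decrease of about 13.5% in area, close to the 13% figure quoted above. The inverse mapping shows why a root-transformed (1D) measure is expected to have a smaller percent rate of change than the area or volume it is derived from.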
Among the three OS metrics, 3D OS volume had a percent rate of progression comparable to that of 2D EZ area, but it declined more than 6 times faster than one-dimensional (1D) OS thickness, which had an annual reduction of 1.5%. Although a 1D metric, such as EZ width as discussed in the previous paragraph, may show a smaller percent rate of change, the much slower percent progression rate for OS thickness is likely due to the limited range of OS thickness, as well as the axial resolution of OCT A-scans. Given an A-scan resolution of 3.87 µm/pixel for SPECTRALIS OCT and a mean baseline OS thickness of around 20 µm among the patients with XLRP in this study, the mean OS thickness was represented by only about 5 pixels. Thus, a shift of 1 pixel in the segmentation of the EZ or pRPE layer boundary would lead to a change of about 20% in OS thickness, and the resulting measurement variability could mask the actual change of OS thickness.²⁷ Hence, mean OS thickness alone may not be a good candidate as a biomarker. On the other hand, EZ area contributes the most to OS volume and dictates the variability of OS volume measurements. Because of the large range of EZ areas (>60 mm² from 9-mm volume scans) and the high resolution of B-scans (∼5.8 µm/pixel), the impact of segmentation errors on the measurement of EZ area, and hence OS volume, is much smaller than that on OS thickness. Therefore, both EZ area and OS volume are more effective candidates than OS thickness as biomarkers to monitor disease progression in XLRP. The question is whether OS volume could offer something that EZ area could not, for example, if a treatment increased the sensitivity of surviving photoreceptors together with an increase in their outer segment length but not in the area of preserved EZ. Although OS volume could be a potential new biomarker for assessing disease progression in XLRP, further studies are required to determine whether OS volume has any advantages over EZ area.
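The pixel arithmetic behind the OS thickness argument can be reproduced with a few lines. This is a minimal sketch using only the two figures quoted above (3.87 µm/pixel axial resolution and a mean OS thickness of about 20 µm); the function names are hypothetical.

```python
def pixels_spanned(thickness_um: float, axial_res_um_per_px: float) -> float:
    """Number of axial pixels covered by a layer of the given thickness."""
    return thickness_um / axial_res_um_per_px


def percent_change_per_pixel_shift(
    thickness_um: float, axial_res_um_per_px: float, shift_px: float = 1.0
) -> float:
    """Percent change in measured thickness caused by a boundary shift of shift_px pixels."""
    return 100.0 * shift_px * axial_res_um_per_px / thickness_um


if __name__ == "__main__":
    thickness_um = 20.0   # approximate mean baseline OS thickness from the text
    axial_res = 3.87      # SPECTRALIS A-scan resolution (um/pixel) from the text
    print(f"pixels spanned: {pixels_spanned(thickness_um, axial_res):.2f}")
    print(f"change per 1-pixel shift: "
          f"{percent_change_per_pixel_shift(thickness_um, axial_res):.1f}%")
```

With these inputs the layer spans just over 5 pixels, and a 1-pixel boundary shift changes the measured thickness by roughly 19%, which is far larger than the 1.5% annual reduction in OS thickness.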
From the imaging and mathematical points of view, the progressive loss of the EZ band, and hence EZ area, over time in RP is the opposite of the progressive increase of geographic atrophy (GA) area over time in age-related macular degeneration. It has been shown that the growth rate of the square root of GA area is independent of baseline GA size,³³ leading to the suggestion that using the square root transformation of GA area measurements could potentially eliminate the dependence of growth rates on baseline lesion size.³⁴ In contrast, our results showed that the progression rate of the preserved EZ area in XLRP was dependent on its baseline value even after the square root transformation. Our finding is consistent with a previous study showing a weak relationship between loss of EZ width and the initial width of the EZ band in XLRP, where the greater the initial EZ band width, the greater the annual loss of EZ band (R² = 0.16, P = 0.03).²⁰ A possible explanation for this difference is the different patterns of disease progression in RP and GA. Another reason could be that our study included more cases of large EZ area (>10 mm²) when compared to the atrophy sizes reported in the GA studies.³³,³⁴ As Feuer et al.³⁴ noted, very small and very large lesions are expected to grow more slowly in GA, as demonstrated in a more recent study that included more cases of large-size GA (up to 30 mm²), which showed that the square root transformation did not eliminate the dependence of the rate of progression of GA on baseline lesion size.³⁵ Hence, the square root transformation may not be applicable to datasets with an extended range of baseline values.
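One way to see why a square root transformation can, in principle, remove baseline dependence is to consider an idealized circular region whose radius changes at a constant rate. In that case the annual change in area grows with baseline size, while the annual change in the square root of area is the same for every baseline. The circular geometry, the constant radial rate of 0.1 mm/year, and all names below are assumptions chosen for illustration only; they are not taken from this study or the cited GA studies, and the results discussed above indicate that real data can depart from this idealized behavior.

```python
import math


def area_after(baseline_area_mm2: float, radial_rate_mm_per_year: float,
               years: float) -> float:
    """Area of a circle whose radius shrinks at a constant rate (floored at zero)."""
    r0 = math.sqrt(baseline_area_mm2 / math.pi)
    r = max(r0 - radial_rate_mm_per_year * years, 0.0)
    return math.pi * r * r


def annual_changes(baseline_area_mm2: float,
                   radial_rate_mm_per_year: float = 0.1) -> tuple[float, float]:
    """Return (change in area, change in sqrt(area)) over one year."""
    a1 = area_after(baseline_area_mm2, radial_rate_mm_per_year, 1.0)
    d_area = a1 - baseline_area_mm2
    d_sqrt = math.sqrt(a1) - math.sqrt(baseline_area_mm2)
    return d_area, d_sqrt


if __name__ == "__main__":
    # Hypothetical baseline areas spanning a wide range
    for a0 in (5.0, 20.0, 60.0):
        d_area, d_sqrt = annual_changes(a0)
        print(f"baseline {a0:5.1f} mm^2: dA = {d_area:.3f}, d sqrt(A) = {d_sqrt:.4f}")
```

In this idealized model the change in the square root of area equals the radial rate times the square root of π regardless of baseline, so any residual baseline dependence after transformation points to a progression pattern that differs from constant radial change.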
On the other hand, the progression rate of the 1D cube root of OS volume was not associated with its baseline value, similar to the result that the change of OS thickness was not associated with its baseline value. This finding could be due to the reduced range of 1D measures after cube root transformation of OS volume (Fig. 4D), as well as the increased intersubject variability because of the larger variability of OS thickness measurements as shown in Figure 2A. It appears that the 1D cube root of OS volume behaves more like OS thickness than like the 1D square root of EZ area. Further studies may be needed with increased sample sizes and improved OS thickness measurements (for example, by employing ultra-high-resolution OCT scans) to address the apparent difference in dependence on baseline values between 1D measures from square root transformation of EZ area and from cube root transformation of OS volume.
In this study, manual correction by human graders was performed on the automatic segmentation results of a deep learning model. Preliminary timing analysis revealed that the average times (SD) for two graders to examine and correct DLM segmentations of EZ and pRPE were 4.10 (2.04) minutes for a low-density (31-line) volume scan and 9.33 (1.76) minutes for a high-density (121-line) volume scan, suggesting that DL segmentation with manual correction can potentially be more efficient than traditional manual grading. On the other hand, DL segmentation with manual correction may differ from the conventional gold standard of manual grading. Our previous study, which employed the same DLM as this study, demonstrated close agreement between the DLM only and manual grading by a reading center for EZ area measurements. Bland–Altman analyses showed similar findings for EZ area measurements in the two studies, with a CoR of 1.83 mm² in this study compared to 1.62 mm² in the previous study. In addition, the agreement between the DLM only and the DLM with manual correction by human graders was comparable to that between two human graders. Because of this possible difference from conventional human grading, additional studies are needed to compare OS metric measurements by the DLM and by different human grading methods, including traditional manual segmentation approaches, to establish DLMs with manual correction as a potential new gold standard. Nevertheless, our results suggest that automatic OS metric measurements by the DLM are comparable to those obtained by manual grading.
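For readers unfamiliar with the agreement statistics mentioned above, the following sketch computes a Bland–Altman bias, limits of agreement, and a coefficient of repeatability (CoR) for paired measurements. The exact CoR definition used in this study is not restated in this section; the sketch assumes the common convention of 1.96 times the SD of the paired differences. The function name and the example values are hypothetical and are not study data.

```python
from statistics import mean, stdev


def bland_altman(method_a: list[float], method_b: list[float]) -> dict:
    """Bland-Altman summary for paired measurements.

    Assumes CoR = 1.96 * SD of the paired differences (one common convention).
    """
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("need two equally sized samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)          # mean difference between the two methods
    sd = stdev(diffs)           # sample SD of the differences
    cor = 1.96 * sd             # coefficient of repeatability (assumed definition)
    return {"bias": bias, "cor": cor, "limits": (bias - cor, bias + cor)}


if __name__ == "__main__":
    # Hypothetical EZ area measurements (mm^2) from two grading methods
    dlm_only = [10.0, 12.0, 8.0, 15.0]
    corrected = [9.0, 12.5, 8.5, 14.0]
    result = bland_altman(dlm_only, corrected)
    print(f"bias = {result['bias']:.2f} mm^2, CoR = {result['cor']:.2f} mm^2")
```

A smaller CoR indicates closer agreement between the two methods, which is how the values of 1.83 mm² and 1.62 mm² reported above can be compared.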
The comparison of performance between human graders, as well as between individual graders and the DLM, revealed that intergrader variability as determined by CoR was higher than the variability between the DLM and grader 1 for OS thickness measurement and, to some degree, for EZ area measurement, whereas intergrader variability was smaller than that between the DLM and grader 2. It is worth pointing out that grader 2 was one of the main contributors to the generation of the training data (the ground truth) for the DLM, whereas grader 1 had no input to the training data. Nevertheless, it appears that grader 1 and the DLM are closer to each other than grader 2 and the DLM. These results suggest that intragrader variability, as well as intergrader variability, may contribute to the difference between individual graders and the DLM. Because the DLM was trained with data generated by human graders with intra- and intergrader variability, it may reflect an “average” performance of graders.
In summary, the results from this study provide evidence to support using OS volume and EZ area as biomarkers to assess the progression of XLRP and as outcome measures to evaluate treatment effects in clinical trials. Because the primary outcome measure of RP clinical trials is vision, it is also important to demonstrate the relationship between OS metrics and visual function, such as visual field sensitivity. Our preliminary results have shown that mean visual field sensitivity had much higher correlation coefficients with EZ area (r = 0.84) and OS volume (r = 0.84) than with OS thickness (r = 0.45) in RP.³⁶ Given that the progression rates of EZ area and OS volume are dependent on their baseline values, the baseline values of OS metrics should be considered in the design and statistical analysis of future clinical trials for XLRP. Our results also suggest that DL may provide effective tools to significantly reduce the burden on human graders when analyzing OCT scan images and to facilitate the study of structure and function relationships,³⁶ in addition to the use of DL to assess disease progression, particularly with respect to future treatment trials for RP in general and XLRP in particular.