**Purpose**:
Longitudinal imaging is becoming more commonplace for studies of disease progression, response to treatment, and healthy maturation. Accurate and reproducible quantification methods are desirable to fully mine the wealth of data in such datasets. However, most current retinal OCT segmentation methods are cross-sectional and fail to leverage the inherent context present in longitudinal sequences of images.

**Methods**:
We propose a novel graph-based method for segmentation of multiple three-dimensional (3D) scans over time (termed 3D + time or 4D). The usefulness of this approach in retinal imaging is illustrated in the segmentation of the choroidal surfaces from longitudinal optical coherence tomography (OCT) scans. A total of 3219 synthetic (3070) and patient (149) OCT images were segmented for validation of our approach.

**Results**:
The results show that the proposed 4D segmentation method is significantly more reproducible (*P* < 0.001) than the 3D approach and is significantly more sensitive to temporal changes (*P* < 0.0001) achieved by the substantial increase of measurement robustness.

**Conclusions**:
This is the first automated 4D method for jointly quantifying choroidal thickness in longitudinal OCT studies. Our method is robust to image noise and produces more reproducible choroidal thickness measurements than a sequence of independent 3D segmentations, without sacrificing sensitivity to temporal changes.

^{1–4}Compared with the large intersubject variability in cross-sectional studies, within-subject measurements allow more precise quantification of change over time. This is of great interest both for quantifying disease progression and in studies of aging.

^{4–8}and treat the OCT images from different time points completely independently. These methods rely on statistical analysis for exploring the relationship of the time points with each other. Recently, a new approach that considers the longitudinal time sequence during image analysis was proposed

^{9}; however, this method appears to suffer from temporal over-regularization as it can only detect small temporal changes. Such over-regularization is a concern in many longitudinal analysis methods, which aim to reduce the measurement discrepancies between time points by using temporal regularization; such over-regularization can be seen as preferring no or limited changes over time for which the allowed changes would not reflect the true changes of anatomy/pathology over time. Although temporal regularization is indeed useful to reduce measurement noise and increase reproducibility, in the extreme cases, over-regularization would result in an inability to detect actual temporal changes present in the data, which is clearly undesirable. In summary, no method currently exists for a successful 4D (3D + time) image quantification of longitudinal OCT scans that adequately balances the need to detect small changes over time with the need for consistent measurements between time points.

^{10,11}Choroidal thickness is an important retinal biomarker that is affected in many diseases of the retina, for example, in AMD,

^{12}in response to anti-VEGF therapy in diabetic macular edema,

^{13}and in healthy aging.

^{14}However, because of absorption by the RPE, the choroid is prone to low signal-to-noise ratio (SNR) in OCT scans and is therefore challenging to segment,

^{15–18}even in swept-source OCT, which is designed to increase SNR in the deeper layers. This is especially problematic in longitudinal studies where small changes in the noise appearance between images from different time points may lead to highly inconsistent surface positioning between time points, reducing statistical power in studies.

^{3}spacing and covering a volume of 6 × 6 × 2.3 mm

^{3}.

^{3}spacing, and covering the volume of 7.76 × 8.67 × 1.92 mm

^{3}.

^{6,19–23}Because the retinal angle between the incident light and subject's optical axis varied in OCT volumetric images, some geometric distortion occurs in the choroidal layer. An angle adjustment approach

^{24}was applied to the original OCT volumes to overcome this. Bruch's membrane was transformed to a relatively symmetrical surface in the OCT images, and then the BM was modeled as a convex arc on each B-scan, and the curvature of this arc was computed using the average axial length of a human eyeball. A single surface graph-search method was used to segment BM by utilizing this arc model as shape-prior information.

^{19}The OCT volumes were then flattened using the BM surface. The flattened scans from all time points are aligned such that the horizontal line representing the (flat) BM surface is at the same position in each volume. No within-plane alignment was necessary in the presented experiments because all scans were carefully centered on the fovea; however, other datasets with greater foveal location variability may need this as an additional preprocessing step.

^{4,8}was done using the LOGISMOS framework.

^{25}We begin by cropping the flattened OCT image to a sufficiently large region of interest underneath the BM surface. The empirically determined size of the crop region was 489 μm (188 voxels) in the anterior–posterior direction. Twenty-one micrometers of this region was anterior to the BM surface, to allow for recovery from any errors in the BM initialization. For computational efficiency, as well as to alleviate the anisotropy of the acquisition protocol, this cropped image was down-sampled by a factor of (2 × 4 × 1) in the inferior–superior, anterior–posterior, and nasal–temporal directions, respectively. A median filter was used in a one-voxel neighborhood to reduce noise.

^{25}to obtain the optimal surface segmentations.

*N*time points, that is, 2

*N*surfaces are segmented simultaneously. The preprocessing of each image, including the flattening, cropping, and down-sampling, was identical to the 3D approach.

*x*,

*y*,

*z*), where

*s*denotes the surface ID (lower or upper choroid),

*t*denotes the

_{i}*i*th time point, and (

*x*,

*y*,

*z*) denotes the spatial coordinates, the following infinite-weighted arcs were added:

*and Max*

_{t}*were set to [−2,2] node intervals for the upper choroid surface and [−5,5] node intervals for the lower choroid surface. These empirically determined settings indicate that each surface can move anteriorly or posteriorly independently. The choroid region bound by these two surfaces can thus thin or thicken by up to seven node intervals. Because the graph is constructed in the down-sampled image space, this corresponds to 72.8-μm thinning or thickening between consecutive time points, which is clinically reasonable.*

_{t}*N*longitudinal OCT volumes from

*N*time points, all

*N*subgraphs are connected to each other using the inter–time point arcs described above, to result in a single complex graph that represents the segmentation problem for all

*N*time points. The segmentation is then carried out using a single graph-cut on this complex graph to achieve simultaneous segmentation of the 2

*N*surfaces.

^{26}which is equivalent to additive Gaussian noise in the linear scale. To simulate noisy OCT images, we first converted the real 16-bit OCT images to decibel scale, by linearly scaling the intensities such that the voxel intensity range corresponded to the dynamic range of the OCT camera. The dynamic range of our OCT camera was empirically determined to be 42 dB. Then, the image was de-logged to convert it to linear scale, according to

*t*-tests between 3D and 4D segmentation reproducibility errors are used to assess statistical significance. We also present scatter plots of test–retest thickness measurements and report the correlation between these values at each noise level for 3D and 4D.

*t*(

_{i}*i*= 0…8, with

*t*

_{0}designating the real OCT image), a target thinning rate

*α*was generated as a random sample from a uniform distribution between [−25,25] μm, with a negative

_{i}*α*denoting a thinning choroid and a positive

*α*denoting a thickening choroid. These bounds for

*α*were chosen to represent a wide range of changes that can be realistically observed in the choroid. A B-spline grid with 13 nodes in each dimension is created to represent the whole OCT image domain. This grid density was empirically chosen to give adequate resolution, such that the choroid and surrounding regions can be manipulated separately from the rest of the image domain. The B-spline deformation was set to 0 outside the choroid region of interest (ROI), which is represented by two rows of grid nodes around the choroid. The B-spline grid is illustrated in Figure 1a. Inside the choroid region, each grid node (

_{i}*x*,

*y*) was assigned a local thinning amount

*β*

_{i}_{,(}

_{x}_{,}

_{y}_{)}, which was randomly drawn from a uniform distribution between [0,2|

*α*|]. Then, the vertical displacement at the grid node corresponding to the top row of the choroid ROI was set to

_{i}*β*

_{i}_{,(}

_{x}_{,}

_{y}_{)}, and at the grid node corresponding to the bottom row of the choroid ROI, it was set to −

*β*

_{i}_{,(}

_{x}_{,}

_{y}_{)}, for positive

*α*. The vertical displacement values were reversed for negative

_{i}*α*. We note that this corresponds to a local displacement of 2

_{i}*β*

_{i}_{,(}

_{x}_{,}

_{y}_{)}. Figure 1b shows these local displacements of the B-spline nodes. The smooth deformation field represented by the B-spline simulation ensures that the region surrounding the true choroid is stretched or compressed to accommodate the desired motion of the choroid itself.

*t*, the random deformation field thus computed was applied to the image at

_{i}*t*

_{i}_{−1}. Figures 1c and 1d show the thickening simulation for a particular B-scan from the experiment. We note that the

*α*are independent from each other; in practice, this means a time sequence can have alternating thinning and thickening periods, at varying rates of deformation. We chose this approach rather than consistent thinning or consistent thickening to obtain a wider range of situations that may occur in real clinical data based on disease progression and treatment schedule.

_{i}*t*

_{0}for the sensitivity experiment. One subject was excluded from this study as it had a very thin choroid, which occasionally led the choroidal layer to collapse to a line locally when a random deformation was applied. This resulted in an inability to create a synthetic 4D dataset and does not indicate a choroidal thickness measurement limitation of the reported image segmentation method. For each subject, 10 trials were conducted. For each trial, eight randomly deformed images were generated as described above. These images at

*t*

_{1,…,8}were segmented both independently in 3D and jointly in 4D. A total of 14 × 10 × 8 = 1120 synthetic images were thus used in this experiment. For a fair comparison, the initial BM surface was computed only once for each deformed image and used as input to both 3D and 4D segmentation algorithms.

*δ*

_{true}(

*i*) between time points

*t*and

_{i}*t*

_{i}_{−1}, by summing the local

*β*

_{i}_{,(}

_{x}_{,}

_{y}_{)}s. It is important to note that, although these relative thickness changes are known between time points, the absolute choroid thickness at any given time point is unknown, because we do not have the ground truth at

*t*

_{0}. For assessing the temporal sensitivity of the two methods, we compute the choroid thickness at each time point using the 3D and 4D segmentation results, and we report

*δ*

_{3D}(

*i*) −

*δ*

_{true}(

*i*) and

*δ*

_{4D}(

*i*) −

*δ*

_{true}(

*i*) as momentary sensitivity errors. Both signed and unsigned differences are reported. In this context, signed error represents a measurement bias in sensitivity, whereas unsigned error represents the bulk measurement error. In addition to these momentary thickness changes, we also consider cumulative thickness changes, Δ

_{true}(

*i*), Δ

_{3D}(

*i*), and Δ

_{4D}(

*i*), which represents the total change between time points

*t*and

_{i}*t*

_{0}, computed as . Paired

*t*-tests are used to statistically compare the findings.

*t*-tests were used for statistical comparison.

*P*< 0.001) between 3D and 4D segmentation methods; 4D segmentation was significantly more reproducible than 3D segmentation at each tested noise level, in terms of both surface positioning and thickness measurement. For eight pairs of images, 3D segmentation results were excluded due to segmentation failure.

*P*< 0.002) than for 3D segmentation.

*P*< 0.001), both for momentary and cumulative time intervals.

*δ*

_{3D}and

*δ*

_{true}(

*i*) was

*R*

^{2}= 0.299, compared with a correlation of

*R*

^{2}= 0.970 between momentary

*δ*

_{4D}and

*δ*

_{true}(

*i*). Similarly, the correlation between the cumulative Δ

_{3D}and Δ

_{true}(

*i*) was

*R*

^{2}= 0.288 compared with a correlation of

*R*

^{2}= 0.975 between cumulative Δ

_{4D}and Δ

_{true}(

*i*).

*t*

_{1}has a thicker choroid than the images at

*t*

_{2}–

*t*

_{4}, especially on the nasal side of the image. Quantitatively, the true change in thickness was

*δ*

_{true}(1) = 12 μm,

*δ*

_{true}(2) = 1 μm, and

*δ*

_{true}(3) = −2 μm. The 3D and 4D segmentations for the lower choroid boundary are shown in column b. Although the two methods closely agree at

*t*

_{1}, the 3D segmentation results are very poor at

*t*

_{2}and

*t*

_{3}, where they capture the wrong surface, presumably due to local noise and poorly defined boundaries. At

*t*

_{4}, the 3D segmentation captures the lower choroid surface on the nasal side of the image but again fails on the temporal side. The 4D segmentation leverages temporal regularization to generate reliably correct results at all four time points.

*t*

_{1}is substantially different than the later time points [

*δ*

_{4D}(1) = 10 μm], in agreement with the

*δ*

_{true}(1) value. In contrast, the thre surfaces corresponding to the time points

*t*

_{2}–

*t*

_{4}are nearly identical, which agrees with the near-zero

*δ*

_{true}values for these time points. As shown in Figure 4, these anecdotal findings were also found to be statistically significant in quantitative analysis.

*R*

^{2}= 0.54 for the 3D approach (

*y*= 1.69

*x*− 60.38) and

*R*

^{2}= 0.99 for the 4D approach (

*y*= 1.02

*x*− 1.82). Note that despite allowing quite dramatic changes of surface positioning along the temporal direction (liberal temporal context constraints Min

*and Max*

_{t}*), the slope and intercept of the regression line are virtually indistinguishable from 1 and 0, respectively. This further confirms both the hypothesized excellent performance of the 4D approach and also the validity of our expectation that choroidal thickness is not changing within 3 months in patients with glaucoma. Figure 7c further shows the thickness reproducibility error (i.e., the absolute difference between the test and retest thickness measurements). The mean and SD of the thickness reproducibility error was 35.6 ± 82.0 μm for the 3D approach compared with 3.7 ± 2.3 μm for the 4D approach. Similar to the results from the synthetic experiments, the dramatic improvement in reproducibility afforded by the longitudinal approach is clearly evident in these results.*

_{t}*P*> 0.34 for mean thickness,

*P*> 0.22 for temporal slope). The SD of choroidal thickness was significantly different (

*P*< 0.001, 2.43 ± 0.92 μm in 4D vs. 7.05 ± 3.69 μm in 3D). The results confirm our hypothesis that effective longitudinal (4D) analysis reduces measurement noise without sacrificing sensitivity to temporal change.

**I. Oguz**, None;

**M.D. Abramoff**, IDx LLC (C, I), P;

**L. Zhang**, None;

**K. Lee**, None;

**E.Z. Zhang**, None;

**M. Sonka**, P

*. 2016; 94: 586–591.*

*Acta Ophthalmol**. doi:10.1136/bjophthalmol-2015-308074.*

*Br J Ophthalmol**. 2015; 15: 181.*

*BMC Ophthalmol**. 2012; 53: 7510–7519.*

*Invest Ophthalmol Vis Sci**. 2013; 8869: pii1667494.*

*Proc SPIE**. 2009; 28: 1436–1447.*

*IEEE Trans Med Imag**. 2010; 18: 19413–19428.*

*Optics Express**. 2015; 56: 3202–3211.*

*Invest Ophthalmol Vis Sci**. 2015; 9413: 94130M.*

*Proc SPIE**. 1983; 6: 101–107.*

*Int Ophthalmol**. 2010; 29: 144–168.*

*Prog Retin Eye Res**. 2009; 50: 4982–4991.*

*Invest Ophthalmol Vis Sci**. 2014; 158: 745–751.e2.*

*Am J Ophthalmol**. 1994; 35: 2857–2864.*

*Invest Ophthalmol Vis Sci**. 2013; 4: 397–411.*

*Biomed Optics Express**. 2015; 6: 1694–1706.*

*Biomed Optics Express**. 2014 (2014):479268.*

*Comput Mathematic Methods Med**. 2013; 4: 2795–2812.*

*Biomed Optics Express**. 2012; 31: 1521–1531.*

*IEEE Trans Med Imag**. 2006; 28: 119–134.*

*IEEE Trans Pattern Anal Machine Intelligence**. 2011; 2: 2403–2416.*

*Biomed Optics Express**. 2010; 3: 169–208.*

*IEEE Rev Biomed Engineer**. 2010; 29: 1321–1330.*

*IEEE Trans Med Imag**. 2013; 54: 4808–4812.*

*Invest Ophthalmol Vis Sci**. 2010; 29: 2023–2037.*

*IEEE Trans Med Imaging**. 1999; 4: 95–105.*

*J Biomed Optics*