This study demonstrates a significant intrasession learning effect, but only between the first and second examinations when performing microperimetry for the first time. This learning effect was not present in a subset of participants examined again after 6 months. Therefore, intrasession test–retest variability can be minimized by discarding the first examination, thereby avoiding the influence of this learning effect. These findings have important implications for the design of clinical trials and for testing in clinical practice using microperimetry.
To determine the test–retest variability within a single session involving multiple examinations, it is crucial first to establish whether a significant systematic change, such as a learning effect, is present; a systematic change inevitably increases the test–retest variability of the examinations considered. Previous studies have consistently reported an absence of a significant learning effect with microperimetry in healthy subjects14,15 and in subjects with macular disease20 when considering MS. Similarly, our findings confirm the lack of any significant difference in MS between any pair of examinations in any group.
However, when considering analysis of PWS, we found a consistent improvement between the first and second examinations of the baseline session, but not between subsequent examinations within the same session. This improvement was also observed in participants who had performed a short practice test (approximately 1 minute in duration). A recent report by Cideciyan and colleagues21 also examined whether a learning effect was present on PWS analysis when performing dark-adapted, red-on-red microperimetry in participants with inherited macular degenerations. They did not observe a significant learning effect, and this discrepancy may be attributed to the different testing parameters used (such as background luminance, and the size, color, number, and location of the test stimuli). The discrepancy could also have arisen from the statistical analysis applied, since that study used a variance component analysis incorporating test–retest as a random effect, rather than considering test sequence as a fixed effect.19 Applying the same method of analysis to our data, we also did not find test–retest to be a significant effect (data not shown), even though all groups displayed a consistent learning effect when a linear mixed-effects model that considered test sequence as a fixed effect was used. When a factor is treated as a random effect, it is considered to be randomly sampled from a population, with its value depending on chance. In contrast, fixed effects are factors that are measured without error and are manipulated during an experiment to determine their effect on the entire population, such as the test sequence in this study.22
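To make the distinction between the two analytic approaches concrete, the sketch below contrasts them using linear mixed-effects models from Python's statsmodels package. It is a minimal illustration only, not the analysis code used in this study or by Cideciyan and colleagues; the data file and column names (participant, test_order, sensitivity) are hypothetical placeholders for long-format microperimetry data.

```python
# Minimal sketch (not the authors' code): test sequence as a fixed effect
# versus a random effect (variance component). Column names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical long-format data: one row per participant x examination, with
#   participant  - subject identifier
#   test_order   - examination number within the session (1, 2, 3, ...)
#   sensitivity  - point-wise or mean sensitivity in dB
df = pd.read_csv("microperimetry_long.csv")

# (a) Test sequence as a FIXED effect, with a random intercept per participant:
# a systematic shift between examinations appears as a significant
# C(test_order) coefficient (e.g., examination 2 versus examination 1).
fixed_fit = smf.mixedlm(
    "sensitivity ~ C(test_order)", data=df, groups=df["participant"]
).fit()
print(fixed_fit.summary())

# (b) Test sequence as a RANDOM effect (variance-component analysis):
# any between-examination shift is absorbed into a variance component
# rather than being estimated as a systematic mean difference.
random_fit = smf.mixedlm(
    "sensitivity ~ 1",
    data=df,
    groups=df["participant"],
    vc_formula={"test_order": "0 + C(test_order)"},
).fit()
print(random_fit.summary())
```

Under approach (a), a consistent first-to-second examination improvement shows up as a nonzero fixed-effect estimate; under approach (b), the same improvement is folded into a variance estimate, which can leave it undetected as a systematic effect.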
We did not detect a significant learning effect between the second and third examinations of the same eye in all participants of group 1, or between the third and fourth examinations of the session, performed on the second eye, in participants of group 2. On the whole, these results suggest that no significant systematic changes are likely to occur after the first test, even when testing the fellow eye for the first time. This suggests that the results of examinations following the first are likely to represent the true retinal sensitivity of participants, with inherent variability included.
A previous report found higher retinal sensitivities in the second eye tested when both eyes were assessed within the same session, and suggested that adaptation to the low background luminance of the microperimeter (1.27 cd/m2) may be responsible for this observation (Notaroberto NF, et al. IOVS 2012;53:ARVO E-Abstract 4828). To avoid the effects of adaptation, we ensured that each test was performed under identical settings by turning the room lighting back on after each test. Therefore, the systematic increases observed in our study and in previous studies are more likely the result of a learning effect than of adaptation.
Additionally, there was no significant learning effect between the first and second examinations when a subset of participants in group 1 was reviewed approximately 6 months after the initial examination, suggesting that the results of the first examination at this visit are likely to represent the true retinal sensitivity, with inherent variability included. Whether a significant intersession learning effect exists remains unknown and could not be determined from this study, because clinical progression and functional decline were observed in this subset of participants. Although some learning between sessions is possible, it is likely to be small if the true value can be accurately established at each visit. Further studies are required to confirm this, and to establish whether the intrasession learning effect remains extinguished when participants are retested at longer review intervals.
Our results for test–retest variability are comparable to findings from previous studies. Chen and colleagues20 reported a PWS CoR of ±5.56 dB using the MP-1 with 68 test stimuli. This is similar to the PWS CoR of ±4.76 dB obtained under similar conditions in this study (group 2; brief training with an experienced clinician before commencing the examination) in eyes with AMD using 37 test stimuli. The larger CoR obtained by Chen and colleagues20 may be attributed to the greater number of test stimuli used in their study, resulting in longer test durations of approximately 12 minutes, compared with approximately 5.5 minutes in this study.
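For readers unfamiliar with the metric, the short sketch below shows one common way a point-wise coefficient of repeatability can be computed, assuming the Bland–Altman form (CoR = 1.96 × SD of the test–retest differences); the exact computation used in this study and by Chen and colleagues is not restated here, and the sensitivity values are made up for illustration.

```python
# Minimal sketch of a Bland-Altman coefficient of repeatability (assumed form,
# not necessarily the study's exact computation). Arrays hold point-wise
# sensitivities (dB) from two examinations of the same eye, matched stimulus
# by stimulus; the values are synthetic.
import numpy as np

test1 = np.array([14, 16, 12, 18, 15, 13, 17, 16, 14, 15], dtype=float)
test2 = np.array([15, 17, 11, 18, 16, 14, 18, 15, 15, 16], dtype=float)

differences = test2 - test1
bias = differences.mean()              # mean test-retest difference (dB)
cor = 1.96 * differences.std(ddof=1)   # coefficient of repeatability (dB)

print(f"Mean difference (bias): {bias:+.2f} dB")
print(f"Coefficient of repeatability: +/-{cor:.2f} dB")
```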
In addition to determining the intrasession test–retest variability under different conditions, this study also highlights the benefits of performing point-wise analysis rather than averaging multiple measurements with microperimetry. Averaged values are useful in providing an overall representation of the retinal sensitivity over the area averaged, and they reduce the test–retest confidence limits in a similar way to obtaining multiple measurements over the sampled area. However, depending on the spatial density and size of the stimuli, averaged values may underestimate localized pathological changes that occur in AMD,6,8–10,23–27 such as areas that subsequently develop GA11 or the expansion of a slow-progressing atrophic lesion.21 This is particularly important when the distance between each test stimulus is larger than the size of the lesion of interest or the expected rate of progression. Point-wise analysis of a region of interest allows localized changes occurring at each test stimulus to be better represented during analyses.
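The magnitude of this masking effect is easy to demonstrate. The sketch below uses synthetic values (not study data) to show that a deep loss at a single stimulus barely moves the mean over a 37-point grid, while remaining obvious on point-wise comparison.

```python
# Minimal illustration (synthetic values) of why averaging can underestimate a
# localized change: a 10 dB loss at one of 37 stimuli shifts the grid mean by
# well under 0.3 dB, but is unmistakable point-wise.
import numpy as np

rng = np.random.default_rng(0)
baseline = rng.normal(loc=15.0, scale=1.0, size=37)   # 37 test stimuli (dB)

follow_up = baseline.copy()
follow_up[5] -= 10.0          # deep localized loss at a single stimulus

print(f"Change in mean sensitivity: {follow_up.mean() - baseline.mean():+.2f} dB")
print(f"Largest point-wise change:  {(follow_up - baseline).min():+.2f} dB")
```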
The findings of this study have several implications for the design and analysis of clinical trials that use microperimetry as an outcome measure. First, the learning effect evident between the first and second examinations suggests that functional defects observed on the first test may be falsely identified. This was especially evident in healthy participants, in whom functional defects (which were not expected in eyes without pathology) were present during the first examination but consistently disappeared on subsequent examinations. Second, discarding the first examination of the first session can reduce test–retest variability influenced by a learning effect and allow the test–retest confidence limits to be applied more accurately to both eyes. We also noted in this study that a short practice examination (approximately 1 minute in duration) is not sufficient to eliminate this learning effect under these test conditions, and performing a full examination as training would be prudent. Although the results of this study suggest that a learning effect is absent on repeat examination 6 months after the initial examination, it is not known whether this holds with longer intervals between follow-up examinations. Therefore, discarding the first examination of a session at subsequent visits may still be beneficial in minimizing the test–retest variability, although future studies are required to examine intersession changes. However, caution must be exercised when applying these findings to microperimetric examinations with different test parameters and/or protocols, as the extent of learning and fatigue under those conditions is unknown. The findings of this study should not be generalized to subjects with poor central visual acuity, those with more advanced disease, including central GA or CNV, or those with other macular diseases; the generalizability of these findings remains to be investigated. We have also observed in our clinic that test–retest variability was larger for examinations supervised by less experienced examiners, and we therefore recommend supervision by experienced examiners to minimize this variability. Finally, this study also underscores the benefits of performing point-wise analysis to better reflect the functional changes measured by microperimetry, which averaged values may underestimate.
In summary, we report the intrasession test–retest variability, represented by the PWS CoR, to be ±4.37 dB or less between subsequent examination pairs of the same or fellow eye within the same session. There was a significant learning effect between the first and second examinations within the same session, but not between subsequent examination pairs of the same or second eye. This learning effect was extinguished when a subset of participants was retested 6 months after the initial examination. These findings suggest that the test–retest variability influenced by the learning effect can be minimized by discarding the first examination of participants performing microperimetry for the first time, and they have important implications for the design and analysis of clinical trials and for clinical practice using microperimetry.