Abstract
Purpose:
To demonstrate methods that enable visual field sensitivities to be compared with normative data without restriction to a fixed test pattern.
Methods:
Healthy participants (n = 60, age 19–50) undertook microperimetry (MAIA-2) using 237 spatially dense locations up to 13° eccentricity. Surfaces were fit to the mean, variance, and 5th percentile sensitivities. Goodness-of-fit was assessed by refitting the surfaces 1000 times to the dataset and comparing estimated and measured sensitivities at 50 randomly excluded locations. A leave-one-out method was used to compare individual data with the 5th percentile surface. We also considered cases with unknown fovea location by adding error sampled from the distribution of relative fovea–optic disc positions to the test locations and comparing shifted data to the fixed surface.
Results:
Root mean square (RMS) differences between estimated and measured sensitivities were less than 0.5 dB for the mean surface and less than 1.0 dB for the 5th percentile surface. RMS differences were greater for the variance surface (median, 1.4 dB; range, 0.8–2.7 dB). Across all participants, a median of 3.9% (interquartile range, 1.8–8.9%) of sensitivities fell beneath the 5th percentile surface, close to the expected 5%. Positional error added to the test grid altered the number of locations falling beneath the 5th percentile surface by less than 1.3% in 95% of participants.
Conclusions:
Spatial interpolation of normative data enables comparison of sensitivity measurements from varied visual field locations. Conventional indices and probability maps familiar from standard automated perimetry can be produced. These methods may enhance the clinical use of microperimetry, especially in cases of nonfoveal fixation.
Comparison of patient data to normative data is key to the interpretation of many clinical tests. In standard automated perimetry, sensitivity estimates at each tested location are typically compared with the means and empirical quantiles of normative datasets, with or without correction for general sensitivity loss, to obtain metrics such as total deviation (TD) and pattern deviation (PD) and their associated probability values.1 Calculation of global indices such as mean deviation (MD) and pattern standard deviation (PSD) also relies on knowledge of the mean and variance of normative data.1 Such comparisons enable the clinician to efficiently estimate the likelihood that a patient's test results could be produced by a normal visual system.
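As an illustration of how these indices follow from normative means and variances, the sketch below computes TD, PD, and variance-weighted MD and PSD for hypothetical values. The general-height estimate (85th percentile of TD) and the weighting scheme are common simplified forms used here for illustration, not the exact algorithm of any particular instrument.

```python
import numpy as np

# Hypothetical normative values at four tested locations (dB, dB^2).
norm_mean = np.array([30.0, 29.5, 29.0, 28.0])
norm_var = np.array([1.0, 1.2, 1.5, 2.0])
# Hypothetical patient sensitivities at the same locations (dB).
measured = np.array([29.0, 24.0, 28.5, 27.0])

td = measured - norm_mean            # total deviation at each location
gh = np.percentile(td, 85)           # general-height estimate from TD
pd_map = td - gh                     # pattern deviation at each location
w = 1.0 / norm_var                   # inverse-variance weights
md = np.sum(w * td) / np.sum(w)      # mean deviation (weighted mean of TD)
psd = np.sqrt(np.sum(w * (td - md) ** 2) / np.sum(w))  # pattern standard deviation
```

Probability values would then be obtained by comparing `td` and `pd_map` against empirical quantiles of the normative distribution at each location.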
Gaze-contingent microperimetry (also called fundus perimetry) has gained popularity as a clinical tool for assessing central and paracentral visual function in a variety of ocular conditions (for review, see Ref. 2). Microperimeters make use of patients' habituated fixation patterns, measured by in-built eye tracking, to position the test grid according to the individual patient's fixation (the monocular “preferred retinal locus”). This holds advantages in customizing the test to the individual patient, but creates a problem for comparison to normative data. Because the spatial locations tested vary between patients, sensitivity estimates at individual locations cannot be readily compared with a conventional normative database with fixed test locations. As a result of this limitation, current commercially available microperimeters display only crude normative data comparisons, such as global average sensitivity, and color code individual sensitivity estimates according to an arbitrary scale. Improved normative data comparison would assist clinicians in the detection and characterization of subtle visual defects outside of apparent retinal lesions that may precede further disease progression,3,4 and in the detection of early functional impairments due to retinal lesions.5 Such visual defects may indicate the necessity for treatment to prevent further vision loss.
We aimed to develop methods for improved normative data comparison for gaze-contingent perimetry. We hypothesized that surfaces could be accurately fit to normative data collected using a densely sampled, spatially extensive grid. This would enable comparison of sensitivity at any tested spatial location to corresponding points on the high-resolution normative surfaces. The purpose of this article is to demonstrate the method for deriving and using the normative surfaces, such that these methods may be adopted for wider clinical use with the collection of a larger normative dataset.
Healthy participants (n = 60, median age 26; range, 19–50 years) were recruited from the staff and student population of the University of Nottingham (Nottingham, UK). Inclusion criteria were visual acuity of 0.2 logMAR or better in the tested eye, spherical refractive error within the range that can be compensated for by the MAIA-2 (−15.00 diopters [D] to +10.00 D; CenterVue, Padova, Italy), cylindrical refractive error less than 4.00 D as per the manufacturer's guidelines, and no known current or previous ocular disease. One eye was tested per participant, chosen randomly if both eyes met the inclusion criteria.
Participants undertook MAIA-2 microperimetry using 237 custom test locations placed on a square grid with 1° spacing up to 5° eccentricity and 2° spacing from 5° to 13° eccentricity. Participants were instructed to fixate the standard 0.76° diameter fixation annulus at all times. Testing was broken into four randomly ordered blocks, in each of which an evenly spaced subset of test locations was tested. Testing was completed over one or two study sessions lasting up to 1 hour, incorporating rests between tests as needed.
All participants undertook at least one practice test using the “4-2 Expert” strategy of the MAIA-2 (37-point annular pattern within 5° radius of fixation) before experimental data were collected. Sensitivity thresholds were then estimated using the MAIA-2's standard 4-2 staircase algorithm and Goldmann III (0.43° diameter circular luminance increment) stimuli. It should be noted that the decibel scale used by the MAIA-2 is different to that used by some other perimeters owing to differences in maximum stimulus intensity. Any tests with fixation not classified as “stable” by the MAIA-2 software were discarded and repeated.
Data from left eyes were converted to right eye format. Data from locations with a horizontal eccentricity of +13° were excluded due to encroachment of the physiological blind spot, and data from (0°, 0°) were also excluded because sensitivity at this location is affected by the fixation target.6 This left 228 locations for fitting. The present data were not adjusted for sensitivity decline with ageing because data on this for the MAIA-2 are not available in the public domain. Based on a previous study using the Humphrey Field Analyzer (Carl Zeiss Meditec, Jena, Germany),7 the age-related sensitivity difference between the median and youngest or oldest participants in our dataset is expected to be less than 1 dB, excluding one outlier (age 50) whose difference is expected to be 1.3 dB. These differences were deemed small enough to be within measurement variability in this study.
Though surfaces could be fit to any quantile of the data, for the purpose of demonstrating this approach we fit surfaces to the mean, variance, and empirical 5th percentile of the data at each location. This later enables the calculation of local and summary indices familiar from static automated perimetry, such as TD, PD, MD, and PSD. We initially trialed a variety of spatial interpolation methods, including bilinear interpolation, local regression, and Universal Kriging.8,9 In preliminary testing (data not shown), Universal Kriging most frequently provided the best fits to subsets of the data and so was used for the main study; however, all of the above methods provided clinically acceptable fits to our data once suitable fitting parameters had been found. Universal Kriging is a spatial interpolation technique originally developed for geostatistical applications that estimates interpolated points without penalization for lack of smoothness, thereby predicting the most likely intermediate values given the available measurements.
Surface fitting and all other analyses were carried out in R (ver. 3.2.0)10 using the MASS and spatial packages. Universal Kriging was done with a quadratic trend surface, chosen to reflect the expected shape of the hill of vision, and an exponential covariance matrix, chosen over the alternatives (Gaussian and spherical) as giving the best fits to our data in initial testing. Fitting a surface by Universal Kriging additionally requires the selection of two parameters: a range parameter (d), which determines the range within which surrounding points are considered in choosing intermediate values, and a “nugget parameter” (α), which controls the extent to which local maxima and minima in the data are smoothed. We selected these parameters separately for each of the three surfaces because previous studies of normative perimetric data have shown unequal variance in sensitivity across the visual field;7 we therefore expected the optimal parameters to differ between surfaces.
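A minimal, self-contained sketch of Universal Kriging with a quadratic trend and exponential covariance is given below in Python (the study itself used the R spatial package). The default `d` and `alpha` values are hypothetical placeholders, not the parameters selected in the study.

```python
import numpy as np

def quad_trend(x, y):
    """Quadratic trend basis, reflecting the expected hill-of-vision shape."""
    return np.column_stack([np.ones_like(x), x, y, x * x, x * y, y * y])

def uk_predict(xs, ys, zs, xq, yq, d=5.0, alpha=0.1):
    """Universal Kriging sketch: d is the range parameter, alpha the nugget.
    Both default values are hypothetical, for illustration only."""
    pts = np.column_stack([xs, ys])
    D = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
    K = np.exp(-D / d) + alpha * np.eye(len(xs))  # exponential covariance + nugget
    F = quad_trend(xs, ys)                        # trend basis at data points
    n, p = F.shape
    # Augmented system enforcing unbiasedness with respect to the trend.
    A = np.block([[K, F], [F.T, np.zeros((p, p))]])
    dq = np.linalg.norm(pts - np.array([xq, yq]), axis=1)
    rhs = np.concatenate([np.exp(-dq / d),
                          quad_trend(np.array([xq]), np.array([yq]))[0]])
    lam = np.linalg.solve(A, rhs)[:n]             # kriging weights
    return lam @ zs
```

Because the augmented system constrains the weights to reproduce the trend basis exactly, data lying exactly on a quadratic surface are interpolated without error, whatever the nugget; the nugget affects only how strongly local deviations from the trend are smoothed.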
To select the range (d) and nugget (α) parameters for each surface, we used a grid search to trial all combinations of d in {1, 1.5, …, 10} and α in {0, 0.05, …, 1} (399 parameter combinations in total). Goodness-of-fit was assessed as follows:
1. Fit the surface with the chosen parameters to the data, excluding a randomly selected subset of 50 locations.
2. Calculate the root mean square (RMS) difference between the predicted and actual values at the excluded subset of locations.
3. Repeat steps 1 and 2 200 times, and take the mean of the RMS differences as an overall measure of goodness-of-fit.
For each surface the parameter combinations with the lowest mean RMS difference across the 200 repeats were chosen for the final surfaces, which were then fit to all locations.
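The hold-out loop in steps 1 to 3 can be sketched as below. To keep the sketch self-contained, a simple inverse-distance-weighting interpolator stands in for Universal Kriging; in the study, the grid search would wrap this routine in a loop over the 399 (d, α) combinations and keep the pair with the lowest mean RMS.

```python
import numpy as np

rng = np.random.default_rng(0)

def idw_predict(xs, ys, zs, xq, yq, power=2.0):
    """Inverse-distance-weighting placeholder for the fitted surface."""
    dist = np.hypot(xs - xq, ys - yq)
    if np.any(dist == 0):            # query coincides with a data point
        return zs[np.argmin(dist)]
    w = 1.0 / dist ** power
    return np.sum(w * zs) / np.sum(w)

def mean_rms(xs, ys, zs, n_excluded=50, repeats=200):
    """Steps 1-3: hold out locations, predict them from the rest,
    and average the RMS prediction error over the repeats."""
    n, rms = len(xs), []
    for _ in range(repeats):
        hold = rng.choice(n, size=n_excluded, replace=False)
        keep = np.setdiff1d(np.arange(n), hold)
        pred = np.array([idw_predict(xs[keep], ys[keep], zs[keep], xs[i], ys[i])
                         for i in hold])
        rms.append(np.sqrt(np.mean((pred - zs[hold]) ** 2)))
    return float(np.mean(rms))
```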
The overall efficacy of the surface fitting approach was assessed in three ways. First, the above resampling procedure was repeated 1000 times using the chosen parameters, and the RMS difference from measured sensitivity at the excluded locations was assessed. Because these excluded locations are spatially more distant from their nearest included locations than any interpolated point would be when all data are used, and because no constraint prevented excluded locations from clustering together to form large areas without data, this method gives a lower bound on the accuracy of the surface fitting. Second, a leave-one-out method was used to compare each individual subject's dataset with surfaces fitted to the 5th percentile of the remaining subjects' data at each location, using the final chosen fitting parameters. Should the method be successful, we expected 5% of tested locations to fall beneath the 5th percentile surface (this is analogous to TD probability analyses in standard automated perimetry). Third, we considered instances in which the location of the anatomical fovea is unknown, as may be the case in central retinal pathology, by spatially shifting the test grid position by an amount sampled from the distribution of relative fovea–optic disc positions in our dataset and comparing the shifted data to the fixed fitted surface. This sampling procedure was repeated 1000 times per participant and was intended to simulate the situation where the best available estimate of the position of the anatomical fovea is derived by assuming that it lies at the population average position relative to the optic nerve head.
Microperimetric testing of visual function at extrafoveal regions may be clinically useful in patients with existing macular pathology. When the test grid is positioned wholly or partly away from the region of damage, either manually by the clinician or by the patient fixating eccentrically, abnormal sensitivity estimates in retinal areas without apparent structural damage3,4 (Krishnan A et al. IOVS 2016;57:ARVO E-Abstract 6100) may provide a useful early biomarker of disease progression, potentially allowing treatment to be initiated to prevent further spread of pathology and associated vision loss. Microperimeters also typically allow custom placement of test locations, which may be useful in the assessment of macular visual function in glaucoma, where spatially accurate stimulus placement relative to retinal structures is indicated by recent work.12 However, the lack of tools for comparison to normative data hampers the clinical use of microperimetry, as it is difficult to separate measurements of healthy visual function from those likely to be affected by disease. Previous normative studies for microperimetry have concentrated on common test patterns centered on the fovea,13–18 which limits their use to patients with central fixation, and to cases where the clinician elects to test the central region using the same pattern of test locations.
In this paper, we have demonstrated a method for developing a normative database for gaze-contingent microperimetry using a dense, spatially extensive grid that can be broken down into several sparse grids for ease of testing and combined post hoc. Accurate spatial interpolation between tested locations is then possible by surface fitting, enabling any spatial location tested in a given patient to be compared with normative data. By fitting further surfaces to the variance and empirical percentiles of the data it is additionally possible to derive summary indices familiar from standard automated perimetry, such as MD and PSD, as well as probability maps for TD and PD. With our data we were able to fit surfaces to the mean and 5th percentile sensitivities with estimated upper bounds on their difference from measured values of less than 0.5 and 1.0 dB, respectively, both well within measurement precision in clinical perimetry.19–22 The surface fit to the variance had larger error, with an estimated upper bound of median 1.4 dB (range, 0.8–2.7 dB), though this is still within typical measurement variability. This was due to larger variation in the variance across the tested visual field region, meaning that the inclusion or exclusion of some points whose variance in sensitivity was markedly different from nearby points had a large effect on the surface fit. This larger error affects only the accuracy of global summary indices that require data on variance, such as MD and PSD; point estimates of TD and PD and their associated probability maps do not use this surface. It is further worth noting that the metrics reported herein can be considered lower bounds on the accuracy of the surface fitting approach (upper bounds on fitting error), because in a clinical application no test locations would be excluded from the fit.
One limitation to normative data comparison away from the fovea, regardless of method, is positional uncertainty. In a patient with a structurally intact anatomical fovea it would be possible for the clinician or the instrument to accurately identify the location of the fovea, and then the test grid position would be accurately known. However, a major clinical use for this technique is in macular pathology, where it may be impossible to accurately identify the location of the anatomical fovea. In these cases, our chosen method is to assume that the anatomical fovea lies in the population average position relative to the optic disc. In the MAIA-2 microperimeter the optic disc center is identified by the clinician at the start of the test and its position is recorded in the output data. Clearly, this method is subject to population variation in relative fovea–optic disc position23–25 and to additional variation from the selection of the optic disc center by the clinician. Both types of variation are captured in our data, which suggest that with this measurement method the distribution of optic disc position relative to the fovea is (15.54 ± 1.05°, 2.12 ± 0.85°) (mean ± SD, in right eye format). By repeatedly adding error sampled from this distribution to the position of the test grid, we found that the change in the number of locations falling beneath the 5th percentile surface was less than 3 of 228 (1.3%) in 95% of patients. This small proportion suggests that assuming the location of the fovea in this way is likely to be suitable for most clinical purposes.
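A small simulation of this positional-error analysis is sketched below, assuming a hypothetical smooth 5th percentile surface and synthetic near-threshold data; only the sampled shift SDs (1.05° horizontal, 0.85° vertical) come from the text.

```python
import numpy as np

rng = np.random.default_rng(1)

def p5_surface(x, y):
    """Hypothetical stand-in for the fitted 5th percentile surface."""
    return 28.0 - 0.08 * (x ** 2 + y ** 2)

# Synthetic "patient" grid centred on the assumed fovea position, with
# sensitivities averaging ~1 dB above the 5th percentile surface.
gx, gy = np.meshgrid(np.arange(-5.0, 6.0), np.arange(-5.0, 6.0))
xs, ys = gx.ravel(), gy.ravel()
sens = p5_surface(xs, ys) + rng.normal(1.0, 0.5, xs.size)

base = int(np.sum(sens < p5_surface(xs, ys)))  # count below surface, no shift
changes = []
for _ in range(1000):
    # Positional error with the SDs reported in the text (degrees).
    dx, dy = rng.normal(0.0, 1.05), rng.normal(0.0, 0.85)
    shifted = int(np.sum(sens < p5_surface(xs + dx, ys + dy)))
    changes.append(abs(shifted - base))
# np.percentile(changes, 95) then summarises how far the count can move.
```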
The normative database collected for this study comprised 60 healthy volunteers with a narrow range of ages, collected from a single site. Although this database is suitable for the proof-of-concept use here, it is not large enough and does not have a wide enough age range to stand as a definitive normative database. Further data need to be collected using this method to construct such a database.
Further study on the optimal trade-off between sensitivity and eccentricity for selection of a preferred retinal locus may help to develop the role of microperimetry in optimizing visual function in patients with untreatable macular damage. Commercially available microperimeters already feature biofeedback training paradigms that enable a chosen retinal location to be trained as a preferred retinal location. The use of PD probability maps may aid clinicians in the selection of locations to train for this purpose.
In conclusion, we have demonstrated methods that enable normative data comparison in gaze-contingent microperimetry. Our methods enable sensitivity measured at any spatial location, regardless of fixation, to be compared with normative values. Further, our methods enable pointwise and summary indices and probability maps, familiar from standard automated perimetry, to be calculated in microperimetry, despite variations in test pattern and location. With the collection of larger normative datasets using these methods, the clinical use of microperimetry as a tool to measure visual function in central retinal pathology could be enhanced.
The authors thank Iram Ali and Helen Baggaley for assistance with data collection. Supported by grants from the College of Optometrists Postdoctoral Award (JD, ATA; London, UK) and a National Institute for Health Research (NIHR; UK) Postdoctoral Fellowship (ATA). This report presents independent research funded by the NIHR. The views expressed are those of the authors and not necessarily those of the NHS, the NIHR or the Department of Health.
Presented at the ARVO Annual Meeting, Seattle, WA, USA, May 2016.
Disclosure: J. Denniss, None; A.T. Astle, None