Abstract
Purpose:
To assess the diagnostic performance of a novel, automated, noninvasive measure of tear film stability derived from Placido disc videokeratography, the tear film surface quality breakup time (TFSQ-BUT), as a clinical marker for diagnosing dry eye disease (DED) relative to a standard of tear hyperosmolarity.
Methods:
This prospective, cross-sectional study involved 45 participants (28 DED, 17 controls). Symptoms (Ocular Surface Disease Index) and signs (tear osmolarity, TFSQ-BUT, tear breakup time measured with sodium fluorescein [NaFl-BUT], ocular surface staining and Schirmer test with topical anesthesia) of DED were assessed. Three measures of TFSQ-BUT and NaFl-BUT were taken per eye; “first,” “average,” and “shortest” BUT were analyzed separately. Optimal diagnostic cutoff values were determined using the Youden Index. The repeatability of the automated TFSQ-BUT was compared with, and its agreement evaluated against, noninvasive BUT assessed manually by two clinicians (CNI-BUT). Repeatability of methods was assessed using the geometric coefficient of variation (gCoV, %). Agreement between methods was assessed with Bland-Altman analysis.
Results:
Eyes with DED had significantly shorter TFSQ-BUTs than controls (P < 0.05). There was a significant, moderate correlation between both shortest and average TFSQ-BUT and NaFl-BUT (r = 0.35, P = 0.02 and r = 0.38, P = 0.01, respectively). The receiver-operator characteristic (ROC) curve for shortest TFSQ-BUT showed an area under the curve of 0.92 (P < 0.0001). Shortest TFSQ-BUT with a criterion of 12.1 seconds had a sensitivity of 82% and specificity of 94% for diagnosing DED against tear hyperosmolarity. Automated TFSQ-BUT showed less variability (gCoV = 9.4%, 95% confidence interval [CI]: 7.1%–14.0%) than CNI-BUT (gCoV = 27.0%, 95% CI: 19.62%–41.06%, P < 0.05).
Conclusions:
Automated TFSQ-BUT is a repeatable, noninvasive clinical marker with both high sensitivity and specificity for tear hyperosmolarity.
Dry eye disease (DED) is a “multifactorial disease of the tears and ocular surface that results in symptoms of discomfort, visual disturbance, and tear film instability. It is accompanied by increased osmolarity of the tear film and inflammation of the ocular surface.”
1 Dry eye disease is underwritten by perturbations to the lacrimal functional unit (LFU), consisting of the lacrimal gland and its accessory glands, cornea, conjunctiva, meibomian glands, eyelids, and their associated sensory and motor nerves.
2 Under physiologic conditions, the integrated LFU regulates the secretion, distribution, and clearance of tears, in response to endogenous and exogenous factors, to preserve ocular surface integrity.
2 Disruption to one or more components of the LFU can result in a loss of tear homeostasis, thereby leading to tear film dysfunction.
Tear hyperosmolarity is a key feature of DED.
1,3,4 Reduced aqueous production and/or excessive tear evaporation decrease(s) tear film volume and increase(s) tear protein and electrolyte concentration.
5,6 A meta-analysis of 16 studies identified a reference value of 316 mOsmol/L to be specific for diagnosing clinically significant DED, providing a sensitivity (true positive rate) of 73% and specificity (true negative rate) of 90%.
7 Most studies included in this analysis used laboratory-based, freezing-point depression techniques to measure tear osmolarity,
7 which is regarded as the “gold standard” method of assessment. This technique differs from electrical impedance osmometry, used by the osmolarity system (TearLab; TearLab Corp., San Diego, CA, USA). While measurements using both methods to quantify tear osmolarity from human tear samples have been reported to correlate,
8 differences in measures have also been described for assaying standard solutions of known osmolarity.
9,10 Measurement reliability with the TearLab system can be influenced by a range of factors, such as ambient temperature,
11 which may account for differences in findings between studies.
A number of studies have reported a lack of correlation between tear osmolarity measures and traditional dry eye diagnostic tests, including the Schirmer test, sodium fluorescein tear breakup time (NaFl-BUT), and corneal staining with vital dyes.
12–16 However, such findings, which speak to the complexity surrounding the clinical diagnosis of DED,
17 are not unanticipated. In contrast to tear osmolarity, which is objective, quantitative, and can be measured through acquiring a minute (50 nL) tear volume, most traditional diagnostic tests for DED are invasive (artificially disrupting tear film status) and rely on subjective assessment. A quantitative, automated, clinical parameter that noninvasively assesses tear film integrity might therefore be expected to more closely correlate with tear film osmolarity measurements and serve as a useful surrogate diagnostic marker for DED.
In this regard, a relevant potential candidate is noninvasive tear breakup time (NI-TBUT), which is intended to quantify natural tear film stability.
18 Although tear hyperosmolarity and tear instability are both key components in the definition of DED,
1 to the author's knowledge, this is the first study to have directly considered their potential relationship. This study tests a hypothesis that a novel, automated noninvasive measure of tear stability, herein termed the tear film surface quality breakup time (TFSQ-BUT) derived from dynamic-area high-speed Placido disc videokeratography, is a useful surrogate marker for diagnosing DED, relative to a current standard of tear hyperosmolarity.
Dynamic-area, high-speed Placido disc videokeratography was performed using the E300 corneal topographer (Medmont International Pty Ltd., Victoria, Australia). Participants were instructed to focus on the central fixation target, gently blink twice and then to suppress blinking during the capture period. Using a frame rate of four photokeratoscopic images per second (4 Hz), a video was captured of the reflected Placido disc mires for up to 23 seconds post-blink. Three measures were taken on each eye, alternating between eyes, with right eyes measured first. Participants were advised to blink freely between each measurement.
The E300 corneal topography system noninvasively assesses changes in tear film stability by analyzing the structure of the reflected Placido disc image. The software calculates TFSQ values at 300 radial analysis points along each of the 32 rings, using an approach similar to the block-feature TFSQ indicator described by Alonso-Caneiro and colleagues.
22 However, whereas that paper calculates the block-feature TFSQ from raw image data, the Medmont software leverages the existing E300 topographical analysis algorithm to extract the ring location data from the image. Significantly, the topographical analysis algorithm is able to identify and eliminate images with excessive movement and to correctly identify ring reflections in areas that contain shadows from eyelashes. The local TFSQ value at a given analysis point is calculated by finding the SD of the radial distances to the next innermost ring for n = 8 points (9.6°) either side of the analysis point.
First, the average ring width at the point i is calculated as:
The standard deviation of the widths of the surrounding n = 8 points is then calculated by:
Finally, the local TFSQ value is calculated as a dimensionless value:
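The display equations for these three steps are not present in this version of the text. The following is a reconstruction that is consistent with the prose description (mean ring width, SD of the widths, and a dimensionless ratio); it should not be read as the published formulas. The symbols are assumptions introduced here: w_j is the radial distance to the next innermost ring at analysis point j, and the window is taken to include the point i itself together with n = 8 points on either side. Whether the original divisor is 2n + 1 or 2n, and whether the center point is included, cannot be confirmed from the text.

```latex
% Assumed reconstruction; w_j, \bar{w}_i and \sigma_i are notation introduced here.
\bar{w}_i = \frac{1}{2n+1} \sum_{j=i-n}^{i+n} w_j

\sigma_i = \sqrt{\frac{1}{2n+1} \sum_{j=i-n}^{i+n} \left( w_j - \bar{w}_i \right)^2}

\mathrm{TFSQ}_i = \frac{\sigma_i}{\bar{w}_i}
```

Under this reading the local TFSQ value is a coefficient of variation of ring width, which is dimensionless, as the text requires.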
A local TFSQ value of 0.30 or greater corresponds to visible distortion in the ring pattern. The following novel parameters, based upon the TFSQ index,
22 are then defined: (1) TFSQ-Area (%): the percentage of the tear film examination area (defined by a 7 mm diameter) with a TFSQ index greater than 0.30; and (2) TFSQ-BUT: time (in seconds) at which the TFSQ-Area (%) is calculated to be at least 5.0% in two consecutive photokeratoscopic images.
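The two derived parameters can be illustrated with a short Python sketch. It is a simplified illustration under stated assumptions, not the Medmont implementation: the function names are hypothetical, the local TFSQ is taken to be SD divided by mean ring width over a window that wraps around the ring, every analysis point is weighted equally when computing the area percentage, the values passed in are assumed to lie within the 7 mm examination zone already, the first frame is assumed to be at 0 seconds, and the breakup time is reported as the timestamp of the first of the two qualifying frames. Only the thresholds (0.30 and 5.0%) and the 4 Hz frame rate come from the text.

```python
from statistics import mean, pstdev

TFSQ_THRESHOLD = 0.30  # local TFSQ above this counts toward TFSQ-Area (from the text)
AREA_THRESHOLD = 5.0   # TFSQ-Area (%) that marks breakup (from the text)
FRAME_RATE_HZ = 4.0    # photokeratoscopic images per second (from the text)


def local_tfsq(widths, i, n=8):
    """Assumed form: SD / mean of ring widths for n points either side of point i.

    `widths` holds the radial distance to the next innermost ring at each
    analysis point of one ring; indices wrap because the ring is closed.
    """
    m = len(widths)
    window = [widths[(i + k) % m] for k in range(-n, n + 1)]
    return pstdev(window) / mean(window)


def tfsq_area(tfsq_values, threshold=TFSQ_THRESHOLD):
    """Percentage of analysis points whose local TFSQ exceeds the threshold.

    Simplification: each point counts equally, rather than by the area it covers.
    """
    flagged = sum(1 for v in tfsq_values if v > threshold)
    return 100.0 * flagged / len(tfsq_values)


def tfsq_but(area_by_frame, frame_rate_hz=FRAME_RATE_HZ,
             area_threshold=AREA_THRESHOLD):
    """Time (s) of the first of two consecutive frames with TFSQ-Area >= threshold.

    Returns None if the criterion is never met during the recording.
    """
    for k in range(len(area_by_frame) - 1):
        if (area_by_frame[k] >= area_threshold
                and area_by_frame[k + 1] >= area_threshold):
            return k / frame_rate_hz
    return None
```

Requiring two consecutive frames means that a single noisy frame (for example, one affected by a small eye movement) does not trigger a breakup reading on its own.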
Figure 1 shows representative TFSQ index color maps, taken at 0.5, 12.5, and 20.3 seconds post-blink (Figs. 1A–C). Increasing tear instability is visually appreciable by a progressive increase in the proportion of warmer colors (yellow and red) in the TFSQ map. Enlargements of the Placido disc rings show an absence of distortion at 0.5 seconds post-blink (Fig. 1D), early tear film breakup at 12.5 seconds (Fig. 1E), and extensive distortion of the mires by 20.3 seconds (Fig. 1F).
One drop of 0.5% proxymetacaine hydrochloride (Alcon Laboratories, New South Wales, Australia) was instilled into each inferior conjunctival sac. After 4 minutes, the folded edge of a sterile Schirmer strip (EagleVision, Memphis, TN, USA) was gently inserted at the junction of the middle and lateral thirds of each lower lid margin. Participants were instructed to close their eyes in the dimly lit room. After 5 minutes, the length of strip wetting was recorded in millimeters.
A subset of Placido-disc videos (n = 20) was used for repeatability and agreement tests on the automated TFSQ-BUT. Each 4-Hz video was split into two 2-Hz videos, using a custom function (supplied by Medmont Pty Ltd., Victoria, Australia) that allocated alternate even- and odd-numbered images within the video to separate exams. Two research clinicians viewed the videos of the Placido ring reflections (in the absence of the TFSQ color map overlay) and manually assessed the noninvasive tear breakup time in seconds (termed the clinician-derived noninvasive breakup time, CNI-BUT), as indicated by the first observation of distortion of the Placido rings. Following an initial presentation of videos (n = 20) derived from even-numbered images, videos consisting of odd-numbered images were randomly presented for a repeat assessment. All assessments were undertaken in one sitting, under normal room illumination. Automated TFSQ-BUTs were also calculated for each video.
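The splitting step can be sketched in Python as follows. This is not the custom Medmont function, which is not described in detail; the function name is hypothetical, and the sketch assumes frames are numbered from 0, so that the "even" exam starts with the first captured image.

```python
def split_alternate_frames(frames):
    """Split one frame sequence into two interleaved half-rate sequences.

    For a 4 Hz recording, each returned sequence is a 2 Hz recording of the
    same blink interval, offset from the other by one original frame.
    """
    even_frames = frames[0::2]  # frames 0, 2, 4, ...
    odd_frames = frames[1::2]   # frames 1, 3, 5, ...
    return even_frames, odd_frames
```

Because both half-rate videos come from the same interblink period, differences between the two assessments reflect the measurement method rather than blink-to-blink changes in the tear film.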
Intramethod repeatability, being the variability in automated TFSQ-BUT and CNI-BUT between repeated trials, was examined using a geometric coefficient of variation (gCoV, %) as described by Hopkins
24 and Vaz.
25 Intermethod (automated versus clinician-derived noninvasive BUT) variability was examined using Bland-Altman analysis.
26 The mean difference (bias) and limits of agreement (LoA, defined as bias ± two SDs of the differences) were calculated. The method described by Carkeet,
27 which takes into account sample size, was used to calculate exact 95% confidence limits (CLs) for the LoAs.
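The two statistics can be sketched in Python as below. The function names are hypothetical, and the gCoV formula is an assumption about how the cited method was applied: the typical error is taken as the SD of the differences between log-transformed paired trials divided by √2, then back-transformed to a percentage. The exact confidence limits of Carkeet are not implemented here.

```python
import math
from statistics import mean, stdev


def geometric_cov(trial_1, trial_2):
    """Geometric coefficient of variation (%) for two repeated trials.

    Assumed form: typical error on the natural-log scale, back-transformed
    as 100 * (exp(s) - 1). All values must be positive.
    """
    log_diffs = [math.log(b) - math.log(a) for a, b in zip(trial_1, trial_2)]
    typical_error = stdev(log_diffs) / math.sqrt(2)
    return 100.0 * (math.exp(typical_error) - 1.0)


def bland_altman(method_a, method_b, k=2.0):
    """Return (bias, lower LoA, upper LoA) for paired measurements.

    LoA = bias +/- k * SD of the paired differences, with k = 2 as in the text.
    """
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - k * sd, bias + k * sd
```

Working on the log scale makes the variability estimate proportional to the size of the measurement, which suits breakup times because longer times tend to vary more in absolute terms.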
Data were analyzed using spreadsheet software (Microsoft Excel; Microsoft Office for Mac 2011, version 14.4.1, Microsoft Corp., Redmond, WA, USA) and graphing software (GraphPad Prism 5; GraphPad Software, San Diego, CA, USA). Descriptive statistics are summarized as mean ± SD. A Kolmogorov-Smirnov test was used to assess for normality of continuous variables. Comparisons between groups were undertaken using either a t-test or Mann-Whitney U test, as appropriate. A χ2 test was used to compare discrete variables. Pairwise correlations between TFSQ-BUT and NaFl-BUT parameters were explored using Spearman's correlation coefficient (r). An alpha value of 0.05 was adopted for statistical significance.
To investigate the diagnostic capacity of a number of TFSQ-BUT and NaFl-BUT parameters (i.e., shortest, average and first of the three measured BUTs using each method), the sensitivity and specificity for each parameter were determined against a standard diagnostic criterion for DED (tear hyperosmolarity of ≥316 mOsmol/L).
7 Receiver-operator characteristic (ROC) curves, showing sensitivity versus false positive rate (1 – specificity), were plotted and used to evaluate the discriminative capacity of the BUT parameters. The area under the ROC curve (AUC) was calculated to provide a measure of the overall performance of each BUT parameter (i.e., a measure of diagnostic accuracy). Optimal diagnostic cutoff values for BUT parameters were determined using the Youden index, which measures the vertical distance of each point on the ROC curve from the identity (diagonal) line (i.e., sensitivity + specificity – 1) for each criterion; the criterion with the maximum value is taken as the optimal cutoff.
28 Likelihood ratios (LRs), being the gradient of the ROC curve at the cutoff criterion, were calculated. Likelihood ratios represent the ratio between the probability of a positive test result given the presence of disease and the probability of a positive test result given the absence of disease.
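The cutoff selection can be sketched in Python as follows. The function names are hypothetical and this is not the analysis software used in the study. The sketch assumes that a breakup time at or below the cutoff counts as a positive test, that candidate cutoffs are the observed values themselves, and that the first cutoff reaching the maximum Youden index is reported. The likelihood ratio shown is the positive likelihood ratio as defined in words above.

```python
import math


def sensitivity_specificity(diseased, healthy, cutoff):
    """Sensitivity and specificity when BUT <= cutoff is a positive test."""
    true_pos = sum(1 for v in diseased if v <= cutoff)
    true_neg = sum(1 for v in healthy if v > cutoff)
    return true_pos / len(diseased), true_neg / len(healthy)


def youden_optimal_cutoff(diseased, healthy):
    """Return (cutoff, J, sensitivity, specificity) maximizing J.

    J = sensitivity + specificity - 1 (the Youden index).
    """
    best = None
    for cutoff in sorted(set(diseased) | set(healthy)):
        se, sp = sensitivity_specificity(diseased, healthy, cutoff)
        j = se + sp - 1.0
        if best is None or j > best[1]:
            best = (cutoff, j, se, sp)
    return best


def positive_likelihood_ratio(se, sp):
    """P(positive | disease) / P(positive | no disease)."""
    return math.inf if sp == 1.0 else se / (1.0 - sp)
```

Here `diseased` and `healthy` would hold the BUT values (in seconds) of eyes above and below the ≥316 mOsmol/L osmolarity criterion, respectively.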
This paper reports that a novel, automated noninvasive measure of tear film stability derived from Placido disc videokeratography (TFSQ-BUT) has sufficient discriminative capacity to be a valuable marker of tear hyperosmolarity in DED. Using a cutoff value of 12.1 seconds, the shortest TFSQ-BUT (of three repeat measures) showed high sensitivity (82%) and specificity (94%) for diagnosing moderate to severe DED against tear hyperosmolarity; this diagnostic performance was superior to traditional NaFl-BUT measures. A modest correlation (r ≈ 0.4) was evident between TFSQ-BUT and NaFl-BUT parameters. Automated measures of TFSQ-BUT were significantly more repeatable than clinician-derived estimates of noninvasive tear breakup time, supporting the utility of this examiner-independent measure in clinical settings.
The diagnosis of DED is recognized to be challenging, requiring multiple clinical techniques to obtain a complete impression of a patient's ocular surface health and tear film integrity.
17 Tear hyperosmolarity, particularly with the application of freezing-point depression techniques, has been reported to have superior overall diagnostic capacity for DED compared with a range of standard diagnostic tests.
1 Despite this, a number of barriers to its widespread clinical adoption, even with the clinically-practicable system (TearLab Corp.) that utilizes electrical impedance, have been described.
29 Tear osmolarity assessment requires specialized instrumentation, daily calibration of this equipment, and test card consumables. Obtaining reliable tear osmolarity measurements requires that the temperature of the clinical environment be appropriately regulated.
11 Adequate control of such variables is critical for the reliability of the assay and was carefully maintained in this study. Therefore, clinicians utilizing tear osmolarity measures for diagnosing DED need to be aware of the importance of controlling for such factors.
Recent studies of the self-reported clinical practice behaviors of eye care clinicians in multiple demographics show that NaFl-BUT remains one of the most common diagnostic tests for DED.
30–34 Clinical preferences for NaFl-BUT measurements are not surprising; the test can be rapidly undertaken and is inexpensive to perform. However, a major disadvantage of this test is that the instillation of fluid into the eye destabilizes the tear film.
35 Measurement of NaFl-BUT can also be influenced by factors such as pH, the volume of fluorescein instilled, slit lamp illumination technique and the clinician's expertise.
17,18 Using microquantities of NaFl, as was undertaken for this study, improves the reproducibility of NaFl-BUT measurements.
36 With such volumes, a NaFl-BUT of 5 seconds or less has been reported to differentiate between dry eye and control populations; however, sensitivity and specificity calculations for this cutoff value were not provided.
37
This study reports that when measuring three consecutive NaFl-BUTs (with a 1 μL NaFl volume), using the first measure with a cutoff of 8.0 seconds has 78% sensitivity and 72% specificity for diagnosing DED against tear hyperosmolarity. These diagnostic performance values are similar to those previously reported for NaFl-BUT parameters against different diagnostic parameters for DED.
3,38 Although the ROC characteristics showed a trend toward first NaFl-BUT having superior discriminative capacity relative to average and shortest NaFl-BUT, this was not significant (P > 0.05). These findings support the conclusions of Papas,
39 who noted the first determination of NaFl-BUT to be as good as any measure in a sequence of consecutive readings.
In recent years, there has been a research-driven trend to develop relatively less invasive measures of tear stability.
18 Clinical devices that currently exist for this purpose include the Tearscope (Keeler Ophthalmic Instruments, USA) and OCULUS Keratograph (Oculus, Inc., Wetzlar, Germany). While the Tearscope relies upon the subjective determination of BUT, the OCULUS Keratograph provides an automated quantification. Evaluation of the OCULUS Keratograph has identified the need for a calibration offset for its noninvasive BUT measurements to be comparable with other devices
40; whether a newer release of this instrumentation (OCULUS Keratograph 5M) has led to such improvements remains to be determined. Although the Tearscope has been reported to be more reliable than other techniques, such as slit lamp observation or viewing videokeratoscope mires, measurement variability is significant
41 and this effect is exaggerated with multiple examiners.
42 This study supports these findings, reporting significantly poorer repeatability for CNI-BUT measures than automated TFSQ-BUTs. The degree of repeatability of the automated TFSQ-BUT (gCoV = 9.4%) was found to be similar to other contemporary ophthalmologic devices, including those used to quantify anterior segment biometry,
43 ocular bulbar redness,
44 and iridocorneal angle.
45
Given that tear instability and hyperosmolarity coexist in DED,
1 there is strong scientific rationale for an inter-relationship between their clinical measures. The present study appears to be the first to have directly considered this potential association. Shortest TFSQ-BUT was found to have the best overall discriminative capacity for tear hyperosmolarity. Interestingly, this parameter had superior diagnostic power to the first TFSQ-BUT measure. This finding may reflect the known variability in clinical expression of DED, being a hallmark of the condition that complicates diagnosis.
46 Sampling three noninvasive measures of tear stability, rather than a single measure, may increase the likelihood of capturing at least one “abnormal” result in an eye with DED. A healthy tear film would be predicted to demonstrate a consistently stable tear profile, and therefore less variation in consecutive TFSQ-BUT measures.
A consideration when interpreting the findings from this study is that the control and DED populations were intentionally defined to achieve an unambiguous classification of tear film quality, as either “normal” (control) or “abnormal” (DED). The average tear osmolarity value of the DED population (325 mOsmol/L) is consistent with moderate to severe DED, rather than earlier stages of the disease.
47 In this respect, the findings can be considered to reflect the diagnostic capacity of the TFSQ-BUT parameter to distinguish asymptomatic healthy controls from those with potentially more severe expressions of DED, as defined by tear osmolarity. Interestingly, despite the reported level of tear hyperosmolarity in the DED group, the overall extent of ocular surface staining in this group was low (mean total ocular surface staining score of 1.2 out of 15.0) and Schirmer test scores were consistent with borderline, rather than severe, aqueous deficiency (mean of 9.0 mm in 5 minutes). Indeed, these findings are in agreement with previous work that has demonstrated a lack of consistent relationships between common signs and symptoms of DED.
21,48 Further investigation, in particular a longitudinal clinical study, would be of value to assess the diagnostic value of the TFSQ-BUT in marginal cases of DED.
In conclusion, this study demonstrates the clinical utility of a novel noninvasive tear breakup time measure, the shortest TFSQ-BUT, as a surrogate marker for tear hyperosmolarity in people with moderate to severe DED.
The author thanks Laura Deinema for performing clinical assessments and capturing Placido-disc videokeratographs, from which data were derived for this study; and Grant Frisken (Medmont Pty Ltd., Victoria, Australia) for providing technical support and supplying the custom software function to support the TFSQ-BUT repeatability assessment.
Medmont Pty. Ltd. provided the E300 corneal topographer used in this study. The author alone is responsible for the content and writing of the paper.
Disclosure: L.E. Downie, CooperVision (F), Allergan (F), Alcon (F)