Abstract
Purpose.
Evidence is growing that dry eye represents a common disease process resulting from a number of underlying pathologies that impact the ocular surface and that clinical estimates of dry eye severity reflect the magnitude of a single dry eye disease state variable, Θ. A theory for estimating Θ from scaled clinical observations is developed, and the hypothesis is tested that Θ exists.
Methods.
The theory is developed around three assumptions: (1) a monotonic function unique to each person and indicator maps the indicator onto Θ, (2) between-person differences in mapping functions are random, and (3) observed indicator values include random perturbations. Data recently published by Sullivan and his colleagues were digitized from scatter plots of seven different indicators versus a composite severity score (square root of summed weighted squared indicator scores).
Results.
The data were analyzed with a model derived under the specific assumptions that between-person variance in mapping functions is independent of the indicator value and random perturbations in observed indicator values are normally distributed. Tear osmolarity was the most sensitive indicator, and tear breakup time was the least. The distribution of residuals (squared difference between observed and predicted indicator values) agreed with model expectations for all indicators except tear osmolarity, which had larger residuals than expected, and the composite severity score, which had smaller residuals than expected.
Conclusions.
The results are consistent with the existence of a single latent dry eye disease state variable. Only tear osmolarity does not appear to map monotonically and/or unidimensionally onto the latent variable.
Introduction
Dry eye is a syndrome characterized by clinical signs and symptoms related to abnormal tear production, abnormal tear evaporation, inflammation of the conjunctiva or lid, and damage to the exposed ocular surface.^{ 1 } Dry eye appears to be the consequence of many different types of disorders but also appears to have a common set of clinical characteristics that progress through similar stages of severity.
Focusing on the differences in dry eye presentations, both the National Eye Institute (NEI)/Industry Workshop^{ 2 } and the International Dry Eye Workshop (DEWS)^{ 3 } recommended two major pathogenic classes: tear-deficient dry eye, which both workshops further subdivided into Sjögren and non-Sjögren tear deficiency, and evaporative dry eye, which DEWS subdivided into “intrinsic” causes (e.g., Meibomian gland dysfunction) and “extrinsic” causes (e.g., contact lens wear).
A Delphi panel^{ 4 } suggested a different approach to classifying dry eye, which they recommended be called “dysfunctional tear syndrome” (DTS). The panel subdivided DTS into groups based on appropriate therapeutic strategies and concurred that DTS should be classified as having or not having clinically evident inflammation, with or without lid margin disease, and with or without abnormal tear distribution and clearance. They also recommended that four levels of severity of DTS could be discriminated clinically for the purpose of choosing different treatment algorithms. The DEWS analysis endorsed this concept that treatment algorithms were most appropriately based upon stratification of patients according to disease severity.^{ 5 }
In contradistinction to the qualitative approach of consensus groups, Mathers and Choi^{ 6 } employed cluster analysis on the results of a battery of clinical tests. They concluded from the statistical patterns in the data that dry eye patients could be sorted into nine different disease classes based on a tree structure that starts with the presence or absence of Meibomian gland drop out, followed by average lipid viscosity, then evaporation rate or Schirmer test value (depending on average lipid viscosity), and finally, lipid volume.
Consistent with the idea of dry eye being a multivariable disease, correlations between dry eye signs and symptoms are weak.^{ 7 } But, consistent with the idea of a common underlying dry eye disease state variable, the Delphi panel and DEWS agree that irrespective of cause, clinicians can estimate the magnitude of dry eye severity from observed clinical signs and symptoms.^{ 4,5 } To reconcile such diverging lines of evidence, Baudouin proposed that despite a plethora of disorders that result in dry eye, it appears that once started, the progression of the dry eye syndrome follows a common cascading and reciprocating course that can be linked to hyperosmolarity of the tear film.^{ 8 } Several investigators have proposed that tear film hyperosmolarity is the common denominator for dry eye,^{ 9 } should be considered the dry eye diagnostic gold standard,^{ 10 } and is the best single measure of dry eye disease severity.^{ 11,12 }
Sullivan et al. advanced this idea further by proposing that raw clinical measures of dry eye signs and symptoms could be mapped onto a single continuous dry eye severity variable common to all patients.^{ 13 } To construct this variable, an expert panel of clinicians employed a modified version of the DEWS severity scale to rate the strength of evidence of dry eye disease on a scale of 0 to 1 for a range of scores for each of six physiologic tests and for a dry eye symptom summary score (ocular surface disease index [OSDI]^{14}). Monotonic functions were fit to the trends in panel members' disease severity ratings versus observed physiologic sign or symptom scores. The fitted functions were inverted, and all the dry eye sign and symptom data were transformed to the hypothetical continuous dry eye disease severity scale. Finally, weighting coefficients were estimated to compensate for correlations between sets of scores, and a composite score was generated for each person as the square root of the sum of squared weighted transformed scores. All of these operations are monotonic and treated as independent, and therefore, the calculated composite dry eye disease severity variable must be monotonic with the raw observation score for each physiologic sign and with the OSDI symptom score. Correlations of raw observation scores with the composite dry eye disease severity score varied across signs and symptoms from 0.17 for the Schirmer test to 0.55 for tear film osmolarity.^{ 11,12 }
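The final step of the composite score construction described above can be sketched in a few lines of Python. This is a minimal illustration and not the procedure of Sullivan et al.: the function name `composite_severity`, the example scores, and the equal weights are hypothetical, and the sketch assumes that each weight multiplies its transformed score before squaring.

```python
import math

def composite_severity(transformed_scores, weights):
    """Square root of the sum of squared weighted transformed scores."""
    if len(transformed_scores) != len(weights):
        raise ValueError("one weight is required per transformed score")
    return math.sqrt(sum((w * s) ** 2 for w, s in zip(weights, transformed_scores)))

# Hypothetical example: seven indicator scores already on the 0-to-1 severity scale.
scores = [0.2, 0.4, 0.1, 0.3, 0.5, 0.0, 0.6]
# Illustrative equal weights; these are NOT the published weighting coefficients.
weights = [1 / math.sqrt(7)] * 7
composite = composite_severity(scores, weights)
```

With nonnegative scores and positive weights, raising any single transformed score raises the composite, which is the monotonicity property the text relies on.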
The construction of a single continuous dry eye severity variable common to all patients is a scientifically interesting and potentially useful proposition. However, the construction of the variable begs the question of whether or not such a disease state variable exists. The underlying premise is that each observed sign and symptom carries information about the putative disease state variable, but the information is distorted and masked by noise. Previously employed strategies have relied heavily on inferences from correlations to conclude that the estimated variable exists and is not simply a fiction created by the composite score algorithm. However, to deductively test the validity of the hypothesis that a single continuous dry eye severity variable common to all patients exists, it is necessary to employ a predictive model based on an axiomatic theory of the relationship between the hypothetical disease state variable and the clinical observations. In this paper, we present the required axiomatic theory and test the hypothesis that a single dry eye severity variable exists.
Theory
If a single latent dry eye disease state variable exists, then dry eye patients can be ordered by the severity of their disease.^{ 13 } This ordering implies the existence of a pattern in clinical observations on dry eye patients that is congruent with a latent disease state variable.^{ 2,3 } To avoid confusing the disease with its signs and symptoms, we will call this latent disease state variable Θ, a single variable that is assumed to underlie the clinically observed DTS that formally defines dry eye disease.^{ 4 } For the purpose of theory development, we define Θ to be continuous and unbounded (i.e., has infinite limits). The clinician cannot directly observe Θ in a patient; rather, the clinician observes the indicator variables that clinically constitute DTS. The indicator variables are labeled I_{j} for the j^{th} indicator. Indicator variables are signs and symptoms that can be ordered in severity and expressed on a dichotomous, polytomous, or continuous scale. Indicator variables act as surrogates for Θ. Examples of dry eye indicator variables are millimeters of wetting for the Schirmer test, tear breakup time (TBUT), tear osmolarity, and OSDI summary score. We assume that the value of each indicator variable carries information about the magnitude of Θ. However, the indicator variable also reflects other patient-specific physiologic and environmental traits unrelated to Θ that idiosyncratically distort or mask the information it carries about the magnitude of Θ.
How Θ is Related to Indicator Variables
The magnitude of the latent disease state variable is denoted Θ_{n} for patient n. Each indicator variable maps onto Θ_{n} by way of the person-specific and indicator-specific monotonic and, most likely, nonlinear mapping function, Θ_{n} = u_{nj}(I_{j}).
The mapping function for any given person/indicator combination is unknown and probably unknowable, but we can postulate an expected mapping function for each indicator variable for a population, û_{j} (I_{j} ) = E{u_{nj} (I_{j} )}.
The expected mapping function is approximately equal to the average mapping function for indicator variable j across a representative sample of N persons,
$$\hat{u}_j(I_j) \approx \frac{1}{N}\sum_{n=1}^{N} u_{nj}(I_j).$$
The person-specific error function is the difference between the mapping function for person n and the expected mapping function for the population, η_{nj}(I_{j}) = u_{nj}(I_{j}) − û_{j}(I_{j}).
By definition, the error function has an expected value of 0 across persons irrespective of the magnitude of I_{j} but has a variance across persons that can be a function of I_{j} , that is, VAR{η_{nj} (I_{j} )} = E{η_{nj} (I_{j} )^{2}}.
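The definitions of the expected mapping function and the person-specific error function can be illustrated with a small simulation. This is a sketch under stated assumptions: the linear form u(I) = a + b·I with b > 0, the parameter distributions, and all names (`params`, `u`, `u_hat`, `eta`) are hypothetical and chosen only because they are monotonic and easy to average; the theory itself does not assume linear mapping functions.

```python
import random

random.seed(0)

# Hypothetical person-specific monotonic mappings u_nj(I) = a_n + b_n * I with b_n > 0.
N = 200
params = [(random.gauss(0.0, 0.5), random.uniform(0.5, 1.5)) for _ in range(N)]

def u(n, I):
    """Mapping function for person n (illustrative linear form)."""
    a, b = params[n]
    return a + b * I

def u_hat(I):
    """Sample estimate of the expected mapping function: the average over persons."""
    return sum(u(n, I) for n in range(N)) / N

def eta(n, I):
    """Person-specific error function: deviation from the expected mapping."""
    return u(n, I) - u_hat(I)

def eta_variance(I):
    """Between-person variance of the error function at indicator value I."""
    return sum(eta(n, I) ** 2 for n in range(N)) / N

grid = [0.0, 1.0, 2.0, 4.0]
mean_eta = [sum(eta(n, I) for n in range(N)) / N for I in grid]
var_eta = [eta_variance(I) for I in grid]
```

In this sketch the mean of the error function is zero at every indicator value by construction, while its variance grows with the indicator value because the slopes differ between persons. That is one concrete way the variance can be a function of I_{j}.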
Besides differences between persons in the mapping functions, the observed value of each indicator variable is subject to observation error. Observation error is different from the error created by between-person differences in mapping functions. Within each person, the mapping function defines a fixed relationship between Θ_{n} and I_{j}, but within each person unknown forces, such as physiologic and environmental variables that are unrelated to Θ_{n}, can perturb that relationship. A second type of observation error arises from the operations themselves that define the observation. Patient self-reports, ratings by judges, and physical measurement instruments are inherently unreliable and can be biased. Given the fixed relationship between Θ_{n} and I_{j}, the expected value of indicator variable j for person n with disease state Θ_{n} is Î_{j}(n), but the observed value, I_{j}(n), may not agree. Thus, the observation error in I_{j} for person n, irrespective of its source, is e_{nj} = I_{j}(n) − Î_{j}(n).
The Stochastics of Error Functions and Observation Errors
Although we have defined Θ to be a continuous and unbounded variable, indicator variables generally are observed as bounded and discrete. Even in the case of physical indicator variables that are expressed on a continuous scale, they are observed and recorded in units that represent discrete intervals on that scale. For indicator variable j to be observed having the discrete value x for person n, we assume that
$$C_{jx} < I_j(n) \le C_{jx+1}, \tag{1a}$$
where C_{jx} is the criterion value of indicator variable j, on a theoretically continuous observation scale, that must be exceeded to assign the discrete value x to the observation and C_{jx+1} is the criterion value of indicator variable j that must be exceeded to assign the discrete value x + 1 to the observation. There are m + 1 intervals with m criterion values (interval boundaries) ranging from C_{j1} to C_{jm} [C_{j0} and C_{jm+1} are the scale boundaries; if I_{j}(n) is unbounded, then C_{j0} = −∞ and C_{jm+1} = +∞]. To make the stochastic nature of the observation explicit, we reexpress the preceding observation rule in terms of the observation error,
$$C_{jx} < \hat{I}_j(n) + e_{nj} \le C_{jx+1}, \tag{1b}$$
where Î_{j}(n) is a fixed variable and e_{nj} is a random variable.
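The observation rule can be made concrete with a short simulation. The sketch below assumes that the discrete value x is assigned when the noisy continuous value falls in the interval (C_x, C_{x+1}], with the lowest and highest intervals unbounded. The criterion values, the normally distributed observation error, and the function name `observe` are hypothetical choices for illustration only.

```python
import bisect
import random

def observe(expected_value, noise_sd, criteria, rng):
    """Return the discrete value x such that C_x < I <= C_{x+1}, where I = expected + error.

    criteria holds the m interior criterion values C_1..C_m in ascending order;
    the scale boundaries C_0 = -inf and C_{m+1} = +inf are implicit.
    """
    observed = expected_value + rng.gauss(0.0, noise_sd)  # I = I_hat + e
    # bisect_left counts the criteria strictly below the observed value.
    return bisect.bisect_left(criteria, observed)

criteria = [5.0, 10.0, 15.0]  # hypothetical C_1..C_3, giving m + 1 = 4 intervals
rng = random.Random(1)
draws = [observe(9.0, 2.0, criteria, rng) for _ in range(1000)]
counts = [draws.count(x) for x in range(len(criteria) + 1)]
```

Because the expected value is fixed and only the error term varies, repeated observations of the same person scatter across neighboring intervals. Most draws land in the interval that contains the expected value.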
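Because we defined the mapping functions to be monotonic functions of the indicator variables, it follows from expression 1 that
$$u_{nj}(C_{jx}) < u_{nj}(\hat{I}_j(n) + e_{nj}) \le u_{nj}(C_{jx+1}), \tag{2a}$$
which is an operation that transforms units of the indicator variable to units of Θ for person n. To make between-person random variability explicit, this expression for transforming indicator variable units can be reexpressed in terms of the expected population mapping functions and person-specific error functions
$$\hat{u}_j(C_{jx}) + \eta_{nj}(C_{jx}) < \hat{u}_j(\hat{I}_j(n) + e_{nj}) + \eta_{nj}(\hat{I}_j(n) + e_{nj}) \le \hat{u}_j(C_{jx+1}) + \eta_{nj}(C_{jx+1}). \tag{2b}$$
If we define a new random variable, ε_{nj} = û_{j}(Î_{j}(n) + e_{nj}) − û_{j}(Î_{j}(n)), then expression 2a can be reexpressed as
$$\hat{u}_j(C_{jx}) + \eta_{nj}(C_{jx}) < \hat{u}_j(\hat{I}_j(n)) + \varepsilon_{nj} + \eta_{nj}(\hat{I}_j(n) + e_{nj}) \le \hat{u}_j(C_{jx+1}) + \eta_{nj}(C_{jx+1}). \tag{2c}$$
By collecting all of the random terms into a single random error function, that is, δ_{njx} = η_{nj}(C_{jx}) − η_{nj}(Î_{j}(n) + e_{nj}) − ε_{nj}, expression 2c can be simplified to
$$\hat{u}_j(C_{jx}) + \delta_{njx} < \hat{u}_j(\hat{I}_j(n)) \le \hat{u}_j(C_{jx+1}) + \delta_{njx+1}. \tag{2d}$$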
Expressions 1b and 2d constitute the axiomatic theory for the relation of clinical indicator variables to a single latent disease state variable.
In order to estimate measures of the latent Θ variable from observations of manifest indicator variables, it is necessary to make an assumption about the nature of the joint probability density function for δ_{njx}. An open-ended model of the covariance matrix with respect to persons, items, and interval boundaries is likely to be overdetermined. To constrain the model, we must impose requirements on the covariance matrix that have to be met in order for observations of indicator variables to be considered valid surrogates from which measurements of the latent Θ variable can be estimated.
One constraint is to assume that the variance in η_{nj} is constant within an indicator (i.e., independent of the value of the argument of the function). In this case, all variance in δ_{njx} can be attributed to variance in ε. When these conditions are satisfied, VAR{δ_{njx}} is independent of x, and expression 2d becomes
$$\hat{u}_j(C_{jx}) < \hat{u}_j(\hat{I}_j(n)) + \varepsilon_{nj} \le \hat{u}_j(C_{jx+1}), \tag{3a}$$
or, from the definition of ε_{nj}, expression 3a is identical to
$$\hat{u}_j(C_{jx}) < \hat{u}_j(\hat{I}_j(n) + e_{nj}) \le \hat{u}_j(C_{jx+1}). \tag{3b}$$
We will refer to the class of models that include this constraint as “added independent noise” (i.e., added ε) models. We say the added noise is independent because it does not depend on the magnitude of the indicator variable. This class of models would include Samejima's graded response model in psychometrics^{15} and proportional odds ordinal logistic regression models in statistics.^{16} (See the Appendix for more detail on the form of added noise models.)
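To show what an added independent noise model predicts, the following sketch computes the probability of each discrete response category for a given value of Θ. It assumes zero-mean logistic noise on the Θ scale, which is the distributional assumption the paper adopts in the Results, and it assumes the criterion values have already been mapped to thresholds on the Θ scale. The threshold values and the function names are hypothetical.

```python
import math

def logistic_cdf(z, sd):
    """CDF of a zero-mean logistic distribution with standard deviation sd.

    A logistic distribution with scale s has standard deviation s * pi / sqrt(3),
    so 1 / s = (pi / sqrt(3)) / sd, roughly 1.8 / sd.
    """
    scale = sd * math.sqrt(3.0) / math.pi
    return 1.0 / (1.0 + math.exp(-z / scale))

def category_probabilities(theta, thresholds, sd):
    """P(x | theta) for x = 0..m, given ascending thresholds on the Theta scale.

    The cumulative probability of observing a value below threshold t is
    logistic_cdf(t - theta, sd); category probabilities are successive differences.
    """
    cum = [0.0] + [logistic_cdf(t - theta, sd) for t in thresholds] + [1.0]
    return [cum[x + 1] - cum[x] for x in range(len(thresholds) + 1)]

# Hypothetical thresholds for a four-category indicator, person located at theta = 0.
probs = category_probabilities(0.0, [-1.0, 0.0, 1.0], sd=1.8)
```

The noise standard deviation is the same at every threshold, which is what makes the noise "independent" in the sense used above: only the distance between Θ and each threshold changes the cumulative probability.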
Another constraint is to assume that the between-person variance of the error function η_{nj}(…) depends on the argument of the error function. When these conditions are satisfied, the joint probability density function for δ_{njx} depends on the response category threshold, the indicator variable, and the sample of persons. We will refer to the class of models that incorporate this constraint as “mapping noise” models. Mapping noise refers to between-person differences in the shapes of the mapping functions. This class of models includes Masters's partial credit^{ 17 } and Muraki's generalized partial credit^{ 18 } latent variable models in psychometrics. (See the Appendix for additional detail on mapping noise models.)
Estimation of Expected Mapping Functions from Observations of Indicator Variables
The different mapping functions are defined to transform their respective manifest indicator variables into the same latent Θ_{n} value for person n. Therefore, if there were no mapping noise or added noise, E{Θ_{n}} = û_{i}(Î_{i}(n)) = û_{j}(Î_{j}(n)) for all combinations of indicators i and j observed in person n. If J different indicators are observed for a sample of N persons, we can define a summary variable for each person, S_{n}, that is a monotonic function, v(…), of the vector of J indicator variables for person n, that is,
$$S_n = v\big(I_1(n), I_2(n), \ldots, I_J(n)\big).$$
Because the indicator variables have added noise, e_{nj}, we can reexpress the summary variable in terms of the expected values of I_{j} and an added noise term for person n, s_{n}, that is,
$$S_n = v\big(\hat{I}_1(n), \hat{I}_2(n), \ldots, \hat{I}_J(n)\big) + s_n,$$
where the expected value of s_{n} across persons is zero. We have constrained S_{n} to be a monotonic function of each of the indicator variables, and therefore, we can posit a mapping function that monotonically transforms S_{n} to Θ_{n},
$$\Theta_n = u_{ns}(S_n),$$
and, as for each of the indicator variables,^{19}
$$\hat{u}_s(S) = E\{u_{ns}(S)\}. \tag{6}$$
Equation 6 implies monotonic relationships, w_{j}(…), between the summary score, S, and the expected values of the indicator variables, that is,
$$\hat{I}_j(S) = w_j(S) \approx \frac{1}{N_S}\sum_{n(S)} I_j\big(n(S)\big), \tag{7}$$
where n(S) refers to person n who has a summary score of S and N_{S} refers to the total number of persons with a summary score of S. One can estimate w_{j}(…) for each indicator variable from observations of indicator variable values for each person, an acceptably defined summary variable function, v(…), and regression analysis employing equation 7.
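One simple way to estimate the relationship between the summary score and the expected indicator value is to average the observed indicator values among persons who share a summary score. The sketch below does exactly that. It is an illustration and not the authors' regression procedure; the function name `estimate_w` and the data are hypothetical, and with a continuous summary score the scores would first need to be binned.

```python
from collections import defaultdict

def estimate_w(summary_scores, indicator_values):
    """Mean observed indicator value among persons sharing each summary score S."""
    groups = defaultdict(list)
    for s, i in zip(summary_scores, indicator_values):
        groups[s].append(i)
    # Sort by summary score so that monotonicity is easy to inspect.
    return {s: sum(v) / len(v) for s, v in sorted(groups.items())}

# Hypothetical data: five persons, summary scores already rounded to one decimal place.
summary = [0.1, 0.1, 0.2, 0.2, 0.3]
indicator = [300.0, 304.0, 310.0, 314.0, 325.0]
w_est = estimate_w(summary, indicator)
# A valid indicator should give estimates that change monotonically with S.
is_monotonic = all(a <= b for a, b in zip(list(w_est.values()), list(w_est.values())[1:]))
```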
Once w_{j}(…) functions have been estimated from the sample data, the next step is to transform each observed indicator variable value to a summary variable value for each person/indicator combination, that is,
$$S_{nj} = w_j^{-1}\big(I_j(n)\big).$$
This summary variable estimated for each person/indicator pair includes random noise. So at this point, it is necessary to make assumptions about variances and covariances and the form of the joint density function for δ_{njx} to reduce the general theory to a specific measurement model. When the measurement model has been defined, maximum likelihood estimation procedures can be applied to the matrix of S_{nj} values to estimate Θ_{n} for each person and û_{j}(C_{jx}) for each discrete value of each indicator, C_{jx},^{20} which in turn leads to an estimate of each mapping function, û_{j}(Î_{j}).
Methods
For the purpose of demonstrating an application of the latent disease state variable theory presented above and to preliminarily evaluate the hypothesis that a single latent dry eye disease state variable Θ exists, we employ the DTS indicator variable and composite disease severity score data recently published by Sullivan et al.^{ 13 } They obtained measures of seven different DTS indicator variables from each of 299 eye clinic patients and volunteer staff members in a prospective multicenter study; the DTS indicator variables were tear osmolarity (mOsm/liter), Schirmer test (millimeters), TBUT (seconds, average of three repeated observations), OSDI score,^{ 14 } Meibomian gland score (using the Bron/Foulkes scoring system^{21}), corneal staining with sodium fluorescein dye under cobalt blue illumination (using the NEI/Industry Workshop scale^{2}), and conjunctival staining with sodium lissamine green dye (using the NEI/Industry Workshop scale^{2}). Thus, they obtained a set of observed DTS indicator variable values for each participant. Using expert clinician ratings of disease severity and independent component analysis on the sets of indicator values to correct for correlations between measures, Sullivan and his colleagues then computed a disease summary score, the composite disease severity score, for each participant.
Sullivan et al. published scatter plots of indicator variable values versus the estimated composite disease severity score for each participant in Figure 1 of their paper.^{ 13 } We digitized the displayed data for each indicator variable. Because of overlapping data points, some data were missed. We were able to obtain 267 osmolarity points (89%), 277 Schirmer test points (93%), 237 Meibomian score points (79%), 258 TBUT points (86%), 248 OSDI score points (83%), 214 conjunctiva stain score points (72%), and 190 cornea stain score points (64%). It was not possible to match the participants with the digitized points.
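The recovery percentages quoted above can be checked with simple arithmetic. The short sketch below assumes that every percentage is taken relative to the 299 participants reported for the original study; the dictionary and variable names are illustrative.

```python
TOTAL_PARTICIPANTS = 299  # sample size reported for the study of Sullivan et al.

# Number of digitized points recovered for each indicator, as stated in the text.
recovered = {
    "osmolarity": 267,
    "Schirmer test": 277,
    "Meibomian score": 237,
    "TBUT": 258,
    "OSDI score": 248,
    "conjunctiva stain score": 214,
    "cornea stain score": 190,
}

# Percentage of participants recovered, rounded to the nearest whole percent.
percent = {name: round(100 * n / TOTAL_PARTICIPANTS) for name, n in recovered.items()}
```

Under that assumption the rounded values reproduce the percentages in the text, from 93% for the Schirmer test down to 64% for the cornea stain score.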
Results
We are limited in the extent of our analysis because we cannot recover the vector of indicator variable values for each person. However, we can determine whether or not the indicator variables and composite disease severity score meet the prescribed condition of measurement that each variable is monotonic with the latent disease state variable, Θ. This task is accomplished by presuming that expression 2d is satisfied by the observations and that in general, Θ_{n} = u_{nj}(I_{j}(n)) = û_{j}(Î_{j}) + η_{nj}(Î_{j} + e_{nj}) + ε_{nj} for all indicator variables and for the composite disease severity score, and testing the hypothesis that all variables map onto Θ, as predicted by equation 6. Because of the constraints we are working under with the digitized data, we employ the added independent noise model, which is built on the assumption that VAR{δ_{njx}} ≅ VAR{ε_{nj}} and that expression 2a is a sufficient description of the observations. We also make the simplifying assumption that the probability density function for ε_{nj} is logistic (approximately normal), with an expected value of 0 and a standard deviation of σ_{j}, that is,
$$f(\varepsilon_{nj}) = \frac{(1.8/\sigma_j)\, e^{-1.8\,\varepsilon_{nj}/\sigma_j}}{\left(1 + e^{-1.8\,\varepsilon_{nj}/\sigma_j}\right)^{2}}.$$
Monotonicity of û_{j}(Î_{j}) requires that P(ε_{nj} < û_{j}(C_{jx}) − û_{j}(Î_{j}(n))) = P(e_{nj} < C_{jx} − Î_{j}(n)), where
$$P\big(\varepsilon_{nj} < \hat{u}_j(C_{jx}) - \hat{u}_j(\hat{I}_j(n))\big) = \frac{1}{1 + e^{-(1.8/\sigma_j)\left(\hat{u}_j(C_{jx}) - \hat{u}_j(\hat{I}_j(n))\right)}}. \tag{9}$$
The logit of equation 9 is
$$\ln\frac{P\big(\varepsilon_{nj} < \hat{u}_j(C_{jx}) - \hat{u}_j(\hat{I}_j(n))\big)}{1 - P\big(\varepsilon_{nj} < \hat{u}_j(C_{jx}) - \hat{u}_j(\hat{I}_j(n))\big)} = \frac{1.8}{\sigma_j}\left(\hat{u}_j(C_{jx}) - \hat{u}_j(\hat{I}_j(n))\right),$$
or in terms of the logit for the indicator variable, which must be the same as the logit for equation 9, and using the definition introduced earlier, I_{j}(n) = Î_{j}(n) + e_{nj},
$$\ln\frac{P\big(I_j(n) < C_{jx}\big)}{1 - P\big(I_j(n) < C_{jx}\big)} = \frac{1.8}{\sigma_j}\left(\hat{u}_j(C_{jx}) - \bar{\Theta}_j\right), \tag{10}$$
where σ_{j} is the standard deviation of the ε_{nj} distribution and $\bar{\Theta}_j$ is the value of Θ, that is, û_{j}(Î_{j}(n)), at the median value of the indicator variable j for the sample of persons. We can conclude from equation 10 that the mapping functions for the indicator variables are expected by the model to be linear transformations of the logit of P(I_{j}(n) < C_{jx}). (N.B., the derivation of equation 10 employs sign notation that implies the monotonic relationship between the indicator variable and Θ is expected to be positive. In some cases, the monotonic relationship will be negative, e.g., Meibomian gland dropout, in which case the sign notation would have to be modified accordingly.)
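Because the model expects each mapping function to be a linear transformation of the logit of the cumulative probability, the empirical logits are the natural quantity to compute from a sample. The sketch below shows one way to obtain them. The function name `empirical_logits`, the sample values, and the criterion values are hypothetical, and criterion values with a cumulative proportion of exactly 0 or 1 are skipped because the logit is undefined there.

```python
import math

def empirical_logits(values, criteria):
    """Logit of the cumulative proportion P(I < C) at each criterion value C."""
    n = len(values)
    out = {}
    for c in criteria:
        p = sum(v < c for v in values) / n  # empirical cumulative probability
        if 0.0 < p < 1.0:                   # logit is undefined at p = 0 or p = 1
            out[c] = math.log(p / (1.0 - p))
    return out

# Hypothetical sample of 20 indicator values (1, 2, ..., 20) and four criterion values.
sample = list(range(1, 21))
logits = empirical_logits(sample, criteria=[0, 5, 11, 16])
```

The logit is zero at the criterion value that splits the sample in half, which corresponds to the median of the indicator. In the model, that is the point at which the mapped value equals the indicator's location parameter.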
Figure 1 illustrates empirical logits, from cumulative probabilities representing P(I_{j}(n) < C_{jx}), as a function of C_{jx} for each indicator variable and for the composite disease severity score (filled circles). The solid curve in each panel is the least squares fit of an atheoretical third-order polynomial (the estimated equations are displayed in their respective panels). The ordered pair of indicator variable score and composite disease severity score, (I_{j}(n), S_{nj}), for each digitized data point from the study of Sullivan et al.^{ 13 } was then transformed to the ordered pair (û_{j}(I_{j}(n)), û_{s}(S_{nj})) using the estimated polynomial equations from the regressions illustrated in Figure 1. Scatter plots of these transformed ordered pairs are illustrated in Figure 2.
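Because the composite disease severity score is a monotonic transform of each indicator variable, we expect equation 10 to apply to the logit of the composite disease severity score, as well as to the logits of each of the indicator variables. This expectation, along with the requirement that E{û_{j}(I_{j}(n))} = E{û_{s}(S_{nj})} = E{Θ_{n}} and substituting I_{j}(n) or S_{nj} for C_{jx} in equation 10, leads to the conclusion
$$\Theta_n = \bar{\Theta}_j + \frac{\sigma_j}{1.8}\,\operatorname{logit} P\big(I_j < I_j(n)\big) = \bar{\Theta}_s + \frac{\sigma_s}{1.8}\,\operatorname{logit} P\big(S < S_{nj}\big),$$
or if we rescale Θ_{n} to $\bar{\Theta}_s$ and σ_{s}, then
$$\Theta'_n = \operatorname{logit} P\big(S < S_{nj}\big) = \frac{\sigma_j}{\sigma_s}\,\operatorname{logit} P\big(I_j < I_j(n)\big) + \frac{1.8}{\sigma_s}\left(\bar{\Theta}_j - \bar{\Theta}_s\right),$$
where the rescaled Θ_{n} is $\Theta'_n$ = (1.8/σ_{s})(Θ_{n} − $\bar{\Theta}_s$).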
Figure 2 illustrates scatter plots of û_{s}(S_{nj}) versus û_{j}(I_{j}(n)) for each indicator (filled circles). The solid lines are bivariate regression lines that represent the principal component (i.e., orthogonal fit with unit variances). The Pearson correlations between the logits are indicated in each panel of the figure. These correlations are higher than those reported by Sullivan et al.^{13} for the raw indicator observations versus the composite severity score (P values for differences between correlations range from <0.0001 for TBUT to 0.06 for cornea staining). The slope of the regression line for each indicator variable in Figure 2 corresponds to σ_{j}/σ_{s} and the intercept corresponds to (1.8/σ_{s})($\bar{\Theta}_j$ − $\bar{\Theta}_s$) in equation 10.
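An orthogonal fit with unit variances can be sketched as follows. The sketch rests on a standard result: when both variables are standardized, the first principal axis lies along the 45-degree diagonal (or its mirror image when the correlation is negative). In the original units the line therefore passes through the two means with a slope equal to the ratio of standard deviations, signed by the correlation. The function name and data are hypothetical, and this is not necessarily the exact routine used for Figure 2.

```python
import math
import statistics

def unit_variance_orthogonal_fit(x, y):
    """Principal-axis line of the standardized data, returned in original units.

    Returns (slope, intercept, r), where r is the Pearson correlation.
    """
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sx, sy = statistics.stdev(x), statistics.stdev(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    r = cov / (sx * sy)
    slope = math.copysign(sy / sx, r)  # ratio of SDs, signed by the correlation
    intercept = my - slope * mx        # the line passes through the means
    return slope, intercept, r

# Hypothetical logit pairs lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
slope, intercept, r = unit_variance_orthogonal_fit(xs, ys)
```

Unlike ordinary least squares, this line treats the two variables symmetrically, and its slope does not shrink toward zero as the correlation weakens. That is why the slope can be read as a ratio of standard deviations.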
Table 1 compares intercept values for the different indicator variables relative to $\bar{\Theta}_s$ = 0 and σ_{s} = 1.8. The intercept can be interpreted as an index of the sensitivity of the indicator to the disease state variable relative to the composite disease severity score. That is, û_{s}(S) = $\bar{\Theta}_j$ at the median value of I_{j}. So, if $\bar{\Theta}_j$ is low, the indicator variable is identifying more people as being in a low dry eye disease state than indicator variables for which $\bar{\Theta}_j$ is high. Thus, TBUT is the least sensitive indicator of dry eye disease and osmolarity is the most sensitive indicator. As might be expected from any central tendency variable, the composite disease severity score is less sensitive than four of the indicators but more sensitive than the other three indicators.
Table 1. Estimated Sensitivity, $\bar{\Theta}_j$, and SD of the Transformed Indicator Variables Relative to a Sensitivity of 0 and an SD of 1.8 for the Composite Severity Score
| Indicator Variable | $\bar{\Theta}_j$ | SD |
|---|---|---|
| Osmolarity | 1.07 | 2.32 |
| Cornea stain score | 0.53 | 2.12 |
| Conjunctiva stain score | 0.48 | 2.29 |
| OSDI score | 0.18 | 2.27 |
| Composite severity score | 0.00 | 1.80 |
| Meibomian score | −0.10 | 2.23 |
| Schirmer test | −0.72 | 1.76 |
| TBUT | −4.93 | 2.43 |
The standard deviations of the added noise for each indicator variable, σ_{j}, relative to the standard deviation of the added noise for the composite disease severity score, σ_{s} = 1.8, also are compared in Table 1. The estimated standard deviations are similar for all indicator variables; however, they are somewhat lower for the Schirmer test and the composite disease severity score. The standard deviation of the added noise determines the standard error of the estimate of Θ, which, relative to the standard deviation of the estimated measures, defines measurement reliability. Because of limitations imposed by the data digitization, we cannot combine information from all the indicators to estimate a single value of Θ for each person, so at this time, we cannot offer meaningful estimates of the standard errors.
Although we cannot evaluate measurement reliability, we can use the logistic model with the recovered data to test the hypothesis that a single dry eye disease state variable exists and can be estimated from observed values of indicator variables. Figure 3 is a schematic illustrating how values of Θ can be estimated for each person from the ordered pair (û_{j}(I_{j}(n)), û_{s}(S_{nj})) that defines each data point in Figure 2. Figure 3 is a reproduction of the top left panel in Figure 2 for osmolarity but with all except one data point removed. The regression line was fit to minimize the perpendicular distances of the data points from the line. Thus, in this example, the estimated value of Θ, using the transformed value of the observed osmolarity and the transformed value of the composite disease severity score for that person, is the point that falls on the regression line where the perpendicular from the data point (filled circle) to the line intersects the line (open circle). The corresponding ordered pair for the estimated point is (û_{j}(Î_{j}(n)), û_{s}(Ŝ_{nj})), the values of which are indicated by the dashed lines. The coordinates of the estimated point for each person were computed from the horizontal distance of the observed data point from the line, which is the difference between the logit for the composite disease severity score and the logit for the osmolarity value in osmolarity logit units (Δx in the figure), the vertical distance of the observed data point from the line, which is the same difference between logits, but in composite disease severity score logit units (Δy in the figure), and the slope of the regression line, that is, σ_{j}/σ_{s}.
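The geometric step described above, dropping a perpendicular from a data point to the regression line, can be sketched with elementary analytic geometry. The function name `project_onto_line` and the numbers are hypothetical, and the sketch assumes ordinary Euclidean distance in the plotted coordinates. If the fit was made in unit-variance coordinates, the same projection would be applied after standardizing both axes.

```python
def project_onto_line(x0, y0, slope, intercept):
    """Foot of the perpendicular from (x0, y0) to the line y = slope * x + intercept."""
    x = (x0 + slope * (y0 - intercept)) / (1.0 + slope ** 2)
    return x, slope * x + intercept

# Hypothetical data point and regression line.
px, py = 3.0, 1.0
line_slope, line_intercept = 0.5, 2.0
fx, fy = project_onto_line(px, py, line_slope, line_intercept)

# The residual vector from the estimated point to the data point should be
# perpendicular to the direction of the line, (1, slope).
dot = (px - fx) * 1.0 + (py - fy) * line_slope
```

The two coordinates of the projected point play the role of the estimated indicator logit and the estimated composite score logit for that person.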
The estimated values of Θ_{n} for each indicator were linearly transformed to composite disease severity score logit units using the appropriate regression coefficients. These estimated values of Θ_{n} were substituted for û_{j}(Î_{j}(n)) in equation 9 and the probability of observing the indicator value C_{jx} was calculated for all possible values of the indicator variable in its defined range. These probabilities then were used to estimate the expected value of the indicator variable for each person,
$$E\{I_j \mid \Theta_n\} = \frac{\sum_{x} C_{jx}\, P(C_{jx} \mid \Theta_n)}{\sum_{x} P(C_{jx} \mid \Theta_n)}. \tag{12}$$
We also estimated E{$I_j^2$ | Θ_{n}} using equation 12 but substituting $C_{jx}^2$ for the C_{jx} multiplier in the numerator. These expected values were used to calculate a squared residual for each observed indicator value, (I_{j}(n) − E{I_{j} | Θ_{n}})^{2}, and the squared residual expected by the model, E{$I_j^2$ | Θ_{n}} − E{I_{j} | Θ_{n}}^{2}. If the observations, I_{j}(n), adhere to the assumptions of the measurement model, then the ratios of these squared residuals across persons are expected to be distributed as χ^{2} normalized to its degrees of freedom,^{ 20 } with one degree of freedom for each indicator (because two observations for each indicator are used to estimate Θ_{n}).
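The residual calculation can be sketched as follows, taking the model probabilities of each possible indicator value as given. The function name `residual_ratio`, the possible values, and the probabilities are hypothetical; in the analysis described above the probabilities would come from the logistic model evaluated at each person's estimated Θ.

```python
def residual_ratio(observed, values, probs):
    """Observed squared residual divided by the squared residual expected by the model.

    values are the possible indicator values and probs their model probabilities
    given the person's estimated Theta (normalized here in case they do not sum to 1).
    """
    total = sum(probs)
    mean = sum(v * p for v, p in zip(values, probs)) / total         # E{I | Theta}
    mean_sq = sum(v * v * p for v, p in zip(values, probs)) / total  # E{I^2 | Theta}
    expected_sq_residual = mean_sq - mean ** 2                       # model variance
    return (observed - mean) ** 2 / expected_sq_residual

# Hypothetical indicator with four possible values and their model probabilities.
values = [0, 1, 2, 3]
probs = [0.1, 0.4, 0.4, 0.1]
ratio_for_extreme_value = residual_ratio(3, values, probs)

# Averaged over the model's own distribution, the ratio is exactly 1.
average_ratio = sum(p * residual_ratio(v, values, probs) for v, p in zip(values, probs))
```

The last line shows why a mean ratio near 1 is the benchmark: if observations are generated by the model, the observed squared residuals match the expected squared residuals on average.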
Figure 4 illustrates histograms of the ratios of squared residuals for composite disease severity scores when Θ_{n} was estimated from a pairing of the composite disease severity score with each of the indicator variables (see legend). The solid curve is the expected quantized (Δratio = 0.1) normalized χ^{2} distribution for one degree of freedom. The histogram for each pairing of the composite severity score with the indicator variable is not consistent with the expected normalized χ^{2} distribution. The expected value of the χ^{2} distribution is 1. The means of the ratios of squared residuals for the composite severity scores range from 0.11 when paired with tear film osmolarity to 0.49 when paired with conjunctiva stain score. Mean squared residual ratios of much less than 1 indicate that the observed composite severity scores are in better agreement with predicted scores than would be expected by the model. This situation can occur when the distributions of indicator values that make up the composite severity score are biased toward the ceiling or floor of the scale (the distribution of composite disease severity scores is biased toward the normal limit of 0; mean = 0.3, skew = 0.8).
Figure 5 illustrates histograms of the ratios of squared residuals for each of the indicator variables along with the expected normalized χ^{2} distribution for one degree of freedom (N.B. the vertical scale changes across panels). With the exception of osmolarity, the histogram for each indicator variable agrees with the expected normalized χ^{2} distribution. For six of the indicator variables, the mean squared residual ratio is close to or less than 1 (Schirmer test = 0.94; Meibomian score = 0.31; TBUT = 0.38; OSDI = 0.50; conjunctiva stain = 0.81; cornea stain = 0.48). However, in the case of osmolarity, the distribution of squared residual ratios does not match expectations. Besides the difference in the shape of the distribution, the observed squared residuals for osmolarity are larger than those expected by the model (mean = 1.28). This result indicates that confounding variables or operations in the person sample make contributions to observed osmolarity that distort the estimates of Θ.
Discussion
There has been a history of debate over the relevance, sensitivity/specificity, and accuracy/reliability of different clinical tests for dry eye.^{ 11,22 } But implicit in the testing and in the goals of consensus workshops is the premise that dry eye disease can be represented by a single disease severity variable that can vary in magnitude between patients and over time and can be used by clinicians to guide treatment or to measure response to therapeutic interventions. The concept of a single dry eye disease severity variable is a theoretical construction. However, simply postulating its existence is insufficient; it is necessary to also postulate its relationship to surrogate clinical indicator variables. In the theory presented here, we postulated that the relationship between indicator variables and the disease severity variable is monotonic and perturbed by random factors. More explicit models of the relationship devolve from explicit assumptions about the sources and probability densities of the random perturbations. The assumption of monotonicity is a requirement of measurement. If the relationship for an indicator variable is not monotonic, the variance in the squared residuals will exceed expectation and that variable will be flagged as possibly not making a valid contribution to the measurement (e.g., osmolarity in Figure 5). In the analysis presented here, we assumed that the perturbations can be characterized as logistically distributed (i.e., approximately normal) independent added noise. Within the framework of monotonicity and the explicit added noise assumption, and within the limitations imposed on the analysis by the data recovery from the publication of Sullivan et al.,^{ 13 } we tentatively conclude that a single dry eye disease severity variable that is monotonically related to clinical indicators exists.
The analysis of the distributions of indicator variable squared residuals, which are summarized in Figure 5, demonstrates that the Schirmer test, Meibomian score, TBUT, OSDI score, conjunctiva stain score, and cornea stain score agree with the predictions of the theoretical assumptions. Tear osmolarity measures do not agree with the prediction of those assumptions. Within the limits imposed on our analysis and the constraints of our assumptions, it appears that the relationship between tear osmolarity and dry eye disease severity is nonmonotonic, significantly distorted by confounding variables, or does not exist. One confounding variable could be cumulative or threshold effects of hyperosmolarity over time. Since osmolarity depends on both the concentration of solute and the volume of solution, it is conceivable that the relationships of tear volume to disease severity and of solute production to disease severity could combine to produce nonmonotonicities in the relationship between osmolarity and disease severity. All speculation aside, given the complexities of the variables and their relationships to the putative dry eye disease state variable, we add the caveat that it would be premature to draw any conclusions about the relative clinical values of different indicator variables based on the results of this one analysis.
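The conjecture about volume and solute can be made concrete with a toy calculation. The sketch below is purely illustrative: the functional forms for `tear_volume` and `solute_amount`, the arbitrary units, and the severity values are all hypothetical and are not taken from any data. It shows only that two quantities that each fall monotonically with severity can have a ratio that rises and then falls.

```python
def tear_volume(severity):
    """Hypothetical tear volume, decreasing monotonically with severity."""
    return 1.0 / (1.0 + severity)

def solute_amount(severity):
    """Hypothetical solute amount, also decreasing monotonically with severity."""
    return 1.0 / (1.0 + severity ** 2)

def osmolarity(severity):
    """Solute per unit volume (arbitrary units)."""
    return solute_amount(severity) / tear_volume(severity)

severities = [0.0, 0.5, 1.0, 2.0]
values = [osmolarity(s) for s in severities]
for s, v in zip(severities, values):
    print(f"severity={s:.1f}  osmolarity={v:.2f}")
```

In this toy case the ratio peaks at an intermediate severity, so a given osmolarity value would correspond to two different severities, which is the kind of nonmonotonicity the paragraph describes.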
Although the behavior of the composite disease severity score estimated from expert ratings and independent component analysis by Sullivan et al.^{ 13 } agrees with the prediction of the model and, therefore, would be considered a valid indicator variable, it does not enjoy any special status among the other indicator variables. Because each indicator variable was paired with the composite severity score, it was possible to perform the analysis described here and estimate values of the latent dry eye disease variable. However, because we could not identify which set of indicator variable data points belonged to each person, we were unable to explore the role of mapping noise in the relationship between indicator variables and the dry eye variable, we were unable to calculate meaningful standard errors of dry eye variable estimates for each person and thereby assess measurement reliability, and we were unable to explore the possible multidimensional factor structure of the univariate latent dry eye variable, as suggested by the cluster analysis of Mathers and Choi.^{ 6 }
The latent disease state variable theory presented here has a number of corollaries. First, the theory makes it possible to discriminate the effects of treatments that alter only selective signs and symptoms (i.e., selective changes in I_{j}) from the effects of treatments that alter the underlying disease state (i.e., global changes in Θ). Treatment-related changes in one or more indicator variables that are not the result of changes in the underlying disease state would manifest as increased mean square residuals posttreatment for indicator variable j, with no effect on the distributions of mean square residuals for other indicators. Second, because the theory is axiomatic and derivative models require explicit assumptions, it is possible to separate the evaluation of the measurement accuracy of the underlying latent disease state variable from the evaluation of measurement precision. Measurement accuracy refers to the agreement of the observations with the expectations of the theoretical assumptions. Measurement precision is evaluated by comparing the expected variance in the estimate of Θ to the observed variance in the distribution of Θ estimates for the sample. Third, because the theory assumes the existence of a single underlying latent disease state variable, that is, Θ, it is possible to test the hypothesis that Θ is a composite variable with identifiable factors. In other words, various methods for analyzing the covariances of residuals (e.g., principal components analysis and independent components analysis) could be used to evaluate the factor structure of Θ.
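The first corollary can be sketched as a comparison of mean square residuals before and after treatment. This is a minimal sketch under stated assumptions: the residuals are taken to be already standardized, the data are invented, and the function names and the `min_ratio` threshold are hypothetical choices for illustration rather than part of the theory.

```python
def mean_square(residuals):
    """Mean of squared standardized residuals for one indicator."""
    return sum(r * r for r in residuals) / len(residuals)

def selectively_changed(pre, post, min_ratio=1.5):
    """Indicators whose post/pre mean square residual ratio exceeds min_ratio."""
    flagged = []
    for name in pre:
        ratio = mean_square(post[name]) / mean_square(pre[name])
        if ratio > min_ratio:
            flagged.append(name)
    return flagged

# Invented standardized residuals for two indicators, four persons each.
pre = {"TBUT": [0.5, -0.5, 1.0, -1.0], "OSDI": [0.8, -0.6, 0.4, -1.0]}
post = {"TBUT": [0.5, -0.6, 1.0, -0.9], "OSDI": [1.8, -1.6, 1.4, -2.0]}

result = selectively_changed(pre, post)
print(result)
```

Here only the hypothetical OSDI residuals grow after treatment, which is the pattern the corollary associates with a treatment acting on one indicator rather than on Θ.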
We tentatively conclude from our analysis of the reconstructed data of the study of Sullivan et al.^{ 13 } that a single latent dry eye disease state variable exists and can be constructed from clinically observed physiologic and patient symptom indicators. Although tear film osmolarity appears to be the most sensitive dry eye indicator, the observed behavior relative to the other indicators suggests that the mapping function might not be monotonic and/or might not be univariate. If confirmed, that conclusion would suggest that tear film osmolarity, although possibly playing a central role in the disease pathway, would not function as a surrogate for measuring dry eye severity.
A challenge for clinicians managing patients with dry eye disease is the paucity of approved therapies: in most clinical trials, the tested drug has failed to produce a statistically significant improvement in the primary outcome measure. Thus, many of the treatment protocols recommended today involve off-label or unapproved therapies. One possible value of a latent dry eye disease state variable would be as an improved outcome measure for clinical trials, measuring the efficacy of the study drug in reducing disease severity or preventing disease progression.
Appendix
Models of Error Function Distributions
The variance of the error function with respect to person n, indicator variable j, and interval boundary x is

[Equation]

where e_{nj} is a between-person and between-indicator random variable and η_{nj}(…) is a between-person and between-indicator random function. If the variance of η_{nj} is constant within an indicator (i.e., independent of the value of the argument of the function), then it is safe to assume that

[Equation]

and

[Equation]

With this single constraint and resulting approximations, the between-person variance in δ_{njx} is determined almost entirely by ε_{nj}, that is, VAR{δ_{njx}} ≅ VAR{ε_{nj}}. When these conditions are satisfied, VAR{δ_{njx}} is independent of x, and expression 2d in the paper becomes

[Equation A1a]

or, from the definition of ε_{nj}, expression A1a is identical to

[Equation A1b]

If f_{j}(e_{nj}) is defined to be the probability density function of the random error in the observed variable for indicator j, then the probability of observing C_{jx} < Î_{j}(n) + e_{nj} < C_{jx+1} is

[Equation A2]

Similarly, if g_{j}(ε_{nj}) is defined to be the probability density function of the random error in expression A1a for indicator j, then the probability of û_{j}(C_{jx}) < û_{j}(Î_{j}(n) + e_{nj}) < û_{j}(C_{jx+1}) is

[Equation A3]

Because û_{j}(…) is a monotonic function and because the random variable in expression A1b is included in the argument of the function, the probability of û_{j}(C_{jx}) < û_{j}(Î_{j}(n) + e_{nj}) < û_{j}(C_{jx+1}) must be equal to the probability of C_{jx} < Î_{j}(n) + e_{nj} < C_{jx+1}. Therefore, we conclude from the equivalence of expressions A1b and A1a that

[Equation A4]
Equations A2 through A4 define a class of models, characterized as “added independent noise” (i.e., added ε), that differ only in their definitions of f_{j}(e_{nj}) and g_{j}(ε_{nj}). We say the added noise is independent because it does not depend on the magnitude of the indicator variable. Since there is only one sample of I_{j}(n) for each person, or I_{j}(n) is an average of repeated samples from each person, ε represents between-person differences in factors that add their effects to the observed value of I_{j}. Thus, g_{j}(ε_{nj}) can be sample dependent. If g_{j}(ε_{nj}) is a normal or logistic density function, then equation A4 takes the same form as Samejima's graded response model in psychometrics^{ 15 } or a proportional odds ordinal logistic regression model in statistics.^{ 16 } However, more generally, g_{j}(ε_{nj}) can take any form, which can be different for different indicator variables, j, and for different samples of persons, n = 1 to N.
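The probabilities named in equations A2 through A4 can be written out from the definitions of f_{j} and g_{j} given in the text. The block below is a sketch rather than a reproduction of the typeset equations. It assumes that û_{j} is increasing, that ε_{nj} is defined as û_{j}(Î_{j}(n) + e_{nj}) − û_{j}(Î_{j}(n)), and that the integration limits and dummy variables are written as shown; none of these details is confirmed by the surviving text.

```latex
% A2 (sketch): probability that the perturbed indicator falls between two boundaries
P\{C_{jx} < \hat{I}_j(n) + e_{nj} < C_{jx+1}\}
  = \int_{C_{jx} - \hat{I}_j(n)}^{\,C_{jx+1} - \hat{I}_j(n)} f_j(e_{nj})\, de_{nj}

% A3 (sketch, using the assumed definition of \varepsilon_{nj})
P\{\hat{u}_j(C_{jx}) < \hat{u}_j(\hat{I}_j(n)) + \varepsilon_{nj} < \hat{u}_j(C_{jx+1})\}
  = \int_{\hat{u}_j(C_{jx}) - \hat{u}_j(\hat{I}_j(n))}^{\,\hat{u}_j(C_{jx+1}) - \hat{u}_j(\hat{I}_j(n))}
      g_j(\varepsilon_{nj})\, d\varepsilon_{nj}

% A4 (sketch): monotonicity of \hat{u}_j makes the two probabilities equal
\int_{C_{jx} - \hat{I}_j(n)}^{\,C_{jx+1} - \hat{I}_j(n)} f_j(e_{nj})\, de_{nj}
  = \int_{\hat{u}_j(C_{jx}) - \hat{u}_j(\hat{I}_j(n))}^{\,\hat{u}_j(C_{jx+1}) - \hat{u}_j(\hat{I}_j(n))}
      g_j(\varepsilon_{nj})\, d\varepsilon_{nj}
```

Under these assumptions, a logistic g_{j} turns the right-hand side into a difference of two logistic cumulative distribution functions, which is the cumulative form associated with the graded response model.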
If the between-person variance of the error function η_{nj}(…) depends on the argument of the error function, then VAR{δ_{njx}} ≅ VAR{η_{nj}(C_{jx})} + VAR{η_{nj}(Î_{j}(n) + e_{nj})} + VAR{ε_{nj}}. The covariance terms are omitted for the following reason: the variance of η_{nj}(Î_{j}(n) + e_{nj}) depends on both the variance of the error function and the variance of Î_{j}(n) + e_{nj} in the sample, the variance of η_{nj}(C_{jx}) depends only on the variance of the error function in the sample, and the variance of ε depends only on the variance of Î_{j}(n) + e_{nj}; therefore, all of the covariances are likely to be approximately zero. When these conditions are satisfied, the joint probability density function depends on x, the indicator variable j, and the sample of persons, n = 1 to N.
The probability of û_{j}(C_{jx}) + δ_{njx} < û_{j}(Î_{j}(n)) in expression 2d in the paper is equal to the probability of δ_{njx} < û_{j}(Î_{j}(n)) − û_{j}(C_{jx}), which is

[Equation A5a]

and the probability of û_{j}(Î_{j}(n)) < û_{j}(C_{jx+1}) + δ_{njx+1} in expression 2d in the paper is equal to the probability of δ_{njx+1} > û_{j}(Î_{j}(n)) − û_{j}(C_{jx+1}), which is

[Equation A5b]

To the extent that the probability of the event described in equation A5a is independent of the probability of the event described in equation A5b, the probability of both events occurring is the product of the two probabilities. Generalizing to all of the terms in expression 2d in the paper, the probability of the observation satisfying expression 2d, so that we can infer that Θ_{n} = x, is

[Equation A6]

If we impose the additional requirement that the latent interval boundaries for each indicator variable/person combination must be ordered for each observation, that is, û_{j}(C_{j1}) + δ_{nj1} < û_{j}(C_{j2}) + δ_{nj2} < … < û_{j}(C_{jm−1}) + δ_{njm−1} < û_{j}(C_{jm}) + δ_{njm}, then the conditional probability that Θ_{n} = x is

[Equation A7]
Equations A5a through A7 define a class of models that can be characterized as “mapping noise” (i.e., between-person differences in the shapes of the mapping functions). If h_{jx}(δ_{njx}) is a joint logistic density function with an equal variance diagonal covariance matrix that is independent of the choice of indicator variable, j, then equation A7 takes the form of the Masters partial credit latent variable model.^{ 17 } If h_{jx}(δ_{njx}) is a joint logistic density function with a diagonal equal variance covariance matrix but the variance depends on the choice of indicator variable, j, then equation A7 takes the form of Muraki's generalized partial credit latent variable model.^{ 18 } However, more generally, h_{jx}(δ_{njx}) can be different for each value of x and each indicator variable j.
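For readers unfamiliar with the two named models, the sketch below computes category probabilities in the standard textbook form of the partial credit model and its generalized variant. It is not a transcription of equation A7. The function name and the parameters `theta` (latent variable value), `step_difficulties` (category step locations), and `slope` (indicator-specific discrimination) are hypothetical names chosen for the example.

```python
import math

def partial_credit_probs(theta, step_difficulties, slope=1.0):
    """Category probabilities P(X = k), k = 0..m, for one indicator.

    slope = 1.0 gives the partial credit form; an indicator-specific
    slope gives the generalized partial credit form.
    """
    # Running sums of slope * (theta - step); category 0 has an empty sum of 0.
    exponents = [0.0]
    for step in step_difficulties:
        exponents.append(exponents[-1] + slope * (theta - step))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(exponents)
    weights = [math.exp(v - peak) for v in exponents]
    total = sum(weights)
    return [w / total for w in weights]

probs = partial_credit_probs(theta=0.0, step_difficulties=[-1.0, 1.0])
print([round(p, 3) for p in probs])
```

With a single shared slope every indicator has the same discrimination; letting `slope` vary by indicator corresponds to the case in which the variance depends on the choice of indicator variable.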
References
1. Bron AJ, Abelson MB, Ousler BS. Methodologies to diagnose and monitor dry eye disease: report of the Diagnostic Methodology Subcommittee of the International Dry Eye Workshop (2007). Ocul Surf. 2007;5:108–152.
2. Lemp MA. Report of the National Eye Institute/Industry Workshop on Clinical Trials in Dry Eye. CLAO J. 1995;21:221–232.
3. Lemp MA, Baudouin C, Baum J. The definition and classification of dry eye disease: report of the Definition and Classification Subcommittee of the International Dry Eye Workshop (2007). Ocul Surf. 2007;5:75–92.
4. Behrens A, Doyle JJ, Stern L. Dysfunctional tear syndrome: a delphi approach to treatment recommendations. Cornea. 2006;25:900–907.
5. Pflugfelder SC, Geerling G, Kinoshita S. Management and therapy of dry eye disease: report of the Management and Therapy Subcommittee of the International Dry Eye Workshop (2007). Ocul Surf. 2007;5:163–178.
6. Mathers WD, Choi D. Cluster analysis of patients with ocular surface disease, blepharitis, and dry eye. Arch Ophthalmol. 2004;122:1700–1704.
7. Nichols KK, Nichols JJ, Mitchell GL. The lack of association between signs and symptoms in patients with dry eye disease. Cornea. 2004;23:762–770.
8. Baudouin C. Un nouveau schéma pour mieux comprendre les maladies de la surface oculaire. J Fr Ophtalmol. 2007;30:239–246.
9. Mathers WD. Why the eye becomes dry: a cornea and lacrimal gland feedback model. CLAO J. 2000;26:159–165.
10. Farris RI. Tear osmolarity: a new gold standard? Adv Exp Med Biol. 1994;506:495–503.
11. Tomlinson A, Khanal S, Ramaesh K. Tear film osmolarity: Determination of a referent for dry eye diagnosis. Invest Ophthalmol Vis Sci. 2006;47:4309–4315.
12. Lemp MA, Bron AJ, Baudouin C. Tear osmolarity in the diagnosis and management of dry eye disease. Am J Ophthalmol. 2011;151:792–798.
13. Sullivan BD, Whitmer D, Nichols KK. An objective approach to dry eye disease severity. Invest Ophthalmol Vis Sci. 2010;51:6125–6130.
14. Schiffman R, Christianson MD, Jacobsen G. Reliability and validity of the ocular surface disease index. Arch Ophthalmol. 2000;118:615–621.
15. Samejima F. Estimation of latent ability using a response pattern of graded scores. Psychometric Monogr. 1969;34(2, No. 17).
16. McCullagh P. Regression models for ordinal data. J R Stat Soc Series B Stat Methodol. 1980;42:109–142.
17. Masters GN. A Rasch model for partial credit scoring. Psychometrika. 1982;47:149–174.
18. Muraki E. A generalized partial credit model: application of an EM algorithm. Appl Psychol Meas. 1992;16:159–176.
19. Andrich D. Sufficiency and conditional estimation of person parameters in the polytomous Rasch model. Psychometrika. 2010;75:293–308.
20. Massof RW. Understanding Rasch and item-response theory models: applications to the estimation and validation of interval latent trait measures from responses to rating scale questionnaires. Ophthalmic Epidemiol. 2011;18:1–19.
21. Foulkes GN, Bron AJ. Meibomian gland dysfunction: a clinical scheme for description, diagnosis, classification, and grading. Ocul Surf. 2003;1:107–126.
22. Korb DR. Survey of preferred tests for diagnosis of the tear film and dry eye. Cornea. 2000;19:483–486.
Disclosure: R.W. Massof, Alcon (F, C); P.J. McDonnell, Abbott Medical Optics (C), Alcon (C), Allergan (C), Inspire (C)
Supported by an Alcon Research Institute award (RWM) and by an unrestricted grant to the Department of Ophthalmology, Johns Hopkins University School of Medicine, from Research to Prevent Blindness, Inc., New York, NY.