Abstract
Purpose :
Individuals previously infected with the bacterium Chlamydia trachomatis (Ct), the cause of trachoma, carry specific antibodies whose concentration can be measured by a Pgp3-specific enzyme-linked immunosorbent assay (ELISA) in arbitrary optical density (OD) units. In general, the distribution of antibody levels for individuals previously infected with Ct (seropositives) overlaps with the distribution for those who were not (seronegatives). This raises the problem of where to place the seropositivity threshold when it is not known a priori which individuals are seropositive. There are two conceptually different approaches to setting the threshold. The first requires a reference sample of known seronegatives and seropositives, from which a threshold that appropriately balances sensitivity and specificity is chosen and then applied to future samples. The second requires no reference sample and instead uses a 2-component Gaussian Mixture Model (GMM) to estimate the underlying seronegative and seropositive distributions, from which the threshold is chosen. We compared the accuracy and precision of these two approaches.
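As an illustration of the second approach, the following is a minimal sketch and not the authors' implementation. It fits a 2-component one-dimensional Gaussian mixture by expectation-maximisation to unlabeled log OD values and derives two thresholds from the fit: the mean of the estimated seronegative component plus 3 standard deviations, and the cut-off that maximises Youden's J under the fitted Gaussians. The function names, the initialisation from the two halves of the sorted data, the iteration count, and the grid search are all assumptions made for this example.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution at x."""
    return 0.5 * (1 + math.erf((x - mean) / (sd * math.sqrt(2))))


def fit_two_component_gmm(values, n_iter=200):
    """Fit a 2-component 1-D Gaussian mixture by EM (hypothetical helper).

    Returns two (weight, mean, sd) tuples ordered by mean, so the first is
    the putative seronegative component and the second the seropositive one.
    """
    xs = sorted(values)
    half = len(xs) // 2
    params = []
    # Assumed initialisation: lower and upper halves of the sorted data.
    for part in (xs[:half], xs[half:]):
        m = sum(part) / len(part)
        sd = math.sqrt(sum((x - m) ** 2 for x in part) / len(part)) or 1.0
        params.append((0.5, m, sd))
    for _ in range(n_iter):
        # E-step: responsibility of the upper component for each value.
        resp = []
        for x in xs:
            p0 = params[0][0] * normal_pdf(x, params[0][1], params[0][2])
            p1 = params[1][0] * normal_pdf(x, params[1][1], params[1][2])
            denom = p0 + p1
            resp.append(p1 / denom if denom > 0 else 0.5)
        # M-step: update weight, mean and sd of each component.
        new_params = []
        for k in (0, 1):
            r = [ri if k == 1 else 1 - ri for ri in resp]
            total = max(sum(r), 1e-12)
            mean = sum(ri * x for ri, x in zip(r, xs)) / total
            var = sum(ri * (x - mean) ** 2 for ri, x in zip(r, xs)) / total
            new_params.append((total / len(xs), mean, math.sqrt(max(var, 1e-12))))
        params = new_params
    lower, upper = sorted(params, key=lambda p: p[1])
    return lower, upper


def three_sigma_threshold(lower):
    """Mean of the estimated seronegative component plus 3 standard deviations."""
    _, mean, sd = lower
    return mean + 3 * sd


def gmm_youden_threshold(lower, upper, grid=2000):
    """Cut-off between the two means that maximises J under the fitted Gaussians.

    With 'positive' meaning value >= t: specificity = CDF_neg(t) and
    sensitivity = 1 - CDF_pos(t), so J(t) = CDF_neg(t) - CDF_pos(t).
    """
    (_, m0, s0), (_, m1, s1) = lower, upper
    best_t, best_j = m0, -1.0
    for i in range(grid + 1):
        t = m0 + (m1 - m0) * i / grid
        j = normal_cdf(t, m0, s0) - normal_cdf(t, m1, s1)
        if j > best_j:
            best_t, best_j = t, j
    return best_t
```

Both thresholds depend entirely on the fitted components, so any departure of the real antibody distributions from the Gaussian shape carries through to the chosen cut-off.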
Methods :
We used three reference samples of seronegatives and seropositives (one from Nepal and two from Tanzania) to test how well the threshold that maximizes Youden's J index, which weights sensitivity and specificity equally, in one sample predicts the corresponding threshold in the other samples. We also assessed the accuracy of GMMs by comparing the GMM-estimated Youden's J threshold with the true Youden's J threshold in each sample. We compared the standard errors of both estimates, as well as that of the most commonly used GMM threshold: 3σ above the mean of the estimated seronegative distribution.
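The reference-sample threshold described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the function name is hypothetical, a sample is called seropositive when its log OD value is at or above the cut-off, and candidate cut-offs are taken as midpoints between consecutive distinct observed values.

```python
def youden_threshold(negatives, positives):
    """Return (threshold, J) maximising J = sensitivity + specificity - 1.

    negatives / positives: log OD values of known seronegatives and
    seropositives from one reference sample (hypothetical inputs).
    """
    values = sorted(set(negatives) | set(positives))
    # Assumed candidate cut-offs: midpoints between consecutive distinct values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best_t, best_j = None, -1.0
    for t in candidates:
        sensitivity = sum(x >= t for x in positives) / len(positives)
        specificity = sum(x < t for x in negatives) / len(negatives)
        j = sensitivity + specificity - 1
        if j > best_j:  # ties keep the lowest cut-off
            best_t, best_j = t, j
    return best_t, best_j


def threshold_difference(sample_a, sample_b):
    """Absolute difference between the Youden's J thresholds of two samples.

    Each sample is a (negatives, positives) pair of log OD values.
    """
    t_a, _ = youden_threshold(*sample_a)
    t_b, _ = youden_threshold(*sample_b)
    return abs(t_a - t_b)
```

Applying `threshold_difference` to each pair of reference samples and averaging the results gives a mean absolute difference in log OD units, the kind of cross-sample comparison described in this section.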
Results :
The mean absolute difference in log OD values when using the Youden's J threshold from one sample to predict the Youden's J threshold in another sample was 0.066, while the mean absolute difference between the GMM-estimated and true Youden's J thresholds was 1.786, indicating that the Gaussian assumption in GMMs is fairly inaccurate. Standard errors were 0.2287 for Youden's J, 0.5645 for GMM-estimated Youden's J, and 0.5862 for 3σ above the mean of the GMM-estimated seronegative distribution.
Conclusions :
A threshold based on Youden's J index from a single reference sample of seronegatives and seropositives is more accurate and more precise than thresholds estimated using a 2-component Gaussian Mixture Model.
This abstract was presented at the 2019 ARVO Annual Meeting, held in Vancouver, Canada, April 28 - May 2, 2019.