Abstract
Purpose:
To examine the diagnostic accuracy and performance of Uvemaster, a mobile application (app) or diagnostic decision support system (DDSS) for uveitis. The app contains a large database of knowledge including 88 uveitis syndromes each with 76 clinical items, both ocular and systemic (total 6688) and their respective prevalences, and displays a differential diagnoses list (DDL) ordered by sensitivity, specificity, or positive predictive value (PPV).
Methods:
In this retrospective case-series study, diagnostic accuracy (percentage of cases for which a correct diagnosis was obtained) and performance (percentage of cases for which a specific diagnosis was obtained) were determined in reported series of patients originally diagnosed by a uveitis specialist with specific uveitis (N = 88) and idiopathic uveitis (N = 71), respectively.
Results:
Diagnostic accuracy was 96.6% (95% confidence interval [CI], 93.2–100). By sensitivity, the original diagnosis appeared among the top three in the DDL in 90.9% (95% CI, 84.1–96.6) and was the first in 73.9% (95% CI, 63.6–83.0). By PPV, the original diagnosis was among the top three in the DDL in 62.5% (95% CI, 51.1–71.6) and the first in 29.5% (95% CI, 20.5–38.6; P < 0.001). In 71 (31.1%) patients originally diagnosed with idiopathic uveitis, 19 new diagnoses were made, reducing this series to 52 (22.8%) and raising the rate of diagnosed specific uveitis cases by 8.3 percentage points (performance = 77.2%; 95% CI, 71.1–82.9).
Conclusions:
Uvemaster proved accurate and, based on the same clinical data, was able to detect more cases of specific uveitis than the original clinician-only method.
Several technologies have been used to apply artificial intelligence and machine learning to the fields of biomedicine and medical diagnosis. The differential diagnosis procedure can be automated using computer-based systems, which are usually referred to as diagnostic decision support systems (DDSS).
1–3 In the medical literature, references to DDSS are ever more frequent.
4–7 Some of these applications may be found in ophthalmology,
8–11 and specifically in uveitis.
12–16 Diagnostic decision support systems relate health observations to health knowledge, and therefore help clinicians make adequate decisions for improved health care. A DDSS consists of a knowledge base, inference engine, and user communication. The knowledge base is acquired from the medical literature data, expert consultations, and individual clinical experience.
Uveitis is a major ophthalmologic problem worldwide, with a relatively high prevalence, multiple etiologies, and wide variation in its presentation.
17–20 Because of this, its differential diagnosis is difficult, especially for clinicians who are not uveitis experts, generating unnecessary testing costs, delays in initiating correct treatment, and a lack of clear information for patients. The Naming-Meshing-System (NMS) suggested by Smith and Nozik
21 is an effective systematic approach to diagnosing uveitis. Based on patient medical history and a physical examination, the clinical findings of the particular case (naming) are compared with the characteristics of specific uveitis syndromes (meshing) to give rise to the differential diagnosis that best matches the set of clinical data. Additional data are then compiled through a tailored approach. According to these authors, the main limitation of NMS consists of the percentage of cases of idiopathic uveitis that cannot be categorized, because of rare entities, masquerade syndromes, and the unpredictable idiosyncratic response of living tissues to invasion by microorganisms and other antigens.
Diagnostic decision support systems can help optimize the clinical management of uveitis.
13,15,16,22 Artificial intelligence models offer several benefits, such as great computing power, which enables data on a large number of uveitis syndromes and their different clinical characteristics to be summarized and compared to generate a differential diagnosis.
In this report, we present the Uvemaster (Leading SHT, A Coruña, Spain), a DDSS for uveitis based on a mobile application (app), and assess its diagnostic accuracy and performance.
The Uvemaster DDSS contains a large knowledge base of 88 selected uveitis syndromes, each comprising 76 clinical items, both ocular and systemic (
Table 1), and their respective frequencies of appearance (total 88 × 76 = 6688).
Table 1 Clinical Data in the Uvemaster's Knowledge Base
Clinical data were collected according to the criteria often used to classify uveitis
23,24: primary site of inflammation, severity, course, laterality, keratic precipitate shape and color, age, sex, response to steroids, ocular findings, systemic manifestations, and epidemiologic data. All clinical data can be obtained in an anamnesis, physical exam, and review of systemic disease symptoms. Primary site of inflammation, severity, and course were defined according to the criteria of the International Uveitis Study Group (IUSG).
24 To help the system distinguish between infectious and noninfectious uveitis, the former was excluded if it showed full resolution in response to steroids as the only treatment without concomitant antimicrobial therapy.
Each clinical finding is assigned a value of 0 to 100, depending on its prevalence for each specific uveitis syndrome: if an item is never recorded in a given syndrome the number assigned will be 0, and if it is always present then the value assigned will be 100; the remaining values are set in increments of 10 (10–90). All values were obtained from a systematic and continuous review of the literature from 1993 to 2016, carried out by the main author (JAG), averaging the data published by other authors with ours through meta-analysis techniques. The collected data were stored in an Access file (Microsoft Corporation, Redmond, WA, USA). Once the average prevalence was calculated, this was rounded to the nearest 10 (e.g., if the prevalence of a finding was 27.8, this item was assigned a value of 30).
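The rounding step described above can be sketched as follows. This is only an illustration: the function name is hypothetical, the half-up tie rule is an assumption, and the text does not state how averaged prevalences just above 0 or just below 100 were handled, so the sketch applies plain rounding throughout.

```python
import math


def round_to_nearest_ten(prevalence_percent: float) -> int:
    """Map an averaged prevalence (0-100 %) to a knowledge-base value in steps of 10."""
    if not 0.0 <= prevalence_percent <= 100.0:
        raise ValueError("prevalence must lie between 0 and 100")
    # Half-up rounding (assumed tie rule). Python's built-in round() uses
    # banker's rounding, so round(2.5) would give 2 rather than 3.
    return int(math.floor(prevalence_percent / 10.0 + 0.5)) * 10


print(round_to_nearest_ten(27.8))  # 30, the example given in the text
```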
Prevalences of known specific uveitis syndromes (
Table 2) were taken as the means of those of a patient series from the Hospital Clinico San Carlos reported by the authors in 1995,
25 and those of three recent large European series.
26–28 These prevalences were adjusted for subsequent epidemiologic trends.
26–28 Uveitis syndromes of unknown prevalence were assigned a frequency of 0.02% to 0.1%.
Table 2 Prevalences of Specific Uveitis Syndromes Included in Uvemaster*
The compiled knowledge base also contains Uvepedia, an electronic textbook with an extensive log of lab tests, complementary investigations, therapy, prognoses, and additional information. From the DDL, the user can directly browse the e-book and select the most appropriate laboratory and special tests. Finally, the case assessed is stored in an electronic health registry (EHR), where it can later be viewed and updated by means of an identification number. In line with data protection legislation, these EHRs store no information that could reveal the identity of patients. The user agrees to make appropriate use of the data entered in the app.
Each uveitis syndrome offered by Uvemaster in the DDL is provided with three diagnostic indices calculated through mathematical algorithms: sensitivity, specificity, and PPV (
Fig. 2). These indices are calculated from the information in the database, namely the frequency of appearance of clinical findings in the different uveitis syndromes and the prevalence of those syndromes. The statistical terms refer to the patient's clinical data set (the test) for each uveitis syndrome in the DDL. Each time the user modifies the clinical findings for a patient or manually excludes entities that are not applicable from the list, the diagnostic indices are recalculated.
Sensitivity (S) indicates the capacity of a test to define as positive a true case (i.e., test sensitivity is its ability to detect disease in sick subjects). The sensitivity of a clinical data set indicates how much a specific uveitis syndrome resembles the case under study.
Table 3 shows an example of the app used for a 13-year-old girl with juvenile idiopathic arthritis and uveitis.
Table 3 Mathematical Algorithms and Statistics for an Example Case*
Specificity (SP) indicates the capacity of the test to identify as negative a real negative case (i.e., test specificity is the capacity to detect the absence of the disease in healthy subjects).
Positive predictive value is the proportion of subjects with disease among those testing positive (i.e., it describes the capacity of the test to detect the disease if the result is positive).
Sensitivity and specificity are theoretical values intrinsic to the diagnostic test and do not change among populations. Predictive values are probabilities of a given result (that is, posttest values) and depend on the prevalence of a specific uveitis syndrome.
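For reference, the three indices can be written in their standard textbook form. The symbols TP, FN, TN, and FP (true and false positives and negatives) and p (prevalence of the syndrome) are notation introduced here for illustration; these are the general definitions and are not claimed to reproduce the app's own algorithms.

```latex
\begin{align}
S   &= \frac{TP}{TP + FN}, \qquad SP = \frac{TN}{TN + FP}, \\
PPV &= \frac{TP}{TP + FP}
     = \frac{S \cdot p}{S \cdot p + (1 - SP)(1 - p)}
\end{align}
```

The second expression for PPV follows from Bayes' theorem and makes the dependence on prevalence explicit: for fixed S and SP, a lower p gives a lower PPV.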
To assess the diagnostic accuracy of the app, we used clinical records for two consecutive series of cases from two different hospitals (Hospital Clinico San Carlos and Hospital La Princesa, Madrid, Spain) diagnosed with any form of uveitis or intraocular inflammation according to IUSG criteria
24 from July to December 1993. To assess system performance, records of all unclassified or idiopathic uveitis from a series reported by some of the authors of the present study
25 were used. Uvemaster has undergone constant development since its creation in 1992. Thus, we used the same patient series tested in 1993 to check whether improvements over the years had improved the accuracy and performance of the app. Although not the objective of the present study, improvements include easier use of the app, faster processing speed, adaptation to mobile phones, and a broader knowledge base and set of inference algorithms to provide statistics. The study protocol adhered to the tenets of the Declaration of Helsinki and received institutional review board approval. Diagnostic accuracy was defined as the presence of the correct clinical diagnosis in the DDL. This was assessed in a retrospective study using 88 medical records of uveitis from the Hospital Clinico San Carlos (
N = 43) and Hospital La Princesa (
N = 45) by comparing the original diagnosis made by some of the authors (DDV, RMF, JMBC, JAGF) and the computerized diagnosis obtained using the app. To avoid bias, patient data were entered in the app by the first author (JAGF) in a blinded fashion. We checked whether the original diagnosis was included in the DDL and its likely position within it.
The variables recorded were the percentage of prior clinician-based diagnoses in first position in the DDL and those among the top three positions according to their sensitivity or PPV values. We also analyzed mean number of diagnoses generated according to the number of positive findings introduced, and the presence of the original clinical diagnosis and pattern of uveitis in the top three in the DDL.
Performance was assessed as the percentage of cases for which a specific diagnosis was obtained using the app. This indicator was determined in a reported series
25 of 228 patients with uveitis from our center, 71 cases of which were idiopathic, who were examined at that time by the authors (DDV, RMF, JMBC). For the present study, data from the clinical records of these 71 patients diagnosed with idiopathic uveitis in 1993 were entered in the app. To validate its performance, the new specific diagnoses of these cases were compared with diagnoses made later by specialists during the follow-up of these patients.
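As a worked illustration of this definition, the performance figures quoted in the Abstract follow directly from the counts of the series (228 patients, 71 originally idiopathic, 19 newly diagnosed with the app):

```latex
\begin{align}
\text{specific diagnoses before the app} &= \frac{228 - 71}{228} = \frac{157}{228} \approx 68.9\%, \\
\text{performance with the app} &= \frac{228 - 52}{228} = \frac{176}{228} \approx 77.2\%, \\
\text{gain} &= \frac{19}{228} \approx 8.3 \text{ percentage points}.
\end{align}
```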
The percentage of new specific diagnoses in first position in the DDL was determined according to their sensitivity, specificity, and PPV. Finally, we examined differences between new diagnoses and those that remained idiopathic in relation to mean positive findings, SP of the top diagnosis, and concomitant systemic manifestations.
The χ2 test was used to compare qualitative variables: the position of prior clinical diagnoses in DDL ordered by sensitivity versus PPV, and the percentage of idiopathic and specific uveitis before and after the use of the app. This test was also used to relate the appearance of a diagnosis in the top three of the DDL with other variables, such as original clinical diagnosis and pattern of uveitis. Finally, for cases of idiopathic uveitis, we compared the existence of systemic findings between new specific diagnoses and those that remained idiopathic.
To compare the mean number of DDL diagnoses with the mean number of clinical findings entered in the app, we used the Student's t-test for related samples and calculated Pearson correlation coefficients for both specific and idiopathic uveitis. This test was also used to compare the mean number of positive findings and the SP value of the first DDL diagnosis between new diagnoses and those that remained idiopathic.
Significance was set at P ≤ 0.05, and associated 95% confidence intervals (CI) were computed. To determine the likelihood that uveitis would be classified as idiopathic with the app versus clinician assessment, odds ratios (OR) were calculated. All statistical tests were performed using SPSS software version 21.0 (SPSS Inc., Chicago, IL, USA).
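The odds ratio calculation can be sketched from the counts reported for the series of 228 patients (71 idiopathic by clinician assessment, 52 idiopathic with the app). The function name and the Woolf (log-normal) confidence interval are assumptions made for illustration; the text does not state which interval method was used, and this simple 2 × 2 form treats the two assessments as independent groups even though they concern the same patients.

```python
import math


def odds_ratio(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Return (OR, lower, upper) for the 2x2 table [[a, b], [c, d]].

    OR = (a/b) / (c/d); the interval is Woolf's log-normal approximation
    (an assumed method, not taken from the paper).
    """
    or_value = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_value) - z * se_log)
    upper = math.exp(math.log(or_value) + z * se_log)
    return or_value, lower, upper


# App: 52 idiopathic vs 176 specific; clinician: 71 idiopathic vs 157 specific.
or_value, lower, upper = odds_ratio(52, 176, 71, 157)
print(f"OR = {or_value:.2f}")  # OR = 0.65
```

An OR of about 0.65 corresponds to odds of an idiopathic label roughly one-third lower with the app than with clinician assessment.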
The differential diagnosis of uveitis remains a challenge, creating a need for new, accurate diagnostic methods to optimize the management of individuals with uveitis. The results of our study indicate the high accuracy and performance of the diagnostic decision support system we propose for this clinical entity.
One of the first computer-aided systems, Quick Medical Reference (First DataBank, San Bruno, CA, USA) designed by Bankowitz et al.
5 for internal medicine, was based on data from anamnesis, physical examination, and ancillary tests. The sensitivity of this system was 85%, lower than observed here for Uvemaster (96.6%). These authors concluded that their system provides diagnostic options that could modify the differential diagnosis.
Other known DDSS designed for internal medicine are Dxplain
4 (Massachusetts General Hospital, Boston, MA, USA), Iliad
7 (Applied Medical Informatics, Salt Lake City, UT, USA) and Isabel
6 (Isabel Healthcare Inc., Ann Arbor, MI, USA). Much like our system, the use of Isabel has been linked to a correct diagnosis in 96% of cases.
Gonzalez-Lopez et al.
16 developed a Bayesian belief network algorithm for the differential diagnosis of anterior uveitis. In 63.8% of cases, the most probable etiology determined by the algorithm matched the clinician's diagnosis. This sensitivity is lower than the value we obtained here for the Uvemaster (73.9%). Another DDSS for uveitis named Uveitis Doctor
14 (Lara-Medina, Alcazar de San Juan, Spain) is currently available on iOS. Its knowledge base comprises 59 uveitis syndromes and the inference engine is based on decision trees, although it does not determine statistical indices for the DDL. We propose that many cases of uveitis are labeled idiopathic because the more uncommon presentations are often not considered, and would stress the importance of a thorough workup to try to find the origin.
Whether the original diagnosis appeared within the top three of the DDL was associated with uveitis pattern (P = 0.041) and with original diagnosis (P = 0.017). Among patients whose diagnoses ranked below the top three when ordered by sensitivity, there were more cases of panuveitis and retinal vasculitis, and among those whose diagnoses made the top three, there were more cases of anterior uveitis. The reason for this finding is that, in cases of panuveitis and retinal vasculitis,
Uvemaster places etiologies such as toxoplasmosis, herpes, or cytomegalovirus in the top three of the DDL because their sensitivity is higher than that of the original diagnoses (in these cases syphilis, tuberculosis, or Behçet's), as may be observed in patients 17, 21, 32, and 46 of
Supplementary Table S1.
The etiologic syndromes that least appeared among the top three were syphilis, Behçet's, tuberculosis, and sarcoidosis, which could be correlated with a higher frequency of panuveitis in this group. No differences were observed according to sex (P = 0.078), though there were more diagnoses within the top three in women.
There were three false negative results detected in our study. In one case (patient #10), uveitis was bilateral and the system awarded lens-related uveitis a value of zero (likelihood null). In patient #20, the prior diagnosis of presumed ocular histoplasmosis syndrome (POHS) was rejected because of the presence of vitritis, and the system assigned POHS a zero likelihood. Finally, in patient #26, the clinical relevance of ocular toxocariasis was not recognized because it was bilateral, and a likelihood of zero was assigned to Toxocara. The most likely diagnoses in these three cases were idiopathic panuveitis, serpiginous choroidopathy, and sympathetic ophthalmia, respectively. This means that, besides confirming a diagnosis, the computerized system can also help the clinician to review and modify a diagnosis. However, if the original diagnosis is accepted as valid, this could be considered a limitation of the system and we would have to change the frequencies in its knowledge base of clinical findings that show values of zero because they never occur but that in these cases could indeed have occurred.
A significant difference in the DDL position of a given syndrome according to sensitivity versus PPV indicates that a patient's clinical data set may have a high sensitivity for a particular uveitis syndrome yet will have a lower PPV if the syndrome shows a low prevalence. In the example shown in
Table 3, the data set indicated a higher sensitivity for juvenile idiopathic arthritis (JIA) than for sarcoidosis. However, the PPV was higher for sarcoidosis because its prevalence is higher than that of JIA (F = 5.2% vs. 2.7%). Sensitivity is a pretest value that is independent of disease prevalence, so it does not vary among populations, and it is the best indicator of the match between the clinical data set and a specific uveitis syndrome. In contrast, PPV is a posttest likelihood and depends on disease prevalence.
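The reversal in DDL order can be illustrated with a short sketch. Only the two prevalences (2.7% for JIA and 5.2% for sarcoidosis) come from the text; the sensitivity and specificity values below are hypothetical placeholders, and the PPV function is the standard Bayes form rather than the app's own algorithm.

```python
def ppv(sens: float, spec: float, prev: float) -> float:
    """Positive predictive value from sensitivity, specificity, and prevalence."""
    true_pos = sens * prev
    false_pos = (1.0 - spec) * (1.0 - prev)
    return true_pos / (true_pos + false_pos)


# sens/spec are hypothetical; prev values are those quoted in the text.
candidates = {
    "JIA": {"sens": 0.80, "spec": 0.90, "prev": 0.027},
    "Sarcoidosis": {"sens": 0.70, "spec": 0.90, "prev": 0.052},
}

by_sensitivity = sorted(
    candidates, key=lambda name: candidates[name]["sens"], reverse=True
)
by_ppv = sorted(
    candidates, key=lambda name: ppv(**candidates[name]), reverse=True
)

print(by_sensitivity)  # ['JIA', 'Sarcoidosis']
print(by_ppv)          # ['Sarcoidosis', 'JIA']
```

With equal specificity, the syndrome with the better match (higher sensitivity) still ranks second by PPV once its lower prevalence is taken into account.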
The new rate of specific uveitis (77.2%) obtained using the app in the patient series reported by Benitez del Castillo
25 was higher than the rates described for other well-known series such as those of Llorenç
28 (74%), Rothova
29 (73%), Jones
26 (67.8%), Henderly
30 (67.5%), Jakob
27 (59.1%), Kijlstra
31 (56%), Palmares
32 (51.5%), and Santin
33 (49.5%). These authors also made use of a tailored method. Computer-aided systems may enhance diagnostic performance because of their greater capacity to process, compare, and summarize clinical findings. Similarly, because the knowledge base of our system comprises numerous uveitis entities and masquerade syndromes, along with further ocular, systemic, and epidemiologic findings, new diagnostic choices frequently not considered by clinicians will modify the DDL. The OR of idiopathic uveitis identified using the DDSS was lower by one-third than through conventional clinician assessment. The Uvemaster could be especially useful for detecting low prevalence entities or those in which signs appear during the disease course, and are therefore difficult to initially diagnose. An example of this may be seen in several of the patients in
Supplementary Table S2.
All findings of the anamnesis and physical exam considered for the clinical diagnosis were entered in the app. If we examine the cases that remained idiopathic according to the app, it becomes clear that a syndrome is more difficult to label when the patient has fewer ocular or systemic clinical findings and fewer differentiated or specific signs. Significant differences were found for mean positive findings entered (9.11 among the newly diagnosed, 7.96 among those remaining idiopathic, P = 0.002) and mean SP of the top DDL diagnosis (56.3 vs. 53.3 respectively, P = 0.043). We also observed a trend for new diagnoses to show systemic manifestations compared with cases confirmed by the app as idiopathic (26.3% vs. 9.6% respectively, P = 0.073). This could mean a greater capacity of the app to clarify the etiology of a uveitis syndrome when associated with a systemic disease, which would seem logical.
Similarly to the diagnostic accuracy of our DDSS, the percentage of new diagnoses assigned first position in the DDL was higher (16/19; 84.2%) when these were ordered by sensitivity than by specificity or PPV. Again, the reason for this is that sensitivity is the statistical variable that best defines similarity between a particular clinical case and a specific uveitis syndrome.
The present DDSS has some limitations. Although we rounded values of prevalences of the clinical findings included in our app, this should not affect the final accuracy of its statistical calculations too much because of inherent variability in these prevalences, which depends on the patient series analyzed and geographic location. It should be noted that assigning 6688 frequency values was a laborious task because many values could not be found in the literature and had to be inferred from other findings.
Initially, the app was designed to only process present or positive findings. This limited its sensitivity and diagnostic accuracy because when clinical findings that always occur in certain uveitis syndromes were absent in a patient, the system was unable to eliminate such syndromes from the DDL. To resolve this limitation, we added a second filter rule (<100) in the inference engine, which considers absent or negative clinical findings. Hence, the app can rule out entities for which a necessary clinical finding is not recorded in the case under diagnosis.
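The two exclusion rules can be sketched as follows. This is a minimal illustration under stated assumptions, not the app's inference engine: the data layout, the function name, and all numeric values other than the zeros described for POHS (vitritis) and ocular toxocariasis (bilateral involvement) are hypothetical, as is the third syndrome.

```python
# Hypothetical excerpt of a knowledge base: syndrome -> finding -> value (0-100).
KNOWLEDGE_BASE = {
    "POHS": {"vitritis": 0, "bilateral": 50},
    "Ocular toxocariasis": {"vitritis": 80, "bilateral": 0},
    "Hypothetical syndrome": {"vitritis": 60, "bilateral": 50, "hallmark_sign": 100},
}


def filter_ddl(knowledge_base, present, absent):
    """Keep syndromes compatible with the present and absent findings.

    Rule 1: a present finding with value 0 (never recorded) excludes the syndrome.
    Rule 2 (the "<100" filter): an absent finding with value 100 (always
    present) excludes the syndrome.
    Findings missing from a syndrome's entry are treated as unknown here.
    """
    kept = []
    for syndrome, values in knowledge_base.items():
        if any(values.get(finding) == 0 for finding in present):
            continue
        if any(values.get(finding) == 100 for finding in absent):
            continue
        kept.append(syndrome)
    return kept


print(filter_ddl(KNOWLEDGE_BASE, present={"vitritis"}, absent={"hallmark_sign"}))
# ['Ocular toxocariasis']
```

Without the second rule, the hypothetical syndrome would remain in the list even though a finding it always shows is absent.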
The successful use of our DDSS is fully dependent on proper assessment of symptoms and signs by the responsible clinician, because the computer will only process the data we introduce. This means it is important that patient data are entered following defined criteria. As argued by Miller,
34 the clinical diagnosis of a patient involves two distinct steps: data collection followed by data interpretation. Thus, just as we can make a wrong diagnosis if the data from the anamnesis and physical examination are incorrect, the DDSS will also provide a wrong diagnosis.
Another limitation of our DDSS is that the DDL may include uveitis syndromes that do not fully match the patient's clinical profile due to data not considered by the program, such as specific morphologic details of ocular or systemic abnormalities. In these cases, the clinician has to manually exclude these syndromes from the DDL. Hence, we must make careful and proper use of the DDSS, because it will only be useful to support or reject our decision, as long as it is well guided.
Our study was based on a review of patient records, so in some cases we lacked information on all present clinical findings, and especially on absent or negative findings. This is a limitation because the more data are entered, the greater the sensitivity of the DDSS and the shorter the DDL.
A large prospective study will be useful to compare the diagnostic performance and accuracy of Uvemaster with that of a senior specialist in uveitis, and also provide information on how the app could serve to modify patient management or prognosis. To improve the app's performance, a filter rule for different ethnic groups will be included. Regional and endemic areas are already considered.
In conclusion, the Uvemaster is an intuitive, highly sensitive, and accurate DDSS. Our findings indicate that this computerized tool can improve the clinical management of uveitis by offering potential diagnoses and reducing the number of cases labeled as idiopathic uveitis. The diagnoses generated by the tool can then be checked by clinicians through a tailored approach.
Awarded best poster at the Annual Meeting of the American Academy of Ophthalmology; October 15–18, 2016; Chicago, Illinois, United States. PO431.
Disclosure: J.A. Gegundez-Fernandez, Uvemaster (I), Bausch&Lomb (C, R), Leading Smart Health Tech (C, R), Santem, Inc. (C), Allergan (R), Medical-Mix (R), Thea Laboratories (R); J.I. Fernandez-Vigo, None; D. Diaz-Valle, None; R. Mendez-Fernandez, None; R. Cuiña-Sardiña, None; E. Santos-Bueso, None; J.M. Benitez-del-Castillo, None