**Purpose**:
To validate the prediction accuracy of variational Bayes linear regression (VBLR) with two datasets external to the training dataset.

**Method**:
The training dataset consisted of 7268 eyes of 4278 subjects from the University of Tokyo Hospital. Two external datasets were used for validation: the Japanese Archive of Multicentral Databases in Glaucoma (JAMDIG) dataset, consisting of 271 eyes of 177 patients, and the Diagnostic Innovations in Glaucoma Study (DIGS) dataset, consisting of 248 eyes of 173 patients. Prediction accuracy was compared between VBLR and ordinary least squares linear regression (OLSLR). OLSLR and VBLR were carried out using total deviation (TD) values at each of the 52 test points, for series running from the second to fourth visual fields (VFs) (VF2–4) up to the second to 10th VFs (VF2–10) of each patient in the JAMDIG and DIGS datasets, and the TD values of the 11th VF test were predicted each time. The predictive accuracy of each method was compared using the root mean squared error (RMSE).
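The OLSLR arm of this comparison can be illustrated with a minimal sketch. It assumes one independent straight-line fit per test point and an RMSE taken across the 52 points of the predicted field. The function names, the six-month test interval, and the synthetic TD values are hypothetical and are not taken from the study data; VBLR itself is not reproduced here.

```python
import numpy as np

def pointwise_ols_predict(times, td, t_future):
    """Fit one least-squares line per test point and extrapolate to t_future.

    times: (n_visits,) test times; td: (n_visits, n_points) TD values in dB.
    """
    times = np.asarray(times, dtype=float)
    td = np.asarray(td, dtype=float)
    design = np.column_stack([np.ones_like(times), times])
    # Solving for all test points at once; coef has shape (2, n_points).
    coef, *_ = np.linalg.lstsq(design, td, rcond=None)
    intercept, slope = coef
    return intercept + slope * t_future

def rmse(predicted, observed):
    """Root mean squared error across test points."""
    diff = np.asarray(predicted, dtype=float) - np.asarray(observed, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

# Synthetic series for VF2..VF11 (assumed six-month spacing, linear decay).
rng = np.random.default_rng(0)
times = 0.5 * np.arange(10)
true_td = -5.0 - 0.5 * times[:, None] * np.ones((1, 52))
observed_td = true_td + rng.normal(0.0, 2.0, size=true_td.shape)

# Predict the last field (VF11) from VF2-4, VF2-5, ..., VF2-10.
rmse_by_series = {}
for n_fields in range(3, 10):
    pred = pointwise_ols_predict(times[:n_fields], observed_td[:n_fields], times[-1])
    rmse_by_series[f"VF2-{n_fields + 1}"] = rmse(pred, observed_td[-1])
```

With only three fields per fit, each pointwise slope is estimated from very little data, which is one reason an unregularized fit of this kind can extrapolate poorly for short series.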

**Results**:
OLSLR RMSEs with the JAMDIG and DIGS datasets were between 31 and 4.3 dB, and between 19.5 and 3.9 dB, respectively. In contrast, VBLR RMSEs with the JAMDIG and DIGS datasets were between 5.0 and 3.7 dB, and between 4.6 and 3.6 dB, respectively. There was a statistically significant difference between VBLR and OLSLR for both datasets at every series (VF2–4 to VF2–10) (*P* < 0.01 for all tests). However, there was no statistically significant difference in VBLR RMSEs between the JAMDIG and DIGS datasets at any series of VFs (VF2–2 to VF2–10) (*P* > 0.05).

**Conclusions**:
VBLR outperformed OLSLR in predicting future VF progression and has the potential to be a helpful tool in clinical settings.

^{1}for predicting future visual fields (VFs) in glaucomatous patients. Estimating the rate of VF progression is important in clinical settings because glaucomatous VF defects are progressive and irreversible. Accurate prediction of future VF decay would therefore contribute to appropriate medical or surgical intervention. Given that glaucoma is the second leading cause of blindness in the world

^{2}and that it can impair quality of life, improving this prediction is worthwhile.

^{3}excluding the data from the University of Tokyo Hospital, in order to compute the prediction accuracy. Next, we also applied it to the Diagnostic Innovations in Glaucoma Study (DIGS) dataset,

^{4}which was thought to be more challenging. The patients recruited at the University of Tokyo Hospital comprised Asians, especially Japanese, and consequently, normal tension glaucoma (NTG) was prevalent.

^{5}In contrast, the DIGS dataset consists of glaucoma patients of European and African descent, most of whom have primary open angle glaucoma with elevated intraocular pressure (IOP), with few NTG patients. Therefore, by comparing the results of the previous study with those from the JAMDIG and DIGS datasets, we could show whether, and to what degree, the model generalizes.

^{6,7}Reliability criteria applied for training data were: fixation losses (FL) ≤ 33%, false-positive responses (FP) ≤ 33%, and false-negative rate (FN) ≤ 33%.

^{4}In brief, for DIGS glaucoma subjects recruited at the University of California San Diego Shiley Eye Institute, inclusion criteria were 20/40 or better best-corrected visual acuity, spherical refraction within ±5.0 diopters (D), cylinder correction within ±3.0 D, open angles on gonioscopy, and at least two consecutive and reliable standard automated perimetry (SAP) examinations with either a pattern standard deviation (PSD) or a glaucoma hemifield test (GHT) result outside the 99% normal limits. Exclusion criteria were eyes with coexisting retinal disease and eyes with nonglaucomatous optic neuropathy. This dataset originally had 3583 eyes of 1913 patients, and the same criteria as for the JAMDIG dataset were applied: (1) each patient had at least 11 VF measurements with 24-2 or 30-2 HFA II; (2) patients' first VFs were excluded; and (3) VFs with FL ≥ 20% and FP ≥ 15% were excluded.

^{1}In brief, let

*n*th VF in their series;

*m*th eye,

*m*th eye (where the first half and latter half of this vector include the intercept and slope coefficients of all 52 test VF points, respectively). Next, let

*n*th data,

*m*th eye. Less strict criteria (33% FL and FP) were employed for training data to increase the size of the dataset and to better represent what happens in clinical practice, and

^{8}

*t*-test were performed on DIGS and JAMDIG results, respectively. There was a statistically significant difference between VBLR and OLSLR for both datasets at every series (VF2–4 to VF2–10) except for VF2–10 in JAMDIG (*P* < 2.2e-16, < 2.2e-16, < 2.2e-16, 8.0e-11, 1.1e-5, 0.02, 0.47 for JAMDIG, and *P* < 6.5e-11, 2.6e-10, 2.3e-10, < 2.2e-16, < 2.2e-16, < 2.2e-16, and 5.4e-16 for DIGS). However, there was no statistically significant difference in prediction performance of VBLR between JAMDIG and DIGS data at any series of VFs (VF2–2 to VF2–10).

*P* < 0.05), while correlations did not reach statistical significance for any series of JAMDIG (*P* > 0.05). Likewise, Figure 7 shows the relationship between initial mTD (the second VFs) and RMSEs, which represents the association between severity of glaucoma and prediction performance; in all series of VFs, there were statistically significant correlations (*P* < 0.05). However, as Figure 8 shows, there was no significant correlation between initial mTD and the mean of raw error values, which represents the discrepancy between real VFs and prediction (*P* > 0.05), except for VF2–2 and VF2–3 in DIGS (*P* = 0.01 and 0.02). Furthermore, the regression lines between raw prediction error and initial mTD with VF2–2 and VF2–3 were near horizontal.
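The distinction drawn here, between a correlation that is detectable and a regression line that is nonetheless near horizontal, can be illustrated with a short sketch. It assumes a Pearson correlation and an ordinary least-squares line; the type of correlation actually used is not stated in this passage. The helper name and the synthetic values are hypothetical and do not reproduce the study data.

```python
import numpy as np

def slope_and_correlation(x, y):
    """Return the least-squares slope, intercept, and Pearson r of y on x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    r = float(np.corrcoef(x, y)[0, 1])
    return float(slope), float(intercept), r

# Synthetic example: initial mTD (dB) against mean raw prediction error (dB).
rng = np.random.default_rng(1)
initial_mtd = np.linspace(-20.0, 0.0, 50)
mean_raw_error = 0.02 * initial_mtd + rng.normal(0.0, 0.05, size=50)

slope, intercept, r = slope_and_correlation(initial_mtd, mean_raw_error)
# The correlation is strong here, yet the fitted slope is only a few
# hundredths of a dB of error per dB of mTD, i.e. a near-horizontal line.
```

In such a case the association is consistent across eyes but the change in error across the whole severity range is small, which is the sense in which a significant correlation can have little practical effect.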

^{9}however, the merit would be only marginal, if any, because our previous study showed linear regression models outperformed nonlinear models, in terms of prediction accuracy.

^{10}It should be noted that the statistical significance of progression cannot be calculated with nonlinear regression, which limits the clinical usefulness of nonlinear regressions.

^{11}

^{12–14}In both the previous study and this one, VBLR was trained only with data from the University of Tokyo Hospital, which means that the data came mostly from Asian patients and the prevalent type of glaucoma was NTG,

^{5}nonetheless, the diagnostic/predictive performance in the external DIGS dataset, obtained in the United States, was at least no worse than that in the JAMDIG dataset collected in Japan.

^{15}

^{16}However, in our recent study with the JAMDIG data, it was indicated that mean IOP was not associated with progression of VF damage,

^{17}probably because most of the patients in the JAMDIG dataset had already undergone medical intervention and the mean IOP was within an appropriate and tight range. Indeed, we have recently proposed a novel method of regressing VF against IOP-integrated time, instead of time, using the JAMDIG data.

^{18}As a result, a significant improvement in prediction accuracy was observed, but the magnitude of the improvement was small and its impact in real clinical settings was almost negligible. Thus, improving VF progression prediction by applying VBLR, although it cannot reflect IOP status, will be a clinically useful approach when assessing VF progression in glaucoma patients.
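The idea of regressing against IOP-integrated time rather than time can be sketched as follows. The exact definition used in the cited work is not given in this passage, so the cumulative trapezoidal integral of IOP over follow-up time is an assumption, as are the function name and the example values.

```python
import numpy as np

def iop_integrated_time(times, iop):
    """Cumulative trapezoidal integral of IOP over time (assumed definition).

    times: (n_visits,) in years; iop: (n_visits,) in mmHg.
    Returns an (n_visits,) regressor in mmHg * years, starting at 0.
    """
    times = np.asarray(times, dtype=float)
    iop = np.asarray(iop, dtype=float)
    steps = 0.5 * (iop[1:] + iop[:-1]) * np.diff(times)
    return np.concatenate([[0.0], np.cumsum(steps)])

# Hypothetical follow-up of one eye: visit times, IOP, and mean TD.
times = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
iop = np.array([18.0, 16.0, 15.0, 14.0, 14.0])
mtd = np.array([-4.0, -4.4, -4.7, -5.0, -5.2])

# Same straight-line fit as before, with the regressor swapped.
x = iop_integrated_time(times, iop)
slope_per_mmhg_year, intercept = np.polyfit(x, mtd, 1)
```

When IOP is constant over follow-up, this regressor is simply a rescaled time axis, so the fitted line and its predictions coincide with those of an ordinary regression against time. This is consistent with the small gain reported for a cohort whose IOP lay within a tight range.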

^{19}

**H. Murata**, None;

**L.M. Zangwill**, None;

**Y. Fujino**, None;

**M. Matsuura**, None;

**A. Miki**, None;

**K. Hirasawa**, None;

**M. Tanito**, None;

**S. Mizoue**, None;

**K. Mori**, None;

**K. Suzuki**, None;

**T. Yamashita**, None;

**K. Kashiwagi**, None;

**N. Shoji**, None;

**R. Asaoka**, None

*Invest Ophthalmol Vis Sci*. 2014; 55: 8386–8392.

*Br J Ophthalmol*. 2006; 90: 262–267.

*Invest Ophthalmol Vis Sci*. 2016; 57: 2012–2020.

*Arch Ophthalmol*. 2009; 127: 1136–1145.

*Ophthalmology*. 2004; 111: 1641–1648.

*Ophthalmology*. 1990; 97: 44–48.

*Arch Ophthalmol*. 1996; 114: 19–22.

*An Open Source C++ Linear Algebra Library for Fast Prototyping and Computationally Intensive Experiments*. Sydney, Australia: National ICT Australia (NICTA); 2010.

*Invest Ophthalmol Vis Sci*. 2011; 52: 4765–4773.

*Invest Ophthalmol Vis Sci*. 2015; 56: 4076–4082.

*PLoS One*. 2014; 9: e85654.

*Am J Ophthalmol*. 1989; 108: 636–642.

*Am J Ophthalmol*. 1986; 102: 402–404.

*Am J Ophthalmol*. 1987; 104: 577–580.

*Invest Ophthalmol Vis Sci*. 2000; 41: 417–421.

*Lancet*. 2015; 385: 1295–1304.

*Invest Ophthalmol Vis Sci*. 2016; 57: 2012–2020.

*Sci Rep*. 2016; 6: 31728.

*Br J Ophthalmol*. 1996; 80: 40–48.