Abstract
Purpose: We describe an application of automated decision tree induction, a quantitative classification method from machine learning, to the pattern classification of videokeratography data by quantitative analysis of corneal surface features, and we compare this approach with established classification methods. We also compare hold-out and cross validation methods of model error estimation.
Methods: We fit a 7 mm diameter area of corneal surface data with a 7th order Zernike polynomial for 132 normal eyes and 112 eyes diagnosed with keratoconus. We then induced a decision tree classifier using the C4.5 algorithm. Model prediction error for the decision tree classifier was estimated by both ten-fold cross validation and hold-out methods. Using the decision tree classifier as the gold standard, we compared its area under the receiver operating characteristic (ROC) curve with that of five other classification indices: Rabinowitz McDonnell index (RM), Schwiegerling's Z3 index (Z3), Keratoconus Prediction Index (KPI), KISA%, and Cone Location and Magnitude Index (CLMI).
Results: Model prediction error estimates based on ten-fold cross validation were significantly lower than error estimates from the hold-out method (P = .003). The decision tree classifier had a larger area under the ROC curve (0.93) than every other classification method. The difference was significant for KPI (0.76; P < .0001), RM (0.72; P < .0001), and KISA% (0.61; P < .0001), but not for CLMI (0.89; P = .30) or Z3 (0.89; P = .19). Only 5 of the 36 Zernike polynomial coefficients, C(3,-1), C(0,0), C(3,3), C(2,-2), and C(6,-6), were needed to distinguish between normal and keratoconus eyes using our decision tree classification method, with an accuracy of 93%, sensitivity of 90%, and specificity of 96%.
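
The count of 36 coefficients follows from the order of the fit: a Zernike expansion up to radial order N has (N + 1)(N + 2)/2 terms, which gives 36 for N = 7. The short Python sketch below enumerates the index pairs and checks that the five coefficients named above are among them. It assumes that C(n,m) denotes radial order n and angular frequency m, which the abstract does not state explicitly; the function name `zernike_indices` is hypothetical and used only for illustration.

```python
def zernike_indices(max_order):
    """List the (n, m) index pairs of all Zernike terms up to max_order.

    Assumes n is the radial order and m the angular frequency, with
    m running from -n to n in steps of 2.
    """
    return [(n, m)
            for n in range(max_order + 1)
            for m in range(-n, n + 1, 2)]


terms = zernike_indices(7)
print(len(terms))  # 36, matching (7 + 1) * (7 + 2) / 2

# The five coefficients reported as sufficient for classification,
# read here as C(n, m) pairs (an assumption about the notation).
selected = [(3, -1), (0, 0), (3, 3), (2, -2), (6, -6)]
print(all(pair in terms for pair in selected))  # True
```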
Conclusions: Our automated decision tree induction method of corneal shape classification from Zernike polynomials is an accurate, interpretable quantitative approach that can be applied to data from any instrument platform capable of elevation data output. Cross validation is a desirable method of model error estimation. This generic method of pattern classification is extendable to other classification problems.
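
The difference between the two error estimation schemes can be illustrated by how each partitions the 244 eyes (132 normal plus 112 keratoconus). The sketch below is a minimal illustration, not the procedure used in the study: the splits are random and unstratified, the one-third hold-out fraction and the seed are assumptions, and the function names are hypothetical. In ten-fold cross validation every eye is used for testing exactly once, whereas a hold-out estimate rests on a single test partition.

```python
import random


def k_fold_splits(n_samples, k=10, seed=0):
    """Return k (train, test) index pairs; each sample is tested once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal folds
    splits = []
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        splits.append((train, test))
    return splits


def hold_out_split(n_samples, test_fraction=1 / 3, seed=0):
    """Return one (train, test) index pair (test_fraction is assumed)."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


n_eyes = 132 + 112  # sample size reported in the Methods
cv_splits = k_fold_splits(n_eyes, k=10)
fold_sizes = sorted(len(test) for _, test in cv_splits)
print(fold_sizes)  # six folds of 24 eyes and four folds of 25 eyes

train, test = hold_out_split(n_eyes)
print(len(train), len(test))  # 163 81
```

In practice, the classifier would be refit on each training partition and its error averaged over the ten test folds, while the hold-out estimate comes from one fit evaluated on the single test partition.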
Keywords: topography • keratoconus • image processing