Evaluation of a Deep Learning-Derived Quantitative Retinopathy of Prematurity Severity Scale

Campbell, J Peter, Kim, Sang Jin, Brown, James M , Ostmo, Susan, Chan, R V Paul, Kalpathy-Cramer, Jayashree and Chiang, Michael F (2021) Evaluation of a Deep Learning-Derived Quantitative Retinopathy of Prematurity Severity Scale. Ophthalmology, 128 (7). pp. 1070-1076. ISSN 0161-6420

Full content URL: https://doi.org/10.1016/j.ophtha.2020.10.025

Authors' Accepted Manuscript
Available under License Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International.

Item Type: Article
Item Status: Live Archive


Purpose
To evaluate the clinical usefulness of a quantitative deep learning-derived vascular severity score for retinopathy of prematurity (ROP) by assessing its correlation with clinical ROP diagnosis and by measuring clinician agreement in applying a novel scale.

Design
Analysis of an existing database of posterior pole fundus images and corresponding ophthalmoscopic examinations using 2 methods of assigning a quantitative scale to vascular severity.

Participants
Images were from clinical examinations of patients in the Imaging and Informatics in ROP Consortium. Four ophthalmologists and 1 study coordinator evaluated vascular severity on a scale from 1 to 9.

Methods
A quantitative vascular severity score (1–9) was applied to each image using a deep learning algorithm. A database of 499 images was developed for assessment of interobserver agreement.

Main Outcome Measures
Distribution of deep learning-derived vascular severity scores across the clinical assessments of zone (I, II, or III), stage (0, 1, 2, or 3), and extent of stage 3 (<3 clock hours, 3–6 clock hours, and >6 clock hours), evaluated using multivariate linear regression; and weighted κ values and Pearson correlation coefficients for interobserver agreement on a 1-to-9 vascular severity scale.
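As an illustration of the regression named above, the following is a minimal sketch of an ordinary least squares fit of a severity score on zone, stage, and extent. It is not the authors' analysis code. The ordinal coding of the predictors (zone as 1 to 3, stage as 0 to 3, extent as 0 to 3), the function name `ols`, and the example rows are all assumptions made for illustration; the abstract does not state how the variables were coded.

```python
def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    X is a list of predictor rows without an intercept column.
    Returns [intercept, b1, b2, ...].
    """
    rows = [[1.0] + [float(v) for v in r] for r in X]
    p = len(rows[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        if abs(A[piv][c]) < 1e-12:
            raise ValueError("predictors are collinear")
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]


# Synthetic rows for illustration only, NOT study data.
# Hypothetical coding: (zone 1-3, stage 0-3, extent category 0-3).
exams = [(1, 0, 0), (2, 1, 0), (3, 2, 1), (1, 3, 2), (2, 3, 1), (3, 1, 0)]
scores = [4.0, 4.5, 4.0, 8.5, 6.0, 3.0]
intercept, b_zone, b_stage, b_extent = ols(exams, scores)
```

Under this coding, a negative zone coefficient would correspond to higher scores in more posterior disease (zone I), and positive stage and extent coefficients to higher scores with more advanced disease. Significance testing, which the study reports, is outside the scope of this sketch.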

Results
For deep learning analysis, a total of 6344 clinical examinations were analyzed. A higher deep learning-derived vascular severity score was associated with more posterior disease, higher disease stage, and higher extent of stage 3 disease (P < 0.001 for all). For a given ROP stage, the vascular severity score was higher in zone I than in zones II or III (P < 0.001). Multivariate regression found that zone, stage, and extent were all independently associated with the severity score (P < 0.001 for all). For interobserver agreement, the mean ± standard deviation weighted κ value was 0.67 ± 0.06, and the Pearson correlation coefficient ± standard deviation was 0.88 ± 0.04 on the use of a 1-to-9 vascular severity scale.
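The agreement statistics above can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes that the mean and standard deviation are taken over all pairs of raters and that κ uses linear disagreement weights; the abstract states neither, so `power=2` (quadratic weights) is offered as an alternative. The function names and the example ratings are hypothetical.

```python
from itertools import combinations
from statistics import mean, stdev


def weighted_kappa(a, b, k=9, power=1):
    """Cohen's weighted kappa for two raters scoring on the scale 1..k.

    power=1 gives linear weights, power=2 quadratic weights.
    Undefined (division by zero) if both raters give one constant score.
    """
    n = len(a)
    # Observed joint proportions of (rater A score, rater B score).
    obs = [[0.0] * k for _ in range(k)]
    for x, y in zip(a, b):
        obs[x - 1][y - 1] += 1.0 / n
    row = [sum(r) for r in obs]
    col = [sum(obs[i][j] for i in range(k)) for j in range(k)]
    observed = expected = 0.0
    for i in range(k):
        for j in range(k):
            # Disagreement weight grows with the distance between scores.
            w = (abs(i - j) / (k - 1)) ** power
            observed += w * obs[i][j]
            expected += w * row[i] * col[j]
    return 1.0 - observed / expected


def pearson_r(a, b):
    """Pearson correlation coefficient between two score lists."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5


def pairwise_summary(ratings, metric):
    """Mean and sample SD of a metric over all rater pairs."""
    values = [
        metric(ratings[p], ratings[q])
        for p, q in combinations(sorted(ratings), 2)
    ]
    return mean(values), stdev(values)


# Hypothetical ratings for illustration only, NOT study data.
ratings = {
    "rater_A": [1, 3, 5, 7, 9, 4],
    "rater_B": [2, 3, 5, 6, 9, 4],
    "rater_C": [1, 4, 4, 7, 8, 5],
}
kappa_mean, kappa_sd = pairwise_summary(ratings, weighted_kappa)
r_mean, r_sd = pairwise_summary(ratings, pearson_r)
```

Weighted κ penalizes a 1-versus-9 disagreement more than a 4-versus-5 disagreement, which suits an ordinal scale; Pearson correlation instead measures how well two raters' scores track each other linearly, even if one rater scores systematically higher.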

Conclusions
A vascular severity scale for ROP seems feasible for clinical adoption; corresponds with zone, stage, extent of stage 3, and plus disease; and facilitates the use of objective technology such as deep learning to improve the consistency of ROP diagnosis.

Keywords: deep learning, retinopathy of prematurity, computer vision, artificial intelligence, ophthalmology
Subjects: G Mathematical and Computer Sciences > G700 Artificial Intelligence
G Mathematical and Computer Sciences > G760 Machine Learning
B Subjects allied to Medicine > B500 Ophthalmics
G Mathematical and Computer Sciences > G740 Computer Vision
Divisions: College of Science > School of Computer Science
ID Code: 46552
Deposited On: 19 Oct 2021 10:08
