TY - JOUR
T1 - Deep learning versus ophthalmologists for screening for glaucoma on fundus examination
T2 - A systematic review and meta-analysis
AU - Buisson, Mathieu
AU - Navel, Valentin
AU - Labbé, Antoine
AU - Watson, Stephanie L.
AU - Baker, Julien S.
AU - Murtagh, Patrick
AU - Chiambaretta, Frédéric
AU - Dutheil, Frédéric
N1 - Publisher Copyright:
© 2021 Royal Australian and New Zealand College of Ophthalmologists
PY - 2021/12
Y1 - 2021/12
N2 - Background: In this systematic review and meta-analysis, we aimed to compare deep learning versus ophthalmologists in glaucoma diagnosis on fundus examinations. Method: PubMed, Cochrane, Embase, ClinicalTrials.gov and ScienceDirect databases were searched until 10 December 2020 for studies comparing the glaucoma diagnosis performance of deep learning and ophthalmologists on fundus examinations using the same datasets. Studies had to report an area under the receiver operating characteristic curve (AUC) with SD, or enough data to generate one. Results: We included six studies in our meta-analysis. There was no difference in AUC between ophthalmologists (AUC = 82.0, 95% confidence interval [CI] 65.4–98.6) and deep learning (97.0, 89.4–104.5). There was also no difference using several pessimistic and optimistic variants of our meta-analysis: the best (82.2, 60.0–104.3) or worst (77.7, 53.1–102.3) ophthalmologists versus the best (97.1, 89.5–104.7) or worst (97.1, 88.5–105.6) deep learning of each study. We did not identify any factors influencing these results. Conclusion: Deep learning had performance similar to that of ophthalmologists in glaucoma diagnosis from fundus examinations. Further studies should evaluate deep learning in clinical situations.
AB - Background: In this systematic review and meta-analysis, we aimed to compare deep learning versus ophthalmologists in glaucoma diagnosis on fundus examinations. Method: PubMed, Cochrane, Embase, ClinicalTrials.gov and ScienceDirect databases were searched until 10 December 2020 for studies comparing the glaucoma diagnosis performance of deep learning and ophthalmologists on fundus examinations using the same datasets. Studies had to report an area under the receiver operating characteristic curve (AUC) with SD, or enough data to generate one. Results: We included six studies in our meta-analysis. There was no difference in AUC between ophthalmologists (AUC = 82.0, 95% confidence interval [CI] 65.4–98.6) and deep learning (97.0, 89.4–104.5). There was also no difference using several pessimistic and optimistic variants of our meta-analysis: the best (82.2, 60.0–104.3) or worst (77.7, 53.1–102.3) ophthalmologists versus the best (97.1, 89.5–104.7) or worst (97.1, 88.5–105.6) deep learning of each study. We did not identify any factors influencing these results. Conclusion: Deep learning had performance similar to that of ophthalmologists in glaucoma diagnosis from fundus examinations. Further studies should evaluate deep learning in clinical situations.
KW - artificial intelligence
KW - deep learning
KW - glaucoma
KW - machine learning
KW - screening
UR - http://www.scopus.com/inward/record.url?scp=85115260205&partnerID=8YFLogxK
U2 - 10.1111/ceo.14000
DO - 10.1111/ceo.14000
M3 - Journal article
C2 - 34506041
AN - SCOPUS:85115260205
SN - 1442-6404
VL - 49
SP - 1027
EP - 1038
JO - Clinical and Experimental Ophthalmology
JF - Clinical and Experimental Ophthalmology
IS - 9
ER -