TY - JOUR
T1 - Multistream Deep Learning Models Using Multimodal Optical Coherence Tomography for Predicting Visual Impairment in Epiretinal Membrane
AU - Yeh, Hsu Hang
AU - Chou, Po Yung
AU - Hsieh, Cheng Chang
AU - Lai, Ying Hui
AU - Hsieh, Yi Ting
AU - Lin, Cheng Hung
N1 - Publisher Copyright: © 2025 Elsevier Inc.
PY - 2026/2
Y1 - 2026/2
N2 - Objective: To develop multistream deep learning models that receive multimodal optical coherence tomography (OCT) images to predict visual impairment in epiretinal membrane (ERM), and to identify possible OCT biomarkers for visual impairment. Methods: Patients who were diagnosed with idiopathic ERM at one medical center were retrospectively enrolled. Eight types of images were collected: horizontal/vertical B-scan OCT, superficial/deep/full-layered en face OCT angiography, and superficial/deep/full-layered retinal thickness maps of the macula. The patients were labeled as either >20/50 (less visual impairment) or ≤20/50 (profound visual impairment) by best-corrected visual acuity. We developed deep learning models combining different inputs using a multistream design for predicting visual impairment. Grad-CAM was utilized for visualizing heatmaps. Prediction accuracy for profound visual impairment was compared among different models. Results: In total, 351 sets of images (horizontal and vertical B-scan OCT; superficial, deep, and full-layered en face OCT angiography; and superficial, deep, and full-layered retinal thickness maps) were included for model development and 50 sets for external validation. The single-stream models achieved accuracies ranging from 79.48% to 88.89% in model development, which decreased to 60.69% to 75.11% in external validation. Increasing the number of input streams to two or three further enhanced predictive performance. Ultimately, the eight-stream model integrating all imaging modalities outperformed all others, attaining 90.90% accuracy in model development and 80.00% in external validation. Heatmaps revealed that the hot spots for model prediction were concentrated on the foveal and parafoveal areas in all types of images, and on the thickened areas and retinal folds in retinal thickness maps. Conclusions: B-scan OCT, en face OCT angiography, and retinal thickness maps of the macula could all be used for predicting visual impairment in ERM via deep learning. The multistream design can enhance predictive accuracy and may provide localization of vital retinal regions relevant to visual compromise in ERM.
UR - https://www.scopus.com/pages/publications/105023383458
UR - https://www.scopus.com/pages/publications/105023383458#tab=citedBy
U2 - 10.1016/j.ajo.2025.10.023
DO - 10.1016/j.ajo.2025.10.023
M3 - Article
AN - SCOPUS:105023383458
SN - 0002-9394
VL - 282
SP - 146
EP - 153
JO - American Journal of Ophthalmology
JF - American Journal of Ophthalmology
ER -