Abstract
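With the spread of smartphone photography, acquiring image data has become extremely convenient. However, when shooting through transparent media such as glass windows, glass reflections severely degrade image quality and in turn impair the performance of downstream computer vision tasks. Reflection removal, an important research problem in computational photography and computer vision, aims to remove reflection interference from reflection-contaminated images and recover a clear background image. With the wide application of deep learning to computational photography problems, the field of reflection removal has developed rapidly; accordingly, this paper examines in depth the recent progress in deep learning-based reflection removal. First, starting from the image formation model of mixture images, we analyze how the properties of the glass and of the camera affect the reflection and background images. Second, from the perspective of input images, we compile the existing real-world reflection removal datasets in detail and provide a statistical analysis of attributes such as application scenario, specific purpose, data scale, and resolution. Next, from the perspective of deep learning models, we systematically compare the design paradigms, loss functions, and evaluation metrics of reflection removal networks. In addition, according to the layer-separation cues and auxiliary information on which the methods rely, we group existing methods into four categories, based on image features, text features, geometric characteristics, and lighting characteristics, and describe and analyze each concisely. Finally, we summarize the field and look ahead by discussing its key unresolved challenges. This paper aims to offer a systematic research perspective on the reflection removal problem, help researchers build a deep understanding of reflection removal techniques, and provide a valuable reference for future research.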
The widespread adoption of smartphone photography has simplified image acquisition, leading to the generation of massive amounts of image data for training intelligent visual perception models. However, several image degradation issues hinder the use of captured image data, one of which is glass reflection. When people take photos through transparent materials such as glass windows, reflections can severely degrade the quality of the captured images and interfere with downstream computer vision tasks. Reflection removal aims to separate the scene components located on either side of the glass in a reflection-contaminated mixture image, eliminating the glass reflections to obtain a clear transmission image. As an attractive topic in computational photography and computer vision, reflection removal has attracted considerable attention from researchers and has developed rapidly with the extensive application of deep learning to computational photography problems. This study comprehensively reviews recent advances in deep learning-based reflection removal. First, we analyze the image formation model of mixture images and then examine how glass material and camera characteristics affect the properties of reflection and transmission images, including the refraction, absorption, and reflection effects of glass and the image blur caused by camera depth of field. Second, from the perspective of input images, we summarize publicly available reflection removal datasets and statistically analyze their application scenarios, specific purposes, data scale, and resolution. Synthetic data generated from theoretical imaging models are important for building large-scale training datasets, but their distribution still differs from that of real captured images. Therefore, a key strategy for improving model performance is to include a portion of real data during training. 
Benchmark datasets built from real data are also needed to comprehensively evaluate reflection removal algorithms in real-world settings. Third, from the viewpoint of deep learning models, we systematically compare the network designs, loss functions, and evaluation metrics of reflection removal networks. For network design, researchers have primarily employed three types of structures, namely direct, cascaded, and concurrent, which differ in their strategy for predicting the transmission and reflection images. As for loss functions, early methods mainly use pixel loss and edge loss. More sophisticated loss functions were subsequently introduced to constrain the perceptual quality of the predicted images and the correlation between the transmission and reflection images, guiding the optimization of network models toward more realistic and higher-quality restoration. To evaluate the quality of reflection removal results, the similarity between the predicted and reference images, measured across various statistical characteristics, is used as the metric. In addition, based on the auxiliary information that reflection removal methods employ, we propose a systematic taxonomy that categorizes existing approaches into four types: image feature-based, text feature-based, geometry characteristics-based, and light characteristics-based. The rapid development of deep learning has enabled reflection removal methods to use deep neural networks that learn low- or high-level image features from large training datasets, thereby improving reflection removal performance. However, because the problem is inherently ill-posed, incorporating additional auxiliary information becomes crucial when dealing with complex reflection scenarios. 
Methods based on geometry characteristics use panoramic cameras or capture multiple images from different camera positions to obtain additional views of the scene, which provide auxiliary contextual information. Methods based on light characteristics exploit differences between the light paths of rays from the transmission and reflection scenes, such as variations in lighting conditions or polarization characteristics, to provide key cues for effective reflection removal. With the emergence of multimodal large language models, methods based on text features introduce natural language descriptions that complement the image modality and provide semantic guidance for reflection removal. Consequently, state-of-the-art results are obtained without additional hardware support. Finally, by discussing key unresolved challenges in the field, we offer a summary and outlook for reflection removal research. This survey provides a systematic review of recent advances in deep learning for reflection removal, helping researchers develop a thorough understanding of reflection removal techniques and facilitating future research.
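
As a rough illustration of three ideas the abstract mentions (synthetic mixture images generated from an imaging model, the early pixel and edge losses, and similarity-based evaluation), the following NumPy sketch may help. It is not taken from the survey. The linear blend with weight `alpha`, the box blur standing in for depth-of-field blur, the use of L1 distances, and PSNR as the example metric are all assumptions made here for illustration, and every function name is hypothetical.

```python
import numpy as np


def box_blur(img, k=5):
    """Mean-filter a 2-D image with a k x k window (assumed stand-in for defocus blur; k odd)."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros(img.shape, dtype=np.float64)
    h, w = img.shape
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)


def synthesize_mixture(transmission, reflection, alpha=0.8, blur_size=5):
    """Assumed linear model: I = alpha * T + (1 - alpha) * blur(R), clipped to [0, 1]."""
    blurred = box_blur(reflection, blur_size)
    mixture = alpha * transmission + (1.0 - alpha) * blurred
    return np.clip(mixture, 0.0, 1.0)


def pixel_loss(pred, target):
    """Mean absolute difference between pixel intensities (L1 is an assumption)."""
    return float(np.mean(np.abs(pred - target)))


def edge_loss(pred, target):
    """Mean absolute difference between horizontal and vertical finite-difference gradients."""
    loss_x = np.mean(np.abs(np.diff(pred, axis=1) - np.diff(target, axis=1)))
    loss_y = np.mean(np.abs(np.diff(pred, axis=0) - np.diff(target, axis=0)))
    return float(loss_x + loss_y)


def psnr(pred, target, peak=1.0):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    mse = np.mean((pred - target) ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(peak ** 2 / mse))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T = rng.random((32, 32))  # stand-in transmission layer, values in [0, 1)
    R = rng.random((32, 32))  # stand-in reflection layer
    I = synthesize_mixture(T, R)
    # Treat the unprocessed mixture as a naive "prediction" of T.
    print("pixel loss:", pixel_loss(I, T))
    print("edge loss:", edge_loss(I, T))
    print("PSNR (dB):", psnr(I, T))
```

Blurring only the reflection layer reflects the common situation in which the camera focuses on the scene behind the glass; real glass also refracts and absorbs light, which this simple blend ignores.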
| Translated title of the contribution | Deep learning for image reflection removal: a survey |
|---|---|
| Original language | Chinese (Simplified) |
| Pages (from-to) | 2711-2728 |
| Number of pages | 18 |
| Journal | 中国图像图形学报 |
| Volume | 30 |
| Issue number | 8 |
| Early online date | 7 Aug 2025 |
| DOIs | |
| Publication status | Published - 16 Aug 2025 |
User-Defined Keywords
- 卷积神经网络(CNN)
- 反射消除
- 图像复原
- 感知质量
- 扩散模型
- 计算摄像学
- computational photography
- image restoration
- reflection removal
- convolutional neural network (CNN)
- diffusion model
- perceptual quality