TY - JOUR
T1 - Cultural relic image restoration using two-stage transformer-CNN framework
AU - Wu, Xing
AU - Gao, Deyu
AU - Li, Zhi
AU - Yao, Junfeng
AU - Qian, Quan
AU - Song, Jun
N1 - This work is supported by the National Natural Science Foundation of China (No. 62172267), the State Key Program of National Natural Science Foundation of China (Grant No. 61936001), the Project of Key Laboratory of Silicate Cultural Relics Conservation (Shanghai University), Ministry of Education (No. SCRC2023ZZ02ZD). Publisher Copyright: © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2025
PY - 2026/1
Y1 - 2026/1
N2 - Cultural relic image restoration presents unique challenges due to irregular damage and historically specific textures, which standard deep learning methods struggle to address. This paper proposes a novel two-stage Transformer-CNN framework tailored for this task. The first stage leverages a Transformer to capture global structural dependencies from low-resolution priors, generating coherent coarse proposals. The second stage employs a specialized CNN to refine fine-grained textures from these proposals, optimized by a compound perceptual loss function. Validated on a new large-scale dataset of 88,000 East Asian cultural relic images, our approach demonstrates state-of-the-art performance. A key contribution is the generation of diversified restoration outputs, providing conservators with multiple valid references for decision-making. This work establishes an effective paradigm for digital heritage conservation that balances global structural integrity with local texture fidelity.
KW - Convolutional neural networks
KW - Cultural relic restoration
KW - Diversified output
KW - Transformer architecture
UR - https://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=hkbuirimsintegration2023&SrcAuth=WosAPI&KeyUT=WOS:001649976100006&DestLinkType=FullRecord&DestApp=WOS_CPL
UR - https://link.springer.com/article/10.1007/s10489-025-07058-0
UR - https://www.scopus.com/pages/publications/105026321116
U2 - 10.1007/s10489-025-07058-0
DO - 10.1007/s10489-025-07058-0
M3 - Journal article
SN - 0924-669X
VL - 56
JO - Applied Intelligence
JF - Applied Intelligence
IS - 1
M1 - 19
ER -