Lightweight Image Super-resolution Model Fusing Detail Enhancement with Dynamic Up-sampling

YU Mingwei

doi:10.19554/j.cnki.1001-3563.2026.09.030

Packaging Engineering (Technical Section), 2026, Vol. 47, Issue 9: 286-295. DOI: 10.19554/j.cnki.1001-3563.2026.09.030

Automation and Intelligent Technology

Abstract

To address the high complexity and computational cost common to image super-resolution models, this work aims to improve both reconstruction accuracy and efficiency. An enhanced image super-resolution model, the Enhanced Dual Aggregation Transformer (Enhanced DAT, EDAT), is proposed on the basis of the Dual Aggregation Transformer (DAT); it fuses image detail enhancement with dynamic up-sampling. Edge attention and an image sharpening enhancement module improve the representation of local edges and global details. Dynamic up-sampling replaces traditional up-sampling, significantly reducing model complexity and parameter count while maintaining reconstruction performance. On the 4× image super-resolution task, EDAT outperforms DAT, CAT, Swin2SR, SwinIR, and SRCNN on both Set5 and Set14. Ablation experiments show that, with model complexity reduced by approximately 65%, the reconstruction root mean square error (RMSE) still improves by about 1.5% relative to DAT. Without requiring any additional input, EDAT balances reconstruction accuracy and computational efficiency, making it suitable for applications with limited computing resources and strict real-time requirements.
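The abstract reports an RMSE improvement of about 1.5% relative to DAT without stating how that percentage is computed. The sketch below shows one plausible reading, assuming that "improvement" means a relative reduction in RMSE against a baseline. The function names and the toy pixel values are hypothetical and are not taken from the paper.

```python
import math


def rmse(reference, reconstruction):
    """Root mean square error between two equally sized 2-D images (lists of rows)."""
    if len(reference) != len(reconstruction):
        raise ValueError("images must have the same height")
    total, count = 0.0, 0
    for ref_row, rec_row in zip(reference, reconstruction):
        if len(ref_row) != len(rec_row):
            raise ValueError("images must have the same width")
        for ref_px, rec_px in zip(ref_row, rec_row):
            total += (ref_px - rec_px) ** 2
            count += 1
    return math.sqrt(total / count)


def relative_rmse_reduction(baseline_rmse, candidate_rmse):
    """Fractional RMSE reduction versus a baseline (positive means lower error).

    Assumption: this is how a percentage "improvement" would be derived;
    the source does not define it.
    """
    return (baseline_rmse - candidate_rmse) / baseline_rmse


# Hypothetical toy data for illustration only.
hr = [[10, 20], [30, 40]]            # ground-truth high-resolution patch
baseline_sr = [[12, 18], [32, 38]]   # baseline output, every pixel off by 2
candidate_sr = [[11, 19], [31, 39]]  # candidate output, every pixel off by 1

baseline_err = rmse(hr, baseline_sr)
candidate_err = rmse(hr, candidate_sr)
reduction = relative_rmse_reduction(baseline_err, candidate_err)
```

Under this reading, a value of 0.015 from `relative_rmse_reduction` would correspond to the roughly 1.5% figure quoted for EDAT against DAT.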

Cite this article

YU Mingwei. Lightweight Image Super-resolution Model Fusing Detail Enhancement with Dynamic Up-sampling[J]. Packaging Engineering, 2026, 47(9): 286-295. https://doi.org/10.19554/j.cnki.1001-3563.2026.09.030

CLC number: TS801.3
