ZANG Haoke, ZHANG Yihan, LI Bohao, SU Hongjun, BAO Fengwei, HAN Shaolong
The work aims to achieve accurate, robust, and real-time detection of barrel packaging text on tobacco slurry preparation production lines under complex interference, including curved-surface deformation, reflection, low contrast, and stain occlusion, thereby providing reliable front-end detection support for automatic packaging information recognition and production traceability. To address the principal-direction mismatch, boundary discontinuity, and missed detections that these conditions cause, a geometry-aware multi-branch feature fusion model named Vim-DFUMNet was proposed. The model was designed around three key requirements: geometric alignment, global modeling, and multi-scale collaborative fusion. Specifically, PRSS was used to alleviate the principal-direction deviation caused by curved-surface projection; P-VimNet was employed to strengthen long-range dependency modeling for curved text; and DFUM was designed to coordinate high-level semantic information with low-level boundary details, thereby improving the continuity of representation, boundary integrity, and detection stability of barrel packaging text in complex industrial scenarios. Comparative experiments, ablation studies, and visualization analyses were conducted on a self-built industrial barrel packaging dataset. The dataset contained 600 original images, was expanded to 1 500 images through data augmentation, and was divided into training, validation, and test sets at a ratio of 6:2:2. On the test set, the proposed method achieved a precision of 95.0%, a recall of 92.4%, an F1-score of 93.7%, and a detection speed of 46 FPS. Compared with the baseline DBNet++, the precision, recall, and F1-score were improved by 7.2%, 9.2%, and 8.3%, respectively. Compared with TextMamba, the F1-score was further improved by 2.2%.
The proposed method effectively improves the geometric alignment capability, boundary integrity, and detection stability of barrel packaging text under complex industrial interference conditions, including curved-surface deformation, reflection-induced boundary discontinuity, low contrast, and local stains. While maintaining real-time performance, it provides technical support for automatic barrel packaging information acquisition, online detection, and production traceability.
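The reported figures can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes the F1-score is the standard harmonic mean of precision and recall, and that the 6:2:2 ratio is applied to the full set of 1 500 augmented images (the text does not say whether the split was made before or after augmentation). The helper names `f1_score` and `split_counts` are hypothetical and not taken from the authors' code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given in the same units)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def split_counts(total: int, ratio: tuple[int, ...]) -> list[int]:
    """Split `total` items by an integer ratio; any remainder goes to the first part."""
    parts = [total * r // sum(ratio) for r in ratio]
    parts[0] += total - sum(parts)
    return parts


# Values quoted in the abstract: precision 95.0%, recall 92.4%.
f1 = f1_score(95.0, 92.4)

# Assumption: the split is applied to all 1 500 augmented images.
train, val, test = split_counts(1500, (6, 2, 2))

print(f"F1 = {f1:.1f}%")
print(train, val, test)
```

Under these assumptions the computed F1 rounds to the reported 93.7%, and the split yields 900 training, 300 validation, and 300 test images.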