Voxel-based 3D Object Detection Network Based on Multi-level Feature Fusion

ZHANG Wu-ran; HU Chun-yan; CHEN Ze-lai; LI Fei-fei

doi:10.19554/j.cnki.1001-3563.2022.15.005

Packaging Engineering, 2022, Issue 15: 42-53. DOI: 10.19554/j.cnki.1001-3563.2022.15.005

Abstract

To accurately estimate the location and class of objects in point cloud scenes, this work proposes a voxel-based 3D object detection network based on multi-level feature fusion. The two-stage Voxel-RCNN was used as the baseline network. In the first stage, a Sparse Feature Residual Dense Fusion Module (SFRDFM) was added to propagate and reuse features level by level from shallow to deep, achieving fully interactive fusion of 3D features. A Residual Light-weight and Efficient Channel Attention (RL-ECA) mechanism was added to the 2D backbone network to explicitly enhance the channel feature representation. A multi-level feature and multi-scale kernel adaptive fusion module was proposed to adaptively extract the weight information of the multi-level features and fuse them in a weighted manner. In the second stage, a Triple Feature Fusion Strategy (TFFS) was designed to aggregate neighborhood features based on a Manhattan distance search algorithm, and a Deep Fusion Module (DFM) and a Coarse to Fine Fusion Module (CTFFM) were embedded to improve the quality of the grid features. The proposed algorithm was evaluated on the KITTI autonomous driving dataset. Compared with the baseline network across the three difficulty levels, the average 3D accuracy for pedestrians in the first-stage detection model improved by 3.97%, and the average 3D accuracy for cyclists in the second-stage detection model improved by 3.37%. The experimental results show that the proposed method effectively improves object detection performance, and that each module is highly portable and can be flexibly embedded into voxel-based 3D detection models to bring corresponding improvements.
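
The abstract mentions aggregating neighborhood features through a Manhattan distance search but does not give the details. As a rough illustration only, the sketch below shows a generic brute-force neighborhood query over voxel indices using Manhattan distance. The names `manhattan_distance`, `voxel_query`, `max_distance`, and `max_neighbors` are hypothetical and not taken from the paper, and the authors' actual implementation may differ.

```python
from typing import List, Sequence, Tuple

# A voxel is identified by its integer (x, y, z) grid index.
Voxel = Tuple[int, int, int]


def manhattan_distance(a: Voxel, b: Voxel) -> int:
    """Sum of absolute index differences along x, y, and z."""
    return sum(abs(p - q) for p, q in zip(a, b))


def voxel_query(
    query: Voxel,
    occupied: Sequence[Voxel],
    max_distance: int,
    max_neighbors: int,
) -> List[int]:
    """Return indices into `occupied` of at most `max_neighbors` voxels whose
    Manhattan distance to `query` is at most `max_distance`, nearest first.

    Hypothetical helper: a brute-force scan, chosen for clarity, not speed.
    """
    candidates = []
    for i, voxel in enumerate(occupied):
        d = manhattan_distance(query, voxel)
        if d <= max_distance:
            candidates.append((d, i))
    candidates.sort()  # by distance; ties are broken by original index
    return [i for _, i in candidates[:max_neighbors]]


# Illustrative data: five non-empty voxels and one query location.
occupied_voxels = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (3, 3, 3), (0, 2, 0)]
neighbors = voxel_query((0, 0, 0), occupied_voxels, max_distance=2, max_neighbors=3)
print(neighbors)  # [0, 1, 2]
```

Manhattan distance on integer voxel indices needs only additions and absolute values, which is one reason it suits grid-structured data. The features of the returned voxels would then be pooled or fused by a later stage.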

Cite this article

ZHANG Wu-ran, HU Chun-yan, CHEN Ze-lai, LI Fei-fei. Voxel-based 3D Object Detection Network Based on Multi-level Feature Fusion[J]. Packaging Engineering, 2022(15): 42-53. https://doi.org/10.19554/j.cnki.1001-3563.2022.15.005