• CN: 11-2187/TH
  • ISSN: 0577-6686

Journal of Mechanical Engineering, 2025, Vol. 61, Issue 16: 239-249. doi: 10.3901/JME.2025.16.239


Voxel Feature Attention-based Point Cloud Object Detection Algorithm for Traffic Cone

LIAN Qiuyou1, ZHENG Shaowu1, TU Xinkui1, LI Weihua1,2   

  1. School of Mechanical & Automotive Engineering, South China University of Technology, Guangzhou 510641;
    2. Guangdong Artificial Intelligence and Digital Economy Laboratory, Guangzhou 510335
  • Accepted: 2024-08-25 Online: 2025-03-11 Published: 2025-03-11

Abstract: Traffic cones are important markers of the boundaries of drivable areas on roads, so detecting them accurately and efficiently is essential for the safe navigation of autonomous vehicles. This paper presents AttenPillar, a voxel feature attention-based point cloud object detection algorithm for traffic cones. It addresses two weaknesses of existing methods, weak feature extraction and an inability to focus on key spatial information, which lead to poor robustness and accuracy. PointPillars serves as the baseline model, an encoder-decoder structure is adopted as the backbone network, and the ReLU activation function is replaced by GELU (Gaussian Error Linear Unit). To capture features more effectively, a PillarWise hybrid domain attention module is proposed; it aggregates the features of the points in each voxel and uses a hybrid domain attention mechanism to generate attention tensors. Applying hybrid domain attention operations between these attention tensors and the input and output feature layers of the backbone reduces the loss of spatial geometric information during downsampling and increases the weights of non-empty voxels in the output feature map. The network therefore focuses more on non-empty voxels and extracts point cloud features fully from voxels containing different numbers of points. Finally, the detection head outputs the 3D prediction boxes. A traffic cone point cloud dataset was collected and constructed to verify the algorithm. Compared with the PointPillars baseline, the proposed algorithm improves BEV AP (IoU=0.7) by 10.90% and 3D AP (IoU=0.7) by 14.41%, and it runs at 78.86 FPS on the NVIDIA AGX Xavier embedded computing device, improving cone detection accuracy while maintaining real-time performance.
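
The two ideas named in the abstract, the GELU activation and attention that up-weights non-empty voxels, can be illustrated with a minimal NumPy sketch. This is not the authors' AttenPillar module: the function names, the mean-plus-max pooling, the projection weights `w_channel` and `w_spatial`, and the choice of a channel gate combined with a per-pillar gate as the "hybrid domain" are all assumptions made for illustration, and GELU is computed with its common tanh approximation.

```python
import numpy as np

def gelu(x):
    """Tanh approximation of GELU (assumed here; the exact form uses erf)."""
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def pillar_attention(points, num_points, w_channel, w_spatial):
    """Hypothetical pillar-wise attention, not the paper's implementation.

    points:     (P, N, C) zero-padded point features for P pillars
    num_points: (P,) number of real points in each pillar
    w_channel:  (C, C) illustrative channel-gate projection
    w_spatial:  (C,)   illustrative per-pillar gate projection
    Returns pillar features (P, C) and per-pillar attention (P,).
    """
    P, N, C = points.shape
    # Mark which of the N slots in each pillar hold real points.
    mask = np.arange(N)[None, :] < num_points[:, None]           # (P, N)
    masked = points * mask[:, :, None]

    # Aggregate point features per pillar: masked mean and masked max.
    counts = np.maximum(num_points, 1)[:, None]
    mean_feat = masked.sum(axis=1) / counts                      # (P, C)
    max_feat = np.where(mask[:, :, None], points, -np.inf).max(axis=1)
    max_feat = np.where(np.isfinite(max_feat), max_feat, 0.0)    # empty pillars -> 0
    pooled = mean_feat + max_feat

    # Channel gate and per-pillar gate; empty pillars get zero weight.
    channel_att = sigmoid(gelu(pooled @ w_channel))              # (P, C)
    pillar_att = sigmoid(pooled @ w_spatial) * (num_points > 0)  # (P,)
    return pooled * channel_att * pillar_att[:, None], pillar_att

# Usage: three pillars with 2, 0 and 4 real points, two feature channels.
rng = np.random.default_rng(0)
pts = rng.normal(size=(3, 4, 2))
n_pts = np.array([2, 0, 4])
feat, att = pillar_attention(pts, n_pts, rng.normal(size=(2, 2)), rng.normal(size=2))
```

In this sketch the empty pillar receives an attention weight of exactly zero, so its features contribute nothing downstream, while non-empty pillars keep a weight between 0 and 1 regardless of how many points they contain.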

Key words: traffic cone detection, 3D object detection, attention mechanism, small object detection
