Vehicle-side Target Detection Based on Improved YOLOv5

Fund Project:

Guizhou Provincial Science and Technology Major Special Project (Qiankehe ZNWLQC[2019]3012-1); Guizhou Provincial Science and Technology Support Program (Qiankehe [2021] General 297)

    Abstract:

    In autonomous driving scenarios, YOLOv5 performs markedly better than earlier YOLO versions on target detection, but its detection accuracy remains insufficient at high running speeds. This study proposes a vehicle-side target detection method based on an improved YOLOv5. Adaptive anchor box calculation is introduced so that initial anchor box sizes no longer have to be designed by hand for each training dataset. A squeeze-and-excitation (SE) module is added to the backbone network to select channel-wise feature information and strengthen feature representation. To improve accuracy on objects of different sizes, an attention mechanism is fused with the detection network: the convolutional block attention module (CBAM) is integrated into the Neck so that the model attends to the important features when detecting objects of different sizes, which improves feature extraction. A spatial pyramid pooling (SPP) module is used in the backbone network so that the model can accept input images of arbitrary aspect ratio and size. The Hardswish activation function is applied after the convolution operations throughout the network. CIoU is used as the bounding box regression loss to address low localization accuracy and slow bounding box regression during training. Experiments on the KITTI 2D dataset show that the improved model increases precision by 2.5%, recall by 5.1%, and mean average precision (mAP) by 2.3%.
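The adaptive anchor box calculation named in the abstract can be illustrated with a small clustering sketch. The abstract does not describe the procedure, so the version below is an assumption: plain k-means over ground-truth box widths and heights, using IoU as the similarity measure. The function names `wh_iou` and `kmeans_anchors` and the deterministic initialization are hypothetical, and the paper's actual method may differ (for example by adding a refinement step).

```python
def wh_iou(a, b):
    """IoU of two boxes given only as (width, height), aligned at a common corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)


def kmeans_anchors(boxes, k, iters=50):
    """Cluster (width, height) pairs into k anchors; assumes len(boxes) >= k."""
    boxes = sorted(boxes, key=lambda wh: wh[0] * wh[1])
    # Deterministic start (an illustrative choice): evenly spaced area-sorted boxes.
    anchors = [boxes[(2 * i + 1) * len(boxes) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            # Assign each box to the anchor it overlaps most.
            best = max(range(k), key=lambda j: wh_iou(box, anchors[j]))
            clusters[best].append(box)
        # Move each anchor to the mean of its cluster; keep it if the cluster is empty.
        new = [
            (sum(wh[0] for wh in c) / len(c), sum(wh[1] for wh in c) / len(c))
            if c else anchors[j]
            for j, c in enumerate(clusters)
        ]
        if new == anchors:
            break
        anchors = new
    return sorted(anchors, key=lambda wh: wh[0] * wh[1])


# Made-up label sizes: three small square boxes and three large wide boxes.
labels = [(10, 10), (12, 12), (11, 11), (100, 50), (110, 55), (105, 52)]
anchors = kmeans_anchors(labels, k=2)
```

With these made-up labels the two anchors settle on the mean of the small boxes and the mean of the large boxes, which is the behavior that removes the need to hand-pick anchor sizes per dataset.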
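To make the squeeze-and-excitation idea concrete, the following is a minimal pure-Python sketch of the standard SE computation: global average pooling per channel (squeeze), two fully connected layers with ReLU and sigmoid (excitation), then channel-wise rescaling. The function name `se_block`, the weight layout, and the omission of biases are assumptions made for illustration; the abstract does not state where in the backbone the module is placed or which reduction ratio is used.

```python
import math


def se_block(feature_map, w1, w2):
    """Apply SE-style channel reweighting.

    feature_map: list of C channels, each an H x W nested list of floats.
    w1: hidden x C weight rows (reduction layer); w2: C x hidden weight rows.
    Biases are omitted to keep the sketch short.
    """
    # Squeeze: one global average per channel.
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature_map]
    # Excitation: fully connected -> ReLU -> fully connected -> sigmoid.
    hidden = [max(0.0, sum(w * zi for w, zi in zip(row, z))) for row in w1]
    scale = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Rescale every value of a channel by that channel's weight in (0, 1).
    out = [[[v * s for v in row] for row in ch] for ch, s in zip(feature_map, scale)]
    return out, scale


# Two 2x2 channels with hand-picked weights (hidden size 1, i.e. reduction ratio 2).
fm = [[[1.0, 1.0], [1.0, 1.0]], [[3.0, 3.0], [3.0, 3.0]]]
out, scale = se_block(fm, w1=[[1.0, 0.0]], w2=[[0.0], [1.0]])
```

Each channel ends up multiplied by a single learned weight between 0 and 1, which is how the module emphasizes informative channels and suppresses less useful ones.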
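The SPP module can be sketched as parallel max pooling at several window sizes whose outputs are concatenated with the input along the channel axis. The sketch below assumes stride 1 with "same" padding and kernel sizes 5, 9, and 13, as commonly used in public YOLOv5 code; those sizes, the function names, and the omission of the surrounding convolutions are assumptions, since the abstract gives no configuration.

```python
def max_pool_same(channel, k):
    """Stride-1 max pooling with a k x k window; out-of-range cells are ignored,
    so the output keeps the input's height and width."""
    h, w = len(channel), len(channel[0])
    r = k // 2
    return [
        [
            max(
                channel[yy][xx]
                for yy in range(max(0, y - r), min(h, y + r + 1))
                for xx in range(max(0, x - r), min(w, x + r + 1))
            )
            for x in range(w)
        ]
        for y in range(h)
    ]


def spp(feature_map, kernels=(5, 9, 13)):
    """Concatenate the input channels with one pooled copy per kernel size."""
    out = list(feature_map)
    for k in kernels:
        out += [max_pool_same(ch, k) for ch in feature_map]
    return out


# A single 3x3 channel: every pooling window covers the whole map.
fm = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]]]
pooled = spp(fm)
```

Because each branch preserves spatial size, the output has four times as many channels as the input at the same resolution, regardless of the input's height and width.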
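For reference, Hardswish is the piecewise-linear approximation of the Swish activation, defined as x · ReLU6(x + 3) / 6. A minimal scalar sketch (the function name `hardswish` is chosen here for illustration):

```python
def hardswish(x):
    """Hardswish: x * ReLU6(x + 3) / 6, where ReLU6 clamps its input to [0, 6]."""
    return x * min(max(x + 3.0, 0.0), 6.0) / 6.0


# Apply it elementwise to a few sample pre-activations.
samples = [-4.0, -3.0, 0.0, 1.0, 3.0, 5.0]
activated = [hardswish(v) for v in samples]
```

The function is zero for inputs at or below -3, equals the identity for inputs at or above 3, and interpolates smoothly in between without evaluating an exponential.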
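The CIoU regression loss can be written out for a single pair of boxes. The sketch below follows the commonly cited definition, loss = 1 - IoU + ρ²/c² + αv, where ρ is the distance between box centers, c is the diagonal of the smallest enclosing box, and v measures aspect-ratio mismatch. The function name `ciou_loss`, the corner-coordinate box format, and the `eps` stabilizer are assumptions for illustration, not details taken from the paper.

```python
import math


def ciou_loss(box_pred, box_gt, eps=1e-9):
    """Return 1 - CIoU for two boxes given as (x1, y1, x2, y2)."""
    px1, py1, px2, py2 = box_pred
    gx1, gy1, gx2, gy2 = box_gt
    pw, ph = px2 - px1, py2 - py1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Plain intersection over union.
    iw = max(0.0, min(px2, gx2) - max(px1, gx1))
    ih = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = iw * ih
    iou = inter / (pw * ph + gw * gh - inter + eps)

    # Squared center distance relative to the squared enclosing-box diagonal.
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)
    c2 = cw * cw + ch * ch + eps
    rho2 = ((px1 + px2 - gx1 - gx2) ** 2 + (py1 + py2 - gy1 - gy2) ** 2) / 4.0

    # Aspect-ratio consistency term v and its trade-off weight alpha.
    v = (4.0 / math.pi ** 2) * (
        math.atan(gw / (gh + eps)) - math.atan(pw / (ph + eps))
    ) ** 2
    alpha = v / (1.0 - iou + v + eps)

    return 1.0 - (iou - rho2 / c2 - alpha * v)


# Identical boxes give a loss near 0; disjoint boxes still give a distance signal.
same = ciou_loss((0, 0, 10, 10), (0, 0, 10, 10))
apart = ciou_loss((0, 0, 1, 1), (2, 2, 3, 3))
```

Unlike a plain IoU loss, which is constant for any pair of non-overlapping boxes, the center-distance term keeps shrinking as the predicted box moves toward the target, which is consistent with the faster regression the abstract attributes to CIoU.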

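Cite this article

Li Guopu, Chen Shengdong, Wang Liang, Zou Kai, Yuan Feng. Vehicle-side Target Detection Based on Improved YOLOv5. Computer Systems & Applications, 2022, 31(12): 127-134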

History
  • Received: 2022-01-24
  • Last revised: 2022-02-22
  • Published online: 2022-09-14