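UAV-based Road and Bridge Inspection Combining Multimodal Fusion and Knowledge Distillation

Fund Project: Youth Project of the Science and Technology Research Program of Chongqing Municipal Education Commission (KJQN202402002); 2024 Scientific Research Project of Chongqing College of International Business and Economics (KYZK2024005); Education and Teaching Reform Research Project of Chongqing College of International Business and Economics (JG2025034)
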
    Abstract:

    To address the low efficiency, high cost, and safety risks of traditional road and bridge inspection, as well as the large parameter counts that make current multimodal detection models hard to deploy in real time on unmanned aerial vehicle (UAV) onboard platforms, this study proposes a multimodal feature fusion model for road and bridge inspection based on cross distillation. The model uses a dual-branch teacher network and a single-branch student network. Feature interaction and collaborative distillation between the teacher networks enable efficient transfer of modality-specific knowledge. In addition, an attention-based dynamic feature fusion module strengthens the model's perception of key features of road and bridge defects. Experiments show that the model maintains a detection accuracy of 89.6% mAP@0.5 while reducing the parameter count to 8.2M and reaching an inference speed of 32.6 f/s, significantly outperforming traditional multimodal fusion and lightweight methods. Compared with strategies such as feature concatenation and fusion after unimodal distillation, it shows clear advantages in both detection accuracy and computational efficiency. Ablation experiments confirm the effectiveness of the cross-distillation mechanism and the attention-based fusion module. The model achieves high-accuracy, lightweight detection of road and bridge defects, providing a technical foundation for engineering applications of UAV-based road and bridge inspection.
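
The abstract does not give the loss formulation, so the following is only a minimal sketch of the general idea of distilling two teacher branches into one student under an attention-style weighting. It assumes a temperature-softened KL distillation loss and a softmax gate over two per-modality relevance scores. All function names, the scores `score_a` and `score_b`, and the temperature value are hypothetical. It omits the feature interaction between teachers that the paper describes and should not be read as the authors' method.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def fuse_teachers(probs_a, probs_b, score_a, score_b):
    """Blend two teacher distributions using softmax gate weights.

    score_a and score_b are hypothetical relevance scores; in a real
    model an attention module would produce them from the features.
    """
    w_a, w_b = softmax([score_a, score_b])
    return [w_a * pa + w_b * pb for pa, pb in zip(probs_a, probs_b)]

def distillation_loss(student_logits, teacher_a_logits, teacher_b_logits,
                      score_a, score_b, temperature=4.0):
    """KL from the fused soft teacher target to the student, scaled by T^2."""
    t = temperature
    target = fuse_teachers(softmax(teacher_a_logits, t),
                           softmax(teacher_b_logits, t),
                           score_a, score_b)
    student = softmax(student_logits, t)
    return (t * t) * kl_divergence(target, student)

# Illustrative call with made-up logits for a three-class defect head.
loss = distillation_loss([1.0, 0.2, -0.5], [2.0, 0.1, -1.0], [1.5, 0.5, -0.8],
                         score_a=0.3, score_b=0.9)
```

In this sketch the gate simply shifts the soft target toward whichever teacher has the higher score; the loss approaches zero when the student's softened distribution matches the fused target.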

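Cite this article

Liang Qiao, Yang Degang, Wang Jie. UAV-based Road and Bridge Inspection Combining Multimodal Fusion and Knowledge Distillation. Computer Systems & Applications, ,():1-10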

History
  • Received: 2025-08-15
  • Last revised: 2025-09-16
  • Published online: 2026-01-08