IHRAG: Iterative Hybrid Retrieval-augmented Generation for Large Language Model

Fund Project: Wuhan Key Research and Development Program (2022012202015070); Wuhan East Lake High-tech Development Zone “Jiebang Guashuai” (open competition) Project (2022KJB126)

    Abstract:

    In the medical field, retrieval-augmented generation (RAG) has been proposed to mitigate hallucinations in large language models (LLMs) and to improve interpretability and controllability. However, existing techniques recall low-frequency entities poorly and struggle with ambiguous, verbose, or polysemous queries. This study therefore proposes an iterative hybrid retrieval-augmented generation (IHRAG) approach for LLMs that improves intent parsing for complex questions and strengthens knowledge mining, so that LLMs generate more accurate answers. IHRAG uses a dynamic routing mechanism to coordinate the semantic generalization capability of vector retrieval with the structured reasoning capability of knowledge graphs. A medical ontology-driven query decomposition algorithm breaks complex clinical questions down into retrievable atomic sub-questions. In addition, a knowledge gap-aware neuro-symbolic expansion model and a “retrieve-verify-iterate” closed-loop optimization mechanism establish a progressive discovery process that advances from surface-level information extraction to deep knowledge mining. Experiments show that IHRAG significantly improves the performance of base models of different scales, such as Qwen and DeepSeek, raising accuracy by up to 11.12 percentage points and the high-quality response rate by up to 17 percentage points.
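
The control flow described in the abstract (decompose the question, route each sub-question to a retriever, verify the evidence, and iterate when a gap remains) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the in-memory `DOCS` and `TRIPLES` data, the cue-word router, the keyword-overlap search that stands in for vector retrieval, and the fallback to the other retriever that stands in for knowledge-gap-aware expansion are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical toy corpus and knowledge graph (not data from the paper).
DOCS: Dict[str, str] = {
    "d1": "metformin is a first-line drug for type 2 diabetes",
    "d2": "lactic acidosis presents with nausea and rapid breathing",
}
TRIPLES: List[Tuple[str, str, str]] = [
    ("metformin", "may_cause", "lactic acidosis"),
    ("metformin", "contraindicated_in", "severe renal impairment"),
]
# Assumed cue words that mark a relation-style question.
RELATION_CUES = ("cause", "contraindicat", "interact")


def decompose(question: str) -> List[str]:
    """Naive stand-in for ontology-driven decomposition: split on ' and '."""
    parts = (p.strip(" ?") for p in question.lower().split(" and "))
    return [p for p in parts if p]


def route(sub_question: str) -> str:
    """Send relation-style questions to the graph, the rest to the text index."""
    return "kg" if any(cue in sub_question for cue in RELATION_CUES) else "vector"


def vector_search(query: str) -> List[str]:
    """Keyword overlap used here as a stand-in for embedding similarity."""
    terms = set(query.split())
    return [t for t in DOCS.values() if len(terms & set(t.split())) >= 2]


def kg_search(query: str) -> List[str]:
    """Return triples whose head entity is mentioned in the query."""
    return [f"{h} {r} {t}" for h, r, t in TRIPLES if h in query]


def verify(evidence: List[str]) -> bool:
    """Toy verification step: accept as soon as any evidence is found."""
    return bool(evidence)


def retrieve_iteratively(question: str, max_rounds: int = 2) -> Dict[str, List[str]]:
    """Run the retrieve-verify-iterate loop for every sub-question."""
    results: Dict[str, List[str]] = {}
    for sub_q in decompose(question):
        source, evidence = route(sub_q), []
        for _ in range(max_rounds):
            evidence = kg_search(sub_q) if source == "kg" else vector_search(sub_q)
            if verify(evidence):
                break
            # Gap detected: retry with the other retriever (assumed strategy).
            source = "vector" if source == "kg" else "kg"
        results[sub_q] = evidence
    return results


if __name__ == "__main__":
    answer = retrieve_iteratively("What can metformin cause and what is lactic acidosis?")
    for sub_q, evidence in answer.items():
        print(sub_q, "->", evidence)
```

In this sketch a question such as "What does lactic acidosis cause?" is first routed to the graph, finds no matching head entity, and is then retried against the text index. A real system would replace each toy function with a learned or ontology-backed component.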

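Cite this article

谢雨霏, 李琳, 李涛, 何柳, 高贝琳, 何志婷. IHRAG: Iterative Hybrid Retrieval-augmented Generation for Large Language Model. 计算机系统应用 (Computer Systems & Applications),,():1-10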

History
  • Received: 2025-09-03
  • Last revised: 2025-09-29
  • Published online: 2026-01-19