Secure Inference Solution for LLM Based on Confidential Computing
Author:
Affiliation:

Author biography:

Corresponding author:

CLC number:

Fund project:

National Key R&D Program of China (2022YFB4501500, 2022YFB4501501)




    Abstract:

    Large language models (LLMs), represented by ChatGPT and DeepSeek, are developing rapidly and are widely used in tasks such as text generation and intelligent assistance. However, these models also face severe privacy and security risks. In high-security scenarios such as healthcare and finance in particular, threats such as model theft and data privacy leakage are often key obstacles to deploying LLMs. Existing security solutions for protecting LLM inference have notable limitations: they either lack runtime protection for the inference computation itself, or face practicality challenges due to the high cost of computation and communication. Confidential computing, which builds a secure inference environment on trusted execution environment (TEE) hardware, is a practical and effective technology for secure LLM inference. This study therefore proposes a confidential-computing-based secure inference scheme for LLMs. The scheme ensures the integrity of the inference computing environment, the model weight parameters, and the model image files through remote attestation; encrypts LLM inference traffic via a confidential interconnect rooted in TEE hardware; and protects the privacy of user prompts in multi-user scenarios by isolating the inference contexts of different users. It provides end-to-end protection across the entire inference process and full communication chain while verifying the integrity of the execution environment, thereby achieving efficient and secure confidential LLM inference. Furthermore, a prototype system is implemented on a heterogeneous TEE server platform (SEV and CSV), and its security and performance are evaluated. The results show that, while achieving the stated security goals, the performance overhead introduced by the proposed scheme is theoretically no more than 1% of native AI model inference cost, a difference that is negligible in practice.
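The remote-attestation step summarized above (verifying the inference environment, model weights, and model image before releasing traffic to the TEE) can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: all function names are hypothetical, and an HMAC stands in for the hardware-rooted signature that a real SEV/CSV attestation report carries.

```python
import hashlib
import hmac

# Hypothetical sketch: combine the three integrity targets named in the
# abstract (environment, model weights, model image) into one measurement,
# then verify a signed report against reference values.

def measure(env_digest: bytes, weights: bytes, image: bytes) -> bytes:
    """Fold the launch digest and the hashes of weights/image together."""
    h = hashlib.sha256()
    for part in (env_digest,
                 hashlib.sha256(weights).digest(),
                 hashlib.sha256(image).digest()):
        h.update(part)
    return h.digest()

def sign_report(key: bytes, measurement: bytes, nonce: bytes) -> bytes:
    # Stand-in for the TEE hardware signature over the attestation report;
    # the nonce binds the report to one verification session.
    return hmac.new(key, measurement + nonce, hashlib.sha256).digest()

def verify_report(key: bytes, measurement: bytes, nonce: bytes,
                  sig: bytes, expected: bytes) -> bool:
    # Accept only if the signature checks AND the reported measurement
    # matches the published reference values for this model and image.
    return (hmac.compare_digest(sig, sign_report(key, measurement, nonce))
            and hmac.compare_digest(measurement, expected))
```

Only after such a check succeeds would a client establish the encrypted channel to the TEE; a tampered weight file or image changes the measurement and causes verification to fail.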

Cite this article:

Cui Y, Feng W, Qin Y, Ma HZ, Feng DG. Secure inference solution for LLM based on confidential computing. Computer Systems & Applications, 2026, 35(2): 76-91.

History
  • Received: 2025-07-17
  • Revised: 2025-08-13
  • Accepted:
  • Published online: 2025-11-26
  • Published in issue:
Copyright: Institute of Software, Chinese Academy of Sciences.