Confidence measures quantify how well speech data match a model; they can therefore be used to detect recognition errors in voice command systems and improve their reliability. In recent years, methods based on the identity vector (i-vector) and Probabilistic Linear Discriminant Analysis (PLDA) have achieved notable results in speaker recognition. This study explores using i-vectors with a PLDA model as a confidence measure for command-word recognition results. The approach requires neither an acoustic model nor a language model, and experiments show that it performs well. Furthermore, because i-vectors are weak at capturing temporal information, the study fuses this system with Dynamic Time Warping (DTW), which effectively improves the system's ability to discriminate the temporal structure of the audio.
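To make the fusion idea concrete, the following is a minimal sketch, not the method used in this study. It assumes that a PLDA score for the utterance is already available (higher meaning a better match), computes a textbook length-normalised DTW distance between the utterance and a command template, and combines the two with a simple weighted sum. The function names `dtw_distance` and `fused_confidence`, the linear fusion rule, and the `weight` parameter are all hypothetical choices made for illustration.

```python
import math


def dtw_distance(a, b):
    """Length-normalised DTW distance between two sequences of feature vectors.

    `a` and `b` are non-empty lists of equal-dimension float vectors
    (e.g. per-frame acoustic features). This is the classic dynamic
    programming formulation with Euclidean local cost.
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")
    inf = float("inf")
    # cost[i][j] holds the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m] / (n + m)


def fused_confidence(plda_score, dtw_dist, weight=0.5):
    """Hypothetical linear fusion; a higher value means higher confidence.

    The PLDA score rewards similarity while the DTW distance penalises
    temporal mismatch, so the distance enters with a negative sign.
    """
    return weight * plda_score - (1.0 - weight) * dtw_dist
```

In such a scheme, a command would be accepted when the fused value exceeds a threshold tuned on held-out data. Because the DTW term depends on frame order, it can separate utterances whose frames are similar overall but arranged differently in time, which is the kind of information the abstract says i-vectors capture poorly.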