给图片添加特定扰动可以生成对抗样本, 误导深度神经网络输出错误结果, 更加强力的攻击方法可以促进网络模型安全性和鲁棒性的研究. 攻击方法分为白盒攻击和黑盒攻击, 对抗样本的迁移性可以借已知模型生成结果来攻击其他黑盒模型. 基于直线积分梯度的攻击TAIG-S可以生成具有较强迁移性的样本, 但是在直线路径中会受噪声影响, 叠加与预测结果无关的像素梯度, 影响了攻击成功率. 所提出的Guided-TAIG方法引入引导积分梯度, 在每一段积分路径计算上采用自适应调整的方式, 纠正绝对值较低的部分像素值, 并且在一定区间内寻找下一步的起点, 规避了无意义的梯度噪声累积. 基于ImageNet数据集上的实验表明, Guided-TAIG在CNN和Transformer架构模型上的白盒攻击性能均优于FGSM、C&W、TAIG-S等方法, 并且制作的扰动更小, 黑盒模式下迁移攻击性能更强, 表明了所提方法的有效性.
Adding specific perturbations to images can help generate adversarial samples that mislead deep neural networks to output incorrect results. More powerful attack methods can facilitate research on the security and robustness of network models. The attack methods are divided into white-box and black-box attacks, and the transferability of adversarial samples can be used to attack other black-box ones by the results generated by known models. Attacks based on linear integrated gradients (TAIG-S) can generate highly transferable adversarial samples, but they are affected by noise in the linear path, superimposing pixel gradients that are irrelevant to the prediction results, which limits the success rate of attacks. With guided integrated gradients, the proposed Guided-TAIG method uses adaptive adjustment to correct some pixel values with low absolute values on each segment of the integrated path calculation and finds the starting point of the next step within a certain interval, circumventing the accumulation of meaningless gradient noise. The experiments on the ImageNet dataset show that Guided-TAIG outperforms FGSM, C&W, and TAIG-S for white-box attacks on both CNN and Transformer architecture models, produces smaller perturbations, and has better performance for transferable attacks in the black-box mode. This demonstrates the effectiveness of the proposed method.