Adversarial Robustness of Distance-based Machine Learning Models
【Author】 Wang Lu (王璐);
【Author Information】 Nanjing University, Computer Science and Technology, 2020, Ph.D.
【Abstract】 Real-world application scenarios of machine learning models are complex, and the task environment may be unstable. In security-sensitive domains, simply assuming that the samples a model sees at prediction time are independent and identically distributed with its training samples can incur serious security risks. A typical example is the problem of adversarial examples: a small, carefully crafted perturbation added to a test sample's feature representation can easily change the model's prediction. The prevalence of this phenomenon shows that models lack adversarial robustness (robustness for short). This thesis focuses on the adversarial robustness of distance-based machine learning models (distance models for short): although distance models are widely used, their adversarial robustness is comparatively understudied. We start from robustness evaluation methods for distance models, proceed to robustness enhancement methods, and finally extend the results to more general models, including deep neural networks.

1. Robustness evaluation for the nearest neighbor classifier (1-NN). Existing robustness evaluation methods for 1-NN rely on differentiable substitute models and therefore cannot achieve optimal evaluation. We formalize the robustness evaluation of 1-NN as a set of convex quadratic programming problems and, within a primal-dual framework, derive an efficient algorithm that exactly computes the minimal adversarial perturbation, yielding both an optimal attack and an optimal robustness verification method (a sketch of the underlying quadratic programs is given after item 2 below).

2. Robustness verification for the K-nearest neighbor classifier (K-NN). Unlike for 1-NN, the time complexity of optimal robustness verification for K-NN grows exponentially with K. To address this, we propose two verification methods for K-NN: constraint relaxation and randomized smoothing. The two methods complement each other, suiting small and large K respectively, and together achieve favorable verification performance (a smoothing sketch also follows below).
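To make item 1 concrete, the following is a minimal sketch of the convex-QP view of 1-NN robustness evaluation. It is not the primal-dual algorithm of the thesis: it hands each candidate's QP to a generic solver (cvxpy, assumed installed), and the function and argument names are illustrative. The key fact is that forcing x + d to be nearer to an other-class candidate x_j than to every training point sharing x's label gives constraints that are linear in d, so each candidate yields a convex QP, and the minimal adversarial perturbation is the smallest solution over candidates.

```python
import numpy as np
import cvxpy as cp  # generic convex solver; stands in for the thesis' primal-dual algorithm

def min_adv_perturbation_1nn(x, X_same, X_other):
    """Minimal L2 perturbation flipping a 1-NN prediction (illustrative sketch).

    For each other-class candidate x_j, solve
        min ||d||^2  s.t.  ||x+d-x_j||^2 <= ||x+d-x_i||^2  for all same-class x_i,
    whose constraints simplify to linear ones in d; return the best solution over j.
    """
    best = None
    for x_j in X_other:
        d = cp.Variable(x.shape[0])
        cons = []
        for x_i in X_same:
            # ||x+d-x_j||^2 - ||x+d-x_i||^2 <= 0  reduces to  a.d + b <= 0
            a = 2 * (x_i - x_j)
            b = np.dot(x - x_j, x - x_j) - np.dot(x - x_i, x - x_i)
            cons.append(a @ d + b <= 0)
        prob = cp.Problem(cp.Minimize(cp.sum_squares(d)), cons)
        prob.solve()
        if d.value is not None and (best is None or np.linalg.norm(d.value) < np.linalg.norm(best)):
            best = d.value
    return best  # minimal perturbation over all candidates, or None if no QP solved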
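The randomized-smoothing route of item 2 can be illustrated with a standard Monte-Carlo certificate in the style of Cohen et al. (2019), rather than the K-NN-specific method of the thesis. Names are illustrative, `predict` can be any base classifier (for example a K-NN), and a rigorous certificate would replace the empirical vote fraction with a lower confidence bound.

```python
import numpy as np
from scipy.stats import norm

def smoothed_prediction_and_radius(predict, x, sigma=0.5, n=1000, seed=0):
    """Monte-Carlo randomized smoothing (illustrative sketch).

    Classify Gaussian-perturbed copies of x with the base classifier `predict`,
    take the majority vote, and certify an L2 radius sigma * Phi^{-1}(p_top)
    whenever the top vote fraction exceeds 1/2.
    """
    rng = np.random.default_rng(seed)
    votes = {}
    for _ in range(n):
        label = predict(x + sigma * rng.standard_normal(x.shape))
        votes[label] = votes.get(label, 0) + 1
    top = max(votes, key=votes.get)
    p_top = votes[top] / n  # empirical estimate; a sound certificate uses a confidence lower bound
    radius = sigma * norm.ppf(p_top) if p_top > 0.5 else 0.0
    return top, radius
```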
3. Robustness enhancement for metric learning. Traditional metric learning does not take adversarial robustness into account. We extend our constraint-relaxation verification method for K-NN to metric learning and, on that basis, propose a new metric learning method, ARML (Adversarially Robust Metric Learning), which can also be viewed as a robustness enhancement method. ARML improves the classification performance of metric learning models while significantly strengthening their adversarial robustness, both empirical and certified (a generic sketch is given after item 4 below).

4. Accelerated robustness evaluation for black-box models. Black-box attacks are a general-purpose way to evaluate the robustness of machine learning models. Our theoretical analysis shows that for K-NN and related distance models, the minimal adversarial perturbation always lies in the subspace spanned by the training set. This motivates restricting the search space of a black-box attack to improve its efficiency. Based on this observation, we propose a general acceleration strategy for black-box attacks that raises attack success rates while significantly reducing attack costs, thereby extending our results on distance models to the general black-box setting.
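The exact ARML objective is specific to the thesis. As a stand-in for item 3, the sketch below takes one gradient step on a generic triplet hinge loss for a Mahalanobis metric d(u, v)^2 = ||L(u - v)||^2, with the margin enlarged by eps as a crude surrogate for a worst-case perturbation budget; the function name, the eps surrogate, and the plain gradient update are all assumptions for illustration, not the ARML formulation.

```python
import numpy as np

def robust_triplet_step(L, A, P, N, margin=1.0, eps=0.1, lr=0.01):
    """One gradient step on an eps-enlarged triplet hinge loss (illustrative sketch).

    Metric: d(u, v)^2 = ||L (u - v)||^2.  For anchor a, positive p, negative n,
    the loss term is max(0, d(a,p)^2 - d(a,n)^2 + margin + eps); the extra eps
    crudely emulates robustness to small perturbations of the anchor.
    """
    G = np.zeros_like(L)
    for a, p, n in zip(A, P, N):
        dp, dn = a - p, a - n
        if dp @ L.T @ L @ dp - dn @ L.T @ L @ dn + margin + eps > 0:  # active triplet
            # gradient of the active hinge term with respect to L
            G += 2 * L @ (np.outer(dp, dp) - np.outer(dn, dn))
    return L - lr * G / len(A)
```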
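Item 4's subspace observation translates directly into an attack restriction: if the minimal perturbation lies in span(X_train), a black-box attack only needs to search a space whose dimension is at most the training-set size, not the ambient input dimension. The sketch below grafts this restriction onto a naive hard-label random search; the concrete accelerated attacks in the thesis differ, and all names here are illustrative.

```python
import numpy as np

def subspace_random_search_attack(f, x, y_true, X_train,
                                  radii=np.linspace(0.1, 5.0, 50),
                                  tries_per_radius=40, seed=0):
    """Hard-label random search restricted to span(X_train) (illustrative sketch).

    f is the black-box classifier (returns a label); directions are drawn inside
    the training-set span, so the search dimension is at most len(X_train)
    rather than the ambient dimension of x.
    """
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(X_train.T)  # columns of Q: orthonormal basis of span(X_train)
    for radius in radii:
        for _ in range(tries_per_radius):
            d = Q @ rng.standard_normal(Q.shape[1])
            d *= radius / np.linalg.norm(d)
            if f(x + d) != y_true:   # one query to the black-box model
                return d             # first successful adversarial perturbation
    return None
```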
【Key words】 machine learning; adversarial robustness; nearest neighbor; metric learning; black-box attack;