  • A Summary of Deep Learning Methods for Combating Label Noise
    2020-07-31 11:17:27

    Deep learning models require large amounts of correctly labeled data, yet an estimated 8%~38.5% of the samples in real-world datasets are corrupted. Today's deep models easily overfit such noisy datasets, which degrades their performance on the test set. The popular techniques for preventing overfitting, such as data augmentation, weight decay, dropout and batch normalization, do not solve this problem well.

    1 Robust Loss Functions

    These methods modify the loss function so that a model trained on a noisy dataset performs about as well as one trained on clean data. For instance, categorical cross-entropy is the usual loss for classification, but on its own it copes poorly with noisy labels.

    Some researchers therefore proposed losses such as GCE (generalized cross entropy) and SCE (symmetric cross entropy) to combat noisy labels. These modified losses, however, only suit simple settings, i.e., easy tasks with relatively little data.

    In practice, modified loss functions often end up hurting model performance. A sketch of one such loss follows.
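
    For illustration, here is a minimal PyTorch sketch of the GCE loss (the helper name gce_loss and the default q=0.7 are my own choices; GCE interpolates between cross entropy as q approaches 0 and MAE at q=1):

    import torch
    import torch.nn.functional as F

    def gce_loss(logits, targets, q=0.7):
        # Generalized cross entropy: L_q = (1 - p_y^q) / q, where p_y is the
        # predicted probability of the (possibly noisy) labeled class.
        p_y = F.softmax(logits, dim=1).gather(1, targets.unsqueeze(1)).squeeze(1)
        return ((1.0 - p_y.pow(q)) / q).mean()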

    2 Robust Architectures

    This family includes noise adaptation layers and dedicated architectures for estimating label-transition probabilities; generative adversarial networks also fall into this category. My knowledge of these methods is limited; I only know that they tend to be hard to train and not very effective.

    3 Regularization

    The usual weight decay, dropout and batch normalization are enough to withstand a small amount of label noise. Beyond that, pre-trained models such as BERT and ELMo can also make a model more robust during fine-tuning, mainly because pre-training keeps the parameters from being pushed in the wrong direction by noisy data, as can happen when training from scratch. Regularization and pre-training are currently the general-purpose ways to improve robustness: they are very convenient, requiring only small changes to the training procedure, and they work reasonably well against small amounts of noise. Their drawback is equally clear: with somewhat more noise, they no longer hold up.

    4 Loss Adjustment

    These methods adjust the influence of every training sample on the loss before the parameters are updated. We can adjust the loss by estimating a label transition matrix, by assigning different weights to different samples, or by adjusting the samples' class labels, all of which ultimately change the loss value. A sketch of the transition-matrix variant follows.
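
    As a sketch of the transition-matrix idea, here is a forward-style loss correction in the spirit of Patrini et al. (the helper name, the matrix convention T[i, j] = P(observed label j | true label i), and the smoothing constant are my own):

    import torch
    import torch.nn.functional as F

    def forward_corrected_ce(logits, noisy_targets, T):
        # Push the clean posterior through the (C, C) transition matrix T to
        # model the noisy-label distribution, then score the observed label.
        p_noisy = F.softmax(logits, dim=1) @ T
        return F.nll_loss(torch.log(p_noisy + 1e-8), noisy_targets)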

    5 Sample Selection

    To avoid introducing wrongly corrected labels, many researchers instead select samples directly, discarding those suspected of being noisy. The core of these methods is the rule by which suspected noisy samples are dropped, and many such rules have been proposed; I will not enumerate them all here (one example is sketched below). Although these methods introduce no mislabeled samples, they inevitably discard some correctly labeled ones.
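
    As one concrete example of such a rule, here is a minimal sketch of the small-loss criterion used by Co-teaching-style methods (keep_ratio is a hypothetical fixed hyper-parameter; the actual methods anneal it over training):

    import torch
    import torch.nn.functional as F

    def small_loss_selection(logits, targets, keep_ratio=0.7):
        # Keep the samples with the smallest loss in this batch; large-loss
        # samples are treated as likely mislabeled and dropped.
        losses = F.cross_entropy(logits, targets, reduction='none')
        n_keep = max(1, int(keep_ratio * losses.numel()))
        keep_idx = torch.argsort(losses)[:n_keep]
        return losses[keep_idx].mean(), keep_idx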

    6 Meta-Learning

    Meta-learning has been popular in recent years; its main idea is learning to learn. It is often applied to few-shot learning, since it lets a model fit quickly on very few samples. Some researchers now use meta-learning against label noise. One line of work exploits this few-shot ability, letting a meta-trained model fit a dataset quickly and thereby avoid overfitting. Another line learns the network's parameter-update strategy or its loss; these methods usually require a clean validation set to train the meta-model, but clean data is sometimes hard to obtain in the real world.

    7 Semi-Supervised Learning

    Some researchers first train several small networks on a small clean dataset and then ensemble their predictions on the noisy set to filter out likely noisy labels. Others partition the training set, train the same learning algorithm to different parameters on different partitions, infer labels over the whole dataset, and then discard or relabel samples according to the inferred labels; a toy sketch follows.
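
    As a toy illustration of the partition-and-infer scheme (with a scikit-learn linear model standing in for the trained networks; all names here are mine, and X, y are assumed to be NumPy arrays):

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import KFold

    def cross_validation_filter(X, y, n_splits=5):
        # Train on K-1 partitions, infer labels on the held-out partition;
        # samples whose inferred label disagrees with the given one are
        # flagged as possible label noise (to be dropped or corrected).
        flagged = np.zeros(len(y), dtype=bool)
        kf = KFold(n_splits=n_splits, shuffle=True, random_state=0)
        for train_idx, val_idx in kf.split(X):
            clf = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
            flagged[val_idx] = clf.predict(X[val_idx]) != y[val_idx]
        return flagged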

    Closing Thoughts

    Overall, most current work on noisy labels concentrates on CV; corresponding research in NLP is still scarce. I am actually more interested in handling noise in NLP, because compared with CV datasets, label noise is just as pervasive in the datasets of basic NLP tasks such as NER and RE. I have been running many NER experiments lately, and found that many of a trained model's so-called inference errors on the test set are in fact annotation errors in the test set itself. Apart from manual screening and rule matching, I have no good idea yet for finding annotation errors in NER datasets; does anyone have suggestions?

    For more methods, see:
    Learning from Noisy Labels with Deep Neural Networks: A Survey

  • To address this problem, some works add Gaussian noise to the images during training to improve the network's ability to defend against adversarial examples, but this way of adding noise ignores that a neural network's sensitivity differs across regions of an image. To tackle this, a gradient-guided ...
  • Boosting Adversarial Attacks with Momentum

    Editor's note: Accuracy and speed have long been the core criteria for evaluating deep models, yet even high-performing deep neural networks are easily attacked by adversarial examples. Finding suitable adversarial attack strategies can therefore help improve a model's robustness. The authors propose a momentum-based iterative algorithm for constructing adversarial perturbations, which effectively relaxes the coupling between white-box attack success rate and transferability and can successfully attack white-box and black-box models at the same time.

    1. Research Motivation

    Although deep neural networks have achieved remarkable results in speech recognition, image classification, object detection and many other areas, they are easily attacked by adversarial examples. An adversarial example is an original sample with a small amount of noise added, so that the deep learning model misclassifies it, while a human observer can hardly tell the adversarial example and the normal sample apart.

    Adversarial examples are generated in two main settings: white-box attacks and black-box attacks. In a white-box attack, the attacker knows the structure and parameters of the target network and can construct adversarial examples with gradient-based methods. Because the constructed examples transfer to some extent (an example crafted against one model can also fool another), they can be used to attack black-box models whose structure and parameters are unknown; this is a black-box attack.

    In practice, however, attacking a black-box model is very hard, especially one equipped with defenses. The root cause is a coupling between the white-box success rate and the transferability of existing attacks: no method achieves both a high white-box success rate and strong transferability at the same time.

    Concretely, the one-step fast gradient sign method (FGSM) produces adversarial examples that transfer well, but its white-box success rate is severely limited, so it cannot attack black-box models effectively; conversely, the multi-step iterative method (I-FGSM) attacks white-box models well, but its adversarial examples transfer poorly and likewise fail against black-box models. We therefore propose a new family of attacks that effectively relaxes the coupling between white-box success rate and transferability, succeeding against white-box and black-box models simultaneously.

    Figure 1: Examples of adversarial images

    2. Approach

    2.1 Problem Definition

    Generating adversarial noise is essentially an optimization problem. For a single model f(x), the attacker wants an untargeted adversarial example under an L_∞ constraint: an example x^* such that f(x^*)≠y and ‖x^*-x‖_∞≤ϵ, where y is the true class of the real sample x and ϵ is the allowed noise budget. The corresponding objective is

    max_(x^*) J(x^*, y)  s.t.  ‖x^*-x‖_∞≤ϵ

    where J is the model's loss function, usually defined as the cross-entropy loss.

    2.2 Related Work

    To solve this optimization problem, Goodfellow et al. first proposed the fast gradient sign method (FGSM), which generates an adversarial example in a single gradient step:

    x^* = x + ϵ·sign(∇_x J(x, y))

    The white-box success rate of this method is rather low. To raise it, the iterative method (I-FGSM) generates better adversarial examples through multi-step updates:

    x^*_(t+1) = x^*_t + α·sign(∇_x J(x^*_t, y)),  with x^*_0 = x and each iterate kept inside the ϵ-ball around x

    This method achieves a high white-box success rate, but its examples transfer poorly, so it is also ineffective for attacking other black-box models.

    2.3 The Momentum Attack

    We propose adding a momentum term to the basic iterative attack, which avoids the update oscillations and poor local maxima that can occur during the iterations and yields adversarial examples that successfully fool the target network. Because the iterative method computes the current gradient at every step and greedily adds it to the adversarial example, the resulting example fools only the directly attacked white-box model and not unknown black-box models, which greatly limits it in practice.

    In general-purpose optimization, momentum accelerates convergence, escapes poor local optima, and stabilizes the update direction. Inspired by this, adding a momentum term to the iterative generation of adversarial examples produces examples that effectively fool not only the white-box model but also unknown black-box models, giving a stronger attack.

    The momentum iterative fast gradient sign method (MI-FGSM) addresses both problems. The algorithm is:

    g_(t+1) = μ·g_t + ∇_x J(x^*_t, y) / ‖∇_x J(x^*_t, y)‖_1
    x^*_(t+1) = x^*_t + α·sign(g_(t+1))

    Suppose the iteration runs for T rounds; to satisfy the constraint ‖x^*-x‖_∞≤ϵ, the per-step size is set to α=ϵ/T, and μ is the decay factor of the momentum g. Adding noise to a real sample x step by step in this way yields an adversarial example x^* that fools f(x) and also transfers to other unknown models, causing several of them to misclassify. The method extends to targeted attacks and to attacks under the L_2 metric. A sketch follows.
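
    For concreteness, here is a minimal PyTorch sketch of the MI-FGSM update (assuming NCHW image batches in [0, 1]; eps=16/255, T=10 and mu=1.0 mirror common settings from the paper, but treat this as an illustration rather than the authors' released code):

    import torch
    import torch.nn.functional as F

    def mi_fgsm(model, x, y, eps=16/255, T=10, mu=1.0):
        # Accumulate the L1-normalized gradient into a momentum buffer g,
        # then step along sign(g) with step size alpha = eps / T.
        alpha = eps / T
        x_adv = x.clone().detach()
        g = torch.zeros_like(x)
        for _ in range(T):
            x_adv.requires_grad_(True)
            loss = F.cross_entropy(model(x_adv), y)
            grad, = torch.autograd.grad(loss, x_adv)
            g = mu * g + grad / grad.abs().sum(dim=(1, 2, 3), keepdim=True).clamp_min(1e-12)
            x_adv = x_adv.detach() + alpha * g.sign()
            # Stay inside the eps-ball around x and in the valid pixel range.
            x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0.0, 1.0)
        return x_adv.detach()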

    2.4 Attacking Multiple Models

    To further raise the black-box success rate, we can attack several white-box models at once, improving the transferability of the adversarial examples. For K different models, the goal is an adversarial example that fools all K simultaneously. To this end, we first take a weighted average of the K models' unnormalized logits:

    l(x) = ∑_(k=1)^K w_k·l_k(x)

    where l_k(x) is the unnormalized logits of the k-th model (the input to the network's final softmax layer), and w_k is the weight of the k-th model, with w_k≥0 and ∑_(k=1)^K w_k = 1. This yields an ensemble model whose loss function is defined as the softmax cross-entropy

    J(x, y) = −1_y·log(softmax(l(x)))

    where 1_y is the one-hot encoding of the true class y.

    The momentum-based method described above can then be used to attack this ensemble model, as sketched below.
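
    Continuing the sketch above, the ensemble attack only changes the loss: the K logits are fused before the softmax cross-entropy (weights is assumed to be a list of non-negative floats summing to 1):

    import torch.nn.functional as F

    def ensemble_loss(models, weights, x_adv, y):
        # l(x) = sum_k w_k * l_k(x): fuse the unnormalized logits, then take
        # softmax cross-entropy on the fused logits.
        fused = sum(w * m(x_adv) for m, w in zip(models, weights))
        return F.cross_entropy(fused, y)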

    3. Algorithm Flowchart

    The flowchart is shown in Figure 2. An original image, correctly classified by the image classifier, is given as input. The proposed momentum-based iterative algorithm constructs an adversarial perturbation and adds it to the original sample, producing an adversarial image that the classifier misclassifies.

    Figure 2: Algorithm flowchart

    4. Experimental Results

    4.1 Datasets

    To test the effectiveness of the proposed method, we generate adversarial examples for image classification. Seven models are studied: Inception V3 (Inc-v3), Inception V4 (Inc-v4), Inception ResNet V2 (IncRes-v2), ResNet v2-152 (Res-152), Inc-v3ens3, Inc-v3ens4 and IncRes-v2ens. All are trained on the large-scale ImageNet dataset; the last three are obtained through ensemble adversarial training and thus have some defensive capability. This experiment uses 1,000 images from the ImageNet validation set and measures the success rate of the different attack methods to assess their performance.

    4.2 Evaluation Metric

    We use the attack success rate as the evaluation metric: among images that were originally classified correctly, the fraction that are predicted with a wrong label after the adversarial noise is added. A small helper is given below.
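
    In NumPy terms (a small illustrative helper of my own, not from the paper):

    import numpy as np

    def attack_success_rate(clean_pred, adv_pred, y_true):
        # Among images classified correctly when clean, the fraction that are
        # misclassified once the adversarial noise is added.
        correct = clean_pred == y_true
        return float(np.mean(adv_pred[correct] != y_true[correct]))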

    4.3 Results

    Using the proposed method we attack the four models Inc-v3, Inc-v4, IncRes-v2 and Res-152, feed the generated adversarial examples into all seven models, and measure the attacks' effectiveness. For comparison we also include FGSM and I-FGSM as baseline methods. The results are shown in Table 1:

    Table 1: Attack success rates

    The table shows that the proposed MI-FGSM markedly improves black-box success rates, roughly doubling those of I-FGSM. We also evaluate the ensemble attack; the results are shown in Table 2.

    Table 2: Ensemble attack results

    The results show that the proposed scheme of weighted averaging over the models' unnormalized logits works best.

    5. Conclusion and Outlook

    This paper demonstrates the vulnerability of deep learning models in the black-box setting and the effectiveness of the momentum-based attack. The experiments also show that the proposed method is less effective against models with defense mechanisms. In follow-up work we proposed a translation-invariant attack ("Evading Defenses to Transferable Adversarial Examples by Translation-Invariant Attacks", CVPR 2019, Oral) that further improves attacks against defended models.

    Code:

    https://github.com/dongyp13/Non-Targeted-Adversarial-Attacks

    Paper:

    Boosting Adversarial Attacks with Momentum.

    https://arxiv.org/pdf/1710.06081.pdf


    About the author:

    Yinpeng Dong is a second-year PhD student at the Institute for Artificial Intelligence, Department of Computer Science, Tsinghua University, advised by Prof. Jun Zhu. His research covers machine learning and computer vision, with a focus on the robustness of deep learning. He has published more than ten papers at top international venues including CVPR, NIPS and IJCV, and as team leader won all three tracks of the NIPS 2017 adversarial attack and defense competition organized by Google. His honors include CCF Outstanding Undergraduate, a National Scholarship, Tsinghua's Future Scholar Scholarship, and the CCF-CV Rising Star award.


  • From Oxford: Adversarial Training with Random Noise

    2022-02-04 19:41:20

    Introduction

    This paper, from the University of Oxford, studies adversarial training. Prior work has shown that adversarial training with single-step $\mathrm{FGSM}$ suffers from a severe overfitting phenomenon. In this paper the authors revisit, through theoretical analysis and experiments, the roles of adversarial noise and gradient clipping in single-step adversarial training, and find that overfitting can be avoided effectively even for large perturbation radii. Based on this observation they propose $\mathrm{N\text{-}FGSM}$, a single-step adversarial training method with random noise; experiments show that it retains the reduced computational cost of single-step adversarial training while not suffering from the overfitting phenomenon. The paper provides no source code; the last section of this post gives a simple implementation of the algorithm.
    Paper link: https://arxiv.org/abs/2202.01181

    Preliminaries

    Given a classifier $f_\theta:\mathcal{X}\rightarrow\mathcal{Y}$ with parameters $\theta$ and a set $\mathcal{S}$ of adversarial perturbations, we say $f_\theta$ is robust at a point $x\in\mathcal{X}$ with respect to $\mathcal{S}$ if $f_\theta(x+\delta)=f_\theta(x)$ for every perturbation $\delta\in\mathcal{S}$. To make the neural network robust in the $\ell_\infty$ norm, the perturbation set is defined as $\mathcal{S}=\{\delta:\|\delta\|_\infty\le\epsilon\}$. Adversarial training modifies the usual classification training procedure on a dataset $\mathcal{D}=\{(x_i,y_i)\}_{i=1:N}$ and minimizes the objective
    $$\min_{\theta}\sum_{i=1}^N\max_{\delta}\mathcal{L}(f_\theta(x_i+\delta),y_i)\quad\mathrm{s.t.}\quad\|\delta\|_\infty\le\epsilon,$$
    where $\mathcal{L}$ is the cross-entropy loss of the image classifier. Since finding the exact optimum of the inner maximization is very hard, the most common approach is to approximate the worst-case perturbation with $\mathrm{PGD}$. This has been shown to produce robust models, but the computational cost grows linearly with the number of $\mathrm{PGD}$ iterations, so current work focuses on reducing the cost of adversarial training by approximating the inner maximization in a single step.

    If the loss is assumed to be locally linear in the input, the inner maximization of adversarial training has a closed-form solution. $\mathrm{Goodfellow}$ exploited this to propose $\mathrm{FGSM}$, in which the perturbation follows the direction of the gradient sign; $\mathrm{Tramer}$ et al. later suggested adding a random initialization before the $\mathrm{FGSM}$ step. Both methods, however, were subsequently shown to be vulnerable to multi-step attacks. They can be written as
    $$\delta=\psi\left(\eta+\alpha\cdot\mathrm{sign}(\nabla_{x_i}\mathcal{L}(f_\theta(x_i+\eta),y_i))\right),$$
    where $\eta$ follows a distribution $\Omega$, $\psi$ is the projection onto the $\ell_\infty$ ball, $\Omega$ is the uniform distribution on $[-\epsilon,\epsilon]^d$, and $d$ is the dimension of the input space. A short sketch of this projection step follows.
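
    To make the role of the projection $\psi$ concrete, here is a minimal PyTorch sketch of $\mathrm{RS\text{-}FGSM}$ (function and argument names are mine); $\mathrm{N\text{-}FGSM}$, introduced next, drops exactly the final clipping step:

    import torch
    import torch.nn.functional as F

    def rs_fgsm(model, x, y, eps, alpha):
        # Random start eta ~ U(-eps, eps)^d, one signed gradient step of size
        # alpha, then the projection psi back onto the eps l_inf ball.
        eta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
        loss = F.cross_entropy(model(x + eta), y)
        loss.backward()
        delta = eta.detach() + alpha * eta.grad.sign()
        return delta.clamp(-eps, eps)  # psi: the clipping that N-FGSM removes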

    $\mathrm{N\text{-}FGSM}$ Adversarial Training

    A common practice in adversarial training is to restrict the perturbations used during training to the $\epsilon$-$\ell_\infty$ ball. The rationale is that larger perturbations during training may needlessly reduce clean accuracy, since perturbations outside the ball are never evaluated at test time. Yet although clipping or bounding the noise magnitude is standard practice, the clipping is performed after the gradient-ascent step, so the clipped point may no longer provide effective adversarial training. Motivated by this, the authors study how the gradient-clipping operation and the magnitude of the noise in the random step affect the robustness obtained by single-step methods, and propose a simple and effective single-step adversarial training method, $\mathrm{N\text{-}FGSM}$, computed as
    $$\delta_{\mathrm{N\text{-}FGSM}}=\eta+\alpha\cdot\mathrm{sign}(\nabla_{x_i}\mathcal{L}(f_\theta(x_i+\eta),y_i)),$$
    where $\eta$ is sampled from the uniform distribution on $[-k,k]^d$. Since $\mathrm{N\text{-}FGSM}$ involves no gradient clipping, the expected squared norm of its perturbation can be shown to exceed that of $\mathrm{RS\text{-}FGSM}$. The algorithm and the proofs of the lemma and theorem are given below.

    Lemma 1 (expected perturbation norm): Let the $\mathrm{N\text{-}FGSM}$ perturbation be defined as
    $$\delta_{\mathrm{N\text{-}FGSM}}=\eta+\alpha\cdot\mathrm{sign}(\nabla_x\ell(f(x+\eta),y)),$$
    where $\eta\sim\Omega$ with $\Omega=\mathcal{U}\left([-k\epsilon,k\epsilon]^d\right)$ and step size $\alpha>0$. Then
    $$\mathbb{E}[\|\delta_{\mathrm{N\text{-}FGSM}}\|_2^2]=d\left(\frac{k^2\epsilon^2}{3}+\alpha^2\right),\qquad\mathbb{E}[\|\delta_{\mathrm{N\text{-}FGSM}}\|_2]\le\sqrt{d\left(\frac{k^2\epsilon^2}{3}+\alpha^2\right)}.$$

    Proof: Since $f(x)=\sqrt{x}$ is concave, Jensen's inequality $\mathbb{E}[f(x)]\le f(\mathbb{E}[x])$ gives
    $$\mathbb{E}_\eta[\|\delta_{\mathrm{N\text{-}FGSM}}\|_2]\le\sqrt{\mathbb{E}_\eta[\|\delta_{\mathrm{N\text{-}FGSM}}\|_2^2]}.$$
    It remains to compute $\mathbb{E}_\eta[\|\delta_{\mathrm{N\text{-}FGSM}}\|_2^2]$; abbreviate $\nabla_x\ell(f(x+\eta),y)_i$ as $\nabla(\eta)_i$:
    $$\begin{aligned}
    \mathbb{E}_\eta[\|\delta_{\mathrm{N\text{-}FGSM}}\|_2^2]&=\mathbb{E}_\eta\|\eta+\alpha\cdot\mathrm{sign}(\nabla_x\ell(f(x+\eta),y))\|_2^2=\sum_{i=1}^d\mathbb{E}_\eta[(\eta_i+\alpha\cdot\mathrm{sign}(\nabla(\eta)_i))^2]\\
    &=\sum_{i=1}^d\mathbb{E}_\eta[(\eta_i+\alpha)^2\mid\mathrm{sign}(\nabla(\eta)_i)=1]\cdot\mathbb{P}_\eta[\mathrm{sign}(\nabla(\eta)_i)=1]\\
    &\quad+\sum_{i=1}^d\mathbb{E}_\eta[(\eta_i-\alpha)^2\mid\mathrm{sign}(\nabla(\eta)_i)=-1]\cdot\mathbb{P}_\eta[\mathrm{sign}(\nabla(\eta)_i)=-1]\\
    &=\sum_{i=1}^d\frac{1}{2k\epsilon}\int_{-k\epsilon}^{k\epsilon}(\eta_i+\alpha)^2\,d\eta_i\cdot\mathbb{P}_\eta[\mathrm{sign}(\nabla(\eta)_i)=1]+\sum_{i=1}^d\frac{1}{2k\epsilon}\int_{-k\epsilon}^{k\epsilon}(\eta_i-\alpha)^2\,d\eta_i\cdot\mathbb{P}_\eta[\mathrm{sign}(\nabla(\eta)_i)=-1]\\
    &=\frac{1}{2k\epsilon}\int_{\alpha-k\epsilon}^{\alpha+k\epsilon}z^2\,dz\sum_{i=1}^d\left(\mathbb{P}_\eta[\mathrm{sign}(\nabla(\eta)_i)=1]+\mathbb{P}_\eta[\mathrm{sign}(\nabla(\eta)_i)=-1]\right)\\
    &=\frac{d}{6k\epsilon}\left[(\alpha+k\epsilon)^3-(\alpha-k\epsilon)^3\right]=\frac{dk^2\epsilon^2}{3}+d\alpha^2,
    \end{aligned}$$
    where both integrals reduce to $\int_{\alpha-k\epsilon}^{\alpha+k\epsilon}z^2\,dz$ via the substitutions $z=\eta_i+\alpha$ and $z=\alpha-\eta_i$. Hence
    $$\mathbb{E}_\eta[\|\delta_{\mathrm{N\text{-}FGSM}}\|_2]\le\sqrt{d\left(\frac{k^2\epsilon^2}{3}+\alpha^2\right)},$$
    which completes the proof.

    Theorem 1: Let $\delta_{\mathrm{N\text{-}FGSM}}$, $\delta_{\mathrm{FGSM}}$ and $\delta_{\mathrm{RS\text{-}FGSM}}$ be the adversarial perturbations generated by $\mathrm{N\text{-}FGSM}$, $\mathrm{FGSM}$ and $\mathrm{RS\text{-}FGSM}$ respectively. Then for any $\epsilon>0$,
    $$\mathbb{E}_\eta[\|\delta_{\mathrm{N\text{-}FGSM}}\|_2^2]>\mathbb{E}_\eta[\|\delta_{\mathrm{FGSM}}\|_2^2]>\mathbb{E}_\eta[\|\delta_{\mathrm{RS\text{-}FGSM}}\|_2^2].$$

    Proof: By Lemma 1, $\mathbb{E}_\eta[\|\delta_{\mathrm{N\text{-}FGSM}}\|_2^2]=d\left(\frac{k^2\epsilon^2}{3}+\alpha^2\right)$. Moreover,
    $$\mathbb{E}_\eta[\|\delta_{\mathrm{RS\text{-}FGSM}}\|_2^2]=d\left(-\frac{1}{6}\alpha^3+\frac{1}{2}\alpha^2+\frac{1}{3}\epsilon^2\right),\qquad\mathbb{E}_\eta[\|\delta_{\mathrm{FGSM}}\|_2^2]=\|\delta_{\mathrm{FGSM}}\|_2^2=d\epsilon^2.$$
    Taking the hyper-parameters $k=2$ with $\alpha=\epsilon$ for $\mathrm{N\text{-}FGSM}$ and $\alpha=\frac{5\epsilon}{4}$ for $\mathrm{RS\text{-}FGSM}$ gives
    $$\mathbb{E}_\eta[\|\delta_{\mathrm{N\text{-}FGSM}}\|_2^2]=\frac{7}{3}d\epsilon^2>\mathbb{E}_\eta[\|\delta_{\mathrm{FGSM}}\|_2^2]=d\epsilon^2>\mathbb{E}_\eta[\|\delta_{\mathrm{RS\text{-}FGSM}}\|_2^2]=\frac{101}{128}d\epsilon^2,$$
    which completes the proof.

    Experimental Results

    The figure below compares the classification accuracy of $\mathrm{N\text{-}FGSM}$ with $\mathrm{GradAlign}$ and multi-step methods across different perturbation radii on $\mathrm{CIFAR10}$ (left) and $\mathrm{SVHN}$ (right), using a $\mathrm{PreactResNet18}$. Although all methods reach the clean-sample accuracy (dashed line), a gap in robust accuracy remains between $\mathrm{PGD\text{-}10}$ and the single-step methods; most importantly, $\mathrm{PGD\text{-}10}$ costs roughly 10 times the computation of $\mathrm{N\text{-}FGSM}$.

    The next figure compares single-step methods on $\mathrm{CIFAR100}$ (left) and $\mathrm{SVHN}$ (right) with a $\mathrm{PreactResNet18}$ across different perturbation radii. $\mathrm{N\text{-}FGSM}$ matches or exceeds the state-of-the-art results while cutting the computational cost by a factor of 3.

    The last figure visualizes the adversarial perturbation $\delta$ and the mean gradient over several epochs at the start (top) and end (bottom) of training. Once overfitting occurs, the perturbations of $\mathrm{FGSM}$ and $\mathrm{RS\text{-}FGSM}$ become uninterpretable, as do their gradients, whereas $\mathrm{N\text{-}FGSM}$ and $\mathrm{PGD\text{-}10}$ avoid this.

    Code

    The paper does not release source code; the following is an implementation of the paper's algorithm on the MNIST dataset.

    import argparse
    import logging
    import time
    
    import numpy as np
    import torch
    import torch.nn as nn
    import torch.nn.functional as F
    import torchvision
    from torchvision import datasets, transforms
    from torch.utils.data import DataLoader, Dataset
    import os
    
    def get_args():
        parser = argparse.ArgumentParser()
        parser.add_argument('--batch-size', default=100, type=int)
        parser.add_argument('--data-dir', default='mnist-data', type=str)
        parser.add_argument('--epochs', default=10, type=int)
        parser.add_argument('--epsilon', default=0.3, type=float)
        parser.add_argument('--alpha', default=0.375, type=float)
        parser.add_argument('--lr-max', default=5e-3, type=float)
        parser.add_argument('--lr-type', default='cyclic')
        parser.add_argument('--fname', default='mnist_model', type=str)
        parser.add_argument('--seed', default=0, type=int)
        return parser.parse_args()
    
    class Flatten(nn.Module):
        def forward(self, x):
            return x.view(x.size(0), -1)
    
    def mnist_net():
        model = nn.Sequential(
            nn.Conv2d(1, 16, 4, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(16, 32, 4, stride=2, padding=1),
            nn.ReLU(),
            Flatten(),
            nn.Linear(32*7*7,100),
            nn.ReLU(),
            nn.Linear(100, 10)
        )
        return model
    
    class Attack_methods(object):
        def __init__(self, model, X, Y, epsilon, alpha):
            self.model = model
            self.X = X
            self.Y = Y
            self.epsilon = epsilon
            self.alpha = alpha
    
        def nfgsm(self):
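            # eta ~ U(-epsilon, epsilon) is the random start; this corresponds to
            # k = 1 in the paper's notation (the paper also explores larger noise,
            # e.g. k = 2), and no clipping is applied to eta + alpha * sign(grad).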
            eta = torch.zeros_like(self.X).uniform_(-self.epsilon, self.epsilon)
            delta = torch.zeros_like(self.X)
            eta.requires_grad = True
            output = self.model(self.X + eta)
            loss = nn.CrossEntropyLoss()(output, self.Y)
            loss.backward()
            grad = eta.grad.detach()
            delta.data = eta + self.alpha * torch.sign(grad)
            return delta
    
    class Adversarial_Trainings(object):
        def __init__(self, epochs, train_loader, model, opt, epsilon, alpha, iter_num, lr_max, lr_schedule,
                     fname, logger):
            self.epochs = epochs
            self.train_loader = train_loader
            self.model = model
            self.opt = opt
            self.epsilon = epsilon
            self.alpha = alpha
            self.iter_num = iter_num
            self.lr_max = lr_max
            self.lr_schedule = lr_schedule
            self.fname = fname
            self.logger = logger
    
        def fast_training(self):
            for epoch in range(self.epochs):
                start_time = time.time()
                train_loss = 0
                train_acc = 0
                train_n = 0
    
                for i, (X, y) in enumerate(self.train_loader):
                    X, y = X.cuda(), y.cuda()
                    lr = self.lr_schedule(epoch + (i + 1) / len(self.train_loader))
                    self.opt.param_groups[0].update(lr=lr)
    
                    # Generating adversarial example
                    adversarial_attack = Attack_methods(self.model, X, y, self.epsilon, self.alpha)
                    delta = adversarial_attack.nfgsm()
    
                    # Update network parameters
                    output = self.model(torch.clamp(X + delta, 0, 1))
                    loss = nn.CrossEntropyLoss()(output, y)
                    self.opt.zero_grad()
                    loss.backward()
                    self.opt.step()
    
                    train_loss += loss.item() * y.size(0)
                    train_acc += (output.max(1)[1] == y).sum().item()
                    train_n += y.size(0)
    
                train_time = time.time()
                self.logger.info('%d \t %.1f \t %.4f \t %.4f \t %.4f', epoch, train_time - start_time, lr, train_loss/train_n, train_acc/train_n)
                torch.save(self.model.state_dict(), self.fname)
    
    logger = logging.getLogger(__name__)
    logging.basicConfig(
        format='[%(asctime)s] - %(message)s',
        datefmt='%Y/%m/%d %H:%M:%S',
        level=logging.DEBUG)
    
    
    def main():
        args = get_args()
        logger.info(args)
    
        np.random.seed(args.seed)
        torch.manual_seed(args.seed)
        torch.cuda.manual_seed(args.seed)
    
        mnist_train = datasets.MNIST("mnist-data", train=True, download=True, transform=transforms.ToTensor())
        train_loader = torch.utils.data.DataLoader(mnist_train, batch_size=args.batch_size, shuffle=True)
    
        model = mnist_net().cuda()
        model.train()
    
        opt = torch.optim.Adam(model.parameters(), lr=args.lr_max)
        if args.lr_type == 'cyclic':
            lr_schedule = lambda t: np.interp([t], [0, args.epochs * 2 // 5, args.epochs], [0, args.lr_max, 0])[0]
        elif args.lr_type == 'flat':
            lr_schedule = lambda t: args.lr_max
        else:
            raise ValueError('Unknown lr_type')
    
        logger.info('Epoch \t Time \t LR \t \t Train Loss \t Train Acc')
    
        adversarial_training = Adversarial_Trainings(args.epochs, train_loader, model, opt, args.epsilon, args.alpha, 40,
                                                     args.lr_max, lr_schedule, args.fname, logger)
        adversarial_training.fast_training()
    
    
    if __name__ == "__main__":
        main()
    
    

    The results of running the experiment are shown below.

  • Finding "adversarial noise" for MNIST images, and making a neural network immune to it

    This article demonstrates how to find "adversarial noise" for MNIST images, and how to make a neural network immune to such noise.

    01 - Simple Linear Model | 02 - Convolutional Neural Network | 03 - PrettyTensor | 04 - Save & Restore
    05 - Ensemble Learning | 06 - CIFAR-10 | 07 - Inception Model | 08 - Transfer Learning
    09 - Video Data | 11 - Adversarial Examples

    by Magnus Erik Hvass Pedersen / GitHub / Videos on YouTube
    Chinese translation: thrillerist / Github

    Please include a link to this article when reposting.


    Introduction

    The previous Tutorial #11 showed how to find adversarial examples for a state-of-the-art neural network: images that cause the network to misclassify even though they look unchanged to the human eye. For example, after adversarial noise is added, an image of a parrot gets misclassified as a bookcase, while to a human observer the image looks completely unchanged.

    Tutorial #11 found the adversarial noise through an optimization process run for each individual image. Because the noise was generated specifically for one image, it may not be universal, i.e., it may not work on other images.

    This tutorial instead finds adversarial noise that causes nearly all input images to be misclassified as a desired target class, using the MNIST dataset of hand-written digits as the example. The adversarial noise is now clearly visible to the human eye, and a human can still easily recognize the digits, yet the neural network misclassifies almost every image.

    In this tutorial we will also try to make the neural network immune to adversarial noise.

    Tutorial #11 used NumPy for the adversarial optimization. In this tutorial we implement the optimization directly in TensorFlow. This is faster, especially when using a GPU, because the data does not have to be copied to and from the GPU in every iteration.

    We recommend working through Tutorial #11 first. You also need to be roughly familiar with neural networks; see Tutorials #01 and #02.

    Flowchart

    The chart below shows how data flows through the convolutional neural network implemented below.

    The example shows an input image of the digit 7, to which adversarial noise is then added. Red noise points are positive values that make the pixels darker; blue noise points are negative values that make the input image lighter at those locations.

    The noisy image is fed to the neural network, which produces a predicted digit. In this case the adversarial noise convinces the network that this image of a 7 shows a 3. The noise is clearly visible to the human eye, yet a person can still easily recognize the digit 7.

    Note the remarkable point here: a single noise pattern causes the neural network to misclassify almost all input images as the desired target class.

    There are two separate optimization procedures in this neural network. First we optimize the network's variables to classify the training-set images; this is the network's ordinary optimization. Once the classification accuracy is high enough, we switch to the second procedure, which finds a single pattern of adversarial noise that makes all input images be misclassified as the target class.

    The two procedures are completely independent. The first only modifies the neural network's variables; the second only modifies the adversarial noise.

    from IPython.display import Image
    Image('images/12_adversarial_noise_flowchart.png')

    Imports

    %matplotlib inline
    import matplotlib.pyplot as plt
    import tensorflow as tf
    import numpy as np
    from sklearn.metrics import confusion_matrix
    import time
    from datetime import timedelta
    import math
    
    # We also need PrettyTensor.
    import prettytensor as pt

    Developed with Python 3.5.2 (Anaconda); the TensorFlow version is:

    tf.__version__

    '0.12.0-rc0'

    PrettyTensor version:

    pt.__version__

    '0.7.1'

    Load Data

    The MNIST dataset is about 12 MB and will be downloaded automatically if it is not found at the given path.

    from tensorflow.examples.tutorials.mnist import input_data
    data = input_data.read_data_sets('data/MNIST/', one_hot=True)

    Extracting data/MNIST/train-images-idx3-ubyte.gz
    Extracting data/MNIST/train-labels-idx1-ubyte.gz
    Extracting data/MNIST/t10k-images-idx3-ubyte.gz
    Extracting data/MNIST/t10k-labels-idx1-ubyte.gz

    The MNIST dataset has now been loaded; it consists of 70,000 images and the labels (i.e., classes) of those images. The dataset is split into three mutually independent subsets; in this tutorial we will only use the training set and the test set.

    print("Size of:")
    print("- Training-set:\t\t{}".format(len(data.train.labels)))
    print("- Test-set:\t\t{}".format(len(data.test.labels)))
    print("- Validation-set:\t{}".format(len(data.validation.labels)))复制代码

    Size of:

    - Training-set:        55000
    - Test-set:        10000
    - Validation-set:    5000

    The class labels are One-Hot encoded, meaning each label is a vector of length 10 that is zero except for one element. The index of that element is the class number, i.e., the digit drawn in the corresponding image. We will also need the test set's class numbers as integers, so we compute them now.

    data.test.cls = np.argmax(data.test.labels, axis=1)

    Data Dimensions

    The data dimensions are used in several places in the source code below. They are defined once here so that we can use these variables in the code instead of raw numbers.

    # We know that MNIST images are 28 pixels in each dimension.
    img_size = 28
    
    # Images are stored in one-dimensional arrays of this length.
    img_size_flat = img_size * img_size
    
    # Tuple with height and width of images used to reshape arrays.
    img_shape = (img_size, img_size)
    
    # Number of colour channels for the images: 1 channel for gray-scale.
    num_channels = 1
    
    # Number of classes, one class for each of 10 digits.
    num_classes = 10

    Helper function for plotting images

    This function plots 9 images in a 3x3 grid, and writes the true and predicted classes under each image. If noise is given, it is added to all the images.

    def plot_images(images, cls_true, cls_pred=None, noise=0.0):
        assert len(images) == len(cls_true) == 9
    
        # Create figure with 3x3 sub-plots.
        fig, axes = plt.subplots(3, 3)
        fig.subplots_adjust(hspace=0.3, wspace=0.3)
    
        for i, ax in enumerate(axes.flat):
            # Get the i'th image and reshape the array.
            image = images[i].reshape(img_shape)
    
            # Add the adversarial noise to the image.
            image += noise
    
            # Ensure the noisy pixel-values are between 0 and 1.
            image = np.clip(image, 0.0, 1.0)
    
            # Plot image.
            ax.imshow(image,
                      cmap='binary', interpolation='nearest')
    
            # Show true and predicted classes.
            if cls_pred is None:
                xlabel = "True: {0}".format(cls_true[i])
            else:
                xlabel = "True: {0}, Pred: {1}".format(cls_true[i], cls_pred[i])
    
            # Show the classes as the label on the x-axis.
            ax.set_xlabel(xlabel)
    
            # Remove ticks from the plot.
            ax.set_xticks([])
            ax.set_yticks([])
    
        # Ensure the plot is shown correctly with multiple plots
        # in a single Notebook cell.
        plt.show()

    Plot a few images to see whether the data is correct.

    # Get the first images from the test-set.
    images = data.test.images[0:9]
    
    # Get the true classes for those images.
    cls_true = data.test.cls[0:9]
    
    # Plot the images and labels using our helper-function above.
    plot_images(images=images, cls_true=cls_true)

    TensorFlow Graph

    We will now use TensorFlow and PrettyTensor to build the computational graph of the neural network. As usual, we create placeholder variables for the images to feed into the graph, and then add the adversarial noise to the images. The noisy images are then used as input to the convolutional neural network.

    The network has two separate optimization procedures: an ordinary optimization of the network's own variables, and another optimization of the adversarial noise. Both optimization procedures are implemented directly in TensorFlow.

    Placeholder Variables

    Placeholder variables provide the input to the computational graph in TensorFlow, which we may change each time we execute the graph. We call this feeding the placeholder variables.

    First we define the placeholder variable for the input images. This allows us to change the images that are input to the TensorFlow graph. It is a tensor, which means a multi-dimensional array. The data type is set to float32 and the shape to [None, img_size_flat], where None means the tensor may hold an arbitrary number of images, each image being a vector of length img_size_flat.

    x = tf.placeholder(tf.float32, shape=[None, img_size_flat], name='x')

    The convolutional layers expect x to be encoded as a 4-dimensional tensor, so we have to reshape it to [num_images, img_height, img_width, num_channels]. Note that img_height == img_width == img_size, and num_images is inferred automatically if the first dimension is set to -1. The reshape operation is:

    x_image = tf.reshape(x, [-1, img_size, img_size, num_channels])

    Next we define the placeholder variable for the true labels of the images in the input variable x. The shape of this variable is [None, num_classes], meaning it may hold an arbitrary number of labels, each a vector of length num_classes, which is 10 in this case.

    y_true = tf.placeholder(tf.float32, shape=[None, num_classes], name='y_true')

    We could also have a placeholder variable for the class number, but we compute it with argmax instead. Note that these are just TensorFlow operators; nothing is calculated at this point.

    y_true_cls = tf.argmax(y_true, dimension=1)

    Adversarial Noise

    The pixel values of the input images are between 0.0 and 1.0. The adversarial noise is a value that is added to or subtracted from the pixels of the input image.

    The bound on the adversarial noise is set to 0.35, so the noise lies between ±0.35.

    noise_limit = 0.35

    The optimizer for the adversarial noise will try to minimize two loss measures: (1) the network's ordinary loss measure, so we find the noise that gives the highest classification accuracy for the target class; and (2) the L2-loss measure, which keeps the noise as low as possible.

    The following weight determines how important the L2-loss is compared with the ordinary loss measure. An L2 weight close to zero usually performs better.

    noise_l2_weight = 0.02

    When we create the variable for the noise, we must tell TensorFlow which variable collections it belongs to, so that we can later tell the two optimizers which variables to update.

    First we define a name for the variable collection; it is just a string.

    ADVERSARY_VARIABLES = 'adversary_variables'

    Next we create the list of collections that the noise variable belongs to. If we add the noise variable to the collection tf.GraphKeys.VARIABLES, it will be initialized together with the other variables in the TensorFlow graph, but it will not be optimized. This is a bit confusing.

    collections = [tf.GraphKeys.VARIABLES, ADVERSARY_VARIABLES]

    Now we can create the new variable for the adversarial noise. It is initialized to zero, and it is not trainable, so it will not be optimized together with the network's other variables. This lets us create two separate optimization procedures.

    x_noise = tf.Variable(tf.zeros([img_size, img_size, num_channels]),
                          name='x_noise', trainable=False,
                          collections=collections)

    The adversarial noise will be clipped/limited to the noise bound we set above. Note that this is not calculated inside the computational graph at this point; it will be executed after the optimization step, see further below.

    x_noise_clip = tf.assign(x_noise, tf.clip_by_value(x_noise,
                                                       -noise_limit,
                                                       noise_limit))

    The noisy image is simply the sum of the input image and the adversarial noise.

    x_noisy_image = x_image + x_noise

    When adding the noise to the input image, the result may overflow the bounds of a valid image, so we clip/limit the noisy image to ensure its pixel values are between 0 and 1.

    x_noisy_image = tf.clip_by_value(x_noisy_image, 0.0, 1.0)

    Convolutional Neural Network

    We will use PrettyTensor to construct the convolutional neural network. First we need to wrap the tensor of the noisy image in a PrettyTensor object, which provides functions that construct the neural network.

    x_pretty = pt.wrap(x_noisy_image)

    Having wrapped the input image in a PrettyTensor object, we can add the convolutional and fully-connected layers in just a few lines of code.

    with pt.defaults_scope(activation_fn=tf.nn.relu):
        y_pred, loss = x_pretty.\
            conv2d(kernel=5, depth=16, name='layer_conv1').\
            max_pool(kernel=2, stride=2).\
            conv2d(kernel=5, depth=36, name='layer_conv2').\
            max_pool(kernel=2, stride=2).\
            flatten().\
            fully_connected(size=128, name='layer_fc1').\
            softmax_classifier(num_classes=num_classes, labels=y_true)

    Note that in the with block, pt.defaults_scope(activation_fn=tf.nn.relu) supplies activation_fn=tf.nn.relu as an argument to each of the layers, so they all use Rectified Linear Units (ReLU). defaults_scope makes it convenient to change arguments for all of the layers.

    Optimizer for Ordinary Training

    This is the list of the neural network's variables that will be trained during the ordinary optimization procedure. Note that 'x_noise:0' is not in the list, so this procedure does not optimize the adversarial noise.

    [var.name for var in tf.trainable_variables()]

    ['layer_conv1/weights:0',
    'layer_conv1/bias:0',
    'layer_conv2/weights:0',
    'layer_conv2/bias:0',
    'layer_fc1/weights:0',
    'layer_fc1/bias:0',
    'fully_connected/weights:0',
    'fully_connected/bias:0']

    The optimization of these variables is done with the Adam optimizer, using the loss measure returned by the PrettyTensor-built network above.

    No optimization is performed at this point; in fact nothing is calculated at all. We merely add the optimizer object to the TensorFlow graph for later execution.

    optimizer = tf.train.AdamOptimizer(learning_rate=1e-4).minimize(loss)

    Optimizer for the Adversarial Noise

    Get the list of variables to be optimized in the second procedure, for the adversarial noise.

    adversary_variables = tf.get_collection(ADVERSARY_VARIABLES)

    Show the list of variable names. It holds a single element: the adversarial noise variable created above.

    [var.name for var in adversary_variables]

    ['x_noise:0']

    We will combine the loss function of the ordinary optimization with the so-called L2-loss, which yields the smallest adversarial noise with the best classification accuracy.

    The L2-loss is scaled by a weight that is typically set close to zero.

    l2_loss_noise = noise_l2_weight * tf.nn.l2_loss(x_noise)

    Combine the normal loss function with the L2-loss of the adversarial noise.

    loss_adversary = loss + l2_loss_noise

    Now we can create the optimizer for the adversarial noise. Since this optimizer is not supposed to update all the variables of the neural network, we must give it a list of the variables to update, i.e., the adversarial noise variable. Note that the learning rate here is much larger than for the ordinary optimizer above.

    optimizer_adversary = tf.train.AdamOptimizer(learning_rate=1e-2).minimize(loss_adversary, var_list=adversary_variables)

    We have now created two optimizers for the neural network: one for the network's variables and one for the single variable of the adversarial noise.

    Performance Measures

    We need a few more operations in the TensorFlow graph to show the progress to the user during optimization.

    First we compute the predicted class number from the network's output y_pred, which is a vector of 10 elements. The class number is the index of the largest element.

    y_pred_cls = tf.argmax(y_pred, dimension=1)

    Next we create a boolean array telling whether the predicted class of each image equals its true class.

    correct_prediction = tf.equal(y_pred_cls, y_true_cls)

    The classification accuracy is calculated by first type-casting the boolean vector to floats, so that False becomes 0 and True becomes 1, and then taking the average of these numbers.

    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

    Run TensorFlow

    Create the TensorFlow session

    Once the TensorFlow graph has been created, we need a TensorFlow session to execute the graph.

    session = tf.Session()

    Initialize variables

    The variables for weights and biases must be initialized before we start optimizing them.

    session.run(tf.global_variables_initializer())

    Helper function to initialize/reset the adversarial noise to zero.

    def init_noise():
        session.run(tf.variables_initializer([x_noise]))

    Call the function to initialize the adversarial noise.

    init_noise()

    Helper function to perform optimization iterations

    There are 55,000 images in the training set. Computing the model's gradient with all these images would take a long time, so we use only a small batch of images in each iteration of the optimizer.

    If your computer crashes or becomes very slow because it runs out of RAM, try lowering this number, but you may then also need to perform more optimization iterations.

    train_batch_size = 64

    The function below performs a number of optimization iterations to gradually improve the network's variables. In each iteration a new batch of data is selected from the training set, and TensorFlow executes the optimizer on those training samples. Progress is printed every 100 iterations.

    This function is similar to the ones in the previous tutorials, except that it now takes an adversarial target-class argument. When this target class is set to an integer, it replaces the true class numbers of the training batch. The adversarial optimizer is used instead of the ordinary one, and after each optimization step the noise is clipped/limited to the allowed range. This optimizes the adversarial noise and ignores the network's other variables.

    def optimize(num_iterations, adversary_target_cls=None):
        # Start-time used for printing time-usage below.
        start_time = time.time()
    
        for i in range(num_iterations):
    
            # Get a batch of training examples.
            # x_batch now holds a batch of images and
            # y_true_batch are the true labels for those images.
            x_batch, y_true_batch = data.train.next_batch(train_batch_size)
    
            # If we are searching for the adversarial noise, then
            # use the adversarial target-class instead.
            if adversary_target_cls is not None:
                # The class-labels are One-Hot encoded.
    
                # Set all the class-labels to zero.
                y_true_batch = np.zeros_like(y_true_batch)
    
                # Set the element for the adversarial target-class to 1.
                y_true_batch[:, adversary_target_cls] = 1.0
    
            # Put the batch into a dict with the proper names
            # for placeholder variables in the TensorFlow graph.
            feed_dict_train = {x: x_batch,
                               y_true: y_true_batch}
    
            # If doing normal optimization of the neural network.
            if adversary_target_cls is None:
                # Run the optimizer using this batch of training data.
                # TensorFlow assigns the variables in feed_dict_train
                # to the placeholder variables and then runs the optimizer.
                session.run(optimizer, feed_dict=feed_dict_train)
            else:
                # Run the adversarial optimizer instead.
                # Note that we have 'faked' the class above to be
                # the adversarial target-class instead of the true class.
                session.run(optimizer_adversary, feed_dict=feed_dict_train)
    
                # Clip / limit the adversarial noise. This executes
                # another TensorFlow operation. It cannot be executed
                # in the same session.run() as the optimizer, because
                # it may run in parallel so the execution order is not
                # guaranteed. We need the clip to run after the optimizer.
                session.run(x_noise_clip)
    
            # Print status every 100 iterations.
            if (i % 100 == 0) or (i == num_iterations - 1):
                # Calculate the accuracy on the training-set.
                acc = session.run(accuracy, feed_dict=feed_dict_train)
    
                # Message for printing.
                msg = "Optimization Iteration: {0:>6}, Training Accuracy: {1:>6.1%}"
    
                # Print it.
                print(msg.format(i, acc))
    
        # Ending time.
        end_time = time.time()
    
        # Difference between start and end-times.
        time_dif = end_time - start_time
    
        # Print the time-usage.
        print("Time usage: " + str(timedelta(seconds=int(round(time_dif)))))复制代码

    Helper functions for getting and plotting the noise

    This function retrieves the adversarial noise from inside the TensorFlow graph.

    def get_noise():
        # Run the TensorFlow session to retrieve the contents of
        # the x_noise variable inside the graph.
        noise = session.run(x_noise)
    
        return np.squeeze(noise)

    This function plots the adversarial noise and prints some statistics.

    def plot_noise():
        # Get the adversarial noise from inside the TensorFlow graph.
        noise = get_noise()
    
        # Print statistics.
        print("Noise:")
        print("- Min:", noise.min())
        print("- Max:", noise.max())
        print("- Std:", noise.std())
    
        # Plot the noise.
        plt.imshow(noise, interpolation='nearest', cmap='seismic',
                   vmin=-1.0, vmax=1.0)

    Helper function for plotting example errors

    This function plots examples from the test set that have been misclassified.

    def plot_example_errors(cls_pred, correct):
        # This function is called from print_test_accuracy() below.
    
        # cls_pred is an array of the predicted class-number for
        # all images in the test-set.
    
        # correct is a boolean array whether the predicted class
        # is equal to the true class for each image in the test-set.
    
        # Negate the boolean array.
        incorrect = (correct == False)
    
        # Get the images from the test-set that have been
        # incorrectly classified.
        images = data.test.images[incorrect]
    
        # Get the predicted classes for those images.
        cls_pred = cls_pred[incorrect]
    
        # Get the true classes for those images.
        cls_true = data.test.cls[incorrect]
    
        # Get the adversarial noise from inside the TensorFlow graph.
        noise = get_noise()
    
        # Plot the first 9 images.
        plot_images(images=images[0:9],
                    cls_true=cls_true[0:9],
                    cls_pred=cls_pred[0:9],
                    noise=noise)

    Helper function for plotting the confusion matrix

    def plot_confusion_matrix(cls_pred):
        # This is called from print_test_accuracy() below.
    
        # cls_pred is an array of the predicted class-number for
        # all images in the test-set.
    
        # Get the true classifications for the test-set.
        cls_true = data.test.cls
    
        # Get the confusion matrix using sklearn.
        cm = confusion_matrix(y_true=cls_true,
                              y_pred=cls_pred)
    
        # Print the confusion matrix as text.
        print(cm)

    Helper function for showing performance

    This function prints the classification accuracy on the test set.

    It takes a while to compute the classifications for all the images in the test set; that is why the results are reused by calling this function directly from the functions above, so they do not have to be recalculated each time.

    This function can use a lot of computer memory, which is why the test set is split into smaller batches. If you have little RAM in your computer and it crashes, try lowering the batch-size.

    # Split the test-set into smaller batches of this size.
    test_batch_size = 256
    
    def print_test_accuracy(show_example_errors=False,
                            show_confusion_matrix=False):
    
        # Number of images in the test-set.
        num_test = len(data.test.images)
    
        # Allocate an array for the predicted classes which
        # will be calculated in batches and filled into this array.
        cls_pred = np.zeros(shape=num_test, dtype=np.int)
    
        # Now calculate the predicted classes for the batches.
        # We will just iterate through all the batches.
        # There might be a more clever and Pythonic way of doing this.
    
        # The starting index for the next batch is denoted i.
        i = 0
    
        while i < num_test:
            # The ending index for the next batch is denoted j.
            j = min(i + test_batch_size, num_test)
    
            # Get the images from the test-set between index i and j.
            images = data.test.images[i:j, :]
    
            # Get the associated labels.
            labels = data.test.labels[i:j, :]
    
            # Create a feed-dict with these images and labels.
            feed_dict = {x: images,
                         y_true: labels}
    
            # Calculate the predicted class using TensorFlow.
            cls_pred[i:j] = session.run(y_pred_cls, feed_dict=feed_dict)
    
            # Set the start-index for the next batch to the
            # end-index of the current batch.
            i = j
    
        # Convenience variable for the true class-numbers of the test-set.
        cls_true = data.test.cls
    
        # Create a boolean array whether each image is correctly classified.
        correct = (cls_true == cls_pred)
    
        # Calculate the number of correctly classified images.
        # When summing a boolean array, False means 0 and True means 1.
        correct_sum = correct.sum()
    
        # Classification accuracy is the number of correctly classified
        # images divided by the total number of images in the test-set.
        acc = float(correct_sum) / num_test
    
        # Print the accuracy.
        msg = "Accuracy on Test-Set: {0:.1%} ({1} / {2})"
        print(msg.format(acc, correct_sum, num_test))
    
        # Plot some examples of mis-classifications, if desired.
        if show_example_errors:
            print("Example errors:")
            plot_example_errors(cls_pred=cls_pred, correct=correct)
    
        # Plot the confusion matrix, if desired.
        if show_confusion_matrix:
            print("Confusion Matrix:")
            plot_confusion_matrix(cls_pred=cls_pred)

    Ordinary optimization of the neural network

    At this point the adversarial noise has no effect, because above it was merely initialized to zero and it is not updated during this optimization.

    optimize(num_iterations=1000)

    Optimization Iteration: 0, Training Accuracy: 12.5%
    Optimization Iteration: 100, Training Accuracy: 90.6%
    Optimization Iteration: 200, Training Accuracy: 84.4%
    Optimization Iteration: 300, Training Accuracy: 84.4%
    Optimization Iteration: 400, Training Accuracy: 89.1%
    Optimization Iteration: 500, Training Accuracy: 87.5%
    Optimization Iteration: 600, Training Accuracy: 93.8%
    Optimization Iteration: 700, Training Accuracy: 93.8%
    Optimization Iteration: 800, Training Accuracy: 93.8%
    Optimization Iteration: 900, Training Accuracy: 96.9%
    Optimization Iteration: 999, Training Accuracy: 92.2%
    Time usage: 0:00:03

    The classification accuracy on the test set is about 96-97%. (The results vary a little each time the Python Notebook is run.)

    print_test_accuracy(show_example_errors=True)

    Accuracy on Test-Set: 96.3% (9633 / 10000)
    Example errors:

    Find the adversarial noise

    Before we start optimizing the adversarial noise, we first initialize it to zero. This was already done above, but it is done again here in case you re-run the code with another target class.

    init_noise()

    Now run the optimization of the adversarial noise. The adversarial optimizer is used instead of the ordinary one, which means it optimizes only the adversarial noise variable while ignoring the network's other variables.

    optimize(num_iterations=1000, adversary_target_cls=3)

    Optimization Iteration: 0, Training Accuracy: 6.2%
    Optimization Iteration: 100, Training Accuracy: 93.8%
    Optimization Iteration: 200, Training Accuracy: 96.9%
    Optimization Iteration: 300, Training Accuracy: 98.4%
    Optimization Iteration: 400, Training Accuracy: 95.3%
    Optimization Iteration: 500, Training Accuracy: 96.9%
    Optimization Iteration: 600, Training Accuracy: 100.0%
    Optimization Iteration: 700, Training Accuracy: 98.4%
    Optimization Iteration: 800, Training Accuracy: 95.3%
    Optimization Iteration: 900, Training Accuracy: 93.8%
    Optimization Iteration: 999, Training Accuracy: 100.0%
    Time usage: 0:00:03

    The adversarial noise has now been optimized, and it can be shown in an image. Red pixels show positive noise values and blue pixels show negative noise values. This noise pattern will be added to every input image. The positive (red) noise values make the pixels darker and the negative (blue) noise values make the pixels lighter, as shown below.

    plot_noise()

    Noise:

    - Min: -0.35
    - Max: 0.35
    - Std: 0.195455

    When this noise is added to all the images in the test set, the classification accuracy is typically between 10-15%, depending on the chosen target class. We can also see from the confusion matrix that most of the test-set images are classified as the desired target class, although some target classes require more adversarial noise than others.

    So we have found adversarial noise that makes the neural network misclassify almost all test-set images as the desired target class.

    We can also plot some examples of misclassified images with the adversarial noise added. The noise is clearly visible, yet the human eye can still easily recognize the digits.

    print_test_accuracy(show_example_errors=True,
                        show_confusion_matrix=True)

    Accuracy on Test-Set: 13.2% (1323 / 10000)
    Example errors:

    Confusion Matrix:
    [[ 85 0 0 895 0 0 0 0 0 0]
    [ 0 0 0 1135 0 0 0 0 0 0]
    [ 0 0 46 986 0 0 0 0 0 0]
    [ 0 0 0 1010 0 0 0 0 0 0]
    [ 0 0 0 959 20 0 0 0 3 0]
    [ 0 0 0 847 0 45 0 0 0 0]
    [ 0 0 0 914 0 1 42 0 1 0]
    [ 0 0 0 977 0 0 0 51 0 0]
    [ 0 0 0 952 0 0 0 0 22 0]
    [ 0 0 1 1006 0 0 0 0 0 2]]

    Adversarial noise for all target classes

    This is a helper function for finding the adversarial noise for all target classes. It loops over the class numbers 0 to 9, runs the optimization above, and saves the results in an array.

    def find_all_noise(num_iterations=1000):
        # Adversarial noise for all target-classes.
        all_noise = []
    
        # For each target-class.
        for i in range(num_classes):
            print("Finding adversarial noise for target-class:", i)
    
            # Reset the adversarial noise to zero.
            init_noise()
    
            # Optimize the adversarial noise.
            optimize(num_iterations=num_iterations,
                     adversary_target_cls=i)
    
            # Get the adversarial noise from inside the TensorFlow graph.
            noise = get_noise()
    
            # Append the noise to the array.
            all_noise.append(noise)
    
            # Print newline.
            print()
    
        return all_noise
    all_noise = find_all_noise(num_iterations=300)

    Finding adversarial noise for target-class: 0
    Optimization Iteration: 0, Training Accuracy: 9.4%
    Optimization Iteration: 100, Training Accuracy: 90.6%
    Optimization Iteration: 200, Training Accuracy: 92.2%
    Optimization Iteration: 299, Training Accuracy: 93.8%
    Time usage: 0:00:01

    Finding adversarial noise for target-class: 1
    Optimization Iteration: 0, Training Accuracy: 7.8%
    Optimization Iteration: 100, Training Accuracy: 62.5%
    Optimization Iteration: 200, Training Accuracy: 62.5%
    Optimization Iteration: 299, Training Accuracy: 75.0%
    Time usage: 0:00:01

    Finding adversarial noise for target-class: 2
    Optimization Iteration: 0, Training Accuracy: 7.8%
    Optimization Iteration: 100, Training Accuracy: 93.8%
    Optimization Iteration: 200, Training Accuracy: 95.3%
    Optimization Iteration: 299, Training Accuracy: 96.9%
    Time usage: 0:00:01

    Finding adversarial noise for target-class: 3
    Optimization Iteration: 0, Training Accuracy: 6.2%
    Optimization Iteration: 100, Training Accuracy: 98.4%
    Optimization Iteration: 200, Training Accuracy: 96.9%
    Optimization Iteration: 299, Training Accuracy: 98.4%
    Time usage: 0:00:01

    Finding adversarial noise for target-class: 4
    Optimization Iteration: 0, Training Accuracy: 12.5%
    Optimization Iteration: 100, Training Accuracy: 81.2%
    Optimization Iteration: 200, Training Accuracy: 82.8%
    Optimization Iteration: 299, Training Accuracy: 82.8%
    Time usage: 0:00:01

    Finding adversarial noise for target-class: 5
    Optimization Iteration: 0, Training Accuracy: 7.8%
    Optimization Iteration: 100, Training Accuracy: 96.9%
    Optimization Iteration: 200, Training Accuracy: 96.9%
    Optimization Iteration: 299, Training Accuracy: 98.4%
    Time usage: 0:00:01

    Finding adversarial noise for target-class: 6
    Optimization Iteration: 0, Training Accuracy: 6.2%
    Optimization Iteration: 100, Training Accuracy: 93.8%
    Optimization Iteration: 200, Training Accuracy: 92.2%
    Optimization Iteration: 299, Training Accuracy: 96.9%
    Time usage: 0:00:01

    Finding adversarial noise for target-class: 7
    Optimization Iteration: 0, Training Accuracy: 12.5%
    Optimization Iteration: 100, Training Accuracy: 98.4%
    Optimization Iteration: 200, Training Accuracy: 93.8%
    Optimization Iteration: 299, Training Accuracy: 92.2%
    Time usage: 0:00:01

    Finding adversarial noise for target-class: 8
    Optimization Iteration: 0, Training Accuracy: 4.7%
    Optimization Iteration: 100, Training Accuracy: 96.9%
    Optimization Iteration: 200, Training Accuracy: 93.8%
    Optimization Iteration: 299, Training Accuracy: 96.9%
    Time usage: 0:00:01

    Finding adversarial noise for target-class: 9
    Optimization Iteration: 0, Training Accuracy: 7.8%
    Optimization Iteration: 100, Training Accuracy: 84.4%
    Optimization Iteration: 200, Training Accuracy: 87.5%
    Optimization Iteration: 299, Training Accuracy: 90.6%
    Time usage: 0:00:01

    Plot the adversarial noise for all target classes

    This helper function plots the adversarial noise for all target classes (0 to 9) in a grid.

    def plot_all_noise(all_noise):    
        # Create figure with 10 sub-plots.
        fig, axes = plt.subplots(2, 5)
        fig.subplots_adjust(hspace=0.2, wspace=0.1)
    
        # For each sub-plot.
        for i, ax in enumerate(axes.flat):
            # Get the adversarial noise for the i'th target-class.
            noise = all_noise[i]
    
            # Plot the noise.
            ax.imshow(noise,
                      cmap='seismic', interpolation='nearest',
                      vmin=-1.0, vmax=1.0)
    
            # Show the classes as the label on the x-axis.
            ax.set_xlabel(i)
    
            # Remove ticks from the plot.
            ax.set_xticks([])
            ax.set_yticks([])
    
        # Ensure the plot is shown correctly with multiple plots
        # in a single Notebook cell.
        plt.show()
    plot_all_noise(all_noise)

    Red pixels show positive noise values, and blue pixels show negative noise values.

    In some of these noise images you can see traces of the digits. For example, the noise for target class 0 shows a red circle surrounded by blue. This means noise in the shape of a circle is added to the image while the other pixels are suppressed, which is enough to make most MNIST images be misclassified as 0. Another example is the noise for 3, whose red pixels also show a trace of the digit 3. The noise for the other classes is less obvious.

    Immunity to adversarial noise

    We now try to make the neural network immune to adversarial noise. We retrain the network so that it ignores the adversarial noise; this process can be repeated a number of times.

    Helper function for making the neural network immune to adversarial noise

    This is a helper function for making the neural network immune to adversarial noise. It first runs the optimization to find the adversarial noise, and then runs the ordinary optimization to make the network immune to that noise.

    def make_immune(target_cls, num_iterations_adversary=500,
                    num_iterations_immune=200):
    
        print("Target-class:", target_cls)
        print("Finding adversarial noise ...")
    
        # Find the adversarial noise.
        optimize(num_iterations=num_iterations_adversary,
                 adversary_target_cls=target_cls)
    
        # Newline.
        print()
    
        # Print classification accuracy.
        print_test_accuracy(show_example_errors=False,
                            show_confusion_matrix=False)
    
        # Newline.
        print()
    
        print("Making the neural network immune to the noise ...")
    
        # Try and make the neural network immune to this noise.
        # Note that the adversarial noise has not been reset to zero
        # so the x_noise variable still holds the noise.
        # So we are training the neural network to ignore the noise.
        optimize(num_iterations=num_iterations_immune)
    
        # Newline.
        print()
    
        # Print classification accuracy.
        print_test_accuracy(show_example_errors=False,
                            show_confusion_matrix=False)

    Immunity to noise for target class 3

    First we try to make the neural network immune to the adversarial noise for target class 3.

    We first find the adversarial noise that causes the neural network to misclassify most of the test-set images. Then we run the ordinary optimization, which fine-tunes the variables to ignore that noise, bringing the classification accuracy back up to 95-97%.

    make_immune(target_cls=3)

    Target-class: 3
    Finding adversarial noise ...
    Optimization Iteration: 0, Training Accuracy: 3.1%
    Optimization Iteration: 100, Training Accuracy: 93.8%
    Optimization Iteration: 200, Training Accuracy: 93.8%
    Optimization Iteration: 300, Training Accuracy: 96.9%
    Optimization Iteration: 400, Training Accuracy: 96.9%
    Optimization Iteration: 499, Training Accuracy: 96.9%
    Time usage: 0:00:02

    Accuracy on Test-Set: 14.4% (1443 / 10000)

    Making the neural network immune to the noise ...
    Optimization Iteration: 0, Training Accuracy: 42.2%
    Optimization Iteration: 100, Training Accuracy: 90.6%
    Optimization Iteration: 199, Training Accuracy: 89.1%
    Time usage: 0:00:01

    Accuracy on Test-Set: 95.3% (9529 / 10000)

    Now try running it again. It is now harder to find adversarial noise for target class 3. The neural network seems to have become somewhat immune to adversarial noise.

    make_immune(target_cls=3)

    Target-class: 3
    Finding adversarial noise ...
    Optimization Iteration: 0, Training Accuracy: 7.8%
    Optimization Iteration: 100, Training Accuracy: 32.8%
    Optimization Iteration: 200, Training Accuracy: 32.8%
    Optimization Iteration: 300, Training Accuracy: 29.7%
    Optimization Iteration: 400, Training Accuracy: 34.4%
    Optimization Iteration: 499, Training Accuracy: 26.6%
    Time usage: 0:00:02

    Accuracy on Test-Set: 72.1% (7207 / 10000)

    Making the neural network immune to the noise ...
    Optimization Iteration: 0, Training Accuracy: 75.0%
    Optimization Iteration: 100, Training Accuracy: 93.8%
    Optimization Iteration: 199, Training Accuracy: 92.2%
    Time usage: 0:00:01

    Accuracy on Test-Set: 95.2% (9519 / 10000)

    Immunity to noise for all target classes

    Now try to make the neural network immune to the noise for all target classes. Unfortunately, this does not seem to work so well.

    for i in range(10):
        make_immune(target_cls=i)
    
        # Print newline.
        print()

    Target-class: 0
    Finding adversarial noise ...
    Optimization Iteration: 0, Training Accuracy: 4.7%
    Optimization Iteration: 100, Training Accuracy: 73.4%
    Optimization Iteration: 200, Training Accuracy: 75.0%
    Optimization Iteration: 300, Training Accuracy: 85.9%
    Optimization Iteration: 400, Training Accuracy: 81.2%
    Optimization Iteration: 499, Training Accuracy: 90.6%
    Time usage: 0:00:02

    Accuracy on Test-Set: 23.3% (2326 / 10000)

    Making the neural network immune to the noise ...
    Optimization Iteration: 0, Training Accuracy: 34.4%
    Optimization Iteration: 100, Training Accuracy: 95.3%
    Optimization Iteration: 199, Training Accuracy: 95.3%
    Time usage: 0:00:01

    Accuracy on Test-Set: 95.6% (9559 / 10000)

    Target-class: 1
    Finding adversarial noise ...
    Optimization Iteration: 0, Training Accuracy: 12.5%
    Optimization Iteration: 100, Training Accuracy: 57.8%
    Optimization Iteration: 200, Training Accuracy: 62.5%
    Optimization Iteration: 300, Training Accuracy: 62.5%
    Optimization Iteration: 400, Training Accuracy: 67.2%
    Optimization Iteration: 499, Training Accuracy: 67.2%
    Time usage: 0:00:02

    Accuracy on Test-Set: 42.2% (4218 / 10000)

    Making the neural network immune to the noise ...
    Optimization Iteration: 0, Training Accuracy: 59.4%
    Optimization Iteration: 100, Training Accuracy: 93.8%
    Optimization Iteration: 199, Training Accuracy: 95.3%
    Time usage: 0:00:01

    Accuracy on Test-Set: 95.5% (9555 / 10000)

    Target-class: 2
    Finding adversarial noise ...
    Optimization Iteration: 0, Training Accuracy: 6.2%
    Optimization Iteration: 100, Training Accuracy: 43.8%
    Optimization Iteration: 200, Training Accuracy: 57.8%
    Optimization Iteration: 300, Training Accuracy: 70.3%
    Optimization Iteration: 400, Training Accuracy: 68.8%
    Optimization Iteration: 499, Training Accuracy: 71.9%
    Time usage: 0:00:02

    Accuracy on Test-Set: 46.4% (4639 / 10000)

    Making the neural network immune to the noise ...
    Optimization Iteration: 0, Training Accuracy: 59.4%
    Optimization Iteration: 100, Training Accuracy: 96.9%
    Optimization Iteration: 199, Training Accuracy: 92.2%
    Time usage: 0:00:01

    Accuracy on Test-Set: 95.5% (9545 / 10000)

    Target-class: 3
    Finding adversarial noise ...
    Optimization Iteration: 0, Training Accuracy: 6.2%
    Optimization Iteration: 100, Training Accuracy: 48.4%
    Optimization Iteration: 200, Training Accuracy: 46.9%
    Optimization Iteration: 300, Training Accuracy: 53.1%
    Optimization Iteration: 400, Training Accuracy: 50.0%
    Optimization Iteration: 499, Training Accuracy: 48.4%
    Time usage: 0:00:02

    Accuracy on Test-Set: 56.5% (5648 / 10000)

    Making the neural network immune to the noise ...
    Optimization Iteration: 0, Training Accuracy: 54.7%
    Optimization Iteration: 100, Training Accuracy: 93.8%
    Optimization Iteration: 199, Training Accuracy: 96.9%
    Time usage: 0:00:01

    Accuracy on Test-Set: 95.8% (9581 / 10000)

    Target-class: 4
    Finding adversarial noise ...
    Optimization Iteration: 0, Training Accuracy: 9.4%
    Optimization Iteration: 100, Training Accuracy: 85.9%
    Optimization Iteration: 200, Training Accuracy: 85.9%
    Optimization Iteration: 300, Training Accuracy: 87.5%
    Optimization Iteration: 400, Training Accuracy: 95.3%
    Optimization Iteration: 499, Training Accuracy: 92.2%
    Time usage: 0:00:02

    Accuracy on Test-Set: 15.6% (1557 / 10000)

    Making the neural network immune to the noise ...
    Optimization Iteration: 0, Training Accuracy: 18.8%
    Optimization Iteration: 100, Training Accuracy: 95.3%
    Optimization Iteration: 199, Training Accuracy: 96.9%
    Time usage: 0:00:01

    Accuracy on Test-Set: 95.6% (9557 / 10000)

    Target-class: 5
    Finding adversarial noise ...
    Optimization Iteration: 0, Training Accuracy: 18.8%
    Optimization Iteration: 100, Training Accuracy: 71.9%
    Optimization Iteration: 200, Training Accuracy: 90.6%
    Optimization Iteration: 300, Training Accuracy: 95.3%
    Optimization Iteration: 400, Training Accuracy: 89.1%
    Optimization Iteration: 499, Training Accuracy: 92.2%
    Time usage: 0:00:02

    Accuracy on Test-Set: 17.4% (1745 / 10000)

    Making the neural network immune to the noise ...
    Optimization Iteration: 0, Training Accuracy: 15.6%
    Optimization Iteration: 100, Training Accuracy: 96.9%
    Optimization Iteration: 199, Training Accuracy: 95.3%
    Time usage: 0:00:01

    Accuracy on Test-Set: 96.0% (9601 / 10000)

    Target-class: 6
    Finding adversarial noise ...
    Optimization Iteration: 0, Training Accuracy: 10.9%
    Optimization Iteration: 100, Training Accuracy: 81.2%
    Optimization Iteration: 200, Training Accuracy: 93.8%
    Optimization Iteration: 300, Training Accuracy: 92.2%
    Optimization Iteration: 400, Training Accuracy: 89.1%
    Optimization Iteration: 499, Training Accuracy: 92.2%
    Time usage: 0:00:02

    Accuracy on Test-Set: 17.6% (1762 / 10000)

    Making the neural network immune to the noise ...
    Optimization Iteration: 0, Training Accuracy: 20.3%
    Optimization Iteration: 100, Training Accuracy: 93.8%
    Optimization Iteration: 199, Training Accuracy: 95.3%
    Time usage: 0:00:01

    Accuracy on Test-Set: 95.7% (9570 / 10000)

    Target-class: 7
    Finding adversarial noise ...
    Optimization Iteration: 0, Training Accuracy: 14.1%
    Optimization Iteration: 100, Training Accuracy: 93.8%
    Optimization Iteration: 200, Training Accuracy: 98.4%
    Optimization Iteration: 300, Training Accuracy: 100.0%
    Optimization Iteration: 400, Training Accuracy: 96.9%
    Optimization Iteration: 499, Training Accuracy: 100.0%
    Time usage: 0:00:02

    Accuracy on Test-Set: 12.8% (1281 / 10000)

    Making the neural network immune to the noise ...
    Optimization Iteration: 0, Training Accuracy: 12.5%
    Optimization Iteration: 100, Training Accuracy: 98.4%
    Optimization Iteration: 199, Training Accuracy: 98.4%
    Time usage: 0:00:01

    Accuracy on Test-Set: 95.9% (9587 / 10000)

    Target-class: 8
    Finding adversarial noise ...
    Optimization Iteration: 0, Training Accuracy: 4.7%
    Optimization Iteration: 100, Training Accuracy: 64.1%
    Optimization Iteration: 200, Training Accuracy: 81.2%
    Optimization Iteration: 300, Training Accuracy: 71.9%
    Optimization Iteration: 400, Training Accuracy: 78.1%
    Optimization Iteration: 499, Training Accuracy: 84.4%
    Time usage: 0:00:02

    Accuracy on Test-Set: 24.9% (2493 / 10000)

    Making the neural network immune to the noise ...
    Optimization Iteration: 0, Training Accuracy: 25.0%
    Optimization Iteration: 100, Training Accuracy: 95.3%
    Optimization Iteration: 199, Training Accuracy: 96.9%
    Time usage: 0:00:01

    Accuracy on Test-Set: 96.0% (9601 / 10000)

    Target-class: 9
    Finding adversarial noise ...
    Optimization Iteration: 0, Training Accuracy: 9.4%
    Optimization Iteration: 100, Training Accuracy: 48.4%
    Optimization Iteration: 200, Training Accuracy: 50.0%
    Optimization Iteration: 300, Training Accuracy: 53.1%
    Optimization Iteration: 400, Training Accuracy: 64.1%
    Optimization Iteration: 499, Training Accuracy: 65.6%
    Time usage: 0:00:02

    Accuracy on Test-Set: 45.5% (4546 / 10000)

    Making the neural network immune to the noise ...
    Optimization Iteration: 0, Training Accuracy: 51.6%
    Optimization Iteration: 100, Training Accuracy: 95.3%
    Optimization Iteration: 199, Training Accuracy: 95.3%
    Time usage: 0:00:01

    Accuracy on Test-Set: 96.2% (9615 / 10000)

    Immunity to all target classes (run twice)

    Now let us try running the procedure twice, making the neural network immune to the noise for all target classes. Unfortunately, the results are still not very good.

    Making the neural network immune to one adversarial target class seems to make it lose its immunity to another target class.

    for i in range(10):
        make_immune(target_cls=i)

        # Print newline.
        print()

        make_immune(target_cls=i)

        # Print newline.
        print()
    Target-class: 0
    Finding adversarial noise ...
    Optimization Iteration:      0, Training Accuracy:   7.8%
    Optimization Iteration:    100, Training Accuracy:  53.1%
    Optimization Iteration:    200, Training Accuracy:  73.4%
    Optimization Iteration:    300, Training Accuracy:  79.7%
    Optimization Iteration:    400, Training Accuracy:  84.4%
    Optimization Iteration:    499, Training Accuracy:  95.3%
    Time usage: 0:00:02

    Accuracy on Test-Set: 29.2% (2921 / 10000)

    Making the neural network immune to the noise ...
    Optimization Iteration: 0, Training Accuracy: 29.7%
    Optimization Iteration: 100, Training Accuracy: 96.9%
    Optimization Iteration: 199, Training Accuracy: 95.3%
    Time usage: 0:00:01

    Accuracy on Test-Set: 96.2% (9619 / 10000)

    Target-class: 0
    Finding adversarial noise ...
    Optimization Iteration: 0, Training Accuracy: 1.6%
    Optimization Iteration: 100, Training Accuracy: 12.5%
    Optimization Iteration: 200, Training Accuracy: 7.8%
    Optimization Iteration: 300, Training Accuracy: 18.8%
    Optimization Iteration: 400, Training Accuracy: 9.4%
    Optimization Iteration: 499, Training Accuracy: 9.4%
    Time usage: 0:00:02

    Accuracy on Test-Set: 94.4% (9437 / 10000)

    Making the neural network immune to the noise ...
    Optimization Iteration: 0, Training Accuracy: 89.1%
    Optimization Iteration: 100, Training Accuracy: 98.4%
    Optimization Iteration: 199, Training Accuracy: 93.8%
    Time usage: 0:00:01

    Accuracy on Test-Set: 96.4% (9635 / 10000)

    Target-class: 1
    Finding adversarial noise ...
    Optimization Iteration: 0, Training Accuracy: 7.8%
    Optimization Iteration: 100, Training Accuracy: 42.2%
    Optimization Iteration: 200, Training Accuracy: 60.9%
    Optimization Iteration: 300, Training Accuracy: 75.0%
    Optimization Iteration: 400, Training Accuracy: 70.3%
    Optimization Iteration: 499, Training Accuracy: 85.9%
    Time usage: 0:00:02

    Accuracy on Test-Set: 28.7% (2875 / 10000)

    Making the neural network immune to the noise ...
    Optimization Iteration: 0, Training Accuracy: 39.1%
    Optimization Iteration: 100, Training Accuracy: 98.4%
    Optimization Iteration: 199, Training Accuracy: 95.3%
    Time usage: 0:00:01

    Accuracy on Test-Set: 96.4% (9643 / 10000)

    Target-class: 1
    Finding adversarial noise ...
    Optimization Iteration: 0, Training Accuracy: 7.8%
    Optimization Iteration: 100, Training Accuracy: 15.6%
    Optimization Iteration: 200, Training Accuracy: 18.8%
    Optimization Iteration: 300, Training Accuracy: 12.5%
    Optimization Iteration: 400, Training Accuracy: 9.4%
    Optimization Iteration: 499, Training Accuracy: 12.5%
    Time usage: 0:00:02

    Accuracy on Test-Set: 94.3% (9428 / 10000)

    Making the neural network immune to the noise ...
    Optimization Iteration: 0, Training Accuracy: 95.3%
    Optimization Iteration: 100, Training Accuracy: 95.3%
    Optimization Iteration: 199, Training Accuracy: 92.2%
    Time usage: 0:00:01

    Accuracy on Test-Set: 96.9% (9685 / 10000)

    Target-class: 2
    Finding adversarial noise ...
    Optimization Iteration: 0, Training Accuracy: 6.2%
    Optimization Iteration: 100, Training Accuracy: 60.9%
    Optimization Iteration: 200, Training Accuracy: 64.1%
    Optimization Iteration: 300, Training Accuracy: 71.9%
    Optimization Iteration: 400, Training Accuracy: 75.0%
    Optimization Iteration: 499, Training Accuracy: 82.8%
    Time usage: 0:00:02

    Accuracy on Test-Set: 34.3% (3427 / 10000)

    Making the neural network immune to the noise ...
    Optimization Iteration: 0, Training Accuracy: 31.2%
    Optimization Iteration: 100, Training Accuracy: 100.0%
    Optimization Iteration: 199, Training Accuracy: 98.4%
    Time usage: 0:00:01

    Accuracy on Test-Set: 96.6% (9657 / 10000)

    Target-class: 2
    Finding adversarial noise ...
    Optimization Iteration: 0, Training Accuracy: 6.2%
    Optimization Iteration: 100, Training Accuracy: 9.4%
    Optimization Iteration: 200, Training Accuracy: 14.1%
    Optimization Iteration: 300, Training Accuracy: 10.9%
    Optimization Iteration: 400, Training Accuracy: 7.8%
    Optimization Iteration: 499, Training Accuracy: 17.2%
    Time usage: 0:00:02

    Accuracy on Test-Set: 94.3% (9435 / 10000)

    Making the neural network immune to the noise ...
    Optimization Iteration: 0, Training Accuracy: 96.9%
    Optimization Iteration: 100, Training Accuracy: 98.4%
    Optimization Iteration: 199, Training Accuracy: 96.9%
    Time usage: 0:00:01

    Accuracy on Test-Set: 96.6% (9664 / 10000)

    Target-class: 3
    Finding adversarial noise ...
    Optimization Iteration: 0, Training Accuracy: 14.1%
    Optimization Iteration: 100, Training Accuracy: 20.3%
    Optimization Iteration: 200, Training Accuracy: 40.6%
    Optimization Iteration: 300, Training Accuracy: 57.8%
    Optimization Iteration: 400, Training Accuracy: 54.7%
    Optimization Iteration: 499, Training Accuracy: 64.1%
    Time usage: 0:00:02

    Accuracy on Test-Set: 48.4% (4837 / 10000)

    Making the neural network immune to the noise ...
    Optimization Iteration: 0, Training Accuracy: 54.7%
    Optimization Iteration: 100, Training Accuracy: 98.4%
    Optimization Iteration: 199, Training Accuracy: 100.0%
    Time usage: 0:00:01

    Accuracy on Test-Set: 96.5% (9650 / 10000)

    Target-class: 3
    Finding adversarial noise ...
    Optimization Iteration: 0, Training Accuracy: 4.7%
    Optimization Iteration: 100, Training Accuracy: 10.9%
    Optimization Iteration: 200, Training Accuracy: 17.2%
    Optimization Iteration: 300, Training Accuracy: 15.6%
    Optimization Iteration: 400, Training Accuracy: 1.6%
    Optimization Iteration: 499, Training Accuracy: 9.4%
    Time usage: 0:00:02

    Accuracy on Test-Set: 95.7% (9570 / 10000)

    Making the neural network immune to the noise ...
    Optimization Iteration: 0, Training Accuracy: 95.3%
    Optimization Iteration: 100, Training Accuracy: 90.6%
    Optimization Iteration: 199, Training Accuracy: 98.4%
    Time usage: 0:00:01

    Accuracy on Test-Set: 96.7% (9667 / 10000)

    Target-class: 4
    Finding adversarial noise ...
    Optimization Iteration: 0, Training Accuracy: 7.8%
    Optimization Iteration: 100, Training Accuracy: 67.2%
    Optimization Iteration: 200, Training Accuracy: 78.1%
    Optimization Iteration: 300, Training Accuracy: 79.7%
    Optimization Iteration: 400, Training Accuracy: 81.2%
    Optimization Iteration: 499, Training Accuracy: 96.9%
    Time usage: 0:00:02

    Accuracy on Test-Set: 23.7% (2373 / 10000)

    Making the neural network immune to the noise ...
    Optimization Iteration: 0, Training Accuracy: 26.6%
    Optimization Iteration: 100, Training Accuracy: 95.3%
    Optimization Iteration: 199, Training Accuracy: 96.9%
    Time usage: 0:00:01

    Accuracy on Test-Set: 96.3% (9632 / 10000)

    Target-class: 4
    Finding adversarial noise ...
    Optimization Iteration: 0, Training Accuracy: 4.7%
    Optimization Iteration: 100, Training Accuracy: 7.8%
    Optimization Iteration: 200, Training Accuracy: 12.5%
    Optimization Iteration: 300, Training Accuracy: 15.6%
    Optimization Iteration: 400, Training Accuracy: 7.8%
    Optimization Iteration: 499, Training Accuracy: 14.1%
    Time usage: 0:00:02

    Accuracy on Test-Set: 92.0% (9197 / 10000)

    Making the neural network immune to the noise ...
    Optimization Iteration: 0, Training Accuracy: 92.2%
    Optimization Iteration: 100, Training Accuracy: 95.3%
    Optimization Iteration: 199, Training Accuracy: 95.3%
    Time usage: 0:00:01

    Accuracy on Test-Set: 96.3% (9632 / 10000)

    Target-class: 5
    Finding adversarial noise ...
    Optimization Iteration: 0, Training Accuracy: 4.7%
    Optimization Iteration: 100, Training Accuracy: 57.8%
    Optimization Iteration: 200, Training Accuracy: 76.6%
    Optimization Iteration: 300, Training Accuracy: 85.9%
    Optimization Iteration: 400, Training Accuracy: 89.1%
    Optimization Iteration: 499, Training Accuracy: 85.9%
    Time usage: 0:00:02

    Accuracy on Test-Set: 23.0% (2297 / 10000)

    Making the neural network immune to the noise ...
    Optimization Iteration: 0, Training Accuracy: 28.1%
    Optimization Iteration: 100, Training Accuracy: 93.8%
    Optimization Iteration: 199, Training Accuracy: 98.4%
    Time usage: 0:00:01

    Accuracy on Test-Set: 96.6% (9663 / 10000)

    Target-class: 5
    Finding adversarial noise ...
    Optimization Iteration: 0, Training Accuracy: 6.2%
    Optimization Iteration: 100, Training Accuracy: 10.9%
    Optimization Iteration: 200, Training Accuracy: 18.8%
    Optimization Iteration: 300, Training Accuracy: 18.8%
    Optimization Iteration: 400, Training Accuracy: 20.3%
    Optimization Iteration: 499, Training Accuracy: 21.9%
    Time usage: 0:00:02

    Accuracy on Test-Set: 88.2% (8824 / 10000)

    Making the neural network immune to the noise ...
    Optimization Iteration: 0, Training Accuracy: 93.8%
    Optimization Iteration: 100, Training Accuracy: 93.8%
    Optimization Iteration: 199, Training Accuracy: 93.8%
    Time usage: 0:00:01

    Accuracy on Test-Set: 96.7% (9665 / 10000)

    Target-class: 6
    Finding adversarial noise ...
    Optimization Iteration: 0, Training Accuracy: 7.8%
    Optimization Iteration: 100, Training Accuracy: 40.6%
    Optimization Iteration: 200, Training Accuracy: 53.1%
    Optimization Iteration: 300, Training Accuracy: 51.6%
    Optimization Iteration: 400, Training Accuracy: 56.2%
    Optimization Iteration: 499, Training Accuracy: 62.5%
    Time usage: 0:00:02

    Accuracy on Test-Set: 44.0% (4400 / 10000)

    Making the neural network immune to the noise ...
    Optimization Iteration: 0, Training Accuracy: 39.1%
    Optimization Iteration: 100, Training Accuracy: 96.9%
    Optimization Iteration: 199, Training Accuracy: 93.8%
    Time usage: 0:00:01

    Accuracy on Test-Set: 96.4% (9642 / 10000)

    Target-class: 6
    Finding adversarial noise ...
    Optimization Iteration: 0, Training Accuracy: 4.7%
    Optimization Iteration: 100, Training Accuracy: 17.2%
    Optimization Iteration: 200, Training Accuracy: 12.5%
    Optimization Iteration: 300, Training Accuracy: 14.1%
    Optimization Iteration: 400, Training Accuracy: 20.3%
    Optimization Iteration: 499, Training Accuracy: 7.8%
    Time usage: 0:00:02

    Accuracy on Test-Set: 94.6% (9457 / 10000)

    Making the neural network immune to the noise ...
    Optimization Iteration: 0, Training Accuracy: 93.8%
    Optimization Iteration: 100, Training Accuracy: 100.0%
    Optimization Iteration: 199, Training Accuracy: 95.3%
    Time usage: 0:00:01

    Accuracy on Test-Set: 96.8% (9682 / 10000)

    Target-class: 7
    Finding adversarial noise ...
    Optimization Iteration: 0, Training Accuracy: 4.7%
    Optimization Iteration: 100, Training Accuracy: 65.6%
    Optimization Iteration: 200, Training Accuracy: 89.1%
    Optimization Iteration: 300, Training Accuracy: 82.8%
    Optimization Iteration: 400, Training Accuracy: 85.9%
    Optimization Iteration: 499, Training Accuracy: 90.6%
    Time usage: 0:00:02

    Accuracy on Test-Set: 18.1% (1809 / 10000)

    Making the neural network immune to the noise ...
    Optimization Iteration: 0, Training Accuracy: 23.4%
    Optimization Iteration: 100, Training Accuracy: 95.3%
    Optimization Iteration: 199, Training Accuracy: 93.8%
    Time usage: 0:00:01

    Accuracy on Test-Set: 96.8% (9682 / 10000)

    Target-class: 7
    Finding adversarial noise ...
    Optimization Iteration: 0, Training Accuracy: 12.5%
    Optimization Iteration: 100, Training Accuracy: 10.9%
    Optimization Iteration: 200, Training Accuracy: 18.8%
    Optimization Iteration: 300, Training Accuracy: 18.8%
    Optimization Iteration: 400, Training Accuracy: 28.1%
    Optimization Iteration: 499, Training Accuracy: 18.8%
    Time usage: 0:00:02

    Accuracy on Test-Set: 84.1% (8412 / 10000)

    Making the neural network immune to the noise ...
    Optimization Iteration: 0, Training Accuracy: 84.4%
    Optimization Iteration: 100, Training Accuracy: 100.0%
    Optimization Iteration: 199, Training Accuracy: 100.0%
    Time usage: 0:00:01

    Accuracy on Test-Set: 97.0% (9699 / 10000)

    Target-class: 8
    Finding adversarial noise ...
    Optimization Iteration: 0, Training Accuracy: 7.8%
    Optimization Iteration: 100, Training Accuracy: 48.4%
    Optimization Iteration: 200, Training Accuracy: 46.9%
    Optimization Iteration: 300, Training Accuracy: 71.9%
    Optimization Iteration: 400, Training Accuracy: 70.3%
    Optimization Iteration: 499, Training Accuracy: 75.0%
    Time usage: 0:00:02

    Accuracy on Test-Set: 36.8% (3678 / 10000)

    Making the neural network immune to the noise ...
    Optimization Iteration: 0, Training Accuracy: 48.4%
    Optimization Iteration: 100, Training Accuracy: 96.9%
    Optimization Iteration: 199, Training Accuracy: 93.8%
    Time usage: 0:00:01

    Accuracy on Test-Set: 97.0% (9699 / 10000)

    Target-class: 8
    Finding adversarial noise ...
    Optimization Iteration: 0, Training Accuracy: 7.8%
    Optimization Iteration: 100, Training Accuracy: 14.1%
    Optimization Iteration: 200, Training Accuracy: 12.5%
    Optimization Iteration: 300, Training Accuracy: 7.8%
    Optimization Iteration: 400, Training Accuracy: 4.7%
    Optimization Iteration: 499, Training Accuracy: 9.4%
    Time usage: 0:00:02

    Accuracy on Test-Set: 96.2% (9625 / 10000)

    Making the neural network immune to the noise ...
    Optimization Iteration: 0, Training Accuracy: 96.9%
    Optimization Iteration: 100, Training Accuracy: 98.4%
    Optimization Iteration: 199, Training Accuracy: 95.3%
    Time usage: 0:00:01

    Accuracy on Test-Set: 97.2% (9720 / 10000)

    Target-class: 9
    Finding adversarial noise ...
    Optimization Iteration: 0, Training Accuracy: 9.4%
    Optimization Iteration: 100, Training Accuracy: 23.4%
    Optimization Iteration: 200, Training Accuracy: 43.8%
    Optimization Iteration: 300, Training Accuracy: 37.5%
    Optimization Iteration: 400, Training Accuracy: 45.3%
    Optimization Iteration: 499, Training Accuracy: 39.1%
    Time usage: 0:00:02

    Accuracy on Test-Set: 64.9% (6494 / 10000)

    Making the neural network immune to the noise ...
    Optimization Iteration: 0, Training Accuracy: 67.2%
    Optimization Iteration: 100, Training Accuracy: 95.3%
    Optimization Iteration: 199, Training Accuracy: 98.4%
    Time usage: 0:00:01

    Accuracy on Test-Set: 97.5% (9746 / 10000)

    Target-class: 9
    Finding adversarial noise ...
    Optimization Iteration: 0, Training Accuracy: 9.4%
    Optimization Iteration: 100, Training Accuracy: 7.8%
    Optimization Iteration: 200, Training Accuracy: 10.9%
    Optimization Iteration: 300, Training Accuracy: 15.6%
    Optimization Iteration: 400, Training Accuracy: 12.5%
    Optimization Iteration: 499, Training Accuracy: 4.7%
    Time usage: 0:00:02

    Accuracy on Test-Set: 97.1% (9709 / 10000)

    Making the neural network immune to the noise ...
    Optimization Iteration: 0, Training Accuracy: 98.4%
    Optimization Iteration: 100, Training Accuracy: 100.0%
    Optimization Iteration: 199, Training Accuracy: 95.3%
    Time usage: 0:00:01

    Accuracy on Test-Set: 97.7% (9768 / 10000)

    Plotting the adversarial noise

    We have now performed a lot of optimization of both the neural network and the adversarial noise. Let us see what the adversarial noise looks like.

    plot_noise()

    Noise:

    - Min: -0.35
    - Max: 0.35
    - Std: 0.270488

    Interestingly, the neural network now has a higher classification accuracy on the noisy images than it had on the clean images before this optimization.

    print_test_accuracy(show_example_errors=True,
                        show_confusion_matrix=True)

    Accuracy on Test-Set: 97.7% (9768 / 10000)
    Example errors:

    Confusion Matrix:
    [[ 972    0    1    0    0    0    2    1    3    1]
     [   0 1119    4    0    0    2    2    0    8    0]
     [   3    0 1006    9    1    1    1    5    4    2]
     [   1    0    1  997    0    5    0    4    2    0]
     [   0    1    3    0  955    0    3    1    2   17]
     [   1    0    0    9    0  876    3    0    2    1]
     [   6    4    0    0    3    6  934    0    5    0]
     [   2    4   18    3    1    0    0  985    2   13]
     [   4    0    4    3    4    1    1    3  950    4]
     [   6    6    0    7    4    5    0    4    3  974]]

    Performance on clean images

    Now reset the adversarial noise to zero and see how the neural network performs on clean images.

    init_noise()

    The neural network now performs slightly worse on the clean images (92.2%) than on the noisy images (97.7%), presumably because it was most recently trained on the noisy images.

    print_test_accuracy(show_example_errors=True,
                        show_confusion_matrix=True)

    Accuracy on Test-Set: 92.2% (9222 / 10000)
    Example errors:

    Confusion Matrix:
    [[ 970    0    1    0    0    1    8    0    0    0]
     [   0 1121    5    0    0    0    9    0    0    0]
     [   2    1 1028    0    0    0    1    0    0    0]
     [   1    0   27  964    0   13    2    2    1    0]
     [   0    2    3    0  957    0   20    0    0    0]
     [   3    0    2    2    0  875   10    0    0    0]
     [   4    1    0    0    1    1  951    0    0    0]
     [  10   21   61    3   14    3    0  913    3    0]
     [  29    2   91    7    7   26   70    1  741    0]
     [  20   18   10   12  150   65   11   12    9  702]]

    Close TensorFlow session

    We are now done using TensorFlow, so we close the session to release its resources.

    # This has been commented out in case you want to modify and experiment
    # with the Notebook without having to restart it.
    # session.close()

    Discussion

    The experiments above showed that we could make the neural network immune to adversarial noise for a single target class, so that it became impossible to find adversarial noise causing misclassification as that target class. However, it was apparently not possible to make the neural network immune to all target classes at the same time. Perhaps this could be done in some other way.

    One suggestion is to interleave the immunity training across the different target classes, rather than optimizing fully for one target class at a time. This should only require small modifications to the code above, as sketched below.
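
    Here is a minimal sketch of such an interleaved schedule. It assumes, hypothetically, that make_immune() is extended with parameters for the number of adversarial and immunization iterations per visit; the parameter names and the number of rounds are assumptions, not part of the original code.

    # Hypothetical interleaved schedule: visit every target class several
    # times with short optimization runs per visit, instead of fully
    # immunizing one class before moving on to the next.
    num_rounds = 5  # Assumed number of passes over all target classes.

    for round_idx in range(num_rounds):
        print("Interleaved round:", round_idx)

        for target_cls in range(10):
            # Assumes make_immune() accepts these (hypothetical)
            # iteration-count arguments, so each visit is cheap and no
            # single class dominates the network's weights.
            make_immune(target_cls=target_cls,
                        num_iterations_adversary=100,
                        num_iterations_immune=50)

            # Print newline.
            print()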

    Another suggestion is a two-level setup with 11 neural networks in total. A first-level network classifies the input image; this network has not been made immune to adversarial noise. The predicted class from the first level is then used to select one of ten second-level networks, each of which has been made immune to adversarial noise for its own target class. An adversarial example might fool the first-level network, but the selected second-level network would be immune to noise for that particular target class; a minimal sketch follows below.
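
    This sketch only illustrates the prediction path of the two-level idea. The first_level and second_level objects and their predict() interface are hypothetical placeholders for trained networks, not part of the tutorial's code.

    # Hypothetical two-level classifier. `first_level` is an ordinary
    # network; `second_level[c]` is a network made immune to adversarial
    # noise targeting class c. The .predict() interface is assumed.
    def two_level_predict(image, first_level, second_level):
        # First level: an ordinary network proposes a class.
        # An adversarial image may well fool this prediction.
        proposed_cls = first_level.predict(image)

        # Second level: re-classify with the network that is immune to
        # adversarial noise for the proposed class, so noise targeting
        # that class should no longer cause this misclassification.
        return second_level[proposed_cls].predict(image)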

    This might work when the number of classes is small, but it becomes infeasible when the number is large. ImageNet, for example, has 1000 classes, so we would need to train 1000 second-level networks, which is not practical.

    Summary

    This tutorial showed how to find adversarial noise for the hand-written digits of the MNIST data set. For each target class, a single noise pattern was found that caused almost all input images to be misclassified as that target class.

    The noise patterns for the MNIST data set are clearly visible to the human eye. But subtler noise patterns can probably be found for large neural networks that work on high-resolution images, such as the ImageNet data set.

    This tutorial also experimented with ways of making the neural network immune to adversarial noise. This worked for a single target class, but the methods tested could not make the neural network immune to all adversarial target classes at the same time.

    Exercises

    Below are a few suggested exercises that may help improve your TensorFlow skills. Practical hands-on experience is important for learning how to use TensorFlow properly.

    You may want to back up this Notebook before making any changes to it.

    • Try using fewer or more optimization iterations for the adversarial noise.
    • Tutorial #11 needed fewer than 30 iterations to find its adversarial noise. Why does this tutorial need so many more iterations?
    • Try different settings for noise_limit and noise_l2_weight. How does this affect the adversarial noise and the classification accuracy?
    • Try finding adversarial noise for target class 1. Does it also work for target class 3?
    • Can you find a better way to make the neural network immune to adversarial noise?
    • Can the neural network be made immune to adversarial noise generated for a single image, as was done in Tutorial #11?
    • Try creating another neural network with a different configuration. Does the adversarial noise for one network also work on the other?
    • Use the CIFAR-10 data set instead of MNIST. You can reuse some of the code from Tutorial #06.
    • How would you find adversarial noise for the Inception model and the ImageNet data set?
    • Explain to a friend how the program works.