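  • GANs 2.0: Generative Adversarial Networks in TensorFlow 2.0. Project goal: the main purpose of this project is to speed up building deep learning pipelines based on generative adversarial networks and to simplify prototyping of various generator/discriminator models. The library provides several GAN trainers that can be used as off-the-shelf features,...
  • In this talk, we start with a short tutorial on how GANs work and the considerations involved in designing a GAN architecture. We then discuss some of the more popular GAN architectures from several angles, including interpretability and ethics. Finally, we discuss...
  • Generative Adversarial Networks (GANs) have been one of the most active research topics in recent years and have attracted wide attention, with a large number of related papers published every year in machine learning, computer vision, natural language processing, speech recognition, and other fields. Recently, Dr. Jie Gui of the University of Michigan and colleagues...
  • GANs source code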

    2021-03-25 20:57:17
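    Generative adversarial networks. The basic idea of a generative adversarial network (GAN) is simple: make neural networks compete against each other, in the hope that the competition pushes them to excel. A GAN usually consists of two neural networks. Generator: the generator outputs some data, usually images, from an input noise distribution...
  • Two time-scale update rule for training GANs. This repository contains code accompanying the paper "GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium". Fréchet Inception Distance (FID): FID is...
  • GANs_on_MNIST_Torch: generative adversarial networks on the MNIST database using PyTorch. Dataset: MNIST, handwritten digits. Results: images generated during training at epoch 1, epoch 10, epoch 50, and epoch 200. Discriminator and generator losses: generator, discriminator. The discriminator on real and generated data...
  • Recommended must-read papers on GANs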

    2019-06-15 00:37:28
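    Recommended must-read papers on GANs: Must-Read Papers on GANs – Towards Data Science.
  • A collection of documents and code resources on generative adversarial networks (GANs), from introduction to practical application
  • Deep convolutional GAN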
  • GANs_in_Action.pdf

    2021-05-16 10:25:04
    You can also check the official site directly: https://www.manning.com/books/gans-in-action?query=GAN%20in%20action
  • So first test the GAN model on the MNIST dataset, then apply it to the CelebA dataset. import os import helper from glob import glob from matplotlib import pyplot import matplotlib as mpl import helper import ...
  • TensorFlow implementation of the CVPR 2018 spotlight paper, Deep Photo Enhancer: Unpaired Learning for Image Enhancement from Photographs with GANs
  • gans-in-action-master.zip

    2019-05-28 11:25:30
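    This archive provides all the code from the book GANs in Action; the book is due to be released in China in August 2019.
  • title = {The Numerics of GANs}, booktitle = {Advances in neural information processing systems}, year = {2017} } More details are available (link not preserved). Dependencies: this project uses Python 3.5.2. Before running the code, you...
  • NIPS2016 12-9 Ian Goodfellow introduction to GANs ( Generative Adversarial Networks)
  • Generative adversarial networks (GANs) have shown great success in generating all kinds of visual content. GANs have also been used in medical imaging applications such as image reconstruction, segmentation, detection, image synthesis, and classification. Moreover, with their ability to produce highly realistic images, GANs in the medical field and medical imaging...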
  • Introduction to GANs

    2018-09-20 23:45:51
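    An introduction to the development of deep learning and GANs: Introduction to GANs, Ian Goodfellow, Staff Research Scientist, Google Brain, MIX+GAN, IEEE Workshop on Perception Beyond the Visible Spectrum
  • Conditional-Gans: test code for conditional generative adversarial networks in TensorFlow. Introduction: in Chapter 1 we first introduce conditional generation and the code for conditional GANs, with reference to DCGAN techniques. Prerequisites: tensorflow >=...
  • PG GAN reimplemented in PyTorch. Original repository: https://github.com/github-pengge/PyTorch-progressive_growing_of_gans
  • GANs talk report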

    2018-05-10 17:14:28
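    These are PPT slides summarizing the abstracts and main ideas of recent GAN-related papers.
  • Instance Selection for GANs. This repository contains the code for the NeurIPS 2020 paper by Terrance DeVries, Michal Drozdzal, and Graham W. Taylor. Samples come from a BigGAN trained with instance selection on 256x256 ImageNet. Training took 11 days on 4 V100 GPUs...
  • GANs. Courses: Build Basic Generative Adversarial Networks (GANs), Build Better Generative Adversarial Networks (GANs), Apply Generative Adversarial Networks (GANs). Conference workshops: NeurIPS 2020: ECCV 2020: CVPR 2020: ... 00:35:02 GAN controls 00:35:02 ...
  • Professor Hung-yi Lee's latest introduction to applications of generative adversarial networks. Well worth studying!
  • Authors: Brian Dolhansky, Cristian Canton Ferrer. Introduction: we introduce a novel approach to inpainting in which the identity of the object to be removed or changed is preserved and accounted for at inference time: Exemplar GANs (ExGANs). ExGANs are a type of conditional GAN that uses...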
  • GANs

    2020-07-08 18:44:57

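    1. GANs

    1. Introduction to the principles of GANs
    2. Hung-yi Lee, 2017 deep learning: collected video lectures on generative adversarial networks (GANs)

    1.1 Mathematical foundations

    1. Probability and statistics

    Probability and statistics are two different concepts.

    Probability: the model parameters are known and $X$ is unknown; $p(x_1), \dots, p(x_n)$ are the probabilities of the corresponding $x_i$.

    Statistics: the model parameters are unknown and $X$ is known; from the observed data we infer the parameters of the model.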

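    2. Likelihood function and probability function

    In everyday language likelihood and probability are near-synonyms, so a likelihood also expresses a probability, but it is read from a different direction.

    • Likelihood: how probable the observations $x_1, \dots, x_n$ are under different values of the model parameters.

    • Likelihood estimation: the model parameters are unknown and $X$ is known; it is the process of estimating the model parameters from the observations $X$. In short: "the model is fixed, the parameters are unknown."

    If $\theta$ is known and fixed and $x$ is the variable, the function $p(x \mid \theta)$ is called the probability function; it describes how probable each sample point $x$ is.

    If $x$ is known and fixed and $\theta$ is the variable, the same function is called the likelihood function; it describes how probable the sample point $x$ is under different model parameters.

    Maximum likelihood estimation (why maximize?):
    For an observed dataset $x_1, x_2, \dots, x_n$, the probabilities under $\theta$ are $p(x_1 \mid \theta), p(x_2 \mid \theta), \dots, p(x_n \mid \theta)$, so, assuming the samples are independent, the probability of observing the whole dataset is $P(X \mid \theta) = p(x_1 \mid \theta)\, p(x_2 \mid \theta) \cdots p(x_n \mid \theta)$. Why did we happen to draw exactly these $n$ data points $x_1, x_2, \dots, x_n$? An intuitive explanation is that they were likely to occur, so we look for the $\theta$ that maximizes $P(X \mid \theta)$. This is maximum likelihood estimation.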
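As a minimal sketch of the idea above, assume the model is a one-dimensional Gaussian with $\theta = (\mu, \sigma)$; the source does not fix a model, so this choice and the function names are illustrative only. For this model the maximizer of the log-likelihood has a closed form, and nudging the parameters away from it can only lower the likelihood.

```python
import numpy as np

def gaussian_log_likelihood(x, mu, sigma):
    """log P(X | theta) for i.i.d. Gaussian data, with theta = (mu, sigma)."""
    x = np.asarray(x, dtype=float)
    return float(np.sum(-0.5 * np.log(2.0 * np.pi * sigma**2)
                        - (x - mu) ** 2 / (2.0 * sigma**2)))

def gaussian_mle(x):
    """Closed-form maximizer of the log-likelihood above."""
    x = np.asarray(x, dtype=float)
    mu_hat = x.mean()
    sigma_hat = np.sqrt(np.mean((x - mu_hat) ** 2))  # the MLE divides by n, not n - 1
    return float(mu_hat), float(sigma_hat)

# Synthetic observations; the "true" parameters (2.0, 1.5) are arbitrary.
rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.5, size=1000)
mu_hat, sigma_hat = gaussian_mle(data)
```

Taking the logarithm turns the product $p(x_1 \mid \theta) \cdots p(x_n \mid \theta)$ into a sum, which is numerically safer and has the same maximizer.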

    3. Maximum a posteriori (MAP) estimation

    Maximum likelihood seeks the parameters that maximize $P(X \mid \theta)$; MAP maximizes $P(X \mid \theta) P(\theta)$, which amounts to weighting the likelihood function by a term in $\theta$. This resembles adding a penalty term in regularization, except that regularization adds while MAP multiplies. (After taking logarithms the product becomes the sum $\log P(X \mid \theta) + \log P(\theta)$, so the two views coincide.)

    Why maximize $P(X \mid \theta) P(\theta)$?

    We estimate the model parameters from a set of observations $X = (x_1, x_2, \dots, x_n)$, that is, we want $P(\theta_0 \mid X)$. Rewriting with Bayes' rule gives

    $P(\theta_0 \mid X) = P(X \mid \theta_0) P(\theta_0) / P(X)$. For a given observation sequence $X$, $P(X)$ is fixed, so maximizing the posterior $P(\theta_0 \mid X)$ is the same as maximizing $P(X \mid \theta_0) P(\theta_0)$.

    In terms of Bayes' formula: the posterior $P(\theta_0 \mid X)$ equals, up to the constant factor $1 / P(X)$, the likelihood $P(X \mid \theta_0)$ times the prior $P(\theta_0)$.
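To make the difference between the two estimators concrete, here is a small sketch that assumes a coin-flip (Bernoulli) model with a Beta(alpha, beta) prior on $\theta$. Neither the model nor the prior comes from the source; they are chosen because both estimators then have closed forms.

```python
def bernoulli_mle(heads, n):
    """Maximizes P(X | theta): the observed fraction of heads."""
    return heads / n

def bernoulli_map(heads, n, alpha, beta):
    """Maximizes P(X | theta) * P(theta) with a Beta(alpha, beta) prior.

    This is the mode of the Beta(alpha + heads, beta + n - heads) posterior;
    the formula assumes alpha >= 1, beta >= 1 and n > 0.
    """
    return (heads + alpha - 1) / (n + alpha + beta - 2)

mle = bernoulli_mle(7, 10)                 # 7 heads in 10 flips
map_flat = bernoulli_map(7, 10, 1, 1)      # uniform prior: same as the MLE
map_centered = bernoulli_map(7, 10, 5, 5)  # prior centered on 0.5 pulls the estimate down
```

With a uniform prior the weight $P(\theta)$ is constant and MAP reduces to maximum likelihood; a prior concentrated around 0.5 acts like a regularizer that shrinks the estimate toward 0.5.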

    4. Expectation
    (1) Discrete case
    (formula image not preserved)
    (2) Continuous case
    (formula image not preserved)
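The two formula images are missing from the source. The standard definitions they most likely showed are given here as an assumption, written with the $p(x)$ notation used earlier in these notes:

$$
\mathbb{E}[X] = \sum_{i} x_i \, p(x_i) \quad \text{(discrete)}, \qquad
\mathbb{E}[X] = \int x \, p(x) \, dx \quad \text{(continuous)}.
$$

More generally, $\mathbb{E}_{x \sim P}[f(x)]$ replaces $x$ by $f(x)$ inside the sum or integral; the GAN objective uses this form.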
    5. Maximum likelihood estimation and KL divergence
    (derivation image not preserved)
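The derivation image is missing. A sketch of the standard argument linking the two, under the assumption that $P_G(x; \theta)$ denotes the model density and that $x_1, \dots, x_n$ are drawn from $P_{data}$:

$$
\begin{aligned}
\theta^* &= \arg\max_\theta \prod_{i=1}^{n} P_G(x_i; \theta)
          = \arg\max_\theta \sum_{i=1}^{n} \log P_G(x_i; \theta) \\
         &\approx \arg\max_\theta \, \mathbb{E}_{x \sim P_{data}}\left[\log P_G(x; \theta)\right] \\
         &= \arg\min_\theta \, \mathbb{E}_{x \sim P_{data}}\left[\log P_{data}(x) - \log P_G(x; \theta)\right]
          = \arg\min_\theta \, \mathrm{KL}\left(P_{data} \,\|\, P_G\right).
\end{aligned}
$$

The term $\mathbb{E}_{x \sim P_{data}}[\log P_{data}(x)]$ does not depend on $\theta$, so adding it leaves the optimum unchanged: maximizing the likelihood is, in the large-sample limit, the same as minimizing the KL divergence from the data distribution to the model.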
    6. The GAN objective and JS divergence
    (derivation images not preserved)
    So the $\min \max$ game essentially minimizes the JS divergence, that is, $\min\left(-2\log 2 + 2 \times \mathrm{JSD}(P_{data} \,\|\, P_G)\right)$
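The identity above can be checked numerically. The sketch below assumes discrete distributions with full support and uses the optimal discriminator from the original GAN paper, $D^*(x) = P_{data}(x) / (P_{data}(x) + P_G(x))$, which the lost derivation images presumably established; the two example distributions are arbitrary.

```python
import numpy as np

def kl(p, q):
    """KL(p || q) for discrete distributions with full support."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(np.sum(p * np.log(p / q)))

def jsd(p, q):
    """Jensen-Shannon divergence: average KL to the mixture m = (p + q) / 2."""
    m = 0.5 * (np.asarray(p, dtype=float) + np.asarray(q, dtype=float))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def value_at_optimal_d(p_data, p_g):
    """V(D*, G) = E_data[log D*(x)] + E_G[log(1 - D*(x))]."""
    p_data, p_g = np.asarray(p_data, dtype=float), np.asarray(p_g, dtype=float)
    d_star = p_data / (p_data + p_g)
    return float(np.sum(p_data * np.log(d_star)) + np.sum(p_g * np.log(1.0 - d_star)))

p_data = np.array([0.1, 0.2, 0.3, 0.4])
p_g = np.array([0.4, 0.3, 0.2, 0.1])
lhs = value_at_optimal_d(p_data, p_g)
rhs = -2.0 * np.log(2.0) + 2.0 * jsd(p_data, p_g)  # agrees with lhs up to rounding
```

When $P_G = P_{data}$ the JS divergence is zero and the value reaches its minimum, $-2\log 2$.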

    Reference: https://www.cnblogs.com/bonelee/p/9166084.html

    1.2 How the GAN algorithm works

    (architecture figure not preserved)
    Forward pass

    • Model input
      1. We generate a random vector as the input to the generative model; passing it through the generative model produces a new vector, the Fake Image, denoted $G(z)$.
      2. We randomly pick an image from the dataset and convert it to a vector, the Real Image, denoted $x$.
    • Model output
      The output of step 1 or step 2 is fed to the discriminator network, which outputs a number between $0$ and $1$ representing the probability that the input image is a Real Image (real is $1$, fake is $0$). This probability is used to compute the loss function. Before explaining the loss function, we first explain the discriminator's input. Depending on whether the input image is a Fake Image or a Real Image, the label of the discriminator's input is set to $0$ or $1$. That is, the discriminator's input takes the form $(x_{fake}, 0)$ or $(x_{real}, 1)$.

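    Backward pass

    • Optimization objective
      (formula image not preserved; the standard GAN objective is $\min_G \max_D V(D, G) = \mathbb{E}_{x \sim P_{data}}[\log D(x)] + \mathbb{E}_{z \sim P_z}[\log(1 - D(G(z)))]$)
      ($z$ denotes the random noise fed to $G$.) The discriminator $D$ tries to classify real samples correctly with maximum probability (maximize $\log D(x)$), while the generator $G$ tries to maximize $D(G(z))$, that is, minimize $\log(1 - D(G(z)))$. $G$ and $D$ are trained together, but at each step one side is held fixed while the other's parameters are updated; the two alternate, each trying to maximize the other's error. In the end, $G$ can estimate the distribution of the real samples.
      Step 1: optimize $D$
      (formula image not preserved)
      When optimizing $D$, the generator network is not involved; $G(z)$ is simply a fake sample that has already been produced. The first term of $D$'s objective makes the output for a real sample $x$ as large as possible, because the prediction for a real sample should be as close to $1$ as possible. For a fake sample $G(z)$, the output should be as small as possible, that is, $D(G(z))$ should be as small as possible, because its label is $0$. Maximizing the first term while minimizing the second would be inconsistent, so the second term is rewritten as $1 - D(G(z))$, which is then also maximized.
      Step 2: optimize $G$
      (formula image not preserved)
      When optimizing $G$, real samples are not involved, so the first term is dropped. Only fake samples remain, and now we want their label to be $1$, so $D(G(z))$ should be as large as possible. To keep the same $1 - D(G(z))$ form, this is written as minimizing $1 - D(G(z))$; the two are essentially the same, and the rewrite only unifies the form. The two optimization problems can then be combined into the min-max objective given at the start.

    • Discriminator loss function
      (formula image not preserved)

    • Generator loss function
      (formula image not preserved)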
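The two loss images are missing from the source. As a hedged sketch, the functions below implement the losses that follow from the objective described above: the discriminator minimizes the negative of its objective, and the generator minimizes $\log(1 - D(G(z)))$. The function names and the `EPS` constant are illustrative, and the inputs are assumed to be discriminator outputs in $(0, 1)$.

```python
import numpy as np

EPS = 1e-12  # guards against log(0); an implementation detail, not from the source

def discriminator_loss(d_real, d_fake):
    """-(mean log D(x) + mean log(1 - D(G(z)))); the discriminator minimizes this."""
    d_real = np.asarray(d_real, dtype=float)
    d_fake = np.asarray(d_fake, dtype=float)
    return float(-(np.mean(np.log(d_real + EPS)) + np.mean(np.log(1.0 - d_fake + EPS))))

def generator_loss(d_fake):
    """mean log(1 - D(G(z))); the generator minimizes this (minimax form)."""
    d_fake = np.asarray(d_fake, dtype=float)
    return float(np.mean(np.log(1.0 - d_fake + EPS)))

# A confident, correct discriminator has a lower loss than an undecided one.
confident = discriminator_loss([0.9, 0.8], [0.1, 0.2])
undecided = discriminator_loss([0.5, 0.5], [0.5, 0.5])
```

The generator loss falls as `d_fake` approaches 1, which matches the text: $G$ wants its fake samples to be labelled real.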

    • Paper: https://arxiv.org/abs/1812.04948

    • Code: https://github.com/NVlabs/stylegan

    • Complete GAN notes: http://www.gwylab.com/note-gans.html

    • Detailed StyleGAN walkthrough: http://www.gwylab.com/pdf/Note_StyleGAN.pdf

    • A fun website based on StyleGAN: http://www.seeprettyface.com/

    • Slides: http://speech.ee.ntu.edu.tw/~tlkagk/courses_MLDS17.html
  • Collection of GAN papers on image processing

    2019-02-22 23:22:59
    A collection of selected GAN papers on image processing from 2014 to 2018, mostly from arXiv
  • What are GANs used for? Creativity: Uniquely Human. Art, the ability to create something original, to use one's unbounded creativity and imagination, is something that we humans like ...

    What are GANs used for?

    Creativity: Uniquely Human

    Art — the ability to create something original, to use one’s unbounded creativity and imagination — it’s something that we humans like to believe is unique to us. After all, no other animal or computer thus far has come close to matching the artistic skill of humans when it comes to realistic paintings.


    Even with the recent advances in AI, computers still struggled with being creative. They were good at computing, classifying, and performing programmed tasks, but they could never really match the level of creativity humans had. Human creativity was assuredly unique and one of its kind… That is until generative adversarial networks were conceived. In 2014, the original paper on GANs proposed a new system of estimating generative models — models which could create— using two models.

    An Eternal Game of Cat and Mouse

    A GAN consists of two models, a generative model G and a discriminative model D as well as a real data set. You can think of G as a counterfeiter, trying to make money that is more and more like real money and D as a cop, trying to differentiate whether the money is real or counterfeit. The real data set acts as a reference for the cop to compare with the output of the counterfeiter.


    In the beginning, the counterfeiter is going to suck, because he has no idea how to actually make the money look like real money, but after every instance of the cop determining the fake money, he gets better and learns from his mistakes. Keep in mind too, the discriminator is also getting incrementally better at differentiating real and fake currency — he’s becoming a better cop. This cycle continues over and over until the generator is so good at creating data that looks like the training data even the discriminator can’t tell.

    Image-to-Image Translation with Cycle GANs

    The classic GAN architecture is good when it comes to creating new, similar-looking data but it doesn’t work so well when trying to alter an existing image. Moreover, traditional approaches to image-image translation required datasets with paired examples.


    These paired examples are data points that are directly related — they show the original image and the desired modification to it. For instance, the training dataset would need to contain the same landscape during winter and summer. However, these datasets are challenging and difficult to prepare — sometimes even impossible, as is the case with art. There just is no photographic equivalent of the Mona Lisa or other great artworks.


    To combat this, we use a Cycle GAN. This is a special type of generative adversarial network that has an extension to the GAN architecture: cycle consistency. This is the notion that an image output by the first generator could be used as the input to a second generator and the output of the second generator should match the original image — undoing what the first generator did to the original image.


    Think about it like this: You're using Google Translate to translate something from English to Spanish. You then open a new tab and copy-paste the Spanish back into Google Translate, where it translates it to English. At the end of all this, you would expect the original sentence again. This is the principle of a Cycle GAN, and it acts as an additional loss to measure the difference between the generated output of the second generator and the original image, without the need for paired examples.

    The Cycle GAN Architecture for Style Transfer

    Here's how the Cycle GAN would work if our model were being trained to create images in the style of the French artist Cézanne.

    Datasets

    • Dataset 1: Real photos


    • Dataset 2: Artworks by Cézanne

    Generative Adversarial Networks

    • GAN 1: Translates real photos (dataset 1) into artworks in the style of Cézanne (dataset 2)


    • GAN 2: Translates artworks in the style of Cézanne (dataset 2) into real photos (dataset 1)

    Forward Cycle Consistency Loss

    • Input a real photo (dataset 1) to GAN 1

    • Output a Cézanne-styled artwork from GAN 1

    • Input the Cézanne-styled artwork generated by GAN 1 to GAN 2

    • Output a reconstructed real photo from GAN 2

    • Compare the original real photo (dataset 1) to the photo reconstructed by GAN 2; in the CycleGAN paper this comparison is a pixel-wise L1 distance rather than a discriminator score

    Backward Cycle Consistency Loss

    • Input a Cézanne artwork (dataset 2) to GAN 2

    • Output a real photo from GAN 2

    • Input the real photo generated by GAN 2 to GAN 1

    • Output a reconstructed Cézanne-styled artwork from GAN 1

    • Compare the original Cézanne artwork (dataset 2) to the artwork reconstructed by GAN 1; in the CycleGAN paper this comparison is a pixel-wise L1 distance rather than a discriminator score

    By minimizing these two losses, our model will eventually learn how to transform real photos into artworks in the style of Cézanne, and since we have not just one but two GANs, we can also turn original artworks by Cézanne into real photos. If we wanted our model to transform our photos into another artist's style, we would simply need to replace the Cézanne artworks in dataset 2 with artworks by another artist, say Van Gogh or Picasso.
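The forward and backward cycle described above can be sketched as a single loss function. This is a minimal illustration that assumes images are NumPy arrays and that the round-trip comparison is a mean absolute (L1) difference, as in the CycleGAN paper; the two toy "generators" are hypothetical stand-ins chosen so that one exactly undoes the other, and the adversarial losses of the two GANs are left out.

```python
import numpy as np

def cycle_consistency_loss(photos, paintings, photo_to_art, art_to_photo):
    """Mean absolute error between each input and its round-trip reconstruction."""
    # Forward cycle: photo -> artwork -> photo
    forward = np.mean(np.abs(art_to_photo(photo_to_art(photos)) - photos))
    # Backward cycle: artwork -> photo -> artwork
    backward = np.mean(np.abs(photo_to_art(art_to_photo(paintings)) - paintings))
    return float(forward + backward)

# Toy stand-ins for GAN 1 and GAN 2: exact inverses of each other.
def to_art(img):
    return img * 2.0 + 1.0

def to_photo(img):
    return (img - 1.0) / 2.0

photos = np.ones((2, 4, 4, 3))          # batch of 2 tiny RGB "photos"
paintings = np.full((2, 4, 4, 3), 3.0)  # batch of 2 tiny RGB "paintings"

perfect = cycle_consistency_loss(photos, paintings, to_art, to_photo)          # 0.0
broken = cycle_consistency_loss(photos, paintings, to_art, lambda img: img)    # > 0
```

When the second mapping fails to undo the first, the reconstruction drifts from the input and the loss becomes positive, which is the signal that pushes the two generators to stay consistent without paired examples.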


    Really — what we’re capable of doing with GANs is just nuts. I mean even for a human, it is quite a daunting task to try and imitate an artist’s style given a photo for inspiration. Some people dedicate their entire lives to this work, yet for a GAN, they can apply any trained style to any picture in minutes. Nuts!

    The Results

    (result images not preserved)

    The Future of Creative Computing

    Although calling them masterpieces might be a stretch, there’s no doubt that artificial intelligence is quickly catching up to humans in terms of something we thought was once secure — artistic talent and creativity. I only explained one application of GANs which I find personally to be fascinating, but GANs are now being used in a myriad of ways, from generating realistic faces to increasing the quality of images — and at the heart of it all, just a game between a con and a cop.


    Hey! Thanks for making it to the end! I’m Freeman, an Innovator at TKS (The Knowledge Society) and I’m super passionate about breaking down and solving the world’s biggest problems.

    LinkedIn: https://www.linkedin.com/in/freeman-jiang-50b325190/

    Email: freeman.jiang.ca@gmail.com

    Website: freemanjiang.com

    Translated from: https://towardsdatascience.com/transforming-real-photos-into-master-artworks-with-gans-7b859a43e6ea
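  • Vanilla GANs with minibatch discrimination, implemented in PyTorch. This repository contains my first code in PyTorch: a GAN implemented from scratch (well, not really) and trained to generate MNIST-like digits. Minibatch discrimination is also implemented to avoid mode collapse, which is...
  • improved_wgan_training: code for reproducing the experiments in "Improved Training of Wasserstein GANs". Prerequisites: Python, NumPy, TensorFlow...
  • Reading notes on Chapter 1 of GANs in Action. Chapter 1. Introduction to GANs. This chapter covers an overview of GANs, what makes GANs special, and applications of GANs

    These are reading notes on Chapter 1 of GANs in Action.

    Chapter 1. Introduction to GANs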

    This chapter covers

    • An overview of Generative Adversarial Networks
    • What makes this class of machine learning algorithms special
    • Some of the exciting GAN applications that this book covers

    In short, this chapter covers an overview of GANs, what makes them special, and their applications.

    The notion of whether machines can think is older than the computer itself. In 1950, the famed mathematician, logician, and computer scientist Alan Turing—perhaps best known for his role in decoding the Nazi wartime enciphering machine, Enigma—penned a paper that would immortalize his name for generations to come, “Computing Machinery and Intelligence.”

    In the paper, Turing proposed a test he called the imitation game, better known today as the Turing test. In this hypothetical scenario, an unknowing observer talks with two counterparts behind a closed door: one, a fellow human; the other, a computer. Turing reasons that if the observer is unable to tell which is the person and which is the machine, the computer passed the test and must be deemed intelligent.

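    Turing proposed the Turing test: a computer and a person sit behind a door and a tester talks with both; if the tester cannot tell the person from the computer, the computer is considered intelligent.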

    Anyone who has attempted to engage in a dialogue with an automated chatbot or a voice-powered intelligent assistant knows that computers have a long way to go to pass this deceptively simple test. However, in other tasks, computers have not only matched human performance but also surpassed it—even in areas that were until recently considered out of reach for even the smartest algorithms, such as superhumanly accurate face recognition or mastering the game of Go.[1]

    [1]:See “Surpassing Human-Level Face Verification Performance on LFW with GaussianFace,” by Chaochao Lu and Xiaoou Tang, 2014, https://arXiv.org/abs/1404.3840. See also the New York Times article “Google’s AlphaGo Defeats Chinese Go Master in Win for A.I.,” by Paul Mozur, 2017, http://mng.bz/07WJ.

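    Although artificial intelligence still has a long way to go, in some areas, such as face recognition and Go, its abilities already surpass those of humans.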

    Machine learning algorithms are great at recognizing patterns in existing data and using that insight for tasks such as classification (assigning the correct category to an example) and regression (estimating a numerical value based on a variety of inputs). When asked to generate new data, however, computers have struggled. An algorithm can defeat a chess grandmaster, estimate stock price movements, and classify whether a credit card transaction is likely to be fraudulent. In contrast, any attempt at making small talk with Amazon’s Alexa or Apple’s Siri is doomed. Indeed, humanity’s most basic and essential capacities—including a convivial conversation or the crafting of an original creation—can leave even the most sophisticated supercomputers in digital spasms.

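    Earlier machine learning algorithms were good at classification and regression on existing data but performed poorly at generating new data.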

    This all changed in 2014 when Ian Goodfellow, then a PhD student at the University of Montreal, invented Generative Adversarial Networks (GANs). This technique has enabled computers to generate realistic data by using not one, but two, separate neural networks. GANs were not the first computer program used to generate data, but their results and versatility set them apart from all the rest. GANs have achieved remarkable results that had long been considered virtually impossible for artificial systems, such as the ability to generate fake images with real-world-like quality, turn a scribble into a photograph-like image, or turn video footage of a horse into a running zebra—all without the need for vast troves of painstakingly labeled training data.

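    That changed in 2014, when Ian Goodfellow, then a PhD student, proposed generative adversarial networks (GANs). A GAN consists of two neural networks; it is versatile at generating new data and has been widely applied.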

    A telling example of how far machine data generation has been able to advance thanks to GANs is the synthesis of human faces, illustrated in figure 1.1. As recently as 2014, when GANs were invented, the best that machines could produce was a blurred countenance—and even that was celebrated as a groundbreaking success. By 2017, just three years later, advances in GANs enabled computers to synthesize fake faces whose quality rivals high-resolution portrait photographs. In this book, we look under the hood of the algorithm that made all this possible.

    A good example is face image synthesis: as figure 1.1 shows, GANs can generate high-resolution images.

    Figure 1.1. Progress in human face generation


    (Source: “The Malicious Use of Artificial Intelligence: Forecasting, Prevention, and Mitigation,” by Miles Brundage et al., 2018, https://arxiv.org/abs/1802.07228.)

    1.1. What are Generative Adversarial Networks?

    Generative Adversarial Networks (GANs) are a class of machine learning techniques that consist of two simultaneously trained models: one (the Generator) trained to generate fake data, and the other (the Discriminator) trained to discern the fake data from real examples.

    A GAN contains a Generator and a Discriminator: the former generates fake images and the latter identifies them as fake.

    The word generative indicates the overall purpose of the model: creating new data. The data that a GAN will learn to generate depends on the choice of the training set. For example, if we want a GAN to synthesize images that look like Leonardo da Vinci’s, we would use a training dataset of da Vinci’s artwork.

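    Generative: the Generator produces data resembling the training dataset; for example, with da Vinci's works as the training set, it synthesizes images in da Vinci's style.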

    The term adversarial points to the game-like, competitive dynamic between the two models that constitute the GAN framework: the Generator and the Discriminator. The Generator’s goal is to create examples that are indistinguishable from the real data in the training set. In our example, this means producing paintings that look just like da Vinci’s. The Discriminator’s objective is to distinguish the fake examples produced by the Generator from the real examples coming from the training dataset. In our example, the Discriminator plays the role of an art expert assessing the authenticity of paintings believed to be da Vinci’s. The two networks are continually trying to outwit each other: the better the Generator gets at creating convincing data, the better the Discriminator needs to be at distinguishing real examples from the fake ones.

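    Adversarial: the Generator tries to produce fakes that pass for real, while the Discriminator tries to tell real from fake; like a forger and an authenticator, the two compete against each other.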

    Finally, the word networks indicates the class of machine learning models most commonly used to represent the Generator and the Discriminator: neural networks. Depending on the complexity of the GAN implementation, these can range from simple feed-forward neural networks (as you’ll see in chapter 3) to convolutional neural networks (as you’ll see in chapter 4) or even more complex variants, such as the U-Net (as you’ll see in chapter 9).

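    Networks: the Generator and the Discriminator are usually two neural networks, which can be feed-forward networks (chapter 3), convolutional networks (chapter 4), or more complex variants such as the U-Net (chapter 9).

    1.2. How do GANs work?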

    The mathematics underpinning GANs are complex (as you’ll explore in later chapters, especially chapters 3 and 5); fortunately, many real-world analogies can make GANs easier to understand. Previously, we discussed the example of an art forger (the Generator) trying to fool an art expert (the Discriminator). The more convincing the fake paintings the forger makes, the better the art expert must be at determining their authenticity. This is true in the reverse situation as well: the better the art expert is at telling whether a particular painting is genuine, the more the forger must improve to avoid being caught red-handed.

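    The mathematics of GANs is fairly complex, so the analogy of a da Vinci forger and an art expert is a vivid one. The abilities of the Generator (forging) and the Discriminator (authenticating) push each other to improve during training.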

    Another metaphor often used to describe GANs—one that Ian Goodfellow himself likes to use—is that of a criminal (the Generator) who forges money, and a detective (the Discriminator) who tries to catch him. The more authentic-looking the counterfeit bills become, the better the detective must be at detecting them, and vice versa.

    Another analogy is a counterfeiter of banknotes and a detective.

    In more technical terms, the Generator’s goal is to produce examples that capture the characteristics of the training dataset, so much so that the samples it generates look indistinguishable from the training data. The Generator can be thought of as an object recognition model in reverse. Object recognition algorithms learn the patterns in images to discern an image’s content. Instead of recognizing the patterns, the Generator learns to create them essentially from scratch; indeed, the input into the Generator is often no more than a vector of random numbers.

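    Put more technically: the Generator takes a random vector as input and, during training, captures the characteristics of the training data to produce samples that are hard to tell from real ones; the Discriminator captures the characteristics of the training data in order to identify fake samples.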

    The Generator learns through the feedback it receives from the Discriminator’s classifications. The Discriminator’s goal is to determine whether a particular example is real (coming from the training dataset) or fake (created by the Generator). Accordingly, each time the Discriminator is fooled into classifying a fake image as real, the Generator knows it did something well. Conversely, each time the Discriminator correctly rejects a Generator-produced image as fake, the Generator receives the feedback that it needs to improve.

    The Discriminator continues to improve as well. Like any classifier, it learns from how far its predictions are from the true labels (real or fake). So, as the Generator gets better at producing realistic-looking data, the Discriminator gets better at telling fake data from the real, and both networks continue to improve simultaneously.

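    If the Discriminator classifies an example correctly, it learns that it did well and the Generator receives feedback that it needs to improve; the reverse holds when the Discriminator is fooled.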

    Table 1.1 summarizes the key takeaways about the two GAN subnetworks.

    (Table 1.1 image not preserved)

    1.3. GANs in action

    Now that you have a high-level understanding of GANs and their constituent networks, let’s take a closer look at the system in action. Imagine that our goal is to teach a GAN to produce realistic-looking handwritten digits. (You’ll learn to implement such a model in chapter 3 and expand on it in chapter 4.) Figure 1.2 illustrates the core GAN architecture.

    Figure 1.2 depicts the core GAN architecture.

    Figure 1.2. The two GAN subnetworks, their inputs and outputs, and their interactions


    Let’s walk through the details of the diagram:

    1. Training dataset— The dataset of real examples that we want the Generator to learn to emulate with near-perfect quality. In this case, the dataset consists of images of handwritten digits. This dataset serves as input (x) to the Discriminator network.
    2. Random noise vector— The raw input (z) to the Generator network. This input is a vector of random numbers that the Generator uses as a starting point for synthesizing fake examples.
    3. Generator network— The Generator takes in a vector of random numbers (z) as input and outputs fake examples (x*). Its goal is to make the fake examples it produces indistinguishable from the real examples in the training dataset.
    4. Discriminator network— The Discriminator takes as input either a real example (x) coming from the training set or a fake example (x*) produced by the Generator. For each example, the Discriminator determines and outputs the probability of whether the example is real.
    5. Iterative training/tuning— For each of the Discriminator’s predictions, we determine how good it is—much as we would for a regular classifier—and use the results to iteratively tune the Discriminator and the Generator networks through backpropagation:

    • The Discriminator’s weights and biases are updated to maximize its classification accuracy (maximizing the probability of correct prediction: x as real and x* as fake).
    • The Generator’s weights and biases are updated to maximize the probability that the Discriminator misclassifies x* as real.

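    1: the real training data, used as the Discriminator's input $x$
    2: the random vector, used as the Generator's input $z$ to produce fake images
    3: the Generator network, which takes $z$ as input and outputs a fake image $x^*$
    4: the Discriminator network, which takes a real image $x$ or a fake image $x^*$ as input and outputs the probability that the image is real
    5: iterative training/tuning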
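To make the inputs and outputs in the list above concrete, here is a shape-level sketch with NumPy. It assumes 28x28 digit images flattened to 784 values and a 100-dimensional noise vector; both sizes, the single-layer "networks", and the weight names are illustrative and not taken from the book, whose real models appear in chapters 3 and 4.

```python
import numpy as np

rng = np.random.default_rng(0)
Z_DIM, IMG_DIM = 100, 28 * 28  # hypothetical sizes (MNIST-like digits)

# Single-layer stand-ins for the two networks; real models are deeper.
W_g = rng.normal(0.0, 0.02, size=(Z_DIM, IMG_DIM))
W_d = rng.normal(0.0, 0.02, size=(IMG_DIM, 1))

def generator(z):
    """Maps noise vectors z to fake examples x* with values in (-1, 1)."""
    return np.tanh(z @ W_g)

def discriminator(x):
    """Maps examples to the probability that each one is real."""
    return 1.0 / (1.0 + np.exp(-(x @ W_d)))

z = rng.normal(size=(16, Z_DIM))   # item 2: a batch of random noise vectors
x_fake = generator(z)              # item 3: fake examples x*
p_real = discriminator(x_fake)     # item 4: one probability per example
```

Real examples $x$ from the training set would be passed to `discriminator` in exactly the same way, which is why the Discriminator cannot tell from the input format alone where an example came from.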

    1.3.1. GAN training

    Learning about the purpose of the various GAN components may feel like looking at a snapshot of an engine: it cannot be understood fully until we see it in motion. That’s what this section is all about. First, we present the GAN training algorithm; then, we illustrate the training process so you can see the architecture diagram in action.

    We first look at the algorithm and then use the training process to understand GANs.

    GAN training algorithm

    For each training iteration do
    
    	1. Train the Discriminator:
    		1. Take a random real example x from the training dataset.
    		2. Get a new random noise vector z and, using the Generator network, synthesize a fake example x*.
    		3. Use the Discriminator network to classify x and x*.
    		4. Compute the classification errors and backpropagate the total error to update the Discriminator’s trainable parameters, seeking to minimize the classification errors.
    	2. Train the Generator:
    		1. Get a new random noise vector z and, using the Generator network, synthesize a fake example x*.
    		2. Use the Discriminator network to classify x*.
    		3. Compute the classification error and backpropagate the error to update the Generator’s trainable parameters, seeking to maximize the Discriminator’s error.
    End for
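The steps above can be sketched as a runnable toy. To stay dependency-free, this example assumes one-dimensional data drawn from N(4, 1), a linear Generator x* = a·z + b, a logistic Discriminator D(x) = sigmoid(w·x + c), and hand-derived gradients; none of these choices come from the book, and such a tiny GAN illustrates the alternating updates rather than good sample quality.

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def train_toy_gan(steps=200, batch=64, lr=0.05, seed=0):
    rng = np.random.default_rng(seed)
    a, b = 1.0, 0.0   # Generator parameters: x* = a * z + b
    w, c = 0.0, 0.0   # Discriminator parameters: D(x) = sigmoid(w * x + c)
    for _ in range(steps):
        # 1. Train the Discriminator (Generator held fixed)
        x = rng.normal(4.0, 1.0, size=batch)   # 1.1 real examples x
        z = rng.normal(size=batch)             # 1.2 noise z -> fake examples x*
        x_fake = a * z + b
        d_real = sigmoid(w * x + c)            # 1.3 classify x and x*
        d_fake = sigmoid(w * x_fake + c)
        # 1.4 gradients of -[log D(x) + log(1 - D(x*))] with respect to w and c
        grad_w = np.mean((d_real - 1.0) * x) + np.mean(d_fake * x_fake)
        grad_c = np.mean(d_real - 1.0) + np.mean(d_fake)
        w, c = w - lr * grad_w, c - lr * grad_c

        # 2. Train the Generator (Discriminator held fixed)
        z = rng.normal(size=batch)             # 2.1 new noise z -> x*
        x_fake = a * z + b
        d_fake = sigmoid(w * x_fake + c)       # 2.2 classify x*
        # 2.3 gradients of log(1 - D(x*)) with respect to a and b
        grad_a = np.mean(-d_fake * w * z)
        grad_b = np.mean(-d_fake * w)
        a, b = a - lr * grad_a, b - lr * grad_b
    return a, b, w, c

a, b, w, c = train_toy_gan()
```

Each network is updated with the other held fixed, and the Generator step draws fresh noise, mirroring steps 1 and 2 of the algorithm. In a real implementation an autodiff framework would compute the gradients.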
    
    

    GAN training visualized
    Figure 1.3 illustrates the GAN training algorithm. The letters in the diagram refer to the list of steps in the GAN training algorithm.

    (Figure 1.3 image not preserved)

    1.3.2. Reaching equilibrium

    You may wonder when the GAN training loop is meant to stop. More precisely, how do we know when a GAN is fully trained so that we can determine the appropriate number of training iterations? With a regular neural network, we usually have a clear objective to achieve and measure. For example, when training a classifier, we measure the classification error on the training and validation sets, and we stop the process when the validation error starts getting worse (to avoid overfitting). In a GAN, the two networks have competing objectives: when one network gets better, the other gets worse. How do we determine when to stop?

    Those familiar with game theory may recognize this setup as a zero-sum game—a situation in which one player’s gains equal the other player’s losses. When one player improves by a certain amount, the other player worsens by the same amount. All zero-sum games have a Nash equilibrium, a point at which neither player can improve their situation or payoff by changing their actions.

    GAN reaches Nash equilibrium when the following conditions are met:

    • The Generator produces fake examples that are indistinguishable from the real data in the training dataset.
    • The Discriminator can at best randomly guess whether a particular example is real or fake (that is, make a 50/50 guess whether an example is real).

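    Training stops when Nash equilibrium is reached, which requires the following:

    • The fake images produced by the Generator are hard to distinguish from the real images in the training dataset
    • The Discriminator assigns a 50% probability of being real to every image, real or fake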

    NOTE
    Nash equilibrium is named after the American economist and mathematician John Forbes Nash Jr., whose life story and career were captured in the biography titled A Beautiful Mind and inspired the eponymous film.

    Let us convince you of why this is the case. When each of the fake examples (x*) is truly indistinguishable from the real examples (x) coming from the training dataset, there is nothing the Discriminator can use to tell them apart from one another. Because half of the examples it receives are real and half are fake, the best the Discriminator can do is to flip a coin and classify each example as real or fake with 50% probability.

    The Generator is likewise at a point where it has nothing to gain from further tuning. Because the examples it produces are already indistinguishable from the real ones, even a tiny change to the process it uses to turn the random noise vector (z) into a fake example (x*) may give the Discriminator a cue for how to discern the fake example from the real data, making the Generator worse off.

    The passage above explains why, at Nash equilibrium, neither the Discriminator nor the Generator can improve any further.
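
    The 50/50 argument can also be checked numerically. A known result from the original GAN paper (Goodfellow et al., 2014), which the text above does not state, is that for a fixed Generator the best possible Discriminator outputs D*(x) = p_data(x) / (p_data(x) + p_g(x)), where p_data and p_g are the densities of real and generated examples. The sketch below assumes both densities are one-dimensional Gaussians; the function names are hypothetical.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def optimal_discriminator(x, real, fake):
    """D*(x) = p_data(x) / (p_data(x) + p_g(x)).

    `real` and `fake` are (mean, std) pairs. Assumes x is close enough
    to at least one mean that the two densities do not both underflow.
    """
    p_real = gaussian_pdf(x, *real)
    p_fake = gaussian_pdf(x, *fake)
    return p_real / (p_real + p_fake)


if __name__ == "__main__":
    # Generator still far from the data: real examples are easy to spot.
    print(optimal_discriminator(4.0, real=(4.0, 1.0), fake=(0.0, 1.0)))
    # Generator matches the data: the best possible guess is a coin flip.
    print(optimal_discriminator(4.0, real=(4.0, 1.0), fake=(4.0, 1.0)))
```

    When the two distributions coincide, the ratio is one half at every point, which is exactly the coin-flip behavior described above.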

    With equilibrium achieved, the GAN is said to have converged. Here is where it gets tricky. In practice, it is nearly impossible to find the Nash equilibrium for GANs because of the immense complexities involved in reaching convergence in nonconvex games (more on convergence in later chapters, particularly chapter 5). Indeed, GAN convergence remains one of the most important open questions in GAN research.

    Fortunately, this has not impeded GAN research or the many innovative applications of generative adversarial learning. Even in the absence of rigorous mathematical guarantees, GANs have achieved remarkable empirical results. This book covers a selection of the most impactful ones, and the following section previews some of them.

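    In practice, reaching Nash equilibrium (GAN convergence) is very hard, and it remains one of the pressing open problems in GAN research. That has not stopped GANs from achieving remarkable results in research and applications.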

    1.4. Why study GANs?

    Since their invention, GANs have been hailed by academics and industry experts as one of the most consequential innovations in deep learning. Yann LeCun, the director of AI research at Facebook, went so far as to say that GANs and their variations are “the coolest idea in deep learning in the last 20 years.”[2]
    [2]: See “Google’s Dueling Neural Networks Spar to Get Smarter,” by Cade Metz, Wired, 2017, http://mng.bz/KE1X.

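    Since their invention, GANs have been hailed by academics and industry experts as one of the most important innovations in deep learning. Yann LeCun, head of AI research at Facebook, even said that GANs and their variants are “the coolest idea in deep learning in the last 20 years.”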

    The excitement is well justified. Unlike other advancements in machine learning that may be household names among researchers but would elicit no more than a quizzical look from anyone else, GANs have captured the imagination of researchers and the wider public alike. They have been covered by the New York Times, the BBC, Scientific American, and many other prominent media outlets. Indeed, it was one of those exciting GAN results that probably drove you to buy this book in the first place. (Right?)

    GANs have captured the imagination of researchers and casual onlookers alike and have been covered by major media outlets.

    Perhaps most notable is the capacity of GANs to create hyperrealistic imagery. None of the faces in figure 1.4 belongs to a real human; they are all fake, showcasing GANs’ ability to synthesize images with photorealistic quality. The faces were produced using Progressive GANs, a technique covered in chapter 6.

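    GANs can generate hyperrealistic images: hard as it is to believe, every face in figure 1.4 is fake. They were generated with Progressive GANs, covered in chapter 6.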

    Figure 1.4. These photorealistic but fake human faces were synthesized by a Progressive GAN trained on high-resolution portrait photos of celebrities.


    (Source: “Progressive Growing of GANs for Improved Quality, Stability, and Variation,” by Tero Karras et al., 2017, https://arxiv.org/abs/1710.10196.)

    Another remarkable GAN achievement is image-to-image translation. Similarly to the way a sentence can be translated from, say, Chinese to Spanish, GANs can translate an image from one domain to another. As shown in figure 1.5, GANs can turn an image of a horse into an image of a zebra (and back!), and a photo into a Monet-like painting, all with virtually no supervision and no labels whatsoever. The GAN variant that made this possible is called CycleGAN; you’ll learn all about it in chapter 9.

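    Another application of GANs is image-to-image translation. As figure 1.5 shows, a GAN can turn an image of a horse into a zebra (or the reverse) and render an image in the style of Monet. This is done with CycleGAN, covered in chapter 9.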

    Figure 1.5. By using a GAN variant called CycleGAN, we can turn a Monet painting into a photograph or turn an image of a zebra into a depiction of a horse, and vice versa.


    (Source: “Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks,” by Jun-Yan Zhu et al., 2017, https://arxiv.org/abs/1703.10593.)

    The more practically minded GAN use cases are just as fascinating. The online giant Amazon is experimenting with harnessing GANs for fashion recommendations: by analyzing countless outfits, the system learns to produce new items matching any given style.[3] In medical research, GANs are used to augment datasets with synthetic examples to improve diagnostic accuracy.[4] In chapter 11—after you’ve mastered the ins and outs of training GANs and their variants—you’ll explore both of these applications in detail.
    [3]: See “Amazon Has Developed an AI Fashion Designer,” by Will Knight, MIT Technology Review, 2017, http://mng.bz/9wOj.
    [4]: See “Synthetic Data Augmentation Using GAN for Improved Liver Lesion Classification,” by Maayan Frid-Adar et al., 2018, https://arxiv.org/abs/1801.02385.

    Amazon is using GANs for fashion design, and GANs are also used in medical research to improve diagnostic accuracy. Chapter 11 covers both.

    GANs are also seen as an important stepping stone toward achieving artificial general intelligence,[5] an artificial system capable of matching human cognitive capacity to acquire expertise in virtually any domain—from motor skills involved in walking, to language, to creative skills needed to compose sonnets.
    [5]: See “OpenAI Founder: Short-Term AGI Is a Serious Possibility,” by Tony Peng, Synced, 2018, http://mng.bz/j5Oa. See also “A Path to Unsupervised Learning Through Adversarial Networks,” by Soumith Chintala, f Code, 2016, http://mng.bz/WOag.

    GANs are seen as an important stepping stone toward artificial general intelligence.

    But with the ability to generate new data and imagery, GANs also have the capacity to be dangerous. Much has been discussed about the spread and dangers of fake news, but the potential of GANs to create credible fake footage is disturbing. At the end of an aptly titled 2018 piece about GANs—“How an A.I. ‘Cat-and-Mouse Game’ Generates Believable Fake Photos”—the New York Times journalists Cade Metz and Keith Collins discuss the worrying prospect of GANs being exploited to create and spread convincing misinformation, including fake video footage of statements by world leaders. Martin Giles, the San Francisco bureau chief of MIT Technology Review, echoes their concern and mentions another potential risk in his 2018 article “The GANfather: The Man Who’s Given Machines the Gift of Imagination”: in the hands of skilled hackers, GANs can be used to intuit and exploit system vulnerabilities at an unprecedented scale. These concerns are what motivated us to discuss the ethical considerations of GANs in chapter 12.

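    GANs can be used to fabricate fake images and video and to mount cyberattacks, which is cause for concern. Chapter 12 discusses these considerations.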

    GANs can do much good for the world, but all technological innovations have misuses. Here the philosophy has to be one of awareness: because it is impossible to “uninvent” a technique, it is crucial to make sure people like you are aware of this technique’s rapid emergence and its substantial potential.

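    Technology is a double-edged sword. Since we cannot stop it from arriving, we should understand its potential and put it to work for the good of the world.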

    In this book, we are only able to scratch the surface of what is possible with GANs. However, we hope that this book will provide you with the necessary theoretical knowledge and practical skills to continue exploring any facet of this field that you find most interesting.

    So, without further ado, let’s dive in!

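    This book explores only the tip of the GAN iceberg; the hope is that it gives you the knowledge and skills you need to keep exploring the areas that interest you. With that, let's get started!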

    Summary

    • GANs are a deep learning technique that uses a competitive dynamic between two neural networks to synthesize realistic data samples, such as fake photorealistic imagery. The two networks that constitute a GAN are as follows:
      • The Generator, whose goal is to fool the Discriminator by producing data indistinguishable from the training dataset
      • The Discriminator, whose goal is to correctly distinguish between real data coming from the training dataset and the fake data produced by the Generator
    • GANs have extensive applications across many different sectors, such as fashion, medicine, and cybersecurity.
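    • A GAN synthesizes realistic data samples, such as images, by pitting two neural networks against each other. It has two parts: the Generator and the Discriminator.
    • GANs are widely applied in many fields, such as fashion, medicine, and cybersecurity.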