• Testing DCGAN-tensorflow on MNIST and on a custom dataset

Reference code: https://github.com/carpedm20/DCGAN-tensorflow

Part 1: Handwritten digit test

1) Download the dataset:
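The post doesn't say where to get the files. As a hedged sketch (not part of the original post): the four idx files have long been hosted at yann.lecun.com/exdb/mnist, and one way to fetch and unpack them is:

    import gzip
    import os
    import urllib.request

    # Assumption: the classic LeCun host is still reachable; use a mirror if not.
    base = "http://yann.lecun.com/exdb/mnist/"
    names = ["train-images-idx3-ubyte.gz", "train-labels-idx1-ubyte.gz",
             "t10k-images-idx3-ubyte.gz", "t10k-labels-idx1-ubyte.gz"]
    out_dir = os.path.join("data", "mnist")
    os.makedirs(out_dir, exist_ok=True)
    for name in names:
        gz_path = os.path.join(out_dir, name)
        urllib.request.urlretrieve(base + name, gz_path)
        with gzip.open(gz_path, "rb") as fin, open(gz_path[:-3], "wb") as fout:
            fout.write(fin.read())  # write the raw idx file next to the .gz

Note that the unpacked names use dashes (train-images-idx3-ubyte) while the load_mnist code below opens dot-separated names (train-images.idx3-ubyte); rename one or the other so they agree.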

2) Put it under the ./data directory and modify the code in model.py:

    def load_mnist(self):
        data_dir = os.path.join("./data", self.dataset_name)

        # the idx files are raw binary, so open them in binary mode
        fd = open(os.path.join(data_dir, 'train-images.idx3-ubyte'), 'rb')
        loaded = np.fromfile(file=fd, dtype=np.uint8)
        trX = loaded[16:].reshape((60000, 28, 28, 1)).astype(float)  # skip the 16-byte header

        fd = open(os.path.join(data_dir, 'train-labels.idx1-ubyte'), 'rb')
        loaded = np.fromfile(file=fd, dtype=np.uint8)
        trY = loaded[8:].reshape((60000)).astype(float)  # skip the 8-byte header

        fd = open(os.path.join(data_dir, 't10k-images.idx3-ubyte'), 'rb')
        loaded = np.fromfile(file=fd, dtype=np.uint8)
        teX = loaded[16:].reshape((10000, 28, 28, 1)).astype(float)

        fd = open(os.path.join(data_dir, 't10k-labels.idx1-ubyte'), 'rb')
        loaded = np.fromfile(file=fd, dtype=np.uint8)
        teY = loaded[8:].reshape((10000)).astype(float)
        # (in the reference repo, load_mnist goes on to concatenate the two splits,
        # shuffle them with a fixed seed, one-hot encode the labels, and return them)

Then start training:

    python main.py --dataset mnist --input_height=28 --output_height=28 --train

3) Results:

4) Generated images:

Part 2: Training on your own data

1) Prepare the data and place it in ./data

2) Start training:

    python main.py --input_height 96 --input_width 96 --output_height 48 --output_width 48 --dataset anime --crop --train --epoch 2 --input_fname_pattern "*.jpg"

I only trained for 2 epochs.

• Training a DCGAN on MNIST with PyTorch to generate handwritten digits

[figure]

(Left: real images from the dataset; right: fake images produced by the generator)


1. Learning objective

This tutorial shows you how to train a DCGAN on the MNIST dataset to generate handwritten digits.

2. Environment setup

2.1. Python

Install it by following the official website.

2.2. PyTorch

Install it by following the official website.

2.3. Jupyter Notebook

    pip install jupyter
    

    2.4. Matplotlib

    pip install matplotlib
    

3. Implementation

3.1. Import modules

    import time
    import torch
    import torch.nn as nn
    from torch.utils.data import DataLoader
    from torchvision import utils, datasets, transforms
    import matplotlib.pyplot as plt
    import matplotlib.animation as animation
    from IPython.display import HTML
    

3.2. Set the random seed

Fix the random seed so the experimental results can be reproduced.

    torch.manual_seed(0)
    

3.3. Hyperparameter configuration

• dataroot: path to the folder that holds the dataset
• workers: number of worker threads the data loader uses
• batch_size: batch size used during training
• image_size: spatial size of the training images. Defaults to 64x64; other sizes require changing the structure of D and G (see the sketch after this list)
• nc: number of channels in the input images; 3 for color images
• nz: length of the latent vector
• ngf: relates to the depth of the feature maps carried through the generator
• ndf: sets the depth of the feature maps propagated through the discriminator
• num_epochs: total number of training epochs. More epochs may give better results but take longer
• lr: learning rate. The DCGAN paper uses 0.0002
• beta1: the beta1 parameter of the Adam optimizer; 0.5 in the paper
• ngpu: number of available GPUs. 0 runs in CPU mode; a value greater than 0 runs on that many GPUs
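The image_size bullet above says other resolutions require changing D and G. As a hedged illustration (not from the original tutorial), a 128x128 generator needs one extra stride-2 ConvTranspose2d stage compared with the 64x64 one used below, and the discriminator needs a mirror-image extra Conv2d stage:

    import torch.nn as nn

    nz, ngf, nc = 100, 64, 1  # same meanings as the hyperparameters below

    # Hypothetical 128x128 generator: each stride-2 stage doubles the spatial
    # size, so reaching 128 from the initial 4x4 map takes five doublings.
    netG128 = nn.Sequential(
        nn.ConvTranspose2d(nz, ngf * 16, 4, 1, 0, bias=False),       # 1x1 -> 4x4
        nn.BatchNorm2d(ngf * 16),
        nn.ReLU(True),
        nn.ConvTranspose2d(ngf * 16, ngf * 8, 4, 2, 1, bias=False),  # 8x8
        nn.BatchNorm2d(ngf * 8),
        nn.ReLU(True),
        nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),   # 16x16
        nn.BatchNorm2d(ngf * 4),
        nn.ReLU(True),
        nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False),   # 32x32
        nn.BatchNorm2d(ngf * 2),
        nn.ReLU(True),
        nn.ConvTranspose2d(ngf * 2, ngf, 4, 2, 1, bias=False),       # 64x64
        nn.BatchNorm2d(ngf),
        nn.ReLU(True),
        nn.ConvTranspose2d(ngf, nc, 4, 2, 1, bias=False),            # 128x128
        nn.Tanh()
    )
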
    # Root directory for dataset
    dataroot = "data/mnist"
    
    # Number of workers for dataloader
    workers = 10
    
    # Batch size during training
    batch_size = 100
    
    # Spatial size of training images. All images will be resized to this size using a transformer.
    image_size = 64
    
    # Number of channels in the training images. For color images this is 3
    nc = 1
    
    # Size of z latent vector (i.e. size of generator input)
    nz = 100
    
    # Size of feature maps in generator
    ngf = 64
    
    # Size of feature maps in discriminator
    ndf = 64
    
    # Number of training epochs
    num_epochs = 10
    
    # Learning rate for optimizers
    lr = 0.0002
    
    # Beta1 hyperparam for Adam optimizers
    beta1 = 0.5
    
    # Number of GPUs available. Use 0 for CPU mode.
    ngpu = 1
    

3.4. Dataset

We use the MNIST dataset: 60,000 training images and 10,000 test images. Since this is a DCGAN generation task rather than classification, there is no need for a train/test split, so all of the images can be used.

    train_data = datasets.MNIST(
        root=dataroot,
        train=True,
        transform=transforms.Compose([
            transforms.Resize(image_size),
            transforms.ToTensor(),
            transforms.Normalize((0.5,), (0.5,))
        ]),
        download=True
    )
    test_data = datasets.MNIST(
        root=dataroot,
        train=False,
        transform=transforms.Compose([
            transforms.Resize(image_size),
            transforms.ToTensor(),
            transforms.Normalize((0.5,), (0.5,))
        ])
    )
    dataset = train_data+test_data
    print(f'Total Size of Dataset: {len(dataset)}')
    

Output:

    Total Size of Dataset: 70000
    

3.5. Data loader

    dataloader = DataLoader(
        dataset=dataset,
        batch_size=batch_size,
        shuffle=True,
        num_workers=workers
    )
    

3.6. Select the training device

Check whether CUDA is available; if so, use it for acceleration, otherwise train on the CPU.

    device = torch.device('cuda:0' if (torch.cuda.is_available() and ngpu > 0) else 'cpu')
    

3.7. Visualize the training data

    inputs = next(iter(dataloader))[0]
    plt.figure(figsize=(10,10))
    plt.title("Training Images")
    plt.axis('off')
    inputs = utils.make_grid(inputs[:100]*0.5+0.5, nrow=10)
    plt.imshow(inputs.permute(1, 2, 0))
    

[figure]

3.8. Weight initialization

The DCGAN paper states that all model weights should be randomly initialized from a normal distribution with mean 0 and standard deviation 0.02.

    def weights_init(m):
        classname = m.__class__.__name__
        if classname.find('Conv') != -1:
            nn.init.normal_(m.weight.data, 0.0, 0.02)
        elif classname.find('BatchNorm') != -1:
            nn.init.normal_(m.weight.data, 1.0, 0.02)
            nn.init.constant_(m.bias.data, 0)
    

3.9. Generator

3.9.1. Generator architecture

[figure]

3.9.2. Build the Generator class

    class Generator(nn.Module):
        def __init__(self, ngpu):
            super(Generator, self).__init__()
            self.ngpu = ngpu
            self.main = nn.Sequential(
                # input is Z, going into a convolution
                nn.ConvTranspose2d(nz, ngf * 8, 4, 1, 0, bias=False),
                nn.BatchNorm2d(ngf * 8),
                nn.ReLU(True),
                # state size. (ngf*8) x 4 x 4
                nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),
                nn.BatchNorm2d(ngf * 4),
                nn.ReLU(True),
                # state size. (ngf*4) x 8 x 8
                nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False),
                nn.BatchNorm2d(ngf * 2),
                nn.ReLU(True),
                # state size. (ngf*2) x 16 x 16
                nn.ConvTranspose2d(ngf * 2, ngf, 4, 2, 1, bias=False),
                nn.BatchNorm2d(ngf),
                nn.ReLU(True),
                # state size. (ngf) x 32 x 32
                nn.ConvTranspose2d(ngf, nc, 4, 2, 1, bias=False),
                nn.Tanh()
                # state size. (nc) x 64 x 64
            )
    
        def forward(self, input):
            return self.main(input)
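
As a quick sanity check (a hedged addition, not part of the original tutorial), a batch of latent vectors should come out of the generator as 64x64 single-channel images:

    import torch

    g = Generator(ngpu=0)         # CPU instance; uses nz, ngf, nc from above
    z = torch.randn(4, nz, 1, 1)  # a batch of 4 latent vectors
    print(g(z).shape)             # torch.Size([4, 1, 64, 64]) since nc = 1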
    

3.9.3. Instantiate the generator

    # Create the generator
    netG = Generator(ngpu).to(device)

    # Handle multi-GPU if desired
    if device.type == 'cuda' and ngpu > 1:
        netG = nn.DataParallel(netG, list(range(ngpu)))

    # Apply the weights_init function to randomly initialize all weights to mean=0, stdev=0.02.
    netG.apply(weights_init)
    

3.10. Discriminator

3.10.1. Discriminator architecture

[figure]

3.10.2. Build the Discriminator class

    class Discriminator(nn.Module):
        def __init__(self, ngpu):
            super(Discriminator, self).__init__()
            self.ngpu = ngpu
            self.main = nn.Sequential(
                # input is (nc) x 64 x 64
                nn.Conv2d(nc, ndf, 4, 2, 1, bias=False),
                nn.LeakyReLU(0.2, inplace=True),
                # state size. (ndf) x 32 x 32
                nn.Conv2d(ndf, ndf * 2, 4, 2, 1, bias=False),
                nn.BatchNorm2d(ndf * 2),
                nn.LeakyReLU(0.2, inplace=True),
                # state size. (ndf*2) x 16 x 16
                nn.Conv2d(ndf * 2, ndf * 4, 4, 2, 1, bias=False),
                nn.BatchNorm2d(ndf * 4),
                nn.LeakyReLU(0.2, inplace=True),
                # state size. (ndf*4) x 8 x 8
                nn.Conv2d(ndf * 4, ndf * 8, 4, 2, 1, bias=False),
                nn.BatchNorm2d(ndf * 8),
                nn.LeakyReLU(0.2, inplace=True),
                # state size. (ndf*8) x 4 x 4
                nn.Conv2d(ndf * 8, 1, 4, 1, 0, bias=False),
                # state size. (1) x 1 x 1
                nn.Sigmoid()
            )
    
        def forward(self, input):
            return self.main(input)
    

3.10.3. Instantiate the discriminator

    # Create the Discriminator
    netD = Discriminator(ngpu).to(device)
    
    # Handle multi-gpu if desired
    if device.type == 'cuda' and ngpu > 1:
        netD = nn.DataParallel(netD, list(range(ngpu)))
    
    # Apply the weights_init function to randomly initialize all weights to mean=0, stdev=0.02.
    netD.apply(weights_init)
    

3.11. Loss function and optimizers

    # Initialize BCELoss function
    criterion = nn.BCELoss()
    
    # Create batch of latent vectors that we will use to visualize the progression of the generator
    fixed_noise = torch.randn(100, nz, 1, 1, device=device)
    
    # Establish convention for real and fake labels during training
    real_label = 1.
    fake_label = 0.
    
    # Setup Adam optimizers for both G and D
    optimizerD = torch.optim.Adam(netD.parameters(), lr=lr, betas=(beta1, 0.999))
    optimizerG = torch.optim.Adam(netG.parameters(), lr=lr, betas=(beta1, 0.999))
    

3.12. Train the model

    # Training Loop
    
    # Lists to keep track of progress
    img_list = []
    G_losses = []
    D_losses = []
    D_x_list = []
    D_z_list = []
    loss_tep = 10  # tracks the lowest generator loss seen so far; start high so the first save happens
    
    print("Starting Training Loop...")
    # For each epoch
    for epoch in range(num_epochs):
        beg_time = time.time()
        # For each batch in the dataloader
        for i, data in enumerate(dataloader):
    
            ############################
            # (1) Update D network: maximize log(D(x)) + log(1 - D(G(z)))
            ###########################
            ## Train with all-real batch
            netD.zero_grad()
            # Format batch
            real_cpu = data[0].to(device)
            b_size = real_cpu.size(0)
            label = torch.full((b_size,), real_label, dtype=torch.float, device=device)
    
            # Forward pass real batch through D
            output = netD(real_cpu).view(-1)
    
            # Calculate loss on all-real batch
            errD_real = criterion(output, label)
            # Calculate gradients for D in backward pass
            errD_real.backward()
            D_x = output.mean().item()
    
            ## Train with all-fake batch
            # Generate batch of latent vectors
            noise = torch.randn(b_size, nz, 1, 1, device=device)
            # Generate fake image batch with G
            fake = netG(noise)
            label.fill_(fake_label)
            # Classify all fake batch with D
            output = netD(fake.detach()).view(-1)
            # Calculate D's loss on the all-fake batch
            errD_fake = criterion(output, label)
            # Calculate the gradients for this batch
            errD_fake.backward()
            D_G_z1 = output.mean().item()
            # Add the gradients from the all-real and all-fake batches
            errD = errD_real + errD_fake
            # Update D
            optimizerD.step()
    
            ############################
            # (2) Update G network: maximize log(D(G(z)))
            ###########################
            netG.zero_grad()
            label.fill_(real_label)  # fake labels are real for generator cost
            # Since we just updated D, perform another forward pass of all-fake batch through D
            output = netD(fake).view(-1)
            # Calculate G's loss based on this output
            errG = criterion(output, label)
            # Calculate gradients for G
            errG.backward()
            D_G_z2 = output.mean().item()
            # Update G
            optimizerG.step()
    
            # Output training stats
            end_time = time.time()
            run_time = round(end_time-beg_time)
            print(
                f'Epoch: [{epoch+1:0>{len(str(num_epochs))}}/{num_epochs}]',
                f'Step: [{i+1:0>{len(str(len(dataloader)))}}/{len(dataloader)}]',
                f'Loss-D: {errD.item():.4f}',
                f'Loss-G: {errG.item():.4f}',
                f'D(x): {D_x:.4f}',
                f'D(G(z)): [{D_G_z1:.4f}/{D_G_z2:.4f}]',
                f'Time: {run_time}s',
                end='\r'
            )
    
            # Save Losses for plotting later
            G_losses.append(errG.item())
            D_losses.append(errD.item())
            
            # Save D(X) and D(G(z)) for plotting later
            D_x_list.append(D_x)
            D_z_list.append(D_G_z2)
            
            # Save the Best Model
            if errG < loss_tep:
                torch.save(netG.state_dict(), 'model.pt')
                loss_tep = errG
            
        # Check how the generator is doing by saving G's output on fixed_noise
        with torch.no_grad():
            fake = netG(fixed_noise).detach().cpu()
        img_list.append(utils.make_grid(fake*0.5+0.5, nrow=10))
        print()
    

Output:

    Starting Training Loop...
    Epoch: [01/10] Step: [700/700] Loss-D: 0.6744 Loss-G: 1.0026 D(x): 0.5789 D(G(z)): [0.0638/0.4204] Time: 114s
    Epoch: [02/10] Step: [700/700] Loss-D: 2.2584 Loss-G: 3.7674 D(x): 0.9661 D(G(z)): [0.8334/0.0440] Time: 166s
    Epoch: [03/10] Step: [700/700] Loss-D: 1.2438 Loss-G: 0.8505 D(x): 0.4126 D(G(z)): [0.1107/0.4717] Time: 166s
    Epoch: [04/10] Step: [700/700] Loss-D: 0.3479 Loss-G: 2.5261 D(x): 0.8771 D(G(z)): [0.1796/0.0980] Time: 166s
    Epoch: [05/10] Step: [700/700] Loss-D: 0.6771 Loss-G: 3.8938 D(x): 0.9139 D(G(z)): [0.3889/0.0277] Time: 161s
    Epoch: [06/10] Step: [700/700] Loss-D: 0.2697 Loss-G: 3.8211 D(x): 0.9490 D(G(z)): [0.1823/0.0282] Time: 166s
    Epoch: [07/10] Step: [700/700] Loss-D: 0.2874 Loss-G: 2.1176 D(x): 0.8062 D(G(z)): [0.0494/0.1503] Time: 180s
    Epoch: [08/10] Step: [700/700] Loss-D: 0.7798 Loss-G: 1.6315 D(x): 0.5978 D(G(z)): [0.1508/0.2463] Time: 171s
    Epoch: [09/10] Step: [700/700] Loss-D: 0.3052 Loss-G: 0.8984 D(x): 0.7611 D(G(z)): [0.0023/0.4799] Time: 165s
    Epoch: [10/10] Step: [700/700] Loss-D: 1.1115 Loss-G: 2.2473 D(x): 0.7334 D(G(z)): [0.4824/0.1385] Time: 157s
    

3.13. Loss during training

    plt.figure(figsize=(20, 10))
    plt.title("Generator and Discriminator Loss During Training")
    plt.plot(G_losses[::100], label="G")
    plt.plot(D_losses[::100], label="D")
    plt.xlabel("iterations")
    plt.ylabel("Loss")
    plt.axhline(y=0, label="0", c="g") # asymptote
    plt.legend()
    

[figure]

3.14. D(x) and D(G(z)) during training

    plt.figure(figsize=(20, 10))
    plt.title("D(x) and D(G(z)) During Training")
    plt.plot(D_x_list[::100], label="D(x)")
    plt.plot(D_z_list[::100], label="D(G(z))")
    plt.xlabel("iterations")
    plt.ylabel("Probability")
    plt.axhline(y=0.5, label="0.5", c="g") # asymptote
    plt.legend()
    

[figure]

3.15. Visualize G's training progress

    fig = plt.figure(figsize=(10, 10))
    plt.axis("off")
    ims = [[plt.imshow(item.permute(1, 2, 0), animated=True)] for item in img_list]
    ani = animation.ArtistAnimation(fig, ims, interval=1000, repeat_delay=1000, blit=True)
    HTML(ani.to_jshtml())
    

[figure]

4. Real vs. fake images

    # Size of the Figure
    plt.figure(figsize=(20,10))
    
    # Plot the real images
    plt.subplot(1,2,1)
    plt.axis("off")
    plt.title("Real Images")
    real = next(iter(dataloader))
    plt.imshow(utils.make_grid(real[0][:100]*0.5+0.5, nrow=10).permute(1, 2, 0))
    
    # Load the Best Generative Model
    netG = Generator(0)
    netG.load_state_dict(torch.load('model.pt', map_location=torch.device('cpu')))
    netG.eval()
    
    # Generate the Fake Images
    with torch.no_grad():
        fake = netG(fixed_noise.cpu())
    
    # Plot the fake images
    plt.subplot(1,2,2)
    plt.axis("off")
    plt.title("Fake Images")
    fake = utils.make_grid(fake*0.5+0.5, nrow=10)
    plt.imshow(fake.permute(1, 2, 0))
    
    # Save the comparison result
    plt.savefig('result.jpg', bbox_inches='tight')
    

[figure]

(Left: real images from the dataset; right: fake images produced by the generator)

5. Full code

    https://github.com/XavierJiezou/pytorch-dcgan-mnist

6. Original papers

GAN paper: https://papers.nips.cc/paper/5423-generative-adversarial-nets.pdf
DCGAN paper: https://arxiv.org/abs/1511.06434

7. References

    https://blog.csdn.net/qq_42951560/article/details/110308336

8. Feedback

If the code throws errors or you run into other problems, you can reach me on WeChat: wxhghgxj

• A DCGAN handwritten digit demo


There are already plenty of articles online explaining the paper and the underlying theory, so I won't repeat them here.
Paper: Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks
Paper walkthrough: Deep Convolutional Generative Adversarial Networks (DCGAN) (in my view the best one)
Implementations in various frameworks:
[theano] https://github.com/Newmu/dcgan_code
[tensorflow] https://github.com/carpedm20/DCGAN-tensorflow
[keras] https://github.com/jacobgil/keras-dcgan
[torch] https://github.com/soumith/dcgan.torch
Below is a demo built on the MNIST dataset with the TensorFlow framework:

    # -*- coding:utf-8 -*-

    '''
    Overall structure
        1. Get the data
            a. image data
            b. random vectors
        2. Build the computation graph
            a. generator
            b. discriminator
            c. DCGAN
                connect the generator and the discriminator
                define the loss
                define train_op
        3. Run the training loop
    '''

    import os
    import tensorflow as tf
    from tensorflow import gfile
    import numpy as np
    from PIL import Image
    from tensorflow.examples.tutorials.mnist import input_data

    # get the data
    mnist = input_data.read_data_sets('MNIST_data/', one_hot = True)
    # directory where generated images are saved
    output_dir = './local_run'
    if not gfile.Exists(output_dir):
        gfile.MakeDirs(output_dir)

    def get_default_params():
        '''
        Default hyperparameters
        :return:
        '''
        return tf.contrib.training.HParams(
            z_dim = 100, # length of the random vector
            init_conv_size = 4, # size of the initial feature map the random vector is projected to
            g_channels = [128, 64, 32, 1], # channels of each transposed-conv layer in the generator
            d_channels = [32, 64, 128, 256], # channels of each conv layer in the discriminator
            # note: every discriminator layer uses stride 2, so each layer halves
            # the feature map size and doubles the channel count
            batch_size = 128,
            learning_rate = 0.002, # learning rate
            beta1 = 0.5, # AdamOptimizer parameter
            img_size = 32, # size of the generated target image
            # from the initial 4x4 feature map to the final 32x32 image: x 2^3
        )

    hps = get_default_params()

    # print(hps)
    # print(hps.img_size) # 32
    # print(hps.g_channels) # [128, 64, 32, 1]
    # print(mnist.train.images.shape) # (55000, 784)
    
    class MnistData(object):
        '''
        Serves image data and random vectors in batches
        '''
        def __init__(self, mnist_train, z_dim, img_size):
            '''
            :param mnist_train: the dataset
            :param z_dim: length of the random vector
            :param img_size: target image size
            '''

            self._data = mnist_train
            self._example_num = len(mnist_train) # dataset size
            # z_data holds the random vectors, shape [self._example_num, z_dim]
            self._z_data = np.random.standard_normal((self._example_num, z_dim))
            # np.random.standard_normal() draws standard normal samples
            self._indicator = 0 # current batch start position
            self._resize_mnist_img(img_size) # originals are 28x28, so resize to 32x32
            self._random_shuffle() # shuffle the dataset

        def _random_shuffle(self):
            '''
            Randomly shuffle the dataset
            :return:
            '''
            p = np.random.permutation(self._example_num)
            self._z_data = self._z_data[p] # shuffle the random vectors
            self._data = self._data[p] # shuffle the images

        def _resize_mnist_img(self, img_size):
            '''
            resize mnist image to goal img_size
            what shall we do ?
            1. numpy -> PIL img
            2. PIL img -> resize
            3. PIL img -> numpy
            '''
            # the data currently lies in [0, 1] (tf normalized it for us),
            # so map it back to [0, 255]
            data = np.asarray(self._data * 255, np.uint8)
            # reshape [example_num, 784] -> [example_num, 28, 28]
            data = data.reshape((self._example_num, 28, 28))
            new_data = []

            for i in range(self._example_num):
                img = data[i]
                img = Image.fromarray(img) # numpy -> PIL Image
                img = img.resize((img_size, img_size)) # resize via the Image object
                img = np.asarray(img) # PIL Image -> numpy
                img = img.reshape((img_size, img_size, 1)) # add the grayscale channel dim
                new_data.append(img) # collect each image's array in a list
            new_data = np.asarray(new_data, dtype=np.float32) # stack the list into one array
            new_data = new_data / 127.5 - 1 # normalize to [-1, 1], matching tanh
            # [new_example, img_size, img_size, 1]
            self._data = new_data # now shaped (55000, 32, 32, 1)


        def next_batch(self, batch_size):
            '''
            Get one batch
            :param batch_size: batch size
            :return:
            '''
            end_indicator = self._indicator + batch_size
            if end_indicator > self._example_num:
                self._random_shuffle()
                self._indicator = 0
                end_indicator = self._indicator + batch_size
            assert end_indicator < self._example_num

            batch_data = self._data[self._indicator:end_indicator]
            batch_z = self._z_data[self._indicator:end_indicator]
            self._indicator = end_indicator

            return batch_data, batch_z


    mnist_data = MnistData(mnist.train.images, hps.z_dim, hps.img_size)
    batch_data, batch_z = mnist_data.next_batch(5)
    
    
    def conv2d_transpose(inputs, out_channel, name, training, with_bn_relu = True):
        '''
        Transposed-convolution block for the generator
        :param inputs: input tensor
        :param out_channel: number of output channels
        :param name: variable scope name
        :param training: passed to batch norm
        :param with_bn_relu: the last layer needs neither bn nor relu, hence this flag
        :return: output tensor
        '''
        with tf.variable_scope(name):
            conv2d_trans = tf.layers.conv2d_transpose(inputs,
                                                      out_channel,
                                                      [5, 5],
                                                      strides=(2, 2),
                                                      padding='SAME',
                                                      )
            # on tf.layers.conv2d_transpose, see:
            # https://www.w3cschool.cn/tensorflow_python/tensorflow_python-wfg62t8h.html

            # apply bn + relu unless this is the output layer
            if with_bn_relu:
                bn = tf.layers.batch_normalization(conv2d_trans,
                                                   training = training
                                                   )
                relu = tf.nn.relu(bn)
                return relu
            else:
                return conv2d_trans

    def conv2d(inputs, out_channel, name, training):
        '''
        Convolution block for the discriminator
        :param inputs: input tensor
        :param out_channel: number of output channels
        :param name: variable scope name
        :param training: passed to batch norm
        :return:
        '''
        # this network uses leaky_relu
        def leaky_relu(x, leak=0.2, name=''):
            # for x > 0 the result is x; for x < 0, x * leak is the larger value
            return tf.maximum(x, x * leak, name = name)

        with tf.variable_scope(name):
            conv2d_output = tf.layers.conv2d(
                inputs,
                out_channel,
                [5, 5],
                strides = (2, 2),
                padding = 'SAME'
            )
            bn = tf.layers.batch_normalization(
                conv2d_output,
                training=training
            )
            return leaky_relu(bn, name='outputs')
    
    
    class Generator():
        '''Generator'''

        def __init__(self, channels, init_conv_size):
            '''
            :param channels: output channels of each layer
            :param init_conv_size: size of the initial feature map
            '''
            self._channels = channels
            self._init_conv_size = init_conv_size
            self._reuse = False
            # the generator (and discriminator) may be called several times after
            # the graph is built, so its variables need to be reused

        def __call__(self, inputs, training):
            '''
            __call__ magic method: lets the object be used like a function
            :param inputs:
            :param training:
            :return:
            '''
            inputs = tf.convert_to_tensor(inputs) # turn the random vector into a tensor
            with tf.variable_scope('generator', reuse = self._reuse):
                # reuse is False on the first build and set to True afterwards
                '''
                random_vector -> fc -> self._channels[0] * init_conv_size**2 (still a 1-D vector)
                -> reshape -> [init_conv_size, init_conv_size, channels[0]]
                '''
                with tf.variable_scope('input_conv'):
                    # 1. project the random vector with a fully connected layer
                    fc = tf.layers.dense(
                        inputs,
                        self._channels[0] * self._init_conv_size * self._init_conv_size,
                    )
                    # 2. reshape the resulting vector
                    conv0 = tf.reshape(fc, [-1, self._init_conv_size, self._init_conv_size, self._channels[0]])

                    # 3. batch norm, then relu
                    bn0 = tf.layers.batch_normalization(conv0, training = training)
                    relu0 = tf.nn.relu(bn0)
                # that was the first stage; three transposed convolutions remain
                deconv_inputs = relu0
                # channels[0] was consumed above, so start from 1
                for i in range(1, len(self._channels)):
                    # the last layer must not use bn/relu
                    with_bn_relu = (i != len(self._channels) - 1)
                    deconv_inputs = conv2d_transpose(
                        deconv_inputs,
                        self._channels[i],
                        'deconv-%d'%i,
                        training,
                        with_bn_relu,
                    )
                img_inputs = deconv_inputs
                with tf.variable_scope('generate_img'):
                    # imgs value range: [-1, 1]
                    imgs = tf.tanh(img_inputs, name = 'imgs')

            self._reuse = True
            # on variable reuse, see: https://blog.csdn.net/zSean/article/details/75057806

            # keep the generator's variables: G and D are trained separately in a
            # GAN, so each needs its own variable list
            self.variables = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope = 'generator')
            # tf.get_collection fetches all variables under a given scope
            return imgs # return the generated images

    # the discriminator
    class Discriminator(object):

        def __init__(self, channels):
            self._channels = channels
            self._reuse = False

        def __call__(self, inputs, training):
            '''
            Steps:
                first, four convolution stages
                then flatten and apply a fully connected layer
                finally output two-class logits
            :param inputs: images
            :param training: used by batch norm
            :return: logits
            '''
            inputs = tf.convert_to_tensor(inputs, dtype=tf.float32)
            conv_inputs = inputs

            with tf.variable_scope('discriminator', reuse = self._reuse):
                for i in range(len(self._channels)):
                    conv_inputs = conv2d(
                        conv_inputs,
                        self._channels[i],
                        'conv-%d' % i,
                        training
                    )
                fc_inputs = conv_inputs
                with tf.variable_scope('fc'):
                    # flatten
                    flatten = tf.contrib.layers.flatten(fc_inputs)
                    # get the logits
                    logits = tf.layers.dense(flatten, 2, name = 'logits')

            self._reuse = True
            self.variables = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope='discriminator')

            return logits
    
    class DCGAN(object):
        '''DCGAN main logic'''

        def __init__(self, hps):
            '''
            :param hps: all hyperparameters
            '''
            g_channels = hps.g_channels
            d_channels = hps.d_channels

            self._batch_size = hps.batch_size
            self._init_conv_size = hps.init_conv_size
            self._z_dim = hps.z_dim
            self._img_size = hps.img_size

            self._generator = Generator(g_channels, self._init_conv_size)
            self._discriminator = Discriminator(d_channels)

        def build(self):
            '''Build the computation graph'''

            # placeholder for the random vectors
            self._z_placeholder = tf.placeholder(tf.float32, (self._batch_size, self._z_dim))

            # placeholder for the real images
            self._img_placeholder = tf.placeholder(tf.float32, (self._batch_size, self._img_size, self._img_size, 1))

            # after the next line we have generated images; together with the
            # placeholder above we now have both real and generated images
            generated_imgs = self._generator(self._z_placeholder, training=True)

            # feed the fake images to the discriminator to get the fake logits
            fake_img_logits = self._discriminator(generated_imgs, training=True)

            # feed the real images to the discriminator to get the real logits
            real_img_logits = self._discriminator(self._img_placeholder, training=True)


            # with both logits we can define the losses

            # generator loss:
            # fake images should be classified as real, so the labels are 1
            loss_on_fake_to_real = tf.reduce_mean(
                tf.nn.sparse_softmax_cross_entropy_with_logits(
                    labels = tf.ones([self._batch_size], dtype = tf.int64), # all labels are 1
                    logits = fake_img_logits
                )
            )

            # the discriminator has two loss terms:
            #  1. classify fake images as fake
            #  2. classify real images as real
            loss_on_fake_to_fake = tf.reduce_mean(
                tf.nn.sparse_softmax_cross_entropy_with_logits(
                    labels = tf.zeros([self._batch_size], dtype = tf.int64),
                    logits = fake_img_logits
                )
            )

            loss_on_real_to_real = tf.reduce_mean(
                tf.nn.sparse_softmax_cross_entropy_with_logits(
                    labels = tf.ones([self._batch_size], dtype = tf.int64),
                    logits = real_img_logits
                )
            )


            # collecting variables:
            # |-- on tf.add_to_collection(), tf.get_collection() and tf.add_n(), see:
            # |-- https://blog.csdn.net/uestc_c2_403/article/details/72415791
            tf.add_to_collection('g_losses', loss_on_fake_to_real)
            tf.add_to_collection('d_losses', loss_on_fake_to_fake)
            tf.add_to_collection('d_losses', loss_on_real_to_real)

            loss = {
                'g': tf.add_n(tf.get_collection('g_losses'), name = 'total_g_loss'),
                'd': tf.add_n(tf.get_collection('d_losses'), name = 'total_d_loss')
            }

            return self._z_placeholder, self._img_placeholder, generated_imgs, loss


        def build_train_op(self, losses, learning_rate, beta1):
            '''
            :param losses: the loss dict
            :param learning_rate: learning rate
            :param beta1: Adam parameter
            :return:
            '''
            # separate optimizers for the generator and the discriminator
            g_opt = tf.train.AdamOptimizer(learning_rate = learning_rate, beta1 = beta1)
            d_opt = tf.train.AdamOptimizer(learning_rate = learning_rate, beta1 = beta1)

            g_opt_op = g_opt.minimize(losses['g'], var_list = self._generator.variables)
            d_opt_op = d_opt.minimize(losses['d'], var_list = self._discriminator.variables)

            # GAN training alternates between the two, which is expressed with
            # tf.control_dependencies([g_opt_op, d_opt_op]): the generator and
            # the discriminator are trained alternately
            # see: https://blog.csdn.net/PKU_Jade/article/details/73498753
            with tf.control_dependencies([g_opt_op, d_opt_op]):
                return tf.no_op(name = 'train') # tf.no_op() does nothing; it just groups the ops
    
    
    def combine_imgs(batch_imgs, img_size, rows = 8, cols = 16):
        '''
        Tile small images into one big image
        :param batch_imgs: shape [batch_size, img_size, img_size, 1]
        :param img_size: image size
        :param rows: number of rows
        :param cols: number of columns
        :return:
        '''
        result_big_img = []
        for i in range(rows):
            row_imgs = []
            for j in range(cols):
                # [img_size, img_size, 1]
                img = batch_imgs[cols * i + j]
                img = img.reshape((img_size, img_size))
                img = (img + 1) * 127.5 # undo the normalization
                row_imgs.append(img) # the list now holds one row of 16 images
            row_imgs = np.hstack(row_imgs)
            result_big_img.append(row_imgs)

        # 8 x 32, 16 x 32
        result_big_img = np.vstack(result_big_img)
        result_big_img = np.asarray(result_big_img, np.uint8)

        # turn the matrix into an image with PIL
        result_big_img = Image.fromarray(result_big_img)
        # return the image
        return result_big_img
    
    
    #########################################
    # training procedure
    #########################################

    dcgan = DCGAN(hps) # build the DCGAN object
    z_placeholder, img_placeholder, generated_imgs, losses = dcgan.build()
    train_op = dcgan.build_train_op(losses, hps.learning_rate, hps.beta1)


    init_op = tf.global_variables_initializer()
    train_steps = 10000

    with tf.Session() as sess:
        sess.run(init_op)
        for step in range(train_steps):
            batch_imgs, batch_z = mnist_data.next_batch(hps.batch_size)
            fetches = [train_op, losses['g'], losses['d']]
            should_sample = (step + 1) % 50 == 0

            if should_sample:
                fetches += [generated_imgs]

            output_values = sess.run(
                fetches,
                feed_dict={
                    z_placeholder: batch_z,
                    img_placeholder: batch_imgs
                }
            )

            _, g_loss_val, d_loss_val = output_values[0: 3]

            print('step:%4d, g_loss:%4.3f, d_loss: %4.3f' % (step, g_loss_val, d_loss_val))

            if should_sample:
                gen_imgs_val = output_values[3]

                # the generated images
                gen_img_path = os.path.join(output_dir, '%05d-gen.jpg'%(step + 1))

                # also write out the real images
                gt_img_path = os.path.join(output_dir, '%05d-gt.jpg' % (step + 1))

                # tile the generated and the real images
                gen_img = combine_imgs(gen_imgs_val, hps.img_size)
                gt_img = combine_imgs(batch_imgs, hps.img_size)

                # save both grids to disk
                gen_img.save(gen_img_path)
                gt_img.save(gt_img_path)
    

Results after 500 iterations:

[figure]

Results after 3000 iterations:

[figure]

Results after 5000 iterations:

[figure]

That's it. I feel I learned quite a lot from this.

• A DCGAN for MNIST in TensorFlow, split across four files

Reference: https://blog.csdn.net/miracle_ma/article/details/78305991

We use a DCGAN (deep convolutional GAN). The network structure is as follows:

[figure]

The code is split into four files:

• read_data.py: reads the data
• ops.py: sets up the linear, convolution, and transposed-convolution layers
• model.py: builds the generator and discriminator models
• train.py: trains the model

The layer types used are:

• conv (convolution)
• deconv (transposed convolution)
• linear (fully connected)
• batch_norm (batch normalization)
• lrelu/relu/sigmoid (nonlinearities)

Part 1: Reading the data (read_data.py)

    import os
    import numpy as np
    import tensorflow as tf

    def read_data():
        data_dir = "data/mnist"
        # read the training data (the idx files are raw binary, hence mode 'rb')
        fd = open(os.path.join(data_dir, "train-images.idx3-ubyte"), 'rb')
        loaded = np.fromfile(file = fd, dtype = np.uint8)
        trainX = loaded[16:].reshape((60000, 28, 28, 1)).astype(float)

        fd = open(os.path.join(data_dir, "train-labels.idx1-ubyte"), 'rb')
        loaded = np.fromfile(file = fd, dtype = np.uint8)
        trainY = loaded[8:].reshape((60000)).astype(float)

        # read the test data
        fd = open(os.path.join(data_dir, "t10k-images.idx3-ubyte"), 'rb')
        loaded = np.fromfile(file = fd, dtype = np.uint8)
        testX = loaded[16:].reshape((10000, 28, 28, 1)).astype(float)

        fd = open(os.path.join(data_dir, "t10k-labels.idx1-ubyte"), 'rb')
        loaded = np.fromfile(file = fd, dtype = np.uint8)
        testY = loaded[8:].reshape((10000)).astype(float)

        # merge the two splits into one dataset of 70000 examples
        X = np.concatenate((trainX, testX), axis = 0)
        y = np.concatenate((trainY, testY), axis = 0)

        # print(X[:2])  # debug output
        # set the random seed; reseeding before each shuffle applies the
        # same permutation to X and y, keeping them aligned
        seed = 233
        np.random.seed(seed)
        np.random.shuffle(X)
        np.random.seed(seed)
        np.random.shuffle(y)

        return X/255, y

Create a data folder to hold the MNIST dataset. We read the training and test sets and merge them into a single dataset of 70,000 examples; setting the same random seed before each shuffle guarantees both arrays end up in the same random order. Finally X is normalized to [0, 1].
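A tiny demonstration of the same-seed trick (a hedged illustration, not part of the original post): reseeding before each shuffle makes np.random.shuffle apply the same permutation to arrays of equal length, so images and labels stay paired.

    import numpy as np

    X = np.arange(5)
    y = np.arange(5) * 10  # y[i] belongs to X[i]
    np.random.seed(233); np.random.shuffle(X)
    np.random.seed(233); np.random.shuffle(y)
    print(X, y)  # same permutation both times, so pairs remain aligned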

Part 2: Linear, convolution, and transposed-convolution layers (ops.py)

    import tensorflow as tf
    from tensorflow.contrib.layers.python.layers import batch_norm as batch_norm

    def linear_layer(value, output_dim, name = 'linear_connected'):
        with tf.variable_scope(name):
            try:
                # linear layer weights
                #   name: weights
                #   shape = [int(value.get_shape()[1]), output_dim]; output_dim is the desired output size
                #   initializer: truncated normal with stddev 0.02
                weights = tf.get_variable('weights',
                    [int(value.get_shape()[1]), output_dim],
                    initializer = tf.truncated_normal_initializer(stddev = 0.02))
                # linear layer bias
                #   name: biases
                #   shape = [output_dim], matching the output dimension
                #   initializer: constant 0
                biases = tf.get_variable('biases',
                    [output_dim], initializer = tf.constant_initializer(0.0))
            except ValueError:
                tf.get_variable_scope().reuse_variables()
                weights = tf.get_variable('weights',
                    [int(value.get_shape()[1]), output_dim],
                    initializer = tf.truncated_normal_initializer(stddev = 0.02))
                biases = tf.get_variable('biases',
                    [output_dim], initializer = tf.constant_initializer(0.0))
            return tf.matmul(value, weights) + biases

    def conv2d(value, output_dim, k_h = 5, k_w = 5, strides = [1,1,1,1], name = "conv2d"):
        with tf.variable_scope(name):
            try:
                # convolution layer weights
                #   name: weights
                #   [5, 5, input channels, output channels]
                #   initializer: truncated normal with stddev 0.02
                weights = tf.get_variable('weights',
                    [k_h, k_w, int(value.get_shape()[-1]), output_dim],
                    initializer = tf.truncated_normal_initializer(stddev = 0.02))
                # convolution layer bias
                #   name: biases
                #   shape = [output_dim], matching the output channels
                #   initializer: constant 0
                biases = tf.get_variable('biases',
                    [output_dim], initializer = tf.constant_initializer(0.0))
            except ValueError:
                tf.get_variable_scope().reuse_variables()
                weights = tf.get_variable('weights',
                    [k_h, k_w, int(value.get_shape()[-1]), output_dim],
                    initializer = tf.truncated_normal_initializer(stddev = 0.02))
                biases = tf.get_variable('biases',
                    [output_dim], initializer = tf.constant_initializer(0.0))
            conv = tf.nn.conv2d(value, weights, strides = strides, padding = "SAME")
            conv = tf.reshape(tf.nn.bias_add(conv, biases), conv.get_shape())
            return conv

    def deconv2d(value, output_shape, k_h = 5, k_w = 5, strides = [1,1,1,1], name = "deconv2d"):
        with tf.variable_scope(name):
            try:
                # transposed-convolution layer weights
                # filter : [height, width, output_channels, in_channels]
                #   name: weights
                #   [5, 5, output channels, input channels]
                #   initializer: truncated normal with stddev 0.02
                weights = tf.get_variable('weights',
                    [k_h, k_w, output_shape[-1], int(value.get_shape()[-1])],
                    initializer = tf.truncated_normal_initializer(stddev = 0.02))
                # transposed-convolution layer bias
                #   name: biases
                #   shape = [output_shape[-1]], matching the output channels
                #   initializer: constant 0
                biases = tf.get_variable('biases',
                    [output_shape[-1]], initializer = tf.constant_initializer(0.0))
            except ValueError:
                tf.get_variable_scope().reuse_variables()
                weights = tf.get_variable('weights',
                    [k_h, k_w, output_shape[-1], int(value.get_shape()[-1])],
                    initializer = tf.truncated_normal_initializer(stddev = 0.02))
                biases = tf.get_variable('biases',
                    [output_shape[-1]], initializer = tf.constant_initializer(0.0))
            deconv = tf.nn.conv2d_transpose(value, weights, output_shape, strides = strides)
            deconv = tf.reshape(tf.nn.bias_add(deconv, biases), deconv.get_shape())
            return deconv

    def conv_cond_concat(value, cond, name = 'concat'):
        # tile the condition over the spatial dims, then concatenate on channels
        value_shapes = value.get_shape().as_list()
        cond_shapes = cond.get_shape().as_list()

        with tf.variable_scope(name):
            return tf.concat([value, cond * tf.ones(value_shapes[0:3] + cond_shapes[3:])], 3, name = name)

    # batch normalization layer
    def batch_norm_layer(value, is_train = True, name = 'batch_norm'):
        with tf.variable_scope(name) as scope:
            if is_train:
                return batch_norm(value, decay = 0.9, epsilon = 1e-5, scale = True,
                                    is_training = is_train, updates_collections = None, scope = scope)
            else :
                return batch_norm(value, decay = 0.9, epsilon = 1e-5, scale = True,
                                is_training = is_train, reuse = True,
                                updates_collections = None, scope = scope)

    def lrelu(x, leak = 0.2, name = 'lrelu'):
        with tf.variable_scope(name):
            return tf.maximum(x, x*leak, name = name)

This file mainly sets up the weights and parameters of each layer. conv_cond_concat joins the 4-D tensor used by the conv layers, [batch_size, w, h, c], with the condition y; the condition has to be tiled so its first three dimensions match before tf.concat (the concatenation op) can be applied (see the shape check below).

lrelu is an improved variant of relu; in this code the leak is 0.2.
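A hedged shape check for conv_cond_concat (assuming the ops.py above is importable): the condition is broadcast across the spatial dimensions and then concatenated on the channel axis.

    import tensorflow as tf
    from ops import conv_cond_concat

    x = tf.zeros([64, 14, 14, 1])   # feature maps: [batch, h, w, c]
    yb = tf.zeros([64, 1, 1, 10])   # condition: reshaped one-hot labels
    out = conv_cond_concat(x, yb)
    print(out.get_shape())          # (64, 14, 14, 11): 1 + 10 channels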

Part 3: The generator and discriminator models (model.py)

    import tensorflow as tf
    from ops import *

    BATCH_SIZE = 64

    # generator model
    def generator(z, y, train = True):
        with tf.variable_scope('generator', reuse=tf.AUTO_REUSE):
            # generator input: z is a 1 x 100 noise vector
            yb = tf.reshape(y, [BATCH_SIZE, 1, 1, 10], name = 'g_yb')
            # concatenate the noise z with y
            z_y = tf.concat([z, y], 1, name = 'g_z_concat_y')

            # a linear layer maps z_y to 1024 dims
            linear1 = linear_layer(z_y, 1024, name = 'g_linear_layer1')
            # batch norm followed by relu gives bn1
            bn1 = tf.nn.relu(batch_norm_layer(linear1, is_train = True, name = 'g_bn1'))

            # concatenate bn1 with y
            bn1_y = tf.concat([bn1, y], 1, name = 'g_bn1_concat_y')
            # a linear layer maps it to 128 * 49
            linear2 = linear_layer(bn1_y, 128*49, name = 'g_linear_layer2')
            # batch norm followed by relu gives bn2
            bn2 = tf.nn.relu(batch_norm_layer(linear2, is_train = True, name = 'g_bn2'))
            # reshape to 7 x 7 x 128
            bn2_re = tf.reshape(bn2, [BATCH_SIZE, 7, 7, 128], name = 'g_bn2_reshape')

            # concatenate bn2_re with yb
            bn2_yb = conv_cond_concat(bn2_re, yb, name = 'g_bn2_concat_yb')
            # transposed conv with stride 2: height and width double
            deconv1 = deconv2d(bn2_yb, [BATCH_SIZE, 14, 14, 128], strides = [1, 2, 2, 1], name = 'g_deconv1')
            # batch norm followed by relu gives bn3
            bn3 = tf.nn.relu(batch_norm_layer(deconv1, is_train = True, name = 'g_bn3'))

            # concatenate bn3 with yb
            bn3_yb = conv_cond_concat(bn3, yb, name = 'g_bn3_concat_yb')
            # transposed conv with stride 2: height and width double again
            deconv2 = deconv2d(bn3_yb, [BATCH_SIZE, 28, 28, 1], strides = [1, 2, 2, 1], name = 'g_deconv2')

            # squash through a sigmoid so outputs lie in (0, 1)
            return tf.nn.sigmoid(deconv2)

    # discriminator model
    def discriminator(image, y, reuse=tf.AUTO_REUSE, train = True):
        with tf.variable_scope('discriminator', reuse=tf.AUTO_REUSE):
            # the discriminator gets two kinds of input, real images and
            # generated images, both 28 x 28
            yb = tf.reshape(y, [BATCH_SIZE, 1, 1, 10], name = 'd_yb')
            # concatenate image with yb
            image_yb = conv_cond_concat(image, yb, name = 'd_image_concat_yb')
            # convolution with stride 2: 14 x 14 x 11
            conv1 = conv2d(image_yb, 11, strides = [1, 2, 2, 1], name = 'd_conv1')
            # lrelu layer
            lr1 = lrelu(conv1, name = 'd_lrelu1')

            # concatenate lr1 with yb
            lr1_yb = conv_cond_concat(lr1, yb, name = 'd_lr1_concat_yb')
            # convolution with stride 2: 7 x 7 x 74
            conv2 = conv2d(lr1_yb, 74, strides = [1, 2, 2, 1], name = 'd_conv2')
            # batch normalization
            bn1 = batch_norm_layer(conv2, is_train = True, name = 'd_bn1')
            # lrelu layer
            lr2 = lrelu(bn1, name = 'd_lrelu2')
            lr2_re = tf.reshape(lr2, [BATCH_SIZE, -1], name = 'd_lr2_reshape')

            # concatenate lr2_re with y
            lr2_y = tf.concat([lr2_re, y], 1, name = 'd_lr2_concat_y')
            # a linear layer maps it to 1 x 1024
            linear1 = linear_layer(lr2_y, 1024, name = 'd_linear_layer1')
            # batch normalization
            bn2 = batch_norm_layer(linear1, is_train = True, name = 'd_bn2')
            # lrelu layer
            lr3 = lrelu(bn2, name = 'd_lrelu3')

            lr3_y = tf.concat([lr3, y], 1, name = 'd_lr3_concat_y')
            linear2 = linear_layer(lr3_y, 1, name = 'd_linear_layer2')

            return linear2

    def sampler(z, y, train = True):
        # reuse the generator's variables to draw samples
        tf.get_variable_scope().reuse_variables()
        return generator(z, y, train = train)

The generator G follows the model diagram; a sigmoid is applied on the way out so the values fall in (0, 1), matching the range of the input images.

For the discriminator D:

1. Set the reuse flag (variable sharing). The discriminator takes two inputs: the same D is first fed the real data (real images) and then the fake data (images produced by the generator). A single train step therefore uses D's variables twice, so variable sharing is required.

2. The return value does not go through a sigmoid, because training computes the loss with sigmoid_cross_entropy_with_logits (see the quick check below).
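A quick numeric check of point 2 (a hedged aside, not from the original post): sigmoid_cross_entropy_with_logits applies the sigmoid internally, using the numerically stable form max(x, 0) - x*z + log(1 + exp(-|x|)), so putting a sigmoid inside D as well would apply it twice.

    import numpy as np

    def stable_form(x, z):
        # the form documented for tf.nn.sigmoid_cross_entropy_with_logits
        return np.maximum(x, 0) - x * z + np.log1p(np.exp(-np.abs(x)))

    def naive_form(x, z):
        s = 1.0 / (1.0 + np.exp(-x))  # sigmoid
        return -(z * np.log(s) + (1 - z) * np.log(1 - s))

    x = np.array([-3.0, -0.5, 0.0, 2.0])  # raw logits from D
    z = np.array([ 0.0,  1.0, 1.0, 0.0])  # target labels
    print(np.allclose(stable_form(x, z), naive_form(x, z)))  # True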

Part 4: Training the model (train.py)

    import scipy.misc
    import numpy as np
    import tensorflow as tf
    import os
    from read_data import *
    from ops import *
    from model import *

    BATCH_SIZE = 64

    # tile the images into one grid image and save it
    def save_images(images, size, path):
        img = (images + 1.0)/2.0
        h, w = img.shape[1], img.shape[2]

        merge_img = np.zeros((h * size[0], w * size[1], 3))

        for idx, image in enumerate(images):
            i = idx % size[1]
            j = idx // size[1]
            merge_img[j*h:j*h+h, i*w:i*w+w, :] = image

        return scipy.misc.imsave(path, merge_img)

    def train():

        # read data
        X, Y = read_data()

        # global_step records the training step; it should not be trainable
        global_step = tf.Variable(0, name = 'global_step', trainable = False)

        # create the placeholders
        y = tf.placeholder(tf.int32, [BATCH_SIZE], name = 'y')
        _y = tf.one_hot(y, depth = 10, on_value=None, off_value=None, axis=None, dtype=None, name='one_hot')
        z = tf.placeholder(tf.float32, [BATCH_SIZE, 100], name = 'z')
        images = tf.placeholder(tf.float32, [BATCH_SIZE, 28, 28, 1], name = 'images')

        # generate images from noise
        G = generator(z, _y)
        # run the discriminator on real images
        D = discriminator(images, _y)
        # run the discriminator on the generated (fake) images G
        _D = discriminator(G, _y)

        # compute the losses with sigmoid_cross_entropy_with_logits
        # loss of real images in the discriminator: their label should approach 1
        d_loss_real = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits = D, labels = tf.ones_like(D)))

        # loss of fake images in the discriminator: their label should approach 0
        d_loss_fake = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits = _D, labels = tf.zeros_like(_D)))

        # generator loss: the fake images should fool D, i.e. score close to 1
        g_loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits = _D, labels = tf.ones_like(_D)))

        # total discriminator loss
        d_loss = d_loss_real + d_loss_fake

        t_vars = tf.trainable_variables()
        d_vars = [var for var in t_vars if 'd_' in var.name]
        g_vars = [var for var in t_vars if 'g_' in var.name]

        with tf.variable_scope(tf.get_variable_scope(), reuse = False):
            # optimize the discriminator loss
            d_optim = tf.train.AdamOptimizer(0.0002, beta1 = 0.5).minimize(d_loss, var_list = d_vars, global_step = global_step)
            # optimize the generator loss (beta1 = 0.5 here as well)
            g_optim = tf.train.AdamOptimizer(0.0002, beta1 = 0.5).minimize(g_loss, var_list = g_vars, global_step = global_step)

        # tensorboard summaries
        train_dir = 'logs'
        z_sum = tf.summary.histogram("z", z)
        d_sum = tf.summary.histogram("d", D)
        d__sum = tf.summary.histogram("d_", _D)
        g_sum = tf.summary.histogram("g", G)

        d_loss_real_sum = tf.summary.scalar("d_loss_real", d_loss_real)
        d_loss_fake_sum = tf.summary.scalar("d_loss_fake", d_loss_fake)
        g_loss_sum = tf.summary.scalar("g_loss", g_loss)
        d_loss_sum = tf.summary.scalar("d_loss", d_loss)

        g_sum = tf.summary.merge([z_sum, d__sum, g_sum, d_loss_fake_sum, g_loss_sum])
        d_sum = tf.summary.merge([z_sum, d_sum, d_loss_real_sum, d_loss_sum])

        # initialize
        init = tf.global_variables_initializer()
        sess = tf.InteractiveSession()
        writer = tf.summary.FileWriter(train_dir, sess.graph)

        # saver for checkpoints
        saver = tf.train.Saver()
        check_path = "./save/model.ckpt"

        # fixed sample inputs
        sample_z = np.random.uniform(-1, 1, size = (BATCH_SIZE, 100))
        sample_labels = Y[0:BATCH_SIZE]

        # build the sampler
        sample = sampler(z, _y)

        # run
        sess.run(init)
        # saver.restore(sess, check_path)  # uncomment to resume from a checkpoint

        # train
        for epoch in range(10):
            batch_idx = int(70000/64)
            for idx in range(batch_idx):
                batch_images = X[idx*64:(idx+1)*64]
                batch_labels = Y[idx*64:(idx+1)*64]
                batch_z = np.random.uniform(-1, 1, size = (BATCH_SIZE, 100))

                _, summary_str = sess.run([d_optim, d_sum],
                                        feed_dict = {images: batch_images,
                                                     z: batch_z,
                                                     y: batch_labels})
                writer.add_summary(summary_str, idx+1)

                _, summary_str = sess.run([g_optim, g_sum],
                                        feed_dict = {images: batch_images,
                                                     z: batch_z,
                                                     y: batch_labels})
                writer.add_summary(summary_str, idx+1)

                d_loss1 = d_loss_fake.eval({z: batch_z, y: batch_labels})
                d_loss2 = d_loss_real.eval({images: batch_images, y: batch_labels})
                D_loss = d_loss1 + d_loss2
                G_loss = g_loss.eval({z: batch_z, y: batch_labels})

                # print the losses every 20 batches
                if idx % 20 == 0:
                    print("Epoch: %d [%4d/%4d] d_loss: %.8f, g_loss: %.8f" % (epoch, idx, batch_idx, D_loss, G_loss))

                # save a sample image every 100 batches
                if idx % 100 == 0:
                    sap = sess.run(sample, feed_dict = {z: sample_z, y: sample_labels})
                    samples_path = './sample/'
                    save_images(sap, [8,8], samples_path + 'test_%d_epoch_%d.png' % (epoch, idx))

                # save the model every 500 batches
                if idx % 500 == 0:
                    saver.save(sess, check_path, global_step = idx + 1)
        sess.close()

    if __name__ == '__main__':
        train()

Training order: the generator first produces fake data; then the real data is fed to D for training, and then the fake data is fed to D.

The generator and discriminator losses are computed separately; the discriminator loss = real loss + fake loss.

Within one batch, D is trained once and G is trained once. Strictly speaking D should be trained k times for every G update (skipped here for convenience; a sketch of that variant follows).
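As a hedged sketch of that k:1 variant (reusing the names defined in train() above; the value of k is a hypothetical choice, not something the original post sets), the inner loop could become:

    k = 2  # hypothetical number of D updates per G update
    for idx in range(batch_idx):
        batch_images = X[idx*BATCH_SIZE:(idx+1)*BATCH_SIZE]
        batch_labels = Y[idx*BATCH_SIZE:(idx+1)*BATCH_SIZE]
        for _ in range(k):
            # each D step sees fresh noise
            batch_z = np.random.uniform(-1, 1, size = (BATCH_SIZE, 100))
            sess.run(d_optim, feed_dict = {images: batch_images, z: batch_z, y: batch_labels})
        # one G step after k D steps; g_loss depends only on z and y
        batch_z = np.random.uniform(-1, 1, size = (BATCH_SIZE, 100))
        sess.run(g_optim, feed_dict = {z: batch_z, y: batch_labels})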

The generated results look like this:

[figure]

Reposted from: https://www.cnblogs.com/gezhuangzhuang/p/10279410.html
