本站内容均来自兴趣收集,如不慎侵害的您的相关权益,请留言告知,我们将尽快删除.谢谢.
Keras深度学习实战(23)——DCGAN详解与实现
1. 使用 DCGAN 生成手写数字图像
2. 使用 DCGAN 生成面部图像
2.2 从零开始实现 DCGAN 生成面部图像
0. 前言
在 生成对抗网络 (Generative Adversarial Networks, GAN) 一节中,我们使用原始 GAN
生成了数字图片。从之前的相关学习中,我们已经知道,使用卷积神经网络 ( Convolutional Neural Network
, CNN
) 体系结构能够更好地学习图像中的特征,因为 CNN
中的卷积核能够学习图像中的特定细节。深度卷积生成对抗网络 ( Deep Convolutional GAN
, DCGAN
) 正是基于上述思想来生成图像,其将卷积神经网络引入 GAN
中,以代替原始 GAN
中的全连接网络。
1. 使用 DCGAN 生成手写数字图像
DCGAN
的原理与GAN 基本相同,主要区别在于 DCGAN
的生成器和鉴别器使用卷积神经网络体系结构:
def generator(): model = Sequential() model.add(Dense(1024, input_dim=100)) model.add(LeakyReLU(0.2)) model.add(Dense(128*7*7)) model.add(BatchNormalization()) model.add(LeakyReLU(0.2)) model.add(Reshape((7,7,128))) model.add(Conv2DTranspose(128, (5, 5), strides=(1, 1), padding='same', use_bias=False)) model.add(BatchNormalization()) model.add(LeakyReLU(0.2)) model.add(Conv2DTranspose(64, (5, 5), strides=(2, 2), padding='same', use_bias=False)) model.add(BatchNormalization()) model.add(LeakyReLU(0.2)) model.add(Conv2DTranspose(1, (5, 5), strides=(2, 2), padding='same', use_bias=False, activation='tanh')) return model def discriminator(): model = Sequential() model.add(Conv2D(64, (3, 3), padding='same', input_shape=(28,28,1))) model.add(LeakyReLU(0.2)) model.add(MaxPooling2D(pool_size=(2,2))) model.add(Conv2D(128, (3,3))) model.add(LeakyReLU(0.2)) model.add(MaxPooling2D(pool_size=(2,2))) model.add(Conv2D(512, (3,3))) model.add(LeakyReLU(0.2)) model.add(MaxPooling2D(pool_size=(2,2))) model.add(Flatten()) model.add(Dense(512)) model.add(LeakyReLU(0.2)) model.add(Dense(1)) model.add(Activation('sigmoid')) return model
在 DCGAN
中,我们对输入数据执行多个卷积和池化操作,而其它步骤与原始 GAN
中的步骤完全相同,区别仅在于使用卷积和池化架构定义的 GAN
模型,模型训练后生成的图像如下所示:
生成器和鉴别器训练损失值随 epoch
的增加而变化,如下所示:
与原始 GAN相比,可以看到,尽管其他条件保持不变,仅模型体系结构发生了变化,但是通过 DCGAN
生成的图像比原始 GAN
的图像更加真实。
2. 使用 DCGAN 生成面部图像
我们已经学习了如何使用 DCGAN
生成新手写数字图像。本节中,我们将继续学习如何从现有的面部数据集中生成一组新的面部图像。所用的面部数据集与在性别分类中所用数据集相同。
2.1 模型分析
在介绍了 DCGAN
的核心思想后,我们将在本节中使用 DCGAN
架构生成面部图像,所用策略如下:
使用包含面部图像的数据集
生成器开始时只能生成随机图像
通过向鉴别器输入真实面部图像和生成器生成图像来训练鉴别器,鉴别器应学会区分真实面部图像和生成的虚假面部图像
鉴别器模型训练完成后,将冻结其权重,并训练生成器网络,以使鉴别器以较高的概率将生成的虚假图片识别为真实图像
多次迭代以上前两个步骤,直到生成器能够生成足够逼真的图像
2.2 从零开始实现 DCGAN 生成面部图像
(1)下载数据集,与在性别分类中所用数据集相同,图像样本如下:
(2) 定义模型架构与所用超参数:
shape = (56, 56, 3) epochs = 10000 batch_size = 64 save_interval = 100 def generator(): model = Sequential() model.add(Dense(1024, input_dim=100)) model.add(LeakyReLU(0.2)) model.add(Dense(128*7*7)) model.add(BatchNormalization()) model.add(LeakyReLU(0.2)) model.add(Reshape((7,7,128))) model.add(Conv2DTranspose(128, (3, 3), strides=(1, 1), padding='same', use_bias=False)) model.add(BatchNormalization()) model.add(LeakyReLU(0.2)) model.add(Conv2DTranspose(64, (3, 3), strides=(2, 2), padding='same', use_bias=False)) model.add(BatchNormalization()) model.add(LeakyReLU(0.2)) model.add(Conv2DTranspose(32, (3, 3), strides=(2, 2), padding='same', use_bias=False)) model.add(BatchNormalization()) model.add(LeakyReLU(0.2)) model.add(Conv2DTranspose(3, (3, 3), strides=(2, 2), padding='same', use_bias=False, activation='tanh')) return model def discriminator(): model = Sequential() model.add(Conv2D(64, (3, 3), padding='same', input_shape=(56,56,3))) model.add(LeakyReLU(0.2)) model.add(MaxPooling2D(pool_size=(2,2))) model.add(Conv2D(128, (3,3))) model.add(LeakyReLU(0.2)) model.add(MaxPooling2D(pool_size=(2,2))) model.add(Conv2D(512, (3,3))) model.add(LeakyReLU(0.2)) model.add(MaxPooling2D(pool_size=(2,2))) model.add(Conv2D(1024, (3,3))) model.add(LeakyReLU(0.2)) model.add(MaxPooling2D(pool_size=(2,2))) model.add(Flatten()) model.add(Dense(1024)) model.add(LeakyReLU(0.2)) model.add(Dense(1)) model.add(Activation('sigmoid')) return model def gan(discriminator, generator): discriminator.trainable = False model = Sequential() model.add(generator) model.add(discriminator) return model
(3)定义预处理函数、反处理函数以及绘制图像等实用函数:
import time noise = np.random.normal(0, 1, (16, 100)) def plot_images(noise=noise, samples=16, step=0): images = generator.predict(noise) images = deprocess(images) images = np.clip(images, 0, 255) plt.figure(figsize=(10, 10)) for i in range(images.shape[0]): plt.subplot(4, 4, i + 1) image = images[i, :, :, :] image = np.reshape(image, [56, 56,3]) plt.imshow(image) plt.axis('off') plt.tight_layout() plt.show()
在此模型中,我们在将图像调整为较小的尺寸,这是由于 DCGAN
并不适合生成较大尺寸的图片,同时也可以减少模型的参数量:
def preprocess(x): return (x / 255) * 2 - 1 def deprocess(x): return np.uint8((x + 1) / 2 * 255)
(4)导入数据集,创建输入数据将其转换为数组,并对其进行预处理:
import os from glob import glob root_dir = 'man_woman/b_resized/' all_img = glob(os.path.join(root_dir, '*.jpg')) x_train = [] for i in range(len(all_img)): img = cv2.imread(all_img[i])#, cv2.COLOR_BGR2RGB) img = cv2.resize(img, (56, 56)) img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) img = preprocess(img) x_train.append(img) x_train = np.array(x_train)
(5)编译生成器,鉴别器和 GCGAN
模型:
generator = generator() generator.summary() generator.compile(loss='binary_crossentropy', optimizer=Adam(lr=0.0002, beta_1=0.5, decay=8e-8)) discriminator = discriminator() discriminator.summary() discriminator.compile(loss='binary_crossentropy', optimizer=Adam(lr=0.0002, beta_1=0.5, decay=8e-8), metrics=['acc']) gan = gan(discriminator, generator) gan.compile(loss='binary_crossentropy', optimizer=Adam(lr=0.0002, beta_1=0.5, decay=8e-8))
生成器模型简要的架构信息如下所示:
Model: "sequential" _________________________________________________________________ Layer (type) Output Shape Param # ================================================================= dense (Dense) (None, 1024) 103424 _________________________________________________________________ leaky_re_lu (LeakyReLU) (None, 1024) 0 _________________________________________________________________ dense_1 (Dense) (None, 6272) 6428800 _________________________________________________________________ batch_normalization (BatchNo (None, 6272) 25088 _________________________________________________________________ leaky_re_lu_1 (LeakyReLU) (None, 6272) 0 _________________________________________________________________ reshape (Reshape) (None, 7, 7, 128) 0 _________________________________________________________________ conv2d_transpose (Conv2DTran (None, 7, 7, 128) 147456 _________________________________________________________________ batch_normalization_1 (Batch (None, 7, 7, 128) 512 _________________________________________________________________ leaky_re_lu_2 (LeakyReLU) (None, 7, 7, 128) 0 _________________________________________________________________ conv2d_transpose_1 (Conv2DTr (None, 14, 14, 64) 73728 _________________________________________________________________ batch_normalization_2 (Batch (None, 14, 14, 64) 256 _________________________________________________________________ leaky_re_lu_3 (LeakyReLU) (None, 14, 14, 64) 0 _________________________________________________________________ conv2d_transpose_2 (Conv2DTr (None, 28, 28, 32) 18432 _________________________________________________________________ batch_normalization_3 (Batch (None, 28, 28, 32) 128 _________________________________________________________________ leaky_re_lu_4 (LeakyReLU) (None, 28, 28, 32) 0 _________________________________________________________________ conv2d_transpose_3 (Conv2DTr (None, 56, 56, 3) 864 ================================================================= Total params: 6,798,688 Trainable params: 6,785,696 Non-trainable params: 12,992 _________________________________________________________________
鉴别器模型简要的架构信息如下所示:
Model: "sequential_1" _________________________________________________________________ Layer (type) Output Shape Param # ================================================================= conv2d (Conv2D) (None, 56, 56, 64) 1792 _________________________________________________________________ batch_normalization_4 (Batch (None, 56, 56, 64) 256 _________________________________________________________________ leaky_re_lu_5 (LeakyReLU) (None, 56, 56, 64) 0 _________________________________________________________________ max_pooling2d (MaxPooling2D) (None, 28, 28, 64) 0 _________________________________________________________________ conv2d_1 (Conv2D) (None, 26, 26, 128) 73856 _________________________________________________________________ batch_normalization_5 (Batch (None, 26, 26, 128) 512 _________________________________________________________________ leaky_re_lu_6 (LeakyReLU) (None, 26, 26, 128) 0 _________________________________________________________________ max_pooling2d_1 (MaxPooling2 (None, 13, 13, 128) 0 _________________________________________________________________ conv2d_2 (Conv2D) (None, 11, 11, 512) 590336 _________________________________________________________________ batch_normalization_6 (Batch (None, 11, 11, 512) 2048 _________________________________________________________________ leaky_re_lu_7 (LeakyReLU) (None, 11, 11, 512) 0 _________________________________________________________________ max_pooling2d_2 (MaxPooling2 (None, 5, 5, 512) 0 _________________________________________________________________ conv2d_3 (Conv2D) (None, 3, 3, 1024) 4719616 _________________________________________________________________ batch_normalization_7 (Batch (None, 3, 3, 1024) 4096 _________________________________________________________________ leaky_re_lu_8 (LeakyReLU) (None, 3, 3, 1024) 0 _________________________________________________________________ max_pooling2d_3 (MaxPooling2 (None, 1, 1, 1024) 0 _________________________________________________________________ flatten (Flatten) (None, 1024) 0 _________________________________________________________________ dense_2 (Dense) (None, 1024) 1049600 _________________________________________________________________ leaky_re_lu_9 (LeakyReLU) (None, 1024) 0 _________________________________________________________________ dense_3 (Dense) (None, 1) 1025 _________________________________________________________________ activation (Activation) (None, 1) 0 ================================================================= Total params: 6,443,137 Trainable params: 6,439,681 Non-trainable params: 3,456 _________________________________________________________________
(6)接下来,对模型训练多个 epoch
,具体训练过程与训练原始 GAN相同:
disc_loss = [] gen_loss = [] for cnt in range(epochs): random_index = np.random.randint(0, len(x_train) - batch_size / 2) legit_images = x_train[random_index: random_index + batch_size // 2].reshape(batch_size // 2, 56, 56, 3) gen_noise = np.random.normal(0, 1, (batch_size // 2, 100))/2 synthetic_images = generator.predict(gen_noise) x_combined_batch = np.concatenate((legit_images, synthetic_images)) y_combined_batch = np.concatenate((np.ones((batch_size // 2, 1)), np.zeros((batch_size // 2, 1)))) d_loss = discriminator.train_on_batch(x_combined_batch, y_combined_batch) noise = np.random.normal(0, 1, (batch_size*2, 100))/2 y_mislabled = np.ones((batch_size*2, 1)) g_loss = gan.train_on_batch(noise, y_mislabled) disc_loss.append(d_loss[0]) gen_loss.append(g_loss) print('epoch: {}, [Discriminator: {}], [Generator: {}]'.format(cnt, d_loss[0], g_loss)) if cnt % 500 == 0: plot_images(step=cnt)
以上代码生成的图像如下所示:
如上图所示,尽管这些图像看起来仍然不够逼真,但仍表现出一定的有效性,可以通过更改模型体系结构或增加网络深度对 GAN
进行改进。鉴别器和生成器的损失值随 epoch
的增加的变化情况如下:
深度卷积生成对抗网络 ( Deep Convolutional GAN
, DCGAN
) 将卷积神经网络引入 GAN
中,以代替原始 GAN
中的全连接网络更好地学习图像中的特征。本节中,首先介绍了 DCGAN
的基本思想,然后使用 Keras
从零开始实现用于生成手写数字图片的 DCGAN
模型,可以看到生成的图像效果比原始 GAN
更逼真,最后为了充分验证 DCGAN
模型性能,使用 DCGAN
生成复杂的面部图像。
Be First to Comment