Implementing Classic CNN Architectures in Keras (VGGNet, InceptionNet, ResNet, and More)
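Many classic network architectures have emerged as convolutional neural networks developed. This article introduces five of them: LeNet, AlexNet, VGGNet, InceptionNet, and ResNet.
LeNet
LeNet, proposed by Yann LeCun in 1998, is one of the earliest convolutional neural networks. Its architecture is shown in Figure 1.
Figure 1: Structure of LeNet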
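LeNet is a fairly primitive convolutional architecture built mainly from two convolutional layers. In the version implemented here, the input is a 32*32*3 array, that is, a 32*32-pixel color image. The first convolutional layer has six 5*5 kernels, and its output goes straight into a max-pooling layer. The second convolutional layer has sixteen 5*5 kernels, and its output likewise goes straight into a max-pooling layer. Three fully connected layers follow, with 120, 84, and 10 neurons respectively.
Figure 2: Layers and parameters of LeNet
Figure 2 lists each layer of the network together with its parameters.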
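The layer sizes described above can be checked by hand. The following is a minimal sketch in plain Python (no TensorFlow needed), assuming unpadded ("valid") convolutions and 2*2 pooling with stride 2, as in the Keras code for this network. The helper names `conv_out`, `conv_params`, and `dense_params` are made up for this illustration.

```python
def conv_out(size, kernel, stride=1):
    """Output width/height of an unpadded ("valid") convolution or pooling."""
    return (size - kernel) // stride + 1


def conv_params(kernel, in_ch, out_ch):
    """Trainable parameters of a conv layer: weights plus one bias per kernel."""
    return (kernel * kernel * in_ch + 1) * out_ch


def dense_params(n_in, n_out):
    """Trainable parameters of a fully connected layer: weights plus biases."""
    return (n_in + 1) * n_out


size = 32                    # 32*32*3 input
size = conv_out(size, 5)     # conv, six 5*5 kernels      -> 28
size = conv_out(size, 2, 2)  # max pooling 2*2, stride 2  -> 14
size = conv_out(size, 5)     # conv, sixteen 5*5 kernels  -> 10
size = conv_out(size, 2, 2)  # max pooling 2*2, stride 2  -> 5
flat = size * size * 16      # flattened feature vector   -> 400

params = {
    "c1": conv_params(5, 3, 6),     # 456
    "c2": conv_params(5, 6, 16),    # 2416
    "f1": dense_params(flat, 120),  # 48120
    "f2": dense_params(120, 84),    # 10164
    "f3": dense_params(84, 10),     # 850
}
total = sum(params.values())        # 62006
print(flat, total)
```

Most of the parameters sit in the first fully connected layer, which is typical for small networks of this kind.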
The LeNet implementation is as follows:

    import os
    import tensorflow as tf
    from tensorflow.keras import Model
    from tensorflow.keras.layers import Conv2D, MaxPool2D, Flatten, Dense


    class LeNet5(Model):
        def __init__(self):
            super(LeNet5, self).__init__()
            self.c1 = Conv2D(filters=6, kernel_size=(5, 5), activation="sigmoid")
            self.p1 = MaxPool2D(pool_size=(2, 2), strides=2)
            self.c2 = Conv2D(filters=16, kernel_size=(5, 5), activation="sigmoid")
            self.p2 = MaxPool2D(pool_size=(2, 2), strides=2)
            self.flatten = Flatten()
            self.f1 = Dense(120, activation="sigmoid")
            self.f2 = Dense(84, activation="sigmoid")
            self.f3 = Dense(10, activation="softmax")

        def call(self, x):
            x = self.c1(x)
            x = self.p1(x)
            x = self.c2(x)
            x = self.p2(x)
            x = self.flatten(x)
            x = self.f1(x)
            x = self.f2(x)
            y = self.f3(x)
            return y


    model = LeNet5()
    # The last layer already applies softmax, hence from_logits=False.
    model.compile(optimizer="adam",
                  loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
                  metrics=["sparse_categorical_accuracy"])

    # Resume from a previously saved checkpoint if one exists.
    checkpoint_save_path = "./checkpoint/LeNet5.ckpt"
    if os.path.exists(checkpoint_save_path + ".index"):
        print("-------------load the model-----------------")
        model.load_weights(checkpoint_save_path)

    # Save weights only, and only when the monitored validation loss improves.
    cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path,
                                                     save_weights_only=True,
                                                     save_best_only=True)

    # x_train, y_train, x_test, y_test are assumed to be loaded beforehand.
    history = model.fit(x_train, y_train, batch_size=32, epochs=5,
                        validation_data=(x_test, y_test), validation_freq=1,
                        callbacks=[cp_callback])
    model.summary()

AlexNet
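AlexNet appeared in 2012. It resembles LeNet, but the network is much larger. Its architecture is sketched in Figure 3.
Figure 3: Structure of AlexNet
Compared with LeNet, AlexNet increases the number of convolutional layers to five. In this implementation, BatchNormalization() is added to the first two convolutional layers, and the activation function changes from sigmoid to relu. Figure 4 shows each layer and the parameters configured for it.
Figure 4: Layers of AlexNet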
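To see how the feature maps shrink through this stack, the spatial sizes can be traced in plain Python. This is only a sketch. It assumes the same 32*32*3 input used for LeNet (the text does not state the AlexNet input size explicitly) and the Keras conventions for "valid" and "same" padding. The `out_size` helper and the `layers` table are illustrative names, with values copied from the AlexNet8 code in this section.

```python
import math


def out_size(size, kernel, stride=1, padding="valid"):
    """Spatial output size of a conv/pool layer under Keras padding rules."""
    if padding == "same":
        return math.ceil(size / stride)
    return (size - kernel) // stride + 1


# (name, kernel, stride, padding, output channels; None means pooling)
layers = [
    ("c1", 3, 1, "valid", 96),
    ("p1", 3, 2, "valid", None),
    ("c2", 3, 1, "valid", 256),
    ("p2", 3, 2, "valid", None),
    ("c3", 3, 1, "same", 384),
    ("c4", 3, 1, "same", 384),
    ("c5", 3, 1, "same", 256),
    ("p3", 3, 2, "valid", None),
]

size, channels = 32, 3  # assumed input: 32*32*3
trace = []
for name, kernel, stride, padding, out_ch in layers:
    size = out_size(size, kernel, stride, padding)
    if out_ch is not None:  # pooling keeps the channel count
        channels = out_ch
    trace.append((name, size, channels))

flat = size * size * channels  # length of the vector fed to the Dense layers
for row in trace:
    print(row)
print("flatten:", flat)
```

Under these assumptions the last pooling layer outputs a 2*2*256 tensor, so the first Dense(2048) layer receives 1024 features.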
The AlexNet implementation is as follows:

    import os
    import tensorflow as tf
    from tensorflow.keras import Model
    from tensorflow.keras.layers import (Conv2D, BatchNormalization, Activation,
                                         MaxPool2D, Dropout, Flatten, Dense)


    class AlexNet8(Model):
        def __init__(self):
            super(AlexNet8, self).__init__()
            # Conv layers 1 and 2: conv -> BN -> relu -> max pooling
            self.c1 = Conv2D(filters=96, kernel_size=(3, 3))
            self.b1 = BatchNormalization()
            self.a1 = Activation("relu")
            self.p1 = MaxPool2D(pool_size=(3, 3), strides=2)

            self.c2 = Conv2D(filters=256, kernel_size=(3, 3))
            self.b2 = BatchNormalization()
            self.a2 = Activation("relu")
            self.p2 = MaxPool2D(pool_size=(3, 3), strides=2)

            # Conv layers 3 to 5, followed by one max pooling
            self.c3 = Conv2D(filters=384, kernel_size=(3, 3), padding="same", activation="relu")
            self.c4 = Conv2D(filters=384, kernel_size=(3, 3), padding="same", activation="relu")
            self.c5 = Conv2D(filters=256, kernel_size=(3, 3), padding="same", activation="relu")
            self.p3 = MaxPool2D(pool_size=(3, 3), strides=2)

            # Three fully connected layers with dropout in between
            self.flatten = Flatten()
            self.f1 = Dense(2048, activation="relu")
            self.d1 = Dropout(0.5)
            self.f2 = Dense(2048, activation="relu")
            self.d2 = Dropout(0.5)
            self.f3 = Dense(10, activation="softmax")

        def call(self, x):
            x = self.c1(x)
            x = self.b1(x)
            x = self.a1(x)
            x = self.p1(x)

            x = self.c2(x)
            x = self.b2(x)
            x = self.a2(x)
            x = self.p2(x)

            x = self.c3(x)
            x = self.c4(x)
            x = self.c5(x)
            x = self.p3(x)

            x = self.flatten(x)
            x = self.f1(x)
            x = self.d1(x)
            x = self.f2(x)
            x = self.d2(x)
            y = self.f3(x)
            return y


    model = AlexNet8()
    model.compile(optimizer="adam",
                  loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
                  metrics=["sparse_categorical_accuracy"])

    # Resume from a previously saved checkpoint if one exists.
    checkpoint_save_path = "./checkpoint/AlexNet8.ckpt"
    if os.path.exists(checkpoint_save_path + ".index"):
        print("-------------load the model-----------------")
        model.load_weights(checkpoint_save_path)

    cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path,
                                                     save_weights_only=True,
                                                     save_best_only=True)

    # x_train, y_train, x_test, y_test are assumed to be loaded beforehand.
    history = model.fit(x_train, y_train, batch_size=32, epochs=5,
                        validation_data=(x_test, y_test), validation_freq=1,
                        callbacks=[cp_callback])
    model.summary()

VGGNet16
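The biggest improvement in VGGNet16 is its depth: it goes from the 8 layers of AlexNet to 16 layers, which gives the network more expressive power. VGGNet uses only small 3*3 kernels, which have proven in practice to work better than large kernels.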
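One commonly cited reason why small kernels do well is that stacked 3*3 convolutions cover the same receptive field as one larger kernel while using fewer weights. The sketch below illustrates that arithmetic. It is an illustration rather than a claim taken from the text above: it assumes stride-1 convolutions with the same number of input and output channels, and the channel count `C = 64` and the helper names are chosen only for the example.

```python
def stacked_receptive_field(num_layers, kernel=3):
    """Receptive field of num_layers stacked stride-1 convolutions."""
    return 1 + num_layers * (kernel - 1)


def conv_weights(kernel, channels):
    """Weights of one conv layer with `channels` inputs and outputs (no bias)."""
    return kernel * kernel * channels * channels


C = 64  # hypothetical channel count, for illustration only

# Two 3*3 layers see a 5*5 region; three 3*3 layers see a 7*7 region.
rf_two = stacked_receptive_field(2)    # 5
rf_three = stacked_receptive_field(3)  # 7

two_3x3 = 2 * conv_weights(3, C)       # 73728
one_5x5 = conv_weights(5, C)           # 102400
three_3x3 = 3 * conv_weights(3, C)     # 110592
one_7x7 = conv_weights(7, C)           # 200704

print(rf_two, two_3x3, one_5x5)
print(rf_three, three_3x3, one_7x7)
```

The stacked version also applies a nonlinearity after every 3*3 layer, so it is deeper as well as cheaper.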
The architecture of VGGNet16 is shown in the figure below:
Figure 5: VGG16 network structure
The VGG16 implementation is as follows:

    import os
    import tensorflow as tf
    from tensorflow.keras import Model
    from tensorflow.keras.layers import (Conv2D, BatchNormalization, Activation,
                                         MaxPool2D, Dropout, Flatten, Dense)


    class VGG16(Model):
        def __init__(self):
            super(VGG16, self).__init__()
            # Block 1: two conv layers with 64 kernels (each conv -> BN -> relu)
            self.c1 = Conv2D(filters=64, kernel_size=(3, 3), padding="same")
            self.b1 = BatchNormalization()
            self.a1 = Activation("relu")
            self.c2 = Conv2D(filters=64, kernel_size=(3, 3), padding="same")
            self.b2 = BatchNormalization()
            self.a2 = Activation("relu")
            self.p1 = MaxPool2D(pool_size=(2, 2), strides=2, padding="same")
            self.d1 = Dropout(0.2)

            # Block 2: two conv layers with 128 kernels
            self.c3 = Conv2D(filters=128, kernel_size=(3, 3), padding="same")
            self.b3 = BatchNormalization()
            self.a3 = Activation("relu")
            self.c4 = Conv2D(filters=128, kernel_size=(3, 3), padding="same")
            self.b4 = BatchNormalization()
            self.a4 = Activation("relu")
            self.p2 = MaxPool2D(pool_size=(2, 2), strides=2, padding="same")
            self.d2 = Dropout(0.2)

            # Block 3: three conv layers with 256 kernels
            self.c5 = Conv2D(filters=256, kernel_size=(3, 3), padding="same")
            self.b5 = BatchNormalization()
            self.a5 = Activation("relu")
            self.c6 = Conv2D(filters=256, kernel_size=(3, 3), padding="same")
            self.b6 = BatchNormalization()
            self.a6 = Activation("relu")
            self.c7 = Conv2D(filters=256, kernel_size=(3, 3), padding="same")
            self.b7 = BatchNormalization()
            self.a7 = Activation("relu")
            self.p3 = MaxPool2D(pool_size=(2, 2), strides=2, padding="same")
            self.d3 = Dropout(0.2)

            # Block 4: three conv layers with 512 kernels
            self.c8 = Conv2D(filters=512, kernel_size=(3, 3), padding="same")
            self.b8 = BatchNormalization()
            self.a8 = Activation("relu")
            self.c9 = Conv2D(filters=512, kernel_size=(3, 3), padding="same")
            self.b9 = BatchNormalization()
            self.a9 = Activation("relu")
            self.c10 = Conv2D(filters=512, kernel_size=(3, 3), padding="same")
            self.b10 = BatchNormalization()
            self.a10 = Activation("relu")
            self.p4 = MaxPool2D(pool_size=(2, 2), strides=2, padding="same")
            self.d4 = Dropout(0.2)

            # Block 5: three conv layers with 512 kernels
            self.c11 = Conv2D(filters=512, kernel_size=(3, 3), padding="same")
            self.b11 = BatchNormalization()
            self.a11 = Activation("relu")
            self.c12 = Conv2D(filters=512, kernel_size=(3, 3), padding="same")
            self.b12 = BatchNormalization()
            self.a12 = Activation("relu")
            self.c13 = Conv2D(filters=512, kernel_size=(3, 3), padding="same")
            self.b13 = BatchNormalization()
            self.a13 = Activation("relu")
            self.p5 = MaxPool2D(pool_size=(2, 2), strides=2, padding="same")
            self.d5 = Dropout(0.2)

            # Classifier: three fully connected layers
            self.flatten = Flatten()
            self.f1 = Dense(512, activation="relu")
            self.d6 = Dropout(0.2)
            self.f2 = Dense(512, activation="relu")
            self.d7 = Dropout(0.2)
            self.f3 = Dense(10, activation="softmax")

        def call(self, x):
            # Block 1
            x = self.c1(x)
            x = self.b1(x)
            x = self.a1(x)
            x = self.c2(x)
            x = self.b2(x)
            x = self.a2(x)
            x = self.p1(x)
            x = self.d1(x)

            # Block 2
            x = self.c3(x)
            x = self.b3(x)
            x = self.a3(x)
            x = self.c4(x)
            x = self.b4(x)
            x = self.a4(x)
            x = self.p2(x)
            x = self.d2(x)

            # Block 3
            x = self.c5(x)
            x = self.b5(x)
            x = self.a5(x)
            x = self.c6(x)
            x = self.b6(x)
            x = self.a6(x)
            x = self.c7(x)
            x = self.b7(x)
            x = self.a7(x)
            x = self.p3(x)
            x = self.d3(x)

            # Block 4
            x = self.c8(x)
            x = self.b8(x)
            x = self.a8(x)
            x = self.c9(x)
            x = self.b9(x)
            x = self.a9(x)
            x = self.c10(x)
            x = self.b10(x)
            x = self.a10(x)
            x = self.p4(x)
            x = self.d4(x)

            # Block 5
            x = self.c11(x)
            x = self.b11(x)
            x = self.a11(x)
            x = self.c12(x)
            x = self.b12(x)
            x = self.a12(x)
            x = self.c13(x)
            x = self.b13(x)
            x = self.a13(x)
            x = self.p5(x)
            x = self.d5(x)

            # Classifier
            x = self.flatten(x)
            x = self.f1(x)
            x = self.d6(x)
            x = self.f2(x)
            x = self.d7(x)
            y = self.f3(x)
            return y


    model = VGG16()
    model.compile(optimizer="adam",
                  loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
                  metrics=["sparse_categorical_accuracy"])

    # Resume from a previously saved checkpoint if one exists.
    checkpoint_save_path = "./checkpoint/VGG16.ckpt"
    if os.path.exists(checkpoint_save_path + ".index"):
        print("-------------load the model-----------------")
        model.load_weights(checkpoint_save_path)

    cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path,
                                                     save_weights_only=True,
                                                     save_best_only=True)

    # x_train, y_train, x_test, y_test are assumed to be loaded beforehand.
    history = model.fit(x_train, y_train, batch_size=32, epochs=5,
                        validation_data=(x_test, y_test), validation_freq=1,
                        callbacks=[cp_callback])
    model.summary()

InceptionNet
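InceptionNet was introduced in 2014 (its paper was published at CVPR 2015). It improves the network's capability by making it wider, which is a different (horizontal) direction from VGGNet's approach of stacking convolutional layers (vertical). The figure below shows the basic unit of InceptionNet; this unit can be thought of as one convolutional layer of the network.
Figure 6: InceptionNet network structure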
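Two classes can be defined here. The first is a standardized convolutional layer:

    import tensorflow as tf
    from tensorflow.keras import Model
    from tensorflow.keras.layers import Conv2D, BatchNormalization, Activation, MaxPool2D


    class ConvBNRelu(Model):
        def __init__(self, ch, kernelsz=3, strides=1, padding="same"):
            super(ConvBNRelu, self).__init__()
            self.model = tf.keras.models.Sequential([
                Conv2D(ch, kernelsz, strides=strides, padding=padding),
                BatchNormalization(),
                Activation("relu")
            ])

        def call(self, x, training=None):
            # With training=True, BatchNormalization normalizes with the mean and
            # variance of the current batch. With training=False it uses the moving
            # averages accumulated during training, which is what inference needs.
            # The flag is passed through so that each mode is used at the right time.
            x = self.model(x, training=training)
            return x

With the standardized convolutional layer (ConvBNRelu) defined, the basic unit of InceptionNet can be defined as follows:

    class InceptionBlk(Model):
        def __init__(self, ch, strides=1):
            super(InceptionBlk, self).__init__()
            self.ch = ch
            self.strides = strides
            self.c1 = ConvBNRelu(ch, kernelsz=1, strides=strides)    # leftmost branch: 1*1 conv
            self.c2_1 = ConvBNRelu(ch, kernelsz=1, strides=strides)  # second branch from the left, yellow box: 1*1 conv
            self.c2_2 = ConvBNRelu(ch, kernelsz=3, strides=1)        # second branch from the left, blue box: 3*3 conv
            self.c3_1 = ConvBNRelu(ch, kernelsz=1, strides=strides)  # third branch from the left, yellow box: 1*1 conv
            self.c3_2 = ConvBNRelu(ch, kernelsz=5, strides=1)        # third branch from the left, blue box: 5*5 conv
            self.p4_1 = MaxPool2D(3, strides=1, padding="same")      # fourth branch from the left, red box: 3*3 max pooling
            self.c4_2 = ConvBNRelu(ch, kernelsz=1, strides=strides)  # fourth branch from the left, yellow box: 1*1 conv

        def call(self, x):
            x1 = self.c1(x)
            x2_1 = self.c2_1(x)
            x2_2 = self.c2_2(x2_1)
            x3_1 = self.c3_1(x)
            x3_2 = self.c3_2(x3_1)
            x4_1 = self.p4_1(x)
            x4_2 = self.c4_2(x4_1)
            # Concatenate the four branches along the channel axis
            x = tf.concat([x1, x2_2, x3_2, x4_2], axis=3)
            return x
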
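Next, build an InceptionNet with two blocks. Each block contains two basic InceptionBlk units whose stride settings differ: the first unit uses stride=2 and the second uses stride=1. Each block therefore halves the feature-map size, and the number of convolution kernels is doubled to compensate. The architecture is shown in the figure below.
Figure 7: InceptionNet with 2 blocks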
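The size and channel bookkeeping described above can be traced in plain Python. This is a rough sketch, assuming a 32*32 input (not stated for this network) and the settings used in the Inception10 code: "same" padding everywhere, an initial ConvBNRelu with `init_ch=16` kernels, and four branches of `ch` channels concatenated in every InceptionBlk. The function name `inception10_shapes` is made up for this example.

```python
import math


def inception10_shapes(input_size=32, init_ch=16, num_blocks=2):
    """Return (layer name, spatial size, channels) after each stage."""
    size = input_size
    shapes = [("c1", size, init_ch)]  # initial 3*3 ConvBNRelu, stride 1
    ch = init_ch
    for block_id in range(num_blocks):
        for layer_id in range(2):
            # The first unit of each block downsamples; the second keeps the size.
            stride = 2 if layer_id == 0 else 1
            size = math.ceil(size / stride)  # "same" padding
            # Four branches with ch channels each are concatenated.
            shapes.append((f"block{block_id}_layer{layer_id}", size, 4 * ch))
        ch *= 2  # double the kernel count for the next block
    return shapes


shapes = inception10_shapes()
for row in shapes:
    print(row)
```

Under these assumptions the last unit outputs an 8*8*128 tensor, which global average pooling reduces to 128 features before the final Dense layer.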
The implementation is as follows:

    import os
    import tensorflow as tf
    from tensorflow.keras import Model
    from tensorflow.keras.layers import GlobalAveragePooling2D, Dense

    # ConvBNRelu and InceptionBlk are the two classes defined above.


    class Inception10(Model):
        def __init__(self, num_blocks, num_classes, init_ch=16, **kwargs):
            super(Inception10, self).__init__(**kwargs)
            self.in_channels = init_ch
            self.out_channels = init_ch
            self.num_blocks = num_blocks
            self.init_ch = init_ch
            self.c1 = ConvBNRelu(init_ch)
            self.blocks = tf.keras.models.Sequential()
            for block_id in range(num_blocks):
                for layer_id in range(2):
                    if layer_id == 0:
                        # The first unit of each block halves the feature-map size
                        block = InceptionBlk(self.out_channels, strides=2)
                    else:
                        block = InceptionBlk(self.out_channels, strides=1)
                    self.blocks.add(block)
                # Enlarge out_channels for the next block
                self.out_channels *= 2
            self.p1 = GlobalAveragePooling2D()
            self.f1 = Dense(num_classes, activation="softmax")

        def call(self, x):
            x = self.c1(x)
            x = self.blocks(x)
            x = self.p1(x)
            y = self.f1(x)
            return y


    model = Inception10(num_blocks=2, num_classes=10)
    model.compile(optimizer="adam",
                  loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
                  metrics=["sparse_categorical_accuracy"])

    # Resume from a previously saved checkpoint if one exists.
    checkpoint_save_path = "./checkpoint/Inception10.ckpt"
    if os.path.exists(checkpoint_save_path + ".index"):
        print("-------------load the model-----------------")
        model.load_weights(checkpoint_save_path)

    cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path,
                                                     save_weights_only=True,
                                                     save_best_only=True)

    # x_train, y_train, x_test, y_test are assumed to be loaded beforehand.
    history = model.fit(x_train, y_train, batch_size=32, epochs=5,
                        validation_data=(x_test, y_test), validation_freq=1,
                        callbacks=[cp_callback])
    model.summary()

InceptionNet can also be implemented without classes:

    import tensorflow as tf


    # Define one Inception module
    def InceptionModule(inputs):
        # Branch 1: 1*1 conv
        branch1 = tf.keras.layers.Conv2D(64, (1, 1), padding="same", activation="relu")(inputs)

        # Branch 2: 1*1 conv followed by 3*3 conv
        branch2 = tf.keras.layers.Conv2D(64, (1, 1), padding="same", activation="relu")(inputs)
        branch2 = tf.keras.layers.Conv2D(96, (3, 3), padding="same", activation="relu")(branch2)

        # Branch 3: 1*1 conv followed by 5*5 conv
        branch3 = tf.keras.layers.Conv2D(64, (1, 1), padding="same", activation="relu")(inputs)
        branch3 = tf.keras.layers.Conv2D(64, (5, 5), padding="same", activation="relu")(branch3)

        # Branch 4: 3*3 max pooling followed by 1*1 conv
        branch4 = tf.keras.layers.MaxPooling2D((3, 3), strides=(1, 1), padding="same")(inputs)
        branch4 = tf.keras.layers.Conv2D(32, (1, 1), padding="same", activation="relu")(branch4)

        # Merge the four branches along the channel axis
        outputs = tf.keras.layers.concatenate([branch1, branch2, branch3, branch4], axis=-1)
        return outputs


    # Define an InceptionNet
    def InceptionNet(input_shape, num_classes):
        # Input layer
        inputs = tf.keras.layers.Input(shape=input_shape)

        # First stage
        x = tf.keras.layers.Conv2D(64, (7, 7), strides=(2, 2), padding="same", activation="relu")(inputs)
        x = tf.keras.layers.MaxPooling2D((3, 3), strides=(2, 2), padding="same")(x)

        # Second stage
        x = tf.keras.layers.Conv2D(64, (1, 1), padding="same", activation="relu")(x)
        x = tf.keras.layers.Conv2D(192, (3, 3), padding="same", activation="relu")(x)
        x = tf.keras.layers.MaxPooling2D((3, 3), strides=(2, 2), padding="same")(x)

        # Third stage: three Inception modules
        x = InceptionModule(x)
        x = InceptionModule(x)
        x = InceptionModule(x)

        # Global average pooling
        x = tf.keras.layers.GlobalAveragePooling2D()(x)

        # Output layer
        outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)

        # Create the model
        model = tf.keras.Model(inputs=inputs, outputs=outputs)
        return model

ResNet
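A problem the classic convolutional networks above run into is that test accuracy stops improving as the number of layers keeps growing, largely because of vanishing gradients. The core idea of ResNet is to add skip connections that carry the residual across layers. These bring in information from earlier layers and reduce vanishing gradients, which makes it possible to build much deeper networks. Its structure is sketched below.
Figure 8: ResNet architecture
The figure above contains both dashed and solid lines. Both ensure that the two tensors being added have the same dimensions: a solid line adds the input directly, while a dashed line first passes it through a 1*1 convolution so that the dimensions match. The two computations are shown in the figure below.
Figure 9: Two ways of computing the residual in ResNet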
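Why a skip connection helps against vanishing gradients can be illustrated with a deliberately simplified scalar toy model. This is not the actual ResNet computation, and the per-layer derivative of 0.01 and the depth of 50 are made-up values. If each layer computes y = F(x), the gradient through the chain is the product of the F'(x) terms. With a skip connection, y = F(x) + x, so each factor becomes F'(x) + 1.

```python
def chain_gradient(local_grads, residual=False):
    """Product of per-layer derivatives along a chain of scalar layers.

    Plain layer:    y = F(x)      ->  dy/dx = F'(x)
    Residual layer: y = F(x) + x  ->  dy/dx = F'(x) + 1
    """
    grad = 1.0
    for g in local_grads:
        grad *= (g + 1.0) if residual else g
    return grad


# Hypothetical setup: 50 layers, each with a small local derivative F'(x) = 0.01
local_grads = [0.01] * 50

plain = chain_gradient(local_grads)                 # about 1e-100: vanishes
skip = chain_gradient(local_grads, residual=True)   # about 1.64: survives

print(plain, skip)
```

In the plain chain the gradient reaching the first layer is effectively zero, whereas the identity term keeps it at a usable magnitude.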
The code for the ResNet block takes a parameter, residual_path, which indicates whether the dimensions of the input and output differ and therefore need to be matched. The code is as follows:

    import os
    import tensorflow as tf
    from tensorflow.keras import Model
    from tensorflow.keras.layers import Conv2D, BatchNormalization, Activation


    class ResnetBlock(Model):
        def __init__(self, filters, strides=1, residual_path=False):
            super(ResnetBlock, self).__init__()
            self.filters = filters
            self.strides = strides
            self.residual_path = residual_path

            self.c1 = Conv2D(filters, (3, 3), strides=strides, padding="same", use_bias=False)
            self.b1 = BatchNormalization()
            self.a1 = Activation("relu")

            self.c2 = Conv2D(filters, (3, 3), strides=1, padding="same", use_bias=False)
            self.b2 = BatchNormalization()

            # When residual_path is True, downsample the input with a 1x1 convolution
            # so that x has the same dimensions as F(x) and the two can be added.
            if residual_path:
                self.down_c1 = Conv2D(filters, (1, 1), strides=strides, padding="same", use_bias=False)
                self.down_b1 = BatchNormalization()

            self.a2 = Activation("relu")

        def call(self, inputs):
            residual = inputs  # residual is the input itself, i.e. residual = x
            # Pass the input through conv, BN, and activation layers to compute F(x)
            x = self.c1(inputs)
            x = self.b1(x)
            x = self.a1(x)

            x = self.c2(x)
            y = self.b2(x)

            if self.residual_path:
                residual = self.down_c1(inputs)
                residual = self.down_b1(residual)

            # The output is the sum of both parts, F(x)+x or F(x)+Wx, followed by the activation
            out = self.a2(y + residual)
            return out

Finally, ResnetBlock can be used to build the ResNet architecture:

    class ResNet18(Model):
        def __init__(self, block_list, initial_filters=64):
            # block_list gives the number of ResnetBlock units in each block
            super(ResNet18, self).__init__()
            self.num_blocks = len(block_list)  # number of blocks
            self.block_list = block_list
            self.out_filters = initial_filters
            self.c1 = Conv2D(self.out_filters, (3, 3), strides=1, padding="same", use_bias=False)
            self.b1 = BatchNormalization()
            self.a1 = Activation("relu")
            self.blocks = tf.keras.models.Sequential()
            # Build the ResNet structure
            for block_id in range(len(block_list)):  # index of the block
                for layer_id in range(block_list[block_id]):  # index of the ResnetBlock within the block
                    if block_id != 0 and layer_id == 0:
                        # Downsample the input of every block except the first
                        block = ResnetBlock(self.out_filters, strides=2, residual_path=True)
                    else:
                        block = ResnetBlock(self.out_filters, residual_path=False)
                    self.blocks.add(block)  # add the block to the network
                self.out_filters *= 2  # the next block uses twice as many kernels
            self.p1 = tf.keras.layers.GlobalAveragePooling2D()
            self.f1 = tf.keras.layers.Dense(10, activation="softmax",
                                            kernel_regularizer=tf.keras.regularizers.l2())

        def call(self, inputs):
            x = self.c1(inputs)
            x = self.b1(x)
            x = self.a1(x)
            x = self.blocks(x)
            x = self.p1(x)
            y = self.f1(x)
            return y


    model = ResNet18([2, 2, 2, 2])
    model.compile(optimizer="adam",
                  loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
                  metrics=["sparse_categorical_accuracy"])

    # Resume from a previously saved checkpoint if one exists.
    checkpoint_save_path = "./checkpoint/ResNet18.ckpt"
    if os.path.exists(checkpoint_save_path + ".index"):
        print("-------------load the model-----------------")
        model.load_weights(checkpoint_save_path)

    cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path,
                                                     save_weights_only=True,
                                                     save_best_only=True)

    # x_train, y_train, x_test, y_test are assumed to be loaded beforehand.
    history = model.fit(x_train, y_train, batch_size=32, epochs=5,
                        validation_data=(x_test, y_test), validation_freq=1,
                        callbacks=[cp_callback])
    model.summary()

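The name ResNet18 can be related to the block list `[2, 2, 2, 2]` with a little arithmetic. The sketch below counts the weighted layers on the main path (the initial convolution, two 3*3 convolutions per ResnetBlock, and the final Dense layer) and tracks the feature-map size per block. It assumes a 32*32 input, which the ResNet code does not state, and by the usual convention it leaves the 1*1 shortcut convolutions out of the count. The function name `resnet_summary` is made up for this example.

```python
def resnet_summary(block_list, input_size=32, initial_filters=64):
    """Return (weighted layer count, [(spatial size, filters) per block])."""
    weighted_layers = 1  # the initial 3*3 convolution
    size, filters = input_size, initial_filters
    stages = []
    for block_id, num_units in enumerate(block_list):
        if block_id != 0:
            # The first ResnetBlock of every later block uses stride 2 ("same" padding)
            size = (size + 1) // 2
        weighted_layers += 2 * num_units  # two 3*3 convolutions per ResnetBlock
        stages.append((size, filters))
        filters *= 2  # the next block uses twice as many kernels
    weighted_layers += 1  # the final Dense layer
    return weighted_layers, stages


num_layers, stages = resnet_summary([2, 2, 2, 2])
print(num_layers)  # 18
print(stages)      # [(32, 64), (16, 128), (8, 256), (4, 512)]
```

Passing a different list, for example `[3, 4, 6, 3]`, gives the layer count of a deeper variant built from the same two-convolution block.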
An implementation without classes is as follows:

    import tensorflow as tf


    # Define a residual block
    def ResidualBlock(x, filters, downsample=False):
        shortcut = x
        stride = (1, 1)

        # If downsampling is required, downsample the shortcut with a 1*1 convolution
        if downsample:
            stride = (2, 2)
            shortcut = tf.keras.layers.Conv2D(filters, (1, 1), strides=stride, padding="same")(shortcut)
            shortcut = tf.keras.layers.BatchNormalization()(shortcut)

        # Main branch
        x = tf.keras.layers.Conv2D(filters, (3, 3), strides=stride, padding="same")(x)
        x = tf.keras.layers.BatchNormalization()(x)
        x = tf.keras.layers.Activation("relu")(x)
        x = tf.keras.layers.Conv2D(filters, (3, 3), strides=(1, 1), padding="same")(x)
        x = tf.keras.layers.BatchNormalization()(x)

        # Add the output of the main branch to the shortcut
        x = tf.keras.layers.add([x, shortcut])
        x = tf.keras.layers.Activation("relu")(x)
        return x


    # Define a ResNet model
    def ResNet(input_shape, num_classes):
        inputs = tf.keras.layers.Input(shape=input_shape)

        # Stem
        x = tf.keras.layers.ZeroPadding2D(padding=(3, 3))(inputs)
        x = tf.keras.layers.Conv2D(64, (7, 7), strides=(2, 2))(x)
        x = tf.keras.layers.BatchNormalization()(x)
        x = tf.keras.layers.Activation("relu")(x)
        x = tf.keras.layers.MaxPooling2D((3, 3), strides=(2, 2))(x)

        # Residual block group 1
        x = ResidualBlock(x, filters=64)
        x = ResidualBlock(x, filters=64)

        # Residual block group 2
        x = ResidualBlock(x, filters=128, downsample=True)
        x = ResidualBlock(x, filters=128)

        # Residual block group 3
        x = ResidualBlock(x, filters=256, downsample=True)
        x = ResidualBlock(x, filters=256)

        # Residual block group 4
        x = ResidualBlock(x, filters=512, downsample=True)
        x = ResidualBlock(x, filters=512)

        # Global average pooling
        x = tf.keras.layers.GlobalAveragePooling2D()(x)

        # Output layer
        outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)

        # Create the model
        model = tf.keras.Model(inputs=inputs, outputs=outputs)
        return model

Overall, ResNet greatly increased network depth, and in 2015 it brought the Top-5 error rate on ImageNet image recognition down to 3.57%.