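I recently studied the ResNet50 model, used it in a Kaggle competition, and read through its Keras implementation carefully. For the competition I modified the source a little: I added regularization terms and switched the activation function to ELU. The result can be copied directly into later projects.
There are already plenty of ResNet50 structure diagrams online, for example in this post: https://blog.csdn.net/nima1994/article/details/82686132.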
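As the figure shows, ResNet50 consists of two main parts. The first is the plain network, the left-hand side of the figure above: an ordinary stack of convolution, batch normalization, and activation layers. With only this part, it would be a regular convolutional neural network. ResNet additionally has the right-hand side of the figure: a shortcut connection spanning every three convolutional layers, which forms a residual block. Residual blocks are introduced mainly to address the degradation problem of deep networks: when a network is too deep, gradients tend to vanish, the model becomes hard to optimize, and training suffers, so that even the training error rises, to say nothing of overfitting.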
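To make the idea of a residual block concrete, here is a minimal pure-Python sketch, not taken from the Keras source. It assumes the block computes F(x) + x elementwise; the names `residual_block` and `zero_branch` are made up for illustration. When the residual branch outputs zeros, the block reduces to the identity mapping, which is why extra blocks should not make optimization harder in principle.

```python
def residual_block(x, residual_fn):
    """Return F(x) + x elementwise, where F is `residual_fn` (hypothetical helper)."""
    fx = residual_fn(x)
    if len(fx) != len(x):
        raise ValueError("the residual branch must preserve the shape of x")
    return [a + b for a, b in zip(fx, x)]


def zero_branch(x):
    """A residual branch that has learned nothing: it outputs all zeros."""
    return [0.0 for _ in x]


x = [1.0, -2.0, 3.5]
out = residual_block(x, zero_branch)  # the shortcut alone carries x through
```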
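ResNet50 looks complicated, but it is mainly built by stacking two kinds of building blocks: the identity residual block (identity block), whose shortcut connection is an identity mapping, and the convolutional residual block, whose shortcut connection is a convolution followed by batch normalization, as shown in the figures below: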
Identity block
Convolution block
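One rough way to tell which of the two blocks fits a given position is to compare the input shape with the output shape of the main path. The sketch below is only an illustration under that assumption; `needs_conv_shortcut` is a hypothetical helper and is not part of Keras.

```python
def needs_conv_shortcut(in_channels, out_channels, strides=(1, 1)):
    """Return True when an identity shortcut cannot be added to the main path.

    An identity shortcut only works if the input and the main-path output
    have the same number of channels and the same spatial size.
    """
    channels_change = in_channels != out_channels
    spatial_change = tuple(strides) != (1, 1)
    return channels_change or spatial_change


# First block of stage 2: 64 input channels, 256 output channels, stride 1.
first_of_stage2 = needs_conv_shortcut(64, 256, strides=(1, 1))
# Second block of stage 2: 256 channels in and out, stride 1.
second_of_stage2 = needs_conv_shortcut(256, 256, strides=(1, 1))
# First block of stage 3: 256 -> 512 channels with stride 2.
first_of_stage3 = needs_conv_shortcut(256, 512, strides=(2, 2))
```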
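The overall organization of the network is then: a few opening layers + (1+2) + (1+3) + (1+5) + (1+2) + pooling and a fully connected layer, where each (1+n) stands for one convolution block followed by n identity blocks.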
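As a quick sanity check on where the "50" comes from, the layout above can be tallied in a few lines. This sketch assumes the usual counting convention: the opening convolution, the three main-path convolutions of every block, and the final fully connected layer are counted, while the convolutions on the shortcuts are not.

```python
# (convolution blocks, identity blocks) per stage, read off "(1+2)+(1+3)+(1+5)+(1+2)"
stages = [(1, 2), (1, 3), (1, 5), (1, 2)]

CONVS_PER_BLOCK = 3  # main path of each block: 1x1 conv, kxk conv, 1x1 conv

blocks_per_stage = [conv + identity for conv, identity in stages]
main_path_convs = CONVS_PER_BLOCK * sum(blocks_per_stage)

# opening 7x7 convolution + main-path convolutions + final fully connected layer
total_weight_layers = 1 + main_path_convs + 1
print(blocks_per_stage, total_weight_layers)
```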
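Below are my small modifications to the ResNet50 Keras source: the activation layers are changed to ELU and an L2 regularization term is added. Switching to the ELU activation only requires the keras.layers.ELU layer. There is one small pitfall here: do not use the keras.activations.elu function. It only maps one tensor to another and does not create a layer, and a model cannot be built from two tensors that are not connected by layers. Take the following code, for example: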
from keras import activations
from keras.layers import Input
from keras.models import Model
input_tensor=Input(shape=(3,3))
x=activations.elu(input_tensor,alpha=1)
model=Model(input_tensor, x)
Running this raises the following error:
ValueError: Output tensors to a Model must be the output of a Keras `Layer` (thus holding past layer metadata). Found: Tensor("Elu_1:0", shape=(?, 3, 3), dtype=float32)
This happens because input_tensor and x are not connected by a layer; activations.elu is just a function.
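For reference, this is what ELU computes on a single value, written as a small pure-Python sketch of the standard definition (x for x > 0, alpha * (exp(x) - 1) otherwise). It is independent of Keras and only illustrates the function; the `elu` helper here is a stand-in, not the Keras implementation.

```python
import math


def elu(x, alpha=1.0):
    """Exponential linear unit for a single float."""
    if x > 0:
        return x
    # For x <= 0 the output decays smoothly towards -alpha.
    return alpha * (math.exp(x) - 1.0)


samples = [-3.0, -1.0, 0.0, 2.0]
activated = [elu(v) for v in samples]
```

Unlike ReLU, negative inputs are not clipped to zero; they saturate at -alpha, so with alpha=1.0 the outputs stay within [-1, 0] for x <= 0.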
To add a regularizer, first run from keras import regularizers, then pass the argument kernel_regularizer=regularizers.l2(somevalue) when creating a layer, for example:
from keras import regularizers
from keras.layers import Input
from keras.layers import Conv2D
input_tensor=Input((256,256,3))
# Apply the layer to input_tensor so that x is a tensor rather than a layer object.
x=Conv2D(kernel_size=(2,2),filters=3,
         kernel_regularizer=regularizers.l2(0.1))(input_tensor)
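To show what the L2 regularizer contributes, here is a small pure-Python sketch. It assumes the penalty added to the loss is the factor times the sum of squared kernel weights; `l2_penalty` and the toy kernel are made up for illustration.

```python
def l2_penalty(kernel, l2_factor):
    """Return l2_factor * sum(w ** 2) over a 2-D list of weights."""
    return l2_factor * sum(w * w for row in kernel for w in row)


# A toy 2x2 kernel; the squared weights sum to 1 + 4 + 9 + 0.25 = 14.25.
kernel = [[1.0, -2.0],
          [3.0, 0.5]]
penalty = l2_penalty(kernel, 0.1)  # roughly 1.425 is added to the loss
```

A larger factor pushes the weights harder towards zero, so a value like 0.1 is quite strong for a deep network; smaller values are usually tried first.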
Here is the complete code:
from __future__ import print_function
import numpy as np
import os
import warnings
from keras.layers import Input
from keras import layers
from keras.layers import Dense
from keras.layers import Activation
from keras.layers import Flatten
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import GlobalMaxPooling2D
from keras.layers import ZeroPadding2D
from keras.layers import AveragePooling2D
from keras.layers import GlobalAveragePooling2D
from keras.layers import BatchNormalization
from keras.models import Model
from keras.preprocessing import image
import keras.backend as K
from keras.utils import layer_utils
from keras.utils.data_utils import get_file
from keras_applications.imagenet_utils import decode_predictions
from keras_applications.imagenet_utils import preprocess_input
from keras_applications.imagenet_utils import _obtain_input_shape
from keras.engine.topology import get_source_inputs
from keras import regularizers
from keras.layers import ELU
"""
Build an identity block: take the input tensor and return the output tensor of the identity residual block.
"""
def identity_block(input_tensor, kernel_size, filters, stage, block, reg=None):
    """
    The block consists of three convolutional layers plus an identity shortcut connection.
    input_tensor is the input tensor,
    kernel_size is the kernel shape of the middle convolution; the other convolutions use (1,1) kernels,
    stage and block are used to name the layers,
    filters is a list holding the number of filters of each convolutional layer,
    reg is the regularizer.
    """
    filters1, filters2, filters3=filters
    if K.image_data_format()=='channels_last':
        bn_axis=3
    else:
        bn_axis=1
    conv_name_base='res'+str(stage)+block+'_branch'
    bn_name_base='bn'+str(stage)+block+'_branch'
    x=Conv2D(filters1,
             (1,1),
             name=conv_name_base+'2a',
             kernel_regularizer=reg)(input_tensor)
    x=BatchNormalization(axis=bn_axis, name=bn_name_base+'2a')(x)
    x=ELU(alpha=1.0)(x)
    x=Conv2D(filters2, kernel_size, padding='same',
             name=conv_name_base+'2b',
             kernel_regularizer=reg)(x)
    x=BatchNormalization(axis=bn_axis,name=bn_name_base+'2b')(x)
    # ELU activation layer
    x=ELU(alpha=1.0)(x)
    x=Conv2D(filters3,(1,1),name=conv_name_base+'2c',
             kernel_regularizer=reg)(x)
    x=BatchNormalization(axis=bn_axis,
                         name=bn_name_base+'2c')(x)
    # Identity shortcut: add the block input directly to the main path.
    x=layers.add([x,input_tensor])
    x=ELU(alpha=1.0)(x)
    return x
"""
Build a convolution block.
"""
def conv_block(input_tensor,kernel_size,filters,
               stage,block,reg=None,strides=(2,2)):
    filters1,filters2,filters3=filters
    if K.image_data_format()=='channels_last':
        bn_axis=3
    else:
        bn_axis=1
    conv_name_base='res'+str(stage)+block+'_branch'
    bn_name_base='bn'+str(stage)+block+'_branch'
    x=Conv2D(filters1,(1,1),strides=strides,
             name=conv_name_base+'2a',
             kernel_regularizer=reg)(input_tensor)
    x=BatchNormalization(axis=bn_axis,name=bn_name_base+'2a')(x)
    x=ELU(alpha=1.0)(x)
    x=Conv2D(filters2, kernel_size, padding='same',
             name=conv_name_base+'2b',
             kernel_regularizer=reg)(x)
    x=BatchNormalization(axis=bn_axis,name=bn_name_base+'2b')(x)
    x=ELU(alpha=1.0)(x)
    x=Conv2D(filters3,(1,1),name=conv_name_base+'2c',
             kernel_regularizer=reg)(x)
    x=BatchNormalization(axis=bn_axis,name=bn_name_base+'2c')(x)
    # Shortcut branch: 1x1 convolution + batch normalization so that shapes match.
    shortcut=Conv2D(filters3,(1,1),strides=strides,
                    name=conv_name_base+'1',
                    kernel_regularizer=reg)(input_tensor)
    shortcut=BatchNormalization(axis=bn_axis,name=bn_name_base+'1')(shortcut)
    x=layers.add([x,shortcut])
    x=ELU(alpha=1.0)(x)
    return x
"""
Build the ResNet50 model and return a keras Model; pretrained weights can be loaded.
"""
def ResNet50(include_top=True,
             weights='imagenet',
             input_tensor=None,
             input_shape=None,
             pooling=None,
             classes=1000,
             reg=None):
    """
    include_top: whether to include the top of the network; if False, the output layer is dropped.
    """
    WEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.2/resnet50_weights_tf_dim_ordering_tf_kernels.h5'
    WEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.2/resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5'
    if not (weights in {'imagenet', None} or os.path.exists(weights)):
        raise ValueError('The `weights` argument should be either '
                         '`None` (random initialization), `imagenet` '
                         '(pre-training on ImageNet), '
                         'or the path to the weights file to be loaded.')
    if weights == 'imagenet' and include_top and classes != 1000:
        raise ValueError('If using `weights` as `"imagenet"` with `include_top`'
                         ' as true, `classes` should be 1000')
    # Determine proper input shape
    input_shape = _obtain_input_shape(input_shape,
                                      default_size=224,
                                      min_size=32,
                                      data_format=K.image_data_format(),
                                      require_flatten=include_top,
                                      weights=weights)
    if input_tensor is None:
        img_input = layers.Input(shape=input_shape)
    else:
        if not K.is_keras_tensor(input_tensor):
            img_input = layers.Input(tensor=input_tensor, shape=input_shape)
        else:
            img_input = input_tensor
    if K.image_data_format() == 'channels_last':
        bn_axis = 3
    else:
        bn_axis = 1
    # Use img_input here: input_tensor is None when the caller passes no tensor.
    x=ZeroPadding2D((3,3))(img_input)
    x=Conv2D(64,(7,7),strides=(2,2),name='conv1',
             kernel_regularizer=reg)(x)
    x=BatchNormalization(axis=bn_axis,name="bn_conv1")(x)
    x=ELU(alpha=1.0)(x)
    x=MaxPooling2D((3,3),strides=(2,2))(x)
    x=conv_block(x,3,[64,64,256],stage=2,block='a',strides=(1,1),reg=reg)
    x=identity_block(x,3,[64,64,256],stage=2,block='b',reg=reg)
    x=identity_block(x,3,[64,64,256],stage=2,block='c',reg=reg)
    x=conv_block(x,3,[128,128,512],stage=3,block='a',reg=reg)
    x=identity_block(x,3,[128,128,512],stage=3,block='b',reg=reg)
    x=identity_block(x,3,[128,128,512],stage=3,block='c',reg=reg)
    x=identity_block(x,3,[128,128,512],stage=3,block='d',reg=reg)
    x = conv_block(x, 3, [256, 256, 1024], stage=4, block='a',reg=reg)
    x = identity_block(x, 3, [256, 256, 1024], stage=4, block='b',reg=reg)
    x = identity_block(x, 3, [256, 256, 1024], stage=4, block='c',reg=reg)
    x = identity_block(x, 3, [256, 256, 1024], stage=4, block='d',reg=reg)
    x = identity_block(x, 3, [256, 256, 1024], stage=4, block='e',reg=reg)
    x = identity_block(x, 3, [256, 256, 1024], stage=4, block='f',reg=reg)
    x = conv_block(x, 3, [512, 512, 2048], stage=5, block='a',reg=reg)
    x = identity_block(x, 3, [512, 512, 2048], stage=5, block='b',reg=reg)
    x = identity_block(x, 3, [512, 512, 2048], stage=5, block='c',reg=reg)
    x = AveragePooling2D((7, 7), name='avg_pool')(x)
    if include_top:
        x = Flatten()(x)
        x = Dense(classes, activation='softmax', name='fc1000')(x)
    else:
        if pooling == 'avg':
            x = GlobalAveragePooling2D()(x)
        elif pooling == 'max':
            x = GlobalMaxPooling2D()(x)
    # Ensure that the model takes into account
    # any potential predecessors of `input_tensor`.
    if input_tensor is not None:
        # Note: if input_tensor already belongs to another model, get_source_inputs
        # traces back to that model's input tensors and uses them as the inputs here.
        # The earlier model is therefore pulled in automatically, which changes the ResNet50 structure.
        inputs = get_source_inputs(input_tensor)
    else:
        inputs = img_input
    # Create model.
    model = Model(inputs, x, name='resnet50')
    # load weights
    if weights == 'imagenet':
        if include_top:
            weights_path = get_file('resnet50_weights_tf_dim_ordering_tf_kernels.h5',
                                    WEIGHTS_PATH,
                                    cache_subdir='models',
                                    md5_hash='a7b3fe01876f51b976af0dea6bc144eb')
        else:
            weights_path = get_file('resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5',
                                    WEIGHTS_PATH_NO_TOP,
                                    cache_subdir='models',
                                    md5_hash='a268eb855778b3df3c7506639542a6af')
        model.load_weights(weights_path,by_name=True)
        if K.backend() == 'theano':
            layer_utils.convert_all_kernels_in_model(model)
        if K.image_data_format() == 'channels_first':
            if include_top:
                maxpool = model.get_layer(name='avg_pool')
                shape = maxpool.output_shape[1:]
                dense = model.get_layer(name='fc1000')
                layer_utils.convert_dense_weights_data_format(dense, shape, 'channels_first')
            if K.backend() == 'tensorflow':
                warnings.warn('You are using the TensorFlow backend, yet you '
                              'are using the Theano '
                              'image data format convention '
                              '(`image_data_format="channels_first"`). '
                              'For best performance, set '
                              '`image_data_format="channels_last"` in '
                              'your Keras config '
                              'at ~/.keras/keras.json.')
    return model
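To see why the final AveragePooling2D((7, 7)) fits, the spatial size can be traced through the network with plain arithmetic. The sketch below assumes a 224x224 input and the standard output-size formula for unpadded ('valid') convolution and pooling; `conv_out` is a hypothetical helper. The 3x3 convolutions use padding='same' and the remaining 1x1 convolutions use stride 1, so only the layers listed here change the size.

```python
def conv_out(size, kernel, stride, pad=0):
    """Output length of a 'valid' convolution/pooling along one spatial axis."""
    return (size + 2 * pad - kernel) // stride + 1


size = 224
size = conv_out(size, kernel=7, stride=2, pad=3)  # ZeroPadding2D((3,3)) + conv1
after_conv1 = size
size = conv_out(size, kernel=3, stride=2)         # MaxPooling2D((3,3), strides=(2,2))

trace = [size]
# Stride of the first 1x1 convolution in the conv_block of stages 2 to 5.
for stage_stride in (1, 2, 2, 2):
    size = conv_out(size, kernel=1, stride=stage_stride)
    trace.append(size)

# AveragePooling2D((7, 7)): the stride defaults to the pool size.
after_avg_pool = conv_out(size, kernel=7, stride=7)
print(after_conv1, trace, after_avg_pool)
```

With other input sizes the last feature map is no longer 7x7, which is one reason to prefer include_top=False together with pooling='avg' when feeding images of a different resolution.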