From the official documentation:
https://www.mindspore.cn/tutorial/training/zh-CN/r1.2/use/save_model.html?highlight=save_checkpoint
During training, checkpoints (CheckPoint) can be saved to persist the model's parameters for later inference and retraining. If you then want to run inference on a different hardware platform, you can generate MindIR, AIR, or ONNX format files from the network definition plus the CheckPoint file.
- MindIR: MindSpore's graph-based functional IR. It defines an extensible graph structure and an IR representation of operators, eliminating model differences between backends, so a model trained on Ascend 910 can run inference on Ascend 310, on GPU, and on the device side with MindSpore Lite.
- CheckPoint: a MindSpore binary file that stores the values of all trained parameters. It uses Google's Protocol Buffers mechanism, which is language- and platform-independent and highly extensible. The CheckPoint protocol format is defined in mindspore/ccsrc/utils/checkpoint.proto.
- AIR: Ascend Intermediate Representation, an open file format for machine learning defined by Huawei, similar to ONNX but better adapted to Ascend AI processors.
- ONNX: Open Neural Network Exchange, an open file format designed for machine learning and used to store trained models.
The examples below show how to save a CheckPoint file and how to export MindIR, AIR, and ONNX files.
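As a minimal sketch of the two APIs involved, assuming nothing but a toy single-layer network (the network, shapes, and file names here are illustrative, not part of the demos below):

import numpy as np
import mindspore.nn as nn
from mindspore import Tensor, export
from mindspore.train.serialization import save_checkpoint

# A toy network; any nn.Cell is handled the same way.
net = nn.Dense(10, 2)

# save_checkpoint stores only the parameter values, not the graph.
save_checkpoint(net, "toy.ckpt")

# export traces the graph with a sample input, so the resulting file
# stores both the structure and the parameters.
dummy_input = Tensor(np.zeros([1, 10], np.float32))
export(net, dummy_input, file_name="toy", file_format="MINDIR")
# The same call with file_format="AIR" (Ascend only) or "ONNX"
# produces the other two formats.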
=======================================================
From this we can conclude:
A CheckPoint file stores only the parameters (Parameter), not the network structure or other information. To use such a backup file, a definition of the network structure must already be available so the parameters can be loaded into it.
A MindIR file stores both the network structure and the parameters (Parameter), so it can be used without a separate network definition, e.g. for inference or retraining. Keep in mind, though, that MindIR is meant for cross-platform use: the typical workflow is to train the network in Python, export a MindIR file, and then load the model and its parameters from C++ on device-side platforms such as mobile. (Recent MindSpore releases can also load a MindIR file back into Python via mindspore.load plus nn.GraphCell, but cross-platform deployment remains the primary use case.)
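A minimal sketch of the two loading paths (the toy network and file names match the sketch above and are illustrative; the nn.GraphCell path assumes a MindSpore version that supports loading MindIR from Python):

import numpy as np
import mindspore
import mindspore.nn as nn
from mindspore import Tensor
from mindspore.train.serialization import load_checkpoint, load_param_into_net

# CheckPoint: the network structure must be defined first,
# then the saved parameter values are loaded into it.
net = nn.Dense(10, 2)
param_dict = load_checkpoint("toy.ckpt")
load_param_into_net(net, param_dict)

# MindIR: structure and parameters both come from the file,
# so no Python network definition is needed.
graph = mindspore.load("toy.mindir")
net_from_mindir = nn.GraphCell(graph)
out = net_from_mindir(Tensor(np.zeros([1, 10], np.float32)))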
--------------------------------------------------------------------------------------------------------------------------
MindSpore_hub (mindspore_hub) is the official tool for downloading pretrained network models and parameters:
With it we can download an already-trained model for transfer learning, retraining, and similar tasks.
A demo follows (downloading the dataset is not covered here; see the earlier posts):
import os
import mindspore_hub as mshub
import mindspore
from mindspore import context, Tensor, nn
from mindspore.nn import Momentum
from mindspore.train.serialization import save_checkpoint, load_checkpoint, load_param_into_net
from mindspore import ops
import mindspore.dataset as ds
import mindspore.dataset.transforms.c_transforms as C2
import mindspore.dataset.vision.c_transforms as C
from mindspore import dtype as mstype
from mindspore import Model

# Define the new top-layer structure
class ReduceMeanFlatten(nn.Cell):
    def __init__(self):
        super(ReduceMeanFlatten, self).__init__()
        self.mean = ops.ReduceMean(keep_dims=True)
        self.flatten = nn.Flatten()

    def construct(self, x):
        x = self.mean(x, (2, 3))
        x = self.flatten(x)
        return x

# Build the per-step learning rate schedule
def generate_steps_lr(lr_init, steps_per_epoch, total_epochs):
    total_steps = total_epochs * steps_per_epoch
    decay_epoch_index = [0.3 * total_steps, 0.6 * total_steps, 0.8 * total_steps]
    lr_each_step = []
    for i in range(total_steps):
        if i < decay_epoch_index[0]:
            lr = lr_init
        elif i < decay_epoch_index[1]:
            lr = lr_init * 0.1
        elif i < decay_epoch_index[2]:
            lr = lr_init * 0.01
        else:
            lr = lr_init * 0.001
        lr_each_step.append(lr)
    return lr_each_step

# Build the dataset
def create_cifar10dataset(dataset_path, batch_size, do_train):
    if do_train:
        usage, shuffle = "train", True
    else:
        usage, shuffle = "test", False
    # note: the original passed shuffle=True unconditionally; use the flag instead
    data_set = ds.Cifar10Dataset(dataset_dir=dataset_path, usage=usage, shuffle=shuffle)

    # define map operations
    trans = [C.Resize((256, 256))]
    if do_train:
        trans += [
            C.RandomHorizontalFlip(prob=0.5),
        ]
    trans += [
        C.Rescale(1.0 / 255.0, 0.0),
        C.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5]),
        C.HWC2CHW()
    ]
    type_cast_op = C2.TypeCast(mstype.int32)

    data_set = data_set.map(operations=type_cast_op, input_columns="label", num_parallel_workers=8)
    data_set = data_set.map(operations=trans, input_columns="image", num_parallel_workers=8)

    # apply batch operations
    data_set = data_set.batch(batch_size, drop_remainder=True)
    return data_set

# Create Dataset
dataset_path = "datasets/cifar-10-batches-bin/train"
dataset = create_cifar10dataset(dataset_path, batch_size=32, do_train=True)

# Assemble the full network
model = "mindspore/ascend/1.0/mobilenetv2_v1.0_openimage"
network = mshub.load(model, num_classes=500, include_top=False, activation="Sigmoid")
network.set_train(False)

# Check MindSpore Hub website to conclude that the last output shape is 1280.
last_channel = 1280

# The number of classes in target task is 10.
num_classes = 10

reduce_mean_flatten = ReduceMeanFlatten()
classification_layer = nn.Dense(last_channel, num_classes)
classification_layer.set_train(True)

train_network = nn.SequentialCell([network, reduce_mean_flatten, classification_layer])

# Training setup
# Set epoch size
epoch_size = 60

# Wrap the backbone network with loss.
loss_fn = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction="mean")
loss_net = nn.WithLossCell(train_network, loss_fn)

# Create an optimizer.
steps_per_epoch = dataset.get_dataset_size()
lr = generate_steps_lr(lr_init=0.01, steps_per_epoch=steps_per_epoch, total_epochs=epoch_size)
optim = Momentum(filter(lambda x: x.requires_grad, classification_layer.get_parameters()),
                 Tensor(lr, mindspore.float32), 0.9, 4e-5)

# Build the one-step training cell
train_net = nn.TrainOneStepCell(loss_net, optim)

for epoch in range(epoch_size):
    for i, items in enumerate(dataset):
        data, label = items
        data = mindspore.Tensor(data)
        label = mindspore.Tensor(label)

        loss = train_net(data, label)
        print(f"epoch: {epoch}/{epoch_size}, loss: {loss}")
    # Save the ckpt file for each epoch.
    if not os.path.exists('ckpt'):
        os.mkdir('ckpt')
    ckpt_path = f"./ckpt/cifar10_finetune_epoch{epoch}.ckpt"
    save_checkpoint(train_network, ckpt_path)
This code downloads the following model from MindSpore_hub:
"mindspore/ascend/1.0/mobilenetv2_v1.0_openimage"
Core code:

# Assemble the full network
model = "mindspore/ascend/1.0/mobilenetv2_v1.0_openimage"
network = mshub.load(model, num_classes=500, include_top=False, activation="Sigmoid")
network.set_train(False)
This downloads an already-trained network model network from MindSpore_hub, without the fully connected head (include_top=False) and with Sigmoid as the activation function.
Since the downloaded model has no fully connected layer, we design a new top for it ourselves:
reduce_mean_flatten = ReduceMeanFlatten()
classification_layer = nn.Dense(last_channel, num_classes)
classification_layer.set_train(True)
Finally, the downloaded network and the self-built layers are chained together into the final network:
train_network = nn.SequentialCell([network, reduce_mean_flatten, classification_layer])
Only the self-built final fully connected layer is set as trainable; the lower feature-extraction layers (the CNN backbone) are all kept frozen. In the demo this happens implicitly, because only classification_layer's parameters are handed to the optimizer; an explicit alternative is sketched below.
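A hedged sketch of that explicit alternative, reusing network and train_network from the demo above (this is not what the demo does, just an equivalent formulation):

from mindspore.nn import Momentum

# Mark every backbone parameter as non-trainable instead of
# filtering inside the optimizer constructor.
for param in network.trainable_params():
    param.requires_grad = False

# An optimizer over the whole assembled network now only updates
# the classifier, since trainable_params() filters on requires_grad.
optim = Momentum(train_network.trainable_params(),
                 learning_rate=0.01, momentum=0.9)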
After each training epoch the trained parameters are saved. Note that the code saves the whole train_network; since the backbone is frozen, only the classification_layer parameters actually differ between checkpoints. A sketch of saving just that layer follows.
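If you want the checkpoint to contain nothing but the classifier, save_checkpoint can be pointed at that sub-cell directly; a minimal sketch reusing classification_layer from the demo above (the file name is illustrative):

from mindspore.train.serialization import save_checkpoint

# Saves just the Dense layer's weight and bias; the frozen backbone
# can always be re-downloaded from MindSpore Hub.
save_checkpoint(classification_layer, "./ckpt/classifier_only.ckpt")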
=================================
The next demo does the following:
1. download the lower-level network structure and parameters from MindSpore_hub,
2. load the parameters of the final fully connected layer classification_layer from disk,
3. export the assembled network, structure and trained parameters together, as a MindIR file, so that it can be retrained or used for inference from a cross-platform C++ environment.
import os
import mindspore_hub as mshub
import mindspore
from mindspore import context, export, Tensor, nn
from mindspore.nn import Momentum
from mindspore.train.serialization import save_checkpoint, load_checkpoint, load_param_into_net
from mindspore import ops
import mindspore.dataset as ds
import mindspore.dataset.transforms.c_transforms as C2
import mindspore.dataset.vision.c_transforms as C
from mindspore import dtype as mstype
from mindspore import Model

# Define the new top-layer structure
class ReduceMeanFlatten(nn.Cell):
    def __init__(self):
        super(ReduceMeanFlatten, self).__init__()
        self.mean = ops.ReduceMean(keep_dims=True)
        self.flatten = nn.Flatten()

    def construct(self, x):
        x = self.mean(x, (2, 3))
        x = self.flatten(x)
        return x

# Build the per-step learning rate schedule
def generate_steps_lr(lr_init, steps_per_epoch, total_epochs):
    total_steps = total_epochs * steps_per_epoch
    decay_epoch_index = [0.3 * total_steps, 0.6 * total_steps, 0.8 * total_steps]
    lr_each_step = []
    for i in range(total_steps):
        if i < decay_epoch_index[0]:
            lr = lr_init
        elif i < decay_epoch_index[1]:
            lr = lr_init * 0.1
        elif i < decay_epoch_index[2]:
            lr = lr_init * 0.01
        else:
            lr = lr_init * 0.001
        lr_each_step.append(lr)
    return lr_each_step

# Build the dataset
def create_cifar10dataset(dataset_path, batch_size, do_train):
    if do_train:
        usage, shuffle = "train", True
    else:
        usage, shuffle = "test", False
    # note: the original passed shuffle=True unconditionally; use the flag instead
    data_set = ds.Cifar10Dataset(dataset_dir=dataset_path, usage=usage, shuffle=shuffle)

    # define map operations
    trans = [C.Resize((256, 256))]
    if do_train:
        trans += [
            C.RandomHorizontalFlip(prob=0.5),
        ]
    trans += [
        C.Rescale(1.0 / 255.0, 0.0),
        C.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5]),
        C.HWC2CHW()
    ]
    type_cast_op = C2.TypeCast(mstype.int32)

    data_set = data_set.map(operations=type_cast_op, input_columns="label", num_parallel_workers=8)
    data_set = data_set.map(operations=trans, input_columns="image", num_parallel_workers=8)

    # apply batch operations
    data_set = data_set.batch(batch_size, drop_remainder=True)
    return data_set

# Create Dataset
dataset_path = "datasets/cifar-10-batches-bin/train"
dataset = create_cifar10dataset(dataset_path, batch_size=32, do_train=True)

# Assemble the full network
model = "mindspore/ascend/1.0/mobilenetv2_v1.0_openimage"
network = mshub.load(model, num_classes=500, include_top=False, activation="Sigmoid")
network.set_train(False)

# Check MindSpore Hub website to conclude that the last output shape is 1280.
last_channel = 1280

# The number of classes in target task is 10.
num_classes = 10

reduce_mean_flatten = ReduceMeanFlatten()
classification_layer = nn.Dense(last_channel, num_classes)
classification_layer.set_train(True)

train_network = nn.SequentialCell([network, reduce_mean_flatten, classification_layer])

# Training setup
# Set epoch size
epoch_size = 60

# Wrap the backbone network with loss.
loss_fn = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction="mean")
loss_net = nn.WithLossCell(train_network, loss_fn)

# Create an optimizer.
steps_per_epoch = dataset.get_dataset_size()
lr = generate_steps_lr(lr_init=0.01, steps_per_epoch=steps_per_epoch, total_epochs=epoch_size)
optim = Momentum(filter(lambda x: x.requires_grad, classification_layer.get_parameters()),
                 Tensor(lr, mindspore.float32), 0.9, 4e-5)

# Build the one-step training cell
train_net = nn.TrainOneStepCell(loss_net, optim)

ckpt_path = "./ckpt/cifar10_finetune_epoch59.ckpt"

# Load the saved parameters into the classifier
param_dict = load_checkpoint(ckpt_path)
load_param_into_net(classification_layer, param_dict)

data = next(dataset.create_dict_iterator())
# print(data)
in_put = data['image']
out_put = data['label']

export(train_network, in_put, file_name='mobilenetv2', file_format='MINDIR')

"""
for epoch in range(epoch_size):
    for i, items in enumerate(dataset):
        data, label = items
        data = mindspore.Tensor(data)
        label = mindspore.Tensor(label)

        loss = train_net(data, label)
        print(f"epoch: {epoch}/{epoch_size}, loss: {loss}")
    # Save the ckpt file for each epoch.
    if not os.path.exists('ckpt'):
        os.mkdir('ckpt')
    ckpt_path = f"./ckpt/cifar10_finetune_epoch{epoch}.ckpt"
    save_checkpoint(train_network, ckpt_path)
"""
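Note that export() only needs a sample input with the correct shape and dtype to trace the graph; pulling a real batch from the dataset, as the demo does, is not required. A hedged sketch of the same export with a zero tensor, reusing train_network from the demo (the shape is assumed from the demo's 32x3x256x256 batches):

import numpy as np
from mindspore import Tensor, export

# Any tensor of the expected shape and dtype works as the trace input.
dummy_batch = Tensor(np.zeros([32, 3, 256, 256], np.float32))
export(train_network, dummy_batch, file_name='mobilenetv2', file_format='MINDIR')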
========================================================
The following code once again:
1. downloads the lower-level network structure and parameters from MindSpore_hub,
2. loads the parameters of the final fully connected layer classification_layer from disk,
3. runs inference with the assembled, trained network.
The difference from the code above is that here we load, one after another, the classification_layer parameters from 60 different training epochs, assembling 60 inference networks that differ only in that final layer, and evaluate each one.
Since 60 models are assembled, 60 evaluation results are produced.
import os
import mindspore_hub as mshub
import mindspore
from mindspore import context, export, Tensor, nn
from mindspore.nn import Momentum
from mindspore.train.serialization import save_checkpoint, load_checkpoint, load_param_into_net
from mindspore import ops
import mindspore.dataset as ds
import mindspore.dataset.transforms.c_transforms as C2
import mindspore.dataset.vision.c_transforms as C
from mindspore import dtype as mstype
from mindspore import Model

# Define the new top-layer structure
class ReduceMeanFlatten(nn.Cell):
    def __init__(self):
        super(ReduceMeanFlatten, self).__init__()
        self.mean = ops.ReduceMean(keep_dims=True)
        self.flatten = nn.Flatten()

    def construct(self, x):
        x = self.mean(x, (2, 3))
        x = self.flatten(x)
        return x

# Build the per-step learning rate schedule (kept from the training demo; unused here)
def generate_steps_lr(lr_init, steps_per_epoch, total_epochs):
    total_steps = total_epochs * steps_per_epoch
    decay_epoch_index = [0.3 * total_steps, 0.6 * total_steps, 0.8 * total_steps]
    lr_each_step = []
    for i in range(total_steps):
        if i < decay_epoch_index[0]:
            lr = lr_init
        elif i < decay_epoch_index[1]:
            lr = lr_init * 0.1
        elif i < decay_epoch_index[2]:
            lr = lr_init * 0.01
        else:
            lr = lr_init * 0.001
        lr_each_step.append(lr)
    return lr_each_step

# Build the dataset
def create_cifar10dataset(dataset_path, batch_size, do_train):
    if do_train:
        usage, shuffle = "train", True
    else:
        usage, shuffle = "test", False
    # note: the original passed shuffle=True unconditionally; use the flag instead
    data_set = ds.Cifar10Dataset(dataset_dir=dataset_path, usage=usage, shuffle=shuffle)

    # define map operations
    trans = [C.Resize((256, 256))]
    if do_train:
        trans += [
            C.RandomHorizontalFlip(prob=0.5),
        ]
    trans += [
        C.Rescale(1.0 / 255.0, 0.0),
        C.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5]),
        C.HWC2CHW()
    ]
    type_cast_op = C2.TypeCast(mstype.int32)

    data_set = data_set.map(operations=type_cast_op, input_columns="label", num_parallel_workers=8)
    data_set = data_set.map(operations=trans, input_columns="image", num_parallel_workers=8)

    # apply batch operations
    data_set = data_set.batch(batch_size, drop_remainder=True)
    return data_set

# Assemble the full network
model = "mindspore/ascend/1.0/mobilenetv2_v1.0_openimage"
network = mshub.load(model, num_classes=500, include_top=False, activation="Sigmoid")
network.set_train(False)

# Check MindSpore Hub website to conclude that the last output shape is 1280.
last_channel = 1280

# The number of classes in target task is 10.
num_classes = 10

reduce_mean_flatten = ReduceMeanFlatten()
classification_layer = nn.Dense(last_channel, num_classes)
classification_layer.set_train(True)

train_network = nn.SequentialCell([network, reduce_mean_flatten, classification_layer])

# Wrap the backbone network with loss.
loss_fn = nn.SoftmaxCrossEntropyWithLogits(sparse=True, reduction="mean")

dataset_path = "datasets/cifar-10-batches-bin/test"

# Define loss and create model.
eval_dataset = create_cifar10dataset(dataset_path, batch_size=32, do_train=False)
eval_metrics = {'Loss': nn.Loss(),
                'Top1-Acc': nn.Top1CategoricalAccuracy(),
                'Top5-Acc': nn.Top5CategoricalAccuracy()}
model = Model(train_network, loss_fn=loss_fn, optimizer=None, metrics=eval_metrics)

for i in range(60):
    # Load a pre-trained ckpt file.
    ckpt_path = "./ckpt/cifar10_finetune_epoch{}.ckpt".format(i)
    trained_ckpt = load_checkpoint(ckpt_path)
    load_param_into_net(classification_layer, trained_ckpt)

    metrics = model.eval(eval_dataset)
    print("{} epoch, metric: ".format(i), metrics)
Output:
Checking /data/devil/.mscache/mindspore/ascend/1.0/mobilenetv2_v1.0_openimage.md...Passed!
File already exists!
0 epoch, metric:
{'Loss': 1.092451021170769, 'Top1-Acc': 0.6240985576923077, 'Top5-Acc': 0.9666466346153846}
1 epoch, metric:
{'Loss': 1.0392829114810014, 'Top1-Acc': 0.6355168269230769, 'Top5-Acc': 0.9705528846153846}
2 epoch, metric:
{'Loss': 1.0918126824574592, 'Top1-Acc': 0.625, 'Top5-Acc': 0.9720552884615384}
3 epoch, metric:
{'Loss': 1.0721766576170921, 'Top1-Acc': 0.6274038461538461, 'Top5-Acc': 0.9659455128205128}
4 epoch, metric:
{'Loss': 0.96153519092462, 'Top1-Acc': 0.6670673076923077, 'Top5-Acc': 0.9751602564102564}
5 epoch, metric:
{'Loss': 0.9710252493237838, 'Top1-Acc': 0.6724759615384616, 'Top5-Acc': 0.9700520833333334}
6 epoch, metric:
{'Loss': 1.0221250235843353, 'Top1-Acc': 0.6495392628205128, 'Top5-Acc': 0.9708533653846154}
7 epoch, metric:
{'Loss': 1.0069912237425644, 'Top1-Acc': 0.6623597756410257, 'Top5-Acc': 0.96875}
8 epoch, metric:
{'Loss': 0.9235871767577453, 'Top1-Acc': 0.6824919871794872, 'Top5-Acc': 0.9779647435897436}
9 epoch, metric:
{'Loss': 0.9200147226070746, 'Top1-Acc': 0.6778846153846154, 'Top5-Acc': 0.9758613782051282}
10 epoch, metric:
{'Loss': 0.9671342744468114, 'Top1-Acc': 0.6745793269230769, 'Top5-Acc': 0.9764623397435898}
11 epoch, metric:
{'Loss': 1.0703140220198877, 'Top1-Acc': 0.6584535256410257, 'Top5-Acc': 0.9576322115384616}
12 epoch, metric:
{'Loss': 0.9990078697028832, 'Top1-Acc': 0.6669671474358975, 'Top5-Acc': 0.9703525641025641}
13 epoch, metric:
{'Loss': 0.9041080654431612, 'Top1-Acc': 0.6898036858974359, 'Top5-Acc': 0.9784655448717948}
14 epoch, metric:
{'Loss': 1.0702795883019764, 'Top1-Acc': 0.6446314102564102, 'Top5-Acc': 0.9702524038461539}
15 epoch, metric:
{'Loss': 0.9458053995592471, 'Top1-Acc': 0.6857972756410257, 'Top5-Acc': 0.9765625}
16 epoch, metric:
{'Loss': 1.1031048246301138, 'Top1-Acc': 0.6352163461538461, 'Top5-Acc': 0.96875}
17 epoch, metric:
{'Loss': 0.9591522915050005, 'Top1-Acc': 0.6714743589743589, 'Top5-Acc': 0.9774639423076923}
18 epoch, metric:
{'Loss': 0.8017457904150853, 'Top1-Acc': 0.7203525641025641, 'Top5-Acc': 0.9816706730769231}
19 epoch, metric:
{'Loss': 0.7930195404168887, 'Top1-Acc': 0.7174479166666666, 'Top5-Acc': 0.9818709935897436}
20 epoch, metric:
{'Loss': 0.7881153317598196, 'Top1-Acc': 0.7214543269230769, 'Top5-Acc': 0.9825721153846154}
21 epoch, metric:
{'Loss': 0.7935256702013505, 'Top1-Acc': 0.7229567307692307, 'Top5-Acc': 0.9830729166666666}
22 epoch, metric:
{'Loss': 0.7790054422922623, 'Top1-Acc': 0.7264623397435898, 'Top5-Acc': 0.9827724358974359}
23 epoch, metric:
{'Loss': 0.8037189149703735, 'Top1-Acc': 0.7159455128205128, 'Top5-Acc': 0.9827724358974359}
24 epoch, metric:
{'Loss': 0.7884860598506072, 'Top1-Acc': 0.7225560897435898, 'Top5-Acc': 0.9827724358974359}
25 epoch, metric:
{'Loss': 0.78979819191572, 'Top1-Acc': 0.7204527243589743, 'Top5-Acc': 0.9819711538461539}
26 epoch, metric:
{'Loss': 0.7838696023592582, 'Top1-Acc': 0.7227564102564102, 'Top5-Acc': 0.9816706730769231}
27 epoch, metric:
{'Loss': 0.788054245022627, 'Top1-Acc': 0.7230568910256411, 'Top5-Acc': 0.9822716346153846}
28 epoch, metric:
{'Loss': 0.7794759178963991, 'Top1-Acc': 0.7264623397435898, 'Top5-Acc': 0.9819711538461539}
29 epoch, metric:
{'Loss': 0.7789250544439523, 'Top1-Acc': 0.7242588141025641, 'Top5-Acc': 0.9819711538461539}
30 epoch, metric:
{'Loss': 0.7768286337646154, 'Top1-Acc': 0.7236578525641025, 'Top5-Acc': 0.9834735576923077}
31 epoch, metric:
{'Loss': 0.7778036397619125, 'Top1-Acc': 0.7258613782051282, 'Top5-Acc': 0.9836738782051282}
32 epoch, metric:
{'Loss': 0.7859489186069905, 'Top1-Acc': 0.7235576923076923, 'Top5-Acc': 0.9821714743589743}
33 epoch, metric:
{'Loss': 0.7823737889337234, 'Top1-Acc': 0.7252604166666666, 'Top5-Acc': 0.9837740384615384}
34 epoch, metric:
{'Loss': 0.7752156957792931, 'Top1-Acc': 0.7268629807692307, 'Top5-Acc': 0.9834735576923077}
35 epoch, metric:
{'Loss': 0.7899602762399576, 'Top1-Acc': 0.7196514423076923, 'Top5-Acc': 0.983573717948718}
36 epoch, metric:
{'Loss': 0.7707525862333102, 'Top1-Acc': 0.7278645833333334, 'Top5-Acc': 0.9829727564102564}
37 epoch, metric:
{'Loss': 0.771386462621964, 'Top1-Acc': 0.7263621794871795, 'Top5-Acc': 0.983573717948718}
38 epoch, metric:
{'Loss': 0.7727131068897553, 'Top1-Acc': 0.7265625, 'Top5-Acc': 0.9825721153846154}
39 epoch, metric:
{'Loss': 0.7722310103858129, 'Top1-Acc': 0.7255608974358975, 'Top5-Acc': 0.983573717948718}
40 epoch, metric:
{'Loss': 0.7709746978794917, 'Top1-Acc': 0.7259615384615384, 'Top5-Acc': 0.9830729166666666}
41 epoch, metric:
{'Loss': 0.7730164682635894, 'Top1-Acc': 0.7262620192307693, 'Top5-Acc': 0.9831730769230769}
42 epoch, metric:
{'Loss': 0.7731258381062593, 'Top1-Acc': 0.7264623397435898, 'Top5-Acc': 0.9837740384615384}
43 epoch, metric:
{'Loss': 0.7708460223407317, 'Top1-Acc': 0.7258613782051282, 'Top5-Acc': 0.9831730769230769}
44 epoch, metric:
{'Loss': 0.7713121060186472, 'Top1-Acc': 0.7261618589743589, 'Top5-Acc': 0.983573717948718}
45 epoch, metric:
{'Loss': 0.7707422729103993, 'Top1-Acc': 0.7275641025641025, 'Top5-Acc': 0.9830729166666666}
46 epoch, metric:
{'Loss': 0.7697646047633427, 'Top1-Acc': 0.7280649038461539, 'Top5-Acc': 0.9834735576923077}
47 epoch, metric:
{'Loss': 0.7703724102332041, 'Top1-Acc': 0.7266626602564102, 'Top5-Acc': 0.9828725961538461}
48 epoch, metric:
{'Loss': 0.7694722303213217, 'Top1-Acc': 0.7272636217948718, 'Top5-Acc': 0.9833733974358975}
49 epoch, metric:
{'Loss': 0.7705856353426591, 'Top1-Acc': 0.7274639423076923, 'Top5-Acc': 0.9834735576923077}
50 epoch, metric:
{'Loss': 0.7694346693654855, 'Top1-Acc': 0.7275641025641025, 'Top5-Acc': 0.983573717948718}
51 epoch, metric:
{'Loss': 0.7701294453671346, 'Top1-Acc': 0.7274639423076923, 'Top5-Acc': 0.983573717948718}
52 epoch, metric:
{'Loss': 0.7699462769027704, 'Top1-Acc': 0.7278645833333334, 'Top5-Acc': 0.9833733974358975}
53 epoch, metric:
{'Loss': 0.7695007299383482, 'Top1-Acc': 0.7280649038461539, 'Top5-Acc': 0.983573717948718}
54 epoch, metric:
{'Loss': 0.7698160324914333, 'Top1-Acc': 0.7271634615384616, 'Top5-Acc': 0.9836738782051282}
55 epoch, metric:
{'Loss': 0.770057219438828, 'Top1-Acc': 0.7272636217948718, 'Top5-Acc': 0.9836738782051282}
56 epoch, metric:
{'Loss': 0.7691245119159038, 'Top1-Acc': 0.7278645833333334, 'Top5-Acc': 0.9834735576923077}
57 epoch, metric:
{'Loss': 0.7701989151537418, 'Top1-Acc': 0.7272636217948718, 'Top5-Acc': 0.9834735576923077}
58 epoch, metric:
{'Loss': 0.7703485639813619, 'Top1-Acc': 0.7280649038461539, 'Top5-Acc': 0.9834735576923077}
59 epoch, metric:
{'Loss': 0.7698193432237858, 'Top1-Acc': 0.7275641025641025, 'Top5-Acc': 0.9834735576923077}
As the number of training epochs for the final fully connected layer grows, the inference performance of the overall network improves.
Interestingly, even the network retrained for just one epoch already performs quite well (about 62% Top-1, versus roughly 73% after 60 epochs), which nicely illustrates the strength of transfer learning.
==================================================================================