Image Object Detection with Caffe, and Fully Convolutional Networks in Caffe

Part 1. [Learning Caffe with Python] 2. Image Object Detection with Caffe

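Tags: python, caffe, deep learning, object detection, ssd

Posted: 2017-06-22 22:08

Categories: Machine Learning, Deep Learning

Copyright notice: This is an original article by the blog author and may not be reproduced without the author's permission.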

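2. Image Object Detection with Caffe

This section uses SSD, a fast object detection network, as an example of running image object detection through Caffe's Python interface.

You must install the windows-ssd build of Caffe, or add SSD's additional source files to your own Caffe project.

An object detection network resembles an image classification network in principle and structure. The difference lies in the output: after the image passes through the deep network, the result is not a single vector of per-class probabilities but a set of records, each giving the position of one object in the image together with its class and related information. The overall deployment pipeline is otherwise the same as for a classification network.

For how SSD works, see the paper: Liu W, Anguelov D, Erhan D, et al. SSD: Single shot multibox detector[C]. In Proc. European Conference on Computer Vision (ECCV). 2016: 21-37.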

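2.1 Files to Prepare

deploy.prototxt: network structure configuration file

VGG_VOC0712_SSD_300x300_iter_60000.caffemodel: network weights file

labelmap_voc.prototxt: class names of the dataset

A test image

The SSD model in this post was trained on the VOC0712 dataset, and labelmap_voc.prototxt lists the names of the object classes in that dataset. This file is required for training a detection network. The next section will focus on how to generate the LMDB database and the labelmap file.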

2.2 Loading the Network

A detection network is loaded in exactly the same way as a classification network.

import caffe

caffe_root = '../../'
# Network parameters (weights) file
caffemodel = caffe_root + 'models/SSD_300x300/VGG_VOC0712_SSD_300x300_iter_60000.caffemodel'
# Deployment network structure file
deploy = caffe_root + 'models/SSD_300x300/deploy.prototxt'
labels_file = caffe_root + 'data/VOC0712/labelmap_voc.prototxt'

# Instantiate the network
net = caffe.Net(deploy,      # defines the model structure
                caffemodel,  # contains the trained weights
                caffe.TEST)  # test mode (no dropout)

2.3 Preprocessing the Test Image

Preprocessing has two main parts:

Subtracting the mean

Resizing

import numpy as np

# Load the ImageNet mean image (shipped with Caffe)
mu = np.load(caffe_root + 'python/caffe/imagenet/ilsvrc_2012_mean.npy')
mu = mu.mean(1).mean(1)  # average over all pixels to get the mean BGR pixel value

# Image preprocessing
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_transpose('data', (2, 0, 1))     # H x W x C -> C x H x W
transformer.set_mean('data', mu)                 # subtract the per-channel mean
transformer.set_raw_scale('data', 255)           # rescale from [0, 1] to [0, 255]
transformer.set_channel_swap('data', (2, 1, 0))  # RGB -> BGR
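To make the four Transformer settings above concrete, here is a minimal NumPy sketch of the same sequence of operations, applied to an image that is assumed to already match the network's input size. The function name preprocess_sketch, the 2 x 2 demo image, and the mean values are illustrative assumptions and are not part of Caffe; the real caffe.io.Transformer.preprocess also resizes the image to the input blob's height and width before these steps.

```python
import numpy as np

def preprocess_sketch(im_rgb01, mean_bgr):
    """Mimic transpose -> channel swap -> raw scale -> mean subtraction.

    im_rgb01: H x W x 3 float array in RGB order with values in [0, 1]
              (the format returned by caffe.io.load_image).
    mean_bgr: length-3 sequence of per-channel means in BGR order.
    Resizing is omitted: the image is assumed to match the input size.
    """
    x = im_rgb01.astype(np.float32)
    x = x.transpose(2, 0, 1)   # H x W x C -> C x H x W
    x = x[[2, 1, 0], :, :]     # RGB -> BGR
    x = x * 255.0              # [0, 1] -> [0, 255]
    x = x - np.asarray(mean_bgr, dtype=np.float32)[:, None, None]
    return x

# Hypothetical 2 x 2 pure-red image and illustrative mean values
demo = np.zeros((2, 2, 3), dtype=np.float32)
demo[..., 0] = 1.0
result = preprocess_sketch(demo, [104.0, 117.0, 123.0])
print(result.shape)
```

After the channel swap, the red channel ends up last, which is why the mean values must also be given in BGR order.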

2.4 Running the Network

Load the input data

Call forward() to compute the result

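import time

# Load the image (img is the path to the test image)
im = caffe.io.load_image(img)
# Copy the preprocessed image into the input blob
net.blobs['data'].data[...] = transformer.preprocess('data', im)

start = time.time()
# Run the forward pass
net.forward()
end = time.time()
print('detection time: %f s' % (end - start))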

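2.5 Inspecting the Detection Results

The last layer of the SSD network is named 'detection_out'. Its output blob, also named 'detection_out', holds a set of tuples with 7 values each: the 1st is the index of the image within the batch, the 2nd is the class index, the 3rd is the confidence score, and the 4th through 7th are the top-left and bottom-right coordinates of the object region, normalized to the image width and height. The number of tuples is the number of candidate objects in the image.

Other network models may be structured differently and use a different layout, but this is how SSD does it.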

import cv2
from caffe.proto import caffe_pb2
from google.protobuf import text_format

# Inspect the detection results
# Read labelmap_voc.prototxt
with open(labels_file, 'r') as f:
    labelmap = caffe_pb2.LabelMap()
    text_format.Merge(str(f.read()), labelmap)

# Final output of the network
loc = net.blobs['detection_out'].data[0][0]
confidence_threshold = 0.5
# im is a float image in [0, 1], so the drawing color is scaled to match
color = (55 / 255.0, 255 / 255.0, 155 / 255.0)
for l in range(len(loc)):
    if loc[l][2] >= confidence_threshold:
        # Position of the object region, converted to pixels
        xmin = int(loc[l][3] * im.shape[1])
        ymin = int(loc[l][4] * im.shape[0])
        xmax = int(loc[l][5] * im.shape[1])
        ymax = int(loc[l][6] * im.shape[0])
        # Draw the object region
        cv2.rectangle(im, (xmin, ymin), (xmax, ymax), color, 2)
        # Look up the class name
        class_name = labelmap.item[int(loc[l][1])].display_name
        cv2.putText(im, class_name, (xmin, ymax), cv2.FONT_HERSHEY_SIMPLEX, 1, color, 2)
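The thresholding and coordinate scaling in the loop above can also be tried out without Caffe or a trained model. The following is a minimal NumPy sketch under the assumption that the detections are given as an N x 7 array in the layout described in this section; the function name filter_detections and the sample detections are made up for illustration.

```python
import numpy as np

def filter_detections(detections, img_h, img_w, threshold=0.5):
    """Return (label, score, xmin, ymin, xmax, ymax) for confident boxes.

    detections: N x 7 array laid out as
                [image_id, label, confidence, xmin, ymin, xmax, ymax],
                with coordinates normalized to [0, 1].
    """
    detections = np.asarray(detections, dtype=np.float64)
    keep = detections[detections[:, 2] >= threshold]
    scale = np.array([img_w, img_h, img_w, img_h], dtype=np.float64)
    boxes = (keep[:, 3:7] * scale).astype(int)  # truncates like int()
    return [(int(d[1]), float(d[2])) + tuple(int(v) for v in b)
            for d, b in zip(keep, boxes)]

# Made-up detections for an image 100 pixels high and 200 pixels wide
fake = [[0, 12, 0.9, 0.25, 0.5, 0.75, 1.0],
        [0, 7, 0.3, 0.0, 0.0, 1.0, 1.0]]
kept = filter_detections(fake, img_h=100, img_w=200)
print(kept)
```

Only the first made-up detection passes the 0.5 threshold. Its x coordinates are scaled by the image width and its y coordinates by the image height, matching the use of im.shape[1] and im.shape[0] above.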

2.6 Detection Results

2.7 Code Download

detection.py in the GitHub repository Caffe-Python-Tutorial

Project URL: https://github.com/tostq/Caffe-Python-Tutorial

Part 2. Fully Convolutional Networks in Caffe

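Tags: caffe, fully convolutional network, fcn, segnet

Posted: 2017-05-20 17:03

Category: caffe

Copyright notice: This is an original article by the blog author and may not be reproduced without the author's permission.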

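Paper: Long_Fully_Convolutional_Networks

Introduction

Unlike earlier CNNs, a fully convolutional network classifies every pixel in the image.

It is commonly used for semantic segmentation of images.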
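As a small illustration of per-pixel classification, the sketch below takes a score map with one channel per class and picks the highest-scoring class at each pixel. The 3-class, 2 x 2 score values are invented for the example and do not come from any real network.

```python
import numpy as np

# Hypothetical score map: 3 classes over a 2 x 2 image (C x H x W)
scores = np.array([
    [[0.90, 0.10], [0.20, 0.30]],  # class 0
    [[0.05, 0.80], [0.10, 0.30]],  # class 1
    [[0.05, 0.10], [0.70, 0.40]],  # class 2
])

# One label per pixel: index of the highest-scoring class at each location
label_map = scores.argmax(axis=0)
print(label_map)
```

The result has the same height and width as the input, with a class index in place of each pixel.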

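References

https://github.com/shelhamer/fcn.berkeleyvision.org

The code in this GitHub repository implements per-pixel classification on PASCAL VOC with Caffe and provides many caffemodel files.

https://zhuanlan.zhihu.com/p/22976342

This is the main reference for this post; it explains FCN and its paper in detail.

Testing

The PASCAL VOC dataset must be downloaded.

After downloading the code, create a new .py file in its root directory with the following content: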

import sys

import numpy as np
from PIL import Image
import matplotlib.pyplot as plt

caffe_root = '/home/gry/libs/caffe/'
sys.path.insert(0, caffe_root + 'python/')
import caffe

fn = 'data/pascal/VOCdevkit/VOC2012/JPEGImages/2007_000129.jpg'
im = Image.open(fn)
# im = im.resize([500, 500], Image.ANTIALIAS)
# im.save("1.jpg", "JPEG")

npimg = np.array(im, dtype=np.float32)
print('max val of the npimg is : %f' % (npimg.max()))
npimg = npimg[:, :, ::-1]  # RGB -> BGR, to match the order of the mean values below
npimg -= np.array((104.00698793, 116.66876762, 122.67891434))
npimg = npimg.transpose((2, 0, 1))  # H x W x C -> C x H x W

# load net
# net = caffe.Net('voc-fcn8s/deploy.prototxt', 'voc-fcn8s/fcn8s-heavy-pascal.caffemodel', caffe.TEST)
net = caffe.Net('voc-fcn16s/deploy.prototxt', 'voc-fcn16s/fcn16s-heavy-pascal.caffemodel', caffe.TEST)

# shape for input (data blob is N x C x H x W), set data
# note: H x W does not have to equal the H x W in the network definition,
# but the number of channels must match
net.blobs['data'].reshape(1, *npimg.shape)
net.blobs['data'].data[...] = npimg

# run net and take argmax over the class axis for the prediction
net.forward()
out = net.blobs['score'].data[0].argmax(axis=0)

plt.imshow(out, cmap='autumn')
plt.axis('off')
plt.savefig('test.png')
plt.show()
print('end now')

Results obtained with different caffemodel files:

Original image

voc-fcn8s

voc-fcn16s

voc-fcn32s

SegNet

Introduction

Based on Caffe

Reference links

https://github.com/alexgkendall/SegNet-Tutorial

https://github.com/TimoSaemann/caffe-segnet-cudnn5

https://github.com/alexgkendall/SegNet-Tutorial/blob/master/Example_Models/segnet_model_zoo.md

https://github.com/alexgkendall/caffe-segnet

http://mi.eng.cam.ac.uk/projects/segnet/tutorial.html

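Testing

Download the cuDNN 5 based SegNet code and the SegNet-Tutorial code, and organize the files as described in the tutorial in the reference links.

Edit train.txt and test.txt, then run training.

If GPU memory is exceeded, reduce the training batch size.

Convert the caffemodel and test it as described in the tutorial; the original image, the ground truth, and the network output can be displayed in real time.

The original code uses plt.show(), which blocks until the window is closed. For more convenient display, use OpenCV's imshow together with waitKey.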
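One way to set up the non-blocking display is sketched below. This is not the tutorial's code: the helper names colorize and show_nonblocking, the 3-class palette, and the label map are assumptions made for illustration. Only cv2.imshow and cv2.waitKey are actual OpenCV calls.

```python
import numpy as np

def colorize(label_map, palette):
    """Map an H x W array of class indices to an H x W x 3 uint8 BGR image."""
    palette = np.asarray(palette, dtype=np.uint8)
    return palette[np.asarray(label_map)]

def show_nonblocking(window_name, image, delay_ms=1):
    """Display an image and return after delay_ms, unlike plt.show()."""
    import cv2  # imported here so colorize() works without OpenCV installed
    cv2.imshow(window_name, image)
    return cv2.waitKey(delay_ms)  # returns once delay_ms elapses or a key is pressed

# Made-up 3-class palette (BGR order) and label map
palette = [(0, 0, 0), (0, 0, 255), (0, 255, 0)]
labels = np.array([[0, 1], [2, 2]])
vis = colorize(labels, palette)
# Inside the test loop, one call per image: show_nonblocking('prediction', vis)
```

Calling waitKey with a small positive delay lets the window refresh and then returns, so the loop moves on to the next image without the window having to be closed.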