  • siftflow-fcn32s: training and prediction

    I. Overview

    SIFT Flow is an annotated semantic segmentation dataset with two sets of labels: semantic classes (33 classes) and geometric scene labels (3 classes).

    Semantic and geometric segmentation classes for scenes.
    
    Semantic: 0 is void and 1-33 are classes.
    
    01 awning
    02 balcony
    03 bird
    04 boat
    05 bridge
    06 building
    07 bus
    08 car
    09 cow
    10 crosswalk
    11 desert
    12 door
    13 fence
    14 field
    15 grass
    16 moon
    17 mountain
    18 person
    19 plant
    20 pole
    21 river
    22 road
    23 rock
    24 sand
    25 sea
    26 sidewalk
    27 sign
    28 sky
    29 staircase
    30 streetlight
    31 sun
    32 tree
    33 window
    
    Geometric: -1 is void and 1-3 are classes.
    
    01 sky
    02 horizontal
    03 vertical
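
    To make the two listings easier to use when inspecting label files, here is a minimal lookup sketch. The names SEMANTIC_CLASSES, GEOMETRIC_CLASSES and class_name are hypothetical helpers, not part of the FCN repository, and the ids are the ones in the listing above (whether the network's output indices are shifted relative to them is not covered here).

```python
# Class names copied from the listing above; list position i holds class id i + 1.
SEMANTIC_CLASSES = [
    "awning", "balcony", "bird", "boat", "bridge", "building", "bus", "car",
    "cow", "crosswalk", "desert", "door", "fence", "field", "grass", "moon",
    "mountain", "person", "plant", "pole", "river", "road", "rock", "sand",
    "sea", "sidewalk", "sign", "sky", "staircase", "streetlight", "sun",
    "tree", "window",
]
GEOMETRIC_CLASSES = ["sky", "horizontal", "vertical"]


def class_name(label, classes, void_label):
    """Map a 1-based label id to its name; void_label maps to 'void'."""
    if label == void_label:
        return "void"
    if 1 <= label <= len(classes):
        return classes[label - 1]
    raise ValueError("label %r is outside 1..%d" % (label, len(classes)))
```

    For example, class_name(28, SEMANTIC_CLASSES, 0) looks up a semantic id, and class_name(-1, GEOMETRIC_CLASSES, -1) handles the geometric void label.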


    II. Model training

    1. Download the source code

    git clone git@github.com:shelhamer/fcn.berkeleyvision.org.git

    2. Prepare the data

    Download the annotated SiftFlowDataset.zip dataset from http://www.cs.unc.edu/~jtighe/Papers/ECCV10/siftflow/SiftFlowDataset.zip

    Extract the archive into the data/sift-flow folder.
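
    The extraction step can also be scripted. The following is a small sketch using only the standard library; extract_dataset is a hypothetical helper name, and it assumes the archive has already been downloaded to the current directory.

```python
import os
import zipfile


def extract_dataset(zip_path, dest_dir):
    """Extract zip_path into dest_dir (created if missing).

    Returns the sorted list of member names found in the archive.
    """
    os.makedirs(dest_dir, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        return sorted(archive.namelist())


# Example call (paths as used in this post):
# extract_dataset("SiftFlowDataset.zip", "data/sift-flow")
```

    Returning the member names makes it easy to confirm that the image and label folders ended up where the training scripts expect them.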

    3. Code modifications

    git clone git@github.com:litingpan/fcn.git

    Alternatively, download it from https://github.com/litingpan/fcn and replace the entire siftflow-fcn32s folder.

    solve.py is modified as follows:

    import caffe
    import surgery, score
    
    import numpy as np
    import os
    import sys
    
    try:
        # optional: name the process after the current directory
        import setproctitle
        setproctitle.setproctitle(os.path.basename(os.getcwd()))
    except ImportError:
        pass
    
    # weights = '../ilsvrc-nets/vgg16-fcn.caffemodel'
    vgg_weights = '../ilsvrc-nets/VGG_ILSVRC_16_layers.caffemodel'
    vgg_proto = '../ilsvrc-nets/VGG_ILSVRC_16_layers_deploy.prototxt'
    
    # init
    # caffe.set_device(int(sys.argv[1]))
    caffe.set_device(0)
    caffe.set_mode_gpu()
    
    # solver = caffe.SGDSolver('solver.prototxt')
    # solver.net.copy_from(weights)
    solver = caffe.SGDSolver('solver.prototxt')
    vgg_net = caffe.Net(vgg_proto, vgg_weights, caffe.TRAIN)
    surgery.transplant(solver.net, vgg_net)
    del vgg_net
    
    # surgeries
    interp_layers = [k for k in solver.net.params.keys() if 'up' in k]
    surgery.interp(solver.net, interp_layers)
    
    # scoring
    test = np.loadtxt('../data/sift-flow/test.txt', dtype=str)
    
    for _ in range(50):
        solver.step(2000)
        # N.B. metrics on the semantic labels are off b.c. of missing classes;
        # score manually from the histogram instead for proper evaluation
        score.seg_tests(solver, False, test, layer='score_sem', gt='sem')
        score.seg_tests(solver, False, test, layer='score_geo', gt='geo')
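
    The comment in the loop above suggests scoring manually from the histogram because some semantic classes are missing from the test set. A minimal sketch of that idea follows. It assumes a square confusion histogram whose rows index the ground-truth class and whose columns index the predicted class; scores_from_hist is a hypothetical name, and this is not the repository's score.py. Classes with no ground-truth pixels are left out of the per-class means.

```python
def scores_from_hist(hist):
    """Compute (overall accuracy, mean accuracy, mean IU) from a confusion histogram.

    hist[i][j] is the number of pixels with ground-truth class i predicted as j.
    Classes that never occur in the ground truth are skipped in the means.
    """
    n = len(hist)
    diag = [hist[i][i] for i in range(n)]
    gt_counts = [sum(hist[i]) for i in range(n)]                          # row sums
    pred_counts = [sum(hist[i][j] for i in range(n)) for j in range(n)]  # column sums
    total = sum(gt_counts)
    present = [i for i in range(n) if gt_counts[i] > 0]
    if total == 0 or not present:
        raise ValueError("histogram contains no labeled pixels")

    overall_acc = sum(diag) / total
    mean_acc = sum(diag[i] / gt_counts[i] for i in present) / len(present)
    mean_iu = sum(
        diag[i] / (gt_counts[i] + pred_counts[i] - diag[i]) for i in present
    ) / len(present)
    return overall_acc, mean_acc, mean_iu
```

    Averaging only over the classes that are present avoids the divisions by zero that would otherwise distort the mean accuracy and mean IU.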

    4. Download the pretrained model

    ILSVRC-2014 model (VGG team) with 16 weight layers: https://gist.github.com/ksimonyan/211839e770f7b538e2d8/revisions

    Download both VGG_ILSVRC_16_layers.caffemodel and VGG_ILSVRC_16_layers_deploy.prototxt and place them in the ilsvrc-nets directory.

    5. Training

    python solve.py

    Once training finishes (50 rounds of 2000 iterations each), train_iter_100000.caffemodel in the snapshot directory is the trained model.


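    III. Prediction

    1. Prepare the model

    You can use the model trained above. If you would rather not train it yourself, download the pretrained model directly: http://dl.caffe.berkeleyvision.org/siftflow-fcn32s-heavy.caffemodel

    2. deploy.prototxt

    This file is adapted from test.prototxt, with three main changes:

    (1) Input layer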

    layer {
      name: "input"
      type: "Input"
      top: "data"
      input_param {
        # These dimensions are purely for sake of example;
        # see infer.py for how to reshape the net to the given input size.
        shape { dim: 1 dim: 3 dim: 256 dim: 256 }
      }
    }

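    Note that the dimensions in the Input layer should match the size of the image being tested (infer.py also reshapes the data blob to the actual input).

    (2) The dropout layers were removed

    (3) The loss layers and the layers related to them were removed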
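
    A quick way to check that changes (2) and (3) were applied is to scan the deploy file for layer types that belong only in training. The sketch below is a plain text scan, not a prototxt parser. training_only_layers is a hypothetical helper, and "Dropout" and "SoftmaxWithLoss" are assumed to be the type names used in test.prototxt.

```python
import re

# Layer types assumed to be training-only for this model.
TRAINING_ONLY_TYPES = ("Dropout", "SoftmaxWithLoss")


def training_only_layers(prototxt_text):
    """Return the training-only layer types still present in prototxt_text."""
    found = re.findall(r'type:\s*"([^"]+)"', prototxt_text)
    return [layer_type for layer_type in found if layer_type in TRAINING_ONLY_TYPES]


# Example call:
# with open("deploy.prototxt") as f:
#     print(training_only_layers(f.read()))
```

    An empty result suggests the deploy file contains none of the listed types; it does not validate the rest of the network definition.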

    3. infer.py

    import numpy as np
    from PIL import Image
    import matplotlib.pyplot as plt 
    import sys   
    import caffe
    
    # the demo image is coast_bea14.jpg from the SIFT Flow dataset
    
    # load image, switch to BGR, subtract mean, and make dims C x H x W for Caffe
    im = Image.open('coast_bea14.jpg')
    in_ = np.array(im, dtype=np.float32)
    in_ = in_[:,:,::-1]
    in_ -= np.array((104.00698793,116.66876762,122.67891434))
    in_ = in_.transpose((2,0,1))
    
    # load net
    net = caffe.Net('deploy.prototxt', 'snapshot/train_iter_100000.caffemodel', caffe.TEST)
    # shape for input (data blob is N x C x H x W), set data
    net.blobs['data'].reshape(1, *in_.shape)
    net.blobs['data'].data[...] = in_
    # run net and take argmax for prediction
    net.forward()
    sem_out = net.blobs['score_sem'].data[0].argmax(axis=0)
       
    # plt.imshow(out,cmap='gray');
    plt.imshow(sem_out)
    plt.axis('off')
    plt.savefig('coast_bea14_sem_out.png')
    sem_out_img = Image.fromarray(sem_out.astype('uint8')).convert('RGB')
    sem_out_img.save('coast_bea14_sem_img_out.png')
    
    geo_out = net.blobs['score_geo'].data[0].argmax(axis=0)
    plt.imshow(geo_out)
    plt.axis('off')
    plt.savefig('coast_bea14_geo_out.png')
    geo_out_img = Image.fromarray(geo_out.astype('uint8')).convert('RGB')
    geo_out_img.save('coast_bea14_geo_img_out.png')

    Here, sem_out_img holds the semantic segmentation result and geo_out_img holds the geometric scene labeling result.
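
    Because sem_out and geo_out contain small class indices, the images saved through Image.fromarray(...).convert('RGB') will look almost black. One common remedy is to attach a color palette. The sketch below builds a palette with the bit-shuffling scheme used for the PASCAL VOC colormap. make_palette is a hypothetical helper and is not part of infer.py.

```python
def make_palette(num_classes=256):
    """Build a flat [r, g, b, r, g, b, ...] palette, one color per class index.

    Uses the PASCAL VOC colormap scheme: the bits of the class index are
    spread across the high bits of the three color channels.
    """
    palette = []
    for label in range(num_classes):
        r = g = b = 0
        c = label
        for shift in range(8):
            r |= ((c >> 0) & 1) << (7 - shift)
            g |= ((c >> 1) & 1) << (7 - shift)
            b |= ((c >> 2) & 1) << (7 - shift)
            c >>= 3
        palette.extend([r, g, b])
    return palette
```

    With Pillow, the resulting list can be passed to putpalette on the single-channel image created from sem_out.astype('uint8') before saving, so that each class index is displayed in its own color.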

    4. Testing

    python infer.py

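    All images in SIFT Flow are 256×256×3 color images.

    images holds the image data. semanticlabels holds the semantic segmentation labels, 33 classes in total (the annotations contain one additional void class). geolabels holds the geometric scene labels, 3 classes in total (again with one additional void class).

    In effect, this trains two networks, one for each label set, that share the same first seven layers.

    coast_bea14_sem_out.png is the semantic segmentation result and coast_bea14_geo_out.png is the geometric scene labeling result.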

    [Figures: coast_bea14 (original image), coast_bea14_sem_out (semantic segmentation), coast_bea14_geo_out (geometric scene labeling)]




    end

  • Original article: https://www.cnblogs.com/smbx-ztbz/p/9496054.html