  • Minibatch wrapping in the PyTorch implementation of Faster R-CNN

    In practice, Faster R-CNN resizes every input image. The feature map is extracted from the resized image, and a certain number of RoIs are then generated from it.

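    The first thing I want to do is remove this resize step so that every image is processed at its original size, which means I need to find out where exactly the image gets resized.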

    Search for it directly: grep 'resize' ./lib/ -r

    ./lib/crnn/utils.py: v.data.resize_(data.size()).copy_(data)
    ./lib/model/config.py:# Option to set if max-pooling is appended after crop_and_resize.
    ./lib/model/config.py:# if true, the region will be resized to a square of 2xPOOLING_SIZE,
    ./lib/model/config.py:# resized to a square of POOLING_SIZE
    ./lib/model/test.py: im = cv2.resize(im_orig, None, None, fx=im_scale, fy=im_scale,
    ./lib/nets/network.py:from scipy.misc import imresize
    ./lib/nets/network.py: image = imresize(image[0], self._im_info[:2] / self._im_info[2])
    ./lib/utils/blob.py: im = cv2.resize(im, None, None, fx=im_scale, fy=im_scale,

    During training, the code that gets called should be the one in ./lib/utils/blob.py.

    That file contains two functions:

    import cv2
    import numpy as np


    def im_list_to_blob(ims):
      """Convert a list of images into a network input.
      Assumes images are already prepared (means subtracted, BGR order, ...).
      """
      # Largest height and width in the batch; smaller images are zero-padded
      max_shape = np.array([im.shape for im in ims]).max(axis=0)
      num_images = len(ims)
      blob = np.zeros((num_images, max_shape[0], max_shape[1], 3),
                      dtype=np.float32)
      for i in range(num_images):
        im = ims[i]
        # Copy each image into the top-left corner of its slot
        blob[i, 0:im.shape[0], 0:im.shape[1], :] = im

      return blob


    def prep_im_for_blob(im, pixel_means, target_size, max_size):
      """Mean subtract and scale an image for use in a blob."""
      im = im.astype(np.float32, copy=False)
      im -= pixel_means
      im_shape = im.shape
      im_size_min = np.min(im_shape[0:2])
      im_size_max = np.max(im_shape[0:2])
      # Scale so that the shorter side becomes target_size
      im_scale = float(target_size) / float(im_size_min)
      # Prevent the biggest axis from being more than MAX_SIZE
      if np.round(im_scale * im_size_max) > max_size:
        im_scale = float(max_size) / float(im_size_max)
      # This is the resize applied to every training image
      im = cv2.resize(im, None, None, fx=im_scale, fy=im_scale,
                      interpolation=cv2.INTER_LINEAR)

      return im, im_scale
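
    To make the scaling rule in prep_im_for_blob() concrete, here is a minimal sketch that reproduces only the scale computation, without touching any image data. The helper name compute_im_scale and the values target_size=600 and max_size=1000 are assumptions chosen for illustration; the real values come from cfg.TRAIN.SCALES and cfg.TRAIN.MAX_SIZE.

```python
def compute_im_scale(height, width, target_size, max_size):
    """Return the scale factor that the rule in prep_im_for_blob would pick.

    Hypothetical helper for illustration; it is not part of the repository.
    """
    im_size_min = min(height, width)
    im_size_max = max(height, width)
    # First try to bring the shorter side to target_size
    im_scale = float(target_size) / float(im_size_min)
    # If that makes the longer side exceed max_size, cap the longer side instead
    if round(im_scale * im_size_max) > max_size:
        im_scale = float(max_size) / float(im_size_max)
    return im_scale


# Assumed example settings: target_size=600, max_size=1000
scale_a = compute_im_scale(375, 500, 600, 1000)   # shorter side drives the scale
scale_b = compute_im_scale(500, 1500, 600, 1000)  # longer side hits the cap
print(scale_a, scale_b)
```

    With these assumed settings, a 375x500 image is enlarged so that its shorter side reaches 600, while a 500x1500 image is shrunk because its longer side would otherwise exceed 1000. Only an image whose shorter side already equals target_size (and whose longer side fits) gets a scale of 1.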

    Both of these functions are called from ./lib/roi_data_layer/minibatch.py.

    That file also defines two functions: get_minibatch(), which calls the helper _get_image_blob().

    import cv2
    import numpy as np
    import numpy.random as npr

    from model.config import cfg
    from utils.blob import prep_im_for_blob, im_list_to_blob


    def get_minibatch(roidb, num_classes):
      """Given a roidb, construct a minibatch sampled from it."""
      num_images = len(roidb)
      # Sample random scales to use for each image in this batch
      random_scale_inds = npr.randint(0, high=len(cfg.TRAIN.SCALES),
                      size=num_images)
      assert cfg.TRAIN.BATCH_SIZE % num_images == 0, \
        'num_images ({}) must divide BATCH_SIZE ({})'.format(
          num_images, cfg.TRAIN.BATCH_SIZE)

      # Get the input image blob, formatted for caffe
      im_blob, im_scales = _get_image_blob(roidb, random_scale_inds)

      blobs = {'data': im_blob}

      assert len(im_scales) == 1, "Single batch only"
      assert len(roidb) == 1, "Single batch only"

      # gt boxes: (x1, y1, x2, y2, cls)
      if cfg.TRAIN.USE_ALL_GT:
        # Include all ground truth boxes
        gt_inds = np.where(roidb[0]['gt_classes'] != 0)[0]
      else:
        # For the COCO ground truth boxes, exclude the ones that are 'iscrowd'.
        # The parentheses matter: & binds tighter than !=, so without them
        # the overlap condition would be silently ignored.
        gt_inds = np.where(
          (roidb[0]['gt_classes'] != 0) &
          np.all(roidb[0]['gt_overlaps'].toarray() > -1.0, axis=1))[0]
      gt_boxes = np.empty((len(gt_inds), 5), dtype=np.float32)
      # Boxes are rescaled by the same factor as the image
      gt_boxes[:, 0:4] = roidb[0]['boxes'][gt_inds, :] * im_scales[0]
      gt_boxes[:, 4] = roidb[0]['gt_classes'][gt_inds]
      blobs['gt_boxes'] = gt_boxes
      # im_info: (height, width, scale) of the resized image
      blobs['im_info'] = np.array(
        [im_blob.shape[1], im_blob.shape[2], im_scales[0]],
        dtype=np.float32)

      return blobs


    def _get_image_blob(roidb, scale_inds):
      """Builds an input blob from the images in the roidb at the specified
      scales.
      """
      num_images = len(roidb)
      processed_ims = []
      im_scales = []
      for i in range(num_images):
        im = cv2.imread(roidb[i]['image'])
        if roidb[i]['flipped']:
          # Horizontal flip
          im = im[:, ::-1, :]
        target_size = cfg.TRAIN.SCALES[scale_inds[i]]
        im, im_scale = prep_im_for_blob(im, cfg.PIXEL_MEANS, target_size,
                        cfg.TRAIN.MAX_SIZE)
        im_scales.append(im_scale)
        processed_ims.append(im)

      # Create a blob to hold the input images
      blob = im_list_to_blob(processed_ims)

      return blob, im_scales
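
    One detail in the else branch of get_minibatch() deserves care: in Python, & binds tighter than !=, so a mask written as a != 0 & b is parsed as a != (0 & b). The sketch below uses made-up arrays (the names gt_classes and not_crowd and their values are assumptions for illustration only) to show how the two spellings differ.

```python
import numpy as np

# Made-up example data: class 0 stands for background
gt_classes = np.array([1, 2, 0, 3])
# Made-up flags: False marks a box that should be excluded as a crowd region
not_crowd = np.array([True, False, True, True])

# Without parentheses this is gt_classes != (0 & not_crowd);
# 0 & anything is 0, so the crowd flags have no effect at all.
mask_unparenthesized = gt_classes != 0 & not_crowd

# With parentheses both conditions must hold for a box to be kept.
mask_parenthesized = (gt_classes != 0) & not_crowd

print(np.where(mask_unparenthesized)[0])
print(np.where(mask_parenthesized)[0])
```

    In this toy case the unparenthesized mask keeps the box at index 1 even though it is flagged as a crowd region, while the parenthesized mask drops it.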

    get_minibatch() is in turn called by _get_next_minibatch(), a method that is itself called from forward() of the RoIDataLayer class in ./lib/roi_data_layer/layer.py.

    At this point, since the RoIDataLayer class is used by the Network class, the whole chain is finally connected.
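
    With the chain traced back to prep_im_for_blob(), one possible way to skip the resize is a variant that only subtracts the pixel means and reports a scale of 1.0. This is a minimal sketch under stated assumptions, not a tested change to the repository: the function name prep_im_for_blob_no_resize is hypothetical, the pixel means below are placeholder example values standing in for cfg.PIXEL_MEANS, and whether the rest of the pipeline trains well at original resolution is not verified here.

```python
import numpy as np


def prep_im_for_blob_no_resize(im, pixel_means):
    """Hypothetical variant: mean-subtract only and report a scale of 1.0."""
    # Copy so that the caller's array is left untouched
    im = im.astype(np.float32, copy=True)
    im -= pixel_means
    # A scale of 1.0 means gt boxes and im_info keep original coordinates
    return im, 1.0


# Dummy 4x6 BGR image and placeholder pixel means (assumed example values)
im = np.full((4, 6, 3), 128, dtype=np.uint8)
pixel_means = np.array([[[102.9801, 115.9465, 122.7717]]])

out, im_scale = prep_im_for_blob_no_resize(im, pixel_means)
print(out.shape, out.dtype, im_scale)
```

    Because get_minibatch() multiplies the ground-truth boxes by im_scales[0] and stores the same value in im_info, returning 1.0 here would leave both in the coordinates of the original image.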

    The Faster R-CNN code really is convoluted: it goes back and forth defining a great many functions for work that a single function could do. I give up!

  • Original article: https://www.cnblogs.com/beatets/p/9287990.html