PyTorch Object Detection: Image Preprocessing

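Faster R-CNN and RetinaNet both preprocess image data before feeding it to the network. The procedure is roughly the same as the one described in the referenced blog post.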
[Figure: image preprocessing]

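In fact, a third step can also be applied: padding the image width and height up to an integer multiple of 32, as RetinaNet does. Below is a simple PyTorch data preprocessing module: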

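import numpy as np
import skimage.transform
import torch

class Resizer():
    """Resize an image (shorter side -> targetSize, longer side <= maxSize), then zero-pad to a multiple of pad_N."""
    def __call__(self, sample, targetSize=608, maxSize=1024, pad_N=32):
        image, anns = sample['img'], sample['ann']
        rows, cols = image.shape[:2]

        # Scale the shorter side to targetSize, unless the longer side would then exceed maxSize.
        smaller_size, larger_size = min(rows, cols), max(rows, cols)
        scale = targetSize / smaller_size
        if larger_size * scale > maxSize:
            scale = maxSize / larger_size
        image = skimage.transform.resize(image, (int(round(rows*scale)),
                                                 int(round(cols*scale))),
                                         mode='constant')
        rows, cols, cns = image.shape[:3]  # expects an H x W x C image

        # Zero-pad on the right and bottom up to the next multiple of pad_N.
        # (-n) % pad_N is 0 when n is already aligned, so no extra block is added.
        pad_w, pad_h = (-cols) % pad_N, (-rows) % pad_N
        new_image = np.zeros((rows + pad_h, cols + pad_w, cns)).astype(np.float32)
        new_image[:rows, :cols, :] = image.astype(np.float32)

        # Box coordinates (first four columns) scale with the image; the padding does not shift them.
        anns[:, :4] *= scale
        return {'img': torch.from_numpy(new_image),
                'ann': torch.from_numpy(anns),
                'scale': scale}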
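
To make the resize and padding arithmetic concrete, here is a minimal sketch that computes only the resulting shapes, without touching any pixel data. The function name `resize_and_pad_shape` and its return layout are assumptions made for this illustration and are not part of the module above. The sketch rounds each side up with `(-n) % pad_n`, so a side that is already a multiple of 32 gets no padding, unlike a plain `pad_n - n % pad_n`, which would add a full extra block in that case.

```python
def resize_and_pad_shape(rows, cols, target_size=608, max_size=1024, pad_n=32):
    """Return (scale, resized_shape, padded_shape) for a rows x cols image.

    Hypothetical helper for illustration only; it mirrors the shape
    arithmetic of the preprocessing steps and does not touch pixel data.
    """
    smaller, larger = min(rows, cols), max(rows, cols)

    # Step 1: scale so that the shorter side becomes target_size ...
    scale = target_size / smaller
    # Step 2: ... unless the longer side would then exceed max_size.
    if larger * scale > max_size:
        scale = max_size / larger

    new_rows = int(round(rows * scale))
    new_cols = int(round(cols * scale))

    # Step 3: round each side up to the next multiple of pad_n.
    pad_rows = (-new_rows) % pad_n
    pad_cols = (-new_cols) % pad_n

    return scale, (new_rows, new_cols), (new_rows + pad_rows, new_cols + pad_cols)


# A 480 x 640 image: the shorter side is scaled to 608, the longer side stays under 1024.
_, resized, padded = resize_and_pad_shape(480, 640)
print(resized, padded)  # (608, 811) (608, 832)

# A 1000 x 2000 image: the longer side hits the 1024 cap, so the scale is 1024 / 2000.
_, resized, padded = resize_and_pad_shape(1000, 2000)
print(resized, padded)  # (512, 1024) (512, 1024)
```

Because the padding is added only on the right and bottom, box coordinates need to be multiplied by `scale` and nothing else. For the same reason, dividing predicted boxes by the returned `scale` should map them back to the original image.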

Original article: https://www.cnblogs.com/zi-wang/p/9965807.html